From 1b6d9db7e3f246df6344ea6cfee8c9f81e0eb652 Mon Sep 17 00:00:00 2001
From: Jules Laplace
This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
MegaPixels.cc is an independent research project about publicly available face recognition datasets. This website is based, in part, on earlier installations and research projects about facial recognition datasets in 2016-2018, which focused particularly on the MegaFace dataset. Since then it has evolved into a large-scale survey of publicly available face and person analysis datasets, covering their usage, geographies, and ethics.
Initially this site was planned as a facial recognition tool to search the datasets. After building several prototypes using over 1 million face images from these datasets, it became clear that facial recognition was merely face-similarity search. The results were not accurate enough to align with the goals of this website: to promote responsible use of data and expose existing and past ethical breaches.
An academic report and presentation on the findings of this project is forthcoming. Throughout 2019, this site will be updated with more datasets and research reports on the general themes of remote biometric analysis and media collected "in the wild". Continued research on MegaPixels is supported by a one-year Researcher-in-Residence grant from Karlsruhe HfG (2019-2020).
When possible, and once thoroughly verified, data generated for MegaPixels will be made available for download on github.com/adamhrv/megapixels
[ page under development ]
The Duke Multi-Target, Multi-Camera Tracking Dataset (MTMC) is a dataset of video recorded on the Duke University campus for the purpose of training, evaluating, and improving multi-target multi-camera tracking for surveillance. The dataset includes over 14 hours of 1080p video from 8 cameras positioned around Duke's campus during February and March 2014. Over 2,700 unique people are included in the dataset, which has become one of the most widely used person re-identification image datasets.
The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy".
To help understand how the Duke MTMC Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Duke Multi-Target, Multi-Camera Tracking Project was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
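The per-country ranking that drives the bar charts can be sketched as a simple tally over the geocoded citations. The helper and sample data below are illustrative only, not the site's actual pipeline:

```python
from collections import Counter, defaultdict

def top_countries(citations, limit=10):
    """Rank countries by total geocoded citations, keeping yearly
    totals for the mouse-over detail. `citations` is a list of
    (country, year) pairs; this structure is an assumption for
    illustration."""
    totals = Counter(country for country, _ in citations)
    yearly = defaultdict(Counter)
    for country, year in citations:
        yearly[country][year] += 1
    # At most `limit` countries, most-cited first
    return [(c, n, dict(yearly[c])) for c, n in totals.most_common(limit)]

# Hypothetical sample: country and publication year per verified citation
sample = [("China", 2017), ("China", 2018), ("USA", 2018),
          ("China", 2018), ("Germany", 2019), ("USA", 2019)]
ranking = top_countries(sample)
# ranking[0] == ("China", 3, {2017: 1, 2018: 2})
```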
DukeMTMC is a new, manually annotated, calibrated, multi-camera data set recorded outdoors on the Duke University campus with 8 synchronized cameras. It consists of:
- 8 static cameras x 85 minutes of 1080p 60 fps video
- More than 2,000,000 manually annotated frames
- More than 2,000 identities
- Manual annotation by 5 people over 1 year
- More identities than all existing MTMC datasets combined
- Unconstrained paths, diverse appearance
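The "more than 2,000,000 manually annotated frames" figure is consistent with the camera specs quoted above; a quick arithmetic check:

```python
# 8 cameras, each recording 85 minutes at 60 frames per second
cameras = 8
minutes = 85
fps = 60

seconds = minutes * 60
total_frames = cameras * seconds * fps
print(total_frames)  # 2448000 frames across all cameras
```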
People involved: Ergys Ristani, Francesco Solera, Roger S. Zou, Rita Cucchiara, Carlo Tomasi.
Welcome to the Duke Multi-Target, Multi-Camera Tracking Project.
DukeMTMC aims to accelerate advances in multi-target multi-camera tracking. It provides a tracking system that works within and across cameras, a new large-scale HD video data set recorded by 8 synchronized cameras with more than 7,000 single-camera trajectories and over 2,000 unique identities, and a new performance evaluation method that measures how often a system is correct about who is where.
DukeMTMC Data Set
Snapshot from the DukeMTMC data set.
News
- 05 Feb 2019: We are organizing the 2nd Workshop on MTMCT and ReID at CVPR 2019
- 25 Jul 2018: The code for DeepCC is available on github
- 28 Feb 2018: OpenPose detections now available for download
- 19 Feb 2018: Our DeepCC tracker has been accepted to CVPR 2018
- 04 Oct 2017: A new blog post describes ID measures of performance
- 26 Jul 2017: Slides from the BMTT 2017 workshop are now available
- 09 Dec 2016: DukeMTMC is now hosted on MOTChallenge
DukeMTMC Downloads
DukeMTMC dataset (tracking)
Dataset Extensions
Below is a list of dataset extensions provided by the community:
- DukeMTMC-VideoReID (download)
- DukeMTMC-reID (download)
- DukeMTMC4REID
- DukeMTMC-attribute
If you use or extend DukeMTMC, please refer to the license terms.
DukeMTMCT Benchmark
DukeMTMCT is a tracking benchmark hosted on motchallenge.net. Click here for the up-to-date rankings. Here you will find the official motchallenge-devkit used for evaluation by MOTChallenge. For detailed instructions on how to submit on motchallenge you can refer to this link.
Trackers are ranked using our identity-based measures, which compute how often the system is correct about who is where, regardless of how often a target is lost and reacquired. Our measures are useful in applications such as security, surveillance, or sports. This short post describes our measures with illustrations; for details you can refer to the original paper.
Tracking Systems
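The identity-based measures referenced above are, per Ristani et al.'s performance-measures paper, identity precision, identity recall, and IDF1, computed from identity-matched true positives, false positives, and false negatives. A minimal sketch (the optimal truth-to-result ID assignment that produces these counts is assumed to be computed separately):

```python
def id_scores(idtp, idfp, idfn):
    """Identity precision/recall/F1 (IDP, IDR, IDF1) from
    identity-matched detection counts."""
    idp = idtp / (idtp + idfp)   # fraction of computed detections with the correct ID
    idr = idtp / (idtp + idfn)   # fraction of true detections with the correct ID
    idf1 = 2 * idtp / (2 * idtp + idfp + idfn)
    return idp, idr, idf1

# Hypothetical counts for illustration
idp, idr, idf1 = id_scores(idtp=900, idfp=100, idfn=300)
# idp = 0.9, idr = 0.75, idf1 = 1800/2200 ≈ 0.818
```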
We provide code for the following tracking systems, which are all based on Correlation Clustering optimization:
- DeepCC for single- and multi-camera tracking [1]
- Single-Camera Tracker (demo video) [2]
- Multi-Camera Tracker (demo video, failure cases) [2]
- People-Groups Tracker [3]
- Original Single-Camera Tracker [4]
The Face Recognition Technology (FERET) program was created to develop, test, and evaluate face recognition algorithms.
The goal of the FERET program was to develop automatic face recognition capabilities that could be employed to assist security, intelligence, and law enforcement personnel in the performance of their duties.
[ page under development ]
{% include 'dashboard.html' %}
RESEARCH below this line
The FERET program is sponsored by the U.S. Department of Defense's Counterdrug Technology Development Program Office. The U.S. Army Research Laboratory (ARL) is the technical agent for the FERET program. ARL designed, administered, and scored the FERET tests. George Mason University collected, processed, and maintained the FERET database. Inquiries regarding the FERET database or test should be directed to P. Jonathon Phillips.
To help understand how the HRT Transgender Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the HRT Transgender Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
[ page under development ]
{% include 'dashboard.html' %}
RESEARCH below this line
To help understand how LFPW has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Face Parts in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
RESEARCH below this line
Release 1 of LFPW consists of 1,432 faces from images downloaded from the web using simple text queries on sites such as google.com, flickr.com, and yahoo.com. Each image was labeled by three MTurk workers, and 29 fiducial points, shown below, are included in the dataset. LFPW was originally described in the following publication:
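With three MTurk workers labeling the same 29 fiducial points per image, the labels must be consolidated somehow. Simple per-point averaging is one plausible scheme; it is an assumption here, since the LFPW authors' actual consolidation method is not described in this passage:

```python
def average_annotations(worker_points):
    """Consolidate several workers' fiducial-point labels by
    averaging each point across workers. `worker_points` is a list
    (one entry per worker) of (x, y) tuples, e.g. 29 per worker for
    LFPW. Averaging is an illustrative assumption, not necessarily
    the authors' method."""
    n_workers = len(worker_points)
    n_points = len(worker_points[0])
    return [
        (sum(w[i][0] for w in worker_points) / n_workers,
         sum(w[i][1] for w in worker_points) / n_workers)
        for i in range(n_points)
    ]

# Three hypothetical workers labeling the same two points
labels = [[(10, 20), (30, 40)],
          [(12, 22), (28, 38)],
          [(11, 21), (32, 42)]]
print(average_annotations(labels))  # [(11.0, 21.0), (30.0, 40.0)]
```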
Due to copyright issues, we cannot distribute image files in any format to anyone. Instead, we have made available a list of image URLs where you can download the images yourself. We realize that this makes it impossible to exactly compare numbers, as image links will slowly disappear over time, but we have no other option. This seems to be the way other large web-based databases are evolving.
[ PAGE UNDER DEVELOPMENT ]
Labeled Faces in the Wild (LFW) is "a database of face photographs designed for studying the problem of unconstrained face recognition" 1. It is used to evaluate and improve the performance of facial recognition algorithms in academic, commercial, and government research. According to BiometricUpdate.com 3, LFW is "the most widely used evaluation set in the field of facial recognition" and "attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."
The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002 and 2004. LFW is a subset of Names and Faces and is part of the first facial recognition training dataset created entirely from images appearing on the Internet. The people appearing in LFW are...
The Names and Faces dataset was the first face recognition dataset created entirely from online photos. However, Names and Faces and LFW were not the first face recognition datasets created entirely "in the wild". That title belongs to the UCD dataset. Images obtained "in the wild" means using an image without explicit consent or awareness from the subject or photographer.
To help understand how LFW has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
Add a paragraph about how usage extends far beyond academia into the research centers of the largest companies in the world, and even funnels into CIA-funded research in the US and defense industry usage in China.
Research, text, and graphics © Adam Harvey / megapixels.cc
Jingtuo Liu, Yafeng Deng, Tao Bai, Zhengping Wei, Chang Huang. Targeting Ultimate Accuracy: Face Recognition via Deep Embedding. https://arxiv.org/abs/1506.07310
Lee, Justin. "PING AN Tech facial recognition receives high score in latest LFW test results". BiometricUpdate.com. Feb 13, 2017. https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results
[ PAGE UNDER DEVELOPMENT ]
To help understand how Market 1501 has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Market 1501 Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
[ PAGE UNDER DEVELOPMENT ]
To help understand how MsCeleb has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Microsoft Celebrity Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
Add more analysis here
[ page under development ]
The Oxford Town Centre dataset is a video of pedestrians in a busy downtown area in Oxford, used for creating surveillance algorithms with "potential applications in activity recognition and remote biometric analysis" or non-cooperative face recognition. 1
Based on observations of the dataset video and Google Street View images, the source of the footage has been geolocated to a public CCTV camera at the intersection of Cornmarket and Market St., Oxford, England (map). Based on an analysis of the papers that use or cite this dataset 2, the year of capture was determined to be 2009, and the season was likely February or March based on the window advertisements and cool-weather clothing.
Halfway through the video, a peculiar and somewhat rude man enters the frame and stands directly on top of a water drain for over a minute. His unusual demeanor and apparently scripted behavior suggest a possible relationship to the CCTV operators.
Although the Oxford Town Centre dataset first appears to be a pedestrian dataset, it was created to improve the stabilization of pedestrian detections in order to extract a more accurate head region, which would lead to improvements in face recognition.
To help understand how TownCentre has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Oxford Town Centre was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
Several researchers have posted their demo videos using the Oxford Town Centre dataset on YouTube:
[ add visualization ]
TODO
[ PAGE UNDER DEVELOPMENT ]
To help understand how the PIPA Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the People in Photo Albums Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
To help understand how PubFig has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Public Figures Face Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
(PAGE UNDER DEVELOPMENT)
[ page under development ]
UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at the University of Colorado Colorado Springs. According to the authors of two papers associated with the dataset, subjects were "photographed using a long-range high-resolution surveillance camera without their knowledge" 2. To create the dataset, the researchers used a Canon 7D digital camera fitted with a Sigma 800mm telephoto lens and photographed students 150–200m away through their office window. Photos were taken during the morning and afternoon while students were walking to and from classes. The primary uses of this dataset are to train, validate, and build face detection and recognition algorithms for realistic surveillance scenarios.
What makes the UCCS dataset unique is that it includes the highest-resolution images of any publicly available face recognition dataset discovered so far (18MP), that it was captured on a campus without consent or awareness using a long-range telephoto lens, and that it was funded by United States defense and intelligence agencies.
Combined funding sources for the creation of the initial and final releases of the dataset include ODNI (Office of the Director of National Intelligence), IARPA (Intelligence Advanced Research Projects Activity), ONR MURI (Office of Naval Research and the Department of Defense Multidisciplinary University Research Initiative), Army SBIR (Small Business Innovation Research), SOCOM SBIR (Special Operations Command Small Business Innovation Research), and the National Science Foundation. 1 2
In 2017 the UCCS face dataset was used for a defense- and intelligence-agency-funded face recognition challenge at the International Joint Conference on Biometrics in Denver, CO, and in 2018 the dataset was used for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Conference on Computer Vision (ECCV) in Munich, Germany. Additional research projects that have used the UCCS dataset are included below in the list of verified citations.
To help understand how UCCS has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the UnConstrained College Students Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
The images in UCCS were taken on 18 non-consecutive days during 2012–2013. Analysis of the EXIF data embedded in the original images reveals that most of the images were taken on Tuesdays, and the most frequent capture time throughout the week was 12:30PM.
| Date | Photos |
|---|---|
| Feb 23, 2012 | 132 |
| March 6, 2012 | 288 |
| March 8, 2012 | 506 |
| March 13, 2012 | 160 |
| March 20, 2012 | 1,840 |
| March 22, 2012 | 445 |
| April 3, 2012 | 1,639 |
| April 12, 2012 | 14 |
| April 17, 2012 | 19 |
| April 24, 2012 | 63 |
| April 25, 2012 | 11 |
| April 26, 2012 | 20 |
| Jan 28, 2013 | 1,056 |
| Jan 29, 2013 | 1,561 |
| Feb 13, 2013 | 739 |
| Feb 19, 2013 | 723 |
| Feb 20, 2013 | 965 |
| Feb 26, 2013 | 736 |
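The weekday and time-of-day pattern described above can be recovered by tallying EXIF timestamps. A minimal sketch, using hypothetical timestamps since the originals are not reproduced here:

```python
from collections import Counter
from datetime import datetime

def capture_stats(exif_datetimes):
    """Find the most common weekday and capture time from EXIF
    DateTimeOriginal strings, which use the 'YYYY:MM:DD HH:MM:SS'
    format."""
    parsed = [datetime.strptime(s, "%Y:%m:%d %H:%M:%S") for s in exif_datetimes]
    weekdays = Counter(d.strftime("%A") for d in parsed)
    times = Counter(d.strftime("%H:%M") for d in parsed)
    return weekdays.most_common(1)[0], times.most_common(1)[0]

# Hypothetical EXIF timestamps for illustration
stamps = ["2012:03:20 12:30:11", "2012:03:20 12:30:45", "2012:03:22 09:15:02"]
result = capture_stats(stamps)
print(result)  # (('Tuesday', 2), ('12:30', 2))
```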
-- To help understand how UCCS has been used around the world for commercial, military and academic research; publicly available research citing UnConstrained College Students Dataset is collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal reserach projects at that location. -
- -- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries. -
- -- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. -
The original Sapkota and Boult dataset, from which UCCS is derived, received funding from 1:
The location of the camera and subjects can be confirmed using the Bellingcat method. The visual clues that lead to the location of the camera and subjects include the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no parking sign and the directionality of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage, all of which match images in the dataset. The original paper also provides another clue: a picture of the camera inside the office that was used to create the dataset. The window view in this image provides another match for the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the location on Google Maps

The UnConstrained College Students dataset is associated with two main research papers: "Large Scale Unconstrained Open Set Face Database" and "Unconstrained Face Detection and Open-Set Face Recognition Challenge". Collectively, these papers and the creation of the dataset have received funding from the following organizations:
The more recent UCCS version of the dataset received funding from 2:
If you attended University of Colorado Colorado Springs and were captured by the long range surveillance camera used to create this dataset, there is unfortunately no way to be removed. The authors do not provide any options for students to opt out, nor were students informed that they would be used for training face recognition. According to the authors, the lack of any consent or knowledge of participation is part of what provides the value of the UnConstrained College Students Dataset.
Please direct any questions about the ethics of the dataset to the University of Colorado Colorado Springs Ethics and Compliance Office.

For further technical information about the dataset, visit the UCCS dataset project page.

[ page under development ]
This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.

To help understand how the Brainwash Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Brainwash Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.

The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
import difflib; seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower()); score = seq.ratio()
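The one-liner above uses Python's difflib to score string similarity, which is useful for matching citation metadata such as institution names with inconsistent spellings. A minimal runnable sketch; the example strings are illustrative, not taken from the datasets:

```python
import difflib

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two strings, in [0, 1]."""
    seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower())
    return seq.ratio()

# Identical strings (ignoring case) score 1.0; near-duplicates score close to 1.0.
print(similarity("UnConstrained College Students",
                 "unconstrained college students"))  # → 1.0
```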
[ page under development ]
VIPeR (Viewpoint Invariant Pedestrian Recognition) is a dataset of pedestrian images captured at University of California Santa Cruz in 2007. According to the researchers 2, "cameras were placed in different locations in an academic setting and subjects were notified of the presence of cameras, but were not coached or instructed in any way."
VIPeR is among the most widely used publicly available person re-identification datasets. In 2017 the VIPeR dataset was combined into a larger person re-identification dataset created by the Chinese University of Hong Kong called PETA (PEdesTrian Attribute).
To help understand how VIPeR has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Viewpoint Invariant Pedestrian Recognition was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
TODO RESEARCH below these lines
Selected dataset sequences: (a) MBGC, (b) CMU MoBo, (c) First Honda/UCSD, and (d) YouTube Celebrities.
This research is supported by the Central Intelligence Agency, the Biometrics
[ page under development ]

This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.

To help understand how YouTube Celebrities has been used around the world by commercial, military, and academic organizations, existing publicly available research citing YouTube Celebrities was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.

The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
Facial recognition, as it is currently defined, promoted, and sold to the public, government, and commercial sector, is a scam. For the last 20 years, government agencies, commercial organizations, and academic institutions have played the public for a fool, selling a roadmap of a future that simply does not exist. There is no such thing as face recognition; this essay debunks the popular myth that such a thing ever existed. Committed to developing robust solutions with superhuman accuracy, the industry has repeatedly undermined itself by never actually developing anything close to "face recognition". There is only biased feature vector clustering and probabilistic thresholding.

Ever since government agencies began developing face recognition in the early 1960s, datasets of face images have been central to developing and validating face recognition technologies. Today, these datasets no longer originate in labs, but instead come from family photo albums posted on photo sharing sites, surveillance camera footage from college campuses, search engine queries for celebrities, cafe livestreams, and videos on YouTube. During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that is powering the global facial recognition industry. While many of these datasets include public figures such as politicians, athletes, and actors, they also include many non-public figures: digital activists, students, pedestrians, and semi-private shared photo albums are all considered "in the wild" and fair game for research projects. Some images are used under Creative Commons licenses, yet others were taken in unconstrained scenarios without awareness or consent.
At first glance, many of the datasets appear to have been created for harmless academic research, but on closer examination it becomes clear that they are also used by foreign defense agencies.
This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.

To help understand how {{ metadata.meta.dataset.name_display }} has been used around the world by commercial, military, and academic organizations, existing publicly available research citing {{ metadata.meta.dataset.name_full }} was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.

The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
To help understand how {{ metadata.meta.dataset.name_display }} has been used around the world by commercial, military, and academic organizations, existing publicly available research citing {{ metadata.meta.dataset.name_full }} was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the location markers to reveal research projects at that location.
[ page under development ]

The Oxford Town Centre dataset is a video of pedestrians in a busy downtown area in Oxford used for creating surveillance algorithms with potential applications in activity recognition, remote biometric analysis, and non-cooperative face recognition. 1 Although the Oxford Town Centre dataset first appears to be a pedestrian dataset, it was created to improve the stabilization of pedestrian detections in order to extract a more accurate head region, leading to improvements in face recognition. Based on observations of the dataset video and Google Street View images, the source of the footage has been geolocated to a public CCTV camera at the intersection of Cornmarket and Market St. in Oxford, England (map). Based on an analysis of the papers that use or cite this dataset 2, the inferred year of capture was 2009, and the season was perhaps February or March based on the window advertisements and cool-weather clothing. Halfway through the video a peculiar and somewhat rude man enters the frame and stands directly over a water drain for over a minute. His unusual demeanor and apparently scripted behavior suggest a possible relationship to the CCTV operators.
The street location of the camera used for the Oxford Town Centre dataset can be easily confirmed using two visual clues in the video: the GAP store and the main road. The camera angle and field of view indicate that the camera was elevated and placed at the corner. The edge of the building is visible, and a small white nylon strap and pigeon deterrent spikes are visible on the upper perimeter of the building. Combined with the stability of the camera and the pigeon appearances in front of the camera at 1:24 and 3:29, these visual cues indicate that the camera was mounted outside on the corner of the building just above the deterrence spikes. Halfway through the video a peculiar and somewhat rude man enters the frame and stands directly over a water drain for over a minute. His unusual demeanor and apparently scripted behavior suggest a possible relationship to the CCTV operators. Several researchers have posted their demo videos using the Oxford Town Centre dataset on YouTube: [ PAGE UNDER DEVELOPMENT ]
This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
To help understand how Microsoft Celeb has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Microsoft Celebrity Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
Brainwash is a head detection dataset created from San Francisco's Brainwash Cafe livecam footage. It includes 11,918 images of the "everyday life of a busy downtown cafe" 1 captured at 100 second intervals throughout the entire day. The Brainwash dataset was captured during 3 days in 2014: October 27, November 13, and November 24. According to the authors' research paper introducing the dataset, the images were acquired with the help of Angelcam.com. 2 Brainwash is not a widely used dataset, but since its publication by Stanford University in 2015 it has notably appeared in several research papers from the National University of Defense Technology in Changsha, China. In 2016 and 2017, researchers there conducted studies on detecting people's heads in crowded scenes for the purpose of surveillance. 3 4 If you happened to be at Brainwash Cafe in San Francisco at any time on October 27, November 13, or November 24 in 2014, you are most likely included in the Brainwash dataset and have unwittingly contributed to surveillance research.
TODO [ page under development ]

The Oxford Town Centre dataset is a video of pedestrians in a busy downtown area in Oxford used for creating surveillance algorithms with potential applications in remote biometric analysis and non-cooperative face recognition. 1 The dataset was originally created to build algorithms that improve the stability of pedestrian detectors to provide more accurate head location estimates, leading to more accurate face recognition. The Oxford Town Centre dataset is unique in that it uses footage from a public CCTV camera that is designated for public safety. Since its publication in 2009, the Oxford Town Centre CCTV footage dataset, and all 2,200 people in the video, have been redistributed around the world for the purpose of surveillance research and development. There are over 80 verified research projects that have used the Oxford Town Centre dataset. The usage even extends to commercial organizations including Amazon, Disney, and OSRAM.

The street location of the camera used for the Oxford Town Centre dataset can be easily confirmed using two visual clues in the video: the GAP store and the main road (source). The camera angle and field of view indicate that the camera was elevated and placed at the corner. The edge of the building is visible, and a small white nylon strap and pigeon deterrent spikes are visible on the upper perimeter of the building. The field of view indicates the camera uses a wide angle lens. Combined with the camera's stability and pigeon appearances in front of the camera at 1:24 and 3:29, these visual cues indicate that the camera was mounted outside on the corner of the building just above the deterrence spikes.

Several researchers have posted their demo videos using the Oxford Town Centre dataset on YouTube: [ add visualization ]

TODO Benfold, Ben and Reid, Ian. "Stable Multi-Target Tracking in Real-Time Surveillance Video". CVPR 2011. Pages 3457-3464.

The location of the camera and subjects can be confirmed using several visual cues in the dataset images: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no parking sign and the directionality of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage all match images in the dataset. The original paper also provides another clue: a picture of the camera inside the office that was used to create the dataset. The window view in this image provides another match for the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the location on Google Maps

The UnConstrained College Students dataset is associated with two main research papers: "Large Scale Unconstrained Open Set Face Database" and "Unconstrained Face Detection and Open-Set Face Recognition Challenge". Collectively, these papers and the creation of the dataset have received funding from the following organizations:

TODO [ page under development ]

The Duke Multi-Target, Multi-Camera Tracking Dataset (MTMC) is a dataset of video recorded on Duke University campus for research and development of networked camera surveillance systems. MTMC tracking is used for citywide dragnet surveillance systems such as those used throughout China by SenseTime 1 and the oppressive monitoring of 2.5 million Uyghurs in Xinjiang by SenseNets 2. In fact, researchers from both SenseTime 4 5 and SenseNets 3 used the Duke MTMC dataset for their research.

The Duke MTMC dataset is unique because it is the largest publicly available MTMC and person re-identification dataset and has the longest duration of annotated video. In total, the Duke MTMC dataset provides over 14 hours of 1080p video from 8 synchronized surveillance cameras. 6 It is among the most widely used person re-identification datasets in the world. The approximately 2,700 unique people in the Duke MTMC videos, most of whom are students, are used for research and development of surveillance technologies by commercial, academic, and even defense organizations.

The creation and publication of the Duke MTMC dataset in 2016 was originally funded by the U.S. Army Research Laboratory and the National Science Foundation 6. Since 2016, use of the Duke MTMC dataset images has been publicly acknowledged in research funded by or on behalf of China's National University of Defense Technology 7 8, IARPA and IBM 9, and the U.S. Department of Homeland Security 10.

The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy". 6 Cameras 7 and 2 capture large groups of prospective students and children. Camera 5 was positioned to capture students as they enter and exit Duke University's main chapel. Each camera's location is documented below.
The Duke MTMC dataset paper mentions 2,700 identities, but their ground truth file only lists annotations for 1,812.

- https://foreignpolicy.com/2019/03/19/962492-orwell-china-socialcredit-surveillance/
- "Attention-Aware Compositional Network for Person Re-identification". 2018. Source
- "End-to-End Deep Kronecker-Product Matching for Person Re-identification". 2018. Source
- "Person Re-identification with Deep Similarity-Guided Graph Neural Network". 2018. Source
- "Performance Measures and a Data Set for
- "Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers". 2018. Source
- "Unsupervised Multi-Object Detection for Video Surveillance Using Memory-Based Recurrent Attention Networks". 2018. Source
- "Horizontal Pyramid Matching for Person Re-identification". 2019. Source
- "Re-Identification with Consistent Attentive Siamese Networks". 2018. Source

[ page under development ]
The Oxford Town Centre dataset is a CCTV video of pedestrians in a busy downtown area in Oxford used for research and development of activity and face recognition systems. 1 The CCTV video was obtained from a public surveillance camera at the corner of Cornmarket and Market St. in Oxford, England and includes approximately 2,200 people. Since its publication in 2009 2, the Oxford Town Centre dataset has been used in over 80 verified research projects, including commercial research by Amazon, Disney, OSRAM, and Huawei; and academic research in China, Israel, Russia, Singapore, the US, and Germany, among others. The Oxford Town Centre dataset is unique in that it uses footage from a public surveillance camera that would otherwise be designated for public safety. The video shows that the pedestrians act naturally and unrehearsed, indicating a lack of consent or notice of participation in the research project.

The street location of the camera used for the Oxford Town Centre dataset was confirmed by matching the road, benches, and store signs (source). At that location, two public CCTV cameras are mounted on the side of the Northgate House building at 13-20 Cornmarket St. Because of the directionality of the lower camera's mounting pole, a view from a private camera in the building across the street can be ruled out, as such a view would have to show more of the silhouette of the lower camera's mounting pole. Two options remain: either the public CCTV camera mounted to the side of the building was used, or the researchers mounted their own camera to the side of the building in the same location. Because the researchers used many other existing public CCTV cameras for their research projects, it is likely that they would also have been able to access this camera. Although this public CCTV camera is only seen pointing the other way in Google Street View images, at least one public photo shows the upper CCTV camera pointing in the same direction as in the Oxford Town Centre dataset, proving the camera can be and has been rotated.

As for the capture date, the text on the storefront display shows a sale happening from December 2nd – 7th, indicating the capture date was between or just before those dates. The capture year is either 2008 or 2007, since prior to 2007 the Carphone Warehouse (photo, history) did not exist at this location. Since the sweaters in the GAP window display are more similar to those in a GAP website snapshot from November 2007, our guess is that the footage was obtained during late November or early December 2007. The lack of street vendors and slight waste residue near the bench suggests that it was probably a weekday after rubbish removal.

Several researchers have posted their demo videos using the Oxford Town Centre dataset on YouTube: TODO

MegaPixels.cc Terms and Privacy

MegaPixels is an independent art and research project about the origins and ethics of publicly available face analysis image datasets. By accessing MegaPixels (the Service or Services) you agree to the terms and conditions set forth below. We reserve the right, at our sole discretion, to modify or replace these Terms at any time.
If a revision is material we will try to provide at least 30 days notice prior to any new terms taking effect. What constitutes a material change will be determined at our sole discretion. By continuing to access or use our Service after those revisions become effective, you agree to be bound by the revised terms. If you do not agree to the new terms, please stop using the Service. MegaPixels is an independent and academic art and research project about the origins and ethics of publicly available face analysis image datasets. By accessing MegaPixels (the Service or Services) you agree to the terms and conditions set forth below. The MegaPixels site has been designed to minimize the amount of network requests to 3rd party services and therefore prioritize the privacy of the viewer by only loading local dependencies. Additionaly, this site does not use any anaytics programs to monitor site viewers. In fact, the only data collected are the necessary server logs that used only for preventing misuse, which are deleteted at regular short-term intervals. The MegaPixels site has been designed to minimize the amount of network requests to 3rd party services and therefore prioritize the privacy of the viewer. This site does not use any local or external analytics programs to monitor site viewers. In fact, the only data collected are the necessary server logs used only for preventing misuse, which are deleted at short-term intervals. In order to provide certain features of the site, some 3rd party services are needed. Currently, the MegaPixels.cc site uses two 3rd party services: (1) Leaflet.js for the interactive map and (2 Digital Ocean Spaces as a condent delivery network. Both services encrypt your requests to their server using HTTPS and neither service requires storing any cookies or authentication. However, both services will store files in your web browser's local cache (local storage) to improve loading performance. 
None of these local storage files are using for analytics, cookie-like technologies, tracking, or any similar purpose. In order to provide certain features of the site, some 3rd party services are needed. Currently, the MegaPixels.cc site uses two 3rd party services: (1) Leaflet.js for the interactive map and (2) Digital Ocean Spaces as a content delivery network. Both services encrypt your requests to their server using HTTPS and neither service requires storing any cookies or authentication. However, both services will store files in your web browser's local cache (local storage) to improve loading performance. None of these local storage files are using for analytics, tracking, or any similar purpose. The MegaPixels.cc contains many links to 3rd party websites, especically in the list of citations that are provided for each dataset. This website has no control over and assumes no responsibility for, the content, privacy policies, or practices of any third party web sites or services. You further acknowledge and agree that megapixels.cc shall not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods or services available on or through any such web sites or services. The MegaPixels.cc contains many links to 3rd party websites, especially in the list of citations that are provided for each dataset. This website has no control over and assumes no responsibility for, the content, privacy policies, or practices of any third party web sites or services. You acknowledge and agree that megapixels.cc shall not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods or services available on or through any such web sites or services. 
We advise you to read the terms and conditions and privacy policies of any third-party web sites or services that you visit. While every intention is made to publish only verifiable information, at times existing information may be revised or deleted and new information may be added for clarity or correction. In no event will the operators of this site be liable for your use or misuse of the information provided. We may terminate or suspend access to our Service immediately, without prior notice or liability, for any reason whatsoever, including without limitation if you breach the Terms. All provisions of the Terms which by their nature should survive termination shall survive termination, including, without limitation, ownership provisions, warranty disclaimers, indemnity and limitations of liability. You may not access or use, or attempt to access or use, the Services to take any action that could harm us or a third party. You may not use the Services in violation of applicable laws or in violation of our or any third party’s intellectual property or other proprietary or legal rights. You further agree that you shall not attempt (or encourage or support anyone else's attempt) to circumvent, reverse engineer, decrypt, or otherwise alter or interfere with the Services, or any content thereof, or make any unauthorized use thereof. 
Without prior written consent, you shall not: (i) access any part of the Services, Content, data or information you do not have permission or authorization to access; (ii) use robots, spiders, scripts, services, software or any manual or automatic device, tool, or process designed to data mine or scrape the Content, data or information from the Services, or otherwise access or collect the Content, data or information from the Services using automated means; (iii) use services, software or any manual or automatic device, tool, or process designed to circumvent any restriction, condition, or technological measure that controls access to the Services in any way, including overriding any security feature or bypassing or circumventing any access controls or use limits of the Services; (iv) cache or archive the Content (except for a public search engine’s use of spiders for creating search indices); (v) take action that imposes an unreasonable or disproportionately large load on our network or infrastructure; and (vi) do anything that could disable, damage or change the functioning or appearance of the Services, including the presentation of advertising. Engaging in a prohibited use of the Services may result in civil, criminal, and/or administrative penalties, fines, or sanctions against the user and those assisting the user. Our failure to enforce any right or provision of these Terms will not be considered a waiver of those rights. If any provision of these Terms is held to be invalid or unenforceable by a court, the remaining provisions of these Terms will remain in effect. These Terms constitute the entire agreement between us regarding our Service, and supersede and replace any prior agreements we might have between us regarding the Service. 
You hereby indemnify, defend and hold harmless MegaPixels (and its creators) and all officers, directors, owners, agents, information providers, affiliates, licensors and licensees (collectively, the "Indemnified Parties") from and against any and all liability and costs, including, without limitation, reasonable attorneys' fees, incurred by the Indemnified Parties in connection with any claim arising out of any breach by you or any user of your account of these Terms of Service or the foregoing representations, warranties and covenants. You shall cooperate as fully as reasonably required in the defense of any such claim. We reserve the right, at our own expense, to assume the exclusive defense and control of any matter subject to indemnification by you. We reserve the right, at our sole discretion, to modify or replace these Terms at any time. By continuing to use or access our Service after revisions become effective, you agree to be bound by the revised terms. If you do not agree to revised terms, please do not use the Service. [ page under development ] The Duke Multi-Target, Multi-Camera Tracking Dataset (MTMC) is a dataset of video recorded on Duke University campus for research and development of networked camera surveillance systems. MTMC tracking is used for citywide dragnet surveillance systems such as those used throughout China by SenseTime 1 and the oppressive monitoring of 2.5 million Uyghurs in Xinjiang by SenseNets 2. In fact, researchers from both SenseTime 4 5 and SenseNets 3 used the Duke MTMC dataset for their research. The Duke MTMC dataset is unique because it is the largest publicly available MTMC and person re-identification dataset and has the longest duration of annotated video. 
In total, the Duke MTMC dataset provides over 14 hours of 1080p video from 8 synchronized surveillance cameras. 6 It is among the most widely used person re-identification datasets in the world. The approximately 2,700 unique people in the Duke MTMC videos, most of whom are students, are used for research and development of surveillance technologies by commercial, academic, and even defense organizations. The creation and publication of the Duke MTMC dataset in 2016 was originally funded by the U.S. Army Research Laboratory and the National Science Foundation 6. Since 2016, use of the Duke MTMC dataset images has been publicly acknowledged in research funded by or on behalf of the Chinese National University of Defense 7 8, IARPA and IBM 9, and the U.S. Department of Homeland Security 10. The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy". 6 Cameras 7 and 2 capture large groups of prospective students and children. Camera 5 was positioned to capture students as they enter and exit Duke University's main chapel. Each camera's location is documented below. "Attention-Aware Compositional Network for Person Re-identification". 2018. Source "End-to-End Deep Kronecker-Product Matching for Person Re-identification". 2018. Source "Person Re-identification with Deep Similarity-Guided Graph Neural Network". 2018. Source "Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking". 2016. 
Source "Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers". 2018. Source "Unsupervised Multi-Object Detection for Video Surveillance Using Memory-Based Recurrent Attention Networks". 2018. Source "Horizontal Pyramid Matching for Person Re-identification". 2019. Source The Oxford Town Centre dataset is a CCTV video of pedestrians in a busy downtown area in Oxford used for research and development of activity and face recognition systems. 1 The CCTV video was obtained from a public surveillance camera at the corner of Cornmarket and Market St. in Oxford, England and includes approximately 2,200 people. Since its publication in 2009 2 the Oxford Town Centre dataset has been used in over 80 verified research projects, including commercial research by Amazon, Disney, OSRAM, and Huawei; and academic research in China, Israel, Russia, Singapore, the US, and Germany, among dozens more. The Oxford Town Centre dataset is unique in that it uses footage from a public surveillance camera that would otherwise be designated for public safety. 
The video shows that the pedestrians act normally and unrehearsed, indicating they neither knew of nor consented to participation in the research project. "readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385. Stewart, Russel. Andriluka, Mykhaylo. "End-to-end people detection in crowded scenes". 2016. Li, Y. and Dou, Y. and Liu, X. and Li, T. Localized Region Context and Object Feature Fusion for People Head Detection. ICIP16 Proceedings. 2016. Pages 594-598. Zhao, X., Wang, Y., Dou, Y. A Replacement Algorithm of Non-Maximum Suppression Based on Graph Clustering. The Duke MTMC dataset paper mentions 2,700 identities, but their ground truth file only lists annotations for 1,812. https://foreignpolicy.com/2019/03/19/962492-orwell-china-socialcredit-surveillance/ "Attention-Aware Compositional Network for Person Re-identification". 2018. Source "End-to-End Deep Kronecker-Product Matching for Person Re-identification". 2018. Source "Unsupervised Multi-Object Detection for Video Surveillance Using Memory-Based Recurrent Attention Networks". 2018. Source "Horizontal Pyramid Matching for Person Re-identification". 2019. Source "Re-Identification with Consistent Attentive Siamese Networks". 2018. Source Jingtuo Liu, Yafeng Deng, Tao Bai, Zhengping Wei, Chang Huang. Targeting Ultimate Accuracy: Face Recognition via Deep Embedding. https://arxiv.org/abs/1506.07310 Lee, Justin. "PING AN Tech facial recognition receives high score in latest LFW test results". BiometricUpdate.com. Feb 13, 2017. https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results 
Benfold, Ben and Reid, Ian. "Stable Multi-Target Tracking in Real-Time Surveillance Video". CVPR 2011. Pages 3457-3464. Sapkota, Archana and Boult, Terrance. "Large Scale Unconstrained Open Set Face Database." 2013. is a Berlin-based American artist and researcher. His previous projects (CV Dazzle, Stealth Wear, and SkyLift) explore the potential for counter-surveillance as artwork. He is the founder of VFRAME (visual forensics software for human rights groups) and is currently a researcher in residence at Karlsruhe HfG. is an American technologist and artist also based in Berlin. He was previously the CTO of a digital agency in NYC and now also works at VFRAME, developing computer vision and data analysis software for human rights groups. Jules also builds experimental software for artists and musicians.
-
+
+ If you use our data, research, or graphics, please cite our work:
+
+ MegaPixels is an independent art and research project by Adam Harvey and Jules LaPlace investigating the ethics and individual privacy implications of publicly available face recognition datasets, and their role in industry and governmental expansion into biometric surveillance technologies. The MegaPixels site is made possible with support from Mozilla. is a Berlin-based American artist and researcher. His previous projects (CV Dazzle, Stealth Wear, and SkyLift) explore the potential for counter-surveillance as artwork. He is the founder of VFRAME (visual forensics software for human rights groups) and is currently a researcher in residence at Karlsruhe HfG. is an American technologist and artist also based in Berlin. He was previously the CTO of a digital agency in NYC and now also works at VFRAME, developing computer vision and data analysis software for human rights groups. Jules also builds experimental software for artists and musicians.
- MegaPixels.cc is a research project about publicly available face recognition datasets. This website is based, in part, on earlier installations and research about facial recognition datasets. Since then it has evolved into a large-scale survey of publicly-available face and person analysis datasets. Initially this site was planned as a facial recognition tool to search the datasets. After building several prototypes using over 1 million face images from these datasets, it became clear that facial recognition was merely a face similarity search. The results were not accurate enough to align with the goals of this website: to promote responsible use of data and expose existing and past ethical breaches. An academic report and presentation on the findings of this project is forthcoming. Throughout 2019, this site will be updated with more datasets and research reports on the general themes of remote biometric analysis and media collected "in the wild". The continued research on MegaPixels is supported by a 1 year Researcher-in-Residence grant from Karlsruhe HfG (2019-2020). When possible, and once thoroughly verified, data generated for MegaPixels will be made available for download on github.com/adamhrv/megapixels Please direct questions, comments, or feedback to mastodon.social/@adamhrv MegaPixels.cc Terms and Privacy MegaPixels is an independent and academic art and research project about the origins and ethics of publicly available face analysis image datasets. By accessing MegaPixels (the Service or Services) you agree to the terms and conditions set forth below. The MegaPixels site has been designed to minimize the number of network requests to 3rd party services and therefore prioritize the privacy of the viewer. This site does not use any local or external analytics programs to monitor site viewers. In fact, the only data collected are the necessary server logs, used only for preventing misuse, which are deleted at short-term intervals. 
In order to provide certain features of the site, some 3rd party services are needed. Currently, the MegaPixels.cc site uses two 3rd party services: (1) Leaflet.js for the interactive map and (2) Digital Ocean Spaces as a content delivery network. Both services encrypt your requests to their servers using HTTPS and neither service requires storing any cookies or authentication. However, both services will store files in your web browser's local cache (local storage) to improve loading performance. None of these local storage files are used for analytics, tracking, or any similar purpose. The MegaPixels.cc site contains many links to 3rd party websites, especially in the list of citations provided for each dataset. This website has no control over, and assumes no responsibility for, the content, privacy policies, or practices of any third party web sites or services. You acknowledge and agree that megapixels.cc shall not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods or services available on or through any such web sites or services. We advise you to read the terms and conditions and privacy policies of any third-party web sites or services that you visit. While every intention is made to publish only verifiable information, at times existing information may be revised or deleted and new information may be added for clarity or correction. In no event will the operators of this site be liable for your use or misuse of the information provided. We may terminate or suspend access to our Service immediately, without prior notice or liability, for any reason whatsoever, including without limitation if you breach the Terms. All provisions of the Terms which by their nature should survive termination shall survive termination, including, without limitation, ownership provisions, warranty disclaimers, indemnity and limitations of liability. 
You may not access or use, or attempt to access or use, the Services to take any action that could harm us or a third party. You may not use the Services in violation of applicable laws or in violation of our or any third party’s intellectual property or other proprietary or legal rights. You further agree that you shall not attempt (or encourage or support anyone else's attempt) to circumvent, reverse engineer, decrypt, or otherwise alter or interfere with the Services, or any content thereof, or make any unauthorized use thereof. Without prior written consent, you shall not: (i) access any part of the Services, Content, data or information you do not have permission or authorization to access; (ii) use robots, spiders, scripts, services, software or any manual or automatic device, tool, or process designed to data mine or scrape the Content, data or information from the Services, or otherwise access or collect the Content, data or information from the Services using automated means; (iii) use services, software or any manual or automatic device, tool, or process designed to circumvent any restriction, condition, or technological measure that controls access to the Services in any way, including overriding any security feature or bypassing or circumventing any access controls or use limits of the Services; (iv) cache or archive the Content (except for a public search engine’s use of spiders for creating search indices); (v) take action that imposes an unreasonable or disproportionately large load on our network or infrastructure; and (vi) do anything that could disable, damage or change the functioning or appearance of the Services, including the presentation of advertising. Engaging in a prohibited use of the Services may result in civil, criminal, and/or administrative penalties, fines, or sanctions against the user and those assisting the user. 
These Terms shall be governed and construed in accordance with the laws of Berlin, Germany, without regard to its conflict of law provisions. Our failure to enforce any right or provision of these Terms will not be considered a waiver of those rights. If any provision of these Terms is held to be invalid or unenforceable by a court, the remaining provisions of these Terms will remain in effect. These Terms constitute the entire agreement between us regarding our Service, and supersede and replace any prior agreements we might have between us regarding the Service. You hereby indemnify, defend and hold harmless MegaPixels (and its creators) and all officers, directors, owners, agents, information providers, affiliates, licensors and licensees (collectively, the "Indemnified Parties") from and against any and all liability and costs, including, without limitation, reasonable attorneys' fees, incurred by the Indemnified Parties in connection with any claim arising out of any breach by you or any user of your account of these Terms of Service or the foregoing representations, warranties and covenants. You shall cooperate as fully as reasonably required in the defense of any such claim. We reserve the right, at our own expense, to assume the exclusive defense and control of any matter subject to indemnification by you. We reserve the right, at our sole discretion, to modify or replace these Terms at any time. By continuing to use or access our Service after revisions become effective, you agree to be bound by the revised terms. If you do not agree to revised terms, please do not use the Service. [ page under development ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how the 50 People One Question Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing 50 People One Question was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- [ page under development ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how the Asian Face Age Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Asian Face Age Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- The Asian Face Age Dataset (AFAD) is a new dataset proposed for evaluating the performance of age estimation, which contains more than 160K facial images and the corresponding age and gender labels. This dataset is oriented to age estimation on Asian faces, so all the facial images are of Asian faces. It is noted that the AFAD is the biggest dataset for age estimation to date. It is well suited to evaluate how deep learning methods can be adopted for age estimation.
-Motivation: For age estimation, there are several public datasets for evaluating the performance of a specific algorithm, such as FG-NET [1] (1002 face images), MORPH I (1690 face images), and MORPH II [2] (55,608 face images). Among them, the MORPH II is the biggest public dataset to date. On the other hand, as we know, it is necessary to collect a large-scale dataset to train a deep Convolutional Neural Network. Therefore, the MORPH II dataset is extensively used to evaluate how deep learning methods can be adopted for age estimation [3][4]. However, the ethnicity distribution of the MORPH II dataset is very unbalanced, i.e., less than 1% of its faces are Asian. In order to evaluate the previous methods for age estimation on Asian faces, the Asian Face Age Dataset (AFAD) was proposed. There are 164,432 well-labeled photos in the AFAD dataset. It consists of 63,680 photos of females as well as 100,752 photos of males, and the ages range from 15 to 40. The distribution of photo counts for distinct ages is illustrated in the figure above. Some samples are shown in the figure on the top. Its download link is provided in the "Download" section. In addition, we also provide a subset of the AFAD dataset, called AFAD-Lite, which only contains PLACEHOLDER well-labeled photos. It consists of PLACEHOLDER photos of females as well as PLACEHOLDER photos of males, and the ages range from 15 to 40. The distribution of photo counts for distinct ages is illustrated in Fig. PLACEHOLDER. Its download link is also provided in the "Download" section. The AFAD dataset is built by collecting selfie photos on a particular social network -- RenRen Social Network (RSN) [5]. The RSN is widely used by Asian students, including middle school, high school, undergraduate, and graduate students. Even after leaving school, some people still access their RSN account to connect with their old classmates. So, the ages of RSN users cover a wide range, from 15 to more than 40 years old. 
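The photo counts quoted above are internally consistent and can be checked directly. A minimal sketch (the figures are taken from the text above; the derived male share is an illustration, not a published statistic):

```python
# Sanity-check the AFAD counts quoted above.
female_photos = 63_680
male_photos = 100_752
total_photos = 164_432

# The per-gender counts sum to the stated total
assert female_photos + male_photos == total_photos

# Derived (illustrative): share of male photos in the dataset
print(f"male share: {male_photos / total_photos:.1%}")  # → male share: 61.3%
```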
Please notice that this dataset is made available for academic research purposes only. RESEARCH below this line The motivation for the AFLW database is the need for a large-scale, multi-view, real-world face database with annotated facial features. We gathered the images on Flickr using a wide range of face-relevant tags (e.g., face, mugshot, profile face). The downloaded set of images was manually scanned for images containing faces. The key data and most important properties of the database are: https://www.tugraz.at/institute/icg/research/team-bischof/lrs/downloads/aflw/ Brainwash is a head detection dataset created from San Francisco's Brainwash Cafe livecam footage. It includes 11,918 images of "everyday life of a busy downtown cafe" 1 captured at 100-second intervals throughout the entire day. The Brainwash dataset was captured during 3 days in 2014: October 27, November 13, and November 24. According to the authors' research paper introducing the dataset, the images were acquired with the help of Angelcam.com. 2 Brainwash is not a widely used dataset, but since its publication by Stanford University in 2015, it has notably appeared in several research papers from the National University of Defense Technology in Changsha, China. In 2016 and in 2017 researchers there conducted studies on detecting people's heads in crowded scenes for the purpose of surveillance. 3 4 If you happened to be at Brainwash cafe in San Francisco at any time on October 27, November 13, or November 24 in 2014, you are most likely included in the Brainwash dataset and have unwittingly contributed to surveillance research.
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how the Brainwash Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Brainwash Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
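The geocoding step described above can be sketched in a few lines. This is a hypothetical illustration, not the project's actual pipeline: the institution names, coordinates, and `geocode_citation` function below are invented for the example.

```python
# Illustrative sketch of the citation-geocoding step described above.
# The institutions and coordinates are hypothetical examples, not project data.

INSTITUTION_COORDS = {
    # institution name as found in PDF front matter -> (latitude, longitude)
    "Stanford University": (37.4275, -122.1697),
    "Duke University": (36.0014, -78.9382),
}

def geocode_citation(institution):
    """Return (lat, lon) for a known institution, or None if not yet verified."""
    return INSTITUTION_COORDS.get(institution)

print(geocode_citation("Duke University"))  # → (36.0014, -78.9382)
print(geocode_citation("Unknown Lab"))      # → None
```

In practice each lookup would be backed by a manually verified list of institutions, so papers whose affiliations cannot be confirmed are left off the map rather than guessed.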
- TODO "readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385. Stewart, Russel. Andriluka, Mykhaylo. "End-to-end people detection in crowded scenes". 2016. Li, Y. and Dou, Y. and Liu, X. and Li, T. Localized Region Context and Object Feature Fusion for People Head Detection. ICIP16 Proceedings. 2016. Pages 594-598. Zhao. X, Wang Y, Dou, Y. A Replacement Algorithm of Non-Maximum Suppression Base on Graph Clustering. [ page under development ]
- The dataset contains images of people collected from the web by typing common given names into Google Image Search. The coordinates of the eyes, the nose and the center of the mouth for each frontal face are provided in a ground truth file. This information can be used to align and crop the human faces or as a ground truth for a face detection algorithm. The dataset has 10,524 human faces of various resolutions and in different settings, e.g. portrait images, groups of people, etc. Profile faces or very low resolution faces are not labeled. [ PAGE UNDER DEVELOPMENT ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how the CelebA Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Large-scale CelebFaces Attributes Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- [ PAGE UNDER DEVELOPMENT ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how the COFW Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Caltech Occluded Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- COFW "is designed to benchmark face landmark algorithms in realistic conditions, which include heavy occlusions and large shape variations" [Robust face landmark estimation under occlusion]. We asked four people with different levels of computer vision knowledge to each collect 250 faces representative of typical real-world images, with the clear goal of challenging computer vision methods.
-The result is 1,007 images of faces obtained from a variety of sources. Robust face landmark estimation under occlusion Our face dataset is designed to present faces in real-world conditions. Faces show large variations in shape and occlusions due to differences in pose, expression, use of accessories such as sunglasses and hats and interactions with objects (e.g. food, hands, microphones, etc.). All images were hand annotated in our lab using the same 29 landmarks as in LFPW. We annotated both the landmark positions as well as their occluded/unoccluded state. The faces are occluded to different degrees, with large variations in the type of occlusions encountered. COFW has an average occlusion of over 23%.
-To increase the number of training images, and since COFW has the exact same landmarks as LFPW, for training we use the original non-augmented 845 LFPW faces + 500 COFW faces (1345 total), and for testing the remaining 507 COFW faces. To make sure all images had occlusion labels, we annotated occlusion on the available 845 LFPW training images, finding an average of only 2% occlusion. http://www.vision.caltech.edu/xpburgos/ICCV13/ This research is supported by NSF Grant 0954083 and by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA R&D Contract No. 2014-14071600012.
- To help understand how the COFW Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Caltech Occluded Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the location markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
-
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
- The Duke Multi-Target, Multi-Camera Tracking Dataset (MTMC) is a dataset of video recorded on the Duke University campus for research and development of networked camera surveillance systems. MTMC tracking is used for citywide dragnet surveillance systems such as those used throughout China by SenseTime 1 and the oppressive monitoring of 2.5 million Uyghurs in Xinjiang by SenseNets 2. In fact, researchers from both SenseTime 4 5 and SenseNets 3 used the Duke MTMC dataset for their research. The Duke MTMC dataset is unique because it is the largest publicly available MTMC and person re-identification dataset and has the longest duration of annotated video. In total, the Duke MTMC dataset provides over 14 hours of 1080p video from 8 synchronized surveillance cameras. 6 It is among the most widely used person re-identification datasets in the world. The approximately 2,700 unique people in the Duke MTMC videos, most of whom are students, are used for research and development of surveillance technologies by commercial, academic, and even defense organizations. The creation and publication of the Duke MTMC dataset in 2016 was originally funded by the U.S. Army Research Laboratory and the National Science Foundation 6. Since 2016, use of the Duke MTMC dataset images has been publicly acknowledged in research funded by or on behalf of the Chinese National University of Defense 7 8, IARPA and IBM 9, and the U.S. Department of Homeland Security 10. The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy". 6 Cameras 7 and 2 capture large groups of prospective students and children. Camera 5 was positioned to capture students as they enter and exit Duke University's main chapel. Each camera's location is documented below.
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how the Duke MTMC Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Duke Multi-Target, Multi-Camera Tracking Project was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- The Duke MTMC dataset paper mentions 2,700 identities, but its ground truth file only lists annotations for 1,812.
https://foreignpolicy.com/2019/03/19/962492-orwell-china-socialcredit-surveillance/
"Attention-Aware Compositional Network for Person Re-identification". 2018. Source
"End-to-End Deep Kronecker-Product Matching for Person Re-identification". 2018. Source
"Person Re-identification with Deep Similarity-Guided Graph Neural Network". 2018. Source
"Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking". 2016. Source
"Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers". 2018. Source
"Unsupervised Multi-Object Detection for Video Surveillance Using Memory-Based Recurrent Attention Networks". 2018. Source
"Horizontal Pyramid Matching for Person Re-identification". 2019. Source
"Re-Identification with Consistent Attentive Siamese Networks". 2018. Source
The FERET program is sponsored by the U.S. Department of Defense's Counterdrug Technology Development Program Office. The U.S. Army Research Laboratory (ARL) is the technical agent for the FERET program. ARL designed, administered, and scored the FERET tests. George Mason University collected, processed, and maintained the FERET database. Inquiries regarding the FERET database or test should be directed to P. Jonathon Phillips.
Explore publicly available facial recognition datasets. More datasets will be added throughout 2019.
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how LFPW has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Face Parts in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- RESEARCH below this line
Release 1 of LFPW consists of 1,432 faces from images downloaded from the web using simple text queries on sites such as google.com, flickr.com, and yahoo.com. Each image was labeled by three MTurk workers, and 29 fiducial points, shown below, are included in the dataset. LFPW was originally described in the following publication. According to the dataset's authors: "Due to copyright issues, we cannot distribute image files in any format to anyone. Instead, we have made available a list of image URLs where you can download the images yourself. We realize that this makes it impossible to exactly compare numbers, as image links will slowly disappear over time, but we have no other option. This seems to be the way other large web-based databases are evolving." https://neerajkumar.org/databases/lfpw/ This research was performed at Kriegman-Belhumeur Vision Technologies and was funded by the CIA through the Office of the Chief Scientist. https://www.cs.cmu.edu/~peiyunh/topdown/ (nk_cvpr2011_faceparts.pdf)
[ PAGE UNDER DEVELOPMENT ]
Labeled Faces in the Wild (LFW) is "a database of face photographs designed for studying the problem of unconstrained face recognition" 1. It is used to evaluate and improve the performance of facial recognition algorithms in academic, commercial, and government research. According to BiometricUpdate.com 3, LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong." The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002-2004. LFW is a subset of Names and Faces and is part of the first facial recognition training dataset created entirely from images appearing on the Internet. The people appearing in LFW are...
The Names and Faces dataset was the first face recognition dataset created entirely from online photos. However, Names and Faces and LFW are not the first face recognition datasets created entirely "in the wild". That title belongs to the UCD dataset. Images obtained "in the wild" means using an image without explicit consent or awareness from the subject or photographer.
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how LFW has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- Add a paragraph about how usage extends far beyond academia into the research centers of the largest companies in the world, and even funnels into CIA-funded research in the US and defense industry usage in China.
Jingtuo Liu, Yafeng Deng, Tao Bai, Zhengping Wei, Chang Huang. "Targeting Ultimate Accuracy: Face Recognition via Deep Embedding". https://arxiv.org/abs/1506.07310
Lee, Justin. "PING AN Tech facial recognition receives high score in latest LFW test results". BiometricUpdate.com. Feb 13, 2017. https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results
[ PAGE UNDER DEVELOPMENT ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how Market 1501 has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Market 1501 Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- @inproceedings{zheng2016mars,
-  title={MARS: A Video Benchmark for Large-Scale Person Re-identification},
-  author={Zheng, Liang and Bie, Zhi and Sun, Yifan and Wang, Jingdong and Su, Chi and Wang, Shengjin and Tian, Qi},
-  booktitle={European Conference on Computer Vision},
-  year={2016},
-  organization={Springer}
-}
- [ PAGE UNDER DEVELOPMENT ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how Microsoft Celeb has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Microsoft Celebrity Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- "readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385.
Li, Y., Dou, Y., Liu, X., and Li, T. "Localized Region Context and Object Feature Fusion for People Head Detection". ICIP16 Proceedings. 2016. Pages 594-598.
Zhao, X., Wang, Y., and Dou, Y. "A Replacement Algorithm of Non-Maximum Suppression Based on Graph Clustering".
The Oxford Town Centre dataset is a CCTV video of pedestrians in a busy downtown area in Oxford used for research and development of activity and face recognition systems. 1 The CCTV video was obtained from a public surveillance camera at the corner of Cornmarket and Market St. in Oxford, England and includes approximately 2,200 people. Since its publication in 2009 2, the Oxford Town Centre dataset has been used in over 80 verified research projects, including commercial research by Amazon, Disney, OSRAM, and Huawei; and academic research in China, Israel, Russia, Singapore, the US, and Germany, among dozens more. The Oxford Town Centre dataset is unique in that it uses footage from a public surveillance camera that would otherwise be designated for public safety. The video shows that the pedestrians act normally and unrehearsed, indicating they neither knew of nor consented to participation in the research project.
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how TownCentre has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Oxford Town Centre was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- The street location of the camera used for the Oxford Town Centre dataset was confirmed by matching the road, benches, and store signs (source). At that location, two public CCTV cameras are mounted on the side of the Northgate House building at 13-20 Cornmarket St. A view from a private camera in the building across the street can be ruled out because, given the directionality of the lower camera's mounting pole, such a view would have to show more of the pole's silhouette. Two options remain: either the public CCTV camera mounted to the side of the building was used, or the researchers mounted their own camera to the side of the building in the same location. Because the researchers used many other existing public CCTV cameras for their research projects, it is likely that they would also have access to this camera. Although this public CCTV camera is only seen pointing the other way in Google Street View images, at least one public photo shows the upper CCTV camera pointing in the same direction as in the Oxford Town Centre dataset, proving the camera can be and has been rotated before. As for the capture date, the text on the storefront display shows a sale happening from December 2nd–7th, indicating the capture date was between or just before those dates. The capture year is either 2007 or 2008, since prior to 2007 the Carphone Warehouse (photo, history) did not exist at this location. Since the sweaters in the GAP window display are more similar to those in a GAP website snapshot from November 2007, our guess is that the footage was obtained during late November or early December 2007. The lack of street vendors and the slight waste residue near the bench suggest that it was probably a weekday after rubbish removal. Several researchers have posted their demo videos using the Oxford Town Centre dataset on YouTube: [ PAGE UNDER DEVELOPMENT ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how the PIPA Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the People in Photo Albums Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- [ PAGE UNDER DEVELOPMENT ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how PubFig has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Public Figures Face Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- [ page under development ] UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at the University of Colorado Colorado Springs. According to the authors of two papers associated with the dataset, subjects were "photographed using a long-range high-resolution surveillance camera without their knowledge" 2. To create the dataset, the researchers used a Canon 7D digital camera fitted with a Sigma 800mm telephoto lens and photographed students 150–200m away through their office window. Photos were taken during the morning and afternoon while students were walking to and from classes. The primary uses of this dataset are to train, validate, and build recognition and face detection algorithms for realistic surveillance scenarios. What makes the UCCS dataset unique is that it includes the highest resolution images of any publicly available face recognition dataset discovered so far (18MP), that it was captured on a campus without consent or awareness using a long-range telephoto lens, and that it was funded by United States defense and intelligence agencies. Combined funding sources for the creation of the initial and final release of the dataset include the ODNI (Office of the Director of National Intelligence), IARPA (Intelligence Advanced Research Projects Activity), ONR MURI (Office of Naval Research and the Department of Defense Multidisciplinary University Research Initiative), Army SBIR (Small Business Innovation Research), SOCOM SBIR (Special Operations Command and Small Business Innovation Research), and the National Science Foundation. 1 2 In 2017 the UCCS face dataset was used for a defense and intelligence agency funded face recognition challenge at the International Joint Biometrics Conference in Denver, CO. And in 2018 the dataset was used for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Computer Vision Conference (ECCV) in Munich, Germany. 
Additional research projects that have used the UCCS dataset are included below in the list of verified citations.
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how UCCS has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the UnConstrained College Students Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- The images in UCCS were taken on 18 non-consecutive days during 2012–2013. Analysis of the EXIF data embedded in the original images reveals that most of the images were taken on Tuesdays, and the most frequent capture time throughout the week was 12:30PM. The location of the camera and subjects can be confirmed using several visual cues in the dataset images: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no-parking sign and the directionality of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage all match images in the dataset. The original papers also provide another clue: a picture of the camera inside the office that was used to create the dataset. The window view in this image provides another match for the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the location on Google Maps. The UnConstrained College Students dataset is associated with two main research papers: "Large Scale Unconstrained Open Set Face Database" and "Unconstrained Face Detection and Open-Set Face Recognition Challenge". Collectively, these papers and the creation of the dataset have received funding from the following organizations: If you attended the University of Colorado Colorado Springs and were captured by the long-range surveillance camera used to create this dataset, there is unfortunately currently no way to be removed. The authors do not provide any options for students to opt out, nor were students informed they would be used for training face recognition. According to the authors, the lack of any consent or knowledge of participation is what provides part of the value of the UnConstrained College Students Dataset. 
Please direct any questions about the ethics of the dataset to the University of Colorado Colorado Springs Ethics and Compliance Office. For further technical information about the dataset, visit the UCCS dataset project page. [ page under development ]
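The EXIF-based capture-day analysis described above can be sketched in a few lines, assuming the DateTimeOriginal strings have already been extracted from the images with an EXIF reader. The timestamps, list name, and `tally_capture_days` helper below are hypothetical illustrations, not the project's actual tooling or data.

```python
from collections import Counter
from datetime import datetime

# EXIF DateTimeOriginal values use the "YYYY:MM:DD HH:MM:SS" format.
# These timestamps are hypothetical placeholders, not actual UCCS data.
exif_timestamps = [
    "2012:10:02 12:30:14",  # a Tuesday
    "2012:10:09 12:31:02",  # a Tuesday
    "2013:02:12 09:15:40",  # a Tuesday
]

def tally_capture_days(timestamps):
    """Count how many photos were captured on each weekday."""
    days = Counter()
    for ts in timestamps:
        dt = datetime.strptime(ts, "%Y:%m:%d %H:%M:%S")
        days[dt.strftime("%A")] += 1
    return days

print(tally_capture_days(exif_timestamps))
```

Tallying weekdays and capture hours this way across all original images is how a most-frequent day (Tuesday) and time (12:30PM) could be recovered.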
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how the Brainwash Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Brainwash Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- [ page under development ] VIPeR (Viewpoint Invariant Pedestrian Recognition) is a dataset of pedestrian images captured at the University of California Santa Cruz in 2007. According to the researchers 2, "cameras were placed in different locations in an academic setting and subjects were notified of the presence of cameras, but were not coached or instructed in any way." VIPeR is among the most widely used publicly available person re-identification datasets. In 2017 the VIPeR dataset was combined into a larger person re-identification dataset created by the Chinese University of Hong Kong called PETA (PEdesTrian Attribute).
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how VIPeR has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Viewpoint Invariant Pedestrian Recognition was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- [ page under development ]
- This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
-
- To help understand how YouTube Celebrities has been used around the world by commercial, military, and academic organizations, existing publicly available research citing YouTube Celebrities was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
-
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
- Facial recognition is a scam. During the last 20 years, commercial, academic, and governmental agencies have promoted the false dream of a future with face recognition. This essay debunks the popular myth that such a thing ever existed. There is no such thing as face recognition. For the last 20 years, government agencies, commercial organizations, and academic institutions have played the public for a fool, selling a roadmap of the future that simply does not exist. Facial recognition, as it is currently defined, promoted, and sold to the public, government, and commercial sector, is a scam. Committed to developing robust solutions with superhuman accuracy, the industry has repeatedly undermined itself by never actually developing anything close to "face recognition". There is only biased feature vector clustering and probabilistic thresholding.
Ever since government agencies began developing face recognition in the early 1960s, datasets of face images have been central to developing and validating face recognition technologies. Today, these datasets no longer originate in labs, but instead come from family photo albums posted on photo-sharing sites, surveillance camera footage from college campuses, search engine queries for celebrities, cafe livestreams, and videos on YouTube. During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that is powering the global facial recognition industry. While many of these datasets include public figures such as politicians, athletes, and actors, they also include many non-public figures: digital activists, students, pedestrians, and semi-private shared photo albums are all considered "in the wild" and fair game for research projects.
Some images are used under Creative Commons licenses, yet others were taken in unconstrained scenarios without awareness or consent. At first glance it appears many of the datasets were created for seemingly harmless academic research, but when examined further it becomes clear that they are also used by foreign defense agencies.
The MegaPixels site is based on an earlier installation (also supported by Mozilla) at the Tactical Tech Glassroom in London in 2017; a commission from the Elevate arts festival, curated by Berit Gilma, about pedestrian recognition datasets in 2018; and research during CV Dazzle from 2010-2015. Through the many prototypes, conversations, pitches, PDFs, and false starts this project has endured during the last 5 years, it eventually evolved into something much different than originally imagined. Now, as datasets become increasingly influential in shaping the computational future, it's clear that they must be critically analyzed to understand their biases, shortcomings, funding sources, and contributions to the surveillance industry. However, it's misguided to only criticize these datasets for their flaws without also praising their contribution to society. Without publicly available facial analysis datasets there would be less public discourse, less open-source software, and less peer-reviewed research. Public datasets can indeed become a vital public good for the information economy, but as this project aims to illustrate, many ethical questions arise about consent, intellectual property, surveillance, and privacy.
Ever since the first computational facial recognition research project by the CIA in the early 1960s, data has played a vital role in the development of our biometric future. Without facial recognition datasets there would be no facial recognition. Datasets are an indispensable part of any artificial intelligence system because, as Geoffrey Hinton points out: Our relationship to computers has changed. 
Instead of programming them, we now show them and they figure it out. - Geoffrey Hinton

Algorithms learn from datasets. And we program algorithms by building datasets. But datasets aren't like code. There's no programming language made of data except for the data itself.

It was the early 2000s. Face recognition was new and no one seemed sure exactly how well it was going to perform in practice. In theory, face recognition was poised to be a game changer, a force multiplier, a strategic military advantage, a way to make cities safer and to secure borders. This was the future John Ashcroft demanded with the Total Information Awareness act of 2003 and that spooks had dreamed of for decades. It was a future that academics at Carnegie Mellon University and Colorado State University would help build. It was also a future that celebrities would play a significant role in building. And to the surprise of ordinary Internet users like myself and perhaps you, it was a future that millions of Internet users would unwittingly play a role in creating. Now the future has arrived and it doesn't make sense. Facial recognition works yet it doesn't actually work. Facial recognition is cheap and accessible but also expensive and out of control. Facial recognition research has achieved headline-grabbing superhuman accuracies over 99.9%, yet facial recognition is also dangerously inaccurate. During a trial installation at Südkreuz station in Berlin in 2018, 20% of the matches were wrong, an error rate so high that such a system should have no connection to law enforcement or justice. And in London, the Metropolitan Police had been using facial recognition software that mistakenly identified an alarming 98% of people as criminals 1, which perhaps is a crime itself. MegaPixels is an online art project that explores the history of facial recognition from the perspective of datasets.
To paraphrase the artist Trevor Paglen, whoever controls the dataset controls the meaning. MegaPixels aims to unravel the meanings behind the data and expose the darker corners of the biometric industry that have contributed to its growth. MegaPixels does not start with a conclusion or a moralistic slant. Whether or not to build facial recognition is a question that can no longer be asked. As an outspoken critic of face recognition I've developed, and hopefully furthered, my understanding during the last 10 years I've spent working with computer vision. Though I initially disagreed, I've come to see this technocratic perspective as a non-negotiable reality. As Oren wrote in a NYT Op-Ed (nytimes article), "the horse is out of the barn" and the only thing we can do, collectively or individually, is to steer towards the least-worst outcome. Computational communication has entered a new era and it's both exciting and frightening to explore the potentials and opportunities. In 1997, getting access to 1 teraFLOPS of computational power would have cost you $55 million and required a strategic partnership with the Department of Defense. At the time of writing, anyone can rent 1 teraFLOPS on a cloud GPU marketplace for less than $1/day. 2. I hope that this project will illuminate the darker areas of the strange world of facial recognition that have not yet received attention and encourage discourse in academic, industry, and other settings. By no means do I believe discourse can save the day. Nor do I think creating artwork can. In fact, I'm not exactly sure what the outcome of this project will be. The project is not so much what I publish here but what happens after. This entire project is only a prologue. As McLuhan wrote, "You can't have a static, fixed position in the electric age". And in our hyper-connected age of mass surveillance, artificial intelligence, and unevenly distributed virtual futures, the most irrational thing to be is rational.
Increasingly the world is becoming a contradiction where people use surveillance to protest surveillance. Like many projects, MegaPixels spent years meandering between formats and unfeasible budgets, and was generally too niche of a subject. The basic idea for this project, as proposed for the original Glass Room installation in NYC in 2016, was to build an interactive mirror that showed people whether they had been included in the LFW facial recognition dataset. The idea was based on my reaction to all the datasets I'd come across during research for the CV Dazzle project. I'd noticed strange datasets created for training and testing face detection algorithms. Most were created in laboratory settings and their interpretation of face data was very strict. In 1993, the Counterdrug Technology Development Program Office initiated a feasibility study called FERET (FacE REcognition Technology) to "develop automatic face recognition capabilities that could be employed to assist security, intelligence, and law enforcement personnel in the performance of their duties [^feret_website]." One problem with the FERET dataset was that the photos were taken in controlled settings. For face recognition to work it would have to be used in uncontrolled settings. Even newer datasets such as Multi-PIE (Pose, Illumination, and Expression) from Carnegie Mellon University included only indoor photos of cooperative subjects.
Not only were the photos completely unrealistic, CMU's Multi-PIE included only 18 individuals, cost $500 for academic use [^cmu_multipie_cost], took years to create, and required consent from every participant. Sharman, Jon. "Metropolitan Police's facial recognition technology 98% inaccurate, figures show". 2018. https://www.independent.co.uk/news/uk/home-news/met-police-facial-recognition-success-south-wales-trial-home-office-false-positive-a8345036.html↩ Calle, Dan. "Supercomputers". 1997. http://ei.cs.vt.edu/~history/SUPERCOM.Calle.HTML↩ This post will be about the meaning of "face". How do people define it? How do biometrics researchers define it? How has it changed during the last decade? What can you know from a very small amount of information? Ideas: As the resolution
-formatted as rectangular databases of 16 bit RGB-tuples or 8 bit grayscale values

To consider how visual privacy applies to real world surveillance situations, the first

A single 8-bit grayscale pixel with 256 values is enough to represent the entire alphabet

A 2x2 pixels contains

Using no more than a 42-pixel (6x7) face image, researchers [cite] were able to correctly distinguish between a group of 50 people. Yet the likely outcome of face recognition research is that more data is needed to improve. Indeed, resolution is the determining factor for all biometric systems, both as training data to increase

Pixels, typically considered the building blocks of images and videos, can also be plotted as a graph of sensor values corresponding to the intensity of RGB-calibrated sensors. Wi-Fi and cameras present elevated risks for transmitting videos and image documentation from conflict zones, high-risk situations, or even sharing on social media. How can new developments in computer vision also be used in reverse, as a counter-forensic tool, to minimize an individual's privacy risk? As the global Internet becomes increasingly efficient at turning itself into a giant dataset for machine learning, forensics, and data analysis, it would be prudent to also consider tools for decreasing the resolution. The Visual Defense module is just that. What are new ways to minimize the adverse effects of surveillance by dulling the blade? For example, a research paper showed that by decreasing a face size to 12x16 pixels it was possible to achieve 98% accuracy with 50 people. This is clearly an example of

This research module, tentatively called Visual Defense Tools, aims to explore the

What all 3 examples illustrate is that face recognition is anything but absolute. In a 2017 talk, Jason Matheny, the former director of IARPA, admitted that face recognition is so brittle it can be subverted by using a magic marker and drawing "a few dots on your forehead".
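The low-resolution experiments mentioned above hinge on nothing more than shrinking a grid of pixel values. A minimal average-pooling downsampler illustrates the idea; the 2x2 pooling factor and the synthetic image values are illustrative assumptions, not taken from the cited study.

```python
def downsample(img, fh, fw):
    """Average-pool a grayscale image (a 2D list of int values)
    by a factor of (fh, fw), discarding any ragged edge."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(0, h - h % fh, fh):
        row = []
        for c in range(0, w - w % fw, fw):
            block = [img[r + i][c + j] for i in range(fh) for j in range(fw)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out

# A synthetic 12x14 "face crop"; pooling 2x2 yields a 6x7 image,
# i.e. the 42 pixels referenced in the study above.
img = [[(r * 14 + c) % 256 for c in range(14)] for r in range(12)]
small = downsample(img, 2, 2)
print(len(small), len(small[0]))  # 6 7
```

Even this drastic reduction preserves the coarse intensity structure of the original, which is why very low-resolution faces can remain distinguishable within a small group.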
In fact, face recognition is a misleading term. Face recognition is a search engine for faces that can only ever show you the most likely match. This presents a real threat to privacy. Globally, iPhone users relying on FaceID and TouchID to protect their information unwittingly agree to a 1/1,000,000 false match probability.

NIST 906932. Performance Assessment of Face Recognition Using Super-Resolution. Shuowen Hu, Robert Maschal, S. Susan Young, Tsai Hong Hong, Jonathon P. Phillips↩

A list of 100 things computer vision can see, e.g.:

for i in {1..9}; do wget http://visiond1.cs.umbc.edu/webpage/codedata/ADLdataset/ADL_videos/P_0$i.MP4; done
for i in {10..20}; do wget http://visiond1.cs.umbc.edu/webpage/codedata/ADLdataset/ADL_videos/P_$i.MP4; done

The 27 attributes are: source: https://github.com/vana77/Market-1501_Attribute/blob/master/README.md

The 23 attributes are: source: https://github.com/vana77/DukeMTMC-attribute/blob/master/README.md

The joints and other keypoints (eyes, ears, nose, shoulders, elbows, wrists, hips, knees and ankles)
-The 3D pose inferred from the keypoints.
-Visibility boolean for each keypoint
-Region annotations (upper clothes, lower clothes, dress, socks, shoes, hands, gloves, neck, face, hair, hat, sunglasses, bag, occluder)
-Body type (male, female or child) source: https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/h3d/
-Right ankle
-Right knee
-Right hip
-Left hip
-Left knee
-Left ankle
-Right wrist
-Right elbow
-Right shoulder
-Left shoulder
-Left elbow
-Left wrist
-Neck
-Head top source: http://web.archive.org/web/20170915023005/sam.johnson.io/research/lsp.html
+ and include this license and attribution protocol within any derivative work. If you publish data derived from MegaPixels, the original dataset creators should first be notified. The MegaPixels dataset is made available under the Open Data Commons Attribution License (https://opendatacommons.org/licenses/by/1.0/) and for academic use only. READABLE SUMMARY OF Open Data Commons Attribution License You are free: To Share: To copy, distribute and use the dataset
+ To Create: To produce works from the dataset
+ To Adapt: To modify, transform and build upon the database As long as you: Attribute: You must attribute any public use of the database, or works produced from the database, in the manner specified in the license. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database. ATTRIBUTION PROTOCOL If you use the MegaPixels data or any data derived from it, please cite the original work as follows:
+MegaPixels is an independent art and research project by Adam Harvey and Jules LaPlace that investigates the ethics, origins, and individual privacy implications of face recognition image datasets and their role in the expansion of biometric surveillance technologies. The MegaPixels site is made possible with support from Mozilla. Adam Harvey is a Berlin-based American artist and researcher. His previous projects (CV Dazzle, Stealth Wear, and SkyLift) explore the potential for counter-surveillance as artwork. He is the founder of VFRAME (visual forensics software for human rights groups) and is currently a researcher in residence at Karlsruhe HfG. Jules LaPlace is an American technologist and artist also based in Berlin. He was previously the CTO of a digital agency in NYC and now also works at VFRAME, developing computer vision and data analysis software for human rights groups. Jules also builds experimental software for artists and musicians.
+ The MegaPixels website is based on an earlier installation from 2017 and ongoing research and lectures (TedX, CPDP) about facial recognition datasets. Over the last several years this project has evolved into a large-scale interrogation of hundreds of publicly-available face and person analysis datasets. MegaPixels aims to provide a critical perspective on machine learning image datasets, one that might otherwise escape academia and the industry-funded artificial intelligence think tanks that are often supported by the same technology companies who have created many of the datasets presented on this site. MegaPixels is an independent project, designed as a public resource for educators, students, journalists, and researchers. Each dataset presented on this site undergoes a thorough review of its images, intent, and funding sources. Though the goals are similar to publishing a public academic paper, MegaPixels is a website-first research project that aligns closely with the goals of pre-print academic publications. As such we welcome feedback and ways to improve this site and the clarity of the research. Because this project surfaces many funding issues with datasets (from datasets funded by the C.I.A. to the National University of Defense Technology in China), it is important that we are transparent about our own funding. The original MegaPixels installation in 2017 was built as a commission for and with support from Tactical Technology Collective and Mozilla. The bulk of the research and web development during 2018 was supported by a grant from Mozilla. Continued development in 2019 is partially supported by a 1-year Researcher-in-Residence grant from Karlsruhe HfG, lecture and workshop fees, and from commissions and sales from the Privacy Gift Shop. Please get in touch if you are interested in supporting this project. 
Please direct questions, comments, or feedback to mastodon.social/@adamhrv If you use MegaPixels or any data derived from it for your work, please cite our original work as follows: MegaPixels.cc Terms and Privacy MegaPixels is an independent and academic art and research project about the origins and ethics of publicly available face analysis image datasets. By accessing MegaPixels (the Service or Services) you agree to the terms and conditions set forth below. The MegaPixels site has been designed to minimize the number of network requests to 3rd party services and therefore prioritize the privacy of the viewer. This site does not use any local or external analytics programs to monitor site viewers. In fact, the only data collected are the necessary server logs used only for preventing misuse, which are deleted at short-term intervals. In order to provide certain features of the site, some 3rd party services are needed. Currently, the MegaPixels.cc site uses two 3rd party services: (1) Leaflet.js for the interactive map and (2) Digital Ocean Spaces as a content delivery network. Both services encrypt your requests to their server using HTTPS and neither service requires storing any cookies or authentication. However, both services will store files in your web browser's local cache (local storage) to improve loading performance. None of these local storage files are used for analytics, tracking, or any similar purpose. The MegaPixels.cc site contains many links to 3rd party websites, especially in the list of citations that are provided for each dataset. This website has no control over, and assumes no responsibility for, the content, privacy policies, or practices of any third party web sites or services. 
You acknowledge and agree that megapixels.cc shall not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods or services available on or through any such web sites or services. We advise you to read the terms and conditions and privacy policies of any third-party web sites or services that you visit. When you access the Service, we record your visit to the site in a server log file for the purposes of maintaining site security and preventing misuse. This includes your IP address and the header information sent by your web browser, which includes the User Agent, referrer, and the requested page on our site. We do not share or make public any information about individual site visitors, except where required by law, and server logs are only retained for a limited duration. We provide information for educational, journalistic, and research purposes. The published information on MegaPixels is made available under the Open Data Commons Attribution License (https://opendatacommons.org/licenses/by/1.0/) and for academic use only. You are free: To Share: To copy, distribute and use the dataset
+To Create: To produce works from the dataset
+To Adapt: To modify, transform and build upon the database As long as you: Attribute: You must attribute any public use of the database, or works produced from the database, in the manner specified in the license. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database. If you use the MegaPixels data or any data derived from it, please cite the original work as follows: While every intention is made to publish only verifiable information, at times information may be edited, removed, or appended for clarity or correction. In no event will the operators of this site be liable for your use or misuse of the information provided. We may terminate or suspend access to our Service immediately without prior notice or liability, for any reason whatsoever, including without limitation if you breach the Terms. All provisions of the Terms which by their nature should survive termination shall survive termination, including, without limitation, ownership provisions, warranty disclaimers, indemnity and limitations of liability. You may not access or use, or attempt to access or use, the Services to take any action that could harm us or a third party. You may not use the Services in violation of applicable laws or in violation of our or any third party’s intellectual property or other proprietary or legal rights. You further agree that you shall not attempt (or encourage or support anyone else's attempt) to circumvent, reverse engineer, decrypt, or otherwise alter or interfere with the Services, or any content thereof, or make any unauthorized use thereof. 
Without prior written consent, you shall not: (i) access any part of the Services, Content, data or information you do not have permission or authorization to access; (ii) use robots, spiders, scripts, service, software or any manual or automatic device, tool, or process designed to data mine or scrape the Content, data or information from the Services, or otherwise access or collect the Content, data or information from the Services using automated means; (iii) use services, software or any manual or automatic device, tool, or process designed to circumvent any restriction, condition, or technological measure that controls access to the Services in any way, including overriding any security feature or bypassing or circumventing any access controls or use limits of the Services; (iv) cache or archive the Content (except for a public search engine’s use of spiders for creating search indices) with prior written consent; (v) take action that imposes an unreasonable or disproportionately large load on our network or infrastructure; and (vi) do anything that could disable, damage or change the functioning or appearance of the Services, including the presentation of advertising. Engaging in a prohibited use of the Services may result in civil, criminal, and/or administrative penalties, fines, or sanctions against the user and those assisting the user. These Terms shall be governed and construed in accordance with the laws of Berlin, Germany, without regard to its conflict of law provisions. Our failure to enforce any right or provision of these Terms will not be considered a waiver of those rights. If any provision of these Terms is held to be invalid or unenforceable by a court, the remaining provisions of these Terms will remain in effect. These Terms constitute the entire agreement between us regarding our Service, and supersede and replace any prior agreements we might have between us regarding the Service. 
You hereby indemnify, defend and hold harmless MegaPixels (and its creators) and all officers, directors, owners, agents, information providers, affiliates, licensors and licensees (collectively, the "Indemnified Parties") from and against any and all liability and costs, including, without limitation, reasonable attorneys' fees, incurred by the Indemnified Parties in connection with any claim arising out of any breach by you or any user of your account of these Terms of Service or the foregoing representations, warranties and covenants. You shall cooperate as fully as reasonably required in the defense of any such claim. We reserve the right, at our own expense, to assume the exclusive defense and control of any matter subject to indemnification by you. We reserve the right, at our sole discretion, to modify or replace these Terms at any time. By continuing to use or access our Service after revisions become effective, you agree to be bound by the revised terms. If you do not agree to the revised terms, please do not use the Service. [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how the 50 People One Question Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing 50 People One Question was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
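The geocoding step described above can be pictured as a lookup from extracted institution names to coordinates. This is a hypothetical simplification of the pipeline, not the site's actual code; the institution names, coordinates, and field names are all invented for illustration.

```python
# Hypothetical sketch: citations (already verified by hand) carry the
# institution name extracted from the paper's front matter; a
# hand-maintained table maps known institutions to coordinates.
INSTITUTION_COORDS = {
    "Stanford University": (37.4275, -122.1697),
    "Carnegie Mellon University": (40.4433, -79.9436),
}

def geocode_citations(citations):
    """Attach (lat, lon) to each citation whose institution is known;
    citations with unrecognized institutions are left off the map."""
    located = []
    for cite in citations:
        coords = INSTITUTION_COORDS.get(cite["institution"])
        if coords is not None:
            located.append({**cite, "lat": coords[0], "lon": coords[1]})
    return located

papers = [{"title": "End-to-end people detection in crowded scenes",
           "institution": "Stanford University"}]
print(geocode_citations(papers))
```

The resulting list of located citations is what a map layer (such as the Leaflet.js map used on this site) would render as clickable markers.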
+ [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how the Asian Face Age Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing The Asian Face Age Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The Asian Face Age Dataset (AFAD) is a new dataset proposed for evaluating the performance of age estimation, which contains more than 160K facial images and the corresponding age and gender labels. This dataset is oriented to age estimation on Asian faces, so all the facial images are of Asian faces. It is noted that the AFAD is the biggest dataset for age estimation to date. It is well suited to evaluate how deep learning methods can be adopted for age estimation.
+Motivation For age estimation, there are several public datasets for evaluating the performance of a specific algorithm, such as FG-NET [1] (1002 face images), MORPH I (1690 face images), and MORPH II [2] (55,608 face images). Among them, MORPH II is the biggest public dataset to date. On the other hand, it is necessary to collect a large-scale dataset to train a deep Convolutional Neural Network. Therefore, the MORPH II dataset is extensively used to evaluate how deep learning methods can be adopted for age estimation [3][4]. However, the ethnicity distribution of the MORPH II dataset is very unbalanced, i.e., less than 1% of its faces are Asian. In order to evaluate the previous methods for age estimation on Asian faces, the Asian Face Age Dataset (AFAD) was proposed. There are 164,432 well-labeled photos in the AFAD dataset. It consists of 63,680 photos of females as well as 100,752 photos of males, and the ages range from 15 to 40. The distribution of photo counts for distinct ages is illustrated in the figure above. Some samples are shown in the figure on the top. Its download link is provided in the "Download" section. In addition, we also provide a subset of the AFAD dataset, called AFAD-Lite, which only contains PLACEHOLDER well-labeled photos. It consists of PLACEHOLDER photos of females as well as PLACEHOLDER photos of males, and the ages range from 15 to 40. The distribution of photo counts for distinct ages is illustrated in Fig. PLACEHOLDER. Its download link is also provided in the "Download" section. The AFAD dataset was built by collecting selfie photos from a particular social network -- the RenRen Social Network (RSN) [5]. The RSN is widely used by Asian students, including middle school, high school, undergraduate, and graduate students. Even after leaving school, some people still access their RSN accounts to connect with their old classmates. So the ages of RSN users cover a wide range, from 15 to more than 40 years old. 
Please notice that this dataset is made available for academic research purposes only. Brainwash is a head detection dataset created from San Francisco's Brainwash Cafe livecam footage. It includes 11,918 images of the "everyday life of a busy downtown cafe" 1 captured at 100-second intervals throughout the entire day. The Brainwash dataset was captured during 3 days in 2014: October 27, November 13, and November 24. According to the authors' research paper introducing the dataset, the images were acquired with the help of Angelcam.com. 2 Brainwash is not a widely used dataset, but since its publication by Stanford University in 2015, it has notably appeared in several research papers from the National University of Defense Technology in Changsha, China. In 2016 and 2017, researchers there conducted studies on detecting people's heads in crowded scenes for the purpose of surveillance. 3 4 If you happened to be at Brainwash cafe in San Francisco at any time on October 27, November 13, or November 24 in 2014, you are most likely included in the Brainwash dataset and have unwittingly contributed to surveillance research.
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how the Brainwash Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Brainwash Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ TODO
+
+ If you use our data, research, or graphics, please cite our work:
+
+ "readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385. Stewart, Russell and Andriluka, Mykhaylo. "End-to-end people detection in crowded scenes". 2016. Li, Y. and Dou, Y. and Liu, X. and Li, T. "Localized Region Context and Object Feature Fusion for People Head Detection". ICIP16 Proceedings. 2016. Pages 594-598. Zhao, X. and Wang, Y. and Dou, Y. "A Replacement Algorithm of Non-Maximum Suppression Based on Graph Clustering". [ page under development ]
+ The dataset contains images of people collected from the web by typing common given names into Google Image Search. The coordinates of the eyes, the nose and the center of the mouth for each frontal face are provided in a ground truth file. This information can be used to align and crop the human faces or as a ground truth for a face detection algorithm. The dataset has 10,524 human faces of various resolutions and in different settings, e.g. portrait images, groups of people, etc. Profile faces or very low resolution faces are not labeled. [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how CelebA Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Large-scale CelebFaces Attributes Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how COFW Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Caltech Occluded Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ COFW is "is designed to benchmark face landmark algorithms in realistic conditions, which include heavy occlusions and large shape variations" [Robust face landmark estimation under occlusion]. We asked four people with different levels of computer vision knowledge to each collect 250 faces representative of typical real-world images, with the clear goal of challenging computer vision methods.
+The result is 1,007 images of faces obtained from a variety of sources [Robust face landmark estimation under occlusion]. Our face dataset is designed to present faces in real-world conditions. Faces show large variations in shape and occlusions due to differences in pose, expression, use of accessories such as sunglasses and hats, and interactions with objects (e.g. food, hands, microphones, etc.). All images were hand annotated in our lab using the same 29 landmarks as in LFPW. We annotated both the landmark positions as well as their occluded/unoccluded state. The faces are occluded to different degrees, with large variations in the type of occlusions encountered. COFW has an average occlusion of over 23%.
+To increase the number of training images, and since COFW has the exact same landmarks as LFPW, for training we use the original non-augmented 845 LFPW faces + 500 COFW faces (1345 total), and for testing the remaining 507 COFW faces. To make sure all images had occlusion labels, we annotated occlusion on the available 845 LFPW training images, finding an average of only 2% occlusion. http://www.vision.caltech.edu/xpburgos/ICCV13/ This research is supported by NSF Grant 0954083 and by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA R&D Contract No. 2014-14071600012.
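The average-occlusion figures quoted above follow directly from the per-landmark occlusion labels. A minimal sketch, assuming each face carries 29 binary occluded/unoccluded flags (the flag values below are invented for illustration):

```python
def average_occlusion(faces):
    """Mean fraction of occluded landmarks across all annotated faces."""
    fractions = [sum(flags) / len(flags) for flags in faces]
    return sum(fractions) / len(fractions)

# Two hypothetical faces, each with 29 binary occlusion flags.
faces = [
    [1] * 7 + [0] * 22,  # 7 of 29 landmarks occluded (~24%)
    [1] * 6 + [0] * 23,  # 6 of 29 landmarks occluded (~21%)
]
avg = average_occlusion(faces)
```

Run over all 507 COFW test faces, this statistic yields the reported 23%+ average; over the 845 LFPW training faces it yields only about 2%.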
+ To help understand how COFW Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Caltech Occluded Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the location markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+ TODO The Duke Multi-Target, Multi-Camera Tracking Dataset (MTMC) is a dataset of video recorded on Duke University campus for research and development of networked camera surveillance systems. MTMC tracking is used for citywide dragnet surveillance systems such as those used throughout China by SenseTime 1 and the oppressive monitoring of 2.5 million Uyghurs in Xinjiang by SenseNets 2. In fact, researchers from both SenseTime 4 5 and SenseNets 3 used the Duke MTMC dataset for their research. The Duke MTMC dataset is unique because it is the largest publicly available MTMC and person re-identification dataset and has the longest duration of annotated video. In total, the Duke MTMC dataset provides over 14 hours of 1080p video from 8 synchronized surveillance cameras. 6 It is among the most widely used person re-identification datasets in the world. The approximately 2,700 unique people in the Duke MTMC videos, most of whom are students, are used for research and development of surveillance technologies by commercial, academic, and even defense organizations. The creation and publication of the Duke MTMC dataset in 2016 was originally funded by the U.S. Army Research Laboratory and the National Science Foundation 6. Since 2016, use of the Duke MTMC dataset images has been publicly acknowledged in research funded by or on behalf of the Chinese National University of Defense 7 8, IARPA and IBM 9, and the U.S. Department of Homeland Security 10. The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy". 6 Cameras 7 and 2 capture large groups of prospective students and children. Camera 5 was positioned to capture students as they enter and exit Duke University's main chapel. Each camera's location is documented below. 
Duke MTMC (Multi-Target, Multi-Camera Tracking) is a dataset of video recorded on Duke University campus for research and development of networked camera surveillance systems. MTMC tracking algorithms are used for citywide dragnet surveillance systems such as those used throughout China by SenseTime 1 and the oppressive monitoring of 2.5 million Uyghurs in Xinjiang by SenseNets 2. In fact, researchers from both SenseTime 4 5 and SenseNets 3 used the Duke MTMC dataset for their research. In this investigation into the Duke MTMC dataset, we found that researchers at Duke University in Durham, North Carolina captured over 2,000 students, faculty members, and passersby into one of the most prolific public surveillance research datasets in the world, used by commercial and defense surveillance organizations alike. Since its publication in 2016, the Duke MTMC dataset has been used in over 100 studies at organizations around the world including SenseTime 4 5, SenseNets 3, IARPA and IBM 9, the Chinese National University of Defense 7 8, the US Department of Homeland Security 10, Tencent, Microsoft, Microsoft Asia, Fraunhofer, Senstar Corp., Alibaba, Naver Labs, Google, and Hewlett-Packard Labs, to name only a few. The Duke MTMC dataset, recorded in 2014 and published in 2016, was originally funded by the U.S. Army Research Laboratory and the National Science Foundation 6. Our analysis of the geographic locations of the publicly available research shows over twice as many citations by researchers from China as from the United States (44% vs. 20%). In 2018 alone, there were 70 research project citations from China. The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy". 6 Camera 5 was positioned to capture students as they entered and exited the university's main chapel. Each camera's location and approximate field of view are documented below. 
The heat map visualization shows the locations where pedestrians were most frequently annotated in each video from the Duke MTMC dataset.
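A heat map like this reduces to binning each annotated pedestrian position into a coarse grid of counts. A minimal sketch; the frame size, bin count, and points below are invented for illustration.

```python
def heat_grid(points, width, height, bins=4):
    """Accumulate (x, y) annotation positions into a bins x bins count grid."""
    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        col = min(int(x / width * bins), bins - 1)  # clamp edge coordinates
        row = min(int(y / height * bins), bins - 1)
        grid[row][col] += 1
    return grid

# Three hypothetical annotations in a 1920x1080 frame: two near the
# top-left corner, one near the bottom-right.
grid = heat_grid([(100, 100), (110, 120), (1900, 1000)], 1920, 1080)
```

Normalizing the counts and mapping them to a color ramp produces the rendered heat map overlay.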
@@ -110,18 +111,122 @@
The Duke MTMC dataset paper mentions 2,700 identities, but their ground truth file only lists annotations for 1,812. Original funding for the Duke MTMC dataset was provided by the Army Research Office under Grant No. W911NF-10-1-0387 and by the National Science Foundation
+under Grants IIS-10-17017 and IIS-14-20894. The video timestamps contain the likely, but not yet confirmed, dates and times of capture. Because the video timestamps align with the start and stop time sync data provided by the researchers, they at least establish the relative timeline. The rainy weather on that day also contributes toward the likelihood of March 14, 2014. If you attended Duke University and were captured by any of the 8 surveillance cameras positioned on campus in 2014, there is unfortunately no way to be removed. The dataset files have been distributed throughout the world and it would not be possible to contact all the owners for removal. The authors do not provide any options for students to opt out, nor did they inform students that they would be used as test subjects for surveillance research and development in a project funded, in part, by the United States Army Research Office.
+
+ If you use our data, research, or graphics please cite our work:
+
+ If you use any data from the Duke MTMC please follow their license and cite their work. https://foreignpolicy.com/2019/03/19/962492-orwell-china-socialcredit-surveillance/ "Attention-Aware Compositional Network for Person Re-identification". 2018. SemanticScholar, PDF. "End-to-End Deep Kronecker-Product Matching for Person Re-identification". 2018. SemanticScholar, PDF. "Person Re-identification with Deep Similarity-Guided Graph Neural Network". 2018. SemanticScholar. "Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking". 2016. SemanticScholar. "Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers". 2018. SemanticScholar. "Unsupervised Multi-Object Detection for Video Surveillance Using Memory-Based Recurrent Attention Networks". 2018. SemanticScholar. "Horizontal Pyramid Matching for Person Re-identification". 2019. SemanticScholar. "Re-Identification with Consistent Attentive Siamese Networks". 2018. SemanticScholar. [ page under development ] {% include 'dashboard.html' %} The FERET program is sponsored by the U.S. Department of Defense's Counterdrug Technology Development Program Office. The U.S. Army Research Laboratory (ARL) is the technical agent for the FERET program. 
ARL designed, administered, and scored the FERET tests. George Mason University collected, processed, and maintained the FERET database. Inquiries regarding the FERET database or test should be directed to P. Jonathon Phillips.
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how LFPW has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Face Parts in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ RESEARCH below this line Release 1 of LFPW consists of 1,432 faces from images downloaded from the web using simple text queries on sites such as google.com, flickr.com, and yahoo.com. Each image was labeled by three MTurk workers, and 29 fiducial points, shown below, are included in the dataset. LFPW was originally described in the following publication: Due to copyright issues, we cannot distribute image files in any format to anyone. Instead, we have made available a list of image URLs where you can download the images yourself. We realize that this makes it impossible to exactly compare numbers, as image links will slowly disappear over time, but we have no other option. This seems to be the way other large web-based databases are evolving. https://neerajkumar.org/databases/lfpw/ This research was performed at Kriegman-Belhumeur Vision Technologies and was funded by the CIA through the Office of the Chief Scientist. https://www.cs.cmu.edu/~peiyunh/topdown/ (nk_cvpr2011_faceparts.pdf) [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how Market 1501 has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Market 1501 Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ @inproceedings{zheng2016mars,
+title={MARS: A Video Benchmark for Large-Scale Person Re-identification},
+author={Zheng, Liang and Bie, Zhi and Sun, Yifan and Wang, Jingdong and Su, Chi and Wang, Shengjin and Tian, Qi},
+booktitle={European Conference on Computer Vision},
+year={2016},
+organization={Springer}
+} [ PAGE UNDER DEVELOPMENT ] https://www.hrw.org/news/2019/01/15/letter-microsoft-face-surveillance-technology The street location of the camera used for the Oxford Town Centre dataset was confirmed by matching the road, benches, and store signs source. At that location, two public CCTV cameras exist mounted on the side of the Northgate House building at 13-20 Cornmarket St. Because of the lower camera's mounting pole directionality, a view from a private camera in the building across the street can be ruled out, because it would have to show more of the silhouette of the lower camera's mounting pole. Two options remain: either the public CCTV camera mounted to the side of the building was used, or the researchers mounted their own camera to the side of the building in the same location. Because the researchers used many other existing public CCTV cameras for their research projects, it is likely that they would also have access to this camera. Although this public CCTV camera is only seen pointing the other way in Google Street View images, at least one public photo shows the upper CCTV camera pointing in the same direction as in the Oxford Town Centre dataset, proving the camera can be and has been rotated. As for the capture date, the text on the storefront display shows a sale happening from December 2nd – 7th, indicating the capture date was between or just before those dates. The capture year is either 2008 or 2007, since prior to 2007 the Carphone Warehouse (photo, history) did not exist at this location. Since the sweaters in the GAP window display are more similar to those in a GAP website snapshot from November 2007, our guess is that the footage was obtained during late November or early December 2007. The lack of street vendors and slight waste residue near the bench suggests that it was probably a weekday after rubbish removal. Several researchers have posted their demo videos using the Oxford Town Centre dataset on YouTube:
+
+ If you use our data, research, or graphics please cite our work:
+
+ [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how PIPA Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing People in Photo Albums Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how PubFig has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Public Figures Face Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ page under development ] UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at University of Colorado Colorado Springs. According to the authors of two papers associated with the dataset, subjects were "photographed using a long-range high-resolution surveillance camera without their knowledge" 2. To create the dataset, the researchers used a Canon 7D digital camera fitted with a Sigma 800mm telephoto lens and photographed students 150–200m away through their office window. Photos were taken during the morning and afternoon while students were walking to and from classes. The primary uses of this dataset are to train, validate, and build recognition and face detection algorithms for realistic surveillance scenarios. What makes the UCCS dataset unique is that it includes the highest resolution images of any publicly available face recognition dataset discovered so far (18MP), that it was captured on a campus without consent or awareness using a long-range telephoto lens, and that it was funded by United States defense and intelligence agencies. Combined funding sources for the creation of the initial and final release of the dataset include ODNI (Office of the Director of National Intelligence), IARPA (Intelligence Advanced Research Projects Activity), ONR MURI (Office of Naval Research and The Department of Defense Multidisciplinary University Research Initiative), Army SBIR (Small Business Innovation Research), SOCOM SBIR (Special Operations Command and Small Business Innovation Research), and the National Science Foundation. 1 2 In 2017 the UCCS face dataset was used for a defense and intelligence agency funded face recognition challenge at the International Joint Biometrics Conference in Denver, CO. And in 2018 the dataset was used for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Computer Vision Conference (ECCV) in Munich, Germany. 
Additional research projects that have used the UCCS dataset are included below in the list of verified citations. UCCS is part of the IARPA Janus team https://vast.uccs.edu/project/iarpa-janus/
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how UCCS has been used around the world by commercial, military, and academic organizations, existing publicly available research citing UnConstrained College Students Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The images in UCCS were taken on 18 non-consecutive days during 2012–2013. Analysis of the EXIF data embedded in the original images reveals that most of the images were taken on Tuesdays, and the most frequent capture time throughout the week was 12:30PM. The location of the camera and subjects can be confirmed using several visual cues in the dataset images: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no parking sign and the directionality of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage all match images in the dataset. The original papers also provide another clue: a picture of the camera inside the office that was used to create the dataset. The window view in this image provides another match for the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the location on Google Maps The UnConstrained College Students dataset is associated with two main research papers: "Large Scale Unconstrained Open Set Face Database" and "Unconstrained Face Detection and Open-Set Face Recognition Challenge". Collectively, these papers and the creation of the dataset have received funding from the following organizations: If you attended University of Colorado Colorado Springs and were captured by the long range surveillance camera used to create this dataset, there is unfortunately currently no way to be removed. The authors do not provide any options for students to opt out, nor were students informed they would be used for training face recognition. According to the authors, the lack of any consent or knowledge of participation is what provides part of the value of the UnConstrained College Students dataset.
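The weekday analysis above comes down to parsing EXIF capture timestamps and tallying days of the week. A minimal sketch, assuming EXIF's standard "YYYY:MM:DD HH:MM:SS" DateTimeOriginal format; the timestamps below are invented for illustration.

```python
from collections import Counter
from datetime import datetime

def weekday_histogram(timestamps):
    """Count capture weekdays from EXIF DateTimeOriginal strings."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.strptime(ts, "%Y:%m:%d %H:%M:%S")
        counts[dt.strftime("%A")] += 1
    return counts

# Hypothetical capture times: two Tuesdays and one Wednesday.
hist = weekday_histogram([
    "2013:04:09 12:30:02",
    "2013:04:16 12:31:10",
    "2013:04:17 09:15:00",
])
```

The same parsed timestamps, binned by time of day instead of weekday, yield the 12:30PM peak capture time.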
+
+ If you use our data, research, or graphics please cite our work:
+
+ [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how Brainwash Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Brainwash Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ page under development ] VIPeR (Viewpoint Invariant Pedestrian Recognition) is a dataset of pedestrian images captured at University of California Santa Cruz in 2007. According to the researchers 2, "cameras were placed in different locations in an academic setting and subjects were notified of the presence of cameras, but were not coached or instructed in any way." VIPeR is amongst the most widely used publicly available person re-identification datasets. In 2017 the VIPeR dataset was combined into a larger person re-identification dataset created by the Chinese University of Hong Kong called PETA (PEdesTrian Attribute).
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how VIPeR has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Viewpoint Invariant Pedestrian Recognition was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how YouTube Celebrities has been used around the world by commercial, military, and academic organizations, existing publicly available research citing YouTube Celebrities was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ Results are only stored for the duration of the analysis and are deleted when you leave this page. Facial recognition is a scam. During the last 20 years commercial, academic, and governmental agencies have promoted the false dream of a future with face recognition. This essay debunks the popular myth that such a thing ever existed. There is no such thing as face recognition. For the last 20 years, government agencies, commercial organizations, and academic institutions have played the public as a fool, selling a roadmap of the future that simply does not exist. Facial recognition, as it is currently defined, promoted, and sold to the public, government, and commercial sector is a scam. Committed to developing robust solutions with superhuman accuracy, the industry has repeatedly undermined itself by never actually developing anything close to "face recognition". There is only biased feature vector clustering and probabilistic thresholding. Ever since government agencies began developing face recognition in the early 1960s, datasets of face images have always been central to developing and validating face recognition technologies. Today, these datasets no longer originate in labs, but instead from family photo albums posted on photo sharing sites, surveillance camera footage from college campuses, search engine queries for celebrities, cafe livestreams, or videos on YouTube. During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that is powering the global facial recognition industry. While many of these datasets include public figures such as politicians, athletes, and actors, they also include many non-public figures: digital activists, students, pedestrians, and semi-private shared photo albums are all considered "in the wild" and fair game for research projects. 
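The claim that there is "only biased feature vector clustering and probabilistic thresholding" can be made concrete with a toy sketch: what ships as "recognition" is a similarity score between feature vectors, compared against a chosen cutoff. The vectors and the 0.8 threshold below are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def same_identity(a, b, threshold=0.8):
    """A 'match' is nothing more than: similarity above a chosen threshold."""
    return cosine_similarity(a, b) >= threshold

# Two hypothetical face embeddings that happen to point in nearly the
# same direction, so the thresholded comparison declares a match.
match = same_identity([0.1, 0.9, 0.2], [0.1, 0.8, 0.3])
```

Everything contested about such systems, including the error rates cited below, lives in where that threshold is set and what data produced the vectors.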
Some images are used with creative commons licenses, yet others were taken in unconstrained scenarios without awareness or consent. At first glance it appears many of the datasets were created for seemingly harmless academic research, but when examined further it becomes clear that they're also used by foreign defense agencies. The MegaPixels site is based on an earlier installation (also supported by Mozilla) at the Tactical Tech Glassroom in London in 2017; a commission from the Elevate arts festival curated by Berit Gilma about pedestrian recognition datasets in 2018; and research during CV Dazzle from 2010-2015. Through the many prototypes, conversations, pitches, PDFs, and false starts this project has endured during the last 5 years, it eventually evolved into something much different than originally imagined. Now, as datasets become increasingly influential in shaping the computational future, it's clear that they must be critically analyzed to understand the biases, shortcomings, funding sources, and contributions to the surveillance industry. However, it's misguided to only criticize these datasets for their flaws without also praising their contribution to society. Without publicly available facial analysis datasets there would be less public discourse, less open-source software, and less peer-reviewed research. Public datasets can indeed become a vital public good for the information economy, but as this project aims to illustrate, many ethical questions arise about consent, intellectual property, surveillance, and privacy. Ever since the first computational facial recognition research project by the CIA in the early 1960s, data has always played a vital role in the development of our biometric future. Without facial recognition datasets there would be no facial recognition. Datasets are an indispensable part of any artificial intelligence system because, as Geoffrey Hinton points out: Our relationship to computers has changed. 
Instead of programming them, we now show them and they figure it out. - Geoffrey Hinton Algorithms learn from datasets. And we program algorithms by building datasets. But datasets aren't like code. There's no programming language made of data except for the data itself. Ignore content below these lines It was the early 2000s. Face recognition was new and no one seemed sure exactly how well it was going to perform in practice. In theory, face recognition was poised to be a game changer, a force multiplier, a strategic military advantage, a way to make cities safer and to secure borders. This was the future John Ashcroft demanded with the Total Information Awareness Act of 2003 and that spooks had dreamed of for decades. It was a future that academics at Carnegie Mellon University and Colorado State University would help build. It was also a future that celebrities would play a significant role in building. And to the surprise of ordinary Internet users like myself and perhaps you, it was a future that millions of Internet users would unwittingly play a role in creating. Now the future has arrived and it doesn't make sense. Facial recognition works yet it doesn't actually work. Facial recognition is cheap and accessible but also expensive and out of control. Facial recognition research has achieved headline grabbing superhuman accuracies over 99.9% yet facial recognition is also dangerously inaccurate. During a trial installation at Sudkreuz station in Berlin in 2018, 20% of the matches were wrong, an accuracy so low that it should not have any connection to law enforcement or justice. And in London, the Metropolitan police had been using facial recognition software that mistakenly identified an alarming 98% of people as criminals 1, which perhaps is a crime itself. MegaPixels is an online art project that explores the history of facial recognition from the perspective of datasets. 
To paraphrase the artist Trevor Paglen, whoever controls the dataset controls the meaning. MegaPixels aims to unravel the meanings behind the data and expose the darker corners of the biometric industry that have contributed to its growth. MegaPixels does not start with a conclusion or a moralistic slant. Whether or not to build facial recognition is a question that can no longer be asked. As an outspoken critic of face recognition, I've developed, and hopefully furthered, my understanding during the last 10 years I've spent working with computer vision. Though I initially disagreed, I've come to see this technocratic perspective as a non-negotiable reality. As Oren (nytimes article) wrote in a NYT op-ed, "the horse is out of the barn", and the only thing we can do, collectively or individually, is steer towards the least bad outcome. Computational communication has entered a new era and it's both exciting and frightening to explore its potentials and opportunities. In 1997, getting access to 1 teraFLOPS of computational power would have cost you $55 million and required a strategic partnership with the Department of Defense. At the time of writing, anyone can rent 1 teraFLOPS on a cloud GPU marketplace for less than $1/day. 2. I hope that this project will illuminate the darker areas of the strange world of facial recognition that have not yet received attention, and encourage discourse in academic, industry, and public contexts. By no means do I believe discourse can save the day. Nor do I think creating artwork can. In fact, I'm not exactly sure what the outcome of this project will be. The project is not so much what I publish here but what happens after. This entire project is only a prologue. As McLuhan wrote, "You can't have a static, fixed position in the electric age". And in our hyper-connected age of mass surveillance, artificial intelligence, and unevenly distributed virtual futures, the most irrational thing to be is rational.
Increasingly the world is becoming a contradiction where people use surveillance to protest surveillance. Like many projects, MegaPixels spent years meandering between formats and unfeasible budgets, and was generally too niche a subject. The basic idea for this project, as proposed to the original Glass Room installation in 2016 in NYC, was to build an interactive mirror that showed people if they had been included in the LFW facial recognition dataset. The idea was based on my reaction to all the datasets I'd come across during research for the CV Dazzle project. I'd noticed strange datasets created for training and testing face detection algorithms. Most were created in laboratory settings and their interpretation of face data was very strict. A decade earlier, the U.S. Department of Defense's Counterdrug Technology Development Program Office had initiated a feasibility study called FERET (FacE REcognition Technology) to "develop automatic face recognition capabilities that could be employed to assist security, intelligence, and law enforcement personnel in the performance of their duties [^feret_website]." One problem with the FERET dataset was that the photos were taken in controlled settings. For face recognition to work it would have to be used in uncontrolled settings. Even newer datasets such as Multi-PIE (Pose, Illumination, and Expression) from Carnegie Mellon University included only indoor photos of cooperative subjects.
Not only were the photos completely unrealistic; CMU's Multi-PIE included only 18 individuals, cost $500 for academic use [^cmu_multipie_cost], took years to create, and required consent from every participant. Sharman, Jon. "Metropolitan Police's facial recognition technology 98% inaccurate, figures show". 2018. https://www.independent.co.uk/news/uk/home-news/met-police-facial-recognition-success-south-wales-trial-home-office-false-positive-a8345036.html↩ Calle, Dan. "Supercomputers". 1997. http://ei.cs.vt.edu/~history/SUPERCOM.Calle.HTML↩ This post will be about the meaning of "face". How do people define it? How do biometrics researchers define it? How has it changed during the last decade? What can you know from a very small amount of information? Ideas: As the resolution
formatted as rectangular databases of 16-bit RGB tuples or 8-bit grayscale values. To consider how visual privacy applies to real-world surveillance situations, consider first that a single 8-bit grayscale pixel with 256 values is enough to represent the entire alphabet. Using no more than a 42-pixel (6x7) face image, researchers [cite] were able to correctly distinguish between a group of 50 people. The likely outcome of face recognition research is that more data is needed to improve. Indeed, resolution is the determining factor for all biometric systems. Pixels, typically considered the building blocks of images and videos, can also be plotted as a graph of sensor values corresponding to the intensity of RGB-calibrated sensors. Wi-Fi and cameras present elevated risks for transmitting videos and image documentation from conflict zones, high-risk situations, or even sharing on social media. How can new developments in computer vision also be used in reverse, as a counter-forensic tool, to minimize an individual's privacy risk? As the global Internet becomes increasingly efficient at turning itself into a giant dataset for machine learning, forensics, and data analysis, it would be prudent to also consider tools for decreasing the resolution. This research module, tentatively called Visual Defense Tools, aims to explore new ways to minimize the adverse effects of surveillance by dulling the blade. For example, a research paper showed that even after decreasing a face to 12x16 pixels it was possible to achieve 98% accuracy across 50 people. What all three examples illustrate is that face recognition is anything but absolute. In a 2017 talk, Jason Matheny, the former director of IARPA, admitted that face recognition is so brittle it can be subverted by using a magic marker and drawing "a few dots on your forehead".
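The low-resolution claims above can be sanity-checked with basic information-theoretic arithmetic (a sketch; only the 6x7 grayscale figure and the 50-person group come from the text, the rest is standard information theory):

```python
import math

# Distinguishing 50 people needs only ceil(log2(50)) bits of information.
bits_needed = math.ceil(math.log2(50))   # 6 bits

# A 6x7 face image of 8-bit grayscale pixels carries up to 42 * 8 bits.
bits_available = 6 * 7 * 8               # 336 bits

# A single 8-bit pixel (256 values) can already index a 26-letter alphabet.
alphabet_fits = 2 ** 8 >= 26

print(bits_needed, bits_available, alphabet_fits)  # 6 336 True
```

Even a thumbnail-sized face crop carries far more raw information than is strictly needed to tell 50 people apart, which is why extreme downscaling alone is a weak privacy defense.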
In fact, face recognition is a misleading term. Face recognition is a search engine for faces that can only ever show you the most likely match. This presents a real threat to privacy. Globally, iPhone users relying on FaceID and TouchID to protect their information unwittingly agree to a 1/1,000,000 false-match probability. NIST 906932. Performance Assessment of Face Recognition Using Super-Resolution. Shuowen Hu, Robert Maschal, S. Susan Young, Tsai Hong Hong, Jonathon P. Phillips↩ A list of 100 things computer vision can see, e.g.:

```shell
# Download the 20 ADL dataset videos (P_01.MP4 through P_20.MP4)
for i in $(seq -w 1 20); do
  wget "http://visiond1.cs.umbc.edu/webpage/codedata/ADLdataset/ADL_videos/P_${i}.MP4"
done
```

The 27 attributes are: source: https://github.com/vana77/Market-1501_Attribute/blob/master/README.md The 23 attributes are: source: https://github.com/vana77/DukeMTMC-attribute/blob/master/README.md The joints and other keypoints (eyes, ears, nose, shoulders, elbows, wrists, hips, knees and ankles)
The 3D pose inferred from the keypoints
Visibility boolean for each keypoint
Region annotations (upper clothes, lower clothes, dress, socks, shoes, hands, gloves, neck, face, hair, hat, sunglasses, bag, occluder)
Body type (male, female or child) source: https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/h3d/
Right ankle
Right knee
Right hip
Left hip
Left knee
Left ankle
Right wrist
Right elbow
Right shoulder
Left shoulder
Left elbow
Left wrist
Neck
Head top source: http://web.archive.org/web/20170915023005/sam.johnson.io/research/lsp.html [ page under development ] UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at the University of Colorado Colorado Springs. According to the authors of two papers associated with the dataset, subjects were "photographed using a long-range high-resolution surveillance camera without their knowledge" 2. To create the dataset, the researchers used a Canon 7D digital camera fitted with a Sigma 800mm telephoto lens and photographed students 150–200m away through their office window. Photos were taken during the morning and afternoon while students were walking to and from classes. The primary uses of this dataset are to train, validate, and build recognition and face detection algorithms for realistic surveillance scenarios. What makes the UCCS dataset unique is that it includes the highest resolution images of any publicly available face recognition dataset discovered so far (18MP), that it was captured on a campus without consent or awareness using a long-range telephoto lens, and that it was funded by United States defense and intelligence agencies. Combined funding sources for the creation of the initial and final release of the dataset include ODNI (Office of the Director of National Intelligence), IARPA (Intelligence Advanced Research Projects Activity), ONR MURI (Office of Naval Research and Department of Defense Multidisciplinary University Research Initiative), Army SBIR (Small Business Innovation Research), SOCOM SBIR (Special Operations Command Small Business Innovation Research), and the National Science Foundation. 1 2 In 2017 the UCCS face dataset was used for a defense and intelligence agency funded face recognition challenge at the International Joint Biometrics Conference in Denver, CO.
And in 2018 the dataset was used for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Conference on Computer Vision (ECCV) in Munich, Germany. Additional research projects that have used the UCCS dataset are included below in the list of verified citations. UCCS is part of the IARPA Janus team: https://vast.uccs.edu/project/iarpa-janus/ UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at the University of Colorado Colorado Springs. According to the authors of two papers associated with the dataset, over 1,700 students and pedestrians were "photographed using a long-range high-resolution surveillance camera without their knowledge" 2. In this investigation, we examine the funding sources, contents of the dataset, photo EXIF data, and publicly available research project citations. According to the authors of the UnConstrained College Students dataset, it is primarily used for research and development of "face detection and recognition research towards surveillance applications that are becoming more popular and more required nowadays, and where no automatic recognition algorithm has proven to be useful yet." Applications of this technology include usage by defense and intelligence agencies, who were also the primary funding sources of the UCCS dataset. In the two papers associated with the release of the UCCS dataset (Unconstrained Face Detection and Open-Set Face Recognition Challenge and Large Scale Unconstrained Open Set Face Database), the researchers disclosed their funding sources as ODNI (United States Office of the Director of National Intelligence), IARPA (Intelligence Advanced Research Projects Activity), ONR MURI (Office of Naval Research and Department of Defense Multidisciplinary University Research Initiative), Army SBIR (Small Business Innovation Research), SOCOM SBIR (Special Operations Command Small Business Innovation Research), and the National Science Foundation.
Further, UCCS's VAST site explicitly states that they are part of IARPA Janus, a face recognition project developed to serve the needs of national intelligence interests. The UCCS dataset includes the highest resolution images of any publicly available face recognition dataset discovered so far (18MP) and was, as of 2018, the "largest surveillance FR benchmark in the public domain." 3 To create the dataset, the researchers used a Canon 7D digital camera fitted with a Sigma 800mm telephoto lens and photographed students from a distance of 150–200m through their office window. Photos were taken during the morning and afternoon while students were walking to and from classes. According to an analysis of the EXIF data embedded in the photos, nearly half of the 16,149 photos were taken on Tuesdays. The most popular time was during lunch break. All of the photos were taken during the spring semesters of 2012 and 2013, but the dataset was not publicly released until 2016. In 2017 the UCCS face dataset was used for a defense and intelligence agency funded face recognition challenge at the International Joint Biometrics Conference in Denver, CO. And in 2018 the dataset was again used for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Conference on Computer Vision (ECCV) in Munich, Germany. Additional research projects that have used the UCCS dataset are included below in the list of verified citations. As of April 15, 2019, the UCCS dataset is no longer available for public download. During the three years it was publicly available (2016-2019), the UCCS dataset appeared in at least 5 publicly available research papers, including verified usage from the University of Notre Dame (US), Beihang University (China), Beckman Institute (US), Queen Mary University of London (UK), Carnegie Mellon University (US), Karlsruhe Institute of Technology (DE), and Vision Semantics Ltd (UK), which lists the UK Ministry of Defence and Metropolitan Police as partners.
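The weekday analysis described above boils down to reading each photo's EXIF timestamp and counting by day of week. A minimal sketch (assuming the photos carry a standard EXIF datetime tag; the example timestamps below are hypothetical, not taken from the dataset):

```python
from collections import Counter
from datetime import datetime

def exif_weekday(dt_string):
    # EXIF datetimes use the "YYYY:MM:DD HH:MM:SS" format
    return datetime.strptime(dt_string, "%Y:%m:%d %H:%M:%S").strftime("%A")

def weekday_histogram(dt_strings):
    # Count photos per weekday from a list of EXIF datetime strings
    return Counter(exif_weekday(s) for s in dt_strings)

# Extracting the strings with Pillow would look like (tag 306 is DateTime):
#   from PIL import Image
#   dt_strings = [Image.open(p).getexif()[306] for p in photo_paths]

print(weekday_histogram(["2012:04:17 12:31:05", "2013:04:16 12:02:44"]))
```

The same histogram over all 16,149 photos is what surfaces the Tuesday and lunch-break concentrations.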
To show the types of face images used in the UCCS student dataset while protecting individual privacy, a generative adversarial network was used to interpolate between identities in the dataset. The image below shows the output of a generative adversarial network trained on the UCCS face bounding box areas: over 90,000 face regions from 16,000 images.
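Identity interpolation of this kind blends points in the generator's latent space rather than blending pixels, so every intermediate frame is a plausible face belonging to no one. A minimal sketch of the idea (the generator here is a stand-in; the actual trained model and its weights are not part of this text):

```python
import numpy as np

def interpolate_latents(z0, z1, steps=8):
    # Linearly blend two latent vectors; each intermediate point decodes
    # to a face that mixes the two identities.
    ts = np.linspace(0.0, 1.0, steps)
    return [(1.0 - t) * z0 + t * z1 for t in ts]

def fake_generator(z):
    # Stand-in for a trained GAN generator mapping latent -> image;
    # here it just reshapes the latent into a pretend 8x8 "image".
    return np.tanh(z.reshape(8, 8))

rng = np.random.default_rng(0)
z0, z1 = rng.normal(size=64), rng.normal(size=64)
frames = [fake_generator(z) for z in interpolate_latents(z0, z1)]
```

With a real generator, rendering `frames` as a strip produces the morphing faces shown in the visualization.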
The location of the camera and subjects can be confirmed using several visual cues in the dataset images: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no parking sign and the directionality of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage all match images in the dataset. The original papers also provide another clue: a picture of the camera inside the office that was used to create the dataset. The window view in this image provides another match for the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the location on Google Maps. The UnConstrained College Students dataset is associated with two main research papers: "Large Scale Unconstrained Open Set Face Database" and "Unconstrained Face Detection and Open-Set Face Recognition Challenge". Collectively, these papers and the creation of the dataset have received funding from the following organizations: Sapkota, Archana and Boult, Terrance. "Large Scale Unconstrained Open Set Face Database." 2013. Günther, M. et al. "Unconstrained Face Detection and Open-Set Face Recognition Challenge," 2018. Arxiv 1708.02337v3. "Surveillance Face Recognition Challenge". SemanticScholar [ page under development ] [ page under development ] [ page under development ] [ page under development ] Brainwash is a head detection dataset created from San Francisco's Brainwash Cafe livecam footage. It includes 11,918 images of "everyday life of a busy downtown cafe" 1 captured at 100-second intervals throughout the entire day. The Brainwash dataset was captured during 3 days in 2014: October 27, November 13, and November 24.
According to the authors' research paper introducing the dataset, the images were acquired with the help of Angelcam.com. 2 Brainwash is not a widely used dataset, but since its publication by Stanford University in 2015 it has notably appeared in several research papers from the National University of Defense Technology in Changsha, China. In 2016 and in 2017, researchers there conducted studies on detecting people's heads in crowded scenes for the purpose of surveillance. 3 4 If you happened to be at Brainwash cafe in San Francisco at any time on October 27, November 13, or November 24 in 2014, you are most likely included in the Brainwash dataset and have unwittingly contributed to surveillance research. [ page under development ] [ page under development ] [ PAGE UNDER DEVELOPMENT ] [ PAGE UNDER DEVELOPMENT ] [ PAGE UNDER DEVELOPMENT ] [ PAGE UNDER DEVELOPMENT ] [ page under development ] Duke MTMC (Multi-Target, Multi-Camera Tracking) is a dataset of video recorded on Duke University campus for research and development of networked camera surveillance systems. MTMC tracking algorithms are used for citywide dragnet surveillance systems such as those used throughout China by SenseTime 1 and the oppressive monitoring of 2.5 million Uyghurs in Xinjiang by SenseNets 2. In fact, researchers from both SenseTime 4 5 and SenseNets 3 have used the Duke MTMC dataset for their research.
In this investigation into the Duke MTMC dataset, we found that researchers at Duke University in Durham, North Carolina captured over 2,000 students, faculty members, and passersby into one of the most prolific public surveillance research datasets, used around the world by commercial and defense surveillance organizations. Since its publication in 2016, the Duke MTMC dataset has been used in over 100 studies at organizations around the world including SenseTime 4 5, SenseNets 3, IARPA and IBM 9, the National University of Defense Technology 7 8, the US Department of Homeland Security 10, Tencent, Microsoft, Microsoft Asia, Fraunhofer, Senstar Corp., Alibaba, Naver Labs, Google, and Hewlett-Packard Labs, to name only a few. The Duke MTMC dataset, recorded in 2014 and published in 2016, was originally funded by the U.S. Army Research Laboratory and the National Science Foundation 6. However, our analysis of the geographic locations of the publicly available research shows over twice as many citations by researchers from China as from the United States (44% China, 20% United States). In 2018 alone, there were 70 research project citations from China. The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy" 6. Camera 5 was positioned to capture students entering and exiting the university's main chapel. Each camera's location and approximate field of view. The heat map visualization shows the locations where pedestrians were most frequently annotated in each video from the Duke MTMC dataset.
https://foreignpolicy.com/2019/03/19/962492-orwell-china-socialcredit-surveillance/ "Attention-Aware Compositional Network for Person Re-identification". 2018. SemanticScholar, PDF "End-to-End Deep Kronecker-Product Matching for Person Re-identification". 2018. SemanticScholar, PDF [ page under development ] {% include 'dashboard.html' %} [ page under development ]
This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
To help understand how LFW has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.

The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using the names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
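The aggregation step behind these visualizations amounts to mapping each verified institution to a country and counting. A minimal sketch (the lookup table here is hypothetical and hand-picked from institutions named elsewhere on this page; the project's real table is far larger):

```python
from collections import Counter

# Hypothetical hand-verified lookup from institution name, as found in a
# paper's PDF front matter, to country.
INSTITUTION_COUNTRY = {
    "Carnegie Mellon University": "United States",
    "Queen Mary University of London": "United Kingdom",
    "Beihang University": "China",
}

def citations_by_country(institutions):
    # Aggregate verified citations into per-country totals, skipping
    # institutions that could not be geocoded.
    found = (INSTITUTION_COUNTRY.get(name) for name in institutions)
    return Counter(c for c in found if c is not None)

print(citations_by_country([
    "Carnegie Mellon University",
    "Beihang University",
    "Beihang University",
    "Unknown Lab",
]))
```

The per-country totals feed the bar chart directly; the per-institution coordinates feed the map markers.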
The FERET program is sponsored by the U.S. Department of Defense's Counterdrug Technology Development Program Office. The U.S. Army Research Laboratory (ARL) is the technical agent for the FERET program. ARL designed, administered, and scored the FERET tests. George Mason University collected, processed, and maintained the FERET database. Inquiries regarding the FERET database or test should be directed to P. Jonathon Phillips. [ page under development ] [ page under development ] {% include 'dashboard.html' %}
[ PAGE UNDER DEVELOPMENT ] [ PAGE UNDER DEVELOPMENT ] Labeled Faces in the Wild (LFW) is "a database of face photographs designed for studying the problem of unconstrained face recognition" 1. It is used to evaluate and improve the performance of facial recognition algorithms in academic, commercial, and government research. According to BiometricUpdate.com 3, LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong." The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002-2004. LFW is a subset of Names and Faces and is part of the first facial recognition training dataset created entirely from images appearing on the Internet. The people appearing in LFW are... The Names and Faces dataset was the first face recognition dataset created entirely from online photos. However, Names and Faces and LFW are not the first face recognition datasets created entirely "in the wild". That title belongs to the UCD dataset. Obtaining images "in the wild" means using images without explicit consent or awareness from the subject or photographer. [ PAGE UNDER DEVELOPMENT] [ PAGE UNDER DEVELOPMENT] [ PAGE UNDER DEVELOPMENT ] [ PAGE UNDER DEVELOPMENT ] https://www.hrw.org/news/2019/01/15/letter-microsoft-face-surveillance-technology The Oxford Town Centre dataset is a CCTV video of pedestrians in a busy downtown area in Oxford used for research and development of activity and face recognition systems. 1 The CCTV video was obtained from a public surveillance camera at the corner of Cornmarket and Market St. in Oxford, England and includes approximately 2,200 people.
Since its publication in 2009 2, the Oxford Town Centre dataset has been used in over 80 verified research projects, including commercial research by Amazon, Disney, OSRAM, and Huawei, and academic research in China, Israel, Russia, Singapore, the US, and Germany, among dozens more. The Oxford Town Centre dataset is unique in that it uses footage from a public surveillance camera that would otherwise be designated for public safety. The video shows that the pedestrians act naturally and unrehearsed, indicating they neither knew of nor consented to participation in the research project. [ PAGE UNDER DEVELOPMENT ] [ PAGE UNDER DEVELOPMENT ] [ PAGE UNDER DEVELOPMENT ] [ PAGE UNDER DEVELOPMENT ] [ page under development ] [ page under development ] [ page under development ] [ page under development ] VIPeR (Viewpoint Invariant Pedestrian Recognition) is a dataset of pedestrian images captured at the University of California Santa Cruz in 2007. According to the researchers 2, "cameras were placed in different locations in an academic setting and subjects were notified of the presence of cameras, but were not coached or instructed in any way." VIPeR is amongst the most widely used publicly available person re-identification datasets. In 2017 the VIPeR dataset was combined into a larger person re-identification dataset created by the Chinese University of Hong Kong called PETA (PEdesTrian Attribute). [ page under development ] [ page under development ] This page was last updated on {{ metadata.updated }} [ page under development ]
The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy" 6. Camera 5 was positioned to capture students entering and exiting the university's main chapel. Each camera's location and approximate field of view. The heat map visualization shows the locations where pedestrians were most frequently annotated in each video from the Duke MTMC dataset. https://foreignpolicy.com/2019/03/19/962492-orwell-china-socialcredit-surveillance/ "Attention-Aware Compositional Network for Person Re-identification". 2018. SemanticScholar, PDF "End-to-End Deep Kronecker-Product Matching for Person Re-identification". 2018. SemanticScholar, PDF [ page under development ] UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at University of Colorado Colorado Springs. According to the authors of two papers associated with the dataset, subjects were "photographed using a long-range high-resolution surveillance camera without their knowledge" 2. To create the dataset, the researchers used a Canon 7D digital camera fitted with a Sigma 800mm telephoto lens and photographed students 150–200m away through their office window. Photos were taken during the morning and afternoon while students were walking to and from classes. The primary uses of this dataset are to train, validate, and build recognition and face detection algorithms for realistic surveillance scenarios. What makes the UCCS dataset unique is that it includes the highest resolution images of any publicly available face recognition dataset discovered so far (18MP), that it was captured on a campus without consent or awareness using a long-range telephoto lens, and that it was funded by United States defense and intelligence agencies. 
Combined funding sources for the creation of the initial and final release of the dataset include ODNI (Office of Director of National Intelligence), IARPA (Intelligence Advanced Research Projects Activity), ONR MURI (Office of Naval Research and The Department of Defense Multidisciplinary University Research Initiative), Army SBIR (Small Business Innovation Research), SOCOM SBIR (Special Operations Command and Small Business Innovation Research), and the National Science Foundation. 1 2 In 2017 the UCCS face dataset was used for a defense and intelligence agency funded face recognition challenge at the International Joint Biometrics Conference in Denver, CO. And in 2018 the dataset was used for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Computer Vision Conference (ECCV) in Munich, Germany. Additional research projects that have used the UCCS dataset are included below in the list of verified citations. UCCS is part of the IARPA Janus team https://vast.uccs.edu/project/iarpa-janus/ https://arxiv.org/abs/1708.02337 UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at University of Colorado Colorado Springs developed primarily for research and development of "face detection and recognition research towards surveillance applications" 1. According to the authors of two papers associated with the dataset, over 1,700 students and pedestrians were "photographed using a long-range high-resolution surveillance camera without their knowledge". 3 In this investigation, we examine the contents of the dataset, funding sources, photo EXIF data, and information from publicly available research project citations. The UCCS dataset includes over 1,700 unique identities, most of which are students walking to and from class. As of 2018, it was the "largest surveillance [face recognition] benchmark in the public domain." 
4 The photos were taken during the spring semesters of 2012 – 2013 on the West Lawn of the University of Colorado Colorado Springs campus. The photographs were timed to capture students during breaks between their scheduled classes in the morning and afternoon, Monday through Thursday. "For example, a student taking Monday-Wednesday classes at 12:30 PM will show up in the camera on almost every Monday and Wednesday." 2. The long-range surveillance images in the UnConstrained College Students dataset were captured using a Canon 7D 18 megapixel digital camera fitted with a Sigma 800mm F5.6 EX APO DG HSM telephoto lens and pointed out an office window across the university's West Lawn. The students were photographed from a distance of approximately 150 meters through an office window. "The camera [was] programmed to start capturing images at specific time intervals between classes to maximize the number of faces being captured." 2
+Their setup made it impossible for students to know they were being photographed, providing the researchers with realistic surveillance images to help build face detection and recognition systems for real-world defense, intelligence, and commercial applications. The EXIF data embedded in the images shows that the photo capture times follow a similar pattern, but also highlights that the vast majority of photos (over 7,000) were taken on Tuesdays around noon during students' lunch break. The lack of any photos taken on Friday shows that the researchers were only interested in capturing images of students. The two research papers associated with the release of the UCCS dataset (Unconstrained Face Detection and Open-Set Face Recognition Challenge and Large Scale Unconstrained Open Set Face Database) acknowledge that the primary funding sources for their work were United States defense and intelligence agencies. Specifically, development of the UnConstrained College Students dataset was funded by the Intelligence Advanced Research Projects Activity (IARPA), Office of Director of National Intelligence (ODNI), Office of Naval Research and The Department of Defense Multidisciplinary University Research Initiative (ONR MURI), Small Business Innovation Research (SBIR), Special Operations Command and Small Business Innovation Research (SOCOM SBIR), and the National Science Foundation. Further, UCCS's VAST site explicitly states they are part of IARPA Janus, a face recognition project developed to serve the needs of national intelligence interests, clearly establishing that the funding sources and immediate beneficiaries of this dataset are United States defense and intelligence agencies. Although the images were first captured in 2012 – 2013, the dataset was not publicly released until 2016. 
Then in 2017 the UCCS face dataset formed the basis for a defense and intelligence agency funded face recognition challenge project at the International Joint Biometrics Conference in Denver, CO. And in 2018 the dataset was again used for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Computer Vision Conference (ECCV) in Munich, Germany. As of April 15, 2019, the UCCS dataset is no longer available for public download. But during the three years it was publicly available (2016-2019) the UCCS dataset appeared in at least 6 publicly available research papers, including verified usage from Beihang University, which is known to provide research and development for China's military. The images in UCCS were taken on 18 non-consecutive days during 2012–2013. Analysis of the EXIF data embedded in the original images reveals that most of the images were taken on Tuesdays, and the most frequent capture time throughout the week was 12:30 PM. To show the types of face images used in the UCCS student dataset while protecting individuals' privacy, a generative adversarial network was used to interpolate between identities in the dataset. The image below shows a generative adversarial network trained on the UCCS face bounding box areas from 16,000 images and over 90,000 face regions. The location of the camera and subjects can be confirmed using several visual cues in the dataset images: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no parking sign and the directionality of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage, all of which match images in the dataset. The original papers also provide another clue: a picture of the camera inside the office that was used to create the dataset. 
The window view in this image provides another match for the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the location on Google Maps The UnConstrained College Students dataset is associated with two main research papers: "Large Scale Unconstrained Open Set Face Database" and "Unconstrained Face Detection and Open-Set Face Recognition Challenge". Collectively, these papers and the creation of the dataset have received funding from the following organizations: This page was last updated on 2019-4-15 "2nd Unconstrained Face Detection and Open Set Recognition Challenge." https://vast.uccs.edu/Opensetface/. Accessed April 15, 2019. Sapkota, Archana and Boult, Terrance. "Large Scale Unconstrained Open Set Face Database." 2013. Günther, M. et al. "Unconstrained Face Detection and Open-Set Face Recognition Challenge." 2018. arXiv 1708.02337v3. "Surveillance Face Recognition Challenge". SemanticScholar [ page under development ] {% include 'dashboard.html' %} [ page under development ]
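The capture-day analysis described above can be sketched in a few lines of Python. The timestamps below are hypothetical stand-ins for the EXIF `DateTimeOriginal` values embedded in the dataset images (EXIF stores them as `"YYYY:MM:DD HH:MM:SS"`); the real analysis would extract these strings from the roughly 16,000 photos with an EXIF library first.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample of EXIF DateTimeOriginal strings; stand-ins
# for values read from the actual dataset images.
timestamps = [
    "2012:04:17 12:30:02",  # a Tuesday
    "2012:04:17 12:31:42",
    "2013:04:16 09:15:10",  # also a Tuesday
    "2012:04:18 14:05:33",  # a Wednesday
]

def tally(ts):
    """Count captures per weekday and per hour of day."""
    by_day = Counter()
    by_hour = Counter()
    for s in ts:
        dt = datetime.strptime(s, "%Y:%m:%d %H:%M:%S")
        by_day[dt.strftime("%A")] += 1
        by_hour[dt.hour] += 1
    return by_day, by_hour

by_day, by_hour = tally(timestamps)
print(by_day.most_common(1))  # most frequent capture day in the sample
```

Run over the full image set, the same tally surfaces the Tuesday and 12:30 PM peaks reported above.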
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how 50 People One Question Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing 50 People One Question was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
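As a rough illustration of how the verified, geocoded citations could be aggregated into the country rankings and yearly totals shown in these charts, here is a minimal Python sketch. The records below are invented placeholders, not actual citation counts, and the helper name `top_countries` is our own.

```python
from collections import Counter, defaultdict

# Hypothetical verified-citation records: (institution country, year).
citations = [
    ("China", 2018), ("China", 2018), ("United States", 2017),
    ("China", 2017), ("Germany", 2018), ("United States", 2018),
]

def top_countries(records, n=10):
    """Rank countries by total citations, with per-year breakdowns,
    keeping at most the top n (the charts show at most 10)."""
    totals = Counter(country for country, _ in records)
    yearly = defaultdict(Counter)
    for country, year in records:
        yearly[country][year] += 1
    return [(c, t, dict(yearly[c])) for c, t in totals.most_common(n)]

ranking = top_countries(citations)
# ranking[0] is the top-cited country with its yearly totals
```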
+ [ page under development ] {% include 'dashboard.html' %} [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how the Asian Face Age Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing the Asian Face Age Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The Asian Face Age Dataset (AFAD) is a dataset proposed for evaluating the performance of age estimation, which contains more than 160K facial images and the corresponding age and gender labels. This dataset is oriented to age estimation on Asian faces, so all the facial images are of Asian faces. AFAD is the biggest dataset for age estimation to date. It is well suited to evaluate how deep learning methods can be adopted for age estimation.
Motivation: For age estimation, there are several public datasets for evaluating the performance of a specific algorithm, such as FG-NET [1] (1002 face images), MORPH I (1690 face images), and MORPH II [2] (55,608 face images). Among them, MORPH II is the biggest public dataset to date. On the other hand, it is necessary to collect a large-scale dataset to train a deep Convolutional Neural Network. Therefore, the MORPH II dataset is extensively used to evaluate how deep learning methods can be adopted for age estimation [3][4]. Brainwash is a head detection dataset created from San Francisco's Brainwash Cafe livecam footage. It includes 11,918 images of "everyday life of a busy downtown cafe" 1 captured at 100 second intervals throughout the entire day. The Brainwash dataset was captured during 3 days in 2014: October 27, November 13, and November 24. According to the authors' research paper introducing the dataset, the images were acquired with the help of Angelcam.com. 2 Brainwash is not a widely used dataset but since its publication by Stanford University in 2015, it has notably appeared in several research papers from the National University of Defense Technology in Changsha, China. In 2016 and in 2017 researchers there conducted studies on detecting people's heads in crowded scenes for the purpose of surveillance. 
3 4 If you happened to be at Brainwash cafe in San Francisco at any time on October 27, November 13, or November 24, 2014, you are most likely included in the Brainwash dataset and have unwittingly contributed to surveillance research. {% include 'dashboard.html' %} {% include 'supplementary_header.html' %} TODO
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how Brainwash Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Brainwash Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ TODO {% include 'cite_our_work.html' %}
+
+ If you use our data, research, or graphics please cite our work:
+
+ "readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385. Stewart, Russell and Andriluka, Mykhaylo. "End-to-end people detection in crowded scenes". 2016. Li, Y., Dou, Y., Liu, X. and Li, T. "Localized Region Context and Object Feature Fusion for People Head Detection". ICIP16 Proceedings. 2016. Pages 594-598. [ page under development ] {% include 'dashboard.html' %} [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how Brainwash Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Brainwash Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The dataset contains images of people collected from the web by typing common given names into Google Image Search. The coordinates of the eyes, the nose and the center of the mouth for each frontal face are provided in a ground truth file. This information can be used to align and crop the human faces or as a ground truth for a face detection algorithm. The dataset has 10,524 human faces of various resolutions and in different settings, e.g. portrait images, groups of people, etc. Profile faces or very low resolution faces are not labeled. [ PAGE UNDER DEVELOPMENT ] {% include 'dashboard.html' %} [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how CelebA Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Large-scale CelebFaces Attributes Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ PAGE UNDER DEVELOPMENT ] {% include 'dashboard.html' %} COFW "is designed to benchmark face landmark algorithms in realistic conditions, which include heavy occlusions and large shape variations" [Robust face landmark estimation under occlusion]. [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how COFW Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Caltech Occluded Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ COFW "is designed to benchmark face landmark algorithms in realistic conditions, which include heavy occlusions and large shape variations" [Robust face landmark estimation under occlusion]. We asked four people with different levels of computer vision knowledge to each collect 250 faces representative of typical real-world images, with the clear goal of challenging computer vision methods.
The result is 1,007 images of faces obtained from a variety of sources. This research is supported by NSF Grant 0954083 and by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA R&D Contract No. 2014-14071600012. https://www.cs.cmu.edu/~peiyunh/topdown/ {% include 'map.html' %} {% include 'supplementary_header.html' %} {% include 'citations.html' %} {% include 'chart.html' %} TODO
+ To help understand how COFW Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Caltech Occluded Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the location markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+ TODO [ page under development ] Duke MTMC (Multi-Target, Multi-Camera Tracking) is a dataset of video recorded on Duke University campus for research and development of networked camera surveillance systems. MTMC tracking algorithms are used for citywide dragnet surveillance systems such as those used throughout China by SenseTime 1 and the oppressive monitoring of 2.5 million Uyghurs in Xinjiang by SenseNets 2. In fact researchers from both SenseTime 4 5 and SenseNets 3 used the Duke MTMC dataset for their research. In this investigation into the Duke MTMC dataset, we found that researchers at Duke University in Durham, North Carolina captured over 2,000 students, faculty members, and passersby into one of the most prolific public surveillance research datasets that's used around the world by commercial and defense surveillance organizations. Since its publication in 2016, the Duke MTMC dataset has been used in over 100 studies at organizations around the world including SenseTime 4 5, SenseNets 3, IARPA and IBM 9, Chinese National University of Defense 7 8, US Department of Homeland Security 10, Tencent, Microsoft, Microsoft Asia, Fraunhofer, Senstar Corp., Alibaba, Naver Labs, Google and Hewlett-Packard Labs to name only a few. The creation of the Duke MTMC dataset in 2014 (published in 2016) was originally funded by the U.S. Army Research Laboratory and the National Science Foundation 6. However, our analysis of the geographic locations of the publicly available research shows more than twice as many citations by researchers from China as from the United States (44% China, 20% United States). In 2018 alone, there were 70 research project citations from China. The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy" 6. Camera 5 was positioned to capture students entering and exiting the university's main chapel. Each camera's location and approximate field of view. 
The heat map visualization shows the locations where pedestrians were most frequently annotated in each video from the Duke MTMC dataset. {% include 'dashboard.html' %} {% include 'supplementary_header.html' %} The 8 cameras deployed on Duke's campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy" 6. Camera 5 was positioned to capture students entering and exiting the university's main chapel. Each camera's location and approximate field of view. The heat map visualization shows the locations where pedestrians were most frequently annotated in each video from the Duke MTMC dataset.
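A heat map like the one described above can be produced by binning annotated pedestrian positions into a coarse grid per camera view. The sketch below assumes (x, y) positions have already been derived from the dataset's per-frame ground-truth bounding boxes; the coordinates, frame size, and grid resolution are hypothetical.

```python
def heatmap(points, width, height, cols=4, rows=3):
    """Bin (x, y) pedestrian positions into a rows x cols count grid."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        cx = min(int(x / width * cols), cols - 1)
        cy = min(int(y / height * rows), rows - 1)
        grid[cy][cx] += 1
    return grid

# Hypothetical annotated positions from one 1920x1080 camera view.
points = [(100, 200), (110, 210), (900, 500), (120, 190)]
grid = heatmap(points, width=1920, height=1080)
```

Cells with the highest counts correspond to the hot spots rendered in the visualization.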
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how Duke MTMC Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Duke Multi-Target, Multi-Camera Tracking Project was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ Original funding for the Duke MTMC dataset was provided by the Army Research Office under Grant No. W911NF-10-1-0387 and by the National Science Foundation
under Grants IIS-10-17017 and IIS-14-20894. The video timestamps contain the likely, but not yet confirmed, dates and times of capture. Because the video timestamps align with the start and stop time sync data provided by the researchers, they at least establish the relative timing. The rainy weather on that day also contributes to the likelihood of March 14, 2014. If you attended Duke University and were captured by any of the 8 surveillance cameras positioned on campus in 2014, there is unfortunately no way to be removed. The dataset files have been distributed throughout the world and it would not be possible to contact all the owners for removal. Nor do the authors provide any options for students to opt out, nor did they even inform students they would be used as test subjects for surveillance research and development in a project funded, in part, by the United States Army Research Office. {% include 'cite_our_work.html' %} If you use any data from the Duke MTMC please follow their license and cite their work as:
+
+ If you use our data, research, or graphics please cite our work:
+
+ If you use any data from the Duke MTMC please follow their license and cite their work as: [ page under development ] {% include 'dashboard.html' %} [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how LFW has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The FERET program is sponsored by the U.S. Department of Defense’s Counterdrug Technology Development Program Office. The U.S. Army Research Laboratory (ARL) is the technical agent for the FERET program. ARL designed, administered, and scored the FERET tests. George Mason University collected, processed, and maintained the FERET database. Inquiries regarding the FERET database or test should be directed to P. Jonathon Phillips. [ page under development ] {% include 'dashboard.html' %} [ page under development ] {% include 'dashboard.html' %} {% include 'dashboard.html' %}
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how LFWP has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Face Parts in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ Release 1 of LFPW consists of 1,432 faces from images downloaded from the web using simple text queries on sites such as google.com, flickr.com, and yahoo.com. Each image was labeled by three MTurk workers, and 29 fiducial points, shown below, are included in the dataset. LFPW was originally described in the following publication: Due to copyright issues, we cannot distribute image files in any format to anyone. Instead, we have made available a list of image URLs where you can download the images yourself. We realize that this makes it impossible to exactly compare numbers, as image links will slowly disappear over time, but we have no other option. This seems to be the way other large web-based databases are evolving. [ PAGE UNDER DEVELOPMENT ] Labeled Faces in The Wild (LFW) is "a database of face photographs designed for studying the problem of unconstrained face recognition" 1. It is used to evaluate and improve the performance of facial recognition algorithms in academic, commercial, and government research. According to BiometricUpdate.com 3, LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong." The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002-2004. LFW is a subset of Names and Faces and is part of the first facial recognition training dataset created entirely from images appearing on the Internet. The people appearing in LFW are... The Names and Faces dataset was the first face recognition dataset created entirely from online photos. However, Names and Faces and LFW are not the first face recognition dataset created entirely "in the wild". That title belongs to the UCD dataset. 
Images obtained "in the wild" means using an image without explicit consent or awareness from the subject or photographer. {% include 'dashboard.html' %} {% include 'supplementary_header.html' %}
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how LFW has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Labeled Faces in the Wild was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
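The aggregation behind the bar chart can be sketched as follows. This is a hypothetical illustration of the ranking step only: the field names and sample records are invented for the example, not the project's actual data schema, and assume each verified citation has already been geocoded to a country and publication year.

```python
from collections import Counter, defaultdict

# Illustrative records: each verified citation reduced to a country and year.
citations = [
    {"country": "China", "year": 2017},
    {"country": "China", "year": 2018},
    {"country": "United States", "year": 2018},
    {"country": "Germany", "year": 2018},
    {"country": "China", "year": 2019},
]

def top_countries(citations, limit=10):
    """Rank countries by total citation count, keeping per-year totals
    for the mouse-over tooltips; at most `limit` countries are returned."""
    totals = Counter(c["country"] for c in citations)
    yearly = defaultdict(Counter)
    for c in citations:
        yearly[c["country"]][c["year"]] += 1
    return [(country, count, dict(yearly[country]))
            for country, count in totals.most_common(limit)]

print(top_countries(citations))
```

With the sample data above, China ranks first with three citations spread over 2017–2019, matching the "yearly totals on mouse-over" behavior described for the chart.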
+ Add a paragraph about how usage extends far beyond academia into research centers for the largest companies in the world, and even funnels into CIA-funded research in the US and defense industry usage in China. [ PAGE UNDER DEVELOPMENT ] {% include 'dashboard.html' %} [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how Market 1501 has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Market 1501 Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ PAGE UNDER DEVELOPMENT ] https://www.hrw.org/news/2019/01/15/letter-microsoft-face-surveillance-technology {% include 'dashboard.html' %} {% include 'supplementary_header.html' %}
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how Microsoft Celeb has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Microsoft Celebrity Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The Oxford Town Centre dataset is a CCTV video of pedestrians in a busy downtown area in Oxford used for research and development of activity and face recognition systems. 1 The CCTV video was obtained from a public surveillance camera at the corner of Cornmarket and Market St. in Oxford, England and includes approximately 2,200 people. Since its publication in 2009 2 the Oxford Town Centre dataset has been used in over 80 verified research projects including commercial research by Amazon, Disney, OSRAM, and Huawei; and academic research in China, Israel, Russia, Singapore, the US, and Germany among dozens more. The Oxford Town Centre dataset is unique in that it uses footage from a public surveillance camera that would otherwise be designated for public safety. The video shows that the pedestrians act normally and unrehearsed, indicating they neither knew of nor consented to participation in the research project. {% include 'dashboard.html' %} {% include 'supplementary_header.html' %}
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how TownCentre has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Oxford Town Centre was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The street location of the camera used for the Oxford Town Centre dataset was confirmed by matching the road, benches, and store signs. At that location, two public CCTV cameras exist, mounted on the side of the Northgate House building at 13-20 Cornmarket St. A view from a private camera in the building across the street can be ruled out because it would show more of the silhouette of the lower camera's mounting pole. Two options remain: either the public CCTV camera mounted to the side of the building was used, or the researchers mounted their own camera to the side of the building in the same location. Because the researchers used many other existing public CCTV cameras for their research projects, it is likely that they would also be able to access this camera. Although Google Street View images show this public CCTV camera pointing the other way, at least one public photo shows the upper CCTV camera pointing in the same direction as in the Oxford Town Centre dataset, proving the camera can be and has been rotated. As for the capture date, the text on the storefront display shows a sale happening from December 2nd – 7th, indicating the capture date was between or just before those dates. The capture year is either 2007 or 2008, since prior to 2007 the Carphone Warehouse (photo, history) did not exist at this location. Since the sweaters in the GAP window display are more similar to those in a GAP website snapshot from November 2007, our guess is that the footage was obtained during late November or early December 2007. The lack of street vendors and slight waste residue near the bench suggests that it was probably a weekday after rubbish removal. ==== columns ==== === end columns {% include 'cite_our_work.html' %}
+
+ If you use our data, research, or graphics please cite our work:
+
+ [ PAGE UNDER DEVELOPMENT ] {% include 'dashboard.html' %} [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how PIPA Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing People in Photo Albums Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ PAGE UNDER DEVELOPMENT ] {% include 'dashboard.html' %} [ PAGE UNDER DEVELOPMENT ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how PubFig has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Public Figures Face Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at the University of Colorado Colorado Springs, developed primarily for "face detection and recognition research towards surveillance applications" 1. According to the authors of two papers associated with the dataset, over 1,700 students and pedestrians were "photographed using a long-range high-resolution surveillance camera without their knowledge". 3 In this investigation, we examine the contents of the dataset, funding sources, photo EXIF data, and information from publicly available research project citations. The UCCS dataset includes over 1,700 unique identities, most of which are students walking to and from class. As of 2018, it was the "largest surveillance [face recognition] benchmark in the public domain." 4 The photos were taken during the spring semesters of 2012 – 2013 on the West Lawn of the University of Colorado Colorado Springs campus. The photographs were timed to capture students during breaks between their scheduled classes in the morning and afternoon, Monday through Thursday. "For example, a student taking Monday-Wednesday classes at 12:30 PM will show up in the camera on almost every Monday and Wednesday." 2.
+ The long-range surveillance images in the UnConstrained College Students dataset were captured using a Canon 7D 18-megapixel digital camera fitted with a Sigma 800mm F5.6 EX APO DG HSM telephoto lens, pointed out an office window across the university's West Lawn. The students were photographed from a distance of approximately 150 meters. "The camera [was] programmed to start capturing images at specific time intervals between classes to maximize the number of faces being captured." 2
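As a rough check of what this optical setup yields, the thin-lens approximation gives the scene width at 150 m and the resulting face resolution. The sensor figures below are the Canon 7D's published specifications (22.3 mm-wide APS-C sensor, 5184 px across); the 0.16 m face width is an illustrative assumption, not a number from the papers.

```python
# Canon 7D published sensor specs; distance from the dataset papers.
sensor_width_mm = 22.3
image_width_px = 5184
focal_length_mm = 800.0
distance_m = 150.0

# Thin-lens approximation: scene width scales linearly with distance.
scene_width_m = distance_m * sensor_width_mm / focal_length_mm  # ≈ 4.18 m
px_per_m = image_width_px / scene_width_m                       # ≈ 1240 px/m

# Assumed face width for illustration only.
face_width_m = 0.16
face_px = face_width_m * px_per_m                               # ≈ 198 px
print(f"scene width ≈ {scene_width_m:.2f} m, face ≈ {face_px:.0f} px wide")
```

Under these assumptions the camera frames only about a four-meter-wide slice of the walkway, but each face in that slice spans roughly 200 pixels, which is far more resolution than typical CCTV and well within range for face recognition research.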
Their setup made it impossible for students to know they were being photographed, providing the researchers with realistic surveillance images to help build face detection and recognition systems for real-world applications in defense, intelligence, and commercial settings. In the two papers associated with the release of the UCCS dataset (Unconstrained Face Detection and Open-Set Face Recognition Challenge and Large Scale Unconstrained Open Set Face Database), the researchers disclosed their funding sources as ODNI (United States Office of the Director of National Intelligence), IARPA (Intelligence Advanced Research Projects Activity), ONR MURI (Office of Naval Research and the Department of Defense Multidisciplinary University Research Initiative), Army SBIR (Small Business Innovation Research), SOCOM SBIR (Special Operations Command Small Business Innovation Research), and the National Science Foundation. Further, UCCS's VAST site explicitly states they are part of IARPA Janus, a face recognition project developed to serve the needs of national intelligence interests.
The EXIF data embedded in the images shows that the photo capture times follow a similar pattern, but also highlights that the vast majority of photos (over 7,000) were taken on Tuesdays around noon during students' lunch break. The lack of any photos taken on Friday shows that the researchers were only interested in capturing images of students. The two research papers associated with the release of the UCCS dataset (Unconstrained Face Detection and Open-Set Face Recognition Challenge and Large Scale Unconstrained Open Set Face Database) acknowledge that the primary funding sources for their work were United States defense and intelligence agencies. Specifically, development of the UnConstrained College Students dataset was funded by the Intelligence Advanced Research Projects Activity (IARPA), Office of the Director of National Intelligence (ODNI), Office of Naval Research and the Department of Defense Multidisciplinary University Research Initiative (ONR MURI), Small Business Innovation Research (SBIR), Special Operations Command Small Business Innovation Research (SOCOM SBIR), and the National Science Foundation. Further, UCCS's VAST site explicitly states they are part of IARPA Janus, a face recognition project developed to serve the needs of national intelligence interests, clearly establishing that the funding sources and immediate benefactors of this dataset are United States defense and intelligence agencies.
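The EXIF tally described above, grouping capture timestamps by weekday, can be sketched as follows. The timestamps here are invented examples in the standard EXIF `DateTimeOriginal` format; the dataset's actual EXIF records are not reproduced.

```python
from collections import Counter
from datetime import datetime

# Illustrative EXIF DateTimeOriginal strings (format "YYYY:MM:DD HH:MM:SS").
timestamps = [
    "2012:04:10 12:05:31",  # a Tuesday, around noon
    "2012:04:10 12:06:02",
    "2012:04:11 12:40:19",  # a Wednesday
]

def captures_per_weekday(timestamps):
    """Count photo captures per weekday name from EXIF-style timestamps."""
    days = Counter()
    for ts in timestamps:
        dt = datetime.strptime(ts, "%Y:%m:%d %H:%M:%S")
        days[dt.strftime("%A")] += 1
    return days

print(captures_per_weekday(timestamps))
```

Applied to the full dataset, a tally like this is what surfaces the Tuesday-noon spike and the complete absence of Friday captures noted above.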
Although the images were first captured in 2012 – 2013, the dataset was not publicly released until 2016. Then in 2017 the UCCS face dataset formed the basis for a defense- and intelligence-agency-funded face recognition challenge at the International Joint Conference on Biometrics in Denver, CO. And in 2018 the dataset was again used for the 2nd Unconstrained Face Detection and Open Set Recognition Challenge at the European Conference on Computer Vision (ECCV) in Munich, Germany. As of April 15, 2019, the UCCS dataset is no longer available for public download. But during the three years it was publicly available (2016-2019), the UCCS dataset appeared in at least 6 publicly available research papers, including verified usage by Beihang University, which is known to provide research and development for China's military. {% include 'dashboard.html' %} {% include 'supplementary_header.html' %} To show the types of face images used in the UCCS student dataset while protecting their individual privacy, a generative adversarial network was used to interpolate between identities in the dataset.
The image below shows a generative adversarial network trained on the UCCS face bounding box areas from 16,000 images and over 90,000 face regions. === columns 2
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how UCCS has been used around the world by commercial, military, and academic organizations, existing publicly available research citing UnConstrained College Students Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ === end columns The location of the camera and subjects can be confirmed using several visual cues in the dataset images: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no-parking sign and the directionality of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage all match images in the dataset. The original papers also provide another clue: a picture of the camera inside the office that was used to create the dataset. The window view in this image provides another match for the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the location on Google Maps The UnConstrained College Students dataset is associated with two main research papers: "Large Scale Unconstrained Open Set Face Database" and "Unconstrained Face Detection and Open-Set Face Recognition Challenge". Collectively, these papers and the creation of the dataset have received funding from the following organizations: {% include 'cite_our_work.html' %}
+
+ If you use our data, research, or graphics please cite our work:
+
+ "2nd Unconstrained Face Detection and Open Set Recognition Challenge." https://vast.uccs.edu/Opensetface/. Accessed April 15, 2019. Sapkota, Archana, and Boult, Terrance. "Large Scale Unconstrained Open Set Face Database." 2013. Günther, M. et al. "Unconstrained Face Detection and Open-Set Face Recognition Challenge." 2018. arXiv:1708.02337v3. [ page under development ] {% include 'dashboard.html' %} [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how Brainwash Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Brainwash Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ page under development ] VIPeR (Viewpoint Invariant Pedestrian Recognition) is a dataset of pedestrian images captured at the University of California Santa Cruz in 2007. According to the researchers 2, "cameras were placed in different locations in an academic setting and subjects were notified of the presence of cameras, but were not coached or instructed in any way." VIPeR is among the most widely used publicly available person re-identification datasets. In 2017 the VIPeR dataset was combined into a larger person re-identification dataset created by the Chinese University of Hong Kong called PETA (PEdesTrian Attribute). {% include 'dashboard.html' %}
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how VIPeR has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Viewpoint Invariant Pedestrian Recognition was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ [ page under development ] {% include 'dashboard.html' %} [ page under development ]
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how YouTube Celebrities has been used around the world by commercial, military, and academic organizations, existing publicly available research citing YouTube Celebrities was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ YouTube Celebrities
-
-YouTube Celebrities
+Who used YouTube Celebrities?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Notes...
+
+
+the views of our sponsors.
@@ -47,11 +97,10 @@ the views of our sponsors.
Motivation
+ Motivation
← Back to test index
![]()
![]()
![]()
![]()
Test table
+
+
+
+
+
+
+Col1
+Col2
+Col3
+
+
+
+Content1
+Content2
+Content3
+Who used {{ metadata.meta.dataset.name_display }}?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Biometric Trade Routes
-
+
Supplementary Information
+ Supplementary Information
diff --git a/site/public/datasets/oxford_town_centre/index.html b/site/public/datasets/oxford_town_centre/index.html
index db62a5a6..34234e50 100644
--- a/site/public/datasets/oxford_town_centre/index.html
+++ b/site/public/datasets/oxford_town_centre/index.html
@@ -47,11 +47,10 @@
Oxford Town Centre
![]()
Who used TownCentre?
Supplementary Information
-Location
+![]()
Demo Videos Using Oxford Town Centre Dataset
+
Microsoft Celeb Dataset (MS Celeb)
Who used MsCeleb?
+ Who used Microsoft Celeb?
Biometric Trade Routes
![]()
Who used Brainwash Dataset?
Supplementary Information
-![]()
![]()
![]()
![]()
![]()
Supplementary Information
Data Visualizations
-![]()
TODO
+![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
Alternate Layout
+![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
TODO
-
Oxford Town Centre
Who used TownCentre?
@@ -111,9 +110,8 @@
Supplementary Information
Location
-![]()
Demo Videos Using Oxford Town Centre Dataset
+![]()
Demo Videos Using Oxford Town Centre Dataset
-
-
Location
-![]()
![]()
Funding
--
cgit v1.2.3-70-g09d2
From 1422970bec513e70488ac808f4ce6310b5ac6aca Mon Sep 17 00:00:00 2001
From: adamhrv
![]()
![]()
![]()
-
Duke MTMC
![]()
![]()
![]()
![]()
Who used Duke MTMC Dataset?
Supplementary Information
-Data Visualizations
-![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
Alternate Layout
-![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
TODO
-
-
-Notes
+Oxford Town Centre
-Who used TownCentre?
@@ -110,8 +109,10 @@
Supplementary Information
Location
-![]()
Demo Videos Using Oxford Town Centre Dataset
+![]()
![]()
![]()
Demo Videos Using Oxford Town Centre Dataset
-
-
Changes
-Privacy
-3rd Party Services
-Links To Other Web Sites
-The Information We Provide
-Prohibited Uses
Indemnity
Changes
+Duke MTMC
-![]()
![]()
![]()
![]()
![]()
Who used Duke MTMC Dataset?
@@ -118,7 +117,7 @@
Oxford Town Centre
-Who used TownCentre?
--
cgit v1.2.3-70-g09d2
From 9b1e2709cbdb40eabb34d379df18e61c10e3737c Mon Sep 17 00:00:00 2001
From: Jules Laplace References
'
content += footnote_txt
+ content += 'References
Notes
References
References
-References
References
References
+
+
+@online{megapixels,
+ author = {Harvey, Adam and LaPlace, Jules},
+ title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
+ year = 2019,
+ url = {https://megapixels.cc/},
+ urldate = {2019-04-20}
+}
+
+
+While every intention is made to publish only verifiable information, at times information may be edited, removed, or appended for clarity or correction. In no event will the operators of this site be liable for your use or misuse of the information provided.
+
+We may terminate or suspend access to our Service immediately without prior notice or liability, for any reason whatsoever, including without limitation if you breach the Terms.
+
+All provisions of the Terms which by their nature should survive termination shall survive termination, including, without limitation, ownership provisions, warranty disclaimers, indemnity and limitations of liability.
+
+### Prohibited Uses
+
+You may not access or use, or attempt to access or use, the Services to take any action that could harm us or a third party. You may not use the Services in violation of applicable laws or in violation of our or any third party’s intellectual property or other proprietary or legal rights. You further agree that you shall not attempt (or encourage or support anyone else's attempt) to circumvent, reverse engineer, decrypt, or otherwise alter or interfere with the Services, or any content thereof, or make any unauthorized use thereof.
+
+Without prior written consent, you shall not:
+
+(i) access any part of the Services, Content, data or information you do not have permission or authorization to access;
+
+(ii) use robots, spiders, scripts, service, software or any manual or automatic device, tool, or process designed to data mine or scrape the Content, data or information from the Services, or otherwise access or collect the Content, data or information from the Services using automated means;
+
+(iii) use services, software or any manual or automatic device, tool, or process designed to circumvent any restriction, condition, or technological measure that controls access to the Services in any way, including overriding any security feature or bypassing or circumventing any access controls or use limits of the Services;
+
+(iv) cache or archive the Content (except for a public search engine’s use of spiders for creating search indices);
+
+(v) take action that imposes an unreasonable or disproportionately large load on our network or infrastructure; and
+
+(vi) do anything that could disable, damage or change the functioning or appearance of the Services, including the presentation of advertising.
+
+Engaging in a prohibited use of the Services may result in civil, criminal, and/or administrative penalties, fines, or sanctions against the user and those assisting the user.
+
+### Governing Law
+
+These Terms shall be governed and construed in accordance with the laws of Berlin, Germany, without regard to its conflict of law provisions.
+
+Our failure to enforce any right or provision of these Terms will not be considered a waiver of those rights. If any provision of these Terms is held to be invalid or unenforceable by a court, the remaining provisions of these Terms will remain in effect. These Terms constitute the entire agreement between us regarding our Service, and supersede and replace any prior agreements we might have between us regarding the Service.
+
+### Indemnity
+
+You hereby indemnify, defend and hold harmless MegaPixels (and its creators) and all officers, directors, owners, agents, information providers, affiliates, licensors and licensees (collectively, the "Indemnified Parties") from and against any and all liability and costs, including, without limitation, reasonable attorneys' fees, incurred by the Indemnified Parties in connection with any claim arising out of any breach by you or any user of your account of these Terms of Service or the foregoing representations, warranties and covenants. You shall cooperate as fully as reasonably required in the defense of any such claim. We reserve the right, at our own expense, to assume the exclusive defense and control of any matter subject to indemnification by you.
+
+### Changes
+
+We reserve the right, at our sole discretion, to modify or replace these Terms at any time. By continuing to use or access our Service after revisions become effective, you agree to be bound by the revised terms. If you do not agree to revised terms, please do not use the Service.
\ No newline at end of file
diff --git a/site/content/pages/about/index.md b/site/content/pages/about/index.md
index 880ca356..f4fa5445 100644
--- a/site/content/pages/about/index.md
+++ b/site/content/pages/about/index.md
@@ -21,57 +21,39 @@ authors: Adam Harvey
-MegaPixels is an independent art and research project by Adam Harvey and Jules LaPlace investigating the ethics and individual privacy implications of publicly available face recognition datasets, and their role in industry and governmental expansion into biometric surveillance technologies.
+MegaPixels is an independent art and research project by Adam Harvey and Jules LaPlace that investigates the ethics, origins, and individual privacy implications of face recognition datasets and their role in commercial, academic, and military use of biometric surveillance technologies.
-The MegaPixels site is made possible with support from Mozilla
+ATTRIBUTION PROTOCOL
-Adam Harvey
- Jules LaPlace
-
+@online{megapixels,
+ author = {Harvey, Adam and LaPlace, Jules},
+ title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
+ year = 2019,
+ url = {https://megapixels.cc/},
+ urldate = {2019-04-20}
+}
+
-An academic report and presentation on the findings of this project is forthcoming. Throughout 2019, this site will be updated with more datasets and research reports on the general themes of remote biometric analysis and media collected "in the wild". The continued research on MegaPixels is supported by a 1 year Researcher-in-Residence grant from Karlsruhe HfG (2019-2020).
+and include this license and attribution protocol within any derivative work.
-When possible, and once thoroughly verified, data generated for MegaPixels will be made available for download on [github.com/adamhrv/megapixels](https://github.com/adamhrv/megapixels)
+If you publish data derived from MegaPixels, the original dataset creators should first be notified.
-=== columns 3
+The MegaPixels dataset is made available under the Open Data Commons Attribution License (https://opendatacommons.org/licenses/by/1.0/) and for academic use only.
-#### Team
+READABLE SUMMARY OF Open Data Commons Attribution License
-- Adam Harvey: Concept, research and analysis, design, computer vision
-- Jules LaPlace: Information and systems architecture, data management, web applications
+You are free:
-===========
+ To Share: To copy, distribute and use the dataset
+ To Create: To produce works from the dataset
+ To Adapt: To modify, transform and build upon the database
-#### Contributing Researchers
+As long as you:
-- Berit Gilma: Dataset statistics
-- Beth (aka Ms. Celeb): Dataset usage verification
-- Mathana Stender: Commercial usage verification
+ Attribute: You must attribute any public use of the database, or works produced from the database, in the manner specified in the license. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database.
-===========
-#### Code and Libraries
-
-- [Semantic Scholar](https://semanticscholar.org) for citation aggregation
-- Leaflet.js for maps
-- C3.js for charts
-- ThreeJS for 3D visualizations
-- PDFMiner.Six and Pandas for research paper data analysis
-
-=== end columns
-
-Please direct questions, comments, or feedback to [mastodon.social/@adamhrv](https://mastodon.social/@adamhrv)
diff --git a/site/content/pages/about/legal.md b/site/content/pages/about/legal.md
index a2c2b8bd..eb9f5559 100644
--- a/site/content/pages/about/legal.md
+++ b/site/content/pages/about/legal.md
@@ -40,10 +40,43 @@ The MegaPixels.cc contains many links to 3rd party websites, especially in the l
We advise you to read the terms and conditions and privacy policies of any third-party web sites or services that you visit.
+### Information We Collect
-### The Information We Provide
+When you access the Service, we record your visit in a server log file for the purposes of maintaining site security and preventing misuse. This includes your IP address and the header information sent by your web browser, which includes the User Agent, the referrer, and the requested page on our site.
-While every intention is made to publish only verifiable information, at times existing information may be revised or deleted and new information may be added for clarity or correction. In no event will the operators of this site be liable for your use or misuse of the information provided.
+### Information We Share
+
+We do not share or make public any information about individual site visitors, except where required by law. Server logs are retained only for a limited duration.
+
+
+### Information We Provide
+
+We provide information for educational, journalistic, and research purposes. The published information on MegaPixels is made available under the Open Data Commons Attribution License (https://opendatacommons.org/licenses/by/1.0/) and for academic use only.
+
+You are free:
+
+> To Share: To copy, distribute and use the dataset
+> To Create: To produce works from the dataset
+> To Adapt: To modify, transform and build upon the database
+
+As long as you:
+
+
+> Attribute: You must attribute any public use of the database, or works produced from the database, in the manner specified in the license. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database.
+
+If you use the MegaPixels data or any data derived from it, please cite the original work as follows:
+
+
+@online{megapixels,
+ author = {Harvey, Adam and LaPlace, Jules},
+ title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
+ year = 2019,
+ url = {https://megapixels.cc/},
+ urldate = {2019-04-20}
+}
+
+
+While we make every effort to publish only verifiable information, information may at times be edited, removed, or appended for clarity or correction. In no event will the operators of this site be liable for your use or misuse of the information provided.
We may terminate or suspend access to our Service immediately without prior notice or liability, for any reason whatsoever, including without limitation if you breach the Terms.
diff --git a/site/content/pages/datasets/uccs/assets/uccs_mean_bboxes_comp.jpg b/site/content/pages/datasets/uccs/assets/uccs_mean_bboxes_comp.jpg
deleted file mode 100644
index 18f4c5ec..00000000
Binary files a/site/content/pages/datasets/uccs/assets/uccs_mean_bboxes_comp.jpg and /dev/null differ
diff --git a/site/content/pages/datasets/uccs/index.md b/site/content/pages/datasets/uccs/index.md
index 05e683af..5cd17fa8 100644
--- a/site/content/pages/datasets/uccs/index.md
+++ b/site/content/pages/datasets/uccs/index.md
@@ -115,6 +115,9 @@ For further technical information about the dataset, visit the [UCCS dataset pro
- adding more verified locations to map and charts
- add EXIF file to CDN
+{% include 'cite_our_work.html' %}
+
+
### Footnotes
[^funding_sb]: Sapkota, Archana and Boult, Terrance. "Large Scale Unconstrained Open Set Face Database." 2013.
--
cgit v1.2.3-70-g09d2
From 57fba037d519e45488599288f7753cb7a3cd32aa Mon Sep 17 00:00:00 2001
From: adamhrv Cite Our Work
+
+@online{megapixels,
+ author = {Harvey, Adam and LaPlace, Jules},
+ title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
+ year = 2019,
+ url = {https://megapixels.cc/},
+ urldate = {2019-04-20}
+}
+
+
+Adam Harvey
+ Jules LaPlace
+
+@online{megapixels,
+ author = {Harvey, Adam and LaPlace, Jules},
+ title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
+ year = 2019,
+ url = {https://megapixels.cc/},
+ urldate = {2019-04-20}
+}
+
diff --git a/site/content/pages/about/legal.md b/site/content/pages/about/legal.md
index eb9f5559..85cf5c48 100644
--- a/site/content/pages/about/legal.md
+++ b/site/content/pages/about/legal.md
@@ -17,6 +17,7 @@ authors: Adam Harvey
diff --git a/site/content/pages/about/press.md b/site/content/pages/about/press.md
index 9f3874ce..d3ed008c 100644
--- a/site/content/pages/about/press.md
+++ b/site/content/pages/about/press.md
@@ -17,6 +17,7 @@ authors: Adam Harvey
--
cgit v1.2.3-70-g09d2
From 828ab34ca5e01e03e055ef9e091a88cd516a6061 Mon Sep 17 00:00:00 2001
From: adamhrv
+@inproceedings{ristani2016MTMC,
+ title = {Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking},
+ author = {Ristani, Ergys and Solera, Francesco and Zou, Roger and Cucchiara, Rita and Tomasi, Carlo},
+ booktitle = {European Conference on Computer Vision workshop on Benchmarking Multi-Target Tracking},
+ year = {2016}
+}
+
### Footnotes
[^sensetime_qz]: Legal
+
+
+
Caltech 10K Faces Dataset
+Duke MTMC
Who used Duke MTMC Dataset?
Supplementary Information
-Notes
-References
Funding
+Video Timestamps
+
+| Camera | Date | Start | End |
+|---|---|---|---|
+| Camera 1 | March 14, 2014 | 4:14PM | 5:43PM |
+| Camera 2 | March 14, 2014 | 4:13PM | 4:43PM |
+| Camera 3 | March 14, 2014 | 4:20PM | 5:48PM |
+| Camera 4 | March 14, 2014 | 4:21PM | 5:54PM |
+| Camera 5 | March 14, 2014 | 4:12PM | 5:43PM |
+| Camera 6 | March 14, 2014 | 4:18PM | 5:43PM |
+| Camera 7 | March 14, 2014 | 4:16PM | 5:40PM |
+| Camera 8 | March 14, 2014 | 4:25PM | 5:42PM |
+
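+The per-camera recording windows above can be summed into total footage duration with a short script. This is a sketch only: the times are transcribed from the table, and the parsing format is an assumption.
+
+```python
+from datetime import datetime
+
+# (camera, start, end) pairs transcribed from the Duke MTMC timestamp table above
+windows = [
+    ("Camera 1", "4:14PM", "5:43PM"),
+    ("Camera 2", "4:13PM", "4:43PM"),
+    ("Camera 3", "4:20PM", "5:48PM"),
+    ("Camera 4", "4:21PM", "5:54PM"),
+    ("Camera 5", "4:12PM", "5:43PM"),
+    ("Camera 6", "4:18PM", "5:43PM"),
+    ("Camera 7", "4:16PM", "5:40PM"),
+    ("Camera 8", "4:25PM", "5:42PM"),
+]
+
+def minutes(start: str, end: str) -> int:
+    # Parse 12-hour clock times and return the elapsed minutes
+    fmt = "%I:%M%p"
+    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
+    return int(delta.total_seconds() // 60)
+
+total = sum(minutes(s, e) for _, s, e in windows)
+print(total)  # total minutes of footage across all eight cameras
+```
+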
+Opting Out
+Notes
+
+
+References
FacE REcognition Dataset (FERET)
diff --git a/site/public/datasets/oxford_town_centre/index.html b/site/public/datasets/oxford_town_centre/index.html
index 63dc52d4..af020855 100644
--- a/site/public/datasets/oxford_town_centre/index.html
+++ b/site/public/datasets/oxford_town_centre/index.html
@@ -27,7 +27,7 @@
Demo Videos Using Oxford Town Centre Dataset
-
-
References
People in Photo Albums
+ Supplementary Information
+
+Dates and Times
UCCS photos taken in 2012
+
+| Date | Photos |
+|---|---|
+| Feb 23, 2012 | 132 |
+| March 6, 2012 | 288 |
+| March 8, 2012 | 506 |
+| March 13, 2012 | 160 |
+| March 20, 2012 | 1,840 |
+| March 22, 2012 | 445 |
+| April 3, 2012 | 1,639 |
+| April 12, 2012 | 14 |
+| April 17, 2012 | 19 |
+| April 24, 2012 | 63 |
+| April 25, 2012 | 11 |
+| April 26, 2012 | 20 |
+
+UCCS photos taken in 2013
+
+| Date | Photos |
+|---|---|
+| Jan 28, 2013 | 1,056 |
+| Jan 29, 2013 | 1,561 |
+| Feb 13, 2013 | 739 |
+| Feb 19, 2013 | 723 |
+| Feb 20, 2013 | 965 |
+| Feb 26, 2013 | 736 |
+
+Location
Funding
+
+
+Opting Out
+Ethics
+
+
+Downloads
+
+
+References
VGG Face 2
+import difflib; seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower()); score = seq.ratio()  # TODO
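+The one-liner above can be expanded into a small, self-contained sketch of how fuzzy title matching might work for citation verification. The helper name and the example titles are hypothetical, and the project's actual matching threshold is not stated here.
+
+```python
+import difflib
+
+def title_similarity(a: str, b: str) -> float:
+    # Compare lowercased titles with difflib's similarity ratio (0.0 to 1.0)
+    seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower())
+    return seq.ratio()
+
+# Hypothetical example: matching a cited title against a known dataset paper
+a = "Labeled Faces in the Wild: A Database for Studying Face Recognition"
+b = "labeled faces in the wild: a database for studying face recognition"
+score = title_similarity(a, b)
+assert score == 1.0  # identical after lowercasing
+```
+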
+
+
+
+
+
+What Computers Can See
+
+
+From PubFig Dataset
+
+
+From Market 1501
+
+| attribute | representation in file | label |
+|---|---|---|
+| gender | gender | male(1), female(2) |
+| hair length | hair | short hair(1), long hair(2) |
+| sleeve length | up | long sleeve(1), short sleeve(2) |
+| length of lower-body clothing | down | long lower body clothing(1), short(2) |
+| type of lower-body clothing | clothes | dress(1), pants(2) |
+| wearing hat | hat | no(1), yes(2) |
+| carrying backpack | backpack | no(1), yes(2) |
+| carrying bag | bag | no(1), yes(2) |
+| carrying handbag | handbag | no(1), yes(2) |
+| age | age | young(1), teenager(2), adult(3), old(4) |
+| 8 colors of upper-body clothing | upblack, upwhite, upred, uppurple, upyellow, upgray, upblue, upgreen | no(1), yes(2) |
+| 9 colors of lower-body clothing | downblack, downwhite, downpink, downpurple, downyellow, downgray, downblue, downgreen, downbrown | no(1), yes(2) |
+
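+The encoding above (attribute name, column in the annotation file, small-integer label) can be sketched as a simple decoder. The maps below are transcribed from the table; the function name and the example annotation row are hypothetical, not taken from the Market-1501 distribution.
+
+```python
+# Hypothetical decoder for the Market-1501 attribute encoding described above.
+# Each attribute is stored as a small integer; the maps mirror the table.
+LABELS = {
+    "gender": {1: "male", 2: "female"},
+    "hair": {1: "short hair", 2: "long hair"},
+    "up": {1: "long sleeve", 2: "short sleeve"},
+    "down": {1: "long lower body clothing", 2: "short"},
+    "clothes": {1: "dress", 2: "pants"},
+    "hat": {1: "no", 2: "yes"},
+    "backpack": {1: "no", 2: "yes"},
+    "bag": {1: "no", 2: "yes"},
+    "handbag": {1: "no", 2: "yes"},
+    "age": {1: "young", 2: "teenager", 3: "adult", 4: "old"},
+}
+
+def decode(attrs: dict) -> dict:
+    """Map raw integer annotations to readable labels, skipping unknown keys."""
+    return {k: LABELS[k][v] for k, v in attrs.items() if k in LABELS}
+
+# Hypothetical annotation row
+row = {"gender": 2, "hair": 2, "hat": 1, "age": 3}
+print(decode(row))  # {'gender': 'female', 'hair': 'long hair', 'hat': 'no', 'age': 'adult'}
+```
+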
+From DukeMTMC
+
+| attribute | representation in file | label |
+|---|---|---|
+| gender | gender | male(1), female(2) |
+| length of upper-body clothing | top | short upper body clothing(1), long(2) |
+| wearing boots | boots | no(1), yes(2) |
+| wearing hat | hat | no(1), yes(2) |
+| carrying backpack | backpack | no(1), yes(2) |
+| carrying bag | bag | no(1), yes(2) |
+| carrying handbag | handbag | no(1), yes(2) |
+| color of shoes | shoes | dark(1), light(2) |
+| 8 colors of upper-body clothing | upblack, upwhite, upred, uppurple, upgray, upblue, upgreen, upbrown | no(1), yes(2) |
+| 7 colors of lower-body clothing | downblack, downwhite, downred, downgray, downblue, downgreen, downbrown | no(1), yes(2) |
+
+From H3D Dataset
+From Leeds Sports Pose
+UnConstrained College Students
+UnConstrained College Students
-![]()
![]()
Who used UCCS?
Location
![]()
![]()
Funding
+![]()
Funding
References
References
50 People 1 Question
+50 People 1 Question
-Who used 50 People One Question Dataset?
diff --git a/site/public/datasets/afad/index.html b/site/public/datasets/afad/index.html
index df14e7cd..832ce86a 100644
--- a/site/public/datasets/afad/index.html
+++ b/site/public/datasets/afad/index.html
@@ -26,7 +26,8 @@
Asian Face Age Dataset
+Asian Face Age Dataset
-Who used Asian Face Age Dataset?
diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html
index 03331a2d..494856ec 100644
--- a/site/public/datasets/brainwash/index.html
+++ b/site/public/datasets/brainwash/index.html
@@ -27,7 +27,8 @@
Brainwash Dataset
+Brainwash Dataset
-Caltech 10K Faces Dataset
+Caltech 10K Faces Dataset
-Who used Brainwash Dataset?
diff --git a/site/public/datasets/celeba/index.html b/site/public/datasets/celeba/index.html
index c4caef20..e42ceb6f 100644
--- a/site/public/datasets/celeba/index.html
+++ b/site/public/datasets/celeba/index.html
@@ -27,7 +27,8 @@
CelebA Dataset
+CelebA Dataset
-Who used CelebA Dataset?
diff --git a/site/public/datasets/cofw/index.html b/site/public/datasets/cofw/index.html
index 4851e256..39e9680b 100644
--- a/site/public/datasets/cofw/index.html
+++ b/site/public/datasets/cofw/index.html
@@ -26,7 +26,8 @@
Caltech Occluded Faces in the Wild
+Caltech Occluded Faces in the Wild
-Who used COFW Dataset?
diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html
index ba32484a..78067101 100644
--- a/site/public/datasets/duke_mtmc/index.html
+++ b/site/public/datasets/duke_mtmc/index.html
@@ -27,7 +27,8 @@
Duke MTMC
+Duke MTMC
Who used Duke MTMC Dataset?
@@ -217,7 +218,11 @@ under Grants IIS-10-17017 and IIS-14-20894.
booktitle = {European Conference on Computer Vision workshop on Benchmarking Multi-Target Tracking},
year = {2016}
}
-References
ToDo
+
+
+References
FacE REcognition Dataset (FERET)
+FacE REcognition Dataset (FERET)
-(ignore) RESEARCH below this line
+ Who used LFW?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ (ignore) RESEARCH below this line
-Funding
+Funding
HRT Transgender Dataset
+HRT Transgender Dataset
-Labeled Face Parts in The Wild
+Labeled Face Parts in The Wild
-Who used LFWP?
Labeled Faces in the Wild
+Labeled Faces in the Wild
-Market-1501 Dataset
+Market-1501 Dataset
-Who used Market 1501?
diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html
index be21280c..b4d02c87 100644
--- a/site/public/datasets/msceleb/index.html
+++ b/site/public/datasets/msceleb/index.html
@@ -27,7 +27,8 @@
Microsoft Celeb Dataset (MS Celeb)
+Microsoft Celeb Dataset (MS Celeb)
-Oxford Town Centre
+Oxford Town Centre
-Who used TownCentre?
diff --git a/site/public/datasets/pipa/index.html b/site/public/datasets/pipa/index.html
index 780b3029..d02540f0 100644
--- a/site/public/datasets/pipa/index.html
+++ b/site/public/datasets/pipa/index.html
@@ -27,7 +27,8 @@
People in Photo Albums
+People in Photo Albums
-Who used PIPA Dataset?
diff --git a/site/public/datasets/pubfig/index.html b/site/public/datasets/pubfig/index.html
index 2c8bd7b1..ed593054 100644
--- a/site/public/datasets/pubfig/index.html
+++ b/site/public/datasets/pubfig/index.html
@@ -27,7 +27,8 @@
PubFig
+PubFig
-Who used PubFig?
diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html
index 1d76de3a..27d30716 100644
--- a/site/public/datasets/uccs/index.html
+++ b/site/public/datasets/uccs/index.html
@@ -28,7 +28,7 @@
UnConstrained College Students
-![]()
VGG Face 2
+VGG Face 2
-Who used Brainwash Dataset?
diff --git a/site/public/datasets/viper/index.html b/site/public/datasets/viper/index.html
index 5b3ac35b..494c249b 100644
--- a/site/public/datasets/viper/index.html
+++ b/site/public/datasets/viper/index.html
@@ -27,7 +27,8 @@
VIPeR Dataset
+VIPeR Dataset
-YouTube Celebrities
-YouTube Celebrities
+Who used YouTube Celebrities?
diff --git a/todo.md b/todo.md
index cc4736cd..4586611e 100644
--- a/todo.md
+++ b/todo.md
@@ -16,7 +16,6 @@
## Datasets
- JL: this paper isn't appearing in the UCCS list of verified papers but should be included https://arxiv.org/pdf/1708.02337.pdf
-- JL: add h2 dataset title above the right-sidebar so title extends full width
- AH: add dataset analysis for MS Celeb, IJB-C
- AH: fix dataset analysis for UCCS, brainwash graphics
- AH: add license information to each dataset page
--
cgit v1.2.3-70-g09d2
From a13e9d0471bc6f78692cc212541a9a5c659b4ef1 Mon Sep 17 00:00:00 2001
From: adamhrv Duke MTMC
Who used Duke MTMC Dataset?
@@ -217,7 +218,11 @@ under Grants IIS-10-17017 and IIS-14-20894.
booktitle = {European Conference on Computer Vision workshop on Benchmarking Multi-Target Tracking},
year = {2016}
}
-References
ToDo
+
+
+References
UnConstrained College Students
Who used UCCS?
@@ -116,9 +117,8 @@
Supplementary Information
-Dates and Times
-![]()
![]()
UCCS photos taken in 2012
+![]()
UCCS photos taken in 2012
Date
@@ -212,7 +212,7 @@
Location
![]()
![]()
Funding
+![]()
Funding
Ethics
Downloads
@@ -250,8 +250,12 @@
}
-
References
References
50 People 1 Question
-Who used 50 People One Question Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ (ignore) research notes
+ Who used Asian Face Age Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ (ignore) research notes
Brainwash Dataset
Who used Brainwash Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Supplementary Information
+
References
Caltech 10K Faces Dataset
-(ignore) research notes
+ Who used Brainwash Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ (ignore) research notes
CelebA Dataset
-Research
+ Who used CelebA Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Research
Caltech Occluded Faces in the Wild
-(ignore) research notes
-Who used COFW Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ (ignore) research notes
+
@@ -54,11 +104,58 @@ To increase the number of training images, and since COFW has the exact same la
Biometric Trade Routes
+
+
+
+ Supplementary Information
+
+Dataset Citations
+ Who used COFW Dataset?
+
+ - replace graphic
Duke MTMC
Funding
Who used Duke MTMC Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Supplementary Information
+
+Funding
Video Timestamps
+
-
Camera
Date
@@ -95,8 +152,7 @@ under Grants IIS-10-17017 and IIS-14-20894.
+
-
Camera
Date
@@ -131,15 +187,30 @@ under Grants IIS-10-17017 and IIS-14-20894.
Opting Out
+Opting Out
Notes
@inproceedings{ristani2016MTMC,
title = {Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking},
diff --git a/site/public/datasets/feret/index.html b/site/public/datasets/feret/index.html
index 8af139ab..387826b0 100644
--- a/site/public/datasets/feret/index.html
+++ b/site/public/datasets/feret/index.html
@@ -42,9 +42,59 @@
(ignore) RESEARCH below this line
+ Who used LFW?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ (ignore) RESEARCH below this line
-Funding
+Funding
HRT Transgender Dataset
-Labeled Face Parts in The Wild
-Who used LFWP?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+
diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html
index 54b10611..7997629f 100644
--- a/site/public/datasets/lfw/index.html
+++ b/site/public/datasets/lfw/index.html
@@ -28,7 +28,7 @@
Labeled Faces in the Wild
Commercial Use
+Who used LFW?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Supplementary Information
+
+Commercial Use
-load_file assets/lfw_commercial_use.csv
-name_display, company_url, example_url, country, description
-Research
+Research
-
+
(ignore) research Notes
+ Who used Market 1501?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ (ignore) research Notes
Microsoft Celeb Dataset (MS Celeb)
-Additional Information
+Who used Microsoft Celeb?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Supplementary Information
+
+Additional Information
diff --git a/site/public/datasets/oxford_town_centre/index.html b/site/public/datasets/oxford_town_centre/index.html
index d6f7378f..b48efe3e 100644
--- a/site/public/datasets/oxford_town_centre/index.html
+++ b/site/public/datasets/oxford_town_centre/index.html
@@ -28,7 +28,7 @@
Oxford Town Centre
-Location
+Who used TownCentre?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Supplementary Information
+
+Location
Cite Our Work
+
+@online{megapixels,
+ author = {Harvey, Adam and LaPlace, Jules},
+ title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
+ year = 2019,
+ url = {https://megapixels.cc/},
+ urldate = {2019-04-20}
+}
+
+
References
People in Photo Albums
-Who used PIPA Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ PubFig
-Who used PubFig?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ UnConstrained College Students
UCCS photos taken in 2012
+Who used UCCS?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Supplementary Information
+
UCCS photos taken in 2012
-
Date
@@ -120,8 +177,7 @@ Their setup made it impossible for students to know they were being photographed
UCCS photos taken in 2013
+UCCS photos taken in 2013
-
Date
@@ -155,10 +211,9 @@ Their setup made it impossible for students to know they were being photographed
Location
+Location
Funding
Funding
-Cite Our Work
+
+@online{megapixels,
+ author = {Harvey, Adam and LaPlace, Jules},
+ title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
+ year = 2019,
+ url = {https://megapixels.cc/},
+ urldate = {2019-04-20}
+}
+
+
References
(ignore) research notes
+ Who used Brainwash Dataset?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ (ignore) research notes
-TODO
+TODO
VIPeR Dataset
-Who used VIPeR?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ YouTube Celebrities
-Notes...
+Who used YouTube Celebrities?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Notes...
@@ -94,19 +116,16 @@ If you use any data from the Duke MTMC please follow their [license](http://visi
}
+{% include 'cite_our_work.html' %}
+
+
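The `{% include 'cite_our_work.html' %}` line above replaces the BibTeX block that was previously duplicated across every dataset page with a single shared partial. As a rough sketch of the mechanism (not the site's actual template engine, which presumably handles this natively), an include directive can be resolved by substituting the named partial into the page:

```python
import re

def render_includes(template: str, partials: dict) -> str:
    """Resolve {% include 'name' %} directives against a dict of partials.
    A simplified stand-in for a template engine's include mechanism."""
    pattern = re.compile(r"{%\s*include\s+'([^']+)'\s*%}")
    return pattern.sub(lambda m: partials.get(m.group(1), ""), template)

# Hypothetical usage: the partial name matches the include in the diff above.
partials = {"cite_our_work.html": "@online{megapixels, ...}"}
page = "Cite Our Work\n{% include 'cite_our_work.html' %}"
print(render_includes(page, partials))
```

Centralizing the citation this way means future corrections (such as the author-field fix) need to be made in only one file.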
#### ToDo
- clean up citations, formatting
### Footnotes
+[^xinjiang_nyt]: Mozur, Paul. "One Month, 500,000 Face Scans: How China Is Using A.I. to Profile a Minority". https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html. April 14, 2019.
[^sensetime_qz]:
+
+
+