From 2813b772c8a088307f7a1ab9df167875d320162d Mon Sep 17 00:00:00 2001
From: adamhrv
and include this license and attribution protocol within any derivative work. If you publish data derived from MegaPixels, the original dataset creators should first be notified.
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please cite our work.
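The geocoding step described in the paragraph above can be approximated with off-the-shelf tools. Below is a minimal sketch, assuming institution names have already been extracted from each paper's front matter; geopy's Nominatim geocoder is an illustrative choice, not a tool confirmed by the text:

# Sketch: geocode institution names extracted from paper front matter.
# geopy/Nominatim is an illustrative choice, not necessarily what the
# MegaPixels authors used; institution names are assumed parsed upstream.
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="dataset-citation-mapper")

institutions = [
    "National University of Defense Technology",
    "Peking University",
    "Duke University",
]

for name in institutions:
    location = geolocator.geocode(name)
    if location is not None:
        print(f"{name}: ({location.latitude:.4f}, {location.longitude:.4f})")
    else:
        print(f"{name}: not found; fall back to a manual lookup")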
Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,918 images of "everyday life of a busy downtown cafe" 1 captured at 100-second intervals throughout the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According to the authors' research paper introducing the dataset, the images were acquired with the help of Angelcam.com 2 The Brainwash dataset is unique because it uses images from a publicly available webcam that records people inside a privately owned business without any consent. No ordinary cafe customer could ever suspect their image would end up in a dataset used for surveillance research and development, but that is exactly what happened to customers at Brainwash cafe in San Francisco. Although Brainwash appears to be a less popular dataset, in 2016 and 2017 researchers from the National University of Defense Technology in China took note of the dataset and used it for two research projects on advancing the capabilities of object detection to more accurately isolate the target region in an image (PDF). 3 4 The dataset also appears in a 2017 research paper from Peking University for the purpose of improving surveillance capabilities for "people detection in the crowded scenes".
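For readers examining the dataset files themselves, the Stanford release cited below distributes head annotations in plain-text .idl files. The line format sketched here ("path": (x1, y1, x2, y2), ...;) is our assumption based on common usage of that annotation format, so verify it against the dataset's readme before relying on it:

# Sketch: parse a Brainwash-style .idl annotation line into (path, boxes).
# The assumed format is "image path": (x1, y1, x2, y2), (x1, y1, x2, y2);
import re

def parse_idl_line(line: str):
    path_match = re.match(r'\s*"([^"]+)"\s*:?', line)
    if path_match is None:
        return None
    path = path_match.group(1)
    # Collect every 4-tuple of box coordinates found on the line.
    boxes = [
        tuple(float(v) for v in m)
        for m in re.findall(r"\(([\d.]+),\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)\)", line)
    ]
    return path, boxes

print(parse_idl_line('"brainwash_11_13_2014/00001000_640x480.png": (11.0, 200.0, 58.0, 252.0), (72.0, 190.0, 110.0, 240.0);'))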
@@ -99,7 +99,7 @@
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please cite our work.
diff --git a/site/public/datasets/caltech_10k/index.html b/site/public/datasets/caltech_10k/index.html
index abb55148..e86c5ca3 100644
--- a/site/public/datasets/caltech_10k/index.html
+++ b/site/public/datasets/caltech_10k/index.html
@@ -96,7 +96,7 @@
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please cite our work.
https://www.hrw.org/news/2019/01/15/letter-microsoft-face-surveillance-technology
The Microsoft Celeb dataset is a face recognition training dataset made entirely of images scraped from the Internet. According to Microsoft Research, who created and published the dataset in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of 100,000 individuals. But Microsoft's ambition was bigger: they wanted to recognize 1 million individuals, and as part of the dataset they released a list of exactly 1 million target identities for researchers to identify. In 2019, Microsoft President Brad Smith called for the governmental regulation of face recognition, an admission of his own company's inability to control its surveillance-driven business model. Yet since then, Microsoft has willingly and actively played a significant role in accelerating growth in the very same industry it asked the government to regulate. This investigation looks into the MS Celeb dataset and Microsoft Research's role in creating and distributing the largest publicly available face recognition dataset in the world.
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please cite our work.
"readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385. Li, Y. and Dou, Y. and Liu, X. and Li, T. Localized Region Context and Object Feature Fusion for People Head Detection. ICIP16 Proceedings. 2016. Pages 594-598. Zhao. X, Wang Y, Dou, Y. A Replacement Algorithm of Non-Maximum Suppression Base on Graph Clustering. The Oxford Town Centre dataset is a CCTV video of pedestrians in a busy downtown area in Oxford used for research and development of activity and face recognition systems. 1 The CCTV video was obtained from a public surveillance camera at the corner of Cornmarket and Market St. in Oxford, England and includes approximately 2,200 people. Since its publication in 2009 2 the Oxford Town Centre dataset has been used in over 80 verified research projects including commercial research by Amazon, Disney, OSRAM, and Huawei; and academic research in China, Israel, Russia, Singapore, the US, and Germany among dozens more. The Oxford Town Centre dataset is a CCTV video of pedestrians in a busy downtown area in Oxford used for research and development of activity and face recognition systems. 1 The CCTV video was obtained from a surveillance camera at the corner of Cornmarket and Market St. in Oxford, England and includes approximately 2,200 people. Since its publication in 2009 2 the Oxford Town Centre dataset has been used in over 80 verified research projects including commercial research by Amazon, Disney, OSRAM, and Huawei; and academic research in China, Israel, Russia, Singapore, the US, and Germany among dozens more. The Oxford Town Centre dataset is unique in that it uses footage from a public surveillance camera that would otherwise be designated for public safety. The video shows that the pedestrians act normally and unrehearsed indicating they neither knew of or consented to participation in the research project.
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please cite our work.
The street location of the camera used for the Oxford Town Centre dataset was confirmed by matching the road, benches, and store signs to the source footage. At that location, two public CCTV cameras are mounted on the side of the Northgate House building at 13-20 Cornmarket St. Because of the lower camera's mounting pole directionality, a view from a private camera in the building across the street can be ruled out: it would have to show more of the silhouette of the lower camera's mounting pole. Two options remain: either the public CCTV camera mounted to the side of the building was used, or the researchers mounted their own camera to the side of the building in the same location. Because the researchers used many other existing public CCTV cameras for their research projects, it is likely that they would also be able to access this camera. To discredit the theory that this public CCTV camera only ever points the other way, as seen in Google Street View images, at least one public photo shows the upper CCTV camera pointing in the same direction as the Oxford Town Centre dataset, proving the camera can be and has been rotated before. As for the capture date, the text on the storefront display shows a sale happening from December 2nd – 7th, indicating the capture date was between or just before those dates. The capture year is either 2008 or 2007, since prior to 2007 the Carphone Warehouse (photo, history) did not exist at this location. Since the sweaters in the GAP window display are more similar to those in a GAP website snapshot from November 2007, our guess is that the footage was obtained during late November or early December 2007. The lack of street vendors and slight waste residue near the bench suggests that it was probably a weekday after rubbish removal.
- The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms.
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please cite our work.
Exploring Disentangled Feature Representation Beyond Face Identification. From https://arxiv.org/pdf/1804.03487.pdf
+The attribute IDs from 1 to 40 correspond to: ‘5 o Clock Shadow’, ‘Arched Eyebrows’, ‘Attractive’, ‘Bags Under Eyes’, ‘Bald’, ‘Bangs’, ‘Big Lips’, ‘Big Nose’, ‘Black Hair’, ‘Blond Hair’, ‘Blurry’, ‘Brown Hair’, ‘Bushy Eyebrows’, ‘Chubby’, ‘Double Chin’, ‘Eyeglasses’, ‘Goatee’, ‘Gray Hair’, ‘Heavy Makeup’, ‘High Cheekbones’, ‘Male’, ‘Mouth Slightly Open’, ‘Mustache’, ‘Narrow Eyes’, ‘No Beard’, ‘Oval Face’, ‘Pale Skin’, ‘Pointy Nose’, ‘Receding Hairline’, ‘Rosy Cheeks’, ‘Sideburns’, ‘Smiling’, ‘Straight Hair’, ‘Wavy Hair’, ‘Wearing Earrings’, ‘Wearing Hat’, ‘Wearing Lipstick’, ‘Wearing Necklace’, ‘Wearing Necktie’ and ‘Young’.
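These forty labels follow the CelebA annotation scheme used by the cited paper. A hedged sketch of reading such annotations; the list_attr_celeba.txt filename and layout are the commonly distributed CelebA format, not something confirmed by the excerpt above:

# Sketch: read CelebA-style attribute annotations (list_attr_celeba.txt).
# Assumed layout, per the commonly distributed release: line 1 is the
# image count, line 2 the 40 attribute names, then one row per image
# with the filename followed by 40 values of 1 (present) or -1 (absent).
def load_celeba_attributes(path):
    with open(path) as f:
        _count = int(f.readline())
        names = f.readline().split()          # 40 attribute names
        rows = {}
        for line in f:
            parts = line.split()
            filename, values = parts[0], [int(v) for v in parts[1:]]
            # Map each image to the set of attributes marked present (+1).
            rows[filename] = {n for n, v in zip(names, values) if v == 1}
    return names, rows

# names, rows = load_celeba_attributes("list_attr_celeba.txt")
# print(rows["000001.jpg"])  # e.g. {'Smiling', 'Young', ...}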
+ This bar chart presents a ranking of the top countries where dataset citations originated. Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
+
+ To help understand how IJB-C has been used around the world by commercial, military, and academic organizations, existing publicly available research citing IARPA Janus Benchmark C was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+
+ The dataset citations used in the visualizations were collected from Semantic Scholar, a website which aggregates and indexes research papers. Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources. These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please cite our work.
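The per-country ranking shown in the bar chart can be derived from these geocoded citation records. A minimal sketch; the citations.csv file and its country column are hypothetical stand-ins for the project's actual data files:

# Sketch: rank the top 10 countries by verified citation count, as in
# the bar chart. "citations.csv" (columns: country, year) is a
# hypothetical stand-in for the project's real geocoded data.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("citations.csv")
top10 = df["country"].value_counts().head(10)

top10.plot(kind="bar")
plt.ylabel("Verified citations")
plt.title("Top countries citing the dataset")
plt.tight_layout()
plt.show()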
+
+
+ If you use our data, research, or graphics, please cite our work:
+
+ "readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385.
+ Stewart, Russell. Andriluka, Mykhaylo. "End-to-end people detection in crowded scenes". 2016.
+ Li, Y. and Dou, Y. and Liu, X. and Li, T. "Localized Region Context and Object Feature Fusion for People Head Detection". ICIP16 Proceedings. 2016. Pages 594-598.
+ Zhao, X. and Wang, Y. and Dou, Y. "A Replacement Algorithm of Non-Maximum Suppression Base on Graph Clustering".
Dataset Citations
Who used Brainwash Dataset?
Dataset Citations
Supplementary Information
Cite Our Work
Dataset Citations
Who used Microsoft Celeb?
@@ -98,7 +101,7 @@
Dataset Citations
Additional Information
-
-References
References
Who used TownCentre?
@@ -98,7 +98,7 @@
Dataset Citations
Supplementary Information
Location
Dataset Citations
From SenseTime paper
+From PubFig Dataset
Brainwash Dataset
Who used IJB-C?
+
+ Biometric Trade Routes
+
+
+
+ Dataset Citations
+ Supplementary Information
+
Cite Our Work
+
+@online{megapixels,
+ author = {Harvey, Adam and LaPlace, Jules},
+ title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
+ year = 2019,
+ url = {https://megapixels.cc/},
+ urldate = {2019-04-18}
+}
+
+
+References
"readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385.
-Stewart, Russel. Andriluka, Mykhaylo. "End-to-end people detection in crowded scenes". 2016.
-Li, Y. and Dou, Y. and Liu, X. and Li, T. Localized Region Context and Object Feature Fusion for People Head Detection. ICIP16 Proceedings. 2016. Pages 594-598.
-Zhao. X, Wang Y, Dou, Y. A Replacement Algorithm of Non-Maximum Suppression Base on Graph Clustering.
+Duke MTMC (Multi-Target, Multi-Camera) is a dataset of surveillance video footage taken on Duke University's campus in 2014 and is used for research and development of video tracking systems, person re-identification, and low-resolution facial recognition. The dataset contains over 14 hours of synchronized surveillance video from 8 cameras at 1080p and 60FPS with over 2 million frames of 2,000 students walking to and from classes. The 8 surveillance cameras deployed on campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy" 1.
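As a quick sanity check on those figures (our arithmetic, not the authors'), 14 hours of 60 FPS video already implies roughly three million frames, consistent with the "over 2 million frames" stated above:

# Sanity check of the stated Duke MTMC figures: 14 hours at 60 FPS.
hours, fps = 14, 60
frames = hours * 3600 * fps
print(f"{frames:,}")  # 3,024,000 -- consistent with "over 2 million frames"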
In this investigation into the Duke MTMC dataset we tracked down over 100 publicly available research papers that explicitly acknowledged using Duke MTMC. Our analysis shows that the dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers with explicit and direct links to the Chinese military and several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.
-<<<<<<< HEAD
-In one 2018 paper jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled Attention-Aware Compositional Network for Person Re-identification, the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in "finding missing elderly and children, and suspect tracking, etc." Both SenseNets and SenseTime have been directly linked to the providing surveillance technology to monitor Uighur Muslims in China. 2 3 4
-=======
-In one 2018 paper jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled Attention-Aware Compositional Network for Person Re-identification, the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in "finding missing elderly and children, and suspect tracking, etc." Both SenseNets and SenseTime have been directly linked to the providing surveillance technology to monitor Uighur Muslims in China. 1 2 3
->>>>>>> 61fbcb8f2709236f36a103a73e0bd9d1dd3723e8
+In one 2018 paper jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled Attention-Aware Compositional Network for Person Re-identification, the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in "finding missing elderly and children, and suspect tracking, etc." Both SenseNets and SenseTime have been directly linked to providing surveillance technology to monitor Uighur Muslims in China. 4 2 3
Despite repeated warnings by Human Rights Watch that the authoritarian surveillance used in China represents a violation of human rights, researchers at Duke University continued to provide open access to their dataset for anyone to use for any project. As the surveillance crisis in China grew, so did the number of citations with links to organizations complicit in the crisis. In 2018 alone there were over 70 research projects happening in China that publicly acknowledged benefiting from the Duke MTMC dataset. Amongst these were projects from SenseNets, SenseTime, CloudWalk, Megvii, Beihang University, and the PLA's National University of Defense Technology.
By some metrics the dataset is considered a huge success. It is regarded as highly influential research and has contributed to hundreds, if not thousands, of projects to advance artificial intelligence for person tracking and monitoring. All the above citations, regardless of which country is using it, align perfectly with the original intent of the Duke MTMC dataset: "to accelerate advances in multi-target multi-camera tracking".
-<<<<<<< HEAD
-The same logic applies for all the new extensions of the Duke MTMC dataset including Duke MTMC Re-ID, Duke MTMC Video Re-ID, Duke MTMC Groups, and Duke MTMC Attribute. And it also applies to all the new specialized datasets that will be created from Duke MTMC, such as the low-resolution face recognition dataset called QMUL-SurvFace, which was funded in part by SeeQuestor, a computer vision provider to law enforcement agencies including Scotland Yards and Queensland Police. From the perspective of academic researchers, security contractors, and defense agencies using these datasets to advance their organization's work, Duke MTMC provides significant value regardless of who else is using it, so long as it advances their own interests in artificial intelligence.
But this perspective comes at significant cost to civil rights, human rights, and privacy. The creation and distribution of the Duke MTMC illustrates an egregious prioritization of surveillance technologies over individual rights, where the simple act of going to class could implicate your biometric data in a surveillance training dataset, perhaps even used by foreign defense agencies against your own ethics, against your own political interests, or against universal human rights.
-For the approximately 2,000 students in Duke MTMC dataset, there is unfortunately no escape. It would be impossible to remove oneself from all copies of the dataset downloaded around the world. Instead, over 2,000 students and visitors who happened to be walking to class on March 13, 2014 will forever remain in all downloaded copies of the Duke MTMC dataset and all its extensions, contributing to a global supply chain of data that powers governmental and commercial expansion of biometric surveillance technologies.
-=======
-The same logic applies for all the new extensions of the Duke MTMC dataset including Duke MTMC Re-ID, Duke MTMC Video Re-ID, Duke MTMC Groups, and Duke MTMC Attribute. And it also applies to all the new specialized datasets that will be created from Duke MTMC, such as the low-resolution face recognition dataset called QMUL-SurvFace, which was funded in part by SeeQuestor, a computer vision provider to law enforcement agencies including Scotland Yards and Queensland Police. From the perspective of academic researchers, security contractors, and defense agencies using these datasets to advance their organization's work, Duke MTMC provides significant value regardless of who else is using it so long as it accelerate advances their own interests in artificial intelligence.
-But this perspective comes at significant cost to civil rights, human rights, and privacy. The creation and distribution of the Duke MTMC illustrates an egregious prioritization of surveillance technologies over individual rights, where the simple act of going to class could implicate your biometric data in a surveillance training dataset, perhaps even used by foreign defense agencies against your own ethics, against universal human rights, or against your own political interests.
For the approximately 2,000 students in Duke MTMC dataset there is unfortunately no escape. It would be impossible to remove oneself from all copies of the dataset downloaded around the world. Instead, over 2,000 students and visitors who happened to be walking to class in 2014 will forever remain in all downloaded copies of the Duke MTMC dataset and all its extensions, contributing to a global supply chain of data that powers governmental and commercial expansion of biometric surveillance technologies.
->>>>>>> 61fbcb8f2709236f36a103a73e0bd9d1dd3723e8
-<<<<<<< HEAD
-The video timestamps contain the likely, but not yet confirmed, date and times of capture. Because the video timestamps align with the start and stop time sync data provided by the researchers, it at least aligns the relative time. The rainy weather on that day also contributes towards the likelihood of March 14, 2014.
-=======
-The video timestamps contain the likely, but not yet confirmed, date and times the video recorded. Because the video timestamps align with the start and stop time sync data provided by the researchers, it at least confirms the relative timing. The precipitous weather on March 14, 2014 in Durham, North Carolina supports, but does not confirm, that this day is a potential capture date.
->>>>>>> 61fbcb8f2709236f36a103a73e0bd9d1dd3723e8
| Camera |
@@ -345,28 +331,11 @@
|---|
If you use any data from the Duke MTMC, please follow their license and cite their work as:
-
-@inproceedings{ristani2016MTMC,
- title = {Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking},
- author = {Ristani, Ergys and Solera, Francesco and Zou, Roger and Cucchiara, Rita and Tomasi, Carlo},
- booktitle = {European Conference on Computer Vision workshop on Benchmarking Multi-Target Tracking},
- year = {2016}
-}
-The original Duke MTMC dataset paper mentions 2,700 identities, but their ground truth file only lists annotations for 1,812, and their own research typically mentions 2,000. For this write up we used 2,000 to describe the approximate number of students.
+The original Duke MTMC dataset paper mentions 2,700 identities, but their ground truth file only lists annotations for 1,812, and their own research typically mentions 2,000. For this writeup we used 2,000 to describe the approximate number of students.
Please direct any questions about the ethics of the dataset to Duke University's Institutional Ethics & Compliance Office using the number at the bottom of the page.
@@ -383,17 +352,8 @@ }
-<<<<<<< HEAD
-If you use any data from the Duke MTMC please follow their license and cite their work as:
+If you use any data from the Duke MTMC, please follow their license and cite their work as:
@inproceedings{ristani2016MTMC,
title = {Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking},
@@ -401,26 +361,25 @@
booktitle = {European Conference on Computer Vision workshop on Benchmarking Multi-Target Tracking},
year = {2016}
}
-Mozur, Paul. "One Month, 500,000 Face Scans: How China Is Using A.I. to Profile a Minority". https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html. April 14, 2019.
-https://foreignpolicy.com/2019/03/19/962492-orwell-china-socialcredit-surveillance/
-"Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking". 2016. SemanticScholar
->>>>>>> 61fbcb8f2709236f36a103a73e0bd9d1dd3723e8
diff --git a/site/public/datasets/ijb_c/index.html b/site/public/datasets/ijb_c/index.html
new file mode 100644
index 00000000..b6a16bfe
--- /dev/null
+++ b/site/public/datasets/ijb_c/index.html
@@ -0,0 +1,151 @@