author Jules Laplace <julescarbon@gmail.com> 2019-04-17 22:08:59 +0200
committer Jules Laplace <julescarbon@gmail.com> 2019-04-17 22:08:59 +0200
commit fba670e97b1baee6739aacf55325ce8dfd835be5 (patch)
tree c5a88f9964c8cc87a22331128580750c5f874a7b /site/content/pages/datasets
parent 699d7a77b9d4120dfb75f271cb924b0e05a2fcaa (diff)
parent 61fbcb8f2709236f36a103a73e0bd9d1dd3723e8 (diff)
merge
Diffstat (limited to 'site/content/pages/datasets')
-rw-r--r-- site/content/pages/datasets/brainwash/assets/00425000_640x480.jpg bin 33299 -> 0 bytes
-rw-r--r-- site/content/pages/datasets/brainwash/assets/00425000_960.jpg bin 47240 -> 0 bytes
-rwxr-xr-x site/content/pages/datasets/brainwash/assets/brainwash_example.jpg bin 0 -> 94735 bytes
-rwxr-xr-x site/content/pages/datasets/brainwash/assets/brainwash_grid.jpg bin 0 -> 354770 bytes
-rwxr-xr-x site/content/pages/datasets/brainwash/assets/brainwash_mean_overlay.jpg bin 150399 -> 0 bytes
-rwxr-xr-x site/content/pages/datasets/brainwash/assets/brainwash_mean_overlay_wm.jpg bin 151713 -> 0 bytes
-rw-r--r-- site/content/pages/datasets/brainwash/assets/brainwash_montage.jpg bin 235410 -> 0 bytes
-rw-r--r-- site/content/pages/datasets/brainwash/index.md 24
-rw-r--r-- site/content/pages/datasets/duke_mtmc/index.md 42
-rw-r--r-- site/content/pages/datasets/index.md 2
-rw-r--r-- site/content/pages/datasets/msceleb/index.md 1
-rw-r--r-- site/content/pages/datasets/uccs/assets/uccs_grid.jpg bin 142280 -> 112588 bytes
-rw-r--r-- site/content/pages/datasets/uccs/index.md 28
13 files changed, 42 insertions, 55 deletions
diff --git a/site/content/pages/datasets/brainwash/assets/00425000_640x480.jpg b/site/content/pages/datasets/brainwash/assets/00425000_640x480.jpg
deleted file mode 100644
index de62175a..00000000
--- a/site/content/pages/datasets/brainwash/assets/00425000_640x480.jpg
+++ /dev/null
Binary files differ
diff --git a/site/content/pages/datasets/brainwash/assets/00425000_960.jpg b/site/content/pages/datasets/brainwash/assets/00425000_960.jpg
deleted file mode 100644
index caa96fe2..00000000
--- a/site/content/pages/datasets/brainwash/assets/00425000_960.jpg
+++ /dev/null
Binary files differ
diff --git a/site/content/pages/datasets/brainwash/assets/brainwash_example.jpg b/site/content/pages/datasets/brainwash/assets/brainwash_example.jpg
new file mode 100755
index 00000000..3ddf4323
--- /dev/null
+++ b/site/content/pages/datasets/brainwash/assets/brainwash_example.jpg
Binary files differ
diff --git a/site/content/pages/datasets/brainwash/assets/brainwash_grid.jpg b/site/content/pages/datasets/brainwash/assets/brainwash_grid.jpg
new file mode 100755
index 00000000..0271ec6a
--- /dev/null
+++ b/site/content/pages/datasets/brainwash/assets/brainwash_grid.jpg
Binary files differ
diff --git a/site/content/pages/datasets/brainwash/assets/brainwash_mean_overlay.jpg b/site/content/pages/datasets/brainwash/assets/brainwash_mean_overlay.jpg
deleted file mode 100755
index 2f5917e3..00000000
--- a/site/content/pages/datasets/brainwash/assets/brainwash_mean_overlay.jpg
+++ /dev/null
Binary files differ
diff --git a/site/content/pages/datasets/brainwash/assets/brainwash_mean_overlay_wm.jpg b/site/content/pages/datasets/brainwash/assets/brainwash_mean_overlay_wm.jpg
deleted file mode 100755
index 790dbb79..00000000
--- a/site/content/pages/datasets/brainwash/assets/brainwash_mean_overlay_wm.jpg
+++ /dev/null
Binary files differ
diff --git a/site/content/pages/datasets/brainwash/assets/brainwash_montage.jpg b/site/content/pages/datasets/brainwash/assets/brainwash_montage.jpg
deleted file mode 100644
index 193fdd03..00000000
--- a/site/content/pages/datasets/brainwash/assets/brainwash_montage.jpg
+++ /dev/null
Binary files differ
diff --git a/site/content/pages/datasets/brainwash/index.md b/site/content/pages/datasets/brainwash/index.md
index 156b02c7..b57bcdf4 100644
--- a/site/content/pages/datasets/brainwash/index.md
+++ b/site/content/pages/datasets/brainwash/index.md
@@ -19,30 +19,24 @@ authors: Adam Harvey
### sidebar
### end sidebar
-*Brainwash* is a head detection dataset created from San Francisco's Brainwash Cafe livecam footage. It includes 11,918 images of "everyday life of a busy downtown cafe"[^readme] captured at 100 second intervals throught the entire day. Brainwash dataset was captured during 3 days in 2014: October 27, November 13, and November 24. According the author's reserach paper introducing the dataset, the images were acquired with the help of Angelcam.com.[^end_to_end]
+Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,918 images of "everyday life of a busy downtown cafe"[^readme] captured at 100-second intervals throughout the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According to the authors' [research paper](https://www.semanticscholar.org/paper/End-to-End-People-Detection-in-Crowded-Scenes-Stewart-Andriluka/1bd1645a629f1b612960ab9bba276afd4cf7c666) introducing the dataset, the images were acquired with the help of Angelcam.com.[^end_to_end]
-Brainwash is not a widely used dataset but since its publication by Stanford University in 2015, it has notably appeared in several research papers from the National University of Defense Technology in Changsha, China. In 2016 and in 2017 researchers there conducted studies on detecting people's heads in crowded scenes for the purpose of surveillance. [^localized_region_context] [^replacement_algorithm]
+The Brainwash dataset is unique because it uses images from a publicly available webcam that records people inside a privately owned business without any consent. No ordinary cafe customer could ever suspect that their image would end up in a dataset used for surveillance research and development, but that is exactly what happened to customers at Brainwash Cafe in San Francisco.
-If you happen to have been at Brainwash cafe in San Francisco at any time on October 26, November 13, or November 24 in 2014 you are most likely included in the Brainwash dataset and have unwittingly contributed to surveillance research.
+Although Brainwash appears to be a less popular dataset, researchers from the National University of Defense Technology in China used it in 2016 and 2017 for two [research](https://www.semanticscholar.org/paper/Localized-region-context-and-object-feature-fusion-Li-Dou/b02d31c640b0a31fb18c4f170d841d8e21ffb66c) [projects](https://www.semanticscholar.org/paper/A-Replacement-Algorithm-of-Non-Maximum-Suppression-Zhao-Wang/591a4bfa6380c9fcd5f3ae690e3ac5c09b7bf37b) on advancing the capabilities of object detection to more accurately isolate the target region in an image ([PDF](https://www.itm-conferences.org/articles/itmconf/pdf/2017/04/itmconf_ita2017_05006.pdf)). [^localized_region_context] [^replacement_algorithm] The dataset also appears in a 2017 [research paper](https://ieeexplore.ieee.org/document/7877809) from Peking University for the purpose of improving surveillance capabilities for "people detection in the crowded scenes".
-{% include 'dashboard.html' %}
-{% include 'supplementary_header.html' %}
+![caption: A visualization of 81,973 head annotations from the Brainwash dataset training partition. &copy; megapixels.cc](assets/brainwash_grid.jpg)
-![caption: A visualization of 81,973 head annotations from the Brainwash dataset training partition. &copy; megapixels.cc](assets/brainwash_saliency_map.jpg)
+{% include 'dashboard.html' %}
-![caption: An sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The datset contains about 12,000 images. License: Open Data Commons Public Domain Dedication (PDDL)](assets/00425000_960.jpg)
+{% include 'supplementary_header.html' %}
-![caption: 49 of the 11,918 images included in the Brainwash dataset. License: Open Data Commons Public Domain Dedication (PDDL)](assets/brainwash_montage.jpg)
+![caption: A sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains about 12,000 images. License: Open Data Commons Public Domain Dedication (PDDL)](assets/brainwash_example.jpg)
-TODO
+![caption: A visualization of 81,973 head annotations from the Brainwash dataset training partition. &copy; megapixels.cc](assets/brainwash_saliency_map.jpg)
-- change supp images to 2x2 grid with bboxes
-- add bounding boxes to the header image
-- remake montage with randomized images, with bboxes
-- add ethics link to Stanford
-- add optout info
{% include 'cite_our_work.html' %}
@@ -52,4 +46,4 @@ TODO
[^readme]: "readme.txt" https://exhibits.stanford.edu/data/catalog/sx925dc9385.
[^end_to_end]: Stewart, Russel. Andriluka, Mykhaylo. "End-to-end people detection in crowded scenes". 2016.
[^localized_region_context]: Li, Y. and Dou, Y. and Liu, X. and Li, T. Localized Region Context and Object Feature Fusion for People Head Detection. ICIP16 Proceedings. 2016. Pages 594-598.
-[^replacement_algorithm]: Zhao. X, Wang Y, Dou, Y. A Replacement Algorithm of Non-Maximum Suppression Base on Graph Clustering.
\ No newline at end of file
+[^replacement_algorithm]: Zhao. X, Wang Y, Dou, Y. A Replacement Algorithm of Non-Maximum Suppression Base on Graph Clustering.
diff --git a/site/content/pages/datasets/duke_mtmc/index.md b/site/content/pages/datasets/duke_mtmc/index.md
index 2140fed7..0f4986de 100644
--- a/site/content/pages/datasets/duke_mtmc/index.md
+++ b/site/content/pages/datasets/duke_mtmc/index.md
@@ -22,7 +22,7 @@ Duke MTMC (Multi-Target, Multi-Camera) is a dataset of surveillance video footag
In this investigation into the Duke MTMC dataset we tracked down over 100 publicly available research papers that explicitly acknowledged using Duke MTMC. Our analysis shows that the dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers with explicit and direct links to the Chinese military and several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.
-In one 2018 [paper](http://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_Attention-Aware_Compositional_Network_CVPR_2018_paper.pdf) jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled [Attention-Aware Compositional Network for Person Re-identification](https://www.semanticscholar.org/paper/Attention-Aware-Compositional-Network-for-Person-Xu-Zhao/14ce502bc19b225466126b256511f9c05cadcb6e), the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in "finding missing elderly and children, and suspect tracking, etc." Both SenseNets and SenseTime have been directly linked to the providing surveillance technology to monitor Uighur Muslims in China. [^sensetime_qz][^sensenets_uyghurs][^xinjiang_nyt]
+In one 2018 [paper](http://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_Attention-Aware_Compositional_Network_CVPR_2018_paper.pdf) jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled [Attention-Aware Compositional Network for Person Re-identification](https://www.semanticscholar.org/paper/Attention-Aware-Compositional-Network-for-Person-Xu-Zhao/14ce502bc19b225466126b256511f9c05cadcb6e), the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in "finding missing elderly and children, and suspect tracking, etc." Both SenseNets and SenseTime have been directly linked to providing surveillance technology to monitor Uighur Muslims in China. [^xinjiang_nyt][^sensetime_qz][^sensenets_uyghurs]
![caption: A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.](assets/duke_mtmc_reid_montage.jpg)
@@ -30,18 +30,17 @@ Despite [repeated](https://www.hrw.org/news/2017/11/19/china-police-big-data-sys
| Organization | Paper | Link | Year | Used Duke MTMC |
|---|---|---|---|---|
-| SenseNets, SenseTime | Attention-Aware Compositional Network for Person Re-identification | [SemanticScholar](https://www.semanticscholar.org/paper/Attention-Aware-Compositional-Network-for-Person-Xu-Zhao/14ce502bc19b225466126b256511f9c05cadcb6e) | 2018 | &#x2714; |
-|SenseTime| End-to-End Deep Kronecker-Product Matching for Person Re-identification | [thcvf.com](http://openaccess.thecvf.com/content_cvpr_2018/papers/Shen_End-to-End_Deep_Kronecker-Product_CVPR_2018_paper.pdf) | 2018| &#x2714; |
-|CloudWalk| Horizontal Pyramid Matching for Person Re-identification | [arxiv.org](https://arxiv.org/pdf/1804.05275.pdf) | 20xx | &#x2714; |
-| Megvii | Multi-Target, Multi-Camera Tracking by Hierarchical Clustering: Recent Progress on DukeMTMC Project | [SemanticScholar](https://www.semanticscholar.org/paper/Multi-Target%2C-Multi-Camera-Tracking-by-Hierarchical-Zhang-Wu/10c20cf47d61063032dce4af73a4b8e350bf1128) | 2018 | &#x2714; |
+| Beihang University | Orientation-Guided Similarity Learning for Person Re-identification | [ieee.org](https://ieeexplore.ieee.org/document/8545620) | 2018 | &#x2714; |
+| Beihang University | Online Inter-Camera Trajectory Association Exploiting Person Re-Identification and Camera Topology | [acm.org](https://dl.acm.org/citation.cfm?id=3240663) | 2018 | &#x2714; |
+| CloudWalk | CloudWalk re-identification technology extends facial biometric tracking with improved accuracy | [BiometricUpdate.com](https://www.biometricupdate.com/201903/cloudwalk-re-identification-technology-extends-facial-biometric-tracking-with-improved-accuracy) | 2019 | &#x2714; |
+|CloudWalk| Horizontal Pyramid Matching for Person Re-identification | [arxiv.org](https://arxiv.org/pdf/1804.05275.pdf) | 2018 | &#x2714; |
| Megvii | Person Re-Identification (slides) | [github.io](https://zsc.github.io/megvii-pku-dl-course/slides/Lecture%2011,%20Human%20Understanding_%20ReID%20and%20Pose%20and%20Attributes%20and%20Activity%20.pdf) | 2017 | &#x2714; |
+| Megvii | Multi-Target, Multi-Camera Tracking by Hierarchical Clustering: Recent Progress on DukeMTMC Project | [SemanticScholar](https://www.semanticscholar.org/paper/Multi-Target%2C-Multi-Camera-Tracking-by-Hierarchical-Zhang-Wu/10c20cf47d61063032dce4af73a4b8e350bf1128) | 2018 | &#x2714; |
| Megvii | SCPNet: Spatial-Channel Parallelism Network for Joint Holistic and Partial Person Re-Identification | [arxiv.org](https://arxiv.org/abs/1810.06996) | 2018 | &#x2714; |
-| CloudWalk | CloudWalk re-identification technology extends facial biometric tracking with improved accuracy | [BiometricUpdate.com](https://www.biometricupdate.com/201903/cloudwalk-re-identification-technology-extends-facial-biometric-tracking-with-improved-accuracy) | 2018 | &#x2714; |
-| CloudWalk | Horizontal Pyramid Matching for Person Re-identification | [arxiv.org](https://arxiv.org/abs/1804.05275)] | 2018 | &#x2714; |
| National University of Defense Technology | Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers | [SemanticScholar.org](https://www.semanticscholar.org/paper/Tracking-by-Animation%3A-Unsupervised-Learning-of-He-Liu/e90816e1a0e14ea1e7039e0b2782260999aef786) | 2018 | &#x2714; |
| National University of Defense Technology | Unsupervised Multi-Object Detection for Video Surveillance Using Memory-Based Recurrent Attention Networks | [SemanticScholar.org](https://www.semanticscholar.org/paper/Unsupervised-Multi-Object-Detection-for-Video-Using-He-He/59f357015054bab43fb8cbfd3f3dbf17b1d1f881) | 2018 | &#x2714; |
-| Beihang University | Orientation-Guided Similarity Learning for Person Re-identification | [ieee.org](https://ieeexplore.ieee.org/document/8545620) | 2018 | &#x2714; |
-| Beihang University | Online Inter-Camera Trajectory Association Exploiting Person Re-Identification and Camera Topology | [acm.org](https://dl.acm.org/citation.cfm?id=3240663) | 2018 | &#x2714; |
+| SenseNets, SenseTime | Attention-Aware Compositional Network for Person Re-identification | [SemanticScholar](https://www.semanticscholar.org/paper/Attention-Aware-Compositional-Network-for-Person-Xu-Zhao/14ce502bc19b225466126b256511f9c05cadcb6e) | 2018 | &#x2714; |
+| SenseTime | End-to-End Deep Kronecker-Product Matching for Person Re-identification | [thecvf.com](http://openaccess.thecvf.com/content_cvpr_2018/papers/Shen_End-to-End_Deep_Kronecker-Product_CVPR_2018_paper.pdf) | 2018 | &#x2714; |
The reasons that companies in China use the Duke MTMC dataset for research are technically no different than the reasons it is used in the United States and Europe. In fact, the original creators of the dataset published a follow up report in 2017 titled [Tracking Social Groups Within and Across Cameras](https://www.semanticscholar.org/paper/Tracking-Social-Groups-Within-and-Across-Cameras-Solera-Calderara/9e644b1e33dd9367be167eb9d832174004840400) with specific applications to "automated analysis of crowds and social gatherings for surveillance and security applications". Their work, as well as the creation of the original dataset in 2014 were both supported in part by the United States Army Research Laboratory.
@@ -49,12 +48,12 @@ Citations from the United States and Europe show a similar trend to that in Chin
| Organization | Paper | Link | Year | Used Duke MTMC |
|---|---|---|---|---|
-| IARPA, IBM, CloudWalk | Horizontal Pyramid Matching for Person Re-identification | [arxiv.org](https://arxiv.org/abs/1804.05275) | 2018 | &#x2714; |
+| IARPA, IBM | Horizontal Pyramid Matching for Person Re-identification | [arxiv.org](https://arxiv.org/abs/1804.05275) | 2018 | &#x2714; |
| Microsoft | ReXCam: Resource-Efficient, Cross-Camera Video Analytics at Enterprise Scale | [arxiv.org](https://arxiv.org/abs/1811.01268) | 2018 | &#x2714; |
| Microsoft | Scaling Video Analytics Systems to Large Camera Deployments | [arxiv.org](https://arxiv.org/pdf/1809.02318.pdf) | 2018 | &#x2714; |
-| University College of London, National University of Defense Technology | Unsupervised Multi-Object Detection for Video Surveillance Using Memory-Based RecurrentAttention Networks | [SemanticScholar.org](https://pdfs.semanticscholar.org/59f3/57015054bab43fb8cbfd3f3dbf17b1d1f881.pdf) | 2018 | &#x2714; |
-| Vision Semantics Ltd. | Unsupervised Person Re-identification by Deep Learning Tracklet Association | [arxiv.org](https://arxiv.org/abs/1809.02874) | 2018 | &#x2714; |
+| University College London | Unsupervised Multi-Object Detection for Video Surveillance Using Memory-Based Recurrent Attention Networks | [SemanticScholar.org](https://pdfs.semanticscholar.org/59f3/57015054bab43fb8cbfd3f3dbf17b1d1f881.pdf) | 2018 | &#x2714; |
| US Dept. of Homeland Security | Re-Identification with Consistent Attentive Siamese Networks | [arxiv.org](https://arxiv.org/abs/1811.07487/) | 2019 | &#x2714; |
+| Vision Semantics Ltd. | Unsupervised Person Re-identification by Deep Learning Tracklet Association | [arxiv.org](https://arxiv.org/abs/1809.02874) | 2018 | &#x2714; |
By some metrics the dataset is considered a huge success. It is regarded as highly influential research and has contributed to hundreds, if not thousands, of projects to advance artificial intelligence for person tracking and monitoring. All the above citations, regardless of which country is using it, align perfectly with the original [intent](http://vision.cs.duke.edu/DukeMTMC/) of the Duke MTMC dataset: "to accelerate advances in multi-target multi-camera tracking".
@@ -66,7 +65,7 @@ The same logic applies for all the new extensions of the Duke MTMC dataset inclu
But this perspective comes at significant cost to civil rights, human rights, and privacy. The creation and distribution of the Duke MTMC illustrates an egregious prioritization of surveillance technologies over individual rights, where the simple act of going to class could implicate your biometric data in a surveillance training dataset, perhaps even used by foreign defense agencies against your own ethics, against your own political interests, or against universal human rights.
-For the approximately 2,000 students in Duke MTMC dataset, there is unfortunately no escape. It would be impossible to remove oneself from all copies of the dataset downloaded around the world. Instead, over 2,000 students and visitors who happened to be walking to class on March 13, 2014 will forever remain in all downloaded copies of the Duke MTMC dataset and all its extensions, contributing to a global supply chain of data that powers governmental and commercial expansion of biometric surveillance technologies.
+For the approximately 2,000 students in the Duke MTMC dataset, there is unfortunately no escape. It would be impossible to remove oneself from all copies of the dataset downloaded around the world. Instead, over 2,000 students and visitors who happened to be walking to class in 2014 will forever remain in all downloaded copies of the Duke MTMC dataset and all its extensions, contributing to a global supply chain of data that powers governmental and commercial expansion of biometric surveillance technologies.
![caption: Duke MTMC camera views for 8 cameras deployed on campus &copy; megapixels.cc](assets/duke_mtmc_cameras.jpg)
@@ -80,7 +79,7 @@ For the approximately 2,000 students in Duke MTMC dataset, there is unfortunatel
#### Video Timestamps
-The video timestamps contain the likely, but not yet confirmed, date and times of capture. Because the video timestamps align with the start and stop [time sync data](http://vision.cs.duke.edu/DukeMTMC/details.html#time-sync) provided by the researchers, it at least aligns the relative time. The [rainy weather](https://www.wunderground.com/history/daily/KIGX/date/2014-3-19?req_city=Durham&req_state=NC&req_statename=North%20Carolina&reqdb.zip=27708&reqdb.magic=1&reqdb.wmo=99999) on that day also contributes towards the likelihood of March 14, 2014.
+The video timestamps contain the likely, but not yet confirmed, date and times the video was recorded. Because the video timestamps align with the start and stop [time sync data](http://vision.cs.duke.edu/DukeMTMC/details.html#time-sync) provided by the researchers, they at least confirm the relative timing. The [rainy weather](https://www.wunderground.com/history/daily/KIGX/date/2014-3-19?req_city=Durham&req_state=NC&req_statename=North%20Carolina&reqdb.zip=27708&reqdb.magic=1&reqdb.wmo=99999) on March 14, 2014 in Durham, North Carolina supports, but does not confirm, that this day is a potential capture date.
=== columns 2
@@ -105,7 +104,13 @@ The video timestamps contain the likely, but not yet confirmed, date and times o
#### Errata
-- The Duke MTMC dataset paper mentions 2,700 identities, but their ground truth file only lists annotations for 1,812.
+The original Duke MTMC dataset paper mentions 2,700 identities, but their ground truth file only lists annotations for 1,812, and their own research typically mentions 2,000. For this writeup we used 2,000 to describe the approximate number of students.
+
+#### Ethics
+
+Please direct any questions about the ethics of the dataset to Duke University's [Institutional Ethics & Compliance Office](https://hr.duke.edu/policies/expectations/compliance/) using the number at the bottom of the page.
+
+{% include 'cite_our_work.html' %}
#### Citing Duke MTMC
@@ -120,15 +125,10 @@ If you use any data from the Duke MTMC, please follow their [license](http://vis
}
</pre>
-{% include 'cite_our_work.html' %}
-
-#### ToDo
-
-- clean up citations, formatting
-
### Footnotes
[^duke_mtmc_orig]: "Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking". 2016. [SemanticScholar](https://www.semanticscholar.org/paper/Performance-Measures-and-a-Data-Set-for-Tracking-Ristani-Solera/27a2fad58dd8727e280f97036e0d2bc55ef5424c)
[^sensetime_qz]: <https://qz.com/1248493/sensetime-the-billion-dollar-alibaba-backed-ai-company-thats-quietly-watching-everyone-in-china/>
[^sensenets_uyghurs]: <https://foreignpolicy.com/2019/03/19/962492-orwell-china-socialcredit-surveillance/>
[^xinjiang_nyt]: Mozur, Paul. "One Month, 500,000 Face Scans: How China Is Using A.I. to Profile a Minority". https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html. April 14, 2019.
+
diff --git a/site/content/pages/datasets/index.md b/site/content/pages/datasets/index.md
index c0373d60..289aa2fd 100644
--- a/site/content/pages/datasets/index.md
+++ b/site/content/pages/datasets/index.md
@@ -13,4 +13,4 @@ sync: false
# Facial Recognition Datasets
-Explore publicly available facial recognition datasets. More datasets will be added throughout 2019.
+Explore publicly available facial recognition datasets feeding into research and development of biometric surveillance technologies at the largest technology companies and defense contractors in the world.
diff --git a/site/content/pages/datasets/msceleb/index.md b/site/content/pages/datasets/msceleb/index.md
index 70e85699..d5e52952 100644
--- a/site/content/pages/datasets/msceleb/index.md
+++ b/site/content/pages/datasets/msceleb/index.md
@@ -19,7 +19,6 @@ authors: Adam Harvey
### sidebar
### end sidebar
-[ PAGE UNDER DEVELOPMENT ]
https://www.hrw.org/news/2019/01/15/letter-microsoft-face-surveillance-technology
diff --git a/site/content/pages/datasets/uccs/assets/uccs_grid.jpg b/site/content/pages/datasets/uccs/assets/uccs_grid.jpg
index d3d898ea..95dff617 100644
--- a/site/content/pages/datasets/uccs/assets/uccs_grid.jpg
+++ b/site/content/pages/datasets/uccs/assets/uccs_grid.jpg
Binary files differ
diff --git a/site/content/pages/datasets/uccs/index.md b/site/content/pages/datasets/uccs/index.md
index 68fff4db..b6073384 100644
--- a/site/content/pages/datasets/uccs/index.md
+++ b/site/content/pages/datasets/uccs/index.md
@@ -20,43 +20,37 @@ authors: Adam Harvey
### sidebar
### end sidebar
-UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at University of Colorado Colorado Springs developed primarily for research and development of "face detection and recognition research towards surveillance applications"[^uccs_vast]. According to the authors of two papers associated with the dataset, over 1,700 students and pedestrians were "photographed using a long-range high-resolution surveillance camera without their knowledge".[^funding_uccs] In this investigation, we examine the contents of the dataset, funding sources, photo EXIF data, and information from publicly available research project citations.
-
+UnConstrained College Students (UCCS) is a dataset of long-range surveillance photos captured at University of Colorado Colorado Springs developed primarily for research and development of "face detection and recognition research towards surveillance applications"[^uccs_vast]. According to the authors of [two](https://www.semanticscholar.org/paper/Unconstrained-Face-Detection-and-Open-Set-Face-G%C3%BCnther-Hu/d4f1eb008eb80595bcfdac368e23ae9754e1e745) [papers](https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1) associated with the dataset, over 1,700 students and pedestrians were "photographed using a long-range high-resolution surveillance camera without their knowledge".[^funding_uccs] In this investigation, we examine the contents of the [dataset](http://vast.uccs.edu/Opensetface/), its funding sources, photo EXIF data, and information from publicly available research project citations.
The UCCS dataset includes over 1,700 unique identities, most of which are students walking to and from class. As of 2018, it was the "largest surveillance [face recognition] benchmark in the public domain."[^surv_face_qmul] The photos were taken during the spring semesters of 2012 &ndash; 2013 on the West Lawn of the University of Colorado Colorado Springs campus. The photographs were timed to capture students during breaks between their scheduled classes in the morning and afternoon during Monday through Thursday. "For example, a student taking Monday-Wednesday classes at 12:30 PM will show up in the camera on almost every Monday and Wednesday."[^sapkota_boult].
-![caption: Example images from the UnConstrained College Students Dataset. ](assets/uccs_grid.jpg)
+![caption: The location at University of Colorado Colorado Springs where students were surreptitiously photographed with a long-range surveillance camera for use in a defense and intelligence agency funded research project on face recognition. Image: Google Maps](assets/uccs_map_aerial.jpg)
-The long-range surveillance images in the UnContsrained College Students dataset were captured using a Canon 7D 18 megapixel digital camera fitted with a Sigma 800mm F5.6 EX APO DG HSM telephoto lens and pointed out an office window across the university's West Lawn. The students were photographed from a distance of approximately 150 meters through an office window. "The camera [was] programmed to start capturing images at specific time intervals between classes to maximize the number of faces being captured."[^sapkota_boult]
-Their setup made it impossible for students to know they were being photographed, providing the researchers with realistic surveillance images to help build face detection and recognition systems for real world applications in defense, intelligence, and commercial applications.
-![caption: The location at University of Colorado Colorado Springs where students were surreptitiously photographed with a long-range surveillance camera for use in a defense and intelligence agency funded research project on face recognition. Image: Google Maps](assets/uccs_map_aerial.jpg)
+The long-range surveillance images in the UnConstrained College Students dataset were taken using a Canon 7D 18-megapixel digital camera fitted with a Sigma 800mm F5.6 EX APO DG HSM telephoto lens and pointed out an office window across the university's West Lawn. The students were photographed from a distance of approximately 150 meters. "The camera [was] programmed to start capturing images at specific time intervals between classes to maximize the number of faces being captured."[^sapkota_boult]
+Their setup made it impossible for students to know they were being photographed, providing the researchers with realistic surveillance images to help build face recognition systems for real-world use by defense, intelligence, and commercial partners.
-In the two papers associated with the release of the UCCS dataset ([Unconstrained Face Detection and Open-Set Face Recognition Challenge](https://www.semanticscholar.org/paper/Unconstrained-Face-Detection-and-Open-Set-Face-G%C3%BCnther-Hu/d4f1eb008eb80595bcfdac368e23ae9754e1e745) and [Large Scale Unconstrained Open Set Face Database](https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1)), the researchers disclosed their funding sources as ODNI (United States Office of Director of National Intelligence), IARPA (Intelligence Advance Research Projects Activity), ONR MURI (Office of Naval Research and The Department of Defense Multidisciplinary University Research Initiative), Army SBIR (Small Business Innovation Research), SOCOM SBIR (Special Operations Command and Small Business Innovation Research), and the National Science Foundation. Further, UCCS's VAST site explicity [states](https://vast.uccs.edu/project/iarpa-janus/) they are part of the [IARPA Janus](https://www.iarpa.gov/index.php/research-programs/janus), a face recognition project developed to serve the needs of national intelligence interests.
+![caption: Example images from the UnConstrained College Students Dataset. ](assets/uccs_grid.jpg)
-The EXIF data embedded in the images shows that the photo capture times follow a similar pattern, but also highlights that the vast majority of photos (over 7,000) were taken on Tuesdays around noon during students' lunch break. The lack of any photos taken on Friday shows that the researchers were only interested in capturing images of students.
+The EXIF data embedded in the images shows that the photo capture times follow a pattern similar to that outlined by the researchers, but also highlights that the vast majority of photos (over 7,000) were taken on Tuesdays around noon during students' lunch break. The lack of any photos taken from Friday through Sunday shows that the researchers were only interested in capturing images of students during peak campus hours.
![caption: UCCS photos captured per weekday &copy; megapixels.cc](assets/uccs_exif_plot_days.png)
![caption: UCCS photos captured per weekday &copy; megapixels.cc](assets/uccs_exif_plot.png)
-The two research papers associated with the release of the UCCS dataset ([Unconstrained Face Detection and Open-Set Face Recognition Challenge](https://www.semanticscholar.org/paper/Unconstrained-Face-Detection-and-Open-Set-Face-G%C3%BCnther-Hu/d4f1eb008eb80595bcfdac368e23ae9754e1e745) and [Large Scale Unconstrained Open Set Face Database](https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1)), acknowledge that the primary funding sources for their work were United States defense and intelligence agencies. Specifically, development of the UnContrianed College Students dataset was funded by the Intelligence Advanced Research Projects Activity (IARPA), Office of Director of National Intelligence (ODNI), Office of Naval Research and The Department of Defense Multidisciplinary University Research Initiative (ONR MURI), Small Business Innovation Research (SBIR), Special Operations Command and Small Business Innovation Research (SOCOM SBIR), and the National Science Foundation. Further, UCCS's VAST site explicitly [states](https://vast.uccs.edu/project/iarpa-janus/) they are part of the [IARPA Janus](https://www.iarpa.gov/index.php/research-programs/janus), a face recognition project developed to serve the needs of national intelligence interests, clearly establishing the the funding sources and immediate benefactors of this dataset are United States defense and intelligence agencies.
-
-
-Although the images were first captured in 2012 &ndash; 2013 the dataset was not publicly released until 2016. Then in 2017 the UCCS face dataset formed the basis for a defense and intelligence agency funded [face recognition challenge](http://www.face-recognition-challenge.com/) project at the International Joint Biometrics Conference in Denver, CO. And in 2018 the dataset was again used for the [2nd Unconstrained Face Detection and Open Set Recognition Challenge](https://erodner.github.io/ial2018eccv/) at the European Computer Vision Conference (ECCV) in Munich, Germany.
-
-As of April 15, 2019, the UCCS dataset is no longer available for public download. But during the three years it was publicly available (2016-2019) the UCCS dataset appeared in at least 6 publicly available research papers including verified usage from Beihang University who is known to provide research and development for China's military.
+The two research papers associated with the release of the UCCS dataset ([Unconstrained Face Detection and Open-Set Face Recognition Challenge](https://www.semanticscholar.org/paper/Unconstrained-Face-Detection-and-Open-Set-Face-G%C3%BCnther-Hu/d4f1eb008eb80595bcfdac368e23ae9754e1e745) and [Large Scale Unconstrained Open Set Face Database](https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1)) acknowledge that the primary funding sources for their work were United States defense and intelligence agencies. Specifically, development of the UnConstrained College Students dataset was funded by the Intelligence Advanced Research Projects Activity (IARPA), Office of Director of National Intelligence (ODNI), Office of Naval Research and The Department of Defense Multidisciplinary University Research Initiative (ONR MURI), and the Special Operations Command and Small Business Innovation Research (SOCOM SBIR), among others. UCCS's VAST site also explicitly [states](https://vast.uccs.edu/project/iarpa-janus/) their involvement in the [IARPA Janus](https://www.iarpa.gov/index.php/research-programs/janus) face recognition project, which was developed to serve the needs of national intelligence. This establishes that the immediate beneficiaries of this dataset include United States defense and intelligence agencies, though it would go on to benefit other similar organizations.
+In 2017, one year after its public release, the UCCS face dataset formed the basis for a [face recognition challenge](http://www.face-recognition-challenge.com/) project, funded by defense and intelligence agencies, at the International Joint Conference on Biometrics (IJCB) in Denver, CO. In 2018 the dataset was again used for the [2nd Unconstrained Face Detection and Open Set Recognition Challenge](https://erodner.github.io/ial2018eccv/) at the European Conference on Computer Vision (ECCV) in Munich, Germany.
+As of April 15, 2019, the UCCS dataset is no longer available for public download. But during the three years it was publicly available (2016-2019), the UCCS dataset appeared in at least 6 publicly available research papers, including verified usage from Beihang University, which is known to provide research and development for China's military, and Vision Semantics Ltd, which lists the UK Ministry of Defence as a project partner.
{% include 'dashboard.html' %}
{% include 'supplementary_header.html' %}
-
-To show the types of face images used in the UCCS student dataset while protecting their individual privacy, a generative adversarial network was used to interpolate between identities in the dataset. The image below shows a generative adversarial network trained on the UCCS face bounding box areas from 16,000 images and over 90,000 face regions.
+To show the types of face images used in the UCCS student dataset while protecting their individual privacy, a generative adversarial network was used to interpolate between identities in the dataset. The image below shows a generative adversarial network trained on the UCCS face bounding box areas from 16,000 images and over 90,000 face regions.
![caption: GAN generated approximations of students in the UCCS dataset. &copy; megapixels.cc 2018](assets/uccs_pgan_01.jpg)
@@ -98,7 +92,7 @@ To show the types of face images used in the UCCS student dataset while protecti
### Location
-The location of the camera and subjects can confirmed using several visual cues in the dataset images: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no parking sign and directionality of its arrow, the back of street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage all match images in the dataset. The [original papers](https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1) also provides another clue: a [picture of the camera](https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1/figure/1) inside the office that was used to create the dataset. The window view in this image provides another match for the brick pattern on the north facade of the Kraember Family Library and the green metal fence along the sidewalk. View the [location on Google Maps](https://www.google.com/maps/place/University+of+Colorado+Colorado+Springs/@38.8934297,-104.7992445,27a,35y,258.51h,75.06t/data=!3m1!1e3!4m5!3m4!1s0x87134fa088fe399d:0x92cadf3962c058c4!8m2!3d38.8968312!4d-104.8049528)
+The location of the camera and subjects was confirmed using several visual cues that match images in the dataset: the unique pattern of the sidewalk that is only used on the UCCS Pedestrian Spine near the West Lawn, the two UCCS sign poles with matching graphics still visible in Google Street View, the no parking sign and the directionality of its arrow, the back of the street sign next to it, the slight bend in the sidewalk, the presence of cars passing in the background of the image, and the far wall of the parking garage. The [original paper](https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1) also provides another clue: a [picture of the camera](https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1/figure/1) inside the office that was used to create the dataset. The window view in this image provides another match for the brick pattern on the north facade of the Kraemer Family Library and the green metal fence along the sidewalk. View the [location on Google Maps](https://www.google.com/maps/place/University+of+Colorado+Colorado+Springs/@38.8934297,-104.7992445,27a,35y,258.51h,75.06t/data=!3m1!1e3!4m5!3m4!1s0x87134fa088fe399d:0x92cadf3962c058c4!8m2!3d38.8968312!4d-104.8049528).
![caption: 3D view showing the angle of view of the surveillance camera used for UCCS dataset. Image: Google Maps](assets/uccs_map_3d.jpg)