From 27340ac4cd43f8eec7414495b541a65566ae2656 Mon Sep 17 00:00:00 2001
From: adamhrv
Date: Tue, 8 Oct 2019 16:02:47 +0200
Subject: update site, white

---
 site/public/datasets/adience/index.html | 7 +-
 site/public/datasets/brainwash/index.html | 17 ++--
 site/public/datasets/duke_mtmc/index.html | 11 +--
 site/public/datasets/helen/index.html | 94 ++++++++++++++++++++--
 site/public/datasets/ibm_dif/index.html | 29 +++----
 site/public/datasets/ijb_c/index.html | 7 +-
 site/public/datasets/index.html | 30 ++++++-
 site/public/datasets/lfpw/index.html | 15 ++--
 site/public/datasets/megaface/index.html | 12 +--
 site/public/datasets/msceleb/index.html | 29 ++++---
 site/public/datasets/oxford_town_centre/index.html | 10 +--
 site/public/datasets/pipa/index.html | 7 +-
 site/public/datasets/uccs/index.html | 11 +--
 site/public/datasets/who_goes_there/index.html | 7 +-
 14 files changed, 203 insertions(+), 83 deletions(-)

(limited to 'site/public/datasets')

diff --git a/site/public/datasets/adience/index.html b/site/public/datasets/adience/index.html
index b2aa2733..a03fb3c6 100644
--- a/site/public/datasets/adience/index.html
+++ b/site/public/datasets/adience/index.html
@@ -55,8 +55,7 @@
-
Adience ...
Adience ... -

Adience

+

Adience

 Nine of 11,917 images from the Brainwash dataset. Graphic: megapixels.cc based on Brainwash dataset by Russel et al. License: <a href="https://opendatacommons.org/licenses/pddl/summary/index.html">Open Data Commons Public Domain Dedication</a> (PDDL)
Nine of 11,917 images from the Brainwash dataset. Graphic: megapixels.cc based on Brainwash dataset by Russel et al. License: Open Data Commons Public Domain Dedication (PDDL)
+
 Nine of 11,917 images from the Brainwash dataset. Graphic: megapixels.cc based on Brainwash dataset by Russel et al. License: <a href="https://opendatacommons.org/licenses/pddl/summary/index.html">Open Data Commons Public Domain Dedication</a> (PDDL)
Nine of 11,917 images from the Brainwash dataset. Graphic: megapixels.cc based on Brainwash dataset by Russel et al. License: Open Data Commons Public Domain Dedication (PDDL)

Press Coverage

+ +

Cite Our Work

diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html
index fc141450..351606cb 100644
--- a/site/public/datasets/duke_mtmc/index.html
+++ b/site/public/datasets/duke_mtmc/index.html
@@ -55,8 +55,8 @@

-
Duke MTMC is a dataset of surveillance camera footage of students on Duke University campus
Duke MTMC contains over 2 million video frames and 2,700 unique identities collected from 8 HD cameras at Duke University campus in March 2014 -

Duke MTMC

+
A still frame from the Duke MTMC (Multi-Target-Multi-Camera) CCTV dataset captured on Duke University campus in 2014. The dataset has now been terminated by the author in response to this report.

Duke MTMC

+

Update: In response to this report and an investigation by the Financial Times, Duke University has terminated the Duke MTMC dataset.

Duke MTMC (Multi-Target, Multi-Camera) is a dataset of surveillance video footage taken on Duke University's campus in 2014 and is used for research and development of video tracking systems, person re-identification, and low-resolution facial recognition. The dataset contains over 14 hours of synchronized surveillance video from 8 cameras at 1080p and 60 FPS, with over 2 million frames of 2,000 students walking to and from classes. The 8 surveillance cameras deployed on campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy". 1

+

Duke MTMC (Multi-Target, Multi-Camera) is a dataset of surveillance video footage taken on Duke University's campus in 2014 and is used for research and development of video tracking systems, person re-identification, and low-resolution facial recognition.

+

The dataset contains over 14 hours of synchronized surveillance video from 8 cameras at 1080p and 60 FPS, with over 2 million frames of 2,000 students walking to and from classes. The 8 surveillance cameras deployed on campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy". 1

For this analysis of the Duke MTMC dataset, over 100 publicly available research papers that used the dataset were analyzed to find out who is using the dataset and where it is being used. The results show that the Duke MTMC dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers linked to the Chinese military and to several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.

In one 2018 paper jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled Attention-Aware Compositional Network for Person Re-identification, the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in suspect tracking. Both SenseNets and SenseTime have been linked to providing surveillance technology to monitor Uighur Muslims in China. 4 2 3

 A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.
A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.

Despite repeated warnings by Human Rights Watch that the authoritarian surveillance used in China represents a humanitarian crisis, researchers at Duke University continued to provide open access to their dataset for anyone to use for any project. As the surveillance crisis in China grew, so did the number of citations with links to organizations complicit in the crisis. In 2018 alone there were over 90 research projects happening in China that publicly acknowledged using the Duke MTMC dataset. Amongst these were projects from CloudWalk, Hikvision, Megvii (Face++), SenseNets, SenseTime, Beihang University, China's National University of Defense Technology, and the PLA's Army Engineering University.

@@ -268,10 +269,10 @@
-

Information Supply chain

+

Information Supply Chain

- To help understand how Duke MTMC Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Duke Multi-Target, Multi-Camera Tracking Project was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+ To help understand how Duke MTMC Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Duke Multi-Target, Multi-Camera Tracking Project was collected, verified, and geocoded to show the information supply chains of people appearing in the images. Click on the markers to reveal research projects at that location.

diff --git a/site/public/datasets/helen/index.html b/site/public/datasets/helen/index.html
index 44ef462e..ffd432b9 100644
--- a/site/public/datasets/helen/index.html
+++ b/site/public/datasets/helen/index.html
@@ -4,7 +4,7 @@
 MegaPixels: HELEN
-
+
@@ -55,8 +55,7 @@
-
HELEN Face Dataset
HELEN (under development) -

HELEN

+
Example images from the HELEN dataset

HELEN Dataset

[ page under development ]

-
+

Helen is a dataset of annotated face images used for facial component localization. It includes 2,330 images from Flickr found by searching for "portrait" combined with terms such as "family", "wedding", "boy", "outdoor", and "studio". 1

+

The dataset was published in 2012 with the primary motivation listed as facilitating "high quality editing of portraits". However, the paper's introduction also mentions that facial feature localization "is an essential component for face recognition, tracking and expression analysis." 1

+

Regardless of the authors' primary motivations, the HELEN dataset has become one of the most widely used datasets for training facial landmark algorithms, which are essential parts of most facial recognition processing systems. Facial landmarks are used to isolate facial features such as the eyes, nose, jawline, and mouth in order to align faces to match a templated pose.

+
 An example annotation from the HELEN dataset showing 194 points that were originally annotated by Mechanical Turk workers. Graphic © 2019 MegaPixels.cc based on data from HELEN dataset by  Le, Vuong et al.
An example annotation from the HELEN dataset showing 194 points that were originally annotated by Mechanical Turk workers. Graphic © 2019 MegaPixels.cc based on data from HELEN dataset by Le, Vuong et al.

This analysis shows that since its initial publication in 2012, the HELEN dataset has been used in over 200 research projects related to facial recognition with the vast majority of research taking place in China.

+

Commercial use includes IBM, NVIDIA, NEC, Microsoft Research Asia, Google, Megvii, Microsoft, Intel, Daimler, Tencent, Baidu, Adobe, Facebook

+

Military and Defense Usage includes NUDT

+

http://eccv2012.unifi.it/

+

TODO

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Organization | Paper | Link | Year | Used Duke MTMC
SenseTime, Amazon | Look at Boundary: A Boundary-Aware Face Alignment Algorithm | | 2018 | year
SenseTime | ReenactGAN: Learning to Reenact Faces via Boundary Transfer | | 2018 | year
+

The dataset was used for training the OpenFace software: "we used the HELEN and LFPW training subsets for training and the rest for testing" (https://github.com/TadasBaltrusaitis/OpenFace/wiki/Datasets).

+

The popular dlib facial landmark detector was trained using HELEN

+

In addition to the 200+ verified citations, the HELEN dataset was used for

+ +

It's been converted into new datasets including

+ +

The original site

+ +

Example Images

+
 An image from the HELEN dataset "wedding" category used for training face recognition  2839127417_1.jpg for outdoor studio
An image from the HELEN dataset "wedding" category used for training face recognition 2839127417_1.jpg for outdoor studio
+
 An image from the HELEN dataset "wedding" category used for training face recognition 2325274893_1
An image from the HELEN dataset "wedding" category used for training face recognition 2325274893_1
 An image from the HELEN dataset "wedding" category used for training face recognition 2325274893_1
An image from the HELEN dataset "wedding" category used for training face recognition 2325274893_1
+
 An image from the HELEN dataset "wedding" category used for training face recognition 2325274893_1
An image from the HELEN dataset "wedding" category used for training face recognition 2325274893_1
 Original Flickr image used in HELEN facial analysis and recognition dataset for the keyword "family". 296814969
Original Flickr image used in HELEN facial analysis and recognition dataset for the keyword "family". 296814969
+
 Original Flickr image used in HELEN facial analysis and recognition dataset for the keyword "family". 296814969
Original Flickr image used in HELEN facial analysis and recognition dataset for the keyword "family". 296814969

Who used Helen Dataset?

@@ -91,10 +156,10 @@

-

Information Supply chain

+

Information Supply Chain

- To help understand how Helen Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Helen Dataset was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
+ To help understand how Helen Dataset has been used around the world by commercial, military, and academic organizations, existing publicly available research citing Helen Dataset was collected, verified, and geocoded to show how AI training data has proliferated around the world. Click on the markers to reveal research projects at that location.

@@ -130,7 +195,10 @@

Supplementary Information

+

Age and Gender Distribution

+

Age and gender distributions were estimated by analyzing all faces in the dataset images. This may include additional faces appearing next to an annotated face, or it may skip false faces that were erroneously included as part of the original dataset. These numbers are provided as an estimate and not a factual representation of the exact gender and age of all faces.

+
 Visualization of the HELEN dataset 194-point facial landmark annotations. Credit: graphic © MegaPixels.cc 2019, data from HELEN dataset by Zhou, Brand, Lin 2013. If you use this image please credit both the graphic and data source.
Visualization of the HELEN dataset 194-point facial landmark annotations. Credit: graphic © MegaPixels.cc 2019, data from HELEN dataset by Zhou, Brand, Lin 2013. If you use this image please credit both the graphic and data source.

Cite Our Work

@@ -147,7 +215,17 @@ }

-
+

Cite the Original Author's Work

+

If you find the HELEN dataset useful or reference it in your work, please cite the authors' original work as:

+
+@inproceedings{Le2012InteractiveFF,
+ title={Interactive Facial Feature Localization},
+ author={Vuong Le and Jonathan Brandt and Zhe L. Lin and Lubomir D. Bourdev and Thomas S. Huang},
+ booktitle={ECCV},
+ year={2012}
+}
+

References

  • 1 Le, Vuong et al. “Interactive Facial Feature Localization.” ECCV (2012). +