From f3d56483e743f83d25b15616205dbdd49aad0382 Mon Sep 17 00:00:00 2001
From: adamhrv
Date: Wed, 15 May 2019 11:24:33 +0200
Subject: .

---
 site/public/datasets/brainwash/index.html | 12 ++++++------
 site/public/datasets/duke_mtmc/index.html | 11 +++++++++--
 site/public/datasets/msceleb/index.html   |  2 +-
 site/public/datasets/uccs/index.html      |  3 ---
 4 files changed, 16 insertions(+), 12 deletions(-)

(limited to 'site/public/datasets')

diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html
index 8ae6b122..4fcea807 100644
--- a/site/public/datasets/brainwash/index.html
+++ b/site/public/datasets/brainwash/index.html
@@ -4,7 +4,7 @@
MegaPixels
-
+
@@ -53,7 +53,7 @@
-
Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco in 2014
The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection surveillance algorithms +
Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco
The Brainwash dataset includes 11,917 images of "everyday life of a busy downtown cafe" and is used for training head detection surveillance algorithms

Brainwash Dataset

Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,918 images of "everyday life of a busy downtown cafe" 1 captured at 100 second intervals throughout the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According to the author's research paper introducing the dataset, the images were acquired with the help of Angelcam.com 2

+

Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,917 images of "everyday life of a busy downtown cafe" 1 captured at 100 second intervals throughout the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According to the author's research paper introducing the dataset, the images were acquired with the help of Angelcam.com. 2

The Brainwash dataset is unique because it uses images from a publicly available webcam that records people inside a privately owned business without any consent. No ordinary cafe customer would ever suspect that their image would end up in a dataset used for surveillance research and development, but that is exactly what happened to customers at Brainwash Cafe in San Francisco.

-

Although Brainwash appears to be a less popular dataset, it was notably used in 2016 and 2017 by researchers affiliated with the National University of Defense Technology in China for two research projects on advancing the capabilities of object detection to more accurately isolate the target region in an image. 3 4. The National University of Defense Technology is controlled by China's top military body, the Central Military Commission.

+

Although Brainwash appears to be a less popular dataset, it was notably used in 2016 and 2017 by researchers affiliated with the National University of Defense Technology in China for two research projects on advancing the capabilities of object detection to more accurately isolate the target region in an image. 3 4 The National University of Defense Technology is controlled by China's top military body, the Central Military Commission.

The dataset also appears in a 2017 research paper from Peking University for the purpose of improving surveillance capabilities for "people detection in the crowded scenes".

-
Nine of 11,917 images from the Brainwash dataset. Graphics credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
+
A sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains a total of 11,917 images and 81,973 annotated heads. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
A visualization of the active regions for 81,973 head annotations in the Brainwash dataset training partition. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)

Who used Brainwash Dataset?

@@ -140,7 +140,7 @@

Supplementary Information

-
A sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains a total of 11,917 images and 81,973 annotated heads. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
A visualization of the active regions for 81,973 head annotations in the Brainwash dataset training partition. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
+
Nine of 11,917 images from the Brainwash dataset. Graphics credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)

Cite Our Work

diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html
index 24ee6cc2..16d11cb0 100644
--- a/site/public/datasets/duke_mtmc/index.html
+++ b/site/public/datasets/duke_mtmc/index.html
@@ -74,9 +74,9 @@

Website
duke.edu

Duke MTMC (Multi-Target, Multi-Camera) is a dataset of surveillance video footage taken on Duke University's campus in 2014 and is used for research and development of video tracking systems, person re-identification, and low-resolution facial recognition. The dataset contains over 14 hours of synchronized surveillance video from 8 cameras at 1080p and 60 FPS, with over 2 million frames of 2,000 students walking to and from classes. The 8 surveillance cameras deployed on campus were set up specifically to capture students "during periods between lectures, when pedestrian traffic is heavy" 1.

-

In this investigation into the Duke MTMC dataset we tracked down over 100 publicly available research papers that explicitly acknowledged using Duke MTMC. Our analysis shows that the dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers with explicit and direct links to the Chinese military and several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.

+

In this investigation into the Duke MTMC dataset we tracked down over 100 publicly available research papers that explicitly acknowledged using Duke MTMC. Our analysis shows that the dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers with links to the Chinese military and several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.

In one 2018 paper jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled Attention-Aware Compositional Network for Person Re-identification, the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in "finding missing elderly and children, and suspect tracking, etc." Both SenseNets and SenseTime have been directly linked to providing surveillance technology to monitor Uighur Muslims in China. 4 2 3

-
A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.

Despite repeated warnings by Human Rights Watch that the authoritarian surveillance used in China represents a violation of human rights, researchers at Duke University continued to provide open access to their dataset for anyone to use for any project. As the surveillance crisis in China grew, so did the number of citations with links to organizations complicit in the crisis. In 2018 alone there were over 70 research projects happening in China that publicly acknowledged benefiting from the Duke MTMC dataset. Amongst these were projects from SenseNets, SenseTime, CloudWalk, Megvii, Beihang University, and the PLA's National University of Defense Technology.

+
A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.

Despite repeated warnings by Human Rights Watch that the authoritarian surveillance used in China represents a humanitarian crisis, researchers at Duke University continued to provide open access to their dataset for anyone to use for any project. As the surveillance crisis in China grew, so did the number of citations with links to organizations complicit in the crisis. In 2018 alone there were over 90 research projects happening in China that publicly acknowledged using and benefiting from the Duke MTMC dataset. Amongst these were projects from CloudWalk, Hikvision, Megvii (Face++), SenseNets, SenseTime, Beihang University, and the PLA's National University of Defense Technology.

@@ -116,6 +116,13 @@
+
+
+
+
+
+
+
diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html
index aabda46c..dfe1b2d9 100644
--- a/site/public/datasets/msceleb/index.html
+++ b/site/public/datasets/msceleb/index.html
@@ -194,7 +194,7 @@
Organization
Hikvision | Learning Incremental Triplet Margin for Person Re-identification | arxiv.org | 2018
Megvii | Person Re-Identification (slides) | github.io
-

After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the MS Celeb image dataset for their research paper on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.

+

After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the MS Celeb image dataset for their research paper on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.

In an April 10, 2019 article published by the Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now". She added that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]". 2

Four more papers published by SenseTime that also use the MS Celeb dataset raise similar flags. SenseTime is a computer vision surveillance company that until April 2019 provided surveillance to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province, and had been flagged numerous times as having potential links to human rights violations.

One of the 4 SenseTime papers, "Exploring Disentangled Feature Representation Beyond Face Identification", shows how SenseTime was developing automated face analysis technology to infer race, narrow eyes, nose size, and chin size, all of which could be used to target vulnerable ethnic groups based on their facial appearances.

diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html
index b5ceebd3..1a3a471f 100644
--- a/site/public/datasets/uccs/index.html
+++ b/site/public/datasets/uccs/index.html
@@ -56,9 +56,6 @@
UnConstrained College Students is a dataset of long-range surveillance photos of students on the campus of the University of Colorado in Colorado Springs
The UnConstrained College Students dataset includes 16,149 images of 1,732 students, faculty, and pedestrians and is used for developing face recognition and face detection algorithms

UnConstrained College Students