From 1b6aba08b8eca4f09456bd55ca617138cf8502b9 Mon Sep 17 00:00:00 2001 From: adamhrv Date: Wed, 24 Apr 2019 10:10:06 +0200 Subject: udpate --- site/public/about/attribution/index.html | 1 + site/public/about/index.html | 3 ++- site/public/about/legal/index.html | 1 + site/public/about/press/index.html | 12 +++++++++- site/public/datasets/brainwash/index.html | 2 +- site/public/datasets/duke_mtmc/index.html | 2 +- site/public/datasets/index.html | 14 +++++------ site/public/datasets/msceleb/index.html | 28 ++++++++++++++-------- site/public/datasets/oxford_town_centre/index.html | 2 +- site/public/datasets/uccs/index.html | 2 +- 10 files changed, 44 insertions(+), 23 deletions(-) (limited to 'site/public') diff --git a/site/public/about/attribution/index.html b/site/public/about/attribution/index.html index 2f59fd7c..4105847e 100644 --- a/site/public/about/attribution/index.html +++ b/site/public/about/attribution/index.html @@ -38,6 +38,7 @@
diff --git a/site/public/about/index.html b/site/public/about/index.html index cd621ba7..17b63704 100644 --- a/site/public/about/index.html +++ b/site/public/about/index.html @@ -38,6 +38,7 @@
@@ -58,7 +59,7 @@

MegaPixels is an art and research project first launched in 2017 for an installation at Tactical Technology Collective's Glass Room about face recognition datasets. In 2018 MegaPixels was extended to cover pedestrian analysis datasets for a commission by the Elevate Arts festival in Austria. Since then MegaPixels has evolved into a large-scale interrogation of hundreds of publicly available face and person analysis datasets, the first of which launched on this site in April 2019.

MegaPixels aims to provide a critical perspective on machine learning image datasets, one that might otherwise escape academia and industry-funded artificial intelligence think tanks, which are often supported by several of the same technology companies that created the datasets presented on this site.

-

MegaPixels is an independent project, designed as a public resource for educators, students, journalists, and researchers. Each dataset presented on this site undergoes a thorough review of its images, intent, and funding sources. Though the goals are similar to publishing an academic paper, MegaPixels is a website-first research project, with a academic publications to follow.

+

MegaPixels is an independent project, designed as a public resource for educators, students, journalists, and researchers. Each dataset presented on this site undergoes a thorough review of its images, intent, and funding sources. Though the goals are similar to publishing an academic paper, MegaPixels is a website-first research project, with an academic publication to follow.

One of the main focuses of the dataset investigations presented on this site is to uncover where funding originated. Because of our emphasis on other researchers' funding sources, it is important that we are transparent about our own. This site and the past year of research have been primarily funded by a privacy art grant from Mozilla in 2018. The original MegaPixels installation in 2017 was built as a commission for and with support from Tactical Technology Collective and Mozilla. The research into pedestrian analysis datasets was funded by a commission from Elevate Arts, and continued development in 2019 is supported in part by a 1-year Researcher-in-Residence grant from Karlsruhe HfG, as well as lecture and workshop fees.

Team
diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html index 0c760858..4653ec92 100644 --- a/site/public/datasets/brainwash/index.html +++ b/site/public/datasets/brainwash/index.html @@ -5,7 +5,7 @@ - + diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html index 24789730..0c164b6a 100644 --- a/site/public/datasets/duke_mtmc/index.html +++ b/site/public/datasets/duke_mtmc/index.html @@ -5,7 +5,7 @@ - + diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html index 38d2960d..ffe24671 100644 --- a/site/public/datasets/index.html +++ b/site/public/datasets/index.html @@ -5,7 +5,7 @@ - + @@ -37,7 +37,7 @@

Face Recognition Datasets

-

Explore face recognition datasets contributing the growing crisis of authoritarian biometric surveillance technologies. This first group of datasets focuses usage connected to foreign surveillance companies and defense organizations.

+

Explore face recognition datasets contributing to the growing crisis of authoritarian biometric surveillance technologies. This first group of 5 datasets focuses on image usage connected to foreign surveillance and defense organizations.

@@ -49,7 +49,7 @@
- Brainwash + Brainwash Dataset
2015
Head detection
@@ -61,7 +61,7 @@
- Duke MTMC + Duke MTMC Dataset
2016
Person re-identification, multi-camera tracking
@@ -73,7 +73,7 @@
- Microsoft Celeb + Microsoft Celeb Dataset
2016
Large-scale face recognition
@@ -85,7 +85,7 @@
- Oxford Town Centre + Oxford Town Centre Dataset
2009
Person detection, gaze estimation
@@ -97,7 +97,7 @@

Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the dataset in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images, and to use this dataset to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data". 1

-

These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, and especially journalists, maintaining an online presence is mandatory. This fact should not allow Microsoft or anyone else to use their biometrics for research and development of surveillance technology. Many names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name a few.

+

Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the dataset in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data". 1
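For readers who want to verify these counts against a downloaded copy of the dataset, a minimal sketch follows. It assumes the commonly described layout of the aligned-face release: a tab-separated file (assumed here to be named FaceImageCroppedWithAlignment.tsv) whose columns are Freebase MID, image search rank, image URL, page URL, face ID, face rectangle, and a base64-encoded face crop. The file name and column order are assumptions, not confirmed by this page, and may need adjusting to match the actual release.

```python
# Minimal sketch (not an official tool): tally identities and face images in an
# MS-Celeb-1M style TSV release and decode a few crops to sanity-check the format.
# Assumed column order (an assumption about the aligned-face release):
#   MID \t ImageSearchRank \t ImageURL \t PageURL \t FaceID \t FaceRect \t FaceData(base64)
import base64
import csv
from collections import Counter

csv.field_size_limit(10 ** 8)  # each row carries a whole JPEG crop as base64 text

identities = Counter()
with open("FaceImageCroppedWithAlignment.tsv", newline="", encoding="utf-8") as f:
    for i, row in enumerate(csv.reader(f, delimiter="\t")):
        if len(row) < 7:
            continue  # skip malformed rows rather than guessing their layout
        mid, _rank, _img_url, _page_url, face_id, _rect, face_b64 = row[:7]
        identities[mid] += 1
        if i < 3:  # decode a few crops to confirm the assumed column order
            with open(f"sample_{mid}_{face_id}.jpg", "wb") as out:
                out.write(base64.b64decode(face_b64))

print(f"{sum(identities.values())} face images across {len(identities)} identities")
```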

+

These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, and especially journalists, maintaining an online presence is mandatory. This fact should not allow Microsoft or anyone else to use their biometrics for research and development of surveillance technology. The target list even includes people critical of the very technology that Microsoft is using their names and biometric information to build: digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name a few.

Microsoft's 1 Million Target List

-

Below is a selection of names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from msceleb.org. You can email msceleb@microsoft.com to have your name removed. Names appearing with * indicate that Microsoft also distributed images.

+

Below is a selection of 24 names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from msceleb.org. You can email msceleb@microsoft.com to have your name removed. Names appearing with * indicate that Microsoft also distributed images of that person.
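If you want to check whether your own name appears in the target list before emailing Microsoft, a minimal sketch is shown below. It assumes the list downloaded from msceleb.org is a UTF-8 text file with one tab-separated entry per line, a Freebase MID followed by a name; the file name used here (Top1M_MidList.Name.tsv) and that layout are assumptions, so adjust the parsing to whatever the downloaded file actually contains.

```python
# Minimal sketch (not an official tool): search the MS Celeb target list for a name.
# Assumes one "<Freebase MID>\t<name>" entry per line; the file name below is an
# assumption about the msceleb.org download.
import sys

query = " ".join(sys.argv[1:]).strip().lower()
if not query:
    sys.exit("usage: python check_name.py <name to search for>")

with open("Top1M_MidList.Name.tsv", encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip lines that don't match the assumed two-column layout
        mid, name = parts[0], parts[1]
        if query in name.lower():
            print(f"{mid}\t{name}")
```

Running, for example, `python check_name.py Jane Doe` would print any matching entries together with their Freebase MIDs.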

@@ -112,6 +112,10 @@ + + + +
Name | Hito Steyerl | Artist, writer
James Risen | Journalist
@@ -122,10 +126,6 @@ - - - - @@ -158,6 +158,14 @@ + + + + + + + + @@ -167,8 +175,8 @@
James Risen | Journalist
Jeremy Scahill* | Journalist
Artist
Michael Anti | Political blogger
Manal al-Sharif* | Women's rights activist
Shoshana Zuboff | Author, academic
-

After publishing this list, researchers from Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the the MS Celeb dataset for their research paper on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.

-

In an article published by Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now". Adding, that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]". 2

+

After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the MS Celeb dataset for their research paper on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.

+

In an April 10, 2019 article published by the Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now", adding that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]". 2

Four more papers published by SenseTime that also use the MS Celeb dataset raise similar flags. SenseTime is a computer vision surveillance company that until April 2019 provided surveillance technology to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province, and had been flagged numerous times as having potential links to human rights violations.

One of the 4 SenseTime papers, "Exploring Disentangled Feature Representation Beyond Face Identification", shows how SenseTime was developing automated face analysis technology to infer race, narrow eyes, nose size, and chin size, all of which could be used to target vulnerable ethnic groups based on their facial appearances.

Earlier in 2019, Microsoft President and Chief Legal Officer Brad Smith called for governmental regulation of face recognition, citing the potential for misuse, a rare admission that Microsoft's surveillance-driven business model had lost its bearings. More recently Smith also announced that Microsoft would seemingly take a stand against such potential misuse and had decided not to sell face recognition to an unnamed United States agency, citing a lack of accuracy: the software was not suitable to be used on minorities because it was trained mostly on white male faces.

diff --git a/site/public/datasets/oxford_town_centre/index.html b/site/public/datasets/oxford_town_centre/index.html index cada5dd4..50859604 100644 --- a/site/public/datasets/oxford_town_centre/index.html +++ b/site/public/datasets/oxford_town_centre/index.html @@ -5,7 +5,7 @@ - + diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html index 3ff4a345..adb411c6 100644 --- a/site/public/datasets/uccs/index.html +++ b/site/public/datasets/uccs/index.html @@ -5,7 +5,7 @@ - + -- cgit v1.2.3-70-g09d2 From 8b9eb9f89f3eb750218577d83c8621b286004090 Mon Sep 17 00:00:00 2001 From: adamhrv Date: Wed, 24 Apr 2019 13:52:02 +0200 Subject: add favicon --- site/assets/img/favicon/android-icon-144x144.png | Bin 0 -> 1004 bytes site/assets/img/favicon/android-icon-192x192.png | Bin 0 -> 504 bytes site/assets/img/favicon/android-icon-36x36.png | Bin 0 -> 848 bytes site/assets/img/favicon/android-icon-48x48.png | Bin 0 -> 887 bytes site/assets/img/favicon/android-icon-72x72.png | Bin 0 -> 881 bytes site/assets/img/favicon/android-icon-96x96.png | Bin 0 -> 925 bytes site/assets/img/favicon/apple-icon-114x114.png | Bin 0 -> 956 bytes site/assets/img/favicon/apple-icon-120x120.png | Bin 0 -> 962 bytes site/assets/img/favicon/apple-icon-144x144.png | Bin 0 -> 1004 bytes site/assets/img/favicon/apple-icon-152x152.png | Bin 0 -> 1069 bytes site/assets/img/favicon/apple-icon-180x180.png | Bin 0 -> 1112 bytes site/assets/img/favicon/apple-icon-57x57.png | Bin 0 -> 859 bytes site/assets/img/favicon/apple-icon-60x60.png | Bin 0 -> 859 bytes site/assets/img/favicon/apple-icon-72x72.png | Bin 0 -> 881 bytes site/assets/img/favicon/apple-icon-76x76.png | Bin 0 -> 871 bytes site/assets/img/favicon/apple-icon-precomposed.png | Bin 0 -> 1076 bytes site/assets/img/favicon/apple-icon.png | Bin 0 -> 1076 bytes site/assets/img/favicon/browserconfig.xml | 2 + site/assets/img/favicon/favicon-16x16.png | Bin 0 -> 779 bytes site/assets/img/favicon/favicon-32x32.png | Bin 0 -> 837 bytes site/assets/img/favicon/favicon-96x96.png | Bin 0 -> 925 bytes site/assets/img/favicon/favicon.ico | Bin 0 -> 1150 bytes site/assets/img/favicon/manifest.json | 41 +++++++++++++++ site/assets/img/favicon/ms-icon-144x144.png | Bin 0 -> 1004 bytes site/assets/img/favicon/ms-icon-150x150.png | Bin 0 -> 1018 bytes site/assets/img/favicon/ms-icon-310x310.png | Bin 0 -> 1825 bytes site/assets/img/favicon/ms-icon-70x70.png | Bin 0 -> 878 bytes site/assets/img/megapixels_logo_black.png | Bin 0 -> 1269 bytes site/assets/img/megapixels_logo_black.svg | 58 +++++++++++++++++++++ site/public/about/assets/LICENSE/index.html | 19 +++++++ site/public/about/attribution/index.html | 19 +++++++ site/public/about/index.html | 19 +++++++ site/public/about/legal/index.html | 19 +++++++ site/public/about/press/index.html | 19 +++++++ site/public/datasets/brainwash/index.html | 19 +++++++ site/public/datasets/duke_mtmc/index.html | 19 +++++++ site/public/datasets/hrt_transgender/index.html | 19 +++++++ site/public/datasets/ijb_c/index.html | 19 +++++++ site/public/datasets/index.html | 19 +++++++ site/public/datasets/msceleb/index.html | 19 +++++++ site/public/datasets/oxford_town_centre/index.html | 19 +++++++ site/public/datasets/uccs/index.html | 19 +++++++ site/public/index.html | 19 ++++++- site/public/info/index.html | 19 +++++++ site/public/research/00_introduction/index.html | 19 +++++++ .../research/01_from_1_to_100_pixels/index.html | 19 +++++++ .../research/02_what_computers_can_see/index.html | 19 +++++++ 
site/public/research/index.html | 19 +++++++ site/public/test/chart/index.html | 19 +++++++ site/public/test/citations/index.html | 19 +++++++ site/public/test/csv/index.html | 19 +++++++ site/public/test/datasets/index.html | 19 +++++++ site/public/test/face_search/index.html | 19 +++++++ site/public/test/gallery/index.html | 19 +++++++ site/public/test/index.html | 19 +++++++ site/public/test/map/index.html | 19 +++++++ site/public/test/name_search/index.html | 19 +++++++ site/public/test/pie_chart/index.html | 19 +++++++ 58 files changed, 651 insertions(+), 1 deletion(-) create mode 100644 site/assets/img/favicon/android-icon-144x144.png create mode 100644 site/assets/img/favicon/android-icon-192x192.png create mode 100644 site/assets/img/favicon/android-icon-36x36.png create mode 100644 site/assets/img/favicon/android-icon-48x48.png create mode 100644 site/assets/img/favicon/android-icon-72x72.png create mode 100644 site/assets/img/favicon/android-icon-96x96.png create mode 100644 site/assets/img/favicon/apple-icon-114x114.png create mode 100644 site/assets/img/favicon/apple-icon-120x120.png create mode 100644 site/assets/img/favicon/apple-icon-144x144.png create mode 100644 site/assets/img/favicon/apple-icon-152x152.png create mode 100644 site/assets/img/favicon/apple-icon-180x180.png create mode 100644 site/assets/img/favicon/apple-icon-57x57.png create mode 100644 site/assets/img/favicon/apple-icon-60x60.png create mode 100644 site/assets/img/favicon/apple-icon-72x72.png create mode 100644 site/assets/img/favicon/apple-icon-76x76.png create mode 100644 site/assets/img/favicon/apple-icon-precomposed.png create mode 100644 site/assets/img/favicon/apple-icon.png create mode 100644 site/assets/img/favicon/browserconfig.xml create mode 100644 site/assets/img/favicon/favicon-16x16.png create mode 100644 site/assets/img/favicon/favicon-32x32.png create mode 100644 site/assets/img/favicon/favicon-96x96.png create mode 100644 site/assets/img/favicon/favicon.ico create mode 100644 site/assets/img/favicon/manifest.json create mode 100644 site/assets/img/favicon/ms-icon-144x144.png create mode 100644 site/assets/img/favicon/ms-icon-150x150.png create mode 100644 site/assets/img/favicon/ms-icon-310x310.png create mode 100644 site/assets/img/favicon/ms-icon-70x70.png create mode 100644 site/assets/img/megapixels_logo_black.png create mode 100644 site/assets/img/megapixels_logo_black.svg (limited to 'site/public') diff --git a/site/assets/img/favicon/android-icon-144x144.png b/site/assets/img/favicon/android-icon-144x144.png new file mode 100644 index 00000000..341ece89 Binary files /dev/null and b/site/assets/img/favicon/android-icon-144x144.png differ diff --git a/site/assets/img/favicon/android-icon-192x192.png b/site/assets/img/favicon/android-icon-192x192.png new file mode 100644 index 00000000..de4c5e42 Binary files /dev/null and b/site/assets/img/favicon/android-icon-192x192.png differ diff --git a/site/assets/img/favicon/android-icon-36x36.png b/site/assets/img/favicon/android-icon-36x36.png new file mode 100644 index 00000000..d1929dfd Binary files /dev/null and b/site/assets/img/favicon/android-icon-36x36.png differ diff --git a/site/assets/img/favicon/android-icon-48x48.png b/site/assets/img/favicon/android-icon-48x48.png new file mode 100644 index 00000000..433c679c Binary files /dev/null and b/site/assets/img/favicon/android-icon-48x48.png differ diff --git a/site/assets/img/favicon/android-icon-72x72.png b/site/assets/img/favicon/android-icon-72x72.png new file mode 100644 index 
00000000..3254f659 Binary files /dev/null and b/site/assets/img/favicon/android-icon-72x72.png differ diff --git a/site/assets/img/favicon/android-icon-96x96.png b/site/assets/img/favicon/android-icon-96x96.png new file mode 100644 index 00000000..8bab32b4 Binary files /dev/null and b/site/assets/img/favicon/android-icon-96x96.png differ diff --git a/site/assets/img/favicon/apple-icon-114x114.png b/site/assets/img/favicon/apple-icon-114x114.png new file mode 100644 index 00000000..54c91340 Binary files /dev/null and b/site/assets/img/favicon/apple-icon-114x114.png differ diff --git a/site/assets/img/favicon/apple-icon-120x120.png b/site/assets/img/favicon/apple-icon-120x120.png new file mode 100644 index 00000000..78fa8236 Binary files /dev/null and b/site/assets/img/favicon/apple-icon-120x120.png differ diff --git a/site/assets/img/favicon/apple-icon-144x144.png b/site/assets/img/favicon/apple-icon-144x144.png new file mode 100644 index 00000000..341ece89 Binary files /dev/null and b/site/assets/img/favicon/apple-icon-144x144.png differ diff --git a/site/assets/img/favicon/apple-icon-152x152.png b/site/assets/img/favicon/apple-icon-152x152.png new file mode 100644 index 00000000..1f0d9e00 Binary files /dev/null and b/site/assets/img/favicon/apple-icon-152x152.png differ diff --git a/site/assets/img/favicon/apple-icon-180x180.png b/site/assets/img/favicon/apple-icon-180x180.png new file mode 100644 index 00000000..ab9d8f82 Binary files /dev/null and b/site/assets/img/favicon/apple-icon-180x180.png differ diff --git a/site/assets/img/favicon/apple-icon-57x57.png b/site/assets/img/favicon/apple-icon-57x57.png new file mode 100644 index 00000000..4a45472b Binary files /dev/null and b/site/assets/img/favicon/apple-icon-57x57.png differ diff --git a/site/assets/img/favicon/apple-icon-60x60.png b/site/assets/img/favicon/apple-icon-60x60.png new file mode 100644 index 00000000..631aeb61 Binary files /dev/null and b/site/assets/img/favicon/apple-icon-60x60.png differ diff --git a/site/assets/img/favicon/apple-icon-72x72.png b/site/assets/img/favicon/apple-icon-72x72.png new file mode 100644 index 00000000..3254f659 Binary files /dev/null and b/site/assets/img/favicon/apple-icon-72x72.png differ diff --git a/site/assets/img/favicon/apple-icon-76x76.png b/site/assets/img/favicon/apple-icon-76x76.png new file mode 100644 index 00000000..b31e92b5 Binary files /dev/null and b/site/assets/img/favicon/apple-icon-76x76.png differ diff --git a/site/assets/img/favicon/apple-icon-precomposed.png b/site/assets/img/favicon/apple-icon-precomposed.png new file mode 100644 index 00000000..188d62bf Binary files /dev/null and b/site/assets/img/favicon/apple-icon-precomposed.png differ diff --git a/site/assets/img/favicon/apple-icon.png b/site/assets/img/favicon/apple-icon.png new file mode 100644 index 00000000..188d62bf Binary files /dev/null and b/site/assets/img/favicon/apple-icon.png differ diff --git a/site/assets/img/favicon/browserconfig.xml b/site/assets/img/favicon/browserconfig.xml new file mode 100644 index 00000000..c5541482 --- /dev/null +++ b/site/assets/img/favicon/browserconfig.xml @@ -0,0 +1,2 @@ + +#ffffff \ No newline at end of file diff --git a/site/assets/img/favicon/favicon-16x16.png b/site/assets/img/favicon/favicon-16x16.png new file mode 100644 index 00000000..9db00ab0 Binary files /dev/null and b/site/assets/img/favicon/favicon-16x16.png differ diff --git a/site/assets/img/favicon/favicon-32x32.png b/site/assets/img/favicon/favicon-32x32.png new file mode 100644 index 00000000..0e7082ce 
Binary files /dev/null and b/site/assets/img/favicon/favicon-32x32.png differ diff --git a/site/assets/img/favicon/favicon-96x96.png b/site/assets/img/favicon/favicon-96x96.png new file mode 100644 index 00000000..8bab32b4 Binary files /dev/null and b/site/assets/img/favicon/favicon-96x96.png differ diff --git a/site/assets/img/favicon/favicon.ico b/site/assets/img/favicon/favicon.ico new file mode 100644 index 00000000..93bc04b4 Binary files /dev/null and b/site/assets/img/favicon/favicon.ico differ diff --git a/site/assets/img/favicon/manifest.json b/site/assets/img/favicon/manifest.json new file mode 100644 index 00000000..013d4a6a --- /dev/null +++ b/site/assets/img/favicon/manifest.json @@ -0,0 +1,41 @@ +{ + "name": "App", + "icons": [ + { + "src": "\/android-icon-36x36.png", + "sizes": "36x36", + "type": "image\/png", + "density": "0.75" + }, + { + "src": "\/android-icon-48x48.png", + "sizes": "48x48", + "type": "image\/png", + "density": "1.0" + }, + { + "src": "\/android-icon-72x72.png", + "sizes": "72x72", + "type": "image\/png", + "density": "1.5" + }, + { + "src": "\/android-icon-96x96.png", + "sizes": "96x96", + "type": "image\/png", + "density": "2.0" + }, + { + "src": "\/android-icon-144x144.png", + "sizes": "144x144", + "type": "image\/png", + "density": "3.0" + }, + { + "src": "\/android-icon-192x192.png", + "sizes": "192x192", + "type": "image\/png", + "density": "4.0" + } + ] +} \ No newline at end of file diff --git a/site/assets/img/favicon/ms-icon-144x144.png b/site/assets/img/favicon/ms-icon-144x144.png new file mode 100644 index 00000000..341ece89 Binary files /dev/null and b/site/assets/img/favicon/ms-icon-144x144.png differ diff --git a/site/assets/img/favicon/ms-icon-150x150.png b/site/assets/img/favicon/ms-icon-150x150.png new file mode 100644 index 00000000..452f3971 Binary files /dev/null and b/site/assets/img/favicon/ms-icon-150x150.png differ diff --git a/site/assets/img/favicon/ms-icon-310x310.png b/site/assets/img/favicon/ms-icon-310x310.png new file mode 100644 index 00000000..78f5888a Binary files /dev/null and b/site/assets/img/favicon/ms-icon-310x310.png differ diff --git a/site/assets/img/favicon/ms-icon-70x70.png b/site/assets/img/favicon/ms-icon-70x70.png new file mode 100644 index 00000000..df71e3ea Binary files /dev/null and b/site/assets/img/favicon/ms-icon-70x70.png differ diff --git a/site/assets/img/megapixels_logo_black.png b/site/assets/img/megapixels_logo_black.png new file mode 100644 index 00000000..64bc6659 Binary files /dev/null and b/site/assets/img/megapixels_logo_black.png differ diff --git a/site/assets/img/megapixels_logo_black.svg b/site/assets/img/megapixels_logo_black.svg new file mode 100644 index 00000000..8eb610e7 --- /dev/null +++ b/site/assets/img/megapixels_logo_black.svg @@ -0,0 +1,58 @@ + + + +image/svg+xml \ No newline at end of file diff --git a/site/public/about/assets/LICENSE/index.html b/site/public/about/assets/LICENSE/index.html index bb04213b..c0d5a7e0 100644 --- a/site/public/about/assets/LICENSE/index.html +++ b/site/public/about/assets/LICENSE/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/about/attribution/index.html b/site/public/about/attribution/index.html index 4105847e..aec1b0f8 100644 --- a/site/public/about/attribution/index.html +++ b/site/public/about/attribution/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/about/index.html b/site/public/about/index.html index 17b63704..9b3e455b 100644 --- 
a/site/public/about/index.html +++ b/site/public/about/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/about/legal/index.html b/site/public/about/legal/index.html index 06dca21e..41b74351 100644 --- a/site/public/about/legal/index.html +++ b/site/public/about/legal/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/about/press/index.html b/site/public/about/press/index.html index 1f068391..febc1256 100644 --- a/site/public/about/press/index.html +++ b/site/public/about/press/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html index 4653ec92..8ae6b122 100644 --- a/site/public/datasets/brainwash/index.html +++ b/site/public/datasets/brainwash/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html index 0c164b6a..24ee6cc2 100644 --- a/site/public/datasets/duke_mtmc/index.html +++ b/site/public/datasets/duke_mtmc/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/datasets/hrt_transgender/index.html b/site/public/datasets/hrt_transgender/index.html index 92da8352..1dde3ded 100644 --- a/site/public/datasets/hrt_transgender/index.html +++ b/site/public/datasets/hrt_transgender/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/datasets/ijb_c/index.html b/site/public/datasets/ijb_c/index.html index 36e4fabf..3bc23ca5 100644 --- a/site/public/datasets/ijb_c/index.html +++ b/site/public/datasets/ijb_c/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html index ffe24671..b5fe52ed 100644 --- a/site/public/datasets/index.html +++ b/site/public/datasets/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html index 4d41a4c0..f1d59366 100644 --- a/site/public/datasets/msceleb/index.html +++ b/site/public/datasets/msceleb/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/datasets/oxford_town_centre/index.html b/site/public/datasets/oxford_town_centre/index.html index 50859604..0cf55b5c 100644 --- a/site/public/datasets/oxford_town_centre/index.html +++ b/site/public/datasets/oxford_town_centre/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html index adb411c6..b5ceebd3 100644 --- a/site/public/datasets/uccs/index.html +++ b/site/public/datasets/uccs/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/index.html b/site/public/index.html index bd99bed6..81e48a1b 100644 --- a/site/public/index.html +++ b/site/public/index.html @@ -9,7 +9,6 @@ - @@ -17,6 +16,24 @@ + + + + + + + + + + + + + + + + + + diff --git a/site/public/info/index.html b/site/public/info/index.html index ddea9617..3bc9a82d 100644 --- a/site/public/info/index.html +++ b/site/public/info/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/research/00_introduction/index.html b/site/public/research/00_introduction/index.html index 6e9597e5..64635c55 100644 --- a/site/public/research/00_introduction/index.html +++ 
b/site/public/research/00_introduction/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/research/01_from_1_to_100_pixels/index.html b/site/public/research/01_from_1_to_100_pixels/index.html index 0cb36584..7b86f5ef 100644 --- a/site/public/research/01_from_1_to_100_pixels/index.html +++ b/site/public/research/01_from_1_to_100_pixels/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/research/02_what_computers_can_see/index.html b/site/public/research/02_what_computers_can_see/index.html index 3ed75233..c949875c 100644 --- a/site/public/research/02_what_computers_can_see/index.html +++ b/site/public/research/02_what_computers_can_see/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/research/index.html b/site/public/research/index.html index 0ceb321f..01df017a 100644 --- a/site/public/research/index.html +++ b/site/public/research/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/chart/index.html b/site/public/test/chart/index.html index e50eb6ad..2613fe52 100644 --- a/site/public/test/chart/index.html +++ b/site/public/test/chart/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/citations/index.html b/site/public/test/citations/index.html index 76f8fc30..22edf880 100644 --- a/site/public/test/citations/index.html +++ b/site/public/test/citations/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/csv/index.html b/site/public/test/csv/index.html index 62475e98..9c9c6863 100644 --- a/site/public/test/csv/index.html +++ b/site/public/test/csv/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/datasets/index.html b/site/public/test/datasets/index.html index 3ef3952c..7d50e1df 100644 --- a/site/public/test/datasets/index.html +++ b/site/public/test/datasets/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/face_search/index.html b/site/public/test/face_search/index.html index 79bcc264..3748f4fb 100644 --- a/site/public/test/face_search/index.html +++ b/site/public/test/face_search/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/gallery/index.html b/site/public/test/gallery/index.html index eeb48500..7367ba74 100644 --- a/site/public/test/gallery/index.html +++ b/site/public/test/gallery/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/index.html b/site/public/test/index.html index 7c368909..0c549571 100644 --- a/site/public/test/index.html +++ b/site/public/test/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/map/index.html b/site/public/test/map/index.html index cec4fa29..fe25a4e2 100644 --- a/site/public/test/map/index.html +++ b/site/public/test/map/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/name_search/index.html b/site/public/test/name_search/index.html index 7e793e9e..cf4982b8 100644 --- a/site/public/test/name_search/index.html +++ b/site/public/test/name_search/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + diff --git a/site/public/test/pie_chart/index.html b/site/public/test/pie_chart/index.html index f982e476..2d3dcb9e 100644 --- a/site/public/test/pie_chart/index.html +++ 
b/site/public/test/pie_chart/index.html @@ -14,6 +14,25 @@ + + + + + + + + + + + + + + + + + + + -- cgit v1.2.3-70-g09d2 From dcbe971121734dfd1964d151200b4d9db714adba Mon Sep 17 00:00:00 2001 From: adamhrv Date: Wed, 24 Apr 2019 16:35:36 +0200 Subject: fix about/press link --- site/public/about/attribution/index.html | 2 +- site/public/about/index.html | 2 +- site/public/about/legal/index.html | 2 +- site/public/about/press/index.html | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) (limited to 'site/public') diff --git a/site/public/about/attribution/index.html b/site/public/about/attribution/index.html index aec1b0f8..29c220e4 100644 --- a/site/public/about/attribution/index.html +++ b/site/public/about/attribution/index.html @@ -57,7 +57,7 @@
diff --git a/site/public/about/index.html b/site/public/about/index.html index 9b3e455b..288aa2aa 100644 --- a/site/public/about/index.html +++ b/site/public/about/index.html @@ -57,7 +57,7 @@
diff --git a/site/public/about/legal/index.html b/site/public/about/legal/index.html index 41b74351..5b34c319 100644 --- a/site/public/about/legal/index.html +++ b/site/public/about/legal/index.html @@ -57,7 +57,7 @@
diff --git a/site/public/about/press/index.html b/site/public/about/press/index.html index febc1256..57a07449 100644 --- a/site/public/about/press/index.html +++ b/site/public/about/press/index.html @@ -57,7 +57,7 @@
-- cgit v1.2.3-70-g09d2 From ac65f124a415e1b2033da3a16ca1e135d08fbeb1 Mon Sep 17 00:00:00 2001 From: Jules Laplace Date: Fri, 26 Apr 2019 19:22:27 +0200 Subject: add better share image --- site/assets/img/megapixels-share.png | Bin 0 -> 102573 bytes site/public/about/attribution/index.html | 2 +- site/public/about/index.html | 2 +- site/public/about/legal/index.html | 2 +- site/public/about/press/index.html | 2 +- site/public/index.html | 1 + site/templates/home.html | 3 ++- 7 files changed, 7 insertions(+), 5 deletions(-) create mode 100644 site/assets/img/megapixels-share.png (limited to 'site/public') diff --git a/site/assets/img/megapixels-share.png b/site/assets/img/megapixels-share.png new file mode 100644 index 00000000..a40dfaca Binary files /dev/null and b/site/assets/img/megapixels-share.png differ diff --git a/site/public/about/attribution/index.html b/site/public/about/attribution/index.html index 29c220e4..aec1b0f8 100644 --- a/site/public/about/attribution/index.html +++ b/site/public/about/attribution/index.html @@ -57,7 +57,7 @@
diff --git a/site/public/about/index.html b/site/public/about/index.html index 288aa2aa..9b3e455b 100644 --- a/site/public/about/index.html +++ b/site/public/about/index.html @@ -57,7 +57,7 @@
diff --git a/site/public/about/legal/index.html b/site/public/about/legal/index.html index 5b34c319..41b74351 100644 --- a/site/public/about/legal/index.html +++ b/site/public/about/legal/index.html @@ -57,7 +57,7 @@
diff --git a/site/public/about/press/index.html b/site/public/about/press/index.html index 57a07449..febc1256 100644 --- a/site/public/about/press/index.html +++ b/site/public/about/press/index.html @@ -57,7 +57,7 @@
diff --git a/site/public/index.html b/site/public/index.html index 81e48a1b..4c25f476 100644 --- a/site/public/index.html +++ b/site/public/index.html @@ -9,6 +9,7 @@ + diff --git a/site/templates/home.html b/site/templates/home.html index 81e48a1b..0e7b2167 100644 --- a/site/templates/home.html +++ b/site/templates/home.html @@ -9,6 +9,7 @@ + @@ -73,4 +74,4 @@ - \ No newline at end of file + -- cgit v1.2.3-70-g09d2 From 98385977e777fa18019d975ad160cc5725e9001d Mon Sep 17 00:00:00 2001 From: adamhrv Date: Thu, 2 May 2019 19:57:21 +0200 Subject: fix typos --- security.md | 6 ++ site/content/_drafts_/adience/index.md | 32 +++++++++ site/content/_drafts_/ibm_dif/index.md | 28 ++++++++ site/content/_drafts_/megaface/index.md | 49 ++++++++++++++ site/content/pages/about/attribution.md | 2 +- site/content/pages/about/index.md | 2 +- site/content/pages/about/legal.md | 4 +- site/content/pages/about/press.md | 2 +- site/content/pages/datasets/ijb_c/index.md | 9 +++ .../content/pages/datasets/msceleb/assets/notes.md | 3 + site/content/pages/datasets/msceleb/index.md | 40 ++++++------ site/content/pages/datasets/uccs/assets/notes.md | 5 ++ .../pages/research/00_introduction/index.md | 11 ++++ .../research/01_from_1_to_100_pixels/index.md | 5 ++ site/public/about/legal/index.html | 2 +- site/public/datasets/ijb_c/index.html | 6 +- .../datasets/msceleb/assets/notes/index.html | 75 ++++++++++++++++++++++ site/public/datasets/msceleb/index.html | 46 ++++++------- site/public/datasets/uccs/assets/notes/index.html | 75 ++++++++++++++++++++++ site/public/index.html | 6 +- site/public/research/00_introduction/index.html | 9 +++ .../research/01_from_1_to_100_pixels/index.html | 4 ++ site/templates/home.html | 6 +- todo.md | 35 ++++++---- 24 files changed, 393 insertions(+), 69 deletions(-) create mode 100644 security.md create mode 100644 site/content/_drafts_/adience/index.md create mode 100644 site/content/_drafts_/ibm_dif/index.md create mode 100644 site/content/_drafts_/megaface/index.md create mode 100644 site/content/pages/datasets/msceleb/assets/notes.md create mode 100644 site/content/pages/datasets/uccs/assets/notes.md create mode 100644 site/public/datasets/msceleb/assets/notes/index.html create mode 100644 site/public/datasets/uccs/assets/notes/index.html (limited to 'site/public') diff --git a/security.md b/security.md new file mode 100644 index 00000000..d0bffdb4 --- /dev/null +++ b/security.md @@ -0,0 +1,6 @@ +# MegaPixels + +### Potential Blacklist + +- 103.213.248.154 + - 5,000 hits April with unknown browser from Hong Kong around April 22 \ No newline at end of file diff --git a/site/content/_drafts_/adience/index.md b/site/content/_drafts_/adience/index.md new file mode 100644 index 00000000..60a6cd1f --- /dev/null +++ b/site/content/_drafts_/adience/index.md @@ -0,0 +1,32 @@ +------------ + +status: draft +title: Adience +desc: Adience is a ... +subdesc: Adience contains ... 
+slug: Adience +cssclass: dataset +image: assets/background.jpg +year: 2007 +published: 2019-2-23 +updated: 2019-2-23 +authors: Adam Harvey + +------------ + +## Adience Dataset + +### sidebar +### end sidebar + +[ page under development ] + +- Deep Age Estimation: From Classification to Ranking + - https://verify.megapixels.cc/paper/adience/verify/4f1249369127cc2e2894f6b2f1052d399794919a + - funded by FordMotor Company University Reserach Program +- Unconstrained Age Estimation with Deep Convolutional Neural Networks + - https://verify.megapixels.cc/paper/adience/verify/31f1e711fcf82c855f27396f181bf5e565a2f58d + - "we augment our data by sampling 1000 images for the age group of 0-20 from Adience [3]" + - the work was supported by IARPA and ODNI + +{% include 'dashboard.html' %} \ No newline at end of file diff --git a/site/content/_drafts_/ibm_dif/index.md b/site/content/_drafts_/ibm_dif/index.md new file mode 100644 index 00000000..5d72193b --- /dev/null +++ b/site/content/_drafts_/ibm_dif/index.md @@ -0,0 +1,28 @@ +------------ + +status: draft +title: IBM Diversity in Faces +desc: IBM Diversity in Faces is a person re-identification dataset of images captured at UC Santa Cruz in 2007 +subdesc: IBM Diversity in Faces contains 1,264 images and 632 persons on the UC Santa Cruz campus and is used to train person re-identification algorithms for surveillance +slug: IBM Diversity in Faces +cssclass: dataset +image: assets/background.jpg +year: 2007 +published: 2019-2-23 +updated: 2019-2-23 +authors: Adam Harvey + +------------ + +## IBM Diversity in Faces Dataset + +### sidebar +### end sidebar + +[ page under development ] + +in "Understanding Unequal Gender Classification Accuracyfrom Face Images" researcher affilliated with IBM created a new version of PPB so they didn't have to agree to the terms of the original PPB. + +>We use an approximation of the PPB dataset for the ex-periments in this paper. This dataset contains images ofparliament members from the six countries identified in[4] and were manually labeled by us into the categoriesdark-skinned and light-skinned.1Our approximation tothe PPB dataset, which we call PPB*, is very similar toPPB and satisfies the relevant characteristics for the study we perform. Table 1 compares the decomposition of theoriginal PPB dataset and our PPB* approximation accord-ing to skin type and gender. + +{% include 'dashboard.html' %} \ No newline at end of file diff --git a/site/content/_drafts_/megaface/index.md b/site/content/_drafts_/megaface/index.md new file mode 100644 index 00000000..4c7bb309 --- /dev/null +++ b/site/content/_drafts_/megaface/index.md @@ -0,0 +1,49 @@ +------------ + +status: draft +title: MegaFace +desc: MegaFace is a face recognition dataset created by scraping Flickr photo albums +subdesc: MegaFace contains 1,264 images and 632 persons on the UC Santa Cruz campus and is used to train person re-identification algorithms for surveillance +slug: MegaFace +cssclass: dataset +image: assets/background.jpg +year: 2007 +published: 2019-2-23 +updated: 2019-2-23 +authors: Adam Harvey + +------------ + +## MegaFace Dataset + +### sidebar +### end sidebar + +[ page under development ] + +*MegaFace (Viewpoint Invariant Pedestrian Recognition)* is a dataset of pedestrian images captured at University of California Santa Cruz in 2007. Accoriding to the reserachers 2 "cameras were placed in different locations in an academic setting and subjects were notified of the presence of cameras, but were not coached or instructed in any way." 
+ +MegaFace is amongst the most widely used publicly available person re-identification datasets. In 2017 the MegaFace dataset was combined into a larger person re-identification created by the Chinese University of Hong Kong called PETA (PEdesTrian Attribute). + +{% include 'dashboard.html' %} + + +### Research notes + +Dataset was used in research paper funded by SenseTime + +- https://verify.megapixels.cc/paper/megaface/verify/380d5138cadccc9b5b91c707ba0a9220b0f39271 +- x + +From "On Low-Resolution Face Recognition in the Wild:Comparisons and New Techniques" + +- Says 130,154 Flickr accounts, but I got 48,382 +- https://verify.megapixels.cc/paper/megaface/verify/841855205818d3a6d6f85ec17a22515f4f062882 + +> 2) MegaFace Challenge 2 LR subset:The MegaFace challenge 2 (MF2) training dataset [48] is the largest (in the numberof identities) publicly available facial recognition dataset, with4.7 million face images and over 672,000 identities. The MF2dataset is obtained by running the Dlib [29] face detector onimages from Flickr [68], yielding 40 million unlabeled faces across 130,154 distinct Flickr accounts. Automatic identity labeling is performed using a clustering algorithm. We per-formed a subset selection from the MegaFace Challenge 2training set with tight bounding boxes to generate a LR subsetof this dataset. Faces smaller than 50x50 pixels are gathered for each identity, and then we eliminated identities with fewer thanfive images available. This subset selection approach produced 6,700 identities and 85,344 face images in total. The extractionprocess does yield some non-face images, as does the originaldataset processing. No further data cleaning is conducted onthis subset. + +UHDB31: A Dataset for Better Understanding Face Recognitionacross Pose and Illumination Variatio + +- http://openaccess.thecvf.com/content_ICCV_2017_workshops/papers/w37/Le_UHDB31_A_Dataset_ICCV_2017_paper.pdf +- MegaFace 1 used 690,572 and 1,027,060 +- MegaFace 2 used 672,057 and 4,753,320 \ No newline at end of file diff --git a/site/content/pages/about/attribution.md b/site/content/pages/about/attribution.md index 180d87f0..5060b2d9 100644 --- a/site/content/pages/about/attribution.md +++ b/site/content/pages/about/attribution.md @@ -16,7 +16,7 @@ authors: Adam Harvey
diff --git a/site/content/pages/about/index.md b/site/content/pages/about/index.md index 0d9246ca..4cf390fc 100644 --- a/site/content/pages/about/index.md +++ b/site/content/pages/about/index.md @@ -16,7 +16,7 @@ authors: Adam Harvey
diff --git a/site/content/pages/about/legal.md b/site/content/pages/about/legal.md index 53cbca9e..e88fbb17 100644 --- a/site/content/pages/about/legal.md +++ b/site/content/pages/about/legal.md @@ -16,7 +16,7 @@ authors: Adam Harvey
@@ -37,7 +37,7 @@ In order to provide certain features of the site, some 3rd party services are ne ### Links To Other Web Sites -The MegaPixels.cc contains many links to 3rd party websites, especially in the list of citations that are provided for each dataset. This website has no control over and assumes no responsibility for, the content, privacy policies, or practices of any third party web sites or services. You acknowledge and agree that megapixels.cc shall not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods or services available on or through any such web sites or services. +The MegaPixels.cc contains many links to 3rd party websites, especially in the list of citations that are provided for each dataset. This website has no control over and assumes no responsibility for the content, privacy policies, or practices of any third party web sites or services. You acknowledge and agree that megapixels.cc (and its creators) shall not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods or services available on or through any such web sites or services. We advise you to read the terms and conditions and privacy policies of any third-party web sites or services that you visit. diff --git a/site/content/pages/about/press.md b/site/content/pages/about/press.md index a66f231d..2839bf20 100644 --- a/site/content/pages/about/press.md +++ b/site/content/pages/about/press.md @@ -16,7 +16,7 @@ authors: Adam Harvey
diff --git a/site/content/pages/datasets/ijb_c/index.md b/site/content/pages/datasets/ijb_c/index.md index 46cab323..9e3f1808 100644 --- a/site/content/pages/datasets/ijb_c/index.md +++ b/site/content/pages/datasets/ijb_c/index.md @@ -27,6 +27,15 @@ The IARPA Janus Benchmark C is a dataset created by ![caption: A visualization of the IJB-C dataset](assets/ijb_c_montage.jpg) +## Research notes + +From original papers: https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf + +Collection for the dataset began by identifying CreativeCommons subject videos, which are often more scarce thanCreative Commons subject images. Search terms that re-sulted in large quantities of person-centric videos (e.g. “in-terview”) were generated and translated into numerous lan-guages including Arabic, Korean, Swahili, and Hindi to in-crease diversity of the subject pool. Certain YouTube userswho upload well-labeled, person-centric videos, such as the World Economic Forum and the International University Sports Federation were also identified. Titles of videos per-taining to these search terms and usernames were scrapedusing the YouTube Data API and translated into English us-ing the Yandex Translate API4. Pattern matching was per-formed to extract potential names of subjects from the trans-lated titles, and these names were searched using the Wiki-data API to verify the subject’s existence and status as a public figure, and to check for Wikimedia Commons im-agery. Age, gender, and geographic region were collectedusing the Wikipedia API.Using the candidate subject names, Creative Commonsimages were scraped from Google and Wikimedia Com-mons, and Creative Commons videos were scraped fromYouTube. After images and videos of the candidate subjectwere identified, AMT Workers were tasked with validat-ing the subject’s presence throughout the video. The AMTWorkers marked segments of the video in which the subjectwas present, and key frames + + +IARPA funds Italian researcher https://www.micc.unifi.it/projects/glaivejanus/ + {% include 'dashboard.html' %} {% include 'supplementary_header.html' %} diff --git a/site/content/pages/datasets/msceleb/assets/notes.md b/site/content/pages/datasets/msceleb/assets/notes.md new file mode 100644 index 00000000..0d8900d1 --- /dev/null +++ b/site/content/pages/datasets/msceleb/assets/notes.md @@ -0,0 +1,3 @@ +## Derivative Datasets + +- Racial Faces in the Wild http://whdeng.cn/RFW/index.html \ No newline at end of file diff --git a/site/content/pages/datasets/msceleb/index.md b/site/content/pages/datasets/msceleb/index.md index f0b07557..5f48ebfd 100644 --- a/site/content/pages/datasets/msceleb/index.md +++ b/site/content/pages/datasets/msceleb/index.md @@ -2,8 +2,8 @@ status: published title: Microsoft Celeb Dataset -desc: Microsoft Celeb 1M is a target list and dataset of web images used for research and development of face recognition -subdesc: The MS Celeb dataset includes over 10 million images of about 100K people and a target list of 1 million individuals +desc: Microsoft Celeb 1M is a dataset of 10 millions faces images harvested from the Internet +subdesc: The MS Celeb dataset includes 100K people and a target list of 1 million individuals slug: msceleb cssclass: dataset image: assets/background.jpg @@ -21,66 +21,66 @@ authors: Adam Harvey Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. 
According to Microsoft Research, who created and published the [dataset](https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/) in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data".[^msceleb_orig] -These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, and especially journalists; maintaining an online presence is mandatory. This fact should not allow Microsoft nor anyone else to use their biometrics for research and development of surveillance technology. Many names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name a few. +These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, activists, and journalists; maintaining an online presence is mandatory. This fact should not allow Microsoft nor anyone else to use their biometrics for research and development of surveillance technology. Many names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name only 8 out of 1 million. ### Microsoft's 1 Million Target List -Below is a selection of 24 names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from [msceleb.org](https://www.msceleb.org). You can email msceleb@microsoft.com to have your name removed. Names appearing with * indicate that Microsoft also distributed your images. +Below is a selection of 24 names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from [msceleb.org](https://www.msceleb.org). 
You can email msceleb@microsoft.com to have your name removed. Subjects whose images were distributed by Microsoft are indicated with the total image count. No number indicates the name is only exists in target list. === columns 2 -| Name | Profession | +| Name (images) | Profession | | --- | --- | --- | | Adrian Chen | Journalist | -| Ai Weiwei* | Artist | -| Aram Bartholl | Internet artist | +| Ai Weiwei (220) | Artist, activist | +| Aram Bartholl | Conceptual artist | | Astra Taylor | Author, director, activist | -| Alexander Madrigal | Journalist | -| Bruce Schneier* | Cryptologist | +| Bruce Schneier (107) | Cryptologist | +| Cory Doctorow (104) | Blogger, journalist | | danah boyd | Data & Society founder | | Edward Felten | Former FTC Chief Technologist | -| Evgeny Morozov* | Tech writer, researcher | -| Glenn Greenwald* | Journalist, author | +| Evgeny Morozov (108) | Tech writer, researcher | +| Glenn Greenwald (86) | Journalist, author | | Hito Steyerl | Artist, writer | | James Risen | Journalist | ==== -| Name | Profession | +| Name (images) | Profession | | --- | --- | --- | -| Jeremy Scahill* | Journalist | +| Jeremy Scahill (200) | Journalist | | Jill Magid | Artist | | Jillian York | Digital rights activist | | Jonathan Zittrain | EFF board member | | Julie Brill | Former FTC Commissioner| | Kim Zetter | Journalist, author | -| Laura Poitras* | Filmmaker | +| Laura Poitras (104) | Filmmaker | | Luke DuBois | Artist | | Michael Anti | Political blogger | -| Manal al-Sharif* | Womens's rights activist | +| Manal al-Sharif (101) | Womens's rights activist | | Shoshana Zuboff | Author, academic | | Trevor Paglen | Artist, researcher | === end columns -After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the the MS Celeb dataset for their [research paper](https://www.semanticscholar.org/paper/Faces-as-Lighting-Probes-via-Unsupervised-Deep-Yi-Zhu/b301fd2fc33f24d6f75224e7c0991f4f04b64a65) on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition. +After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's [National University of Defense Technology](https://en.wikipedia.org/wiki/National_University_of_Defense_Technology) (controlled by China's Central Military Commission) and used the MS Celeb image dataset for their research paper on using "[Faces as Lighting Probes via Unsupervised Deep Highlight Extraction]((https://www.semanticscholar.org/paper/Faces-as-Lighting-Probes-via-Unsupervised-Deep-Yi-Zhu/b301fd2fc33f24d6f75224e7c0991f4f04b64a65)" with potential applications in 3D face recognition. In an April 10, 2019 [article](https://www.ft.com/content/9378e7ee-5ae6-11e9-9dde-7aedca0a081a) published by Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now". Adding, that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]".[^madhu_ft] -Four more papers published by SenseTime, which also use the MS Celeb dataset, raise similar flags. 
SenseTime is a computer vision surveillance company that until [April 2019](https://uhrp.org/news-commentary/china%E2%80%99s-sensetime-sells-out-xinjiang-security-joint-venture) provided surveillance to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province, and had been [flagged](https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html) numerous times as having potential links to human rights violations. +Four more papers published by SenseTime that also use the MS Celeb dataset raise similar flags. SenseTime is a computer vision surveillance company that until [April 2019](https://uhrp.org/news-commentary/china%E2%80%99s-sensetime-sells-out-xinjiang-security-joint-venture) provided surveillance to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province, and had been [flagged](https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html) numerous times as having potential links to human rights violations. One of the 4 SenseTime papers, "[Exploring Disentangled Feature Representation Beyond Face Identification](https://www.semanticscholar.org/paper/Exploring-Disentangled-Feature-Representation-Face-Liu-Wei/1fd5d08394a3278ef0a89639e9bfec7cb482e0bf)", shows how SenseTime was developing automated face analysis technology to infer race, narrow eyes, nose size, and chin size, all of which could be used to target vulnerable ethnic groups based on their facial appearances. -Earlier in 2019, Microsoft President and Chief Legal Officer [Brad Smith](https://blogs.microsoft.com/on-the-issues/2018/12/06/facial-recognition-its-time-for-action/) called for the governmental regulation of face recognition, citing the potential for misuse, a rare admission that Microsoft's surveillance-driven business model had lost its bearing. More recently Smith also [announced](https://www.reuters.com/article/us-microsoft-ai/microsoft-turned-down-facial-recognition-sales-on-human-rights-concerns-idUSKCN1RS2FV) that Microsoft would seemingly take a stand against such potential misuse, and had decided to not sell face recognition to an unnamed United States agency, citing a lack of accuracy. The software was not suitable to be used on minorities, because it was trained mostly on white male faces. +Earlier in 2019, Microsoft President and Chief Legal Officer [Brad Smith](https://blogs.microsoft.com/on-the-issues/2018/12/06/facial-recognition-its-time-for-action/) called for the governmental regulation of face recognition, citing the potential for misuse, a rare admission that Microsoft's surveillance-driven business model had lost its bearing. More recently Smith also [announced](https://www.reuters.com/article/us-microsoft-ai/microsoft-turned-down-facial-recognition-sales-on-human-rights-concerns-idUSKCN1RS2FV) that Microsoft would seemingly take a stand against such potential misuse, and had decided to not sell face recognition to an unnamed United States agency, citing a lack of accuracy. In effect, Microsoft's face recognition software was not suitable to be used on minorities because it was trained mostly on white male faces. What the decision to block the sale announces is not so much that Microsoft had upgraded their ethics, but that Microsoft publicly acknowledged it can't sell a data-driven product without data. In other words, Microsoft can't sell face recognition for faces they can't train on. 
-Until now, that data has been freely harvested from the Internet and packaged in training sets like MS Celeb, which are overwhelmingly [white](https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html) and [male](https://gendershades.org). Without balanced data, facial recognition contains blind spots. And without datasets like MS Celeb, the powerful yet inaccurate facial recognition services like Microsoft's Azure Cognitive Service also would not be able to see at all.
+Until now, that data has been freely harvested from the Internet and packaged in training sets like MS Celeb, which are overwhelmingly [white](https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html) and [male](https://gendershades.org). Without balanced data, facial recognition contains blind spots. And without datasets like MS Celeb, powerful yet inaccurate facial recognition services like Microsoft's Azure Cognitive Service might not exist at all.

![caption: A visualization of 2,000 of the 100,000 identities included in the image dataset distributed by Microsoft Research. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)](assets/msceleb_montage.jpg)

-Microsoft didn't only create MS Celeb for other researchers to use, they also used it internally. In a publicly available 2017 Microsoft Research project called "[One-shot Face Recognition by Promoting Underrepresented Classes](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/)," Microsoft leveraged the MS Celeb dataset to analyze their algorithms and advertise the results. Interestingly, Microsoft's [corporate version](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/) of the paper does not mention they used the MS Celeb datset, but the [open-access version](https://www.semanticscholar.org/paper/One-shot-Face-Recognition-by-Promoting-Classes-Guo/6cacda04a541d251e8221d70ac61fda88fb61a70) published on arxiv.org explicitly mentions that Microsoft Research introspected their algorithms "on the MS-Celeb-1M low-shot learning benchmark task."
+Microsoft didn't only create MS Celeb for other researchers to use; they also used it internally. In a publicly available 2017 Microsoft Research project called "[One-shot Face Recognition by Promoting Underrepresented Classes](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/)," Microsoft leveraged the MS Celeb dataset to build their algorithms and advertise the results. Interestingly, Microsoft's [corporate version](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/) of the paper does not mention they used the MS Celeb dataset, but the [open-access version](https://www.semanticscholar.org/paper/One-shot-Face-Recognition-by-Promoting-Classes-Guo/6cacda04a541d251e8221d70ac61fda88fb61a70) published on arxiv.org explicitly mentions that Microsoft Research introspected their algorithms "on the MS-Celeb-1M low-shot learning benchmark task."

We suggest that if Microsoft Research wants to make biometric data publicly available for surveillance research and development, they should start with releasing their researchers' own biometric data, instead of scraping the Internet for journalists, artists, writers, actors, athletes, musicians, and academics.
diff --git a/site/content/pages/datasets/uccs/assets/notes.md b/site/content/pages/datasets/uccs/assets/notes.md new file mode 100644 index 00000000..d248573d --- /dev/null +++ b/site/content/pages/datasets/uccs/assets/notes.md @@ -0,0 +1,5 @@ + +## Additional papers that used UCCS + +- https://verify.megapixels.cc/paper/megaface/verify/841855205818d3a6d6f85ec17a22515f4f062882 +- "we use the database subset that has assigned identities (180 identities total)." diff --git a/site/content/pages/research/00_introduction/index.md b/site/content/pages/research/00_introduction/index.md index 477679d4..ad8e2200 100644 --- a/site/content/pages/research/00_introduction/index.md +++ b/site/content/pages/research/00_introduction/index.md @@ -32,6 +32,17 @@ There is only biased feature vector clustering and probabilistic thresholding. Yesterday's [decision](https://www.reuters.com/article/us-microsoft-ai/microsoft-turned-down-facial-recognition-sales-on-human-rights-concerns-idUSKCN1RS2FV) by Brad Smith, CEO of Microsoft, to not sell facial recognition to a US law enforcement agency is not an about face by Microsoft to become more humane, it's simply a perfect illustration of the value of training data. Without data, you don't have a product to sell. Microsoft realized that doesn't have enough training data to sell +## Cost of Faces + +Univ Houston paid subjects $20/ea +http://web.archive.org/web/20170925053724/http://cbl.uh.edu/index.php/pages/research/collecting_facial_images_from_multiples_in_texas + +FaceMeta facedataset.com + +- BASIC: 15,000 images for $6,000 USD +- RECOMMENDED: 50,000 images for $12,000 USD +- ADVANCED: 100,000 images for $18,000 USD* + ## Use Your Own Biometrics First diff --git a/site/content/pages/research/01_from_1_to_100_pixels/index.md b/site/content/pages/research/01_from_1_to_100_pixels/index.md index b219dffb..ddffdf91 100644 --- a/site/content/pages/research/01_from_1_to_100_pixels/index.md +++ b/site/content/pages/research/01_from_1_to_100_pixels/index.md @@ -40,6 +40,11 @@ What can you know from a very small amount of information? - 100x100 all you need for medical diagnosis - 100x100 0.5% of one Instagram photo + +Notes: + +- Google FaceNet used images with (face?) sizes: Input sizes range from 96x96 pixels to 224x224pixels in our experiments. FaceNet: A Unified Embedding for Face Recognition and Clustering https://arxiv.org/pdf/1503.03832.pdf + Ideas: - Find specific cases of facial resolution being used in legal cases, forensic investigations, or military footage diff --git a/site/public/about/legal/index.html b/site/public/about/legal/index.html index 5b34c319..d8d81d04 100644 --- a/site/public/about/legal/index.html +++ b/site/public/about/legal/index.html @@ -68,7 +68,7 @@

3rd Party Services

In order to provide certain features of the site, some 3rd party services are needed. Currently, the MegaPixels.cc site uses two 3rd party services: (1) Leaflet.js for the interactive map and (2) Digital Ocean Spaces as a content delivery network. Both services encrypt your requests to their server using HTTPS and neither service requires storing any cookies or authentication. However, both services will store files in your web browser's local cache (local storage) to improve loading performance. None of these local storage files are used for analytics, tracking, or any similar purpose.

Links To Other Web Sites

-

The MegaPixels.cc contains many links to 3rd party websites, especially in the list of citations that are provided for each dataset. This website has no control over and assumes no responsibility for, the content, privacy policies, or practices of any third party web sites or services. You acknowledge and agree that megapixels.cc shall not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods or services available on or through any such web sites or services.

+

The MegaPixels.cc website contains many links to 3rd party websites, especially in the list of citations that are provided for each dataset. This website has no control over, and assumes no responsibility for, the content, privacy policies, or practices of any third party web sites or services. You acknowledge and agree that megapixels.cc (and its creators) shall not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such content, goods or services available on or through any such web sites or services.

We advise you to read the terms and conditions and privacy policies of any third-party web sites or services that you visit.

Information We Collect

When you access the Service, we record your visit to the site in a server log file for the purposes of maintaining site security and preventing misuse. This includes your IP address and the header information sent by your web browser which includes the User Agent, referrer, and the requested page on our site.

diff --git a/site/public/datasets/ijb_c/index.html b/site/public/datasets/ijb_c/index.html index 3bc23ca5..f58be23f 100644 --- a/site/public/datasets/ijb_c/index.html +++ b/site/public/datasets/ijb_c/index.html @@ -75,7 +75,11 @@

[ page under development ]

The IARPA Janus Benchmark C is a dataset created by

-
 A visualization of the IJB-C dataset
A visualization of the IJB-C dataset
+
 A visualization of the IJB-C dataset
A visualization of the IJB-C dataset

Research notes

+

From original papers: https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf

+

Collection for the dataset began by identifying Creative Commons subject videos, which are often more scarce than Creative Commons subject images. Search terms that resulted in large quantities of person-centric videos (e.g. "interview") were generated and translated into numerous languages including Arabic, Korean, Swahili, and Hindi to increase diversity of the subject pool. Certain YouTube users who upload well-labeled, person-centric videos, such as the World Economic Forum and the International University Sports Federation, were also identified. Titles of videos pertaining to these search terms and usernames were scraped using the YouTube Data API and translated into English using the Yandex Translate API. Pattern matching was performed to extract potential names of subjects from the translated titles, and these names were searched using the Wikidata API to verify the subject's existence and status as a public figure, and to check for Wikimedia Commons imagery. Age, gender, and geographic region were collected using the Wikipedia API. Using the candidate subject names, Creative Commons images were scraped from Google and Wikimedia Commons, and Creative Commons videos were scraped from YouTube. After images and videos of the candidate subject were identified, AMT Workers were tasked with validating the subject's presence throughout the video. The AMT Workers marked segments of the video in which the subject was present, and key frames [...]
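The collection pipeline described above runs roughly: YouTube Data API → Yandex Translate → name extraction → Wikidata verification → Wikipedia metadata → AMT validation. As a rough illustration of just the Wikidata verification step (a hypothetical sketch, not the IJB-C authors' code; it assumes the public Wikidata `wbsearchentities` endpoint and the Python `requests` package):

```python
# Hypothetical sketch: check whether candidate names resolve to Wikidata
# entities, roughly as described in the collection notes above.
import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def wikidata_matches(name, limit=5):
    """Return Wikidata entities whose labels or aliases match a candidate name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": "en",
        "format": "json",
        "limit": limit,
    }
    resp = requests.get(WIKIDATA_API, params=params, timeout=10)
    resp.raise_for_status()
    return resp.json().get("search", [])

if __name__ == "__main__":
    # Candidate names extracted from translated video titles would be
    # filtered down to those that resolve to at least one Wikidata entity
    # before any image or video scraping takes place.
    candidates = ["World Economic Forum", "Jillian York"]
    verified = {name: wikidata_matches(name) for name in candidates}
    public_figures = [name for name, hits in verified.items() if hits]
    print(public_figures)
```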

+

IARPA funds Italian researcher https://www.micc.unifi.it/projects/glaivejanus/

+

Who used IJB-C?

diff --git a/site/public/datasets/msceleb/assets/notes/index.html b/site/public/datasets/msceleb/assets/notes/index.html new file mode 100644 index 00000000..a249f08b --- /dev/null +++ b/site/public/datasets/msceleb/assets/notes/index.html @@ -0,0 +1,75 @@ + + + + MegaPixels + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ + +
MegaPixels
+ +
+ +
+
+ + + +
+ + + + + \ No newline at end of file diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html index f1d59366..aabda46c 100644 --- a/site/public/datasets/msceleb/index.html +++ b/site/public/datasets/msceleb/index.html @@ -4,7 +4,7 @@ MegaPixels - + @@ -53,7 +53,7 @@
-
Microsoft Celeb 1M is a target list and dataset of web images used for research and development of face recognition
The MS Celeb dataset includes over 10 million images of about 100K people and a target list of 1 million individuals +
Microsoft Celeb 1M is a dataset of 10 million face images harvested from the Internet
The MS Celeb dataset includes 100K people and a target list of 1 million individuals

Microsoft Celeb Dataset (MS Celeb)

Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the dataset in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data". 1

-

These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, and especially journalists; maintaining an online presence is mandatory. This fact should not allow Microsoft nor anyone else to use their biometrics for research and development of surveillance technology. Many names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name a few.

+

These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, activists, and journalists, maintaining an online presence is mandatory. This fact should not allow Microsoft, or anyone else, to use their biometrics for research and development of surveillance technology. The target list even includes many people who are critical of the very technology that Microsoft is using their names and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name only 8 out of 1 million.

Microsoft's 1 Million Target List

-

Below is a selection of 24 names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from msceleb.org. You can email msceleb@microsoft.com to have your name removed. Names appearing with * indicate that Microsoft also distributed your images.

+

Below is a selection of 24 names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from msceleb.org. You can email msceleb@microsoft.com to have your name removed. Subjects whose images were distributed by Microsoft are indicated with their total image count. No number indicates that the name only exists in the target list.

- + @@ -92,24 +92,24 @@ - - + + - + - - + + - - + + @@ -120,11 +120,11 @@ - + - + @@ -139,13 +139,13 @@
NameName (images) Profession
Journalist
Ai Weiwei*ArtistAi Weiwei (220)Artist, activist
Aram BarthollInternet artistConceptual artist
Astra Taylor Author, director, activist
Alexander MadrigalJournalistBruce Schneier (107)Cryptologist
Bruce Schneier*CryptologistCory Doctorow (104)Blogger, journalist
danah boydFormer FTC Chief Technologist
Evgeny Morozov*Evgeny Morozov (108) Tech writer, researcher
Glenn Greenwald*Glenn Greenwald (86) Journalist, author
- + - + @@ -169,7 +169,7 @@ - + @@ -181,7 +181,7 @@ - + @@ -194,14 +194,14 @@
NameName (images) Profession
Jeremy Scahill*Jeremy Scahill (200) Journalist
Journalist, author
Laura Poitras*Laura Poitras (104) Filmmaker
Political blogger
Manal al-Sharif*Manal al-Sharif (101) Womens's rights activist
-

After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the the MS Celeb dataset for their research paper on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.

+

After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the MS Celeb image dataset for their research paper on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.

In an April 10, 2019 article published by Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now". Adding, that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]". 2

-

Four more papers published by SenseTime, which also use the MS Celeb dataset, raise similar flags. SenseTime is a computer vision surveillance company that until April 2019 provided surveillance to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province, and had been flagged numerous times as having potential links to human rights violations.

+

Four more papers published by SenseTime that also use the MS Celeb dataset raise similar flags. SenseTime is a computer vision surveillance company that until April 2019 provided surveillance to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province, and had been flagged numerous times as having potential links to human rights violations.

One of the 4 SenseTime papers, "Exploring Disentangled Feature Representation Beyond Face Identification", shows how SenseTime was developing automated face analysis technology to infer race, narrow eyes, nose size, and chin size, all of which could be used to target vulnerable ethnic groups based on their facial appearances.

-

Earlier in 2019, Microsoft President and Chief Legal Officer Brad Smith called for the governmental regulation of face recognition, citing the potential for misuse, a rare admission that Microsoft's surveillance-driven business model had lost its bearing. More recently Smith also announced that Microsoft would seemingly take a stand against such potential misuse, and had decided to not sell face recognition to an unnamed United States agency, citing a lack of accuracy. The software was not suitable to be used on minorities, because it was trained mostly on white male faces.

+

Earlier in 2019, Microsoft President and Chief Legal Officer Brad Smith called for the governmental regulation of face recognition, citing the potential for misuse, a rare admission that Microsoft's surveillance-driven business model had lost its bearing. More recently Smith also announced that Microsoft would seemingly take a stand against such potential misuse, and had decided to not sell face recognition to an unnamed United States agency, citing a lack of accuracy. In effect, Microsoft's face recognition software was not suitable to be used on minorities because it was trained mostly on white male faces.

What the decision to block the sale announces is not so much that Microsoft had upgraded their ethics, but that Microsoft publicly acknowledged it can't sell a data-driven product without data. In other words, Microsoft can't sell face recognition for faces they can't train on.

-

Until now, that data has been freely harvested from the Internet and packaged in training sets like MS Celeb, which are overwhelmingly white and male. Without balanced data, facial recognition contains blind spots. And without datasets like MS Celeb, the powerful yet inaccurate facial recognition services like Microsoft's Azure Cognitive Service also would not be able to see at all.

-
 A visualization of 2,000 of the 100,000 identity included in the image dataset distributed by Microsoft Research. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
A visualization of 2,000 of the 100,000 identity included in the image dataset distributed by Microsoft Research. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)

Microsoft didn't only create MS Celeb for other researchers to use, they also used it internally. In a publicly available 2017 Microsoft Research project called "One-shot Face Recognition by Promoting Underrepresented Classes," Microsoft leveraged the MS Celeb dataset to analyze their algorithms and advertise the results. Interestingly, Microsoft's corporate version of the paper does not mention they used the MS Celeb datset, but the open-access version published on arxiv.org explicitly mentions that Microsoft Research introspected their algorithms "on the MS-Celeb-1M low-shot learning benchmark task."

+

Until now, that data has been freely harvested from the Internet and packaged in training sets like MS Celeb, which are overwhelmingly white and male. Without balanced data, facial recognition contains blind spots. And without datasets like MS Celeb, powerful yet inaccurate facial recognition services like Microsoft's Azure Cognitive Service might not exist at all.

+
 A visualization of 2,000 of the 100,000 identities included in the image dataset distributed by Microsoft Research. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
A visualization of 2,000 of the 100,000 identities included in the image dataset distributed by Microsoft Research. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)

Microsoft didn't only create MS Celeb for other researchers to use; they also used it internally. In a publicly available 2017 Microsoft Research project called "One-shot Face Recognition by Promoting Underrepresented Classes," Microsoft leveraged the MS Celeb dataset to build their algorithms and advertise the results. Interestingly, Microsoft's corporate version of the paper does not mention they used the MS Celeb dataset, but the open-access version published on arxiv.org explicitly mentions that Microsoft Research introspected their algorithms "on the MS-Celeb-1M low-shot learning benchmark task."

We suggest that if Microsoft Research wants to make biometric data publicly available for surveillance research and development, they should start with releasing their researchers' own biometric data, instead of scraping the Internet for journalists, artists, writers, actors, athletes, musicians, and academics.

Who used Microsoft Celeb?

diff --git a/site/public/datasets/uccs/assets/notes/index.html b/site/public/datasets/uccs/assets/notes/index.html new file mode 100644 index 00000000..0218d1b2 --- /dev/null +++ b/site/public/datasets/uccs/assets/notes/index.html @@ -0,0 +1,75 @@ + + + + MegaPixels + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
MegaPixels
+ +
+ +
+
+ + + +
+ + + + + \ No newline at end of file diff --git a/site/public/index.html b/site/public/index.html index 81e48a1b..9cb30060 100644 --- a/site/public/index.html +++ b/site/public/index.html @@ -1,16 +1,16 @@ - MegaPixels + MegaPixels: Face Recognition Datasets - + - + diff --git a/site/public/research/00_introduction/index.html b/site/public/research/00_introduction/index.html index 64635c55..cb6ff7a7 100644 --- a/site/public/research/00_introduction/index.html +++ b/site/public/research/00_introduction/index.html @@ -76,6 +76,15 @@

There is only biased feature vector clustering and probabilistic thresholding.

If you don't have data, you don't have a product.

Yesterday's decision by Brad Smith, President of Microsoft, to not sell facial recognition to a US law enforcement agency is not an about-face by Microsoft to become more humane; it is simply a perfect illustration of the value of training data. Without data, you don't have a product to sell. Microsoft realized that it doesn't have enough training data to sell.

+

Cost of Faces

+

Univ Houston paid subjects $20/ea +http://web.archive.org/web/20170925053724/http://cbl.uh.edu/index.php/pages/research/collecting_facial_images_from_multiples_in_texas

+

FaceMeta facedataset.com

+
    +
  • BASIC: 15,000 images for $6,000 USD
  • +
  • RECOMMENDED: 50,000 images for $12,000 USD
  • +
  • ADVANCED: 100,000 images for $18,000 USD*
  • +

Use Your Own Biometrics First

If researchers want faces, they should take selfies and create their own dataset. If researchers want images of families to build surveillance software, they should use and distribute their own family portraits.

Motivation

diff --git a/site/public/research/01_from_1_to_100_pixels/index.html b/site/public/research/01_from_1_to_100_pixels/index.html index 7b86f5ef..cc9d3f94 100644 --- a/site/public/research/01_from_1_to_100_pixels/index.html +++ b/site/public/research/01_from_1_to_100_pixels/index.html @@ -92,6 +92,10 @@
  • 100x100 all you need for medical diagnosis
  • 100x100 0.5% of one Instagram photo
  • +

    Notes:

    +
      +
• Google FaceNet used images with (face?) sizes: Input sizes range from 96x96 pixels to 224x224 pixels in our experiments. FaceNet: A Unified Embedding for Face Recognition and Clustering https://arxiv.org/pdf/1503.03832.pdf
    • +

    Ideas:

    • Find specific cases of facial resolution being used in legal cases, forensic investigations, or military footage
    • diff --git a/site/templates/home.html b/site/templates/home.html index 81e48a1b..9cb30060 100644 --- a/site/templates/home.html +++ b/site/templates/home.html @@ -1,16 +1,16 @@ - MegaPixels + MegaPixels: Face Recognition Datasets - + - + diff --git a/todo.md b/todo.md index 4586611e..dc7ebaad 100644 --- a/todo.md +++ b/todo.md @@ -2,27 +2,27 @@ ## Global -- JL/AH:U tidy up desktop css - - dataset index page - JL: mobile CSS + - lightbox/modal on mobile, close button not visible + - decrease font-size of intro header +- AH: change intro heads to match twitter word counts better +- AH: ensure one good graphic per dataset page for social sharing +- AH: add social share graphic for homepage +- AH: add press kit/downloads ## Splash -- JL: add scripted slow-slow-zoom out effect + intro anim -- AH: about 50 render heads for homepage + name list for word cloud -- AH: work on CTA overlay design intro anim -- AH: add mozilla to footer +- AH: create high quality 3d heads +- JL/AH: add IJB-C names to word cloud ## Datasets - JL: this paper isn't appearing in the UCCS list of verified papers but should be included https://arxiv.org/pdf/1708.02337.pdf -- AH: add dataset analysis for MS Celeb, IJB-C -- AH: fix dataset analysis for UCCS, brainwahs graphics -- AH: add license information to each dataset page +- AH: add dataset analysis for IJB-C, HRT Transgender, MegaFace, PIPA ## About -- x +- ok ## Flickr Analysis @@ -53,28 +53,37 @@ Collect Flickr IDs and metadata for: - yfcc_100m -## Analysis: +## FT Analysis: - [x] Brainwash - [x] Duke MTMC - [x] UCCS -- [ ] MSCeleb +- [x] MSCeleb - [ ] IJB-C (and IJB-A/B?) - [ ] HRT Transgender - [x] Town Centre +## NYT Analysis: + +- [ ] Helen +- [ ] MegaFace +- [ ] PIPA + ## Verifications - [x] Brainwash - [x] Duke MTMC +- [ ] Helen - [x] UCCS +- [ ] MegaFace - [x] MSCeleb +- [ ] PIPA - [x] IJB-C (and IJB-A/B?) - [x] HRT Transgender - [x] Town Centre ------------------- +----------- ## Datasets for next launch: -- cgit v1.2.3-70-g09d2 From 5312831a4b4e09885c0dd38c45fb3b1fa6896306 Mon Sep 17 00:00:00 2001 From: Jules Laplace Date: Wed, 15 May 2019 00:16:48 +0200 Subject: only list image count if present --- site/public/datasets/index.html | 40 ++++++++++++++++++++++++---------------- site/templates/datasets.html | 4 +++- 2 files changed, 27 insertions(+), 17 deletions(-) (limited to 'site/public') diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html index b5fe52ed..8cc5b612 100644 --- a/site/public/datasets/index.html +++ b/site/public/datasets/index.html @@ -66,62 +66,70 @@
      - + +
      Brainwash Dataset
      2015
      Head detection
      -
      11,917 images
      -
      + +
      11,917 images
      +
      - + +
      Duke MTMC Dataset
      2016
      Person re-identification, multi-camera tracking
      -
      2,000,000 images
      -
      2,700
      + +
      2,000,000 images
      +
      - + +
      Microsoft Celeb Dataset
      2016
      -
      Large-scale face recognition
      -
      1,000,000 images
      -
      100,000
      +
      Face recognition
      + +
      10,000,000 images
      +
      - + +
      Oxford Town Centre Dataset
      2009
      Person detection, gaze estimation
      -
      images
      -
      2,200
      +
      - + +
      UnConstrained College Students Dataset
      2016
      Face recognition, face detection
      -
      16,149 images
      -
      1,732
      + +
      16,149 images
      +
      diff --git a/site/templates/datasets.html b/site/templates/datasets.html index 0c3cbad1..0b1f296e 100644 --- a/site/templates/datasets.html +++ b/site/templates/datasets.html @@ -19,7 +19,9 @@
      {{ dataset.meta.dataset.year_published }}
      {{ dataset.meta.dataset.purpose }}
      -
      {{ dataset.meta.dataset.images }} images
      + {% if dataset.meta.dataset.images %} +
      {{ dataset.meta.dataset.images }} images
      + {% endif %}
      -- cgit v1.2.3-70-g09d2 From f3d56483e743f83d25b15616205dbdd49aad0382 Mon Sep 17 00:00:00 2001 From: adamhrv Date: Wed, 15 May 2019 11:24:33 +0200 Subject: . --- megapixels/commands/site/watch.py | 2 +- notes.md | 70 -------------------------- security.md | 6 --- site/content/_drafts_/lfw/index.md | 9 ++++ site/content/pages/datasets/brainwash/index.md | 16 +++--- site/content/pages/datasets/duke_mtmc/index.md | 7 +-- site/content/pages/datasets/msceleb/index.md | 2 +- site/public/about/attribution/index.html | 2 +- site/public/about/index.html | 2 +- site/public/about/legal/index.html | 2 +- site/public/about/press/index.html | 2 +- site/public/datasets/brainwash/index.html | 12 ++--- site/public/datasets/duke_mtmc/index.html | 11 +++- site/public/datasets/msceleb/index.html | 2 +- site/public/datasets/uccs/index.html | 3 -- 15 files changed, 44 insertions(+), 104 deletions(-) delete mode 100644 notes.md delete mode 100644 security.md (limited to 'site/public') diff --git a/megapixels/commands/site/watch.py b/megapixels/commands/site/watch.py index 7bd71038..d1c75c29 100644 --- a/megapixels/commands/site/watch.py +++ b/megapixels/commands/site/watch.py @@ -35,7 +35,7 @@ def cli(ctx): observer.schedule(SiteBuilder(), path=cfg.DIR_SITE_CONTENT, recursive=True) observer.start() - build_file(cfg.DIR_SITE_CONTENT + "/datasets/lfw/index.md") + #build_file(cfg.DIR_SITE_CONTENT + "/datasets/brainwash/index.md") try: while True: diff --git a/notes.md b/notes.md deleted file mode 100644 index 9dcf3da6..00000000 --- a/notes.md +++ /dev/null @@ -1,70 +0,0 @@ -PATH=/home/adam/torch/install/bin:/home/adam/anaconda3/bin:/home/adam/anaconda3/envs/megapixels/bin:/home/adam/anaconda3/bin:/home/adam/.nvm/versions/node/v9.9.0/bin:/home/adam/bin:/home/adam/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/nvidia/:/usr/local/cuda/bin:/usr/lib/nvidia/:/usr/local/cuda/bin - -PATH=/home/adam/anaconda3/bin:/home/adam/.nvm/versions/node/v9.9.0/bin:/home/adam/code/google-cloud-sdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/cuda/bin - -CUDA_HOME=/usr/local/cuda -LD_LIBRARY_PATH=/home/adam/torch/install/lib::/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64 - - - -LD_LIBRARY_PATH=/home/adam/torch/install/lib::/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64 - - -LD_LIBRARY_PATH=:/usr/local/cuda/lib64 -CUDA_HOME=/usr/local/cuda - - -TORCH_NVCC_FLAGS=-D__CUDA_NO_HALF_OPERATORS__ -TORCH_NVCC_FLAGS=-D__CUDA_NO_HALF_OPERATORS__ - - -export PATH=/usr/local/cuda/bin:"$PATH -./clean.sh -export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__" -./install.sh - -find . -name "*.JPEG" | xargs -I {} convert {} -resize "256^>" {} - -find . -name \*.png -exec identify -ping {} \; -or -exec echo {} \; -find . -name \*.jpg -exec identify -ping {} \; -or -exec rm -f {} \; - -luarocks install cudnn -luarocks install display - -scp undisclosed:/home/adam/FIDs.zip . 
-unzip -q FIDs.zip -find FIDs_NEW -name \*.jpg > list.txt -mkdir -p /work/megapixels_dev/3rdparty/art-DCGAN/fiw/images/ - -while read -r line;do dst=/work/megapixels_dev/3rdparty/art-DCGAN/hipsterwars/images/$(basename "$line");src=`pwd`/$line;ln -s $src $dst;done < list.txt - -extension="${filename##*.}" - - -filename="${filename%.*} - -for d in $(find source -type d) - do - ls $d/*.bin 1>/dev/null 2>&1 && ln -s $d/*.bin target/$(basename $d).dat;done - -gpu=0 batchSize=1 imsize=10 noisemode=linefull net=bedrooms_4_net_G.t7 th generate.lua - -DATA_ROOT=fiw dataset=folder ndf=50 ngf=150 name=fiw_01 nThreads=6 gpu=2 th main.lua - -DATA_ROOT=megaface_13 dataset=folder ndf=50 ngf=150 name=megaface_13 nThreads=6 gpu=1 th main.lua - -DATA_ROOT=hipsterwars dataset=folder ndf=50 ngf=150 name=hipsterwars nThreads=6 gpu=2 th main.lua - -export PATH=/usr/local/cuda/bin/:$PATH - -export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:$LD_LIBRARY_PATH - - -git clone https://github.com/soumith/cudnn.torch.git -b R7 && cd cudnn.torch && luarocks make cudnn-scm-1.rockspec - -scp undisclosed:/home/adam/hipsterwars_v1.0.zip . -find . -name "*.jpg" -print0 | xargs -0 mogrify -flop - -https://github.com/facebookresearch/deepcluster - -DATA_ROOT=hipsterwars dataset=folder ndf=100 ngf=200 batchSize=128 name=hipsterwars_d100_g200_b128 nThreads=8 gpu=2 th main.lua && DATA_ROOT=hipsterwars dataset=folder ndf=100 ngf=200 batchSize=64 name=hipsterwars_d100_g200_b64 nThreads=8 gpu=2 th main.lua && DATA_ROOT=hipsterwars dataset=folder ndf=100 ngf=300 batchSize=128 name=hipsterwars_d100_g300_b128 nThreads=8 gpu=2 th main.lua && DATA_ROOT=hipsterwars dataset=folder ndf=100 ngf=300 batchSize=64 name=hipsterwars_d100_g200_b64 nThreads=8 gpu=2 th main.lua \ No newline at end of file diff --git a/security.md b/security.md deleted file mode 100644 index d0bffdb4..00000000 --- a/security.md +++ /dev/null @@ -1,6 +0,0 @@ -# MegaPixels - -### Potential Blacklist - -- 103.213.248.154 - - 5,000 hits April with unknown browser from Hong Kong around April 22 \ No newline at end of file diff --git a/site/content/_drafts_/lfw/index.md b/site/content/_drafts_/lfw/index.md index 5d90e87f..ad43e2dd 100644 --- a/site/content/_drafts_/lfw/index.md +++ b/site/content/_drafts_/lfw/index.md @@ -18,6 +18,15 @@ authors: Adam Harvey ### sidebar ### end sidebar + + +## Research notes + +- Used in https://verify.megapixels.cc/paper/feret/verify/8aff9c8a0e17be91f55328e5be5e94aea5227a35https://verify.megapixels.cc/paper/feret/verify/8aff9c8a0e17be91f55328e5be5e94aea5227a35 by Raythen BBN https://en.wikipedia.org/wiki/BBN_Technologies a military contractor +----- + +## Old content + [ PAGE UNDER DEVELOPMENT ] *Labeled Faces in The Wild* (LFW) is "a database of face photographs designed for studying the problem of unconstrained face recognition[^lfw_www]. It is used to evaluate and improve the performance of facial recognition algorithms in academic, commercial, and government research. According to BiometricUpdate.com[^lfw_pingan], LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong." 
diff --git a/site/content/pages/datasets/brainwash/index.md b/site/content/pages/datasets/brainwash/index.md index 2c51a7b2..e6217a18 100644 --- a/site/content/pages/datasets/brainwash/index.md +++ b/site/content/pages/datasets/brainwash/index.md @@ -2,8 +2,8 @@ status: published title: Brainwash Dataset -desc: Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco in 2014 -subdesc: The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection surveillance algorithms +desc: Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco +subdesc: The Brainwash dataset includes 11,917 images of "everyday life of a busy downtown cafe" and is used for training head detection surveillance algorithms slug: brainwash cssclass: dataset image: assets/background.jpg @@ -19,23 +19,25 @@ authors: Adam Harvey ### sidebar ### end sidebar -Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,918 images of "everyday life of a busy downtown cafe"[^readme] captured at 100 second intervals throughout the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According the author's [research paper](https://www.semanticscholar.org/paper/End-to-End-People-Detection-in-Crowded-Scenes-Stewart-Andriluka/1bd1645a629f1b612960ab9bba276afd4cf7c666) introducing the dataset, the images were acquired with the help of Angelcam.com[^end_to_end] +Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,917 images of "everyday life of a busy downtown cafe"[^readme] captured at 100 second intervals throughout the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According the author's [research paper](https://www.semanticscholar.org/paper/End-to-End-People-Detection-in-Crowded-Scenes-Stewart-Andriluka/1bd1645a629f1b612960ab9bba276afd4cf7c666) introducing the dataset, the images were acquired with the help of Angelcam.com. [^end_to_end] The Brainwash dataset is unique because it uses images from a publicly available webcam that records people inside a privately owned business without any consent. No ordinary cafe customer would ever suspect that their image would end up in dataset used for surveillance research and development, but that is exactly what happened to customers at Brainwash cafe in San Francisco. -Although Brainwash appears to be a less popular dataset, it was notably used in 2016 and 2017 by researchers affiliated with the National University of Defense Technology in China for two [research](https://www.semanticscholar.org/paper/Localized-region-context-and-object-feature-fusion-Li-Dou/b02d31c640b0a31fb18c4f170d841d8e21ffb66c) [projects](https://www.semanticscholar.org/paper/A-Replacement-Algorithm-of-Non-Maximum-Suppression-Zhao-Wang/591a4bfa6380c9fcd5f3ae690e3ac5c09b7bf37b) on advancing the capabilities of object detection to more accurately isolate the target region in an image. [^localized_region_context] [^replacement_algorithm]. The [National University of Defense Technology](https://en.wikipedia.org/wiki/National_University_of_Defense_Technology) is controlled by China's top military body, the Central Military Commission. 
+Although Brainwash appears to be a less popular dataset, it was notably used in 2016 and 2017 by researchers affiliated with the National University of Defense Technology in China for two [research](https://www.semanticscholar.org/paper/Localized-region-context-and-object-feature-fusion-Li-Dou/b02d31c640b0a31fb18c4f170d841d8e21ffb66c) [projects](https://www.semanticscholar.org/paper/A-Replacement-Algorithm-of-Non-Maximum-Suppression-Zhao-Wang/591a4bfa6380c9fcd5f3ae690e3ac5c09b7bf37b) on advancing the capabilities of object detection to more accurately isolate the target region in an image. [^localized_region_context] [^replacement_algorithm] The [National University of Defense Technology](https://en.wikipedia.org/wiki/National_University_of_Defense_Technology) is controlled by China's top military body, the Central Military Commission. The dataset also appears in a 2017 [research paper](https://ieeexplore.ieee.org/document/7877809) from Peking University for the purpose of improving surveillance capabilities for "people detection in the crowded scenes". -![caption: Nine of 11,917 images from the the Brainwash dataset. Graphics credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)](assets/brainwash_grid.jpg) +![caption: An sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains a total of 11,917 images and 81,973 annotated heads. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)](assets/brainwash_example.jpg) + +![caption: A visualization of the active regions for 81,973 head annotations in the Brainwash dataset training partition. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)](assets/brainwash_saliency_map.jpg) + {% include 'dashboard.html' %} {% include 'supplementary_header.html' %} -![caption: An sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains a total of 11,917 images and 81,973 annotated heads. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)](assets/brainwash_example.jpg) -![caption: A visualization of the active regions for 81,973 head annotations in the Brainwash dataset training partition. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)](assets/brainwash_saliency_map.jpg) +![caption: Nine of 11,917 images from the the Brainwash dataset. Graphics credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)](assets/brainwash_grid.jpg) {% include 'cite_our_work.html' %} diff --git a/site/content/pages/datasets/duke_mtmc/index.md b/site/content/pages/datasets/duke_mtmc/index.md index 11414fd3..928c79fa 100644 --- a/site/content/pages/datasets/duke_mtmc/index.md +++ b/site/content/pages/datasets/duke_mtmc/index.md @@ -8,7 +8,7 @@ slug: duke_mtmc cssclass: dataset image: assets/background.jpg published: 2019-4-18 -updated: 2019-4-18 +updated: 2019-05-22 authors: Adam Harvey ------------ @@ -20,13 +20,13 @@ authors: Adam Harvey Duke MTMC (Multi-Target, Multi-Camera) is a dataset of surveillance video footage taken on Duke University's campus in 2014 and is used for research and development of video tracking systems, person re-identification, and low-resolution facial recognition. 
The dataset contains over 14 hours of synchronized surveillance video from 8 cameras at 1080p and 60 FPS, with over 2 million frames of 2,000 students walking to and from classes. The 8 surveillance cameras deployed on campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy"[^duke_mtmc_orig].

-In this investigation into the Duke MTMC dataset we tracked down over 100 publicly available research papers that explicitly acknowledged using Duke MTMC. Our analysis shows that the dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers with explicit and direct links to the Chinese military and several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.
+In this investigation into the Duke MTMC dataset we tracked down over 100 publicly available research papers that explicitly acknowledged using Duke MTMC. Our analysis shows that the dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers with links to the Chinese military and several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.

In one 2018 [paper](http://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_Attention-Aware_Compositional_Network_CVPR_2018_paper.pdf) jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled [Attention-Aware Compositional Network for Person Re-identification](https://www.semanticscholar.org/paper/Attention-Aware-Compositional-Network-for-Person-Xu-Zhao/14ce502bc19b225466126b256511f9c05cadcb6e), the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in "finding missing elderly and children, and suspect tracking, etc." Both SenseNets and SenseTime have been directly linked to providing surveillance technology to monitor Uighur Muslims in China. [^xinjiang_nyt][^sensetime_qz][^sensenets_uyghurs]

![caption: A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.](assets/duke_mtmc_reid_montage.jpg)

-Despite [repeated](https://www.hrw.org/news/2017/11/19/china-police-big-data-systems-violate-privacy-target-dissent) [warnings](https://www.hrw.org/news/2018/02/26/china-big-data-fuels-crackdown-minority-region) by Human Rights Watch that the authoritarian surveillance used in China represents a violation of human rights, researchers at Duke University continued to provide open access to their dataset for anyone to use for any project. As the surveillance crisis in China grew, so did the number of citations with links to organizations complicit in the crisis.
In 2018 alone there were over 70 research projects happening in China that publicly acknowledged benefiting from the Duke MTMC dataset. Amongst these were projects from SenseNets, SenseTime, CloudWalk, Megvii, Beihang University, and the PLA's National University of Defense Technology. +Despite [repeated](https://www.hrw.org/news/2017/11/19/china-police-big-data-systems-violate-privacy-target-dissent) [warnings](https://www.hrw.org/news/2018/02/26/china-big-data-fuels-crackdown-minority-region) by Human Rights Watch that the authoritarian surveillance used in China represents humanitarian crisis, researchers at Duke University continued to provide open access to their dataset for anyone to use for any project. As the surveillance crisis in China grew, so did the number of citations with links to organizations complicit in the crisis. In 2018 alone there were over 90 research projects happening in China that publicly acknowledged using and benefiting from the Duke MTMC dataset. Amongst these were projects from CloudWalk, Hikvision, Megvii (Face++), SenseNets, SenseTime, Beihang University, and the PLA's National University of Defense Technology. | Organization | Paper | Link | Year | Used Duke MTMC | |---|---|---|---| @@ -34,6 +34,7 @@ Despite [repeated](https://www.hrw.org/news/2017/11/19/china-police-big-data-sys | Beihang University | Online Inter-Camera Trajectory Association Exploiting Person Re-Identification and Camera Topology | [acm.org](https://dl.acm.org/citation.cfm?id=3240663) | 2018 | ✔ | | CloudWalk | CloudWalk re-identification technology extends facial biometric tracking with improved accuracy | [BiometricUpdate.com](https://www.biometricupdate.com/201903/cloudwalk-re-identification-technology-extends-facial-biometric-tracking-with-improved-accuracy) | 2019 | ✔ | |CloudWalk| Horizontal Pyramid Matching for Person Re-identification | [arxiv.org](https://arxiv.org/pdf/1804.05275.pdf) | 2018 | ✔ | +| Hikvision | Learning Incremental Triplet Margin for Person Re-identification | [arxiv.org](https://arxiv.org/abs/1812.06576) | 2018 | ✔ | | Megvii | Person Re-Identification (slides) | [github.io](https://zsc.github.io/megvii-pku-dl-course/slides/Lecture%2011,%20Human%20Understanding_%20ReID%20and%20Pose%20and%20Attributes%20and%20Activity%20.pdf) | 2017 | ✔ | | Megvii | Multi-Target, Multi-Camera Tracking by Hierarchical Clustering: Recent Progress on DukeMTMC Project | [SemanticScholar](https://www.semanticscholar.org/paper/Multi-Target%2C-Multi-Camera-Tracking-by-Hierarchical-Zhang-Wu/10c20cf47d61063032dce4af73a4b8e350bf1128) | 2018 | ✔ | | Megvii | SCPNet: Spatial-Channel Parallelism Network for Joint Holistic and Partial PersonRe-Identification | [arxiv.org](https://arxiv.org/abs/1810.06996) | 2018 | ✔ | diff --git a/site/content/pages/datasets/msceleb/index.md b/site/content/pages/datasets/msceleb/index.md index 5f48ebfd..d1542f3f 100644 --- a/site/content/pages/datasets/msceleb/index.md +++ b/site/content/pages/datasets/msceleb/index.md @@ -64,7 +64,7 @@ Below is a selection of 24 names from the full target list, curated to illustrat === end columns -After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's [National University of Defense Technology](https://en.wikipedia.org/wiki/National_University_of_Defense_Technology) (controlled by China's Central Military Commission) and used the MS Celeb image dataset for their research paper on using "[Faces as Lighting Probes via Unsupervised Deep Highlight 
Extraction]((https://www.semanticscholar.org/paper/Faces-as-Lighting-Probes-via-Unsupervised-Deep-Yi-Zhu/b301fd2fc33f24d6f75224e7c0991f4f04b64a65)" with potential applications in 3D face recognition. +After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's [National University of Defense Technology](https://en.wikipedia.org/wiki/National_University_of_Defense_Technology) (controlled by China's Central Military Commission) and used the MS Celeb image dataset for their research paper on using "[Faces as Lighting Probes via Unsupervised Deep Highlight Extraction](https://www.semanticscholar.org/paper/Faces-as-Lighting-Probes-via-Unsupervised-Deep-Yi-Zhu/b301fd2fc33f24d6f75224e7c0991f4f04b64a65)" with potential applications in 3D face recognition. In an April 10, 2019 [article](https://www.ft.com/content/9378e7ee-5ae6-11e9-9dde-7aedca0a081a) published by Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now". Adding, that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]".[^madhu_ft] diff --git a/site/public/about/attribution/index.html b/site/public/about/attribution/index.html index aec1b0f8..29c220e4 100644 --- a/site/public/about/attribution/index.html +++ b/site/public/about/attribution/index.html @@ -57,7 +57,7 @@
      diff --git a/site/public/about/index.html b/site/public/about/index.html index 9b3e455b..288aa2aa 100644 --- a/site/public/about/index.html +++ b/site/public/about/index.html @@ -57,7 +57,7 @@
      diff --git a/site/public/about/legal/index.html b/site/public/about/legal/index.html index df092153..d8d81d04 100644 --- a/site/public/about/legal/index.html +++ b/site/public/about/legal/index.html @@ -57,7 +57,7 @@
      diff --git a/site/public/about/press/index.html b/site/public/about/press/index.html index febc1256..57a07449 100644 --- a/site/public/about/press/index.html +++ b/site/public/about/press/index.html @@ -57,7 +57,7 @@
      diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html index 8ae6b122..4fcea807 100644 --- a/site/public/datasets/brainwash/index.html +++ b/site/public/datasets/brainwash/index.html @@ -4,7 +4,7 @@ MegaPixels - + @@ -53,7 +53,7 @@
      -
      Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco in 2014
      The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection surveillance algorithms +
      Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco
      The Brainwash dataset includes 11,917 images of "everyday life of a busy downtown cafe" and is used for training head detection surveillance algorithms

      Brainwash Dataset

      Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,918 images of "everyday life of a busy downtown cafe" 1 captured at 100 second intervals throughout the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According the author's research paper introducing the dataset, the images were acquired with the help of Angelcam.com 2

      +

      Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,917 images of "everyday life of a busy downtown cafe" 1 captured at 100-second intervals throughout the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According to the author's research paper introducing the dataset, the images were acquired with the help of Angelcam.com. 2

      The Brainwash dataset is unique because it uses images from a publicly available webcam that records people inside a privately owned business without any consent. No ordinary cafe customer would ever suspect that their image would end up in a dataset used for surveillance research and development, but that is exactly what happened to customers at Brainwash Cafe in San Francisco.

      -

      Although Brainwash appears to be a less popular dataset, it was notably used in 2016 and 2017 by researchers affiliated with the National University of Defense Technology in China for two research projects on advancing the capabilities of object detection to more accurately isolate the target region in an image. 3 4. The National University of Defense Technology is controlled by China's top military body, the Central Military Commission.

      +

      Although Brainwash appears to be a less popular dataset, it was notably used in 2016 and 2017 by researchers affiliated with the National University of Defense Technology in China for two research projects on advancing the capabilities of object detection to more accurately isolate the target region in an image. 3 4 The National University of Defense Technology is controlled by China's top military body, the Central Military Commission.

      The dataset also appears in a 2017 research paper from Peking University aimed at improving surveillance capabilities for "people detection in the crowded scenes".

      -
       Nine of 11,917 images from the the Brainwash dataset. Graphics credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
      Nine of 11,917 images from the the Brainwash dataset. Graphics credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
      +
       A sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains a total of 11,917 images and 81,973 annotated heads.  Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
      A sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains a total of 11,917 images and 81,973 annotated heads. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
       A visualization of the active regions for 81,973 head annotations in the Brainwash dataset training partition. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
      A visualization of the active regions for 81,973 head annotations in the Brainwash dataset training partition. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)

      Who used Brainwash Dataset?

      @@ -140,7 +140,7 @@

      Supplementary Information

      -
       An sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains a total of 11,917 images and 81,973 annotated heads.  Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
      An sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains a total of 11,917 images and 81,973 annotated heads. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
       A visualization of the active regions for 81,973 head annotations in the Brainwash dataset training partition. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
      A visualization of the active regions for 81,973 head annotations in the Brainwash dataset training partition. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
      +
       Nine of 11,917 images from the Brainwash dataset. Graphics credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)
      Nine of 11,917 images from the Brainwash dataset. Graphics credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)

      Cite Our Work

      diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html index 24ee6cc2..16d11cb0 100644 --- a/site/public/datasets/duke_mtmc/index.html +++ b/site/public/datasets/duke_mtmc/index.html @@ -74,9 +74,9 @@

      Website

    Duke MTMC (Multi-Target, Multi-Camera) is a dataset of surveillance video footage taken on Duke University's campus in 2014 and is used for research and development of video tracking systems, person re-identification, and low-resolution facial recognition. The dataset contains over 14 hours of synchronized surveillance video from 8 cameras at 1080p and 60 FPS, with over 2 million frames of 2,000 students walking to and from classes. The 8 surveillance cameras deployed on campus were specifically set up to capture students "during periods between lectures, when pedestrian traffic is heavy" 1.

    -

    In this investigation into the Duke MTMC dataset we tracked down over 100 publicly available research papers that explicitly acknowledged using Duke MTMC. Our analysis shows that the dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers with explicit and direct links to the Chinese military and several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.

    +

    In this investigation into the Duke MTMC dataset we tracked down over 100 publicly available research papers that explicitly acknowledged using Duke MTMC. Our analysis shows that the dataset has spread far beyond its origins and intentions in academic research projects at Duke University. Since its publication in 2016, more than twice as many research citations originated in China as in the United States. Among these citations were papers with links to the Chinese military and several of the companies known to provide Chinese authorities with the oppressive surveillance technology used to monitor millions of Uighur Muslims.

    In one 2018 paper jointly published by researchers from SenseNets and SenseTime (and funded by SenseTime Group Limited) entitled Attention-Aware Compositional Network for Person Re-identification, the Duke MTMC dataset was used for "extensive experiments" on improving person re-identification across multiple surveillance cameras with important applications in "finding missing elderly and children, and suspect tracking, etc." Both SenseNets and SenseTime have been directly linked to providing surveillance technology to monitor Uighur Muslims in China. 4 2 3

    -
     A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.
    A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.

    Despite repeated warnings by Human Rights Watch that the authoritarian surveillance used in China represents a violation of human rights, researchers at Duke University continued to provide open access to their dataset for anyone to use for any project. As the surveillance crisis in China grew, so did the number of citations with links to organizations complicit in the crisis. In 2018 alone there were over 70 research projects happening in China that publicly acknowledged benefiting from the Duke MTMC dataset. Amongst these were projects from SenseNets, SenseTime, CloudWalk, Megvii, Beihang University, and the PLA's National University of Defense Technology.

    +
     A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.
    A collection of 1,600 out of the approximately 2,000 students and pedestrians in the Duke MTMC dataset. These students were also included in the Duke MTMC Re-ID dataset extension used for person re-identification, and eventually the QMUL SurvFace face recognition dataset. Open Data Commons Attribution License.

    Despite repeated warnings by Human Rights Watch that the authoritarian surveillance used in China represents a humanitarian crisis, researchers at Duke University continued to provide open access to their dataset for anyone to use for any project. As the surveillance crisis in China grew, so did the number of citations with links to organizations complicit in the crisis. In 2018 alone there were over 90 research projects happening in China that publicly acknowledged using and benefiting from the Duke MTMC dataset. Amongst these were projects from CloudWalk, Hikvision, Megvii (Face++), SenseNets, SenseTime, Beihang University, and the PLA's National University of Defense Technology.

    @@ -116,6 +116,13 @@ + + + + + + + diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html index aabda46c..dfe1b2d9 100644 --- a/site/public/datasets/msceleb/index.html +++ b/site/public/datasets/msceleb/index.html @@ -194,7 +194,7 @@
    Organization
    Hikvision Learning Incremental Triplet Margin for Person Re-identification arxiv.org 2018
    Megvii Person Re-Identification (slides) github.io
    -

    After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the MS Celeb image dataset for their research paper on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.

    +

    After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the MS Celeb image dataset for their research paper on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.

    In an April 10, 2019 article published by Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now", adding that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]". 2

    Four more papers published by SenseTime that also use the MS Celeb dataset raise similar flags. SenseTime is a computer vision surveillance company that until April 2019 provided surveillance technology to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province, and had been flagged numerous times as having potential links to human rights violations.

    One of the 4 SenseTime papers, "Exploring Disentangled Feature Representation Beyond Face Identification", shows how SenseTime was developing automated face analysis technology to infer attributes such as race, narrow eyes, nose size, and chin size, all of which could be used to target vulnerable ethnic groups based on their facial appearance.

    diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html index b5ceebd3..1a3a471f 100644 --- a/site/public/datasets/uccs/index.html +++ b/site/public/datasets/uccs/index.html @@ -56,9 +56,6 @@
    UnConstrained College Students is a dataset of long-range surveillance photos of students on the University of Colorado Colorado Springs campus
    The UnConstrained College Students dataset includes 16,149 images of 1,732 students, faculty, and pedestrians and is used for developing face recognition and face detection algorithms

    UnConstrained College Students