| author | Adam Harvey <adam@ahprojects.com> | 2019-04-21 09:40:57 +0200 |
|---|---|---|
| committer | Adam Harvey <adam@ahprojects.com> | 2019-04-21 09:40:57 +0200 |
| commit | d6dc546c97a18669416f203c3cf30c066bb28cfb | |
| tree | dccdb299dd8417a51765bfaa0e4f9dccf363866f /site | |
| parent | 234ef07325a4ff47c089d8b3d83dd3123d67a95f | |
update typos
Diffstat (limited to 'site')
| -rw-r--r-- | site/content/pages/datasets/msceleb/index.md | 2 |
| -rw-r--r-- | site/content/pages/datasets/uccs/index.md | 2 |
| -rw-r--r-- | site/public/datasets/msceleb/index.html | 2 |
| -rw-r--r-- | site/public/datasets/uccs/index.html | 2 |
4 files changed, 4 insertions, 4 deletions
diff --git a/site/content/pages/datasets/msceleb/index.md b/site/content/pages/datasets/msceleb/index.md
index 553cbd14..3d5c6c59 100644
--- a/site/content/pages/datasets/msceleb/index.md
+++ b/site/content/pages/datasets/msceleb/index.md
@@ -19,7 +19,7 @@ authors: Adam Harvey
 ### sidebar
 ### end sidebar
-Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the [dataset](https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/) in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images, and to use this dataset to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data".[^msceleb_orig]
+Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the [dataset](https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/) in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images, and to use this dataset to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data".[^msceleb_orig]
 These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, and especially journalists, maintaining an online presence is mandatory. This fact should not allow Microsoft or anyone else to use their biometrics for research and development of surveillance technology. Many names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name a few.
diff --git a/site/content/pages/datasets/uccs/index.md b/site/content/pages/datasets/uccs/index.md
index d37db132..0850bd99 100644
--- a/site/content/pages/datasets/uccs/index.md
+++ b/site/content/pages/datasets/uccs/index.md
@@ -44,7 +44,7 @@ The two research papers associated with the release of the UCCS dataset ([Uncons
 In 2017, one year after its public release, the UCCS face dataset formed the basis for a defense and intelligence agency funded [face recognition challenge](http://www.face-recognition-challenge.com/) project at the International Joint Biometrics Conference in Denver, CO. And in 2018 the dataset was again used for the [2nd Unconstrained Face Detection and Open Set Recognition Challenge](https://erodner.github.io/ial2018eccv/) at the European Computer Vision Conference (ECCV) in Munich, Germany.
-As of April 15, 2019, the UCCS dataset is no longer available for public download. But during the three years it was publicly available (2016-2019) the UCCS dataset appeared in at least 6 publicly available research papers including verified usage from Beihang University who is known to provide research and development for China's military; and Vision Semantics Ltd who lists the UK Ministory of Defence as a project partner.
+As of April 15, 2019, the UCCS dataset is no longer available for public download. But during the three years it was publicly available (2016-2019) the UCCS dataset appeared in at least 6 publicly available research papers including verified usage from Beihang University who is known to provide research and development for China's military; and Vision Semantics Ltd who lists the UK Ministry of Defence as a project partner.
 {% include 'dashboard.html' %}
diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html
index f9c184c8..89543f08 100644
--- a/site/public/datasets/msceleb/index.html
+++ b/site/public/datasets/msceleb/index.html
@@ -57,7 +57,7 @@
 </div><div class='meta'>
 <div class='gray'>Website</div>
 <div><a href='http://www.msceleb.org/' target='_blank' rel='nofollow noopener'>msceleb.org</a></div>
- </div></div><p>Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the <a href="https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/">dataset</a> in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images, and to use this dataset to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data".<a class="footnote_shim" name="[^msceleb_orig]_1"> </a><a href="#[^msceleb_orig]" class="footnote" title="Footnote 1">1</a></p>
+ </div></div><p>Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the <a href="https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/">dataset</a> in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images, and to use this dataset to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data".<a class="footnote_shim" name="[^msceleb_orig]_1"> </a><a href="#[^msceleb_orig]" class="footnote" title="Footnote 1">1</a></p>
 <p>These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, and especially journalists, maintaining an online presence is mandatory. This fact should not allow Microsoft or anyone else to use their biometrics for research and development of surveillance technology. Many names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name a few.</p>
 <h3>Microsoft's 1 Million Target List</h3>
 <p>Below is a selection of names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from <a href="https://www.msceleb.org">msceleb.org</a>. You can email <a href="mailto:msceleb@microsoft.com?subject=MS-Celeb-1M Removal Request&body=Dear%20Microsoft%2C%0A%0AI%20recently%20discovered%20that%20you%20use%20my%20identity%20for%20commercial%20use%20in%20your%20MS-Celeb-1M%20dataset%20used%20for%20research%20and%20development%20of%20face%20recognition.%20I%20do%20not%20wish%20to%20be%20included%20in%20your%20dataset%20in%20any%20format.%20%0A%0APlease%20remove%20my%20name%20and%2For%20any%20associated%20images%20immediately%20and%20send%20a%20confirmation%20once%20you've%20updated%20your%20%22Top1M_MidList.Name.tsv%22%20file.%0A%0AThanks%20for%20promptly%20handing%20this%2C%0A%5B%20your%20name%20%5D">msceleb@microsoft.com</a> to have your name removed. Names appearing with * indicate that Microsoft also distributed images.</p>
diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html
index fa20ed3f..3ff4a345 100644
--- a/site/public/datasets/uccs/index.html
+++ b/site/public/datasets/uccs/index.html
@@ -64,7 +64,7 @@ Their setup made it impossible for students to know they were being photographed
 </section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/uccs/assets/uccs_grid.jpg' alt=' Example images from the UnConstrained College Students Dataset. '><div class='caption'> Example images from the UnConstrained College Students Dataset. </div></div></section><section><p>The EXIF data embedded in the images shows that the photo capture times follow a similar pattern to that outlined by the researchers, but also highlights that the vast majority of photos (over 7,000) were taken on Tuesdays around noon during students' lunch break. The lack of any photos taken between Friday through Sunday shows that the researchers were only interested in capturing images of students during the peak campus hours.</p>
 </section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/uccs/assets/uccs_exif_plot_days.png' alt=' UCCS photos captured per weekday © megapixels.cc'><div class='caption'> UCCS photos captured per weekday © megapixels.cc</div></div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/uccs/assets/uccs_exif_plot.png' alt=' UCCS photos captured per weekday © megapixels.cc'><div class='caption'> UCCS photos captured per weekday © megapixels.cc</div></div></section><section><p>The two research papers associated with the release of the UCCS dataset (<a href="https://www.semanticscholar.org/paper/Unconstrained-Face-Detection-and-Open-Set-Face-G%C3%BCnther-Hu/d4f1eb008eb80595bcfdac368e23ae9754e1e745">Unconstrained Face Detection and Open-Set Face Recognition Challenge</a> and <a href="https://www.semanticscholar.org/paper/Large-scale-unconstrained-open-set-face-database-Sapkota-Boult/07fcbae86f7a3ad3ea1cf95178459ee9eaf77cb1">Large Scale Unconstrained Open Set Face Database</a>), acknowledge that the primary funding sources for their work were United States defense and intelligence agencies. Specifically, development of the UnContsrianed College Students dataset was funded by the Intelligence Advanced Research Projects Activity (IARPA), Office of Director of National Intelligence (ODNI), Office of Naval Research and The Department of Defense Multidisciplinary University Research Initiative (ONR MURI), and the Special Operations Command and Small Business Innovation Research (SOCOM SBIR) amongst others. UCCS's VAST site also explicitly <a href="https://vast.uccs.edu/project/iarpa-janus/">states</a> their involvement in the <a href="https://www.iarpa.gov/index.php/research-programs/janus">IARPA Janus</a> face recognition project developed to serve the needs of national intelligence, establishing that immediate benefactors of this dataset include United States defense and intelligence agencies, but it would go on to benefit other similar organizations.</p>
 <p>In 2017, one year after its public release, the UCCS face dataset formed the basis for a defense and intelligence agency funded <a href="http://www.face-recognition-challenge.com/">face recognition challenge</a> project at the International Joint Biometrics Conference in Denver, CO. And in 2018 the dataset was again used for the <a href="https://erodner.github.io/ial2018eccv/">2nd Unconstrained Face Detection and Open Set Recognition Challenge</a> at the European Computer Vision Conference (ECCV) in Munich, Germany.</p>
-<p>As of April 15, 2019, the UCCS dataset is no longer available for public download. But during the three years it was publicly available (2016-2019) the UCCS dataset appeared in at least 6 publicly available research papers including verified usage from Beihang University who is known to provide research and development for China's military; and Vision Semantics Ltd who lists the UK Ministory of Defence as a project partner.</p>
+<p>As of April 15, 2019, the UCCS dataset is no longer available for public download. But during the three years it was publicly available (2016-2019) the UCCS dataset appeared in at least 6 publicly available research papers including verified usage from Beihang University who is known to provide research and development for China's military; and Vision Semantics Ltd who lists the UK Ministry of Defence as a project partner.</p>
 </section><section>
 <h3>Who used UCCS?</h3>
