| author | adamhrv <adam@ahprojects.com> | 2019-04-24 10:10:06 +0200 |
|---|---|---|
| committer | adamhrv <adam@ahprojects.com> | 2019-04-24 10:10:06 +0200 |
| commit | 1b6aba08b8eca4f09456bd55ca617138cf8502b9 | |
| tree | 674d4e59eb546b5be03e1533ab1b324ef06582a2 /site/public/datasets | |
| parent | 21fdd0560146d0d2ec77d8517994d5ce20b446e1 | |
udpate
Diffstat (limited to 'site/public/datasets')
| -rw-r--r-- | site/public/datasets/brainwash/index.html | 2 |
| -rw-r--r-- | site/public/datasets/duke_mtmc/index.html | 2 |
| -rw-r--r-- | site/public/datasets/index.html | 14 |
| -rw-r--r-- | site/public/datasets/msceleb/index.html | 28 |
| -rw-r--r-- | site/public/datasets/oxford_town_centre/index.html | 2 |
| -rw-r--r-- | site/public/datasets/uccs/index.html | 2 |
6 files changed, 29 insertions, 21 deletions
diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html index 0c760858..4653ec92 100644 --- a/site/public/datasets/brainwash/index.html +++ b/site/public/datasets/brainwash/index.html @@ -5,7 +5,7 @@ <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> <meta name="description" content="Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco in 2014" /> - <meta property="og:title" content="MegaPixels: Brainwash"/> + <meta property="og:title" content="MegaPixels: Brainwash Dataset"/> <meta property="og:type" content="website"/> <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/brainwash/assets/background.jpg" /> <meta property="og:url" content="https://megapixels.cc/datasets/brainwash/"/> diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html index 24789730..0c164b6a 100644 --- a/site/public/datasets/duke_mtmc/index.html +++ b/site/public/datasets/duke_mtmc/index.html @@ -5,7 +5,7 @@ <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> <meta name="description" content="Duke MTMC is a dataset of surveillance camera footage of students on Duke University campus" /> - <meta property="og:title" content="MegaPixels: Duke MTMC"/> + <meta property="og:title" content="MegaPixels: Duke MTMC Dataset"/> <meta property="og:type" content="website"/> <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/duke_mtmc/assets/background.jpg" /> <meta property="og:url" content="https://megapixels.cc/datasets/duke_mtmc/"/> diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html index 38d2960d..ffe24671 100644 --- a/site/public/datasets/index.html +++ b/site/public/datasets/index.html @@ -5,7 +5,7 @@ <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> <meta name="description" content="Facial Recognition Datasets" /> - 
<meta property="og:title" content="MegaPixels: MegaPixels: Datasets"/> + <meta property="og:title" content="MegaPixels: MegaPixels: Face Recognition Datasets"/> <meta property="og:type" content="website"/> <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/background.jpg" /> <meta property="og:url" content="https://megapixels.cc/datasets/"/> @@ -37,7 +37,7 @@ <div class='dataset-heading'> <section><h1>Face Recognition Datasets</h1> -<p>Explore face recognition datasets contributing the growing crisis of authoritarian biometric surveillance technologies. This first group of datasets focuses usage connected to foreign surveillance companies and defense organizations.</p> +<p>Explore face recognition datasets contributing to the growing crisis of authoritarian biometric surveillance technologies. This first group of 5 datasets focuses on image usage connected to foreign surveillance and defense organizations.</p> </section> </div> @@ -49,7 +49,7 @@ <a href="/datasets/brainwash/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/brainwash/assets/index.jpg)"> <div class="dataset"> - <span class='title'>Brainwash</span> + <span class='title'>Brainwash Dataset</span> <div class='fields'> <div class='year visible'><span>2015</span></div> <div class='purpose'><span>Head detection</span></div> @@ -61,7 +61,7 @@ <a href="/datasets/duke_mtmc/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/duke_mtmc/assets/index.jpg)"> <div class="dataset"> - <span class='title'>Duke MTMC</span> + <span class='title'>Duke MTMC Dataset</span> <div class='fields'> <div class='year visible'><span>2016</span></div> <div class='purpose'><span>Person re-identification, multi-camera tracking</span></div> @@ -73,7 +73,7 @@ <a href="/datasets/msceleb/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/index.jpg)"> 
<div class="dataset"> - <span class='title'>Microsoft Celeb</span> + <span class='title'>Microsoft Celeb Dataset</span> <div class='fields'> <div class='year visible'><span>2016</span></div> <div class='purpose'><span>Large-scale face recognition</span></div> @@ -85,7 +85,7 @@ <a href="/datasets/oxford_town_centre/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/oxford_town_centre/assets/index.jpg)"> <div class="dataset"> - <span class='title'>Oxford Town Centre</span> + <span class='title'>Oxford Town Centre Dataset</span> <div class='fields'> <div class='year visible'><span>2009</span></div> <div class='purpose'><span>Person detection, gaze estimation</span></div> @@ -97,7 +97,7 @@ <a href="/datasets/uccs/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/uccs/assets/index.jpg)"> <div class="dataset"> - <span class='title'>UnConstrained College Students</span> + <span class='title'>UnConstrained College Students Dataset</span> <div class='fields'> <div class='year visible'><span>2016</span></div> <div class='purpose'><span>Face recognition, face detection</span></div> diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html index 89543f08..4d41a4c0 100644 --- a/site/public/datasets/msceleb/index.html +++ b/site/public/datasets/msceleb/index.html @@ -5,7 +5,7 @@ <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> <meta name="description" content="Microsoft Celeb 1M is a target list and dataset of web images used for research and development of face recognition" /> - <meta property="og:title" content="MegaPixels: Microsoft Celeb"/> + <meta property="og:title" content="MegaPixels: Microsoft Celeb Dataset"/> <meta property="og:type" content="website"/> <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/background.jpg" /> <meta property="og:url" 
content="https://megapixels.cc/datasets/msceleb/"/> @@ -57,10 +57,10 @@ </div><div class='meta'> <div class='gray'>Website</div> <div><a href='http://www.msceleb.org/' target='_blank' rel='nofollow noopener'>msceleb.org</a></div> - </div></div><p>Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the <a href="https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/">dataset</a> in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images, and to use this dataset to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data".<a class="footnote_shim" name="[^msceleb_orig]_1"> </a><a href="#[^msceleb_orig]" class="footnote" title="Footnote 1">1</a></p> -<p>These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, and especially journalists, maintaining an online presence is mandatory. This fact should not allow Microsoft or anyone else to use their biometrics for research and development of surveillance technology. Many names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. 
The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name a few.</p> + </div></div><p>Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research, who created and published the <a href="https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/">dataset</a> in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals' images to accelerate research into recognizing a larger target list of one million people "using all the possibly collected face images of this individual on the web as training data".<a class="footnote_shim" name="[^msceleb_orig]_1"> </a><a href="#[^msceleb_orig]" class="footnote" title="Footnote 1">1</a></p> +<p>These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people, including academics, policy makers, writers, artists, and especially journalists; maintaining an online presence is mandatory. This fact should not allow Microsoft nor anyone else to use their biometrics for research and development of surveillance technology. 
Many names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glenn Greenwald; Data and Society founder danah boyd; and even Julie Brill, the former FTC commissioner responsible for protecting consumer privacy, to name a few.</p> <h3>Microsoft's 1 Million Target List</h3> -<p>Below is a selection of names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from <a href="https://www.msceleb.org">msceleb.org</a>. You can email <a href="mailto:msceleb@microsoft.com?subject=MS-Celeb-1M Removal Request&body=Dear%20Microsoft%2C%0A%0AI%20recently%20discovered%20that%20you%20use%20my%20identity%20for%20commercial%20use%20in%20your%20MS-Celeb-1M%20dataset%20used%20for%20research%20and%20development%20of%20face%20recognition.%20I%20do%20not%20wish%20to%20be%20included%20in%20your%20dataset%20in%20any%20format.%20%0A%0APlease%20remove%20my%20name%20and%2For%20any%20associated%20images%20immediately%20and%20send%20a%20confirmation%20once%20you've%20updated%20your%20%22Top1M_MidList.Name.tsv%22%20file.%0A%0AThanks%20for%20promptly%20handing%20this%2C%0A%5B%20your%20name%20%5D">msceleb@microsoft.com</a> to have your name removed. Names appearing with * indicate that Microsoft also distributed images.</p> +<p>Below is a selection of 24 names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from <a href="https://www.msceleb.org">msceleb.org</a>. 
You can email <a href="mailto:msceleb@microsoft.com?subject=MS-Celeb-1M Removal Request&body=Dear%20Microsoft%2C%0A%0AI%20recently%20discovered%20that%20you%20use%20my%20identity%20for%20commercial%20use%20in%20your%20MS-Celeb-1M%20dataset%20used%20for%20research%20and%20development%20of%20face%20recognition.%20I%20do%20not%20wish%20to%20be%20included%20in%20your%20dataset%20in%20any%20format.%20%0A%0APlease%20remove%20my%20name%20and%2For%20any%20associated%20images%20immediately%20and%20send%20a%20confirmation%20once%20you've%20updated%20your%20%22Top1M_MidList.Name.tsv%22%20file.%0A%0AThanks%20for%20promptly%20handing%20this%2C%0A%5B%20your%20name%20%5D">msceleb@microsoft.com</a> to have your name removed. Names appearing with * indicate that Microsoft also distributed your images.</p> </section><section><div class='columns columns-2'><div class='column'><table> <thead><tr> <th>Name</th> @@ -112,6 +112,10 @@ <td>Hito Steyerl</td> <td>Artist, writer</td> </tr> +<tr> +<td>James Risen</td> +<td>Journalist</td> +</tr> </tbody> </table> </div><div class='column'><table> @@ -122,10 +126,6 @@ </thead> <tbody> <tr> -<td>James Risen</td> -<td>Journalist</td> -</tr> -<tr> <td>Jeremy Scahill*</td> <td>Journalist</td> </tr> @@ -158,6 +158,14 @@ <td>Artist</td> </tr> <tr> +<td>Michael Anti</td> +<td>Political blogger</td> +</tr> +<tr> +<td>Manal al-Sharif*</td> +<td>Womens's rights activist</td> +</tr> +<tr> <td>Shoshana Zuboff</td> <td>Author, academic</td> </tr> @@ -167,8 +175,8 @@ </tr> </tbody> </table> -</div></div></section><section><p>After publishing this list, researchers from Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the the MS Celeb dataset for their <a href="https://www.semanticscholar.org/paper/Faces-as-Lighting-Probes-via-Unsupervised-Deep-Yi-Zhu/b301fd2fc33f24d6f75224e7c0991f4f04b64a65">research paper</a> on using "Faces as Lighting 
Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.</p> -<p>In an <a href="https://www.ft.com/content/9378e7ee-5ae6-11e9-9dde-7aedca0a081a">article</a> published by Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now". Adding, that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]".<a class="footnote_shim" name="[^madhu_ft]_1"> </a><a href="#[^madhu_ft]" class="footnote" title="Footnote 2">2</a></p> +</div></div></section><section><p>After publishing this list, researchers affiliated with Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the the MS Celeb dataset for their <a href="https://www.semanticscholar.org/paper/Faces-as-Lighting-Probes-via-Unsupervised-Deep-Yi-Zhu/b301fd2fc33f24d6f75224e7c0991f4f04b64a65">research paper</a> on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition.</p> +<p>In an April 10, 2019 <a href="https://www.ft.com/content/9378e7ee-5ae6-11e9-9dde-7aedca0a081a">article</a> published by Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at the New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now". 
Adding, that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]".<a class="footnote_shim" name="[^madhu_ft]_1"> </a><a href="#[^madhu_ft]" class="footnote" title="Footnote 2">2</a></p> <p>Four more papers published by SenseTime, which also use the MS Celeb dataset, raise similar flags. SenseTime is a computer vision surveillance company that until <a href="https://uhrp.org/news-commentary/china%E2%80%99s-sensetime-sells-out-xinjiang-security-joint-venture">April 2019</a> provided surveillance to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province, and had been <a href="https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html">flagged</a> numerous times as having potential links to human rights violations.</p> <p>One of the 4 SenseTime papers, "<a href="https://www.semanticscholar.org/paper/Exploring-Disentangled-Feature-Representation-Face-Liu-Wei/1fd5d08394a3278ef0a89639e9bfec7cb482e0bf">Exploring Disentangled Feature Representation Beyond Face Identification</a>", shows how SenseTime was developing automated face analysis technology to infer race, narrow eyes, nose size, and chin size, all of which could be used to target vulnerable ethnic groups based on their facial appearances.</p> <p>Earlier in 2019, Microsoft President and Chief Legal Officer <a href="https://blogs.microsoft.com/on-the-issues/2018/12/06/facial-recognition-its-time-for-action/">Brad Smith</a> called for the governmental regulation of face recognition, citing the potential for misuse, a rare admission that Microsoft's surveillance-driven business model had lost its bearing. 
More recently Smith also <a href="https://www.reuters.com/article/us-microsoft-ai/microsoft-turned-down-facial-recognition-sales-on-human-rights-concerns-idUSKCN1RS2FV">announced</a> that Microsoft would seemingly take a stand against such potential misuse, and had decided to not sell face recognition to an unnamed United States agency, citing a lack of accuracy. The software was not suitable to be used on minorities, because it was trained mostly on white male faces.</p> diff --git a/site/public/datasets/oxford_town_centre/index.html b/site/public/datasets/oxford_town_centre/index.html index cada5dd4..50859604 100644 --- a/site/public/datasets/oxford_town_centre/index.html +++ b/site/public/datasets/oxford_town_centre/index.html @@ -5,7 +5,7 @@ <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> <meta name="description" content="Oxford Town Centre is a dataset of surveillance camera footage from Cornmarket St Oxford, England" /> - <meta property="og:title" content="MegaPixels: Oxford Town Centre"/> + <meta property="og:title" content="MegaPixels: Oxford Town Centre Dataset"/> <meta property="og:type" content="website"/> <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/oxford_town_centre/assets/background.jpg" /> <meta property="og:url" content="https://megapixels.cc/datasets/oxford_town_centre/"/> diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html index 3ff4a345..adb411c6 100644 --- a/site/public/datasets/uccs/index.html +++ b/site/public/datasets/uccs/index.html @@ -5,7 +5,7 @@ <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> <meta name="description" content="UnConstrained College Students is a dataset of long-range surveillance photos of students on University of Colorado in Colorado Springs campus" /> - <meta property="og:title" content="MegaPixels: UnConstrained College Students"/> + <meta property="og:title" content="MegaPixels: 
UnConstrained College Students Dataset"/> <meta property="og:type" content="website"/> <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/uccs/assets/background.jpg" /> <meta property="og:url" content="https://megapixels.cc/datasets/uccs/"/>
