| author | adamhrv <adam@ahprojects.com> | 2019-05-17 12:32:00 +0200 |
|---|---|---|
| committer | adamhrv <adam@ahprojects.com> | 2019-05-17 12:32:00 +0200 |
| commit | 84b286e1bd85feba12174a2a480d2be404e7b9c5 (patch) | |
| tree | 4317828715117cd733e9bacd9d9c2acca68c3835 /site | |
| parent | 5cf3ee3bbabecbf9669b5846ebdd01f5773af607 (diff) | |
fix text
Diffstat (limited to 'site')
| -rw-r--r-- | site/content/pages/datasets/msceleb/index.md | 4 |
| -rw-r--r-- | site/public/datasets/msceleb/index.html | 4 |
2 files changed, 4 insertions, 4 deletions
diff --git a/site/content/pages/datasets/msceleb/index.md b/site/content/pages/datasets/msceleb/index.md
index 58bacf1e..8623767b 100644
--- a/site/content/pages/datasets/msceleb/index.md
+++ b/site/content/pages/datasets/msceleb/index.md
@@ -3,7 +3,7 @@
 status: published
 title: Microsoft Celeb Dataset
 desc: Microsoft Celeb 1M is a dataset of 10 million face images harvested from the Internet
-subdesc: The MS Celeb dataset includes 100K people and a target list of 1 million individuals
+subdesc: The MS Celeb dataset includes 100,000 people and a target list of 1,000,000 individuals
 slug: msceleb
 cssclass: dataset
 image: assets/background.jpg
@@ -82,7 +82,7 @@ Until now, that data has been freely harvested from the Internet and packaged in
 Microsoft didn't only create MS Celeb for other researchers to use, they also used it internally. In a publicly available 2017 Microsoft Research project called "[One-shot Face Recognition by Promoting Underrepresented Classes](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/)," Microsoft leveraged the MS Celeb dataset to build their algorithms and advertise the results. Interestingly, Microsoft's [corporate version](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/) of the paper does not mention they used the MS Celeb datset, but the [open-access version](https://www.semanticscholar.org/paper/One-shot-Face-Recognition-by-Promoting-Classes-Guo/6cacda04a541d251e8221d70ac61fda88fb61a70) published on arxiv.org explicitly mentions that Microsoft Research introspected their algorithms "on the MS-Celeb-1M low-shot learning benchmark task."
-We suggest that if Microsoft Research wants to make biometric data publicly available for surveillance research and development, they should start with releasing their researchers' own biometric data, instead of scraping the Internet for journalists, artists, writers, actors, athletes, musicians, and academics.
+If Microsoft Research wants to make biometric data publicly available for surveillance research and development, perhaps they should start with releasing their employees own biometric data instead of scraping the Internet for journalists, artists, writers, actors, athletes, musicians, and academics. A publicly available face recognition dataset of all Microsoft Researcher employees would be a welcome replacement.
 {% include 'dashboard.html' %}
diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html
index 3bda88ea..b59e77a8 100644
--- a/site/public/datasets/msceleb/index.html
+++ b/site/public/datasets/msceleb/index.html
@@ -53,7 +53,7 @@
 </header> <div class="content content-dataset">
- <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'>Microsoft Celeb 1M is a dataset of 10 million face images harvested from the Internet</span></div><div class='hero_subdesc'><span class='bgpad'>The MS Celeb dataset includes 100K people and a target list of 1 million individuals
+ <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'>Microsoft Celeb 1M is a dataset of 10 million face images harvested from the Internet</span></div><div class='hero_subdesc'><span class='bgpad'>The MS Celeb dataset includes 100,000 people and a target list of 1,000,000 individuals
 </span></div></div></section><section><h2>Microsoft Celeb
 Dataset (MS Celeb)</h2> </section><section><div class='right-sidebar'><div class='meta'> <div class='gray'>Published</div>
@@ -202,7 +202,7 @@
 <p>What the decision to block the sale announces is not so much that Microsoft had upgraded their ethics, but that Microsoft publicly acknowledged it can't sell a data-driven product without data. In other words, Microsoft can't sell face recognition for faces they can't train on.</p> <p>Until now, that data has been freely harvested from the Internet and packaged in training sets like MS Celeb, which are overwhelmingly <a href="https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html">white</a> and <a href="https://gendershades.org">male</a>. Without balanced data, facial recognition contains blind spots. And without datasets like MS Celeb, the powerful yet inaccurate facial recognition services like Microsoft's Azure Cognitive Service the services might not exist at all.</p> </section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/msceleb_montage.jpg' alt=' A visualization of 2,000 of the 100,000 identity included in the image dataset distributed by Microsoft Research. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)'><div class='caption'> A visualization of 2,000 of the 100,000 identity included in the image dataset distributed by Microsoft Research. Credit: megapixels.cc. License: Open Data Commons Public Domain Dedication (PDDL)</div></div></section><section><p>Microsoft didn't only create MS Celeb for other researchers to use, they also used it internally.
 In a publicly available 2017 Microsoft Research project called "<a href="https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/">One-shot Face Recognition by Promoting Underrepresented Classes</a>," Microsoft leveraged the MS Celeb dataset to build their algorithms and advertise the results. Interestingly, Microsoft's <a href="https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/">corporate version</a> of the paper does not mention they used the MS Celeb datset, but the <a href="https://www.semanticscholar.org/paper/One-shot-Face-Recognition-by-Promoting-Classes-Guo/6cacda04a541d251e8221d70ac61fda88fb61a70">open-access version</a> published on arxiv.org explicitly mentions that Microsoft Research introspected their algorithms "on the MS-Celeb-1M low-shot learning benchmark task."</p>
-<p>We suggest that if Microsoft Research wants to make biometric data publicly available for surveillance research and development, they should start with releasing their researchers' own biometric data, instead of scraping the Internet for journalists, artists, writers, actors, athletes, musicians, and academics.</p>
+<p>If Microsoft Research wants to make biometric data publicly available for surveillance research and development, perhaps they should start with releasing their employees own biometric data instead of scraping the Internet for journalists, artists, writers, actors, athletes, musicians, and academics. A publicly available face recognition dataset of all Microsoft Researcher employees would be a welcome replacement.</p>
 </section><section> <h3>Who used Microsoft Celeb?</h3>
