------------
status: published
title: Microsoft Celeb
desc: MS Celeb is a dataset of web images used for training and evaluating face recognition algorithms
subdesc: The MS Celeb dataset includes over 10,000,000 images and 93,000 identities of semi-public figures collected using the Bing search engine
slug: msceleb
cssclass: dataset
image: assets/background.jpg
year: 2015
published: 2019-4-18
updated: 2019-4-18
authors: Adam Harvey
------------
## Microsoft Celeb Dataset (MS Celeb)
### sidebar
### end sidebar
The Microsoft Celeb dataset is a face recognition training set made entirely of images scraped from the Internet. According to Microsoft Research, which created and published the dataset in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of 100,000 individuals.
But Microsoft's ambition was bigger. They wanted to recognize 1 million individuals. As part of their dataset, they released a list of 1 million target identities for researchers to identify. The identities
https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/
In 2019, Microsoft President Brad Smith called for governmental regulation of face recognition, an admission of his own company's inability to control its surveillance-driven business model. Yet since then, and for the last 4 years, Microsoft has willingly and actively played a significant role in accelerating growth in the very same industry it called for the government to regulate. This investigation looks into the [MS Celeb](https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/) dataset and Microsoft Research's role in creating and distributing the largest publicly available face recognition dataset in the world to both.
To spur growth and incentivize researchers, Microsoft released a dataset called [MS Celeb](https://msceleb.org), or Microsoft Celeb, in which they developed and published a list of exactly 1 million targeted people whose biometrics would go on to build
{% include 'dashboard.html' %}
{% include 'supplementary_header.html' %}
### Additional Information
- SenseTime https://www.semanticscholar.org/paper/The-Devil-of-Face-Recognition-is-in-the-Noise-Wang-Chen/9e31e77f9543ab42474ba4e9330676e18c242e72
- Microsoft used it https://www.semanticscholar.org/paper/One-shot-Face-Recognition-by-Promoting-Classes-Guo/6cacda04a541d251e8221d70ac61fda88fb61a70
- https://www.hrw.org/news/2019/01/15/letter-microsoft-face-surveillance-technology
- https://www.scmp.com/tech/science-research/article/3005733/what-you-need-know-about-sensenets-facial-recognition-firm
### Footnotes
[^brad_smith]: Brad Smith cite