------------
status: published
title: MSC
slug: munich-security-conference
desc: Analyzing the Transnational Flow of Facial Recognition Training Data
subdesc: Where does face data originate and who's using it?
cssclass: dataset
image: assets/background.jpg
published: 2019-4-18
updated: 2019-4-19
authors: Adam Harvey
------------
## Analysis for the Munich Security Conference Transnational Security Report
### sidebar
+ Images Analyzed: 24,302,637
+ Datasets Analyzed: 30
+ Years: 2006 - 2018
+ Status: Ongoing Investigation
+ Last Updated: June 27, 2019
### end sidebar
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
=== columns 2
```
single_pie_chart /site/research/munich_security_conference/assets/megapixels_origins_top.csv
Caption: Sources of Publicly Available Face Training Data 2006 - 2018
Top: 10
OtherLabel: Other
```
===
```
single_pie_chart /site/research/munich_security_conference/assets/summary_countries.csv
Caption: Locations Where Face Data Is Used
Top: 14
OtherLabel: Other
```
=== end columns
=== columns 2
#### Sources of Face Data
Add text
| Source | Images |
| --- | --- |
|Search Engines | 30,127,200 |
|Flickr.com | 11,783,888 |
|IMDb.com | 5,251,410 |
|CCTV | 959,312 |
|Wikimedia.org | 183,500 |
|Mugshots | 113,268 |
|Other Sources Combined | 37,044 |
|YouTube.com | 31,888 |
===
#### Where Face Data Is Used
Add text
| Country | Citations |
| --- | --- |
|China | 327|
|United States | 302|
|United Kingdom | 187|
|Australia | 38|
|Germany | 35|
|Singapore | 27|
|Canada | 25|
|Netherlands | 25|
|Italy | 22|
|France | 17|
|India | 14|
|South Korea | 12|
|Spain | 10|
|Switzerland | 9|
=== end columns
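To make the relationship between the tables and the pie charts concrete, the following is a minimal sketch of how raw counts could be reduced to a top-N breakdown with a combined remainder, then converted to percentage shares. The image counts are copied from the "Sources of Face Data" table; the helper names `top_n_with_other` and `shares` are hypothetical, and the grouping logic is only an assumption about what the `Top` and `OtherLabel` chart options do, not the site's actual implementation.

```python
# Image counts copied from the "Sources of Face Data" table.
SOURCES = {
    "Search Engines": 30_127_200,
    "Flickr.com": 11_783_888,
    "IMDb.com": 5_251_410,
    "CCTV": 959_312,
    "Wikimedia.org": 183_500,
    "Mugshots": 113_268,
    "Other Sources Combined": 37_044,
    "YouTube.com": 31_888,
}


def top_n_with_other(counts, top, other_label="Other"):
    """Keep the `top` largest entries and fold the rest into one bucket.

    Hypothetical helper: an assumption about how the Top / OtherLabel
    chart options behave, not the site's actual code.
    """
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:top])
    remainder = sum(value for _, value in ranked[top:])
    if remainder:
        # Merge into an existing bucket if the label is already present.
        kept[other_label] = kept.get(other_label, 0) + remainder
    return kept


def shares(counts):
    """Convert absolute counts to percentage shares of the total."""
    total = sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}


if __name__ == "__main__":
    grouped = top_n_with_other(SOURCES, top=3)
    for name, pct in shares(grouped).items():
        print(f"{name}: {pct:.1f}%")
```

The same two helpers would apply unchanged to the citation counts in the "Where Face Data Is Used" table.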
## Over 6,000 Embassy Images on Flickr Found in Face Recognition Datasets
Including over 2,000 more for racial analysis



=== columns 2
```
single_pie_chart /site/research/munich_security_conference/assets/megapixels_origins_top.csv
Caption: Sources of Face Training Data
Top: 5
OtherLabel: Other Countries
Colors: categoryRainbow
```
===
```
single_pie_chart /site/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv
Caption: Dataset sources
Top: 4
OtherLabel: Other
Colors: categoryRainbow
```
=== end columns
{% include 'supplementary_header.html' %}
```
load_file /site/research/munich_security_conference/assets/embassy_counts_public.csv
Headings: Images, Dataset, Embassy, Flickr ID, URL, Guest, Host
```
{% include 'cite_our_work.html' %}