<!doctype html>
<html>
<head>
  <title>MegaPixels: IJB-C</title>
  <meta charset="utf-8" />
  <meta name="author" content="Adam Harvey" />
  <meta name="description" content="IARPA Janus Benchmark C is a dataset of web images used for face recognition research and development" />
  <meta property="og:title" content="MegaPixels: IJB-C"/>
  <meta property="og:type" content="website"/>
  <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/ijb_c/assets/background.jpg" />
  <meta property="og:url" content="https://megapixels.cc/datasets/ijb_c/"/>
  <meta property="og:site_name" content="MegaPixels" />
  <meta name="referrer" content="no-referrer" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no"/>
  <meta name="apple-mobile-web-app-status-bar-style" content="black">
  <meta name="apple-mobile-web-app-capable" content="yes">

  <link rel="apple-touch-icon" sizes="57x57" href="/assets/img/favicon/apple-icon-57x57.png">
  <link rel="apple-touch-icon" sizes="60x60" href="/assets/img/favicon/apple-icon-60x60.png">
  <link rel="apple-touch-icon" sizes="72x72" href="/assets/img/favicon/apple-icon-72x72.png">
  <link rel="apple-touch-icon" sizes="76x76" href="/assets/img/favicon/apple-icon-76x76.png">
  <link rel="apple-touch-icon" sizes="114x114" href="/assets/img/favicon/apple-icon-114x114.png">
  <link rel="apple-touch-icon" sizes="120x120" href="/assets/img/favicon/apple-icon-120x120.png">
  <link rel="apple-touch-icon" sizes="144x144" href="/assets/img/favicon/apple-icon-144x144.png">
  <link rel="apple-touch-icon" sizes="152x152" href="/assets/img/favicon/apple-icon-152x152.png">
  <link rel="apple-touch-icon" sizes="180x180" href="/assets/img/favicon/apple-icon-180x180.png">
  <link rel="icon" type="image/png" sizes="192x192"  href="/assets/img/favicon/android-icon-192x192.png">
  <link rel="icon" type="image/png" sizes="32x32" href="/assets/img/favicon/favicon-32x32.png">
  <link rel="icon" type="image/png" sizes="96x96" href="/assets/img/favicon/favicon-96x96.png">
  <link rel="icon" type="image/png" sizes="16x16" href="/assets/img/favicon/favicon-16x16.png">
  <link rel="manifest" href="/assets/img/favicon/manifest.json">
  <meta name="msapplication-TileColor" content="#ffffff">
  <meta name="msapplication-TileImage" content="/ms-icon-144x144.png">
  <meta name="theme-color" content="#ffffff">
  
  <link rel='stylesheet' href='/assets/css/fonts.css' />
  <link rel='stylesheet' href='/assets/css/css.css' />
  <link rel='stylesheet' href='/assets/css/leaflet.css' />
  <link rel='stylesheet' href='/assets/css/applets.css' />
  <link rel='stylesheet' href='/assets/css/mobile.css' />
</head>
<body>
  <header>
    <a class='slogan' href="/">
      <div class='logo'></div>
      <div class='site_name'>MegaPixels</div>
      <div class='page_name'>IJB-C</div>
    </a>
    <div class='links'>
      <a href="/datasets/">Datasets</a>
      <a href="/about/">About</a>
      <a href="/about/updates">Updates</a>
    </div>
  </header>
  <div class="content content-dataset">
    
  <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/ijb_c/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'>IARPA Janus Benchmark C is a dataset of web images used for face recognition research and development</span></div><div class='hero_subdesc'><span class='bgpad'>The IJB-C dataset contains 21,294 images and 11,779 videos of 3,531 identities
</span></div></div></section><section><h2>IARPA Janus Benchmark C (IJB-C)</h2>
</section><section><div class='right-sidebar'><div class='meta'>
    <div class='gray'>Published</div>
    <div>2017</div>
  </div><div class='meta'>
    <div class='gray'>Images</div>
    <div>21,294 </div>
  </div><div class='meta'>
    <div class='gray'>Videos</div>
    <div>11,779 </div>
  </div><div class='meta'>
    <div class='gray'>Identities</div>
    <div>3,531 </div>
  </div><div class='meta'>
    <div class='gray'>Purpose</div>
    <div>Face recognition</div>
  </div><div class='meta'>
    <div class='gray'>Website</div>
    <div><a href='https://www.nist.gov/programs-projects/face-challenges' target='_blank' rel='nofollow noopener'>nist.gov</a></div>
  </div></div><p>[ page under development ]</p>
<p>The IARPA Janus Benchmark C (IJB&ndash;C) is a dataset of web images used for face recognition research and development. The IJB&ndash;C dataset contains 21,294 images and 11,779 videos of 3,531 people.</p>
<p>The target list of 3,531 names includes activists, artists, journalists, and foreign politicians, among others.</p>
<ul>
<li>Subjects: 3,531</li>
<li>Templates: 140,739</li>
<li>Genuine Matches: 7,819,362</li>
<li>Impostor Matches: 39,584,639</li>
</ul>
<p>Why not include US soldiers instead of activists?</p>
<p>IJB&ndash;C was created by Noblis, a United States Government contractor, and is used to develop software for US intelligence agencies as part of the IARPA Janus program.</p>
<p>The IARPA Janus program is</p>
<p>these representations must address the challenges of Aging, Pose, Illumination, and Expression (A-PIE) by exploiting all available imagery.</p>
<ul>
<li>metadata annotations were created using crowd annotations</li>
<li>created by Noblis</li>
<li>used Amazon Mechanical Turk</li>
<li>made for intelligence analysts</li>
<li>improve performance of face recognition tools</li>
<li>by fusing the rich spatial, temporal, and contextual information available from the multiple views captured by today’s "media in the wild"</li>
</ul>
<p>The name list includes</p>
<ul>
<li>2 videos from CCC<ul>
<li>yq6ZC-YLHZA.png<ul>
<li>Katharina Nocun: Deine Rechte sind in diesen Freihandelsabkommen nicht verfügbar</li>
</ul>
</li>
<li>fF2MxkDzlVg<ul>
<li>Jillian York: "Technology companies now hold an unprecedented ability to shape the world around us by limiting our ability to access certain content and by crafting proprietary algorithms that bring us our daily streams of content. Matthew Stender, Jillian C. York"</li>
</ul>
</li>
<li>Maya Zankoul. She's an old friend, a Lebanese web designer who's put out a couple of books locally and has a Wikipedia page, probably created by a Lebanese Wikipedia editor die-hard. Not famous. How on earth?</li>
<li>Melissa Gira Grant (also a journalist)</li>
<li>Nadezhda Tolokonnikova (Pussy Riot)</li>
<li>Derrick Ashong (activist and journalist)</li>
<li>Michael Anti</li>
<li>Lina Ben Mhenni</li>
<li>Manal al-Sharif</li>
<li>Juan Carlos de Martin (not an activist but not really famous either!)</li>
<li>Anita Sarkeesian</li>
<li>Amal Clooney (lawyer)</li>
<li>Anil Dash (startup guy)</li>
<li>Bruno Latour (philosopher)</li>
<li>Dan Gillmor (tech journalist)</li>
<li>Eben Upton (founder of Raspberry Pi)</li>
<li>Evgeny Morozov</li>
<li>Gabriella Coleman</li>
<li>Maria Popova</li>
<li>Molly Crabapple</li>
<li>Paola Antonelli</li>
<li>Seymour Hersh</li>
<li>Ta-Nehisi Coates</li>
</ul>
</li>
</ul>
<p>The first 777 names are not in alphabetical order. Names 777 through 3,531 are listed alphabetically.</p>
</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/ijb_c/assets/ijb_c_montage.jpg' alt=' A visualization of the IJB-C dataset'><div class='caption'> A visualization of the IJB-C dataset</div></div></section><section><h2>Research notes</h2>
<p>From original papers: <a href="https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf">https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf</a></p>
<p>Collection for the dataset began by identifying Creative Commons subject videos, which are often more scarce than Creative Commons subject images. Search terms that resulted in large quantities of person-centric videos (e.g. “interview”) were generated and translated into numerous languages including Arabic, Korean, Swahili, and Hindi to increase diversity of the subject pool. Certain YouTube users who upload well-labeled, person-centric videos, such as the World Economic Forum and the International University Sports Federation were also identified. Titles of videos pertaining to these search terms and usernames were scraped using the YouTube Data API and translated into English using the Yandex Translate API. Pattern matching was performed to extract potential names of subjects from the translated titles, and these names were searched using the Wikidata API to verify the subject’s existence and status as a public figure, and to check for Wikimedia Commons imagery. Age, gender, and geographic region were collected using the Wikipedia API. Using the candidate subject names, Creative Commons images were scraped from Google and Wikimedia Commons, and Creative Commons videos were scraped from YouTube. After images and videos of the candidate subject were identified, AMT Workers were tasked with validating the subject’s presence throughout the video. The AMT Workers marked segments of the video in which the subject was present, and key frames</p>
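<p>The pattern-matching step in the passage above can be illustrated with a minimal sketch. The paper does not publish its actual patterns, so everything here is an assumption made for illustration: the regular expression <code>NAME_PATTERN</code>, the stop list <code>NOT_NAMES</code>, the function <code>extract_candidate_names</code>, and the sample titles with placeholder names are all hypothetical. It only shows how capitalized word sequences might be pulled out of translated video titles before being checked against a source such as Wikidata.</p>

```python
import re

# Hypothetical pattern (an assumption, not the paper's): two or three
# consecutive capitalized words, e.g. "Jane Example".
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+){1,2}\b")

# Hypothetical stop list of capitalized phrases that are not people.
NOT_NAMES = {"World Economic Forum"}


def extract_candidate_names(title):
    """Return capitalized word sequences in a title that might be names."""
    matches = NAME_PATTERN.findall(title)
    return [m for m in matches if m not in NOT_NAMES]


# Made-up titles with placeholder names, for illustration only.
titles = [
    "Interview with Jane Example on privacy",
    "World Economic Forum: Jane Example and John Sample in conversation",
]

for title in titles:
    names = extract_candidate_names(title)
    print(f"{title!r}: {names}")
```

<p>A pattern this simple would miss many real names (single names, names with particles or non-Latin scripts) and would also match capitalized phrases that are not people, which may be why the passage describes a separate verification step against Wikidata.</p>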
<p>IARPA funds Italian researcher <a href="https://www.micc.unifi.it/projects/glaivejanus/">https://www.micc.unifi.it/projects/glaivejanus/</a></p>
</section><section>
  <h3>Who used IJB-C?</h3>

  <p>
    This bar chart presents a ranking of the top countries where dataset citations originated.  Mouse over individual columns to see yearly totals. These charts show at most the top 10 countries.
  </p>
 
 </section>

<section class="applet_container">
<!-- 	<div style="position: absolute;top: 0px;right: -55px;width: 180px;font-size: 14px;">Labeled Faces in the Wild Dataset<br><span class="numc" style="font-size: 11px;">20 citations</span>
</div> -->
 <div class="applet" data-payload="{&quot;command&quot;: &quot;chart&quot;}"></div>
</section>

<section class="applet_container">
 <div class="applet" data-payload="{&quot;command&quot;: &quot;piechart&quot;}"></div>
</section>

<section>
	
	<h3>Information Supply Chain</h3>

	<p>
		To help understand how IJB-C has been used around the world by commercial, military, and academic organizations, existing publicly available research citing IARPA Janus Benchmark C was collected, verified, and geocoded to show the biometric trade routes of people appearing in the images. Click on the markers to reveal research projects at that location.
	</p>
 
 </section>

<section class="applet_container fullwidth">
 <div class="applet" data-payload="{&quot;command&quot;: &quot;map&quot;}"></div>
</section>

<div class="caption">
	<ul class="map-legend">
	<li class="edu">Academic</li>
	<li class="com">Commercial</li>
	<li class="gov">Military / Government</li>
	</ul>
	<div class="source">Citation data is collected using <a href="https://semanticscholar.org" target="_blank">SemanticScholar.org</a>; dataset usage is then verified and geolocated.</div>
</div>


<section class="applet_container">

  <h3>Dataset Citations</h3>
  <p>
    The dataset citations used in the visualizations were collected from <a href="https://www.semanticscholar.org">Semantic Scholar</a>, a website which aggregates and indexes research papers.  Each citation was geocoded using names of institutions found in the PDF front matter, or as listed on other resources.  These papers have been manually verified to show that researchers downloaded and used the dataset to train or test machine learning algorithms. If you use our data, please <a href="/about/attribution">cite our work</a>.
  </p>

  <div class="applet" data-payload="{&quot;command&quot;: &quot;citations&quot;}"></div>
</section><section>

  <div class="hr-wave-holder">
      <div class="hr-wave-line hr-wave-line1"></div>
      <div class="hr-wave-line hr-wave-line2"></div>
  </div>

  <h2>Supplementary Information</h2>
  
</section><section>

  <h4>Cite Our Work</h4>
  <p>
  	
  	If you find this analysis helpful, please cite our work:

<pre id="cite-bibtex">
@online{megapixels,
  author = {Harvey, Adam and LaPlace, Jules},
  title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets},
  year = 2019,
  url = {https://megapixels.cc/},
  urldate = {2019-04-18}
}</pre>

	</p>
</section>

  </div>
  <footer>
    <ul class="footer-left">
      <li><a href="/">MegaPixels.cc</a></li>
      <li><a href="/datasets/">Datasets</a></li>
      <li><a href="/about/">About</a></li>
      <li><a href="/about/press/">Press</a></li>
      <li><a href="/about/legal/">Legal and Privacy</a></li>
    </ul>
    <ul class="footer-right">
      <li>MegaPixels &copy;2017-19 &nbsp;<a href="https://ahprojects.com">Adam R. Harvey</a></li>
      <li>Made with support from &nbsp;<a href="https://mozilla.org">Mozilla</a></li>
    </ul>
  </footer>
</body>

<script src="/assets/js/dist/index.js"></script>
</html>