<!doctype html>
<html>
<head>
  <title>MegaPixels</title>
  <meta charset="utf-8" />
  <meta name="author" content="Adam Harvey" />
  <meta name="description" content="Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition." />
  <meta name="referrer" content="no-referrer" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <link rel='stylesheet' href='/assets/css/fonts.css' />
  <link rel='stylesheet' href='/assets/css/tabulator.css' />
  <link rel='stylesheet' href='/assets/css/css.css' />
  <link rel='stylesheet' href='/assets/css/leaflet.css' />
  <link rel='stylesheet' href='/assets/css/applets.css' />
</head>
<body>
  <header>
    <a class='slogan' href="/">
      <div class='logo'></div>
      <div class='site_name'>MegaPixels</div>
    </a>
    <div class='links'>
      <a href="/datasets/">Datasets</a>
      <a href="/research/">Research</a>
      <a href="/about/">About</a>
    </div>
  </header>
  <div class="content content-">
    
  <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span><span style="color:#ff0000">Labeled Faces in The Wild (LFW)</span> is a database of face photographs designed for studying the problem of unconstrained face recognition.</span></div><div class='hero_subdesc'><span>It includes 13,233 images of 5,749 people, copied from the Internet during 2002-2004.
</span></div></div></section><section><div class='image'><div class='intro-caption caption'>A few of the 5,749 people in the Labeled Faces in the Wild Dataset, the most widely used face dataset for benchmarking face recognition algorithms.</div></div></section><section><div class='left-sidebar'><div class='meta'><div><div class='gray'>Created</div><div>2002-2004</div></div><div><div class='gray'>Images</div><div>13,233</div></div><div><div class='gray'>Identities</div><div>5,749</div></div><div><div class='gray'>Origin</div><div>Yahoo! News Images</div></div><div><div class='gray'>Used by</div><div>Facebook, Google, Microsoft, Baidu, Tencent, SenseTime, Face++, CIA, NSA, IARPA</div></div><div><div class='gray'>Website</div><div><a href="http://vis-www.cs.umass.edu/lfw">vis-www.cs.umass.edu/lfw</a></div></div></div><ul>
<li>There are about 3 men for every 1 woman in the LFW dataset<a class="footnote_shim" name="[^lfw_www]_1"> </a><a href="#[^lfw_www]" class="footnote" title="Footnote 1">1</a></li>
<li>The person with the most images is <a href="http://vis-www.cs.umass.edu/lfw/person/George_W_Bush_comp.html">George W. Bush</a> with 530</li>
<li>There are about 3 George W. Bush's for every 1 <a href="http://vis-www.cs.umass.edu/lfw/person/Tony_Blair.html">Tony Blair</a></li>
<li>The LFW dataset includes over 500 actors, 30 models, 10 presidents, 124 basketball players, 24 football players, 11 kings, 7 queens, and 1 <a href="http://vis-www.cs.umass.edu/lfw/person/Moby.html">Moby</a></li>
<li>In all 3 of the LFW publications [^lfw_original_paper], [^lfw_survey], [^lfw_tech_report] the words "ethics", "consent", and "privacy" appear 0 times</li>
<li>The word "future" appears 71 times</li>
<li>* denotes partial funding for related research</li>
</ul>
</div><h2>Labeled Faces in the Wild</h2>
<p><em>Labeled Faces in The Wild</em> (LFW) is "a database of face photographs designed for studying the problem of unconstrained face recognition"<a class="footnote_shim" name="[^lfw_www]_2"> </a><a href="#[^lfw_www]" class="footnote" title="Footnote 1">1</a>. It is used to evaluate and improve the performance of facial recognition algorithms in academic, commercial, and government research. According to BiometricUpdate.com<a class="footnote_shim" name="[^lfw_pingan]_1"> </a><a href="#[^lfw_pingan]" class="footnote" title="Footnote 3">3</a>, LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."</p>
<p>The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002 and 2004. LFW is a subset of <em>Names and Faces</em> and is part of the first facial recognition training dataset created entirely from images appearing on the Internet. The people appearing in LFW are...</p>
<p>The <em>Names and Faces</em> dataset was the first face recognition dataset created entirely from online photos. However, <em>Names and Faces</em> and <em>LFW</em> are not the first face recognition datasets created entirely "in the wild". That title belongs to the <a href="/datasets/ucd_faces/">UCD dataset</a>. Obtaining images "in the wild" means using them without explicit consent or awareness from the subject or photographer.</p>
</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_all_crop.jpg' alt='All 5,749 people in the Labeled Faces in The Wild Dataset, showing one face per person'><div class='caption'>All 5,749 people in the Labeled Faces in The Wild Dataset, showing one face per person</div></div></section><section>		<h3>Biometric Trade Routes</h3><!-- 	<div class="map-sidebar right-sidebar">	  <h3>Legend</h3>	  <ul>	    <li><span style="color: #f2f293">&#9632;</span> Industry</li>	    <li><span style="color: #f30000">&#9632;</span> Academic</li>	    <li><span style="color: #3264f6">&#9632;</span> Government</li>	  </ul>	</div>	 -->	<p>		To understand how this dataset has been used, its citations have been geocoded to show an approximate geographic digital trade route of the biometric data. Each line indicates an organization (academic, commercial, or governmental) that has cited the LFW dataset in its research. Data is compiled from <a href="https://www.semanticscholar.org">Semantic Scholar</a>.	</p>  </section><section class="applet_container"> <div class="applet" data-payload="{&quot;command&quot;: &quot;map&quot;}"></div></section><div class="caption">	<div class="map-legend-item"><span class="edu">&#9632;</span> Academic</div>	<div class="map-legend-item"><span class="com">&#9632;</span> Industry</div>	<div class="map-legend-item"><span class="gov">&#9632;</span> Government</div></div><section>
<hr class="supp">

<h2>Supplementary Information for Labeled Faces in The Wild</h2>
</section><section class="applet_container">  <h3>Citations</h3>  <p>Add graph showing distribution by country. Add information about how the citations were generated. Add button/link to download CSV</p>  <div class="applet" data-payload="{&quot;command&quot;: &quot;citations&quot;}"></div></section><section>  <h3>Synthetic Faces</h3>  <p>To visualize the types of photos in the dataset without explicitly publishing individuals' identities, a generative adversarial network (GAN) was trained on the entire dataset. The images in this video show a neural network learning the visual latent space and then interpolating between archetypical identities within the LFW dataset.</p></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/synthetic_01.jpg' alt='Synthetically generated face from the visual space of LFW dataset'><div class='caption'>Synthetically generated face from the visual space of LFW dataset</div></div>
<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/synthetic_02.jpg' alt='Synthetically generated face from the visual space of LFW dataset'><div class='caption'>Synthetically generated face from the visual space of LFW dataset</div></div>
<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/synthetic_03.jpg' alt='Synthetically generated face from the visual space of LFW dataset'><div class='caption'>Synthetically generated face from the visual space of LFW dataset</div></div>
<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/synthetic_01.jpg' alt='Synthetically generated face from the visual space of LFW dataset'><div class='caption'>Synthetically generated face from the visual space of LFW dataset</div></div></section><section><h3>Commercial Use of Labeled Faces in The Wild</h3>
<p>Add a paragraph about how usage extends far beyond academia into the research centers of the largest companies in the world, and even funnels into CIA-funded research in the US and defense industry usage in China.</p>
</section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_commercial_use.csv", "fields": ["name_display, company_url, example_url, country, description"]}'></div></section><section><h3>Code</h3>
<p>The LFW dataset is so widely used that access to the facial data has been built directly into scikit-learn, a popular machine learning library for Python. It includes a function called <code>fetch_lfw_people</code> to download the faces in the LFW dataset.</p>
</section><section><pre><code class="lang-python">#!/usr/bin/env python3

import numpy as np
from sklearn.datasets import fetch_lfw_people
import imageio
import imutils

# download LFW dataset (first run takes a while)
lfw_people = fetch_lfw_people(min_faces_per_person=1, resize=1, color=True, funneled=False)

# introspect dataset
n_samples, h, w, c = lfw_people.images.shape
print(f&#39;{n_samples:,} images at {w}x{h} pixels&#39;)
cols, rows = (176, 76)
n_ims = cols * rows

# build montages
im_scale = 0.5
ims = lfw_people.images[:n_ims]
montages = imutils.build_montages(ims, (int(w * im_scale), int(h * im_scale)), (cols, rows))
montage = montages[0]

# save full montage image
imageio.imwrite(&#39;lfw_montage_full.png&#39;, montage)

# make a smaller version
montage = imutils.resize(montage, width=960)
imageio.imwrite(&#39;lfw_montage_960.jpg&#39;, montage)
</code></pre>
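<p>The grid size <code>(176, 76)</code> in the script above is not explained. One plausible reading, offered here only as an assumption, is that 76 is the smallest number of rows that lets a 176-column grid hold every image in the dataset. The sketch below checks that arithmetic with a hypothetical helper, <code>montage_grid</code>, which is not part of scikit-learn or imutils.</p>

```python
import math


def montage_grid(n_images, cols):
    """Return (rows, empty_tiles) for a cols-wide grid that holds n_images tiles.

    Hypothetical helper for illustration only.
    """
    rows = math.ceil(n_images / cols)      # the last row may be only partly filled
    empty_tiles = cols * rows - n_images   # tiles left blank at the end of the grid
    return rows, empty_tiles


# 13,233 is the LFW image count quoted on this page; 176 is the column
# count used in the montage script above.
rows, empty_tiles = montage_grid(13233, 176)
print(f"176 x {rows} grid with {empty_tiles} empty tiles")
```

<p>Under that assumption, a different column count can be passed to the same helper to pick another aspect ratio for the montage.</p>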
</section><section><p>Research, text, and graphics ©Adam Harvey / megapixels.cc</p>
</section><section><ul class="footnotes"><li><a name="[^lfw_www]" class="footnote_shim"></a><span class="backlinks"><a href="#[^lfw_www]_1">a</a><a href="#[^lfw_www]_2">b</a></span><p><a href="http://vis-www.cs.umass.edu/lfw/results.html">http://vis-www.cs.umass.edu/lfw/results.html</a></p>
</li><li><a name="[^lfw_baidu]" class="footnote_shim"></a><span class="backlinks"></span><p>Jingtuo Liu, Yafeng Deng, Tao Bai, Zhengping Wei, Chang Huang. Targeting Ultimate Accuracy: Face Recognition via Deep Embedding. <a href="https://arxiv.org/abs/1506.07310">https://arxiv.org/abs/1506.07310</a></p>
</li><li><a name="[^lfw_pingan]" class="footnote_shim"></a><span class="backlinks"><a href="#[^lfw_pingan]_1">a</a></span><p>Lee, Justin. "PING AN Tech facial recognition receives high score in latest LFW test results". BiometricUpdate.com. Feb 13, 2017. <a href="https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results">https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results</a></p>
</li></ul></section>

  </div>
  <footer>
    <div>
      <a href="/">MegaPixels.cc</a>
      <a href="/about/disclaimer/">Disclaimer</a>
      <a href="/about/terms/">Terms of Use</a>
      <a href="/about/privacy/">Privacy</a>
      <a href="/about/">About</a>
      <a href="/about/team/">Team</a>
    </div>
    <div>
      MegaPixels &copy;2017-19 Adam R. Harvey /&nbsp;
      <a href="https://ahprojects.com">ahprojects.com</a>
    </div>
  </footer>
  <script src="/assets/js/dist/index.js"></script>
</body>
</html>