4 files changed, 572 insertions, 0 deletions
diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html
new file mode 100644
index 00000000..e080229f
--- /dev/null
+++ b/site/public/datasets/lfw/index.html
@@ -0,0 +1,283 @@
+<!doctype html>
+<html>
+<head>
+  <title>MegaPixels</title>
+  <meta charset="utf-8" />
+  <meta name="author" content="Adam Harvey" />
+  <meta name="description" content="LFW: Labeled Faces in The Wild" />
+  <meta name="referrer" content="no-referrer" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+  <link rel='stylesheet' href='/assets/css/fonts.css' />
+  <link rel='stylesheet' href='/assets/css/css.css' />
+</head>
+<body>
+  <header>
+    <a class='slogan' href="/">
+      <div class='logo'></div>
+      <div class='site_name'>MegaPixels</div>
+      <span class='sub'>The Darkside of Datasets</span>
+    </a>
+    <div class='links'>
+      <a href="/search/">Face Search</a>
+      <a href="/datasets/">Datasets</a>
+      <a href="/research/">Research</a>
+      <a href="/about/">About</a>
+    </div>
+  </header>
+  <div class="content">
+    
+  <section><h1>Labeled Faces in the Wild</h1>
+</section><section><div class='meta'><div><div class='gray'>Created</div><div>2007</div></div><div><div class='gray'>Images</div><div>13,233</div></div><div><div class='gray'>People</div><div>5,749</div></div><div><div class='gray'>Created From</div><div>Yahoo News images</div></div><div><div class='gray'>Search available</div><div>Searchable</div></div></div></section><section><p>Labeled Faces in The Wild (LFW) is amongst the most widely used facial recognition training datasets in the world and is the first of its kind to be created entirely from images posted online. The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002-2004. Use the tools below to check if you were included in this dataset or scroll down to read the analysis.</p>
+<p>{INSERT IMAGE SEARCH MODULE}</p>
+<p>{INSERT TEXT SEARCH MODULE}</p>
+<pre><code>load file: lfw_names_gender_kg_min.csv
+Name, Images, Gender, Description
+</code></pre>
+</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_feature.jpg' alt='Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.'><div class='caption'>Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.</div></div></section><section><h2>Intro</h2>
+<p>Three paragraphs describing the LFW dataset in a format that can be easily replicated for the other datasets. Nothing too custom. An analysis of the initial research papers with context relative to all the other dataset papers.</p>
+</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_everyone_1920.jpg' alt='All 5,749 people in the LFW Dataset, sorted from most to least images collected.'><div class='caption'>All 5,749 people in the LFW Dataset, sorted from most to least images collected.</div></div></section><section><h2>LFW by the Numbers</h2>
+<ul>
+<li>Was first published in 2007</li>
+<li>Developed out of a prior dataset from Berkeley called "Faces in the Wild" or "Names and Faces" [^lfw_original_paper]</li>
+<li>Includes 13,233 images and 5,749 different people [^lfw_website]</li>
+<li>There are about 3 men for every 1 woman (4,277 men and 1,472 women) [^lfw_website]</li>
+<li>The person with the most images is George W. Bush with 530</li>
+<li>Most people (70%) in the dataset have only 1 image</li>
+<li>There are 1,680 people in the dataset with 2 or more images [^lfw_website]</li>
+<li>Two of the four original authors received funding from the Office of the Director of National Intelligence and IARPA for their 2016 LFW survey follow-up report</li>
+<li>The LFW dataset includes over 500 actors, 30 models, 10 presidents, 24 football players, 124 basketball players, 11 kings, and 2 queens</li>
+<li>In all the LFW publications provided by the authors, the words "ethics", "consent", and "privacy" appear 0 times [^lfw_original_paper], [^lfw_survey], [^lfw_tech_report], [^lfw_website]</li>
+<li>The word "future" appears 71 times</li>
+</ul>
+<h1>Facts</h1>
+<ul>
+<li>Was created for the purpose of improving "unconstrained face recognition" [^lfw_original_paper]</li>
+<li>All images in LFW were obtained "in the wild", meaning without any consent from the subject or from the photographer</li>
+<li>The faces were detected using the Viola-Jones haarcascade face detector [^lfw_website] [^lfw_survey]</li>
+<li>Is considered the "most popular benchmark for face recognition" [^lfw_baidu]</li>
+<li>Is "the most widely used evaluation set in the field of facial recognition" [^lfw_pingan]</li>
+<li>Is used by several of the largest tech companies in the world including "Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong." [^lfw_pingan]</li>
+</ul>
+<p>need citations</p>
+<ul>
+<li>All images were copied from Yahoo News between 2002 - 2004 [^lfw_original_paper]</li>
+<li>SenseTime, who has relied on LFW for benchmarking their facial recognition performance, is the leading provider of surveillance to the Chinese Government (need citation)</li>
+</ul>
+</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_top1_640.jpg' alt='Former President George W. Bush'><div class='caption'>Former President George W. Bush</div></div>
+<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_top2_4_640.jpg' alt='Colin Powell (236), Tony Blair (144), and Donald Rumsfeld (121)'><div class='caption'>Colin Powell (236), Tony Blair (144), and Donald Rumsfeld (121)</div></div></section><section><h2>People and Companies using the LFW Dataset</h2>
+<p>This section describes who is using the dataset and for what purposes. It should include specific examples of people or companies with citations and screenshots. This section is followed up by the graph, the map, and then the supplementary material.</p>
+<p>The LFW dataset is used by numerous companies for <a href="about/glossary#benchmarking">benchmarking</a> algorithms and, in some cases, <a href="about/glossary#training">training</a>. According to the benchmarking results page [^lfw_results] provided by the authors, more than two dozen companies have contributed their benchmark results.</p>
+<p>According to BiometricUpdate.com [^lfw_pingan], LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."</p>
+<p>According to researchers at the Baidu Research – Institute of Deep Learning, "LFW has been the most popular evaluation benchmark for face recognition, and played a very important role in facilitating the face recognition society to improve algorithm." [^lfw_baidu]</p>
+<p>In addition to commercial use as an evaluation tool, all of the faces in the LFW dataset can be downloaded through a built-in loader in scikit-learn, a popular machine learning library.</p>
+<pre><code>load file: lfw_commercial_use.csv
+name_display,company_url,example_url,country,description
+</code></pre>
+<table>
+<thead><tr>
+<th style="text-align:left">Company</th>
+<th style="text-align:left">Country</th>
+<th style="text-align:left">Industries</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td style="text-align:left"><a href="http://www.aratek.co">Aratek</a></td>
+<td style="text-align:left">China</td>
+<td style="text-align:left">Biometric sensors for telecom, civil identification, finance, education, POS, and transportation</td>
+</tr>
+<tr>
+<td style="text-align:left"><a href="http://www.aratek.co">Aratek</a></td>
+<td style="text-align:left">China</td>
+<td style="text-align:left">Biometric sensors for telecom, civil identification, finance, education, POS, and transportation</td>
+</tr>
+<tr>
+<td style="text-align:left"><a href="http://www.aratek.co">Aratek</a></td>
+<td style="text-align:left">China</td>
+<td style="text-align:left">Biometric sensors for telecom, civil identification, finance, education, POS, and transportation</td>
+</tr>
+</tbody>
+</table>
+<p>Add 2-4 screenshots of companies mentioning LFW here</p>
+</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_screenshot_01.jpg' alt=' "PING AN Tech facial recognition receives high score in latest LFW test results"'><div class='caption'> "PING AN Tech facial recognition receives high score in latest LFW test results"</div></div>
+<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_screenshot_02.jpg' alt=' "Face Recognition Performance in LFW benchmark"'><div class='caption'> "Face Recognition Performance in LFW benchmark"</div></div>
+<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_screenshot_03.jpg' alt=' "The 1st place in face verification challenge, LFW"'><div class='caption'> "The 1st place in face verification challenge, LFW"</div></div></section><section><p>In benchmarking, companies use a dataset to evaluate their algorithms which are typically trained on other data. After training, researchers will use LFW as a benchmark to compare results with other algorithms.</p>
+<p>For example, Baidu (est. net worth $13B) uses LFW to report results for their paper "Targeting Ultimate Accuracy: Face Recognition via Deep Embedding".</p>
+<h2>Citations</h2>
+<p>Overall, LFW has at least 456 citations from 123 countries. Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. Nemo enim ipsam voluptatem, quia voluptas sit, aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos.</p>
+<p>Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. Nemo enim ipsam voluptatem, quia voluptas sit, aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos.</p>
+</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/temp_graph.jpg' alt='Distribution of citations per year per country for the top 5 countries with citations for the LFW Dataset'><div class='caption'>Distribution of citations per year per country for the top 5 countries with citations for the LFW Dataset</div></div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/temp_map.jpg' alt='Geographic distributions of citations for the LFW Dataset'><div class='caption'>Geographic distributions of citations for the LFW Dataset</div></div></section><section><h2>Conclusion</h2>
+<p>The LFW face recognition training and evaluation dataset is a historically important face dataset as it was the first popular dataset to be created entirely from Internet images, paving the way for a global trend towards downloading anyone’s face from the Internet and adding it to a dataset. As will be evident with other datasets, LFW’s approach has now become the norm.</p>
+<p>For the more than 5,000 people in this dataset, their faces are forever a part of facial recognition history. It would be impossible to remove anyone from the dataset because it is so ubiquitous. For the rest of their lives and forever after, these people will continue to be used for training facial recognition surveillance.</p>
+<h2>Right to Removal</h2>
+<p>If you are affected by disclosure of your identity in this dataset please do contact the authors. Many have stated that they are willing to remove images upon request. The authors of the LFW dataset provide the following email for inquiries:</p>
+<p>You can use the following message to request removal from the dataset:</p>
+<p>To: Gary Huang <a href="mailto:gbhuang@cs.umass.edu">gbhuang@cs.umass.edu</a></p>
+<p>Subject: Request for Removal from LFW Face Dataset</p>
+<p>Dear [researcher name],</p>
+<p>I am writing to you about the "Labeled Faces in The Wild Dataset". Recently I discovered that your dataset includes my identity and I no longer wish to be included in your dataset.</p>
+<p>The dataset is being used by thousands of companies around the world to improve facial recognition software, including usage by governments for the purpose of law enforcement, national security, tracking consumers in retail environments, and tracking individuals through public spaces.</p>
+<p>My name as it appears in your dataset is [your name]. Please remove all of my images from your dataset and inform your newsletter subscribers to likewise update their copies.</p>
+<p>- [your name]</p>
+<hr>
+<h2>Supplementary Data</h2>
+<p>Researchers, journ</p>
+<table>
+<thead><tr>
+<th style="text-align:left">Title</th>
+<th style="text-align:left">Organization</th>
+<th style="text-align:left">Country</th>
+<th style="text-align:left">Type</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">A Community Detection Approach to Cleaning Extremely Large Face Database</td>
+<td style="text-align:left">National University of Defense Technology, China</td>
+<td style="text-align:left">China</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+</tbody>
+</table>
+<h2>Code</h2>
+<pre><code class="lang-python">#!/usr/bin/python
+
+import numpy as np
+from sklearn.datasets import fetch_lfw_people
+import imageio
+import imutils
+
+# download LFW dataset (first run takes a while)
+lfw_people = fetch_lfw_people(min_faces_per_person=1, resize=1, color=True, funneled=False)
+
+# introspect dataset
+n_samples, h, w, c = lfw_people.images.shape
+print(&#39;{:,} images at {}x{}&#39;.format(n_samples, w, h))
+cols, rows = (176, 76)
+n_ims = cols * rows
+
+# build montages
+im_scale = 0.5
+ims = lfw_people.images[:n_ims]
+montages = imutils.build_montages(ims, (int(w*im_scale), int(h*im_scale)), (cols, rows))
+montage = montages[0]
+
+# save full montage image
+imageio.imwrite(&#39;lfw_montage_full.png&#39;, montage)
+
+# make a smaller version
+montage_960 = imutils.resize(montage, width=960)
+imageio.imwrite(&#39;lfw_montage_960.jpg&#39;, montage_960)
+</code></pre>
+<h2>Disclaimer</h2>
+<p>MegaPixels is an educational art project designed to encourage discourse about facial recognition datasets. Any ethical or legal issues should be directed to the researchers' parent organizations. Except where necessary for contact or clarity, the names of researchers have been substituted by their parent organization. In no way does this project aim to vilify researchers who produced the datasets.</p>
+<p>Read more about <a href="about/code-of-conduct">MegaPixels Code of Conduct</a></p>
+<div class="footnotes">
+<hr>
+<ol></ol>
+</div>
+</section>
+
+  </div>
+  <footer>
+    <div>
+      <a href="/">MegaPixels.cc</a>
+      <a href="/about/disclaimer/">Disclaimer</a>
+      <a href="/about/terms/">Terms of Use</a>
+      <a href="/about/privacy/">Privacy</a>
+      <a href="/about/">About</a>
+      <a href="/about/team/">Team</a>
+    </div>
+    <div>
+      MegaPixels &copy;2017-19 Adam R. Harvey /&nbsp;
+      <a href="https://ahprojects.com">ahprojects.com</a>
+    </div>
+  </footer>
+</body>
+
+<script src="/assets/js/app/site.js"></script>
+</html>
+\ No newline at end of file
diff --git a/site/public/datasets/lfw/what/index.html b/site/public/datasets/lfw/what/index.html
new file mode 100644
index 00000000..ceafb35a
--- /dev/null
+++ b/site/public/datasets/lfw/what/index.html
@@ -0,0 +1,142 @@
+<!doctype html>
+<html>
+<head>
+  <title>MegaPixels</title>
+  <meta charset="utf-8" />
+  <meta name="author" content="Adam Harvey" />
+  <meta name="description" content="LFW: Labeled Faces in The Wild" />
+  <meta name="referrer" content="no-referrer" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+  <link rel='stylesheet' href='/assets/css/fonts.css' />
+  <link rel='stylesheet' href='/assets/css/css.css' />
+</head>
+<body>
+  <header>
+    <a class='slogan' href="/">
+      <div class='logo'></div>
+      <div class='site_name'>MegaPixels</div>
+      <span class='sub'>The Darkside of Datasets</span>
+    </a>
+    <div class='links'>
+      <a href="/search/">Face Search</a>
+      <a href="/datasets/">Datasets</a>
+      <a href="/research/">Research</a>
+      <a href="/about/">About</a>
+    </div>
+  </header>
+  <div class="content">
+    
+  <section><h1>Labeled Faces in The Wild</h1>
+<ul>
+<li>Created 2007 (auto)</li>
+<li>Images 13,233 (auto)</li>
+<li>People 5,749 (auto)</li>
+<li>Created From Yahoo News images (auto)</li>
+<li>Analyzed and searchable (auto)</li>
+</ul>
+<p><em>Labeled Faces in The Wild</em> is amongst the most widely used facial recognition training datasets in the world and is the first facial recognition dataset [^lfw_names_faces] of its kind to be created entirely from Internet photos. It includes 13,233 images of 5,749 people that appeared on Yahoo News between 2002 - 2004.</p>
+</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_grid_preview.jpg' alt='Eight out of 5,749 people in the Labeled Faces in the Wild dataset. The face recognition training dataset is created entirely from photos downloaded from the Internet.'><div class='caption'>Eight out of 5,749 people in the Labeled Faces in the Wild dataset. The face recognition training dataset is created entirely from photos downloaded from the Internet.</div></div></section><section><h2>INTRO</h2>
+<p>It began in 2002. Researchers at University of Massachusetts Amherst were developing algorithms for facial recognition and they needed more data. Between 2002-2004 they scraped Yahoo News for images of public figures. Two years later they cleaned up the dataset and repackaged it as Labeled Faces in the Wild (LFW).</p>
+<p>Since then the LFW dataset has become one of the most widely used datasets for evaluating face recognition algorithms. The associated research paper “Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments” has been cited 996 times, reaching 45 different countries throughout the world.</p>
+<p>The faces come from news stories and are mostly celebrities from the entertainment industry, politicians, and villains. It’s a sampling of current affairs and breaking news that has come to pass. The images, detached from their original context, now serve a new purpose: to train, evaluate, and improve facial recognition.</p>
+<p>As the most widely used facial recognition dataset, it can be said that each individual in LFW has, in a small way, contributed to the current state of the art in facial recognition surveillance. John Cusack, Julianne Moore, Barry Bonds, Osama bin Laden, and even Moby are amongst these biometric pillars, exemplar faces that provided the visual dimensions of a new computer vision future.</p>
+</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_a_to_c.jpg' alt='From Aaron Eckhart to Zydrunas Ilgauskas. A small sampling of the LFW dataset'><div class='caption'>From Aaron Eckhart to Zydrunas Ilgauskas. A small sampling of the LFW dataset</div></div></section><section><p>In addition to commercial use as an evaluation tool, all of the faces in the LFW dataset can be downloaded through a built-in loader in scikit-learn, a popular machine learning library.</p>
+<h2>Usage</h2>
+<pre><code class="lang-python">#!/usr/bin/python
+import matplotlib.pyplot as plt
+from sklearn.datasets import fetch_lfw_people
+lfw_person = fetch_lfw_people().images[0]  # first face; downloads LFW on first run
+plt.imshow(lfw_person, cmap=&#39;gray&#39;)
+plt.show()
+</code></pre>
+<h2>Commercial Use</h2>
+<p>The LFW dataset is used by numerous companies for benchmarking algorithms and, in some cases, training. According to the benchmarking results page [^lfw_results] provided by the authors, more than two dozen companies have contributed their benchmark results.</p>
+<pre><code>load file: lfw_commercial_use.csv
+name_display,company_url,example_url,country,description
+</code></pre>
+<table>
+<thead><tr>
+<th style="text-align:left">Company</th>
+<th style="text-align:left">Country</th>
+<th style="text-align:left">Industries</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td style="text-align:left"><a href="http://www.aratek.co">Aratek</a></td>
+<td style="text-align:left">China</td>
+<td style="text-align:left">Biometric sensors for telecom, civil identification, finance, education, POS, and transportation</td>
+</tr>
+<tr>
+<td style="text-align:left"><a href="http://www.aratek.co">Aratek</a></td>
+<td style="text-align:left">China</td>
+<td style="text-align:left">Biometric sensors for telecom, civil identification, finance, education, POS, and transportation</td>
+</tr>
+<tr>
+<td style="text-align:left"><a href="http://www.aratek.co">Aratek</a></td>
+<td style="text-align:left">China</td>
+<td style="text-align:left">Biometric sensors for telecom, civil identification, finance, education, POS, and transportation</td>
+</tr>
+</tbody>
+</table>
+<p>Add 2-4 screenshots of companies mentioning LFW here</p>
+</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_screenshot_01.png' alt='ReadSense'><div class='caption'>ReadSense</div></div></section><section><p>In benchmarking, companies use a dataset to evaluate their algorithms which are typically trained on other data. After training, researchers will use LFW as a benchmark to compare results with other algorithms.</p>
+<p>For example, Baidu (est. net worth $13B) uses LFW to report results for their paper "Targeting Ultimate Accuracy: Face Recognition via Deep Embedding". According to the three Baidu researchers who produced the paper:</p>
+<blockquote><p>LFW has been the most popular evaluation benchmark for face recognition, and played a very important role in facilitating the face recognition society to improve algorithm. <sup class="footnote-ref" id="fnref-baidu_lfw"><a href="#fn-baidu_lfw">1</a></sup>.</p>
+</blockquote>
+<h2>Citations</h2>
+<table>
+<thead><tr>
+<th style="text-align:left">Title</th>
+<th style="text-align:left">Organization</th>
+<th style="text-align:left">Country</th>
+<th style="text-align:left">Type</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td style="text-align:left">3D-aided face recognition from videos</td>
+<td style="text-align:left">University of Lyon</td>
+<td style="text-align:left">France</td>
+<td style="text-align:left">edu</td>
+</tr>
+<tr>
+<td style="text-align:left">A Community Detection Approach to Cleaning Extremely Large Face Database</td>
+<td style="text-align:left">National University of Defense Technology, China</td>
+<td style="text-align:left">China</td>
+<td style="text-align:left">edu</td>
+</tr>
+</tbody>
+</table>
+<h2>Conclusion</h2>
+<p>The LFW face recognition training and evaluation dataset is a historically important face dataset as it was the first popular dataset to be created entirely from Internet images, paving the way for a global trend towards downloading anyone’s face from the Internet and adding it to a dataset. As will be evident with other datasets, LFW’s approach has now become the norm.</p>
+<p>For the more than 5,000 people in this dataset, their faces are forever a part of facial recognition history. It would be impossible to remove anyone from the dataset because it is so ubiquitous. For the rest of their lives and forever after, these people will continue to be used for training facial recognition surveillance.</p>
+<h2>Notes</h2>
+<p>According to BiometricUpdate.com<sup class="footnote-ref" id="fnref-biometric_update_lfw"><a href="#fn-biometric_update_lfw">2</a></sup>, LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."</p>
+<div class="footnotes">
+<hr>
+<ol><li id="fn-baidu_lfw"><p>"Chinese tourist town uses face recognition as an entry pass". New Scientist. November 17, 2016. <a href="https://www.newscientist.com/article/2113176-chinese-tourist-town-uses-face-recognition-as-an-entry-pass/">https://www.newscientist.com/article/2113176-chinese-tourist-town-uses-face-recognition-as-an-entry-pass/</a><a href="#fnref-baidu_lfw" class="footnote">&#8617;</a></p></li>
+<li id="fn-biometric_update_lfw"><p>"PING AN Tech facial recognition receives high score in latest LFW test results". <a href="https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results">https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results</a><a href="#fnref-biometric_update_lfw" class="footnote">&#8617;</a></p></li>
+</ol>
+</div>
+</section>
+
+  </div>
+  <footer>
+    <div>
+      <a href="/">MegaPixels.cc</a>
+      <a href="/about/disclaimer/">Disclaimer</a>
+      <a href="/about/terms/">Terms of Use</a>
+      <a href="/about/privacy/">Privacy</a>
+      <a href="/about/">About</a>
+      <a href="/about/team/">Team</a>
+    </div>
+    <div>
+      MegaPixels &copy;2017-19 Adam R. Harvey /&nbsp;
+      <a href="https://ahprojects.com">ahprojects.com</a>
+    </div>
+  </footer>
+</body>
+
+<script src="/assets/js/app/site.js"></script>
+</html>
+\ No newline at end of file
diff --git a/site/public/datasets/vgg_face2/index.html b/site/public/datasets/vgg_face2/index.html
new file mode 100644
index 00000000..24a1059b
--- /dev/null
+++ b/site/public/datasets/vgg_face2/index.html
@@ -0,0 +1,84 @@
+<!doctype html>
+<html>
+<head>
+  <title>MegaPixels</title>
+  <meta charset="utf-8" />
+  <meta name="author" content="Adam Harvey" />
+  <meta name="description" content="A large scale image dataset for face recognition" />
+  <meta name="referrer" content="no-referrer" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+  <link rel='stylesheet' href='/assets/css/fonts.css' />
+  <link rel='stylesheet' href='/assets/css/css.css' />
+</head>
+<body>
+  <header>
+    <a class='slogan' href="/">
+      <div class='logo'></div>
+      <div class='site_name'>MegaPixels</div>
+      <span class='sub'>The Darkside of Datasets</span>
+    </a>
+    <div class='links'>
+      <a href="/search/">Face Search</a>
+      <a href="/datasets/">Datasets</a>
+      <a href="/research/">Research</a>
+      <a href="/about/">About</a>
+    </div>
+  </header>
+  <div class="content">
+    
+  <section><h1>VGG Face2</h1>
+</section><section><div class='meta'><div><div class='gray'>Created</div><div>2018</div></div><div><div class='gray'>Images</div><div>3.3M</div></div><div><div class='gray'>People</div><div>9,000</div></div><div><div class='gray'>Created From</div><div>Scraping search engines</div></div><div><div class='gray'>Search available</div><div><a href="#">Searchable</a></div></div></div></section><section><p>VGG Face2 is the updated version of the VGG Face dataset and now includes over 3.3M face images from over 9K people. The identities were selected by taking the top 500K identities in Google's Knowledge Graph of celebrities and then selecting only the names that yielded enough training images. The dataset was created in the UK but funded by the Office of the Director of National Intelligence in the United States.</p>
+<p>{INSERT IMAGE SEARCH MODULE}</p>
+<p>{INSERT TEXT SEARCH MODULE}</p>
+<pre><code>load file: lfw_names_gender_kg_min.csv
+Name, Images, Gender, Description
+</code></pre>
+<h2>VGG Face2 by the Numbers</h2>
+<ul>
+<li>1,331 actresses, 139 presidents</li>
+<li>3 husbands and 16 wives</li>
+<li>2 snooker players</li>
+<li>1 guru</li>
+<li>1 pornographic actress</li>
+<li>3 computer programmers</li>
+</ul>
+<h1>Names and descriptions</h1>
+<ul>
+<li>The original VGGF2 name list has been updated with the results returned from Google Knowledge</li>
+<li>Names with a similarity score greater than 0.75 where automatically updated. Scores computed using <code>import difflib; seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower()); score = seq.ratio()</code></li>
+<li>The 97 names with a score of 0.75 or lower were manually reviewed. The review included name changes validated using Wikipedia.org results, such as "Bruce Jenner" to "Caitlyn Jenner"; spousal last-name changes; discretionary changes to improve search results, such as combining nicknames with the full name when appropriate (for example, changing "Aleksandar Petrović" to "Aleksandar 'Aco' Petrović"); and minor spelling changes such as "Mohammad Ali" to "Muhammad Ali"</li>
+<li>The <code>Description</code> text was automatically added when the Knowledge Graph score was greater than 250</li>
+</ul>
+<h1>TODO</h1>
+<ul>
+<li>create name list, and populate with Knowledge graph information like LFW</li>
+<li>make list of interesting number stats, by the numbers</li>
+<li>make list of interesting important facts</li>
+<li>write intro abstract</li>
+<li>write analysis of usage</li>
+<li>find examples, citations, and screenshots of usage</li>
+<li>find list of companies using it for table</li>
+<li>create montages of the dataset, like LFW</li>
+<li>create right to removal information</li>
+</ul>
+</section>
+
+  </div>
+  <footer>
+    <div>
+      <a href="/">MegaPixels.cc</a>
+      <a href="/about/disclaimer/">Disclaimer</a>
+      <a href="/about/terms/">Terms of Use</a>
+      <a href="/about/privacy/">Privacy</a>
+      <a href="/about/">About</a>
+      <a href="/about/team/">Team</a>
+    </div>
+    <div>
+      MegaPixels &copy;2017-19 Adam R. Harvey /&nbsp;
+      <a href="https://ahprojects.com">ahprojects.com</a>
+    </div>
+  </footer>
+</body>
+
+<script src="/assets/js/app/site.js"></script>
+</html>
+\ No newline at end of file
diff --git a/site/public/datasets/vgg_faces2/index.html b/site/public/datasets/vgg_faces2/index.html
new file mode 100644
index 00000000..3f778f71
--- /dev/null
+++ b/site/public/datasets/vgg_faces2/index.html
@@ -0,0 +1,63 @@
+<!doctype html>
+<html>
+<head>
+  <title>MegaPixels</title>
+  <meta charset="utf-8" />
+  <meta name="author" content="Adam Harvey" />
+  <meta name="description" content="" />
+  <meta name="referrer" content="no-referrer" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+  <link rel='stylesheet' href='/assets/css/fonts.css' />
+  <link rel='stylesheet' href='/assets/css/css.css' />
+</head>
+<body>
+  <header>
+    <a class='slogan' href="/">
+      <div class='logo'></div>
+      <div class='site_name'>MegaPixels</div>
+      <span class='sub'>The Darkside of Datasets</span>
+    </a>
+    <div class='links'>
+      <a href="/search/">Face Search</a>
+      <a href="/datasets/">Datasets</a>
+      <a href="/research/">Research</a>
+      <a href="/about/">About</a>
+    </div>
+  </header>
+  <div class="content">
+    
+  <section><h1>Labeled Faces in The Wild</h1>
+</section><section><div class='meta'><div><div class='gray'>Created</div><div>2007</div></div><div><div class='gray'>Images</div><div>13,233</div></div><div><div class='gray'>People</div><div>5,749</div></div><div><div class='gray'>Created From</div><div>Yahoo News images</div></div><div><div class='gray'>Search available</div><div><a href="#">Searchable</a></div></div></div></section><section><p>Labeled Faces in The Wild is amongst the most widely used facial recognition training datasets in the world and is the first dataset of its kind to be created entirely from Internet photos. It includes 13,233 images of 5,749 people downloaded from the Internet, otherwise referred to by researchers as “The Wild”.</p>
+<h2>INTRO</h2>
+<p>It began in 2002. Researchers at the University of Massachusetts Amherst were developing algorithms for facial recognition and they needed more data. Between 2002 and 2004 they scraped Yahoo News for images of public figures. Two years later they cleaned up the dataset and repackaged it as Labeled Faces in the Wild (LFW).</p>
+<p>Since then the LFW dataset has become one of the most widely used datasets for evaluating face recognition algorithms. The associated research paper “Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments” has been cited 996 times, reaching 45 different countries throughout the world.</p>
+<p>The faces come from news stories and are mostly celebrities from the entertainment industry, politicians, and villains. It’s a sampling of current affairs and breaking news that has come to pass. The images, detached from their original context, now serve a new purpose: to train, evaluate, and improve facial recognition.</p>
+<p>Because LFW is one of the most widely used facial recognition datasets, it can be said that each individual in it has, in a small way, contributed to the current state of the art in facial recognition surveillance. John Cusack, Julianne Moore, Barry Bonds, Osama bin Laden, and even Moby are amongst these biometric pillars, exemplar faces that provided the visual dimensions of a new computer vision future.</p>
+<h2>Commercial Use</h2>
+<p>The dataset is used by numerous companies for benchmarking algorithms. According to the benchmarking results page <sup class="footnote-ref" id="fnref-lfw_results"><a href="#fn-lfw_results">1</a></sup> provided by the authors, there are over two dozen commercial uses of the LFW face dataset.</p>
+<div class="footnotes">
+<hr>
+<ol><li id="fn-lfw_results"><p>"LFW Results". Accessed Dec 3, 2018. <a href="http://vis-www.cs.umass.edu/lfw/results.html">http://vis-www.cs.umass.edu/lfw/results.html</a><a href="#fnref-lfw_results" class="footnote">&#8617;</a></p></li>
+</ol>
+</div>
+</section>
+
+  </div>
+  <footer>
+    <div>
+      <a href="/">MegaPixels.cc</a>
+      <a href="/about/disclaimer/">Disclaimer</a>
+      <a href="/about/terms/">Terms of Use</a>
+      <a href="/about/privacy/">Privacy</a>
+      <a href="/about/">About</a>
+      <a href="/about/team/">Team</a>
+    </div>
+    <div>
+      MegaPixels &copy;2017-19 Adam R. Harvey /&nbsp;
+      <a href="https://ahprojects.com">ahprojects.com</a>
+    </div>
+  </footer>
+</body>
+
+<script src="/assets/js/app/site.js"></script>
+</html>
+\ No newline at end of file