Diffstat (limited to 'site/public/datasets')
| -rw-r--r-- | site/public/datasets/index.html | 54 |
| -rw-r--r-- | site/public/datasets/lfw/index.html | 61 |
| -rw-r--r-- | site/public/datasets/vgg_face2/index.html | 8 |
3 files changed, 93 insertions, 30 deletions
diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html new file mode 100644 index 00000000..bcc7c1ab --- /dev/null +++ b/site/public/datasets/index.html @@ -0,0 +1,54 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="Facial Recognition Datasets" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/tabulator.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + <span class='sub'>The Darkside of Datasets</span> + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/research/">Research</a> + <a href="/about/">About</a> + </div> + </header> + <div class="content"> + + <section><h1>Facial Recognition Datasets</h1> +<p>Regular Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. 
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p> +<h3>Summary</h3> +</section><section><div class='meta'><div><div class='gray'>Found</div><div>275 datasets</div></div><div><div class='gray'>Created between</div><div>1993-2018</div></div><div><div class='gray'>Smallest dataset</div><div>20 images</div></div><div><div class='gray'>Largest dataset</div><div>10,000,000 images</div></div></div></section><section><div class='meta'><div><div class='gray'>Highest resolution faces</div><div>450x500 (Unconstrained College Students)</div></div><div><div class='gray'>Lowest resolution faces</div><div>16x20 pixels (QMUL SurvFace)</div></div></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file https://megapixels.nyc3.digitaloceanspaces.com/v1/citations/datasets.csv"}'></div></section> + + </div> + <footer> + <div> + <a href="/">MegaPixels.cc</a> + <a href="/about/disclaimer/">Disclaimer</a> + <a href="/about/terms/">Terms of Use</a> + <a href="/about/privacy/">Privacy</a> + <a href="/about/">About</a> + <a href="/about/team/">Team</a> + </div> + <div> + MegaPixels ©2017-19 Adam R. Harvey / + <a href="https://ahprojects.com">ahprojects.com</a> + </div> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html index 3c83acd3..9adf29b1 100644 --- a/site/public/datasets/lfw/index.html +++ b/site/public/datasets/lfw/index.html @@ -8,7 +8,10 @@ <meta name="referrer" content="no-referrer" /> <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/tabulator.css' /> <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> </head> <body> <header> @@ -18,7 +21,6 @@ <span class='sub'>The Darkside of Datasets</span> </a> <div class='links'> - <a href="/search/">Face Search</a> <a href="/datasets/">Datasets</a> <a href="/research/">Research</a> <a href="/about/">About</a> @@ -28,9 +30,9 @@ <section><h1>Labeled Faces in the Wild</h1> </section><section><div class='meta'><div><div class='gray'>Created</div><div>2007</div></div><div><div class='gray'>Images</div><div>13,233</div></div><div><div class='gray'>People</div><div>5,749</div></div><div><div class='gray'>Created From</div><div>Yahoo News images</div></div><div><div class='gray'>Search available</div><div>Searchable</div></div></div></section><section><p>Labeled Faces in The Wild (LFW) is amongst the most widely used facial recognition training datasets in the world and is the first of its kind to be created entirely from images posted online. The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002-2004. 
Use the tools below to check if you were included in this dataset or scroll down to read the analysis.</p> -</section><section><div class='applet' data-payload='{"command": "face_search"}'></div></section><section><div class='applet' data-payload='{"command": "name_search"}'></div></section><section><div class='applet' data-payload='{"command": "load file", "opt": "lfw_names_gender_kg_min.csv", "fields": "Name, Images, Gender, Description"}'></div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_feature.jpg' alt='Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.'><div class='caption'>Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.</div></div></section><section><h2>Intro</h2> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "face_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "name_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_names_gender_kg_min.csv", "fields": ["Name, Images, Gender, Description"]}'></div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_feature.jpg' alt='Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.'><div class='caption'>Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. 
The most widely used face dataset for benchmarking commercial face recognition algorithms.</div></div></section><section><h2>Intro</h2> <p>Three paragraphs describing the LFW dataset in a format that can be easily replicated for the other datasets. Nothing too custom. An analysis of the initial research papers with context relative to all the other dataset papers.</p> -</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_everyone_1920.jpg' alt=' all 5,749 people in the LFW Dataset sorted from most to least images collected.'><div class='caption'> all 5,749 people in the LFW Dataset sorted from most to least images collected.</div></div></section><section><h2>LFW by the Numbers</h2> +</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg' alt=' all 5,749 people in the LFW Dataset sorted from most to least images collected.'><div class='caption'> all 5,749 people in the LFW Dataset sorted from most to least images collected.</div></div></section><section><h2>LFW by the Numbers</h2> <ul> <li>Was first published in 2007</li> <li>Developed out of a prior dataset from Berkely called "Faces in the Wild" or "Names and Faces" [^lfw_original_paper]</li> @@ -65,10 +67,7 @@ <p>According to BiometricUpdate.com [^lfw_pingan], LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."</p> <p>According to researchers at the Baidu Research – Institute of Deep Learning "LFW has been the most popular evaluation benchmark for face recognition, and played a very important role in facilitating the face recognition society to improve algorithm. 
[^lfw_baidu]."</p> <p>In addition to commercial use as an evaluation tool, alll of the faces in LFW dataset are prepackaged into a popular machine learning code framework called scikit-learn.</p> -<pre><code>load file: lfw_commercial_use.csv -name_display,company_url,example_url,country,description -</code></pre> -<table> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_commercial_use.csv", "fields": ["name_display, company_url, example_url, country, description"]}'></div></section><section><table> <thead><tr> <th style="text-align:left">Company</th> <th style="text-align:left">Country</th> @@ -101,7 +100,7 @@ name_display,company_url,example_url,country,description <h2>Citations</h2> <p>Overall, LFW has at least 456 citations from 123 countries. Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. Nemo enim ipsam voluptatem, quia voluptas sit, aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos.</p> <p>Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. 
Nemo enim ipsam voluptatem, quia voluptas sit, aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos.</p> -</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/temp_graph.jpg' alt='Distribution of citations per year per country for the top 5 countries with citations for the LFW Dataset'><div class='caption'>Distribution of citations per year per country for the top 5 countries with citations for the LFW Dataset</div></div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/temp_map.jpg' alt='Geographic distributions of citations for the LFW Dataset'><div class='caption'>Geographic distributions of citations for the LFW Dataset</div></div></section><section><h2>Conclusion</h2> +</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/temp_graph.jpg' alt='Distribution of citations per year per country for the top 5 countries with citations for the LFW Dataset'><div class='caption'>Distribution of citations per year per country for the top 5 countries with citations for the LFW Dataset</div></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "map"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "citations"}'></div></section><section><h2>Conclusion</h2> <p>The LFW face recognition training and evaluation dataset is a historically important face dataset as it was the first popular dataset to be created entirely from Internet images, paving the way for a global trend towards downloading anyone’s face from the Internet and adding it to a dataset. 
As will be evident with other datasets, LFW’s approach has now become the norm.</p> <p>For all the 5,000 people in this datasets, their face is forever a part of facial recognition history. It would be impossible to remove anyone from the dataset because it is so ubiquitous. For their rest of the lives and forever after, these 5,000 people will continue to be used for training facial recognition surveillance.</p> <h2>Right to Removal</h2> @@ -219,28 +218,36 @@ name_display,company_url,example_url,country,description </tbody> </table> <h2>Code</h2> -</section><section><div class='applet' data-payload='{"command": "python"}'></div></section><section><p>import numpy as np +</section><section><pre><code class="lang-python">#!/usr/bin/python + +import numpy as np from sklearn.datasets import fetch_lfw_people import imageio -import imutils</p> -<h1>download LFW dataset (first run takes a while)</h1> -<p>lfw_people = fetch_lfw_people(min_faces_per_person=1, resize=1, color=True, funneled=False)</p> -<h1>introspect dataset</h1> -<p>n_samples, h, w, c = lfw_people.images.shape -print('{:,} images at {}x{}'.format(n_samples, w, h)) +import imutils + +# download LFW dataset (first run takes a while) +lfw_people = fetch_lfw_people(min_faces_per_person=1, resize=1, color=True, funneled=False) + +# introspect dataset +n_samples, h, w, c = lfw_people.images.shape +print('{:,} images at {}x{}'.format(n_samples, w, h)) cols, rows = (176, 76) -n_ims = cols * rows</p> -<h1>build montages</h1> -<p>im_scale = 0.5 +n_ims = cols * rows + +# build montages +im_scale = 0.5 ims = lfw_people.images[:n_ims -montages = imutils.build_montages(ims, (int(w<em>im_scale, int(h</em>im_scale)), (cols, rows)) -montage = montages[0]</p> -<h1>save full montage image</h1> -<p>imageio.imwrite('lfw_montage_full.png', montage)</p> -<h1>make a smaller version</h1> -<p>montage_960 = imutils.resize(montage, width=960) -imageio.imwrite('lfw_montage_960.jpg', montage_960)</p> -</section><section><div 
class='applet' data-payload='{"command": ""}'></div></section><section><h2>Disclaimer</h2> +montages = imutils.build_montages(ims, (int(w*im_scale, int(h*im_scale)), (cols, rows)) +montage = montages[0] + +# save full montage image +imageio.imwrite('lfw_montage_full.png', montage) + +# make a smaller version +montage_960 = imutils.resize(montage, width=960) +imageio.imwrite('lfw_montage_960.jpg', montage_960) +</code></pre> +</section><section><h2>Disclaimer</h2> <p>MegaPixels is an educational art project designed to encourage discourse about facial recognition datasets. Any ethical or legal issues should be directed to the researcher's parent organizations. Except where necessary for contact or clarity, the names of researchers have been subsituted by their parent organization. In no way does this project aim to villify researchers who produced the datasets.</p> <p>Read more about <a href="about/code-of-conduct">MegaPixels Code of Conduct</a></p> <div class="footnotes"> @@ -266,5 +273,5 @@ imageio.imwrite('lfw_montage_960.jpg', montage_960)</p> </footer> </body> -<script src="/assets/js/app/site.js"></script> +<script src="/assets/js/dist/index.js"></script> </html>
\ No newline at end of file diff --git a/site/public/datasets/vgg_face2/index.html b/site/public/datasets/vgg_face2/index.html index 817fc9a0..63715a4f 100644 --- a/site/public/datasets/vgg_face2/index.html +++ b/site/public/datasets/vgg_face2/index.html @@ -8,7 +8,10 @@ <meta name="referrer" content="no-referrer" /> <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/tabulator.css' /> <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> </head> <body> <header> @@ -18,7 +21,6 @@ <span class='sub'>The Darkside of Datasets</span> </a> <div class='links'> - <a href="/search/">Face Search</a> <a href="/datasets/">Datasets</a> <a href="/research/">Research</a> <a href="/about/">About</a> @@ -28,7 +30,7 @@ <section><h1>VGG Faces2</h1> </section><section><div class='meta'><div><div class='gray'>Created</div><div>2018</div></div><div><div class='gray'>Images</div><div>3.3M</div></div><div><div class='gray'>People</div><div>9,000</div></div><div><div class='gray'>Created From</div><div>Scraping search engines</div></div><div><div class='gray'>Search available</div><div>[Searchable](#)</div></div></div></section><section><p>VGG Face2 is the updated version of the VGG Face dataset and now includes over 3.3M face images from over 9K people. The identities were selected by taking the top 500K identities in Google's Knowledge Graph of celebrities and then selecting only the names that yielded enough training images. 
The dataset was created in the UK but funded by Office of Director of National Intelligence in the United States.</p> -</section><section><div class='applet' data-payload='{"command": "face_search"}'></div></section><section><div class='applet' data-payload='{"command": "name_search"}'></div></section><section><div class='applet' data-payload='{"command": "load file", "opt": "lfw_names_gender_kg_min.csv", "fields": "Name, Images, Gender, Description"}'></div></section><section><h2>VGG Face2 by the Numbers</h2> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "face_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "name_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_names_gender_kg_min.csv", "fields": ["Name, Images, Gender, Description"]}'></div></section><section><h2>VGG Face2 by the Numbers</h2> <ul> <li>1,331 actresses, 139 presidents</li> <li>3 husbands and 16 wives</li> @@ -75,5 +77,5 @@ </footer> </body> -<script src="/assets/js/app/site.js"></script> +<script src="/assets/js/dist/index.js"></script> </html>
\ No newline at end of file
