Diffstat (limited to 'site/public')
| -rw-r--r-- | site/public/about/credits/index.html | 10 | ||||
| -rw-r--r-- | site/public/about/disclaimer/index.html | 10 | ||||
| -rw-r--r-- | site/public/about/index.html | 23 | ||||
| -rw-r--r-- | site/public/about/press/index.html | 13 | ||||
| -rw-r--r-- | site/public/about/privacy/index.html | 11 | ||||
| -rw-r--r-- | site/public/about/terms/index.html | 12 | ||||
| -rw-r--r-- | site/public/datasets/index.html | 71 | ||||
| -rw-r--r-- | site/public/datasets/lfw/index.html | 98 | ||||
| -rw-r--r-- | site/public/datasets/vgg_face2/index.html | 33 | ||||
| -rw-r--r-- | site/public/datasets_v0/index.html | 53 | ||||
| -rw-r--r-- | site/public/datasets_v0/lfw/index.html | 131 | ||||
| -rw-r--r-- | site/public/datasets_v0/lfw/right-to-removal/index.html | 61 | ||||
| -rw-r--r-- | site/public/datasets_v0/lfw/tables/index.html | 52 | ||||
| -rw-r--r-- | site/public/datasets_v0/vgg_face2/index.html | 80 | ||||
| -rw-r--r-- | site/public/index.html | 112 | ||||
| -rw-r--r-- | site/public/info/index.html | 2 | ||||
| -rw-r--r-- | site/public/research/00_introduction/index.html | 12 | ||||
| -rw-r--r-- | site/public/research/01_from_1_to_100_pixels/index.html | 3 |
18 files changed, 566 insertions, 221 deletions
diff --git a/site/public/about/credits/index.html b/site/public/about/credits/index.html
index fecc6c7b..6e4f06c1 100644
--- a/site/public/about/credits/index.html
+++ b/site/public/about/credits/index.html
@@ -28,7 +28,15 @@
 <div class="content">
 <section><h1>Credits</h1>
-<ul>
+</section><section><div class='right-sidebar'><ul>
+<li><a href="/about/">About</a></li>
+<li><a href="/about/press/">Press</a></li>
+<li><a href="/about/credits/">Credits</a></li>
+<li><a href="/about/disclaimer/">Disclaimer</a></li>
+<li><a href="/about/terms/">Terms and Conditions</a></li>
+<li><a href="/about/privacy/">Privacy Policy</a></li>
+</ul>
+</div><ul>
 <li>MegaPixels by Adam Harvey</li>
 <li>Made with support from Mozilla</li>
 <li>Site developed by Jules Laplace</li>
diff --git a/site/public/about/disclaimer/index.html b/site/public/about/disclaimer/index.html
index a108baa0..b93194fa 100644
--- a/site/public/about/disclaimer/index.html
+++ b/site/public/about/disclaimer/index.html
@@ -28,7 +28,15 @@
 <div class="content">
 <section><h1>Disclaimer</h1>
-<p>Last updated: December 04, 2018</p>
+</section><section><div class='right-sidebar'><ul>
+<li><a href="/about/">About</a></li>
+<li><a href="/about/press/">Press</a></li>
+<li><a href="/about/credits/">Credits</a></li>
+<li><a href="/about/disclaimer/">Disclaimer</a></li>
+<li><a href="/about/terms/">Terms and Conditions</a></li>
+<li><a href="/about/privacy/">Privacy Policy</a></li>
+</ul>
+</div><p>Last updated: December 04, 2018</p>
 <p>The information contained on MegaPixels.cc website (the "Service") is for academic and artistic purposes only.</p>
 <p>MegaPixels.cc assumes no responsibility for errors or omissions in the contents on the Service.</p>
 <p>In no event shall MegaPixels.cc be liable for any special, direct, indirect, consequential, or incidental damages or any damages whatsoever, whether in an action of contract, negligence or other tort, arising out of or in connection with the use of the Service or the contents 
of the Service. MegaPixels.cc reserves the right to make additions, deletions, or modification to the contents on the Service at any time without prior notice.</p>
diff --git a/site/public/about/index.html b/site/public/about/index.html
index fecc6c7b..b7401ee8 100644
--- a/site/public/about/index.html
+++ b/site/public/about/index.html
@@ -4,7 +4,7 @@
 <title>MegaPixels</title>
 <meta charset="utf-8" />
 <meta name="author" content="Adam Harvey" />
- <meta name="description" content="MegaPixels Project Team Credits" />
+ <meta name="description" content="About MegaPixels" />
 <meta name="referrer" content="no-referrer" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
 <link rel='stylesheet' href='/assets/css/fonts.css' />
@@ -27,15 +27,20 @@
 </header>
 <div class="content">
- <section><h1>Credits</h1>
-<ul>
-<li>MegaPixels by Adam Harvey</li>
-<li>Made with support from Mozilla</li>
-<li>Site developed by Jules Laplace</li>
-<li>Design and graphics: Adam Harvey</li>
-<li>Research assistants: Berit Gilma</li>
+ <section><h1>About MegaPixels</h1>
+</section><section><div class='right-sidebar'><ul>
+<li><a href="/about/press/">Press</a></li>
+<li><a href="/about/credits/">Credits</a></li>
+<li><a href="/about/disclaimer/">Disclaimer</a></li>
+<li><a href="/about/terms/">Terms and Conditions</a></li>
+<li><a href="/about/privacy/">Privacy Policy</a></li>
 </ul>
-</section>
+<div class='meta'><div><div class='gray'>Years</div><div>2002-2019</div></div><div><div class='gray'>Datasets Analyzed</div><div>325</div></div><div><div class='gray'>Author</div><div>Adam Harvey</div></div><div><div class='gray'>Development</div><div>Jules LaPlace</div></div><div><div class='gray'>Research Assistance</div><div>Berit Gilma</div></div></div></div><p>MegaPixels aims to answer to these questions and reveal the stories behind the millions of images used to train, evaluate, and power the facial recognition surveillance algorithms used today. 
MegaPixels is authored by Adam Harvey, developed in collaboration with Jules LaPlace, and produced in partnership with Mozilla.</p>
+<p>MegaPixels aims to answer to these questions and reveal the stories behind the millions of images used to train, evaluate, and power the facial recognition surveillance algorithms used today. MegaPixels is authored by Adam Harvey, developed in collaboration with Jules LaPlace, and produced in partnership with Mozilla.</p>
+</section><section class='images'><div class='sideimage'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/about/assets/adam-harvey.jpg' alt='Adam Harvey'><div><p><strong>Adam Harvey</strong> is an American artist and researcher based in Berlin. His previous projects (CV Dazzle, Stealth Wear, and SkyLift) explore the potential for countersurveillance as artwork. He is the founder of VFRAME (visual forensics software for human rights groups), the recipient of 2 PrototypeFund awards, and is currently a researcher in residence at Karlsruhe HfG studying artifical intelligence and datasets.</p>
+</div></div></section><section class='images'><div class='sideimage'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/about/assets/jules-laplace.jpg' alt='Jules LaPlace'><div><p><strong>Jules LaPlace</strong> is an American artist and technologist also based in Berlin. He was previously the CTO of a NYC digital agency and currently works at VFRAME, developing computer vision for human rights groups, and building creative software for artists.</p>
+</div></div></section><section class='images'><div class='sideimage'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/about/assets/mozilla.png' alt='Mozilla'><div><p><strong>Mozilla</strong> is a free software community founded in 1998 by members of Netscape. The Mozilla community uses, develops, spreads and supports Mozilla products, thereby promoting exclusively free software and open standards, with only minor exceptions. 
The community is supported institutionally by the not-for-profit Mozilla Foundation and its tax-paying subsidiary, the Mozilla Corporation.</p>
+</div></div></section>
 </div>
 <footer>
diff --git a/site/public/about/press/index.html b/site/public/about/press/index.html
index b9dd97c2..d36b6bc6 100644
--- a/site/public/about/press/index.html
+++ b/site/public/about/press/index.html
@@ -28,10 +28,19 @@
 <div class="content">
 <section><h1>Press</h1>
-</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/about/assets/test.jpg' alt='alt text'><div class='caption'>alt text</div></div></section><section><ul>
-<li>Aug 22, 2018: "Transgender YouTubers had their videos grabbed to train facial recognition software" by James Vincent <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset</a></li>
+</section><section><div class='right-sidebar'><ul>
+<li><a href="/about/">About</a></li>
+<li><a href="/about/press/">Press</a></li>
+<li><a href="/about/credits/">Credits</a></li>
+<li><a href="/about/disclaimer/">Disclaimer</a></li>
+<li><a href="/about/terms/">Terms and Conditions</a></li>
+<li><a href="/about/privacy/">Privacy Policy</a></li>
+</ul>
+</div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/about/assets/test.jpg' alt='alt text'><div class='caption'>alt text</div></div></section><section><ul>
 <li>Aug 22, 2018: "Transgender YouTubers had their videos grabbed to train facial recognition software" by James Vincent <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset</a></li>
 <li>Aug 22, 2018: "Transgender YouTubers had their videos grabbed to train 
facial recognition software" by James Vincent <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset</a></li>
+<li>Aug 22, 2018: "Transgender YouTubers had their videos grabbed to train facial recognition software" by James Vincent <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset</a>
+lfw</li>
 </ul>
 </section>
diff --git a/site/public/about/privacy/index.html b/site/public/about/privacy/index.html
index 92a1b9a8..1b3b9d2f 100644
--- a/site/public/about/privacy/index.html
+++ b/site/public/about/privacy/index.html
@@ -28,10 +28,17 @@
 <div class="content">
 <section><h1>Privacy Policy</h1>
-<p>A summary of our privacy policy is as follows:</p>
+</section><section><div class='right-sidebar'><ul>
+<li><a href="/about/">About</a></li>
+<li><a href="/about/press/">Press</a></li>
+<li><a href="/about/credits/">Credits</a></li>
+<li><a href="/about/disclaimer/">Disclaimer</a></li>
+<li><a href="/about/terms/">Terms and Conditions</a></li>
+<li><a href="/about/privacy/">Privacy Policy</a></li>
+</ul>
+</div><p>A summary of our privacy policy is as follows:</p>
 <p>The MegaPixels site does not use any analytics programs or collect any data besides the necessary IP address of your connection, which are deleted every 30 days and used only for security and to prevent misuse.</p>
 <p>The image processing sections of the site do not collect any data whatsoever. All processing takes place in temporary memory (RAM) and then is displayed back to the user over a SSL secured HTTPS connection. 
It is the sole responsibility of the user whether they discard, by closing the page, or share their analyzed information and any potential consequences that may arise from doing so.</p>
-<hr>
 <p>A more complete legal version is below:</p>
 <p><strong>This is a boilerplate Privacy policy from <a href="https://termsfeed.com/">https://termsfeed.com/</a></strong></p>
 <p><strong>Needs to be reviewed</strong></p>
diff --git a/site/public/about/terms/index.html b/site/public/about/terms/index.html
index fd17b4d9..8bd6e738 100644
--- a/site/public/about/terms/index.html
+++ b/site/public/about/terms/index.html
@@ -27,8 +27,16 @@
 </header>
 <div class="content">
- <section><p>Terms and Conditions ("Terms")</p>
-<p>Last updated: December 04, 2018</p>
+ <section><h1>Terms and Conditions ("Terms")</h1>
+</section><section><div class='right-sidebar'><ul>
+<li><a href="/about/">About</a></li>
+<li><a href="/about/press/">Press</a></li>
+<li><a href="/about/credits/">Credits</a></li>
+<li><a href="/about/disclaimer/">Disclaimer</a></li>
+<li><a href="/about/terms/">Terms and Conditions</a></li>
+<li><a href="/about/privacy/">Privacy Policy</a></li>
+</ul>
+</div><p>Last updated: December 04, 2018</p>
 <p>Please read these Terms and Conditions ("Terms", "Terms and Conditions") carefully before using the MegaPixels website (the "Service") operated by megapixels.cc ("us", "we", or "our").</p>
 <p>Your access to and use of the Service is conditioned on your acceptance of and compliance with these Terms.</p>
 <p>By accessing or using the Service you agree to be bound by these Terms. 
If you disagree with any part of the terms then you may not access the Service.</p>
diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html
index 77c5ab2b..7398da17 100644
--- a/site/public/datasets/index.html
+++ b/site/public/datasets/index.html
@@ -29,27 +29,78 @@
 <section><h1>Facial Recognition Datasets</h1>
-<p>Regular Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>
-<h3>Summary</h3>
-</section><section><div class='meta'><div><div class='gray'>Found</div><div>275 datasets</div></div><div><div class='gray'>Created between</div><div>1993-2018</div></div><div><div class='gray'>Smallest dataset</div><div>20 images</div></div><div><div class='gray'>Largest dataset</div><div>10,000,000 images</div></div></div></section><section><div class='meta'><div><div class='gray'>Highest resolution faces</div><div>450x500 (Unconstrained College Students)</div></div><div><div class='gray'>Lowest resolution faces</div><div>16x20 pixels (QMUL SurvFace)</div></div></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file https://megapixels.nyc3.digitaloceanspaces.com/v1/citations/datasets.csv"}'></div></section>
+</section><section><div class='meta'><div><div class='gray'>Found</div><div>275 datasets</div></div><div><div class='gray'>Created between</div><div>1993-2018</div></div><div><div class='gray'>Smallest dataset</div><div>20 images</div></div><div><div class='gray'>Largest dataset</div><div>10,000,000 images</div></div></div><section><section><div class='meta'><div><div class='gray'>Highest resolution faces</div><div>450x500 (Unconstrained College Students)</div></div><div><div class='gray'>Lowest resolution faces</div><div>16x20 pixels (QMUL SurvFace)</div></div></div><section>
- <section>
- <h2>Dataset Portraits</h2>
+ <section class='wide dataset-intro'>
 <p>
- We have prepared detailed studies of some of the more noteworthy datasets.
+ We have prepared detailed case studies of some of the more noteworthy datasets, including tools to help you learn what is contained in these datasets, and even whether your own face has been used to train these algorithms.
 </p>
 <div class="dataset-list">
- <a href="/datasets/lfw/">
+ <a href="/datasets/afad/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/afad/assets/index.jpg)">
 <div class="dataset">
- Labeled Faces in The Wild
+ <span>Asian Face Age Dataset</span>
 </div>
 </a>
- <a href="/datasets/vgg_face2/">
+ <a href="/datasets/aflw/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/aflw/assets/index.jpg)">
 <div class="dataset">
- VGG Face2
+ <span>Annotated Facial Landmarks in The Wild</span>
+ </div>
+ </a>
+
+ <a href="/datasets/caltech_10k/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/caltech_10k/assets/index.jpg)">
+ <div class="dataset">
+ <span>Caltech 10K Faces Dataset</span>
+ </div>
+ </a>
+
+ <a href="/datasets/cofw/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/cofw/assets/index.jpg)">
+ <div class="dataset">
+ <span>Caltech Occluded Faces in The Wild</span>
+ </div>
+ </a>
+
+ <a href="/datasets/facebook/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/facebook/assets/index.jpg)">
+ <div class="dataset">
+ <span>Facebook</span>
+ </div>
+ </a>
+
+ <a href="/datasets/feret/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/feret/assets/index.jpg)">
+ <div class="dataset">
+ <span>FERET: FacE REcognition </span>
+ </div>
+ </a>
+
+ <a href="/datasets/lfpw/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfpw/assets/index.jpg)">
+ <div 
class="dataset">
+ <span>Labeled Face Parts in The Wild</span>
+ </div>
+ </a>
+
+ <a href="/datasets/lfw/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/index.jpg)">
+ <div class="dataset">
+ <span>Labeled Faces in The Wild</span>
+ </div>
+ </a>
+
+ <a href="/datasets/uccs/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/uccs/assets/index.jpg)">
+ <div class="dataset">
+ <span>Unconstrained College Students</span>
+ </div>
+ </a>
+
+ <a href="/datasets/vgg_face2/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/vgg_face2/assets/index.jpg)">
 <div class="dataset">
+ <span>VGG Face 2 Dataset</span>
+ </div>
+ </a>
+
+ <a href="/datasets/youtube_celebrities/" style="background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/youtube_celebrities/assets/index.jpg)">
+ <div class="dataset">
+ <span>YouTube Celebrities</span>
 </div>
 </a>
diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html
index a6226720..5b5e58f3 100644
--- a/site/public/datasets/lfw/index.html
+++ b/site/public/datasets/lfw/index.html
@@ -4,7 +4,7 @@
 <title>MegaPixels</title>
 <meta charset="utf-8" />
 <meta name="author" content="Adam Harvey" />
- <meta name="description" content="LFW: Labeled Faces in The Wild" />
+ <meta name="description" content="Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition." 
/>
 <meta name="referrer" content="no-referrer" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
 <link rel='stylesheet' href='/assets/css/fonts.css' />
@@ -27,54 +27,42 @@
 </header>
 <div class="content">
- <section><h1>Labeled Faces in the Wild</h1>
-</section><section><div class='meta'><div><div class='gray'>Created</div><div>2007</div></div><div><div class='gray'>Images</div><div>13,233</div></div><div><div class='gray'>People</div><div>5,749</div></div><div><div class='gray'>Created From</div><div>Yahoo News images</div></div><div><div class='gray'>Search available</div><div>Searchable</div></div></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "face_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "name_search"}'></div></section><section class='fullwidth'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_feature.jpg' alt='Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.'><div class='caption'>Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.</div></div></section><section><h3>Intro</h3>
-<p>Labeled Faces in The Wild (LFW) is among the most widely used facial recognition training datasets in the world and is the first of its kind to be created entirely from images posted online. The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002-2004. Use the tools below to check if you were included in this dataset or scroll down to read the analysis.</p>
-<p>Three paragraphs describing the LFW dataset in a format that can be easily replicated for the other datasets. Nothing too custom. 
An analysis of the initial research papers with context relative to all the other dataset papers.</p>
-</section><section class='fullwidth'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg' alt=' From George W. Bush to Jamie Lee Curtis: all 5,749 people in the LFW Dataset sorted from most to least images collected.'><div class='caption'> From George W. Bush to Jamie Lee Curtis: all 5,749 people in the LFW Dataset sorted from most to least images collected.</div></div></section><section><h3>LFW by the Numbers</h3>
+ <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span><span style='color: #ff0000'>Labeled Faces in The Wild</span> (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition.</span></div><div class='hero_subdesc'><span>It includes 13,456 images of 4,432 people’s images copied from the Internet during 2002-2004.
+</span></div></div></section><section><div class='image'><div class='caption'>A few of the 5,749 people in the Labeled Faces in the Wild Dataset. 
The most widely used face dataset for benchmarking commercial face recognition algorithms.</div></div></section><section><div class='right-sidebar'><h3>Statistics</h3>
+<div class='meta'><div><div class='gray'>Years</div><div>2002-2004</div></div><div><div class='gray'>Images</div><div>13,233</div></div><div><div class='gray'>Identities</div><div>5,749</div></div><div><div class='gray'>Origin</div><div>Yahoo News Images</div></div><div><div class='gray'>Funding</div><div>(Possibly, partially CIA)</div></div></div><h3>INSIGHTS</h3>
 <ul>
-<li>Was first published in 2007</li>
-<li>Developed out of a prior dataset from Berkely called "Faces in the Wild" or "Names and Faces" [^lfw_original_paper]</li>
-<li>Includes 13,233 images and 5,749 different people [^lfw_website]</li>
-<li>There are about 3 men for every 1 woman (4,277 men and 1,472 women)[^lfw_website]</li>
-<li>The person with the most images is George W. Bush with 530</li>
-<li>Most people (70%) in the dataset have only 1 image</li>
-<li>Thre are 1,680 people in the dataset with 2 or more images [^lfw_website]</li>
-<li>Two out of 4 of the original authors received funding from the Office of Director of National Intelligence and IARPA for their 2016 LFW survey follow up report </li>
-<li>The LFW dataset includes over 500 actors, 30 models, 10 presidents, 24 football players, 124 basketball players, 11 kings, and 2 queens</li>
-<li>In all the LFW publications provided by the authors the words "ethics", "consent", and "privacy" appear 0 times [^lfw_original_paper], [^lfw_survey], [^lfw_tech_report] , [^lfw_website]</li>
+<li>There are about 3 men for every 1 woman (4,277 men and 1,472 women) in the LFW dataset<a class="footnote_shim" name="[^lfw_www]_1"> </a><a href="#[^lfw_www]" class="footnote" title="Footnote 1">1</a></li>
+<li>The person with the most images is <a href="http://vis-www.cs.umass.edu/lfw/person/George_W_Bush_comp.html">George W. Bush</a> with 530</li>
+<li>There are about 3 George W. 
Bush's for every 1 <a href="http://vis-www.cs.umass.edu/lfw/person/Tony_Blair.html">Tony Blair</a></li>
+<li>The LFW dataset includes over 500 actors, 30 models, 10 presidents, 124 basketball players, 24 football players, 11 kings, 7 queens, and 1 <a href="http://vis-www.cs.umass.edu/lfw/person/Moby.html">Moby</a></li>
+<li>In all 3 of the LFW publications [^lfw_original_paper], [^lfw_survey], [^lfw_tech_report] the words "ethics", "consent", and "privacy" appear 0 times</li>
 <li>The word "future" appears 71 times</li>
 </ul>
-<h3>Facts</h3>
+</div><h2>Labeled Faces in the Wild</h2>
+<p><em>Labeled Faces in The Wild</em> (LFW) is "a database of face photographs designed for studying the problem of unconstrained face recognition<a class="footnote_shim" name="[^lfw_www]_2"> </a><a href="#[^lfw_www]" class="footnote" title="Footnote 1">1</a>. It is used to evaluate and improve the performance of facial recognition algorithms in academic, commercial, and government research. According to BiometricUpdate.com<a class="footnote_shim" name="[^lfw_pingan]_1"> </a><a href="#[^lfw_pingan]" class="footnote" title="Footnote 3">3</a>, LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."</p>
+<p>The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002-2004. LFW is a subset of <em>Names of Faces</em> and is part of the first facial recognition training dataset created entirely from images appearing on the Internet. The people appearing in LFW are...</p>
+<p>The <em>Names and Faces</em> dataset was the first face recognition dataset created entire from online photos. However, <em>Names and Faces</em> and <em>LFW</em> are not the first face recognition dataset created entirely "in the wild". 
That title belongs to the <a href="/datasets/ucd_faces/">UCD dataset</a>. Images obtained "in the wild" means using an image without explicit consent or awareness from the subject or photographer.</p>
+<h3>Biometric Trade Routes</h3>
+<p>To understand how this dataset has been used, its citations have been geocoded to show an approximate geographic digital trade route of the biometric data. Lines indicate an organization (education, commercial, or governmental) that has cited the LFW dataset in their research. Data is compiled from <a href="https://www.semanticscholar.org">Semantic Scholar</a>.</p>
+</section><section class='applet_container'><div class='applet' data-payload='{"command": "map"}'></div></section><section><h3>Synthetic Faces</h3>
+<p>To visualize the types of photos in the dataset without explicitly publishing individual's identities a generative adversarial network (GAN) was trained on the entire dataset. The images in this video show a neural network learning the visual latent space and then interpolating between archetypical identities within the LFW dataset.</p>
+</section><section class='fullwidth'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_synthetic.jpg' alt=''></div></section><section><h3>Citations</h3>
+<p>Browse or download the geocoded citation data collected for the LFW dataset.</p>
+</section><section class='applet_container'><div class='applet' data-payload='{"command": "citations"}'></div></section><section><h3>Additional Information</h3>
+<p>(tweet-sized snippets go here)</p>
 <ul>
-<li>Was created for the purpose of improving "unconstrained face recognition" [^lfw_original_paper]</li>
-<li>All images in LFW were obtained "in the wild" meaning without any consent from the subject or from the photographer</li>
-<li>The faces were detected using the Viola-Jones haarcascade face detector [^lfw_website] [^lfw_survey]</li>
-<li>Is considered the "most popular benchmark for face 
recognition" [^lfw_baidu]</li>
-<li>Is "the most widely used evaluation set in the field of facial recognition" [^lfw_pingan]</li>
-<li><p>Is used by several of the largest tech companies in the world including "Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong." [^lfw_pingan]</p>
-</li>
-<li><p>All images were copied from Yahoo News between 2002 - 2004 [^lfw_original_paper]</p>
-</li>
-<li>SenseTime, who has relied on LFW for benchmarking their facial recognition performance, is the leading provider of surveillance to the Chinese Government</li>
+<li>The LFW dataset is considered the "most popular benchmark for face recognition" <a class="footnote_shim" name="[^lfw_baidu]_1"> </a><a href="#[^lfw_baidu]" class="footnote" title="Footnote 2">2</a></li>
+<li>The LFW dataset is "the most widely used evaluation set in the field of facial recognition" <a class="footnote_shim" name="[^lfw_pingan]_2"> </a><a href="#[^lfw_pingan]" class="footnote" title="Footnote 3">3</a></li>
+<li>All images in LFW dataset were obtained "in the wild" meaning without any consent from the subject or from the photographer</li>
+<li>The faces in the LFW dataset were detected using the Viola-Jones haarcascade face detector [^lfw_website] [^lfw-survey]</li>
+<li>The LFW dataset is used by several of the largest tech companies in the world including "Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong." 
<a class="footnote_shim" name="[^lfw_pingan]_3"> </a><a href="#[^lfw_pingan]" class="footnote" title="Footnote 3">3</a></li>
+<li>All images in the LFW dataset were copied from Yahoo News between 2002 - 2004</li>
+<li>In 2014, two of the four original authors of the LFW dataset received funding from IARPA and ODNI for their followup paper <a href="https://www.semanticscholar.org/paper/Labeled-Faces-in-the-Wild-%3A-Updates-and-New-Huang-Learned-Miller/2d3482dcff69c7417c7b933f22de606a0e8e42d4">Labeled Faces in the Wild: Updates and New Reporting Procedures</a> via IARPA contract number 2014-14071600010</li>
+<li>The dataset includes 2 images of <a href="http://vis-www.cs.umass.edu/lfw/person/George_Tenet.html">George Tenet</a>, the former Director of Central Intelligence (DCI) for the Central Intelligence Agency whose facial biometrics were eventually used to help train facial recognition software in China and Russia</li>
 </ul>
 </section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_top1_640.jpg' alt=' former President George W. Bush'><div class='caption'> former President George W. Bush</div></div>
-<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_top2_4_640.jpg' alt=' Colin Powell (236), Tony Blair (144), and Donald Rumsfeld (121)'><div class='caption'> Colin Powell (236), Tony Blair (144), and Donald Rumsfeld (121)</div></div></section><section><h3>People and Companies using the LFW Dataset</h3>
-<p>This section describes who is using the dataset and for what purposes. It should include specific examples of people or companies with citations and screenshots. 
This section is followed up by the graph, the map, and then the supplementary material.</p>
-<p>The LFW dataset is used by numerous companies for <a href="about/glossary#benchmarking">benchmarking</a> algorithms and in some cases <a href="about/glossary#training">training</a>. According to the benchmarking results page [^lfw_results] provided by the authors, over 2 dozen companies have contributed their benchmark results.</p>
-<p>According to BiometricUpdate.com [^lfw_pingan], LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."</p>
-<p>According to researchers at the Baidu Research – Institute of Deep Learning "LFW has been the most popular evaluation benchmark for face recognition, and played a very important role in facilitating the face recognition society to improve algorithm. [^lfw_baidu]."</p>
-<p>In addition to commercial use as an evaluation tool, alll of the faces in LFW dataset are prepackaged into a popular machine learning code framework called scikit-learn.</p>
-</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_screenshot_01.jpg' alt=' "PING AN Tech facial recognition receives high score in latest LFW test results"'><div class='caption'> "PING AN Tech facial recognition receives high score in latest LFW test results"</div></div>
-<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_screenshot_02.jpg' alt=' "Face Recognition Performance in LFW benchmark"'><div class='caption'> "Face Recognition Performance in LFW benchmark"</div></div>
-<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_screenshot_03.jpg' alt=' "The 1st place in face verification challenge, 
LFW"'><div class='caption'> "The 1st place in face verification challenge, LFW"</div></div></section><section><p>In benchmarking, companies use a dataset to evaluate their algorithms which are typically trained on other data. After training, researchers will use LFW as a benchmark to compare results with other algorithms.</p> -<p>For example, Baidu (est. net worth $13B) uses LFW to report results for their "Targeting Ultimate Accuracy: Face Recognition via Deep Embedding". According to the three Baidu researchers who produced the paper:</p> -<h3>Citations</h3> -<p>Overall, LFW has at least 116 citations from 11 countries.</p> -</section><section class='applet_container'><div class='applet' data-payload='{"command": "map"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "citations"}'></div></section><section><h3>Conclusion</h3> -<p>The LFW face recognition training and evaluation dataset is a historically important face dataset as it was the first popular dataset to be created entirely from Internet images, paving the way for a global trend towards downloading anyone’s face from the Internet and adding it to a dataset. As will be evident with other datasets, LFW’s approach has now become the norm.</p> -<p>For all the 5,000 people in this datasets, their face is forever a part of facial recognition history. It would be impossible to remove anyone from the dataset because it is so ubiquitous. 
For their rest of the lives and forever after, these 5,000 people will continue to be used for training facial recognition surveillance.</p> -<h2>Code</h2> +<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_top2_4_640.jpg' alt=' Colin Powell (236), Tony Blair (144), and Donald Rumsfeld (121)'><div class='caption'> Colin Powell (236), Tony Blair (144), and Donald Rumsfeld (121)</div></div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/lfw_montage_all_crop.jpg' alt='All 5,379 faces in the Labeled Faces in The Wild Dataset'><div class='caption'>All 5,379 faces in the Labeled Faces in The Wild Dataset</div></div></section><section><h2>Code</h2> +<p>The LFW dataset is so widely used that a popular code library called scikit-learn includes a function called <code>fetch_lfw_people</code> to download the faces in the LFW dataset.</p> </section><section><pre><code class="lang-python">#!/usr/bin/python import numpy as np @@ -87,31 +75,29 @@ lfw_people = fetch_lfw_people(min_faces_per_person=1, resize=1, color=True, funn # introspect dataset n_samples, h, w, c = lfw_people.images.shape -print('{:,} images at {}x{}'.format(n_samples, w, h)) +print(f'{n_samples:,} images at {w}x{h} pixels') cols, rows = (176, 76) n_ims = cols * rows # build montages im_scale = 0.5 -ims = lfw_people.images[:n_ims -montages = imutils.build_montages(ims, (int(w*im_scale, int(h*im_scale)), (cols, rows)) +ims = lfw_people.images[:n_ims] +montages = imutils.build_montages(ims, (int(w * im_scale), int(h * im_scale)), (cols, rows)) montage = montages[0] # save full montage image imageio.imwrite('lfw_montage_full.png', montage) # make a smaller version -montage_960 = imutils.resize(montage, width=960) -imageio.imwrite('lfw_montage_960.jpg', montage_960) +montage = imutils.resize(montage, width=960) +imageio.imwrite('lfw_montage_960.jpg', montage)
</code></pre> -</section><section><h2>Disclaimer</h2> -<p>MegaPixels is an educational art project designed to encourage discourse about facial recognition datasets. Any ethical or legal issues should be directed to the researcher's parent organizations. Except where necessary for contact or clarity, the names of researchers have been subsituted by their parent organization. In no way does this project aim to villify researchers who produced the datasets.</p> -<p>Read more about <a href="about/code-of-conduct">MegaPixels Code of Conduct</a></p> -<div class="footnotes"> -<hr> -<ol></ol> -</div> -</section> +</section><section><h3>Supplementary Material</h3> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_commercial_use.csv", "fields": ["name_display, company_url, example_url, country, description"]}'></div></section><section><p>Text and graphics ©Adam Harvey / megapixels.cc</p> +</section><section><ul class="footnotes"><li><a name="[^lfw_www]" class="footnote_shim"></a><span class="backlinks"><a href="#[^lfw_www]_1">a</a><a href="#[^lfw_www]_2">b</a></span><p><a href="http://vis-www.cs.umass.edu/lfw/results.html">http://vis-www.cs.umass.edu/lfw/results.html</a></p> +</li><li><a name="[^lfw_baidu]" class="footnote_shim"></a><span class="backlinks"><a href="#[^lfw_baidu]_1">a</a></span><p>Jingtuo Liu, Yafeng Deng, Tao Bai, Zhengping Wei, Chang Huang. Targeting Ultimate Accuracy: Face Recognition via Deep Embedding. <a href="https://arxiv.org/abs/1506.07310">https://arxiv.org/abs/1506.07310</a></p> +</li><li><a name="[^lfw_pingan]" class="footnote_shim"></a><span class="backlinks"><a href="#[^lfw_pingan]_1">a</a><a href="#[^lfw_pingan]_2">b</a><a href="#[^lfw_pingan]_3">c</a></span><p>Lee, Justin. "PING AN Tech facial recognition receives high score in latest LFW test results". BiometricUpdate.com. Feb 13, 2017. 
<a href="https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results">https://www.biometricupdate.com/201702/ping-an-tech-facial-recognition-receives-high-score-in-latest-lfw-test-results</a></p> +</li></ul></section> </div> <footer> diff --git a/site/public/datasets/vgg_face2/index.html b/site/public/datasets/vgg_face2/index.html index b7ba5a4c..efe6cb84 100644 --- a/site/public/datasets/vgg_face2/index.html +++ b/site/public/datasets/vgg_face2/index.html @@ -4,7 +4,7 @@ <title>MegaPixels</title> <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> - <meta name="description" content="A large scale image dataset for face recognition" /> + <meta name="description" content="VGG Face 2 Dataset" /> <meta name="referrer" content="no-referrer" /> <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> <link rel='stylesheet' href='/assets/css/fonts.css' /> @@ -27,35 +27,10 @@ </header> <div class="content"> - <section><h1>VGG Faces2</h1> -</section><section><div class='meta'><div><div class='gray'>Created</div><div>2018</div></div><div><div class='gray'>Images</div><div>3.3M</div></div><div><div class='gray'>People</div><div>9,000</div></div><div><div class='gray'>Created From</div><div>Scraping search engines</div></div><div><div class='gray'>Search available</div><div>[Searchable](#)</div></div></div></section><section><p>VGG Face2 is the updated version of the VGG Face dataset and now includes over 3.3M face images from over 9K people. The identities were selected by taking the top 500K identities in Google's Knowledge Graph of celebrities and then selecting only the names that yielded enough training images. 
The dataset was created in the UK but funded by Office of Director of National Intelligence in the United States.</p> -</section><section class='applet_container'><div class='applet' data-payload='{"command": "face_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "name_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_names_gender_kg_min.csv", "fields": ["Name, Images, Gender, Description"]}'></div></section><section><h3>VGG Face2 by the Numbers</h3> + <section><h1>VGG Face 2</h1> +</section><section><div class='meta'><div><div class='gray'>Years</div><div>TBD</div></div><div><div class='gray'>Images</div><div>TBD</div></div><div><div class='gray'>Identities</div><div>TBD</div></div><div><div class='gray'>Origin</div><div>TBD</div></div><div><div class='gray'>Funding</div><div>IARPA</div></div></div><section><section class='fullwidth'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/vgg_face2/assets/vgg_face2_index.gif' alt='...'><div class='caption'>...</div></div></section><section><h3>Analysis</h3> <ul> -<li>1,331 actresses, 139 presidents</li> -<li>3 husbands and 16 wives</li> -<li>2 snooker player</li> -<li>1 guru</li> -<li>1 pornographic actress</li> -<li>3 computer programmer</li> -</ul> -<h3>Names and descriptions</h3> -<ul> -<li>The original VGGF2 name list has been updated with the results returned from Google Knowledge</li> -<li>Names with a similarity score greater than 0.75 where automatically updated. 
Scores computed using <code>import difflib; seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower()); score = seq.ratio()</code></li> -<li>The 97 names with a score of 0.75 or lower were manually reviewed and includes name changes validating using Wikipedia.org results for names such as "Bruce Jenner" to "Caitlyn Jenner", spousal last-name changes, and discretionary changes to improve search results such as combining nicknames with full name when appropriate, for example changing "Aleksandar Petrović" to "Aleksandar 'Aco' Petrović" and minor changes such as "Mohammad Ali" to "Muhammad Ali"</li> -<li>The 'Description' text was automatically added when the Knowledge Graph score was greater than 250</li> -</ul> -<h2>TODO</h2> -<ul> -<li>create name list, and populate with Knowledge graph information like LFW</li> -<li>make list of interesting number stats, by the numbers</li> -<li>make list of interesting important facts</li> -<li>write intro abstract</li> -<li>write analysis of usage</li> -<li>find examples, citations, and screenshots of useage</li> -<li>find list of companies using it for table</li> -<li>create montages of the dataset, like LFW</li> -<li>create right to removal information</li> +<li>The VGG Face 2 dataset includes approximately 1,331 actresses, 139 presidents, 16 wives, 3 husbands, 2 snooker player, and 1 guru</li> </ul> </section> diff --git a/site/public/datasets_v0/index.html b/site/public/datasets_v0/index.html new file mode 100644 index 00000000..c2e6617b --- /dev/null +++ b/site/public/datasets_v0/index.html @@ -0,0 +1,53 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="Facial Recognition Datasets" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' 
href='/assets/css/tabulator.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/research/">Research</a> + <a href="/about/">About</a> + </div> + </header> + <div class="content"> + + <section><h1>Facial Recognition Datasets</h1> +<p>Regular Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p> +<h3>Summary</h3> +</section><section><div class='meta'><div><div class='gray'>Found</div><div>275 datasets</div></div><div><div class='gray'>Created between</div><div>1993-2018</div></div><div><div class='gray'>Smallest dataset</div><div>20 images</div></div><div><div class='gray'>Largest dataset</div><div>10,000,000 images</div></div></div><section><section><div class='meta'><div><div class='gray'>Highest resolution faces</div><div>450x500 (Unconstrained College Students)</div></div><div><div class='gray'>Lowest resolution faces</div><div>16x20 pixels (QMUL SurvFace)</div></div></div><section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file https://megapixels.nyc3.digitaloceanspaces.com/v1/citations/datasets.csv"}'></div></section> + + </div> + <footer> + <div> + <a href="/">MegaPixels.cc</a> + <a href="/about/disclaimer/">Disclaimer</a> + <a href="/about/terms/">Terms of Use</a> + <a href="/about/privacy/">Privacy</a> + <a href="/about/">About</a> + <a href="/about/team/">Team</a> + </div> + <div> + MegaPixels ©2017-19 Adam R. 
Harvey / + <a href="https://ahprojects.com">ahprojects.com</a> + </div> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/datasets_v0/lfw/index.html b/site/public/datasets_v0/lfw/index.html new file mode 100644 index 00000000..4ee4799f --- /dev/null +++ b/site/public/datasets_v0/lfw/index.html @@ -0,0 +1,131 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="LFW: Labeled Faces in The Wild" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/tabulator.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/research/">Research</a> + <a href="/about/">About</a> + </div> + </header> + <div class="content"> + + <section><h1>Labeled Faces in the Wild</h1> +</section><section><div class='meta'><div><div class='gray'>Created</div><div>2007</div></div><div><div class='gray'>Images</div><div>13,233</div></div><div><div class='gray'>People</div><div>5,749</div></div><div><div class='gray'>Created From</div><div>Yahoo News images</div></div><div><div class='gray'>Search available</div><div>Searchable</div></div></div><section><section class='applet_container'><div class='applet' data-payload='{"command": "face_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "name_search"}'></div></section><section class='fullwidth'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/datasets_v0/lfw/assets/lfw_feature.jpg' alt='Eighteen of 
the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.'><div class='caption'>Eighteen of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.</div></div></section><section><h3>Intro</h3> +<p>Labeled Faces in The Wild (LFW) is among the most widely used facial recognition training datasets in the world and is the first of its kind to be created entirely from images posted online. The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002 and 2004. Use the tools below to check if you were included in this dataset or scroll down to read the analysis.</p> +<p>Three paragraphs describing the LFW dataset in a format that can be easily replicated for the other datasets. Nothing too custom. An analysis of the initial research papers with context relative to all the other dataset papers.</p> +</section><section class='fullwidth'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/datasets_v0/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg' alt=' From George W. Bush to Jamie Lee Curtis: all 5,749 people in the LFW Dataset sorted from most to least images collected.'><div class='caption'> From George W. Bush to Jamie Lee Curtis: all 5,749 people in the LFW Dataset sorted from most to least images collected.</div></div></section><section><h3>LFW by the Numbers</h3> +<ul> +<li>Was first published in 2007</li> +<li>Developed out of a prior dataset from Berkeley called "Faces in the Wild" or "Names and Faces" [^lfw_original_paper]</li> +<li>Includes 13,233 images and 5,749 different people [^lfw_website]</li> +<li>There are about 3 men for every 1 woman (4,277 men and 1,472 women) [^lfw_website]</li> +<li>The person with the most images is George W.
Bush with 530</li> +<li>Most people (70%) in the dataset have only 1 image</li> +<li>There are 1,680 people in the dataset with 2 or more images [^lfw_website]</li> +<li>Two of the four original authors received funding from the Office of Director of National Intelligence and IARPA for their 2016 LFW survey follow up report </li> +<li>The LFW dataset includes over 500 actors, 30 models, 10 presidents, 24 football players, 124 basketball players, 11 kings, and 2 queens</li> +<li>In all the LFW publications provided by the authors the words "ethics", "consent", and "privacy" appear 0 times [^lfw_original_paper], [^lfw_survey], [^lfw_tech_report] , [^lfw_website]</li> +<li>The word "future" appears 71 times</li> +</ul> +<h3>Facts</h3> +<ul> +<li>Was created for the purpose of improving "unconstrained face recognition" [^lfw_original_paper]</li> +<li>All images in LFW were obtained "in the wild" meaning without any consent from the subject or from the photographer</li> +<li>The faces were detected using the Viola-Jones Haar cascade face detector [^lfw_website] [^lfw_survey]</li> +<li>Is considered the "most popular benchmark for face recognition" [^lfw_baidu]</li> +<li>Is "the most widely used evaluation set in the field of facial recognition" [^lfw_pingan]</li> +<li><p>Is used by several of the largest tech companies in the world including "Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong." 
[^lfw_pingan]</p> +</li> +<li><p>All images were copied from Yahoo News between 2002 - 2004 [^lfw_original_paper]</p> +</li> +<li>SenseTime, who has relied on LFW for benchmarking their facial recognition performance, is the leading provider of surveillance to the Chinese Government</li> +</ul> +</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/datasets_v0/lfw/assets/lfw_montage_top1_640.jpg' alt=' former President George W.
Bush'><div class='caption'> former President George W. Bush</div></div> +<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/datasets_v0/lfw/assets/lfw_montage_top2_4_640.jpg' alt=' Colin Powell (236), Tony Blair (144), and Donald Rumsfeld (121)'><div class='caption'> Colin Powell (236), Tony Blair (144), and Donald Rumsfeld (121)</div></div></section><section><h3>People and Companies using the LFW Dataset</h3> +<p>This section describes who is using the dataset and for what purposes. It should include specific examples of people or companies with citations and screenshots. This section is followed up by the graph, the map, and then the supplementary material.</p> +<p>The LFW dataset is used by numerous companies for <a href="about/glossary#benchmarking">benchmarking</a> algorithms and in some cases <a href="about/glossary#training">training</a>. According to the benchmarking results page [^lfw_results] provided by the authors, over 2 dozen companies have contributed their benchmark results.</p> +<p>According to BiometricUpdate.com [^lfw_pingan], LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."</p> +<p>According to researchers at the Baidu Research – Institute of Deep Learning "LFW has been the most popular evaluation benchmark for face recognition, and played a very important role in facilitating the face recognition society to improve algorithm. 
[^lfw_baidu]."</p> +<p>In addition to commercial use as an evaluation tool, all of the faces in the LFW dataset are prepackaged into a popular machine learning code framework called scikit-learn.</p> +</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/datasets_v0/lfw/assets/lfw_screenshot_01.jpg' alt=' "PING AN Tech facial recognition receives high score in latest LFW test results"'><div class='caption'> "PING AN Tech facial recognition receives high score in latest LFW test results"</div></div> +<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/datasets_v0/lfw/assets/lfw_screenshot_02.jpg' alt=' "Face Recognition Performance in LFW benchmark"'><div class='caption'> "Face Recognition Performance in LFW benchmark"</div></div> +<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/datasets_v0/lfw/assets/lfw_screenshot_03.jpg' alt=' "The 1st place in face verification challenge, LFW"'><div class='caption'> "The 1st place in face verification challenge, LFW"</div></div></section><section><p>In benchmarking, companies use a dataset to evaluate their algorithms which are typically trained on other data. After training, researchers will use LFW as a benchmark to compare results with other algorithms.</p> +<p>For example, Baidu (est. net worth $13B) uses LFW to report results for their "Targeting Ultimate Accuracy: Face Recognition via Deep Embedding".
According to the three Baidu researchers who produced the paper:</p> +<h3>Citations</h3> +<p>Overall, LFW has at least 116 citations from 11 countries.</p> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "map"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "citations"}'></div></section><section><h3>Conclusion</h3> +<p>The LFW face recognition training and evaluation dataset is a historically important face dataset as it was the first popular dataset to be created entirely from Internet images, paving the way for a global trend towards downloading anyone’s face from the Internet and adding it to a dataset. As will be evident with other datasets, LFW’s approach has now become the norm.</p> +<p>For all the 5,000 people in this dataset, their face is forever a part of facial recognition history. It would be impossible to remove anyone from the dataset because it is so ubiquitous. 
For the rest of their lives and forever after, these 5,000 people will continue to be used for training facial recognition surveillance.</p> +<h2>Code</h2> +</section><section><pre><code class="lang-python">#!/usr/bin/python + +import numpy as np +from sklearn.datasets import fetch_lfw_people +import imageio +import imutils + +# download LFW dataset (first run takes a while) +lfw_people = fetch_lfw_people(min_faces_per_person=1, resize=1, color=True, funneled=False) + +# introspect dataset +n_samples, h, w, c = lfw_people.images.shape +print(f'{n_samples:,} images at {w}x{h} pixels') +cols, rows = (176, 76) +n_ims = cols * rows + +# build montages +im_scale = 0.5 +ims = lfw_people.images[:n_ims] +montages = imutils.build_montages(ims, (int(w * im_scale), int(h * im_scale)), (cols, rows)) +montage = montages[0] + +# save full montage image +imageio.imwrite('lfw_montage_full.png', montage) + +# make a smaller version +montage_960 = imutils.resize(montage, width=960)
+imageio.imwrite('lfw_montage_960.jpg', montage_960) +</code></pre> +</section><section><div class="footnotes"> +<hr> +<ol></ol> +</div> +</section> + + </div> + <footer> + <div> + <a href="/">MegaPixels.cc</a> + <a href="/about/disclaimer/">Disclaimer</a> + <a href="/about/terms/">Terms of Use</a> + <a href="/about/privacy/">Privacy</a> + <a href="/about/">About</a> + <a href="/about/team/">Team</a> + </div> + <div> + MegaPixels ©2017-19 Adam R. Harvey / + <a href="https://ahprojects.com">ahprojects.com</a> + </div> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/datasets_v0/lfw/right-to-removal/index.html b/site/public/datasets_v0/lfw/right-to-removal/index.html new file mode 100644 index 00000000..97ce4d05 --- /dev/null +++ b/site/public/datasets_v0/lfw/right-to-removal/index.html @@ -0,0 +1,61 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="LFW: Labeled Faces in The Wild" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/tabulator.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/research/">Research</a> + <a href="/about/">About</a> + </div> + </header> + <div class="content"> + + <section><h1>Labeled Faces in the Wild</h1> +<h2>Right to Removal</h2> +<p>If you are affected by disclosure of your identity in this dataset, please contact the authors. Many have stated that they are willing to remove images upon request. The authors of the LFW dataset provide the following email for inquiries:</p> +<p>You can use the following message to request removal from the dataset:</p> +<p>To: Gary Huang <a href="mailto:gbhuang@cs.umass.edu">gbhuang@cs.umass.edu</a></p> +<p>Subject: Request for Removal from LFW Face Dataset</p> +<p>Dear [researcher name],</p> +<p>I am writing to you about the "Labeled Faces in The Wild Dataset".
Recently I discovered that your dataset includes my identity and I no longer wish to be included in your dataset.</p> +<p>The dataset is being used by thousands of companies around the world to improve facial recognition software, including usage by governments for the purpose of law enforcement, national security, tracking consumers in retail environments, and tracking individuals through public spaces.</p> +<p>My name as it appears in your dataset is [your name]. Please remove all images of me from your dataset and inform your newsletter subscribers to likewise update their copies.</p> +<p>- [your name]</p> +</section> + + </div> + <footer> + <div> + <a href="/">MegaPixels.cc</a> + <a href="/about/disclaimer/">Disclaimer</a> + <a href="/about/terms/">Terms of Use</a> + <a href="/about/privacy/">Privacy</a> + <a href="/about/">About</a> + <a href="/about/team/">Team</a> + </div> + <div> + MegaPixels ©2017-19 Adam R. Harvey / + <a href="https://ahprojects.com">ahprojects.com</a> + </div> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/datasets_v0/lfw/tables/index.html b/site/public/datasets_v0/lfw/tables/index.html new file mode 100644 index 00000000..dd460843 --- /dev/null +++ b/site/public/datasets_v0/lfw/tables/index.html @@ -0,0 +1,52 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="LFW: Labeled Faces in The Wild" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/tabulator.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/research/">Research</a> + <a href="/about/">About</a> + </div> + </header> + <div class="content"> + + <section><h1>Labeled Faces in the Wild</h1> +<h2>Tables</h2> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_names_gender_kg_min.csv", "fields": ["Name, Images, Gender, Description"]}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_commercial_use.csv", "fields": ["name_display, company_url, example_url, country, description"]}'></div></section><section></section> + + </div> + <footer> + <div> + <a href="/">MegaPixels.cc</a> + <a href="/about/disclaimer/">Disclaimer</a> + <a href="/about/terms/">Terms of Use</a> + <a href="/about/privacy/">Privacy</a> + <a href="/about/">About</a> + <a href="/about/team/">Team</a> + </div> + <div> + MegaPixels 
©2017-19 Adam R. Harvey / + <a href="https://ahprojects.com">ahprojects.com</a> + </div> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/datasets_v0/vgg_face2/index.html b/site/public/datasets_v0/vgg_face2/index.html new file mode 100644 index 00000000..6a67e7e4 --- /dev/null +++ b/site/public/datasets_v0/vgg_face2/index.html @@ -0,0 +1,80 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="A large scale image dataset for face recognition" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/tabulator.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/research/">Research</a> + <a href="/about/">About</a> + </div> + </header> + <div class="content"> + + <section><h1>VGG Faces2</h1> +</section><section><div class='meta'><div><div class='gray'>Created</div><div>2018</div></div><div><div class='gray'>Images</div><div>3.3M</div></div><div><div class='gray'>People</div><div>9,000</div></div><div><div class='gray'>Created From</div><div>Scraping search engines</div></div><div><div class='gray'>Search available</div><div>[Searchable](#)</div></div></div><section><section><p>VGG Face2 is the updated version of the VGG Face dataset and now includes over 3.3M face images from over 9K people. The identities were selected by taking the top 500K identities in Google's Knowledge Graph of celebrities and then selecting only the names that yielded enough training images. 
The dataset was created in the UK but funded by the Office of the Director of National Intelligence in the United States.</p> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "face_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "name_search"}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file assets/lfw_names_gender_kg_min.csv", "fields": ["Name, Images, Gender, Description"]}'></div></section><section><h3>VGG Face2 by the Numbers</h3> +<ul> +<li>1,331 actresses, 139 presidents</li> +<li>3 husbands and 16 wives</li> +<li>2 snooker players</li> +<li>1 guru</li> +<li>1 pornographic actress</li> +<li>3 computer programmers</li> +</ul> +<h3>Names and descriptions</h3> +<ul> +<li>The original VGGF2 name list has been updated with the results returned from Google Knowledge</li> +<li>Names with a similarity score greater than 0.75 were automatically updated.
Scores were computed using <code>import difflib; seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower()); score = seq.ratio()</code></li> +<li>The 97 names with a score of 0.75 or lower were manually reviewed. The changes include name changes validated using Wikipedia.org results, such as "Bruce Jenner" to "Caitlyn Jenner"; spousal last-name changes; discretionary changes to improve search results, such as combining nicknames with full names when appropriate (for example, changing "Aleksandar Petrović" to "Aleksandar 'Aco' Petrović"); and minor changes such as "Mohammad Ali" to "Muhammad Ali"</li> +<li>The 'Description' text was automatically added when the Knowledge Graph score was greater than 250</li> +</ul> +<h2>TODO</h2> +<ul> +<li>create name list, and populate with Knowledge graph information like LFW</li> +<li>make list of interesting number stats, by the numbers</li> +<li>make list of interesting important facts</li> +<li>write intro abstract</li> +<li>write analysis of usage</li> +<li>find examples, citations, and screenshots of usage</li> +<li>find list of companies using it for table</li> +<li>create montages of the dataset, like LFW</li> +<li>create right to removal information</li> +</ul> +</section> + + </div> + <footer> + <div> + <a href="/">MegaPixels.cc</a> + <a href="/about/disclaimer/">Disclaimer</a> + <a href="/about/terms/">Terms of Use</a> + <a href="/about/privacy/">Privacy</a> + <a href="/about/">About</a> + <a href="/about/team/">Team</a> + </div> + <div> + MegaPixels ©2017-19 Adam R. Harvey / + <a href="https://ahprojects.com">ahprojects.com</a> + </div> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/index.html b/site/public/index.html index d2986084..d5a2e59f 100644 --- a/site/public/index.html +++ b/site/public/index.html @@ -3,15 +3,13 @@ <head> <title>MegaPixels</title> <meta charset="utf-8" /> - <meta name="author" content="Adam Harvey" /> - <meta name="description" content="" /> + <meta name="author" content="info@megapixels.cc" /> + <meta name="description" content="The Dark Side of Datasets" /> <meta name="referrer" content="no-referrer" /> <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> <link rel='stylesheet' href='/assets/css/fonts.css' /> - <link rel='stylesheet' href='/assets/css/tabulator.css' /> <link rel='stylesheet' href='/assets/css/css.css' /> - <link rel='stylesheet' href='/assets/css/leaflet.css' /> - <link rel='stylesheet' href='/assets/css/applets.css' /> + <link rel='stylesheet' href='/assets/css/splash.css' /> </head> <body> <header> @@ -20,112 +18,22 @@ <div class='site_name'>MegaPixels</div> </a> <div class='links'> - <a href="/datasets/">Datasets</a> - <a href="/research/">Research</a> - <a href="/about/">About</a> + <a href="/datasets/" class='aboutLink'>DATASETS</a> + <a href="/research/" class='aboutLink'>RESEARCH</a> + <a href="/about/" class='aboutLink'>ABOUT</a> </div> </header> - <div class="content"> - - <div class='hero'> - <div class='inner'> - <div id="face_container"> - <div class='currentFace'></div> - </div> - <div class='intro'> - <div class='headline'> - MegaPixels is an art project that explores the dark side of face recognition datasets and the future of computer vision. 
- </div> - - <div class='buttons'> - <a href="/datasets/lfw/"><button class='important'>Find Your Face</button></a> - <a href="/analyze/"><button class='normal'>Analyze Your Face</button></a> - </div> - - <div class='under'> - Made by Adam Harvey in collaboration with Jules Laplace, and in partnership with Mozilla.<br/> - <a href='/about/'>Read more about MegaPixels</a> - </div> - </div> - </div> - </div> - - <section class='wide dataset-intro'> - <h2>Face Recognition Datasets</h2> - <div class='right-sidebar'> - <h4>SUMMARY</h4> - <div class='meta'> - <div><div class='gray'>Found</div><div>275 datasets</div></div> - <div><div class='gray'>Created between</div><div>1993-2018</div></div> - <div><div class='gray'>Smallest dataset</div><div>20 images</div></div> - <div><div class='gray'>Largest dataset</div><div>10,000,000 images</div></div> - <div><div class='gray'>Highest resolution faces</div><div>450x500 (Unconstrained College Students)</div></div> - <div><div class='gray'>Lowest resolution faces</div><div>16x20 pixels (QMUL SurvFace)</div></div> - </div> - </div> - - <p> - MegaPixels is an online art project that explores the history of face recognition from the perspective of datasets. MegaPixels aims to unravel the meanings behind the data and expose the darker corners of the biometric industry that have contributed to its growth. - </p> - <p> - Through a mix of case studies, visualizations, and interactive tools, Megapixels will use face recognition datasets to tell the history of modern biometrics. Many people have contributed to the development of face recignition technology, both wittingly and unwittingly. Not only scientists, but also celebrities and regular internet users have played a part. - </p> - <p> - Face recognition is a mess of contradictinos. It works, yet it doesn't actually work. It's cheap and accessible, but also expensive and out of control. 
Face recognition research has achieved headline grabbing superhuman accuracies over 99.9%, yet in practice it's also dangerously inaccurate. - </p> - <p> - During a trial installation at Sudkreuz station in Berlin in 2018, 20% of the matches were wrong, a number so low that it should not have any connection to law enforcement or justice. And in London, the Metropolitan police had been using face recognition software that mistakenly identified an alarming 98% of people as criminals, which perhaps is a crime itself. - </p> - </section> - - <section class='wide dataset-intro'> - <h2>Dataset Portraits</h2> - <p> - We have prepared detailed case studies of some of the more noteworthy datasets, including tools to help you learn what is contained in these datasets, and even whether your own face has been used to train these algorithms. - </p> - - <div class="dataset-list"> - - <a href="/datasets/lfw/"> - <div class="dataset"> - Labeled Faces in The Wild - </div> - </a> - - <a href="/datasets/vgg_face2/"> - <div class="dataset"> - VGG Face2 - </div> - </a> - - </div> - </section> - - + <div class="splash"> + <div id="three_container"></div> </div> <footer> <div> - <a href="/">MegaPixels.cc</a> - <a href="/about/disclaimer/">Disclaimer</a> - <a href="/about/terms/">Terms of Use</a> - <a href="/about/privacy/">Privacy</a> - <a href="/about/">About</a> - <a href="/about/team/">Team</a> </div> <div> MegaPixels ©2017-19 Adam R. 
Harvey / - <a href="https://ahprojects.com">ahprojects.com</a> + <a href="https://ahprojects.com/megapixels/">ahprojects.com</a> </div> </footer> </body> - -<script src="https://cdnjs.cloudflare.com/ajax/libs/babel-polyfill/7.0.0/polyfill.min.js"></script> -<script src="https://cdnjs.cloudflare.com/ajax/libs/three.js/97/three.min.js"></script> -<script src="https://unpkg.com/three.texttexture@18.10.24"></script> -<script src="/assets/demo/cloud/THREE.TextSprite.js"></script> -<script src="/assets/js/vendor/three.meshline.js"></script> -<script src="/assets/js/vendor/oktween.js"></script> -<script src="/assets/js/app/face.js"></script> - -<script src="/assets/js/dist/index.js"></script> +<script src="/assets/js/dist/splash.js"></script> </html>
\ No newline at end of file diff --git a/site/public/info/index.html b/site/public/info/index.html index d3a7d549..0b59e647 100644 --- a/site/public/info/index.html +++ b/site/public/info/index.html @@ -27,7 +27,7 @@ </header> <div class="content"> - <section><h2>What do facial recognition algorithms see?</h2> + <section><h2>Face Analysis</h2> </section><section class='applet_container'><div class='applet' data-payload='{"command": "face_analysis"}'></div></section><section><p>Results are only stored for the duration of the analysis and are deleted when you leave this page.</p> </section> diff --git a/site/public/research/00_introduction/index.html b/site/public/research/00_introduction/index.html index b6cc8e4a..395bd268 100644 --- a/site/public/research/00_introduction/index.html +++ b/site/public/research/00_introduction/index.html @@ -42,18 +42,18 @@ </div> </section> - <section><div class='meta'><div><div class='gray'>Posted</div><div>Dec. 15</div></div><div><div class='gray'>Author</div><div>Adam Harvey</div></div></div></section><section><p>It was the early 2000s. Face recognition was new and no one seemed sure exactly how well it was going to perform in practice. In theory, face recognition was poised to be a game changer, a force multiplier, a strategic military advantage, a way to make cities safer and to secure borders. This was the future John Ashcroft demanded with the Total Information Awareness act of the 2003 and that spooks had dreamed of for decades. It was a future that academics at Carnegie Mellon Universtiy and Colorado State University would help build. It was also a future that celebrities would play a significant role in building. And to the surprise of ordinary Internet users like myself and perhaps you, it was a future that millions of Internet users would unwittingly play role in creating.</p> + <section><div class='meta'><div><div class='gray'>Posted</div><div>Dec. 
15</div></div><div><div class='gray'>Author</div><div>Adam Harvey</div></div></div><section><section><p>Ever since the first computational facial recognition research project by the CIA in the early 1960s, data has always played a vital role in the development of our biometric future. Without facial recognition datasets there would be no facial recognition. Datasets are an indispensable part of any artificial intelligence system because, as Geoffrey Hinton points out:</p> +<blockquote><p>Our relationship to computers has changed. Instead of programming them, we now show them and they figure it out. - <a href="https://www.youtube.com/watch?v=-eyhCTvrEtE">Geoffrey Hinton</a></p> +</blockquote> +<p>Algorithms learn from datasets. And we program algorithms by building datasets. But datasets aren't like code. There's no programming language made of data except for the data itself.</p> +<p>Ignore content below these lines</p> +<p>It was the early 2000s. Face recognition was new and no one seemed sure exactly how well it was going to perform in practice. In theory, face recognition was poised to be a game changer, a force multiplier, a strategic military advantage, a way to make cities safer and to secure borders. This was the future John Ashcroft demanded with the Total Information Awareness act of the 2003 and that spooks had dreamed of for decades. It was a future that academics at Carnegie Mellon Universtiy and Colorado State University would help build. It was also a future that celebrities would play a significant role in building. And to the surprise of ordinary Internet users like myself and perhaps you, it was a future that millions of Internet users would unwittingly play role in creating.</p> <p>Now the future has arrived and it doesn't make sense. Facial recognition works yet it doesn't actually work. Facial recognition is cheap and accessible but also expensive and out of control. 
Facial recognition research has achieved headline grabbing superhuman accuracies over 99.9% yet facial recognition is also dangerously inaccurate. During a trial installation at Sudkreuz station in Berlin in 2018, 20% of the matches were wrong, a number so low that it should not have any connection to law enforcement or justice. And in London, the Metropolitan police had been using facial recognition software that mistakenly identified an alarming 98% of people as criminals <sup class="footnote-ref" id="fnref-met_police"><a href="#fn-met_police">1</a></sup>, which perhaps is a crime itself.</p> <p>MegaPixels is an online art project that explores the history of facial recognition from the perspective of datasets. To paraphrase the artist Trevor Paglen, whoever controls the dataset controls the meaning. MegaPixels aims to unravel the meanings behind the data and expose the darker corners of the biometric industry that have contributed to its growth. MegaPixels does not start with a conclusion, a moralistic slant, or a</p> <p>Whether or not to build facial recognition was a question that can no longer be asked. As an outspoken critic of face recognition I've developed, and hopefully furthered, my understanding during the last 10 years I've spent working with computer vision. Though I initially disagreed, I've come to see technocratic perspective as a non-negotiable reality. As Oren (nytimes article) wrote in NYT Op-Ed "the horse is out of the barn" and the only thing we can do collectively or individually is to steer towards the least worse outcome. Computational communication has entered a new era and it's both exciting and frightening to explore the potentials and opportunities. In 1997 getting access to 1 teraFLOPS of computational power would have cost you $55 million and required a strategic partnership with the Department of Defense. At the time of writing, anyone can rent 1 teraFLOPS on a cloud GPU marketplace for less than $1/day. 
<sup class="footnote-ref" id="fnref-asci_option_red"><a href="#fn-asci_option_red">2</a></sup>.</p> <p>I hope that this project will illuminate the darker areas of strange world of facial recognition that have not yet received attention and encourage discourse in academic, industry, and . By no means do I believe discourse can save the day. Nor do I think creating artwork can. In fact, I'm not exactly sure what the outcome of this project will be. The project is not so much what I publish here but what happens after. This entire project is only a prologue.</p> <p>As McLuhan wrote, "You can't have a static, fixed position in the electric age". And in our hyper-connected age of mass surveillance, artificial intelligece, and unevenly distributed virtual futures the most irrational thing to be is rational. Increasingly the world is becoming a contradiction where people use surveillance to protest surveillance, use</p> <p>Like many projects, MegaPixels had spent years meandering between formats, unfeasible budgets, and was generally too niche of a subject. The basic idea for this project, as proposed to the original <a href="https://tacticaltech.org/projects/the-glass-room-nyc/">Glass Room</a> installation in 2016 in NYC, was to build an interactive mirror that showed people if they had been included in the <a href="/datasets/lfw">LFW</a> facial recognition dataset. The idea was based on my reaction to all the datasets I'd come across during research for the CV Dazzle project. I'd noticed strange datasets created for training and testing face detection algorithms. Most were created in labratory settings and their interpretation of face data was very strict.</p> -<p>About the name</p> -<p>About the funding</p> -<p>About me</p> -<p>About the team</p> -<p>Conclusion</p> <h3>for other post</h3> <p>It was the early 2000s. Face recognition was new and no one seemed sure how well it was going to perform in practice. 
In theory, face recognition was poised to be a game changer, a force multiplier, a strategic military advantage, a way to make cities safer and to secure the borders. It was the future that John Ashcroft demanded with the Total Information Awareness act of the 2003. It was a future that academics helped build. It was a future that celebrities helped build. And it was a future that</p> <p>A decade earlier the Department of Homeland Security and the Counterdrug Technology Development Program Office initated a feasibilty study called FERET (FacE REcognition Technology) to "develop automatic face recognition capabilities that could be employed to assist security, intelligence, and law enforcement personnel in the performance of their duties [^feret_website]."</p> diff --git a/site/public/research/01_from_1_to_100_pixels/index.html b/site/public/research/01_from_1_to_100_pixels/index.html index 4446e1be..c11e966e 100644 --- a/site/public/research/01_from_1_to_100_pixels/index.html +++ b/site/public/research/01_from_1_to_100_pixels/index.html @@ -68,6 +68,9 @@ <li>NIST report on sres states several resolutions</li> <li>"Results show that the tested face recognition systems yielded similar performance for query sets with eye-to-eye distance from 60 pixels to 30 pixels" <sup class="footnote-ref" id="fnref-nist_sres"><a href="#fn-nist_sres">1</a></sup></li> </ul> +<ul> +<li>"Note that we only keep the images with a minimal side length of 80 pixels." and "a face will be labeled as “Ignore” if it is very difficult to be detected due to blurring, severe deformation and unrecognizable eyes, or the side length of its bounding box is less than 32 pixels." Ge_Detecting_Masked_Faces_CVPR_2017_paper.pdf </li> +</ul> <div class="footnotes"> <hr> <ol><li id="fn-nist_sres"><p>NIST 906932. Performance Assessment of Face Recognition Using Super-Resolution. Shuowen Hu, Robert Maschal, S. Susan Young, Tsai Hong Hong, Jonathon P. 
Phillips<a href="#fnref-nist_sres" class="footnote">↩</a></p></li> |
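Note on the name-matching step in site/public/datasets_v0/vgg_face2/index.html: the page text quotes a difflib one-liner and a 0.75 cutoff (names scoring above it were updated automatically, the rest were reviewed by hand). A minimal Python sketch of that scoring, assuming the cutoff is applied exactly as the page describes. The function names and the constant are hypothetical and do not come from the repository.

```python
import difflib

# Cutoff taken from the page text: scores above 0.75 were updated automatically.
AUTO_UPDATE_THRESHOLD = 0.75


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio, following the one-liner quoted on the page."""
    seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower())
    return seq.ratio()


def needs_manual_review(original: str, candidate: str) -> bool:
    """Hypothetical helper: True when the score is 0.75 or lower."""
    return name_similarity(original, candidate) <= AUTO_UPDATE_THRESHOLD


if __name__ == "__main__":
    # Example pair taken from the page text.
    pair = ("Bruce Jenner", "Caitlyn Jenner")
    print(pair, round(name_similarity(*pair), 3), needs_manual_review(*pair))
```

The ratio is symmetric in case only because both strings are lowercased first; accents and punctuation are compared as-is, so a pair that differs only by diacritics may still fall below the cutoff.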
