Diffstat (limited to 'site')
-rw-r--r--   site/public/about/index.html                                3
-rw-r--r--   site/public/datasets/brainwash/index.html                  43
-rw-r--r--   site/public/datasets/cofw/index.html                        2
-rw-r--r--   site/public/datasets/lfw/index.html                         2
-rw-r--r--   site/public/datasets/mars/index.html                        4
-rw-r--r--   site/public/research/01_from_1_to_100_pixels/index.html     1
-rw-r--r--   site/public/research/02_what_computers_can_see/index.html 143
7 files changed, 175 insertions, 23 deletions
diff --git a/site/public/about/index.html b/site/public/about/index.html index 694f7ec9..3c270ee1 100644 --- a/site/public/about/index.html +++ b/site/public/about/index.html @@ -37,7 +37,8 @@ <li><a href="/about/privacy/">Privacy Policy</a></li> </ul> </section><p>(PAGE UNDER DEVELOPMENT)</p> -<p><div style="font-size:20px;line-height:36px">Ever since government agencies began researching face recognition in the early 1960's, datasets of face images have always been central to technological advancements. Today, these datasets no longer originate in labs, but instead from family photo albums posted on photo sharing sites, surveillance cameras on college campuses, search engine queries for celebrities, cafe livestreams, or <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">personal videos</a> posted on YouTube. Collectively, facial recognition datasets are now gathered "in the wild".</div></p><p>MegaPixels is art and research by <a href="https://ahprojects.com">Adam Harvey</a> about facial recognition datasets that unravels their histories, futures, geographies, and meanings. Throughout 2019 this site this site will publish research reports, visualizations, raw data, and interactive tools to explore how publicly available facial recognition datasets contribute to a global supply chain of biometric data that powers the global facial recognition industry.</p><p>During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that is powering the global facial recognition industry.</p> +<p><div style="font-size:20px;line-height:36px">Ever since government agencies began developing face recognition in the early 1960s, datasets of face images have been central to technological advancements. 
Today, these datasets no longer originate in labs, but instead from family photo albums posted on photo sharing sites, surveillance cameras on college campuses, search engine queries for celebrities, cafe livestreams, and <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">personal videos</a> posted on YouTube. </div></p><p>Collectively, facial recognition datasets are now gathered "in the wild".</p> +<p>MegaPixels is art and research by <a href="https://ahprojects.com">Adam Harvey</a> about facial recognition datasets that unravels their histories, futures, geographies, and meanings. Throughout 2019 this site will publish research reports, visualizations, raw data, and interactive tools to explore how publicly available facial recognition datasets contribute to a global supply chain of biometric data that powers the global facial recognition industry.</p><p>During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to this supply chain.</p> <p>The MegaPixels website is produced in partnership with <a href="https://mozilla.org">Mozilla</a>.</p> <div class="flex-container team-photos-container"> <div class="team-member"> diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html index ab002c78..e5baca7a 100644 --- a/site/public/datasets/brainwash/index.html +++ b/site/public/datasets/brainwash/index.html @@ -4,7 +4,7 @@ <title>MegaPixels</title> <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> - <meta name="description" content="Brainwash is a dataset of webcam images from the Brainwash Cafe in San Francisco" /> + <meta name="description" content="Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco" /> <meta name="referrer" content="no-referrer" /> 
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> <link rel='stylesheet' href='/assets/css/fonts.css' /> @@ -26,15 +26,29 @@ </header> <div class="content content-dataset"> - <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/brainwash/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'><span style='color: #ffaa00'>Brainwash</span> is a dataset of webcam images from the Brainwash Cafe in San Francisco</span></div><div class='hero_subdesc'><span class='bgpad'>The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection algorithms -</span></div></div></section><section><div class='left-sidebar'><div class='meta'><div><div class='gray'>Collected</div><div>2014</div></div><div><div class='gray'>Published</div><div>2015</div></div><div><div class='gray'>Images</div><div>11,918</div></div><div><div class='gray'>Faces</div><div>91,146</div></div><div><div class='gray'>Created by</div><div>Stanford Department of Computer Science</div></div><div><div class='gray'>Funded by</div><div>Max Planck Center for Visual Computing and Communication</div></div><div><div class='gray'>Resolution</div><div>640x480px</div></div><div><div class='gray'>Size</div><div>4.1GB</div></div><div><div class='gray'>Origin</div><div>Brainwash Cafe, San Franscisco</div></div><div><div class='gray'>Purpose</div><div>Training face detection</div></div><div><div class='gray'>Website</div><div><a href="https://exhibits.stanford.edu/data/catalog/sx925dc9385">stanford.edu</a></div></div><div><div class='gray'>Paper</div><div><a href="http://arxiv.org/abs/1506.04878">End-to-End People Detection in Crowded Scenes</a></div></div></div></div><h2>Brainwash Dataset</h2> + <section class='intro_section' style='background-image: 
url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/brainwash/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'><span style='color: #ffaa00'>Brainwash</span> is a dataset of webcam images taken from the Brainwash Cafe in San Francisco</span></div><div class='hero_subdesc'><span class='bgpad'>The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection algorithms +</span></div></div></section><section><div class='left-sidebar'><div class='meta'><div><div class='gray'>Published</div><div>2015</div></div><div><div class='gray'>Images</div><div>11,918</div></div><div><div class='gray'>Faces</div><div>91,146</div></div><div><div class='gray'>Created by</div><div>Stanford Department of Computer Science</div></div><div><div class='gray'>Funded by</div><div>Max Planck Center for Visual Computing and Communication</div></div><div><div class='gray'>Location</div><div>Brainwash Cafe, San Francisco</div></div><div><div class='gray'>Purpose</div><div>Training face detection</div></div><div><div class='gray'>Website</div><div><a href="https://exhibits.stanford.edu/data/catalog/sx925dc9385">stanford.edu</a></div></div><div><div class='gray'>Paper</div><div><a href="http://arxiv.org/abs/1506.04878">End-to-End People Detection in Crowded Scenes</a></div></div><div><div class='gray'>Explicit Consent</div><div>No</div></div></div></div><h2>Brainwash Dataset</h2>
The images are used to develop face detection algorithms for the "challenging task of detecting people in crowded scenes" and tracking them.</p> <p>Before closing in 2017, Brainwash Cafe was a "cafe and laundromat" located in San Francisco's SoMA district. The cafe published a publicly available livestream with a view of the cash register, performance stage, and seating area.</p> <p>Since its publication by Stanford in 2015, the Brainwash dataset has appeared in several notable research papers. In September 2016, four researchers from the National University of Defense Technology in Changsha, China used the Brainwash dataset for a research study on "people head detection in crowded scenes", concluding that their algorithm "achieves superior head detection performance on the crowded scenes dataset<a class="footnote_shim" name="[^localized_region_context]_1"> </a><a href="#[^localized_region_context]" class="footnote" title="Footnote 2">2</a>". Again in 2017, three researchers at the National University of Defense Technology used Brainwash for a study on object detection noting "the data set used in our experiment is shown in Table 1, which includes one scene of the brainwash dataset<a class="footnote_shim" name="[^replacement_algorithm]_1"> </a><a href="#[^replacement_algorithm]" class="footnote" title="Footnote 3">3</a>".</p> </section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/brainwash/assets/00425000_960.jpg' alt=' A sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains about 12,000 images. License: Open Data Commons Public Domain Dedication (PDDL)'><div class='caption'> A sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains about 12,000 images. 
License: Open Data Commons Public Domain Dedication (PDDL)</div></div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/brainwash/assets/brainwash_montage.jpg' alt=' 49 of the 11,918 images included in the Brainwash dataset. License: Open Data Commons Public Domain Dedication (PDDL)'><div class='caption'> 49 of the 11,918 images included in the Brainwash dataset. License: Open Data Commons Public Domain Dedication (PDDL)</div></div></section><section> + <h3>Who used Brainwash Dataset?</h3> + + <p> + This bar chart presents a ranking of the top countries where citations originated. Mouse over individual columns + to see yearly totals. Colors are only assigned to the top 10 overall countries. + </p> + + </section> + +<section class="applet_container"> +<!-- <div style="position: absolute;top: 0px;right: -55px;width: 180px;font-size: 14px;">Labeled Faces in the Wild Dataset<br><span class="numc" style="font-size: 11px;">20 citations</span> +</div> --> + <div class="applet" data-payload="{"command": "chart"}"></div> +</section><section> - <h3>Biometric Trade Routes (beta)</h3> + <h3>Information Supply Chain</h3> <!-- <div class="map-sidebar right-sidebar"> <h3>Legend</h3> @@ -47,7 +61,7 @@ --> <p> To understand how Brainwash Dataset has been used around the world... - affected global research on computer vision, surveillance, defense, and consumer technology, the and where this dataset has been used the locations of each organization that used or referenced the datast + and affected global research on computer vision, surveillance, defense, and consumer technology, this map shows the locations of each organization that used or referenced the dataset </p> </section> @@ -65,20 +79,9 @@ <section> <p class='subp'> - The data is generated by collecting all citations for all original research papers associated with the dataset. 
Then the PDFs are then converted to text and the organization names are extracted and geocoded. Because of the automated approach to extracting data, actual use of the dataset can not yet be confirmed. This visualization is provided to help locate and confirm usage and will be updated as data noise is reduced. + Standardized paragraph of text about the map. Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo. </p> -</section><section> - <h3>Who used Brainwash Dataset?</h3> - - <p> - This bar chart presents a ranking of the top countries where citations originated. Mouse over individual columns - to see yearly totals. Colors are only assigned to the top 10 overall countries. - </p> - - </section> - -<section class="applet_container"> - <div class="applet" data-payload="{"command": "chart"}"></div> +</section><section><p>Add more analysis here</p> </section><section> @@ -92,11 +95,11 @@ <h3>Citations</h3> <p> - Citations were collected from <a href="https://www.semanticscholar.org">Semantic Scholar</a>, a website which aggregates + The citations used for the geographic visualizations were collected from <a href="https://www.semanticscholar.org">Semantic Scholar</a>, a website which aggregates and indexes research papers. Metadata was extracted from these papers, including institution names extracted automatically from PDFs, and then the addresses were geocoded. Data is not yet manually verified, and reflects any time the paper was cited. Some papers may only mention the dataset in passing, while others use it as part of their research methodology. </p> <p> - Add button/link to download CSV + Add [button/link] to download CSV. Add search input field to filter. Expand number of rows to 10. 
Reduce URL text to show only the domain (ie https://arxiv.org/pdf/123456 --> arxiv.org) </p> <div class="applet" data-payload="{"command": "citations"}"></div> diff --git a/site/public/datasets/cofw/index.html b/site/public/datasets/cofw/index.html index 605a325a..20138c3c 100644 --- a/site/public/datasets/cofw/index.html +++ b/site/public/datasets/cofw/index.html @@ -108,6 +108,8 @@ To increase the number of training images, and since COFW has the exact same la </section> <section class="applet_container"> + <div style="position: absolute;top: 0px;right: -55px;width: 180px;font-size: 14px;">Labeled Faces in the Wild Dataset<br><span class="numc" style="font-size: 11px;">20 citations</span> +</div> <div class="applet" data-payload="{"command": "chart"}"></div> </section><section><p>TODO</p> <h2>- replace graphic</h2> diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html index 477673e2..8670f909 100644 --- a/site/public/datasets/lfw/index.html +++ b/site/public/datasets/lfw/index.html @@ -90,6 +90,8 @@ </section> <section class="applet_container"> + <div style="position: absolute;top: 0px;right: -55px;width: 180px;font-size: 14px;">Labeled Faces in the Wild Dataset<br><span class="numc" style="font-size: 11px;">20 citations</span> +</div> <div class="applet" data-payload="{"command": "chart"}"></div> </section><section> diff --git a/site/public/datasets/mars/index.html b/site/public/datasets/mars/index.html index bfad52a3..b053b456 100644 --- a/site/public/datasets/mars/index.html +++ b/site/public/datasets/mars/index.html @@ -4,7 +4,7 @@ <title>MegaPixels</title> <meta charset="utf-8" /> <meta name="author" content="Adam Harvey" /> - <meta name="description" content="The Motion Analysis and Re-identification Set (MARS) is a dataset is collection of CCTV footage " /> + <meta name="description" content="Motion Analysis and Re-identification Set (MARS) is a collection of CCTV footage" /> <meta name="referrer" 
content="no-referrer" /> <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> <link rel='stylesheet' href='/assets/css/fonts.css' /> @@ -26,7 +26,7 @@ </header> <div class="content content-dataset"> - <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/mars/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'>The <span style="color:#99ccee">Motion Analysis and Re-identification Set (MARS)</span> is a dataset is collection of CCTV footage </span></div><div class='hero_subdesc'><span class='bgpad'>The MARS dataset includes 1,191,003 of people used for training person re-identification algorithms + <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/mars/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'><span style="color:#99ccee">Motion Analysis and Re-identification Set (MARS)</span> is a collection of CCTV footage </span></div><div class='hero_subdesc'><span class='bgpad'>The MARS dataset includes 1,191,003 images of people used for training person re-identification algorithms </span></div></div></section><section><div class='left-sidebar'><div class='meta'><div><div class='gray'>Collected</div><div>TBD</div></div><div><div class='gray'>Published</div><div>TBD</div></div><div><div class='gray'>Images</div><div>TBD</div></div><div><div class='gray'>Faces</div><div>TBD</div></div></div></div><h2>Motion Analysis and Re-identification Set (MARS)</h2> <p>(PAGE UNDER DEVELOPMENT)</p> <p>At vero eos et accusamus et iusto odio dignissimos ducimus, qui blanditiis praesentium voluptatum deleniti atque corrupti, quos dolores et quas molestias excepturi sint, obcaecati cupiditate non-provident, similique sunt in culpa, qui officia deserunt mollitia animi, id est laborum et dolorum fuga. 
Et harum quidem rerum facilis est et expedita distinctio.</p> diff --git a/site/public/research/01_from_1_to_100_pixels/index.html b/site/public/research/01_from_1_to_100_pixels/index.html index 5254fb40..c91d17ad 100644 --- a/site/public/research/01_from_1_to_100_pixels/index.html +++ b/site/public/research/01_from_1_to_100_pixels/index.html @@ -78,6 +78,7 @@ </ul> <ul> <li>"Note that we only keep the images with a minimal side length of 80 pixels." and "a face will be labeled as “Ignore” if it is very difficult to be detected due to blurring, severe deformation and unrecognizable eyes, or the side length of its bounding box is less than 32 pixels." Ge_Detecting_Masked_Faces_CVPR_2017_paper.pdf </li> +<li>IBM DiF: "Faces with region size less than 50x50 or inter-ocular distance of less than 30 pixels were discarded. Faces with non-frontal pose, or anything beyond being slightly tilted to the left or the right, were also discarded."</li> </ul> <div class="footnotes"> <hr> diff --git a/site/public/research/02_what_computers_can_see/index.html b/site/public/research/02_what_computers_can_see/index.html index 202359e0..9389bf84 100644 --- a/site/public/research/02_what_computers_can_see/index.html +++ b/site/public/research/02_what_computers_can_see/index.html @@ -126,6 +126,149 @@ <li>Wearing Necktie</li> <li>Wearing Necklace</li> </ul> +<h2>From Market 1501</h2> +<p>The 27 attributes are:</p> +<table> +<thead><tr> +<th style="text-align:center">attribute</th> +<th style="text-align:center">representation in file</th> +<th style="text-align:center">label</th> +</tr> +</thead> +<tbody> +<tr> +<td style="text-align:center">gender</td> +<td style="text-align:center">gender</td> +<td style="text-align:center">male(1), female(2)</td> +</tr> +<tr> +<td style="text-align:center">hair length</td> +<td style="text-align:center">hair</td> +<td style="text-align:center">short hair(1), long hair(2)</td> +</tr> +<tr> +<td style="text-align:center">sleeve length</td> +<td 
style="text-align:center">up</td> +<td style="text-align:center">long sleeve(1), short sleeve(2)</td> +</tr> +<tr> +<td style="text-align:center">length of lower-body clothing</td> +<td style="text-align:center">down</td> +<td style="text-align:center">long lower body clothing(1), short(2)</td> +</tr> +<tr> +<td style="text-align:center">type of lower-body clothing</td> +<td style="text-align:center">clothes</td> +<td style="text-align:center">dress(1), pants(2)</td> +</tr> +<tr> +<td style="text-align:center">wearing hat</td> +<td style="text-align:center">hat</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying backpack</td> +<td style="text-align:center">backpack</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying bag</td> +<td style="text-align:center">bag</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying handbag</td> +<td style="text-align:center">handbag</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">age</td> +<td style="text-align:center">age</td> +<td style="text-align:center">young(1), teenager(2), adult(3), old(4)</td> +</tr> +<tr> +<td style="text-align:center">8 color of upper-body clothing</td> +<td style="text-align:center">upblack, upwhite, upred, uppurple, upyellow, upgray, upblue, upgreen</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">9 color of lower-body clothing</td> +<td style="text-align:center">downblack, downwhite, downpink, downpurple, downyellow, downgray, downblue, downgreen,downbrown</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +</tbody> +</table> +<p>source: <a href="https://github.com/vana77/Market-1501_Attribute/blob/master/README.md">https://github.com/vana77/Market-1501_Attribute/blob/master/README.md</a></p> +<h2>From DukeMTMC</h2> +<p>The 23 
attributes are:</p> +<table> +<thead><tr> +<th style="text-align:center">attribute</th> +<th style="text-align:center">representation in file</th> +<th style="text-align:center">label</th> +</tr> +</thead> +<tbody> +<tr> +<td style="text-align:center">gender</td> +<td style="text-align:center">gender</td> +<td style="text-align:center">male(1), female(2)</td> +</tr> +<tr> +<td style="text-align:center">length of upper-body clothing</td> +<td style="text-align:center">top</td> +<td style="text-align:center">short upper body clothing(1), long(2)</td> +</tr> +<tr> +<td style="text-align:center">wearing boots</td> +<td style="text-align:center">boots</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">wearing hat</td> +<td style="text-align:center">hat</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying backpack</td> +<td style="text-align:center">backpack</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying bag</td> +<td style="text-align:center">bag</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying handbag</td> +<td style="text-align:center">handbag</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">color of shoes</td> +<td style="text-align:center">shoes</td> +<td style="text-align:center">dark(1), light(2)</td> +</tr> +<tr> +<td style="text-align:center">8 color of upper-body clothing</td> +<td style="text-align:center">upblack, upwhite, upred, uppurple, upgray, upblue, upgreen, upbrown</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">7 color of lower-body clothing</td> +<td style="text-align:center">downblack, downwhite, downred, downgray, downblue, downgreen, downbrown</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +</tbody> +</table> 
+<p>source: <a href="https://github.com/vana77/DukeMTMC-attribute/blob/master/README.md">https://github.com/vana77/DukeMTMC-attribute/blob/master/README.md</a></p> +<h2>From H3D Dataset</h2> +<ul> +<li>The joints and other keypoints (eyes, ears, nose, shoulders, elbows, wrists, hips, knees and ankles)</li> +<li>The 3D pose inferred from the keypoints</li> +<li>Visibility boolean for each keypoint</li> +<li>Region annotations (upper clothes, lower clothes, dress, socks, shoes, hands, gloves, neck, face, hair, hat, sunglasses, bag, occluder)</li> +<li>Body type (male, female or child)</li> +</ul> +<p>source: <a href="https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/h3d/">https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/h3d/</a></p> </section> </div>
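The citations-table TODO in the Brainwash diff above calls for reducing each citation URL to just its domain (ie https://arxiv.org/pdf/123456 --> arxiv.org). A minimal sketch of that reduction, assuming a plain string-in/string-out helper (the name `domain_only` is hypothetical, not part of the site's code):

```python
from urllib.parse import urlparse

def domain_only(url: str) -> str:
    """Reduce a full citation URL to its bare domain for display."""
    netloc = urlparse(url).netloc
    # Strip a leading "www." so the visible label stays compact.
    return netloc[4:] if netloc.startswith("www.") else netloc

print(domain_only("https://arxiv.org/pdf/123456"))  # arxiv.org
```

The full URL would stay in the link's href attribute; only the visible anchor text is shortened.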
