Diffstat (limited to 'site')
47 files changed, 998 insertions, 138 deletions
diff --git a/site/assets/css/css.css b/site/assets/css/css.css index adc1a2fa..2a004e76 100644 --- a/site/assets/css/css.css +++ b/site/assets/css/css.css @@ -645,7 +645,7 @@ section.images { } .image:first-child:nth-last-child(2), .image:first-child:nth-last-child(2) ~ .image { - width: 300px; + width: 455px; } .image:first-child:nth-last-child(3), .image:first-child:nth-last-child(3) ~ .image { @@ -727,15 +727,29 @@ section.fullwidth .image { } .research_index h1 { margin-top: 20px; - text-decoration: underline; +} +.research_index .wide { + min-height: 33vh; + padding: 40px 20px; } .desktop .research_index section:hover h1 { color: #fff; } +.research_index section { + padding: 20px; +} .research_index section:hover h2 { color: #ddd; } - +.research_index section h2, +.research_index section h3, +.research_index section h4 { + max-width: 90%; + margin: 30px; +} +.research_index .readmore span { + border-bottom: 2px solid; +} /* home page */ .hero { @@ -959,6 +973,7 @@ section.intro_section { max-width: 960px; margin: 3rem auto; } +.research_index h2, .intro_section .hero_desc { font-size: 38px; line-height: 60px; @@ -966,14 +981,16 @@ section.intro_section { color: #ddd; font-weight: 400; } +.mobile .research_index h2, .mobile .intro_section .hero_desc{ font-size: 16px; line-height: 32px; margin-bottom: 20px; } -.intro_section .hero_desc .dataset-name{ - color:#fff; +.intro_section .hero_desc .dataset-name { + color: #fff; } +.research_index h3, .intro_section .hero_subdesc { font-size: 17px; line-height: 36px; @@ -981,27 +998,37 @@ section.intro_section { font-weight: 400; color: #ddd; } +.mobile .research_index h3, .mobile .intro_section .hero_subdesc { font-size: 14px; line-height: 28px; } +.research_index h2 .bgpad, .intro_section .hero_desc .bgpad { box-shadow: -10px -10px #181818, 10px -10px #181818, 10px 10px #181818, -10px 10px #181818; background: #181818; } +.research_index h3 .bgpad, .intro_section .hero_subdesc .bgpad { box-shadow: -10px -10px #181818, 
10px -10px #181818, 10px 10px #181818, -10px 10px #181818; background: #181818; } +.mobile .research_index h2 .bgpad, .mobile .intro_section .hero_desc .bgpad { box-shadow: -6px -6px #181818, 6px -6px #181818, 6px 6px #181818, -6px 6px #181818; background: #181818; } +.mobile .research_index h3 .bgpad, .mobile .intro_section .hero_subdesc .bgpad { box-shadow: -6px -6px #181818, 6px -6px #181818, 6px 6px #181818, -6px 6px #181818; background: #181818; } +.research_index h4 .bgpad { + box-shadow: -6px -6px #181818, 6px -6px #181818, 6px 6px #181818, -6px 6px #181818; + background: #181818; + color: #eee; +} .firefox .intro_section div > span { box-decoration-break: clone; diff --git a/site/content/_drafts_/lfw/index.md b/site/content/_drafts_/lfw/index.md index ad43e2dd..a5d6bd18 100644 --- a/site/content/_drafts_/lfw/index.md +++ b/site/content/_drafts_/lfw/index.md @@ -54,7 +54,7 @@ Add a paragraph about how usage extends far beyond academia into research center ``` load_file assets/lfw_commercial_use.csv -name_display, company_url, example_url, country, description +Headings: name_display, company_url, example_url, country, description ``` diff --git a/site/content/pages/datasets/msceleb/index.md b/site/content/pages/datasets/msceleb/index.md index 453c1522..0e457cd9 100644 --- a/site/content/pages/datasets/msceleb/index.md +++ b/site/content/pages/datasets/msceleb/index.md @@ -101,9 +101,9 @@ For example, on October 28, 2019, the MS Celeb dataset will be used for a new co And in June, shortly after [posting](https://twitter.com/adamhrv/status/1134511293526937600) about the disappearance of the MS Celeb dataset, it reemerged on [Academic Torrents](https://academictorrents.com/details/9e67eb7cc23c9417f39778a8e06cca5e26196a97/tech). As of June 10, the MS Celeb dataset files have been redistributed in at least 9 countries and downloaded 44 times without any restrictions. 
The files were seeded and are mostly distributed by an AI company based in China called Hyper.ai, which states that it redistributes MS Celeb and other datasets for "teachers and students of service industry-related practitioners and research institutes."[^hyperai_readme] -Earlier in 2019 images from the MS Celeb were also repackaged into another face dataset called *Racial Faces in the Wild (RFW)*. To create it, the RFW authors uploaded face images from the MS Celeb dataset to the Face++ API and used the inferred racial scores to segregate people into four subsets: Caucasian, Asian, Indian, and African each with 3,000 subjects. That dataset then appeared in a subsequent research project from researchers affiliated with IIIT-Delhi and IBM TJ Watson called [Deep Learning for Face Recognition: Pride or Prejudiced?](https://arxiv.org/abs/1904.01219), which aims to reduce bias but also inadvertently furthers racist language and ideologies that can not be repeated here. +Earlier in 2019 images from the MS Celeb were also repackaged into another face dataset called *Racial Faces in the Wild (RFW)*. To create it, the RFW authors uploaded face images from the MS Celeb dataset to the Face++ API and used the inferred racial scores to segregate people into four subsets: Caucasian, Asian, Indian, and African each with 3,000 subjects. That dataset then appeared in a subsequent research project from researchers affiliated with IIIT-Delhi and IBM TJ Watson called [Deep Learning for Face Recognition: Pride or Prejudiced?](https://arxiv.org/abs/1904.01219), which aims to reduce bias but also inadvertently furthers racist ideologies, using discredited racial terminology that cannot be repeated here. -The estimated racial scores for the MS Celeb face images used in the RFW dataset were computed using the Face++ API, which is owned by Megvii Inc, a company that has been repeatedly linked to the oppressive surveillance of Uighur Muslims in Xinjiang, China. 
According to posts from the [ChinAI Newsletter](https://chinai.substack.com/p/chinai-newsletter-11-companies-involved-in-expanding-chinas-public-security-apparatus-in-xinjiang) and [BuzzFeedNews](https://www.buzzfeednews.com/article/ryanmac/us-money-funding-facial-recognition-sensetime-megvii), Megvii announced in 2017 at the China-Eurasia Security Expo in Ürümqi, Xinjiang, that it would be the official technical support unit of the "Public Security Video Laboratory" in Xinjiang, China. If they didn't already, it's highly likely that Megvii has a copy of everyone's biometric faceprint from the MS Celeb dataset, either from uploads to the Face++ API or through the research projects explicitly referencing MS Celeb dataset usage, such as a 2018 paper called [GridFace: Face Rectification via Learning Local Homography Transformations](https://arxiv.org/pdf/1808.06210.pdf) jointly published by 3 authors, all of whom worked for Megvii. +The estimated racial scores for the MS Celeb face images used in the RFW dataset were computed using the Face++ API, which is owned by Megvii Inc, a company that has been repeatedly linked to the oppressive surveillance of Uighur Muslims in Xinjiang, China. According to posts from the [ChinAI Newsletter](https://chinai.substack.com/p/chinai-newsletter-11-companies-involved-in-expanding-chinas-public-security-apparatus-in-xinjiang) and [BuzzFeedNews](https://www.buzzfeednews.com/article/ryanmac/us-money-funding-facial-recognition-sensetime-megvii), Megvii announced in 2017 at the China-Eurasia Security Expo in Ürümqi, Xinjiang, that it would be the official technical support unit of the "Public Security Video Laboratory" in Xinjiang, China. 
If they didn't already, it's highly likely that Megvii has a copy of everyone's biometric faceprint from the MS Celeb dataset, either from uploads to the Face++ API or through research projects explicitly referencing MS Celeb dataset usage, such as a 2018 paper called [GridFace: Face Rectification via Learning Local Homography Transformations](https://arxiv.org/pdf/1808.06210.pdf) jointly published by 3 authors, all of whom worked for Megvii. ## Commercial Usage diff --git a/site/content/pages/research/_introduction/index.md b/site/content/pages/research/_introduction/index.md new file mode 100644 index 00000000..bdf1c1b0 --- /dev/null +++ b/site/content/pages/research/_introduction/index.md @@ -0,0 +1,49 @@ +------------ + +status: draft +title: Introducing MegaPixels +desc: Introduction to Megapixels +slug: 00_introduction +cssclass: dataset +published: 2018-12-15 +updated: 2018-12-15 +authors: Adam Harvey + +------------ + +# Introduction + +Face recognition has become the focal point for ... + +Add 68pt landmarks animation + +But biometric currency is ... + +Add rotation 3D head + +Inflationary... + +Add Theresea May 3D + +(comission for CPDP) + +Add info from the AI Traps talk + + ++ Posted: Dec. 15 ++ Author: Adam Harvey + + + +``` +load_file /site/research/00_introduction/assets/summary_countries_top.csv +Headings: country, Xcitations +``` + +Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. + + + +[ page under development ] + +
\ No newline at end of file diff --git a/site/content/pages/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv b/site/content/pages/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv index 89f3c226..3a439821 100755 --- a/site/content/pages/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv +++ b/site/content/pages/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv @@ -1,5 +1,5 @@ dataset,images -ibm_dif,389 -megaface,5679 -vgg_face,1 -who_goes_there,2372 +IBM Diversity in Faces,389 +MegaFace,5679 +VGG Face,1 +Who Goes There,2372
\ No newline at end of file diff --git a/site/content/pages/research/munich_security_conference/assets/megapixels_origins_top.csv b/site/content/pages/research/munich_security_conference/assets/megapixels_origins_top.csv index 081b4636..ae6e8f11 100755 --- a/site/content/pages/research/munich_security_conference/assets/megapixels_origins_top.csv +++ b/site/content/pages/research/munich_security_conference/assets/megapixels_origins_top.csv @@ -1,9 +1,8 @@ source,images -Search Engines,30127200 -Flickr.com,11783888 -IMDb.com,5251410 -CCTV,959312 -Wikimedia.org,183500 -Mugshots,113268 -YouTube.com,31888 -Other Sources Combined,37044 +Internet Search Engines,15063600 +Flickr.com,5891944 +Internet Movie Database (IMDB.com),2625705 +CCTV,479656 +Wikimedia.org,91750 +Mugshots,56634 +YouTube.com,15944
\ No newline at end of file diff --git a/site/content/pages/research/munich_security_conference/index.md b/site/content/pages/research/munich_security_conference/index.md index aba39b1c..c4c6a70c 100644 --- a/site/content/pages/research/munich_security_conference/index.md +++ b/site/content/pages/research/munich_security_conference/index.md @@ -1,19 +1,19 @@ ------------ status: published -title: MSC +title: Transnational Flows of Face Recognition Image Training Data slug: munich-security-conference -desc: Analyzing the Transnational Flow of Facial Recognition Training Data +desc: Analyzing Transnational Flows of Face Recognition Image Training Data subdesc: Where does face data originate and who's using it? cssclass: dataset image: assets/background.jpg -published: 2019-4-18 -updated: 2019-4-19 +published: 2019-6-28 +updated: 2019-6-29 authors: Adam Harvey ------------ -## Analysis for the Munich Security Conference Transnational Security Report +## Face Datasets and Information Supply Chains ### sidebar @@ -21,21 +21,30 @@ authors: Adam Harvey + Datasets Analyzed: 30 + Years: 2006 - 2018 + Status: Ongoing Investigation -+ Last Updated: June 27, 2019 ++ Last Updated: June 28, 2019 ### end sidebar +National AI strategies often rely on transnational data sources to capitalize on recent advancements in deep learning and neural networks. Researchers benefiting from these transnational data flows can yield quick and significant gains across diverse sectors from health care to biometrics. But new challenges emerge when national AI strategies collide with national interests. -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. 
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum." +Our [earlier research](https://www.ft.com/content/cf19b956-60a2-11e9-b285-3acd5d43599e) on the [MS Celeb](/datasets/msceleb) and [Duke](/datasets/duke_mtmc) datasets published with the Financial Times revealed that several computer vision image datasets created by US companies and universities were unexpectedly also used for research by the National University of Defense Technology in China, along with top Chinese surveillance firms including SenseTime, SenseNets, CloudWalk, Hikvision, and Megvii/Face++ which have all been linked to the oppressive surveillance of Uighur Muslims in Xinjiang. + +In this new research for the [Munich Security Conference's Transnational Security Report](https://tsr.securityconference.de) we provide summary statistics about the origins and endpoints of facial recognition information supply chains. To make it more personal, we gathered additional data on the number of public photos from embassies that are currently being used in facial recognition datasets. + + +### 24 Million Non-Cooperative Faces + +In total, we analyzed 30 publicly available face recognition and face analysis datasets that collectively include over 24 million non-cooperative images. Of these 24 million images, over 15 million face images are from Internet search engines, over 5.8 million from Flickr.com, over 2.5 million from the Internet Movie Database (IMDb.com), and nearly 500,000 from CCTV footage. All 24 million images were collected without any explicit consent, a type of face image that researchers call "in the wild". + +Next we manually verified 1,134 publicly available research papers that cite these datasets to determine who was using the data and where it was being used. 
Even though the vast majority of the images originated in the United States, the publicly available research citations show that only about 25% of citations are from the country of origin, while the majority of citations are from China. -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum." === columns 2 ``` single_pie_chart /site/research/munich_security_conference/assets/megapixels_origins_top.csv -Caption: Sources of Publicly Available Face Training Data 2006 - 2018 +Caption: Sources of Publicly Available Non-Cooperative Face Image Training Data 2006 - 2018 Top: 10 OtherLabel: Other ``` @@ -44,106 +53,73 @@ OtherLabel: Other ``` single_pie_chart /site/research/munich_security_conference/assets/summary_countries.csv -Caption: Locations Where Face Data Is Used +Caption: Locations Where Face Data Is Used Based on Public Research Citations Top: 14 OtherLabel: Other ``` === end columns +### 6,000 Embassy Photos Being Used To Train Facial Recognition -=== columns 2 +Of the 5.8 million Flickr images, we found that over 6,000 public photos from embassy Flickr accounts were used to train facial recognition technologies. These images were used in the MegaFace and IBM Diversity in Faces datasets. Over 2,000 more images were included in the Who Goes There dataset, used for facial ethnicity analysis research. A few of the embassy images found in facial recognition datasets are shown below. 
-#### Sources of Face Data - -Add text - -| Source | Images | -| --- | --- | -|Search Engines | 30,127,200 | -|Flickr.com | 11,783,888 | -|IMDb.com | 5,251,410 | -|CCTV | 959,312 | -|Wikimedia.org | 183,500 | -|Mugshots | 113,268 | -|Other Sources Combined | 37,044 | -|YouTube.com | 31,888 | - -=== +=== columns 2 -#### Locations Where Face Data Is Used +``` +single_pie_chart /site/research/munich_security_conference/assets/country_counts.csv +Caption: Photos from these embassies are being used to train face recognition software +Top: 4 +OtherLabel: Other +Colors: categoryRainbow +``` -Add text +===== -|country | citations| -| --- | --- | -|China | 327| -|United States | 302| -|United Kingdom | 187| -|Australia | 38| -|Germany | 35| -|Singapore | 27| -|Canada | 25| -|Netherlands | 25| -|Italy | 22| -|France | 17| -|India | 14| -|South Korea | 12| -|Spain | 10| -|Switzerland | 9| +``` +single_pie_chart /site/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv +Caption: Embassy images were found in these datasets +Top: 4 +OtherLabel: Other +Colors: categoryRainbow +``` === end columns + + - -## Over 6,000 Embassy Images on Flickr Found in Face Recognition Datasets - -Including over 2,000 more for racial analysis + - +This brief research aims to shed light on the emerging politics of data. A photo is no longer just a photo when it can also be surveillance training data, and datasets can no longer be separated from the development of software when software is now built with data. "Our relationship to computers has changed", says Geoffrey Hinton, one of the founders of modern day neural networks and deep learning. "Instead of programming them, we now show them and they figure it out."[^hinton]. +As data becomes more political, national AI strategies might also want to include transnational dataset strategies. 
-=== columns 2 - - - -==== +*This research post is ongoing and will be updated during July and August, 2019.* - - - -=== end columns +### Further Reading +- [MS Celeb Dataset Analysis](/datasets/msceleb) +- [Brainwash Dataset Analysis](/datasets/brainwash) +- [Duke MTMC Dataset Analysis](/datasets/duke_mtmc) +- [Unconstrained College Students Dataset Analysis](/datasets/uccs) +- [Duke MTMC dataset author apologizes to students](https://www.dukechronicle.com/article/2019/06/duke-university-facial-recognition-data-set-study-surveillance-video-students-china-uyghur) +- [BBC coverage of MS Celeb dataset takedown](https://www.bbc.com/news/technology-48555149) +- [Spiegel coverage of MS Celeb dataset takedown](https://www.spiegel.de/netzwelt/web/microsoft-gesichtserkennung-datenbank-mit-zehn-millionen-fotos-geloescht-a-1271221.html) -=== columns 2 - ``` -single_pie_chart /site/research/munich_security_conference/assets/megapixels_origins_top.csv -Caption: Sources of Face Training Data -Top: 5 -OtherLabel: Other Countries -``` -=========== {% include 'supplementary_header.html' %} ``` -single_pie_chart /site/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv -Caption: Dataset sources -Top: 4 -OtherLabel: Other +load_file /site/research/munich_security_conference/assets/embassy_counts_public.csv +Headings: Images, Dataset, Embassy, Flickr ID, URL, Guest, Host ``` -=== end columns {% include 'cite_our_work.html' %} -{% include 'supplementary_header.html' %} - -[ add a download button for CSV data ] +### Footnotes ``` -load_file /site/research/munich_security_conference/assets/embassy_counts_public.csv -Images, Dataset, Embassy, Flickr ID, URL, Guest, Host -``` +[^hinton]: "Heroes of Deep Learning: Andrew Ng interviews Geoffrey Hinton". Published on Aug 8, 2017. <https://www.youtube.com/watch?v=-eyhCTvrEtE> {% include 'cite_our_work.html' %}
\ No newline at end of file diff --git a/site/content/pages/test/csv.md b/site/content/pages/test/csv.md index 85f714b4..ef3327f8 100644 --- a/site/content/pages/test/csv.md +++ b/site/content/pages/test/csv.md @@ -16,5 +16,5 @@ authors: Megapixels ``` load_file /site/test/assets/test.csv -Name, Images, Year, Gender, Description, URL +Headings: Name, Images, Year, Gender, Description, URL ``` diff --git a/site/public/about/assets/LICENSE/index.html b/site/public/about/assets/LICENSE/index.html index f1e3a9fd..40929e4f 100644 --- a/site/public/about/assets/LICENSE/index.html +++ b/site/public/about/assets/LICENSE/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/about/attribution/index.html b/site/public/about/attribution/index.html index 15270150..4e7474b0 100644 --- a/site/public/about/attribution/index.html +++ b/site/public/about/attribution/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-about"> diff --git a/site/public/about/index.html b/site/public/about/index.html index 16a2e967..a46653c6 100644 --- a/site/public/about/index.html +++ b/site/public/about/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-about"> diff --git a/site/public/about/legal/index.html b/site/public/about/legal/index.html index 49ed926d..8beafeea 100644 --- a/site/public/about/legal/index.html +++ b/site/public/about/legal/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a 
href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-about"> diff --git a/site/public/about/news/index.html b/site/public/about/news/index.html index fcba7877..de44468e 100644 --- a/site/public/about/news/index.html +++ b/site/public/about/news/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-about"> diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html index 3dacd6e1..18600b6f 100644 --- a/site/public/datasets/brainwash/index.html +++ b/site/public/datasets/brainwash/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html index 9a70a3f6..fc141450 100644 --- a/site/public/datasets/duke_mtmc/index.html +++ b/site/public/datasets/duke_mtmc/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/helen/index.html b/site/public/datasets/helen/index.html index a7ada42a..44ef462e 100644 --- a/site/public/datasets/helen/index.html +++ b/site/public/datasets/helen/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/hrt_transgender/index.html b/site/public/datasets/hrt_transgender/index.html index 
02324a2f..2e5e9c62 100644 --- a/site/public/datasets/hrt_transgender/index.html +++ b/site/public/datasets/hrt_transgender/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/ibm_dif/index.html b/site/public/datasets/ibm_dif/index.html index 1c465f93..be5dbfe4 100644 --- a/site/public/datasets/ibm_dif/index.html +++ b/site/public/datasets/ibm_dif/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/ijb_c/index.html b/site/public/datasets/ijb_c/index.html index a36fac14..abe7d5ed 100644 --- a/site/public/datasets/ijb_c/index.html +++ b/site/public/datasets/ijb_c/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html index 1fb83352..a634b877 100644 --- a/site/public/datasets/index.html +++ b/site/public/datasets/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/datasets/megaface/index.html b/site/public/datasets/megaface/index.html index 33abf6c1..712af28a 100644 --- a/site/public/datasets/megaface/index.html +++ b/site/public/datasets/megaface/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a 
href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/msceleb/assets/notes/index.html b/site/public/datasets/msceleb/assets/notes/index.html index cac21eef..36c32429 100644 --- a/site/public/datasets/msceleb/assets/notes/index.html +++ b/site/public/datasets/msceleb/assets/notes/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html index 7109cc9b..42a44571 100644 --- a/site/public/datasets/msceleb/index.html +++ b/site/public/datasets/msceleb/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> @@ -212,8 +212,8 @@ <p>Despite the recent termination of the <a href="https://msceleb.org">msceleb.org</a> website, the dataset still exists in several repositories on GitHub, the hard drives of countless researchers, and will likely continue to be used in research projects around the world.</p> <p>For example, on October 28, 2019, the MS Celeb dataset will be used for a new competition called "<a href="https://ibug.doc.ic.ac.uk/resources/lightweight-face-recognition-challenge-workshop/">Lightweight Face Recognition Challenge & Workshop</a>" where the best face recognition entries will be awarded $5,000 from Huawei and $3,000 from DeepGlint. The competition is part of the <a href="http://iccv2019.thecvf.com/program/workshops">ICCV 2019 conference</a>. This time the challenge is no longer being organized by Microsoft, who created the dataset, but instead by Imperial College London (UK) and <a href="https://github.com/deepinsight/insightface">InsightFace</a> (CN). 
The organizers provide a <a href="https://ibug.doc.ic.ac.uk/resources/lightweight-face-recognition-challenge-workshop/">25GB download of cropped faces</a> from MS Celeb for anyone to download (in .rec format).</p> <p>And in June, shortly after <a href="https://twitter.com/adamhrv/status/1134511293526937600">posting</a> about the disappearance of the MS Celeb dataset, it reemerged on <a href="https://academictorrents.com/details/9e67eb7cc23c9417f39778a8e06cca5e26196a97/tech">Academic Torrents</a>. As of June 10, the MS Celeb dataset files have been redistributed in at least 9 countries and downloaded 44 times without any restrictions. The files were seeded and are mostly distributed by an AI company based in China called Hyper.ai, which states that it redistributes MS Celeb and other datasets for "teachers and students of service industry-related practitioners and research institutes."<a class="footnote_shim" name="[^hyperai_readme]_1"> </a><a href="#[^hyperai_readme]" class="footnote" title="Footnote 6">6</a></p> -<p>Earlier in 2019 images from the MS Celeb were also repackaged into another face dataset called <em>Racial Faces in the Wild (RFW)</em>. To create it, the RFW authors uploaded face images from the MS Celeb dataset to the Face++ API and used the inferred racial scores to segregate people into four subsets: Caucasian, Asian, Indian, and African each with 3,000 subjects. 
That dataset then appeared in a subsequent research project from researchers affiliated with IIIT-Delhi and IBM TJ Watson called <a href="https://arxiv.org/abs/1904.01219">Deep Learning for Face Recognition: Pride or Prejudiced?</a>, which aims to reduce bias but also inadvertently furthers racist language and ideologies that can not be repeated here.</p> -<p>The estimated racial scores for the MS Celeb face images used in the RFW dataset were computed using the Face++ API, which is owned by Megvii Inc, a company that has been repeatedly linked to the oppressive surveillance of Uighur Muslims in Xinjiang, China. According to posts from the <a href="https://chinai.substack.com/p/chinai-newsletter-11-companies-involved-in-expanding-chinas-public-security-apparatus-in-xinjiang">ChinAI Newsletter</a> and <a href="https://www.buzzfeednews.com/article/ryanmac/us-money-funding-facial-recognition-sensetime-megvii">BuzzFeedNews</a>, Megvii announced in 2017 at the China-Eurasia Security Expo in Ürümqi, Xinjiang, that it would be the official technical support unit of the "Public Security Video Laboratory" in Xinjiang, China. If they didn't already, it's highly likely that Megvii has a copy of everyone's biometric faceprint from the MS Celeb dataset, either from uploads to the Face++ API or through the research projects explicitly referencing MS Celeb dataset usage, such as a 2018 paper called <a href="https://arxiv.org/pdf/1808.06210.pdf">GridFace: Face Rectification via Learning Local Homography Transformations</a> jointly published by 3 authors, all of whom worked for Megvii.</p> +<p>Earlier in 2019 images from the MS Celeb were also repackaged into another face dataset called <em>Racial Faces in the Wild (RFW)</em>. To create it, the RFW authors uploaded face images from the MS Celeb dataset to the Face++ API and used the inferred racial scores to segregate people into four subsets: Caucasian, Asian, Indian, and African each with 3,000 subjects. 
That dataset then appeared in a subsequent research project from researchers affiliated with IIIT-Delhi and IBM TJ Watson called <a href="https://arxiv.org/abs/1904.01219">Deep Learning for Face Recognition: Pride or Prejudiced?</a>, which aims to reduce bias but also inadvertently furthers racist ideologies, using discredited racial terminology that cannot be repeated here.</p> +<p>The estimated racial scores for the MS Celeb face images used in the RFW dataset were computed using the Face++ API, which is owned by Megvii Inc, a company that has been repeatedly linked to the oppressive surveillance of Uighur Muslims in Xinjiang, China. According to posts from the <a href="https://chinai.substack.com/p/chinai-newsletter-11-companies-involved-in-expanding-chinas-public-security-apparatus-in-xinjiang">ChinAI Newsletter</a> and <a href="https://www.buzzfeednews.com/article/ryanmac/us-money-funding-facial-recognition-sensetime-megvii">BuzzFeedNews</a>, Megvii announced in 2017 at the China-Eurasia Security Expo in Ürümqi, Xinjiang, that it would be the official technical support unit of the "Public Security Video Laboratory" in Xinjiang, China. If they didn't already, it's highly likely that Megvii has a copy of everyone's biometric faceprint from the MS Celeb dataset, either from uploads to the Face++ API or through research projects explicitly referencing MS Celeb dataset usage, such as a 2018 paper called <a href="https://arxiv.org/pdf/1808.06210.pdf">GridFace: Face Rectification via Learning Local Homography Transformations</a> jointly published by 3 authors, all of whom worked for Megvii.</p> <h2>Commercial Usage</h2> <p>Microsoft's <a href="http://web.archive.org/web/20180218212120/http://www.msceleb.org/download/sampleset">MS Celeb website</a> says it was created for "non-commercial research purpose only." 
Publicly available research citations and competitions show otherwise.</p> <p>In 2017, Microsoft Research organized a face recognition competition at the International Conference on Computer Vision (ICCV), one of the top 2 computer vision conferences worldwide, where industry and academia used the MS Celeb dataset to compete for the highest performance scores. The 2017 winner was Beijing-based OrionStar Technology Co., Ltd. In their <a href="https://www.prnewswire.com/news-releases/orionstar-wins-challenge-to-recognize-one-million-celebrity-faces-with-artificial-intelligence-300494265.html">press release</a>, OrionStar boasted a 13% increase on the difficult set over last year's winner. The prior year's competitors included Beijing-based Faceall Technology Co., Ltd., a company providing face recognition for "smart city" applications.</p> diff --git a/site/public/datasets/oxford_town_centre/index.html b/site/public/datasets/oxford_town_centre/index.html index 40f8bbc6..11fb436f 100644 --- a/site/public/datasets/oxford_town_centre/index.html +++ b/site/public/datasets/oxford_town_centre/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/uccs/assets/notes/index.html b/site/public/datasets/uccs/assets/notes/index.html index c8daf796..ce36f3d9 100644 --- a/site/public/datasets/uccs/assets/notes/index.html +++ b/site/public/datasets/uccs/assets/notes/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html index 96ab1e09..2dcf88a1 100644 --- a/site/public/datasets/uccs/index.html +++ 
b/site/public/datasets/uccs/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/datasets/who_goes_there/index.html b/site/public/datasets/who_goes_there/index.html index 3db77ff7..a00fd151 100644 --- a/site/public/datasets/who_goes_there/index.html +++ b/site/public/datasets/who_goes_there/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-dataset"> diff --git a/site/public/index.html b/site/public/index.html index e5a6cd62..98b780b2 100644 --- a/site/public/index.html +++ b/site/public/index.html @@ -49,7 +49,7 @@ <div class='links'> <a href="/datasets/" class='aboutLink'>DATASETS</a> <a href="/about/" class='aboutLink'>ABOUT</a> - <a href="/about/news" class='updateLink'>News</a> + <a href="/research" class='updateLink'>Research</a> </div> </header> <div class="splash"> diff --git a/site/public/info/index.html b/site/public/info/index.html index f6280e58..51b4e5f8 100644 --- a/site/public/info/index.html +++ b/site/public/info/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/msc b/site/public/msc new file mode 120000 index 00000000..74f15455 --- /dev/null +++ b/site/public/msc @@ -0,0 +1 @@ +research/munich_security_conference/
\ No newline at end of file diff --git a/site/public/research/_from_1_to_100_pixels/index.html b/site/public/research/_from_1_to_100_pixels/index.html new file mode 100644 index 00000000..a978b264 --- /dev/null +++ b/site/public/research/_from_1_to_100_pixels/index.html @@ -0,0 +1,158 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels: From 1 to 100 Pixels</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="High resolution insights from low resolution imagery" /> + <meta property="og:title" content="MegaPixels: From 1 to 100 Pixels"/> + <meta property="og:type" content="website"/> + <meta property="og:summary" content='MegaPixels is an art and research project about face recognition datasets created "in the wild"'/> + <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/research/_from_1_to_100_pixels/assets/intro.jpg" /> + <meta property="og:url" content="https://megapixels.cc/research/_from_1_to_100_pixels/"/> + <meta property="og:site_name" content="MegaPixels" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no"/> + <meta name="apple-mobile-web-app-status-bar-style" content="black"> + <meta name="apple-mobile-web-app-capable" content="yes"> + + <link rel="apple-touch-icon" sizes="57x57" href="/assets/img/favicon/apple-icon-57x57.png"> + <link rel="apple-touch-icon" sizes="60x60" href="/assets/img/favicon/apple-icon-60x60.png"> + <link rel="apple-touch-icon" sizes="72x72" href="/assets/img/favicon/apple-icon-72x72.png"> + <link rel="apple-touch-icon" sizes="76x76" href="/assets/img/favicon/apple-icon-76x76.png"> + <link rel="apple-touch-icon" sizes="114x114" href="/assets/img/favicon/apple-icon-114x114.png"> + <link rel="apple-touch-icon" sizes="120x120" href="/assets/img/favicon/apple-icon-120x120.png"> + <link rel="apple-touch-icon" sizes="144x144" 
href="/assets/img/favicon/apple-icon-144x144.png"> + <link rel="apple-touch-icon" sizes="152x152" href="/assets/img/favicon/apple-icon-152x152.png"> + <link rel="apple-touch-icon" sizes="180x180" href="/assets/img/favicon/apple-icon-180x180.png"> + <link rel="icon" type="image/png" sizes="192x192" href="/assets/img/favicon/android-icon-192x192.png"> + <link rel="icon" type="image/png" sizes="32x32" href="/assets/img/favicon/favicon-32x32.png"> + <link rel="icon" type="image/png" sizes="96x96" href="/assets/img/favicon/favicon-96x96.png"> + <link rel="icon" type="image/png" sizes="16x16" href="/assets/img/favicon/favicon-16x16.png"> + <link rel="manifest" href="/assets/img/favicon/manifest.json"> + <meta name="msapplication-TileColor" content="#ffffff"> + <meta name="msapplication-TileImage" content="/ms-icon-144x144.png"> + <meta name="theme-color" content="#ffffff"> + + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> + <link rel='stylesheet' href='/assets/css/mobile.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/about/">About</a> + <a href="/research">Research</a> + </div> + </header> + <div class="content content-"> + + <section><h1>From 1 to 100 Pixels</h1> +<h3>High resolution insights from low resolution data</h3> +<p>This post will be about the meaning of "face". How do people define it? How do biometrics researchers define it? 
How has it changed during the last decade?</p> +<p>What can you know from a very small amount of information?</p> +<ul> +<li>1 pixel grayscale</li> +<li>2x2 pixels grayscale, font example, can encode letters</li> +<li>3x3 pixels: can create a font</li> +<li>4x4 pixels: how many variations</li> +<li>8x8 yotta yotta, many more variations</li> +<li>5x7 face recognition </li> +<li>12x16 activity recognition</li> +<li>6/5 (up to 124/106) pixels in height/width, and the average is 24/20 for QMUL SurvFace</li> +<li>(prepare a Progan render of the QMUL dataset and TinyFaces)</li> +<li>20x16 tiny faces paper</li> +<li>20x20 MNIST handwritten images <a href="http://yann.lecun.com/exdb/mnist/">http://yann.lecun.com/exdb/mnist/</a></li> +<li>24x24 haarcascade detector idealized images</li> +<li>32x32 CIFAR image dataset</li> +<li>40x40 can do emotion detection, face recognition at scale, 3d modeling of the face. include datasets with faces at this resolution including pedestrian.</li> +<li>NIST standards begin to appear from 40x40, distinguish ocular pixels</li> +<li>need more material from 60-100</li> +<li>60x60 show how texture emerges and pupils, eye color, higher resolution of features and compare to lower resolution faces</li> +<li>100x100 all you need for medical diagnosis</li> +<li>100x100 0.5% of one Instagram photo</li> +</ul> +<p>Notes:</p> +<ul> +<li>Google FaceNet used images with (face?) sizes: Input sizes range from 96x96 pixels to 224x224 pixels in our experiments. 
FaceNet: A Unified Embedding for Face Recognition and Clustering <a href="https://arxiv.org/pdf/1503.03832.pdf">https://arxiv.org/pdf/1503.03832.pdf</a></li> +</ul> +<p>Ideas:</p> +<ul> +<li>Find specific cases of facial resolution being used in legal cases, forensic investigations, or military footage</li> +<li>resolution of boston bomber face</li> +<li>resolution of the state of the union image</li> +</ul> +<h3>Research</h3> +<ul> +<li>NIST report on sres states several resolutions</li> +<li>"Results show that the tested face recognition systems yielded similar performance for query sets with eye-to-eye distance from 60 pixels to 30 pixels" <sup class="footnote-ref" id="fnref-nist_sres"><a href="#fn-nist_sres">1</a></sup></li> +</ul> +<ul> +<li>"Note that we only keep the images with a minimal side length of 80 pixels." and "a face will be labeled as “Ignore” if it is very difficult to be detected due to blurring, severe deformation and unrecognizable eyes, or the side length of its bounding box is less than 32 pixels." Ge_Detecting_Masked_Faces_CVPR_2017_paper.pdf </li> +<li>IBM DiF: "Faces with region size less than 50x50 or inter-ocular distance of less than 30 pixels were discarded. Faces with non-frontal pose, or anything beyond being slightly tilted to the left or the right, were also discarded."</li> +</ul> +<p>As the resolution +formatted as rectangular databases of 16 bit RGB-tuples or 8 bit grayscale values</p> +<p>To consider how visual privacy applies to real world surveillance situations, the first</p> +<p>A single 8-bit grayscale pixel with 256 values is enough to represent the entire alphabet <code>a-Z0-9</code> with room to spare.</p> +<p>A 2x2 pixels contains</p> +<p>Using no more than a 42 pixel (6x7 image) face image researchers [cite] were able to correctly distinguish between a group of 50 people. Yet</p> +<p>The likely outcome of face recognition research is that more data is needed to improve. 
Indeed, resolution is the determining factor for all biometric systems, both as training data to increase</p> +<p>Pixels, typically considered the building blocks of images and videos, can also be plotted as a graph of sensor values corresponding to the intensity of RGB-calibrated sensors.</p> +<p>Wi-Fi and cameras present elevated risks for transmitting videos and image documentation from conflict zones, high-risk situations, or even sharing on social media. How can new developments in computer vision also be used in reverse, as a counter-forensic tool, to minimize an individual's privacy risk?</p> +<p>As the global Internet becomes increasingly efficient at turning itself into a giant dataset for machine learning, forensics, and data analysis, it would be prudent to also consider tools for decreasing the resolution. The Visual Defense module is just that. What are new ways to minimize the adverse effects of surveillance by dulling the blade? For example, a research paper showed that by decreasing a face size to 12x16 it was possible to achieve 98% accuracy with 50 people. This is clearly an example of</p> +<p>This research module, tentatively called Visual Defense Tools, aims to explore the</p> +<h3>Prior Research</h3> +<ul> +<li>MPI visual privacy advisor</li> +<li>NIST: super resolution</li> +<li>YouTube blur tool</li> +<li>WITNESS: blur tool</li> +<li>Pixellated text </li> +<li>CV Dazzle</li> +<li>Bellingcat guide to geolocation</li> +<li>Peng! magic passport</li> +</ul> +<h3>Notes</h3> +<ul> +<li>In China, out of the approximately 200 million surveillance cameras, only about 15% have enough resolution for face recognition. </li> +<li>In Apple's FaceID security guide, the probability of someone else's face unlocking your phone is 1 out of 1,000,000. </li> +<li>In England, the Metropolitan Police reported a false-positive match rate of 98% when attempting to use face recognition to locate wanted criminals. 
</li> +<li>In a face recognition trial at Berlin's Südkreuz station, the false-match rate was 20%. </li> +</ul> +<p>What these examples illustrate is that face recognition is anything but absolute. In a 2017 talk, Jason Matheny, the former director of IARPA, admitted that face recognition is so brittle it can be subverted by using a magic marker and drawing "a few dots on your forehead". In fact, face recognition is a misleading term. Face recognition is a search engine for faces that can only ever show you the most likely match. This presents a real threat to privacy and lends</p> +<p>Globally, iPhone users unwittingly agree to a 1/1,000,000 probability +relying on FaceID and TouchID to protect their information agree to a</p> +<div class="footnotes"> +<hr> +<ol><li id="fn-nist_sres"><p>NIST 906932. Performance Assessment of Face Recognition Using Super-Resolution. Shuowen Hu, Robert Maschal, S. Susan Young, Tsai Hong Hong, Jonathon P. Phillips<a href="#fnref-nist_sres" class="footnote">↩</a></p></li> +</ol> +</div> +</section> + + </div> + <footer> + <ul class="footer-left"> + <li><a href="/">MegaPixels.cc</a></li> + <li><a href="/datasets/">Datasets</a></li> + <li><a href="/about/">About</a></li> + <li><a href="/about/news/">News</a></li> + <li><a href="/about/legal/">Legal & Privacy</a></li> + </ul> + <ul class="footer-right"> + <li>MegaPixels ©2017-19 <a href="https://ahprojects.com">Adam R. Harvey</a></li> + <li>Made with support from <a href="https://mozilla.org">Mozilla</a></li> + </ul> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
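The resolution list in the draft above leans on a few pieces of arithmetic: one 8-bit pixel can index the 62 characters a-Z0-9, a binary 4x4 image already has tens of thousands of variations, and a 100x100 crop is roughly half a percent of one photo. A minimal sketch of that arithmetic (illustrative only; the 1080x1920 frame size is an assumption, not a figure from the draft):

```python
# Back-of-the-envelope arithmetic for the "1 to 100 pixels" list:
# how many distinct images exist at a given size and bit depth,
# and how small a 100x100 crop is relative to a full camera frame.

def image_space(width, height, levels=256):
    """Number of distinct images of width x height with `levels` values per pixel."""
    return levels ** (width * height)

# One 8-bit grayscale pixel has 256 states, enough to index the
# 62 characters a-z, A-Z, 0-9 with room to spare.
assert image_space(1, 1) == 256 >= 62

# A binary 4x4 image has 2^16 = 65,536 variations; grayscale and
# larger sizes grow astronomically from there.
assert image_space(4, 4, levels=2) == 65536

# A 100x100 face crop relative to an assumed 1080x1920 frame:
fraction = (100 * 100) / (1080 * 1920)
print(f"100x100 is {fraction:.2%} of a 1080x1920 frame")  # 100x100 is 0.48% of a 1080x1920 frame
```

The "0.5% of one Instagram photo" item matches the 1080x1920 assumption; at the more common 1080x1350 photo size the crop would be closer to 0.69%.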
\ No newline at end of file diff --git a/site/public/research/_introduction/index.html b/site/public/research/_introduction/index.html new file mode 100644 index 00000000..8b17c016 --- /dev/null +++ b/site/public/research/_introduction/index.html @@ -0,0 +1,92 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels: Introducing MegaPixels</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="Introduction to Megapixels" /> + <meta property="og:title" content="MegaPixels: Introducing MegaPixels"/> + <meta property="og:type" content="website"/> + <meta property="og:summary" content='MegaPixels is an art and research project about face recognition datasets created "in the wild"'/> + <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/background.jpg" /> + <meta property="og:url" content="https://megapixels.cc/research/_introduction/"/> + <meta property="og:site_name" content="MegaPixels" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no"/> + <meta name="apple-mobile-web-app-status-bar-style" content="black"> + <meta name="apple-mobile-web-app-capable" content="yes"> + + <link rel="apple-touch-icon" sizes="57x57" href="/assets/img/favicon/apple-icon-57x57.png"> + <link rel="apple-touch-icon" sizes="60x60" href="/assets/img/favicon/apple-icon-60x60.png"> + <link rel="apple-touch-icon" sizes="72x72" href="/assets/img/favicon/apple-icon-72x72.png"> + <link rel="apple-touch-icon" sizes="76x76" href="/assets/img/favicon/apple-icon-76x76.png"> + <link rel="apple-touch-icon" sizes="114x114" href="/assets/img/favicon/apple-icon-114x114.png"> + <link rel="apple-touch-icon" sizes="120x120" href="/assets/img/favicon/apple-icon-120x120.png"> + <link rel="apple-touch-icon" sizes="144x144" href="/assets/img/favicon/apple-icon-144x144.png"> + <link rel="apple-touch-icon" 
sizes="152x152" href="/assets/img/favicon/apple-icon-152x152.png"> + <link rel="apple-touch-icon" sizes="180x180" href="/assets/img/favicon/apple-icon-180x180.png"> + <link rel="icon" type="image/png" sizes="192x192" href="/assets/img/favicon/android-icon-192x192.png"> + <link rel="icon" type="image/png" sizes="32x32" href="/assets/img/favicon/favicon-32x32.png"> + <link rel="icon" type="image/png" sizes="96x96" href="/assets/img/favicon/favicon-96x96.png"> + <link rel="icon" type="image/png" sizes="16x16" href="/assets/img/favicon/favicon-16x16.png"> + <link rel="manifest" href="/assets/img/favicon/manifest.json"> + <meta name="msapplication-TileColor" content="#ffffff"> + <meta name="msapplication-TileImage" content="/ms-icon-144x144.png"> + <meta name="theme-color" content="#ffffff"> + + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> + <link rel='stylesheet' href='/assets/css/mobile.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/about/">About</a> + <a href="/research">Research</a> + </div> + </header> + <div class="content content-dataset"> + + <section><h1>Introduction</h1> +<p>Face recognition has become the focal point for ...</p> +<p>Add 68pt landmarks animation</p> +<p>But biometric currency is ...</p> +<p>Add rotation 3D head</p> +<p>Inflationary...</p> +<p>Add Theresa May 3D</p> +<p>(commission for CPDP)</p> +<p>Add info from the AI Traps talk</p> +<ul> +<li>Posted: Dec. 
15</li> +<li>Author: Adam Harvey</li> +</ul> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file /site/research/00_introduction/assets/summary_countries_top.csv", "fields": ["Headings: country, Xcitations"]}'></div></section><section><p>Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting.</p> +<p>[ page under development ]</p> +</section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/research/_introduction/assets/test.png' alt=' This is the caption'><div class='caption'> This is the caption</div></div></section> + + </div> + <footer> + <ul class="footer-left"> + <li><a href="/">MegaPixels.cc</a></li> + <li><a href="/datasets/">Datasets</a></li> + <li><a href="/about/">About</a></li> + <li><a href="/about/news/">News</a></li> + <li><a href="/about/legal/">Legal & Privacy</a></li> + </ul> + <ul class="footer-right"> + <li>MegaPixels ©2017-19 <a href="https://ahprojects.com">Adam R. Harvey</a></li> + <li>Made with support from <a href="https://mozilla.org">Mozilla</a></li> + </ul> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/research/_what_computers_can_see/index.html b/site/public/research/_what_computers_can_see/index.html new file mode 100644 index 00000000..35f6d47d --- /dev/null +++ b/site/public/research/_what_computers_can_see/index.html @@ -0,0 +1,343 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels: What Computers Can See</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="What Computers Can See" /> + <meta property="og:title" content="MegaPixels: What Computers Can See"/> + <meta property="og:type" content="website"/> + <meta property="og:summary" content='MegaPixels is an art and research project about face recognition datasets created "in the wild"'/> + <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/background.jpg" /> + <meta property="og:url" content="https://megapixels.cc/research/_what_computers_can_see/"/> + <meta property="og:site_name" content="MegaPixels" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no"/> + <meta name="apple-mobile-web-app-status-bar-style" content="black"> + <meta name="apple-mobile-web-app-capable" content="yes"> + + <link rel="apple-touch-icon" sizes="57x57" href="/assets/img/favicon/apple-icon-57x57.png"> + <link rel="apple-touch-icon" sizes="60x60" href="/assets/img/favicon/apple-icon-60x60.png"> + <link rel="apple-touch-icon" sizes="72x72" href="/assets/img/favicon/apple-icon-72x72.png"> + <link rel="apple-touch-icon" sizes="76x76" href="/assets/img/favicon/apple-icon-76x76.png"> + <link rel="apple-touch-icon" sizes="114x114" href="/assets/img/favicon/apple-icon-114x114.png"> + <link rel="apple-touch-icon" sizes="120x120" href="/assets/img/favicon/apple-icon-120x120.png"> + <link rel="apple-touch-icon" sizes="144x144" href="/assets/img/favicon/apple-icon-144x144.png"> 
+ <link rel="apple-touch-icon" sizes="152x152" href="/assets/img/favicon/apple-icon-152x152.png"> + <link rel="apple-touch-icon" sizes="180x180" href="/assets/img/favicon/apple-icon-180x180.png"> + <link rel="icon" type="image/png" sizes="192x192" href="/assets/img/favicon/android-icon-192x192.png"> + <link rel="icon" type="image/png" sizes="32x32" href="/assets/img/favicon/favicon-32x32.png"> + <link rel="icon" type="image/png" sizes="96x96" href="/assets/img/favicon/favicon-96x96.png"> + <link rel="icon" type="image/png" sizes="16x16" href="/assets/img/favicon/favicon-16x16.png"> + <link rel="manifest" href="/assets/img/favicon/manifest.json"> + <meta name="msapplication-TileColor" content="#ffffff"> + <meta name="msapplication-TileImage" content="/ms-icon-144x144.png"> + <meta name="theme-color" content="#ffffff"> + + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> + <link rel='stylesheet' href='/assets/css/mobile.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/about/">About</a> + <a href="/research">Research</a> + </div> + </header> + <div class="content content-"> + + <section><h1>What Computers Can See About Your Face</h1> +<p>Rosalind Picard on Affective Computing Podcast with Lex Fridman</p> +<ul> +<li>we can read with an ordinary camera on your phone, from a neutral face if</li> +<li>your heart is racing</li> +<li>if your breathing is becoming irregular and showing signs of stress</li> +<li>how your heart rate variability power is changing even when your heart is not necessarily accelerating</li> +<li>we can tell things about your stress even if you have a blank face</li> +</ul> +<p>in emotion studies</p> +<ul> 
+<li>when participants use a smartphone and multiple data types are collected to understand patterns of life, researchers can predict tomorrow's mood</li> +<li>get best results </li> +<li>better than 80% accurate at predicting tomorrow's mood levels</li> +</ul> +<p>A list of 100 things computer vision can see, e.g.:</p> +<ul> +<li>age, race, gender, ancestral origin, body mass index</li> +<li>eye color, hair color, facial hair, glasses</li> +<li>beauty score</li> +<li>intelligence</li> +<li>what you're looking at</li> +<li>medical conditions</li> +<li>tired, drowsiness in car</li> +<li>affectiva: interest in product, intent to buy</li> +</ul> +<h2>From SenseTime paper</h2> +<p>Exploring Disentangled Feature Representation Beyond Face Identification</p> +<p>From <a href="https://arxiv.org/pdf/1804.03487.pdf">https://arxiv.org/pdf/1804.03487.pdf</a> +The attribute IDs from 1 to 40 correspond to: ‘5 o Clock Shadow’, ‘Arched Eyebrows’, ‘Attractive’, ‘Bags Under Eyes’, ‘Bald’, ‘Bangs’, ‘Big Lips’, ‘Big Nose’, ‘Black Hair’, ‘Blond Hair’, ‘Blurry’, ‘Brown Hair’, ‘Bushy Eyebrows’, ‘Chubby’, ‘Double Chin’, ‘Eyeglasses’, ‘Goatee’, ‘Gray Hair’, ‘Heavy Makeup’, ‘High Cheekbones’, ‘Male’, ‘Mouth Slightly Open’, ‘Mustache’, ‘Narrow Eyes’, ‘No Beard’, ‘Oval Face’, ‘Pale Skin’, ‘Pointy Nose’, ‘Receding Hairline’, ‘Rosy Cheeks’, ‘Sideburns’, ‘Smiling’, ‘Straight Hair’, ‘Wavy Hair’, ‘Wearing Earrings’, ‘Wearing Hat’, ‘Wearing Lipstick’, ‘Wearing Necklace’, ‘Wearing Necktie’ and ‘Young’. 
</p> +<h2>From PubFig Dataset</h2> +<ul> +<li>Male</li> +<li>Asian</li> +<li>White</li> +<li>Black</li> +<li>Baby</li> +<li>Child</li> +<li>Youth</li> +<li>Middle Aged</li> +<li>Senior</li> +<li>Black Hair</li> +<li>Blond Hair</li> +<li>Brown Hair</li> +<li>Bald</li> +<li>No Eyewear</li> +<li>Eyeglasses</li> +<li>Sunglasses</li> +<li>Mustache</li> +<li>Smiling</li> +<li>Frowning</li> +<li>Chubby</li> +<li>Blurry</li> +<li>Harsh Lighting</li> +<li>Flash</li> +<li>Soft Lighting</li> +<li>Outdoor</li> +<li>Curly Hair</li> +<li>Wavy Hair</li> +<li>Straight Hair</li> +<li>Receding Hairline</li> +<li>Bangs</li> +<li>Sideburns</li> +<li>Fully Visible Forehead</li> +<li>Partially Visible Forehead</li> +<li>Obstructed Forehead</li> +<li>Bushy Eyebrows</li> +<li>Arched Eyebrows</li> +<li>Narrow Eyes</li> +<li>Eyes Open</li> +<li>Big Nose</li> +<li>Pointy Nose</li> +<li>Big Lips</li> +<li>Mouth Closed</li> +<li>Mouth Slightly Open</li> +<li>Mouth Wide Open</li> +<li>Teeth Not Visible</li> +<li>No Beard</li> +<li>Goatee</li> +<li>Round Jaw</li> +<li>Double Chin</li> +<li>Wearing Hat</li> +<li>Oval Face</li> +<li>Square Face</li> +<li>Round Face</li> +<li>Color Photo</li> +<li>Posed Photo</li> +<li>Attractive Man</li> +<li>Attractive Woman</li> +<li>Indian</li> +<li>Gray Hair</li> +<li>Bags Under Eyes</li> +<li>Heavy Makeup</li> +<li>Rosy Cheeks</li> +<li>Shiny Skin</li> +<li>Pale Skin</li> +<li>5 o' Clock Shadow</li> +<li>Strong Nose-Mouth Lines</li> +<li>Wearing Lipstick</li> +<li>Flushed Face</li> +<li>High Cheekbones</li> +<li>Brown Eyes</li> +<li>Wearing Earrings</li> +<li>Wearing Necktie</li> +<li>Wearing Necklace</li> +</ul> +<p>for i in {1..9};do wget <a href="http://visiond1.cs.umbc.edu/webpage/codedata/ADLdataset/ADL_videos/P_0$i.MP4;done;for">http://visiond1.cs.umbc.edu/webpage/codedata/ADLdataset/ADL_videos/P_0$i.MP4;done;for</a> i in {10..20}; do wget <a 
href="http://visiond1.cs.umbc.edu/webpage/codedata/ADLdataset/ADL_videos/P_$i.MP4;done">http://visiond1.cs.umbc.edu/webpage/codedata/ADLdataset/ADL_videos/P_$i.MP4;done</a></p> +<h2>From Market 1501</h2> +<p>The 27 attributes are:</p> +<table> +<thead><tr> +<th style="text-align:center">attribute</th> +<th style="text-align:center">representation in file</th> +<th style="text-align:center">label</th> +</tr> +</thead> +<tbody> +<tr> +<td style="text-align:center">gender</td> +<td style="text-align:center">gender</td> +<td style="text-align:center">male(1), female(2)</td> +</tr> +<tr> +<td style="text-align:center">hair length</td> +<td style="text-align:center">hair</td> +<td style="text-align:center">short hair(1), long hair(2)</td> +</tr> +<tr> +<td style="text-align:center">sleeve length</td> +<td style="text-align:center">up</td> +<td style="text-align:center">long sleeve(1), short sleeve(2)</td> +</tr> +<tr> +<td style="text-align:center">length of lower-body clothing</td> +<td style="text-align:center">down</td> +<td style="text-align:center">long lower body clothing(1), short(2)</td> +</tr> +<tr> +<td style="text-align:center">type of lower-body clothing</td> +<td style="text-align:center">clothes</td> +<td style="text-align:center">dress(1), pants(2)</td> +</tr> +<tr> +<td style="text-align:center">wearing hat</td> +<td style="text-align:center">hat</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying backpack</td> +<td style="text-align:center">backpack</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying bag</td> +<td style="text-align:center">bag</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying handbag</td> +<td style="text-align:center">handbag</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">age</td> +<td 
style="text-align:center">age</td> +<td style="text-align:center">young(1), teenager(2), adult(3), old(4)</td> +</tr> +<tr> +<td style="text-align:center">8 color of upper-body clothing</td> +<td style="text-align:center">upblack, upwhite, upred, uppurple, upyellow, upgray, upblue, upgreen</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">9 color of lower-body clothing</td> +<td style="text-align:center">downblack, downwhite, downpink, downpurple, downyellow, downgray, downblue, downgreen,downbrown</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +</tbody> +</table> +<p>source: <a href="https://github.com/vana77/Market-1501_Attribute/blob/master/README.md">https://github.com/vana77/Market-1501_Attribute/blob/master/README.md</a></p> +<h2>From DukeMTMC</h2> +<p>The 23 attributes are:</p> +<table> +<thead><tr> +<th style="text-align:center">attribute</th> +<th style="text-align:center">representation in file</th> +<th style="text-align:center">label</th> +</tr> +</thead> +<tbody> +<tr> +<td style="text-align:center">gender</td> +<td style="text-align:center">gender</td> +<td style="text-align:center">male(1), female(2)</td> +</tr> +<tr> +<td style="text-align:center">length of upper-body clothing</td> +<td style="text-align:center">top</td> +<td style="text-align:center">short upper body clothing(1), long(2)</td> +</tr> +<tr> +<td style="text-align:center">wearing boots</td> +<td style="text-align:center">boots</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">wearing hat</td> +<td style="text-align:center">hat</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying backpack</td> +<td style="text-align:center">backpack</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying bag</td> +<td style="text-align:center">bag</td> +<td 
style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">carrying handbag</td> +<td style="text-align:center">handbag</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">color of shoes</td> +<td style="text-align:center">shoes</td> +<td style="text-align:center">dark(1), light(2)</td> +</tr> +<tr> +<td style="text-align:center">8 color of upper-body clothing</td> +<td style="text-align:center">upblack, upwhite, upred, uppurple, upgray, upblue, upgreen, upbrown</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +<tr> +<td style="text-align:center">7 color of lower-body clothing</td> +<td style="text-align:center">downblack, downwhite, downred, downgray, downblue, downgreen, downbrown</td> +<td style="text-align:center">no(1), yes(2)</td> +</tr> +</tbody> +</table> +<p>source: <a href="https://github.com/vana77/DukeMTMC-attribute/blob/master/README.md">https://github.com/vana77/DukeMTMC-attribute/blob/master/README.md</a></p> +<h2>From H3D Dataset</h2> +<p>The joints and other keypoints (eyes, ears, nose, shoulders, elbows, wrists, hips, knees and ankles) +The 3D pose inferred from the keypoints. 
+Visibility boolean for each keypoint +Region annotations (upper clothes, lower clothes, dress, socks, shoes, hands, gloves, neck, face, hair, hat, sunglasses, bag, occluder) +Body type (male, female or child)</p> +<p>source: <a href="https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/h3d/">https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/h3d/</a></p> +<h2>From Leeds Sports Pose</h2> +<p>Right ankle +Right knee +Right hip +Left hip +Left knee +Left ankle +Right wrist +Right elbow +Right shoulder +Left shoulder +Left elbow +Left wrist +Neck +Head top</p> +<p>source: <a href="http://web.archive.org/web/20170915023005/sam.johnson.io/research/lsp.html">http://web.archive.org/web/20170915023005/sam.johnson.io/research/lsp.html</a></p> +</section> + + </div> + <footer> + <ul class="footer-left"> + <li><a href="/">MegaPixels.cc</a></li> + <li><a href="/datasets/">Datasets</a></li> + <li><a href="/about/">About</a></li> + <li><a href="/about/news/">News</a></li> + <li><a href="/about/legal/">Legal & Privacy</a></li> + </ul> + <ul class="footer-right"> + <li>MegaPixels ©2017-19 <a href="https://ahprojects.com">Adam R. Harvey</a></li> + <li>Made with support from <a href="https://mozilla.org">Mozilla</a></li> + </ul> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
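The Market-1501 and DukeMTMC attribute tables quoted above share one numeric convention: binary attributes are coded no(1)/yes(2), gender is male(1)/female(2), and Market-1501's age attribute takes values 1-4. A minimal decoder for that convention might look like this (a sketch based on the README excerpts above, not code from either project; the `decode` helper and the sample dict are hypothetical):

```python
# Decode the 1/2-coded pedestrian attributes used by the Market-1501
# and DukeMTMC annotation files described above.
# Convention: binary attributes no(1)/yes(2); gender male(1)/female(2);
# age (Market-1501 only) young(1)/teenager(2)/adult(3)/old(4).

BINARY = {1: False, 2: True}
GENDER = {1: "male", 2: "female"}
AGE = {1: "young", 2: "teenager", 3: "adult", 4: "old"}

def decode(raw):
    """Map raw attribute codes (dict of name -> int) to readable values."""
    out = {}
    for name, code in raw.items():
        if name == "gender":
            out[name] = GENDER[code]
        elif name == "age":
            out[name] = AGE[code]
        else:
            out[name] = BINARY[code]  # e.g. backpack, hat, upred, downblue
    return out

sample = {"gender": 2, "age": 3, "backpack": 1, "hat": 2, "upred": 2}
print(decode(sample))
# {'gender': 'female', 'age': 'adult', 'backpack': False, 'hat': True, 'upred': True}
```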
\ No newline at end of file diff --git a/site/public/research/index.html b/site/public/research/index.html new file mode 100644 index 00000000..f4f90531 --- /dev/null +++ b/site/public/research/index.html @@ -0,0 +1,87 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels: Research</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="Research blog" /> + <meta property="og:title" content="MegaPixels: Research"/> + <meta property="og:type" content="website"/> + <meta property="og:summary" content="MegaPixels is an art and research project about face recognition datasets created &quot;in the wild&quot;" /> + <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/msceleb/assets/background.jpg" /> + <meta property="og:url" content="https://megapixels.cc/research/"/> + <meta property="og:site_name" content="MegaPixels" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no"/> + <meta name="apple-mobile-web-app-status-bar-style" content="black"> + <meta name="apple-mobile-web-app-capable" content="yes"> + + <link rel="apple-touch-icon" sizes="57x57" href="/assets/img/favicon/apple-icon-57x57.png"> + <link rel="apple-touch-icon" sizes="60x60" href="/assets/img/favicon/apple-icon-60x60.png"> + <link rel="apple-touch-icon" sizes="72x72" href="/assets/img/favicon/apple-icon-72x72.png"> + <link rel="apple-touch-icon" sizes="76x76" href="/assets/img/favicon/apple-icon-76x76.png"> + <link rel="apple-touch-icon" sizes="114x114" href="/assets/img/favicon/apple-icon-114x114.png"> + <link rel="apple-touch-icon" sizes="120x120" href="/assets/img/favicon/apple-icon-120x120.png"> + <link rel="apple-touch-icon" sizes="144x144" href="/assets/img/favicon/apple-icon-144x144.png"> + <link rel="apple-touch-icon" sizes="152x152" href="/assets/img/favicon/apple-icon-152x152.png"> + <link rel="apple-touch-icon"
sizes="180x180" href="/assets/img/favicon/apple-icon-180x180.png"> + <link rel="icon" type="image/png" sizes="192x192" href="/assets/img/favicon/android-icon-192x192.png"> + <link rel="icon" type="image/png" sizes="32x32" href="/assets/img/favicon/favicon-32x32.png"> + <link rel="icon" type="image/png" sizes="96x96" href="/assets/img/favicon/favicon-96x96.png"> + <link rel="icon" type="image/png" sizes="16x16" href="/assets/img/favicon/favicon-16x16.png"> + <link rel="manifest" href="/assets/img/favicon/manifest.json"> + <meta name="msapplication-TileColor" content="#ffffff"> + <meta name="msapplication-TileImage" content="/ms-icon-144x144.png"> + <meta name="theme-color" content="#ffffff"> + + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> + <link rel='stylesheet' href='/assets/css/mobile.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/about/">About</a> + <a href="/research">Research</a> + </div> + </header> + <div class="content content-"> + + <section><h1>Research Blog</h1> +</section><div class='research_index'> + <a href='/research/munich_security_conference/'><section class='wide' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/site/research/munich_security_conference/assets/background.jpg);'> + <section> + <h4><span class='bgpad'>28 June 2019</span></h4> + <h2><span class='bgpad'>Analyzing Transnational Flows of Face Recognition Image Training Data</span></h2> + <h3><span class='bgpad'>Where does face data originate and who's using it?</span></h3> + <h4 class='readmore'><span class='bgpad'>Read more...</span></h4> + </section> + </section></a> + </div> + + </div> + <footer> + <ul
class="footer-left"> + <li><a href="/">MegaPixels.cc</a></li> + <li><a href="/datasets/">Datasets</a></li> + <li><a href="/about/">About</a></li> + <li><a href="/about/news/">News</a></li> + <li><a href="/about/legal/">Legal & Privacy</a></li> + </ul> + <ul class="footer-right"> + <li>MegaPixels ©2017-19 <a href="https://ahprojects.com">Adam R. Harvey</a></li> + <li>Made with support from <a href="https://mozilla.org">Mozilla</a></li> + </ul> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/research/munich_security_conference/index.html b/site/public/research/munich_security_conference/index.html new file mode 100644 index 00000000..0b625f53 --- /dev/null +++ b/site/public/research/munich_security_conference/index.html @@ -0,0 +1,128 @@ +<!doctype html> +<html> +<head> + <title>MegaPixels: Transnational Flows of Face Recognition Image Training Data</title> + <meta charset="utf-8" /> + <meta name="author" content="Adam Harvey" /> + <meta name="description" content="Analyzing Transnational Flows of Face Recognition Image Training Data" /> + <meta property="og:title" content="MegaPixels: Transnational Flows of Face Recognition Image Training Data"/> + <meta property="og:type" content="website"/> + <meta property="og:summary" content="MegaPixels is an art and research project about face recognition datasets created &quot;in the wild&quot;" /> + <meta property="og:image" content="https://nyc3.digitaloceanspaces.com/megapixels/v1/research/munich_security_conference/assets/background.jpg" /> + <meta property="og:url" content="https://megapixels.cc/research/munich_security_conference/"/> + <meta property="og:site_name" content="MegaPixels" /> + <meta name="referrer" content="no-referrer" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no"/> + <meta name="apple-mobile-web-app-status-bar-style" content="black"> + <meta name="apple-mobile-web-app-capable" content="yes"> + + <link rel="apple-touch-icon" sizes="57x57" href="/assets/img/favicon/apple-icon-57x57.png"> + <link rel="apple-touch-icon" sizes="60x60" href="/assets/img/favicon/apple-icon-60x60.png"> + <link rel="apple-touch-icon" sizes="72x72" href="/assets/img/favicon/apple-icon-72x72.png"> + <link rel="apple-touch-icon" sizes="76x76" href="/assets/img/favicon/apple-icon-76x76.png"> + <link rel="apple-touch-icon" sizes="114x114" href="/assets/img/favicon/apple-icon-114x114.png"> + <link rel="apple-touch-icon"
sizes="120x120" href="/assets/img/favicon/apple-icon-120x120.png"> + <link rel="apple-touch-icon" sizes="144x144" href="/assets/img/favicon/apple-icon-144x144.png"> + <link rel="apple-touch-icon" sizes="152x152" href="/assets/img/favicon/apple-icon-152x152.png"> + <link rel="apple-touch-icon" sizes="180x180" href="/assets/img/favicon/apple-icon-180x180.png"> + <link rel="icon" type="image/png" sizes="192x192" href="/assets/img/favicon/android-icon-192x192.png"> + <link rel="icon" type="image/png" sizes="32x32" href="/assets/img/favicon/favicon-32x32.png"> + <link rel="icon" type="image/png" sizes="96x96" href="/assets/img/favicon/favicon-96x96.png"> + <link rel="icon" type="image/png" sizes="16x16" href="/assets/img/favicon/favicon-16x16.png"> + <link rel="manifest" href="/assets/img/favicon/manifest.json"> + <meta name="msapplication-TileColor" content="#ffffff"> + <meta name="msapplication-TileImage" content="/ms-icon-144x144.png"> + <meta name="theme-color" content="#ffffff"> + + <link rel='stylesheet' href='/assets/css/fonts.css' /> + <link rel='stylesheet' href='/assets/css/css.css' /> + <link rel='stylesheet' href='/assets/css/leaflet.css' /> + <link rel='stylesheet' href='/assets/css/applets.css' /> + <link rel='stylesheet' href='/assets/css/mobile.css' /> +</head> +<body> + <header> + <a class='slogan' href="/"> + <div class='logo'></div> + <div class='site_name'>MegaPixels</div> + + </a> + <div class='links'> + <a href="/datasets/">Datasets</a> + <a href="/about/">About</a> + <a href="/research">Research</a> + </div> + </header> + <div class="content content-dataset"> + + <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/site/research/munich_security_conference/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'>Analyzing Transnational Flows of Face Recognition Image Training Data</span></div><div class='hero_subdesc'><span class='bgpad'>Where does face data 
originate and who's using it? +</span></div></div></section><section><h2>Face Datasets and Information Supply Chains</h2> +</section><section><div class='right-sidebar'><div class='meta'><div class='gray'>Images Analyzed</div><div>24,302,637</div></div><div class='meta'><div class='gray'>Datasets Analyzed</div><div>30</div></div><div class='meta'><div class='gray'>Years</div><div>2006 - 2018</div></div><div class='meta'><div class='gray'>Status</div><div>Ongoing Investigation</div></div><div class='meta'><div class='gray'>Last Updated</div><div>June 28, 2019</div></div></div><p>National AI strategies often rely on transnational data sources to capitalize on recent advancements in deep learning and neural networks. Researchers benefiting from these transnational data flows can realize quick and significant gains across diverse sectors, from health care to biometrics. But new challenges emerge when national AI strategies collide with national interests.</p> +<p>Our <a href="https://www.ft.com/content/cf19b956-60a2-11e9-b285-3acd5d43599e">earlier research</a> on the <a href="/datasets/msceleb">MS Celeb</a> and <a href="/datasets/duke_mtmc">Duke</a> datasets published with the Financial Times revealed that several computer vision image datasets created by US companies and universities were unexpectedly also used for research by the National University of Defense Technology in China, along with top Chinese surveillance firms including SenseTime, SenseNets, CloudWalk, Hikvision, and Megvii/Face++, which have all been linked to the oppressive surveillance of Uighur Muslims in Xinjiang.</p> +<p>In this new research for the <a href="https://tsr.securityconference.de">Munich Security Conference's Transnational Security Report</a>, we provide summary statistics about the origins and endpoints of facial recognition information supply chains.
To make it more personal, we gathered additional data on the number of public photos from embassies that are currently being used in facial recognition datasets.</p> +<h3>24 Million Non-Cooperative Faces</h3> +<p>In total, we analyzed 30 publicly available face recognition and face analysis datasets that collectively include over 24 million non-cooperative images. Of these 24 million images, over 15 million face images are from Internet search engines, over 5.8 million from Flickr.com, over 2.5 million from the Internet Movie Database (IMDb.com), and nearly 500,000 from CCTV footage. All 24 million images were collected without any explicit consent, a type of face image that researchers call "in the wild".</p> +<p>Next, we manually verified 1,134 publicly available research papers that cite these datasets to determine who was using the data and where it was being used. Even though the vast majority of the images originated in the United States, the publicly available research citations show that only about 25% of citations are from the country of origin, while the majority of citations are from China.</p> +</section><section><div class='columns columns-2'><section class='applet_container'><div class='applet' data-payload='{"command": "single_pie_chart /site/research/munich_security_conference/assets/megapixels_origins_top.csv", "fields": ["Caption: Sources of Publicly Available Non-Cooperative Face Image Training Data 2006 - 2018", "Top: 10", "OtherLabel: Other"]}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "single_pie_chart /site/research/munich_security_conference/assets/summary_countries.csv", "fields": ["Caption: Locations Where Face Data Is Used Based on Public Research Citations", "Top: 14", "OtherLabel: Other"]}'></div></section></div></section><section><h3>6,000 Embassy Photos Being Used To Train Facial Recognition</h3> +<p>Of the 5.8 million Flickr images, we found that over 6,000 public photos from embassy
Flickr accounts were used to train facial recognition technologies. These images were used in the MegaFace and IBM Diversity in Faces datasets. Over 2,000 more images were included in the Who Goes There dataset, used for facial ethnicity analysis research. A few of the embassy images found in facial recognition datasets are shown below.</p> +</section><section><div class='columns columns-2'><section class='applet_container'><div class='applet' data-payload='{"command": "single_pie_chart /site/research/munich_security_conference/assets/country_counts.csv", "fields": ["Caption: Photos from these embassies are being used to train face recognition software", "Top: 4", "OtherLabel: Other", "Colors: categoryRainbow"]}'></div></section><section class='applet_container'><div class='applet' data-payload='{"command": "single_pie_chart /site/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv", "fields": ["Caption: Embassy images were found in these datasets", "Top: 4", "OtherLabel: Other", "Colors: categoryRainbow"]}'></div></section></div></section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/research/munich_security_conference/assets/4606260362.jpg' alt=' An image in the MegaFace dataset obtained from the United Kingdom&#39;s Embassy in Italy'><div class='caption'> An image in the MegaFace dataset obtained from the United Kingdom's Embassy in Italy</div></div> +<div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/research/munich_security_conference/assets/4749096858.jpg' alt=' An image in the MegaFace dataset obtained from the Flickr account of the United States Embassy in Kabul, Afghanistan'><div class='caption'> An image in the MegaFace dataset obtained from the Flickr account of the United States Embassy in Kabul, Afghanistan</div></div></section><section class='images'><div class='image'><img
src='https://nyc3.digitaloceanspaces.com/megapixels/v1/site/research/munich_security_conference/assets/4730007024.jpg' alt=' An image in the MegaFace dataset obtained from U.S. Embassy Canberra'><div class='caption'> An image in the MegaFace dataset obtained from U.S. Embassy Canberra</div></div></section><section><p>This brief research aims to shed light on the emerging politics of data. A photo is no longer just a photo when it can also be surveillance training data, and datasets can no longer be separated from the development of software when software is now built with data. "Our relationship to computers has changed", says Geoffrey Hinton, one of the founders of modern-day neural networks and deep learning. "Instead of programming them, we now show them and they figure it out."<a class="footnote_shim" name="[^hinton]_1"> </a><a href="#[^hinton]" class="footnote" title="Footnote 1">1</a></p> +<p>As data becomes more political, national AI strategies might also want to include transnational dataset strategies.</p> +<p><em>This research post is ongoing and will be updated during July and August, 2019.</em></p> +<h3>Further Reading</h3> +<ul> +<li><a href="/datasets/msceleb">MS Celeb Dataset Analysis</a></li> +<li><a href="/datasets/brainwash">Brainwash Dataset Analysis</a></li> +<li><a href="/datasets/duke_mtmc">Duke MTMC Dataset Analysis</a></li> +<li><a href="/datasets/uccs">Unconstrained College Students Dataset Analysis</a></li> +<li><a href="https://www.dukechronicle.com/article/2019/06/duke-university-facial-recognition-data-set-study-surveillance-video-students-china-uyghur">Duke MTMC dataset author apologizes to students</a></li> +<li><a href="https://www.bbc.com/news/technology-48555149">BBC coverage of MS Celeb dataset takedown</a></li> +<li><a href="https://www.spiegel.de/netzwelt/web/microsoft-gesichtserkennung-datenbank-mit-zehn-millionen-fotos-geloescht-a-1271221.html">Spiegel coverage of MS Celeb dataset takedown</a></li> +</ul> +</section><section> + 
+ <div class="hr-wave-holder"> + <div class="hr-wave-line hr-wave-line1"></div> + <div class="hr-wave-line hr-wave-line2"></div> + </div> + + <h2>Supplementary Information</h2> + +</section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file /site/research/munich_security_conference/assets/embassy_counts_public.csv", "fields": ["Headings: Images, Dataset, Embassy, Flickr ID, URL, Guest, Host"]}'></div></section><section> + + <h4>Cite Our Work</h4> + <p> + + If you find this analysis helpful, please cite our work: + +<pre id="cite-bibtex"> +@online{megapixels, + author = {Harvey, Adam and LaPlace, Jules}, + title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets}, + year = 2019, + url = {https://megapixels.cc/}, + urldate = {2019-04-18} +}</pre> + + </p> +</section><section><h3>References</h3><section><ul class="footnotes"><li>1 <a name="[^hinton]" class="footnote_shim"></a><span class="backlinks"><a href="#[^hinton]_1">a</a></span>"Heroes of Deep Learning: Andrew Ng interviews Geoffrey Hinton". Published on Aug 8, 2017. <a href="https://www.youtube.com/watch?v=-eyhCTvrEtE">https://www.youtube.com/watch?v=-eyhCTvrEtE</a> +</li></ul></section></section> + + </div> + <footer> + <ul class="footer-left"> + <li><a href="/">MegaPixels.cc</a></li> + <li><a href="/datasets/">Datasets</a></li> + <li><a href="/about/">About</a></li> + <li><a href="/about/news/">News</a></li> + <li><a href="/about/legal/">Legal & Privacy</a></li> + </ul> + <ul class="footer-right"> + <li>MegaPixels ©2017-19 <a href="https://ahprojects.com">Adam R. Harvey</a></li> + <li>Made with support from <a href="https://mozilla.org">Mozilla</a></li> + </ul> + </footer> +</body> + +<script src="/assets/js/dist/index.js"></script> +</html>
\ No newline at end of file diff --git a/site/public/test/chart/index.html b/site/public/test/chart/index.html index 33fafb48..e3134df9 100644 --- a/site/public/test/chart/index.html +++ b/site/public/test/chart/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/test/citations/index.html b/site/public/test/citations/index.html index a5fbcc76..3c630adc 100644 --- a/site/public/test/citations/index.html +++ b/site/public/test/citations/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/test/csv/index.html b/site/public/test/csv/index.html index d3ca0953..f1204c90 100644 --- a/site/public/test/csv/index.html +++ b/site/public/test/csv/index.html @@ -50,14 +50,14 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> <section><h1>CSV Test</h1> <h3><a href="/test/">← Back to test index</a></h3> -</section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file /site/test/assets/test.csv", "fields": ["Name, Images, Year, Gender, Description, URL"]}'></div></section> +</section><section class='applet_container'><div class='applet' data-payload='{"command": "load_file /site/test/assets/test.csv", "fields": ["Headings: Name, Images, Year, Gender, Description, URL"]}'></div></section> </div> <footer> diff --git a/site/public/test/datasets/index.html b/site/public/test/datasets/index.html index 136fbd60..fccc5367 100644 --- a/site/public/test/datasets/index.html +++ b/site/public/test/datasets/index.html 
@@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/test/face_search/index.html b/site/public/test/face_search/index.html index 3545bb00..52279498 100644 --- a/site/public/test/face_search/index.html +++ b/site/public/test/face_search/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/test/gallery/index.html b/site/public/test/gallery/index.html index b1061867..79adb78e 100644 --- a/site/public/test/gallery/index.html +++ b/site/public/test/gallery/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/test/index.html b/site/public/test/index.html index 626f1f0f..81e805ef 100644 --- a/site/public/test/index.html +++ b/site/public/test/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/test/map/index.html b/site/public/test/map/index.html index 6cba3b6f..4b4f2a4b 100644 --- a/site/public/test/map/index.html +++ b/site/public/test/map/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/test/name_search/index.html b/site/public/test/name_search/index.html index 1be553a5..50fc4a1b 100644 --- 
a/site/public/test/name_search/index.html +++ b/site/public/test/name_search/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/public/test/pie_chart/index.html b/site/public/test/pie_chart/index.html index 6b13aad6..4ba67b78 100644 --- a/site/public/test/pie_chart/index.html +++ b/site/public/test/pie_chart/index.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-"> diff --git a/site/templates/home.html b/site/templates/home.html index 19bdafe6..b74c0408 100644 --- a/site/templates/home.html +++ b/site/templates/home.html @@ -49,7 +49,7 @@ <div class='links'> <a href="/datasets/" class='aboutLink'>DATASETS</a> <a href="/about/" class='aboutLink'>ABOUT</a> - <a href="/about/news" class='updateLink'>News</a> + <a href="/research" class='updateLink'>Research</a> </div> </header> <div class="splash"> diff --git a/site/templates/layout.html b/site/templates/layout.html index b170b2b1..7c9052b2 100644 --- a/site/templates/layout.html +++ b/site/templates/layout.html @@ -50,7 +50,7 @@ <div class='links'> <a href="/datasets/">Datasets</a> <a href="/about/">About</a> - <a href="/about/news">News</a> + <a href="/research">Research</a> </div> </header> <div class="content content-{{ metadata.cssclass }}"> |
