diff options
| field | value | date |
|---|---|---|
| author | Jules Laplace <julescarbon@gmail.com> | 2019-04-19 09:50:01 +0200 |
| committer | Jules Laplace <julescarbon@gmail.com> | 2019-04-19 09:50:01 +0200 |
| commit | 11663b4b83cd735e83229a4ce85d6a3d4c1cb941 (patch) | |
| tree | eaeaff244520c572f4f88496b819d5311202ac32 /site/content | |
| parent | 06033681b31c643a17e983241848296354cbdc80 (diff) | |
| parent | cf0d2816acf0ef73ddffbf649677fafcc953c004 (diff) | |
Merge branch 'master' of github.com:adamhrv/megapixels_dev
Diffstat (limited to 'site/content')
| mode | file | changes |
|---|---|---|
| -rw-r--r-- | site/content/_drafts_/50_people_one_question/index.md (renamed from site/content/pages/datasets/50_people_one_question/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/afad/index.md (renamed from site/content/pages/datasets/afad/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/caltech_10k/index.md (renamed from site/content/pages/datasets/caltech_10k/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/celeba/index.md (renamed from site/content/pages/datasets/celeba/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/cofw/index.md (renamed from site/content/pages/datasets/cofw/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/feret/index.md (renamed from site/content/pages/datasets/feret/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/lfpw/index.md (renamed from site/content/pages/datasets/lfpw/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/background.jpg (renamed from site/content/pages/datasets/lfw/assets/background.jpg) | bin, 212118 -> 212118 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/background_lg.jpg (renamed from site/content/pages/datasets/lfw/assets/background_lg.jpg) | bin, 316873 -> 316873 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/fetch_lfw_people.py (renamed from site/content/pages/datasets/lfw/assets/fetch_lfw_people.py) | 0 |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/index.jpg (renamed from site/content/pages/datasets/lfw/assets/index.jpg) | bin, 25757 -> 25757 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_commercial_use.csv (renamed from site/content/pages/datasets/lfw/assets/lfw_commercial_use.csv) | 0 |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_feature.jpg (renamed from site/content/pages/datasets/lfw/assets/lfw_feature.jpg) | bin, 198556 -> 198556 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_montage_all_crop.jpg (renamed from site/content/pages/datasets/lfw/assets/lfw_montage_all_crop.jpg) | bin, 865449 -> 865449 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_montage_all_crop_1280.jpg (renamed from site/content/pages/datasets/lfw/assets/lfw_montage_all_crop_1280.jpg) | bin, 374074 -> 374074 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_montage_all_crop_960.jpg (renamed from site/content/pages/datasets/lfw/assets/lfw_montage_all_crop_960.jpg) | bin, 169159 -> 169159 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg (renamed from site/content/pages/datasets/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg) | bin, 690387 -> 690387 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_montage_top1_640.jpg (renamed from site/content/pages/datasets/lfw/assets/lfw_montage_top1_640.jpg) | bin, 125850 -> 125850 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_montage_top2_4_640.jpg (renamed from site/content/pages/datasets/lfw/assets/lfw_montage_top2_4_640.jpg) | bin, 122238 -> 122238 bytes |
| -rw-r--r-- | site/content/_drafts_/lfw/assets/lfw_names_gender_kg_min.csv (renamed from site/content/pages/datasets/lfw/assets/lfw_names_gender_kg_min.csv) | 0 |
| -rw-r--r-- | site/content/_drafts_/lfw/index.md (renamed from site/content/pages/datasets/lfw/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/market_1501/assets/background.jpg (renamed from site/content/pages/datasets/market_1501/assets/background.jpg) | bin, 308757 -> 308757 bytes |
| -rw-r--r-- | site/content/_drafts_/market_1501/assets/index.jpg (renamed from site/content/pages/datasets/market_1501/assets/index.jpg) | bin, 24177 -> 24177 bytes |
| -rw-r--r-- | site/content/_drafts_/market_1501/index.md (renamed from site/content/pages/datasets/market_1501/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/pipa/index.md (renamed from site/content/pages/datasets/pipa/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/pubfig/assets/background.jpg (renamed from site/content/pages/datasets/pubfig/assets/background.jpg) | bin, 159672 -> 159672 bytes |
| -rw-r--r-- | site/content/_drafts_/pubfig/assets/index.jpg (renamed from site/content/pages/datasets/pubfig/assets/index.jpg) | bin, 20802 -> 20802 bytes |
| -rw-r--r-- | site/content/_drafts_/pubfig/index.md (renamed from site/content/pages/datasets/pubfig/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/vgg_face2/index.md (renamed from site/content/pages/datasets/vgg_face2/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/viper/assets/background.jpg (renamed from site/content/pages/datasets/viper/assets/background.jpg) | bin, 203679 -> 203679 bytes |
| -rw-r--r-- | site/content/_drafts_/viper/assets/index.jpg (renamed from site/content/pages/datasets/viper/assets/index.jpg) | bin, 17294 -> 17294 bytes |
| -rw-r--r-- | site/content/_drafts_/viper/index.md (renamed from site/content/pages/datasets/viper/index.md) | 0 |
| -rw-r--r-- | site/content/_drafts_/youtube_celebrities/index.md (renamed from site/content/pages/datasets/youtube_celebrities/index.md) | 0 |
| -rw-r--r-- | site/content/pages/about/attribution.md | 1 |
| -rw-r--r-- | site/content/pages/about/index.md | 36 |
| -rw-r--r-- | site/content/pages/about/legal.md | 1 |
| -rw-r--r-- | site/content/pages/about/press.md | 1 |
| -rw-r--r-- | site/content/pages/datasets/brainwash/index.md | 6 |
| -rw-r--r-- | site/content/pages/datasets/duke_mtmc/index.md | 4 |
| -rw-r--r-- | site/content/pages/datasets/hrt_transgender/index.md | 2 |
| -rwxr-xr-x | site/content/pages/datasets/ijb_c/assets/background.jpg | bin, 0 -> 134927 bytes |
| -rwxr-xr-x | site/content/pages/datasets/ijb_c/assets/ijb_c_montage.jpg | bin, 0 -> 424821 bytes |
| -rwxr-xr-x | site/content/pages/datasets/ijb_c/assets/index.jpg | bin, 0 -> 14856 bytes |
| -rw-r--r-- | site/content/pages/datasets/ijb_c/index.md | 36 |
| -rw-r--r-- | site/content/pages/datasets/index.md | 4 |
| -rw-r--r-- | site/content/pages/datasets/msceleb/assets/background.jpg | bin, 422970 -> 157480 bytes |
| -rw-r--r-- | site/content/pages/datasets/msceleb/assets/index.jpg | bin, 39839 -> 21845 bytes |
| -rw-r--r-- | site/content/pages/datasets/msceleb/assets/msceleb_montage.jpg | bin, 0 -> 712507 bytes |
| -rw-r--r-- | site/content/pages/datasets/msceleb/index.md | 94 |
| -rw-r--r-- | site/content/pages/index.md | 2 |
| -rw-r--r-- | site/content/pages/research/00_introduction/index.md | 12 |
51 files changed, 111 insertions, 88 deletions
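For readers who want to double-check the summary above against the repository itself, here is a minimal sketch. It assumes a local clone of adamhrv/megapixels_dev containing the merge commit and first-parent hashes shown in the header; the `diffstat` helper name is purely illustrative.

```python
# Illustrative helper (not part of the repository): ask git for the same
# rename-aware stat shown above, limited to site/content.
import subprocess

MERGE_COMMIT = "11663b4b83cd735e83229a4ce85d6a3d4c1cb941"  # merge commit from the header
FIRST_PARENT = "06033681b31c643a17e983241848296354cbdc80"  # first parent from the header


def diffstat(repo_path: str = ".") -> str:
    """Return `git diff --stat` between the first parent and the merge, under site/content."""
    result = subprocess.run(
        [
            "git", "-C", repo_path, "diff",
            "--stat", "--find-renames",
            f"{FIRST_PARENT}..{MERGE_COMMIT}",
            "--", "site/content",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(diffstat())
```

If the clone is complete, the summary line of that output should correspond to the totals above (51 files changed).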
diff --git a/site/content/pages/datasets/50_people_one_question/index.md b/site/content/_drafts_/50_people_one_question/index.md index 8b7fb931..8b7fb931 100644 --- a/site/content/pages/datasets/50_people_one_question/index.md +++ b/site/content/_drafts_/50_people_one_question/index.md diff --git a/site/content/pages/datasets/afad/index.md b/site/content/_drafts_/afad/index.md index 755506d8..755506d8 100644 --- a/site/content/pages/datasets/afad/index.md +++ b/site/content/_drafts_/afad/index.md diff --git a/site/content/pages/datasets/caltech_10k/index.md b/site/content/_drafts_/caltech_10k/index.md index db2383c7..db2383c7 100644 --- a/site/content/pages/datasets/caltech_10k/index.md +++ b/site/content/_drafts_/caltech_10k/index.md diff --git a/site/content/pages/datasets/celeba/index.md b/site/content/_drafts_/celeba/index.md index 3f3aea79..3f3aea79 100644 --- a/site/content/pages/datasets/celeba/index.md +++ b/site/content/_drafts_/celeba/index.md diff --git a/site/content/pages/datasets/cofw/index.md b/site/content/_drafts_/cofw/index.md index 3cafe5b1..3cafe5b1 100644 --- a/site/content/pages/datasets/cofw/index.md +++ b/site/content/_drafts_/cofw/index.md diff --git a/site/content/pages/datasets/feret/index.md b/site/content/_drafts_/feret/index.md index 034ff4aa..034ff4aa 100644 --- a/site/content/pages/datasets/feret/index.md +++ b/site/content/_drafts_/feret/index.md diff --git a/site/content/pages/datasets/lfpw/index.md b/site/content/_drafts_/lfpw/index.md index 09506313..09506313 100644 --- a/site/content/pages/datasets/lfpw/index.md +++ b/site/content/_drafts_/lfpw/index.md diff --git a/site/content/pages/datasets/lfw/assets/background.jpg b/site/content/_drafts_/lfw/assets/background.jpg Binary files differindex 2c517060..2c517060 100644 --- a/site/content/pages/datasets/lfw/assets/background.jpg +++ b/site/content/_drafts_/lfw/assets/background.jpg diff --git a/site/content/pages/datasets/lfw/assets/background_lg.jpg b/site/content/_drafts_/lfw/assets/background_lg.jpg Binary files differindex 3ab1607d..3ab1607d 100644 --- a/site/content/pages/datasets/lfw/assets/background_lg.jpg +++ b/site/content/_drafts_/lfw/assets/background_lg.jpg diff --git a/site/content/pages/datasets/lfw/assets/fetch_lfw_people.py b/site/content/_drafts_/lfw/assets/fetch_lfw_people.py index 639883a6..639883a6 100644 --- a/site/content/pages/datasets/lfw/assets/fetch_lfw_people.py +++ b/site/content/_drafts_/lfw/assets/fetch_lfw_people.py diff --git a/site/content/pages/datasets/lfw/assets/index.jpg b/site/content/_drafts_/lfw/assets/index.jpg Binary files differindex bc36c106..bc36c106 100644 --- a/site/content/pages/datasets/lfw/assets/index.jpg +++ b/site/content/_drafts_/lfw/assets/index.jpg diff --git a/site/content/pages/datasets/lfw/assets/lfw_commercial_use.csv b/site/content/_drafts_/lfw/assets/lfw_commercial_use.csv index a2a4b39c..a2a4b39c 100644 --- a/site/content/pages/datasets/lfw/assets/lfw_commercial_use.csv +++ b/site/content/_drafts_/lfw/assets/lfw_commercial_use.csv diff --git a/site/content/pages/datasets/lfw/assets/lfw_feature.jpg b/site/content/_drafts_/lfw/assets/lfw_feature.jpg Binary files differindex 8ef2459e..8ef2459e 100644 --- a/site/content/pages/datasets/lfw/assets/lfw_feature.jpg +++ b/site/content/_drafts_/lfw/assets/lfw_feature.jpg diff --git a/site/content/pages/datasets/lfw/assets/lfw_montage_all_crop.jpg b/site/content/_drafts_/lfw/assets/lfw_montage_all_crop.jpg Binary files differindex b44d6430..b44d6430 100644 --- 
a/site/content/pages/datasets/lfw/assets/lfw_montage_all_crop.jpg +++ b/site/content/_drafts_/lfw/assets/lfw_montage_all_crop.jpg diff --git a/site/content/pages/datasets/lfw/assets/lfw_montage_all_crop_1280.jpg b/site/content/_drafts_/lfw/assets/lfw_montage_all_crop_1280.jpg Binary files differindex 5cad0c32..5cad0c32 100644 --- a/site/content/pages/datasets/lfw/assets/lfw_montage_all_crop_1280.jpg +++ b/site/content/_drafts_/lfw/assets/lfw_montage_all_crop_1280.jpg diff --git a/site/content/pages/datasets/lfw/assets/lfw_montage_all_crop_960.jpg b/site/content/_drafts_/lfw/assets/lfw_montage_all_crop_960.jpg Binary files differindex 015c11c7..015c11c7 100644 --- a/site/content/pages/datasets/lfw/assets/lfw_montage_all_crop_960.jpg +++ b/site/content/_drafts_/lfw/assets/lfw_montage_all_crop_960.jpg diff --git a/site/content/pages/datasets/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg b/site/content/_drafts_/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg Binary files differindex 3418f0af..3418f0af 100644 --- a/site/content/pages/datasets/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg +++ b/site/content/_drafts_/lfw/assets/lfw_montage_everyone_nocrop_1920.jpg diff --git a/site/content/pages/datasets/lfw/assets/lfw_montage_top1_640.jpg b/site/content/_drafts_/lfw/assets/lfw_montage_top1_640.jpg Binary files differindex 8e7954b1..8e7954b1 100644 --- a/site/content/pages/datasets/lfw/assets/lfw_montage_top1_640.jpg +++ b/site/content/_drafts_/lfw/assets/lfw_montage_top1_640.jpg diff --git a/site/content/pages/datasets/lfw/assets/lfw_montage_top2_4_640.jpg b/site/content/_drafts_/lfw/assets/lfw_montage_top2_4_640.jpg Binary files differindex deedc552..deedc552 100644 --- a/site/content/pages/datasets/lfw/assets/lfw_montage_top2_4_640.jpg +++ b/site/content/_drafts_/lfw/assets/lfw_montage_top2_4_640.jpg diff --git a/site/content/pages/datasets/lfw/assets/lfw_names_gender_kg_min.csv b/site/content/_drafts_/lfw/assets/lfw_names_gender_kg_min.csv index 1ff5f785..1ff5f785 100644 --- a/site/content/pages/datasets/lfw/assets/lfw_names_gender_kg_min.csv +++ b/site/content/_drafts_/lfw/assets/lfw_names_gender_kg_min.csv diff --git a/site/content/pages/datasets/lfw/index.md b/site/content/_drafts_/lfw/index.md index 5d90e87f..5d90e87f 100644 --- a/site/content/pages/datasets/lfw/index.md +++ b/site/content/_drafts_/lfw/index.md diff --git a/site/content/pages/datasets/market_1501/assets/background.jpg b/site/content/_drafts_/market_1501/assets/background.jpg Binary files differindex f3440590..f3440590 100644 --- a/site/content/pages/datasets/market_1501/assets/background.jpg +++ b/site/content/_drafts_/market_1501/assets/background.jpg diff --git a/site/content/pages/datasets/market_1501/assets/index.jpg b/site/content/_drafts_/market_1501/assets/index.jpg Binary files differindex e866defd..e866defd 100644 --- a/site/content/pages/datasets/market_1501/assets/index.jpg +++ b/site/content/_drafts_/market_1501/assets/index.jpg diff --git a/site/content/pages/datasets/market_1501/index.md b/site/content/_drafts_/market_1501/index.md index e106a498..e106a498 100644 --- a/site/content/pages/datasets/market_1501/index.md +++ b/site/content/_drafts_/market_1501/index.md diff --git a/site/content/pages/datasets/pipa/index.md b/site/content/_drafts_/pipa/index.md index 250878ff..250878ff 100644 --- a/site/content/pages/datasets/pipa/index.md +++ b/site/content/_drafts_/pipa/index.md diff --git a/site/content/pages/datasets/pubfig/assets/background.jpg b/site/content/_drafts_/pubfig/assets/background.jpg Binary 
files differindex db748a8f..db748a8f 100644 --- a/site/content/pages/datasets/pubfig/assets/background.jpg +++ b/site/content/_drafts_/pubfig/assets/background.jpg diff --git a/site/content/pages/datasets/pubfig/assets/index.jpg b/site/content/_drafts_/pubfig/assets/index.jpg Binary files differindex 2470b35c..2470b35c 100644 --- a/site/content/pages/datasets/pubfig/assets/index.jpg +++ b/site/content/_drafts_/pubfig/assets/index.jpg diff --git a/site/content/pages/datasets/pubfig/index.md b/site/content/_drafts_/pubfig/index.md index 5f2e1ad5..5f2e1ad5 100644 --- a/site/content/pages/datasets/pubfig/index.md +++ b/site/content/_drafts_/pubfig/index.md diff --git a/site/content/pages/datasets/vgg_face2/index.md b/site/content/_drafts_/vgg_face2/index.md index acf2476e..acf2476e 100644 --- a/site/content/pages/datasets/vgg_face2/index.md +++ b/site/content/_drafts_/vgg_face2/index.md diff --git a/site/content/pages/datasets/viper/assets/background.jpg b/site/content/_drafts_/viper/assets/background.jpg Binary files differindex db0b2857..db0b2857 100644 --- a/site/content/pages/datasets/viper/assets/background.jpg +++ b/site/content/_drafts_/viper/assets/background.jpg diff --git a/site/content/pages/datasets/viper/assets/index.jpg b/site/content/_drafts_/viper/assets/index.jpg Binary files differindex 6eaa365c..6eaa365c 100644 --- a/site/content/pages/datasets/viper/assets/index.jpg +++ b/site/content/_drafts_/viper/assets/index.jpg diff --git a/site/content/pages/datasets/viper/index.md b/site/content/_drafts_/viper/index.md index 291b2136..291b2136 100644 --- a/site/content/pages/datasets/viper/index.md +++ b/site/content/_drafts_/viper/index.md diff --git a/site/content/pages/datasets/youtube_celebrities/index.md b/site/content/_drafts_/youtube_celebrities/index.md index 49bfaa2e..49bfaa2e 100644 --- a/site/content/pages/datasets/youtube_celebrities/index.md +++ b/site/content/_drafts_/youtube_celebrities/index.md diff --git a/site/content/pages/about/attribution.md b/site/content/pages/about/attribution.md index 5060b2d9..148fe6d1 100644 --- a/site/content/pages/about/attribution.md +++ b/site/content/pages/about/attribution.md @@ -16,7 +16,6 @@ authors: Adam Harvey <section class="about-menu"> <ul> <li><a href="/about/">About</a></li> -<li><a href="/about/press/">Press</a></li> <li><a class="current" href="/about/attribution/">Attribution</a></li> <li><a href="/about/legal/">Legal / Privacy</a></li> </ul> diff --git a/site/content/pages/about/index.md b/site/content/pages/about/index.md index a6ce3d3d..2e5f9c9e 100644 --- a/site/content/pages/about/index.md +++ b/site/content/pages/about/index.md @@ -16,7 +16,6 @@ authors: Adam Harvey <section class="about-menu"> <ul> <li><a class="current" href="/about/">About</a></li> -<li><a href="/about/press/">Press</a></li> <li><a href="/about/attribution/">Attribution</a></li> <li><a href="/about/legal/">Legal / Privacy</a></li> </ul> @@ -24,9 +23,7 @@ authors: Adam Harvey MegaPixels is an independent art and research project by Adam Harvey and Jules LaPlace that investigates the ethics, origins, and individual privacy implications of face recognition image datasets and their role in the expansion of biometric surveillance technologies. -MegaPixels is made possible with support from <a href="http://mozilla.org">Mozilla</a>, our primary funding partner. 
- -Additional support for MegaPixels is provided by the European ARTificial Intelligence Network (AI LAB) at the Ars Electronica Center, 1-year research-in-residence grant from Karlsruhe HfG, and sales from the Privacy Gift Shop. +This project is made possible with support from <a href="http://mozilla.org">Mozilla</a>. <div class="flex-container team-photos-container"> @@ -45,16 +42,13 @@ Additional support for MegaPixels is provided by the European ARTificial Intelli </div> -The MegaPixels website is based on an [earlier installation from 2017](https://ahprojects.com/megapixels-glassroom/) and ongoing research and lectures ([TedX](https://www.youtube.com/watch?v=bfhcco9gS30), [CPDP](https://www.cpdpconferences.org/events/megapixels-is-in-publicly-available-facial-recognition-datasets)) about facial recognition datasets. Over the last several years this project has evolved into a large-scale interrogation of hundreds of publicly-available face and person analysis datasets. - -MegaPixels aims to provide a critical perspective on machine learning image datsets, one that might otherwise escape academia and the industry funded artificial intelligence think tanks that are often supported by the same technology companies who have created many of the datasets presented on this site. - -MegaPixels is an independent project, designed as a public resource for educators, students, journalists, and researchers. Each dataset presented on this site undergoes a thorough review of its images, intent, and funding sources. Though the goals are similar to publishing a public academic paper, MegaPixels is a website-first reserch project aligns closley with the goals of pre-print academic publications. As such we welcome feedback and ways to improve this site and the clarity of the research. +MegaPixels is an art and research project first launched in 2017 for an [installation](https://ahprojects.com/megapixels-glassroom/) at Tactical Technology Collective's GlassRoom about facial recognition datasets. In 2018 it was extended to cover pedestrian analysis datasets for a [commission by Elevate Arts festival](https://esc.mur.at/de/node/2370) in Austria. Since then MegaPixels has evolved into a large-scale interrogation of hundreds of publicly-available face and person analysis datasets. -Because this project surfaces many funding issues with datasets (from datasets funded by the C.I.A. to the National Unviversity of Defense and Technology in China), it is important that we are transparent about own funding. The original MegaPixels installation in 2017 was built as a commission for and with support from Tactical Technology Collective and Mozilla. The bulk of the research and web-development during 2018 - 2018 was supported by a grant from Mozilla. Continued development in 2019 is partially supported by a 1-year Reseacher-in-Residence grant from Karlsruhe HfG, lecture and workshop fees, and from commissions and sales from the Privacy Gift Shop. +MegaPixels aims to provide a critical perspective on machine learning image datsets, one that might otherwise escape academia and industry funded artificial intelligence think tanks that are often supported by the several of the same technology companies who have created datasets presented on this site. -Please get in touch if you are interested in supporting this project. +MegaPixels is an independent project, designed as a public resource for educators, students, journalists, and researchers. 
Each dataset presented on this site undergoes a thorough review of its images, intent, and funding sources. Though the goals are similar to publishing an academic paper, MegaPixels is a website-first research project, with an academic paper to follow. +One of the main focuses of the dataset investigations presented on this site is to uncover where funding originated. Because of our empahasis on other researchers' funding sources, it is important that we are transparent about our own. This site and the past year of reserach have been primarily funded by a privacy art grant from Mozilla in 2018. The original MegaPixels installation in 2017 was built as a commission for and with support from Tactical Technology Collective and Mozilla. The research into pedestrian analysis datasets was funded by a commission from Elevate Arts, and continued development in 2019 is supported in part by a 1-year Reseacher-in-Residence grant from Karlsruhe HfG and lecture and workshop fees. === columns 3 @@ -62,7 +56,6 @@ Please get in touch if you are interested in supporting this project. - Adam Harvey: Concept, research and analysis, design, computer vision - Jules LaPlace: Information and systems architecture, data management, web applications -You are free: =========== @@ -84,18 +77,6 @@ You are free: === end columns -Please direct questions, comments, or feedback to [mastodon.social/@adamhrv](https://mastodon.social/@adamhrv) - -#### Funding Partners - -The MegaPixels website, research, and development is made possible with support form Mozilla, our primary funding partner. - -[ add logos ] - -Additional support is provided by the European ARTificial Intelligence Network (AI LAB) at the Ars Electronica Center and a 1-year research-in-residence grant from Karlsruhe HfG. - -[ add logos ] - ##### Attribution If you use MegaPixels or any data derived from it for your work, please cite our original work as follows: @@ -106,6 +87,11 @@ If you use MegaPixels or any data derived from it for your work, please cite our title = {MegaPixels: Origins, Ethics, and Privacy Implications of Publicly Available Face Recognition Image Datasets}, year = 2019, url = {https://megapixels.cc/}, - urldate = {2019-04-20} + urldate = {2019-04-18} } </pre> + + +##### Contact + +Please direct questions, comments, or feedback to [mastodon.social/@adamhrv](https://mastodon.social/@adamhrv)
\ No newline at end of file diff --git a/site/content/pages/about/legal.md b/site/content/pages/about/legal.md index 85cf5c48..a58fde48 100644 --- a/site/content/pages/about/legal.md +++ b/site/content/pages/about/legal.md @@ -16,7 +16,6 @@ authors: Adam Harvey <section class="about-menu"> <ul> <li><a href="/about/">About</a></li> -<li><a href="/about/press/">Press</a></li> <li><a href="/about/attribution/">Attribution</a></li> <li><a class="current" href="/about/legal/">Legal / Privacy</a></li> </ul> diff --git a/site/content/pages/about/press.md b/site/content/pages/about/press.md index d3ed008c..8ab797c8 100644 --- a/site/content/pages/about/press.md +++ b/site/content/pages/about/press.md @@ -16,7 +16,6 @@ authors: Adam Harvey <section class="about-menu"> <ul> <li><a href="/about/">About</a></li> -<li><a class="current" href="/about/press/">Press</a></li> <li><a href="/about/attribution/">Attribution</a></li> <li><a href="/about/legal/">Legal / Privacy</a></li> </ul> diff --git a/site/content/pages/datasets/brainwash/index.md b/site/content/pages/datasets/brainwash/index.md index 75b0c006..79294114 100644 --- a/site/content/pages/datasets/brainwash/index.md +++ b/site/content/pages/datasets/brainwash/index.md @@ -21,9 +21,9 @@ authors: Adam Harvey Brainwash is a dataset of livecam images taken from San Francisco's Brainwash Cafe. It includes 11,918 images of "everyday life of a busy downtown cafe"[^readme] captured at 100 second intervals throught the entire day. The Brainwash dataset includes 3 full days of webcam images taken on October 27, November 13, and November 24 in 2014. According the author's [reserach paper](https://www.semanticscholar.org/paper/End-to-End-People-Detection-in-Crowded-Scenes-Stewart-Andriluka/1bd1645a629f1b612960ab9bba276afd4cf7c666) introducing the dataset, the images were acquired with the help of Angelcam.com[^end_to_end] -The Brainwash dataset is unique because it uses images from a publicly available webcam that records people inside a privately owned business without any consent. No ordinary cafe custom could ever suspect there image would end up in dataset used for surveillance reserach and development, but that is exactly what happened to customers at Brainwash cafe in San Francisco. +The Brainwash dataset is unique because it uses images from a publicly available webcam that records people inside a privately owned business without any consent. No ordinary cafe custom could ever suspect their image would end up in dataset used for surveillance research and development, but that is exactly what happened to customers at Brainwash cafe in San Francisco. -Although Brainwash appears to be a less popular dataset, it was used in 2016 and 2017 by researchers from the National University of Defense Technology in China took note of the dataset and used it for two [research](https://www.semanticscholar.org/paper/Localized-region-context-and-object-feature-fusion-Li-Dou/b02d31c640b0a31fb18c4f170d841d8e21ffb66c) [projects](https://www.semanticscholar.org/paper/A-Replacement-Algorithm-of-Non-Maximum-Suppression-Zhao-Wang/591a4bfa6380c9fcd5f3ae690e3ac5c09b7bf37b) on advancing the capabilities of object detection to more accurately isolate the target region in an image ([PDF](https://www.itm-conferences.org/articles/itmconf/pdf/2017/04/itmconf_ita2017_05006.pdf)). [^localized_region_context] [^replacement_algorithm]. 
The dataset also appears in a 2017 [research paper](https://ieeexplore.ieee.org/document/7877809) from Peking University for the purpose of improving surveillance capabilities for "people detection in the crowded scenes". +Although Brainwash appears to be a less popular dataset, it notably was used in 2016 and 2017 by researchers affiliated the National University of Defense Technology in China for two [research](https://www.semanticscholar.org/paper/Localized-region-context-and-object-feature-fusion-Li-Dou/b02d31c640b0a31fb18c4f170d841d8e21ffb66c) [projects](https://www.semanticscholar.org/paper/A-Replacement-Algorithm-of-Non-Maximum-Suppression-Zhao-Wang/591a4bfa6380c9fcd5f3ae690e3ac5c09b7bf37b) on advancing the capabilities of object detection to more accurately isolate the target region in an image ([PDF](https://www.itm-conferences.org/articles/itmconf/pdf/2017/04/itmconf_ita2017_05006.pdf)). [^localized_region_context] [^replacement_algorithm]. The dataset also appears in a 2017 [research paper](https://ieeexplore.ieee.org/document/7877809) from Peking University for the purpose of improving surveillance capabilities for "people detection in the crowded scenes".  @@ -31,7 +31,7 @@ Although Brainwash appears to be a less popular dataset, it was used in 2016 and {% include 'supplementary_header.html' %} - +  diff --git a/site/content/pages/datasets/duke_mtmc/index.md b/site/content/pages/datasets/duke_mtmc/index.md index 2420d042..9356823e 100644 --- a/site/content/pages/datasets/duke_mtmc/index.md +++ b/site/content/pages/datasets/duke_mtmc/index.md @@ -44,7 +44,7 @@ Despite [repeated](https://www.hrw.org/news/2017/11/19/china-police-big-data-sys The reasons that companies in China use the Duke MTMC dataset for research are technically no different than the reasons it is used in the United States and Europe. In fact, the original creators of the dataset published a follow up report in 2017 titled [Tracking Social Groups Within and Across Cameras](https://www.semanticscholar.org/paper/Tracking-Social-Groups-Within-and-Across-Cameras-Solera-Calderara/9e644b1e33dd9367be167eb9d832174004840400) with specific applications to "automated analysis of crowds and social gatherings for surveillance and security applications". Their work, as well as the creation of the original dataset in 2014 were both supported in part by the United States Army Research Laboratory. -Citations from the United States and Europe show a similar trend to that in China, including publicly acknowledged and verified usage of the Duke MTMC dataset supported or carried out by the United States Department of Homeland Security, IARPA, IBM, Microsoft (who provides surveillance to ICE), and Vision Semantics (who works with the UK Ministry of Defence). One [paper](https://pdfs.semanticscholar.org/59f3/57015054bab43fb8cbfd3f3dbf17b1d1f881.pdf) is even jointly published by researchers affiliated with both the University College of London and the National University of Defense Technology in China. +Citations from the United States and Europe show a similar trend to that in China, including publicly acknowledged and verified usage of the Duke MTMC dataset supported or carried out by the United States Department of Homeland Security, IARPA, IBM, Microsoft (who has provided surveillance to ICE), and Vision Semantics (who has worked with the UK Ministry of Defence). 
One [paper](https://pdfs.semanticscholar.org/59f3/57015054bab43fb8cbfd3f3dbf17b1d1f881.pdf) is even jointly published by researchers affiliated with both the University College of London and the National University of Defense Technology in China. | Organization | Paper | Link | Year | Used Duke MTMC | |---|---|---|---| @@ -79,7 +79,7 @@ For the approximately 2,000 students in Duke MTMC dataset there is unfortunately #### Video Timestamps -The video timestamps contain the likely, but not yet confirmed, date and times the video recorded. Because the video timestamps align with the start and stop [time sync data](http://vision.cs.duke.edu/DukeMTMC/details.html#time-sync) provided by the researchers, it at least confirms the relative timing. The [precipitous weather](https://www.wunderground.com/history/daily/KIGX/date/2014-3-19?req_city=Durham&req_state=NC&req_statename=North%20Carolina&reqdb.zip=27708&reqdb.magic=1&reqdb.wmo=99999) on March 14, 2014 in Durham, North Carolina supports, but does not confirm, that this day is a potential capture date. +The video timestamps contain the likely, but not yet confirmed, date and times the video recorded. Because the video timestamps align with the start and stop [time sync data](http://vision.cs.duke.edu/DukeMTMC/details.html#time-sync) provided by the researchers, it at least confirms the relative timing. The [precipitous weather](https://www.wunderground.com/history/daily/KIGX/date/2014-3-19?req_city=Durham&req_state=NC&req_statename=North%20Carolina&reqdb.zip=27708&reqdb.magic=1&reqdb.wmo=99999) on March 14, 2014 in Durham, North Carolina supports, but does not confirm, that this day is the likely capture date. === columns 2 diff --git a/site/content/pages/datasets/hrt_transgender/index.md b/site/content/pages/datasets/hrt_transgender/index.md index 137e6dcb..fb820593 100644 --- a/site/content/pages/datasets/hrt_transgender/index.md +++ b/site/content/pages/datasets/hrt_transgender/index.md @@ -1,6 +1,6 @@ ------------ -status: published +status: draft title: HRT Transgender Dataset desc: TBD subdesc: TBD diff --git a/site/content/pages/datasets/ijb_c/assets/background.jpg b/site/content/pages/datasets/ijb_c/assets/background.jpg Binary files differnew file mode 100755 index 00000000..6958a2b2 --- /dev/null +++ b/site/content/pages/datasets/ijb_c/assets/background.jpg diff --git a/site/content/pages/datasets/ijb_c/assets/ijb_c_montage.jpg b/site/content/pages/datasets/ijb_c/assets/ijb_c_montage.jpg Binary files differnew file mode 100755 index 00000000..3b5a0e40 --- /dev/null +++ b/site/content/pages/datasets/ijb_c/assets/ijb_c_montage.jpg diff --git a/site/content/pages/datasets/ijb_c/assets/index.jpg b/site/content/pages/datasets/ijb_c/assets/index.jpg Binary files differnew file mode 100755 index 00000000..7268d6ad --- /dev/null +++ b/site/content/pages/datasets/ijb_c/assets/index.jpg diff --git a/site/content/pages/datasets/ijb_c/index.md b/site/content/pages/datasets/ijb_c/index.md new file mode 100644 index 00000000..46cab323 --- /dev/null +++ b/site/content/pages/datasets/ijb_c/index.md @@ -0,0 +1,36 @@ +------------ + +status: draft +title: IJB-C +desc: IARPA Janus Benchmark C is a dataset of web images used +subdesc: The IJB-C dataset contains 21,294 images and 11,779 videos of 3,531 identities +slug: ijb_c +cssclass: dataset +image: assets/background.jpg +year: 2017 +published: 2019-4-18 +updated: 2019-4-18 +authors: Adam Harvey + +------------ + +## IARPA Janus Benchmark C (IJB-C) + +### sidebar +### end sidebar + +[ page under 
development ] + +The IARPA Janus Benchmark C is a dataset created by + + + + + +{% include 'dashboard.html' %} + +{% include 'supplementary_header.html' %} + +{% include 'cite_our_work.html' %} + +### Footnotes diff --git a/site/content/pages/datasets/index.md b/site/content/pages/datasets/index.md index 289aa2fd..6da4aa10 100644 --- a/site/content/pages/datasets/index.md +++ b/site/content/pages/datasets/index.md @@ -11,6 +11,6 @@ sync: false ------------ -# Facial Recognition Datasets +# Face Recognition Datasets -Explore publicly available facial recognition datasets feeding into research and development of biometric surveillance technologies at the largest technology companies and defense contractors in the world. +Explore publicly available facial recognition datasets contributing the growing crisis of authoritarian biometric surveillance technologies. This first group of datasets presented in April 2019 focus on connections to surveillance companies connected to defense organizations. diff --git a/site/content/pages/datasets/msceleb/assets/background.jpg b/site/content/pages/datasets/msceleb/assets/background.jpg Binary files differindex c1cd486e..8ace49a3 100644 --- a/site/content/pages/datasets/msceleb/assets/background.jpg +++ b/site/content/pages/datasets/msceleb/assets/background.jpg diff --git a/site/content/pages/datasets/msceleb/assets/index.jpg b/site/content/pages/datasets/msceleb/assets/index.jpg Binary files differindex fb3a934a..1c12e410 100644 --- a/site/content/pages/datasets/msceleb/assets/index.jpg +++ b/site/content/pages/datasets/msceleb/assets/index.jpg diff --git a/site/content/pages/datasets/msceleb/assets/msceleb_montage.jpg b/site/content/pages/datasets/msceleb/assets/msceleb_montage.jpg Binary files differnew file mode 100644 index 00000000..b64348f4 --- /dev/null +++ b/site/content/pages/datasets/msceleb/assets/msceleb_montage.jpg diff --git a/site/content/pages/datasets/msceleb/index.md b/site/content/pages/datasets/msceleb/index.md index 0c78e094..c16016f8 100644 --- a/site/content/pages/datasets/msceleb/index.md +++ b/site/content/pages/datasets/msceleb/index.md @@ -2,7 +2,7 @@ status: published title: Microsoft Celeb -desc: Microsoft Celeb 1M is a target list and dataset of web images used for research and development of face recognition technologies +desc: Microsoft Celeb 1M is a target list and dataset of web images used for research and development of face recognition subdesc: The MS Celeb dataset includes over 10 million images of about 100K people and a target list of 1 million individuals slug: msceleb cssclass: dataset @@ -19,73 +19,67 @@ authors: Adam Harvey ### sidebar ### end sidebar -Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research who created and published the [dataset](http://msceleb.org) in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. 
Microsoft's goal in building this dataset was to distribute the initial training dataset of 100,000 individuals images and use this to accelerate reserch into recognizing a target list of one million individuals from their face images "using all the possibly collected face images of this individual on the web as training data".[^msceleb_orig] +Microsoft Celeb (MS Celeb) is a dataset of 10 million face images scraped from the Internet and used for research and development of large-scale biometric recognition systems. According to Microsoft Research who created and published the [dataset](https://www.microsoft.com/en-us/research/publication/ms-celeb-1m-dataset-benchmark-large-scale-face-recognition-2/) in 2016, MS Celeb is the largest publicly available face recognition dataset in the world, containing over 10 million images of nearly 100,000 individuals. Microsoft's goal in building this dataset was to distribute an initial training dataset of 100,000 individuals images and use this to accelerate research into recognizing a target list of one million people from their face images "using all the possibly collected face images of this individual on the web as training data".[^msceleb_orig] -These one million people, defined as Micrsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people including academics, policy makers, writers, artists, and especially journalists maintaining an online presence is mandatory and should not allow Microsoft (or anyone else) to use their biometrics for reserach and development of surveillance technology. Many of names in target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York and [add more]; artists critical of surveillance including Trevor Paglen, Hito Steryl, Kyle McDonald, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glen Greenwald; Data and Society founder danah boyd; and even Julie Brill the former FTC commissioner responsible for protecting consumer’s privacy to name a few. +These one million people, defined by Microsoft Research as "celebrities", are often merely people who must maintain an online presence for their professional lives. Microsoft's list of 1 million people is an expansive exploitation of the current reality that for many people including academics, policy makers, writers, artists, and especially journalists maintaining an online presence is mandatory and should not allow Microsoft or anyone else to use their biometrics for research and development of surveillance technology. Many of names in the target list even include people critical of the very technology Microsoft is using their name and biometric information to build. The list includes digital rights activists like Jillian York; artists critical of surveillance including Trevor Paglen, Jill Magid, and Aram Bartholl; Intercept founders Laura Poitras, Jeremy Scahill, and Glen Greenwald; Data and Society founder danah boyd; and even Julie Brill the former FTC commissioner responsible for protecting consumer privacy to name a few. 
### Microsoft's 1 Million Target List -Below is a list of names that were included in list of 1 million individuals curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from [msceleb.org](https://msceleb.org). Names appearing with * indicate that Microsoft also distributed imaged. - -[ cleaning this up ] +Below is a selection of names from the full target list, curated to illustrate Microsoft's expansive and exploitative practice of scraping the Internet for biometric training data. The entire name file can be downloaded from [msceleb.org](https://msceleb.org). You can email <a href="mailto:msceleb@microsoft.com?subject=MS-Celeb-1M Removal Request&body=Dear%20Microsoft%2C%0A%0AI%20recently%20discovered%20that%20you%20use%20my%20identity%20for%20commercial%20use%20in%20your%20MS-Celeb-1M%20dataset%20used%20for%20research%20and%20development%20of%20face%20recognition.%20I%20do%20not%20wish%20to%20be%20included%20in%20your%20dataset%20in%20any%20format.%20%0A%0APlease%20remove%20my%20name%20and%2For%20any%20associated%20images%20immediately%20and%20send%20a%20confirmation%20once%20you've%20updated%20your%20%22Top1M_MidList.Name.tsv%22%20file.%0A%0AThanks%20for%20promptly%20handing%20this%2C%0A%5B%20your%20name%20%5D">msceleb@microsoft.com</a> to have your name removed. Names appearing with * indicate that Microsoft also distributed images. === columns 2 -| Name | ID | Profession | Images | -| --- | --- | --- | --- | -| Jeremy Scahill | /m/02p_8_n | Journalist | x | -| Jillian York | /m/0g9_3c3 | Digital rights activist | x | -| Astra Taylor | /m/05f6_39 | Author, activist | x | -| Jonathan Zittrain | /m/01f75c | EFF board member | no | -| Julie Brill | x | x | x | -| Jonathan Zittrain | x | x | x | -| Bruce Schneier | m.095js | Cryptologist and author | yes | -| Julie Brill | m.0bs3s9g | x | x | -| Kim Zetter | /m/09r4j3 | x | x | -| Ethan Zuckerman | x | x | x | -| Jill Magid | x | x | x | -| Kyle McDonald | x | x | x | -| Trevor Paglen | x | x | x | -| R. 
Luke DuBois | x | x | x | +| Name | Profession | +| --- | --- | --- | +| Adrian Chen | Journalist | +| Ai Weiwei* | Artist | +| Aram Bartholl | Internet artist | +| Astra Taylor | Author, director, activist | +| Alexander Madrigal | Journlist | +| Bruce Schneier* | Cryptologist | +| danah boyd | Data & Society founder | +| Edward Felten | Former FTC Chief Technologist | +| Evgeny Morozov* | Tech writer, researcher | +| Glen Greenwald* | Journalist, author | +| Hito Steryl | Artist, writer | -==== +=== -| Name | ID | Profession | Images | -| --- | --- | --- | -- | -| Trevor Paglen | x | x | x | -| Ai Weiwei | /m/0278dyq | x | x | -| Jer Thorp | /m/01h8lg | x | x | -| Edward Felten | /m/028_7k | x | x | -| Evgeny Morozov | /m/05sxhgd | Scholar and technology critic | yes | -| danah boyd | /m/06zmx5 | Data and Society founder | x | -| Bruce Schneier | x | x | x | -| Laura Poitras | x | x | x | -| Trevor Paglen | x | x | x | -| Astra Taylor | x | x | x | -| Shoshanaa Zuboff | x | x | x | -| Eyal Weizman | m.0g54526 | x | x | -| Aram Bartholl | m.06_wjyc | x | x | -| James Risen | m.09pk6b | x | x | +| Name | Profession | +| --- | --- | --- | +| James Risen | Journalist | +| Jeremy Scahill* | Journalist | +| Jill Magid | Artist | +| Jillian York | Digital rights activist | +| Jonathan Zittrain | EFF board member | +| Julie Brill | Former FTC Commissioner| +| Kim Zetter | Journalist, author | +| Laura Poitras* | Filmmaker | +| Luke DuBois | Artist | +| Shoshana Zuboff | Author, academic | +| Trevor Paglen | Artist, researcher | === end columns -After publishing this list, researchers from Microsoft Asia then worked with researchers affilliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the the MS Celeb dataset for their [research paper](https://www.semanticscholar.org/paper/Faces-as-Lighting-Probes-via-Unsupervised-Deep-Yi-Zhu/b301fd2fc33f24d6f75224e7c0991f4f04b64a65) on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition. +After publishing this list, researchers from Microsoft Asia then worked with researchers affiliated with China's National University of Defense Technology (controlled by China's Central Military Commission) and used the the MS Celeb dataset for their [research paper](https://www.semanticscholar.org/paper/Faces-as-Lighting-Probes-via-Unsupervised-Deep-Yi-Zhu/b301fd2fc33f24d6f75224e7c0991f4f04b64a65) on using "Faces as Lighting Probes via Unsupervised Deep Highlight Extraction" with potential applications in 3D face recognition. + +In an [article](https://www.ft.com/content/9378e7ee-5ae6-11e9-9dde-7aedca0a081a) published by Financial Times based on data surfaced during this investigation, Samm Sacks (a senior fellow at New America think tank) commented that this research raised "red flags because of the nature of the technology, the author's affiliations, combined with what we know about how this technology is being deployed in China right now". 
Adding, that "the [Chinese] government is using these technologies to build surveillance systems and to detain minorities [in Xinjiang]".[^madhu_ft] -In an article published by the Financial Times based on data discovered during this investigation, Samm Sacks (senior fellow at New American and China tech policy expert) commented that this research raised "red flags because of the nature of the technology, the authors affilliations, combined with the what we know about how this technology is being deployed in China right now".[^madhu_ft] +Four more papers published by SenseTime which also use the MS Celeb dataset raise similar flags. SenseTime is a computer vision surveillance company who until [April 2019](https://uhrp.org/news-commentary/china%E2%80%99s-sensetime-sells-out-xinjiang-security-joint-venture) provided surveillance to Chinese authorities to monitor and track Uighur Muslims in Xinjiang province and had been [flagged](https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html) numerous times as having potential links to human rights violations. -Four more papers published by SenseTime which also use the MS Celeb dataset raise similar flags. SenseTime is Beijing based company providing surveillance to Chinese authorities including [ add context here ] has been [flagged](https://www.nytimes.com/2019/04/14/technology/china-surveillance-artificial-intelligence-racial-profiling.html) as complicity in potential human rights violations. +One of the 4 SenseTime papers, "[Exploring Disentangled Feature Representation Beyond Face Identification](https://www.semanticscholar.org/paper/Exploring-Disentangled-Feature-Representation-Face-Liu-Wei/1fd5d08394a3278ef0a89639e9bfec7cb482e0bf)", shows how SenseTime was developing automated face analysis technology to infer race, narrow eyes, nose size, and chin size, all of which could be used to target vulnerable ethnic groups based on their facial appearances. -One of the 4 SenseTime papers, "Exploring Disentangled Feature Representation Beyond Face Identification", shows how SenseTime is developing automated face analysis technology to infer race, narrow eyes, nose size, and chin size, all of which could be used to target vulnerable ethnic groups based on their facial appearances.[^disentangled] +Earlier in 2019, Microsoft CEO [Brad Smith](https://blogs.microsoft.com/on-the-issues/2018/12/06/facial-recognition-its-time-for-action/) called for the governmental regulation of face recognition citing the potential for misuse, a rare admission that Microsoft's surveillance-driven business model had lost its bearing. More recently Smith also [announced](https://www.reuters.com/article/us-microsoft-ai/microsoft-turned-down-facial-recognition-sales-on-human-rights-concerns-idUSKCN1RS2FV) that Microsoft would seemingly take stand against such potential misuse and decided to not sell face recognition to an unnamed United States agency, citing a lack of accuracy made it not suitable to be used on minorities, because it was trained mostly on white male faces. -Earlier in 2019, Microsoft CEO [Brad Smith](https://blogs.microsoft.com/on-the-issues/2018/12/06/facial-recognition-its-time-for-action/) called for the governmental regulation of face recognition, citing the potential for misuse, a rare admission that Microsoft's surveillance-driven business model had lost its bearing. 
More recently Smith also [announced](https://www.reuters.com/article/us-microsoft-ai/microsoft-turned-down-facial-recognition-sales-on-human-rights-concerns-idUSKCN1RS2FV) that Microsoft would seemingly take stand against potential misuse and decided to not sell face recognition to an unnamed United States law enforcement agency, citing that their technology was not accurate enough to be used on minorities because it was trained mostly on white male faces. +What the decision to block the sale announces is not so much that Microsoft had upgraded their ethics, but that Microsoft publicly acknowledged it can't sell a data-driven product without data. In other words, Microsoft can't sell face recognition for faces they can't train on. -What the decision to block the sale announces is not so much that Microsoft has upgraded their ethics, but that it publicly acknolwedged it can't sell a data-driven product without data. Microsoft can't sell face recognition for faces they can't train on. +Until now, that data has been freely harvested from the Internet and packaged in training sets like MS Celeb, which are overwhelmingly [white](https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html) and [male](https://gendershades.org). Without balanced data, facial recognition contains blind spots. And without datasets like MS Celeb, the powerful yet inaccurate facial recognition services like Microsoft's Azure Cognitive Service also would not be able to see at all. -Until now, that data has been freely harvested from the Internet and packaged in training sets like MS Celeb, which are overwhelmingly [white](https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html) and [male](https://gendershades.org). Without balanced data, facial recognition contains blind spots. And without datasets like MS Celeb, the powerful yet innaccurate facial recognition services like Microsoft's Azure Cognitive Service also would not be able to see at all. + -Microsoft didn't only create MS Celeb for other researchers to use, they also used it internally. In a publicly available 2017 Microsoft Research project called "([One-shot Face Recognition by Promoting Underrepresented Classes](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/))", Microsoft leveraged the MS Celeb dataset to analyse their algorithms and advertise the results. Interestingly, the Microsoft's [corporate version](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/) does not mention they used the MS Celeb datset, but the [open-acess version](https://www.semanticscholar.org/paper/One-shot-Face-Recognition-by-Promoting-Classes-Guo/6cacda04a541d251e8221d70ac61fda88fb61a70) of the paper published on arxiv.org that same year explicity mentions that Microsoft Research tested their algorithms "on the MS-Celeb-1M low-shot learning benchmark task." +Microsoft didn't only create MS Celeb for other researchers to use, they also used it internally. In a publicly available 2017 Microsoft Research project called [One-shot Face Recognition by Promoting Underrepresented Classes](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/), Microsoft leveraged the MS Celeb dataset to analyze their algorithms and advertise the results. 
Interestingly, Microsoft's [corporate version](https://www.microsoft.com/en-us/research/publication/one-shot-face-recognition-promoting-underrepresented-classes/) of the paper does not mention they used the MS Celeb datset, but the [open-access version](https://www.semanticscholar.org/paper/One-shot-Face-Recognition-by-Promoting-Classes-Guo/6cacda04a541d251e8221d70ac61fda88fb61a70) published on arxiv.org explicitly mentions that Microsoft Research introspected their algorithms "on the MS-Celeb-1M low-shot learning benchmark task." -We suggest that if Microsoft Research wants biometric data for surveillance research and development, they should start with own researcher's biometric data instead of scraping the Internet for journalists, artists, writers, and academics. +We suggest that if Microsoft Research wants to make biometric data publicly available for surveillance research and development, they should start with releasing their researchers' own biometric data instead of scraping the Internet for journalists, artists, writers, actors, athletes, musicians, and academics. {% include 'dashboard.html' %} @@ -93,7 +87,5 @@ We suggest that if Microsoft Research wants biometric data for surveillance rese ### Footnotes -[^brad_smith]: Brad Smith cite [^msceleb_orig]: MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition -[^madhu_ft]: Microsoft worked with Chinese military university on artificial intelligence -[^disentangled]: "Exploring Disentangled Feature Representation Beyond Face Identification"
\ No newline at end of file +[^madhu_ft]: Murgia, Madhumita. Microsoft worked with Chinese military university on artificial intelligence. Financial Times. April 10, 2019.
\ No newline at end of file diff --git a/site/content/pages/index.md b/site/content/pages/index.md index 1cf47aac..65611f49 100644 --- a/site/content/pages/index.md +++ b/site/content/pages/index.md @@ -2,7 +2,7 @@ status: published title: Megapixels -desc: The Darkside of Datasets +desc: Face Recognition Datasets slug: analysis published: 2018-12-15 updated: 2018-12-15 diff --git a/site/content/pages/research/00_introduction/index.md b/site/content/pages/research/00_introduction/index.md index 91b87b18..477679d4 100644 --- a/site/content/pages/research/00_introduction/index.md +++ b/site/content/pages/research/00_introduction/index.md @@ -15,8 +15,11 @@ authors: Megapixels + Posted: Dec. 15 + Author: Adam Harvey + Facial recognition is a scam. +It's extractive and damaging industry that's built on the biometric backbone of the Internet. + During the last 20 years commericial, academic, and governmental agencies have promoted the false dream of a future with face recognition. This essay debunks the popular myth that such a thing ever existed. There is no such thing as *face recognition*. For the last 20 years, government agencies, commercial organizations, and academic institutions have played the public as a fool, selling a roadmap of the future that simply does not exist. Facial recognition, as it is currently defined, promoted, and sold to the public, government, and commercial sector is a scam. @@ -25,6 +28,15 @@ Committed to developing robust solutions with superhuman accuracy, the industry There is only biased feature vector clustering and probabilistic thresholding. +## If you don't have data, you don't have a product. + +Yesterday's [decision](https://www.reuters.com/article/us-microsoft-ai/microsoft-turned-down-facial-recognition-sales-on-human-rights-concerns-idUSKCN1RS2FV) by Brad Smith, CEO of Microsoft, to not sell facial recognition to a US law enforcement agency is not an about face by Microsoft to become more humane, it's simply a perfect illustration of the value of training data. Without data, you don't have a product to sell. Microsoft realized that doesn't have enough training data to sell + + +## Use Your Own Biometrics First + +If researchers want faces, they should take selfies and create their own dataset. If researchers want images of families to build surveillance software, they should use and distibute their own family portraits. + ### Motivation Ever since government agencies began developing face recognition in the early 1960's, datasets of face images have always been central to developing and validating face recognition technologies. Today, these datasets no longer originate in labs, but instead from family photo albums posted on photo sharing sites, surveillance camera footage from college campuses, search engine queries for celebrities, cafe livestreams, or <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">videos on YouTube</a>. |
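The research introduction above claims there is "only biased feature vector clustering and probabilistic thresholding." As a purely illustrative sketch of what that reduction means, the snippet below compares two face embedding vectors with cosine similarity and declares a "match" when an arbitrary threshold is cleared; the vectors, the 128-dimensional size, and the 0.6 threshold are assumptions for demonstration only, not values from any system discussed on this site.

```python
# Illustrative only: "face recognition" reduced to vector comparison plus a threshold.
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def is_match(probe: np.ndarray, candidate: np.ndarray, threshold: float = 0.6) -> bool:
    """Declare a 'match' whenever similarity clears an arbitrary threshold."""
    return cosine_similarity(probe, candidate) >= threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probe = rng.normal(size=128)                            # stand-in probe embedding
    same_person = probe + rng.normal(scale=0.1, size=128)   # nearby vector
    stranger = rng.normal(size=128)                         # unrelated vector
    print(is_match(probe, same_person))  # expected: True
    print(is_match(probe, stranger))     # expected: False
```

Everything upstream of those two vectors — the embedding model and the images it was trained on — is where datasets like the ones documented on this site come in.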
