From 642b6c412e920b0a41dafd78982ed363a747f99b Mon Sep 17 00:00:00 2001 From: Jules Laplace Date: Fri, 28 Jun 2019 00:44:59 -0400 Subject: css fixes etc --- site/content/pages/datasets/msceleb/index.md | 4 ++-- site/content/pages/research/_introduction/index.md | 2 +- .../pages/research/munich_security_conference/index.md | 12 ++---------- 3 files changed, 5 insertions(+), 13 deletions(-) (limited to 'site/content') diff --git a/site/content/pages/datasets/msceleb/index.md b/site/content/pages/datasets/msceleb/index.md index 453c1522..0e457cd9 100644 --- a/site/content/pages/datasets/msceleb/index.md +++ b/site/content/pages/datasets/msceleb/index.md @@ -101,9 +101,9 @@ For example, on October 28, 2019, the MS Celeb dataset will be used for a new co And in June, shortly after [posting](https://twitter.com/adamhrv/status/1134511293526937600) about the disappearance of the MS Celeb dataset, it reemerged on [Academic Torrents](https://academictorrents.com/details/9e67eb7cc23c9417f39778a8e06cca5e26196a97/tech). As of June 10, the MS Celeb dataset files have been redistributed in at least 9 countries and downloaded 44 times without any restrictions. The files were seeded and are mostly distributed by an AI company based in China called Hyper.ai, which states that it redistributes MS Celeb and other datasets for "teachers and students of service industry-related practitioners and research institutes."[^hyperai_readme] -Earlier in 2019 images from the MS Celeb were also repackaged into another face dataset called *Racial Faces in the Wild (RFW)*. To create it, the RFW authors uploaded face images from the MS Celeb dataset to the Face++ API and used the inferred racial scores to segregate people into four subsets: Caucasian, Asian, Indian, and African each with 3,000 subjects. 
That dataset then appeared in a subsequent research project from researchers affiliated with IIIT-Delhi and IBM TJ Watson called [Deep Learning for Face Recognition: Pride or Prejudiced?](https://arxiv.org/abs/1904.01219), which aims to reduce bias but also inadvertently furthers racist language and ideologies that can not be repeated here. +Earlier in 2019 images from the MS Celeb were also repackaged into another face dataset called *Racial Faces in the Wild (RFW)*. To create it, the RFW authors uploaded face images from the MS Celeb dataset to the Face++ API and used the inferred racial scores to segregate people into four subsets: Caucasian, Asian, Indian, and African each with 3,000 subjects. That dataset then appeared in a subsequent research project from researchers affiliated with IIIT-Delhi and IBM TJ Watson called [Deep Learning for Face Recognition: Pride or Prejudiced?](https://arxiv.org/abs/1904.01219), which aims to reduce bias but also inadvertently furthers racist ideologies, using discredited racial terminology that cannot be repeated here. -The estimated racial scores for the MS Celeb face images used in the RFW dataset were computed using the Face++ API, which is owned by Megvii Inc, a company that has been repeatedly linked to the oppressive surveillance of Uighur Muslims in Xinjiang, China. According to posts from the [ChinAI Newsletter](https://chinai.substack.com/p/chinai-newsletter-11-companies-involved-in-expanding-chinas-public-security-apparatus-in-xinjiang) and [BuzzFeedNews](https://www.buzzfeednews.com/article/ryanmac/us-money-funding-facial-recognition-sensetime-megvii), Megvii announced in 2017 at the China-Eurasia Security Expo in Ürümqi, Xinjiang, that it would be the official technical support unit of the "Public Security Video Laboratory" in Xinjiang, China. 
If they didn't already, it's highly likely that Megvii has a copy of everyone's biometric faceprint from the MS Celeb dataset, either from uploads to the Face++ API or through the research projects explicitly referencing MS Celeb dataset usage, such as a 2018 paper called [GridFace: Face Rectification via Learning Local Homography Transformations](https://arxiv.org/pdf/1808.06210.pdf) jointly published by 3 authors, all of whom worked for Megvii. +The estimated racial scores for the MS Celeb face images used in the RFW dataset were computed using the Face++ API, which is owned by Megvii Inc, a company that has been repeatedly linked to the oppressive surveillance of Uighur Muslims in Xinjiang, China. According to posts from the [ChinAI Newsletter](https://chinai.substack.com/p/chinai-newsletter-11-companies-involved-in-expanding-chinas-public-security-apparatus-in-xinjiang) and [BuzzFeedNews](https://www.buzzfeednews.com/article/ryanmac/us-money-funding-facial-recognition-sensetime-megvii), Megvii announced in 2017 at the China-Eurasia Security Expo in Ürümqi, Xinjiang, that it would be the official technical support unit of the "Public Security Video Laboratory" in Xinjiang, China. If they didn't already, it's highly likely that Megvii has a copy of everyone's biometric faceprint from the MS Celeb dataset, either from uploads to the Face++ API or through research projects explicitly referencing MS Celeb dataset usage, such as a 2018 paper called [GridFace: Face Rectification via Learning Local Homography Transformations](https://arxiv.org/pdf/1808.06210.pdf) jointly published by 3 authors, all of whom worked for Megvii. 
## Commercial Usage diff --git a/site/content/pages/research/_introduction/index.md b/site/content/pages/research/_introduction/index.md index b99e3048..7e839fe7 100644 --- a/site/content/pages/research/_introduction/index.md +++ b/site/content/pages/research/_introduction/index.md @@ -1,6 +1,6 @@ ------------ -status: published +status: draft title: Introducing MegaPixels desc: Introduction to Megapixels slug: 00_introduction diff --git a/site/content/pages/research/munich_security_conference/index.md b/site/content/pages/research/munich_security_conference/index.md index aba39b1c..6a1b84e9 100644 --- a/site/content/pages/research/munich_security_conference/index.md +++ b/site/content/pages/research/munich_security_conference/index.md @@ -71,7 +71,7 @@ Add text === -#### Locations Where Face Data Is Used +#### Where Face Data Is Used Add text @@ -104,16 +104,8 @@ Including over 2,000 more for racial analysis ![caption: MegaFace from U.S. Embassy Canberra](assets/4730007024.jpg) -=== columns 2 - ![caption: An image from the MegaFace dataset obtained from United Kingdom's Embassy in Italy https://flickr.com/photos/ukinitaly](assets/4606260362.jpg) - -==== - -![caption: An imgae from the MegaFace dataset obtained from the Flick account of the United States Embassy in Kabul Afghanistan https://flickr.com/photos/kabulpublicdiplomacy](assets/4749096858.jpg) - - -=== end columns +![caption: An image from the MegaFace dataset obtained from the Flickr account of the United States Embassy in Kabul, Afghanistan https://flickr.com/photos/kabulpublicdiplomacy](assets/4749096858.jpg) === columns 2 -- cgit v1.2.3-70-g09d2 From fe418ace1873b81fa3d2c622d3b414dcacbe7b56 Mon Sep 17 00:00:00 2001 From: Jules Laplace Date: Fri, 28 Jun 2019 02:06:53 -0400 Subject: more options... 
add download csv link and search --- client/chart/constants.js | 45 ++++++++++++---- client/chart/singlePie.chart.js | 22 +++----- client/index.js | 9 ++++ client/table/citations.table.js | 6 +-- client/table/file.table.js | 60 ++++++++++++++++------ client/table/tabulator.css | 12 +++-- site/content/_drafts_/lfw/index.md | 2 +- site/content/pages/research/_introduction/index.md | 2 +- .../research/munich_security_conference/index.md | 4 +- site/content/pages/test/csv.md | 2 +- 10 files changed, 112 insertions(+), 52 deletions(-) (limited to 'site/content') diff --git a/client/chart/constants.js b/client/chart/constants.js index 6fe94433..0f23f06b 100644 --- a/client/chart/constants.js +++ b/client/chart/constants.js @@ -48,12 +48,35 @@ export const colorblindSafeRainbow = [ // '#888888', ] +export const categoryRainbow = [ + '#5e4fa2', + '#66c2a5', + '#d53e4f', + '#f46d43', + '#9e0142', + '#3288bd', + '#fee08b', + '#abdda4', + '#e6f598', + '#fdae61', + '#ffffbf', + // '#888888', +] + export const institutionColors = [ '#f2f293', // edu (yellow) '#3264f6', // company (blue) '#f30000', // gov/mil (red) ] +export const colorTable = { + rainbow, + bigRainbow, + colorblindSafeRainbow, + institutionColors, + categoryRainbow, +} + /* stuff for a 'countries' legend */ export const topCountryCount = 10 @@ -63,20 +86,20 @@ export const otherCountriesLabel = 'Other Countries' /* institution tuples, labels and templates */ export const initialInstitutionLookup = { - 'edu': 0, - 'company': 0, - 'gov': 0, + edu: 0, + company: 0, + gov: 0, } export const institutionOrder = { - 'edu': 0, - 'company': 1, - 'gov': 2, + edu: 0, + company: 1, + gov: 2, } export const institutionLabels = { - 'edu': 'Academic', - 'company': 'Commercial', - 'gov': 'Military / Government', - 'mil': 'Military / Government', -} \ No newline at end of file + edu: 'Academic', + company: 'Commercial', + gov: 'Military / Government', + mil: 'Military / Government', +} diff --git 
a/client/chart/singlePie.chart.js b/client/chart/singlePie.chart.js index ed2da582..2e770bd7 100644 --- a/client/chart/singlePie.chart.js +++ b/client/chart/singlePie.chart.js @@ -6,33 +6,24 @@ import 'c3/c3.css' import './chart.css' import { - rainbow, bigRainbow + rainbow, bigRainbow, colorTable } from './constants' class SinglePieChart extends Component { state = { keys: [], data: [], - fields: {}, } componentDidMount() { const { payload } = this.props - console.log(payload) - console.log(payload.fields) - const fields = {} - payload.fields.forEach(field => { - const [k, v] = field.split(': ') - fields[k] = v - }) - fetch(payload.url, { mode: 'cors' }) .then(r => r.text()) .then(text => { try { const keys = text.split('\n')[0].split(',').map(s => s.trim().replace(/"/, '')) const data = csv.toJSON(text, { headers: { included: true } }) - this.setState({ keys, data, fields }) + this.setState({ keys, data }) } catch (e) { console.error("error making json:", payload.url) console.error(e) @@ -41,8 +32,9 @@ class SinglePieChart extends Component { } render() { - const { keys, data, fields } = this.state - console.log(keys, data) + const { fields } = this.props.payload + const { keys, data } = this.state + // console.log(keys, data) const [labelField, numberField] = keys if (!data.length) return null @@ -63,6 +55,8 @@ class SinglePieChart extends Component { const height = chartRows.length < 6 ? 316 : chartRows.length < 10 ? 336 : 356 + const pattern = colorTable[fields.Colors] || (chartRows.length < 10 ? rainbow : bigRainbow) + return (
@@ -72,7 +66,7 @@ class SinglePieChart extends Component { type: 'pie', }} color={{ - pattern: chartRows.length < 10 ? rainbow : bigRainbow, + pattern, }} tooltip={{ format: { diff --git a/client/index.js b/client/index.js index 835d859c..9644ba5c 100644 --- a/client/index.js +++ b/client/index.js @@ -73,6 +73,15 @@ function runApplets() { let opt = null payload.cmd = cmd payload.partz = cmdPartz + const fields = {} + if (payload.fields) { + payload.fields.forEach(field => { + const [k, v] = field.split(': ') + fields[k] = v + }) + } + payload.fields = fields + if (payload.cmd === 'load_file' || payload.cmd === 'single_pie_chart') { payload.url = 'https://nyc3.digitaloceanspaces.com/megapixels/v1' + cmdPartz.shift() return [el, payload] diff --git a/client/table/citations.table.js b/client/table/citations.table.js index ef5ab0b5..c1c71906 100644 --- a/client/table/citations.table.js +++ b/client/table/citations.table.js @@ -1,11 +1,9 @@ import React, { Component } from 'react' -import { bindActionCreators } from 'redux' -import { connect } from 'react-redux' import { ReactTabulator } from 'react-tabulator' import { saveAs } from 'file-saver' import { Loader } from '../common' -import { toArray, toTuples, domainFromUrl } from '../util' +import { domainFromUrl } from '../util' export const citationsColumns = [ { title: 'Title', field: 'title', sorter: 'string' }, @@ -111,7 +109,7 @@ class CitationsTable extends Component { value={this.state.q} onChange={e => this.updateFilter(e.target.value)} className='q' - placeholder='Enter text to search citations...' + placeholder='Enter text to search citations' /> this.download()}>Download CSV
diff --git a/client/table/file.table.js b/client/table/file.table.js index c880810a..c195b09d 100644 --- a/client/table/file.table.js +++ b/client/table/file.table.js @@ -1,17 +1,17 @@ import React, { Component } from 'react' -import { bindActionCreators } from 'redux' -import { connect } from 'react-redux' import { ReactTabulator } from 'react-tabulator' import csv from 'parse-csv' -import { toArray, toTuples, domainFromUrl } from '../util' +import { domainFromUrl } from '../util' import { Loader } from '../common' class FileTable extends Component { state = { keys: [], data: [], + filteredData: [], columns: [], + q: '', } componentDidMount() { @@ -25,7 +25,7 @@ class FileTable extends Component { const data = csv.toJSON(text, { headers: { included: true } }) // console.log(data) const columns = this.getColumns(keys, data, payload.fields) - this.setState({ keys, data, columns }) + this.setState({ keys, data, filteredData: data, columns }) } catch (e) { console.error("error making json:", payload.url) console.error(e) @@ -34,10 +34,11 @@ class FileTable extends Component { } getColumns(keys, data, fields) { - let titles = fields.length ? fields[0].split(', ') : keys + let titles = fields.Headings ? 
fields.Headings.split(', ') : keys // let numberFields = [] let columns = keys.map((field, i) => { const title = titles[i] || field + let widthGrow = 1 if (field.match('url')) { let textField = field.replace('url', 'label') data.forEach(el => el[textField] = domainFromUrl(el[field])) @@ -49,31 +50,60 @@ class FileTable extends Component { sorter: 'string' } } + if (title === 'Embassy') { + widthGrow = 2 + } switch (field) { case 'images': case 'year': return { title, field: field.toLowerCase(), sorter: 'number' } default: - return { title, field: field.toLowerCase(), sorter: 'string' } + return { title, field: field.toLowerCase(), sorter: 'string', widthGrow } } }) return columns } + updateFilter(q) { + const { keys, data } = this.state + if (!q.length) { + this.setState({ q, filteredData: data }) + } else { + let qRe = new RegExp('(' + q.replace(/\s+/g, ' ').trim().replace(/ /g, '|') + ')', 'gi') + let filteredData = data.filter(row => keys.some(key => row[key].match(qRe))) + this.setState({ q, filteredData }) + } + } + render() { + const { payload } = this.props + const { q, columns, data, filteredData } = this.state if (!this.state.data.length) { return } + const fn = payload.url.split('/').pop() return ( - + 
+
+ this.updateFilter(e.target.value)} + className='q' + placeholder='Enter text to search data' + /> + Download CSV +
+ +
) } } diff --git a/client/table/tabulator.css b/client/table/tabulator.css index 95768976..0ea81974 100644 --- a/client/table/tabulator.css +++ b/client/table/tabulator.css @@ -40,7 +40,7 @@ max-width: 400px; margin-bottom: 10px; background-image: url(/assets/img/icon-search.png); - background-position: 380px center; + background-position: 378px center; background-repeat: no-repeat; box-shadow: 0px 2px 4px rgba(0,0,0,0.2); border: 0; @@ -53,17 +53,21 @@ align-items: flex-start; justify-content: space-between; } -span.download { +.download { display: block; font-size: 13px; color: #ddd; cursor: pointer; background: #333; - padding: 5px 8px; + padding: 5px 8px 5px 8px; border-radius: 5px; transition: all 0.2s; + border: 0 !important; } -.desktop span.download:hover { +.content a.download { + padding: 5px 8px 5px 8px; +} +.desktop .download:hover { color: #fff; background: #666; } \ No newline at end of file diff --git a/site/content/_drafts_/lfw/index.md b/site/content/_drafts_/lfw/index.md index ad43e2dd..a5d6bd18 100644 --- a/site/content/_drafts_/lfw/index.md +++ b/site/content/_drafts_/lfw/index.md @@ -54,7 +54,7 @@ Add a paragraph about how usage extends far beyond academia into research center ``` load_file assets/lfw_commercial_use.csv -name_display, company_url, example_url, country, description +Headings: name_display, company_url, example_url, country, description ``` diff --git a/site/content/pages/research/_introduction/index.md b/site/content/pages/research/_introduction/index.md index 7e839fe7..bdf1c1b0 100644 --- a/site/content/pages/research/_introduction/index.md +++ b/site/content/pages/research/_introduction/index.md @@ -37,7 +37,7 @@ Add info from the AI Traps talk ``` load_file /site/research/00_introduction/assets/summary_countries_top.csv -country, Xcitations +Headings: country, Xcitations ``` Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. 
Paragraph text to test css formatting. Paragraph text to test css formatting. diff --git a/site/content/pages/research/munich_security_conference/index.md b/site/content/pages/research/munich_security_conference/index.md index 6a1b84e9..3c4d7f08 100644 --- a/site/content/pages/research/munich_security_conference/index.md +++ b/site/content/pages/research/munich_security_conference/index.md @@ -115,6 +115,7 @@ single_pie_chart /site/research/munich_security_conference/assets/megapixels_ori Caption: Sources of Face Training Data Top: 5 OtherLabel: Other Countries +Colors: categoryRainbow ``` =========== @@ -124,6 +125,7 @@ single_pie_chart /site/research/munich_security_conference/assets/embassy_counts Caption: Dataset sources Top: 4 OtherLabel: Other +Colors: categoryRainbow ``` === end columns @@ -134,7 +136,7 @@ OtherLabel: Other ``` load_file /site/research/munich_security_conference/assets/embassy_counts_public.csv -Images, Dataset, Embassy, Flickr ID, URL, Guest, Host +Headings: Images, Dataset, Embassy, Flickr ID, URL, Guest, Host ``` diff --git a/site/content/pages/test/csv.md b/site/content/pages/test/csv.md index 85f714b4..ef3327f8 100644 --- a/site/content/pages/test/csv.md +++ b/site/content/pages/test/csv.md @@ -16,5 +16,5 @@ authors: Megapixels ``` load_file /site/test/assets/test.csv -Name, Images, Year, Gender, Description, URL +Headings: Name, Images, Year, Gender, Description, URL ``` -- cgit v1.2.3-70-g09d2 From cd529774334bf719bc9ba086f54b4521d2b03700 Mon Sep 17 00:00:00 2001 From: Jules Laplace Date: Fri, 28 Jun 2019 02:18:54 -0400 Subject: js --- client/common/loader.component.js | 1 + site/content/pages/research/munich_security_conference/index.md | 2 -- 2 files changed, 1 insertion(+), 2 deletions(-) (limited to 'site/content') diff --git a/client/common/loader.component.js b/client/common/loader.component.js index 98ba5c1b..b24b31b1 100644 --- a/client/common/loader.component.js +++ b/client/common/loader.component.js @@ -5,6 +5,7 @@ export 
default function Loader() { const spinCfg = { width: 5, radius: 20, + speed: 1, color: 'white', } return ( diff --git a/site/content/pages/research/munich_security_conference/index.md b/site/content/pages/research/munich_security_conference/index.md index 3c4d7f08..520ea950 100644 --- a/site/content/pages/research/munich_security_conference/index.md +++ b/site/content/pages/research/munich_security_conference/index.md @@ -132,8 +132,6 @@ Colors: categoryRainbow {% include 'supplementary_header.html' %} -[ add a download button for CSV data ] - ``` load_file /site/research/munich_security_conference/assets/embassy_counts_public.csv Headings: Images, Dataset, Embassy, Flickr ID, URL, Guest, Host -- cgit v1.2.3-70-g09d2 From 448cf8530f0479cc7f45c34056d24118b09331ad Mon Sep 17 00:00:00 2001 From: Jules Laplace Date: Fri, 28 Jun 2019 02:41:58 -0400 Subject: date --- site/content/pages/research/munich_security_conference/index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'site/content') diff --git a/site/content/pages/research/munich_security_conference/index.md b/site/content/pages/research/munich_security_conference/index.md index 520ea950..2a97a7c0 100644 --- a/site/content/pages/research/munich_security_conference/index.md +++ b/site/content/pages/research/munich_security_conference/index.md @@ -7,8 +7,8 @@ desc: Analyzing the Transnational Flow of Facial Recognition Training Data subdesc: Where does face data originate and who's using it? 
cssclass: dataset image: assets/background.jpg -published: 2019-4-18 -updated: 2019-4-19 +published: 2019-6-28 +updated: 2019-6-29 authors: Adam Harvey ------------ -- cgit v1.2.3-70-g09d2 From e59b5e38a6dfcb61375686ec83a4606f50ab012d Mon Sep 17 00:00:00 2001 From: Adam Harvey Date: Fri, 28 Jun 2019 18:35:49 +0200 Subject: msc ready v1 --- .../assets/embassy_counts_summary_dataset.csv | 8 +- .../assets/megapixels_origins_top.csv | 15 ++- .../research/munich_security_conference/index.md | 122 +++++++++------------ site/public/about/assets/LICENSE/index.html | 2 +- site/public/about/attribution/index.html | 2 +- site/public/about/index.html | 2 +- site/public/about/legal/index.html | 2 +- site/public/about/news/index.html | 2 +- site/public/datasets/brainwash/index.html | 2 +- site/public/datasets/duke_mtmc/index.html | 2 +- site/public/datasets/helen/index.html | 2 +- site/public/datasets/hrt_transgender/index.html | 2 +- site/public/datasets/ibm_dif/index.html | 2 +- site/public/datasets/ijb_c/index.html | 2 +- site/public/datasets/index.html | 2 +- site/public/datasets/megaface/index.html | 2 +- .../datasets/msceleb/assets/notes/index.html | 2 +- site/public/datasets/msceleb/index.html | 6 +- site/public/datasets/oxford_town_centre/index.html | 2 +- site/public/datasets/uccs/assets/notes/index.html | 2 +- site/public/datasets/uccs/index.html | 2 +- site/public/datasets/who_goes_there/index.html | 2 +- site/public/index.html | 2 +- site/public/info/index.html | 2 +- .../research/_from_1_to_100_pixels/index.html | 20 +--- site/public/research/_introduction/index.html | 22 +--- .../research/_what_computers_can_see/index.html | 20 +--- site/public/research/index.html | 13 ++- .../research/munich_security_conference/index.html | 55 +++++----- site/public/test/chart/index.html | 2 +- site/public/test/citations/index.html | 2 +- site/public/test/csv/index.html | 4 +- site/public/test/datasets/index.html | 2 +- site/public/test/face_search/index.html | 2 +- 
site/public/test/gallery/index.html | 2 +- site/public/test/index.html | 2 +- site/public/test/map/index.html | 2 +- site/public/test/name_search/index.html | 2 +- site/public/test/pie_chart/index.html | 2 +- site/templates/home.html | 2 +- site/templates/layout.html | 2 +- 41 files changed, 151 insertions(+), 196 deletions(-) (limited to 'site/content') diff --git a/site/content/pages/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv b/site/content/pages/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv index 89f3c226..3a439821 100755 --- a/site/content/pages/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv +++ b/site/content/pages/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv @@ -1,5 +1,5 @@ dataset,images -ibm_dif,389 -megaface,5679 -vgg_face,1 -who_goes_there,2372 +IBM Diversity in Faces,389 +MegaFace,5679 +VGG Face,1 +Who Goes There,2372 \ No newline at end of file diff --git a/site/content/pages/research/munich_security_conference/assets/megapixels_origins_top.csv b/site/content/pages/research/munich_security_conference/assets/megapixels_origins_top.csv index 081b4636..ae6e8f11 100755 --- a/site/content/pages/research/munich_security_conference/assets/megapixels_origins_top.csv +++ b/site/content/pages/research/munich_security_conference/assets/megapixels_origins_top.csv @@ -1,9 +1,8 @@ source,images -Search Engines,30127200 -Flickr.com,11783888 -IMDb.com,5251410 -CCTV,959312 -Wikimedia.org,183500 -Mugshots,113268 -YouTube.com,31888 -Other Sources Combined,37044 +Internet Search Engines,15063600 +Flickr.com,5891944 +Internet Movie Database (IMDB.com),2625705 +CCTV,479656 +Wikimedia.org,91750 +Mugshots,56634 +YouTube.com,15944 \ No newline at end of file diff --git a/site/content/pages/research/munich_security_conference/index.md b/site/content/pages/research/munich_security_conference/index.md index 2a97a7c0..0f8a5bda 100644 --- 
a/site/content/pages/research/munich_security_conference/index.md +++ b/site/content/pages/research/munich_security_conference/index.md @@ -3,7 +3,7 @@ status: published title: MSC slug: munich-security-conference -desc: Analyzing the Transnational Flow of Facial Recognition Training Data +desc: Analyzing Transnational Flows of Face Recognition Image Training Data subdesc: Where does face data originate and who's using it? cssclass: dataset image: assets/background.jpg @@ -13,7 +13,7 @@ authors: Adam Harvey ------------ -## Analysis for the Munich Security Conference Transnational Security Report +## Face Datasets and Information Supply Chains ### sidebar @@ -21,21 +21,30 @@ authors: Adam Harvey + Datasets Analyzed: 30 + Years: 2006 - 2018 + Status: Ongoing Investigation -+ Last Updated: June 27, 2019 ++ Last Updated: June 28, 2019 ### end sidebar +National AI strategies often rely on transnational data sources to capitalize on recent advancements in deep learning and neural networks. Researchers benefiting from these transnational data flows can yield quick and significant gains across diverse sectors from health care to biometrics. But new challenges emerge when national AI strategies collide with national interests. -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum." 
+Our earlier research on the [MS Celeb](/datsets) and [Duke](/datsets/duke_mtmc) datasets published with the Financial Times revealed that several computer vision image datasets created by US companies and universities were unexpectedly also used for research by the National University of Defense Technology in China, along with top Chinese surveillance firms including SenseTime, SenseNets, CloudWalk, Hikvision, and Megvii/Face++ which have all been linked to the oppressive surveillance of Uighur Muslims in Xinjiang. + +In this new research for the Munich Security Conference's Transnational Security Report we provide summary statistics about the origins and endpoints of facial recognition information supply chains. To make it more personal, we gathered additional data on the number of public photos from Embassies that are currently being used in facial recognition datasets. + + +### 24 Million Non-Cooperative Faces + +In total, we analyzed 30 publicly available face recognition and face analysis datasets that collectively include over 24 million non-cooperative images. Of these 24 million images, over 15 million face images are from Internet search engines, over 5.8 million from Flickr.com, over 2.5 million from the Internet Movie Database (IMDb.com), and nearly 500,000 from CCTV footage. All 24 million images were collected without any explicit consent, a type of face image researchers call "in the wild". + +Next we manually verified 1,134 publicly available research papers that cite these datasets to determine who was using the data and where it was being used. Even though all of the images originated in the United States, the publicly available research citations show that only about 25% of citations are from the country of origin while the majority of citations are from China. -Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. 
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum." === columns 2 ``` single_pie_chart /site/research/munich_security_conference/assets/megapixels_origins_top.csv -Caption: Sources of Publicly Available Face Training Data 2006 - 2018 +Caption: Sources of Publicly Available Non-Cooperative Face Image Training Data 2006 - 2018 Top: 10 OtherLabel: Other ``` @@ -44,85 +53,32 @@ OtherLabel: Other ``` single_pie_chart /site/research/munich_security_conference/assets/summary_countries.csv -Caption: Locations Where Face Data Is Used +Caption: Locations Where Face Data Is Used Based on Public Research Citations Top: 14 OtherLabel: Other ``` === end columns +### 6,000 Embassy Photos Being Used To Train Facial Recognition -=== columns 2 - -#### Sources of Face Data - -Add text - -| Source | Images | -| --- | --- | -|Search Engines | 30,127,200 | -|Flickr.com | 11,783,888 | -|IMDb.com | 5,251,410 | -|CCTV | 959,312 | -|Wikimedia.org | 183,500 | -|Mugshots | 113,268 | -|Other Sources Combined | 37,044 | -|YouTube.com | 31,888 | - -=== - -#### Where Face Data Is Used - -Add text - -|country | citations| -| --- | --- | -|China | 327| -|United States | 302| -|United Kingdom | 187| -|Australia | 38| -|Germany | 35| -|Singapore | 27| -|Canada | 25| -|Netherlands | 25| -|Italy | 22| -|France | 17| -|India | 14| -|South Korea | 12| -|Spain | 10| -|Switzerland | 9| - -=== end columns - - - -## Over 6,000 Embassy Images on Flickr Found in Face Recognition Datasets - -Including over 2,000 more for racial analysis - - -![caption: MegaFace from U.S. 
Embassy Canberra](assets/4730007024.jpg) - - -![caption: An image from the MegaFace dataset obtained from United Kingdom's Embassy in Italy https://flickr.com/photos/ukinitaly](assets/4606260362.jpg) -![caption: An image from the MegaFace dataset obtained from the Flickr account of the United States Embassy in Kabul, Afghanistan https://flickr.com/photos/kabulpublicdiplomacy](assets/4749096858.jpg) - +Of the 5.8 million Flickr images, we found that over 6,000 public photos from Embassy Flickr accounts were used to train facial recognition technologies. These images were used in the MegaFace and IBM Diversity in Faces datasets. Over 2,000 more images were used in the Who Goes There dataset, which is used for facial ethnicity analysis research. A few of the embassy images found in facial recognition datasets are shown below. === columns 2 ``` -single_pie_chart /site/research/munich_security_conference/assets/megapixels_origins_top.csv -Caption: Sources of Face Training Data -Top: 5 -OtherLabel: Other Countries +single_pie_chart /site/research/munich_security_conference/assets/country_counts.csv +Caption: Photos from these embassies are being used to train face recognition software +Top: 4 +OtherLabel: Other Colors: categoryRainbow ``` -=========== +===== ``` single_pie_chart /site/research/munich_security_conference/assets/embassy_counts_summary_dataset.csv -Caption: Dataset sources +Caption: Embassy images were found in these datasets Top: 4 OtherLabel: Other Colors: categoryRainbow @@ -130,6 +86,29 @@ Colors: categoryRainbow === end columns +![caption: An image in the MegaFace dataset obtained from United Kingdom's Embassy in Italy](assets/4606260362.jpg) +![caption: An image in the MegaFace dataset obtained from the Flickr account of the United States Embassy in Kabul, Afghanistan](assets/4749096858.jpg) + +![caption: An image in the MegaFace dataset obtained from U.S. 
Embassy Canberra](assets/4730007024.jpg) + + +This brief research aims to shed light on the emerging politics of data. A photo is no longer just a photo when it can also be surveillance training data, and datasets can no longer be separated from the development of software when software is now built with data. "Our relationship to computers has changed", says Geoffrey Hinton, one of the founders of modern-day neural networks and deep learning. "Instead of programming them, we now show them and they figure it out."[^hinton] + +National AI strategies might also want to include transnational dataset strategies. + +*This research post is ongoing and will be updated during July and August, 2019.* + +### Further Reading + +- [MS Celeb Dataset Analysis](/datasets/msceleb) +- [Brainwash Dataset Analysis](/datasets/brainwash) +- [Duke MTMC Dataset Analysis](/datasets/duke_mtmc) +- [Unconstrained College Students Dataset Analysis](/datasets/uccs) +- [Duke MTMC dataset author apologizes to students](https://www.dukechronicle.com/article/2019/06/duke-university-facial-recognition-data-set-study-surveillance-video-students-china-uyghur) +- [BBC coverage of MS Celeb dataset takedown](https://www.bbc.com/news/technology-48555149) +- [Spiegel coverage of MS Celeb dataset takedown](https://www.spiegel.de/netzwelt/web/microsoft-gesichtserkennung-datenbank-mit-zehn-millionen-fotos-geloescht-a-1271221.html) + + {% include 'supplementary_header.html' %} ``` @@ -137,5 +116,10 @@ load_file /site/research/munich_security_conference/assets/embassy_counts_public Headings: Images, Dataset, Embassy, Flickr ID, URL, Guest, Host ``` +{% include 'cite_our_work.html' %} + +### Footnotes + +[^hinton]: "Heroes of Deep Learning: Andrew Ng interviews Geoffrey Hinton". Published on Aug 8, 2017. 
+ -{% include 'cite_our_work.html' %} \ No newline at end of file diff --git a/site/public/about/assets/LICENSE/index.html b/site/public/about/assets/LICENSE/index.html index f1e3a9fd..40929e4f 100644 --- a/site/public/about/assets/LICENSE/index.html +++ b/site/public/about/assets/LICENSE/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/about/attribution/index.html b/site/public/about/attribution/index.html index 15270150..4e7474b0 100644 --- a/site/public/about/attribution/index.html +++ b/site/public/about/attribution/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/about/index.html b/site/public/about/index.html index 16a2e967..a46653c6 100644 --- a/site/public/about/index.html +++ b/site/public/about/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/about/legal/index.html b/site/public/about/legal/index.html index 49ed926d..8beafeea 100644 --- a/site/public/about/legal/index.html +++ b/site/public/about/legal/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/about/news/index.html b/site/public/about/news/index.html index fcba7877..de44468e 100644 --- a/site/public/about/news/index.html +++ b/site/public/about/news/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html index 3dacd6e1..18600b6f 100644 --- a/site/public/datasets/brainwash/index.html +++ b/site/public/datasets/brainwash/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/duke_mtmc/index.html b/site/public/datasets/duke_mtmc/index.html index 9a70a3f6..fc141450 100644 --- a/site/public/datasets/duke_mtmc/index.html +++ b/site/public/datasets/duke_mtmc/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/helen/index.html b/site/public/datasets/helen/index.html index a7ada42a..44ef462e 100644 --- a/site/public/datasets/helen/index.html +++ b/site/public/datasets/helen/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/hrt_transgender/index.html b/site/public/datasets/hrt_transgender/index.html index 02324a2f..2e5e9c62 100644 --- a/site/public/datasets/hrt_transgender/index.html +++ b/site/public/datasets/hrt_transgender/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/ibm_dif/index.html b/site/public/datasets/ibm_dif/index.html index 1c465f93..be5dbfe4 100644 --- a/site/public/datasets/ibm_dif/index.html +++ b/site/public/datasets/ibm_dif/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/ijb_c/index.html b/site/public/datasets/ijb_c/index.html index a36fac14..abe7d5ed 100644 --- a/site/public/datasets/ijb_c/index.html +++ b/site/public/datasets/ijb_c/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html index 1fb83352..a634b877 100644 --- a/site/public/datasets/index.html +++ b/site/public/datasets/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/megaface/index.html b/site/public/datasets/megaface/index.html index 33abf6c1..712af28a 100644 --- a/site/public/datasets/megaface/index.html +++ b/site/public/datasets/megaface/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/msceleb/assets/notes/index.html b/site/public/datasets/msceleb/assets/notes/index.html index cac21eef..36c32429 100644 --- a/site/public/datasets/msceleb/assets/notes/index.html +++ b/site/public/datasets/msceleb/assets/notes/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/msceleb/index.html b/site/public/datasets/msceleb/index.html index 7109cc9b..42a44571 100644 --- a/site/public/datasets/msceleb/index.html +++ b/site/public/datasets/msceleb/index.html @@ -50,7 +50,7 @@
@@ -212,8 +212,8 @@

Despite the recent termination of the msceleb.org website, the dataset still exists in several repositories on GitHub, the hard drives of countless researchers, and will likely continue to be used in research projects around the world.

For example, on October 28, 2019, the MS Celeb dataset will be used for a new competition called "Lightweight Face Recognition Challenge & Workshop" where the best face recognition entries will be awarded $5,000 from Huawei and $3,000 from DeepGlint. The competition is part of the ICCV 2019 conference. This time the challenge is no longer being organized by Microsoft, which created the dataset, but instead by Imperial College London (UK) and InsightFace (CN). The organizers provide a 25GB archive of cropped faces from MS Celeb (in .rec format) for anyone to download.

And in June, shortly after posting about the disappearance of the MS Celeb dataset, it reemerged on Academic Torrents. As of June 10, the MS Celeb dataset files have been redistributed in at least 9 countries and downloaded 44 times without any restrictions. The files were seeded and are mostly distributed by an AI company based in China called Hyper.ai, which states that it redistributes MS Celeb and other datasets for "teachers and students of service industry-related practitioners and research institutes." 6

-

Earlier in 2019 images from the MS Celeb were also repackaged into another face dataset called Racial Faces in the Wild (RFW). To create it, the RFW authors uploaded face images from the MS Celeb dataset to the Face++ API and used the inferred racial scores to segregate people into four subsets: Caucasian, Asian, Indian, and African each with 3,000 subjects. That dataset then appeared in a subsequent research project from researchers affiliated with IIIT-Delhi and IBM TJ Watson called Deep Learning for Face Recognition: Pride or Prejudiced?, which aims to reduce bias but also inadvertently furthers racist language and ideologies that can not be repeated here.

-

The estimated racial scores for the MS Celeb face images used in the RFW dataset were computed using the Face++ API, which is owned by Megvii Inc, a company that has been repeatedly linked to the oppressive surveillance of Uighur Muslims in Xinjiang, China. According to posts from the ChinAI Newsletter and BuzzFeedNews, Megvii announced in 2017 at the China-Eurasia Security Expo in Ürümqi, Xinjiang, that it would be the official technical support unit of the "Public Security Video Laboratory" in Xinjiang, China. If they didn't already, it's highly likely that Megvii has a copy of everyone's biometric faceprint from the MS Celeb dataset, either from uploads to the Face++ API or through the research projects explicitly referencing MS Celeb dataset usage, such as a 2018 paper called GridFace: Face Rectification via Learning Local Homography Transformations jointly published by 3 authors, all of whom worked for Megvii.

+

Earlier in 2019 images from the MS Celeb were also repackaged into another face dataset called Racial Faces in the Wild (RFW). To create it, the RFW authors uploaded face images from the MS Celeb dataset to the Face++ API and used the inferred racial scores to segregate people into four subsets: Caucasian, Asian, Indian, and African each with 3,000 subjects. That dataset then appeared in a subsequent research project from researchers affiliated with IIIT-Delhi and IBM TJ Watson called Deep Learning for Face Recognition: Pride or Prejudiced?, which aims to reduce bias but also inadvertently furthers racist ideologies, using discredited racial terminology that cannot be repeated here.

+

The estimated racial scores for the MS Celeb face images used in the RFW dataset were computed using the Face++ API, which is owned by Megvii Inc, a company that has been repeatedly linked to the oppressive surveillance of Uighur Muslims in Xinjiang, China. According to posts from the ChinAI Newsletter and BuzzFeedNews, Megvii announced in 2017 at the China-Eurasia Security Expo in Ürümqi, Xinjiang, that it would be the official technical support unit of the "Public Security Video Laboratory" in Xinjiang, China. If they didn't already, it's highly likely that Megvii has a copy of everyone's biometric faceprint from the MS Celeb dataset, either from uploads to the Face++ API or through research projects explicitly referencing MS Celeb dataset usage, such as a 2018 paper called GridFace: Face Rectification via Learning Local Homography Transformations jointly published by 3 authors, all of whom worked for Megvii.

Commercial Usage

Microsoft's MS Celeb website says it was created for "non-commercial research purpose only." Publicly available research citations and competitions show otherwise.

In 2017 Microsoft Research organized a face recognition competition at the International Conference on Computer Vision (ICCV), one of the top two computer vision conferences worldwide, where industry and academia used the MS Celeb dataset to compete for the highest performance scores. The 2017 winner was Beijing-based OrionStar Technology Co., Ltd. In their press release, OrionStar boasted a 13% increase on the difficult set over last year's winner. The prior year's competitors included Beijing-based Faceall Technology Co., Ltd., a company providing face recognition for "smart city" applications.

diff --git a/site/public/datasets/oxford_town_centre/index.html b/site/public/datasets/oxford_town_centre/index.html index 40f8bbc6..11fb436f 100644 --- a/site/public/datasets/oxford_town_centre/index.html +++ b/site/public/datasets/oxford_town_centre/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/uccs/assets/notes/index.html b/site/public/datasets/uccs/assets/notes/index.html index c8daf796..ce36f3d9 100644 --- a/site/public/datasets/uccs/assets/notes/index.html +++ b/site/public/datasets/uccs/assets/notes/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/uccs/index.html b/site/public/datasets/uccs/index.html index 96ab1e09..2dcf88a1 100644 --- a/site/public/datasets/uccs/index.html +++ b/site/public/datasets/uccs/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/datasets/who_goes_there/index.html b/site/public/datasets/who_goes_there/index.html index 3db77ff7..a00fd151 100644 --- a/site/public/datasets/who_goes_there/index.html +++ b/site/public/datasets/who_goes_there/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/index.html b/site/public/index.html index e5a6cd62..98b780b2 100644 --- a/site/public/index.html +++ b/site/public/index.html @@ -49,7 +49,7 @@
diff --git a/site/public/info/index.html b/site/public/info/index.html index f6280e58..51b4e5f8 100644 --- a/site/public/info/index.html +++ b/site/public/info/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/research/_from_1_to_100_pixels/index.html b/site/public/research/_from_1_to_100_pixels/index.html index 74f334cc..a978b264 100644 --- a/site/public/research/_from_1_to_100_pixels/index.html +++ b/site/public/research/_from_1_to_100_pixels/index.html @@ -50,27 +50,13 @@
-
-

From 1 to 100 Pixels

-
-
-
Posted
-
2018-12-04
-
-
-
By
-
Adam Harvey
-
- -
-
- -

High resolution insights from low resolution data

+

From 1 to 100 Pixels

+

High resolution insights from low resolution data

This post will be about the meaning of "face". How do people define it? How do biometrics researchers define it? How has it changed during the last decade?

What can you know from a very small amount of information?

    diff --git a/site/public/research/_introduction/index.html b/site/public/research/_introduction/index.html index 66905247..8b17c016 100644 --- a/site/public/research/_introduction/index.html +++ b/site/public/research/_introduction/index.html @@ -50,27 +50,13 @@
    -
    -

    Introducing MegaPixels

    -
    -
    -
    Posted
    -
    2018-12-15
    -
    -
    -
    By
    -
    Adam Harvey
    -
    - -
    -
    - -

    Face recognition has become the focal point for ...

    +

    Introduction

    +

    Face recognition has become the focal point for ...

    Add 68pt landmarks animation

    But biometric currency is ...

    Add rotation 3D head

    @@ -82,7 +68,7 @@
  • Posted: Dec. 15
  • Author: Adam Harvey
-

Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting.

+

Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting. Paragraph text to test css formatting.

[ page under development ]

This is the caption
diff --git a/site/public/research/_what_computers_can_see/index.html b/site/public/research/_what_computers_can_see/index.html index 003dd733..35f6d47d 100644 --- a/site/public/research/_what_computers_can_see/index.html +++ b/site/public/research/_what_computers_can_see/index.html @@ -50,27 +50,13 @@
-
-

What Computers Can See

-
-
-
Posted
-
2018-12-15
-
-
-
By
-
Adam Harvey
-
- -
-
- -

Rosalind Picard on Affective Computing Podcast with Lex Fridman

+

What Computers Can See About Your Face

+

Rosalind Picard on Affective Computing Podcast with Lex Fridman

diff --git a/site/public/research/munich_security_conference/index.html b/site/public/research/munich_security_conference/index.html index 499d8e9f..b0503f84 100644 --- a/site/public/research/munich_security_conference/index.html +++ b/site/public/research/munich_security_conference/index.html @@ -4,7 +4,7 @@ MegaPixels: MSC - + @@ -50,31 +50,36 @@
-
-

MSC

-
-
-
Posted
-
2019-4-18
-
-
-
By
-
Adam Harvey
-
- -
-
- -
Analyzing the Transnational Flow of Facial Recognition Data
Where does face data originate and who's using it? -

[page under devlopment]

-

Intro paragraph.

-

[ add montage of extracted faces here]

-
Placeholder caption
Placeholder caption
Placeholder caption
Placeholder caption
+
Analyzing Transnational Flows of Face Recognition Image Training Data
Where does face data originate and who's using it? +

Face Datasets and Information Supply Chains

+

National AI strategies often rely on transnational data sources to capitalize on recent advancements in deep learning and neural networks. Researchers benefiting from these transnational data flows can yield quick and significant gains across diverse sectors from health care to biometrics. But new challenges emerge when national AI strategies collide with national interests.

+

Our earlier research on the MS Celeb and Duke datasets, published with the Financial Times, revealed that several computer vision image datasets created by US companies and universities were unexpectedly also used for research by the National University of Defense Technology in China, along with top Chinese surveillance firms including SenseTime, SenseNets, CloudWalk, Hikvision, and Megvii/Face++, which have all been linked to the oppressive surveillance of Uighur Muslims in Xinjiang.

+

In this new research for the Munich Security Conference's Transnational Security Report we provide summary statistics about the origins and endpoints of facial recognition information supply chains. To make it more personal, we gathered additional data on the number of public photos from Embassies that are currently being used in facial recognition datasets.

+

24 Million Non-Cooperative Faces

+

In total, we analyzed 30 publicly available face recognition and face analysis datasets that collectively include over 24 million non-cooperative images. Of these 24 million images, over 15 million face images are from Internet search engines, over 5.8 million from Flickr.com, over 2.5 million from the Internet Movie Database (IMDb.com), and nearly 500,000 from CCTV footage. All 24 million images were collected without any explicit consent, a type of face image researchers call "in the wild".

+

Next we manually verified 1,134 publicly available research papers that cite these datasets to determine who was using the data and where it was being used. Even though all of the images originated in the United States, the publicly available research citations show that only about 25% of citations are from the country of origin, while the majority of citations are from China.

+

6,000 Embassy Photos Being Used To Train Facial Recognition

+

Of the 5.8 million Flickr images, we found that over 6,000 public photos from Embassy Flickr accounts were used to train facial recognition technologies. These images were used in the MegaFace and IBM Diversity in Faces datasets. Over 2,000 more images were used in the Who Goes There dataset, which is used for facial ethnicity analysis research. A few of the embassy images found in facial recognition datasets are shown below.

+
An image in the MegaFace dataset obtained from United Kingdom's Embassy in Italy
+
An image in the MegaFace dataset obtained from the Flickr account of the United States Embassy in Kabul, Afghanistan
An image in the MegaFace dataset obtained from U.S. Embassy Canberra

This brief research aims to shed light on the emerging politics of data. A photo is no longer just a photo when it can also be surveillance training data, and datasets can no longer be separated from the development of software when software is now built with data. "Our relationship to computers has changed," says Geoffrey Hinton, one of the founders of modern-day neural networks and deep learning. "Instead of programming them, we now show them and they figure it out." 1

+

National AI strategies might also want to include transnational dataset strategies.

+

This research post is ongoing and will be updated during July and August 2019.

+

Further Reading

+ +
@@ -83,8 +88,7 @@

Supplementary Information

-

[ add a download button for CSV data ]

-
+

Cite Our Work

@@ -101,7 +105,8 @@ }

-
+

References

diff --git a/site/public/test/chart/index.html b/site/public/test/chart/index.html index 33fafb48..e3134df9 100644 --- a/site/public/test/chart/index.html +++ b/site/public/test/chart/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/test/citations/index.html b/site/public/test/citations/index.html index a5fbcc76..3c630adc 100644 --- a/site/public/test/citations/index.html +++ b/site/public/test/citations/index.html @@ -50,7 +50,7 @@
diff --git a/site/public/test/csv/index.html b/site/public/test/csv/index.html index d3ca0953..f1204c90 100644 --- a/site/public/test/csv/index.html +++ b/site/public/test/csv/index.html @@ -50,14 +50,14 @@