Diffstat (limited to 'site/content/pages/research/00_introduction/index.md')
-rw-r--r-- site/content/pages/research/00_introduction/index.md | 20
1 files changed, 18 insertions, 2 deletions
diff --git a/site/content/pages/research/00_introduction/index.md b/site/content/pages/research/00_introduction/index.md
index 6fec7ab5..bcb3d57c 100644
--- a/site/content/pages/research/00_introduction/index.md
+++ b/site/content/pages/research/00_introduction/index.md
@@ -1,6 +1,6 @@
------------
-status: published
+status: draft
title: 00: Introduction
desc: Introduction to Megapixels
slug: 00_introduction
@@ -16,6 +16,22 @@ authors: Megapixels
+ Author: Adam Harvey
+
+### Motivation
+
+Ever since government agencies began developing face recognition in the early 1960s, datasets of face images have been central to developing and validating face recognition technologies. Today, these datasets no longer originate in labs. Instead they come from family photo albums posted on photo sharing sites, surveillance camera footage from college campuses, search engine queries for celebrities, cafe livestreams, or <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">videos on YouTube</a>.
+
+During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that powers the facial recognition industry.
+
+While many of these datasets include public figures such as politicians, athletes, and actors, they also include many non-public figures. Digital activists, students, pedestrians, and semi-private shared photo albums are all considered "in the wild" and fair game for research projects. Some images are used under Creative Commons licenses, yet others were taken in unconstrained scenarios without awareness or consent. At first glance it appears many of the datasets were created for seemingly harmless academic research, but when examined further it becomes clear that they're also used by foreign defense agencies.
+
+The MegaPixels site is based on an earlier [installation](https://ahprojects.com/megapixels-glassroom) (also supported by Mozilla) at the [Tactical Tech Glassroom](https://theglassroom.org/) in London in 2017, a commission from the Elevate arts festival curated by Berit Gilma about pedestrian recognition datasets in 2018, and research during [CV Dazzle](https://cvdazzle.com) from 2010-2015. Through the many prototypes, conversations, pitches, PDFs, and false starts this project has endured during the last 5 years, it eventually evolved into something very different from what was originally imagined. Now, as datasets become increasingly influential in shaping the computational future, it's clear that they must be critically analyzed to understand their biases, shortcomings, funding sources, and contributions to the surveillance industry. However, it's misguided to only criticize these datasets for their flaws without also praising their contribution to society. Without publicly available facial analysis datasets there would be less public discourse, less open-source software, and less peer-reviewed research. Public datasets can indeed become a vital public good for the information economy, but as this project aims to illustrate, many ethical questions arise about consent, intellectual property, surveillance, and privacy.
+
+<!-- who provided funding to research, development this project understand the role these datasets have played in creating biometric surveillance technologies. -->
+
+
+
+
Ever since the first computational facial recognition research project by the CIA in the early 1960s, data has always played a vital role in the development of our biometric future. Without facial recognition datasets there would be no facial recognition. Datasets are an indispensable part of any artificial intelligence system because, as Geoffrey Hinton points out:
> Our relationship to computers has changed. Instead of programming them, we now show them and they figure it out. - [Geoffrey Hinton](https://www.youtube.com/watch?v=-eyhCTvrEtE)
@@ -26,7 +42,7 @@ Algorithms learn from datasets. And we program algorithms by building datasets.
Ignore content below these lines
-----
-
+
It was the early 2000s. Face recognition was new and no one seemed sure exactly how well it was going to perform in practice. In theory, face recognition was poised to be a game changer, a force multiplier, a strategic military advantage, a way to make cities safer and to secure borders. This was the future John Ashcroft demanded with the Total Information Awareness act of 2003 and that spooks had dreamed of for decades. It was a future that academics at Carnegie Mellon University and Colorado State University would help build. It was also a future that celebrities would play a significant role in building. And to the surprise of ordinary Internet users like myself and perhaps you, it was a future that millions of Internet users would unwittingly play a role in creating.