-rw-r--r-- site/content/pages/about/index.md | 2
-rw-r--r-- site/content/pages/datasets/brainwash/index.md | 4
-rw-r--r-- site/content/pages/datasets/lfw/index.md | 4
-rw-r--r-- site/content/pages/datasets/mars/index.md | 4
-rw-r--r-- site/public/about/index.html | 2
-rw-r--r-- site/public/datasets/brainwash/index.html | 4
-rw-r--r-- site/public/datasets/lfw/index.html | 4
-rw-r--r-- site/public/datasets/mars/index.html | 4
8 files changed, 14 insertions, 14 deletions
diff --git a/site/content/pages/about/index.md b/site/content/pages/about/index.md
index deb4c0e7..9c66fbc4 100644
--- a/site/content/pages/about/index.md
+++ b/site/content/pages/about/index.md
@@ -27,7 +27,7 @@ authors: Adam Harvey
<p><div style="font-size:20px;line-height:36px">Ever since government agencies began developing face recognition in the early 1960's, datasets of face images have always been central to the development and evaluation face recognition technology. Today, these datasets no longer originate in labs, but instead from family photo albums posted on social media sites, CCTV camera footage from college campuses, search engine queries for celebrities, cafe livestreams, or <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">videos on YouTube</a>. </div></p>
-While many of these datasets include public figures such as politicans, athletes, and actors; they also include many non-public figures including digital activists, students, pedestrians, and people's semi-private shared photo albums. Some images are used with creative commons licenses, yet others were taken in unconstrained scenarios without awareness or consent. At first glance it appears many of the datasets were created for seemingly harmless academic research studies, but when examined further it becomes clear that they're also used by defense contractors in foreign countries.
+While many of these datasets include public figures such as politicians, athletes, and actors; they also include many non-public figures including digital activists, students, pedestrians, and people's semi-private shared photo albums. Some images are used with creative commons licenses, yet others were taken in unconstrained scenarios without awareness or consent. At first glance it appears many of the datasets were created for seemingly harmless academic research, but when examined further it becomes clear that they're also used by foreign defense agencies.
During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that is helping to power the global facial recognition industry.
diff --git a/site/content/pages/datasets/brainwash/index.md b/site/content/pages/datasets/brainwash/index.md
index 4812e55d..a01f5bf4 100644
--- a/site/content/pages/datasets/brainwash/index.md
+++ b/site/content/pages/datasets/brainwash/index.md
@@ -2,8 +2,8 @@
status: published
title: Brainwash
-desc: Brainwash is a dataset of people from webcams the Brainwash Cafe in San Francisco being used to train face detection algorithms
-subdesc: Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe"
+desc: Brainwash is a dataset of webcam images from the Brainwash Cafe in San Francisco
+subdesc: The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection algorithms
slug: brainwash
cssclass: dataset
color: #ffaa00
diff --git a/site/content/pages/datasets/lfw/index.md b/site/content/pages/datasets/lfw/index.md
index b07c0e4b..b803efc5 100644
--- a/site/content/pages/datasets/lfw/index.md
+++ b/site/content/pages/datasets/lfw/index.md
@@ -2,8 +2,8 @@
status: published
title: Labeled Faces in The Wild
-desc: Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition.
-subdesc: It includes 13,456 images of 4,432 people's images copied from the Internet during 2002-2004.
+desc: Labeled Faces in The Wild (LFW) is the first facial recognition dataset created entirely from online photos
+subdesc: It includes 13,456 images of 4,432 people's images copied from the Internet during 2002-2004 and is the most frequently used dataset in the world for benchmarking face recognition algorithms.
image: assets/background.jpg
slug: lfw
year: 2007
diff --git a/site/content/pages/datasets/mars/index.md b/site/content/pages/datasets/mars/index.md
index 93edaeea..30c9a4d7 100644
--- a/site/content/pages/datasets/mars/index.md
+++ b/site/content/pages/datasets/mars/index.md
@@ -2,8 +2,8 @@
status: published
title: MARS
-desc: <span style="color:#ffaa00">MARS</span> is a dataset of people...
-subdesc: MARS includes...
+desc: The <span style="color:#ffaa00">Motion Analysis and Re-identification Set (MARS) is a MARS</span> dataset is collection of CCTV footage
+subdesc: The MARS dataset includes 1,191,003 of people and is used for training person re-identification algorithms
slug: mars
cssclass: dataset
image: assets/background.jpg
diff --git a/site/public/about/index.html b/site/public/about/index.html
index 15c4a831..8a95825d 100644
--- a/site/public/about/index.html
+++ b/site/public/about/index.html
@@ -36,7 +36,7 @@
<li><a href="/about/privacy/">Privacy Policy</a></li>
</ul>
</section><p>(PAGE UNDER DEVELOPMENT)</p>
-<p><div style="font-size:20px;line-height:36px">Ever since government agencies began developing face recognition in the early 1960's, datasets of face images have always been central to the development and evaluation face recognition technology. Today, these datasets no longer originate in labs, but instead from family photo albums posted on social media sites, CCTV camera footage from college campuses, search engine queries for celebrities, cafe livestreams, or <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">videos on YouTube</a>. </div></p><p>While many of these datasets include public figures such as politicans, athletes, and actors; they also include many non-public figures including digital activists, students, pedestrians, and people's semi-private shared photo albums. Some images are used with creative commons licenses, yet others were taken in unconstrained scenarios without awareness or consent. At first glance it appears many of the datasets were created for seemingly harmless academic research studies, but when examined further it becomes clear that they're also used by defense contractors in foreign countries.</p>
+<p><div style="font-size:20px;line-height:36px">Ever since government agencies began developing face recognition in the early 1960's, datasets of face images have always been central to the development and evaluation face recognition technology. Today, these datasets no longer originate in labs, but instead from family photo albums posted on social media sites, CCTV camera footage from college campuses, search engine queries for celebrities, cafe livestreams, or <a href="https://www.theverge.com/2017/8/22/16180080/transgender-youtubers-ai-facial-recognition-dataset">videos on YouTube</a>. </div></p><p>While many of these datasets include public figures such as politicians, athletes, and actors; they also include many non-public figures including digital activists, students, pedestrians, and people's semi-private shared photo albums. Some images are used with creative commons licenses, yet others were taken in unconstrained scenarios without awareness or consent. At first glance it appears many of the datasets were created for seemingly harmless academic research, but when examined further it becomes clear that they're also used by foreign defense agencies.</p>
<p>During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that is helping to power the global facial recognition industry.</p>
<p>MegaPixels is art and research by <a href="https://ahprojects.com">Adam Harvey</a> about publicly available facial recognition datasets that aims to unravel their histories, futures, geographies, and contents. Throughout 2019 this site, coded by Jules LaPlace, will publish research reports, visualizations, downloadable statistics, and interactive tools for searching the datasets.</p>
<p>The MegaPixels website is produced in partnership with <a href="https://mozilla.org">Mozilla</a> who provided the funding to research the datasets, build the site, and develop tools to help you understand the role these datasets have played in creating biometric surveillance technologies.</p>
diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html
index 9cf2db0d..ab002c78 100644
--- a/site/public/datasets/brainwash/index.html
+++ b/site/public/datasets/brainwash/index.html
@@ -4,7 +4,7 @@
<title>MegaPixels</title>
<meta charset="utf-8" />
<meta name="author" content="Adam Harvey" />
- <meta name="description" content="Brainwash is a dataset of people from webcams the Brainwash Cafe in San Francisco being used to train face detection algorithms" />
+ <meta name="description" content="Brainwash is a dataset of webcam images from the Brainwash Cafe in San Francisco" />
<meta name="referrer" content="no-referrer" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<link rel='stylesheet' href='/assets/css/fonts.css' />
@@ -26,7 +26,7 @@
</header>
<div class="content content-dataset">
- <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/brainwash/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'><span style='color: #ffaa00'>Brainwash</span> is a dataset of people from webcams the Brainwash Cafe in San Francisco being used to train face detection algorithms</span></div><div class='hero_subdesc'><span class='bgpad'>Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe"
+ <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/brainwash/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'><span style='color: #ffaa00'>Brainwash</span> is a dataset of webcam images from the Brainwash Cafe in San Francisco</span></div><div class='hero_subdesc'><span class='bgpad'>The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection algorithms
</span></div></div></section><section><div class='left-sidebar'><div class='meta'><div><div class='gray'>Collected</div><div>2014</div></div><div><div class='gray'>Published</div><div>2015</div></div><div><div class='gray'>Images</div><div>11,918</div></div><div><div class='gray'>Faces</div><div>91,146</div></div><div><div class='gray'>Created by</div><div>Stanford Department of Computer Science</div></div><div><div class='gray'>Funded by</div><div>Max Planck Center for Visual Computing and Communication</div></div><div><div class='gray'>Resolution</div><div>640x480px</div></div><div><div class='gray'>Size</div><div>4.1GB</div></div><div><div class='gray'>Origin</div><div>Brainwash Cafe, San Franscisco</div></div><div><div class='gray'>Purpose</div><div>Training face detection</div></div><div><div class='gray'>Website</div><div><a href="https://exhibits.stanford.edu/data/catalog/sx925dc9385">stanford.edu</a></div></div><div><div class='gray'>Paper</div><div><a href="http://arxiv.org/abs/1506.04878">End-to-End People Detection in Crowded Scenes</a></div></div></div></div><h2>Brainwash Dataset</h2>
<p>(PAGE UNDER DEVELOPMENT)</p>
<p><em>Brainwash</em> is a face detection dataset created from the Brainwash Cafe's livecam footage including 11,918 images of "everyday life of a busy downtown cafe<a class="footnote_shim" name="[^readme]_1"> </a><a href="#[^readme]" class="footnote" title="Footnote 1">1</a>". The images are used to develop face detection algorithms for the "challenging task of detecting people in crowded scenes" and tracking them.</p>
diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html
index 4fbd06a5..477673e2 100644
--- a/site/public/datasets/lfw/index.html
+++ b/site/public/datasets/lfw/index.html
@@ -4,7 +4,7 @@
<title>MegaPixels</title>
<meta charset="utf-8" />
<meta name="author" content="Adam Harvey" />
- <meta name="description" content="Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition." />
+ <meta name="description" content="Labeled Faces in The Wild (LFW) is the first facial recognition dataset created entirely from online photos" />
<meta name="referrer" content="no-referrer" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<link rel='stylesheet' href='/assets/css/fonts.css' />
@@ -26,7 +26,7 @@
</header>
<div class="content content-">
- <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'><span style='color: #ff0000'>Labeled Faces in The Wild</span> (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition.</span></div><div class='hero_subdesc'><span class='bgpad'>It includes 13,456 images of 4,432 people's images copied from the Internet during 2002-2004.
+ <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/lfw/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'><span style='color: #ff0000'>Labeled Faces in The Wild</span> (LFW) is the first facial recognition dataset created entirely from online photos</span></div><div class='hero_subdesc'><span class='bgpad'>It includes 13,456 images of 4,432 people's images copied from the Internet during 2002-2004 and is the most frequently used dataset in the world for benchmarking face recognition algorithms.
</span></div></div></section><section><div class='left-sidebar'><div class='meta'><div><div class='gray'>Created</div><div>2002 &ndash; 2004</div></div><div><div class='gray'>Images</div><div>13,233</div></div><div><div class='gray'>Identities</div><div>5,749</div></div><div><div class='gray'>Origin</div><div>Yahoo! News Images</div></div><div><div class='gray'>Used by</div><div>Facebook, Google, Microsoft, Baidu, Tencent, SenseTime, Face++, CIA, NSA, IARPA</div></div><div><div class='gray'>Website</div><div><a href="http://vis-www.cs.umass.edu/lfw">umass.edu</a></div></div></div><ul>
<li>There are about 3 men for every 1 woman in the LFW dataset<a class="footnote_shim" name="[^lfw_www]_1"> </a><a href="#[^lfw_www]" class="footnote" title="Footnote 1">1</a></li>
<li>The person with the most images is <a href="http://vis-www.cs.umass.edu/lfw/person/George_W_Bush_comp.html">George W. Bush</a> with 530</li>
diff --git a/site/public/datasets/mars/index.html b/site/public/datasets/mars/index.html
index 62f8847e..d6f2eeff 100644
--- a/site/public/datasets/mars/index.html
+++ b/site/public/datasets/mars/index.html
@@ -4,7 +4,7 @@
<title>MegaPixels</title>
<meta charset="utf-8" />
<meta name="author" content="Adam Harvey" />
- <meta name="description" content="MARS is a dataset of people..." />
+ <meta name="description" content="The Motion Analysis and Re-identification Set (MARS) is a MARS dataset is collection of CCTV footage " />
<meta name="referrer" content="no-referrer" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<link rel='stylesheet' href='/assets/css/fonts.css' />
@@ -26,7 +26,7 @@
</header>
<div class="content content-dataset">
- <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/mars/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'><span style="color:#ffaa00">MARS</span> is a dataset of people...</span></div><div class='hero_subdesc'><span class='bgpad'>MARS includes...
+ <section class='intro_section' style='background-image: url(https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/mars/assets/background.jpg)'><div class='inner'><div class='hero_desc'><span class='bgpad'>The <span style="color:#ffaa00">Motion Analysis and Re-identification Set (MARS) is a MARS</span> dataset is collection of CCTV footage </span></div><div class='hero_subdesc'><span class='bgpad'>The MARS dataset includes 1,191,003 of people and is used for training person re-identification algorithms
</span></div></div></section><section><div class='left-sidebar'><div class='meta'><div><div class='gray'>Collected</div><div>TBD</div></div><div><div class='gray'>Published</div><div>TBD</div></div><div><div class='gray'>Images</div><div>TBD</div></div><div><div class='gray'>Faces</div><div>TBD</div></div></div></div><h2>MARS</h2>
<p>(PAGE UNDER DEVELOPMENT)</p>
<p>At vero eos et accusamus et iusto odio dignissimos ducimus, qui blanditiis praesentium voluptatum deleniti atque corrupti, quos dolores et quas molestias excepturi sint, obcaecati cupiditate non-provident, similique sunt in culpa, qui officia deserunt mollitia animi, id est laborum et dolorum fuga. Et harum quidem rerum facilis est et expedita distinctio.</p>