From 70f79c37278d7c47bee29cdf091bde448aae9240 Mon Sep 17 00:00:00 2001
From: adamhrv
Date: Tue, 19 Mar 2019 12:21:03 +0100
Subject: html gen

---
 site/public/about/index.html | 3 +-
 site/public/datasets/brainwash/index.html | 43 ++++---
 site/public/datasets/cofw/index.html | 2 +
 site/public/datasets/lfw/index.html | 2 +
 site/public/datasets/mars/index.html | 4 +-
 .../research/01_from_1_to_100_pixels/index.html | 1 +
 .../research/02_what_computers_can_see/index.html | 143 +++++++++++++++++++++
 7 files changed, 175 insertions(+), 23 deletions(-)

(limited to 'site/public')

diff --git a/site/public/about/index.html b/site/public/about/index.html
index 694f7ec9..3c270ee1 100644
--- a/site/public/about/index.html
+++ b/site/public/about/index.html
@@ -37,7 +37,8 @@
  • Privacy Policy
  • (PAGE UNDER DEVELOPMENT)

    -

    Ever since government agencies began researching face recognition in the early 1960's, datasets of face images have always been central to technological advancements. Today, these datasets no longer originate in labs, but instead from family photo albums posted on photo sharing sites, surveillance cameras on college campuses, search engine queries for celebrities, cafe livestreams, or personal videos posted on YouTube. Collectively, facial recognition datasets are now gathered "in the wild".

    MegaPixels is art and research by Adam Harvey about facial recognition datasets that unravels their histories, futures, geographies, and meanings. Throughout 2019 this site will publish research reports, visualizations, raw data, and interactive tools to explore how publicly available facial recognition datasets contribute to a global supply chain of biometric data that powers the global facial recognition industry.

    During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that is powering the global facial recognition industry.

    +

    Ever since government agencies began developing face recognition in the early 1960's, datasets of face images have always been central to technological advancements. Today, these datasets no longer originate in labs, but instead from family photo albums posted on photo sharing sites, surveillance cameras on college campuses, search engine queries for celebrities, cafe livestreams, and personal videos posted on YouTube.

    Collectively, facial recognition datasets are now gathered "in the wild".

    +

    MegaPixels is art and research by Adam Harvey about facial recognition datasets that unravels their histories, futures, geographies, and meanings. Throughout 2019 this site will publish research reports, visualizations, raw data, and interactive tools to explore how publicly available facial recognition datasets contribute to a global supply chain of biometric data that powers the global facial recognition industry.

    During the last year, hundreds of these facial analysis datasets created "in the wild" have been collected to understand how they contribute to a global supply chain of biometric data that is powering the global facial recognition industry.

    The MegaPixels website is produced in partnership with Mozilla.

    diff --git a/site/public/datasets/brainwash/index.html b/site/public/datasets/brainwash/index.html
    index ab002c78..e5baca7a 100644
    --- a/site/public/datasets/brainwash/index.html
    +++ b/site/public/datasets/brainwash/index.html
    @@ -4,7 +4,7 @@
    MegaPixels
    -
    +
    @@ -26,15 +26,29 @@
    -
    Brainwash is a dataset of webcam images from the Brainwash Cafe in San Francisco
    The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection algorithms -

    Brainwash Dataset

    +
    Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco
    The Brainwash dataset includes 11,918 images of "everyday life of a busy downtown cafe" and is used for training head detection algorithms +

    Brainwash Dataset

    (PAGE UNDER DEVELOPMENT)

    Brainwash is a face detection dataset created from the Brainwash Cafe's livecam footage including 11,918 images of "everyday life of a busy downtown cafe 1". The images are used to develop face detection algorithms for the "challenging task of detecting people in crowded scenes" and tracking them.

    Before closing in 2017, Brainwash Cafe was a "cafe and laundromat" located in San Francisco's SoMA district. The cafe published a publicly available livestream with a view of the cash register, performance stage, and seating area.

    Since its publication by Stanford in 2015, the Brainwash dataset has appeared in several notable research papers. In September 2016 four researchers from the National University of Defense Technology in Changsha, China used the Brainwash dataset for a research study on "people head detection in crowded scenes", concluding that their algorithm "achieves superior head detection performance on the crowded scenes dataset 2". And again in 2017 three researchers at the National University of Defense Technology used Brainwash for a study on object detection noting "the data set used in our experiment is shown in Table 1, which includes one scene of the brainwash dataset 3".

    A sample image from the Brainwash dataset used for training face and head detection algorithms for surveillance. The dataset contains about 12,000 images. License: Open Data Commons Public Domain Dedication (PDDL)
    49 of the 11,918 images included in the Brainwash dataset. License: Open Data Commons Public Domain Dedication (PDDL)
    +

    Who used Brainwash Dataset?

    + +

    + This bar chart presents a ranking of the top countries where citations originated. Mouse over individual columns
    + to see yearly totals. Colors are only assigned to the top 10 overall countries.
    +

    + +
    + +
    + +
    +
    -

    Biometric Trade Routes (beta)

    +

    Information Supply Chain

    To understand how Brainwash Dataset has been used around the world...
    - affected global research on computer vision, surveillance, defense, and consumer technology, the and where this dataset has been used the locations of each organization that used or referenced the datast
    + affected global research on computer vision, surveillance, defense, and consumer technology, the and where this dataset has been used the locations of each organization that used or referenced the datast

    @@ -65,20 +79,9 @@

    - The data is generated by collecting all citations for all original research papers associated with the dataset. Then the PDFs are then converted to text and the organization names are extracted and geocoded. Because of the automated approach to extracting data, actual use of the dataset can not yet be confirmed. This visualization is provided to help locate and confirm usage and will be updated as data noise is reduced.
    + Standardized paragraph of text about the map. Sed ut perspiciatis, unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam eaque ipsa, quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt, explicabo.

    -
    -

    Who used Brainwash Dataset?

    - -

    - This bar chart presents a ranking of the top countries where citations originated. Mouse over individual columns
    - to see yearly totals. Colors are only assigned to the top 10 overall countries.
    -

    - -
    - -
    -
    +

    Add more analysis here

    @@ -92,11 +95,11 @@

    Citations

    - Citations were collected from Semantic Scholar, a website which aggregates
    + The citations used for the geographic visualizations were collected from Semantic Scholar, a website which aggregates
    and indexes research papers. Metadata was extracted from these papers, including extracting names of institutions automatically from PDFs, and then the addresses were geocoded. Data is not yet manually verified, and reflects any time the paper was cited. Some papers may only mention the dataset in passing, while others use it as part of their research methodology.

    - Add button/link to download CSV
    + Add [button/link] to download CSV. Add search input field to filter. Expand number of rows to 10. Reduce URL text to show only the domain (e.g. https://arxiv.org/pdf/123456 --> arxiv.org)
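
    The last item in the note above (showing only the domain of a citation URL) could be sketched roughly as follows. This is a minimal Python illustration, not code from the site: the function name `display_domain` is hypothetical, and stripping a leading "www." is an extra assumption not stated in the note.

```python
from urllib.parse import urlsplit


def display_domain(url: str) -> str:
    """Return only the host part of a URL for display (hypothetical helper)."""
    # urlsplit(...).hostname is None when the string has no network location.
    host = urlsplit(url).hostname or ""
    # Assumption: drop a leading "www." so the shortest recognizable name is shown.
    return host[4:] if host.startswith("www.") else host


print(display_domain("https://arxiv.org/pdf/123456"))
```

    With the example from the note, `https://arxiv.org/pdf/123456` is reduced to `arxiv.org`; strings that are not URLs yield an empty string instead of raising an error.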

    diff --git a/site/public/datasets/cofw/index.html b/site/public/datasets/cofw/index.html
    index 605a325a..20138c3c 100644
    --- a/site/public/datasets/cofw/index.html
    +++ b/site/public/datasets/cofw/index.html
    @@ -108,6 +108,8 @@ To increase the number of training images, and since COFW has the exact same la
    +
    Labeled Faces in the Wild Dataset
    20 citations +

    TODO

    - replace graphic

    diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html
    index 477673e2..8670f909 100644
    --- a/site/public/datasets/lfw/index.html
    +++ b/site/public/datasets/lfw/index.html
    @@ -90,6 +90,8 @@
    +
    Labeled Faces in the Wild Dataset
    20 citations +
    diff --git a/site/public/datasets/mars/index.html b/site/public/datasets/mars/index.html
    index bfad52a3..b053b456 100644
    --- a/site/public/datasets/mars/index.html
    +++ b/site/public/datasets/mars/index.html
    @@ -4,7 +4,7 @@
    MegaPixels
    -
    +
    @@ -26,7 +26,7 @@
    -
    The Motion Analysis and Re-identification Set (MARS) is a dataset is collection of CCTV footage
    The MARS dataset includes 1,191,003 of people used for training person re-identification algorithms +
    Motion Analysis and Re-identification Set (MARS) is a dataset collected from CCTV footage
    The MARS dataset includes 1,191,003 images of people used for training person re-identification algorithms

    Motion Analysis and Re-identification Set (MARS)

    (PAGE UNDER DEVELOPMENT)

    At vero eos et accusamus et iusto odio dignissimos ducimus, qui blanditiis praesentium voluptatum deleniti atque corrupti, quos dolores et quas molestias excepturi sint, obcaecati cupiditate non-provident, similique sunt in culpa, qui officia deserunt mollitia animi, id est laborum et dolorum fuga. Et harum quidem rerum facilis est et expedita distinctio.

    diff --git a/site/public/research/01_from_1_to_100_pixels/index.html b/site/public/research/01_from_1_to_100_pixels/index.html
    index 5254fb40..c91d17ad 100644
    --- a/site/public/research/01_from_1_to_100_pixels/index.html
    +++ b/site/public/research/01_from_1_to_100_pixels/index.html
    @@ -78,6 +78,7 @@
    • "Note that we only keep the images with a minimal side length of 80 pixels." and "a face will be labeled as “Ignore” if it is very difficult to be detected due to blurring, severe deformation and unrecognizable eyes, or the side length of its bounding box is less than 32 pixels." Ge_Detecting_Masked_Faces_CVPR_2017_paper.pdf
    • +
    • IBM DiF: "Faces with region size less than 50x50 or inter-ocular distance of less than 30 pixels were discarded. Faces with non-frontal pose, or anything beyond being slightly tilted to the left or the right, were also discarded."
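
    The size thresholds in the IBM DiF quote above can be read as a simple filter. The sketch below is only an illustration in Python: the `Face` fields and the `keep_face` function are hypothetical names, "region size less than 50x50" is interpreted here as either side being under 50 pixels, and the pose criterion from the quote is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Face:
    # Hypothetical fields for this illustration; all values are in pixels.
    width: int
    height: int
    inter_ocular: float


def keep_face(face: Face, min_side: int = 50, min_inter_ocular: float = 30.0) -> bool:
    """Return False for faces the quoted thresholds would discard."""
    # Assumption: a region counts as "less than 50x50" if either side is too short.
    if face.width < min_side or face.height < min_side:
        return False
    # Discard faces whose eyes are fewer than 30 pixels apart.
    return face.inter_ocular >= min_inter_ocular


faces = [Face(64, 64, 35.0), Face(49, 80, 40.0), Face(60, 60, 29.9)]
kept = [f for f in faces if keep_face(f)]
print(len(kept))
```

    Only the first of the three example faces passes both thresholds.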

    diff --git a/site/public/research/02_what_computers_can_see/index.html b/site/public/research/02_what_computers_can_see/index.html
    index 202359e0..9389bf84 100644
    --- a/site/public/research/02_what_computers_can_see/index.html
    +++ b/site/public/research/02_what_computers_can_see/index.html
    @@ -126,6 +126,149 @@
  • Wearing Necktie
  • Wearing Necklace
  • +

    From Market 1501

    +

    The 27 attributes are:

    +
    | attribute | representation in file | label |
    |---|---|---|
    | gender | gender | male(1), female(2) |
    | hair length | hair | short hair(1), long hair(2) |
    | sleeve length | up | long sleeve(1), short sleeve(2) |
    | length of lower-body clothing | down | long lower body clothing(1), short(2) |
    | type of lower-body clothing | clothes | dress(1), pants(2) |
    | wearing hat | hat | no(1), yes(2) |
    | carrying backpack | backpack | no(1), yes(2) |
    | carrying bag | bag | no(1), yes(2) |
    | carrying handbag | handbag | no(1), yes(2) |
    | age | age | young(1), teenager(2), adult(3), old(4) |
    | 8 color of upper-body clothing | upblack, upwhite, upred, uppurple, upyellow, upgray, upblue, upgreen | no(1), yes(2) |
    | 9 color of lower-body clothing | downblack, downwhite, downpink, downpurple, downyellow, downgray, downblue, downgreen, downbrown | no(1), yes(2) |
    +

    source: https://github.com/vana77/Market-1501_Attribute/blob/master/README.md
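
    To make the integer coding in the Market 1501 attribute table above concrete, here is a small Python sketch that maps a file field and its numeric label back to the listed meaning. The dictionary layout and the `decode` function are assumptions made for illustration; the actual annotation file format is not modelled here.

```python
# Label meanings copied from the attribute table; the layout is hypothetical.
MARKET1501_LABELS = {
    "gender": {1: "male", 2: "female"},
    "hair": {1: "short hair", 2: "long hair"},
    "up": {1: "long sleeve", 2: "short sleeve"},
    "down": {1: "long lower body clothing", 2: "short"},
    "clothes": {1: "dress", 2: "pants"},
    "age": {1: "young", 2: "teenager", 3: "adult", 4: "old"},
}

UP_COLORS = ["upblack", "upwhite", "upred", "uppurple",
             "upyellow", "upgray", "upblue", "upgreen"]
DOWN_COLORS = ["downblack", "downwhite", "downpink", "downpurple", "downyellow",
               "downgray", "downblue", "downgreen", "downbrown"]

# The remaining fields are all coded as no(1) / yes(2).
for name in ["hat", "backpack", "bag", "handbag", *UP_COLORS, *DOWN_COLORS]:
    MARKET1501_LABELS[name] = {1: "no", 2: "yes"}


def decode(field: str, value: int) -> str:
    """Translate a numeric label into its meaning (raises KeyError if unknown)."""
    return MARKET1501_LABELS[field][value]


print(len(MARKET1501_LABELS), decode("age", 3), decode("downpink", 2))
```

    The mapping ends up with 27 entries, matching the attribute count stated above: 10 single-field attributes plus 8 upper-body and 9 lower-body color flags.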

    +

    From DukeMTMC

    +

    The 23 attributes are:

    +
    | attribute | representation in file | label |
    |---|---|---|
    | gender | gender | male(1), female(2) |
    | length of upper-body clothing | top | short upper body clothing(1), long(2) |
    | wearing boots | boots | no(1), yes(2) |
    | wearing hat | hat | no(1), yes(2) |
    | carrying backpack | backpack | no(1), yes(2) |
    | carrying bag | bag | no(1), yes(2) |
    | carrying handbag | handbag | no(1), yes(2) |
    | color of shoes | shoes | dark(1), light(2) |
    | 8 color of upper-body clothing | upblack, upwhite, upred, uppurple, upgray, upblue, upgreen, upbrown | no(1), yes(2) |
    | 7 color of lower-body clothing | downblack, downwhite, downred, downgray, downblue, downgreen, downbrown | no(1), yes(2) |
    +

    source: https://github.com/vana77/DukeMTMC-attribute/blob/master/README.md

    +

    From H3D Dataset

    +

    The joints and other keypoints (eyes, ears, nose, shoulders, elbows, wrists, hips, knees and ankles)
    +The 3D pose inferred from the keypoints.
    +Visibility boolean for each keypoint
    +Region annotations (upper clothes, lower clothes, dress, socks, shoes, hands, gloves, neck, face, hair, hat, sunglasses, bag, occluder)
    +Body type (male, female or child)

    +

    source: https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/h3d/

    -- cgit v1.2.3-70-g09d2