From c8e7a10be948c2405d46d8c3caf4a8c6675eee29 Mon Sep 17 00:00:00 2001
From: Jules Laplace
Date: Wed, 27 Feb 2019 19:35:54 +0100
Subject: rebuild

---
 site/public/datasets/vgg_face2/index.html | 33 ++++---------------------------
 1 file changed, 4 insertions(+), 29 deletions(-)

(limited to 'site/public/datasets/vgg_face2/index.html')

diff --git a/site/public/datasets/vgg_face2/index.html b/site/public/datasets/vgg_face2/index.html
index b7ba5a4c..08b02cc7 100644
--- a/site/public/datasets/vgg_face2/index.html
+++ b/site/public/datasets/vgg_face2/index.html
@@ -4,7 +4,7 @@
 MegaPixels
-
+
@@ -27,35 +27,10 @@
-

VGG Faces2

-
Created
2018
Images
3.3M
People
9,000
Created From
Scraping search engines
Search available
[Searchable](#)

VGG Face2 is the updated version of the VGG Face dataset and now includes over 3.3M face images from over 9K people. The identities were selected by taking the top 500K identities in Google's Knowledge Graph of celebrities and then keeping only the names that yielded enough training images. The dataset was created in the UK but funded by the Office of the Director of National Intelligence in the United States.

-

VGG Face2 by the Numbers

+

VGG Face 2

+
Years
TBD
Images
TBD
Identities
TBD
Origin
TBD
Funding
IARPA
...
...

Analysis

    -
  • 1,331 actresses, 139 presidents
  • -
  • 3 husbands and 16 wives
  • -
  • 2 snooker players
  • -
  • 1 guru
  • -
  • 1 pornographic actress
  • -
  • 3 computer programmers
  • -
-

Names and descriptions

-
    -
  • The original VGGF2 name list has been updated with the results returned from the Google Knowledge Graph
  • -
  • Names with a similarity score greater than 0.75 were automatically updated. Scores were computed using import difflib; seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower()); score = seq.ratio()
  • -
  • The 97 names with a score of 0.75 or lower were manually reviewed. The review included name changes validated using Wikipedia.org results, such as "Bruce Jenner" to "Caitlyn Jenner"; spousal last-name changes; discretionary changes to improve search results, such as combining nicknames with the full name when appropriate (for example, changing "Aleksandar Petrović" to "Aleksandar 'Aco' Petrović"); and minor changes such as "Mohammad Ali" to "Muhammad Ali"
  • -
  • The 'Description' text was automatically added when the Knowledge Graph score was greater than 250
  • -
-
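
Note on the removed "Names and descriptions" list above: the name-matching rule it describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual script. The function names `name_similarity` and `needs_manual_review` and the constant `AUTO_UPDATE_THRESHOLD` are hypothetical; only the `difflib.SequenceMatcher` call and the 0.75 threshold come from the text.

```python
import difflib

# Threshold quoted in the list above: scores greater than 0.75 were
# updated automatically, scores of 0.75 or lower were reviewed by hand.
AUTO_UPDATE_THRESHOLD = 0.75


def name_similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity ratio between 0.0 and 1.0."""
    seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower())
    return seq.ratio()


def needs_manual_review(original: str, candidate: str) -> bool:
    """True when the candidate name is too different to accept automatically."""
    return name_similarity(original, candidate) <= AUTO_UPDATE_THRESHOLD


if __name__ == "__main__":
    # Hypothetical driver: print the score and the routing decision.
    pairs = [("Bruce Jenner", "Caitlyn Jenner"), ("Muhammad Ali", "muhammad ali")]
    for original, candidate in pairs:
        score = name_similarity(original, candidate)
        print(original, "->", candidate, round(score, 3),
              "manual" if needs_manual_review(original, candidate) else "auto")
```

Under this rule a pair such as "Bruce Jenner" and "Caitlyn Jenner" falls below the threshold and is routed to manual review, while names that differ only in letter case score 1.0 and are accepted automatically.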

TODO

-
    -
  • create name list, and populate with Knowledge graph information like LFW
  • -
  • make list of interesting number stats, by the numbers
  • -
  • make list of interesting important facts
  • -
  • write intro abstract
  • -
  • write analysis of usage
  • -
  • find examples, citations, and screenshots of usage
  • -
  • find list of companies using it for table
  • -
  • create montages of the dataset, like LFW
  • -
  • create right to removal information
  • +
  • The VGG Face 2 dataset includes approximately 1,331 actresses, 139 presidents, 16 wives, 3 husbands, 2 snooker players, and 1 guru
-- 
cgit v1.2.3-70-g09d2

From ef90adeb4230ac27c18d3ed9e2cfab000c8689e0 Mon Sep 17 00:00:00 2001
From: Jules Laplace
Date: Thu, 28 Feb 2019 18:09:27 +0100
Subject: recreate dataset index

---
 megapixels/app/site/builder.py | 2 +-
 site/assets/css/css.css | 13 +++-
 site/content/pages/about/index.md | 2 -
 site/content/pages/datasets/index.md | 27 ++++++++
 site/content/pages/datasets/lfw/index.md | 2 +-
 site/public/datasets/index.html | 71 +++++++++++++++++++---
 site/public/datasets/lfw/index.html | 2 +-
 site/public/datasets/vgg_face2/index.html | 2 +-
 site/public/datasets_v0/index.html | 2 +-
 site/public/datasets_v0/lfw/index.html | 2 +-
 .../datasets_v0/lfw/right-to-removal/index.html | 1 -
 site/public/datasets_v0/vgg_face2/index.html | 2 +-
 site/public/index.html | 6 ++
 site/templates/datasets.html | 9 ++-
 14 files changed, 116 insertions(+), 27 deletions(-)
 create mode 100644 site/content/pages/datasets/index.md

(limited to 'site/public/datasets/vgg_face2/index.html')

diff --git a/megapixels/app/site/builder.py b/megapixels/app/site/builder.py
index 15055110..603d4788 100644
--- a/megapixels/app/site/builder.py
+++ b/megapixels/app/site/builder.py
@@ -78,7 +78,7 @@ def build_index(key, research_posts, datasets):
   template = env.get_template("page.html")
   s3_path = s3.make_s3_path(cfg.S3_SITE_PATH, metadata['path'])
   content = parser.parse_markdown(metadata, sections, s3_path, skip_h1=False)
-  content += loader.parse_research_index(research_posts)
+  content += parser.parse_research_index(research_posts)
   html = template.render(
     metadata=metadata,
     content=content,
diff --git a/site/assets/css/css.css b/site/assets/css/css.css
index 29833be7..3bd09f23 100644
--- a/site/assets/css/css.css
+++ b/site/assets/css/css.css
@@ -1,4 +1,4 @@
-* { box-sizing: border-box; outline: 0; }
+da* { box-sizing: border-box; outline: 0; }
 html, body {
   margin: 0;
   padding: 0;
@@ -396,7 +396,10 @@ section.fullwidth .image {
 }
 .sideimage img {
   margin-right: 10px;
+  width: 250px;
+  height: 250px;
 }
+
 /* blog index */
 .research_index {
@@ -521,7 +524,8 @@ section.fullwidth .image {
   text-decoration: none;
   transition: background-color 0.1s cubic-bezier(0,0,1,1);
   background: black;
-  margin: 0 20px 20px 0;
+  margin: 0 11px 11px 0;
+  border: 0;
 }
 .dataset-list .dataset {
   width: 220px;
@@ -538,6 +542,11 @@ section.fullwidth .image {
 .dataset-list a:nth-child(3n+3) { background-color: rgba(255, 255, 0, 0.1); }
 .desktop .dataset-list .dataset:nth-child(3n+3):hover { background-color: rgba(255, 255, 0, 0.2); }
+.dataset-list span {
+  box-shadow: -3px -3px black, 3px -3px black, -3px 3px black, 3px 3px black;
+  background-color: black;
+  box-decoration-break: clone;
+}
 /* intro section for datasets */
diff --git a/site/content/pages/about/index.md b/site/content/pages/about/index.md
index 861cfd07..66fac8ae 100644
--- a/site/content/pages/about/index.md
+++ b/site/content/pages/about/index.md
@@ -37,5 +37,3 @@ MegaPixels aims to answer to these questions and reveal the stories behind the m
 ![sideimage:Jules LaPlace](assets/jules-laplace.jpg) **Jules LaPlace** is an American artist and technologist also based in Berlin. He was previously the CTO of a NYC digital agency and currently works at VFRAME, developing computer vision for human rights groups, and building creative software for artists.
 
 **Mozilla** is a free software community founded in 1998 by members of Netscape. The Mozilla community uses, develops, spreads and supports Mozilla products, thereby promoting exclusively free software and open standards, with only minor exceptions. The community is supported institutionally by the not-for-profit Mozilla Foundation and its tax-paying subsidiary, the Mozilla Corporation.
-
-
diff --git a/site/content/pages/datasets/index.md b/site/content/pages/datasets/index.md
new file mode 100644
index 00000000..c408fba4
--- /dev/null
+++ b/site/content/pages/datasets/index.md
@@ -0,0 +1,27 @@
+------------
+
+status: published
+title: MegaPixels: Datasets
+desc: Facial Recognition Datasets
+slug: home
+published: 2018-12-15
+updated: 2018-12-15
+authors: Adam Harvey
+sync: false
+
+------------
+
+# Facial Recognition Datasets
+
+### Sidebar
+
++ Found: 275 datasets
++ Created between: 1993-2018
++ Smallest dataset: 20 images
++ Largest dataset: 10,000,000 images
+
++ Highest resolution faces: 450x500 (Unconstrained College Students)
++ Lowest resolution faces: 16x20 pixels (QMUL SurvFace)
+
+## End Sidebar
+
diff --git a/site/content/pages/datasets/lfw/index.md b/site/content/pages/datasets/lfw/index.md
index 972fafe2..4161561d 100644
--- a/site/content/pages/datasets/lfw/index.md
+++ b/site/content/pages/datasets/lfw/index.md
@@ -4,7 +4,7 @@ status: published
 title: Labeled Faces in The Wild
 desc: Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition.
 subdesc: It includes 13,456 images of 4,432 people’s images copied from the Internet during 2002-2004.
-image: assets/lfw_feature.jpg
+image: assets/background.jpg
 caption: A few of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.
 slug: lfw
 published: 2019-2-23
diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html
index 77c5ab2b..17c938ac 100644
--- a/site/public/datasets/index.html
+++ b/site/public/datasets/index.html
@@ -29,27 +29,78 @@

Facial Recognition Datasets

-

Regular Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

-

Summary

-
Found
275 datasets
Created between
1993-2018
Smallest dataset
20 images
Largest dataset
10,000,000 images
Highest resolution faces
450x500 (Unconstrained College Students)
Lowest resolution faces
16x20 pixels (QMUL SurvFace)
+
-
-

Dataset Portraits

+

- We have prepared detailed studies of some of the more noteworthy datasets.
+ We have prepared detailed case studies of some of the more noteworthy datasets, including tools to help you learn what is contained in these datasets, and even whether your own face has been used to train these algorithms.

- +
- Labeled Faces in The Wild
+ Asian Face Age Dataset
- +
- VGG Face2
+ Annotated Facial Landmarks in The Wild
+
+ + +
+ Caltech 10K Faces Dataset +
+
+ + +
+ Caltech Occluded Faces in The Wild +
+
+ + +
+ Facebook +
+
+ + +
+ FERET: FacE REcognition +
+
+ + +
+ Labeled Face Parts in The Wild +
+
+ + +
+ Labeled Faces in The Wild +
+
+ + +
+ Unconstrained College Students +
+
+ + +
+ VGG Face 2 Dataset +
+
+ + +
+ YouTube Celebrities
diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html
index 08ec8ee3..5b5e58f3 100644
--- a/site/public/datasets/lfw/index.html
+++ b/site/public/datasets/lfw/index.html
@@ -27,7 +27,7 @@
-
Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition.
It includes 13,456 images of 4,432 people’s images copied from the Internet during 2002-2004.
+
Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition.
It includes 13,456 images of 4,432 people’s images copied from the Internet during 2002-2004.
A few of the 5,749 people in the Labeled Faces in the Wild Dataset. The most widely used face dataset for benchmarking commercial face recognition algorithms.