From 12672416ce355e0993ee2a2ef26e130bf4f87120 Mon Sep 17 00:00:00 2001
From: adamhrv
Date: Mon, 4 Mar 2019 22:20:36 +0100
Subject: cosmetics

---
 site/public/datasets/index.html           |  78 ---------------------
 site/public/datasets/lfw/index.html       | 111 ------------------------------
 site/public/datasets/vgg_face2/index.html |  74 --------------------
 3 files changed, 263 deletions(-)
 delete mode 100644 site/public/datasets/index.html
 delete mode 100644 site/public/datasets/lfw/index.html
 delete mode 100644 site/public/datasets/vgg_face2/index.html

(limited to 'site/public/datasets')

diff --git a/site/public/datasets/index.html b/site/public/datasets/index.html
deleted file mode 100644
index 9cd50016..00000000
--- a/site/public/datasets/index.html
+++ /dev/null
@@ -1,78 +0,0 @@
- - - - MegaPixels - - - - - - - - - - - - -
- - -
MegaPixels
-
- -
-
- - -

Facial Recognition Datasets

-
- -
- - -
- - -
- - - - -
\ No newline at end of file
diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html
deleted file mode 100644
index e90cdcc5..00000000
--- a/site/public/datasets/lfw/index.html
+++ /dev/null
@@ -1,111 +0,0 @@
- - - - MegaPixels - - - Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition. - - - - - - - - - -
- - -
MegaPixels
-
- -
-
- -
Labeled Faces in The Wild (LFW) is a database of face photographs designed for studying the problem of unconstrained face recognition.
It includes 13,233 images of 5,749 people copied from the Internet during 2002-2004. -
A few of the 5,749 people in the Labeled Faces in the Wild Dataset, the most widely used face dataset for benchmarking face recognition algorithms.

Labeled Faces in the Wild

-

Labeled Faces in The Wild (LFW) is "a database of face photographs designed for studying the problem of unconstrained face recognition" 1. It is used to evaluate and improve the performance of facial recognition algorithms in academic, commercial, and government research. According to BiometricUpdate.com 3, LFW is "the most widely used evaluation set in the field of facial recognition, LFW attracts a few dozen teams from around the globe including Google, Facebook, Microsoft Research Asia, Baidu, Tencent, SenseTime, Face++ and Chinese University of Hong Kong."

-

The LFW dataset includes 13,233 images of 5,749 people that were collected between 2002 and 2004. LFW is a subset of Names and Faces and is part of the first facial recognition training dataset created entirely from images appearing on the Internet. The people appearing in LFW are...

-

The Names and Faces dataset was the first face recognition dataset created entirely from online photos. However, Names and Faces and LFW are not the first face recognition datasets created entirely "in the wild". That title belongs to the UCD dataset. Obtaining images "in the wild" means using an image without the explicit consent or awareness of the subject or photographer.

-

-
All 5,749 people in the Labeled Faces in The Wild Dataset, showing one face per person.

-

-

Biometric Trade Routes

To understand how this dataset has been used, its citations have been geocoded to show an approximate geographic "trade route" of the biometric data. Each line indicates an organization (educational, commercial, or governmental) that has cited the LFW dataset in its research. Data is compiled from Semantic Scholar.

Academic
Industry
Government

-
- -

Supplementary Information for Labeled Faces in The Wild

-

Citations

Add graph showing distribution by country. Add information about how the citations were generated. Add button/link to download CSV

Synthetic Faces

To visualize the types of photos in the dataset without explicitly publishing individuals' identities, a generative adversarial network (GAN) was trained on the entire dataset. The images in this video show a neural network learning the visual latent space and then interpolating between archetypical identities within the LFW dataset.

Synthetically generated face from the visual space of LFW dataset

Commercial Use of Labeled Faces in The Wild

-

Add a paragraph about how usage extends far beyond academia into the research centers of the largest companies in the world, and even funnels into CIA-funded research in the US and defense industry usage in China.

-

Code

-

The LFW dataset is so widely used that access to the facial data has been built directly into a popular code library, scikit-learn. It includes a function called fetch_lfw_people to download the faces in the LFW dataset.

-
#!/usr/bin/python
-
-import numpy as np
-from sklearn.datasets import fetch_lfw_people
-import imageio
-import imutils
-
-# download LFW dataset (first run takes a while)
-lfw_people = fetch_lfw_people(min_faces_per_person=1, resize=1, color=True, funneled=False)
-
-# introspect dataset
-n_samples, h, w, c = lfw_people.images.shape
-print(f'{n_samples:,} images at {w}x{h} pixels')
-cols, rows = (176, 76)
-n_ims = cols * rows
-
-# build montages
-im_scale = 0.5
-ims = lfw_people.images[:n_ims]
-montages = imutils.build_montages(ims, (int(w * im_scale), int(h * im_scale)), (cols, rows))
-montage = montages[0]
-
-# save full montage image
-imageio.imwrite('lfw_montage_full.png', montage)
-
-# make a smaller version
-montage = imutils.resize(montage, width=960)
-imageio.imwrite('lfw_montage_960.jpg', montage)
-
-
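The grid size hard-coded in the script above (176 columns by 76 rows) appears to be chosen so that every image fits in a single montage. The following is a minimal sketch of that arithmetic, not part of the original script: the helper name `grid_rows` is hypothetical, and the image count is the 13,233 figure quoted earlier on the page.

```python
import math


def grid_rows(n_images: int, cols: int) -> int:
    """Smallest number of rows such that cols * rows >= n_images (hypothetical helper)."""
    return math.ceil(n_images / cols)


n_images = 13_233  # image count quoted in the text, assumed to match the download
cols = 176         # column count used in the script above
rows = grid_rows(n_images, cols)

# 176 columns need 76 rows, giving 13,376 cells for 13,233 images
print(rows, cols * rows)
```

With these numbers the last row of the montage is only partly filled, which is why the slice `lfw_people.images[:n_ims]` in the script can safely request more images than exist.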

Research, text, and graphics ©Adam Harvey / megapixels.cc

-
- -
- - - - -
\ No newline at end of file
diff --git a/site/public/datasets/vgg_face2/index.html b/site/public/datasets/vgg_face2/index.html
deleted file mode 100644
index d0a161cb..00000000
--- a/site/public/datasets/vgg_face2/index.html
+++ /dev/null
@@ -1,74 +0,0 @@
- - - - MegaPixels - - - - - - - - - - - - -
- - -
MegaPixels
-
- -
-
- -

VGG Face 2

-
Years
TBD
Images
TBD
Identities
TBD
Origin
TBD
Funding
IARPA
...
...

Analysis

-
    -
  • The VGG Face 2 dataset includes approximately 1,331 actresses, 139 presidents, 16 wives, 3 husbands, 2 snooker players, and 1 guru
  • -
-

Names and descriptions

-
    -
  • The original VGGF2 name list has been updated with the results returned from the Google Knowledge Graph
  • -
  • Names with a similarity score greater than 0.75 were automatically updated. Scores were computed using import difflib; seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower()); score = seq.ratio()
  • -
  • The 97 names with a score of 0.75 or lower were manually reviewed. The review included name changes validated using Wikipedia.org results, such as "Bruce Jenner" to "Caitlyn Jenner"; spousal last-name changes; discretionary changes to improve search results, such as combining a nickname with the full name when appropriate (for example, changing "Aleksandar Petrović" to "Aleksandar 'Aco' Petrović"); and minor changes such as "Mohammad Ali" to "Muhammad Ali"
  • -
  • The 'Description' text was automatically added when the Knowledge Graph score was greater than 250
  • -
-
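The name-matching rule described in the list above can be sketched in Python. This is a minimal sketch under stated assumptions, not the project's actual script: only the `difflib.SequenceMatcher` call and the 0.75 threshold come from the text, while the function names `similarity` and `resolve_name` and the sample names are hypothetical.

```python
import difflib
from typing import Tuple

THRESHOLD = 0.75  # from the text: scores above this were updated automatically


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1], using the call quoted in the list."""
    seq = difflib.SequenceMatcher(a=a.lower(), b=b.lower())
    return seq.ratio()


def resolve_name(original: str, candidate: str) -> Tuple[str, bool]:
    """Return (name, needs_review). Hypothetical helper for illustration only."""
    if similarity(original, candidate) > THRESHOLD:
        return candidate, False  # close match: accept the returned name
    return original, True        # 0.75 or lower: keep original, flag for manual review


if __name__ == "__main__":
    # Hypothetical sample names, not taken from the dataset
    print(resolve_name("Jon Smith", "John Smith"))
    print(resolve_name("Alice", "Zzzzz"))
```

Lower-casing both strings before comparison means that differences in capitalization alone never push a name into the manual-review set.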

TODO

-
    -
  • create name list, and populate with Knowledge graph information like LFW
  • -
  • make list of interesting number stats, by the numbers
  • -
  • make list of interesting important facts
  • -
  • write intro abstract
  • -
  • write analysis of usage
  • -
  • find examples, citations, and screenshots of usage
  • -
  • find list of companies using it for table
  • -
  • create montages of the dataset, like LFW
  • -
  • create right to removal information
  • -
-
- -
- - - - -
\ No newline at end of file
--
cgit v1.2.3-70-g09d2