From d69086a1b2d7d6e6def55f35e30d0623701de011 Mon Sep 17 00:00:00 2001
From: Jules Laplace
Date: Tue, 4 Dec 2018 21:12:59 +0100
Subject: embedding images

---
 site/public/datasets/lfw/index.html | 112 ++++++++++++++++++++++++++++++++++++
 1 file changed, 112 insertions(+)
 create mode 100644 site/public/datasets/lfw/index.html

(limited to 'site/public/datasets/lfw/index.html')

diff --git a/site/public/datasets/lfw/index.html b/site/public/datasets/lfw/index.html
new file mode 100644
index 00000000..8455bc60
--- /dev/null
+++ b/site/public/datasets/lfw/index.html
@@ -0,0 +1,112 @@
+ MegaPixels
+ + +
MegaPixels
+ The Darkside of Datasets +
+ +
+
+ +
    +
  • Created: 2007
  • Images: 13,233
  • People: 5,749
  • Created from: Yahoo News images
  • Search available: Searchable
+

Labeled Faces in the Wild is among the most widely used facial recognition training datasets in the world and is the first dataset of its kind to be created entirely from Internet photos. It includes 13,233 images of 5,749 people downloaded from the Internet, otherwise referred to by researchers as “The Wild”.

+

Eight out of 5,749 people in the Labeled Faces in the Wild dataset. The face recognition training dataset is created entirely from photos downloaded from the Internet.

+

INTRO

+

It began in 2002. Researchers at the University of Massachusetts Amherst were developing algorithms for facial recognition and they needed more data. Between 2002 and 2004 they scraped Yahoo News for images of public figures. Two years later they cleaned up the dataset and repackaged it as Labeled Faces in the Wild (LFW).

+

Since then, the LFW dataset has become one of the most widely used datasets for evaluating face recognition algorithms. The associated research paper, “Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments”, has been cited 996 times, with citations reaching 45 different countries.

+

The faces come from news stories and are mostly celebrities from the entertainment industry, politicians, and villains. It’s a sampling of current affairs and breaking news that has come to pass. The images, detached from their original context, now serve a new purpose: to train, evaluate, and improve facial recognition.

+

Because LFW is the most widely used facial recognition dataset, it can be said that each individual in it has, in a small way, contributed to the current state of the art in facial recognition surveillance. John Cusack, Julianne Moore, Barry Bonds, Osama bin Laden, and even Moby are among these biometric pillars, exemplar faces that provided the visual dimensions of a new computer vision future.

+

From Aaron Eckhart to Zydrunas Ilgauskas. A small sampling of the LFW dataset

+

In addition to the dataset's commercial use as an evaluation tool, all of the faces in the LFW dataset can be loaded directly through a popular machine learning library called scikit-learn.

+

Usage

+
#!/usr/bin/python
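+import matplotlib.pyplot as plt
+from sklearn.datasets import fetch_lfw_people
+# Downloads the dataset on first call and caches it locally
+lfw_people = fetch_lfw_people()
+# .images holds the grayscale face crops; take the first one
+lfw_person = lfw_people.images[0]
+plt.imshow(lfw_person, cmap="gray")
+plt.show()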
+
+

Commercial Use

+

The LFW dataset is used by numerous companies for benchmarking algorithms and, in some cases, for training. According to the benchmarking results page [1] provided by the authors, more than two dozen companies have contributed their benchmark results.

+

(Jules: this loads assets/lfw_vendor_results.csv)

+

In benchmarking, companies use a dataset to evaluate their algorithms, which are typically trained on other data. After training, researchers use LFW as a benchmark to compare their results with those of other algorithms.
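
To make the idea of benchmarking concrete, here is a minimal sketch of a pair-verification score of the kind a face recognition benchmark reports. It is not the official LFW evaluation code. The function names, the toy feature vectors, and the thresholds are all assumptions made for illustration, and the official protocol also specifies fixed pair lists and cross-validation folds, which this sketch leaves out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verification_accuracy(pairs, threshold):
    """Fraction of face pairs classified correctly.

    pairs: list of (features_a, features_b, is_same_person) tuples.
    A pair is predicted "same person" when similarity >= threshold.
    """
    correct = 0
    for features_a, features_b, is_same in pairs:
        predicted_same = cosine_similarity(features_a, features_b) >= threshold
        if predicted_same == is_same:
            correct += 1
    return correct / len(pairs)


# Hypothetical, made-up vectors standing in for a model's face features.
toy_pairs = [
    ([1.0, 0.0], [0.9, 0.1], True),   # similar vectors, same person
    ([0.0, 1.0], [0.1, 0.9], True),   # similar vectors, same person
    ([1.0, 0.0], [0.0, 1.0], False),  # orthogonal vectors, different people
    ([1.0, 0.0], [0.8, 0.6], False),  # fairly similar, but different people
]

strict_accuracy = verification_accuracy(toy_pairs, threshold=0.9)
loose_accuracy = verification_accuracy(toy_pairs, threshold=0.7)
```

In this sketch, the model that produced the features is never trained on the pairs it is scored on, which mirrors the point above: the benchmark only measures how well features learned elsewhere separate matched from mismatched faces.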

+

For example, Baidu (est. net worth $13B) uses LFW to report results for their paper "Targeting Ultimate Accuracy: Face Recognition via Deep Embedding". According to the three Baidu researchers who produced the paper:

+

LFW has been the most popular evaluation benchmark for face recognition, and played a very important role in facilitating the face recognition society to improve algorithm. [2]

+
+

Citations

+ + + + + + + + + + + + + + + + + + + + + + +
Title | Organization | Country | Type
3D-aided face recognition from videos | University of Lyon | France | edu
A Community Detection Approach to Cleaning Extremely Large Face Database | National University of Defense Technology, China | China | edu
+

Conclusion

+

The LFW face recognition training and evaluation dataset is a historically important face dataset as it was the first popular dataset to be created entirely from Internet images, paving the way for a global trend towards downloading anyone’s face from the Internet and adding it to a dataset. As will be evident with other datasets, LFW’s approach has now become the norm.

+

For all 5,749 people in this dataset, their faces are forever a part of facial recognition history. It would be impossible to remove anyone from the dataset because it is so ubiquitous. For the rest of their lives and forever after, these 5,749 people will continue to be used for training facial recognition surveillance.

+
+
+
  1. "LFW Results". Accessed Dec 3, 2018. http://vis-www.cs.umass.edu/lfw/results.html

  2. "Chinese tourist town uses face recognition as an entry pass". New Scientist. November 17, 2016. https://www.newscientist.com/article/2113176-chinese-tourist-town-uses-face-recognition-as-an-entry-pass/
+
+
+ +
+ + + + \ No newline at end of file -- cgit v1.2.3-70-g09d2