update public

author: adamhrv <adam@ahprojects.com> 2019-06-27 23:58:23 +0200
committer: adamhrv <adam@ahprojects.com> 2019-06-27 23:58:23 +0200
commit: 852e4c1e36c38f57f80fc5d441da82d5991b2212 (patch)
tree: 0c8bc3bbcb6c679e28ba387d0c1e47fb3d16830a /site/public/datasets/ijb_c/index.html
parent: ae165ef1235a6997d5791ca241fd3fd134202c92 (diff)
1 files changed, 6 insertions, 16 deletions
diff --git a/site/public/datasets/ijb_c/index.html b/site/public/datasets/ijb_c/index.html
index ccb7d90d..a36fac14 100644
--- a/site/public/datasets/ijb_c/index.html
+++ b/site/public/datasets/ijb_c/index.html
@@ -76,26 +76,16 @@
     <div class='gray'>Website</div>
     <div><a href='https://www.nist.gov/programs-projects/face-challenges' target='_blank' rel='nofollow noopener'>nist.gov</a></div>
   </div></div><p>[ page under development ]</p>
-<p>The IARPA Janus Benchmark C (IJB&ndash;C) is a dataset of web images used for face recognition research and development. The IJB&ndash;C dataset contains 3,531 people</p>
-<p>Among the target list of 3,531 names are activists, artists, journalists, foreign politicians,</p>
+<p>The IARPA Janus Benchmark C (IJB&ndash;C) is a dataset of web images used for face recognition research and development. The IJB&ndash;C dataset contains 3,531 people from 21,294 images and 3,531 videos. The list of 3,531 names includes activists, artists, journalists, foreign politicians, and public speakers.</p>
+<p>Key Findings:</p>
 <ul>
-<li>Subjects 3531</li>
-<li>Templates: 140739</li>
-<li>Genuine Matches: 7819362</li>
-<li>Impostor Matches: 39584639</li>
-</ul>
-<p>Why not include US Soliders instead of activists?</p>
-<p>was creted by Nobilis, a United States Government contractor is used to develop software for the US intelligence agencies as part of the IARPA Janus program.</p>
-<p>The IARPA Janus program is</p>
-<p>these representations must address the challenges of Aging, Pose, Illumination, and Expression (A-PIE) by exploiting all available imagery.</p>
-<ul>
-<li>metadata annotations were created using crowd annotations</li>
-<li>created by Nobilis</li>
-<li>used mechanical turk</li>
+<li>metadata annotations were created using crowd annotations on Mechanical Turk</li>
+<li>The dataset was created by Noblis</li>
 <li>made for intelligence analysts</li>
 <li>improve performance of face recognition tools</li>
 <li>by fusing the rich spatial, temporal, and contextual information available from the multiple views captured by today’s "media in the wild"</li>
 </ul>
+<p>The dataset includes Creative Commons images</p>
 <p>The name list includes</p>
 <ul>
 <li>2 videos from CCC<ul>
@@ -134,7 +124,7 @@
 <p>The first 777 are non-alphabetical. From 777-3531 is alphabetical</p>
 </section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/ijb_c/assets/ijb_c_montage.jpg' alt=' A visualization of the IJB-C dataset'><div class='caption'> A visualization of the IJB-C dataset</div></div></section><section><h2>Research notes</h2>
 <p>From original papers: <a href="https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf">https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf</a></p>
-<p>Collection for the dataset began by identifying CreativeCommons subject videos, which are often more scarce thanCreative Commons subject images.   Search terms that re-sulted in large quantities of person-centric videos (e.g. “in-terview”) were generated and translated into numerous lan-guages including Arabic, Korean, Swahili, and Hindi to in-crease diversity of the subject pool. Certain YouTube userswho upload well-labeled, person-centric videos, such as the World  Economic  Forum  and  the  International  University Sports Federation were also identified. Titles of videos per-taining to these search terms and usernames were scrapedusing the YouTube Data API and translated into English us-ing the Yandex Translate API4. Pattern matching was per-formed to extract potential names of subjects from the trans-lated titles, and these names were searched using the Wiki-data  API  to  verify  the  subject’s  existence  and  status  as  a public figure,  and to check for Wikimedia Commons im-agery.  Age, gender, and geographic region were collectedusing the Wikipedia API.Using the candidate subject names, Creative Commonsimages  were  scraped  from  Google  and  Wikimedia  Com-mons,  and  Creative  Commons  videos  were  scraped  fromYouTube. After images and videos of the candidate subjectwere  identified,  AMT  Workers  were  tasked  with  validat-ing the subject’s presence throughout the video.  The AMTWorkers marked segments of the video in which the subjectwas present, and key frames</p>
+<p>Collection for the dataset began by identifying Creative Commons subject videos, which are often more scarce than Creative Commons subject images. Search terms that resulted in large quantities of person-centric videos (e.g. “interview”) were generated and translated into numerous languages including Arabic, Korean, Swahili, and Hindi to increase diversity of the subject pool. Certain YouTube users who upload well-labeled, person-centric videos, such as the World Economic Forum and the International University Sports Federation were also identified. Titles of videos pertaining to these search terms and usernames were scraped using the YouTube Data API and translated into English using the Yandex Translate API. Pattern matching was performed to extract potential names of subjects from the translated titles, and these names were searched using the Wikidata API to verify the subject’s existence and status as a public figure, and to check for Wikimedia Commons imagery. Age, gender, and geographic region were collected using the Wikipedia API. Using the candidate subject names, Creative Commons images were scraped from Google and Wikimedia Commons, and Creative Commons videos were scraped from YouTube. After images and videos of the candidate subject were identified, AMT Workers were tasked with validating the subject’s presence throughout the video. The AMT Workers marked segments of the video in which the subject was present, and key frames</p>
 <p>IARPA funds Italian researcher <a href="https://www.micc.unifi.it/projects/glaivejanus/">https://www.micc.unifi.it/projects/glaivejanus/</a></p>
 </section><section>
   <h3>Who used IJB-C?</h3>