| author | adamhrv <adam@ahprojects.com> | 2019-06-27 23:58:23 +0200 |
|---|---|---|
| committer | adamhrv <adam@ahprojects.com> | 2019-06-27 23:58:23 +0200 |
| commit | 852e4c1e36c38f57f80fc5d441da82d5991b2212 | |
| tree | 0c8bc3bbcb6c679e28ba387d0c1e47fb3d16830a /site/public/datasets/ijb_c/index.html | |
| parent | ae165ef1235a6997d5791ca241fd3fd134202c92 | |
update public
Diffstat (limited to 'site/public/datasets/ijb_c/index.html')
| -rw-r--r-- | site/public/datasets/ijb_c/index.html | 22 |
1 files changed, 6 insertions, 16 deletions
diff --git a/site/public/datasets/ijb_c/index.html b/site/public/datasets/ijb_c/index.html
index ccb7d90d..a36fac14 100644
--- a/site/public/datasets/ijb_c/index.html
+++ b/site/public/datasets/ijb_c/index.html
@@ -76,26 +76,16 @@
 <div class='gray'>Website</div>
 <div><a href='https://www.nist.gov/programs-projects/face-challenges' target='_blank' rel='nofollow noopener'>nist.gov</a></div>
 </div></div><p>[ page under development ]</p>
-<p>The IARPA Janus Benchmark C (IJB–C) is a dataset of web images used for face recognition research and development. The IJB–C dataset contains 3,531 people</p>
-<p>Among the target list of 3,531 names are activists, artists, journalists, foreign politicians,</p>
+<p>The IARPA Janus Benchmark C (IJB–C) is a dataset of web images used for face recognition research and development. The IJB–C dataset contains 3,531 people from 21,294 images and 3,531 videos. The list of 3,531 names are activists, artists, journalists, foreign politicians, and public speakers.</p>
+<p>Key Findings:</p>
 <ul>
-<li>Subjects 3531</li>
-<li>Templates: 140739</li>
-<li>Genuine Matches: 7819362</li>
-<li>Impostor Matches: 39584639</li>
-</ul>
-<p>Why not include US Soliders instead of activists?</p>
-<p>was creted by Nobilis, a United States Government contractor is used to develop software for the US intelligence agencies as part of the IARPA Janus program.</p>
-<p>The IARPA Janus program is</p>
-<p>these representations must address the challenges of Aging, Pose, Illumination, and Expression (A-PIE) by exploiting all available imagery.</p>
-<ul>
-<li>metadata annotations were created using crowd annotations</li>
-<li>created by Nobilis</li>
-<li>used mechanical turk</li>
+<li>metadata annotations were created using crowd annotations on Mechanical Turk</li>
+<li>The dataset was creatd Nobilis</li>
 <li>made for intelligence analysts</li>
 <li>improve performance of face recognition tools</li>
 <li>by fusing the rich spatial, temporal, and contextual information available from the multiple views captured by today’s "media in the wild"</li>
 </ul>
+<p>The dataset includes Creative Commons images</p>
 <p>The name list includes</p>
 <ul>
 <li>2 videos from CCC<ul>
@@ -134,7 +124,7 @@
 <p>The first 777 are non-alphabetical. From 777-3531 is alphabetical</p>
 </section><section class='images'><div class='image'><img src='https://nyc3.digitaloceanspaces.com/megapixels/v1/datasets/ijb_c/assets/ijb_c_montage.jpg' alt=' A visualization of the IJB-C dataset'><div class='caption'> A visualization of the IJB-C dataset</div></div></section><section><h2>Research notes</h2>
 <p>From original papers: <a href="https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf">https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf</a></p>
-<p>Collection for the dataset began by identifying CreativeCommons subject videos, which are often more scarce thanCreative Commons subject images. Search terms that re-sulted in large quantities of person-centric videos (e.g. “in-terview”) were generated and translated into numerous lan-guages including Arabic, Korean, Swahili, and Hindi to in-crease diversity of the subject pool. Certain YouTube userswho upload well-labeled, person-centric videos, such as the World Economic Forum and the International University Sports Federation were also identified. Titles of videos per-taining to these search terms and usernames were scrapedusing the YouTube Data API and translated into English us-ing the Yandex Translate API4. Pattern matching was per-formed to extract potential names of subjects from the trans-lated titles, and these names were searched using the Wiki-data API to verify the subject’s existence and status as a public figure, and to check for Wikimedia Commons im-agery. Age, gender, and geographic region were collectedusing the Wikipedia API.Using the candidate subject names, Creative Commonsimages were scraped from Google and Wikimedia Com-mons, and Creative Commons videos were scraped fromYouTube. After images and videos of the candidate subjectwere identified, AMT Workers were tasked with validat-ing the subject’s presence throughout the video. The AMTWorkers marked segments of the video in which the subjectwas present, and key frames</p>
+<p>Collection for the dataset began by identifying CreativeCommons subject videos, which are often more scarce than Creative Commons subject images. Search terms that re-sulted in large quantities of person-centric videos (e.g. “in-terview”) were generated and translated into numerous lan-guages including Arabic, Korean, Swahili, and Hindi to in-crease diversity of the subject pool. Certain YouTube userswho upload well-labeled, person-centric videos, such as the World Economic Forum and the International University Sports Federation were also identified. Titles of videos per-taining to these search terms and usernames were scrapedusing the YouTube Data API and translated into English us-ing the Yandex Translate API4. Pattern matching was per-formed to extract potential names of subjects from the trans-lated titles, and these names were searched using the Wiki-data API to verify the subject’s existence and status as a public figure, and to check for Wikimedia Commons im-agery. Age, gender, and geographic region were collectedusing the Wikipedia API.Using the candidate subject names, Creative Commonsimages were scraped from Google and Wikimedia Com-mons, and Creative Commons videos were scraped fromYouTube. After images and videos of the candidate subjectwere identified, AMT Workers were tasked with validat-ing the subject’s presence throughout the video. The AMTWorkers marked segments of the video in which the subjectwas present, and key frames</p>
 <p>IARPA funds Italian researcher <a href="https://www.micc.unifi.it/projects/glaivejanus/">https://www.micc.unifi.it/projects/glaivejanus/</a></p>
 </section><section>
 <h3>Who used IJB-C?</h3>
