Merge branch 'master' of asdf.us:megapixels_dev

author: jules@lens <julescarbon@gmail.com> 2019-05-03 16:02:03 +0200
committer: jules@lens <julescarbon@gmail.com> 2019-05-03 16:02:03 +0200
commit: d0bc27630c13c4649eb394a49525f4150e4b82f2 (patch)
tree: 71fbf167457dcbdeff44f223b7dbb8aa6302947f /site/content/pages/datasets/ijb_c/index.md
parent: 8b0408ab56c687352228e8ec50a71ad48bdd6d18 (diff)
parent: f7b1c28108143eaf99df37c2bb5d8e711733b40e (diff)
1 files changed, 9 insertions, 0 deletions
diff --git a/site/content/pages/datasets/ijb_c/index.md b/site/content/pages/datasets/ijb_c/index.md
index 46cab323..9e3f1808 100644
--- a/site/content/pages/datasets/ijb_c/index.md
+++ b/site/content/pages/datasets/ijb_c/index.md
@@ -27,6 +27,15 @@ The IARPA Janus Benchmark C is a dataset created by
 ![caption: A visualization of the IJB-C dataset](assets/ijb_c_montage.jpg)
 
 
+## Research notes
+
+From the original paper: https://noblis.org/wp-content/uploads/2018/03/icb2018.pdf
+
+Collection for the dataset began by identifying Creative Commons subject videos, which are often more scarce than Creative Commons subject images. Search terms that resulted in large quantities of person-centric videos (e.g. “interview”) were generated and translated into numerous languages including Arabic, Korean, Swahili, and Hindi to increase diversity of the subject pool. Certain YouTube users who upload well-labeled, person-centric videos, such as the World Economic Forum and the International University Sports Federation were also identified. Titles of videos pertaining to these search terms and usernames were scraped using the YouTube Data API and translated into English using the Yandex Translate API. Pattern matching was performed to extract potential names of subjects from the translated titles, and these names were searched using the Wikidata API to verify the subject’s existence and status as a public figure, and to check for Wikimedia Commons imagery. Age, gender, and geographic region were collected using the Wikipedia API. Using the candidate subject names, Creative Commons images were scraped from Google and Wikimedia Commons, and Creative Commons videos were scraped from YouTube. After images and videos of the candidate subject were identified, AMT Workers were tasked with validating the subject’s presence throughout the video. The AMT Workers marked segments of the video in which the subject was present, and key frames
+
+
+IARPA funds Italian researcher https://www.micc.unifi.it/projects/glaivejanus/
+
 {% include 'dashboard.html' %}
 
 {% include 'supplementary_header.html' %}