readme

author: Jules Laplace <julescarbon@gmail.com> 2019-02-08 23:37:44 +0100
committer: Jules Laplace <julescarbon@gmail.com> 2019-02-08 23:37:44 +0100
commit: cc96e02f88db212d2ac3fb709a53bf26d8995aa7 (patch)
tree: 93b566da54fc76d593d09d059df57a985eed4439 /scraper/README.md
parent: d2cbf2f7cb64f36c04612e3a7d996ed1b8ce7228 (diff)
1 files changed, 4 insertions, 0 deletions
diff --git a/scraper/README.md b/scraper/README.md
index 4399abd3..33b2d975 100644
--- a/scraper/README.md
+++ b/scraper/README.md
@@ -42,6 +42,10 @@ We do a two-stage fetch process as only about 66% of their papers are in this da
 
 Loads titles from citations file and queries the S2 search API to get paper IDs, then uses the paper IDs from the search entries to query the S2 papers API to get first-degree citations, authors, etc.
 
+### s2-papers.py
+
+Searching by title is not always accurate, so run the s2-papers.py script to build a report of all the papers and correct any that did not resolve. The report also lists papers without a location.
+
 ### s2-dump-ids.py
 
 Dump all the paper IDs and citation IDs from the queried papers.
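
The report step described for s2-papers.py can be sketched roughly as below. This is a minimal illustration, not the script's actual code: the record layout (`title`, `location`, `paperId`), the function names, and the 0.9 similarity threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def title_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two titles, in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def build_report(citations, search_results, threshold=0.9):
    """Sort citation records into resolved, unresolved, and no-location lists.

    citations: list of dicts with "title" and an optional "location"
        (hypothetical layout, not taken from the repository).
    search_results: dict mapping a citation title to the best search hit,
        a dict with "paperId" and "title", or None when nothing was found.
    threshold: assumed cutoff below which a hit counts as unresolved.
    """
    report = {"resolved": [], "unresolved": [], "no_location": []}
    for entry in citations:
        title = entry["title"]
        match = search_results.get(title)
        # A hit only counts as resolved if its title closely matches ours.
        if match and title_similarity(title, match["title"]) >= threshold:
            report["resolved"].append((title, match["paperId"]))
        else:
            report["unresolved"].append(title)
        # Papers without a location are reported separately.
        if not entry.get("location"):
            report["no_location"].append(title)
    return report
```

The `unresolved` and `no_location` lists are the entries one would then correct by hand in the citations file.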