author Jules Laplace <julescarbon@gmail.com> 2019-02-08 23:37:44 +0100
committer Jules Laplace <julescarbon@gmail.com> 2019-02-08 23:37:44 +0100
commit cc96e02f88db212d2ac3fb709a53bf26d8995aa7 (patch)
tree 93b566da54fc76d593d09d059df57a985eed4439 /scraper/README.md
parent d2cbf2f7cb64f36c04612e3a7d996ed1b8ce7228 (diff)
readme
Diffstat (limited to 'scraper/README.md')
-rw-r--r-- scraper/README.md | 4
1 files changed, 4 insertions, 0 deletions
diff --git a/scraper/README.md b/scraper/README.md
index 4399abd3..33b2d975 100644
--- a/scraper/README.md
+++ b/scraper/README.md
@@ -42,6 +42,10 @@ We do a two-stage fetch process as only about 66% of their papers are in this da
Loads titles from citations file and queries the S2 search API to get paper IDs, then uses the paper IDs from the search entries to query the S2 papers API to get first-degree citations, authors, etc.
+### s2-papers.py
+
+Searching is not fully accurate, so run the s2-papers.py script to build a report of all the papers and correct any that did not resolve. The script also reports papers without a location.
+
### s2-dump-ids.py
Dump all the paper IDs and citation IDs from the queried papers.
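
The report step described for s2-papers.py can be pictured with a minimal sketch. This is not the script's actual code: the row layout (`title`, `paper_id`, `location`) and the function name `build_report` are assumptions made up for illustration. It only shows the idea of listing papers whose search did not resolve to an ID, plus papers without a location.

```python
from typing import Dict, List, Optional

# Hypothetical row shape: one dict per paper from the citations file.
# "paper_id" is None when the search did not resolve; "location" may be missing.
Row = Dict[str, Optional[str]]


def build_report(rows: List[Row]) -> Dict[str, List[str]]:
    """Group paper titles by the problem that needs manual correction."""
    report: Dict[str, List[str]] = {"unresolved": [], "no_location": []}
    for row in rows:
        title = row.get("title") or "(untitled)"
        if not row.get("paper_id"):
            # The search did not yield a usable paper ID.
            report["unresolved"].append(title)
        if not row.get("location"):
            # No location is recorded for this paper.
            report["no_location"].append(title)
    return report


if __name__ == "__main__":
    sample = [
        {"title": "Paper A", "paper_id": "abc123", "location": "Somewhere"},
        {"title": "Paper B", "paper_id": None, "location": "Elsewhere"},
        {"title": "Paper C", "paper_id": "def456", "location": None},
    ]
    for problem, titles in build_report(sample).items():
        print(problem, titles)
```

A paper can appear in both lists, since a missing ID and a missing location are checked independently.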