@davidlenz
Last active June 4, 2018 17:51
Download PDF files from arXiv based on a search query, using the arxiv.py wrapper: https://github.com/lukasschwab/arxiv.py
import os
import time

import arxiv

QUERY = 'ECB'
NUM_RESULTS = 10  # maximum number of search results to fetch
SLEEPTIME = 0.1  # seconds to wait between downloads

# Save papers in a per-query subdirectory, creating it if needed
savedir = './arxiv_papers/{}/'.format(QUERY)
if not os.path.exists(savedir):
    os.makedirs(savedir)

# Query for papers of interest
papers = arxiv.query(search_query=QUERY, max_results=NUM_RESULTS)

# Download each paper, continuing past individual failures
for paper in papers:
    try:
        arxiv.download(paper, dirname=savedir, slugify=True)
    except Exception as e:
        print(e)
    # be gentle to the API
    time.sleep(SLEEPTIME)
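
The loop above prints each error and moves on, so afterwards there is no record of which papers failed. A minimal sketch of the same "try, report, sleep" pattern as a reusable function is shown below. The names `download_all` and `download_fn` are hypothetical and not part of arxiv.py; `download_fn` stands in for whatever download call is in use and is assumed to return the saved file path or raise on failure.

```python
import time
from typing import Callable, Iterable, List, Tuple


def download_all(
    papers: Iterable[object],
    download_fn: Callable[[object], str],
    sleep_seconds: float = 0.1,
) -> Tuple[List[str], List[Tuple[object, Exception]]]:
    """Try to download every paper, pausing after each attempt.

    download_fn is a hypothetical stand-in for the real download call;
    it should return the saved path or raise an exception on failure.
    Returns (saved_paths, failures), where failures pairs each paper
    with the exception it raised.
    """
    saved = []
    failed = []
    for paper in papers:
        try:
            saved.append(download_fn(paper))
        except Exception as exc:
            # Keep going, but remember what failed and why
            failed.append((paper, exc))
        # Be gentle to the API: pause after every attempt
        time.sleep(sleep_seconds)
    return saved, failed


if __name__ == "__main__":
    # Simulated downloader so the sketch runs without network access
    def fake_download(paper):
        if paper == "bad":
            raise IOError("simulated failure")
        return "./arxiv_papers/ECB/{}.pdf".format(paper)

    saved, failed = download_all(["a", "bad", "b"], fake_download, sleep_seconds=0)
    print(saved)
    print([paper for paper, _ in failed])
```

With the script above, this could be called as `download_all(papers, lambda p: arxiv.download(p, dirname=savedir, slugify=True), SLEEPTIME)`, assuming the installed version of arxiv.py still provides `arxiv.download` as used in this gist. What that call returns is not stated here, so the contents of the `saved` list depend on the library version.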