@ecoopnet
Last active March 5, 2022 19:06
For the AI study group: download PDFs from a list of arXiv URLs
#!/bin/sh
# Setup:
# 1. Download download-arxiv-pdfs.sh (this file)
# 2. chmod +x download-arxiv-pdfs.sh
# Usage:
# ./download-arxiv-pdfs.sh 2022-02-18-recent.txt # pass the study group's recent.txt or hype.txt
top10txtfile="$1"
# Require a non-empty file whose name ends in .txt.
if [ ! -s "$top10txtfile" ] || [ "${top10txtfile%.txt}" = "$top10txtfile" ]; then
  echo "Please specify an arxiv sanity top hype or top recent summary .txt file." >&2
  exit 1
fi
noext=$(basename "$top10txtfile" .txt)
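# Download each arXiv abstract URL in the list as <list name>-<n>.pdf.
i=0
for f in $(grep -oE '^https?://arxiv\.org/abs/[^[:space:]]+' "$top10txtfile"); do
  i=$((i + 1))
  # Turn the abstract URL into the PDF URL.
  pdf=$(echo "$f" | sed -e 's#/abs/#/pdf/#')
  outfile="${noext}-$i.pdf"
  echo "$pdf to $outfile"
  curl -L -o "$outfile" "$pdf"
  # Alternative: wget -c -O "$outfile" "$pdf.pdf"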
done
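
The URL-to-filename mapping used by the script can be checked without touching the network. The following is a minimal dry-run sketch, not part of the original gist: the function names `abs_to_pdf` and `plan_downloads` are hypothetical, and it assumes each abstract URL starts at the beginning of a line and that your `grep` supports `-o` (GNU and BSD grep both do).

```shell
#!/bin/sh
# Dry run: print what would be downloaded instead of calling curl.
# abs_to_pdf and plan_downloads are illustrative names, not from the gist.

# Rewrite an arXiv abstract URL to the corresponding PDF URL.
abs_to_pdf() {
  printf '%s\n' "$1" | sed -e 's#/abs/#/pdf/#'
}

# Print "<pdf-url> <output-file>" for each abstract URL found in a list file.
plan_downloads() {
  listfile="$1"
  prefix=$(basename "$listfile" .txt)
  i=0
  # -o keeps only the URL itself, so trailing text on a line is ignored.
  for url in $(grep -oE '^https?://arxiv\.org/abs/[^[:space:]]+' "$listfile"); do
    i=$((i + 1))
    printf '%s %s\n' "$(abs_to_pdf "$url")" "${prefix}-$i.pdf"
  done
}
```

Calling `plan_downloads 2022-02-18-recent.txt` prints one line per matched URL, pairing the PDF URL with the numbered output file name, which makes it easy to confirm the list is parsed as expected before running the real download.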