@Guillawme
Last active September 18, 2022 08:53
get_all_phenixbb.sh
#!/usr/bin/env bash
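# Work inside a phenixbb/ directory, creating it if needed.
mkdir -p phenixbb
cd phenixbb || exit 1

# List the monthly archives (e.g. 2022-September.txt.gz) from the index page.
curl -s https://phenix-online.org/pipermail/phenixbb/ |
    grep -o "2[0-9]\+-[A-Z][a-z]\+\.txt\.gz" |
    while read -r input_url
    do
        # Replace the month name with its number so the files sort chronologically.
        output_txt=$(basename "${input_url}" .gz |
            sed -e 's/January/01/' \
                -e 's/February/02/' \
                -e 's/March/03/' \
                -e 's/April/04/' \
                -e 's/May/05/' \
                -e 's/June/06/' \
                -e 's/July/07/' \
                -e 's/August/08/' \
                -e 's/September/09/' \
                -e 's/October/10/' \
                -e 's/November/11/' \
                -e 's/December/12/')
        # Download and decompress one month of the archive.
        curl -s "https://phenix-online.org/pipermail/phenixbb/${input_url}" |
            gunzip > "${output_txt}"
    done

# Concatenate the months in chronological order, then remove the per-month files.
# Zero-padded YYYY-MM names already sort correctly with a plain lexical sort.
find . -name '????-??.txt' | sort | xargs cat > phenixbb.txt
find . -name '????-??.txt' -delete
cd ..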
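
The month-name renaming step in the script above can be tried on its own, without downloading anything. The following is only a sketch: `archive_to_txt` is a hypothetical helper name that does not appear in the original script, and it uses a `case` statement in place of the `sed` chain, which is a swapped-in technique rather than the author's method.

```shell
#!/bin/sh
# Hypothetical helper (not part of get_all_phenixbb.sh): map a pipermail
# archive name such as 2022-September.txt.gz to a sortable 2022-09.txt.
archive_to_txt() {
    name=$(basename "$1" .gz)   # 2022-September.txt
    year=${name%%-*}            # 2022
    month=${name#*-}            # September.txt
    month=${month%.txt}         # September
    case $month in
        January) num=01 ;;  February) num=02 ;;  March) num=03 ;;
        April) num=04 ;;    May) num=05 ;;       June) num=06 ;;
        July) num=07 ;;     August) num=08 ;;    September) num=09 ;;
        October) num=10 ;;  November) num=11 ;;  December) num=12 ;;
        *) echo "unknown month: $month" >&2; return 1 ;;
    esac
    printf '%s-%s.txt\n' "$year" "$num"
}

archive_to_txt 2022-September.txt.gz   # prints 2022-09.txt
archive_to_txt 2010-May.txt.gz         # prints 2010-05.txt
```

Unlike the `sed` chain, this variant reports an error for a name it does not recognise instead of passing it through unchanged, which may or may not be what you want.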