@tonejito
Last active April 25, 2020 08:41
Pull COVID-19 data from the official site (coronavirus.gob.mx) and convert it to JSON format
#!/bin/bash
# Andrés Hernández (tonejito)
# This script is released under the terms of the BSD 2-Clause License
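# -e: exit on error; -v/-x: echo and trace commands;
# pipefail: do not let "| tee" hide a failed curl
set -evx -o pipefail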
SITE="coronavirus.gob.mx"
ENDPOINT="datos"
DATA_FILE="Downloads/filesDD.php"
SETS="csvaxd csvmun"
TYPES="Confirmados Sospechosos Negativos Defunciones"
for SET in ${SETS}
do
  for TYPE in ${TYPES}
  do
    # Get the not-csv file for each ${SET} and ${TYPE}
    # The multipart body needs real CRLF line endings, so it is written
    # with $'...' (ANSI-C quoting) instead of $"..." (locale translation)
    curl -vfsSL \
      -H "referer: https://${SITE}/${ENDPOINT}/" \
      -H "authority: ${SITE}" \
      -H "origin: https://${SITE}" \
      -H "sec-fetch-site: same-origin" \
      -H "sec-fetch-mode: cors" \
      -H "sec-fetch-dest: empty" \
      -H "accept: */*" \
      -H "content-type: multipart/form-data; boundary=----MIMEBOUNDARY" \
      --data-binary $'------MIMEBOUNDARY\r\nContent-Disposition: form-data; name="sPatType"\r\n\r\n'"${TYPE}"$'\r\n------MIMEBOUNDARY--\r\n' \
      -o "./${SET}-${TYPE}.not-csv" \
      "https://${SITE}/${ENDPOINT}/${DATA_FILE}?${SET}" \
      2>&1 | tee "./${SET}-${TYPE}.headers.log"
    # Clean up the file to get the JSON payload:
    # keep line 2, strip the leading "data": key and the trailing comma
    sed -n 2p "./${SET}-${TYPE}.not-csv" | \
      sed -e 's/^[[:space:]]\{1,\}"data":[[:space:]]//g' -e 's/,$//g' | \
      jq -r '.' > "./${SET}-${TYPE}.json"
    # Be nice with the net
    sleep 1
  done
done
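
The two fiddly steps in the script (building the multipart body and cutting the JSON value out of the response) can be tried offline. The following is a minimal sketch, not part of the original script: the helper names build_body and extract_payload are hypothetical, and SAMPLE is an invented three-line response that only assumes the shape the sed expressions expect, with the "data" key on line 2.

```shell
#!/bin/bash
# Sketch only: build_body, extract_payload and SAMPLE are illustrative
# assumptions, not part of the original script or of the site's API.

BOUNDARY="----MIMEBOUNDARY"

# Build a one-field multipart/form-data body with CRLF line endings.
# Each delimiter line is "--" followed by the boundary; the closing
# delimiter additionally ends in "--".
build_body() {
  local type="$1"
  printf -- '--%s\r\n' "${BOUNDARY}"
  printf 'Content-Disposition: form-data; name="sPatType"\r\n\r\n'
  printf '%s\r\n' "${type}"
  printf -- '--%s--\r\n' "${BOUNDARY}"
}

# Read a response on stdin, keep line 2, then drop the leading
# '"data": ' key and the trailing comma so only the JSON value remains.
extract_payload() {
  sed -n 2p | sed -e 's/^[[:space:]]\{1,\}"data":[[:space:]]//' -e 's/,$//'
}

# Invented response in the assumed shape (the real payload may differ).
SAMPLE=$'{\n  "data": [{"estado": "Example", "total": 1}],\n  "other": 0}'

# Show the body byte by byte so the \r \n pairs are visible.
build_body "Confirmados" | od -c

# Show the extracted JSON value.
printf '%s\n' "${SAMPLE}" | extract_payload
```

If this approach were used in the main loop, the body could be piped to curl with --data-binary @-, which reads the request body from standard input. That would avoid the long inline string, at the cost of one extra function.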