@gurdiga
Created August 16, 2019 10:37
The Makefile I used to mangle some Optimizely logs
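.PHONY: default sync download download-day maxlen fields upload remove-manifests remove-csv-headers

default:
	@echo "Make what?" >&2; exit 1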
sync: download upload
download:
	time aws --profile mx-optimizely s3 sync s3://optimizely-export-ng/10629356/10629356/2.0/2019/08/14/ 2019/08/14/
download-day:
	@test -n "$(DAY)" || { echo "Give me a DAY, please. Something like DAY=08/21." >&2; exit 1; }
	time aws --profile mx-optimizely s3 sync s3://optimizely-export-ng/10629356/10629356/2.0/2019/$(DAY)/ 2019/$(DAY)/
# No input source is given, so parallel reads file names from stdin, e.g.:
#   find 2019 -name '*.gz' | make maxlen FIELD_NO=20
maxlen:
	@test -n "$(FIELD_NO)" || { echo "Give me a FIELD_NO, please. Something like FIELD_NO=20." >&2; exit 1; }
	parallel -j 20 'echo {}>/dev/stderr; gzcat {} | awk -F "\t" "{ print length(\$$$(FIELD_NO)) }" | sort -g | tail -1 | tee /dev/stderr' | \
	sort -g | tail
fields:
	find 2019 -name '*.gz' | head -1 | xargs gzcat | head -1 | tr '\t' '\n' | nl
upload:
	time aws s3 sync 2019/08/14/ s3://blueshift_aws_staging/optimizely-logs-ng/2019/08/14/
remove-manifests:
	find 2019 -name status.yaml -delete
remove-csv-headers:
	time find . -name '*tsv.gz' | while IFS= read -r file; do gzcat "$$file" | tail -n +2 | gzip | sponge "$$file"; done
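
A note on guarding a recipe against a missing variable such as DAY or FIELD_NO: a chain of the form `test -z "$VAR" && (echo ...; exit 1) || command` still runs `command` when the variable is empty, because the failing subshell is exactly what triggers the `||` branch. The sketch below is plain sh, not Makefile syntax; the function names `leaky_guard` and `strict_guard` are hypothetical, and `echo` stands in for the real `aws` call so it is safe to run anywhere.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; DAY mirrors the Makefile variable.

# Chain shape: when DAY is empty the subshell exits 1, so the `||`
# branch runs the very command the check was meant to block.
leaky_guard() {
    DAY=$1
    test -z "$DAY" && (echo "Give me a DAY, please." >&2 && exit 1) || \
        echo "would sync 2019/$DAY/"
}

# Fail-fast shape: stop with a non-zero status when DAY is empty,
# and only then reach the command.
strict_guard() {
    DAY=$1
    test -n "$DAY" || { echo "Give me a DAY, please." >&2; return 1; }
    echo "would sync 2019/$DAY/"
}

leaky_guard ""                                   # hint on stderr, then "would sync 2019//"
strict_guard "" || echo "stopped before syncing" # hint on stderr, then this message
strict_guard "08/21"                             # "would sync 2019/08/21/"
```

In a Makefile the fail-fast check can sit on its own recipe line, since make stops the recipe as soon as one line exits non-zero.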