@jhpoelen
Last active October 16, 2020 01:05
bash script to get pollination records
#!/bin/bash
#
# 2020-10-15
#
# This script selects pollination and flower visit records from
# one of the data products provided via https://globalbioticinteractions.org/data .
#
# This particular example uses a July 2020 data publication.
#
# For more recent data, see https://globalbioticinteractions.org/data .
#
# For a list of the OBO Relation Ontology (RO) terms that GloBI supports, see
# https://github.com/globalbioticinteractions/nomer/blob/main/nomer/src/test/resources/org/globalbioticinteractions/nomer/match/ro.tsv
#
# Please feel free to contact me if you have questions,
# -jorrit
#
# print commands and stop on errors
set -xe
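# print header (first line of the archive only)
# -L follows redirects, in case the download URL has moved
curl -L "https://zenodo.org/record/3950590/files/interactions.tsv.gz" \
| gunzip \
| head -n1 \
| gzip \
> indexed-pollination-and-flower-visit-records.tsv.gz
# append data: keep only lines that mention one of the selected
# pollination and flower visit RO terms
curl -L "https://zenodo.org/record/3950590/files/interactions.tsv.gz" \
| gunzip \
| grep -E "(RO_0002455)|(RO_0002456)|(RO_0002622)|(RO_0002623)" \
| gzip \
>> indexed-pollination-and-flower-visit-records.tsv.gz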
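#
# The two steps above each download the full archive. As a minimal sketch
# (not part of the original workflow), the same selection can be done in a
# single pass over a local copy of the archive. The function name and its
# argument are hypothetical; the RO terms are the ones used above. The
# function is only defined here, not called, so it does not change what
# this script produces.
select_pollination_records() {
  # $1: path to a local, gzip-compressed copy of interactions.tsv.gz (assumed)
  # prints the header line (line 1) plus every line mentioning a selected RO term
  gunzip -c "$1" \
  | awk 'NR == 1 || /RO_0002455|RO_0002456|RO_0002622|RO_0002623/'
}
# Possible usage, assuming the archive was saved locally as interactions.tsv.gz:
#   select_pollination_records interactions.tsv.gz | gzip > selected-records.tsv.gz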
#
# If all goes well, the resulting file contains 113049 lines: 1 header line plus 113048 interaction records.
# $ zcat indexed-pollination-and-flower-visit-records.tsv.gz | wc -l
# 113049
#
# and the uncompressed data has content id: hash://sha256/909f25361299bf7f9f478924775ebe52154f5cde882e29ad6b0c61e816b9f6d1
# $ cat indexed-pollination-and-flower-visit-records.tsv.gz | gunzip | sha256sum
# 909f25361299bf7f9f478924775ebe52154f5cde882e29ad6b0c61e816b9f6d1 -
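#
# The manual checks above can also be scripted. A minimal sketch, assuming
# GNU sha256sum is available (as in the check above); the function name and
# its arguments are hypothetical. It returns success only if both the line
# count and the sha256 hash of the uncompressed data match the expected values.
verify_records() {
  # $1: gzip-compressed file, $2: expected line count, $3: expected sha256 (hex)
  local file="$1" expected_lines="$2" expected_sha256="$3"
  local lines sha
  # tr strips the padding that some wc implementations add around the count
  lines=$(gunzip -c "$file" | wc -l | tr -d ' ')
  sha=$(gunzip -c "$file" | sha256sum | cut -d' ' -f1)
  [ "$lines" -eq "$expected_lines" ] && [ "$sha" = "$expected_sha256" ]
}
# Possible usage, with the values reported above:
#   verify_records indexed-pollination-and-flower-visit-records.tsv.gz 113049 \
#     909f25361299bf7f9f478924775ebe52154f5cde882e29ad6b0c61e816b9f6d1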