@IsmailM
Last active June 11, 2020 22:21
Using the PGP API to get FASTQ URLs, MD5 checksums, and sizes
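# The commands below use jq (https://stedolan.github.io/jq/) and
# the PGP API v1.2 (https://www.personalgenomes.org.uk/api/v1.2/).
# Each CSV row holds the hex_id, then the FASTQ URLs, the MD5s, and the file sizes converted from bytes to GiB.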
curl -X GET "https://www.personalgenomes.org.uk/api/v1.2/all_wgs" -H "accept: application/json" | jq -r '
.[] | [
.hex_id,
(.data[]?.fastq_ftp),
(.data[]?.fastq_md5),
(.data[]?.fastq_bytes | split(";") | .[] | tonumber | . /1024/1024/1024)
] | flatten | @csv' > wgs_fastqs.csv
# Note: some records have three FASTQ files, so the CSV columns do not fully line up :(
# The same query for the 3 exome sequencing (WXS) datasets.
# Note: this endpoint is not documented, but it exists (sorry).
curl -X GET "https://www.personalgenomes.org.uk/api/v1.2/all_wxs" -H "accept: application/json" | jq -r '
.[] | [
.hex_id,
(.data[]?.fastq_ftp),
(.data[]?.fastq_md5),
(.data[]?.fastq_bytes | split(";") | .[] | tonumber | . /1024/1024/1024)
] | flatten | @csv' > wxs_fastqs.csv
# Note: in the commands above you can also split the fastq_ftp and fastq_md5 fields on ";"
# by swapping these two lines in for the corresponding ones:
(.data[]?.fastq_ftp | split(";")),
(.data[]?.fastq_md5 | split(";")),
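# A minimal sketch of one way around the misaligned columns noted above: write one CSV row
# per FASTQ file (hex_id, URL, MD5, size in GiB) instead of one row per record.
# Assumptions: fastq_ftp, fastq_md5 and fastq_bytes are ";"-separated strings with the same
# number of entries. The fastq_rows name and the sample record are made up for illustration.

```shell
# Hypothetical helper: reads the API's JSON array on stdin, prints one CSV row per FASTQ file.
fastq_rows() {
  jq -r '
    .[]
    | .hex_id as $id
    | .data[]?
    | select(.fastq_ftp != null)
    # Split each ";"-separated field into a list, converting sizes from bytes to GiB.
    | [ (.fastq_ftp   | split(";")),
        (.fastq_md5   | split(";")),
        (.fastq_bytes | split(";") | map(tonumber / 1024 / 1024 / 1024)) ]
    # transpose turns [urls, md5s, sizes] into one [url, md5, size] triple per file.
    | transpose[]
    | [$id] + .
    | @csv'
}

# Made-up sample record (not real PGP data) so the function can be tried offline.
sample='[{"hex_id":"uk000001","data":[{"fastq_ftp":"ftp.example.org/a_1.fastq.gz;ftp.example.org/a_2.fastq.gz","fastq_md5":"aaa;bbb","fastq_bytes":"1073741824;2147483648"}]}]'
printf '%s' "$sample" | fastq_rows

# Against the live API (needs network access):
# curl "https://www.personalgenomes.org.uk/api/v1.2/all_wgs" -H "accept: application/json" | fastq_rows > wgs_fastqs_long.csv
```

# If a record lists a different number of URLs, MD5s and sizes, transpose pads the short
# lists with null, so those rows come out with empty cells rather than shifted columns.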