@inutano
Created July 17, 2018 07:02
#!/bin/sh
# Fetch (run ID, total sequences) pairs from the Quanto data set and sum the totals per archive (NCBI, EBI, DDBJ)
RDF_PORTAL_EP="http://integbio.jp/rdf/sparql"
QUERY=$(
cat <<'EOF'
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sos: <http://purl.jp/bio/01/quanto/ontology/sos#>
SELECT ?insdcId ?total_sequences
FROM <http://quanto.dbcls.jp>
WHERE {
  ?quanto rdfs:seeAlso ?insdcId ;
          sos:totalSequences ?x1 .
  ?x1 rdf:value ?total_sequences .
}
EOF
)
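curl -H 'Accept: text/tab-separated-values' --data-urlencode "query=${QUERY}" "${RDF_PORTAL_EP}" | \
awk -F '\t' '
# The SPARQL TSV results format starts with a header row of variable names; skip it.
NR == 1 { next }
{
  records++
  # Strip double quotes in case the endpoint quotes the literal value.
  gsub(/"/, "", $2)
}
/SRR/ { ncbi += $2 }  # runs archived at NCBI
/ERR/ { ebi += $2 }   # runs archived at EBI
/DRR/ { ddbj += $2 }  # runs archived at DDBJ
END {
  total = ncbi + ebi + ddbj
  print "total " (records + 0) " records"
  # Skip the percentages when nothing matched, to avoid dividing by zero.
  if (total == 0) exit
  print "ncbi: " (ncbi + 0) " (" ncbi / total * 100 "%)"
  print "ebi: " (ebi + 0) " (" ebi / total * 100 "%)"
  print "ddbj: " (ddbj + 0) " (" ddbj / total * 100 "%)"
}
'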