@seandavi
Created November 11, 2023 11:02
Example queries from the ENA Portal API

The ENA Portal API is documented at https://www.ebi.ac.uk/ena/portal/api/swagger-ui/

The API has a limit parameter but no offset; it is designed to simply stream large result sets. The queries below pass limit=0 to request every matching record.
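Because the only paging control is limit, a small limit is a convenient way to preview a query before streaming the full result. The following is a minimal sketch of that idea; `ena_search_url` is a hypothetical helper (not part of the API) that assembles a search URL for the `read_run` result type from a query, a comma-separated field list, a format, and a limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an ENA Portal API search URL from its parts.
# Arguments: query, comma-separated fields, format (default tsv), limit (default 0).
ena_search_url() {
  local query=$1 fields=$2 format=${3:-tsv} limit=${4:-0}
  # Percent-encode the commas in the field list as %2C, as the queries in this gist do.
  local encoded_fields=${fields//,/%2C}
  printf 'https://www.ebi.ac.uk/ena/portal/api/search?query=%s&result=read_run&fields=%s&format=%s&limit=%s\n' \
    "$query" "$encoded_fields" "$format" "$limit"
}

# Usage (makes a network request, so it is left commented out):
#   curl -s "$(ena_search_url 'study_accession=PRJNA339914' 'run_accession,fastq_ftp' tsv 5)"
```

Keeping the field list comma-separated in the call and encoding it inside the helper makes the command easier to read than a hand-encoded URL; the short field list here is only an illustration.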

Search by SRA study ID

Output as TSV

SEARCH_QUERY='secondary_study_accession=SRP082656' && curl "https://www.ebi.ac.uk/ena/portal/api/search?query=${SEARCH_QUERY}&result=read_run&fields=experiment_accession%2Cexperiment_title%2Csecondary_study_accession%2Caligned%2Caltitude%2Cassembly_quality%2Cassembly_software%2Cbam_aspera%2Cbam_bytes%2Cbam_ftp%2Cbam_galaxy%2Cbam_md5%2Cbase_count%2Cbinning_software%2Cbio_material%2Cbisulfite_protocol%2Cbroad_scale_environmental_context%2Cbroker_name%2Ccage_protocol%2Ccell_line%2Ccell_type%2Ccenter_name%2Cchecklist%2Cchip_ab_provider%2Cchip_protocol%2Cchip_target%2Ccollected_by%2Ccollection_date%2Ccollection_date_end%2Ccollection_date_start%2Ccompleteness_score%2Ccontamination_score%2Ccontrol_experiment%2Ccountry%2Ccultivar%2Cculture_collection%2Cdatahub%2Cdepth%2Cdescription%2Cdev_stage%2Cdnase_protocol%2Cecotype%2Celevation%2Cenvironment_biome%2Cenvironment_feature%2Cenvironment_material%2Cenvironmental_medium%2Cenvironmental_sample%2Cexperiment_alias%2Cexperiment_target%2Cexperimental_factor%2Cexperimental_protocol%2Cextraction_protocol%2Cfaang_library_selection%2Cfastq_aspera%2Cfastq_bytes%2Cfastq_ftp%2Cfastq_galaxy%2Cfastq_md5%2Cfile_location%2Cfirst_created%2Cfirst_public%2Cgermline%2Chi_c_protocol%2Chost%2Chost_body_site%2Chost_genotype%2Chost_gravidity%2Chost_growth_conditions%2Chost_phenotype%2Chost_scientific_name%2Chost_sex%2Chost_status%2Chost_tax_id%2Cidentified_by%2Cinstrument_model%2Cinstrument_platform%2Cinvestigation_type%2Cisolate%2Cisolation_source%2Clast_updated%2Clat%2Clibrary_construction_protocol%2Clibrary_gen_protocol%2Clibrary_layout%2Clibrary_max_fragment_size%2Clibrary_min_fragment_size%2Clibrary_name%2Clibrary_pcr_isolation_protocol%2Clibrary_prep_date%2Clibrary_prep_date_format%2Clibrary_prep_latitude%2Clibrary_prep_location%2Clibrary_prep_longitude%2Clibrary_selection%2Clibrary_source%2Clibrary_strategy%2Clocal_environmental_context%2Clocation%2Clocation_end%2Clocation_start%2Clon%2Cmarine_region%2Cmating_type%2Cncbi_reporting_standard%2Cnominal_length%2Cnominal_sdev%2Cpcr_isolation_protocol%2Cph%2Cproject_name%2Cprotocol_label%2Cread_count%2Cread_strand%2Crestriction_enzyme%2Crestriction_enzyme_target_sequence%2Crestriction_site%2Crna_integrity_num%2Crna_prep_3_protocol%2Crna_prep_5_protocol%2Crna_purity_230_ratio%2Crna_purity_280_ratio%2Crt_prep_protocol%2Crun_accession%2Crun_alias%2Crun_date%2Csalinity%2Csample_accession%2Csample_alias%2Csample_capture_status%2Csample_collection%2Csample_description%2Csample_material%2Csample_prep_interval%2Csample_prep_interval_units%2Csample_storage%2Csample_storage_processing%2Csample_title%2Csampling_campaign%2Csampling_platform%2Csampling_site%2Cscientific_name%2Csecondary_project%2Csecondary_sample_accession%2Csequencing_date%2Csequencing_date_format%2Csequencing_location%2Csequencing_longitude%2Csequencing_method%2Csequencing_primer_catalog%2Csequencing_primer_lot%2Csequencing_primer_provider%2Cserotype%2Cserovar%2Csex%2Cspecimen_voucher%2Csra_aspera%2Csra_bytes%2Csra_ftp%2Csra_galaxy%2Csra_md5%2Cstatus%2Cstrain%2Cstudy_accession%2Cstudy_alias%2Cstudy_title%2Csub_species%2Csub_strain%2Csubmission_accession%2Csubmission_tool%2Csubmitted_aspera%2Csubmitted_bytes%2Csubmitted_format%2Csubmitted_ftp%2Csubmitted_galaxy%2Csubmitted_host_sex%2Csubmitted_md5%2Csubmitted_read_type%2Ctag%2Ctarget_gene%2Ctax_id%2Ctaxonomic_classification%2Ctaxonomic_identity_marker%2Ctemperature%2Ctissue_lib%2Ctissue_type%2Ctransposase_protocol%2Cvariety&format=tsv&limit=0"
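With this many fields requested, the TSV response is very wide, so it can help to cut it down to a few columns by header name once it arrives. The following is a minimal sketch, assuming the first line of the response is a tab-separated header row; `tsv_columns` is a hypothetical helper, not something the API provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print selected columns of a TSV stream, chosen by header name.
# Argument: comma-separated column names, in the order they should be printed.
tsv_columns() {
  awk -F'\t' -v OFS='\t' -v want="$1" '
    NR == 1 {
      n = split(want, names, ",")
      for (i = 1; i <= NF; i++) pos[$i] = i          # header name -> column index
      for (j = 1; j <= n; j++) {
        if (!(names[j] in pos)) {
          print "unknown column: " names[j] > "/dev/stderr"
          exit 1
        }
      }
    }
    {
      # Runs for the header row too, so the output keeps a matching header.
      line = ""
      for (j = 1; j <= n; j++) {
        line = line (j > 1 ? OFS : "") $(pos[names[j]])
      }
      print line
    }
  '
}

# Usage (SEARCH_URL stands for any search URL with format=tsv; network call, commented out):
#   curl -s "$SEARCH_URL" | tsv_columns 'run_accession,fastq_ftp'
```

Looking columns up by name rather than by position means the filter does not depend on the order in which the fields come back.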

Search by BioProject ID

Output as TSV

SEARCH_QUERY='study_accession=PRJNA339914' && curl "https://www.ebi.ac.uk/ena/portal/api/search?query=${SEARCH_QUERY}&result=read_run&fields=experiment_accession%2Cexperiment_title%2Csecondary_study_accession%2Caligned%2Caltitude%2Cassembly_quality%2Cassembly_software%2Cbam_aspera%2Cbam_bytes%2Cbam_ftp%2Cbam_galaxy%2Cbam_md5%2Cbase_count%2Cbinning_software%2Cbio_material%2Cbisulfite_protocol%2Cbroad_scale_environmental_context%2Cbroker_name%2Ccage_protocol%2Ccell_line%2Ccell_type%2Ccenter_name%2Cchecklist%2Cchip_ab_provider%2Cchip_protocol%2Cchip_target%2Ccollected_by%2Ccollection_date%2Ccollection_date_end%2Ccollection_date_start%2Ccompleteness_score%2Ccontamination_score%2Ccontrol_experiment%2Ccountry%2Ccultivar%2Cculture_collection%2Cdatahub%2Cdepth%2Cdescription%2Cdev_stage%2Cdnase_protocol%2Cecotype%2Celevation%2Cenvironment_biome%2Cenvironment_feature%2Cenvironment_material%2Cenvironmental_medium%2Cenvironmental_sample%2Cexperiment_alias%2Cexperiment_target%2Cexperimental_factor%2Cexperimental_protocol%2Cextraction_protocol%2Cfaang_library_selection%2Cfastq_aspera%2Cfastq_bytes%2Cfastq_ftp%2Cfastq_galaxy%2Cfastq_md5%2Cfile_location%2Cfirst_created%2Cfirst_public%2Cgermline%2Chi_c_protocol%2Chost%2Chost_body_site%2Chost_genotype%2Chost_gravidity%2Chost_growth_conditions%2Chost_phenotype%2Chost_scientific_name%2Chost_sex%2Chost_status%2Chost_tax_id%2Cidentified_by%2Cinstrument_model%2Cinstrument_platform%2Cinvestigation_type%2Cisolate%2Cisolation_source%2Clast_updated%2Clat%2Clibrary_construction_protocol%2Clibrary_gen_protocol%2Clibrary_layout%2Clibrary_max_fragment_size%2Clibrary_min_fragment_size%2Clibrary_name%2Clibrary_pcr_isolation_protocol%2Clibrary_prep_date%2Clibrary_prep_date_format%2Clibrary_prep_latitude%2Clibrary_prep_location%2Clibrary_prep_longitude%2Clibrary_selection%2Clibrary_source%2Clibrary_strategy%2Clocal_environmental_context%2Clocation%2Clocation_end%2Clocation_start%2Clon%2Cmarine_region%2Cmating_type%2Cncbi_reporting_standard%2Cnominal_length%2Cnominal_sdev%2Cpcr_isolation_protocol%2Cph%2Cproject_name%2Cprotocol_label%2Cread_count%2Cread_strand%2Crestriction_enzyme%2Crestriction_enzyme_target_sequence%2Crestriction_site%2Crna_integrity_num%2Crna_prep_3_protocol%2Crna_prep_5_protocol%2Crna_purity_230_ratio%2Crna_purity_280_ratio%2Crt_prep_protocol%2Crun_accession%2Crun_alias%2Crun_date%2Csalinity%2Csample_accession%2Csample_alias%2Csample_capture_status%2Csample_collection%2Csample_description%2Csample_material%2Csample_prep_interval%2Csample_prep_interval_units%2Csample_storage%2Csample_storage_processing%2Csample_title%2Csampling_campaign%2Csampling_platform%2Csampling_site%2Cscientific_name%2Csecondary_project%2Csecondary_sample_accession%2Csequencing_date%2Csequencing_date_format%2Csequencing_location%2Csequencing_longitude%2Csequencing_method%2Csequencing_primer_catalog%2Csequencing_primer_lot%2Csequencing_primer_provider%2Cserotype%2Cserovar%2Csex%2Cspecimen_voucher%2Csra_aspera%2Csra_bytes%2Csra_ftp%2Csra_galaxy%2Csra_md5%2Cstatus%2Cstrain%2Cstudy_accession%2Cstudy_alias%2Cstudy_title%2Csub_species%2Csub_strain%2Csubmission_accession%2Csubmission_tool%2Csubmitted_aspera%2Csubmitted_bytes%2Csubmitted_format%2Csubmitted_ftp%2Csubmitted_galaxy%2Csubmitted_host_sex%2Csubmitted_md5%2Csubmitted_read_type%2Ctag%2Ctarget_gene%2Ctax_id%2Ctaxonomic_classification%2Ctaxonomic_identity_marker%2Ctemperature%2Ctissue_lib%2Ctissue_type%2Ctransposase_protocol%2Cvariety&format=tsv&limit=0"

Output as JSON (returned as a bare JSON array, not wrapped in an object)

SEARCH_QUERY='study_accession=PRJNA339914' && curl "https://www.ebi.ac.uk/ena/portal/api/search?query=${SEARCH_QUERY}&result=read_run&fields=experiment_accession%2Cexperiment_title%2Csecondary_study_accession%2Caligned%2Caltitude%2Cassembly_quality%2Cassembly_software%2Cbam_aspera%2Cbam_bytes%2Cbam_ftp%2Cbam_galaxy%2Cbam_md5%2Cbase_count%2Cbinning_software%2Cbio_material%2Cbisulfite_protocol%2Cbroad_scale_environmental_context%2Cbroker_name%2Ccage_protocol%2Ccell_line%2Ccell_type%2Ccenter_name%2Cchecklist%2Cchip_ab_provider%2Cchip_protocol%2Cchip_target%2Ccollected_by%2Ccollection_date%2Ccollection_date_end%2Ccollection_date_start%2Ccompleteness_score%2Ccontamination_score%2Ccontrol_experiment%2Ccountry%2Ccultivar%2Cculture_collection%2Cdatahub%2Cdepth%2Cdescription%2Cdev_stage%2Cdnase_protocol%2Cecotype%2Celevation%2Cenvironment_biome%2Cenvironment_feature%2Cenvironment_material%2Cenvironmental_medium%2Cenvironmental_sample%2Cexperiment_alias%2Cexperiment_target%2Cexperimental_factor%2Cexperimental_protocol%2Cextraction_protocol%2Cfaang_library_selection%2Cfastq_aspera%2Cfastq_bytes%2Cfastq_ftp%2Cfastq_galaxy%2Cfastq_md5%2Cfile_location%2Cfirst_created%2Cfirst_public%2Cgermline%2Chi_c_protocol%2Chost%2Chost_body_site%2Chost_genotype%2Chost_gravidity%2Chost_growth_conditions%2Chost_phenotype%2Chost_scientific_name%2Chost_sex%2Chost_status%2Chost_tax_id%2Cidentified_by%2Cinstrument_model%2Cinstrument_platform%2Cinvestigation_type%2Cisolate%2Cisolation_source%2Clast_updated%2Clat%2Clibrary_construction_protocol%2Clibrary_gen_protocol%2Clibrary_layout%2Clibrary_max_fragment_size%2Clibrary_min_fragment_size%2Clibrary_name%2Clibrary_pcr_isolation_protocol%2Clibrary_prep_date%2Clibrary_prep_date_format%2Clibrary_prep_latitude%2Clibrary_prep_location%2Clibrary_prep_longitude%2Clibrary_selection%2Clibrary_source%2Clibrary_strategy%2Clocal_environmental_context%2Clocation%2Clocation_end%2Clocation_start%2Clon%2Cmarine_region%2Cmating_type%2Cncbi_reporting_standard%2Cnominal_length%2Cnominal_sdev%2Cpcr_isolation_protocol%2Cph%2Cproject_name%2Cprotocol_label%2Cread_count%2Cread_strand%2Crestriction_enzyme%2Crestriction_enzyme_target_sequence%2Crestriction_site%2Crna_integrity_num%2Crna_prep_3_protocol%2Crna_prep_5_protocol%2Crna_purity_230_ratio%2Crna_purity_280_ratio%2Crt_prep_protocol%2Crun_accession%2Crun_alias%2Crun_date%2Csalinity%2Csample_accession%2Csample_alias%2Csample_capture_status%2Csample_collection%2Csample_description%2Csample_material%2Csample_prep_interval%2Csample_prep_interval_units%2Csample_storage%2Csample_storage_processing%2Csample_title%2Csampling_campaign%2Csampling_platform%2Csampling_site%2Cscientific_name%2Csecondary_project%2Csecondary_sample_accession%2Csequencing_date%2Csequencing_date_format%2Csequencing_location%2Csequencing_longitude%2Csequencing_method%2Csequencing_primer_catalog%2Csequencing_primer_lot%2Csequencing_primer_provider%2Cserotype%2Cserovar%2Csex%2Cspecimen_voucher%2Csra_aspera%2Csra_bytes%2Csra_ftp%2Csra_galaxy%2Csra_md5%2Cstatus%2Cstrain%2Cstudy_accession%2Cstudy_alias%2Cstudy_title%2Csub_species%2Csub_strain%2Csubmission_accession%2Csubmission_tool%2Csubmitted_aspera%2Csubmitted_bytes%2Csubmitted_format%2Csubmitted_ftp%2Csubmitted_galaxy%2Csubmitted_host_sex%2Csubmitted_md5%2Csubmitted_read_type%2Ctag%2Ctarget_gene%2Ctax_id%2Ctaxonomic_classification%2Ctaxonomic_identity_marker%2Ctemperature%2Ctissue_lib%2Ctissue_type%2Ctransposase_protocol%2Cvariety&format=json&limit=0"