@ag1805x
Last active June 11, 2020 06:11
Extract gene ids of particular biotype from GFF3 files
#Extract gene IDs of a particular biotype from GFF3 files
#Assuming: GFF3 file is Homo_sapiens.GRCh38.84.gff3 (ftp://ftp.ensembl.org/pub/release-84/gff3/homo_sapiens/Homo_sapiens.GRCh38.84.gff3.gz), decompressed with gunzip
#Assuming: biotype=protein_coding
#Pipeline: keep matching lines, take the attributes column (9), take the value of its first attribute (ID), keep gene entries, strip the "gene:" prefix, de-duplicate
grep "biotype=protein_coding" Homo_sapiens.GRCh38.84.gff3 | cut -f9 | cut -d';' -f1 | cut -d'=' -f2 | grep "^gene:" | cut -d':' -f2 | sort -u > protein_coding_ids
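
The one-liner above relies on ID being the first attribute in column 9 and matches the biotype string anywhere on the line. As a minimal sketch of an alternative, the function below reads the attributes by name with awk and compares the biotype exactly. The function name extract_gene_ids is hypothetical (not part of the original gist), and the "gene:" ID prefix is assumed to follow the Ensembl convention used in the file above.

```shell
# Hypothetical helper (not part of the original one-liner): print the unique
# gene IDs whose "biotype" attribute equals the requested value exactly.
# Usage: extract_gene_ids BIOTYPE GFF3_FILE
extract_gene_ids() {
    awk -F'\t' -v want="$1" '
        /^#/ { next }                      # skip header and directive lines
        {
            id = ""; biotype = ""
            n = split($9, attrs, ";")      # column 9 holds key=value pairs
            for (i = 1; i <= n; i++) {
                eq = index(attrs[i], "=")
                key = substr(attrs[i], 1, eq - 1)
                val = substr(attrs[i], eq + 1)
                if (key == "ID") id = val
                else if (key == "biotype") biotype = val
            }
            # Assumed Ensembl convention: gene IDs carry a "gene:" prefix
            if (biotype == want && id ~ /^gene:/) print substr(id, 6)
        }
    ' "$2" | sort -u
}

# Example call, assuming the decompressed file is in the current directory:
# extract_gene_ids protein_coding Homo_sapiens.GRCh38.84.gff3 > protein_coding_ids
```

Passing the biotype as an argument makes it possible to reuse the same function for other biotypes without editing the pipeline.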