Arindam Ghosh ag1805x

😎
View GitHub Profile
@ag1805x
ag1805x / extract_gene_id
Last active June 11, 2020 06:11
Extract gene ids of particular biotype from GFF3 files
# Extract the gene IDs of a particular biotype from a GFF3 file
# Assuming: the GFF3 file is Homo_sapiens.GRCh38.84.gff3, downloaded from ftp://ftp.ensembl.org/pub/release-84/gff3/homo_sapiens/Homo_sapiens.GRCh38.84.gff3.gz and decompressed with gunzip
# Assuming: biotype=protein_coding
grep "biotype=protein_coding" Homo_sapiens.GRCh38.84.gff3 | cut -f9 | cut -d';' -f1 | cut -d'=' -f2 | grep "gene" | cut -d':' -f2 | sort -u > protein_coding_ids
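
The one-liner above relies on `ID=` being the first attribute in column 9 and matches the biotype string anywhere on the line. As a possible alternative, here is a minimal sketch that parses the attribute column with awk instead, so attribute order does not matter and the biotype is compared exactly. The function name `extract_gene_ids` is hypothetical and not part of the original gist; the sketch assumes Ensembl-style attributes of the form `ID=gene:<id>` and `biotype=<name>`, as in the file named above.

```shell
# Hypothetical helper (not from the original gist).
# Usage: extract_gene_ids <biotype> <gff3_file>
# Prints the sorted, unique gene IDs whose biotype attribute equals <biotype>.
extract_gene_ids() {
    awk -F'\t' -v want="$1" '
        /^#/ { next }                          # skip header and directive lines
        {
            id = ""; bt = ""
            n = split($9, attrs, ";")          # column 9 holds key=value pairs
            for (i = 1; i <= n; i++) {
                if (attrs[i] ~ /^ID=gene:/) id = substr(attrs[i], 9)   # text after "ID=gene:"
                if (attrs[i] ~ /^biotype=/) bt = substr(attrs[i], 9)   # text after "biotype="
            }
            # transcript rows carry ID=transcript:..., so id stays empty for them
            if (id != "" && bt == want) print id
        }
    ' "$2" | sort -u
}
```

It could be called as `extract_gene_ids protein_coding Homo_sapiens.GRCh38.84.gff3 > protein_coding_ids`. Passing the biotype as an argument makes it easy to reuse for other biotypes, provided the file follows the same attribute layout.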