- Go to the Ensembl portal for your assembly and select BioMart:
  - Human GRCh37: https://grch37.ensembl.org/
  - All others: http://ensembl.org/
- Select the following attributes:
  - Chromosome/scaffold name
  - Gene start (bp)
  - Gene end (bp)
  - Gene stable ID
  - Gene stable ID version
  - Gene name
  - Gene type
- Export the results to a TSV file (BioMart saves it as `mart_export.txt`).
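The attribute selection above can also be written down as a BioMart XML query, which is convenient for keeping a record of exactly what was exported. This is only a sketch: the dataset name `hsapiens_gene_ensembl`, the internal attribute names, and the `/biomart/martservice` endpoint are assumptions based on how Ensembl BioMart is commonly scripted, and `build_query` and `download_genes` are hypothetical helper names. Check the names against the XML button in the BioMart web interface before relying on them.

```shell
# build_query: print a BioMart XML query for the seven attributes listed above.
# ASSUMPTION: dataset and attribute names follow the usual Ensembl Genes mart.
build_query() {
    cat <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName="default" formatter="TSV" header="1" uniqueRows="0" count="" datasetConfigVersion="0.6">
  <Dataset name="hsapiens_gene_ensembl" interface="default">
    <Attribute name="chromosome_name"/>
    <Attribute name="start_position"/>
    <Attribute name="end_position"/>
    <Attribute name="ensembl_gene_id"/>
    <Attribute name="ensembl_gene_id_version"/>
    <Attribute name="external_gene_name"/>
    <Attribute name="gene_biotype"/>
  </Dataset>
</Query>
EOF
}

# download_genes BASE_URL OUTPUT: send the query and save the TSV response.
# ASSUMPTION: the mart service lives at BASE_URL/biomart/martservice.
download_genes() {
    curl -sS -G --data-urlencode "query=$(build_query)" \
        "$1/biomart/martservice" -o "$2"
}
```

For example, `download_genes https://grch37.ensembl.org mart_export.txt` would be the scripted counterpart of the manual export for GRCh37.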
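- Sort the records by chromosome, start, and end, keeping the header line first (`-V` is version sort and requires GNU sort):
awk 'NR==1' mart_export.txt > EnsemblBioMart.GRCh37p13.EnsemblGenes104.tsv
awk 'NR>1' mart_export.txt | sort -k1,1V -k2,2n -k3,3n >> EnsemblBioMart.GRCh37p13.EnsemblGenes104.tsv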
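To see what the sort keys do, the same commands can be run on a tiny stand-in file. The records below are made up for illustration and only carry the first three columns; the temporary paths are hypothetical.

```shell
# Build a small stand-in for mart_export.txt (made-up, tab-separated records).
tmpdir=$(mktemp -d)
printf 'Chromosome/scaffold name\tGene start (bp)\tGene end (bp)\n' > "$tmpdir/mart_export.txt"
printf '10\t500\t900\nX\t100\t200\n2\t300\t400\n1\t2000\t2500\n1\t150\t800\n' >> "$tmpdir/mart_export.txt"

# Header first, then the body sorted by chromosome (version order),
# start (numeric), and end (numeric). Requires GNU sort for -V.
awk 'NR==1' "$tmpdir/mart_export.txt" > "$tmpdir/sorted.tsv"
awk 'NR>1' "$tmpdir/mart_export.txt" | sort -k1,1V -k2,2n -k3,3n >> "$tmpdir/sorted.tsv"

# Show chromosome and start of each sorted record (header skipped).
tail -n +2 "$tmpdir/sorted.tsv" | cut -f1,2
```

Version sort places chromosome 2 before 10 and numeric names before X, which a plain lexical sort would not do.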
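- Edit the header line so that it starts with `#` (tabix treats such lines as comments) and the column names contain no spaces: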
#Chromosome Gene_start Gene_end Gene_stable_ID Gene_stable_ID_version Gene_symbol Gene_type
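The header edit can also be scripted instead of done by hand. The sketch below assumes the columns are tab-separated and that the first line of the file is the original BioMart header; `rename_header` is a hypothetical helper name.

```shell
# rename_header FILE: replace the first line of FILE with the new header.
# ASSUMPTION: FILE is a TSV whose first line is the BioMart header.
rename_header() {
    tmp=$(mktemp)
    # Write the new tab-separated header, then append every line after the old one.
    printf '#Chromosome\tGene_start\tGene_end\tGene_stable_ID\tGene_stable_ID_version\tGene_symbol\tGene_type\n' > "$tmp"
    tail -n +2 "$1" >> "$tmp"
    mv "$tmp" "$1"
}
```

It would be called as `rename_header EnsemblBioMart.GRCh37p13.EnsemblGenes104.tsv` before compressing the file.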
- Compress the file with bgzip and index it with tabix (sequence name in column 1, start in column 2, end in column 3):
bgzip -l9 EnsemblBioMart.GRCh37p13.EnsemblGenes104.tsv
tabix -s1 -b2 -e3 EnsemblBioMart.GRCh37p13.EnsemblGenes104.tsv.gz
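Once the index exists, genes overlapping a region can be pulled out without reading the whole file. A minimal sketch, assuming `tabix` is on the `PATH`; `genes_in_region` is a hypothetical helper name and the region used in the usage note is an arbitrary example.

```shell
# genes_in_region FILE REGION: print the header line plus every record in the
# bgzipped, tabix-indexed FILE that overlaps REGION (for example 1:1000000-2000000).
genes_in_region() {
    # -h also prints the header lines, i.e. the lines starting with '#'.
    tabix -h "$1" "$2"
}
```

For example, `genes_in_region EnsemblBioMart.GRCh37p13.EnsemblGenes104.tsv.gz 1:1000000-2000000` would list the genes overlapping that interval on chromosome 1. The chromosome name in the region must match the first column of the file exactly.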
Last active
July 30, 2021 17:10