@ag1805x
Created June 22, 2022 12:57
# Extract gene lengths from GTF file
# Download GTF file from https://gdc.cancer.gov/about-data/gdc-data-processing/gdc-reference-files
echo "Gene_ID,Gene_name,Gene_type,Chr,Start,End,Gene_length" > gdc_gencode_v36_annotation.csv
# GTF coordinates are 1-based and inclusive, so gene length = End - Start + 1
awk '$3 == "gene"' gencode.v36.annotation.gtf | awk -F '[\t ;]' '{print $10","$16","$13","$1","$4","$5","$5-$4+1}' >> gdc_gencode_v36_annotation.csv
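
The one-liner above picks attributes by position ($10, $13, $16), which only works while gene_id, gene_type and gene_name appear in exactly that order, and it leaves the surrounding double quotes in the output. A minimal sketch of a key-based alternative follows. The function name `gtf_genes_to_csv`, the helper `attr` and the file `sample.gtf` are hypothetical names introduced here for illustration and are not part of the original gist; the sample lines are only meant to mimic the GENCODE attribute layout.

```shell
# Hypothetical helper: look up GTF attributes by key instead of by field position.
gtf_genes_to_csv() {
  awk -F '\t' -v OFS=',' '
    # Return the unquoted value of attribute `key` from column 9 of the current record.
    function attr(key,    n, i, parts, val) {
      n = split($9, parts, "; *")
      for (i = 1; i <= n; i++) {
        if (index(parts[i], key " ") == 1) {
          val = substr(parts[i], length(key) + 2)
          gsub(/"/, "", val)
          return val
        }
      }
      return ""
    }
    BEGIN { print "Gene_ID,Gene_name,Gene_type,Chr,Start,End,Gene_length" }
    # Keep gene records only; coordinates are 1-based inclusive, hence the +1.
    $3 == "gene" {
      print attr("gene_id"), attr("gene_name"), attr("gene_type"), $1, $4, $5, $5 - $4 + 1
    }
  ' "$1"
}

# Two-line sample input (one gene record, one transcript record) to try the function on.
printf 'chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972.5"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1"; level 2;\n' > sample.gtf
printf 'chr1\tHAVANA\ttranscript\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972.5"; transcript_id "ENST00000456328.2"; gene_type "transcribed_unprocessed_pseudogene"; gene_name "DDX11L1";\n' >> sample.gtf

gtf_genes_to_csv sample.gtf
```

For the real annotation, the call would be `gtf_genes_to_csv gencode.v36.annotation.gtf > gdc_gencode_v36_annotation.csv`. Splitting on tabs first keeps the nine GTF columns intact, so attribute values containing spaces do not shift the fields.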