@venkan
Created May 18, 2018 09:14
TCGA metadata: get the read length, check whether all samples are paired-end, and collect the other read-group information for TCGA samples
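library(GenomicDataCommons)

# Query the GDC for TCGA-LIHC whole-exome (WXS) BAM files and
# expand each record with its read-group metadata
q = files() %>%
    filter(~ cases.project.project_id == 'TCGA-LIHC' &
             data_type == 'Aligned Reads' &
             experimental_strategy == 'WXS' &
             data_format == 'BAM') %>%
    select('file_id') %>%
    expand('analysis.metadata.read_groups')

file_ids = ids(q)     # IDs of the matching files
z = results_all(q)    # retrieve all matching records

# read_length values of the read groups, one entry per file
read_length_list = sapply(z$analysis$metadata$read_groups, '[[', 'read_length')

# Loaded after the query is built: attaching dplyr masks filter() and select() from GenomicDataCommons
library(dplyr)
#h <- z$analysis$metadata$read_groups %>% bind_rows() %>% as_tibble()
# Stack the per-file read-group tables into one data frame and save it
rg_info = bind_rows(z$analysis$metadata$read_groups)
write.csv(rg_info, "TCGA-LIHC_read_length.csv", row.names = FALSE)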
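
The description mentions checking whether all samples are paired-end, but the script stops at writing the table. A minimal sketch of that check follows, assuming `rg_info` carries the `read_length` column used above and an `is_paired_end` flag per read group (an assumption about the returned fields; verify with `colnames(rg_info)`). The toy data frame and its values are made up so the snippet runs without querying the GDC.

```r
# Toy stand-in for rg_info; the IDs and values are invented for illustration.
rg_info <- data.frame(
  read_group_id = c("rg1", "rg2", "rg3"),
  read_length   = c(76L, 76L, 101L),
  is_paired_end = c(TRUE, TRUE, FALSE),  # assumed column name
  stringsAsFactors = FALSE
)

# Number of read groups at each read length
read_length_counts <- table(rg_info$read_length)

# TRUE only if every read group is flagged paired-end (NA counts as "not paired")
all_paired <- all(rg_info$is_paired_end %in% TRUE)

# Read groups that are not, or not known to be, paired-end
not_paired <- rg_info[!(rg_info$is_paired_end %in% TRUE), ]

print(read_length_counts)
print(all_paired)
print(not_paired)
```

With the real table, drop the toy `data.frame()` call and run the remaining lines on the `rg_info` built by the script.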