@millerh1
Last active September 14, 2021 00:19
Encode Experiment ID -> Narrow Peak URLs
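# Requires httr and dplyr (dplyr supplies %>%, tibble(), bind_rows(), mutate(), case_when()).
library(dplyr)

encExp2Peaks <- function(acc, getIDR = FALSE) {
  # Query the ENCODE search API for every bed narrowPeak file in the experiment
  resp <- httr::GET(
    url = paste0(
      "https://www.encodeproject.org/search/?type=File&dataset=/experiments/",
      acc, "/&file_type=bed+narrowPeak&format=json&frame=object&limit=all"
    ),
    httr::accept_json()
  ) %>%
    httr::content(as = "parsed")

  # Build one row of metadata per file record
  res <- lapply(seq_along(resp$`@graph`), function(i) {
    entry <- resp$`@graph`[[i]]
    tibble(
      accfile = entry$accession,
      sup = !is.null(entry$superseded_by %>% unlist()),
      type = entry$file_type,
      assembly = entry$assembly,
      date = as.Date(entry$date_created),
      url = entry$s3_uri,  # replaced below with the HTTPS download URL
      output_type = entry$output_type,
      biol_reps = entry$biological_replicates %>% unlist() %>% paste0(collapse = "_"),
      acc = acc
    )
  }) %>%
    bind_rows() %>%
    # Prefer replicates 1 and 2 combined ("1_2") when any file has them
    mutate(best_samps = case_when(
      "1_2" %in% biol_reps ~ "1_2",
      TRUE ~ "1"
    )) %>%
    # Keep current (not superseded) GRCh38 files for the preferred replicates
    dplyr::filter(!sup,
                  assembly == "GRCh38",
                  biol_reps == best_samps) %>%
    # Keep only the most recently created file(s)
    mutate(dateMax = max(date)) %>%
    dplyr::filter(date == dateMax) %>%
    mutate(url = paste0("https://www.encodeproject.org/files/", accfile,
                        "/@@download/", accfile, ".bed.gz"))

  # If several files remain, optionally keep only the IDR thresholded peaks
  if (nrow(res) > 1 && getIDR) {
    res <- dplyr::filter(res, output_type == "IDR thresholded peaks")
  }
  # Return explicitly so the result is visible in both branches
  res
}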
@millerh1

Takes an ENCODE experiment accession and returns the download URL of the latest GRCh38 narrowPeak file (using the combined 1_2 replicates, if available).
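
The selection rule described above can be sketched on a few made-up rows, which makes it possible to check the logic without network access. This is a base-R illustration only, not the function itself: the accessions (`FILE_A` to `FILE_D`) and dates are hypothetical, and only the columns needed for filtering are included.

```r
# Hypothetical file records; accessions and dates are made up for illustration.
files <- data.frame(
  accfile   = c("FILE_A", "FILE_B", "FILE_C", "FILE_D"),
  sup       = c(FALSE, FALSE, TRUE, FALSE),
  assembly  = c("GRCh38", "GRCh38", "GRCh38", "hg19"),
  biol_reps = c("1", "1_2", "1_2", "1_2"),
  date      = as.Date(c("2020-01-01", "2020-06-01", "2021-01-01", "2021-02-01")),
  stringsAsFactors = FALSE
)

# Prefer combined replicates ("1_2") when any record has them, else replicate "1".
best_samps <- if ("1_2" %in% files$biol_reps) "1_2" else "1"

# Keep current (not superseded) GRCh38 files for the preferred replicate set.
keep <- files[!files$sup &
                files$assembly == "GRCh38" &
                files$biol_reps == best_samps, ]

# Of those, keep only the most recently created file(s).
latest <- keep[keep$date == max(keep$date), ]
print(latest$accfile)
```

Note that, as in the function, the preferred replicate set is chosen before filtering. If the only `1_2` files are superseded or on another assembly, the result is empty rather than falling back to replicate `1`.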

millerh1 commented Sep 10, 2021

Usage:

 > encExp2Peaks("ENCSR825SVO")

# A tibble: 1 × 8
  accfile     sup   type           assembly url                                                                           biol_reps acc   best_samps
  <chr>       <lgl> <chr>          <chr>    <chr>                                                                         <chr>     <chr> <chr>     
1 ENCFF128BDG FALSE bed narrowPeak GRCh38   https://www.encodeproject.org/files/ENCFF128BDG/@@download/ENCFF128BDG.bed.gz 1_2       ENCS… 1_2  
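
Once the URL is in hand, the peak file can be read into a data frame. The sketch below is one possible way to do that, assuming the standard ENCODE narrowPeak (BED6+4) column layout; the two peak lines are made up and stand in for the contents of a downloaded file.

```r
# Column names follow the ENCODE narrowPeak (BED6+4) layout.
narrowpeak_cols <- c("chrom", "start", "end", "name", "score", "strand",
                     "signalValue", "pValue", "qValue", "peak")

# Two made-up peak lines standing in for the contents of a downloaded file.
example_lines <- c(
  "chr1\t100\t250\t.\t1000\t.\t12.5\t-1\t3.2\t75",
  "chr2\t500\t720\t.\t640\t.\t8.1\t-1\t2.4\t110"
)

# Parse the tab-separated lines and attach the column names.
peaks <- read.table(text = example_lines, sep = "\t",
                    col.names = narrowpeak_cols, stringsAsFactors = FALSE)

# Peak width in base pairs (BED intervals are half-open, so end - start).
peaks$width <- peaks$end - peaks$start
print(peaks)
```

For a real file, the same `read.table()` call should work on a local copy, for example after `download.file(res$url, "peaks.bed.gz", mode = "wb")` and with `gzfile("peaks.bed.gz")` in place of the `text` argument; the file name here is only an example.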
