Skip to content

Instantly share code, notes, and snippets.

@slavailn
Created September 26, 2016 22:39
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save slavailn/219038d2e48b21d653e534f4766fa366 to your computer and use it in GitHub Desktop.
Save slavailn/219038d2e48b21d653e534f4766fa366 to your computer and use it in GitHub Desktop.
Annotate genomic intervals in bed format using ChIPpeakAnno
library("ChIPpeakAnno")
library("GenomicRanges")
library("biomaRt")
# Annotate genomic intervals in bed format using ChIPpeakAnno
setwd("~/Projects/grey_miseq/re-doing_hotspots/high_confidence_hotspots/")
list.files()
# Read in bed file with genomic intervals, ATAC-seq hotspots in this case
Q1 <- read.table("Q1_high_confidence.bed", header=F, sep = "\t")
names(Q1) <- c("chr", "start", "end")
# Convert to GRanges
gr1 <- makeGRangesFromDataFrame(Q1, ignore.strand = T, seqnames.field = "chr",
start.field = "start", end.field = "end")
# Get mouse TSS
data("TSS.mouse.GRCm38")
########################################################################################################
# Annotate Q1 hotspots with nearest TSS
Q1_annot <- annotatePeakInBatch(myPeakList = gr1,
AnnotationData = TSS.mouse.GRCm38,
featureType = "TSS",
output = "nearestLocation",
PeakLocForDistance = "middle",
select = "first",
ignore.strand = T)
# Load mart
ensembl <- useMart("ENSEMBL_MART_ENSEMBL", host="www.ensembl.org")
ensembl <- useDataset("mmusculus_gene_ensembl", mart=ensembl)
Q1_annot <- addGeneIDs(Q1_annot, mart = ensembl, feature_id_type = "ensembl_gene_id",
IDs2Add = c("entrezgene", "mgi_symbol", "description"))
write.table(as.data.frame(Q1_annot), "Q1_high_confidence_annotated.txt", sep = "\t", col.names = T,
row.names = F, quote = F)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment