Skip to content

Instantly share code, notes, and snippets.

@ag1805x
Created September 16, 2019 06:41
Show Gist options
  • Save ag1805x/89cbc34ca9b20b877a3f070455289906 to your computer and use it in GitHub Desktop.
Save ag1805x/89cbc34ca9b20b877a3f070455289906 to your computer and use it in GitHub Desktop.
Using hclust() to cluster RNA-seq samples and visualize dendrogram
library(dendextend)
countsPC <- read.table("CountsPC_Gene_MultimapOverlap.tsv", header=T, row.names=c(1))
factors <- as.data.frame(read.table("factors.txt", sep="\t", header = TRUE, row.names=c(1)))
dist <- dist(t(countsPC), method="euclidean")
cluster <- hclust(dist)
# main, sub, xlab, ylab
dend <- as.dendrogram(cluster)
#Add colors to label
label_colors_to_use <- as.numeric(factors$SampleType) #Input groups of sample type for label color
label_colors_to_use <- label_colors_to_use[order.dendrogram(dend)] #Order group annotation as per in dendrogram
#Use to add custom label color
for(i in 1:length(label_colors_to_use)){
if(label_colors_to_use[i] == 1){
label_colors_to_use[i] = "blue"
}else if(label_colors_to_use[i] == 2){
label_colors_to_use[i] = "red"
}
}
#Add colored bars showing study accession
StudyAccession_colors_to_use <- as.numeric(factors$StudyAccession) #Input groups of sample type for label color
StudyAccession_colors_to_use <- StudyAccession_colors_to_use[order.dendrogram(dend)] #Order group annotation as per in dendrogram
dend <- set(dend, "labels_colors", label_colors_to_use) #Apply colors to labels
dend <- hang.dendrogram(dend, hang=-1) #Leaves at same height
dend <- set(dend, "branches_lwd", 2) #Thick branches
dend <- set(dend, "labels_cex", 1) #label font size
tiff(filename="Sample_hclust.tiff", height=10, width=20, units='in', res=300)
par(mar = c(10,5,3,1))
plot(dend, main = "Sample Clustering\ndist.method=Eucleadian", ylab = "Height") #Plot the dendrogram
colored_bars(colors = StudyAccession_colors_to_use, dend = dend, rowLabels = "Project", sort_by_labels_order = FALSE)
dev.off()
@dartagnan323
Copy link

Thanks for a quick reply, I will try that. for 2) not sure yet how to use that "rowSums", but will search and come back if I'm really stuck.

@ag1805x
Copy link
Author

ag1805x commented Nov 23, 2020

Just use this on your raw count matrix:

keep <- rowSums(cpm(counts)>1) >1
counts_filt <- counts[keep, ]

After this continue as is with the filtered counts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment