@FloWuenne
Created November 9, 2018 16:30
Quick testing of the Jensen-Shannon implementation in the philentropy package
## Testing of the JSD function in the philentropy package
## Documentation at:
## https://www.rdocumentation.org/packages/philentropy/versions/0.2.0/topics/JSD
## Load library
library(philentropy)
## We will test the Jensen-Shannon divergence by applying it to the scoring of gene regulatory networks, as described in this
## paper: https://www.cell.com/cell-reports/fulltext/S2211-1247(18)31634-6?dgcid=raven_jbs_etoc_email#secsectitle0070
## In this article, the authors used two distributions to represent regulon activity and cell identity
## Distribution 1: Regulon activity score (RAS) normalized so they sum up to 1
## Distribution 2: Cell identity as 1 or 0 for a specific cell type normalized so they sum up to 1
## Now let's use some very simple toy examples to see whether the JSD function works as we would expect and to learn how to
## run it correctly
## First, we will test perfect overlap between the two distributions, that is, we have high RAS values in all cells of that cell type
## and RAS values of 0 in all other cells. This is obviously not representative of biology but gives us a positive control
## to test the Jensen-Shannon divergence. With two identical normalized distributions, the Jensen-Shannon divergence should be 0.
## The Jensen-Shannon distance is sqrt(JSD), so it should also be 0, and the regulon specificity score (RSS), which the authors
## calculate as 1 - sqrt(JSD), should be 1
## Make RAS distribution
dist1 <- c(100,0,0,100,0,100)
dist1_norm <- dist1/sum(dist1)
## Make cell identity distribution
dist2 <- c(1,0,0,1,0,1)
dist2_norm <- dist2/sum(dist2)
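## Stack the distributions as rows of a matrix (rbind() returns a matrix, not a data frame)
dist_df <- rbind(dist1_norm, dist2_norm)
## Calculate the Jensen-Shannon divergence (JSD() uses log2 by default, so values lie between 0 and 1)
jsd_divergence <- philentropy::JSD(dist_df)
## Calculate the RSS as 1 minus the Jensen-Shannon distance, sqrt(JSD); expected value here is 1
jsd_distance <- 1 - sqrt(jsd_divergence)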
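## Optional cross-check of the positive control: a minimal base-R sketch of the Jensen-Shannon
## divergence, JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M) with M = (P + Q) / 2, using log2.
## The helpers kl_log2() and jsd_manual() are hypothetical names written for this check only;
## they are not part of philentropy.
kl_log2 <- function(p, q) {
  keep <- p > 0  # terms with p == 0 contribute 0 by convention
  sum(p[keep] * log2(p[keep] / q[keep]))
}
jsd_manual <- function(p, q) {
  m <- (p + q) / 2  # mixture of the two distributions
  0.5 * kl_log2(p, m) + 0.5 * kl_log2(q, m)
}
p_check <- c(100, 0, 0, 100, 0, 100) / sum(c(100, 0, 0, 100, 0, 100))
q_check <- c(1, 0, 0, 1, 0, 1) / sum(c(1, 0, 0, 1, 0, 1))
jsd_check <- jsd_manual(p_check, q_check)  # identical distributions, so this should be 0
rss_check <- 1 - sqrt(jsd_check)           # and the RSS should be 1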
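## Next we will test the opposite scenario, where RAS values are high only in cells that do NOT belong to the cell type,
## so the two distributions do not overlap at all.
## Negative control: the Jensen-Shannon divergence (log2) should be 1 and the RSS should be 0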
## Make RAS distribution
dist1 <- c(100,0,0,100,0,100)
dist1_norm <- dist1/sum(dist1)
## Make cell identity distribution
dist2 <- c(0,1,1,0,1,0)
dist2_norm <- dist2/sum(dist2)
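## Stack the distributions as rows of a matrix (rbind() returns a matrix, not a data frame)
dist_df <- rbind(dist1_norm, dist2_norm)
## Calculate the Jensen-Shannon divergence (JSD() uses log2 by default, so values lie between 0 and 1)
jsd_divergence <- philentropy::JSD(dist_df)
## Calculate the RSS as 1 minus the Jensen-Shannon distance, sqrt(JSD); expected value here is 0
jsd_distance <- 1 - sqrt(jsd_divergence)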
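## The steps used for both controls can be bundled into one helper. This is only a sketch:
## rss_score() is a hypothetical name, not a philentropy function, and it assumes both inputs
## are non-negative numeric vectors of equal length with a non-zero sum.
rss_score <- function(ras, cell_id) {
  p <- ras / sum(ras)          # normalized regulon activity scores
  q <- cell_id / sum(cell_id)  # normalized cell identity indicator
  jsd <- philentropy::JSD(rbind(p, q))  # log2 is the default unit
  1 - sqrt(unname(jsd))
}
## Positive control, expected to be 1 (up to floating-point error)
rss_pos <- rss_score(c(100, 0, 0, 100, 0, 100), c(1, 0, 0, 1, 0, 1))
## Negative control, expected to be 0 (up to floating-point error)
rss_neg <- rss_score(c(100, 0, 0, 100, 0, 100), c(0, 1, 1, 0, 1, 0))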