@abhik1368
abhik1368 / cdk.txt
Created August 21, 2023 00:25
cdk file
CC(C)C(=O)COc1nc(N)nc2[nH]cnc12 ZINC03814457
Nc1nc(OC[C@@H]2CCCO2)c2nc[nH]c2n1 ZINC03814459
Nc1nc(OC[C@H]2CCC(=O)N2)c2nc[nH]c2n1 ZINC03814460
Nc1nc(OCC2CCCCC2)c2nc[nH]c2n1 ZINC00023543
Nc1nc(OC[C@@H]2CC=CCC2)c2nc[nH]c2n1 ZINC03814458
Cn1cnc2c(NCc3ccccc3)nc(NCCO)nc21 ZINC01641925
CC[C@H](CO)Nc1nc(NCc2ccccc2)c2ncn(C(C)C)c2n1 ZINC01649340
COc1ccc(CNc2nc(N(CCO)CCO)nc3c2ncn3C(C)C)cc1 ZINC01487345
Nc1nc(N)c(N=O)c(OCC2CCCCC2)n1 ZINC03814479
COc1ccc2c(c1)/C(=C/c1cnc[nH]1)C(=O)N2 ZINC03814467
abhik1368 / dbtest.csv
Created November 16, 2022 00:46
molecule_maccs
DrugBank_ID MACCS
DB07361 ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '1', '1', '0', '0', '0', '0', '1', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '1', '1', '1', '1', '0', '0', '0', '1', '1', '0', '0', '1', '0', '0', '0', '0', '1', '1', '1', '0', '1', '1', '0', '1', '1', '1', '1', '1', '0', '0', '1', '1', '0', '0', '0', '0', '0', '1', '1', '0', '1', '1', '0', '0', '0', '1', '0', '0', '0', '0', '0', '1', '0', '1', '1', '1', '0', '1', '0', '0', '0', '0', '1', '0', '0', '1', '0', '1', '0', '0', '0', '1', '0', '1', '1', '1', '1', '0', '1', '0', '0', '1', '1', '1', '1', '1', '0']
DB13157 ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'

PDBe-KB

Note: this gist is heavily based on materials provided by organizers of the Mining PDBe and PDBe-KB Using a Graph Database workshop.

Why graph DB?

Graph DBs are often better suited than conventional relational DBs to representing sparse, loosely structured data and the relationships within it.

A graph is a structure that models pairwise relations between objects. It:

  • consists of nodes (vertices), edges (relationships) and properties (attributes):
    • nodes represent entities
    • edges encode connections between nodes
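
As a minimal sketch of the node/edge/property model described above, the following Python snippet builds a tiny property graph in memory. The class names and all labels and identifiers (`Entry`, `Entity`, `HAS_ENTITY`, `entry:1abc`) are hypothetical and chosen only for illustration; they are not taken from the PDBe-KB schema or from any graph database API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, with a label and free-form properties."""
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, typed connection between two nodes."""
    source: str
    target: str
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory property graph (illustrative, not a database)."""

    def __init__(self):
        self.nodes = {}   # node_id -> Node
        self.edges = []   # list of Edge, in insertion order

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, target, rel_type, **properties):
        # Edges may only connect nodes that already exist.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(Edge(source, target, rel_type, properties))

    def neighbors(self, node_id, rel_type=None):
        """Ids of nodes reachable from node_id over one outgoing edge."""
        return [
            e.target
            for e in self.edges
            if e.source == node_id
            and (rel_type is None or e.rel_type == rel_type)
        ]


# Hypothetical example data: one entry node linked to two entity nodes.
graph = PropertyGraph()
graph.add_node("entry:1abc", "Entry", title="example structure")
graph.add_node("entity:1", "Entity", kind="polypeptide")
graph.add_node("entity:2", "Entity", kind="ligand")
graph.add_edge("entry:1abc", "entity:1", "HAS_ENTITY")
graph.add_edge("entry:1abc", "entity:2", "HAS_ENTITY")
```

Calling `graph.neighbors("entry:1abc", "HAS_ENTITY")` follows the outgoing edges of the entry node, which is the kind of relationship traversal a graph DB is built around; a relational DB would typically need a join table for the same lookup.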
abhik1368 / Link_Prediction_by_Regression.R
Last active August 29, 2015 14:18
Link_Prediction_by_Regression
abhik1368 / nbi.R
Created February 26, 2015 05:50
Network_Inference
nbi <- function(A) {
  # A is the n x m adjacency matrix
  n <- nrow(A)
  m <- ncol(A)
  # Column degrees are used as node weights
  Ky <- diag(1/colSums(A))
  Ky[is.infinite(Ky) | is.na(Ky)] <- 0
  kx <- rowSums(A)
  Nx <- 1/(matrix(kx, nrow=n, ncol=n, byrow=TRUE))
abhik1368 / mclust.R
Last active August 29, 2015 14:08
mclust.R
# Algorithm to perform MCL clustering in R
# Add the identity matrix to M, i.e. give every node a self-loop
add.selfloops <- function(M) {
  LM <- M + diag(dim(M)[1])
  return(LM)
}
# Inflation step of MCL
inflate <- function(M, inf) {
  M <- M^(inf)
abhik1368 / random_walk.R
Last active August 29, 2015 14:06
Random_walk_with_restart
# Parameter r: restart probability
r <- 0.8
# Convergence cutoff
conv_cut <- 1e-10
RWR <- function(M, p_0, r, conv_cut, prop = FALSE) {
  # Use network propagation when prop = TRUE
  if (prop) {
    w <- colSums(M)
abhik1368 / rie.R
Created September 12, 2014 02:58
Robust Initial Enhancement Metric
# x = a vector of scores
# y = a vector of labels
function (x, y, decreasing = TRUE, alpha = 20)
{
  if (length(x) != length(y)) {
    stop("The number of scores must equal the number of labels.")
  }
  N <- length(y)
  n <- length(which(y == 1))
abhik1368 / bedroc.R
Created September 12, 2014 02:57
Boltzmann-Enhanced Discrimination of ROC
# x = a vector of scores
# y = a vector of labels
function (x, y, decreasing = TRUE, alpha = 20)
{
  if (length(x) != length(y)) {
    stop("The number of scores must equal the number of labels.")
  }
  N <- length(y)
  n <- length(which(y == 1))