@jpaggi
Last active January 26, 2017 08:53
MPRA DragoNN
name: MRPA DragoNN
model_weights: mpra_model.weights.hd5
weights_url:
license: unrestricted
sha1: be007a27de30d0eaf330e23fa37fc628259dfaf4
dragonn_version: 0.1.3
gist_id: 33300cd4bb7e061b50bc25fd85b27d6f

CS 273B Final Project

Title: Predicting regulatory activities from massively parallel reporter assays

Description: We trained a deep convolutional neural network on MPRA data from Ernst et al., 2016 [1]. The model takes a one-hot encoding of a 145 bp DNA sequence as input and predicts MPRA activities, normalized as described in the paper and then squashed into [-1, 1]. The study examines two cell lines, HepG2 and K562, and two promoters per cell line, minP and SV40P. For a given sequence, our model predicts a length-4 output vector corresponding to the predicted activity in (HepG2, minP), (K562, minP), (HepG2, SV40P), and (K562, SV40P). Please see our preprint [2] and our git repository [3] for more details on the model and for validation that it learns biologically relevant information.

[1] Ernst J, Melnikov A, Zhang X, Wang L, Rogov P, Mikkelsen T, Kellis M. Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nature Biotechnology, 34:1180-1190, 2016.
[2] Joe Paggi, Andrew Lamb, Kevin Tian, Irving Hsu, Pierre-Louis Cedoz, Prasad Kawthekar. bioRxiv 099879; doi: https://doi.org/10.1101/099879
[3] https://github.com/irvhsu/cs273b-project
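
The input and output conventions in the description can be sketched as follows. This is a minimal illustration, not the project's own preprocessing code. The row order A, C, G, T is an assumption (the gist does not state it), and the helper names `one_hot_encode`, `to_model_input`, and `label_outputs` are hypothetical. The output order is assumed to follow the order in which the description lists the four (cell line, promoter) pairs.

```python
# Assumed row order; the gist does not say which row holds which base.
BASES = "ACGT"
SEQ_LEN = 145  # sequence length stated in the description

# Assumed to follow the order listed in the description.
OUTPUT_TASKS = [("HepG2", "minP"), ("K562", "minP"),
                ("HepG2", "SV40P"), ("K562", "SV40P")]


def one_hot_encode(seq, bases=BASES):
    """Return a 4 x len(seq) nested list with a 1 in the row of each base.

    Characters outside `bases` (for example N) give an all-zero column.
    """
    seq = seq.upper()
    rows = [[0] * len(seq) for _ in bases]
    for j, base in enumerate(seq):
        i = bases.find(base)
        if i >= 0:
            rows[i][j] = 1
    return rows


def to_model_input(seq):
    """Wrap one sequence as [channel][row][column], i.e. 1 x 4 x 145."""
    if len(seq) != SEQ_LEN:
        raise ValueError("expected %d bp, got %d" % (SEQ_LEN, len(seq)))
    return [one_hot_encode(seq)]


def label_outputs(pred):
    """Pair a length-4 prediction with its (cell line, promoter) task."""
    if len(pred) != len(OUTPUT_TASKS):
        raise ValueError("expected %d outputs" % len(OUTPUT_TASKS))
    return dict(zip(OUTPUT_TASKS, pred))
```

The 1 x 4 x 145 layout matches the `input_shape` of the first layer in the model configuration. A batch of such inputs would typically be stacked into an array before being passed to the model.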

{
  "layers": [
    {
      "trainable": true,
      "b_constraint": null,
      "name": "Convolution2D",
      "custom_name": "convolution2d",
      "cache_enabled": true,
      "activation": "relu",
      "W_constraint": null,
      "nb_col": 13,
      "input_shape": [
        1,
        4,
        145
      ],
      "dim_ordering": "th",
      "subsample": [
        1,
        1
      ],
      "init": "he_normal",
      "nb_filter": 100,
      "border_mode": "valid",
      "b_regularizer": {
        "l2": 0.0,
        "name": "WeightRegularizer",
        "l1": 0
      },
      "W_regularizer": {
        "l2": 0.0,
        "name": "WeightRegularizer",
        "l1": 0
      },
      "activity_regularizer": null,
      "nb_row": 4
    },
    {
      "cache_enabled": true,
      "trainable": true,
      "name": "Dropout",
      "custom_name": "dropout",
      "p": 0.1
    },
    {
      "trainable": true,
      "b_constraint": null,
      "name": "Convolution2D",
      "custom_name": "convolution2d",
      "cache_enabled": true,
      "activation": "relu",
      "W_constraint": null,
      "nb_col": 13,
      "input_shape": [
        1,
        4,
        145
      ],
      "dim_ordering": "th",
      "subsample": [
        1,
        1
      ],
      "init": "he_normal",
      "nb_filter": 100,
      "border_mode": "valid",
      "b_regularizer": {
        "l2": 0.0,
        "name": "WeightRegularizer",
        "l1": 0
      },
      "W_regularizer": {
        "l2": 0.0,
        "name": "WeightRegularizer",
        "l1": 0
      },
      "activity_regularizer": null,
      "nb_row": 1
    },
    {
      "cache_enabled": true,
      "trainable": true,
      "name": "Dropout",
      "custom_name": "dropout",
      "p": 0.1
    },
    {
      "name": "MaxPooling2D",
      "custom_name": "maxpooling2d",
      "cache_enabled": true,
      "trainable": true,
      "dim_ordering": "th",
      "pool_size": [
        1,
        40
      ],
      "strides": [
        1,
        40
      ],
      "border_mode": "valid"
    },
    {
      "cache_enabled": true,
      "trainable": true,
      "name": "Flatten",
      "custom_name": "flatten"
    },
    {
      "W_constraint": null,
      "b_constraint": null,
      "name": "Dense",
      "custom_name": "dense",
      "activity_regularizer": null,
      "trainable": true,
      "cache_enabled": true,
      "init": "glorot_uniform",
      "input_dim": null,
      "b_regularizer": null,
      "W_regularizer": null,
      "activation": "linear",
      "output_dim": 4
    }
  ],
  "class_mode": "categorical",
  "optimizer": {
    "beta_1": 0.8999999761581421,
    "epsilon": 1e-08,
    "beta_2": 0.9990000128746033,
    "lr": 0.0010000000474974513,
    "name": "Adam"
  },
  "name": "Sequential",
  "loss": "mean_squared_error"
}
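
To see how the configuration above turns a 1 x 4 x 145 input into 4 outputs, the layer shapes can be traced by hand. The sketch below does this using the standard arithmetic for "valid" (unpadded) convolution and pooling, and reads "th" ordering as (channels, rows, columns). `LAYERS` is a trimmed copy of the shape-relevant fields from the config, and `conv_valid` and `trace_shapes` are hypothetical helpers, not part of Keras or DragoNN.

```python
# Shape-relevant fields copied from the configuration above.
LAYERS = [
    {"name": "Convolution2D", "nb_filter": 100, "nb_row": 4, "nb_col": 13,
     "subsample": [1, 1]},
    {"name": "Dropout", "p": 0.1},
    {"name": "Convolution2D", "nb_filter": 100, "nb_row": 1, "nb_col": 13,
     "subsample": [1, 1]},
    {"name": "Dropout", "p": 0.1},
    {"name": "MaxPooling2D", "pool_size": [1, 40], "strides": [1, 40]},
    {"name": "Flatten"},
    {"name": "Dense", "output_dim": 4},
]


def conv_valid(size, kernel, stride=1):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


def trace_shapes(layers, input_shape):
    """Return (layer name, output shape) pairs, excluding the batch axis."""
    shape = tuple(input_shape)
    trace = []
    for layer in layers:
        kind = layer["name"]
        if kind == "Convolution2D":
            _, rows, cols = shape
            stride_r, stride_c = layer["subsample"]
            shape = (layer["nb_filter"],
                     conv_valid(rows, layer["nb_row"], stride_r),
                     conv_valid(cols, layer["nb_col"], stride_c))
        elif kind == "MaxPooling2D":
            channels, rows, cols = shape
            pool_r, pool_c = layer["pool_size"]
            stride_r, stride_c = layer["strides"]
            shape = (channels,
                     conv_valid(rows, pool_r, stride_r),
                     conv_valid(cols, pool_c, stride_c))
        elif kind == "Flatten":
            size = 1
            for dim in shape:
                size *= dim
            shape = (size,)
        elif kind == "Dense":
            shape = (layer["output_dim"],)
        # Dropout leaves the shape unchanged.
        trace.append((kind, shape))
    return trace


for name, shape in trace_shapes(LAYERS, (1, 4, 145)):
    print(name, shape)
```

Under these assumptions the first convolution spans all four base rows and yields 100 x 1 x 133, the second yields 100 x 1 x 121, pooling with width and stride 40 leaves 100 x 1 x 3, and the flattened 300 features feed the 4-unit linear output layer.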