name | model_weights | weights_url | license | sha1 | dragonn_version | gist_id |
---|---|---|---|---|---|---|
MPRA DragoNN | mpra_model.weights.hd5 | | unrestricted | be007a27de30d0eaf330e23fa37fc628259dfaf4 | 0.1.3 | 33300cd4bb7e061b50bc25fd85b27d6f |

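The sha1 value listed above can be used to check a downloaded copy of the weights file. The following is a minimal sketch, assuming the digest is the SHA-1 of the weights file itself and that the file has been saved locally. The helper names `sha1_of_file` and `weights_match` are hypothetical and are not part of DragoNN.

```python
import hashlib

# Expected digest, copied from the sha1 column of the table above.
EXPECTED_SHA1 = "be007a27de30d0eaf330e23fa37fc628259dfaf4"


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def weights_match(path, expected=EXPECTED_SHA1):
    """Return True if the file's SHA-1 equals `expected` (case-insensitive)."""
    return sha1_of_file(path) == expected.lower()
```

For example, `weights_match("mpra_model.weights.hd5")` would return `True` for an intact download, assuming the file sits in the working directory under that name.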
Title: Predicting regulatory activities from massively parallel reporter assays
Description: We trained a deep convolutional neural network on MPRA data from Ernst et al., 2016 [1]. The input to the model is a one-hot encoding of a 145 bp DNA sequence. The model predicts MPRA activities, normalized as described in the paper and then squashed into [-1, 1]. The study examines two cell lines, HepG2 and K562, and two promoters per cell line, minP and SV40P. For a given sequence, the model predicts a length-4 output vector corresponding to the predicted activity in (HepG2, minP), (K562, minP), (HepG2, SV40P), and (K562, SV40P). Please see our preprint [2] and our git repository [3] for more details on the model and for validation that it learns biologically relevant information.
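The input and output conventions described above can be sketched as follows. This is an illustration only: the A, C, G, T channel order, the (145, 4) array layout, and the all-zero encoding of ambiguous bases are assumptions, and the released model may expect a different layout or extra axes. The names `one_hot_encode` and `label_outputs` are hypothetical helpers, not DragoNN functions.

```python
import numpy as np

SEQ_LEN = 145      # input length stated in the description
ALPHABET = "ACGT"  # assumed channel order; not confirmed by the source

# Output order as listed in the description.
TASKS = [("HepG2", "minP"), ("K562", "minP"), ("HepG2", "SV40P"), ("K562", "SV40P")]


def one_hot_encode(seq):
    """Encode a 145 bp DNA string as a (145, 4) float32 array.

    Bases outside A/C/G/T (for example N) are left as all-zero rows,
    which is an assumption made for this sketch.
    """
    seq = seq.upper()
    if len(seq) != SEQ_LEN:
        raise ValueError(f"expected {SEQ_LEN} bp, got {len(seq)}")
    encoded = np.zeros((SEQ_LEN, len(ALPHABET)), dtype=np.float32)
    for pos, base in enumerate(seq):
        idx = ALPHABET.find(base)
        if idx >= 0:
            encoded[pos, idx] = 1.0
    return encoded


def label_outputs(prediction):
    """Map a length-4 prediction vector to (cell line, promoter) pairs."""
    prediction = np.asarray(prediction, dtype=float)
    if prediction.shape != (len(TASKS),):
        raise ValueError(f"expected shape ({len(TASKS)},), got {prediction.shape}")
    return {task: float(value) for task, value in zip(TASKS, prediction)}
```

Given a model output `y` for one sequence, `label_outputs(y)[("K562", "SV40P")]` would give the predicted activity, in [-1, 1], for K562 with the SV40P promoter.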
[1] Ernst J, Melnikov A, Zhang X, Wang L, Rogov P, Mikkelsen T, Kellis M. Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nature Biotechnology, 34:1180-1190, 2016.
[2] Joe Paggi, Andrew Lamb, Kevin Tian, Irving Hsu, Pierre-Louis Cedoz, Prasad Kawthekar. bioRxiv 099879; doi: https://doi.org/10.1101/099879
[3] https://github.com/irvhsu/cs273b-project