- Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15, 1929–1958: http://jmlr.org/papers/v15/srivastava14a.html
- He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification: https://arxiv.org/abs/1502.01852
- Glorot, X. and Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks: http://proceedings.mlr.press/v9/glorot10a.html
- RMSprop: http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
- Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization: https://arxiv.org/abs/1412.6980
- Dauphin, Y., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., and Bengio, Y. (2014). Identifying and attacking the saddle point problem in high-dimensional non-convex optimization: https://arxiv.org/abs/1406.2572
- Bergstra, J. and Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305: http://www.jmlr.org/papers/v13/bergstra12a.html
- Bergstra, J. et al. (2011). Algorithms for hyper-parameter optimization. Advances in Neural Information Processing Systems (pp. 2546–2554): http://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf
- Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift: https://arxiv.org/abs/1502.03167
- LeNet-5: LeCun et al., "Gradient-Based Learning Applied to Document Recognition": http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
- AlexNet: Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks": https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
- VGG-16: Simonyan and Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition": https://arxiv.org/pdf/1409.1556.pdf
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, "Deep Residual Learning for Image Recognition": https://arxiv.org/abs/1512.03385
- Min Lin, Qiang Chen, Shuicheng Yan, "Network In Network": https://arxiv.org/abs/1312.4400
- Christian Szegedy et al., "Going Deeper with Convolutions": https://arxiv.org/abs/1409.4842
- Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann LeCun, "OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks": https://arxiv.org/abs/1312.6229
- Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi, "You Only Look Once: Unified, Real-Time Object Detection": https://arxiv.org/abs/1506.02640
- Joseph Redmon, Ali Farhadi, "YOLO9000: Better, Faster, Stronger": https://arxiv.org/abs/1612.08242
- Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation": https://arxiv.org/abs/1311.2524
- Ross Girshick, "Fast R-CNN": https://arxiv.org/abs/1504.08083
- Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks": https://arxiv.org/abs/1506.01497
- Taigman et al, "DeepFace: Closing the Gap to Human-Level Performance in Face Verification": https://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf
- Florian Schroff, Dmitry Kalenichenko, James Philbin, "FaceNet: A Unified Embedding for Face Recognition and Clustering": https://arxiv.org/abs/1503.03832
- Matthew D Zeiler, Rob Fergus, "Visualizing and Understanding Convolutional Networks": https://arxiv.org/abs/1311.2901
- Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, "A Neural Algorithm of Artistic Style": https://arxiv.org/abs/1508.06576
- Log0, TensorFlow Implementation of "A Neural Algorithm of Artistic Style": http://www.chioka.in/tensorflow-implementation-neural-algorithm-of-artistic-style
- Harish Narayanan, "Convolutional neural networks for artistic style transfer": https://harishnarayanan.org/writing/artistic-style-transfer/
- Chung J, Gulcehre C, Cho K, Bengio Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv:1412.3555. December 2014: http://arxiv.org/abs/1412.3555
- Cho K, van Merrienboer B, Bahdanau D, Bengio Y. On the Properties of Neural Machine Translation: Encoder-Decoder Approaches. arXiv:1409.1259. September 2014: http://arxiv.org/abs/1409.1259
- Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–1780: http://www.bioinf.jku.at/publications/older/2604.pdf
- Andrej Karpathy. The Unreasonable Effectiveness of Recurrent Neural Networks. Published 2015: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
- Pennington J, Socher R, Manning C. GloVe: Global Vectors for Word Representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar: Association for Computational Linguistics; 2014:1532–1543: https://nlp.stanford.edu/pubs/glove.pdf
- Mikolov T, Sutskever I, Chen K, Corrado G, Dean J. Distributed Representations of Words and Phrases and their Compositionality. arXiv:1310.4546. October 2013: http://arxiv.org/abs/1310.4546
- Mikolov T, Chen K, Corrado G, Dean J. Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781. January 2013: http://arxiv.org/abs/1301.3781
- Mikolov T, Yih W, Zweig G. Linguistic Regularities in Continuous Space Word Representations. In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics; 2013:746–751: http://aclweb.org/anthology/N13-1090
- van der Maaten L, Hinton GE. Visualizing Data using t-SNE. Journal of Machine Learning Research. 2008;9:2579–2605.
- Bengio Y, Ducharme R, Vincent P, Jauvin C. A Neural Probabilistic Language Model. Journal of Machine Learning Research. 2003;3:1137–1155.
- Bolukbasi T, Chang K-W, Zou J, Saligrama V, Kalai A. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. arXiv:1607.06520. July 2016: https://arxiv.org/abs/1607.06520v1
- Cho K, van Merrienboer B, Gulcehre C, et al. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078. June 2014: http://arxiv.org/abs/1406.1078
- Sutskever I, Vinyals O, Le QV. Sequence to Sequence Learning with Neural Networks. In: Proc. NIPS. Montreal, Canada; 2014: http://arxiv.org/abs/1409.3215
- Papineni K, Roukos S, Ward T, Zhu W-J. Bleu: a Method for Automatic Evaluation of Machine Translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. Philadelphia, Pennsylvania, USA: Association for Computational Linguistics; 2002:311–318. doi:10.3115/1073083.1073135: https://www.aclweb.org/anthology/P02-1040.pdf
- Mao J, Xu W, Yang Y, Wang J, Huang Z, Yuille A. Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN). arXiv:1412.6632. December 2014: http://arxiv.org/abs/1412.6632
- Karpathy A, Fei-Fei L. Deep Visual-Semantic Alignments for Generating Image Descriptions. arXiv:1412.2306. December 2014: https://arxiv.org/abs/1412.2306v2
- Vinyals O, Toshev A, Bengio S, Erhan D. Show and Tell: A Neural Image Caption Generator. arXiv:1411.4555. November 2014: http://arxiv.org/abs/1411.4555
- Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473. September 2014: http://arxiv.org/abs/1409.0473
- Xu K, Ba J, Kiros R, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. arXiv:1502.03044. February 2015: http://arxiv.org/abs/1502.03044
- Graves A, Fernandez S, Gomez F, Schmidhuber J. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. In: Proceedings of ICML 2006: ftp://ftp.idsia.ch/pub/juergen/icml2006.pdf