@garibarba
Last active October 2, 2020 14:48

Deep Learning Won't-Read List

Comprehensive Texts - Books

Courses and long reads

Training

Generalization

Catastrophic forgetting

Biology

Reinforcement Learning

⭐ Asynchronous

Recurrent Networks

Recurrent Networks

Hierarchical

LSTM

Attention / Learned Policies

Video

Extensions

Practical

Memory models

Generation and reconstruction

Dimensionality reduction

⭐ Inverting networks

GANs

NLP

Research tools

Other lists by other people

Privacy and data security

Unsorted

NIPS16

Physics

Mess

Semi-supervised learning - Chapelle et al. 2006. Also Lasserre et al. 2006 & Larochelle and Bengio 2008

Boosting

What My Deep Model Doesn't Know... http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html by Yarin Gal (Cambridge machine learning group)

Video captioning, attention over input frames: Yao et al., "Describing Videos by Exploiting Temporal Structure" -> Eligibility Traces

Attention: DRAW (draws MNIST digits). Newer and better: Jaderberg et al. (DeepMind), "Spatial Transformer Networks"

Multiple Object Recognition with Visual Attention (RNN policy instead of CNN) Ba et al.

Visual attention (Xu et al.)

Pointer networks (Vinyals)

Perpetual Learning Machines

Yoshua Bengio: segregated RNN modules with different timings (1995)?

https://www.quora.com/What-do-you-think-is-the-most-puzzling-thing-about-Deep-Learning-that-has-not-been-researched-enough-yet (time and brain)

Curriculum learning http://www.machinelearning.org/archive/icml2009/papers/119.pdf (teach first simple concepts)
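As a rough illustration of that idea (my own sketch, not code from the paper), a curriculum can be as simple as sorting examples by some task-specific difficulty score and letting later training stages draw from a growing pool. The `difficulty` heuristic below (sentence length) is a hypothetical placeholder.

```python
import numpy as np

def curriculum_batches(examples, difficulty, n_stages=3, batch_size=32, seed=0):
    """Yield minibatches that start with the easiest examples and gradually
    admit harder ones (the "teach simple concepts first" idea).

    examples:   sequence of training examples.
    difficulty: per-example difficulty scores, e.g. sentence length; choosing
                this heuristic is the task-specific (and hard) part.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(difficulty)                     # easiest first
    for stage in range(1, n_stages + 1):
        # Stage k samples from the easiest k/n_stages fraction of the data.
        pool = order[: int(len(order) * stage / n_stages)]
        for _ in range(max(1, len(pool) // batch_size)):
            idx = rng.choice(pool, size=min(batch_size, len(pool)), replace=False)
            yield [examples[i] for i in idx]

# Toy usage: "sentences" whose difficulty is their length.
sentences = ["a b", "a b c d", "a", "a b c", "a b c d e f", "a b c d e"]
lengths = [len(s.split()) for s in sentences]
for batch in curriculum_batches(sentences, lengths, n_stages=2, batch_size=2):
    print(batch)
```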

Show and Tell: A Neural Image Caption Generator

Hinton's recommendations for NLP in 2015: the NIPS paper by Sutskever, Vinyals and Le (2014) and the papers from Yoshua Bengio's lab on machine translation using recurrent nets.

Investigate this source http://www.computervisiontalks.com

MIT autoencoder 3d angles rendering

NLP almost from scratch

Best --> “GloVe: Global Vectors for Word Representation” by Pennington et al. (2014)

“Distributed Representations of Words and Phrases and their Compositionality” (Mikolov et al. 2013)

NNLM, HLBL, RNN, Skip-gram/CBOW (Bengio et al.; Collobert & Weston; Huang et al.; Mnih & Hinton; Mikolov et al.; Mnih & Kavukcuoglu)

Sequence-to-Sequence

1. MT [Kalchbrenner et al., EMNLP 2013] [Cho et al., EMNLP 2014] [Sutskever, Vinyals & Le, NIPS 2014] [Luong et al., ACL 2015] [Bahdanau et al., ICLR 2015]
2. Image captions [Mao et al., ICLR 2015] [Vinyals et al., CVPR 2015] [Donahue et al., CVPR 2015] [Xu et al., ICML 2015]
3. Speech [Chorowski et al., NIPS DL 2014] [Chan et al., arXiv 2015]
4. Parsing [Vinyals & Kaiser et al., NIPS 2015]
5. Dialogue [Shang et al., ACL 2015] [Sordoni et al., NAACL 2015] [Vinyals & Le, ICML DL 2015]
6. Video generation [Srivastava et al., ICML 2015]
7. Geometry [Vinyals, Fortunato & Jaitly, NIPS 2015]

Related Memory Models

* RNNSearch (Bahdanau et al.) for Machine Translation
  + Can be seen as a MemNN where memory goes back only one sentence (writes an embedding for each word).
  + At prediction time, reads memory and performs a softmax to find the best alignment (most useful words).
* Generating Sequences With RNNs (Graves, '13)
  + Also does alignment with the previous sentence to generate handwriting (so the RNN knows what letter it's currently on).
* Neural Turing Machines (Graves et al., '14) [on arXiv just 5 days after MemNNs!]
  + Has read and write operations over memory to perform tasks (e.g. copy, sort, associative recall).
  + 128 memory slots in experiments; content addressing computes a score for each slot ⇒ slow for large memory (see the sketch after this list).
* Earlier work by (Das, '92), (Schmidhuber et al., '93), DISCERN (Miikkulainen, '90) and others...
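The "one score per slot, then softmax" step shared by RNNSearch-style soft alignment and NTM content addressing is small enough to sketch. The NumPy snippet below is my own illustration (not code from any of the papers above): a cosine score per memory slot followed by a softmax, which also makes clear why the cost grows with memory size.

```python
import numpy as np

def content_address(memory, key, beta=1.0):
    """Soft content-based addressing over memory slots (score + softmax)."""
    # One cosine-similarity score per slot, so cost grows linearly with the
    # number of slots -- the "slow for large memory" point above.
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    scores = memory @ key / norms
    # Softmax turns the scores into a soft addressing / alignment distribution,
    # the same form as the soft alignment over encoder states in RNNSearch.
    weights = np.exp(beta * scores)
    weights /= weights.sum()
    # The read vector is a weighted blend of all slots.
    read = weights @ memory
    return weights, read

# Toy usage: 128 slots as in the NTM experiments, 20-dimensional contents.
memory = np.random.randn(128, 20)
key = memory[5] + 0.1 * np.random.randn(20)   # a query close to slot 5
weights, read = content_address(memory, key, beta=5.0)
print(weights.argmax())                        # expected: 5
```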

Information theory book by MacKay: http://www.inference.phy.cam.ac.uk/mackay/itila/. Also https://www.quora.com/What-is-a-good-explanation-of-Information-Theory, which refers to Shannon's book.

Echo state networks - Jaeger, Herbert, and Harald Haas. “Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication.” Science 304.5667 (2004): 78-80.

Kohonen, Teuvo. “Self-organized formation of topologically correct feature maps.” Biological cybernetics 43.1 (1982): 59-69.

Meetings?, EAMT, PhD, lists of links: http://www.cis.uni-muenchen.de/~davidk/deep-munich/ and http://www.cis.uni-muenchen.de/~fraser/nmt_seminar_2016_SS/nmt_reading_list.txt
