A Few Useful Things to Know about Machine Learning

The paper presents some key lessons and "folk wisdom" that machine learning researchers and practitioners have learnt from experience and which are hard to find in textbooks.

1. Learning = Representation + Evaluation + Optimization

All machine learning algorithms have three components:

  • Representation for a learner is the set of classifiers/functions that it can possibly learn. This set is called the hypothesis space. If a function is not in the hypothesis space, it cannot be learnt.
  • Evaluation function tells how good a candidate machine learning model is.
  • Optimization is the method used to search the hypothesis space for the highest-scoring model (a toy sketch of all three components follows this list).
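
As a toy illustration (my own, not from the paper), the sketch below fixes each component for a linear classifier: the hypothesis space is the set of linear decision boundaries (w, b), the evaluation function is the logistic loss, and the optimizer is plain gradient descent.

```python
import numpy as np

# Representation: hypothesis space = linear classifiers f(x) = sign(w . x + b),
# encoded by the parameters (w, b).
def predict(w, b, X):
    return X @ w + b

# Evaluation: logistic loss measures how good a candidate (w, b) is on the data.
def logistic_loss(w, b, X, y):                  # labels y in {-1, +1}
    margins = y * predict(w, b, X)
    return np.mean(np.log1p(np.exp(-margins)))

# Optimization: plain gradient descent searches the hypothesis space
# for the parameters that score best under the evaluation function.
def fit(X, y, lr=0.1, steps=500):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        margins = y * predict(w, b, X)
        coef = -y / (1.0 + np.exp(margins))     # d(loss)/d(margin) per example
        w -= lr * (X.T @ coef) / len(y)
        b -= lr * coef.mean()
    return w, b

# toy usage on linearly separable data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
w, b = fit(X, y)
print(np.mean(np.sign(predict(w, b, X)) == y))  # close to 1.0
```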
shagunsodhani / Memory Networks.md
Last active March 28, 2024 11:17
Notes for "Memory Networks" paper

Memory Networks

Introduction

  • Memory Networks combine inference components with a long-term memory component (a minimal sketch of the component flow follows this list).
  • Used in the context of Question Answering (QA) with memory component acting as a (dynamic) knowledge base.
  • Link to the paper.
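
The paper factors a memory network into four components: I (input feature map), G (generalization, which writes to memory), O (output feature map, which reads the relevant memories), and R (response, which produces the final answer). The sketch below chains these together; the component names follow the paper, but the bag-of-words scoring and last-word "decoding" are illustrative placeholders rather than the paper's learned embedding models.

```python
# A minimal sketch of the I -> G -> O -> R pipeline.

class MemoryNetwork:
    def __init__(self):
        self.memory = []                               # long-term memory slots

    def I(self, text):                                 # input feature map
        return set(text.lower().rstrip(".?").split())  # bag of words (placeholder)

    def G(self, features, raw):                        # generalization: write to memory
        self.memory.append((features, raw))            # simplest case: store in the next free slot

    def O(self, q_features):                           # output: retrieve the best supporting memory
        return max(self.memory, key=lambda m: len(q_features & m[0]))

    def R(self, supporting_fact):                      # response: map the output to an answer
        return supporting_fact[1].rstrip(".").split()[-1]

    def read(self, statement):
        self.G(self.I(statement), statement)

    def answer(self, question):
        return self.R(self.O(self.I(question)))

mem_nn = MemoryNetwork()
mem_nn.read("Joe went to the kitchen.")
mem_nn.read("Fred moved to the garden.")
print(mem_nn.answer("Where is Joe?"))   # -> "kitchen"
```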

Related Work

shagunsodhani / DistBelief.md
Created April 2, 2016 07:10
Notes for "Large Scale Distributed Deep Networks" paper

Large Scale Distributed Deep Networks

Introduction

  • In machine learning, accuracy tends to increase with the number of training examples and the number of model parameters.
  • For large data, training becomes slow even on a GPU (due to increased CPU-GPU data transfer).
  • Solution: distributed training and inference - DistBelief (a toy sketch of the idea follows this list).
  • Link to paper
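
As a rough, single-process illustration of the data-parallel idea (in the spirit of the paper's asynchronous SGD with a central parameter server, here simulated sequentially and with a toy linear-regression objective rather than a deep network):

```python
import numpy as np

# Several model replicas fetch the current parameters from a parameter server,
# compute gradients on their own data shard, and push updates back without
# synchronizing with each other. The real system also shards the parameters
# themselves across many server machines; this sketch keeps them in one place.

class ParameterServer:
    def __init__(self, dim, lr=0.01):
        self.w = np.zeros(dim)
        self.lr = lr

    def fetch(self):
        return self.w.copy()

    def push(self, grad):                 # applied as soon as it arrives
        self.w -= self.lr * grad

def replica_step(server, X_shard, y_shard):
    w = server.fetch()                    # may already be stale by the time we push
    grad = 2 * X_shard.T @ (X_shard @ w - y_shard) / len(y_shard)   # linear-regression gradient
    server.push(grad)

# toy run: 4 "replicas", each owning a shard of the data
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
true_w = np.arange(1.0, 6.0)
y = X @ true_w
server = ParameterServer(dim=5)
shards = np.array_split(np.arange(400), 4)
for _ in range(200):
    for idx in shards:                    # in DistBelief these would run concurrently
        replica_step(server, X[idx], y[idx])
print(server.w.round(2))                  # approaches true_w
```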

DistBelief

shagunsodhani / Pregel.md
Created December 20, 2015 11:55
Notes on Pregel Paper

The Pregel paper introduces a vertex-centric, large-scale graph computational model. Interestingly, the name Pregel comes from the name of the river which the Seven Bridges of Königsberg spanned.

Computational Model

The system takes as input a directed graph with properties assigned to both vertices and edges. The computation consists of a sequence of iterations, called supersteps. In each superstep, a user-defined function is invoked on each vertex in parallel. This function essentially implements the algorithm by specifying the behaviour of a single vertex V during a single superstep S. The function can read messages sent to the vertex V during the previous superstep (S-1), change the state of the vertex or its outgoing edges, mutate the graph topology by adding/removing vertices or edges, and send messages to other vertices that will be received in the next superstep (S+1). Since all computation during a superstep is performed locally, the model is well suited to a distributed implementation.
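
A single-machine simulation of the superstep loop is sketched below (my own illustration, not the paper's C++ API), using the paper's running example: each vertex adopts the largest value it has received, forwards it along its out-edges when it changes, and otherwise votes to halt; the run ends once every vertex is inactive and no messages are in flight.

```python
def run_max_propagation(out_edges, values):
    """out_edges: dict vertex -> list of out-neighbours; values: dict vertex -> int."""
    inbox = {v: [] for v in out_edges}
    active = set(out_edges)                       # all vertices start active
    superstep = 0
    while active:
        outbox = {v: [] for v in out_edges}
        for v in sorted(active):
            new_value = max([values[v]] + inbox[v])
            changed = new_value > values[v]
            values[v] = new_value
            if superstep == 0 or changed:         # forward the (possibly new) value
                for dst in out_edges[v]:
                    outbox[dst].append(new_value)
            # the vertex then votes to halt; it is woken up only by an incoming message
        inbox = outbox
        active = {v for v, msgs in inbox.items() if msgs}
        superstep += 1
    return values

edges = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"]}
print(run_max_propagation(edges, {"a": 3, "b": 6, "c": 2, "d": 1}))
# every vertex ends up with the maximum value, 6
```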

shagunsodhani / LargeVis.md
Last active November 10, 2023 02:05
Notes for LargeVis paper

Visualizing Large-scale and High-dimensional Data

Introduction

  • LargeVis - a technique to visualize large-scale and high-dimensional data in low-dimensional space.
  • The problem relates to both the information visualization and the machine learning (and data mining) domains.
  • Link to the paper

t-SNE

shagunsodhani / bAbI.md
Created May 25, 2016 17:18
Notes for "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks" Paper

Introduction

The paper presents a framework and a set of synthetic toy tasks (classified into skill sets) for analyzing the performance of different machine learning algorithms.

Tasks

  • Single/Two/Three Supporting Facts: Questions where one (or multiple) supporting facts provide the answer. The more supporting facts required, the harder the task (an illustrative instance follows this list).
  • Two/Three Argument Relations: Requires differentiating between subjects and objects.
  • Yes/No Questions: True/False questions.
  • Counting/List/Set Questions: Requires the ability to count or list objects having a certain property.
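
An illustrative instance of the single-supporting-fact type (in the style of the paper's examples): given the story "John is in the playground. Bob is in the office.", the question "Where is John?" is answered by "playground", and the first sentence is the single supporting fact the learner must identify.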
shagunsodhani / Advances In Optimizing Recurrent Networks.md
Created June 26, 2016 16:44
Notes for "Advances In Optimizing Recurrent Networks" paper

Advances In Optimizing Recurrent Networks

Introduction

  • Recurrent Neural Networks (RNNs) are very powerful at modelling sequences but struggle to learn long-term dependencies.
  • The paper discusses the reasons behind this difficulty and suggests several remedies, one of which (gradient clipping) is sketched after this list.
  • Link to the paper.
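
One of the remedies discussed is clipping the gradient whenever its norm grows too large, so that exploding gradients cannot throw the parameters far off course. A minimal sketch (the threshold value here is arbitrary):

```python
import numpy as np

def clip_gradient(grad, threshold=5.0):
    """Rescale the gradient so its norm never exceeds the threshold."""
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad

g = np.array([30.0, -40.0])          # norm 50: an "exploding" step
print(clip_gradient(g))              # rescaled to norm 5: [ 3. -4.]
```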

Optimization Difficulty

shagunsodhani / The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training.md
Created June 8, 2016 17:10
Summary of "The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training" paper

The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training

Introduction

  • The paper explores the challenges involved in training deep networks and the effect of unsupervised pre-training on the training process, and visualizes the error-function landscape for deep architectures (a sketch of greedy layer-wise pre-training, the procedure whose effect is studied, follows this list).
  • Link to the paper
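
For context, unsupervised pre-training is usually performed greedily, layer by layer: each layer is first trained to model (here, reconstruct) the representation produced by the layers below it, and the resulting weights initialize the deep network before supervised fine-tuning. The sketch below uses linear autoencoders with tied weights for brevity; it illustrates the general recipe, not the paper's exact setup.

```python
import numpy as np

def pretrain_layer(H, n_hidden, lr=0.01, steps=200, seed=0):
    """Train W so the hidden code H @ W roughly reconstructs H (tied weights)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(H.shape[1], n_hidden))
    for _ in range(steps):
        code = H @ W                                   # encode
        recon = code @ W.T                             # decode with tied weights
        err = recon - H
        grad = H.T @ (err @ W) + err.T @ H @ W         # gradient of the reconstruction error
        W -= lr * grad / len(H)
    return W

def greedy_pretrain(X, layer_sizes):
    """Stack layers, pre-training each one on the output of the previous layer."""
    weights, H = [], X
    for n_hidden in layer_sizes:
        W = pretrain_layer(H, n_hidden)
        weights.append(W)
        H = np.tanh(H @ W)                             # representation fed to the next layer
    return weights                                     # used to initialize supervised fine-tuning

X = np.random.default_rng(1).normal(size=(256, 20))
weights = greedy_pretrain(X, layer_sizes=[16, 8])
print([W.shape for W in weights])                      # [(20, 16), (16, 8)]
```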

Experiments

  • Datasets used - Shapeset and MNIST.
shagunsodhani / ElasticNet.md
Created March 13, 2016 17:51
Notes for "Regularization and variable selection via the elastic net" paper.

Regularization and variable selection via the elastic net

Introduction to elastic net

  • Regularization and variable selection method (its defining criterion is written out after this list).
  • Produces sparse representations.
  • Exhibits a grouping effect.
  • Particularly useful when the number of predictors (p) is much larger than the number of observations (n).
  • LARS-EN algorithm to compute the elastic net regularization path.
  • Link to paper.
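
For reference, for fixed non-negative λ1 and λ2 the (naive) elastic net criterion from the paper is

    L(λ1, λ2, β) = ‖y − Xβ‖² + λ2‖β‖² + λ1‖β‖₁,

so the lasso (λ2 = 0) and ridge regression (λ1 = 0) are recovered as special cases; the L1 term drives sparsity while the L2 term is responsible for the grouping effect.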
shagunsodhani / PixelRNN.md
Created October 9, 2016 13:22
Summary of PixelRNN paper

Pixel Recurrent Neural Network

Introduction

  • Problem: Building an expressive, tractable and scalable image model which can be used in downstream tasks like image generation, reconstruction, compression etc.
  • Link to the paper

Model

  • Scan the image one row at a time and one pixel at a time within each row; this raster-scan order defines the conditioning context in the factorization written out below.
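
This ordering lets the model write the joint distribution over an n×n image as a tractable product of per-pixel conditionals (each conditional is further split over the R, G and B channels of the pixel),

    p(x) = ∏_{i=1}^{n²} p(x_i | x_1, …, x_{i−1}),

and generation samples the pixels one at a time in the same order, each conditioned on all previously generated pixels.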