xxsang / Classification
Created September 24, 2019 14:49
Classification
1. Introduction
1. response is qualitative variable
2. estimate the probabilities that X belongs to each category C
3. logistic regression->binary
4. multiclass logistic regression/discriminant analysis -> multi-class
2. Logistic regression
1. p(X) = e^(beta0+beta1X) / (1 + e^(beta0+beta1X))
2. Transforms the output of the linear model into the range [0,1]
3. log(p(X)/(1-p(X))) = beta0 + beta1X, the log odds/logit transformation of p(X)
4. Maximum Likelihood (Fisher)
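A minimal numpy sketch (my illustration, not part of the gist) of the pieces above: the sigmoid form of p(X) and maximum-likelihood fitting of beta0, beta1 by plain gradient ascent. The data, true coefficients (0.5, 2.0), learning rate and iteration count are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
# synthetic labels from a true model with log odds = 0.5 + 2.0 * x
y = (rng.uniform(size=200) < 1 / (1 + np.exp(-(0.5 + 2.0 * x)))).astype(float)

def p_of_x(beta0, beta1, x):
    # p(X) = e^(beta0 + beta1*X) / (1 + e^(beta0 + beta1*X)), written via the sigmoid
    return 1 / (1 + np.exp(-(beta0 + beta1 * x)))

beta0, beta1 = 0.0, 0.0
lr = 0.1
for _ in range(5000):                     # gradient ascent on the mean log-likelihood
    p = p_of_x(beta0, beta1, x)
    beta0 += lr * np.mean(y - p)          # d/dbeta0 of the mean log-likelihood
    beta1 += lr * np.mean((y - p) * x)    # d/dbeta1 of the mean log-likelihood

print(beta0, beta1)                       # maximum-likelihood estimates, roughly (0.5, 2.0)
```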
xxsang / Linear Regression
Created September 20, 2019 09:49
Linear Regression
1. Simple linear regression
1. Assumes the dependence of response Y on predictors X1,…Xp is linear
2. Simple is good
3. residual: ei = yi - yi_hat
4. residual sum of squares = e1^2+…en^2
5. optimisation problem to minimise total RSS, has closed form solution
6. A measure of precision -> how close the estimator is to 0 (no relationship)
1. Standard error of the slope
2. SE(beta1)^2 = var(e) / sum_i (xi - x_bar)^2, i.e. noise variance over the spread of X around its mean
3. SE of the intercept
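A short numpy sketch (mine, not from the gist) of the closed-form least-squares solution, the RSS, and the two standard errors listed above; the synthetic data and the true coefficients (1.0, 3.0) are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 1.0 + 3.0 * x + rng.normal(scale=0.5, size=100)

x_bar, y_bar = x.mean(), y.mean()
sxx = np.sum((x - x_bar) ** 2)                    # spread of X around its mean

beta1 = np.sum((x - x_bar) * (y - y_bar)) / sxx   # closed-form slope
beta0 = y_bar - beta1 * x_bar                     # closed-form intercept

e = y - (beta0 + beta1 * x)                       # residuals ei = yi - yi_hat
rss = np.sum(e ** 2)                              # residual sum of squares
sigma2 = rss / (len(x) - 2)                       # unbiased estimate of var(e)

se_beta1 = np.sqrt(sigma2 / sxx)                                  # SE of the slope
se_beta0 = np.sqrt(sigma2 * (1 / len(x) + x_bar ** 2 / sxx))      # SE of the intercept

print(beta0, beta1, se_beta0, se_beta1)
```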
xxsang / Overview of Statistical Learning
Created September 15, 2019 05:17
Overview of Statistical Learning
1. Regression model
1. Target/response to predict Y
2. features/input/predictor X = vector(X1, X2, X3)
3. Y = f(X) + e, e captures measurement errors and other discrepancies
4. Good for
1. make prediction
2. understand which components are important
3. may be able to understand how each component affects Y (depending on the complexity of f)
5. at a particular point, e.g. f(4) = E(Y|X=4), where E denotes the expected value
6. the regression function f(x) = E(Y|X=x) is the conditional expectation
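A toy sketch (my addition, not in the gist) of f(x) = E(Y|X=x): without any model, the conditional expectation at a point can be approximated by averaging Y over observations whose X lies in a small neighbourhood. The true function sin(x), the noise level and the neighbourhood width are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 8, size=5000)
y = np.sin(x) + rng.normal(scale=0.3, size=5000)   # Y = f(X) + e

def f_hat(x0, width=0.2):
    # neighbourhood average approximating E(Y | X = x0)
    mask = np.abs(x - x0) < width
    return y[mask].mean()

print(f_hat(4.0), np.sin(4.0))   # estimate vs the true f(4)
```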
xxsang / Statistical Learning Introduction
Created September 9, 2019 09:46
Statistical Learning Introduction
1. Look at the data first before jumping to analysis
2. Supervised learning problem
1. tasks
1. Accurately predict unseen test cases
2. Understand which inputs affect the outcome and how
3. Assess the quality of our predictions and inferences
2. know when and how to use them
3. evaluate the model
3. Unsupervised learning
1. Data is unlabeled
xxsang / Gaussian Process and Bayesian Optimisation
Created September 3, 2019 10:17
Gaussian Process and Bayesian Optimisation
1. Nonparametric methods
1. Parametric methods
1. fit a fixed number of parameters
2. Nonparametric
1. number of parameters depends on dataset size
2. k-nearest neighbours
3. Gaussian/uniform kernels
3. Comparison
1. Parametric
1. limited complexity
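A small sketch (my illustration, not from the gist) contrasting the two families above: a straight-line fit always keeps exactly two parameters, while Gaussian-kernel (Nadaraya-Watson) regression keeps the whole training set, so its effective number of parameters grows with the data. The training data, bandwidth and query point are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=200)
y_train = np.sin(x_train) + rng.normal(scale=0.2, size=200)

def kernel_predict(x0, bandwidth=0.3):
    # Gaussian-kernel weighted average of the training targets (nonparametric)
    w = np.exp(-0.5 * ((x_train - x0) / bandwidth) ** 2)
    return np.sum(w * y_train) / np.sum(w)

# parametric alternative: a straight line, always exactly 2 parameters
slope, intercept = np.polyfit(x_train, y_train, deg=1)

print(kernel_predict(1.0), slope * 1.0 + intercept, np.sin(1.0))
```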
xxsang / Variational Autoencoder + Variational Dropout
Created August 27, 2019 05:24
Variational Autoencoder + Variational Dropout
1. Scaling Variational Inference & unbiased estimates
a. Scale to big datasets
i. Traditionally too slow for big data
ii. Not very beneficial
b. Mixture model (Bayesian + deep learning)
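A tiny sketch (my illustration, not from the gist) of the "unbiased estimates" idea in the heading above, which is what lets variational inference scale: a minibatch sum rescaled by N/m is an unbiased estimate of the full-data sum appearing in the objective. The array of per-datapoint terms, the minibatch size and the number of repeats are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
terms = rng.normal(loc=1.0, size=100_000)   # stand-in for per-datapoint terms in the objective
N, m = terms.size, 256

full_sum = terms.sum()
# each rescaled minibatch sum is an unbiased estimate of the full-data sum
estimates = [terms[rng.choice(N, size=m, replace=False)].sum() * N / m
             for _ in range(500)]

print(full_sum, float(np.mean(estimates)))  # the estimates average out close to the full sum
```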
xxsang / Markov Chain Monte Carlo
Created August 23, 2019 05:16
Markov Chain Monte Carlo
1. Monte Carlo Estimation
1. Approximation by simulation
2. Easy to program, easy to parallelise, can be slow for some problems
3. quick and dirty
4. unbiased (see the sketch after this list)
5. like an infinitely large ensemble of neural networks
6. full bayesian modelling
7. approximates intractable quantities (e.g. posteriors and expectations)
8. M-step of EM algorithm
2. Sampling from 1-d distributions
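A minimal sketch (my addition, not part of the gist) of point 1 above: Monte Carlo estimation approximates an expectation by the average of a function over samples, an unbiased estimate that is trivial to parallelise across chunks of samples. The particular expectation E[X^2] under a standard normal is chosen just for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def monte_carlo(f, sampler, n=100_000):
    # average of f over n draws from `sampler` approximates E[f(X)]
    return np.mean(f(sampler(n)))

# E[X^2] for X ~ N(0, 1) is exactly 1
estimate = monte_carlo(lambda x: x ** 2, lambda n: rng.normal(size=n))
print(estimate)   # close to 1, with error shrinking like 1/sqrt(n)
```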
xxsang / Latent Dirichlet Allocation
Created August 20, 2019 08:00
Latent Dirichlet Allocation
1. Topic Modelling
1. Decompose books into distributions over topics
2. Assign topics to texts
3. Compute similarity/distance between vectors of texts
1. Euclidean distance
2. Cosine similarity
2. Dirichlet distribution
1. support: the unit simplex
2. A distribution over a triangle (the simplex for three categories); see the sketch after this list
3. Latent Dirichlet Allocation
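A small numpy sketch (mine, not from the gist) tying points 1-2 above together: topic-proportion vectors drawn from a Dirichlet lie on the simplex, and two documents' vectors can be compared with Euclidean distance or cosine similarity. The concentration parameters and the three-topic setup are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
doc_a = rng.dirichlet(alpha=[0.5, 0.5, 0.5])   # topic proportions for document A
doc_b = rng.dirichlet(alpha=[0.5, 0.5, 0.5])   # topic proportions for document B
print(doc_a.sum())                             # 1.0: a point on the simplex

euclidean = np.linalg.norm(doc_a - doc_b)
cosine = doc_a @ doc_b / (np.linalg.norm(doc_a) * np.linalg.norm(doc_b))
print(euclidean, cosine)
```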
xxsang / Variational Inference
Created August 13, 2019 06:49
Variational Inference
1. Goal: compute approximate posterior probability
2. Steps:
1. select a family of distributions Q as the variational family, e.g. a product of factors qi(zi)
2. find the best approximation q(z) to p*(z) by minimizing the KL divergence (see the sketch after this list)
3. Mean-field approximation
1. Coordinate descent to minimize the KL divergence
2. Ising model
4. Variational EM
1. Use variational inference in the E step: instead of computing the full posterior, minimize over a tractable family of distributions Q to obtain a meaningful approximation of the posterior
2. Called variational EM
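A toy sketch (my illustration, not from the gist) of the two steps above: the variational family Q is taken to be Gaussians N(mu, sigma^2), and the member closest in KL divergence to a fixed target is found by brute-force grid search so the recipe stays visible. The target parameters, the grids and the choice of a Gaussian family are all assumptions for the example.

```python
import numpy as np

def kl_gauss(m1, s1, m2, s2):
    # closed-form KL( N(m1, s1^2) || N(m2, s2^2) )
    return np.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5

target_mu, target_sigma = 2.0, 1.5            # the (here known) target p*(z)

grid_mu = np.linspace(-5, 5, 201)
grid_sigma = np.linspace(0.1, 5, 200)
best = min((kl_gauss(m, s, target_mu, target_sigma), m, s)
           for m in grid_mu for s in grid_sigma)
print(best)   # KL near 0 at (mu, sigma) close to (2.0, 1.5)
```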
xxsang / Expectation Maximisation Algorithm
Created August 9, 2019 14:26
Expectation Maximisation Algorithm
1. General Form of EM
1. Concave functions
2. Satisfy Jensen's inequality: f(E[t]) >= E[f(t)]
3. Kullback-Leibler divergence: measures the difference between two probability distributions
1. KL divergence
2. compares the two densities at each point via the log-ratio log(p(x)/q(x)) and takes the expectation
3. not symmetric: KL(p||q) != KL(q||p)
4. = 0 when a distribution is compared with itself
5. always non-negative (see the numerical sketch below)
4. EM
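A quick numerical check (my addition, not from the gist) of the KL-divergence properties listed above, using two small discrete distributions; the probability values are arbitrary.

```python
import numpy as np

def kl(p, q):
    # KL(p || q) = sum_x p(x) * log(p(x) / q(x))
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(p * np.log(p / q))

p = [0.7, 0.2, 0.1]
q = [0.3, 0.4, 0.3]

print(kl(p, q), kl(q, p))   # non-negative and not symmetric
print(kl(p, p))             # 0.0 when a distribution is compared with itself
```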