- Define the problem and assemble a dataset
- Choose a measure of success (loss function)
- Decide on an evaluation protocol
- K-fold cross-validation (when you have few samples)
- Iterated K-fold validation with shuffling (model evaluation with very little data); see the sketch below
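A minimal K-fold cross-validation sketch in Keras, using synthetic stand-in data and an arbitrary small regression model (`k`, epochs, and layer sizes are illustrative assumptions):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in data: 100 samples, 4 features, scalar targets.
train_data = np.random.rand(100, 4)
train_targets = np.random.rand(100)

def build_model():
    model = keras.Sequential([
        layers.Dense(16, activation="relu"),
        layers.Dense(1),
    ])
    model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
    return model

k = 4
num_val = len(train_data) // k
scores = []
for fold in range(k):
    # Slice out the validation fold, train on everything else.
    val_data = train_data[fold * num_val:(fold + 1) * num_val]
    val_targets = train_targets[fold * num_val:(fold + 1) * num_val]
    partial_data = np.concatenate(
        [train_data[:fold * num_val], train_data[(fold + 1) * num_val:]])
    partial_targets = np.concatenate(
        [train_targets[:fold * num_val], train_targets[(fold + 1) * num_val:]])
    model = build_model()  # fresh, untrained model per fold
    model.fit(partial_data, partial_targets, epochs=10, batch_size=16, verbose=0)
    scores.append(model.evaluate(val_data, val_targets, verbose=0)[1])
print("mean validation MAE:", np.mean(scores))
```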
- Prepare your data
- Data tensors
- Normalize
- Feature engineering
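A sketch of feature-wise normalization (values here are synthetic; the key point is computing the mean and standard deviation on the training data only, then reusing those statistics on the test data):

```python
import numpy as np

rng = np.random.default_rng(0)
train_data = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
test_data = rng.normal(loc=5.0, scale=3.0, size=(20, 4))

# Center each feature on 0 with unit standard deviation.
mean = train_data.mean(axis=0)
std = train_data.std(axis=0)
train_data = (train_data - mean) / std
test_data = (test_data - mean) / std  # reuse training statistics on test data
```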
- Develop a model that does better than a baseline
- Choose the last-layer activation, loss function, and optimization configuration (default optimizer: rmsprop); see the sketch after this list
- Binary classification => sigmoid + binary crossentropy
- Single-label multiclass classification => softmax + categorical crossentropy
- Multilabel classification => sigmoid + binary crossentropy
- Regression to arbitrary values => no last-layer activation + MSE
- Regression to values in [0, 1] => sigmoid + MSE or binary crossentropy
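A sketch of how one row of the list above translates into Keras code, here for binary classification (layer count and sizes are placeholder assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # last-layer activation from the list
])
model.compile(optimizer="rmsprop",          # default optimization configuration
              loss="binary_crossentropy",   # loss matching the problem type
              metrics=["accuracy"])
```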
- Scale up: develop a model that overfits
- Optimization vs Generalization
- Underfitting vs Overfitting
- Add layers
- Make layers bigger
- Train for more epochs
- Regularize your model and tune your hyperparameters
- Add dropout
- Add/remove layers
- Add L1/L2 regularization
- Try different hyperparameters (see the sketch below)
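A sketch of the regularization step: dropout plus L2 weight penalties (the coefficients and layer sizes are illustrative, not tuned):

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    layers.Dense(16, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),  # L2 weight penalty
    layers.Dropout(0.5),  # randomly zero half the activations during training
    layers.Dense(16, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
```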
- Convolutional neural networks (convnets)
- Convolution windows: typically 3x3 or 5x5
- Learn local patterns
- Learn spatial hierarchies of patterns
- Operate on 3D tensors (feature maps)
- Strides
- To downsample, use max pooling (see the sketch below)
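A minimal convnet sketch: stacked Conv2D layers with 3x3 windows and MaxPooling2D downsampling. The 28x28 grayscale input shape (MNIST-like) and layer sizes are assumptions for illustration:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),  # 3x3 convolution window
    layers.MaxPooling2D((2, 2)),                   # downsample feature maps by 2
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),        # single-label classification head
])
```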
- Recurrent neural networks
- 1D convnets
- N-grams => N consecutive items (words or characters)
- Bags of n-grams
- One-hot encoding
- Word embeddings
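A sketch of a learned word-embedding layer, as opposed to one-hot encoding (vocabulary size and embedding dimension are arbitrary assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 10_000
model = keras.Sequential([
    # Maps each integer word index to a dense 8-dimensional vector,
    # learned jointly with the rest of the network.
    layers.Embedding(vocab_size, 8),
    layers.GlobalAveragePooling1D(),  # average word vectors over the sequence
    layers.Dense(1, activation="sigmoid"),
])
```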
- RNN
- Processes sequences
- Maintains a memory (internal state) across timesteps
- SimpleRNN
- LSTM: Long Short-Term Memory
- It is a good idea to increase the capacity of the network until overfitting becomes the primary obstacle (see the sketch below)
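A minimal LSTM sequence-classification sketch (vocabulary size and unit counts are placeholders):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Embedding(10_000, 32),
    layers.LSTM(32),  # carries state across timesteps ("memory")
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```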
- Functional API
- Multiple inputs (and outputs)
- Directed acyclic graphs of layers (see the sketch below)
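A sketch of a multi-input model with the functional API, loosely modeled on a question-answering setup; the input names, vocabulary sizes, and layer sizes are illustrative assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

text_input = keras.Input(shape=(None,), dtype="int32", name="text")
question_input = keras.Input(shape=(None,), dtype="int32", name="question")

# Two independent branches, merged into one acyclic graph of layers.
text_features = layers.LSTM(32)(layers.Embedding(10_000, 64)(text_input))
question_features = layers.LSTM(16)(layers.Embedding(10_000, 32)(question_input))

merged = layers.concatenate([text_features, question_features])
answer = layers.Dense(500, activation="softmax")(merged)

model = keras.Model([text_input, question_input], answer)
```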
- Advanced architecture patterns
- Residual connections (see the sketch below)
- Batch normalization
- Depthwise separable convolutions
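A sketch of a residual connection with the functional API: the block's input is added back to its output, so information and gradients can flow around the block (shapes are arbitrary):

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(32, 32, 64))
x = layers.Conv2D(64, 3, activation="relu", padding="same")(inputs)
x = layers.Conv2D(64, 3, padding="same")(x)
x = layers.add([x, inputs])  # residual shortcut
outputs = layers.Activation("relu")(x)
model = keras.Model(inputs, outputs)
```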
- Automatic hyperparameter optimization
- hyperopt
- hyperas
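A sketch of automatic hyperparameter search with hyperopt's TPE algorithm on a toy objective; in practice the objective would build, train, and evaluate a model and return its validation loss (the search space here is a made-up example):

```python
from hyperopt import fmin, tpe, hp

space = {
    "lr": hp.loguniform("lr", -10, 0),        # learning rate on a log scale
    "units": hp.choice("units", [16, 32, 64]),
}

def objective(params):
    # Stand-in for: build, train, evaluate a model; return validation loss.
    return (params["lr"] - 0.01) ** 2 + params["units"] * 1e-4

best = fmin(objective, space, algo=tpe.suggest, max_evals=50)
print(best)
```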
- Augmenting artistic creation
- LSTM: 2014-2016
- Text generation
- DeepDream
- Style transfer
- VAE: Variational autoencoder
- GAN: Generative adversarial network
- Generator vs. discriminator (see the sketch below)
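A minimal sketch of the generator/discriminator pairing: the generator maps random latent vectors to fake samples, and the discriminator classifies real vs. fake (the latent dimension and MNIST-like output shape are toy assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 32

generator = keras.Sequential([
    keras.Input(shape=(latent_dim,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(28 * 28, activation="sigmoid"),  # fake "image" pixel values
    layers.Reshape((28, 28, 1)),
])

discriminator = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),        # real vs. fake
])
discriminator.compile(optimizer="rmsprop", loss="binary_crossentropy")
```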