## Model
This is an implementation of a deep convolutional neural network model inspired by the paper by Springenberg, Dosovitskiy, Brox and Riedmiller (2014).
The model run script is included below (cifar10_allcnn.py).
The trained weights file can be downloaded from AWS (cifar10_allcnn_e350.p).
This model achieves 89.5% top-1 accuracy on the validation data set, using ZCA-whitened, global-contrast-normalized data without crops or flips. This matches the accuracy we obtain when running the same model configuration and data through Caffe.
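As a rough illustration of the preprocessing named above, here is a minimal NumPy sketch of global contrast normalization followed by ZCA whitening. It is not the code used to prepare the data for these weights: the function names, the `scale` and `eps` parameters, the order of the two steps, and the random stand-in data are all assumptions made for the example.

```python
import numpy as np


def global_contrast_normalize(X, scale=1.0, eps=1e-8):
    """Center each image (one row of X) and rescale it to a fixed RMS value."""
    X = X - X.mean(axis=1, keepdims=True)
    rms = np.sqrt((X ** 2).mean(axis=1, keepdims=True))
    return scale * X / np.maximum(rms, eps)


def fit_zca(X, eps=1e-5):
    """Return (mean, W) such that (X - mean) @ W is ZCA whitened."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / Xc.shape[0]
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eps keeps directions with (near-)zero variance from blowing up.
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return mean, W


# Hypothetical stand-in data: 200 flattened "images" with 48 correlated features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 48)) @ rng.normal(size=(48, 48))

X_gcn = global_contrast_normalize(X)
mean, W = fit_zca(X_gcn)          # fit on training data only
X_white = (X_gcn - mean) @ W      # reuse mean and W for validation data
```

In practice the mean and whitening matrix would be fitted on the training images and then applied unchanged to the validation images.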
This script was tested with the neon commit SHA e7ab2c2e2. Make sure that your local repo is synced to this commit and run the installation procedure before proceeding.
If neon is installed into a virtualenv, make sure that it is activated before running the commands below. Also, the commands below use the GPU backend by default, so add -b cpu if you are running on a system without a compatible GPU.
To test the model performance on the validation data set use the following command:
python cifar10_allcnn.py --model_file cifar10_allcnn_e350.p -eval 1
To train the model from scratch for 350 epochs, use the command:
python cifar10_allcnn.py -b gpu -e 350 -s cifar10_allcnn_trained.p
Additional options are available for features such as saving checkpoints and displaying logging information; use the --help option for details.
Machine and GPU specs:
Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz
Ubuntu 14.04.2 LTS
GPU: GeForce GTX TITAN X
CUDA Driver Version 7.0
The run times for the fprop and bprop passes are given in the table below. The same model configuration is used in neon and Caffe. In each framework, 50 iterations are timed and only the mean value is reported.
| Func      | neon (mean) | Caffe (mean) |
|-----------|-------------|--------------|
| fprop     | 14 ms       | 19 ms        |
| bprop     | 34 ms       | 65 ms        |
| update    | 3 ms        | *            |
| iteration | 51 ms       | 85 ms        |

\* The Caffe update operation may be included in the bprop or iteration time, but it is not timed individually.
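The sketch below shows one way to take a mean over 50 timed iterations and then derives the Caffe-to-neon ratios from the numbers in the table above. It is not the benchmark code used for these measurements; `mean_time_ms` is a hypothetical helper, and for GPU work a wall-clock timer like this is only meaningful if the timed call blocks until the device has finished.

```python
import time


def mean_time_ms(fn, iterations=50):
    """Run fn `iterations` times and return the mean wall-clock time in ms."""
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        total += time.perf_counter() - start
    return 1000.0 * total / iterations


# Mean times in ms, copied from the table above.
neon = {"fprop": 14, "bprop": 34, "update": 3, "iteration": 51}
caffe = {"fprop": 19, "bprop": 65, "iteration": 85}

# neon's three phases add up to its reported iteration time.
neon_sum = neon["fprop"] + neon["bprop"] + neon["update"]
print(neon_sum)  # 51

# Ratio of Caffe time to neon time for the phases timed in both frameworks.
speedup = {name: caffe[name] / neon[name] for name in caffe}
for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
# fprop: 1.36x
# bprop: 1.91x
# iteration: 1.67x
```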
Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox and Martin A. Riedmiller.
Striving for Simplicity: The All Convolutional Net.
arXiv preprint arXiv:1412.6806, 2014.
Could someone explain in more detail why we need to pad to the nearest power of 2 to match the conv output? I mean, why not set nclass=10 in the iterators,
train_set = ArrayIterator(X_train, y_train, nclass=10, lshape=(3, 32, 32))
valid_set = ArrayIterator(X_test, y_test, nclass=10, lshape=(3, 32, 32))
and replace Conv((1, 1, 16), *_conv) with Conv((1, 1, 10), *_conv)?
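To make the question concrete, here is a small NumPy sketch of what padding 10 classes to the nearest power of 2 looks like for one-hot targets: the 10 real classes occupy the first 10 columns and the remaining 6 columns stay zero. It only illustrates the arithmetic and does not explain why the script chooses it; `next_power_of_two` and `pad_one_hot` are hypothetical helpers, not neon functions.

```python
import numpy as np


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def pad_one_hot(labels, n_real_classes):
    """One-hot encode integer labels into a matrix padded to a power-of-two width."""
    width = next_power_of_two(n_real_classes)
    one_hot = np.zeros((len(labels), width), dtype=np.float32)
    one_hot[np.arange(len(labels)), labels] = 1.0
    return one_hot


labels = np.array([0, 3, 9])
targets = pad_one_hot(labels, n_real_classes=10)
print(next_power_of_two(10))  # 16
print(targets.shape)          # (3, 16)
```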