@juliensimon
Created September 3, 2017 14:05
CIFAR-10, Keras 1.2 and MXNet 0.11.0-rc3 on p2.8xlarge
ubuntu@ip-172-31-41-99:~/keras/examples$ time python cifar10_resnet50_mxnet.py
Using MXNet backend.
X_train shape: (50000, 3, 32, 32)
50000 train samples
10000 test samples
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_1 (InputLayer) (None, 3, 32, 32) 0
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D) (None, 64, 16, 16) 9472 input_1[0][0]
____________________________________________________________________________________________________
batchnormalization_1 (BatchNorma (None, 64, 16, 16) 256 convolution2d_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 64, 16, 16) 0 batchnormalization_1[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D) (None, 64, 7, 7) 0 activation_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D) (None, 64, 7, 7) 4160 maxpooling2d_1[0][0]
____________________________________________________________________________________________________
batchnormalization_2 (BatchNorma (None, 64, 7, 7) 256 convolution2d_2[0][0]
____________________________________________________________________________________________________
activation_2 (Activation) (None, 64, 7, 7) 0 batchnormalization_2[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D) (None, 64, 7, 7) 36928 activation_2[0][0]
____________________________________________________________________________________________________
batchnormalization_3 (BatchNorma (None, 64, 7, 7) 256 convolution2d_3[0][0]
____________________________________________________________________________________________________
activation_3 (Activation) (None, 64, 7, 7) 0 batchnormalization_3[0][0]
____________________________________________________________________________________________________
convolution2d_5 (Convolution2D) (None, 256L, 7, 7) 16640 maxpooling2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D) (None, 256, 7, 7) 16640 activation_3[0][0]
____________________________________________________________________________________________________
merge_1 (Merge) (None, 256L, 7, 7) 0 convolution2d_5[0][0]
convolution2d_4[0][0]
____________________________________________________________________________________________________
batchnormalization_4 (BatchNorma (None, 256L, 7, 7) 1024 merge_1[0][0]
____________________________________________________________________________________________________
activation_4 (Activation) (None, 256L, 7, 7) 0 batchnormalization_4[0][0]
____________________________________________________________________________________________________
convolution2d_6 (Convolution2D) (None, 64, 7, 7) 16448 activation_4[0][0]
____________________________________________________________________________________________________
batchnormalization_5 (BatchNorma (None, 64, 7, 7) 256 convolution2d_6[0][0]
____________________________________________________________________________________________________
activation_5 (Activation) (None, 64, 7, 7) 0 batchnormalization_5[0][0]
____________________________________________________________________________________________________
convolution2d_7 (Convolution2D) (None, 64, 7, 7) 36928 activation_5[0][0]
____________________________________________________________________________________________________
batchnormalization_6 (BatchNorma (None, 64, 7, 7) 256 convolution2d_7[0][0]
____________________________________________________________________________________________________
activation_6 (Activation) (None, 64, 7, 7) 0 batchnormalization_6[0][0]
____________________________________________________________________________________________________
convolution2d_8 (Convolution2D) (None, 256, 7, 7) 16640 activation_6[0][0]
____________________________________________________________________________________________________
merge_2 (Merge) (None, 256L, 7, 7) 0 merge_1[0][0]
convolution2d_8[0][0]
____________________________________________________________________________________________________
batchnormalization_7 (BatchNorma (None, 256L, 7, 7) 1024 merge_2[0][0]
____________________________________________________________________________________________________
activation_7 (Activation) (None, 256L, 7, 7) 0 batchnormalization_7[0][0]
____________________________________________________________________________________________________
convolution2d_9 (Convolution2D) (None, 64, 7, 7) 16448 activation_7[0][0]
____________________________________________________________________________________________________
batchnormalization_8 (BatchNorma (None, 64, 7, 7) 256 convolution2d_9[0][0]
____________________________________________________________________________________________________
activation_8 (Activation) (None, 64, 7, 7) 0 batchnormalization_8[0][0]
____________________________________________________________________________________________________
convolution2d_10 (Convolution2D) (None, 64, 7, 7) 36928 activation_8[0][0]
____________________________________________________________________________________________________
batchnormalization_9 (BatchNorma (None, 64, 7, 7) 256 convolution2d_10[0][0]
____________________________________________________________________________________________________
activation_9 (Activation) (None, 64, 7, 7) 0 batchnormalization_9[0][0]
____________________________________________________________________________________________________
convolution2d_11 (Convolution2D) (None, 256, 7, 7) 16640 activation_9[0][0]
____________________________________________________________________________________________________
merge_3 (Merge) (None, 256L, 7, 7) 0 merge_2[0][0]
convolution2d_11[0][0]
____________________________________________________________________________________________________
batchnormalization_10 (BatchNorm (None, 256L, 7, 7) 1024 merge_3[0][0]
____________________________________________________________________________________________________
activation_10 (Activation) (None, 256L, 7, 7) 0 batchnormalization_10[0][0]
____________________________________________________________________________________________________
convolution2d_12 (Convolution2D) (None, 128, 4, 4) 32896 activation_10[0][0]
____________________________________________________________________________________________________
batchnormalization_11 (BatchNorm (None, 128, 4, 4) 512 convolution2d_12[0][0]
____________________________________________________________________________________________________
activation_11 (Activation) (None, 128, 4, 4) 0 batchnormalization_11[0][0]
____________________________________________________________________________________________________
convolution2d_13 (Convolution2D) (None, 128, 4, 4) 147584 activation_11[0][0]
____________________________________________________________________________________________________
batchnormalization_12 (BatchNorm (None, 128, 4, 4) 512 convolution2d_13[0][0]
____________________________________________________________________________________________________
activation_12 (Activation) (None, 128, 4, 4) 0 batchnormalization_12[0][0]
____________________________________________________________________________________________________
convolution2d_15 (Convolution2D) (None, 512L, 4, 4) 131584 merge_3[0][0]
____________________________________________________________________________________________________
convolution2d_14 (Convolution2D) (None, 512, 4, 4) 66048 activation_12[0][0]
____________________________________________________________________________________________________
merge_4 (Merge) (None, 512L, 4, 4) 0 convolution2d_15[0][0]
convolution2d_14[0][0]
____________________________________________________________________________________________________
batchnormalization_13 (BatchNorm (None, 512L, 4, 4) 2048 merge_4[0][0]
____________________________________________________________________________________________________
activation_13 (Activation) (None, 512L, 4, 4) 0 batchnormalization_13[0][0]
____________________________________________________________________________________________________
convolution2d_16 (Convolution2D) (None, 128, 4, 4) 65664 activation_13[0][0]
____________________________________________________________________________________________________
batchnormalization_14 (BatchNorm (None, 128, 4, 4) 512 convolution2d_16[0][0]
____________________________________________________________________________________________________
activation_14 (Activation) (None, 128, 4, 4) 0 batchnormalization_14[0][0]
____________________________________________________________________________________________________
convolution2d_17 (Convolution2D) (None, 128, 4, 4) 147584 activation_14[0][0]
____________________________________________________________________________________________________
batchnormalization_15 (BatchNorm (None, 128, 4, 4) 512 convolution2d_17[0][0]
____________________________________________________________________________________________________
activation_15 (Activation) (None, 128, 4, 4) 0 batchnormalization_15[0][0]
____________________________________________________________________________________________________
convolution2d_18 (Convolution2D) (None, 512, 4, 4) 66048 activation_15[0][0]
____________________________________________________________________________________________________
merge_5 (Merge) (None, 512L, 4, 4) 0 merge_4[0][0]
convolution2d_18[0][0]
____________________________________________________________________________________________________
batchnormalization_16 (BatchNorm (None, 512L, 4, 4) 2048 merge_5[0][0]
____________________________________________________________________________________________________
activation_16 (Activation) (None, 512L, 4, 4) 0 batchnormalization_16[0][0]
____________________________________________________________________________________________________
convolution2d_19 (Convolution2D) (None, 128, 4, 4) 65664 activation_16[0][0]
____________________________________________________________________________________________________
batchnormalization_17 (BatchNorm (None, 128, 4, 4) 512 convolution2d_19[0][0]
____________________________________________________________________________________________________
activation_17 (Activation) (None, 128, 4, 4) 0 batchnormalization_17[0][0]
____________________________________________________________________________________________________
convolution2d_20 (Convolution2D) (None, 128, 4, 4) 147584 activation_17[0][0]
____________________________________________________________________________________________________
batchnormalization_18 (BatchNorm (None, 128, 4, 4) 512 convolution2d_20[0][0]
____________________________________________________________________________________________________
activation_18 (Activation) (None, 128, 4, 4) 0 batchnormalization_18[0][0]
____________________________________________________________________________________________________
convolution2d_21 (Convolution2D) (None, 512, 4, 4) 66048 activation_18[0][0]
____________________________________________________________________________________________________
merge_6 (Merge) (None, 512L, 4, 4) 0 merge_5[0][0]
convolution2d_21[0][0]
____________________________________________________________________________________________________
batchnormalization_19 (BatchNorm (None, 512L, 4, 4) 2048 merge_6[0][0]
____________________________________________________________________________________________________
activation_19 (Activation) (None, 512L, 4, 4) 0 batchnormalization_19[0][0]
____________________________________________________________________________________________________
convolution2d_22 (Convolution2D) (None, 128, 4, 4) 65664 activation_19[0][0]
____________________________________________________________________________________________________
batchnormalization_20 (BatchNorm (None, 128, 4, 4) 512 convolution2d_22[0][0]
____________________________________________________________________________________________________
activation_20 (Activation) (None, 128, 4, 4) 0 batchnormalization_20[0][0]
____________________________________________________________________________________________________
convolution2d_23 (Convolution2D) (None, 128, 4, 4) 147584 activation_20[0][0]
____________________________________________________________________________________________________
batchnormalization_21 (BatchNorm (None, 128, 4, 4) 512 convolution2d_23[0][0]
____________________________________________________________________________________________________
activation_21 (Activation) (None, 128, 4, 4) 0 batchnormalization_21[0][0]
____________________________________________________________________________________________________
convolution2d_24 (Convolution2D) (None, 512, 4, 4) 66048 activation_21[0][0]
____________________________________________________________________________________________________
merge_7 (Merge) (None, 512L, 4, 4) 0 merge_6[0][0]
convolution2d_24[0][0]
____________________________________________________________________________________________________
batchnormalization_22 (BatchNorm (None, 512L, 4, 4) 2048 merge_7[0][0]
____________________________________________________________________________________________________
activation_22 (Activation) (None, 512L, 4, 4) 0 batchnormalization_22[0][0]
____________________________________________________________________________________________________
convolution2d_25 (Convolution2D) (None, 256, 2, 2) 131328 activation_22[0][0]
____________________________________________________________________________________________________
batchnormalization_23 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_25[0][0]
____________________________________________________________________________________________________
activation_23 (Activation) (None, 256, 2, 2) 0 batchnormalization_23[0][0]
____________________________________________________________________________________________________
convolution2d_26 (Convolution2D) (None, 256, 2, 2) 590080 activation_23[0][0]
____________________________________________________________________________________________________
batchnormalization_24 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_26[0][0]
____________________________________________________________________________________________________
activation_24 (Activation) (None, 256, 2, 2) 0 batchnormalization_24[0][0]
____________________________________________________________________________________________________
convolution2d_28 (Convolution2D) (None, 1024L, 2, 2) 525312 merge_7[0][0]
____________________________________________________________________________________________________
convolution2d_27 (Convolution2D) (None, 1024, 2, 2) 263168 activation_24[0][0]
____________________________________________________________________________________________________
merge_8 (Merge) (None, 1024L, 2, 2) 0 convolution2d_28[0][0]
convolution2d_27[0][0]
____________________________________________________________________________________________________
batchnormalization_25 (BatchNorm (None, 1024L, 2, 2) 4096 merge_8[0][0]
____________________________________________________________________________________________________
activation_25 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_25[0][0]
____________________________________________________________________________________________________
convolution2d_29 (Convolution2D) (None, 256, 2, 2) 262400 activation_25[0][0]
____________________________________________________________________________________________________
batchnormalization_26 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_29[0][0]
____________________________________________________________________________________________________
activation_26 (Activation) (None, 256, 2, 2) 0 batchnormalization_26[0][0]
____________________________________________________________________________________________________
convolution2d_30 (Convolution2D) (None, 256, 2, 2) 590080 activation_26[0][0]
____________________________________________________________________________________________________
batchnormalization_27 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_30[0][0]
____________________________________________________________________________________________________
activation_27 (Activation) (None, 256, 2, 2) 0 batchnormalization_27[0][0]
____________________________________________________________________________________________________
convolution2d_31 (Convolution2D) (None, 1024, 2, 2) 263168 activation_27[0][0]
____________________________________________________________________________________________________
merge_9 (Merge) (None, 1024L, 2, 2) 0 merge_8[0][0]
convolution2d_31[0][0]
____________________________________________________________________________________________________
batchnormalization_28 (BatchNorm (None, 1024L, 2, 2) 4096 merge_9[0][0]
____________________________________________________________________________________________________
activation_28 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_28[0][0]
____________________________________________________________________________________________________
convolution2d_32 (Convolution2D) (None, 256, 2, 2) 262400 activation_28[0][0]
____________________________________________________________________________________________________
batchnormalization_29 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_32[0][0]
____________________________________________________________________________________________________
activation_29 (Activation) (None, 256, 2, 2) 0 batchnormalization_29[0][0]
____________________________________________________________________________________________________
convolution2d_33 (Convolution2D) (None, 256, 2, 2) 590080 activation_29[0][0]
____________________________________________________________________________________________________
batchnormalization_30 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_33[0][0]
____________________________________________________________________________________________________
activation_30 (Activation) (None, 256, 2, 2) 0 batchnormalization_30[0][0]
____________________________________________________________________________________________________
convolution2d_34 (Convolution2D) (None, 1024, 2, 2) 263168 activation_30[0][0]
____________________________________________________________________________________________________
merge_10 (Merge) (None, 1024L, 2, 2) 0 merge_9[0][0]
convolution2d_34[0][0]
____________________________________________________________________________________________________
batchnormalization_31 (BatchNorm (None, 1024L, 2, 2) 4096 merge_10[0][0]
____________________________________________________________________________________________________
activation_31 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_31[0][0]
____________________________________________________________________________________________________
convolution2d_35 (Convolution2D) (None, 256, 2, 2) 262400 activation_31[0][0]
____________________________________________________________________________________________________
batchnormalization_32 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_35[0][0]
____________________________________________________________________________________________________
activation_32 (Activation) (None, 256, 2, 2) 0 batchnormalization_32[0][0]
____________________________________________________________________________________________________
convolution2d_36 (Convolution2D) (None, 256, 2, 2) 590080 activation_32[0][0]
____________________________________________________________________________________________________
batchnormalization_33 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_36[0][0]
____________________________________________________________________________________________________
activation_33 (Activation) (None, 256, 2, 2) 0 batchnormalization_33[0][0]
____________________________________________________________________________________________________
convolution2d_37 (Convolution2D) (None, 1024, 2, 2) 263168 activation_33[0][0]
____________________________________________________________________________________________________
merge_11 (Merge) (None, 1024L, 2, 2) 0 merge_10[0][0]
convolution2d_37[0][0]
____________________________________________________________________________________________________
batchnormalization_34 (BatchNorm (None, 1024L, 2, 2) 4096 merge_11[0][0]
____________________________________________________________________________________________________
activation_34 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_34[0][0]
____________________________________________________________________________________________________
convolution2d_38 (Convolution2D) (None, 256, 2, 2) 262400 activation_34[0][0]
____________________________________________________________________________________________________
batchnormalization_35 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_38[0][0]
____________________________________________________________________________________________________
activation_35 (Activation) (None, 256, 2, 2) 0 batchnormalization_35[0][0]
____________________________________________________________________________________________________
convolution2d_39 (Convolution2D) (None, 256, 2, 2) 590080 activation_35[0][0]
____________________________________________________________________________________________________
batchnormalization_36 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_39[0][0]
____________________________________________________________________________________________________
activation_36 (Activation) (None, 256, 2, 2) 0 batchnormalization_36[0][0]
____________________________________________________________________________________________________
convolution2d_40 (Convolution2D) (None, 1024, 2, 2) 263168 activation_36[0][0]
____________________________________________________________________________________________________
merge_12 (Merge) (None, 1024L, 2, 2) 0 merge_11[0][0]
convolution2d_40[0][0]
____________________________________________________________________________________________________
batchnormalization_37 (BatchNorm (None, 1024L, 2, 2) 4096 merge_12[0][0]
____________________________________________________________________________________________________
activation_37 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_37[0][0]
____________________________________________________________________________________________________
convolution2d_41 (Convolution2D) (None, 256, 2, 2) 262400 activation_37[0][0]
____________________________________________________________________________________________________
batchnormalization_38 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_41[0][0]
____________________________________________________________________________________________________
activation_38 (Activation) (None, 256, 2, 2) 0 batchnormalization_38[0][0]
____________________________________________________________________________________________________
convolution2d_42 (Convolution2D) (None, 256, 2, 2) 590080 activation_38[0][0]
____________________________________________________________________________________________________
batchnormalization_39 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_42[0][0]
____________________________________________________________________________________________________
activation_39 (Activation) (None, 256, 2, 2) 0 batchnormalization_39[0][0]
____________________________________________________________________________________________________
convolution2d_43 (Convolution2D) (None, 1024, 2, 2) 263168 activation_39[0][0]
____________________________________________________________________________________________________
merge_13 (Merge) (None, 1024L, 2, 2) 0 merge_12[0][0]
convolution2d_43[0][0]
____________________________________________________________________________________________________
batchnormalization_40 (BatchNorm (None, 1024L, 2, 2) 4096 merge_13[0][0]
____________________________________________________________________________________________________
activation_40 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_40[0][0]
____________________________________________________________________________________________________
convolution2d_44 (Convolution2D) (None, 512, 1, 1) 524800 activation_40[0][0]
____________________________________________________________________________________________________
batchnormalization_41 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_44[0][0]
____________________________________________________________________________________________________
activation_41 (Activation) (None, 512, 1, 1) 0 batchnormalization_41[0][0]
____________________________________________________________________________________________________
convolution2d_45 (Convolution2D) (None, 512, 1, 1) 2359808 activation_41[0][0]
____________________________________________________________________________________________________
batchnormalization_42 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_45[0][0]
____________________________________________________________________________________________________
activation_42 (Activation) (None, 512, 1, 1) 0 batchnormalization_42[0][0]
____________________________________________________________________________________________________
convolution2d_47 (Convolution2D) (None, 2048L, 1, 1) 2099200 merge_13[0][0]
____________________________________________________________________________________________________
convolution2d_46 (Convolution2D) (None, 2048, 1, 1) 1050624 activation_42[0][0]
____________________________________________________________________________________________________
merge_14 (Merge) (None, 2048L, 1, 1) 0 convolution2d_47[0][0]
convolution2d_46[0][0]
____________________________________________________________________________________________________
batchnormalization_43 (BatchNorm (None, 2048L, 1, 1) 8192 merge_14[0][0]
____________________________________________________________________________________________________
activation_43 (Activation) (None, 2048L, 1, 1) 0 batchnormalization_43[0][0]
____________________________________________________________________________________________________
convolution2d_48 (Convolution2D) (None, 512, 1, 1) 1049088 activation_43[0][0]
____________________________________________________________________________________________________
batchnormalization_44 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_48[0][0]
____________________________________________________________________________________________________
activation_44 (Activation) (None, 512, 1, 1) 0 batchnormalization_44[0][0]
____________________________________________________________________________________________________
convolution2d_49 (Convolution2D) (None, 512, 1, 1) 2359808 activation_44[0][0]
____________________________________________________________________________________________________
batchnormalization_45 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_49[0][0]
____________________________________________________________________________________________________
activation_45 (Activation) (None, 512, 1, 1) 0 batchnormalization_45[0][0]
____________________________________________________________________________________________________
convolution2d_50 (Convolution2D) (None, 2048, 1, 1) 1050624 activation_45[0][0]
____________________________________________________________________________________________________
merge_15 (Merge) (None, 2048L, 1, 1) 0 merge_14[0][0]
convolution2d_50[0][0]
____________________________________________________________________________________________________
batchnormalization_46 (BatchNorm (None, 2048L, 1, 1) 8192 merge_15[0][0]
____________________________________________________________________________________________________
activation_46 (Activation) (None, 2048L, 1, 1) 0 batchnormalization_46[0][0]
____________________________________________________________________________________________________
convolution2d_51 (Convolution2D) (None, 512, 1, 1) 1049088 activation_46[0][0]
____________________________________________________________________________________________________
batchnormalization_47 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_51[0][0]
____________________________________________________________________________________________________
activation_47 (Activation) (None, 512, 1, 1) 0 batchnormalization_47[0][0]
____________________________________________________________________________________________________
convolution2d_52 (Convolution2D) (None, 512, 1, 1) 2359808 activation_47[0][0]
____________________________________________________________________________________________________
batchnormalization_48 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_52[0][0]
____________________________________________________________________________________________________
activation_48 (Activation) (None, 512, 1, 1) 0 batchnormalization_48[0][0]
____________________________________________________________________________________________________
convolution2d_53 (Convolution2D) (None, 2048, 1, 1) 1050624 activation_48[0][0]
____________________________________________________________________________________________________
merge_16 (Merge) (None, 2048L, 1, 1) 0 merge_15[0][0]
convolution2d_53[0][0]
____________________________________________________________________________________________________
batchnormalization_49 (BatchNorm (None, 2048L, 1, 1) 8192 merge_16[0][0]
____________________________________________________________________________________________________
activation_49 (Activation) (None, 2048L, 1, 1) 0 batchnormalization_49[0][0]
____________________________________________________________________________________________________
averagepooling2d_1 (AveragePooli (None, 2048L, 1L, 1L) 0 activation_49[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten) (None, 2048) 0 averagepooling2d_1[0][0]
____________________________________________________________________________________________________
dense_1 (Dense) (None, 10) 20490 flatten_1[0][0]
====================================================================================================
Total params: 23,592,842
Trainable params: 23,547,402
Non-trainable params: 45,440
____________________________________________________________________________________________________
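The parameter counts in the summary above can be checked by hand. The sketch below is a minimal check, not code from the script: the helper names are hypothetical, and the kernel sizes (7x7 for convolution2d_1, 1x1 for convolution2d_2, 3x3 for convolution2d_3) are assumptions taken from the standard ResNet50 layout. They are consistent with the counts printed in the log.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels):
    """Kernel weights plus one bias per output channel."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels


def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance: four values per channel."""
    return 4 * channels


def dense_params(in_features, out_features):
    """Weight matrix plus one bias per output unit."""
    return (in_features + 1) * out_features


# Kernel sizes below are assumed (standard ResNet50), not read from the log.
print(conv2d_params(7, 7, 3, 64))    # convolution2d_1      -> 9472
print(conv2d_params(1, 1, 64, 64))   # convolution2d_2      -> 4160
print(conv2d_params(3, 3, 64, 64))   # convolution2d_3      -> 36928
print(batchnorm_params(64))          # batchnormalization_1 -> 256
print(dense_params(2048, 10))        # dense_1              -> 20490

# Total minus trainable parameters gives the non-trainable count in the log.
print(23592842 - 23547402)           # -> 45440
```

The non-trainable figure is consistent with batch normalization keeping two non-trained values per channel (moving mean and moving variance), which is half of each BatchNormalization row in the table.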
Using real-time data augmentation.
Epoch 1/100
[13:18:23] src/operator/././cudnn_algoreg-inl.h:112: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
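The log line above names the environment variable that disables the cuDNN convolution autotuning pass. A minimal sketch of setting it from Python follows. Placing the assignment at the top of the script, before MXNet or Keras is imported, is an assumption about the safest ordering and is not something the log states.

```python
import os

# Disable the cuDNN autotune performance tests mentioned in the log message.
# Assumption: set this before importing mxnet or keras so that the value is
# already in the environment when the backend creates its convolution operators.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"

print(os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"])
```

Exporting the variable in the shell before launching the script should have the same effect.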
"""
Objective
=========
Trains a Resnet50 network on CIFAR10 small image dataset.
Benchmarks
==========
MXNet Backend
=============
Takes a lot of time on a CPU machine; running on a GPU machine is recommended.
Increase the number of epochs up to 200 for accuracy > 0.94.
Epochs - 10
Batchsize - 32 per GPU.
Example
1 GPU -> Batchsize = 32
2 GPU -> Batchsize = 64
4 GPU -> Batchsize = 128
8 GPU -> Batchsize = 256
16 GPU -> Batchsize = 512
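The table above scales the total batch size linearly with the number of GPUs. A minimal sketch of that rule, where `global_batch_size` is a hypothetical helper name and not part of the script:

```python
PER_GPU_BATCH_SIZE = 32  # "Batchsize - 32 per GPU" from the docstring above


def global_batch_size(num_gpus, per_gpu=PER_GPU_BATCH_SIZE):
    """Return the total batch size across num_gpus devices (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return per_gpu * num_gpus


# Reproduce the table from the docstring.
for n in (1, 2, 4, 8, 16):
    print("%d GPU -> Batchsize = %d" % (n, global_batch_size(n)))
```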
1 GPU
=====
Average time per epoch: 160.93
Train Accuracy: 0.69
Test Accuracy: 0.67
/usr/local/lib/python2.7/dist-packages/mxnet/module/bucketing_module.py:385: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.00390625). Is this intended?
force_init=force_init)
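The warning above compares the optimizer's `rescale_grad` (1.0) with the value MXNet expected, `1.0/batch_size/num_workers` (0.00390625). The arithmetic can be sketched as follows, assuming a single worker; `expected_rescale_grad` is a hypothetical helper, not an MXNet function.

```python
def expected_rescale_grad(batch_size, num_workers=1):
    """Formula quoted in the warning: 1.0 / batch_size / num_workers."""
    return 1.0 / batch_size / num_workers


expected = 0.00390625  # value printed in the warning

# Invert the formula, assuming num_workers == 1.
implied_batch_size = int(round(1.0 / expected))

# At 32 samples per GPU (from the script's docstring), this implies 8 GPUs.
implied_gpus = implied_batch_size // 32

print(implied_batch_size, implied_gpus)
```

Under that assumption the warning implies a total batch size of 256, which matches the 8 GPU row of the batch-size table in the script's docstring.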
^CTraceback (most recent call last):
File "cifar10_resnet50_mxnet.py", line 400, in <module>
validation_data=(X_test, Y_test))
File "/usr/local/lib/python2.7/dist-packages/Keras-1.2.2-py2.7.egg/keras/engine/training.py", line 1559, in fit_generator
class_weight=class_weight)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.2.2-py2.7.egg/keras/engine/training.py", line 1322, in train_on_batch
outputs = self.train_function(ins)
File "/usr/local/lib/python2.7/dist-packages/Keras-1.2.2-py2.7.egg/keras/engine/training.py", line 1954, in train_function
data, label, _, data_shapes, label_shapes = self._adjust_module(inputs, 'train')
File "/usr/local/lib/python2.7/dist-packages/Keras-1.2.2-py2.7.egg/keras/engine/training.py", line 1914, in _adjust_module
self._mod.switch_bucket(phase, data_shapes, label_shapes)
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/bucketing_module.py", line 355, in switch_bucket
force_rebind=False, shared_module=self._buckets[self._default_bucket_key])
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/module.py", line 417, in bind
state_names=self._state_names)
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 231, in __init__
self.bind_exec(data_shapes, label_shapes, shared_group)
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 327, in bind_exec
shared_group))
File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 603, in _bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "/usr/local/lib/python2.7/dist-packages/mxnet/symbol.py", line 1473, in simple_bind
ctypes.byref(exe_handle)))
KeyboardInterrupt
^C^C^C^C
real 0m18.201s
user 0m10.952s
sys 0m10.764s
ubuntu@ip-172-31-41-99:~/keras/examples$ vi cifar10_resnet50_mxnet.py
ubuntu@ip-172-31-41-99:~/keras/examples$ time python cifar10_resnet50_mxnet.py
Using MXNet backend.
X_train shape: (50000, 3, 32, 32)
50000 train samples
10000 test samples
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_1 (InputLayer) (None, 3, 32, 32) 0
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D) (None, 64, 16, 16) 9472 input_1[0][0]
____________________________________________________________________________________________________
batchnormalization_1 (BatchNorma (None, 64, 16, 16) 256 convolution2d_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 64, 16, 16) 0 batchnormalization_1[0][0]
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D) (None, 64, 7, 7) 0 activation_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D) (None, 64, 7, 7) 4160 maxpooling2d_1[0][0]
____________________________________________________________________________________________________
batchnormalization_2 (BatchNorma (None, 64, 7, 7) 256 convolution2d_2[0][0]
____________________________________________________________________________________________________
activation_2 (Activation) (None, 64, 7, 7) 0 batchnormalization_2[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D) (None, 64, 7, 7) 36928 activation_2[0][0]
____________________________________________________________________________________________________
batchnormalization_3 (BatchNorma (None, 64, 7, 7) 256 convolution2d_3[0][0]
____________________________________________________________________________________________________
activation_3 (Activation) (None, 64, 7, 7) 0 batchnormalization_3[0][0]
____________________________________________________________________________________________________
convolution2d_5 (Convolution2D) (None, 256L, 7, 7) 16640 maxpooling2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D) (None, 256, 7, 7) 16640 activation_3[0][0]
____________________________________________________________________________________________________
merge_1 (Merge) (None, 256L, 7, 7) 0 convolution2d_5[0][0]
convolution2d_4[0][0]
____________________________________________________________________________________________________
batchnormalization_4 (BatchNorma (None, 256L, 7, 7) 1024 merge_1[0][0]
____________________________________________________________________________________________________
activation_4 (Activation) (None, 256L, 7, 7) 0 batchnormalization_4[0][0]
____________________________________________________________________________________________________
convolution2d_6 (Convolution2D) (None, 64, 7, 7) 16448 activation_4[0][0]
____________________________________________________________________________________________________
batchnormalization_5 (BatchNorma (None, 64, 7, 7) 256 convolution2d_6[0][0]
____________________________________________________________________________________________________
activation_5 (Activation) (None, 64, 7, 7) 0 batchnormalization_5[0][0]
____________________________________________________________________________________________________
convolution2d_7 (Convolution2D) (None, 64, 7, 7) 36928 activation_5[0][0]
____________________________________________________________________________________________________
batchnormalization_6 (BatchNorma (None, 64, 7, 7) 256 convolution2d_7[0][0]
____________________________________________________________________________________________________
activation_6 (Activation) (None, 64, 7, 7) 0 batchnormalization_6[0][0]
____________________________________________________________________________________________________
convolution2d_8 (Convolution2D) (None, 256, 7, 7) 16640 activation_6[0][0]
____________________________________________________________________________________________________
merge_2 (Merge) (None, 256L, 7, 7) 0 merge_1[0][0]
convolution2d_8[0][0]
____________________________________________________________________________________________________
batchnormalization_7 (BatchNorma (None, 256L, 7, 7) 1024 merge_2[0][0]
____________________________________________________________________________________________________
activation_7 (Activation) (None, 256L, 7, 7) 0 batchnormalization_7[0][0]
____________________________________________________________________________________________________
convolution2d_9 (Convolution2D) (None, 64, 7, 7) 16448 activation_7[0][0]
____________________________________________________________________________________________________
batchnormalization_8 (BatchNorma (None, 64, 7, 7) 256 convolution2d_9[0][0]
____________________________________________________________________________________________________
activation_8 (Activation) (None, 64, 7, 7) 0 batchnormalization_8[0][0]
____________________________________________________________________________________________________
convolution2d_10 (Convolution2D) (None, 64, 7, 7) 36928 activation_8[0][0]
____________________________________________________________________________________________________
batchnormalization_9 (BatchNorma (None, 64, 7, 7) 256 convolution2d_10[0][0]
____________________________________________________________________________________________________
activation_9 (Activation) (None, 64, 7, 7) 0 batchnormalization_9[0][0]
____________________________________________________________________________________________________
convolution2d_11 (Convolution2D) (None, 256, 7, 7) 16640 activation_9[0][0]
____________________________________________________________________________________________________
merge_3 (Merge) (None, 256L, 7, 7) 0 merge_2[0][0]
convolution2d_11[0][0]
____________________________________________________________________________________________________
batchnormalization_10 (BatchNorm (None, 256L, 7, 7) 1024 merge_3[0][0]
____________________________________________________________________________________________________
activation_10 (Activation) (None, 256L, 7, 7) 0 batchnormalization_10[0][0]
____________________________________________________________________________________________________
convolution2d_12 (Convolution2D) (None, 128, 4, 4) 32896 activation_10[0][0]
____________________________________________________________________________________________________
batchnormalization_11 (BatchNorm (None, 128, 4, 4) 512 convolution2d_12[0][0]
____________________________________________________________________________________________________
activation_11 (Activation) (None, 128, 4, 4) 0 batchnormalization_11[0][0]
____________________________________________________________________________________________________
convolution2d_13 (Convolution2D) (None, 128, 4, 4) 147584 activation_11[0][0]
____________________________________________________________________________________________________
batchnormalization_12 (BatchNorm (None, 128, 4, 4) 512 convolution2d_13[0][0]
____________________________________________________________________________________________________
activation_12 (Activation) (None, 128, 4, 4) 0 batchnormalization_12[0][0]
____________________________________________________________________________________________________
convolution2d_15 (Convolution2D) (None, 512L, 4, 4) 131584 merge_3[0][0]
____________________________________________________________________________________________________
convolution2d_14 (Convolution2D) (None, 512, 4, 4) 66048 activation_12[0][0]
____________________________________________________________________________________________________
merge_4 (Merge) (None, 512L, 4, 4) 0 convolution2d_15[0][0]
convolution2d_14[0][0]
____________________________________________________________________________________________________
batchnormalization_13 (BatchNorm (None, 512L, 4, 4) 2048 merge_4[0][0]
____________________________________________________________________________________________________
activation_13 (Activation) (None, 512L, 4, 4) 0 batchnormalization_13[0][0]
____________________________________________________________________________________________________
convolution2d_16 (Convolution2D) (None, 128, 4, 4) 65664 activation_13[0][0]
____________________________________________________________________________________________________
batchnormalization_14 (BatchNorm (None, 128, 4, 4) 512 convolution2d_16[0][0]
____________________________________________________________________________________________________
activation_14 (Activation) (None, 128, 4, 4) 0 batchnormalization_14[0][0]
____________________________________________________________________________________________________
convolution2d_17 (Convolution2D) (None, 128, 4, 4) 147584 activation_14[0][0]
____________________________________________________________________________________________________
batchnormalization_15 (BatchNorm (None, 128, 4, 4) 512 convolution2d_17[0][0]
____________________________________________________________________________________________________
activation_15 (Activation) (None, 128, 4, 4) 0 batchnormalization_15[0][0]
____________________________________________________________________________________________________
convolution2d_18 (Convolution2D) (None, 512, 4, 4) 66048 activation_15[0][0]
____________________________________________________________________________________________________
merge_5 (Merge) (None, 512L, 4, 4) 0 merge_4[0][0]
convolution2d_18[0][0]
____________________________________________________________________________________________________
batchnormalization_16 (BatchNorm (None, 512L, 4, 4) 2048 merge_5[0][0]
____________________________________________________________________________________________________
activation_16 (Activation) (None, 512L, 4, 4) 0 batchnormalization_16[0][0]
____________________________________________________________________________________________________
convolution2d_19 (Convolution2D) (None, 128, 4, 4) 65664 activation_16[0][0]
____________________________________________________________________________________________________
batchnormalization_17 (BatchNorm (None, 128, 4, 4) 512 convolution2d_19[0][0]
____________________________________________________________________________________________________
activation_17 (Activation) (None, 128, 4, 4) 0 batchnormalization_17[0][0]
____________________________________________________________________________________________________
convolution2d_20 (Convolution2D) (None, 128, 4, 4) 147584 activation_17[0][0]
____________________________________________________________________________________________________
batchnormalization_18 (BatchNorm (None, 128, 4, 4) 512 convolution2d_20[0][0]
____________________________________________________________________________________________________
activation_18 (Activation) (None, 128, 4, 4) 0 batchnormalization_18[0][0]
____________________________________________________________________________________________________
convolution2d_21 (Convolution2D) (None, 512, 4, 4) 66048 activation_18[0][0]
____________________________________________________________________________________________________
merge_6 (Merge) (None, 512L, 4, 4) 0 merge_5[0][0]
convolution2d_21[0][0]
____________________________________________________________________________________________________
batchnormalization_19 (BatchNorm (None, 512L, 4, 4) 2048 merge_6[0][0]
____________________________________________________________________________________________________
activation_19 (Activation) (None, 512L, 4, 4) 0 batchnormalization_19[0][0]
____________________________________________________________________________________________________
convolution2d_22 (Convolution2D) (None, 128, 4, 4) 65664 activation_19[0][0]
____________________________________________________________________________________________________
batchnormalization_20 (BatchNorm (None, 128, 4, 4) 512 convolution2d_22[0][0]
____________________________________________________________________________________________________
activation_20 (Activation) (None, 128, 4, 4) 0 batchnormalization_20[0][0]
____________________________________________________________________________________________________
convolution2d_23 (Convolution2D) (None, 128, 4, 4) 147584 activation_20[0][0]
____________________________________________________________________________________________________
batchnormalization_21 (BatchNorm (None, 128, 4, 4) 512 convolution2d_23[0][0]
____________________________________________________________________________________________________
activation_21 (Activation) (None, 128, 4, 4) 0 batchnormalization_21[0][0]
____________________________________________________________________________________________________
convolution2d_24 (Convolution2D) (None, 512, 4, 4) 66048 activation_21[0][0]
____________________________________________________________________________________________________
merge_7 (Merge) (None, 512L, 4, 4) 0 merge_6[0][0]
convolution2d_24[0][0]
____________________________________________________________________________________________________
batchnormalization_22 (BatchNorm (None, 512L, 4, 4) 2048 merge_7[0][0]
____________________________________________________________________________________________________
activation_22 (Activation) (None, 512L, 4, 4) 0 batchnormalization_22[0][0]
____________________________________________________________________________________________________
convolution2d_25 (Convolution2D) (None, 256, 2, 2) 131328 activation_22[0][0]
____________________________________________________________________________________________________
batchnormalization_23 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_25[0][0]
____________________________________________________________________________________________________
activation_23 (Activation) (None, 256, 2, 2) 0 batchnormalization_23[0][0]
____________________________________________________________________________________________________
convolution2d_26 (Convolution2D) (None, 256, 2, 2) 590080 activation_23[0][0]
____________________________________________________________________________________________________
batchnormalization_24 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_26[0][0]
____________________________________________________________________________________________________
activation_24 (Activation) (None, 256, 2, 2) 0 batchnormalization_24[0][0]
____________________________________________________________________________________________________
convolution2d_28 (Convolution2D) (None, 1024L, 2, 2) 525312 merge_7[0][0]
____________________________________________________________________________________________________
convolution2d_27 (Convolution2D) (None, 1024, 2, 2) 263168 activation_24[0][0]
____________________________________________________________________________________________________
merge_8 (Merge) (None, 1024L, 2, 2) 0 convolution2d_28[0][0]
convolution2d_27[0][0]
____________________________________________________________________________________________________
batchnormalization_25 (BatchNorm (None, 1024L, 2, 2) 4096 merge_8[0][0]
____________________________________________________________________________________________________
activation_25 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_25[0][0]
____________________________________________________________________________________________________
convolution2d_29 (Convolution2D) (None, 256, 2, 2) 262400 activation_25[0][0]
____________________________________________________________________________________________________
batchnormalization_26 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_29[0][0]
____________________________________________________________________________________________________
activation_26 (Activation) (None, 256, 2, 2) 0 batchnormalization_26[0][0]
____________________________________________________________________________________________________
convolution2d_30 (Convolution2D) (None, 256, 2, 2) 590080 activation_26[0][0]
____________________________________________________________________________________________________
batchnormalization_27 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_30[0][0]
____________________________________________________________________________________________________
activation_27 (Activation) (None, 256, 2, 2) 0 batchnormalization_27[0][0]
____________________________________________________________________________________________________
convolution2d_31 (Convolution2D) (None, 1024, 2, 2) 263168 activation_27[0][0]
____________________________________________________________________________________________________
merge_9 (Merge) (None, 1024L, 2, 2) 0 merge_8[0][0]
convolution2d_31[0][0]
____________________________________________________________________________________________________
batchnormalization_28 (BatchNorm (None, 1024L, 2, 2) 4096 merge_9[0][0]
____________________________________________________________________________________________________
activation_28 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_28[0][0]
____________________________________________________________________________________________________
convolution2d_32 (Convolution2D) (None, 256, 2, 2) 262400 activation_28[0][0]
____________________________________________________________________________________________________
batchnormalization_29 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_32[0][0]
____________________________________________________________________________________________________
activation_29 (Activation) (None, 256, 2, 2) 0 batchnormalization_29[0][0]
____________________________________________________________________________________________________
convolution2d_33 (Convolution2D) (None, 256, 2, 2) 590080 activation_29[0][0]
____________________________________________________________________________________________________
batchnormalization_30 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_33[0][0]
____________________________________________________________________________________________________
activation_30 (Activation) (None, 256, 2, 2) 0 batchnormalization_30[0][0]
____________________________________________________________________________________________________
convolution2d_34 (Convolution2D) (None, 1024, 2, 2) 263168 activation_30[0][0]
____________________________________________________________________________________________________
merge_10 (Merge) (None, 1024L, 2, 2) 0 merge_9[0][0]
convolution2d_34[0][0]
____________________________________________________________________________________________________
batchnormalization_31 (BatchNorm (None, 1024L, 2, 2) 4096 merge_10[0][0]
____________________________________________________________________________________________________
activation_31 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_31[0][0]
____________________________________________________________________________________________________
convolution2d_35 (Convolution2D) (None, 256, 2, 2) 262400 activation_31[0][0]
____________________________________________________________________________________________________
batchnormalization_32 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_35[0][0]
____________________________________________________________________________________________________
activation_32 (Activation) (None, 256, 2, 2) 0 batchnormalization_32[0][0]
____________________________________________________________________________________________________
convolution2d_36 (Convolution2D) (None, 256, 2, 2) 590080 activation_32[0][0]
____________________________________________________________________________________________________
batchnormalization_33 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_36[0][0]
____________________________________________________________________________________________________
activation_33 (Activation) (None, 256, 2, 2) 0 batchnormalization_33[0][0]
____________________________________________________________________________________________________
convolution2d_37 (Convolution2D) (None, 1024, 2, 2) 263168 activation_33[0][0]
____________________________________________________________________________________________________
merge_11 (Merge) (None, 1024L, 2, 2) 0 merge_10[0][0]
convolution2d_37[0][0]
____________________________________________________________________________________________________
batchnormalization_34 (BatchNorm (None, 1024L, 2, 2) 4096 merge_11[0][0]
____________________________________________________________________________________________________
activation_34 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_34[0][0]
____________________________________________________________________________________________________
convolution2d_38 (Convolution2D) (None, 256, 2, 2) 262400 activation_34[0][0]
____________________________________________________________________________________________________
batchnormalization_35 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_38[0][0]
____________________________________________________________________________________________________
activation_35 (Activation) (None, 256, 2, 2) 0 batchnormalization_35[0][0]
____________________________________________________________________________________________________
convolution2d_39 (Convolution2D) (None, 256, 2, 2) 590080 activation_35[0][0]
____________________________________________________________________________________________________
batchnormalization_36 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_39[0][0]
____________________________________________________________________________________________________
activation_36 (Activation) (None, 256, 2, 2) 0 batchnormalization_36[0][0]
____________________________________________________________________________________________________
convolution2d_40 (Convolution2D) (None, 1024, 2, 2) 263168 activation_36[0][0]
____________________________________________________________________________________________________
merge_12 (Merge) (None, 1024L, 2, 2) 0 merge_11[0][0]
convolution2d_40[0][0]
____________________________________________________________________________________________________
batchnormalization_37 (BatchNorm (None, 1024L, 2, 2) 4096 merge_12[0][0]
____________________________________________________________________________________________________
activation_37 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_37[0][0]
____________________________________________________________________________________________________
convolution2d_41 (Convolution2D) (None, 256, 2, 2) 262400 activation_37[0][0]
____________________________________________________________________________________________________
batchnormalization_38 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_41[0][0]
____________________________________________________________________________________________________
activation_38 (Activation) (None, 256, 2, 2) 0 batchnormalization_38[0][0]
____________________________________________________________________________________________________
convolution2d_42 (Convolution2D) (None, 256, 2, 2) 590080 activation_38[0][0]
____________________________________________________________________________________________________
batchnormalization_39 (BatchNorm (None, 256, 2, 2) 1024 convolution2d_42[0][0]
____________________________________________________________________________________________________
activation_39 (Activation) (None, 256, 2, 2) 0 batchnormalization_39[0][0]
____________________________________________________________________________________________________
convolution2d_43 (Convolution2D) (None, 1024, 2, 2) 263168 activation_39[0][0]
____________________________________________________________________________________________________
merge_13 (Merge) (None, 1024L, 2, 2) 0 merge_12[0][0]
convolution2d_43[0][0]
____________________________________________________________________________________________________
batchnormalization_40 (BatchNorm (None, 1024L, 2, 2) 4096 merge_13[0][0]
____________________________________________________________________________________________________
activation_40 (Activation) (None, 1024L, 2, 2) 0 batchnormalization_40[0][0]
____________________________________________________________________________________________________
convolution2d_44 (Convolution2D) (None, 512, 1, 1) 524800 activation_40[0][0]
____________________________________________________________________________________________________
batchnormalization_41 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_44[0][0]
____________________________________________________________________________________________________
activation_41 (Activation) (None, 512, 1, 1) 0 batchnormalization_41[0][0]
____________________________________________________________________________________________________
convolution2d_45 (Convolution2D) (None, 512, 1, 1) 2359808 activation_41[0][0]
____________________________________________________________________________________________________
batchnormalization_42 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_45[0][0]
____________________________________________________________________________________________________
activation_42 (Activation) (None, 512, 1, 1) 0 batchnormalization_42[0][0]
____________________________________________________________________________________________________
convolution2d_47 (Convolution2D) (None, 2048L, 1, 1) 2099200 merge_13[0][0]
____________________________________________________________________________________________________
convolution2d_46 (Convolution2D) (None, 2048, 1, 1) 1050624 activation_42[0][0]
____________________________________________________________________________________________________
merge_14 (Merge) (None, 2048L, 1, 1) 0 convolution2d_47[0][0]
convolution2d_46[0][0]
____________________________________________________________________________________________________
batchnormalization_43 (BatchNorm (None, 2048L, 1, 1) 8192 merge_14[0][0]
____________________________________________________________________________________________________
activation_43 (Activation) (None, 2048L, 1, 1) 0 batchnormalization_43[0][0]
____________________________________________________________________________________________________
convolution2d_48 (Convolution2D) (None, 512, 1, 1) 1049088 activation_43[0][0]
____________________________________________________________________________________________________
batchnormalization_44 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_48[0][0]
____________________________________________________________________________________________________
activation_44 (Activation) (None, 512, 1, 1) 0 batchnormalization_44[0][0]
____________________________________________________________________________________________________
convolution2d_49 (Convolution2D) (None, 512, 1, 1) 2359808 activation_44[0][0]
____________________________________________________________________________________________________
batchnormalization_45 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_49[0][0]
____________________________________________________________________________________________________
activation_45 (Activation) (None, 512, 1, 1) 0 batchnormalization_45[0][0]
____________________________________________________________________________________________________
convolution2d_50 (Convolution2D) (None, 2048, 1, 1) 1050624 activation_45[0][0]
____________________________________________________________________________________________________
merge_15 (Merge) (None, 2048L, 1, 1) 0 merge_14[0][0]
convolution2d_50[0][0]
____________________________________________________________________________________________________
batchnormalization_46 (BatchNorm (None, 2048L, 1, 1) 8192 merge_15[0][0]
____________________________________________________________________________________________________
activation_46 (Activation) (None, 2048L, 1, 1) 0 batchnormalization_46[0][0]
____________________________________________________________________________________________________
convolution2d_51 (Convolution2D) (None, 512, 1, 1) 1049088 activation_46[0][0]
____________________________________________________________________________________________________
batchnormalization_47 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_51[0][0]
____________________________________________________________________________________________________
activation_47 (Activation) (None, 512, 1, 1) 0 batchnormalization_47[0][0]
____________________________________________________________________________________________________
convolution2d_52 (Convolution2D) (None, 512, 1, 1) 2359808 activation_47[0][0]
____________________________________________________________________________________________________
batchnormalization_48 (BatchNorm (None, 512, 1, 1) 2048 convolution2d_52[0][0]
____________________________________________________________________________________________________
activation_48 (Activation) (None, 512, 1, 1) 0 batchnormalization_48[0][0]
____________________________________________________________________________________________________
convolution2d_53 (Convolution2D) (None, 2048, 1, 1) 1050624 activation_48[0][0]
____________________________________________________________________________________________________
merge_16 (Merge) (None, 2048L, 1, 1) 0 merge_15[0][0]
convolution2d_53[0][0]
____________________________________________________________________________________________________
batchnormalization_49 (BatchNorm (None, 2048L, 1, 1) 8192 merge_16[0][0]
____________________________________________________________________________________________________
activation_49 (Activation) (None, 2048L, 1, 1) 0 batchnormalization_49[0][0]
____________________________________________________________________________________________________
averagepooling2d_1 (AveragePooli (None, 2048L, 1L, 1L) 0 activation_49[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten) (None, 2048) 0 averagepooling2d_1[0][0]
____________________________________________________________________________________________________
dense_1 (Dense) (None, 10) 20490 flatten_1[0][0]
====================================================================================================
Total params: 23,592,842
Trainable params: 23,547,402
Non-trainable params: 45,440
____________________________________________________________________________________________________
Not using data augmentation.
Train on 50000 samples, validate on 10000 samples
Epoch 1/100
[13:18:55] src/operator/././cudnn_algoreg-inl.h:112: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
/usr/local/lib/python2.7/dist-packages/mxnet/module/bucketing_module.py:385: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.00390625). Is this intended?
force_init=force_init)
49920/50000 [============================>.] - ETA: 0s - loss: 7.4342 - acc: 0.3508[13:19:37] src/operator/././cudnn_algoreg-inl.h:112: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
50000/50000 [==============================] - 45s - loss: 7.4340 - acc: 0.3508 - val_loss: 8.0809 - val_acc: 0.2931
Epoch 2/100
50000/50000 [==============================] - 25s - loss: 6.7589 - acc: 0.4978 - val_loss: 7.0586 - val_acc: 0.4474
Epoch 3/100
50000/50000 [==============================] - 25s - loss: 6.4458 - acc: 0.5799 - val_loss: 7.2990 - val_acc: 0.3793
Epoch 4/100
50000/50000 [==============================] - 25s - loss: 6.2015 - acc: 0.6478 - val_loss: 7.3075 - val_acc: 0.3861
Epoch 5/100
50000/50000 [==============================] - 25s - loss: 5.9922 - acc: 0.7095 - val_loss: 7.1960 - val_acc: 0.4205
Epoch 6/100
50000/50000 [==============================] - 25s - loss: 5.8184 - acc: 0.7620 - val_loss: 6.9669 - val_acc: 0.4878
Epoch 7/100
50000/50000 [==============================] - 25s - loss: 5.6454 - acc: 0.8120 - val_loss: 7.3894 - val_acc: 0.4639
Epoch 8/100
50000/50000 [==============================] - 25s - loss: 5.5012 - acc: 0.8554 - val_loss: 7.2800 - val_acc: 0.4839
Epoch 9/100
50000/50000 [==============================] - 25s - loss: 5.3884 - acc: 0.8847 - val_loss: 7.0814 - val_acc: 0.4858
Epoch 10/100
50000/50000 [==============================] - 25s - loss: 5.2744 - acc: 0.9147 - val_loss: 7.2281 - val_acc: 0.4821
Epoch 11/100
50000/50000 [==============================] - 25s - loss: 5.2058 - acc: 0.9266 - val_loss: 7.6308 - val_acc: 0.4822
Epoch 12/100
50000/50000 [==============================] - 25s - loss: 5.1349 - acc: 0.9429 - val_loss: 7.2960 - val_acc: 0.5028
Epoch 13/100
50000/50000 [==============================] - 25s - loss: 5.0833 - acc: 0.9517 - val_loss: 7.7287 - val_acc: 0.4924
Epoch 14/100
50000/50000 [==============================] - 25s - loss: 5.0313 - acc: 0.9581 - val_loss: 8.5238 - val_acc: 0.4266
Epoch 15/100
50000/50000 [==============================] - 25s - loss: 4.9819 - acc: 0.9648 - val_loss: 7.3515 - val_acc: 0.5220
Epoch 16/100
50000/50000 [==============================] - 25s - loss: 4.9520 - acc: 0.9651 - val_loss: 7.0066 - val_acc: 0.5572
Epoch 17/100
50000/50000 [==============================] - 25s - loss: 4.9040 - acc: 0.9719 - val_loss: 8.3245 - val_acc: 0.4404
Epoch 18/100
50000/50000 [==============================] - 25s - loss: 4.8683 - acc: 0.9741 - val_loss: 7.8812 - val_acc: 0.4690
Epoch 19/100
50000/50000 [==============================] - 25s - loss: 4.8422 - acc: 0.9726 - val_loss: 7.4009 - val_acc: 0.5348
Epoch 20/100
50000/50000 [==============================] - 25s - loss: 4.8061 - acc: 0.9756 - val_loss: 6.9567 - val_acc: 0.5820
Epoch 21/100
50000/50000 [==============================] - 25s - loss: 4.7671 - acc: 0.9799 - val_loss: 9.9805 - val_acc: 0.3547
Epoch 22/100
50000/50000 [==============================] - 25s - loss: 4.7418 - acc: 0.9783 - val_loss: 6.9288 - val_acc: 0.5603
Epoch 23/100
50000/50000 [==============================] - 25s - loss: 4.7077 - acc: 0.9805 - val_loss: 7.5308 - val_acc: 0.4925
Epoch 24/100
50000/50000 [==============================] - 25s - loss: 4.6820 - acc: 0.9803 - val_loss: 8.2754 - val_acc: 0.4650
Epoch 25/100
50000/50000 [==============================] - 25s - loss: 4.6469 - acc: 0.9826 - val_loss: 8.1159 - val_acc: 0.4397
Epoch 26/100
50000/50000 [==============================] - 25s - loss: 4.6180 - acc: 0.9830 - val_loss: 6.9717 - val_acc: 0.5681
Epoch 27/100
50000/50000 [==============================] - 25s - loss: 4.5861 - acc: 0.9844 - val_loss: 8.8103 - val_acc: 0.4260
Epoch 28/100
50000/50000 [==============================] - 25s - loss: 4.5613 - acc: 0.9836 - val_loss: 6.5432 - val_acc: 0.6113
Epoch 29/100
50000/50000 [==============================] - 25s - loss: 4.5263 - acc: 0.9861 - val_loss: 7.2728 - val_acc: 0.5516
Epoch 30/100
50000/50000 [==============================] - 25s - loss: 4.5028 - acc: 0.9855 - val_loss: 6.5828 - val_acc: 0.6071
Epoch 31/100
50000/50000 [==============================] - 25s - loss: 4.4691 - acc: 0.9875 - val_loss: 9.0100 - val_acc: 0.3929
Epoch 32/100
50000/50000 [==============================] - 25s - loss: 4.4402 - acc: 0.9881 - val_loss: 7.7215 - val_acc: 0.4929
Epoch 33/100
50000/50000 [==============================] - 25s - loss: 4.4165 - acc: 0.9873 - val_loss: 7.2202 - val_acc: 0.5409
Epoch 34/100
50000/50000 [==============================] - 25s - loss: 4.3992 - acc: 0.9834 - val_loss: 6.7706 - val_acc: 0.5767
Epoch 35/100
50000/50000 [==============================] - 25s - loss: 4.3613 - acc: 0.9880 - val_loss: 6.6951 - val_acc: 0.5864
Epoch 36/100
50000/50000 [==============================] - 25s - loss: 4.3405 - acc: 0.9866 - val_loss: 6.9227 - val_acc: 0.5557
Epoch 37/100
50000/50000 [==============================] - 25s - loss: 4.3128 - acc: 0.9866 - val_loss: 6.2588 - val_acc: 0.6211
Epoch 38/100
50000/50000 [==============================] - 25s - loss: 4.2789 - acc: 0.9895 - val_loss: 6.7703 - val_acc: 0.5710
Epoch 39/100
50000/50000 [==============================] - 25s - loss: 4.2577 - acc: 0.9882 - val_loss: 6.2447 - val_acc: 0.6402
Epoch 40/100
50000/50000 [==============================] - 25s - loss: 4.2244 - acc: 0.9906 - val_loss: 9.1206 - val_acc: 0.4046
Epoch 41/100
50000/50000 [==============================] - 25s - loss: 4.2106 - acc: 0.9873 - val_loss: 6.8567 - val_acc: 0.5640
Epoch 42/100
50000/50000 [==============================] - 25s - loss: 4.1781 - acc: 0.9893 - val_loss: 7.3209 - val_acc: 0.5161
Epoch 43/100
50000/50000 [==============================] - 25s - loss: 4.1509 - acc: 0.9902 - val_loss: 6.7022 - val_acc: 0.5891
Epoch 44/100
50000/50000 [==============================] - 25s - loss: 4.1270 - acc: 0.9896 - val_loss: 7.2685 - val_acc: 0.5423
Epoch 45/100
50000/50000 [==============================] - 25s - loss: 4.1082 - acc: 0.9873 - val_loss: 6.5343 - val_acc: 0.6038
Epoch 46/100
50000/50000 [==============================] - 25s - loss: 4.0767 - acc: 0.9905 - val_loss: 7.0244 - val_acc: 0.5356
Epoch 47/100
50000/50000 [==============================] - 25s - loss: 4.0627 - acc: 0.9871 - val_loss: 6.5169 - val_acc: 0.5807
Epoch 48/100
50000/50000 [==============================] - 25s - loss: 4.0323 - acc: 0.9889 - val_loss: 5.9066 - val_acc: 0.6562
Epoch 49/100
50000/50000 [==============================] - 25s - loss: 3.9978 - acc: 0.9927 - val_loss: 8.0794 - val_acc: 0.4507
Epoch 50/100
50000/50000 [==============================] - 25s - loss: 3.9760 - acc: 0.9920 - val_loss: 6.2954 - val_acc: 0.6156
Epoch 51/100
50000/50000 [==============================] - 25s - loss: 3.9523 - acc: 0.9919 - val_loss: 6.4975 - val_acc: 0.5984
Epoch 52/100
50000/50000 [==============================] - 25s - loss: 3.9289 - acc: 0.9918 - val_loss: 6.4966 - val_acc: 0.6013
Epoch 53/100
50000/50000 [==============================] - 25s - loss: 3.9047 - acc: 0.9918 - val_loss: 6.0992 - val_acc: 0.6303
Epoch 54/100
50000/50000 [==============================] - 25s - loss: 3.8816 - acc: 0.9916 - val_loss: 6.1480 - val_acc: 0.6178
Epoch 55/100
50000/50000 [==============================] - 25s - loss: 3.8585 - acc: 0.9914 - val_loss: 5.9164 - val_acc: 0.6447
Epoch 56/100
50000/50000 [==============================] - 24s - loss: 3.8313 - acc: 0.9925 - val_loss: 6.6936 - val_acc: 0.5693
Epoch 57/100
50000/50000 [==============================] - 25s - loss: 3.8204 - acc: 0.9892 - val_loss: 6.1162 - val_acc: 0.6260
Epoch 58/100
50000/50000 [==============================] - 25s - loss: 3.7933 - acc: 0.9903 - val_loss: 7.6055 - val_acc: 0.4636
Epoch 59/100
50000/50000 [==============================] - 25s - loss: 3.7803 - acc: 0.9868 - val_loss: 7.1323 - val_acc: 0.5064
Epoch 60/100
50000/50000 [==============================] - 25s - loss: 3.7618 - acc: 0.9859 - val_loss: 6.2774 - val_acc: 0.5892
Epoch 61/100
50000/50000 [==============================] - 25s - loss: 3.7283 - acc: 0.9898 - val_loss: 6.4765 - val_acc: 0.5817
Epoch 62/100
50000/50000 [==============================] - 25s - loss: 3.7049 - acc: 0.9902 - val_loss: 6.0344 - val_acc: 0.6097
Epoch 63/100
50000/50000 [==============================] - 25s - loss: 3.6727 - acc: 0.9936 - val_loss: 5.4557 - val_acc: 0.6775
Epoch 64/100
50000/50000 [==============================] - 25s - loss: 3.6487 - acc: 0.9942 - val_loss: 6.8655 - val_acc: 0.5378
Epoch 65/100
50000/50000 [==============================] - 25s - loss: 3.6255 - acc: 0.9951 - val_loss: 8.4668 - val_acc: 0.4403
Epoch 66/100
50000/50000 [==============================] - 25s - loss: 3.6085 - acc: 0.9928 - val_loss: 6.3563 - val_acc: 0.5679
Epoch 67/100
50000/50000 [==============================] - 25s - loss: 3.5873 - acc: 0.9930 - val_loss: 6.1473 - val_acc: 0.6016
Epoch 68/100
50000/50000 [==============================] - 25s - loss: 3.5669 - acc: 0.9924 - val_loss: 6.8488 - val_acc: 0.5569
Epoch 69/100
50000/50000 [==============================] - 24s - loss: 3.5496 - acc: 0.9913 - val_loss: 5.4005 - val_acc: 0.6773
Epoch 70/100
50000/50000 [==============================] - 25s - loss: 3.5146 - acc: 0.9958 - val_loss: 6.4531 - val_acc: 0.5842
Epoch 71/100
50000/50000 [==============================] - 25s - loss: 3.4974 - acc: 0.9941 - val_loss: 7.1032 - val_acc: 0.5022
Epoch 72/100
50000/50000 [==============================] - 25s - loss: 3.4865 - acc: 0.9912 - val_loss: 6.9691 - val_acc: 0.5408
Epoch 73/100
50000/50000 [==============================] - 25s - loss: 3.4577 - acc: 0.9935 - val_loss: 5.7409 - val_acc: 0.6230
Epoch 74/100
50000/50000 [==============================] - 24s - loss: 3.4382 - acc: 0.9938 - val_loss: 5.4519 - val_acc: 0.6605
Epoch 75/100
50000/50000 [==============================] - 25s - loss: 3.4130 - acc: 0.9945 - val_loss: 6.4179 - val_acc: 0.5593
Epoch 76/100
50000/50000 [==============================] - 25s - loss: 3.3965 - acc: 0.9936 - val_loss: 6.5389 - val_acc: 0.5762
Epoch 77/100
50000/50000 [==============================] - 25s - loss: 3.3790 - acc: 0.9926 - val_loss: 7.0960 - val_acc: 0.4743
Epoch 78/100
50000/50000 [==============================] - 25s - loss: 3.3680 - acc: 0.9893 - val_loss: 5.8224 - val_acc: 0.6162
Epoch 79/100
50000/50000 [==============================] - 25s - loss: 3.3398 - acc: 0.9921 - val_loss: 8.5480 - val_acc: 0.4254
Epoch 80/100
50000/50000 [==============================] - 25s - loss: 3.3270 - acc: 0.9897 - val_loss: 5.3450 - val_acc: 0.6533
Epoch 81/100
50000/50000 [==============================] - 25s - loss: 3.2957 - acc: 0.9935 - val_loss: 5.2966 - val_acc: 0.6656
Epoch 82/100
50000/50000 [==============================] - 25s - loss: 3.2778 - acc: 0.9930 - val_loss: 5.6172 - val_acc: 0.6409
Epoch 83/100
50000/50000 [==============================] - 24s - loss: 3.2588 - acc: 0.9925 - val_loss: 5.9547 - val_acc: 0.5763
Epoch 84/100
50000/50000 [==============================] - 25s - loss: 3.2336 - acc: 0.9951 - val_loss: 5.9915 - val_acc: 0.5808
Epoch 85/100
50000/50000 [==============================] - 25s - loss: 3.2205 - acc: 0.9927 - val_loss: 5.4194 - val_acc: 0.6469
Epoch 86/100
50000/50000 [==============================] - 24s - loss: 3.2021 - acc: 0.9925 - val_loss: 7.3006 - val_acc: 0.4766
Epoch 87/100
50000/50000 [==============================] - 25s - loss: 3.1901 - acc: 0.9901 - val_loss: 5.4143 - val_acc: 0.6309
Epoch 88/100
50000/50000 [==============================] - 25s - loss: 3.1640 - acc: 0.9925 - val_loss: 5.4471 - val_acc: 0.6337
Epoch 89/100
50000/50000 [==============================] - 25s - loss: 3.1449 - acc: 0.9930 - val_loss: 5.6257 - val_acc: 0.6164
Epoch 90/100
50000/50000 [==============================] - 25s - loss: 3.1268 - acc: 0.9927 - val_loss: 5.5993 - val_acc: 0.5964
Epoch 91/100
50000/50000 [==============================] - 25s - loss: 3.1075 - acc: 0.9928 - val_loss: 5.3717 - val_acc: 0.6397
Epoch 92/100
50000/50000 [==============================] - 25s - loss: 3.0820 - acc: 0.9952 - val_loss: 5.0897 - val_acc: 0.6722
Epoch 93/100
50000/50000 [==============================] - 24s - loss: 3.0619 - acc: 0.9958 - val_loss: 5.3178 - val_acc: 0.6489
Epoch 94/100
50000/50000 [==============================] - 25s - loss: 3.0462 - acc: 0.9949 - val_loss: 5.4110 - val_acc: 0.6317
Epoch 95/100
50000/50000 [==============================] - 25s - loss: 3.0248 - acc: 0.9955 - val_loss: 5.0131 - val_acc: 0.6798
Epoch 96/100
50000/50000 [==============================] - 25s - loss: 3.0070 - acc: 0.9953 - val_loss: 5.0207 - val_acc: 0.6695
Epoch 97/100
50000/50000 [==============================] - 25s - loss: 2.9878 - acc: 0.9963 - val_loss: 5.5625 - val_acc: 0.6148
Epoch 98/100
50000/50000 [==============================] - 25s - loss: 2.9725 - acc: 0.9949 - val_loss: 5.8556 - val_acc: 0.5889
Epoch 99/100
50000/50000 [==============================] - 25s - loss: 2.9640 - acc: 0.9923 - val_loss: 5.2620 - val_acc: 0.6506
Epoch 100/100
50000/50000 [==============================] - 25s - loss: 2.9415 - acc: 0.9937 - val_loss: 5.6695 - val_acc: 0.6184
Model training complete.
TRAINING ACCURACY - 0.993659999981
TEST ACCURACY - 0.6197
real 42m47.987s
user 490m7.064s
sys 64m42.488s
ubuntu@ip-172-31-41-99:~/keras/examples$