Network-in-Network trained in Torch7

Trained by me using https://github.com/soumith/imagenet-multiGPU.torch; achieves 62.6% top-1 center-crop accuracy on the ImageNet validation set. Tested with https://github.com/szagoruyko/imagenet-validation.torch

Download links:

  1. https://www.dropbox.com/s/mclw90yba6eml60/nin_bn_final.t7 (31 MB)
  2. https://www.dropbox.com/s/npmr5egvjbg7ovb/nin_nobn_final.t7 (31 MB) - Batch Normalization integrated (folded) into the convolutional layers
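
Folding batch normalization into the preceding convolution is a standard transform; a minimal NumPy sketch of the idea (the `fold_bn` helper is hypothetical, not code from this gist):

```python
import numpy as np

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-3):
    """Fold a BatchNorm layer into the preceding convolution.

    w: conv weights of shape (out_channels, ...), b: conv bias (out_channels,).
    gamma, beta, mean, var: per-channel BN parameters/statistics.
    Returns (w', b') such that conv(x, w') + b' == bn(conv(x, w) + b).
    """
    scale = gamma / np.sqrt(var + eps)  # per-output-channel scale
    w_folded = w * scale.reshape(-1, *([1] * (w.ndim - 1)))
    b_folded = (b - mean) * scale + beta
    return w_folded, b_folded
```

After folding, the BN layers disappear from the printout, which is why the `nin_nobn` model below has no SpatialBatchNormalization modules.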

Load as:

net = torch.load'./nin_nobn_final.t7':unpack()

The input image size is 224x224.

Preprocessing

Per-channel mean and std values are saved with the network:

> print(net.transform)
{
  mean :
    {
      1 : 0.48462227599918
      2 : 0.45624044862054
      3 : 0.40588363755159
    }
  std :
    {
      1 : 0.22889466674951
      2 : 0.22446679341259
      3 : 0.22495548344775
    }
}
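
Applying these statistics is the usual per-channel normalization; a minimal NumPy sketch, assuming an RGB image already scaled to [0, 1]:

```python
import numpy as np

# Per-channel statistics copied from net.transform (printout above)
MEAN = np.array([0.48462227599918, 0.45624044862054, 0.40588363755159])
STD  = np.array([0.22889466674951, 0.22446679341259, 0.22495548344775])

def preprocess(img):
    """Normalize an HxWx3 RGB image (float values in [0, 1]) channel-wise."""
    return (img - MEAN) / STD
```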

The network can be loaded without CUDA support.

Training schedule

The model was trained for 35 epochs, taking a bit more than a day on a Titan X with cuDNN v4.

local regimes = {
  -- start, end,    LR,   WD,
  {  1,      9,   1e-1,   5e-4, },
  { 10,     19,   1e-2,   5e-4  },
  { 20,     25,   1e-3,   0 },
  { 26,     30,   1e-4,   0 },
}
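
The same lookup in Python form (a sketch; the training script indexes this table by epoch):

```python
# (start_epoch, end_epoch, learning_rate, weight_decay), copied from the regimes above
REGIMES = [
    (1, 9, 1e-1, 5e-4),
    (10, 19, 1e-2, 5e-4),
    (20, 25, 1e-3, 0.0),
    (26, 30, 1e-4, 0.0),
]

def params_for_epoch(epoch):
    """Return (learning_rate, weight_decay) for a 1-indexed epoch."""
    for start, end, lr, wd in REGIMES:
        if start <= epoch <= end:
            return lr, wd
    raise ValueError(f"epoch {epoch} is outside the schedule")
```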

Printout:

With Batch Normalization:

nn.Sequential {
  (1): nn.SpatialConvolution(3 -> 96, 11x11, 4,4, 5,5)
  (2): nn.SpatialBatchNormalization
  (3): nn.ReLU
  (4): nn.SpatialConvolution(96 -> 96, 1x1)
  (5): nn.SpatialBatchNormalization
  (6): nn.ReLU
  (7): nn.SpatialConvolution(96 -> 96, 1x1)
  (8): nn.SpatialBatchNormalization
  (9): nn.ReLU
  (10): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (11): nn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
  (12): nn.SpatialBatchNormalization
  (13): nn.ReLU
  (14): nn.SpatialConvolution(256 -> 256, 1x1)
  (15): nn.SpatialBatchNormalization
  (16): nn.ReLU
  (17): nn.SpatialConvolution(256 -> 256, 1x1)
  (18): nn.SpatialBatchNormalization
  (19): nn.ReLU
  (20): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (21): nn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
  (22): nn.SpatialBatchNormalization
  (23): nn.ReLU
  (24): nn.SpatialConvolution(384 -> 384, 1x1)
  (25): nn.SpatialBatchNormalization
  (26): nn.ReLU
  (27): nn.SpatialConvolution(384 -> 384, 1x1)
  (28): nn.SpatialBatchNormalization
  (29): nn.ReLU
  (30): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (31): nn.SpatialConvolution(384 -> 1024, 3x3, 1,1, 1,1)
  (32): nn.SpatialBatchNormalization
  (33): nn.ReLU
  (34): nn.SpatialConvolution(1024 -> 1024, 1x1)
  (35): nn.SpatialBatchNormalization
  (36): nn.ReLU
  (37): nn.SpatialConvolution(1024 -> 1024, 1x1)
  (38): nn.SpatialBatchNormalization
  (39): nn.ReLU
  (40): nn.SpatialAveragePooling(7,7,1,1)
  (41): nn.View(-1)
  (42): nn.Linear(1024 -> 1000)
}

Without:

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> output]
  (1): nn.SpatialConvolution(3 -> 96, 11x11, 4,4, 5,5)
  (2): nn.ReLU
  (3): nn.SpatialConvolution(96 -> 96, 1x1)
  (4): nn.ReLU
  (5): nn.SpatialConvolution(96 -> 96, 1x1)
  (6): nn.ReLU
  (7): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (8): nn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
  (9): nn.ReLU
  (10): nn.SpatialConvolution(256 -> 256, 1x1)
  (11): nn.ReLU
  (12): nn.SpatialConvolution(256 -> 256, 1x1)
  (13): nn.ReLU
  (14): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (15): nn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
  (16): nn.ReLU
  (17): nn.SpatialConvolution(384 -> 384, 1x1)
  (18): nn.ReLU
  (19): nn.SpatialConvolution(384 -> 384, 1x1)
  (20): nn.ReLU
  (21): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (22): nn.SpatialConvolution(384 -> 1024, 3x3, 1,1, 1,1)
  (23): nn.ReLU
  (24): nn.SpatialConvolution(1024 -> 1024, 1x1)
  (25): nn.ReLU
  (26): nn.SpatialConvolution(1024 -> 1024, 1x1)
  (27): nn.ReLU
  (28): nn.SpatialAveragePooling(7,7,1,1)
  (29): nn.View(-1)
  (30): nn.Linear(1024 -> 1000)
}
Model definition:

function createModel(nGPU)
   require 'cunn'
   local model = nn.Sequential()

   -- conv -> BN -> ReLU followed by two 1x1 "mlpconv" layers, each with BN and ReLU
   local function block(...)
      local arg = {...}
      local no = arg[2] -- output planes of the leading convolution
      model:add(nn.SpatialConvolution(...))
      model:add(nn.SpatialBatchNormalization(no, 1e-3))
      model:add(nn.ReLU(true))
      model:add(nn.SpatialConvolution(no, no, 1, 1, 1, 1, 0, 0))
      model:add(nn.SpatialBatchNormalization(no, 1e-3))
      model:add(nn.ReLU(true))
      model:add(nn.SpatialConvolution(no, no, 1, 1, 1, 1, 0, 0))
      model:add(nn.SpatialBatchNormalization(no, 1e-3))
      model:add(nn.ReLU(true))
   end

   local function mp(...)
      model:add(nn.SpatialMaxPooling(...))
   end

   block(3, 96, 11, 11, 4, 4, 5, 5)
   mp(3, 3, 2, 2, 1, 1)
   block(96, 256, 5, 5, 1, 1, 2, 2)
   mp(3, 3, 2, 2, 1, 1)
   block(256, 384, 3, 3, 1, 1, 1, 1)
   mp(3, 3, 2, 2, 1, 1)
   block(384, 1024, 3, 3, 1, 1, 1, 1)
   model:add(nn.SpatialAveragePooling(7, 7, 1, 1))
   model:add(nn.View(-1):setNumInputDims(3))
   model:add(nn.Linear(1024, 1000))
   model:add(nn.LogSoftMax())

   model.imageSize = 256
   model.imageCrop = 224

   return model:cuda()
end
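
As a sanity check on the architecture, the spatial sizes can be traced with the standard output-size formula, floor((n + 2p - k) / s) + 1, which is what Torch's convolution and pooling layers use in floor mode. A small Python sketch:

```python
def out_size(n, k, s, p):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

n = 224                    # input crop
n = out_size(n, 11, 4, 5)  # conv 11x11, stride 4, pad 5 -> 56
n = out_size(n, 3, 2, 1)   # max pool -> 28 (the 5x5 pad-2 and 1x1 convs preserve size)
n = out_size(n, 3, 2, 1)   # max pool -> 14 (the 3x3 pad-1 and 1x1 convs preserve size)
n = out_size(n, 3, 2, 1)   # max pool -> 7, matching the final 7x7 average pooling
```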
@szagoruyko (Owner) commented Feb 15, 2016

Train/test top-1 error

[screenshot: train/test top-1 error plot, 2016-02-15]

Python notebook code to generate this plot

from bokeh.plotting import figure, output_notebook, show
import numpy as np
from bokeh.models import Range1d
output_notebook()

trainlog = np.loadtxt('train.log', skiprows=1)
testlog = np.loadtxt('test.log', skiprows=1)

x = np.arange(1, trainlog.shape[0] + 1)  # one point per epoch, matching y lengths
y_train = 100 - trainlog[:,1]
y_test = 100 - testlog[:,0]

p = figure(title='top1-error, %', x_axis_label = 'epoch')
p.line(x,y_train,color='red',  legend = 'train error')
p.line(x,y_test, color='blue', legend = 'test error')
p.set(y_range=Range1d(0, 100))
show(p)
@Atcold commented Mar 10, 2016

Nice gist, @szagoruyko!

@lichengunc commented Jul 7, 2016

The two links seem broken; could you please update the two NIN model links?

@szagoruyko (Owner) commented Sep 6, 2016

@lichengunc they work for me, can you check from another computer?

@ClementPinard commented Dec 30, 2016

Hello, thank you for publishing your network. I would like to use this model, but I ran into a problem with the getParameters() function.

net = torch.load'./nin_bn_final.t7':unpack()
param,grad = net:getParameters()

works, but when I first get the parameters of a single layer (say, for basic pruning or anything else),

net = torch.load'./nin_bn_final.t7':unpack()
layer = net:get(4) --nn.SpatialConvolution(96 -> 96, 1x1)
paramLayer = layer:getParameters()
param,grad = net:getParameters()

I get this output:

/home/ml/torch/install/share/lua/5.1/nn/Module.lua:323: misaligned parameter at 5
stack traceback:
        [C]: in function 'assert'
        /home/ml/torch/install/share/lua/5.1/nn/Module.lua:323: in function 'getParameters'
        [string "p,g = a:getParameters()"]:1: in main chunk
        [C]: in function 'xpcall'
        /home/ml/torch/install/share/lua/5.1/trepl/init.lua:679: in function 'repl'
        ...e/ml/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x00405d50

The problem only occurs with nin_bn (not nin_nobn), and I can't reproduce it anywhere else, even on other networks with batch normalization layers, such as inception-v3. Do you have an idea of what is going on?

@szagoruyko (Owner) commented Jan 6, 2017

@ClementPinard that's weird, I have never encountered this error. Can you post this in the torch/nn issues?

@archenroot commented Jan 17, 2017

I am able to run the image tracker example, but when I try the torch-opencv imagenet classification demo I get the following issue. Any ideas, guys? Or should I ask somewhere else?

zangetsu@ares ~/data/proj/neural-networks/torch-opencv-demos/imagenet_classification $ th
 
  ______             __   |  Torch7 
 /_  __/__  ________/ /   |  Scientific computing for Lua. 
  / / / _ \/ __/ __/ _ \  |  Type ? for help 
 /_/  \___/_/  \__/_//_/  |  https://github.com/torch 
                          |  http://torch.ch 
	
th> import 'nn'
                                                                      [0.0705s]	
th> import 'dpnn'
                                                                      [0.0485s]	
th> net = torch.load('nin_nobn_final.t7'):unpack():float()
Warning: Failed to load function from bytecode: (binary): cannot load incompatible bytecode[string "net = torch.load('nin_nobn_final.t7'):unpack(..."]:1: attempt to call method 'unpack' (a nil value)
stack traceback:
	[string "net = torch.load('nin_nobn_final.t7'):unpack(..."]:1: in main chunk
	[C]: in function 'xpcall'
	/usr/share/lua/5.1/trepl/init.lua:679: in function 'repl'
	/usr/bin/th:204: in main chunk
	[C]: at 0x004045a0	
                                                                      [0.0187s]	
th> exit
