@szagoruyko
Last active January 17, 2017 11:30

Network-in-Network trained in Torch7

Trained by me using https://github.com/soumith/imagenet-multiGPU.torch; achieves 62.6% top-1 center-crop accuracy on the ImageNet validation set. Tested here: https://github.com/szagoruyko/imagenet-validation.torch

Download links:

  1. https://www.dropbox.com/s/mclw90yba6eml60/nin_bn_final.t7 (31 MB)
  2. https://www.dropbox.com/s/npmr5egvjbg7ovb/nin_nobn_final.t7 (31 MB) - Batch Normalization integrated into convolutional layers
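
For readers wondering how batch normalization can be integrated into a convolution: at inference time BN is a fixed per-channel affine map, so it can be absorbed into the weights and bias of the preceding convolution. The sketch below shows that standard identity for a single output channel in plain Lua. It is an illustration only, not necessarily the procedure used to produce nin_nobn_final.t7; the function names and the fields gamma, beta, mean, var and eps are hypothetical.

```lua
-- Fold inference-time batch normalization into the preceding convolution,
-- shown for one output channel. All names here are illustrative.
--   conv:   y = sum_i w[i] * x[i] + b
--   BN:     z = gamma * (y - mean) / sqrt(var + eps) + beta
--   folded: z = sum_i (w[i] * s) * x[i] + ((b - mean) * s + beta)
--           with s = gamma / sqrt(var + eps)
local function foldBatchNorm(w, b, bn)
  local s = bn.gamma / math.sqrt(bn.var + bn.eps)
  local foldedW = {}
  for i = 1, #w do
    foldedW[i] = w[i] * s
  end
  return foldedW, (b - bn.mean) * s + bn.beta
end

-- Weighted sum plus bias: what a convolution computes at one output position.
local function affine(w, x, b)
  local y = b
  for i = 1, #w do
    y = y + w[i] * x[i]
  end
  return y
end

-- Made-up numbers, used only to check that both paths agree.
local w, b = {0.5, -1.0, 2.0}, 0.1
local bn = {gamma = 1.5, beta = -0.2, mean = 0.3, var = 4.0, eps = 1e-3}
local x = {1.0, 2.0, 3.0}

local y = affine(w, x, b)
local viaBN = bn.gamma * (y - bn.mean) / math.sqrt(bn.var + bn.eps) + bn.beta
local foldedW, foldedB = foldBatchNorm(w, b, bn)
local viaFolded = affine(foldedW, x, foldedB)
print(viaBN, viaFolded)
```

The two printed values should agree up to floating-point rounding.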

Load as:

net = torch.load'./nin_nobn_final.t7':unpack()

Input image size is 224.

Preprocessing

Per-channel mean and std are saved with the network:

> print(net.transform)
{
  mean :
    {
      1 : 0.48462227599918
      2 : 0.45624044862054
      3 : 0.40588363755159
    }
  std :
    {
      1 : 0.22889466674951
      2 : 0.22446679341259
      3 : 0.22495548344775
    }
}
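
As a minimal sketch of how these statistics would typically be applied, the plain-Lua snippet below normalizes a single pixel channel by channel as (value - mean) / std. It assumes pixel values scaled to [0, 1] and channels in RGB order, which the gist does not state explicitly; normalizePixel is a hypothetical helper, not part of the saved network.

```lua
-- Values copied from the net.transform printout above.
local transform = {
  mean = {0.48462227599918, 0.45624044862054, 0.40588363755159},
  std  = {0.22889466674951, 0.22446679341259, 0.22495548344775},
}

-- Hypothetical helper: normalize one pixel {r, g, b}.
-- Assumes values in [0, 1] and RGB channel order.
local function normalizePixel(rgb, t)
  local out = {}
  for c = 1, 3 do
    out[c] = (rgb[c] - t.mean[c]) / t.std[c]
  end
  return out
end

local p = normalizePixel({1.0, 0.5, 0.0}, transform)
print(p[1], p[2], p[3])
```

With a torch tensor of shape 3 x H x W the same arithmetic would be applied to each channel slice instead of a single pixel.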

Can be loaded without CUDA support.

Schedule to train

The model is trained for 35 epochs, which takes a bit more than a day on a Titan X with cuDNN v4.

local regimes = {
  -- start, end,    LR,   WD,
  {  1,      9,   1e-1,   5e-4, },
  { 10,     19,   1e-2,   5e-4  },
  { 20,     25,   1e-3,   0 },
  { 26,     30,   1e-4,   0 },
}
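
A minimal sketch of how such a table can be consulted during training: given an epoch number, find the row whose start and end bracket it. The function regimeForEpoch is a hypothetical name used for illustration, not taken from the training script. The table as listed covers epochs 1 to 30 only, so the sketch returns nil for later epochs rather than guessing what happens there.

```lua
local regimes = {
  -- start, end,    LR,   WD
  {  1,      9,   1e-1,   5e-4 },
  { 10,     19,   1e-2,   5e-4 },
  { 20,     25,   1e-3,   0 },
  { 26,     30,   1e-4,   0 },
}

-- Hypothetical helper: return the learning rate and weight decay for an
-- epoch, or nil when the epoch falls outside every listed range.
local function regimeForEpoch(epoch)
  for _, row in ipairs(regimes) do
    if epoch >= row[1] and epoch <= row[2] then
      return { learningRate = row[3], weightDecay = row[4] }
    end
  end
  return nil
end

for _, epoch in ipairs({1, 10, 20, 30}) do
  local r = regimeForEpoch(epoch)
  print(epoch, r.learningRate, r.weightDecay)
end
```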

Printout:

With Batch Normalization:

nn.Sequential {
  (1): nn.SpatialConvolution(3 -> 96, 11x11, 4,4, 5,5)
  (2): nn.SpatialBatchNormalization
  (3): nn.ReLU
  (4): nn.SpatialConvolution(96 -> 96, 1x1)
  (5): nn.SpatialBatchNormalization
  (6): nn.ReLU
  (7): nn.SpatialConvolution(96 -> 96, 1x1)
  (8): nn.SpatialBatchNormalization
  (9): nn.ReLU
  (10): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (11): nn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
  (12): nn.SpatialBatchNormalization
  (13): nn.ReLU
  (14): nn.SpatialConvolution(256 -> 256, 1x1)
  (15): nn.SpatialBatchNormalization
  (16): nn.ReLU
  (17): nn.SpatialConvolution(256 -> 256, 1x1)
  (18): nn.SpatialBatchNormalization
  (19): nn.ReLU
  (20): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (21): nn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
  (22): nn.SpatialBatchNormalization
  (23): nn.ReLU
  (24): nn.SpatialConvolution(384 -> 384, 1x1)
  (25): nn.SpatialBatchNormalization
  (26): nn.ReLU
  (27): nn.SpatialConvolution(384 -> 384, 1x1)
  (28): nn.SpatialBatchNormalization
  (29): nn.ReLU
  (30): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (31): nn.SpatialConvolution(384 -> 1024, 3x3, 1,1, 1,1)
  (32): nn.SpatialBatchNormalization
  (33): nn.ReLU
  (34): nn.SpatialConvolution(1024 -> 1024, 1x1)
  (35): nn.SpatialBatchNormalization
  (36): nn.ReLU
  (37): nn.SpatialConvolution(1024 -> 1024, 1x1)
  (38): nn.SpatialBatchNormalization
  (39): nn.ReLU
  (40): nn.SpatialAveragePooling(7,7,1,1)
  (41): nn.View(-1)
  (42): nn.Linear(1024 -> 1000)
}
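
The spatial sizes implied by this printout can be traced with the usual output-size formula, floor((in + 2*pad - kernel) / stride) + 1, starting from the 224-pixel input. The plain-Lua sketch below does that for the layers that change or could change the resolution; the 1x1 convolutions keep the size and are left out. It assumes the pooling layers use floor rounding, which is the default unless ceil mode is enabled.

```lua
-- Output size of a convolution or pooling layer along one spatial dimension.
local function outSize(size, kernel, stride, pad)
  return math.floor((size + 2 * pad - kernel) / stride) + 1
end

-- Kernel, stride and padding copied from the printout (1x1 convs omitted).
local layers = {
  { name = 'conv 11x11 (3 -> 96)',    kernel = 11, stride = 4, pad = 5 },
  { name = 'max pool 1',              kernel = 3,  stride = 2, pad = 1 },
  { name = 'conv 5x5 (96 -> 256)',    kernel = 5,  stride = 1, pad = 2 },
  { name = 'max pool 2',              kernel = 3,  stride = 2, pad = 1 },
  { name = 'conv 3x3 (256 -> 384)',   kernel = 3,  stride = 1, pad = 1 },
  { name = 'max pool 3',              kernel = 3,  stride = 2, pad = 1 },
  { name = 'conv 3x3 (384 -> 1024)',  kernel = 3,  stride = 1, pad = 1 },
  { name = 'average pool 7x7',        kernel = 7,  stride = 1, pad = 0 },
}

local size = 224
local sizes = {}
for i, l in ipairs(layers) do
  size = outSize(size, l.kernel, l.stride, l.pad)
  sizes[i] = size
  print(string.format('%-24s -> %dx%d', l.name, size, size))
end
```

The resolution goes 224 -> 56 -> 28 -> 14 -> 7, and the 7x7 average pooling reduces it to 1x1, which is why the final nn.Linear layer takes 1024 inputs.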

Without:

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> output]
  (1): nn.SpatialConvolution(3 -> 96, 11x11, 4,4, 5,5)
  (2): nn.ReLU
  (3): nn.SpatialConvolution(96 -> 96, 1x1)
  (4): nn.ReLU
  (5): nn.SpatialConvolution(96 -> 96, 1x1)
  (6): nn.ReLU
  (7): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (8): nn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
  (9): nn.ReLU
  (10): nn.SpatialConvolution(256 -> 256, 1x1)
  (11): nn.ReLU
  (12): nn.SpatialConvolution(256 -> 256, 1x1)
  (13): nn.ReLU
  (14): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (15): nn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
  (16): nn.ReLU
  (17): nn.SpatialConvolution(384 -> 384, 1x1)
  (18): nn.ReLU
  (19): nn.SpatialConvolution(384 -> 384, 1x1)
  (20): nn.ReLU
  (21): nn.SpatialMaxPooling(3,3,2,2,1,1)
  (22): nn.SpatialConvolution(384 -> 1024, 3x3, 1,1, 1,1)
  (23): nn.ReLU
  (24): nn.SpatialConvolution(1024 -> 1024, 1x1)
  (25): nn.ReLU
  (26): nn.SpatialConvolution(1024 -> 1024, 1x1)
  (27): nn.ReLU
  (28): nn.SpatialAveragePooling(7,7,1,1)
  (29): nn.View(-1)
  (30): nn.Linear(1024 -> 1000)
}

function createModel(nGPU)
  require 'cunn'

  local model = nn.Sequential()

  -- One NIN block: a spatial convolution followed by two 1x1 convolutions,
  -- each followed by batch normalization (eps = 1e-3) and an in-place ReLU.
  local function block(...)
    local arg = {...}
    local no = arg[2]  -- number of output planes
    model:add(nn.SpatialConvolution(...))
    model:add(nn.SpatialBatchNormalization(no, 1e-3))
    model:add(nn.ReLU(true))
    model:add(nn.SpatialConvolution(no, no, 1, 1, 1, 1, 0, 0))
    model:add(nn.SpatialBatchNormalization(no, 1e-3))
    model:add(nn.ReLU(true))
    model:add(nn.SpatialConvolution(no, no, 1, 1, 1, 1, 0, 0))
    model:add(nn.SpatialBatchNormalization(no, 1e-3))
    model:add(nn.ReLU(true))
  end

  -- Max pooling; arguments are kW, kH, dW, dH, padW, padH.
  local function mp(...)
    model:add(nn.SpatialMaxPooling(...))
  end

  -- Block arguments: nInputPlane, nOutputPlane, kW, kH, dW, dH, padW, padH.
  block(3, 96, 11, 11, 4, 4, 5, 5)
  mp(3, 3, 2, 2, 1, 1)
  block(96, 256, 5, 5, 1, 1, 2, 2)
  mp(3, 3, 2, 2, 1, 1)
  block(256, 384, 3, 3, 1, 1, 1, 1)
  mp(3, 3, 2, 2, 1, 1)
  block(384, 1024, 3, 3, 1, 1, 1, 1)

  -- Global 7x7 average pooling, then a linear classifier over 1000 classes.
  model:add(nn.SpatialAveragePooling(7, 7, 1, 1))
  model:add(nn.View(-1):setNumInputDims(3))
  model:add(nn.Linear(1024, 1000))
  model:add(nn.LogSoftMax())

  model.imageSize = 256  -- image size before cropping
  model.imageCrop = 224  -- crop size fed to the network

  return model:cuda()
end

@lichengunc

The two links seem broken; could you please update the links to the two NIN models?

@szagoruyko
Author

@lichengunc they work for me; can you check from another computer?

@ClementPinard

Hello, thank you for publishing your network. I would like to use this model, but I ran into a problem with the getParameters() function.

net = torch.load'./nin_bn_final.t7':unpack()
param,grad = net:getParameters()

works, but when I first try to get the parameters of a single layer (say, for basic pruning or anything else)

net = torch.load'./nin_bn_final.t7':unpack()
layer = net:get(4) --nn.SpatialConvolution(96 -> 96, 1x1)
paramLayer = layer:getParameters()
param,grad = net:getParameters()

I get this output:

/home/ml/torch/install/share/lua/5.1/nn/Module.lua:323: misaligned parameter at 5
stack traceback:
        [C]: in function 'assert'
        /home/ml/torch/install/share/lua/5.1/nn/Module.lua:323: in function 'getParameters'
        [string "p,g = a:getParameters()"]:1: in main chunk
        [C]: in function 'xpcall'
        /home/ml/torch/install/share/lua/5.1/trepl/init.lua:679: in function 'repl'
        ...e/ml/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x00405d50

The problem only occurs with nin_bn (not with nin_nobn), and I can't reproduce it anywhere else, even on networks with batch normalization layers, such as inception-v3. Do you have any idea what is going on?

@szagoruyko
Author

@ClementPinard that's weird, I have never encountered this error. Can you post this in the torch/nn issues?

@archenroot

I am able to run the image tracker, for example, but I am now trying the OpenCV ImageNet classification demo and hitting the following issue. Any ideas, guys? Or should I ask somewhere else?

zangetsu@ares ~/data/proj/neural-networks/torch-opencv-demos/imagenet_classification $ th
 
  ______             __   |  Torch7 
 /_  __/__  ________/ /   |  Scientific computing for Lua. 
  / / / _ \/ __/ __/ _ \  |  Type ? for help 
 /_/  \___/_/  \__/_//_/  |  https://github.com/torch 
                          |  http://torch.ch 
	
th> import 'nn'
                                                                      [0.0705s]	
th> import 'dpnn'
                                                                      [0.0485s]	
th> net = torch.load('nin_nobn_final.t7'):unpack():float()
Warning: Failed to load function from bytecode: (binary): cannot load incompatible bytecode[string "net = torch.load('nin_nobn_final.t7'):unpack(..."]:1: attempt to call method 'unpack' (a nil value)
stack traceback:
	[string "net = torch.load('nin_nobn_final.t7'):unpack(..."]:1: in main chunk
	[C]: in function 'xpcall'
	/usr/share/lua/5.1/trepl/init.lua:679: in function 'repl'
	/usr/bin/th:204: in main chunk
	[C]: at 0x004045a0	
                                                                      [0.0187s]	
th> exit