@nervanazoo
Last active May 24, 2020
neon googlenet implementation

Model

This is an implementation of the GoogLeNet model for image classification described in Szegedy et al., 2014.

The model presented here does not include the Local Response Normalization layers used in the published implementation.

Model script

The model run script (googlenet_neon.py) is included below.

Trained weights

The trained weights file can be downloaded from AWS using the following link: trained googlenet model weights.

Performance

This model achieves 64% top-1 and 85.5% top-5 accuracy on the validation data set.

During training, the images were randomly cropped and flipped horizontally, but scale jittering and colorspace noise addition were not implemented.
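The crop-and-flip augmentation described above can be sketched with plain NumPy. This is a hypothetical standalone illustration (`random_crop_flip` is not part of neon; the actual preprocessing is handled inside neon's data pipeline):

```python
import numpy as np

def random_crop_flip(img, out_size=224, rng=np.random):
    """Randomly crop an HxWx3 image to out_size x out_size and flip it
    horizontally with probability 0.5 -- a sketch of the augmentation."""
    h, w, _ = img.shape
    top = rng.randint(0, h - out_size + 1)
    left = rng.randint(0, w - out_size + 1)
    patch = img[top:top + out_size, left:left + out_size, :]
    if rng.rand() < 0.5:
        patch = patch[:, ::-1, :]  # horizontal flip (reverse the width axis)
    return patch

# example: a 256x256x3 image is reduced to a random 224x224x3 patch
img = np.zeros((256, 256, 3), dtype=np.uint8)
patch = random_crop_flip(img)
```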

Instructions

To run the model, the ImageNet data set first needs to be uploaded and converted to the format compatible with neon (see instructions). Note that there have been some changes to the format of the mean data subtraction; users with the old format may be prompted to run an update script before proceeding.

This script works with the neon commit SHA 66846b409. Make sure that your local repo is synced to this commit and run the installation procedure before proceeding.

If neon is installed into a virtualenv, make sure that it is activated before running the commands below. Also, the commands below use the GPU backend by default so add -b cpu if you are running on a system without a compatible GPU.

To test the model performance on the validation data set and benchmark the run times use the following command:

python googlenet_neon.py -w path/to/dataset/batches --model_file googlenet.p

Additional options are available for features such as saving checkpoints and displaying logging information; use the --help option for details. For information on generating the ILSVRC2012 data set macrobatches, check out the neon documentation page.

Training

Training this model requires features of neon that have not yet been released. These scripts will be updated to include the training procedure as soon as those features are available.

Benchmarks

Machine and GPU specs:

Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz
Ubuntu 14.04
GPU: GeForce GTX TITAN X
CUDA Driver Version 7.0

The run times for the fprop pass, the bprop pass, and the parameter update are given in the table below. The iteration row is the combined runtime of all functions in a training iteration. These results are for minibatches of 128 images of shape 224x224x3. The model was run 12 times; the first two passes were discarded and the remaining 10 were averaged to produce the benchmark results.

------------------------------
|    Func     |      Mean    |
------------------------------
| fprop       |   116 msec   |
| bprop       |   261 msec   |
| update      |    45 msec   |
| iteration   |   424 msec   |
------------------------------
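The averaging scheme above, and the relationship between the rows of the table, can be sketched in plain Python. `summarize_timings` is a hypothetical helper for illustration, not part of neon (which performs this inside model.benchmark):

```python
def summarize_timings(times_msec, warmup=2):
    """Average iteration times after discarding warm-up passes,
    mirroring the benchmark procedure described above."""
    kept = times_msec[warmup:]
    return sum(kept) / float(len(kept))

# 12 recorded runs: the first two (slower warm-up passes) are discarded
times = [500.0, 480.0] + [424.0] * 10
mean_msec = summarize_timings(times)  # -> 424.0

# the iteration time is roughly the sum of the three component times;
# the small residual is framework overhead
fprop, bprop, update, iteration = 116, 261, 45, 424
overhead = iteration - (fprop + bprop + update)  # -> 2 msec
```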

Citation

Going deeper with convolutions
Szegedy, Christian; Liu, Wei; Jia, Yangqing; Sermanet, Pierre; Reed, Scott; Anguelov, Dragomir;
Erhan, Dumitru; Vanhoucke, Vincent; Rabinovich, Andrew
arXiv:1409.4842
#!/usr/bin/env python
# ----------------------------------------------------------------------------
# Copyright 2015 Nervana Systems Inc.
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ----------------------------------------------------------------------------
"""
Googlenet V1 implementation
"""
import os
from neon.util.argparser import NeonArgparser
from neon.layers import Conv, Pooling, MergeBroadcast, BranchNode, Affine, Tree, Dropout
from neon.layers import GeneralizedCost, Multicost
from neon.initializers import Constant, Xavier
from neon.backends import gen_backend
from neon.optimizers import GradientDescentMomentum, MultiOptimizer
from neon.transforms import Rectlin, Softmax, CrossEntropyMulti, TopKMisclassification
from neon.models import Model
from neon.data import ImageLoader
parser = NeonArgparser(__doc__)
parser.add_argument('--subset_pct', type=float, default=100,
                    help='subset of training dataset to use (percentage)')
args = parser.parse_args()
# setup data provider
img_set_options = dict(repo_dir=args.data_dir, inner_size=224,
                       dtype=args.datatype, subset_pct=args.subset_pct)
test = ImageLoader(set_name='validation', scale_range=(256, 256),
                   do_transforms=False, **img_set_options)
init1 = Xavier(local=False)
initx = Xavier(local=True)
bias = Constant(val=0.20)
relu = Rectlin()
common = dict(activation=relu, init=initx, bias=bias)
commonp1 = dict(activation=relu, init=initx, bias=bias, padding=1)
commonp2 = dict(activation=relu, init=initx, bias=bias, padding=2)
pool3s1p1 = dict(fshape=3, padding=1, strides=1)
pool3s2p1 = dict(fshape=3, padding=1, strides=2, op='max')
def inception(kvals):
    # kvals holds the output channel counts for the four parallel branches
    (p1, p2, p3, p4) = kvals
    branch1 = [Conv((1, 1, p1[0]), **common)]
    branch2 = [Conv((1, 1, p2[0]), **common),
               Conv((3, 3, p2[1]), **commonp1)]
    branch3 = [Conv((1, 1, p3[0]), **common),
               Conv((5, 5, p3[1]), **commonp2)]
    branch4 = [Pooling(op="max", **pool3s1p1),
               Conv((1, 1, p4[0]), **common)]
    return MergeBroadcast(layers=[branch1, branch2, branch3, branch4], merge="depth")
def main_branch(branch_nodes):
    return [Conv((7, 7, 64), padding=3, strides=2, **common),
            Pooling(**pool3s2p1),
            Conv((1, 1, 64), **common),
            Conv((3, 3, 192), **commonp1),
            Pooling(**pool3s2p1),
            inception([(64, ), (96, 128), (16, 32), (32, )]),
            inception([(128,), (128, 192), (32, 96), (64, )]),
            Pooling(**pool3s2p1),
            inception([(192,), (96, 208), (16, 48), (64, )]),
            branch_nodes[0],
            inception([(160,), (112, 224), (24, 64), (64, )]),
            inception([(128,), (128, 256), (24, 64), (64, )]),
            inception([(112,), (144, 288), (32, 64), (64, )]),
            branch_nodes[1],
            inception([(256,), (160, 320), (32, 128), (128,)]),
            Pooling(**pool3s2p1),
            inception([(256,), (160, 320), (32, 128), (128,)]),
            inception([(384,), (192, 384), (48, 128), (128,)]),
            Pooling(fshape=7, strides=1, op="avg"),
            Affine(nout=1000, init=init1, activation=Softmax(), bias=Constant(0))]
def aux_branch(bnode):
    return [bnode,
            Pooling(fshape=5, strides=3, op="avg"),
            Conv((1, 1, 128), **common),
            Affine(nout=1024, init=init1, activation=relu, bias=bias),
            Dropout(keep=0.3),
            Affine(nout=1000, init=init1, activation=Softmax(), bias=Constant(0))]
# Now construct the model
branch_nodes = [BranchNode(name='branch' + str(i)) for i in range(2)]
main1 = main_branch(branch_nodes)
aux1 = aux_branch(branch_nodes[0])
aux2 = aux_branch(branch_nodes[1])
model = Model(layers=Tree([main1, aux1, aux2], alphas=[1.0, 0.3, 0.3]))
valmetric = TopKMisclassification(k=5)
# dummy optimizer for benchmarking
# training implementation coming soon
opt_gdm = GradientDescentMomentum(0.0, 0.0)
opt_biases = GradientDescentMomentum(0.0, 0.0)
opt = MultiOptimizer({'default': opt_gdm, 'Bias': opt_biases})
# setup cost function as CrossEntropy
cost = Multicost(costs=[GeneralizedCost(costfunc=CrossEntropyMulti()),
                        GeneralizedCost(costfunc=CrossEntropyMulti()),
                        GeneralizedCost(costfunc=CrossEntropyMulti())],
                 weights=[1, 0., 0.])  # We only want to consider the CE of the main path
assert os.path.exists(args.model_file), 'script requires the trained weights file'
model.load_params(args.model_file)
model.initialize(test, cost)
print 'running speed benchmark...'
model.benchmark(test, cost, opt)
print '\nCalculating performance on validation set...'
test.reset()
mets = model.eval(test, metric=valmetric)
print 'Validation set metrics:'
print 'LogLoss: %.2f, Accuracy: %.1f %% (Top-1), %.1f %% (Top-5)' % (mets[0],
                                                                     (1.0-mets[1])*100,
                                                                     (1.0-mets[2])*100)