Last active August 29, 2015 14:20
A description of the problems I found with Aggregators.
import theano
import numpy
from numpy.testing import assert_allclose
from collections import OrderedDict
from fuel.datasets import IndexableDataset
from fuel.streams import DataStream
from fuel.schemes import SequentialScheme
from theano import tensor
from blocks.monitoring.evaluators import DatasetEvaluator
from blocks.monitoring.aggregation import Mean, TakeLast
# Introduction:
# Our dataset has 5 observations of 2 features each.
# We apply a linear transformation (here simply y = -x) to get 2 new features.
# We want the mean of the new features over the whole dataset.
num_examples = 5
num_batches = 3
batch_size = 2
features = numpy.array([[3, 3],
                        [2, 9],
                        [2, 4],
                        [5, 1],
                        [3, 3]], dtype=theano.config.floatX)
dataset = IndexableDataset(OrderedDict([('features', features)]))
x = tensor.matrix('features')
y = -x
y.name = 'y'
# The mean should be a 2-dimensional vector with values:
# [-3, -4]
y.tag.aggregation_scheme = Mean(y, 1.0)
# If we use a batch size of 1, that is what we get.
data_stream_1 = DataStream(dataset,
                           iteration_scheme=SequentialScheme(num_examples, 1))
print(DatasetEvaluator([y]).evaluate(data_stream_1)['y'])
# [[-3. -4.]]
# If we use a batch size of 2,
data_stream_2 = DataStream(dataset,
                           iteration_scheme=SequentialScheme(num_examples, 2))
# If we evaluate the same variable as before, we now get a result of shape 2x2.
# This is INCORRECT. The problem is that the last batch, which has only 1 example,
# is broadcast to 2 rows and averaged with all the other results. So even
# if we compute the mean afterwards, the result will be different.
print(DatasetEvaluator([y]).evaluate(data_stream_2)['y'])
# [[-2.66666675 -3.33333325]
#  [-3.33333325 -4.33333349]]
print(DatasetEvaluator([y]).evaluate(data_stream_2)['y'].mean(axis=0))
# [-3. -3.83333349]
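# A minimal NumPy sketch of the broadcasting described above. It is not the
# Blocks implementation: it only assumes that per-batch outputs are summed and
# then divided by the number of batches. The names sketch_batches and
# sketch_broadcast_mean are made up for this illustration.
import numpy


def sketch_broadcast_mean(batches):
    # Shapes (2, 2) + (2, 2) + (1, 2): the single-row batch is silently
    # broadcast across both rows of the running total.
    total = sum(batches[1:], batches[0])
    return total / len(batches)


sketch_batches = [-numpy.array([[3., 3.], [2., 9.]]),
                  -numpy.array([[2., 4.], [5., 1.]]),
                  -numpy.array([[3., 3.]])]
# The result has shape (2, 2), not (2,), and its row mean is not [-3, -4].
print(sketch_broadcast_mean(sketch_batches))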
# An alternative would be to compute the mean with another Theano variable and
# then aggregate according to it.
z = tensor.mean(y, axis=0)
z.name = 'z'
z.tag.aggregation_scheme = Mean(z, 1.0)
print(DatasetEvaluator([z]).evaluate(data_stream_1)['z'])
# [-3. -4.]
print(DatasetEvaluator([z]).evaluate(data_stream_2)['z'])
# [-3. -3.83333325]
# This makes the same mistake, because it gives the last minibatch the
# same weight (importance) as the others even though it has fewer examples.
# Actually, this error was part of the aggregator before the changes I introduced.
# The worst part is that this is dangerous when evaluating performance. For example,
# let's say that the dataset has 1001 examples and that we use minibatches of size 1000.
# We will have 2 minibatches, one of size 1000 and the other of size 1. Let's say that
# we have a classifier that gets every example wrong except the last one, and that
# we compute the misclassification rate. Each minibatch will be averaged with the same
# weight, so we will get an accuracy of 0.5 when the true accuracy is 1/1001.
# This exact case will probably never happen, but it is worth considering,
# and it is a direct result of using minibatches of different sizes.
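# A worked version of the 1001-example case in plain Python, as a sketch only.
# The function and variable names are made up for this illustration. It
# compares the equal-weight average of per-batch error rates with the average
# weighted by batch size.
def sketch_unweighted_error(batch_errors):
    # Every minibatch counts the same, whatever its size.
    return sum(batch_errors) / float(len(batch_errors))


def sketch_weighted_error(batch_errors, batch_sizes):
    # Every example counts the same.
    total = sum(e * n for e, n in zip(batch_errors, batch_sizes))
    return total / float(sum(batch_sizes))


sketch_errors = [1.0, 0.0]  # misclassification rate of each minibatch
sketch_sizes = [1000, 1]    # number of examples in each minibatch
print(sketch_unweighted_error(sketch_errors))              # 0.5
print(sketch_weighted_error(sketch_errors, sketch_sizes))  # 1000/1001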
u = tensor.mean(z)
u.name = 'u'
#u.tag.aggregation_scheme = Mean(u, 1.)
print(DatasetEvaluator([u]).evaluate(data_stream_1)['u'])
# -3.5
# This is the correct result.
print(DatasetEvaluator([u]).evaluate(data_stream_2)['u'])
# -3.41666674614
# Please note that I didn't tag the variable, so the default behaviour is used.
# Finally, the last case is when we have minibatches of different sizes
# and the last one has more than 1 example.
# If we use a batch size of 3,
data_stream_3 = DataStream(dataset,
                           iteration_scheme=SequentialScheme(num_examples, 3))
# print(DatasetEvaluator([y]).evaluate(data_stream_3)['y'])
# This would cause a shape mismatch error.
print(DatasetEvaluator([z]).evaluate(data_stream_3)['z'])
# Again, this produces a wrong result with no warning at all.
# [-3.16666651 -3.66666675]
# Maybe this could be solved by adjusting the denominator; however, that is not the
# default behaviour.
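# A sketch of what "adjusting the denominator" could look like, in plain Python
# rather than Theano. sketch_size_weighted_mean is a made-up name, not a Blocks
# function. The idea: accumulate the per-batch column sums as the numerator and
# the per-batch example counts as the denominator, and divide only at the end.
def sketch_size_weighted_mean(batches):
    numerator = [0.0] * len(batches[0][0])
    denominator = 0
    for batch in batches:
        for row in batch:
            numerator = [acc + value for acc, value in zip(numerator, row)]
        denominator += len(batch)
    return [acc / denominator for acc in numerator]


# The same 5 examples of y = -x, split with a batch size of 3.
sketch_y_batches = [[[-3., -3.], [-2., -9.], [-2., -4.]],
                    [[-5., -1.], [-3., -3.]]]
print(sketch_size_weighted_mean(sketch_y_batches))  # [-3.0, -4.0]
# In Blocks this might correspond to something like
# Mean(y.sum(axis=0), y.shape[0]); that is an untested assumption.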