terriyu/diary-8jul2013.md

## diary-8jul2013.md

      
          
8 Jul 2013

### Gist Markdown references

* https://gist.github.com/micmcg/976172
* https://gist.github.com/dupuy/1855764
* http://webapps.stackexchange.com/questions/29602/markdown-to-insert-and-display-an-image-on-github-repo

### Bug I'm working on, "unable to sort data with MongoDB"

#### Status


The bug report is here: https://bugs.launchpad.net/ceilometer/+bug/1193906


Still trying to reproduce the error jd mentioned in the bug report
OperationFailure: database error: too much data for sort() with no index. add an index or specify a smaller limit

Previously, I was unable to reproduce the error and my tests were taking a
very long time to run.


#### jd's suggestion


jd says the problem is that the tests still use MIM:

> The error I had was using MongoDB, not MIM (the MongoDB-In-Memory
> implementation we use in unit tests). We are still using MIM by default to
> run unit test, and you are (again) pointing a limitation of this...
> It's simple, you just have to have a MongoDB running and export an
> environment variable so the unit tests will know where to use it instead
> of MIM


jd says the solution is to run this command first
$ export CEILOMETER_TEST_MONGODB_URL=mongodb://localhost:27017/ceilometer

then run the test command
$ tox -e py27 ...


MongoDB should already be running.  If it's not, start it with the command
$ sudo service mongodb start
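To make the role of the environment variable concrete, here is a minimal sketch of how a test setup could choose between a real MongoDB and MIM. The helper name `pick_test_backend` and its return values are my own invention for illustration; this is not the actual ceilometer test code, only the variable name and the URL come from jd's instructions.

```python
import os

# Name of the variable jd asked me to export before running tox.
ENV_VAR = "CEILOMETER_TEST_MONGODB_URL"


def pick_test_backend(environ=None):
    """Return ("mongodb", url) if the variable is set, else ("mim", None).

    Hypothetical helper: it only illustrates the "use a real MongoDB
    when the variable is exported, otherwise fall back to MIM" idea.
    """
    if environ is None:
        environ = os.environ
    url = environ.get(ENV_VAR)
    if url:
        return ("mongodb", url)
    return ("mim", None)


# With the variable exported, the tests would talk to the real server.
print(pick_test_backend({ENV_VAR: "mongodb://localhost:27017/ceilometer"}))
# Without it, they would stay on the in-memory implementation.
print(pick_test_backend({}))
```

Passing the environment in as a dictionary keeps the sketch easy to try without changing the shell.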


jd says this is a known problem that he's trying to fix:

> This is actually why I've a patch into the review queue dropping our MIM
> usage to replace it with a real MongoDB instance instead

jd's patch "Use a real MongoDB instance to run unit tests":
https://review.openstack.org/#/c/33290/

> This will allow more real tests, and use of more functionnality not
> implemented in MIM such as aggregation.


#### Running the tests again with real MongoDB instances


I did what jd suggested and reran my tests after running the export
command.  I varied the size of the database and noted the runtime and
whether the test passed or failed.


#### My testing procedure


Step 1 (Setup): Run `$ export CEILOMETER_TEST_MONGODB_URL=mongodb://localhost:27017/ceilometer`, then add the line `timestamps_for_test_samples_default_order = timestamps_for_test_samples_default_order*<integer>` where the test database is populated in tests/storage/base.py.


Step 2 (Testing): Run a single test using `$ tox -e py27 -- tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order` and record the runtime and whether the test passed.


Step 3 (Iterate with a different size database): Change the number `<integer>` and do Step 2 again.
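The effect of the extra line in Step 1 can be sketched as follows, assuming that `timestamps_for_test_samples_default_order` is an ordinary Python list (the values below are made-up stand-ins, not the real contents of tests/storage/base.py). Multiplying a list by an integer repeats its elements, which is what makes the test database grow.

```python
# Made-up stand-in for the list in tests/storage/base.py (assumption:
# the real variable is a list with one entry per test sample).
timestamps = [
    (2012, 7, 2, 10, 44),
    (2012, 7, 2, 10, 42),
    (2012, 7, 2, 10, 43),
]


def enlarge(entries, multiplier):
    """Repeat the entries `multiplier` times, like `entries * multiplier`."""
    return entries * multiplier


enlarged = enlarge(timestamps, 100)
# The enlarged list has 100 times as many entries as the original.
print(len(timestamps), len(enlarged))
```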


Here are my results.  Multiplier is the number `<integer>` in
`timestamps_for_test_samples_default_order = timestamps_for_test_samples_default_order*<integer>`

So a multiplier of 100 means the database has been made roughly 100
times larger.  The Pass column says whether the test passed (Y) or failed (N).
Every failing test reported the same error message:
OperationFailure: database error: too much data for sort() with no index.  add an index or specify a smaller limit

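This table summarizes the results.

| Multiplier | Runtime (s) | Pass |
|-----------:|------------:|:----:|
| 100        | 2.640       | Y    |
| 1000       | 20.351      | Y    |
| 10000      | 193.064     | Y    |
| 12000      | 240.318     | Y    |
| 15000      | 291.161     | N    |
| 20000      | 401.303     | N    |
| 30000      | 595.871     | N    |
| 50000      | 964.876     | N    |
| 70000      | 1386.196    | N    |
| 100000     | 1961.760    | N    |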

The runtime appears to scale linearly with the multiplier.  The line in the
plot (sort_plot.png) is a least-squares linear fit, whose equation is given
in the plot title.


R code for the plot:
# Multiplier applied to the test data and the measured runtime in seconds
mult <- c(100, 1000, 10000, 12000, 15000, 20000, 30000, 50000, 70000, 100000)
runtime <- c(2.640, 20.351, 193.064, 240.318, 291.161, 401.303, 595.871, 964.876, 1386.196, 1961.760)
# Least-squares linear fit of runtime against multiplier
model <- lm(runtime ~ mult)
png(filename = "sort_plot.png", width = 480, height = 480)
# Runs with a multiplier above 12000 failed; draw them as red crosses
plot(mult, runtime,
     main = "test_get_samples_in_default_order runtime vs database size\n Runtime (sec) = 1.09 + 0.01963*multiplication_factor",
     xlab = "Multiplication factor", ylab = "Runtime (sec)",
     pch = ifelse(mult > 12000, "x", "o"),
     col = ifelse(mult > 12000, "red", "black"))
abline(model)
legend("topleft", c("Pass", "Fail"), pch = c("o", "x"), col = c("black", "red"))
dev.off()
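As a cross-check of the fit quoted in the plot title, here is a small Python sketch that redoes the least-squares fit by hand on the same ten data points. The function name `fit_line` is my own; it uses the textbook formulas for simple linear regression and is not a port of R's `lm`. It should give approximately the same intercept and slope as the plot title.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Same measurements as in the results table.
mult = [100, 1000, 10000, 12000, 15000, 20000, 30000, 50000, 70000, 100000]
runtime = [2.640, 20.351, 193.064, 240.318, 291.161,
           401.303, 595.871, 964.876, 1386.196, 1961.760]

intercept, slope = fit_line(mult, runtime)
print("Runtime (sec) = %.2f + %.5f * multiplier" % (intercept, slope))
```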


Example output for a sort failure: https://gist.github.com/terriyu/5954419#file-tox-output-sort-failure-txt


#### Attempt to add an index and fix the bug


I added a line of code to create an index on descending timestamps, and
ran the test with a multiplier of 15000, which I know makes sort()
fail without an index.  However, my index has no effect and I get the same
error:
OperationFailure: database error: too much data for sort() with no index.  add an index or specify a smaller limit
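For reference, this is roughly the kind of call I mean by "create an index for descending timestamps", written as a sketch rather than as the exact line in my patch. The field name `timestamp` and the helper name are assumptions made for illustration. `RecordingCollection` is a stand-in so the sketch runs without a MongoDB server; with pymongo, a real collection object offers a `create_index` method that takes the same kind of key list.

```python
DESCENDING = -1  # same value as pymongo.DESCENDING


def create_descending_timestamp_index(collection, field="timestamp"):
    """Ask `collection` for an index on `field`, newest first.

    `collection` only needs a create_index(keys) method, as pymongo
    collections have. The default field name is an assumption.
    """
    return collection.create_index([(field, DESCENDING)])


class RecordingCollection(object):
    """Stand-in for a pymongo collection that just records requests."""

    def __init__(self):
        self.calls = []

    def create_index(self, keys):
        self.calls.append(keys)
        return "_".join("%s_%d" % (name, order) for name, order in keys)


stub = RecordingCollection()
create_descending_timestamp_index(stub)
# Shows the key specification that would be sent to MongoDB.
print(stub.calls)
```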


My patch is here: https://review.openstack.org/#/c/36159/


When I tried to submit my patch to Gerrit, Git complained that I had
unstaged changes, because my etc/ceilometer/ceilometer.conf.sample had
been modified and not staged.  I stashed the change using `git stash`.


## sort_plot.png

      
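[Plot: test_get_samples_in_default_order runtime vs database size, with the fitted line Runtime (sec) = 1.09 + 0.01963*multiplication_factor. Passing runs are marked "o", failing runs "x".]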
            
          
## tox-output-sort-failure.txt
$ tox -e py27 -- tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order
GLOB sdist-make: /opt/stack/ceilometer/setup.py
py27 inst-nodeps: /opt/stack/ceilometer/.tox/dist/ceilometer-2013.2.a161.gf352c61.zip
py27 runtests: commands[0]
running testr
running=${PYTHON:-python} -m subunit.run discover -t ./ ./tests --list
running=${PYTHON:-python} -m subunit.run discover -t ./ ./tests  --load-list /tmp/tmplxTh9A
======================================================================
FAIL: tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order
tags: worker-0
----------------------------------------------------------------------
pythonlogging:'': {{{connecting to MongoDB replicaset "" on __test__}}}

Traceback (most recent call last):
  File "/opt/stack/ceilometer/tests/storage/base.py", line 357, in test_get_samples_in_default_order
    for sample in self.conn.get_samples(f):
  File "/opt/stack/ceilometer/ceilometer/storage/impl_mongodb.py", line 492, in get_samples
    for s in samples:
  File "/opt/stack/ceilometer/.tox/py27/local/lib/python2.7/site-packages/pymongo/cursor.py", line 814, in next
    if len(self.__data) or self._refresh():
  File "/opt/stack/ceilometer/.tox/py27/local/lib/python2.7/site-packages/pymongo/cursor.py", line 763, in _refresh
    self.__uuid_subtype))
  File "/opt/stack/ceilometer/.tox/py27/local/lib/python2.7/site-packages/pymongo/cursor.py", line 720, in __send_message
    self.__uuid_subtype)
  File "/opt/stack/ceilometer/.tox/py27/local/lib/python2.7/site-packages/pymongo/helpers.py", line 100, in _unpack_response
    error_object["$err"])
OperationFailure: database error: too much data for sort() with no index.  add an index or specify a smaller limit
======================================================================
FAIL: process-returncode
tags: worker-0
----------------------------------------------------------------------
Binary content:
  traceback (test/plain; charset="utf8")
Ran 2 tests in 298.813s (+277.853s)
FAILED (id=70, failures=2)
error: testr failed (1)
ERROR: InvocationError: '/opt/stack/ceilometer/.tox/py27/bin/python setup.py testr --slowest --testr-args=--concurrency=1 tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order'
py27 runtests: commands[1]
WARNING:test command found but not installed in testenv
  cmd: /bin/bash
  env: /opt/stack/ceilometer/.tox/py27
Maybe forgot to specify a dependency?
py27 runtests: commands[2]
running testr
running=${PYTHON:-python} -m subunit.run discover -t ./nova_tests ./nova_tests --list
PASSED (id=65)
Slowest Tests
Test id                                                                          Runtime (s)
-------------------------------------------------------------------------------  -----------
tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order  298.480
process-returncode                                                                 0.000
__________________________________________________________________________ summary ___________________________________________________________________________
ERROR:   py27: commands failed