@terriyu
Last active December 19, 2015 12:19
Journal for OpenStack Ceilometer work -- 8 Jul 2013. Also included: plot and tox error message.

8 Jul 2013

Gist Markdown references

Bug I'm working on, "unable to sort data with MongoDB"

Status

  • The bug report is here: https://bugs.launchpad.net/ceilometer/+bug/1193906

  • Still trying to reproduce the error jd mentioned in the bug report

    OperationFailure: database error: too much data for sort() with no index. add an index or specify a smaller limit
    

    Previously, I was unable to reproduce the error and my tests were taking a very long time to run.

jd's suggestion

  • jd says the problem is that the tests still use MIM:

    The error I had was using MongoDB, not MIM (the MongoDB-In-Memory implementation we use in unit tests). We are still using MIM by default to run unit test, and you are (again) pointing a limitation of this... It's simple, you just have to have a MongoDB running and export an environment variable so the unit tests will know where to use it instead of MIM

  • jd says the solution is to run this command first:

    $ export CEILOMETER_TEST_MONGODB_URL=mongodb://localhost:27017/ceilometer
    

    then run the test command

    $ tox -e py27 ...
    
  • MongoDB should already be running. If it isn't, start it with:

    $ sudo service mongodb start
    
  • jd says this is a known problem that he's trying to fix:

    This is actually why I've a patch into the review queue dropping our MIM usage to replace it with a real MongoDB instance instead

    jd's patch "Use a real MongoDB instance to run unit tests": https://review.openstack.org/#/c/33290/

    This will allow more real tests, and use of more functionnality not implemented in MIM such as aggregation.
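The selection jd describes (use a real MongoDB when the environment variable is set, otherwise fall back to MIM) can be sketched as below. This is only an illustration of the logic: `pick_test_backend` is a hypothetical helper, not Ceilometer's actual test setup code, and only the variable name comes from jd's instructions.

```python
import os

# Name of the environment variable, taken from jd's instructions above.
ENV_VAR = "CEILOMETER_TEST_MONGODB_URL"


def pick_test_backend(environ=os.environ):
    """Return ("mongodb", url) if the variable is set, else ("mim", None).

    Hypothetical helper: it illustrates the selection logic only and is
    not how Ceilometer's test fixtures are actually written.
    """
    url = environ.get(ENV_VAR)
    if url:
        return ("mongodb", url)
    return ("mim", None)


backend, url = pick_test_backend()
print("backend:", backend, "url:", url)
```

Running this in a shell where the export command has not been issued should report the MIM fallback, which matches the situation I was in before jd's suggestion.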

Running the tests again with real MongoDB instances

  • I followed jd's suggestion and reran my tests after running the export command. I varied the size of the database and recorded the runtime and whether the test passed or failed.

  • My testing procedure

    • Step 1 (Setup): Execute $ export CEILOMETER_TEST_MONGODB_URL=mongodb://localhost:27017/ceilometer, then add the line timestamps_for_test_samples_default_order = timestamps_for_test_samples_default_order*<integer> to the test database setup in tests/storage/base.py

    • Step 2 (Testing): Run a single test using $ tox -e py27 -- tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order and record the results.

    • Step 3 (Iterate with a different database size): Change the number <integer> in the line added in Step 1 and repeat Step 2.
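The effect of the line added in Step 1 can be sketched as follows, assuming timestamps_for_test_samples_default_order is an ordinary Python list. The three timestamps here are made-up stand-ins; the real contents of the list in tests/storage/base.py are not shown in this journal.

```python
import datetime

# Hypothetical stand-in for the timestamp list in tests/storage/base.py.
timestamps_for_test_samples_default_order = [
    datetime.datetime(2013, 7, 8, 12, 0),
    datetime.datetime(2013, 7, 8, 12, 5),
    datetime.datetime(2013, 7, 8, 12, 10),
]

multiplier = 100  # the <integer> from Step 1

# list * int concatenates the list with itself `multiplier` times, so the
# test ends up inserting `multiplier` times as many samples.
scaled = timestamps_for_test_samples_default_order * multiplier

print(len(timestamps_for_test_samples_default_order), "->", len(scaled))
```

Note that the repeated entries are duplicates of the same timestamps, so the database grows in size without gaining any new distinct timestamp values.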

  • Here are my results. Multiplier is the number <integer> in

    timestamps_for_test_samples_default_order = timestamps_for_test_samples_default_order*<integer>
    

    So if <integer> is 100, the database has been made roughly 100 times larger. The Pass column records whether the test passed (Y) or failed (N). Every failed run produced the same error message:

    OperationFailure: database error: too much data for sort() with no index.  add an index or specify a smaller limit
    

    This is a table summarizing the results.

    Multiplier   Runtime (s)   Pass
           100         2.640   Y
          1000        20.351   Y
         10000       193.064   Y
         12000       240.318   Y
         15000       291.161   N
         20000       401.303   N
         30000       595.871   N
         50000       964.876   N
         70000      1386.196   N
        100000      1961.760   N
  • The runtime appears to scale linearly with the multiplier. The line in the plot is a least-squares linear regression fit, whose model is given in the plot title.

Runtime vs Multiplier

  • R code for the plot:

    mult <- c(100, 1000, 10000, 12000, 15000, 20000, 30000, 50000, 70000, 100000)
    runtime <- c(2.640, 20.351, 193.064, 240.318, 291.161, 401.303, 595.871, 964.876, 1386.196, 1961.760)
    model <- lm(runtime ~ mult)
    png(filename = "sort_plot.png", width = 480, height = 480)
    plot(mult, runtime,
         main = "test_get_samples_in_default_order runtime vs database size\n Runtime (sec) = 1.09 + 0.01963*multiplication_factor",
         xlab = "Multiplication factor", ylab = "Runtime (sec)",
         pch = ifelse(mult > 12000, "x", "o"),
         col = ifelse(mult > 12000, "red", "black"))
    abline(model)
    legend("topleft", c("Pass", "Fail"), pch = c("o", "x"), col = c("black", "red"))
    dev.off()
    
  • Example output for a sort failure: https://gist.github.com/terriyu/5954419#file-tox-output-sort-failure-txt
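As a cross-check on the model in the plot title, the same ordinary least-squares fit can be redone without R. The sketch below uses only the data from the results table; `fit_line` is a hypothetical helper name, and the fit is the textbook closed-form solution rather than whatever lm() does internally.

```python
# Data copied from the results table above.
mult = [100, 1000, 10000, 12000, 15000, 20000, 30000, 50000, 70000, 100000]
runtime = [2.640, 20.351, 193.064, 240.318, 291.161,
           401.303, 595.871, 964.876, 1386.196, 1961.760]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(mult, runtime)
print("Runtime (sec) = %.2f + %.5f * multiplier" % (intercept, slope))
```

The coefficients should agree, to rounding, with the 1.09 and 0.01963 in the plot title. The slope works out to roughly 0.02 seconds per unit of multiplier.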

Attempt to add an index and fix the bug

  • I added a line of code to create an index on descending timestamps and ran the test with a multiplier of 15000, which I know makes sort() fail without an index. However, my index had no effect and I got the same error:

    OperationFailure: database error: too much data for sort() with no index.  add an index or specify a smaller limit
    
  • My patch is here: https://review.openstack.org/#/c/36159/

  • When I tried to submit my patch to Gerrit, Git complained that I had unstaged changes because my etc/ceilometer/ceilometer.conf.sample had been modified, so I stashed the changes using git stash
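For reference, creating a descending index and sorting on it with pymongo looks roughly like the sketch below. This is not the code in my patch: the field name "timestamp" and the helper names are assumptions for illustration. The functions accept any object with pymongo's Collection interface, so no import is needed here.

```python
# pymongo.DESCENDING is the integer -1; it is spelled out here so the
# sketch does not need pymongo imported.
DESCENDING = -1


def ensure_timestamp_index(collection, field="timestamp"):
    """Create a descending index on `field` and return the index name.

    `collection` is expected to be a pymongo Collection. The field name
    "timestamp" is an assumption about the sample documents.
    """
    return collection.create_index([(field, DESCENDING)])


def newest_first(collection, field="timestamp"):
    """Return a cursor over all documents, sorted newest first."""
    return collection.find().sort([(field, DESCENDING)])
```

With a running MongoDB, `collection` would come from something like pymongo.MongoClient(url)["ceilometer"]["meter"], where the collection name "meter" is also an assumption. Whether MongoDB actually uses an index for a given sort depends on the query's filter and sort keys matching the index keys, which may be worth checking given that my index had no effect.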

$ tox -e py27 -- tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order
GLOB sdist-make: /opt/stack/ceilometer/setup.py
py27 inst-nodeps: /opt/stack/ceilometer/.tox/dist/ceilometer-2013.2.a161.gf352c61.zip
py27 runtests: commands[0]
running testr
running=${PYTHON:-python} -m subunit.run discover -t ./ ./tests --list
running=${PYTHON:-python} -m subunit.run discover -t ./ ./tests --load-list /tmp/tmplxTh9A
======================================================================
FAIL: tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order
tags: worker-0
----------------------------------------------------------------------
pythonlogging:'': {{{connecting to MongoDB replicaset "" on __test__}}}
Traceback (most recent call last):
File "/opt/stack/ceilometer/tests/storage/base.py", line 357, in test_get_samples_in_default_order
for sample in self.conn.get_samples(f):
File "/opt/stack/ceilometer/ceilometer/storage/impl_mongodb.py", line 492, in get_samples
for s in samples:
File "/opt/stack/ceilometer/.tox/py27/local/lib/python2.7/site-packages/pymongo/cursor.py", line 814, in next
if len(self.__data) or self._refresh():
File "/opt/stack/ceilometer/.tox/py27/local/lib/python2.7/site-packages/pymongo/cursor.py", line 763, in _refresh
self.__uuid_subtype))
File "/opt/stack/ceilometer/.tox/py27/local/lib/python2.7/site-packages/pymongo/cursor.py", line 720, in __send_message
self.__uuid_subtype)
File "/opt/stack/ceilometer/.tox/py27/local/lib/python2.7/site-packages/pymongo/helpers.py", line 100, in _unpack_response
error_object["$err"])
OperationFailure: database error: too much data for sort() with no index. add an index or specify a smaller limit
======================================================================
FAIL: process-returncode
tags: worker-0
----------------------------------------------------------------------
Binary content:
traceback (test/plain; charset="utf8")
Ran 2 tests in 298.813s (+277.853s)
FAILED (id=70, failures=2)
error: testr failed (1)
ERROR: InvocationError: '/opt/stack/ceilometer/.tox/py27/bin/python setup.py testr --slowest --testr-args=--concurrency=1 tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order'
py27 runtests: commands[1]
WARNING:test command found but not installed in testenv
cmd: /bin/bash
env: /opt/stack/ceilometer/.tox/py27
Maybe forgot to specify a dependency?
py27 runtests: commands[2]
running testr
running=${PYTHON:-python} -m subunit.run discover -t ./nova_tests ./nova_tests --list
PASSED (id=65)
Slowest Tests
Test id Runtime (s)
------------------------------------------------------------------------------- -----------
tests.storage.test_impl_mongodb.RawSampleTest.test_get_samples_in_default_order 298.480
process-returncode 0.000
__________________________________________________________________________ summary ___________________________________________________________________________
ERROR: py27: commands failed

anteaya commented Jul 9, 2013

This is terrific work, Terri. I am able to follow your progress, I understand your results and the steps you take to address them. Most importantly you are documenting your progress to show others (and have a record for yourself).

Well done, Terri. Keep it up,
Anita.
