@mjg123
Last active August 29, 2015 14:22
Diagnosing a gate failure using logstash.openstack.org

Recently we noticed that the "check-tempest-dsvm-multinode-full" job, which is non-voting in nova's check jobs, was failing in a new way.

Some notes about how the failure was diagnosed:

  1. The job is failing.
  2. Why? Find a signature for the failure, e.g.
File "tempest/api/compute/admin/test_live_migration.py", line 116, in test_live_block_migration"
AND build_name:"check-tempest-dsvm-multinode-full"
AND project:"openstack/nova"
  3. Raise a bug: https://bugs.launchpad.net/nova/+bug/1463747
  4. How often is the job failing? Compare the hit counts for
build_name:"check-tempest-dsvm-multinode-full"
AND message:"Finished: SUCCESS"

vs

build_name:"check-tempest-dsvm-multinode-full"
AND message:"Finished: FAILURE"

(also for this particular job: https://jogo.github.io/gate/multinode.html <-- chrome browser)

  5. What's the actual cause? Check the appropriate logs (in this case n-cpu) and find a signature there.
  6. Search logstash.openstack.org for that signature:
message:"TypeError: string indices must be integers"
AND tags:"screen-n-cpu.txt"
  7. If the failure started only recently, look at the oldest patchset (build_change) that hit it and check gerrit to see if it's related.

  8. It does seem to be related: https://review.openstack.org/#/c/177437/

  9. Propose a revert, or fix forward. In this case, both:

  10. In the end the revert was abandoned and the fix merged.
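
The bookkeeping in the steps above (building the queries, turning SUCCESS/FAILURE hit counts into a failure rate, and picking the oldest change that hit the signature) can be sketched roughly as below. This is only an illustration: the helper names are made up, nothing here talks to logstash.openstack.org, and it assumes you have already copied the hit counts and hits out of the web UI. The hit layout (dicts with "@timestamp" as an ISO 8601 string and "build_change") is also an assumption.

```python
def build_query(**fields):
    """Join field:"value" terms with AND, in the style of the queries above.

    Hypothetical helper: it only builds the query string, it does not run it.
    """
    return " AND ".join(f'{name}:"{value}"' for name, value in fields.items())


def failure_rate(success_hits, failure_hits):
    """Fraction of finished runs that failed, or None if there were no runs."""
    total = success_hits + failure_hits
    if total == 0:
        return None
    return failure_hits / total


def oldest_change(hits):
    """Return the build_change of the earliest hit, or None if there are none.

    Assumes each hit is a dict with "@timestamp" (ISO 8601 strings in the same
    format, so they sort correctly as plain strings) and "build_change".
    """
    if not hits:
        return None
    first = min(hits, key=lambda hit: hit["@timestamp"])
    return first["build_change"]


if __name__ == "__main__":
    job = "check-tempest-dsvm-multinode-full"
    print(build_query(build_name=job, message="Finished: FAILURE"))
    # Placeholder counts for illustration only, not real gate data.
    print(failure_rate(success_hits=30, failure_hits=10))
```

Only the oldest hit matters for step 7, which is why `oldest_change` takes a minimum rather than sorting everything. Since logstash only keeps a limited window of logs, the "oldest" change found this way is the oldest one still indexed, not necessarily the first one ever to hit the failure.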

Random notes:

  • logstash keeps 10 days' worth of logs.
  • Q: what gets indexed? A: this