@mtho11
Created October 19, 2016 15:34
Hawkular Root Cause Analysis
NOTE: This process is NOT about identifying faults in people. The goal of this (and every) RCA is to identify opportunities to improve our team processes.
BlueJeans: https://redhat.bluejeans.com/1826343252
Stated issue: URLs always in Down state
http://livingontheedge.hawkular.org/hawkular-ui/url/url-list
Where issue was encountered: LOTE
Reported by: Upstream Community
Jira: https://issues.jboss.org/browse/HAWKULAR-890
Goal of RCA: Understand how QE missed catching this issue. If possible, identify Counter Measures.
Strive for the "5 Whys"
Background reading:
http://www.isixsigma.com/tools-templates/cause-effect/determine-root-cause-5-whys/
https://en.wikipedia.org/wiki/5_Whys
Why-1
Why was the URL down issue reported by Upstream Community (reported on #hawkular), and not by JONE QE?
No regression tests running against LOTE
Do we have a manual test case for testing URLs?
Yes, and the tests were run against MS7 / MS8
Why-2
Why does LOTE always report very slow response times (often >5s)?
No regression tests running against LOTE (TH: as far as I know, it has always been that slow since LOTE was put in place a few months back)
Why-3
Why were no UI regression tests running against LOTE?
Because login UI scripts fail.
** QE is not accountable for running regression tests in LOTE
Why-4
Why-5
Conclusion:
Issue is only seen in LOTE
LOTE testing by QE is "nice to have" and not "must have"
Counter Measures
None at this time. The QE process worked, in that manual URL tests are in place and were run during MS7 / MS8.