What did I see?
- A Slack message form Helene asking me to look at a Github issue that looked like a production bug.
What actually happened?
- A change in our OpenRosa http layer (from Apache HttpClient to OkHttp) had created a change in behaviour where, after successfully authenticating with a server over HTTPS, Collect would always preemptively send credentials using Basic Authentication regardless of the Authentication scheme initially used.
- This problem meant that servers that assumed the
Authorization
header was a Digest one would break. - On servers that checked the scheme in the
Authorization
header we were generating extra traffic because of the401
back and forths. - Basic Authentication (not Digest) was tested at our HTTP layer and QA had verified Collect worked against Aggregate and multiple other servers
What did I do?
- Inspected the Github issue
- Updated local
master
branch and create a fix branch - Reviewed authorisation with sandbox Aggregate server and emulator following the issue steps
- Noticed that tests were missing for behaviour after initial authorisation - back filled them. Tests pass without changes.
- Followed code through to try and understand how Basic auth request could follow a Digest authorisation
- Noticed that that
BasicAuthenticator
is added beforeDigestAuthenticator
toDispatchingAuthenticator
inOkHttpConnection
. This seems like it could be a problem but didn't make sense because earlier tests passed. - Sent reply to user in issue to see if we could get the headers their server was replying with when they experienced the problem.
- Looked at how to set up proxying to double check emulator traffic - realised that the new Android Studio Profiler supports network profiling
- Used Network Profiler to inspect traffic during "Get Blank Form" flow - noticed that requests following initial
401
and then authorisation always sentBasic
headers and then sentDigest
requests after another401
- Tried to diagnose if the issue was in
OkHttpConnection
by debugging - Started debugging inside of
DispatchingAuthenticator
- discoveredBasic
auth is being sent as it's first in the list of authenticators. This was still confusing because tests pass but then realised that tests don't cover case whereDigest
is authenticated over HTTPS. - Added test to check follow up requests to
Digest
authentication over HTTPS - test failed and fix is just changing the order of howAuthenticators
are added (as suspected in 6). - Created and wrote up PR.
- PR accepted by Grzegorz - he suggests I work out how we run QA. We look at previous OkHttp changes and Grzegorz suggests we send build to user with issue.
- Linked to previous test plan and added
needs-testing
label to PR.
What could I improve?
- Using the Network Profiler more: makes it a lot easier to inspect network traffic and good tool for developers working on HTTP parts of the app or diagnosing issues.
- Have more oversight on issues: I totally missed this issue come in and wouldn't have seen it if it hadn't come up on Slack. Adding email notifications is probably a good next step?
- Removing static state from
OkHttpConnection
: this made it hard to reason about what was going on while debugging the tests/app - Moving authentication logic out of
OkHttpConnection
: it would be way nicer if we didn't have the authentication logic dumped into the class with everything else as the tests are duplicated across HTTP verbs and it's harder to understand the flow of setting up authentication than it probably should be.
PR for code change is here