The problem was that code written by Team A was overwritten (clobbered) by Team B during merge conflict resolution.
This is a common problem that can usually be identified and fixed during the QA phase. In this case, however, it was not caught during QA and made its way to production.
-
There are a large number of untracked differences between the repository's branches (namely master, staging, and production). In this particular scenario, the code deployed in the dev1 environment had the required property (ctladmin_endpoint) because the branch deployed there still contained it, but in production that property had been removed during merge conflict resolution.
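A missing property like this can be caught mechanically before deployment. The following is a minimal sketch, assuming the configuration is stored as key=value lines (Java-style properties); the file format, the function names, and the sample values are assumptions for illustration and are not taken from our repository.

```python
def property_keys(text):
    """Return the set of keys found in key=value (or key: value) text."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        # The key is everything before the first '=' or ':' separator.
        for i, ch in enumerate(line):
            if ch in "=:":
                keys.add(line[:i].strip())
                break
        else:
            # No separator: treat the whole line as a key with an empty value.
            keys.add(line)
    return keys


def missing_keys(reference_text, candidate_text):
    """Return keys present in the reference config but absent from the candidate."""
    return sorted(property_keys(reference_text) - property_keys(candidate_text))


if __name__ == "__main__":
    # Hypothetical contents of the config as deployed to dev1 and to production.
    dev1 = "ctladmin_endpoint=https://example.invalid/admin\ntimeout=30\n"
    production = "timeout=30\n"
    print(missing_keys(dev1, production))  # prints ['ctladmin_endpoint']
```

Run as a pre-release check against the config files of the two branches, a non-empty result would have flagged the removed ctladmin_endpoint before it reached production.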
-
The cause of these differences is our practice of cherry-picking commits into different branches. The developer first creates a PR for the sprint branch and then another for production. Ideally, we should cherry-pick commits as infrequently as possible.
-
Keep the difference between master, staging, and the other branches as small as possible. The differences between branches should be well defined and well understood. Example: a new feature is being tested in the sprint branch and is not yet available in production. Note: code from the sprint branch should be merged into production as soon as QA gives the go-ahead for the new feature.
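One way to keep the difference between two branches visible is to list the commits that exist on only one side. The sketch below relies on git log with the --left-right and --cherry-pick options, which hide commits whose patch already exists on the other side (as happens after a cherry-pick). The function names and the tab-separated output format are choices made for this example, not an existing tool of ours.

```python
import subprocess


def parse_left_right(log_output):
    """Split 'mark<TAB>hash<TAB>subject' lines into (left_only, right_only) lists."""
    left, right = [], []
    for line in log_output.splitlines():
        if not line.strip():
            continue
        mark, commit, subject = line.split("\t", 2)
        if mark.startswith("<"):
            left.append((commit, subject))
        elif mark.startswith(">"):
            right.append((commit, subject))
    return left, right


def branch_drift(branch_a, branch_b, repo_dir="."):
    """Return commits unique to branch_a and commits unique to branch_b."""
    result = subprocess.run(
        [
            "git", "log", "--left-right", "--cherry-pick",
            # %m = left/right mark, %h = short hash, %s = subject, %x09 = tab
            "--pretty=format:%m%x09%h%x09%s",
            f"{branch_a}...{branch_b}",
        ],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_left_right(result.stdout)


if __name__ == "__main__":
    # Branch names taken from this document; adjust to the actual repository.
    only_staging, only_production = branch_drift("staging", "production")
    print(f"{len(only_staging)} commit(s) only in staging")
    print(f"{len(only_production)} commit(s) only in production")
```

If every entry in both lists can be explained (for example, a feature still under QA in the sprint branch), the difference is well defined; anything unexplained is drift worth investigating.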
-
Never allow developers to create PRs against production. A developer should create only one PR for any issue (preferably to the staging branch, the sprint branch, or a feature branch, but also to a hotfix branch if required). It is the PM's responsibility to create merge PRs and back-merge PRs (in the case of a hotfix) to keep all branches in sync as much as possible.
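The rule above could be enforced by an automated check on each PR rather than by convention alone. This is only a sketch of the decision logic: the role names ("developer", "pm") and the function name are hypothetical, and in practice the hosting platform's branch protection settings would likely do this job.

```python
# Branches that only merge PRs may target (per the rule above).
PROTECTED_BRANCHES = {"production"}

# Hypothetical role names: only these roles may open PRs against protected branches.
MERGE_ROLES = {"pm"}


def pr_allowed(base_branch, author_role):
    """Return True if a PR targeting base_branch may be opened by author_role."""
    if base_branch in PROTECTED_BRANCHES:
        return author_role in MERGE_ROLES
    # Staging, sprint, feature, and hotfix branches are open to everyone.
    return True


if __name__ == "__main__":
    print(pr_allowed("production", "developer"))  # prints False
    print(pr_allowed("staging", "developer"))     # prints True
    print(pr_allowed("production", "pm"))         # prints True
```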