Check whether the observed behavior is the same for other tenants, not just for a single one. The wider the dataset, the easier it becomes to isolate the problem.
Check for data bleed across tenants:
- Configuration mix-ups between tenants
- HTTP requests routed to the wrong tenant
- Message bus listeners consuming and processing data for the wrong tenant or for more than one tenant
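One way to look for bleed in practice is to compare the tenant a request or message was addressed to with the tenant whose context actually handled it. The following is a minimal Python sketch, assuming structured log records that carry both values. The field names `expected_tenant` and `resolved_tenant` are hypothetical and will differ per system.

```python
from collections import Counter
from typing import Iterable, Mapping


def find_tenant_bleed(records: Iterable[Mapping[str, str]]) -> Counter:
    """Count records whose resolved tenant differs from the expected one.

    The field names are hypothetical; adapt them to your own log schema.
    """
    mismatches: Counter = Counter()
    for record in records:
        expected = record.get("expected_tenant")
        resolved = record.get("resolved_tenant")
        # Skip records that lack either field instead of guessing.
        if expected is None or resolved is None:
            continue
        if expected != resolved:
            mismatches[(expected, resolved)] += 1
    return mismatches


# Illustrative data only, not taken from a real system.
sample_records = [
    {"expected_tenant": "tenant-a", "resolved_tenant": "tenant-a"},
    {"expected_tenant": "tenant-a", "resolved_tenant": "tenant-b"},
    {"expected_tenant": "tenant-b", "resolved_tenant": "tenant-b"},
    {"expected_tenant": "tenant-a", "resolved_tenant": "tenant-b"},
]
bleed = find_tenant_bleed(sample_records)
```

Each key in `bleed` is an (expected, resolved) pair, so a non-empty result points directly at which tenants are being mixed up and in which direction.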
Check end-to-end configurations:
- KeyVault secrets
- Environment variables
- AppSettings
- Configurations persisted in the Database
- In-memory configurations
- Framework default configurations
- Service default configurations
- Third-party library default configurations
- Configurations modified by the CI/CD pipelines
- Configurations modified dynamically (observable files, message listeners, REST APIs)
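With this many sources, it helps to see which one supplies the effective value of each setting. The sketch below is one possible way to do that in Python, assuming each source has already been loaded into a plain dictionary. The layer names, keys, and precedence order are illustrative assumptions; check how your own framework resolves overrides.

```python
from typing import Dict, List, Tuple

# A layer is (source_name, settings). Layers are listed from lowest to
# highest precedence; this ordering is an assumption for the example.
Layer = Tuple[str, Dict[str, str]]


def resolve_with_origin(layers: List[Layer]) -> Dict[str, Tuple[str, str]]:
    """Return {key: (effective_value, source_name)} for every key seen."""
    effective: Dict[str, Tuple[str, str]] = {}
    for source_name, values in layers:
        for key, value in values.items():
            # Later layers override earlier ones.
            effective[key] = (value, source_name)
    return effective


def find_conflicts(layers: List[Layer]) -> Dict[str, List[Tuple[str, str]]]:
    """Return keys that are set to different values by different sources."""
    seen: Dict[str, List[Tuple[str, str]]] = {}
    for source_name, values in layers:
        for key, value in values.items():
            seen.setdefault(key, []).append((source_name, value))
    return {
        key: entries
        for key, entries in seen.items()
        if len({value for _, value in entries}) > 1
    }


# Illustrative layers only.
layers: List[Layer] = [
    ("framework defaults", {"Timeout": "30", "RetryCount": "3"}),
    ("appsettings", {"Timeout": "60"}),
    ("environment variables", {"RetryCount": "3"}),
    ("database", {"Timeout": "15"}),
]
effective = resolve_with_origin(layers)
conflicts = find_conflicts(layers)
```

Running the same comparison for a healthy tenant and a misbehaving one narrows the search to the keys whose value or origin differs.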
The number one cause of inconsistent behavior is concurrency issues. These can be caused by multiple instances of the same service stepping on each other's toes:
- Multiple different versions deployed in the same cluster
- Hotspots where one instance receives too much of the traffic
- Working with unprotected shared resources (race conditions)
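The last item, an unprotected shared resource, can be illustrated with a per-tenant counter. The sketch below is a minimal Python example that guards the read-modify-write step with a lock. The class and function names are hypothetical, and the lock only protects threads inside one process. Multiple service instances need a shared mechanism instead, such as database-level concurrency control or a distributed lock.

```python
import threading
from typing import Dict, List


class TenantCounter:
    """Per-tenant counter guarded by a lock so concurrent updates are not lost."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}
        self._lock = threading.Lock()

    def increment(self, tenant_id: str) -> None:
        # Reading the old value and writing the new one is not atomic;
        # holding the lock keeps other threads out of this section.
        with self._lock:
            self._counts[tenant_id] = self._counts.get(tenant_id, 0) + 1

    def get(self, tenant_id: str) -> int:
        with self._lock:
            return self._counts.get(tenant_id, 0)


def run_workers(counter: TenantCounter, tenant_id: str,
                workers: int, per_worker: int) -> None:
    """Increment the counter from several threads and wait for all of them."""

    def work() -> None:
        for _ in range(per_worker):
            counter.increment(tenant_id)

    threads: List[threading.Thread] = [
        threading.Thread(target=work) for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


counter = TenantCounter()
run_workers(counter, "tenant-a", workers=8, per_worker=1000)
```

With the lock in place, the final count equals `workers * per_worker`. Removing the lock makes lost updates possible, although they may not show up on every run, which is exactly why this class of bug looks inconsistent.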
Use logs and metrics monitoring systems to observe the data flow and start isolating the issue from there.
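As a starting point, grouping the same log records by tenant and then by instance shows whether the problem follows a tenant or a specific instance, and whether traffic is unevenly spread. This is a minimal Python sketch, assuming records with hypothetical `tenant`, `instance`, and `status` fields and treating HTTP status 500 and above as an error.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple


def summarize(records: Iterable[Mapping[str, object]],
              key: str) -> Dict[str, Tuple[int, float]]:
    """Return {group: (request_count, error_rate)} grouped by the given field."""
    totals: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for record in records:
        group = str(record[key])
        totals[group] += 1
        # Assumption: status codes of 500 and above count as errors.
        if int(record["status"]) >= 500:
            errors[group] += 1
    return {group: (totals[group], errors[group] / totals[group])
            for group in totals}


# Illustrative records only; field names are assumptions.
log_records = [
    {"tenant": "tenant-a", "instance": "pod-1", "status": 200},
    {"tenant": "tenant-a", "instance": "pod-1", "status": 500},
    {"tenant": "tenant-b", "instance": "pod-1", "status": 200},
    {"tenant": "tenant-b", "instance": "pod-2", "status": 200},
]
by_tenant = summarize(log_records, "tenant")
by_instance = summarize(log_records, "instance")
```

An error rate concentrated in one tenant suggests tenant-specific data or configuration, while one concentrated in a single instance points toward deployment, version, or hotspot issues.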