@ccampanale
Last active September 12, 2022 13:42
Various details collected and analyzed while investigating compliance violation inconsistencies in Monarch Spaces.

Process Details

  • An event triggers the creation of a compliance resource update, which tracks the Spaces and Accounts to be processed and the result of the process.
  • There are N Compliance Resource worker nodes in the system that process resource updates. The work is fanned out by account, and each node updates all resources for a specific account.
  • The resource update queries the aggregated AWS Config resources for the account, transforms the data, and creates or updates a resource record in the system for each resource in the account.
  • Creating or updating a resource record emits an event, which triggers a process that reviews the ingested AWS Config rule evaluation results for the resource and creates, updates, or deletes violations for NON_COMPLIANT results.
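
The last step above can be read as a reconciliation between the evaluation results on a resource record and the violations already stored for that resource. The following is only a minimal sketch of that reading, not the actual Monarch implementation: the `Evaluation` and `Violation` types, their fields, and `reconcile_violations` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    """One rule evaluation result on a resource record (hypothetical shape)."""
    rule_name: str
    compliance_type: str  # e.g. "COMPLIANT" or "NON_COMPLIANT"
    description: str


@dataclass(frozen=True)
class Violation:
    """A stored violation for one rule on one resource (hypothetical shape)."""
    rule_name: str
    description: str


def reconcile_violations(evaluations, existing):
    """Return (to_create, to_update, to_delete) for a single resource.

    evaluations: list of Evaluation taken from the ingested resource record.
    existing: dict mapping rule_name -> Violation currently stored.
    """
    # Only NON_COMPLIANT results should be represented as violations.
    wanted = {
        e.rule_name: Violation(e.rule_name, e.description)
        for e in evaluations
        if e.compliance_type == "NON_COMPLIANT"
    }
    to_create = [v for name, v in wanted.items() if name not in existing]
    # An existing violation whose rule details differ must be refreshed.
    to_update = [
        v for name, v in wanted.items()
        if name in existing and existing[name] != v
    ]
    # Violations with no matching NON_COMPLIANT result are removed.
    to_delete = [v for name, v in existing.items() if name not in wanted]
    return to_create, to_update, to_delete
```

Under this reading, a violation that carries an empty description while the resource record carries a populated one would land in `to_update` the next time the resource record event is processed.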

Observations

  • Some violations in QA show incorrect compliance rule information, indicating that the rule lacks a description. However, the rule itself is queryable and has a description.
  • The resource record shows the evaluated results and the transformed results, both of which have the correct rule information and description.
  • Violations with incorrect rule information should be updated to correct the rule information based on the most recent data in the ingested resource record, but they are not.
  • These violations have updatedOn timestamps that are several days old.
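
The observations above suggest a simple check for affected violations: the rule description on the violation differs from the one on the resource record, and the violation's updatedOn lags well behind the resource record. The sketch below is one hedged way to express that check. Apart from `updatedOn`, every field name (`id`, `ruleName`, `ruleDescription`), the function name, and the one-day threshold are assumptions made for illustration, not the system's actual schema.

```python
from datetime import datetime, timedelta, timezone


def find_stale_violations(violations, resource_rules, resource_updated_on,
                          max_lag=timedelta(days=1)):
    """Return ids of violations that look stale relative to a resource record.

    violations: list of dicts with "id", "ruleName", "ruleDescription",
        and "updatedOn" (a datetime). Field names are hypothetical.
    resource_rules: dict mapping rule name -> description as found on the
        resource record's transformed results.
    resource_updated_on: datetime of the resource record's last update.
    max_lag: how far a violation may trail the resource record (assumed).
    """
    stale = []
    for v in violations:
        expected = resource_rules.get(v["ruleName"])
        if expected is None:
            # The rule is no longer on the resource record; not this symptom.
            continue
        mismatched = v.get("ruleDescription") != expected
        lagging = resource_updated_on - v["updatedOn"] > max_lag
        if mismatched and lagging:
            stale.append(v["id"])
    return stale
```

Running a check like this against QA and Prod separately would show whether the stale set is confined to QA.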

Infrastructure Resource Details

QA

  • space-compliance x2
    • 64 vCPU
    • 256 MB RAM
  • space-resource x4
    • 64 vCPU
    • 256 MB RAM

Prod

  • space-compliance x2
    • 64 vCPU
    • 1024 MB RAM
  • space-resource x4
    • 64 vCPU
    • 256 MB RAM

Violation Updates

QA

image

Prod

image

@ccampanale
Author

There are no associated errors in QA that would indicate any form of resource contention or message/event delivery failure. The only thing I can conclude at this time is that some form of contention is failing silently behind the scenes, perhaps related to the difference in memory between QA and Prod for the compliance nodes, which handle compliance violations, reports, and notifications.

@ccampanale
Author

Is there perhaps a difference in the organization setup or the AWS Config aggregation configuration between the test and production orgs?
