Skip to content

Instantly share code, notes, and snippets.

@kichik
Last active February 9, 2024 15:31
Show Gist options
  • Save kichik/7a2ecb0d36358c50c7b878ad9fd982bc to your computer and use it in GitHub Desktop.
Save kichik/7a2ecb0d36358c50c7b878ad9fd982bc to your computer and use it in GitHub Desktop.
CloudFormation template that stops RDS from automatically starting back up
# aws cloudformation deploy --template-file KeepDbStopped.yml --stack-name stop-db --capabilities CAPABILITY_IAM --parameter-overrides DB=arn:aws:rds:us-east-1:XXX:db:XXX
Description: Automatically stop RDS instance every time it turns on due to exceeding the maximum allowed time being stopped
Parameters:
DB:
Description: ARN of database that needs to be stopped
Type: String
AllowedPattern: arn:aws:rds:[a-z0-9\-]+:[0-9]+:db:[^:]*
MaxStartupTime:
Description: Maximum number of minutes to wait between database is automatically started and the time it's ready to be shut down. Extend this limit if your database takes a long time to boot up.
Type: Number
MinValue: 10
Default: 25
Resources:
DatabaseStopperFunction:
Type: AWS::Lambda::Function
Properties:
Role: !GetAtt DatabaseStopperRole.Arn
Runtime: python3.6
Handler: index.handler
Timeout: 20
Code:
ZipFile:
Fn::Sub: |
import boto3
import time
def handler(event, context):
print("got", event)
db = event["detail"]["SourceArn"]
id = event["detail"]["SourceIdentifier"]
message = event["detail"]["Message"]
region = event["region"]
rds = boto3.client("rds", region_name=region)
if message == "DB instance is being started due to it exceeding the maximum allowed time being stopped.":
print("database turned on automatically, setting last seen tag...")
last_seen = int(time.time())
rds.add_tags_to_resource(ResourceName=db, Tags=[{"Key": "DbStopperLastSeen", "Value": str(last_seen)}])
elif message == "DB instance started":
print("database started (and sort of available?)")
last_seen = 0
for t in rds.list_tags_for_resource(ResourceName=db)["TagList"]:
if t["Key"] == "DbStopperLastSeen":
last_seen = int(t["Value"])
if time.time() < last_seen + (60 * ${MaxStartupTime}):
print("database was automatically started in the last ${MaxStartupTime} minutes, turning off...")
time.sleep(10) # even waiting for the "started" event is not enough, so add some wait
rds.stop_db_instance(DBInstanceIdentifier=id)
print("success! removing auto-start tag...")
rds.add_tags_to_resource(ResourceName=db, Tags=[{"Key": "DbStopperLastSeen", "Value": "0"}])
else:
print("ignoring manual database start")
else:
print("error: unknown database event!")
DatabaseStopperRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement:
- Action:
- sts:AssumeRole
Effect: Allow
Principal:
Service:
- lambda.amazonaws.com
ManagedPolicyArns:
- arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
Policies:
- PolicyName: Notify
PolicyDocument:
Version: '2012-10-17'
Statement:
- Action:
- rds:StopDBInstance
Effect: Allow
Resource: !Ref DB
- Action:
- rds:AddTagsToResource
- rds:ListTagsForResource
- rds:RemoveTagsFromResource
Effect: Allow
Resource: !Ref DB
Condition:
ForAllValues:StringEquals:
aws:TagKeys:
- DbStopperLastSeen
DatabaseStopperPermission:
Type: AWS::Lambda::Permission
Properties:
Action: lambda:InvokeFunction
FunctionName: !GetAtt DatabaseStopperFunction.Arn
Principal: events.amazonaws.com
SourceArn: !GetAtt DatabaseStopperRule.Arn
DatabaseStopperRule:
Type: AWS::Events::Rule
Properties:
EventPattern:
source:
- aws.rds
detail-type:
- "RDS DB Instance Event"
resources:
- !Ref DB
detail:
Message:
- "DB instance is being started due to it exceeding the maximum allowed time being stopped."
- "DB instance started"
Targets:
- Arn: !GetAtt DatabaseStopperFunction.Arn
Id: DatabaseStopperLambda
@cagriar
Copy link

cagriar commented May 4, 2020

@kichik Thanks, your updates worked successfully!

@derianpt
Copy link

Hi, cool script! Just some thoughts

@kichik @cagriar Have you guys ever come across an RDS instance taking longer than 20 minutes to start (after AWS auto starts it)? Say 25 minutes.

In that case, this script will falsely exclude it from being stopped, right?

I was thinking maybe these lines:
https://gist.github.com/kichik/7a2ecb0d36358c50c7b878ad9fd982bc#file-keepdbstopped-yml-L43-L52

can be modified to:

tags = rds.list_tags_for_resource(ResourceName=db)["TagList"]

if is_started_by_aws(tags):
    print("database was automatically started, turning off...")
    time.sleep(10)
    # even waiting for the "started" event is not enough, so add some wait
    rds.stop_db_instance(DBInstanceIdentifier=id)

    print("success! removing auto-start tag...")
    rds.remove_tags_from_resource(ResourceName=db, TagKeys=["DbStopperLastSeen"])
else:
    print("ignoring manual database start")


def is_started_by_aws(tags):
    """
    Checks if a RDS instance was auto started by AWS
    :param tags: List of Tags configured for RDS instance
    :return: False if the resource has "DbStopperLastSeen" tag, True otherwise
    """
    for tag in tags:
        if tag["Key"].lower() == "DbStopperLastSeen".lower():
            return True

    return False

This way:

  1. It won't matter how long the RDS instance takes to be available.
  2. We don't need to use the time.time() UNIX timestamp value stored in last seen tag. It will just be an identifier added by the earlier lambda run.

Thoughts?

@kichik
Copy link
Author

kichik commented Oct 18, 2020

I wanted to put a time limit on it in case we ever miss the message, the user turns off the DB before we do, the tag fails to set, or basically any unforeseen issue. I'll turn it into a stack parameter, but I still want to keep it in place.

@regevbr
Copy link

regevbr commented Jan 24, 2021

Please notice that for Aurora this will not work and you need to operate on the cluster level.
@kichik thoughts?
Here is the modified (not yet tested) version:

# aws cloudformation deploy --template-file KeepDbStopped.yml --stack-name stop-db --capabilities CAPABILITY_IAM --parameter-overrides DBCluster=arn:aws:rds:us-east-1:XXX:cluster:XXX
Description: Automatically stop RDS Aurora cluster every time it turns on due to exceeding the maximum allowed time being stopped
Parameters:
  DBCluster:
    Description: ARN of database cluster that needs to be stopped
    Type: String
    AllowedPattern: arn:aws:rds:[a-z0-9\-]+:[0-9]+:cluster:[^:]*
  MaxStartupTime:
    Description: Maximum number of minutes to wait between database is automatically started and the time it's ready to be shut down. Extend this limit if your database takes a long time to boot up.
    Type: Number
    MinValue: 10
    Default: 25
Resources:
  DatabaseStopperFunction:
    Type: AWS::Lambda::Function
    Properties:
      Role: !GetAtt DatabaseStopperRole.Arn
      Runtime: python3.6
      Handler: index.handler
      Timeout: 20
      Code:
        ZipFile:
          Fn::Sub: |
            import boto3
            import time

            def handler(event, context):
              print("got", event)
              db = event["detail"]["SourceArn"]
              id = event["detail"]["SourceIdentifier"]
              message = event["detail"]["Message"]
              region = event["region"]
              rds = boto3.client("rds", region_name=region)

              if message == "Cluster instance is being started due to it exceeding the maximum allowed time being stopped.":
                print("database turned on automatically, setting last seen tag...")
                last_seen = int(time.time())
                rds.add_tags_to_resource(ResourceName=db, Tags=[{"Key": "DbStopperLastSeen", "Value": str(last_seen)}])

              elif message == "Cluster instance started":
                print("database started (and sort of available?)")

                last_seen = 0
                for t in rds.list_tags_for_resource(ResourceName=db)["TagList"]:
                  if t["Key"] == "DbStopperLastSeen":
                    last_seen = int(t["Value"])

                if time.time() < last_seen + (60 * ${MaxStartupTime}):
                  print("database was automatically started in the last ${MaxStartupTime} minutes, turning off...")
                  time.sleep(10)  # even waiting for the "started" event is not enough, so add some wait
                  rds.stop_db_cluster(DBClusterIdentifier=id)

                  print("success! removing auto-start tag...")
                  rds.add_tags_to_resource(ResourceName=db, Tags=[{"Key": "DbStopperLastSeen", "Value": "0"}])

                else:
                  print("ignoring manual database start")

              else:
                print("error: unknown database event!")
  DatabaseStopperRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Action:
              - sts:AssumeRole
            Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: Notify
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Action:
                  - rds:StopDBCluster
                Effect: Allow
                Resource: !Ref DBCluster
              - Action:
                  - rds:AddTagsToResource
                  - rds:ListTagsForResource
                  - rds:RemoveTagsFromResource
                Effect: Allow
                Resource: !Ref DBCluster
                Condition:
                  ForAllValues:StringEquals:
                    aws:TagKeys:
                      - DbStopperLastSeen
  DatabaseStopperPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !GetAtt DatabaseStopperFunction.Arn
      Principal: events.amazonaws.com
      SourceArn: !GetAtt DatabaseStopperRule.Arn
  DatabaseStopperRule:
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source:
          - aws.rds
        detail-type:
          - "RDS Cluster Instance Event"
        resources:
          - !Ref DBCluster
        detail:
          Message:
            - "Cluster instance is being started due to it exceeding the maximum allowed time being stopped."
            - "Cluster instance started"
      Targets:
        - Arn: !GetAtt DatabaseStopperFunction.Arn
          Id: DatabaseStopperLambda

@kichik
Copy link
Author

kichik commented Jan 24, 2021

@regevbr your code looks like it would work. Thanks! I think using Aurora Serverless might work too.

@67lisbon
Copy link

67lisbon commented Feb 17, 2022

@kichik it doesnt work for me when I run a test event in Lambda. I get this error

17 Feb 2022 14:09 [INFO] (/var/runtime/bootstrap.py) main started at epoch 1645106978694
17 Feb 2022 14:09 [INFO] (/var/runtime/bootstrap.py) init completed at epoch 1645106978694
got {'key1: 'value1', 'key2': 'value2', 'key3': 'value3'}
'SourceArn' : KeyError
Traceback (most recent call last):
File "/var/task/index.py", line 6, in handler
db = event["SourceArn"]
KeyError: 'SourceArn'

This also happens for ["detail"] in db = event["detail"]["SourceArn"]

I have only run this through Lambda as a 'Test' I configured on the Lambda. I have not tested this yet by using the Event Rule that listens for the message.

In the AWS cli if I run the command rds describe-events against my RDS Cluster I can see the following under 'Events' SourceIdentifier, SourceType, SourceArn and Message

@Klohto
Copy link

Klohto commented Feb 17, 2022

Run it against actual event test data.
Do you see {'key1: 'value1', 'key2': 'value2', 'key3': 'value3'} containing SourceArn?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment