Answers

Question 1: Optimize the Pipeline

For the first problem, one option is to separate the OWASP Dependency Check from the main pipeline and run it in a scheduled job, because it adds a lot of extra time as it downloads and scans the historical vulnerability database. Along with that, use the CI/CD tool's caching mechanism, depending on the tool (GitLab CI's cache, for example).

Together with the first option, the build step can be separated from unit tests and coverage so the team gets faster feedback and can focus on debugging the step that's causing the problem. Also, depending on the stage the project is at, unit tests can be skipped entirely if the code is being rewritten frequently and the team needs much faster feedback; they can be reintroduced once the code is stable enough to face unit tests and follow a test-driven development process.

Also, security and integration tests shouldn't come before unit tests, as unit tests are the most basic ones. It's also a good approach to run code quality checks after unit tests, especially if there are good metrics defined for code quality. A restructured pipeline could look roughly like the sketch below.
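
As a rough illustration, a restructured GitLab CI pipeline could look something like this (the stage names, script commands and cache paths are assumptions, not taken from the original pipeline):

stages:
  - build
  - unit-test
  - code-quality
  - integration-test
  - security

# Cache dependencies between pipeline runs (the path depends on the build tool)
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - .m2/repository/

build:
  stage: build
  script:
    - ./scripts/build.sh              # placeholder build command

unit-tests:
  stage: unit-test
  script:
    - ./scripts/unit-tests.sh         # placeholder unit test + coverage command

dependency-check:
  stage: security
  rules:
    - if: '$CI_PIPELINE_SOURCE == "schedule"'   # run only from a scheduled pipeline
  script:
    - ./scripts/owasp-dependency-check.sh       # placeholder scan command

A pipeline schedule in GitLab then triggers the dependency check nightly (or at whatever interval suits the team) instead of blocking every commit.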

Another option is to use Docker's --cache-from to reuse layers from an existing image: pull the existing image first, then pass it to the build so its layers can be used as a cache.
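
For example (the registry and image names are illustrative):

# Pull the previous image if it exists, then reuse its layers as a build cache
docker pull registry.example.com/myapp:latest || true
docker build --cache-from registry.example.com/myapp:latest -t registry.example.com/myapp:latest .

(When building with BuildKit, the cached image needs to have been built with BUILDKIT_INLINE_CACHE=1 for its layers to be reusable.)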

Using caching mechanisms and separating time-consuming steps from the main pipeline can hugely decrease the time it takes for the team to get feedback and start debugging.

Question 2: Infrastructure as Code

I would use the following strategy (a strategy that I have been using for quite some time now):

  1. Store the Terraform code in Git, in a separate repository; it's better not to keep it alongside the project itself since the Terraform code won't change as much as the project does after some time. Also, definitely keep the Terraform state, variable files, logs and any other file that could contain sensitive info out of the repository (e.g. via .gitignore).

  2. Every developer should create their own branch for each task and then open a PR with the change against the main release branch for static analysis and review.

  3. Have a single release branch to prevent any unwanted deployments or rollbacks.

  4. Store the Terraform state in a remote backend; this can be any supported storage (an S3 bucket, for example). I'd personally recommend Terraform Cloud, as it has been under heavy development recently and it can also handle state locking natively, without the need for a separate lock database.
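
As an example, a minimal S3 remote backend could be configured roughly like this (the bucket, key, region and table names are assumptions; the DynamoDB table provides the state locking that Terraform Cloud would otherwise handle natively):

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"      # assumed state bucket
    key            = "prod/terraform.tfstate"  # assumed state file path
    region         = "eu-west-1"               # assumed region
    dynamodb_table = "terraform-locks"         # assumed table used for state locking
    encrypt        = true
  }
}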

Question 3: Security

There are a couple of different ways to handle this depending on where the APIs are hosted.

Since the question clearly states firewall rules, one option is to use AWS security groups and WAF, for example. I would use the WAF console or the CLI to create a web ACL that would then contain my combination of WAF rules.

But again, it depends on where the APIs are hosted: if they run on Kubernetes, for example, I would use ingress controllers or a service mesh such as Istio to get better control over which specific applications can be accessed both internally and externally.
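
For instance, with the NGINX ingress controller, access can be limited to specific source IP ranges through an annotation (the hostname, service name and CIDR blocks below are purely illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: internal-api
  annotations:
    # Only these source ranges may reach the service through this ingress
    nginx.ingress.kubernetes.io/whitelist-source-range: "10.0.0.0/24,192.168.10.0/24"
spec:
  rules:
    - host: api.example.internal
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: internal-api
                port:
                  number: 80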

Then again, if the APIs are hosted on AWS Lambda, for example, I would put an API Gateway in front of them and manage access using access controls/resource policies; the same approach works when API Gateway fronts other resources.

Example resource policy using AWS API Gateway

{
  "Version": "2012-10-17",
  "Statement": [{
      "Effect": "Allow",
      "Principal": "*",
      "Action": "execute-api:Invoke",
      "Resource": "execute-api:/*/*/*"
    },
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "execute-api:Invoke",
      "Resource": "execute-api:/*/*/*",
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": ["sourceIpOrCIDRBlock", "sourceIpOrCIDRBlock"]
        }
      }
    }
  ]
}

To test the rules and check access, if not using an online API connectivity test tool, I would use a script that curls/telnets the endpoints and prints the results, or use a tool such as https://github.com/nmap/nmap to test the firewall configuration.
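
A rough sketch of such a check (the endpoints, ports and script name are assumptions):

#!/usr/bin/env bash
# check-endpoints.sh: probe each endpoint and report the HTTP status code (000 = no connection)
for url in https://api.internal.example.com/health https://api.public.example.com/health; do
  code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 5 "$url")
  echo "$url -> HTTP $code"
done

# nmap can additionally confirm which ports are reachable from the current network
nmap -Pn -p 443 api.internal.example.com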

Question 4: Docker

Since this Dockerfile starts from a Python image (the FROM instruction creates the base layer), the following steps should install Python packages using pip, not apt-get.

Copying the app files should be done after the pip installations, so the COPY . . statement should come after the boto3 installation; that way the dependency layers stay cached when only the application code changes.

And since virtualenv is a pip package, the first RUN command should be RUN pip install virtualenv.

Also, if more than one package is installed and multiple commands are run, the virtualenv should be used properly by setting the PATH environment variable so that it applies to all subsequent Python commands in the Dockerfile, since each RUN statement uses a separate shell.

Also, to follow Python convention, dependencies should be listed in a requirements.txt and referenced in the Dockerfile for installation with pip, e.g. RUN pip3 install -r requirements.txt.

Also, http.server should be started in the CMD instead of bash, and run in the foreground without &, so it should be CMD python -m http.server.
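
Putting these fixes together, a corrected Dockerfile could look roughly like this (the base image, port and dependency list are assumptions, since the original Dockerfile isn't reproduced here; a virtualenv is usually unnecessary inside a container, so it is omitted):

FROM python:3.9-slim
WORKDIR /app

# Install dependencies first so this layer stays cached when only the app code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt   # requirements.txt would list boto3 etc.

# Copy the application files after the dependencies are installed
COPY . .

EXPOSE 8000
# Start the server in the foreground so the container keeps running
CMD ["python", "-m", "http.server", "8000"]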

  1. Each RUN and COPY statement creates its own layer, so it should be 4 layers; for a more detailed view of the layers, the dive tool can be used: https://github.com/wagoodman/dive

But since this Dockerfile would fail to build, dive would have nothing to inspect; it's best to fix the issues mentioned above and then run it.

Question 5: AWS IAM Policy

  1. This IAM policy grants the following:
  • Programmatic read-write access to the company-data bucket.
  • Permission to run any revision of a specific task definition (update-tables:*) on a specific cluster, which is prod in this case.
  2. To allow access only from a specific IP address, a policy like the following can be used:
{
  "Id": "SourceIP",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SourceIP",
      "Action": "s3:*",
      "Effect": "Deny",
      "Resource": [
        "arn:aws:s3:::company-data",
      ],
      "Condition": {
        "NotIpAddress": {
          "aws:SourceIp": [
            "IP address or IP range", 
          ]
        }
      },
      "Principal": "*"
    }
  ]
}

Question 6: Infrastructure

Please note that, given the limited amount of information for this project and not knowing the nature and structure of the app and services, I have included three versions of Phase 1, depending on whether the app is containerizable and whether there will be a major change in the app. Please view the diagrams and descriptions with this in mind. Also, I tried to avoid extra details such as caching, notification, mail and monitoring services, as those are subject to change with various factors.

Phase 1. Migration to the Cloud.

A. At this stage I would plan a very basic migration to the cloud that could be deployed in a short time, especially if there aren't going to be any changes to the application.

B. If the app is subject to minor changes, then this structure will give the development team some degree of leverage.

C. If the app is subject to major changes, then the development team will need more flexibility to make changes to the app (this seems the better option as per the description given).

With this AWS 3-tier architecture the development team will have all the elasticity it needs to make changes to the application.

  • Serving the app through a CDN backed by an S3 bucket, which can be replicated to different zones.
  • Using Amazon Cognito to provide an easy sign-in/sign-up and user authorization experience, as it's AWS's standard offering for this.
  • Using Amazon API Gateway to serve requests through a secure API gateway, which is standard in modern cloud infrastructure; it makes it easy to funnel API requests at any scale and easier for the developers to build modern APIs.
  • Using AWS Lambda to handle light requests with great performance (and low cost, as that is one of the points in the project description).
  • Using an S3 bucket to store the Lambda function packages and sending metrics to Amazon CloudWatch for observability.
  • Using Amazon Aurora (with DynamoDB as a non-relational alternative) as the relational database. Chosen because it is a simple service, high performance, and cost-effective.

Instead of AWS Lambda, if the application is Dockerizable then Amazon ECS or EKS can be used as a strong alternative, especially if the application has heavy, high-load services that need to run. Or a hybrid infrastructure can be used to serve services of different load intensity.

Phase 2. Global Scalability

Replicating the infrastructure from Phase 1-B to 4 different regions, with a central S3 bucket in one region to store data globally and keep latency optimized. Since each region has its own CDN (CloudFront), only the first request that caches the data will see somewhat higher latency, because the website content has to be retrieved from the central storage and cached in the CDN. Users at that point can experience a slightly longer load time for media; once cached, subsequent users will not notice a difference.

Phase 3. Security

To keep the diagram simple and clear, I chose to list here the options that can be added at this stage to enhance security.

  • As a go-to service, WAF can be used to inspect requests passing through the app and the APIs and to prevent possible malicious activity from causing performance and availability issues.
  • Additionally, in a zone-redundant structure, placing critical services such as closed backend data transfers or the admin panel inside dedicated VPCs.
  • Running each availability zone and resource block in a separate subnet, and having them connect to each other securely only where necessary, will further enhance security.
  • Using security groups, network ACLs and IAM to have more fine-grained control over data and user access.

Bonus

Creating AMI templates for the infrastructure, especially the infrastructure in the Phase 3 diagram, will greatly help in replicating the availability zones across different regions.

Reference

To prepare the diagrams in #6 I referred to AWS architecture templates and best practices, as well as Azure architecture templates and best practices.
https://aws.amazon.com/architecture/
https://docs.microsoft.com/en-us/azure/architecture/browse/
Shareable Lucid Chart for the diagrams listed above:
https://lucid.app/lucidchart/30d9fcd4-1569-4d57-b7fb-b2258a152fa4/view
