Mediavine Infrastructure Engineer Questionnaire

Instructions

  • Fork this gist.
  • Please respond to the prompts below, uploading additional files if necessary.
  • Reply back to the email you were sent with the link to your completed gist.

*** I just wanted to start off by saying I always recommend using an AWS Lambda pipeline together with a variety of other serverless components on AWS for anything like the below.

  • Describe an application hosted in a public cloud that you’ve been responsible for configuring, maintaining, designing, or deploying. What were some challenges that arose and what tools or processes did you apply to solve them?

The application was a reminder system for veterinary appointments. The system used Lambdas, SQS, API Gateway, and a series of Aurora RDS databases. Some of the challenges arose around invalid email addresses and phone numbers. Verifying that a reminder had actually been sent to the customer could sometimes be difficult because different cell/email providers are hard to work with; they can flag mass amounts of notices coming from one place as spam. We used Twilio and SendGrid to send out the notices to customers, and using a webhook with an API built on AWS API Gateway we were able to have customers confirm appointments by replying to text messages or emails.
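As a rough illustration of the confirmation webhook (not the original code), a Lambda behind an API Gateway proxy integration could handle Twilio's inbound-SMS callback something like this. The `APPOINTMENTS` dict stands in for the Aurora lookup, and all names here are assumptions:

```python
from urllib.parse import parse_qs

# Stand-in for the Aurora appointment lookup; keys are the customer's number.
APPOINTMENTS = {"+15551234567": {"id": 42, "status": "pending"}}

def handler(event, context):
    # Twilio posts form-encoded fields such as From and Body.
    params = parse_qs(event.get("body", ""))
    sender = params.get("From", [""])[0]
    body = params.get("Body", [""])[0].strip().upper()

    appt = APPOINTMENTS.get(sender)
    if appt and body in ("YES", "CONFIRM", "C"):
        appt["status"] = "confirmed"  # in production: UPDATE the Aurora row
        reply = "Thanks! Your appointment is confirmed."
    else:
        reply = "Reply YES to confirm your appointment."

    # Twilio expects a TwiML response body.
    twiml = f"<Response><Message>{reply}</Message></Response>"
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/xml"},
        "body": twiml,
    }
```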

  • Please describe one of the applications of automation that you’re most proud of or excited about that solved a need around a cloud/infrastructure issue or requirement.
  • Created a system that was responsible for getting customers the best price available for luxury apartments. We first had to get updated or new CSV files from an SFTP server containing data such as available move-in dates, prices for different dates, available units, buildings, etc.

We first checked an S3 bucket in AWS to see if the CSV file existed. If it did not, we uploaded the file to the bucket. If the file existed, we compared the MD5 hash of the file on the server against the file in the bucket to see if the data did indeed change. Once we verified the data had been updated, we would allow the file in the bucket to be replaced. If no changes had been made, we stopped the pipeline.

Upon a file being uploaded to the bucket, a trigger set on the bucket would fire off a Lambda that checked for an existing table in DynamoDB. If the table existed, we would delete it so no stagnant data would remain. Upon successful deletion of the table, a message would be sent to SQS, triggering another Lambda that created a new table with ID keys and any needed sort keys.
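Collapsed into one function for illustration (the real pipeline split delete and create across two Lambdas linked by SQS), the drop-and-recreate step might look like this. Table and key names are assumptions, and the boto3 DynamoDB client is passed in so the logic can be tested with a stub:

```python
def recreate_table(dynamodb, table_name="apartment-prices"):
    """Delete the table if it exists, then create it fresh.

    `dynamodb` is expected to be a boto3 DynamoDB client,
    e.g. boto3.client("dynamodb").
    """
    if table_name in dynamodb.list_tables()["TableNames"]:
        dynamodb.delete_table(TableName=table_name)
        # Deletion is async; wait until the old table is actually gone.
        dynamodb.get_waiter("table_not_exists").wait(TableName=table_name)

    dynamodb.create_table(
        TableName=table_name,
        KeySchema=[
            {"AttributeName": "unit_id", "KeyType": "HASH"},      # ID key
            {"AttributeName": "move_in_date", "KeyType": "RANGE"},  # sort key
        ],
        AttributeDefinitions=[
            {"AttributeName": "unit_id", "AttributeType": "S"},
            {"AttributeName": "move_in_date", "AttributeType": "S"},
        ],
        BillingMode="PAY_PER_REQUEST",
    )
```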

Once the table was created, we would pull the S3 file from the bucket and run some data integrity checks to make sure that no rows were incomplete or had incorrect data types.
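An illustrative version of those integrity checks, assuming column names since the real CSV schema isn't shown here; it separates rows that pass from a list of error descriptions:

```python
import csv
import io

# Hypothetical required columns for the apartment-pricing CSV.
REQUIRED = ("unit_id", "move_in_date", "price")

def validate_rows(csv_text):
    """Return (good_rows, errors) for the given CSV text."""
    good, errors = [], []
    # start=2 because row 1 of the file is the header line.
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        missing = [c for c in REQUIRED if not row.get(c)]
        if missing:
            errors.append(f"row {i}: missing {missing}")
            continue
        try:
            row["price"] = float(row["price"])  # type check: price is numeric
        except ValueError:
            errors.append(f"row {i}: bad price {row['price']!r}")
            continue
        good.append(row)
    return good, errors
```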

When the new data had been fully inserted into the table, a message was sent to an SQS queue letting our pipeline know that it was ready for the next step.

The next step was to call out to an external SOAP API. Using the data from the previous step, we could get the missing data that was not included in the CSV file from the SFTP server. After getting the data from the SOAP API, we would convert it into a readable format for updating the relevant table in DynamoDB. Once that was complete, we would send a message to the next step in our pipeline via SQS containing the data needed to update the table.

Once all data in the DynamoDB table was complete, we would hit an API in Drupal to update the database there, which in turn updated the data for the frontend site.

On the site there was a "move-in calendar" that a user could click on to get the best prices for move-in dates and other such things. The data that populated it came from a custom API built with AWS API Gateway that would query our DynamoDB table, do some checks against holidays and other factors, and then return all the information the user needed.

For any errors or warnings that happened to occur, I created a custom Lambda that received streamed CloudWatch logs and matched anything containing "ERROR" or "WARNING". If those were found, it sent an SNS message to a topic that could send Slack notifications, send emails, or hit an API that sent push notifications to an Android app I created for my personal use covering a few essential AWS services, so that I could respond to them even if out and about and not by my laptop.
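A hedged sketch of that alerting Lambda: CloudWatch Logs subscriptions deliver events gzip-compressed and base64-encoded under `event["awslogs"]["data"]`. The SNS topic ARN is a placeholder, and boto3 is imported lazily so the matching logic is testable without AWS credentials:

```python
import base64
import gzip
import json

KEYWORDS = ("ERROR", "WARNING")

def matching_messages(event):
    """Decode a CloudWatch Logs subscription event and return matching lines."""
    payload = json.loads(
        gzip.decompress(base64.b64decode(event["awslogs"]["data"]))
    )
    return [
        e["message"]
        for e in payload.get("logEvents", [])
        if any(k in e["message"] for k in KEYWORDS)
    ]

def handler(event, context):
    hits = matching_messages(event)
    if hits:
        import boto3  # lazy import keeps the matcher testable offline
        boto3.client("sns").publish(
            TopicArn="arn:aws:sns:us-east-1:123456789012:alerts",  # placeholder
            Subject="Pipeline alert",
            Message="\n".join(hits),
        )
    return {"matched": len(hits)}
```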

  • There are more things involved in this pipeline of services, but I felt this was getting too long to read already.

  • As an example of a project we recently completed, how would you design the infrastructure for a web service that:

    1. Receives thousands of requests per second
    2. Should be protected against regional/geographical outages
    3. Allows for A/B testing of different backends
    4. Returns highly dynamic data
    5. Needs to perform some asynchronous tasks interacting with external resources (data imports)
    6. Needs fairly static data from external db(s) to process each request
  • Given the above design, how would you configure the following. Answers can be as detailed as you'd like, or conceptual in nature:

    1. Deployments of the web service (and potential a/b tests)
    2. Deployments/Changes to the core infrastructure

I would use AWS API Gateway, which can easily handle that kind of requests-per-second workload, and use Route 53 to route the traffic between regions. Currently, the default API endpoint type in API Gateway is the edge-optimized endpoint. This will not work for us, so by using the newer regional API endpoint in API Gateway we move the API endpoint into the region, and the custom domain name is unique per region. This makes it possible to run a full copy of our API in each region, and then, using Route 53 with latency-based routing and health checks, we can achieve an active-active setup with failover.
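A sketch of the Route 53 side of this, assuming zone IDs, domain names, and health-check IDs: one latency-based record per region pointing at that region's API Gateway custom domain, each tied to a health check so Route 53 fails over away from an unhealthy region. The client is passed in so the record builder can be exercised without AWS:

```python
def latency_record(region, target, health_check_id):
    """Build one latency-based record for a regional API endpoint."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "api.example.com",      # placeholder domain
            "Type": "CNAME",
            "SetIdentifier": region,         # must be unique per record
            "Region": region,                # enables latency-based routing
            "TTL": 60,
            "HealthCheckId": health_check_id,
            "ResourceRecords": [{"Value": target}],
        },
    }

def apply_records(route53, zone_id, regions):
    """regions: iterable of (region, regional_domain_target, health_check_id)."""
    changes = [latency_record(r, t, h) for r, t, h in regions]
    route53.change_resource_record_sets(
        HostedZoneId=zone_id, ChangeBatch={"Changes": changes}
    )
```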

I tend to like CircleCI for my testing, as I can create a custom Docker image not only to run any tests we want but also to deploy to AWS using the Serverless Framework. I really like the Serverless Framework because it makes it seamless to write a YAML file that will easily create any infrastructure we need. With this we are also able to quickly make a test environment, or quickly spin up another complete stack that we know is exactly the same for everyone who deploys it or makes changes. This makes it easy to hook up to GitHub so that any time a merge to master happens, it not only runs the tests but also deploys to AWS.
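A minimal `serverless.yml` sketch of that setup, with service and function names assumed; the `stage` option is what lets the same file spin up an isolated test stack (e.g. `serverless deploy --stage test`), and `endpointType: REGIONAL` matches the regional-endpoint approach above:

```yaml
service: api-service          # placeholder service name
provider:
  name: aws
  runtime: python3.9
  stage: ${opt:stage, 'dev'}     # per-stage stacks: dev, test, prod
  region: ${opt:region, 'us-east-1'}
  endpointType: REGIONAL         # regional API Gateway endpoint
functions:
  api:
    handler: handler.handler     # assumed module/function name
    events:
      - http:
          path: /{proxy+}
          method: any
```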

Any kind of interaction with external resources can be achieved quite easily by using Lambda. The external database type chosen would have an effect on what exactly I would do; however, no matter what, it should be simple.

In order to say exactly what I would do, I would like more information, but if I had it how I wanted I would use AWS all the way for as much as we can. By using AWS, and specifically Lambda, we can easily scale to any number of requests. I'm typically not a fan of requiring much to run on a server that is always on. By using Lambda along with SQS or similar, we can scale up or down, saving us money and the time spent configuring servers needlessly.

  • Please diagram the infrastructure stack you'd choose for an environment with the following needs:
    1. api gateway
    2. authentication layer/gateway
    3. some high-throughput services that return fairly static data
    4. services that require authentication, and could vary in load substantially

Link to diagram : https://miro.com/app/board/o9J_kvQ4q2w=/

I've included a link to the Miro board as I feel it's much easier to view the diagram this way, with the ability to zoom in and have a clearer view. If you would like, I can include an image here as well.
