
@seahyc
Last active September 1, 2022 09:01
Change of email

Good Day,

Thank you for applying for the position of Data Engineer at Glints. The following describes the Technical Assessment requirements for this position.

Problem Set

A key part of a Data Engineer’s responsibilities is maintaining the serviceability of the Data Warehouse. To achieve this, you will need to understand the setup and operation of a basic Data Warehouse.

In this technical assessment, you are required to submit the setup for the data pipeline of a basic data warehouse using Docker and Apache Airflow.

Your final setup should include the following:

  • A source Postgres database (Database X)
  • A target Postgres database (Database Y, running in a separate Docker container from Database X)
  • Apache Airflow with its webserver accessible at localhost:5884
  • A Directed Acyclic Graph (DAG) for transferring the content of Source Database X to Target Database Y
  • README.md detailing the usage of your submission
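As an illustration only, the transfer DAG listed above might be sketched as follows. This assumes Airflow 2.x with the Postgres provider package installed. The connection IDs `database_x` and `database_y`, the table name `sales`, and the DAG and task IDs are hypothetical names chosen for the example, not part of the requirement. The copy step is a full refresh, which is one possible design among several.

```python
from datetime import datetime


def copy_table(src_conn, dst_conn, table, columns, placeholder="%s"):
    """Copy all rows of `table` between two DB-API 2.0 connections.

    Full refresh: the target table is emptied first, so re-triggering
    the DAG does not duplicate rows. Returns the number of rows copied.
    """
    col_list = ", ".join(columns)
    marks = ", ".join([placeholder] * len(columns))

    src_cur = src_conn.cursor()
    src_cur.execute(f"SELECT {col_list} FROM {table}")
    rows = src_cur.fetchall()

    dst_cur = dst_conn.cursor()
    dst_cur.execute(f"DELETE FROM {table}")
    dst_cur.executemany(
        f"INSERT INTO {table} ({col_list}) VALUES ({marks})", rows
    )
    dst_conn.commit()
    return len(rows)


def transfer_sales():
    """Task callable: copy the hypothetical `sales` table from X to Y."""
    # Imported here so copy_table() can be tested without Airflow installed.
    from airflow.providers.postgres.hooks.postgres import PostgresHook

    # "database_x" / "database_y" are assumed Airflow connection IDs.
    src = PostgresHook(postgres_conn_id="database_x").get_conn()
    dst = PostgresHook(postgres_conn_id="database_y").get_conn()
    try:
        copy_table(src, dst, "sales", ["id", "creation_date", "sale_value"])
    finally:
        src.close()
        dst.close()


def build_dag():
    """Build a manually triggered DAG with a single copy task."""
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="transfer_x_to_y",  # hypothetical DAG ID
        start_date=datetime(2022, 1, 1),
        schedule_interval=None,  # Airflow 2.x argument: manual trigger only
        catchup=False,
    ) as dag:
        PythonOperator(task_id="copy_sales", python_callable=transfer_sales)
    return dag
```

For Airflow to discover the DAG, a file in the DAGs folder would need to assign the result at module level, for example `dag = build_dag()`. The `placeholder` argument exists only so that `copy_table` can be exercised against other DB-API drivers; the default `%s` is the style psycopg2 uses.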

As the focus of this technical assessment is on the setup of a basic data pipeline using Airflow, you may decide the content of the table to be transferred from Source Postgres Database X. It can be as basic as:

id  creation_date  sale_value
0   12-12-21       1000
1   13-12-21       2000
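One way to load the sample rows above into Database X is to generate a SQL seed file; the official postgres Docker image runs `.sql` files placed in `/docker-entrypoint-initdb.d` on first start. The sketch below is only an illustration. The table name `sales`, the column types, and the reading of the dates as DD-MM-YY are assumptions, since the brief does not specify them.

```python
from datetime import datetime

# Sample rows from the brief: (id, creation_date, sale_value).
SAMPLE_ROWS = [
    (0, "12-12-21", 1000),
    (1, "13-12-21", 2000),
]


def build_seed_sql(rows, table="sales"):
    """Return SQL that creates `table` and inserts `rows`.

    Dates are assumed to be DD-MM-YY and are rewritten as ISO dates,
    which Postgres parses unambiguously.
    """
    lines = [
        f"CREATE TABLE IF NOT EXISTS {table} (",
        "    id INTEGER PRIMARY KEY,",
        "    creation_date DATE NOT NULL,",
        "    sale_value INTEGER NOT NULL",
        ");",
    ]
    for row_id, raw_date, value in rows:
        iso_date = datetime.strptime(raw_date, "%d-%m-%y").date().isoformat()
        lines.append(
            f"INSERT INTO {table} (id, creation_date, sale_value) "
            f"VALUES ({row_id}, '{iso_date}', {value});"
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Redirect the output to a file, e.g. init.sql, and mount it into the container.
    print(build_seed_sql(SAMPLE_ROWS), end="")
```

The target Database Y would need the same table definition, without the inserted rows, for the transfer to have somewhere to write.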

Submission Requirement

A public Git repository containing minimally:

  • docker-compose.yml
  • README.md explaining your setup, with instructions for running it and the credentials for inspecting the final Target Database Y
  • Any scripts required to run the setup in a Linux environment

Any Docker images required for your setup should be stored on a public Docker repository. Your setup will be assessed in a Linux environment. The DAG will be triggered manually via the Airflow web interface at localhost:5884. Subsequently, the content of Target Database Y will be inspected.
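Before submitting, it may help to run the same kind of inspection yourself after triggering the DAG. The following is a minimal sketch of such a check, not the assessors' actual procedure. It works with any two DB-API 2.0 connections; the table name and the `id` ordering column are assumptions carried over from the sample table.

```python
def table_rows(conn, table, order_by="id"):
    """Return all rows of `table`, sorted by `order_by`, as a list of tuples."""
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {table} ORDER BY {order_by}")
    return cur.fetchall()


def tables_match(src_conn, dst_conn, table, order_by="id"):
    """True if the source and target tables hold identical rows.

    Relies on both tables declaring their columns in the same order,
    because the comparison uses SELECT *.
    """
    return table_rows(src_conn, table, order_by) == table_rows(
        dst_conn, table, order_by
    )
```

With psycopg2 installed, the two connections could be opened with `psycopg2.connect(...)` using the host, port, and credentials documented in your README.md.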

Please try to limit the time you spend on this assessment to 2 days (1 weekend). When you are done (or should you have any queries), email your repository URL to geraldine@glints.com.
