@kavehmz
Last active May 6, 2022 04:53

Skill Test

Please submit your answer as a private GitHub gist.

We expect this task to take no more than 3 hours.

ETL process

Please create a process that downloads a sample data set from the following location and appends it to a database table.

https://data.cms.gov/provider-data/archived-data/doctors-clinicians

Criteria:

  • Your process should read the name of the file from an environment variable named DATASET, then download, extract, and import that file:
    For example, DATASET=doctors_and_clinicians_02_2022.zip will import only the doctors_and_clinicians_02_2022.zip file.
  • Your process should be able to accept a list of columns to import, as IMPORT_FIELDS, and only import the mentioned fields from the file.
  • The import destination can be any DB of your choice. For example, you can run SQLite, PostgreSQL, or MySQL in a Docker container and import the data there.
  • You can pick your language of choice.
  • You do not need to write the whole process in one language. You can divide the tasks between shell scripts and programs of your choice if needed.
  • You need to dockerize your solution.
  • The final result must be executable like this:
     $ export DATASET=doctors_and_clinicians_02_2022.zip
     $ export IMPORT_FIELDS="NPI,Ind_PAC_ID,Ind_enrl_ID"
     $ etl.sh
     starting to download...
     processing...
     importing...
     done. 2000 records imported.
     $

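As a rough illustration of the extract and import criteria above, here is a minimal Python sketch that reads the first CSV inside an already downloaded archive, keeps only the columns named in IMPORT_FIELDS, and appends them to a SQLite table. It is one possible approach under stated assumptions, not a reference solution: the function names, the table name "clinicians", the database file "etl.db", and the stripping of padded header names are all hypothetical choices, and the download step is left out because the task does not state a direct file URL.

```python
import csv
import io
import os
import sqlite3
import zipfile


def parse_fields(raw):
    """Split a comma-separated IMPORT_FIELDS value into clean column names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def import_csv_from_zip(zip_path, fields, conn, table="clinicians"):
    """Append the selected columns of the first CSV in the archive to `table`.

    Returns the number of rows inserted. The table name is an assumption.
    """
    # Quote identifiers so unusual column names cannot break the SQL.
    quoted = ", ".join('"{}"'.format(f.replace('"', '""')) for f in fields)
    placeholders = ", ".join("?" for _ in fields)
    conn.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, quoted))
    insert_sql = 'INSERT INTO "{}" ({}) VALUES ({})'.format(
        table, quoted, placeholders
    )

    before = conn.total_changes
    with zipfile.ZipFile(zip_path) as archive:
        csv_name = next(
            n for n in archive.namelist() if n.lower().endswith(".csv")
        )
        with archive.open(csv_name) as raw:
            # utf-8-sig drops a byte-order mark if the file starts with one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            reader = csv.DictReader(text)
            # Assumption: header names may be padded with spaces, so strip them.
            reader.fieldnames = [n.strip() for n in (reader.fieldnames or [])]
            missing = [f for f in fields if f not in reader.fieldnames]
            if missing:
                raise ValueError("columns not found in file: {}".format(missing))
            rows = ([row[f] for f in fields] for row in reader)
            with conn:  # one transaction: commit on success, roll back on error
                conn.executemany(insert_sql, rows)
    return conn.total_changes - before


def run_from_env(db_path="etl.db"):
    """Hypothetical entry point: DATASET names a zip file in the working directory."""
    fields = parse_fields(os.environ["IMPORT_FIELDS"])
    conn = sqlite3.connect(db_path)
    try:
        print("importing...")
        count = import_csv_from_zip(os.environ["DATASET"], fields, conn)
        print("done. {} records imported.".format(count))
    finally:
        conn.close()
```

A wrapper such as etl.sh could download the archive first and then call run_from_env(). Because the table is created only if it does not exist and rows are always inserted, running the import twice appends the data twice, which matches the "append" wording of the task but may call for deduplication in a real pipeline.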
What we are looking for:

  • You can design and implement an ETL system.
  • You are familiar with Docker.
  • You are familiar with general concepts of databases, and you can interact with them.