@ottobricks
Forked from Pelielo/data-intern-challenge.md
Last active February 10, 2021 21:16
Conta Stone's Data Intern Challenge

Data Intern Challenge

This is Conta Stone's data challenge for intern applicants. The objective is to extract and analyze data from a database.

Instructions

The solution can be developed using Python, SQL scripts, a BI tool, or a combination of these. It must be hosted in a public code repository such as GitHub or GitLab, or sent as a compressed .zip folder that includes all the files needed to replicate your environment and run your code.

The database is available here and contains credit card transactional data in 4 tables:

  • customers
  • cards
  • transactions
  • frauds
  1. Extract and analyze the data in order to answer the following questions. Provide a description and/or comments for each solution.
  • What is the average age of the customers in the database?
  • How is the card_family ranked based on the credit_limit given to each card?
  • For the transactions flagged as fraud, what are the ids of the transactions with the highest value?
  2. Analysis:
  • Analyze whether the fraudulent transactions are associated with other features in the dataset. Explain your results.

It is not mandatory to answer all questions.
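
As a starting point, the three questions above could be approached with plain SQL run from Python. The sketch below is only an illustration under stated assumptions: it builds a tiny in-memory SQLite database with made-up rows, and every column name other than card_family and credit_limit (for example age, value, transaction_id, card_number) is a hypothetical guess at the schema, as are the helper function names. It also assumes that "ranked" means ordering card families by their average credit limit; other readings are possible. Adjust the names to match the real database.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (customer_id TEXT PRIMARY KEY, age INTEGER);
        CREATE TABLE cards (
            card_number TEXT PRIMARY KEY,
            customer_id TEXT,
            card_family TEXT,
            credit_limit INTEGER
        );
        CREATE TABLE transactions (
            transaction_id TEXT PRIMARY KEY,
            card_number TEXT,
            value INTEGER
        );
        CREATE TABLE frauds (transaction_id TEXT PRIMARY KEY);

        INSERT INTO customers VALUES ('c1', 30), ('c2', 40), ('c3', 50);
        INSERT INTO cards VALUES
            ('k1', 'c1', 'Gold', 10000),
            ('k2', 'c2', 'Gold', 20000),
            ('k3', 'c3', 'Premium', 100000);
        INSERT INTO transactions VALUES
            ('t1', 'k1', 100), ('t2', 'k2', 900),
            ('t3', 'k3', 900), ('t4', 'k3', 50);
        INSERT INTO frauds VALUES ('t1'), ('t2'), ('t3');
        """
    )
    return conn


def average_age(conn):
    """Average of the (assumed) age column over all customers."""
    return conn.execute("SELECT AVG(age) FROM customers").fetchone()[0]


def rank_card_families(conn):
    """Card families ordered by average credit limit, highest first."""
    return conn.execute(
        """
        SELECT card_family, AVG(credit_limit) AS avg_limit
        FROM cards
        GROUP BY card_family
        ORDER BY avg_limit DESC
        """
    ).fetchall()


def top_fraud_transaction_ids(conn):
    """Ids of fraud-flagged transactions whose value equals the fraud maximum."""
    rows = conn.execute(
        """
        SELECT t.transaction_id
        FROM transactions AS t
        JOIN frauds AS f ON f.transaction_id = t.transaction_id
        WHERE t.value = (
            SELECT MAX(t2.value)
            FROM transactions AS t2
            JOIN frauds AS f2 ON f2.transaction_id = t2.transaction_id
        )
        ORDER BY t.transaction_id
        """
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_toy_db()
    print(average_age(conn))
    print(rank_card_families(conn))
    print(top_fraud_transaction_ids(conn))
```

Matching on the maximum value, rather than using LIMIT 1, returns every transaction id when several fraudulent transactions tie for the highest value.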