@ludflu
Last active May 20, 2019 16:20
interview questions for a data engineering manager
Technical questions:
1. Describe a data pipeline or data warehouse you've built
2. How do you go about gathering requirements for a data pipeline or warehouse?
3. How do you unit test ETL systems?
4. Explain CI/CD for data systems
5. How do you track data provenance?
6. What makes a software architecture good or bad? What makes a code module good or bad? A function?
7. If someone gives you a process that is too slow, how do you improve its performance?
8. Explain normalized vs denormalized data schemas. Why would you pick one over the other?
9. Algorithm / coding question: given two files of integers, how would you write a program to find the intersection of the two files?
10. If you're A/B testing a change, how do you know that the change you see is not attributable to chance alone?
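
For question 9, one possible answer is sketched below. It is only a sketch under assumptions the question does not state: each file holds one integer per line, blank lines can be skipped, and the first file's distinct values fit in memory. The names `read_ints` and `intersect_files` are hypothetical and chosen for illustration.

```python
from typing import Iterator, Set


def read_ints(path: str) -> Iterator[int]:
    """Yield one integer per non-blank line of a text file (assumed format)."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                yield int(line)


def intersect_files(path_a: str, path_b: str) -> Set[int]:
    """Return the set of integers that appear in both files."""
    # Load the first file's values into a set, then stream the second file
    # so that only one file's distinct values are held in memory at once.
    seen = set(read_ints(path_a))
    return {n for n in read_ints(path_b) if n in seen}
```

Passing the smaller file as `path_a` keeps the in-memory set smaller. If neither file fits in memory, a candidate might instead sort both files externally and walk them in a single merge pass; that variant is not shown here.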
Project management questions:
1. If you're given a project that can't be completed in the allotted time, what do you do? What kinds of tradeoffs can you make?
2. What do you do with a project that is already late?
3. If you have a group of engineers who disagree about a technical approach, how do you find a path forward?