@datageneralist
Last active May 15, 2020 15:16
Tables in Medium
| Role | Summary | Key Phrases | Primary Software | Secondary Software |
| --- | --- | --- | --- | --- |
| Data Engineer | They design, develop, support, and connect systems that store or report data. Data engineers get the data to you in a clean format before any analysis can be done. You want a perfectionist in this role; otherwise, the data might end up dirty. | Data modeling; ETL; Data warehouse; Data pipelines | Excel, Python, SQL, NoSQL, Oracle, SAP, Teradata, MongoDB, Business Objects, AWS, Azure, GCP | JavaScript, Java, C/C++ |
| Business Intelligence Engineer | BI engineers are data engineers who are slightly weaker at programming but more adept at gathering business requirements. They design, develop, document, and manage scalable solutions for new and ongoing metrics, reports, analyses, and dashboards to support business needs. | Requirements gathering; Reports, metrics, dashboards; Data warehouse; ETL; Databases | Excel, Python, SQL, NoSQL, Oracle, SAP, Teradata, MongoDB, Business Objects, AWS, Azure, GCP | C/C++, Hive, Spark, Hadoop, Pig |
| Analyst | Analysts combine domain knowledge with decent programming skills to get quick, useful insights from the data. This individual loves exploring data to find interesting subsets. Their technical skills favor breadth over depth. Speed takes priority over perfect, clean code. | Insights and analysis; Data visualization; Macros; Automation | Excel, SQL, Python, Tableau, pandas, dplyr, Alteryx, Power BI | AWS, Azure, GCP |
| Senior Analyst | Same as the Analyst, but with more experience. They tend to understand the business a little better and are faster at analyzing the data. | Insights and analysis; Data visualization; Macros; Automation | Excel, SQL, Python, Tableau, pandas, dplyr, Alteryx, Power BI | AWS, Azure, GCP |
| Analytics/BI Manager | A manager whose direct reports include data engineers and analysts. They manage various technical projects aimed at building business intelligence tools. This individual communicates with senior-level stakeholders on a semi-regular basis. They must be a generalist who grasps a wide array of technical concepts. | Requirements gathering; Reports, metrics, dashboards; Data warehouse; ETL; Communication; Insights; Leadership | Excel, PowerPoint, Word, SQL, Tableau, NoSQL, Oracle, SAP, Teradata, MongoDB, Business Objects, AWS, Azure, GCP | Python, C/C++, Java, JavaScript |
| Applied Machine Learning Engineer | Applied machine learning engineers possess a strong understanding of how to use algorithms to churn through large data sets and glean useful insights. This individual understands the assumptions and assessments needed to build a useful model. They must be able to cope with failure because finding a useful model requires a lot of tinkering and testing. | Machine learning; Data mining; AI; Experimentation; Statistical models; Algorithms | Python, R, SQL, scikit-learn, TensorFlow, PyTorch, H2O, NoSQL | AWS, Azure, GCP, Hive, Spark, Pig, Hadoop, Java, C/C++ |
| Statistician | Statisticians are similar to Applied Machine Learning Engineers; however, they have stronger statistics expertise but are typically weaker at programming. Statisticians help decision makers come to safe decisions under uncertainty. They identify conclusions that can be made beyond the data. | Machine learning; Data mining; AI; Experimentation; Statistical models; Algorithms | Python, R, SQL, scikit-learn, TensorFlow, PyTorch, H2O, NoSQL, Stata, SAS | AWS, Azure, GCP, Hive, Spark, Pig, Hadoop |
| Data Scientist | The Data Scientist is a combination of the Senior Analyst, Applied Machine Learning Engineer, and Statistician. The best Data Scientists possess the skill sets of all three roles; however, these are called unicorns because they are hard to find. Most data scientists have only a subset of these skills. | Machine learning; Data mining; AI; Experimentation; Statistical models; Algorithms; Databases | Python, R, SQL, scikit-learn, TensorFlow, PyTorch, H2O, NoSQL | AWS, Azure, GCP, Hive, Spark, Pig, Hadoop, Java, C/C++ |
| Data Science Lead | The Data Science Lead manages the data science team and ensures that it adds value to the business. They manage projects aimed at extracting value from data using statistical and machine learning models. Direct reports include analysts, engineers, statisticians, and data scientists. This individual communicates with senior-level stakeholders on a semi-regular basis. | Machine learning; Communication; AI; Leadership; Cloud; Experimentation | Excel, PowerPoint, Word, Python, R, SQL, scikit-learn, TensorFlow, PyTorch, H2O, NoSQL, AWS, Azure, GCP | Hive, Spark, Pig, Hadoop, Java, C/C++ |
| Decision Maker | The Decision Maker, typically a director, understands the science AND art of decision making. They are responsible for identifying areas where data can provide value, framing the appropriate use case, and ensuring their team executes on the project. They must understand the analytics as well as the potential impact on the business. This individual communicates with senior-level stakeholders on a regular basis. | Machine learning; Communication; AI; Leadership; Vision; Cloud | Excel, PowerPoint, Word, SQL, Tableau, NoSQL, Oracle, SAP, Teradata, MongoDB, Business Objects, AWS, Azure, GCP | Java, Python, C/C++ |