Great Data Science Project Criteria:
- Problem statement that defines a measurable, and/or falsifiable outcome. “Frequency of [specific event] is influential over [some outcome]”. “Users who use [some feature in app] are differentiable from users who less frequently use [some feature in app]”. etc. If you can’t frame a data problem properly, none of has it has purpose. The biggest challenge in data science is making sense and defining the gray area of business problems. This also comes with experience.
- EDA EDA EDA. Define your scope. Report only what is necessary and relevant to your problem statement. If the model reports only 4-5 common variables as parameters (logistic regression for instance), focus on those when summarizing your work in terms of EDA.
- How much data is necessary to make this analysis work? Are you sampling? Is a t-test necessary to gain assurance or a rank order test?
- Explain which model makes the most sense to use. Are you trying to gain inference about a data problem?