@8bit-pixies
Last active May 3, 2016 11:15

The Data Science Curriculum

Start here.

  • Intro to Data Science UW / Coursera
  • Topics: Python NLP on Twitter API, Distributed Computing Paradigm, MapReduce/Hadoop & Pig Script, SQL/NoSQL, Relational Algebra, Experiment design, Statistics, Graphs, Amazon EC2, Visualization.

Math

Computing

Projects

  • Coursework
  • Sentiment analysis, trending topics, and friendship mapping with Twitter API
  • Joins and Matrix Manipulation in MapReduce (AWS EC2)
  • In-database Text analysis (SQL)
  • Sentiment analysis of movie tweets (Python)

Further Study Resources:

A Note About Direction

This introduction is geared toward readers with at least a basic understanding of programming and (perhaps obviously) an interest in the components of Data Science, such as statistics and distributed computing. Out of personal preference and a need for focus, I geared the original curriculum toward Python tools and resources, so I've explicitly marked resources that use other tools (like R) to teach conceptual material.


From Open Source Data Science Masters. Follow me on Twitter @cChpmnSiu
