@daksis
Created September 15, 2017 20:48
Hitchhikers Guide to Machine Learning Resources

Reference Sources

  1. Project Rhea - online learning community where students teach other students. The tutorials here vary in detail and quality. Generally they are more than a definition at MathWorld, but less than a step-by-step walkthrough. Good for getting the highlights on an unfamiliar topic.
  2. Wolfram MathWorld - Like Wikipedia, but only for math. Go here when you have no idea what an "isotropic kernel" is or why you would care. MathWorld will give you 80+ linked entries.
  3. Wikipedia Math Portal - the place to find everything math related on Wikipedia.

Math References

Printed Material

Stanford CS229 - Machine Learning — Has some of the best backgrounders. Highly recommended are the following:

Video Material

  1. Khan Academy - linear algebra
  2. MIT OCW - Strang Linear Algebra - One of the best online courses for linear algebra. The textbook isn't free, but there is a free linear algebra text by Jim Hefferon.

Courses

We're in luck. In the past several years there has been an explosion of high-quality video series targeting the "advanced layperson".

  1. Coursera Machine Learning (Andrew Ng) - if you only take one course on ML, this is it.
  2. MIT Open Courseware - If you find you need a background in something, MIT probably has a course in it.
  3. Safari Books Online - Commercial offering
  4. Khan Academy - Statistics - A ground-up introduction to statistics. Assumes almost no prior exposure to stats.

Backgrounders

  1. Introduction to Mathematical Thinking — A great course to "reset" your ideas around what math is about and what it is used for.
  2. MIT OCW - Mathematics for Computer Science — Covers proofs and introduces discrete mathematics/probability.

Books

Math

  1. Introduction to Statistical Learning, 6th printing. One of the best books for people wishing to get up to speed on practical uses of machine learning. Each chapter is presented with background and lab work in R that guides you through the application of the ML technique just discussed. The book is available online as a free PDF. You'll often see this book referred to online as ISL.
  2. Linear Algebra, 3rd Ed. - Jim Hefferon has an awesome textbook on linear algebra. It's free, and there are problems and solutions.
  3. Introduction to Linear Algebra, 5e - Linear algebra textbook by Gilbert Strang. Not available for free. I haven't used an updated edition, but the older editions were well written and easily digestible.
  4. Gilbert Strang's Calculus - one of the best introductions to calculus. Starts out with the application of calculus to various problems in physics, which helps build up an intuition around calculus. It then backfills the intuition with layers of theory.
  5. Python for Data Analysis, 2ed Wes McKinney's book on

ML

  1. Machine Learning - The Art and Science of Algorithms That Make Sense of Data: Peter Flach's book is my recommendation for the best book giving an accessible overview of the wide — and often confusing — ML landscape. It has an easily readable format with enough math to help you understand what the ML technique is doing but not so much math that you can't understand what the hell the ML technique is doing.

Programming Languages

  1. Kermit Sigmon's Matlab Primer: Eventually if you google for "how do I do this in Matlab" you'll come across this online book.
  2. GNU Octave Manual
  3. Wikibooks GNU Octave Tutorial — Quick and dirty introduction to Octave
  4. Wikibooks MATLAB Tutorial — Because Octave is similar to MATLAB many of the tutorials for MATLAB work for Octave.

Advanced Books

  1. Elements of Statistical Learning, 2nd ed. A deep dive (as in bottom of the ocean) into the math behind data mining, inference, and prediction. If you really want to know why something works in the ML space, this is a great book.

Food for Thought

  1. Weapons of Math Destruction — Cathy O'Neil

    "A former Wall Street quant sounds an alarm on the mathematical models that pervade modern life — and threaten to rip apart our social fabric"

  2. IBM and the Holocaust

    IBM and the Holocaust is the stunning story of IBM's strategic alliance with Nazi Germany -- beginning in 1933 in the first weeks that Hitler came to power and continuing throughout World War II. As the Third Reich embarked upon its plan of conquest and genocide, IBM and its subsidiaries helped create enabling technologies, step-by-step, from the identification and cataloging programs of the 1930s to the selections of the 1940s.

Utilities/Tools

  1. Matlab - by MathWorks: You will eventually come across some code written in/for Matlab. Matlab is an environment designed to make mathematical computing problems like ML easier. The downside is the price: commercial licenses are north of $2k USD. However, a home user license starts at around $150 USD.
  2. GNU Octave: Because Matlab is too damn expensive. Octave is an open source mathematical computing environment that is mostly compatible with Matlab. Just remember that "mostly" part.
  3. RStudio: RStudio is an IDE for the R programming language. It comes in prices from free to really expensive.
  4. Shiny: If you need to put an R script/application on the web, Shiny might be a good fit. R + Web App == Shiny
  5. Jupyter Notebook: An open source tool for sharing documents that are interactive coding environments.
  6. Anaconda — Anaconda is the simplest way to install a machine learning environment. Everything is designed to work together (which helps you avoid pip/pyenv hell in python) and it has been extended to have other applications/utilities related to doing data work.
  7. PyData.org — Python Tools for working with Data Science
  8. SciKit Learn — A collection of implementations of common machine learning algorithms.
  9. pandas - One of the most common libraries for working with, manipulating, cleaning, filtering and generally munging data.

Languages

Like it or not you’ll need to learn another language. Here are the recommended languages.

0th Choice

Listen: ML work is done in Python. That's just the way things are.

  1. Python

First Choices:

You're probably going to see a lot of R and Octave in the wilds of ML, so you might as well learn the languages the locals use:

  2. R
  3. Matlab/Octave

Second Choices

  1. C++
  2. Fortran
  3. C
  4. Julia
  5. Lisp/Clojure
  6. Java* (And a lot of the languages that work on the JVM)
  7. SAS
  8. STATA

You Know What You're Doing

  1. OCaml
  2. F#
  3. Assembler
  4. OpenCL

Frameworks

Reference Links for the talk

These are a few of the better articles that link to topics raised in my talk.

  1. Ars Technica - AI could threaten up to 47 percent of jobs
  2. Frey Osborne - Future of Employment: Scholarly paper using ML to cluster which jobs are most likely to be automated in the near future. Ironically, it uses ML to determine this.
  3. NYT Book Review: IBM And the Holocaust
  4. NVIDIA - Bringing AI to Computer Graphics
  5. AI used to create Malware
  6. Uber sparks new Taxi wars in SA
  7. Why Taxi Wars are not just about the money
  8. Extreme Tech - Driver in Fatal Tesla Crash Wasn't paying attention
  9. NYTimes: Russian Election Hacking Wider than originally thought
  10. NY Times: The Perfect Weapon, how Russian Cyberpower Invaded the US
  11. ArsTechnica - Facebook Helped advertisers target teens that feel "worthless"
  12. NY Times - Purged Facebook Page tied to Kremlin
  13. Adobe Makes Selfies Less Embarrassing
  14. Ars Technica - Movie written by algorithm turns out to be hilarious and intense