@thundergolfer
Last active October 28, 2017 13:17

Hey r/learnmachinelearning

I'm writing this short post in response to this infographic post, The 4 Stages of Machine Learning. I replied to it saying that it captures ML in a research context reasonably well, but not the larger problem of production ML systems. People asked me to expand on that, so here it is.

What's my experience with production ML? Pretty limited, but my very short time in it has been eye-opening. I started a year-long data engineering internship with Zendesk's ML product team in November 2016. It's the team that recently posted Serving Tensorflow in Production at Zendesk. Our product is Automatic Answers.

Getting XX.XX% on a dataset vs. creating a product for users

Perhaps the most important difference between the ML most people here know and production ML at Zendesk is that Zendesk's ML must have business value, which means it must offer value to customers. Zendesk has a product model, and Automatic Answers must fit into that and drive the company's growth. Sure, ML and AI are really cool, but if you can't make them useful to a user then you have nothing. Before Zendesk, I saw ML as "how can I get this damn network to train and perform on this dataset? The pros have achieved 9X.XX% accuracy." Now 'doing ML' for the team includes:

  • Managing customer expectations
  • Debugging problems end-users face with how the model behaves
  • Serving 1000s of models and their predictions on demand to thousands of people
  • So. much. UX.

I can't tell you how influential UX seems to be to the success of real software systems. Your model can be awesome, but your ML system (in a production context) really includes everything the product relies on, from the data ingestion system to the end-user UX. If those fail you, your model is pointless and your ML system is crippled.

More about ML in the real world (again, from what I've seen)

Just as it's known that Data Science is really around 20% research and 80% data stewardship, being an ML engineer in production means that your responsibilities extend beyond training models on ready-to-go datasets. You're going to be putting those models to use, so skills with AWS/GCP, Hadoop/Spark, TensorFlow Serving, Pachyderm, Docker, data visualisation, and SQL are all very handy. Also, all of a sudden your work becomes part of a wider team, product, and company, so it must actually be reproducible, implementable, bug-free, and documented. Those four things don't really constrain researchers and at-home ML hobbyists.

Eyes Open

Realistically, the world of 'open-source and MOOC' ML consists mostly of 3 kinds of ML work:

  1. Tutorials
  2. Implementing ML research papers
  3. Personal Projects

Of these, the first two are basically 'follow the steps'. That's certainly not easy, but it's not like what's done in industry. The third may or may not involve real end-users and production-ready ML engineering; most don't. You can learn a bunch about ML by doing the above things and still be wholly unsuccessful at production ML engineering, which really does require you to put on different hats (product manager, data engineer, reliability engineer, ops, tester, etc.).

What's heartening to me

I should remind you that I am not an ML Engineering intern, I'm a Data Engineering intern. Still, part of my work is solving problems peculiar to, and created by, ML systems, and it's pretty awesome. Though I'm not sitting there with IPython and Tensorboard open, my work falls within what I would call ML Engineering. It's a much broader problem than the already massive problem that is Machine Learning research, and it stands on its own as a pretty great (under-exposed) area of software engineering. I likely could not have gained an internship in ML research as an undergrad, but being a data engineer in an ML team is pretty much the next best thing. Further, it seems the norm amongst ML teams to encourage cross-fertilisation of skills, so if you're in the ML team you're bound to be up-skilling in ML. If you don't have or want to get a PhD, but love ML, seriously consider shaping yourself as a Data/ML engineer. You'll be in high demand.

Learn more about real-world ML

There really isn't enough material out there on 'real-world ML'. The field is still quite new, and people are still finding their feet. Most of the good ML engineering stuff comes, unsurprisingly, out of Google. They've been doing ML in production for a long time now and on a massive scale, so they've come face to face with its unique challenges.

Properly explaining how big the problem of production ML is would take more than a few books, and so far no one's even written one book on it. Nevertheless, here are some things I've read which give at least some insight into production ML and its teams.
