Link collection for the "AI for humans" talk

Slides

Intro and history of ML on the web

  • Autodraw by Google is a tool that lets you doodle what you want to draw and turns it into a proper icon by detecting the outline and making an ML-based guess at what it could be.
  • Quickdraw by Google is a game they released shortly before Autodraw to train the model.
  • ReCaptcha is a CAPTCHA engine that feeds the answers back into Google's ML systems. For example, being asked to identify street signs or cars is a good indicator that this data will go into the self-driving cars project.
  • Amazon's Mechanical Turk is a service for getting humans to do small tasks for you. A lot of the data accumulated through it could be used to train models. One famous example back then was asking people to draw sheep facing left.
  • All the big players in IT, such as Google, Microsoft and Amazon, offer AI/ML services.
  • NVIDIA released an interesting new algorithm to automatically fill missing parts of images using a deep learning network trained on faces.

Vision services and recognition

  • There is a bot scouring Twitter that turns images into alternative text. All you have to do is tag a thread that contains an image with #vision_api.
  • The Vision API of Microsoft's Cognitive Services analyses images and detects what is in them. You get a list of tags and a human-readable description. It also detects known entities, faces and celebrities, and gives you the colours used in the image.
  • The Face API of Microsoft's Cognitive Services detects human faces in an image. You get a truckload of data back: Age, Emotion, Gender, Pose, Smile, and Facial Hair along with 27(!) landmarks for each face in the image.
  • Emotion detection is interesting and can be done by pretty basic means. Susan Hinton's Emoji Face Demo is a good example of how you can run this on your own machine.
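
To make the "list of tags and a human-readable description" idea concrete, here is a minimal sketch of how a bot could turn such an analysis result into alternative text. It does not call any service. The `SAMPLE_ANALYSIS` dictionary is hand-written illustrative data, and its field names (`tags`, `description`, `captions`, `confidence`) are an assumption about the shape of the JSON response, not a guaranteed contract. The function name `alt_text_from_analysis` and the `min_confidence` threshold are hypothetical as well.

```python
from typing import Any, Dict

# Hand-written sample for illustration only. The field names are an
# assumption about what an image analysis response looks like.
SAMPLE_ANALYSIS: Dict[str, Any] = {
    "tags": [
        {"name": "dog", "confidence": 0.98},
        {"name": "grass", "confidence": 0.91},
        {"name": "frisbee", "confidence": 0.42},
    ],
    "description": {
        "captions": [
            {"text": "a dog running on grass", "confidence": 0.87},
        ]
    },
}


def alt_text_from_analysis(analysis: Dict[str, Any],
                           min_confidence: float = 0.5) -> str:
    """Build alt text from an analysis result (hypothetical helper).

    Prefers the most confident caption; falls back to a list of
    confident tags; returns an empty string if nothing is usable.
    """
    captions = analysis.get("description", {}).get("captions", [])
    confident = [c for c in captions
                 if c.get("confidence", 0.0) >= min_confidence]
    if confident:
        # Pick the caption the service is most sure about.
        return max(confident, key=lambda c: c["confidence"])["text"]

    # No usable caption: describe the image through its confident tags.
    tags = [t["name"] for t in analysis.get("tags", [])
            if t.get("confidence", 0.0) >= min_confidence]
    if tags:
        return "Image containing: " + ", ".join(tags)
    return ""


if __name__ == "__main__":
    print(alt_text_from_analysis(SAMPLE_ANALYSIS))
```

Filtering on confidence before using a caption or tag is a design choice in this sketch: a low-confidence guess makes for misleading alternative text, so an empty result is treated as safer than a wrong one.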

Speech Services

  • The Speech API of Microsoft's Cognitive Services and Bing turns spoken words into text and reads text content aloud using generated voices in various languages. Try it out on this demo.
  • Burke Holland's excellent Smashing Magazine article, The Rise Of Intelligent Conversational UI, gives examples of why this is useful and shows how to build your own speech recognition services without using the Alexa API.
  • LUIS is an interface that lets you train your own data sets for trigger words in spoken sentences, based on the LUIS API.
  • Converting speech to text is much easier when you have trained the system on the person speaking. The Speaker recognition API lets you detect who is speaking in an audio file and train the recognition algorithm on your own voice for much better results.

Interesting further materials
