@dpogue
Created February 14, 2020 17:57
How much do you need to know about machine learning?

VanJS 20200213 - Dina Berry

Works in the content org at Microsoft.

How many people consider themselves just JS devs? Not many. How many use JS and at least one other language? A lot more, mostly on the backend.

If you had documentation for an API, would you want it language-based or just REST? A mix of both.

What does AI do?

AI is just a function call, and it has an obvious return: an answer, and a (confidence) score for that answer

const { answer, score } = describeImage()
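
As a minimal sketch of that idea, the snippet below treats a prediction as an ordinary return value and only accepts the answer when the score clears a threshold. `describeImage` here is a hypothetical stub standing in for a real service call, and `acceptAnswer` with its 0.9 cutoff is an arbitrary assumption for illustration.

```javascript
// Hypothetical stub standing in for a real AI service call.
function describeImage() {
  return { answer: 'a dog sitting on a couch', score: 0.89 };
}

// Accept the answer only when the confidence score clears a threshold.
// The default threshold is an arbitrary choice for illustration.
function acceptAnswer({ answer, score }, minScore = 0.9) {
  return score >= minScore ? answer : null;
}

const { answer, score } = describeImage();
const accepted = acceptAnswer({ answer, score });
console.log(accepted === null ? `Rejected (score ${score})` : accepted);
```

A threshold like this filters out low-confidence answers, but it cannot catch an answer that is confident and wrong.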

Pretrained Models

Means it already has an algorithm and has been trained with existing data on that algorithm.

What data is appropriate to send through the function?

< demo using Microsoft Computer Vision REST API >

It's going to return tags and a caption (and confidence).
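
A rough sketch of what such a call might look like from JavaScript, assuming the Computer Vision "Analyze Image" REST endpoint and a runtime with a global `fetch`. The `v3.2` version in the path is an assumption (the notes do not record which version the demo used), and `endpoint`, `key`, `analyzeImage`, and `summarizeAnalysis` are placeholder names.

```javascript
// Pull the best caption and the tag names out of an Analyze Image response.
// Assumed response shape: { description: { captions: [{ text, confidence }] },
//                           tags: [{ name, confidence }] }
function summarizeAnalysis(result) {
  const captions = (result.description && result.description.captions) || [];
  const best = captions.reduce(
    (top, c) => (top === null || c.confidence > top.confidence ? c : top),
    null
  );
  return {
    caption: best ? best.text : null,
    confidence: best ? best.confidence : 0,
    tags: (result.tags || []).map((t) => t.name),
  };
}

// Placeholder endpoint and key; the API version in the path is an assumption.
async function analyzeImage(endpoint, key, imageUrl) {
  const url = `${endpoint}/vision/v3.2/analyze?visualFeatures=Description,Tags`;
  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Ocp-Apim-Subscription-Key': key,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url: imageUrl }),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return summarizeAnalysis(await response.json());
}
```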

< demo returns the wrong caption, with confidence of 89% >

The algorithm is 89% confident, but it's still wrong. You need to look at your data before you use AI.

Speech demo: The container (with the AI model) is 14GB for one voice with one language.

Custom Models

You have the basic pre-trained model, and now you bring your own stuff.

Need to learn some vocabulary, with a lot of overlapping terms.

  • Example: The data you are training your model with.

    • Positive Example: Data you do want to recognize
    • Negative Example: Data you don't want to recognize
  • Features: A "cheat-sheet", helps the machine find or predict something better

  • Models: The algorithm and the data together.

  • Overtraining: Showing the algorithm the same thing over and over in the hope it gets better at predicting it.

Development Cycle is also very different (in a loop):

  • Explore data
  • Label data
  • Add Features
  • Test
  • Publish

Labelling Data in an AI Model: Identify positive examples in the data

It's important to label data even if you're not using it, to provide better positive examples for classification.

With pretrained models, you have to do far less labeling. Just label the app-specific custom data.

https://github.com/microsoft/recognizers-text

Training Platform

Monetization of your data. You can use pretrained models across your data.

Discovering Animals with ML and JS

VanJS 20200213 - Jonny Kalambay

I don't post a lot on social media, but when I do it's often animals. But I don't always know what something is, so I wanted to approach that with programming.

Built an app named "Dog Scope", like Pokémon Go, but for dogs

How does Machine Learning work?

You have a program that can perform tasks, and it gets better at them the more it performs them. You have a model, and it has "neurons".

You have an input, and an output, and between those you have a weight and a bias. The accuracy depends on how well the weight and the bias can model reality.

This is basically y=mx+b
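
To make the comparison concrete, here is a one-neuron sketch in plain JavaScript. The function name `neuron` and the sample numbers are made up for illustration.

```javascript
// A single "neuron": output = weight * input + bias (y = mx + b).
function neuron(input, weight, bias) {
  return weight * input + bias;
}

// With weight 2 and bias 1, an input of 3 gives 2 * 3 + 1 = 7.
console.log(neuron(3, 2, 1)); // 7
```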

In the real world, you're likely to be modeling connections of neurons. One neuron's output is the input to another neuron. In between sit special functions called "activation functions".

Ex. ReLU (Rectified Linear Unit): If the input is less than 0, the output will be 0, otherwise the output is the input.
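
The rule translates almost directly into code. A minimal plain-JavaScript sketch (the name `relu` is just a conventional choice, not a library call):

```javascript
// ReLU: negative inputs become 0, everything else passes through unchanged.
function relu(x) {
  return Math.max(0, x);
}

console.log([-2, 0, 3.5].map(relu)); // [ 0, 0, 3.5 ]
```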

Ex. Softmax: It takes a number of inputs and produces the same number of outputs, but as probabilities (they add up to 1). Effectively a normalizer.
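
A small plain-JavaScript sketch of softmax, using the standard exponentiate-then-divide definition. Subtracting the largest input first is a common numerical-stability trick and an addition here, not something from the talk.

```javascript
// Softmax: exponentiate each input, then divide by the total so the
// outputs are positive and sum to 1.
function softmax(values) {
  const max = Math.max(...values); // subtracting the max avoids overflow
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const probs = softmax([2, 1, 0.1]);
console.log(probs, probs.reduce((a, b) => a + b, 0));
```

Larger inputs end up with larger probabilities, so the ordering of the inputs is preserved.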

Dog Classifier: Input is a bunch of pixels, output is probability of dog breed.

Why would you want to do this in the browser? Cost, speed, and privacy. Predictions can happen on the client side. It is faster because the data is not making a round trip, and the data never leaves the device.

Client-side, the most popular option is TensorFlow.js, a library built by Google. They have a lot of pretrained models you can use out of the box.

Steps:

  1. Load
  2. Predict

Using the Image Classifier model from TensorFlow (MobileNet)

const model = await mobilenet.load();
const results = await model.classify(input);
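
A slightly fuller sketch of the load-then-predict flow. The helper `classifyTop` is a made-up name; it takes the model as an argument so it works with anything shaped like the object `mobilenet.load()` returns, whose `classify` resolves to a list of `{ className, probability }` entries. The commented usage assumes `@tensorflow/tfjs` and `@tensorflow-models/mobilenet` are already loaded on the page and that an image element with id `photo` exists.

```javascript
// `model` is expected to expose classify(input), resolving to
// [{ className, probability }, ...] as the MobileNet model does.
async function classifyTop(model, input) {
  const results = await model.classify(input);
  if (results.length === 0) return null;
  // Pick the highest-probability entry, whatever order they arrive in.
  return results.reduce((best, r) => (r.probability > best.probability ? r : best));
}

// Assumed browser usage (setup not shown in the notes):
//   const model = await mobilenet.load();
//   const top = await classifyTop(model, document.getElementById('photo'));
//   console.log(top.className, top.probability);
```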

When you are working with custom models:

  1. Prepare
  2. Load
  3. Predict

When you first create a model, it isn't configured to give useful outputs. You need to set it up with your pipeline (activation functions), and then train it. Feed your data to the model and it will give you random results. Take the loss score (how wrong it was) and feed it back to optimize it.
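
The feedback loop can be sketched without any library. The example below fits a single weight and bias with plain gradient descent on a mean-squared-error loss. The function name `train`, the learning rate, the epoch count, and the zero starting values are all assumptions for illustration (a real model would usually start from random values), and this is not the TensorFlow.js training API.

```javascript
// Fit y = weight * x + bias to data with plain gradient descent.
function train(xs, ys, { epochs = 500, learningRate = 0.05 } = {}) {
  let weight = 0; // untrained starting point
  let bias = 0;
  const lossHistory = [];
  const n = xs.length;

  for (let epoch = 0; epoch < epochs; epoch++) {
    let loss = 0;
    let gradWeight = 0;
    let gradBias = 0;
    for (let i = 0; i < n; i++) {
      const error = weight * xs[i] + bias - ys[i]; // how wrong this guess was
      loss += (error * error) / n;                 // mean squared error
      gradWeight += (2 * error * xs[i]) / n;
      gradBias += (2 * error) / n;
    }
    lossHistory.push(loss);
    // Feed the loss back: nudge each parameter against its gradient.
    weight -= learningRate * gradWeight;
    bias -= learningRate * gradBias;
  }
  return { weight, bias, lossHistory };
}

// Data generated from y = 2x + 1, so training should approach weight 2, bias 1.
const xs = [0, 1, 2, 3, 4];
const ys = [1, 3, 5, 7, 9];
const { weight, bias, lossHistory } = train(xs, ys);
console.log(weight, bias, lossHistory[0], lossHistory[lossHistory.length - 1]);
```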

Accuracy will eventually improve with more cycles.

Transfer Learning: Taking an existing trained model and retraining it on new data

  1. Convert element to Tensor
  2. Reshape Tensor
  3. Normalize Tensor
  4. Run the inference
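
A rough sketch of what the first three steps do to the pixel data, written with plain arrays so it runs without the library. In TensorFlow.js these steps are normally done with tensor operations (for example `tf.browser.fromPixels` to read an image element). The function name `preprocess`, the `{ shape, values }` return format, and the -1 to 1 target range are assumptions; the right range depends on the model. Step 4 would hand the result to the model.

```javascript
// Turn an RGBA pixel buffer (as in a canvas ImageData) into model-ready input.
function preprocess(rgba, width, height) {
  const values = new Float32Array(width * height * 3);
  for (let p = 0; p < width * height; p++) {
    for (let c = 0; c < 3; c++) {
      // Convert: keep R, G, B and drop the alpha channel.
      // Normalize: map 0..255 to -1..1.
      values[p * 3 + c] = rgba[p * 4 + c] / 127.5 - 1;
    }
  }
  // Reshape: describe the flat data as a batch containing one image.
  return { shape: [1, height, width, 3], values };
}

// One pixel: red 0, green 255, blue 51, alpha 255.
const input = preprocess(new Uint8ClampedArray([0, 255, 51, 255]), 1, 1);
console.log(input.shape, Array.from(input.values));
```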

A Tensor is a... container of data... ish (in practice, a multi-dimensional array of numbers)

Performance

On the front-end, models can be kinda slow. You can warm them up ahead of time.
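
One common way to warm a model up is to run a single throwaway prediction on dummy input before the user needs a real one. A minimal sketch, assuming `tf` is the TensorFlow.js namespace and `model` is a loaded model whose `predict` returns a single tensor; `warmUp` is a made-up helper name and the input shape is an example.

```javascript
// `tf` and `model` are passed in so the sketch does not depend on how
// they were loaded.
function warmUp(tf, model, inputShape) {
  // tf.tidy disposes every tensor created inside the callback.
  tf.tidy(() => {
    const output = model.predict(tf.zeros(inputShape));
    output.dataSync(); // read the result back so the work actually runs
  });
}

// Assumed usage for a model that takes one 224x224 RGB image:
//   warmUp(tf, model, [1, 224, 224, 3]);
```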

Models can get big, so think before making users download one. Train a model that's as small as possible while still being powerful enough.

Machine Learning uses a LOT of memory. With the WebGL backend, tensor memory is not garbage collected for you, so you need to be careful to clean up explicitly.

You can always just run it in Node.js on the backend.
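
A minimal sketch of explicit cleanup, assuming `model.predict` returns a single TensorFlow.js tensor. Tensors expose `dispose()` to release their memory and `data()` to copy their values out. The helper name `predictAndDispose` is made up, and note that it also disposes the input tensor it is given.

```javascript
// Run one prediction, copy the scores out, and release all tensor memory,
// even if something throws along the way.
async function predictAndDispose(model, inputTensor) {
  let output;
  try {
    output = model.predict(inputTensor);
    // data() copies the values out of the tensor into a typed array.
    return Array.from(await output.data());
  } finally {
    // Release the memory explicitly; it is not garbage collected for us.
    if (output) output.dispose();
    inputTensor.dispose();
  }
}
```

Logging `tf.memory().numTensors` before and after a prediction is a quick way to check whether tensors are leaking.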
