dpogue/vanjs-1.md

## vanjs-1.md

      
How much do you need to know about machine learning?

VanJS 20200213 - Dina Berry
Works in the content org at Microsoft.
How many people consider themselves just JS devs? Not many.
How many know JS and at least one other language? A lot more, mostly on the backend.
If you had documentation for an API, would you want it language-based or just REST?
Mix of both

### What does AI do?

AI is just a function call with an obvious return value: an answer, and a (confidence) score for that answer.
const { answer, score } = describeImage()

### Pretrained Models

A pretrained model already has an algorithm and has already been trained on existing data with that algorithm.
What data is appropriate to send through the function?
< demo using Microsoft Computer Vision REST API >
It's going to return tags and a caption (and confidence).
< demo returns the wrong caption, with confidence of 89% >
The algorithm is 89% confident, but it's still wrong.
You need to look at your data before you use AI.
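The point of the demo can be sketched as a small guard around the returned answer and score. The response shape used here (`description.captions[]` with `text` and `confidence`) is an assumption about what the describe call returns, and the `pickCaption` helper, the threshold, and the sample data are all made up for illustration.

```javascript
// Hypothetical helper: pick the most confident caption from a response
// assumed to look like { description: { tags: [...], captions: [{ text, confidence }] } }.
function pickCaption(response, minConfidence = 0.9) {
  const captions = (response.description && response.description.captions) || [];
  if (captions.length === 0) {
    return { answer: null, score: 0, needsReview: true };
  }
  const best = captions.reduce((a, b) => (b.confidence > a.confidence ? b : a));
  return {
    answer: best.text,
    score: best.confidence,
    needsReview: best.confidence < minConfidence, // flag anything below the threshold
  };
}

// Made-up sample data, not output from the actual demo.
const sample = {
  description: {
    tags: ["outdoor", "animal"],
    captions: [{ text: "a dog sitting on grass", confidence: 0.89 }],
  },
};

const { answer, score, needsReview } = pickCaption(sample);
console.log(answer, score, needsReview);
```

A threshold like this only filters out low-confidence answers. As the demo showed, a high score can still be wrong, so checking results against your own data still matters.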
Speech demo: The container (with the AI model) is 14GB for one voice with one language.

### Custom Models

You have the basic pre-trained model, and now you bring your own stuff.
Need to learn some vocabulary, with a lot of overlapping terms.


Example: The data you are training your model with.

Positive Example: Data you do want to recognize
Negative Example: Data you don't want to recognize


Features: A "cheat-sheet", helps the machine find or predict something better


Models: The algorithm and the data together.


Overtraining: Showing the algorithm the same thing over and over in the hope it gets better at predicting it.


Development Cycle is also very different (in a loop):

Explore data
Label data
Add Features
Test
Publish

Labelling Data in an AI Model: Identify positive examples in the data
It's important to label data even if you're not using it, to provide better positive examples for classification.
With pretrained models, you have to do far less labeling. Just label the app-specific custom data.
https://github.com/microsoft/recognizers-text

### Training Platform

Monetization of your data. You can use pretrained models across your data.

  
## vanjs-2.md

      
Discovering Animals with ML and JS

VanJS 20200213 - Jonny Kalambay
I don't post a lot on social media, but when I do it's often animals. But I don't always know what something is, so I wanted to approach that with programming.
Built an app named "Dog Scope", like Pokémon Go, but for dogs

### How does Machine Learning work?

You have a program that can perform a task, and it gets better at that task the more it does it.
You have a model, and it has "neurons".
You have an input, and an output, and between those you have a weight and a bias.
The accuracy depends on how well the weight and the bias can model reality.
This is basically y=mx+b
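As a minimal sketch of that idea (the function names and numbers are just for illustration), a single neuron is the line equation, and a neuron with several inputs is the same thing with one weight per input:

```javascript
// A single "neuron": output = weight * input + bias (the y = mx + b line).
function neuron(input, weight, bias) {
  return weight * input + bias;
}

// With several inputs, each input gets its own weight and the results are summed.
function neuronMany(inputs, weights, bias) {
  return inputs.reduce((sum, x, i) => sum + x * weights[i], bias);
}

console.log(neuron(3, 2, 1));                    // 2 * 3 + 1 = 7
console.log(neuronMany([1, 2], [0.5, 0.25], 1)); // 0.5 + 0.5 + 1 = 2
```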
In the real world, you're likely to be modeling connections of neurons. One neuron's output is the input to another neuron.
In between the neurons are special functions called "activation functions".
Ex. ReLU (Rectified Linear Unit)
If the input is less than 0, the output will be 0, otherwise the output is input.
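Written out as code, that rule is a one-liner (a sketch, not how any particular library implements it):

```javascript
// ReLU: negative inputs become 0, everything else passes through unchanged.
function relu(x) {
  return Math.max(0, x);
}

console.log([-2, -0.5, 0, 3].map(relu)); // [0, 0, 0, 3]
```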
Ex. Softmax
It takes a number of inputs and produces the same number of outputs, expressed as probabilities that add up to 1. In effect it is a normalizer.
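A minimal sketch of softmax, using made-up input scores. Subtracting the largest value before exponentiating is a common trick to avoid overflow and does not change the result:

```javascript
// Softmax: turn arbitrary scores into probabilities that sum to 1.
function softmax(values) {
  const max = Math.max(...values); // subtracted for numerical stability
  const exps = values.map((v) => Math.exp(v - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

const probs = softmax([2, 1, 0.1]);
console.log(probs);
console.log(probs.reduce((a, b) => a + b, 0)); // approximately 1
```

The largest input always gets the largest probability, which is why it fits the last layer of a classifier such as the dog breed example.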
Dog Classifier: Input is a bunch of pixels, output is probability of dog breed.
Why would you want to do this in the browser? Cost, Speed, Privacy
Predictions can happen on the client side.
Faster because data is not making a round-trip.
The data never leaves the device.
Client-side, most popular option is TensorFlow.js
Library built by Google.
They have a lot of pretrained models you can use out of the box.
Steps:


Load
Predict

Using the Image Classifier model from TensorFlow (MobileNet)
const model = await mobilenet.load();
const results = await model.classify(input);
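The results come back as a list of class names with probabilities. The sketch below shows one way to pick the top one; `describeTopResult` is a hypothetical helper, and `fakeModel` is a stand-in with made-up numbers so the example runs without loading TensorFlow.js.

```javascript
// Hypothetical helper: `model` is anything with an async classify(input)
// method that resolves to [{ className, probability }, ...].
async function describeTopResult(model, input) {
  const results = await model.classify(input);
  if (results.length === 0) return "no prediction";
  const best = results.reduce((a, b) => (b.probability > a.probability ? b : a));
  return `${best.className} (${(best.probability * 100).toFixed(1)}%)`;
}

// Stand-in model with made-up numbers, used in place of mobilenet.load().
const fakeModel = {
  classify: async () => [
    { className: "golden retriever", probability: 0.82 },
    { className: "Labrador retriever", probability: 0.11 },
  ],
};

describeTopResult(fakeModel, null).then(console.log);
```

In the browser, the object returned by `mobilenet.load()` would take the place of `fakeModel`, and `input` would be an image element.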
When you are working with custom models:

Prepare
Load
Predict

When you first create a model, it isn't configured to give useful outputs.
You need to set it up with your pipeline (activation functions), and then train it.
Feed your data to the model and, at first, it will give you random results.
Take the loss score (how wrong it was) and feed it back to optimize it.
Accuracy will eventually improve with more cycles.
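The training cycle described above can be sketched in plain JavaScript for the simplest possible model, a single weight and bias. This uses gradient descent on a mean squared error loss, which is one common choice and not necessarily what the talk used; the data, learning rate, and cycle count are made up.

```javascript
// Training data generated from y = 2x + 1 (made up for this sketch).
const xs = [0, 1, 2, 3, 4];
const ys = xs.map((x) => 2 * x + 1);

// Loss score: mean squared error, i.e. how wrong the current weight and bias are.
function loss(weight, bias) {
  const total = xs.reduce((sum, x, i) => sum + (weight * x + bias - ys[i]) ** 2, 0);
  return total / xs.length;
}

// One training cycle: nudge weight and bias against the gradient of the loss.
function trainStep(weight, bias, learningRate) {
  let gradW = 0;
  let gradB = 0;
  xs.forEach((x, i) => {
    const error = weight * x + bias - ys[i];
    gradW += (2 * error * x) / xs.length;
    gradB += (2 * error) / xs.length;
  });
  return [weight - learningRate * gradW, bias - learningRate * gradB];
}

let weight = Math.random(); // starts out random, so early outputs are useless
let bias = Math.random();
const initialLoss = loss(weight, bias);
for (let cycle = 0; cycle < 2000; cycle++) {
  [weight, bias] = trainStep(weight, bias, 0.05);
}
console.log(initialLoss, loss(weight, bias), weight, bias);
```

After enough cycles the weight and bias settle close to 2 and 1, the values the data was generated from.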
Transfer Learning:
Taking an existing trained model and retraining it on new data

Convert element to Tensor
Reshape Tensor
Normalize Tensor
Run the inference
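The first three steps can be mimicked in plain JavaScript to show what happens to the data. This is only an illustration with a made-up two-pixel image and hypothetical helper names (`rgbaToRgb`, `reshape`, `normalize`). In TensorFlow.js the same work is done on tensors (for instance, `tf.browser.fromPixels` handles the first step), and the 0 to 1 value range is an assumption that varies by model.

```javascript
// Step 1, "convert": drop the alpha channel from RGBA pixel data
// (the layout used by canvas ImageData) to get a flat RGB array.
function rgbaToRgb(rgba) {
  const rgb = [];
  for (let i = 0; i < rgba.length; i += 4) {
    rgb.push(rgba[i], rgba[i + 1], rgba[i + 2]);
  }
  return rgb;
}

// Step 2, "reshape": arrange the flat values as [batch, height, width, channels].
function reshape(flat, height, width) {
  const rows = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      const start = (y * width + x) * 3;
      row.push(flat.slice(start, start + 3));
    }
    rows.push(row);
  }
  return [rows]; // a batch containing one image
}

// Step 3, "normalize": scale every 0..255 channel value into the 0..1 range.
function normalize(nested) {
  return Array.isArray(nested) ? nested.map((item) => normalize(item)) : nested / 255;
}

// A made-up 1x2 image: one red pixel and one white pixel.
const rgba = [255, 0, 0, 255, 255, 255, 255, 255];
const input = normalize(reshape(rgbaToRgb(rgba), 1, 2));
console.log(JSON.stringify(input)); // [[[[1,0,0],[1,1,1]]]]

// Step 4, "run the inference", would hand this batch to the model.
```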

A Tensor is, roughly, a container of data (a multi-dimensional array).

### Performance

On the front-end, models can be kinda slow. You can warm them up ahead of time.
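One way to warm a model up is to run a throwaway prediction on an all-zero input before the user needs a real one. This is a sketch under assumptions: `warmUp` is a hypothetical helper, `tf` and `model` are passed in instead of imported, and the input shape in the example call is a guess that depends on the model.

```javascript
// `tf` is the TensorFlow.js namespace and `model` is an already loaded model.
// Both are parameters so the sketch makes no assumption about how they were imported.
function warmUp(tf, model, inputShape) {
  const zeros = tf.zeros(inputShape);
  const result = model.predict(zeros);
  result.dataSync(); // read the values back so the computation actually finishes
  zeros.dispose();   // release both tensors once the warm-up is done
  result.dispose();
}

// Example call, assuming a model that takes one 224x224 RGB image:
// warmUp(tf, model, [1, 224, 224, 3]);
```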
Models can get big, so think carefully before making users download one.
Train a model that's as small as possible while still being powerful enough.
Machine Learning uses a LOT of memory, and when running on WebGL that memory is not garbage collected for you.
Need to be careful to clean up.
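In TensorFlow.js, cleaning up usually means `tf.tidy` and `dispose()`. The sketch below shows one way to combine them; `predictAndCleanUp` is a hypothetical helper, and `tf` and `model` are passed in as parameters so the example makes no assumption about how they are loaded.

```javascript
// `tf` is the TensorFlow.js namespace and `model` is an already loaded model.
function predictAndCleanUp(tf, model, pixelValues, shape) {
  // tf.tidy disposes every tensor created inside the callback,
  // except the one that is returned.
  const output = tf.tidy(() => {
    const input = tf.tensor(pixelValues, shape);
    return model.predict(input);
  });
  const values = Array.from(output.dataSync()); // copy the results into a plain array
  output.dispose(); // the returned tensor still has to be released by hand
  return values;
}
```

Returning a plain array means the caller never holds a tensor and so has nothing left to clean up.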
You can always just run it in Node.js on the backend.