VanJS 20200213 - Dina Berry
Works in the content org at Microsoft.
How many people consider themselves just JS devs? Not many. How many use JS and at least one other language? A lot more, mostly on the backend.
If you had documentation for an API, would you want it language-based or just REST? Mix of both
AI is just a function call, and it has an obvious return: an answer, and a (confidence) score for that answer
const { answer, score } = describeImage()
That means the function already has an algorithm, and that algorithm has already been trained on existing data.
What data is appropriate to send through the function?
< demo using Microsoft Computer Vision REST API >
It's going to return tags and a caption (and confidence).
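As a rough sketch of what "tags and a caption (and confidence)" can look like in code, the snippet below reduces a response object to the { answer, score } pair from earlier. The response shape is an assumption modeled on the Computer Vision Describe Image operation (a `description` object holding `tags` and `captions`), the sample values are invented, and `toAnswer` is a hypothetical helper, not part of any SDK.

```javascript
// Hypothetical sample response: the shape is assumed, the values are invented.
const sampleResponse = {
  description: {
    tags: ["outdoor", "dog", "grass"],
    captions: [
      { text: "a dog sitting in the grass", confidence: 0.92 },
      { text: "a dog in a field", confidence: 0.61 },
    ],
  },
};

// Reduce the response to an { answer, score } pair, keeping the tags alongside.
function toAnswer(response) {
  const { tags = [], captions = [] } = response.description || {};
  // Pick the caption the service is most confident about.
  const best = captions.reduce(
    (top, c) => (top === null || c.confidence > top.confidence ? c : top),
    null
  );
  return {
    answer: best ? best.text : null,
    score: best ? best.confidence : 0,
    tags,
  };
}

const { answer, score } = toAnswer(sampleResponse);
console.log(answer, score);
```

Handling the empty case (no captions) up front keeps the caller's destructuring safe even when the service has nothing to say about an image.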
< demo returns the wrong caption, with confidence of 89% >
The algorithm is 89% confident, but it's still wrong. You need to look at your data before you use AI.
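One way to "look at your data" is to run the function over a handful of items where you already know the right answer, then list the cases where the score was high and the answer was still wrong. A minimal sketch, assuming you have hand-checked captions; the `results` array, the helper name `confidentMistakes`, and the 0.8 threshold are all invented for illustration.

```javascript
// Hypothetical results: what the service said vs. what a person verified.
const results = [
  { predicted: "a cat on a couch", score: 0.89, expected: "a dog on a couch" },
  { predicted: "a red car", score: 0.93, expected: "a red car" },
  { predicted: "a bowl of fruit", score: 0.55, expected: "a bowl of soup" },
];

// Return the items the model was confident about but still got wrong.
function confidentMistakes(items, threshold = 0.8) {
  return items.filter(
    (r) => r.score >= threshold && r.predicted !== r.expected
  );
}

const mistakes = confidentMistakes(results);
console.log(`${mistakes.length} of ${results.length} were confident but wrong`);
```

A high score only says how sure the algorithm is, so a list like `mistakes` is a quick check on whether that score can be trusted for your kind of data.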
Speech demo: The container (with the AI model) is 14 GB for one voice in one language.
You have the basic pre-trained model, and now you bring your own stuff.
You need to learn some vocabulary, and a lot of the terms overlap:
- Example: The data you are training your model with.
  - Positive Example: Data you do want to recognize
  - Negative Example: Data you don't want to recognize
- Features: A "cheat-sheet", helps the machine find or predict something better
- Models: The algorithm and the data together.
- Overtraining: Showing the algorithm the same thing over and over in the hope it gets better at predicting it.
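To make the vocabulary concrete, here is a toy sketch in plain JavaScript. Everything in it is hypothetical (the utterances, the `weatherWords` list, and the helper names). It only illustrates how labeled examples and a feature relate to each other, and is not how any particular service trains a model.

```javascript
// Examples: labeled data. `positive: true` means "we want to recognize this".
const examples = [
  { text: "what's the weather in Vancouver", positive: true },
  { text: "will it rain tomorrow", positive: true },
  { text: "book me a flight", positive: false },
];

// Feature: a "cheat-sheet" of words that hint at the thing we care about.
const weatherWords = ["weather", "rain", "snow", "forecast"];

// Does the text contain any word from the cheat-sheet?
function hasFeature(text, words) {
  const tokens = text.toLowerCase().split(/\W+/);
  return words.some((w) => tokens.includes(w));
}

// Fraction of examples where the feature alone agrees with the label.
function featureAccuracy(items, words) {
  const correct = items.filter(
    (ex) => hasFeature(ex.text, words) === ex.positive
  ).length;
  return correct / items.length;
}

console.log(featureAccuracy(examples, weatherWords));
```

The negative example matters here: without "book me a flight" there would be no way to tell whether the cheat-sheet is useful or simply matches everything.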
Development Cycle is also very different (in a loop):
- Explore data
- Label data
- Add Features
- Test
- Publish
Labelling Data in an AI Model: Identify positive examples in the data
It's important to label data even if you're not using it, to provide better positive examples for classification.
With pretrained models, you have to do far less labeling. Just label the app-specific custom data.
https://github.com/microsoft/recognizers-text
Monetization of your data. You can use pretrained models across your data.