Lights, Camera, Algorithm
Act out and discuss machine learning algorithms. This activity is from a SRCCON 2018 session led by Jeremy Merrill and Rachel Shorey.
You'll need:
- Index cards
- Dice with varying numbers of faces (several D10 and one D6 for sure)
- Masking tape to mark floor
- Paper, easel, marker
- Stickers in several colors
What you'll get: a concrete understanding of what machine learning is by… acting it out. If you have never implemented a machine learning algorithm before, then by the end of this session... you probably still will not be able to implement a machine learning algorithm. The goal is to learn more about machine learning and the issues that come up when it is applied to real-world data. This could be a starting point for digging deeper, could help you with reporting on machine learning, could help you get a better sense of what's happening with something you already use, or could just be a fun way to end your day.
We're going to experience two different algorithms. First, let's quickly define machine learning and talk about why we're doing two algorithms.

Machine learning is:
- Not magic. It's just manipulating numbers to find patterns that might not be intuitively obvious.
- Different from regular computer code. With regular code, you tell the computer what to do; with machine learning, you tell the computer how to figure out what to do based on some data.

There are two kinds of machine learning, and we're trying both of them:
- Supervised: you have a bunch of things for which you know the right answer, and you want to decide the right answer for new things based on the ones you already knew about. We'll act out a simplified version of random forest.
- Unsupervised: you have a bunch of things and you don't know the right answer, but you want to learn something interesting about them by finding similarities and differences. We'll act out a simplified version of k-means.

OKAY, LET'S GO! We're starting with supervised.
For the random forest, you'll need:
- ~50 index cards, each with the name of a fruit or vegetable, clearly indicating whether it is a FRUIT or VEGETABLE, with features written on the back.
- Furniture moved out of the way
- An easel with paper
Recommendation: your algorithm will work better if you have an approximately equal number of fruits and vegetables. If you have fewer people than cards, we recommend stacking the cards in alternating fruit-veg order before handing them out to make this happen.
If the group is advanced, you might want to include a few adversarial examples to test at the end. We did not do this due to time and crowd restrictions, but if you want to, we might recommend:
- Things that are not obviously a fruit or vegetable (maybe a poisonous plant)
- Things that are very obviously not a fruit or a vegetable (maybe a baseball cap, or a cup of water)
- Something that is a fruit or a vegetable but has most of the data missing
In a random forest, we build an arbitrary number of decision trees [ask someone to explain what a decision tree is?], each to an arbitrary depth. For each tree, at a branching point we select some features at random and pick the best of those to branch on. In this case, we need to keep it pretty simple, so we’re going to build three trees. And we’re going to stop building each tree when we’ve done just 2 levels of splitting. And each time, we’re just going to randomly pick a single feature to split on. Let’s see how it goes!
Everyone’s going to get an index card with a fruit or vegetable on it. It will be labelled “fruit” or “veg”. On the back, you’ll find 6 features, with true/false answers:
- Does the food item have more than 30 calories per 100g?
- Does the food item cost more than a dollar a pound?
- Do you need to peel the item?
- Is the item green?
- Do you keep it in the fridge?
- Does it grow on a tree?
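If you want to poke at the same procedure in code afterwards, here is a minimal Python sketch. The tree count, depth, and single-random-feature splits mirror the activity, but the card names and feature values below are made up for illustration:

```python
import random

# Feature order matches the six questions on the back of each card.
FEATURES = [">30 cal per 100g", "over $1/pound", "need to peel",
            "green?", "keep in fridge", "grows on a tree"]

# Made-up cards: (name, [six true/false features], true label).
CARDS = [
    ("Apple",  [True, False, False, True, True, True],  "fruit"),
    ("Banana", [True, False, True, False, False, True], "fruit"),
    ("Celery", [False, True, False, True, True, False], "veg"),
    ("Carrot", [True, False, True, False, True, False], "veg"),
]

def build_tree(cards, depth=2):
    """Split on one randomly chosen feature per level (the D6 roll)."""
    labels = [label for _, _, label in cards]
    if depth == 0 or len(set(labels)) == 1:
        # Leaf: predict the majority label among the cards that landed here.
        # (In the activity, a tied leaf gets split again; here we just pick.)
        return max(set(labels), key=labels.count)
    feature = random.randrange(len(FEATURES))
    left = [c for c in cards if not c[1][feature]]
    right = [c for c in cards if c[1][feature]]
    if not left or not right:  # useless split; make a leaf instead
        return max(set(labels), key=labels.count)
    return (feature, build_tree(left, depth - 1), build_tree(right, depth - 1))

def predict(tree, features):
    """Walk one tree: FALSE goes left, TRUE goes right."""
    while isinstance(tree, tuple):
        feature, left, right = tree
        tree = right if features[feature] else left
    return tree

forest = [build_tree(CARDS) for _ in range(3)]  # three trees, as in the session
votes = [predict(tree, [True, False, False, False, True, True]) for tree in forest]
print(max(set(votes), key=votes.count))  # majority vote across the trees
```

Real random forests build many deeper trees and, at each split, pick the best of several randomly chosen features rather than a single random one.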
1. Facilitator 1 will set up to record the results of each tree on a separate sheet of paper on the easel for testing purposes later.
2. Facilitator 2 will have each data point roll a D10. Anyone rolling a 1 will be put in “test”. Move the test data to the side. Impress on the test data the importance of not becoming contaminated by mingling with the training data. Facilitator 1 will be responsible for babysitting the test data. If the test data seem bored/rowdy, they could be encouraged to play duck duck goose. (They will probably enjoy watching, though.)
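The D10 roll here is just a random roughly-10% holdout. A sketch, with an invented `train_test_split` helper:

```python
import random

def train_test_split(cards, seed=None):
    """Each card 'rolls a D10'; a roll of 1 sends it to the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for card in cards:
        (test if rng.randrange(1, 11) == 1 else train).append(card)
    return train, test

train, test = train_test_split(list(range(50)), seed=7)
print(len(train), len(test))  # roughly a 90/10 split
```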
3. With the entire training group, facilitator 2 will roll a D6 to select the feature we will split on first. The data points who are TRUE on that feature will go right; the data points who are FALSE will go left. Both facilitators should help out with this and confirm everyone knows what is happening for the first round. Facilitator 1 will draw the node of the tree with the selected feature.
4. For each of the 2 groups created in step 3, facilitator 2 will roll the die again, throwing out the roll and rolling again if the same feature comes up. The individuals will split again, and facilitator 1 will record.
5. At each leaf, facilitator 2 will count the number of fruits and veggies, and facilitator 1 will record. If any leaf is tied, facilitator 2 will split it again using the same procedure as in the previous steps.
6. Repeat steps 3-5 two more times, for a total of 3 trees. Facilitator 1 will continue recording and can possibly encourage some cheering/jeering from the likely now-bored test set (“TEST IS THE BEST”, etc).
7. The training set is banished. Using the diagrams, facilitator 1 will help one test data point run through the first tree out loud (if you happen to know your test data and there is someone who's particularly charismatic, pick that person; it goes well if this data point gets really into it). Then all members of the test set will evaluate themselves, recording a “vote” from each tree. [Note: we tried actually dividing them the way we had the training data, and it got confusing. We recommend having them each trace through mentally, keep track of how many fruit/veg votes they got, and then just say that out loud.]
8. Facilitator 1 will have each test data point introduce themself with their name (eg: “Apple”), true category (“Fruit”), and number of votes for “Fruit” vs “Vegetable”. Facilitator 2 (NOTE THIS SWITCH) will record who was categorized correctly. We will record the overall prediction accuracy of the algorithm, and also note whether that accuracy was in any way skewed (eg if it just predicts everything is a fruit).
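Tallying this last part in code looks something like the following; the `results` pairs are made up for illustration:

```python
from collections import Counter

# Made-up (true label, predicted label) pairs, one per test data point.
results = [("fruit", "fruit"), ("veg", "fruit"),
           ("fruit", "fruit"), ("veg", "veg")]

correct = sum(truth == pred for truth, pred in results)
accuracy = correct / len(results)
print(f"accuracy: {accuracy:.0%}")  # accuracy: 75%

# Skew check: did the forest just predict "fruit" for everything?
print(Counter(pred for _, pred in results))
```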
9. If using adversarial examples, facilitator 1 will evaluate them out loud, and facilitator 2 will record these as well.
For k-means, you'll need:
- ~50 index cards, each with the name of a fruit or vegetable, including several adversarial examples
- Furniture moved out of the way
- A grid roughly taped on the floor
In k-means, we’re going to spread our data out along two dimensions. Then we’re going to randomly place some cluster centroids on the board. Each data point will “choose” the centroid closest to them. The centroids will then move, to be in the approximate center of the points who have “chosen” them. Then the data points will re-choose centroids, and the centroids will move again. We’ll keep doing this until no one moves.
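The whole loop can be sketched in a few lines of Python. The grid size, example points, and stopping rule below are illustrative assumptions:

```python
import random

def kmeans(points, k=3, max_rounds=10, seed=None):
    """2-D k-means on (sweetness, size) points, as acted out on the floor."""
    rng = random.Random(seed)
    # Each centroid "rolls a D10 twice" for a random starting spot.
    centroids = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(k)]
    for _ in range(max_rounds):
        # Each data point chooses the centroid closest to it.
        clusters = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(range(k), key=lambda i: (x - centroids[i][0]) ** 2
                                                + (y - centroids[i][1]) ** 2)
            clusters[nearest].append((x, y))
        # Each centroid moves to the middle of the points that chose it
        # (or stays put if no one chose it).
        moved = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
                 if c else centroids[i]
                 for i, c in enumerate(clusters)]
        if moved == centroids:  # no one moved: we're done
            break
        centroids = moved
    return centroids, clusters

centroids, clusters = kmeans([(1, 2), (2, 1), (8, 9), (9, 8), (5, 5)],
                             k=2, seed=1)
```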
1. Facilitator 1 selects 3-4 people from the group to be centroids, depending on group size. Facilitator 1 huddles with the centroids to discuss their role. As part of this, each centroid will be assigned a sticker color and will get a set of those stickers.
2. Facilitator 2 gives each data point a fresh index card with the name of a fruit or vegetable on it, and directs them to stand in the most appropriate place on the grid. The X axis is “sweetness” and the Y axis is “size”. Give some examples, eg:
- Up here, at the top, we’d put the very largest fruits or vegetables, such as one of those giant pumpkins
- Down here at the bottom, we’d imagine things like green peas or blueberries
- Over to the left, we’d want things that aren’t sweet at all, like maybe celery?
- And to the right, like, literal sugar cane.
- Use your own judgement!
3. When the data points are all placed, facilitator 1 will have each centroid roll a D10 twice, once for sweetness and once for size. Help them place themselves on the grid, and put a sticker down so they can find their spot again if they move.
4. Direct data points to identify their nearest centroid. If they feel they are equidistant, one of the facilitators will be the judge [people were good at deciding and did so reasonably in our session]. Each data point will collect a sticker from the nearest centroid.
5. One by one, we will cycle through the centroids. For each centroid, ask the people who have that color sticker to raise their hands. The centroid must now estimate the midpoint of their data points and move there. They should put a new sticker on the ground.
6. Each data point should again identify the nearest centroid and, if it has changed, get a new sticker color, remembering which is the newest one. Facilitators ask if anyone changed sticker color. If so, repeat steps 5 and 6 until no one changes. [We stopped after 3 moves, even though there was still movement, because people got the idea and the clusters seemed pretty solid.]
7. Ask the members of each cluster to huddle and talk about what they have in common, and whether there are any outliers. Then ask each centroid to report back about what is in the cluster, what kinds of things they have in common (eg, are all the citrus fruits there?), and what outliers they found.
Afterwards, hold a discussion, with prompts:
- When does our Random Forest classifier do a bad job of classifying/grouping? [in our case, the random forest was 100% successful(!) so we kind of flailed a little bit here]
- Our "training" data had a bunch of clear answers. When is your data not going to be so clear? (i.e. barriers to getting real data)
- What happens if the "right answers" are actually wrong?
- Why did we test, anyway?
- Did you disagree with any of the provided data or classifications in Random Forest? [Avocado was VERY upset in our random forest. Would recommend finding Avocado for a person-on-the-street interview]
- Better real examples of when to use ML?
- Have you had times it's gone badly? (Talk about Breaking the Black Box.)
- Talk about the effects of randomly choosing a training set, and how things could differ if different people end up in the training set. (Consider whether the result changes when an unintuitive edge case is in or out of the training set: tomatoes or venus fly traps, abstract nouns, verbs, or something that makes sense but has missing data, like a prehistoric animal about which we don't know all the answers.)
- If you are reporting on machine learning, what questions should you definitely ask?
- If you were to try to explain why either of the models, once it's in production, made a particular decision, what would you say? [Be sure to mention the complete unexplainability of deep learning.]
|fruit|>30 cal per 100g|over $1/pound|need to peel|green?|keep in fridge|grows on a tree|truth|
|---|---|---|---|---|---|---|---|