The goal of this document is to describe how and why we created the Generality Widget, focusing on:
- What data we used
- How we coded the data
- How we prepared the data
- How we analyzed the data
- How we used the results of the analysis in the widget
- The intended use and specific functionality of the widget
- Data from 845 students' responses over five time points (with a good deal of missing data): https://drive.google.com/file/d/1PLRN2dCBXQaf5BPvIEEyuUspjrfoHwy-/view?usp=sharing
- The responses come from "embedded" assessments targeting students' explanations and/or models of a phenomenon
- Students wrote 1-3 sentences and indicated whether their explanation and/or model was general or specific
- We used a combined approach, first ML-driven and then human-driven, to inductively develop a coding frame for students' consideration of the generality of their explanations and/or models.
- We also sought to understand how reliably we could apply the coding frame with supervised ML
- Manual identification and removal of blank responses
- NLP techniques: stemming and stop-word removal; see some details here: https://gist.github.com/jrosen48/6b5051640975d53d2f5d3b88f8c6a3fe
- winnowing our codes to six (by ignoring sub-codes)
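The preparation steps above were carried out in R (see the linked gist for details). Purely to illustrate the idea, here is a minimal Python sketch of blank-response removal, stop-word removal, and stemming; the stop-word list and the suffix-stripping "stemmer" are toy stand-ins, not the ones actually used.

```python
import re

# Tiny illustrative stop-word list; the real list used in the R pipeline
# (see the gist) is much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it"}


def crude_stem(token):
    """Very rough suffix stripping; a stand-in for a real stemmer
    (e.g., the Porter stemmer provided by most NLP toolkits)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(response):
    """Lowercase, tokenize, drop stop words, and stem one response."""
    tokens = re.findall(r"[a-z]+", response.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


responses = [
    "The model explains charging in general",
    "",  # blank responses are identified and removed before analysis
    "It is specific to balloons",
]
processed = [preprocess(r) for r in responses if r.strip()]
```

The same response then enters the analysis as a bag of stemmed tokens, e.g. `"The model explains charging in general"` becomes `["model", "explain", "charg", "general"]`.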
- used the R packages quanteda (https://quanteda.io/) and quanteda.textmodels (https://cran.r-project.org/web/packages/quanteda.textmodels/quanteda.textmodels.pdf) for naive Bayes and support vector machine classifiers
- also used quanteda.classifiers (https://github.com/quanteda/quanteda.classifiers) for a deep learning/neural network classifier
- used leave-one-out cross-validation and the weighted kappa and percent agreement statistics to select the best classifier
- (we could also consider other metrics)
- (we could inspect the confusion matrix to see for which codes the classifier performed best)
- (we did not focus a great deal on optimizing or tuning each specific classifier)
- (we still need to better understand how and why each specific algorithm works)
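To make the leave-one-out procedure concrete, here is a minimal Python sketch. The classifier is a deliberately simple token-overlap nearest-neighbour stand-in (the actual work used the quanteda naive Bayes, SVM, and neural network models), and the example responses and labels are invented; only the cross-validation loop itself is the point.

```python
def predict_nearest(train_docs, train_labels, doc):
    """1-nearest-neighbour by shared tokens: a toy stand-in classifier,
    used here only so the cross-validation loop has something to call."""
    overlap = lambda a, b: len(set(a.lower().split()) & set(b.lower().split()))
    best = max(range(len(train_docs)), key=lambda i: overlap(train_docs[i], doc))
    return train_labels[best]


def leave_one_out(docs, labels, predict):
    """Hold out each response once, train on the rest, predict the held-out one."""
    preds = []
    for i in range(len(docs)):
        train_docs = docs[:i] + docs[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        preds.append(predict(train_docs, train_labels, docs[i]))
    return preds


# Invented example data
docs = [
    "works for any charged object in general",
    "any two objects attract in general",
    "only this balloon sticks to the wall",
    "this balloon moved toward the wall",
]
labels = ["general", "general", "specific", "specific"]

preds = leave_one_out(docs, labels, predict_nearest)
agreement = sum(p == l for p, l in zip(preds, labels)) / len(labels)
```

The held-out predictions are then compared against the human codes with the agreement statistics described above.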
- selected the support vector machine, which seemed to perform both best relative to the other classifiers and adequately in absolute terms (kappa = .65, percent agreement = .70); the neural network performed very similarly
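As an illustration of the two selection statistics, here is a from-scratch Python sketch of percent agreement and weighted Cohen's kappa. Linear weights are assumed for concreteness (quadratic weights are the other common choice), and the truth/prediction vectors are invented; statistical packages provide tested implementations of both statistics.

```python
def percent_agreement(truth, pred):
    """Proportion of cases where the predicted code matches the human code."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def weighted_kappa(truth, pred, weights="linear"):
    """Weighted Cohen's kappa for ordinal codes: chance-corrected agreement
    where near-misses are penalized less than distant disagreements."""
    n = len(truth)
    categories = sorted(set(truth) | set(pred))
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Observed confusion matrix and its marginals
    obs = [[0.0] * k for _ in range(k)]
    for t, p in zip(truth, pred):
        obs[index[t]][index[p]] += 1
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    def w(i, j):
        d = abs(i - j)
        return d if weights == "linear" else d * d

    observed = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    expected = sum(w(i, j) * row[i] * col[j] / n for i in range(k) for j in range(k))
    return 1.0 - observed / expected


# Invented codes on an ordinal 0-3 scale
truth = [0, 0, 1, 1, 2, 2, 2, 3]
pred = [0, 1, 1, 1, 2, 2, 3, 3]
```

On these toy vectors, percent agreement is .75 and linearly weighted kappa is about .78; the real values above come from the full leave-one-out predictions.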
- the fitted model can be used to make predictions on new data; this is what the Shiny app does: https://jmichaelrosenberg.shinyapps.io/generality-shiny/
- we are logging responses/feedback through the app (this could be improved; right now it's fairly manual)
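To sketch the fit-once, predict-new-responses workflow behind the app: the deployed model is the SVM fitted in R with quanteda.textmodels, but the same idea can be shown with a from-scratch multinomial naive Bayes in Python. The documents, labels, and new response below are invented examples.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing: an illustrative
    stand-in for the quanteda SVM that the Shiny app actually uses."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.priors = Counter(labels)
        self.n_docs = len(docs)
        self.word_counts = {c: Counter() for c in self.classes}
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        """Score each class by log prior + smoothed log likelihoods."""
        tokens = doc.lower().split()
        v = len(self.vocab)
        best_class, best_lp = None, -math.inf
        for c in self.classes:
            lp = math.log(self.priors[c] / self.n_docs)
            total = sum(self.word_counts[c].values())
            for t in tokens:
                lp += math.log((self.word_counts[c][t] + 1) / (total + v))
            if lp > best_lp:
                best_class, best_lp = c, lp
        return best_class


# Invented training data; the real model was fitted to the coded responses
docs = [
    "this works for any charged object",
    "any two objects attract in general",
    "only this balloon sticks to the wall",
    "the balloon in our experiment moved",
]
labels = ["general", "general", "specific", "specific"]

model = TinyNaiveBayes().fit(docs, labels)
prediction = model.predict("charges attract in general")
```

The app does the analogous step in R: it loads the fitted SVM once, then classifies each new response a user submits.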