Towards the Automatic Classification of Students Answers to Open-ended Questions

Introduction

To broaden the scope of MOOCs, an automated machine-scoring tool is a necessity. This paper aids graders in evaluating student answers to open-ended questions through the ASAG task, meeting the need for accessible, feasible education at large scale using a handful of statistical and deep learning toolkits.

Goal

Our goal is simple: using NLP techniques, pick the better of the response-based and reference-based approaches, juxtaposing the resulting outputs with the expected output to discriminate between correct and incorrect student answers.

Contributions

By comparing the performance of response-based and reference-based models, we aim to tease out the better semantic, analytic, and predictive tool of the two aforementioned methods, contributing to ASAG.

Outline

Chapter 1

Introduction, motivation, and goal behind the research

Chapter 2

Describes the terminology and intricacies of the ASAG task

Chapter 3

Elaborates on each component of the pipeline

Chapter 4

Applies the pipeline to three different standard datasets and contrasts the results with the state of the art

Chapter 5

Limitations and possible future work

Chapter 6

Conclusion

Background and related work

MOOCs: a little bit of history; it began with a course named "Connectivism and Connective Knowledge".

As research on MOOCs continued, platforms such as Coursera and edX came into being.

Formative assessments in MOOCs

tl;dr: assessment for learning

An interactive learning experience between teachers and students, based on a continuous feedback loop that incrementally tracks what the students are learning.

Summative assessments in MOOCs

Summative assessment is arguably critical because it evaluates students' mastery of the subject. It remains a challenge for automated tools, both for open-ended assessment (ASAG) and for essay scoring (AES).

The Automatic Short Answer Grading Task

Grading short, free-text student answers to objective questions.

ASAG Approaches

Five approaches are delineated:

Concept mapping -> juxtaposes students' answers with expected answers after normalizing morphological and syntactic variation of words, phrases, clauses, or sentences.
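
A minimal sketch of that normalization step, assuming NLTK is available; the sample answers are invented:

```python
# Lemmatize both answers so that morphological variants match, then
# compare the overlapping concepts. Sample answers are invented.
import nltk
from nltk.stem import WordNetLemmatizer

nltk.download("wordnet", quiet=True)  # one-time corpus download

lemmatizer = WordNetLemmatizer()

def normalize(text):
    """Lowercase, tokenize on whitespace, and lemmatize each token."""
    out = set()
    for tok in text.lower().split():
        lemma = lemmatizer.lemmatize(tok, pos="v")  # verb form first
        out.add(lemmatizer.lemmatize(lemma))        # then noun form
    return out

student = "the cells were dividing rapidly"
expected = "the cell divides rapidly"

overlap = normalize(student) & normalize(expected)
print(overlap)  # shared (lemmatized) concepts between the two answers
```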

Information extraction -> matches semantic patterns extracted from course material, comparing and contrasting students' answers with the teachers' formulated answers.

Corpus-based methods -> utilize a collection of answers and rely on statistical analysis, enhancing paraphrase recognition and calculating distance measures.
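
A minimal corpus-based sketch, assuming scikit-learn; the answers below are invented:

```python
# Build TF-IDF vectors over a small answer corpus and score each student
# answer by its cosine similarity to the reference answer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

reference = "photosynthesis converts light energy into chemical energy"
student_answers = [
    "plants turn light into chemical energy via photosynthesis",
    "the mitochondria is the powerhouse of the cell",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([reference] + student_answers)

# Row 0 is the reference; the remaining rows are student answers.
scores = cosine_similarity(matrix[0], matrix[1:])
print(scores)  # higher similarity suggests a more correct answer
```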

Machine learning -> predictive and clustering models.

Evaluation -> here we need to evaluate which method or methods work best; we can pose the problem as a Kaggle competition.
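
A hedged sketch of how competing methods could be scored on held-out answers; accuracy, macro-F1, and Cohen's kappa are common choices for grading tasks, and the label arrays below are placeholders:

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]  # gold correct/incorrect labels (placeholder)
y_pred = [1, 0, 1, 0, 0, 1]  # one method's predictions (placeholder)

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
print("kappa:   ", cohen_kappa_score(y_true, y_pred))
```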

How it works on a new set of unseen answers

Three approaches:

  1. reference-based approaches

  2. response-based approaches

  3. hybrid approaches

Reference-based approaches

Comparing a student answer and a model answer at different levels, taking various aspects into account, such as content, structure, and style.

Counting the ratio of matches and the sequencing of words.

Another study learns the cost of the edit operations, for instance stemming matches, synonym alignment, word shifts, insertions, deletions, paraphrasing, and substitutions, which essentially measures the similarity between two sentences.
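
That study learns the cost of each operation; as a crude, fixed-cost stand-in, Python's difflib gives a token-level match ratio. The sentence pair is invented:

```python
from difflib import SequenceMatcher

def token_similarity(a, b):
    """Similarity in [0, 1] over word tokens rather than characters."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()

print(token_similarity(
    "the heart pumps blood through the body",
    "blood is pumped through the body by the heart",
))
```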

Checking stylistic aspects of the text.

A Deep Belief Network coalesces three ideas: capturing the similarity between student answers and model answers, representing the difficulty level, and estimating the probability of a student's mastery based on past performance.

A few models to get it done (a sketch comparing them follows the list):

Naive Bayes

Logistic Regression

Decision Tree

Artificial Neural Network

Support Vector Machine

DBN (Deep Belief Network)
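
A minimal sketch comparing most of the listed classifiers on toy similarity features, assuming scikit-learn; X would hold per-answer feature vectors (e.g., similarity scores) and y correct/incorrect labels, both synthetic here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 5))                    # 5 similarity features per answer
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # synthetic correct/incorrect label

models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(),
    "Decision Tree": DecisionTreeClassifier(),
    "Neural Network": MLPClassifier(max_iter=2000),
    "SVM": SVC(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold accuracy
    print(f"{name}: {scores.mean():.3f}")
```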

Response-based approaches

Generates a vector space classification from all student answers to identify syntactic, semantic, and lexical characteristics.

Random forest (RF) and gradient boosting machine (GBM) models generate an average score.

Four-gram and six-gram feature models, essentially fed into RF, GBM, Ridge Regression (RR), Support Vector Regression (SVR), and k-nearest neighbors (KNN); a sketch follows.
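
A hedged sketch of that setup: word n-gram TF-IDF features (up to four-grams here) feed several regressors whose predictions are averaged into a single score. The answers and grades are invented:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

answers = [
    "gravity pulls objects toward the earth",
    "objects fall because of gravity",
    "things fall down",
    "the earth is flat",
] * 5                       # repeated to give the models something to fit
grades = [1.0, 1.0, 0.5, 0.0] * 5

# Word n-grams up to length 4, mirroring the four-gram models above.
vectorizer = TfidfVectorizer(ngram_range=(1, 4))
X = vectorizer.fit_transform(answers).toarray()

regressors = [
    RandomForestRegressor(random_state=0),
    GradientBoostingRegressor(random_state=0),
    Ridge(),
    SVR(),
    KNeighborsRegressor(n_neighbors=3),
]
predictions = [r.fit(X, grades).predict(X) for r in regressors]
average_score = np.mean(predictions, axis=0)  # ensemble average per answer
print(average_score[:4])
```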

Hybrid approaches

Basically the amalgamation of reference-based and response-based approaches, though in theory it can be stacked with other algorithms, for example Canonical Correlation Analysis (CCA).
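
A minimal sketch, assuming scikit-learn, of fusing the two views with CCA; X_ref and X_resp stand in for reference-based and response-based feature matrices, both synthetic here:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X_ref = rng.random((100, 6))    # e.g., similarity-to-model-answer features
X_resp = rng.random((100, 8))   # e.g., n-gram features over all answers

cca = CCA(n_components=2)
ref_c, resp_c = cca.fit_transform(X_ref, X_resp)

# The correlated projections can be concatenated and fed to any classifier.
fused = np.hstack([ref_c, resp_c])
print(fused.shape)  # (100, 4)
```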

AI techniques for ASAG

classification

training set

validation set

test set

Three classification algorithms (sketched after the list):

random forest classifier

support vector machine

stacked logistic regression
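
A hedged end-to-end sketch, assuming scikit-learn: split synthetic data into train/validation/test sets and fit the three listed classifiers, with "stacked logistic regression" interpreted here as a stacking ensemble whose final estimator is a logistic regression:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((300, 10))            # synthetic answer features
y = (X[:, 0] > 0.5).astype(int)      # synthetic correct/incorrect labels

# 60% train, 20% validation, 20% test.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

rf = RandomForestClassifier(random_state=0)
svm = SVC(probability=True, random_state=0)
stack = StackingClassifier(
    estimators=[("rf", rf), ("svm", svm)],
    final_estimator=LogisticRegression(),
)

for name, model in [("random forest", rf), ("SVM", svm), ("stacked LR", stack)]:
    model.fit(X_train, y_train)
    print(name, "validation accuracy:", model.score(X_val, y_val))
```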

Natural Language Processing

The subfield of AI covering linguistic aspects such as morphology, syntax, semantics, and discourse, applied to various tasks, for example QA, text recognition, text summarization, text generation, spelling correction, and detecting the tone of a text.

Semantic Web

It gives machines the ability to read content from the internet in machine-readable form, gleaning huge amounts of data from Wikipedia or anchored links.

Vector space models

N-grams -> sequences of n words.

Word embeddings -> dense vector representations that mitigate the curse of dimensionality.

GloVe -> a count-based unsupervised learning method; the model learns from ratios of co-occurrence probabilities.

Word2Vec -> a prediction-based unsupervised learning method that can both predict a word from its context (the CBOW model) and predict the context from a word (the skip-gram architecture).

DBpedia -> generates vector space representations of DBpedia entities.

FastText -> mitigates a problem that skip-gram suffers from by adding subword (character n-gram) information, which also handles out-of-vocabulary words (see the sketch below).
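
A minimal sketch, assuming gensim 4.x, contrasting CBOW (sg=0) with skip-gram (sg=1) and showing FastText's subword trick on an unseen word; the toy corpus is far too small for meaningful vectors:

```python
from gensim.models import FastText, Word2Vec

corpus = [
    ["photosynthesis", "converts", "light", "into", "energy"],
    ["plants", "use", "light", "for", "photosynthesis"],
] * 50

cbow = Word2Vec(corpus, vector_size=50, sg=0, min_count=1)      # predict word from context
skipgram = Word2Vec(corpus, vector_size=50, sg=1, min_count=1)  # predict context from word
fasttext = FastText(corpus, vector_size=50, sg=1, min_count=1)  # adds character n-grams

print(skipgram.wv.most_similar("light", topn=2))
# FastText can embed out-of-vocabulary words from their character n-grams:
print(fasttext.wv["photosynthesize"][:5])
```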

Sentence Embeddings

Projecting sentence representations into continuous spaces: an RNN extracts information from a sentence and feeds it into a multi-layer perceptron; importantly, when the learning rate falls below a threshold of 1e-5, training stops.
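
A hedged PyTorch sketch of that setup: a GRU encodes the token embeddings of a sentence, an MLP scores the encoding, and training stops once the scheduler drives the learning rate below 1e-5. All shapes and data are synthetic:

```python
import torch
import torch.nn as nn

class SentenceScorer(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, token_ids):
        _, hidden = self.rnn(self.embed(token_ids))
        return self.mlp(hidden[-1])        # score from the final RNN state

model = SentenceScorer()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5)

tokens = torch.randint(0, 1000, (32, 20))  # batch of 32 sentences, 20 tokens
targets = torch.rand(32, 1)                # synthetic grades

for epoch in range(1000):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(tokens), targets)
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())            # halve lr when loss plateaus
    if optimizer.param_groups[0]["lr"] < 1e-5:  # the 1e-5 stopping threshold
        break
```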

Research methodology

A technical, term-by-term explanation of the whole thesis.

General Architecture

A few diagrams of how the algorithm works.

Limitations

It has limitations due to language nuances, and we need to curate huge datasets for the training, validation, and test sets. Some open-ended questions involve subjective matters that we have to take into account, and our deep learning algorithms need to be adaptive to varied circumstances. Therefore, we have to trial distinct algorithms for distinct cases.

tons of figures to measure accuracy
