To broaden the scope of MOOCs, an automated machine-scoring tool is a necessity. This paper aids the grader in evaluating student answers to open-ended questions through the ASAG task, meeting the need for accessible, feasible education at large scale using a handful of statistical and deep learning toolkits.
Our goal is simple: using NLP techniques, pick the better of the response-based and the reference-based approach, juxtaposing the produced outputs with the expected output to discriminate between correct and incorrect student answers.
By comparing the performance of response-based and reference-based models, we aim to single out the better semantic, analytic, and predictive tool of the two, and thereby contribute to ASAG.
Chapter 1
introduction, motivation, and goal behind the research
Chapter 2
describing the terminology and intricacies of the ASAG technique
Chapter 3
elaboration of each component
Chapter 4
applying the pipeline to three different standard datasets and contrasting the results with the state of the art
Chapter 5
limitations and possible future work
Chapter 6
conclusion
MOOCs, a little bit of history: a course named Connectivism and Connective Knowledge
as research continued on it, Coursera and edX came into being.
tl;dr: assessment for learning
an interactive learning experience between teachers and students, based on a continual feedback loop, incrementally updating what the students are learning
Summative assessment is arguably critical because it evaluates students' mastery of the subject. It is a challenge for automated tools, both for open-ended assessment (ASAG) and for essay scoring (AES).
Basically objective questions
concept mapping -> juxtaposing students' answers and expected answers by normalizing morphological and syntactic variations of words, phrases, clauses, or sentences
information extraction -> matching semantic patterns extracted from course material, comparing and contrasting students' answers with the teachers' formulated answers
corpus-based methods -> utilizing collections of answers, relying on statistical analysis, enhancing paraphrase recognition, and calculating distance measures
machine learning -> predictive and clustering models
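As a sketch of the distance measures corpus-based methods rely on, the snippet below computes cosine similarity over raw term-frequency vectors; the tokenization and function names are illustrative, not part of any cited system:

```python
from collections import Counter
import math

def tf_vector(text):
    """Term-frequency vector of a lowercased, whitespace-tokenized answer."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between the term-frequency vectors of two answers."""
    va, vb = tf_vector(a), tf_vector(b)
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Real corpus-based systems would add stemming, stopword removal, and weighting such as TF-IDF on top of this skeleton.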
evaluation
Here we need to evaluate which method or methods work best. We can pose the problem as a Kaggle competition.
how the chosen method performs on a new set of unseen answers
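To make "which method works best" concrete, we need shared metrics on unseen answers; below is a minimal sketch of accuracy and macro-averaged F1 in plain Python (illustrative, not tied to any particular leaderboard):

```python
def accuracy(gold, pred):
    """Fraction of answers whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def f1_for_label(gold, pred, label):
    """F1 score for one label, from true/false positives and false negatives."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 scores (robust to class imbalance)."""
    labels = set(gold)
    return sum(f1_for_label(gold, pred, l) for l in labels) / len(labels)
```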
three approaches:
- reference-based approaches
- response-based approaches
- hybrid approaches
comparing a student answer and a model answer at different levels, taking various aspects into account, such as content, structure, and style
counting the ratio of matches and the sequencing of words
another study learns the costs of edit operations (for instance, stemming matches, synonym alignment, word shifts, insertion, deletion, paraphrasing, substitution), which essentially measures the similarity between two sentences
checks stylistic aspects of text
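The edit-operation idea above can be sketched as a token-level weighted edit distance; the operation costs below are arbitrary placeholders, whereas the cited study learns them from data:

```python
def weighted_edit_distance(a_tokens, b_tokens, ins=1.0, dele=1.0, sub=1.5):
    """Token-level edit distance with configurable per-operation costs."""
    m, n = len(a_tokens), len(b_tokens)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * dele          # delete all of a's first i tokens
    for j in range(1, n + 1):
        d[0][j] = j * ins           # insert all of b's first j tokens
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0.0 if a_tokens[i - 1] == b_tokens[j - 1] else sub
            d[i][j] = min(d[i - 1][j] + dele,       # deletion
                          d[i][j - 1] + ins,        # insertion
                          d[i - 1][j - 1] + cost)   # match or substitution
    return d[m][n]

def edit_similarity(a, b):
    """Turn the distance into a rough similarity in (-inf, 1]."""
    ta, tb = a.lower().split(), b.lower().split()
    return 1.0 - weighted_edit_distance(ta, tb) / max(len(ta), len(tb), 1)
```

A learned-cost variant would also handle stemming matches and synonym alignment as cheap substitutions, which this sketch omits.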
a Deep Belief Network coalesces three ideas: capturing the similarity between student answers and model answers, representing the difficulty level, and estimating the probability of a student's mastery based on past performance
a few models to get it done:
Naive Bayes
Logistic Regression
Decision Tree
Artificial Neural Network
Support Vector Machine
DBM
It generates a vector space classification from all students' answers to identify syntactic, semantic, and lexical characteristics.
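One minimal reading of this vector-space classification is a bag-of-words nearest-centroid classifier over already-graded answers; everything here (function names, the centroid choice) is an illustrative assumption, not the actual model:

```python
from collections import Counter
import math

def bow(text):
    """Bag-of-words vector of a lowercased answer."""
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(texts):
    """Sum the bag-of-words vectors of all answers sharing a label."""
    c = Counter()
    for t in texts:
        c.update(bow(t))
    return c

def classify(answer, graded):
    """graded: dict mapping label -> list of example answers with that label.
    Returns the label whose centroid is most similar to the new answer."""
    cents = {label: centroid(texts) for label, texts in graded.items()}
    return max(cents, key=lambda label: cosine(bow(answer), cents[label]))
```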
random forest and gradient boosting machine models to generate an average score
four-gram and six-gram models, essentially used as features for RF, GBM, Ridge Regression (RR), Support Vector Regression (SVR), and k-nearest neighbors (KNN)
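How n-gram features might feed one of the listed regressors can be sketched with a k-nearest-neighbors scorer (KNN being one of the models named above); the feature set and similarity measure are simplifications:

```python
from collections import Counter
import math

def ngrams(text, n):
    """Counter of word n-grams of a fixed length n."""
    toks = text.lower().split()
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

def features(text, max_n=2):
    """Combined unigram-through-max_n n-gram feature vector."""
    f = Counter()
    for n in range(1, max_n + 1):
        f.update(ngrams(text, n))
    return f

def cos(u, v):
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_score(answer, scored, k=3):
    """scored: list of (answer_text, grade) pairs.
    Predicts the mean grade of the k most similar graded answers."""
    fa = features(answer)
    nearest = sorted(scored, key=lambda p: cos(fa, features(p[0])), reverse=True)
    top = nearest[:k]
    return sum(grade for _, grade in top) / len(top)
```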
basically the amalgamation of reference-based and response-based approaches; in theory it can be stacked with other algorithms, for example Canonical Correlation Analysis (CCA)
training set
validation set
test set
three classification algorithms
random forest classifier
support vector machine
stacked logistic regression
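Stacking combines the base classifiers' scores as input features for a logistic meta-classifier; the sketch below hand-rolls that meta-step with gradient descent on toy data, purely to illustrate the idea:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_meta(base_outputs, labels, lr=0.5, epochs=200):
    """Logistic regression over base-model scores.
    base_outputs: list of feature rows, one score per base model.
    Returns one weight per base model plus a bias, fit by SGD."""
    w = [0.0] * len(base_outputs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(base_outputs, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of log-loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict_meta(w, b, x):
    """1 if the meta-classifier's probability reaches 0.5, else 0."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0
```

In the actual pipeline the feature rows would be the random forest's and SVM's predicted probabilities for each student answer, produced on held-out folds to avoid leakage.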
AI covering linguistic aspects such as morphology, syntax, semantics, and discourse, for the purpose of applying it to various fields, for example QA, text recognition, text summarization, text generation, spelling correction, and detecting the tone of a text
It gives us the ability to read machine-readable content from the internet, gleaning huge amounts of data from Wikipedia or anchored links.
N-grams -> contiguous sequences of n words
Word Embeddings -> dense vector representations that mitigate the curse of dimensionality
GloVe -> a count-based unsupervised learning method; the model learns from ratios of co-occurrence probabilities
Word2Vec -> a prediction-based unsupervised learning method which can both predict a word from its context using the CBOW model and predict the context from a word using the skip-gram architecture
DBpedia -> generates vector space representations of DBpedia entities
FastText -> mitigates a problem skip-gram suffers from: by modeling subword n-grams it can represent rare and out-of-vocabulary words
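The CBOW/skip-gram contrast in the Word2Vec note can be illustrated by the training pairs each architecture extracts (a pure-Python sketch of pair generation only, not of the embedding training itself):

```python
def skipgram_pairs(tokens, window=1):
    """(target, context) pairs: skip-gram predicts each context word from the target."""
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=1):
    """(context_tuple, target) pairs: CBOW predicts the target from its context."""
    pairs = []
    for i, target in enumerate(tokens):
        ctx = tuple(tokens[j]
                    for j in range(max(0, i - window),
                                   min(len(tokens), i + window + 1))
                    if j != i)
        if ctx:
            pairs.append((ctx, target))
    return pairs
```

FastText would additionally decompose each target word into character n-grams before training on the same kind of pairs.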
projecting sentence representations into continuous spaces, using an RNN to extract the information of a sentence and feeding it into a multi-layer perceptron; importantly, training stops once the learning rate drops below the threshold of 10^-5
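The stopping rule above (halt once the learning rate falls below 10^-5) can be sketched as a training-loop skeleton; the decay schedule here is an arbitrary placeholder:

```python
def train_until_lr_threshold(initial_lr=0.1, decay=0.5, threshold=1e-5):
    """Toy training loop: decay the learning rate each epoch and stop
    once it falls below the threshold. Returns (epochs run, final lr)."""
    lr, epoch = initial_lr, 0
    while lr >= threshold:
        # ... one epoch of RNN-encoder + MLP training would go here ...
        lr *= decay
        epoch += 1
    return epoch, lr
```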
jargon-level explanation of the whole thesis paper
a few diagrams of how the algorithms work
It has limitations because of language nuances, and we need to curate large datasets for the training, validation, and test sets. Some open-ended questions involve subjective matters we have to take into account, and our deep learning algorithms need to adapt to varied circumstances. Therefore, we have to trial distinct algorithms for distinct cases.