- Input data is called training data and has a known label or result such as spam/not-spam or a stock price at a time.
- A model is prepared through a training process where it is required to make predictions and is corrected when those predictions are wrong. The training process continues until the model achieves a desired level of accuracy on the training data.
- Example problems are classification and regression.
- Example algorithms include Logistic Regression and the Back Propagation Neural Network.
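The train-predict-correct loop described above can be sketched with a tiny one-feature logistic regression. This is a minimal illustration, not a production implementation; the data, learning rate, and epoch count are all made up for the example:

```python
import math

def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Train a one-feature logistic regression with gradient descent:
    the model makes a prediction, is corrected by the error, and repeats."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # prediction
            err = p - y                               # correction signal
            w -= lr * err * x
            b -= lr * err
    return w, b

# Toy labelled training data: label is 1 when x > 2.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0, 0, 1, 1]
w, b = train_logistic(xs, ys)
predict = lambda x: 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5 else 0
```

Training continues for a fixed number of epochs here; in practice you would stop when accuracy on the training data reaches a desired level, as described above.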
- Input data is not labelled and does not have a known result.
- A model is prepared by deducing structures present in the input data. This may be to extract general rules, it may be through a mathematical process to systematically reduce redundancy, or it may be to organize data by similarity.
- Example problems are clustering, dimensionality reduction and association rule learning.
- Example algorithms include: the Apriori algorithm and k-Means.
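As a sketch of deducing structure from unlabelled data, here is the core counting step of an Apriori-style association rule pass: find item pairs that co-occur in at least a minimum number of transactions. The basket data and support threshold are illustrative:

```python
from itertools import combinations
from collections import Counter

def frequent_pairs(transactions, min_support=2):
    """First Apriori-style pass: count item pairs across transactions
    and keep those appearing at least `min_support` times."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            counts[pair] += 1
    return {p: c for p, c in counts.items() if c >= min_support}

baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
]
```

Note there are no labels anywhere: the structure (which items go together) is deduced entirely from the input data.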
Algorithms are often grouped by similarity in terms of their function (how they work). For example, tree-based methods, and neural network inspired methods.
Regression is concerned with modelling the relationship between variables, iteratively refined using a measure of error in the predictions made by the model.
- Ordinary Least Squares Regression (OLSR)
- Linear Regression
- Logistic Regression
- Stepwise Regression
- Multivariate Adaptive Regression Splines (MARS)
- Locally Estimated Scatterplot Smoothing (LOESS)
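As a minimal sketch of the simplest member of this family, Ordinary Least Squares on a single feature has a closed-form solution that minimises the squared prediction error (the toy data below lies exactly on a line, purely for illustration):

```python
def ols_fit(xs, ys):
    """Ordinary Least Squares for y = a + b*x, using the closed-form
    solution that minimises the sum of squared prediction errors."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Toy data lying exactly on y = 1 + 2x
a, b = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
```

Other methods in the list (e.g. stepwise regression, MARS) build on this idea with feature selection or piecewise fits.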
Instance-based learning methods model a decision problem with instances or examples of training data that are deemed important or required by the model.
- k-Nearest Neighbour (kNN)
- Learning Vector Quantization (LVQ)
- Self-Organizing Map (SOM)
- Locally Weighted Learning (LWL)
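k-Nearest Neighbour, the simplest of these, classifies a new point by majority vote among the stored training instances closest to it. A minimal sketch with made-up 2-D points:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest
    stored training instances (squared Euclidean distance)."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Two well-separated toy clusters
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
```

Note there is no training step at all: the instances themselves are the model, which is the defining trait of this family.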
Decision tree methods construct a model of decisions made based on actual values of attributes in the data.
- Classification and Regression Tree (CART)
- Iterative Dichotomiser 3 (ID3)
- C4.5 and C5.0 (different versions of a powerful approach)
- Chi-squared Automatic Interaction Detection (CHAID)
- Decision Stump
- M5
- Conditional Decision Trees
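The Decision Stump listed above is the smallest possible tree: a single split on one attribute value. Fitting one shows the core decision tree idea of choosing the split that best separates the labels (data below is illustrative):

```python
def fit_stump(xs, ys):
    """Fit a one-level decision tree (decision stump) on a 1-D feature:
    try every threshold and keep the one with fewest misclassifications."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            preds = [left if x <= t else right for x in xs]
            errors = sum(p != y for p, y in zip(preds, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

stump = fit_stump([1, 2, 3, 4, 5], [0, 0, 0, 1, 1])
```

Full tree methods such as CART apply this split search recursively to each resulting partition, using criteria like Gini impurity rather than raw error counts.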
Bayesian methods are those that explicitly apply Bayes’ Theorem for problems such as classification and regression.
- Naive Bayes
- Gaussian Naive Bayes
- Multinomial Naive Bayes
- Averaged One-Dependence Estimators (AODE)
- Bayesian Belief Network (BBN)
- Bayesian Network (BN)
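Naive Bayes makes the application of Bayes' Theorem concrete: it scores each class by its prior probability times the probability of each observed feature given that class, treating the features as independent. The sketch below uses word counts with Laplace (add-one) smoothing, an assumption added here to avoid zero probabilities; the documents are made up:

```python
from collections import Counter, defaultdict
import math

def train_nb(docs):
    """Naive Bayes text classifier: score(label) =
    log P(label) + sum of log P(word | label), with add-one smoothing."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for words, label in docs:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)

    def predict(words):
        best, best_score = None, None
        for label in label_counts:
            total = sum(word_counts[label].values())
            score = math.log(label_counts[label] / sum(label_counts.values()))
            for w in words:
                score += math.log((word_counts[label][w] + 1)
                                  / (total + len(vocab)))
            if best_score is None or score > best_score:
                best, best_score = label, score
        return best

    return predict

docs = [(["win", "money", "now"], "spam"),
        (["free", "money"], "spam"),
        (["meeting", "tomorrow"], "ham"),
        (["project", "meeting"], "ham")]
predict = train_nb(docs)
```

The log-space sum is the standard trick for the product of many small probabilities.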
Clustering, like regression, describes both the class of problem and the class of methods.
- k-Means
- k-Medians
- Expectation Maximisation (EM)
- Hierarchical Clustering
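k-Means alternates two steps until the centroids stop moving: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. A minimal 1-D sketch (the data, initialisation scheme, and fixed iteration count are illustrative simplifications):

```python
def kmeans_1d(points, k=2, iters=20):
    """Lloyd's algorithm on 1-D data: assign points to the nearest
    centroid, then move each centroid to its cluster mean."""
    # Crude initialisation: evenly spaced picks from the sorted data
    centroids = sorted(points)[::max(1, len(points) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)), key=lambda j: abs(p - centroids[j]))
            clusters[i].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)
```

Real implementations add convergence checks and multiple random restarts, since k-Means is sensitive to initialisation.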
Deep Learning methods are a modern update to Artificial Neural Networks that exploit abundant cheap computation.
They are concerned with building much larger and more complex neural networks, and as commented above, many methods are concerned with semi-supervised learning problems where large datasets contain very little labelled data.
- Deep Boltzmann Machine (DBM)
- Deep Belief Networks (DBN)
- Convolutional Neural Network (CNN)
- Stacked Auto-Encoders
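The "larger and more complex" part is mostly depth: each layer's output becomes the next layer's input. The forward pass below illustrates that stacking; the weights are fixed, untrained values chosen purely for demonstration, and real deep networks have many more layers and learn their weights by backpropagation:

```python
import math

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum, then tanh non-linearity."""
    return [math.tanh(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(x, layers):
    """A 'deep' network is layers stacked: each layer's output feeds
    the next. Weights here are hand-picked, not trained."""
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x

net = [([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.1]),  # hidden layer, 2 units
       ([[1.0, -1.0]], [0.0])]                   # output layer, 1 unit
out = forward([1.0, 2.0], net)
```

Architectures like CNNs and stacked auto-encoders vary the layer types and how they are wired, but this compose-layers structure is the common core.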
- C4.5 (decision tree)
- k-means (clustering)
- Support vector machines (alongside C4.5, one of the first classifiers to try)
- Apriori (association rule learning --> recommendation engine)
- EM (i.e. expectation-maximization for clustering)
- PageRank (network analysis; think of the PageRank in Google's search engine)
- AdaBoost (boosting, and thus an ensemble learning algorithm; it combines multiple weak learners into a stronger one)
- kNN (aka k-Nearest Neighbors, thus classification)
- Naive Bayes (family of classification algorithms assuming that all features are independent of each other given the class)
- CART (aka classification and regression trees, thus a classifier)
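Of the algorithms above, PageRank is the odd one out (network analysis rather than classification or clustering), so it is worth a sketch. The power-iteration form below repeatedly shares each page's rank equally among its outgoing links, mixed with a random-jump term; the three-page link graph is made up, and dangling pages (no out-links) are assumed away for simplicity:

```python
def pagerank(links, damping=0.85, iters=50):
    """Power-iteration PageRank: each page shares its rank equally
    among its out-links, plus a (1 - damping) random-jump term.
    Assumes every page has at least one out-link."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for p, outs in links.items():
            for q in outs:
                new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank

# Toy graph: "c" is linked to by both "a" and "b", so it ranks highest
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
```

The ranks form a probability distribution (they sum to 1), which is the "random surfer" interpretation used in Google's search engine.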