## Fundamental ML: The usual suspects
Student: Saurabh Mahindre
Shogun has a large codebase covering many sophisticated algorithms and multiple interfaces. The goal of this project was to improve the implementations of existing basic machine learning algorithms in terms of efficiency, performance and test coverage. Here I list my contributions as commits and PRs that improved various algorithms, along with some of the benchmarks.
My commits on shogun-develop:
Pull requests with improvements to algorithms:
- KMeans speedup
- Parallel KMeans
- KMeans++
- Distance speedup
- Parallel Crossvalidation
- Least Angle Regression
- Decision trees
- Random Forest
- Parallel Bagging Machine
- Multicore Liblinear
- KNN
- LSH - KNN
- Random Rotation ensembles (WIP)
Apart from this, I have also added cookbooks for some of these algorithms:
- Random Forest Classification
- Random Forest Regression
- Linear Discriminant analysis
- Sparse Approx. Gaussian Process
- Perceptron
- Large Margin Nearest Neighbors
- Least Angle Regression
- ID3
KMeans
Dataset | Shogun-old | Shogun-new | Shogun-new-multicore (3) |
---|---|---|---|
isolet | 7.390395 | 3.039222 | 1.545623 |
covtype | 63.604286 | 28.711537 | 19.381208 |
waveform | 0.012567 | 0.012547 | 0.017663 |
corel-histogram | 2.730833 | 2.179383 | 1.437981 |
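
To make the KMeans numbers easier to compare, the ratios between the old and new timings can be computed directly. This is a minimal sketch, not part of the benchmark code: it assumes the table values are runtimes where lower is better, and the names `kmeans_timings` and `speedups` are made up for illustration.

```python
# Timings copied from the KMeans table above.
# Assumption: they are runtimes, so a lower value is better.
# Tuple order: (Shogun-old, Shogun-new, Shogun-new-multicore (3))
kmeans_timings = {
    "isolet": (7.390395, 3.039222, 1.545623),
    "covtype": (63.604286, 28.711537, 19.381208),
    "waveform": (0.012567, 0.012547, 0.017663),
    "corel-histogram": (2.730833, 2.179383, 1.437981),
}


def speedups(old, new, multicore):
    """Return (old / new, old / multicore); a value above 1 means faster than old."""
    return old / new, old / multicore


kmeans_speedups = {name: speedups(*t) for name, t in kmeans_timings.items()}

for name, (single, multi) in kmeans_speedups.items():
    print(f"{name:16s} new: x{single:.2f}   new-multicore: x{multi:.2f}")
```

Under that assumption, a ratio below 1 marks a case where the new version is slower than the old one, as happens for the multicore run on the very small waveform timing.
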
Random Forest
Dataset | Shogun-old | Shogun-new |
---|---|---|
iris | 0.057174 | 0.018281 |
scene | 3.818921 | 1.650221 |
isolet | 13.025225 | 7.742546 |
mammography | 1.171553 | 1.113959 |
satellite | 1.426217 | 1.185141 |
Least Angle Regression - LASSO
Dataset | Shogun-old | Shogun-new |
---|---|---|
cosExp | 0.911573 | 0.754447 |
arcene | 0.652691 | 0.664856 |
madelon | 1.785826 | 1.298820 |
diabetes | 0.000569 | 0.094831 |
I will be adding more benchmark results. The benchmark code can be found on my fork: https://github.com/Saurabh7/benchmarks/tree/newbenchmarks. This will eventually be merged with https://github.com/zoq/benchmarks. I am finishing a blog/webpage with some details about the improvements and will share the link here soon.