GSOC'18 Final Report: Inside the Black Box
Name: Shubham Shukla
Project: Inside The Black Box
Mentors: Heiko Strathmann, Giovanni De Toni
Organization and Repository: Shogun Machine Learning Toolbox: https://github.com/shogun-toolbox/shogun
My commits to shogun/develop
My commits to shogun/data
Blog Posts and user guides: https://shubham808.github.io/blog/
Shogun is a large scale machine learning toolkit developed by many diffrent minds and ideas. This means we have a lot of opportunities to optimize what goes on under the hood and create something simple and impactful. This project focuses a bit more on Iterative Algorithms among other things. The premature stopping framework was improved to make it more robust and natural to use and modify, the progress bar was improved to make it more verbose along with implementing it in most iterative algorithms, We also worked on making algorithms respect the provided feature types by making them fully templated for a more generic behaviour.
This is a very brief overview of the project. For more details, guides, weekly updates and peer review please visit my blog.
Table of Contents
- StoppableSGObject class and progress bar
- Iterative Machine
- Feature type dispatching and generic nature
- Other Contributions and Ideas
StoppableSGObject class and progress bar
In shogun we have a
SignalHandler that gives some control of what happens in case of an event, like premature cancellation (
CTRL+C), back to the user for some algorithms. We take all the premature stopping code and make it accessible to non
CMachine types as well by placing it in a new
CStoppableSGObject class. My mentor also introduced
m_callback data member which can accept a lambda function that can serve as a way to fire a cancel computation signal.
The class is implemented in PR4280. Other updates include
- #4286: replacing cancel_computation calls with macro
- #4287: enable premature stopping in all machines
- #4291: use case for
We added a default prefix for
class_name::method_name to the progress bar. This slightly changes the usage which is now done using
SG_PROGRESS macro instead of
progress method. We implemented it in most iterative algorithms. This helped us prepare a list of iterative algorithms
PR: #4305: The new macro and its usage
Future Work: Finding more use cases for it and using it to extend the list of iterative algorithms here.
Previously an algorithm could define what happens when a training process is cancelled or paused with the help of methods like
on_pause etc. This is not flexible for a user behind an interface like Python. We use the concept of mixins for the first time in shogun to write a new
CIterativeMachine class which allows the user to cancel training anytime, return the control to user where he can execute some more code, and then resume it later if needed. The pre-trained model remains usable and concurrent. This is done by making sure the model updates its
state in every iteration.
The mixin is implemented in PR4335. Related PRs are:
- #4320: update the
stateof Perceptron in every iteration
- PR4335: this also includes the implementation of
- Porting all algorithms from the List of Iterative Algorithms wiki page to this code style
- Systematic tests for all Iterative machines that ensure proper state update and concurrency.
Feature type dispatching and generic nature
There is an implicit assumption in most algorithms that the provided feature type will be 64 bit dense. To introduce more generic behaviour in an automated way we have written some new classes. These are all mixins that use the curiously recursive template pattern. This means algorithms will inherit from themselves and the orignal base class. The concept is new to shogun and brings a lot more possibilities for similar ideas. The idea is to dispatch feature types from base class and then have subclasses implement a templated version of
- #4373: This implements two mixin + crtp classes to dispatch dense and string feature types. We also implement dense featue types in
CLeastAngleRegression. There are also unit tests for training a machine will a list of feature types.
- #4389: a meta example for dense dispatcher
- Add more dispatchers with tests along with implementations.
- An automated way to create a new dispatcher class or maybe a workaround so that we don’t need to have as many dispatchers as the number of feature types.
Other Contributions and Ideas:
Observer and put
Shogun has an API to set values of member parameters that are registered by algorithms. This API can be used together with parameter observers to record summaries of change in member variables. This can be done using a new
put_observe method that will inform the parameter observers on anything that is being updated. This comes with a design change of using
put to change member data in code instead of using assignment. To benchmark the overhead we will gain over direct assignment we have written a benchmark in #4342
- Prototyping and benchmarking of
- Infering type using
Anyinstead of doing so in
Systematic tests for Iterative machine
Writing seperate tests in multiple classes that aim to do the same thing is redundant. We can use
TYPED_TESTS instead to do it for a number of classes. This has a few issues but will be possible in the future. An idea of how to do it is #4327 for serialization tests.
LibASTParserto provide more information on base classes
- Making a general dataset for proper testing
An idea of how these tests should look like is in
The implementation of
CNewtonSVM was outdated. We refactored it to use
SGVectors/SGMatrix for data storage and using
linalg API for computations. We also ported it to the new IterativeMachine code style.
PRs: We refactored most of the code in #4347. We added a new method to calculate pseudo inverse for matrices with two implementations in linalg #4356
Meta examples and cookbook contributions
These are some contributions to meta examples and cookbooks
- Neural Network Factory: Adding the option to
auto_initializethe neural network along with a new
layerfactory to create new layers. The example from python looks much more intuitive now.
PR: #4386 contains the factory example along with a cookbook for training a Convolutional Neural Network on a dataset for mnist images of 0, 1, 2 in shogun. The corresponding dataset is #165
- #4346: NewtonSVM meta example
- #4340: Diffusion meta cookbook and example
- #4310: porting
LeastAngleRegressionto new API
- #4297: porting
KernelRidgeRegressionmeta example to new API
- Deleting CLabelsFactory: The
CLabelsFactoryperformed static casts which were not needed anymore so we deleted it and used
PRs: #4281, #4277
- #4278: A few meta examples on using distance machines from factory
- #4236: factory methods in lda meta example
Parallel computation of
sample in CLogDetEstimator: #4235 We used openMP to make the code parallel along with refactoring it for efficient memory usage