mmalohlava/karel_ha_assignment.md

## karel_ha_assignment.md

      
Assignment: Improve H2O PCA

The goal of this assignment is to:

- Get familiar with the H2O stack
- Make an improvement in H2O

Details

H2O provides an implementation of the PCA algorithm that depends on the Jama library.
The library is used for several tasks, including Singular Value Decomposition (SVD).
However, the library also delivers sub-optimal performance.
The idea is to replace the Jama SVD computation with one based on the netlib-java library
and to measure the performance impact.
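
One practical detail when swapping the two libraries is data layout. Jama's `Matrix` wraps a `double[][]` indexed as `[row][column]`, whereas LAPACK routines such as `dgesvd` (which netlib-java exposes) expect a flat array in column-major order. The following is a minimal sketch of that conversion; the class and method names (`ColumnMajor`, `pack`, `unpack`) are hypothetical and not part of H2O, Jama, or netlib-java.

```java
/** Illustrative helper (hypothetical, not part of H2O): layout conversion for LAPACK-style calls. */
final class ColumnMajor {
    private ColumnMajor() {}

    /** Packs a rectangular matrix a[row][col] into a flat column-major array. */
    static double[] pack(double[][] a) {
        int rows = a.length;
        int cols = rows == 0 ? 0 : a[0].length;
        double[] flat = new double[rows * cols];
        for (int r = 0; r < rows; r++) {
            if (a[r].length != cols) {
                throw new IllegalArgumentException("matrix must be rectangular");
            }
            for (int c = 0; c < cols; c++) {
                // Element (r, c) sits at offset r + c * lda, with lda == rows.
                flat[r + c * rows] = a[r][c];
            }
        }
        return flat;
    }

    /** Unpacks a flat column-major array into a rows x cols matrix indexed [row][col]. */
    static double[][] unpack(double[] flat, int rows, int cols) {
        if (flat.length != rows * cols) {
            throw new IllegalArgumentException("length does not match rows * cols");
        }
        double[][] a = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                a[r][c] = flat[r + c * rows];
            }
        }
        return a;
    }
}
```

For example, `ColumnMajor.pack(new double[][] {{1, 2, 3}, {4, 5, 6}})` yields `{1, 4, 2, 5, 3, 6}`. Note that LAPACK-style SVD routines typically overwrite their input array, so it is safer to pass a packed copy rather than data that is needed afterwards.
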
Your task is to:

1. Clone the development version of H2O from GitHub: https://github.com/h2oai/h2o-3
2. Build H2O.
3. Explore the PCA implementation and how it is used, for example, in the JUnit tests.
4. Replace the use of Jama SVD with netlib-java.
5. Measure the impact of the change with one or more single-node micro-benchmarks.
6. Create a pull request with your change.
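
For the measurement step, a single-node micro-benchmark can be as simple as timing the same workload before and after the change. The sketch below is one possible shape, assuming plain `System.nanoTime()` timing with warm-up runs so the JIT compiler has a chance to optimize the code first; `MicroBench` and `medianNanos` are hypothetical names, and a dedicated harness such as JMH would give more rigorous numbers.

```java
import java.util.Arrays;

/** Illustrative timing helper (hypothetical, not part of H2O). */
final class MicroBench {
    private MicroBench() {}

    /**
     * Runs the task `warmups` times untimed, then `runs` times timed,
     * and returns the median wall-clock time of the timed runs in nanoseconds.
     */
    static long medianNanos(Runnable task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            task.run(); // untimed: lets the JIT compile the hot path
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2]; // upper median when runs is even
    }
}
```

A comparison would call `medianNanos` once with a task that runs the Jama-based SVD and once with a task that runs the netlib-java-based SVD on the same input matrix. The median is used here because it is less sensitive than the mean to occasional pauses such as garbage collection.
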

Hints


- IntelliJ IDEA is a great tool for Java development.
- You can build only the Java part of H2O by invoking `./gradlew :h2o-assemblies:main:build`
- JUnit tests are a great source of information.
- If you get stuck, please do not be afraid to contact us.
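
When checking the new SVD against the old one in a test, keep in mind that singular values should agree within a small tolerance, but singular vectors are only determined up to sign (and, for repeated singular values, only up to a rotation within the corresponding subspace). The sketch below shows a tolerance-based comparison that accepts a sign flip; `SvdCheck` and its methods are hypothetical helpers, not existing H2O test utilities.

```java
/** Illustrative comparison helpers (hypothetical, not part of H2O). */
final class SvdCheck {
    private SvdCheck() {}

    /** True if both arrays have the same length and match element-wise within tol. */
    static boolean closeTo(double[] expected, double[] actual, double tol) {
        if (expected.length != actual.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (Math.abs(expected[i] - actual[i]) > tol) {
                return false;
            }
        }
        return true;
    }

    /** True if actual matches expected or its negation within tol. */
    static boolean sameUpToSign(double[] expected, double[] actual, double tol) {
        if (expected.length != actual.length) {
            return false;
        }
        boolean same = true;
        boolean flipped = true;
        for (int i = 0; i < expected.length; i++) {
            if (Math.abs(expected[i] - actual[i]) > tol) {
                same = false;
            }
            if (Math.abs(expected[i] + actual[i]) > tol) {
                flipped = false;
            }
        }
        return same || flipped;
    }
}
```

In such a test, `closeTo` would be applied to the two arrays of singular values, and `sameUpToSign` to each pair of corresponding singular vectors.
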

Evaluation criteria


- Does it work? Can I launch H2O and compute PCA?
- Was performance measured?
- How good is the implementation?

Enjoy!