<link rel="import" href="../paper-button/paper-button.html">
<link rel="import" href="../paper-checkbox/paper-checkbox.html">
<link rel="import" href="../paper-input/paper-input.html">
<link rel="import" href="../core-icons/core-icons.html">
<link rel="import" href="../paper-icon-button/paper-icon-button.html">
<link rel="import" href="../paper-item/paper-item.html">
<link rel="import" href="../paper-tabs/paper-tabs.html">
<link rel="import" href="../paper-tabs/paper-tab.html">
<link rel="import" href="../paper-radio-group/paper-radio-group.html">
<link rel="import" href="../paper-radio-button/paper-radio-button.html">
<link rel="import" href="../paper-progress/paper-progress.html">
<link rel="import" href="../paper-slider/paper-slider.html">
<polymer-element name="my-element">
<template>
@zaleslaw
zaleslaw / sample_jdbc.java
Last active November 11, 2016 17:32
JDBC intro
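import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public static void main(String[] args) throws SQLException {
    // try-with-resources closes the connection when the block exits.
    try (Connection connection = getConnection(DB_NAME)) {
        System.out.println(connection.getCatalog());
    }
}
// Opens a connection; DB_NAME, URL, USER_NAME and PASSWORD are constants expected in the enclosing class.
private static Connection getConnection(String databaseName) throws SQLException {
    return DriverManager.getConnection(URL + databaseName, USER_NAME, PASSWORD);
}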
zaleslaw / Manual hyperparamter tuning
Created October 19, 2020 13:43
Manual hyperparameter tuning
// Tune hyper-parameters with K-fold Cross-Validation on the split training set.
int[] pSet = new int[] {1, 2};
int[] maxDeepSet = new int[] {1, 2, 3, 4, 5, 10, 20};
int bestP = 1;
int bestMaxDeep = 1;
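// Start below any possible score; Double.MIN_VALUE is the smallest positive double, not the most negative.
double avg = Double.NEGATIVE_INFINITY;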
for (int p : pSet) {
for (int maxDeep : maxDeepSet) {
zaleslaw / manual_tuning.java
Created October 19, 2020 13:45
Manual hyperparameter tuning in Apache Ignite ML
// Tune hyper-parameters with K-fold Cross-Validation on the split training set.
int[] pSet = new int[] {1, 2};
int[] maxDeepSet = new int[] {1, 2, 3, 4, 5, 10, 20};
int bestP = 1;
int bestMaxDeep = 1;
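// Start below any possible score; Double.MIN_VALUE is the smallest positive double, not the most negative.
double avg = Double.NEGATIVE_INFINITY;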
for (int p : pSet) {
for (int maxDeep : maxDeepSet) {
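The preview stops at the loop headers. As a rough, library-free sketch of the same manual grid-search pattern (this is not the Ignite ML API), the loop body can be pictured as: score each (p, maxDeep) pair by cross-validation, average the fold scores, and keep the best pair. `crossValidate` here is a hypothetical stand-in that returns made-up toy scores, and the class and method names are assumptions for illustration only.

```java
import java.util.Arrays;

public class ManualGridSearchSketch {
    /** Hypothetical stand-in for a K-fold cross-validation run; returns one score per fold. */
    static double[] crossValidate(int p, int maxDeep) {
        // Toy formula, NOT real model scores: highest at p = 2, maxDeep = 5.
        double base = 0.9 - 0.01 * Math.abs(maxDeep - 5) - 0.02 * Math.abs(p - 2);
        return new double[] {base - 0.01, base, base + 0.01};
    }

    /** Returns {bestP, bestMaxDeep}: the pair with the highest mean fold score. */
    static int[] search(int[] pSet, int[] maxDeepSet) {
        int bestP = pSet[0];
        int bestMaxDeep = maxDeepSet[0];
        // NEGATIVE_INFINITY guarantees that the first evaluated pair replaces the initial value.
        double bestAvg = Double.NEGATIVE_INFINITY;
        for (int p : pSet) {
            for (int maxDeep : maxDeepSet) {
                double avg = Arrays.stream(crossValidate(p, maxDeep)).average().orElse(Double.NaN);
                if (avg > bestAvg) {
                    bestAvg = avg;
                    bestP = p;
                    bestMaxDeep = maxDeep;
                }
            }
        }
        return new int[] {bestP, bestMaxDeep};
    }

    public static void main(String[] args) {
        int[] best = search(new int[] {1, 2}, new int[] {1, 2, 3, 4, 5, 10, 20});
        System.out.println("best p = " + best[0] + ", best maxDeep = " + best[1]);
    }
}
```

In a real run the toy scoring function would be replaced by training and evaluating the pipeline on each fold.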
zaleslaw / brute_force_parallelism.java
Created October 19, 2020 14:45
brute_force_parallelism
CrossValidation<DecisionTreeNode, Integer, Vector> scoreCalculator
    = new CrossValidation<>();
ParamGrid paramGrid = new ParamGrid()
    .withParameterSearchStrategy(new BruteForceStrategy())
    .addHyperParam("p", normalizationTrainer::withP, new Double[] {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0})
    .addHyperParam("maxDeep", trainerCV::withMaxDeep, new Double[] {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0})
    .addHyperParam("minImpurityDecrease", trainerCV::withMinImpurityDecrease, new Double[] {0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0});
scoreCalculator
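To get a feel for how much work a brute-force search over this grid implies, the following library-free sketch enumerates the Cartesian product of the three value lists shown above. It does not use or reproduce Ignite's `ParamGrid` or `BruteForceStrategy`; the class and method names are hypothetical. Because each combination is independent of the others, the resulting list is the kind of workload that can be evaluated in parallel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BruteForceGridSketch {
    /** The three value lists from the grid above: p, maxDeep, minImpurityDecrease. */
    static double[][] exampleGrid() {
        return new double[][] {
            {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
            {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
            {0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}
        };
    }

    /** Enumerates every combination (Cartesian product) of the given value lists. */
    static List<double[]> enumerate(double[][] grid) {
        List<double[]> combos = new ArrayList<>();
        combos.add(new double[0]); // start with one empty combination
        for (double[] values : grid) {
            List<double[]> next = new ArrayList<>();
            for (double[] prefix : combos) {
                for (double v : values) {
                    // Extend each partial combination with one value of the current parameter.
                    double[] combo = Arrays.copyOf(prefix, prefix.length + 1);
                    combo[prefix.length] = v;
                    next.add(combo);
                }
            }
            combos = next;
        }
        return combos;
    }

    public static void main(String[] args) {
        List<double[]> combos = enumerate(exampleGrid());
        System.out.println(combos.size() + " combinations");
        System.out.println("first: " + Arrays.toString(combos.get(0)));
    }
}
```

With 10, 10 and 11 candidate values the grid holds 10 × 10 × 11 = 1100 combinations, each of which costs one full cross-validation run.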
Topic, Feature, Ignite ML 2.9, Spark ML 3.0, Note, Specific to Spark
Classification, Binomial logistic regression, yes, yes, ,
Classification, Decision tree classifier, yes, yes, ,
Classification, Linear SVM, yes, yes, ,
Classification, Random Forest classifier, yes, yes, ,
Classification, Gradient-boosted tree classifier, yes, yes, ,
Classification, Multilayer perceptron classifier, yes, yes, ,
Classification, KNN, yes, no, ,
Classification, Weighted KNN, yes, no, ,
Classification, Indexed KNN, yes, no, ,
zaleslaw / RandomForest.csv
Last active November 27, 2020 14:30
Random Forest
Number of rows, Training time (ms), Training time (minutes)
1000, 5709, 0.09
10000, 19862, 0.3
100000, 93825, 1.5
300000, 202718, 3.37
1000000, 694007, 11.5
2000000, 1544073, 25.7
5000000, 4155318, 69
10000000, 8433875, 140 (2 h 20 min)
zaleslaw / LSQR.csv
Created November 27, 2020 14:34
Apache Ignite ML, Performance report for Linear Regression with LSQR
Number of rows, Number of features, Training time (ms), Training time (minutes), Throughput (ms/row)
10000, 10, 3611, 0.06, 0.3611
10000, 100, 4910, 0.08, 0.491
10000, 1000, 9137, 0.15, 0.9137
100000, 10, 3221, 0.05, 0.03221
100000, 100, 3615, 0.06, 0.03615
100000, 1000, 6771, 0.11, 0.06771
1000000, 10, 3174, 0.05, 0.003174
1000000, 100, 3942, 0.07, 0.003942
1000000, 1000, 13878, 0.23, 0.013878
zaleslaw / CSV.csv
Created November 27, 2020 14:39
Apache Ignite ML, Performance report for Linear Regression with SGD
Number of rows, Number of features, Training time (ms), Training time (minutes), Throughput (ms/row)
1000, 10, 3366, 0.06, 3.366
1000, 100, 4002, 0.07, 4.002
1000, 1000, 14877, 0.25, 14.877
10000, 10, 2894, 0.05, 0.2894
10000, 100, 5718, 0.10, 0.5718
10000, 1000, 21858, 0.36, 2.1858
100000, 10, 10893, 0.18, 0.10893
100000, 100, 12733, 0.21, 0.12733
100000, 1000, 87573, 1.46, 0.87573
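The two derived columns in these performance reports follow directly from the raw measurements: minutes are the training time in milliseconds divided by 60,000, and the column labelled throughput is the training time divided by the number of rows (so lower values are better). A minimal sketch of that arithmetic, with hypothetical class and method names, using the last SGD row as input:

```java
public class TrainingTimeColumns {
    /** Converts a training time in milliseconds to minutes. */
    static double minutes(long trainingMs) {
        return trainingMs / 60_000.0;
    }

    /** Average training time spent per row, in milliseconds. */
    static double msPerRow(long trainingMs, long rows) {
        return (double) trainingMs / rows;
    }

    public static void main(String[] args) {
        long rows = 100_000;
        long trainingMs = 87_573;
        System.out.printf("minutes = %.2f, ms/row = %.5f%n",
            minutes(trainingMs), msPerRow(trainingMs, rows));
    }
}
```

Recomputing the derived columns this way is a quick consistency check for any row of the reports.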