@sile
Last active Jan 21, 2020

The aim of this benchmark is to compare the performance of Optuna's pruners (i.e., NopPruner, MedianPruner, SuccessiveHalvingPruner, and the HyperbandPruner that is still under development). All pruners were run with their default settings.

The commands to run this benchmark:

// (1) Downloads `kurobako` (BBO benchmark tool) binary.
$ curl -L https://github.com/sile/kurobako/releases/download/0.1.3/kurobako-0.1.3.linux-amd64 -o kurobako
$ chmod +x kurobako && sudo mv kurobako /usr/local/bin/

// (2) Downloads data files of HPOBench. (Note that the total size is over 700MB.)
$ curl -OL http://ml4aad.org/wp-content/uploads/2019/01/fcnet_tabular_benchmarks.tar.gz
$ tar xf fcnet_tabular_benchmarks.tar.gz && cd fcnet_tabular_benchmarks/

// (3) Generates problem recipes (there are four datasets).
//     Please see https://github.com/automl/nas_benchmarks for details on HPOBench.
$ kurobako problem-suite hpobench fcnet . > problems.json

// (4) Generates solver recipes.
$ kurobako solver --name 'TPE' optuna --pruner nop > solvers.json
$ kurobako solver --name 'SuccessiveHalving with TPE' optuna --pruner asha >> solvers.json
$ kurobako solver --name 'MedianPruner with TPE' optuna --pruner median >> solvers.json
$ kurobako solver --name 'Hyperband with TPE' command -- python hyperband-solver.py >> solvers.json

// (5) Runs benchmark.
$ kurobako studies --solvers $(cat solvers.json) --problems $(cat problems.json) --repeats 30 --budget 50 | kurobako run --parallelism 10 > result.json
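
As a rough sanity check on the size of this run, the sketch below counts the studies and runs implied by step (5). It assumes that `kurobako studies` emits one study per solver/problem pair, which is consistent with the 16 study IDs listed at the end of the report; the problem names are taken from the report, not from `problems.json`.

```python
from itertools import product

# Solver names from step (4) and problem names from the report.
solvers = [
    "TPE",
    "SuccessiveHalving with TPE",
    "MedianPruner with TPE",
    "Hyperband with TPE",
]
problems = [
    "HPO-Bench-Naval",
    "HPO-Bench-Parkinson",
    "HPO-Bench-Protein",
    "HPO-Bench-Slice",
]
repeats = 30  # value of --repeats in step (5)

# Assumption: one study per (solver, problem) combination.
studies = list(product(solvers, problems))
total_runs = len(studies) * repeats

print(len(studies), total_runs)  # 16 480
```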

Benchmark results (the following images were generated by `$ cat result.json | kurobako plot curve`):

[Image: hpo-bench-naval-880b007e3c61f4aecb7e9b0aa2f9be5fea9f491a076853f68f402769aa254034]
[Image: hpo-bench-parkinson-445bfa45fdbb8ec6ae6d4dba1909114f3948fa67b47209258db9291480b405b5]
[Image: hpo-bench-protein-add73d4788d7900b34988a8b91cde43e820cac99f9e354e1e71b0ea0be3ef4a6]
[Image: hpo-bench-slice-d88e1704447639bde17f236f7af47f93274d1f02bc8ec66733146ff9cdf50196]

A detailed, text-based report is included below (it was produced by the command `$ cat result.json | kurobako report`).

The following table is the summary of the ranking (quoted from the report):

| Solver | Borda | Firsts |
|:---|---:|---:|
| Hyperband with TPE | 9 | 1 |
| MedianPruner with TPE | 7 | 1 |
| SuccessiveHalving with TPE | 4 | 1 |
| TPE | 4 | 1 |

The HyperbandPruner has the highest Borda count, which suggests that it may be a suitable choice for the default pruner.

# hyperband-solver.py: the solver script invoked for 'Hyperband with TPE' in step (4).
import argparse

import optuna
from kurobako import solver
from kurobako.solver.optuna import OptunaSolverFactory

parser = argparse.ArgumentParser()
parser.add_argument('--min-resource', type=int, default=1)
parser.add_argument('--reduction-factor', type=int, default=3)
parser.add_argument('--min-early-stopping-rate-low', type=int, default=0)
parser.add_argument('--min-early-stopping-rate-high', type=int, default=4)
parser.add_argument('--loglevel', choices=['debug', 'info', 'warning', 'error'], default='error')
parser.add_argument('--sampler', choices=['tpe', 'random'], default='tpe')
args = parser.parse_args()

# Map the --loglevel choice onto the corresponding Optuna verbosity level.
if args.loglevel == 'debug':
    optuna.logging.set_verbosity(optuna.logging.DEBUG)
elif args.loglevel == 'info':
    optuna.logging.set_verbosity(optuna.logging.INFO)
elif args.loglevel == 'warning':
    optuna.logging.set_verbosity(optuna.logging.WARNING)
elif args.loglevel == 'error':
    optuna.logging.set_verbosity(optuna.logging.ERROR)

# These keyword arguments follow the in-development HyperbandPruner used for
# this benchmark; a released Optuna version may expose a different signature.
pruner = optuna.pruners.HyperbandPruner(
    min_resource=args.min_resource,
    reduction_factor=args.reduction_factor,
    min_early_stopping_rate_low=args.min_early_stopping_rate_low,
    min_early_stopping_rate_high=args.min_early_stopping_rate_high,
)

if args.sampler == 'tpe':
    sampler = optuna.samplers.TPESampler()
elif args.sampler == 'random':
    sampler = optuna.samplers.RandomSampler()


def create_study(seed):
    # Note: `seed` is not used here; every study shares the module-level
    # pruner and sampler created above.
    return optuna.create_study(pruner=pruner, sampler=sampler)


if __name__ == '__main__':
    factory = OptunaSolverFactory(create_study)
    runner = solver.SolverRunner(factory)
    runner.run()
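
To give a feel for what the default arguments above mean, here is a minimal sketch of the pruning checkpoints ("rungs") they imply. It assumes that the in-development HyperbandPruner runs one SuccessiveHalving bracket per `min_early_stopping_rate` value between the low and high bounds, and that rung `i` of a bracket is reached at step `min_resource * reduction_factor ** (min_early_stopping_rate + i)`, as in Optuna's SuccessiveHalvingPruner. The helper names `rung_steps` and `hyperband_brackets` are hypothetical, and `max_step=100` comes from the `"steps": 100` field of the problem specifications in the report.

```python
def rung_steps(min_resource, reduction_factor, min_early_stopping_rate, max_step):
    """Steps at which one SuccessiveHalving bracket would check for pruning."""
    steps = []
    rung = 0
    while True:
        step = min_resource * reduction_factor ** (min_early_stopping_rate + rung)
        if step > max_step:
            break
        steps.append(step)
        rung += 1
    return steps


def hyperband_brackets(min_resource=1, reduction_factor=3,
                       rate_low=0, rate_high=4, max_step=100):
    """Assumption: one bracket per early-stopping rate in [rate_low, rate_high]."""
    return {
        rate: rung_steps(min_resource, reduction_factor, rate, max_step)
        for rate in range(rate_low, rate_high + 1)
    }


brackets = hyperband_brackets()
for rate, steps in brackets.items():
    print(rate, steps)
# 0 [1, 3, 9, 27, 81]
# 1 [3, 9, 27, 81]
# 2 [9, 27, 81]
# 3 [27, 81]
# 4 [81]
```

Under these assumptions, brackets with a higher early-stopping rate prune less aggressively: the last bracket only checks once, at step 81.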

Benchmark Result Report

  • Report ID: ab49fdfa2cf62ec133d6940711b5352308dc18334ffd5f255faee51ef905520c
  • Kurobako Version: 0.1.3
  • Number of Solvers: 4
  • Number of Problems: 4
  • Metrics Precedence: best value -> AUC -> elapsed time

Please refer to "A Strategy for Ranking Optimizers using Multiple Criteria" for the ranking strategy used in this report.

Table of Contents

  1. Overall Results
  2. Individual Results
  3. Solvers
  4. Problems
  5. Studies

Overall Results

| Solver | Borda | Firsts |
|:---|---:|---:|
| Hyperband with TPE | 9 | 1 |
| MedianPruner with TPE | 7 | 1 |
| SuccessiveHalving with TPE | 4 | 1 |
| TPE | 4 | 1 |

Individual Results

(1) Problem: HPO-Bench-Parkinson

| Ranking | Solver | Best (avg +- sd) | AUC (avg +- sd) | Elapsed (avg +- sd) |
|:---|:---|:---|:---|:---|
| 1 | MedianPruner with TPE (study) | 0.010319 +- 0.002872 | 0.646 +- 0.232 | 289.338 +- 60.135 |
| 2 | Hyperband with TPE (study) | 0.011403 +- 0.004289 | 0.731 +- 0.240 | 120.106 +- 12.651 |
| 3 | SuccessiveHalving with TPE (study) | 0.014822 +- 0.008332 | 0.712 +- 0.350 | 249.325 +- 131.821 |
| 4 | TPE (study) | 0.012665 +- 0.005501 | 0.955 +- 0.342 | 5.471 +- 0.440 |

(2) Problem: HPO-Bench-Naval

| Ranking | Solver | Best (avg +- sd) | AUC (avg +- sd) | Elapsed (avg +- sd) |
|:---|:---|:---|:---|:---|
| 1 | TPE (study) | 0.000130 +- 0.000224 | 0.029 +- 0.039 | 5.399 +- 0.376 |
| 2 | Hyperband with TPE (study) | 0.000095 +- 0.000115 | 0.013 +- 0.013 | 126.993 +- 14.142 |
| 3 | MedianPruner with TPE (study) | 0.000119 +- 0.000175 | 0.013 +- 0.013 | 296.964 +- 73.482 |
| 4 | SuccessiveHalving with TPE (study) | 0.000644 +- 0.001719 | 0.039 +- 0.074 | 522.541 +- 442.635 |

(3) Problem: HPO-Bench-Protein

| Ranking | Solver | Best (avg +- sd) | AUC (avg +- sd) | Elapsed (avg +- sd) |
|:---|:---|:---|:---|:---|
| 1 | SuccessiveHalving with TPE (study) | 0.233401 +- 0.007767 | 9.703 +- 0.336 | 222.130 +- 234.573 |
| 2 | Hyperband with TPE (study) | 0.233522 +- 0.010821 | 10.002 +- 0.538 | 111.348 +- 11.700 |
| 3 | MedianPruner with TPE (study) | 0.235170 +- 0.013301 | 10.280 +- 0.630 | 250.531 +- 40.316 |
| 4 | TPE (study) | 0.241203 +- 0.015100 | 10.554 +- 0.756 | 5.424 +- 0.400 |

(4) Problem: HPO-Bench-Slice

| Ranking | Solver | Best (avg +- sd) | AUC (avg +- sd) | Elapsed (avg +- sd) |
|:---|:---|:---|:---|:---|
| 1 | Hyperband with TPE (study) | 0.000350 +- 0.000236 | 0.020 +- 0.011 | 119.939 +- 15.240 |
| 2 | MedianPruner with TPE (study) | 0.000319 +- 0.000088 | 0.018 +- 0.008 | 292.157 +- 84.493 |
| 3 | TPE (study) | 0.000308 +- 0.000088 | 0.030 +- 0.024 | 5.391 +- 0.343 |
| 4 | SuccessiveHalving with TPE (study) | 0.000510 +- 0.000351 | 0.022 +- 0.014 | 532.892 +- 674.634 |
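
The Borda and Firsts columns of the Overall Results table can be reproduced from the four per-problem rankings above. The sketch below assumes that a solver's Borda count is the number of solvers ranked below it, summed over all problems, and that there are no tied ranks; this is an interpretation that happens to match the reported numbers, not a statement of how kurobako computes them. The function name `summarize` is hypothetical.

```python
# Per-problem rankings copied from the Individual Results tables (best first).
rankings = {
    "HPO-Bench-Parkinson": ["MedianPruner with TPE", "Hyperband with TPE",
                            "SuccessiveHalving with TPE", "TPE"],
    "HPO-Bench-Naval": ["TPE", "Hyperband with TPE",
                        "MedianPruner with TPE", "SuccessiveHalving with TPE"],
    "HPO-Bench-Protein": ["SuccessiveHalving with TPE", "Hyperband with TPE",
                          "MedianPruner with TPE", "TPE"],
    "HPO-Bench-Slice": ["Hyperband with TPE", "MedianPruner with TPE",
                        "TPE", "SuccessiveHalving with TPE"],
}


def summarize(rankings):
    """Return (borda, firsts) dictionaries keyed by solver name."""
    borda, firsts = {}, {}
    for ordered in rankings.values():
        n = len(ordered)
        for position, name in enumerate(ordered):
            # Assumption: Borda points = number of solvers ranked below this one.
            borda[name] = borda.get(name, 0) + (n - 1 - position)
            firsts[name] = firsts.get(name, 0) + (1 if position == 0 else 0)
    return borda, firsts


borda, firsts = summarize(rankings)
for name in sorted(borda, key=lambda s: (-borda[s], s)):
    print(name, borda[name], firsts[name])
# Hyperband with TPE 9 1
# MedianPruner with TPE 7 1
# SuccessiveHalving with TPE 4 1
# TPE 4 1
```

Hyperband never ranks below second place, which is why it leads on Borda count even though each solver wins exactly one problem.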

Solvers

ID: 5c2d0f6ef4e91f85ff31aadd7bf531744cb1ac80283ab4372fc8498540e479c5

recipe:

{
  "name": "Hyperband with TPE",
  "command": {
    "path": "python",
    "args": [
      "hyperband-solver.py",
      "--sampler",
      "tpe"
    ]
  }
}

specification:

{
  "name": "Hyperband with TPE",
  "attrs": {
    "github": "https://github.com/optuna/optuna",
    "paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
    "version": "optuna=0.19.0, kurobako-py=0.1.1"
  },
  "capabilities": [
    "UNIFORM_CONTINUOUS",
    "UNIFORM_DISCRETE",
    "LOG_UNIFORM_CONTINUOUS",
    "CATEGORICAL",
    "CONDITIONAL",
    "CONCURRENT"
  ]
}

ID: ff6afef357c6e481d59cf7c916f6d1d35763ac00803b2a0e838d5cb33b51f543

recipe:

{
  "name": "MedianPruner with TPE",
  "optuna": {}
}

specification:

{
  "name": "MedianPruner with TPE",
  "attrs": {
    "github": "https://github.com/optuna/optuna",
    "paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
    "version": "optuna=0.19.0, kurobako-py=0.1.1"
  },
  "capabilities": [
    "UNIFORM_CONTINUOUS",
    "UNIFORM_DISCRETE",
    "LOG_UNIFORM_CONTINUOUS",
    "CATEGORICAL",
    "CONDITIONAL",
    "CONCURRENT"
  ]
}

ID: 507a3f8cf732d538feba4e2cdd6cf17d2b9f90a256810849929a7215ab592672

recipe:

{
  "name": "SuccessiveHalving with TPE",
  "optuna": {
    "pruner": "asha"
  }
}

specification:

{
  "name": "SuccessiveHalving with TPE",
  "attrs": {
    "github": "https://github.com/optuna/optuna",
    "paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
    "version": "optuna=0.19.0, kurobako-py=0.1.1"
  },
  "capabilities": [
    "UNIFORM_CONTINUOUS",
    "UNIFORM_DISCRETE",
    "LOG_UNIFORM_CONTINUOUS",
    "CATEGORICAL",
    "CONDITIONAL",
    "CONCURRENT"
  ]
}

ID: c704b8debecaeee2d9b06549c5e3db2815760072fff200244dea9764ec623ea8

recipe:

{
  "name": "TPE",
  "optuna": {
    "pruner": "nop"
  }
}

specification:

{
  "name": "TPE",
  "attrs": {
    "github": "https://github.com/optuna/optuna",
    "paper": "Akiba, Takuya, et al. \"Optuna: A next-generation hyperparameter optimization framework.\" Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019.",
    "version": "optuna=0.19.0, kurobako-py=0.1.1"
  },
  "capabilities": [
    "UNIFORM_CONTINUOUS",
    "UNIFORM_DISCRETE",
    "LOG_UNIFORM_CONTINUOUS",
    "CATEGORICAL",
    "CONDITIONAL",
    "CONCURRENT"
  ]
}

Problems

ID: 880b007e3c61f4aecb7e9b0aa2f9be5fea9f491a076853f68f402769aa254034

recipe:

{
  "hpobench": {
    "dataset": "./fcnet_naval_propulsion_data.hdf5"
  }
}

specification:

{
  "name": "HPO-Bench-Naval",
  "attrs": {
    "github": "https://github.com/automl/nas_benchmarks",
    "paper": "Klein, Aaron, and Frank Hutter. \"Tabular Benchmarks for Joint Architecture and Hyperparameter Optimization.\" arXiv preprint arXiv:1905.04970 (2019).",
    "version": "kurobako_problems=0.1.3"
  },
  "params_domain": [
    {
      "name": "activation_fn_1",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "tanh",
          "relu"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "activation_fn_2",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "tanh",
          "relu"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "batch_size",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 4
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "dropout_1",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 3
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "dropout_2",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 3
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "init_lr",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "lr_schedule",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "cosine",
          "const"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "n_units_1",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "n_units_2",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    }
  ],
  "values_domain": [
    {
      "name": "Validation MSE",
      "range": {
        "type": "CONTINUOUS",
        "low": 0.0
      },
      "distribution": "UNIFORM",
      "constraint": null
    }
  ],
  "steps": 100
}
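
The `params_domain` above fully determines the size of the search space, which is shared by all four problems. The following sketch counts the configurations, assuming that DISCRETE ranges are half-open (`low` inclusive, `high` exclusive). That assumption is not stated in the report; it is chosen because it yields 62208, which to my knowledge matches the number of configurations in the FC-Net tabular benchmarks. The helper name `cardinality` is hypothetical, and the list keeps only the `name` and `range` fields of the specification.

```python
from math import prod

# Trimmed copy of the params_domain entries from the specification above.
params_domain = [
    {"name": "activation_fn_1", "range": {"type": "CATEGORICAL", "choices": ["tanh", "relu"]}},
    {"name": "activation_fn_2", "range": {"type": "CATEGORICAL", "choices": ["tanh", "relu"]}},
    {"name": "batch_size", "range": {"type": "DISCRETE", "low": 0, "high": 4}},
    {"name": "dropout_1", "range": {"type": "DISCRETE", "low": 0, "high": 3}},
    {"name": "dropout_2", "range": {"type": "DISCRETE", "low": 0, "high": 3}},
    {"name": "init_lr", "range": {"type": "DISCRETE", "low": 0, "high": 6}},
    {"name": "lr_schedule", "range": {"type": "CATEGORICAL", "choices": ["cosine", "const"]}},
    {"name": "n_units_1", "range": {"type": "DISCRETE", "low": 0, "high": 6}},
    {"name": "n_units_2", "range": {"type": "DISCRETE", "low": 0, "high": 6}},
]


def cardinality(rng):
    """Number of distinct values in one parameter range."""
    if rng["type"] == "CATEGORICAL":
        return len(rng["choices"])
    if rng["type"] == "DISCRETE":
        # Assumption: `high` is exclusive, so the range holds high - low values.
        return rng["high"] - rng["low"]
    raise ValueError("unsupported range type: " + rng["type"])


total = prod(cardinality(p["range"]) for p in params_domain)
print(total)  # 62208
```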

ID: 445bfa45fdbb8ec6ae6d4dba1909114f3948fa67b47209258db9291480b405b5

recipe:

{
  "hpobench": {
    "dataset": "./fcnet_parkinsons_telemonitoring_data.hdf5"
  }
}

specification:

{
  "name": "HPO-Bench-Parkinson",
  "attrs": {
    "github": "https://github.com/automl/nas_benchmarks",
    "paper": "Klein, Aaron, and Frank Hutter. \"Tabular Benchmarks for Joint Architecture and Hyperparameter Optimization.\" arXiv preprint arXiv:1905.04970 (2019).",
    "version": "kurobako_problems=0.1.3"
  },
  "params_domain": [
    {
      "name": "activation_fn_1",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "tanh",
          "relu"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "activation_fn_2",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "tanh",
          "relu"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "batch_size",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 4
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "dropout_1",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 3
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "dropout_2",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 3
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "init_lr",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "lr_schedule",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "cosine",
          "const"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "n_units_1",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "n_units_2",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    }
  ],
  "values_domain": [
    {
      "name": "Validation MSE",
      "range": {
        "type": "CONTINUOUS",
        "low": 0.0
      },
      "distribution": "UNIFORM",
      "constraint": null
    }
  ],
  "steps": 100
}

ID: add73d4788d7900b34988a8b91cde43e820cac99f9e354e1e71b0ea0be3ef4a6

recipe:

{
  "hpobench": {
    "dataset": "./fcnet_protein_structure_data.hdf5"
  }
}

specification:

{
  "name": "HPO-Bench-Protein",
  "attrs": {
    "github": "https://github.com/automl/nas_benchmarks",
    "paper": "Klein, Aaron, and Frank Hutter. \"Tabular Benchmarks for Joint Architecture and Hyperparameter Optimization.\" arXiv preprint arXiv:1905.04970 (2019).",
    "version": "kurobako_problems=0.1.3"
  },
  "params_domain": [
    {
      "name": "activation_fn_1",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "tanh",
          "relu"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "activation_fn_2",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "tanh",
          "relu"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "batch_size",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 4
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "dropout_1",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 3
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "dropout_2",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 3
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "init_lr",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "lr_schedule",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "cosine",
          "const"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "n_units_1",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "n_units_2",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    }
  ],
  "values_domain": [
    {
      "name": "Validation MSE",
      "range": {
        "type": "CONTINUOUS",
        "low": 0.0
      },
      "distribution": "UNIFORM",
      "constraint": null
    }
  ],
  "steps": 100
}

ID: d88e1704447639bde17f236f7af47f93274d1f02bc8ec66733146ff9cdf50196

recipe:

{
  "hpobench": {
    "dataset": "./fcnet_slice_localization_data.hdf5"
  }
}

specification:

{
  "name": "HPO-Bench-Slice",
  "attrs": {
    "github": "https://github.com/automl/nas_benchmarks",
    "paper": "Klein, Aaron, and Frank Hutter. \"Tabular Benchmarks for Joint Architecture and Hyperparameter Optimization.\" arXiv preprint arXiv:1905.04970 (2019).",
    "version": "kurobako_problems=0.1.3"
  },
  "params_domain": [
    {
      "name": "activation_fn_1",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "tanh",
          "relu"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "activation_fn_2",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "tanh",
          "relu"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "batch_size",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 4
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "dropout_1",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 3
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "dropout_2",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 3
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "init_lr",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "lr_schedule",
      "range": {
        "type": "CATEGORICAL",
        "choices": [
          "cosine",
          "const"
        ]
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "n_units_1",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    },
    {
      "name": "n_units_2",
      "range": {
        "type": "DISCRETE",
        "low": 0,
        "high": 6
      },
      "distribution": "UNIFORM",
      "constraint": null
    }
  ],
  "values_domain": [
    {
      "name": "Validation MSE",
      "range": {
        "type": "CONTINUOUS",
        "low": 0.0
      },
      "distribution": "UNIFORM",
      "constraint": null
    }
  ],
  "steps": 100
}

Studies

ID: 99860593ae06d98cf295ae859fed896461aefc9aadf0e4e532734cca3ccc7d39

ID: e98f05609720b2dde1dadc14cc68fdf13eb4eae792baa8126061f226fcc06cd1

ID: 4e1157e85ba9e6252a1c17b080e7b6fcb2f9be329bd90f492dc91b16f0658f0b

ID: 0e4f067355d41d5416a24258f28a8d7b2cb946b25172013a233c3714146894df

ID: 186d7ff94b4bc8556b0a7c7f81384e126fc4ca106c46ce4638d0527af021c075

ID: efe00cb2acfeedfb75cdac284be3323c5844d3f48b89344ce10f718e9defe4f2

ID: 54cdfcebc95fc90e4af65cc5db05e49a301d00ccc92c3cf630c38e57d69ba83a

ID: 90c6cba83a1919cc970237a59711a8de95c2b7f65253953dbbcb421db0de2853

ID: b237d54e88f15501437112aaaeeb4f1a07f616744d677c3cc0bb2d8c0422eac8

ID: 3e41ea145df955d6378bd4dcf70ef49ac2af94bb217805a5958d8de60df6ec70

ID: 3c878cf37c14f9688166cf33a80d9085c8c15f826d79834217f87062b126de4b

ID: f9ff929d4971e789dec2c24c8a30bf879008d496355a89f6ba50379baa9b9dfe

ID: d2436e4a02f2ea4ec10cb37ddadb42a081d8357100ea76eac1f8ac3d806c52a1

ID: 01aab167501c209e547ede677a748b8ae2ecb330530c2285ed26d62c1028c92f

ID: c916abbe637a0bb2ab89ebff381c9d5982e92b58f71b4a5f16ff97f5644dfb30

ID: 23de9dbcae124d4edd76fcd63c64141c13959605853881f647928668cbb07af1
