Reload and resume a scikit-optimize optimization

Reloading skopt optimizations

Scikit-optimize, or skopt, is a pretty neat module that lets you "minimize (very) expensive and noisy black-box functions". Here are some methods you can call to do that:

  • dummy_minimize()
  • forest_minimize()
  • gbrt_minimize()
  • gp_minimize()

When they are done, these methods return a special object that contains information about the performed optimization: the parameters tried, the function values obtained, the fitted surrogate models, and so on. You can then use skopt's dump() and load() to save this object to a file and restore it later.
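
For example, persisting and restoring a result might look like this (a minimal sketch; the toy objective and the file name are just illustrations, and dump() may need store_objective=False if the objective itself can't be pickled):

from skopt import gp_minimize, dump, load

def f(x):
    # toy 1-D objective with its minimum at x = 0.5
    return (x[0] - 0.5) ** 2

result = gp_minimize(f, [(-1.0, 1.0)], n_calls=15, random_state=0)
dump(result, "optimization.pkl")    # persist the OptimizeResult to disk

# later, possibly in another session
result = load("optimization.pkl")
print(result.fun, result.x)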

There is a problem, though — there isn't a method you can call to resume the optimization; you can only start new optimizations from scratch.

Fortunately, the returned object contains enough information to reload the optimization:

Simple method

The optimization functions have input parameters that let you specify already known points — x0 and y0. These can be used to pass the results of the previous optimization and thus continue from its end state.

Since there is no longer a need to query points randomly, n_random_starts is set to 0.

The previous result's parameters are deepcopy()-ed so that the reloaded optimization doesn't modify the original result object.
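
For instance, continuing a gp_minimize run directly could look like this (a minimal sketch with a made-up one-dimensional objective; the reload_simple() function further down does the same thing generically through base_minimize):

from copy import deepcopy
from skopt import gp_minimize

def f(x):
    # toy objective, minimum at x = 0.5
    return (x[0] - 0.5) ** 2

# initial run: 20 evaluations
result = gp_minimize(f, [(-1.0, 1.0)], n_calls=20, random_state=0)

# continue from where it left off, seeded with the points already tried
result = gp_minimize(f, [(-1.0, 1.0)],
                     x0=deepcopy(result.x_iters),     # parameters already tried
                     y0=deepcopy(result.func_vals),   # their function values
                     n_calls=10,                      # 10 additional evaluations; x0/y0 points don't count
                     n_random_starts=0,               # no new random exploration needed
                     random_state=deepcopy(result.random_state))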

Consistency issue

There is a problem with this simple approach: its behavior is not identical to an uninterrupted optimization. In other words, if you optimize a function for 25 iterations and then use this method to reload that optimization for another 25, it won't query the same points as the initial optimization would have, had it been set to run for 50 iterations. There are two sources of error:

  1. The x0-y0 points are loaded at the beginning of the optimization. This is treated as a typical iteration, and exhausts a step of the internal random state. Here's what we could do about it:
      • Advance the random state a step back. As far as I'm aware, this is impossible.
      • Don't treat the loading operation as an iteration: break into the base_minimize method and explicitly pass the points to the Optimizer object without using optimizer.tell(). The problem with this approach is that the first iteration won't query the right point. That point is determined at the end of the call to optimizer.tell() (which is what advances the random state in the first place), and since we avoided that call, the first iteration's optimizer.ask() will fall back to a random point instead.
      • Use the optimization's callback option to save a copy of the random state that lags a step behind (see the sketch after this list). Loading the points then exhausts this extra step and brings the random state to its proper place. This works, but requires you to modify the code that runs the optimizations, so you won't be able to reload old optimizations created before the modification.
  2. The internally-used Optimizer object contains some internal state that is lost. Even if you fix the first issue, your reloads will still occasionally break and query different points.
     For example, gp_minimize uses three different acquisition functions to predict the next point and picks one of them. The probabilities used to pick each one are derived from self.gains_, an internally-stored attribute. This attribute is set to zeros upon creation and modified every iteration with the -= operator. Since a reloaded optimization skips the iterations of already-queried points, this attribute ends up different. Even if there were a convenient way to modify it, we would have no idea what to set it to unless we kept track of it while running the initial optimization.
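
Here is a rough sketch of that callback idea. It assumes the intermediate OptimizeResult handed to each callback carries the optimizer's random_state the way the final result does; the one-dimensional objective is made up for illustration:

from copy import deepcopy
from skopt import forest_minimize

# one snapshot per completed iteration; states[-2] lags one step behind the end
states = []

def snapshot_random_state(res):
    # res is the intermediate OptimizeResult after each tell();
    # deep-copy so later iterations can't mutate the saved state in place
    states.append(deepcopy(res.random_state))

def f(x):
    return (x[0] - 1) ** 2

result = forest_minimize(f, [(-2.0, 2.0)], n_calls=25, random_state=0,
                         callback=[snapshot_random_state])

# When reloading with x0/y0, pass states[-2] as random_state: the single tell()
# that loads the old points then advances it to match the state the original
# run had after 25 iterations.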

Stable method

The simplest way to achieve consistency is to query points in exactly the same manner as we would in an uninterrupted optimization. This is identical to just running the initial optimization with the same parameters, but for a larger number of iterations. To cut out the time required to run the initial portion of the optimization, we can avoid evaluating the function and instead pass the already known values.

We can do this by wrapping the evaluated function in func_new(). For the first part of the reloaded optimization, this wrapper returns the previously recorded values instead of calling the function; once those are exhausted, it switches to evaluating the function as usual.

Inconveniences

  1. This approach only saves the time taken by the function evaluations. It won't help much if your function is fast relative to the Optimizer's ask() and tell() calls.
  2. The initial random state is saved as an object and returned with the optimization result. The problem is that only copy() is used (as opposed to deepcopy()), and the random state's internal values get modified throughout the optimization, so the saved object no longer represents the initial state and can't be reused directly. It's possible to recreate it, but you'll need to know init_seed, the value that was passed as random_state to the initial optimization.
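
For reference, check_random_state() is how that initial state can be rebuilt from the seed (a small sketch; the seed 0 is arbitrary):

from sklearn.utils import check_random_state

rng = check_random_state(0)            # same as np.random.RandomState(0)
assert check_random_state(rng) is rng  # an existing RandomState is passed through
rng_global = check_random_state(None)  # None gives numpy's global RandomState

The full implementation, reload.py, and a script that tests it against an uninterrupted run follow.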

import numpy as np
import warnings
from copy import deepcopy
from skopt.learning import GaussianProcessRegressor
from skopt.optimizer import base_minimize
from sklearn.utils import check_random_state


def reload_simple(result, addtl_calls):
    """ Continue an skopt optimization from its returned OptimizeResult object.
        Will not run through previously queried points, but its behavior won't
        be identical to longer, uninterrupted optimizations.

        PARAMETERS
        ----------
        result [OptimizeResult, scipy object]:
            Result of an skopt optimization, as returned by the optimization method.
            Tested methods:
                dummy_minimize
                forest_minimize
                gbrt_minimize
                gp_minimize
        addtl_calls [int]:
            Number of additional iterations to perform.

        RETURNS
        -------
        result_new [OptimizeResult, scipy object]:
            Updated optimization result returned as an OptimizeResult object.
    """
    args = deepcopy(result.specs['args'])
    args['n_calls'] = addtl_calls
    args['n_random_starts'] = 0
    args['x0'] = deepcopy(result.x_iters)
    args['y0'] = deepcopy(result.func_vals)
    args['random_state'] = deepcopy(result.random_state)
    return base_minimize(**args)


def func_new(params):
    global func_, xs_, ys_
    if(len(xs_) > 0):
        y = ys_.pop(0)
        if(params != xs_.pop(0)):
            warnings.warn("Deviated from expected value, re-evaluating", RuntimeWarning)
        else:
            return y
    return func_(params)


def reload_stable(result, addtl_calls, init_seed=None):
    """ Continue an skopt optimization from its returned OptimizeResult object.
        Consistent with uninterrupted optimizations, but will take more time
        feeding in previously queried points.

        PARAMETERS
        ----------
        result [OptimizeResult, scipy object]:
            Result of an skopt optimization, as returned by the optimization method.
            Tested methods:
                dummy_minimize
                forest_minimize
                gbrt_minimize
                gp_minimize
        addtl_calls [int]:
            Number of additional iterations to perform.
        init_seed [int, RandomState instance, or None (default)]:
            The value passed as 'random_state' to the original optimization.

        RETURNS
        -------
        result_new [OptimizeResult, scipy object]:
            Updated optimization result returned as an OptimizeResult object.
    """
    args = deepcopy(result.specs['args'])
    args['n_calls'] += addtl_calls

    # global b/c I couldn't find a better way to pass
    global func_, xs_, ys_
    func_ = args['func']
    xs_ = list(result.x_iters)
    ys_ = list(result.func_vals)
    args['func'] = func_new

    # recover initial random_state
    if(isinstance(args['random_state'], np.random.RandomState)):
        args['random_state'] = check_random_state(init_seed)
        # if gp_minimize
        if(isinstance(result.specs['args']['base_estimator'], GaussianProcessRegressor)):
            args['random_state'].randint(0, np.iinfo(np.int32).max)

    # run the optimization
    result_new = base_minimize(**args)

    # change the function back, to reload multiple times
    result_new.specs['args']['func'] = func_
    return result_new

from reload import reload_stable

import numpy as np
from skopt import forest_minimize
from skopt.space import Real
from skopt.utils import use_named_args
from skopt.plots import plot_objective
import matplotlib.pyplot as plt

space = [Real(-2, 2, "uniform", name='x'),
         Real(-1, 3, "uniform", name='y')]


@use_named_args(space)
def rosenbrock(x, y):
    # true minimum is 0 at (a, a^2)
    a = 1
    b = 100
    return (a-x)**2 + b*(y-x*x)**2


result1 = forest_minimize(rosenbrock, space, n_calls=25, random_state=0)
result1 = reload_stable(result1, addtl_calls=25, init_seed=0)
result2 = forest_minimize(rosenbrock, space, n_calls=50, random_state=0)

if(np.allclose(result1.func_vals, result2.func_vals)):
    print("TEST PASSED")
    print("OPTIMUM:", result1.fun, "at", result1.x)
else:
    print("TEST FAILED")

# predicted function
plot_objective(result1)

# true function
class dummy_model:
    def predict(self, points):
        return [rosenbrock(point) for point in points]

result1.models[-1] = dummy_model()
plot_objective(result1)
plt.show()