@c-bata
Last active July 16, 2019 12:10
Benchmark of GaussianProcessRegressor for https://github.com/scikit-learn/scikit-learn/pull/14378
import time

import numpy as np
import matplotlib.pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process import kernels as sk_kern


def objective(x):
    return x + 20 * np.sin(x)


def plot_result(x_test, mean, std):
    plt.plot(x_test[:, 0], mean, color="C0", label="predict mean")
    plt.fill_between(x_test[:, 0], mean + std, mean - std,
                     color="C0", alpha=.3, label="1 sigma confidence")
    xx = np.linspace(-20, 20, 200)
    plt.plot(xx, objective(xx), "--", color="C0", label="true function")
    plt.title("function evaluation")
    plt.legend()
    plt.savefig("gpr_predict.png", dpi=150)


def main():
    kernel = sk_kern.RBF(1.0, (1e-3, 1e3)) + sk_kern.ConstantKernel(1.0, (1e-3, 1e3))
    clf = GaussianProcessRegressor(
        kernel=kernel,
        alpha=1e-10,
        optimizer="fmin_l_bfgs_b",
        n_restarts_optimizer=20,
        normalize_y=True)

    np.random.seed(0)
    x_train = np.random.uniform(-20, 20, 200)
    y_train = objective(x_train) + np.random.normal(loc=0, scale=.1, size=x_train.shape)

    # Fit 10 times and report the mean and standard deviation of the elapsed time.
    times = []
    for i in range(10):
        start = time.time()
        clf.fit(x_train.reshape(-1, 1), y_train)
        elapsed = time.time() - start
        print(f"elapsed: {elapsed:.3f}s")
        times.append(elapsed)
    print("score:", np.array(times).mean(), np.array(times).std())

    x_test = np.linspace(-20., 20., 200).reshape(-1, 1)
    pred_mean, pred_std = clf.predict(x_test, return_std=True)
    plot_result(x_test, pred_mean, pred_std)


if __name__ == '__main__':
    main()
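
Most of the fit time in the script above goes into the L-BFGS-B hyperparameter search, which is re-run `n_restarts_optimizer` times from random initial values, so the benchmark is dominated by repeated optimizer iterations. A smaller sketch (assuming a reduced dataset and restart counts chosen for illustration, so the numbers differ from the benchmark above) shows how fit time scales with the number of restarts:

```python
import time

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process import kernels as sk_kern

rng = np.random.RandomState(0)
x_train = rng.uniform(-20, 20, 50).reshape(-1, 1)
y_train = x_train[:, 0] + 20 * np.sin(x_train[:, 0])

timings = {}
for n in (0, 5):
    gpr = GaussianProcessRegressor(
        kernel=sk_kern.RBF(1.0, (1e-3, 1e3)) + sk_kern.ConstantKernel(1.0, (1e-3, 1e3)),
        n_restarts_optimizer=n,
        normalize_y=True)
    start = time.perf_counter()
    gpr.fit(x_train, y_train)  # n_restarts_optimizer=n => 1 + n optimizer runs
    timings[n] = time.perf_counter() - start
    print(f"n_restarts_optimizer={n}: {timings[n]:.3f}s")
```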
c-bata commented Jul 15, 2019

patch

gpr.py L420

```diff
-        kernel = self.kernel_.clone_with_theta(theta)
+        kernel = self.kernel_
+        kernel.theta = theta
```
Before

$ python examples/gpr_bench.py 
elapsed: 3.620s
elapsed: 4.616s
elapsed: 3.892s
elapsed: 3.784s
elapsed: 3.908s
elapsed: 4.280s
elapsed: 2.884s
elapsed: 2.543s
elapsed: 2.502s
elapsed: 4.946s
score: 3.6975975036621094 0.7911395580877181

After

$ python examples/gpr_bench.py 
elapsed: 2.642s
elapsed: 2.423s
elapsed: 2.921s
elapsed: 2.704s
elapsed: 2.856s
elapsed: 3.667s
elapsed: 2.737s
elapsed: 2.397s
elapsed: 2.292s
elapsed: 2.590s
score: 2.722917938232422 0.36802930790866245

about 26.4% faster (mean fit time 3.70s → 2.72s)
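
The percentage follows directly from the two mean fit times printed by the script (values copied from the runs above):

```python
before = 3.6975975036621094  # mean fit time before the patch (s)
after = 2.722917938232422    # mean fit time after the patch (s)

speedup = (before - after) / before * 100
print(f"{speedup:.1f}% faster")  # ≈ 26.4%
```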

c-bata commented Jul 15, 2019

To confirm this PR doesn't change the model's behavior, I generated plots of the prediction results before and after the patch.

[Prediction plots: gpr_predict (before) and gpr_predict2 (after)]
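
The plots can be backed up numerically: for any theta, the kernel returned by `clone_with_theta` and the kernel mutated via `theta` assignment produce identical Gram matrices, so both code paths optimize the same marginal likelihood. A quick sanity check (kernel, theta, and input points chosen arbitrarily):

```python
import numpy as np
from sklearn.gaussian_process import kernels as sk_kern

kernel = sk_kern.RBF(1.0) + sk_kern.ConstantKernel(1.0)
theta = np.log([2.0, 3.0])

cloned = kernel.clone_with_theta(theta)  # old code path: fresh kernel object
kernel.theta = theta                     # new code path: mutate in place

X = np.random.RandomState(0).uniform(-20, 20, (10, 1))
print(np.allclose(cloned(X), kernel(X)))  # True: same Gram matrix either way
```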
