{
"cells": [
{
"cell_type": "markdown",
"id": "69c6bd5e-83cf-4d76-8f8e-9f860bd0d71b",
"metadata": {},
"source": [
"# Hyperparameter Optimization with Hyperband - 30 Times Faster Than Bayesian Optimization?\n",
"\n",
"How a machine leanring model performs highly depends on identifying a good set of hyperparameters. Let's first understand what Hyperparameters are. Hyperparameters can be considered inputs to a machine learning model that impact how the model performs - they are the levers we can pull to impact the outcome of the model. For example, a good set of hyperparameters can result in a machine learning model making more accurate predictions, while a worse set of hyperparameters can results in less accurate predictions. The task of finding a good set of hyperparameters is called hyperparameter optimization. And as you may expect, there are various hyperparameter optimization methodologies, starting from very simple ones such as Grid Search and Random Search, to more sophisticated methodologies that can \"learn\" from each iteration and hence are more efficient, such as Bayesian Optimization. I have covered these approaches in a different post (linked below) and for this post, we are going to talk about a more recent approach that provides much more speed and efficiency compared to traditional Grid and Random Search and is more comparable to Bayesian Optimization and is called Hyperband. \n",
"\n",
"[Grid Search, Random Search and Bayesian Optimization](https://towardsdatascience.com/hyperparameter-optimization-intro-and-implementation-of-grid-search-random-search-and-bayesian-b2f16c00578a)\n",
"\n",
"You may be wondering why we would even need Hyperband, considering we can just use Bayesian Optimization. Bayesian Optimization is an excellent approach is classical machine learning tasks such as classification but it can be quite computationally expensive in deep learning problems, such as language models, etc., which have exploded in popularity recently and is exactly the area where Hyperband shines. In fact, Hyperband can be up to 30 times faster than Bayesian Optimization in deep-learning use cases.\n",
"\n",
"Let's get started!\n",
"\n",
"## What is Hyperband? A Conceptual Overview\n",
"\n",
"The main idea behind Hyperband is to optimize an exploration strategy by allocating our limited resources to the more promising sets of hyperparameters. Let's walk through an imaginary example by talking about the multi-armed bandit problem. Multi-armed bandit problem comes from the scenario where a gambler is at a casino with a row of slot machines in front of him. In such a scenario, our bandit is similar to a machine learning algorithm with limited resources since he can only spin so many slot machines given his time constraints. Therefore, he needs to pick the most promising ones to maximize his wins. The slot machines are the hyperparameters in this example, since they impact the bandit's performance of winning. The bandit has to start from somewhere so let's say that he randomly selects a few of the slot machines, given his limited resources and then based on the outcome of these slot machines, decides which of the slot machines are more promising and which ones are less (according to his best judgement). Then the bandit will ignore the slot machines that he decides are less promising and spends all of his limited resources only on the promising slot machines. Given his available resources, he continues narrowing down to the winning slot machines.\n",
"\n",
"Now that we understand the bandit problem, let's go back to the machine learning realm and walk through what exactly hyperband does. First, hyperband allocates the existing resources to a randomly-selected sets of hyperparameters (i.e. selects random sets of hyperparameters and runs the machine learning model for each set). Then hyperband early-stops poor-performing sets of hyperparameter optimization (i.e. similar to the bandit ignoring less-promising slot machines) and allocates those resources to the more promising sets of hyperparameters. The process continues until pre-defined resources are exhausted. \n",
"\n",
"Now that we have an overall understanding of how hyperband works, let's implement an example and look at the results to better understand it. \n",
"\n",
"## Implementation\n",
"\n",
"For implementation, we will optimize the hyperparameters of a neural network classifier from [scikit-learn](https://scikit-learn.org/stable/) library using [Hyperopt](https://github.com/hyperopt/hyperopt) for hyperband. \n",
"\n",
"### Step 1 - Import Libraries\n",
"\n",
"Let's start by importing the necessary libraries. NumPy and scikit-learn are among the more common packages and [Hyperopt](https://github.com/hyperopt/hyperopt) is a package developed for hyperparameter optimization. Code block below imports the libraries. "
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "7b3fda4f-7f52-43ce-bb52-3fcbf966f5f5",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from hyperopt import fmin, tpe, Trials, hp # https://github.com/hyperopt/hyperopt/wiki/FMin\n",
"from sklearn.datasets import load_digits # https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html\n",
"from sklearn.model_selection import cross_val_score # https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_score.html\n",
"from sklearn.neural_network import MLPClassifier # https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html\n",
"import time"
]
},
{
"cell_type": "markdown",
"id": "3a9371f1-4f9d-4f15-b0d9-b1275aab707b",
"metadata": {},
"source": [
"### Step 2 - Define The Search Space\n",
"\n",
"Search space, in the context of hyperparameter optimization, is the universe of the hyperparameters and their corresponding values considered for the optimization task. \n",
"\n",
"In this example, the search space includes four hyperparameters as defined below. Note that you do not necessarily need to understand each of these hyperparameters in order to understand the overall hyperparameter optimization process. Feel free to glance over them and move to the next section if necessary. \n",
"\n",
"1. `learning_rate_init`: The initial learning rate of the optimizer\n",
"2. `hidden_layer_sizes`: The number of neurons in each hidden layer of the neural network\n",
"3. `alpha`: The regularization parameter\n",
"4. `activation`: The activation function used in each neuron\n",
"\n",
"We can define the search space using the `hp.choice` and `hp.loguniform` functions provided by the Hyperopt package. The `hp.choice` function specifies a list of possible values, while the `hp.loguniform` function specifies a uniform distribution in log space.\n",
"\n",
"Search space is formally defined as follows:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "add59912-4e65-4a7e-bbb6-5ef48c99b1f7",
"metadata": {},
"outputs": [],
"source": [
"# Define search space\n",
"space = { \n",
" 'learning_rate_init': hp.loguniform('learning_rate_init', np.log(0.001), np.log(0.1)),\n",
" 'hidden_layer_sizes': hp.choice('hidden_layer_sizes', [(32,), (64,), (128,), (256,), (32, 32), (64, 64), (128, 128), (256, 256)]),\n",
" 'alpha': hp.loguniform('alpha', np.log(0.0001), np.log(0.01)),\n",
" 'activation': hp.choice('activation', ['identity', 'logistic', 'tanh', 'relu'])\n",
"}"
]
},
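{
"cell_type": "markdown",
"id": "a1f2b3c4-1111-4a2b-9c3d-0a1b2c3d4e5f",
"metadata": {},
"source": [
"To get a feel for what this search space produces, we can draw a random sample from it using Hyperopt's `hyperopt.pyll.stochastic.sample` utility. Each call returns one concrete set of hyperparameters - a quick illustration, not part of the optimization itself."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2a3c4d5-2222-4b3c-8d4e-1b2c3d4e5f6a",
"metadata": {},
"outputs": [],
"source": [
"from hyperopt.pyll import stochastic\n",
"\n",
"# Draw one random configuration from the search space defined above\n",
"print(stochastic.sample(space))"
]
},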
{
"cell_type": "markdown",
"id": "c19855d3-cbb2-4b55-8192-cc04451e0914",
"metadata": {},
"source": [
"### Step 3 - Define The Objective Function\n",
"\n",
"As the next step, we need to define the objective function that evaluates each configuration of hyperparameters. In other words, we will be optimizing the objective function. In this example, we will use the cross-validation score of the neural network classifier as the objective function. The objective function takes a set of hyperparameters as input, initializes a neural network classifier with those hyperparameters and then evaluates the performance of the classifier using cross-validation. \n",
"\n",
"Note that we will use the negative of the cross-validation score as the objective function because [Hyperopt](https://github.com/hyperopt/hyperopt) tries to minimize the objective function.\n",
"\n",
"Let's formally define the objective function. "
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "325f1169-c708-4908-944c-bf8a71d77012",
"metadata": {},
"outputs": [],
"source": [
"# Define objective function\n",
"def objective(params):\n",
" clf = MLPClassifier(\n",
" learning_rate_init = params['learning_rate_init'],\n",
" hidden_layer_sizes = params['hidden_layer_sizes'],\n",
" alpha = params['alpha'],\n",
" activation = params['activation'],\n",
" random_state = 1234\n",
" )\n",
" score = cross_val_score(clf, X, y, cv=5).mean()\n",
" \n",
" return -score # Minimize negative score\n"
]
},
{
"cell_type": "markdown",
"id": "f5ddeaf1-6079-44af-b2c6-1bb3616b3f89",
"metadata": {},
"source": [
"### Step 4 - Load The Data Set\n",
"\n",
"In this step, we will load the digits data set from [scikit-learn](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html). `load_digits` is a popular data set used as a benchmark for testing classification tasks in machine learning, which contains 8x8 images of handwritten digits. Each image is represented as an 8x8 array of grayscale values, and the target variable is the digit that the image represents (i.e., a number between 0 and 9). The dataset contains a total of 1,797 images, with 10% of the data reserved for testing and the remaining 90% used for training.\n",
"\n",
"You are likely to come across the `load_digits` dataset used for evaluating the performance of a classifier with different sets of hyperparameters, since it is relatively small and the training and cross-validation process can be completed relatively quickly. These characteristics make this data set an ideal candidate for our example.\n",
"\n",
"Let's go ahead and load the data set."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4945b5b2-9819-4f44-a7e1-f5ba77dc4dbd",
"metadata": {},
"outputs": [],
"source": [
"# Load data from sklearn\n",
"digits = load_digits()\n",
"X, y = digits.data, digits.target"
]
},
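{
"cell_type": "markdown",
"id": "c3b4d5e6-3333-4c4d-9e5f-2c3d4e5f6a7b",
"metadata": {},
"source": [
"As a quick, optional sanity check, we can confirm the shape of the data: 1,797 samples, each a flattened 8x8 image (64 features), with labels 0 through 9."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4c5e6f7-4444-4d5e-af6a-3d4e5f6a7b8c",
"metadata": {},
"outputs": [],
"source": [
"# Sanity check: expect (1797, 64) features and labels 0-9\n",
"print(X.shape, y.shape, np.unique(y))"
]
},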
{
"cell_type": "markdown",
"id": "0adeacea-cb4f-4e0f-9b9d-2c152c860f52",
"metadata": {},
"source": [
"### Step 5 - Optimize\n",
"\n",
"Now that we have imported our libraries, defined the search space and objective function and loaded the data, we are finally ready to kick off the hyperparameter optimization itself via hyperband. We will use the `fmin()` function from the Hyperopt package to run the hyperband algorithm. `fmin()` has some arguments as defined below:\n",
"\n",
"- `fn` argument is the objective function that we defined in Step 3\n",
"- `space` argument is the search space that we defined in Step 2\n",
"- `algo` argument is the optimization algorithm that we want to use (we will be using Tree-structured Parzen Estimator (TPE) algorithm)\n",
"- `max_evals` argument is the maximum number of iterations for the hyperband algorithm\n",
"- `trials` argument is an instance of the `Trials()` class, which keeps track of the results of each iteration of the optimization process\n",
"\n",
"Note that I am also using the `time()` package to measure how long this process takes. \n",
"\n",
"Let's run the optimization and find the winning set of hyperparameters. "
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "2134a5ee-ff0c-416b-ba50-907c737896e7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 8%|██▋ | 17/200 [00:43<10:58, 3.60s/trial, best loss: -0.9543748065614361]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 9%|██▉ | 18/200 [00:47<11:36, 3.83s/trial, best loss: -0.9543748065614361]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 24%|███████▊ | 49/200 [02:36<08:25, 3.35s/trial, best loss: -0.9604859176725471]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 25%|████████ | 50/200 [02:41<09:47, 3.92s/trial, best loss: -0.9604859176725471]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 46%|██████████████▉ | 93/200 [05:32<07:23, 4.14s/trial, best loss: -0.9604859176725471]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 52%|████████████████▎ | 105/200 [06:11<04:22, 2.76s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 53%|████████████████▍ | 106/200 [06:15<05:01, 3.20s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 55%|█████████████████ | 110/200 [06:29<05:10, 3.45s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 56%|█████████████████▎ | 112/200 [06:41<07:12, 4.91s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 58%|██████████████████▏ | 117/200 [07:03<05:57, 4.30s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 59%|██████████████████▎ | 118/200 [07:07<05:57, 4.35s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 64%|███████████████████▋ | 127/200 [07:39<04:34, 3.75s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 66%|████████████████████▌ | 133/200 [08:11<06:06, 5.48s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 80%|████████████████████████▋ | 159/200 [10:06<03:45, 5.49s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 83%|█████████████████████████▋ | 166/200 [10:38<02:37, 4.64s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 84%|█████████████████████████▉ | 167/200 [10:45<02:55, 5.33s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" 98%|██████████████████████████████▏| 195/200 [12:44<00:18, 3.79s/trial, best loss: -0.9604982977406376]"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n",
"/Users/mafarzad/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.\n",
" warnings.warn(\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"100%|███████████████████████████████| 200/200 [13:05<00:00, 3.93s/trial, best loss: -0.9604982977406376]\n"
]
}
],
"source": [
"# Start time\n",
"start_time = time.time()\n",
"\n",
"# Run hyperband algorithm\n",
"trials = Trials()\n",
"best = fmin(\n",
" fn = objective,\n",
" space = space,\n",
" algo = tpe.suggest,\n",
" max_evals = 200,\n",
" trials = trials\n",
")\n",
"\n",
"# End time\n",
"end_time = time.time()"
]
},
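{
"cell_type": "markdown",
"id": "e5d6f7a8-5555-4e6f-b07b-4e5f6a7b8c9d",
"metadata": {},
"source": [
"The `ConvergenceWarning` messages interspersed in the output above (and repeated for many trials) come from `MLPClassifier` hitting its default `max_iter=200` before the stochastic optimizer converges. If you want cleaner output, one option - a sketch, not something the run above did - is to filter the warning or give the optimizer more iterations:\n",
"\n",
"```python\n",
"import warnings\n",
"from sklearn.exceptions import ConvergenceWarning\n",
"\n",
"# Silence the repeated warnings from trials that hit the iteration cap\n",
"warnings.filterwarnings('ignore', category=ConvergenceWarning)\n",
"\n",
"# Alternatively, raise the cap inside objective(), at the cost of longer trials:\n",
"# clf = MLPClassifier(..., max_iter=500)\n",
"```"
]
},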
{
"cell_type": "code",
"execution_count": 7,
"id": "87a0cd5b-1c5e-4722-a5bc-52d08b5c76c0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hyperopt\n",
"\n",
"elapsed_time: 13.1 minutes\n",
"\n",
"best hyperparameter set:\n",
"{'activation': 3, 'alpha': 0.0029752003177542416, 'hidden_layer_sizes': 7, 'learning_rate_init': 0.0018296408104005331}\n"
]
}
],
"source": [
"print(f\"hyperopt\\n\")\n",
"print(f\"elapsed_time: {round((end_time - start_time)/60, 1)} minutes\\n\")\n",
"print(f\"best hyperparameter set:\")\n",
"print(best)"
]
},
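{
"cell_type": "markdown",
"id": "f6e7a8b9-6666-4f7a-816c-5f6a7b8c9d0e",
"metadata": {},
"source": [
"One caveat when reading this result: for `hp.choice` parameters, `fmin()` returns the *index* of the winning option rather than the value itself - e.g. `'activation': 3` is the fourth entry of the list we defined, `'relu'`. Hyperopt's `space_eval()` maps the result back to the actual values; a short sketch:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7f8b9c0-7777-4a8b-927d-6a7b8c9d0e1f",
"metadata": {},
"outputs": [],
"source": [
"from hyperopt import space_eval\n",
"\n",
"# Convert index-based results (from hp.choice) back to actual values\n",
"print(space_eval(space, best))"
]
},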
{
"cell_type": "markdown",
"id": "88cb21eb-528c-485a-bb55-36dd86100cdc",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"In this post we talked about the importance of hyperparameter optimization in machine learning and introduced Hyperband as a hyperparameter optimization methodology that utilizes exploration and smart resource allocation to identify an optimized set of hyperparameter optimization, much faster than approaches such as Bayesian Optimization. We then implemented a hyperband optimization approach step by step. "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}