@quantshah
Last active July 10, 2024 10:52
Universal quantum classifier using Pennylane - a python library for quantum machine learning
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Universal quantum classifier\n",
"\n",
"**Author: Shahnawaz Ahmed (shahnawaz.ahmed95@gmail.com)**\n",
"\n",
"A single-qubit quantum circuit which can implement arbitrary unitary operations can be used as a universal classifier much like a single hidden-layered Neural Network. As surprising as it sounds, Adri´an P´erez-Salinas et al. (2019) discuss this with their idea of ´data reuploading´. It is possible to load a single qubit with arbitrary dimensional data and use it as a universal classifier.\n",
"\n",
"In this example, we will implement this idea with `Pennylane` - a python based tool for quantum machine learning, automatic differentiation, and optimization of hybrid quantum-classical computations.\n",
"\n",
"\n",
"### Circles\n",
"\n",
"We consider a simple classification problem will train a single-qubit variational quantum circuit to achieve this goal. The data is generated as a set of random points in a plane $(x_1, x_2)$ and labeled as 1 (blue) or 0 (red) depending on whether they lie inside or outside a circle. \n",
"\n",
"![Screenshot 2019-07-22 14 32 38](https://user-images.githubusercontent.com/6968324/61652759-3bd7ea80-aca8-11e9-88c6-9e9f4d443703.png)\n",
"\n",
"\n",
"### Quantum states, unitaries and data-reuploading\n",
"\n",
"A single-qubit quantum state is characterized by a two-dimensional state vector and can be visualized as a point in the so-called Bloch sphere. Instead of just being a 0 or 1, it can exist as a superposition with say 30% chance of being in the $|0 \\rangle$ and 70% chance of being in the $|1 \\rangle$ state. This is represented by a state vector $|\\psi \\rangle = 0.3|0 \\rangle + 0.7|0 \\rangle $ - the probability \"amplitude\" of the quantum state. In general we can take a vector $(\\alpha, \\beta)$ to represent the probabilities that a qubit can take and visualize it as follows:\n",
"\n",
"![Screenshot 2019-07-22 14 33 49](https://user-images.githubusercontent.com/6968324/61652766-3f6b7180-aca8-11e9-9de5-8efcbd7b63f2.png)\n",
"\n",
"In order to load data onto a single qubit, we use a unitary operation U ($x_1$, $x_2$, $x_3$) which is just a parameterized matrix multiplication representing the rotation of the state vector in the Bloch sphere. Eg., to load $(x_1, x_2)$ into the qubit, we just start from some initial state vector, $|0 \\rangle $, apply the unitary operation $U(x_1, x_2, 0)$ and end up at a new point on the Bloch sphere. Here we have padded 0 since our data is only 2D. Adri´an P´erez-Salinas et al. (2019) discuss how to load a higher dimensional data point ($[x_1, x_2, x_3, x_4, x_5, x_6]$) by breaking it down in sets of three parameters ($U(x_1, x_2, x_3), U(x_4, x_5, x_6)$).\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Model parameters and linear layers with data re-uploading\n",
"\n",
"Once we load the data onto the quantum circuit, we want to have some trainable nonlinear model similar to a Neural Network and a way of learning the weights of the model from data. This is again done with unitaries, $U(\\theta_1, \\theta_2, \\theta_3)$ such that we load the data first and then apply the weights to form a single layer $L(\\vec \\theta, \\vec x) = U(\\vec \\theta)U(\\vec x)$. In principle, this is just application of two matrix multiplications on an input vector initialized to some value. In order to increase the number of trainable parameters (similar to increasing neurons in a single layer of a neural network), we can reapply this layer again and again with new sets of weights, $L(\\vec \\theta_1, \\vec x) L(\\vec \\theta_2, , \\vec x) ... L(\\vec \\theta_L, \\vec x)$ for $L$ layers. The quantum circuit would look like the following:\n",
"\n",
"![Screenshot 2019-07-22 14 49 40](https://user-images.githubusercontent.com/6968324/61652768-41353500-aca8-11e9-819a-8b40380e7dfa.png)\n",
"\n",
"\n",
"\n",
"### Nonlinear \"collapse\" and the cost function\n",
"\n",
"So far, we have only performed linear operations (matrix multiplications) and we know that we need to have some nonlinear squashing similar to activation functions in neural networks to really make a universal classifier (Cybenko 1989). Here is where things gets a bit quantum. After the application of the layers, we will end up at some point on the Bloch sphere due to the sequence of unitaries implementing rotations of the input. These are still just linear transformations of the input state. Now, the output of the model should be a class label which can be encoded as fixed vectors (Blue = $[1, 0]$, Red = $[0, 1]$) on the Bloch sphere. We want to end up at either of them after transforming our input state through alternate applications of data layer and weights. \n",
"\n",
"We can use the idea of the \"collapse\" of our quantum state state into one or other class. This happens when we measure the quantum state which leads to its projection as either the state 0 or 1. We can compute the fidelity (or closeness) of the output state to the class label making the output state jump to either $| 0 \\rangle or |1\\rangle$. By repeating this process several times, we can compute the probability or overlap and assign a label based on which label state, our output has a higher overlap.\n",
"\n",
"<img width=\"1021\" alt=\"Screenshot 2019-07-22 16 47 50\" src=\"https://user-images.githubusercontent.com/6968324/61652769-42666200-aca8-11e9-918e-c7ab5a503a9a.png\">\n",
"\n",
"We can then define the cost function as the sum of the fidelities for all the data points and optimize the parameters $(\\vec \\theta)$ to reduce the cost and get a trained model which can make the classification for new data points.\n",
"\n",
"$$\n",
"\\texttt{Cost} = \\sum_{\\texttt{data points}} (1 - \\texttt{fidelity}(\\psi_{\\texttt{output}}(\\vec x, \\vec \\theta), \\psi_{\\texttt{label}}))\n",
"$$\n",
"\n",
"Now, we can use our favorite optimizer to maximize the sum of the fidelities over all data points (or batches of datapoints) and find the optimal weights for classification.\n",
"\n",
"In `pennylane`, we can define an observable (the expected output label) and make a circuit to return the fidelity using the `Hermitian` operator.\n",
"\n",
"### Multiple qubits, entanglement and Deep Neural Networks\n",
"\n",
"The Universal Approximation Theorem declares a single hidden layered neural network to be capable of approximating any function to arbitrary accuracy. But in practice, it might require a large number of neurons in the single hidden layer and here is where Deep Neural Networks come into action. Deep Neural Networks proved to be better in practice and we have some intuitive idea why, read \"Why does deep and cheap learning work so well?\" by Henry W. Lin, Max Tegmark (MIT) and David Rolnick (2016).\n",
"\n",
"Adri´an P´erez-Salinas et al. (2019) describe that in their approach the \"layers\" $L_i(\\vec \\theta_i, \\vec x )$ are analogous to the size of the intermediate hidden layer of the neural network. And what counts for deep (multiple layers of the neural network) relates to the number of qubits. So, multiple qubits with entanglement between them could provide some quantum advantage over classical neural networks. But here, we will only implement a single qubit classifier.\n",
"\n",
"<img width=\"1088\" alt=\"Screenshot 2019-07-22 17 16 22\" src=\"https://user-images.githubusercontent.com/6968324/61652774-43978f00-aca8-11e9-972e-8d13208055f2.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## \"Talk is cheap. Show me the code.\"\n",
"### - Linus Torvalds"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pennylane as pl\n",
"from pennylane import numpy as np\n",
"from pennylane.optimize import AdamOptimizer, GradientDescentOptimizer\n",
"\n",
"from sklearn.metrics import accuracy_score\n",
"from qutip import Qobj, fidelity, Bloch\n",
"\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Make a dataset of points in and out of a circle"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 360x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def circle(samples, center=[0., 0.], radius=0.8):\n",
" \"\"\"\n",
" Generates a dataset of points with 1/0 labels inside a given radius. \n",
" \n",
" Parameters\n",
" ----------\n",
" samples: int\n",
" The number of samples to generate.\n",
" \n",
" center: tuple\n",
" The center of the circle\n",
" \n",
" radius: float\n",
" The radius of the circle.\n",
" \"\"\"\n",
" Xvals, yvals = [], []\n",
"\n",
" for i in range(samples):\n",
" x = 2*(np.random.rand(2)) - 1\n",
" y = 0\n",
" if np.linalg.norm(x - center) < radius:\n",
" y = 1 \n",
" Xvals.append(x)\n",
" yvals.append(y)\n",
" return np.array(Xvals), np.array(yvals)\n",
"\n",
"\n",
"def plot_data(x, y, fig=None, ax=None):\n",
" \"\"\"\n",
" Plot data with red/blue values for a binary classification.\n",
" \n",
" Parameters\n",
" ----------\n",
" x: ndarray (m, 2)\n",
" An array of m data points with each having dimension 2\n",
" \n",
" y: ndarray (m)\n",
" An array of labels as int (0/1).\n",
" \"\"\"\n",
" if fig == None:\n",
" fig, ax = plt.subplots(1, 1, figsize=(5,5))\n",
" reds = y == 0\n",
" blues = y == 1\n",
" ax.scatter(x[reds, 0], x[reds, 1], c=\"red\",\n",
" s=20, edgecolor='k')\n",
" ax.scatter(x[blues, 0], x[blues, 1], c=\"blue\",\n",
" s=20, edgecolor='k')\n",
" ax.set_xlabel(\"$x_1$\")\n",
" ax.set_ylabel(\"$x_2$\")\n",
"\n",
"\n",
"Xdata, ydata = circle(500)\n",
"plot_data(Xdata, ydata)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Define output labels as quantum state vectors"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def density_matrix(state):\n",
" \"\"\"Calculates the density matrix representation of a state.\n",
"\n",
" Args:\n",
" state (array[complex]): array representing a quantum state vector\n",
" Returns:\n",
" dm: (array[complex]): array representing the density matrix.\n",
" \"\"\"\n",
" return state*np.conj(state).T\n",
"\n",
"label_0 = [[1.0 + 0.j],\n",
" [0. + 0.j]]\n",
"\n",
"label_1 = [[0. + 0.j],\n",
" [1. + 0.j]]\n",
"\n",
"state_labels = [label_0, label_1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Make a simple classifier data reloading circuit"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"dev = pl.device('default.qubit', wires=1)\n",
"# Use your own pennylane-plugin to run on some particular backend\n",
"\n",
"@pl.qnode(dev)\n",
"def qcircuit(var, x=None, y=None):\n",
" \"\"\"A variational quantum circuit representing the Universal classifier.\n",
"\n",
" Args:\n",
" var (array[float]): array of variables\n",
" x (array[float]): single input vector\n",
" y (array[float]): single output state density matrix\n",
"\n",
" Returns:\n",
" float: fidelity between output state and input\n",
" \"\"\"\n",
" for v in var:\n",
" pl.Rot(*x, wires=0)\n",
" pl.Rot(*v, wires=0)\n",
" return pl.expval(pl.Hermitian(y, wires=[0]))\n",
"\n",
"\n",
"def fidelity(state1, state2):\n",
" \"\"\"\n",
" Calculates the fidelity between two state vectors\n",
"\n",
" Args:\n",
" state1 (array[float]): State vector representation\n",
" state2 (array[float]): State vector representation\n",
"\n",
" Returns:\n",
" float: fidelity between `state1` and `state2`\n",
" \"\"\"\n",
" return np.abs(np.dot(np.conj(state1), state2))\n",
"\n",
" \n",
"def cost(weights, x, y, state_labels=None):\n",
" \"\"\"Cost function to be minimized.\n",
"\n",
" Args:\n",
" weights (array[float]): array of weights\n",
" x (array[float]): 2-d array of input vectors.\n",
" y (array[float]): 1-d array of targets.\n",
" state_labels (array[float]): array of state representations for labels\n",
" \n",
" Returns:\n",
" float: loss value to be minimized\n",
" \"\"\"\n",
" # Compute prediction for each input in data batch\n",
" loss = 0.\n",
" dm_labels = [density_matrix(s) for s in state_labels]\n",
" for i in range(len(x)):\n",
" f = qcircuit(weights, x=x[i], y=dm_labels[y[i]])\n",
" loss = loss + (1 - f)\n",
" return loss/len(x)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"def test(weights, x, y, state_labels=None):\n",
" \"\"\"\n",
" Tests on a given set of data.\n",
" \n",
" Args:\n",
" weights (array[float]): array of weights\n",
" x (array[float]): 2-d array of input vectors.\n",
" y (array[float]): 1-d array of targets.\n",
" state_labels (array[float]): 1-d array of state representations for labels\n",
" \n",
" Returns:\n",
" predicted (array([int]): predicted labels for test data\n",
" output_states (array[float]): output quantum states from the circuit\n",
" \"\"\"\n",
" fidelity_values = []\n",
" output_states = []\n",
" dm_labels = [density_matrix(s) for s in state_labels]\n",
" for i in range(len(x)):\n",
" expectation = qcircuit(weights, x=x[i], y=dm_labels[y[i]])\n",
" output_states.append(dev._state)\n",
" predicted = predicted_labels(output_states, state_labels)\n",
" return predicted, output_states\n",
"\n",
"\n",
"def predicted_labels(states, state_labels=None):\n",
" \"\"\"\n",
" Computes the label of the predicted state by selecting the one\n",
" with maximum fidelity\n",
" \n",
" Args:\n",
" weights (array[float]): array of weights\n",
" x (array[float]): 2-d array of input vectors.\n",
" y (array[float]): 1-d array of targets.\n",
" state_labels (array[float]): 1-d array of state representations for labels\n",
" \n",
" Returns:\n",
" float: loss value to be minimized\n",
" \"\"\"\n",
" output_labels = [np.argmax([fidelity(s, label) for label in state_labels]) for s in states]\n",
" return np.array(output_labels)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def accuracy_score(y_true, y_pred):\n",
" \"\"\"Accuracy score.\n",
"\n",
" Args:\n",
" y_true (array[float]): 1-d array of targets.\n",
" y_predicted (array[float]): 1-d array of predictions\n",
" state_labels (array[float]): 1-d array of state representations for labels\n",
"\n",
" Returns:\n",
" score : float\n",
" The fraction of correctly classified samples.\n",
" \"\"\"\n",
" score = y_true == y_pred\n",
" return score.sum()/len(y_true)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def iterate_minibatches(inputs, targets, batch_size):\n",
" \"\"\"\n",
" A generator for batches of the input data\n",
" \n",
" Args:\n",
" inputs (array[float]): input data\n",
" targets (array[float]): targets\n",
" \n",
" Returns:\n",
" inputs (array[float]): one batch of input data of length `batch_size`\n",
" targets (array[float]): one batch of targets of length `batch_size`\n",
" \"\"\"\n",
" for start_idx in range(0, inputs.shape[0] - batch_size + 1, batch_size):\n",
" idxs = slice(start_idx, start_idx + batch_size)\n",
" yield inputs[idxs], targets[idxs]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Generate training and test data"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"num_training = 200\n",
"num_test = 500\n",
"\n",
"Xdata, y_train = circle(num_training)\n",
"X_train = np.hstack((Xdata, np.zeros((Xdata.shape[0], 1))))\n",
"\n",
"Xtest, y_test = circle(num_test)\n",
"X_test = np.hstack((Xtest, np.zeros((Xtest.shape[0], 1))))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Train and evaluate the classifier using gradient descent"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch: 0 | Cost: 0.509948 | Train accuracy 0.515000 | Test Accuracy : 0.508000\n",
"Epoch: 1 | Cost: 0.430845 | Train accuracy 0.605000 | Test Accuracy : 0.654000\n",
"Epoch: 2 | Cost: 0.353277 | Train accuracy 0.730000 | Test Accuracy : 0.748000\n",
"Epoch: 3 | Cost: 0.306970 | Train accuracy 0.765000 | Test Accuracy : 0.790000\n",
"Epoch: 4 | Cost: 0.295256 | Train accuracy 0.820000 | Test Accuracy : 0.808000\n",
"Epoch: 5 | Cost: 0.294993 | Train accuracy 0.805000 | Test Accuracy : 0.792000\n",
"Epoch: 6 | Cost: 0.293592 | Train accuracy 0.805000 | Test Accuracy : 0.790000\n",
"Epoch: 7 | Cost: 0.291759 | Train accuracy 0.815000 | Test Accuracy : 0.804000\n",
"Epoch: 8 | Cost: 0.290740 | Train accuracy 0.830000 | Test Accuracy : 0.808000\n",
"Epoch: 9 | Cost: 0.289939 | Train accuracy 0.820000 | Test Accuracy : 0.810000\n",
"Epoch: 10 | Cost: 0.289068 | Train accuracy 0.830000 | Test Accuracy : 0.808000\n"
]
}
],
"source": [
"num_layers = 3\n",
"learning_rate = 0.1\n",
"epochs = 10\n",
"batch_size = 20\n",
"\n",
"opt = AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999)\n",
"\n",
"# initialize random weights\n",
"weights = np.random.random(size=(num_layers, 3))\n",
"\n",
"predicted_train, states_train = test(weights, X_train, y_train, state_labels)\n",
"accuracy_train = accuracy_score(y_train, predicted_train)\n",
" \n",
"predicted_test, states_test = test(weights, X_test, y_test, state_labels)\n",
"accuracy_test = accuracy_score(y_test, predicted_test)\n",
"\n",
"loss = cost(weights, X_test, y_test, state_labels)\n",
"\n",
"print(\"Epoch: {:2d} | Cost: {:3f} | Train accuracy {:3f} | Test Accuracy : {:3f}\".format(0,\n",
" loss[0],\n",
" accuracy_train,\n",
" accuracy_test))\n",
"for it in range(epochs):\n",
" for Xbatch, ybatch in iterate_minibatches(X_train, y_train, batch_size=batch_size):\n",
" weights = opt.step(lambda v: cost(v, Xbatch, ybatch, state_labels), weights)\n",
" \n",
" predicted_train, states_train = test(weights, X_train, y_train, state_labels)\n",
" accuracy_train = accuracy_score(y_train, predicted_train)\n",
" loss = cost(weights, X_train, y_train, state_labels) \n",
"\n",
" predicted_test, states_test = test(weights, X_test, y_test, state_labels)\n",
" accuracy_test = accuracy_score(y_test, predicted_test)\n",
"\n",
" print(\"Epoch: {:2d} | Cost: {:3f} | Train accuracy {:3f} | Test Accuracy : {:3f}\".format(it + 1,\n",
" loss[0],\n",
" accuracy_train,\n",
" accuracy_test))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Results"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"==========================================================================\n",
"Number of training data points: 200; Number of test data points: 500\n",
"Total number of calls to circuit for training: 2000\n",
"Number of times model parameters were updated: 100\n",
"Cost: 0.289068 | Train accuracy 0.830000 | Test Accuracy : 0.808000\n",
"==========================================================================\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 576x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"print(\"==========================================================================\")\n",
"print(\"Number of training data points: {}; Number of test data points: {}\".format(len(X_train), len(X_test)))\n",
"print(\"Total number of calls to circuit for training: {}\".format(int(len(X_train)*epochs)))\n",
"print(\"Number of times model parameters were updated: {:2d}\".format(int(len(X_train)*epochs/batch_size)))\n",
"print(\"Cost: {:3f} | Train accuracy {:3f} | Test Accuracy : {:3f}\".format(loss[0],\n",
" accuracy_train,\n",
" accuracy_test))\n",
"print(\"==========================================================================\")\n",
"\n",
"fig, axes = plt.subplots(1, 2, figsize = (8, 4))\n",
"plot_data(X_test, predicted_test, fig, axes[0])\n",
"plot_data(X_test, y_test, fig, axes[1])\n",
"axes[0].set_title(\"Predictions\")\n",
"axes[1].set_title(\"True data\")\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# References\n",
"\n",
"[1] Pérez-Salinas, Adrián, et al. \"Data re-uploading for a universal quantum classifier.\" arXiv preprint arXiv:1907.02085 (2019)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}