{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# HW1 - Part 1\n",
"## CS-5891-01 Special Topics Deep Learning\n",
"## Ronald Picard\n",
"\n",
"In this notebook we will walk through how to perform linear regression using gradient descent. The goal of this example is to predict the profit of a new food truck given the population of its city. The input to our model is the population of a city, and the output is the predicted profit. Since this is a single-feature (2-D) linear regression model, the classic equation of a line (y = m * x + b) will be our model. \n",
"\n",
"To start, we need to import the required modules."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import numpy as np\n",
"import struct\n",
"from matplotlib import pyplot\n",
"import csv\n",
"import time"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we must set our path string to the location of the data file containing the features. (Please note that you must change this string to point to the food_truck_data.txt data file on your machine.) The data file is a csv-style matrix made up of 2 feature vectors (columns). Each entry in the first feature vector is the population of a particular city, whereas each entry in the second feature vector is the profit a food truck has received in that city. The two feature vectors are the same size, and entries at the same index in each vector belong to the same city."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"## path\n",
"path = 'C:/Users/computer/Desktop/git/deep-learning-jupyter-notebooks/1-linear-regression-using-gradient-descent/food_truck_data.txt'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we retrieve the data from the data file as follows. This imports the feature data into a feature matrix from which we can extract each feature vector."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"## retrieve data \n",
"data = np.genfromtxt(path, delimiter=',', dtype=np.float32)\n",
"population_vector = data[:,0]\n",
"profit_vector = data[:,1]\n",
"# print(population_vector)\n",
"# print(profit_vector)"
]
},
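{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we will normalize each of our feature vectors to have zero mean and unit standard deviation. This centers the values around 0 with a spread of about 1 (the values are not strictly bounded by -1 and 1). Putting both vectors on a comparable scale keeps the larger-valued one from dominating the gradient and lets gradient descent converge quickly with a single learning rate. We do this by using the formula: x(i) = (x(i) - Mean(x))/Std(x). The technique is known as feature scaling (more specifically, standardization). (Please note that this means if we utilize our model later on, the input data must be normalized with the same mean and standard deviation, and the predicted output must be denormalized, to produce an accurate prediction.)"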
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"## retrieve size\n",
"m = np.size(profit_vector)\n",
"\n",
"## find the mean\n",
"population_mean = np.mean(population_vector)\n",
"profit_mean = np.mean(profit_vector)\n",
"#print('pop mean: ' + str(population_mean) + ', prof mean: ' + str(profit_mean))\n",
"\n",
"## find the std\n",
"population_std = np.std(population_vector)\n",
"profit_std = np.std(profit_vector)\n",
"#print('pop std: ' + str(population_std) + ', prof std:' + str(profit_std))\n",
"\n",
"## standardize the data using equation x_j^i = (x_j^i - u_j)/s_j (zero mean, unit std)\n",
"## NumPy broadcasting subtracts the scalar mean from every entry, so no explicit mean vector is needed\n",
"normalized_population_vector = (population_vector - population_mean) / population_std\n",
"#print(normalized_population_vector)\n",
"normalized_profit_vector = (profit_vector - profit_mean) / profit_std\n",
"#print(normalized_profit_vector)\n",
"\n",
"## create feature matrix (a column of ones for the intercept, then the normalized population)\n",
"x = np.column_stack((np.ones(m), normalized_population_vector))\n",
"#print(x[:,0])\n",
"#print(x[:,1])\n",
"\n",
"## re-assignment\n",
"y = normalized_profit_vector\n",
"#print(y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we create our 2 X 1 parameter vector with values initialized to 1. Our parameter vector contains only two values because we are fitting a line to a single input feature. This means we need to find the intercept and slope values for the classic equation of a line h(x) = theta0 + theta1*x (this is analogous to our y = m * x + b equation, with theta0 = b and theta1 = m). (Please note that the initial value of 1 was chosen pseudo-arbitrarily. The cost function for linear regression is convex, so we could have just as easily chosen 0.5 and received (nearly) the same result.)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"## create parameter vector (floats, since gradient descent updates it with fractional steps)\n",
"theta = np.array([1.0, 1.0])\n",
"#print(theta)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that our model is set up and preprocessing is complete, we need to construct our gradient descent algorithm. This algorithm makes use of NumPy to perform linear algebra operations in a vectorized manner. The goal of our gradient descent algorithm is to minimize our cost function, which in this case is half the mean squared error. Calculating the gradient of the cost function, multiplying it by the learning rate (step size), and subtracting this from the parameter vector iteratively will lead us toward the minimum of our cost function and thus our ideal model parameters. \n",
"\n",
"There are three primary iterative steps to our gradient descent algorithm: \n",
"\n",
"1) Calculate the value of the cost function: j = (1/(2m)) * (X * Theta - Y)' * (X * Theta - Y).\n",
"\n",
"2) Calculate the value of the gradient of the cost function: dj_dTheta = (1/m) * (X' * X * Theta - X' * Y)\n",
"\n",
"3) Update the theta vector: Theta = Theta - alpha * dj_dTheta\n",
"\n",
"In the above steps: m = size of our feature vectors (the number of training examples), Theta = parameter vector, Y = output profit feature vector, X = input feature matrix, j = cost function value, dj_dTheta = gradient vector with respect to the Theta vector, alpha = learning rate hyper-parameter, and ' denotes the transpose.\n",
"\n",
"These steps repeat for the specified number of iterations. The final Theta vector provides the values of theta0 (intercept) and theta1 (slope) corresponding to our line equation.\n",
"\n",
"While our algorithm is running, we will collect the cost function values and the corresponding iteration numbers in two arrays. This allows us to plot the cost function value as a function of the number of iterations, which gives us the learning curve for our gradient descent algorithm. "
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"elapsed time: 0.0019948482513427734 s\n",
"final cost function value: 0.49029398087552617\n",
"iterations: 100\n",
"Normalized Parameters: theta0: 2.6499769928728125e-05, theta1: -0.13929690232082123\n"
]
}
],
"source": [
"# retrieve start time\n",
"start = time.time()\n",
"\n",
"## begin gradient descent\n",
"alpha = 0.1\n",
"iterations = 100\n",
"j_array = []\n",
"i_array = []\n",
"for i in range(iterations):\n",
"\n",
"    # calculate cost value\n",
"    hx = np.dot(x, theta) # M X 2 times 2 X 1 = M X 1 \n",
"    hx_minus_y = hx - y # M X 1 minus M X 1 = M X 1\n",
"    hx_minus_y_transpose = np.transpose(hx_minus_y) # (M X 1)^T = 1 X M\n",
"    j = np.dot(hx_minus_y_transpose, hx_minus_y) / (2*m) # 1 X M times M X 1 = scalar, scaled by 1/(2m)\n",
"    j_array.append(j)\n",
"    i_array.append(i)\n",
"\n",
"    # calculate gradient vector\n",
"    x_transpose_x = np.dot(np.transpose(x), x) # 2 X M times M X 2 = 2 X 2\n",
"    x_transpose_x_theta = np.dot(x_transpose_x, theta) # 2 X 2 times 2 X 1 = 2 X 1 \n",
"    x_transpose_y = np.dot(np.transpose(x), y) # 2 X M times M X 1 = 2 X 1\n",
"    dj_dtheta = (x_transpose_x_theta - x_transpose_y) / m # (2 X 1 minus 2 X 1) scaled by 1/m = 2 X 1\n",
"\n",
"    # update theta vector\n",
"    theta = theta - alpha * dj_dtheta # 2 X 1 - scalar * 2 X 1 = 2 X 1\n",
"\n",
"# retrieve end time\n",
"end = time.time()\n",
"\n",
"# log results (time.time() returns seconds, so the elapsed time is in seconds)\n",
"print('elapsed time: ' + str(end-start) + ' s')\n",
"print('final cost function value: ' + str(j))\n",
"print('iterations: ' + str(iterations))\n",
"print('Normalized Parameters: theta0: ' + str(theta[0]) + ', theta1: ' + str(theta[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After 100 iterations, the parameters came out to (approximately) the values below. (Please note that the saved output of the cell above shows a different theta1, so re-run the cell on your copy of the data file to confirm these numbers.)\n",
"\n",
"1) theta0 = 2.6461507394829433e-05\n",
"\n",
"2) theta1 = 0.8378775736547807\n",
"\n",
"Therefore, our normalized 2-D line equation becomes: \n",
"\n",
"Normalized Profit = 2.6461507394829433e-05 + 0.8378775736547807 * Normalized Population \n",
"\n",
"Finally we will plot the results. \n",
"\n",
"1) The first plot is the normalized line solution of the gradient descent algorithm. \n",
"\n",
"2) The second plot is the denormalized line solution of the gradient descent algorithm. \n",
"\n",
"3) The third plot is the learning curve that relates the cost function value to the number of iterations of the algorithm. As seen by this plot, the cost function converges rapidly in about 20 iterations at a 0.1 learning rate. "
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# plot the results\n",
"\n",
"pyplot.figure()\n",
"pyplot.scatter(normalized_population_vector, normalized_profit_vector) \n",
"normalized_xmin = -2\n",
"normalized_xmax = 2\n",
"normalized_h_x_min = theta[0] + np.multiply(theta[1], normalized_xmin)\n",
"normalized_h_x_max = theta[0] + np.multiply(theta[1], normalized_xmax)\n",
"pyplot.plot([normalized_xmin, normalized_xmax], [normalized_h_x_min, normalized_h_x_max], 'red')\n",
"pyplot.title('Normalized 2-D Linear Regression')\n",
"pyplot.xlabel('Population')\n",
"pyplot.ylabel('Profit')\n",
"pyplot.show()\n",
"\n",
"pyplot.figure()\n",
"pyplot.scatter(population_vector, profit_vector) \n",
"xmin = np.dot(normalized_xmin, population_std) + population_mean\n",
"xmax = np.dot(normalized_xmax, population_std) + population_mean\n",
"h_x_min = np.dot(normalized_h_x_min, profit_std) + profit_mean\n",
"h_x_max = np.dot(normalized_h_x_max, profit_std) + profit_mean\n",
"pyplot.plot([xmin, xmax], [h_x_min, h_x_max], 'red')\n",
"pyplot.title('Denormalized 2-D Linear Regression')\n",
"pyplot.xlabel('Population')\n",
"pyplot.ylabel('Profit')\n",
"pyplot.show()\n",
"\n",
"# plot learning curve\n",
"pyplot.figure()\n",
"pyplot.plot(i_array, j_array, 'red')\n",
"pyplot.title('Learning Curve')\n",
"pyplot.xlabel('Iterations')\n",
"pyplot.ylabel('Cost')\n",
"pyplot.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This concludes part 1."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}