{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# HW1 - Part 2\n",
"## CS-5891-01 Special Topics Deep Learning\n",
"## Ronald Picard\n",
"\n",
"In this noteboook we will walk through how to perform linear regression using gradient decent. The goal for this example is to predict what the a good market price should be for your home given previous data for the price based on size (square footage) and the number of bedrooms. The input data to our model will the size of a home and the number of bedrooms, and the output data of our model will be a good market price. Since this is a 3-D linear regression model, the equation of 3-D line (z = b + a * x + b * y) will be our model. \n",
"\n",
"To start we need import some needed classes."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import numpy as np\n",
"import struct\n",
"from mpl_toolkits.mplot3d import Axes3D\n",
"import matplotlib.pyplot as pyplot\n",
"import csv\n",
"import time"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we must change our path string to the path of our data file containing the features. (Please note that you must change this string to point to the housing_price_data.txt data file on your machine.) The data file is a csv-style matrix made up of 3 feature vectors (columns). Each entry in the first feature vector is the size of a particular house, each entry in the second feature vector is the number of bedrooms of the particular house, and each entry in the third feature vector is the price of that house. Each feature vector is of the same size, and there is a one-to-one correspondence between each indexed vector entry of the first, second, and third feature vectors in the feature matrix."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"## path\n",
"path = 'C:/Users/computer/Desktop/git/deep-learning-jupyter-notebooks/1-linear-regression-using-gradient-descent/housing_price_data.txt'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we retrieve the data from the data file as follows. This imports the feature data into a feature matrix from which we can extract each feature vector."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"## retreive data \n",
"data = np.genfromtxt(path, delimiter=',', dtype=np.float64)\n",
"size_vector = data[:,0]\n",
"bedrooms_vector = data[:,1]\n",
"price_vector = data[:,2]\n",
"#print(size_vector)\n",
"#print(bedrooms_vector)\n",
"#print(price_vector)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we will normalize each of our feature vectors to (approximately) between -1 <= x <= 1. This will reduce the bias towards the larger feature and provide a more balanced model. We do this by using the using the formula: x(i) = (x(i) - Mean(x))/Std(x). The technique is known as feature scaling. (Please note that this means if we utilize our model later on, the input data must be normalized to produce an accurate prediction.)"
]
},
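{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a worked form of the formula above, the normalization can be written per column as follows. This is a sketch in which j indexes the column (size, bedrooms, or price), i indexes the house, and m is the number of houses; the symbols mu_j and sigma_j are notation introduced here for the column mean and standard deviation. It assumes NumPy's default population standard deviation (np.std with ddof=0), which is what the code below calls.\n",
"\n",
"$$\\mu_j = \\frac{1}{m}\\sum_{i=1}^{m} x_j^{(i)}, \\qquad \\sigma_j = \\sqrt{\\frac{1}{m}\\sum_{i=1}^{m}\\left(x_j^{(i)} - \\mu_j\\right)^2}, \\qquad \\tilde{x}_j^{(i)} = \\frac{x_j^{(i)} - \\mu_j}{\\sigma_j}$$\n",
"\n",
"After this transformation each column has mean 0 and standard deviation 1, so the size and bedrooms columns end up on comparable scales."
]
},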
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"## rectrieve size\n",
"m = np.size(price_vector)\n",
"\n",
"## find the mean\n",
"size_mean = np.mean(size_vector)\n",
"bedrooms_mean = np.mean(bedrooms_vector)\n",
"price_mean = np.mean(price_vector)\n",
"#print('size mean: ' + str(size_mean) + ', bedrooms mean: ' + str(bedrooms_mean) + ', price: ' + str(price_mean))\n",
"\n",
"## find the std\n",
"size_std = np.std(size_vector)\n",
"bedrooms_std = np.std(bedrooms_vector)\n",
"price_std = np.std(price_vector)\n",
"#print('size std: ' + str(size_std) + ', bedrooms std:' + str(bedrooms_std) + ', price std: ' + str(price_std))\n",
"\n",
"## normalize the data to between -1 <= x_j <= 1 using equation x_j^i = (x_j^i - u_j)/s_j \n",
"size_mean_vector = np.multiply(size_mean, np.ones(np.size(size_vector)))\n",
"#print(size_mean_vector)\n",
"normalized_size_vector = np.dot(1/size_std, size_vector-size_mean_vector)\n",
"#print(normalized_size_vector)\n",
"bedrooms_mean_vector = np.multiply(bedrooms_mean, np.ones(np.size(bedrooms_vector)))\n",
"#print(bedrooms_mean_vector)\n",
"normalized_bedrooms_vector = np.dot(1/bedrooms_std, bedrooms_vector - bedrooms_mean_vector)\n",
"#print(normalized_bedrooms_vector)\n",
"price_mean_vector = np.multiply(price_mean, np.ones(np.size(price_vector)))\n",
"#print(price_mean_vector)\n",
"normalized_price_vector = np.dot(1/price_std, price_vector - price_mean_vector)\n",
"#print(normalized_price_vector)\n",
"\n",
"\n",
"## create feature matrix\n",
"x = np.column_stack((np.ones(m), normalized_size_vector, normalized_bedrooms_vector))\n",
"#print(x[:,0])\n",
"#print(x[:,1])\n",
"#print(x[:,2])\n",
"\n",
"## re-assignment\n",
"y = normalized_price_vector\n",
"#print(y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we create our 3 X 1 parameter vector with values initialized to 1. The reason that our parameter vector contains only three values is because we are trying to perform a 3-D linear regression. This means we need to find the parameter values for the equation of a 3-D line h(x) = theta0 + theta1 * x theta2 * x (this is analagous to our z = d + a * x + b * y equation). (Please note that the initial value of 1 was chosen pesudo-arbitrarly. We could have just as easily chosen 0.5 and recieved (nearly) the same result for a linear regression model.)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"## create parameter vector\n",
"theta = np.array([1, 1, 1])\n",
"#print(theta)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that our model is set up and preprocessing is complete we need contruct our gradient descent algrorithm. This algorithm makes use of NumPy to perform linear alebra operations in a vectorized manner. The goal of our gradient decent algorithm is to minimize our cost function; which in this case is the least means squared. Calculating cost function gradient, multiplying it by the learning rate (step size), and subtracting this from the parameter vector iteratively will eventually lead us to the minima of our cost function and thus the ideal model parameters.\n",
"\n",
"There are three primary interative steps to our gradient decsent algorithm: \n",
"\n",
"1) Calculate the value of the cost function: j = (1/(2m)) * (X * Theta - Y)' * (X * Theta - Y). \n",
"\n",
"2) Calculate the value of the gradient of the cost function: dj_dTheta = (1/m) * (X' * X * Theta - X' * Y) \n",
"\n",
"3) Update the theta vector: Theta = Theta - alpha * dj_dTheta\n",
"\n",
"In the above steps: m = size of our feature vectors, Theta = parameter vector, Y = output profit feature vector, X = input feature matrix, j = cost function value, dj_dTheata = gradient vector with respect to the Theta vector, alpha = learning rate hyper-paramerter\n",
"\n",
"These steps repeat for the specified number of iterations. The final result of the Theta vector provides the values of theta0, theta1, and theta2 corresponding to our line equation.\n",
"\n",
"While our algorithm is running we will collect the cost function values and the corresponding interation # in two arrays that will allow us to plot the cost function value as a function of the number iterations; which will allow us see a plot of the learning curve for our gradient descent algorithm. "
]
},
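{
"cell_type": "markdown",
"metadata": {},
"source": [
"The gradient in step 2 can be derived from the cost in step 1. The following is a short sketch of that derivation, writing the transpose (the ' in the steps above) as a superscript T; the symbols J and nabla are notation introduced here for j and dj_dTheta.\n",
"\n",
"$$\\begin{aligned}\n",
"J(\\Theta) &= \\frac{1}{2m}(X\\Theta - Y)^T (X\\Theta - Y) \\\\\n",
"&= \\frac{1}{2m}\\left(\\Theta^T X^T X \\Theta - 2\\,\\Theta^T X^T Y + Y^T Y\\right) \\\\\n",
"\\nabla_\\Theta J &= \\frac{1}{2m}\\left(2 X^T X \\Theta - 2 X^T Y\\right) = \\frac{1}{m}\\left(X^T X \\Theta - X^T Y\\right)\n",
"\\end{aligned}$$\n",
"\n",
"Setting this gradient to zero gives X' * X * Theta = X' * Y, the normal equations of least squares. When X' * X is invertible, gradient descent with a sufficiently small learning rate approaches that same solution."
]
},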
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"elapsed time: 0.0019927024841308594 ms\n",
"final cost function value: 0.008769467118571728\n",
"interations: 100\n",
"Normalized Parameters: theta0: 2.6561398887836924e-05, theta1: 0.2583387777030867 , theta3: 0.7367893894351465\n"
]
}
],
"source": [
"# retrieve start time\n",
"start = time.time()\n",
"\n",
"## begin gradiant descent\n",
"alpha = 0.1\n",
"interations = 100\n",
"j_array = []\n",
"i_array = []\n",
"for i in range(interations):\n",
" \n",
" # calculate cost value\n",
" hx = np.dot(x, theta) # M X 3 times 3 X 1 = M X 1 \n",
" hx_minus_y = hx - y # M X 1 minus M X 1 = M X 1\n",
" hx_minus_y_transpose = np.transpose(hx_minus_y) # (M X 1)^T = 1 X M\n",
" j = np.dot(1/(2*m), np.dot(hx_minus_y_transpose, hx_minus_y)) # scalar * 1 X M times M X 1 = scalar\n",
" j_array.append(j)\n",
" i_array.append(i)\n",
"\n",
" # calculate gradient vector\n",
" x_transpose_x = np.dot(np.transpose(x), x) # 3 X M times M X 3 = 3 X 3\n",
" x_transpose_x_theta = np.dot(x_transpose_x, theta) # scalar * 3 X 1 = 3 X 1 \n",
" x_transpose_y = np.dot(np.transpose(x), y) # 3 X M times M X 1 = 3 X 1\n",
" dj_dtheta = np.dot(1/m, x_transpose_x_theta - x_transpose_y) # 3 X 1 minus 3 X 1 = 3 X 1\n",
"\n",
" # update theta vector\n",
" theta = theta - np.dot(alpha, dj_dtheta) # 3 X 1 - scalar * 3 X 1 = 3 X 1\n",
"\n",
" #input(\"Press Enter to continue...\")\n",
"\n",
"# retrieve end time\n",
"end = time.time()\n",
"\n",
"# logger data\n",
"print('elapsed time: ' + str(end-start) + ' ms')\n",
"print('final cost function value: ' + str(j))\n",
"print('interations: ' + str(interations))\n",
"print('Normalized Parameters: theta0: ' + str(theta[0]) + ', theta1: ' + str(theta[1]), ', theta3: ' + str(theta[2]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As printed above, the parameters after 100 iterations came out to (approximatly) the following as shown above:\n",
"\n",
"1) theta0 = 2.6561398887530704e-05\n",
"\n",
"2) theta1 = 0.8795566151371863\n",
"\n",
"3) theta2 = -0.04796939659432394\n",
"\n",
"Therefore, our 3-D line equation becomes: \n",
"\n",
"Price = 2.6561398887530704e-05 + 0.8795566151371863 * Size -0.04796939659432394 * Bedrooms \n",
"\n",
"Finally we will plot the results. \n",
"\n",
"1) The first plot is normalized line solution of the gradient decent algorithm. \n",
"\n",
"2) The second plot is denormalized line solution of the gradient decent algorithm. \n",
"\n",
"3) The third plot is the learning curve that relates the cost function value to the number of iterations of the algorithm. As seen by this plot, the cost function converges rapidly in about 20 iterations at a 0.1 leraning rate. "
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# plot results\n",
"\n",
"# plot normalized linear regression\n",
"fig = pyplot.figure()\n",
"ax = fig.add_subplot(111, projection='3d')\n",
"ax.scatter3D(normalized_size_vector, normalized_bedrooms_vector, normalized_price_vector, c='b', marker='o')\n",
"normalized_xmin = -3\n",
"normalized_xmax = 3\n",
"normalized_h_x_min = theta[0] + np.multiply(theta[1], normalized_xmin) + np.multiply(theta[2], normalized_xmin)\n",
"normalized_h_x_max = theta[0] + np.multiply(theta[1], normalized_xmax) + np.multiply(theta[2], normalized_xmax)\n",
"ax.plot3D([normalized_xmin, normalized_h_x_max], [normalized_xmin, normalized_h_x_max], [normalized_h_x_min, normalized_h_x_max], 'red')\n",
"ax.set_title('Normalized 3-D Linear Regression')\n",
"ax.set_xlabel('Size')\n",
"ax.set_ylabel('Bedrooms')\n",
"ax.set_zlabel('Price')\n",
"pyplot.show()\n",
"\n",
"# plot denormalized linear regression\n",
"xmin_size = np.dot(normalized_xmin, size_std) + size_mean\n",
"xmax_size = np.dot(normalized_xmax, size_std) + size_mean\n",
"xmin_bedrooms = np.dot(normalized_xmin, bedrooms_std) + bedrooms_mean\n",
"xmax_bedrooms = np.dot(normalized_xmax, bedrooms_std) + bedrooms_mean\n",
"h_x_min = np.dot(normalized_h_x_min, price_std) + price_mean\n",
"h_x_max = np.dot(normalized_h_x_max, price_std) + price_mean\n",
"fig2 = pyplot.figure()\n",
"ax2 = fig2.add_subplot(111, projection='3d')\n",
"ax2.scatter3D(size_vector, bedrooms_vector, price_vector, c='b', marker='o')\n",
"ax2.plot3D([xmin_size, xmax_size], [xmin_bedrooms, xmax_bedrooms], [h_x_min, h_x_max], 'red')\n",
"ax2.set_title('Denormalized 3-D Linear Regression')\n",
"ax2.set_xlabel('Size in ft^2')\n",
"ax2.set_ylabel('Bedrooms')\n",
"ax2.set_zlabel('Price')\n",
"pyplot.show()\n",
"\n",
"# plot learning curve\n",
"pyplot.figure()\n",
"pyplot.plot(i_array, j_array, 'red')\n",
"pyplot.title('Learning Curve')\n",
"pyplot.xlabel('Iterations')\n",
"pyplot.ylabel('Cost')\n",
"pyplot.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I currently live in an apartment; however, for illustration we will test this model on my apartment as if it were a house. The features of my apartment are as follows:\n",
"\n",
"Size (ft^2): 925\n",
"Number of Bedrooms: 2\n",
"\n",
"Now we will use our model to predict a good market price given the features of my house. "
]
},
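{
"cell_type": "markdown",
"metadata": {},
"source": [
"The prediction in the next cell amounts to normalizing the inputs, applying the model, and then undoing the price normalization. As a sketch, with mu and sigma denoting the mean and standard deviation of each training column (notation introduced here for size_mean, size_std, and so on):\n",
"\n",
"$$\\widehat{\\text{price}} = \\mu_{\\text{price}} + \\sigma_{\\text{price}}\\left(\\theta_0 + \\theta_1\\,\\frac{\\text{size} - \\mu_{\\text{size}}}{\\sigma_{\\text{size}}} + \\theta_2\\,\\frac{\\text{bedrooms} - \\mu_{\\text{bedrooms}}}{\\sigma_{\\text{bedrooms}}}\\right)$$\n",
"\n",
"The means and standard deviations must be the ones computed from the training data, not recomputed from the new input."
]
},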
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Nomalized Good Market Price: $-1.2963059031963535\n",
"Good Market Price: $166000.26594221333\n"
]
}
],
"source": [
"# features\n",
"size_feature_value = 925\n",
"bedrooms_feature_value = 2\n",
"\n",
"# perform normalization of the features\n",
"normalized_size_feature_value = np.dot(1/size_std, size_feature_value - size_mean)\n",
"normalized_bedrooms_feature_value = np.dot(1/bedrooms_std, bedrooms_feature_value - bedrooms_mean)\n",
"\n",
"# normalized price\n",
"normalized_price = theta[0] + theta[1]*normalized_size_feature_value + theta[2]*normalized_bedrooms_feature_value\n",
"\n",
"# denormalize the price\n",
"price = np.dot(normalized_price, price_std) + price_mean\n",
"\n",
"print('Nomalized Good Market Price: $' + str(normalized_price))\n",
"print('Good Market Price: $' + str(price))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As illustrated, a good market price for my house based on this model is approximately $166000.27. This looks correct based on our denormalized linear regression plot shown above."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This concludes part 2."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}