{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The high-level driver for the logistic-regression (LR) algorithm. For a fixed number of steps (num_iters) it computes gradient steps that move the theta values (coefficients of the known factors) from the current estimate to a new one (new_theta), converging toward the \"optimum estimation\": the set of values that best represents the system under a linear-combination model."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def Logistic_Regression(X, Y, alpha, theta, num_iters):\n",
"    m = len(Y)\n",
"    for x in range(num_iters):\n",
"        # take one gradient-descent step toward the optimal theta\n",
"        theta = Gradient_Descent(X, Y, theta, m, alpha)\n",
"        if x % 100 == 0:\n",
"            # the cost function reports the fit of the current hypothesis at each logged iteration\n",
"            print('theta ', theta)\n",
"            print('cost is ', Cost_Function(X, Y, theta, m))\n",
"    Declare_Winner(theta)"
]
},
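{
"cell_type": "markdown",
"metadata": {},
"source": [
"The helpers `Gradient_Descent`, `Cost_Function`, `Hypothesis`, and `Declare_Winner` are defined elsewhere in the notebook. As a reference point, here is a minimal sketch of what the first three could look like for logistic regression; the signatures match the calls above, but the bodies are an assumption, not the original code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import math\n",
"\n",
"def Hypothesis(theta, x):\n",
"    # sigmoid of the linear combination theta . x (assumed helper)\n",
"    z = sum(t * xi for t, xi in zip(theta, x))\n",
"    return 1 / (1 + math.exp(-z))\n",
"\n",
"def Cost_Function(X, Y, theta, m):\n",
"    # average cross-entropy loss over the m training examples\n",
"    total = 0\n",
"    for i in range(m):\n",
"        h = Hypothesis(theta, X[i])\n",
"        total += Y[i] * math.log(h) + (1 - Y[i]) * math.log(1 - h)\n",
"    return -total / m\n",
"\n",
"def Gradient_Descent(X, Y, theta, m, alpha):\n",
"    # one batch step: theta_j <- theta_j - alpha/m * sum_i (h_i - y_i) * x_ij\n",
"    new_theta = []\n",
"    for j in range(len(theta)):\n",
"        grad = sum((Hypothesis(theta, X[i]) - Y[i]) * X[i][j] for i in range(m))\n",
"        new_theta.append(theta[j] - (alpha / m) * grad)\n",
"    return new_theta"
]
},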
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These are the initial guesses for theta, along with the learning rate (alpha) of the algorithm.\n",
"A learning rate that is too low will not close in on the optimal values within a reasonable number of iterations.\n",
"An alpha that is too high may overshoot the optimum or produce erratic, divergent guesses.\n",
"Each iteration improves model accuracy with diminishing returns, and each costs on the order of O(n) * |theta| work, where n is the dataset length."
]
},
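{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the alpha trade-off concrete, a hypothetical one-dimensional example (not part of the original experiment): minimizing f(w) = w^2 with gradient descent. A small alpha shrinks w steadily toward the minimum at 0, while an alpha above 1 makes each step overshoot so badly that the iterates diverge."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def descend(alpha, steps=50, w=1.0):\n",
"    # the gradient of f(w) = w**2 is 2*w\n",
"    for _ in range(steps):\n",
"        w = w - alpha * 2 * w\n",
"    return w\n",
"\n",
"# |w| shrinks toward 0 when 0 < alpha < 1; it blows up when alpha > 1\n",
"print(descend(0.1), descend(1.5))"
]
},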
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cost is 0.690657580726\n",
"theta [0.015808968977217012, 0.014030982200249273]\n",
"cost is 0.690657580726\n",
"cost is 0.690657580726\n",
"cost is 0.562528409195\n",
"theta [1.1258734026268615, 1.4264625581846324]\n",
"cost is 0.562528409195\n",
"cost is 0.562528409195\n",
"cost is 0.528439281756\n",
"theta [1.7026031775249526, 2.848326035342597]\n",
"cost is 0.528439281756\n",
"cost is 0.528439281756\n",
"cost is 0.516717865749\n",
"theta [2.0421060348686075, 4.272934539058638]\n",
"cost is 0.516717865749\n",
"cost is 0.516717865749\n",
"cost is 0.512012633692\n",
"theta [2.257624206713857, 5.6981422511820581]\n",
"cost is 0.512012633692\n",
"cost is 0.512012633692\n",
"cost is 0.509944033988\n",
"theta [2.4006753346573633, 7.1232898942713021]\n",
"cost is 0.509944033988\n",
"cost is 0.509944033988\n",
"cost is 0.508981901926\n",
"theta [2.4982963698142258, 8.5482017093169134]\n",
"cost is 0.508981901926\n",
"cost is 0.508981901926\n",
"cost is 0.508517769016\n",
"theta [2.5661259492605346, 9.9728620003137944]\n",
"cost is 0.508517769016\n",
"cost is 0.508517769016\n",
"cost is 0.50828833603\n",
"theta [2.6138281667277883, 11.397304265307881]\n",
"cost is 0.50828833603\n",
"cost is 0.50828833603\n",
"cost is 0.508173009598\n",
"theta [2.6476542015013051, 12.82157202310713]\n",
"cost is 0.508173009598\n",
"cost is 0.508173009598\n",
"Scikit won.. :(\n",
"Your score: 0.6363636363636364\n",
"Scikits score: 0.878787878788\n"
]
}
],
"source": [
"initial_theta = [0, 0]\n",
"alpha = 0.1\n",
"iterations = 1000\n",
"Logistic_Regression(X, Y, alpha, initial_theta, iterations)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}