@JingliSHI0206
Created October 1, 2020 04:00
ReserveEngineerofAI_UTAS.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "ReserveEngineerofAI_UTAS.ipynb",
"provenance": [],
"collapsed_sections": [],
"mount_file_id": "1biWFDiO4LVR6cSeRTrJxVv6i_4QgnpCD",
"authorship_tag": "ABX9TyOIJ6HSmyNkiVS3IXZCgW6I",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/JingliSHI0206/2eb561f358060017cac2628273e65421/reserveengineerofai_utas.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hZ40ug1qlcSZ"
},
"source": [
"<p><img alt=\"Colaboratory logo\" height=\"45px\" src=\"https://www.utas.edu.au/__data/assets/image/0004/1243606/utas-logo-int.png\" align=\"left\" hspace=\"10px\" vspace=\"0px\"></p>\n",
"\n",
"<h1>KIT719: Artificial Intelligence and Natural Language</h1>\n",
"\n",
"\n",
"# Semester Two: Reverse Engineering of AI Models\n",
"A simple demo to help understand the low-level theory of AI models, including the four steps of the training routine:\n",
"- Initialization\n",
"- Forward Propagation\n",
"- Backward Propagation\n",
"- Optimization"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SCwp7jvhmC8H"
},
"source": [
"## XOR Logic Gates Implementation using AI Models\n",
"A simple demo that uses a small neural network to implement the XOR logic gate.\n",
"\n",
"\n",
"## XOR Logic Gate\n",
"![](https://drive.google.com/uc?export=view&id=1KG3UssqUK8DLGPb7xK7Chm2ykMxmprMM)\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "eQrv_C6roGOd"
},
"source": [
"## AI Model\n",
"AI Models:\n",
"- Input Layer: 1\n",
"- Hidden Layer: 1\n",
"- Output Layer: 1\n",
"\n",
"A bias value allows you to shift the activation function to the left or right, which may be critical for successful learning.\n",
"\n",
"![](https://drive.google.com/uc?export=view&id=1zQjPgv1DIM6M5rR9bfO3hJLnQoFJdxMv)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yyfbfNO_q8mU"
},
"source": [
"# XOR Model\n",
"\n",
"Training the XOR model takes four steps.\n",
"\n",
"![](https://drive.google.com/uc?export=view&id=1Ai1xEqlV6BXYUg2DMOY7w222SySCbKvD)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KtAdfG_MqNHR"
},
"source": [
"# Import Libs\n",
"\n",
"NumPy is a Python library for working with arrays."
]
},
{
"cell_type": "code",
"metadata": {
"id": "NLpYESn8kwdV"
},
"source": [
"import numpy as np \n",
"#np.random.seed(0)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "FI9R9sJSqlnV"
},
"source": [
"# Step 1: Initialization\n",
"Initialize the input, weight, and bias values.\n",
"![](https://drive.google.com/uc?export=view&id=1LhyZAghcSSiNYuBkhGxbyyd_gC5zt67e)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EW1AJizT10Zk"
},
"source": [
"## 1.1 Initialize X (input data)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "5hJsScXurx2M",
"outputId": "395f1025-24ca-48ae-c543-9d945278249f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 119
}
},
"source": [
"#Input Data\n",
"X = np.array(\n",
" [\n",
" [0,0],\n",
" [0,1],\n",
" [1,0],\n",
" [1,1]\n",
" ]\n",
" )\n",
"print('X size: ', X.shape)\n",
"print('X data: \\n', X)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"X size: (4, 2)\n",
"X data: \n",
" [[0 0]\n",
" [0 1]\n",
" [1 0]\n",
" [1 1]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ColVcZZI18ha"
},
"source": [
"## 1.2 Initialize Biases"
]
},
{
"cell_type": "code",
"metadata": {
"id": "B2PeDew1uWPQ",
"outputId": "10314e93-5a0f-4f4e-d567-eded15a6fcb3",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"source": [
"# Bias Data\n",
"b0 = np.array(\n",
" [[1,1]]\n",
")\n",
"\n",
"b1 = -1\n",
"\n",
"print('b0 size: ', b0.shape)\n",
"print('b0 data: \\n', b0)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"b0 size: (1, 2)\n",
"b0 data: \n",
" [[1 1]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HmTYbF772CZr"
},
"source": [
"## 1.3 Initialize Weights"
]
},
{
"cell_type": "code",
"metadata": {
"id": "IEKW57REsRcC",
"outputId": "e0026f78-3373-4a22-87ce-91fda80722d3",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 102
}
},
"source": [
"# Weights Data\n",
"W1 = np.array(\n",
" [\n",
" [-1, 1],\n",
" [1, -1]\n",
" ]\n",
")\n",
"W2 = np.array(\n",
" [\n",
" [1], \n",
" [1]\n",
" ]\n",
")\n",
"\n",
"print('W1 = \\n', W1)\n",
"print('W2= \\n', W2.shape)\n"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"W1 = \n",
" [[-1 1]\n",
" [ 1 -1]]\n",
"W2= \n",
" (2, 1)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "m1EPKUsA2Hei"
},
"source": [
"## 1.4 Initialize y (output data)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "8sb40UGrr5e7",
"outputId": "b53a325d-5c1f-4dd1-f4a8-de3f29e9c071",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
}
},
"source": [
"y = np.array(\n",
" [\n",
" [0],\n",
" [1],\n",
" [1],\n",
" [0]\n",
" ]\n",
" )\n",
"y"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[0],\n",
" [1],\n",
" [1],\n",
" [0]])"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NNPt8xelxKiL"
},
"source": [
"# Step 2: Forward Propagation\n",
"\n",
"Compute the preactivation s and the activation f(s) = sigmoid(s).\n",
"\n",
"![](https://drive.google.com/uc?export=view&id=1DcCG42mRzcOCUFtEXfrTFfkwYQ1ezplQ)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IxEZWlkZxG1U"
},
"source": [
"## 2.1 Activation Function (Sigmoid)\n",
"\n",
"$$\n",
"f(x)={\\frac {1}{1+e^{-x}}}\n",
"$$"
]
},
{
"cell_type": "code",
"metadata": {
"id": "oUpt4dCuw6_e"
},
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"def plot(func,x, yaxis=(-1.4, 1.4)):\n",
" plt.ylim(yaxis)\n",
" plt.locator_params(nbins=5)\n",
" plt.xticks(fontsize = 14)\n",
" plt.yticks(fontsize = 14)\n",
" plt.axhline(lw=1, c='black')\n",
" plt.axvline(lw=1, c='black')\n",
" plt.grid(alpha=0.4, ls='-.')\n",
" plt.box(on=None)\n",
" plt.plot(x, func(x), c='r', lw=3)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "JyV-NnAlqjA1"
},
"source": [
"def sigmoid (x):\n",
" return 1/(1 + np.exp(-x))"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "MqeD6OSww-Z0",
"outputId": "ed371adb-c7ca-4848-ce75-9c5cfd3a05ab",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 269
}
},
"source": [
"data = np.arange(-5, 5, 0.01)\n",
"plot(sigmoid,data, yaxis=(-0.4, 1.4))"
],
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"tags": [],
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "p3F9bxxOx0P-"
},
"source": [
"# Step 3: Backward Propagation\n",
"\n",
"Compute the gradient of the loss/cost function.\n",
"\n",
"![](https://drive.google.com/uc?export=view&id=1YD2XQocDJuhRw3Se964NfOki38JgHG1l)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UZcyY_cDzCPl"
},
"source": [
"## 3.1 Derivative of Preactivation Function\n",
"\n",
"(1) Preactivation Function\n",
"$$\n",
"s = \\sum_{i}{w_{i} x_{i}} + b = XW + b\n",
"$$\n",
"\n",
"(2) Derivative of Preactivation\n",
"$$\n",
"\\frac{\\partial s}{\\partial w_{i}} = x_{i}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SxJJk8-cyk7t"
},
"source": [
"## 3.2 Derivative of Activation Function\n",
"\n",
"(1) Activation Function\n",
"$$\n",
"f(x) = {\\frac {1}{1+e^{-x}}}\n",
"$$\n",
"\n",
"(2) Derivative of Activation\n",
"$$\n",
"f'(x)=f(x)(1-f(x))\n",
" = {\\frac {1}{1+e^{-x}}} (1 - {\\frac {1}{1+e^{-x}}})\n",
"$$"
]
},
{
"cell_type": "code",
"metadata": {
"id": "A686VugBw8b8"
},
"source": [
"def sigmoid_derivative(f_x):\n",
" return f_x * (1 - f_x)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "pA0NZMz01CJE"
},
"source": [
"# Step 4: Optimization\n",
"\n",
"Update the weights and biases.\n",
"\n",
"$$\n",
"W_{t+1} = W_{t} - \\alpha \\frac{\\partial L}{\\partial W}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "a-KUEsaj8D2s"
},
"source": [
"## 4.1 Update Layer 2 Weights\n",
"\n",
"$$\n",
"W^{(2)}_{t+1} = W^{(2)}_{t} - \\alpha \\frac{\\partial L}{\\partial W^{(2)}} \\space\n",
"$$\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GSk5wkl2TiEE"
},
"source": [
"$$\n",
"\\frac{\\partial L}{\\partial W^{(2)}} = \\frac{\\partial L}{\\partial \\hat{y}} * \\frac{\\partial \\hat{y}}{\\partial W^{(2)}}\n",
"$$\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dplPzwPzTk9C"
},
"source": [
"$$\n",
"L(\\hat{y},y) = \\frac{1}{n} (y - \\hat{y})^2 \\space\\space\\space\\space, \\space\\space\\space\\space \\hat {y} = f(s^{(2)}) = f(W^{(2)}*h + b_1)\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qSMz2p1fTnti"
},
"source": [
"$$\n",
"\\frac{\\partial L}{\\partial \\hat{y}} = -\\frac{2}{n} *(y - \\hat{y}) \\space\\space\\space\\space, \\space\\space\\space\\space \\frac{\\partial \\hat{y}}{\\partial W^{(2)}} = f'(W^{(2)}*h + b_1) * h\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "a-fbX3dKTpk6"
},
"source": [
"$$\n",
"\\frac{\\partial L}{\\partial W^{(2)}} = -\\frac{2}{n} *(y - \\hat{y}) * f'(W^{(2)}*h + b_1) * h\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yPN61ROLCCZm"
},
"source": [
"## 4.2 Update Layer 1 Weights\n",
"\n",
"### How to update W1?"
]
},
{
"cell_type": "code",
"metadata": {
"id": "h5b2xMCplLnq",
"outputId": "7b977725-a0a3-47f8-fe7d-4d430edfbb66",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"#Training Model\n",
"\n",
"epochs = 10000\n",
"lr = 0.01\n",
"\n",
"for _ in range(epochs):\n",
"\t#Forward Propagation\n",
"\ts1 = np.dot(X,W1) + b0\n",
"\th = sigmoid(s1)\n",
"\n",
"\ts2 = np.dot(h,W2) + b1\n",
"\ty_pred = sigmoid(s2)\n",
"\n",
"\t#Backpropagation\n",
"\t#L = 1/2 * (y - y_pred)**2, so dL/dy_pred = (y_pred - y)\n",
"\tL_derivative = (y_pred - y) * sigmoid_derivative(y_pred)\n",
" \n",
"\tL_w_derivative = L_derivative.dot(W2.T) * sigmoid_derivative(h)\n",
"\t\n",
"\n",
"\t#Updating Weights and Biases\n",
"\tW2 = W2 - lr * h.T.dot(L_derivative)\n",
"\tb1 = b1 - lr * np.sum(L_derivative,axis=0,keepdims=True) \n",
"\tW1 = W1 - lr * X.T.dot(L_w_derivative) \n",
"\tb0 = b0 - lr * np.sum(L_w_derivative,axis=0,keepdims=True) \n",
" \n",
"print('Finish model training...')"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"Finish model training...\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "WSfPd271oU8r",
"outputId": "a3aba56f-7eec-4f66-eaff-43083252606d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 544
}
},
"source": [
"print('\\n' + '-'*50 +'\\n')\n",
"print(\"Final layer 1 weights W1:\\n \",end='')\n",
"print(W1)\n",
"print('\\n' + '-'*50 +'\\n')\n",
"print(\"Final layer 2 weights W2:\\n \",end='')\n",
"print(W2)\n",
"print('\\n' + '-'*50 +'\\n')\n",
"print(\"Final layer 1 bias b0:\\n \",end='')\n",
"print(b0)\n",
"print('\\n' + '-'*50 +'\\n')\n",
"print(\"Final layer 2 bias b1:\\n \",end='')\n",
"print(b1)\n",
"print('\\n' + '-'*50 +'\\n')\n",
"\n",
"print(\"\\nOutput from neural network after 10,000 epochs:\\n \",end='')\n",
"print(y_pred)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"\n",
"--------------------------------------------------\n",
"\n",
"Final layer 1 weights W1:\n",
" [[-0.75547202 1.29550082]\n",
" [ 1.29550082 -0.75547202]]\n",
"\n",
"--------------------------------------------------\n",
"\n",
"Final layer 2 weights W2:\n",
" [[2.84998061]\n",
" [2.84998061]]\n",
"\n",
"--------------------------------------------------\n",
"\n",
"Final layer 1 bias b0:\n",
" [[1.65862956 1.65862956]]\n",
"\n",
"--------------------------------------------------\n",
"\n",
"Final layer 1 bias b11:\n",
" [[1.27711062]]\n",
"\n",
"--------------------------------------------------\n",
"\n",
"\n",
"Output from neural network after 10,000 epochs:\n",
" [[0.99768327]\n",
" [0.99756126]\n",
" [0.99756126]\n",
" [0.9983539 ]]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yyPDhAuUvGbK"
},
"source": [
"# Exercises\n",
"\n",
"## 1. Write the formula for the layer 1 weight update.\n",
"\n",
"## 2. Compute predictions using a different loss/cost function.\n",
"\n",
"\n",
"## 3. Compare different bias values (plot the resulting equations)."
]
}
]
}
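
The chain-rule updates in Step 4 can be sanity-checked numerically. The block below is a minimal sketch, not the notebook's own code. It assumes the loss L = 1/2 * sum((y - y_pred)^2), which is the form in the training cell's comment, and it reuses the notebook's starting values for X, y, W1, b0, W2 and b1. The helper names `forward`, `loss`, `gradients` and `numeric_grad_W1` are hypothetical and exist only for this illustration. It compares the analytic layer 1 gradient with a central finite-difference estimate, then takes a few descent steps of the form W <- W - lr * dL/dW and records the loss.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, W1, b0, W2, b1):
    """Return hidden activations h and predictions y_pred."""
    h = sigmoid(X @ W1 + b0)
    y_pred = sigmoid(h @ W2 + b1)
    return h, y_pred

def loss(y, y_pred):
    # Assumed loss: L = 1/2 * sum((y - y_pred)^2)
    return 0.5 * np.sum((y - y_pred) ** 2)

def gradients(X, y, W1, b0, W2, b1):
    """Analytic gradients of the assumed loss via the chain rule."""
    h, y_pred = forward(X, W1, b0, W2, b1)
    delta2 = (y_pred - y) * y_pred * (1 - y_pred)  # dL/ds2, shape (4, 1)
    delta1 = (delta2 @ W2.T) * h * (1 - h)         # dL/ds1, shape (4, 2)
    return {
        "W1": X.T @ delta1,
        "b0": delta1.sum(axis=0, keepdims=True),
        "W2": h.T @ delta2,
        "b1": delta2.sum(axis=0, keepdims=True),
    }

def numeric_grad_W1(X, y, W1, b0, W2, b1, eps=1e-6):
    """Central finite differences of the loss with respect to W1."""
    grad = np.zeros_like(W1)
    for idx in np.ndindex(*W1.shape):
        W_plus, W_minus = W1.copy(), W1.copy()
        W_plus[idx] += eps
        W_minus[idx] -= eps
        l_plus = loss(y, forward(X, W_plus, b0, W2, b1)[1])
        l_minus = loss(y, forward(X, W_minus, b0, W2, b1)[1])
        grad[idx] = (l_plus - l_minus) / (2 * eps)
    return grad

# Same starting values as the notebook, stored as floats
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1 = np.array([[-1.0, 1.0], [1.0, -1.0]])
b0 = np.array([[1.0, 1.0]])
W2 = np.array([[1.0], [1.0]])
b1 = np.array([[-1.0]])

# Gradient check for layer 1 at the starting point
analytic_W1 = gradients(X, y, W1, b0, W2, b1)["W1"]
numeric_W1 = numeric_grad_W1(X, y, W1, b0, W2, b1)
print("max |analytic - numeric|:", np.abs(analytic_W1 - numeric_W1).max())

# A few descent steps: W <- W - lr * dL/dW
lr = 0.01
losses = []
for _ in range(1000):
    losses.append(loss(y, forward(X, W1, b0, W2, b1)[1]))
    g = gradients(X, y, W1, b0, W2, b1)
    W1, b0 = W1 - lr * g["W1"], b0 - lr * g["b0"]
    W2, b1 = W2 - lr * g["W2"], b1 - lr * g["b1"]
print("loss before:", losses[0], "after:", losses[-1])
```

If the analytic and numeric gradients agree closely, the sign and the factors in the update formula are likely right. A large gap usually points to a flipped sign or a missing term such as the input X or the hidden activation h.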