@hotchpotch
Created August 20, 2021 23:59
基本統計学_3章_練習問題.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": " 基本統計学_3章_練習問題.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true,
"authorship_tag": "ABX9TyNFv8AZzEO2nEosD6zhjTnY",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/hotchpotch/87f5bdcbcf96cd8b0f8dab375d0e155c/-_3-_.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NCytvwwbeczk"
},
"source": [
"# 基礎統計学, 4th Edition: Chapter 3 Exercises"
]
},
{
"cell_type": "code",
"metadata": {
"id": "yTvtDkn99_Hj"
},
"source": [
"from __future__ import annotations\n",
"\n",
"!pip install -q japanize-matplotlib\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import japanize_matplotlib"
],
"execution_count": 39,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "cYnsDX_0Hu1d"
},
"source": [
"# 1.\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "l3dthcj6-Y9J"
},
"source": [
"\n",
"x = [1.2, 0.7, 1.5, 1.8, 0.5, 3.4, 1.0, 3.0, 2.8, 2.5]\n",
"y = [2.7, 2.4, 2.7, 3.3, 1.1, 5.8, 2.2, 4.2, 4.4, 3.8]\n",
"\n"
],
"execution_count": 40,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 279
},
"id": "z27FslP4HmSL",
"outputId": "7e638f62-386b-48e9-ed86-bfbcd842b25f"
},
"source": [
"#@title a) plot\n",
"\n",
"def plot(x, y, lines: None | list[tuple[float, float]] = None, xlabel=\"x\", ylabel=\"y\"):\n",
" \"\"\"Scatter-plot x against y; if lines is given, draw a gray line through its (x, y) points.\"\"\"\n",
" plt.xlabel(xlabel)\n",
" plt.ylabel(ylabel)\n",
" plt.scatter(x=x, y=y, marker=\"x\")\n",
" if lines:\n",
"  plt.plot(*zip(*lines), color=\"gray\")\n",
" plt.grid()\n",
" return plt\n",
"\n",
"plot(x, y, xlabel=\"Advertising cost\", ylabel=\"Profit\").show()\n"
],
"execution_count": 41,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 294
},
"id": "N-yPmOltHysL",
"outputId": "564bbc80-19e0-4462-a210-fa848a8e6408"
},
"source": [
"#@title b) Linear regression and plot\n",
"\n",
"def linear_regression(x, y) -> tuple[float, float]:\n",
" \"\"\"\n",
" Return a and b of the linear regression y = ax + b.\n",
" \"\"\"\n",
" assert len(x) == len(y)\n",
" n = len(x)\n",
" x = np.array(x)\n",
" y = np.array(y)\n",
" total_x = x.sum()\n",
" total_y = y.sum()\n",
" total_x_square = (x ** 2).sum()\n",
" total_y_square = (y ** 2).sum()\n",
" total_xy = (x * y).sum()\n",
" # Solve the normal equations; the unknowns come back as (intercept b, slope a)\n",
" (b, a) = np.linalg.solve([\n",
" [n, total_x], [total_x, total_x_square]],\n",
" [total_y, total_xy]\n",
" )\n",
" return (a, b)\n",
"\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4}x + {b:0.4}\")\n",
"\n",
"predictor_1 = lambda x: a*x + b\n",
"lines = [(i, predictor_1(i)) for i in [0, 4]]\n",
"plot(x, y, xlabel=\"Advertising cost\", ylabel=\"Profit\", lines=lines).show()\n",
"\n"
],
"execution_count": 42,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 1.249x + 0.9627\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LuqjPX_MM78N",
"outputId": "99d72a7f-137a-4e9f-eb11-20be14c8e0ff"
},
"source": [
"#@title b) Estimated profit when x=2\n",
"predictor_1(2)\n"
],
"execution_count": 43,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"3.459763313609468"
]
},
"metadata": {},
"execution_count": 43
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pXEwI1vfOcwE"
},
"source": [
"# 2."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "0Eznl8ZCOcY5",
"outputId": "c5ba48c8-a27a-4e54-a196-66d331264872"
},
"source": [
"def r2_score(x, y, ddof=0) -> float:\n",
" \"\"\"\n",
" Compute the coefficient of determination R^2 for x and y.\n",
" ddof=1 uses the unbiased variance (ddof cancels in the ratio, so R^2 itself is unchanged).\n",
" \"\"\"\n",
" (a, b) = linear_regression(x, y)\n",
" x = np.array(x)\n",
" y = np.array(y)\n",
" n = len(x)\n",
" \n",
" y_mean = sum(y) / n # y.mean() would give the same value\n",
" y_variance = ((y - y_mean) ** 2).sum() / (n-ddof)\n",
" y_std = np.sqrt(y_variance)\n",
"\n",
" s_y_x2 = ((y - (a*x + b)) ** 2).sum() / (n-ddof)\n",
" r_2 = 1 - (s_y_x2 / y_variance)\n",
" return r_2\n",
"\n",
"\n",
"x = [1,4,10,2,5,6,8,1]\n",
"y = [64,35,11,47,27,36,30,59]\n",
"\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4}x + {b:0.4}\")\n",
"r2 = r2_score(x, y)\n",
"print(f\"Coefficient of determination R2 = {r2:0.4}\")"
],
"execution_count": 44,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = -4.891x + 61.25\n",
"Coefficient of determination R2 = 0.8555\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HAwP1G-DV0fn"
},
"source": [
"# 3."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "QcJBvBKDV1u8",
"outputId": "3229c490-dc13-46f9-f136-9cce4c4a396f"
},
"source": [
"x = [1,2,3,4,4,5,5,6,6,7]\n",
"y = [4,5,6,5,6,8,7,10,11,10]\n",
"\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4}x + {b:0.4}\")\n",
"r2 = r2_score(x, y)\n",
"print(f\"Coefficient of determination R2 = {r2:0.4}\")\n",
"print(f\"Predicted storable months for 5 days: {a*5 + b:0.4}\")"
],
"execution_count": 45,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 1.165x + 2.19\n",
"Coefficient of determination R2 = 0.813\n",
"Predicted storable months for 5 days: 8.016\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ObZYXZc5XYag"
},
"source": [
"# 4."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 278
},
"id": "jn_PhtqIXaSH",
"outputId": "0ce59364-5742-4fc8-fe05-f16f569d8e94"
},
"source": [
"#@title a) plot\n",
"x = [46,30,34,52,38,44,40,45,34,60]\n",
"y = [10,7,9,13,8,12,11,11,7,14]\n",
"\n",
"plot(x,y, xlabel=\"Humidity\", ylabel=\"Moisture content\").show()\n"
],
"execution_count": 46,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 295
},
"id": "ZSZw-ujGZUtA",
"outputId": "b2e70587-9b39-4d4c-b0ff-500fc8ed3157"
},
"source": [
"#@title b) Regression line + plot\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4}x + {b:0.4}\")\n",
"\n",
"predictor_4 = lambda x: a*x + b\n",
"lines = [(i, predictor_4(i)) for i in [28, 62]]\n",
"plot(x,y, xlabel=\"Humidity\", ylabel=\"Moisture content\", lines=lines).show()\n"
],
"execution_count": 47,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 0.2451x + -0.1689\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xbFzjM1hag6E",
"outputId": "14adc2fc-cdb3-4f52-e0d7-6db7e1f65920"
},
"source": [
"#@title c) When x=50\n",
"\n",
"print(f\"{predictor_4(50):0.4}\")\n"
],
"execution_count": 48,
"outputs": [
{
"output_type": "stream",
"text": [
"12.09\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "v5KFmEIcbiBh"
},
"source": [
"# 5."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 277
},
"id": "emdJ1cBXbjb4",
"outputId": "55b20e06-9627-4c0d-a08a-9fe6815a7fa4"
},
"source": [
"#@title a) plot\n",
"\n",
"x = [30,25,50,38,20,70,35,24,60,45]\n",
"y = [80,80,45,70,96,20,50,90,25,50]\n",
"\n",
"plot(x,y, xlabel=\"Time\", ylabel=\"Score\").show()\n",
"\n"
],
"execution_count": 49,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 297
},
"id": "nk23DA6wb5ba",
"outputId": "5c3493c2-0d67-4f28-f803-560de1336a72"
},
"source": [
"#@title b) Regression line + plot\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4}x + {b:0.4}\")\n",
"\n",
"predictor_5 = lambda x: a*x + b\n",
"lines = [(i, predictor_5(i)) for i in [18, 72]]\n",
"plot(x,y, xlabel=\"Time\", ylabel=\"Score\", lines=lines).show()\n"
],
"execution_count": 50,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = -1.548x + 122.1\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "3nX2G_3Jc2UJ",
"outputId": "2bd5beca-c0d5-4303-afd4-195d8a12122d"
},
"source": [
"#@title c) Coefficient of determination R2\n",
"\n",
"r2 = r2_score(x, y)\n",
"print(f\"Coefficient of determination R2 = {r2:0.4}\")"
],
"execution_count": 51,
"outputs": [
{
"output_type": "stream",
"text": [
"Coefficient of determination R2 = 0.9198\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oDey9sHDc_G_"
},
"source": [
"# 6.\n"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "B91gG_9sdLqC",
"outputId": "560af9d5-6b6a-4759-9769-7b063f953ae0"
},
"source": [
"#@title a) Regression line\n",
"\n",
"x = [19,15,9,10,11,19]\n",
"y = [31.7,32.3,8.5,14.3,14.0,17.8]\n",
"\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4}x + {b:0.4}\")"
],
"execution_count": 52,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 1.566x + -1.891\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "M3Cmtyocdx7x",
"outputId": "7ca53df3-0b0c-4164-d75f-1d95300b56ab"
},
"source": [
"#@title b) x=12\n",
"\n",
"print(f\"{a*12 + b:0.4f}\")\n"
],
"execution_count": 53,
"outputs": [
{
"output_type": "stream",
"text": [
"16.8964\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "v8ZOBjPoeVV9"
},
"source": [
"# 7.\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "WZQMeI-YeyxB",
"outputId": "c6f3c948-5c9a-45bb-e537-e8c9a3b84f5a"
},
"source": [
"x = [262,247,298,367,75,194,123]\n",
"y = [14300,10200,16840,12050,8240,3650,2860]\n",
"\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4f}x + {b:0.4f}\")\n",
"r2 = r2_score(x, y)\n",
"print(f\"Coefficient of determination R2 = {r2:0.4}\")"
],
"execution_count": 54,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 34.9784x + 1909.1162\n",
"Coefficient of determination R2 = 0.4582\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 277
},
"id": "4NL9xTxHfHtR",
"outputId": "347ec147-3ed8-4b77-ce81-e0357c890ea5"
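},
"source": [
"#@title Plot for a poor coefficient of determination\n",
"# Not part of the exercise, but R2 was poor, so plot the data anyway.\n",
"# As expected, when the coefficient of determination is poor, the predictions fit poorly too.\n",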
"predictor_7 = lambda x: a*x + b\n",
"lines = [(i, predictor_7(i)) for i in [0, 400]]\n",
"plot(x,y,lines=lines).show()\n"
],
"execution_count": 55,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZAYBWORSgaiJ"
},
"source": [
"# 8."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 311
},
"id": "93ersyzTgYOg",
"outputId": "d9a8894a-55ac-4380-f3f7-bf213f6f7fdd"
},
"source": [
"x = [6,6.25,6.5,6.75,7,7.25,7.5,7.75,8]\n",
"y = [20,26,35,25,47,30,35,80,65]\n",
"\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4f}x + {b:0.4f}\")\n",
"r2 = r2_score(x, y)\n",
"print(f\"Coefficient of determination R2 = {r2:0.4}\")\n",
"\n",
"predictor_8 = lambda x: a*x + b\n",
"lines = [(i, predictor_8(i)) for i in [5.75, 8.25]]\n",
"plot(x,y,lines=lines).show()"
],
"execution_count": 56,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 23.1333x + -121.6000\n",
"Coefficient of determination R2 = 0.6186\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ytjkrr9MhhEF"
},
"source": [
"# 9."
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "qkrRGp0yguNq",
"outputId": "3e61676e-6e91-4b33-9162-3cce47d360bf"
},
"source": [
"#@title a)\n",
"x = [1,2,3,4,5]\n",
"y = [18,9,13,18,27]\n",
"(a, b) = linear_regression(x, y)\n",
"print(f\"Regression line y = {a:0.4f}x + {b:0.4f}\")"
],
"execution_count": 57,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 2.7000x + 8.9000\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "NqvyfA8whfBJ",
"outputId": "6b9c4733-616a-46a5-83e0-cffb5ba732ce"
},
"source": [
"#@title b)\n",
"(c, d) = linear_regression(y, x)\n",
"print(f\"Regression line x = {c:0.4f}y + {d:0.4f}\")"
],
"execution_count": 58,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line x = 0.1484y + 0.4780\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 264
},
"id": "jczghZQqiCla",
"outputId": "898a350b-2880-4c4f-ccef-3de48bc21ded"
},
"source": [
"#@title c)\n",
"\n",
"predictor_9_x = lambda x: a*x + b\n",
"predictor_9_y = lambda y: c*y + d\n",
"\n",
"lines_x = [(i, predictor_9_x(i)) for i in [0, 5]]\n",
"lines_y = [(i, predictor_9_y(i)) for i in [0, 30]]\n",
"\n",
"plt.plot(*zip(*lines_x), color=\"blue\")\n",
"(lines_y_y, lines_y_x) = zip(*lines_y)\n",
"plt.plot(lines_y_x, lines_y_y, color=\"green\")\n",
"plt.show()\n"
],
"execution_count": 59,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LBA_0MwQTG4F"
},
"source": [
"# 10. Multiple regression"
]
},
{
"cell_type": "code",
"metadata": {
"id": "eTMpbmLjTI65"
},
"source": [
"def multiple_regression(x, y, z) -> tuple[float, float, float]:\n",
" \"\"\"\n",
" Return a, b, c of the multiple regression y = b + ax + cz; b is the intercept.\n",
" \"\"\"\n",
" assert len(x) == len(y) == len(z)\n",
" n = len(x)\n",
" x = np.array(x)\n",
" y = np.array(y)\n",
" z = np.array(z)\n",
" total_x = x.sum()\n",
" total_y = y.sum()\n",
" total_z = z.sum()\n",
" total_x_square = (x ** 2).sum()\n",
" total_y_square = (y ** 2).sum()\n",
" total_z_square = (z ** 2).sum()\n",
" total_xy = (x * y).sum()\n",
" total_xz = (x * z).sum()\n",
" total_zy = (z * y).sum()\n",
"\n",
" (b, a, c) = np.linalg.solve([\n",
" [n, total_x, total_z],\n",
" [total_x, total_x_square, total_xz],\n",
" [total_z, total_xz, total_z_square],\n",
" ],\n",
" [\n",
" total_y, total_xy, total_zy\n",
" ]\n",
" )\n",
" return (a, b, c)\n",
"\n"
],
"execution_count": 60,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "lsQKc2gncUAx"
},
"source": [
"def multiple_r2_score(x, y, z, ddof=0) -> float:\n",
" \"\"\"\n",
" Compute the coefficient of determination R2 of y given x and z.\n",
" ddof=1 uses the unbiased variance.\n",
" \"\"\"\n",
" (a, b, c) = multiple_regression(x, y, z)\n",
" x = np.array(x)\n",
" y = np.array(y)\n",
" z = np.array(z)\n",
" n = len(x)\n",
" \n",
" y_mean = sum(y) / n # y.mean() would give the same value\n",
" y_variance = ((y - y_mean) ** 2).sum() / (n-ddof)\n",
" y_std = np.sqrt(y_variance)\n",
"\n",
" sy_xz_2 = ((y - (b + a*x + z*c)) ** 2).sum() / (n-ddof)\n",
" r_2 = 1 - (sy_xz_2 / y_variance)\n",
" return r_2\n",
"\n"
],
"execution_count": 61,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "FDFtJZMFX3E1",
"outputId": "c30a0fba-a724-4972-8df2-f8c34429c930"
},
"source": [
"#@title a)\n",
"\n",
"y = [8,10,5,9]\n",
"x = [3,6,2,1]\n",
"z = [3,4,2,3]\n",
"(a,b,c) = multiple_regression(x,y,z)\n",
"print(f\"b + ax + cz = {b:0.4f} + {a:0.4f}x + {c:0.4f}z\")\n",
"r2 = multiple_r2_score(x,y,z)\n",
"print(f\"Coefficient of determination R2: {r2}\")"
],
"execution_count": 62,
"outputs": [
{
"output_type": "stream",
"text": [
"b + ax + cz = -1.0000 + -0.5000x + 3.5000z\n",
"Coefficient of determination R2: 1.0\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "88yFM5gbe1Hs",
"outputId": "420f13cc-bf66-4336-c5b7-54c1f08d1fe3"
},
"source": [
"#@title b)\n",
"\n",
"y = [33,52,50,47,38]\n",
"x = [2,5,4,3,4]\n",
"z = [41,57,64,70,45]\n",
"(a,b,c) = multiple_regression(x,y,z)\n",
"print(f\"b + ax + cz = {b:0.4f} + {a:0.4f}x + {c:0.4f}z\")\n",
"r2 = multiple_r2_score(x,y,z)\n",
"print(f\"Coefficient of determination R2: {r2}\")"
],
"execution_count": 63,
"outputs": [
{
"output_type": "stream",
"text": [
"b + ax + cz = 5.6012 + 3.8456x + 0.4432z\n",
"Coefficient of determination R2: 0.9385406984016261\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "71Q8n4tSaihU",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "6331593a-1956-4f33-afd2-918d0180fe75"
},
"source": [
"from sklearn.linear_model import LinearRegression\n",
"X = np.array(list(zip(x, z)))\n",
"reg = LinearRegression().fit(X, y)\n",
"print(reg.score(X, y))\n",
"print(reg.coef_)"
],
"execution_count": 64,
"outputs": [
{
"output_type": "stream",
"text": [
"0.9385406984016265\n",
"[3.8455857 0.44322496]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gf4oPlWIfdaE"
},
"source": [
"# 11."
]
},
{
"cell_type": "code",
"metadata": {
"id": "FsI3cDkkfdG4",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "6399bf48-9a98-4183-e19f-e8666830cd52"
},
"source": [
"#@title a)\n",
"\n",
"y = [21,22,19,31,19,14,8,24,22,12]\n",
"x = [7,8,4,9,3,6,3,7,8,2]\n",
"z = [1,4,9,9,6,2,1,9,7,5]\n",
"\n",
"(a,b,c) = multiple_regression(x,y,z)\n",
"print(f\"b + ax + cz = {b:0.4f} + {a:0.4f}x + {c:0.4f}z\")\n"
],
"execution_count": 65,
"outputs": [
{
"output_type": "stream",
"text": [
"b + ax + cz = 3.7195 + 1.7989x + 0.9862z\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "dMS8a4gkiy6C",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "6dccc155-ed7c-48bc-cc2b-52e1decb897f"
},
"source": [
"#@title b)\n",
"\n",
"def multi_variance(x, y, z, ddof=0) -> float:\n",
" \"\"\"\n",
" Residual variance s_{y.xz}^2 of y about the regression on x and z.\n",
" \"\"\"\n",
" (a, b, c) = multiple_regression(x, y, z)\n",
" x = np.array(x)\n",
" y = np.array(y)\n",
" z = np.array(z)\n",
" n = len(x)\n",
" \n",
" y_mean = sum(y) / n # y.mean() would give the same value\n",
" y_variance = ((y - y_mean) ** 2).sum() / (n-ddof)\n",
" y_std = np.sqrt(y_variance)\n",
"\n",
" sy_xz_2 = ((y - (b + a*x + z*c)) ** 2).sum() / (n-ddof)\n",
" return sy_xz_2\n",
"\n",
"def multi_std(x, y, z, ddof=0):\n",
" # Pass ddof through instead of hard-coding 0\n",
" return np.sqrt(multi_variance(x,y,z, ddof=ddof))\n",
"\n",
"print(multi_std(x,y,z))\n"
],
"execution_count": 66,
"outputs": [
{
"output_type": "stream",
"text": [
"2.3690804710817375\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "CZ45Vxpgi4Pq",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "8b4b9f44-fb1e-4d95-9df6-a2a753c49a6c"
},
"source": [
"#@title c)\n",
"\n",
"(a,b) = linear_regression(x,y)\n",
"print(f\"b + ax = {b:0.4f} + {a:0.4f}x\")\n"
],
"execution_count": 67,
"outputs": [
{
"output_type": "stream",
"text": [
"b + ax = 7.3529 + 2.0784x\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "y5JyRHbAjKlq",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "5c6312d1-6e2b-4e92-c920-45b0ef90f0d2"
},
"source": [
"#@title d)\n",
"\n",
"def single_variance(x, y, ddof=0) -> float:\n",
" \"\"\"\n",
" Residual variance s_{y.x}^2 of y about the regression on x.\n",
" \"\"\"\n",
" (a, b) = linear_regression(x, y)\n",
" x = np.array(x)\n",
" y = np.array(y)\n",
" n = len(x)\n",
" \n",
" y_mean = sum(y) / n\n",
" y_variance = ((y - y_mean) ** 2).sum() / (n-ddof)\n",
" y_std = np.sqrt(y_variance)\n",
"\n",
" s_y_x2 = ((y - (a*x + b)) ** 2).sum() / (n-ddof)\n",
" return s_y_x2\n",
"\n",
"\n",
"def single_std(x, y, ddof=0):\n",
" # Pass ddof through instead of hard-coding 0\n",
" return np.sqrt(single_variance(x,y,ddof=ddof))\n",
"\n",
"print(single_std(x,y))\n"
],
"execution_count": 68,
"outputs": [
{
"output_type": "stream",
"text": [
"3.7849029308660525\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RFirHvgsk69d"
},
"source": [
"# 12"
]
},
{
"cell_type": "code",
"metadata": {
"id": "9NjAUVTkkm68",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "e6c7fbc0-488f-4bd5-b9f7-0955e3d80080"
},
"source": [
"#@title a)\n",
"\n",
"y = [7,12,15,8,10,11,10,13,12,16]\n",
"x = [9,13,18,9,14,13,13,14,15,20]\n",
"z = [2,5,4,4,3,4,3,5,5,4]\n",
"\n",
"(a,b,c) = multiple_regression(x,y,z)\n",
"print(f\"b + ax + cz = {b:0.4f} + {a:0.4f}x + {c:0.4f}z\")\n"
],
"execution_count": 69,
"outputs": [
{
"output_type": "stream",
"text": [
"b + ax + cz = -1.3141 + 0.7063x + 0.7609z\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "s0BAs6hGlaPb",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "3cf40cac-e1ff-4c00-ca16-6eec49f42e09"
},
"source": [
"#@title b)\n",
"\n",
"print(multi_std(x,y,z))\n"
],
"execution_count": 70,
"outputs": [
{
"output_type": "stream",
"text": [
"0.5443856033467414\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "iPU9icMwskh7",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "57fcf1e0-4082-4d23-cb7a-43cb312aa11b"
},
"source": [
"#@title c)\n",
"\n",
"(a,b) = linear_regression(x,y)\n",
"print(f\"b + ax = {b:0.4f} + {a:0.4f}x\")\n"
],
"execution_count": 71,
"outputs": [
{
"output_type": "stream",
"text": [
"b + ax = 0.5795 + 0.7841x\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "rpcCpLG3sx-c",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "8d2147b5-2cd6-4e3c-d482-8589a6caaeb3"
},
"source": [
"#@title d)\n",
"\n",
"print(single_std(x,y))\n"
],
"execution_count": 72,
"outputs": [
{
"output_type": "stream",
"text": [
"0.8647122485123434\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RJkWuFeTt8z4"
},
"source": [
"# 13"
]
},
{
"cell_type": "code",
"metadata": {
"id": "C6kTk8MWuEcj",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "89e25cd9-d966-4999-c61f-73ed7d149618"
},
"source": [
"import math\n",
"\n",
"n = 50\n",
"total_x = 3990\n",
"total_y = 4095\n",
"total_x_square = 323610\n",
"total_y_square = 338375\n",
"total_xy = 330450\n",
"(b, a) = np.linalg.solve([\n",
" [n, total_x], [total_x, total_x_square]],\n",
" [total_y, total_xy]\n",
" )\n",
"\n",
"print(f\"Regression line y = {a:0.4}x + {b:0.4}\")\n",
"\n",
"x_mean = total_x / n\n",
"y_mean = total_y / n\n",
"\n",
"# Variance and standard deviation of x\n",
"x_variance = (total_x_square/n - (x_mean)**2)\n",
"x_std = math.sqrt(x_variance)\n",
"\n",
"# Variance and standard deviation of y\n",
"y_variance = (total_y_square/n - (y_mean)**2)\n",
"y_std = math.sqrt(y_variance)\n",
"\n",
"# Covariance of x and y\n",
"xy_covariance = (total_xy/n) - (x_mean*y_mean)\n",
"\n",
"r = xy_covariance / (x_std*y_std)\n",
"print(f\"Correlation coefficient R: {r:0.4f}\")\n",
"\n",
"# The textbook's answer gives a correlation coefficient of 0.604, which does not match this result.\n"
],
"execution_count": 73,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 0.7045x + 25.68\n",
"Correlation coefficient R: 0.9291\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "wRIqQUEk1qiv",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "951ec3c9-c85f-41f8-840b-f010644fb223"
},
"source": [
"# The correlation calculation above looks correct, so compute the\n",
"# correlation coefficient for the data in Table 3.9 with exactly the same calculation.\n",
"\n",
"x = [68,73,70,65, 85,59,92,78,60,75,83,74,84,77,80]\n",
"y = [52,58,63,55,88,60,85,60,55,70,75,50,85,68,78]\n",
"assert len(x) == len(y)\n",
"n = len(x)\n",
"\n",
"total_x = sum(x)\n",
"total_y = sum(y)\n",
"total_x_square = (np.array(x) ** 2).sum()\n",
"total_y_square = (np.array(y) ** 2).sum()\n",
"total_xy = (np.array(x) * np.array(y)).sum()\n",
"\n",
"x_mean = total_x / n\n",
"y_mean = total_y / n\n",
"\n",
"# Variance and standard deviation of x\n",
"x_variance = (total_x_square/n - (x_mean)**2)\n",
"x_std = math.sqrt(x_variance)\n",
"\n",
"# Variance and standard deviation of y\n",
"y_variance = (total_y_square/n - (y_mean)**2)\n",
"y_std = math.sqrt(y_variance)\n",
"\n",
"# Covariance of x and y\n",
"xy_covariance = (total_xy/n) - (x_mean*y_mean)\n",
"\n",
"r = xy_covariance / (x_std*y_std)\n",
"print(f\"Correlation coefficient R: {r:0.4f}\")\n",
"# This value matches the one given in the book's explanation. Hmm.\n"
],
"execution_count": 74,
"outputs": [
{
"output_type": "stream",
"text": [
"Correlation coefficient R: 0.8053\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "crcyPL0m5_ek"
},
"source": [
"# 14"
]
},
{
"cell_type": "code",
"metadata": {
"id": "zL24iFGk6B1r",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "edad63c3-f94b-4089-d3f5-e2844cab3591"
},
"source": [
"\n",
"x = [56,42,72,36,63,47,55,49,38,42,68,60]\n",
"y = [147,125,160,118,149,128,150,145,115,140,152,155]\n",
"\n",
"assert len(x) == len(y)\n",
"n = len(x)\n",
"\n",
"total_x = sum(x)\n",
"total_y = sum(y)\n",
"total_x_square = (np.array(x) ** 2).sum()\n",
"total_y_square = (np.array(y) ** 2).sum()\n",
"total_xy = (np.array(x) * np.array(y)).sum()\n",
"\n",
"(b, a) = np.linalg.solve([\n",
" [n, total_x], [total_x, total_x_square]],\n",
" [total_y, total_xy]\n",
" )\n",
"\n",
"print(f\"Regression line y = {a:0.4}x + {b:0.4}\")\n",
"\n",
"x_mean = total_x / n\n",
"y_mean = total_y / n\n",
"\n",
"# Variance and standard deviation of x\n",
"x_variance = (total_x_square/n - (x_mean)**2)\n",
"x_std = math.sqrt(x_variance)\n",
"\n",
"# Variance and standard deviation of y\n",
"y_variance = (total_y_square/n - (y_mean)**2)\n",
"y_std = math.sqrt(y_variance)\n",
"\n",
"# Covariance of x and y\n",
"xy_covariance = (total_xy/n) - (x_mean*y_mean)\n",
"\n",
"r = xy_covariance / (x_std*y_std)\n",
"print(f\"Correlation coefficient R: {r:0.4f}\")\n",
"# Here too the textbook's answer for the correlation coefficient, 0.876, is slightly off from this value.\n",
"# Cross-check with NumPy's implementation: it also differs from the textbook's answer and agrees with mine.\n",
"np.corrcoef(x, y)\n"
],
"execution_count": 75,
"outputs": [
{
"output_type": "stream",
"text": [
"Regression line y = 1.138x + 80.78\n",
"Correlation coefficient R: 0.8961\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[1. , 0.89613936],\n",
" [0.89613936, 1. ]])"
]
},
"metadata": {},
"execution_count": 75
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tmXt-UpS8E12"
},
"source": [
"# 15"
]
},
{
"cell_type": "code",
"metadata": {
"id": "zvRNw_Oy8KXa",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "3b80b28f-65fa-4919-c713-1f1e4209d3cb"
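},
"source": [
"A = [5,1,7,8,2,4,3,6]\n",
"B = [3,2,7,8,1,6,4,5]\n",
"C = [7,3,4,6,2,5,1,8]\n",
"\n",
"# The correlation coefficient was computed by hand in 13 and 14, so this time take the shortcut and let NumPy compute it\n",
"ABC = np.array([A, B, C])\n",
"abc_coef = np.corrcoef(np.array([A, B, C]))\n",
"print(abc_coef)\n",
"print(f\"Correlation coefficient of A and B:{ abc_coef[0][1]: 0.4f}\")\n",
"print(f\"Correlation coefficient of A and C:{ abc_coef[0][2]: 0.4f}\")\n",
"print(f\"Correlation coefficient of B and C:{ abc_coef[1][2]: 0.4f}\")\n",
"print(\"A and B are highly correlated (their judgments agree); B and C are weakly correlated (their judgments differ widely)\")"
],
"execution_count": 76,
"outputs": [
{
"output_type": "stream",
"text": [
"[[1. 0.85714286 0.64285714]\n",
" [0.85714286 1. 0.4047619 ]\n",
" [0.64285714 0.4047619 1. ]]\n",
"Correlation coefficient of A and B: 0.8571\n",
"Correlation coefficient of A and C: 0.6429\n",
"Correlation coefficient of B and C: 0.4048\n",
"A and B are highly correlated (their judgments agree); B and C are weakly correlated (their judgments differ widely)\n"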
],
"name": "stdout"
}
]
}
]
}
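
The notebook's `linear_regression` solves the 2x2 normal equations by hand, and several cells compare its results against textbook answers. A small cross-check may help separate arithmetic slips from textbook discrepancies. The sketch below refits the exercise 1 data with `np.polyfit` and checks that, for a single predictor, R² equals the squared correlation coefficient. The helper name `fit_line_normal_equations` is hypothetical (introduced here for illustration only); it is assumed to mirror the notebook's approach rather than reproduce it exactly.

```python
import numpy as np


def fit_line_normal_equations(x, y):
    """Fit y = a*x + b by solving the 2x2 normal equations; returns (a, b).

    Hypothetical helper, assumed to mirror the notebook's linear_regression.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    lhs = np.array([[n, x.sum()], [x.sum(), (x ** 2).sum()]])
    rhs = np.array([y.sum(), (x * y).sum()])
    b, a = np.linalg.solve(lhs, rhs)  # unknowns are ordered (intercept, slope)
    return a, b


# Data from exercise 1 (advertising cost vs. profit)
x = np.array([1.2, 0.7, 1.5, 1.8, 0.5, 3.4, 1.0, 3.0, 2.8, 2.5])
y = np.array([2.7, 2.4, 2.7, 3.3, 1.1, 5.8, 2.2, 4.2, 4.4, 3.8])

a, b = fit_line_normal_equations(x, y)

# np.polyfit returns coefficients from highest degree down: (slope, intercept)
a_ref, b_ref = np.polyfit(x, y, deg=1)

# R^2 as 1 - SS_res / SS_tot, and the Pearson correlation coefficient
residuals = y - (a * x + b)
r2 = 1 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()
r = np.corrcoef(x, y)[0, 1]

print(f"normal equations: a={a:.4f}, b={b:.4f}")
print(f"np.polyfit:       a={a_ref:.4f}, b={b_ref:.4f}")
print(f"R^2={r2:.4f}, r^2={r ** 2:.4f}")
```

For this data the notebook prints y = 1.249x + 0.9627, so both fits should agree with that to the printed precision. When a hand computation and `np.polyfit` agree but the textbook answer differs, as in exercises 13 and 14, the discrepancy more likely lies in the textbook's inputs or rounding than in the code.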