6. 머신러닝으로 업무 효율화하기.ipynb
{ | |
"nbformat": 4, | |
"nbformat_minor": 0, | |
"metadata": { | |
"colab": { | |
"name": "6. 머신러닝으로 업무 효율화하기.ipynb", | |
"provenance": [], | |
"authorship_tag": "ABX9TyNkawQzTLzXHsvPwljRj7dM", | |
"include_colab_link": true | |
}, | |
"kernelspec": { | |
"name": "python3", | |
"display_name": "Python 3" | |
}, | |
"language_info": { | |
"name": "python" | |
} | |
}, | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "view-in-github", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"<a href=\"https://colab.research.google.com/gist/whan0623/92d7cbe35ef29c594538f12c64485865/6.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "WoJ8ubWfasPl" | |
}, | |
"source": [ | |
"#6-1 업무 시스템에 머신러닝 적용하기\n", | |
"- 사용자가 거의 없는 야간에 DB에 ETL(Extract/Transform/Load) 처리" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "RkTzNgGca9Kd" | |
}, | |
"source": [ | |
"#6-2 학습 모델을 저장하고 읽어 들이는 방법" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "Cf_mJ0wsdnrf" | |
}, | |
"source": [ | |
"##scikit-learn에서 학습데이터 읽고 저장하기" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "FnRht00FbP8N" | |
}, | |
"source": [ | |
"###scikit-learn에서 학습데이터 저장하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "HOnuJwHGaMb5", | |
"outputId": "bd0ed4fa-0f13-418a-be92-9e36a31ff0e9" | |
}, | |
"source": [ | |
"from sklearn import datasets, svm\n", | |
"from sklearn.externals import joblib\n", | |
"\n", | |
"# 붓꽃 데이터 읽어 들이기\n", | |
"iris = datasets.load_iris()\n", | |
"\n", | |
"# 데이터 학습하기\n", | |
"clf = svm.SVC()\n", | |
"clf.fit(iris.data, iris.target)\n", | |
"\n", | |
"# 학습한 데이터 저장하기\n", | |
"joblib.dump(clf, 'iris.pkl', compress=True)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stderr", | |
"text": [ | |
"/usr/local/lib/python3.7/dist-packages/sklearn/externals/joblib/__init__.py:15: FutureWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.\n", | |
" warnings.warn(msg, category=FutureWarning)\n" | |
] | |
}, | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"['iris.pkl']" | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 5 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "WPUGS8vYb_CU" | |
}, | |
"source": [ | |
"### 구글 드라이브(google drive)에 저장하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "cavoph4gbkVm", | |
"outputId": "5ec3e27b-55d0-4090-83a6-1d802cef2a99" | |
}, | |
"source": [ | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"\n", | |
"from sklearn.externals import joblib\n", | |
"joblib.dump(clf, '/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/iris.plk', compress=True)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n" | |
] | |
}, | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"['/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/iris.plk']" | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 2 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "xl2xm6EocD37" | |
}, | |
"source": [ | |
"###scikit-learn에서 학습된 데이터 읽어오기\n", | |
"\n", | |
"\n", | |
"\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "5kyusomNcQVC" | |
}, | |
"source": [ | |
"from sklearn.externals import joblib\n", | |
"\n", | |
"# 이전에 저장한 학습된 데이터 읽어 들이기\n", | |
"clf = joblib.load('iris.pkl')" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "vXkOq5QgcK7T" | |
}, | |
"source": [ | |
"###구글 드라이브(google drive)에 저장하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "S22lRzQTchrK", | |
"outputId": "2ce7326d-dab7-4b69-f4f3-8e83ba96b635" | |
}, | |
"source": [ | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"\n", | |
"from sklearn.externals import joblib\n", | |
"clf = joblib.load('/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/iris.plk')" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "K71gmkLEfWEu" | |
}, | |
"source": [ | |
"### 평가하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "4jzYN9Z0cpJy", | |
"outputId": "25d2470d-8751-45bb-e6e8-ab7d93c51fbc" | |
}, | |
"source": [ | |
"from sklearn import datasets, svm\n", | |
"from sklearn.metrics import accuracy_score\n", | |
"\n", | |
"# 붓꽃 데이터 읽어 들이기\n", | |
"iris = datasets.load_iris()\n", | |
"# 예측하기\n", | |
"pre = clf.predict(iris.data)\n", | |
"# 정답률 확인하기\n", | |
"print(accuracy_score(iris.target, pre))" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"0.9733333333333334\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "slhqKgQgdyTm" | |
}, | |
"source": [ | |
"## Tensorflow와 Keras에서 학습데이터 읽고 저장하기" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "BBjvwiw3d4So" | |
}, | |
"source": [ | |
"### Tensorflow와 Keras에서 학습데이터 저장하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "dQVQFXj_eMuG", | |
"outputId": "dae1227d-a35a-4cb6-c90a-b596ad1d9d6f" | |
}, | |
"source": [ | |
"from sklearn import datasets\n", | |
"import keras\n", | |
"from keras.models import Sequential\n", | |
"from keras.layers import Dense, Dropout\n", | |
"from keras.utils.np_utils import to_categorical\n", | |
"\n", | |
"# 붓꽃 데이터 읽어 들이기\n", | |
"iris = datasets.load_iris()\n", | |
"in_size = 4\n", | |
"nb_classes=3\n", | |
"# 레이블 데이터를 One-hot 형식으로 변환하기\n", | |
"x = iris.data\n", | |
"y = to_categorical(iris.target, nb_classes)\n", | |
"\n", | |
"# 모델 정의하기 --- (*1)\n", | |
"model = Sequential()\n", | |
"model.add(Dense(512, activation='relu', input_shape=(in_size,)))\n", | |
"model.add(Dense(512, activation='relu'))\n", | |
"model.add(Dropout(0.2))\n", | |
"model.add(Dense(nb_classes, activation='softmax'))\n", | |
"# 컴파일하기 --- (*2)\n", | |
"model.compile(\n", | |
" loss='categorical_crossentropy',\n", | |
" optimizer='adam',\n", | |
" metrics=['accuracy'])\n", | |
"# 학습 실행하기 --- (*3)\n", | |
"model.fit(x, y, batch_size=20, epochs=50)\n", | |
"\n", | |
"# 모델 저장하기 --- (*4)\n", | |
"model.save('iris_model.h5')\n", | |
"# 학습한 가중치 데이터 저장하기 --- (*5)\n", | |
"model.save_weights('iris_weight.h5')" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Epoch 1/50\n", | |
"8/8 [==============================] - 1s 5ms/step - loss: 0.8119 - accuracy: 0.6267\n", | |
"Epoch 2/50\n", | |
"8/8 [==============================] - 0s 6ms/step - loss: 0.4872 - accuracy: 0.7533\n", | |
"Epoch 3/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.3727 - accuracy: 0.8000\n", | |
"Epoch 4/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.3047 - accuracy: 0.8533\n", | |
"Epoch 5/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.2168 - accuracy: 0.9333\n", | |
"Epoch 6/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.2075 - accuracy: 0.9533\n", | |
"Epoch 7/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.1665 - accuracy: 0.9200\n", | |
"Epoch 8/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.1430 - accuracy: 0.9600\n", | |
"Epoch 9/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.1338 - accuracy: 0.9533\n", | |
"Epoch 10/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.1604 - accuracy: 0.9333\n", | |
"Epoch 11/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.1916 - accuracy: 0.9133\n", | |
"Epoch 12/50\n", | |
"8/8 [==============================] - 0s 6ms/step - loss: 0.1872 - accuracy: 0.9267\n", | |
"Epoch 13/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.1294 - accuracy: 0.9467\n", | |
"Epoch 14/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.1166 - accuracy: 0.9733\n", | |
"Epoch 15/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.1040 - accuracy: 0.9600\n", | |
"Epoch 16/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.1207 - accuracy: 0.9600\n", | |
"Epoch 17/50\n", | |
"8/8 [==============================] - 0s 6ms/step - loss: 0.1142 - accuracy: 0.9533\n", | |
"Epoch 18/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0797 - accuracy: 0.9733\n", | |
"Epoch 19/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.1177 - accuracy: 0.9467\n", | |
"Epoch 20/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0944 - accuracy: 0.9533\n", | |
"Epoch 21/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0915 - accuracy: 0.9733\n", | |
"Epoch 22/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0876 - accuracy: 0.9467\n", | |
"Epoch 23/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0994 - accuracy: 0.9733\n", | |
"Epoch 24/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.1277 - accuracy: 0.9533\n", | |
"Epoch 25/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.1238 - accuracy: 0.9333\n", | |
"Epoch 26/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0937 - accuracy: 0.9467\n", | |
"Epoch 27/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.1045 - accuracy: 0.9667\n", | |
"Epoch 28/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0990 - accuracy: 0.9600\n", | |
"Epoch 29/50\n", | |
"8/8 [==============================] - 0s 6ms/step - loss: 0.0658 - accuracy: 0.9800\n", | |
"Epoch 30/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0906 - accuracy: 0.9733\n", | |
"Epoch 31/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0883 - accuracy: 0.9667\n", | |
"Epoch 32/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0833 - accuracy: 0.9733\n", | |
"Epoch 33/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0851 - accuracy: 0.9667\n", | |
"Epoch 34/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0748 - accuracy: 0.9733\n", | |
"Epoch 35/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.1006 - accuracy: 0.9533\n", | |
"Epoch 36/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0846 - accuracy: 0.9600\n", | |
"Epoch 37/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0923 - accuracy: 0.9733\n", | |
"Epoch 38/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0720 - accuracy: 0.9667\n", | |
"Epoch 39/50\n", | |
"8/8 [==============================] - 0s 6ms/step - loss: 0.0850 - accuracy: 0.9600\n", | |
"Epoch 40/50\n", | |
"8/8 [==============================] - 0s 6ms/step - loss: 0.0771 - accuracy: 0.9733\n", | |
"Epoch 41/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0665 - accuracy: 0.9733\n", | |
"Epoch 42/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0617 - accuracy: 0.9800\n", | |
"Epoch 43/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0659 - accuracy: 0.9733\n", | |
"Epoch 44/50\n", | |
"8/8 [==============================] - 0s 5ms/step - loss: 0.0768 - accuracy: 0.9667\n", | |
"Epoch 45/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0725 - accuracy: 0.9600\n", | |
"Epoch 46/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0778 - accuracy: 0.9600\n", | |
"Epoch 47/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0786 - accuracy: 0.9667\n", | |
"Epoch 48/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0792 - accuracy: 0.9867\n", | |
"Epoch 49/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0695 - accuracy: 0.9733\n", | |
"Epoch 50/50\n", | |
"8/8 [==============================] - 0s 4ms/step - loss: 0.0820 - accuracy: 0.9667\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "PFD3BvkweB1e" | |
}, | |
"source": [ | |
"### 구글 드라이브(google drive)에 저장하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "FunaqbEqedBn", | |
"outputId": "ea1bda18-4af5-4400-a035-4df0190f520e" | |
}, | |
"source": [ | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"\n", | |
"# 모델 저장하기 --- (*4)\n", | |
"model.save('/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/iris_model.h5')\n", | |
"# 학습한 가중치 데이터 저장하기 --- (*5)\n", | |
"model.save_weights('/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/iris_weight.h5')" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "6fakAlQqd-zW" | |
}, | |
"source": [ | |
"### Tensorflow와 Keras에서 학습데이터 읽어오기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "8_4fFpaOe0BF" | |
}, | |
"source": [ | |
"from sklearn import datasets\n", | |
"import keras\n", | |
"from keras.models import load_model\n", | |
"from keras.utils.np_utils import to_categorical\n", | |
"\n", | |
"# 붓꽃 데이터 읽어 들이기\n", | |
"iris = datasets.load_iris()\n", | |
"in_size = 4\n", | |
"nb_classes=3\n", | |
"# 레이블 데이터를 One-hot 형식으로 변환하기\n", | |
"x = iris.data\n", | |
"y = to_categorical(iris.target, nb_classes)\n", | |
"\n", | |
"# 모델 읽어 들이기 --- (*1)\n", | |
"model = load_model('iris_model.h5')\n", | |
"# 가중치 데이터 읽어 들이기 --- (*2)\n", | |
"model.load_weights('iris_weight.h5')" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "oYd-PrxreI-l" | |
}, | |
"source": [ | |
"### 구글 드라이브(google drive)에서 읽어오기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "yDMT1Pq2e99M", | |
"outputId": "5d710a2c-25cc-4a91-b948-8472d82691d4" | |
}, | |
"source": [ | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"\n", | |
"# 모델 읽어 들이기 --- (*1)\n", | |
"model = load_model('/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/iris_model.h5')\n", | |
"# 가중치 데이터 읽어 들이기 --- (*2)\n", | |
"model.load_weights('/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/iris_weight.h5')" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "OQdNCs38faT3" | |
}, | |
"source": [ | |
"### 평가하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "xhGSnJKie-Bc", | |
"outputId": "0e09ed3f-ce93-4d7a-f774-49978c07013e" | |
}, | |
"source": [ | |
"# 모델 평가하기 --- (*3)\n", | |
"score = model.evaluate(x, y, verbose=1)\n", | |
"print(\"정답률=\", score[1])" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"5/5 [==============================] - 0s 3ms/step - loss: 0.0577 - accuracy: 0.9733\n", | |
"정답률= 0.9733333587646484\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "txJfW3JQfnVo" | |
}, | |
"source": [ | |
"#6-3 뉴스 기사의 카테고리 판정하기\n", | |
"- BoW(Bag-of-Words) : 문장을 벡터 데이터로 변환\n", | |
"- TF-IDF : 문장을 수리초 변환, 출현빈도와 함께 문장 전체에서 단어의 중요도 고려" | |
] | |
}, | |
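{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The TF-IDF weighting described above can be sketched on a toy corpus. This is only a minimal illustration, not the module used in this notebook: the English tokens in `docs` are hypothetical, and the `log(N / df) + 1` IDF form and the cap at 1.0 are assumptions chosen to mirror the `calc_files` function in this notebook's TF-IDF module.\n", | |
"\n", | |
"```python\n", | |
"import math\n", | |
"from collections import Counter\n", | |
"\n", | |
"# Hypothetical, already-tokenized toy documents (not the news data)\n", | |
"docs = [['rain', 'today'], ['rain', 'hot', 'today', 'rain'], ['sunday']]\n", | |
"\n", | |
"def tfidf(docs):\n", | |
"    n_docs = len(docs)\n", | |
"    # Document frequency: number of documents that contain each word\n", | |
"    df = Counter(word for doc in docs for word in set(doc))\n", | |
"    vectors = []\n", | |
"    for doc in docs:\n", | |
"        vec = {}\n", | |
"        for word, count in Counter(doc).items():\n", | |
"            tf = count / len(doc)                  # term frequency within this document\n", | |
"            idf = math.log(n_docs / df[word]) + 1  # rarer words get a larger factor\n", | |
"            vec[word] = min(tf * idf, 1.0)         # cap the weight at 1.0\n", | |
"        vectors.append(vec)\n", | |
"    return vectors\n", | |
"\n", | |
"for vec in tfidf(docs):\n", | |
"    print(vec)\n", | |
"```\n", | |
"\n", | |
"With this form, a word that appears in every document has an IDF of exactly 1, so its weight is just its term frequency, while rarer words are weighted up." | |
] | |
}, | |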
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "e9hDVBkYgm0R" | |
}, | |
"source": [ | |
"##TF-IDF 모듈 만들기" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "MmSxoWi-hLx5" | |
}, | |
"source": [ | |
"###koNLpy 설치\n", | |
"- !pip install konlpy\n", | |
" - 느낌표(!) 뒤에 쉘명령어를 쓰면 코랩에서 실행됨\n", | |
"- 참조 : https://couplewith.tistory.com/entry/Python-KoNLPy-%ED%98%95%ED%83%9C%EC%86%8C-%EB%B6%84%EC%84%9D%EA%B8%B0-%EB%B9%84%EA%B5%90-Komoran-Okt-Kkma" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "IFAqcDQphK2o", | |
"outputId": "672a2ab7-1ba7-4885-d6b3-ecf66160b0d1" | |
}, | |
"source": [ | |
"!pip install konlpy" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Collecting konlpy\n", | |
" Downloading konlpy-0.5.2-py2.py3-none-any.whl (19.4 MB)\n", | |
"\u001b[K |████████████████████████████████| 19.4 MB 131 kB/s \n", | |
"\u001b[?25hRequirement already satisfied: lxml>=4.1.0 in /usr/local/lib/python3.7/dist-packages (from konlpy) (4.2.6)\n", | |
"Collecting beautifulsoup4==4.6.0\n", | |
" Downloading beautifulsoup4-4.6.0-py3-none-any.whl (86 kB)\n", | |
"\u001b[K |████████████████████████████████| 86 kB 6.6 MB/s \n", | |
"\u001b[?25hRequirement already satisfied: numpy>=1.6 in /usr/local/lib/python3.7/dist-packages (from konlpy) (1.19.5)\n", | |
"Collecting colorama\n", | |
" Downloading colorama-0.4.4-py2.py3-none-any.whl (16 kB)\n", | |
"Requirement already satisfied: tweepy>=3.7.0 in /usr/local/lib/python3.7/dist-packages (from konlpy) (3.10.0)\n", | |
"Collecting JPype1>=0.7.0\n", | |
" Downloading JPype1-1.3.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (448 kB)\n", | |
"\u001b[K |████████████████████████████████| 448 kB 69.9 MB/s \n", | |
"\u001b[?25hRequirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from JPype1>=0.7.0->konlpy) (3.7.4.3)\n", | |
"Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from tweepy>=3.7.0->konlpy) (1.3.0)\n", | |
"Requirement already satisfied: requests[socks]>=2.11.1 in /usr/local/lib/python3.7/dist-packages (from tweepy>=3.7.0->konlpy) (2.23.0)\n", | |
"Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.7/dist-packages (from tweepy>=3.7.0->konlpy) (1.15.0)\n", | |
"Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/dist-packages (from requests-oauthlib>=0.7.0->tweepy>=3.7.0->konlpy) (3.1.1)\n", | |
"Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests[socks]>=2.11.1->tweepy>=3.7.0->konlpy) (2.10)\n", | |
"Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests[socks]>=2.11.1->tweepy>=3.7.0->konlpy) (3.0.4)\n", | |
"Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests[socks]>=2.11.1->tweepy>=3.7.0->konlpy) (1.24.3)\n", | |
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests[socks]>=2.11.1->tweepy>=3.7.0->konlpy) (2021.5.30)\n", | |
"Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /usr/local/lib/python3.7/dist-packages (from requests[socks]>=2.11.1->tweepy>=3.7.0->konlpy) (1.7.1)\n", | |
"Installing collected packages: JPype1, colorama, beautifulsoup4, konlpy\n", | |
" Attempting uninstall: beautifulsoup4\n", | |
" Found existing installation: beautifulsoup4 4.6.3\n", | |
" Uninstalling beautifulsoup4-4.6.3:\n", | |
" Successfully uninstalled beautifulsoup4-4.6.3\n", | |
"Successfully installed JPype1-1.3.0 beautifulsoup4-4.6.0 colorama-0.4.4 konlpy-0.5.2\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "q-60kS5Nfq4s" | |
}, | |
"source": [ | |
"# TF-IDF로 텍스트를 벡터로 변환하는 모듈\n", | |
"from konlpy.tag import Okt\n", | |
"import pickle\n", | |
"import numpy as np\n", | |
"\n", | |
"# KoNLPy의 Okt객체 초기화 ---- ( ※ 1)\n", | |
"okt = Okt()\n", | |
"# 전역 변수 --- ( ※ 2)\n", | |
"word_dic = {'_id': 0} # 단어 사전\n", | |
"dt_dic = {} # 문장 전체에서의 단어 출현 횟수\n", | |
"id_files = [] # 문서들을 저장할 리스트\n", | |
"\n", | |
"def tokenize(text):\n", | |
" '''KoNLPy로 형태소 분석하기''' # --- ( ※ 3) \n", | |
" result = []\n", | |
" word_s = okt.pos(text, norm=True, stem=True)\n", | |
" for n, h in word_s:\n", | |
" if not (h in ['Noun', 'Verb ', 'Adjective']): continue\n", | |
" if h == 'Punctuation' and h2 == 'Number': continue\n", | |
" result.append(n)\n", | |
" return result\n", | |
"\n", | |
"def words_to_ids(words, auto_add = True):\n", | |
" ''' 단어를 ID로 변환하기 ''' # --- ( ※ 4)\n", | |
" result = []\n", | |
" for w in words:\n", | |
" if w in word_dic:\n", | |
" result.append(word_dic[w])\n", | |
" continue\n", | |
" elif auto_add:\n", | |
" id = word_dic[w] = word_dic['_id']\n", | |
" word_dic['_id'] += 1\n", | |
" result.append(id)\n", | |
" return result\n", | |
"\n", | |
"def add_text(text):\n", | |
" '''텍스트를 ID 리스트로 변환해서 추가하기''' # --- (*5)\n", | |
" ids = words_to_ids(tokenize(text))\n", | |
" id_files.append(ids)\n", | |
"\n", | |
"def add_file(path):\n", | |
" '''텍스트 파일을 학습 전용으로 추가하기''' # --- (*6)\n", | |
" with open(path, \"r\", encoding=\"utf-8\") as f:\n", | |
" s = f.read()\n", | |
" add_text(s)\n", | |
"\n", | |
"def calc_files():\n", | |
" '''추가한 파일 계산하기''' # --- (*7)\n", | |
" global dt_dic\n", | |
" result = []\n", | |
" doc_count = len(id_files)\n", | |
" dt_dic = {}\n", | |
" # 단어 출현 횟수 세기 --- (*8)\n", | |
" for words in id_files:\n", | |
" used_word = {}\n", | |
" data = np.zeros(word_dic['_id'])\n", | |
" for id in words:\n", | |
" data[id] += 1\n", | |
" used_word[id] = 1\n", | |
" # 단어 t가 사용되고 있을 경우 dt_dic의 수를 1 더하기 --- (*9)\n", | |
" for id in used_word:\n", | |
" if not(id in dt_dic): dt_dic[id] = 0\n", | |
" dt_dic[id] += 1\n", | |
" # 정규화하기 --- (*10)\n", | |
" data = data / len(words) \n", | |
" result.append(data)\n", | |
" # TF-IDF 계산하기 --- (*11)\n", | |
" for i, doc in enumerate(result):\n", | |
" for id, v in enumerate(doc):\n", | |
" idf = np.log(doc_count / dt_dic[id]) + 1\n", | |
" doc[id] = min([doc[id] * idf, 1.0])\n", | |
" result[i] = doc\n", | |
" return result\n", | |
"\n", | |
"def save_dic(fname):\n", | |
" '''사전을 파일로 저장하기''' # --- (*12)\n", | |
" pickle.dump(\n", | |
" [word_dic, dt_dic, id_files],\n", | |
" open(fname, \"wb\"))\n", | |
"\n", | |
"def load_dic(fname):\n", | |
" '''사전 파일 읽어 들이기''' # --- (*13)\n", | |
" global word_dic, dt_dic, id_files\n", | |
" n = pickle.load(open(fname, 'rb'))\n", | |
" word_dic, dt_dic, id_files = n\n", | |
"\n", | |
"def calc_text(text):\n", | |
" ''' 문장을 벡터로 변환하기 ''' # --- ( ※ 14)\n", | |
" data = np.zeros(word_dic['_id'])\n", | |
" words = words_to_ids(tokenize(text), False)\n", | |
" for w in words:\n", | |
" data[w] += 1\n", | |
" data = data / len(words)\n", | |
" for id, v in enumerate(data):\n", | |
" idf = np.log(len(id_files) / dt_dic[id]) + 1\n", | |
" data[id] = min([data[id] * idf, 1.0])\n", | |
" return data\n" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "3DyOJleeRatH" | |
}, | |
"source": [ | |
"### 모듈 테스트하기 --- ( ※ 15)\n", | |
"- 아래 코드는 실행시키면 4개의 files가 추가되므로 뒤의 신문기사에서 오류 발생\n", | |
" if __name__ == '__main__':\n", | |
" add_text('비')\n", | |
" add_text('오늘은 비가 내렸어요.') \n", | |
" add_text('오늘은 더웠지만 오후부터 비가 내렸다.') \n", | |
" add_text('비가 내리는 일요일이다.') \n", | |
" print(calc_files())\n", | |
" print(word_dic)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "lQSnU9Z3kFs8" | |
}, | |
"source": [ | |
"##신문기사 2400개의 데이터를 코랩에서 사용\n", | |
"- 구글 드라이브 연결\n", | |
"- genre.tar.gz 파일 복사\n", | |
"- 압축풀기 " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "SOdM6Xiukvm8" | |
}, | |
"source": [ | |
"import os\n", | |
"\n", | |
"paths = ['./100/','./101/','./103/','./105/',]\n", | |
"for path in paths:\n", | |
" files = os.listdir(path)\n", | |
" for file in files:\n", | |
" if file.endswith(\".txt\"):\n", | |
" os.remove(path + file)\n", | |
" os.rmdir(path)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 53 | |
}, | |
"id": "qajBE3c5lffq", | |
"outputId": "5b4332e5-83b4-422e-c67c-0143a44410fd" | |
}, | |
"source": [ | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"\n", | |
"path = \"/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/\"\n", | |
"file = \"genre.tar.gz\"\n", | |
"\n", | |
"import shutil\n", | |
"shutil.copy(path+file, file)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n" | |
] | |
}, | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"application/vnd.google.colaboratory.intrinsic+json": { | |
"type": "string" | |
}, | |
"text/plain": [ | |
"'genre.tar.gz'" | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 10 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "TrQNporUjEic" | |
}, | |
"source": [ | |
"!tar -zxvf genre.tar.gz" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "IMUxaEUnR8c2" | |
}, | |
"source": [ | |
"### 리스트 id_files가 비어있음을 확인한 후 진행" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "RugMoCgWKb2P", | |
"outputId": "ce00b6e6-968d-48b3-e25b-5f7a073dd427" | |
}, | |
"source": [ | |
"id_files\n" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"[]" | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 40 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "Y87Hp3iZuSKt" | |
}, | |
"source": [ | |
"### 텍스트 분류하기\n", | |
"- 실행시간 : 16분" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "AIhy-gj3ihuR", | |
"outputId": "3e8f6d45-09e6-4800-bce0-e897cd0ec73a" | |
}, | |
"source": [ | |
"import os, glob, pickle\n", | |
"#import tfidf #코랩에서 앞 셀에 정의되어 있어 import 생략 가능\n", | |
"\n", | |
"# 변수 초기화\n", | |
"y = []\n", | |
"x = []\n", | |
"\n", | |
"# 디렉터리 내부의 파일 목록 전체에 대해 처리하기 --- (*1)\n", | |
"def read_files(path, label):\n", | |
" print(\"read_files=\", path)\n", | |
" files = glob.glob(path + \"/*.txt\")\n", | |
" for f in files:\n", | |
" if os.path.basename(f) == 'LICENSE.txt': continue\n", | |
" #tfidf.add_file(f)\n", | |
" add_file(f) #코랩에서 앞 셀에 정의되어 있어 tfidf 생략 가능\n", | |
" y.append(label)\n", | |
"\n", | |
"# 기사를 넣은 디렉터리 읽어 들이기 --- ( ※ 2)\n", | |
"#read_files('text/100', 0)\n", | |
"#read_files('text/101', 1)\n", | |
"#read_files('text/103', 2)\n", | |
"#read_files('text/105', 3)\n", | |
"\n", | |
"read_files('./100', 0)\n", | |
"read_files('./101', 1)\n", | |
"read_files('./103', 2)\n", | |
"read_files('./105', 3)\n", | |
"\n", | |
"\n", | |
"# TF-IDF 벡터로 변환하기 --- (*3)\n", | |
"#x = tfidf.calc_files()\n", | |
"x = calc_files() #코랩에서 앞 셀에 정의되어 있어 tfidf 생략 가능" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"read_files= ./100\n", | |
"read_files= ./101\n", | |
"read_files= ./103\n", | |
"read_files= ./105\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "mHfErRwaW3OV", | |
"outputId": "b2661337-b242-4701-d32d-3d993130baee" | |
}, | |
"source": [ | |
"len(x), len(y)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"(3197, 3197)" | |
] | |
}, | |
"metadata": {}, | |
"execution_count": 42 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "YdmbgweTDJHY" | |
}, | |
"source": [ | |
"#### 저장하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "ntp37QlNpfM4", | |
"outputId": "82cb87e7-a6e9-4a06-819a-3cb25701e1f4" | |
}, | |
"source": [ | |
"# 저장하기 --- (*4)\n", | |
"pickle.dump([y, x], open('genre.pickle', 'wb'))\n", | |
"#tfidf.save_dic('text/genre-tdidf.dic')\n", | |
"save_dic('genre-tdidf.dic') #코랩에서 앞 셀에 정의되어 있어 tfidf 생략 가능\n", | |
"print('ok')" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"ok\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "8OdYGTI0DN1_" | |
}, | |
"source": [ | |
"#### 구글 드라이브에 저장하기\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "pmAZKVZ6pOAA", | |
"outputId": "6e5bb66a-30b2-4f17-899a-b9f572544a9e" | |
}, | |
"source": [ | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"\n", | |
"path = \"/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/\"\n", | |
"\n", | |
"pickle.dump([y, x], open(path + 'genre.pickle', 'wb'))\n", | |
"#tfidf.save_dic('text/genre-tdidf.dic')\n", | |
"save_dic(path + 'genre-tdidf.dic') #코랩에서 앞 셀에 정의되어 있어 tfidf 생략 가능\n", | |
"print('ok')" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n", | |
"ok\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "-I_upKd7DQ7H" | |
}, | |
"source": [ | |
"#### 구글 드라이브에서 읽어오기\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "CkBmXbjJDUyn", | |
"outputId": "9e6eb62f-d606-4458-f94a-7d5a2a7a57a6" | |
}, | |
"source": [ | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"\n", | |
"path = \"/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/\"\n", | |
"\n", | |
"import pickle\n", | |
"y, x = pickle.load(open(path + 'genre.pickle', 'rb'))\n" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "-sIuUamHEttA" | |
}, | |
"source": [ | |
"#### 읽어오기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "hMdyEqzgEiVJ" | |
}, | |
"source": [ | |
"path = \"./\"\n", | |
"import pickle\n", | |
"y, x = pickle.load(open(path + 'genre.pickle', 'rb'))" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
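{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"A minimal sketch of the pickle round trip used for `genre.pickle`, written with `with` blocks so that each file is closed as soon as the dump or load finishes. The file name `demo.pickle`, the temporary directory, and the toy `y`/`x` values are placeholders assumed for illustration, not the real genre data.\n", | |
"\n", | |
"```python\n", | |
"import os\n", | |
"import pickle\n", | |
"import tempfile\n", | |
"\n", | |
"# Toy stand-ins for the label list and the TF-IDF vectors (assumed values)\n", | |
"y = [0, 1]\n", | |
"x = [[0.1, 0.0], [0.0, 0.7]]\n", | |
"\n", | |
"path = os.path.join(tempfile.mkdtemp(), 'demo.pickle')\n", | |
"\n", | |
"# Save both objects as one list, in the same [y, x] order as genre.pickle\n", | |
"with open(path, 'wb') as fp:\n", | |
"    pickle.dump([y, x], fp)\n", | |
"\n", | |
"# Load them back; the unpacking relies on that same order\n", | |
"with open(path, 'rb') as fp:\n", | |
"    y2, x2 = pickle.load(fp)\n", | |
"\n", | |
"print(y2 == y, x2 == x)\n", | |
"```\n", | |
"\n", | |
"Because both objects are stored as one list, the loading side has to unpack them in the same `[y, x]` order." | |
] | |
}, | |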
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "6BY63bECE6qf" | |
}, | |
"source": [ | |
"## TF-IDF를 나이브 베이즈로 학습시키기\n", | |
"- 머신러닝 대표 평가 지표 : precision, recall, f1-score\n", | |
"- 참조 : https://gaussian37.github.io/ml-concept-ml-evaluation/" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "Ece10yE0E-R_", | |
"outputId": "f9d8ba65-9142-46e5-bd47-18aff3d3936b" | |
}, | |
"source": [ | |
"import pickle\n", | |
"from sklearn.naive_bayes import GaussianNB\n", | |
"from sklearn.model_selection import train_test_split\n", | |
"import sklearn.metrics as metrics\n", | |
"import numpy as np\n", | |
"\n", | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"\n", | |
"path = \"/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/\"\n", | |
"\n", | |
"# TF-IDF 데이터베이스 읽어 들이기 --- (*1)\n", | |
"data = pickle.load(open(path+\"genre.pickle\", \"rb\"))\n", | |
"y = data[0] # 레이블\n", | |
"x = data[1] # TF-IDF\n", | |
"\n", | |
"# 학습 전용과 테스트 전용으로 구분하기 --- (*2)\n", | |
"x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)\n", | |
"\n", | |
"# 나이브 베이즈로 학습하기 --- (*3)\n", | |
"model = GaussianNB()\n", | |
"model.fit(x_train, y_train)\n", | |
"\n", | |
"# 평가하고 결과 출력하기 --- (*4)\n", | |
"y_pred = model.predict(x_test)\n", | |
"acc = metrics.accuracy_score(y_test, y_pred)\n", | |
"rep = metrics.classification_report(y_test, y_pred)\n", | |
"\n", | |
"print(\"정답률=\", acc)\n", | |
"print(rep)" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n", | |
"정답률= 0.8265625\n", | |
" precision recall f1-score support\n", | |
"\n", | |
" 0 0.90 0.87 0.88 167\n", | |
" 1 0.86 0.74 0.80 164\n", | |
" 2 0.76 0.90 0.82 165\n", | |
" 3 0.80 0.80 0.80 144\n", | |
"\n", | |
" accuracy 0.83 640\n", | |
" macro avg 0.83 0.83 0.83 640\n", | |
"weighted avg 0.83 0.83 0.83 640\n", | |
"\n" | |
] | |
} | |
] | |
}, | |
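{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"`classification_report` lists precision, recall, and f1-score for each label. As a rough illustration of what those columns mean, the sketch below computes them for a single label from plain Python lists. The helper name `prf_for_label` and the toy labels are assumptions made for this example; scikit-learn's own implementation is more general.\n", | |
"\n", | |
"```python\n", | |
"def prf_for_label(y_true, y_pred, label):\n", | |
"    # Count true positives, false positives, and false negatives for one label\n", | |
"    pairs = list(zip(y_true, y_pred))\n", | |
"    tp = sum(1 for t, p in pairs if t == label and p == label)\n", | |
"    fp = sum(1 for t, p in pairs if t != label and p == label)\n", | |
"    fn = sum(1 for t, p in pairs if t == label and p != label)\n", | |
"    # precision: share of the predictions of this label that were correct\n", | |
"    precision = tp / (tp + fp) if tp + fp else 0.0\n", | |
"    # recall: share of the true members of this label that were found\n", | |
"    recall = tp / (tp + fn) if tp + fn else 0.0\n", | |
"    # f1: harmonic mean of precision and recall\n", | |
"    total = precision + recall\n", | |
"    f1 = 2 * precision * recall / total if total else 0.0\n", | |
"    return precision, recall, f1\n", | |
"\n", | |
"y_true = [0, 0, 1, 1, 2, 2]\n", | |
"y_pred = [0, 1, 1, 1, 2, 0]\n", | |
"print(prf_for_label(y_true, y_pred, 1))\n", | |
"```\n", | |
"\n", | |
"A label can have high recall and lower precision at the same time, as label 2 does in the report, when the model assigns that label too generously." | |
] | |
}, | |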
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "r-kxf8DuasU8" | |
}, | |
"source": [ | |
"## 딥러닝으로 정답률 개선하기\n", | |
"- scikit-learn에서 딥러닝으로 변경하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "uioQwBScayCL", | |
"outputId": "62e3caac-8a33-45b2-e28b-50feeae9aa49" | |
}, | |
"source": [ | |
"import pickle\n", | |
"from sklearn.model_selection import train_test_split\n", | |
"import sklearn.metrics as metrics\n", | |
"import keras\n", | |
"from keras.models import Sequential\n", | |
"from keras.layers import Dense, Dropout\n", | |
"from tensorflow.keras.optimizers import RMSprop\n", | |
"import matplotlib.pyplot as plt\n", | |
"import numpy as np\n", | |
"import h5py\n", | |
"\n", | |
"# 분류할 레이블 수 --- (*1)\n", | |
"nb_classes = 4\n", | |
"\n", | |
"# 데이터베이스 읽어 들이기 --- (*2)\n", | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"path = \"/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/\"\n", | |
"data = pickle.load(open(path+\"genre.pickle\", \"rb\"))\n", | |
"y = data[0] # 레이블\n", | |
"x = data[1] # TF-IDF\n", | |
"# 레이블 데이터를 One-hot 형식으로 변환하기 --- (*3)\n", | |
"y = keras.utils.np_utils.to_categorical(y, nb_classes)\n", | |
"in_size = x[0].shape[0]\n", | |
"\n", | |
"# 학습 전용과 테스트 전용으로 구분하기 --- (*4)\n", | |
"x_train, x_test, y_train, y_test = train_test_split(\n", | |
" np.array(x), np.array(y), test_size=0.2)\n", | |
"\n", | |
"# MLP모델의 구조 정의하기 --- (*5)\n", | |
"model = Sequential()\n", | |
"model.add(Dense(512, activation='relu', input_shape=(in_size,)))\n", | |
"model.add(Dropout(0.2))\n", | |
"model.add(Dense(512, activation='relu'))\n", | |
"model.add(Dropout(0.2))\n", | |
"model.add(Dense(nb_classes, activation='softmax'))\n", | |
"\n", | |
"# 모델 컴파일하기 --- (*6)\n", | |
"model.compile(\n", | |
" loss='categorical_crossentropy',\n", | |
" optimizer=RMSprop(),\n", | |
" metrics=['accuracy'])\n", | |
"\n", | |
"# 학습 실행하기 --- (*7)\n", | |
"hist = model.fit(x_train, y_train,\n", | |
" batch_size=128, \n", | |
" epochs=20,\n", | |
" verbose=1,\n", | |
" validation_data=(x_test, y_test))\n", | |
"\n", | |
"# 평가하기 ---(*8)\n", | |
"score = model.evaluate(x_test, y_test, verbose=1)\n", | |
"print(\"정답률=\", score[1], 'loss=', score[0])\n", | |
"\n", | |
"# 가중치데이터 저장하기 --- (*9)\n", | |
"model.save_weights('./genre-model.hdf5')\n", | |
"model.save_weights(path + 'genre-model.hdf5')\n" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n", | |
"Epoch 1/20\n", | |
"20/20 [==============================] - 7s 318ms/step - loss: 0.8204 - accuracy: 0.7454 - val_loss: 0.4381 - val_accuracy: 0.8500\n", | |
"Epoch 2/20\n", | |
"20/20 [==============================] - 6s 289ms/step - loss: 0.2233 - accuracy: 0.9320 - val_loss: 0.3576 - val_accuracy: 0.8766\n", | |
"Epoch 3/20\n", | |
"20/20 [==============================] - 6s 292ms/step - loss: 0.0966 - accuracy: 0.9699 - val_loss: 0.3964 - val_accuracy: 0.8797\n", | |
"Epoch 4/20\n", | |
"20/20 [==============================] - 6s 288ms/step - loss: 0.0437 - accuracy: 0.9879 - val_loss: 0.3888 - val_accuracy: 0.8813\n", | |
"Epoch 5/20\n", | |
"20/20 [==============================] - 6s 287ms/step - loss: 0.0193 - accuracy: 0.9957 - val_loss: 0.4514 - val_accuracy: 0.8734\n", | |
"Epoch 6/20\n", | |
"20/20 [==============================] - 6s 289ms/step - loss: 0.0087 - accuracy: 0.9977 - val_loss: 0.4943 - val_accuracy: 0.8828\n", | |
"Epoch 7/20\n", | |
"20/20 [==============================] - 6s 287ms/step - loss: 0.0070 - accuracy: 0.9988 - val_loss: 0.5501 - val_accuracy: 0.8797\n", | |
"Epoch 8/20\n", | |
"20/20 [==============================] - 6s 289ms/step - loss: 0.0050 - accuracy: 0.9973 - val_loss: 0.5541 - val_accuracy: 0.8734\n", | |
"Epoch 9/20\n", | |
"20/20 [==============================] - 6s 306ms/step - loss: 0.0038 - accuracy: 0.9980 - val_loss: 0.5626 - val_accuracy: 0.8828\n", | |
"Epoch 10/20\n", | |
"20/20 [==============================] - 6s 288ms/step - loss: 0.0016 - accuracy: 0.9992 - val_loss: 0.6297 - val_accuracy: 0.8859\n", | |
"Epoch 11/20\n", | |
"20/20 [==============================] - 6s 287ms/step - loss: 0.0027 - accuracy: 0.9992 - val_loss: 0.6161 - val_accuracy: 0.8734\n", | |
"Epoch 12/20\n", | |
"20/20 [==============================] - 6s 289ms/step - loss: 0.0028 - accuracy: 0.9984 - val_loss: 0.6291 - val_accuracy: 0.8781\n", | |
"Epoch 13/20\n", | |
"20/20 [==============================] - 6s 290ms/step - loss: 7.0816e-04 - accuracy: 1.0000 - val_loss: 0.6514 - val_accuracy: 0.8828\n", | |
"Epoch 14/20\n", | |
"20/20 [==============================] - 6s 288ms/step - loss: 0.0012 - accuracy: 0.9992 - val_loss: 0.6843 - val_accuracy: 0.8766\n", | |
"Epoch 15/20\n", | |
"20/20 [==============================] - 6s 289ms/step - loss: 8.6862e-04 - accuracy: 0.9996 - val_loss: 0.6957 - val_accuracy: 0.8781\n", | |
"Epoch 16/20\n", | |
"20/20 [==============================] - 6s 289ms/step - loss: 3.6986e-04 - accuracy: 1.0000 - val_loss: 0.7144 - val_accuracy: 0.8828\n", | |
"Epoch 17/20\n", | |
"20/20 [==============================] - 6s 288ms/step - loss: 3.6865e-04 - accuracy: 1.0000 - val_loss: 0.7641 - val_accuracy: 0.8750\n", | |
"Epoch 18/20\n", | |
"20/20 [==============================] - 6s 289ms/step - loss: 9.1236e-04 - accuracy: 0.9996 - val_loss: 0.8215 - val_accuracy: 0.8844\n", | |
"Epoch 19/20\n", | |
"20/20 [==============================] - 6s 289ms/step - loss: 2.1180e-04 - accuracy: 1.0000 - val_loss: 0.7527 - val_accuracy: 0.8766\n", | |
"Epoch 20/20\n", | |
"20/20 [==============================] - 6s 288ms/step - loss: 3.1430e-04 - accuracy: 1.0000 - val_loss: 0.7906 - val_accuracy: 0.8750\n", | |
"20/20 [==============================] - 1s 32ms/step - loss: 0.7906 - accuracy: 0.8750\n", | |
"정답률= 0.875 loss= 0.7905627489089966\n" | |
] | |
} | |
] | |
}, | |
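{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"`to_categorical` turns each integer label into a one-hot row, which is the target format that `categorical_crossentropy` expects. The sketch below reproduces that conversion with plain lists to show the idea; `one_hot` is a hypothetical helper written for this illustration and is not part of Keras.\n", | |
"\n", | |
"```python\n", | |
"def one_hot(labels, nb_classes):\n", | |
"    # Build one row per label with a single 1.0 at the label's index\n", | |
"    rows = []\n", | |
"    for label in labels:\n", | |
"        row = [0.0] * nb_classes\n", | |
"        row[label] = 1.0\n", | |
"        rows.append(row)\n", | |
"    return rows\n", | |
"\n", | |
"encoded = one_hot([0, 3, 1], 4)\n", | |
"print(encoded)\n", | |
"\n", | |
"# The index of the largest value in each row recovers the integer label\n", | |
"decoded = [row.index(max(row)) for row in encoded]\n", | |
"print(decoded)\n", | |
"```\n", | |
"\n", | |
"The decoding step is the same idea as taking the `argmax` of a softmax output to get a predicted class number." | |
] | |
}, | |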
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 281 | |
}, | |
"id": "xiIZ3GAobbX1", | |
"outputId": "7e014aec-9085-4bc4-fdcf-cfb2679b769e" | |
}, | |
"source": [ | |
"# 학습 상태를 그래프로 그리기 --- (*10)\n", | |
"plt.plot(hist.history['accuracy'])\n", | |
"plt.plot(hist.history['val_accuracy'])\n", | |
"plt.title('Accuracy')\n", | |
"plt.legend(['train', 'test'], loc='upper left')\n", | |
"plt.show()" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "display_data", | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAEICAYAAABRSj9aAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3deXxcdb3/8dcnySRpmrRN03RfoaWUzQKxrAooSwtXQBEERPGqFze4eK8beBUEr9fleq+4I3grLuy4VS1QUBB/sqZS9u5QmjZp06RJmiaTZDLf3x/fk2SSJs20mcwkZ97Px2MeOXPOmZnPnMy853u+ZzPnHCIiEl45mS5ARESGl4JeRCTkFPQiIiGnoBcRCTkFvYhIyCnoRURCTkEvIhJyCnoJFTN73Mx2m1lBpmsRGSkU9BIaZjYXeBvggPPT+Lp56XotkYOhoJcw+SDwNHAHcGXXSDObZWa/MbNaM6szsx8kTPsXM3vNzPaY2atmdlww3pnZ/IT57jCz/wyGTzezKjP7gpnVAD8zs1Iz+2PwGruD4ZkJj59oZj8zs+3B9N8F4182s3clzBcxs11mduywLSXJOgp6CZMPAncGt3PMbIqZ5QJ/BLYAc4EZwD0AZnYx8JXgcePwawF1Sb7WVGAiMAe4Cv9d+llwfzbQCvwgYf5fAkXAkcBk4DvB+F8AVyTMdy5Q7Zx7Psk6RAZlOteNhIGZnQo8Bkxzzu0ys7XAT/At/BXB+FifxzwMrHTOfbef53PAAufcxuD+HUCVc+5LZnY6sAoY55yLDlDPYuAx51ypmU0DtgFlzrndfeabDqwDZjjnmszsAeBZ59y3DnphiPShFr2ExZXAKufcruD+XcG4WcCWviEfmAVsOsjXq00MeTMrMrOfmNkWM2sCngAmBGsUs4D6viEP4JzbDvwduMjMJgDL8GskIimjjUgy6pnZGOASIDfoMwcoACYAO4DZZpbXT9hvBQ4d4Glb8F0tXaYCVQn3+64KfwZYCJzgnKsJWvTPAxa8zkQzm+Cca+jntX4OfBT/fXzKObdt4HcrcuDUopcwuBDoBI4AFge3RcDfgmnVwDfMbKyZFZrZKcHjfgp81syON2++mc0Jpq0BLjezXDNbCpw2SA0l+H75BjObCNzYNcE5Vw08CPwo2GgbMbO3Jzz2d8BxwLX4PnuRlFLQSxhcCfzMOfemc66m64bfGHoZ8C5gPvAmvlX+PgDn3P3A1/DdPHvwgTsxeM5rg8c1AO8Ppu3PLcAYYBd+u8BDfaZ/AOgA1gI7gU93TXDOtQK/BuYBvznA9y4yKG2MFRkBzOwG4DDn3BWDzixygNRHL5JhQVfPR/CtfpGUU9eNSAaZ2b/gN9Y+6Jx7ItP1SDip60ZEJOTUohcRCbkR10c/adIkN3fu3EyXISIyqqxevXqXc668v2kjLujnzp1LZWVlpssQERlVzGzLQNPUdSMiEnIKehGRkFPQi4iE3Ijro+9PR0cHVVVVRKP9nhE2VAoLC5k5cyaRSCTTpYhISIyKoK+qqqKkpIS5c+diZpkuZ9g456irq6Oqqop58+ZluhwRCYlBu27MbLmZ7TSzlweYbmb2PTPbaGYvdl2KLZh2pZltCG5X9vf4ZESjUcrKykId8gBmRllZWVasuYhI+iTTR38HsHQ/05cBC4LbVcCPofv8HTcCJwBLgBvNrPRgCw17yHfJlvcpIukzaNeNc+4JM5u7n1kuAH7h/LkUnjazCcGl004HHnHO1QOY2SP4H4y7h1q0SDZri3XS0NJB/d52dre0dw83tLTjHIzJz6UgksuY4FYYyfF/83MpzMtlTH7PtIJIDgV5OfttYMTjjo54nFinI9bphzvjjo7OYFw8TkfCtK5xfaf1DMeJxXv+do3riDsY4ilZzIy8HCMvN4dIbt/hHPIS/kZ6DeeQl+OXQWyA97bve+pdfyzuGOopZaaOH8PlJ8we0nP0Jx
V99DPwJ2XqUhWMG2j8PszsKvzaALNnp/5NpkJDQwN33XUXn/zkJw/oceeeey533XUXEyZMGKbKhl+sM05zW4w90RhN0Q6ao37Yj+tgTzCtORqjMJLDhKJ8Jo7Np7QoQmlRPqVj85kQDEdyh29HL+ccbbF4r9qaozGa+tzvqndPtGOfeWPxoX1RI7k5PlgTQrVwkMAtjORQGPHztcXi7O4T4LtbgtveDhpa2tnb3pmiJeaZEdSRiwEdnUGQB2E8xEVywLUMRaZP3TXU+hfPmjBig37InHO3AbcBVFRUjMizrDU0NPCjH/1on6CPxWLk5Q28GFeuXDncpR2QWGec+r3t7NzTRm3XrblnuLG1Y5/wbu0YPFjycoziwjyiHZ1EO+IDzldSkEdp8CPQ9YMwoSjCxKJ8JozNByDa3klrRyfRjoS/7f55W4NxbcHf1uD1uh6TTFAXRnIoLogwrjCP4sI8SgrzmFQ8luKCCPl5B/9NdQ46Ol2wDHw9zW0xdjW397yHmP/bFht4GXUvq8K8YPnkU15cwGGTS4Jl1nvZlSYM55h1v3Zb1/JKWJ7RhGXWNT5xWTpHd+s2r/uv9RrX3RIOpuXl5hDpmj/XiOTkkJsTtJiDx0USpuUltLT7jhtq16VzLlhbSFwLiXf/aHVPS1ir6EhorWP+s5yXk9Nv/d3LYJjqHy6pCPpt+Isfd5kZjNuG775JHP94Cl4vI6677jo2bdrE4sWLiUQiFBYWUlpaytq1a1m/fj0XXnghW7duJRqNcu2113LVVVcBPad0aG5uZtmyZZx66qk8+eSTzJgxg9///veMGTMmZTXWNEZ5o25vrwDf2ZQY5FHq9rb32+opKcyjvKSACWN8iMycWOSDsCCPksJI8LfrFqEkYVpJYV6v1f/W9s5erVDfOm2nPhj2Nz+8qbaZhpYOmtv2vXZ3JNe6W7qJ3RCFkVxKx+YzLaHF3DU+sc7igp46xxVGKA6G8/Myf/hIPO6IxhJ+vNp9CBfk5fg1oDER8g5y7SeSm0NJYXbunmvmf2AiuTCG3EyXM2KkIuhXAFeb2T34Da+NzrlqM3sY+K+EDbBnA9cP9cVu+sMrvLq9aahP08sR08dx47uO3O883/jGN3j55ZdZs2YNjz/+OOeddx4vv/xy926Qy5cvZ+LEibS2tvLWt76Viy66iLKysl7PsWHDBu6++25uv/12LrnkEn79619zxRVDu6BQS3uMlS/VcH/lVp55vb7XtEiuUV5cQHlJATMmFLJ41njKSwopLymgvLiAyeMKuqcXRlL3pRiTn8uY/DFMn5D8j1hbrJPGlg7MrDu0h7ObJ9Nycoyi/DyK8jNdiWSDQYPezO7Gt8wnmVkVfk+aCIBz7lZgJXAusBFoAf45mFZvZl8Fngue6uauDbNhsGTJkl77un/ve9/jt7/9LQBbt25lw4YN+wT9vHnzWLx4MQDHH388b7zxxkG9tnOOyi27ub9yK396sZq97Z3MLSvis2cfxuJZpd0BPn5MhJyckbkq2VdBXi6Tx6kFJjIcktnr5rJBpjvgUwNMWw4sP7jS+jdYyztdxo4d2z38+OOP8+ijj/LUU09RVFTE6aef3u++8AUFBd3Dubm5tLa2HtBrVje28pt/bOOB1VW8vmsvY/NzOe+YaVxcMYuKOaUjtn9QRDJrRGyMHQ1KSkrYs2dPv9MaGxspLS2lqKiItWvX8vTTT6fsdaMdnTz62g7ur6zibxtqiTs4Yd5EPnXGfJYdNZWxBfoXisj+KSWSVFZWximnnMJRRx3FmDFjmDJlSve0pUuXcuutt7Jo0SIWLlzIiSeeOOTXe6mqkftXb+X3a7bT2NrB9PGFXH3GfC46fiZzysYO/gQiIoERd83YiooK1/fCI6+99hqLFi3KUEXpE+uMs7ulg1defZUP/XY7BXk5LD1qKu89fiYnHzqJ3FHS3y4i6Wdmq51zFf1NU4t+hGhq7eDN+hbizmFmfO3dR/FPx0xn/Jjs3E1ORF
JHQT8C1DW3sb2hlcJILrMmFvH6ngJOWzQn02WJSEgo6DPIOUdNU5TaPW2UFEaYPbFI3TMiknIK+gyJO0dVfSsNre2Ujc1n+oQx2j1SRIaFgj4DYp1xttS1sLc9xtTxhZQXFyjkRWTYKOjTrC3WyRu7WmjvjDN7YhETdAy8iAyz8J5MJMW6zl55MG655RZaWlpoaY+xaedeYvE4h0waq5AXkbRQ0CdpqEFfU9fI5tq95OTAoeXFOqJVRNJGaZOkxNMUn3XWWUyePJn77ruPtrY23v3ud3PTTTexd+9eLrnkEqqqqujs7OTLX/4yO3bsYPv27Zxz1jspK5vE3554PNRnZRSRkWf0Bf2D10HNS6l9zqlHw7Jv7HeWxNMUr1q1igceeIBnn30W5xznn38+TzzxBLW1tUyfPp0//elPgF8LaKGAb/73t7n39w+yeMHsUXM2SREJDzUtD8KqVatYtWoVxx57LMcddxxr165lw4YNHH300TzyyCN84Qtf4K9/fYLGzgi7mtvIMWPWxCKFvIhkxOhr0Q/S8k4H5xzXX389H/vYx/aZ9o9//IM//PGPfO76L1Jx8tu56cYbyR3BlxgTkfBTiz5JiacpPuecc1i+fDnNzc0AbNu2jZ07d7J9+3ZyIwWccPaFXPmxa9iy7mXKSwr2e4pjEZHhNvpa9BmSeJriZcuWcfnll3PSSScBUFxczK9+9Stefm0dn//85zDLZeyYfH5y660AXHXVVSxdupTp06fz2GOPZfJtiEgW0mmKU6Q9Fmf9jj1EcnOYW1ZEwRCuwToa3q+IjCw6TXEa7GiK4oB5k4rIz9O1T0Vk5FAffQq0tneyu6WdScX5CnkRGXFGTdCPtC6mRDVNUXJzjPLigsFnHsRIfp8iMjqNiqAvLCykrq5uRIZgc7SDPdEOJpcUkDfEI16dc9TV1VFYWJii6kRERkkf/cyZM6mqqqK2tjbTpfTiHNTuiRJ3kNdUwK4U7CtfWFjIzJkzU1CdiIg3KoI+Eokwb968TJexj9+v2ca1v93M/17yFs44QuEsIiPTqOi6GYnaYp18e9U6Fk0bx4WLZ2S6HBGRASnoD9KdT7/J1vpWrlt2uM5hIyIjmoL+IDRFO/j+XzZwyvwy3r5gUqbLERHZLwX9Qbj18U3sbunguqWLdLIyERnxFPQHqKYxyvK/v84Fi6dz9MzxmS5HRGRQCvoD9J1H1hOPw2fPXpjpUkREkpJU0JvZUjNbZ2Ybzey6fqbPMbM/m9mLZva4mc1MmNZpZmuC24pUFp9u63fs4f7VW7nixDnMmliU6XJERJIy6H70ZpYL/BA4C6gCnjOzFc65VxNm+zbwC+fcz83sHcDXgQ8E01qdc4tTXHdGfPPBtYzNz+Pqd8zPdCkiIklLpkW/BNjonNvsnGsH7gEu6DPPEcBfguHH+pk+6j2zuY4/r93JJ844lIlj8zNdjohI0pIJ+hnA1oT7VcG4RC8A7wmG3w2UmFlZcL/QzCrN7Gkzu7C/FzCzq4J5KkfaaQ7An4Pmvx5cy9RxhXz4lJF3hK6IyP6kamPsZ4HTzOx54DRgG9AZTJsTnAz/cuAWMzu074Odc7c55yqccxXl5eUpKil1Vr5UwwtbG/j3sw6jcAgXFBERyYRkznWzDZiVcH9mMK6bc247QYvezIqBi5xzDcG0bcHfzWb2OHAssGnIladJR2ec/354LYdNKeai43U+GxEZfZJp0T8HLDCzeWaWD1wK9Np7xswmmVnXc10PLA/Gl5pZQdc8wClA4kbcEe/uZ9/kjboWvrD0cHJ1qgMRGYUGDXrnXAy4GngYeA24zzn3ipndbGbnB7OdDqwzs/XAFOBrwfhFQKWZvYDfSPuNPnvrjGjNbTG+++gGTpg3kXccPjnT5YiIHJSkTlPsnFsJrOwz7oaE4QeAB/p53JPA0UOsMWNu++sm6va283/n6lQHIjJ66cjYAexsinL7317nvKOnsXjWhEyXIyJy0BT0A7jlzxvo6I
zzuXN0qgMRGd0U9P3YVNvMvc9t5fITZjN30thMlyMiMiQK+n5866G1FObl8K/vXJDpUkREhkxB30flG/U8/MoOPnbaoUwqLsh0OSIiQ6agT+Cc4+sPrqW8pICPvk2nOhCRcFDQJ1j16g5Wb9nNp89cQFF+UnueioiMeAr6QDzu+NZDazmkfCzvq5g1+ANEREYJBX2guinKptq9fOjkueTlarGISHgo0QI1ja0AunKUiISOgj6wvSEKwLTxhRmuREQktRT0gZrGrqAfk+FKRERSS0EfqG6MUpSfy7hC7W0jIuGioA/UNLUydXyhzlIpIqGjoA9sb4gyXd02IhJCCvpATWOUqdoQKyIhpKAHYp1xdu6Jao8bEQklBT2wc08bcac9bkQknBT0+D1uQPvQi0g4Kejp2YdeffQiEkYKeqA6OP2B9roRkTBS0OO7bsZEchk3RgdLiUj4KOjxXTfTdLCUiISUgh7Y3tiq/nkRCS0FPV0tevXPi0g4ZX3Q+4Ol2rRrpYiEVtYHfW1zG51xp64bEQmtrA/6roOlpk9Q0ItIOGV90HcfLDVOffQiEk5ZH/TbG/zBUuqjF5GwSirozWypma0zs41mdl0/0+eY2Z/N7EUze9zMZiZMu9LMNgS3K1NZfCrUNEYpjOQwoSiS6VJERIbFoEFvZrnAD4FlwBHAZWZ2RJ/Zvg38wjl3DHAz8PXgsROBG4ETgCXAjWZWmrryh666ye9aqYOlRCSskmnRLwE2Ouc2O+fagXuAC/rMcwTwl2D4sYTp5wCPOOfqnXO7gUeApUMvO3VqGqNMHaduGxEJr2SCfgawNeF+VTAu0QvAe4LhdwMlZlaW5GMxs6vMrNLMKmtra5OtPSWqG1qZpj1uRCTEUrUx9rPAaWb2PHAasA3oTPbBzrnbnHMVzrmK8vLyFJU0uM64Y4cOlhKRkEvmdI3bgFkJ92cG47o557YTtOjNrBi4yDnXYGbbgNP7PPbxIdSbUru6D5bSrpUiEl7JtOifAxaY2TwzywcuBVYkzmBmk8ys67muB5YHww8DZ5tZabAR9uxg3IjQtWvldLXoRSTEBg1651wMuBof0K8B9znnXjGzm83s/GC204F1ZrYemAJ8LXhsPfBV/I/Fc8DNwbgRQVeWEpFskNSVNpxzK4GVfcbdkDD8APDAAI9dTk8Lf0TpuVasum5EJLyy+sjY6sZWCvJyKNXBUiISYlke9LqylIiEX1YHfU1jVP3zIhJ6WR301bqylIhkgaQ2xoZRZ9yxoymqg6WyUWcH7N4CdRtg1wb/t24T1L8Os5bAOV+D8TMHfx6RUSJrg76uuY1Y3Cnow8o52FsbBPnGINSDv7vfgHisZ96iMihbAHNOgrUrYcMqOO3zcOKnIC8/Y29B0qhuEzx7O8Q74JhLYWYFhGjbXdYG/fbufejVdQNASz28eB9YDkya74Nv3AzIGSW9e9EmeOFu2LbaB/uujdDW2DM9twDKDoXJR8Ci82HSAv8eyw6Fook98+3eAg9dD49+BdbcDed9G+a9Pe1v54DE2qD6Bah6DgonwGHnwNhJma4qebF2qHnJ1z9mAix6F+SPTc9rb30WnvwevPZHyI2A5cJzP4VJC2Hx5fCWS6FkanpqGUZZG/Q1jSPggiPxuA+jlnp/izbCtGOgeHL6atj9Bjz1I3j+l9DR0nta3hgfhGXze4Jx0nx/v3B8+mrcn8Zt8MyPYfXPoa3J/ziVzYdjLg6CfL6vefwsyMkd/PlK58Bld8H6h2Hl5+Dn74KjL4az/3PkfOGbd8LWZ4Lbs7D9eehsT5jBfBfUwmVw2DIoXziyWqd766Dq2Z76t62GWLRn+p8+A0deCIuvgNknpr72eCesexCe/D5sfdr/OL7tM7DkKoiMgVd+C2vuhEdvhD/fDPPPhGPf75flKF3DM+dcpmvopaKiwlVWVg776/zs769z0x9eZfWXzqSsuCA1T9pS77sLWuqhpQ5a6/sZrg+G66B1N7h47+fIyY
MFZ8Pi9/uWWe4w7eO/bbX/oL/6e9+KOfpiOOlTviXY1W/d1dVRt9H/ICTWOnZyEP7ze/8QTDwkPWsBNS/Bkz+Alx/w3TRHXggnXQ0zjkvda3S0wv/7jr/lFcIZX4S3/gvkprF9FO+Ena/1hOLWZ2D3635abj5MPxZmneBvM98Ke6ph/UOwbqVv5QOUzvOhv3AZzD5p+D5T/dYfh13retdft9FPy4nAtLcE9S/xt91vwPN3+rDt2AsTDw1a1pfB+H1OfHtgOlr9Wt+TP4D6TTBhtv/MLH4/FBTvO/+uDT7wX7jHL9cxE+GYS/z8044ZWi3DwMxWO+cq+p2WrUH/9ZWv8bMn32DdV5ce2H70sTao3xx0D2zo/bd1gLM75Bb47oGiMhhT6v8WTfQfnK7hojLfmlj/MLx4LzTvgKJJcMz7fGtiypFDf9PxuO9/fvL7sOX/QcE4qPhnOOHjMG76IO+73QdM98bLjT0/BC11PfONnex/oBYug0NOT+0quHOw+TFf/6a/QGQsHH+lr790Tupep6+6TfDg52HjozDlaDjvf2D2CcPzWtEm2FbZE4pVlX5NBfyynX1CT7BPewvk7aeR0rjNh/76h2DzX6Gzza+JzT/L/3/mn+m7SlKprdk3Irrrf9avqYL/jHeH+gn+RyoyQNdpW7NvhKy5E7b83XcpHnKG/y4sPA8iB7AmvneX74559jb/WZ1+LJz8r74LL5kf7Xin/7w9/yv/A9rZDlOP9mscR18MY8uSr2UYKej7cc3dz/NiVQN//dwZ+050zv+C92rZBqHW8Gbvlm3xlN5dGiXTeod5URlEig5s9bMz5kNlza9g3UN+A9G0xXDsFXDURb37lJPREYWX7vMtmV3rYNxMOOmTcOwHoHDcgT1Xf1rq/fKpXQubHvO1tzX5VvAhp8NhS/1t3LSDe/7ODnj5Nz7gd7zkl/kJH/c/UmPSdMEy5+C1P8BD10HTNv8lP+umofWFO+dbsIndMDteAZwPtslH9oTirCVQOvfguzHamv2P5Log+Ft2+bXHOSf7LomFy2DivH0fF2vvWRvtWhNtSVgj7bXGWtd7za98ka979on+PUw85ODqr98Ma+7y20yaqnxXy9Hv9S3r6ccO/Jx1m+CpH/ofi1jUfwZP/lf/ng92ObbUw0sP+OesXuPXShYu89/NQ9+Z3rW9PhT0/bj41ifJMePej53kR8Ta4OH/CFYtN/nVxi6Rop6+6rIFvbssUhGU+7O3zof083f6kMvNh8PP80Fz6Bn773duqYfK5fDMT2DvTt8KOfla380xnKvvsXZ480nfD7pupf9xBP+lXHiu/2JMOWrwL1u0CVbfAc/c6sO1/HA4+RrfitpfS3Y4tTXDE9/yAZJfDO+8AY7/UHL9/x1R352SGOx7d/ppBeP8nh5drfUZxw/fZyve6dcU1j/o/0e1a/348kW+e6Q7zOuhfc/AzxMpCtZKJ/asoZYdCrNOhJnHp/5HON4Jr//Vh/5rf/DhPfkIH/jHvA+Kg2tZbH0W/v5dWPsn/zk/5n3+c1O+MLX11Lzsa3nxXv/DWTzVb7gfO2nf5ZK4Fn8gayMHQEHfj1O/+Rcq5pRyy6XH+hF/+DSs/plfPSxf2LvvuWT6yNj7pPpF35J48T7fgiqZ7vcKWPx+v0bRpe8G1vln+g/6vNPSv1HOOd/HvG6lb0lWVQLOr1UsXAYLl8Lct/UO7r4bWOe+zbfE5p85Mv4PADvXwsrPwht/8z9g5/3vvtsH9rfRtHSeD/Surpjyw5P7sRgO9Zt7WvptTUEXY2K34sSE4CrrGR6o2yUdWhvgld/4BtC2ymDb1jk+cLc+41v9b/2o38BaMmV4a4m1+y7RNXf5xlhLPbQ3Dzx/ZOzAPwJdOxIcBAV9H/G4Y+GXH+SjbzuELyw93P+DfvcJHyZnf3VYXzslYm3+S/n8nbDxEb+qPOtEvzq75e+9N7CefHVq+vdTpXmn3w6x7kHfldDR4lvGh77DB/mWJ3
tvYD35Gh+kI5FzfjV+1X/491XxYb+su8J99xt+vr4bTWctSe+eVWG3c23QALrXdxee9KmBN7CmS6wtobtrPztkJA5HG/33+CMHd8kOBX0fO/dEWfK1P3PzBUfywXl74Kdn+j0WPvC7jPaxHZQ9NX6vgDV3wq71UDA+2MD6scE3sGZaRyu8/oQP/fUP+e0i6drAmkrRRnjs6/DsT/yP7tjJvfumB9toKgJ+21xHy0F32Sno+3ixqoHzf/B3lr9vAe944hLf1/exJ0Z3K8s5qF3n+1gLSjJdzYFzzvcVl0xL/Z4g6bJ7C7hO3y0zkvZbl6ywv6AfZc3X1NjeEMWIU/H89dC4FT60cnSHPPhgmXx4pqs4eGYweVGmqxia0bIGIlknK4O+prGVT+SuYNybj8LSbw7fPtEiIiPACNmFIb0Ktj7BZ/Luxx35Ht+XLSISYtkX9I1VvGvDl3kzZyZ2/vfVlyoioZddQR9rg/uuJDfezncn3ZDZ3a9ERNIku4L+4S/Ctkr+K3I1rmxBpqsREUmL7An6F+6F536KO/Fq7tl7nK4VKyJZIzuCfscr8IdrYc4p7Drpi3R0OqbrgiMikiXCH/TRRrj3Cn961vf+jJo9/hJyatGLSLYId9A7B7/7pD974sV3QMkUto+EK0uJiKRRuA+Y+vstsPaPcM7X/YWfgZrgWrHT1HUjIlkivC3615/w13s88t1w4ie6R1c3RonkGmVjR+e1H0VEDlQ4g75pOzzwYX9u5z4HRdU0tjJlXCE5OTpQSkSyQ/i6bmLtcN+V/hS4H/rTPmdy3N4Y1R43IpJVkmrRm9lSM1tnZhvN7Lp+ps82s8fM7Hkze9HMzg3GzzWzVjNbE9xuTfUb2MeqL/kLEl/wg34vHVbTGNUeNyKSVQZt0ZtZLvBD4CygCnjOzFY4515NmO1LwH3OuR+b2RHASmBuMG2Tc25xassewEsP+Is/nPgp3zffh3OOmsYo045S0ItI9kimRb8E2Oic2+ycawfuAS7oM48Dui6LMh7YnroSk1S3CVZcA7NPgrNu6n+Wve20d8a1a6WIZJVkgn4GsDXhflUwLtFXgCvMrArfmr8mYdq8oEvnr2b2tv5ewMyuMrNKM6usra1NvvpEE0Scq6IAAAyhSURBVObAqf/u95fPjfQ7S9eulVPVRy8iWSRVe91cBtzhnJsJnAv80sxygGpgtnPuWODfgbvMbJ8LIjrnbnPOVTjnKsrLyw+ugtw8OO1zUDJ1wFmqu/ehV4teRLJHMkG/DZiVcH9mMC7RR4D7AJxzTwGFwCTnXJtzri4YvxrYBBw21KIPVnXXUbETFPQikj2SCfrngAVmNs/M8oFLgRV95nkTeCeAmS3CB32tmZUHG3Mxs0OABcDmVBV/oKobo+TlGJPGFmSqBBGRtBt0rxvnXMzMrgYeBnKB5c65V8zsZqDSObcC+Axwu5n9G37D7Iecc87M3g7cbGYdQBz4uHOuftjezSBqGqM6WEpEsk5SB0w551biN7ImjrshYfhV4JR+Hvdr4NdDrDFltje0qn9eRLJOOE+BMICapijTJmiPGxHJLlkT9M45qhujatGLSNbJmqCv39tOeyzO1HEKehHJLlkT9F370E/XrpUikmWyJuh1VKyIZKusCfrqJh0VKyLZKXuCvqHVHyxVrIOlRCS7ZE3Qdx0slauDpUQky2RN0FfrgiMikqWyKOh1VKyIZKesCHodLCUi2Swrgr6hpYO2WFy7VopIVsqKoN8enId+ulr0IpKFsiLoew6WUtCLSPbJiqDvuYSgum5EJPtkSdC3kptjlJfoYCkRyT5ZEvRRppQU6GApEclKWRH0NTpYSkSyWFYEvd+HXv3zIpKdQh/0/mApHRUrItkr9EHf2NpBtCOurhsRyVqhD3rtWiki2S4Lgt4fFTtNlxAUkSyVBUGvK0uJSHYLfdDXNEbJMSjXlaVEJEuFPui3N/grS+Xlhv6tioj0K/TpV9PUqj1uRCSrhT7odc
EREcl2oQ565xzVDToqVkSyW1JBb2ZLzWydmW00s+v6mT7bzB4zs+fN7EUzOzdh2vXB49aZ2TmpLH4wTa0xWjs61aIXkayWN9gMZpYL/BA4C6gCnjOzFc65VxNm+xJwn3Pux2Z2BLASmBsMXwocCUwHHjWzw5xznal+I/2pbvL70KuPXkSyWTIt+iXARufcZudcO3APcEGfeRwwLhgeD2wPhi8A7nHOtTnnXgc2Bs+XFtUNOipWRCSZoJ8BbE24XxWMS/QV4Aozq8K35q85gMcOGx0sJSKSuo2xlwF3OOdmAucCvzSzpJ/bzK4ys0ozq6ytrU1RSVDT2OoPltKVpUQkiyUTxtuAWQn3ZwbjEn0EuA/AOfcUUAhMSvKxOOduc85VOOcqysvLk69+ENsbo5SXFBDRwVIiksWSScDngAVmNs/M8vEbV1f0medN4J0AZrYIH/S1wXyXmlmBmc0DFgDPpqr4wdTogiMiIoPvdeOci5nZ1cDDQC6w3Dn3ipndDFQ651YAnwFuN7N/w2+Y/ZBzzgGvmNl9wKtADPhUuva4AX/mysOmlKTr5URERqRBgx7AObcSv5E1cdwNCcOvAqcM8NivAV8bQo0HxV9ZKsrbD0tdV5CIyGgU2s7rpmiMlvZOpqvrRkSyXGiDvibYtVIHS4lItgtt0HdfWUpBLyJZLsRBHxwsNUFdNyKS3UId9GYwWQdLiUiWC23Q1zS2Ul6sg6VEREKbgtWNUXXbiIgQ9qAfpw2xIiKhDfqaxqh2rRQRIaRB3xTtoLktxvQJCnoRkVAGfc/BUuqjFxEJZdDrgiMiIj3CGfQNOipWRKRLOIO++2ApBb2ISCiDvqYxyqTiAvLzQvn2REQOSCiTcHtjq7ptREQCoQx6fwlBBb2ICIQ66LVrpYgIhDDo90Q72NMW01GxIiKB0AV9jfahFxHpJXRB33OwlLpuREQghEGvFr2ISG+hC/rtwbVip+gUxSIiQAiDXgdLiYj0Fro0rNY+9CIivYQw6HVUrIhIohAGvVr0IiKJQhX0zW0x9kRjuuCIiEiCUAV9TbDHjS4hKCLSI1RB33Ww1FTtWiki0i2poDezpWa2zsw2mtl1/Uz/jpmtCW7rzawhYVpnwrQVqSy+Lx0VKyKyr7zBZjCzXOCHwFlAFfCcma1wzr3aNY9z7t8S5r8GODbhKVqdc4tTV/LAqht80E8ZX5COlxMRGRWSadEvATY65zY759qBe4AL9jP/ZcDdqSjuQNU0tTKpOJ+CvNxMvLyIyIiUTNDPALYm3K8Kxu3DzOYA84C/JIwuNLNKM3vazC4c4HFXBfNU1tbWJln6vqobozo9sYhIH6neGHsp8IBzrjNh3BznXAVwOXCLmR3a90HOuduccxXOuYry8vKDfvHqhihTx6l/XkQkUTJBvw2YlXB/ZjCuP5fSp9vGObct+LsZeJze/fcpVd3Yql0rRUT6SCbonwMWmNk8M8vHh/k+e8+Y2eFAKfBUwrhSMysIhicBpwCv9n1sKuxti9EU1ZWlRET6GnSvG+dczMyuBh4GcoHlzrlXzOxmoNI51xX6lwL3OOdcwsMXAT8xszj+R+UbiXvrpFJbLM673jKdo6aPH46nFxEZtax3LmdeRUWFq6yszHQZIiKjipmtDraH7iNUR8aKiMi+FPQiIiGnoBcRCTkFvYhIyCnoRURCTkEvIhJyCnoRkZBT0IuIhNyIO2DKzGqBLUN4iknArhSVMxxU39CovqFRfUMzkuub45zr96yQIy7oh8rMKgc6OmwkUH1Do/qGRvUNzUivbyDquhERCTkFvYhIyIUx6G/LdAGDUH1Do/qGRvUNzUivr1+h66MXEZHewtiiFxGRBAp6EZGQG5VBb2ZLzWydmW00s+v6mV5gZvcG058xs7lprG2WmT1mZq+a2Stmdm0/85xuZo1mtia43ZCu+hJqeMPMXgpef58rvZj3vWAZvmhmx6WxtoUJy2aNmTWZ2af7zJPWZWhmy81sp5m9nD
Buopk9YmYbgr+lAzz2ymCeDWZ2ZRrr+28zWxv8/35rZhMGeOx+PwvDWN9XzGxbwv/w3AEeu9/v+zDWd29CbW+Y2ZoBHjvsy2/InHOj6oa/nOEm4BAgH3gBOKLPPJ8Ebg2GLwXuTWN904DjguESYH0/9Z0O/DHDy/ENYNJ+pp8LPAgYcCLwTAb/3zX4g0EytgyBtwPHAS8njPsWcF0wfB3wzX4eNxHYHPwtDYZL01Tf2UBeMPzN/upL5rMwjPV9BfhsEv///X7fh6u+PtP/B7ghU8tvqLfR2KJfAmx0zm12zrUD9wAX9JnnAuDnwfADwDvNzNJRnHOu2jn3j2B4D/AaMCMdr51iFwC/cN7TwAQzm5aBOt4JbHLODeVo6SFzzj0B1PcZnfg5+zlwYT8PPQd4xDlX75zbDTwCLE1Hfc65Vc65WHD3aWBmql83WQMsv2Qk830fsv3VF2THJcDdqX7ddBmNQT8D2Jpwv4p9g7R7nuCD3giUpaW6BEGX0bHAM/1MPsnMXjCzB83syLQW5jlglZmtNrOr+pmezHJOh0sZ+AuW6WU4xTlXHQzXAFP6mWekLMcP49fQ+jPYZ2E4XR10LS0foOtrJCy/twE7nHMbBpieyeWXlNEY9KOCmRUDvwY+7Zxr6jP5H/iuiLcA3wd+l+76gFOdc8cBy4BPmdnbM1DDfplZPnA+cH8/k0fCMuzm/Dr8iNxX2cz+A4gBdw4wS6Y+Cz8GDgUWA9X47pGR6DL235of8d+l0Rj024BZCfdnBuP6ncfM8oDxQF1aqvOvGcGH/J3Oud/0ne6ca3LONQfDK4GImU1KV33B624L/u4EfotfRU6UzHIebsuAfzjndvSdMBKWIbCjqzsr+Luzn3kyuhzN7EPAPwHvD36M9pHEZ2FYOOd2OOc6nXNx4PYBXjfTyy8PeA9w70DzZGr5HYjRGPTPAQvMbF7Q4rsUWNFnnhVA194N7wX+MtCHPNWC/rz/A15zzv3vAPNM7dpmYGZL8P+HdP4QjTWzkq5h/Ea7l/vMtgL4YLD3zYlAY0I3RboM2JLK9DIMJH7OrgR+3888DwNnm1lp0DVxdjBu2JnZUuDzwPnOuZYB5knmszBc9SVu83n3AK+bzPd9OJ0JrHXOVfU3MZPL74Bkemvwwdzwe4Ssx2+N/49g3M34DzRAIX51fyPwLHBIGms7Fb8K/yKwJridC3wc+Hgwz9XAK/g9CJ4GTk7z8jskeO0Xgjq6lmFijQb8MFjGLwEVaa5xLD64xyeMy9gyxP/gVAMd+H7ij+C3+/wZ2AA8CkwM5q0Afprw2A8Hn8WNwD+nsb6N+P7trs9h155o04GV+/sspKm+XwafrRfx4T2tb33B/X2+7+moLxh/R9dnLmHetC+/od50CgQRkZAbjV03IiJyABT0IiIhp6AXEQk5Bb2ISMgp6EVEQk5BLyIScgp6EZGQ+/+0/LxrL8votgAAAABJRU5ErkJggg==\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 1 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
} | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "0j9bBua6c7-j" | |
}, | |
"source": [ | |
"## 직접 문장을 지정해 판정하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "FRg8FXNFdkwE", | |
"outputId": "6a1517ca-69ec-4583-967c-09af99229dd6" | |
}, | |
"source": [ | |
"import pickle\n", | |
"import numpy as np\n", | |
"import keras\n", | |
"from keras.models import Sequential\n", | |
"from keras.layers import Dense, Dropout\n", | |
"from tensorflow.keras.optimizers import RMSprop\n", | |
"from keras.models import model_from_json\n", | |
"\n", | |
"# 텍스트 준비하기 --- ( ※ 1)\n", | |
"text1 = \"\"\"\n", | |
"대통령이 북한과 관련된 이야기로 한미 정상회담을 준비하고 있습니다.\n", | |
"\"\"\"\n", | |
"text2 = \"\"\"\n", | |
"iPhone과 iPad를 모두 가지고 다니므로 USB를 2개 연결할 수 있는 휴대용 배터리를 선호합니다.\n", | |
"\"\"\"\n", | |
"text3 = \"\"\"\n", | |
"이번 주에는 미세먼지가 많을 것으로 예상되므로 노약자는 외출을 자제하는 것이 좋습니다.\n", | |
"\"\"\"\n", | |
"\n", | |
"# TF-IDF 사전 읽어 들이기 --- (*2)\n", | |
"from google.colab import drive\n", | |
"drive.mount('/gdrive', force_remount=True)\n", | |
"path = \"/gdrive/My Drive/Colab Notebooks/파이썬을 이용한 머신러닝,딥러닝 실전앱개발/\"\n", | |
"#tfidf.load_dic(path + \"genre-tdidf.dic\")\n", | |
"load_dic(path + \"genre-tdidf.dic\")\n", | |
"\n", | |
"# Keras 모델 정의하고 가중치 데이터 읽어 들이기 --- (*3)\n", | |
"nb_classes = 4\n", | |
"model = Sequential()\n", | |
"#model.add(Dense(512, activation='relu', input_shape=(52800,)))\n", | |
"model.add(Dense(512, activation='relu', input_shape=(36120,)))\n", | |
"model.add(Dropout(0.2))\n", | |
"model.add(Dense(512, activation='relu'))\n", | |
"model.add(Dropout(0.2))\n", | |
"model.add(Dense(nb_classes, activation='softmax'))\n", | |
"model.compile(\n", | |
" loss='categorical_crossentropy',\n", | |
" optimizer=RMSprop(),\n", | |
" metrics=['accuracy'])\n", | |
"model.load_weights(path + 'genre-model.hdf5')\n", | |
"\n", | |
"# 텍스트 지정해서 판별하기 --- (*4)\n", | |
"def check_genre(text):\n", | |
" # 레이블 정의하기\n", | |
" LABELS = [\"정치\", \"경제\", \"생활 \", \"IT/과학\"]\n", | |
" # TF-IDF 벡터로 변환하기 -- (*5)\n", | |
"# data = tfidf.calc_text(text)\n", | |
" data = calc_text(text)\n", | |
" # MLP로 예측하기 --- (*6)\n", | |
" pre = model.predict(np.array([data]))[0]\n", | |
" n = pre.argmax()\n", | |
" print(LABELS[n], \"(\", pre[n], \")\")\n", | |
" return LABELS[n], float(pre[n]), int(n) \n", | |
"\n", | |
"if __name__ == '__main__':\n", | |
" check_genre(text1)\n", | |
" check_genre(text2)\n", | |
" check_genre(text3)\n", | |
"\n" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Mounted at /gdrive\n", | |
"정치 ( 1.0 )\n", | |
"IT/과학 ( 0.9999517 )\n", | |
"생활 ( 0.99997985 )\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "ZZekGxjDf55H" | |
}, | |
"source": [ | |
"# 6-4 웹에서 사용할 수 있는 뉴스 카테고리 판정 애플리케이션 만들기\n", | |
"- 콘솔에서 작업해야 하고\n", | |
"- 버전 오류로 추후 작업" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "fvUSlVnn6Jtp" | |
}, | |
"source": [ | |
"#6-5 머신러닝에 데이터베이스(RDBMS) 사용하기" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "HdwgqZqtXs1U" | |
}, | |
"source": [ | |
"## sqlite DB에 테이블 만들기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "IZbqDPnMgCgL" | |
}, | |
"source": [ | |
"import sqlite3\n", | |
"\n", | |
"dbpath = \"./hw.sqlite3\"\n", | |
"sql = '''\n", | |
" CREATE TABLE IF NOT EXISTS person (\n", | |
" id INTEGER PRIMARY KEY,\n", | |
" height NUMBER,\n", | |
" weight NUMBER,\n", | |
" typeNo INTEGER\n", | |
" )\n", | |
"'''\n", | |
"with sqlite3.connect(dbpath) as conn:\n", | |
" conn.execute(sql)" | |
], | |
"execution_count": null, | |
"outputs": [] | |
}, | |
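{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"To check that the table was created as intended, SQLite can report a table's columns through `PRAGMA table_info`. The sketch below uses an in-memory database (`:memory:`) instead of `hw.sqlite3` so that it leaves no file behind; that substitution is an assumption made for the example.\n", | |
"\n", | |
"```python\n", | |
"import sqlite3\n", | |
"\n", | |
"sql = '''\n", | |
"  CREATE TABLE IF NOT EXISTS person (\n", | |
"      id INTEGER PRIMARY KEY,\n", | |
"      height NUMBER,\n", | |
"      weight NUMBER,\n", | |
"      typeNo INTEGER\n", | |
"  )\n", | |
"'''\n", | |
"conn = sqlite3.connect(':memory:')\n", | |
"conn.execute(sql)\n", | |
"\n", | |
"# Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk)\n", | |
"columns = [row[1] for row in conn.execute('PRAGMA table_info(person)')]\n", | |
"print(columns)\n", | |
"conn.close()\n", | |
"```\n", | |
"\n", | |
"Since `id` is an `INTEGER PRIMARY KEY`, SQLite fills it in automatically when an insert leaves it out." | |
] | |
}, | |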
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "1KHjJJthYAmU" | |
}, | |
"source": [ | |
"## sqlite DB에 데이터 입력하기" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "-QY5qSvlYE7k", | |
"outputId": "1dc6d09e-4b29-45d3-c8ec-242f84224d0a" | |
}, | |
"source": [ | |
"import sqlite3\n", | |
"import random\n", | |
"\n", | |
"dbpath = \"./hw.sqlite3\"\n", | |
"\n", | |
"def insert_db(conn):\n", | |
" # 더미 데이터 만들기 --- (*1)\n", | |
" height = random.randint(130, 180)\n", | |
" weight = random.randint(30, 100)\n", | |
" # 더미 데이터를 기반으로 체형 데이터 생성하기 --- (*2)\n", | |
" type_no = 1\n", | |
" bmi = weight / (height / 100) ** 2\n", | |
" if bmi < 18.5:\n", | |
" type_no = 0\n", | |
" elif bmi < 25:\n", | |
" type_no = 1\n", | |
" elif bmi < 30:\n", | |
" type_no = 2\n", | |
" elif bmi < 35:\n", | |
" type_no = 3\n", | |
" elif bmi < 40:\n", | |
" type_no = 4\n", | |
" else:\n", | |
" type_no = 5\n", | |
" # 데이터베이스에 저장하기 --- (*3)\n", | |
" sql = '''\n", | |
" INSERT INTO person (height, weight, typeNo) \n", | |
" VALUES (?,?,?)\n", | |
" '''\n", | |
" values = (height,weight, type_no)\n", | |
" print(values)\n", | |
" conn.executemany(sql,[values])\n", | |
"\n", | |
"# 100개의 데이터 삽입하기\n", | |
"with sqlite3.connect(dbpath) as conn:\n", | |
" # 데이터 100개 삽입하기 --- (*4)\n", | |
" for i in range(100):\n", | |
" insert_db(conn)\n", | |
" # 확인하기 --- (*5)\n", | |
" c = conn.execute('SELECT count(*) FROM person')\n", | |
" cnt = c.fetchone()\n", | |
" print(cnt[0])\n" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"(173, 57, 1)\n", | |
"(171, 52, 0)\n", | |
"(159, 70, 2)\n", | |
"(141, 53, 2)\n", | |
"(174, 64, 1)\n", | |
"(132, 80, 5)\n", | |
"(132, 34, 1)\n", | |
"(173, 58, 1)\n", | |
"(151, 78, 3)\n", | |
"(151, 78, 3)\n", | |
"(151, 40, 0)\n", | |
"(164, 59, 1)\n", | |
"(178, 40, 0)\n", | |
"(171, 51, 0)\n", | |
"(158, 70, 2)\n", | |
"(164, 63, 1)\n", | |
"(137, 76, 5)\n", | |
"(148, 64, 2)\n", | |
"(143, 58, 2)\n", | |
"(130, 85, 5)\n", | |
"(147, 41, 1)\n", | |
"(136, 43, 1)\n", | |
"(175, 84, 2)\n", | |
"(158, 86, 3)\n", | |
"(131, 78, 5)\n", | |
"(167, 91, 3)\n", | |
"(153, 91, 4)\n", | |
"(178, 59, 1)\n", | |
"(140, 55, 2)\n", | |
"(159, 40, 0)\n", | |
"(149, 73, 3)\n", | |
"(166, 64, 1)\n", | |
"(135, 57, 3)\n", | |
"(164, 87, 3)\n", | |
"(179, 91, 2)\n", | |
"(144, 55, 2)\n", | |
"(146, 59, 2)\n", | |
"(165, 54, 1)\n", | |
"(175, 46, 0)\n", | |
"(160, 88, 3)\n", | |
"(158, 43, 0)\n", | |
"(171, 69, 1)\n", | |
"(145, 91, 5)\n", | |
"(179, 76, 1)\n", | |
"(158, 85, 3)\n", | |
"(145, 94, 5)\n", | |
"(158, 35, 0)\n", | |
"(131, 46, 2)\n", | |
"(139, 67, 3)\n", | |
"(145, 97, 5)\n", | |
"(145, 66, 3)\n", | |
"(164, 64, 1)\n", | |
"(145, 70, 3)\n", | |
"(151, 94, 5)\n", | |
"(159, 89, 4)\n", | |
"(156, 55, 1)\n", | |
"(131, 90, 5)\n", | |
"(166, 99, 4)\n", | |
"(146, 70, 3)\n", | |
"(165, 38, 0)\n", | |
"(180, 80, 1)\n", | |
"(179, 89, 2)\n", | |
"(148, 99, 5)\n", | |
"(155, 76, 3)\n", | |
"(168, 68, 1)\n", | |
"(169, 44, 0)\n", | |
"(169, 63, 1)\n", | |
"(137, 90, 5)\n", | |
"(171, 46, 0)\n", | |
"(152, 34, 0)\n", | |
"(149, 47, 1)\n", | |
"(140, 61, 3)\n", | |
"(158, 65, 2)\n", | |
"(148, 74, 3)\n", | |
"(176, 42, 0)\n", | |
"(135, 91, 5)\n", | |
"(177, 85, 2)\n", | |
"(167, 73, 2)\n", | |
"(154, 83, 3)\n", | |
"(148, 100, 5)\n", | |
"(150, 86, 4)\n", | |
"(171, 90, 3)\n", | |
"(177, 100, 3)\n", | |
"(159, 55, 1)\n", | |
"(170, 96, 3)\n", | |
"(177, 41, 0)\n", | |
"(130, 76, 5)\n", | |
"(177, 69, 1)\n", | |
"(139, 77, 4)\n", | |
"(147, 42, 1)\n", | |
"(142, 51, 2)\n", | |
"(145, 41, 1)\n", | |
"(155, 76, 3)\n", | |
"(163, 49, 0)\n", | |
"(180, 75, 1)\n", | |
"(151, 68, 2)\n", | |
"(146, 53, 1)\n", | |
"(179, 88, 2)\n", | |
"(169, 50, 0)\n", | |
"(148, 57, 2)\n", | |
"100\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "ywOB9E2_YOs8" | |
}, | |
"source": [ | |
"## Training on height, weight, and body type" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "AWtp6d8aYSXN", | |
"outputId": "7d962a11-f59b-4a17-a855-aca402ab5c07" | |
}, | |
"source": [ | |
"from tensorflow.keras.models import Sequential\n", | |
"from tensorflow.keras.layers import Dense, Dropout\n", | |
"from tensorflow.keras.optimizers import RMSprop\n", | |
"\n", | |
"in_size = 2 # inputs: height and weight\n", | |
"nb_classes = 6 # body type is classified into 6 levels\n", | |
"\n", | |
"# Define the structure of the MLP model\n", | |
"model = Sequential()\n", | |
"model.add(Dense(512, activation='relu', input_shape=(in_size,)))\n", | |
"model.add(Dropout(0.5))\n", | |
"model.add(Dense(nb_classes, activation='softmax'))\n", | |
"\n", | |
"# Compile the model\n", | |
"model.compile(\n", | |
"    loss='categorical_crossentropy',\n", | |
"    optimizer=RMSprop(),\n", | |
"    metrics=['accuracy'])\n", | |
"\n", | |
"# Save the compiled (still untrained) model so later cells can load it\n", | |
"model.save('hw_model.h5')\n", | |
"print(\"saved\")" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"saved\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "bMwfmR75Yxp1" | |
}, | |
"source": [ | |
"## Reading values from the DB and training" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "EweKriQhYpvE", | |
"outputId": "30178029-a4f0-4b3b-9cde-894f1e9d6e58" | |
}, | |
"source": [ | |
"from tensorflow.keras.models import load_model\n", | |
"from tensorflow.keras.utils import to_categorical\n", | |
"import numpy as np\n", | |
"import sqlite3\n", | |
"import os\n", | |
"\n", | |
"# Read 100 rows (the most recent by id) from the database --- (*1)\n", | |
"dbpath = \"./hw.sqlite3\"\n", | |
"select_sql = \"SELECT * FROM person ORDER BY id DESC LIMIT 100\"\n", | |
"# Append the rows that were read to lists --- (*2)\n", | |
"x = []\n", | |
"y = []\n", | |
"with sqlite3.connect(dbpath) as conn:\n", | |
"    for row in conn.execute(select_sql):\n", | |
"        _id, height, weight, type_no = row\n", | |
"        # Normalize the data --- (*3)\n", | |
"        height = height / 200\n", | |
"        weight = weight / 150\n", | |
"        y.append(type_no)\n", | |
"        x.append(np.array([height, weight]))\n", | |
"\n", | |
"# Load the model --- (*4)\n", | |
"model = load_model('hw_model.h5')\n", | |
"\n", | |
"# If trained weights already exist, load them --- (*5)\n", | |
"if os.path.exists('hw_weights.h5'):\n", | |
"    model.load_weights('hw_weights.h5')\n", | |
"\n", | |
"nb_classes = 6 # body type is classified into 6 levels\n", | |
"y = to_categorical(y, nb_classes) # convert to one-hot vectors\n", | |
"\n", | |
"# Train --- (*6)\n", | |
"model.fit(np.array(x), y,\n", | |
"          batch_size=50,\n", | |
"          epochs=100)\n", | |
"\n", | |
"# Save the result --- (*7)\n", | |
"model.save_weights('hw_weights.h5')\n", | |
"\n" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Epoch 1/100\n", | |
"2/2 [==============================] - 1s 8ms/step - loss: 1.8102 - accuracy: 0.0900\n", | |
"Epoch 2/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.7673 - accuracy: 0.2800\n", | |
"Epoch 3/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.7499 - accuracy: 0.3000\n", | |
"Epoch 4/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.7398 - accuracy: 0.2500\n", | |
"Epoch 5/100\n", | |
"2/2 [==============================] - 0s 10ms/step - loss: 1.7235 - accuracy: 0.2900\n", | |
"Epoch 6/100\n", | |
"2/2 [==============================] - 0s 7ms/step - loss: 1.7292 - accuracy: 0.2400\n", | |
"Epoch 7/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.7179 - accuracy: 0.2900\n", | |
"Epoch 8/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.7175 - accuracy: 0.2600\n", | |
"Epoch 9/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.7044 - accuracy: 0.2300\n", | |
"Epoch 10/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.6993 - accuracy: 0.2700\n", | |
"Epoch 11/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6972 - accuracy: 0.2700\n", | |
"Epoch 12/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6986 - accuracy: 0.2700\n", | |
"Epoch 13/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6756 - accuracy: 0.2400\n", | |
"Epoch 14/100\n", | |
"2/2 [==============================] - 0s 8ms/step - loss: 1.6900 - accuracy: 0.2400\n", | |
"Epoch 15/100\n", | |
"2/2 [==============================] - 0s 8ms/step - loss: 1.6787 - accuracy: 0.2900\n", | |
"Epoch 16/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6681 - accuracy: 0.2800\n", | |
"Epoch 17/100\n", | |
"2/2 [==============================] - 0s 7ms/step - loss: 1.6769 - accuracy: 0.2700\n", | |
"Epoch 18/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6572 - accuracy: 0.2700\n", | |
"Epoch 19/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6685 - accuracy: 0.2800\n", | |
"Epoch 20/100\n", | |
"2/2 [==============================] - 0s 9ms/step - loss: 1.6665 - accuracy: 0.2900\n", | |
"Epoch 21/100\n", | |
"2/2 [==============================] - 0s 9ms/step - loss: 1.6613 - accuracy: 0.2800\n", | |
"Epoch 22/100\n", | |
"2/2 [==============================] - 0s 7ms/step - loss: 1.6565 - accuracy: 0.2900\n", | |
"Epoch 23/100\n", | |
"2/2 [==============================] - 0s 7ms/step - loss: 1.6542 - accuracy: 0.2800\n", | |
"Epoch 24/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.6533 - accuracy: 0.2800\n", | |
"Epoch 25/100\n", | |
"2/2 [==============================] - 0s 9ms/step - loss: 1.6515 - accuracy: 0.3000\n", | |
"Epoch 26/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6505 - accuracy: 0.3000\n", | |
"Epoch 27/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.6340 - accuracy: 0.2900\n", | |
"Epoch 28/100\n", | |
"2/2 [==============================] - 0s 3ms/step - loss: 1.6530 - accuracy: 0.2900\n", | |
"Epoch 29/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6493 - accuracy: 0.2800\n", | |
"Epoch 30/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.6416 - accuracy: 0.2700\n", | |
"Epoch 31/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6340 - accuracy: 0.2900\n", | |
"Epoch 32/100\n", | |
"2/2 [==============================] - 0s 7ms/step - loss: 1.6299 - accuracy: 0.2900\n", | |
"Epoch 33/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6374 - accuracy: 0.2900\n", | |
"Epoch 34/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.6313 - accuracy: 0.2800\n", | |
"Epoch 35/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6288 - accuracy: 0.3600\n", | |
"Epoch 36/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.6222 - accuracy: 0.3400\n", | |
"Epoch 37/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6153 - accuracy: 0.3000\n", | |
"Epoch 38/100\n", | |
"2/2 [==============================] - 0s 8ms/step - loss: 1.6144 - accuracy: 0.3500\n", | |
"Epoch 39/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.6030 - accuracy: 0.3800\n", | |
"Epoch 40/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6114 - accuracy: 0.3400\n", | |
"Epoch 41/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6092 - accuracy: 0.3300\n", | |
"Epoch 42/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6095 - accuracy: 0.3300\n", | |
"Epoch 43/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6095 - accuracy: 0.3200\n", | |
"Epoch 44/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6030 - accuracy: 0.3700\n", | |
"Epoch 45/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.6091 - accuracy: 0.3400\n", | |
"Epoch 46/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.6019 - accuracy: 0.3100\n", | |
"Epoch 47/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.5766 - accuracy: 0.3700\n", | |
"Epoch 48/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.5911 - accuracy: 0.3700\n", | |
"Epoch 49/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5911 - accuracy: 0.3500\n", | |
"Epoch 50/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5820 - accuracy: 0.3600\n", | |
"Epoch 51/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5691 - accuracy: 0.3600\n", | |
"Epoch 52/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5629 - accuracy: 0.3900\n", | |
"Epoch 53/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5712 - accuracy: 0.3400\n", | |
"Epoch 54/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5642 - accuracy: 0.3800\n", | |
"Epoch 55/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5683 - accuracy: 0.3500\n", | |
"Epoch 56/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5630 - accuracy: 0.4100\n", | |
"Epoch 57/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5581 - accuracy: 0.3700\n", | |
"Epoch 58/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5495 - accuracy: 0.3700\n", | |
"Epoch 59/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5441 - accuracy: 0.4200\n", | |
"Epoch 60/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5447 - accuracy: 0.3500\n", | |
"Epoch 61/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.5363 - accuracy: 0.3600\n", | |
"Epoch 62/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.5434 - accuracy: 0.3800\n", | |
"Epoch 63/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5348 - accuracy: 0.4000\n", | |
"Epoch 64/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5346 - accuracy: 0.3800\n", | |
"Epoch 65/100\n", | |
"2/2 [==============================] - 0s 9ms/step - loss: 1.5393 - accuracy: 0.3800\n", | |
"Epoch 66/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.5240 - accuracy: 0.4200\n", | |
"Epoch 67/100\n", | |
"2/2 [==============================] - 0s 11ms/step - loss: 1.5110 - accuracy: 0.4200\n", | |
"Epoch 68/100\n", | |
"2/2 [==============================] - 0s 8ms/step - loss: 1.5150 - accuracy: 0.3900\n", | |
"Epoch 69/100\n", | |
"2/2 [==============================] - 0s 10ms/step - loss: 1.5031 - accuracy: 0.4100\n", | |
"Epoch 70/100\n", | |
"2/2 [==============================] - 0s 8ms/step - loss: 1.5057 - accuracy: 0.4100\n", | |
"Epoch 71/100\n", | |
"2/2 [==============================] - 0s 9ms/step - loss: 1.4972 - accuracy: 0.4400\n", | |
"Epoch 72/100\n", | |
"2/2 [==============================] - 0s 17ms/step - loss: 1.4966 - accuracy: 0.4100\n", | |
"Epoch 73/100\n", | |
"2/2 [==============================] - 0s 7ms/step - loss: 1.4998 - accuracy: 0.4000\n", | |
"Epoch 74/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.5072 - accuracy: 0.4300\n", | |
"Epoch 75/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.5017 - accuracy: 0.4000\n", | |
"Epoch 76/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4846 - accuracy: 0.4100\n", | |
"Epoch 77/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.4822 - accuracy: 0.3800\n", | |
"Epoch 78/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.4723 - accuracy: 0.4200\n", | |
"Epoch 79/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4755 - accuracy: 0.4000\n", | |
"Epoch 80/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.4819 - accuracy: 0.3600\n", | |
"Epoch 81/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4739 - accuracy: 0.4300\n", | |
"Epoch 82/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.4784 - accuracy: 0.4000\n", | |
"Epoch 83/100\n", | |
"2/2 [==============================] - 0s 7ms/step - loss: 1.4562 - accuracy: 0.4400\n", | |
"Epoch 84/100\n", | |
"2/2 [==============================] - 0s 6ms/step - loss: 1.4505 - accuracy: 0.4200\n", | |
"Epoch 85/100\n", | |
"2/2 [==============================] - 0s 7ms/step - loss: 1.4425 - accuracy: 0.4700\n", | |
"Epoch 86/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4608 - accuracy: 0.4000\n", | |
"Epoch 87/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4614 - accuracy: 0.4400\n", | |
"Epoch 88/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4396 - accuracy: 0.4100\n", | |
"Epoch 89/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4385 - accuracy: 0.4400\n", | |
"Epoch 90/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.4416 - accuracy: 0.4300\n", | |
"Epoch 91/100\n", | |
"2/2 [==============================] - 0s 8ms/step - loss: 1.4369 - accuracy: 0.4300\n", | |
"Epoch 92/100\n", | |
"2/2 [==============================] - 0s 9ms/step - loss: 1.4261 - accuracy: 0.4200\n", | |
"Epoch 93/100\n", | |
"2/2 [==============================] - 0s 8ms/step - loss: 1.4423 - accuracy: 0.4300\n", | |
"Epoch 94/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4230 - accuracy: 0.4600\n", | |
"Epoch 95/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4228 - accuracy: 0.4500\n", | |
"Epoch 96/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.4027 - accuracy: 0.4600\n", | |
"Epoch 97/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4114 - accuracy: 0.4600\n", | |
"Epoch 98/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.4030 - accuracy: 0.4700\n", | |
"Epoch 99/100\n", | |
"2/2 [==============================] - 0s 5ms/step - loss: 1.4049 - accuracy: 0.4400\n", | |
"Epoch 100/100\n", | |
"2/2 [==============================] - 0s 4ms/step - loss: 1.3888 - accuracy: 0.4500\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "_7Uo03AnZH1U" | |
}, | |
"source": [ | |
"## Checking accuracy (testing with arbitrary data)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "owF0feuQYXXz", | |
"outputId": "c66852e6-3ddd-4a49-a0d3-79c96caa420f" | |
}, | |
"source": [ | |
"from tensorflow.keras.models import load_model\n", | |
"import numpy as np\n", | |
"\n", | |
"# Load the model --- (*1)\n", | |
"model = load_model('hw_model.h5')\n", | |
"# Load the trained weights --- (*2)\n", | |
"model.load_weights('hw_weights.h5')\n", | |
"# Labels\n", | |
"LABELS = [\n", | |
"    'Underweight', 'Normal weight', 'Obesity (grade 1)',\n", | |
"    'Obesity (grade 2)', 'Obesity (grade 3)', 'Obesity (grade 4)'\n", | |
"]\n", | |
"\n", | |
"# Specify the test data --- (*3)\n", | |
"height = 160\n", | |
"weight = 50\n", | |
"# Normalize --- (*4)\n", | |
"test_x = [height / 200, weight / 150]\n", | |
"# Predict --- (*5)\n", | |
"pre = model.predict(np.array([test_x]))\n", | |
"idx = pre[0].argmax()\n", | |
"print(LABELS[idx], '/ probability', pre[0][idx])\n", | |
"\n" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Normal weight / probability 0.33491102\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "MAYcUAcBZlAs" | |
}, | |
"source": [ | |
"### Checking classification accuracy" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "C6i-_ZwWZoEb", | |
"outputId": "8871f022-fb60-48ca-d786-860f18714fa8" | |
}, | |
"source": [ | |
"from tensorflow.keras.models import load_model\n", | |
"import numpy as np\n", | |
"import random\n", | |
"from tensorflow.keras.utils import to_categorical\n", | |
"\n", | |
"# Load the model --- (*1)\n", | |
"model = load_model('hw_model.h5')\n", | |
"# Load the trained weights --- (*2)\n", | |
"model.load_weights('hw_weights.h5')\n", | |
"\n", | |
"# Generate 1000 labeled samples --- (*3)\n", | |
"x = []\n", | |
"y = []\n", | |
"for i in range(1000):\n", | |
"    h = random.randint(130, 180)\n", | |
"    w = random.randint(30, 100)\n", | |
"    # BMI = weight (kg) / height (m) squared\n", | |
"    bmi = w / ((h / 100) ** 2)\n", | |
"    type_no = 1\n", | |
"    if bmi < 18.5:\n", | |
"        type_no = 0\n", | |
"    elif bmi < 25:\n", | |
"        type_no = 1\n", | |
"    elif bmi < 30:\n", | |
"        type_no = 2\n", | |
"    elif bmi < 35:\n", | |
"        type_no = 3\n", | |
"    elif bmi < 40:\n", | |
"        type_no = 4\n", | |
"    else:\n", | |
"        type_no = 5\n", | |
"    x.append(np.array([h / 200, w / 150]))\n", | |
"    y.append(type_no)\n", | |
"\n", | |
"# Convert the format --- (*4)\n", | |
"x = np.array(x)\n", | |
"y = to_categorical(y, 6)\n", | |
"# Check the accuracy --- (*5)\n", | |
"score = model.evaluate(x, y, verbose=1)\n", | |
"print(\"accuracy=\", score[1], \"loss =\", score[0])\n", | |
"\n" | |
], | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"32/32 [==============================] - 0s 1ms/step - loss: 1.4534 - accuracy: 0.3600\n", | |
"accuracy= 0.36000001430511475 loss = 1.4534181356430054\n" | |
] | |
} | |
] | |
} | |
] | |
} |