@EsterRibeiro
Created December 10, 2018 13:45
Filtragem Colaborativa.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Filtragem Colaborativa.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/EsterRibeiro/4c985e88daff097c96ff547990254858/filtragem-colaborativa.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"metadata": {
"id": "JHQBUPZFHXWx",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"A collaborative filtering system recommends items based on the ratings given by other users, and predicts the rating that a given user (who has not yet rated the item) will give to it. To do this, it must find the users who are most similar to the target user. This kind of system is less generic than, for example, a recommender system (RS) based on demographic filtering, but it does not need to know the item's category or other more specific information in order to make recommendations."
]
},
{
"metadata": {
"id": "g0R_yIXvKUF8",
"colab_type": "code",
"outputId": "ea50a2c1-e69e-4b98-c14b-f58992e15e37",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 272
}
},
"cell_type": "code",
"source": [
"#libraries\n",
"\n",
"!pip install surprise\n",
"\n",
"import pandas as pd \n",
"import numpy as np \n",
"import matplotlib.pyplot as plt\n",
"from surprise import Reader\n",
"from surprise import Dataset\n",
"from surprise import SVD\n",
"from surprise import evaluate\n",
"from google.colab import files\n",
"\n",
"reader = Reader()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting surprise\n",
" Downloading https://files.pythonhosted.org/packages/61/de/e5cba8682201fcf9c3719a6fdda95693468ed061945493dea2dd37c5618b/surprise-0.1-py2.py3-none-any.whl\n",
"Collecting scikit-surprise (from surprise)\n",
" Downloading https://files.pythonhosted.org/packages/4d/fc/cd4210b247d1dca421c25994740cbbf03c5e980e31881f10eaddf45fdab0/scikit-surprise-1.0.6.tar.gz (3.3MB)\n",
" 100% |████████████████████████████████| 3.3MB 7.4MB/s \n",
"Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.6/dist-packages (from scikit-surprise->surprise) (0.13.0)\n",
"Requirement already satisfied: numpy>=1.11.2 in /usr/local/lib/python3.6/dist-packages (from scikit-surprise->surprise) (1.14.6)\n",
"Requirement already satisfied: scipy>=1.0.0 in /usr/local/lib/python3.6/dist-packages (from scikit-surprise->surprise) (1.1.0)\n",
"Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.6/dist-packages (from scikit-surprise->surprise) (1.11.0)\n",
"Building wheels for collected packages: scikit-surprise\n",
" Running setup.py bdist_wheel for scikit-surprise ... done\n",
" Stored in directory: /root/.cache/pip/wheels/ec/c0/55/3a28eab06b53c220015063ebbdb81213cd3dcbb72c088251ec\n",
"Successfully built scikit-surprise\n",
"Installing collected packages: scikit-surprise, surprise\n",
"Successfully installed scikit-surprise-1.0.6 surprise-0.1\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "1ggVjiPxH0PX",
"colab_type": "code",
"outputId": "fff3c49d-8449-47df-a4e2-2d23c18d423b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 2024
}
},
"cell_type": "code",
"source": [
"# load the ratings dataset\n",
"\n",
"uploaded = files.upload()\n",
"\n",
"avaliacoes = pd.read_csv('ratings_small.csv')\n",
"\n",
"avaliacoes\n",
"\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Saving ratings_small.csv to ratings_small.csv\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>userId</th>\n",
" <th>movieId</th>\n",
" <th>rating</th>\n",
" <th>timestamp</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>31</td>\n",
" <td>2.5</td>\n",
" <td>1260759144</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>1029</td>\n",
" <td>3.0</td>\n",
" <td>1260759179</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>1061</td>\n",
" <td>3.0</td>\n",
" <td>1260759182</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1</td>\n",
" <td>1129</td>\n",
" <td>2.0</td>\n",
" <td>1260759185</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1</td>\n",
" <td>1172</td>\n",
" <td>4.0</td>\n",
" <td>1260759205</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>1</td>\n",
" <td>1263</td>\n",
" <td>2.0</td>\n",
" <td>1260759151</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>1</td>\n",
" <td>1287</td>\n",
" <td>2.0</td>\n",
" <td>1260759187</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>1</td>\n",
" <td>1293</td>\n",
" <td>2.0</td>\n",
" <td>1260759148</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>1</td>\n",
" <td>1339</td>\n",
" <td>3.5</td>\n",
" <td>1260759125</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>1</td>\n",
" <td>1343</td>\n",
" <td>2.0</td>\n",
" <td>1260759131</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>1</td>\n",
" <td>1371</td>\n",
" <td>2.5</td>\n",
" <td>1260759135</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>1</td>\n",
" <td>1405</td>\n",
" <td>1.0</td>\n",
" <td>1260759203</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>1</td>\n",
" <td>1953</td>\n",
" <td>4.0</td>\n",
" <td>1260759191</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>1</td>\n",
" <td>2105</td>\n",
" <td>4.0</td>\n",
" <td>1260759139</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>1</td>\n",
" <td>2150</td>\n",
" <td>3.0</td>\n",
" <td>1260759194</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>1</td>\n",
" <td>2193</td>\n",
" <td>2.0</td>\n",
" <td>1260759198</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>1</td>\n",
" <td>2294</td>\n",
" <td>2.0</td>\n",
" <td>1260759108</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>1</td>\n",
" <td>2455</td>\n",
" <td>2.5</td>\n",
" <td>1260759113</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>1</td>\n",
" <td>2968</td>\n",
" <td>1.0</td>\n",
" <td>1260759200</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>1</td>\n",
" <td>3671</td>\n",
" <td>3.0</td>\n",
" <td>1260759117</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>2</td>\n",
" <td>10</td>\n",
" <td>4.0</td>\n",
" <td>835355493</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>2</td>\n",
" <td>17</td>\n",
" <td>5.0</td>\n",
" <td>835355681</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>2</td>\n",
" <td>39</td>\n",
" <td>5.0</td>\n",
" <td>835355604</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>2</td>\n",
" <td>47</td>\n",
" <td>4.0</td>\n",
" <td>835355552</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>2</td>\n",
" <td>50</td>\n",
" <td>4.0</td>\n",
" <td>835355586</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>2</td>\n",
" <td>52</td>\n",
" <td>3.0</td>\n",
" <td>835356031</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>2</td>\n",
" <td>62</td>\n",
" <td>3.0</td>\n",
" <td>835355749</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>2</td>\n",
" <td>110</td>\n",
" <td>4.0</td>\n",
" <td>835355532</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>2</td>\n",
" <td>144</td>\n",
" <td>3.0</td>\n",
" <td>835356016</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>2</td>\n",
" <td>150</td>\n",
" <td>5.0</td>\n",
" <td>835355395</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99974</th>\n",
" <td>671</td>\n",
" <td>4034</td>\n",
" <td>4.5</td>\n",
" <td>1064245493</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99975</th>\n",
" <td>671</td>\n",
" <td>4306</td>\n",
" <td>5.0</td>\n",
" <td>1064245548</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99976</th>\n",
" <td>671</td>\n",
" <td>4308</td>\n",
" <td>3.5</td>\n",
" <td>1065111985</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99977</th>\n",
" <td>671</td>\n",
" <td>4880</td>\n",
" <td>4.0</td>\n",
" <td>1065111973</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99978</th>\n",
" <td>671</td>\n",
" <td>4886</td>\n",
" <td>5.0</td>\n",
" <td>1064245488</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99979</th>\n",
" <td>671</td>\n",
" <td>4896</td>\n",
" <td>5.0</td>\n",
" <td>1065111996</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99980</th>\n",
" <td>671</td>\n",
" <td>4963</td>\n",
" <td>4.5</td>\n",
" <td>1065111855</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99981</th>\n",
" <td>671</td>\n",
" <td>4973</td>\n",
" <td>4.5</td>\n",
" <td>1064245471</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99982</th>\n",
" <td>671</td>\n",
" <td>4993</td>\n",
" <td>5.0</td>\n",
" <td>1064245483</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99983</th>\n",
" <td>671</td>\n",
" <td>4995</td>\n",
" <td>4.0</td>\n",
" <td>1064891537</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99984</th>\n",
" <td>671</td>\n",
" <td>5010</td>\n",
" <td>2.0</td>\n",
" <td>1066793004</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99985</th>\n",
" <td>671</td>\n",
" <td>5218</td>\n",
" <td>2.0</td>\n",
" <td>1065111990</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99986</th>\n",
" <td>671</td>\n",
" <td>5299</td>\n",
" <td>3.0</td>\n",
" <td>1065112004</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99987</th>\n",
" <td>671</td>\n",
" <td>5349</td>\n",
" <td>4.0</td>\n",
" <td>1065111863</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99988</th>\n",
" <td>671</td>\n",
" <td>5377</td>\n",
" <td>4.0</td>\n",
" <td>1064245557</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99989</th>\n",
" <td>671</td>\n",
" <td>5445</td>\n",
" <td>4.5</td>\n",
" <td>1064891627</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99990</th>\n",
" <td>671</td>\n",
" <td>5464</td>\n",
" <td>3.0</td>\n",
" <td>1064891549</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99991</th>\n",
" <td>671</td>\n",
" <td>5669</td>\n",
" <td>4.0</td>\n",
" <td>1063502711</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99992</th>\n",
" <td>671</td>\n",
" <td>5816</td>\n",
" <td>4.0</td>\n",
" <td>1065111963</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99993</th>\n",
" <td>671</td>\n",
" <td>5902</td>\n",
" <td>3.5</td>\n",
" <td>1064245507</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99994</th>\n",
" <td>671</td>\n",
" <td>5952</td>\n",
" <td>5.0</td>\n",
" <td>1063502716</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99995</th>\n",
" <td>671</td>\n",
" <td>5989</td>\n",
" <td>4.0</td>\n",
" <td>1064890625</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99996</th>\n",
" <td>671</td>\n",
" <td>5991</td>\n",
" <td>4.5</td>\n",
" <td>1064245387</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99997</th>\n",
" <td>671</td>\n",
" <td>5995</td>\n",
" <td>4.0</td>\n",
" <td>1066793014</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99998</th>\n",
" <td>671</td>\n",
" <td>6212</td>\n",
" <td>2.5</td>\n",
" <td>1065149436</td>\n",
" </tr>\n",
" <tr>\n",
" <th>99999</th>\n",
" <td>671</td>\n",
" <td>6268</td>\n",
" <td>2.5</td>\n",
" <td>1065579370</td>\n",
" </tr>\n",
" <tr>\n",
" <th>100000</th>\n",
" <td>671</td>\n",
" <td>6269</td>\n",
" <td>4.0</td>\n",
" <td>1065149201</td>\n",
" </tr>\n",
" <tr>\n",
" <th>100001</th>\n",
" <td>671</td>\n",
" <td>6365</td>\n",
" <td>4.0</td>\n",
" <td>1070940363</td>\n",
" </tr>\n",
" <tr>\n",
" <th>100002</th>\n",
" <td>671</td>\n",
" <td>6385</td>\n",
" <td>2.5</td>\n",
" <td>1070979663</td>\n",
" </tr>\n",
" <tr>\n",
" <th>100003</th>\n",
" <td>671</td>\n",
" <td>6565</td>\n",
" <td>3.5</td>\n",
" <td>1074784724</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>100004 rows × 4 columns</p>\n",
"</div>"
],
"text/plain": [
" userId movieId rating timestamp\n",
"0 1 31 2.5 1260759144\n",
"1 1 1029 3.0 1260759179\n",
"2 1 1061 3.0 1260759182\n",
"3 1 1129 2.0 1260759185\n",
"4 1 1172 4.0 1260759205\n",
"5 1 1263 2.0 1260759151\n",
"6 1 1287 2.0 1260759187\n",
"7 1 1293 2.0 1260759148\n",
"8 1 1339 3.5 1260759125\n",
"9 1 1343 2.0 1260759131\n",
"10 1 1371 2.5 1260759135\n",
"11 1 1405 1.0 1260759203\n",
"12 1 1953 4.0 1260759191\n",
"13 1 2105 4.0 1260759139\n",
"14 1 2150 3.0 1260759194\n",
"15 1 2193 2.0 1260759198\n",
"16 1 2294 2.0 1260759108\n",
"17 1 2455 2.5 1260759113\n",
"18 1 2968 1.0 1260759200\n",
"19 1 3671 3.0 1260759117\n",
"20 2 10 4.0 835355493\n",
"21 2 17 5.0 835355681\n",
"22 2 39 5.0 835355604\n",
"23 2 47 4.0 835355552\n",
"24 2 50 4.0 835355586\n",
"25 2 52 3.0 835356031\n",
"26 2 62 3.0 835355749\n",
"27 2 110 4.0 835355532\n",
"28 2 144 3.0 835356016\n",
"29 2 150 5.0 835355395\n",
"... ... ... ... ...\n",
"99974 671 4034 4.5 1064245493\n",
"99975 671 4306 5.0 1064245548\n",
"99976 671 4308 3.5 1065111985\n",
"99977 671 4880 4.0 1065111973\n",
"99978 671 4886 5.0 1064245488\n",
"99979 671 4896 5.0 1065111996\n",
"99980 671 4963 4.5 1065111855\n",
"99981 671 4973 4.5 1064245471\n",
"99982 671 4993 5.0 1064245483\n",
"99983 671 4995 4.0 1064891537\n",
"99984 671 5010 2.0 1066793004\n",
"99985 671 5218 2.0 1065111990\n",
"99986 671 5299 3.0 1065112004\n",
"99987 671 5349 4.0 1065111863\n",
"99988 671 5377 4.0 1064245557\n",
"99989 671 5445 4.5 1064891627\n",
"99990 671 5464 3.0 1064891549\n",
"99991 671 5669 4.0 1063502711\n",
"99992 671 5816 4.0 1065111963\n",
"99993 671 5902 3.5 1064245507\n",
"99994 671 5952 5.0 1063502716\n",
"99995 671 5989 4.0 1064890625\n",
"99996 671 5991 4.5 1064245387\n",
"99997 671 5995 4.0 1066793014\n",
"99998 671 6212 2.5 1065149436\n",
"99999 671 6268 2.5 1065579370\n",
"100000 671 6269 4.0 1065149201\n",
"100001 671 6365 4.0 1070940363\n",
"100002 671 6385 2.5 1070979663\n",
"100003 671 6565 3.5 1074784724\n",
"\n",
"[100004 rows x 4 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 3
}
]
},
{
"metadata": {
"id": "7v2d7O0bKOQP",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"The movies are rated on a scale from 1 to 5 (the default `rating_scale` of `Reader()`)."
]
},
{
"metadata": {
"id": "BRymPKV-KaH_",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"data = Dataset.load_from_df(avaliacoes[['userId', 'movieId', 'rating']], reader)\n",
"\n",
"x = data.split(n_folds=5)\n"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "o6PjYwezKzUB",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"SVD - Singular Value Decomposition\n",
"\n",
"\n",
"* MAE computes the average error. We take the absolute value of each error (r_ui - est_ui); otherwise positive and negative errors would cancel each other out.\n",
"* RMSE does much the same thing, except that instead of taking the absolute value we square the errors. This also makes every error positive, but large errors are heavily penalized. To obtain a number on the same scale as a rating, we take the square root.\n"
]
},
{
"metadata": {
"id": "sOy-DraCKzru",
"colab_type": "code",
"outputId": "7b4c1af1-97d2-490c-8abd-f210d98fc473",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 768
}
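},
"cell_type": "code",
"source": [
"# show the error metrics (RMSE and MAE)\n",
"# note: evaluate() is deprecated, as the warning in the output says; model_selection.cross_validate() is its replacement\n",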
"\n",
"svd = SVD()\n",
"evaluate(svd, data, measures=['RMSE', 'MAE'])"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"/usr/local/lib/python3.6/dist-packages/surprise/evaluate.py:66: UserWarning: The evaluate() method is deprecated. Please use model_selection.cross_validate() instead.\n",
" 'model_selection.cross_validate() instead.', UserWarning)\n",
"/usr/local/lib/python3.6/dist-packages/surprise/dataset.py:193: UserWarning: Using data.split() or using load_from_folds() without using a CV iterator is now deprecated. \n",
" UserWarning)\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"Evaluating RMSE, MAE of algorithm SVD.\n",
"\n",
"------------\n",
"Fold 1\n",
"RMSE: 0.8934\n",
"MAE: 0.6875\n",
"------------\n",
"Fold 2\n",
"RMSE: 0.8954\n",
"MAE: 0.6894\n",
"------------\n",
"Fold 3\n",
"RMSE: 0.8859\n",
"MAE: 0.6834\n",
"------------\n",
"Fold 4\n",
"RMSE: 0.9040\n",
"MAE: 0.6976\n",
"------------\n",
"Fold 5\n",
"RMSE: 0.9065\n",
"MAE: 0.6963\n",
"------------\n",
"------------\n",
"Mean RMSE: 0.8970\n",
"Mean MAE : 0.6908\n",
"------------\n",
"------------\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"CaseInsensitiveDefaultDict(list,\n",
" {'mae': [0.6874891046447704,\n",
" 0.6894408711894955,\n",
" 0.6834057658802518,\n",
" 0.6975685029693567,\n",
" 0.6963356105796227],\n",
" 'rmse': [0.8933654197506257,\n",
" 0.8953883321484486,\n",
" 0.8858690715615625,\n",
" 0.9039984377663985,\n",
" 0.9064656294955329]})"
]
},
"metadata": {
"tags": []
},
"execution_count": 5
}
]
},
{
"metadata": {
"id": "7mA_wXpMK-MV",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"How the prediction was made."
]
},
{
"metadata": {
"id": "ndtH6Bf0K-_9",
"colab_type": "code",
"outputId": "25d8f8cb-ace4-467d-d2a5-f3d441747d39",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"trein = data.build_full_trainset()\n",
"\n",
"svd.fit(trein)\n",
"trein"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<surprise.trainset.Trainset at 0x7f9790f176d8>"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"metadata": {
"id": "b8gKBbPkLNVt",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Shows the user ID and the ratings that user gave to certain movies (the movies are also represented by ID)."
]
},
{
"metadata": {
"id": "dH5gPi21LXW5",
"colab_type": "code",
"outputId": "8a290df3-0b34-4d1a-8f1a-32b5961aff52",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1475
}
},
"cell_type": "code",
"source": [
"avaliacoes[avaliacoes['userId'] == 50]"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>userId</th>\n",
" <th>movieId</th>\n",
" <th>rating</th>\n",
" <th>timestamp</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>7987</th>\n",
" <td>50</td>\n",
" <td>10</td>\n",
" <td>4.0</td>\n",
" <td>847412607</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7988</th>\n",
" <td>50</td>\n",
" <td>21</td>\n",
" <td>3.0</td>\n",
" <td>847412676</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7989</th>\n",
" <td>50</td>\n",
" <td>39</td>\n",
" <td>2.0</td>\n",
" <td>847412692</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7990</th>\n",
" <td>50</td>\n",
" <td>47</td>\n",
" <td>3.0</td>\n",
" <td>847412643</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7991</th>\n",
" <td>50</td>\n",
" <td>95</td>\n",
" <td>3.0</td>\n",
" <td>847412837</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7992</th>\n",
" <td>50</td>\n",
" <td>110</td>\n",
" <td>4.0</td>\n",
" <td>847412607</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7993</th>\n",
" <td>50</td>\n",
" <td>150</td>\n",
" <td>3.0</td>\n",
" <td>847412515</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7994</th>\n",
" <td>50</td>\n",
" <td>160</td>\n",
" <td>3.0</td>\n",
" <td>847412712</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7995</th>\n",
" <td>50</td>\n",
" <td>161</td>\n",
" <td>4.0</td>\n",
" <td>847412607</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7996</th>\n",
" <td>50</td>\n",
" <td>165</td>\n",
" <td>4.0</td>\n",
" <td>847412543</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7997</th>\n",
" <td>50</td>\n",
" <td>185</td>\n",
" <td>3.0</td>\n",
" <td>847412606</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7998</th>\n",
" <td>50</td>\n",
" <td>208</td>\n",
" <td>3.0</td>\n",
" <td>847412606</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7999</th>\n",
" <td>50</td>\n",
" <td>231</td>\n",
" <td>1.0</td>\n",
" <td>847412567</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8000</th>\n",
" <td>50</td>\n",
" <td>253</td>\n",
" <td>3.0</td>\n",
" <td>847412628</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8001</th>\n",
" <td>50</td>\n",
" <td>282</td>\n",
" <td>3.0</td>\n",
" <td>847412811</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8002</th>\n",
" <td>50</td>\n",
" <td>292</td>\n",
" <td>4.0</td>\n",
" <td>847412586</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8003</th>\n",
" <td>50</td>\n",
" <td>296</td>\n",
" <td>4.0</td>\n",
" <td>847412515</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8004</th>\n",
" <td>50</td>\n",
" <td>315</td>\n",
" <td>3.0</td>\n",
" <td>847412812</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8005</th>\n",
" <td>50</td>\n",
" <td>316</td>\n",
" <td>3.0</td>\n",
" <td>847412567</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8006</th>\n",
" <td>50</td>\n",
" <td>337</td>\n",
" <td>3.0</td>\n",
" <td>847412859</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8007</th>\n",
" <td>50</td>\n",
" <td>339</td>\n",
" <td>4.0</td>\n",
" <td>847412628</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8008</th>\n",
" <td>50</td>\n",
" <td>344</td>\n",
" <td>2.0</td>\n",
" <td>847412544</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8009</th>\n",
" <td>50</td>\n",
" <td>349</td>\n",
" <td>3.0</td>\n",
" <td>847412567</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8010</th>\n",
" <td>50</td>\n",
" <td>356</td>\n",
" <td>4.0</td>\n",
" <td>847412567</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8011</th>\n",
" <td>50</td>\n",
" <td>357</td>\n",
" <td>4.0</td>\n",
" <td>847412692</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8012</th>\n",
" <td>50</td>\n",
" <td>367</td>\n",
" <td>4.0</td>\n",
" <td>847412643</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8013</th>\n",
" <td>50</td>\n",
" <td>368</td>\n",
" <td>4.0</td>\n",
" <td>847413071</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8014</th>\n",
" <td>50</td>\n",
" <td>377</td>\n",
" <td>3.0</td>\n",
" <td>847412643</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8015</th>\n",
" <td>50</td>\n",
" <td>380</td>\n",
" <td>3.0</td>\n",
" <td>847412515</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8016</th>\n",
" <td>50</td>\n",
" <td>420</td>\n",
" <td>3.0</td>\n",
" <td>847412692</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8017</th>\n",
" <td>50</td>\n",
" <td>434</td>\n",
" <td>3.0</td>\n",
" <td>847412586</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8018</th>\n",
" <td>50</td>\n",
" <td>440</td>\n",
" <td>4.0</td>\n",
" <td>847412711</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8019</th>\n",
" <td>50</td>\n",
" <td>442</td>\n",
" <td>3.0</td>\n",
" <td>847412812</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8020</th>\n",
" <td>50</td>\n",
" <td>454</td>\n",
" <td>3.0</td>\n",
" <td>847412628</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8021</th>\n",
" <td>50</td>\n",
" <td>457</td>\n",
" <td>3.0</td>\n",
" <td>847413020</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8022</th>\n",
" <td>50</td>\n",
" <td>480</td>\n",
" <td>4.0</td>\n",
" <td>847412586</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8023</th>\n",
" <td>50</td>\n",
" <td>509</td>\n",
" <td>3.0</td>\n",
" <td>847412812</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8024</th>\n",
" <td>50</td>\n",
" <td>527</td>\n",
" <td>4.0</td>\n",
" <td>847412676</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8025</th>\n",
" <td>50</td>\n",
" <td>539</td>\n",
" <td>3.0</td>\n",
" <td>847412676</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8026</th>\n",
" <td>50</td>\n",
" <td>553</td>\n",
" <td>3.0</td>\n",
" <td>847413043</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8027</th>\n",
" <td>50</td>\n",
" <td>587</td>\n",
" <td>3.0</td>\n",
" <td>847412659</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8028</th>\n",
" <td>50</td>\n",
" <td>589</td>\n",
" <td>5.0</td>\n",
" <td>847412628</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8029</th>\n",
" <td>50</td>\n",
" <td>590</td>\n",
" <td>3.0</td>\n",
" <td>847412515</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8030</th>\n",
" <td>50</td>\n",
" <td>597</td>\n",
" <td>3.0</td>\n",
" <td>847412659</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8031</th>\n",
" <td>50</td>\n",
" <td>780</td>\n",
" <td>4.0</td>\n",
" <td>847412879</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8032</th>\n",
" <td>50</td>\n",
" <td>786</td>\n",
" <td>3.0</td>\n",
" <td>847413043</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" userId movieId rating timestamp\n",
"7987 50 10 4.0 847412607\n",
"7988 50 21 3.0 847412676\n",
"7989 50 39 2.0 847412692\n",
"7990 50 47 3.0 847412643\n",
"7991 50 95 3.0 847412837\n",
"7992 50 110 4.0 847412607\n",
"7993 50 150 3.0 847412515\n",
"7994 50 160 3.0 847412712\n",
"7995 50 161 4.0 847412607\n",
"7996 50 165 4.0 847412543\n",
"7997 50 185 3.0 847412606\n",
"7998 50 208 3.0 847412606\n",
"7999 50 231 1.0 847412567\n",
"8000 50 253 3.0 847412628\n",
"8001 50 282 3.0 847412811\n",
"8002 50 292 4.0 847412586\n",
"8003 50 296 4.0 847412515\n",
"8004 50 315 3.0 847412812\n",
"8005 50 316 3.0 847412567\n",
"8006 50 337 3.0 847412859\n",
"8007 50 339 4.0 847412628\n",
"8008 50 344 2.0 847412544\n",
"8009 50 349 3.0 847412567\n",
"8010 50 356 4.0 847412567\n",
"8011 50 357 4.0 847412692\n",
"8012 50 367 4.0 847412643\n",
"8013 50 368 4.0 847413071\n",
"8014 50 377 3.0 847412643\n",
"8015 50 380 3.0 847412515\n",
"8016 50 420 3.0 847412692\n",
"8017 50 434 3.0 847412586\n",
"8018 50 440 4.0 847412711\n",
"8019 50 442 3.0 847412812\n",
"8020 50 454 3.0 847412628\n",
"8021 50 457 3.0 847413020\n",
"8022 50 480 4.0 847412586\n",
"8023 50 509 3.0 847412812\n",
"8024 50 527 4.0 847412676\n",
"8025 50 539 3.0 847412676\n",
"8026 50 553 3.0 847413043\n",
"8027 50 587 3.0 847412659\n",
"8028 50 589 5.0 847412628\n",
"8029 50 590 3.0 847412515\n",
"8030 50 597 3.0 847412659\n",
"8031 50 780 4.0 847412879\n",
"8032 50 786 3.0 847413043"
]
},
"metadata": {
"tags": []
},
"execution_count": 7
}
]
},
{
"metadata": {
"id": "hvztfOXkLof7",
"colab_type": "text"
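},
"cell_type": "markdown",
"source": [
"This line predicts the rating that user 50 would give to the movie with ID 509. The third argument (5) is only recorded as the true rating `r_ui` and does not change the estimate. Note that user 50 has in fact already rated this movie (3.0 in the table above), so the estimate can be compared with that value."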
]
},
{
"metadata": {
"id": "gkRAn2vuLq_k",
"colab_type": "code",
"outputId": "53a44f83-a900-4dcd-f81b-25d73b207178",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"cell_type": "code",
"source": [
"svd.predict(50, 509, 5)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"Prediction(uid=50, iid=509, r_ui=5, est=3.5132013541727947, details={'was_impossible': False})"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
}
]
}
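
The MAE and RMSE definitions from the SVD cell of the notebook can be checked by hand. The following is a minimal sketch in plain Python and is not part of the original notebook: the four ratings and estimates are made-up illustrative values, and the helper names `mae` and `rmse` are hypothetical. The per-fold lists are copied from the `evaluate()` output in the notebook; averaging them should reproduce the means reported there.

```python
import math


def mae(true_ratings, estimates):
    """Mean absolute error: the average of |r_ui - est_ui|."""
    errors = [abs(r - e) for r, e in zip(true_ratings, estimates)]
    return sum(errors) / len(errors)


def rmse(true_ratings, estimates):
    """Root mean squared error: square root of the average squared error."""
    squared = [(r - e) ** 2 for r, e in zip(true_ratings, estimates)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up ratings and estimates (illustrative only, not taken from the dataset).
true_ratings = [4.0, 3.0, 5.0, 2.0]
estimates = [3.5, 3.0, 4.0, 4.0]

toy_mae = mae(true_ratings, estimates)    # (0.5 + 0 + 1 + 2) / 4
toy_rmse = rmse(true_ratings, estimates)  # sqrt((0.25 + 0 + 1 + 4) / 4)

# Per-fold values copied from the evaluate() output in the notebook.
fold_rmse = [0.8933654197506257, 0.8953883321484486, 0.8858690715615625,
             0.9039984377663985, 0.9064656294955329]
fold_mae = [0.6874891046447704, 0.6894408711894955, 0.6834057658802518,
            0.6975685029693567, 0.6963356105796227]

mean_rmse = sum(fold_rmse) / len(fold_rmse)
mean_mae = sum(fold_mae) / len(fold_mae)

print(f"toy MAE   = {toy_mae:.4f}")
print(f"toy RMSE  = {toy_rmse:.4f}")
print(f"mean RMSE = {mean_rmse:.4f}")
print(f"mean MAE  = {mean_mae:.4f}")
```

In the toy data, the single error of 2.0 is squared before averaging, so it dominates the RMSE; this is why the RMSE comes out above the MAE on the same predictions.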