Created
September 7, 2020 21:49
-
-
Save dmenezesgabriel/b2bf3eb049d48d1315d83db9e5fcad36 to your computer and use it in GitHub Desktop.
funds-explorer-crawler.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"nbformat": 4, | |
"nbformat_minor": 0, | |
"metadata": { | |
"colab": { | |
"name": "funds-explorer-crawler.ipynb", | |
"provenance": [], | |
"collapsed_sections": [], | |
"authorship_tag": "ABX9TyM+XWOqA9vbmNhgsaS2/QqF", | |
"include_colab_link": true | |
}, | |
"kernelspec": { | |
"name": "python3", | |
"display_name": "Python 3" | |
} | |
}, | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "view-in-github", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"<a href=\"https://colab.research.google.com/gist/dmenezesgabriel/b2bf3eb049d48d1315d83db9e5fcad36/funds-explorer-crawler.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "dp3LJ_TZ6gt_", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"# Funds Explorer Rank - Crawler\n", | |
"\n", | |
"## Objetivo\n", | |
"Este notebook tem por objetivo facilitar a seleção de fundos de investimentos imobiliários(FIIs), para constiutição de carteira previdênciaria com bons distribuidores de proventos, cuja distribuição seja sustentável a longo prazo.\n", | |
"\n", | |
"## Método\n", | |
"- Extração de dados do site [Funds Explorer](https://www.fundsexplorer.com.br/)\n", | |
"- Tratamento dos dados\n", | |
"- Filtro baseado em requisitos minimos de aceitação\n", | |
"- É necessária análise qualitativa dos papéis resultantes\n", | |
"\n", | |
"## Referências para análise quantitativa/qualitativa\n", | |
"- [Funds Explorer](https://www.fundsexplorer.com.br/)\n", | |
"- [Youtube](https://www.youtube.com/)\n", | |
"- [Status Invest](https://statusinvest.com.br/)\n", | |
"- [FIIs](https://fiis.com.br/)\n", | |
"- [Clube Fii](https://www.clubefii.com.br/)\n", | |
"\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "6DTg-35ADh9-", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"import datetime\n", | |
"import logging\n", | |
"import requests\n", | |
"import numpy as np\n", | |
"import pandas as pd\n", | |
"import pytz" | |
], | |
"execution_count": 27, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "JHeR7WNxR1L8", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Extração dos dados (Data Extraction)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "b7aupVWNCSP6", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"def now():\n", | |
" \"\"\"\n", | |
" Current timestamp at time zone utc\n", | |
" :returns: timestamp\n", | |
" \"\"\"\n", | |
" return pytz.UTC.localize(datetime.datetime.utcnow())\n", | |
"\n", | |
"\n", | |
"def now_br():\n", | |
" \"\"\"\n", | |
" Current timestamp at time zone América São Paulo\n", | |
" :returns: timestamp\n", | |
" \"\"\"\n", | |
" return now().astimezone(pytz.timezone(\"America/Sao_Paulo\"))" | |
], | |
"execution_count": 28, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "IPuNqGR7R_Dn", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"# Logging basic configuration\n", | |
"logging.basicConfig(level=logging.INFO)\n", | |
"_logger = logging.getLogger()" | |
], | |
"execution_count": 29, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "KlnabVI8DCgb", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 35 | |
}, | |
"outputId": "28223687-70ae-4b93-88c7-bd0c0b7efe70" | |
}, | |
"source": [ | |
"# Updated at\n", | |
"str(now_br())" | |
], | |
"execution_count": 30, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"application/vnd.google.colaboratory.intrinsic+json": { | |
"type": "string" | |
}, | |
"text/plain": [ | |
"'2020-09-07 18:17:39.036756-03:00'" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 30 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "kONU85UaDrJj", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 34 | |
}, | |
"outputId": "32946faa-8521-4589-bb37-5c0a43c6b14f" | |
}, | |
"source": [ | |
"url = 'https://www.fundsexplorer.com.br/ranking'\n", | |
"headers = {\n", | |
" 'User-Agent': \n", | |
" 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36'\n", | |
" ' (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36'\n", | |
"}\n", | |
"\n", | |
"# Collecting Data\n", | |
"response = requests.get(url, headers)\n", | |
"_logger.info(\"%s url request's status: %s\", url, response.status_code)\n", | |
"# List object of Data Frames\n", | |
"list_obj = pd.read_html(response.text, attrs={'id': 'table-ranking'})\n", | |
"df = list_obj[0]" | |
], | |
"execution_count": 31, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"INFO:root:https://www.fundsexplorer.com.br/ranking url request's status: 200\n" | |
], | |
"name": "stderr" | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "_f5ZUyVdLo80", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 326 | |
}, | |
"outputId": "3733f32c-ffe7-463b-eb32-7de8d3241a01" | |
}, | |
"source": [ | |
"df.head()" | |
], | |
"execution_count": 32, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>Códigodo fundo</th>\n", | |
" <th>Setor</th>\n", | |
" <th>Preço Atual</th>\n", | |
" <th>Liquidez Diária</th>\n", | |
" <th>Dividendo</th>\n", | |
" <th>DividendYield</th>\n", | |
" <th>DY (3M)Acumulado</th>\n", | |
" <th>DY (6M)Acumulado</th>\n", | |
" <th>DY (12M)Acumulado</th>\n", | |
" <th>DY (3M)Média</th>\n", | |
" <th>DY (6M)Média</th>\n", | |
" <th>DY (12M)Média</th>\n", | |
" <th>DY Ano</th>\n", | |
" <th>Variação Preço</th>\n", | |
" <th>Rentab.Período</th>\n", | |
" <th>Rentab.Acumulada</th>\n", | |
" <th>PatrimônioLíq.</th>\n", | |
" <th>VPA</th>\n", | |
" <th>P/VPA</th>\n", | |
" <th>DYPatrimonial</th>\n", | |
" <th>VariaçãoPatrimonial</th>\n", | |
" <th>Rentab. Patr.no Período</th>\n", | |
" <th>Rentab. Patr.Acumulada</th>\n", | |
" <th>VacânciaFísica</th>\n", | |
" <th>VacânciaFinanceira</th>\n", | |
" <th>QuantidadeAtivos</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>FIVN11</td>\n", | |
" <td>Shoppings</td>\n", | |
" <td>R$ 4,52</td>\n", | |
" <td>624.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>R$ 71.559.512,32</td>\n", | |
" <td>R$ 7,60</td>\n", | |
" <td>59.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>56,00%</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>BZLI11</td>\n", | |
" <td>Títulos e Val. Mob.</td>\n", | |
" <td>R$ 15,70</td>\n", | |
" <td>2032.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,68%</td>\n", | |
" <td>0,68%</td>\n", | |
" <td>1,14%</td>\n", | |
" <td>R$ 428.081.706,75</td>\n", | |
" <td>R$ 11,86</td>\n", | |
" <td>132.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>XTED11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>R$ 7,27</td>\n", | |
" <td>2389.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>1,10%</td>\n", | |
" <td>1,10%</td>\n", | |
" <td>-37,83%</td>\n", | |
" <td>R$ 26.298.791,43</td>\n", | |
" <td>R$ 14,67</td>\n", | |
" <td>50.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>100,00%</td>\n", | |
" <td>100,00%</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>ALMI11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>R$ 1.200,00</td>\n", | |
" <td>125.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>2,27%</td>\n", | |
" <td>2,27%</td>\n", | |
" <td>-28,29%</td>\n", | |
" <td>R$ 220.554.740,93</td>\n", | |
" <td>R$ 1.983,82</td>\n", | |
" <td>60.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>71,19%</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>DOMC11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>R$ 479,98</td>\n", | |
" <td>31.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>4,35%</td>\n", | |
" <td>4,35%</td>\n", | |
" <td>-10,00%</td>\n", | |
" <td>R$ 244.148.383,80</td>\n", | |
" <td>R$ 999,05</td>\n", | |
" <td>48.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>23,44%</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" Códigodo fundo Setor ... VacânciaFinanceira QuantidadeAtivos\n", | |
"0 FIVN11 Shoppings ... NaN 1\n", | |
"1 BZLI11 Títulos e Val. Mob. ... NaN 0\n", | |
"2 XTED11 Lajes Corporativas ... 100,00% 1\n", | |
"3 ALMI11 Lajes Corporativas ... NaN 1\n", | |
"4 DOMC11 Lajes Corporativas ... NaN 1\n", | |
"\n", | |
"[5 rows x 26 columns]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 32 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "tQhfevYHj355", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 34 | |
}, | |
"outputId": "b5f0dd06-b469-4360-dec8-40a71026f005" | |
}, | |
"source": [ | |
"len(df)" | |
], | |
"execution_count": 33, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"183" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 33 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "rU1OxD15Qaf0", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 476 | |
}, | |
"outputId": "de976e13-c111-4ee8-f5b2-27e141d8cdc9" | |
}, | |
"source": [ | |
"df.dtypes" | |
], | |
"execution_count": 34, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"Códigodo fundo object\n", | |
"Setor object\n", | |
"Preço Atual object\n", | |
"Liquidez Diária float64\n", | |
"Dividendo object\n", | |
"DividendYield object\n", | |
"DY (3M)Acumulado object\n", | |
"DY (6M)Acumulado object\n", | |
"DY (12M)Acumulado object\n", | |
"DY (3M)Média object\n", | |
"DY (6M)Média object\n", | |
"DY (12M)Média object\n", | |
"DY Ano object\n", | |
"Variação Preço object\n", | |
"Rentab.Período object\n", | |
"Rentab.Acumulada object\n", | |
"PatrimônioLíq. object\n", | |
"VPA object\n", | |
"P/VPA float64\n", | |
"DYPatrimonial object\n", | |
"VariaçãoPatrimonial object\n", | |
"Rentab. Patr.no Período object\n", | |
"Rentab. Patr.Acumulada object\n", | |
"VacânciaFísica object\n", | |
"VacânciaFinanceira object\n", | |
"QuantidadeAtivos int64\n", | |
"dtype: object" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 34 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "U7hGWeNNRnC-", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Limpeza de dados (Data Cleaning)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "25wmmodaUrmF", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### Cabeçalhos\n", | |
"- Remoção de espaços\n", | |
"- Remoção de acentos e pontuação\n", | |
"- Transformação dos caracteres para minúsculo" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "sEApqRgkS6kn", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 170 | |
}, | |
"outputId": "ab495026-2b6c-4d5b-a812-01b077decc96" | |
}, | |
"source": [ | |
"# Cleanning data headers\n", | |
"df.columns = df.columns.str.replace('\\s+', '_')\n", | |
"df.columns = df.columns.str.replace(r'[^\\w\\s]+', '_')\n", | |
"df.columns = df.columns.str.lower()\n", | |
"df.columns = df.columns.str.normalize('NFKD').str.encode('ascii', errors='ignore').str.decode('utf-8')\n", | |
"df.columns" | |
], | |
"execution_count": 35, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"Index(['codigodo_fundo', 'setor', 'preco_atual', 'liquidez_diaria',\n", | |
" 'dividendo', 'dividendyield', 'dy__3m_acumulado', 'dy__6m_acumulado',\n", | |
" 'dy__12m_acumulado', 'dy__3m_media', 'dy__6m_media', 'dy__12m_media',\n", | |
" 'dy_ano', 'variacao_preco', 'rentab_periodo', 'rentab_acumulada',\n", | |
" 'patrimonioliq_', 'vpa', 'p_vpa', 'dypatrimonial',\n", | |
" 'variacaopatrimonial', 'rentab__patr_no_periodo',\n", | |
" 'rentab__patr_acumulada', 'vacanciafisica', 'vacanciafinanceira',\n", | |
" 'quantidadeativos'],\n", | |
" dtype='object')" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 35 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "8kl6qJSZU4La", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### Valores\n", | |
"- Conversão de moeda em formatode texto para número\n", | |
"- Remoção de espaços\n", | |
"- Remoção de acentos e pontuação\n", | |
"- Transformação dos caracteres para minúsculo\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "EejQ3yADUzOy", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 292 | |
}, | |
"outputId": "c2afd978-06d4-44f4-d4ae-d3954f549d82" | |
}, | |
"source": [ | |
"df.head()" | |
], | |
"execution_count": 36, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>codigodo_fundo</th>\n", | |
" <th>setor</th>\n", | |
" <th>preco_atual</th>\n", | |
" <th>liquidez_diaria</th>\n", | |
" <th>dividendo</th>\n", | |
" <th>dividendyield</th>\n", | |
" <th>dy__3m_acumulado</th>\n", | |
" <th>dy__6m_acumulado</th>\n", | |
" <th>dy__12m_acumulado</th>\n", | |
" <th>dy__3m_media</th>\n", | |
" <th>dy__6m_media</th>\n", | |
" <th>dy__12m_media</th>\n", | |
" <th>dy_ano</th>\n", | |
" <th>variacao_preco</th>\n", | |
" <th>rentab_periodo</th>\n", | |
" <th>rentab_acumulada</th>\n", | |
" <th>patrimonioliq_</th>\n", | |
" <th>vpa</th>\n", | |
" <th>p_vpa</th>\n", | |
" <th>dypatrimonial</th>\n", | |
" <th>variacaopatrimonial</th>\n", | |
" <th>rentab__patr_no_periodo</th>\n", | |
" <th>rentab__patr_acumulada</th>\n", | |
" <th>vacanciafisica</th>\n", | |
" <th>vacanciafinanceira</th>\n", | |
" <th>quantidadeativos</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>FIVN11</td>\n", | |
" <td>Shoppings</td>\n", | |
" <td>R$ 4,52</td>\n", | |
" <td>624.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>R$ 71.559.512,32</td>\n", | |
" <td>R$ 7,60</td>\n", | |
" <td>59.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>56,00%</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>BZLI11</td>\n", | |
" <td>Títulos e Val. Mob.</td>\n", | |
" <td>R$ 15,70</td>\n", | |
" <td>2032.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,68%</td>\n", | |
" <td>0,68%</td>\n", | |
" <td>1,14%</td>\n", | |
" <td>R$ 428.081.706,75</td>\n", | |
" <td>R$ 11,86</td>\n", | |
" <td>132.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>XTED11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>R$ 7,27</td>\n", | |
" <td>2389.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>1,10%</td>\n", | |
" <td>1,10%</td>\n", | |
" <td>-37,83%</td>\n", | |
" <td>R$ 26.298.791,43</td>\n", | |
" <td>R$ 14,67</td>\n", | |
" <td>50.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>100,00%</td>\n", | |
" <td>100,00%</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>ALMI11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>R$ 1.200,00</td>\n", | |
" <td>125.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>2,27%</td>\n", | |
" <td>2,27%</td>\n", | |
" <td>-28,29%</td>\n", | |
" <td>R$ 220.554.740,93</td>\n", | |
" <td>R$ 1.983,82</td>\n", | |
" <td>60.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>71,19%</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>DOMC11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>R$ 479,98</td>\n", | |
" <td>31.0</td>\n", | |
" <td>R$ 0,00</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>0,00%</td>\n", | |
" <td>4,35%</td>\n", | |
" <td>4,35%</td>\n", | |
" <td>-10,00%</td>\n", | |
" <td>R$ 244.148.383,80</td>\n", | |
" <td>R$ 999,05</td>\n", | |
" <td>48.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>23,44%</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" codigodo_fundo setor ... vacanciafinanceira quantidadeativos\n", | |
"0 FIVN11 Shoppings ... NaN 1\n", | |
"1 BZLI11 Títulos e Val. Mob. ... NaN 0\n", | |
"2 XTED11 Lajes Corporativas ... 100,00% 1\n", | |
"3 ALMI11 Lajes Corporativas ... NaN 1\n", | |
"4 DOMC11 Lajes Corporativas ... NaN 1\n", | |
"\n", | |
"[5 rows x 26 columns]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 36 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "NiVeaYr4XHOz", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"def columns_which_contains(df, value):\n", | |
" \"\"\"\n", | |
" Serch for DataFrame column's values which contains a specific value\n", | |
" :value: any characters, phrases, symbols\n", | |
" :returns: list of DataFrame column's names\n", | |
" \"\"\"\n", | |
" return [df[column].name for column in df if df[column].astype(str).str.contains(value).any()]" | |
], | |
"execution_count": 37, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "8rwwtoNNeyrE", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 34 | |
}, | |
"outputId": "5252a639-b8ae-4dac-ab90-0269b0d92cb2" | |
}, | |
"source": [ | |
"# Convert string BR currency to float\n", | |
"currency_columns = columns_which_contains(df, '\\$')\n", | |
"_logger.info(\"Currency Columns: %s\", currency_columns)\n", | |
"for column_name in currency_columns:\n", | |
" df[column_name] = df[column_name].str.replace('.', '')\n", | |
" df[column_name] = df[column_name].str.replace(',', '.')\n", | |
" df[column_name] = df[column_name].str.replace('[R\\$ ,]', '', regex=True).astype(float)" | |
], | |
"execution_count": 38, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"INFO:root:Currency Columns: ['preco_atual', 'dividendo', 'patrimonioliq_', 'vpa']\n" | |
], | |
"name": "stderr" | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "WDCSsUBDiROz", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 54 | |
}, | |
"outputId": "ac000e62-a886-4a81-9a90-f95ace8d5a4b" | |
}, | |
"source": [ | |
"# Convert string % to float\n", | |
"percentual_columns = columns_which_contains(df, '\\%')\n", | |
"_logger.info(\"Percentual Columns: %s\", percentual_columns)\n", | |
"for column_name in percentual_columns:\n", | |
" df[column_name] = df[column_name].str.replace(',', '.')\n", | |
" df[column_name] = df[column_name].str.replace('[%,]', '', regex=True).astype(float)\n", | |
" df[column_name] = df[column_name]/100" | |
], | |
"execution_count": 39, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": [ | |
"INFO:root:Percentual Columns: ['dividendyield', 'dy__3m_acumulado', 'dy__6m_acumulado', 'dy__12m_acumulado', 'dy__3m_media', 'dy__6m_media', 'dy__12m_media', 'dy_ano', 'variacao_preco', 'rentab_periodo', 'rentab_acumulada', 'dypatrimonial', 'variacaopatrimonial', 'rentab__patr_no_periodo', 'rentab__patr_acumulada', 'vacanciafisica', 'vacanciafinanceira']\n" | |
], | |
"name": "stderr" | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "V0-63A1ahTiX", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 292 | |
}, | |
"outputId": "48b2929a-0699-4779-e25b-c8c2c1dd232f" | |
}, | |
"source": [ | |
"df.head()" | |
], | |
"execution_count": 40, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>codigodo_fundo</th>\n", | |
" <th>setor</th>\n", | |
" <th>preco_atual</th>\n", | |
" <th>liquidez_diaria</th>\n", | |
" <th>dividendo</th>\n", | |
" <th>dividendyield</th>\n", | |
" <th>dy__3m_acumulado</th>\n", | |
" <th>dy__6m_acumulado</th>\n", | |
" <th>dy__12m_acumulado</th>\n", | |
" <th>dy__3m_media</th>\n", | |
" <th>dy__6m_media</th>\n", | |
" <th>dy__12m_media</th>\n", | |
" <th>dy_ano</th>\n", | |
" <th>variacao_preco</th>\n", | |
" <th>rentab_periodo</th>\n", | |
" <th>rentab_acumulada</th>\n", | |
" <th>patrimonioliq_</th>\n", | |
" <th>vpa</th>\n", | |
" <th>p_vpa</th>\n", | |
" <th>dypatrimonial</th>\n", | |
" <th>variacaopatrimonial</th>\n", | |
" <th>rentab__patr_no_periodo</th>\n", | |
" <th>rentab__patr_acumulada</th>\n", | |
" <th>vacanciafisica</th>\n", | |
" <th>vacanciafinanceira</th>\n", | |
" <th>quantidadeativos</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>FIVN11</td>\n", | |
" <td>Shoppings</td>\n", | |
" <td>4.52</td>\n", | |
" <td>624.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0.0000</td>\n", | |
" <td>0.0000</td>\n", | |
" <td>0.0000</td>\n", | |
" <td>7.155951e+07</td>\n", | |
" <td>7.60</td>\n", | |
" <td>59.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0.5600</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>BZLI11</td>\n", | |
" <td>Títulos e Val. Mob.</td>\n", | |
" <td>15.70</td>\n", | |
" <td>2032.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0068</td>\n", | |
" <td>0.0068</td>\n", | |
" <td>0.0114</td>\n", | |
" <td>4.280817e+08</td>\n", | |
" <td>11.86</td>\n", | |
" <td>132.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>XTED11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>7.27</td>\n", | |
" <td>2389.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0110</td>\n", | |
" <td>0.0110</td>\n", | |
" <td>-0.3783</td>\n", | |
" <td>2.629879e+07</td>\n", | |
" <td>14.67</td>\n", | |
" <td>50.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1.0000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>ALMI11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>1200.00</td>\n", | |
" <td>125.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0227</td>\n", | |
" <td>0.0227</td>\n", | |
" <td>-0.2829</td>\n", | |
" <td>2.205547e+08</td>\n", | |
" <td>1983.82</td>\n", | |
" <td>60.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0.7119</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>DOMC11</td>\n", | |
" <td>Lajes Corporativas</td>\n", | |
" <td>479.98</td>\n", | |
" <td>31.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0435</td>\n", | |
" <td>0.0435</td>\n", | |
" <td>-0.1000</td>\n", | |
" <td>2.441484e+08</td>\n", | |
" <td>999.05</td>\n", | |
" <td>48.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0.2344</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" codigodo_fundo setor ... vacanciafinanceira quantidadeativos\n", | |
"0 FIVN11 Shoppings ... NaN 1\n", | |
"1 BZLI11 Títulos e Val. Mob. ... NaN 0\n", | |
"2 XTED11 Lajes Corporativas ... 1.0 1\n", | |
"3 ALMI11 Lajes Corporativas ... NaN 1\n", | |
"4 DOMC11 Lajes Corporativas ... NaN 1\n", | |
"\n", | |
"[5 rows x 26 columns]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 40 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "Fe3tZ-zTkCBw", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"df['setor'] = df['setor'].str.replace('\\s+', '_')\n", | |
"df['setor'] = df['setor'].str.replace(r'[^\\w\\s]+', '_')\n", | |
"df['setor'] = df['setor'].str.lower()\n", | |
"df['setor'] = df['setor'].str.normalize('NFKD').str.encode('ascii', errors='ignore').str.decode('utf-8')" | |
], | |
"execution_count": 41, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "TARebIHVz0Ex", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"## Análise Exploratória dos dados (Exploratory Data Analysis)\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "LEp-p6hIiBg_", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 68 | |
}, | |
"outputId": "15de3422-a7e6-4394-cb96-d975c22d5186" | |
}, | |
"source": [ | |
"df.setor.unique()" | |
], | |
"execution_count": 42, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"array(['shoppings', 'titulos_e_val__mob_', 'lajes_corporativas',\n", | |
" 'hibrido', 'outros', 'residencial', 'hotel', 'hospital',\n", | |
" 'logistica'], dtype=object)" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 42 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "qA064rABlJij", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 34 | |
}, | |
"outputId": "c301dc84-ba8d-4820-f4a4-be3adfa33580" | |
}, | |
"source": [ | |
"df.codigodo_fundo.nunique()" | |
], | |
"execution_count": 43, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"183" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 43 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "V7NOVuK4mO_M", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 100 | |
}, | |
"outputId": "46a35eb1-2a53-4a21-9be2-f31fd2296560" | |
}, | |
"source": [ | |
"# Convert p/vpa to the correct base\n", | |
"df['p_vpa'] = df['p_vpa']/100\n", | |
"# Checking if is it correct\n", | |
"df[df['codigodo_fundo'] == 'VRTA11']" | |
], | |
"execution_count": 44, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>codigodo_fundo</th>\n", | |
" <th>setor</th>\n", | |
" <th>preco_atual</th>\n", | |
" <th>liquidez_diaria</th>\n", | |
" <th>dividendo</th>\n", | |
" <th>dividendyield</th>\n", | |
" <th>dy__3m_acumulado</th>\n", | |
" <th>dy__6m_acumulado</th>\n", | |
" <th>dy__12m_acumulado</th>\n", | |
" <th>dy__3m_media</th>\n", | |
" <th>dy__6m_media</th>\n", | |
" <th>dy__12m_media</th>\n", | |
" <th>dy_ano</th>\n", | |
" <th>variacao_preco</th>\n", | |
" <th>rentab_periodo</th>\n", | |
" <th>rentab_acumulada</th>\n", | |
" <th>patrimonioliq_</th>\n", | |
" <th>vpa</th>\n", | |
" <th>p_vpa</th>\n", | |
" <th>dypatrimonial</th>\n", | |
" <th>variacaopatrimonial</th>\n", | |
" <th>rentab__patr_no_periodo</th>\n", | |
" <th>rentab__patr_acumulada</th>\n", | |
" <th>vacanciafisica</th>\n", | |
" <th>vacanciafinanceira</th>\n", | |
" <th>quantidadeativos</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>100</th>\n", | |
" <td>VRTA11</td>\n", | |
" <td>titulos_e_val__mob_</td>\n", | |
" <td>104.18</td>\n", | |
" <td>20535.0</td>\n", | |
" <td>0.51</td>\n", | |
" <td>0.0049</td>\n", | |
" <td>0.0191</td>\n", | |
" <td>0.0389</td>\n", | |
" <td>0.0731</td>\n", | |
" <td>0.0064</td>\n", | |
" <td>0.0065</td>\n", | |
" <td>0.0061</td>\n", | |
" <td>0.0513</td>\n", | |
" <td>-0.0626</td>\n", | |
" <td>-0.058</td>\n", | |
" <td>-0.1628</td>\n", | |
" <td>8.036133e+08</td>\n", | |
" <td>105.72</td>\n", | |
" <td>0.99</td>\n", | |
" <td>0.0058</td>\n", | |
" <td>0.0057</td>\n", | |
" <td>0.0115</td>\n", | |
" <td>0.046</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" codigodo_fundo setor ... vacanciafinanceira quantidadeativos\n", | |
"100 VRTA11 titulos_e_val__mob_ ... NaN 0\n", | |
"\n", | |
"[1 rows x 26 columns]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 44 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "M1wd4hgqlZVM", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 317 | |
}, | |
"outputId": "332af54d-99a5-4bea-93b5-34da4be4158e" | |
}, | |
"source": [ | |
"df.describe()" | |
], | |
"execution_count": 45, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>preco_atual</th>\n", | |
" <th>liquidez_diaria</th>\n", | |
" <th>dividendo</th>\n", | |
" <th>dividendyield</th>\n", | |
" <th>dy__3m_acumulado</th>\n", | |
" <th>dy__6m_acumulado</th>\n", | |
" <th>dy__12m_acumulado</th>\n", | |
" <th>dy__3m_media</th>\n", | |
" <th>dy__6m_media</th>\n", | |
" <th>dy__12m_media</th>\n", | |
" <th>dy_ano</th>\n", | |
" <th>variacao_preco</th>\n", | |
" <th>rentab_periodo</th>\n", | |
" <th>rentab_acumulada</th>\n", | |
" <th>patrimonioliq_</th>\n", | |
" <th>vpa</th>\n", | |
" <th>p_vpa</th>\n", | |
" <th>dypatrimonial</th>\n", | |
" <th>variacaopatrimonial</th>\n", | |
" <th>rentab__patr_no_periodo</th>\n", | |
" <th>rentab__patr_acumulada</th>\n", | |
" <th>vacanciafisica</th>\n", | |
" <th>vacanciafinanceira</th>\n", | |
" <th>quantidadeativos</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>count</th>\n", | |
" <td>181.000000</td>\n", | |
" <td>181.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>180.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>1.830000e+02</td>\n", | |
" <td>183.000000</td>\n", | |
" <td>181.000000</td>\n", | |
" <td>158.000000</td>\n", | |
" <td>158.000000</td>\n", | |
" <td>158.000000</td>\n", | |
" <td>158.000000</td>\n", | |
" <td>102.000000</td>\n", | |
" <td>20.000000</td>\n", | |
" <td>183.000000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>mean</th>\n", | |
" <td>214.766519</td>\n", | |
" <td>12945.563536</td>\n", | |
" <td>1.431858</td>\n", | |
" <td>0.006131</td>\n", | |
" <td>0.016523</td>\n", | |
" <td>0.029861</td>\n", | |
" <td>0.051211</td>\n", | |
" <td>0.005508</td>\n", | |
" <td>0.004977</td>\n", | |
" <td>0.004266</td>\n", | |
" <td>0.038985</td>\n", | |
" <td>-0.005975</td>\n", | |
" <td>0.000163</td>\n", | |
" <td>-0.098667</td>\n", | |
" <td>4.812253e+08</td>\n", | |
" <td>232.200109</td>\n", | |
" <td>1.077072</td>\n", | |
" <td>0.005738</td>\n", | |
" <td>-0.009444</td>\n", | |
" <td>-0.003806</td>\n", | |
" <td>0.008344</td>\n", | |
" <td>0.118266</td>\n", | |
" <td>0.111330</td>\n", | |
" <td>3.688525</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>std</th>\n", | |
" <td>440.920525</td>\n", | |
" <td>42403.691112</td>\n", | |
" <td>4.181371</td>\n", | |
" <td>0.013986</td>\n", | |
" <td>0.019139</td>\n", | |
" <td>0.024219</td>\n", | |
" <td>0.042841</td>\n", | |
" <td>0.006380</td>\n", | |
" <td>0.004039</td>\n", | |
" <td>0.003570</td>\n", | |
" <td>0.026859</td>\n", | |
" <td>0.084616</td>\n", | |
" <td>0.086451</td>\n", | |
" <td>0.155748</td>\n", | |
" <td>6.498980e+08</td>\n", | |
" <td>504.873012</td>\n", | |
" <td>1.852732</td>\n", | |
" <td>0.003834</td>\n", | |
" <td>0.080512</td>\n", | |
" <td>0.080806</td>\n", | |
" <td>0.104916</td>\n", | |
" <td>0.206672</td>\n", | |
" <td>0.221205</td>\n", | |
" <td>8.478559</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>min</th>\n", | |
" <td>0.860000</td>\n", | |
" <td>1.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>-0.713500</td>\n", | |
" <td>-0.713500</td>\n", | |
" <td>-0.960600</td>\n", | |
" <td>-3.099391e+06</td>\n", | |
" <td>-4.090000</td>\n", | |
" <td>-4.150000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>-0.960000</td>\n", | |
" <td>-0.959700</td>\n", | |
" <td>-0.957600</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>25%</th>\n", | |
" <td>83.000000</td>\n", | |
" <td>279.000000</td>\n", | |
" <td>0.300000</td>\n", | |
" <td>0.003600</td>\n", | |
" <td>0.009950</td>\n", | |
" <td>0.017650</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.003300</td>\n", | |
" <td>0.002900</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.025025</td>\n", | |
" <td>-0.031450</td>\n", | |
" <td>-0.027450</td>\n", | |
" <td>-0.186300</td>\n", | |
" <td>1.106316e+08</td>\n", | |
" <td>94.815000</td>\n", | |
" <td>0.870000</td>\n", | |
" <td>0.003900</td>\n", | |
" <td>-0.002775</td>\n", | |
" <td>0.001200</td>\n", | |
" <td>0.005950</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.000000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>50%</th>\n", | |
" <td>101.500000</td>\n", | |
" <td>2389.000000</td>\n", | |
" <td>0.550000</td>\n", | |
" <td>0.005200</td>\n", | |
" <td>0.014900</td>\n", | |
" <td>0.029700</td>\n", | |
" <td>0.054100</td>\n", | |
" <td>0.005000</td>\n", | |
" <td>0.004900</td>\n", | |
" <td>0.004500</td>\n", | |
" <td>0.037300</td>\n", | |
" <td>0.000000</td>\n", | |
" <td>0.004500</td>\n", | |
" <td>-0.078300</td>\n", | |
" <td>2.396393e+08</td>\n", | |
" <td>99.910000</td>\n", | |
" <td>0.990000</td>\n", | |
" <td>0.005550</td>\n", | |
" <td>-0.000100</td>\n", | |
" <td>0.004650</td>\n", | |
" <td>0.026050</td>\n", | |
" <td>0.045500</td>\n", | |
" <td>0.052500</td>\n", | |
" <td>1.000000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>75%</th>\n", | |
" <td>131.110000</td>\n", | |
" <td>13018.000000</td>\n", | |
" <td>0.750000</td>\n", | |
" <td>0.006600</td>\n", | |
" <td>0.019200</td>\n", | |
" <td>0.038950</td>\n", | |
" <td>0.073700</td>\n", | |
" <td>0.006400</td>\n", | |
" <td>0.006500</td>\n", | |
" <td>0.006150</td>\n", | |
" <td>0.048400</td>\n", | |
" <td>0.035500</td>\n", | |
" <td>0.041800</td>\n", | |
" <td>0.004400</td>\n", | |
" <td>5.910821e+08</td>\n", | |
" <td>119.950000</td>\n", | |
" <td>1.070000</td>\n", | |
" <td>0.006875</td>\n", | |
" <td>0.000900</td>\n", | |
" <td>0.007650</td>\n", | |
" <td>0.046600</td>\n", | |
" <td>0.105750</td>\n", | |
" <td>0.114275</td>\n", | |
" <td>3.500000</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>max</th>\n", | |
" <td>3150.000000</td>\n", | |
" <td>529243.000000</td>\n", | |
" <td>30.000000</td>\n", | |
" <td>0.187800</td>\n", | |
" <td>0.196400</td>\n", | |
" <td>0.208800</td>\n", | |
" <td>0.237900</td>\n", | |
" <td>0.065500</td>\n", | |
" <td>0.034800</td>\n", | |
" <td>0.019800</td>\n", | |
" <td>0.216900</td>\n", | |
" <td>0.221000</td>\n", | |
" <td>0.224000</td>\n", | |
" <td>0.263100</td>\n", | |
" <td>3.807861e+09</td>\n", | |
" <td>3536.130000</td>\n", | |
" <td>25.180000</td>\n", | |
" <td>0.036800</td>\n", | |
" <td>0.055900</td>\n", | |
" <td>0.062000</td>\n", | |
" <td>0.259300</td>\n", | |
" <td>1.000000</td>\n", | |
" <td>1.000000</td>\n", | |
" <td>72.000000</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" preco_atual liquidez_diaria ... vacanciafinanceira quantidadeativos\n", | |
"count 181.000000 181.000000 ... 20.000000 183.000000\n", | |
"mean 214.766519 12945.563536 ... 0.111330 3.688525\n", | |
"std 440.920525 42403.691112 ... 0.221205 8.478559\n", | |
"min 0.860000 1.000000 ... 0.000000 0.000000\n", | |
"25% 83.000000 279.000000 ... 0.000000 0.000000\n", | |
"50% 101.500000 2389.000000 ... 0.052500 1.000000\n", | |
"75% 131.110000 13018.000000 ... 0.114275 3.500000\n", | |
"max 3150.000000 529243.000000 ... 1.000000 72.000000\n", | |
"\n", | |
"[8 rows x 24 columns]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 45 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "xUfx2f2-nW90", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 68 | |
}, | |
"outputId": "4bb73b23-f8f5-438a-aca7-f1107a078de2" | |
}, | |
"source": [ | |
"df.setor.unique()" | |
], | |
"execution_count": 46, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"array(['shoppings', 'titulos_e_val__mob_', 'lajes_corporativas',\n", | |
" 'hibrido', 'outros', 'residencial', 'hotel', 'hospital',\n", | |
" 'logistica'], dtype=object)" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 46 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "od848w9cnn8M", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"# Filtering relevant sectors\n", | |
"df = df[df['setor'].isin(['shoppings', 'titulos_e_val__mob_', 'lajes_corporativas', 'hibrido', 'logistica'])].copy()" | |
], | |
"execution_count": 47, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "Sci_9X36vhR-", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 51 | |
}, | |
"outputId": "111ade0a-a918-427c-82bb-a31462fd50b4" | |
}, | |
"source": [ | |
"df.setor.unique()" | |
], | |
"execution_count": 48, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"array(['shoppings', 'titulos_e_val__mob_', 'lajes_corporativas',\n", | |
" 'hibrido', 'logistica'], dtype=object)" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 48 | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "6gKvjT1tv2Uu", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 224 | |
}, | |
"outputId": "7f3490c0-8041-474c-d872-a07bc8bb11c6" | |
}, | |
"source": [ | |
"df.head()" | |
], | |
"execution_count": 49, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>codigodo_fundo</th>\n", | |
" <th>setor</th>\n", | |
" <th>preco_atual</th>\n", | |
" <th>liquidez_diaria</th>\n", | |
" <th>dividendo</th>\n", | |
" <th>dividendyield</th>\n", | |
" <th>dy__3m_acumulado</th>\n", | |
" <th>dy__6m_acumulado</th>\n", | |
" <th>dy__12m_acumulado</th>\n", | |
" <th>dy__3m_media</th>\n", | |
" <th>dy__6m_media</th>\n", | |
" <th>dy__12m_media</th>\n", | |
" <th>dy_ano</th>\n", | |
" <th>variacao_preco</th>\n", | |
" <th>rentab_periodo</th>\n", | |
" <th>rentab_acumulada</th>\n", | |
" <th>patrimonioliq_</th>\n", | |
" <th>vpa</th>\n", | |
" <th>p_vpa</th>\n", | |
" <th>dypatrimonial</th>\n", | |
" <th>variacaopatrimonial</th>\n", | |
" <th>rentab__patr_no_periodo</th>\n", | |
" <th>rentab__patr_acumulada</th>\n", | |
" <th>vacanciafisica</th>\n", | |
" <th>vacanciafinanceira</th>\n", | |
" <th>quantidadeativos</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>FIVN11</td>\n", | |
" <td>shoppings</td>\n", | |
" <td>4.52</td>\n", | |
" <td>624.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0.0000</td>\n", | |
" <td>0.0000</td>\n", | |
" <td>0.0000</td>\n", | |
" <td>7.155951e+07</td>\n", | |
" <td>7.60</td>\n", | |
" <td>0.59</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0.5600</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>BZLI11</td>\n", | |
" <td>titulos_e_val__mob_</td>\n", | |
" <td>15.70</td>\n", | |
" <td>2032.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0068</td>\n", | |
" <td>0.0068</td>\n", | |
" <td>0.0114</td>\n", | |
" <td>4.280817e+08</td>\n", | |
" <td>11.86</td>\n", | |
" <td>1.32</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>XTED11</td>\n", | |
" <td>lajes_corporativas</td>\n", | |
" <td>7.27</td>\n", | |
" <td>2389.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0110</td>\n", | |
" <td>0.0110</td>\n", | |
" <td>-0.3783</td>\n", | |
" <td>2.629879e+07</td>\n", | |
" <td>14.67</td>\n", | |
" <td>0.50</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1.0000</td>\n", | |
" <td>1.0</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>ALMI11</td>\n", | |
" <td>lajes_corporativas</td>\n", | |
" <td>1200.00</td>\n", | |
" <td>125.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0227</td>\n", | |
" <td>0.0227</td>\n", | |
" <td>-0.2829</td>\n", | |
" <td>2.205547e+08</td>\n", | |
" <td>1983.82</td>\n", | |
" <td>0.60</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0.7119</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>4</th>\n", | |
" <td>DOMC11</td>\n", | |
" <td>lajes_corporativas</td>\n", | |
" <td>479.98</td>\n", | |
" <td>31.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0</td>\n", | |
" <td>0.0435</td>\n", | |
" <td>0.0435</td>\n", | |
" <td>-0.1000</td>\n", | |
" <td>2.441484e+08</td>\n", | |
" <td>999.05</td>\n", | |
" <td>0.48</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0.2344</td>\n", | |
" <td>NaN</td>\n", | |
" <td>1</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" codigodo_fundo setor ... vacanciafinanceira quantidadeativos\n", | |
"0 FIVN11 shoppings ... NaN 1\n", | |
"1 BZLI11 titulos_e_val__mob_ ... NaN 0\n", | |
"2 XTED11 lajes_corporativas ... 1.0 1\n", | |
"3 ALMI11 lajes_corporativas ... NaN 1\n", | |
"4 DOMC11 lajes_corporativas ... NaN 1\n", | |
"\n", | |
"[5 rows x 26 columns]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 49 | |
} | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "IAb-ixK5yvA9", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### Definindo requisitos para filtro\n", | |
"- 90% < P/VPA < 110%\n", | |
"- Quantidade de ativos > 5 para os setores de shoppings, lajes corporativas e logistica\n", | |
"- Vacância física < 10%\n", | |
"- Líquidez diária acima de 20k\n", | |
"- Dividend Yield acumulado acima de 5%" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "qVb1_FKO6ZhU", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"# Price vs net worth\n", | |
"df['acceptable_pricing'] = np.where((df['p_vpa'] > 0.9) & (df['p_vpa'] < 1.1) , True, False)\n", | |
"\n", | |
"# Asset diversification\n", | |
"diverssified_assets_conditions = [\n", | |
" # 1st Condition\n", | |
" (df['setor'].isin(['shoppings', 'lajes_corporativas', 'logistica'])) &\n", | |
" (df['quantidadeativos'] > 5),\n", | |
" # 2st Condition\n", | |
" (df['setor'].isin(['shoppings', 'lajes_corporativas', 'logistica'])) &\n", | |
" (df['quantidadeativos'] < 5),\n", | |
" # 3rd Condition\n", | |
" (~df['setor'].isin(['shoppings', 'lajes_corporativas', 'logistica']))\n", | |
"]\n", | |
"diverssified_assets_choices = [True, False, True]\n", | |
"df['diverssified_assets'] = np.select(diverssified_assets_conditions, diverssified_assets_choices, default=None)\n", | |
"\n", | |
"# Vacancy\n", | |
"df['razonably_occupied'] = np.where((df['vacanciafisica'] < 0.1) | (df['vacanciafisica'].isnull()) | (df['vacanciafisica'].isnull()) , True, False)\n", | |
"\n", | |
"# Liquidity\n", | |
"df['is_liquid'] = np.where(df['liquidez_diaria'] > 20000, True, False)\n", | |
"\n", | |
"# dividend yield\n", | |
"df['good_earnings'] = np.where(df['dy__12m_acumulado'] > 0.05, True, False)" | |
], | |
"execution_count": 50, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "D2tbViMiv-UZ", | |
"colab_type": "code", | |
"colab": {} | |
}, | |
"source": [ | |
"papper_acceptable_conditions = (\n", | |
" df.acceptable_pricing &\n", | |
" df.diverssified_assets &\n", | |
" df.razonably_occupied &\n", | |
" df.is_liquid &\n", | |
" df.good_earnings\n", | |
")\n", | |
"df_results = df.loc[papper_acceptable_conditions].sort_values(by=['dy__12m_acumulado', 'liquidez_diaria'], ascending=False)" | |
], | |
"execution_count": 51, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "zANqRRqafIaB", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"### Resultado\n", | |
"*Os papéis abaixo não são recomendação de compra, mas sim estudo*." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"metadata": { | |
"id": "Zi7hHHrADke4", | |
"colab_type": "code", | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 286 | |
}, | |
"outputId": "9f9ae3e4-b569-46b8-c6e2-5be8c151c1a8" | |
}, | |
"source": [ | |
"df_results" | |
], | |
"execution_count": 52, | |
"outputs": [ | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>codigodo_fundo</th>\n", | |
" <th>setor</th>\n", | |
" <th>preco_atual</th>\n", | |
" <th>liquidez_diaria</th>\n", | |
" <th>dividendo</th>\n", | |
" <th>dividendyield</th>\n", | |
" <th>dy__3m_acumulado</th>\n", | |
" <th>dy__6m_acumulado</th>\n", | |
" <th>dy__12m_acumulado</th>\n", | |
" <th>dy__3m_media</th>\n", | |
" <th>dy__6m_media</th>\n", | |
" <th>dy__12m_media</th>\n", | |
" <th>dy_ano</th>\n", | |
" <th>variacao_preco</th>\n", | |
" <th>rentab_periodo</th>\n", | |
" <th>rentab_acumulada</th>\n", | |
" <th>patrimonioliq_</th>\n", | |
" <th>vpa</th>\n", | |
" <th>p_vpa</th>\n", | |
" <th>dypatrimonial</th>\n", | |
" <th>variacaopatrimonial</th>\n", | |
" <th>rentab__patr_no_periodo</th>\n", | |
" <th>rentab__patr_acumulada</th>\n", | |
" <th>vacanciafisica</th>\n", | |
" <th>vacanciafinanceira</th>\n", | |
" <th>quantidadeativos</th>\n", | |
" <th>acceptable_pricing</th>\n", | |
" <th>diverssified_assets</th>\n", | |
" <th>razonably_occupied</th>\n", | |
" <th>is_liquid</th>\n", | |
" <th>good_earnings</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>29</th>\n", | |
" <td>RECT11</td>\n", | |
" <td>hibrido</td>\n", | |
" <td>99.97</td>\n", | |
" <td>26083.0</td>\n", | |
" <td>0.81</td>\n", | |
" <td>0.0084</td>\n", | |
" <td>0.0245</td>\n", | |
" <td>0.0502</td>\n", | |
" <td>0.1064</td>\n", | |
" <td>0.0082</td>\n", | |
" <td>0.0084</td>\n", | |
" <td>0.0089</td>\n", | |
" <td>0.0580</td>\n", | |
" <td>-0.0480</td>\n", | |
" <td>-0.0400</td>\n", | |
" <td>-0.0118</td>\n", | |
" <td>6.711422e+08</td>\n", | |
" <td>96.24</td>\n", | |
" <td>1.04</td>\n", | |
" <td>0.0084</td>\n", | |
" <td>0.0002</td>\n", | |
" <td>0.0086</td>\n", | |
" <td>0.0242</td>\n", | |
" <td>0.097</td>\n", | |
" <td>NaN</td>\n", | |
" <td>7</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>149</th>\n", | |
" <td>MXRF11</td>\n", | |
" <td>hibrido</td>\n", | |
" <td>10.55</td>\n", | |
" <td>529243.0</td>\n", | |
" <td>0.07</td>\n", | |
" <td>0.0066</td>\n", | |
" <td>0.0193</td>\n", | |
" <td>0.0406</td>\n", | |
" <td>0.0885</td>\n", | |
" <td>0.0064</td>\n", | |
" <td>0.0068</td>\n", | |
" <td>0.0074</td>\n", | |
" <td>0.0555</td>\n", | |
" <td>-0.0256</td>\n", | |
" <td>-0.0192</td>\n", | |
" <td>-0.0346</td>\n", | |
" <td>1.259445e+09</td>\n", | |
" <td>10.24</td>\n", | |
" <td>1.03</td>\n", | |
" <td>0.0068</td>\n", | |
" <td>-0.0007</td>\n", | |
" <td>0.0061</td>\n", | |
" <td>0.0539</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>100</th>\n", | |
" <td>VRTA11</td>\n", | |
" <td>titulos_e_val__mob_</td>\n", | |
" <td>104.18</td>\n", | |
" <td>20535.0</td>\n", | |
" <td>0.51</td>\n", | |
" <td>0.0049</td>\n", | |
" <td>0.0191</td>\n", | |
" <td>0.0389</td>\n", | |
" <td>0.0731</td>\n", | |
" <td>0.0064</td>\n", | |
" <td>0.0065</td>\n", | |
" <td>0.0061</td>\n", | |
" <td>0.0513</td>\n", | |
" <td>-0.0626</td>\n", | |
" <td>-0.0580</td>\n", | |
" <td>-0.1628</td>\n", | |
" <td>8.036133e+08</td>\n", | |
" <td>105.72</td>\n", | |
" <td>0.99</td>\n", | |
" <td>0.0058</td>\n", | |
" <td>0.0057</td>\n", | |
" <td>0.0115</td>\n", | |
" <td>0.0460</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>55</th>\n", | |
" <td>RBRF11</td>\n", | |
" <td>titulos_e_val__mob_</td>\n", | |
" <td>97.70</td>\n", | |
" <td>26983.0</td>\n", | |
" <td>0.43</td>\n", | |
" <td>0.0041</td>\n", | |
" <td>0.0160</td>\n", | |
" <td>0.0342</td>\n", | |
" <td>0.0704</td>\n", | |
" <td>0.0053</td>\n", | |
" <td>0.0057</td>\n", | |
" <td>0.0059</td>\n", | |
" <td>0.0397</td>\n", | |
" <td>0.0504</td>\n", | |
" <td>0.0547</td>\n", | |
" <td>-0.1415</td>\n", | |
" <td>6.409490e+08</td>\n", | |
" <td>95.87</td>\n", | |
" <td>1.02</td>\n", | |
" <td>0.0063</td>\n", | |
" <td>0.0329</td>\n", | |
" <td>0.0394</td>\n", | |
" <td>-0.1984</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>163</th>\n", | |
" <td>HGCR11</td>\n", | |
" <td>titulos_e_val__mob_</td>\n", | |
" <td>105.63</td>\n", | |
" <td>22190.0</td>\n", | |
" <td>0.48</td>\n", | |
" <td>0.0046</td>\n", | |
" <td>0.0138</td>\n", | |
" <td>0.0315</td>\n", | |
" <td>0.0635</td>\n", | |
" <td>0.0046</td>\n", | |
" <td>0.0053</td>\n", | |
" <td>0.0053</td>\n", | |
" <td>0.0419</td>\n", | |
" <td>0.0381</td>\n", | |
" <td>0.0429</td>\n", | |
" <td>-0.0740</td>\n", | |
" <td>1.308826e+09</td>\n", | |
" <td>105.79</td>\n", | |
" <td>1.00</td>\n", | |
" <td>0.0045</td>\n", | |
" <td>0.0036</td>\n", | |
" <td>0.0081</td>\n", | |
" <td>0.0350</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>41</th>\n", | |
" <td>BCFF11</td>\n", | |
" <td>titulos_e_val__mob_</td>\n", | |
" <td>90.96</td>\n", | |
" <td>33876.0</td>\n", | |
" <td>0.45</td>\n", | |
" <td>0.0053</td>\n", | |
" <td>0.0141</td>\n", | |
" <td>0.0292</td>\n", | |
" <td>0.0624</td>\n", | |
" <td>0.0047</td>\n", | |
" <td>0.0049</td>\n", | |
" <td>0.0052</td>\n", | |
" <td>0.0348</td>\n", | |
" <td>-0.0714</td>\n", | |
" <td>-0.0665</td>\n", | |
" <td>-0.1353</td>\n", | |
" <td>1.721999e+09</td>\n", | |
" <td>83.74</td>\n", | |
" <td>1.09</td>\n", | |
" <td>0.0054</td>\n", | |
" <td>-0.0156</td>\n", | |
" <td>-0.0103</td>\n", | |
" <td>-0.1911</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>0</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>71</th>\n", | |
" <td>BTLG11</td>\n", | |
" <td>logistica</td>\n", | |
" <td>107.50</td>\n", | |
" <td>33152.0</td>\n", | |
" <td>0.50</td>\n", | |
" <td>0.0046</td>\n", | |
" <td>0.0140</td>\n", | |
" <td>0.0309</td>\n", | |
" <td>0.0568</td>\n", | |
" <td>0.0047</td>\n", | |
" <td>0.0051</td>\n", | |
" <td>0.0047</td>\n", | |
" <td>0.0364</td>\n", | |
" <td>0.0459</td>\n", | |
" <td>0.0507</td>\n", | |
" <td>0.0452</td>\n", | |
" <td>7.008339e+08</td>\n", | |
" <td>100.34</td>\n", | |
" <td>1.07</td>\n", | |
" <td>0.0050</td>\n", | |
" <td>0.0118</td>\n", | |
" <td>0.0168</td>\n", | |
" <td>0.0390</td>\n", | |
" <td>0.000</td>\n", | |
" <td>NaN</td>\n", | |
" <td>8</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" <td>True</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" codigodo_fundo setor ... is_liquid good_earnings\n", | |
"29 RECT11 hibrido ... True True\n", | |
"149 MXRF11 hibrido ... True True\n", | |
"100 VRTA11 titulos_e_val__mob_ ... True True\n", | |
"55 RBRF11 titulos_e_val__mob_ ... True True\n", | |
"163 HGCR11 titulos_e_val__mob_ ... True True\n", | |
"41 BCFF11 titulos_e_val__mob_ ... True True\n", | |
"71 BTLG11 logistica ... True True\n", | |
"\n", | |
"[7 rows x 31 columns]" | |
] | |
}, | |
"metadata": { | |
"tags": [] | |
}, | |
"execution_count": 52 | |
} | |
] | |
} | |
] | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment