@willirath
Last active July 14, 2023 14:17
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "c22818de-13c2-41b1-a4f0-edddcfee82e3",
"metadata": {},
"outputs": [],
"source": [
"# %pip install nltk pandas matplotlib numpy"
]
},
{
"cell_type": "markdown",
"id": "59b43f0e-bf57-4b14-ab8d-abb860248cb8",
"metadata": {},
"source": [
"# Stats about the use of articles in English and Portugese\n",
"\n",
"(With many pinches of salt....)"
]
},
{
"cell_type": "markdown",
"id": "2d07b9eb-cdd3-4bc1-b4dd-ef56a602181a",
"metadata": {},
"source": [
"## Imports, downloads\n",
"\n",
"Note we first import the downloader and only later import the corpora."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "37e4f03a-09bd-4be7-92a3-6bb99aebd822",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "cda7a5ff-9c9a-478f-ad32-1ee64d843330",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import nltk"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "7e7f7091-1740-417a-bb43-80401f1cb8dc",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[nltk_data] Downloading package floresta to /home/jovyan/nltk_data...\n",
"[nltk_data] Package floresta is already up-to-date!\n",
"[nltk_data] Downloading package brown to /home/jovyan/nltk_data...\n",
"[nltk_data] Package brown is already up-to-date!\n"
]
},
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nltk.download(\"floresta\")\n",
"nltk.download(\"brown\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "408fd3fa-a93c-4518-a17b-7ce6bf4b8b3f",
"metadata": {},
"outputs": [],
"source": [
"from nltk.corpus import brown, floresta"
]
},
{
"cell_type": "markdown",
"id": "c5e9f115-3056-4b29-99b3-a236af0cbb86",
"metadata": {},
"source": [
"## Analysing the use of articles in English and Portugese\n",
"\n",
"The English _Brown_ corpus tags articles with `\"AT\"`, the Portugese _Floresta_ corpus uses `\"art\"`."
]
},
{
"cell_type": "markdown",
"id": "0d10d66c-2a7e-49b5-b5a1-a2423595fd80",
"metadata": {},
"source": [
"### Let's quantify the frequency of articles."
]
},
{
"cell_type": "markdown",
"id": "6260cd93-e2c6-4590-bdd1-71254a346ceb",
"metadata": {},
"source": [
"The tagged words are returned as lists of tuples with the first tuple element containing the word and the second element containing the tag. As Python starts counting elements at 0, we want to count, how often the element with the number 1 contains either `\"AT\"` for the _Brown_ corpus or `\"art\"` for the _Floresta_ corpus.\n",
"\n",
"We do this using a list comprehension mapping each tagged word to either `True` for articles or `False` for all others and then summing and normalizing over all words."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "36e88599-4993-4372-960f-3a15b698f724",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.08541912104113704\n"
]
}
],
"source": [
"number_words_in_brown = len(brown.words())\n",
"number_articles_in_brown = sum((\"AT\" in p[1] for p in brown.tagged_words()))\n",
"fraction_articles_in_brown = number_articles_in_brown / number_words_in_brown\n",
"print(fraction_articles_in_brown)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "98368377-a708-4969-b1be-10376c14312f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.1385873156732058\n"
]
}
],
"source": [
"number_words_in_floresta = len(floresta.words())\n",
"number_articles_in_floresta = sum((\"art\" in p[1] for p in floresta.tagged_words()))\n",
"fraction_articles_in_floresta = number_articles_in_floresta / number_words_in_floresta\n",
"print(fraction_articles_in_floresta)"
]
},
{
"cell_type": "markdown",
"id": "468021a5-8494-4f72-b40e-1ea39f2a727d",
"metadata": {},
"source": [
"### Using a function\n",
"\n",
"As we've repeated almost exactly the same code twice (and as we might want to do the same for other copora), we could try and find a better way of re-using this logic. This is what functions are for."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "66ed7699-d117-487a-8d6e-d88920a2b673",
"metadata": {},
"outputs": [],
"source": [
"def fraction_of_articles(corpus, article_tag=None):\n",
" number_words = len(corpus.words())\n",
" number_articles = sum((article_tag in p[1] for p in corpus.tagged_words()))\n",
" fraction_articles = number_articles / number_words\n",
" return fraction_articles"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "506d60b5-ceaf-479d-b219-95f01b69d81f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.08541912104113704\n"
]
}
],
"source": [
"fraction_articles_in_brown = fraction_of_articles(brown, article_tag=\"AT\")\n",
"print(fraction_articles_in_brown)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c7a0fb35-6e50-4b38-bc98-80276d087a25",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.1385873156732058\n"
]
}
],
"source": [
"fraction_articles_in_floresta = fraction_of_articles(floresta, article_tag=\"art\")\n",
"print(fraction_articles_in_floresta)"
]
},
{
"cell_type": "markdown",
"id": "c6cd10b0-ba6d-4535-93a1-7e52c06bc351",
"metadata": {},
"source": [
"### Distance between two uses of articles\n",
"\n",
"Let's do statistics about the typical distance between two subsequent uses of (the same or different) articles."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "79e21e0c-ec6d-4ebc-a7f0-4a3edcc2663e",
"metadata": {},
"outputs": [],
"source": [
"def words_since_last_article(corpus, article_tag=None):\n",
" tagged_words = corpus.tagged_words()\n",
" distance = 0\n",
" for w in tagged_words:\n",
" if article_tag in w[1]:\n",
" yield distance # will create a generator\n",
" distance = 0\n",
" else:\n",
" distance = distance + 1"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "f6c4f338-538e-46db-bea4-1554974a18b9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"6\n",
"8\n"
]
}
],
"source": [
"dist = words_since_last_article(brown, article_tag=\"AT\")\n",
"print(next(dist))\n",
"print(next(dist))\n",
"print(next(dist))"
]
},
{
"cell_type": "markdown",
"id": "f6d0d1bb-2831-4917-8336-9b3f662766bf",
"metadata": {},
"source": [
"We want to put this into a Pandas datatype which has built in methods for statistics and visualisation:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "19ae948c-795f-4ee5-a9ad-983844a5091c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 99188.000000\n",
"mean 10.706920\n",
"std 10.745149\n",
"min 0.000000\n",
"25% 4.000000\n",
"50% 7.000000\n",
"75% 14.000000\n",
"max 387.000000\n",
"Name: dist_since_last_article, dtype: float64"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"distances_brown = pd.Series(\n",
" words_since_last_article(brown, article_tag=\"AT\"),\n",
" name=\"dist_since_last_article\",\n",
")\n",
"\n",
"distances_brown.describe()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "7903e43b-5c54-4a5d-a973-ddc868e875c4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"count 29360.000000\n",
"mean 6.215463\n",
"std 8.009240\n",
"min 0.000000\n",
"25% 3.000000\n",
"50% 4.000000\n",
"75% 8.000000\n",
"max 1045.000000\n",
"Name: dist_since_last_article, dtype: float64"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"distances_floresta = pd.Series(\n",
" words_since_last_article(floresta, article_tag=\"art\"),\n",
" name=\"dist_since_last_article\",\n",
")\n",
"\n",
"distances_floresta.describe()"
]
},
{
"cell_type": "markdown",
"id": "4f867fa4-5466-400b-84c2-69b6f0d986eb",
"metadata": {},
"source": [
"And some visualisation: We'll look at the quantiles of the distances between use of any article."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "c191f763-dc9e-473c-a86a-4a0d77813d37",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAmEAAAEmCAYAAAAqQEcCAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8pXeV/AAAACXBIWXMAAA9hAAAPYQGoP6dpAABFR0lEQVR4nO3deVxU9f4/8NcMDMM2LKJsijuKCy655YZo7nta95Z93fJq9xeaS3bDrBCztMxcytQ0xdvNNL1ppV4TSbDcMs0VFUUITTYXZBMYZs7vD2RkZJEzznDmDK/n4zGPnDNnec/5IL76nM/5HIUgCAKIiIiIqEYppS6AiIiIqDZiCCMiIiKSAEMYERERkQQYwoiIiIgkwBBGREREJAGGMCIiIiIJMIQRERERSYAhjIiIiEgC9lIXYGl6vR43b96ERqOBQqGQuhwiIiKyYYIgICcnB/7+/lAqq+7rsvkQdvPmTQQEBEhdBhEREdUi169fR4MGDapcx+ZDmEajAVByMtzc3CxyDK1Wi/3792PgwIFQqVQWOQaZF9tMnthu8sM2kye2m+mys7MREBBgyB9VsfkQVnoJ0s3NzaIhzNnZGW5ubvxhlQm2mTyx3eSHbSZPbLcnV50hUByYT0RERCQBhjAiIiIiCTCEEREREUnA5seEVYcgCCguLoZOpzNpe61WC3t7exQUFJi8D6pZpW1WWFgIALC3t+cUJkREVKNqfQgrKipCamoq8vPzTd6HIAjw9fXF9evX+Q+5TJS2WUpKChQKBZydneHn5wcHBwepSyMiolqiVocwvV6PpKQk2NnZwd/fHw4ODiaFKL1ej9zcXLi6uj52YjayDqVt5uLiguLiYmRmZiIpKQmBgYFsQyIiG3bvvhbuTtZxx2etDmFFRUXQ6/UICAiAs7OzyfvR6/UoKiqCo6Mj/wGXidI2c3JyglKphEqlwp9//mloRyIisj03s+5j9OrDeKFrQ8zuHyj51SsmBoDBifgzQERk43IKtHg56gQycgqx/0Ia7mulH8PNf3mIiIjIphXr9Ji+5Q9cSstBPY0aX07qAmcH6S8GMoQRERGRzRIEAZE/xiMuIROOKiW+nNgZ9T2cpC4LAEMYERER2bCNh5Px1bE/oVAAK1/oiHYNPKQuyYAhTIYmTZoEhUJheHl5eWHw4ME4e/as1KWZJDk52ej7lH0dO3YMABAVFQWFQoHBgwcbbZuVlQWFQoHY2FgJKiciImu2/0IaFu2JBwDMH9oKg9r4SlyRMYYwmRo8eDBSU1ORmpqKmJgY2NvbY/jw4VVuo9Vqa6g60xw4cMDwnUpfnTp1Mnxub2+PAwcO4ODBgxJWSUREcnDuxj3M3HoaggC81K0hpvRqInVJ5TCElSEIAvKLik163S/SmbxtflExBEEQVatarYavry98fX3RoUMHhIeH4/r168jMzATwsHdp27Zt6NOnDxwdHfH1119Dr9dj4cKFaNCgAdRqNTp06IB9+/YZ9vvcc89h+vTphvezZs2CQqHApUuXAJRM6+Hi4oIDBw4AAEJDQ/Haa6/hX//6F+rUqQNfX18sWLDApPPv5eVl+E6lL5Xq4VwuLi4uePnllxEeHm7S/omIqHa4mXUfUzafwH2tDn1a1EPkyDaST0dREelvDbAi97U6tH73J0mOHb9wkMl3auTm5uI///kPmjdvDi8vL6PPwsPDsWzZMnTs2BGOjo5YuXIlli1bhnXr1qFjx47YuHEjRo4ciQsXLiAwMBB9+vTBunXrDNvHxcWhbt26iI2NRVBQEE6cOAGtVosePXoY1tm8eTPmzJmD48eP4+jRo5g0aRJ69uyJAQMGmHYyqrBgwQI0b94cO3bswHPPPWf2/RMRkbyVnYoiyFeDz8Z1hL2ddfY5WWdV9Fi7d++Gq6srXF1dodFo8MMPP2Dbtm3l5ruaNWsWxowZgyZNmsDPzw8ff/wx3nzzTbzwwgto2bIlPvzwQ3To0AErVqwAUN
KzFR8fj8zMTNy9exfx8fGYOXOmYcxVbGwsunTpYjS5bbt27RAREYHAwEBMmDABnTt3RkxMjOjv1KNHD8N3Kn09yt/fHzNnzsT8+fNRXFws+hhERGS7KpqKQuNoHbPjV4Q9YWU4qewQv3CQ6O30ej1ysnOgcdOYPOmnk8pO1Pp9+/bFmjVrAAB3797F559/jiFDhuC3335Do0aNDOt17tzZ8Ofs7GzcvHkTPXv2NNpXz549cebMGQBA27ZtUadOHcTFxcHBwQEdO3bE8OHDsXr1agAlPWOhoaFG27dr187ovZ+fHzIyMkR9HwDYtm0bWrVq9dj13nzzTaxbtw4bN27E3/72N9HHISIi2yMIAhb8eMEqp6KoDENYGQqFwqRLgnq9HsUOdnB2sK+xmdddXFzQvHlzw/sNGzbA3d0d69evx6JFi4zWE0OhUCAkJASxsbFQq9UIDQ1Fu3btUFhYiPPnz+PIkSOYO3eu0TZlx22V7kOv14v+TgEBAUbfqTIeHh6YN28eIiMjH3szAhER1Q5f/pqE/xxLscqpKCrDy5E2QqFQQKlU4v79+5Wu4+bmBn9/fxw+fNho+eHDh9G6dWvD+z59+iA2NhaxsbEIDQ2FUqlESEgIli5disLCwnI9aVKYMWMGlEolVq5cKXUpREQksf0X0vD+3osArHMqisqwJ0ymCgsLkZaWBqDkcuRnn32G3NxcjBgxosrt3njjDURERKBZs2bo0KEDNm3ahNOnT+Prr782rBMaGorZs2fDwcEBvXr1MiybO3cuunTpIrp3rbpu375t+E6lPDw8KnygtqOjIyIjIxEWFmaRWoiISB7KTkXxf09b51QUlWEIk6l9+/bBz88PAKDRaBAUFITt27eXG6/1qNdeew337t3D66+/joyMDLRu3Ro//PADAgMDDesEBwfDw8MDLVq0MAyODw0NhU6ne+z+KzJp0iQkJyc/dkLV/v37l1v2zTff4IUXXqhw/YkTJ2LZsmWIj48XXRMREcnfX1n38XKZqSgWjLDOqSgqwxAmQ1FRUYiKiqpyncaNG1c495hSqURERAQiIiIq3VapVOLOnTtGyzp06FDh/ioKVrt27TJ6n5SUhL59+4qutaxJkyZh0qRJRsvs7Oxw4cKFKrcjIiLblFOgxZSoE8iUwVQUlWEII4u6d+8eEhMTsWfPHqlLISIiGyG3qSgqwxBGFuXu7o4bN25IXQYREdmIslNROKnssHFiF6ufiqIy8uq3IyIiolrNeCqKDghu4C51SSZjCCMiIiJZ+OmRqSgGymQqisowhBEREZHVO3sjC7NkOhVFZRjCiIiIyKr9lXUfUzb/LtupKCrDEEZERERWyxamoqiMbXwLIiIisjllp6Lw1qixUaZTUVRG0hC2ePFidOnSBRqNBt7e3hg9ejQuX75stE5BQQHCwsLg5eUFV1dXjB07Funp6RJVLA+hoaGYNWuW1GUQERGZTBAERPzwcCqKLyd2gb9Mp6KojKQhLC4uDmFhYTh27Biio6Oh1WoxcOBA5OXlGdaZPXs2fvzxR2zfvh1xcXG4efMmxowZI2HV0ps0aRIUCkW519WrV6UuzSA2NhYKhQJZWVkW23fpy8fHB2PHjsW1a9fKfVbR63GPTyIiIul9+WsSvj5uG1NRVEbSyVr37dtn9D4qKgre3t44efIkQkJCcO/ePXz55ZfYsmUL+vXrBwDYtGkTWrVqhWPHjuHpp5+WomyrMHjwYGzatMloWb169cyyb51OB4VCAaXSuq9WX758GRqNBleuXMG0adMwYsQInDp1CqmpqYZ1Zs6ciezsbKNzVadOHSnKJSKiarK1qSgqY1X/yt67dw/Aw38kT548Ca1Wa/Rg56CgIDRs2BBHjx6VpEZroVar4evra/Sys7OrcN27d+9iwoQJ8PT0hLOzM4YMGYIrV64YPo+KioKHhwd++OEHtG7dGmq1GikpKSgsLMTcuXNRv359uLi4oFu3bka9SH/++SdGjBgBT0
9PuLi4oE2bNti7dy+Sk5MNz4r09PSEQqEwPPdx37596NWrFzw8PODl5YXhw4cjMTHRpHPg7e0NPz8/hISE4N1330V8fDySk5ONzomTk1O5c+Xg4GDS8YiIyPLO3sjCzK1/2NRUFJWxmscW6fV6zJo1Cz179kTbtm0BAGlpaXBwcICHh4fRuj4+PkhLS6twP4WFhSgsLDS8z87OBgBotVpotVqjdbVaLQRBgF6vh16vBwQB0OaLrl14sJ1QqITe1FtmVc5ANbcVBMFQd1XrlH4+ceJEXL16Fbt27YKbmxvCw8MxdOhQnD9/HiqVCnq9Hvn5+fjwww/xxRdfwMvLC3Xr1kVYWBguXryILVu2wN/fH7t27cLgwYNx5swZBAYG4tVXX0VRURFiY2Ph4uKC+Ph4ODs7o379+ti+fTuef/55XLx4EW5ubnBycoJer0dOTg5mzZqFdu3aITc3FxEREXj22Wdx6tSpave8lX4vQ7uhJJQCJWMIy56Xys5V6QPDy7a/IAjQarWVhlmSXunf4Uf/LpP1YpvJk1TtdjPrPqZEnUCBVo+QQC/MH9wCxcXFNVrDkxJzzqwmhIWFheH8+fP49ddfn2g/ixcvRmRkZLnl+/fvh7Ozs9Eye3t7+Pr6Ijc3F0VFRYA2Hx6rW5l0XA+TtnooK+xiSRCrBq1Wiz179sDNzc2wrH///oiKigIAFBcXo6ioCNnZ2UhMTMSPP/6Iffv2oX379gCANWvWoG3btvjmm28wevRoFBQUQKvVYsmSJYYAnJSUhKioKJw7dw5+fn4AgKlTp2LPnj1Yt24d3n33XSQnJ2PkyJFo1KgRACAkJAQAkJeXB0dHRwCAk5OT4bxnZ2djwIABhpq9vb2xYsUKNG/eHL/99htat25dre+fn18SlHNycqBUKpGWloaPPvoI/v7+8PPzMwTv0nNVXFxstKysnJwcAEBRURHu37+PQ4cOye4vfG0UHR0tdQkkEttMnmqy3QqKgRXn7ZB5XwE/ZwHDPNKx/6d9j9/QypT+G1UdVhHCpk+fjt27d+PQoUNo0KCBYbmvry+KioqQlZVl1BuWnp4OX9+Krw/PmzcPc+bMMbzPzs5GQEAABg4caBRagJJek+vXr8PV1bUkNBRJ1wPiptEADi7VWlelUiE0NBSff/65YZmLi4vh+9nb28PBwQFubm64fv067O3t0a9fP0MPj5ubG1q2bIk///wTbm5ucHR0hIODA3r06GGY/C45ORk6nQ5dunQxOnZhYSG8vb3h5uaGmTNnIiwsDIcOHcIzzzyDMWPGoF27dgBgCF4ajcbovF+5cgURERH47bffcOvWLUMP1Z07d8q1T2VK992mTRsIgoD8/Hy0b98eO3bsQN26dcudK3t7+3L7FgQBOTk50Gg0UCgUKCgogJOTE0JCQgwBkqyPVqtFdHQ0BgwYAJXKdm5Tt2VsM3mq6XYr1ukx7T9/IPX+bXhr1Nj2Sjf4ucvzd3Fl/9NfEUlDmCAImDFjBnbu3InY2Fg0aWJ83bdTp05QqVSIiYnB2LFjAZQMxk5JSUH37t0r3KdarTZcmipLpVKV+0EqOwBdqVQCalfgrZuiv4der0d2Tg7cNBqTB7MrRVyOVCgUcHV1RYsWLapcx/C9AKM/V7SOk5OT0WW4/Px82NnZ4eTJk+Uuz7m6ukKpVGLatGkYMmQI9uzZg/3792PJkiVYtmwZZsyYUelxR40ahUaNGmH9+vXw9/eHXq9H27ZtUVxcXO1zV7reL7/8Ajc3N3h7e0Oj0VR6Hiq6yaA0/JU9BwqFosKfE7I+bCf5YZvJU020myAIWLD7PH65etswFUXDuhX/TpcDMedL0hAWFhaGLVu24Pvvv4dGozGM83J3d4eTkxPc3d0xZcoUzJkzB3Xq1IGbmxtmzJiB7t27W+bOSIWi2r1RRvR6QKUr2dbK7ihs1aoViouLcfz4cfTo0QMAcPv2bVy+fLnKy38dO3aETqdDRkYGevfuXel6AQEB+O
c//4l//vOfmDdvHtavX48ZM2YYBr/rdDrDuqXHXb9+vWGfT3L5uUmTJuXGCxIRkbzUhqkoKiNpCFuzZg2AkslFy9q0aZPhbrrly5dDqVRi7NixKCwsxKBBg4wuw1HVAgMDMWrUKEydOhXr1q2DRqNBeHg46tevj1GjRlW6XYsWLfDSSy9hwoQJWLZsGTp27IjMzEzExMSgXbt2GDZsGGbNmoUhQ4agRYsWuHv3Lg4ePIhWrUrG1DVq1AgKhQK7d+/G0KFD4eTkBE9PT3h5eeGLL76An58fUlJSEB4eXlOngoiIrExtmYqiMqK7bU6dOoVz584Z3n///fcYPXo03nrrrZLB7SKU3rn26Ks0gAGAo6MjVq9ejTt37iAvLw/fffddpePBqGKbNm1Cp06dMHz4cHTv3h2CIGDv3r2P7TLdtGkTJkyYgNdffx0tW7bE6NGjceLECTRs2BBASS9XWFgYWrVqhcGDB6NFixaGgFy/fn1ERkYiPDwcPj4+mD59OpRKJbZu3YqTJ0+ibdu2mD17NpYuXVruuKGhoUY/A0REZHvKTkUx/ulGNj0VRWUUQum9+tXUpUsXhIeHG2Yob9OmDZ599lmcOHECw4YNw4oVKyxUqmmys7Ph7u6Oe/fuVTgwPykpCU2aNHmiwdh6vR7Z2dlwc3Oz+glO5aBRo0aIjIy0aBB7tM3M9bNAlqXVarF3714MHTqU44tkgm0mT5Zut7+y7mP06sPIzClEaMt62DChs808lLuq3PEo0d84ISEBHTp0AABs374dISEh2LJlC6KiovDf//7XpIKJSl24cAHu7u6YMGGC1KUQEZEF5BRo8fKmE8jMKUSQrwafvtjRZgKYWKK/ddmJLw8cOIChQ4cCKBmgfevWLfNWR7VOmzZtcPbsWfYoEhHZoGKdHmFb/sDl9Bx4a9TYOKkLNI61t4dU9L90nTt3xqJFi/DVV18hLi4Ow4YNA1AyuaePj4/ZCyQiIiL5EwQB7/5wAYcSMg1TUfh7OEldlqREh7AVK1bg1KlTmD59OubPn4/mzZsDAHbs2GGYAoGIiIiorA2/JGFLLZ2KojKip6ho166d0d2RpZYuXcpn7hEREVE5+86n4YP/lUxF8faw1rVuKorKmDTwJisrCxs2bMC8efNw584dAEB8fDwyMjLMWlxNEXmDKNkg/gwQEVnGmetZmLXt4VQUL/dsLHVJVkN0T9jZs2fxzDPPwMPDA8nJyZg6dSrq1KmD7777DikpKfj3v/9tiTotovS22/z8fDg51e7r0rVd6QNXeQs9EZH53Libj3/8+3cUaPUIbVkPESNaG55RTCaEsDlz5mDy5Mn46KOPjJ7XN3ToUIwbN86sxVmanZ0dPDw8DD14zs7OJv1w6PV6FBUVoaCggHf1yURpm92/fx8FBQXIyMiAh4cHL6kTEZlJdoEWU6J+N0xF8dm4p2rtVBSVER3CTpw4gXXr1pVbXr9+fcOzH+WkdPb9J7mUKggC7t+/DycnJyZ8mXi0zTw8PPgkBiIiM9Hq9Aj7+pTRVBSuakmflGiVRJ8RtVqN7OzscssTEhJQr149sxRVkxQKBfz8/ODt7Q2tVmvSPrRaLQ4dOoSQkBBezpKJ0jbr06cPnJyc2ANGRGQmgiAg4ocL+OXKLU5F8RiiQ9jIkSOxcOFCfPvttwBKQkxKSgrefPNNjB071uwF1hQ7OzuT/yG2s7NDcXExHB0dGcJkorTN1Go1AxgRkRmVnYpi1YsdORVFFURfnF22bBlyc3Ph7e2N+/fvo0+fPmjevDk0Gg3ef/99S9RIREREMvDoVBQDWnMS96qI7glzd3dHdHQ0fv31V5w9exa5ubl46qmn0L9/f0vUR0RERDJQdiqKCd05FUV1mDxKrlevXujVq5c5ayEiIiIZunE3H1M2P5yK4t3hnIqiOqoVwlatWlXtHb722msmF0NERETyUjoVxa1cTkUhVrVC2PLly6u1M4VCwRBGRERUS3AqiidTrTOVlJ
Rk6TqIiIhIRjgVxZNjfyERERGJtv6Xa5yK4gmJDmFjx47Fhx9+WG75Rx99hOeff94sRREREZH12nc+FYv/dwkAp6J4EqJD2KFDhzB06NByy4cMGYJDhw6ZpSgiIiKyTiVTUZzmVBRmIDqE5ebmwsHBodxylUpV4eOMiIiIyDZwKgrzEh3CgoODsW3btnLLt27ditatW5ulKCIiIrIu2QVavBx1glNRmJHo+0jfeecdjBkzBomJiejXrx8AICYmBt988w22b99u9gKJiIhIWqVTUSSk58Jbo8amyZyKwhxEn8ERI0Zg165d+OCDD7Bjxw44OTmhXbt2OHDgAPr06WOJGomIiEgigiDg3e8fTkWxcVIX+LlzKgpzMCnGDhs2DMOGDTN3LURERGRl1v9yDd/89nAqirb1ORWFufBiLhEREVXopwvphqko3uFUFGZXrZ6wOnXqICEhAXXr1oWnp2eVd0LcuXPHbMURERGRNP7MAT7/7znDVBSTORWF2VX72ZEajcbwZ96OSkREZLv+yrqP9ZftUKDVoy+norCYaoWwiRMnGv48adIkS9VCREREEruYmo2wr08iR6tAkK8Gn3IqCosRfVbt7OyQkZFRbvnt27dhZ2dnlqKIiIioZml1eqyKuYKRn/2Ka7fy4a4S8MX/deRUFBYk+swKglDh8sLCwgpn0iciIiLrdjE1G3O3n8GFmyVPvukfVA99XFLh5+4ocWW2rdohbNWqVQAAhUKBDRs2wNXV1fCZTqfDoUOHEBQUZP4KiYiIyCK0Oj3WxCbi05+vQKsT4OGsQuTINhjSuh7+979UqcuzedUOYcuXLwdQ0hO2du1ao0uPDg4OaNy4MdauXWv+ComIiMjs4m9m440dD3u/BrT2wfvPtoW3xhFarVbi6mqHaoewpKQkAEDfvn2xc+dOeHh4WKomIiIishCtTo/PD5b0fhXrH/Z+jWzvzzsga5iogflarRYpKSlITTVPF+WhQ4cwYsQI+PuXNPyuXbuMPp80aRIUCoXRa/DgwWY5NhERUW0TfzMboz47jOUHElCsFzCwtQ/2zw7BqA71GcAkIGpgvkqlQkFBgdkOnpeXh/bt2+Pll1/GmDFjKlxn8ODB2LRpk+G9Wq022/GJiIhqA/Z+WSfRd0eGhYXhww8/xIYNG2Bv/2S3rQ4ZMgRDhgypch21Wg1fX98nOg4REVFtFX+z5M7H+NSSsV8DW/tg0YOxXyQt0SnqxIkTiImJwf79+xEcHAwXFxejz7/77juzFQcAsbGx8Pb2hqenJ/r164dFixbBy8ur0vULCwtRWFhoeJ+dXfJDp9VqLTbQsHS/HMgoH2wzeWK7yQ/bTDpanR5rDyXh89hrJb1fTiq8OzwIw4N9oVAoqmwTtpvpxJwzhVDZxF+VmDx5cpWfl710KIZCocDOnTsxevRow7KtW7fC2dkZTZo0QWJiIt566y24urri6NGjlU4Mu2DBAkRGRpZbvmXLFjg7O5tUGxERkZz8lQd8fdUOf+WXXGoM9tTjb031cON0nhaXn5+PcePG4d69e3Bzc6tyXdEhzFIqCmGPunbtGpo1a4YDBw7gmWeeqXCdinrCAgICcOvWrceeDFNptVpER0djwIABUKlUFjkGmRfbTJ7YbvLDNqtZj+v9qvZ+2G4my87ORt26dasVwmT1LIKmTZuibt26uHr1aqUhTK1WVzh4X6VSWfwHqSaOQebFNpMntpv8sM0szxJjv9hu4ok5XyaFsB07duDbb79FSkoKioqKjD47deqUKbuslhs3buD27dvw8/Oz2DGIiIjkhHc+ypfoB3ivWrUKkydPho+PD/744w907doVXl5euHbt2mPvdHxUbm4uTp8+jdOnTwMomRD29OnTSElJQW5uLt544w0cO3YMycnJiImJwahRo9C8eXMMGjRIbNlEREQ2h/N+yZvonrDPP/8cX3zxBV588UVERUXhX//6F5o2bYp3330Xd+7cEbWv33//HX379jW8nz
NnDgBg4sSJWLNmDc6ePYvNmzcjKysL/v7+GDhwIN577z3OFUZERLWaVqfH6oNX8dnPV9n7JWOiQ1hKSgp69OgBAHByckJOTg4AYPz48Xj66afx2WefVXtfoaGhqOq+gJ9++klseURERDbt0bFfg9r4YNHoYNTTsINCbkSHMF9fX9y5cweNGjVCw4YNcezYMbRv3x5JSUlVBioiIiIyHXu/bI/oENavXz/88MMP6NixIyZPnozZs2djx44d+P333yt99BARERGZjr1ftkl0CPviiy+g1+sBlDzCyMvLC0eOHMHIkSPxyiuvmL1AIiKi2oq9X7ZNdAhTKpVQKh/eVPnCCy/ghRdeMGtRREREtR17v2yfrCZrJSIisnXs/ao9GMKIiIisBHu/aheGMCIiIokVFevxeezD3i9PZxUiR7XFiHZ+7P2yYQxhREREErpw8x7mbj+Li+z9qnUYwoiIiCTA3i8yWwh76623kJaWho0bN5prl0RERDaJvV8EmDGE/fXXX7h+/bq5dkdERGRz2PslsaI8IGYh0CQECBomdTXiQ9jPP/+MHj16wNHR0Wj55s2bzVYUERGRrWHvl8SSDwPfhwF3k4ALO4GmfQEHZ0lLEh3CRo4cieLiYnTp0gWhoaHo06cPevbsCScnJ0vUR0REJGvs/ZJYae/X8bUl793qAyNXSR7AABNC2N27d/Hbb78hLi4OcXFxWLFiBYqKitC5c2f07dsXixYtskSdREREsvNo79fgNr54b3Rb9n7VlLK9XwDw1ARg4CLA0V3auh4QHcJUKhV69uyJnj174q233sKFCxewdOlSfP311zh27BhDGBER1XoV9X4tHNUWw9n7VTMq6/1q3l/auh4hOoQlJCQgNjYWsbGxiIuLQ2FhIXr37o2PP/4YoaGhFiiRiIhIPtj7JTEr7/0qS3QICwoKQr169TBz5kyEh4cjODiYqZ6IiGq9ouKSZz6uPsjeL0kU5QEHIoHf1pW8d2vwoPfrGWnrqoLoEPbaa6/h0KFDWLhwIXbv3o3Q0FCEhoaiV69ecHaWfpAbERFRTWPvl8SSf33Q+5Vc8t6Ke7/KEh3CVqxYAQDIysrCL7/8gri4OMyfPx8XLlxAx44dcfjwYXPXSEREZJXY+yUxGfZ+lWXyZK06nQ5arRaFhYUoKChAYWEhLl++bM7aiIiIrNbZG1l487/n2PslFZn2fpUlOoTNmDEDcXFxiI+Ph6enJ0JCQjB16lSEhoYiODjYEjUSERFZjfN/3cPKmCuIjk8HAPZ+1TSZ936VJTqEpaWlYdq0aQgNDUXbtm0tURMREZHVeTR8KRTA6A718dbQVuz9qik20PtVlkk9YT169IC9vfGmxcXFOHLkCEJCQsxWHBERkdQqCl8j2/tjRr9ANPd2lbi6WsKGer/KEh3C+vbti9TUVHh7exstv3fvHvr27QudTme24oiIiKTC8GUlbKz3qyzRIUwQhAqved++fRsuLi5mKYqIiEgqDF9WwkZ7v8qqdggbM2YMAEChUGDSpElQqx9e/9bpdDh79ix69Ohh/gqJiIhqAMOXFSnX+zXxQe+Xm6RlmVu1Q5i7e0m3nyAI0Gg0cHJyMnzm4OCAp59+GlOnTjV/hURERBbE8GVFakHvV1nVDmGbNm0CADRu3Bhz587lpUciIpI1hi8rU0t6v8oSPSYsIiICAJCRkWGYnLVly5blBuoTERFZo0fDl/JB+JrO8CWNWtb7VZboEJaTk4NXX30VW7duNdwJaWdnh7///e9YvXq14bIlERGRNWH4skK1sPerLNEh7B//+Af++OMP7N69G927dwcAHD16FDNnzsQrr7yCrVu3mr1IIiIiU53/6x5WHLiCAxcZvqxGUR5wYAHw2xcl72tR71dZokPY7t278dNPP6FXr16GZYMGDcL69esxePBgsxZHRERkKoYvK1XLe7/KEh3CvLy8Krzk6O7uDk9PT7MURUREZCqGLyvF3q9yRIewt99+G3PmzM
FXX30FX19fACXPk3zjjTfwzjvvmL1AIiKi6mD4smKP9n51mgQMeK9W9n6VVa0Q1rFjR6NZ8q9cuYKGDRuiYcOGAICUlBSo1WpkZmbilVdesUylREREFWD4smIV9X6N+hRo1k/SsqxFtULY6NGjLXLwQ4cOYenSpTh58iRSU1Oxc+dOo2MJgoCIiAisX78eWVlZ6NmzJ9asWYPAwECL1ENERPLB8GXl2Pv1WNUKYaVzg5lbXl4e2rdvj5dfftnwWKSyPvroI6xatQqbN29GkyZN8M4772DQoEGIj4+Ho6OjRWoiIiLrxvBl5dj7VW2ix4SZ05AhQzBkyJAKPxMEAStWrMDbb7+NUaNGAQD+/e9/w8fHB7t27cILL7xQk6USEZHEGL5kgL1fokgawqqSlJSEtLQ09O/f37DM3d0d3bp1w9GjRysNYYWFhSgsLDS8z87OBgBotVpotVqL1Fq6X0vtn8yPbSZPbDf5MUebXbiZjU8PJiLmUiaAkvA1PNgPr4Y2RbN6Lk+8fypPdLsV5UF58D3Y/b4BACC41Ydu2EoITUNLd2iBKq2TmJ9Fqw1haWlpAAAfHx+j5T4+PobPKrJ48WJERkaWW75//344Ozubt8hHREdHW3T/ZH5sM3liu8mPKW12PRfYd0OJ83eVAAAFBDxVV8CgBnr4OF3H5RPXcdnchZKR6rSbV84ldEzZAJeiDABAsldfXKj/Aoov5QOX9lq6RKuTn59f7XWtNoSZat68eZgzZ47hfXZ2NgICAjBw4EC4uVmmO1Sr1SI6OhoDBgyASqWyyDHIvNhm8sR2kx9T2qw6PV9kWdVqt6JcKA8ugt3V0t6vBtANW4H6TUNRvwZrtTalV+Cqw+QQVlRUhKSkJDRr1gz29ubPcqVzkKWnp8PPz8+wPD09HR06dKh0O7VaDbVaXW65SqWy+C/tmjgGmRfbTJ7YbvJTnTY7d+MeVsYk4MDFkh4VjvmSXqXtlvRLydivrD9L3neaBMWA92DPsV+ifjeJTk/5+fmYMWMGNm/eDABISEhA06ZNMWPGDNSvXx/h4eFid1mhJk2awNfXFzExMYbQlZ2djePHj+P//b//Z5ZjEBGR9CoKX6M61Mf0fs3RrB7Dl1UpzAViIh/e+egeAIz8FGjWV9q6ZEp0CJs3bx7OnDmD2NhYo2dF9u/fHwsWLBAVwnJzc3H16lXD+6SkJJw+fRp16tRBw4YNMWvWLCxatAiBgYGGKSr8/f0tNm8ZERHVHIYvmamg94t3Pj4Z0SFs165d2LZtG55++mmjWfTbtGmDxMREUfv6/fff0bfvw/RcOpZr4sSJiIqKwr/+9S/k5eVh2rRpyMrKQq9evbBv3z7OEUZEJGMMXzLD3i+LER3CMjMz4e3tXW55Xl6eUSirjtDQUAiCUOnnCoUCCxcuxMKFC8WWSUREVobhS34Uf/4K7J7J3i8LER3COnfujD179mDGjBkAYAheGzZsQPfu3c1bHRERyd71XOCV//yBny8/vNuR4cvKFeUi+Pq/Yf/HgZL37P2yCNEh7IMPPsCQIUMQHx+P4uJirFy5EvHx8Thy5Aji4uIsUSMREclM6r372HsuDbvP/IU/rtsDyGT4snZ6PXD9GHBhF+wv7ETTvJIeS/Z+WY7oENarVy+cPn0aS5YsQXBwMPbv34+nnnoKR48eRXBwsCVqJCIiGSgNXnvO3sSplCzDcgUEjGzvj9f6t2D4sjZlghcu/gDkpAIAFADyHerC4bl1sG/Rv8pdkOlMmuCrWbNmWL9+vblrISIimak0eCmAzo08MbiND1Rp5/Hi6GDO7WYtKgleAAC1OxA0DMUth+PA5QIMadJHsjJrA9EhbO/evbCzs8OgQYOMlv/000/Q6/WVPpCbiIhsw+OC17BgPwwJ9oOPmyO0Wi327j0vXbFUohrBC21GA037AvYOELRaCFdq3yOHaproEBYeHo4lS5aUWy4IAsLDwxnCiIhskJjgRVaibPCK/x7ILfPc5Q
qCF9U80SHsypUraN26dbnlQUFBRhOvEhGRvDF4yRCDl6yIDmHu7u64du0aGjdubLT86tWrcHHhg1WJiOSMwUuGGLxkS3QIGzVqFGbNmoWdO3eiWbNmAEoC2Ouvv46RI0eavUAiIrIsBi8ZYvCyCaJD2EcffYTBgwcjKCgIDRo0AADcuHEDvXv3xscff2z2AomIyPyqCl5dGtXB0GBfBi9rU63g9SzQNJTBSyZMuhx55MgRREdH48yZM3ByckK7du0QEhJiifqIiMhMGLxkyBC8dgLxPzB42RiT5glTKBQYOHAgBg4caO56iIjIjBi8ZIjBq9YwKYTFxMQgJiYGGRkZ0Ov1Rp9t3LjRLIUREZFpGLxkiMGrVhIdwiIjI7Fw4UJ07twZfn5+hgd4ExGRdBi8ZIjBq9YTHcLWrl2LqKgojB8/3hL1EBFRNTF4yZBeB6QcA+J3MXiR+BBWVFSEHj16WKIWIiJ6DAYvGaoqeDm6A0HDgdajGbxqIdEh7B//+Ae2bNmCd955xxL1EBHRIxi8ZIjBi6pBdAgrKCjAF198gQMHDqBdu3ZQqVRGn3/yySdmK46IqLZi8JIhBi8SSXQIO3v2LDp06AAAOH/+vNFnHKRPRGQ6Bi8ZYvCiJyA6hB08eNASdRAR1Sp384pwJSMXCek5uJKegzM37uH09SzD5wxeVkivA+4mA5mXSl4Zl4CkQwxeZDKT5gkjIqLqeTRslfw5F7dyC8uty+BlJSoKW5kXgVtXgOKC8uszeJGJTAphv//+O7799lukpKSgqKjI6LPvvvvOLIUREcnJ3byikqCVkYsr6TlISM/FlYyKw1apBp5OCPR2RQsfDQJ9NOgdWJfBqyaJDVsAYO8I1G0BeLcC6rUE/NoDjUMYvMgkokPY1q1bMWHCBAwaNAj79+/HwIEDkZCQgPT0dDz77LOWqJGIyGqYI2wFeruiubcrXNS8GFEjyoatjItA5mXxYateK8A7CPBoBCjtarR8sl2ifwN88MEHWL58OcLCwqDRaLBy5Uo0adIEr7zyCvz8/CxRIxFRjTM1bLV4ELICfTRo4eOKZvUYtmoMwxbJjOjfDImJiRg2bBgAwMHBAXl5eVAoFJg9ezb69euHyMhIsxdJRGQpDFsyxLBFNkL0bwxPT0/k5OQAAOrXr4/z588jODgYWVlZyM/PN3uBRETmwLAlQ08ctoJKXgxbZKVE/yYJCQlBdHQ0goOD8fzzz2PmzJn4+eefER0djWeeecYSNRIRVVtp2ErIyMVVQ9jKwa3cokq3YdiSGMMW1VKif8N89tlnKCgo+Usxf/58qFQqHDlyBGPHjsXbb79t9gKJiCrCsCVDDFtERkT/5qlTp47hz0qlEuHh4WYtiIjoUTq9gN+T72Df+VT8ekGJhWdjcTuvGmHLxxWB3iVhq7m3K5wdGLZq1P27UFz4AU8lb4X9hqXAbYYtorJE/0ays7NDamoqvL29jZbfvn0b3t7e0Ol0ZiuOiGqv0uC191wq9p5PQ2ZO6dgtJYCSAMawZYXu3wUu7QEu7AKuxcJer0VA2c8ZtogMRP+mEgShwuWFhYVwcOBkdURkusqDF+DmaI/+rbzhlH0dYwf0QJC/B8OWtXgkeEGvNXwkeLfGZWULNO89FvZ+bRm2iMqo9m+wVatWASh5SPeGDRvg6upq+Eyn0+HQoUMICgoyf4VEZNMeF7wGtvHFsHZ+6NmsLhSCDnv3piC4vjtUKgYwSVURvODdBmjzLNBmNIrdG+Py3r1o1mIIoFJJVS2RVar2b7Hly5cDKOkJW7t2LezsHv6fjIODAxo3boy1a9eav0IisjmlwWvPuVT87zHBy8FeafhMq+VwB0lVM3ihbuDD5VotiKhi1Q5hSUlJAIC+ffviu+++g6enp8WKIiLbY2rwIokZBa+DgL744WeVBS8iqhbR/fkHDx40eq/T6XDu3Dk0atTI7MFswYIF5Wbgb9myJS5dumTW4xCRZTB4yRSDF1GNEB3CZs
2aheDgYEyZMgU6nQ4hISE4evQonJ2dsXv3boSGhpq1wDZt2uDAgQOG9/b2HAdCZM10egEnHozxqih4DWrji6EMXtanquDl0xZoPZrBi8jMRCea7du34//+7/8AAD/++COSk5Nx6dIlfPXVV5g/fz4OHz5s3gLt7eHr62vWfRKReTF4yRSDF5GkRIew27dvG0LR3r178fzzz6NFixZ4+eWXsXLlSrMXeOXKFfj7+8PR0RHdu3fH4sWL0bBhQ7Mfh4jEYfCSqfw7wOW9wIWdDwbXM3gRSUV0CPPx8UF8fDz8/Pywb98+rFmzBgCQn59vdMekOXTr1g1RUVFo2bIlUlNTERkZid69e+P8+fPQaDQVblNYWIjCwof/GGRnZwMAtFottBa6S6d0v5baP5kf28w0Or2A3/+8i30X0vHThXRklnlEkJujPQa09saQNj7o3tTrYfASdGa7q5HtZqL7d6FI+B+U8d9DkRwHRZngJXi3gb7VSOhbjQS8zH9XI9tMnthuphNzzhRCZbOvVmLBggVYsWIF/Pz8kJ+fj4SEBKjVamzcuBHr16/H0aNHRRdcXVlZWWjUqBE++eQTTJkypdL6Hh3MDwBbtmyBs7OzxWojslV6AbiWDZy+rcSZOwpkaxWGz5ztBATXEdDBS0ALdwHs8LIequJc+N07Bf+7v6FezgUo8TAI33MMwE3Prrjp0RW5jn4SVklke/Lz8zFu3Djcu3cPbm5uVa4rOoQBwI4dO3D9+nU8//zzaNCgAQBg8+bN8PDwwKhRo0yrupq6dOmC/v37Y/HixRV+XlFPWEBAAG7duvXYk2EqrVaL6OhoDBgwACpORigLbLOqVdXj5e5UMnN9uR6vGsB2ewxTerwsjG0mT2w302VnZ6Nu3brVCmEm3Wr43HPPlVs2ceJEU3YlSm5uLhITEzF+/PhK11Gr1VCr1eWWq1Qqi/8g1cQxyLzYZg9VNcbL3UmFga19rGaMF9utjGqO8VLUDYQdAKkeGMQ2kye2m3hizle1QtiqVaswbdo0ODo6Gh5fVJnXXnut2gd/nLlz52LEiBFo1KgRbt68iYiICNjZ2eHFF1802zGIajM5BS8qI/9OyV2N8bs4uJ5IxqoVwpYvX46XXnoJjo6OhscXVUShUJg1hN24cQMvvvgibt++jXr16qFXr144duwY6tWrZ7ZjENU2pcFrz9lU7LvA4CUbDF5ENqdaIaz0kUWP/tnStm7dWmPHIrJlZYPX/86n4VZu+eA1rJ0fejB4WZfHBa82o4HWzwJ1m0tUIBE9CU4/T2SjGLxkisGLqNaoVgibM2dOtXf4ySefmFwMEZlOEATcvFeAhLQc/Hwpw7aCl64YuJsERVo8/O8egyK+EDDzvISSu59VMsCewYuo1qhWCPvjjz+M3p86dQrFxcVo2bIlACAhIQF2dnbo1KmT+SskIiOlYetKeg6upOciIT0HVzJycTUjF7mFxUbryi54PQhbyLwEZFwCMi8CmZeBWwmArgj2ALoAQLK0ZVocgxdRrVCtEHbw4EHDnz/55BNoNBps3rwZnp6eAIC7d+9i8uTJ6N27t2WqJKqFxIStUvZKBZrUdUHHhh4YGmzFwesxYatCKmcIXs1xK1cLLy8vKBWKiteTK6U90LgngxdRLSJ6TNiyZcuwf/9+QwADAE9PTyxatAgDBw7E66+/btYCiWzdk4StFj4aBPq4ItBbgxY+rmhc1wUqOysKXSaGLdRtAdQLAryDgHqtgHotAY9GKNbpcGTvXgwdOhRKzl1ERDInOoRlZ2cjMzOz3PLMzEzk5OSYpSgiW8Sw9YgqwhaUlXw3nXmeQUlEZA1Eh7Bnn30WkydPxrJly9C1a1cAwPHjx/HGG29gzJgxZi+QSG4Yth5hStgiIqoFRIewtWvXYu7cuRg3bpzhSeH29vaYMmUKli5davYCiayV4W7E9BxcfRC2EjJycTU9B3lFFffYMGwREVEp0SHM2dkZn3/+OZYuXYrExEQAQL
NmzeDi4mL24oisAcPWI0rDlveDkFWvVUnocm/IsEVEJILJk7W6uLigXbt25qyFSFIMW49g2CIisijOmE+1jiAIuJl1H/F3FUg9nIzEzHyGLYYtIqIaxxBGNuvxPVt2wKUEo21kF7YyHoSszIsloev2FYYtIiKZYAgj2Ssbtgx3JFajZ6uuWo+OTX3R0s+NYYuIiGocQxjJhqlhq2k9FwR6l/RstfDRINDbFfXdHRD90z4MHdoeKmuY9NMobF16eDmRYYuIyGYxhJHVEQQBf2Xdx5WMXFxJz0FCem7JPFsmhK3KerZKp1epcU8ctoIeTgHBsEVEJGsMYSQZU8KWyq5kzFZ1w5ZkdMXAnWtlgtaDy4kMW0RE9ABDGFkcw9YjVM4PLh+WBq0HlxMZtoiIahWGMDIbQ9hKz8WVjAdhKz0HVzNybShsPQhZpZcTb19l2CIiIpMwhJFoTxS2HoSsFj4ldyM28pJJ2Lp1BdBXMo6MYYuIiEzAEEaVEvR6pKbdxLWMHFy7lYfEzDxcy8xF0q085FcQttQAXO0UaFjHGU3ruaJpPRc0e/DfAE/nCsJWIVBQWCPfpRytFq4FN6G49CNw56rIsPUgZJWO3XIPYNgiIiLRGMLIiKDX49r5Y8g4vg0BN39CAyEV/gB6lV1JCcCxip3kPHhds2ChT0gF4BkAuFjRhwxbRERkeQxhVC54NRNS0UzqompAsdIBSp/WUHq3ZtgiIqIaxxBWS1UVvAoEFS66doOu1WgEhTwHV42HlKVahFarxd69ezF02DAorWGyViIiqnUYwmoRMcGro5unpLVanEJR8iIiIpIIQ5iNMwSvY1sRkLq/dgcvIiIiK8IQZoMeF7ziXZ+GvvUoBPVm8CIiIpIKQ5iNqHbwCnkeT9ngGC8iIiK5YQiTMQYvIiIi+WIIkxkGLyIiItvAECYDgl6PxHNHkXl8W5XBqxWDFxERkWwwhFmpR4NXcyEVzR98xuBFREQkfwxhVoTBi4iIqPZgCJMYgxcREVHtxBAmgeoFr9FoFfIcgxcREZGNYgirIcbB6yc0F9LKBS+hdcnM9QxeREREtk8WIWz16tVYunQp0tLS0L59e3z66afo2rWr1GU9FoMXERERVcbqQ9i2bdswZ84crF27Ft26dcOKFSswaNAgXL58Gd7e3lKXV87jg1d3CK1HMXgRERHVclYfwj755BNMnToVkydPBgCsXbsWe/bswcaNGxEeHi5xdSUEvR4Ft5NxYuMcNEzbz+BFREREj2XVIayoqAgnT57EvHnzDMuUSiX69++Po0ePVrhNYWEhCgsLDe+zs7MBAFqtFlqt1uw1nty5Eg0ursffhTTDsgJBhQuuT0MfNBKBPccgWONu+MwSNZB4pe3A9pAXtpv8sM3kie1mOjHnzKpD2K1bt6DT6eDj42O03MfHB5cuXapwm8WLFyMyMrLc8v3798PZ2dnsNQopiXhaSEOBoMIp+w644dEV9r7tYefgCOiBm78cNvsxyXyio6OlLoFMwHaTH7aZPLHdxMvPz6/2ulYdwkwxb948zJkzx/A+OzsbAQEBGDhwINzc3Mx+vMybrXH8dHv8VeSJwcNGoItKZfZjkPlptVpER0djwIABULHNZIPtJj9sM3liu5mu9ApcdVh1CKtbty7s7OyQnp5utDw9PR2+vr4VbqNWq6FWq8stV6lUFvlB8m8UiHr+jZG2d6/FjkGWwzaTJ7ab/LDN5IntJp6Y86W0YB1PzMHBAZ06dUJMTIxhmV6vR0xMDLp37y5hZURERERPxqp7wgBgzpw5mDhxIjp37oyuXbtixYoVyMvLM9wtSURERCRHVh/C/v73vyMzMxPvvvsu0tLS0KFDB+zbt6/cYH0iIiIiObH6EAYA06dPx/Tp06Uug4iIiMhsrHpMGBEREZGtYggjIiIikgBDGBEREZEEZDEm7EkIggBA3ORpYmm1WuTn5yM7O5vzqcgE20ye2G7ywzaTJ7ab6UrzRmn+qIrNh7
CcnBwAQEBAgMSVEBERUW2Rk5MDd3f3KtdRCNWJajKm1+tx8+ZNaDQaKBQKixyj9NFI169ft8ijkcj82GbyxHaTH7aZPLHdTCcIAnJycuDv7w+lsupRXzbfE6ZUKtGgQYMaOZabmxt/WGWGbSZPbDf5YZvJE9vNNI/rASvFgflEREREEmAIIyIiIpIAQ5gZqNVqREREQK1WS10KVRPbTJ7YbvLDNpMntlvNsPmB+URERETWiD1hRERERBJgCCMiIiKSAEMYERERkQQYwoiIiIgkwBBWDatXr0bjxo3h6OiIbt264bfffqty/e3btyMoKAiOjo4IDg7G3r17a6hSKktMu61fvx69e/eGp6cnPD090b9//8e2M1mG2L9vpbZu3QqFQoHRo0dbtkAqR2ybZWVlISwsDH5+flCr1WjRogV/T0pAbLutWLECLVu2hJOTEwICAjB79mwUFBTUULU2SqAqbd26VXBwcBA2btwoXLhwQZg6darg4eEhpKenV7j+4cOHBTs7O+Gjjz4S4uPjhbfffltQqVTCuXPnarjy2k1su40bN05YvXq18McffwgXL14UJk2aJLi7uws3btyo4cprN7HtViopKUmoX7++0Lt3b2HUqFE1UywJgiC+zQoLC4XOnTsLQ4cOFX799VchKSlJiI2NFU6fPl3DldduYtvt66+/FtRqtfD1118LSUlJwk8//ST4+fkJs2fPruHKbQtD2GN07dpVCAsLM7zX6XSCv7+/sHjx4grX/9vf/iYMGzbMaFm3bt2EV155xaJ1kjGx7fao4uJiQaPRCJs3b7ZUiVQBU9qtuLhY6NGjh7BhwwZh4sSJDGE1TGybrVmzRmjatKlQVFRUUyVSBcS2W1hYmNCvXz+jZXPmzBF69uxp0TptHS9HVqGoqAgnT55E//79DcuUSiX69++Po0ePVrjN0aNHjdYHgEGDBlW6PpmfKe32qPz8fGi1WtSpU8dSZdIjTG23hQsXwtvbG1OmTKmJMqkMU9rshx9+QPfu3REWFgYfHx+0bdsWH3zwAXQ6XU2VXeuZ0m49evTAyZMnDZcsr127hr1792Lo0KE1UrOtsvkHeD+JW7duQafTwcfHx2i5j48PLl26VOE2aWlpFa6flpZmsTrJmCnt9qg333wT/v7+5QI1WY4p7fbrr7/iyy+/xOnTp2ugQnqUKW127do1/Pzzz3jppZewd+9eXL16Fa+++iq0Wi0iIiJqouxaz5R2GzduHG7duoVevXpBEAQUFxfjn//8J956662aKNlmsSeM6BFLlizB1q1bsXPnTjg6OkpdDlUiJycH48ePx/r161G3bl2py6Fq0uv18Pb2xhdffIFOnTrh73//O+bPn4+1a9dKXRpVITY2Fh988AE+//xznDp1Ct999x327NmD9957T+rSZI09YVWoW7cu7OzskJ6ebrQ8PT0dvr6+FW7j6+sran0yP1PardTHH3+MJUuW4MCBA2jXrp0ly6RHiG23xMREJCcnY8SIEYZler0eAGBvb4/Lly+jWbNmli26ljPl75qfnx9UKhXs7OwMy1q1aoW0tDQUFRXBwcHBojWTae32zjvvYPz48fjHP/4BAAgODkZeXh6mTZuG+fPnQ6lkn44peNaq4ODggE6dOiEmJsawTK/XIyYmBt27d69wm+7duxutDwDR0dGVrk/mZ0q7AcBHH32E9957D/v27UPnzp1rolQqQ2y7BQUF4dy5czh9+rThNXLkSPTt2xenT59GQEBATZZfK5nyd61nz564evWqITADQEJCAvz8/BjAaogp7Zafn18uaJUGaYGPoDad1HcGWLutW7cKarVaiIqKEuLj44Vp06YJHh4eQlpamiAIgjB+/HghPDzcsP7hw4cFe3t74eOPPxYuXrwoREREcIoKCYhttyVLlggODg7Cjh07hNTUVMMrJydHqq9QK4ltt0fx7siaJ7bNUlJSBI1GI0yfPl24fPmysHv3bsHb21tYtGiRVF+hVhLbbh
EREYJGoxG++eYb4dq1a8L+/fuFZs2aCX/729+k+go2gSGsGj799FOhYcOGgoODg9C1a1fh2LFjhs/69OkjTJw40Wj9b7/9VmjRooXg4OAgtGnTRtizZ08NV0yCIK7dGjVqJAAo94qIiKj5wms5sX/fymIIk4bYNjty5IjQrVs3Qa1WC02bNhXef/99obi4uIarJjHtptVqhQULFgjNmjUTHB0dhYCAAOHVV18V7t69W/OF2xCFILAfkYiIiKimcUwYERERkQQYwoiIiIgkwBBGREREJAGGMCIiIiIJMIQRERERSYAhjIiIiEgCDGFEREREEmAIIyIyo9jYWCgUCmRlZQEAoqKi4OHhIWlNRGSdGMKIiEwUGhqKWbNmGS3r0aMHUlNT4e7uLk1RRCQb9lIXQERkSxwcHODr6yt1GUQkA+wJIyLZycvLw4QJE+Dq6go/Pz8sW7bMqFdKoVBg165dRtt4eHggKirK8P7NN99EixYt4OzsjKZNm+Kdd96BVqs1fL5gwQJ06NABX331FRo3bgx3d3e88MILyMnJAQBMmjQJcXFxWLlyJRQKBRQKBZKTk8tdjqzI999/j6eeegqOjo5o2rQpIiMjUVxcbK7TQ0QywRBGRLLzxhtvIC4uDt9//z3279+P2NhYnDp1StQ+NBoNoqKiEB8fj5UrV2L9+vVYvny50TqJiYnYtWsXdu/ejd27dyMuLg5LliwBAKxcuRLdu3fH1KlTkZqaitTUVAQEBDz2uL/88gsmTJiAmTNnIj4+HuvWrUNUVBTef/99UfUTkfwxhBGRrOTm5uLLL7/Exx9/jGeeeQbBwcHYvHmz6J6kt99+Gz169EDjxo0xYsQIzJ07F99++63ROnq9HlFRUWjbti169+6N8ePHIyYmBgDg7u4OBwcHODs7w9fXF76+vrCzs3vscSMjIxEeHo6JEyeiadOmGDBgAN577z2sW7dOVP1EJH8cE0ZEspKYmIiioiJ069bNsKxOnTpo2bKlqP1s27YNq1atQmJiInJzc1FcXAw3NzejdRo3bgyNRmN47+fnh4yMjCeq/8yZMzh8+LBRz5dOp0NBQQHy8/Ph7Oz8RPsnIvlgCCMim6NQKCAIgtGysuO9jh49ipdeegmRkZEYNGgQ3N3dsXXrVixbtsxoG5VKVW6/er3+iWrLzc1FZGQkxowZU+4zR0fHJ9o3EckLQxgRyUqzZs2gUqlw/PhxNGzYEABw9+5dJCQkoE+fPgCAevXqITU11bDNlStXkJ+fb3h/5MgRNGrUCPPnzzcs+/PPP0XX4uDgAJ1OJ2qbp556CpcvX0bz5s1FH4+IbAtDGBHJiqurK6ZMmYI33ngDXl5e8Pb2xvz586FUPhzi2q9fP3z22Wfo3r07dDod3nzzTaNercDAQKSkpGDr1q3o0qUL9uzZg507d4qupXHjxjh+/DiSk5Ph6uqKOnXqPHabd999F8OHD0fDhg3x3HPPQalU4syZMzh//jwWLVokugYiki8OzCci2Vm6dCl69+6NESNGoH///ujVqxc6depk+HzZsmUICAhA7969MW7cOMydO9dorNXIkSMxe/ZsTJ8+HR06dMCRI0fwzjvviK5j7ty5sLOzQ+vWrVGvXj2kpKQ8dptBgwZh9+7d2L9/P7p06YKnn34ay5cvR6NGjUQfn4jkTSE8OnCCiEiGQkND0aFDB6xYsULqUoiIqoU9YUREREQSYAgjIiIikgAvRxIRERFJgD1hRERERBJgCCMiIiKSAEMYERERkQQYwoiIiIgkwBBGREREJAGGMCIiIiIJMIQRERERSYAhjIiIiEgCDGFEREREEvj/Hnpe1Z1V/McAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 700x300 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ax = distances_brown.quantile(np.arange(0, 1, 0.1)).plot(\n",
" label=\"Brown, EN\", legend=True,\n",
" figsize=(7, 3),\n",
")\n",
"distances_floresta.quantile(np.arange(0, 1, 0.1)).plot(\n",
" ax=ax,\n",
" label=\"Floresta, PT\", legend=True,\n",
" ylabel=\"distance btw. articles\",\n",
" xlabel=\"quantile\",\n",
" grid=True,\n",
");"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
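
The two ideas used in the notebook (the fraction of article tags, and the gaps between consecutive articles) can be tried without downloading any corpus. What follows is only a minimal sketch under stated assumptions: `TOY_TAGGED_WORDS` is made-up data in the `(word, tag)` layout that `tagged_words()` returns, and the helper functions here take a plain list instead of an NLTK corpus object, which differs from the notebook's versions.

```python
from statistics import mean, median

# Hypothetical toy data (not from Brown or Floresta), in (word, tag) layout.
TOY_TAGGED_WORDS = [
    ("The", "AT"), ("jury", "NN"), ("said", "VBD"),
    ("a", "AT"), ("short", "JJ"), ("report", "NN"), ("reached", "VBD"),
    ("the", "AT"), ("court", "NN"),
]


def fraction_of_articles(tagged_words, article_tag):
    """Share of tokens whose tag contains `article_tag` (substring match)."""
    tagged_words = list(tagged_words)
    number_articles = sum(article_tag in tag for _, tag in tagged_words)
    return number_articles / len(tagged_words)


def words_since_last_article(tagged_words, article_tag):
    """At each article, yield the number of non-article tokens since the previous one."""
    distance = 0
    for _, tag in tagged_words:
        if article_tag in tag:
            yield distance
            distance = 0
        else:
            distance += 1


fraction = fraction_of_articles(TOY_TAGGED_WORDS, "AT")
distances = list(words_since_last_article(TOY_TAGGED_WORDS, "AT"))

print(fraction)
print(distances, mean(distances), median(distances))
```

Two properties of the gap generator are easy to see on the toy data: the first yielded value is the offset of the first article from the start of the input rather than a true article-to-article distance, and tokens after the last article are never reported.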
matplotlib
nltk
numpy
pandas