MachineLearningIsEasy/375f68fe596ba7efc9b8a50804f8339d
NLP: vector representations of texts
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "2 Векторные представления.ipynb",
"provenance": [],
"collapsed_sections": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "5K1WlUmN6pJN"
},
"source": [
"![logo.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAANAAAACJCAIAAADSRyEcAAAAA3NCSVQICAjb4U/gAAAgAElEQVR4nOy9d7xeVZU+vtbe+5S3335z0zupBAKEEqSI0mQQFBS7jm2U0bGXsY36FeuIZey9MgwKYqELARIISSAkISH1pt5e33vfes7Za/3+2Oec9703BEEHJ+pvfyjJW857ynPWep5nrb0PMjP8ow9mBmACYEQJiPXvkV+k6hBVh3V1hL08BUXQVSYfUKCwUSXQSgu7UbrNwmkSdiMKOXHTGpgBBQJO2vI/5sB/YMDFOIN6lASlHn90lz+yzR/dGYzv1+Ueqo5wUGLygAIAAgYABkAAAERAgcIC6QorK90WmZqhcgvsxqVWw1KVnSdUovZrrP9/5P1DAo6ZgYEJhTIvkJevDmyq9j5YHVgf5PfoyiCThygAFQoLhAIUAAIAQ5zVbctsAJiANVDA7AMFgAJVWqam280r3PaznfazrYaF0Y8TMAFK/IeE3T8Y4JgZCAAQJQCQX6j0PFA+9Ltq39qgcIjJQ2mjdBEtRsEMzADACIxIIvovACMCMDAgAzIjATILAgQQiIAAgIysWVdZV4EJ7Qa7abk7/aLEjEvthkXhvhhcovi/Ox3/B+MfB3DMTABoLrA3sr3YeWP50O+C/F4AQpVE4TAKYgAmgWRhoEQgkQBAs/S0qpJd1bZPVsBSs0AAhdoSgS09R3i28C0RIBIzBqx8Upolg0BEgQwUkC6x9oTd6LSfmZp7TWLGpcJKQQi7f6Bo9w8BODZEDQUAlLvXFHZ9v9J9D3l5oVKoXGZBzAhkC9+WPgKXAneo0tBdbj1SbO8ptQ1UGke9XFEnKtr1taVBMAhAEEAKtSP9pKpkrUKTM9qeGJyW7JuW7G13B7PWuBI6IOWRHbBEFAIByKegAEwqtyg175XpBa+WiXYAYAqO1it/l+PvHHDMDKwNVyt33zu+/euV7vuAA2FlQFhEDEC28B3p+aR6Si278rO3j8zbNz6jt9Q67qcCVoAgkAWQFCyQBTIYHodhSiUWzIJAEEhmIYAcVW1y8jNTXYuynUtye2anD2etcQJZ0Y5mKRARgYMSBUWVmpla8LrMorfIRCsDAOlJIvfvb/w9A46j61cdenxsy+fKh28HIGHlGJGIJeqErDLAkUL7psHFGweX7s3PHPUyDEJJbQtfIYUBB5EB2fwJEBCAjUQFMGwOAAENFgEEMQasPLY0S1d4U5P9yxufPK358cW5vVmrUNV2lRxEFAisq+SPq/TszJJ/TS96k5AOkwbEv2Ni9/cJOGYjQqWujoxt/UJh1w84KAm7gQGJWIkgqSoFP/HY4KJ7u0/bOrxgtJqRglzlK9QAwICEAs2fANngCpEZABCNVDXKAQEBOUIdmrcRgFkgAyIB+mR72lZCz0x1ndn66PPaHpmVOkIsS9pFQCGQdYX8gt1ySsPJH09MvxDqbpW/v/E3DzhmZmZEjHk3kwYhEaB08Hejj37Uz+8WdgMKpTVLoVOqPFzNruk+5c7DZ+wbm8YoEqqqUAMgg2CDGI7wg4BGcQIAAh6FOQjTKwIC1zAHjBjGO0RAFsDAosp2leycVTitefMlU+9bkttNIMo6IRAQkfwCsE7Nf03DKZ+Ubsvfq5j42wacgRqEHi4gIlOAQpE3NrLpo4XdPxTSQZUi0gCctspjXvLOw6f//tDZhwtTbOknpA/ABIIBw1Bm/nAU5mrxLMQcRGBE4JDSgcEYY/zX6K0wBAoAgaxBloKkLb3Tmh+/csbtS3O7PbKq2pECAFhXh63s/MZVX0zMuLhe6/zdjL9hwBm0aaLu3qEZU1sBgHQgpKoOPDr80Nu94S3SaQGAgDipqgBwX9fK/973gs6xaa6sOsoHEAQY4sPgpA5zUaaM8BO9NTmHRuiEGI7wVJgzX0FARgQQgghEMUg6wju37eGrZ/1uRrK7EKQZUAjBQ
YnJzyx7d8PKTwAKoCA2qP8Oxt8q4IgIEfuH8m//929s3r7vny447QsffZNtW4XdPx3d8H6mqrCyJrBlreKu0Zk/3PmiR/qX2CJwLY9ZUigBMApBAM8cc9EXuYa5OhkxEY5Qi53x9sNPSmQCUfCTjXb+qpm/v2za3VJQKUgoE+oqg4npFzef/R2ZnMJ/R5j7WwWcH2gp8Oe33PeeT31ventjV//YrT/6f6c6v+x6+NPKbUChAs2u9ADg1/vO/eXeF5QCN2OVTT2gDi5PjTngKGNOpHRxMISjZAQicBzYJsoIwMmYi7ePiALJ16qkkysat795/s8XZjrHgowAQCF1dcjKzGs+76dO80lMPgrrWGfDEFlis6OAAkUdqT2uxt8qP2AiIYQUwve8gZHKlNYG58lrhx/7rHSbAWWgOWuV+suNH3nkTd/ecTkxpqyyZmF8DmaO0BGWVYE5FAbx/4HBVFyZOaRx4WcNdpgh2gjHRbAQx8zIBpEcfpQ5jGtmEwbDDMCsSUhBOSu/ffSEDz/+0d8cuTglSwK1Ji2dpqB4uP/OF5WP3IXCYvKfMjowMxELIZSUSkmlpBSmLnc8hpK/yQjnB4Gl1EOPPvnWD371JZeeUyqOn5/86mlTu8qUQSBiyNmlB3uWX7/lquFKJuuUNUuONGOc/upCVJj8YrOjjnJFb0bf4pqMiN9CnvhXw9uejtLFEhZiP09I0AHLYpC6oOPBt83/iSurZe0qCaw9pqD57O+m5l7F5AOq+tAVq6UDh/se3Li9t3+kIZtaddLCFUvmQp2oOn7G3xjgmDnQ2lJqzcNbX/uuL372w2961RVnlx586fjh+zzIMmsETqrqDXvO/8GTl9hC2yLQLOpz6GTMhbkufKcOc3+a0kFcdJhI6Sa4dBEcI5xG/x5F6cxXBei8nzshu+99i785PdlTCFJKMLPmoNS0+tvp+a9k7YNQsTYnZkT88ndv/v4Nd+bHi+a3Xdt+0QtWfeYDr82mk8wsxHGUx/6WAMfMQaAtS92x5tE3f+ArX/vU26+8+PQjv7/Y671fJVuBSSIpob++5cW/2b86a5cg5FXhFZ3Mxuo4VkS5no7SIQLHmKvbCD6VjKgBdwKmn5rSTcAcg0RdCFKNTv6DS/5recOTY35GCWYm9gvN5/wwNfdqIl8IxQxELKX49FdvuP77t7Q1NygVesVM3D80+oLnnfzjL7/HUrLepPw/H8cR9p9yGIJiRqC1Zanf3Pnwmz/w1e9+/h1XXnxm950v1f0P2ql2JjJFgv+38RW3dK5usAuGg0GEMcPGoI6NAYQci5lxEqWDGqWLKZeheQwTNsIIbOBa44WmrzPeOAAzhlh7akrH4f/Cuz8AlVLFMS/1H1vft37wlJw1FhAiCrRSQ2vfXDp8hxAWUWDQtmnrnu/+4vaOtiZEDAKtNQWBJuaO9qZ7Htx8w61rhBBa0/ETVI5rwGkiRBQCzdm0lPrvW+9/58e/9dPr33XJ+av67n+L33WbTLRqHVhCE+AnN7zqviMrmtwxHQeWsKONTf9ayHgg0nV8tAKIZYTBQSwjgDGUFvXADal/iDnDmcKtA0EoI8JPcp2MMDCHCdsPkcrApEE50iPGz25/54MDZ0SYkyjsoQdeXx141GAOAG69c32gCQGZIxsGEQCCgJIJ97d3PRKm1OMmjx2/gCNiKQQRdfUOlStVy1Lf/+87P/i5H/33f73//NUrhx69rrLvJzLRRhRIQQDw6Q2vWNezuMEpBFrWwSVShCYTxpgLaRQzhxkVIswhRjSDY3xEMQjNhgEM5jAKUWyqXyEcEUGEvDHMqgZkYVyrwxxGcIwwHf0uE4FQQgsIvrjj7esGVmWt8YAQhc3aG1zz6qDYJZUFwAe7+pWSxDQpZzKzpWTvwEihVBHieMmnAHB82omsiaUQ9zy4+XPfvKmnf3j6lJYlC2euWb/t19/+8KkrThjrvKW47TPSbWHSCGyL4LqNL1vbvaTRLQQko4gDBhcYm
hXIBh8RbQ0pHTNBXa2AOeohZ2NmEVC4EQ43wsw10geIJlZxmDQlSk3kaU3EgGgpaStJyEyh14KAbHYh3Ei4Y2iSKjJyCEMCVIIC0l968m0fV+UVjdvH/ZSyUrp4eOiBf2554W9ZWpYSxrcJmUP9SQSQUihTMjtu5OrxGOG0ZoG4beeBt3zoa3v2d2lNnYd6v/fLO975+heduuKE0tCusUf+DVWKAYghbVW+ve2Suw6d3OAUAi0wzGKxaVbPlgyZOjalC2HHAOwHlC9U8oVypEfrKB3ELlr4FQyNONTEo4WyFDizPbt0buv86Y0JR40WKp6npQCOsjxynNePRenCjROjEoEm/Pz2aw8UZiRVRWsSTlOlZ83who8gihMXz/b9QAhRL/5MGq16/sI50xKuozUdJ2iD4zPCMROiuvn2deWy19qc8wPtOFZLY2btxp2vu+r5+fXXkjci7FwQcINb+tWes27cfXaDUyAWhnmFYaouFJnejdDMRRNi0PzVD/iz77hw/sxmTSyj1MMMxYp3qCd//6MH7t3YaSkppWAGTZxN2V9+z0WppA0Avk/v/+rdvUMF25IC0PMDx1HvePlpF6ya09Gcti2pNQ3my+u3df3o91t7BgqphKUJwPA2MFKYmUPjxORXDO8DjmMysXCkP+6nvrDj7ded9NmErAbE0m0t7vxGduZ5L7vikh/ccEexUk24dhBoAywpBDBrTa+76gIIT8rxElmOR8CZUSxVpBShRCUG1mNVu7rzK9XeB1SyPdA6Y1c29c3/1tZL0laZImfBSEeMUlUd5iAGGYQNR4aD6ZkdDbOnNh69A8vmtV969sL7Nu3/6DfuISITRYTAeTOaXDs8b0qiIW1BQEnX+up7L1o2vw1Cv5AEYntT6sXnLjxrxfR3fvGu/d35hKuITYI3ODDpNTaIkTnK+6FuQQDWIJKqvL8w8792/fOHl3090JKBpZ3ue/AdM6/a8MWP/ctbP/SV/JifSSdN1CxVqoVS5UNvv/qcM5ZrTfJ48uGOo12pGwgA5515YqXqayIlpUAaq1qnTevxdn8N7CYibUs9VMl86dErGECgMSVqmbHG9GtFpDoZEarLkEZV/YCI/UBTnXtAxIEmP9DnnzrnrS89bbxYFWHBlCvVgIiJuFINiBmYBWCp4l179anL5rd5vtbEiKikEAKZ2fN1a0PyY29cjaYMUoepOhlREzS1na/JCNYss9b42v5Vvz70oowqaAKULlf7+x9890Xnnvyrb3/4rFMXe55frfpa6/mzp37rM9e+9y0v0URG4/+Vr9/TjOMRcFIKIrrk/FPf9aYXB4HOjxeHx8pXX3zqKxc/MF4oSimZwRbBf2255PB4kys94jrVaTYR0aAYcxxRrtiAw4iGidB5ASGwqz//5P7+zq5hIULEEPE/nXNCW1PaD7QhhEJg/A8AoMCqH0xvy1x81nwilhKlwIM9+Z/ftm3NowcBwFIi0LRkbuupi6cUyx5G8XcSpavZIgCTKJ05NGJMW8VfHrhi6+iSlCprTdJpKh+8Ob/nplNXLPqXV100raPlpm//+60//MTtP/vUVS86m4iPwxL+8ZhSjTGBCB95xzV33v/YFReuOvuM05a5d46u30hWkybO2eXf7z/lnkMn5uySJgEhTYn92bBkZWhQpDwjkYAct+PGzN/8BQA+/+MH7tvQ6djqorMWfOptLzBbaswmZk3NPb6r17EkR0ZyuKsAiFDxgoWzmpOuFWiSAvtHim+97g9dA+Oa+GNvfN41Fy7VWkuBKxa2P/j4YUQginzCKAqbwldE6eL8Gh9EGI4Fas32d/e8+rMnXydRMwOqRH7zf2TnX/bHh588eemclcvnmx3T+riLbWYcjxEOQukGj2zeFfj+e9969ekrGoc3f0GLFDM70u8uNv7giecnZJXiUlOoOrmmOuvCRkTSaz0akW3GkxxRKVBKdB11+7rdB3tGhEBNxMwNaZeIwmL/hG8wMhNxe1OamY1T/cTe/
t6h8Y7mdNJRDz5+KGzSBGhvSmHs+DEzhXWLmjMcHvpkZziWxsQiKUu7x+f++tCLUqqkCYRK0vjeys6vP76r//QV87Umz/eZWUpxHKINjlvAERMA3PvQliULZhBD37pPU6lLKNcA7ifbz+sv5WwRMMFEUxfqMBdWpY5F6aI3cAKAGIBBCBSII+OV6FUM46Sh95MsVkRmdiyJUSHW8zUi+IFGAD8gQw2ZQQqs8Tbz+5GHXLOCYYIzDJOqEciaRFoWbj1y4a7xeQlZ0UTKbRzc/DV/eOuKE5dLKaQ8rmdCHI+AY2YpBCI+tGn7BeeeJYKe8r6fC6dBa0pZ1Uf75t518MSMXdJxraiGOTMNoFakqq+Z1lO62OgH5okRy3wv4u8x8wvN3Zr3Fg8ExppeATDGXegms0AQAm1LCoG2JULjt766GgIr3HjI2TDahRqlq1U7BOpy4N6w/wozMZFAWTz2vosGZ0xrZ2ZxHKMNjlvACSF6+4d7+obOOfOU4hNfgWAUhIXAmsXPn3yepqg2EJUO6pgV19U0/7SMqFnB5lOIQqASAMyupaIiFY6OV4RAmAis6McMlavzXcNEywKhXPF3Hxp6cv/g7oNDh/vGhERmOlaxP/ryn3CGNYiULG4YPGnD0EkpWWIAj5MnNm5JePsAwKwHddyO41E0EDEiP7jhiba2jukNQ4fv/oW0G0hTxq7cf2Txpr45GatKFLYDhQIjrBVhFLSQo5a0sF4KUU0rkhGhPRcjiAEAiqXKyHi5WPaXzWufM72RiBBhZKx8oGvEUpIixyweUQ8AT0jMDMxAwEnXOtQz+sZP/TZqgYOkY2mOdwzC+YQcub1gjseQztC7Zqg5w7GMMPr2lkOXrGzahkBCWuWx/rEnv9N0+hcMlI/bVSOOO8DFhc57120+c9XJfORGr9hvp9uRtKfVr3avEhCd0ShmYXQxogtnIoHxdhmjjBdNmuda0RzrZAMCAFxwxoKp7bnmXPLF5y1JunagSUlx65od/cOFxmyi4us4rZoRObdQz+wMkog4X6xg3HqJaEmZcC0DT6zVPTiED3PUExxizuxzdLBQX40gEAlVfmJ04cbBk1a3bSwEaenkSgd+lV3+XploO34qp0eP4xBwIKXwfX/b7u5rLjujsOuT0k4TcVpVH+6Zv7V/ZjIOb/GFwvA6hb1DcZ0+crgmXNEwVZkuSdMKBwBgZtq/7MIT4z3RRMD8q7u3feemDZmUrSnqh6tXDVF5tj7uGYM35aprLjxFKWECnhC478jImscOupYis5U69CNGXXUhpsAEU8ZasR+jtwAQmRAYgP/Qdf7pLZsRCKWti92lzhuzy95JrMMe5eNvPIeA46h3Ugh85l3OhsBt29mpMbG0cXd+3w5pNRil9/t9JxMjGNcqzqEcW2xhmTR0Qmo2CCKGVSPzVtyXhkeJziAwzYqMiFKgr6mza7hS9ZMJGyAsck7cXYiqaLUhELSmVMK69mWr6j9736b9d67fm3AUakPbQgIQ4gvC1pPwZkAGABGyA8GABMgkwp5hAGR2RHXb6KId+QXLGnaVdBJVorj/fzJL3oaojtuk+tyKBimFUnJSL8PTD2Ji5nsf2nbC7LZs4R4/YGZ0lb93tH1T7+ykqkbNZ3F4i32syP+oE3kx9ea42Twi/pEEmLBjSknbkralTGe2bakPvP7c97z27EKpKrDWhFQbGG2j7gA5CnJjxYofaD/QVS8INJXKfhR5J/QM1zeBCmSBxAwBSU9bFbIDrRDYFl5alRqd0SZnpNkZbnZGcvZ4WhWZ8Z6eswUSMAuV9Ie3VfvXAzBTMOnQjpPxnEQ4w4KFwAceeWLdxh2XX3j60oWzTP37T34xNEQe23/xaVlvYD2qFDE5MlhzeNG45zY4JWIBUd9Y1PMd17zDggPH+Qomy4iI0pn2yNqNwMQg4Hu/fmTr7p5Uwj5zxazLz1vCwFrzay5befu63bsPDCglJ905HPV8RFeX4wNBhJRrSymYmZClF
CjC3rmwPbeOAAgkBCQWPkkGYQk/Z+WbnPEGZyxjFZOy4khfiUAKitQJMgvNWNU2IIxUcwlVIbCZvNKBW9wpzwNpbnKK20CPk/GcAM5g66FHn3z1O79YKldv/N0Dv/vRJ6Z3tPxJzBmkDgwOHekvrJpeLA73CJkVoMeqiXVHFjjSJ4hcqqjgE/WE4LEo3WQZETlhDBMmlxgkbdpx5K6HdufS7s1/fCKbcs5fNb/qBULgOStnb93dk7Uk8tF7XFvUJn5NChwrVt/0qVsXz215/2vPjsHIYbgFgJgAEAD6rIiFK71picGpyf7WxEjGKtkiMJ4dgWAWBqsEYdsmIiskRxQ9snrKbQuz+6tEaKXLh28vz3yx1bhYJtqiSHocwe65ARyzRNz25P5CqTJ35pRDR/oPHO6bMbVVEz19iCMipdTGrfuasvbsxJa8x2Chq/xNvbMP5lsc5TNHk6Pi6cmh5ptI6YzxUE/p4nbwiNJFtYnIp0MEgKRrZdNOYzYRaNrwxOHzV4WlydlTG8P13yYiLopSk70SFOj7ev22w6bYZT5QU6yRVY1IPllM2OiOzU53z0j1Ze2iANaMBLJKdp2ADr8X3j7IxAggCETA6kipY17mIACjsHWlv7D3p6phmXSb7abldsMis5L6sWA3UQQ95+M5AZwUgpkvff5pv7pt3fbdBy8+/9SVy+c/fXgzfpZS6s77N332m79ZuSAhCttZuMCkhN7QM9fTMmF5FHVz19xQnJBeY+PAQKPOpcN6YRjZcbXm2nA3iMnMiAKuVAOAMAymEvZT+b4Q5eUJmIuaDzCdsF1n4hk2x4kggAJSBKItMXxC7uC0VL8tfLM4MET1LIFaChJRPztxuNqmAa1EjcgC2Rb+uJ8qa9eVVc02E3NlwHUTutJVPXK42veQ1bjMbl4pnIxpKwwFuYm4ZBZoQiHEXwd5zwngjEqYMbX11h98/JXv+PzzVi1NJV0zXf5YXyEiRLHh8d1v+cBXSzqxJNvjF/tQ2AKp4Dub+2dYMoiUQOQjcFTbPialC3MoxyGwjtJBiNXJORIiD6z+9Iu4DHpUqSHazOTIB8A0EaFm9wUwEVTYbnLGljXum5nulYJ8UlWyzcckaktqZqhqe9RLj/mZcT9V9JNVsn22iCVH+dSWfkJW0lbJFd6c9OETG3cKrAqLyyN7D1f3ZdOJbDrhYhGGH6LSk6r1NMidAhDOXdVEUgiQ0VTWv5Z191zZIojo+UEy4Tz/rBUPbtj+pldcFN/3T/l5TWRbcnh0vOp5qXTTiR2DSgAxJpS/b7T1UL7JFj6Ze7DmXUWVqkmUDibmUAwbaY+idBBNapg0GADExHDGzGwS8VFfiCyyeg5HUKtY1Uc+EEg+SYTgpKZdixsP2ML3SWltASIiW+gDwLif6iu19Jabh71cKUgEpEK6CYwi2uNwblA4Qdtnq7/a/MLK2qxVmJbsa7CGtu9dP1x2NYFlWamEk0na2dSDLR0L2hZcqhoWtTQkbUuN5Auf++ZNpXL13W+6Yu7MKc9E1f3l4zn04aRAZl6xZO6vb1tH9HSNzkGgbcsaHhn/9s/+MGt6x1jRX9Q24pEAQEvonUNTCr6dsyuaRSxDY7OK6+IZGHIUFq8mu3RxRo1bSqJWkgmOBgAgQLUapBI2RDnQ8zQQw1FJNbJgoOoFHL1pK0lMlhTMrIQAACJmwVrrirZmuvmVzdtbEyOetjyyARiBbNQ+y8PF9v3j0/pKzRVyAFCiFoJt6Yf3C8aNc2ioHAgABoEgmAYrjX2V5t5y697xuRk5vGDF6tWtSdvbP1aoDoxW82OFgdFS5+GHvbX378+3bR+ak2toGh8b2brzkO8HPX1DN37zw4h/jTj3HALOsJ9lJ8wuV70DR/rnzeogmjx9iKO1QnbuO/Lyt332nNOX/vSr73/b+z7Z6o4EJI35uXOoAyAKYVHMqmEOQn4Gx6B00XTomi1SC0qhZqjtEhFpoornz5raeNk5izlqAugZHAs04
VE2nNm2ENg3VEBEY4IsXzBlalv2SG+eiM89ZTaEMU8eHqguzO1/4fRDwEFV26ajxRZBwGLP2LTdo7OGqjkGoYS20YsKp6bvBMNqLdaR/Og2IgBEKgWJinYa7DHNdsGzt3anD6rzpzoHFnccXDRvhEH6JLTWVc/3yuP7evVBmv/Dm7dlUk6g7f6hPDGpv8qqws8t4Ii4uTHT0pjdsqNz3qwOIq4Pc8ysNVlK3f3g5rd88Gv/+vp/eu9bXgIAK2aLtFXWnJWoy9rqHG1RqMNpMIwTMQfhejHHpnR1qTf6T11Xba1EgAAAH/zn5197zdmI0NGSzaZd5rBLe8MTh6UUxJMjHAJrZseWO/f3lyqea1vE3NaU/sEnrnzo8UPT2rKrT5pFREpJRJaDdz1v6k6ipCYFyAJJIh0utG0dnjdYaTRPIzH1fIqm1YTTKIwwnUBHoTY7DQCRPG3nvUyjPUaMQrDyjhAFnWNTD4xPmZE8siy3q9EZD9hRUtrZplMa9NnNowunrnrlh/ub0vjBt11lKWWahJ87PJjxXNZSEUlrRLF4wYxNW/ZcefFZYYskAJgsw6yU/O4vbr/u6zde/x9vufLis4LAE9I6ZS6qUkCMlqDhcrK3mLWEDvEUxrmIDtZN0wpV55+QEaEtwrUZXExhCQ6IeHp7Q7z7QaCZwbLk5p1dax/rTCdtPyBmNjNo4kMAZttSh/tGb1+766UvWO75GgCmt+VeduFyACAiBpSCDjz601LPw9JOaWJmcKRf9N3HBhd2jk0VCI6oMgoyBS1zDHX3CYSNcrEzghHLjwp0yAQwWs1gmhgYUenqKOiKqyQxHyrN7q10LM3tPCHXCYB+IH3gcvfAXLtr9fzxj3z8+gWzmjWRlH+NXrXn8DdC9wjx1BMXPLH7ICLKaJELczNJKT543Q+v//5vbv7+R6+8+Cw/0AJRIC6dESJTCd1fyuSrjkTiOgYV9U6EIS4m6HHnbJR6Q7cXw/ooAMT1h2iqALNjKSHQUnLS/a2UtCy5bU/Ph7/yh+iLjOO+xm8AACAASURBVIiuo8wMGtdRBrzElHTtr93w0LY9vbaljCsUaCLSQggpaHDHz++59w6NLgAxsKO8Q+Nttx86fV9+mi18JQJmBAplMIQ3Ql33OUR7Hv3BVO4jT8ecBh73U8yITICCgyIHJQYBDI70GORjIyet6T2zFLiO9JkhkXD3d42cPr1rTnI3MQisbf45HX9OhGOOrjVOdOuPGsbxOXnpvK/+4Nb8WDGTTjCzmflXqXivfdd/9g6M3HfjZ6e0NQVBYClJxABs+T0eCmaQSL3FXFUrW1YJRKwAMDLYYBKlw0mULuyUxfqGuZqMMKuK4MGuYVtJzSTRzMcBBKh4wZG+0bWb99+5bpcfaMdWRIwAmvSeQ4PppAMAfqDN3GMmVhJLZe/t1/3mdZef8sIz5ne0ZGxLBhqGhvNHNv94/Ya1Rc+SEoDZEvrxgXlbBucLwY704oZzhNqUGTPjAupjc9zEHPZYYa0CG+oHKgaJgGRo4ukq+QXhNAIGBAKRXVntrU65u+ecM1oe7Uj0IbpdfUMNja1+//0cULJjNZM2Dwf7MyDxzMezBpzBWqyfKXIO6z8AEL5mAsDsGe1Sys5DvScvm+d5vm1bR3oGX/a2z86e0X7PDdfZtgq0VkoBE6IAQL/YA6iYWSD3F9NE0XysuLeoNnH9KEoX1VIBImf4WJQOgAFsCz/69T9EpYD4MFAzVaoBM6SSjm0pIkYEJbBY8t78yZtinYGAlkIAIGJbST/QX/nF2h/fumlKcybp2hWfFrvr0nQQVUpKQCZEWNe9dE9+mit9ADR9VrX+yhrmanNnolalsDGpVrurIdLsO1W07ZOyZMAggAL2xwFEGPgRCYQjPI+d+/vPOrXpsWV2d2//yOIFM4RKFrr+KK2E07KSSQM+t9Lh2aVUc2xCiENd/Wse3rr/UK9BXl2wD0ONafpAAE3kOva8WR2/vHXN7s4u27bWP
7bzgms+/PzVK3759Q9YljQBD0J2JUj7VB1GlOYcD5XTkbFmfoYj9wMif4Mhag0JO0DCTlqopaSwjhpuZMLr9Ukrss4AWKHIptxcxhVoeFikdON8Z74R/RURiFkgNKTdgPhAz/Dm3YMNlYdarS7lZKQAZA0A93ct2zM6NSE98/Wo152jX+fIZ6m9aJb94qigVzvL8QdC1kc+KY8UAgEAMJFfNA2dZpEeBCYQEkkgbRpZtaF7ql/Jd7Q1B4EWVqJ8+A5/rBOFNLWH5248iwgXx7b//O7N3/n57RXPsy31hqtf+KFrX6aJzNwNKYUQQmstpdSaNJFtqR/deNfmJ/aOF8u/u/uR1acu2bKj8wNvu+qN11xkFlmJ/TlDizkoclAAlCZC5Stu6EBxnGpC3h/19JpSPHJkUk1whuurDsdwhhExbp8z8cNsRGsytcfoFzAqZdZFF6ipFkAARq1JCg5k4uSWQydP6S4HDgCZHbm/a9nBsTbX8giiKRm1JVCMCIVYToccM+xIeJr0GvbSMUBA6JMVd6+wLsc3GYTVPHO3cdKhO/bMyRcWvqHZ8QITZmXp4G/TC18v7ByHqeY5Gc9iu1qTEOK2ezd+9hv/IwRmU0lLqS999+bf3f2IbSmzfjYifvC6H579kvd//Ue/lVLYltqyo/Pfv/CTiuenkq6U4qY/rL30glVvvOYizw/iyetmhCKDKqArjAIBNOO456CZwIlhnIrb4MCEB7M8zLFkRBQAII4IUfEqKqCHQTEOWqY0SvVxNJweUQssYSmLalMPOVKXiOwFqsUZOalldzmQ5tpbQj/cs+jAWLurPKJasAyzYbhT9fsKWKMLTy8jwqNCYGL0tOlnQQBgXYbQAI8wxxCz3f7+3n3B6h5/noUVYkChKCiWDv0+2oPnSkA80wgX78Edax51bUtJGWgthGhtyv7n926+/5FtRCwE9g+OPvDIE45tffl7t2zbeSCVdJ/ceziVdO3Q5hHNjZmdew9rHabRST+BCL29vcViIZVQATGxqGoFyBRxYxPKwk5/jGF2LEoXyYio3TeMCE8pI6IGE6gv9kf4xPqZ/XHPcNgwHLUHRPFICn16206F2icFAK70N/fP3TUyNWlV2dzh4UbMDschFaH+oCZQuhAzYI4zjHNx6AUI+1YwYMkcPvWEtR8h0liYAIxEJKWwLau3f2hGR+uG4aVnZEfaknlfKyHdYKyz0rc+MeUsZoobq3WUZE2r4l8ANoA/wxaRInwCmpGnROw6dlNDprEh3dSQacilOXxqh2jMpZsaMgnX8QMdf56ZfT94GoNRyXBxGnPLBiTCq8lxtIHopg3R99SULtIXACgRpEAhBAoQaB7TbKqXIVzjsBHFnvD6oxBCoBCAiCKuo3KsixgFCEQZLziC7JO1rPlgeyrvk4UAjgoOjbc+PjDblb6mCF6h+jEyHzjsfImmcNdROoAJwRiYMeKTHDPXME4yAxOLWvcKa6i3WZAZyLatYqmyefve/qGxmR1NDHL94MpKYBlXG1Wi2rdWlwcAkIHMLaukNP9gfcPqnzueaYTDsEEbLr/w9F/euqZS9SxL+YEeGSt8460vuejcU+JPzpjauuahra+68vzXvPT5ANB5sOef/vmT5WrVsW2tKT9evPzCMxBRax2vul37CYDmhkTVlkThHR7ouI4Qpa+Ifv0JSsdsMO4Hge8TYtQmyYAAKFBI4VhKoiAmru/rBDCBz/e1r71wAgEAMlq2VEpyZKwgQLnqU+S/AEJAsik5Mj+zv1AFZl8gj3rWuiPzBBIAAQhi0AFhtCeMIBCVElEXcDR3MaJ00dSzSM0YsmY8kXpKFyLPrNaJEcejWhoFJGbHtnbuPXz32s3liqekvO/hJy5PuW66+dGhZWe3P2b2i7VX7r43Pe/lJmWVypX//u0Duzq75sxov+bycxuyKfP6cw44ADAr7p5/1orPfej1X/vRb8cL5WTC+fT7XnPhOSs9P4hFw4fefvWH3n41AHh+I
AW6ju06djqZGBodsy3r/W+96rUvvYCO7WtrFgAictxZRfWkmjnAEzAHAFF6jdNf6NIxsNbU2pTpaM0Chzw45lz58fKR/tFypZJNJSbLCCECrae25Voa08xgWsMFQu/A+MBoSSkMp7siLpk3xbElRxU2j61lqY2z0m5VWwhgy+DB7iXCaRSVArAAYClFa0MqDPaAAsDXNDJeiYIsHo25WlUu0g/m7qqXEWGwBjMfIoppRgxFPFVJMZwv3HX/Y36gXccG4CM9g3ev3fryy84+UJw5daxnYVM/oUtkVUd3V4Z3qez88ULhDe/7ygPrn3Adq+L5v75t7S++9oGW5txf0lfyLAAXu2BvePmFV1x81uHugantzS1NWWY2U07Mu6aKEBfmXvmOL1xy/qkffPvV+w72tLc0dLQ1AdQ8taf4FWEhCuaAQQhkJUxTUqgSJ2EOoqahkLDVxR5ElkIOjRdffumpH3rTxUf/UKXqdx4evPGOR2+++3HXscPEjYgMAqFY9v7l5asvO3dZ/Vd+9ruNn/nu3c25lDYgB/zCuy+b1p6rfcIbKO3Zr7mNARwRHMg3lqavzOzvvuOhsaQrfT+Y0ph66QVLQ0oKIKUolr0b79zmBTqe2V/DHIRML45jyLUAV4+5kAAAC6jpCQwdNQZE1qwceaR7oFytJlyHNAFCMmH3Doz0D402N7fs91fKrv85cHDHrOkd82e18OhG1XTCzXc8vOahrTOmteqApBKbn+j83g13fOQd1wSaEI95BZ9+PDvj1/yG1tSYSzfm0gCgiQTWlrMwQi7QzMy2Uu/6j++kku6n3/caRDxpydzo809bnJAOCAsCHxAFkiN9A07iqO+o9m9cSIhkRL1hAGBEAGliZjOlmZnNNEAEcB1ryfyOT/7rZfNntn7u+3elky5TaKkRsWupGe2NhjJLIbQmKcXMjkazkUgncsXzzUqGiKCkLPes96oVoZLMuhzAoz0zNGtN2gQws2KrEOh52sRwYrYtmUrY5dGiCLUDhG0KABBpgtrCscd2hs2/CgOu6XmFESs19Vgj1CJyhIgQBMFNtz2UTtipbEtTaWCG2Lp5x8HTTlyAvPWJPwz+8u7u1qas5wUAQD6lUu6O3YcAQAjk+h6bZzP+nMBo+iY0EYWTrMLXTaRVSlpK2pb6ya/uufP+x35y/XsAQGtNRM9kiTxUCVQJYGIAiZyyq2aVyZpijIhWeOG5TkbE5scEmyE8vQAghLCUtJQ09NE8RuM1l59+3mkLCsWKaetFgEBTOuV0tGYBwByeqdFNb29IuHZAFFvNDRnXUtKxlW0pwWP+6C6hXGJOOdBb7Shwcy4lLKlC34Ugl3ExLGsAAjCxpWQ2ZYfWcigH6+yYyDGB2gHHznCUhinkGwikRGAwCwwondgIQgTPD2ZNb2tqyBRLFUBgprFCefH8GVdfuvrkpfM5KG/unSKTbRRUd3UeGc4XWmXnvJltFc83bpelVLXqd7Q3gVmL48/ADQD82d0iAidPxzBo27O/+877H3Vdu7Up94Vv/eoHX3pXS1Mu0PpoE+SpBgIAyhRaaSj1AqBAzjllivu6w96QmME9BaULL5FJQJFDYuQ0SrHv0MAda7fbSmbS7mXnnZhK2CYYX/mCk+59ZDeGVhoEQdDa2NjUkIRYjzMgYntTujHjDuZLlpLIwMQ//s3GhmwCgLxAnjune07Cr2jbUrj78OiPNjbm/YOuxX1DRRMXESGXSpCJUIgASMQCIZdxQ0YQqmqs6+Z7GmcYIG6SAWBGidoSPkUhEWWy5tIhEumEa192wao/rnt8aHhcCLFy2bzzzjzRdawprY1LF8746a2wfeDgCxfQ5Zecl0m5jqsuvvLCi97c093d47rOSMVrzKX/+eUXxmrsz0PO/0J7EjObx/Dcft+md3z824Vi2VJyvFi+9rWXnXXKYj8InhnaIk9SWtJuClgbH6glUQQj6akmBrmWXp+C0oWqACeEPCICkE929nzqm39obkjlx
8tbd3Vd9+4rzGk7YU57NuUaZxEQ/ICmteWUlETEDIHWtiWJOZV02lsyvYNjtiUJAJl/eMsjxAQoPM9f+EZvwfJG7elcQqzbSbesHcgmh4lZSGkrSZqUxGzaIWIA9LVWxtZiyKWcSPlATQKHmIPYpYsUa43SRX6HcaXBkr4tfIoTppWMXAxGBoHC9/0prblrLj9nJF9QUjU2pHxNVc+vsqeUetmlq266pffUlR3ppD1WKKd8Pji4Vgh15cVndfUNd7Q1Xfu6y5YsmPk0gu+ZjP8dwAHASL7wyet/GQS6rTlHzAnXuf+RbSP5QkM2FQS6/pGxx745kFkzgUxOAQ5M/3hbqiCQainF+L8TMBcbu0c5wyGx5vhHHVs1N6Qas0nHVo/tOFT1fNtSiJhNu6mkPZwv2YgoUGs9s6PR3Mr58fLug/2nnzg7CLSl5Iz2hk1PHI6t+1zGRYCArRanrz3j+SQQNQMfLLQnHJ1wwCQf0zjn2CqdsLUmJfBgb6GtKeXakphzaVea6fYAUfWzHnOhdAWeaHfXKB0AAjG4omqhz2EVS6JKYTgrC0xiFSh8PwDE5sYsM1SrPhqDkVFrnbCVJ5o6R+US9IWUgPCrm372+pe889rXv5ijZjzzYNa/BC3P+stGhxIRMxOz1pqYpRRjhdLgSD6ZsD0/CAKtlDzUNfDI5p2IaFnmobECEYk40OZLsX8abRkAmKUUVmY2k0aEgMSU9FikGzhWBFBfBIgMtKOd4fDSYbRxs/OBjhcsj5+YoYQwT2KAaCNzpjWbxDGULz6647A51wAwe1qTec4QMgNxoEmTDgKa4g4iEwBKQWNVt3u8QWBADETEYV8TpRK26yjjY3X35yvVABE16XTStu0wmkJoWcemdXzEHFG66KxNcIaZGJOqrIRmQmYNwhEqA6zDjYGZWUYGN0EQaK1DMzs8aYgISordQy3EkHTt29Zsnj3FvvbVZ2hNTGQu+l/eEvysAYeIpkLPzAJRSqmk3Prk/v/8zs2lclUKiWiCGdi2+tgXf/bCV37knR//9s9+fe+OPYf8QEsplJRmBVqTrbQ2F4WJWUjrqz+55ys3bMmkk0Q6INGeKmTtSkBhsjkW5mJ/s15GROzZnPFw9yFqBQCuxdrQ60cGRGK2lJg5tcm8NThS2LG3B0LVAbOmNsmoEmLAQIyWDNqSI5oNdqmnkCsHloBQD5vdIaJMyraUNPfGwEihXPEECk2ccFXSscyqeBzVEcLjjMpUkYyY1GASV1iYGbJ2EUNTjlAlUSWZCNgsEMVxHQYj19AcLzCYJlJAkXSwazwDdtOTuzt37ut60fkrveG9BmRCiv+VOV3PrlvEXKF1m3a0NGZPmDd9YCh/613rb7593ZGeweevXvGSS1bfcOualsYcM/f3j77uqgve+5aXbN7eufHxXTf94cGv//i3tqXmzJxyyvL5q1YsXLpwVi6bEvE0Sa2llJ+8/mef+86dl5wIwakWAgckGt3ylPT48EDSEprqW8lrrRJhEfMYlC7UUzGyOC441MXXsF+EAczCR0lnWlvorvUNju0/MgQAhrjMmNLg2orYuIOIyAGJbKKcscuaTacW9BRyCOMcmkQQZj7ihrQrEBA40DQ6Vi5VPIFAmm1bZlL2cL5kpmFxVCYFjCNaPaWrb52JyjAICNxgj4fxkbWwG0DaQD5GBf8QWmGHPoRVC2PfMQgEy5KWhMExeLJL3Lt24/PPOrWlqaEyesBuWx1VcJ9peGNmCp8EORmjz6J4T8zI/O+f/8n3brhjRkfrKcvnP76js60595JLVr/4wjPaWhqCQM+c2vqLW+5LJpzXX/2Cd73pimTCmTmt7cUXngEAI/nC9l0HN2zZvWnLnpt+vzYIgiltTSsWz1l10gnLF82eNb3tngc3f+unt82a2tRbUKNlO2X7vpZptzK3cWhLX0d47jE6xfE6gHhsSifqokV0HBh5C2ERqwa6MB4GWrc3Z1sa00Z3dw/ke4fyhVLVdPlOa
cnm0u5YsaKUNKFLs2hyCrYIiFEgFwM1VElLzEfiMcyFgNCQSTCDQCxV/UKpOl6sogBmlkJkU64OVxWAKKAZhBxN6WoyolbtY7CEn7MKZJb4ZS0SbSgkUxVQQlQtRpOVw1sWohk6IITwfH3zbesOd/cHMP6ZH+5+2cltJy+bVyqXJA1TUBYqEXubzwQqiCilBDA9ghO+9ywAJ4UYGhm7+faHOlqbxsZLuzqP/PBL7zLPVgeAIAiEEB9421Ubt+y+4qIzX3Xl+WScNzYlF9GYS5+9aunZq5YCgOcHuzu7Ht22d+OW3V/5wW9G8oX2lobxYjmTTgoI+gvOkXxq+ZRhTysAWNzSf0vkq4f3ezQRJjy8CTIi6sDgsHIw4SgA4+08hXOJgAi+rzvasq5j+YFG5O7+/Hih0j80lk62ElFDJtnSlBnKlywpY4rV6BSMFSqRxqtu0XcE6pDwmzhLrISRqCQFFovVqh+MF6txvaoh4wJElACgNk1mEuZqFzQutTIgaJY5q5i2SkShYSQT7RxNHAfTK8hhz0iIufD3gIGVFHc98NjjOzqzqQRRMFBKD44ezI8VXNcmv0DVEaESsRs4SfMZShmyTGZilkJWPe/+9dvmzuxYMGfqpKrSsyjeB4HOpJPnrz7xN3c8TET/+vrXrVgy1/cDFCiEkKGJwNWqb5QpAKtobQcjEQz8ENC21LITZi07YZZ5+tiRnsH1j+38j+t/IaRAoLJv7+xvXDltEIB9LRa39KWtqtYIIvY8oiwTmaBPiTmTUusDHEdOfu3S1k5cGFWCQM+c0sSRtu3uH/X84Ejf6NwZrYEm21LT23NP7OkG1zY6QiBnnZJRowJ5tOr6WgrgENoMKIAILEulk05AZCk5VqwQ0VixojUhIBHn0q5EqKemEzFnQhPXdUaZvQujjmZscvKW8D1tCdAoEsJtAwrij8U9CRDX80MpD1KI/Hip82BfKuEysETtY+5Ab3FgcHjJCXPzo0N+eRDdKQykpIoCiJE0DOFzfETU4RAypLd9+Bu/vWd9S2P2x9e/54yTFxm+9OwABwDGnW9vaTxj5aJ3vOHy8888UWsz49IIZpJSFkuV/qFRsw6h1roerwAgJcbgMzna7PH0jpaXXrr6y9+7uX8ob0mJCFt6mq85aS8iV7Wamc3PzI3sGW5JYEBmU7U7PqraT6J0oREcUet4N6IiRXTN6gFnTiAD8JxpTYgohahU/b6hcUA81D0MkYSdPbVJazLbBwaFOmVVw3XEkMc81zxPK9KWjAxEnHSdSBnA6HhZABRKVfPkSSLOpBzLkkZFRncVRHV3CGteYc04wlDkDBtYtyWGjefE5Mlkq3AamAMAYKa4DB4DN6Z0YJZUJybSzCYzUJUsH1Lbdu5zEsmWrErYHkgBICpVz3Xs8DzW5Qdm7h/KH+rq33uge++Bnh17Dq3dsH3OjPZDXQN/+OPGM1cuJubYiX1GgDOV+Me27b3uv258ZPOun37lveeftcIPtBX1FxmGOJIff+2//Wd33/Anr//l/NlT583q0E+1wkPsxonwu2xckraWhu6+YbCVI/X2vqaRsuOqwNeywS2fPKV7+0B70vKB6krakT2PEYJCX7ie0k2CVSRcGcJaWN1eMUQO9qxpzQAgBA6OFIbHipYSB3uGo0sNs6c1i+gnCYQtAkf4HDbTQtFzTJk9DEnR+cmmHMuSnh8wc75QlhLLZa9cDdJJm4iSrpVwVKFUFVLGKjta8g7DiH4MSseArqy2uSMBGUGhZWoGCpuCshAysp8mYQ4gasH3iTJpd9b0tl37jti2jcilikYrZ0nv1rvWu8q3mvKzV6qdezp/dvOa561a+rF/e8VYobT/UN+eA917D3QfPNLfPzharniWJZsbMtM7Ws9cuahUrq7dsD2VdM8/60SeSOP+NOCY2fTnfOLLP9+0dU/CsX980z3nnL7cPCovCm+slNj8ROfDj+2cPaN9V2fXveu2zJ89lbT+k89OFAKDgJSSL
71k9bqNO7KZpKOCrnxqV3/Dqpl9npYBizOnH/qfHSeGs0vr/c9oDyIZAQATQ91knhY1Cx9NgRkBmYiSrj29PXyaZc9AvlCs2pY83DOCGDbEz+xotK1wRiMzWEIroXWACECMlcDCsFOPwbyESETG3QUAHdBYoSqlqPp+sVTNpZ2qrx1LppN2frwi5QQvMbwpjkXpkBEg0Ko9PZyxSz4pBEIUKjMb2PC0uGI2CXNRekUkTbbrzJzWtmPPIQAmItdJrj7z9CtP474Rv7+/5+Gd1Y9/+Zc93T2JhPur29Y9uGG7bSnXsdtbG+bMmHLRuSsXzJk2a3pbR2ujbVvmvL3mpRf84Y8bTpg3/bQVC5kneMXPyhYBBBwvlitV3zzwrx40zLxkwcyFc6ft2nekrTl3xspFzCye2VwMs7VrLj/33nVbfnvPIx2tOQ3Whq6pZ8/thSpXfGtJa9+c3PCBfKOjgpq4N2cwlhEQMfAJ6XUChwudj1CQTg5+iBho3dyQbm/JGCR394/qQCdcu394PNDaPM57amsuk3Irni+FYAYptEAKwooCetqQMYivswlEDRkXgAVi1Q+K5aqUQmsaK1WmYY6ZlcBc2jnUSxELirgcPiWlm4A5ApiR6hdAAAjko90gk9NYe7HxUY85CK2U8IiJyLZV/9DofQ9tueKiMz3Pf/jRHVe9+ILZMzsLxScdJzlnxpTlZyw7/yWnXfDy9/tBkHDsd7zh8ssuOK25MXt0xSisBgA05tKvfsnzoc5Kq0HlT6IBEZkYET/53levOvmE88488RPvfmXkMGEEOMEAU9oab/zWh770sTfd/N2Pnrh4Dkfe1TP5CQBUSn7jM9e++RUX+wGVCvnfP54cGBO2BJ8w51ZWzzxQ9aUAntBiHvnyUe6MO7E55jwTgWWgZorbXPcqAAAy+r6e0pJNJ51AEwDs2NcznC8Wy9V9hwYGhgrmyjU1pFoaUkGgTelTgkYkjvr1NAnD/jkyXswtnksnTI2hUKqOjpd9PyiVvcHRUuiqITakE+HuhvscW7vRbocoi/6MAMyaMKkqU1ODAUtEBvJVeo6w0mGNIWRsYY9xRJ7Dg+ZoosZv7nx40bzpK5fNa8ilHdtKuU6pQkIgEXm+Hu0bXDC7/QsfeeOi+dObGjOvvOK8lqYcM2uiINCRdR+eASGkkpKYg0Br/RQzDp9RhDOLAp2yfP5vvv+xaGd5EphML/2MjtY3vOyF4Vl+Ng0FJkYmE87nPvyGN7/iok1bd4+XfLIrUFwvRNoL1PPn7Puf7cvNQ48YEesWtplA6aBG6WLTte534rQLk2wRc76CQM/saIoiAFx6zvLlC6ebBJpKOsxAxJaSU9tyew72JxzLNM1gXNsIw1F4F4TTX4gtS2aSDmlmoISjLjxzISJqooZM0g80CtRMZg6sCdyhfR3FobjfPKZ0RkYggqfVrExvWpU8slEwoLAaTjBzWSFu1zR7Uoc5CBtodCqZuP2+Tcz8wuedXK56WlOpUi2UyqpJmV9BgVIiMV9z+bnXXH7ueVd/8Ne3rXvFi88FM2ddPXVAEYhCPXXHxrOwReLFtiZl5UmfMffxn1EGiSP/vNlT582eCgDl/SP9a+6XbqYcqIXNg6dNPXL/wbkZuxo5vnE30lGUDti4T2EZom5wXeE1vtkjR5SJaPbUZox2ZsWi6SsWTZ/wdQBmnjW1KdAUl7bMjxp4izishaqXNXE2oRIJSzMxsWtbi+e2meqGJgo0CcBQqErBYTkForKJ8VZq1XuTFSMPByTSvGyP0cigfZlok6mZHE/FqMfcxPRKBMmEu333wa07D7z+6hdYSg4O5f+49vFSqXrz7Q+teFWTbBJMZsaIEIjVqmfb1ttfe9nnv3XTeLH0ssuel8v8OfMbngUsYgw9DZhMA+afXXQz9XIi8oMgCLQ97WIrO4+DsqEwl5/wJCLFCcHkRqgjbtHrRk7GnSIThnnKpBCC17q/nAAAIABJREFU6ugFsZmjBEKI2
dNbzIaO1dWCiLOnNUPo5bGmKBgxILAMZxXUMj4RpZOOYyvi8JLrgLXWgdax20/EUWnf9MvFsdgcAIcWIkTeDAAieVq2JYbbEsM+SURg8q2GpSgTTLrm82Bs/EJkiCAzKyWG8+O33bvponNObm3KMvPBI/19AyPppNvdO9Td02NZlmmBBrQgMsX2Hezp7R/50HU/+vfP/SQ0NY8+xU87nh0y/pLOu2f+E0IIS0opWNrZ5JyryS8IKYqedfq0wye195R8C+OLEWKOw35tgBqlC5V/DZ8AoANdLFUqFW9odLylMeM6luEZpYpXqXrA7DpqRkcjAAjEqhd0Hh7Yf2Rw/5HBfYcGegfzZksAMGtqk6WkaRvxSQRaGHwIwbYMKM5izACsiXIZN+7BzBfLg6PF4dHSUL44NFoINCMCEbm2SiWd8PlxHDcwQ6R74oYEiJgoM8AJDYf/v/a+O8yO4sr3nKoON8+dnGeU0yhLSAgkJJCERBRBIiPABmfWa1jbaz97be++xWF3bWN7wYHFBhvbYGPAZAmBiJJQFpIY5TAjafLMnZu7u+q8P6q7544k4gMWbM430ndv3+rqCr8+dVKdYqhyOwimR/TicSQt9Pm9X94toyxFrpfz4afWjBpWM7lpeCaXl1LWVJZEwoHeRKooFmqsiqigMgBCbqpxRcQX1+/QNF5eWrRl5z4hBH8nR74o+tCdteWSa8ui8MjrU82/ImkTcEOzLmt6bXNbNXgKqb+XyRPpPPGECngDuKZSACiKhaaMawwHzUBA/9L1i8CzILYe60kks4bOY5FgdVmRkhk2vLb/xm/eGwmbAJDJWpPHNPzuPz6h6qmtjIeDhhCSIdiC25JrJIGAAQQ1i/KeiO86U6koGvRwgytebu7oSeqahgC2I8+f1zSsrjSXdwwdY2HzaGe/AeCdFY2EnjJCA9scyc0VrFcGe+rCnbbQkCE5WaNkIjNLSWQQeIFqqkRA8G3FkkQkHFzx/Oa8ZS+aOy2byzPGHCEqy4qWnT+n5WhHXU1NTeke29sSjHoYQO0poU9ctvCfv/vrXN6+bukCXdOECll9J/RhBRwAIAMSerQxNHRpsvkXLFCWyptnNB6cXn1k47HasGFJYK5wDQMi3XFqhG8TUSrO6VNHPvqLf/SfoADKGK58Zafanl1RGi2KhlR+nUPHum3bQTJVwF9bVyKdyUfCASKqKImWxkNHO/pNg9mS5Rw9pvYtIEWNPKRcq6+6xBjEvWSa+bydzuTdiUQUQiSSWfS+FkUDNMCnyZVDQcmkOEiNACCippJDHIUgjlIiDxil00A6/qYu105HnmMBAICEkKGg2by3dfP2vdcuXcAZs2wbEQHRdkRVeVFddZlj2zqkCZl6Z5ke8wdw6Xmzp00cYdnO6GF16rV8pyveh/GAXkW+1zDadDMz4iAdAuRMXj9lk5+fEAoNIYXmDvL2MQFJKR0hjvuzbce2hSQydG3n3iMPPPlqNBKwLKemIi6JLFs4Qh462q04AwDonCdSmfbufkdIyxa6xitLY47jIJCQLGWbDImAhISokQWQ4HoeQErSOI+EDMeRiJDK5vOWPTBLCIlklqSURFLKonBgwGQGPgOn40Q6RGkJrS7cVRvptKSOCCRyeryJhapJ5tGLH1HLumJtQCQJGMNIOJBIph99Zt1ZsydXV8TzllWY/8pxRCpjccqEtbwkpjyHzCz2Z0RKGlpfNXpYnXy3Oxs+vIADAFRMLjY8PPJaafVxztKWMbO2dfGI3cm8yVFCwdp5vEhHBEiSKGBoGucqH4r/p+uarnOGuGrNzs98617Ldjhjtu2MH1mjcRYKGBpnR9p6GUMiSUCMYSqd7+vPapyZhsY5Gz2synIEYyAJ+nIhIDJ0rhtmVZEoCjOGHKRUqb5CAb2sOKLrLBIycnnbsoXnACUA6E/nDV0LmbppaJWlEXcbIviWOHIXwgKRjgg1JiaV7fdMcgK1gFExk6TjGsU9JcPFHCJI0DlLJDP7D
7c9turV4Y3V08aPSGfyjKHP/ly9C3nMSJvcloRAEpjJzZKCGQEVBPSuBfkP8ZIKviQnY+NvyRz4M9kpRD3naDdO3bi2tT6ZNzUu/Uh/OGEHl5TSNPRtu1rv+cuLYtBmcbRtp6O7f9uu1q27WnTOTUMTQpgBvflA+z0Pvay0/V0H2gK6Jl1LK3HO/vjEqzv2HXUcqWms5WiPss8xpJ5sCEv5/pbOdCbnOM6OA+axrqzGmZdXjrY0twKArvEjHQn0bHVAoDHW3Zt6ZcsBAmCIlu1oHKUcMGO4UCgQ6RAp55iTyg+UBfvyjsEYSJELVM7hwUpyMoDcl/08ayNJSZqmdXYnHnzy5WQyGwya5591im3bCF60FoJnTZJSQmkgwZgEgUQON+LMjMOAl+L/V2s8IWriw0dSOoxp/Tvv6F17Cw+WO46IB/IPNY/91+fnxQN54Wamc+1tvv/eNxDnbTuXdzyTKvo2WQQwDC0cNAnULlhkDNM5y7IFAwDESCjAObrWZABEls1Zti2Ubcw09YBhEEhJPKxbF4zYSsKyBQZ1+6XDjc3ddZGAu32TAFSmaVB5gzl3tRgEFZtkuXUCQzQMze8A+X0B11SICA7xuJlZ3LhBndEAJFCPREZ/EnkASKC79QPBCyUHBCFlKBh8Yd32F9a9Fi+KZjK5S887fWh9lWWpBB1+RggEAofYWTUbqsM9tjRQZPTiceGhlyjF9jhT+bujDzeHAwAAREZSREffmNn/gNW9mevRRN68cHTz2ta6lftGFAVyQjLXoqlkFhhQHIggoOvBgAHgu3XRGzdU+cvVNSCSEsJBIxJi6oJKbY4DdclQ0GAhdTMSuFHUHGXKNnqyodqonZe6yXFkeXZ/P0jpziICBE0NPM42YNkAVIt10NRVP91lE1GJBuhFdfhqplIdZlTs0pmwpcYQpLCC1XOZHpVOlqGXYFUZhFzzCiKiEE5dVYmha32JZEk8VlwUcYR0o7UKvBGCeETLlAT6BakMpKRFGj1N/72Rvj7UMpwiFceE3Cie8QNApjyDluBfOvWVulgiZ2vM100Hi3SeC11KRwohbCkdIR0hlJtPCCGl51wkTzEUJIUQUjhCbaJyQ2WVFKXOfRNSFXHNLkjkCHY0VQQgQUrLwfJQMmZkhaepAoCUSjEo9GW68pckkm7qfu9oLvICqAjQ36lFgEB5oU0qO1AT7rYcjQFKJ6vHxxglE0lkEX3frWJvEjzYIYBti4a6imXnzZ43a8Kl554ei4RUtOJgyzA5xKpCXUFuS4lAEnlAiw4B1Yr3iD4CgAMAQEbSMStmxMbfInPdjPG8wyvCma/PeYHcCCBf0IET1Ag4uWVYDaT3FQp5oxK31XU5uBLyLP3ghaMhaSiOJIstoak4h6BmNxT12ZJ5PIu8wl4kWqH/oNBY7+veHhz9pzGQWUdvjHWMLzuYEzpDIukwLRKoWUhSgB9tVeBXcH147rpKji0aaivmzGgqjUccRzBA3zKszHNAyEE0RtpVplmSNg9VMbOE6B1sn3lL+mgADhEBGUlRNOmrZtUcaSU0jfXnzVn1Lf8wc21/zmAg4QTMudxCuu+xa9sasGa5UzsACw9zRN62Vg8g4E6iqwKCO8OuBqkx2ZsLdWUiHCUBOITDi7tN5kjyjTTgN2cw5lw924e599WDMwABMZCW4MVm6rSq16W3zY+kZdYuYIEyKW03gtl7z3zJwseccv9blp3J5B0h0NUXBkJUEcAhVmb2VoT6HKkhApDQ42OULPEeupc+GoCDgoW19PQ7UI+RsDiHvlzgygmvXTlhW3c2wFEMwAW8eQUAb0nyMOeVAI+5EAxgTgXUDLC6QZhDD3OeXRZ864Yj2YFEKUNJALbgZeF0fazPcrjHRAdhboB9gvfntdDtbUEhBHIkM5k9t3a7yW0hGEOQTlovm26UTCKRUek8fdyq+0/EnERA9M5hcLtNCH5GE5KEI+KtGpMEqBxlRtEYeGOf8rujjwzgAAAZJ+noR
aNKT/sZOWkEYAgpy/jSaWvOGbG3JxPUwFUGXZYxgDlvdRpsGfavFxQeEOkKMOcbwPzrnmHMDU0hnYlDidKkZWpuTlRsqujQmFBnpvoBk77G4GOuUKQbKADe0opu+MgZdTuKzaQlOGMonZwWbgjWLiRhgRdccjLMuf5TpEEinQ9yH5dIZEtebPY1RjssoSECibxeNIoZUXJDod4z+igBDgAAOUk7NOSi2ORviFwnMo0IbcG/NW/1GY0He7JBjgKgAC5qpj1s+YviwEI3sNC+qUjnBzUNnKI0qBKOIm2Ze3srdOYAkSV4VSQ5tLjXEpwBFYj/AwtmQbsGqRG+WIAgpURBOKd2Z224Jy90zoCkxfRoqPEiZBxAIHjBTAWY8+QJ9906TqQ7zjKs9GEpsan4oMEEAQJJZIZRNt0D/d8x4BARUCPpxCd/LTzyBpFtZ1xzJAOE2xasmt14yMWcb88oWP4KpbEBuJBnXXpjkU5h7kSRrlCNkIAac3Z3V6ZskzMJQEKyyVXHApqt5tBVeY8X6Y5TIzwZjghRSmKS8IzanY2xjpyjcSSSEpAFh1yMgRISFiKCcre4mJPuS+Or3uCpESeIdORJiQjSErw23DEk1pGXOiKSyOvxMVqokuA9Zm/wkQMcFCgQpaf9NFh/nsh1cq7ZgnEmv79w5YJh+7qzQYayAHPk2noH1Ag6To14Q5HO5XvkifUnx5yKw+BMJq1Ac1e1wQQR2BJLAtlJVcfyDh/AGZ1EpCvAnPteIEhHMgCaV79jaFF7ztHVHn2STrBhiRYZSnaWkLkgLWRjfuzmW4l05Fn7iJjO7Cnle11ORhK5Eag8/f1gb/BRBBwoBQIRmF42795A1VyR6+JcswUHgNsWrFo2bkdvNgi+LUoxkpOJdL4aMUikg0GYI1movZ4Ec75IJwl07uzsrurJhnUmECAv2PiKjoaiRM7hbABz4GPOs4tAoUjHUNqCG8xe0LCtMdqZc3TOCIiktIIN5+kl48nJIPJCAbAQc8ctrydgTol0bjlEaTl8Uum+smDKFpwhksiaZdN5sIzea3VB0UcScOD69SXqkbKz/mhWzVaYE5LZgv+fuS/ePHNtxtJtwd5cpPPUCG9lLIwZpoHdTT6rw8FqhD+dbmwxAaLMC21jewNDcvceEJzecCikW7YsiIqULjQGiXRACMCQso5WHEgtHrK1KtSXczTu4j4frD/XKJtO6gStgcSLHugGR/Yejzn3NZK+jRcIGFBWGI3Ro+NKWvNCZwgkbR4oN6tO80SNjwFXQMg4SMHMeMX8Pwdq5otsB+OaBEzmjZumbfr+2SvDhtWfNzkKjy0VWOkGqREAbxAz/OZqhHfdjY5EIJJocPtgonR3T2WAOwTgSFZk5uY2HiQJUnoasq9GFASBqDUz6+jDizoWNW6NmVlLaJwBkQDhhBqWmGUzyE4BMM8b7Nm7BxjdoMheV6QD3+7hlyECZCAtoZUYiVnVu6UHA5IiWHc24wE4YXvfezZrA+LqR5NICmCcRK77xU9l9t/Pg+VEJAQUBXIH++Lfe3H22tb6mJnXGAn3LEcvgdUJzn5vi4QXMwyuix08hLq+f/LvAT/1hwp2VPdLQI3J80Zsj5lZR3ACCGrO653lzx8aYmrCd5ODG6KhmBFYUtOZnFp5cFzpUUcyCcgYB2kBasEhF+nxJnIyKoWg6w72mk/eluYBLzENuORV61wDDqrXAhiCLbUAz53duLnIzNvK2mKnzarTQ7Xz3byF7w/g+Le//e33o94PjNy1FbXQ0EvIyeaOPovc4Jxnba0kmF08cm9Qs7e2VacsI6g5btyHB6oCzIE3HYXfCq4PFHZnXH32vPNuSXWdIVlS68qGR8S7FKQcyaqjKZ2Lg33FnFHBXciAJGBeaFWh/nkNrw8p6rKkBoAMmXSyzCgKD79Ci44gkQHkBU0s+M/NJkcDzfb8XOqfC2uvkQzJlprBrLMaXisNZCyhcYbSyWqxoeGGC7wuvy9og78BD
qeISAIQIk/t/k3vq18haTM9KqWDADEzt72j4r9fnbGmpd7UREh3BDHyWZ27XRMA/KlTSyy4cerqhwE24X1XARZuJeBV5GEOKOcYI0vb5zbssYSmMrkFdGdbW+WalnqNS65YFGFeaEHdnlB+ZFzpUQUFFbYnnawWHRZqvIgZcRI5RO7anxX+iQr4HLi6tNcgL57cfYYCnTLOcCRL6kEtd2bda+WhdF5onCFJC/VYdORy1CMA9P6dXQl/M4ADV5wSyLR8x6vdr3ze7t3OA6VEICSFdAsInto78p4tk/f1loR0x9QEERAxFy5e9JnPLlwQorvjFPzP7rSfHHPgchUEdDE3uaplRvUhlXCECAOas7u79KXDDZIYATKEofGuieWtxYGMJTT1DJIOgTDLTw3UzAdEkDYyBt4z/KXRbwd4T3Uxp2rxORQxTyEFBMg5ekkgOa9+Z9zMWkJjDEnawIzIiKu1UOX7elKqor8dwCki6SDTpJXoXf/19J57kJuohaUUABQ18n25wCPNY/7y+tiWRNzURFBzwNXcYGB59ddJeiPMnSjSedc9kc4Hbl5qp1QfmlLZkhM6AyAAg4vDifiLhxpiRnZa1eHKcL8gbkuVZA3JyTCjOFC/SI83gcgptu1z4JNjbpBIN5A7rlCkQwQJkBf6kFj76TV7TE3YkjNEkjagFhl+uRapJymQvb/nj8PfHOAIAJUagQCZg4/0bfym3b+Hm8WAXAipMRk28p3p8NN7Rzy2e/TenlIACGqOzokAPK0CAHycuWrEwPLqr1fwFmqEjzlL8hk1hyZXtqRtU0iuc2d4vKsh1r2/u2hfbwkR6pwIGJFFREbxxEDtWcwoIpHzFNKBQOUTMVeg6Qwsr14GM7cTCJQXuo7OpIqD48uOSmCCGEMkYQE3IsMu+8DQBn9zgAMXc0RAEhkXue7+rd9P7f41iSwz4gQopDSYCBl2f85c21r39L6Rm49V9+aCGqOA5mhcusYPN1T2jTH3piIdAjIPe5bgiXxwenXLwqE762N9w4q64oEMQ5KEu7vLNrfVJLKaDlkjUmVUnanHx5K0gRxERgWS42DMAbguUXSFtpOJdAASER3iQmJtpGda5cHycMoSOgCg0kjMovDQpVqo6gNDG/wtAm6A/HHMd23q3/q9bOtTAMD0GAFKKTmTId0mggN9xWtb6l9padjVXdaXCxKgzqXBhMakjyTXm+rbVAr1PnLVWOU/AEIJzCZuCU1KNDVRG01MqWyZWXNwatXhoGY7xB3JlI5rslwqSzv7Rh3Ac0T8NMMwuEwDMiIGAJ5OU4A58CDlGT4GqxEeWwUiYI7kQmJJMDmhrHVYvAsR1TIKQNLJaNGhocYl3Ih9AHJbIf0NA85ndQKZBgDZIyv7t/8k3/YCkGBGDJCrxOEmd0zNsRze2h/b0Vm5tb1qV3fZ0WQsmTcdyRGBMeJIHAAZISMGiMoF6prymZAoiBExCciQAppTGswMifc0lR+bWH50ZElnPJAhwoxjSGKIDJFI2CBt0mKBsomRyikJWdXcHtnfE0/aIYagMemZdz2PhxuGNKAIH7e8KrWCCCVwhxiCLAsmRxW3D413m5pQ6bkRGUmLSATKTwnUzAfG4YNFG/xNA26AiAQAqpHNHlmV2nVX7uiz0u5nWhi1ABFKIgRpcmFwBwDSttGRDrcmig4l4keSsfZ0tDcbSlpmztYtyQUwIoZADEnn0tBEWLdiZr48lKqOJOuLehtifTWRRDyQ1blwJMs7uiM5IjIGIB3ppICIBSv0eJNRPoOZJdLJa2hrjFKWfjgRO9hX3JmJ5BwdARkCZxLVqe8A4GPOlRyVlgASUACXEjnKiJ6rjiSGxLurwkmdS1tyScgQgaQUOWaWBusWGEWjlPPhA0Yb/J0ADgAAyEuvzADA6nktve+P2ZbHnP59AIRaCJlJwNRRmZxJnQmdCcYIABzJ8o6WFzzn6JbgjmSOZAyBM2EwaWqOyZ2A5uhcMJRE6EhmS+5IT
qAOGCOUjhRZEnlmFJkVp4aHX2OUT893bXL6mgEIeUA5PTmSxoQkTOTM9nS4PRXpyYbStmkJTQBzI9cKVGkAYggcZUCzY2auNJiqiiTLQ+mQbhOgLTmRck2QdHLIdKNsSqBqDtOC76sv4c3p7wdwAKCMdQOwk1Yyd2x15vCj+faXRboFpA3cRG4i6gTMj+5QTk8GxJBUFjD0jA8EIAkloVpPPeUCEAlJkLBI5IAEGnG9eHywblGw/lyjeJzXFmn37sx1rBWZNkSG3CBgyuHPmdQYEZEtWMbW07aRtvWcreeFpqzWirma3Anqdli3wroV0B0NSQIKyQQxV8khQSKPTNOLRpmVp2mhKgL44JfRQvo7A5wiNzJIKtkOAES+z+pcn2t7wepcZyf2yHw3SRuRAdORaYAaIANgUKAH+nUBAIAEkkACyCFpg3QAEbUID9UaJRPNqtlm1WwjPtZ7uIrVQ8/MK+y+1/Ndm0TmCEkHmYFMI29eGJL68z39J3QFpfdHvm0aJEmbpMO0kFY00iybqoXr1KP/txibT3+XgPPIi9UF73h4AAAnfcTua7Z7X7P7mp3kAZFtk1YfORkSFpAA99Rm3+KKgAxQQx5gepQFyrRwnVY0Uo83GSXjtegIpocGHkYCkBVMuYpIUTn4wUkdtnt3Ov37pdVHStFBjip1gxde5Id8uE8f0JXVJlgB0iGSyAwerNSLRunFY7lZDK7r7330kL59+rsGnEdq5ZSeWX/QrEg7LfM9Mt8t8r1kJaSTAZkn6QAyZDryINMjzIizQAkzS5kR97mmV7WKd2dvON8qKs9DoRR5kW51kged9BGZ7yGRJemAaxNhg9rmRQiTCiViOtMiLFCuRRq06BAerPT2Z0nv9v99tMHHgDuOvHBIcnVAfMf5z9RGegDFfFghF3qrByub2oB0Ja2kyPfIfI+0eqWdJCdL0gIpAAiQIWrKcceMGDOKWaCEmSWMm4NaMjh65MNAHwPuTcnbWeMH5Z682AD/wLeLsDd7ouereuf1+JGW/+uy2hvRx4D7sNIA1n0ajOaCDQsDBT5k/OxE+hhwHyGiE3QG+JDD60T6CKTr+pg8es83if4v0Ed4E83H9FGkjwH3MX2g9DHgPqYPlDTy0o4y5lom1dFwDJk6Rum4Aif7VQKgfzt5R40rS5ZyXxdas/wKC8m/3S3gJmDAwuuFdNJKAKHweFa3JW7MGCAyHHzgnP/rQAXoHuzkdpMxlVWw8K4Tu6wKc+bnwiKhsrEWVKV+VT6tgiFlACcfwOPG5O2PHhRMgTof/PiWew078VkqEae6S9mTj+ts4ez7w6I+SEmS3Hu9Yqoet8tqTN5jLVUdsX3cMKlcuG/fgkoFu3Dp3e7IPelh1Ce9+MHTu+7Uh/xZb/T0wjZgJpvbtH2fEPLUqWN0jSPigZa21/e0TBo3rLaqlIhyeWvNpuZwMDB94kjO2b5Dx3bta500dmhtdRkR5S17zcZmw9BmTh7NufuKb96+b8/Bo3nLLo5FJoxpbKyrHEiCgZhMZbfu3M844wyJwHFEwDQmNw3zz8NExC079h3r6C0vLZo2YYR/0e8DAAghN+/Yl8/bms5VIKwQMhoJjh/dqMowxvKWvX7r7pajnVJSRVl8yvjhZcUxKaX/+h7r6Nm9/4hp6Cqi0rLtspKicSPrAXDPgSN7Dhyd0jS8urJEekdxqkdnc9baTc2RcGD6xJHqRdq55/DhI50zJo0sKY4RUX8ys27zrsqK4kljhwLA9l2HWo91Tp84qqwkpjbBM4ZrNzXbjph9yrjeRGr91t2RUPDUqWPcRQNg7cbX09n8rGljIqGgP1vqQzqT27JjPzJ0z5cWUhJNGjs0Ggn5g+w4Yt3mXalMdlhD9ahhtf71ZCq7bsuusuLY5KZh6ooQ8tUtu23HmTl5dCBgtHf2btq+b3hj1ahhdYpNZHPW2s3NkVBg2oSRnLP9h9ua97ZMGDOkvqbcHxb14Wh795ad+0c01owaVvv63
pb9h44Nb6weM6Je1ZNIptdu2lVWEoP1W3ePnfep2lOu3X/4mMpr/OV/vwuqF/3g539SS9u21/eXTrh8yuIv9CVSRPSFb9wBNYu/97MHFLP9wyOrAyOWzFpySyabI6L9h45d/tnvVk+7OjTyotDIi2JjLh11xk3/dvvvHcdxhLBsm4h+ff+KeNOy2lOuLRq3tLhpWf2M5Y0zr9u1v5WIbNuRUh5sbR8x55M1066pn7F8e/NBKaXjCPLIEYKItu86VH/KtXUzlsfHLYuNXRoft9QcvmTsmZ/OZnOqFy+se+3My/65dMLlqiXF45dNXvyF+x56joiEEOpB//AvP483LauaclXRuKXlky6vnHLV6Rfdms7kiOimr9wONYt/dNdDRGRZtvtoxyGiDdt2Fzctm3rOzclURj3r8s/ehg3nPfDYC6rY46teNYdduOiab6ivF37iO6zxvN8+uIqIcrk8ET26cl3phMtHz70pkUwfONxWP3N5cdOyBx71bn/21fi4pU3zP9PZ1SelFMLtu207RPSHR54vblpWe8q1sbFLY2OXNsxYHh+37P5Hnyci27FVvx57Zl3phMurpl512kW3pjM5KaUa+SeeXR8YvuSMS79i244QkoiOtnePPOOTVVOv2rpzPxH9+H8ehrpzrvvH/ySivGUR0brNzcVNyyYu/FxXT4KIbv6XO6Fm8Xd+fB8RqTr98fnPn/8Zas+54dYfEtHPfvNoYPiSyYs+f7StW83dP377F1B/7qe++hMNESOhgCTyvcu6pkViIY1z8MK7opFgKBhQ74Sua9FYiGtcvTcrnt9kGvoFC2YEA2ZPX/Kmr/5k47a9UycMv+LCufFYeO3mXQ8/9cp//fKheCzyhesvcBxBBK1t3Rpn0yaMmDF5dCqdfXzVq7m8rU7SXDFDAAAR90lEQVT1E1JqGl/5wuZUOltZFm/vSjz53Iam0Y1EkgZnNhZC6IYWDYduvv6CUNAkAOGIspIY17hiKjd++fbeRGrhnCkXLJyJCE88u+G5V7Z++f/eVVNZMvfUCWonQkdXXzBgqFPbW491PfnsBk3j6mh1XdeisTBnTAipcpD7/u/CAVEjZpp6UTTEC473jEVDAdNQXwOmXhQNKVlK07T2rr7v/Pi+WDQUjQTTmdyQ+sqbrlh0+91//fnvnjh/wUzT0O689wlA/OInlpSVFjmO8Bm/YlNH2roYZ1PHDz99+jgAeGr1xp17Drsp1d3dFfj4qvUB0ygtjh4+0rF2c/OZsyYqQQ0RY9FQMKCruVNwCQUDJN2OcMZi0ZCua34BAIhGgqGgqT7rGh8YFqJCYYlzHou5vb5+2YK/rli7YdueO3/3+L/eeu3m7fseeXrt2OF1t9x0saZYohwsyR0nlh4nXCvxEBEPH+1cv3V3cSxyzpmnAMAf//r8ptf2jh/T+Nvbv1xTWQIAy86fUxQN3XHv479/ePXypfMjoSAidHT1ZbL5RXOnfm75+f3JzJPPbVCCKnhHTz/53IZoOHjhwlN/86dnnnl5yxdvXKJx7smm4E+qlKRp/Is3LjF1fQCIUiLif9/zWEd34twzp9/zo1vVQZ+XXzD32i/+x9PPb7r7/pXzZk1kjFm2093bbzvik1csmn1K0/qtu//yxMuFsqYQwjB0zpk/5VK603/cgBC92Yi5vxIREefs3378+9ZjXUXRsBDucX03XbX4kRXrtuzY/+gz60qLoy+9un3K+BFXXTRPHYhTsJ4CEbV19Gay+TNmjv+nT18KANt2Hti0fa+r6BAZmtbVk1i9dtuYEXUTxgy5+/4VT6/eeNZpkwY3DDjnavlWA144++rNR0TOOSJyzlRfCodF1zV/WNTbCMALYRMOBT519Tnbmg/8+fGXbrpy8b0Prmrv6v3yZ5aOHl73jiVolfoHAbI568e/emjf4bY5M5smjh1CRC+v3ymkPPesU2oqS3J5y7JsKeUVF86NRUJtHb279x9hDKWkI23dhqFXlsUdIfr6U4Uzp
3G+7+CxV7fuHj287h8+cWFleXzHrkNbdx4AAMV4BrUEAADSmVzhRYaY6E9v3r7P0LWrLz5T03gub+XyFmO49LzZjOGeA0d6EynGsKc32dmdiEVC8WjYESKVzhWqUESka9rBlrZ1m3e9smFn894WIip49d/psIEtBCL+5cmXH3jsxSF1lWoi1Wl8VRUly5fNl0R33PPYd3/2ACJ+bvl54VBADjquCRgyRGxt69I1Xlle7DhCHTnvQ8GTJbYfONw+b9aET16xKGAaL766PZFM67rmj5mUlEpn05l8JptPZ/KFzIzcMw5lNpdPpbPZXD6btfxf1QeN87aO3m2vH9iyY1/rsS7G0DB0HxuKHEcsOfvUUyaN6k2kvvCNO1e+uHlofdWnrzlXEr1j15aQIhoJ/eWJV/66ct3BlvYzZ038/tc/gYh5y27v6tM0PqyhSkrinHHGEaG0OFpcFD58tLOnt58IEsn0sY7ucCjQUFuhzlk7Toda+eLmzu6+WdPGlhTHpjQN//2e1U+v3jh94kgX6AWqA0fMZPNXff4HnLNAwBjRWH3NJWdNGDPkWEdPX38qGg7W15Qp3KjyNZWlwYCRSmf7EqniosjR9u7uvmRVRXF1ZYnGuXpfC9/jaCR038PP33HvE4gQi4TGj278v19ePnXCCCHkO1X7pJShoJlMZb71w/tKi2M333DhbT/9Y8GQyhsuW/jI02sPtrRZjrNg9pSLFs0SUvqcFQAIgDFMZ3JHjnWHguaQukrFuQtHT8H3yec26Do/dcqYMcPrRg2rbd7b+vL6neecOR0AiMDQ+bHO3gVXfh0AAEFKsm3hVyGJwiFz4/a9c5d9Va0oQkjT0AeQJGQ0Gnr4qTW/+dMzDLEkHj116ph//adrhtRVKgOFX4+ha//06Uuuv+VHu/a39qcy3/zildUVxY4j3o2NgDHMWVYylUXEVDrb1tGLiLYjHEcwZJwz1zKCgIiMMWWMUSdOHmnrau/sKyuO1VWXwQm6pyPEk6s3lBXHLjt/DkO8YsncUNB89pWtuZzFOTspXxFSOo7Yva/1l79/atlnbtt78Kiua7btMMZcw5g3CkqJJgBl5Nt3+Fgyla2vLisuihw3c2ryMtnc2WdM+fG3P/XtW64eO6J+3eZdN//Lz3N5S9c0eicsjogAUOP8B3c+2Ly35dZPXTL31An9qeyAgYZA17iucyGJMcb4ycxvkojoSFvXsY6e0uLY0PrKEx+ka/xoe/cL67ZPaRq+YM4UTeNLzj41m8s//fxG9DbNAgAiGIZmGLqh64Z+PMchAoZo6LphaIauK1j7xBnLZvOnTh3z/a/dcOunLykriT2yYs0t//orAFBJYQt7HTAML2YLlUoAvvOeAKSUQkpvgEASqa/HrWWcsf5k5ubrL/j0Nef++0/+eNcfn77133712G++EzSNUNAUQihlVgoJBISYTGX6U5mAqRfFwlLSzj0tPYnUhDFDKkrj4O9980CwY9eh15oPVpTF73voOV3T+lOZitL47gNHXt26e86MJkmSgdtuRBSSQkHzgTu/VlwUae/q+/w37ljx/Kb7H33h89ddYBh6OpPrT2WFkFKJKMi6e/tzOauqvCQWCQHA9uZDlm2PHlanlq3jppgxls1bMyaPuuGyhQBw/vwZl37q3w+0tDXvbQ0FTdVacdyISX/EBi27UspYNPSXJ195Yd32s8+YesPlC3ftbeWc+aEfnLO773/21c27pk4YkUpnV7209dFn1l20aJbjCH++pZSI7LXmg129/adPH6sEEl8GkJKUwLfqpS3dvcnGuorv/vcDJOlAS1tZSeyl9TvbO3sry4uBwLJFdXnxw//zL5xzRGjr6Dn/hm9btqMQyZBlsvmp40fc+d0v2I6ja9qGbXsu/+x3/cFhDHN5a8qE4TddtRgA5pzSdN0tP9y680B3T3/QNH3AKdn0P37+YM6yx45saN7bcse9j1+0aFZpcYwpNCJAKGhyxjhjmsalJEPX1Ndgw
CxEruqkrmsl8ejXb7589LC6HbsPr3xxs6bx2qpSAHh5w+uMMcPQNY1zzp5bs627p7+yvHjcyAbO2epXtgkhJo4b6q4XXgCXWi6ffG5DLmelM7k7f/vE7Xc/cs+fVzlCCCGfeHa9LzsXvo1qCACgsiw+dfxwKWV3bzIWCVWWxbO5/Evrd3DOdE1TfXnmxS152xnWUFVeWiSJXl6/Mxgwp3p2vhMJARzvFMDqyhK1skgXYaC0ezVESiALmLr6GjB1ooH1hQg0xtZtbtY0fttXr+OM+YKXAm5nT+KuPzzFOf/3r1x3+YVzLdu+87ePZ3MWY+g9jnRdY4w99fwmxxHnnHmK7o4u48x9tCr85HMbg0Fz78FjP77r4dvvfuTJ1RuDAfNYR89za7aR779BNA3dNDRD1wxDx4LNOUovYQwNXTMN3dA1Q9foBNOx47jD0lBb4SbplFLt91ekafyJ5zasXrttRGP1L75388ihNQdb23/1+6cYQw0AOOecse27DvX0JRGwpy8ZDpntnX07dx8moP2Hjum6plwZCryaxhmiJCotjp0xc/zd9694ZMXaJWefumTRrEdWrHvp1R0/vuvhZefPMXRtzabXf/rrR7N5a9l5s01D//UDK19Y91o0HFo4Z6rfes4YY0zXOAC8sG47MvzOrdecMmmUbTuaxp98dsNt//3A2k3NmWzuOOhrGkfEo+09uby97+DRp5/fxBgbM6KOMVwwe/LW1w/c9/DqofVVZ8wc7wjx0FNrHl6xRtf4dUvnd3T1/eyex/YcPFpTVTp7RpM/oKol/ntlGHpfMt1ytDOTzf/uoedajnWVl8RGDqs9cLhNTcPm7fuCQRMA0pl8KGgebO3YsfswABxoaVMT6XaQM93Q0tn8Vz67dNK4YUru9mAqEfHX96/ctf/IOfOmz5wyuqG2/LcPrtq288Cfn3jp2kvOchyhjopJZXJPr964es22YQ1Vo4bV7th9SLG0bN4KBY1d+1uPtnfbjti8Y19pPPrL799cWhxzHMEY+9FdD93/6AurXtpyxYVzGUNN4wqaSh2XUnLGtIJeq8klddgcY6q1g4ZF19KZ3LGOnkR/+o57H+/uTTaNbiwvLbJtR9ddiTxv2Xfe+7iQdM0lZ9VWlS6/dP6WnfsfeOzFqy6epzmO6E2m+/pTV9/8AzWbnDPTMO76w9O/+N2TAMAQJFAwYKjJTmVy/b3JbNZSzTr3rOn/88cVT63e8FrzgfPnz/jE5Qvvvn/Ft354309/86iu8fauPoZ46Tmn/+ONFz381JqbvvKTWDR04cKZp00bK4RUI97bn0r0pwOmsWHbntVrXhs7sv7ixaeZnuJz2QVzfvLrv67ftmfFC5svWjRLqEMXARxHJBLpbNY659pvAmAylSGiS8457aol84SQN99w4bbmg8+8tOXGr9xeURoXUnZ09UVCgS/dePGiedO+etvd//XLv8RjkZtvuKCqvNiybMPQbUf09CUT/WnVzf5k2rbFnfc+ccc9j9mOyOasqvL4N794ZTQczGTzyXTmYEv7Ocu/5UGfaZz92+1/+M6Pfq8GMJ+3+xIp9WsqnWs50rn4zOmfu/Y8tU4JIXoSSSllOBjYtf/I7f/ziMb55687n4iqK0quXDLvm/957w/u+NOC2ZOryosdR+i69twr2z7ztZ/FY+GevuTyL/6nI6T/1gVM479++Zcjbd1Noxubmw9ef+WiGZNH+6/lxYtm/e7BZ596fmNnT0LjvKsnkUhmFHQUP+vtTyVTWUcIAMjm8oneZDKdRTe4GB1H9CSSoZCp5IT+VMZynN/+edVvH1yVy9u5vDWkrvLbt1yNiP2pbH9fqj+ZAYA773386ec3njZ93JVL5jpCXHLu6fc+uGrli5tv++n92qhhtZ+8/OzOnoQvP/pMxHd/WZZdW10WCQUAYP7pk9KZ3MypowGAAGZNG3fLTRcfPtqh7OHf+9oNc0+dsPLFzS1HO4WQC8+YOv/0SRcunMkYq60qv
XjxrGkTRnz22vMYQ8Xeo5HQFRfOTaezJfHosY6eS8457Zwzp5uG7jhq0aGK8uIv3XTR2k3NpqnS/gAiI6KG2vLPXnd+Pm+pBTkei0wZP/zsM6Yoz3FRLPzb2//pgcdeXLPx9baOXs7ZsMbqCxbMmH1KExFNHDv04sWnnXvm9OXLFkgpOedE1FhbsXzp/Ma6SsPQiejSc2eXxKOhYEBKaehafU35/NmThzVUEdGIoTWfX36+ZTsDJtnBI4aAmVx+wpghineev2BGbXXpP39uma5ryuxeURa/5uKzwiEzEg50vN571umTTpk0avaMJrWAfvKKs7t6+3v7kj19yeqKEiXpM0TGcMyI+vFjGonICwsgROxNpB5Zscay7XgsfPWyBTdetVhKKYRkDAHx1Glj/+ETF7Z19qXSudHD6q5ftnDsiDrOmYquiEXDVy2Zl8nmaypKiGjm5NHLL184//TJAKAcgPU15csvnV9RFg+HAkR08aLTQgEzFDSFkIGAMayh+uwzplRXlBDRrGljrrlswaK509RQXH3xmdctXRAJB23bMQ39m1+8qqaqtK66zDU3vzsqPANUfT3uypuUfJvVvnnJN7r3je4Sym/wbsn3Mr07eju9PmGgyHYcInryuQ2xMZd+/44/nXjjngNHqqddfcMtP3ybD/JLva1Gvw16k2E5sVOa+x3egtAz86hTbVXkSUH4CqjvRKRM1UqWV4e++yVVL71AHT/6xS2jNGUELLQ/gWv7JiyIdQEvBGigELmRRV4IEKoCKsAJlAfCj7HxOn9cSwrLqG4W1l9o9BcnWKFPNmJuR4SQfqQQnRDzQ+Tu52PulsLjI4JUMyzL7u9J9CaSjhCOEMrKoMSS3kSqq6c/mc5KKR0hOGOc88KBUhOk5LDCPnq/DsyRYhiIyAua+lbDwgoDutS9hSBR9bi/wnsdnuQ1w19mBuk39FahMsdFnr3l9bduSUGUynEX37wlHyS9ee+Uv+FgS/t9Dz931mmTZk0b63sglK21py959/0rx46sP3/+jHcaCfbB08e7tj6mD5T+H5NnDFkF2IMgAAAAAElFTkSuQmCC)\n",
"\n",
"[перейти](https://www.bigdataschool.ru/)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "vI0rSTxNfMXu",
"outputId": "fdcd65b6-3e0c-4d91-a8ac-7eb04b1aa29c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 241
}
},
"source": [
"!pip install pymorphy2\n",
"import pandas as pd\n",
"import numpy as np\n",
"import nltk\n",
"import re\n",
"import csv\n",
"from nltk.stem import WordNetLemmatizer\n",
"import sklearn\n",
"import codecs\n",
"import pymorphy2\n",
"import seaborn as sns\n",
"sns.set_style(\"darkgrid\")\n",
"from nltk.stem.snowball import SnowballStemmer\n",
"\n",
"from google.colab import drive\n",
"drive.mount('/content/drive')"
],
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting pymorphy2\n",
"\u001b[?25l Downloading https://files.pythonhosted.org/packages/07/57/b2ff2fae3376d4f3c697b9886b64a54b476e1a332c67eee9f88e7f1ae8c9/pymorphy2-0.9.1-py3-none-any.whl (55kB)\n",
"\u001b[K |████████████████████████████████| 61kB 1.7MB/s \n",
"\u001b[?25hCollecting pymorphy2-dicts-ru<3.0,>=2.4\n",
"\u001b[?25l Downloading https://files.pythonhosted.org/packages/3a/79/bea0021eeb7eeefde22ef9e96badf174068a2dd20264b9a378f2be1cdd9e/pymorphy2_dicts_ru-2.4.417127.4579844-py2.py3-none-any.whl (8.2MB)\n",
"\u001b[K |████████████████████████████████| 8.2MB 5.9MB/s \n",
"\u001b[?25hRequirement already satisfied: docopt>=0.6 in /usr/local/lib/python3.6/dist-packages (from pymorphy2) (0.6.2)\n",
"Collecting dawg-python>=0.7.1\n",
" Downloading https://files.pythonhosted.org/packages/6a/84/ff1ce2071d4c650ec85745766c0047ccc3b5036f1d03559fd46bb38b5eeb/DAWG_Python-0.7.2-py2.py3-none-any.whl\n",
"Installing collected packages: pymorphy2-dicts-ru, dawg-python, pymorphy2\n",
"Successfully installed dawg-python-0.7.2 pymorphy2-0.9.1 pymorphy2-dicts-ru-2.4.417127.4579844\n",
"Mounted at /content/drive\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "LBF3ntXC5TWG",
"outputId": "48e8eaa1-d6f2-4213-e658-1b43c4612c29",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"source": [
"import nltk\n",
"nltk.download('stopwords')"
],
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"text": [
"[nltk_data] Downloading package stopwords to /root/nltk_data...\n",
"[nltk_data] Unzipping corpora/stopwords.zip.\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
"metadata": {
"tags": []
},
"execution_count": 2
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NrdXlozI5Hev"
},
"source": [
"### Функции"
]
},
{
"cell_type": "code",
"metadata": {
"id": "dRIUfW88fI6A"
},
"source": [
"from nltk.corpus import stopwords\n",
"stopWords = set(stopwords.words('russian'))\n",
"\n",
"def csv_to_list(arr):\n",
" arr_list = []\n",
" for row in arr:\n",
" arr_list.append(list_to_str(row))\n",
" return arr_list\n",
"\n",
"def list_to_str(arr):\n",
" str_ = ''\n",
" for rec in arr:\n",
" str_+=rec\n",
" return str_\n",
"\n",
"def df_preprocess(text): \n",
" reg = re.compile('[^а-яА-яa-zA-Z0-9 ]') #\n",
" text = text.lower().replace(\"ё\", \"е\")\n",
" text = text.replace(\"ъ\", \"ь\")\n",
" text = text.replace(\"й\", \"и\")\n",
" text = re.sub('((www\\.[^\\s]+)|(https?://[^\\s]+))', 'сайт', text)\n",
" text = re.sub('@[^\\s]+', 'пользователь', text)\n",
" text = reg.sub(' ', text)\n",
" \n",
" # Лемматизация\n",
" #morph = pymorphy2.MorphAnalyzer()\n",
" #text =[morph.parse(word)[0].normal_form for word in text.split()]\n",
"\n",
" # Стемминг\n",
" # stemmer = SnowballStemmer(\"russian\")\n",
" # text =[stemmer.stem(word) for word in text.split()]\n",
"\n",
" # Стемминг + удаление стоп слов\n",
" stemmer = SnowballStemmer(\"russian\")\n",
" #text =[stemmer.stem(word) for word in text.split() if word not in stopWords]\n",
" text = ' '.join([stemmer.stem(word) for word in text.split() if word not in stopWords])\n",
"\n",
" return text"
],
"execution_count": 4,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "0dCTRpn55ONd"
},
"source": [
"### Считываем данные\n",
"\n",
"Используем корпус с сайта https://study.mokoron.com/#download"
]
},
{
"cell_type": "code",
"metadata": {
"id": "87Chu_tOYxW2"
},
"source": [
"positive_recalls = csv_to_list(csv.reader(codecs.open('/content/drive/My Drive/Colab Notebooks/NLP/positive_recalls.csv', 'rU', 'utf-8', errors='ignore')))\n",
"negative_recalls = csv_to_list(csv.reader(codecs.open('/content/drive/My Drive/Colab Notebooks/NLP/negative_recalls.csv', 'rU', 'utf-8', errors='ignore')))"
],
"execution_count": 5,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "wF5HZPIb957U"
},
"source": [
"### Формируем датасет "
]
},
{
"cell_type": "code",
"metadata": {
"id": "KjrfTaD_7wo1",
"outputId": "d605803a-529a-45dd-f769-3949698f333e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"source": [
"df_positive_recalls = pd.DataFrame(positive_recalls, columns=['recall'])\n",
"df_positive_recalls['type']=1\n",
"df_positive_recalls.head()"
],
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>recall</th>\n",
" <th>type</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>@first_timee хоть я и школота но поверь у нас ...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Да все-таки он немного похож на него. Но мой м...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>RT @KatiaCheh: Ну ты идиотка) я испугалась за ...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>RT @digger2912: \"Кто то в углу сидит и погибае...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>@irina_dyshkant Вот что значит страшилка :D\\nН...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" recall type\n",
"0 @first_timee хоть я и школота но поверь у нас ... 1\n",
"1 Да все-таки он немного похож на него. Но мой м... 1\n",
"2 RT @KatiaCheh: Ну ты идиотка) я испугалась за ... 1\n",
"3 RT @digger2912: \"Кто то в углу сидит и погибае... 1\n",
"4 @irina_dyshkant Вот что значит страшилка :D\\nН... 1"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "C3oNsEsg-2Yb",
"outputId": "1ca2222d-fa21-44a8-c104-6cc3dbb114eb",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"source": [
"df_negative_recalls = pd.DataFrame(negative_recalls, columns=['recall'])\n",
"df_negative_recalls['type']=0\n",
"df_negative_recalls.head()"
],
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>recall</th>\n",
" <th>type</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>на работе был полный пиддес :| и так каждое за...</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Коллеги сидят рубятся в Urban terror а я из-за...</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>@elina_4post как говорят обещаного три года жд...</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Желаю хорошего полёта и удачной посадкия буду ...</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Обновил за каким-то лешим surf теперь не работ...</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" recall type\n",
"0 на работе был полный пиддес :| и так каждое за... 0\n",
"1 Коллеги сидят рубятся в Urban terror а я из-за... 0\n",
"2 @elina_4post как говорят обещаного три года жд... 0\n",
"3 Желаю хорошего полёта и удачной посадкия буду ... 0\n",
"4 Обновил за каким-то лешим surf теперь не работ... 0"
]
},
"metadata": {
"tags": []
},
"execution_count": 7
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "2H5O4MG1-8uT",
"outputId": "73d56848-2634-4299-fa95-395514f8483b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"source": [
"# Объединяем два датафрейма вместе\n",
"df_recalls = pd.concat((df_negative_recalls, df_positive_recalls),axis = 0).sample(frac = 1.0) # объединяем и перемешиваем\n",
"df_recalls.index = range(0,len(df_recalls))\n",
"df_recalls.head()"
],
"execution_count": 8,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>recall</th>\n",
" <th>type</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Сегодня у всех парней обострениеговно изо рта ...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Я никогда не туплю. Я просто делаю все в своем...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>RT @Sugar_Kroshka: Опять сердце ..и голова кру...</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>RT @DoDFavorit: @Olga_Wholock Мадам да Вы смущ...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>RT @SachihiroB: 20k Years Into Space — очень з...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" recall type\n",
"0 Сегодня у всех парней обострениеговно изо рта ... 1\n",
"1 Я никогда не туплю. Я просто делаю все в своем... 1\n",
"2 RT @Sugar_Kroshka: Опять сердце ..и голова кру... 0\n",
"3 RT @DoDFavorit: @Olga_Wholock Мадам да Вы смущ... 1\n",
"4 RT @SachihiroB: 20k Years Into Space — очень з... 1"
]
},
"metadata": {
"tags": []
},
"execution_count": 8
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qYV7zbB7_MC-"
},
"source": [
"### Удалим стоп слова"
]
},
{
"cell_type": "code",
"metadata": {
"id": "AZ_qod5R5LpL"
},
"source": [
"stopWords = set(stopwords.words('russian'))"
],
"execution_count": 9,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "54lNMQdG5QHC",
"outputId": "8a9c7b8e-f69c-41a3-9c53-c89705c74dd4",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"stopWords"
],
"execution_count": 10,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'а',\n",
" 'без',\n",
" 'более',\n",
" 'больше',\n",
" 'будет',\n",
" 'будто',\n",
" 'бы',\n",
" 'был',\n",
" 'была',\n",
" 'были',\n",
" 'было',\n",
" 'быть',\n",
" 'в',\n",
" 'вам',\n",
" 'вас',\n",
" 'вдруг',\n",
" 'ведь',\n",
" 'во',\n",
" 'вот',\n",
" 'впрочем',\n",
" 'все',\n",
" 'всегда',\n",
" 'всего',\n",
" 'всех',\n",
" 'всю',\n",
" 'вы',\n",
" 'где',\n",
" 'да',\n",
" 'даже',\n",
" 'два',\n",
" 'для',\n",
" 'до',\n",
" 'другой',\n",
" 'его',\n",
" 'ее',\n",
" 'ей',\n",
" 'ему',\n",
" 'если',\n",
" 'есть',\n",
" 'еще',\n",
" 'ж',\n",
" 'же',\n",
" 'за',\n",
" 'зачем',\n",
" 'здесь',\n",
" 'и',\n",
" 'из',\n",
" 'или',\n",
" 'им',\n",
" 'иногда',\n",
" 'их',\n",
" 'к',\n",
" 'как',\n",
" 'какая',\n",
" 'какой',\n",
" 'когда',\n",
" 'конечно',\n",
" 'кто',\n",
" 'куда',\n",
" 'ли',\n",
" 'лучше',\n",
" 'между',\n",
" 'меня',\n",
" 'мне',\n",
" 'много',\n",
" 'может',\n",
" 'можно',\n",
" 'мой',\n",
" 'моя',\n",
" 'мы',\n",
" 'на',\n",
" 'над',\n",
" 'надо',\n",
" 'наконец',\n",
" 'нас',\n",
" 'не',\n",
" 'него',\n",
" 'нее',\n",
" 'ней',\n",
" 'нельзя',\n",
" 'нет',\n",
" 'ни',\n",
" 'нибудь',\n",
" 'никогда',\n",
" 'ним',\n",
" 'них',\n",
" 'ничего',\n",
" 'но',\n",
" 'ну',\n",
" 'о',\n",
" 'об',\n",
" 'один',\n",
" 'он',\n",
" 'она',\n",
" 'они',\n",
" 'опять',\n",
" 'от',\n",
" 'перед',\n",
" 'по',\n",
" 'под',\n",
" 'после',\n",
" 'потом',\n",
" 'потому',\n",
" 'почти',\n",
" 'при',\n",
" 'про',\n",
" 'раз',\n",
" 'разве',\n",
" 'с',\n",
" 'сам',\n",
" 'свою',\n",
" 'себе',\n",
" 'себя',\n",
" 'сейчас',\n",
" 'со',\n",
" 'совсем',\n",
" 'так',\n",
" 'такой',\n",
" 'там',\n",
" 'тебя',\n",
" 'тем',\n",
" 'теперь',\n",
" 'то',\n",
" 'тогда',\n",
" 'того',\n",
" 'тоже',\n",
" 'только',\n",
" 'том',\n",
" 'тот',\n",
" 'три',\n",
" 'тут',\n",
" 'ты',\n",
" 'у',\n",
" 'уж',\n",
" 'уже',\n",
" 'хорошо',\n",
" 'хоть',\n",
" 'чего',\n",
" 'чем',\n",
" 'через',\n",
" 'что',\n",
" 'чтоб',\n",
" 'чтобы',\n",
" 'чуть',\n",
" 'эти',\n",
" 'этого',\n",
" 'этой',\n",
" 'этом',\n",
" 'этот',\n",
" 'эту',\n",
" 'я'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 10
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-mTrgM0E_eAW"
},
"source": [
"### Очитска текста приведение слов к стандартному виду"
]
},
{
"cell_type": "code",
"metadata": {
"id": "gHI2wqsn_kMB",
"outputId": "a553bb84-2503-4387-c85a-9afdd6d4b871",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"source": [
"%time df_recalls['recall'] = df_recalls['recall'].apply(df_preprocess)"
],
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"text": [
"CPU times: user 1min 40s, sys: 79.8 ms, total: 1min 40s\n",
"Wall time: 1min 40s\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Kca4OwK8BRPt",
"outputId": "f51ad583-e159-4f84-a810-45bb64cf185a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"source": [
"df_recalls.head()"
],
"execution_count": 12,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>recall</th>\n",
" <th>type</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>сегодн парн обострениеговн из рта льет</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>тупл прост дела сво стил d сайт</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>rt пользовател сердц голов круж</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>rt пользовател пользовател мад смуща деиствите...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>rt пользовател 20k years int spac очен занимат...</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" recall type\n",
"0 сегодн парн обострениеговн из рта льет 1\n",
"1 тупл прост дела сво стил d сайт 1\n",
"2 rt пользовател сердц голов круж 0\n",
"3 rt пользовател пользовател мад смуща деиствите... 1\n",
"4 rt пользовател 20k years int spac очен занимат... 1"
]
},
"metadata": {
"tags": []
},
"execution_count": 12
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "O2NIMMwd_Cpi"
},
"source": [
"### Train/test split"
]
},
{
"cell_type": "code",
"metadata": {
"id": "_kkes5A7_CAl"
},
"source": [
"from sklearn.model_selection import train_test_split\n",
"X_train, X_test, y_train, y_test = train_test_split(df_recalls['recall'], df_recalls['type'], test_size=.15, random_state=42)\n"
],
"execution_count": 13,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "ImvDPxubh_Su",
"outputId": "cb002f7c-caf9-488e-8374-1f46696a7f1b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"type(X_train)"
],
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"pandas.core.series.Series"
]
},
"metadata": {
"tags": []
},
"execution_count": 14
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "J9nRJ3Z5iEjQ",
"outputId": "34b6cfdf-94e5-449a-8524-4cf47f8d1925",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"print(len(X_train), len(X_test))"
],
"execution_count": 15,
"outputs": [
{
"output_type": "stream",
"text": [
"192808 34026\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ipEpIuJwiM0g",
"outputId": "5eb9cf10-1d9e-4188-bce5-110655681d3b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
}
},
"source": [
"X_train[100]"
],
"execution_count": 16,
"outputs": [
{
"output_type": "execute_result",
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'добав стен потеря музык сайт'"
]
},
"metadata": {
"tags": []
},
"execution_count": 16
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "hTYYSDy5jx_o",
"outputId": "7fb84c2e-4e70-4a75-9af3-c574b5762e90",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"y_train[100]"
],
"execution_count": 17,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"1"
]
},
"metadata": {
"tags": []
},
"execution_count": 17
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZBy6f8q6-xzv"
},
"source": [
"### BOW"
]
},
{
"cell_type": "code",
"metadata": {
"id": "1PdMo6PY_BXL"
},
"source": [
"from sklearn.feature_extraction.text import CountVectorizer\n",
"vectorizer = CountVectorizer()\n",
"\n",
"X_train_BOW = vectorizer.fit_transform(X_train)\n",
"X_test_BOW = vectorizer.transform(X_test)"
],
"execution_count": 18,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "URXczudRj9p2",
"outputId": "147ddf3c-d72b-4d2b-891e-c67b86dd1775",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"print(X_train_BOW.shape, X_test_BOW.shape)"
],
"execution_count": 19,
"outputs": [
{
"output_type": "stream",
"text": [
"(192808, 110125) (34026, 110125)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "6D2JImLqZ-bW",
"outputId": "a94654ee-d1b5-457f-a0c2-10dfc4d90048",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
}
},
"source": [
"X_train.iloc[200]"
],
"execution_count": 20,
"outputs": [
{
"output_type": "execute_result",
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'пользовател иогурт эрмигурт очен широк горлышк пит неудобн бутылк убира губ царапа'"
]
},
"metadata": {
"tags": []
},
"execution_count": 20
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "hb9w4Mgqm2s7",
"outputId": "ba29e657-8dcb-46d9-ee8d-1d86cf8e310d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"source": [
"# Векторное представление\n",
"X_train_BOW[200]"
],
"execution_count": 21,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<1x110125 sparse matrix of type '<class 'numpy.int64'>'\n",
"\twith 12 stored elements in Compressed Sparse Row format>"
]
},
"metadata": {
"tags": []
},
"execution_count": 21
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pmOeNxCznybd"
},
"source": [
"### TF-IDF"
]
},
{
"cell_type": "code",
"metadata": {
"id": "T3mKWWnwn0im"
},
"source": [
"from sklearn.feature_extraction.text import TfidfVectorizer\n",
"vectorizer = TfidfVectorizer()\n",
"\n",
"X_train_TFIDF = vectorizer.fit_transform(X_train)\n",
"X_test_TFIDF = vectorizer.transform(X_test)"
],
"execution_count": 22,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "NpFtdBSYoBgT",
"outputId": "abc70af0-d17a-4cb1-8734-7fb7ef259986",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"print(X_train_TFIDF.shape, X_test_TFIDF.shape)"
],
"execution_count": 23,
"outputs": [
{
"output_type": "stream",
"text": [
"(192808, 110125) (34026, 110125)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "E6ArOkpioIOs",
"outputId": "bfc6ee5b-e1c1-4093-d47d-02d20cb2e433",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"source": [
"# Векторное представление\n",
"X_train_TFIDF[200]"
],
"execution_count": 24,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<1x110125 sparse matrix of type '<class 'numpy.float64'>'\n",
"\twith 12 stored elements in Compressed Sparse Row format>"
]
},
"metadata": {
"tags": []
},
"execution_count": 24
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DVF6O7dsaRjO"
},
"source": [
"### Строим простейшую модель"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hd5OlzOpbxD_"
},
"source": [
"#### На данных BOW"
]
},
{
"cell_type": "code",
"metadata": {
"id": "4q3FF9T5dgZZ"
},
"source": [
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.metrics import accuracy_score\n"
],
"execution_count": 25,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "2hIL5d9Wa3WV",
"outputId": "ae2b187c-4134-4881-a9e6-5cfae750a1d8",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 207
}
},
"source": [
"# обучаем классификатор\n",
"%time clf = LogisticRegression(random_state=0).fit(X_train_BOW, y_train)"
],
"execution_count": 26,
"outputs": [
{
"output_type": "stream",
"text": [
"CPU times: user 6.83 s, sys: 4.54 s, total: 11.4 s\n",
"Wall time: 5.86 s\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):\n",
"STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n",
"\n",
"Increase the number of iterations (max_iter) or scale the data as shown in:\n",
" https://scikit-learn.org/stable/modules/preprocessing.html\n",
"Please also refer to the documentation for alternative solver options:\n",
" https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n",
" extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)\n"
],
"name": "stderr"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "1CEVfbAKbYga"
},
"source": [
"# вычисляем предсказания\n",
"y_predict_BOW = clf.predict(X_test_BOW)"
],
"execution_count": 27,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "cXxx3qZKboL8",
"outputId": "aff5779e-817f-4d14-ce09-5fd0544eebc0",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"# вычисляем метрику accuracy\n",
"accuracy_score(y_predict_BOW, y_test)"
],
"execution_count": 28,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0.7360253923470288"
]
},
"metadata": {
"tags": []
},
"execution_count": 28
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ccgCm6NKb0Wy"
},
"source": [
"#### На данных TF-IDF"
]
},
{
"cell_type": "code",
"metadata": {
"id": "-Lpnp-YQb20w",
"outputId": "c3e068b2-4d00-441f-f3b4-cb318f18deb8",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 207
}
},
"source": [
"%time clf = LogisticRegression(random_state=43).fit(X_train_TFIDF, y_train)"
],
"execution_count": 29,
"outputs": [
{
"output_type": "stream",
"text": [
"CPU times: user 7.34 s, sys: 4.94 s, total: 12.3 s\n",
"Wall time: 6.3 s\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):\n",
"STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n",
"\n",
"Increase the number of iterations (max_iter) or scale the data as shown in:\n",
" https://scikit-learn.org/stable/modules/preprocessing.html\n",
"Please also refer to the documentation for alternative solver options:\n",
" https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n",
" extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)\n"
],
"name": "stderr"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ERtxqj6qb7i0"
},
"source": [
"y_predict_TFIDF = clf.predict(X_test_TFIDF)"
],
"execution_count": 30,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "wFvxSEK7cBB7",
"outputId": "52a9241a-514d-4da6-dc89-ead417ed644d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"accuracy_score(y_predict_TFIDF, y_test)"
],
"execution_count": 31,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0.7341444777523071"
]
},
"metadata": {
"tags": []
},
"execution_count": 31
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VQxum79ockEI"
},
"source": [
"#### На данных BOW с биграммами"
]
},
{
"cell_type": "code",
"metadata": {
"id": "hDFo5A8wcinG",
"outputId": "7a4639e2-1deb-419d-b689-95721daedb00",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 207
}
},
"source": [
"#-----------------------------------------------\n",
"vectorizer = CountVectorizer(ngram_range=(1, 2))\n",
"#-----------------------------------------------\n",
"X_train_BOW_bi = vectorizer.fit_transform(X_train)\n",
"X_test_BOW_bi = vectorizer.transform(X_test)\n",
"#-----------------------------------------------\n",
"print(X_train_BOW_bi.shape, X_test_BOW_bi.shape)\n",
"#-----------------------------------------------\n",
"clf = LogisticRegression(random_state=0).fit(X_train_BOW_bi, y_train)\n",
"#-----------------------------------------------\n",
"y_predict_BOW_bi = clf.predict(X_test_BOW_bi)\n",
"#-----------------------------------------------\n",
"accuracy_score(y_predict_BOW_bi, y_test)"
],
"execution_count": 32,
"outputs": [
{
"output_type": "stream",
"text": [
"(192808, 893652) (34026, 893652)\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):\n",
"STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.\n",
"\n",
"Increase the number of iterations (max_iter) or scale the data as shown in:\n",
" https://scikit-learn.org/stable/modules/preprocessing.html\n",
"Please also refer to the documentation for alternative solver options:\n",
" https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression\n",
" extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)\n"
],
"name": "stderr"
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0.7493681302533357"
]
},
"metadata": {
"tags": []
},
"execution_count": 32
}
]
}
]
}
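
### What the BOW and TF-IDF cells compute

The notebook calls `CountVectorizer` and `TfidfVectorizer` without showing what ends up in the vectors. The sketch below rebuilds both representations with the standard library only, so each step is visible. It is an illustration, not the notebook's code: `fit_vocabulary`, `bow_vector`, `fit_idf` and `tfidf_vector` are hypothetical helper names, and the three documents are toy strings in the style of the stemmed tweets above. Tokens are split on whitespace here, whereas `CountVectorizer` by default keeps only tokens of two or more word characters. The IDF formula `ln((1 + n) / (1 + df)) + 1` followed by L2 normalization follows the default smoothed variant documented for `TfidfVectorizer`.

```python
import math
from collections import Counter


def fit_vocabulary(docs):
    """Map each distinct whitespace-separated token to a column index."""
    tokens = sorted({tok for doc in docs for tok in doc.split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(doc, vocab):
    """Raw term counts; tokens not seen while fitting are ignored."""
    vec = [0] * len(vocab)
    for tok, n in Counter(doc.split()).items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec


def fit_idf(docs, vocab):
    """Smoothed inverse document frequency, one weight per vocabulary column."""
    n_docs = len(docs)
    # Document frequency: in how many documents each token occurs at least once
    df = Counter(tok for doc in docs for tok in set(doc.split()))
    # vocab preserves insertion order, which matches the column indices
    return [math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab]


def tfidf_vector(doc, vocab, idf):
    """Counts weighted by IDF, then scaled to unit L2 norm."""
    weighted = [c * w for c, w in zip(bow_vector(doc, vocab), idf)]
    norm = math.sqrt(sum(x * x for x in weighted))
    return [x / norm for x in weighted] if norm else weighted


# Toy corpus (illustrative, not taken from the real dataset files)
train_docs = [
    "добав стен потеря музык сайт",
    "rt пользовател сердц голов круж",
    "пользовател музык сайт",
]

vocab = fit_vocabulary(train_docs)
idf = fit_idf(train_docs, vocab)

print(bow_vector(train_docs[2], vocab))
print([round(x, 3) for x in tfidf_vector(train_docs[2], vocab, idf)])
```

Both vectors have one entry per vocabulary token, which is why `X_train_BOW` and `X_train_TFIDF` share the shape `(192808, 110125)` in the notebook. Only the stored values differ: integer counts in one case, IDF-weighted and normalized floats in the other.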