@iaroslav-ai
Last active October 31, 2018 14:33
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Example: learning with an imbalanced dataset\n",
"\n",
"Class imbalance is not a problem in itself when the classes are separable. In this example, a linear model learns to separate both classes perfectly, even though negative instances outnumber positive ones by nearly three orders of magnitude (about 850 to 1 in the run below)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Positive instances: 118\n",
"Negative instances: 99882\n",
"Positive to negative class ratio: 0.0011813940449730681\n",
"Model test accuracy: 1.0 (1.0 is perfectly accurate)\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/iaroslav/.local/lib/python3.5/site-packages/sklearn/svm/base.py:922: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.\n",
" \"the number of iterations.\", ConvergenceWarning)\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlYAAAEICAYAAACdyboFAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAIABJREFUeJzt3XuYHXWd5/HPty+BDrdOSHChExJABJPFMdgLOMwFR3y4J3FVJIKKD8KwM8zo4jDLCHIbZpwRRNaRHUVlUEAgshIDxI0zCjo6JNIQCAbIGMMl6URpQoKaBNLpfPePqk7OOek+dTr966o6Ve/X8/STPqfq1Pl2pT6nvqeu5u4CAADA6LVkXQAAAEBR0FgBAAAEQmMFAAAQCI0VAABAIDRWAAAAgdBYAQAABEJjlTIzazWz35nZoSHHDVDXyWb2wli/DzBSZrZXnINDUnivU81s1Vi/D8rHzKabmZtZWwPjnm9mP0mjrmHe/wUzOzn+/dNm9rU9nM4KMzspaHFNgMYqQfyBPvizw8y2Vjw+d6TTc/cBd9/X3V8KOW6azOzjZvZI1nUgW6GzUTHdJWZ23uBjd38jzsG6MJWHYWYXm9m/ZV0Hwosbi21mNqnm+WVxczQ9m8rS5+5/7+4fTxrPzG43s+trXjvT3R8Zs+JyisYqQfyBvq+77yvpJUlnVTx3V+34jXwbAYpgpNkAmszzkuYNPjCzYySNz66cPcM6KX00VqNkZteb2b1mdreZ/VbSeWb2zvhb9yYzW29mXzSz9nj8tspvPGZ2Zzz8e2b2WzN71MwOG+m48fDTzOw/zew1M/snM/upmZ0/TN3jzewOM9toZiskvaNm+JVmtjp+nxVmNjt+/hhJX5L0h/GWiVfi52eb2ZNm9hsze8nMPhNwNqMJxbuyPxMvR6+Y2V1m1hkP28fM7jGzV+OcLDWzCWb2eUn/TdLX4uXr82a2d5yDKfFr7zGzm81scbx8/tTMplW87xlm9ot4ujfXbgGrqXGfuK5NZva0pFk1w68ys+fj9/m5mZ0RPz9L0s2STorr/FX8/HvN7KmKHHx6DGYt0nGHpI9UPP6opG9WjmBmB5jZN82sz8xejD83W+JhrWZ2Y7zsr5Z0xhCv/Xq8juiN1yWtSUXZrl2KF5nZuvj1f1Ux/Bozuy9eX/xG0vlm1mJml5vZL81sg5nNN7OJFa/5cFz/BjO7oub9rjGzOyse/4GZ/UecmTUW7ba8SNK5kv46zsMD8biVuxT3ivO4Lv652cz2ioedZGZrzexTZvZy/Dd9rOI9TzezZ+Ic9lb+vbnk7vw0+CPpBUkn1zx3vaRtks5S1Kh2KFoxHC+pTdLhkv5T0iXx+G2SXNL0+PGdkl6R1C2pXdK9ku7cg3EPkvRbSXPiYZdK6pd0/jB/y42SHpE0QdI0Sc9IeqFi+NmSDo7/pg9J+p2kN8XDPi7pkZrp/YmkmfH4vxfXeWbW/2f8pPMzTDb+l6R/l3SIpL0l3S7pX+Jhn5B0X5yXtjgz+8TDlkg6r2I6e8c5mBI/vkfSy5KOjZf1+yTdHg87OF5Wz4yH/XWcg/OGqftmST+Q1CnpMEkrJa2qGP7Bihx8OM7YpHjYxZL+rWZ6767IwbGSXpV0atb/P/zs2fIcLw9vldQqaW38WVn5mfxNSd+VtJ+k6Yo+6y+oWD6ekzRV0kRJD8evbYuH3y/pK5L2iT+/fybpT+Nh50v6yTC1TY+nc3f82mMk9Q3mT9I18TI/V7vWSZ+IczVF0l7x+94djz8jzswfxcNukrS9ZnqD65lpcQbmxfk6UNLb42G3S7p+qPkY/35dXMNBkiZL+g9JfxsPOyl+z+vi6Z4uaYukCfHw9ZL+MP59gqRjs15G6v2wxSqMn7j7A+6+w923uvtj7r7U3be7+2pJt0r64zqvv8/de9y9X9Jdkt6+B+OeKelJd/9uPOwLipqb4ZytKAQb3f1FRVuhdn
L3+e6+Pv6bvqUoIN3DTczdf+juK+Lxn1K08qv3N6P4LpZ0ubuvc/fXJV0r6YNmZoo++CdLOiLOyWPuvnkE057v7k/Ey/q3tCsHZ0l6zN0fjIfdKGljnemcrejDfZO7Py/plsqB7n5vRQ7ukNSrmq27NeP/oCIHT0iaL3LQzAa3Wr1H0rOK/v8lRVukJJ0j6W/c/bfu/oKkzytqwKVo2brZ3de4+6uSPlvx2jcpah4+6e6b3f1lRZ/Z54ygtmvj1z4t6V9UsdtS0qPuvmBwnaQoi1e4+1p3f0NRs/R+i3YTvl/Sg+7+43jYZyTtGOY9P6Toy8Td7t7v7hvc/ckG6z1X0nXu/rK79yn6PPhwxfD+eHi/uy9S1OwdVTFshpntH6+znmjwPTNBYxXGmsoHZna0mT1kZr+KN8VeJ2nS0C+VJP2q4vctkvbdg3EPqazDo9Z+bZ3pHFxT94uVA+PNu0/Fm3s3STpadf4Gi3Z/PhJvEn9N0Vaten8zCixunqZKWlSxDC1T9JlzoKSvS/qRpPviXQB/38hukAqN5mCHKlaGNTW2SPovqp+DC8xsecXf8GbVz8GJZvajihycX2985N4dipqJ81WzG1DR/2u7qpeZFyV1xb9XLYs1402LX7u+Ytn6iqKtOY2qnfYhwwwbfL/7K97rWUkDkt5UW2f8BWfDMO85VdIvR1BjpUO0+7yqrHmDu2+veFyZ6/cpakRfjPP1zj2sIRU0VmF4zeOvSPq5pDe7+/6SrpJkY1zDekWbeSXtXLF1DT+6fqUoJIN2XtLBzA6X9M+S/oekA929U9Em7cG/ofbvlaItVP9X0lR3P0DS1zT2fzNyKm7seyX9ibt3Vvzs7e6veHSm31XufrSiXRAf0K5v60MtX42qzUGLhslB3HT9WsPn4C2S/knSRZImxjlYpfo5mK9oF/1gDm4XOWha8db85xWt1L9TM/gVRVtSplU8d6h2NfLrNcyypaiReUPRbuXBbOzv7jNHUF7ttCvPmq1dNtdIOm2ILPbW1mlm4xV9+RnKGklHDDMsKbfrtPu8auhM33iL9hxFjecCRTnLLRqrsbGfpNckbTazt0r60xTe80FJx5rZWfHm3U8o2tUynPmSPm1mnRZdJ+uSimH7KgpJn6Ie7UJFW6wG/VrSFIsPyI/tJ+lVd3/dzE7QyDZpo5i+LOkfzGyqJJnZQWZ2Vvz7yWY2I258fqPo+IrB3Q+/VnRs4p5YKOn4+GDXNkXHGk6oM/58SVfEBxJPk/RnFcP2jWvqk9RiZhcr2mI16NeSptquE1Msfs2GOAe/r6hhRHO7QNEXhKpd1e4+oGj5+Tsz2y9efi5VdCys4mF/aWZTzGyCpMsrXrte0vclfd7M9o8PLj/CzEay2/gzFp2ENFPSxxQ19MP5clznNEkys8lmNicedp+kM+OD0scp2sMyXG9wl6STzexsi06uOtDMBnfDJ+X2bklXxu89SdEGhzvrjK+41nFmdq6ZHRDv3v+Nht9VmQs0VmPjU4rOIPmtoq1X9Rb4INz914oOtL1J0WbcIxTtenljmJdcreibyguSvqeKzdzuvlzRN/WfxeMcJWlpxWv/VdIvJP3a4rOhFG3d+qxFZ0Z+Wjn/RoFUfE7Sv0n6Ybxc/IeiA7qlaCvSdxVl5OeSFmlXTr4g6SMWnbH6uZG8YbzCmifpi4q2KEyR9LSGz8GV8XgvSXpI1Tl4QtEKqUdRDg6Lfx/0/xTl52UzWxtvpbtY0o3x3/vXkr49kvqRP+7+S3fvGWbwX0jaLGm1pJ8oOt7vtnjYVyUtlvSUpCe0+xavj0gap+jEoY2KGpyDR1DajxRtQf2BpBvd/ft1xv3fir50fD9eNpcoOsFK7r5C0p/Hta+PaxnyMBKPrql4uqJ13KuSnlR0spIU7d6fEe9uXDDEy69XlJ/lijL5RPxcIz4s6YX40JqLFR2vlVsWfRagaOLjVdZJer+7/3vW9QBZiLda/UrRNbYezboeYLQsuvzO85
Laa45JQk6wxapALLodR2d8bZDPKNr//7OMywJSZdH13A4ws70VbZndIunxjMsCUBI0VsXyB4o2SfdJOkXSe+PTZ4Ey+SNF3+hfVnRdqfe6+7ZsSwJQFuwKBAAACIQtVgAAAIFkdnPGSZMm+fTp07N6e6DK448//oq717s8xZgjE8gTMgFUazQTmTVW06dPV0/PcGewAukysxeTxxpbZAJ5QiaAao1mgl2BAAAAgdBYAQAABEJjBQAAEAiNFQAAQCA0VgAAAIHQWAEAAARCYwUAABBIYmNlZreZ2ctm9vNhhpuZfdHMVpnZcjM7NnyZQH6QCaAamQB2aeQCobdL+pKkbw4z/DRJR8Y/x0v65/jfTJz71Uf101++mtXbI4faW6QbPvB2zZ3VFWqSt6uJMiFJC5b16toHVmjjlv4sy9ipvUXq3xH+tRPGt+v1/gFtHeHEO9pbEl/T3iINuLTDpVYzzTt+qq6fe4wk6T03PaJfvLx5yNe1WPSakepob9H73jFFDy1fv9v/m0lqq5kPE8a364y3HawHn1qvTVvr/z9PGN+uq8+aSSYKkokWScO9tGiZ2Lu9dcj/s73aWtRiqqp5MBPfeXyttiT8LSEzkdhYufuPzWx6nVHmSPqmR3dzXmJmnWZ2sLuvH3V1I1TvPxLl1b9D+uS9T0pSkNA0UyakfOZiT1cgSa/d05VkIyudylEG3HXnkpe0dPWGxHm7JyuQwZruXPLSkMNcu8+HjVv6hx2/1sYt/bp0fjkzsWBZry779pOjWgbHwmjqqffSomViuLre2L7781llIsQxVl2S1lQ8Xhs/l6oFy3pzt/JAvtyweGVab5WLTEj5bKqKpJnn7Q6Xrrj/6bTeLheZWLCsV5+8N39NVZGQiZQPXjezi8ysx8x6+vr6gk47xZUmmtS6TVuzLmE3Y5kJvmwgyeZtA1mXsJuxzMS1D6wIOj0UT4hMhGiseiVNrXg8JX5uN+5+q7t3u3v35Mlhb5qex5Um8uWAjva03ioXmeDLBnIkF5nIy/FUKLYQjdVCSR+Jz/o4QdJrWew372jnyhGozyy1t8pFJnr5soH8yEUmgDQkHrxuZndLOknSJDNbK+lqSe2S5O5flrRI0umSVknaIuljY1VsPSM94wHlE+rbarNkotVMA76HR4kCI9AsmTBFB/4DY6mRswLnJQx3SX8erKI9RFiQlmbJBE0VkrQG2ozbLJkgEUgSIhPsPwMKqjO9Y8rQpOYdPzV5pAIhE0gSIhM0VkBB9Q+wexz1dU+bmHUJqSITSDJ4gdPRKExjld5xyUBzyOOp9MiXT39nedYlpIpMIMmVC5rsOlZjqa0wfwkApCPpNh9A2dy1tLErtddTmHaEzwcAADAaIc75KUxjBQAAkDUaKwAoKa6rDFQLkQliBRRUV2dH1iUg5/bdu1yXHyATSBIiEzRWQEG96+iw91lD8Wwq2b3zyASShMgEjRVQUA8/15d1Cci5FG9MngtkAklCZILGCigobsKMJGW7YCaZQJIQmaCxAoCS4oKZQLUQmaCxAgAACITGCgAAIBAaK6CgjBtoAlXIBJKEWEYK01hxfRKg2rnHH5p1CUCu/P7hE7MuATkXYhkpTGPF9UmAatfPPSbrEpBzrSXbhPPCBs4KRH0hlpHCNFZcnwQARmYgxB1nm8g6LreABCEuyVGYxorAAMDIlG2L1SEcMoIEITJRmMaKwADAyJRti9VlpxyVdQnIuRCZKExjxTFWQLUFy3qzLgE5x0k/QLUQmShMY8UxVkC1GxavzLoE5FzZtuCQCSQJkYnCNFYcYwVUIxNIMndWV9YlpIpMIEmITBSmseIYK6AamQCqHdDRnnUJKIHCNFZl26QNJOG4Q6Ba/8COrEtACRSmsSrbJm0gyUPL12ddAnKubCc4bN42kHUJyLkQmShMY1W2Dwhky8xONbOVZrbKzC4fYvihZvawmS0zs+VmdnraNW7c0p/2W6LJXHH/08Gm1QyZAJKEyERDjV
UzBIazPZAWM2uVdIuk0yTNkDTPzGbUjHalpPnuPkvSOZL+T7pVAslCbcEhEyiKEJlIbKyaJTAhLkMPNOg4SavcfbW7b5N0j6Q5NeO4pP3j3w+QtC7F+oC0kQkg1sgWKwIDVOuStKbi8dr4uUrXSDrPzNZKWiTpL9IpDcgEmQBijTRWwQJjZheZWY+Z9fT1cUFPFNo8Sbe7+xRJp0u6w8x2yxuZQJZSvlMgmUDuhchEqIPXGwqMu9/q7t3u3j15MqeCo2n1Sppa8XhK/FylCyTNlyR3f1TS3pIm1U6ITCBLAe8USCZQCCEy0UhjFSwwQEE8JulIMzvMzMYpOq5wYc04L0l6tySZ2VsVZYKv3ygqMgHEGmmsCAxQwd23S7pE0mJJzyo6cWOFmV1nZrPj0T4l6UIze0rS3ZLOdw9w23Qgh8gEsEtb0gjuvt3MBgPTKum2wcBI6nH3hYoC81Uz+5+KtqQRGBSauy9SdDxh5XNXVfz+jKQT066rkplEClFPe8ArGZIJFEGITCQ2VlJzBGb/vVr1mze4qi4wiBUIknzwuEOzLiFVZAJJbvjA20c9jcJceZ2mCqjWxU2YkaBstz0iE0hDYRorANW4CTOSlO22R2QCSa59YMWop0FjBRTUw89x/ghQiUwgSYgvG4VprEIehAkUwTpu8wRUIRNIQ2HakRAHnAFFcgjHkyDB+JJ9IyUTSBIiE+VKFVAiHE+CJOPaWrMuIVVkAklCZKIwjdU1C0d/wBlQJBxPgiSvbS3XwetkAklCZKIwjdWmkn1AAEk4ngRJyrZrrJdMIEGITBSmsQJQrWwrTYzcZacclXUJqWo1y7oE5FyITNBYAQXF8SRAtQEuvY4UFKax4nsIUK1sV9XGyF1x/9NZl5CqFlYUSMAFQivwPQSoVraramPkNm8r163AdrCiQAIuEAoAAJAjNFYAUFLsGQOqdXa0j3oaNFYAUFJvPmifrEsAcmXmIfuNeho0VgBQUqv7tmRdApArS1ZvHPU0aKwAoKS4/ABQLUQmaKwAAAACobECAAAIpDCNFRd+A4CR4RYvQLUQmShMY9XK5wMAjAjHWAHVOMaqQv+OrCsA8oXvGkhSti39JftzsQdCZKIwjRWAamyLQJKy3eKlZH8u9kCITBSmsZowfvRXSwUAABiNwjRWV581M+sSAABAyRWmsZo7qyvrEgAAQMk11FiZ2almttLMVpnZ5cOMc7aZPWNmK8zsW2HLBPKlGTLRXpivTWgGzZAJIA1tSSOYWaukWyS9R9JaSY+Z2UJ3f6ZinCMl/Y2kE919o5kdNFYFD2fBst603xIl1SyZ2Hfvdm3c0p/226KEmiUTQBoa+U57nKRV7r7a3bdJukfSnJpxLpR0i7tvlCR3fzlsmcmufWBF2m+J8mqKTNBUIUVNkQkgDY00Vl2S1lQ8Xhs/V+ktkt5iZj81syVmdupQEzKzi8ysx8x6+vr69qziYbASQYqaIhNcVRspaopMlO26XchGqKMw2iQdKekkSfMkfdXMOmtHcvdb3b3b3bsnT54c6K2BXMo8E1xVGzmTeSbKdt0uZKORxqpX0tSKx1Pi5yqtlbTQ3fvd/XlJ/6koQKnp4EhdpKcpMsGXc6SITACxRrqRxyQdaWaHmdk4SedIWlgzzgJF30JkZpMUbfJdHbDORC3s9kB6miITfDlHisgEEEtsrNx9u6RLJC2W9Kyk+e6+wsyuM7PZ8WiLJW0ws2ckPSzpMnffMFZFD2XztoE03w4l1iyZANJCJoBdEi+3IEnuvkjSoprnrqr43SVdGv8AhUcmgGpkAohwYBIAAEAgNFYAAACB0FgBAAAEQmMFAAAQSGEaqxOPmJh1CUCuTBjfnnUJyLmyXaSGTCBJiEwUprG668J3Zl0CkCtnvO3grEtAzpXtuk5kAklCZKIwjdWCZbUX+QXK7aHl67MuATnX1dmRdQmpIhNIEiIThWmsrlm4IusSgFzhxuRI8q6jy3XPVjKBJCEyUZjGat
NWAgMAI/Hwc31ZlwDkSohMFKaxAlCtbAcmY+R6N23NuoRUkQkkCZEJGiugoMp2YDJGrrVkN68nE0gSIhM0VkBBcWo5kgx4uVoNMoEkITJRmMbqyIP2yboEIFde7x/IugQgV8gE0lCYxqrvt9uyLgHIla39O7IuAcgVMoE0FKax4qxAABiZch1hBSTjyusAgD1WsmPXgUQhMlGYxoqDEgFgZHaU69h1IFGITBSmsbr6rJlZlwAAAEquMI0VAABA1grTWN2weGXWJQC50tnB7nHU116YNUBjyASShMhEYWK1rmS3ZgCSTN5vXNYlIOemTyrX9f/IBJKEyERhGqsD+CYCVPnFy5uzLgE5V7ZlpGx/L0YuxDJSmMaK04YBAEDWCtNYbdrCBUIBAEC2CtNYHdLZkXUJAACg5ArTWL3r6MlZlwDkyl5thYk3xsh5JxyadQlAroTIREOfvGZ2qpmtNLNVZnZ5nfHeZ2ZuZt2jrmyEHnxqfdpviRJrhkx0tLem/ZZoIi2Srp97TLDpNUMmgHpCZSKxsTKzVkm3SDpN0gxJ88xsxhDj7SfpE5KWjrqqPcBNmJGWZsnEa2QCdewIOK1myQRQT6hMNLLF6jhJq9x9tbtvk3SPpDlDjPe3kv5R0uuBagPyqiky0cn9M5FgwbLeUJNqikywexxJQmSikaWsS9Kaisdr4+d2MrNjJU1194fqTcjMLjKzHjPr6evrG3Gx9acddHJAPU2RCecGu0gQ8I4VTZEJdo8jSYhMjLp9N7MWSTdJ+lTSuO5+q7t3u3v35MlhDzZnJYK8yEsm2D2OJGndsYJMoFmEyEQjjVWvpKkVj6fEzw3aT9J/lfSImb0g6QRJC9M+MLGVTVZID5lAIQTcXUwmUAghMtFIY/WYpCPN7DAzGyfpHEkLBwe6+2vuPsndp7v7dElLJM12955RVzcCA2yyQnrIBAoh4CJCJlAIIRaRxMbK3bdLukTSYknPSprv7ivM7Dozmz36EsKYwIG6SEmzZIIv50gS6sxRMoGiCJGJtkZGcvdFkhbVPHfVMOOeNOqq9gBfRJAmMoEiCHnzejKBIgiRicKce8pBiQAwMmzBAaqFyERhGqsWPiCAKuPbCxNvjJGN3LweqBIiE4X55N3BJl6gyl5csweowhY6pKEwjRWAamyNAKpxjBXSQGMFFBS7x4FqZAJpKExjxSZeoBq7x4FqZAJpKExjxSZeAACQtcI0Vl2dHVmXAAAASq4wjdVlpxyVdQkAAKDkCtNY9bz4atYlALnSGfCq2kARkAmkoTCN1d1L12RdApAr18yemXUJyLmynfNDJpAkRCYK01hx13Kg2txZXVmXgJwr26cmmUCSEJkoTGPVyvUWAGBEOOkHqBYiE4VprOYdPzXrEoBcWbCsN+sSkHPvOnpy1iWkikygnhaFORGuMI1V97SJWZcA5MoNi1dmXQJy7qHl67MuIVXXPrAi6xKQYzsCTacwjRUrEaDauk1bsy4BOVe2+0mW7e/FyF2zcPTNd2Eaq15WIkCVzvGcWg4AI7Fp6+ib78I0Vhy8DlR7o38g6xIAoHQK01hxuQWg2pb+UEcMAAAaVZjGitOGAQDAaLQH6IoK01iV7bRhIAm370CSsh1AQSaQZHuAnV+FaazKdtowkGTmIftlXQJyrmwHUJAJJAlxVFFhGitOowWqLVm9MesSgFwhE0hDYRorANU4oQOoRiaQBhorAACAQArTWE3gYogAACBjDTVWZnaqma00s1VmdvkQwy81s2fMbLmZ/cDMpoUvtb6rz5qZ9luixJohEy1lO+ULmSITQCSxsTKzVkm3SDpN0gxJ88xsRs1oyyR1u/vbJN0n6XOhCwXyolkysYPDSZAgxDV7JDKB4jjvhENHPY1GYnWcpFXuvtrdt0m6R9KcyhHc/WF33xI/XCJpyqgrGyFuwowUNUUmgCQBL85PJlAI1889ZtTTaKSx6pK0puLx2vi54Vwg6XtDDTCzi8ysx8x6+vr6Gq
+yAeu4CTPS0xSZAFJEJoBY0IPXzew8Sd2SbhhquLvf6u7d7t49eXLYK6XvHWqbNhBQlpkAkmRxJXIygaJrpBvplTS14vGU+LkqZnaypCskzXb3N8KU17g3tnPDWaSmKTIBJDnz9w4ONSkygUJYsGy3xXbEGmmsHpN0pJkdZmbjJJ0jaWHlCGY2S9JXFIXl5VFXtQc4KBEpaopMGGdAIcGDTwW7FRiZQCFcs3DFqKeR2Fi5+3ZJl0haLOlZSfPdfYWZXWdms+PRbpC0r6Rvm9mTZrZwmMmNGU6jRVqaJRNcZBpJNm0NcyswMoGiCJGJtkZGcvdFkhbVPHdVxe8nj7qSUdqrrUVbA57iAtTTDJkwle8mu8gOmQAihTnim6YKqMYKBKhGJpCGwjRW7AkEAABZK0xjxTcRoBr3zwSqkQmkoTCNFYBq3D8TqEYmkAYaK6Cg5s6qd+FroHzIBNJAYwUAABAIjRVQYJzUAQDporECCuzcEw7NugTkWFdnR9YlpO48MoE6QmSiMI0VYQF2d/8To7/vFYrrXUeX7ybHZAL1XHbKUaOeRmEaq+5pE7MuAcidzdsGsi4BOXbPz9ZkXULqyATq+XbPS6OeRmEaq09/Z3nWJQBAU9nO3euBKj/95aujnkZhGqst3NIGAABkrDCNFQAAQNZorAAAAAKhsQIK6tyvPpp1Cci5st07j0wgSYhM0FgBBRXiIEwU2+9e78+6hFSRCSQJkQkaKwAoKc75AaqFyERhGquybdIGAAD5U5jG6uqzZmZdAgAAKLnCNFZzZ3VlXQIAACi5wjRWAICRaWuxrEsAciVEJmisgILiuEMkGSjZLW3IBJKEyEShGiu+fQG7cNwhkpSrrSITSBYiE4VqrG78wO9lXQKQG3Nndamzg2/owCAygTQUqrHiAHag2jWzZ6qjvTXrMoDcIBMYa4VqrCTpvBMOzboEIDfmzurSZ//7MepoL1zUgT1CJjDWGlqyzOxUM1tpZqvM7PIhhu9lZvfGw5ea2fTQhTbq+rnH0FxhzDVTJubO6tKzf3uaTjxiYlYloATIBBBJbKzMrFXSLZJOkzRD0jwzm1Ez2gWSNrr7myV9QdI/hi50JK6fe4xe+IczxLHsGAvNmAlJuuskFZ71AAAIf0lEQVTCd+rmD7496zJQQGQC2KWRLVbHSVrl7qvdfZukeyTNqRlnjqRvxL/fJ+ndZpZ5W3PT2QQGY6JpMzF3VhdbdLHTkQftE2pSZAKFECITjTRWXZLWVDxeGz835Djuvl3Sa5IOrJ2QmV1kZj1m1tPX17dnFY/A3FldbOrFWGjaTEjRFl3gTfuN079eelKoyZEJNL1QmUj16D13v9Xdu929e/Lkyam8510XvlNtmX8nQh7kscnOIhOSOOW85M474VAtveI9WZcxJDKBLITMRCONVa+kqRWPp8TPDTmOmbVJOkDShhAFhrDqs2fQXJXciUdM1F0XvjPU5Jo+E9fMnqn2HByE2NnRrn3GlePU9/HxWWitDez9GjxjbSz+i048YuJYbKEpRCZyEImdy0f2O0nHXqOZaDWN6VmcoTPR1sA4j0k60swOUxSMcyR9qGachZI+KulRSe+X9EN3z9VFfVd99gxdueBpfWvpS6q8Yn2LSTtc6urs0GWnHKW5s7q0YFmvbli8Uus2bdUBHe3qH9ihzdsGJEULu8fjv+voyXr4ub6d45lJG7f0N1TP+PYWjWtr1aatyePvM65V7z22Sw8tX79z+u0t0oBHtbea6YTDJ+iFDVvVu2lr3enscNfW/h113+/EIybunFarmQbc1dXZoekHdmjJ6o0acFermQ6fPF6r+jar9n96wvh2zTh4v53jtphkiuodZJJ+f4j32autRW9s372+ca2mbQONL1LnnXDoWG7eb/pMDF7z7YbFK6vmf+X/92AealXm45Ahlot5x09V97SJVeMMN60kte81mLnaZbNy+rWv2ZNhnePb5S69trV/5/s++NT6nXmdML5dV5
81c9jpVn42DPf3D76m3t8y1HwY/KzZtKW/atoLlvXq2gdW7PyM6Oxo18xD9tvt/2aMclGYTFTOw472Fu3d3rrbvK5Vtkx0jm/X6/0DO9cltXmoV2dSJq5ZuGLYnA03H/KWCWtkuTaz0yXdLKlV0m3u/ndmdp2kHndfaGZ7S7pD0ixJr0o6x91X15tmd3e39/T0jPoPAEIws8fdvXsE45MJFBqZAKo1molGtljJ3RdJWlTz3FUVv78u6QMjLRJoVmQCqEYmgAiXngUAAAikoS1WmVg+X7r/YskHdj036WjpkqXZ1QQAAFBHPhur5fOl71y4+/OvPCfdeLT0V8+lXxOQteXzpQc/KW3bvOu59n2ks26W3nZ2dnUBWXnwUqnnNkkVxwp3XyCdeVNmJQH53BX44CeHH/a79dI3ZqdXC5AHy+dLCy6ubqokqX9z9CVk+fxs6gKy8uClUs/XVdVUSdFzNx6dSUmAlNfGqnblUev5H0WhAsriB9dJOwaGH77gz9OrBciDx28ffhhfwJGhfDZWjej5etYVAOl5bW394Tu2pVMHkBde54uGFH0BBzLQvI2VxDcSlMcBU5LHYXcgysQauGI/mUAGmruxev5HNFcoh3dflTzOdy5kFznK4x3nJ49DJpCB5m6sJDb3ohwaPeuv5+usSFAOjZ75RyaQsuZvrCTpS8dnXQEw9tr3aWy8egf1AmVEJpCiYjRWr3BdK5TA9uFvsF0l6aBeoGzIBFJUjMYKKAPf0dh4jRzUC5QJmUCK8tlYdUzMugIgfxpdOTRyUC9QBNbgKoxMIEX5bKxO+8eRjT+Jq+yiBBJXDsbtPFAu7/hY/eHWQiaQunzeK3DwDKgHPhndsqMebsyMshhcOTx+e3TMiLVGzRYrDZQVmUAO5bOxkqLmihvLAtXOvImVBlCJTCBn8rkrEAAAoAnRWAEAAARCYwUAABAIjRUAAEAgNFYAAACB0FgBAAAEQmMFAAAQCI0VAABAIObu2byxWZ+kF8do8pMkvTJG024mzIfG58E0d5881sXUQyZSwXwgE4NYFiLMh8CZyKyxGktm1uPu3VnXkTXmA/NgEPMhwnxgHgxiPkSYD+HnAbsCAQAAAqGxAgAACKSojdWtWReQE8wH5sEg5kOE+cA8GMR8iDAfAs+DQh5jBQAAkIWibrECAABIHY0VAABAIE3dWJnZqWa20sxWmdnlQwzfy8zujYcvNbPp6Vc59hqYD+ebWZ+ZPRn/fDyLOseSmd1mZi+b2c+HGW5m9sV4Hi03s2PTrjENZII8SOShEpkgE1LKmXD3pvyR1Crpl5IOlzRO0lOSZtSM82eSvhz/fo6ke7OuO6P5cL6kL2Vd6xjPhz+SdKyknw8z/HRJ35Nkkk6QtDTrmjNaFgqdCfKw828sfR5GsDyQCTIRNBPNvMXqOEmr3H21u2+TdI+kOTXjzJH0jfj3+yS928wsxRrT0Mh8KDx3/7GkV+uMMkfSNz2yRFKnmR2cTnWpIRPkQRJ5qEAmyISkdDPRzI1Vl6Q1FY/Xxs8NOY67b5f0mqQDU6kuPY3MB0l6X7x58z4zm5pOabnS6HxqZmSCPDSqDHmQyIREJhoVLBPN3FihcQ9Imu7ub5P0r9r17QwoI/IAVCMTATVzY9UrqbKrnhI/N+Q4ZtYm6QBJG1KpLj2J88HdN7j7G/HDr0l6R0q15Ukjy0uzIxPkoVFlyINEJiQy0ahgmWjmxuoxSUea2WFmNk7RQYcLa8ZZKOmj8e/vl/RDj49SK5DE+VCzn3i2pGdTrC8vFkr6SHzmxwmSXnP39VkXFRiZIA+NKkMeJDIhkYlGBctEW9i60uPu283sEkmLFZ31cJu7rzCz6yT1uPtCSV+XdIeZrVJ00No52VU8NhqcD39pZrMlbVc0H87PrOAxYmZ3SzpJ0iQzWyvpakntkuTuX5a0SNFZH6skbZH0sWwqHTtkgjwMIg8RMkEmBqWZCW5pAwAAEEgz7woEAADIFRorAACAQG
isAAAAAqGxAgAACITGCgAAIBAaKwAAgEBorAAAAAL5/zIQT/yCACkWAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 720x288 with 3 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from sklearn.svm import LinearSVC\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"X = np.random.rand(100000, 2)\n",
"y = np.sum(np.abs(X), axis=-1) < 0.05\n",
"X = (X.T - 0.1*y.T).T # shift positives by -0.1 on each axis so the classes are perfectly separable\n",
"\n",
"print('Positive instances: %s' % np.sum(y==True))\n",
"print('Negative instances: %s' % np.sum(y==False))\n",
"print('Positive to negative class ratio: %s' % (np.sum(y==True) / np.sum(y==False)))\n",
"\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y)\n",
"\n",
"# C should be tuned for real data (e.g. by cross-validation); the large value here weakens regularization\n",
"model = LinearSVC(C=10000.0)\n",
"model.fit(X_train, y_train)\n",
"y_pred = model.predict(X_test)\n",
"score = model.score(X_test, y_test)\n",
"print(\"Model test accuracy: %s (1.0 is perfectly accurate)\" % score)\n",
"\n",
"# visualize data and predictions\n",
"def plot_data(X, y, title):\n",
"    plt.title(title)\n",
"    plt.scatter(X[~y, 0], X[~y, 1])\n",
"    plt.scatter(X[y, 0], X[y, 1])\n",
"\n",
"plt.figure(figsize=(10, 4))\n",
"plt.subplot(1, 3, 1)\n",
"plot_data(X_train, y_train, 'Training data')\n",
"plt.subplot(1, 3, 2)\n",
"plot_data(X_test, y_test, 'Testing data')\n",
"plt.subplot(1, 3, 3)\n",
"plot_data(X_test, y_pred, 'Model predictions')\n",
"plt.show()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
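
The notebook's claim can be checked without training anything. Below is a minimal NumPy-only sketch, not part of the original notebook, that rebuilds the same data and shows why the two classes are linearly separable, and why accuracy on its own says little at this class ratio. The fixed seed, the threshold `-0.05`, and the helper `recall` are illustrative assumptions.

```python
import numpy as np

# Fixed seed: an assumption added here for reproducibility (the notebook uses none).
rng = np.random.default_rng(0)

# Same construction as the notebook: positives are the points with
# x1 + x2 < 0.05, which are then shifted by -0.1 along both axes.
X = rng.random((100_000, 2))
y = X.sum(axis=1) < 0.05
X = X - 0.1 * y[:, None]

# After the shift, positives satisfy x1 + x2 < 0.05 - 0.2 = -0.15,
# while negatives keep x1 + x2 >= 0.05, so a gap of about 0.2 lies between them.
s = X.sum(axis=1)
max_pos = s[y].max()
min_neg = s[~y].min()

# Hypothetical hand-written linear rule placed inside that gap.
y_rule = s < -0.05


def recall(y_true, y_hat, label):
    """Fraction of instances of class `label` that are predicted as `label`."""
    mask = y_true == label
    return float(np.mean(y_hat[mask] == label))


# Baseline that always predicts the majority (negative) class.
y_majority = np.zeros_like(y)
baseline_accuracy = float(np.mean(y_majority == y))

print("largest positive x1 + x2: ", max_pos)
print("smallest negative x1 + x2:", min_neg)
print("rule recall (positive):   ", recall(y, y_rule, True))
print("rule recall (negative):   ", recall(y, y_rule, False))
print("majority-class accuracy:  ", baseline_accuracy)
print("majority-class recall (positive):", recall(y, y_majority, True))
```

The majority-class baseline also reaches an accuracy above 0.99 while finding no positives at all, so per-class recall is the more informative check when reproducing the notebook's result.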