@capissimo
Created March 28, 2018 14:43
Chapter 1 of Elegant Scipy
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 1\n",
"# Elegant NumPy: The Foundation of Scientific Python\n",
"> [NumPy] is everywhere. It is all around us. Even now, in this very room.\n",
"> You can see it when you look out your window or when you turn on your television.\n",
"> You can feel it when you go to work, when you go to church, when you pay your taxes.\n",
">\n",
"> -- Morpheus, *The Matrix*"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def rpkm(counts, lengths):\n",
"    \"\"\"Calculate reads per kilobase of transcript per million mapped reads.\n",
"\n",
"    RPKM = (10^9 * C) / (N * L)\n",
"\n",
"    where:\n",
"\n",
"    C = number of reads mapped to a gene\n",
"    N = total number of mapped (aligned) reads in the experiment\n",
"    L = exon length in base pairs for a gene\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    counts : array, shape (N_genes, N_samples)\n",
"        RNA-seq (or similar) count data, where columns are\n",
"        individual samples and rows are genes.\n",
"    lengths : array, shape (N_genes,)\n",
"        Gene lengths in base pairs, in the same order as\n",
"        the rows in counts.\n",
"\n",
"    Returns\n",
"    -------\n",
"    normed : array, shape (N_genes, N_samples)\n",
"        The counts matrix normalized according to RPKM.\n",
"    \"\"\"\n",
"    N = np.sum(counts, axis=0)  # sum each column to get the\n",
"                                # total read count per sample\n",
"    L = lengths\n",
"    C = counts\n",
"\n",
"    normed = 1e9 * C / (N[np.newaxis, :] * L[:, np.newaxis])\n",
"\n",
"    return normed"
]
},
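{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of what the broadcast expression in `rpkm` computes, the cell below applies the same formula to a tiny made-up table and checks it against explicit loops. The names `toy_counts` and `toy_lengths` and all of their values are hypothetical and are not part of the data used later in this chapter."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical toy data: 3 genes (rows) x 2 samples (columns).\n",
"toy_counts = np.array([[100, 200],\n",
"                       [50, 0],\n",
"                       [350, 100]])\n",
"toy_lengths = np.array([1000, 2000, 500])  # made-up gene lengths in base pairs\n",
"\n",
"# Vectorized form: the same expression that rpkm() uses.\n",
"N = toy_counts.sum(axis=0)  # total reads per sample\n",
"vectorized = 1e9 * toy_counts / (N[np.newaxis, :] * toy_lengths[:, np.newaxis])\n",
"\n",
"# Explicit loops over every (gene, sample) pair, for comparison.\n",
"looped = np.empty(toy_counts.shape)\n",
"for gene in range(toy_counts.shape[0]):\n",
"    for sample in range(toy_counts.shape[1]):\n",
"        looped[gene, sample] = (1e9 * toy_counts[gene, sample]\n",
"                                / (N[sample] * toy_lengths[gene]))\n",
"\n",
"print(np.allclose(vectorized, looped))"
]
},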
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction to the Data: What Is Gene Expression?"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [],
"source": [
"gene0 = [100, 200]\n",
"gene1 = [50, 0]\n",
"gene2 = [350, 100]\n",
"expression_data = [gene0, gene1, gene2]"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"350"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"expression_data[2][0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## NumPy N-Dimensional Arrays"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 2 3 4]\n",
"<class 'numpy.ndarray'>\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"array1d = np.array([1, 2, 3, 4])\n",
"print(array1d)\n",
"print(type(array1d))"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4,)\n"
]
}
],
"source": [
"print(array1d.shape)"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[100 200]\n",
" [ 50 0]\n",
" [350 100]]\n",
"(3, 2)\n",
"<class 'numpy.ndarray'>\n"
]
}
],
"source": [
"array2d = np.array(expression_data)\n",
"print(array2d)\n",
"print(array2d.shape)\n",
"print(type(array2d))"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2\n"
]
}
],
"source": [
"print(array2d.ndim)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Why Use ndarrays Instead of Python Lists?"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Create an ndarray of integers in the range\n",
"# 0 up to (but not including) 1,000,000\n",
"array = np.arange(1_000_000)\n",
"\n",
"# Convert it to a list\n",
"list_array = array.tolist()"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"193 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%timeit -n10 y = [val * 5 for val in list_array]"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7.14 ms ± 696 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%timeit -n10 x = array * 5"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 2 3]\n",
"[1 2]\n",
"[6 2]\n"
]
}
],
"source": [
"# Create an ndarray x\n",
"x = np.array([1, 2, 3], np.int32)\n",
"print(x)\n",
"\n",
"# Create a \"slice\" of x\n",
"y = x[:2]\n",
"print(y)\n",
"\n",
"# Set the first element of y to 6\n",
"y[0] = 6\n",
"print(y)"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[6 2 3]\n"
]
}
],
"source": [
"# Now the first element of x has also changed to 6!\n",
"print(x)"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [],
"source": [
"y = np.copy(x[:2])"
]
},
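{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cells above show that a slice writes through to the original array while `np.copy` does not. As a small sketch, `np.shares_memory` can make that difference explicit. The variable names `view` and `copy` are chosen here for illustration only."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"x = np.array([1, 2, 3], np.int32)\n",
"\n",
"view = x[:2]           # basic slicing returns a view onto x's data\n",
"copy = np.copy(x[:2])  # np.copy allocates separate storage\n",
"\n",
"print(np.shares_memory(x, view))\n",
"print(np.shares_memory(x, copy))\n",
"\n",
"copy[0] = 6  # changes only the copy; x is untouched\n",
"print(x)\n",
"view[0] = 6  # writes through to x\n",
"print(x)"
]
},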
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Векторизация"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2 4 6 8]\n"
]
}
],
"source": [
"x = np.array([1, 2, 3, 4])\n",
"print(x * 2)"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 3 5 5]\n"
]
}
],
"source": [
"y = np.array([0, 1, 2, 1])\n",
"print(x + y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Транслирование "
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[1]\n",
" [2]\n",
" [3]\n",
" [4]]\n"
]
}
],
"source": [
"x = np.array([1, 2, 3, 4])\n",
"x = np.reshape(x, (len(x), 1))\n",
"print(x)"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0 1 2 1]]\n"
]
}
],
"source": [
"y = np.array([0, 1, 2, 1])\n",
"y = np.reshape(y, (1, len(y)))\n",
"print(y)"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 1)\n",
"(1, 4)\n"
]
}
],
"source": [
"print(x.shape)\n",
"print(y.shape)"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0 1 2 1]\n",
" [0 2 4 2]\n",
" [0 3 6 3]\n",
" [0 4 8 4]]\n"
]
}
],
"source": [
"outer = x * y\n",
"print(outer)"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 4)\n"
]
}
],
"source": [
"print(outer.shape)"
]
},
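{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reshaped arrays above have shapes (4, 1) and (1, 4), and broadcasting stretches each length-1 axis to produce a (4, 4) result. As a hedged cross-check, the sketch below builds the same product with `np.newaxis` and compares it with `np.outer`. The names `outer_broadcast` and `outer_reference` are illustrative only."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"x = np.array([1, 2, 3, 4])\n",
"y = np.array([0, 1, 2, 1])\n",
"\n",
"# x[:, np.newaxis] has shape (4, 1); y[np.newaxis, :] has shape (1, 4).\n",
"# Broadcasting stretches each length-1 axis to match the other operand.\n",
"outer_broadcast = x[:, np.newaxis] * y[np.newaxis, :]\n",
"\n",
"# np.outer computes the outer product of two 1D arrays directly.\n",
"outer_reference = np.outer(x, y)\n",
"\n",
"print(outer_broadcast.shape)\n",
"print(np.array_equal(outer_broadcast, outer_reference))"
]
},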
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Исследование набора данных экспрессии генов\n",
"### Чтение данных при помощи библиотеки pandas"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 00624286-41dd-476f-a63b-d2a5f484bb45 TCGA-FS-A1Z0 TCGA-D9-A3Z1 \\\n",
"A1BG 1272.36 452.96 288.06 \n",
"A1CF 0.00 0.00 0.00 \n",
"A2BP1 0.00 0.00 0.00 \n",
"A2LD1 164.38 552.43 201.83 \n",
"A2ML1 27.00 0.00 0.00 \n",
"\n",
" 02c76d24-f1d2-4029-95b4-8be3bda8fdbe TCGA-EB-A51B \n",
"A1BG 400.11 420.46 \n",
"A1CF 1.00 0.00 \n",
"A2BP1 0.00 1.00 \n",
"A2LD1 165.12 95.75 \n",
"A2ML1 0.00 8.00 \n"
]
}
],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Import TCGA melanoma data\n",
"filename = 'data/counts.txt'\n",
"with open(filename, 'rt') as f:\n",
"    data_table = pd.read_csv(f, index_col=0)  # pandas parses the data\n",
"\n",
"print(data_table.iloc[:5, :5])"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {},
"outputs": [],
"source": [
"# Имена образцов\n",
"samples = list(data_table.columns)"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" GeneID GeneLength\n",
"GeneSymbol \n",
"CPA1 1357 1724\n",
"GUCY2D 3000 3623\n",
"UBC 7316 2687\n",
"C11orf95 65998 5581\n",
"ANKMY2 57037 2611\n"
]
}
],
"source": [
"# Import gene lengths\n",
"filename = 'data/genes.csv'\n",
"with open(filename, 'rt') as f:\n",
"    # Parse the file with pandas, indexing by GeneSymbol\n",
"    gene_info = pd.read_csv(f, index_col=0)\n",
"\n",
"print(gene_info.iloc[:5, :])"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Genes in data_table:  20500\n",
"Genes in gene_info:  20503\n"
]
}
],
"source": [
"print(\"Genes in data_table: \", data_table.shape[0])\n",
"print(\"Genes in gene_info: \", gene_info.shape[0])"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [],
"source": [
"# Взять подмножество генной информации, которая \n",
"# совпадает с количественными данными\n",
"matched_index = pd.Index.intersection(data_table.index, gene_info.index)"
]
},
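{
"cell_type": "markdown",
"metadata": {},
"source": [
"To illustrate why the intersection matters, here is a small sketch on two made-up tables. Selecting both tables with the same matched index keeps only the shared genes and puts their rows in the same order, which is what the element-wise normalization relies on. The tables `toy_counts` and `toy_info`, the symbol `EXTRA1`, and all numbers are hypothetical."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical miniature tables; every value here is made up.\n",
"toy_counts = pd.DataFrame({'sample_a': [10, 0, 5]},\n",
"                          index=['A1BG', 'A1CF', 'A2ML1'])\n",
"toy_info = pd.DataFrame({'GeneLength': [1000, 2000, 3000, 4000]},\n",
"                        index=['A2ML1', 'A1BG', 'EXTRA1', 'UBC'])\n",
"\n",
"# Keep only the gene symbols present in both indexes.\n",
"matched = toy_counts.index.intersection(toy_info.index)\n",
"\n",
"# Selecting both tables with the same index aligns their rows.\n",
"counts_matched = toy_counts.loc[matched]\n",
"lengths_matched = toy_info.loc[matched, 'GeneLength']\n",
"\n",
"print(sorted(matched))\n",
"print(list(counts_matched.index) == list(lengths_matched.index))"
]
},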
{
"cell_type": "code",
"execution_count": 78,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"20500 genes measured in 375 individuals.\n"
]
}
],
"source": [
"# 2D ndarray containing expression counts\n",
"# for each gene in each individual\n",
"counts = np.asarray(data_table.loc[matched_index], dtype=int)\n",
"gene_names = np.array(matched_index)\n",
"\n",
"# Check how many genes and individuals were measured\n",
"print(f'{counts.shape[0]} genes measured in {counts.shape[1]} individuals.')"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [],
"source": [
"# Одномерный массив ndarray, содержащий длины каждого гена\n",
"gene_lengths = np.asarray(gene_info.loc[matched_index]['GeneLength'],\n",
" dtype=int)"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(20500, 375)\n",
"(20500,)\n"
]
}
],
"source": [
"print(counts.shape)\n",
"print(gene_lengths.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Нормализация\n",
"### Нормализация между образцами"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [],
"source": [
"# Заставить все графики в блокноте Jupyter\n",
"# в дальнейшем появляться локально \n",
"%matplotlib inline\n",
"# Применить к графикам собственный стилевой файл \n",
"import matplotlib.pyplot as plt\n",
"#plt.style.use('style/elegant.mplstyle')\n",
"\n",
"# переопределение стиля\n",
"from matplotlib import rcParams\n",
"rcParams['font.family'] = 'sans-serif'\n",
"rcParams['font.sans-serif'] = ['Ubuntu Condensed']\n",
"rcParams['figure.figsize'] = (4.8, 3)\n",
"rcParams['legend.fontsize'] = 10\n",
"rcParams['xtick.labelsize'] = 9\n",
"rcParams['ytick.labelsize'] = 9"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADQCAYAAABV2umIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xl8VPW5+PHPk8lkDwlkgQQIAcIiAWWJICAUF4qlreC+tcq1brXeLt7r/dlbrba2vff+em21vT9v1bpUrdftp+C+gLiwiARBtrCENSQEsgAhCdlmvvePM5EYJpBhcnJmed6v17wyc2bOOc9hwpPv+Z7v+T5ijEEppdTpi3E6AKWUCneaSJVSKkiaSJVSKkiaSJVSKkiaSJVSKkiaSJVSKkiaSJVSKkghn0hF5Nci8lA3PnetiKwWked7Iy6llGoX0olURAqAbN/zPBH5VEQ+FpE0Px+/DLgAKBARd2/GqZSKbhLqdzaJyCxgPnAYKAZGAvuMMS91+tx3gTlAkzHmn3s7TqVU9ArpFmknOcB9wLVAoog8JCIf+R7nA6OArUCuiLicDFQpFV3CrUW6yhjzThefWwNMA14FfmyM2dFrQSqlolo4tUifBO4VkQ9EJMfP+3/HOvVvAXb1amRKqagW8i1SpZQKdeHUIlVKqZCkiVQppYIU63QAJ5OZmWny8/OdDkMpFaXWrFlTbYzJOtXnQjqR5ufnU1xc7HQYSqkoJSJ7uvM5PbVXSqkgaSJVSqkgaSJVSqkgaSJVUavV46W6vpnmNo/ToagwF9IXm5Syw8odNTz+6U6Wl1bT3ObFFSNMzu/Hrd8YxqxR2U6Hp8KQJlIVNeqaWrl/0SZeXVtOVmo8107JIz8jmcq6Jt74soIFT63mqqLBPDB/LHGxerKmuk8TqYoKlUeauOHJzymtqufHF4zg9lnDSXAfnyTspxeO4E9LtvP/lu7g4NEmHru+CLdLk6nqHv1NURGv/PAxLvvvFZQfPsYzN07mztkjv5ZEAeJjXdw1ZzS/mT+WpVur+OWijQ5Fq8KRtkhVRDvc2MINT35OXVMr/3PzOYwb5K+4wnHfO2cIFYeP8chHOzhnWAbzxg/spUhVONMWqYpYLW1ebvpbMXtrGnn8+qJTJtF2d84eycS8dO5ZuJHKI002R6kigSZSFbH+7Z0Sivcc4sErz+KcYRndXi/WFcMfrhxPc5uX/3h3i40RqkihiVRFpHc37uep5btZMC2f756VG/D6+ZnJ3DxjKK+tLWfNnkM2RKgiiSZSFXH21DRw1yvrOWtwOv8694zT3s7tswrISo3n9+9pq1SdnG2JVESu8JVOfrrDsjEiskZE3hYRsWvfKno1tXr40fNfIMB/XTMhqPGgyfGx/PAbw/lsZy2f76rtuSBVxLEtkRpjXgZmAZM71Jm/GvglcAAYb9e+VfT67VslbCyv48ErxzO4X1LQ27tmch6ZKfH8acn2HohORSo7W6QuYAtQbIxp9S3OBSqB/YDfcSUicouIFItIcVVVlV3hqQj0xpcVPPvZHm6ZOYzZY/r3yDYT41zcPGMoy0qr2VxR1yPbVJHHzhapBxgDpIlIvL+PdLHeY8aYImNMUVbWKSemVgqAnVX1/PzVDUwa0pe75ozq0W1ffXYeiW4Xf1uxu0e3qyKHrRebfMk0FhjkW1QBDAByfM+VClpTq4fb//4Fbpfw52sm9PitnWlJbuZPGMjCdeUcamjp0W2ryGDnqf1PReRT4BgwV0QKgBeAXwH9gS/t2reKLr9ctJEtlUf541XjyU1PtGUfN0wbQnOblxeLy2zZvgpvtt0iaox5CHjIz1tFdu1TRZ+Xi8t4qXgfd5xXYOsUeKMH9OHs/L68tLqMW2cOQwedqI50HKkKW8W7a/nFaxuZOiyDn80eafv+rpg0mJ3VDawtO2z7vlR40USqwtLemkZueXYNA/sm8sh1E3HF2N9C/Na4AS
S4Y3hlzT7b96XCiyZSFXYO1jWx4KnP8XgNT9xQRN/kuF7Zb2qCm2+NzeHNLytoatXyJOo4TaQqrFQdbeaaxz+jsq6JJ24oYlhWSq/u/7KJg6hramNxyYFe3a8KbZpIVdjYUVXP5X9ZQcXhJp5acDZF+f16PYapwzPITo3nzS/39/q+VejSRKrCwodbDnDpIyuob2rjuZumMCWAafF6kitG+NbYASzdepCG5jZHYlChRxOpCmmHGlr4+asbuPHpYnLSElj4o+lMGtLX0Zjmjsuhuc3Lki0HHY1DhQ4tNaJCUl1TK098uosnlu2ioaWNW2cO485vjiQ+1nXqlW1WlN+PrNR43l6/n4tPY65TFXk0kaqQ0tDcxtMrdvPYJzs5cqyViwoH8LPZIxk1INXp0L7Sfnr/4uoyGprbSI7X/0bRTn8DVEhobvPw/Kq9/NeHpdQ0tHD+6GzunD2SsQO7V2ept80dl8MzK/fw4ZaDpzUDv4osmkiV45ZuOcgvX99IWe0xpg3P4J/njGJinrP9oKdydn4/MlPieH/zAU2kShOpck5Tq4f7Fm3ixeIyRvZP4ZkbJzNjRGZY3MfuihHOG5XNu5sqafV4e3zGKRVe9NtXjqhtaOG6v67ixeIyfjhrOG/847nMHJkVFkm03QVn9OdoUxurd2sZkminLVLV644ca+W6v65iZ1U9j1w3kbnjcpwO6bTMGJFJnCuGJSUHmTY80+lwlIO0Rap6VUubl5v+tprSg0d5/PqisE2iYBXHmzo8gyUlBzDGb8EHFSXsnNj5KhH5VEQ+EpG4DssrfcsG2LVvFbp+93YJq3cf4sErxzNzZPiXkrlwTH921zSyo6rB6VCUg+xskb5kjJkBVAN58FVBvHeMMbOMMZX+VtLid5Hr3Y2VPL1iNzdOHxoxA9kvGG1NJr1EJzGJanYWvzMikgikArt8i/sChSJyyUnW0+J3EejIsVbuXbSRwtw+3P2t0U6H02Ny0xMZk9OHJSV6u2g0s7uP9A/APb4ieBhjqoHpwI9EZIjN+1Yh5D/e3UJNfTP/cdmZxMVGVtf8+aOzWbP3EHVNraf+sIpIdvaRTsRqmK7uuNxX474GSLdr3yq0bKo4wvOr9rJg2tCQvVMpGDNHZuHxGlaUVjsdinKInU2D84DzfReW7hGRqSJyuYgsB1qA9TbuW4WQ37+3lbRENz+5cITTodhiQl46KfGxfLxNE2m0srOK6IPAg37eesWufarQ89nOGj7aWsXPvzWatES30+HYwu2KYdrwDD7ZVoUxJqxuKlA9I7I6q1TI+cP72+jfJ54bpuU7HYqtZo7MovzwMXZW6zCoaKSJVNnmi72H+Hx3Lbd9YzgJbufnEbXTN3xjYj/ZpkP2opEmUmWbxz7eSVqimyuLBjsdiu0G90tiaGayJtIopYlU2WJnVT3vba7k++cMiZqJj2eOyGTlzhot1RyFNJEqWzy5fBduV0zE9412NHNkFk2tXop3H3I6FNXLNJGqHlff3MZrX5Rz8Vm5ZKXGOx1OrzlnWAZul/DJdj29jzaaSFWPW7SunIYWD9dOyXM6lF6VHB9L0ZB+2k8ahTSRqh5ljOH5VXsZPSCVCYOj7+a1c0dksqXyKNX1zU6HonqRJlLVo9bvO8Kmijqum5IXlQPTpxdYEzyv2FHjcCSqN2kiVT3q+VV7SXS7mDdhoNOhOGLcwDRSE2L1vvsoo4lU9ZiG5jbeWF/Bd8/KoU9CZN4OeiquGOGcYRks36GJNJpoIlU95r1NlTS2eLgiCgbgn8z04RmU1R6jrLbR6VBUL9FEqnrMa2vLGdQ3kUkhXpPebu39pMv19D5qaCJVPeJgXRPLS6u5ZMJAYmKi7yJTRwXZKWSnxrNcLzhFjV4tfici/UVkpW95ql37Vr3v9S8r8BqYNz46LzJ1JCJML8hkRWk1Xq9WF40GASdSEblCRHaJyF4Ruf0kHz2h+B3wbeA5YDFwYcDRqpD12tpyzhyURkF2itOhhI
RpwzOoaWhh64GjToeiesHptEjvA4qAQuCfuvpQF8XvcoFKYD/gt+miVUTDz7YDR9lUUcclUTrkyR/tJ40u3U6kIvI/IvI8MAj4M/AYkHSK1b5W/K4Tv+c8WkU0/Ly2thxXjPCdMyOjxHJPyE1PZGhmsg7MjxKBzG/2F9/PRzss81dKBOiy+F0FMADIRGs2RQSv17BobTkzRmRG1QQl3TFteAYL15bT6vHidul13UgWyLd7lzHm406P4pN8/oTid8BbwPew+kcXBxG3ChGf766l4kiTntb7cW5BJg0tHtbvO+x0KMpmgbRIp4nI7zovNMb8q78Pn6T43dQA9qlC3MK15STHufjmmAFOhxJypg7PQASWba9h0pB+ToejbBRIi7QF2AJs7fRQUaqp1cNbG/YzZ+wAEuMiuybT6UhPiqMwt4/eLhoFAmmRvmqMeca2SFTY+XDLQY42tXHphEFOhxKypg/P5Mnlu2hsaSMpLjpKrkSjQFqk4zovEJHlPRiLCjOvrS0nOzWeqcMznA4lZE0ryKTVY1it5UciWiCJNF1E5ohIHxFJFZGLAO34iVKHGlr4aOtB5o3PxRXlt4SezOT8fsS5YnRavQgXyLnGlcC9wH8CgtVfeo0dQanQ99aG/bR6DPP1av1JJca5mDgknWWaSCNatxOpMaZERK7DujsJYL8xxmtPWCrULVpXzojsFMbk9HE6lJB3bkEm//n+NmobWuiXHOd0OMoGgdzZtADYBrwEvAxsFZEbbYpLhbCy2kZW7z7E/AkDo7KcSKCOlx/RVmmkCqSP9F7gLGPMdGPMNOAs4Bf2hKVC2aJ15QDMG6+3hHZHe/kRve8+cgXSR/oF8KiIfI51n/zZwJe2RKVCljGG19aWMzm/H4P6nmqqBQUQ64ph6rAM7SeNYIG0SK8GngfigQTgRawLUCqKbKqoY0dVg15kCtC5IzIpqz3G3hotPxKJAkmkFwAerMlG1gNtwPl2BKVC12try3G7hLnj9JbQQLT3k2qrNDIFcmrfPtRpPvAa1hAoA7zf00Gp0OTxGl7/soLzRmWTnqRXnwMxLDOZnLQElpdWc+2UvFOvoMJKIMOf/gFARKYYY/RqfRRasaOaqqPNOtPTaWgvP7K45ABer4n6ulaRJpDhTytFZAWQLyIr2h82xqZCzCtr9tEnIZbzRmc7HUpYOrcgk8ONrWzeX+d0KKqHBXJqf7VtUaiQd6SxlXc2VnJV0WAS3DrT0+mYVmDNSbCstJqxA9Mcjkb1pEAuNi00xuzp/DjVSiLyaxF5qNOySt+Ez3rFIky8vr6CljYvV5092OlQwlZ2agIj+6foeNIIFEiLdLivZtPXGGOu7WoFESkAsoGmDstcwDvtfa4qPLy0uowzcvpQmKu3hAZjekEmz6/aS1OrR1v2ESSQFmkdVr2mzo8uGWNKgRc6Le4LFIrIJf7W0SqioWdzRR0byo9wZdEgvSU0SOcWZNLc5uWLvTqtXiQJpEX6iDHm42B3aIypFpHpwDsi8kXn7gFjzGNYFUopKiryW2lU9a6X15QR54ph/ni9Wh+sKcMycMUIy0urmTY80+lwVA8JpEX6rIj8UUTe9z3+KCKnNSDOGNMK1ADpp7O+6j3NbR4Wri1n9pj+9NWZi4KWEh/LhMHpLCvVMs2RJKCLTUAxcIfvsRpY1N2VReRqEZkqIpf7ZtZvQUsyh7x3N1ZyqLGVK4q0nEhPmV6QyYZ9hznS2Op0KKqHBHJqnwBUAvux7miqxLrv/qSMMR8BH3Va/EoA+1UOenblHoZkJDFzRJbToUSMc0dk8vCS7azcWcNFY3XgSiQIpEV6DXALsBL4DLgVnSE/om2uqKN4zyG+N2WI3onTg8YPTic5zsUn2/ViaqQI5BbR9cBVNsaiQsyzn+0mPjZGT+t7mNsVw/SCTD7eWoUxRkdCRIBAbhGt6PTYLyIVdgannHPkWCsL11Ywb3yuTlBig/NGZ1N++BjbD9Y7HYrqAYH0kTYCo4
wxHruCUaHjlTX7ONbq4fqp+U6HEpFmjbL6nJduOcjI/qkOR6OCFUgfaSywU0T2isgWEfmLiOTbE5ZyksdreGblbibmpes94TbJSUtk9IBUlm496HQoqgd0O5EaY/KNMUOMMXnAFGAN8K5tkSnHvLepkj01jdw8Y5jToUS0WaOyKd59iKNNOgwq3J0ykYrIzs4PYC3wr0Cy7RGqXmWM4dGPd5CfkcQ3C3Vojp3OG5VFm9foJCYRoDt9pF6g0O5AVGhYtauWL/cd4Tfzx+LSIU+2mjikL6kJsSzdUsVFY3OcDkcFoVuJ1BjTbHskKiQ8+vEOMpLjuHySDnmym9sVw8wRWSzdelCHQYW57iTSwV3MhC+A8dW4VxFga+VRlm6t4s7ZI3WKt17yjVFZvLVhP5v311GYqxf2wlV3Eulo26NQIeFPS7aTHOfi++cMcTqUqNE+DOrDkoOaSMPYKRNpd2bBV+FvS2Udb23Yzx3nFegsT70oOzWBCXnpvLe5kn+8YITT4ajTFMg4UhXBHl68ndT4WG6aMdTpUKLORYUD2FheR1lto9OhqNOkiVSxuaKOdzZW8g/T8/V2UAfM8Q0ze3/zAYcjUadLE6ni4SXbSE2I5Qfn6gB8J+RnJjOqfyrvbap0OhR1mmxPpJ2riIpIfxFZKSKfiojeZOywdWWHeW/TAW6cPpS0JLfT4UStOWMHULy7lup6HWkYjmxNpB2qiHb0beA5YDFwoZ91tPhdLzHG8Lu3SshMiePmmdoaddKcwv54DSzW0/uwZGsi7aKKaC7HZ9o/oZqaMeYxY0yRMaYoK0tnZbfTB5sP8PnuWn5y4UhS4gOZCEz1tDE5fRjUN5F39fQ+LDndR6pVQh3S6vHy7+9uYVhWMlefPdjpcKKeiDB3XA7LtldzqKHF6XBUgJxIpBXAACDH91w54IXVZeysauDui0bjdjn991QBzBufS5vX8NaG/U6HogLUa/+D2quIAm8B38PqH13cW/tXxx1tauXhxduYnN+P2WP6Ox2O8hmT04cR2SksWlfudCgqQLZ3jHVRRXSq3ftVXXt48XZqGlp44oYzdKKMECIizJ8wkN+/t5V9hxoZ1DfJ6ZBUN+k5XZTZduAoT63YzVVFgzlrcLrT4ahOLj4rF4BF67TXK5xoIo0ixhjuW7SJ5DgXd80Z5XQ4yo/B/ZKYNKQvi9aVY4xeiw0XmkijyFsb9rNyZw13zRlFRkq80+GoLlw2cRDbDtSzruyw06GobtJEGiUamtv4zZsljMnpw7VTdJq8UHbx+FyS41w8v2qv06GobtJEGiX+/GEplXVNPDC/UEuIhLiU+FguHj+QN9ZXcOSYFsYLB5pIo8Dmijr++ulOLp80iElD+jkdjuqGayfn0dTqZeFaHQoVDjSRRjiP13D3q+tJS3Tzi7lnOB2O6qZxg9IYNzCNv6/aoxedwoAm0gj31PJdrN93hPsuLtSZ78PM9VOHsO1APR9v08l7Qp0m0ghWVtvIg+9v4/zR2Xz3TC33G27mjR9I/z7xPPbJTqdDUaegiTRCGWP4xcKNxAg8MH+s3sEUhuJiY/jBuUNZsaOGDfuOOB2OOglNpBHqxdVlfLKtirvmjGJgeqLT4ajTdM3kPFLjY/nvj0udDkWdhCbSCLS3ppEH3tzMtOEZXD813+lwVBBSE9wsmJ7P2xsq2ViurdJQpYk0wni8hjtfWkdMjPD7K84iRseMhr2bZw4jPcnN79/b6nQoqguaSCPMY5/spHjPIX49r1BP6SNEnwQ3P5pVwMfbqli5o8bpcJQftiXSrorciUiliHwkIgPs2ne0Kt5dy4Pvb+Xb43KYP/6EKi4qjH1/6hBy0xL41RubaPV4nQ5HdWJni/SEInci4gLeMcbMMsZocZoeVF3fzB3Pr2Vg30R+d+k4vUofYRLcLu67uJAtlUd5avkup8NRndiZSP0VuesLFIrIJV2tpFVEA9fm8fLTF9ZR29jCI9dNJC1RyypHojmFA7jwjP
788YPtlNU2Oh2O6qC3+kgNgDGmGpgO/EhE/E5BpFVEA2OM4VdvbGZZaTUPzCukMDfN6ZCUjX41r5DYGOGnL66jTU/xQ4adidRvkTtjTCtQA+j07D3gr5/u4tnP9nDLzGFcdXae0+Eomw1MT+Q3l4xlzZ5D/GnJdqfDUT52JtKORe5SRGSqiFwuIsuBFmC9jfuOCv/z+V5++3YJc8cN4O6LRjsdjuol88YP5LKJg/jz0lLe36SXGkKBbcXvjDEH8F/k7hW79hlNnv1sD/cu3MisUVn84crxOl40yvxm/lhKq+r5yQvreOnWqYwbpF06TtJxpGGmzePlgTc3c+/CjVwwOptHvz+JBLfL6bBUL0uMc/H49ZPolxzHgqc+Z0tlndMhRTVNpGGk9GA9Vz32GU8s28WCafk8+v1JxMdqEo1W2akJPPuDycS6hGsfX8XmCk2mTtFEGgbKahu5d+FG5j78KaUH63noqvHcf3EhsS79+qLdsKwUXrxlKvGxMVzxlxUsKTngdEhRybY+UtU1YwwNLR4amttoaG6juc1rPVo9Xz1vaG6jpLKOVTtrWVd2GLdLuHzSIH42eyTZqQlOH4IKIfmZybx6+zRufqaYm54p5p+/OYrbvjFca3P1IgnlMgZFRUWmuLjY6TCCVtvQwifbqlhWWs3mijp2VTdwrNVzyvXcLmFMbhoXFQ5g3vhccvXeeXUSx1o8/Mv/X88bX1YwOb8fD155FoP7JTkdVlgTkTXGmKJTfk4TqX1W7Kjm2ZV7WFxygFaPIT3JzVmD0hmWlUxOWgIp8W6S4lwkuGOIj3URHxtDvDuGOJe1LC8jSftAVUCMMbz6RTn3vb4JrzH8+IIR3Dh9KHGx2g10OjSROqj04FF++1YJS7dW0S85jksnDOTi8bkU5qbp6ZbqFWW1jfzqjc0sLjnAsMxk/uWi0cwp7K9zMARIE6kDDjW08NDibTy3ai9Jbhf/eEEBN0zL11alcszSrQd54M3N7Kxq4IycPvzkggJmjxmgf9C7qbuJVC829YCWNi/PrNzNn5Zsp765jWun5PGzC0eSkRLvdGgqyp03KpsZBZm8/mUFf/6wlNue+4LB/RK5bsoQriwaTD+tLNsjtEUaBGMMi0sO8ru3S9hV3cCMEZnc8+0xjBqQeuqVleplbR4v726q5NmVe1i1qxa3Szi3IJNvjcvhwjP6a1L1Q1ukNttYfoTfvV3Cih01DM9K5qkFZzNrVJb2QamQFeuK4Ttn5vKdM3PZduAoLxeX8faGSpZutaa9GD0glbPz+zFuUBrDs1IYnpVMWqJbf6e7QVukAbJqxW9l4boK0pPc/OzCkVw7JQ+3Do5XYcgYw4byI3yyrYpVu2pZs+cQjS3Hh+bFx8aQkRxHv5Q4kuNiSXBbI0oS3C4SYq3nfRLdpPkefZPiyE1PJC8jiZT48G+naYu0h20sP8Ljn+7kzfX7iY0Rbp81nNtmDadPgk6irMKXiHDmoHTOHJTOHVjFE8tqG9lRVc/Oqgaq65uprm+htqGZhhYPhxtbaGr10tTmoanVw7EWD/XNbXj9tMcykuMYlpVMYW4aYwemUZjbh4LslIhsdGiLtAter2H7wXqWbDnA6+sq2FJ5lJT4WK4+ezA3zRjGgDS9u0gpsP6v1Le0caSxldqGFvYdOsae2gb21jSy/WA9JfvrvmrlxsXGcMaAVAoHpjFuYBpjc9MYOSAlZEe2ROXwp7c37OdQYwvumBhiXYIrRnC7Yoht/+lnGcDRpjbqmlqpqW9hV7X1l3ht2WFqG1oAmJiXzrzxA7lk4kBtgSoVII/XsKu6gU0VR9hUUcfG8iNsLD9CXVMbYN3BN7J/KoW5fRiSkcygvokMTE8kOzWBlIRYUhNiT2jFGmNo8Xg51uKhocXDsZY2Gls8NDR7aG7z0OoxtHq8tLR5afF4afV4mTkiK+A7vRw/tReR/sBCoA2Ya4w56m9ZT+7zkY9K2Vge3Aw4cbExDM1I5rxR2UwZ1o
9pwzMY1Fdvs1PqdLlihILsFAqyU5jnq25rjKGs9hgbyo+wscJKrEtKDlLja7x0FueKgfZrXgY8xuDx159wEn/53kTbbpm1rUUqIjcCiUAmsN4Y85q/ZSfbRqAt0sONLTS3WX992jyGNq+hzWs9b/V48XgNrZ6vLzNAakIsfRLc9E2OI6dPgk6SrJRDGlvaKD90jH2Hj1FT38LRplbqm9qob7Far+LLpq4YSIqLJdHtIinORVJ8LEm+5/Fu63ZrtysGt8s684yLjSEt0R3w3L2Ot0ixqoiWAK0cryLqb9nXiMgtwC0AeXmB1SBKT9JxcEqFs6S4WEb0T2VE//Aai92rVUS7sUyriCqlwk5vVxH1W1lUKaXCmZ2n9m9x/MLSThGZ2mnZH2zct1JK9Ronqoj6W6aUUmEr8m4xUEqpXhbSA/JFpArY43QcPSATqHY6CBtE6nFB5B6bHldghhhjTnnVO6QTaaQQkeLujEULN5F6XBC5x6bHZQ89tVdKqSBpIlVKqSBpIu0djzkdgE0i9bggco9Nj8sG2keqlFJB0hapUkoFSROpUkoFSROpUkoFSROpDUSkv4isFJFPRSTVt+wq3+uPRCQs5/vzd1y+5UNEpNnJ2IJxkuP6toi8LCJ9nYzvdHXxe1joW/aZiPRzOsZgiMivReShDq/9fo+9QROpPb4NPAcsBi70LXvJGDMD6+6LwCZaDR3+jgvgp8B2RyLqGSccl4ikALcaY64wxhxyMrgg+Pu+zgf+DVgGTHQorqCJSAGQ3WlxV7+fttNEao9coBLYj28Ca2OMEZFEIBXY5WBswTjhuDq0asL5tsMTjgs4FxgjIh+KyADHIguOv+N6A7gUazrLZQ7FFTRjTCnwQqfF/o63V2gitV/H8WV/AO4xxni6+nAYaT+ua4FnnQykh7UfVz/gv4CXgKucC6fHtB/XQKxkE4d1f3qk6tVxnZpI7XHCBNYiMhGrYbraycCC5G9i7mHAv2C13m5xKrAg+TuuKiAZaOJ42bVw4++4rgQ+AFYDcxyKyy6OTRyvA/Jt0Kla6mNAKTANuBmrNXCPMSbsTqv8HZcxZqXvvY+MMbMcDO+0dfF9rQUWYRVrvN4Ys9uxAE9TF8cVB/wZ6w/EfGNHPbEvAAAF5UlEQVRM2FaqEJFZwHzgM6xZ4nZiY5Xik8aiiVQppYKjp/ZKKRUkTaRKKRUkTaRKKRUkTaRKKRUkTaRKKRUkTaRKqajT+T79DsvPE5HPReQDEXF3d3uaSCOciNwuIttFZJuIjHU6HqWc1vE+fRHJ801y8rGIpAHfAW4CjgBDur1NHUcauUQkH3gbmIx1y1ybMSZsZ2lSqqd0GMx/GCgGRgL7sAb13wLEGmNu7O72tEUa2S4EXjfG1BtjGoBcEVkGICL3isg1IvIPIvKhb9kCEVnqe36eiOzwPZ8lIrs6fObf23cgIvki8lnHnfqmChwtIs+JyDdFZJCIFItIiYhM6LDNoyJyoH17IvKR7+fffe/fLyK3+bbV/t5cX+v6AxFxi0iKiLwvIntE5BcisltEmkSkVET6+n6W+locyR1ivF9EbvM9r/T9vM63nddFRDp8tj2WOBHZ2dX6vufHfNtoj7f9580icr/v+ddi8m1rlu84F3f4d60TkQoRedr3Pf3G994SEfl5p9c5IvI3Efmlb9nTInJfwL8x0ScHuA9rzohEYCjWXVJpEsBUfJpII1t/OtxzbIzZhTVDDlhTqBUDacAoEUkH5nJ8soc0IFVECoF5QKAt2UlApjHmfeDHwKNYf+nb/3O7gNeB/9NxJREZjTU7EfifeOJ+oAg4CMwArgPWAUONMb8FZgHrjDEFvunvUowxBb795Z8sYGPM340xQ4B44Cw/H1nASWYV8iXfVXS6h93X13ZXh0VdxXQ3ENu+GvAq0N4q+gJo75rpj3Wm8dVrY8x+rO/sHBFxYf0bheU8qr1sP/BLY0yRMeZvwPewZpWqxDqT6xZNpJHNy/H/mO22ic
hQIM8Ysx3ogzV/4xzAzfHk1Qd4B7gIKMA67Wn3AxFZ7zs96sr/Bf7J97wQ+Nz3aP/Pn4zVD9XZPwEv+p5XcWI/1UjfdmYAGcA4YIUxxttFHJkish8rOW7u9N79IrKl/YWvtb0FOAdI7/RZF9ap4Kqu1seaItHfH5zrgI9PEdNwrPvfu9rWRmCEiAwDSvy8Bus7O4D1nX3m24Y6uSeBe31nODnA81j36xdi/Rt2iybSyFbOiYnoU+ByYLfvdQrwLtYMTiuApA7LlwEXY53qxIpIvO+9J7CSw1en+H6sBWZ3eN25dTnMF19H2ViTarTH9jJwAdbkIe2OGGNGG2PyjDEvY/0On6yjvxoYBGzFupDQ0f3GmNEdXv8amIr1B6Sz72K1oDsm7M7rD8Vq4XR2Ocf/OHQV0x3Aw11tyzf1YgVwGfBJ59e+j6UA7wEP+H4mofwyxnxkjPmpMWavMWaaMWa2MWa/MeZFY8w4Y8wsX3dYt2gijWxLgO+KSL/2/kRgKfAz33tg9Qu1t44+BBI6LG8FjvrWocN7AMewWrBduR/4oYhkYLW6zsY63dzkO9W9pMN2252B1QUAgDGmyhgzGatroV21iIyX4+VatmDNrNV+an0CX9Jp9R3jyQhdd2EMAZ45xfrn478VswJrRqKTxVRljCnp8BF/2+r83fn7LhdjJeE1fP37UjbSRBrBjDHlwD1Y/yE3AuOxTk3jON7qSvT9nIfVikzutPw2rP44OH46vgCrxfNvJ9l9PdbUbXcAfwJ+hHUa9WvgF8BuY8yKTuts8rOss7ux+g7X+S4ePQNMEZHtWF0QJ8QhIvt87/lraXb0e6ykPx2o6/Te340xjV2tKCJTsI7tHqx/mykicilWAn2yGzE91mFblwK3Av/tW/dS3/bfBlqNMe3dAZ1fJxpjqoEpWC3nry6uKXvp8Kco42shvmyMOd/pWCKJr7/4NmPM1b7X/w5sMcY8fRrbWgCMNsbc7Xv9AvAXrD+As40xd/mWf7Pja+UcTaRRRERmAE8BPzTGfOB0PJHE9wcqzxiz1vd6NFBvjNl38jX9bmsQ1pX9Lb7XE7BO9W8ALjLGVIjInVhnBheF8+TMkUITqVJKBUn7SJVSKkiaSJVSKkiaSJVSKkiaSJVSKkiaSJVSKkj/Czq+5RYT24DQAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x15558177470>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Count statistics:\n",
"  min:  6231205\n",
"  mean: 52995255.33866667\n",
"  max:  103219262\n"
]
}
],
"source": [
"total_counts = np.sum(counts, axis=0)  # sum the columns\n",
"                                       # (axis=1 would sum the rows)\n",
"\n",
"from scipy import stats\n",
"\n",
"# Use Gaussian smoothing to estimate the density\n",
"density = stats.gaussian_kde(total_counts)\n",
"\n",
"# Make values for which to estimate the density, for plotting\n",
"x = np.arange(min(total_counts), max(total_counts), 10000)\n",
"\n",
"# Make the density plot\n",
"fig, ax = plt.subplots()\n",
"ax.plot(x, density(x))\n",
"ax.set_xlabel(\"Total counts per individual\")\n",
"ax.set_ylabel(\"Density\")\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_06.png', dpi=600)\n",
"plt.show()\n",
"\n",
"print(f'Count statistics:\\n  min:  {np.min(total_counts)}'\n",
"      f'\\n  mean: {np.mean(total_counts)}'\n",
"      f'\\n  max:  {np.max(total_counts)}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Нормализация размера библиотеки между образцами"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAAClCAYAAAAONXX6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJztnX2cXFWZ579PdXeqOy+d0HSbDjRJhcVIUs0MLy0CiZIAG9yRZbMMDIbo7GgkQz52dnxjN5li3AGJo+y0OgZnBjCEibMdVxTX3TXjZrQ7SASEBGQ2EhBEQI3wATSC0bxsePaPe29x63Z11am6t7qqOs/38zmfrnP7nnOfe++5v/ucl3uOqCqGYRhG9aTqbYBhGEazY0JqGIYRExNSwzCMmJiQGoZhxMSE1DAMIyYmpIZhGDExITUMw4iJCalhGEZMTEgNwzBi0lpvA2pNd3e3ZjKZepthGEYTsmfPnpdVtafcfpNeSDOZDLt37663GYZhNCEi8pzLfla1NwzD8Nm2bRv9/f20tLTQ398P0OWSzoTUMAwDT0RzuRybNm3i0KFDbNq0CeBkl7TiMvuTiGSBdwD3ANcDU4DbVXVv9WZPDAMDA2pVe8MwytHf38+mTZtYtmxZfpuI/EhV31IuratH+hU8Zf4R8CSwB/h6FbYahmE0JPv27WPJkiXRzb9xSesqpO3AVuDXwE7gQTyv1DAMY1KwcOFCdu3aFd083SWtq5B+D7gN+DFwO/D3wKirgYZhGI1OLpdj9erVjI6OcvToUUZHRwEyLmmdhj+p6h/HsM8wDKPhWblyJQDr1q1j3759LFy4EODnLmldO5suBT4JnOhv+iXw56r6rWoMnkiss8kwjGoRkT2qOlBuP9cB+ZuAi1T1Z37mfXhV+zdXb6JhGMbkwFVIfwr8qYg8BChwLo4ur2EYxmTHtbNpBbAfuBR4J/ACcHmtjDIMw2gmXIV0DvAd4PN++DbQWy6RiMwTkcMiskhE9ojIdvFYJiKPiMgWf7/3+PGb/fh6EXlURK7z47f66S/z41/z9z+n8lM2DMNIFlchvc0Pj+ANfbrN/1uODwFPAe8GPg68CJwJvA94LzBPRE4ErgUuBC7z010JnAesFpF24GzgCuBaETkVeB34MGCjCQzDqDuuw5+WAYjIPlW9yCWNiAQf+78MnITXHPALvC+kgvgLeN5up6q+JiK/FZHpQIuqHhaRNNANvFIkbRAvduw1wBqAuXPnuphrGIZRNU5CKiLD/s+TQr9R1WtKJLsG+BJwVmR7dLxVuXh4u9O+qno73ocDDAwMlB/fZRiGEQPXXvvbIn9dOBVYAiwCFuB9mz8Hr9NqP14b62w87/JV3xPtUNWDInLM90YP4Xm0XXieaDhtkJdhGEZdcW0jfQyYD5ylqvcC9wHPl0qgqh9R1XcDjwOXADfiCedjwBY8b/V5VX0FuAP4LvBNP/lX8b7nv1NVDwGP4gnxHar6jG/3Z/G+/zcMw6grrl82PYjXU3+Nqp4qIgLsU9XTa21gXOzLJsMwqsX1yyZXj3Q68I/AIb8X/XLGb8s0DMM4rnBtI/1j4Ga8YUe7gX3A1bUyyjAMo5lwFdJ+Vb2yppYYhmE0Ka5V+xtqaoVhGEYT4+qRnioi+wHBaxsVQFX1pJpZZhiG0SS4ftnkKriGYRjHHbYcs2EYRkxMSA3DMGLi+q39x4ttV9WbkjXHMAyj+XBt+/wF8DG8pZj/O3CsVgYZhmE0G05Ve1W9A8gC9+PNMXo+3qB8wzCM4x7Xqv0obwx7Am/JkYuA5TWyyzAMo2moaGJnwzAMYyyuHuk/Ay3hTXgD8p1myzcMw5jMuHY2DQLr8SZa/pyqPlk7kwzDMJoL13Gks/EmY34YGBaRb4nIFbUzyzAMo3lw9UjDbaT/0//bD9yTrDmGYRjNh6uQHgA+r6Hp9EXkY7UxyTAMo7lwrdr/GaHOJhGZAqwtlU
BErhKRe0XkLhFZJiKPiMgW/3/v8eM3+/H1IvKoiFznx28VkT0icpkf/5q//zki0i4iIyLykIjYWsuGYdQdVyH9AvBDEblHRO4BfgjcWSqBqt4NLAXOxVtj/r3APBE5EbgWuBC4zN/9SuA8YLW/lMnZwBXAtSJyKt7M/B/Gm6n/AmCPb9MfFju2iKwRkd0isvull15yPEXDMIzqcK3a/w3eOvEL8MT3aeBIqQQi0oK3guj38ZZSfsEPc4BOVX1NRH7rL8PcoqqH/SWYu4FX8D5LPTmUtlj8jGLHtnXtDcOYSFw90h+q6muqukdVH1bVXwEPlUqgqsfw1rSfSeEY1KiwjSd06rCviaRhGHXH1SP9ld9+eT+eeC3GG1NaElU9JiKtwM+AXrxhVC8Ar/qeaIeqHhSRY743egh4GejC8zz3+6EXz5MN4r/v77ff9UQNwzBqhatHugKYC2wEPgWc5m8bFxH5kIjcB/wOrz31S8DzqvoKcAfwXeCb/u5fBR4E7lTVQ8CjwNeBO1T1Gd/OzwJb8cT8HOCDwNcc7TcMw6gZEhrRVHpHkflAt6o+XFuTkmVgYEB3795dbzMMw2hCRGSPqg6U28/1W/svAfOBecApItIGjKjq2+OZaRiG0fy4Vu3fBrwD+I0f7wJOqYlFhmEYTYZrZ9N/BX4AzBWRh4A3AX9RM6sMwzCaCNf5SO8QkS8CJ/qbXlHXxlXDMIxJjlPVXkSywN3ALuAB4JsisqSWhhmGYTQLrlX7/4b3ieYDeJ9rngvchTcMyjAM47jGVUhnAW/2A3gz5M8QkTWQ/yTTMAzjuMRVSG9ibC/93+J9bWRtpYZhHNe4djaVnOnJMAzjeMZ1QP4RvG/b85vwFr87qSZWGYZhNBGuVfvnVPXN5XczDMM4/nAV0tkiMhzdqKrXJGyPYRhG0+EqpP+2plYYhmE0Ma7f2q9V1XvDAW8aO8MwjgO2bdtGf38/LS0t9Pf3s23btnqb1FC4eqRni8hbVPVJABFZiLeukmEYk5xt27aRy+XYvHkzS5YsYdeuXaxevRqAlStX1tm6xsBpPlIRuRD4K6APz4t9AfhzVd1RW/PiY/ORGkY8+vv72bRpE8uWLctvGx0dZd26dezdu7eOltUe1/lIXYW0S1V/Gdn2dlW9L4aNE4IJqWHEo6WlhUOHDtHW1pbfdvToUdrb2zl27FgdLas9rkLq2kb6bX/t+TYROVtEvgt8rIwBV4vIfSKyU0R+z1+nfrt4JLbOvaP9hmFUycKFC9m1a1fBtl27drFw4cI6WdR4uArpuXgL0z0CfB74sKr+uzJpvuLPoP8ysB74OPAicCbwPpJb534Mtq69YSRHLpdj9erVjI6OcvToUUZHR1m9ejW5XK7epjUMrkL6U+A/4y01sgD43yJScgVPVVUR6QBmAIcpvjZ9wTr3QME694DLOvfFjn27qg6o6kBPT4/jKRqGUYyVK1eyceNG1q1bR3t7O+vWrWPjxo0N2dFUr9EFTkKqqnP80Kmqb/J/u3we+hngBiDckFLtWvUu69xPKmzIidEorFy5kr1793Ls2DH27t3bsCKay+XYtGkThw4dYtOmTeRyuQl5blwndp4rIp8XkX8WkR0i8tciMqdMmrPxHNOHKb42/bjr3AOVrHM/KalnoTCMZmTjxo1s3ryZZcuW0dbWxrJly9i8eTMbN26s/cFVtWzAm9D5YuApYCHwEeChMmk+CjwB7MRbOG83sB1PvJfhtbdu8fd9jx+/2Y+vx1vb/jo/fqv//8v8+Nf8+DnlbD/nnHO0GclmszoyMlKwbWRkRLPZbJ0sMozGJpVK6ZEjRwq2HTlyRFOpVNV5ArvVQSNdhz89oaqni8jtqrpGRFLAE6q6IJ6M155mHf50PA85MYxqqMV416SHP70VQFWDGfFfB5ZWZZnhhA05MYzKqOvoAhe3tZlDs1bth4eHdf78+ToyMqJHjhzRkZERnT9/vg4PD9fbtOOK4eFhzWazmkqlNJ
vN2vVvcJK+XzhW7esudLUOzSqkqvYQ1xt7mRmJCilwEfB1YCQUtrikrXdoZiFtVibLC8A6/IykhfR5vC+NnsYbkP8W4FmXtPUOJqQTy2Ty4mrRC2w0F65C6trZ9HNVfRD4Cd7nnSvxPvc0jALqOpYvYazDz3DF9cum8/2fV+J5py8B76qVUUbzsm/fPpYsWVKwbcmSJezbt69OFlWPfWNuuOL6ZdNFIvJpYC7ewPxLgTNqaZhRW2r1+elk8uKa6Rtzo8641P+BHwEfwJt85F/jzdT0nEvaegdrIx1LLdsxa91GOlk6sozmgIQ7m34CLAd+5v+9FHjeJW29gwnpWGrdG10rsZtMHVlGc5C0kH4C2BIJn3BJW+9gQjqWZu2NtuFIRpRa11CSFtL3uezXiMGEdCzNKkjFXgBbt25VwKr6FTIZmkgmooZSi6r9m/HGkOaDS9p6BxPSsTRrFTn6AhgeHtY5c+ZoJpNpqvOoN0nf/3qJ8kQ4BEkL6W/xvmYaDYURl7T1DiakxWlGjyQqAJlMRnt7ewtsbwbPOgni3L8kBaieL+WJaKJKWkifctmvEUMjC2kzilm9CV8zQLdu3Vrw/2Zo641LXPFKUoDq2UzUjB5prsi2j7mkrXeYP39+w4hVWAT6+vq0p6enaarXjSj61TxIjXgelRJXQFzSu16nenZcNmMb6TNAayg+BfixS9p6hylTpjSEWLlUS3O5nKbTaaeHfCIFoVHHhrrY1Uwvr4kSr3LXrZL7Xe+Oy2brtf8o8CRwjx+eKualjpP2JuBzwCJgD95yI0J1y43socLlRhYsWFBwYep1k9PptOZyufz/UqmU7tixI29LUHhFpGzhLVbQe3p6tK+vryYFqpYPS1yRLvUgNVOb6kSLV6nrVkn+E91GOtE1ikSF1MuPGcA5eLPln+CY5jTg730hvQnv+/wtwFnAViDrd2KdCNzrH+MHftrdeMsxPwy0A/fjLQf9DeBU4G7/C6u/KXLcNX763XPnzi24MPWqdqRSKc1kMvkbn81mdceOHXlbstmsDg0NFRTW8QrvRPdeV+oBRQv74OBgRQ9tJZ55KaJ5R19e5c5jIslms5rL5QquUxCPUmvxinu/aymiE92xlbRHei7wD8DdfrwN+CPHtEt9If2iL8SfBC4Dvu0L6DDQDzzq738/MD0U/xegD/hffpPCbmAJsAlvGNZXSx2/Xh5p9CGOCmVU/KJCqzp+4Y0W9KgoqybbVhjnIc/lctra2qq5XK5o4Y+ey/DwsPb09CigIqKZTEZ7enqqeliSuE4TRXCuYZHIZDIqIkX3r6V4VerxTpSQTkQzQvRcgGc0QSF9ErgEeDK07WnHtMWE9F0RIc1GhHTaOELa5nuoYSG9u9Tx69VGmkqldOvWrWPa58JV93B1PFr1V3X3SCv1tCp9sw8ODmpra6sODQ3pwYMHdWhoSFtbW3VwcLCsbeU87ej+fX192tXVVeBd9/b2al9fX1HbShF9AfT19emsWbMaZtxp+KEVEV21alXB/1etWqUiMm6zRa3Eq5LyMRFeYnCugC5atMjJ2aj2ONFzAQ5rgkK6E7gOeA74N8Bnge84pg2EdLyq/XdCVfuwJxpU7R/yq/bfc63ah0O9eu37+vp0zpw5BTdl1qxZ2tra6tSeV0nhrbTtr9I3eyUeadQLTKVSevDgwYLCHi780XMBdPbs2QXnsmPHDgWK2laKYi+AVCqlXV1dde+1j563iBR47oEnD4wpDxPRRu4q1LX2EsPnumjRIh0aGip4LpI8VrFzCTuPpYKrkM4CBvE6fL7g/+50TBsI6SLqsLZ9vcaR9vX1aW9vb0FhL+dZlSq8g4ODmk6nFdB0Oq2nnHKKiogCCmhHR4ezV1BpG1i5/Ut1qrm0/UbHhq5fv77gWNUKaTab1RUrVhRctxUrVjREVb6Y575gwYL8/QR0wYIFRa/bRLeRl6IWw5/GK0+BqA4NDemiRYsmpG0Y2KMJCuncYsElbb1DvY
Q0WrXPZrO6devWqgpY1LNavHixArp48eKqPK1qPNLx9i/XJlqujTRKMU9+zpw5VVXtRaSolz9eu+NEEn1oly9fnhfQAwcO5F+Sy5cvz+8TCFQjtf0m7ZGW66QdHh7WRYsWKVC0nMdp8pgIj/QBv+3yAeDVIO6Stt4hrpBWe2OSLGDpdFqHhoYK4pdffrmm0+n8tqGhoYJ4KSpt1yq1v0uve6le+yD/cDvmjBkzNJPJ5B+iajub0um0rlq1quDYq1atcr5OtSR63dLptF5yySX565ZOp/Xiiy8usDXskYabWoB8eYj70h6P8Z6DpNtIK21jj9oYdyhdT09PQdkDjmpSQprfGVqBJypJU+8QR0hdBi6PJxAuNzVaXY923oQb2RcuXJhPC+hLL71UUN09ePBgRdXf6LGXL1/uLHbh/9dicHip9r5KXmwioi0tLQVtpC0tLQ3hkRZrG543b15B2fIf5DHlJ1pDmTlzpgK6YsWKcb34SoaklbO1kufA9VqEXwpr164t2Ulbanyta1v+eHYEQhqMokhUSEMe6TPAp13SNEqICmlSw35chLJcm2epjpDwVzjpdFrXrl2bzz8JjzRcYLq7uzWVShVUv107L+J63sXSr1ixIl+1Db9gKvU4GtkjVa2+1z5aLltbW3XatGnjjnQo1vySSqW0u7vbaYhZpfe4kmcsalt3d7e2tLQUlMVwJ21fX9+45bLYELLu7u6S0yyWat9X1cSr9vP8dtFpLvs3UggLaTHXvVQBKjW2z6VKW6oARavrw8PD2tXVpW1tbWN64gPRXbt2rS5cuLCqNtJwgWltbdWOjo78dWhra9N0Ol1QWF2HCcWtTkXbkru6uvIiKiI6e/ZsFRHt6upSYMxY21IPdBJtpLUcZhSmkhEG0WsG6EUXXZRvY02n09rf35+Pi4iuWLEif6xKh5hVUuuo9BmLdggC2traqm1tbfn0HR0d2tramq9h9PT0FM273DNVzJMu1R6rqol3Nm0B7gyFLcCdLmnrHXp6eoreJEDb2tp0ypQpBUOSwlWeYl5C4PmVGkjuUoAAPXjwYEGB2r59e756nkqldMOGDfljdXV15YfDpNNpPeOMM/Ln1draqp2dnSWrXmEPNHi4grd+UECDamQg4mHBqcQDWb58uVOTRSDqM2fOzNse2BFc/8BrCjpZduzYUXBu0Qc63GQhIjowMFBg28DAQFFvd7zzilb1qm2vdWH58uVOIzGiHXJBjaK7u1uPHDmiK1asUEA7Ozv14MGD+fsbnCug27dvL7hupUZGlBv9EL7mgE6dOtW5s1BE8i/M8IiF4P4F5wboySefrF1dXdrb26tbt24dk3f0xVmuHEc9+3ANJiBpj/RivDlIbwPO9D3UeS5p6x3wxyVGC6iIaGdnZ35bUBDDVdxwmmjaqEda7C1fqgBFq52Ann/++XnRCAQmaBuaP39+gR2nnHJKQQEICnaxtqHoUCwgL1hhLyYcD1+boGnBpd1zcHCwQJiDv8WaLIpV5YJjtrS05L3lqVOn5ntpR0ZGCh6G8O/BwUFNpVL5+x2cQ9B2GAjMqlWrxvX6wi/SqOcefqhdvNNy7ZLhF07wgg9e8sGLPlyNDYuZiOQ9s+i9DP4f/p3NZvNNGoDecsstBYJRSkiDEQVr167VAwcO6Nq1a/MjCoJr3tvbm782QMELqlTegE6fPj1fHsLXQFXznWthRygs4uG8i3XCrV+/vuA8wy/e4BpGm7jC7dIkPCB/ri+e/wHYhff9/NkuaesdRKSgcAaiGTzE4cLW1tam06ZNy4vftGnTxi2cfX192tnZWeDldXZ2lhxIHn6wgryDwhnEBwYG9MiRI/mHpLu7O19Na2tr097eXj377LMV0Pnz5xcdzB2twgK6YcOGggIWvhbRhzAawqJYroobFdDg2gTXPFxVC/IP34ditk2ZMkUBHR4e1o6OjjEvtsCWcFthUCsIHwe85pDw/QhX/aJDtcKCFbzggvPJZDLa3t5eUE
sId9hFRx9EX9KBqAcvy/Gu/YwZM8YM0BcRnTlzZlEBjd7PqMMwXjtk8LIKziUshOl0WhcvXlwg4tHyEX4OwjYE7fmlhDT8kg+uc2Br9PzWrFlT0GEYfsaiVfVyM6wF9o1XXvzzfdFFZ1yFdAtvVOmD0BRV++BGRQtYsZsUvPHGK5DhgpvL5cZsa29vL7hpGzZsyOcV9cTCBS8cgrahsC3B8QOvMJvN6gUXXJAvQIHHGH7zhjufwKtuRV8oxR6KUg91cN1KtYMG+wVeYfj6BOcVbA+qdcWOE67OB9sCz2jGjBkK6AknnKCATps2Lb9fe3t7gXiFmzLA837Cn+Vef/31+esY7owK8gt7SC0tLdre3q5AXgijIfB+o2KVyWS0q6sr/5JubW3V9vb2gvs93j2IDocKajNBrz6gp512WsH9DotatKxFPe3gegZ2BPcm3BQQ9BVccsklRW1sb28veNEEIXyPigFeTSws0kGaIGQyGc1ms/maVSDMIyMjOnPmzIKmuVJefnBe4XIBb7zMojXUmg1/asaQSqUKqrSuYTxPLfCO+vr6xlSHgXw7VS6X05aWlny8mCcWHMfFK7znnnvyHQKpVEr379+fFwAR0enTpxeIcWCbS96VBNXCoTlR79RFjIMQvSfBkK5SYfr06fl2yuHh4bwIhPMqd87hB37q1Kna2tqav46B6IznDYVfBuVsjb64wj3IQN67dr1e4X2DJgyXNNGaVVtbW9Fjhju6iuURLu8uL2UorM0UI7DthBNOyL8wgvyDIX3RexocJxDE8N9iXn+x+19BmX89MSEF9kfCL4D99RZJR9sruWglC2+xwlMqtLS0jBHOUvmXCu3t7Tpr1izt6+vTbDY7popbLATVwiSuQdjW4AUBnrcwY8aMvMCH7U3yuOOFM888U4G8iLteT0IPVOAtBb3dlQhbsXDgwIEJOfdqw3jlOPDsw8IbPpfgngfOhMv1zWQyeU+vGMFojGpD0KwQtKfHvXfFQpJC+jTQUm9RrJeQlgsiMuatH/6fS/pw3MUzK1eAi/0ut69LOPnkk4u+HIpV65opJP3CmSwhKB/hNlTwaiLl7ndQ3b744osV3hDSM844Y9zjNGJw0RnxxaYkIvIs3qz2grei6E7gU6r6bNnEdca/2YbhTCqV4vXXX686/YEDB5g1a1aCFtWPuNcimldbWxuHDx9OJL+JQlWl3D5OQlqQQGQmcDXwEVU9vUrbJgwT0okjyYduoplM4mckS6JCKiJdQCCcTwKzVPXH1Zs3MZiQGoYRBxchbXXJSERywEq82eoBzsCbWPmmqq0zDMOYJLi2kT4D/Cu/ZxQRSeEtxzy/xvbFxjxSwzDikJhHijer/b0i8jBeT9ZbgW/FsM0wDGPSUEkb6el4q30CPK6qj9fMKjd7ZgP/A/h/wB+o6mvj7GceqVExcTrOnn76aU477bSELTLqRezOJhG5BbhRVQ+GtgmwGrheVd+ShKHVICLvBzqAbuBfVPXrof+twVvbHuAc/S+db6S78VXixAtsiJlXMx07HK/nset9Hep5bLsHdboOCQjpB4G1wC2qulVEzgY2A48Dn1DVJ8odoFaIyA3APrwVSKeo6q3j7GceqWEYVRO7jVRVvyAiw8AnRWQd3nrzH1DV+xOyMSlMLA3DqBslhVREHsATKQEywMvAX3u1e1DVC2psXyn2A734Vfs62mEYxnFOuar9vFKJVfW5xC1yxLWzaWBgQHfv3j2hthmGMTkQkT2qOlBuv3JV+7oJZTlU9UXg/HrbYRiGUfG39s2GiLwEPIfXBPBy6F9x4knm1UzHbiRbjtdjN5Itx8Ox56lqD+Wo9zR3Ezid3u6k4knm1UzHbiRbjtdjN5Itx9Oxy4UUhmEYRixMSA3DMGJyPAnp7QnGk8yrmY7dSLYcr8duJFuOp2OXZNJ3NhmGYdSa48kjNQzDqAkmpIZhGDExITUMw4iJCakDIvKCiOwUkd6Y+dwkIp8TkUUiskdEtk
swcUH1eWVE5FkR2VllPleLyH3++f1eAnaF81sQ07arROReEblLRJaJyCMisiWBvJaKyBMi8uVq8grlOU9EDid0P4O8Yt3PUH5Bmb0gAduCvM5LyLZ3icjdIrI4AduCvM6KWdYG/XN8VkT+rFK7TEjLICItwD+p6lJVfSFGPqcBb/Kj7wY+DrwInBkzrxZgi6ourdK0r6jq2/G+4lgfx64i+Ukc21T1bmApcC7e/LLvBeaJyIkx80oDf6Wq767GrhAfAp4i5v2M5BX3fhaUWeCdcWyL5PVSArZNB/5UVa8CLo1pWzivV+PYpqq3+ml/hDdBU0V2mZCW5wQgKyL/Pk4mqvo0EHhAJwEvAL8ATo6Z1wxgqYhcWKVdKiIdfj6H49hVJL/pcWzzH+IngN1Aj2/bC8CcmHlNAf5QRH6/Grv8/Lr8ny8T835G8op1P33CZTaWbZG8krBtCbBIREYSsC2cV09c20RkEfAM3nlWZJcJaRlU9WVgMfDBcrNhVXuIWIlVfwBcBnxGRNqrzOYzwA3AsYTs+gxwg6ruiWObqh4DFgEz8Ty1qm2L5LUD+ADwxUrzCXEN8KVih4qTVxL3M1xmiX/dwnn9Kq5tQBdwK/AVvJU2qrYtktfbErDtcuAbkW1OdpmQOqCqR4FXgFkJZRnMpTrH/x0L9ZaCOYpXZa0I8VY9UFV9OAm7IvnFss1PfwxvlrKf+bbNxvMW4uTVh3c/p1aTj8+pwH/CE+c/IN51y+clImviXjMYU2Zj3dNwXgnY9hLeBPGHgNdi2hbOSxKwbSlwH1U8ByakZRCRK0Xke8ARkptA+svAjXii8FicjERknYg8COxU1V9XkcUy4CK/kf7bCdiVz09E/iKObSLyIRG5D/gdcCee1/a8qr4SM68PAN8H/q7SfAJU9SN+G+vjwCXEuG6RvNIx72e0zN4Qx7ZIXu+IaxueUL0DeD9wXhzbInkdS8C22ar6G6p4Pu3LJsMwjJiYR2oYhhETE1LDMIyYmJAahmHExITUMAwjJiakhmEYMTEhNRoO/3vzB/3f00TklRiDrA2j5piQGo3OGrwvWAyjYTEhNRoWEWnDm3TjeeBPRORT/vZnAw9VRJ4SkR+LyLN+/C7fo71ERO7yt+31bNw0AAACRElEQVQUkadF5FEROUlE/sQPJ4rIY6Hj/U5EnvP3XyYi/+hv3ywiGyLx80XkRhG509/2l+LPTOXb8M5QvnnbjcmJCanRyFyB9x11qa9G9gPZIttvCP3OAP3AD4GzQts/CnQA+NOlfR9vRiKAH4TyPQPYGok/hvfd/ltFJIX3lc4JDudkTEJMSI1G5ipgWyh+rYg8QeGMPFOKpFuKNyVdmKeAC4FRP94JLOCNb6mD2a8AUNVfAS0iMhM4qqo/j8R/6+fxCJ7X/FM/j4DbxZv/tKpZtIzmwoTUaFTmAvt9wQq4Q1VPB34O4IvaoSJp1zD2O/oFeN/qBzMOvZ/C2Z/m402bFub7fl4PjhOfDvwTcDPejFLhSVDWAF8F1o17hsakwYTUaFTmAP9QZp+L8MQtyl7gl5FtijfxRjAzUAfwfyJ5PRhJMwp8GPjOOPEO4GE8ER4FoiMLfge0lTkHYxLQWm8DDGMc9vnzmY7HiXgzQh3GawI4WUT+o/+/6JrkP8Gb+fzXeG2gy/G8WxURRORtwE3+/zcAbxKRK/CE9nZgp59PNN6BN4fr21T1ZRGZ5m9/Gc/b/SXwR3jtp8YkxmZ/MpoSEckAX1bV8/z4dUCvqv5lFXktBa4Llh7xe9ifwPNQb1TVq/3tp4fjhhFgHqnRrLyI1+sesJ2xVWtX/i/w6VD8LrwZ17+B5+0iIlfjea1XVXkMYxJjHqlhGEZMrLPJMAwjJiakhmEYMTEhNQzDiIkJqWEYRkxMSA3DMGJiQmoYhhGT/w8paEInOSGJCQAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x155025b0898>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Извлечь выборку для построения графика\n",
"np.random.seed(seed=7) # Задать начальное значение случайного числа, \n",
" # чтобы получить устойчивые результаты\n",
"# Случайно отобрать 70 образцов\n",
"samples_index = np.random.choice(range(counts.shape[1]), size=70, replace=False)\n",
"counts_subset = counts[:, samples_index]\n",
"\n",
"# Индивидуальная настройка меток оси Х, чтобы легче было читать графики\n",
"def reduce_xaxis_labels(ax, factor):\n",
" \"\"\"Показать только каждую i-ую метку для предотвращения скапливания на\n",
" оси Х, например factor = 2 будет наносить каждую вторую метку оси Х,\n",
" начиная с первой.\n",
"\n",
" Параметры\n",
" ---------\n",
" ax : ось графика matplotlib, подлежашая корректировке\n",
" factor : int, коэффициент уменьшения числа меток оси Х\n",
" \"\"\"\n",
" plt.setp(ax.xaxis.get_ticklabels(), visible=False)\n",
" for label in ax.xaxis.get_ticklabels()[factor-1::factor]:\n",
" label.set_visible(True)\n",
"\n",
"# Коробчатая диаграмма количеств экспрессии на индивидуум\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(counts_subset)\n",
"ax.set_xlabel(\"Индивидуумы\")\n",
"ax.set_ylabel(\"Количества экспрессии генов\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_07.png', dpi=600) "
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAUwAAACnCAYAAAB6mmWqAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAFo9JREFUeJzt3XnUHXV9x/H3BwhL2UzYwYaogGwWTaKIWAmU4xaOCoqg1qUuERVxq61aq4SirX/g4bSeUxutIFCk4EFtFSjVGqUIHBIEF3aUICahCUnBhSUN3/4xc5ObyfM89zd3Zu6de5/P65x77p157vzm+5vl+/xmub9RRGBmZr1tM+wAzMxGhROmmVkiJ0wzs0ROmGZmiZwwzcwSOWGamSVywjQzS5SUMCW9SdK/SlooaZmkn0h6Q9PBmZm1iVJuXJd0H/Bu4DvAkcAG4PqIOKDZ8MzM2mO7xO9tCzwTWAccDwh4qqmgzMzaKLWF+W5g38LoVRGxpJGozMxaKClhAkiaBRyaD94VEQ83FdSee+4Zc+bMaap4M7MtLF++fG1E7NXre0mH5JL+CngDcBvZ4fhzJF0REedUC3Nic+bMYdmyZU0UbWa2FUkrUr6Xeg7zHcCzIm+OStoGuA9oJGGambVRasK8CviBpJuBAJ4PXNNYVGZmLZSUMCPiTEmHkt1SBPCViLi9ubDMzNon9RzmosKoF0t6sa+Sm9mokgRAmU7UU38auV/+Orfr836JQZ0j6XxJcyTdL2lpcnRmZg3p52kTqYfkiwEkndb5nELSQcDewONkN79fMNn0eSt2EcDs2bNTZ2FmNjCpvyVfJWklcLCklV3DU4qIe4HL8sFdgQWSjpvku0siYn5EzN9rr563Q5mZDVxqCzPp8LtHGbdKOgn4oaRjI+LxqmWamQ1SagvzBZK+KunyfHiGpNeXnVlE/I6s444dyk5rZjZsqRd9Ls5fRwFExAbgs2VmJOn9km4ElkbEI6WiNLNGSdp01XhUDCPm1BvXVwEHATtKegXwUiDpp0QRsRRYmg/+Q8n4Rlo/ty1Y87xethYRI5cwYfDrMLWF+Rqy5PrvwElkP4s8uamgxsU47JCj2PJIMQ7rxgYvtYX5mYh4X6ORWF8G0Vpqsmy39poxjsu1DXVKTZivlHQwWU9Fm0TE3fWHNDrasALbMP8qioeC/SzTtqyHOtVRpzqXR1uW8bDnn5owVwDFn0EGcEK94WzWhhVUjGGimLo/l405pfy6VZ1Hr+mrlp9yLq3XemibfpdJm7aDYZzjbEMOKEq9D3NBw3FMNM9WnDsrrqypVl7ZmIvfH1Sdq7ZapoqxjhZjahxtVazzoJZJGW3Zv7qNwj9CP2Z3gEbhAkrdMbZtgx+EXnXuZ5mMwrZT1kR1avv2MtYJs9dGNuiNcBAbQx11avtGO11NtV5GIaEOO0HWsYxSu3f7T7LOMzaNAiIiGjuHOUEMQP3ngaqcgxy01PjaXKe2xTOqhn34OornZes4DZF60edM4GNkvQ6dHxF3VZprH3qdB6prR2zzjtzvCh/1iwdtu5jWRNLvp0yfA91a08sk9ZB8H+AC4GbgUknXSDqlubB6K3MxpmMUDltGzaB22qotmTKHs72Gi2XVtV01/Y+tTIzjsK80UYfUFubxXZ//LX9/DnBlrdEMQJtbkDY8xUPbMncvtLW1VVS29ToKdZpKE3Uo1YGwDdYo3JdpNp2kXvS5D9gfuAl4iiFc9JmORuG+TLPpJPUc5qHAWcDvyHpQf4WTpZlNZhzOgU4kNWF+guyhZ8uB9wC/lvT3jUVlZiOl14WxcZF60Wdp1+fv5e/juUTMrLRxuEiUIjVhviciTu8ekT+u4of1h2Rm1k6ph+RzJT27MyDpMGBuMyGZmbVTagvzncAFkp5OdoX8IeC9jUVlZtZCqQnzMeDY6DqTK+klzYRkZtZOqYfkV8TWl72+XHcwZmZtltrCvE
HSZcCPyK6OHwv8tLGozMxaKDVhvgk4kez349sA/wJc01RQZmZtlJowdwMOADZGxHmStgEOBH7ZWGRmZi2Teg7zGuBZZD+PhOyw/OpGIjIza6nUhLkLcAnwuKQdgVfhX/qY2TSTekj+FuBcsp6KlgF3AKc1FZSZWRul9od5i6RTgX3Jnu2zKiI2pkwr6Ryyc6BLgIvJbnpfOMFtSmZmrZZ0SC7pjcA9ZD2sXw7cJ+mDCdMdBOydD54OfIosYT63r2jNzIYo9Rzmp4C5EXFMRLwIOAL4QK+JIuJesv4zIeuAeDWwiuyK+xYkLZK0TNKyNWvWJIZlZjY4qecwZwBfLHTftIekSwEi4o0l57vV4XhELCE7bGf+/Pk+XDez1klNmK8BZhXG/VPJea0kOwe6X/7ZzGykpCbMmUzcKizTH+ZlwEXA/wC3lZjOzKwVUhPmt/NX9zF5kNCBcEQsZXOP7fNLxGZm1iqpCXN1H+cpzczGSmrCfKak7vOOncfs7t9ATGZmrZSaMHeKiA3dIyTt0kA8ZmatlXof5s8mGHdTnYGYmbVdagtzvaQz2LID4ccbi8rMrIXK3Id5FrCQrFV6J/DqpoIyM2uj1IR5ZER8ojMgaRbwPuBvGonKzKyFUs9hvlXSVZIOk3QWcD3w2wbjMjNrndQW5rnAPOBa4Mdkz/hxwjSzaSW1hflF4F3AvcCuwHn5uIGZNWvWFu/TUdll4GW2tWEsk3FcD6OwLTYyz4ho3WvevHmRdzAcHZ3PnfeZM2cGEDNnztzqOynDTUzfdAzFZVD391OGe9Wh7uVad/kp21HddWhiPdS9HnsNV61T1f23yjqZLMbuMoFlkZCbkhIY8DLgBrLD8F+RPaZiQcq0/bzmzZvXcwVVXQFVN+KJpu81POiNrmodU5JJ2fVSNuZe72XrVKa8purQa/qqySRl+l5llq1Tr5jqnr5M/JNN0/33uhPm7WSPmbgjH54L3J0ybT+vefPmld5xqq6Asi2blIRZd3IpO33ZZTLZd6sktKoxtyFhlo2h7umLn6smo4nKmOg7VRJm08uw+Hmq+qRsa6kJM/Wiz10R8aik8/Lh24AHEqdthfXr1xMRFDpBnvTvvYYn0n3OZN26dUnTlImhah2rfr8JbYih7apuV8XpYfyWe0p96qhz0kWfiDg5f/9y/r4xIk7se64DMIyTzJ0Vsn79+oHNs4zpePFhEHWuOo9e01fdrgaxXQ562xrWtpzawhw54/YftA7juEzKHjkMI4amp69DsRU6Uat0KlXrMOj59WtsE6aZpat6Oqju+bdV6mN2/1LSA5KelLRS0ipJdzcdnJlZm6TeuH4m8CzgFxGxf0Tsh1unY2ccz3Ga1Sk16V0ZERskfU/S9cATZF292RgZlcOiqZQ9F2ZWhrLbkEpMIB0GbBMRP28mpOy55MuXL9+080YEnL375i+c/cim8W15z5dNK2Jpapml1LHp9VR7nQrlZe/t3taGsa1utZxavozKbqta/OjyiOj5kMakhCnpo8DryB6T++dkLdPzI+K8KSfs0/z582PZSfdsHtHCFdLERjjsZJRSx2HHUHeCrLqjjcJ6q2NbHbUEW7aOQK0J8xfAS4FbgdnABrKb2Rt5CNpELcy6N8JBtL6qbjQ9d+4BL4OUOlb9+6jtaE0k7aaTzyDqWPe23c+2WqYMak6Y9wOXAouAJWRPjfzTiPjDnhP3ISVhNr3jlt6Iaf9GOArJZNjvE9Vx0C3Kpuc3XdfjVO/UnDAXAnsWRq+JiKt6TtyHfhLmsJNP9r12tXrr3hGh+o428BbjAJJJ6w65E047tC3mcUuYX4yIM3p+sSZ1JMymLw7UsaON2rvrOJwEV3f5TdRx2Am4ah2pOWE+BPx1cXxELOk5cR/a0MIchR3NdXQdm6rjsBPgoNcjiQkz9T7M7YB9yc5dmrVGfHo3OHv37N1qo8WPbk4qZw87mvZITZj/HRHnVJ2ZpNVkj+g9PSJWVy3PzDu2DVLqTy
MfK46QdHmZGUnaFrg6IhY4WbaTW2s2Koa1raa2MOdKenZE3AWbfu0zt+S8ZgJHSDo5Ir5R/KOkRWS3LTF79uySRVsd3Fqbvob9z7Ls/Ie2rUYkPaLiOLLfjv+Kzc/0eVnKtIVyZgDfBQ6c6nt1PKJi0O9tiMF1jIhP77b5NaZ1nA7rcdB1pM5HVETED4AXpXy3RzkbJD0MPA1YUbU8syK3kq1Jqf1hvlfSfZIezIdnaPPzfZJIep2yno6eBH5SPlQzs+FKvejzIeAI4Df58EbglDIzioivR8SxEfHmvBlsZjZSUhPmUuBCYE9JnwNuAr7dUExmZq2Ueg7zXZKOBL5OdvP6xRHxs0YjMzNrmaSEKekthVFzJc2NiIsaiMnMrJVS78M8rOvzW4GvNhCLmVmrpSbMT+fvc4DXRsTHmwnHzKy9UhPmXcBTwBrgE82FY2bWXqkXfZ7RdCBmZm2XetHnVmBHYGVnFNnPik5oKjAzs7ZJPSQ/AfgIcAjwpYi4trmQzMzaKfXG9ZPIzmP+EPicpDskfbi5sMysmyRmzpw57DC20MaYmpbawuz0tP4ocH5DsWw902m4Qmzw2r6ddToTWbdu3bBD2aSNMQ1CasK8odEoJjDRCmn7hm2ZptdTneWP6o7vfaEeneW4fv36pO+nJswfk/1+HDa3NoPs3OZAjOqGXdUoJR9oZj11xzio7aDNCWm67gt1mGxbyh+E1lNqwvxf4DNkh+QPRsSqfoK1rU21Yza9Y/Rb/iCTSdMJuK55tjnB1qVsHdu2TOrYllIT5rVkP4ncGXiGpO2Bj0WEeyyqYBA75ii0IJvWRAu1bJnF9dBruEo8TSi73Jo4pdaGBJx64/qfdQ9LOhy4jBHv4q2JFVh1Ryjb4qzz+03pNc8mYxpWgpwqhl7DZbXln1jd2+IwTsX0knrj+rbAQuDIfNTPgXlNBdWvqiugzI470fRVd4Sm/4sPY6PrNc+27AhTGUaMbWhNlVH3tt7W7SL1PswrgFcBa/PXwnzcUBUTIDBp8pjo7916/b0uTbemoN46pMRbd52Gfdph2OpYj3Wst3FbrnVIPYd5FPC8iHgUQNLXgNsaiyrBIP5DTffzfynx1n0etulW+XSQcuTRxhZeEwm67jJTE+Zi4BZJj5DdTrQ7cE5tUbSQd8RmeLkO3igs8yZibKLM1IR5aURcJGnPPJC1kubUFsWI8CGKpRrFbWXUYq77ToMUqecwb5Z0WkSsBXaQdAlwZYNxtc6gznHa6BvFbWXUYi7GO6j4UxPmnwDHSLoOuAq4MiLmNheWDcuotTLMBik1YX4HOBo4OJ/mo5J+1FhUNhSj1sowG7TUc5inNxqFmdkISP2lzwoASVdHxCuaDcmsGp9WsKakHpJ3HNRIFGY18WkFa1JSwpTUeazus/udkaR9JN0g6TpJu/ZbjpnZsKS2MN8uaQawnaTtO6+S81oIXAJ8Fzix5LRmZkOXmjBnkD3Tp/t1Z8l57Q+sBlYBBxT/KGmRpGWSlq1Zs2arDj3LDvc7XZXhQc5rkPNxHV3H6VDHFKkXfeaULrlHkRPMYwmwBGD+/PmxYsWK4t9LDfczzagNtyEG19F1TBluQwwpMfaSeg7zKEnfkHR3/vqGpOeWnNdKYF9gPzY/39zMbGQk/5YcWATcnA8/H/gacFiJeX0H+Cbwf8DnS0xnZtYKqQnz98ChZM/0CbKr5b8vM6OIeAg4plR0ZmYtkpowXwOcBZycD98OvLqRiMzMWkr9nPhsmqQ1wApgT7Ie3jvaPtyGGFxH1zFluA0xtKmOB0bEXvQSEZO+yG4BWtn1vrJ7eKpp63gBy0ZpuA0xuI6uo+tYbZqpXr0OyX8bEQf3+I6Z2bTQK2HuIemzwFPAY2Q3nt8J3BwRTzYdnJlZm/S6D/NDZL/quQd4ENgZeDtwu6RTG44N8hvZR2i4DTG4jq5jynAbYmhjHafU10UfSQcC34yI55We2MxsRLXyKrmZWRuV7Q/TzGzaGu
uEKWm1pKWS9q1YzjmSzpd0uKTlkq5SP12dTFzmHEn3S1paoazT8n5Gl0r6o6oxFso7pIb4TpX0A0kXSjpe0i2SLui3vAnKXCDpTkmXVSzzQElP1LWeu8qrvI67yuxs0y+qKcZOeS+sMcaFkq6QdGxNMXbKe14N2+KZeX3vl/SBsvGNbcKUtC1wdUQsiIjVFco5CNg7Hzwd+BTwEFC285HJytwWuCAiFvRbHnB5RPwx2Q24H6shxu7yVDW+iLgCWAC8gKxPgjcDB0rao6YydwD+NiKqPnvqg2QXOGtZz13l1bGOt9imgZdXjbFQ3pqaYtwFeHdEnAq8rIYYu8t7tGqMEfGFfPq7gTll4xvbhAnMBI6QdHLPb04hIu4FOi2XKfv07LPMXYEFko6rUF5I2ikv64mqMRbK26VqfPmOeSewDNgrj281Wc9VdZS5PfBaSUdVKG9W/nEtNaznQnmV13Gue5uuY1vsLq+uGF8MHC7pv2qKsbu8veqIUdLhwC/I6lwqvrFNmBGxFjgWeF9+Vb/2WdRSSMStwEnA5yXtWKGozwOfBDZ2F1+1vIhYXjW+iNgIHA7sTtbaqhxfocxrgXcCX+63POCNwMUTzapqeXWt4+5tmhqWY6G89XXECMwCvgBcDryjaoyF8o6uKcZXAd8qjEuKb2wTJkBEbAAeBp5WU5GN9OkZEb8DNpAdWpYmaW5WTNxcR4yF8irHl5exkeyHEg/m8e1D9t+9b11lPp1sPf9BheKeCfwFWRJ+JdXX86byJC2qYxnCVtt05W2xu7yaYlxDdr/248BvaoixuzzVFOMC4Dr62FfGNmFKep2k64EngZ/UVOxlwGKynf22OgqU9H5JNwJLI+KRPos5HjghPxn+3Rpi3FSepL+uGp+kD0q6juzXYl8ha3k9EBEP9xlfscx3AjcB/9hveRHx4fwc6O1kz5yqtAwL5e1QwzoubtOfrBpjobyX1BEjWSJ6CdkPXF5YNcZCeRtrinGfiPgtfezPvg/TzCzR2LYwzczq5oRpZpbICdPMLJETpplZIidMM7NETphmZomcMK1ReccTN+afd5b0cMVfaZgNjROmDdIisp+6mY0kJ0wbCEkzyHrYeQB4m6S/y8ff32lxSrpH0n2S7s+HL8xbqCdKujAft1TSvZJ+LGl/SW/LX3tIuq1rfo9JWpF//3hJl+Tj/1nSxwvDx0haLOkr+bizlXc/l8fw8q5yN8Vu048Tpg3KKWQdKEz107KVwBETjP9k1+c5wJHAz4HuR6R8BNgJIO/b8Cay7sUAbu0q9znARYXh28g68Xi+pG3IftI3M6FONs04YdqgnAp8rWv4XZLuZMtutbafYLoFZH1KdrsHOA74fj68G3AImztQ6HRzB0BErAe2lbQ7sCEifl0Y/n1exi1kreBf5WV0LFHWWXHfXfrZeHDCtEGYDazME1PHlyLiUODXAHnyenyCaRexdacah5B14NHpPuztbNm12zPI+jjsdlNe1o2TDO8CXA2cS9ZdXHfPR4uArwPvn7SGNi04Ydog7Ad8tcd3TiBLYkU/A9YVxgVZDzudLr52Av6jUNaNhWm+T/bY6O9NMrwTcDNZsv0+ULyS/xgwo0cdbMxtN+wAbFq4I++IeDJ7kHX79gTZofsBks7K/1Z8bvQvyR4v8AjZOcqXkrVWQxKSjgbOyf/+cWBvSaeQJdQlwNK8nOLwTmSdLx8dEWsl7ZyPX0vWel0HvJ7s/KZNU+7ezYZO0hzgsoh4YT58BrBvRJzdR1kLgDM6z/fJr2jfSdbiXBwRp+XjD+0eNkvhFqa1wUNkV7k7rmLrQ+JUPwU+1zV8IdmjDb5F1npF0mlkrdBT+5yHTVNuYZqZJfJFHzOzRE6YZmaJnDDNzBI5YZqZJXLCNDNL5IRpZpbo/wETPN09A7ksUQAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x1555aeb2da0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Коробчатая диаграмма количеств экспрессии генов на индивидуум\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(np.log(counts_subset + 1))\n",
"ax.set_xlabel(\"Индивидуумы\")\n",
"ax.set_ylabel(\"Лог-количества экспрессии генов\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_08.png', dpi=600) "
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVgAAACnCAYAAABU30Q4AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJztnX+0HFWd4D/ffv3ykry8vLwmj7yY3tBxIJK8sG5+MEpgNEFOZgZYJjgyENjVUTHKEB1dhl2cuCMwrOxkTxxZmKMw/sCM+8Kg668zgpvViCvOyBJUXJQfKkPQCfEQQmSIYhj87h/Vt1NVqe6+1V3Vr7vz/ZxT572qrvv9fm/Vvd/7vbdu3RJVxTAMw8iewnQbYBiG0a+YgzUMw8gJc7CGYRg5YQ7WMAwjJ8zBGoZh5IQ5WMMwjJwwB2sYhpETXg5WRC4Tkb8VkfNEZI+IfE9ENuVtnGEYRi8jPi8aiMiPgbcDXwJWAC8C31TVRfmaZxiG0bsUPc8bAF4OHATWAwL8Oi+jDMMw+gHfCPbtwETs8FOqelsuVhmGYfQBXg4WQERKwKnV3UdV9ZncrKrD/PnztVKpdFqtYRhGhAceeOCAqo43O89riEBEtgKbgAcJhgdOE5FPq+r17ZmZjkqlwp49ezqp0jAM4xhEZK/Peb7TtN4KnKaql6nqpcArgTe3apxhGEYvsXPnTlasWMHAwAArVqwAKPmk833IdRfwdRG5H1DgdODLLVlqGIbRQ+zcuZOtW7fysY99jLPOOot7772Xs88+22sGVZox2FMJpmgB/EBVf9CivS2zZs0atSECwzA6yYoVK7j55ptZv3597ZiIPKaqr2iW1ncMdnPs0FkicpbNIjAMo995+OGHOeuss+KHn/dJ6zsGu7C63RD6f6GvgYZh9D7xccidO3dOt0kdYdmyZdx7773xw3O8Equq90YwNJAqTZbb6tWr1TCMzjM1NaVLlizR3bt365EjR3T37t26ZMkSnZqamm7Tcicp78Cv1Mdnep0ETwH7CF6R3ef2fdJmufWLg52amtLJyUktFAo6OTl5XBRSo7eZnJzU3bt3R47t3r1bJycnp8mizhKvs8DjmpWD7ZatHxxsr0YC1igc3xQKBT1y5Ejk2JEjR7RQKEyTRf7kUXaBPZphBPubwCeBO6v7g8Af+KTNcusHB9uLkUCvNgpGdvRiuVXNr+xm7WAfBc4heEXWHfuRT9ost35wsN0aCTRq5aerclnU3D30aiObV9nN2sHeA7wD2Av8LvCXwFd90ma59aqDDTuKoaEh3bp1a+T36Y4EmlWe6WgUerVC9zO92ODlVXazdrDzgC3ALcBfVf+f65M2y60bHWyzQhd3FFu3btVisahbt27tGscxOTmpW7dujeTD7bvfk6KAcrmcW4XLSmf8/mzZsqVp+l50JHnQq9ehEwFN1g72r3zOq5P2euBDwHLgAYLXbiX0e+LxpK0bHGz45pXLZR0fH28YZSU5iq1bt+rQ0FCuBTdN5RARrVQqkXxUKhUVkZqseDQ5Pj7eNO/tkBR57NixQwFvna00bsdT5NyojPTqdehUQJO1g/1H4BRgaXjzSHcy8JGqg70eOA/4BLAydE7i8aRtyZIl09qixm9epVLRiYmJiB3x1jGLLkraSCJt5RgaGtLt27dHjm3fvl2Hhobq2lAul3Mdl01qmCqVilYqFW+dcRmTk5O6ffv2yPnx9L36MCctzcpIq9ehE1Fv2ucFeQQ0eYzBfi227fZMu67qYD8KrAY+AJwf+j3xeOj3zcAeYE+xWJzWFjXelQb0mmuuiRQ6F2Vl5YhaiSTSVg4RSdThItgkWmk40nTX60zu1h07dnjrjNtYKBT08OHDkfPj6bvlIaTP0FPaoY8wzcpIq/c376i3Xm+qXC7X6mSaMtIqmTrYdrY6Dva80O+Jx5O2pUuXRjKZNrJot3WNOyIXwYa70gsXLtRKpd
JWV7rdMaS0laPZGGy9NGkihaSuW6FQ0Pnz59eGKMbHx49J007U3KsRbDNHVe9ajo+Pa6FQSLyWcZqVkVYiwU5cu7iOeJ3z6VVmQTc62LaHCFatWhXJZJqWaWpqSsfHx2tjiz6FME68Kz01NaWlUkkHBwcb3txSqaRDQ0MK6NDQkG7ZsiUid8uWLbXfi8Wizp07t1Z5XIUJy2yW77QOs5XII3w9naMsFAp1x7riFaNcLmupVIo0RhMTE1oulxvqbGZnozHypPG4cPTjIsFOjD22My3O51ouXLiw4bVspqOVscxCoaA7duyI5GvHjh2ZRo/xhmFyclJ37dpV05EU5GRx/7r2Ta6Qg11e7erfRbDIzDurY7SR441ktRPBlstlnZiYiFScZhU6TtLDoPnz59eGBJK6J0kPZsKVulQqaaFQ0O3bt+vhw4d1YmJCC4VCzQn7RF1xtmzZosVisSZz+/btEV0+XU6fLmm4wRocHNTR0dG6kUO8YgB61113RSrfrl27FGhoU7MHM/FGdGRk5BgH2uwhZdrudlra7ebGHRmgV199dcNr6WNDo8YqqSe1Zs0aJVgfWkVEh4aGdOHChRGZzRx9WloZpkt7P5PKet5rEfxvYHdo8x6DzXKbMWNGy5EFoKtWrVIRqRWIVatWNSyE8Ys9NDSkGzdurBsZ+jyYibewg4ODOjY2FplzesYZZ9QKbrFY1FmzZtVs9om844WwXC7r7NmzdXBw0LsLGb8G8QI2ODgYqXCFQkF37doVKejxLme8YqxduzbS5bzmmmtq96OVqLpcLuvo6Ggtqq5UKjo4OFi7lvHeQytDI63QbKhjampK582bp4ODg16NVTxYAHT27NlaLBZrOi644IKIk0lqnNI4nkKhoFdccUWtp+XqEaCHDh3SK664QgGN19G0QUyza7dhw4ZI8OCG6MLBQ/gepu25+pR1VVVCL1012nwd7CuqXfgPA6/wSZPH1s4sAlcYxsbGIn/TtPLNukn1HsxcccUVEScdLvwu+nAFolQq6cDAgAJ6+PBhveyyy2rnphlfC0c4xWJRR0dHVUTqRhbNuqxxRyQiunz58sg54a6aatQpxKPqcKPhGhJAS6VSTV7a8Tyg5mBFREdGRmr32EXybt9tM2fOjNyv8PS0LPB5WOcaBnc93Lj0/PnzEyPcYrGos2fPrtkadnbhaxouMyMjI8dE6/HhkUZlypVLd//i19E5XiCzIYJ6zs49gA3rX7BgQe0eF4vFWkOa1HMdHR2NNEY+ZX3RokX5DREAr6lubyGYs/pl4PU+abPc2pkHG74RIqILFixo6mBbGejfsGFDJEqOd5tERAcGBmqVx0WwrlK7iDU8rjsyMqLFYrGmo5mjKZfLNRkuj4ODgxGHGu5CNosWk4ZGnGxHs7GvyclJ3bhxY6QihjfX2AwPD6tqa0+xXcPp7BwcHNSZM2fW7NywYUNN16FDh2o6N2zYUJMRn57WLj69GheBhnsYrgFKGspwTtU91KrnYGfOnFm7F8ViUefPn1/TGY+aK5WKzpw5s9bQxaP9YrEYubZhXfHGK0w7D5iSnJ3TEY9gw9F6OIIFdNeuXZF8T0xMKFC3rMcjXic/1kge0Qwd7PuTNp+0WW7tOlgRqY1xhm9M0rnuhqWp5Eljn67yhCv9nDlzas5uy5YtWigUdGBgIBIZuC4PoJdffnmkADWLCoaHh2uR86FDh2oynfNSjTrYZtFi0jxZF1k3Gu8MNz5xJx13CJVKpWa3j01JuOsW1jFv3ryaTBGpXUu376Jc3+lpaUl68OO602E7RSTSM3I9rKRrMTQ0pOecc06toQd08eLFKiK1fZfveHfe4SI716vZuHGjAjp37tzESBDQzZs3RxrIE088MSLzlFNOOcYRtfOAKWn6YDxKBnT9+vWRexauo3EHOzk5qdu2bYvYHS5XxWJRh4eHI8NMuTvYbtnadbArV66MFPSVK1fWjWDdTQzfVNXGlTzJEbmINVwgXBfQ3a
xwVxbQM888s5a+XC7ryMhILaL1eXAA6KWXXhoZInCRXNLYWLOGJKmgj4+P1yJj93dkZMT7ZQZAzz333Ei0eOONN0auQ9oxWEJDBO5au79uO/3002s6Jycn9aqrrqob/WRBuVzWefPmRRqiefPmaalUipSJ4eHhSF5LpVLdhh6ODqnEu+vx/IbHR8NlHdBt27ZFGtHNmzdHzglH83HH00hHfKgpKbL1Iak+OVkuUCoWi8c0IPHx6nAP0gVX4foTd8jhoZBwZJ7nEMGPgV8SvHAwbQ+5Vq9e3dLNChcGF9WFC0SSTCDSdXYtWL1K7mQcPnw4cnzZsmURHZOTk3rZZZdFhhnClToeBbuxuY0bN3o/OIDgCb1jamqqJsdFneFx3GbRYthGJ2NkZOSYrumMGTPq2hV30uHIKh61uXuRdt5y3Jk55z80NKRPP/10TfZpp51Wk58UwWY5ayA+SyQ8o8MRbxgqlUrkfk1OTuqaNWsi13tsbCwSwSZt4UbcXYvw9V+wYEEtr4B+8YtfjNQD57xVj/aKwvcormvp0qV1h1darbPxoal6eXVb0nMRF6C4c2bNmlX3ASKgmzZtipQ7dy9i9mX6kGsQeBvwJYI3q2b6pMt6cxFsvCVudvMALZfLkYsM0XHJJAfrbpBzLI0quavI9brSrnDUe1AWzkd4Xiyg55xzTuKDg6SohGp0E+4q7969u1ZBwvlw+2kmtofzUq/Bil8XpzM8nubsdw/0qEZfLnJLktHo/rp8hOfmhp14eHNjsM7m8LUJO5xWyllSGjeNKGm2hGry7IfR0VEtl8u1BiEpL0uWLKkNO4XzFnd8lUpFTzjhhNp9cJHfvHnzIkNX4Z6UajSCdY7J6QrbEh9SqFcO0hAvM/F7uHbtWt23b5+uXbs2cjxeR+OzCJrN1wYSH4qFG6c8pmm9H/gzgpcCvgM8A/x3n7RZbkkOtt5+vKCHL7T7PRzF+chspiNpDLZYLOqGDRsiTjpcCMMFIqkQNoou69noxnWdjgULFtTm1vo0JEmOJt7QAHr++edHZJ1//vl1K1fSjAw3fOIaOzfE0ErljOfD5X/OnDl1p22JSO0BV5KOtHbUKyNuDDC8H3YKjaYShc8Ld4vjDnXRokURpzd37txI2te97nWRccp4Y+ScqwsIkhzmtm3bIvkLTyeMPxTLwsHWa9jnzJkTeZDsGo969Sf+oMw9bE3qGYUbOydzdHQ0MqSTxxDBaxO21/ikzXJL42CT9htFo63KjO/Ho880hS7usJMKGdXIpZkjCtsRLvxZ5dM57nCrXm9mRj0n7uZhNmtomjVuSee4YyMjIxEbnSNJOrfZsaT70+z88BigcxLhMfTwtanX6AJ6ww031OROTU0d01U/8cQTa2ncbAm3uUg9PFvCyYnPMa1XZpyjqTeMkbYcNaNenXW6XCMTbmyS7o97kBl/VpD04M/pCwdi9aZFkvFiL3ckHLvTJ22WW7sOtt5+HjKz1FGvYZjOfAG1B2euIIb387Sh2TnhCpc0/pmFg/Xdb1ZhfRsTN/vBbevXr4/su4dkcQfknEw4Um/m7JJsKJVKkR7HwMBAbYJ/Fo2Tb/rwfQ3/rVfuBgYGEsuAy0f8fHfP6tW3kB2ZOtjHCL1gACxjGj8Z003Or5t0dFKni5LiL264492QL0AvuOCCSI/CveTRTF6Wdrfbc3LRlotE3V933A0JuR5EoyGhuA7fBtE50njk2IqDrbcftyvpfHfeyMhIZGijXj5c2Qz3Yuq9ZJSm3GXtYF8D/D3wJPATgrUDNvikzXIzB9tdOuMvVdQbz5yufIXf4gEiMzCaVehuykd4HDS8hR2o75BQq3ZD9C25+BBBFjp89gE9+eSTI+Xu5JNPbuhg3ZuSbijk6quv7joHezqxrw3Qg2OwPhcur/1O6LB8Rfcbdc+7IV9p0mQ1pt6q3XDsQy43Yb/TDjZNROpm1IQbplKpVIvA09gQsyNTB/tEwrHHfNJmuZmD7T6dndDRjs
5uHL/upI4sHWy4N+D2O+1gw284wtHhkHrT+9zwibM1vFpdNznYncAdwLsIlhm8A/ifPmmz3MzBdp/OTujo13x1QkdWOuPTl+JzdbPQ4bMff3HAZ3pfHtG/r4Mt4MdlwMcJXjiYCfwP4BLPtIZh9Djbtm1jxowZkWMzZsxg27ZtHbVj06ZN3HrrrSxduhSApUuXcuutt7Jp06a6aW6++WZeeOEFAF544QVuvvnmjtgKeDvYucAi4CVV/W8Eb3SVc7PKMIyuYtOmTdx0000MDw8DMDw8zE033dTQseVpy0MPPQTAQw89NC02+OLrYL8M/AbBEAEE4x9352KRYRhdSS85tm7B18HOAT4FvCAiM4ELCJysNyKyRUTuEZEnROTtoeP7q8cn0sgzDMPodnwd7BuBG4BfE8yB/XfAxWkUqeotqrqO4KWFOwBEZAC4W1XXqer+NPIMwzC6naLPSar6bRG5CJgABoCnVPWltMpEZDnBIgk/rx4aAyZF5EJV/VydNJsJVvBi8eLFaVUahmFMG14RrIhcCvwQ+CxwJ/BjEXl3C/ouAL7gdlT1AHAmcKWInJSUQFVvU9U1qrpmfHy8BZWGYRjTg1cES7BU4SpVfQ5ARIaBhwg+x52GdcAt4QOq+qKIPAPMA/amlGcYhtG1+DrYQeAjIhI+doKITAGo6qWechao6vMicgmBM10EvAd4HPiepwzDMIyewNfBbgRKsWO3plWmqiurf+8IHf5MWjmGYRi9gK+DHSNhWpaq/p9szTEMw+gffB3s31W38BiBAuZgDcMw6uDrYPenGGc1DMMw8HewLxeRfaF9IVhV5mU52GQYhtEX+DrYWar6YviAiMzJwR7DMIy+wfdV2YcSjt2XpSGGYRj9hm8E+6yIvIPgu1xK8PbVC7lZZRiG0QekmQf7LuA8gqj3EeD38jLKMAyjH/B1sCtU9U/djoiUgCuBP8/FKsMwjD7Adwz2TSJyl4gsE5F3Ad8Ens/RLsMwjJ7HN4K9AVgN7AK+Q/CNLnOwhmEYDfB1sB+p/v0RMAJsJ3jYdXYeRhmGYfQDvgtur8/bEMMwjH7Dd8Ht3xaRfxCR50XkJyKyR0TW5WybYRhGT+M7RPCXwKuB+1R1mYisIviu1tLcLDMMw+hxfGcRPFr9msH26v6DwJP5mGQYhtEf+I7BXlj9+9Hq35eAc3K0yzAMo+fxjWAzQUT2i8g9IjJR3V9QHdv9hoiMdNIWwzCMvOmYgxWRAeBuVV2nqvurh88DPgV8BYuIDcPoM3xnEfwnEXlSRI6IyD4ReUpEHkupawyYFJELQ8deBuwHniL4AGKS7s3VWQt7nn766ZQqDcMwpg/fCHYL8BvA46r6MlVdiP8MBABU9QDBKlxXishJSafUSXebqq5R1TXj4+NpVBqGYUwrvk7ys6r6ooh8VUS+CfyKYOnCVFRlPAPMI/hs9z5gApiPfbbbMIw+w3cWwR9X/14pIsuAgqp+P40iEXkD8B7gcWC5iMwGvgR8HvgX4INp5BmGYXQ7Xg5WRK4G3gDsAP4EKIrIh1R1e+OUR1HVzwCfSfjpDF8ZhmEYvYTvEMEVwAbgu8Bi4EXgUY6+eGAYhmHE8HWwBeAtBJ+J+ROCr8q+lJdRhmEY/YCvg72S4EHUVaFjb8/eHMMwjP7B18H+W1V9R66WGIZh9Bm+DvZCEfl2/KCq3paxPYZhGH2D74sGRYL5qgtjmzFNlEqlyF/D6FaaldV+Lsu+DvZeVb1eVa8Lb7laVof4zcjj5vTCDX/22WdRVZ599tmWZXSi4PfL/WlW7nqhzLRCFvmKl9W4zCzKcpy0dud2/1S16QbckXDsTp+0WW6rV6/WwGSt+3dsbEwBHRsbS9wPn+uIn5NWpu9+OzLidjezMSlN2nx2Qkcr187Hrvg9brSflL5ZPtoth0n58rErTT58yl0zmfF8dcO187Gjmd3N8tns/gB71Md3ep0EjwGvCO0vA37kkzbLzcfBpr157chK+zdLu3
10tVvQfXRklb8sr10zJ5KmMnayTLTbOLVy7dLKbCa3U9csrc74/61cu3AaPB2s70OutwGfEJF/Vd3/GcHUrZ7DdUdEZLpNSUUrdncir+Gu1cGDB3PTk4Z4vuM2dmsZaGZX/Pcs8pGHzF6gU/XJdy2CrwNrvaUaxw29UCF7wUbDn1KpFBnP9WnYpysQ8F0P9o9E5Mci8tPq/qCI2GuyhnGc0Q0P81yDqer/YCyPB2k++M4ieA8wCfxzdf8l4PW5WGQYRtcyXY6qV/F1sPcAtwPzReQvgPuAv8vJJsMwjL7Adwz2bSKygmC5QQH+RlUfytUywzCMHsd3Pdg3xg6tEpFVqrojB5sMwzD6At9pWstC/78J+GQOthiGYfQVvg72/dW/FeD3VfW9aRWJyMUEH098Cdigqkeqx/cDjwCX6NHPeRuGYfQ8vg+5HgUeJvhkzJ+2qOtOVf0t4ADBVxEQkQHgblVdZ841Hfr+uXDtaPDXyB273kYrSPAmWIeUicwi+Mjhuar6kojMB+4CblTVz9VJsxnYDLB48eLVe9986OiP1/4cEalNIm/lb1VHWzL6VRfXjja81r2Wn37XVe9+dXt+mpWz6bxP9XQAD6jqmqY+zwlpeJLId4GZBJ/ZhmAmgarq2U0TR+V8GPi4qt4fOjYI3A28VVX3Nkq/Zs0afeCBBzK5OdNd+Nq1uxsKekd0hM/P61rFbPLJV1v58HR+bZftDly7pDJxzPVsst/JMp5VnZXrnsvUwZYIPhezFPhrVd3VNNGxMlYBl6vqHyX89rfAB1T1wUYykhxsHgUi1wpcR0e7hdDHSUyHjqwcVVhX3hW4lTLRqg1J1zBPJ9GvujIvA03qLBlHsG6a1ijBxw9nEjjaDzZNfFTGVQSLxuwHvgJ8FVhE8JbY48AbtYkxrTjYVrq5vdzidrTh8MhPr167ftLVjo6m9aeF6D+tzukc+qing4wd7JuSjqtqR6dr+TjYThSI6YhWOtF97wbn0InK1A1jy0kRUtqy28p96oZGtxt1pS13ZOxglyYdV9XHmibOkLyGCHrRSXRDQQ/yl2+F7ddrl4WuLBr2PIZXeuHatauDjB3sYYL1ByB4wAUtPORql15xsMeLrn7LT7/q6rf8ZKGr3cYJTwfr+6LBIeC/AM8BP1XVpzzTGca0Y3NYjThy3XNHnea1+enxdbC7CF6RHQaWiMgM4BpVtRW1jK6nU5XJMOL4rqb15vC+iCwH7sCWLDSM44pu6A04G2r/dzG+q2kNAOcBK6qHvg+szssowzC6k27oDTgbAG87pqth8B0i+DRwEPi/1f3zgDcDG/MwyjAMI0umq2HwdbCvBFaq6nMAIrITaPjWlWEYxvGOr4O9Dvi2iPwcUII3uq7PzSrDMIw+wNfBTqnqDglWv0JVD4hIJTerDMMw+gDf9WDvF5GLVfUAMCQinwI+m6NdhmEYPY+vg30dcIaIfINg/dbPquqq/MwyDMPofXwd7JeAVwGnVNNcLSJ/n5tVhmEYfYDvGOwluVphGIbRh/i+ybUXQETuVtXfzdckwzCM/sB3iMBxci5WGIaRCSLC2NjYdJthVPFysCLiPtP9ilYVicgCEfkHEfmGiIzUO+YpywoRnbkOeejoxvvnY1O7duedb/f66MGDByM6u+F6d4MNacni2vlGsG+R4OOERRGZ4baUus4DPkXwuZhzGhxrSL1C1OgixH9POr/ZOWkvdFL6tDIbne97HdLmq5kOHxrlq9X71+49TmtTPI2P3Z24lmnKodOpqqmudzMbfH7P8tq1amc7zjGza+cENTxJ5Ang1xxdbLuqW1/urUjkfcDDwAnADFW9JelYQrrIZ7uffPJJp9z9nrjvc067++5YljrHxsY4ePDgMb/nmQ9fHY3yWc/u+O/t6Eirs16+8igjadOEbYjno1m+8shHPRu64do1yrdvGWh27dLmS0SyW3BbVSs+56UgyasnenpVvQ24DYIvGuzduzf+e8N9n3Oy3u8VmW
l1ZGFTJ3Q0k9EJHc32u8GGVmROh93dkO965zTDdwz2lSLyORF5rLp9TkT+TUpd+4AJYGH1/3rHDMMw+gLvtQgIuun3V/dPB3YCy1Lo+hLweeBfgMdF5IzYMe9PgBuGYfQCvg72F8CpBN/kUoLZBL9Io0hVfwackfBT0jHDMIyex9fBbgTeBVxY3f8B8Hu5WGQYhtEneM0i6BZE5GlgLzAfOBD6qd39LGR0gw7Ll+noBp39oqORzpNUdZxmqGrdDXiK4OGT+7svvN8obZ4bsCfL/TxkTocOy5fp6Aad/aLDR2ezrdkQwfOqekqTcwzDMIwEmjnYE0TkAwQvGfwS2A88AtyvqkfyNs4wDKOXaTYP9j3Ao8APgZ8Cw8BbgB+IyEU529aI2zLez0PmdOiwfJmObtDZLzp8dDakpYdcInIS8HlVXZk6sWEYxnFCT80iMAzD6CXSrgdrGIZheHJcOlgR2S8i94jIRAayrheRD4nIchF5QETukvDSPO3JrIjIEyJyTxuyLpZgvd17RORfZ2FjTObSdm2syrxIRL4uIreLyHoR+baIfCJDmetE5BERuaNNmSeJyK8yvt9OZtv3OyTTlfG1GdrpZL46QzvPE5FPi8iZGdrpZK7MoP5sqeb5CRH547Q2HncOVkQGgLtVdZ2q7m9T1snAidXdS4A/A34GpF0Ip57MAeATqrquDTPvVNXfIpgcfU0WNsZkSgY2oqqfBtYBv0mw7sW/B04SkRMykjkE3Kiq7X5f7t0ED30zud8xmVnc70gZB36HbMplWObTGdk5B3i7ql4E/HZGdoZlPteunap6SzX9Y0AlrY3HnYMFxoBJEbmw6ZlNUNUfAS4iehnBNLangEUZyRwB1onIa9uQpyIyqyrrVxnZGJY5p10boVaBHwH2AONVO/cTrLSWhcwZwO+LyCvbkFeq/nuAjO53TGbb97tKuIxnYmdMZlZ2ngUsF5HdGdoZljmehZ0ishx4nCDfqWw87hysqh4AzgSurM6GyEVNJkJUvwucD3xQRGa2IeqDwPuAl8Li27HNyVTVB8jARlV9CVgOjBJEcrWfMpK5C7gc+Gir8oBLgb9JUpWFzKzud7iMk921DMt8Ngs7gRJwC3An8NYs7IzJfBV0yRSqAAADzElEQVTZ2HkB8IXYMS8bjzsHC6CqLwLPAPMyFJvL2raqehh4kaCLmxoRWRWI0fuzsjEms20bHVWHWCSYcz0BLCCIGLKQWSa457PbEPdy4D8SOO1zyeZ+12SKyOYMr2W4jGdSLsMyM7LzaYK59S8A/5yRnWGZkpGd64Bv0EL9Oe4crIi8QUS+CRwBvpeh6DuA6wicwoNZCBSRd4rIt4B7VPXnLYpZD5xdHej/SkY21mSKyH/OwEZE5N0i8g2CNwY/ThDVPamqz2Qk83LgPuDDrcpT1f9QHcP9AcE35Nq+ljGZQxldy3AZf18WdsZkviYLOwmc1msIXl56dRZ2xmS+lJGdC1T1eVqo4zYP1jAMIyeOuwjWMAyjU5iDNQzDyAlzsIZhGDlhDtYwDCMnzMEahmHkhDlYwzCMnDAHa0w71UVOvlX9f1hEnmnzzRvD6ArMwRrdxmaC1x0No+cxB2t0DSIySLD605PAH4rIf60ef8JFtCLyQxH5sYg8Ud2/vRoBnyMit1eP3SMiPxKR74jIy0TkD6vbCSLyYEjfL0Vkb/X89SLyqerxj4nIe2P7Z4jIdSLy8eqxa6W6nGLVht8Jya3ZbhzfmIM1uonXEyzS0ej1wn3AZMLx94X+rwArgO8D4c8aXQXMAqiu53kfwTJ5AN8NyT0N2BHbf5Bg0ZjTRaRA8GrnmEeejOMYc7BGN3ERsDO0/zYReYTo0nAzEtKtI1hPNcwPgdcCX6vuzwWWcnSRDrd8IwCq+iwwICKjwIuq+k+x/V9UZXybIMr+SVWG4zYJFvduZ6k9o88wB2t0C4uBfVVH5vhrVT0V+CeAqrN7ISHtZo5dxGUpwYIxbhm8txBdqn
AJwbqeYe6ryvpWnf05wN3ADQTLH4ZX5toMfAZ4Z90cGscd5mCNbmEh8Mkm55xN4PTiPAQcjB1TgpWf3DJ1s4D/FZP1rViarxF8qv6rdfZnAfcTOOevAfGZDr8EBpvkwTiOKE63AYZR5eHq4t31OIFgGcNfEQwlLBKRd1V/i3+r/h8JPvHxc4Ix1g0E0bCKCCLyKuD66u/vBU4UkdcTOODbgHuqcuL7swgWLX+Vqh4QkeHq8QME0fFB4A8IxmcNw5YrNHoDEakAd6jqq6v77wAmVPXaFmStA97hvs9VfeL/CEFEe52qXlw9fmp43zDSYhGs0Sv8jGAWgOMuju2i+/L/gL8I7d9O8HmRLxBEx4jIxQRR7kUt6jAMi2ANwzDywh5yGYZh5IQ5WMMwjJwwB2sYhpET5mANwzBywhysYRhGTpiDNQzDyIn/D9AEAU2RjVxqAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x1555ad65c88>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Нормализовать по размеру библиотеки\n",
"# Разделить количества экспрессии на суммы количеств\n",
"# для конкретного индивидуума\n",
"# Умножить на 1 миллион, чтобы вернуться к аналогичной шкале\n",
"counts_lib_norm = counts / total_counts * 1000000\n",
"# Обратите внимание, как здесь мы применили трансляцию дважды!\n",
"counts_subset_lib_norm = counts_lib_norm[:,samples_index]\n",
"\n",
"# Коробчатая диаграмма количеств экспрессии на индивидуум\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(np.log(counts_subset_lib_norm + 1))\n",
"ax.set_xlabel(\"Индивидуумы\")\n",
"ax.set_ylabel(\"Лог-количества экспрессии генов\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_09.png', dpi=600) "
]
},
{
"cell_type": "code",
"execution_count": 87,
"metadata": {},
"outputs": [],
"source": [
"import itertools as it\n",
"from collections import defaultdict\n",
"\n",
"def class_boxplot(data, classes, colors=None, **kwargs):\n",
" \"\"\"Создать коробчатую диаграмму, в которой коробки расцвечены, \n",
" согласно класса, к которому они принадлежат.\n",
"\n",
" Параметры\n",
" ---------\n",
" data : массивоподобный список вещественных значений\n",
" Входные данные. Один коробчатый график будет сгенерирован \n",
" для каждого элемента в `data`.\n",
" classes : список строковых значений той же длины, что и `data`\n",
" Класс, к которому принадлежит каждое распределение в `data`.\n",
"\n",
" Другие параметры\n",
" ----------------\n",
" kwargs : словарь\n",
" Именованные аргументы для передачи в `plt.boxplot`.\n",
" \"\"\"\n",
" all_classes = sorted(set(classes))\n",
" colors = plt.rcParams['axes.prop_cycle'].by_key()['color']\n",
" class2color = dict(zip(all_classes, it.cycle(colors)))\n",
"\n",
" # Отобразить классы на векторы данных\n",
" # другие классы получают пустой список в этой позиции для смещения\n",
" class2data = defaultdict(list)\n",
" for distrib, cls in zip(data, classes):\n",
" for c in all_classes:\n",
" class2data[c].append([])\n",
" class2data[cls][-1] = distrib\n",
"\n",
" # Затем по очереди построить каждый коробчатый график \n",
" # с соответствующим цветом\n",
" fig, ax = plt.subplots()\n",
" lines = []\n",
" for cls in all_classes:\n",
" # задать цвет для всех элементов коробчатого графика\n",
" for key in ['boxprops', 'whiskerprops', 'flierprops']:\n",
" kwargs.setdefault(key, {}).update(color=class2color[cls])\n",
" # нарисовать коробчатый график\n",
" box = ax.boxplot(class2data[cls], **kwargs)\n",
" lines.append(box['whiskers'][0])\n",
" ax.legend(lines, all_classes)\n",
" return ax"
]
},
{
"cell_type": "code",
"execution_count": 88,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADVCAYAAAAFF3g7AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xl4VdXV+PHvAkKgMpVBmQokgCWJwotGQGzLUAUUS9WnlUF/iDxAJ2kpr1VfYzWgKKJirbzFH7QRVCZrteoPpEkrYCkOgAoIBASJGAUZFMJMCOv3x7m5JiEh+yY5ufck6/M890nuuWdY93JZ2efsfdYWVcUYY0zF1Yl2AMYYE3SWSI0xppIskRpjTCVZIjXGmEqyRGqMMZVkidQYYyrJEqkxxlSSUyIVkYEi8qiIXCoiL4nIayIy0O/gjDEmCMRlQL6IbAdmAP8LXA/kA/NVtaO/4RljTOxzPbWPA3YD+wEB4kM/jTGm1nNtkT4ItC+xOFdVf+9LVMYYEyBOiRRARARoE3q6V1XP+haVMcYEiGtn0xhgO/BX4CVgm4iM9TEuY4wJDNdT+53Apap6PPT8W8AmVe3sc3zGGBPzXBPpX4GTwHuAAlcAjVX1Jn/DM8aY2OeaSOsCg4CU0KItQKaqnvExNmOMCQTXRDqotOWqmlnlERljTMDUc1xvZOjnDcAreGNIFbBEaoyp9ZyHPwGIyBZVTfYxHmOMCRynFqmIvI3XAu0kImsKl6tqX78CM8aYoHC9RlrqPfWq+qnDtlOBJqo6SUSuBBaqasL5tmnZsqV26tSp3LiMMcZP69evP6Cqrcpbz/UaaR1gDN6Qp8kiEgf0As6bSEWkC3Ah3tApgHHAnjLWnQBMAOjQoQPr1q1zDM0YY/whIuU2FsG9aMlrwGfAjwBUNR+YV95GqroDWBwKKBn4GDhdxrpzVDVVVVNbtSr3D4AxxsQM10R6EsgFEJEkEfkNcDjCY90GZES4jTHGxDzXRPoT4GpgG/A4XiWoGyI8Vlvgj0CyiNwY4bbGGBOzXK+RjlLVOytzIFX9PwAislJVX6nMvkztlJ+fT25uLidPnix/ZWMi0KBBA9q3b09cXFyFtndNpGNF5HFKFHNW1VKvd5ZYZyWwssjz/u7hGfON3NxcGjduTKdOnfCqOhpTearKwYMHyc3NJSHhvAOKyuSaSOPwTusL72gq/JlYoaNWk4r8Z4vkBgVTvU6ePGlJ1FQ5EaFFixbs37+/wvtwSqSq2qnCR4iiMpNielNIj7SvzMQCS6LGD5X9Xtl0zMZXIhLxwwSXy/Xr06dPc/ZszZpgwxKp8ZWqlv54oEmZr5ng+cc//kGPHj2YMmVKmescO3aM8ePHk5KSwueff16N0fnP9V77Z/GuiYYXAaqqNt2IjyraOrNkZKrT7t27mTx5MpmZmbRr167M9e688066dOnC3LlzqzG66uHa2fQCcB/evE2zga99i8iE2TVeEwSvvvoq48aNO28SBcjKymL79u3VFFX1cj21/xjvXvs1wCzgf4AWPsVkTExauXIlI0aMAKB///5kZ2eTl5fHwIED6dKlC2+88QYA48aNIyEhgcGDB5OXlwdAx44deeuttwBISEhg1apVpR5jzJgxtG3blrZt2zJmzBgAnnvuORITE7nhhhvIz88Pr7tx40YSExO5+OKLWbBgAQALFiygY8eODBs2DFXl0UcfJTExkZtuuomzZ8/y7LPPMnDgQADmzZvHgAEDAMjJySE1NZWUlBR27tzJvn376Nq1K127duWxxx7j9OnTJCZ6g3Sef/55HnnkkXAc27Zt409/+hPJycksX748/PkAzJ07l/T0dA4cOMC+fftISUnh5ptvDr+PwvfbsGFDsrOzmTdvHvPmzePgwYP06NEj/D6Tk5Pp2bMnu3fvZtq0abRr147GjRuH1xk3bhydO3dm1qxZxf59br31VjIzM0lPT2flyp
VkZ2dz9dVXR/6PXw7XRDoFeADoj5dU44E7qjwaYwLmhRde4Morr+TNN98kLS0NgB07dvDGG2/Qq1cv5s+fD0B8fDyvv/46GzZs4MSJE3z9ddkndRkZGWRkeHdTqypTpkzh/fffp0WLFixdujS83ldffUWvXr147733uP/++ykoKOCWW27h008/5dSpU2zYsIE777yT7du38+GHH7Jv3z4OHz7Mtm3bOHToEMuWLQtfPnr66aeZOnUqkyZNYv78+Rw/fpwWLVqwefNmMjIyOHLkCE2bNuXQoUO8//77pKamhuM4duwYDz74IC+99BL33ntveHl+fj6PPfZYeJ0OHTqwadMmgPAfnYKCAubPn0/v3r2LfQZPPPEEJ06cAOChhx7iqaeeYvLkycycOZO0tDQWLFjA0KFD2bBhA5s2bSIvL4+PPvqImTNnhvexfv16Dhw4wKBB30zwMX36dM6cqfoZklyHP91e5Uc2ppI63bO0/JUilDN96HlfX7ZsGd26dWP37t0AbN68mUGDBtGhQwcOHDhQ7HJM3759eeUV7ya+73znO2RnZ7N8+XKuv/56jhw54hTPgQMHaNKkCc2aNaNXr1589NFH3HBD8buzmzVrRsuWLfnyyy/JzMxk+vTp7Nmzh0OHDvHJJ5/Qp08fUlNTad26NXl5eVx99dX84x//ID8/P5xIs7Oz+fvf/w7Aj370o/C+69evT/fu3dm2bRu9e/fmgw8+4MMPPyzWqVSvXj0aN25McnIyX375ZXj5ggUL6NevX3idhg0bUq9ePfr27cu2bdsAL8E2bdq02PvJy8tj+/bttG3bNvwZ9+rVi71794b/MBWVnZ3Nm2++Sc+ePcPJF+Cuu+4iM/ObSTx27txJgwYNnD73SLl2Nn0INAC+KFyE19k00JeojHFQXtLzw3XXXcfixYvDp65QdqdgnTp1KCgo4Pjx4xQUFNC+fXveeOMNbr31Vo4fP+58TJdOx8Jj3X///WzYsIFf/OIXAHTt2pUvvviCYcOG8fXXX3P06FGGDBnCjBkzGDFiBC+//DLgtXxfe+01UlK8+S1zcnLO2ff3v/99Vq9eTX5+Pk2aNAm//u1vf5vDhw+H1y300ksvMWnSJFavXk2zZs3ClzmKDnPbtWvXOddWMzIymD59OtOnT3f6DFSV22+/Pdz6LdSzZ0+ysrLC72nWrFksXLgw/NlUJddT+4HA34CDwHRVHWBJ1BhITk5m7dq15Obm0rx582L/4deuXUvXrl3DraQf/vCHNG7cmHr16jnXC2jZsiWHDx8mLy+PdevWhZNCUUePHuXzzz+ndevWqCrx8fHh144fP058fDzx8fHs2rWLEydO0KZNG06dOsXAgQPDcSQlJZGVlQV44zwLFRQUsGHDBjp37kz//v2ZPXs2ffsWnxjjsssu4+233+azzz6jTZs24eV9+/alXj2vrXbBBRcQFxfH3r172bhxI0lJSWzZsoVTp06dk0hPnDjB4MGDz/mMy3r/SUlJrFixgoKCgmKxp6enM3v2bA4ePAhAq1atSEpKKucTrxjXRHo93i2ibwGPishWEZnsS0TGBMitt97KW2+9Rb9+/Zg2bVp4+TXXXMPLL7/M7bffHk6kQ4cOZfbs2YB3SpuXl8fNN9983v2LCPfddx89evRg7969DB1avBW+dOlSUlJS+M1vfkNcXBy/+93vSE5O5j//+Q9NmjThgQceoHPnztSpU4cePXqEY3n11Vfp2bMnx44dA2DSpEk8//zzdO7cmXfffReADRs2kJiYyODBg2nbti3t2rXjwgsv5Nprry0Ww4033sjGjRvp168f6enpgHcqP3Zs8dGR6enpXH755ezfv59BgwYxfPjwYp1WhcaPH1/sD1JaWhp33HEHM2bM4Le//e0561966aX06dOHxMRE7r777vDyRo0aMWHChHAH1IQJE877WVdKmQOmiw+Qvq2Ux2iXbSvyuPzyy9VXDzTxd/9+C3r8qhV6D1u2bPEhkK
rXr18/3bp1q+/HWbFihQ4fPtyXfe/atUt79+5dbNnp06e1d+/eeubMGV+OGW2lfb+AdeqQs1zHkV6qJcroiciTwHPlbVg4ZxOQA9wMfK6qP3U8rjEmBuzatYurr76a3//+99StWzfa4cQc10R6o4g8pKqHAESkOTAMOLedXUSJOZueUtU/iMhaEYlTb7oSY2qclStXVstx+vfvX6zTqyp16tSJd955J/w8ISGBnTt3+nKsmsA1kaYBa0WkcMxGcyC9vI1UdYeILAZuUFUVkYuAz0pLoiUnvzPGmKBwTaSrVLWriLTA66A6CERUAVW8q8ezKKMVq6pzgDkAqampdrO4MSYwXHvt1wCo6kFV3a+qZ4FlER7rRmCtqjpNb2qMMUHh2iL9WESm4yVUBa7im8H5rgYA3xeR64Bx6k3VbIwxgefc2QTcAlyD14rNxnEWUS0xZ5MxxtQ0ron0JF4JvYOq+jcAEWnoW1TGGBMgrtdIlwHjgCcARCQOWOtXUMbEooKCAkaOHEmHDh149913w6XaCqWnp/PMM8+cU4Ku6GuFipbkmzt3Ll27duWWW24pdrzCbf785z9z7733cubMGX7605+SkJAQrg4F3l1EnTp1ok+fPsC5JeyKHqt169bAuWXzACZPnkzHjh158MEH6dKlC3FxcXTu3Jnly5czevRoEhMTSUpKYtOmTRw9epSkpCQSEhLYuHFjVX7MgeSaSDsBPwcKKy0kA9YiNbXKP//5TwoKCti9ezdXXHFFmeuVLEFXeK93ac6ePcsf//hHsrOzycnJOWes5tGjR5k1axZ33303mZmZ1K9fn02bNvHwww9TUFBAQUEBF1100TljV4uWsCut4EfJsnnbtm1jzZo15OTkkJaWxo4dO2jXrh2bN29myJAh7N69m2XLlnHTTTfx1ltv0ahRI7Zu3crDDz9cLKnXVq6n9r8F/gW0EZEcYD/wM7+CMsZJetPy14l4n2XPPLBp06ZwwY7CKkfXXXcdKSkpLFq06Jz1i5agA5g6dSp/+ctfePbZZ8PrHDx4kJ07d5KSksKRI0c4cOAAnTt3Dr/+5JNPkpaWRtOmTcPl5Bo1akSrVq3Ys2cPTZo0OacMHRQvYdeqVSs+/bT4YJmSZfM2bdpEnz59zjsB4TXXXMOhQ4fYvn07OTk5jBgxgj179oSLQ9dmrvVIlxH5cCdj/FXN062cPXv2nCSzbNky5s6dywsvvFDqNoUl6ADuv/9+2rVrx0MPPcTPf/5zwKt1kZKSwtq1pV8p69mzJ5mZmfzyl78Ezm1dfvLJJ6VO8VG0hF1ycjIJCQkkJSXx1VdfhY9btGzeiy++WG65vqysLFatWsXjjz9O3bp1ue2220hISGDx4sXn3a42cDq1F5H+oVs7c0TkUxH5SETOX7bGmBqmW7durFmzBig+n1bDhg2LTQFSqGgJurLWLSzIvHfv3mIl4Apdf/31nDlzhqysrHA5uWPHjrFv3z5at27Niy++WGqLsGgJO4CFCxeydetWmjdvDpxbNq9bt268/fbb5c7kGh8fz6lTpygoKPCtSHIQuV4j/RMwXFU7qWpH4DpgejnbGFOjDBkyhPz8fLp06cKqVau48MILGTx4MFlZWQwfPrzYuiVL0DVv3pypU6fyq1/9ikmTJoXXq1OnDg899BC9e/cu8xR5ypQpPPzwwwwaNIijR49yySWXcM8997Bq1SqysrLCrdtCpZWwK6lk2bzu3bvTs2dPOnfuzMKFC89Zv2PHjgwaNIgHH3yQCRMmMH78eJ544gnGjx9P+/btXT/CGkvO99cnvJJ3XbTkp/tz4BkAVb235DaVkZqaquvWravKXRYX9Fk4gx4/VOg9bN261bfCvFWp8Pph0aIfJvaV9v0SkfWqmlrGJmGunU0/A1qXWHbeyk/GGFNbuCbSraFHMaq6u2rDMSb4SpagMzWfayLdBnyAN+ldIQX6lr66McbUHq6JNFdVLWmaqFNVp1k1jYmES1/R+bgm0v
YisqaUg1tyNdWmQYMGHDx4kBYtWlgyNVVGVTl48GClhnO5JtJuFT6CMVWkffv25Obmsn///miHYmqYBg0aVGoYl2sifUFVv190gYj8B68u6XkVmfxuDvA88CUwVCvblja1TlxcHAkJEU3MYEy1cB2Q30xEBotIExFpLCJD8OZtOq8ik98BjADux0uk/1WhaI2JskWLFnHJJZdQt25dLrnkklLvsTe1j2sivRlvLvv/AO/gldQbUd5GoSr4hTfitgX2AnuAc24OFpEJIrJORNbZqZuJRYsWLSItLY2nn36akydP8vTTT5OWlmbJtJrF4h8z10R6VlVHqeqlqpqCl0S7V+K455zWq+ocVU1V1dRWrVpVYtfG+GPatGmMGjWKiRMn0qBBAyZOnMioUaOYNm1atEOrNWL1j5lrIn1GRJ4RkeYiMgzYAJR721QJX+DdHdWGyOd7MibqtmzZwoIFC4r9J16wYAFbtmyJdmi1Rqz+MXPtbBqCN2/T23iD84cBn0V4rMXAc8A+vERsTKDUr1+fiRMnhouLDBgwgIkTJ3LvvVVaasKcx5YtWzh27BgZGRl873vfY/Xq1YwdO/aceqvVzTWRZhf5/VLgn3in54nlbVhi8rtIW7HGxIzTp08za9YsevbsGf5PPGvWrFLL3xl/1K9fn6uuuoqJEyeGi4xcddVV7NmzJ6pxOZ3aq2pCKY9yk6gxNUlycnKpp5XJycnRDq3WOHXqFEuWLGHs2LEcOXKEsWPHsmTJEk6dOhXVuFwLO/cUkddF5GMR2SYiL4mIfXtMrZKWlsbChQuLXSNduHAhaWlp0Q6t1oiPj2f48OFkZGTQuHFjMjIyGD58OPHx8VGNy7Wz6VlgEnAGbwzoQuBFv4IyJhaNHDmSoUOHcu2111K/fn2uvfZahg4dysiRI6MdWq1x+vRp1qxZU+yP2Zo1a6J+ecU1kZ5W1Z3AYlU9AbxKKUOYjKnJFi1axJIlS2jTpg116tShTZs2LFmyJOpDb2qTWL284nqNtFfo55TQzwKgp49xGRNz7rrrLurVq0dGRgYnT54kIyODevXqcdddd0U7tPMqnBk0kkesitXLK6699udQ1TNVGYgxsS43N5fMzMxiw5/mz5/PoEGDohzZ+ZVV1qLTPUvJmT60mqOpnMLLKEV77adNmxb1yysVTqTGGBMNI0eOjHriLMm11/4WEflARD4RkV2hx7/8Ds6YWNK+fXtGjx7NihUryM/PZ8WKFYwePdpm0fRZRS5NVPflCdfOpkeBa4HTwHeBJMDqmZlaZcaMGRQUFDB27Fji4+MZO3YsBQUFzJgxI9qh1WiqWuqj493/r8zXqrtKp+up/TuquldENgHzgZPATv/CMib2FJ5OTps2DRHhggsu4OGHH46500xT/ZwSqar+JPTrCGAQXkv2n34FZUysON8p4ubNmxk1ahSjRo065zWrW167RHKNdAle8ZIHgUeAm/wMzJhYEITTShN9rqf2U4GfAUuBS4B8vCLPNhLZGFPruSbSuniVnr4CBuDNb3/Wr6CMMSZIXHvtH8EryPxM6GdrvFN8ZyKSIiJvi8g7IlLufE/GGBMUri3SCwtvD62EgXgJ+QfAZVhnlTGmhnBNpD8TkXfwTunDVDUzgmO9DqSHjrm65IsiMgGYANChQ4cIdmuMMdHlmkgbAyMpnkgViCSRtsObRTQRaAnkFn1RVecAcwBSU1Ot29MYExiuiXSHqo6t5LFuBl7DO60fDPylkvurOaZ3hJOHItsmvWlk6zdoBvdEd14bY2oq10S6ouQCEXlSVX8bwbFeBp7Guyvqhgi2q/lOHoL0w/4eI9LEa4xx5tprf6OINCt8Eup1HxbJgVR1lap2V9VeqmrTMRtjagzXFmkasFZEjoSeN8frODLGmFrP9V77xcBiEWmB14o9oHYfnDHGAI6JVER+DEwG4lW1j4jEichEVZ3pb3iO/O6ssY6a8tm/ganFXE/tH8erR/o6gKrmi8gdQGwkUr
87a6yjpnz2b2BqMddEuhkYBzQVkV8A1wDrfIvKGGMCxDWR/gSvhN4BoBGQASz3KyhjjAkS10TaF8gD3iux7K0qj8gYYwLGNZH+osjv1wBZeLeIWiI1xtR6rsOfRgKISBNgXeFzY4wx7sOf9uAVcs4nVnrqjTEmRri2SNv4HYgxxgSVa4s0C2+6kfAiQFV1oC9RGWNMgLh2Nt0B3INXuekPqrrNv5CMMSZYXBPpRcCzQBdgoYjsB+ao6su+RVbb2J07xgSWayIdUOT310I/L8WrMepMRIYCY4AJqvp1JNvWeFaP1JjAck2kj1T2QCLSCPiZqpZax9TmbDLGBJVrYecjQHaRx7bQz0h8D0gWkTdFpHXJF1V1jqqmqmpqq1atIty1McZEj2uLdA/eHU15wFeqWlCBYzUHZuF1WA0HnqrAPowxJua4tkh38U2hkhwRWS4iqREeaz9wAV4ilXLWNcaYwHAdkF+0swkRGYTX0RTJxcx/A3cCDYHREWxngsI6tEwt5Xpqj4h0By4JPd2MNxTKmaqexJuG2dRUVtjZ1FKudzbNBrryTRm9MUAOoV72mGD/0YwxUeLaIh0CdCnsZBKRusAO36KqCGsNGWOixDWR/l/gIxHZgleHNAX4s29RGWNMgLh2Nk0XkT8BF4cWfQyc8S0qY4wJEKfhTyKyCkhW1XXAFuB3wId+BmaMMUHhemo/HpguIqeBZGAe3/TgG2NMreY6IH8KcArvNs9DwBV41aCMMabWc22RPlPipzHGmBDXzqZVfgdijDFB5Xpqb4wxpgyuvfbNRGSMiEwKPa8jIgn+hmaMMcHg2iJdjndv/a9DzxV4w5eIjDEmYFwTaSPgBeCkiDQAhuElU2OMqfVce+1vAx4CzgLr8QblD4/0YCLSEdiuqvGRbmuMMbHKtdd+PfCTKjjeJLzbS40xpsZwLaP3IdAA+KJwEaCqOtD1QCLSPPTrgTJet8nvjDGB5HpqPxD4b7yiJXNVNbMCxxoFPA/0LO1FVZ0DzAFITU2166/GRKjHlEwOn8iPaJtO9yx1Xrdpwzg2PDAo0rBqBddEej3ezKF7gUdF5Cm8hDozgmMl8s1MohNCidMYU0UOn8gnZ/pQ3/YfSdKtbVwTaeFkdXnAH0K/R9RqVNXJACKy0pKoMaYkv1vU4F+r2jWRxgF/UVUFEJHvAncBz0V6QFXtH+k2xvgtyP+Jawq/W9TgX6vaNZF2BN4RkQeAQcAPgGm+RGRMFAT5P7GJPtdEuhpvsrs/A+8D9+GNKTXGmFrPNZGOCP3MCv0cjneNtCK998YYU6O4JtLXgVcKr5ECiMhIf0Iyxphgcb3X/omiSTTkwaoOxhhjgsi1RfqKiKwB3sE7pb8SO603xhjA/V77ySJyMXApXit2nqpu8jUyY4wJCNfCzgnALcBVqvpXIFtErvI1MmOMCQjXa6SvAZ8BPwJQ1Xy8KZmNMabWc71GehLIBRCRJLxB+Yf9CsoYY4LEtUX6E+BqvMIljwPfAW7wKyhjjAkS186mT4E7fY7FGGMCybWw8x68YU8CfBv4CkBV2/oXmjEmEjkNRkG6n/sHu6JXOtcWaZvC30Vki6om+xeSMaYiOp1c6Hs90hzf9h5sri3SRXgt0o7ABxU5kIgMB+4ACoBBqnq6IvsxxtRMfreovWOAH61q1177Z/AS6X5V3VrBY72oqktE5CWgA7CjgvupmdKb+rv/Bs383b8xleR3ixr8a1W7JtL+hCrii0h4oapOdT2QqqqINAQaA7tKvl6rJ79Lj/AvZHrTyLcxxvjGNZF+gddrvxJYgnd6XhEzgftU9ZztbfI7E01BPq000efa2TRXRJ7Fu010ErAGyIjkQCJymbcrXRtxlC78PDW20+IaL8inlSb6XDubVvDN8CeAwXhTNEcyAc0AYKCIrMRrla6OYNvzs1NjY0wUubZIB1T2QKr6BPBEZfdjYpidFZhayrVF+nu8gfiNgW8BDYFvqeoQH2MzQWJnBaYWc+
1s6gxcjlfY+WVgPRHOa2+MMTWV66n9GBGpAwzBm2KkE9Dbx7iMMSYwXE/t7wP6AF8CjwL/VNWv/AzMGGOCwrWMXhcgH68l+t/AShHZ7ldQxhgTJM6n9j7HYYwxgeU6Z1NzEfmriOwSkd0i8qqItCl/S2OMqflcT+3nAM+raoKqdgBmA8/5F5YxxgRHmaf2InI3Xi89QC+guYhMKnwZuEJE3gRQ1YG+RmmMMTGszESqqo/i9dAjIn8DFqrq30LPfwz8WlV/WC1RGmNMDHM9tZ8A3CwiOSLyGTAGGO1bVMYYEyCuvfYHgeE+x2KMMYHkOiD/Q6ABXl1S8K6Rql0bNcYY93vtB+INxL8YmKuqmf6FZIwxweKaSK8HtgF7gUdF5Cm8hDrT9UAichHwd+AMcJ2qHok0WGP81Omepb7uv2nDOF/3D/6+h6DHD/69B1Etv4iTiNxW2nJVne98IJGxeOX3WgIbVfWVEq8XnbPp8k8//dR11+c7ZsTbuHwe1aUi8UPw30PQ44fgv4egxw9V8x5EZL2qppa3nmtn03wRSQEuBeoCH6vqexHG1BbYinfPfrtSjlHlczbF0pehIoIePwT/PQQ9fgj+ewhC/K6dTTPx6pG+DZwFbhORr1R1RAWPG/ufjDHGOHK+RqqqFxddICKfRHisL4DWhE7tI9zWGGNilmsiLRCRL0osayUie/CGQbV12MdSvulscu6kMsaYWOd6jTSpsgdS1S+BKyu7H2OMiTWu10hbAzcDRady3KOqc32JyhhjAsR1+NMmYAFwBzAV786mtFBJvaoPSmQ/UPnxT2VrCRzwcf9+C3r8EPz3EPT4IfjvoTri76iqrcpbyfUaabyqTheREXgJ7hRwojLRnY9L4JUhIutcxobFqqDHD8F/D0GPH4L/HmIpftdEel3o5+3APXhVo8b6EpExxgSMayL9LfArVf0AqwJljDHFOLdIRaQr3rXRMFUN6kyic6IdQCUFPX4I/nsIevwQ/PcQM/G7djat5Ny7kayMnjHG4JhIjTHGlM11OubBIrI+NNVIjoi8LyJDyt/SGGNqPtdT++3AQFXNDT1vD6xQ1a4+x2eMMTHPdfK7z4CficiPROR6vLqhn/sXlr9EZKqI/CHacVSL/H8pAAAEOklEQVSEiAwXkX+LyEoRqR/teCIlIj8VkVUiMi/asVSGfYeiK9a+R66J9Aa86k2D8ea63wsM8ysoP4lIF+DCaMdRCS+q6vfx7ujw5c4yP6nqX4H+QC8R8b/kug/sOxR9sfY9ch3+1Ab4V+hRqDWQV+UR+UxVd4jIYrw/DoGjqioiDYHGwK5oxxMpEakLbAHeVdX8aMdTEfYdir5Y+x65JtIPgHeLPBe84VA2/Ck6ZgL3qWpBtAOJlKoWiEgy8LKIxKvqqWjHVEsF9jsEsfc9ck2kh4BpeC3QXFXd419I5nxE5DK8RsXaaMdSUaH/BPWA9sDOaMdT29SE7xDE1vfI9RppJjAa7z77pSLyUajTyVS/AcDAUEfB96IdTKREZJKI/Buv6E2ksyyYqhHo7xDE3veoQgPyQ03qxaravepDMsaYYLE7m4wxppJcT+2NMcaUwRKpMcZUkiVSY4ypJEukptYQkfoiYt95U+XsS2VqPBG5QETmApuBdtGOx9Q8lkhNbfA4sENVu6rqZ9EOxtQ8NvzJ1HgisgO4WFXPRjsWUzNZi9T4TkT6h4p8ELqbppuINBGRN0Vkh4hcG3ptiohkhH5PF5FnQ79fLiKbReQ9EWkuIt1F5BMR2S4it5Q4VncR2SIiH4hIBxFpiVepabOIvCgicaVtLyK3iMinIvKaeO4OrfOyiNQRkXkiMiT0mBfa5o8i8rmI/E+1fZgmJlkiNdFyK/A2XuGbaaFlTYErQh1CfYBvh5bfC/wUyMKruNQceA/oBUwNVQIqdB/wG7yiHJOBC4DdwKWh168tbXtVXaCqHYF4oAfe5YCLgf+ijJJ5qvprIAH4ZYU/BVMjWCI11eU6Ec
nGS14AKcB7qrobaCkiAjQB3serefsZXpk3gG7Ay8AtQLPCHarqIbyamhcVOU4KXpJ8D7gEOAOcUNUzwBrgu6VtLyJjQvH1CR0jEfgS+FhV95b2hkTkf4FNeGUmTS1midRUl2Wq2g0vwRUqeYG+EfAG8BBeoZxvhZYLcLmqdlLVmSW2OQvULbGs6H4P4SXowuUlj1m4/VTgytDxUdWPgbYAIvLtEtsU1pu4DEjCS8amFrNEaqJlC95pfHvgK/V6PRsCa/FOl1cADULrbgWuCV2rDCdNEWmEN5xpb8n9AqnAZlU9BuSLSGuge2hfpW0vwKkir30rVOPyVCiekuoCp6wDy4AlUhM9LwA/AFYBaaFlDYECoLeqHsC7vglea3E6kA10Ci0bijcu9KkSFdKnAbOAu4AnQ8vSgfVAK7yWbmnbP4aXhK/Cq7s7RUR24rVYN4S2eSb0+DFeK/dzEdkFHKv4x2BqAhv+ZAJHRPoDP1fVEVHafh5eGcnlFdne1DzWIjUmcqsJ8Cy6pupZi9QYYyrJWqTGGFNJlkiNMaaSLJEaY0wlWSI1xphKskRqjDGVZInUGGMq6f8Dh6Mg0sO/VXgAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x155000ac1d0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts_3 = list(np.log(counts.T[:3] + 1))\n",
"log_ncounts_3 = list(np.log(counts_lib_norm.T[:3] + 1))\n",
"ax = class_boxplot(log_counts_3 + log_ncounts_3,\n",
"                   ['raw counts'] * 3 + ['normalized by library size'] * 3,\n",
"                   labels=[1, 2, 3, 1, 2, 3])\n",
"ax.set_xlabel('sample number')\n",
"ax.set_ylabel('log gene expression counts')\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_10.png', dpi=600) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Normalization between genes"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {},
"outputs": [],
"source": [
"def binned_boxplot(x, y, *,  # keyword-only arguments: Python 3 only (see the tip in the book)\n",
"                   xlabel='gene length (log scale)',\n",
"                   ylabel='average log counts'):\n",
"    \"\"\"Plot the distribution of `y` as a function of `x` using many boxplots.\n",
"\n",
"    Note: all inputs are expected to be log-scaled.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    x: 1D array of float\n",
"        Independent variable values.\n",
"    y: 1D array of float\n",
"        Dependent variable values.\n",
"    \"\"\"\n",
"    # Define bins of `x` depending on the density of observations\n",
"    x_hist, x_bins = np.histogram(x, bins='auto')\n",
"\n",
"    # Use `np.digitize` to number the bins.\n",
"    # Discard the last bin edge because it breaks the right-open assumption\n",
"    # of `digitize`. The maximum observation correctly goes into the last bin.\n",
"    x_bin_idxs = np.digitize(x, x_bins[:-1])\n",
"\n",
"    # Use those indices to create a list of arrays, each containing the `y`\n",
"    # values corresponding to the `x` values in that bin. This is the input\n",
"    # format expected by `plt.boxplot`. `np.digitize` numbers the bins from 1,\n",
"    # so iterate over 1..max inclusive.\n",
"    binned_y = [y[x_bin_idxs == i]\n",
"                for i in range(1, np.max(x_bin_idxs) + 1)]\n",
"    fig, ax = plt.subplots(figsize=(4.8, 1.3))\n",
"\n",
"    # Make the x-axis labels using the bin centers\n",
"    x_bin_centers = (x_bins[1:] + x_bins[:-1]) / 2\n",
"    x_ticklabels = np.round(np.exp(x_bin_centers)).astype(int)\n",
"\n",
"    # Make the boxplot\n",
"    ax.boxplot(binned_y, labels=x_ticklabels)\n",
"\n",
"    # Show only every 10th label to prevent crowding on the x-axis\n",
"    reduce_xaxis_labels(ax, 10)\n",
"\n",
"    # Adjust the axis names\n",
"    ax.set_xlabel(xlabel)\n",
"    ax.set_ylabel(ylabel)"
]
},
{
"cell_type": "code",
"execution_count": 90,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABmCAYAAACZSRngAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAHiNJREFUeJztnX2cXEWZ779P9wyTkNcZZpwJmZsZMCqZCeqVrLuSiAG8Ebzs3eDbEuT6kvVtZHRZWTTsoIDiClkG2QXuur7c6KwO7qLr3l026OBtomYRJYDKSyKgBvlE4SYGjURhQnjuH3XqTPWZ093VbzPdob6fT326u069neo+v36qTtVzRFUJBAKBQOVkZrsBgUAg0OwEIQ0EAoEqCUIaCAQCVRKENBAIBKokCGkgEAhUSRDSQCAQqJIgpIFAIFAlLT6JRKQHeBOw2Il+TFU/XZdWBQKBQBPha5HeChwNvAt4DHgcuKRejQoEAoFmwldI21T1SmAf8AjwY+D3dWtVIBAINBFeQ3vgtdHr24FNGAHeWJcWBQKBQJPhK6RPisiZwG3Ad4CjgJ/UrVWBQCDQRPgO7f8DOBczpO8BFgDfqFejAoFAoJkQH+9PIvIgsBJ4GFgOCPCgqvbVt3mBQCDQ+PgO7X8B7AIOR69gRDUQCASe8/hapAOq+sAMtCcQCASaDt850n+taysCgUCgifG1SA8BezFzo2pfVfXY+jYvEAgEGh/fOdLdqvqCurYkEAgEmhTfof20PfUickqN2xIIBAJNie/Qfreq9ifiHlTVF9arYYFAINAs+ArpjZh50dsxc6SrgVZVfX19mxcIBAKNj6+QZoBXAydipgN2AV9X1UP1bV4gEAg0Pr43mxYCS4HDqjoaCWsf8LO6tSwQCASaBN+bTV8Hng+8P/qswC11aVEgEAg0Gb4W6Xzgi8DrRGQO8BqMmNaEzs5O7e/vr1VxgecY+/fvZ8+ePfT39zN//nyefPJJdu/ezdKlS+no6ADg5z//OXv37gVAROjs7KS9vX1aukDA5a677tqnql0lE6pqyQCcBHwFuA+4H7gJeLFPXp9w0kknaSBQKYODg5rL5fLicrmcDg4Oqqrq8PCwtrS0aE9Pj27dulVHR0e1paVFh4eH89IFAkmAHeqjkT6J6h2CkAYKMT4+roODg5rJZHRwcFDHx8enpclkMjo5OZkXNzk5qZlMRlVV29radHR0NC/d6OiotrW15aWbKXzOKdAY1FRIMTeVfpoMPnl9QhDSQBrj4+Pa1dWl/f39mslktLOzU7PZrIpIngCVskgBPXjwYF66gwcPKjDjFun4+Lged9xxmsvldHJyUnO5nB533HE6PDxcN3ENwl05tRbSNifcZ9/75PUJQUgbi3IvvHpdqL29vbpkyRLN5XI6Njamixcv1mw2q4D29/drV1eXjo+PFxQn2w5rkbrpNm/erK2trXnp0sDcCygZ50ua6I+MjGhra2vB9qtW3sel+iZQnFoL6V9H4UvArT55yglBSBuHci+8WlyohUQC0ImJCVWdEtXNmzfHluSSJUu0t7e3aBmqU3Oko6OjumXLFu3p6VFAOzo68tIVEs0klYqoavo0xMDAgIpIXpxrKVfTx6Ws9UBxai2kbwXeApwJHOWTp5wQhLRxKHTh9fb2ThOq8fFxbWtrmzbULudCdYfvIqL9/f0K6Pj4eJ6Q2vcTExOxkJVjGQ4PD2tbW5sC2tbWpsPDw/Ext7wkblyxdL6k9W8mk9GBgYG8OHfudnBwUEdGRvL6334uRan540Bx6iWkecEnr08IQjr7WIsO0IGBgTxrZ2xsLLYCrUXU1dWlXV1dmslk9ODBg3lWUrELNWk5dnR0aE9PT17ZgPb29mpvb298DNDNmzdrT09PbIX6DrvduGJiWUo0ayGkadZla2urjoyM5KVz/4zsH0yyj3zaESzS6qi1kF6Beab9P0Qi+lbgrT55fU
IQ0tnFtQoB7e7ujucfVVX7+/u1s7MzTwBtevdCtZarLce1XAcHB1VEYtFwBWHTpk157bEiYYMtL5PJ6KJFi3RsbCxPcJN53TLcuOT7tDyVxqWVV6y/7Z/W4OCgDg8PFx26F/pzaGtrK1qPrSvMkVZOzZc/AR3ApRhv+edgnJYEIW1yxsfHtaWlJRasoaEhXbJkiba3t+vSpUtjwVqwYEEsaK2trfHF7YrAli1bFNCenp5Y7KzlmsvldGBgQFevXh3nFREFtKWlpeBcpRUbQLPZrHZ2dqZOAVQihsnj1QhpOVZvWrpic7zANDG0/edDuGtfObW2SG8DctHrf2KWQ93pk9cnBCGdHay1AujWrVtja2XdunWxWNo50La2tviOuRu6urryxK61tTVVFO38pLUiW1tbdWhoKI5zxde1NF3RcUXT1ufW477OdFzacV+BLyTq7vvkHKntg0pIs3AD6dTcIq1nCEI6O9ibFmlit2HDBgXy5uPmzZunCxcu1Pnz5+fFuenGxsby6rDp0kR4cHAwFgV7J921NG1+97VR4+pdn/3Dc/vax7JME81CFrOPRf1co9YW6Xha8MnrE4KQzizuHF1LS4vOmTNHs9msDg0NxZZoNpvVjo4OVTUX0dFHHx1fbP39/bp27do8gbTlpV20gI6OjipMDefBzMVmMplp4tpIAtkoQpqcVy0mhpWIYqFzKhb3XKDWQvoocArwKjf45PUJQUhnhvHxce3t7Y3FsKWlRYeGhjSbzercuXNjaxDQRYsWTbs7DugnPvEJzeVy2t3drUC8rtOmS5vLg/x51bQ4m9aWY2mWuNlqQyHR9BXSUudULK4huXShCTWi1kL6e+A7wM2YO/fnAC0e+R4DtgE9xdIFIa0/dj60v79fJyYmYuHq6upSEdHFixfnCdvQ0FC8hCkpgG6wlqpNl2Y5FQqudWqHrrYcS7PENUIbgpDWnloLaR9wPMZD/p8A/wjcUiJPFtjiU34Q0vpjlym5C7TtsiZ7194V0nnz5uXdtS8VVAtfoD7B3kgqVk4jxzVCG2otpIW+27RyplHCMmwWYa6pkKZmhLeXON4JfB84u8DxdwE7gB3Lli2rc3cErIC6u2Jci7BQsFale7PDBnd+U7U6IU3mSSunkeMaoQ31ENK0uGJl+3KkCamXh3wReY2I3CUiu6NwN2bYXhBV3Yd5SN75ItKXcvzTqrpKVVd1dZX2mxqojhUrVrB9+3ZOPfVUrrrqKjZu3MhFF12UmnbevHnx+/vuuw8wjpEBjj/++PjYs88+y/Lly+vY6kAtEZGalGP0pXKsE+0jypm2j9oCDwK9zude4CHPvP8EvKRYmjC0rz/uHOnFF1+cd2PJhu7ubhUR7ejo0LPOOivP8nDnO7PZrK5YsSLPKrXpLCQsmFIhmSetnEaOa4Q21Cuuvb1dgfg17dyLxRVK45N2tsHTIvV91MijwLtF5PvRD//lwJ5iGUTkDcBfYHyX/siznkCd2LBhAwDnnnsuV155JQMDA4CxPg8ePBinmzt3Lvv37+fOO+/My3///ffH7w8fPszOnTsBY5UGZoeOjg72799fcVyx9+7rE088garmWbTu8WTZpdBLF8Jli8zrEYKvkK4HzsM8qwngAeCPi2VQ1a9gHk8SmAVuvPFGPv7xj7Nz505WrFjByMgIAG1tbUxOTsbp1BmmPf744/H7ci+OgCFt2OoblzxWKu6JJ56YFl9OHJAnkGmiadOl1Z0UV1/k8gNxXr2s7OyNiY/ZCrwDEOfzi4DP+eT1CWFoX1vSHFV0dXXpggULtLOzMx7GQ/5No7S1nqp+w3ObzlJO3mT+QuXMVlx7e/u04W0yzr4vdN4+cZXmq1ed9s57Ml2hPkv7nEahdjQi1Hj508eA7wFnANdg7ran3o2vJAQhrS1prtPsMifrbT5tjtQNyXWdyR1IjSyk7e3t3unceb9CotkIolZpnCuGldYZL2VyRNWNs7j9VgxbbvJ7akR8hdR3aL
8d2A18FrgbuAQIk2MNys6dO1mzZk1e3L59+wB43/veFw/3k2QymXjO0/yGpmjUudBCQ95K06UNeeuNO1eYNm9YTZxcfqB4HdFcpU2XhjsUT4uzw3Pf4b5NcyRNH0nygklNJLIlJVpVdWMtGrFq1SrdsWNHLYoKACtXruS6667j1FNPjeNEhJaWFiYmJlizZg3bt2/ntNNOy8vX2tpKJpPh8OHDXHXVVVx44YXeghJfVNHvyb73FSM3v4jQ3t7O/v37p8VB4QvWjWtvb/dKVyiu4nyOSKWJWBpxugrrrKq9tt+jNqaJZjJfsm/TvvNSupKWrxERkbtUdVXJhD5mqw3AmeWk9w1haF9b0uZIwexWKja0b29vjxfpJ4f2pUKhpTE+ed20bp7ksDstfb3iiqVJmzv0jau0zpk6z+Qw3qs/LCnDfR/yymgwqMfOJuDBctL7hiCktSfpzFdEtLOzM/Vmkw1WREUkdq+n6i+GVvwKzTH6CvFMi0mlAlOJ6DS6kJYTZ783S6XieiQIqe/Qfp6qHhSRC1T12pIZyiQM7evPypUrWb9+PVdffTWTk5MMDAzEa0PtWlJ3jrSrq4u9e/eaH0kFc4W1zDfTcYXmGuvdjtk492rzufpRztDe4i65akR8h/a+N5vuAV5YDxEN1Be7nvSBBx5g165dHD58GMhfYP/MM88A+VsI7c2ptJs0ra2tHDp0KP7c0tISl9EsFLtZU0g0Zwo7B5nEt03JdIXKK4Sdjy4nT6XMRB0zgo/ZCuxn6tn2cfDJ6xPC0D6dap+1486Vjo2N6eLFi+MhWXI50/z58zWXy8We6mG6G7zVq1drJpPRBQsWKEy50BseHvYeCpYKafmqjcsbXkZDzFJ569EO3zrdePs+bUlXWp5yln75tMEnnVu+2wYf8r6nBoQ6PI55WvDJ6xOCkE6n2qc/2mfOw9QjROyz4yF9jnRwcFA3bdpUUGzdR5FUIhyzJaTViFo948oRvrS4avK67UnbTFAsfSEhLRRXjJoJaY0dOltqKqSmPF4MnBuFok5Iyg1BSKdT6Hnkvb298c0j+2C6pLVqRdg+bdM+5sMVybSQy+V0yZIlsRf9pEXqUuwim2khzbuIKrA+ay2kSZGrhWjORFyh9hXqnzRmTUjrRE2FFPh74JtMDesngE/75PUJQUin4zpgdp8LD+j69eu1q6srHobbRx5bMbUi3NbWpqOjo6pqnDi7S55GR0f1gx/84LRnKHV2dsZ37N2ndpZjrdRDSNN26JTatVNNXDnnVMiiS+unZowrdJ5pBCEtLqQ/A7LO5yzwM5+8PqGZhXTdunUlHSS3tLSUnOdMzof29vZqLpfLG+L39fVpd3e3ZrNZXbx4seZyOZ2YmMjzcm8Fd3JyMrZI3WfO2zA0NKR9fX15z5l3Ldw04Ujbb14PIa2nQFYipK6AJLdBzrbIzXRczZcy1WlIXitqLaSbgJ3AVzEenXYCIyXydAPfxTzraUGxtM0qpOvWrYsFKHnBATpnzpz4mBXGtHnOQk5Gurq6pj1jaWxsTAHt6+tTVY0/24fQ5XI5bW1tjb3gu89Dt2LrhtbWVp07d25JMak0Lm2dZSkhjd+nDM+rHbKXiksTTUtRgSlx/EiMqwVu/zciNRVSUx4LgVVRWOSRfiNwPnApJRycNKuQiogODQ0pGEtQdeqHYa26XC6nJ598cvxjyeVycVpLsflQILYSrRi7Pz47ZJ+YmIjjRkZG8jw52TlSN6TtbLLtT1qaxXboTJuTTIRiApaaJ5GmHOGrJC4+R+e1krhalTObccnjpeKeC/gKqe+C/L9Oi1fVvyqS5xKM5XoMcJSqXp84/i7Mc5tYtmzZSY888og5cNmi6PU36QXb4/Hn36THJdMXSucTN9N4tuuahZfygQOXl0w358qneWpTW15cz6fm8th7fl+6XorvGbfH0tYdunG+eVV12jrGYnEupeLs+2aMK+c8Kzn3hqWUHtSZmu61B34JvIUylj9hPE
S9HiOW5xdLGyzSdIvUzqm6w/6RkRHNZrMKU/5Du7u78242pdXhMjw8nLc0anh4uJbdEggcMVBji/QA8H+A32AeMXKbqt5RIs9GYC7maaI/UtWvFUm7FzgI7IuiOqP3nQ0eNwjMwbgUzADPkL9bzHauAIeBn2Aea/0oZpODLacPM3WyOyrvqUS6PmC+c+yXwAInTjEPIzyUkne2+2im6wvnGc69lnF9qlr66Zw+agu8CjgN83iRPwe2UeKZ9ZRxsylKvyP5/kiJa4Q2hHMP5xnOvbI4n+C1115Vv+V+FpEbgJESeR4HXuFTfiAQCDQzvk5L8lDVZ4DLSyYMBAKB5wCZ2W6Aw6dT3h8pcY3QhnDu9Y1rhDaEc69PXEm8bjalZhS5RFWvqChzIBAIHEFUY5HeWrNWBAKBQBPju/zplJToqzB35jeq6rYatysQCASaBl+L9N+B9wBDThgAVgDfKpJvVhGR/y4iN4lIu4i8QkR+VsOyPyoi14rIBSJyu4jcFMWvF5F7ROS6Ksv/UxH5johsE5EXishuEdkWHTtXRO4UkfEanMdjUR1LRWSLiFxQq/Nw+mhARO4Ska0SbasRkQ3O+WwRkR+KyF+UWb7bRy9260hrv4hcLCKfr/Bc+kTk6ZTv+1QRuVuiJ+2KSEZEPikiHyyz/EER+a6I3CEiHcnfa6K/hqM6P1RG+R8VkWuj938mIl+M3uf1k4jcHPXnkyIyT0SOE5FHRGRxifLfKCLfsv2baO95UXuviD5fFX1+c/T5ZeVcm851fbaIfF9EbhWR1pR6LxKRHSKyOfrsdS6V4CukvwU+Bfwt8CFV3QA8rqpPa6WTrHVGROYD71bVN6rqE8A7MAvZa1H2cuB50ce/VdWTgf7oy9wIvA4YEJE5VVTzz6r6SsyiYMGs210bHXs9cDqw3P6AKkFEssAtUbnLyF/FUdV5JProHOAjwOPASyMxPTNR12rMeZWD20eb3DqS7Y9+D2kjK18uAB5i+vf9duB/An0icgxwHvBDVd1cZvmnAZ8AtgMvw/m9FuivPwL+h0/B7nchIv3AalU9zykr7idVPQt4L6ZvDwKvBX5dqg5VvQlYC7w86he3ve/ErEU/K/r8WszSyD+LPp+O2Ujicy7xdQ2swfTTbzD9n+yndWq2d65z6i15LpXgK6SfwXzRbwSuFZEfAseKyAtEZG49GlYD1mB+HDkRORlzEUzWomBVfRj4cvReRaQbeFRVD2F2ObVh/nzaq6hDo761O5jWisirosOfx/iF/XZUZ6W0A4Micraqfhf4v86xqs7D7SPgWMzOq18CSzE/9q1OWsVchP9RZh1uHz2dqCPZ/rcAXyj3PABExD64al/K923P7TFgCeZifbe1gsrg3zGC1gP8gvzfa15/YXbJteG5fDHxXZyBEbubo75L+57fRnTXWlVvAEo+WCn6U94F7AD+W6K9C1X1t8DvIiG0uwDnRXX8Deb78yG+roGvAcPAgegck/30ryJyDfDFcs6lEryEVFUvV9XLVPVCVX0dxgPUVuAfMJ7zG5EO4HrgnzEX6P+uRyXRv+D1gB2WXov543klZotmNVwDXKKqd2H+za+JrMMXAT/G/JllKy1cVfdhLMHzRaQvcbiW55FXLfAnwL/YCBHpwvhkuLqC8q7B+HU4nKgj2f4/AIpuay7CucA/Rm1Nft8uivndnQm8sswh5FKMGB8FXEj+7zWvv4BRzG+69NbF6XRg+us+jKWW9j3/oZbYAp5EVQ9jpvsWYQyufymUFPgscBv535kv7nV9OvAIsEhEFjC9n44HHgSOq6CesvD6R4v+Rd4KnIhx6vwQxhHJ3jq2rVr2Yjrw15jz/DvMP9nZWmTffwWcDdypqo8AqOo2EVkPfFZVff9lpyEiLzPF6Z1RuQdF5BDGetgAnIyxLvoxe/grQlUPicivgMWJ+JqcR8QvMJbWkuh9HzCG+T7WAG8CrijXunb7SET+2K1DVe+x7cf8ZpcD1wErReRFqv
rjMqo6nsgSAi7C+b6dc+vGCOFejKX1NGZKxpc3Af+GGda/PypjQETOJtFfqvplEbkH+MsyyrfY9j2Fudmc9z2LSE+UpmxU9bCItAC95H+/ByINmRtNF3xSRO4HXl5h++11/QFMfx0blZX8Xb1GVS8UkXtFpLXK0VtxfPaRArdgdjKdgfkXG6HMvagzHTDOPL4BfBvoj+K21bD8tZh/8+uAH2D8DyzH/Et+DTi+yvIvxAyVtgEfxlhTV0bHPgDcG9WTqaKONwD/ibG2BDOkuyA6VvV5OH00gBnybXXba78P4EfRed5cRR+d4taR1n7Mn87nqzifbSnf96nA3US+JzBTYNuAL5RZ9quifvg+cGza79Xprz8F/gk4poLvohtjDeYwjnLy+ilq/2jKeS8uUf4FGL8aX2FqNZBt73lRH10RfR4GtmCE1eb/ged5uNf1h6LrYBswL6Wfron69IZyzqWS4Lv86acYEY2jMOtIT4867cGShQQCgcARiq+Qfgbz75uGquppNW1VIBAINBG+QvoHmKG8OnGnqOq369m4QCAQaAZ8hXS3qvYn4h5U1RfWq2GBQCDQLPgK6Y2YedHbMcsXVgOtqlruAupAIBA44vAV0gzwaszypwzmTunXtZ7LCQKBQKBJqNiNXiDgi4i8ArO75RszXO9y4CWq+tWZrLcREZERYHMwfupDIzl2Dhy5fJgabc8tk17MKCpgFrGfM9uNOFIJQhqoK9E+9RdgFkIHZo8vEIS0bgQhbTJE5PNiXOo9FXnyQYzbs0dE5FDk6egyEVkrIieIyDedNCdE8V+O4v5ORPaIyMWJOi4Tkf8nIndHnz8rIj8RkeFyywL+EPiWXTonIs9E7b8j+vxqEXlIRL4tIotEpD86t4ejYx8SkUdF5H9F6c+IznWXiJxm2xO9vlNELovenwncALxHRC6K4jIiclhElonIXBH5nRj3eP0y5f5tu/0c1XWGc+ydUVu/FH1eIiLfi+KOEuOSsFWMW7rOlLYfL8bV3x7bl06fPxa93hHVf7WInBO9v0NEjhGRH0T1TOvrqE9/4pTz5qif/k1EBLOD7aX+v7RAOQQhbT4Es93vB05cFnMzcE8i7SaK+FNQ1fdjhnzvTTn8EVV9mYiciNlKuBKzNbXcspYDD0PsIejxqP2Wj2M8Jt2KcUkHZrvgclX9pqpehXHxd6qItGP2td+IceISe1gS47rNCmYnxrHHZcDnMN6OzorO41eYx4qvAw7geLYSkXUY5yZgVqfgHMtg9sCfgHGh9/zo8xeAF6mqnboYBm5S1X0pbX8v8PfAxzDPTS+Hj2DmOCcL9HU2aptpvOqXVLUP45vhJar6LPArMQ5iAjUmCGnzYV3GuczD+GR0eT7TfTxuxTh1AOLHat+LcfRRiBMw+6/vAVyXib5lLcS4aCvUzqWq+hBmj/nKZOUi8leYVSJ9GM9CQOwa7nmRtQXwZqacjK/G7CXfCxyK2nlG1JbbMPvaX4PxoLTAqe49wM3R+71RnZZjMH16P2bPfidm/vX2SKTs+b0DI+JpbZ/E7BUvl+OAVao6HpWb1tfi3kgSkbeJyC6M31LrkOZA1AeBGhOEtPlYinFk7NLNdFd3wxhH3C6vxfjlREQGMJ5zVqSU52KdSp+gqu6F61vWU0yJx/FMt5oLLhsR4zHoXRjB+lGBtlnegHHkYeMlkS6L8ev6++jzUuBR4OgozUsxrgmt6H8O04c3OGXcH/XDUlX9Hub6cdv/FOaP4sQCbb8eY42W+yjzvcBCEfkvaX0txl3f7xJ5PopxnnyLEzcnOv9AjQlC2kSIyPOAyYTlsQ5zgT+TSL5XVXcWKS4LPO1YU4XYiRmaZkXkqArK2oMRLTCu4m5LHP9lNExehbH2XDIYn5XJc0NElgG/dbYt3+6kuwMz1bEYI4DnAt9kyqL+HvDz6L0V+ZWYYTcAqvpjVT0ROD+K2gd0i0iP0w+7gJMdq/gw8OfAJwu0vQdjKV+aPJ8SPIkR3y
tJ7+s3Mf1mnjB95NJN8T/NQIUEIW0ubgZeLCIPYyyoT2Ispw+npC31XO77gD1inpVzsFAiVb0XI0w/xTzwsNyy7sLMUZ6O8Zz+qcTxD2Pcop2J8fzv1n0A+GpU9xKmpirejXGj9rHo8zM4jpBV9TGMWF2DEcLdGIe/Vkg/4+SdF71+XVWtuE4jEq5LMCJs/wyuxTxG4wExjoVR4z/218B/TWn725iympMcIyL3YRylfyNK69b/FWAQY3nGfS1mrexfMv27+RvgAcw0xwExNyYfd+ZyAzUkLMhvIkRkN3CCqj4lIicAn9Kp5zg1LCJyL7BeVSt2QO2U9TZMH2zySLsWWKuql1Vbb7MjIh8BnlXVK2a7LUciwSJtLjYytbD9USC51KhRuR7juzYwe/wR5mkBgToQLNJA3RGRjMdcbKCOhO+gvgQhDQQCgSoJQ/tAIBCokiCkgUAgUCVBSAOBQKBKgpAGAoFAlQQhDQQCgSr5/6PJ5upP6pVDAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x155004afba8>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts = np.log(counts_lib_norm + 1)\n",
"mean_log_counts = np.mean(log_counts, axis=1)  # across the samples\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_11.png', dpi=600) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Normalization over samples and genes: RPKM"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {},
"outputs": [],
"source": [
"# Make our variable names match the RPKM formula so we can compare easily\n",
"C = counts\n",
"N = counts.sum(axis=0)  # sum each column to get the total reads per sample\n",
"L = gene_lengths  # lengths for each gene, matching the rows in `C`"
]
},
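{
"cell_type": "code",
"execution_count": 92,
"metadata": {},
"outputs": [],
"source": [
"# Multiply all counts by 10^9. In Python, `^` is bitwise XOR, not a power;\n",
"# exponentiation is written `**`. Converting C to float avoids integer overflow.\n",
"C_tmp = 10**9 * C.astype(float)"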
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Broadcasting rules"
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"L.shape (20500,)\n"
]
}
],
"source": [
"print('C_tmp.shape', C_tmp.shape)\n",
"print('L.shape', L.shape)"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"L.shape (20500, 1)\n"
]
}
],
"source": [
"L = L[:, np.newaxis]  # append a dimension of size 1 to L\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('L.shape', L.shape)"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [],
"source": [
"# Divide each row by the gene length for that gene (L)\n",
"C_tmp = C_tmp / L"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"N.shape (375,)\n"
]
}
],
"source": [
"# N = counts.sum(axis=0)  # sum each column to get the total reads per sample\n",
"\n",
"# Check the shapes of C_tmp and N\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('N.shape', N.shape)"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"N.shape (1, 375)\n"
]
}
],
"source": [
"# Add an extra dimension to N\n",
"N = N[np.newaxis, :]\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('N.shape', N.shape)"
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {},
"outputs": [],
"source": [
"# Divide each column by the total counts for that column (N)\n",
"rpkm_counts = C_tmp / N"
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {},
"outputs": [],
"source": [
"def rpkm(counts, lengths):\n",
"    \"\"\"Calculate reads per kilobase of exon per million mapped reads.\n",
"\n",
"    RPKM = (10^9 * C) / (N * L)\n",
"\n",
"    where:\n",
"    C = number of reads mapped to a gene\n",
"    N = total mapped (aligned) reads in the experiment\n",
"    L = exon length in base pairs for a gene\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    counts: array, shape (N_genes, N_samples)\n",
"        RNA-seq (or similar) count data where columns are individual samples\n",
"        and rows are genes.\n",
"    lengths: array, shape (N_genes,)\n",
"        Gene lengths in base pairs, in the same order as the rows in counts.\n",
"\n",
"    Returns\n",
"    -------\n",
"    normed: array, shape (N_genes, N_samples)\n",
"        The counts matrix normalized according to RPKM.\n",
"    \"\"\"\n",
"    C = counts.astype(float)  # use float to avoid integer overflow in `N * L`\n",
"    N = np.sum(C, axis=0)  # sum each column to get the total reads per sample\n",
"    L = lengths\n",
"\n",
"    # Diagnostic output: number of NaNs in the lengths, counts and result\n",
"    print(np.sum(np.isnan(L)))\n",
"    print(np.sum(np.isnan(C)))\n",
"\n",
"    normed = 1e9 * C / (N[np.newaxis, :] * L[:, np.newaxis])\n",
"    print(np.sum(np.isnan(normed)))\n",
"\n",
"    return normed"
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"0\n",
"0\n"
]
}
],
"source": [
"counts_rpkm = rpkm(counts, gene_lengths)"
]
},
{
"cell_type": "code",
"execution_count": 111,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3224564"
]
},
"execution_count": 111,
"metadata": {},
"output_type": "execute_result"
}
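],
"source": [
"# Sanity check: RPKM values can never be negative, so this count should be 0.\n",
"# A nonzero result most likely means that `N * L` overflowed in integer arithmetic.\n",
"np.sum(counts_rpkm < 0)"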
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3156507"
]
},
"execution_count": 112,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.sum(counts_rpkm < -1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### RPKM normalization between genes"
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABmCAYAAACZSRngAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAHgdJREFUeJztnX+cXFV5/9/PzE6WkN2QXRI2hG2ywaiwG1QktTWJmAQawFIaVKIhfEWjotF9WYpFaRckIrRIuwgNfGvVFEWz67cqUkqDBLsE5YuoiT8ASeSHBAk2aUKAkNj8YPP0j3Pv7Jk7d2bOzNzZ3dmc9+t1XjPz3HPOPefMnc+cc+45zxVVxePxeDyVkxrpAng8Hk+944XU4/F4qsQLqcfj8VSJF1KPx+OpEi+kHo/HUyVeSD0ej6dKvJB6PB5PlTS4RBKRqcBSYJJl3q6qX6pJqTwej6eOcO2R3gccDVwCbAd2AFfWqlAej8dTT7gKaaOqXg/sAp4Ffg38T81K5fF4PHWE09AeeEfw+gHgCowAr6hJiTwej6fOcBXSvSJyDnA/8ENgHPB0zUrl8Xg8dYTr0P4/gAsxQ/qpQDNwb60K5fF4PPWEuHh/EpEngNnAU8AsQIAnVHVGbYvn8Xg8ox/Xof3vgC3AYPAKRlQ9Ho/niMe1R9qpqo8PQ3k8Ho+n7nCdI72zpqXweDyeOsa1R3oI2ImZG9XwVVWn1bZ4Ho/HM/pxnSPdqqqvrWlJPB6Pp05xHdrn7akXkdMTLovH4/HUJa5D+62q2hGxPaGqr6tVwTwej6decBXSfsy86EOYOdJ5QEZV31Xb4nk8Hs/ox1VIU8CZwCmY6YAtwPdU9VBti+fxeDyjH9ebTROBE4BBVe0NhHUG8EzNSubxeDx1guvNpu8BrwE+EXxW4J6alMjj8XjqDNceaRPwDeCdInIUcBZGTGvC5MmTtaOjo1bZezwejxObNm3apapTSsVzFdKLgWuBw8Am4HHgPZUXrzgdHR1s3LixVtl7PB6PEyLyrEs8p6G9qm5S1Xer6mxV7VLVC1T1keqK6PHUDhFBRJzijRb6+/uZPXs26XSa2bNn09/fP9JF8jjiJKQi8oyI/CYaal04z+ijnn7shVakuIrscNLf309PTw+rV69m//79rF69mp6enpz2rae2P+JQ1ZIBaLTCY+F7l7SVhNNOO009I0dfX592dXVpKpXSrq4u7evry9pnzpypAwMDevDgQR0YGNCZM2dmjydxjlJg5uad4rkct1+jtlL5uZSjFOF5u7q6dGBgIOfYwMCAdnV1qWqybe9xB9ioLhrpFAn+Nghrgftc0lQTvJCOHH19fTplyhTt6OhQEdGOjo6cH3tPT0+OAIafXfPu6upSEdFMJqM9PT1FRSFONAsJX6F4hYgTTRdxLSW45RLmkUql9ODBgznHDh48qKlUSlW1pNCWez6PG0kL6cXA+4BzgHEuaaoJXkiHn1DkAE2n0zkiB2h7e7uKiDY3N2smk1FARSQrLI2Njdrd3Z3NL/qDtXtUnZ2dOm/evJy0S5YsyZ7fVbxcepDlpK3UVkjIy+k5lxLKas4Rdz6PG7US0pzgmPYa4CagE3PHfx3BjqpCwQvp8GKLHKDLli3TTCajIpIjbul0WgGdOnVq1hYKam9vb9Hemp1PGFpaWvTkk0/W3t5eTafTKiI5aasRtFqJZiU2+3NcmVXzvwMgp5cOZIXWTut7pLUlaSG9FvNM+38ORPRi4GKHdLOALwZCeg3wp8BtwKnF0nkhHT76+vq0sbEx2zMMBXP58uXa2dmZ88MOQ29vrzY2Nuqpp56aM+wP81DNF7cwnHfeeSoi2tTUlNMjDYf80bQhY9UWPRa2YzhvbMezhTZ8dZkjLSTghYS+2LEjjUSF1ORHK3A1xlv+ezFOS1zSLQiE9CvAaZi51nNj4l0CbAQ2Tp8+vfYtdAQTnasEtKOjI9urzG
Qy2d5nOEfa0NCQI3zR9/Zcqmrh3qI9HRAX4tKOZVs58e3pl/A1SiXtGD3mcvxIwVVIXZc/3Q98OxDFKcDfYTxBVYrmGVS/pKpzVHXOlCklNxJ4KsReZnPyySdz/fXXAzBv3jwuu+wyAA4dOhT+ufHcc88BMDg4mM1j0qRJ2fcHDhwAYOvWrU7nD/P1lM+yZct47LHHALKvIfZyrnLbeLQtBauKVccMhWHEaWeTqi5M4Fy/A6YCxwfvPSPAddddx5o1a1i4cCFbtmwh/NNau3Yta9euzcY7fPgwMCSg6XSawcFBVJWdO3fm5Tt37lweeqia/1bPaCQUWVuco8JbULhDMVv1slO+iRBzruHASUhFpC/OrqoXlnGubwK3A/8N/LKMdJ4E2bx5M/Pnzwdg2rRpfOpTn6K5uZlXXnklL+7EiRNZsGABd911F6+++mrWHoqszaZNm2pXaM+IEid2oa1ob7aIqKnqmOoJu+61fxuwHOPcuSxUdQOwIfg4p9z0nurp7+/nuuuuY/PmzWQyGZYuXcqTTz7Jtm3baGhoyIpkU1MTe/fuzabbs2cPTz/9dF5+U6dOZfv27Tm2e+65h0WLFtW2Ip4Ro7W11Smea0/TNb96wVVIJwPXAS8DzwPhnKlnlBPOia5Zs4b58+ezdOlS7rzzTo45xgy7jj76aPbs2QOQI6Ih+/bty7P19fWxaNEixo0bl51PXbFihVN5WlpaePHFF6uoUf0Sikf0tdxj5djsY7t3744tj4st7juLi+va09z9iUGMm+MxgssdKYwT5xMxHvL/HPg6cI9L2kqCX/6UHNGF3u3t7drU1JS9W9/W1pa902vfeQ9DePfeDnHxWlpaFNDFixerauG79qVCXNqRtoV1a2lpyXlfri2unvaxuHaqxpbEefTqibkh5juNEmeLixPX3qMNHO/au95ssl1JPQr8m4h8wCWtZ2Sx50QBtm3bxrp16zj33HPp6+ujp6cne+yFF17ISz9+/Hj27t1LY2Mjg4ODOXOlNi+99BIA9957b8I1qJxqe2EhL774Yl5PqxpblHJ66Hr1RFh1DHr1ROSze+KPF7Tl9wLj4ttEz6GripdvrA3ZXXF9ZtNZmPWfxwam3UCPqtbES/6cOXPU+yNNhtmzZ7N69WoWLjQLL0SEZcuWcccdd3Do0CGmTZvGtm3bCqYXkex8V3d3N7fccgttbW3s2LGDtrY2du7cyeHDh7NiEcYN39uvLkTTtLS0sHv37jwb5IpUnA2GphLiyuBqKyduMVsp0XJBPrunaH55wlegPOEd9TB+IVvBuljx7ZtKrnOkdjz7uhltiMgmVS19b8el2wo8AbRbn9uBJ13SVhL80D45ol6DmpubFdDly5frvn37dPny5bFDwnABvmr+0DAM4f56O15LS0v2ffTVJcSlqdXQ19UWZ7eHu3G2uDAayu1iiw7j4+LZ14JNnK0U5cYfTnAc2rv2SP8TswD/J0FDvQU4XVUXlExcAb5Hmiz2XftUKsW4ceM47rjj2Lp1a85d+7gbQR0dHWzdupX29na2bduW00sJr51orzPaCyznBlMSPcjEbQV6fmHc8LhtGw3lHs5zZ4lbO1pkPSkwJnqkrkLaDFwEdAWmx4Gvq2r+4sME8EJaG/r7+7nwQrP0t7GxkQMHDjB58mR27doFQCaT4dCh3Cdsh0udwvguQhpSzg8+yrAJQHQHzKqX82yjUSCjf07ReFmRj/kzq6WQVrLQfiwIqevQ/kNYHpuA1wNrXNJWEvzQPp9KHSHb6WfOnKkdHR26fv36rNOLKVOmqIhoKpXKuYMf3q2Pem1SzR++Re1xcaO21tZWpyF+LWwuQ9fhLI8donf57Xhx0yaF0kbjRb83O37F7WjlbV8LUVsp7DKPNkjY+9PngB8DZwM3YpyLnO+StpLghTSXJLyjh06Y29vbFcwSpnQ6rS0tLZrJZPL8gdoOm2shpNWkrdSWN6fpmHY4RDPaFnYaW1SjxwuldbWVEs
highvNO3rMlUrS5GH9OSZJ0kJ6FvBBYBtwVyCoi13SVhK8kOZSrXf00B1bKKArV67MWQu6ZMkSBXJ8ikZ9YdaDkOb8mAqsf6xU9KpNHydGxdpiuGzRMrn2ZqN1tvMfESGtEUkL6W0x4V9c0lYSvJDmUugxFKHj5WLD/bA3m8lk9IYbbsjpzQLZtJDrHi/qIX80C2mlPU1XWyWiGa1btN0qEb7Fixdn3RCGr3HxirnbS0pwiwmfF9LSgnpOOfErDV5Ic4nrkfb09Ggmkyk53A/Ths9fGhgY0PXr1+f0SKPOgru7u3MEejQJaa1FM87mGrdY3QrZyokfF+J2kh111FF58ezRRaXl9EKanJA+UU78SoMX0lzi5kjDh8fZxA33w95s9MF14cXb3t7u7Cx4NAjpSNjCEF1HG9qi9YirWyFbtUJaSPCiN60mTJhQVTlz3heZj6xEFI8YIQUmBK+XusSvNnghzcf2ah8OwTs7O7Wvr0+7u7tzhuX2Q+jCHqktxnaP1LWnUkhIC+0lT0JIi/U+k5j7LNcWV+ZqbdFj9mNfopsdXIW0qakp50+3ULxqyp7oXfYa3ShKgqSFdFh6omHwQhpP9Emcvb292tzcrKlUSnt7e3XdunXa1tamDQ0NWTG109x+++1ZAQ3v3odUIqTFdhwlIaQjKZrV9jRdbeFrKJqu7WiH6LzpBRdckB1d2I92iQpzEvVJgsSFOUGSFtLdDD3bPhtc0lYSvJDGY8+VhgLZ0NCgbW1tOXOk4cPpQgqtQa1WSEv90G17VAxtD0muedbKVuiOuku7VGsDclZVxAV7nvroo4/WRx55JHYEUExk7RD2epOoTxIknV+SJC2kF8cFl7SVBC+k8UTv3ts/QFsg9+3bl3dx2kJh2+J6XsUWfkc/lxLScm8OjZUhu6vNRQyTCpMnT9Z0Op19nHZra2vek0rLrc9YJ1EhNfnxBuDCILzRNV0lYawKqevupELx4u7eZzIZbWtry7FFe6QhhX7EcfOccfFKOeAYiTvqldji2qJWtmKPV04qpFIpbWxszLmJWCzYKzamTJlSUR2PFDFNVEiBfwK+z9Cwfj3wJZe0lYSxJKT22j9A58yZU3S5UrFdTHHH7DnSffv2aW9vb84cqU0hgYz77OKQuJCtkl5lLUQ47o9huOY+Sw3XRzLYQhr2Tu12LHSNHCniaZO0kD4DpK3PaeAZl7SVhHoV0lNOOaXoBdzU1KQwtPYvbrlSqV1M0d5qmHc479XQ0KCtra05x1U1bz4yfM378ViCFo1Tji0pMSxHNKN1CxkJW5LCZ+84KxbCpx7YIboOGMx64ZaWFm1oaND169fnCeSoEs0RvqOftJBeAWwGvoN5VtNmjGNnZ3EE2oAfAT8EmovFrUchDUX0vPPO02nTpmUvxnD3EZg5qtAfqKrZnZRKpXLyAXTSpEk5trh4IdGeYyqVyvM9OmHChOwF6drTLORPM4xXyt9mkgLpYrPbbzTYokIWJ2grV650Esjo7rIkhDmdTmtra2uskHqGSFRITX5MxDwFdA5wjGs6K/0K4OPA1ZRweFKPQhqKaPh+7ty5OT/28ePH5wlF7H75AkPcQvvq7TzjtgPmbfGs0vlwKRF2Eb5yhvH2MdsWPT7abDn1jfmuGhoaCravqxi6xEulUnl37jOZjF5yySUqInr88cd7IS2Cq5C6+iP92zi7qv5NycRDeVyJ6ckeC4xT1Vsixy8BLgGYPn36ac8+GzwmqphT2KgvyZxjLxd3MhsXbxht97/9ThY+sCQvnoiQSqUYvKqpQMUqI/oIijhny6VsqoUfHVIqXqwtdJgc1Nsm7hxJ2EqWKWHbwMAA8+fP58EHH8x7XHU1T1QN/cOWQyqV4vDhw3R0dPDb3/6WdDrNpEmTuPnmm1m2bFlF5RjrJO2P9L+A91HF8ifgSuBdGLH8eLG4Y6VHGvr0XLlypc6ZMyev91juXftSVOslypMcWD1C+zWJcPnll2d7pCKSvbbizh
GOBkRERUTPPPPMnE0Z5fq1PdIg4R7pHuDfsJ5rr6oPl0yYm8cKYDwwGXhEVb9bJO5OYB+wK4i/Kzg0eRTbOoP6vQQ0Y27IxXEYs8Hh2RqUZQZmCmYrcBSwH/MY7eeCc0bTjnSbjWXbG8HtKb0OKFDosQKHMY8HBchE7KlIHocx1+V+YC/mGoTR02YjZSt2bIaqTqEULmoLvB1YBPwZ8BfABuA2l7RWHs43m4L4G+1Xb0veNtLn97aRP7+3lY7vElyfa/+A/VlEbgV6CkQvlMcO4K3lpPF4PJ56oKKhh6q+Cnw24bJ4PB5PXZIqHWXE+FLk1duSt430+b1t5M/vbaXjl8TpZlNsQpErVfXaihJ7PB7PGKKaHul9iZXC4/F46hjX5U+nx5g/j7kTv0JVNyRcLo/H46kbXHuk/w58FFhphU7gZOCBIulGDSLypyLyLRFpEZG3isgzCeZ9jYjcJCKXishDIvKtwL5ERH4uIqurzP89IvJDEdkgIq8Tka0isiE4dqGI/FRE+hKox/bgHCeIyG0icmlS9bDaqFNENonIOgm2/4jIMqs+t4nIL0XkL8vI226fN9j5x5VdRP5aRL5aQR1miMiBmO95oYj8TERuCz6nROQLIvKpMvPvEpEficjDItIavU4j7dQdnPPTZeR/jYjcFLz/oIh8I3if00YicnfQlntFZIKIzBSRZ0VkUon8LxCRB8K2jZT3oqC81wafPx98Xh58frPrb9L6LZ8vIj8RkftEJBNzzstFZKOI3BB8dqpHJbgK6SvAF4GbgU+r6jJgh6oe0EonWYcREWkCPqKqF6jqi8CHMLu1ksh7FnBc8PFmVZ0LdARf7ArgnUCniBxVxWn+VVXfhlkgLJg1vAuCY+8CzgBmhRdTJYhIGrgnyHc6uSs6qqpHpI3eC3wG2AG8KRDTcyLnmoeplyt2+1xh5x8te3AtxI2wXLgUeJL87/kDwP8BZojIscBFwC9V9YYy818E/B3wIPBmrOu0QDv9MXCeS8b2dyAiHcA8Vb3IyivbRqp6LvAxTLvuA96B2WhSFFX9FrAAeEvQLnZ5P4xZj35u8PkdmOWQHww+n4HZKFCqHtnfMjAf00YvY9o+2kaL1WzvXGyds2Q9KsFVSL+M+ZIvAG4SkV8C00TktSIyvhYFS5j5mItkQETmYn4MB5PIWFWfAr4ZvFcRaQOeU9VDmJ0kjZg/opYqzqFBOzcDTcACEXl7cPirGB+xPwjOWSktQJeInK+qPwL+0zpWVT3sNgKmAdsxAnEC5sJfZ8VVzI/xP8rI326fA5H8o2V/H/C1cusgIq3B210x33NYp+3A8Zgf7EfCnlAZ/DtG0KYCvyP3Os1pJ8xupkYclzBGvoOzMWJ3d9Bucd/v+wnuXKvqrUBJpwDBn/EWYCPwJ5HyTlTVV4DfB2KYAl4FJgTn+HvMd1eK7G8Z+C7QDewJ6hdtoztF5EbgG+XUoxKchFRVP6uqq1T1k6r6TowHqHXAP2M85492WoFbgH/F/ED/pRYnCf4RbwHCYelNmD+ht2G2aFbDjcCVqroJ869+Y9A7fD3wa8wfW6FtqSVR1V2YnuDHRWRG5HCS9cg5LfDnwB2hQUSmYPwx/EOZed2I8ecwaNmU/LL/IVDW9uaAC4GvB2WMfs82irnezgHeVuYw8gSMGI8DPknudZrTTkAv5louvX0xn1ZMWz2G6a3Ffb9/pGVuA1fVQcyU3zGYTtcdhaICXwHuJ/f7ci17+Fs+A7PN9RgRaSa/jU4EngBmlnmOsnH6Nwv+QS4GTsHs1X0S43hkZw3LliQ7MY35EqbO/4j5Vztfi+z5r4DzgZ+q6rMAqrpBRJYAX1HV8lz1WIjIm012+tMg330icgjTi1gGzMX0MjqApys9j6oeEpEXgEkReyL1CPgdpsd1fPB+BnA75vuYDywFri2nd223j4j8mZ2/qv48LDvm2p0FrAZmi8jrVfXXjqc5kaA3BFyO9T1bdWrDCOFOTE/rAIX3yM
exFLgLM6z/RJBHp4icT6SdVPWbIvJz4K/KyD8kLN9+zA3nnO9XRKYGccpGVQdFpAFoJ/d73RPoyPhguuALIvIr4C0VlD38LV+GaatpQT7Ra+ksVf2kiDwqIpkqR2zFcdlHCtyD2cl0NuYfrIcy96KOZMA48LgX+AHQEdg2JJj/Asy/+mrgFxhfBLMw/5jfBU6sMv9PYoZMG4CrMD2q64NjlwGPBudJVXGOdwP/H9PrEszQ7tLgWNX1sNqoEzP0W2eXN/w+gEeCet5dYfucbucfV3bMH85XK6zHhpjveSHwMwL/E5hpsA3A18rM++1B/X8CTIu7Tq12eg/w/4BjK/gO2jC9wQGMk5ucNgrK3xtT70kl8r8U40vj2wytCArLe1HQRtcGn7uB2zDCGqb/hUMd7N/yp4NrfwMwIaaNbgza89Zy6lFJcF3+9BuMiGZNmHWkZwQN9kTJTDwej2eM4iqkX8b888ahqrqowDGPx+MZ87gK6R9ihvJq2U5X1R/UsnAej8dTD7gK6VZV7YjYnlDV19WqYB6Px1MvuAppP2Ze9CHM0oV5QEZVy1k07fF4PGMSVyFNAWdilj+lMHdIv6e1XE7g8Xg8dULFbvQ8nnIRkbdidrjcO8znnQW8UVW/M5znHY2ISA9wg+8EJctoduzsGXtcRUJbc8ukHTOa8pjF7O8d6UKMNbyQeoaFYK/6azELoj0jx9fwQpo4XkjrFBH5qhh3evsDbz6IcX32rIgcCjwdrRKRBSJykoh834pzUmD/ZmD7RxF5XkT+OnKOVSLy3yLys+DzV0TkaRHpLjcv4I+AB8IldCLyalD+h4PPZ4rIkyLyAxE5RkQ6gro9FRz7tIg8JyL/N4h/dlDXLSKyKCxP8PphEVkVvD8HuBX4qIhcHthSIjIoItNFZLyI/F6Mi7wOGXIB92D4OTjX2daxDwdlXRt8Pl5EfhzYxolxR5gR45puckzZTxTj6u/5sC2tNt8evD4cnP8fROS9wfuHReRYEflFcJ68tg7a9Gkrn+VBO90lIoLZvfYm9yvN44IX0vpFMFv+fmHZ0pibgs9H4l5BEb8KqvoJzJDvYzGHP6OqbxaRUzDbCWdjtqWWm9cs4CnIegnaEZQ/5DqM16T7MG7pwGwZnKWq31fVz2Pc+y0UkRbM3vZ+jAOXrJclMe7bQsGcjHHusQpYg/F4dG5QjxcwjxdfDOzB8molIosxzk3ArFLBOpbC7IM/CeNG7zXB568Br1fVcOqiG/iWqu6KKfvHgH8CPod5fno5fAYzx3mwQFung7KZwquuVdUZGL8Mb1TVw8ALYpzDeBLCC2n9ErqMs5mA8c1o8xry/Tyuwzh3ALKP134U4+ijECdh9mD/HLBdJ7rmNRHjpq1QOU9Q1Scx+8xnR08uIn+DWS0yA+NdCMi6hzsu6G0BLGfI2fg8zH7yncChoJxnB2W5H7O3/SyMF6Vm63QfBe4O3u8MzhlyLKZNf4XZsz8ZM//6UCBSYf0+hBHxuLIfxOwZL5eZwBxV7QvyjWtrsW8kicj7RWQLxndp6IxmT9AGnoTwQlq/nIBxZGzTRr6bu26MQ26bd2D8ciIinRgPOifH5GcTOpQ+SVXtH65rXvsZEo8Tye81F1w+IsZr0CUYwXqkQNlC3o1x5hHaJRIvjfHp+j/B5xOA54CjgzhvwrglDEV/DaYNb7Xy+FXQDieo6o8xvyO7/PsxfxSnFCj7LZjeaLmPNN8JTBSRP4hrazEu+34fSXMNxoHyPZbtqKD+noTwQlqHiMhxwMFIz2Mx5gf+aiT6TlXdXCS7NHDA6k0VYjNmaJoWkXEV5PU8RrTAuIu7P3L8v4Jh8hxMb88mhfFbGa0bIjIdeMXavvyQFe9hzFTHJIwAXgh8n6Ee9Y+B3wbvQ5GfjRl2A6Cqv1bVU4CPB6ZdQJuITLXaYQsw1+oVDwJ/AXyhQNmnYnrKV0frU4K9GPG9nvi2Xkr+zTwhf+TSRvE/TU+ZeCGtT+4G3iAiT2
F6UF/A9Jyuiolb6vncjwHPi3lezr5CkVT1UYww/Qbz4MNy89qEmaM8A+M9/YuR41dh3KOdg/H6b597D/Cd4NzHMzRV8RGMO7XPBZ9fxXKGrKrbMWJ1I0YIt2Ic/4ZC+mUr7YTg9XuqGoprHoFwXYkR4fDP4CbMozQeF+NgGDW+Y18CTo0p+/sZ6jVHOVZEHsM4TL83iGuf/9tAF6bnmW1rMWtl/4r87+bvgccx0xx7xNyY3GHN5XoSwC/Ir0NEZCtwkqruF5GTgC/q0DOcRi0i8iiwRFUrdj5t5fV+TBtc4RB3AbBAVVdVe956R0Q+AxxW1WtHuixjCd8jrU9WMLSw/TkgutRotHILxoetZ+T4Y8zTAjwJ4nuknmFDRFIOc7GeGuK/g9rghdTj8XiqxA/tPR6Pp0q8kHo8Hk+VeCH1eDyeKvFC6vF4PFXihdTj8Xiq5H8B+UZZl3l6RmsAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x15558b79208>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts = np.log(counts + 1)\n",
"mean_log_counts = np.mean(log_counts, axis=1)\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_12.png', dpi=600) "
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['ABHD10, 2642bp', 'ACSL4, 5335bp']\n"
]
}
],
"source": [
"print(gene_labels)"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\python36\\lib\\site-packages\\ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in log\n",
" \"\"\"Entry point for launching an IPython kernel.\n",
"c:\\python36\\lib\\site-packages\\numpy\\lib\\function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile\n",
" interpolation=interpolation)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1856: RuntimeWarning: invalid value encountered in less_equal\n",
" wiskhi = np.compress(x <= hival, x)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1863: RuntimeWarning: invalid value encountered in greater_equal\n",
" wisklo = np.compress(x >= loval, x)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1871: RuntimeWarning: invalid value encountered in less\n",
" np.compress(x < stats['whislo'], x),\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1872: RuntimeWarning: invalid value encountered in greater\n",
" np.compress(x > stats['whishi'], x)\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABmCAYAAACZSRngAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAEvxJREFUeJztnXm8XEWVx7+/RBAMe4KBhCGPECCGRWQCCgiERSCIM+z7jCwKDkZlRAaEABFRFiVEWQd0AEVlBMSRfVB5gGJAHosshj0ZBgkmjAKiCSGc+eNUJzf3db9X3f2a9/pxvp9Pf7pv3eqqc8/tPreqbtXvyswIgiAIGmdIfxsQBEHQ7kQgDYIgaJIIpEEQBE0SgTQIgqBJIpAGQRA0SQTSIAiCJolAGgRB0CTvyckkaS3gAGC1QvJcM7usJVYFQRC0Ebkt0juA9wFHA3OBl4GprTIqCIKgncgNpO81s7OB+cAc4Engby2zKgiCoI3I6toDe6T3I4CT8AB8ZEssCoIgaDNyA+lfJE0G7gTuAZYHnm2ZVUEQBG1Ebtf+ZuAQvEu/FrAycHurjAqCIGgnlKP+JOkpYBPgGWAcIOApMxvTWvOCIAgGPrld+z8As4DF6R08qAZBELzryW2RTjCzJ94Be4IgCNqO3DHSn7bUiiAIgjYmt0W6CJiHj41a5d3MRrXWvCAIgoFP7hjpbDPboKWWBEEQtCm5Xftua+olbd/HtgRBELQluV372WbWUUp7ysw2bJVhQRAE7UJuIP0RPi56Lz5Gui2wnJnt21rzgiAIBj65gXQIsAuwKT4cMAu4zcwWtda8IAiCgU/uzaZVgNHAYjM7LwXWMcDzLbMsCIKgTci92XQbsD7w+bRtwK0tsSgIgqDNyG2RrgRcDewjaQVgNzyYZiNpJD6x/y1gDzN7vZwGDAc68elWk3orc8SIEdbR0VGPGUEQBNl0dXXNN7M1e8uXG0g/CZwJvA10AU8AB9Zp08fxYDwCH2+9oUra74ArzOwrOQV2dHTwwAMP1GlGEARBHpLm5OTLCqRm1gXs15RFMAr4PbAIH2+tlvY8MElSp5ndVa0QSUfjjzxh3XXXbdKkIAiC5sl9+N3zVOnKm9nYBuutNixgZvawpD2BuyVta2YLqmS6jLRAYOLEiXUNLwRBELSC3K79+MLnLuDvG6jrD7go9Ai8C181zczeSGv73wt0C6RBEAQDjdxAenp6HwO8ZGYLG6jrZpbeWHpO0taltOmSPgccCnSa2asN1BEEQfCOkxtIn8S74/cAv2ikIjN7Gdi6yq5i2gXpFQRB0DbkBtIKawIHSQLAzL7X5xYFQRC0GbkT8jcApuNr7MHX3aslFgVBELQZWYHUzKYCG+I3h/YBFgI/bKFdQRAEbUPu9Kc7WaqMvzxwFnA8sGXrTAuCIGgPcifk79hqQ4IgCNqV3BZp1W68mR3St+YEQRC0H7l37bfD53fGDaYgCIISuYF0BPA14FXgReBO4LpWGRUEQdBO1LNEdCgwDBiLC5h8EpjcIruCIAjahtybTUUpqUeB/5J0RGtMCoIgaC+y5pFK2k1Sl6TZ6fUgMLeeiiSNlPQbSfdIWrlamqQJqZ5bVFk+FQRBMMDJ7dpfAOxkZv8LIGkdfJx0gzrqyhF2/hBwGj50sDnwUB3lB0EQ9Au5gfQF4BhJ9+MT87fCbzrVQ46w8yi8pftS2u4WSGsKO09bNb1XEY2q7Fuy/Wr1tHry90Rv+avZ2C1P4fs5+YMg6DdyH8e8MnAYsHFKegL4vpm9nl2RNBUPmsOB5czsonIa3iK9BNgX+LWZ3dxTmRMnTrR41EgQBK1CUpeZTewtX26L9EDgUktRV9JGwAzgqDpsyhF2Hpm21077giAIBjy5gXQMMFPS6cCuwPb4vNJ66FXYGdc9/R7wR+CROssPgiDoF3ID6a+A2cB3gAeBqfgTRbPJFHZ+Aui1GR0EQTCQyA
2kB6X3O9L7gfhNp//uc4uCIAjajNwJ+UcASJpsZre21qQgCIL2Ilchv8K3WmJFEARBG5O7smlY+nhxC20JgiBoS3JbpA8BmNmMFtoSBEHQlmTL6En6ejnRzE7uY3uCIAjajtxA+q8ttSIIgqCNyb1rf5WkzYBNUtLjZhYT5oMgCMh/ZtMluNLT/SnpcEmzzezollkWBEHQJuR27XcHxpnZYgBJQ4FnWmZVEARBG5F71/7fgcckXS/pOuAxfLloFrUEm8vpkiZJmiXpmvoOIwiCoP/IHSM9W9LFwIYp6Wkzq0ck8yCqCzaX04cCZ5nZVXWUHQRB0K/kjpF2m/okqZ7pT7UEm8vpAvaV9HCtm1k1hZ2DIAj6idyu/RHALFzmrviqiaSTJHVK6gR2KuyqpSRtZnYj8Cl6GDYws8vMbKKZTVxzzTUzzQ+CIGgduTebhgEfo/BcezOb2dMXzOxs4GwASWdQXbD5D1XSXwHel2NUV1fXfElvAPNxcej5adeIUlpP+yL/wMvfDjZG/ndH/jHkYGa9voAd8FblJ4AvAJ3AFTnfTd+fADwA3IK3gj8HjKuS/vW0PaWOsh8ovldL62lf5B94+dvBxsj/7smf88q92XRXcVvSRcApOd9N3y8LNl9Q+FxMPzm9giAI2obcrv0ymNlbwFf62JYgCIK2pF490oHIZaX3amk97Yv8Ay9/O9gY+d89+Xsl63HMVb8oTTWzMxv6chAEwSCimRbpHb1nCYIgGPxktUglbV8l+Rz8OfRHmllnH9sVBEHQNuS2SG8EPgP8S+E1AfgAcFcP32srJH1c0rWSVpe0taTn+6jcMyTNkHScpHslXZvS95L0kKQLeiujh7IPlHRPWvywoaTZaREEkg6R9FtJP2zS/rmp/NGSrpB0XF/YX/BLNy0GSQcXjuMKSY9IytbFLflls5KmQze7JX1Z0pV12j9G0sIq53VHSQ9KuiJtD5F0vqR/q6PsjSX9RtJMSWuUf48l/0xJ9Z1YR/lnSJqRPh8l6er0eRnfSLop+fAvkoZJWk/SHEmr1Sh3f0l3VXxZsvOwZOeZafuctH1o2t4i9z9X+K/uLel+SXdIWq5KnSdIekDSuWm7R/sbJTeQvg5cij/87kQzOxh42cwWWqODrAMMSSsBx5jZ/mb2J3yF1Ut9UO444P1p81tmtg3QkU76kcA+wARJKzRYxY/NbDt88rDw+b2T0r59gZ2BcZUfWQP2DwVuTWWuy7IzPRq2v+SXiubCy8DmKZhOLtWzLX48uRT9clKx/LLd6dxX63X1xnHA03Q/r0cA/wSMkTQcOAx4xMzOraPsnYCzgF8BW1D4Pdbwz0eAf8gpuOh7SR3AtmZ2WKGsJb4xsz2BY3F/vgHsAfy5Vtlmdi0wCdgq+aJo56fxOel7pu09gK2Bo9L2zsCCDPuX/FeBj+K+eRX3d9k3u5rZRGDXQp017W+U3EB6OX5i9wdmSHoEGCVpA0kr9rVR/cRH8R/PLyVtg/9B3my2UDN7BrgmfTZJI4EXzGwR8DbwXvxCtXqD5Vs6BysDKwGTJO2Qdl+JL3K4O9XXCKsDG0va28x+A/yisK9h+4t+obvmwmR8kUYlr+F/zpvrKL/ol4Wl8st2/zNQl1COpDXSx/lVzmvleObiq/b2AI6ptIoyuREPaGvhq/6Kv8dl/AMsTseTOy+86Pvd8aB3U/JXtXN6OOkutpldBPypVtnpwjsLX1jzsZKdq5jZ68BfUzAcAryFr5zEzL6Bn6veWPJfBW4ApgCvpeMq++ankqYDV+fY3yhZgdTMvmJm08zseDPbB59Efwsur7dZXxvVT6wBXAj8GP/D/kdfV5Culhey9NEtM/CL1HbA/zVR9HRgqpl14Vf76amFuBGuiTAq/cDrxszm463Bz0oaU9rdV/YvUyXwj8BPKgmS1sSFar5ZZ1nTgal4oCmWX7Z7S6DHJc9VOAT4frKvfF6LGP7bmgxsV0eXcjQeiJcHjmfZ3+My/gHOw3
+zjYhPrIH76DG81VbtnH7YelkSXsFcs3gCsCre8PpJray4psadLHt+cm2u/Fd3BuYAq0pame6+GQs8BaxXZx11kav+tBLwSWBTXOruaeCzZjavhba908zDnf1n3C/fxq96e5vZDX1Ux97Ab81sDoCZdUraC/iOmeVcibshaQsvyn6bynxD0iK8VXEwsA3e6ugAnm2kDjNbJOkVYLVSetP2J8qaC2OA7+H+/yhwAHBmPa3qol8kfaJYvpk9VLEb/z2Pw1fbbSJpIzPrUZAnMZbUMgJOoHBeC8czEg+G8/BW10J8+CWHA4Cf4d36z6fvT5C0NyX/mNk1kh4CvpRZdpGKbQvwm8/LnFNJa6U82ZjZYknvAdZh2fP4WoolK6ZhgvMlPQ5s1YDNlf/qF3EfjUrllH87u5nZ8ZIelbRcEz2znslZRwrciq9k2h2/ap1CnWtRB/oLWAG4Hbgb6EhpnX1U9iT8Sn8B8DCuVTAOv5reAIxtouzj8a5UJ3Aq3rI6O+37IvBoqmNIg+XvB/wab30J7+Ydl/Y1ZX/BL8toLhT2d6b336Xju6lBv2xfLL+a3fiF5soGjqGzynndEXiQpEeBD4t1AlfVUe4O6bjvB0ZV+z0W/HMg8J/A8AZ8PxJvFf4SWKXsm2T7eVWOebUa5R4H3ANcx9JZQRU7D0t+OTNtTwGuwANr5fsPZ9he/K+emH7jncCwKr6Znvx4UY79jb5ypz89hwfRJUn4PNKdk7Oe6rWQIAiCQUpuIL0cv9JWw8xspxr7giAIBj25gXRLvCtvhbTtzezuVhoXBEHQDuQG0tlm1lFKe8rMNqzxlSAIgncNuYH0R/i46L34tIVtgeXMrJ4J0kEQBIOS3EA6BNgFn/40BL8bepu1aipBEARBG9GwjF4Q9AWStsZXvNz+Dtc7DvigmV3/TtY7EJF0CnBuNIwaZzAIOwftzan0wVLcBlgH72EFPrn9oP42op2JQBr0G2m9+gb4BOmg/7iKCKRNEYF0ECHpSrmM3oKk6oNcAm2OpEVJ6WiapEmSxkv6eSHP+JR+TUr7tqQXJX25VMc0SX+U9GDa/o6kZyVNqbcs4MPAXZVpdZLeSvbPTNu7SHpa0t2SVpXUkY7tmbTvREkvSLo45d89HessSTtV7Envn5Y0LX2eDFwEfEbSCSltiKTFktaVtKKkv8pl8jq0VBLuV5XtVNfuhX2fTrb+IG2vLem+lLa8XIpwOblE3Ygqto+VS/29WPFlwedz0/vMVP83JR2UPs+UNFzSw6mebr5OPn22UM6hyU8/kyR85drm+b+0oEwE0sGF8KV/DxfShuI3Cl8s5T2JHrQWzOzzeJfv2Cq7TzOzLSRtii8r3ARfjlpvWeOAZ2CJatDLyf4KX8OVk+7ApenAlxCOM7Ofm9k5uLTfjpJWx9e3/wgXblmitCSXc6sEzBG4yMc04Lu48tGe6ThewR85vivwGgVFK0m74uIm4DNXKOwbgq+HH49L6a2ftq8CNjKzytDFFOBaM5tfxfZjgUuAr+LPVK+H0/Axzjdr+Hposs2NN/uBmY3B9Rg+aGZvA6/IxWGCBohAOrioSMYVGYZrNRZZn+66j7fgYg/AkkduP4oLfdRiPL4W+yGgKKeYW9YquFxbLTtHm9nT+HrzTcqVSzoZn0EyBlcbApbIxL0/tbYADmWpAPm2+LryecCiZOfuyZY78TXuu+FqSisXqvsMcFP6PC/VWWE47tPH8TX7I/Dx13tTkKoc36fwIF7N9jfxNeT1sh4w0cx+mMqt5msVbyRJOlzSLFzDtCJE81ryQdAAEUgHF6NxIeMiI+kucTcFF+kusgeuy4mkCbiizgeqlFekIiQ93syKf9zcshawNHiMpXurueaUErmK0NF4wPpdDdsq7IeLelTSVco3FNdy/VvaHg28ALwv5dkclyOsBP3v4j68qFDG48kPo83sPvy/VbR/AX6h2LSG7RfirdF6H3M+D1hF0t9V87Vctu+vpe+cgQsq31
pIWyEdf9AAEUgHCZLeD7xZannsiv/B3ypln2dmv++huKHAwkJrqha/x7umQyUt30BZL+JBC1w27s7S/pdSN3ki3torMgTXsSwfG5LWBV4vLGm+t5BvJj7UsRoeAA8Bfs7SFvV9wP+kz5Ugvwne7QbAzJ40s02Bz6ak+cBISWsV/DAL2KbQKl4MfAE4v4bta+Et5dPLx9MLf8GD79lU9/UBdL+ZJ7r3XEbS80Uz6IEIpIOHm4DNJD2Dt6DOx1tOp1bJ29szux8DXpQ/P+eNWpnM7FE8MD2HPwyx3rK68DHKnXE19UtL+0/F5dIm42r/xbpfA65Pda/N0qGKY3B5ta+m7bcoiCKb2Vw8WE3HA+FsXAi4EkgvL3x3WHq/zcwqwbUbKXBNxYNw5WIwA3+0xhNywWHMNWP/DHyoiu2Hs7TVXGa4pMdwEfXbU95i/dcBG+MtzyW+ls+V/RLdz803gCfwYY7X5DcmXy6M5QZ1EhPyBwmSZgPjzWyBpPHApbb02U0DFkmPAnuZWUOi06WyDsd9cFJG3knAJDOb1my97Y6k04C3zezM/ralXYkW6eDhSJZObH8BKE81GqhciOvaBv3HR/CnBQQNEi3SoF+RNCRjLDZoIXEOmicCaRAEQZNE1z4IgqBJIpAGQRA0SQTSIAiCJolAGgRB0CQRSIMgCJrk/wHNxzwVby5b0AAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x1555aeba8d0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts = np.log(counts_rpkm + 1)\n",
"mean_log_counts = np.mean(log_counts, axis=1)\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_13.png', dpi=600) "
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\python36\\lib\\site-packages\\ipykernel_launcher.py:7: RuntimeWarning: invalid value encountered in log\n",
" import sys\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAAEYCAYAAAAJeGK1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xt4VdW57/HvS4gJBkFRVGiEAGINxFs3XkqxJe4t3qpt97bVwDlKk4LXnLrVjR5yar00lOKl2thWUVCqEne3T+u2ohYsETe1WrFVkQZtaVEiqCgoNwNJeM8fcyVNQkJG0sysFdbv8zzrSdZcc6z1C6yVN2PMMcc0d0dERCTV9El2ABERkbaoQImISEpSgRIRkZSkAiUiIilJBUpERFKSCpSIiKQkFSgREUlJKlAiIpKSVKBERCQl9U12AIBDDjnE8/Lykh1DRER6wCuvvPKhuw/uaL+UKFB5eXmsWLEi2TFERKQHmNnbIftpiE9ERFKSCpSIiKQkFSgREUlJKlAiIpKSVKBERCQlqUCJiEhKUoESEZGUpAIlIiIpKSVO1BURiYOZdbmtu3djEukK9aBEZJ/l7u3ehl/35F4fl+RTgRIRkZSkAiUiIilJBUpERFKSCpSIiKQkFSgREUlJKlAiIpKSVKBERCQlBZ2oa2bfAQ4CDgD2B/oB+7v7mTFmExGRNBbagxoJnJ74/hfA1cAlsSSSHlFZWUlBQQEZGRkUFBRQWVmZ7EgiIi0E9aDc/Ztm1gc4E7gFyANOjjGXxKiyspKysjLmzZvHhAkTWL58OSUlJQAUFRUlOZ2ISCSoB2Vm/w94Avg34AfAaHf/S5zBJD7l5eXMmzePwsJCMjMzKSwsZN68eZSXlyc7mohIk9AhviOBOqKe0zXAc2b2VlyhJF7V1dVMmDChxbYJEyZQXV2dpEQiInsKHeKbGnMO6UH5+fksX76cwsLCpm3Lly8nPz8/ialERFoKncVXABQDBzbbvM7dvxtLKolVWVkZJSUlexyD0hCfiKSS0OtB/TdQBnyfaPaeAXMBFaheqHEiRGlpKdXV1eTn51NeXq4JEiKSUkIL1E53f9TMvg2MAHYCWzpqZGYjgJ8n9j3b3Xd2Oal0q6KiIhUkEUlpoZMkTkx8vZDonKixRDP6OvI14CfAcqDFUXkzm25mK8xsxcaNGwNjiIhIuggtUP9kZl8EhgOLgCeBh8xsvZn9cC/tdgNZwGbg0OYPuPtcdx/n7uMGDx7chegiIrIvCx3iu6yNbaPdfVAH7f4TeAQYBlzemWAiIpLeQqeZ73GwwsxeD2i3wcz+majX9dvOxxMRkXQVOs18A+DNNwEd9Z4ws88AFcDd7r69SwlFRCQthfaghnTlyd39XeBfu9JWRETSW+hafEVm9pSZXWBmD5rZH8xsStzhJD5azVxEUl3oJInZwFTgaeAkYD3wEtEECOlltJq5iPQGodPM3wWeA25w99eB7cCOuEJJvLSauYj0BqHHoMYnvp2TuP+pmZ0UWyqJlVYzF5HeIPQY1BQz+6OZ/dXM/mZmfyM6WVd6ocbVzJvTauYikmpCh/h+AJwF7AI+C+QTrcknvVDjauZVVVXU1dVRVVVFSUkJZWVlyY4mItIkdJLEi+7+npmtBBYAtcCa+GJJnLSauYj0BqHHoM5PfHshMImo5/VsXKGke5nZXh9ftWoVkydPZvLkyW0+7u5tbhcRiVPoMaihZnYb0ZJFVwPnEF3+XXoBd2/3Nvy6J/f6uIqTiCRL6BDfY8B9wP1EK5SfCPwSGBNTLhERSXOhBWoI0blQwxL3PwQGmtkkAHdfHEM2EZEOHXfTYj75tK5LbfOuX9TpNgP7ZfLadyd16fWkc0IL1ANA6yPoixPbPPG9iEiP++TTOtbOPqfHXq8rRU26JnSSxM0AZnZwdNc3xZpKRETSXpuTJMwsy8wWNLt/kZm9AywBnjWzt8zsmz0VUk
RE0k+bPSh332lmR5nZUe7+FnA9cJy7bwYws35ANdHQn4iISLfb2xDflcCTZrYI6AdMM7OtwEDg88CLPZBPRETSVLsFyt1fMbPjgdOAGiAnsf9mYJa7v9QzEUVEJB3tdZKEu+8gWhT2STM7FjiGaNZebQ9kExGRNBY0i8/MfgqMBn6f2DTVzNa6+/TYkomISFoLPQ/qTOBId28AMLMM4C+xpRIRkbQXWqDuBd4wsz8RDfGNJVr2SEREJBahJ+rONrOfAEclNv3Z3T+JL5aIiKS70B4U7r4FWBFjFhERkSahV9QVERHpUaGz+E4FDgIOAPYHsoEcd58dYzYRkQ6tzZ4MN/bk6wHoCEdPCB3iW0BUlJ4kWkHiE2BLXKFERELl1S7s8dXM1/bYq6W30EkSIxMn6p4J/G+i4nRZnMFERCS9hQ7xTQDOBY4G/gT8BvgooF3jdHQDztZlOkREJFToJImHga8Srcc3BigFng5odxrwfWA58LnmD5jZdDNbYWYrNm7cGJ5YRETSQugQX17j94lLbTTeOvIrosOXfYmKVPPnnAvMBRg3bpwHpRURkbQR1IMysxFmdqOZ3e7unwJbgbyApp8B3gP2Aw7pckoREUk7oUN8TxBdcuM8AHevAx4MaPcNoqvwvgyc0YV8IiKSpkKnmdcSFSjMLB+YRNiJAL8AKhLtv9qVgCIikp5CC9T5RBMj3gRuI5rJ95WOGrn7MuDYLqcTEZG0FVqgvgr8h7s3TWYws2uJipWIiEi3Cz0G9W0go/GOmWWiE3VFRCRGoT2oHwOrzGxV4v6xwPx4IomIiISfB3W7mc0luh6UAX9x949jTSYiImkt9DyoKURLFh0O3AM8b2ZFcQYTEZH0FjrEdzNwCbAIKADqgN8ClTHlkk467qbFfPJpXZfa5l2/qEvtBvbL5LXvTupSWxGRjoQWqAxgJLAJKCQa5tsdVyjpvE8+revRSw5A1wubSHfryffiwH6ZPfZa6S60QH0fGEI0vDckse2WWBKJiHRCV/8wy7t+UY//USedEzpJ4t64g4iIiDQXeh6UiIhIjwod4sPMBhFdsBDgTXfv8IKFIiIiXRV6Rd0yoAh4PbHpGDP7L3e/ObZkIiKS1kJ7UCXAqMa1+MysD7CGaPq5iIhItwstUE8By8zsZcCBE4FnYkslIiJpL3QW35VmdjTRSboA8939T/HFEhGRdBd6DOqiVpvGmdk4d/9ZDJlERESCh/juBHQuVApbmz0Zbuzp14SwCyuLiHReaIHaBHw3ziDyj8mrXZiUpY7W9ugrikg6CS1QfYHVze4b0WSJkd2eSEREhPBJEnkx5xAREWlBSx2JiEhKUoESEZGUFHpF3Rlm9pKZXWFmfzOzdWZ2TdzhREQkfYVOkrgUmAS8CgwjuqLum8DtMeUSEZE0F1qg+gDFQC1wLdEsvoa4QomIiIQWqCuAQ4Dmw3qXdH8cERGRSNAxKHdfBPyRaGivDnjV3Z+KM5iIiKS30EkSPwXuIFostgC41czmBrS70syeM7O1ZqYel4iIBAsd4jsTONLdGwDMLAP4S0eN3P1u4G4zWww82vwxM5sOTAcYNmxYZzJLO/KuX9SjrzewX2aPvp6IpJfQAnUv8IaZ/YloiaOxwP0hDc1sDPBXd2+xqqi7zwXmAowbN86DE0uburoOX971i3p8DT8RkRChSx3NNrOfAEclNv25dcHZi/OA/+5KOBERSV+h14Oa1cY23H1mQPOJwN2dzCUiImkudIjvzcTXWUBIUWruMHff1sk2IiKS5kILVCXRybnXAwsT3wdx9xO6kEtERNJcZ3pQTlSY3kTXgxIRkZiFTpIYEXcQERGR5kJP1P2KmS0zsxcT9zPN7Op4o4mISDoLvR7UbUAJMBDA3euAK+MKJSIiEnoMahXwLWCgmV0GnA6siC2ViIikvdACdT7RckcfAv2B+cAzcYUSEREJLVCPuftXgSfjDCMiItIotECdaGan0+r8J3df3P2RREREwg
vUYmByq22e2C4iItLtQs+D+mbcQURERJoLnWYuIiLSo1SgREQkJYVebuNVIBtY37gJcHc/La5gIiKS3kInSZwGXEN0wcL7NHtPRETiFjrE92WiVcyfB35gZtVai09EeqPKykoKCgp4e855FBQUUFlZmexI0o7QHlTj+U9bgDtjyiIi0q3M9n7pulWrVjF58mQmT259Fg24e1yxJFBoD6qqnZuISMpy9xa33NxchgwZwtKlS9m1axdLly5lyJAh5Obm7rGvJF9oD+oRYBzRUkdb+fsFC4tjyiUi0u1qampYvHgxhYWFABQWFrJgwQImTZqU5GTSlqAelLufCpyX2H8D8B13V3ESkV5n6dKlFBQUkJGRQUFBAUuXLk12JGlH6AULHwCKiI5BnQ381cyeiDOYiEh3GzRoEHPmzKG4uJitW7dSXFzMnDlzGDRoULKjSRtCh/hujDOEiEhP2H///dm9ezcVFRVce+21DB8+nAEDBrD//vsnO5q0IXSSxMPu/nbzG7AwzmAiIt1t/fr1FBUVsWHDBtydDRs2UFRUxPr16ztuLD0utEAdaGZnmNkAMzvAzM4E1CcWkV5l6NChPP744zz99NPs2rWLp59+mscff5yhQ4cmO5q0IXSI7xvAd4DbiGbwrQYujCuUiEhcWk8h15Ty1BVaoPKBKd7sf9LMioDXYkkl3aqjkxXtB3tvrw+w7CvWr1/PJZdcwllnncXOnTvJysqiuLiYe++9N9nRpA2hQ3y3+56/pW7p7jASj9YnIC5cuJABAwaQmZkJQGZmJgMGDGDhwoV77KviJPsSDfH1LqEF6pdm9oKZ3WFmt5vZC+hqur3WlVdeybZt25g9ezbbt29n9uzZbNu2jSuvvDLZ0URipyG+3sNC/3PM7CjgGKKittrdVwa2OweYCkx3981t7TNu3DhfsWJFUA75x5kZRUVFvP7661RXV5Ofn8+xxx5LZWWlPqyyT8vIyODBBx/kBz/4QdN7/7rrrmPq1Kk0NDQkO17aMLNX3H1cR/uFXg9qBDAFOMDdrzazTDP7grv/toN2/YFL3P28Nh6bDkwHGDZsWEgM6UaLFi1i0KBBuDvbt29n0aJFyY4kErv8/Hxyc3N54403mrZVVVWRn5+fxFTSntAhvieAdcC5AO5eBzwY0G4CMMbMlprZ4c0fcPe57j7O3ccNHjy4E5GlO2zZsoW1a9fi7qxdu5YtW7YkO5JI7MrKyrjgggsYMWIEffr0YcSIEVxwwQWUlZUlO5q0IXQWXy1QA5iZ5QOTgE8C2g0C7k60vwC4qyshJR5mhrs3fRVJB7W1tXz88ce4O++++y7Z2dnJjiTtCO1BnQ/8C9H5T7cCRwBfDWi3EcghKlB7n+ssPap5UWosUiL7uhkzZtC/f39+/etfs2vXLn7961/Tv39/ZsyYkexo0obOTJI4CDiaqKitcff3AtpkA/8N9AMucve1be2nSRI9y8wwMw499FA++OCDpq+aVi77OjNj8eLFnH766U3blixZwqRJk/Te70HdPUniWqJrP71GdB2o48zsaXe/dm/t3L0WOCPkNaRnaaqtiKS60GNQ04Cx7l4PYGZ9gDXAXguUpLZdu3bh7uzatSvZUUR6RG5uLl/5yleor6+nrq6OzMxM+vbtS25ubrKjSRtCC1Qf4PlWxykOS5ywi7uP7+5gEi8zY/Pm6LS0zZs3a6KEpIUxY8awePHf1xioq6ujrq6OMWPGJDGVtCe0QI0HdMGUfUROTg7bt2/noIMOYvPmzU1fc3Jykh1NJFbPPvssZsZhhx3WdPz1/fff59lnn012NGlD6Cy+s4Avtb41uzaU9CI7d+4kJyeHgQMH0qdPHwYOHEhOTg47d+5MdjSRWO3evZtZs2axYcMGGhoa2LBhA7NmzWL37t3JjiZtCC1QdxKtaN76Jr1QfX09FRUVTT2mnJwcKioqqK+vT3IyEZG/Cx3i2wR8N84g0nOysrLYvHlzi+Ve7rjjDrKyspKYSi
R+GRkZlJWVsd9++3HppZdyzz33UFZWRkZGRrKjSRtCe1B9iU7Sbby9mfgqvdC0adO47rrruOOOO9ixYwd33HEH1113HdOmTUt2NJFYXXbZZbg711xzDTk5OVxzzTW4O5dddlmyo0kbgidJuPv65hvM7MgY8kgPqKioAGDmzJlcc801ZGVlcemllzZtF9lXjR8/nvvvv5/a2tqmbVlZWYwfr4nIqShoJQkz+5u7j2i17U13/2x3hNBKEiLSE4444ggaGhp45JFHmDBhAsuXL2fKlClkZGSwbt26ZMdLG6ErSYQO8f3ZzGab2Xlmdq6ZzQbWd9hKUlZlZSUFBQVkZGRQUFBAZWVlsiOJxK6mpoaLL76Y0tJSsrOzKS0t5eKLL6ampibZ0aQNoUN8XyO6HtTpJC5YCHwlrlASr8rKSsrKypg3b17TX5ElJSUAFBUVJTmdSLweeOABKisrm977es+nrtAhvm8B8zyxs5l9Fpjh7iXdEUJDfD2roKCAiooKCgsLm7ZVVVVRWlraYmafyL4mMzOTrKwsBg8ezNtvv83w4cPZuHEjO3fupK6uLtnx0kZ3D/ENB140szPN7A7gEeDJfySgJE91dTU1NTUthvhqamqorq5OdjSRWNXX17Njxw7WrVuHu7Nu3Tp27NihcwBTVGiBWg7MBe4HjgT+H7A9rlASr6FDh1JaWsr27dF/4fbt2yktLWXo0KFJTiYSr759+5KTk8MRRxyBmXHEEUeQk5ND376hRzukJ4UWqAuJLt++BPiI6Oq4F8YVSuK1Y8cOtm3bRmlpKVu3bqW0tJRt27axY8eOZEcTiVV9fT39+/dn/vz57Ny5k/nz59O/f3/1oFJUUIFy928CJUQ9pzKg2N2L4wwm8dm0aRMzZsxg/vz5HHDAAcyfP58ZM2awadOmZEcTid3UqVNbzOKbOnVqsiNJO4IKlJlNBX5PNHvvF8BaM9PUl16ssLCQN954g4aGBt54440WEyZE9lW5ubk8+OCDVFRUUFtbS0VFBQ8++KCuB5WiQof4rgNOAWrc/ZTE97fElkpilZuby0UXXURVVRV1dXVUVVVx0UUX6UMq+7w5c+bQ0NBAcXExWVlZFBcX09DQwJw5c5IdTdoQWqB+k7ia7tWJ+x8Av40nksRNH1JJV0VFRdx1113k5ORgZuTk5HDXXXfpXKgUFXQeVNx0HlTPq6yspLy8nOrqavLz8ykrK9OHVER6RHefByX7mKKiohbHoFScJF1oma/eI3jyv5kNBAY221Tr7h90fyQRkXhoma/eJXSpo7uB04BhwMrE5sPcfWR3hNAQn4j0BC3zlRq6e4jvXHcfA9QApwLjAfsH8omI9Dgt89W7hA7x/Tjx9WfAn4F64OexJBIRiUnjMl8HHXQQ7q5lvlJc6EoScxJfZwFfBE5z9+viDCYi0t2aL/PV/KuW+UpNoStJ3GVm68zsVmAxsMTM7ow3mohI99q0aRPnnHMOM2fOJCcnh5kzZ3LOOedoma8UFXwMCvgs8C3gc8AxwDdCGprZe2b2nJkd3rWIIiLdZ9myZQwZMgQzY8iQISxbtizZkaQdoQXKiC610QeYBywgOg6190ZmGcDT7j7R3d9r9dh0M1thZis2btzYydgiIp2XkZHRYgX/xhX9MzIykh1N2hA6zfwEYECrzZ+4+6sdtDsEeAr4vrv/sr39NM1cRHqCmTFgwAAGDRrEO++8w7Bhw9i0aRNbtmwhFVbVSRfdPc38Fndf1uq21+IE4O4fAl8ArjCz4YGvJSISm8svv5ycnBwAcnJyuPzyy5OcSNoTOs18vJnNar3R3Wd21NDd68zsI+BA4O1O5hMR6Ta5ubksWLCARx55pGkliSlTpmgl/xQV2oPaRXQtqDdb3fbKzM43s98m2r/e1ZAiIt1hzpw51NfXU1xcTHZ2NsXFxdTX12sl/xQV2oP6hbv/rLNP7u6PAY91tp2ISBwa19srLy8HoiG+WbNmaR2+FBXagzqm9YZEz0
h6Ka3oLOlKK/n3HqE9qAPN7Azgd4ATTXwYFFsqiZVWdBaR3iB0mnk+8B2inpQRHY/6XshMvhCaZt6ztKKziCRT6DTz4Cvqmtl+wMHuvuEfDdeaClTPysjIoLa2lszMzKZtdXV1ZGdn09DQkMRkIpIOuvU8qMQU82rg5cT9TDN79B+LKMmSn5/P8uXLW2xbvnw5+fn5SUokIrKn0EkS57v7KGBr4n4mcEo8kSRuZWVllJSUUFVVRV1dHVVVVZSUlFBWVpbsaCIiTUInSSw0s+XAEDP7T2AccE98sSROjRMhSktLqa6uJj8/n/Lyck2QEJGU0pljUAcBo4gmSfzF3Td3VwgdgxIRSR+hx6CCelBmdkMb23D3m7sSTkREpCOhx6Deb3Yrbva9iIhILEKPQVUlvuYBte5+bzxxREREIqEF6l6iFSQ+ILqqroiISKyCCpS7F3a8l4iISPcJnSSxBhgKvATsJprJ5+5+WozZREQkjYVOkjga+D/AduBR4CwVJxERiVNogfq/wBDgFeAy4F0z+1FsqUREJO2FTpJY1uz738QRREREpLnQApUVawoREZFWQgvUz4FfEE2OaOTA4m5PJCIiQniB2gn8FtgC1AAvu3t9bKlERCTthU6S+AkwnOgSGzOANWY2LbZUIiKS9kJP1L2p+X0zOwxYANwXRygREZHQE3UPAL4JFCQ2rQIuiCuUiIhI6BDfr4ADgMcStxzgibhCiYiIhE6SGAo86u5rAMzsL0Q9KhERkViEFqjLgUfMbCjR9PINiW0iIiKxCC1Qf3D3U5pvMLNTY8gjIiIChB+DetbMrjezTDP7nJk9D1wb0tDMhpvZzq5HFBGRdBTagzoJuBL4A/AJ8G13fyWw7VXAn7uQTURE0lhogVqX+JoD1AJPmpm7+9C9NTKzQYlvP2zjsenAdIBhw4YFxhARkXQRNMTn7kMStwHufmji+70Wp4TJwEPtPOdcdx/n7uMGDx7cmcwiIpIGQo9BAWBmKzv5/COJlkYak+gxiYiIBAkd4mu0uTM7u/vVAGb2nLvP7eRriYhIGgvqQZnZjwHc/YtdeRF3n9iVdiIikr5Ce1Bnm9loWl4PCnd/q/sjiYiIhBeot4HWQ3QOnNa9cURERCKhl9uYGHMOERGRFkKPQZ1hZq+Y2drE7Q9mdmbc4UREJH2FDvFVAKe5ew2AmeUCVcDouILV1dVRU1NDbW1tXC8h3SQ7O5vc3FwyMzOTHUVE9iGdWUniEjP7PdGxp5OAd2NLBdTU1HDAAQeQl5eHmXXcQJLC3fnoo4+oqalhxIgRyY4jIvuQ0BN1vwqsB84AzgTeA86LKxRAbW0tBx98sIpTijMzDj74YPV0RaTbhfaghgC/SdwaHQ5s6fZEzag49Q76fxKROIQWqD8CLzW7b2iauYiIxCi0QH0MlBP1mGrcfUN8kURERMIL1GLgIqA/MMLM9gOud/cnY0smIiJpLfRE3W82v29mY4BHARUoERGJRacut9HI3f/k7sd2d5hU0tDQQFFREcOGDeOll15i4sSJrF69uunxG2+8kXvuuYcPPviA0aNHM3r0aG699dYWjzV67rnnuPDCCwG47777GD16NFOmTGnxeo1t7r//fmbOnEl9fT1f//rXGTFiBPPnz2/ar2/fvuTl5XHKKacAMHHixKbnvfHGG1u81uGHHw7A2rVrGTduHGPHjmXNmjUAXH311QwfPpxbbrmFI488kszMTEaNGsUzzzzDRRddxMiRI8nPz2flypVs27aN/Px8RowYweuvv96d/8wiIu3q7OU2kibv+kXd/pxrZ5/T7mPPPvssDQ0NvPPOO+zevbvd/Xbs2MHBBx/M888/z3HHHUdxcXG7++7evZsf/ehHrF69mi9+8YusWbOGUaNGNT2+bds2Hn74YZYtW8bixYvZb7/9WLlyJccffzwXX3wxAIcddliLIgTRSc233norkydPbnNGXUVFBTfffDPvvvsuCxYsYMqUKbzwwgusXb
sWd+c73/kOeXl5rFq1iuzsbGbPns1TTz3FQw89xPPPP88xxxxDdXU1lZWVzJ8/nzvvvHOv/64iIt2h1xSovRWTOKxcuZLx48cD0KdP1NE8++yzGTt2LJWVlXvsv99++3Hsscfy5ptvAnDzzTczb948HnjggaZ9PvroI9asWcPYsWPZunUrH374YYsC9cMf/pCysjIGDhzIqlWrOOmkk+jfvz+DBw9mw4YNDBgwgIEDB+7x2o888ghf+tKXABg8eDBvv/12i8dXr17N448/DsC5557LypUrOeWUUzCzdqeIn3766Xz88ce89dZbrF27lgsvvJANGzZQWFgY/G8okooqKyspLy+nurqa/Px8ysrKKCoqSnYsaUOXhvjSwe7du/f45f3UU09x1FFH8fDDD7fZpk+fPjQ0NABwww03cMMNN/C9732v6XF3Z+zYsaxevZp3332Xk08+uUX7E044gcWLFzfdb/36f/3rX/nMZz6zx+s+9thjXHDBBQCMGTOGESNGkJ+fz6ZNm5pe94knnmDNmjXceeedbf5srS1ZsoTbbruN2267jZ/85CdcfPHF3HvvvXttI5LqKisrKSsro6KigtraWioqKigrK2vzj05JPhWodhx99NG88MILQPQLvlG/fv2oq6vbY/+GhgZee+21Fj2i1vsecsghvP/++7z33nvs2rVrj+f48pe/TH19PUuWLGHMmDG8/PLLbN++nQ8++IDDDz+cn//85232YMaPH0/fvn/vDC9cuJDq6moGDRoEQH5+PkuWLAFg165dHH300fzud7/D3Vv8bK1lZWWxc+dOGhoayM7Obnc/kd6ivLycefPmUVhYSGZmJoWFhcybN4/y8vJkR5M2qEC148wzz6Suro4jjzySZcuWceihh3LGGWewZMmSpt5Ko9dee42RI0dyxhlnMHToUAYNGsTNN9/MFVdcwVVXXdW0X58+ffje977HySef3O5Q2U033cSsWbOYNGkS27Zto6CggOuvv55ly5axZMkSLr300hb79+3bd6/HvQCuuuoqHnroIUaNGsVLL73EscceywknnMCoUaNYuHDhHvsPHz6cSZMmccsttzB9+nSmTZvG7bffzrRp08jNzQ39JxRJOdXV1UyYMKEEBS7eAAAGz0lEQVTFtgkTJlBdXZ2kRLI3tre/oHvKuHHjfMWKFS22NY4Pp7rG4zMvvvhisqMkVW/5/5L0VlBQQEVFRYs/EKuqqigtLeWNN95IYrL0YmavuPu4jvZTD0pE0kZZWRklJSVUVVVRV1dHVVUVJSUllJWVJTuatKHXzOJLVXl5eWnfexLpLRpn65WWljb1+svLyzWLL0WldIFyd62U3QukwjCxSKiioiIVpF4iZYf4srOz+eijj/TLL8U1XrBQs/xEpLulbA8qNzeXmpoaNm7cmOwo0oHGS76LiHSnlC1QmZmZuoS4iEgaS9khPhERSW8qUCIikpJUoEREJCWlxEoSZrYReLvDHSUOhwAfJjuESBLovZ88w919cEc7pUSBkuQxsxUhS46I7Gv03k99GuITEZGUpAIlIiIpSQVK5iY7gEiS6L2f4nQMSkREUpJ6UCIikpJUoKRbmdkgM9PKsbJPM7NMMzs02Tn2dSpQ0t3uAI5OdgiRmH0BmJnsEPs6FagUZWb/Zmarm91/zMyeM7MXzexEM8szs8cTj000szsT3//OzJaa2TNmNjaxbYSZvW1mBybu321mr5jZl9t43VPN7H8Sjx+S2HaOmf2XmR2UuP95M/tb4vvvm9nLZnZH3P8mkn7a+BycaGbLzWyZmRWa2U2J9+p9icdfbdV+uJntbOe5J5rZajN7NHH/XjP7o5mda2YHmdnTiff2iMbnTnwGj4/vJ5bmVKBS19eAdWaWn7h/CPBV4E7gn/fSrp+7nwZcBtyd2HY28DFAYvjtc8C/AtPaaL/c3U8FqoDjzaw/cIm7f93dNyf2+RawIfH9THc/ETit2XPMM7MlZtbPzL5sZq8nit
6o8B9fBNjzc3A78HV3/5K7VwH/6u7/5O5tvZcBrgL+3M5jGcD33f1CAHe/BLgi8fyb3f0s4JdAYWL/1919ors3FsHzzewPZvYlADOrNrMXzOzSf+QHlr9TgUpBZpYBDAPmA+c1e+hp4C5gYeL+BDN7jqhoteDufwP6mVmGu/8YaCwuhwAfERWYz7TRzi26jPEY4PfABGBMold2uJmNIfrA72q2/zHAS82epiRx/4zE680HbgYu7+Q/haSxdj4HB7r7hma7LTGzq81sj99lZjYo8W17yxntD/ybmR2X2P8c4ElgQeL+jUR/6D2T2H+MmZU0a/8Y8A3g3xP3dwKnApeE/oyydypQqelE4HXgOWBSs+1nAV8n+mUPUW9nItFfiZ3liVtbZgJz3X0LMIioJ/Zz4ALgYqJfGEBTj+z7wIxWz/E2MKTZ/bWArmoondHW56Cu+Q7ufjXwDlDZRvvJwEPtPbm7/4poNOD+xP1FRMeWvpK4fyMwC/h8oslJwEQzO7XZ07R4n7t7A7DDzPoF/HzSARWo1HQacDLwU+A4M8tq9tiHwICOnsDM8oDaxAemuQ+Jis5QYH0b7Q4HTnT3xxObNgI5QC1giXY/Ivpr8mtEvaKfufsnrZ5qJFDT7P5w4IOOcos009bn4GMzG918J3d/DDjazDJbtR9J9IfTGDOb3s5rfETUk2r0IXBCq/vHJ15nN9FnZlCzx1u8zxOjDwe6+6dBP6HsVcpeUTfNfR44193fM7OfAv8EbCIafjDg2r203WpmvyHqHZW2ftDda83sj0Rj6zckhlGed/cvJHb5AtGxp+eIek5PJl6vH3CRuzdOxnjO3X9pZr8CBpvZ5cCXiT7A9xIVo+8S/RV7BbANmNrFfw9JT219Dq4BHkxMfLiL6A+kg4Gn3L3OzEaa2TNAtbv/OzS9V+cmhuf6uHvjhIpZRD2znyYKy5NEvfzZZvZZ4B7gQOB/mdkEYDawleh9nQ+cnsj47UTeUURD22315qQLtJJEmkvMUPoPd4/l+JCZTSX6i3KP42QiPcnMbgIWufvvY3r+V91dM/y6kYb4ZDfRuUsi+7rn4ypOEg/1oEREJCWpByUiIilJBUpERFKSCpSIiKQkFSgREUlJKlAiMUks6FtrZn8xs/9Jdh6R3kYn6orE61V3PyXZIUR6I/WgRHqImT1gZmvM7LLE/eVmdqSZ/YuZPZzYVmNmfZu1edjM/iVZmUWSSQVKpAeY2QlAFnAccHWS44j0CipQIj3jaKJ131YA2c22/5pmq8MnrEr0tvT5lLSmD4BIzzDgPnc/2t2PaLb9DKC41b5jiVbMPg2RNKYCJdIzqoHTzKyPme0XsP+nQOvLR4ikFRUokR7g7n8E/gD8jegieO15H3gLGAgs7YFoIilLi8WKiEhKUg9KRERSkgqUiIikJBUoERFJSSpQIiKSklSgREQkJalAiYhISlKBEhGRlPT/Aef2++hLu3IbAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x1555b1b6b38>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"gene_idxs = np.array([80, 186])\n",
"gene1, gene2 = gene_names[gene_idxs]\n",
"len1, len2 = gene_lengths[gene_idxs]\n",
"gene_labels = [f'{gene1}, {len1}bp', f'{gene2}, {len2}bp']\n",
"\n",
"log_counts = list(np.log(counts[gene_idxs] + 1))\n",
"log_ncounts = list(np.log(counts_rpkm[gene_idxs] + 1))\n",
"\n",
"ax = class_boxplot(log_counts,\n",
" ['raw counts'] * 3,\n",
" labels=gene_labels)\n",
"ax.set_xlabel('Genes')\n",
"ax.set_ylabel('log gene expression counts over all samples')\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_14.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 103,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\python36\\lib\\site-packages\\numpy\\lib\\function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile\n",
" interpolation=interpolation)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1856: RuntimeWarning: invalid value encountered in less_equal\n",
" wiskhi = np.compress(x <= hival, x)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1863: RuntimeWarning: invalid value encountered in greater_equal\n",
" wisklo = np.compress(x >= loval, x)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1871: RuntimeWarning: invalid value encountered in less\n",
" np.compress(x < stats['whislo'], x),\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1872: RuntimeWarning: invalid value encountered in greater\n",
" np.compress(x > stats['whishi'], x)\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADUCAYAAADOS6ueAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAHNRJREFUeJzt3Xl8FeW9x/HPT4VEAUGEEiBCQFCWS3EJCFQrLnXBSi1qxet9FRVx6W3rUu21ekuxtaittWq1VVqt+8WlIFVxq5q61ZawuEIBATUGRVBBNoHwu3/MEA7hhDw5ySRM8n2/XueVM8+ZmeeXw+THzPPMPI+5OyIikrtdGjsAEZG0UyIVEakjJVIRkTpSIhURqSMlUhGROlIiFRGpIyVSEZE6UiIVEamj3Xb0oZn9FGgbv/YEWgP5wB5AnrsflHiEIiI7uR0mUqAQGAH8A3gImAl8CayPf4qINHtW0yOiZrYrcBhwItAP+AB4yt2nJB+eiMjOr6YzUty9wsz+BbQH2gEHAysBJVIREWo4IzWz04GTidpFnwGmu/vCBopNRCQVakqkfyPqaNpiN6KefgM2uHtxsuGJiOz8amwjzbqR2W5Avruvrv+QRETSZYf3kZrZQ2ZWWKWsNTABmJdgXCIiqVHTDfl3A4+Z2Xgza2lmJwJziC7tByQenYhICoTc/tQCuAQ4B/gE+C93X9QAsdWoQ4cOXlRU1NhhiEgTNXPmzOXu3rGm9Wp6smkp4ERnoG2AvYCXzcwAd/cu9RFsroqKiigtLW3MEESkCTOz90LW22EidffO9ROOiEjTVVNnU6GZ3W5mz5rZ78ysxlNcEZHmpqbOpvuBfwI/ABYDf0w8IhGRlKnpEdFOwL1EbaS3AuPizicDcPcNyYYnkoyNGzdSVlbG+vXrGzsU2Qnk5+dTWFhIixYtctq+pkT6CfBvosS5pdMpc7lnTrWKNLKysjLatGlDUVERUd+pNFfuzooVKygrK6NHjx457aOmzqbDctqryE5u/fr1SqICgJmx995788knn+S8j5puf2oFnAv0BhYCNxGNAnUWcKq7D8q5ZpFGpiQqW9T1WKips+n/gL2Bp4ku4+cArxBd2p9Qp5pFRJqImhJpb3f/X3ef5u7fJ7opv6+7X+fuyxogPpEmq7y8nB49etC7d2+++93vUlFRQc+ePdlvv/0YMWIEK1euZMmSJQwZMgSAW2+9lUsvvRSAfffdl1deeaVyX8cccwx33313o/weUnMizTOz8i0voCvwgZktjZdFJEcbNmygU6dOLFiwgM2bN/Poo4+ydu1a5s+fz8EHH8wtt9xSue6nn37K73//e8aPHw/AmjVreOihhwBYvnw5r776KitXrmyU30NqSKTu3tPdu2S8WsQ/Ozf246EiTUlxcTELFiyoXB42bBiLFm0d0uKKK67gsssuY8899wSgsLCQ2bNn4+489thjHHfccdsl0gkTJnDbbbcBUFBQAMAbb7xBv379OPDAA3n//fcB6Nu3L7169eKiiy4CYPHixbRq1Yr169ezZMkSdt11VzZv3szq1avp27cvPXr04I033gCgpKSENm3a0KlTJy6//HIAhg8fDsAZZ5xBSUkJ1113HT179mTUqFFs3rw5awy9evWiV69eHH744axZs4aSkhJ69uzJ0KFDWbt27TZn5meeeSZPPfUUJSUljB49urLOefPmcdddd1XGAWyz3fTp09lvv/34xje+wcaNG+v071WVpmMWaWSbNm3ihRdeoFevXpVlTz31FH369AFg4cKFzJgxgzFjxgDRHQf5+fkcdNBBzJo1iyeffJJRo0axatWqGuu6+uqruemmm7jkkku44YYbAPjss89YsGABs2bN4s0332TlypW0atWK5557jmnTptGxY0dWrVpF69atmTt3LhMnTuTOO+8EoKKigpEjR3LddddtU8+8efOYMiWajejSSy9l/vz5zJkzh2
XLlmWNYfXq1SxcuJCKigqWLFnC8OHDWbRoEV/96ld55pln6vgNRyZMmEBpaSlf+cpXeOmll+pln1vUOGeTSHNQdPkT9b7PJdfW3B/7+uuvs88++zBgwABOOeUURo8eTZ8+fSguLuaaa65h2bJldOjQgTVr1jBr1iwOPvhg1qxZQ15eHiNGjGDatGmsXr2agoIC1qxZw8MPP8xPfvITLrvsMiBKHjfeeGNlfW+//TaDBw/mo48+2qZN1cwYOnQob775JoWFhYwYMYKnn36a8vJyjjjiCL744gs+//xzRo8ezdKlSzniiCOAqImhbdu2VPWb3/yG0047DYBFixYxZMgQiouLKSgoyBrD8uXL6dy5M4WFhfTr14+nn36aiy++mGXLljF06NDK76pPnz4sXbq08kx0+vTp9OnTp/LMFuCOO+5g+vTp3HzzzWSODjd//nwGDx7M2rVrOemkk4L+DUMFJVIz6wGcCbRx90vip5sGu/srO95SJB1Ckl4SBg4cyPTp0ykuLmbDhg106NCBefO2HTO9Xbt2XH/99YwdO5aZM2eybt06WrZsyde//nXOOussLrjgAvLy8li3bh2nnnoqp556KhAl0QkTJnD++edXXtpD9bf67LLLLlRUVLB69WqKioqYOXMme+yxB3l5eaxdu5Y77riDMWPG0KNHDyZPngxESbJr167b7GfZsmV079698ub23r17U15ezsiRI/nss8+yxtChQwfKysoYM2YMjz/+OL/4xS/4y1/+sk2yHzhwIK+99hpnnnlmZdmIESOYPHlyZXMCwNixYznjjDMYN25cZZwAbdu23e67rS+hl/Z/JZqG+UQAd98I3JVIRCLNTPv27TnmmGOYOnVqtesceuih9O3blzvvvJN169ax2267kZ+fz6GHHsrRRx9Nfn4+69atq7Gufv36MWPGDEpLS+nfv/82n5WWltK7d+/K/XTv3p3BgwcDUXNCRUUF+fn5letv3LiRqVOnVp6dbjF37lzOO++8yuW1a9eSl5dHXl4eixcvrjaGXXfdlRYtWvDll19uV1dt7b777tu1g3bo0IE5c+awYUP9P9keemm/HigDMLO+wDFEUzKLSD0YN24cV1111Q7XGT9+PCNHjuSQQw6pfCb8wQcfBOCtt94KGjfgyiuv5PTTT6dly5ZMmzYNiC6ri4qKGDRoEIcccgj3339/ZX15eXlceOGFrFmzhnHjxnHKKaewcuVKxowZwy9/+UuKiooYNmwY8+fPr6yjf//+DBs2rLJt82c/+xlTpkxhwIABDBw4MGsMrVu3prCwkB49enD88ccDcNRRR7Fhw4bKDrMQbdu25a677uLhhx/ert322muvZdSoUeTn5zNjxgxatWoVvN+aBE1+Z2bdiUaA6kN0M/5c4EZ3L6u3SHJQXFzsGthZcjF37lz69u3b2GHsFAoKCvjoo48aO4xGl+2YMLOZIbMlh56RLgWuyCG2bZhZJ+BRYBMwwt2/qKZsKPCAu+c2goCISAMKTaRfAB9mLOc6+tMJwH1AB+BoYGo1ZecQJW8RSZjORuuuNmek3wBWAZ+6e0WO9XUhahbYSPSU1HZlZtYPWADsm20HZnYu0UAqdOvWLccwRETqT2iv/RLgTuBJYImZPWVmNbYb1CBb46wDY+K6sm/kPsndi929uGNHzXwiuQvpH5Dmoa7HQlAidffh7n54nMD2AW4A/pJDfeVAAdA5fp+trAtwM9DPzL6dQx0iNcrPz2fFihVKplI5sHNdbrcKvSH/ene/NKPiZ8wsl0dBnmBrx9KiuFMps+wGd58a11my5b1IfSssLKSsrKxOg/lK07FlqpFchbaRftvMrnb3zwHMrD3RvaS14u4fA0OzfLRdmbsPr+3+RUK1aNEi52klRKoKTaRXAjPM7It4uT0wIZGIRERSJiiRuvtkYLKZ7U1069MKV+OSiAgQ2NlkZgPNbCrwD+BVYIqZHZhoZCIiKRF6af8A0b2bM+LlQXGZnrETkW
YvNJGuJXrOfhXRvZ77x2UiIs1eaCI9CfghsOW+zneAbyUSkYhIyoR2Nn0I/E/CsYiIpFLoDfkbgcw7lw1wTYAnIhJ+ab/E3XsnGomISEqFJtJOZvZA1UJ3/896jkdEJHVCE+mJiUYhIpJioZ1Nf086EBGRtAodj1RERKoRemmPme0JtAH2APKBVu7+WlKBiYikRejtT5uAZcALwJcZHymRikizF3pp3wu4FmgJ7A68CVyUVFAiImkSmkjLiSakKwdaAT2IkqqISLMXmkg/Bf5M1EY6E1gBfC+poERE0iS0s+k/gLbAnkBrosv73ZMKSkQkTULvI11iZm2JkukW65MJSUQkXUJ77W8BjgS6AW8QDVrSCeiZXGgiIukQ/Iiou3c3s3nA14EKYFFyYYmIpEdoZ9Ot8c97iHrv5wMP1rYyM+tkZv8ws5fMrE22MjO7yMxeNbOHa7t/EZHGYLWdDNTMWhEl4GOJpmX+d+iz+GZ2NlEnVQfgDXefWrUMeNTd3cxmAMPcfWN1+ysuLvbS0tJaxS8iEsrMZrp7cU3rhbaRvkA0V1OmQcCvgY9qEVcXYC6wEeiarSxOop2AD7IlUTM7l2giPrp161aLqkVEkhHaRnp+lrJp7v7zOtSd7VTYzcyAW4CLs27kPgmYBNEZaR3qFxGpF6GJ9PAqy0b0hFNtlQMFbL2Mz1b2bWCGu7+Xw/5FRBpcaCLtnKXsTznU9wTwKLAJWGRmQ6uU3QBMBA4zsxHAOe6+MId6REQaTGgi/TXR002zie4d7Qs8U9vK3P1jYGiWjzLLflDb/YqINKbQRPokUZtmO+BtYDlRG2bVS34RkWYnNJF2dfdeZrYUODDuWV+QZGAiImkRekP++Lg3vWecRHcn6lkXEWn2Qs9IpxDNJNouyqdA9ISTiEizF5pInyMaFX8UUW+9AWcQDWIiItKshSbSAnf/mpkdBtxNNG/Td5ILS0QkPUIT6dnxz4uJBi7ZBfhxIhGJiKRM6MDOfzezvYBVwI+Ahe5em2fsRUSarNBBS34EjCV6hHMzMNDMnnT3S5MMTkQkDUIv7c8F+rv7JgAz2wV4F1AiFZFmLzSRGvBixq1PAJ3M7FUAdx9W34GJiKRFaCL9GrBHkoGIiKRV1kRqZrsBB7j7luHnvwROA/YneuZ+LnC/u69qkChFRHZi1T0iuhm4P2P5EWAf4G/xa2/gsWRDExFJh6xnpO6+2cymmtnvgJ8CHYGfuft6ADMrI57uQ0SkudtRG+mVRPeMPgsUAR+Y2QaiqZjfIrrUFxFp9qpNpO5eAfwqfomISDWChtEzs4Hxpf78+DXVzA5MOjgRkTQIvf3pAaI20Rnx8qC4rG8SQYmIpEloIl0L9CF61t6JboNam1RQIiJpEppITwJ+SDRVMkTzNn0rkYhERFImdPSnD4H/qWtlZtaJrVMvj3D3L6qWEd2vei/wMXCCu3td6xURSVLonE315QTgPqKb+o+upmw0MJ4okR7QwPGJiNRaQyfSLsBHwFKgazVl2dapZGbnmlmpmZV+8sknDRK0iMiOhI5H+jzRY6NONHjJ7sAe7t6nDnVnu2SvWrbdOu4+CZgEUFxcrMt+EWl0oWek9wArgC+IZhQdS263PpUDBUDn+H22smzriIjstEI7m+4ysynAcURzNY0h6rVfXMv6nmBrx9IiMxtapewG4N9EiXsZ8Hot9y8i0uBCL+2fA/Yj6hD6NVEb5j7UMpG6+8fA0CwfZZa9AxTXZr8iIo0p9D7Se4G2wJ7AQURtpPnAiwnFJSKSGrW5tG8JtNfsoSIi2wodtGQi0aj4pfFyCzObnGRgIiJpEdprf4q770vUaw/QAhiSTEgiIukSmkgfMLOXgc5m9iDwJnBbcmGJiKRHaGfT1cBNwL5EUzMvBDYkFZSISJqEJtK33X1/4jZSADN7G+ifSFQiIikSmkg/M7PzgVeJHtv8GrA+sahERFKktuORnkB0aT8vLhMRafZCE+lK4F53n2tm3w
RaAsuTC0tEJD1CE+l0YLf4pvz5wEbgQuDwpAITEUmL0ETa1d33M7Nydz8EwMzeTzAuEZHUCE2ku5hZOdAx/gmwOqGYRERSJfRZ+15JByIiklYNPdWIiEiTo0QqIlJHoQM7FwI/YuuTTG8Dv3V3dTiJSLMX2tk0jWgakD/Ey8Vx2YFJBCUikiahiTSfrVMke/w+L6mgRETSJDSRng5cSTQCFESX9qcnEpGISMqE3v70BnBawrGIiKRS6FQji6q8FpvZotpUZGb9zGymmU03M6uu3MyuMbMZZnZDbX8ZEZHGEHr702agD9A3fm15XxujgfHAx8ABOyi/wt0HAUfWcv8iIo0itI20PTAhS/kVtairC1s7rLoCs7OVu/tsMxsA/DPbTszsXOBcgG7dutWiehGRZIQm0otz2bmZXQ4cFy92Y+vtU17NJm5m+cA1wBlZV3CfBEwCKC4urm4/IiINJrSz6e5cdu7u1wLXApjZz4ECoDNQnrFaeZXy7wH3uPvKXOoUEWloDfmI6GTgKqAT8LqZ/cDMelUtB44ALjGzEjNr3YDxiYjkJPTSHjNrC7TNKFrv7stCt3f3d4ieiNridxnvM8tPDN2niMjOIPRZ+1uIetG7Ec1pD9EZZM+E4hIRSY3QM9IT3b27mc0DDgMqgFrdRyoi0lSFJtJb45/3AAuATcBDiUQkIpIyob32v4p/TjSze+P3HyQZmIhIWoQ+InqTmX1gZr8GngGeNbMbkw1NRCQdgttIgf2BD4meRNoEvAdclFBcIiKpEZpIDfgT0RnsHXHZpkQiEhFJmdBEOgrYE7g9o+xX9R+OiEj6hCbSX7j7NxONREQkpUIT6TAzm1i10N1rM/qTiEiTFJpINwDziNpKRUQkQ2gineLu9yQaiYhISoWO/jSgaoGZvVLPsYiIpFJoIm1nZsea2Z5m1sbMjiMaNV9EpNkLvbT/DvBT4HqidtJ5aDpmEREg/Fn7uWZ2JrC3uy9NNiQRkXQJfdZ+IjAXmBEvtzCzyUkGJiKSFqFtpKe4+77AF/FyC2BIMiGJiKRLaCJ9wMxeBjqb2YNEo+TfllxYIiLpEdpGOsHMbgL2JepsWujunyUamYhISoTO2TQ+Sxnu/vP6D0lEJF1CL+0/znidnfFeRKTZC02kL8SvxUTTMN/u7rfXsM02zKyfmc00s+lmZjsqN7PTzaykNvsXEWksoYn0dqLOpbOBc3KsazQwnuhM9oDqyuNkenyOdYiINLjQzqYj6qGuLsBHwFKgKzC7mvLOwHTg/Gw7MbNzgXMBunXrVg9hiYjUTegN+e+a2TozKzGz583sBTN7PmC7y+NtSoAjMz7yajZx4FvAlOr26e6T3L3Y3Ys7duwYEr6ISKJCL+37AD8E1gCTgePd/cgdbwLufq27D3f34cB9QAHRGWd5xmrlVcq7A/cA/czs0MD4REQaTWgi/QlRopsJXAB8aGY317KuycBVQCfgdTP7gZn1qlru7se5+2jgHXd/uZZ1iIg0uNDRn/6e8f65XCpy93eA4oyi32W8L66yOvFZrIjITi80keYlGoWISIqFJtKHiDqAMudscuCZeo9IRCRlQhPpl8ArwCqgDJjh7psSi0pEJEVCO5t+T9SbPgT4MfCumY1LLCoRkRQJvSH/qsxlM+sE3A38MYmgRETSJHT0pzbAWcB/xEVvA6clFZSISJqEXto/BrQBHolfrYC/JhWUiEiahHY2dQEmu/u7AGa2kOgMVUSk2QtNpN8D7jezLkS3PS2Ny0REmr3QRDrL3beZ7M7MDksgHhGR1AltI/1bPJJTCzM7yMxeBC5NMjARkbQIPSMdDHwfmAWsBC5095mJRSUikiKhifSD+GcrYD3wuJm5u3dJJiwRkfQIvSG/c9KBiIikVWgb6XbMbGx9BiIiklY5J1Lgs3qLQkQkxcy9uumTMlYyu9Xd/7sB4qkVM/sEeK+x49gJdQCWN3YQkio6ZrLr7u41Tg4XmkgXA8ew7XikuPv8nM
OTxJhZqbtvN+uASHV0zNRNaK/9e8CkKmXOtjODiog0S6G99sMTjkNEJLVC57U/1sxmmtmS+DXLzI5LOjjJWdWrB5Ga6Jipg9A20vnAke5eFi8XAi+4e++E4xMR2enV5smm88zsX0Rto4OBDxOLSkQkRULPSNsA/wX0j4veAe5z91UJxiY7CTNrD6x19/WNHYvsfMysBbCXuy9r7FgaS2gi3S9buW5/ah7M7C7gRnef09ixyM7HzIYDJ7n7RY0dS2MJfbJpNnBbxuv2+KfUgpmdbGbzMpYfMbMSM3vNzAaZWZGZPRp/NtzMbozf/8PMnjezp8ysf1zWw8zeM7N28fItcYfgN7PUe5iZvRR/3iEuO8HMHjazveLlofH9wpjZNWY2w8xuSPo7kXBZjp9BZvaymf3dzI4ws6vif+M/xp/PqbJ9dzP7spp9DzezeWY2OV6+3cxmm9mJZraXmT0ZHxM9tuw7PnYPSO43ThF3r/FF1B56FDAI6ByyjV5Zv8f7gGeBvvFyCdAOGA1cDhQBj8afDSc6CwSYE//sQdTJB/DfwOvx9vnAq0RTZk/LUu+WK4/rgaOB1sBfq6xzB/BqlfW31HsXMDOOfXfgm8AbwEvAvo39vTaXV5bj58XMv0fgzSrrz6my/FvgrWr2fRQwpkrZMOCejOUrgLPj95nlw4EyomE2D4/L5sbH5PmN/b01xCv0jPQZ4LvxH/sTZvZWtjMfqZ6Z7Qp0A+4ERmZ89CRwE/BAvHyomZUAN1bdh7svBnY3s13d/Va2jnfQAVhBNAVM1yzbuZkZ0A/4F3Ao0C8+yy0ws37AAmBDxvoDgH9m7GZsvHxsXN+dwM/RlDMNoprjp527L81Y7Vkzu8TMtvu7jtu5ofrHQPcATjazgfH6JwCPE027jplNAC4AnorX71dl4KJHgO8AF8fLXwKHAeeF/o5pFpRI3f0sdx/j7ie7+0FEX9jEZENrcgYRncWVED1uu8XxwKlESQngZY8egMilvcnjVzZXAJM86iBsD9wCPEQ0rfYYoj9QAMwsH7gG+HGVfbwHZA6puAQozCFOqb1sx8/GzBXc/RLgfeD/smz/n8C91e3c3R8DzgH+FC8/AXwN+Fa8PIHob35ovMlgYHiVKYe2OT7cvQJYa2a7B/x+qZbT6E/u/o67f7W+g2nijgQOAf4ADDSzvIzPlgN71rQDMysC1scHaKblRMmxC1CeZbsCYJC7PxoXfcLWQbot3u5morOMbxOdZd7j7iur7Kon0SXcFt2BZttT28CyHT+fm9k293K7+yNAn7gnPVNPov8Y+5nZudXUsYLozHSL5cCBVZYPiOvZTHSstc/4fJvjI74Kaufu64J+wzRr7LaF5vICHgMK4vd/IGp/mgK8DLxC9D99EdnbSF8CngP+BgzI2GcJ0YEK0RnmLKL2y12BVzLWO5no7LEEOIWoTfVpoja2osz9ZcT6Wrx+a6IzkX/G5S2BM4maAmYDAxv7u20Or2qOnwPiY+d5ojPHp4FS4Jp4vVVEl+K/zfJvPBYYl1E+Md72+0T/uT5B1AZ/OrA/8EL8792fqGnoZaJmqXyiZPt2fIwMiff3BVEz0v829nfXEK+g258kXeKe1cvcPZH2SzM7kyiBb9eOK+lgZlcBT7j7vxLa/xx3bzY9+nUZ2Fl2XpsB3bokO/JiUkm0OdIZqYhIHemMVESkjpRIRUTqSIlURKSOlEilSYjHKVhvZgvN7KXGjkeal9DxSEXSYI67D2nsIKT50RmpNElm9mcze9fMLoiXXzazXmZ2tJndF5eVmdluGdvcZ2ZHN1bMkl5KpNLkmNmBQB4wELikkcORZkCJVJqiPkQDe5QSPcK4xdNkDM4Sezs+e9XfguRMB480RQb80d37uPs+GeXHAmdXWbc/0cAbRzZUcNL0KJFKUzQXONLMdjGzlgHrrwOqjpYkEkyJVJocd59NNBLWYnY8bu7HwHygLdEISiI50bP2IiJ1pDNSEZE6UiIVEakjJV
IRkTpSIhURqSMlUhGROlIiFRGpIyVSEZE6UiIVEamj/wcHJ9EuhoFEMQAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x155004af4e0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ax = class_boxplot(log_ncounts,\n",
" ['RPKM-нормализованные'] * 3,\n",
" labels=gene_labels)\n",
"ax.set_xlabel('Гены')\n",
"ax.set_ylabel('лог.количества по всем образцам после RPKM');\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_15.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}