@capissimo
Created March 31, 2018 22:24
Chapter 1 from Elegant SciPy (rev 2)
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 1.\n",
"# Elegant NumPy: The Foundation of Scientific Python\n",
"> [NumPy] is everywhere. It is all around us. Even now, in this very room, \n",
"> you can see it when you look out your window or when you turn on your television. \n",
"> You can feel it when you go to work, when you go to church, when you pay your taxes.\n",
">\n",
"> -- Morpheus, *The Matrix*"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def rpkm(counts, lengths):\n",
"    \"\"\"Calculate reads per kilobase of transcript per million mapped reads.\n",
"\n",
"    RPKM = (10^9 * C) / (N * L)\n",
"\n",
"    where:\n",
"\n",
"    C = number of reads mapped to a gene\n",
"    N = total number of mapped (aligned) reads in the experiment\n",
"    L = exon length in base pairs for a gene\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    counts : array, shape (N_genes, N_samples)\n",
"        RNA-seq (or similar) count data, where columns are\n",
"        individual samples and rows are genes.\n",
"    lengths : array, shape (N_genes,)\n",
"        Gene lengths in base pairs, in the same order as\n",
"        the rows in counts.\n",
"\n",
"    Returns\n",
"    -------\n",
"    normed : array, shape (N_genes, N_samples)\n",
"        The counts matrix, normalized according to RPKM.\n",
"    \"\"\"\n",
"    N = np.sum(counts, axis=0)  # sum each column to get\n",
"                                # the total read count per sample\n",
"    L = lengths\n",
"    C = counts\n",
"\n",
"    # Broadcast N across rows (genes) and L across columns (samples)\n",
"    normed = 1e9 * C / (N[np.newaxis, :] * L[:, np.newaxis])\n",
"\n",
"    return normed"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction to the Data: What Is Gene Expression?"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"gene0 = [100, 200]\n",
"gene1 = [50, 0]\n",
"gene2 = [350, 100]\n",
"expression_data = [gene0, gene1, gene2]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"350"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"expression_data[2][0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## NumPy N-Dimensional Arrays"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 2 3 4]\n",
"<class 'numpy.ndarray'>\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"array1d = np.array([1, 2, 3, 4])\n",
"print(array1d)\n",
"print(type(array1d))"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4,)\n"
]
}
],
"source": [
"print(array1d.shape)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[100 200]\n",
" [ 50   0]\n",
" [350 100]]\n",
"(3, 2)\n",
"<class 'numpy.ndarray'>\n"
]
}
],
"source": [
"array2d = np.array(expression_data)\n",
"print(array2d)\n",
"print(array2d.shape)\n",
"print(type(array2d))"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2\n"
]
}
],
"source": [
"print(array2d.ndim)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Why Use ndarrays Instead of Python Lists?"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Create an ndarray of the values from 0 up to (but not including)\n",
"# 1,000,000; note that np.arange(1e6) yields float64 values, not integers\n",
"array = np.arange(1e6)\n",
"\n",
"# Convert it to a list\n",
"list_array = array.tolist()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"199 ms ± 9.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%timeit -n10 y = [val * 5 for val in list_array]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7.84 ms ± 1.83 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%timeit -n10 x = array * 5"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 2 3]\n",
"[1 2]\n",
"[6 2]\n"
]
}
],
"source": [
"# Create an ndarray x\n",
"x = np.array([1, 2, 3], np.int32)\n",
"print(x)\n",
"\n",
"# Create a \"slice\" of x (a view, not a copy)\n",
"y = x[:2]\n",
"print(y)\n",
"\n",
"# Set the first element of the slice y to 6\n",
"y[0] = 6\n",
"print(y)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[6 2 3]\n"
]
}
],
"source": [
"# Now the first element of x has changed to 6 as well!\n",
"print(x)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"y = np.copy(x[:2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Vectorization"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2 4 6 8]\n"
]
}
],
"source": [
"x = np.array([1, 2, 3, 4])\n",
"print(x * 2)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 3 5 5]\n"
]
}
],
"source": [
"y = np.array([0, 1, 2, 1])\n",
"print(x + y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Broadcasting"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[1]\n",
" [2]\n",
" [3]\n",
" [4]]\n"
]
}
],
"source": [
"x = np.array([1, 2, 3, 4])\n",
"x = np.reshape(x, (len(x), 1))\n",
"print(x)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0 1 2 1]]\n"
]
}
],
"source": [
"y = np.array([0, 1, 2, 1])\n",
"y = np.reshape(y, (1, len(y)))\n",
"print(y)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 1)\n",
"(1, 4)\n"
]
}
],
"source": [
"print(x.shape)\n",
"print(y.shape)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0 1 2 1]\n",
" [0 2 4 2]\n",
" [0 3 6 3]\n",
" [0 4 8 4]]\n"
]
}
],
"source": [
"outer = x * y\n",
"print(outer)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 4)\n"
]
}
],
"source": [
"print(outer.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploring a Gene Expression Dataset\n",
"### Reading in the Data with pandas"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 00624286-41dd-476f-a63b-d2a5f484bb45 TCGA-FS-A1Z0 TCGA-D9-A3Z1 \\\n",
"A1BG 1272.36 452.96 288.06 \n",
"A1CF 0.00 0.00 0.00 \n",
"A2BP1 0.00 0.00 0.00 \n",
"A2LD1 164.38 552.43 201.83 \n",
"A2ML1 27.00 0.00 0.00 \n",
"\n",
" 02c76d24-f1d2-4029-95b4-8be3bda8fdbe TCGA-EB-A51B \n",
"A1BG 400.11 420.46 \n",
"A1CF 1.00 0.00 \n",
"A2BP1 0.00 1.00 \n",
"A2LD1 165.12 95.75 \n",
"A2ML1 0.00 8.00 \n"
]
}
],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Import TCGA melanoma data\n",
"filename = 'data/counts.txt'\n",
"with open(filename, 'rt') as f:\n",
"    data_table = pd.read_csv(f, index_col=0)  # parse the file with pandas\n",
"\n",
"print(data_table.iloc[:5, :5])"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"# Sample names\n",
"samples = list(data_table.columns)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" GeneID GeneLength\n",
"GeneSymbol \n",
"CPA1 1357 1724\n",
"GUCY2D 3000 3623\n",
"UBC 7316 2687\n",
"C11orf95 65998 5581\n",
"ANKMY2 57037 2611\n"
]
}
],
"source": [
"# Import gene lengths\n",
"filename = 'data/genes.csv'\n",
"with open(filename, 'rt') as f:\n",
"    # Parse the file with pandas, indexing by GeneSymbol\n",
"    gene_info = pd.read_csv(f, index_col=0)\n",
"\n",
"print(gene_info.iloc[:5, :])"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Genes in data_table:  20500\n",
"Genes in gene_info:  20503\n"
]
}
],
"source": [
"print(\"Genes in data_table: \", data_table.shape[0])\n",
"print(\"Genes in gene_info: \", gene_info.shape[0])"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"# Subset the gene info to the genes that also appear in the count data\n",
"matched_index = gene_info.index.intersection(data_table.index)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"20500 genes measured in 375 individuals.\n"
]
}
],
"source": [
"# 2D ndarray containing the expression counts\n",
"# for each gene in each individual\n",
"counts = np.asarray(data_table.loc[matched_index], dtype=int)\n",
"gene_names = np.array(matched_index)\n",
"\n",
"# Check how many genes and individuals were measured\n",
"print(f'{counts.shape[0]} genes measured in {counts.shape[1]} individuals.')"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"# 1D ndarray containing the length of each gene\n",
"gene_lengths = np.asarray(gene_info.loc[matched_index]['GeneLength'],\n",
"                          dtype=int)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(20500, 375)\n",
"(20500,)\n"
]
}
],
"source": [
"print(counts.shape)\n",
"print(gene_lengths.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Normalization\n",
"### Between Samples"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"# Make all plots appear inline in the Jupyter notebook\n",
"# from now on\n",
"%matplotlib inline\n",
"# Apply our own style file to the plots\n",
"import matplotlib.pyplot as plt\n",
"#plt.style.use('style/elegant.mplstyle')\n",
"\n",
"# Style overrides\n",
"from matplotlib import rcParams\n",
"rcParams['font.family'] = 'sans-serif'\n",
"rcParams['font.sans-serif'] = ['Ubuntu Condensed']\n",
"rcParams['figure.figsize'] = (4.8, 3)\n",
"rcParams['legend.fontsize'] = 10\n",
"rcParams['xtick.labelsize'] = 9\n",
"rcParams['ytick.labelsize'] = 9"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAAEYCAYAAAAJeGK1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xl4lOWh/vHvk43sIQnZCBAgYV8EQRABwX3B1l3RurS1i62/nu6tPce2Hu3mOdraY1trV1RaqQuuuFUUZVVA9k0ISdgSIAlJyD6ZeX5/zCAxBgiQmfedzP25rlxJXmbmvRkxd553eR5jrUVERMRtopwOICIi0hkVlIiIuJIKSkREXEkFJSIirqSCEhERV1JBiYiIK6mgRETElVxXUMaY+4wxD3fhcTcbY1YaY/4ZilwiIhJariooY0wRkB34eoAxZrEx5l1jTFonD78WuAAoMsbEhjKniIgEn3HbTBLGmJnAVUANsAoYCuyx1j7d4XGfAS4Bmq213wt1ThERCS5XjaA6yAN+CtwMJBhjHjbGLAp8nA8MA7YBfY0x0U4GFRGR7uf2EdT71trXjvG41cA5wHzgP6y1xSELKSIiQefmEdTfgB8bY/5tjMnr5M//gf8QYCtQEtJkIiISdK4bQYmIiIC7R1AiIhLBXFFQxi/FGGOcziIiIu4Q43SAgGSgrq6uzukcIiISep0OTlwxghIREelIBSUiIq6kghIREVdSQYmIiCupoERExJVUUCIi4koqKBERcSUVlIiIuJJbbtQVkRM4WN/Cixsr2FhxmDafZWhWErNG5FDYJ8npaCJB4YrJYo0xKQRmkkhJSXE6joirVNa38JM3tvHnFbto81mS4qKJiTLUNrdhDMwel8/DV44iO6WX01FFTlWnM0loBCXiYu8VV3HDk6upbGjlK2cP4M4pAxmTl4IxhrLqRh5bUcZDi3by9o5KFtwxiQn9ezsdWaTbaAQl4lL//HAPtz21lsLMRJ65bSJj+6Z2+rgN5XVc8dcPqG5s5Z2vncNElZSEH83FJxIuHl+5m1v+uYZzB2ew8lvTj1lOAGPyUln2jan0SYpj1l/ep7S6MYRJRYJHBSXiMq9t2c8X/7WWC4r68Modk0iNjz3hc/LTEnjtS5NpafNx09wPafP6QpBUJLhUUCIusn5fHTc8uZqxeak8/4WzSIzr+mni4Tkp/PG6sawoO8TP3toexJQioaGCEnGJ+pY2rp6zktResbx8xySSe538NUyzx+dz8/h8frFwOx8drA9CSpHQUUGJuMQPXtlMSXUj8249k369E075dX595SgSYqP5j+c34oaLoEROlQpKxAX+ve0gjy4r49vnDmb64MzTeq2clF7cd8kw3th2kDe2HeymhCKhp4IScVhtk4cv/mstw7OT+dllw7vlNb92zkAGZiTw49e3ahQlYUsFJeKwb7+4iX11zcyZPY6E2Ohuec24mCh+fOFQVu2u5eVN+7vlNUVCTQUl4qBXNu/n7yt388Pzi5hckN6tr33bxH4U9Uni/rc+0ihKwpIKSsQh1Y2tfPnpdYzJS+GnFw/t9tePiY7iOzMGs2p3LUtLqrv99UWCTQUl4pBvzN9IZUMrj88eT6+Y7jm019FtE/qRkRjLb97bGZTXFwmmoBSUMeZ6Y8y7xpg57baNNMasNsa8aozpdN4lkUgxf305/1yzlx9fNJTx/dKCtp+kXjF8dUoBL2ysYGdVQ9D2IxIMQSkoa+0zwExgkjHmyDwts4GfAPuBccHYr0g4OFjfwp3PrWdCvzR+dEFR0Pd319SBGGP484pdQd+XSHcK1ggqGtgKrLLWegKb+wIVQDmQH4z9iridtZavPbeB2qY2Hr9pPLHRwT/Knp+WwKwR2cxZuRuP5uiTMBKsEZQXGAmkGWM6W0VNlxRJRJq3Zh/PrS/nvkuHMSo3dEvL3DFpABWHW3h1y4GQ7VPkdAXt17dASc
UA/QKb9gG5QF7ga5GIsq+2mbvmb+DsgnS+N7MwpPu+fEQ2eam9+Ov7Oswn4SNYh/i+ZYxZDDQBlxtjioB5wH8DOcC6YOxXxK2stXzlmXU0ebzMmT2O6KjQXicUEx3F7RP7s2DLfvbVNod03yKnKliH+B621k631l5nrX3EWrvDWrvZWjvRWnu5tVYHwiWi/P2D3SzYcoBfzhrBsOxkRzLcPrEfPgtPr9MBDAkPug9KJMhKqhr55osbmVGYyX9MG+RYjuE5KYzrm8q8NXsdyyByMlRQIkHk9Vlue2oNUcbw+OxxRIX40F5Hs8fn8/6uGkqqtCy8uJ8KSiSIHlpUzJKSah65ejQFGYlOx+HGcX0BmLdWoyhxPxWUSJCs3FXDPa9v5dqxedw6od+JnxACAzMSmVKQzrw1Og8l7qeCEgmCqoZWrntiFX1T43nsurG4aXavm8bns768jq37DzsdReS4VFAi3czns9zyzw+pqGvh2dsnkpkU53SkT7h6TC4AL2yscDiJyPGpoES62X++upXXtx7kt1eNYmL/3k7H+ZR+vRM4q39vFZS4ngpKpBs9uqyUB97ZwZ1TCvjqlAKn4xzTlaNzeH9XjW7aFVdTQYl0k/nry/l/8zcwa0Q2j1w92lXnnTq6anQeAC9t0ihK3EsFJdINnl23jxueXM2kAenMu3UCMSGYpfx0jMxJpqhPEi+qoMTF3P1/kUgY+Nv7u5g990POHtCbN74ymeReMU5HOiFjDFeNzmXh9krqmj0nfoKIA1RQIqfI67N8/+XN3PH0Os4vyuS1L59NanzsiZ/oEleNzsXjtbymJTjEpVRQIqegtLqR8x5dxoOLivn6OQN59UuTSYl3/8ipvbML0klPiOX1bQedjiLSqfD6P0rEYV6f5U8ryvjhK1sAmDN7HLef1d/hVKcmOspw8bAsXt96AJ/POj5PoEhHKiiRLnpnRyXfemET68vrOK8ok7/dOI6BLphf73RcNjybf63dx/ryOsblpzkdR+QTVFAiJ/DRwXruXrCF5zdUUJCewLO3T+CaMXmuvoy8qy4ZlgXAa1sPqKDEdVRQIsew/3AL9735EY+tKCMhNoqfXTaM78woJCE22ulo3SY3NZ7x+am8tvUAP7pgiNNxRD5BBSXSgc9n+ePyMu5esIUmj5c7pxTwk4uGkp3Sy+loQXHZ8GweeKeY2iYPaQnhcxWi9Hy6ik+knbLqRs79/VLumr+Bswt6s+kHM/ndNWN6bDkBXDo8G6/P8tZ2Xc0n7qIRlEjAm9sOcNPcD2nzWebMHsdtE/v1iPNMJzKlIJ20+Bhe33qQa8f2dTqOyMdUUCLAP1bv4fZ5axmRncz8z09kSFay05FCJiY6iguHZvHGtgNYayOilCU86BCfRLw5H+zm1qfWMH1QBsu+MS2iyumIC4b0YXdNM8VVjU5HEfmYCkoi2utbD/ClZ9Zx0ZAsXv1y+M0G0V0uGNIHgIU6DyUuooKSiLWp4jDXP7GKMbkpPHv7xB51+fjJGtIniX5p8SzcXul0FJGPqaAkIjV5vNz45GoSY6N55UuTInbkdIQxhguG9OHt7ZX4fNbpOCJAkArKGHOjMWaxMWaRMSau3faKwLbcYOxXpKu++9ImNlUc5ombxpOfluB0HFe4YEgfqho9rC+vczqKCBC8EdTT1trpQCUwAMAYEw28Zq2daa3VKmnimPeKq3h0WRnfmTGYS4ZnOx3HNS4Y4p/2SIf5xC2CUlDWWmuMSQBSgJLA5nRglDHm6mDsU6QrWtt8fO259QzMSOD+S4c5HcdV+qbFMzw7WQUlrhHMc1C/Bu6x1noBrLWVwFTgLmNMQRD3K3JMD7+3k8376/nd1WNIjIvs806dOb+oD+/trKK1zed0FJGgnYM6E/9AamX77dZaD1AF9A7GfkWOp6qhlZ8v3M5nRuYwa2SO03Fc6YIhfWho9fLBrkNORxEJ2gjqPOD8wAUR9xhjphhjrjPGLAVagfVB2q/IMT3w9g
4Ot7Txi8uHOx3FtWYWZWKMzkOJOwTlGIe19iHgoU7+6Nlg7E/kRPbWNvHIkhJuObMfo/NSnY7jWhmJcYzPT2NRcRU/dTqMRDzdByUR4X/eKabNZ/nvS3RhxInMGJzJirJDtLR5nY4iEU4FJT1eVUMrf3l/F587M59BmeG9RHsozCjMpLnNx8pdNU5HkQingpIe7w/LSmls9fK9mYVORwkL0wZlAPDuziqHk0ikU0FJj9bk8fLIkhIuH5Gtc09dlJkUx5i8FN4rrnY6ikQ4FZT0aHNX7+FgfSvf1+jppJw7OJOlpdV4vLofSpyjgpIey1rLo8tKGZuXyozCTKfjhJUZhZk0tHpZs7fW6SgSwVRQ0mOt2l3Lmr11fHVKgVaJPUnTj5yHKtZ5KHGOCkp6rD8uLyUpLppbJuQ7HSXs5KbGMywrifd26jyUOEcFJT1STZOHeWv3cdP4fFLjY52OE5ZmFGayeGcVXq0PJQ5RQUmPNHf1Hhpbvdw5RfMSn6pzB2dS29zGBq0PJQ5RQUmPNGflbsb1TWVCf81LfKrOHey/sETnocQpKijpcbbsP8zqPbXcNrGf01HCWv/0BAZlJOqGXXGMCkp6nLmr9xBl4KbxujjidM0ozOS94iqs1XkoCT0VlPQoPp9l7od7uWhoFrmp8U7HCXvnDs6gqtHDlv31TkeRCKSCkh5lSUk1uw41cesEHd7rDkfm5VtaqsvNJfRUUNKjPLl6D0lx0Vw1OtfpKD1CUZ8kspLjWFqigpLQU0FJj9Hs8fLMun1cOzaPpF5BWYsz4hhjmDYogyUqKHGACkp6jLe2V1Lb3KaLI7rZ1IEZFFc1UlHX7HQUiTAqKOkx5q8vJy0+hvOL+jgdpUfReShxigpKeoQ2r4+XNlVwxcgc4mL0z7o7jc9PIz4miqUlh5yOIhFG/ydLj7C4pJqqRg/XjMlzOkqPExcTxeSCdJ2HkpBTQUmP8PyGCuJjorhkWJbTUXqkqQPTWbO3loaWNqejSARRQUnY8/ks8zeUc+nwbF29FyTTBmXQ5rN8sLvG6SgSQVRQEvZW7alhb20z14zRvU/BMmVgBsag+6EkpFRQEvae31BBTJThipE5TkfpsXonxDI6N0XnoSSkVFAS1qy1PLe+nPOKMklPjHM6To82dWAGy8sOaQFDCZmgFJQx5kZjzGJjzCJjTFxgW44xZnlge0ow9iuRZ/P+erZXNujqvRCYOiiDuuY2NlZoAUMJjWCNoJ621k4HKoEBgW2zgLnAW8CFQdqvRJjnN5RjDFypufeC7sgNu0t26jCfhEZQCspaa40xCUAKUBLY3BeoAMoBzUUj3WL+hnKmFKSTp6U1gq4gPYG+qfEsLdUNuxIawTwH9WvgHmutt5M/00FsOW0lVY2s2VvH1aN1eC8Ujk4cqxV2JTSCdQ7qTPwDqZXtNu8DcoG8wNcip+WFjeUAXK3Ly0Nm6qB0dtc0s+tQo9NRJAIE667G84DzjTGL8J9zWggsAF4A2vCPrkROy/wNFZzRN5XCPklOR4kYH08cW3KIAemJDqeRni4oBWWtfQh4qJM/mhKM/Unk2X+4haWl1fz0oqFOR4koY/NSSYqLZmlpNTedqVPJEly6D0rC0osbK7AWrhmr80+hFBMdxRRNHCshooKSsDR/QzmFmYmMztUtdaE2dVAG68vrqG3yOB1FejgVlISdmiYPC7dXcs2YPIwxTseJONMGZWAtrCjT5eYSXCooCTsLNu+nzWd1eM8hkwekEx1ltMKuBN1JFZQxJtoYc2bgJlwRR8zfUE5eai8m9e/tdJSIlBIfwxl9U3UeSoLuZEdQc4A7gJe6P4rIiTW2tvH6toNcPTqPqCgd3nPKtEEZrCg7hMfrczqK9GBdKihjzLnGmBnACOBpICOoqUSO4c1tB2ls9WrtJ4dNG5RBk8fH2r2aOFaCp6sjqPOAmcDLgc9PBCmPyHHN31BBekIs5xZmOh0lok0dGJg4VtMeSRB19UbdBGvt3U
FNInICHq+Plzfv58pROcRG6/oeJ/VNi2dQRiJLSqr59oxCp+NID9XVgrrdGJPdfoO19otByCNyTIt2VFHT5OFqrf3kCtMGZfDmRwex1upyfwmKrhZUKfDfQcwhckLzN5STGBfNxcOynI4i+AvqydV7KK5qpEjzIUoQdPU4yUvW2rIjH8DkYIYS6cjns7ywsYLLh2eTEBvtdBzBP6MEaAFDCZ6uFtRVxpheAIF7oL4bvEgin7ai7BAVh1u0tIaLjMhOJj0hVjfsStB09RDffcDzxpg4/Mtl/Cp4kUQ+bf6GcmKjDbNG5DgdRQKiogxTB2Xohl0Jmq4W1OtAPf5l28uBxUFLJNKBtZbnNpRz4ZAs0hJinY4j7UwdmM4rm/dTWd9Cn+ReTseRHqarh/iex3//UyIwA3gxWIFEOlq7t47S6iau1dx7rnNkAcNlpZo4VrpfVwsqE3gb/8q4bwe+FwmJ5zaUE2XgylE6vOc2E/v3Ji46Sof5JCi6eojvVuALQDawH7g5aIlEOnhufTkzCjN1CMmF4mOjmdg/TQUlQdHVEdQ0YAewFCgGpgctkUg7W/YfZuuBeq7VzbmuNW1QBqv21NDk8TodRXqYrhbUkdvE7wZskLKIfMpz68sBNHuEi00blIHHa1m1u8bpKNLDdLWgSvDPJtES+FwanDginzR/QzlTCtLpmxbvdBQ5hnM+njhWh/mke53sbOYvtPtaJKh2VjWwZm+drt5zucykOEbkJLNUBSXdTLOZi2vNX18BwDU6vOd60wZl8My6cnw+q4UkpdtoNnNxrfkbyhmfn8qgzESno8gJTB2YwZ9X7GLz/sOMzkt1Oo70EJrNXFxpb20Ty8sO8bPLhjkdRbrgyA27S0qqVVDSbbp6DurIaCkPIDCj+XEZY+4zxjzcYVuFMWaRMUYzfspxPb12HwDXn9HX4STSFYMzE8lN6aWJY6VbdXUE9YPA5z1AP2NMlLX29mM92BhThP+m3uZ226KB16y1XzjVsBI55q3dx5n90hialex0FOkCYwzTNHGsdLOujqBGAj+x1v4Y+Eng+2Oy1u4A5nXYnA6MMsZcfdIpJaIUVzbwwa4aZo/T6CmcTB2UQWl1E3trm5yOIj1EVwvqP4BfGGPeAH4OfONkd2StrQSmAncZYwpO9vkSOf4VOLx3gw7vhZUj56GWlmjiWOkeXT3E1wv4S7vv405lZ9ZajzGmCugNnPA8lkSmeWv3cs7AdAoydPVeOBnXN5WkuGiWlFRzg0a/0g26WlBzAh/tvdeVJxpjZuMvo3zg28BOYH0X9ysRZlPFYTaUH+aRq0c7HUVOUkx0FJMHpLOkpMrpKNJDdLWgEoF+QA2wCv/6UMdlrV0ELOqw+dmTyCYRaN6avUQZuE6zR4SlaYMy+NlbH3G4uY2U+K7+eBHpXFfPQU0GHgD+BeQAzwUtkUQsay3z1u7jvKI+5KZq7r1wNG1QBj4Ly8t0NZ+cvhP+imOMqQbWcnRGc4ChQUskEWv1nlp2VDZw9/lFTkeRUzRlYDrRUYZ3i6u4eFj2iZ8gchxdGYOvs9aeH/QkEvEeX7mbXjFRXDNG93GHq+ReMZzVvzeLinUeSk5fVwpqoDHmJx03WmvvC0IeiVCtbT6eWrOXK0flkp54SheJikvMLMzkwUXFNLS0kdRL56Hk1HXlHNTngXc7+RDpNgu27Keq0cPnz+rndBQ5TTMLM2nzWZaV6n4oOT0n/PXGWqsykqCbs3I3uSm9uGholtNR5DRNHZRBdJRhUXElFw3Tf085dV29ik8kaA7Wt/DqlgPcMqEfMdH6JxnudB5Kuot+Gojj/vnhXtp8ltsn6vBeTzGzMJMPdtXQ0NLmdBQJYyoocdzjq3YzoV+a1hHqQXQeSrqDCkoctW5fLWv21nH7xP5OR5Fu1P48lMipUkGJox5bXkZ8TBSfm5DvdBTpRkfOQ72r81ByGlRQ4pj6ljbmrt7LDeP6kqF7n3qcmYWZfLBb56Hk1KmgxD
FPrdnL4ZY2vnq2lgfriWYWZuLxWpaX6TyUnBoVlDjmseVljM5NYcrAdKejSBAcPQ+lw3xyalRQ4ohVu2tYvaeWO6cUYIw58RMk7Bw5D/X2dl0oIadGBSWOeGx5GYlx0dwyQfc+9WQXDOnDB7trqG3yOB1FwpAKSkKutsnDU2v2MntcX9ISYp2OI0F08dAsvD7LOzs0ipKTp4KSkPvbB7toaPXy9XMGOh1FguzsgnSS4qL590cqKDl5KigJKa/P8n9LSpg+OIMJ/Xs7HUeCLC4mipmFmfz7o4NOR5EwpIKSkHppUwWl1U18a/pgp6NIiFw8LIvtlQ2UVjc6HUXCjApKQurh93YyMCOBK0dr1dxIcWQJFY2i5GSpoCRkPtxTw3s7q/nGtEFER+nS8kgxPDuZ/LR4FZScNBWUhMxvF5eQ3CuaOyYNcDqKhJAxhouGZrFweyVen3U6joQRFZSERHldM0+t2csXzhqgS8sj0MVDs6hu9PDhnlqno0gYUUFJSPzm3Z14fZZvTh/kdBRxwAVD+gA6DyUnRwUlQXeosZVHl5dy47h8CvskOR1HHJCd0ovx+am8vu2A01EkjAStoIwx9xljHm73fY4xZrkxZrExJiVY+xX3+f3SUupbvNx9fpHTUcRBs0bksKz0ENWNrU5HkTARlIIyxhQB2R02zwLmAm8BFwZjv+I+DS1tPPzeTmaNyGZsXy3pHsmuGJmD12d5Y6sO80nXBKWgrLU7gHkdNvcFKoByQMunRoi/vL+LqkYPP7pgiNNRxGFn9e9NVnIcr2ze73QUCRNOnYPStaYRoLXNx4OLipk+OIOpgzKcjiMOi4oyXD48m9e2HqDN63M6joSBUBbUPiAXyAt8LT3c3NV72FPbzI907kkCrhiZw6Emj1bZlS4JekEZY2YbY6YAC4Bb8J9/eivY+xVnebw+fvbWds7sl8alwzuejpRIdfGwLGKijA7zSZfEBOuFrbWLgEUdNk8J1v7EXR5fuZuS6kYeuXqSVsyVj6XGxzKjMJMFWw7wwBUjnY4jLqf7oKTbtbb5uP+t7Uwa0JvLR2j0JJ80a0Q2myoOU1Kl2c3l+FRQ0u3+9sEudh1q4r5Lhmn0JJ/y2VH+meyf31jucBJxOxWUdKuWNi8/f2s75wxM5+JhWU7HERcq7JPEuL6pPLtOBSXHp4KSbvWXFbvYU9vMf2v0JMdx3Rl5LC87xN7aJqejiIupoKTbNHm8/GLhDqYPzvh4clCRzlw3ti8A89dXOJxE3EwFJd3mD0tL2VfXrHNPckLDspMZnZvCs+t1S6QcmwpKukVNk4efv7WdS4ZlMbNIoyc5sWvH5rG4pJqKumano4hLqaCkW/xq4Q5qmj38atYIp6NImLhubB7WwvMbdZhPOqeCktO2p6aJ3y7eyefOzGdcfprTcSRMjMpNYUROMk+t2et0FHEpFZSctnvf+AifhfsvHe50FAkjxhhuObMfi3dW66Zd6ZQKSk7L5orD/H3lLr4+tYCBGYlOx5Ewc8sE/8o7cz/c43AScSMVlJyWH726heReMfyX1nuSUzAgPZGZhZk8sWoP1moVHvkkFZScssU7q3hp035+eF4RfZJ7OR1HwtStE/qxo7KB93fVOB1FXEYFJafE57N8+8VN5KfF883pg5yOI2HsujPyiI+J4olVu52OIi6jgpJT8viq3azeU8sDs0aQ1Ctoq7ZIBEiNj+WaMXn848O9NLS0OR1HXEQFJSftcHMb//nqVs4uSOfmM/OdjiM9wJ3nFFDX3Ma8tZpZQo5SQclJ+8XC7VQcbuG3V43SlEbSLaYNymBUbgqPLit1Ooq4iApKTsrOqgZ+/e5Obp3Qj0kD0p2OIz2EMYavTSlg9Z5aVupiCQlQQclJ+f7Lm4mJNvxylm7Kle5168R+JMVFaxQlH1NBSZf9e9tB5m+o4EfnF5GfluB0HOlhUuNjuXVCP/65Zi/7D7c4HUdcQAUlXdLs8fL1+RsY0ieJ780sdDqO9F
DfnjGYVq+P/1u80+ko4gIqKOmSX729gx2VDfzh2jHEx0Y7HUd6qKFZyVwzJo8/LCvjcLMuOY90Kig5oY8O1vPLhTu4eXw+Fw7NcjqO9HA/OK+QmiYPf3m/zOko4jAVlByXtZavP7eBhNgoHvrsSKfjSASYNCCdmYWZPLhoJ00er9NxxEEqKDmux1fuYeH2Sn5+2XByU+OdjiMR4t5LhrKvrpk/LC11Ooo4KCgFZYzJMcYsN8YsNsaktNteYYxZZIzJDcZ+pXvtqWnimy9uZPrgDL52zkCn40gEmVHYh4uHZvHLhdupa/Y4HUccEqwR1CxgLvAWcCGAMSYaeM1aO9NaqzWeXc5ay5eeXkebz/L3G8cRFaUZIyS0fnH5cKoaPTy0SFf0RapgFVRfoAIoB45M1pYOjDLGXB2kfUo3+uv7u3hj20H+Z9YICvskOR1HItCE/r25/ow8Hny3mF2HtOJuJArFOSgLYK2tBKYCdxljCkKwXzlFW/cf5lsvbuK8okwd2hNH/e8VI7HW8p2XNjsdRRwQrILaB+QCeYGvAbDWeoAqoHeQ9iunqcnj5YYnV5MQG82TN4/XoT1xVEFGIvdcOJTn1pfzxtYDTseREAtWQS0AbsF//inZGDPFGHOdMWYp0AqsD9J+5TR984WNbCg/zJM3j9d0RuIK3505mCF9kvj6/A3Ua72oiGKstU5nIHClX11dXR0pKSknfLwEx2PLS7nz2Q3cfX4Rv5w1wuk4Ih9bvLOKGX9YxpcnD+Cx689wOo50v04P1eg+KAHgrY8Octf8jVw2PJv7Lx3mdByRT5g+OJPvzSjkTyt2sWDzfqfjSIiooIQN5XVc/8RqhmcnM+/WM4mJ1j8LcZ/7LxvG2LxUPj9vra7qixD6SRThNlcc5oI/LicpLppX7phEanys05FEOtUrJpqnb5tAq9fHNXNWaRqkCKCCimBr99ZywR+XE20Mb39tCgMzEp2OJHJcw7KTmXvzeFbvqeUrz6zDDefQJXhUUBHq1S37mf77pcREGRbeOYWhWclORxLpks+MyuVnlw1j7uq9/PCVLU7HkSCKcTqAhFZrm497XtvKg+8Wc0ZeKgu+NJm+aZoEVsLLf14whIq6Fv53UTGZSXH88Pxw2X/LAAAODklEQVQipyNJEKigIoS1lte3HuC7L29my/56vnL2AH792VEk9dI/AQk/xhh+e9Voqho93L1gCx6vj/+6cAjG6MbynkQ/nXq4yvoWXtq0n98vK+XDPbUU9UnilTsmMWtkjtPRRE5LVJThiZvGERNl+PHr26htbuOBWSM0+0kPooIKE80eL6XVjVQ3emj0eGls9dLi9dHa5qPVG/hos7R6fTR5vBRXNbK+vI41e2uxFkbmJPPotWP44qQBxMXo1KP0DDHRUcyZPY7U+BgeXFRMSXUjc2aPI1lHBnoEzSThUiVVjby0qYLFJdW8X3aIPbXNJ/X83JRejMpNYdqgDGaNyGFi/zQd/pAey1rLb97byfdf3szo3FRe+MJZDMrUValhpNMfTiooF2n2ePnHh3v504oyPthVA0BBegLnDMxgRE4ygzISyUqKIzEumsTYaOJjo4mLNsRFRxEXE+X/HB1FXIyhV0y0w38bkdB7c9sBbnzyQ3zW8odrxvC5Cf2cjiRdo4Jyq2aPl/9bXMJD7xZzoL6VsXmpfO7MfK4/o69+CxQ5SaXVjdzyjw9ZWnqIm8bn87trRpORGOd0LDk+FZTbWGuZt2Yfd7+6hV2HmrhkWBY/OK+I84oydThO5DS0eX386u0d3PvmR2QkxvLgZ0Zy64R++v/KvVRQblJS1cgdT6/lnR1VjM9P5aHPjuK8oj5OxxLpUdbvq+POZ9ezvOwQMwoz+c1nRzG+X5rTseTTVFBu4PNZHl1Wyg8XbCHKGB78zEi+NHmALo0VCRKfz/LXD3Zx94ItVDd6mD2uL/dfNpyiPklOR5OjVFBOK65s4I6n1/FucRWXDMviz9efQf90LQooEg
q1TR7+d1Exv3lvJ61tPm4+M5/vzihkbN9Up6OJCso5Pp/l90tLufvVLcREGX7z2VF8YVJ/HQ8XcUBFXTO/fHsHf31/Fw2tXi4emsVdUwdy2YhsYrXUjFNUUE4ormzgi/9ay3s7q7l0uH/U1K+3Rk0iTqtubOWx5WU8sqSE8roWspLjuHl8Pjec0ZfJBelE67B7KKmgQqnjqOnhK0fx+bM0ahJxG4/XxxvbDjJn5W5e3rSfVq+PzMRYLh2ezczCTCYXpDMyJ0WFFVwqqFDZWF7H1+dvYLFGTSJh5VBjK29uO8iCLQd4besBKhtaAUiKi2Z4djJFfZIo6pNE39R4+iTFkZkYS2ZSHL0TYkmLjyE1PlZFdmpUUMFW2+Th3je38ciSUtLiY/jfK0bqXJNImLLWsqOygfd31bBydw3bDtSzo7KB0kNNeH3H/rmZ3CuatHh/YaXFx5KWEPPx9/4iiyU9MZbCzERGZKfQr3e8fkaooIKnvqWN3y0p4cFFxVQ3efjK2QX8/LLhZCbp7nWRnsbj9VHV0EpVo4fKhhaqGjzUNnuobW6jpinwdVPbx9s6ft/S5vvE6yXFRTMyJ4VJA3ozaUBvJg9IZ0ifpEi79UQF1d321DTxpxVlPLqsjMqGVi4bns39lw5jQv/eTkcTEZdqafNS1eDho4P1bD1Qz5YD9azfV8eqPTXUt3gB6J0Qy+QBvZlSkM45AzOYXNCb1PhYh5MHlQqqO1TWt/Dy5v08t76c17YewAKXD8/mvy4cwpSBGU7HE5Ew5fVZtuw/zAe7anh/1yGWlx1iY8VhrAVjYFROClMGpnNOQQZTBqYzNCupJx0a7NkFdbC+hf2HW4iNjiImyvg/og0xUVGfmPE7Jsp06T+qz2epONxCcVUD2w82sHJ3DcvLDrGhvA6fhQHpCdw0Lp+vTinQhK4iEhS1TR4+2OX/2bO8rJoVZTXUNHkAyEiM5eyCdKYUpDM6N4UhWckUZiYSHxu8lQw8Xh+NrV4aPV4aWr20eX0Mz+mWQUXPLqgH3ynm+69s7tJjjyxJcXR5Cv/n2GhDY6v34+PG7d+a1PgYJg/ozdSBGXxmVA7j87W+koiEls9n2Xqg3l9YpYdYVlbNlv31H/+5MdAvLZ6clF5kJfUiKzmO9IRYesUc/VlngFavj5Y2/4d/kVN/8TS0tn28IOqRzw3tvm7rcHFIbkovyu+9uDv+aqErKGNMDvAC0AZcbq093Nm2do8/7YL66GA96/bV0ea1tPl8tPksbT6Lx2vxeI+uOtvSdnTl2aMr0fpo9fq3JcZG0zshlt4JMeSmxFPUJ5HCzCQGZiTq8lERcZ3aJg8fHWxgR2UD2ysbKK5q4EB9CwfrWznY0EpNk4fWNh8tXt8nfunuFRP1cXEdWWPuyOekuOgTbkuKiyEtPobPjMrtjr9GSAvqi0AC0AdYb619vrNt7R4fNuegRETCkbUWr8/isxAb3bVTHSHUaZiYIO2sL7AF8AD5x9kmIiIhYIz/vHw4CcXMiJ0N0Zw/8SUiIq4WrILaB+QCeYGvj7VNRESkU8E6xLeAoxdE7DTGTOmw7ddB2q+IiPQQPeYycxERCVudnhzT6lwiIuJKKigREXElFZSIiLhSsC6SOCWHDx8+8YNERKRHSU1NTQHqbYeLItxykYQuPRcRiWyp7afAA/cUlMF/j1T9iR4rIiI9kjtHUCIiIh3pIgkREXElFZSIiLiSCsphxpgcY8xyY8ziwIwaGGNuDHy/yBgT53RGJ3X2/gS2FxhjWpzM5gbHeX9mGWOeMcakO5nPacf4/2tUYNsKY0yG0xmdZoy5zxjzcLvvO/035QQVlPNmAXOBt4ALA9uettZOByqBAU4Fc4nO3h+AbwHbHUnkLp96f4wxycBXrbXXW2sPORnOBTr793M+8EtgCXCmQ7lcwRhTBGR32Hys/+dCTgXlvL5ABVBOYJ0sa601xiQAKUCJg9nc4FPvT7vfeiudCuUin3p/gGnASGPM28
aYblnuNIx19v68DFyD/8rhJQ7lcgVr7Q5gXofNnb1njlBBuUv7Syp/DdxjrfU6FcaFjrw/NwNPOhnEpY68PxnA74CngRudi+M6R96ffPw/gOPwr/Atx+boZd4qKOd9ap0sY8yZ+AdSK50M5hKdrSM2GPgB/lHCV5wK5hKdvT8HgSSgmWPMEh1BOnt/bgD+DawELnEol5u5Zu0+3QflMGNMDkfXyfoTsAM4B/gy/t/y7rHWRuxhiM7eH2vt8sCfLbLWznQwnuOO8e9nDfAikADcZq0tdSygw47x/sQBj+Av8KustRE9i40xZiZwFbACKAN2cvQ9u7zj7A4hzaaCEhERN9IhPhERcSUVlIiIuJIKSkREXEkFJSIirqSCEhERV1JBiYiIK6mgREQkKDpORNtu+3nGmA+MMf82xsQe6/kqKAkrxpgoY8zcwEzv33U6j4h0rv1EtMaYAYHZ0d81xqQBVwBfAmqBgmO9hgpKws0s/LNJzLTWPmSMWQRgjHnJGBNnjKk2xsQYY35+pMCMMbuNMdnGmC8bYx4JbFt05AWNMfcG7qY/8v0iY8xoY8zvjTEZxph3Ah/pgT+vMcZsNcZ8PvAx0xhzf+DrOcaYge0+Xx1Y1uHewHP/Yox5zxjzn8aYUmPMWmPM+YHXX2qM+a92Odq//r3GmNjA45YEJhM+8rg5gZzr2v/d2n2eGchbcSRbYPviwPcPBPb9QrvXizXG7Ozk+SONMQsC2xcEJqSNMsb8NbBMw6fea4lMHSai/SLwAP4ZTi4BngL+H1AXeFynVFASbsbgn5LliLrAD+soIB3/PHTnAsMD3wNU4/+NbRIQ3cX9/Bj4Gf5Zr+cATwBXBg5HrAR+1e6xKcBNga+bA98f8TX8s4ufY4wZCMRZa8+11v4i8Lrfsta+jX/OvOnAeccKZK31WGvPA5YBZ3T44+PNSRiNf3mJ149sMMYceY8AcoDPAVGBpToI/H1SO3l+IxAX+C34ELAVKARyrbX7ObX3Wnq+POCn+Cd6TgAG4Z9WKe14a06poCTcNPPJH3zrgOuBj/AXw+v4f9iW4P/HHxN4zCSggaPlMTYwgri0k31kAL2tteX4f3jvDnz0BXoDNR0e/0X8M4eDfx2dP3J0HZ1c/Ovq9As8v+wYf69cYDPwYYftdwMPg/+QiTHmHeBaIL7dY+ICOY+s/ZQbGD2Na/f3qe3wutfRrrDw/2abhf89ApgMbDzG88vxz932AbAaf7HWH+e9FikHfmKtnWitfRy4Bf/oqgL/v5dOqaAk3Kznk6OMZcD3gXeBZPw/pEcAb+L/TS0Z8OEfDSwLbDvyOtfiP8zQUTWw2hhzObAff7n0w/8/2RT8P5Tbi8dfkFhrl1hrp+IvJYBdwCXW2pHAAY69AGWFtXY4nx5B/Qr/4ozgH839jk8vNTIJeKbDa80E1ga+H8fRsjliL9B+KZcr8RfkmMDfdWm7P+v4/OXA9/C/58uAbwc+H+u9Fvkb8OPARRF5wD/xT0g7ik8eEfkEFZSEm4VAeuBk60+BxfgPMS3CP5IA+Ab+Eolvt+0+4DWOjjz6Aws4OvLp6H+A7wDzgTuALwBv4//B/OcOj/37cfI+Biw2xvwicKw9KnCOq+NS7MYYsxT/CORY/g3cg79Y69tt/8hau6yzJxhjxgFfBf4CXAp8Hv8s1e1L7gD+32aH4h95HgaeO87zF+Ifsa231m7BP8J6i2O/1xKhrLWLrLXfstbustaeY629yFpbbq39l7V2TOBccsOxnq/ZzCWsBc4JzbXWamG+TgQu/phprb03cA7sXmvt50/n+cB3gV9Za78ceMxz1tpruzO3CECM0wFETpUxJh54A/i501lcbC1QGvi6gk9e3HGqz38euAvAGDMf/xLqIt1OIygREXElnYMSERFXUkGJiIgrqaBERMSVVFAiIuJKKigREXGl/w8N3gzeVGkFmgAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f10db0f0>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Count statistics:\n",
" min: 6231205\n",
" mean: 52995255.33866667\n",
" max: 103219262\n"
]
}
],
"source": [
"total_counts = np.sum(counts, axis=0)  # sum down the columns\n",
"                                       # (axis=1 would sum along the rows)\n",
"\n",
"from scipy import stats\n",
"\n",
"# Use Gaussian smoothing to estimate the density\n",
"density = stats.gaussian_kde(total_counts)\n",
"\n",
"# Make the values at which to estimate the density, for plotting\n",
"x = np.arange(min(total_counts), max(total_counts), 10000)\n",
"\n",
"# Make the density plot\n",
"fig, ax = plt.subplots()\n",
"ax.plot(x, density(x))\n",
"ax.set_xlabel(\"Total counts per individual\")\n",
"ax.set_ylabel(\"Density\")\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_06.png', dpi=600)\n",
"plt.show()\n",
"\n",
"print(f'Count statistics:\\n min: {np.min(total_counts)}'\n",
"      f'\\n mean: {np.mean(total_counts)}'\n",
"      f'\\n max: {np.max(total_counts)}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Normalizing Library Size Between Samples"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAAClCAYAAAAONXX6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJztnXuUFdWd7z8/Wp4N+ODlC0l8BUhM1ETDHR8B49ywCHFlEjMxD4wkGp2JmYnkarj3+kp0TRySMZnIullzScRoEjHBoKiIuQpofGCM4APpVnkoCDbdNGDDabpp6N/9Y++Crt0NVafrnD7ntL/PWmd9e/ep2udXu3b99m/vqtpbVBXDMAyj+/QptQGGYRiVjjlSwzCMjJgjNQzDyIg5UsMwjIyYIzUMw8hIr3Gk4hgiIlJqWwzDeH9xWKkNKCCDgaampqZS22EYRu8iMTjrNRGpYRhGqTBHahiGkRFzpIZhGBkxR2oYhhGwdVcrP1m6hq27WlNtb47UMAwjYO4LG7nu4RrmvrAx1fa96a69YRhGQZh+1uiYJiG9ZfYnERmCf/xpyJAhpTbHMIzegz3+ZBiGUWzMkRqGYWTEHKlhGEZGzJEahmFkxBypYRhGRsyRGoZhZKRojlRExohIq4iMF5EXRWSRn+pukoisEJG5fruv+/StPj1TRFaKyFU+PdvvP9Wn7/fbf7xYthuGYeRDMSPS7wFvApcANwJbgNOB6cA0YIyIDAOuAD4FTPX7XQxMAL4lIgOAM4EvAFeIyIlAO3ANcGkRbTcMw0hNURypiBzl/9wKHAvUAe8Cx3VI1wHHAENVdSfQLCKDgSpVbQX6A8OBxi72jdKGYRglJ5UjFZFBXj8iItUpdvkqcE8X/w9fo0pKd/x/2m0NwzB6lMR37UXkT0CViAwHaoBTcF3xQ3EicC4wHjgVWICLPjf7z9HAKFx02eQj0YGqmhORfSLSH2jBRbRH4SLRjvtGeRmGYZScNBHpSOBfgGrgFqBf0g6qOkNVLwFWAxcCP8Q5zpeBubhodYOqNgJzgKeAR/zu84HlwJ2q2gKsxDniOaq6ztv8M+DulMdoGIZRVBInLRGR23DOVHDd6QZV/UEP2JYXNmmJYRhFInHSkjTT6F0PnMOB7vXTGY0yDMPoVaTp2v8JmAgM8vpgEe0xDMOoONJEpMOAJcBGYDQwuagWGYZhVBhpHOk03EP0I3EP1X+1qBYZhmFUGGkc6bnAGtxbSgKcB6wvplGGYRiVRJox0uiO1UzsIXjDMIxOpIlI1+OcaSvwVlGtMQzDqEDSONJJuEj0gQ5/P1VMowzDMCqJNI50DvA5oK+qzhaRDxXZJsMwjIoizRjpH3HvxEd362cXzxzDMIzKI40jbQQ+AhwhItcCueKaZBiGUVmkedd+NHAybtalemCZqu7rAdvywt61NwyjSBTkXfvfqOoFBTDGMAyjV5LGkR4rIkv83wKoOVbDMIwDJDpSVR3bE4YYhmFUKrYcs2EYRkbSLDVyfvg/VbUH8g3DMDxpxkgvxy0XMg/Y4f9njtQwDMOTpmt/A+5h/JG4dZcWFdUiwzCMCiONI70MOB83jV49MCVpBxH5kog8KSJ3icgkEVkhInP9d1/36Vt9eqaIrBSRq3x6toi8KCJTffp+v/3HRWSAiCwRkb+KyAndO2TDMIzCkqZrv4wD0+cJKabSU9U/ish84DWgP25y6DtEZBhwBW4557/g1oO6GJgAPC0idwFnAl8AfiEiq4F24BrgUtwyJy8Cq4Av4lYTNQzDKClpHOm3yHOMVESqcEsxP49bNK/Of44BhqrqThFp9uvZV6lqq1/LfjjuldR3geM67NtV+rQ8jtMwDKNoFGWM1L9COh44HKjq+FW46cGySLGtTTJtGEZZkCYi/YbXN71OAf6WtJOq7hORw4B3cO/pj8JFk00+Eh2oqjkR2eej0RZgK3AUB5Z+3uz3PaZD+m
N+u82pjtAwDKPIpHGkVap6U5QQkX9L2kFEvocbw9wC3AncA7ysqo0iMgc3NPCI33w+sBz4L1VtEZGVwALgRlVdJyJ9cGOhV+DGXK/HjbFenPIYDcMwikqa2Z+eAf438DYwBrhNVSf0gG15YbM/GYZRJAoy+9NXcMsxH4stx2wYhtGJNI70elX9dpQQkTuBbxbPJMMwjMoijSM9UUQu40DX/qSiWmQYhlFhpHn86Qu4R40+iXuU6R+KapFhGEaFkcaRTgAmAx9R1V8Dny6uSYZhGJVFGkf6Q+CfcDebAL5TPHMMwzAqjzSO9Engd8BYEXkAm0LPMAwjRuJzpAAiMhSoBnao6u6iW9UN7DlSwzCKRPbnSEXkfwEX4CYKOVZEXlHVawpgnGEYRq8gzeNP/11VJ0YJEXmyeOYYhmFUHmkc6aYOyzEDtIvIUmxZZsMwDCDdcsxf6wlDDMMwKpU0Y6TbgJeiJBaJGoZhxEjTtX/ZHKdhGMbBSeNICcZIMcdqGIZxgDSO9BequiBKiMiXi2iPYRhGxZHmzaYf+KVAEJEBwIzimmQYhlFZpHGkPwIWiMgTwAPAj4trkmEY5cjWXa38ZOkatu5qLbUpZUearv1WVZ0CICL9sEmdDeN9ydwXNnLdwzUAXDvp5BJbU16kmo9URO4SkWnAY7jVPg+JiHxZRP4iIstE5KMi8qKILBLHJBFZISJz/bZf9+lbfXqmiKwUkat8erbff6pP3++3/3i3j9owjLyZftZoZk0dx/SzRpfalLIjjSPdDbQDtwPPASek2OcPqnoebtnkmcCNuPWeTset/zQNGCMiw3Crg34KmOr3vRg3B+q3/JjsmbjJpa8QkRO9LdcAl6Y5QMMwCsPwwf25dtLJDB/cv9SmlB1puvbLcDPk/8Zr4kwoqqoiMhAYglt/vg436clxuHlN6/znGGCoqu4UkWa/3n2Vqrb6G1zDgcYu9o3ShmEYJSdNRJoDLsdFlt8AmlLmfTtuDfp9Hf4XztmXlO74/7TbGoZh9ChpHOkvgOuAAcBNwP9J2kFEzsQFpi/gItKjcdHn5g7pUbjosslHogNVNQfs89FoC25o4ChcJLq5i7wMwzA60dNPGKRxpAtUtQ64SVXfAR5Ksc8k4AIRWQY8jluuZBTwMjAXuAfYoKqNwBzcrPuP+H3nA8uBO1W1BVgJLADmqOo6b/PPgLtTHWGFYY+YGEZ2oicM5r6wsUd+L9UM+ZVAb5kh/ydL13DdwzXMmjrOHjExjG6ydVcrc1/YyPSzRhfi5ljifSFzpGVGgSuAYRjZye5IReTrwNeA/j7DzeU4R2lvcaSGYZQdiY40zRjp93HPcfZR1UnAmKxWGYZh9CbSONLZfuXQZSLyHLCqyDYZhmFUFGmXYx6kqs0ichqwzj+mVFZY194oNDZebXgKshzzn4AqERkO1ACn4F7pNIxejU3SYaQlTdd+JPAvQDVwC9CvqBYZFUtvewbWJukw0pLmrv1tOGcahbf1qvqDYhuWL9a1Lz32DKzRS8netQfqVHVmAYwxejlR5GYRnPF+I01Euh73Wud+VPVHxTSqO1hEahhGkShIRLoDN5VeYmaGYRjvR9LcbLpPVZ9S1SdV9cmiW2QUlUq+IVTJthu9mzSO9KLoDxER4N+KZ45RbHp6VpxCUsm2G72bNF37u0RkKW6C5r7AvcU1ySgmlXxDqJJtN3o3ad9s+hAwRFX/VnyTuofdbHr/YW8eZaM3lV+RjyX7pCUisgD4J+D/+vTC7HYZRnasq5+N3lR+pT6WNF37EcBPgbNF5CxgWHFNMox0vN+7+lmjsEKXXykj3FLXhTTPkZ6IW/TuaNySyr9S1Q09YFteVErXvjd1p4zSUm5vkpWbPQWkIPORngusA54B1gITU/2yyI9E5OciMl5EXhSRReKYJCIrRGSu3+7rPn2rT88UkZUicpVPz/b7T/Xp+/32H09jR7lR6i6I0Xsot7kAys2eniSNI4288fdJua69iJ
yMez8f4BLgRlw0ezowHZgGjBGRYcAVuNmkpvrtLwYmAN8SkQHAmbiJpa/w0XE7cA1waVe/3VhmzxqGzz6Gla3cn40sd/sORWh7JR9LVwwf3J9rJ51ctJ5NvuVVbHvKmTSOFGAI0KCqd6vqb5I2VtU1wDyfPBa37PK7wHEd0nW4ZZWHqupOoNkvy1ylqq24pU2GA41d7BulO3HPinfKKuILI9CwsuUbofa0MyhmBF3sYwltL/feQLk5+nIvr3Iizc0mgHrgywX4vXBANind8f+ptp125vEMGDS4ZN2LcAw0aRA830HycI7MYo+5FnMQv9jzfYa2l/qGRBLlNv9pOZdXud1rSONIJ+Kc1mT/ZpOq6jfz+I3NuBtVx/i/o/QoXHTZ5CPRgaqaE5F9ItIfaAG2AkfhItHNXeTViWE+4isV4cUwPMGepO9Dwspd7IsvH/vCyp1U2cNjKfTFEdqeb1n3NBeNH8WytY1cNH5UqU0Byru8yq3RSdO1vwc3qfM7wL8DP8zzN+b5fUYBL+NmkroH2KCqjcAc4CngEb/9fGA5cKeqtgArgQXAHFVd523+GXB3nnb0CMUecA+HBvL9vXy7j/lsn29XOjyW2U+v57qHa5j99PpUtvU2Fq7ewqKaehau3lJqU7pFTw5NFPs6y/tYVPWQH+Am/3kA18WfnbRPKT64cVxtamrSUtKws0VnLXlTG3a2lNSOgzFryZvKjIU6a8mbBd8+PPZ8y+KmxTXKjIV60+KaVNvna0+5kVReSfaX2/HlW7fyoaePNTiWZP+TuIFbfvkEr2OAE9Jk3NOfcnGkxaxMhSDfCtmTFbjQv1Xu5yLJvny/L7VjLebv9/S5DI6lII70zg6fubgud8kdZxd2loUjLXRlyhrlvZ+prWvSKXOWa21daevEwQjPZWhvvhFpuTccWShxvS+II302iEjHpMm4pz/l4kgLTdjdLXT3Nwvl7tQrzbFMmbNcmbFQp8xZ3q39y/185EPWYylwWST6nzQ3m1qBu4DfAP+Ju4Nu9BSaoCUk63OGxb45cdH4UUwZN7Js7oIncfvnxjNl3Ehu/9z4UptScrLWrR6vm2m8rR6I+o4C7s1nn576UKCItNxa9WJ37bPkl9WWYkeMlRaR5ku+Xfsyi/IOSdZhmQLXzYJ07c/GjY0u9vrJNBn39OdgjjTfAi12ZUyqIMWurOHvl/KGRbHvUpdbo1ho8j13Wc91vg1TPvkXulHIWleKcbNpOW5stMrrX9Nk3NOfgznSfE9+kqPLGuWE42A9fcPgwl8+q8xYqBf+8llVTXasxSQ89msXrlJmLNRrF67qcvsS37ktO/J1JmE63/H2YjrecNukG2+FfqIhIb9E/5NmjLQWN+nIDV5r0w0alAfnfeBIxo6s5rwPHJlq+6SHorNOOhKOg4VjOfmO6yX9/utbdvLZXz3P61t2AnDGcUNjGh5vlged8y2L8NhXbmqKadKELz1tX7HJOslKuH34gkOnSUUKPN4e/n4+dTm8TsN6me91EtaVpHOZ+QH/NN4WOAn4O69D0+zT0x98RPq3NZtiLVkYgSW10s+t26pjb3tCn1u3tcuWK+SmR32r/mj37qJnjUiTtg8j4HwfscmHpLJIKvtHX3tXR9zwqD762rupji0k6dzle656+vGp0L58o65w/6SIs9AvTFz7oO9RPLgqlf0dyfc6LfQwQ8L32SJSEZkvIqNVdS3wHPB54NHuueye4X88tJpFNfVcvWAVAB8aUR3TWUv85LNL1gCdW6obHnuD2vocNzz2RroflEATSIoywpZx8eo6Rt64mMWr67rcP6llDiPge1/axKKaeu59aRMAaxp2cedfN7CmYRfQOYLNh625PXENI6RnfIT0TNcR0h3Pvk1Dro07nn27y7JIYvofXqa2Psf0P7zc5e+H5yopArx3pS+rlZu6zi+pPBLyD8s6LL/w3Cad++a2fTGdfOoIxo6sZvKpI7q0L+9p7xIi2JWbm2Kaz/k749
ihMQ1ta8ztYdnaRhp92eRbN5KONayb+ZLUtf8eMEtE/gV4CMgB53Xrl3qI0YcPAGDMEU6HV/eLaXiyw8oYOt6ki+Hqcz7IrKnjuPqcD3ZpT7h96LiTnMul816iIdfGpfNe6nL/pKGIYdX9mHjSMIb5429u3RfTr/1+JbX1Ob72+5UAfHv+Kyyqqefb81/p0v5DsWLTezENbQ1/O3QkN3z6ZMaOrOaGT5/cZVksX9/IuH9fwvL1jV3+/s8+N54R1X352UGGTcJzlTQ3QOiYwnMV2pPUte5kz4JVsUZ/pS+3SJO6t+H3g/pWxTTfoCCpEf3KGccxZdxIvnLGcV0e/y2fOZWxI6u55TOnAp2d36G47oKTmTV1HNdd0PUEJDN8gDTjodVA9rlPw3MV1s18G80kR/pN3Jjod4C9uIlHru+O4T3F2m27AXizsRmAq8/1F8+57uL5/vkfZER1X75/vksnVcZ8HV+S4zxt1GBGVPfltFGDncFBKx/uf/clpzOiui93X3I60LklTmqZwwh8UP+qmI4a0j+mu/fsi2mWccLEcaz7XmJRTT3T73ONxOLXG6itz7H49Yaut0+IOJdv3EFDro3lG3ek+/2g7Dptn/AM71d9I/RV3wh1KqsgAg5/L2y0Q0J7kuwN63pSUBASNqIh4bUybd5L1NbnmDav6/MXNhT5ENqa9IxtvtF/UgSa7wQ6SY70SWAZ8G3cjEvL/P/KlrOOPxyAT55wBNC5Vbxt6Voacm3ctnQtkNw96nxzaU9MQ8KLKRxE/+6Dr9GQa+O7D74GdG7lw/0njz+a+h9NZvL4o4HkLk9Ygf7fmw0xnTD6CEZU92XCaFc+N154CiOq+3LjhacAoN5LRJrPzbVxIwfH9NfPb2BRTT2/ft4t8TWoX1VMt+xsjWlY9p0apX/8GGNHVjP3Hz/WZVmF5+aOp9ezqKaeO/zFkNQIdmpUA3tDR9W/qk9Mw3M9bkQ11X37MO4gjjLsLYXlF9of2hsO04R1IQwKZkVrKi11jWoYUYajVOH34fF96oNHxTTMIKmh6EhS9J64f3Buw/2v9I3Elft7WsF1HNi+ccfumCZxSEeqqk929UmVc4k4/8SjGFHdlwtOcoudhq1s8569MQ0v9lfrdsY0HEMMu1/hxR5WtrB7db6vdJGGF0tSFBVW7vD4wgq0tjEX02seWk1Dro1rfBfpP55aT0Oujf94Kl3L27HCh8deW78rpkvXbo3phBO8E/eN3MxJJ9G3jzBz0kkANPsoONKbH3ud6x6u4ebHXu/SltDJh+du/ivvxjTsvoWEZR82ciF1vgGINDzXV93/Krm2dq66/1WgsyMLy+Nv7+yI6cPeoUd634qNDJ75CPet8M4liJBDZxE2TH/dsCOmYUR925SxjB1ZzW1TxrryDXoAtzyxhtr6HLc84ew//siBMQ0bjrChOBTh+PAR/avo20c4wvecwt5LGGGG5za8DhtzrTENh6E2eYcZ6Z/faIhpEmmXGqkYZnhH8c++O9G2rz2u7RrTZ97aFtPTjh4S07Ayhb28sCUML6ZR1X1jOmxQv5g+5k9UpLOWrmFRTf1Bo4bP3/UCtfU5Pn/XC10eX+gMjh86MKY3//0pVPftw81/7yLQwwdUxXTH7r0x7RT1dajw4XfbmttiWrdzT0yvuv9VGnJt+x3LjY+9QVu7cqMvq9ARLnj13ZiGXckwAjt+aP+YbvP/j3TTe7tjGhKW/S1/fp1FNfXc8mfnyEPHfrbv/UR6gh+Xj7Sqj8R0cU19TMPeyZtbm2N65KC+MZ1+3yvk2tqZfp9zlKEjzvngINLnN2yP6XG+XCI9vP9hMf3tik3U1uf47QoX4YY9gLB7HY45X+kbjiv9+U1qiDo6w9CxfWfBKtrale/463jzey0xDR1rY/OemP5gUS219Tl+sMg9rbl+W3NMd7a0xXRRbUNMw4AriV7nSPsF3a3+vhJHGobs7/mCjDTszl1x9mj6iFOAE48aFNON23fHdK
R3mJE+/db2mD7zVmNMb/3MqVT37cOtfoD+T6/WxTSMGrbsaovpYN9tizSMsLftbovpT59cR66tnZ8+uQ6AJ95sjGlY4cIx3Y4VPmz1w+h34GES09a97TENHe8QX+aRbt+9J6Yf9Y1bpOEY3FPrt8X0MJfNfo267JGGjdSCVXUxnf9qXEPH/szb22O6YUdLTMMI+42tuZiOHV4d091t7TFV1ZgO93Uq0q/8bgUNuTa+8rsVQOeu+Tp/nyDSsNHO+Ug10kW19TE9clA/ThxWzZG+0d/evId1jTm2R04raMhafT6RhnUx7MFM+/1KFtXUM+33K3nLNx6R+iLYr5/wjVWkaxpyMV26tjGmG7c3x7SqT5+Yvu3PUaTHHd4/ptt274tpEr3Okb7jC26D12Xrt8d0W/PemL69rSWmm72DjfT6xbW0q1OAR2rqYvrAqrguXdMY0wF9+8T02bffi+mPl6wh19bOj/3NoCrRmLZ5pxNpyBP+dyJ9tHZLTJta9sR0lY/2Ih3qo5FIR/puWKTf8E8NfMNHgXt95Lt3Xzszfas/07f6YeVf72/8Rdr/sD4xre4nMX3M36SIdG+7xHSJP8ZID/fdvkijCzxSH1Qf0D3tMQ17G+eMOTKmfXy/I9Iw/5w/0EgvP/t4qvv24fKzj/e/qzEd4R1SpI97hxZp2NvZ/F7rIXVHy76Y7toT123+wCNtbI5rGAGf4oODSMOob+qvn6e2PsfUXz/f5fehK5//yuaYhhH96w079+u73hlH6tve/frUusaY7vCBT6THDO4X0/XbW2L6ER8IRNrmnX2kO1vimi+9zpHu3ueqYfPerh92CyvrLl+QkYaOMbwYd7S0xzSsrFv84HWkaxuaYxqyqm5XTBt9CxjpO02tMQ1pD7S2PhfT0P4WXy6RbvBdpUjf9XZHWp9ri+kG7xQ3bNvdaVghJDoFkYZRwHZfhpGG24fDMDta9sb0/lVbYppr05iG7As0fFzqube3xTQKRg6mIZfd+xK5tnYuu/elLr9/x4+lRuoPe7+GNPhIPdLQ/pAaf84jzZfHfQMV6cubd8Q0rJtrfGQdaYu/9iINnVnoWMOhj46EdWGrd/6Rho32sxvei2lI+H1Y9uF1kC+9zpFmJefPXO4gjjiJ0LHtDTTE17n9mpWwAhaaLf6i3tLcRpPvckdaaVz7kBujvPYhN0YZOop8SXKM5U5Yd/2p3q8hDb5xjTSJcFhquy/n7SnKu4xmj+ySinOkIjJKRJ4Tkb+IyJBS2/N+ZlV9c0wrjUq33ygfKs6RAp8Ffgs8DlxYYlsMwzBSrWtfbhwL1ABtQKfnKv7aeAn+piagvLb1og7fWrpQ6Q8PX1g2tli6N6TJY9ueTydRiY60I52O8Oxh86DfoP1ffnj4wtjGli5cupxssXRlpztSalu6Sich0TNqlYKIfBMYCAwHXlHVBf7/Q4CmpqYmhgyxoVPDMApG4txulRiRPgI8gLsRfnuJbTEMw6g8R6qqW4D/Vmo7DMMwIirOkSaxc2f+ExIbhmEcjKFDhw4BdukhxkErboz0YIjIMcDmUtthGEavZKiqHjRK602OVICjgV2ltsUwjF7H+yMiNQzDKBWV+GaTYRhGWWGO1DAMIyPmSLtAROpEZJmIHJ0xnx+JyM9FZLyIvCgii/xYbtb8PiAib4nIsm7m82U/6csyEfloVtuC/E7NYpvP70si8qSI3CUik0RkhYjMLVB+E0WkVkTmZchvjIi0FvC8RvllOq8d8ovq798VyL4ovwkFsu+zIvJHETmnQPZF+Z1RgLp3tT/Wt0TkX9PaZ440QESqgEdVdaKq1mXI52RgpE9eAtwIbAFOL0B+VcBcVZ3YTfP+oKrnAVuBmVltC/KTjLahqn8EJgJn4xZenAaMEZFhBcivP/BjVb2ku/bhlil/kwKc1yC/rOc1Vn+ByVntC/JrKIB9g4ErVfVLwGcKYF/H/Jqy2qeqs/3+bwAfSGufOdLOHAl8WET+IUsmqr
oGiKKeY4E64F26mGilG/kNASaKyKe6mZeKyECfT2sBbOuY3+AstsH+i7cW+BswwttXBxxTgPz6AV8UkY91My+/ZCZbKcB5DfLLdF49HetvZvuC/Aph37nAeBFZUiD7OuY3ogD2ISLjgXW4401lnznSAFXdCpwDfEdExhTjJzJnoPoSMBW4XUQGdDOb24HriU+4nsW224HrVfXFrLap6j5gPHA4LkrLZF+Q35+By4FfdScv4KvAPV39TNb8CnFeO9ZfClN2HfPbntU+4ChgNvAH4FtZ7Qvy+2QB7AO4CHgw+N8h7TNH2gWq2gY0AkcUKMvNuGdcC/bSgKrmcFMJ9s93XxE502WhLxTCtiC/TLZFeOd3GPCOt28ULjrImt/xuHM7qJtZnQhch3PMU8h+XvfnJyLfLlDZday/metdx/wKYF8DUA20ADsLYF/H/KQQ5YcbBvoLeVwb5kgDRORiEXkG2AO8UqBs5wE/xDmDl7NmJiLfFZHlwDJV7XqRmkMzCbjAD8o/XgDb9ucnIjdktA0R+Z6I/AXYDdyJi9g2qGpjAfK7HHge+GV38lLVGX58dTVuYvFMZRfk178AZdex/l6f1b4gv/Oz2odzUOcD3wQmZLUvyG9fAewDGKWqu8jjurUH8g3DMDJiEalhGEZGzJEahmFkxBypYRhGRsyRGoZhZMQcqWEYRkbMkRqGYWTEHKlR1vhJRm72f98nIhNLa5FhdKbXrdlk9E5E5BO4B68fFZHLgLeAiap6s4gsxr3FUwvcDFyGe1B7GvA/cS8d5HAPqO/BTUZRBYzuYv863Cq1p+ImYDkG+L3PZyVwmqp+V0Q24iay+C7uAfBl3s5lWSbNMCoTi0iNSuGfgbsP8l0DbiamjlzptT/wgt//Qv+/Prg3nLrafwVwGnAGzrF+xP+9AjeBxzgR+TjQ7NOGYY7UqAg+hJvWbLdPzwR+3uH7qmD7M/z2EZOBh4FlPn0RsOQg+78InAmcoqrP46YunIB7rXQI7rXhq31+h/t9bheR34lI33wPzOgdmCM1KoHJwH91SN+Gm8Mzmo9yd7D9ObjueMRi4CwORKGn4rr7nfZX1bW4CUly/l+NPr9VuCkCHwc+AWwEBvptZgDvAd2ams+ofMz5CKxZAAAAz0lEQVSRGpXAfFXdfpDv7sDN1jMP53CPxk2p1uy/b8VNr/YgsMj/7y4OTIsW21/cqgibgKf890uBLarajpvLdDUHpn8bAOzATSH4IaCm+4doVDI2aYlR0XS8uSMidwE3q+pbWfYH/hX4uaq+LSKfx61pfrDxWcMwR2pUNiJyup8QGREZC7ylqi0Z9v8icIyqXi0in8ONh35eVcPhA8PYjzlSwzCMjNgYqWEYRkbMkRqGYWTEHKlhGEZGzJEahmFkxBypYRhGRsyRGoZhZOT/A2nNJe7oUm02AAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295ec013e80>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Take a subset of the samples for plotting\n",
"np.random.seed(seed=7)  # Set the random seed to get reproducible results\n",
"# Randomly select 70 samples without replacement\n",
"samples_index = np.random.choice(range(counts.shape[1]), size=70, replace=False)\n",
"counts_subset = counts[:, samples_index]\n",
"\n",
"# Customize the x-axis tick labels to make the plots easier to read\n",
"def reduce_xaxis_labels(ax, factor):\n",
"    \"\"\"Show only every `factor`-th label to prevent crowding on the x-axis.\n",
"\n",
"    For example, factor = 2 shows every second x-axis label,\n",
"    starting at the second one.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    ax : matplotlib axes to be adjusted\n",
"    factor : int, factor by which to reduce the number of x-axis labels\n",
"    \"\"\"\n",
"    plt.setp(ax.xaxis.get_ticklabels(), visible=False)\n",
"    for label in ax.xaxis.get_ticklabels()[factor-1::factor]:\n",
"        label.set_visible(True)\n",
"\n",
"# Box plot of expression counts by individual\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(counts_subset)\n",
"ax.set_xlabel(\"Individuals\")\n",
"ax.set_ylabel(\"Gene expression counts\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_07.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAUwAAAClCAYAAAA3UsShAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAE8xJREFUeJzt3XmwJWV5x/Hv47Aqw6rDJospwhoXDCEYIsxMxoQAQ1BAiSWrBBeYSDkVRAVqRAJUUkyoQEKMBggQamRIRMfIqGS4iFYwCIKlQAKmQIkOMGzDvj75o/swh8Od2293v28v5/4+Vbfu6XtPv+/T3W8//XafPm+buyMiIsXe0HYAIiJ9oYQpIhJICVNEJJASpohIICVMEZFArSZMy8w0M2szDhGREOu0XP9GwOrVq1e3HIaICIUdN52Si4gEUsIUEQmkhCkiEkgJU0QkkBKmiExrtnBZ8HuTJUwzO8vMLjCzHc3sPjObSFWXiEgTkiRMM9sJmJVPzgAudffZKeoSEWlKkoTp7vcCS/LJmcBsM9s/RV0iIk1Jfg3T3W8HDgYWm9kGqesTEUmlkQ993P1p4EVg/SbqExFJoTBhmtnRZrbEzD5uZt83szPKVGBmC8zsZmDC3Z+oHKmIJFXm0+LpKqSH+QngJOBUYA5wUEjB7j7h7qe4+4Xuvo+7n1Yjzt5QoxNpTtP7W0jC3BRYCmwGfCefljE3rol/XJdLmlGYMN19N3ef6+6bufscd9+1icBk7cZlpx+X5ZDpI+Qa5ifNbMLMrjezFWZ2UhOByfSjBBpPinXZhe3Tdgwhp+THuPtsd58H/AFwbIpA2l4RkxmNabIYy8bdxnI2UWcbdXSxzcQyLtts3IQkzCvM7EYzux64Abg8cUytGddkFkMfDgxdpXVRXdfWXcg1zIvcfX93n5f3NC9sIjCRvirayaskga4ljlS6vpwh1zBvyK9dPpr/XtFEYOOi6w1goC9xynjpW7sL6WHOcfe5wB35p+VzG4irtJAV37eNE2o6XduTyfXlclIX2madGIJ7mMBWTfYwddF7jb7EWWRclqMLtC7XLuW6Cephkg2eMbfNHqZ6Ud3S1QNaE+0kdplqy/0R0sP8B+BKYHk+/eXUQcWihihFihLsuFzq6evpcwwxlyPktqJ3AguAR81sFvCOaLWLiPRISMI8BTgHeB74a7KBOCSRrp7qigisE/Ceo9z9mOSRiIh0XEjCPMjMHhr+g7uflSgeERkjtnAZfv78tsOIJiRhHps6CBGRPihMmO5+YxOBiEi/jVtvcjKNPNNHRGQcFPYwzex2wIHHAAO8q1+PFBFJKeQa5hzgRGAWsNTdb04bkohIN4UkzMX57/XIxsb8b3c/OGFMIiKdFPKhz3FNBCIi0nVlr2ECoGuYIjId6RqmiEggXcMUEQkUkjC/NNyrNLPZ6cIREemukBvXF49M/2WKQEREui6kh3mpmd0AvJK//6q0IYmIdFNIwlyVP6YCADP7YMJ4REQ6K+SU/DNmtj6AmW0ALEwbkohIN4X0ML8IfM3M1gNeAs4LKdjMzgI2Bv4RuAJ4EDjI3b1irCIirQpJmCvIEuWG7n6tmW1aNIOZ7UR23+ZzwJHAmcDhwLuAH1cPV0SkPSGn5N8EdgU+l09fXTSDu98LLMkntwFWAr8Gtq0Qo4hIJ4SOh7kKWMfMjgBm1KhPp+Mi0lshCfNwsm/5XA3MBD5Qso5fAVsBW+evRUR6KeQa5nbAfmSn1g8C9wA3lahjCXA58BBwR9kARUS6IiRh/j1wDHA/WfJcCuxVNJO7TwAT+WTh+0VEui4kYW4CfD5/bcA2ZnYJgLsfnyowEZGuCUmY84ANhqYXpQlFRKTbQhLmuUOvBw9BU89SRKadkIS5K9nN55ZP69YgEZmWQhLm23j9abh6mCIy7YQkzLnAM0PThV+NFBEZRyE3rl/s7vcPfoC/SR2UiEgXhfQw7z
Wzy8juw9wBeCBpRCIiHRXyXPITzGwHsm/6PMTQ43ZFRKaTwlNyM/sWsDvwE7LH7V6ZOiipbvPTl7/mt8h0Fnt/CLmGeQowH7id7HvkH49S8xRGF3JckkCV5Sg7z2PPvoifP5/Hnn2xfIAdFqMNFJWRoo5xabujmmjLMcTeH0IS5meBDYHvA/uS4Js+25/1XWDNihxdyBgLXXdjxdjYVZZjdJ42Gl1REmgi0cRoA0XrMsa6rtt2Nz99ObZwWal6m2gTMbZH3fWbYjnLlhmSME8ge8TEt4FL8umoHn+ufKMa/h2i7saq0kBSbOAmGl3RzlE0PVkdRfWmOEgWKapjsv+nPvAO6pwqrrqJvkpSTpH4i8pMcdAsG8OokIT5r8DeZI+Z2J9stKJWhSxk2R00Rq+p7AaOUWfZDR7y/roNc7L5u9BTjiHFuilStl2F/j/lASlGHSk6KXXbXUjCfKO7nwdc6+5nk41e1Hl1E0mVpNxEneNiOi1rXXXXVV8PTlXUPZgUCUmYBwC4++CZPn9cqaaEBg3BFi7r9QXlFMZ1Z6lyaaGtOFLOH6KJdtrmtfUm9/vChOnur4xMdy47NHGK0Vd9SOpVNHFpIVYcKeePpe7loTY+mG1jvw+5D3OWmX3QzI7Of+Y3EZiINKcLl4e6cvCYSsgp+XXAFsCpZEO8fSFpRCIiHRWSMH/u7hcDvyBLnC+lDUmaNq7XOUViC7mG+cH85UfIkuahSSOSxvXhVCiEEr+kVjj4hpkdDRxI9gTIjwC7AV9MGdTPVh3CncfAzwAN8N59Xdleg8Q/uGG6SFfi7qPpuu5Chnf7BHAwcAuwC9kzyaMmzP965EjuPObIV1f+Hm/+xqsNf5w2RZVGVnaeNhpyE9srxnKNljEa93RNAiGK1l2VMrqgbEyhj9ldCmxG9vXI6COu773FEvzCD41dghxVpZGVnafo/TGSdhMNP8YOOqqojJA66h7A+rL+R5Vd/4MYId4BKkWHo+xyhSTMw9z9rqDoEknRu0gxf92G3UQdMZJ2jKRctBxdOMuYbDnqHsCaWP9dMIgRqHyAitEmYp9FhHxK/k9mtp2ZbT/4KV1LTXu8+Rvs/s/ZCoPBQhs/W3XIWucZfU+VMqaKIaSOyeYpW0fRe4rmKbucVeeZKsaQ95RdV00IiWl0XdVdd4P5y5RRNoYqdbQhRZuoW2ZIwnwb2b2Xg59FlWqKqMoOWfT/Kg297sqvu3OFqBJjiobaxLK2UWfsxD+Yv86BNrTtd+0A1Qchp+SXu/tnkkdSw2TXS8pq4wOAPpxaxdLGsvZh/bZxPbivdXQhjpAe5m7DE/kjKzolxRGzi6eIMn6aaGdV6ohxyaoNqeMI6WG+YmaLWPPUyK4erGUa6UqPZlz1oXfehpAe5geAG4Bnye7BrPRNHzNbaWYTZrZVlflFhnWlRyPTS0gP81jgCGBdd59nZguAC8tUYmYzgOvc/bjyIUpq6q1JH8X47KKskB7mn5F9NXLw3j+tUM9mwB5m9v4K80pi6q1JG3cx1I2hjU/7QxLmVcB/kCW85cBlZStx91VkT5w8ycx2KDu/SIgu7PR91YWDZhdiKFJ4Su7uF1LyFHwt5bxoZo+QfbXy/rrliYzSBxWSWsiI6zeY2Qozezj/vaJsJWZ2uJn9AHgB+EmVQEVE2hbSw5xjZhsBy919bpVK3P0a4Joq84qIdEXIeJiDW4rOTx+OiEh3hdxWtIj883oz2w/A3b+XMCYRkU4KSZgfBeYBS4DH878pYYrItBNyW9EZwIeBWcCWQOe+Sy4i0oTQb/o4cE8+fSDwo1QBiYh0VUjCnMh/O9lzyXWLm4hMSyEJ81Je/+0eXcMUkWknJGFuCGwLPEF2Kv61pBGJiHRUSMLcB1iX7CuN7wH+DZifMigRkS4KSZgPkA2csQ3Z1xq/nTQiEZGOCrmt6GvAHOBNwGxgccqAROT1bOEyNttw3bbDeI0uxpRaSMLcAlhBNsTbinw6uum48qV9fW
h3g+d7P3r2AS1HskYXY2pCSMI8CvhD4HPA+8huYo9q9TkHAq9d+X1oyLJGE9srdh193um1f8RTZl2GJMzDgLPd/UTgauDsGrEF6XND7oM+Jp6m2kQfEpH2j3qGt3HZdRmSMH8KLDWzrwB/AZxZKUoBinfI1Dts1Z2tjUSSos6pyuzTumlD2eW0hcs6t27qHmxCEuapZM/k+RNge+DLlWqSwo21tv9Xaah960GmqrNObyJEjDK7llQmU3Y5/fz5SdZ32+uqMGG6+xx3fy+wW/660iDCUk2Vhlrm/eMs1bqoewBLkcTbTiSh6sTZhbYd0sMcuDpZFA2p26gmm3+qnSFVTG33OGPV2YedfDTGugewrvZyYyjanpPF2fYlqrJCnumzV/7ybxPHUkrZHbLKxho22fx1d4YqDb3scnT1dLorO/lU2oqxi9f+isRqy3XLTC2kh/lXAO5+beJYphSS/Mqc6ozLp65tnXY2/YFMUzG0Lda1v5Dt17feXReEJMw3DJ4WOXiCZPKoRoQmPz9/fqcuvnfxCBki1cEmZk+4r+u2CUVnISFJuc3eddsH4qmEfOgzGzge+Dxw3Lh+6KMdMC2t3/b0Zd334fpuyFMjL81fPgC81czM3Y+NUnvH6ZREquhru+lj3LZwGUCtD17LCDkl3x04093PILtpfY8kkXRMX47K0i19bTd9jHuyS3GplyNkeLc/B84xs1nAg8CCJJFIq/rYuxBp2pQJ08y2B34NnD70Zz3TZ8z4+fOxhct61bsQaUNRD/NmYDlrHn42+H184rhERDqnKGHeDywCngcedvdXkkckUoMuLUhKRR/63E2WMM8FrjGzb5rZ+5JHJVJBHz+4kH6Zsofp7scNT5vZG4GvAt9NGZSISBeVGXwDd3/G3Us/MdLMtjSz/zSzm8xsZtn5RUS6oFTCrOEg4ErgemBeQ3WKiEQVch9mDNsAdwEvAts2VKeISFTmnv62SjM7nSxhbgGs6+5/l/99JrB69erVzJypM3URaZUVvaGpU/JfAVsBW+evRUR6p6lT8n8HrgVeAhY3VKeISFSNJEx3fxB4TxN1iYik0lQPc0pPPvlk2yGIyDS38cYbzwSe8ik+2GnkQ5+1Vm6ma5oi0iUbu/tae3BtJ0wj+zDoqdaCEBFZo7s9TBGRPmnqtiIRkd5TwhQRCTQ2CdPMVprZhJltFaGss8zsAjPb3cxuNbNv5ddbY5S5o5ndZ2YTNcr6UD6QyYSZvSNGjCNl7lw3xrzMI8zsRjO7zMzmmNltQw/Vi1HmbDO728yW1CkzL3cHM3s+1jYfKq/29h4qc9DGfy9iuxyUuU/EOA8ys6Vmtm/EOAdl7hmpbZ6cL/d9Zvap0DjHImGa2QzgOnef7e4ra5a1EzArnzyS7MFvDwLvilTmDODS/PHFVV3t7u8FVgGnxYhxpEyLECPuvhSYDewNnAgcBexgZltEKnN94Fx3P7JOnLlTgHuItM2HyouxvV/TxoEDYsQ4UubDkeLcCPiYux8B/FGkOIfLXB0jTne/KC/jf4AdQ+Mci4QJbAbsYWbvr1uQu98LDHos2wAryZ5rVHnQkJEyZwKzzWz/GuW5mW2Yl/V8pBiHy9yobozw6g55N/Aj4C15nCvJviIbo8z1gMPM7J0149w8f7mKCNt8pLza2zs33MajtMuRMmPF+fvA7ma2ImKcw2W+JVKcmNnuwP+SLXtQnGORMN19FbAvcJKZ7ZCqmiiFuN8OHAwsNrMNahS1mOzhdC8PF18ntkGZ7n4rEWJ095fJHtO8CVlP69V/RSrzO8AJwFeqlpf7MHDFZNXVLS/W9h5u48Rbl8NlPhYjTmBz4CLgauCjMeIcKfN3iRMnwCHA10f+NmWcY5EwAdz9ReARYNOIxSYZNMTdnyYb6m79KvOb2buzYvyWWDGOlFk7xoE8wa0DPJDHuSXZ0TxGmW8l2+ZvrFMe8BvAqWSJ+EDqr89XyzOzEyOuy+E2HqVdDpcZKc6HgTcBzw
FPRopzuEyLtT7JLu3cRIl9aCwSppkdbmY/AF4AfhKx6CXAF8h28jtiFGhmC8zsZmDC3Z+oWMwcYG5+4fv6SDG+WqaZnREhRszsFDO7CXgWuISs1/ULd38kUpknAD8ELq5aHoC7fzq/Dnon2QDXtdbnSHnrR1qXw2389LoxTlLmfjHiJEtA+5E9WXafGHGOlPlypDgBtnT3pyixn+vGdRGRQGPRwxQRaYISpohIICVMEZFASpgiIoGUMEVEAilhiogEUsKUxuSDZSzKX3/VzGa3G5FIOZ14po9ML2a2F9mNyNeZ2bHAfcBsd19kZsvJvslyN7AIOJbsxuWjgM+S3aj/NNnN2y+QDZwwA9hukvlXkj2tdGeyAUW2Bq7Ky/kx8HZ3X2BmvyQbdGEB2Q3RE3mcE3UHeZDxoh6mtOGTwOVr+d/DZCMGDftY/nt94JZ8/nn5395A9o2fyea/DXg7sCdZAv2t/PVtZANP7GZmvw08k0+LTEkJU5q2C9mQWs/m06cBFwz9f8bI+/fM3z9wAPBNYCKfPgRYsZb5bwXeDfymu/+QbIi9fci+TjmT7Gu0J+flbZLPs9jM/sXM1i27YDL+lDClaQcAXxqaPo9s7MjBuIfPjrx/X7LT6IHlwO+wple5M9lp+uvmd/efkw2o8XT+p0fy8n5KNoTd9cBewC+BDfP3fBp4Aqg1ZJyMJyVMado17v7YWv53IdkIMkvIEutWZEN6PZP//3my4b2+Dnwr/9tlrBmS6zXzWzb6/v8B38v/fwPwoLu/QjaW5p2sGYJsA+BxsiHudgHuqr6IMq40+IZ0xvCHLGZ2GbDI3e+rMz/wKeACd7/fzA4le+702q6fikxJCVM6w8zelQ+4i5ntCtzn7s/VmP8wYGt3P9nM5pNdrzzU3UdP+0WCKGGKiATSNUwRkUBKmCIigZQwRUQCKWGKiARSwhQRCaSEKSIS6P8BkLqiazqRmbQAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f10d6b70>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Box plot of log-transformed expression counts by individual\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(np.log(counts_subset + 1))\n",
"ax.set_xlabel(\"Individuals\")\n",
"ax.set_ylabel(\"Log gene expression counts\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_08.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVgAAAClCAYAAAAZF+UzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAGmRJREFUeJztnXuQHlWZh5/XABHCICGQBBCBFckFVEC8YFAnETUbSJabCtYiRFm8hZUNZUQFKoKrVCyy7Mqu66KGRVcjsIuEmKBCMhAoUIQAJUm4BxDMVXRCgCSGd//onuSbTk/6dE/3d5vfUzXV0/11v+fX5/L26bdPnzZ3RwghRPm8rtEChBCiXZGDFUKIipCDFUKIipCDFUKIipCDFUKIimhKB2sRHWZmjdYihBBF2aXRAvpgT6C7u7u70TqEEKIvMjuATdmDFUKIdkAOVgghKkIOVgghKkIOVgghAln30ia+vfgJ1r20KWh/OVghhAhkzn3PMWP+cubc91zQ/s06ikAIIZqOqe88qNcyC2vG2bTMrIN4mFZHR0ej5QghRBoapiWEKI+8MciBjhysECKYvDHIgY5isEKIYPLGIAc6mT1YM/ukmc01s8+a2V1mdkk9hLUburUS7cC+ew7mS+MPY989BzdaSm4a0QZDQgSfA74AzADGAydWqqhNaeVbK10cRDvQiDYYEiLYG7gBGAr8Kl4XOWnlW6ueignwpfGHNViNEMVoRBvUMK0BzrqXNjHnvueY+s6D+rztC9mnEbqEaDD9H6ZlZp83sy4zu83MFpnZF4JSNrvMzK4ys7Fmdr+ZLaid37Wv7e1Cq9xWX33308yYv5yr7366z32Scbd6nFujQiqtUm6iNQiJwZ7t7p3ufgLwQeCcrAPM7DBgeLx6BnApsBo4qma3vrY3PSGNsBEOopBz8MQygHqc29R3HsSsk8b0up3Le35p+2fZaOVYeRW08gWnGbSHxGB/ZGZ3AFvi/a/LOsDdnzCzucDJwAHAKuCPwIHA0ni3vrY3Hcnb1ZCYZCPiPUVipdOOP5Qhg3fJpbMe59bTa64l7/ml7Z9lo5Vj5UXICsW0cvy9KbS7eyV/QCdwFfB94B3AN4ETa35P3R7/1gH40y+s9VmLHve1G171RjJr0ePO9Hk+a9Hj7u6+dsOrddGVN5166WoUZeRHu+dRXpJ1O0mR/KpHHoekUQcd2X4wcwdYDCwC/hQvFwUZ3u5gLyMa2jUHOLrm99TtXuNgL5u/dKeFH0IZmbxiVbdPuuZeX7Gqu7I00siq/K1EMo/k6HZOEQfSLM6wHvU2LY0G1KlMP5gZg3X38e4+AXjI3SfE/+dhLvB1YATwkJmdH8doe21PO/CsY964QxwuL2XE1OYtW82C5WuYt2x1ZWmkxYvS4pDNSJGY9NV3xQ/X7ur74dpAJqRO7ZCnAQ8sk+R9cSCkrKeMHcGkMcOZMnZEsI68pLWNZoyfZ8ZgzWwx0SOQkWa2CCDEybp7F9AVrx5b89N3av6v3b4Dw1LicHmZMnYEXU+u71dhZ8Xl0n7Pim09unoD029ZxuzJYxk1oiM1XpQWh9wZ9RraVEpMumfcSInjR4qcf6OGg2WlGxIL3mGfAg8s8xJS1j0dks43D+NLI6oZZpnWNuoRP89dX0K6ucAewP4h+5bxRxwi6O5OvyXPQ6Nus7PitpOuudeZPs8nXXNv6u9lpFmW3aSNMmLSWccUsVmkrJu1foTQiLBLWhr3PLXOR19xu9/z1LqG6qjHMYly63+IwMz+E/gxcGu8fk1x/19/3nfIUEYPH8L7DhkafEwZwzuStzDJ25fZk8cyacxwZk8eC8D6jZvpenI96zduLi1NKOdWPKk9mU7IbWbePC1yu5e8NW2W29mQ8E+RcsqbR1UNW5p6/UOsWLORqddHkb4q5itIak8796zzKxJCyar7WYSMg307cD7wJzMbDrwtWF0TcNGCFaxYs5
GLFqwIPqaKWE6yIY8a0cEvzn03o+JbqOm3LGPB8jVMv2XZtmNKaRAZt+JFnFCRBpTM01mLn2DG/OXMWvxE6v5F4s8/Xfo8C5av4adLnwfSnVbyfLPi62WQVp92yMMC5ZQsl1mL4jxdFOXpo6s3cOL3f8Ojqzf0qSNvHUtzUjM/9BaG7Po6Zn7oLUE2ipDUntZxymy3iRBKPS7AIQ72AqKhVJuAbxNN/NIybN76Wq9lGnbhLb3W8zbuZEWGHQs7qyEne7RpNrJIcyjTxh3KrJPGMG3coanHhKRRhhNKVtSlz3f3WkLvcijixF/esrXXMs1pJZ1QPR4khqQx8fD9GD18CBMP32/bttr8SCunZLksfaG71zJ50U69w0k4zLS63IuUOO91D7zAxi2vcd0DL/R5fsk2lpdk/bn89idYsWYjl9++/QKdlc/Tjo/bwvFRWyhS9/O2yRAHe5a7n+3uE+Pl74Isl0SRgqk95vhD9um1DLG5756DmTF/+U4bd62NtN5nMuCeVfjJHi3kv3q+vHlrr2XIuYQ0/jIeHiQr6uUfOZzRw4dw+UcOL2wzyR67Duq1TLu4JJ1Qs0y/l+Ywakkrpx3CDKccyaQxw7n6lCMBOP+9B7PfkF05/70HA32ca8JhptXlWs48+kAmjRnOmUcfuG1bWuegbJL1Jy3NrLqePP+Qup/sKVcRIjjRzC6t/QuyXBF5He6MCYcx66QxzJhQ3ZscfRV2cpms3FnnktZz3Nkxewwe1GvZF3l7islzyezlpOhMVswlK19kxZqNLFn5YrCNrH2SPZQ0kk4oNJ3+EBL7y3JSIeU0bMhudL55GMOG7AbAlXc+zdqNW7jyzr7TTeZZlo6fPhiHYR58ftu2tM5BGXlaa6OKEQIheZq88OW9IIc42HOAOxJ/TUMZvdH+klbByiDkatnLwcQ9tb7CASE2QjjvxodZsHwN5934cPAxab2H2mUZJNNIc2whZZXMj7yOfof1gOFTWbrS4oVZY4uPPmCvXss0knk2akQHC5av6dNZvrxpa69lvUhe5IvUwSL0t3ce8qLBHcm/Qim1OFmNrEgjzPo979UyWQmLkqVzSxzP3lIT1857/kW15roY1GFcaAg9vcNkzzrPuXx5/jJmzF/Ol+f3DkPVXoCTMeieu7Yy79722G1Qr2WjSKuDWaRdpLLKIO2Ckwd99LDJqfr2tQjHH7pPr2WzMnFU/OBo1H7ZO+ckT7mEXkx2ZvPmR1b3WvbYq707S8ag09Ltb33qKwyT125/dYTWwawHhUXIoz1kHOyDZrY0ngt2cc/bXGLgMiN+g2ZGE86uVFv5sx4ctRI/PvOoXss0+uopZ5H3YtEMDwVPPXIko4cP4dQjRwYfE/rQuMxOTUgPdjzRvAFLga94/rkIRJtRViiianriZsn4WTPeFWQxcezIXss06lEuVc6xmqdcLvnlY6xYs5FLfvlY8DE/+O2zLFi+hh/89tki8goR4mBnA6OBkURzw86vVpIQ5dATNyv74eNAJjmOuFGM2m9Ir2UIaWOvqyZzshd3n1oPIUKI5ic5jrhR7BsPRetZhnD1KUcyelZXryF6VZM3BrtIMVghBi49zqmeTiqNIvHmRtzRhHwyZjxwHtE3tm5w93urlSSEaFaaJezSKs8BQhzs7Hi5G1EM9lF3P6lCTUII0RaEONjv1fZazayzOjlCCNE+hI4iqOWfqxAihBDtRkgPdk782ZjX4v1/Uq0kIYRoD0J6sOs8+vDhB939A8DavImY2TQz6zKzlWb2mZrtq+Lt4a9jCCFEixDiYL9sZoMBzOz1wIV5E3H3q929E3iM6K0wzGwQsNDdO919VV6bQgjR7ISECC4HbjKz3YC/AlcUScjMxgJPuftf4k1DgSPM7BR3v6mITSGEaGZCHOwiIse6u7v/3Mz2LpjWFODmnhV3X2dm44CFZvaAuz9T0K4QQjQlISGC+URzEXw1Xr++YFqdwJLaDe6+BVgPFHXaQgjRtIT0YA
HWAbuY2UeBojPtjnD3l8zsDOAZ4EDgn4CngGqnJRdCiAYQ4mBPB04m6rl2AKcWScjdj46Xc2s231jElhBCtAIhDvYg4P3AAcBq4HESt/pCCCF2JMTB/gdwNtFt/UHADcCxVYoSQoh2IMTBvgH4Wvy/AQeY2Q8B3P1TVQkTQohWJ8TBngC8vmZ9ZjVShBCivQhxsN+q+d8AV89VCCGyCXGwo4EziJwrNPwr80II0RqEONhD2TEsoB6sEEJkEOJgJwAv16zrrSshhAgg5FXZ77r7Mz1/wL9ULUoIIdqBkB7sE2Z2LdE42IOBP1SqSAgh2oRMB+vu55rZwURvcq0BXqxclRBCtAGZIQIzWwCMJZqQ5Tzgx1WLEkKIdiAkBnsBMBl4kGgegs9WqkgIIdqEEAf7FWB34C5gHHqTSwghggh5yHUu8AFgOPA8cHelioQQok0I6cH+L/Au4FIiR3tDpYqEEKJNCOnB7uHuV5jZXu7+DTO7rXJVQgjRBoT0YCcCuHvPN7n+tjo5QgjRPmQ6WHd/LbG+pUhCZrbKzLrMbGS8PsLM7jGzJWbWUcSmEEI0M5khAjMbTvRF2J45YV9091vyJGJmg4CF7j61ZvOJRGNq9yWac/amPDaFEKLZCQkRLASGATOIpiz8eoF0hgJHmNkpNdsOAFYBfyT6wqwQQrQVIQ72SXf/LvAskaP9a95E3H0d0RjaL8Sv3e6wS16bQgjR7ITEYD8W//v3RE725CIJxbHb9Wyf7vAFYCSwf/y/EEK0FSFzEXzSzOYCHyN6bfbTeRMxs9PN7G5gMzDWzI4DfkHktE8ANPRLCNF2hIyD/RxwEnAfMApYAlyeJxF3vxG4MeWn4/LYEUKIViIkBvsGore3hgK/RF80EEKIIEJ6sKe5+/LKlQghRJsR4mB/YGYfZ/tXZXH3Z6uTJIQQ7UHoV2W/Tu/PduurskIIkUGIg73O3b9cuRIhhGgzQh5yjaldiT8hI4QQIoMQB/uamc00s6lmNhO9ddUw9rn41l5LIVqBkHrbrnU7xMGeCiwGXiEaA1voTa4ivOmyXwM7Zn6ZhdBKBfviK1vwKyfz4iuFJjQDss+3jPyoKk8bUVZpafZVH/PoapV6V4bOZL1Ns1lG3U7SDOUS4mDPAS4CznX326njRw///GrvTA8pqLwOJK1gsxpQSAMrw0YWRXoGWXkYkh9509jZeebJj5Dyz0tWumnnktyWpauIQymjPpRRx0IcX950qrCZtk8Rp112HQtxsP8ATKrZ98xCKVVASOUvkulZDShrvSwbeStqkYZcJD/KsFEkP7JsltEIy+hJFbGZ90JY5FyKXPiqOt8qbOZNN+Rc++twQxzsT4DbiaYbvBW4Nshyk1DFrUe9yKu9XufarLe3VTTCelHkItYoR9YulNHDzSJkNq3vuPsEdx/h7hPd/b+C1Yi2pFUaYbNeCET/aZWyDZlNa7GZLTKztfFyUT2ECdFfWuVC0Eo0i2MrUraN0B7Sgx0PTAEejXuyE6qXJYRoRlr5otUI7SHf5OoZonVl9XKEEKJ9CHlVdibxywVm9n4Ad7+zQk1CCNEWhDjYTxN9dWAu8Od4mxysEEJkEDJM6xLgE8BwYASguQiEECKAkB7sOUQhgsfj9UnA76oSJIQQ7UKIg+2Kl040J2zuyV7iCbunAVuBD7v75nj7KmAFcIa7r8prVwghmpkQBzuHHd/eyhuDvd7df2ZmNwJvAp4ws0HAQnefmtPWgOWRdVNYdjY8AmhSs/qgPBf9IcTB7g4cCPyFKDRwU95E3N3NbHegA3g63jyU6PXbU9w91eZv15/BsrPPUOWOOWLfefiVk7ELb6ksN6pyKK3qqOqR52XRrHkcoqtZtfeXEAf7HmBXoq/JHgf8HzC5QFqzgYvdfSuAu68zs3HAQjN7wN2fSR7wrmFz8e98fFvlbuVCaFbtSV1VOZS8duuVX0XSyXtMPdKAHfO4UXUuq06l6WqVC1nePA1xsH8AxgEHAA
8Tfbo7F2Z2DFFH9r7a7e6+xczWEznvHRxskjIKoVENt4rKX6RnUMShZqVThQMpoqteTilLWxkXrTLqeqOcVla6Vekqoz5k2cyrPWSY1k3AeGAI0EnUE83LeGCCmXWZ2cVmdpyZnW5mdwObiRx3XThi33mM/e8oo6AnA41H1k3Ztk/atv6mU0RHGeeS3CdLV73OpYiOpN0yzq0eulqZkLIso72UYbNZ6kMtIT3YYcAi4DngIGBi3kTc/UrSX7W9Ma+tWtKuUHmvYmlXpKxeTRW94DQdedNpltusZgkz1It63FnV644nSci5VVEujQp3lJ1OSA/2LODDwFeBDxG9dNAUpF1d6nEVK6P3WSTddqaqPGxUOnl15L1LKEKajaSOZsmfJPVqC2WnE+JgTwO+4e7nAdcD3ygl5TZiIDlCqKYRtmoDaqSOKkJZzZI/9aAeF5MQB/t74AYz+z7wJeDSytSIlqBVGmGz9sbKolXKoVmpR/6FONgZRGNW/47oJYFrKlMjRInIAYlGk/mQK55wGzPb193XVS9JCCF2TrOOK08SMoqgh+sBfc1AiAFMszi2IiMXGqE95Jtcx8b//lvFWoQQTU4rh10aoT0kBjsLwN1/XrEWIYRoK0JCBK+r+ZKsEb3yqlCBEEJkEPKQq9PMDgH2B/7o7isr1iSEEG1ByFdl58T//gF4o5mZu59TqSohhGgDQkIEY4HT3f05MzuIaLpCIYQQGYQ42H8Evmlmw4HVwPnVShJCiPZgpw7WzN4E/BG4uGZz847qFUKIJiKrB3svcCvbP3bYs/xUxbqEEKLlyXKwzwAzgU3AWnd/rXJFQgjRJmQ52BVEDhZgbzPbDfhXd/91paqEEIWwC29h6O67NlqGiNmpg01+UtvM9gB+BsjBNgg1oPrTKnne827+n76R+6MjldMqeVg2Ia/KbsPdX3b33F+UNbMRZnaPmS0xs46+tqUemyiYViqosrX6lVHWl92AysjjkGOqKLuqtfeV53nTKePcqyqXKqhNNy0P03Q1a9vuTx3L5WD7wYnAj4HbgBN2sq0X3d+cBGwvmKIFlXe9DBshWovoSNLfc0nqDHUoeRtQFfkRoj0r/6o6Jit/0ihiI+/FoR7tJet8036vqr0UcY5F2kefuHvlf0TDvE4DzgOm9bWtZv8OwLu7u53p87yWrPWQfWRj4Npg+jxn+jwf+rWF0jHAbaSVQU4bmb7P3Ksf1mpmFwPLib5Qu6u7/3vatpr9O4Du7u5uOjr6jB4IIUQjsawd6hUieAEYSTRhzAs72SaEEG1Dni8a9IdfAD8H/go8ZWbHJbbNrpMOIYSoG3UJEeRFIQIhRAuQGSKoVw+2EBs2bGi0BCGESGWvvfbqAF7ynfRSm7UHq7isEKIV2Mvd++wJNquDNaIHYC81WosQQuyE1uvBCiFEO1CvYVpCCDHgkIMVQoiKaHsHa2arzKzLzEaWZO8yM7vKzMaa2f1mtiCOGZdh8xAzW2lmXf209/F4Ep0uM3tbGToTNg8vQ2ds96NmdoeZXWtm483sgZoPbZZhs9PMVpjZ3BK0Hmxmm0ou+x6bpZR9jd2eev/eErX22HxPieV/opndYGbjytKZsHt0SW1qWnzuK83si6Fa29rBmtkgYKG7d7r7qhLsHQYMj1fPAC4l+k7ZUSXZHATMcffOfsgEuN7d3wesAy4qQ2fCppWkE3e/AegE3kU0L8VZwMFmNqwkm4OBb7n7Gf3VClwAPE5JZZ+wWVbZ96r3wETKqae1NteWodXM9gQ+4+4fBT5Shs4Uu91laHX3q2MbjwGHhGptawcLDAWOMLNTyjDm7k8APT2hA4BVRN8sO7Akmx1Ap5l9oJ863cx2j+1tKklnrc09y9AJ2xruCuB3wH6x1lVEr1CXYXM34DQze3s/de4T/7uOkso+YbOUso+prfelaE3YLEvr8cBYM1tUos6k3f0or66OBZ4iOv8grW3tYN19HTAO+IKZHVxlUqUYcX8QOAmYbW
av76e52UQzlm2tTaIMm+5+PyXpdPetRJ+GfwNRL27bTyXZ/BVwLvD9fsgE+ATwo7TkyrBZZtnX1nvKy9Namy9SjtZ9gKuB64FPl6Ezxe67Ka9NTQFuTmzbqda2drAA7r4FWA/sXbLpSiarcfeNwBaiW9tCmNkxkSm/j5J0JmyWorOH2CHuAvwh1jqCqIdQhs03EpX/Hv2U+TfADCLHPYlyyn6bTTM7r+Q8ra33pdTTWpslaV0LDAFeBTaUpTNh10rM105gCTnaVFs7WDM73czuBjYDD5dsfi7wdSJn8FAZBs3sfDO7F+hy97/0w9R4YEIc2L+NcnRus2lml5SkEzO7wMyWAK8APyTq0T3r7utLsnku8Bvgu/3R6e7T4zjuMqIJ4vudpwmbg0vM09p6f3EZWhM231+S1iXA+4m+Uv2eMnSm2N1aVr4CI9z9JXK0fb1oIIQQFdHWPVghhGgkcrBCCFERcrBCCFERcrBCCFERcrBCCFERcrBCCFERcrCiqYgnZ5kZ//8zM+tsrCIhitPU3+QSAxczO5ZosPhCMzsHWAl0uvtMM7uV6A2lFcBM4ByiweVnAV8herliI9Eg+81Ek3MMAg5KOX4V0deNDyeaxGZ/4CexnaXAW939fDN7jmhij/OJBq13xTq7ypigRbQn6sGKZuXzwHV9/LaWaEarWj4TLwcD98XHnxBvex3RG11pxz8AvBU4msjhHhn//wDRBCdjzOwdwMvxuhDByMGKZmQU0bRwr8TrFwFX1fw+KLH/0fH+PUwE5gNd8foUYFEfx98PHAO8xd1/QzR15HuIXq/tIHrFelps7w3xMbPN7H/MbNe8JyYGFnKwohmZCHyvZv0KonlTe+b6fCWx/zii2/oebgXeyfZe6+FEYYMdjnf3J4kmcNkYb1of2/s90bSMtwHHAs8Bu8f7TAf+AvRrCkTR/sjBimbkRnd/sY/fvkM0q9FcIkc8kmhaupfj3zcRTVF3M7Ag3nYt26eV63W8RV+6eB64M/59MbDa3V8jmkt2Gdun0ns98GeiaRtHAcuLn6IYCGiyF9FS1D5UMrNrgZnuvrI/xwNfBK5y92fM7GSib933Ff8VIhg5WNFSmNlR8eTUmNloYKW7v9qP408D9nf3aWY2mSjeerK7J8MQQuRGDlYIISpCMVghhKgIOVghhKgIOVghhKgIOVghhKgIOVghhKgIOVghhKiI/weOoKeSAo1Z3wAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f2f8fef0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Normalize by library size\n",
"# Divide the expression counts by the total counts\n",
"# for that individual\n",
"# Multiply by 1 million to get things back on a similar scale\n",
"counts_lib_norm = counts / total_counts * 1000000\n",
"# Notice how we just used broadcasting twice there: once to divide\n",
"# each column by its total, and once to multiply by the scalar\n",
"counts_subset_lib_norm = counts_lib_norm[:, samples_index]\n",
"\n",
"# Box plot of log-transformed, library-normalized expression counts by individual\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(np.log(counts_subset_lib_norm + 1))\n",
"ax.set_xlabel(\"Individuals\")\n",
"ax.set_ylabel(\"Log gene expression counts\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_09.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"import itertools as it\n",
"from collections import defaultdict\n",
"\n",
"def class_boxplot(data, classes, colors=None, **kwargs):\n",
"    \"\"\"Make a boxplot with boxes colored according to the class they belong to.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    data : list of array-like of float\n",
"        The input data. One boxplot will be generated for each element\n",
"        in `data`.\n",
"    classes : list of string, same length as `data`\n",
"        The class each distribution in `data` belongs to.\n",
"    colors : list of matplotlib colors, optional\n",
"        Colors to cycle through, one per class. Defaults to the current\n",
"        matplotlib color cycle.\n",
"\n",
"    Other parameters\n",
"    ----------------\n",
"    kwargs : dict\n",
"        Keyword arguments to pass on to `plt.boxplot`.\n",
"    \"\"\"\n",
"    all_classes = sorted(set(classes))\n",
"    # Fall back to the default color cycle only if no colors were given\n",
"    if colors is None:\n",
"        colors = plt.rcParams['axes.prop_cycle'].by_key()['color']\n",
"    class2color = dict(zip(all_classes, it.cycle(colors)))\n",
"\n",
"    # Map classes to data vectors;\n",
"    # other classes get an empty list at that position for offset\n",
"    class2data = defaultdict(list)\n",
"    for distrib, cls in zip(data, classes):\n",
"        for c in all_classes:\n",
"            class2data[c].append([])\n",
"        class2data[cls][-1] = distrib\n",
"\n",
"    # Then draw each boxplot in turn with the appropriate color\n",
"    fig, ax = plt.subplots()\n",
"    lines = []\n",
"    for cls in all_classes:\n",
"        # Set the color for all elements of the boxplot\n",
"        for key in ['boxprops', 'whiskerprops', 'flierprops']:\n",
"            kwargs.setdefault(key, {}).update(color=class2color[cls])\n",
"        # Draw the boxplot\n",
"        box = ax.boxplot(class2data[cls], **kwargs)\n",
"        lines.append(box['whiskers'][0])\n",
"    ax.legend(lines, all_classes)\n",
"    return ax"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADQCAYAAABV2umIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAHitJREFUeJzt3Xl8VOW9x/HPD4HALWFHWRRIAhYSK41FQFrLUgQFpagVrNIKXsC22pYCLlcoDYtIVWhVbmmLAi5QXIoIFRRaDJRKFasCAoELglhlaVgMmxDCc/84kzEBAmcmnMzC9/165TWZmXPO85slvzznPOf8HnPOISIi0asU6wBERBKdEqmISDkpkYqIlJMSqYhIOSmRioiUkxKpiEg5xTSRmifVzCyWcYiIlEfloDZsZmOBms65oWZ2FTDbOZd20mI1gIKCgoKgwhAROVfK7PAF0iM1sxbAhSUeGgTsCKItEZFYCySROuc2A3MAzCwT+D/gWBBtiYjEWkUcI70DmF4B7YiIxERFJNLGwBNAppndWAHtiYhUqMAGm4o5534AYGa5zrlXgm5PRKSiWSyrP5lZKqFR+9TU1JjFISLiQ8WO2ouInE/Oi0RqM0bEOgQRSWJnTaRm9qCZLTaziWb2jpn9oSICE5HE98UXX5x1mWPHjnHixIkKiCY4fnqk1wE9gN7OuXZAZrAhSbLTHkLye+ONN2jTpg1jxowpc5lDhw4xePBgsrKy+PTTTyswunPPz6j9ceBveJfGLwWKgg1JRBLZ9u3bGTZsGIsXL6ZJkyZlLjdixAhatGjBtGnTKjC6YJw1kTrnuphZVaAesMc5pyuURKRMr776KoMGDTpjEgVYsmQJmzZtqqCoguXnGOkEYD7wCDDfzB4OPCo5Le0Sx5fc3FxuvfVWADp37kxeXh4FBQV07dqVFi1asGjRIgAGDRpEWloaPXr0oLhAT7NmzVi+fDkAaWlpLFu27LRtDBgwgMaNG9O4cWMGDBgAwLPPPkt6ejp9+vShsLAwvOyaNWtIT0/n0ksvZdasWQDMmjWLZs2a0bt3b5xz/PrXvyY9PZ2bbrqJEydOMGPGDLp27QrAzJkz6dKlCwDbtm2jbdu2ZGVlsWXLFnbv3k3Lli1p2bIljz76KMeOHSM9PR2A5557jocf/jItbNy4kd/97ndkZmby+uuvh98fgGnTppGTk0N+fj67d+8mKyuLvn37hl9H8eutXr06eXl5zJw5k5kzZ7Jnzx7atGkTfp2ZmZlkZ2ezfft2HnroIZo0aUJqamp4mUGDBpGRkcGUKVNKfT79+/dn8eLF5OTkkJubS15eHt26dYv8wz+Jn2OkXfGOj/4A+C7wnXK3KpKknn/+ea666iqWLl3KyJEjAdi8eTOLFi2iXbt2PPPMMwCkpKSwYMECVq9ezZEjR9i3b1+Z25w+fTrTp3tXWTvnGDNmDO+99x716tXjtddeCy+3d+9e2rVrxzvvvMPo0aMpKiri9ttv5+OPP+bo0aOsXr2aESNGsGnTJj744AN2797N559/zsaNG9m/fz8LFy6kuKLlk08+ydixYxk6dCjPPPMMhw8fpl69eqxbt47p06dz4MABatWqxf79+3nvvfdo27ZtOI5Dhw4xbtw4Xn75ZR588MHw44WFhTz66KPhZZo2bcratWsBwv90ioqKeOaZZ2jfvn2p92DSpEkcOXIEgPHjx/P4448zbNgwJk+ezMiRI5k1axa9evVi9erVrF27loKCAj788EMmT54c3sa//vUv8vPz6d69e/ixiRMncvz4cV+f7Zn4OUY6EVhgZhfgHS+dWO5WRQJgwxec8226STec8fmFCxfSqlUrtm/fDsC6devo3r07TZs2JT8/n5IXvHTs2JFXXvEu7rvkkkvIy8vj9ddf5/rrr+fAgQO+4snPz6dmzZrUrl2bdu3a8eGHH9KnT59Sy9SuXZv69euza9cuFi9ezM
SJE9mxYwf79+/no48+okOHDrRt25aGDRtSUFBAt27deOONNygsLAwn0ry8PObNmwfADTd8+R5UrVqVyy+/nI0bN9K+fXvef/99Pvjgg1KDSpUrVyY1NZXMzEx27doVfnzWrFl06tQpvEz16tWpXLkyHTt2ZOPGjYCXYGvVqlXq9RQUFLBp0yYaN24cfo/btWvHzp07w/+YSsrLy2Pp0qVkZ2eHky/Afffdx+LFi8P3t2zZQrVq1Xy972fjJ5G+BwwpcT92l0KJnMHZkl4QevbsyZw5c8K7rgBl1SmvVKkSRUVFHD58mKKiIi6++GIWLVpE//79OXz4sO82/dRBL25r9OjRrF69mh//+McAtGzZks8++4zevXuzb98+Dh48yLXXXssjjzzCrbfeyty5cwGv5zt//nyysrIAb1f/5G1fffXVrFixgsLCQmrWrBl+vk6dOnz++efhZYu9/PLLDB06lBUrVlC7du3wYQ4zC7+mrVu3nnJsdfr06UycOJGJE7/sw53pPXDOMXDgwHDvt1h2djZLliwJv6YpU6Ywe/bs8HtTHn527ccAOcBK4Feh30XkNDIzM1m1ahX//ve/qVu3bqk/+FWrVtGyZctwL+k73/kOqampVK5c2df5lgD169fn888/p6CggHfffTecFEo6ePAgn376KQ0bNsQ5R0pKSvi5w4cPk5KSQkpKClu3buXIkSM0atSIo0eP0rVr13AcrVu3ZsmSJYB3nmexoqIiVq9eTUZGBp07d2bq1Kl07NixVPtXXHEFK1eu5JNPPqFRo0bhxzt27Ejlyl7f7Stf+QpVqlRh586drFmzhtatW7N+/XqOHj16SiI9cuQIPXr0OOU9Luv1t27dmjfffJOioqJSsefk5DB16lT27NkDQIMGDWjduvVZ3nF/zppInXMDnXMDgY3OuTudc3eek5ZFklD//v1Zvnw5nTp14qGHHgo/fs011zB37lwGDhwYTqS9evVi6tSpgLdLW1BQQN++fc+4fTNj1KhRtGnThp07d9KrV69Sz7/22mtkZWXx85//nCpVqnDvvfeSmZnJP/7xD2rWrMmvfvUrMjIyqFSpEm3atAnH8uqrr5Kdnc2hQ4cAGDp0KM899xwZGRm8/fbbAKxevZr09HR69OhB48aNadKkCRdeeCHXXXddqRhuvPFG1qxZQ6dOncjJyQG8Xfk77yydOnJycvjGN77Bf/7zH7p3706/fv1KDVoVGzx4cKl/SCNHjuSee+7hkUce4Re/+MUpy3/ta1+jQ4cOpKenc//994cfr1GjBkOGDAkPQA0ZMuSUdaPmnDvjD/AmsLTE7dKzreP3B0gFXEFBgQsS04cHuv2KoteRmDp16uQ2bNgQeDtvvvmm69evXyDb3rp1q2vfvn2px44dO+bat2/vjh8/HkibcajMXOb3PNKvAqnOuXfPXQoXkUS1detWunXrxi9/+UsuuOCCWIcTc2dNpGb2CvAx8G3gCjOb75zrHXhkIkkiNze3Qtrp3LlzqUGvc6l58+b885//DN9PS0tjy5YtgbSViPwMNjUAXgK+MLMr8a5wEhGRED+nP/0Qb96ltcANwG1+Nlw8HTOwDegLfOqcuyW6MEVE4pefRPqIc+57kWy0xHTMXwCPO+d+a2arzKyKc67wLKuLiCQUP4n0m2ZWahZQd5ZToJxzm81sDtDHOefM7CLgEyVREUlGfhJph/I0YN4JYFOAU0/4EhFJAn5Of/q4nG3cCKw6B9sREYlLFTFnUxfgNjPLDR07FRFJKn7qkW4xs91mtrT4x8+GnXO5zrmhzrmfOue+7pzr7JzbXP6QReJDUVER3//+92natClvv/12uOZlsZycHH7/+9+fUsuz5HPFStY2nTZtGi1btuT2228v1V7xOk899RQPPvggx48f55ZbbiEtLS1cZg+8yzGbN29Ohw7eUbmTa4GWbKthw4bAqfVHAYYNG0azZs0YN24cLVq0oEqVKmRkZPD666/zwx/+kPT0dF
q3bs3atWs5ePAgrVu3Ji0tjTVr1pzLtzkh+OmRfhUYDqwH/og3f5PIee+vf/0rRUVFbN++nSuvvLLM5U6u5VlcNON0Tpw4wRNPPEFeXh7btm075aT3gwcPMmXKFO6//34WL15M1apVWbt2LRMmTKCoqIiioiIuuuiiUy4CKFkL9HSVk06uP7px40beeusttm3bxsiRI9m8eTNNmjRh3bp1XHvttWzfvp2FCxdy0003sXz5cmrUqMGGDRuYMGFCqaR+vvAz2FRyQpX/Bm4Bbg4mHJHoBTGDgBv4WJnPrV27Nlz5qLhcXM+ePcnKyuJPf/rTKcuXrOUJMHbsWJ5++mlmzJgRXmbPnj1s2bKFrKwsDhw4QH5+PhkZGeHnf/Ob3zBy5Ehq1aoVrstZo0YNGjRowI4dO6hZs+Yp9TyhdC3QBg0a8PHHpYcsTq4/unbtWjp06FCqxN3JrrnmGvbv38+mTZvYtm0bt956Kzt27AhX2T+f+BlsGlgRgYiU15mSXhBOnDhxSpJZuHAh06ZN4/nnnz/tOsW1PAFGjx5NkyZNGD9+PD/60Y8Ar4hQVlYWq1atOu362dnZLF68mJ/85CfAqb3Ljz766LRzJZWsBZqZmUlaWhqtW7dm79694XZL1h998cUXz1r3dMmSJSxbtozHHnuMCy64gDvuuIO0tDTmzJlzxvWSUWDHSEWSXatWrXjrrbcASlXCr169eqm5lIqVrOVZ1rLFle137txZqpZmseuvv57jx4+zZMmScF3OQ4cOsXv3bho2bMiLL7542h5hyVqgALNnz2bDhg3UrVsXOLX+aKtWrVi5cmXJSm2nlZKSwtGjRykqKjpn1eYTkY6RikTp2muvpbCwkBYtWrBs2TIuvPBCevTowZIlS+jXr1+pZU+u5Vm3bl3Gjh3L3XffzdChQ8PLVapUifHjx9O+ffsyd5HHjBnDhAkT6N69OwcPHuSyyy7jgQceYNmyZSxZsiTcuy12ulqgJzu5/ujll19OdnY2GRkZzJ49+5TlmzVrRvfu3Rk3bhxDhgxh8ODBTJo0icGDB3PxxRf7fQuThp3pvw2Amc3Am17EgIuBAufcOTlGamapQEFBQQGpqannYpOnb2fGiArf7QuCXkdiKj5+WLJ6kiSkMo91+Blsmu+ceyW8JbN+Z1pYROR842fX/n4zSwEws2rAsGBDEkkuJ9fylOTjp0c6FngllEwLgVMnVREROY/5SaTNnHM9i++Y2c8CjEdEJOH42bUfYGbNzNMc+EGwIYmIJBY/PdIf481l3wjYDfw0yIBERBKNn0S6A/gnUMU5NyU0o6iIiIT42bV/GdjJl3M1TQkuHBGRxOMnke4BLgNqm9m9wKFgQxIRSSx+du1vxJvTfgveMdLJfjZcYhbRPwLPAbuAXu5sl1KJiCQYP4m0K9619lWBIrx57l840wonzSJ6KzAa+B7wdeD9csQrIhJ3/OzajwFuds51BXrjJdUzClXCL66l1RjvGOsO4NT6XiIiCc5Pj7QWsCBUm9CAFsWl9ELJNRLarReRpOOnsHNWOdv4DGiIdx7qZ+XclkjM5R88yoxVnzDwykuoXyMl1uGc9+Lh8zhrIj2pkLMBLsKe6BzgWbyBqtWRhScSf2as+oT7/rIBgHu7aGLcWIuHz8PPrn00u/A453KB3NDdtpGuLxKvemdeRO6WPfTOvCjWoQjx8Xn4GWzKLjnNiKYakfPd/PW7WLhhN/PX74p1KEJ8fB5+eqTZzrltxXfMrFVw4YjEv4FXXlLqVmIrHj4PPz3Sk6dD/EMQgYgkivo1Uri3SwsNNEmYnx7pUjN7A+9c0MbAymBDEhHxL1EGm36NVxm/HrAPSAs0IhGRCCTKrv0c4E68S0SfBu4LNCKROJd/8CiPvrmZ/INHYx1KudnwBbEOodzi4VCLn0T6GHA58DawHJgRaEQicW7Kiq3c95cNTFmxNdahSJzwk0g741Vumop3hVLnAOMRiX920q2c9/wcI30cuA
mvmtOnwLxAIxKJc/d8M42vVK2s05/iiA1fgJt0Q8za99MjfQk4DPTFKzoyN9CIROJcPByTk/jip0e62zk3x8wudM49b2a9A49KRCSBnLVH6py7PXT7ROi2b9BBicS7ZBjtlnPHz669iIicgZ8yellAL6Ba6KF859zvAo1KRCSB+OmRzgLW4k0zsgwYEGRAIiKJxk8iXe6cWwR8BNwMHAk2JBGRxOJnqpGfhX7tj3eF0/pIGwkdHngK7xTmns65vZFuQ0QkXvk5Rvo/QBfgPbypmd8H7oqwna7Aw8C3gSuAv0a4vohI3PJzHmlPvAS4zjmXaWYromhnAZATai+a9UVE4pafY6SFwN8AC00zcjyKdprg1TOtCtSPYn0Rkbjlp0c6NjSRXXn0Bebj7db3wCvHd16rO+uX7DsW+bidzRjhe9k6Vauz9/ZxEbchIpHxk0gnm9nQkg8455ZH2M5c4EngC6BPhOsmpX3HjuAGPhZoG5EkXRGJnp9EWgevdF5x0TCHV5fUN+fcMrwRfxGRpOMnkf7ROfdw8R0zqxVgPCIiCcfPYFOPk+6/EkQgIiKJyk+PdLOZzQQ+BpoB/w40orPQIE180ech4u/KpkFm1gxoBPzHObcl+LDKpkGa+KLPQ8THrr2ZjQb+CDzlnNtiZuODD0tEJHH4OUZ6Hd7VTbvNrApwTbAhiYgkFj+J9NfAQqAe8GdAB6tERErwM9j0HjAE7/xRC92KiEiIn0Q6JnT7Hb6s2nRnMOGIiCQeP4k0B0gFWjrnlEBFRE7iN5EeAe4LNhQRkcTkJ5Hm4h0XbWFmLQCcc88GGZSISCLxM2qfAYwCLg04FhGRhOQnkf4NGAZkAv2Aw4FGJCKSYPzs2jfH27V/NXT/vwKL5jyyLncS63MnBdsGQMCXb4qIv0Ra8vxRO8uyZTKzXsAAYIhzbl+020kWWZ2HV8g16jrpVyR4fnbtH+DLk/CjSqZmVgO4yzl3i5KoiCQbP4m0MXA7cD3QEG/upUh9C8g0s6Vm1jCK9UVE4pafMnq1zawyUBu4Cq+wc+cI26kLTMGbs6kf8HiE64uIxK2zJtLQ1CLfxeuZ7gB+EEU7/wHSgP1AtSjWFxGJW3527efiHRddFbofzcn4fwe+jXeN/rwo1hcRiVt+Ru0NOAEUhW4jHmxyzn3BqXM/SRLQaVwi/hLpzcBNQEdgJ5qXXkrQaVwi/hLpWGC8c26XmV0HvAh0DzassqkHJCLxxk8i/T3wWzOrDfwL+F6wIZ2ZekAiEm/8JNIpodu6eCPu84CugUUkIpJg/JxH2qUiAhERSVR+Tn8SEZEz8DOvfSMzu8vM7gnd/2rwYYmIJA4/PdKXgV3AbaH7U86wrIjIecdPIt0DZAG1zexe4FCwIYmIJBY/o/Y34l3e+RFezzTYkzhFRBKMn0T6FF/WIzWgP5rXXkQkzE8ifQ64C9gIzMabmllEREL8HCO9GlgPXA4sB+4NNCIRkQTjp0c6A2+XfmYE64iInDf89EgfA2oC24GewPRAIxIRSTB+EundwBBgBVAI6JJREZES/CTSF4HLgGN4k+D9NZqGzKyZmR2NZl0RkXjm53jnk865ucV3zKxflG0NBf4vynVFROKWnx7pfWaWAmBm1YBhkTZiZnVDv+ZHuq6ISLzz0yMdB7xiZlWB48DEKNq5De981Owo1hURn+qOep19RwojWseGL4ho+TrVq7B3/LURrZPs/CTSpXgJtLpzbl6oUn6k0oFvAZlmNsQ598cotiEiZ7HvSCFu0g2BthFp4o1Gov1D8JNI/wLMxxtomkcUczY554YBmFmukqjEo0T7w012ifYPwe/J9flAZTO7Bbgg2sacc52jXVckSIn2hyvxxc9g0/eAqsBLQCpeNSgREQnxk0h/BjQFqgCX4J3GJCIiIX527XNDt4/gFSyxwKIREUlAfhJpZ7xd+l3OueXBhiMiknj89kiPAO8HG4qISGLyk0hHERqpNzMDnHOua6BRiYgkED
+J9G7gv4EDwAzn3KfBhiQiklj8jNr/D3Ah0BHYYmZ/CTYkEZHEctYeqXNuYEUEIiKSqM6aSM3scbxjpKlAdaCac6530IGJiCQKP7v2a/GubNqLd539bYFGJCKSYPwk0nnASqAa0AdoHGhEIiIJxs+o/XvAHmAN3pxNDwB3BhmUiERnXX5v1t8RcBsAuGAbSTB+BpuaAphZJaAGkBJ0UCISnaz68yukipXSaGm+yuiZ2Ri8058c8Afgz0EGJSLnt0TrWfsZtf8RsN059yszqwLMNLMPnHNb/DYSmjDvHqAI6O6cOxZ1xCKS9BKtZ11mIjWz8UAT4Apgm5l9M/TUpcAUM9vhnPN7rPRF59wLZvYyXkm+zeUJOlnYjBGBbr9O1eqBbl9EPGUmUufcKAAz+wbwMN5gUx0gDfiec+6Q30acc87MquOdi7q1XBEnCTfwsYjXsRkjolpPRILlZ7DpX2Z2O3A1kAeMd85FNrmNZzIwyjlXFMW6IoFKtGNyEl/8HCOdwZefvgF9ifD0JzO7Aq9juiriCE+3Pe0SyzmWaMfkJL74GbV/DrgL2AjMxqtNGqkuQFczy8Xrla6IYhuAdolFJP74SaRXA+uBbOBHeJeJ3hNJI865ScCkiKOThKA9BDnf+UmkM/F27Weg+ZrkJNpDEPGXSC8FhuMVLikCngY+DjIoEZFE4ieRjgGucc4dCp3CtAyYE2xYIiKJw08irQUs8KZrwoAWZrYUQHM3iYj4O480qyICERFJVH7OI+2PV8y5Wuihz5xz/QONSkQkgfgp7DwcuBmoFNqVbx5oRCIiCcZPIp3inDsC5JrZSuDDgGMSEUkofgabDMA5lwPkBBmMiEgi8pNI7zazTSUfcM4tDygeEZGE4yeRzgM6h343vKuclEhFREL8HCN9HO9KpkJgG/BEkAGJiCQaP4m0eH6md0K3rwQUi4hIQvKTSA04gXed/QlUuEREpBQ/ifRmvIIlVwFVgD6BRiQikmD8DDb9lC97oZcAPwPGRtKImV2EN2h1HOjpnDsQyfoi4p8NXxDo9utUrxLo9hORn0TaBxha4n40u/a9gOeB+kA3dJxV4lAyJKBIp0ux4QsCn2IlWon0efhJpNXxTsQ3YAPRVbpvHFq3EG+KZ5G4kkwJKBkk2udhzvmfjsvMLgcmO+e6RdSI2Si8RFoPqOKc+9/Q46lAQUFBAampqZFsUkSkopW5N+5nsCnMObcm0iQa8hnQEGgU+l1EJGn42bU/F17jy8GmyRXUpohIhaiQROqc24V3+pSISNKJaNdeREROVVG79md04IBOKxWR+FazZs1U4KA7zQh9RKP255qZafBJRBJJzdNdUBTrRGp4o/kHYxaEiIh/8dcjFRFJBhpsEhEpJyVSEZFyUiIVESmnpE+kZjbWzH4b6zjKw8z6mdnfzSzXzKrGOp5omdktZrbMzGbGOpby0vcqfsTD9yqpE6mZtQAujHUc58CLzrmrgXygaayDiZZz7iW8iRTbmVnCFrXU9yq+xMP3KqkTqXNuMzAn1nGUl3POmVl1IBXYGut4omVmFwB5wLvOucJYxxMtfa/iSzx8r5I6kSaZycAo51xRrAOJVij2TKCWmaXEOh4B9L06J5RIE4CZXYHXgVgV61jKK/SlrwxcHOtYznf6Xp07SqSJoQvQNTQo8K1YBxMtMxtqZn8HjgAfxToe0ffqnMWgK5tERMpHPVIRkXJSIhURKSclUhGRclIiFREpJyVSEZFyUiKVpGZmdczsLjOrFutYJHkpkUrSMrOawCtAAXAixuFIEtN5pJK0zOweYIdz7s+xjkWSm3qkUuHMbEDop7OZ5ZhZXTN7M/RTx8y6m9mC0LL/MLNvmFlG6PdXQ4/PMrPlZjaqxHZLbQfv+uvhZvaCmVU2swlm9paZTQ0tP8jMVpvZD82sbaik3Auh53JPun3SzNaY2dcq8r2SxKBEKvHgJmAm8CzwXaAO0NTMmgN1Q/fvAO4HtplZK6CJc+7beJ
c4ppSxnerAAGALcDXQGLgNqGlmzZ1zTwFXAbc5594Fvk0Z12o7534KPAD0OncvW5KFEqnEygNAcWHki4BPQj+N8cq6/Q3IARYCtfBmm30U6ATULLGdHUC9MrbzRejxj4BGJdbZBjQ2sxxgEVA8EPVOaN1SzCzFzF4CxpdYViRMiVRiZSIwNPT7Lrye4MV4ibEGkAv0BFbi9Sx3A0Odc193zr1TYjuNgH1lbGcL0BJoEHqu2CWh7V3nnOtU/KBz7kog9TTFgdsAO4Fh0b9cSWaVYx2ACDAXb3Qd4EZgEN5I+/V4PctqwFPAbDPbC9wAEKr4875z7kgZ26kGvBTa1iTgB8ALwL+dc5tDx0f/AWw3s6uBsXiDU4VmdpmZzQMuAw7jJdOr8A4diJSiUXtJSGaW65zrHOE6M4Ec59y2CNbJAXKdc7mRtCXnF/VIRc7sZbzDACJlUo9URKScNNgkIlJOSqQiIuWkRCoiUk5KpCIi5aREKiJSTv8P+o3hwRFL77cAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f2fb07f0>"
]
},
"metadata": {},
"output_type": "display_data"
}
   ],
   "source": [
    "log_counts_3 = list(np.log(counts.T[:3] + 1))\n",
    "log_ncounts_3 = list(np.log(counts_lib_norm.T[:3] + 1))\n",
    "ax = class_boxplot(log_counts_3 + log_ncounts_3,\n",
    "                   ['raw counts'] * 3 + ['normalized by library size'] * 3,\n",
    "                   labels=[1, 2, 3, 1, 2, 3])\n",
    "ax.set_xlabel('sample number')\n",
    "ax.set_ylabel('log gene expression counts')\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_10.png', dpi=600) "
]
},
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Normalizing between genes"
]
},
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
    "def binned_boxplot(x, y, *,  # keyword-only arguments: Python 3 only\n",
    "                   xlabel='gene length (log scale)',\n",
    "                   ylabel='average log counts'):\n",
    "    \"\"\"Plot the distribution of `y` dependent on `x` using many boxplots.\n",
    "\n",
    "    Note: all inputs are expected to be log-scaled.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    x: 1D array of float\n",
    "        Independent variable values.\n",
    "    y: 1D array of float\n",
    "        Dependent variable values.\n",
    "    \"\"\"\n",
    "    # Define bins of `x` depending on the density of observations\n",
    "    x_hist, x_bins = np.histogram(x, bins='auto')\n",
    "\n",
    "    # Use `np.digitize` to number the bins.\n",
    "    # Discard the last bin edge because it breaks the right-open assumption\n",
    "    # of `digitize`. The maximum observation correctly goes into the last bin.\n",
    "    x_bin_idxs = np.digitize(x, x_bins[:-1])\n",
    "\n",
    "    # Use those indices to create a list of arrays, each containing the `y`\n",
    "    # values corresponding to the `x` values in that bin. This is the input\n",
    "    # format expected by `plt.boxplot`. `np.digitize` numbers the bins\n",
    "    # starting from 1, so the range must run from 1 to the maximum index.\n",
    "    binned_y = [y[x_bin_idxs == i]\n",
    "                for i in range(1, np.max(x_bin_idxs) + 1)]\n",
    "    fig, ax = plt.subplots(figsize=(4.8, 1.3))\n",
    "\n",
    "    # Make the x-axis labels using the bin centers\n",
    "    x_bin_centers = (x_bins[1:] + x_bins[:-1]) / 2\n",
    "    x_ticklabels = np.round(np.exp(x_bin_centers)).astype(int)\n",
    "\n",
    "    # Make the boxplot\n",
    "    ax.boxplot(binned_y, labels=x_ticklabels)\n",
    "\n",
    "    # Show only every 10th label to prevent crowding on the x-axis\n",
    "    reduce_xaxis_labels(ax, 10)\n",
    "\n",
    "    # Adjust the axis names\n",
    "    ax.set_xlabel(xlabel)\n",
    "    ax.set_ylabel(ylabel)"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABfCAYAAAC6PE+FAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAGotJREFUeJztnXmUVPWVxz9XNlllE1RExBiXxiRoghuIKCYSFAwa476gSZwo5jAwEHNiMmqYRDESJ+BMYqKA5qiDawCBGJHWCCpERYVGI9EgYthBVpsG7/zxe6/7V9Wvql6tXd3czzl1qt72295737q/7f5EVTEMwzBy54CGToBhGEZjx4TUMAwjT0xIDcMw8sSE1DAMI09MSA3DMPLEhNQwDCNPTEgNwzDyxITUMAwjTzIKqYjcJyIvicgmEVkgIstKkTDDMIzGQhyL9ERVHQhUARcBnxU3SYZhGI2LOEJ6c/A9EXgImFK85BiGYTQ+msc4Z42IDAFeBY4ElhY1RYZhGI2MOBbpk0Af4B1gM/BAUVNkGIbRyIgjpK2BjcBWnAUbx4o1DMPYb5BMbvREZDjQydu1RVVn5hSZiADtgB1q/vsMw2gixLEuL1TVawsUXztg27Zt2woUnGEYRsmQVAfiVO0vEJEXgs8CEXmhgAkzjIKxcUc1dy9YycYd1QU5zzDiEsciXaqqZxc9JYaRJ1OXrGb87BUAjDvr6IRj763bzphZVUwaVsHMqnUpzzOMXIgjpM/5GyLyQ1X9TZHSYxg5M7Jfz4RvnzGzqpizYj0A0y/tm/I8w8iFOJ1NrwGXAKuAXsDjqtovp8hE2hO0kbZv3z6XIIz9jI07qpm6ZDUj+/Wka7tWOYfjW6THdrdnz8iJvNpIfwD8JzAXuIO6mU6GUXTC6vrUJavzCufY7u159runZC2iMnZWrH25YG21TYc4Qvov3Kym2ap6NbCluEkyjDqGV3Rn6PHdGF7RvSjC44tioQQyLlF/EsUUVxPu4hFHSJ8A1gKXB9s2174J09AvW7KYzaxax5wV65lZtY4pCz9k/OwVTFn4YclFLypt+TKyX08mnn98QlttoSzwKIoZ9v5OHCHdBJwAdBSRccDO4ibJaEhyedmKKb4JYhM256dp1k9XFc9GCEsh1F3btWLcWUcntP1GiatPPmWdKWwjd+II6QhgEfBz4HXgwqKmyGhQol62TC9vPpZOurBl7KwEsRk1oDcTzz+eUQN6Zx1PKclHhKPE1Q/TL+ts40kVtpE/cYT0D8DVwNeBq4Jto4kS9bJlasuLa+lEiaZfXc8lbZBbh1ChLM58w8n2er/N2Cgf4gjpbcHn6OD79uIlx2gIMlmcUS+vL66ZBC4Mf8rLH9a3XHOsrpcj6dKZTadWuuN+m7FRPsQV0nuAf6jqKlVdVdwkGaUmk1UY9fJGDX5PJci1M46EepZrttX1xiKqIXHTG9eqLmQ7Zy5l2djKv1TEEdL/BG5U1ZHFTozRQKSwCsOXJmoIUkimJgCoe/lH9e9da7n64Vi7XXy6tmvF+Nkrsi6vbAXdBDM74kwRnQmoiGzBjexXm3vfNAhnDV12Yg/atmqe0sp59M01zFmxnn6HHwTA7X95n53Ve+udl26KJsDKDTu45rGl9ea7j5+9Ar1nWCGyZBgNQhyL9CzgMeBN4Mcmok2H0IJ8dOma2n1R1fNdNfvqvsNJckmT5ZJ72H0mLljJ+NkruOKRN5mzYj1jZlVlFF0jO6IsyXyaFZoCVdcIVdeknNVZUOJYpJOC75bAwyLynqqeX8Q0GSVieEV3Kv+xiV3V+7j9ufdr94eWYkibFs1qvy/r24Mlqz/lsr49Eq5Jx+KPtgLQoVUzjuvWlp8OrhNbq9IbxaJieul8x2cU0lzbRkVkLfAucKmqrs0lDKPw+E5Awk6kfj0PqteBMbJfz1pBHTWgd23Vf+qS1cxZsZ5BX+gSGb6MnVWvmn5yz4
68+MFm2rRsztJVW5n39w2c2jv6eqNpEVqEpRS1hiCjkIpIFW6KKMRsIxWRZsBc66AqP3yfnX712rcMk310hlX25GuSLddUXH/KESxfv4ND2rVk0aqt7Krel3c+jNIR9ecY93gmAc0UdmMhThtpNfATYBRwTsw20k5AHxEZkU/ijMKQavC8L5B+u2hym1m+U0AfWPwRc1as5401nwJ1ba5G46WptqvmSpw20meA84COOHF8UlXTOi5R1Y0i0h+YKyJv2NjThiXZc3yyxTll4Yfc/tz77Nyzl9vOPS7t9Tv37K09Ny4LVm4E4O8bdgDUCqpRepqKBVhuxGkjTZjJJCL/ESdgVa0RkU04ATYhbUAy9ZCHVe1UVe6wU2p4RXcmv+wG7W/csSd2/Ku37gZgz15Xzdu7r2m3lxnZ0fnWeWyeMKShk5EXGav2InKjiFSKyPPBwnefxbjm2yKyENgDvF2AdBp54A9L8qvp4e/Nu5wopqpyP7rUjSN9dOka3tvgnH+F33HYXl0DQKCjbNkdX4SN8qHzrfNS7ks+lk3Vf8vumvwSVgbEaSO9RlUHqeo5wGCcA5O0qOoTqtpfVa+y9evLC3/20cQX3PjO5993Ve+U4ujNfBo7sDcHt23B2IHxPTCJJD5mH2zenUvSjQjSiVshzvGFMkrwtuyuQe8ZlrMYLt84nOUbh+d0bTkRp430YRF5EagJzn+ouEkyiolfzb9o+t8AOLhdS/oc2oFJwyoir7nsxB4s+fhTLjuxB2NmVbFhZw2TF8VvrendqQ3L1u2o3f58P/hrjaqu+mIVR9ziXLNld029/f4+XwDTiWLyNeE+AL1nWNE6l/p0neniKEropSOORfqMqp6pqueo6iBgZaYLjPLFr+Z/6RC3ftEpR3RKu56RX7WfNKyCocd3Sym6UbRukfiYHVCaySZFJZNFl0rgfAsuSsxS7Ut1TXhO1L5kSzGT9Rh1fPnG4VRdIyWxGhvzSIA4QvoLEblTRM4QkT8DpxY7UUbuRA1VSjV8qWu7lgnfqXh99daE72zZstv18IcPW8XBbXIKp1REiWSUtRan2lsO5FN97tN1JhXTlT5dZ8YOJ1W7aSo6tW6RU9rKiThV+/k48XwSmAisL2qKjLxIHuqUah9QO91zyDEHc/eClSmXPJ737oba7+3Vb/PiB5sjnZak4p9bdgHwebC9bP2urPMVh2zbAqNe+FTVXqhfxQ0tuFJaUnVCFl0Z9oWu6hpYTl312VmXifv8a/x9UURVw/0wwyPZlktj77GHeEIKbhXRV4uZEKMwRA118ocv+YRTRIHa7+QxpgB3nXcc4559l7vOO46HXv8YgK1Z9Lz/z4gTuOnp5dQUuHE0kwBCfaswShRL0RYYkknM6tC0oqgprvHDTM5Pn64za/f5gpxJaNPhh9nY2znzIc440ukAInK3qo4rfpKMfPBnK4WEgjnoC10Y57WDhgJ78+m9arej2Fa9j8/VfYfV9PA7Fb61d9fQYwFo3xK274FD2kY3JfjXpOuoCUklgNlWLQtNlIXn70snZtTuIa0oJocfXhOXKOvST1scClG+TWVyQJw20pDongij7EnlVT0U2MmLVqVcvqLzrfMSXOcNC8R2WIY1g/yOixufWkbN58r2wIjdWp25syNVR00c8h2Skwt++2GfrjPrtSmG+3yi9pULcdpDy6UNuByI47TkCVX9tqr+WykSZJSOUFiHV3Rn0Be6RM582rK7hlH9e9O2pfP+tGnnHj7cspubB/TmvkWrYs1K6dKmBet21tAM2AecekTHeudECWc5kdgWWEdUlTukMQ/tyUfgo9pNmzpx2kj7i8iDwe/Q+9N1RUyTUWBSdTaFdGnbMnJ/yKade2rbWB94zTkg6dOtHRA9zCd5e91OJ4rhvKnX13xa9sIZElUl9yl1Z1M2yNhZdGrdouhlG8YTkqrdtCm71IsjpKeS+LfSBEYB7l/4HVC+P9IogY1qXxwzq6q2M2rPXtf3/uYn22
qPp+rQCWnVDPxp/M3kgAYVzijrMrrjp37bZWMhFLLNE4YUVej9eDKRTkDTtY83BuK0kfYF7gemAr8DTixqioyC4w/C96eIRi2zHNW+6A/CnzLiBIYe340pI06IHf8BSe/P1s/iD50qBv7YSL89Mxx4Xs5tl7l0zvjXJFuP6Sjl+M6GaNcuJHGE9BbgAlUdDHwL+HFxk2QUE7/jKZyx9MDij9L6Gz22e/vamU/+77jsTarENM+mizNP/E6TdB0ovrg2FJ1at6gnlKGYpRPAVNekOi+uxReel434huRyTWMmziP9IvCMiEzD+SatLGaCjMIRzmh6b932aKEMLMU312yLXEY5FdlWFdu3SnzMWpZQSH3rslwszVQiEyVwmycMSSuAccNJpljim+s1haSUi96FxBlHeouIHAh0BjarakY3ekZ++O2Y+SwOF1bjK/+xqbaNc2f13trllMO1mIZXdGdm1bqslg/JhgNbNIfddY2kB7c9kFWfFuYxitubXmo6tW4R2T6Z3KaYb/tlocQqTGsuHVTlNha0ITqzsl6zSURsXfsik6mXPS5Rw5smLnA+Z3bV7EsYvD8ui6p6tvTq1JpPttVZwx9vy19Ey703Pa71WE6UqoOqKVKsNZuMPEg1gD4OUQ5KurRtyfjZK5x16/kWzYZ8Xqypl/R16WjTPPjOTlCi2jbLrZoexyprjL3RqWgqfkQLRSw3esBQ4EbgeREZVdwkGX4ve0iUZ/uoziG/V97/HRJ6wQ+/4wikPzQll2mBYcfU7OtOBuBPI/tFnpfKZVtDi2ZUh48vnk1JINPh/1n06TqTAT3nFizsUrrrKwZFW7PJKCx+dR9IWfWPclrit32mWiok3aJouXg58tsuO9/qXrhwLftUa9qnmofuh5m8r5Ckah/02zPDau/+Ip7pSFcG2babhn+UnVq3YHNeqWoY4np/qkVVf5XpHBHpjrNk9wJDVXV7Dmnb7/E7nXyB9GcaJZPstMT/3fnWebxyc3+Om1jJlBEnxFpuIh1xnXOQJEzJ8USFU6wpl6FQ+qLpdwwVqhPIyI5y67DKlqyFVESOUdW/ZzjtPOCPQFfgHODpHNJWtrz64SZGzniLqd/5Cu/8axs3Pb2c+0b0ocOBzbl+xtv8engFW6v3pe11Dy3AdD30vhX6y/kr2bK7hl/OX8lnNfvYvfdz+vU8qHYO/DF3LgCclRAO/RjQc26tUICzLE+bvBCA0yYv5OXV3wxiinbZ5lN/X6L7teR96diyuyblNamswnT7oq4x67JpUe7TS7MWUuB0IJOQHgaswK3z1CPVSZkKJ91YsIrpGnl93GviEnXNyIr5PFk1GG6D/sBScHO+gMVA36dnsXT9MNZPq/OCnRzOcpw4gfvXmcr7nDfti4nx3PeZ2zfNnVOPR9zX+mnwcu1OJ0ydWrfwhDKa8LzN3u9QeHxrLWpfWP3q1LpFQjj+eaGYRQl21DXprMJM+/zfhR5i1FjJZOVFHQ/3+ceifpfagixXAQ2RTIt8isjApF0jgV7AaFWNXGpZRG7FCWkXoIWq3hfsbw9sW7NmDe3bN16vfIv/uYkfPL2M/x1xAlVrtzNm9gomnX887Q9szk1PLePOocexdc8+rjrpcLpkGAe6aUc1D7/xcaxzC3mtYRjZ0aFDhw7AjqiVkeMI6QfAtHATtzzzURmuuQ5ojavav62qTwf7DwU+yTYDhmEYZUKHqD6fOFV7wbXvbwOW4do8M/EsdZ1Nk7z9a3HV/h1RFxmGYZQ5kdoVxyLtBTTDTRHtC9wBzAPujNHpZBiG0eSJI6QCnIGzJNfj1rUXYK2qRrsLMgzD2I+II6QzgPeAVUBPoL+qxqneG4Zh7BfEmSJ6JPAX3JCnF4AuIjIwoje/QRGR80TkcRHpJCKniciHBQ7/DhG5V0RGi8giEXk82P8tEXlTRCYXII5LROSvIlIpIseIyD9FpDI4drmILBGRR/KNJwhvbRBPDxGZKiKjg/0FyY9XXhUi8r
qIzAlqN4jIZV6+porIWyLy7znE4ZfXl/14ovIhIj8O3EHmkp9eIlIdcf/PEpE3RGRqsH2AiPxaRMbnEEcfEXlFRF4Vkc7Jz3FSuY0K4v1RlnHcISL3Br+vF5E/Br8TyktEZgflukNE2opIbxFZJSL1F9yqH8fFIvJiWNZJ6b4ySPeEYPuuYPuKYPukbN5d770fISKLReQvItIiIt5xIvI3EZkYbMfOTxziCOmDwFnBZxCuEyn8XRaISDvgBlW9WFW3AN8F/lXA8I8GugWb/62qpwNHBjfsOuBCoEKcu8F8mKGqZwAbcc0nU1V1UHDsImAwcHT4oOSKiDQD5gZhH0Fip2Pe+Ukqr0uBnwHrgL6BmPoDXK/DDce9KIeo/PK6xY8nOR/BM5LPn/9o4H3q3/+RwFVALxHpAlwJvKWqE3OI42zgl7hhwSfhPccpyu1UIPbkdP++iMiRuNrllV54teWlqufj/GvMUNWdOH8bW+PEo6qP4/Th5KCM/HR/DzgTOD/YHgqcBlwfbA8GYrkH8997YACuvD7F3Yvk8vqGqn4N+IYXb6z8xCGOkB6qqreHH6BV8PuOQiWiAAzAPQAviMjpuAd+T6ECV9WVwGPBbxU3BXa1qtYAnwOtgO1ApzzjURFpjVv6uh0wSETODA5PA34BvBTEmw+dgD4iMkJVXwHme8fyzo9fXri29bU4QeiBe7jneOcq7qV7Nod4/PKqToonOR9XA9NzyY+IdA5+boy4/2H+1gKH4l7QG0LLJ0tm4cTsENwwQf85Tig33FqCrchiUk3SfRmCE7rZQRlG3fdrccsMEYwF3xInnuCP+l3gb8DXk9IdDh/aFQjhAbjRPW2DeO7G3cs41L73uNmTo4BtQT6Ty+sZEZmEm3GZVX7iEEdIzxGRQYEpPAj3r1ludAamADNwL+SD6U/PneCfbgoQVkXvBX6P65ArhL+FScCtqvo67l97UmAZHotrqz4seFBzRlU34qzAm8SNyvApdH4SogYuAJ4Kd4jIwcD3gYw+HFIwCbiVukVKw3iS89EPeDXHOC4HHg7Sm3z/fRT3LH4TOCOHamMPnCC3BMaS+BwnlBtwD+5ZPzjLOEI648ptGc5Ki7rvp6hq1mWmqvuACuAg4OKkdCecCvwBWEDi/YuL/94PxvXjHCRu4k9yeR2Fa57snUM8GYkjpJfhTPFbcCJ6RTESkicbcP9on+H+oX+D+6caUYS4RgBLVHUVgKpW4tayeinfUQwicpILUpcEYe/ETbNthbsP9+MeziPziScIuwbYBHRM2l9JgfIT8AnOwgonY/QCHsLdnwHAT4EJuVjZSeWVEI+fD9zwvaOBycCZInJsllEdBYzHicM4vPvvxdsdJ4Lhs1hN9ivufgfXH7EEJ27+c5xQbqr6GK6aPD9VYBnw3xlJvu8ickhwTk4EYtocOJzE+70tsERbq+pOVf01MIFE6zGXPIzBWdtrgZOp/5ydq6q/BQbm2zQWiao2+g9wIPBn3EtzZLCvssBxDML9a0/GTa+vxL2cg3HViqMKEMdYXJWoEicwr+LG6xI8KO8EcR2QZzzfBhbirCzBVeFGB8cKkh+vvCpwVbw5frrD+wO8HeR3dp7lNdCPJyofuD+gaXnkqTLi/p8FvIFrzwZnbFQC03MI/8ygPBYDh0U9x165XQL8H9Alx/vSHWcJvgB0SC6vIB/3ROS/Y4w4RgN/BZ6gbmRQmO4rg/KaEGyPwq1Q3Nq7fmnMvPjv/Y+C96MSaBtRXpOCsr0v2/zE+WQc/mQYhmGkp4TrORqGYTRNTEgNwzDyxITUMAwjT0xIDcMw8sSE1Cg5ItIt81lGOkSkq4jY+1sm2I0wSkowhi+nGUZZxvNfxY6jgbmKxCmQRgNiQmqUmgtw/myLTcq1wpoIj+DmlhtlgAlpE0ZElgaf24LtSnGehaaJyJEicpuIDPa2p4XnBd+TReRtEfmSF+a0IIzhInKniLwmIkOCacS3JYXzuDiPQod4yTobeD
E4/m6YPnHejhYEn07BvqUi8n0R+a44D1FXB9ctD9JwbnI+guNfBU4SkRvD6q+IrBaRbiLyPanzcDRNRI4Vkee99A/y0vNCkJ624rwMLRaRi4Jy7CZ1HqBq8ynOC9VrIvKeiBzmlVtlWDYRcd0lIqeKyPniPHz9JLjm2qCMtgbbteWgquuAjla9Lw/sJjRt3sHNMvG5NGn7hlQXq+rNuKnByYuYXgo8B3wZ5/zi2hTXX4ybvz3I290L+Dj4vdZL34U4xywP4axWcLOt7lfVP+A8BF0e7N+Ac4YxJjkfQfvrnbh51zXAbcHxzTjfBSfjpoyGXI+byvgZzvlJyLdwc91n4uafjwIGqeqTwfGf4aY2JufzHNwsmldwc+Yz0QvnGOhVVZ2tqv2Ac4NjLYCfEyxUG1EOG3HzzY0GxoS0aZN8f5Pnfp9I4tLaQzxrtFVgcU3ATcVLpnNw/dM4z0HgBPWx4PouIjIXJ0D+9S2AnRHhdQdWB5/D/AOBRT3XD0edB6FQqPx8DMRNE92kqr8HTheR5sBbOBHdSZ1gdqduDZ43gWOAu4PtQ3BTUK8A2uC8nu3ykvVVYFlEPmcEYQyJyGMUwwjmywfW91+pc0TSEc/VW0Q57CGeWBtFxoS0iSIiXwR8p8BdcL4affrj2tpC5mmd/9Ov4CzGMUSzFee8Y5CqhhbSNOos3sHA80CyO7lNQA8R+TJunnzIOpyDi8Op70v2m6p6pr9DRNpQJ+B+PnYRuIETkVBk2gXndgAW4Va4BdfMMBVAVatVdRjOKQm4ZXXuVNWvqepTwN4gzpCpOCs4OZ9tcVZs3Hbg3wPXiUgrnG/TM4OyAOdv9I005dCZArqCM3LHhLTpcj/OU9W9OEvxPuCupHNm4IQniiqcmP4KV4VMILDO3hKRl0TkkojrF+FEdTSJL/tSnOV3L84VXMhTuGr2SOBPSWG9JSILqRPYr+AcVTwQkY/5OKv0S7jRAb+jzmq7g0SLbr6qro5IOziHGzcE7aedgd8CL4pI+EfxIM4j1ytJ+RxOtCejE3D35GyCJoGA6iCd1wdpewVQEbkcWKmq/p9KbTkEbaOtVXV3ivQbJcScljRRRKQytC793w2NiBwB3KGq1+YRRtr8iHPT9gTwQ22iK92KyLnAV1X1Fw2dFsMs0qbM6BS/GxRV/YjcHSzHjWMHbpXbJimiAW1wTpGNMsAsUsMwjDwxi9QwDCNPTEgNwzDyxITUMAwjT0xIDcMw8sSE1DAMI09MSA3DMPLk/wG9s9OoWUWdPAAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f2fb4588>"
]
},
"metadata": {},
"output_type": "display_data"
}
   ],
   "source": [
    "log_counts = np.log(counts_lib_norm + 1)\n",
    "mean_log_counts = np.mean(log_counts, axis=1)  # across samples\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_11.png', dpi=600) "
]
},
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Normalizing over samples and genes: RPKM"
]
},
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Name the variables after the RPKM formula to make comparison easier\n",
    "C = counts\n",
    "N = np.sum(counts, axis=0)  # sum each column to get total reads per sample\n",
    "L = gene_lengths  # length of each gene, in the same order as the rows of `C`"
]
},
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Multiply all counts by 10^9 (in floating point, to avoid integer overflow)\n",
    "C_tmp = 1e9 * C.astype(float)"
]
},
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Broadcasting rules"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"L.shape (20500,)\n"
]
}
],
"source": [
"print('C_tmp.shape', C_tmp.shape)\n",
"print('L.shape', L.shape)"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"L.shape (20500, 1)\n"
]
}
   ],
   "source": [
    "L = L[:, np.newaxis]  # append a dimension of size 1 to L\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('L.shape', L.shape)"
]
},
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Divide each row by the length of the corresponding gene (L)\n",
"C_tmp = C_tmp / L"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"N.shape (375,)\n"
]
}
   ],
   "source": [
    "# Check the shapes of C_tmp and N\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('N.shape', N.shape)"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"N.shape (1, 375)\n"
]
}
   ],
   "source": [
    "# Prepend a dimension of size 1 to N\n",
"N = N[np.newaxis, :]\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('N.shape', N.shape)"
]
},
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Divide each column by the total read count for that sample (N)\n",
"rpkm_counts = C_tmp / N"
]
},
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rpkm(counts, lengths):\n",
    "    \"\"\"Calculate reads per kilobase transcript per million reads.\n",
    "\n",
    "    RPKM = (10^9 * C) / (N * L)\n",
    "\n",
    "    where:\n",
    "    C = number of reads mapped to a gene\n",
    "    N = total number of mapped (aligned) reads in the experiment\n",
    "    L = exon length in base pairs for a gene\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    counts: array, shape (N_genes, N_samples)\n",
    "        RNAseq (or similar) count data where columns are individual samples\n",
    "        and rows are genes.\n",
    "    lengths: array, shape (N_genes,)\n",
    "        Gene lengths in base pairs, in the same order\n",
    "        as the rows in counts.\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    normed: array, shape (N_genes, N_samples)\n",
    "        The RPKM-normalized counts matrix.\n",
    "    \"\"\"\n",
    "    C = counts.astype(float)  # use float to avoid overflow with `1e9 * C`\n",
    "    N = np.sum(counts, axis=0)  # sum each column to get total reads per sample\n",
    "    L = lengths\n",
    "\n",
    "    # Broadcast N across rows and L across columns\n",
    "    normed = 1e9 * C / (N[np.newaxis, :] * L[:, np.newaxis])\n",
    "\n",
    "    return normed"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [],
"source": [
"counts_rpkm = rpkm(counts, gene_lengths)"
]
},
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### RPKM between-gene normalization"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABfCAYAAAC6PE+FAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAG0FJREFUeJztnXmUFNW9xz8/YMBhk01ACAgGNQxG0YgrKqiJioLikrgrLvhiSA4Pn8S8aIKEl7hE4nviOYkboHkGEZcHiJi4jIqAEg2gDmJUQojCIAPIPgzwe3/cWz3VNdXd1dt093A/5/Spru3eW7eqvvW72++KquJwOByOzGlW6AQ4HA5HqeOE1OFwOLLECanD4XBkiRNSh8PhyBInpA6Hw5ElTkgdDocjS5yQOhwOR5Y4IXU4HI4sSSmkIvKQiLwpIjUi8rqIfNgYCXM4HI5SIYpFeoyqngZUARcDu/KbJIfD4Sgtogjpj+3yXuAJYEr+kuNwOBylR4sIx3whIucAi4E+wNK8psjhcDhKjCgW6bPAAOADYCPwWF5T5HA4HCVGFCEtBzYAmzEWbBQr1uFwOPYbJJUbPREZAXT0bdqkqrNzErmIAG2Bber8+TkcjhIlinV5kapel6f42wJbtmzZkqfgHQ6HI2dIoh1RhPQCEXnNF5Cq6hk5SZbD4XA0AaLUkS5V1TPsb6gTUUexI7fOycuxDkciogjpn/0rIvKTPKXF4WgUnHg6ck0UIb1QRPqIoQ9wdX6T5HCE01gCGBZP1G25iG/Dtlrue/1TNmyrzUn4jvwTRUh/CPwSeAmYSP1IJ4ej4ORLzBojzkRMXbKG8XNXMHXJmtD92Qqts8hzTxQhXYsZ1TRXVa8BNuU3SY5iIeyFS/USN5awZRJPuucUSnBGDerFvef3Z9SgXqH7/ULrRLE4iCKks4B1wBV23Y21349JZS1FIZkY57oIXWzVAVHo0rYVtw3tR5e2rULDSyW0jsYnipDWAEcCHUTkNmB7lIBFZKKIPCAiFSLynojMsx3wHUXMyuqtnPfoO6ys3hq6f0RFN4b178qIim5x25OJhrfPE9ApC1ZlLcbJ4ikkUS3pbCzuLm1bMX7uigZC6ygcUYR0JLAQ+BXwHnBRqhNEpB/Q1a5eBvwCqAYGZpZMR2Mxbk4V81asZ9ycqrjt3ov9p6VfMG/Feh57958p6+mCYuBZswj7lUWVbf1rPj8Q6cZXDB+rYiSKkD4KXAN8F9Ni/2iqE1T1U2CGXe2BqRpYC/TMLJmOfOIvak8eXsGw/l2ZPLwiwbG7Aaj8tIbxc1cwZcGqyPF41uyJvTpQ+VkNNdtNWO7lbBxKpY64FIkipBPsr59d3pVFfG48fREy5W1T1J7y9io6t2nJkG92pnOblqHHrvzK1OxUW0t0R93eyPE8uGAV81as56ZZHzBvxXrGPO8mW8glYdUGUcWwKYpm1bUS++WbKENEJwDtgM9UdXUGcXwJdAcOtv8dRcKGbbVMXbKGHbVWDNVX/E7AlJFHMm5OFX07lvPQwtW0LmseOb6XP/kKgK93GUv0iIPaZJ54R8ngCVnF9Ma1oxozvigW6S+BW1R1VIZxzMBYsd2AZRmG4cgR/mK8J5qtWzXn3vP7M2Zw35Qtwp7FetWxPRnWvyuXHxO9tmbS2YfTpqwZJ/XumPpgR1GRqi41mUVbMV2TilpTsIajWKSzARWRTaThtERVK4FKu3pcpgl05Ba/xemJ5ahBvWItwCurt1L5WU2DVvng+Wcd1oVX/r6BQb0OZMLZ34oU929e+5Ttdft4a1UNAO/803VJdjQNogjpUGA0phX+GVVdnN8kOfJJUDxvG9ovbv/oWct58/ONbKvdE3r+iIpuVH5WQ8cDzKPjNT5Foara1K/utEGv2rgj3eQ7Sgi5dQ56//Ckx3S6Y35suXHSOY2RrLwQpWg/GfgWpp7zSRGZm98kOfKJv7O3V8yfX7
WO/ve8xuJVNdTt3QcQWwbxuj8tXG2sSa/xKQrNA22NG3eEi7Uj93iClWpb2DF+sYt6XlQ27axD7x/Opp11OQuzEKS0SLOoG3UUOV4xvXPrMmp21HH1jKWMHNCdRas3M7hPJxat3tzgHM8CPal3B1Zt3MmUkUdGjm9PoJqsKXfhiCJc/vWw/1HFL9W5ne6YHxMq//6geIWF7R2j9w+P1WVGOS8qH20YQdW18JGJJeNwCk1KIRWRKkw/UHCOnUser5Fp1KBesWL+sn9t5n+XruW0vp0Yf0Y/DmrXilGDenHfG583ON+zQNdvNy/TEd3aRY67RTPYE723VNGQSJyC2/z7wsQmuC1MpLztwW3Bc1KFk6uwISh2HvWil401OaDL7Fh6lGjVAcVIlKJ9LfBzYAxwlhPR0iFsTLt/rLxXzO9xYDkAnVuH9x3141mg6ViiHpp4poack8z689ajbvMXPz3RCNvmkUzsGoOPNozgow0jQrcZUZSE+8MY0GU2FdOVAV1mx36pSMdKlVvn0LG8LPLxxUiUxqYXgPOADsAAEXlWVZ3jkhLA30LvNSp5jUWpWuUT4Vmg6ViiHhd/uztPLV2b9nmpCLMUN+2sS1qMjSp2+RJAT7T8ouQXsjALsKHQaeg5Xph+S9Ifj2cB1p+rced429K9lkysVC8tpdzQBNHqSONGMonIf+QvOY5c4m+h95hdVc28FesZ8s3O3GbFsHWr5rGlX2iTCWq6dLpjPp/cPpSnlq6lmcA+heYJDNR06wgzKcbmi1Ri522LKnb1Z9DAEgxuC57jLzaH4Z2rKbaFEfeRinhOIkqxKB8kSh3pLcD3gT2YqoDn8p0oR24I694UJq6XD+zJkjVfc/nAnnFC65Gqi0qUYtymnXWxvqrd2rRk7bbddA0MQ/Vbk+nW4zU2ieoNU4mdt80jldgVkjCr2aMQH6liJkrR/lpVPQFi89C/g/NJ2qTwi6dfaD2LNNVLk24r7nPXHcdJUxby3HXHJS1+FwthohkmgMXY5uzVP2aSt1HqQhMRVtwv1FDRxiCKkD4pIm8Adfb4J/KbJEc+CY5smrpkTay+NKyTfiZdW1K9tCf27RxbFtKy8b/syYrkxWY1Rk2Lv/6xsdMeVkXQFAXUI1Jjk79xSURKu1Z4P8dvcXqiun33Htq09EYq1XePguIpSucD/8sepUjemGRiSfrPCbaC+8U3Gys1jFx9ZHLZ0b+xidL96dcicreInCoiLwMn5jtRjvzhH9nkOShBiXWJysVUIsWIv3tPsq4+xYDX+BJWH52om1DwnESt4MnCDsOLrzG6KJXyRzuKkL4KHAg8C/wFaNhL21H0JJsnyfPg5HXSb4re6/39H6P2hcwlYSIUFKmw1uvgtnTENRXBsMPC2TjpnLTFt9A0lg9SP1GK9mBmEXXOSgqAv6idzRw9/rrR8XNXoPcPb9Bn1As/2NJfCoS3ojfsW1kowuopPWHKth9lrgTOn55sKWSdciHqYlNapKo6XVWnA0fa/66xqRHJVVE7zNIM6wpVanjF9LDRN/5tjUki67Ips79db5CoFikYL/mORiYbsQs2HAXxLNBSno2y0JZmGMnqJ4ul5T8bctlQ1VSI0iF/lqpeoqr/1hgJcsQT1qk+KsGiey5HKhWCZB3EC4FXt1mI7kWFZOOkc2w9ZOl7bcoVUSzSU0Tkcfvf8/50fR7T5EhB1HrTMGvW39E+G/Lhm9Ij0bDJbMaDZ0tQNAs5RnyxnWHAWyZiZfVWhvXvysrqrRn5RkiGdy86lpexMcEx+1NxP0qr/YnUzyT6S7KbRdSRA6LWm/q7Ovn/pyKZdRUcxpmPIl6qus3GbHX313cWS6v1pU++H7dMxM2zljNvxXpunrU852nIR54Ue7e0ZESxSAcCPwTKgN3AH4BMZhN15IhCNhI1dt1YIaxPP8Uinn427aiNWyZit53lYHeC2Q6KjagOU4qRKBbp7cAFqnomcCHws/wmyZEKf71por6hxYRnaW
QyZUWhrM9iZp9q3DIRg/t0ilsWO6WQ94mIIqRvAC+IyDSMb9LKfCbIEU5Yh/opC1Yxfu4KpixYlXZ4jTkczxPDYHVAIVt+w0SzWKzPRxauilsGaduqLG4Jpj7UvwS44YTeDOvflRtO6B1XX1rMlKqYRvFHeruIHAB0Ajaq6q78J8sRJMxJc8zhfJJBHP6pG/z//Y6PPTKdyTGs+J2Ng+DGsECLRTTDuOX5j2LLm07u22D/XmuJ7vVZpKNtPehoX32oN1HhoF4HsmTN18xbsR4oXbEqZtKes0lE3JxNBSCsXnTMKX1p07JFxnWlUbzGQ+JWdI9sHASHhbO/4rXCt2ym7NkH5WXSYN/iVTXsqDUTX3lLgJ27zf+N22tjlqf/uDvP7MfnNdu55tgenPfoO0weXpHzlvz9GTdnU4kQ1uqeTkt8VMKK/OmMEEq35TXRHELpEOZYI8o49mJj2GNmFLY3S/Wu3ftiAjrsUbPvgqlLmHbZ0bQpa8a0y46Onav2k7Xm613MW7GecXOqaN3SznzQsjnPfbCOj9dv5z9fWhnb71UTRaljj+IHYH8mipC+AAwDbgFeEZEx+U2SIwrJuih5+9Jp1PEEMNt6y3Qbh3IxjDPMsUapONuYX7Uutty0M751vX15C0Y8vgSATbvMvpoddQzs2YHT+3VhYM8OsWNbWzeI3zjwAA5qU8aPTz6EMYP7cu/5/RkzuC9/+3ILAN3atWJY/65MHl4Rq1sP1rF7+VZs4lkIZyRRiTLW/i5VvVNVx6jqUOCARkhXyeM1Dq2s3pqyZd0TvmQemvwEJ3tLJJT+xp1UYhomgJnMRpkrErltK7XZJsMajvzFdE8ovaWfls2a8dWO+A9bx/IWcf1DPWF7+JKjGNa/K63LmvPV9jp+8/pncefdelpfDmpTxi/OOowXbzzBFOsj1LEXExXTtWidQ0exSONQ1d+me46IdBORRSLylog02YqZp99fQ9vbX+Tp99cw4eWVjJ+7glFPL03aed5vNUbtaO+3Gr3pgIMi3OmO+XFiF2ZpJhNKCBfXxnAIksyS9FuapcDoZz+MWwJc+sR7sWWd1YU6hSsHdgegXUvzWh7YuozDOreO2zZiQHe27zZl/807d8fueec2LRnyzc7ssUbtzt17456n+99cFRNY75wxp1iL9ZSGDVqO9EjHaQkAInK4qn6S5mnnAX8EugBnAc+nG28xM79qHdfMWMrW2r3s2rOPG2YuZ2edeaLXba01wjQNGNrwa7pgzbmx/ye/Mo/yFs04b9phVE0z2wb3egmIF5WwFnF/q36w9btjeRkfxeLRBvu9UDqWl4Vu27SzLtSreiJP64m2LVhzbujcR8nGrDcVRx9+ttmGoW2799K6hakTbd0CZi6vBmDrbvPsVG+t5cbje3PfG59z5MHtWbR6M706lPNZzQ5zXO3eBr4U+nYqB6C8ZfO4Bsqvttbyyt83IBDX+6MUXSYWI2kLKXAykK6Q9gBWYOZ96pnooFSTYwXrRyqma2idiXd+on1RwkkUdhi9MZ1rj+8xu4GFt2XMBgZMMXP+hIXnF7uFVuz82xb4BNA7f0CX2bExzrH4ppmvFdPq96sVJiPCaro/UT8HkRJf1xrmjzLMZ2bYWPMo4cit9dflTYcRn8bSx28t6/3D+emcD7m3chXjh9RbfX+68hiumbGUJy4bSIfyMkbNXMbU7x/NB2u38KPnP+LmE3vx9LK1PHHZQI7r3ZGD2rViREU3ZldVM2pQLy4/pifj5lRx55n9eOsfm+J6bZzapyO/evVTJg+viBu4Mf6Mfg3CSZRuR/qIphgdISKnBTaNAg4BxqpqpEG8InIHRkg7A2Wq+pDd3g7Y8sUXX9CuXemW+P/ycTWjZy3n4UuO4rvf6lbo5DgcjjzQvn379sA2DRHNKEL6OaZgCqZa+lpVPTSdBIjI9UA5pmi/XFWft9sPBr5MJyyHw+EoIO1VtcHwsChFe8FUmW0BPsTUcabLi5huVHuAyb
7t6zDF/m0ZhOlwOByNTahWRbFIDwGaY4aIDgQmAvOBuzNodHI4HI4mRxQhFeBUjOW4HvgUY6WuU9XidjvkcDgcjUAUIZ0JrMT4IO0FnKKqmRTvHQ6Ho0kSpUN+H8x89p8ArwGdReS0kNb8okJEzhORZ0Sko4icJCLp+5pLHPZEEXlARMaKyEIRecZuv1BE/iYiD+Ygjh/YAQyVInK4iPxDRCrtvitEZImIPJWDeNbZOHqKyFQRGWu35+RafHlVISLvicg8W8pBRC73XdNUEVkmIv+eQRz+vDrKH0/YdYjIz6xbyHTjOUREakPu+1AReV9Eptr1ZiLyOxEZn0EcA+zglcUi0in47AbybIyN96dpxjFRRB6w/28QkT/a/3F5JSJzbZ5uE5E2ItJXRFaLSIcU4V8qIm94eRxI81U2zZPs+j12/Uq7fmw676rvPR8pIu+KyF9EpCwk3ttE5K8icq9dj3QtUYkipI8DQ+1vCKbRyPtflIhIW+BmVb1UVTcBNwJrcxR2P6CrXf1vVT0Z6GNv3vXARUCFGNeD2TBTVU8FNmCqUqaq6hC772LgTKCf99Bkgog0B16y4fYmvvEx62sJ5NVlwC+AamCgFdNzfYdfD5yCubZ08efV7f54gtdhn41MjYCxwN9peN9HAVcDh4hIZ+AqYJmq3ptBHGcAvwEWAMfie3YT5NmJQORxuv57IiJ9MCXMq3zhxfJKVc/H+NiYqarbMT43NqeKQ1WfwejD8TZ//Gm+CTgdON+uDwNOAm6w62cCkVx1+t9zYDAmr77G3IdgXn1PVY8DvueLN+W1RCWKkB5sx9vfpap3Aa3s/4m5SkQeGIx5GF4TkZMxD//uXASsqp8CM+x/FZFuwBpVrQP2Aa2ArUDHLONRESnHTIPdFhgiIqfb3dOAXwNv2ngzpSMwQERGquoi4FXfvqyvxZ9XmDr2dRhR6Il5yOf5jlXMy/diBvH486o2EE/wOq4Bpqcbh4h4buY3hNx379rWAQdjXtKbPesnTeZgxKw7pmug/9mNyzNgL+baIg+sCdyTczBiN9fmX9g9vw542J77ELApVRz2A/0x8Ffgu4E0e92HdlghbIbpzdPGxnEf5h5GIfaeY0ZLjgG22GsM5tULIjIZM8Iy8rVEJYqQniUiQ6wpPATzxSx2OgFTgJmYF/Px5Idnhv3qTQG84ugDwCOYxrlEkyumw2TgDlV9D/MFn2ytwyMw9dY97EObEaq6AWMF/khM7ww/ub6WuKiBC4DnvA0ichAwGkjbl4NlMnAHRlz88QSvYxCwOIPwrwCetGkN3nc/inn+zgVOzaDo2BMjyC2BW4l/duPyDLgf83wflGYcHp0wefYhxlILu+cnqGpa+aWqe4EK4EDg0kCa4w4FHgVeJ/6+RcX/np+Jacc5UMxAn2BeHYqpnsyLY4EoQno5xhS/HSOiV+YjITnmK8wXbhfma/0/mC/XyBzHMxJYoqqrAVS1EjOv1ZvZ9mgQkWNNkLrEhr0dM8S2FeaePIx5UPtkE4+1qGqADoHtleToWixfYqwsbxDGIcATmPsyGLgTmJSJhR3Iq7h4/NeB6cbXD3gQOF1EjkgjmkOB8RiBuA3ffffF2Q0jgt7zV0v6vpW+j2mTWIIRN/+zG5dnqjoDU1R+NVFgKfC/JxK85yLS3R6TNlZMWwDfIP4+b7GWaLmqblfV3wGTiLceM0n/OIylvQ44nobP19mq+nvgtGyqwxKiqk3uh3H19zLm5eljt1XmMPwhmK/3g8BSzFD7fpiv4vPAoTmI41ZM8agSIzKLMX13sQ/NBzauZlnEcQnwNsbSEkwxbqzdl5Nr8eVVBaaoN8+fZu++AMvttc7NMq9O88cTdh2Yj8+0DK+nMuS+DwXex9RjgzE4KoHpGYR/us2Ld4EeYc+uL89+ADwNdM7wnnTDWIOvAe2DeWWv4/6Q6++QIvyxwFvALOp7Bnlpvsrm1SS7PgaYihFW7/ylEa/D/57/1L4TlUCbkLyabP
P1oXSuJeovZfcnh8PhcCQnbX+kDofD4YjHCanD4XBkiRNSh8PhyBInpA6Hw5ElTkgdBUNEuqY+ypEMEekiIu49LjDuBjgKgu3Ll/YIowzi+a98x1FgriZ+KKSjADghdRSKCzB+bfNNwjnCmghPYcaYOwqIE9L9ABFZan8T7HqlGO9C00Skj4hMEJEzfevTvOPs8kERWS4i3/aFOc2GMUJE7haRd0TkHDuceEIgnGfEeBXq7kvWGcAbdv/HXvrEeDx63f462m1LRWS0iNwoxkPUNfa8j2wazg5eh93/HeBYEbnFK/6KyBoR6SoiN0m9l6NpInKEiLziS/8QX3pes+lpI8bb0LsicrHNx65S7wUqdp1ivFC9IyIrRaSHL98qvbwJieseETlRRM4X493r5/ac62webbbrsXxQ1WqggyveFxaX+fsHH2BGm/i5LLB+c6KTVfXHmCHC54WE8WfgKIwDjOsSnH8pZgz3EN/mQ4B/2f/rfOm7COOU5QmM1QpmtNXDqvooxlPQFXb7VxinGOOC12HrX+/GjL+uAybY/RsxfguOxwwZ9bgBM6RxF8b5iceFmPHuszFj0McAQ1T1Wbv/F5ghjsHrPAszmmYRZtx8Kg7BOAharKpzVXUQcLbdVwb8CjOaipB82IAZd+4oEE5I9w+C9zk4/vsY4qfYPsdnjbayFtckzJC8IJ3s+c9jvAeBEdQZ9vzOIvISRoD855cB20PC6wassb8e/h3Won7JH44aT0KeUPmv4zTMMNEaVX0EOFlEWgDLMCK6nXrB7Eb9XDx/Aw4H7rPr3TFDUK8EWmO8n+3wJes7wIch1znThhF1nunh2DHz1vp+i3pnJB3wuXwLyYfdRBNrR55wQtrEEZHDAL9j4M4Yn41+TsHUtXnM13rfp0djLMZxhLMZ48BjiKp6FtI06i3eM4FXgKBLuRqgp4gchRkn71GNcXTxDRr6kD1XVU/3bxCR1tQLuP86dmBdwYmIJzJt7bHtgYWYmW3BVDNMBVDVWlUdjnFMAmZ6nbtV9ThVfQ7YY+P0mIqxgoPX2QZjxUatB34EuF5EWmH8m55u8wKMz9H3k+RDJ3LoEs6RPk5Imz4PY7xUPYCxFB8C7gkcMxMjPGFUYcT0t5giZBzWOlsmIm+KyA9Czl+IEdWxxL/sSzGW3wMYd3Aez2GK2aOA/wuEtUxE3qZeYI/GOKx4LOQ6XsVYpd/G9A74A/VW20TiLbpXVXVNSNrBON642dafdgJ+D7whIt6H4nGMN65FgescQbhHoyMx9+QMbJWApdam8wabtkWAisgVwKeq6v+oxPLB1o2Wq+rOBOl3NALOaUkTR0QqPevS/7/QiEhvYKKqXpdFGEmvR4y7tlnAT7SJzngrImcD31HVXxc6LfszziJt+oxN8L+gqOo/yczBcjpxbMPMdtskRdTSGuMc2VFAnEXqcDgcWeIsUofD4cgSJ6QOh8ORJU5IHQ6HI0uckDocDkeWOCF1OByOLHFC6nA4HFny/0qtVdCWs3nUAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f279f128>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts = np.log(counts + 1)\n",
"mean_log_counts = np.mean(log_counts, axis=1)\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_12.png', dpi=600) "
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\python36\\lib\\site-packages\\ipykernel_launcher.py:1: RuntimeWarning: invalid value encountered in log\n",
" \"\"\"Entry point for launching an IPython kernel.\n",
"c:\\python36\\lib\\site-packages\\numpy\\lib\\function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile\n",
" interpolation=interpolation)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1856: RuntimeWarning: invalid value encountered in less_equal\n",
" wiskhi = np.compress(x <= hival, x)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1863: RuntimeWarning: invalid value encountered in greater_equal\n",
" wisklo = np.compress(x >= loval, x)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1871: RuntimeWarning: invalid value encountered in less\n",
" np.compress(x < stats['whislo'], x),\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1872: RuntimeWarning: invalid value encountered in greater\n",
" np.compress(x > stats['whishi'], x)\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABfCAYAAAC6PE+FAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAEFlJREFUeJzt3Xm0nHV9x/H3hy1sCSFRAkFNoBTOCQoKiKzJZamERWQpZQvIolARPCkWpcctUIoBNNIC5yggCcvxUPYiAlWWG6gkgEIClIKlKqXFBAlLIJQA+u0fv9/kPve5czPPzNwlQz6vc+65M8/M/LZn5ju/eZ7n9/spIjAzs9atMdwFMDPrdA6kZmZtciA1M2uTA6mZWZscSM3M2uRAambWJgdSM7M2OZCambWpYSCVdJmkByQtkXS/pKeGomBmZp2iSo/0ExExGXgaOBx4e3CLZGbWWaoE0jPy/wuBa4BLB684ZmadZ60Kz/lfSVOB+cBEYMGglsjMrMNU6ZHeDGwLPAm8AvxoUEtkZtZhqgTS9YCXgddIPdgqvVgzs9WGGk2jJ+lgYOPCplcj4vZBLVXfMgjYEHgzPO+fma1iqvQuD4uIE9rJRNI44DbgPeCAiHijvA0YC3QDv4uIrlISGwJLly5d2k4xzMzaoX4fqNAjfRV4vJBQRMTeTeUunUQ6RPAB4ImIuLW8Lf9Ni4hz6rx+JDmQjhw5spmszcwGSr+BtEqPdEGzgbOO8cB/AO8Cm/ez7bdAl6TuiJjbZn5mZkOmysmmnxXvSPpym3nW6wJHRCwADgJmSVq3zTzMzIZMlUB6iKSJSiYCx7WQz4vApsBm+XbdbRGxjNRDHdFCHmZmw6LKT/svAt8mBbyX6Bnp1Iyf0nNi6TeSdi1tmyXpDOBYoDsiXm8hDzOzYVHlZNNmwMHA2hFxqaRtIuLZISldTxl8ssnMhlu/J5uq/LS/CVgEHJPve6y9mVlBlUC6BPgoMFrSWcCywS2SmVlnqfLTfk1gMunE0GLSMcw/DUHZimXwT3szG25tXUd6Zf4fOaFpwEkDUCgzs/eFKoF0Rv5/La1d+mRm9r5WNZCOBP4rIp4f3OKYmXWeKoH028DbEfHSYBfGzKwTVQmktwORJy9padISM7P3syqBdC/gFGAT4MaImD+4RTIz6yxVAums/H8d4FpJz0bEQYNYJjOzjtIwkEbEiUNREDOzTtUwkEp6mjREFHyM1Mysjyo/7ZcDXwdeB56NiD8ObpHMzDpLlUB6G3AgMBrYVtLNEeGJS8zMsirHSHutoSTpb5vNpOLidx8mjZ5aDBzo1ULNrFM0nP1J0mmSuiXdI+k+4O0W8jkQuA64B9i3n21HAd8iBdKPt5CHmdmwqPLT/nMR8SlYsb78wzQ/J2mVxe/Gk05q/T7ff7xvMsnTn0uTsEy6um+ntfhY7XZNcVvxteXnDbR65axnZfUys1VXlUB6raS5pIC3FnBNm3nWXfyuwnNWWFmgKT5W73lVtw2HVaUcZtacSiebiieXJE1tIZ/aQne1NezrbRtH3wXyzMxWeVUC6fmSXiQtVvcNYB5wd5P5NFz8DniW1Nt9CVjYZPpmZsOmSiC9F9gFuBm4kBTomhIRi4Fd6zxU3PY0sFOzaZuZDbcqgRRgfv4zM7OSKteRXg0g6aKIOGvwi2Rm1lmqrCJa41XnzMzqqHJB/k0AEfHXg18cM7POU+UY6e6Srsq3a7M/eRVRM7OsSiDdhd4XyA/uMCAzsw5TJZB+HPgisDbwDvBDwKuJmpllVQLp2UBXRCyXNAJ4gHQhvZmZUS2QzgVuk7SYNIyze1BLZGbWYVRl2k9J6wJjgFciopVp9NoiaSSwdOnSpYwc6auwzGxY9Ht+qOk1myR5zSYzswKv2WRm1qaqazYdAGyM12wyM+uj4cimiDgnIr4ZEadHxF7Aus1kIGmSpF9JujPPsF93u6QuSc9Iur6Fep
iZDZtmxtoDEBHfbfIl/a3FVN6+JvCdiDiq2TKZmQ2npgOppK2bfEl5Lab+tq8PHC5p+2bLZGY2nJoOpMBujZ4g6ey88mg3UDzD39+1VhERPwE+D1zZQpnMzIZNlcufJpc2TZF0PDA9Ip6o95qImAnMzK8/l/prMb1YZ/sSUs+0rjfeeKNRcc3MBsWoUaNGAm9GnYvvG16QL+k3wJzaXdLyzFtWzVzSJHrWYjoI+BJwF7BOaft5wKeBOeWrAiR5QTwzWxWMiog+PboqgfS3wFXAUuApYEZE7DkoRey/DCL1Xt8cynzNzErq9kirXEfaRTqjPoZ0dv3P8vykMyPi1wNaxH7kgv9+KPIyM2tWlR6pgD1JZ9lfAp4j/cRfFBHLB72EZmaruCqB9AbSmvPPAx8Gdo+IfYegbGZmHaHK5U8TgZ8DvwbuA8ZKmlznbH7HkXSgpBslbSxp13w8eKDSPlfSxZKmS3pI0o15+yGSHpd0SRtpHynpwXyJ2daSfpcvNUPSMZIelfTjAajDopzH5pJmS5o+gHWotU+fkW+Sji7UZ7akhZL+psn0i220XWkUXZ/yS/o7SXOazGOCpOV19vFekh6TNDvfX0PS9yV9tcn0t5U0T9J8SWPK79FSO52e8/xaE+mfK+nifPtkSdfl273aR9IduR3flLSBpC0kPS9pdIP0j5A0t9aupfJOy+U9L9+/IN8/Nt/foernsfA5PlTSI5J+LmntOnmeJemXki7M9yvVo4oqgfQqYK/810Uae1+73bEkbQicGhFHRMSrpGtYB+Q4rKStgE3y3X+MiN2AiXnnngQcBkxSmp6wFTfkE34vkw6zzI6IrvzY4cA+wFa1N1OLdVgTuCun+xF6H09vqw6l9uk1wi0H0/1Lee1Oqlczim10Nr1H0fUqf34vtNIxmA78J3338YnAccAESWOBacDCiLiwyfT3Br4D/BuwA4X3aD/ttAtwcJWEi/tA0kTSL81phbRWtE9EHAScRmrTZaS5N15rlEdE3EiKEzvndimW9wvAFNIVO+Q0dwVOzvf3ARpO2Vn8HAN7kNrodVLbl9vo0xGxE+nqoFqeDetRRZVAulkeb39ORJwDjMi3zx2IAgyjPUhvlPsk7Ub6QLwzEAlHxHPA9fl2SBoHvBAR7wJ/AkYAb5Amgmkl/ZC0HmmJ7A2BLklT8sNzgPOBB3J+rapNUnNoRMwD7i081lYdiu1D3xFu+wN3Fp4bpA/jT5vMo9hGy0t5lMt/PHB1M+lLGpNvvlxnH9fqtIh0nfQBwKm1nlATfkIKaJuSLv8rvkd7tRPwx1ynKieQy/tgKinY3ZHbrN7+PQG4PL/2MuDVRnnkL+NngF8Cf1Eqb+0yordyMFwDeA/YIOdxEWm/NbLicwzcCpwOLM31K7fRbZJmAdc1U48qqgTSfZUmFNlCUhe9Ryp1sjHApcANpA/pVSt/emvyt+KlQO2n6cXAFaQTeK+0kfQs4BsR8SvSt/qs3DvchnRMe3x+I7ckIl4m9QS/JGlC6eGBqkOfbIHPArfUNkj6IHAK0OwcD5DbiBRkinmUy/9JYH6TaR8DXJvLWN7HRUF6r+0P7Nnkz8jNScF4HeAr9H6P9mon4Huk9/EHm0i/ZgypnZ4i9dbq7d9PRURTbZSn3JwEbAQcUSpvr6eSRjTeT+99VbXstc/xPqRzORspTQZfbqMtSYcot2gyj4aqBNKjSV3ws0lB9NiBLsQw+QPp2+9t0rf4P5G+2Q4d4HwOBR6NiOcBIqIbOITUY2zpqgdJO6Sk4tGc5jLgXVIv4mhSz2Ej0vHtluXe1RJgdGl7N23WoaA8wm0CaaDGJEl7AN8Ezmu2d11qo155FMtPurRvK+AS0qi9bSpmsSXwVVKgOIvCPi7kN44UCGvvteU0twrvX5HOTzxKCnDF92ivdoqI60k/l+/tL7GVKH4WVN6/kjbNz2laDqZrAR+i935dmnui60XEsoj4PmlQzp39p9aw7GeSetmLgJ3p+1
7aLyJ+AExu57BXXRGxWv6RpgP8V9KHaWLe1j2A6XeRvtkvARaQ1rraivSteSuwZRtpf4X0k6mbFGjmk67rJb+Znsx5rNFGHn8J/ILU6xLpp930/NhA1KHWPpNIP/3uLJa3ti+AJ3I972ijjSYX86hXftKXzpwW6tFdZx/vBTxGOnYNqQPSDVzdZNpTcv0fAcbXe48W2ulI4J+BsS3sg3Gk3uB9wKhy++Tyf69OvUc3SH868CBwEz1XCNXKOy230Xn5/unAbFJgrb1+QYU6FD/HX8vv/W5ggzptNCu352XN1KPKX6U1m8zMrH+tzP5kZmYFDqRmZm1yIDUza5MDqZlZmxxIbZUiaZPGz7KVkfQBSf5sDyE3tq0y8rV9TY0wajGffxjsPIbZcfQeGmmDzIHUViWfBe4egnw2b/yUjvZj0phzGyIOpKspSQvy34x8v1tplqE5kiZKmiFpn8L9ObXn5f+XSHpC0scKac7JaRwsaaakhyVNzUOMZ5TSuVFphqFNC8XaG5ibH3+mVj6lmY/uz38b520LJJ0i6fNKs0Mdn1/377kM+5XrkR/fEdhB0mm1n7+SXpC0iaQvqGfGozmStpF0T6H8XYXy3JfLs4HS7EOPSDo8t+Mm6pkJakU9lWaheljSs5LGF9qtu9Y2dfK6QNIukg5SmtXr6/k1J+Q2ei3fX9EOEbEYGO2f90PHDb36epI08qToqNL9U/t7cUScQRo2fGCdNH4GbEeaDOOEfl5/BGk8d1dh8wTgf/LtRYXyHUaajOUaUq8V0iiryyPiStKsQcfk7X8gTZBxZrke+fjrTNJ47HeBGfnxV0jzFexMGjJaczJpeOPbpMlPag4hjXu/nTQe/XSgKyJuzo9/izTcsVzPfUmja+aRxs83MoE0adD8iLgjIj4J7JcfWxv4e9KIKuq0w8ukceg2BBxIV1/lfV8eA/4J0gQPNVMLvdERucd1HmmIXtmY/PpbSTMJQQqo1+fXj5V0FykAFV+/NrCsTnrjgBfy3/jiA7lHfVcxnUizCtUCVbEek0nDRJdExBXAbpLWAhaSgugyegLmOHrWCHsc2Bq4KN/flDQE9VjSqrcjIuKtQrF2BJ6qU88bchpT69Sxns+Qx87n3veD9ExKMprCFHB12uEdqgVrGwAOpKshSX8OFCcIHkuaw7Fod9Kxtpq7o2fO0+1JPcYzqe810iQeXRFR6yHNoafHuw9wD1CeVm4JsLmk7Ujj5GsWkya9+BB954zdPyKmFDdIWp+eAF6sx1vkaeEk1YLMhvm5o4CHgPXy9r1JY7+JiOUR8RnS5CSQltyZGRE7RcQtwHs5z5rZpF5wuZ4bkHqxVY8DXwGcJGkEaY7TKbktIM09+thK2mEMAzRFnDXmQLp6upw0K9XFpJ7iZcAFpefcQAo89TxNCqbfJf2E7CX3zhZKekDSkXVe/xApqE6n94d9AanndzFpWriaW0g/s08E/qWU1kJJv6AnwG5PmsDiR3XqcS+pV/ox0tUBP6Sn13YuvXt090bEC3XKDmkSjlPz8dMxwA+AuZJqXxRXkWbhmleq58HUn93oo6R9sjf5kEC2PJfz5Fy2eUBIOgZ4LiKKXyor2iEfG10vIv6vn/LbAPOkJashSd213mXx9nCT9BHg3Ig4oY00VlofpanbbgK+HEO0Cu5Qk7QfsGNEnD/cZVlduEe6eprez+1hFRH/TfMTLDebx5ukFXDfl0E0W580UbINEfdIzcza5B6pmVmbHEjNzNrkQGpm1iYHUjOzNjmQmpm1yYHUzKxN/w+/qgMo2pCSfQAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x29586b58978>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts = np.log(counts_rpkm + 1)\n",
"mean_log_counts = np.mean(log_counts, axis=1)\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_13.png', dpi=600) "
]
},
{
"cell_type": "code",
"execution_count": 144,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\python36\\lib\\site-packages\\ipykernel_launcher.py:7: RuntimeWarning: invalid value encountered in log\n",
" import sys\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADPCAYAAACnWlnGAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAGVlJREFUeJzt3XuUVdWV7/HvJDxtHooIREEKgfBSHn1VbFql0Ci2UWMSiSgZIkZMOrReopfcRBNC0DaORlvaV6soEDGKxkSjaWJTPoAYkDC6E4IENEJjEHlcUAKl8irm/WPtgqKgYJ06teucXfX7jFHjnH3Ofswq9pjstfdac5m7IyIitdek0AGIiGSdEqmISJ6USEVE8qREKiKSp4ImUgvamJkVMg4RkXw0LfDxWwPbt2/fXuAwRESOqsYLPjXtRUTypEQqIpKn6Ka9mTUB2gGtgJbuviZimylAW+DPwGjgNXf/fi1jFREpSlGJ1Mw2A1uB14Bdycc3H2WbnkBHYKe7P2hm04HfA0qkItKgxDbtuwO3ERJvT2DG0TZw93eBOQBmdhLhqvTntQtTRKR4xTbtbwD6Eq5KnyckxWjuvt7MegM/yy08EZHiF3tF2gHYC3QBvkEtrizdfSdQYmbNct1WROrelvJdTH39XbaU7zr6ynJEUVek7n5bPgcxs7uAUsLDpj357EtE6sbMpev4zq9WAjBxeM8CR5NtsQ+bHgROA/oDy4CO7n7q0bZz9/nA/DziE5GUjD2j60GvUnsWU4/UzBa5+1Az+w3wRWCeu5+e98HN2pCMbGrTpk2+uxMRSVPeI5tuTF7/BXgCeCDfiEREGorYRPo3ZnYu8FdgKjDEzF4zs9LUIhMRyYjYRDo8+SlNfoa5+3nJPdAGp6KigquuuoqTTz6ZJUuWUFpayqpVq/Z/P3nyZB5++GE2b95Mr1696NWrF1OnTj3ou0rz589n1KhRAEyfPp1evXoxevTog45Xuc1jjz3Grbfeyt69exk5ciTdu3dnxowDXXabNm1KSUkJZ511FgClpaX79zt58uSDjtW5c2cA1q5dy+mnn07//v1ZvXo1ADfffDPdunXj9ttvp2fPnjRr1owePXrw8ssvc80113DKKafQt29fli9fTnl5OX379qV79+788Y9/rMs/s0iDEduPdH7y6oT7BJ9PJZoa2C0v1fk+/Z5La/zulVdeoaKigr/85S/s27evxvU++eQTjj/+eBYuXMjAgQO57rrralx337593HfffaxatYpzzz2X1atX06NHj/3fl5eX8+STT7JgwQLmzZtH8+bNWb58OYMGDWLMmDEAdOrU6aBkCbBnzx6mTp3K1VdfzeGqEd5///1MmTKF9evX85Of/ITRo0ezaNEi1q5di7vzgx/8gJKSElasWEHLli256667mDt3LrNnz2bhwoWcdtpprFy5kqeffpoZM2Ywbdq0I/5dRRqj2ERayoEk6sAraQV0OEdKemlYvnw5Q4cOBaBJk3DRfvHFF9O/f3+efvrpQ9Zv3rw5AwYM4O233wZgypQpPP7448ycOXP/Olu3bmX16tX079+fHTt2sGXLloMS6b333sttt91Gu3btWLFiBWeeeSatW7fmhBNOYMOGDbRt25Z27dodcuyf/vSnDBs2DIATTjiB995776DvV61axQsvvADApZdeyvLlyznrrLMws8MmXoALLriAbdu28c4777B27VpGjRrFhg0bGD58ePTfUKQxiW3azwM2EkYmfYZ6TqT1bd++fYckmblz5/K5z32OJ5988rDbNGnShIqKCgAmTZrEpEmTuOOOO/Z/7+7079+fVatWsX79eoYMGXLQ9oMHD2bevHn7l6sff82aNZx00kmHHPe5557jyiuvBKBfv350796dvn378uGHH+4/7osvvsjq1auZNm3aYX+36srKyrj77ru5++67eeihhxgzZgyPPPLIEbcRacxiE+m/A5
8Ci4C5hAdODVafPn1YtGgREBJRpVatWrFnz6HjCSoqKli2bNlBV5jV1+3QoQObNm1i48aN7N69+5B9XHLJJezdu5eysjL69evH0qVL+fjjj9m8eTOdO3fm2WefPewV4dChQ2na9EDD4qmnnmLlypW0b98egL59+1JWVgbA7t276dOnD4sXL8bdOVLXtxYtWrBr1y4qKipo2bJljeuJSHwiXeDus4Hr3X0JsCLFmAruoosuYs+ePfTs2ZMFCxbQsWNHRowYQVlZ2f6rv0rLli3jlFNOYcSIEZx44om0b9+eKVOmMH78eCZMmLB/vSZNmnDHHXcwZMiQGpvIP/rRj7jzzju58MILKS8v59RTT+W73/0uCxYsoKysjG9+85sHrd+0adMj3pcFmDBhArNnz6ZHjx4sWbKEAQMGMHjwYHr06MFTTz11yPrdunXjwgsv5Pbbb+eGG25g3Lhx3HPPPYwbN44uXbrE/glFGpXYDvktgIFA5aVJubv/d94Hz3iH/Mr7h2+++WahQxGR9OXdIb8M+ArwOKEb1Mwjry4i0njEJtK97v5/gXXAcqAivZCyo6SkRFejIhKdSC9MXscBpwBHvjEnItKIxPYj7Wdm3wZOBDYDS9ILSUQkW2IT6UPAGOA9oCuhP2ne1Z9ERBqC2ETajjBnE4QnVyea2QwAd1czX0QatdjuT52AFlU/qnzj7u8dukXkwTPe/UlEGpUauz/VeEVqZucBH7j7KmAQYfrlpoTpmB93d80IKiLCkZ/a7wK+lby/Ffiyu58PXApMTDswEZGsqPGK1N1/a2anm9mrQC/gkaTYRSfgqHXtzGwK0BZYC3wVWO/uI+siaBGRYhJ7j/QEwvDQncBWd6+5SGdYvyfwf5L1v+3ubmZLgaFVZxHVPVIRyZC8h4iOBGYDc4BXzGz8kVZ293eTdUmSaCdgnaZiFpGGKLb70xh3HwJgoX2/BHgwZsNk/QeAb9cqQhGRIhebSGeb2Xxgb7LN7ByO8SVgaT7dpEREilnUPdJa7TjMMHo5oaL+OcA2Qj3Td6uso3ukIpIVNd4jTS2RxlAiFZEMyb1D/kFbmw0GjiMMFW0FtHT3GUfeSkSkcYi9R3o/IZHOBRYDH6cWkYhIxkR1f3L3s4GLgQ6ESlDr0wxKRCRLYpv2LxLK571KKO58TJpBiYhkSWyH/PuBHwMrCRXyz0wtIhGpF1vKdzH19XfZUr6r0KFkXmwifYtwj7SVu/8bEWPtRaS4zVy6ju/8aiUzl64rdCiZF5tInwM2AVcnyw+kE46I1JexZ3TlXy7py9gzuhY6lMyLfWq/FegPHGtmE9FTe5HM69C6BROH9yx0GA1CbCIdTZijaQ1h8jv1IRURScQm0pfcvbRywcxeA85LJSIRkYyJTaTvmtkswiyi3YD3U4tIRCRjohKpu19vZt0I89pvcvc16YYlIvXBbnkJv+fSQoeReVFP7c3se8B04IvAHDN7JNWo5BDq8ydSvGK7P10MjAAuc/czCU/wpR6pz59I8Yq9R7qHMDzUkgdNe9MLSQ7nsn6dmL96K5f161ToUESkmth7pHpCX2Av/mkTc1duprTH8UzspNqtIsUk9opUCqxy9IlGoYgUn9jqT+0I04Z8FtgA/NLdt6UZmBxMo1BEilfsw6ZfJK9Lk9fnj7aBmU0xs2lm9hkzm2lmE2oVoYhIkYtNpAbsAyqS1xrnLgEws55Ax2TxTHQLQUQasNhE+hWgOfB3QDNCM79GyUyhc5L3iwlP/EVEGqTYRHoj0IWQTLsCN6UWkdTIblEZWJFiFNvkvhyYwIEmfeHmcBYRKTKxifQ4oDR5b4REujCNgEREsiY2kV6b647dfT4wP3k/K9ftRUSyInZk04K0AxERyarYh037mZnGJ4qIVBE7sulBoC9hmpHjzWynu6uIoYgI8Vek/ZLCJb3d/QKgXYoxiYhkSmwi3ZiUz9uWvK5PMSYRkUyJfdh0lZmVkBQtcfe1KcYkIpIpsfdIZy
Zv3we6mJm5+7WpRSUikiGx/Uj7AVe4+zoz68qBalAiUmTaf/9lPvp0T/T6uQw9Pq5VMz6846LahNWgxSbSm4A7zawjsIkw9l5EitBHn+5JbWZQ1Xs4vNhE2gJ4rMpy8xRiERHJpNhEOjx5HQPMSt5rrL2ICPGJdCahWMlFHEikIiJCfCL9EaHi06oq769LKygRkSyJTaTjgXOAVu7+gpkdl2JMIiKZEjuy6SWgD3BrsvxMOuGIiGRPLtWftgBNzWwk8JmU4hERyZzYpv0o4BLgZ0Bb4MupRSQieVmx5TL+NCalfQOaaehQsYn0maT6k9SxtEahaARK49W/w4updshXGj1UbCL926TqEyRzNimx1o20RqFoBIpI/Ymt/nRsrjs2symE2wCPArMJQ0u/4O76D01EGpScpxqJYWY9gY7J4ihgEiGRDkrjeCIihRSVSM3s5Oo/R1rf3d8F5iSLJwIbgQ3ASXlFKyJShGLvkT4J9AZeBiqSz2ozsknNehFpcGLvkZ5rZucAVxCKO8/O4RgfAJ0J1fU/yDlCEclZWg8bj2vVLJX9Zl0uFfKd8MS+HzAFuCHyGHOAJwgzkC6rRYwikoNceoHYLS+l1lWqMYlt2k/mQCKFiCa6u88H5ieLp+cYV6ORVudpdZwWqT+xiXQ2dXOPVKpJq/O0Ok6L1J+op/bufi7h/ug2YCUHipeIiDR69XGPVESkQYtt2v/O3f+9csHMbkopHhGRzIkd2XStmXWzoAT4WnohiYhkS+wV6T8SntyfiKZjFhE5SGwi/aK7j61cMLM7gSXphCQiki2xifTzZvY68B7QDVAJvTqUxigUjUARqT+xifQqYCxhmOdmYHRqETUyGoUikn2xD5vaAS2Bbe4+CVVxEhHZLzaRTieMmR+aLP8wnXBERLInNpGuBP4J6GJmDwL/k15IIiLZEltGb6yZdQdmAJvcfU26YYmIZEfsENExwNeBvUAzM3ve3f811chERDIi9qn99e5+TuWCmS0GlEhFRIhPpLurTMcM0C7pV6ppmUWk0Yu9R3p+2oGIiGRV7D3SPxFmAoVQSk9XoiIiidim/aZ8EmfyxP9ZYDtwsbvvqu2+RESKTWwi7VztHik5JtYvAQ8BJcDZwKs5bCsiUtRiE+l17r64csHMSnM8zj6gBfAR0DHHbUVEilrsyKZ7qi3/c47HeQb4KmF01NYctxURKWqxV6Szku5O+5JtnsrlIO6+wczOB/4D+G1uIYpIWlRNrG7EXpHOdffh7n6+uw8DVudyEDM7Cfg58IC7f5xrkCJS997etIMvPLaEtzftKHQomRd7RXqnmX1AuKL8PrCYMMd9FHdfD3w59/BEJC3feO6PLFjzIR/v2sv88X9f6HAyLfaK9FVCTdKfA2WAipaIZJxXe5Xai02kEK5CJxIq5ItIxj16xQAu7tuRR68YUOhQMi+2ab8AuI7QdWk9MCutgESkfvTu1Ib/uH5IocNoEGKvSGcDc4GBwOvk+NReRIrPlvJdTH39XbaUa6BhvmIT6e/c/U3gXnd/A/h9ijGJSD2YuXQd3/nVSmYuXVfoUDIvtvrTLcnrs8nrTWkGJSLpG3tG14NepfbMvXDP7MysDbB9+/bttGnTpmBxiIhEsJq+iC2jdx4wkjBe3oCN7v69uolNRCTbYu+R3gvcB/QGJgOlKcUjIpI5sYn0GXdfCbwF3A/8v/RCEhHJlpzvkZrZycA6r4Obq7pHKiIZkvc90geBvoRRTccDnwKX1UloIiIZF9u075dUxO/t7hcAx6YYk4hIpsQm0g3JVCPbktf1KcYkIpIpsWPtX3T3OalGIiKSUbGJdJKZNa/6gbs/kUI8IlJPtpTvYubSdYw9oysdWrcodDiZFtu0tyqvxhGeXolINmisfd2JvSJ9ouoVqJn1TikeEaknGmtfd6L6kZrZG+5+dpXlBcncTfkdXP1IRSQ7amyJxzbtXzOz/zSzJ8xsHqFafvzRzfqb2WIze9PM2ueyrQ
SqHSlSvGLL6E0ys7bAMcA2d9+Z43HOA34MnAv8LfBKjts3epX3swAmDu9Z4GhEpKrYkU0PE6YZ6eHuA81suruPy+E4LxGKnTQF3sg5StH9LJEiFtu0HwjcCHxoZh2BXGfLOgnYCDQHOuS4rQAdWrdg4vCe6qYiUoRiE+kE4E5gFzAVGJ/jcb5KmMZ5KTAix21FRIpabPenFsAMwhTYRrhXmotfEMrv7QQuz3FbEZGiFptIhyevVwM/Td4vjD2Iuy8g99sBIiKZEJtI5wNtgPPdfUp64YiIZE9sIi0l1CD9enqhiIhkU+zDJoCWwFVm9kMzm5RWQHJ46pAvUrxiE+k64CKgGaGZPz+leKQGKjAhUrxim/YVwHTC9CJfAmaRw8MmyZ865IsUr9iiJcM40PUJwN0970SqoiUikiH5TX4HVK30ZISkqitSERHiE+nlhNFNoKLOIiIHiU2kxxCKjhiwErgnrYBERLImtoxen8r3ZnYa8Ajw+bSCEhHJktgyeiXANcCJwAYglxJ6IiINWmzT/kngNuB9oAthvP3QtIISEcmS2ET6EaG6/Tqga7IsIiLEj2z6ErAI2A38ltAxX0REiE+kTwGbgaeB3sBrqUUkIpIxsYl0PHADYb6lPRyoTyoi0ujFDhF9vdpH7u7n5X1wDREVkezIb4iouw8HMLPz3f3VuopKRKQhyKUeKYQuUDkzs38ys/lmttbMvlGbfYiIFKuoRGpm1ydvr6vNQdz9AXcvBd4B5tRmHyIixSq2H+l4M3sHwMxOBsi1jJ6Z9QPWuPtfcwtRRKS4xTbtXyDM21RKeGJfWotjXQb8shbbiUgKNH1N3YlNpP8GvEfo+rQWuK8WxyoFflOL7UQkBZq+pu7ENu1/DjwB/I4w1v55cu9L2sndy3PcRkRSoulr6k5sIjVgH2Hupn3Uorizuw/OdRsRSU+H1i2YOLxnocNoEGKb9l8BmgN/R5hJ9PLUIhIRyZjYRHojoUnfnFD96abUIhIRyZjazNkEmrdJRGS/2ETaCs3ZJCJyWLFj7ftWvjezAdTxnE07duyoq12JiKSibdu2bYByP0ylp6jqT2kxs88CHxQsABGR3LR190Ou/AqdSA3oDKh/qYhkQfFdkYqINAS5ltGTDDKzZmbWsdBxSMNmZm3MrG2h4ygEJdLG4e+BWwsdhDR4twB5z5yRRUqkKTCzIWb2JzNbbGY3JJ/dkywvNLP+ZlZqZtOqbPNjM1tqZv9a5bPvmdmsGo4xKymWfW2y3N3MfmFmZ5nZOWb2GzP7LzPrkO5vK3XNzH6WFEF/KzkvnjGznmY2wMweMbMSM/uzmTU3s0GV50hyfr1mZi+bWf/kszPM7A0zW2Bmw83sWjP7Q9Vzr8pxjzOzXyfnYffksz8k59mg5LxdbGZvmln75Dz/vZk9V69/oGLk7vqp4x9CpatphO5ly5PPZgGDgLOBxyrXqbJN5f3qPySvrYFfA7NqOMYT1ZZ/ARxXbV93E7qplQLvA/8NDEu+W0mYYvubhf576eew/76TgcuT96cCDwCPAicDJcAG4ObknJpV7dzpDryevF8IfLbKfr9eeQ4c4di3AtdVP88IIxwvq3Je3UcoXvQToEcS81vJedUJGAysABYDQwr9N03zR1ek6WoF7Kz2WfvDfIa7u5mdBixJPrqGcILWpImZTTazY8ysFTAMeMnMhiX7MqAfoWIXwHPAV4FvJ8u7gHMATf1S5Nz9LaAtsNvd/5J8vAQ4Hfibw6z/P0ArM/sMcKy7b6jydUtgnJmVHO5YZjYZ+Efg5eSjfmb29eT9S8CXCT1t3iAUMGoBfARU3oP/PjADuApoB5QRzuXv5PI7Z40SaXquAN4mXJlWegwYB9xVfWUzawn8mAMn3BnAmzXt3N2/Bqwn3Jc6jlDr9XoODOW9FXjU3bdX2ew94LNV9lEBfJIkYilunYHPVfvsAY7+H+Geqgvu/iAwBZh+uJXdfTJwJ6FAEcCZQKmZnQOcBGwk1N
zokOzjNmAksLXKbg46zwg1jLscJc5MUyJNz3PAlcBFVT673t0vdff3D7P+twjNqL+a2TFAT+B+YJiZ9a7hGO8TrnA/JFxp7CR0z+0MnOHuL1Rb/5RkG2B/P95j3f3T3H89qS9m9gVgPvA7M7ug8nN3X0RIsNXXLwF2Jv9RbjOzXtVW2UC4WqzJFsItA9x9H2HQTHtCi6YMWAqMcPcVhCb+CuDPVbY/6DwDugGbj/JrZlrsWHupBXf/jZndUlMzCrjczPoAzxDuNZ1gZt8CLnH3c5LtJrv722a2ALjQ3XeZWWtgLqFZdY277zSzPxPuw95LeEo/yMzmE65aVgMXEK4y/ndy7B6E5uHTdftbSwrGE5rKzQnN5hurfHcfofUDsMPMXgW8yjq3ALPMbBdhposvAgOBfwaodl71Bh4GjgW+ZmZnE1pPO4AfAtsI/7nvJJy7AwlXt7cmt5M2ApOS9UYCpxFKcJbSwJv26pCfEWb2a3f/hzrc3x/cfVBd7U+yqa7Pq2r7LiU8MJtwtHWzTk37DDCzZoT7ViJ1RudV3dEVqYhInnRFKiKSJyVSEZE8KZGKiORJiVREJE9KpCIieVIilUxKqmetTSoT/a9CxyONmxKpZNksdy8FRprZEjO7KEmwk5NSc7MqXys3qPysQPFKA6VEKpmWFHsZQKhpcG1ho5HGSolUsq49oe7l84SybhAS6pwq61ygWwCSJiVSybptwFJ3L3X3q5PPZgGjqqxTRiiaMbaeY5NGQolUMs3dPwGWJVO4XHmYVXYDQ4GHgF/Wa3DSaGisvYhInnRFKiKSJyVSEZE8KZGKiORJiVREJE9KpCIieVIiFRHJkxKpiEie/j+JtZC9C2wM6wAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x2040f5df7f0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"gene_idxs = np.array([80, 186])\n",
"gene1, gene2 = gene_names[gene_idxs]\n",
"len1, len2 = gene_lengths[gene_idxs]\n",
"gene_labels = [f'{gene1}, {len1}bp', f'{gene2}, {len2}bp']\n",
"\n",
"log_counts = list(np.log(counts[gene_idxs] + 1))\n",
"log_ncounts = list(np.log(counts_rpkm[gene_idxs] + 1))\n",
"\n",
"ax = class_boxplot(log_counts,\n",
" ['сырые количества'] * 3,\n",
" labels=gene_labels)\n",
"ax.set_xlabel('Гены')\n",
"ax.set_ylabel('лог-количества экспрессии генов по всем образцам')\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_14.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 145,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\python36\\lib\\site-packages\\numpy\\lib\\function_base.py:4291: RuntimeWarning: Invalid value encountered in percentile\n",
" interpolation=interpolation)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1856: RuntimeWarning: invalid value encountered in less_equal\n",
" wiskhi = np.compress(x <= hival, x)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1863: RuntimeWarning: invalid value encountered in greater_equal\n",
" wisklo = np.compress(x >= loval, x)\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1871: RuntimeWarning: invalid value encountered in less\n",
" np.compress(x < stats['whislo'], x),\n",
"c:\\python36\\lib\\site-packages\\matplotlib\\cbook\\__init__.py:1872: RuntimeWarning: invalid value encountered in greater\n",
" np.compress(x > stats['whishi'], x)\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADQCAYAAABV2umIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAGWZJREFUeJzt3Xt4VdWZx/HvK0KCELAUIQqWBAOFUEALolCUQMUCtZRSGHHsCIq0Xp7W+4ziDMUb4DNqlbGOYyuDzjjVqlweNYAdNRWv5SKCFgoUsaWBpozckoBgeOePvQmHkJAdTk5Odvh9nofnnL2yz17vCZuXtddeey1zd0RE5PidlO4ARETiTolURCRJSqQiIklSIhURSVIsEqkFsszM0h2LiEhVJ6c7gIhaA7t3796d7jhE5MRVY0MuFi1SEZHGTIlURCRJSqQiIklSIhURSZISqYhIkpRIRUSSpEQqIpIkJVIRkSQpkYqIJEmJVEQkSUqkIo1AcXExubm5dOvWjSuuuIKKigq6du1K9+7dGTVqFLt27WLz5s2cf/75APz85z/n1ltvBeCss87i7bffrjzWxRdfzFNPPZWW73GiqnMiNbNTzKxdKoIROVHt37+fjh07smHDBg4ePMiCBQsoLy9n/fr19OvXj0cffbRy388++4zHHnuMadOmAVBWVsavf/1rALZv384777zDrl270vI9TlS1JlIzW2Nm75rZfDObB/wn8LPUhyZyYurfvz8bNmyo3B40aBCbNm2q3J46dSq33XYbbdq0AaBz58588MEHuDsvvfQSI0aMOCqRTp8+nccffxyA7OxsAFavXk1+fj7nnHMOf/rTnwDo2bMneXl53HjjjQB88skntGrVin379rF582aaNWvGwYMHKS0tpWfPnuTm5rJ69WoAioqKyMrKomPHjtx+++0AFBQUAHD55ZdTVFTE/fffT9euXRk7diwHDx6sNoa8vDzy8vIYMmQIZWVlFBUV0bVrVwYOHEh5efkRLfNJkyaxePFiioqKmDBhQmWd69atY+7cuZVxAEd8rrCwkO7duzN8+HAOHDiQ1N8XRGuRfhd4BdgOvAXc6e4Tk65ZRI7yxRdf8MYbb5CXl1dZtnjxYnr06AHAxo0bWbZsGRMnBv8E9+3bR2ZmJl//+tdZuXIlixYtYuzYsUSZKe3ee+/lkUce4eabb+ahhx4CYMeOHWzYsIGVK1eyZs0adu3aRatWrXjttddYuHAhp512Grt376Z169asXbuWGTNmMGfOHAAqKioYPXo0999//xH1rFu3jnnz5gFw6623sn79elatWkVJSUm1MZSWlrJx40YqKirYvHkzBQUFbNq0iT59+vDqq68m+RsOTJ8+neXLl9OhQweWLl2a9PFqnUbP3TcB95rZWcBU4BozG+buW5KuXaQRslteqvdj+oPfqXWfDz/8kDPPPJPevXszbtw4JkyYQI8ePejfvz8zZ86kpKSE9u3bU1ZWxsqVK+nXrx9lZWVkZGQwatQoFi5cSGlpKdnZ2ZSVlfH8889zxx13cNtttwFB8nj44Ycr6/v4448ZMGAA27ZtO6JP1cwYOHAga9asoXPnzowaNYolS5ZQXFzM0KFD2bNnDzt37mTChAls3bqVoUOHAkEXQ9u2bY/6Xg8++CCXXnopAJs2beL888+nf//+ZGdnVxvD9u3bOf300+ncuTP5+fksWbKEm266iZKSEgYOHFj5u+rRowdbt26tbIkWFhbSo0ePypYtwJNPPklhYSGzZ88mJyensnz9+vUMGDCA8vJyxowZE+nv8FhqTaRmdj0wFFgNPODuk5OuVaQRi5L0UqFv374UFhbSv39/9u/fT/v27Vm3bt0R+5x66qk88MADTJ48mRUrVrB3715atGjBhRdeyJVXXsm1115LRkYGe/fuZfz48YwfPx4Ikuj06dO55pprKi/tIUia1TnppJOoqKigtLSUnJ
wcVqxYwSmnnEJGRgbl5eU8+eSTTJw4kdzcXJ599lkgSJKdOnU64jglJSV06dKF3NxcALp160ZxcTGjR49mx44d1cbQvn17tmzZwsSJE3n55Ze55557ePHFF49I9n379uW9995j0qRJlWWjRo3i2WefrexOAJg8eTKXX345U6ZMqYwToG3btkf9bpMR5dK+C7ATyAfuM7OFZvZyvUUgIpXatWvHxRdfzPz582vcZ/DgwfTs2ZM5c+awd+9eTj75ZDIzMxk8eDAXXXQRmZmZ7N27t9a68vPzWbZsGcuXL6dXr15H/Gz58uV069at8jhdunRhwIABQNCdUFFRQWZmZuX+Bw4cYP78+ZWt00PWrl3Lj370o8rt8vJyMjIyyMjI4JNPPqkxhmbNmtG8eXM+//zzo+qqq5YtWx7VD9q+fXtWrVrF/v37j/u4iaLMkH+fux/Rc21mw+qldhE5ypQpU7jrrruOuc+0adMYPXo05513Hs2bNwfgueeeA+Cjjz5i3759tdZz5513ctlll9GiRQsWLlwIBJfVOTk5nHvuuZx33nk888wzlfVlZGRwww03UFZWxpQpUxg3bhy7du1i4sSJ3HfffeTk5DBo0CDWr19fWUevXr0YNGhQZd/mT3/6U+bNm0fv3r3p27dvtTG0bt2azp07k5uby8iRIwH45je/yf79+ytvmEXRtm1b5s6dy/PPP39Uv+2sWbMYO3YsmZmZLFu2jFatWkU+bnXM3Y+9g9krwJPuPs/MsoF/B7a6+3VJ1VwHZpZFuNRIVlZWQ1UrcsLJzs5m27Zt6Q6jsapxqZEoLdLvALeY2a+ADsAN7v5RfUUmIhJ3UVqkb4Rvc4HPgF2Au3uDXd6rRSoijcDxt0jdfWht+9Rau1lHYAHwBTDK3ffUUDYQ+B93z022ThGRhhLlyabrzewtM3vbzL5/nPV8G/hv4H+Bi45RdjWw9TjrEBFJiyjDny5z98EEY0l/cpz1nAFsI0iSnaorM7N8YANQP+MRREQaSJRE2sHMXgeWAH3M7I1w+3hV1ynrwERgThLHFRFJiyh9pN3roZ5iIBtoT/CEVHVl5wOzgXwz+5671zwiWUSkEYly134wcBOwG5gJDAdGu/u3Ildy5I2lJ4CNwCaq3GwK9y1y94Iqn9ddexFJtxrv2kdJpG8DY4B2wFLgn4Cn3b2iPiOsJQYlUhFJtxoTaZQ+0jbAtcAEoAw4E7izfuISEYm/KE82TQZahu+LUheKiEg8RbnZ9LuGCEREJK60+J2ISJIiJVIzm2ZmS8xsTbh9b2rDEhGJj6gt0pHAKOBvZtacYAiUiIgQPZHeDxQCXwZeBO5JWUQiIjET5a49wErghwnbxx58KiJyAomaSN8DFlcpu6qeYxERiaWoifRzgpmZdgHLNSRKROSwqH2kk4BlQAlwqZnNTVVAIiJxE7VF2i5hNqYXzOxfUhWQiEjcRG2R3m5mGQBm1hK4JHUhiYjES9QW6d3AfDNrQTDt3azUhSQiEi9RE+lioJRgeZCtBNPpiYgI0S/t5wMFwCnAEGBhqgISEYmbqC3SLwOvA38mmI90RMoiEhGJmaiJ9B+AK4EOwF+Bv09ZRCIiMRM1kQ4mWGdpY7h9AfBJSiISEYmZyMOfCJ6v1zP2IiJVRE2kVuVVRERCURPpLJRERUSqFamP1N2fSnUgIiJx1SBrNplZRzN718yWhmvUH1VmZjea2Ttm9nxDxCQiUl/Mvfb7R2Y2myDpZhEszZzp7qMjV2J2Vfi59sBqd59ftQxY4O5uZsuAQe5+IOHzWcDu3bt3k5WVFf3biYjUnxq7N6MOf1oNDAD+D3iHoyd5rs0ZwFrgANCpurIwiXYE/pyYREVEGruol/YLgHcJWpBjCJLg8aquCexmZsCjwE1JHFtEpMFFTaQrgZ8QPGt/gGBcaV0UA9nA6eH76sq+Byxz90/reGwRkbSK1EdaubNZM6AVkOHuf6vD5zoStGq/AJ4geEJqU0LZKGAGwR
NTO4Gr3X1jwufVRyoi6VZjH2nUm03DgHFAZlj0V3e/o35iq50SqYg0AjUm0qiX9j8D/g34KnAXwZR6IiJC9ET6nLuvBT4iSKglqQtJRCRe6tRHCmBmpwFfI5iX9LcNcXNIl/Yi0ggkN47UzN7g8LAlA3oBtyYfl4hI/EV91n5o4raZzXf3p1MTkohIvBxvi/S0lEUkIhIzUR8RvQho6e6lZpYLbElhTCIisRI1kb4K7DezNgTP3X8F+HbKohIRiZGow5+auftI4C/ufi2HB+aLiJzwoj7ZdIa7Fyds5yU+wplqGv4kIo1A0tPoTTSzEQRzh5YAFQT9piIiJ7yol/Zj3H0IwXykw4DWqQtJRCReoibSu8PX/wLeAl5JTTgiIvETtY/UCKa4OwP4G/BmQ85irz5SEWkEkp796TlgOMEl/YXAonoISkSkSYiaSHOA3wDrgdeBL5vZhWZ2YaoCExGJi6h37edw5BykC4ChBI+NvlnPMYmIxEqNidTMOgDl7l4KPAN8n2B9pW3AQnf/rGFCFBFp3I51aX8x8Hfh+18RJN3fAXuBF1Mcl4hIbBzr0v5FYJ6ZtSVoiW4muGvVjmDBOhERoZbhT2Z2EjASyCd4vn4f8EdgsbuXN0iEaPiTiDQKxzf8yd0PuvsrwAvhvnlAb6BjvYYnIhJjUe/a/xdwJ8E8pJ0Jbj4NSlVQIiJxEjWR7iAYiP9ngkXvdqQsIhGRmIk6IP97wDvA/vD1u3WpxMw6mtm7ZrY07O88qszM8s1shZkVho+kiojEQp2XYz6uSsyuAloSTMO32t3nVy0DzgHeB8YBs939g4TP62aTiKRb0s/aH3m04G5+XZxBMJB/K9CphrLq9hERafSiriK6mmBC578QtCJPBsYeZ53VNYGrlqW+mSwiUk/q0kdaCJQStBpvqWM9xUA2wcD+4hrKqttHRKTRi3rXvg+QCxwEfg/srGM9rxBMdPIFsMnMBlYpewj4A/A0Qcv3wzoeX0QkbaJO7PwMwVNNLQjmJG3u7pekOLbE+nWzSUTSLbnF79z9cjNrCbR19231FpaISBMQqY/UzB4neJppSbj9i1QGJSISJ1FvNvUFfgx8Fs5T2id1IYmIxEvURHojMAP4HPhX4LqURSQiEjNR79qPcveJhzbMbAawIjUhiYjES9REepGZvQF8CnQBhqUuJBGReImaSC8DriR4jPOvwOUpi0hEJGai9pG2JRgoPxtoBXRLWUQiIjETNZH+giCRFgHzCW48iYgI0RNpS+BrBM/aDyN4wklERIj+iGg/gkdDD9nj7itTFtXR9esRURFJt6QfEdVQJxGRGkSeoNnMmptZtpk1T2VAIiJxE3Vi5xlAP4IbTh3MbKW735HSyEREYiLqONJhwIXuvt/MMoClKYxJRCRWoibSWcBLZtaMYCLmWakLSUQkXqIm0pXADxO2taaSiEgoaiJ9D1gcvjeCRHpVSiISEYmZqIn0U2A6h8dRqUUqIhKKmkjXESRSUItUROQIUQfkX5nqQERE4irygHwREale1AH5vYBvA5lh0XZ3fyxlUYmIxEjUFukzwBpgNPBbYFKqAhIRiZuoifRNd18EbAK+D+yNWoGZ5ZvZCjMrNDOrqdzMZprZMjN7qG5fQUQkvSIlUnf/Sfj2B8DTwMg61DEBmEawRMnZxyif6u7novWgRCRmIiVSM7vDzF4F7gYeA35WhzrOALYBW4FONZW7u5tZb+D9OhxbRCTtIi/HDFwIfOzu+Wb21rF2NrPbgRHh5leAfw/f1zSQ380sE5iJFtYTkZiJ2kd6AHgNMDN7nWDikhq5+yx3L3D3AuC/gWzgdKA4YbfiKuXXAU+7+646fQMRkTSLutRIgbsXHVcFZvkE/aolwCXA9cAignWfEssXAqcB+4BL3L004RhaakRE0q3GpUaiJtKVwI2JZe7+ZvJxRaNEKiKNQHJrNgFfAgo4ctKSBkukIiKNWdRE+oS7zzy0YWZtUxSPiEjsRL3Z9K
0q2/PrOxARkbiK2iLdaGZzCeYl7QJsSVlEIiIxE3UavavNrAvBUKW/ufsfUxuWiEh8RH2yaRrwBPBLd/+jmd2b2rBEROIjah/pSIKnm0rMrDkwPHUhiYjES9REej9QCHwZeBG4J2URiYjETF2XY3YOr9kkIiJET6R3ha/fBP43fK/F70REiJ5IpwNZQDd3VwIVEUlQl0S6F/jH1IUiIhJPURNpEUG/aJ6Z5QG4+9OpCkpEJE6i3rU/C/hnoHsKYxERiaWoifQ14GYgH7gUKE9ZRCIiMRP10j6H4NJ+Ybh9SkqiERGJoagtUo0fFRGpQdREejuHk+ihpCoiIkS/tD+DYHXPXcBy4Bcpi0hEJGaiTqN3qpmdDJwKDCSY2LkghXGJiMRGpEQaLi3yXYKW6VbgH1IZlIhInETtI51H0C+6LNzWYHwRkVDURGrAQaAifNXNJhGRUNRE+n2gBTAIaA6MiVqBmeWb2QozKzQzO1a5mV1mZkXRwxcRSb+oifRu4GV3n0HQR/rrOtQxAZgG/BU4u6byMJmOrMNxRUQahaiJ9HHgYTNbBHwDGFeHOs4AthEk4E7HKB9JMAu/iEisRE2kjwLZBMnvG8CCY+1sZrebWVF4mT4s4Uc1PRnlBKMC5kWMR0Sk0Yg6jnRo4raZNatl/1nArHDfuwmS8OlAccJuxVXKuxCMBsg3s8Hu/lbE7yAiklZRW6RV3VX7LpWeDffvCHxoZj8O5zQ9otzdR7j7BOD3SqIiEifmXvs8JGZ2tbv/sgHiqan+LGD37t27ycrKSlcYInJiq3HYZ9Rn7a83s/WJBe7+ZlIhiYg0EVET6QIOP1t/aDo9JVIREaL3kT4CfAocADYDs1MVkIhI3ERNpC+Gr78LX+enIBYRkVjSs/YiIkmq67P2A6njs/YiIk1d1ET6Y6AzQTI9E/hJyiISEYmZqHftxwA3Jmzr0l5EJBQ1kbYEphMk0LXAg6kKSEQkbqI+a9/z0Hsz6wP8B3BRqoISEYmTSI+IppseERWRRqDGLs3jnbRERERCSqQiIklSIhURSZISqYhIkpRIRUSSpEQqIpKkqAPyG4U9e/akOwQROUG1adMmCyj1asaMxmUcadWF80RE0qGNux/VootLIjWCFUdL0x2LiJzQ4tsiFRFpzHSz6QRnZs3NrEO645CmxcyyzKxNuuNoKEqk8g1garqDkCbnFmBYuoNoKEqkDczMzjOz35vZu2b2w7DswXD7TTPrZWYFZvZwwmdmmtkyM3sooewOM5tbQx1zzazIzCaF27lmNs/MzjezC8xsqZmtMLP2qf22kiwze97MNpvZR+F58JyZ5ZlZHzP7DzPLMbMNZtbCzM4+dE6E59PrZrbYzHqFZeea2Vtm9lszG2pmk8xsVeK5llDvl8xsUXje5YZlq8Lz6uzwPH3XzN4zs3bhef2Bmb3QoL+gxsLd9acB/xAsa/0wwdCzNWHZXOBsYDDwy0P7JHzmUF/2qvC1NbAImFtDHU9X2Z4HfKnKsR4gmAqxANgCrASGhD9bC7wDXJPu35f+OARzAY8J338NeBR4AvgKkANsBW4Oz6G5Vc6VXOCN8P2bwOkJx5186O/8GHVPBa6qel4RrJoxOuE8mg0MBZ4Czgpj/ig8jzoC5wAfA+8C56X7d1rff9QiTZ+WwL4qZe2qKcPd3cx6A++HRVcQnLA1OcnMppvZKWbWEhgCvGRmQ8JjGZDP4VVhXwD+Drgp3P4cuAD40XF8L0khd/8IaAPsd/c/hcXvA/2BVtXs/wnQ0syaAae6+9aEH2cCU8wsp7q6zGw6cC2wOCzKN7PJ4fuXgLEEo2neIlgUMwPYARzqc/9nYA5wGdAW+A3BufuPdfnOcaBEmh7jgD8QtEwP+SUwBZhVdWczywRmcvgEPBd4r6aDu/sPgL8Q9FN9CVgKXM3h5WKmAk+4++6Ej30KnJ5wjAqgPEzE0rhkA92rlD1K7f/xHUjccPefA3cDv6huZ3efDswgWPQSYA
BQYGYXAJ2AbQTruLUPj3EnMB74v4TDHHFeAZsJ1n9rUpRI0+MF4FJgRELZ1e7+HXffUs3+1xFcVu0ys1OAPODfgCFm9tUa6thC0ML9jKDlsY9gSG42cK67L6iyf9fwM0Dl2N1T3X1v3b+epIqZfRsoAn5nZsMPlbv7OwQJtur+OcC+8D/GnWbWrcouWwlaizXZTtBlgLsfJHgwph3BFcxvgGXAt9z9Y4JL/I+BDQmfP+K8AroAJbV8zdiJ1SOiTYm7LzWzW2q6rALGmFkP4DmCvqfTzOw64BJ3vyD83HR3/4OZ/Ra42N0/N7PWQCHBZdYV7r7PzDYQ9MP+jOAu/dlmVkTQivkjMJyg1XFDWPdZBJeLv6rfby314HqCS+UWBJfNP0742WyCqx2APWb2GuAJ+9wCzDWzz4FHgO8CfYH7AKqcR18FHgdOBX5gZoMJrpb2AD8FdhL8Z76P4FztS9C6nRp2H20DpoX7jQd6EyzrXkATvLTXgPwmwMwWufvIejzeKnc/u76OJ/FQ3+dRlWMXENwwu7G2feNIl/YxZ2bNCfqxRI6bzqPkqEUqIpIktUhFRJKkRCoikiQlUhGRJCmRiogkSYlURCRJSqTSZISzZm0OZyjql+545MShRCpNzVx3LwDGm9n7ZjYiTLDTwynn5h56PfSBQ2VpileaACVSaXLCSV76EMxlMCm90ciJQIlUmqJ2BPNfzieY3g2ChPpswj7D1QUg9UWJVJqincAydy9w978Py+YCExL2+Q3B5BlXNnBs0gQpkUqT4+7lwIfh0i2XVrPLfmAQ8BiwsEGDkyZJz9qLiCRJLVIRkSQpkYqIJEmJVEQkSUqkIiJJUiIVEUmSEqmISJKUSEVEkvT/9vWV5E7muhgAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x2040f2475c0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ax = class_boxplot(log_ncounts,\n",
" ['RPKM-нормализованные'] * 3,\n",
" labels=gene_labels)\n",
"ax.set_xlabel('Гены')\n",
"ax.set_ylabel('лог.количества по всем образцам после RPKM');\n",
"plt.tight_layout()\n",
"#plt.savefig('pics/1_15.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}