@capissimo
Created April 1, 2018 00:53
Chapter 1 from Elegant SciPy (FINAL ver.)
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 1.\n",
"# Elegant NumPy: The Foundation of Scientific Python\n",
"> [The NumPy library] is everywhere. It is all around us. Even now, it is right here with us.\n",
"> You can see it when you look out your window or turn on your television. You can feel it\n",
"> when you go to work, when you go to church, when you pay your taxes.\n",
">\n",
"> Morpheus, *The Matrix*"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def rpkm(counts, lengths):\n",
"    \"\"\"Calculate reads per kilobase of transcript per million mapped reads.\n",
"\n",
"    RPKM = (10^9 * C) / (N * L)\n",
"\n",
"    where:\n",
"\n",
"    C = number of reads mapped to a gene\n",
"    N = total number of mapped (aligned) reads in the experiment\n",
"    L = exon length in base pairs for a gene\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    counts: array, shape (N_genes, N_samples)\n",
"        RNA-seq (or similar) count data, where columns are\n",
"        individual samples and rows are genes.\n",
"    lengths: array, shape (N_genes,)\n",
"        Gene lengths in base pairs, in the same order as\n",
"        the rows in counts.\n",
"\n",
"    Returns\n",
"    -------\n",
"    normed: array, shape (N_genes, N_samples)\n",
"        The counts matrix, normalized according to RPKM.\n",
"    \"\"\"\n",
"    N = np.sum(counts, axis=0)  # sum each column to get\n",
"                                # the total read count per sample\n",
"    L = lengths\n",
"    C = counts\n",
"\n",
"    normed = 1e9 * C / (N[np.newaxis, :] * L[:, np.newaxis])\n",
"\n",
"    return normed"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction to the Data: What Is Gene Expression?"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"gene0 = [100, 200]\n",
"gene1 = [50, 0]\n",
"gene2 = [350, 100]\n",
"expression_data = [gene0, gene1, gene2]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"350"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"expression_data[2][0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## NumPy N-Dimensional Arrays"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 2 3 4]\n",
"<class 'numpy.ndarray'>\n"
]
}
],
"source": [
"import numpy as np\n",
"\n",
"array1d = np.array([1, 2, 3, 4])\n",
"print(array1d)\n",
"print(type(array1d))"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4,)\n"
]
}
],
"source": [
"print(array1d.shape)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[100 200]\n",
" [ 50 0]\n",
" [350 100]]\n",
"(3, 2)\n",
"<class 'numpy.ndarray'>\n"
]
}
],
"source": [
"array2d = np.array(expression_data)\n",
"print(array2d)\n",
"print(array2d.shape)\n",
"print(type(array2d))"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2\n"
]
}
],
"source": [
"print(array2d.ndim)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Why Use ndarrays Instead of Python Lists?"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Create an ndarray of the values from 0 up to (but not including)\n",
"# 1,000,000. Because 1e6 is a float, the resulting dtype is float64.\n",
"array = np.arange(1e6)\n",
"\n",
"# Convert it to a list\n",
"list_array = array.tolist()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"199 ms ± 9.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%timeit -n10 y = [val * 5 for val in list_array]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7.84 ms ± 1.83 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%timeit -n10 x = array * 5"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 2 3]\n",
"[1 2]\n",
"[6 2]\n"
]
}
],
"source": [
"# Create an ndarray x\n",
"x = np.array([1, 2, 3], np.int32)\n",
"print(x)\n",
"\n",
"# Create a \"slice\" of x\n",
"y = x[:2]\n",
"print(y)\n",
"\n",
"# Set the first element of the slice y to 6\n",
"y[0] = 6\n",
"print(y)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[6 2 3]\n"
]
}
],
"source": [
"# Now the first element of x has changed to 6 as well!\n",
"print(x)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"y = np.copy(x[:2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Векторизация"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2 4 6 8]\n"
]
}
],
"source": [
"x = np.array([1, 2, 3, 4])\n",
"print(x * 2)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 3 5 5]\n"
]
}
],
"source": [
"y = np.array([0, 1, 2, 1])\n",
"print(x + y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Транслирование "
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[1]\n",
" [2]\n",
" [3]\n",
" [4]]\n"
]
}
],
"source": [
"x = np.array([1, 2, 3, 4])\n",
"x = np.reshape(x, (len(x), 1))\n",
"print(x)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0 1 2 1]]\n"
]
}
],
"source": [
"y = np.array([0, 1, 2, 1])\n",
"y = np.reshape(y, (1, len(y)))\n",
"print(y)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 1)\n",
"(1, 4)\n"
]
}
],
"source": [
"print(x.shape)\n",
"print(y.shape)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0 1 2 1]\n",
" [0 2 4 2]\n",
" [0 3 6 3]\n",
" [0 4 8 4]]\n"
]
}
],
"source": [
"outer = x * y\n",
"print(outer)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 4)\n"
]
}
],
"source": [
"print(outer.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Исследование набора данных экспрессии генов\n",
"### Чтение данных при помощи библиотеки pandas"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 00624286-41dd-476f-a63b-d2a5f484bb45 TCGA-FS-A1Z0 TCGA-D9-A3Z1 \\\n",
"A1BG 1272.36 452.96 288.06 \n",
"A1CF 0.00 0.00 0.00 \n",
"A2BP1 0.00 0.00 0.00 \n",
"A2LD1 164.38 552.43 201.83 \n",
"A2ML1 27.00 0.00 0.00 \n",
"\n",
" 02c76d24-f1d2-4029-95b4-8be3bda8fdbe TCGA-EB-A51B \n",
"A1BG 400.11 420.46 \n",
"A1CF 1.00 0.00 \n",
"A2BP1 0.00 1.00 \n",
"A2LD1 165.12 95.75 \n",
"A2ML1 0.00 8.00 \n"
]
}
],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Import the TCGA melanoma data\n",
"filename = 'data/counts.txt'\n",
"with open(filename, 'rt') as f:\n",
"    data_table = pd.read_csv(f, index_col=0)  # pandas parses the data\n",
"\n",
"print(data_table.iloc[:5, :5])"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [],
"source": [
"# Sample names\n",
"samples = list(data_table.columns)"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" GeneID GeneLength\n",
"GeneSymbol \n",
"CPA1 1357 1724\n",
"GUCY2D 3000 3623\n",
"UBC 7316 2687\n",
"C11orf95 65998 5581\n",
"ANKMY2 57037 2611\n"
]
}
],
"source": [
"# Import the gene lengths\n",
"filename = 'data/genes.csv'\n",
"with open(filename, 'rt') as f:\n",
"    # Parse the file with pandas, indexing by GeneSymbol\n",
"    gene_info = pd.read_csv(f, index_col=0)\n",
"\n",
"print(gene_info.iloc[:5, :])"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Genes in data_table: 20500\n",
"Genes in gene_info: 20503\n"
]
}
],
"source": [
"print(\"Genes in data_table: \", data_table.shape[0])\n",
"print(\"Genes in gene_info: \", gene_info.shape[0])"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {},
"outputs": [],
"source": [
"# Взять подмножество генной информации, которая \n",
"# совпадает с количественными данными\n",
"matched_index = pd.Index.intersection(gene_info.index, data_table.index)"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"20500 genes measured in 375 individuals.\n"
]
}
],
"source": [
"# 2D ndarray containing the expression counts\n",
"# for each gene in each individual\n",
"counts = np.asarray(data_table.loc[matched_index], dtype=int)\n",
"gene_names = np.array(matched_index)\n",
"\n",
"# Check how many genes and individuals were measured\n",
"print(f'{counts.shape[0]} genes measured in {counts.shape[1]} individuals.')"
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {},
"outputs": [],
"source": [
"# 1D ndarray containing the length of each gene\n",
"gene_lengths = np.asarray(gene_info.loc[matched_index]['GeneLength'],\n",
" dtype=int)"
]
},
{
"cell_type": "code",
"execution_count": 87,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(20500, 375)\n",
"(20500,)\n"
]
}
],
"source": [
"print(counts.shape)\n",
"print(gene_lengths.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Нормализация\n",
"### Нормализация между образцами"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"# Make all plots in the Jupyter notebook\n",
"# appear inline from now on\n",
"%matplotlib inline\n",
"# Apply a custom style file to the plots (disabled here)\n",
"import matplotlib.pyplot as plt\n",
"#plt.style.use('style/elegant.mplstyle')\n",
"\n",
"# Override the style settings\n",
"from matplotlib import rcParams\n",
"rcParams['font.family'] = 'sans-serif'\n",
"rcParams['font.sans-serif'] = ['Ubuntu Condensed']\n",
"rcParams['figure.figsize'] = (4.8, 3)\n",
"rcParams['legend.fontsize'] = 10\n",
"rcParams['xtick.labelsize'] = 9\n",
"rcParams['ytick.labelsize'] = 9"
]
},
{
"cell_type": "code",
"execution_count": 88,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADQCAYAAABV2umIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xl8VPW9//HXZyb7BiF7CGFN2AVUNlEBiRtgXatW69qq7e3m1db2tm61rdbeX3v11qq9tWrBvQsuVURAEASUHUSWsIQlG2SBbJBlMt/fHzORGBMIZM6cWT7PxyOPZA4zc96TDJ/5nnO+ixhjUEopdfocdgdQSqlgp4VUKaV6SAupUkr1kBZSpZTqIS2kSinVQ1pIlVKqh7SQKqVUDwVcIRWRR0TkiW7c7wYRWSMir/gjl1JKdSWgCqmIDAHSvT/nishyEflIRHp1cvergRnAEBGJ9GdOpZRqTwJtZJOITAOuAI4Aa4F8oNgY80aH+10GXAw0GmN+7O+cSinVJqBapB1kAQ8BNwCxIvKEiCz1fl0ADAV2ANki4rQzqFIqvAV6i/RTY8z8Lu63DjgH+BfwQ2PMbr+FVEqpdgK5Rfo88ICILBSRrE7+/WU8h/7NQJFfkymlVDsB1yJVSqlgE8gtUqWUCgoBUUjFI1FExO4sSil1qiLsDuCVANTW1tbanUMppbrSZUMvIFqkSikVzLSQKqVUD2khVUqpHtJCqpRSPaSFVCmleihQrtor5RfGGBbvrOSldcWsLa6h2eVmeEYC3xjXl6+Pycbp0B546tQFxMgmEUnE2/0pMTHR7jgqRO2rPsptr29kya4q+sRFcu7APkRHOPh0/xH2Hz7G+H69ef2msxiYEmd3VBWYuvyU1UKqwsKiwgqu+dta3AYemzmMb0/KJTrCM2mY2214dUMJ35+3hQiHsOg7kxiT3dkUuCrMaSFV4WveZ2VcP3c9Q9PjefO28QxKie/0foUV9cx4ZhUut2HVD89lQB9tmaov0Q75KjwtKqzg2jnrOCunFx/9xzldFlGA/LQEFtw5iWMtrVw3dx0trW4/JlXBTAupClmbSmu46sW1DM9IYP4dE0mOizrpY0ZkJvLctWNYvf8IDy/Y4YeUKhRoIVUhqbK+idnPraZXTATvfXsivWK7v6zXNWOyueXsHH63ZDfbDtZZmFKFCi2kKuS0ug03vryBioZm3rp9PDm9Y0/5OX43ewQJ0RH8YN4WAuE6ggpsWkhVyPnNop18UFjBH68cxZk5vU/rOdITo3nk4qEs3lnJwsIKHydUoUYLqQopiworePiDHdx0Vg7fnpjbo+e6c3IuucmxPPj+Dm2VqhOypJCKyNe969G/2G7bCBFZJyLv6QTOygqlNY3c8PJ6RmQk8szVo+np2yw6wsn9BXl8uv8I87cf8lFKFYosKaTGmL8D04AJItJ2lv964EHgIDDWiv2q8NXqNnzzlfU0NLfy95vPIj7aN6Ofbx3fj9zkWP7fUl2kVnXNqhapE9gOrDXGtHg3ZwPlQBnQ14r9qvD16OKdLNlVxZ+uHM3wDN8N6oh0OvjBlIEs2VXFptIanz2vCi1WtUhbgRFALxGJ7uwuVuxXhadlu6t4eMEOvnlWX24Zn+Pz5//WxH7ERzl5cpmu+q06Z9nFJm8xjQDa3tmlQCaQ5f1ZqR6rrG/ihpfXMzglnqevOqPH50U7kxwXxa3j+/Hy+hIq6pt8/vwq+Fl1aH+3iCwHjgEzRWQI8BrwSyAD2GTFflV4aXUbbn51IxX1zbx+01kkxlg3K+R3zxlAc6ubl9eXWLYPFbx00hIVtH7+3jYeW7yLZ64ezXfOGWD5/iY+uZxjLa1suneqJS1fFfB00hIVWt7YWMpji3dx56RcvxRRgNsn9OOzsjrWFetFJ/VlWkhV0FlRVM2tr23gnAHJ/O+Vo/y23+vH9iUmwsFfP93vt32q4KCFVAWVz8pqmf
3X1eT0imXereO/mJzZH3rFRnL1GVm8trGUZpdOsaeO00Kqgsam0hou/PMnxEU6WXjXJNITO+tZZ60bzuzLkWMtfKDj71U7WkhVUFi2u4qpf1pJpENY/J1J9Ldp9vqCvDSSYyN5bYNevVfHaSFVAc0Yw/98tJsZz64iMzGaFT+YwjAfjlw6VVERDq4ancVbn5dzrKXVthwqsGghVQFrZVE1U59eyT1vb2X2iAw++dF55Cbbv47SdWOzqW9qZf42nchEeei69irgbCmr5Rfzt/P25wfJSIzm2WtGc+ek/gHTd3P6kBTSEqJ4fWMpV52RZXccFQC0kKqAsa/6KA8t2MGcdcUkRkfw60uHcvd5g3w2k5OvRDg9h/dz1xVzrKWV2Ej/9RxQgSmw3qEqLFU1NPPrRYU8vWIfInDv1MH87IIhpMSffLE6u1wxKpM/r9rH4p2VzB6RYXccZTMtpMo2rW7Dn1YU8fCCQmoaW7htfC4PXZRPv+RTX2PJ36YPSSExOoK3Py/XQqq0kCp7HDh8jG++sp5le6q5MD+VP3xtJKOykuyO1W3REU4uGZbGO58fxH21weEIjPO3yh561V753friI5z9xDLWl9Tw4vVjWXDnpKAqom0uH5lJeV0Tqw8csTuKspkWUuVXn+w7zNSnVxIT6WTNj87jlvH9AuZq/KmaOTwdp0N4+/Nyu6Mom2khVX6z41A9s5/7lMzEGFba3LHeF5Ljopg6KIW3tmghDXdWTex8nYgsF5GlIhLVbnu5d1umFftVgauu0cVlf12N0yG8f8dE+vYK/AtK3XH5qAy2HqxnV2WD3VGUjaxqkb5hjDkPqARy4YsF8eYbY6YZY/QjPMx8f95n7K5q4O83n83g1Hi74/hM2xV7HeUU3qxa/M6ISCyQCLStGJYMjBSRK63Ypwpc/9xcypy1xTxwYT7nD06xO45PDUqJJy81nvd3aCENZ1aeI/0DcL93ETyMMZXAFOB7ItLfwv2qAFLX6OJHb37O2Owk7i/IszuOJS4Zls6SXZU06iQmYcuqc6Rn4mmYrmm/3bvGfRXQ24r9qsDzyw92UFLTyDPXnEGEMzSvbV46LI1jLW6W7amyO4qyiVXv7OnABd4LS/eLyGQRuUZEVgDNwGaL9qsCSFHVUZ5cXsS3JuQyqX+y3XEsM3VwCtERDuZv18P7cGXJyCZjzO+B33fyT/+wYn8qMD2ysJAIh/DIJUPtjmKpuKgIpg5K4f3tFfzP5XanUXYIzWMtZbsdh+qZs/YA/zFlANm9YuyOY7lLh6ez/VA9e6uP2h1F2UALqbLErxYWEhvp5KfTh9gdxS8uGZoGwPt6eB+WtJAqnztw+BivbSzlrsn9bVmgzg5D0xPonxyrhTRMaSFVPvfHjz1dh3947kCbk/iPiHDpsHQW76rUpZrDkBZS5VN1jS7+75N9XHNGlm0rfdrlwvw06ptaWb3/sN1RlJ9pIVU+9cKa/dQ0urhn6iC7o/jd9CEpiMCinZV2R1F+poVU+Ywxhj+v2sfE3N5MyA3dfqNdSY6L4uyc3iwqrLA7ivIzLaTKZzaU1LD1YD23TehndxTbFOSn8un+I9Q1uuyOovxIC6nymTlri4lyOrh2TLbdUWwzY0gqLrfR4aJhRgup8omWVjevbijhspEZJMcF7uqfVpsysA8xEQ4W7dTD+3CihVT5xAc7KjhU38zNZ+XYHcVWMZFOzh3Yh0WFesEpnGghVT4xd10xKXGRXDIs3e4otivIT2NLeR3ltY12R1F+ooVU9VjNsRbe3FLON8b1JSpC31IFeakALNZuUGFD3/Wqx/6xuYwml5ubzg7vw/o2Y/v2ok9cpPYnDSNaSFWPzVl7gKFp8Yzvp/N1AzgdwgVDUllUWIExxu44yg/8toqoiGSIyCrv9uBeh1d9YW/1UZbtqeams3OCdn16KxTkp1Jc00hhha4uGg78toooMAt4CVgEFFi0X+VnL60rBuCbZ+phfXsz8jzT6ul50vDgz1
VEs4FyoAzoa8V+lX8ZY5iztphpg1PCboKSkxmcEkf/5FgW6nDRsOC3VUQ70BNHIWD1/iPsrGzgpjDvO9oZEaEgL40luyppdevbPdSdUiEVEaeInOltbZ7ofp2tIloKZAJZ3p9VkJuztpiYCAfXjMmyO0pAmpGXSk2ji3XFR+yOoix2qovfvQjUAvnAhSe43xeriOI5J7oYeBd4E3Dhaa2qINbscvPaxhKuGJVJUkyk3XEC0gXt+pOG42xY4aRbhVREzgcEGA7cy0kK4QlWEZ18qgFVYHpv20Gqj7Zws/Yd7VJGYjSjsxJZVFjJf83IszuOslB3D+2nA9OAd7zf51iURwWJueuKyUiM5sL8NLujBLSCvDRW7K3mWEtnlwpUqOjuoX2sMeZnliZRQaP6aDPvbD3I96cMJMKpYzpOpCA/lf9ZtocVRdUU6IdOyOpuIb1FRL40G4Ux5nYL8qgg8PrGUlpajR7Wd8P5g1KIcAiLd1ZqIQ1h3S2ke4FfWphDBZG5a4sZlZnImOwku6MEvIToCCb1T2bRzgoeY7jdcZRFuntc9rYxZl/bFzDRylAqcO2sqGfVvsPcrENCu21GXirrims4fLTZ7ijKIt0tpFeISDSAtw/pvdZFUoHspXUliMANZ+rgtO4qyEvFGFiyS5cfCVXdLaSPAPNEZBEwD/itdZFUoDLGMHddMQV5qfTtdcIxGaqdCbnJxEc5ddx9COvuOdL3gXo84+XLgOWWJVIBa0VRNUXVR/nlxfl2RwkqUREOpg5O0XWcQlh3W6Tz8PQfjQOmAm9ZFUgFrjnriomPcnLlaB0Seqpm5KVSWNHAgcPH7I6iLNDdQpoCfIhnqOeH3tsqjDS2tPLGxlKuGp1FQvSpjixWBTqtXkjrbiG9CbgI+C88Y+xvsCyRCkjvbD1ITaNL+46eplGZiaQnROnhfYjqbtPiXGAXsBPPmPvzOD7PqAoDf1tzgL69Ypg+JNXuKEHJ4V1+ZPHOSowx2nUsxHS3Rdr2V/8ZOpdo2CmvbeT9HRXcfHYOTocWgNNVkJ9GeV0TWw/W2x1F+Vh3C2kRntFNTd7ve62JowLRy+tLaHUbbjm7n91RgtqML6bV08P7UHOqsz+92e5nFQaMMby45gCT+iczND3B7jhBbUCfOAanxLGoUC84hRqd/Umd0IaSGraU1/HsNaPtjhISZuSl8uqGUlytbp05K4R09y95i4g83/7rZA8QkUdE5IkO28q9SzRnnlZa5XcvrikmOsLBdWN1SKgvFOSnUdfkYs0BXX4klFgy+5OIDAHSgcZ225zAfGPMbacSUNmn2eXmlfXFXDEqk96xupyIL0wf7OmCvXhnJZMH9LE5jfKV7rZI2+YezQLwzgDVJWPMLuC1DpuTgZEicuUpJVS2+ffWg1QdbeHW8XqRyVdSE6IZ1zeJRdoxP6R0t0V6n/d7MZAjIg5jzC2nsiNjTKWITAHmi8j6kxVjZb+/fLqPnF4xFORp31FfKshL48nlRTQ0uYjXUWIhobst0hHAg8aYB4AHvbdPmTGmBagCep/O45X/7KlqYMGOCr49MVcvivjYjLxUmlvdfFxUbXcU5SPd/Tj8IfCod7mRg8APursDEbke2Af0Bf4T2ANsPsWcys/+8sl+HCJ8e1Ku3VFCzrkD+xDldLB4ZyUXD0s/+QNUwOtuIY0Gnmt3O+pkDzDGLAWWdtj8j27uT9mo2eXm+dX7mT08XecdtUB8dASTByTruPsQ0t1jthfxTJ/X9jXNojwqALy5pZxD9c1855wBdkcJWQV5qWwsraWyvsnuKMoHultI44AcIAHYDjxmWSJlu2dX7WVAn1gu0lUvLVOQn4YxOq1eqOhuIZ0IPA68DmQA/7QskbLV5+V1LNlVxZ2T+uPQCUosM75fb/rERTJ/+yG7oygfOOk5UhGpBjZyfAYoAF1rIkQ9sWwPsZEO7pioF5ms5HQIFw9NZ/72Q7jdRj+0glx3LjZtMsZcYHkSZbtDdU3MXV
fMreP7kZoQbXeckDdreDqvbihhfUkNZ/fTHoHBrDuFdICIPNhxozHmEQvyKBs9u2ofTS43d5830O4oYeHioWmIwLtbD2ohDXLdKaS3Wh1C2a+xpZU/rShi5vB0hmUk2h0nLKQmRDMxN5n3th/ioYuH2h1H9cBJC6kx5iN/BFH2enVDCYfqm7nn/EF2RwkrM4en89CCHRyqayI9UU+nBCsd+6dodRse/3AXY7KTuEDH1fvVzGHpGAMLdujV+2CmhVTxz81l7Kho4BcFeboom5+N69uLjMRo3tumhTSYaSENc2634deLChmWnsBVo7PsjhN2HA5h5rB0FuyowNXqtjuOOk1aSMPcO1sP8llZHb8oyNMVQm0yc3g6h4+1sGKvzgYVrLSQhjFjDL9aWMjglDiuH5ttd5ywdcmwdKIjHMz7rNzuKOo0aSENY+9tO8S64hr+a0aezjlqo4ToCC7KT2PelnKMMXbHUadB//eEKbfb8PP3tjMoJY6bzsqxO07Yu2p0FvsPH2N9cY3dUdRpsKyQdlxFVEQyRGSViCwXEe3xbbPXNpawuayWX18yjKgI/Ty122UjM3A6hH99VmZ3FHUaLPkf1G4V0fZmAS8Bi4ACK/aruqfZ5eaB93cwJjuJ6/TcaEBIiY9i6qAU/qXnSYOSJYW0i1VEs4FyoAzPsiPKJs99up89VUd5bOYwnXUogFw1OpPth+rZdrDO7ijqFNl1TKdn1G3S0OTikYWFnD+oD5foekEB5YrRmQB6eB+E/FlIS4FMIMv7s7LB75bs5mBdE7+dNVxHMQWYvr1imTIgmVc3lOrV+yBjeSEVketFZDLwLvBNPOdHF1m9X/VV+6qP8rslu7h+bDaTB/SxO47qxI1n5fB5eR2by2rtjqJOgWWF1Biz1BhztzHmNWPMKmPMQWPMZGPMecYYPQlkg/v+vQ0ReHz2cLujqC58/YwsIhzCS+tK7I6iToH2ewkTy3ZX8camUn46fQi5yXF2x1FdSE2I5tJhnpnzW916eB8stJCGgVa34e63ttCvdww/mT7Y7jjqJG48sy8lNY0s21NldxTVTVpIw8ALq/ezoaSW380eQVxUdxZFUHa6bGQGidERzF1bbHcU1U1aSENcZX0TP3t3G1MGJGvn+yARFxXBtWOyeWNTKTXHWuyOo7pBC2mI++m726hpdPHMNWdod6cgctfk/jQ0t/Lyer3oFAy0kIaw5XuqeH71Ae6ZOojRWUl2x1Gn4Ox+vRjXN4k/r9qnfUqDgBbSENXscvOdf2ymf3IsD16Yb3ccdYpEhLsm92dzWS2f7j9idxx1ElpIQ9TvP9rN1oP1PHXVaOKj9QJTMLphXA4J0U6eXrHX7ijqJLSQhqA9VQ088kEhV43OZPaIDLvjqNOUGBPBrWf347WNJRQfOWZ3HHUCWkhDjNtt+Nbrm4h0OnjyilF2x1E9dM/UwbS6DU8uL7I7ijoBLaQh5tlV+1i6u4o/fG0EOb1j7Y6jemhgShzXjsnmz6v2cUS7QgUsLaQhpKjqKPf9eysX5afxrYm5dsdRPvKT6YOpa3LxzMq9dkdRXdBCGiLcbsPtr2/EIcJfrtU+o6HkzJzeXDosnf9esltbpQFKC2mIeGpF0ReH9DopSeh5dOYwDh9r4b+X7LI7iuqEFtIQsLGkhp+8s43ZIzL0kD5Eje3bi2+M68sTy4soq220O47qwKrF7zpdMVREykVkqYhkWrHfcNTQ5OL6uetIjY/ihevG6CF9CPvVJUNpaXXz039vszuK6sCqFulXVgwVEScw3xgzzRijSyX6gDGG78/bQmFlAy/dOI7UhGi7IykLDU6N577pQ5i7rpgluyrtjqPasaqQdrZiaDIwUkSutGifYeeZlft4cc0B7i/IY/qQVLvjKD/4RUEeg1Li+O4/NtPY0mp3HOXlj3OkBsAYUwlMAb4nIv39sN+QtnRXJT96cwuzR2Tw8EVD7Y6j/CQ20snTV41mR0UDP3tXD/EDhVWFtN
MVQ40xLUAV0Nui/YaFbQfruOZva8lLi+flG8fp2vRh5uJh6fzwvIE8ubyI97YdtDuOwrpC2n7F0AQRmSwi14jICqAZ2GzRfkNeUdVRCp79hAing3dun0BSTKTdkZQNHp81nDOykrj5lQ3srmywO07Yk0CY69B7Zb+2traWxMTEk94/XO2sqOei//uEmmMuPvreOTrHaJjbVdnAxCeXk54QzcofTCE5LsruSKGuy0M/7UcaJFbvP8yUp1ZQ1+hiwZ2TtIgqhqTGM+/W8eyuauDyF9ZQ3+SyO1LY0kIa4NxuwxPL9nDuUyuIj3Ky8ofnMj5XTzErj/MHpzD3G+NYUVTNrOc+1WJqEz20D2Ari6q55+3P+XT/ES4bkcEL148lJV4P39RXvb6hhBteXs9ZOb156/bxZCXF2B0pFHV5aK+FNIC0tLrZcaieRTsreWV9CWsOHCEzMZrHZg7nlvE5OmpJndDbW8q54eX1JMdG8uZt4zmrnx65+JgWUjscrGvi032H2VRWy77qY1Q0NFHX5MJtPKOS3AYaXa00tripb3Zx4EgjrW7P32NUZiLfmdyfW8b3I0GXClHdtLGkhq89v5ryuiYevXQ490wdpN3jfEcLqb9U1jfxwpoD/GNzGau9i5aJQGZiNGnx0fSKjcAhgni3x0Q4iY10EBvpZECfOPLT4pk6KIX+fXQGJ3V6qhqauePvm5j3WTnTBqfwzNWjGZYR3P+vAoQWUqvtrT7KrxYW8vL6Eppcbibk9uZrIzOYPjiVM7KTtFWp/MoYw/OrD/Djd7bS0Ozi3qmD+fmMPBJj9H3YA1pIrVJR38RvFu3kmZX7cAjcMr4f358ygFHaPUkFgEN1Tfz03W28uOYAKXGR3Dd9CN+bMkBXlj09Wkh9zdXq5pmV+3jg/e3UNbm4fUIuD12Ur+skqYC0Zv8RHlywnfe3V5CeEMV3zxnAXZP769X9U6OF1Jc+3lPF9/61hc1ltVyUn8YTV4xkuJ6DUkFgRVE1jy7eyXvbDhHpFK4encWNZ+VwUX4aURHarfwktJD6wsG6Ju7791bmrC2mX+8Ynrh8FFeOztRuSSro7Kps4E8rivjbmmIOH2shOTaSy0ZmUJCXyvQhqXpk1TktpD1xtNnF/y4v4rEPd9HY4ubH0wbx8xl5ep5JBb1ml5uFhRW8vrGUd7cdpPqoZ3G9nF4xjMpKZFRmEv2TY8lKiiYrMYZesZHERTqJi3ISH+UkNtKJM3y6V2khPR2NLa3MWVvMLz8opLS2kVnD0/nD5SPJT0uwO5pSPud2GzaX1bJkVyXrS2rYUlbHtkP1NLncJ3xcXJSThCgnidERJERHkBgdQXZSDLnJseT2jmVQShxnZCWR0zsm2I/etJCeiv2Hj/LC6gM8vXIvh+qbmdQ/mcdnDef8wSl2R1PKr1rdhqqGZsrqGimr9QwoOdrcytGWVhqaWmlodlHf3Epdk4v6Jhd1TS5qGl2U1jSy/8ixLxXh3rGRjM5KZExWEuP69mJc316MzEwMpnOzWkhPpKXVzbriGj7aXcXbn5ezcu9hAGYNT+c/zx/EBXmpwf5JqpTfGWOoqG+msKKez8rr2Fxay+ayWjaV1tLQ7FkmJdIpjMxI/KKwjuubxJjsXoHa3zX0CumHOyt5akURkQ4HEQ4h0ilEOBxEOoVI5/FtkU4HkQ7vd6fQ6jbUNLo4cqyFstpGCisb2F15lOZWzyfnGVlJXDc2m+vGZjM4Nd7CV61UeHK7DbuqGthQXMOGklo2lNSwobSGivpmwDPib0hKPGOyPacDshJjyEqKpnds5BcjAaMjnLjcbppb3bS0Glpa3RxrcXOspdX71eFnV+uXbt82vh+zRmScavQuC6klZV9EMoA3ARcw0xhT19m2nuyjtrGFnRUNuNyeX6Lnu6HF7fnFutzHf8HuDp8VkU6hd2wk6QnRDE1L4LIRGUzI7c15A1NIT9SVOJWyksMh5KclkJ+WwHXjPGtjGmMorW08XlhLat
hUWsv87Ye+aL2erkinEBvp9H55hmNXH232xUv5giUtUhG5HYgFUoHNxph5nW1rd39LD+3d7uMF1iGeBcT0UF2p4FDX6KKsrpG6RtcXLcqmVveXjjQjnQ5iIhxfKpZtXz7sVeDfFime5Zi3AS0cX465s21+4XAI0Q4n2ltJqeCTGBNBYkxg95Tx23LM3dimlFJByZ/LMXe6RLNSSgU7qw523+X4haU9IjK5w7Y/WLRfpZTyu6Dt/qSUUn7m94tNp6Wurkc9opRSyjJJSUmJQL3ppPUZKC1SPW+qlAoGSZ31gQ+UQip4LkTV251FKaVOIHBbpEopFcyCZtoVpZQKVFpIlVKqh7SQ+oCIZIjIKhFZ7u3KhYhc5729VESi7M7YE529Pu/2/iLSZGc2XzjB65slIn8XkWQ78/VUF+/Pkd5tn4hIH7sz9oSIPCIiT7S73enf00paSH1jFvASsAgo8G57wxhzHlAJ5NoVzEc6e30AdwM7bUnkW195fSKSANxljPm6MeawneF8oLO/3wXAY8DHwJk25eoxERkCpHfY3NX71TJaSH0jGygHyvBOyGKMMSISCyQCRTZm84WvvL52rZhKu0L50FdeH3AuMEJEPhSRTNuS+UZnr+8d4Co8vWU+tilXjxljdgGvddjc2eu1lBZS32vfDeIPwP3GmJ5NqBhY2l7fDcBcO4NYpO319QGeAt4ArrMvjs+1vb6+eIpNFJ6pLUOVX7olaSH1ja9MyCIiZ+JpmK6xM5iPdDbhzCDgPjyttjvtCuYjnb2+CiAeaOQEQwODRGev71pgIbAGuNimXFbx+wRJ2o/UBzrM/v9/wC7gHOAOPJ/69xtjgvbwqbPXZ4xZ5f23pcaYaTbG67Eu/n4bgLfwTEZ+szFmr20Be6iL1xcF/BHPB8UVxpigHVkoItOAK4BPgH3AHny4Gke3MmghVUqpntFDe6WU6iEtpEop1UNaSJVSqoe0kCqlVA9pIVVKqR7SQqqUUj2khVQpFdY6TnrSbvt0EVktIgtFJPJEz6GFNMyJiENEXvLOUnWv3XmU8qf2k56ISK53xqiPRKQXMBv4NlAD9D/R82ghVbPwjFSaZoz5vYgsBRCRt0UkSkSqRSRCRH7TVmhF5ICIpIs0hkf4AAAECElEQVTIHSLyR++2pW1PKCIPe0ebtN1eKiKjRORPItJHRJZ4v5K9/35ERLaLyK3er2ki8ivvzy+KyIB236/0Tv32sPexz4nIMhH5uYjsFZGNInKB9/lXiMgv2uVo//wPi0ik934feyeYabvfi96cm9q/tnbfp3nzlrdl825f7r39uHffb7Z7vkgR2dPJ40eIyLve7e96J0lxiMhfvdPBfeV3rXynw6QntwOP4xnRdjHwKvB9oNZ7vy5pIVWj8Qyta1PrLSoOIBnPmPPzgWHe2wDVeD6tJwDObu7nAeDXeGYcehGYA1zuPWRaA/y23X0TgW94f2703m7zXTwzM50jIgOAKGPM+caYR73Pe7cx5kM84+PPA6Z3FcgY02KMmQ6sBMZ0+OcTzR/gxDMF3fttG0Sk7XcEkAHcCDi80/HhfT1JnTz+KBDlbQEdBrYDg4FMY8xBTu93rU5PFvAQngl5YoGBeIac9jrZvKZaSFUjX/4Pugn4OlCIp4C9j6coFOF5Q0V47zMBaOB4kTvD2yK7pJN99AF6G2PK8BSZA96vbKA3cKTD/W/HM+sSeOaVfJbj80pm4plnMsf7+H1dvK5MYCuwvsP2nwFPgOewTkSWAFcDMe3uE+XN2TYPaaa3NTq23eup6fC819CusOJp1aTh+R0BTAS2dPH4MjxjxVcD6/B8ANSf4HetrFEGPGiMOdsY8zfgm3haq+V4/gZd0kKqNvPlVttK4CfAR0ACnmIyHPgAz6d0AuDG07pa6d3W9jxX4zkU6qgaWCciM4GDeIpgDp437mQ8xaO9GDyFHGPMx8aYKXiKJ8B+4GJjzAjgEF1Pml1ujBnGV1ukv8UzITV4WsdP8d
XpACcAf+/wXNOAjd7bYzleFNuUAO2nS7wcTyEf7X2tK9r9W8fHrwJ+jOd3vhL4T+/3rn7XyhrPAw94Ly5lAa/gmfxkJF8+avsKLaRqMZDsPcH+ELAcz6HlUjwtM4Af4Cl2Me22PQLM53hLrh/wLsdbkh39DrgH+BfwLeA24EM8BeQvHe77wgny/hlYLiKPes9bObznYDsuByIisgJPi64rC4H78XwAtF8KvNAYs7KzB4jIWOAu4DngEuBWPLMMtS/Gh/C0ZPLxtOTrgH+e4PGL8bSANxtjtuFpsS6i69+18iFjzFJjzN3GmP3GmHOMMRcaY8qMMa8bY0Z7rx80nOg5dPYn9SXec5YvGWNCaTJjn/FeRJtmjHnYe472YWPMrT15PHAv8FtjzB3e+/zTGHO1L3Mra0XYHUAFDhGJARYAv7E7SwDbCOz1/lzOly+Sne7j5wHfAxCRf+FZBkQFEW2RKqVUD+k5UqWU6iEtpEop1UNaSJVSqoe0kCqlVA9pIVVKqR76/7xgFC1g19vOAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f10d6fd0>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Count statistics:\n",
" min: 6231205\n",
" mean: 52995255.33866667\n",
" max: 103219262\n"
]
}
],
"source": [
"total_counts = np.sum(counts, axis=0)  # sum the columns\n",
"                                       # (axis=1 would sum the rows)\n",
"\n",
"from scipy import stats\n",
"\n",
"# Use Gaussian smoothing to estimate the density\n",
"density = stats.gaussian_kde(total_counts)\n",
"\n",
"# Make the values at which to estimate the density, for plotting\n",
"x = np.arange(min(total_counts), max(total_counts), 10000)\n",
"\n",
"# Make the density plot\n",
"fig, ax = plt.subplots()\n",
"ax.plot(x, density(x))\n",
"ax.set_xlabel(\"Total counts per individual\")\n",
"ax.set_ylabel(\"Density\")\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_06.png', dpi=600)\n",
"plt.show()\n",
"\n",
"print(f'Count statistics:\\n min: {np.min(total_counts)}'\n",
"      f'\\n mean: {np.mean(total_counts)}'\n",
"      f'\\n max: {np.max(total_counts)}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Normalizing Library Size Between Samples"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAAClCAYAAAAONXX6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJztnXuUFdWd7z8/Wp4N+ODlC0l8BUhM1ETDHR8B49ywCHFlEjMxD4wkGp2JmYnkarj3+kp0TRySMZnIullzScRoEjHBoKiIuQpofGCM4APpVnkoCDbdNGDDabpp6N/9Y++Crt0NVafrnD7ntL/PWmd9e/ep2udXu3b99m/vqtpbVBXDMAyj+/QptQGGYRiVjjlSwzCMjJgjNQzDyIg5UsMwjIyYIzUMw8hIr3Gk4hgiIlJqWwzDeH9xWKkNKCCDgaampqZS22EYRu8iMTjrNRGpYRhGqTBHahiGkRFzpIZhGBkxR2oYhhGwdVcrP1m6hq27WlNtb47UMAwjYO4LG7nu4RrmvrAx1fa96a69YRhGQZh+1uiYJiG9ZfYnERmCf/xpyJAhpTbHMIzegz3+ZBiGUWzMkRqGYWTEHKlhGEZGzJEahmFkxBypYRhGRsyRGoZhZKRojlRExohIq4iMF5EXRWSRn+pukoisEJG5fruv+/StPj1TRFaKyFU+PdvvP9Wn7/fbf7xYthuGYeRDMSPS7wFvApcANwJbgNOB6cA0YIyIDAOuAD4FTPX7XQxMAL4lIgOAM4EvAFeIyIlAO3ANcGkRbTcMw0hNURypiBzl/9wKHAvUAe8Cx3VI1wHHAENVdSfQLCKDgSpVbQX6A8OBxi72jdKGYRglJ5UjFZFBXj8iItUpdvkqcE8X/w9fo0pKd/x/2m0NwzB6lMR37UXkT0CViAwHaoBTcF3xQ3EicC4wHjgVWICLPjf7z9HAKFx02eQj0YGqmhORfSLSH2jBRbRH4SLRjvtGeRmGYZScNBHpSOBfgGrgFqBf0g6qOkNVLwFWAxcCP8Q5zpeBubhodYOqNgJzgKeAR/zu84HlwJ2q2gKsxDniOaq6ztv8M+DulMdoGIZRVBInLRGR23DOVHDd6QZV/UEP2JYXNmmJYRhFInHSkjTT6F0PnMOB7vXTGY0yDMPoVaTp2v8JmAgM8vpgEe0xDMOoONJEpMOAJcBGYDQwuagWGYZhVBhpHOk03EP0I3EP1X+1qBYZhmFUGGkc6bnAGtxbSgKcB6wvplGGYRiVRJox0uiO1UzsIXjDMIxOpIlI1+OcaSvwVlGtMQzDqEDSONJJuEj0gQ5/P1VMowzDMCqJNI50DvA5oK+qzhaRDxXZJsMwjIoizRjpH3HvxEd362cXzxzDMIzKI40jbQQ+AhwhItcCueKaZBiGUVmkedd+NHAybtalemCZqu7rAdvywt61NwyjSBTkXfvfqOoFBTDGMAyjV5LGkR4rIkv83wKoOVbDMIwDJDpSVR3bE4YYhmFUKrYcs2EYRkbSLDVyfvg/VbUH8g3DMDxpxkgvxy0XMg/Y4f9njtQwDMOTpmt/A+5h/JG4dZcWFdUiwzCMCiONI70MOB83jV49MCVpBxH5kog8KSJ3icgkEVkhInP9d1/36Vt9eqaIrBSRq3x6toi8KCJTffp+v/3HRWSAiCwRkb+KyAndO2TDMIzCkqZrv4wD0+cJKabSU9U/ish84DWgP25y6DtEZBhwBW4557/g1oO6GJgAPC0idwFnAl8AfiEiq4F24BrgUtwyJy8Cq4Av4lYTNQzDKClpHOm3yHOMVESqcEsxP49bNK/Of44BhqrqThFp9uvZV6lqq1/LfjjuldR3geM67NtV+rQ8jtMwDKNoFGWM1L9COh44HKjq+FW46cGySLGtTTJtGEZZkCYi/YbXN71OAf6WtJOq7hORw4B3cO/pj8JFk00+Eh2oqjkR2eej0RZgK3AUB5Z+3uz3PaZD+m
N+u82pjtAwDKPIpHGkVap6U5QQkX9L2kFEvocbw9wC3AncA7ysqo0iMgc3NPCI33w+sBz4L1VtEZGVwALgRlVdJyJ9cGOhV+DGXK/HjbFenPIYDcMwikqa2Z+eAf438DYwBrhNVSf0gG15YbM/GYZRJAoy+9NXcMsxH4stx2wYhtGJNI70elX9dpQQkTuBbxbPJMMwjMoijSM9UUQu40DX/qSiWmQYhlFhpHn86Qu4R40+iXuU6R+KapFhGEaFkcaRTgAmAx9R1V8Dny6uSYZhGJVFGkf6Q+CfcDebAL5TPHMMwzAqjzSO9Engd8BYEXkAm0LPMAwjRuJzpAAiMhSoBnao6u6iW9UN7DlSwzCKRPbnSEXkfwEX4CYKOVZEXlHVawpgnGEYRq8gzeNP/11VJ0YJEXmyeOYYhmFUHmkc6aYOyzEDtIvIUmxZZsMwDCDdcsxf6wlDDMMwKpU0Y6TbgJeiJBaJGoZhxEjTtX/ZHKdhGMbBSeNICcZIMcdqGIZxgDSO9BequiBKiMiXi2iPYRhGxZHmzaYf+KVAEJEBwIzimmQYhlFZpHGkPwIWiMgTwAPAj4trkmEY5cjWXa38ZOkatu5qLbUpZUearv1WVZ0CICL9sEmdDeN9ydwXNnLdwzUAXDvp5BJbU16kmo9URO4SkWnAY7jVPg+JiHxZRP4iIstE5KMi8qKILBLHJBFZISJz/bZf9+lbfXqmiKwUkat8erbff6pP3++3/3i3j9owjLyZftZoZk0dx/SzRpfalLIjjSPdDbQDtwPPASek2OcPqnoebtnkmcCNuPWeTset/zQNGCMiw3Crg34KmOr3vRg3B+q3/JjsmbjJpa8QkRO9LdcAl6Y5QMMwCsPwwf25dtLJDB/cv9SmlB1puvbLcDPk/8Zr4kwoqqoiMhAYglt/vg436clxuHlN6/znGGCoqu4UkWa/3n2Vqrb6G1zDgcYu9o3ShmEYJSdNRJoDLsdFlt8AmlLmfTtuDfp9Hf4XztmXlO74/7TbGoZh9ChpHOkvgOuAAcBNwP9J2kFEzsQFpi/gItKjcdHn5g7pUbjosslHogNVNQfs89FoC25o4ChcJLq5i7wMwzA60dNPGKRxpAtUtQ64SVXfAR5Ksc8k4AIRWQY8jluuZBTwMjAXuAfYoKqNwBzcrPuP+H3nA8uBO1W1BVgJLADmqOo6b/PPgLtTHWGFYY+YGEZ2oicM5r6wsUd+L9UM+ZVAb5kh/ydL13DdwzXMmjrOHjExjG6ydVcrc1/YyPSzRhfi5ljifSFzpGVGgSuAYRjZye5IReTrwNeA/j7DzeU4R2lvcaSGYZQdiY40zRjp93HPcfZR1UnAmKxWGYZh9CbSONLZfuXQZSLyHLCqyDYZhmFUFGmXYx6kqs0ichqwzj+mVFZY194oNDZebXgKshzzn4AqERkO1ACn4F7pNIxejU3SYaQlTdd+JPAvQDVwC9CvqBYZFUtvewbWJukw0pLmrv1tOGcahbf1qvqDYhuWL9a1Lz32DKzRS8netQfqVHVmAYwxejlR5GYRnPF+I01Euh73Wud+VPVHxTSqO1hEahhGkShIRLoDN5VeYmaGYRjvR9LcbLpPVZ9S1SdV9cmiW2QUlUq+IVTJthu9mzSO9KLoDxER4N+KZ45RbHp6VpxCUsm2G72bNF37u0RkKW6C5r7AvcU1ySgmlXxDqJJtN3o3ad9s+hAwRFX/VnyTuofdbHr/YW8eZaM3lV+RjyX7pCUisgD4J+D/+vTC7HYZRnasq5+N3lR+pT6WNF37EcBPgbNF5CxgWHFNMox0vN+7+lmjsEKXXykj3FLXhTTPkZ6IW/TuaNySyr9S1Q09YFteVErXvjd1p4zSUm5vkpWbPQWkIPORngusA54B1gITU/2yyI9E5OciMl5EXhSRReKYJCIrRGSu3+7rPn2rT88UkZUicpVPz/b7T/Xp+/32H09jR7lR6i6I0Xsot7kAys2eniSNI4288fdJua69iJ
yMez8f4BLgRlw0ezowHZgGjBGRYcAVuNmkpvrtLwYmAN8SkQHAmbiJpa/w0XE7cA1waVe/3VhmzxqGzz6Gla3cn40sd/sORWh7JR9LVwwf3J9rJ51ctJ5NvuVVbHvKmTSOFGAI0KCqd6vqb5I2VtU1wDyfPBa37PK7wHEd0nW4ZZWHqupOoNkvy1ylqq24pU2GA41d7BulO3HPinfKKuILI9CwsuUbofa0MyhmBF3sYwltL/feQLk5+nIvr3Iizc0mgHrgywX4vXBANind8f+ptp125vEMGDS4ZN2LcAw0aRA830HycI7MYo+5FnMQv9jzfYa2l/qGRBLlNv9pOZdXud1rSONIJ+Kc1mT/ZpOq6jfz+I3NuBtVx/i/o/QoXHTZ5CPRgaqaE5F9ItIfaAG2AkfhItHNXeTViWE+4isV4cUwPMGepO9Dwspd7IsvH/vCyp1U2cNjKfTFEdqeb1n3NBeNH8WytY1cNH5UqU0Byru8yq3RSdO1vwc3qfM7wL8DP8zzN+b5fUYBL+NmkroH2KCqjcAc4CngEb/9fGA5cKeqtgArgQXAHFVd523+GXB3nnb0CMUecA+HBvL9vXy7j/lsn29XOjyW2U+v57qHa5j99PpUtvU2Fq7ewqKaehau3lJqU7pFTw5NFPs6y/tYVPWQH+Am/3kA18WfnbRPKT64cVxtamrSUtKws0VnLXlTG3a2lNSOgzFryZvKjIU6a8mbBd8+PPZ8y+KmxTXKjIV60+KaVNvna0+5kVReSfaX2/HlW7fyoaePNTiWZP+TuIFbfvkEr2OAE9Jk3NOfcnGkxaxMhSDfCtmTFbjQv1Xu5yLJvny/L7VjLebv9/S5DI6lII70zg6fubgud8kdZxd2loUjLXRlyhrlvZ+prWvSKXOWa21daevEwQjPZWhvvhFpuTccWShxvS+II302iEjHpMm4pz/l4kgLTdjdLXT3Nwvl7tQrzbFMmbNcmbFQp8xZ3q39y/185EPWYylwWST6nzQ3m1qBu4DfAP+Ju4Nu9BSaoCUk63OGxb45cdH4UUwZN7Js7oIncfvnxjNl3Ehu/9z4UptScrLWrR6vm2m8rR6I+o4C7s1nn576UKCItNxa9WJ37bPkl9WWYkeMlRaR5ku+Xfsyi/IOSdZhmQLXzYJ07c/GjY0u9vrJNBn39OdgjjTfAi12ZUyqIMWurOHvl/KGRbHvUpdbo1ho8j13Wc91vg1TPvkXulHIWleKcbNpOW5stMrrX9Nk3NOfgznSfE9+kqPLGuWE42A9fcPgwl8+q8xYqBf+8llVTXasxSQ89msXrlJmLNRrF67qcvsS37ktO/J1JmE63/H2YjrecNukG2+FfqIhIb9E/5NmjLQWN+nIDV5r0w0alAfnfeBIxo6s5rwPHJlq+6SHorNOOhKOg4VjOfmO6yX9/utbdvLZXz3P61t2AnDGcUNjGh5vlged8y2L8NhXbmqKadKELz1tX7HJOslKuH34gkOnSUUKPN4e/n4+dTm8TsN6me91EtaVpHOZ+QH/NN4WOAn4O69D0+zT0x98RPq3NZtiLVkYgSW10s+t26pjb3tCn1u3tcuWK+SmR32r/mj37qJnjUiTtg8j4HwfscmHpLJIKvtHX3tXR9zwqD762rupji0k6dzle656+vGp0L58o65w/6SIs9AvTFz7oO9RPLgqlf0dyfc6LfQwQ8L32SJSEZkvIqNVdS3wHPB54NHuueye4X88tJpFNfVcvWAVAB8aUR3TWUv85LNL1gCdW6obHnuD2vocNzz2RroflEATSIoywpZx8eo6Rt64mMWr67rcP6llDiPge1/axKKaeu59aRMAaxp2cedfN7CmYRfQOYLNh625PXENI6RnfIT0TNcR0h3Pvk1Dro07nn27y7JIYvofXqa2Psf0P7zc5e+H5yopArx3pS+rlZu6zi+pPBLyD8s6LL/w3Cad++a2fTGdfOoIxo6sZvKpI7q0L+9p7xIi2JWbm2Kaz/k749
ihMQ1ta8ztYdnaRhp92eRbN5KONayb+ZLUtf8eMEtE/gV4CMgB53Xrl3qI0YcPAGDMEU6HV/eLaXiyw8oYOt6ki+Hqcz7IrKnjuPqcD3ZpT7h96LiTnMul816iIdfGpfNe6nL/pKGIYdX9mHjSMIb5429u3RfTr/1+JbX1Ob72+5UAfHv+Kyyqqefb81/p0v5DsWLTezENbQ1/O3QkN3z6ZMaOrOaGT5/cZVksX9/IuH9fwvL1jV3+/s8+N54R1X352UGGTcJzlTQ3QOiYwnMV2pPUte5kz4JVsUZ/pS+3SJO6t+H3g/pWxTTfoCCpEf3KGccxZdxIvnLGcV0e/y2fOZWxI6u55TOnAp2d36G47oKTmTV1HNdd0PUEJDN8gDTjodVA9rlPw3MV1s18G80kR/pN3Jjod4C9uIlHru+O4T3F2m27AXizsRmAq8/1F8+57uL5/vkfZER1X75/vksnVcZ8HV+S4zxt1GBGVPfltFGDncFBKx/uf/clpzOiui93X3I60LklTmqZwwh8UP+qmI4a0j+mu/fsi2mWccLEcaz7XmJRTT3T73ONxOLXG6itz7H49Yaut0+IOJdv3EFDro3lG3ek+/2g7Dptn/AM71d9I/RV3wh1KqsgAg5/L2y0Q0J7kuwN63pSUBASNqIh4bUybd5L1NbnmDav6/MXNhT5ENqa9IxtvtF/UgSa7wQ6SY70SWAZ8G3cjEvL/P/KlrOOPxyAT55wBNC5Vbxt6Voacm3ctnQtkNw96nxzaU9MQ8KLKRxE/+6Dr9GQa+O7D74GdG7lw/0njz+a+h9NZvL4o4HkLk9Ygf7fmw0xnTD6CEZU92XCaFc+N154CiOq+3LjhacAoN5LRJrPzbVxIwfH9NfPb2BRTT2/ft4t8TWoX1VMt+xsjWlY9p0apX/8GGNHVjP3Hz/WZVmF5+aOp9ezqKaeO/zFkNQIdmpUA3tDR9W/qk9Mw3M9bkQ11X37MO4gjjLsLYXlF9of2hsO04R1IQwKZkVrKi11jWoYUYajVOH34fF96oNHxTTMIKmh6EhS9J64f3Buw/2v9I3Elft7WsF1HNi+ccfumCZxSEeqqk929UmVc4k4/8SjGFHdlwtOcoudhq1s8569MQ0v9lfrdsY0HEMMu1/hxR5WtrB7db6vdJGGF0tSFBVW7vD4wgq0tjEX02seWk1Dro1rfBfpP55aT0Oujf94Kl3L27HCh8deW78rpkvXbo3phBO8E/eN3MxJJ9G3jzBz0kkANPsoONKbH3ud6x6u4ebHXu/SltDJh+du/ivvxjTsvoWEZR82ciF1vgGINDzXV93/Krm2dq66/1WgsyMLy+Nv7+yI6cPeoUd634qNDJ75CPet8M4liJBDZxE2TH/dsCOmYUR925SxjB1ZzW1TxrryDXoAtzyxhtr6HLc84ew//siBMQ0bjrChOBTh+PAR/avo20c4wvecwt5LGGGG5za8DhtzrTENh6E2eYcZ6Z/faIhpEmmXGqkYZnhH8c++O9G2rz2u7RrTZ97aFtPTjh4S07Ayhb28sCUML6ZR1X1jOmxQv5g+5k9UpLOWrmFRTf1Bo4bP3/UCtfU5Pn/XC10eX+gMjh86MKY3//0pVPftw81/7yLQwwdUxXTH7r0x7RT1dajw4XfbmttiWrdzT0yvuv9VGnJt+x3LjY+9QVu7cqMvq9ARLnj13ZiGXckwAjt+aP+YbvP/j3TTe7tjGhKW/S1/fp1FNfXc8mfnyEPHfrbv/UR6gh+Xj7Sqj8R0cU19TMPeyZtbm2N65KC+MZ1+3yvk2tqZfp9zlKEjzvngINLnN2yP6XG+XCI9vP9hMf3tik3U1uf47QoX4YY9gLB7HY45X+kbjiv9+U1qiDo6w9CxfWfBKtrale/463jzey0xDR1rY/OemP5gUS219Tl+sMg9rbl+W3NMd7a0xXRRbUNMw4AriV7nSPsF3a3+vhJHGobs7/mCjDTszl1x9mj6iFOAE48aFNON23fHdK
R3mJE+/db2mD7zVmNMb/3MqVT37cOtfoD+T6/WxTSMGrbsaovpYN9tizSMsLftbovpT59cR66tnZ8+uQ6AJ95sjGlY4cIx3Y4VPmz1w+h34GES09a97TENHe8QX+aRbt+9J6Yf9Y1bpOEY3FPrt8X0MJfNfo267JGGjdSCVXUxnf9qXEPH/szb22O6YUdLTMMI+42tuZiOHV4d091t7TFV1ZgO93Uq0q/8bgUNuTa+8rsVQOeu+Tp/nyDSsNHO+Ug10kW19TE9clA/ThxWzZG+0d/evId1jTm2R04raMhafT6RhnUx7MFM+/1KFtXUM+33K3nLNx6R+iLYr5/wjVWkaxpyMV26tjGmG7c3x7SqT5+Yvu3PUaTHHd4/ptt274tpEr3Okb7jC26D12Xrt8d0W/PemL69rSWmm72DjfT6xbW0q1OAR2rqYvrAqrguXdMY0wF9+8T02bffi+mPl6wh19bOj/3NoCrRmLZ5pxNpyBP+dyJ9tHZLTJta9sR0lY/2Ih3qo5FIR/puWKTf8E8NfMNHgXt95Lt3Xzszfas/07f6YeVf72/8Rdr/sD4xre4nMX3M36SIdG+7xHSJP8ZID/fdvkijCzxSH1Qf0D3tMQ17G+eMOTKmfXy/I9Iw/5w/0EgvP/t4qvv24fKzj/e/qzEd4R1SpI97hxZp2NvZ/F7rIXVHy76Y7toT123+wCNtbI5rGAGf4oODSMOob+qvn6e2PsfUXz/f5fehK5//yuaYhhH96w079+u73hlH6tve/frUusaY7vCBT6THDO4X0/XbW2L6ER8IRNrmnX2kO1vimi+9zpHu3ueqYfPerh92CyvrLl+QkYaOMbwYd7S0xzSsrFv84HWkaxuaYxqyqm5XTBt9CxjpO02tMQ1pD7S2PhfT0P4WXy6RbvBdpUjf9XZHWp9ri+kG7xQ3bNvdaVghJDoFkYZRwHZfhpGG24fDMDta9sb0/lVbYppr05iG7As0fFzqube3xTQKRg6mIZfd+xK5tnYuu/elLr9/x4+lRuoPe7+GNPhIPdLQ/pAaf84jzZfHfQMV6cubd8Q0rJtrfGQdaYu/9iINnVnoWMOhj46EdWGrd/6Rho32sxvei2lI+H1Y9uF1kC+9zpFmJefPXO4gjjiJ0LHtDTTE17n9mpWwAhaaLf6i3tLcRpPvckdaaVz7kBujvPYhN0YZOop8SXKM5U5Yd/2p3q8hDb5xjTSJcFhquy/n7SnKu4xmj+ySinOkIjJKRJ4Tkb+IyJBS2/N+ZlV9c0wrjUq33ygfKs6RAp8Ffgs8DlxYYlsMwzBSrWtfbhwL1ABtQKfnKv7aeAn+piagvLb1og7fWrpQ6Q8PX1g2tli6N6TJY9ueTydRiY60I52O8Oxh86DfoP1ffnj4wtjGli5cupxssXRlpztSalu6Sich0TNqlYKIfBMYCAwHXlHVBf7/Q4CmpqYmhgyxoVPDMApG4txulRiRPgI8gLsRfnuJbTEMw6g8R6qqW4D/Vmo7DMMwIirOkSaxc2f+ExIbhmEcjKFDhw4BdukhxkErboz0YIjIMcDmUtthGEavZKiqHjRK602OVICjgV2ltsUwjF7H+yMiNQzDKBWV+GaTYRhGWWGO1DAMIyPmSLtAROpEZJmIHJ0xnx+JyM9FZLyIvCgii/xYbtb8PiAib4nIsm7m82U/6csyEfloVtuC/E7NYpvP70si8qSI3CUik0RkhYjMLVB+E0WkVkTmZchvjIi0FvC8RvllOq8d8ovq798VyL4ovwkFsu+zIvJHETmnQPZF+Z1RgLp3tT/Wt0TkX9PaZ440QESqgEdVdaKq1mXI52RgpE9eAtwIbAFOL0B+VcBcVZ3YTfP+oKrnAVuBmVltC/KTjLahqn8EJgJn4xZenAaMEZFhBcivP/BjVb2ku/bhlil/kwKc1yC/rOc1Vn+ByVntC/JrKIB9g4ErVfVLwGcKYF/H/Jqy2qeqs/3+bwAfSGufOdLOHAl8WET+IUsmqr
oGiKKeY4E64F26mGilG/kNASaKyKe6mZeKyECfT2sBbOuY3+AstsH+i7cW+BswwttXBxxTgPz6AV8UkY91My+/ZCZbKcB5DfLLdF49HetvZvuC/Aph37nAeBFZUiD7OuY3ogD2ISLjgXW4401lnznSAFXdCpwDfEdExhTjJzJnoPoSMBW4XUQGdDOb24HriU+4nsW224HrVfXFrLap6j5gPHA4LkrLZF+Q35+By4FfdScv4KvAPV39TNb8CnFeO9ZfClN2HfPbntU+4ChgNvAH4FtZ7Qvy+2QB7AO4CHgw+N8h7TNH2gWq2gY0AkcUKMvNuGdcC/bSgKrmcFMJ9s93XxE502WhLxTCtiC/TLZFeOd3GPCOt28ULjrImt/xuHM7qJtZnQhch3PMU8h+XvfnJyLfLlDZday/metdx/wKYF8DUA20ADsLYF/H/KQQ5YcbBvoLeVwb5kgDRORiEXkG2AO8UqBs5wE/xDmDl7NmJiLfFZHlwDJV7XqRmkMzCbjAD8o/XgDb9ucnIjdktA0R+Z6I/AXYDdyJi9g2qGpjAfK7HHge+GV38lLVGX58dTVuYvFMZRfk178AZdex/l6f1b4gv/Oz2odzUOcD3wQmZLUvyG9fAewDGKWqu8jjurUH8g3DMDJiEalhGEZGzJEahmFkxBypYRhGRsyRGoZhZMQcqWEYRkbMkRqGYWTEHKlR1vhJRm72f98nIhNLa5FhdKbXrdlk9E5E5BO4B68fFZHLgLeAiap6s4gsxr3FUwvcDFyGe1B7GvA/cS8d5HAPqO/BTUZRBYzuYv863Cq1p+ImYDkG+L3PZyVwmqp+V0Q24iay+C7uAfBl3s5lWSbNMCoTi0iNSuGfgbsP8l0DbiamjlzptT/wgt//Qv+/Prg3nLrafwVwGnAGzrF+xP+9AjeBxzgR+TjQ7NOGYY7UqAg+hJvWbLdPzwR+3uH7qmD7M/z2EZOBh4FlPn0RsOQg+78InAmcoqrP46YunIB7rXQI7rXhq31+h/t9bheR34lI33wPzOgdmCM1KoHJwH91SN+Gm8Mzmo9yd7D9ObjueMRi4CwORKGn4rr7nfZX1bW4CUly/l+NPr9VuCkCHwc+AWwEBvptZgDvAd2ams+ofMz5CKxZAAAAz0lEQVSRGpXAfFXdfpDv7sDN1jMP53CPxk2p1uy/b8VNr/YgsMj/7y4OTIsW21/cqgibgKf890uBLarajpvLdDUHpn8bAOzATSH4IaCm+4doVDI2aYlR0XS8uSMidwE3q+pbWfYH/hX4uaq+LSKfx61pfrDxWcMwR2pUNiJyup8QGREZC7ylqi0Z9v8icIyqXi0in8ONh35eVcPhA8PYjzlSwzCMjNgYqWEYRkbMkRqGYWTEHKlhGEZGzJEahmFkxBypYRhGRsyRGoZhZOT/A2nNJe7oUm02AAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f2d8f630>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Draw a random subset of the samples for plotting\n",
"np.random.seed(seed=7)  # Set the random seed to get reproducible results\n",
"# Randomly select 70 samples\n",
"samples_index = np.random.choice(range(counts.shape[1]), size=70, replace=False)\n",
"counts_subset = counts[:, samples_index]\n",
"\n",
"# Customize the x-axis tick labels to make the plots easier to read\n",
"def reduce_xaxis_labels(ax, factor):\n",
"    \"\"\"Show only every `factor`-th label to prevent crowding on the x-axis.\n",
"\n",
"    For example, factor = 2 shows every second x-axis label,\n",
"    starting with the second one.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    ax : matplotlib axes to be adjusted\n",
"    factor : int, factor by which to reduce the number of x-axis labels\n",
"    \"\"\"\n",
"    plt.setp(ax.xaxis.get_ticklabels(), visible=False)\n",
"    for label in ax.xaxis.get_ticklabels()[factor-1::factor]:\n",
"        label.set_visible(True)\n",
"\n",
"# Box plot of expression counts by individual\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(counts_subset)\n",
"ax.set_xlabel(\"Individuals\")\n",
"ax.set_ylabel(\"Gene expression counts\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_07.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 90,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAUwAAAClCAYAAAA3UsShAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAE8xJREFUeJzt3XmwJWV5x/Hv47Aqw6rDJospwhoXDCEYIsxMxoQAQ1BAiSWrBBeYSDkVRAVqRAJUUkyoQEKMBggQamRIRMfIqGS4iFYwCIKlQAKmQIkOMGzDvj75o/swh8Od2293v28v5/4+Vbfu6XtPv+/T3W8//XafPm+buyMiIsXe0HYAIiJ9oYQpIhJICVNEJJASpohIICVMEZFArSZMy8w0M2szDhGREOu0XP9GwOrVq1e3HIaICIUdN52Si4gEUsIUEQmkhCkiEkgJU0QkkBKmiExrtnBZ8HuTJUwzO8vMLjCzHc3sPjObSFWXiEgTkiRMM9sJmJVPzgAudffZKeoSEWlKkoTp7vcCS/LJmcBsM9s/RV0iIk1Jfg3T3W8HDgYWm9kGqesTEUmlkQ993P1p4EVg/SbqExFJoTBhmtnRZrbEzD5uZt83szPKVGBmC8zsZmDC3Z+oHKmIJFXm0+LpKqSH+QngJOBUYA5wUEjB7j7h7qe4+4Xuvo+7n1Yjzt5QoxNpTtP7W0jC3BRYCmwGfCefljE3rol/XJdLmlGYMN19N3ef6+6bufscd9+1icBk7cZlpx+X5ZDpI+Qa5ifNbMLMrjezFWZ2UhOByfSjBBpPinXZhe3Tdgwhp+THuPtsd58H/AFwbIpA2l4RkxmNabIYy8bdxnI2UWcbdXSxzcQyLtts3IQkzCvM7EYzux64Abg8cUytGddkFkMfDgxdpXVRXdfWXcg1zIvcfX93n5f3NC9sIjCRvirayaskga4ljlS6vpwh1zBvyK9dPpr/XtFEYOOi6w1goC9xynjpW7sL6WHOcfe5wB35p+VzG4irtJAV37eNE2o6XduTyfXlclIX2madGIJ7mMBWTfYwddF7jb7EWWRclqMLtC7XLuW6Cephkg2eMbfNHqZ6Ud3S1QNaE+0kdplqy/0R0sP8B+BKYHk+/eXUQcWihihFihLsuFzq6evpcwwxlyPktqJ3AguAR81sFvCOaLWLiPRISMI8BTgHeB74a7KBOCSRrp7qigisE/Ceo9z9mOSRiIh0XEjCPMjMHhr+g7uflSgeERkjtnAZfv78tsOIJiRhHps6CBGRPihMmO5+YxOBiEi/jVtvcjKNPNNHRGQcFPYwzex2wIHHAAO8q1+PFBFJKeQa5hzgRGAWsNTdb04bkohIN4UkzMX57/XIxsb8b3c/OGFMIiKdFPKhz3FNBCIi0nVlr2ECoGuYIjId6RqmiEggXcMUEQkUkjC/NNyrNLPZ6cIREemukBvXF49M/2WKQEREui6kh3mpmd0AvJK//6q0IYmIdFNIwlyVP6YCADP7YMJ4REQ6K+SU/DNmtj6AmW0ALEwbkohIN4X0ML8IfM3M1gNeAs4LKdjMzgI2Bv4RuAJ4EDjI3b1irCIirQpJmCvIEuWG7n6tmW1aNIOZ7UR23+ZzwJHAmcDhwLuAH1cPV0SkPSGn5N8EdgU+l09fXTSDu98LLMkntwFWAr8Gtq0Qo4hIJ4SOh7kKWMfMjgBm1KhPp+Mi0lshCfNwsm/5XA3MBD5Qso5fAVsBW+evRUR6KeQa5nbAfmSn1g8C9wA3lahjCXA58BBwR9kARUS6IiRh/j1wDHA/WfJcCuxVNJO7TwAT+WTh+0VEui4kYW4CfD5/bcA2ZnYJgLsfnyowEZGuCUmY84ANhqYXpQlFRKTbQhLmuUOvBw9BU89SRKadkIS5K9nN55ZP69YgEZmWQhLm23j9abh6mCIy7YQkzLnAM0PThV+NFBEZRyE3rl/s7vcPfoC/SR2UiEgXhfQw7z
Wzy8juw9wBeCBpRCIiHRXyXPITzGwHsm/6PMTQ43ZFRKaTwlNyM/sWsDvwE7LH7V6ZOiipbvPTl7/mt8h0Fnt/CLmGeQowH7id7HvkH49S8xRGF3JckkCV5Sg7z2PPvoifP5/Hnn2xfIAdFqMNFJWRoo5xabujmmjLMcTeH0IS5meBDYHvA/uS4Js+25/1XWDNihxdyBgLXXdjxdjYVZZjdJ42Gl1REmgi0cRoA0XrMsa6rtt2Nz99ObZwWal6m2gTMbZH3fWbYjnLlhmSME8ge8TEt4FL8umoHn+ufKMa/h2i7saq0kBSbOAmGl3RzlE0PVkdRfWmOEgWKapjsv+nPvAO6pwqrrqJvkpSTpH4i8pMcdAsG8OokIT5r8DeZI+Z2J9stKJWhSxk2R00Rq+p7AaOUWfZDR7y/roNc7L5u9BTjiHFuilStl2F/j/lASlGHSk6KXXbXUjCfKO7nwdc6+5nk41e1Hl1E0mVpNxEneNiOi1rXXXXVV8PTlXUPZgUCUmYBwC4++CZPn9cqaaEBg3BFi7r9QXlFMZ1Z6lyaaGtOFLOH6KJdtrmtfUm9/vChOnur4xMdy47NHGK0Vd9SOpVNHFpIVYcKeePpe7loTY+mG1jvw+5D3OWmX3QzI7Of+Y3EZiINKcLl4e6cvCYSsgp+XXAFsCpZEO8fSFpRCIiHRWSMH/u7hcDvyBLnC+lDUmaNq7XOUViC7mG+cH85UfIkuahSSOSxvXhVCiEEr+kVjj4hpkdDRxI9gTIjwC7AV9MGdTPVh3CncfAzwAN8N59Xdleg8Q/uGG6SFfi7qPpuu5Chnf7BHAwcAuwC9kzyaMmzP965EjuPObIV1f+Hm/+xqsNf5w2RZVGVnaeNhpyE9srxnKNljEa93RNAiGK1l2VMrqgbEyhj9ldCmxG9vXI6COu773FEvzCD41dghxVpZGVnafo/TGSdhMNP8YOOqqojJA66h7A+rL+R5Vd/4MYId4BKkWHo+xyhSTMw9z9rqDoEknRu0gxf92G3UQdMZJ2jKRctBxdOMuYbDnqHsCaWP9dMIgRqHyAitEmYp9FhHxK/k9mtp2ZbT/4KV1LTXu8+Rvs/s/ZCoPBQhs/W3XIWucZfU+VMqaKIaSOyeYpW0fRe4rmKbucVeeZKsaQ95RdV00IiWl0XdVdd4P5y5RRNoYqdbQhRZuoW2ZIwnwb2b2Xg59FlWqKqMoOWfT/Kg297sqvu3OFqBJjiobaxLK2UWfsxD+Yv86BNrTtd+0A1Qchp+SXu/tnkkdSw2TXS8pq4wOAPpxaxdLGsvZh/bZxPbivdXQhjpAe5m7DE/kjKzolxRGzi6eIMn6aaGdV6ohxyaoNqeMI6WG+YmaLWPPUyK4erGUa6UqPZlz1oXfehpAe5geAG4Bnye7BrPRNHzNbaWYTZrZVlflFhnWlRyPTS0gP81jgCGBdd59nZguAC8tUYmYzgOvc/bjyIUpq6q1JH8X47KKskB7mn5F9NXLw3j+tUM9mwB5m9v4K80pi6q1JG3cx1I2hjU/7QxLmVcB/kCW85cBlZStx91VkT5w8ycx2KDu/SIgu7PR91YWDZhdiKFJ4Su7uF1LyFHwt5bxoZo+QfbXy/rrliYzSBxWSWsiI6zeY2Qozezj/vaJsJWZ2uJn9AHgB+EmVQEVE2hbSw5xjZhsBy919bpVK3P0a4Joq84qIdEXIeJiDW4rOTx+OiEh3hdxWtIj883oz2w/A3b+XMCYRkU4KSZgfBeYBS4DH878pYYrItBNyW9EZwIeBWcCWQOe+Sy4i0oTQb/o4cE8+fSDwo1QBiYh0VUjCnMh/O9lzyXWLm4hMSyEJ81Je/+0eXcMUkWknJGFuCGwLPEF2Kv61pBGJiHRUSMLcB1iX7CuN7wH+DZifMigRkS4KSZgPkA2csQ3Z1xq/nTQiEZGOCrmt6GvAHOBNwGxgccqAROT1bOEyNttw3bbDeI0uxpRaSMLcAlhBNsTbinw6uum48qV9fW
h3g+d7P3r2AS1HskYXY2pCSMI8CvhD4HPA+8huYo9q9TkHAq9d+X1oyLJGE9srdh193um1f8RTZl2GJMzDgLPd/UTgauDsGrEF6XND7oM+Jp6m2kQfEpH2j3qGt3HZdRmSMH8KLDWzrwB/AZxZKUoBinfI1Dts1Z2tjUSSos6pyuzTumlD2eW0hcs6t27qHmxCEuapZM/k+RNge+DLlWqSwo21tv9Xaah960GmqrNObyJEjDK7llQmU3Y5/fz5SdZ32+uqMGG6+xx3fy+wW/660iDCUk2Vhlrm/eMs1bqoewBLkcTbTiSh6sTZhbYd0sMcuDpZFA2p26gmm3+qnSFVTG33OGPV2YedfDTGugewrvZyYyjanpPF2fYlqrJCnumzV/7ybxPHUkrZHbLKxho22fx1d4YqDb3scnT1dLorO/lU2oqxi9f+isRqy3XLTC2kh/lXAO5+beJYphSS/Mqc6ozLp65tnXY2/YFMUzG0Lda1v5Dt17feXReEJMw3DJ4WOXiCZPKoRoQmPz9/fqcuvnfxCBki1cEmZk+4r+u2CUVnISFJuc3eddsH4qmEfOgzGzge+Dxw3Lh+6KMdMC2t3/b0Zd334fpuyFMjL81fPgC81czM3Y+NUnvH6ZREquhru+lj3LZwGUCtD17LCDkl3x04093PILtpfY8kkXRMX47K0i19bTd9jHuyS3GplyNkeLc/B84xs1nAg8CCJJFIq/rYuxBp2pQJ08y2B34NnD70Zz3TZ8z4+fOxhct61bsQaUNRD/NmYDlrHn42+H184rhERDqnKGHeDywCngcedvdXkkckUoMuLUhKRR/63E2WMM8FrjGzb5rZ+5JHJVJBHz+4kH6Zsofp7scNT5vZG4GvAt9NGZSISBeVGXwDd3/G3Us/MdLMtjSz/zSzm8xsZtn5RUS6oFTCrOEg4ErgemBeQ3WKiEQVch9mDNsAdwEvAts2VKeISFTmnv62SjM7nSxhbgGs6+5/l/99JrB69erVzJypM3URaZUVvaGpU/JfAVsBW+evRUR6p6lT8n8HrgVeAhY3VKeISFSNJEx3fxB4TxN1iYik0lQPc0pPPvlk2yGIyDS38cYbzwSe8ik+2GnkQ5+1Vm6ma5oi0iUbu/tae3BtJ0wj+zDoqdaCEBFZo7s9TBGRPmnqtiIRkd5TwhQRCTQ2CdPMVprZhJltFaGss8zsAjPb3cxuNbNv5ddbY5S5o5ndZ2YTNcr6UD6QyYSZvSNGjCNl7lw3xrzMI8zsRjO7zMzmmNltQw/Vi1HmbDO728yW1CkzL3cHM3s+1jYfKq/29h4qc9DGfy9iuxyUuU/EOA8ys6Vmtm/EOAdl7hmpbZ6cL/d9Zvap0DjHImGa2QzgOnef7e4ra5a1EzArnzyS7MFvDwLvilTmDODS/PHFVV3t7u8FVgGnxYhxpEyLECPuvhSYDewNnAgcBexgZltEKnN94Fx3P7JOnLlTgHuItM2HyouxvV/TxoEDYsQ4UubDkeLcCPiYux8B/FGkOIfLXB0jTne/KC/jf4AdQ+Mci4QJbAbsYWbvr1uQu98LDHos2wAryZ5rVHnQkJEyZwKzzWz/GuW5mW2Yl/V8pBiHy9yobozw6g55N/Aj4C15nCvJviIbo8z1gMPM7J0149w8f7mKCNt8pLza2zs33MajtMuRMmPF+fvA7ma2ImKcw2W+JVKcmNnuwP+SLXtQnGORMN19FbAvcJKZ7ZCqmiiFuN8OHAwsNrMNahS1mOzhdC8PF18ntkGZ7n4rEWJ095fJHtO8CVlP69V/RSrzO8AJwFeqlpf7MHDFZNXVLS/W9h5u48Rbl8NlPhYjTmBz4CLgauCjMeIcKfN3iRMnwCHA10f+NmWcY5EwAdz9ReARYNOIxSYZNMTdnyYb6m79KvOb2buzYvyWWDGOlFk7xoE8wa0DPJDHuSXZ0TxGmW8l2+ZvrFMe8BvAqWSJ+EDqr89XyzOzEyOuy+E2HqVdDpcZKc6HgTcBzw
FPRopzuEyLtT7JLu3cRIl9aCwSppkdbmY/AF4AfhKx6CXAF8h28jtiFGhmC8zsZmDC3Z+oWMwcYG5+4fv6SDG+WqaZnREhRszsFDO7CXgWuISs1/ULd38kUpknAD8ELq5aHoC7fzq/Dnon2QDXtdbnSHnrR1qXw2389LoxTlLmfjHiJEtA+5E9WXafGHGOlPlypDgBtnT3pyixn+vGdRGRQGPRwxQRaYISpohIICVMEZFASpgiIoGUMEVEAilhiogEUsKUxuSDZSzKX3/VzGa3G5FIOZ14po9ML2a2F9mNyNeZ2bHAfcBsd19kZsvJvslyN7AIOJbsxuWjgM+S3aj/NNnN2y+QDZwwA9hukvlXkj2tdGeyAUW2Bq7Ky/kx8HZ3X2BmvyQbdGEB2Q3RE3mcE3UHeZDxoh6mtOGTwOVr+d/DZCMGDftY/nt94JZ8/nn5395A9o2fyea/DXg7sCdZAv2t/PVtZANP7GZmvw08k0+LTEkJU5q2C9mQWs/m06cBFwz9f8bI+/fM3z9wAPBNYCKfPgRYsZb5bwXeDfymu/+QbIi9fci+TjmT7Gu0J+flbZLPs9jM/sXM1i27YDL+lDClaQcAXxqaPo9s7MjBuIfPjrx/X7LT6IHlwO+wple5M9lp+uvmd/efkw2o8XT+p0fy8n5KNoTd9cBewC+BDfP3fBp4Aqg1ZJyMJyVMado17v7YWv53IdkIMkvIEutWZEN6PZP//3my4b2+Dnwr/9tlrBmS6zXzWzb6/v8B38v/fwPwoLu/QjaW5p2sGYJsA+BxsiHudgHuqr6IMq40+IZ0xvCHLGZ2GbDI3e+rMz/wKeACd7/fzA4le+702q6fikxJCVM6w8zelQ+4i5ntCtzn7s/VmP8wYGt3P9nM5pNdrzzU3UdP+0WCKGGKiATSNUwRkUBKmCIigZQwRUQCKWGKiARSwhQRCaSEKSIS6P8BkLqiazqRmbQAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f269fac8>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Box plot of gene expression counts by individual (log scale)\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(np.log(counts_subset + 1))\n",
"ax.set_xlabel(\"Individuals\")\n",
"ax.set_ylabel(\"Log gene expression counts\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_08.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVgAAAClCAYAAAAZF+UzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAGmRJREFUeJztnXuQHlWZh5/XABHCICGQBBCBFckFVEC8YFAnETUbSJabCtYiRFm8hZUNZUQFKoKrVCyy7Mqu66KGRVcjsIuEmKBCMhAoUIQAJUm4BxDMVXRCgCSGd//onuSbTk/6dE/3d5vfUzXV0/11v+fX5/L26bdPnzZ3RwghRPm8rtEChBCiXZGDFUKIipCDFUKIipCDFUKIipCDFUKIimhKB2sRHWZmjdYihBBF2aXRAvpgT6C7u7u70TqEEKIvMjuATdmDFUKIdkAOVgghKkIOVgghKkIOVgghAln30ia+vfgJ1r20KWh/OVghhAhkzn3PMWP+cubc91zQ/s06ikAIIZqOqe88qNcyC2vG2bTMrIN4mFZHR0ej5QghRBoapiWEKI+8MciBjhysECKYvDHIgY5isEKIYPLGIAc6mT1YM/ukmc01s8+a2V1mdkk9hLUburUS7cC+ew7mS+MPY989BzdaSm4a0QZDQgSfA74AzADGAydWqqhNaeVbK10cRDvQiDYYEiLYG7gBGAr8Kl4XOWnlW6ueignwpfGHNViNEMVoRBvUMK0BzrqXNjHnvueY+s6D+rztC9mnEbqEaDD9H6ZlZp83sy4zu83MFpnZF4JSNrvMzK4ys7Fmdr+ZLaid37Wv7e1Cq9xWX33308yYv5yr7366z32Scbd6nFujQiqtUm6iNQiJwZ7t7p3ufgLwQeCcrAPM7DBgeLx6BnApsBo4qma3vrY3PSGNsBEOopBz8MQygHqc29R3HsSsk8b0up3Le35p+2fZaOVYeRW08gWnGbSHxGB/ZGZ3AFvi/a/LOsDdnzCzucDJwAHAKuCPwIHA0ni3vrY3Hcnb1ZCYZCPiPUVipdOOP5Qhg3fJpbMe59bTa64l7/ml7Z9lo5Vj5UXICsW0cvy9KbS7eyV/QCdwFfB94B3AN4ETa35P3R7/1gH40y+s9VmLHve1G171RjJr0ePO9Hk+a9Hj7u6+dsOrddGVN5166WoUZeRHu+dRXpJ1O0mR/KpHHoekUQcd2X4wcwdYDCwC/hQvFwUZ3u5gLyMa2jUHOLrm99TtXuNgL5u/dKeFH0IZmbxiVbdPuuZeX7Gqu7I00siq/K1EMo/k6HZOEQfSLM6wHvU2LY0G1KlMP5gZg3X38e4+AXjI3SfE/+dhLvB1YATwkJmdH8doe21PO/CsY964QxwuL2XE1OYtW82C5WuYt2x1ZWmkxYvS4pDNSJGY9NV3xQ/X7ur74dpAJqRO7ZCnAQ8sk+R9cSCkrKeMHcGkMcOZMnZEsI68pLWNZoyfZ8ZgzWwx0SOQkWa2CCDEybp7F9AVrx5b89N3av6v3b4Dw1LicHmZMnYEXU+u71dhZ8Xl0n7Pim09unoD029ZxuzJYxk1oiM1XpQWh9wZ9RraVEpMumfcSInjR4qcf6OGg2WlGxIL3mGfAg8s8xJS1j0dks43D+NLI6oZZpnWNuoRP89dX0K6ucAewP4h+5bxRxwi6O5OvyXPQ6Nus7PitpOuudeZPs8nXXNv6u9lpFmW3aSNMmLSWccUsVmkrJu1foTQiLBLWhr3PLXOR19xu9/z1LqG6qjHMYly63+IwMz+E/gxcGu8fk1x/19/3nfIUEYPH8L7DhkafEwZwzuStzDJ25fZk8cyacxwZk8eC8D6jZvpenI96zduLi1NKOdWPKk9mU7IbWbePC1yu5e8NW2W29mQ8E+RcsqbR1UNW5p6/UOsWLORqddHkb4q5itIak8796zzKxJCyar7WYSMg307cD7wJzMbDrwtWF0TcNGCFaxYs5
GLFqwIPqaKWE6yIY8a0cEvzn03o+JbqOm3LGPB8jVMv2XZtmNKaRAZt+JFnFCRBpTM01mLn2DG/OXMWvxE6v5F4s8/Xfo8C5av4adLnwfSnVbyfLPi62WQVp92yMMC5ZQsl1mL4jxdFOXpo6s3cOL3f8Ojqzf0qSNvHUtzUjM/9BaG7Po6Zn7oLUE2ipDUntZxymy3iRBKPS7AIQ72AqKhVJuAbxNN/NIybN76Wq9lGnbhLb3W8zbuZEWGHQs7qyEne7RpNrJIcyjTxh3KrJPGMG3coanHhKRRhhNKVtSlz3f3WkLvcijixF/esrXXMs1pJZ1QPR4khqQx8fD9GD18CBMP32/bttr8SCunZLksfaG71zJ50U69w0k4zLS63IuUOO91D7zAxi2vcd0DL/R5fsk2lpdk/bn89idYsWYjl9++/QKdlc/Tjo/bwvFRWyhS9/O2yRAHe5a7n+3uE+Pl74Isl0SRgqk95vhD9um1DLG5756DmTF/+U4bd62NtN5nMuCeVfjJHi3kv3q+vHlrr2XIuYQ0/jIeHiQr6uUfOZzRw4dw+UcOL2wzyR67Duq1TLu4JJ1Qs0y/l+Ywakkrpx3CDKccyaQxw7n6lCMBOP+9B7PfkF05/70HA32ca8JhptXlWs48+kAmjRnOmUcfuG1bWuegbJL1Jy3NrLqePP+Qup/sKVcRIjjRzC6t/QuyXBF5He6MCYcx66QxzJhQ3ZscfRV2cpms3FnnktZz3Nkxewwe1GvZF3l7islzyezlpOhMVswlK19kxZqNLFn5YrCNrH2SPZQ0kk4oNJ3+EBL7y3JSIeU0bMhudL55GMOG7AbAlXc+zdqNW7jyzr7TTeZZlo6fPhiHYR58ftu2tM5BGXlaa6OKEQIheZq88OW9IIc42HOAOxJ/TUMZvdH+klbByiDkatnLwcQ9tb7CASE2QjjvxodZsHwN5934cPAxab2H2mUZJNNIc2whZZXMj7yOfof1gOFTWbrS4oVZY4uPPmCvXss0knk2akQHC5av6dNZvrxpa69lvUhe5IvUwSL0t3ce8qLBHcm/Qim1OFmNrEgjzPo979UyWQmLkqVzSxzP3lIT1857/kW15roY1GFcaAg9vcNkzzrPuXx5/jJmzF/Ol+f3DkPVXoCTMeieu7Yy79722G1Qr2WjSKuDWaRdpLLKIO2Ckwd99LDJqfr2tQjHH7pPr2WzMnFU/OBo1H7ZO+ckT7mEXkx2ZvPmR1b3WvbYq707S8ag09Ltb33qKwyT125/dYTWwawHhUXIoz1kHOyDZrY0ngt2cc/bXGLgMiN+g2ZGE86uVFv5sx4ctRI/PvOoXss0+uopZ5H3YtEMDwVPPXIko4cP4dQjRwYfE/rQuMxOTUgPdjzRvAFLga94/rkIRJtRViiianriZsn4WTPeFWQxcezIXss06lEuVc6xmqdcLvnlY6xYs5FLfvlY8DE/+O2zLFi+hh/89tki8goR4mBnA6OBkURzw86vVpIQ5dATNyv74eNAJjmOuFGM2m9Ir2UIaWOvqyZzshd3n1oPIUKI5ic5jrhR7BsPRetZhnD1KUcyelZXryF6VZM3BrtIMVghBi49zqmeTiqNIvHmRtzRhHwyZjxwHtE3tm5w93urlSSEaFaaJezSKs8BQhzs7Hi5G1EM9lF3P6lCTUII0RaEONjv1fZazayzOjlCCNE+hI4iqOWfqxAihBDtRkgPdk782ZjX4v1/Uq0kIYRoD0J6sOs8+vDhB939A8DavImY2TQz6zKzlWb2mZrtq+Lt4a9jCCFEixDiYL9sZoMBzOz1wIV5E3H3q929E3iM6K0wzGwQsNDdO919VV6bQgjR7ISECC4HbjKz3YC/AlcUScjMxgJPuftf4k1DgSPM7BR3v6mITSGEaGZCHOwiIse6u7v/3Mz2LpjWFODmnhV3X2dm44CFZvaAuz9T0K4QQjQlISGC+URzEXw1Xr++YFqdwJLaDe6+BVgPFHXaQgjRtIT0YA
HWAbuY2UeBojPtjnD3l8zsDOAZ4EDgn4CngGqnJRdCiAYQ4mBPB04m6rl2AKcWScjdj46Xc2s231jElhBCtAIhDvYg4P3AAcBq4HESt/pCCCF2JMTB/gdwNtFt/UHADcCxVYoSQoh2IMTBvgH4Wvy/AQeY2Q8B3P1TVQkTQohWJ8TBngC8vmZ9ZjVShBCivQhxsN+q+d8AV89VCCGyCXGwo4EziJwrNPwr80II0RqEONhD2TEsoB6sEEJkEOJgJwAv16zrrSshhAgg5FXZ77r7Mz1/wL9ULUoIIdqBkB7sE2Z2LdE42IOBP1SqSAgh2oRMB+vu55rZwURvcq0BXqxclRBCtAGZIQIzWwCMJZqQ5Tzgx1WLEkKIdiAkBnsBMBl4kGgegs9WqkgIIdqEEAf7FWB34C5gHHqTSwghggh5yHUu8AFgOPA8cHelioQQok0I6cH+L/Au4FIiR3tDpYqEEKJNCOnB7uHuV5jZXu7+DTO7rXJVQgjRBoT0YCcCuHvPN7n+tjo5QgjRPmQ6WHd/LbG+pUhCZrbKzLrMbGS8PsLM7jGzJWbWUcSmEEI0M5khAjMbTvRF2J45YV9091vyJGJmg4CF7j61ZvOJRGNq9yWac/amPDaFEKLZCQkRLASGATOIpiz8eoF0hgJHmNkpNdsOAFYBfyT6wqwQQrQVIQ72SXf/LvAskaP9a95E3H0d0RjaL8Sv3e6wS16bQgjR7ITEYD8W//v3RE725CIJxbHb9Wyf7vAFYCSwf/y/EEK0FSFzEXzSzOYCHyN6bfbTeRMxs9PN7G5gMzDWzI4DfkHktE8ANPRLCNF2hIyD/RxwEnAfMApYAlyeJxF3vxG4MeWn4/LYEUKIViIkBvsGore3hgK/RF80EEKIIEJ6sKe5+/LKlQghRJsR4mB/YGYfZ/tXZXH3Z6uTJIQQ7UHoV2W/Tu/PduurskIIkUGIg73O3b9cuRIhhGgzQh5yjaldiT8hI4QQIoMQB/uamc00s6lmNhO9ddUw9rn41l5LIVqBkHrbrnU7xMGeCiwGXiEaA1voTa4ivOmyXwM7Zn6ZhdBKBfviK1vwKyfz4iuFJjQDss+3jPyoKk8bUVZpafZVH/PoapV6V4bOZL1Ns1lG3U7SDOUS4mDPAS4CznX326njRw///GrvTA8pqLwOJK1gsxpQSAMrw0YWRXoGWXkYkh9509jZeebJj5Dyz0tWumnnktyWpauIQymjPpRRx0IcX950qrCZtk8Rp112HQtxsP8ATKrZ98xCKVVASOUvkulZDShrvSwbeStqkYZcJD/KsFEkP7JsltEIy+hJFbGZ90JY5FyKXPiqOt8qbOZNN+Rc++twQxzsT4DbiaYbvBW4Nshyk1DFrUe9yKu9XufarLe3VTTCelHkItYoR9YulNHDzSJkNq3vuPsEdx/h7hPd/b+C1Yi2pFUaYbNeCET/aZWyDZlNa7GZLTKztfFyUT2ECdFfWuVC0Eo0i2MrUraN0B7Sgx0PTAEejXuyE6qXJYRoRlr5otUI7SHf5OoZonVl9XKEEKJ9CHlVdibxywVm9n4Ad7+zQk1CCNEWhDjYTxN9dWAu8Od4mxysEEJkEDJM6xLgE8BwYASguQiEECKAkB7sOUQhgsfj9UnA76oSJIQQ7UKIg+2Kl040J2zuyV7iCbunAVuBD7v75nj7KmAFcIa7r8prVwghmpkQBzuHHd/eyhuDvd7df2ZmNwJvAp4ws0HAQnefmtPWgOWRdVNYdjY8AmhSs/qgPBf9IcTB7g4cCPyFKDRwU95E3N3NbHegA3g63jyU6PXbU9w91eZv15/BsrPPUOWOOWLfefiVk7ELb6ksN6pyKK3qqOqR52XRrHkcoqtZtfeXEAf7HmBXoq/JHgf8HzC5QFqzgYvdfSuAu68zs3HAQjN7wN2fSR7wrmFz8e98fFvlbuVCaFbtSV1VOZS8duuVX0XSyXtMPdKAHfO4UXUuq06l6WqVC1nePA1xsH8AxgEHAA
8Tfbo7F2Z2DFFH9r7a7e6+xczWEznvHRxskjIKoVENt4rKX6RnUMShZqVThQMpoqteTilLWxkXrTLqeqOcVla6Vekqoz5k2cyrPWSY1k3AeGAI0EnUE83LeGCCmXWZ2cVmdpyZnW5mdwObiRx3XThi33mM/e8oo6AnA41H1k3Ztk/atv6mU0RHGeeS3CdLV73OpYiOpN0yzq0eulqZkLIso72UYbNZ6kMtIT3YYcAi4DngIGBi3kTc/UrSX7W9Ma+tWtKuUHmvYmlXpKxeTRW94DQdedNpltusZgkz1It63FnV644nSci5VVEujQp3lJ1OSA/2LODDwFeBDxG9dNAUpF1d6nEVK6P3WSTddqaqPGxUOnl15L1LKEKajaSOZsmfJPVqC2WnE+JgTwO+4e7nAdcD3ygl5TZiIDlCqKYRtmoDaqSOKkJZzZI/9aAeF5MQB/t74AYz+z7wJeDSytSIlqBVGmGz9sbKolXKoVmpR/6FONgZRGNW/47oJYFrKlMjRInIAYlGk/mQK55wGzPb193XVS9JCCF2TrOOK08SMoqgh+sBfc1AiAFMszi2IiMXGqE95Jtcx8b//lvFWoQQTU4rh10aoT0kBjsLwN1/XrEWIYRoK0JCBK+r+ZKsEb3yqlCBEEJkEPKQq9PMDgH2B/7o7isr1iSEEG1ByFdl58T//gF4o5mZu59TqSohhGgDQkIEY4HT3f05MzuIaLpCIYQQGYQ42H8Evmlmw4HVwPnVShJCiPZgpw7WzN4E/BG4uGZz847qFUKIJiKrB3svcCvbP3bYs/xUxbqEEKLlyXKwzwAzgU3AWnd/rXJFQgjRJmQ52BVEDhZgbzPbDfhXd/91paqEEIWwC29h6O67NlqGiNmpg01+UtvM9gB+BsjBNgg1oPrTKnne827+n76R+6MjldMqeVg2Ia/KbsPdX3b33F+UNbMRZnaPmS0xs46+tqUemyiYViqosrX6lVHWl92AysjjkGOqKLuqtfeV53nTKePcqyqXKqhNNy0P03Q1a9vuTx3L5WD7wYnAj4HbgBN2sq0X3d+cBGwvmKIFlXe9DBshWovoSNLfc0nqDHUoeRtQFfkRoj0r/6o6Jit/0ihiI+/FoR7tJet8036vqr0UcY5F2kefuHvlf0TDvE4DzgOm9bWtZv8OwLu7u53p87yWrPWQfWRj4Npg+jxn+jwf+rWF0jHAbaSVQU4bmb7P3Ksf1mpmFwPLib5Qu6u7/3vatpr9O4Du7u5uOjr6jB4IIUQjsawd6hUieAEYSTRhzAs72SaEEG1Dni8a9IdfAD8H/go8ZWbHJbbNrpMOIYSoG3UJEeRFIQIhRAuQGSKoVw+2EBs2bGi0BCGESGWvvfbqAF7ynfRSm7UHq7isEKIV2Mvd++wJNquDNaIHYC81WosQQuyE1uvBCiFEO1CvYVpCCDHgkIMVQoiKaHsHa2arzKzLzEaWZO8yM7vKzMaa2f1mtiCOGZdh8xAzW2lmXf209/F4Ep0uM3tbGToTNg8vQ2ds96NmdoeZXWtm483sgZoPbZZhs9PMVpjZ3BK0Hmxmm0ou+x6bpZR9jd2eev/eErX22HxPieV/opndYGbjytKZsHt0SW1qWnzuK83si6Fa29rBmtkgYKG7d7r7qhLsHQYMj1fPAC4l+k7ZUSXZHATMcffOfsgEuN7d3wesAy4qQ2fCppWkE3e/AegE3kU0L8VZwMFmNqwkm4OBb7n7Gf3VClwAPE5JZZ+wWVbZ96r3wETKqae1NteWodXM9gQ+4+4fBT5Shs4Uu91laHX3q2MbjwGHhGptawcLDAWOMLNTyjDm7k8APT2hA4BVRN8sO7Akmx1Ap5l9oJ863cx2j+1tKklnrc09y9AJ2xruCuB3wH6x1lVEr1CXYXM34DQze3s/de4T/7uOkso+YbOUso+prfelaE3YLEvr8cBYM1tUos6k3f0or66OBZ4iOv8grW3tYN19HTAO+IKZHVxlUqUYcX8QOAmYbW
av76e52UQzlm2tTaIMm+5+PyXpdPetRJ+GfwNRL27bTyXZ/BVwLvD9fsgE+ATwo7TkyrBZZtnX1nvKy9Namy9SjtZ9gKuB64FPl6Ezxe67Ka9NTQFuTmzbqda2drAA7r4FWA/sXbLpSiarcfeNwBaiW9tCmNkxkSm/j5J0JmyWorOH2CHuAvwh1jqCqIdQhs03EpX/Hv2U+TfADCLHPYlyyn6bTTM7r+Q8ra33pdTTWpslaV0LDAFeBTaUpTNh10rM105gCTnaVFs7WDM73czuBjYDD5dsfi7wdSJn8FAZBs3sfDO7F+hy97/0w9R4YEIc2L+NcnRus2lml5SkEzO7wMyWAK8APyTq0T3r7utLsnku8Bvgu/3R6e7T4zjuMqIJ4vudpwmbg0vM09p6f3EZWhM231+S1iXA+4m+Uv2eMnSm2N1aVr4CI9z9JXK0fb1oIIQQFdHWPVghhGgkcrBCCFERcrBCCFERcrBCCFERcrBCCFERcrBCCFERcrCiqYgnZ5kZ//8zM+tsrCIhitPU3+QSAxczO5ZosPhCMzsHWAl0uvtMM7uV6A2lFcBM4ByiweVnAV8herliI9Eg+81Ek3MMAg5KOX4V0deNDyeaxGZ/4CexnaXAW939fDN7jmhij/OJBq13xTq7ypigRbQn6sGKZuXzwHV9/LaWaEarWj4TLwcD98XHnxBvex3RG11pxz8AvBU4msjhHhn//wDRBCdjzOwdwMvxuhDByMGKZmQU0bRwr8TrFwFX1fw+KLH/0fH+PUwE5gNd8foUYFEfx98PHAO8xd1/QzR15HuIXq/tIHrFelps7w3xMbPN7H/MbNe8JyYGFnKwohmZCHyvZv0KonlTe+b6fCWx/zii2/oebgXeyfZe6+FEYYMdjnf3J4kmcNkYb1of2/s90bSMtwHHAs8Bu8f7TAf+AvRrCkTR/sjBimbkRnd/sY/fvkM0q9FcIkc8kmhaupfj3zcRTVF3M7Ag3nYt26eV63W8RV+6eB64M/59MbDa3V8jmkt2Gdun0ns98GeiaRtHAcuLn6IYCGiyF9FS1D5UMrNrgZnuvrI/xwNfBK5y92fM7GSib933Ff8VIhg5WNFSmNlR8eTUmNloYKW7v9qP408D9nf3aWY2mSjeerK7J8MQQuRGDlYIISpCMVghhKgIOVghhKgIOVghhKgIOVghhKgIOVghhKgIOVghhKiI/weOoKeSAo1Z3wAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295861044e0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Normalize by library size:\n",
"# divide the expression counts by the total counts\n",
"# for that individual, then\n",
"# multiply by 1 million to get back to a similar scale\n",
"counts_lib_norm = counts / total_counts * 1000000\n",
"# Notice how we just used broadcasting twice there!\n",
"counts_subset_lib_norm = counts_lib_norm[:, samples_index]\n",
"\n",
"# Box plot of library-size-normalized expression counts by individual\n",
"fig, ax = plt.subplots(figsize=(4.8, 2.4))\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"ax.boxplot(np.log(counts_subset_lib_norm + 1))\n",
"ax.set_xlabel(\"Individuals\")\n",
"ax.set_ylabel(\"Log gene expression counts\")\n",
"reduce_xaxis_labels(ax, 5)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_09.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {},
"outputs": [],
"source": [
"import itertools as it\n",
"from collections import defaultdict\n",
"\n",
"def class_boxplot(data, classes, colors=None, **kwargs):\n",
"    \"\"\"Make a boxplot with boxes colored according to the class they belong to.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    data : list of array-like of float\n",
"        The input data. One boxplot will be generated for each element\n",
"        in `data`.\n",
"    classes : list of string, same length as `data`\n",
"        The class each distribution in `data` belongs to.\n",
"    colors : list of matplotlib color specs, optional\n",
"        Colors to cycle through, one per class. Defaults to the current\n",
"        matplotlib color cycle.\n",
"\n",
"    Other parameters\n",
"    ----------------\n",
"    kwargs : dict\n",
"        Keyword arguments to pass on to `plt.boxplot`.\n",
"    \"\"\"\n",
"    all_classes = sorted(set(classes))\n",
"    if colors is None:\n",
"        # Fall back to the current matplotlib color cycle\n",
"        colors = plt.rcParams['axes.prop_cycle'].by_key()['color']\n",
"    class2color = dict(zip(all_classes, it.cycle(colors)))\n",
"\n",
"    # Map classes to data vectors;\n",
"    # other classes get an empty list at that position for offset\n",
"    class2data = defaultdict(list)\n",
"    for distrib, cls in zip(data, classes):\n",
"        for c in all_classes:\n",
"            class2data[c].append([])\n",
"        class2data[cls][-1] = distrib\n",
"\n",
"    # Then, in turn, draw each boxplot with the appropriate color\n",
"    fig, ax = plt.subplots()\n",
"    lines = []\n",
"    for cls in all_classes:\n",
"        # set the color for all elements of the boxplot\n",
"        for key in ['boxprops', 'whiskerprops', 'flierprops']:\n",
"            kwargs.setdefault(key, {}).update(color=class2color[cls])\n",
"        # draw the boxplot\n",
"        box = ax.boxplot(class2data[cls], **kwargs)\n",
"        lines.append(box['whiskers'][0])\n",
"    ax.legend(lines, all_classes)\n",
"    return ax"
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADQCAYAAABV2umIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAHitJREFUeJzt3Xl8VOW9x/HPD4HALWFHWRRIAhYSK41FQFrLUgQFpagVrNIKXsC22pYCLlcoDYtIVWhVbmmLAi5QXIoIFRRaDJRKFasCAoELglhlaVgMmxDCc/84kzEBAmcmnMzC9/165TWZmXPO85slvzznPOf8HnPOISIi0asU6wBERBKdEqmISDkpkYqIlJMSqYhIOSmRioiUkxKpiEg5xTSRmifVzCyWcYiIlEfloDZsZmOBms65oWZ2FTDbOZd20mI1gIKCgoKgwhAROVfK7PAF0iM1sxbAhSUeGgTsCKItEZFYCySROuc2A3MAzCwT+D/gWBBtiYjEWkUcI70DmF4B7YiIxERFJNLGwBNAppndWAHtiYhUqMAGm4o5534AYGa5zrlXgm5PRKSiWSyrP5lZKqFR+9TU1JjFISLiQ8WO2ouInE/Oi0RqM0bEOgQRSWJnTaRm9qCZLTaziWb2jpn9oSICE5HE98UXX5x1mWPHjnHixIkKiCY4fnqk1wE9gN7OuXZAZrAhSbLTHkLye+ONN2jTpg1jxowpc5lDhw4xePBgsrKy+PTTTyswunPPz6j9ceBveJfGLwWKgg1JRBLZ9u3bGTZsGIsXL6ZJkyZlLjdixAhatGjBtGnTKjC6YJw1kTrnuphZVaAesMc5pyuURKRMr776KoMGDTpjEgVYsmQJmzZtqqCoguXnGOkEYD7wCDDfzB4OPCo5Le0Sx5fc3FxuvfVWADp37kxeXh4FBQV07dqVFi1asGjRIgAGDRpEWloaPXr0oLhAT7NmzVi+fDkAaWlpLFu27LRtDBgwgMaNG9O4cWMGDBgAwLPPPkt6ejp9+vShsLAwvOyaNWtIT0/n0ksvZdasWQDMmjWLZs2a0bt3b5xz/PrXvyY9PZ2bbrqJEydOMGPGDLp27QrAzJkz6dKlCwDbtm2jbdu2ZGVlsWXLFnbv3k3Lli1p2bIljz76KMeOHSM9PR2A5557jocf/jItbNy4kd/97ndkZmby+uuvh98fgGnTppGTk0N+fj67d+8mKyuLvn37hl9H8eutXr06eXl5zJw5k5kzZ7Jnzx7atGkTfp2ZmZlkZ2ezfft2HnroIZo0aUJqamp4mUGDBpGRkcGUKVNKfT79+/dn8eLF5OTkkJubS15eHt26dYv8wz+Jn2OkXfGOj/4A+C7wnXK3KpKknn/+ea666iqWLl3KyJEjAdi8eTOLFi2iXbt2PPPMMwCkpKSwYMECVq9ezZEjR9i3b1+Z25w+fTrTp3tXWTvnGDNmDO+99x716tXjtddeCy+3d+9e2rVrxzvvvMPo0aMpKiri9ttv5+OPP+bo0aOsXr2aESNGsGnTJj744AN2797N559/zsaNG9m/fz8LFy6kuKLlk08+ydixYxk6dCjPPPMMhw8fpl69eqxbt47p06dz4MABatWqxf79+3nvvfdo27ZtOI5Dhw4xbtw4Xn75ZR588MHw44WFhTz66KPhZZo2bcratWsBwv90ioqKeOaZZ2jfvn2p92DSpEkcOXIEgPHjx/P4448zbNgwJk+ezMiRI5k1axa9evVi9erVrF27loKCAj788EMmT54c3sa//vUv8vPz6d69e/ixiRMncvz4cV+f7Zn4OUY6EVhgZhfgHS+dWO5WRQJgwxec8226STec8fmFCxfSqlUrtm/fDsC6devo3r07TZs2JT8/n5IXvHTs2JFXXvEu7rvkkkvIy8vj9ddf5/rrr+fAgQO+4snPz6dmzZrUrl2bdu3a8eGHH9KnT59Sy9SuXZv69euza9cuFi9ezM
SJE9mxYwf79+/no48+okOHDrRt25aGDRtSUFBAt27deOONNygsLAwn0ry8PObNmwfADTd8+R5UrVqVyy+/nI0bN9K+fXvef/99Pvjgg1KDSpUrVyY1NZXMzEx27doVfnzWrFl06tQpvEz16tWpXLkyHTt2ZOPGjYCXYGvVqlXq9RQUFLBp0yYaN24cfo/btWvHzp07w/+YSsrLy2Pp0qVkZ2eHky/Afffdx+LFi8P3t2zZQrVq1Xy972fjJ5G+BwwpcT92l0KJnMHZkl4QevbsyZw5c8K7rgBl1SmvVKkSRUVFHD58mKKiIi6++GIWLVpE//79OXz4sO82/dRBL25r9OjRrF69mh//+McAtGzZks8++4zevXuzb98+Dh48yLXXXssjjzzCrbfeyty5cwGv5zt//nyysrIAb1f/5G1fffXVrFixgsLCQmrWrBl+vk6dOnz++efhZYu9/PLLDB06lBUrVlC7du3wYQ4zC7+mrVu3nnJsdfr06UycOJGJE7/sw53pPXDOMXDgwHDvt1h2djZLliwJv6YpU6Ywe/bs8HtTHn527ccAOcBK4Feh30XkNDIzM1m1ahX//ve/qVu3bqk/+FWrVtGyZctwL+k73/kOqampVK5c2df5lgD169fn888/p6CggHfffTecFEo6ePAgn376KQ0bNsQ5R0pKSvi5w4cPk5KSQkpKClu3buXIkSM0atSIo0eP0rVr13AcrVu3ZsmSJYB3nmexoqIiVq9eTUZGBp07d2bq1Kl07NixVPtXXHEFK1eu5JNPPqFRo0bhxzt27Ejlyl7f7Stf+QpVqlRh586drFmzhtatW7N+/XqOHj16SiI9cuQIPXr0OOU9Luv1t27dmjfffJOioqJSsefk5DB16lT27NkDQIMGDWjduvVZ3nF/zppInXMDnXMDgY3OuTudc3eek5ZFklD//v1Zvnw5nTp14qGHHgo/fs011zB37lwGDhwYTqS9evVi6tSpgLdLW1BQQN++fc+4fTNj1KhRtGnThp07d9KrV69Sz7/22mtkZWXx85//nCpVqnDvvfeSmZnJP/7xD2rWrMmvfvUrMjIyqFSpEm3atAnH8uqrr5Kdnc2hQ4cAGDp0KM899xwZGRm8/fbbAKxevZr09HR69OhB48aNadKkCRdeeCHXXXddqRhuvPFG1qxZQ6dOncjJyQG8Xfk77yydOnJycvjGN77Bf/7zH7p3706/fv1KDVoVGzx4cKl/SCNHjuSee+7hkUce4Re/+MUpy3/ta1+jQ4cOpKenc//994cfr1GjBkOGDAkPQA0ZMuSUdaPmnDvjD/AmsLTE7dKzreP3B0gFXEFBgQsS04cHuv2KoteRmDp16uQ2bNgQeDtvvvmm69evXyDb3rp1q2vfvn2px44dO+bat2/vjh8/HkibcajMXOb3PNKvAqnOuXfPXQoXkUS1detWunXrxi9/+UsuuOCCWIcTc2dNpGb2CvAx8G3gCjOb75zrHXhkIkkiNze3Qtrp3LlzqUGvc6l58+b885//DN9PS0tjy5YtgbSViPwMNjUAXgK+MLMr8a5wEhGRED+nP/0Qb96ltcANwG1+Nlw8HTOwDegLfOqcuyW6MEVE4pefRPqIc+57kWy0xHTMXwCPO+d+a2arzKyKc67wLKuLiCQUP4n0m2ZWahZQd5ZToJxzm81sDtDHOefM7CLgEyVREUlGfhJph/I0YN4JYFOAU0/4EhFJAn5Of/q4nG3cCKw6B9sREYlLFTFnUxfgNjPLDR07FRFJKn7qkW4xs91mtrT4x8+GnXO5zrmhzrmfOue+7pzr7JzbXP6QReJDUVER3//+92natClvv/12uOZlsZycHH7/+9+fUsuz5HPFStY2nTZtGi1btuT2228v1V7xOk899RQPPvggx48f55ZbbiEtLS1cZg+8yzGbN29Ohw7eUbmTa4GWbKthw4bAqfVHAYYNG0azZs0YN24cLVq0oEqVKmRkZPD666/zwx/+kPT0dF
q3bs3atWs5ePAgrVu3Ji0tjTVr1pzLtzkh+OmRfhUYDqwH/og3f5PIee+vf/0rRUVFbN++nSuvvLLM5U6u5VlcNON0Tpw4wRNPPEFeXh7btm075aT3gwcPMmXKFO6//34WL15M1apVWbt2LRMmTKCoqIiioiIuuuiiUy4CKFkL9HSVk06uP7px40beeusttm3bxsiRI9m8eTNNmjRh3bp1XHvttWzfvp2FCxdy0003sXz5cmrUqMGGDRuYMGFCqaR+vvAz2FRyQpX/Bm4Bbg4mHJHoBTGDgBv4WJnPrV27Nlz5qLhcXM+ePcnKyuJPf/rTKcuXrOUJMHbsWJ5++mlmzJgRXmbPnj1s2bKFrKwsDhw4QH5+PhkZGeHnf/Ob3zBy5Ehq1aoVrstZo0YNGjRowI4dO6hZs+Yp9TyhdC3QBg0a8PHHpYcsTq4/unbtWjp06FCqxN3JrrnmGvbv38+mTZvYtm0bt956Kzt27AhX2T+f+BlsGlgRgYiU15mSXhBOnDhxSpJZuHAh06ZN4/nnnz/tOsW1PAFGjx5NkyZNGD9+PD/60Y8Ar4hQVlYWq1atOu362dnZLF68mJ/85CfAqb3Ljz766LRzJZWsBZqZmUlaWhqtW7dm79694XZL1h998cUXz1r3dMmSJSxbtozHHnuMCy64gDvuuIO0tDTmzJlzxvWSUWDHSEWSXatWrXjrrbcASlXCr169eqm5lIqVrOVZ1rLFle137txZqpZmseuvv57jx4+zZMmScF3OQ4cOsXv3bho2bMiLL7542h5hyVqgALNnz2bDhg3UrVsXOLX+aKtWrVi5cmXJSm2nlZKSwtGjRykqKjpn1eYTkY6RikTp2muvpbCwkBYtWrBs2TIuvPBCevTowZIlS+jXr1+pZU+u5Vm3bl3Gjh3L3XffzdChQ8PLVapUifHjx9O+ffsyd5HHjBnDhAkT6N69OwcPHuSyyy7jgQceYNmyZSxZsiTcuy12ulqgJzu5/ujll19OdnY2GRkZzJ49+5TlmzVrRvfu3Rk3bhxDhgxh8ODBTJo0icGDB3PxxRf7fQuThp3pvw2Amc3Am17EgIuBAufcOTlGamapQEFBQQGpqannYpOnb2fGiArf7QuCXkdiKj5+WLJ6kiSkMo91+Blsmu+ceyW8JbN+Z1pYROR842fX/n4zSwEws2rAsGBDEkkuJ9fylOTjp0c6FngllEwLgVMnVREROY/5SaTNnHM9i++Y2c8CjEdEJOH42bUfYGbNzNMc+EGwIYmIJBY/PdIf481l3wjYDfw0yIBERBKNn0S6A/gnUMU5NyU0o6iIiIT42bV/GdjJl3M1TQkuHBGRxOMnke4BLgNqm9m9wKFgQxIRSSx+du1vxJvTfgveMdLJfjZcYhbRPwLPAbuAXu5sl1KJiCQYP4m0K9619lWBIrx57l840wonzSJ6KzAa+B7wdeD9csQrIhJ3/OzajwFuds51BXrjJdUzClXCL66l1RjvGOsO4NT6XiIiCc5Pj7QWsCBUm9CAFsWl9ELJNRLarReRpOOnsHNWOdv4DGiIdx7qZ+XclkjM5R88yoxVnzDwykuoXyMl1uGc9+Lh8zhrIj2pkLMBLsKe6BzgWbyBqtWRhScSf2as+oT7/rIBgHu7aGLcWIuHz8PPrn00u/A453KB3NDdtpGuLxKvemdeRO6WPfTOvCjWoQjx8Xn4GWzKLjnNiKYakfPd/PW7WLhhN/PX74p1KEJ8fB5+eqTZzrltxXfMrFVw4YjEv4FXXlLqVmIrHj4PPz3Sk6dD/EMQgYgkivo1Uri3SwsNNEmYnx7pUjN7A+9c0MbAymBDEhHxL1EGm36NVxm/HrAPSAs0IhGRCCTKrv0c4E68S0SfBu4LNCKROJd/8CiPvrmZ/INHYx1KudnwBbEOodzi4VCLn0T6GHA58DawHJgRaEQicW7Kiq3c95cNTFmxNdahSJzwk0g741Vumop3hVLnAOMRiX920q2c9/wcI30cuA
mvmtOnwLxAIxKJc/d8M42vVK2s05/iiA1fgJt0Q8za99MjfQk4DPTFKzoyN9CIROJcPByTk/jip0e62zk3x8wudM49b2a9A49KRCSBnLVH6py7PXT7ROi2b9BBicS7ZBjtlnPHz669iIicgZ8yellAL6Ba6KF859zvAo1KRCSB+OmRzgLW4k0zsgwYEGRAIiKJxk8iXe6cWwR8BNwMHAk2JBGRxOJnqpGfhX7tj3eF0/pIGwkdHngK7xTmns65vZFuQ0QkXvk5Rvo/QBfgPbypmd8H7oqwna7Aw8C3gSuAv0a4vohI3PJzHmlPvAS4zjmXaWYromhnAZATai+a9UVE4pafY6SFwN8AC00zcjyKdprg1TOtCtSPYn0Rkbjlp0c6NjSRXXn0Bebj7db3wCvHd16rO+uX7DsW+bidzRjhe9k6Vauz9/ZxEbchIpHxk0gnm9nQkg8455ZH2M5c4EngC6BPhOsmpX3HjuAGPhZoG5EkXRGJnp9EWgevdF5x0TCHV5fUN+fcMrwRfxGRpOMnkf7ROfdw8R0zqxVgPCIiCcfPYFOPk+6/EkQgIiKJyk+PdLOZzQQ+BpoB/w40orPQIE180ech4u/KpkFm1gxoBPzHObcl+LDKpkGa+KLPQ8THrr2ZjQb+CDzlnNtiZuODD0tEJHH4OUZ6Hd7VTbvNrApwTbAhiYgkFj+J9NfAQqAe8GdAB6tERErwM9j0HjAE7/xRC92KiEiIn0Q6JnT7Hb6s2nRnMOGIiCQeP4k0B0gFWjrnlEBFRE7iN5EeAe4LNhQRkcTkJ5Hm4h0XbWFmLQCcc88GGZSISCLxM2qfAYwCLg04FhGRhOQnkf4NGAZkAv2Aw4FGJCKSYPzs2jfH27V/NXT/vwKL5jyyLncS63MnBdsGQMCXb4qIv0Ra8vxRO8uyZTKzXsAAYIhzbl+020kWWZ2HV8g16jrpVyR4fnbtH+DLk/CjSqZmVgO4yzl3i5KoiCQbP4m0MXA7cD3QEG/upUh9C8g0s6Vm1jCK9UVE4pafMnq1zawyUBu4Cq+wc+cI26kLTMGbs6kf8HiE64uIxK2zJtLQ1CLfxeuZ7gB+EEU7/wHSgP1AtSjWFxGJW3527efiHRddFbofzcn4fwe+jXeN/rwo1hcRiVt+Ru0NOAEUhW4jHmxyzn3BqXM/SRLQaVwi/hLpzcBNQEdgJ5qXXkrQaVwi/hLpWGC8c26XmV0HvAh0DzassqkHJCLxxk8i/T3wWzOrDfwL+F6wIZ2ZekAiEm/8JNIpodu6eCPu84CugUUkIpJg/JxH2qUiAhERSVR+Tn8SEZEz8DOvfSMzu8vM7gnd/2rwYYmIJA4/PdKXgV3AbaH7U86wrIjIecdPIt0DZAG1zexe4FCwIYmIJBY/o/Y34l3e+RFezzTYkzhFRBKMn0T6FF/WIzWgP5rXXkQkzE8ifQ64C9gIzMabmllEREL8HCO9GlgPXA4sB+4NNCIRkQTjp0c6A2+XfmYE64iInDf89EgfA2oC24GewPRAIxIRSTB+EundwBBgBVAI6JJREZES/CTSF4HLgGN4k+D9NZqGzKyZmR2NZl0RkXjm53jnk865ucV3zKxflG0NBf4vynVFROKWnx7pfWaWAmBm1YBhkTZiZnVDv+ZHuq6ISLzz0yMdB7xiZlWB48DEKNq5De981Owo1hURn+qOep19RwojWseGL4ho+TrVq7B3/LURrZPs/CTSpXgJtLpzbl6oUn6k0oFvAZlmNsQ598cotiEiZ7HvSCFu0g2BthFp4o1Gov1D8JNI/wLMxxtomkcUczY554YBmFmukqjEo0T7w012ifYPwe/J9flAZTO7Bbgg2sacc52jXVckSIn2hyvxxc9g0/eAqsBLQCpeNSgREQnxk0h/BjQFqgCX4J3GJCIiIX527XNDt4/gFSyxwKIREUlAfhJpZ7xd+l3OueXBhiMiknj89kiPAO8HG4qISGLyk0hHERqpNzMDnHOua6BRiYgkED
+J9G7gv4EDwAzn3KfBhiQiklj8jNr/D3Ah0BHYYmZ/CTYkEZHEctYeqXNuYEUEIiKSqM6aSM3scbxjpKlAdaCac6530IGJiCQKP7v2a/GubNqLd539bYFGJCKSYPwk0nnASqAa0AdoHGhEIiIJxs+o/XvAHmAN3pxNDwB3BhmUiERnXX5v1t8RcBsAuGAbSTB+BpuaAphZJaAGkBJ0UCISnaz68yukipXSaGm+yuiZ2Ri8058c8Afgz0EGJSLnt0TrWfsZtf8RsN059yszqwLMNLMPnHNb/DYSmjDvHqAI6O6cOxZ1xCKS9BKtZ11mIjWz8UAT4Apgm5l9M/TUpcAUM9vhnPN7rPRF59wLZvYyXkm+zeUJOlnYjBGBbr9O1eqBbl9EPGUmUufcKAAz+wbwMN5gUx0gDfiec+6Q30acc87MquOdi7q1XBEnCTfwsYjXsRkjolpPRILlZ7DpX2Z2O3A1kAeMd85FNrmNZzIwyjlXFMW6IoFKtGNyEl/8HCOdwZefvgF9ifD0JzO7Aq9juiriCE+3Pe0SyzmWaMfkJL74GbV/DrgL2AjMxqtNGqkuQFczy8Xrla6IYhuAdolFJP74SaRXA+uBbOBHeJeJ3hNJI865ScCkiKOThKA9BDnf+UmkM/F27Weg+ZrkJNpDEPGXSC8FhuMVLikCngY+DjIoEZFE4ieRjgGucc4dCp3CtAyYE2xYIiKJw08irQUs8KZrwoAWZrYUQHM3iYj4O480qyICERFJVH7OI+2PV8y5Wuihz5xz/QONSkQkgfgp7DwcuBmoFNqVbx5oRCIiCcZPIp3inDsC5JrZSuDDgGMSEUkofgabDMA5lwPkBBmMiEgi8pNI7zazTSUfcM4tDygeEZGE4yeRzgM6h343vKuclEhFREL8HCN9HO9KpkJgG/BEkAGJiCQaP4m0eH6md0K3rwQUi4hIQvKTSA04gXed/QlUuEREpBQ/ifRmvIIlVwFVgD6BRiQikmD8DDb9lC97oZcAPwPGRtKImV2EN2h1HOjpnDsQyfoi4p8NXxDo9utUrxLo9hORn0TaBxha4n40u/a9gOeB+kA3dJxV4lAyJKBIp0ux4QsCn2IlWon0efhJpNXxTsQ3YAPRVbpvHFq3EG+KZ5G4kkwJKBkk2udhzvmfjsvMLgcmO+e6RdSI2Si8RFoPqOKc+9/Q46lAQUFBAampqZFsUkSkopW5N+5nsCnMObcm0iQa8hnQEGgU+l1EJGn42bU/F17jy8GmyRXUpohIhaiQROqc24V3+pSISNKJaNdeREROVVG79md04IBOKxWR+FazZs1U4KA7zQh9RKP255qZafBJRBJJzdNdUBTrRGp4o/kHYxaEiIh/8dcjFRFJBhpsEhEpJyVSEZFyUiIVESmnpE+kZjbWzH4b6zjKw8z6mdnfzSzXzKrGOp5omdktZrbMzGbGOpby0vcqfsTD9yqpE6mZtQAujHUc58CLzrmrgXygaayDiZZz7iW8iRTbmVnCFrXU9yq+xMP3KqkTqXNuMzAn1nGUl3POmVl1IBXYGut4omVmFwB5wLvOucJYxxMtfa/iSzx8r5I6kSaZycAo51xRrAOJVij2TKCWmaXEOh4B9L06J5RIE4CZXYHXgVgV61jKK/SlrwxcHOtYznf6Xp07SqSJoQvQNTQo8K1YBxMtMxtqZn8HjgAfxToe0ffqnMWgK5tERMpHPVIRkXJSIhURKSclUhGRclIiFREpJyVSEZFyUiKVpGZmdczsLjOrFutYJHkpkUrSMrOawCtAAXAixuFIEtN5pJK0zOweYIdz7s+xjkWSm3qkUuHMbEDop7OZ5ZhZXTN7M/RTx8y6m9mC0LL/MLNvmFlG6PdXQ4/PMrPlZjaqxHZLbQfv+uvhZvaCmVU2swlm9paZTQ0tP8jMVpvZD82sbaik3Auh53JPun3SzNaY2dcq8r2SxKBEKvHgJmAm8CzwXaAO0NTMmgN1Q/fvAO4HtplZK6CJc+7beJ
c4ppSxnerAAGALcDXQGLgNqGlmzZ1zTwFXAbc5594Fvk0Z12o7534KPAD0OncvW5KFEqnEygNAcWHki4BPQj+N8cq6/Q3IARYCtfBmm30U6ATULLGdHUC9MrbzRejxj4BGJdbZBjQ2sxxgEVA8EPVOaN1SzCzFzF4CxpdYViRMiVRiZSIwNPT7Lrye4MV4ibEGkAv0BFbi9Sx3A0Odc193zr1TYjuNgH1lbGcL0BJoEHqu2CWh7V3nnOtU/KBz7kog9TTFgdsAO4Fh0b9cSWaVYx2ACDAXb3Qd4EZgEN5I+/V4PctqwFPAbDPbC9wAEKr4875z7kgZ26kGvBTa1iTgB8ALwL+dc5tDx0f/AWw3s6uBsXiDU4VmdpmZzQMuAw7jJdOr8A4diJSiUXtJSGaW65zrHOE6M4Ec59y2CNbJAXKdc7mRtCXnF/VIRc7sZbzDACJlUo9URKScNNgkIlJOSqQiIuWkRCoiUk5KpCIi5aREKiJSTv8P+o3hwRFL77cAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f2bd6780>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts_3 = list(np.log(counts.T[:3] + 1))\n",
"log_ncounts_3 = list(np.log(counts_lib_norm.T[:3] + 1))\n",
"ax = class_boxplot(log_counts_3 + log_ncounts_3,\n",
"                   ['raw counts'] * 3 + ['normalized by library size'] * 3,\n",
"                   labels=[1, 2, 3, 1, 2, 3])\n",
"ax.set_xlabel('sample number')\n",
"ax.set_ylabel('log gene expression counts')\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_10.png', dpi=600)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Normalization between genes"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [],
"source": [
"def binned_boxplot(x, y, *,  # keyword-only arguments: Python 3 only (see the tip in the book)\n",
"                   xlabel='gene length (log scale)',\n",
"                   ylabel='average log counts'):\n",
"    \"\"\"Plot the distribution of `y` as a function of `x`, using many boxplots.\n",
"\n",
"    Note: all inputs are expected to be log-scaled.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    x: 1D array of float\n",
"        Independent variable values.\n",
"    y: 1D array of float\n",
"        Dependent variable values.\n",
"    \"\"\"\n",
"    # Define bins of `x` depending on the density of observations\n",
"    x_hist, x_bins = np.histogram(x, bins='auto')\n",
"\n",
"    # Use `np.digitize` to number the bins.\n",
"    # Discard the last bin edge because it breaks the right-open assumption\n",
"    # of `digitize`. The maximum observation correctly goes into the last bin.\n",
"    x_bin_idxs = np.digitize(x, x_bins[:-1])\n",
"\n",
"    # Use those indices to create a list of arrays, each containing the `y`\n",
"    # values that correspond to the `x` values in that bin.\n",
"    # This is the input format expected by `plt.boxplot`.\n",
"    # `np.digitize` numbers the bins starting from 1, so iterate over 1..max.\n",
"    binned_y = [y[x_bin_idxs == i]\n",
"                for i in range(1, np.max(x_bin_idxs) + 1)]\n",
"    fig, ax = plt.subplots(figsize=(4.8, 1.3))\n",
"\n",
"    # Make the x-axis labels using the bin centers\n",
"    x_bin_centers = (x_bins[1:] + x_bins[:-1]) / 2\n",
"    x_ticklabels = np.round(np.exp(x_bin_centers)).astype(int)\n",
"\n",
"    # Make the boxplot\n",
"    ax.boxplot(binned_y, labels=x_ticklabels)\n",
"\n",
"    # Show only every 10th label to prevent crowding on the x-axis\n",
"    reduce_xaxis_labels(ax, 10)\n",
"\n",
"    # Adjust the axis names\n",
"    ax.set_xlabel(xlabel)\n",
"    ax.set_ylabel(ylabel)"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABfCAYAAAC6PE+FAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAGotJREFUeJztnXmUVPWVxz9XNlllE1RExBiXxiRoghuIKCYSFAwa476gSZwo5jAwEHNiMmqYRDESJ+BMYqKA5qiDawCBGJHWCCpERYVGI9EgYthBVpsG7/zxe6/7V9Wvql6tXd3czzl1qt72295737q/7f5EVTEMwzBy54CGToBhGEZjx4TUMAwjT0xIDcMw8sSE1DAMI09MSA3DMPLEhNQwDCNPTEgNwzDyxITUMAwjTzIKqYjcJyIvicgmEVkgIstKkTDDMIzGQhyL9ERVHQhUARcBnxU3SYZhGI2LOEJ6c/A9EXgImFK85BiGYTQ+msc4Z42IDAFeBY4ElhY1RYZhGI2MOBbpk0Af4B1gM/BAUVNkGIbRyIgjpK2BjcBWnAUbx4o1DMPYb5BMbvREZDjQydu1RVVn5hSZiADtgB1q/vsMw2gixLEuL1TVawsUXztg27Zt2woUnGEYRsmQVAfiVO0vEJEXgs8CEXmhgAkzjIKxcUc1dy9YycYd1QU5zzDiEsciXaqqZxc9JYaRJ1OXrGb87BUAjDvr6IRj763bzphZVUwaVsHMqnUpzzOMXIgjpM/5GyLyQ1X9TZHSYxg5M7Jfz4RvnzGzqpizYj0A0y/tm/I8w8iFOJ1NrwGXAKuAXsDjqtovp8hE2hO0kbZv3z6XIIz9jI07qpm6ZDUj+/Wka7tWOYfjW6THdrdnz8iJvNpIfwD8JzAXuIO6mU6GUXTC6vrUJavzCufY7u159runZC2iMnZWrH25YG21TYc4Qvov3Kym2ap6NbCluEkyjDqGV3Rn6PHdGF7RvSjC44tioQQyLlF/EsUUVxPu4hFHSJ8A1gKXB9s2174J09AvW7KYzaxax5wV65lZtY4pCz9k/OwVTFn4YclFLypt+TKyX08mnn98QlttoSzwKIoZ9v5OHCHdBJwAdBSRccDO4ibJaEhyedmKKb4JYhM256dp1k9XFc9GCEsh1F3btWLcWUcntP1GiatPPmWdKWwjd+II6QhgEfBz4HXgwqKmyGhQol62TC9vPpZOurBl7KwEsRk1oDcTzz+eUQN6Zx1PKclHhKPE1Q/TL+ts40kVtpE/cYT0D8DVwNeBq4Jto4kS9bJlasuLa+lEiaZfXc8lbZBbh1ChLM58w8n2er/N2Cgf4gjpbcHn6OD79uIlx2gIMlmcUS+vL66ZBC4Mf8rLH9a3XHOsrpcj6dKZTadWuuN+m7FRPsQV0nuAf6jqKlVdVdwkGaUmk1UY9fJGDX5PJci1M46EepZrttX1xiKqIXHTG9eqLmQ7Zy5l2djKv1TEEdL/BG5U1ZHFTozRQKSwCsOXJmoIUkimJgCoe/lH9e9da7n64Vi7XXy6tmvF+Nkrsi6vbAXdBDM74kwRnQmoiGzBjexXm3vfNAhnDV12Yg/atmqe0sp59M01zFmxnn6HHwTA7X95n53Ve+udl26KJsDKDTu45rGl9ea7j5+9Ar1nWCGyZBgNQhyL9CzgMeBN4Mcmok2H0IJ8dOma2n1R1fNdNfvqvsNJckmT5ZJ72H0mLljJ+NkruOKRN5mzYj1jZlVlFF0jO6IsyXyaFZoCVdcIVdeknNVZUOJYpJOC75bAwyLynqqeX8Q0GSVieEV3Kv+xiV3V+7j9ufdr94eWYkibFs1qvy/r24Mlqz/lsr49Eq5Jx+KPtgLQoVUzjuvWlp8OrhNbq9IbxaJieul8x2cU0lzbRkVkLfAucKmqrs0lDKPw+E5Awk6kfj0PqteBMbJfz1pBHTWgd23Vf+qS1cxZsZ5BX+gSGb6MnVWvmn5yz4
68+MFm2rRsztJVW5n39w2c2jv6eqNpEVqEpRS1hiCjkIpIFW6KKMRsIxWRZsBc66AqP3yfnX712rcMk310hlX25GuSLddUXH/KESxfv4ND2rVk0aqt7Krel3c+jNIR9ecY93gmAc0UdmMhThtpNfATYBRwTsw20k5AHxEZkU/ijMKQavC8L5B+u2hym1m+U0AfWPwRc1as5401nwJ1ba5G46WptqvmSpw20meA84COOHF8UlXTOi5R1Y0i0h+YKyJv2NjThiXZc3yyxTll4Yfc/tz77Nyzl9vOPS7t9Tv37K09Ny4LVm4E4O8bdgDUCqpRepqKBVhuxGkjTZjJJCL/ESdgVa0RkU04ATYhbUAy9ZCHVe1UVe6wU2p4RXcmv+wG7W/csSd2/Ku37gZgz15Xzdu7r2m3lxnZ0fnWeWyeMKShk5EXGav2InKjiFSKyPPBwnefxbjm2yKyENgDvF2AdBp54A9L8qvp4e/Nu5wopqpyP7rUjSN9dOka3tvgnH+F33HYXl0DQKCjbNkdX4SN8qHzrfNS7ks+lk3Vf8vumvwSVgbEaSO9RlUHqeo5wGCcA5O0qOoTqtpfVa+y9evLC3/20cQX3PjO5993Ve+U4ujNfBo7sDcHt23B2IHxPTCJJD5mH2zenUvSjQjSiVshzvGFMkrwtuyuQe8ZlrMYLt84nOUbh+d0bTkRp430YRF5EagJzn+ouEkyiolfzb9o+t8AOLhdS/oc2oFJwyoir7nsxB4s+fhTLjuxB2NmVbFhZw2TF8VvrendqQ3L1u2o3f58P/hrjaqu+mIVR9ziXLNld029/f4+XwDTiWLyNeE+AL1nWNE6l/p0neniKEropSOORfqMqp6pqueo6iBgZaYLjPLFr+Z/6RC3ftEpR3RKu56RX7WfNKyCocd3Sym6UbRukfiYHVCaySZFJZNFl0rgfAsuSsxS7Ut1TXhO1L5kSzGT9Rh1fPnG4VRdIyWxGhvzSIA4QvoLEblTRM4QkT8DpxY7UUbuRA1VSjV8qWu7lgnfqXh99daE72zZstv18IcPW8XBbXIKp1REiWSUtRan2lsO5FN97tN1JhXTlT5dZ8YOJ1W7aSo6tW6RU9rKiThV+/k48XwSmAisL2qKjLxIHuqUah9QO91zyDEHc/eClSmXPJ737oba7+3Vb/PiB5sjnZak4p9bdgHwebC9bP2urPMVh2zbAqNe+FTVXqhfxQ0tuFJaUnVCFl0Z9oWu6hpYTl312VmXifv8a/x9UURVw/0wwyPZlktj77GHeEIKbhXRV4uZEKMwRA118ocv+YRTRIHa7+QxpgB3nXcc4559l7vOO46HXv8YgK1Z9Lz/z4gTuOnp5dQUuHE0kwBCfaswShRL0RYYkknM6tC0oqgprvHDTM5Pn64za/f5gpxJaNPhh9nY2znzIc440ukAInK3qo4rfpKMfPBnK4WEgjnoC10Y57WDhgJ78+m9arej2Fa9j8/VfYfV9PA7Fb61d9fQYwFo3xK274FD2kY3JfjXpOuoCUklgNlWLQtNlIXn70snZtTuIa0oJocfXhOXKOvST1scClG+TWVyQJw20pDongij7EnlVT0U2MmLVqVcvqLzrfMSXOcNC8R2WIY1g/yOixufWkbN58r2wIjdWp25syNVR00c8h2Skwt++2GfrjPrtSmG+3yi9pULcdpDy6UNuByI47TkCVX9tqr+WykSZJSOUFiHV3Rn0Be6RM582rK7hlH9e9O2pfP+tGnnHj7cspubB/TmvkWrYs1K6dKmBet21tAM2AecekTHeudECWc5kdgWWEdUlTukMQ/tyUfgo9pNmzpx2kj7i8iDwe/Q+9N1RUyTUWBSdTaFdGnbMnJ/yKade2rbWB94zTkg6dOtHRA9zCd5e91OJ4rhvKnX13xa9sIZElUl9yl1Z1M2yNhZdGrdouhlG8YTkqrdtCm71IsjpKeS+LfSBEYB7l/4HVC+P9IogY1qXxwzq6q2M2rPXtf3/uYn22
qPp+rQCWnVDPxp/M3kgAYVzijrMrrjp37bZWMhFLLNE4YUVej9eDKRTkDTtY83BuK0kfYF7gemAr8DTixqioyC4w/C96eIRi2zHNW+6A/CnzLiBIYe340pI06IHf8BSe/P1s/iD50qBv7YSL89Mxx4Xs5tl7l0zvjXJFuP6Sjl+M6GaNcuJHGE9BbgAlUdDHwL+HFxk2QUE7/jKZyx9MDij9L6Gz22e/vamU/+77jsTarENM+mizNP/E6TdB0ovrg2FJ1at6gnlKGYpRPAVNekOi+uxReel434huRyTWMmziP9IvCMiEzD+SatLGaCjMIRzmh6b932aKEMLMU312yLXEY5FdlWFdu3SnzMWpZQSH3rslwszVQiEyVwmycMSSuAccNJpljim+s1haSUi96FxBlHeouIHAh0BjarakY3ekZ++O2Y+SwOF1bjK/+xqbaNc2f13trllMO1mIZXdGdm1bqslg/JhgNbNIfddY2kB7c9kFWfFuYxitubXmo6tW4R2T6Z3KaYb/tlocQqTGsuHVTlNha0ITqzsl6zSURsXfsik6mXPS5Rw5smLnA+Z3bV7EsYvD8ui6p6tvTq1JpPttVZwx9vy19Ey703Pa71WE6UqoOqKVKsNZuMPEg1gD4OUQ5KurRtyfjZK5x16/kWzYZ8Xqypl/R16WjTPPjOTlCi2jbLrZoexyprjL3RqWgqfkQLRSw3esBQ4EbgeREZVdwkGX4ve0iUZ/uoziG/V97/HRJ6wQ+/4wikPzQll2mBYcfU7OtOBuBPI/tFnpfKZVtDi2ZUh48vnk1JINPh/1n06TqTAT3nFizsUrrrKwZFW7PJKCx+dR9IWfWPclrit32mWiok3aJouXg58tsuO9/qXrhwLftUa9qnmofuh5m8r5Ckah/02zPDau/+Ip7pSFcG2babhn+UnVq3YHNeqWoY4np/qkVVf5XpHBHpjrNk9wJDVXV7Dmnb7/E7nXyB9GcaJZPstMT/3fnWebxyc3+Om1jJlBEnxFpuIh1xnXOQJEzJ8USFU6wpl6FQ+qLpdwwVqhPIyI5y67DKlqyFVESOUdW/ZzjtPOCPQFfgHODpHNJWtrz64SZGzniLqd/5Cu/8axs3Pb2c+0b0ocOBzbl+xtv8engFW6v3pe11Dy3AdD30vhX6y/kr2bK7hl/OX8lnNfvYvfdz+vU8qHYO/DF3LgCclRAO/RjQc26tUICzLE+bvBCA0yYv5OXV3wxiinbZ5lN/X6L7teR96diyuyblNamswnT7oq4x67JpUe7TS7MWUuB0IJOQHgaswK3z1CPVSZkKJ91YsIrpGnl93GviEnXNyIr5PFk1GG6D/sBScHO+gMVA36dnsXT9MNZPq/OCnRzOcpw4gfvXmcr7nDfti4nx3PeZ2zfNnVOPR9zX+mnwcu1OJ0ydWrfwhDKa8LzN3u9QeHxrLWpfWP3q1LpFQjj+eaGYRQl21DXprMJM+/zfhR5i1FjJZOVFHQ/3+ceifpfagixXAQ2RTIt8isjApF0jgV7AaFWNXGpZRG7FCWkXoIWq3hfsbw9sW7NmDe3bN16vfIv/uYkfPL2M/x1xAlVrtzNm9gomnX887Q9szk1PLePOocexdc8+rjrpcLpkGAe6aUc1D7/xcaxzC3mtYRjZ0aFDhw7AjqiVkeMI6QfAtHATtzzzURmuuQ5ojavav62qTwf7DwU+yTYDhmEYZUKHqD6fOFV7wbXvbwOW4do8M/EsdZ1Nk7z9a3HV/h1RFxmGYZQ5kdoVxyLtBTTDTRHtC9wBzAPujNHpZBiG0eSJI6QCnIGzJNfj1rUXYK2qRrsLMgzD2I+II6QzgPeAVUBPoL+qxqneG4Zh7BfEmSJ6JPAX3JCnF4AuIjIwoje/QRGR80TkcRHpJCKniciHBQ7/DhG5V0RGi8giEXk82P8tEXlTRCYXII5LROSvIlIpIseIyD9FpDI4drmILBGRR/KNJwhvbRBPDxGZKiKjg/0FyY9XXhUi8r
qIzAlqN4jIZV6+porIWyLy7znE4ZfXl/14ovIhIj8O3EHmkp9eIlIdcf/PEpE3RGRqsH2AiPxaRMbnEEcfEXlFRF4Vkc7Jz3FSuY0K4v1RlnHcISL3Br+vF5E/Br8TyktEZgflukNE2opIbxFZJSL1F9yqH8fFIvJiWNZJ6b4ySPeEYPuuYPuKYPukbN5d770fISKLReQvItIiIt5xIvI3EZkYbMfOTxziCOmDwFnBZxCuEyn8XRaISDvgBlW9WFW3AN8F/lXA8I8GugWb/62qpwNHBjfsOuBCoEKcu8F8mKGqZwAbcc0nU1V1UHDsImAwcHT4oOSKiDQD5gZhH0Fip2Pe+Ukqr0uBnwHrgL6BmPoDXK/DDce9KIeo/PK6xY8nOR/BM5LPn/9o4H3q3/+RwFVALxHpAlwJvKWqE3OI42zgl7hhwSfhPccpyu1UIPbkdP++iMiRuNrllV54teWlqufj/GvMUNWdOH8bW+PEo6qP4/Th5KCM/HR/DzgTOD/YHgqcBlwfbA8GYrkH8997YACuvD7F3Yvk8vqGqn4N+IYXb6z8xCGOkB6qqreHH6BV8PuOQiWiAAzAPQAviMjpuAd+T6ECV9WVwGPBbxU3BXa1qtYAnwOtgO1ApzzjURFpjVv6uh0wSETODA5PA34BvBTEmw+dgD4iMkJVXwHme8fyzo9fXri29bU4QeiBe7jneOcq7qV7Nod4/PKqToonOR9XA9NzyY+IdA5+boy4/2H+1gKH4l7QG0LLJ0tm4cTsENwwQf85Tig33FqCrchiUk3SfRmCE7rZQRlG3fdrccsMEYwF3xInnuCP+l3gb8DXk9IdDh/aFQjhAbjRPW2DeO7G3cs41L73uNmTo4BtQT6Ty+sZEZmEm3GZVX7iEEdIzxGRQYEpPAj3r1ludAamADNwL+SD6U/PneCfbgoQVkXvBX6P65ArhL+FScCtqvo67l97UmAZHotrqz4seFBzRlU34qzAm8SNyvApdH4SogYuAJ4Kd4jIwcD3gYw+HFIwCbiVukVKw3iS89EPeDXHOC4HHg7Sm3z/fRT3LH4TOCOHamMPnCC3BMaS+BwnlBtwD+5ZPzjLOEI648ptGc5Ki7rvp6hq1mWmqvuACuAg4OKkdCecCvwBWEDi/YuL/94PxvXjHCRu4k9yeR2Fa57snUM8GYkjpJfhTPFbcCJ6RTESkicbcP9on+H+oX+D+6caUYS4RgBLVHUVgKpW4tayeinfUQwicpILUpcEYe/ETbNthbsP9+MeziPziScIuwbYBHRM2l9JgfIT8AnOwgonY/QCHsLdnwHAT4EJuVjZSeWVEI+fD9zwvaOBycCZInJsllEdBYzHicM4vPvvxdsdJ4Lhs1hN9ivufgfXH7EEJ27+c5xQbqr6GK6aPD9VYBnw3xlJvu8ickhwTk4EYtocOJzE+70tsERbq+pOVf01MIFE6zGXPIzBWdtrgZOp/5ydq6q/BQbm2zQWiao2+g9wIPBn3EtzZLCvssBxDML9a0/GTa+vxL2cg3HViqMKEMdYXJWoEicwr+LG6xI8KO8EcR2QZzzfBhbirCzBVeFGB8cKkh+vvCpwVbw5frrD+wO8HeR3dp7lNdCPJyofuD+gaXnkqTLi/p8FvIFrzwZnbFQC03MI/8ygPBYDh0U9x165XQL8H9Alx/vSHWcJvgB0SC6vIB/3ROS/Y4w4RgN/BZ6gbmRQmO4rg/KaEGyPwq1Q3Nq7fmnMvPjv/Y+C96MSaBtRXpOCsr0v2/zE+WQc/mQYhmGkp4TrORqGYTRNTEgNwzDyxITUMAwjT0xIDcMw8sSE1Cg5ItIt81lGOkSkq4jY+1sm2I0wSkowhi+nGUZZxvNfxY6jgbmKxCmQRgNiQmqUmgtw/myLTcq1wpoIj+DmlhtlgAlpE0ZElgaf24LtSnGehaaJyJEicpuIDPa2p4XnBd+TReRtEfmSF+a0IIzhInKniLwmIkOCacS3JYXzuDiPQod4yTobeD
E4/m6YPnHejhYEn07BvqUi8n0R+a44D1FXB9ctD9JwbnI+guNfBU4SkRvD6q+IrBaRbiLyPanzcDRNRI4Vkee99A/y0vNCkJ624rwMLRaRi4Jy7CZ1HqBq8ynOC9VrIvKeiBzmlVtlWDYRcd0lIqeKyPniPHz9JLjm2qCMtgbbteWgquuAjla9Lw/sJjRt3sHNMvG5NGn7hlQXq+rNuKnByYuYXgo8B3wZ5/zi2hTXX4ybvz3I290L+Dj4vdZL34U4xywP4axWcLOt7lfVP+A8BF0e7N+Ac4YxJjkfQfvrnbh51zXAbcHxzTjfBSfjpoyGXI+byvgZzvlJyLdwc91n4uafjwIGqeqTwfGf4aY2JufzHNwsmldwc+Yz0QvnGOhVVZ2tqv2Ac4NjLYCfEyxUG1EOG3HzzY0GxoS0aZN8f5Pnfp9I4tLaQzxrtFVgcU3ATcVLpnNw/dM4z0HgBPWx4PouIjIXJ0D+9S2AnRHhdQdWB5/D/AOBRT3XD0edB6FQqPx8DMRNE92kqr8HTheR5sBbOBHdSZ1gdqduDZ43gWOAu4PtQ3BTUK8A2uC8nu3ykvVVYFlEPmcEYQyJyGMUwwjmywfW91+pc0TSEc/VW0Q57CGeWBtFxoS0iSIiXwR8p8BdcL4affrj2tpC5mmd/9Ov4CzGMUSzFee8Y5CqhhbSNOos3sHA80CyO7lNQA8R+TJunnzIOpyDi8Op70v2m6p6pr9DRNpQJ+B+PnYRuIETkVBk2gXndgAW4Va4BdfMMBVAVatVdRjOKQm4ZXXuVNWvqepTwN4gzpCpOCs4OZ9tcVZs3Hbg3wPXiUgrnG/TM4OyAOdv9I005dCZArqCM3LHhLTpcj/OU9W9OEvxPuCupHNm4IQniiqcmP4KV4VMILDO3hKRl0TkkojrF+FEdTSJL/tSnOV3L84VXMhTuGr2SOBPSWG9JSILqRPYr+AcVTwQkY/5OKv0S7jRAb+jzmq7g0SLbr6qro5IOziHGzcE7aedgd8CL4pI+EfxIM4j1ytJ+RxOtCejE3D35GyCJoGA6iCd1wdpewVQEbkcWKmq/p9KbTkEbaOtVXV3ivQbJcScljRRRKQytC793w2NiBwB3KGq1+YRRtr8iHPT9gTwQ22iK92KyLnAV1X1Fw2dFsMs0qbM6BS/GxRV/YjcHSzHjWMHbpXbJimiAW1wTpGNMsAsUsMwjDwxi9QwDCNPTEgNwzDyxITUMAwjT0xIDcMw8sSE1DAMI09MSA3DMPLk/wG9s9OoWUWdPAAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x29585e34e48>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts = np.log(counts_lib_norm + 1)\n",
"mean_log_counts = np.mean(log_counts, axis=1)  # average across samples\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_11.png', dpi=600)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Normalizing over samples and genes: RPKM"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [],
"source": [
"# Create variables named after the RPKM formula so the two are easy to compare\n",
"C = counts\n",
"N = counts.sum(axis=0)  # sum each column to get the total reads per sample\n",
"L = gene_lengths  # length of each gene, in the same order as the rows of `C`"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {},
"outputs": [],
"source": [
"# Multiply all counts by 10^9\n",
"#C_tmp = 1e9 * C # 10**9 * C\n",
"#C_tmp = 1e9 * C.astype(float)\n",
"C_tmp = 10**9 * C"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Broadcasting rules"
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"L.shape (20500,)\n"
]
}
],
"source": [
"print('C_tmp.shape', C_tmp.shape)\n",
"print('L.shape', L.shape)"
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"L.shape (20500, 1)\n"
]
}
],
"source": [
"L = L[:, np.newaxis]  # append a dimension of size 1 to L\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('L.shape', L.shape)"
]
},
{
"cell_type": "code",
"execution_count": 100,
"metadata": {},
"outputs": [],
"source": [
"# Divide each row by the length of the corresponding gene (L)\n",
"C_tmp = C_tmp / L"
]
},
{
"cell_type": "code",
"execution_count": 103,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"N.shape (375,)\n"
]
}
],
"source": [
"N = counts.sum(axis=0)  # sum each column to get the total reads per sample\n",
"\n",
"# Check the shapes of C_tmp and N\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('N.shape', N.shape)"
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C_tmp.shape (20500, 375)\n",
"N.shape (1, 375)\n"
]
}
],
"source": [
"# Add an extra leading dimension to N\n",
"N = N[np.newaxis, :]\n",
"print('C_tmp.shape', C_tmp.shape)\n",
"print('N.shape', N.shape)"
},
{
"cell_type": "code",
"execution_count": 105,
"metadata": {},
"outputs": [],
"source": [
"# Divide each column by the total counts for that column (N)\n",
"rpkm_counts = C_tmp / N"
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {},
"outputs": [],
"source": [
"def rpkm(counts, lengths):\n",
"    \"\"\"Calculate reads per kilobase transcript per million reads.\n",
"\n",
"    RPKM = (10^9 * C) / (N * L)\n",
"\n",
"    where:\n",
"    C = number of reads mapped to a gene\n",
"    N = total mapped (aligned) reads in the experiment\n",
"    L = exon length in base pairs for a gene\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    counts: array, shape (N_genes, N_samples)\n",
"        RNAseq (or similar) count data where columns are individual samples\n",
"        and rows are genes.\n",
"    lengths: array, shape (N_genes,)\n",
"        Gene lengths in base pairs, in the same order as the rows in counts.\n",
"\n",
"    Returns\n",
"    -------\n",
"    normed: array, shape (N_genes, N_samples)\n",
"        The RPKM-normalized counts matrix.\n",
"    \"\"\"\n",
"    # Sum each column to get the total reads per sample\n",
"    N = np.sum(counts, axis=0).astype(float)\n",
"    L = lengths\n",
"    C = counts\n",
"\n",
"    normed = 1e9 * C / (N[np.newaxis, :] * L[:, np.newaxis])\n",
"\n",
"    return normed"
]
},
{
"cell_type": "code",
"execution_count": 107,
"metadata": {},
"outputs": [],
"source": [
"counts_rpkm = rpkm(counts, gene_lengths)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Between-gene normalization with RPKM"
]
},
{
"cell_type": "code",
"execution_count": 108,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABfCAYAAAC6PE+FAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAG0FJREFUeJztnXmUFNW9xz8/YMBhk01ACAgGNQxG0YgrKqiJioLikrgrLvhiSA4Pn8S8aIKEl7hE4nviOYkboHkGEZcHiJi4jIqAEg2gDmJUQojCIAPIPgzwe3/cWz3VNdXd1dt093A/5/Spru3eW7eqvvW72++KquJwOByOzGlW6AQ4HA5HqeOE1OFwOLLECanD4XBkiRNSh8PhyBInpA6Hw5ElTkgdDocjS5yQOhwOR5Y4IXU4HI4sSSmkIvKQiLwpIjUi8rqIfNgYCXM4HI5SIYpFeoyqngZUARcDu/KbJIfD4Sgtogjpj+3yXuAJYEr+kuNwOBylR4sIx3whIucAi4E+wNK8psjhcDhKjCgW6bPAAOADYCPwWF5T5HA4HCVGFCEtBzYAmzEWbBQr1uFwOPYbJJUbPREZAXT0bdqkqrNzErmIAG2Bber8+TkcjhIlinV5kapel6f42wJbtmzZkqfgHQ6HI2dIoh1RhPQCEXnNF5Cq6hk5SZbD4XA0AaLUkS5V1TPsb6gTUUexI7fOycuxDkciogjpn/0rIvKTPKXF4WgUnHg6ck0UIb1QRPqIoQ9wdX6T5HCE01gCGBZP1G25iG/Dtlrue/1TNmyrzUn4jvwTRUh/CPwSeAmYSP1IJ4ej4ORLzBojzkRMXbKG8XNXMHXJmtD92Qqts8hzTxQhXYsZ1TRXVa8BNuU3SY5iIeyFS/USN5awZRJPuucUSnBGDerFvef3Z9SgXqH7/ULrRLE4iCKks4B1wBV23Y21349JZS1FIZkY57oIXWzVAVHo0rYVtw3tR5e2rULDSyW0jsYnipDWAEcCHUTkNmB7lIBFZKKIPCAiFSLynojMsx3wHUXMyuqtnPfoO6ys3hq6f0RFN4b178qIim5x25OJhrfPE9ApC1ZlLcbJ4ikkUS3pbCzuLm1bMX7uigZC6ygcUYR0JLAQ+BXwHnBRqhNEpB/Q1a5eBvwCqAYGZpZMR2Mxbk4V81asZ9ycqrjt3ov9p6VfMG/Feh57958p6+mCYuBZswj7lUWVbf1rPj8Q6cZXDB+rYiSKkD4KXAN8F9Ni/2iqE1T1U2CGXe2BqRpYC/TMLJmOfOIvak8eXsGw/l2ZPLwiwbG7Aaj8tIbxc1cwZcGqyPF41uyJvTpQ+VkNNdtNWO7lbBxKpY64FIkipBPsr59d3pVFfG48fREy5W1T1J7y9io6t2nJkG92pnOblqHHrvzK1OxUW0t0R93eyPE8uGAV81as56ZZHzBvxXrGPO8mW8glYdUGUcWwKYpm1bUS++WbKENEJwDtgM9UdXUGcXwJdAcOtv8dRcKGbbVMXbKGHbVWDNVX/E7AlJFHMm5OFX07lvPQwtW0LmseOb6XP/kKgK93GUv0iIPaZJ54R8ngCVnF9Ma1oxozvigW6S+BW1R1VIZxzMBYsd2AZRmG4cgR/mK8J5qtWzXn3vP7M2Zw35Qtwp7FetWxPRnWvyuXHxO9tmbS2YfTpqwZJ/XumPpgR1GRqi41mUVbMV2TilpTsIajWKSzARWRTaThtERVK4FKu3pcpgl05Ba/xemJ5ahBvWItwCurt1L5WU2DVvng+Wcd1oVX/r6BQb0OZMLZ34oU929e+5Ttdft4a1UNAO/803VJdjQNogjpUGA0phX+GVVdnN8kOfJJUDxvG9ovbv/oWct58/ONbKvdE3r+iIpuVH5WQ8cDzKPjNT5Foara1K/utEGv2rgj3eQ7Sgi5dQ56//Ckx3S6Y35suXHSOY2RrLwQpWg/GfgWpp7zSRGZm98kOfKJv7O3V8yfX7
WO/ve8xuJVNdTt3QcQWwbxuj8tXG2sSa/xKQrNA22NG3eEi7Uj93iClWpb2DF+sYt6XlQ27axD7x/Opp11OQuzEKS0SLOoG3UUOV4xvXPrMmp21HH1jKWMHNCdRas3M7hPJxat3tzgHM8CPal3B1Zt3MmUkUdGjm9PoJqsKXfhiCJc/vWw/1HFL9W5ne6YHxMq//6geIWF7R2j9w+P1WVGOS8qH20YQdW18JGJJeNwCk1KIRWRKkw/UHCOnUser5Fp1KBesWL+sn9t5n+XruW0vp0Yf0Y/DmrXilGDenHfG583ON+zQNdvNy/TEd3aRY67RTPYE723VNGQSJyC2/z7wsQmuC1MpLztwW3Bc1KFk6uwISh2HvWil401OaDL7Fh6lGjVAcVIlKJ9LfBzYAxwlhPR0iFsTLt/rLxXzO9xYDkAnVuH9x3141mg6ViiHpp4poack8z689ajbvMXPz3RCNvmkUzsGoOPNozgow0jQrcZUZSE+8MY0GU2FdOVAV1mx36pSMdKlVvn0LG8LPLxxUiUxqYXgPOADsAAEXlWVZ3jkhLA30LvNSp5jUWpWuUT4Vmg6ViiHhd/uztPLV2b9nmpCLMUN+2sS1qMjSp2+RJAT7T8ouQXsjALsKHQaeg5Xph+S9Ifj2cB1p+rced429K9lkysVC8tpdzQBNHqSONGMonIf+QvOY5c4m+h95hdVc28FesZ8s3O3GbFsHWr5rGlX2iTCWq6dLpjPp/cPpSnlq6lmcA+heYJDNR06wgzKcbmi1Ri522LKnb1Z9DAEgxuC57jLzaH4Z2rKbaFEfeRinhOIkqxKB8kSh3pLcD3gT2YqoDn8p0oR24I694UJq6XD+zJkjVfc/nAnnFC65Gqi0qUYtymnXWxvqrd2rRk7bbddA0MQ/Vbk+nW4zU2ieoNU4mdt80jldgVkjCr2aMQH6liJkrR/lpVPQFi89C/g/NJ2qTwi6dfaD2LNNVLk24r7nPXHcdJUxby3HXHJS1+FwthohkmgMXY5uzVP2aSt1HqQhMRVtwv1FDRxiCKkD4pIm8Adfb4J/KbJEc+CY5smrpkTay+NKyTfiZdW1K9tCf27RxbFtKy8b/syYrkxWY1Rk2Lv/6xsdMeVkXQFAXUI1Jjk79xSURKu1Z4P8dvcXqiun33Htq09EYq1XePguIpSucD/8sepUjemGRiSfrPCbaC+8U3Gys1jFx9ZHLZ0b+xidL96dcicreInCoiLwMn5jtRjvzhH9nkOShBiXWJysVUIsWIv3tPsq4+xYDX+BJWH52om1DwnESt4MnCDsOLrzG6KJXyRzuKkL4KHAg8C/wFaNhL21H0JJsnyfPg5HXSb4re6/39H6P2hcwlYSIUFKmw1uvgtnTENRXBsMPC2TjpnLTFt9A0lg9SP1GK9mBmEXXOSgqAv6idzRw9/rrR8XNXoPcPb9Bn1As/2NJfCoS3ojfsW1kowuopPWHKth9lrgTOn55sKWSdciHqYlNapKo6XVWnA0fa/66xqRHJVVE7zNIM6wpVanjF9LDRN/5tjUki67Ips79db5CoFikYL/mORiYbsQs2HAXxLNBSno2y0JZmGMnqJ4ul5T8bctlQ1VSI0iF/lqpeoqr/1hgJcsQT1qk+KsGiey5HKhWCZB3EC4FXt1mI7kWFZOOkc2w9ZOl7bcoVUSzSU0Tkcfvf8/50fR7T5EhB1HrTMGvW39E+G/Lhm9Ij0bDJbMaDZ0tQNAs5RnyxnWHAWyZiZfVWhvXvysrqrRn5RkiGdy86lpexMcEx+1NxP0qr/YnUzyT6S7KbRdSRA6LWm/q7Ovn/pyKZdRUcxpmPIl6qus3GbHX313cWS6v1pU++H7dMxM2zljNvxXpunrU852nIR54Ue7e0ZESxSAcCPwTKgN3AH4BMZhN15IhCNhI1dt1YIaxPP8Uinn427aiNWyZit53lYHeC2Q6KjagOU4qRKBbp7cAFqnomcCHws/wmyZEKf71por6hxYRnaW
QyZUWhrM9iZp9q3DIRg/t0ilsWO6WQ94mIIqRvAC+IyDSMb9LKfCbIEU5Yh/opC1Yxfu4KpixYlXZ4jTkczxPDYHVAIVt+w0SzWKzPRxauilsGaduqLG4Jpj7UvwS44YTeDOvflRtO6B1XX1rMlKqYRvFHeruIHAB0Ajaq6q78J8sRJMxJc8zhfJJBHP6pG/z//Y6PPTKdyTGs+J2Ng+DGsECLRTTDuOX5j2LLm07u22D/XmuJ7vVZpKNtPehoX32oN1HhoF4HsmTN18xbsR4oXbEqZtKes0lE3JxNBSCsXnTMKX1p07JFxnWlUbzGQ+JWdI9sHASHhbO/4rXCt2ym7NkH5WXSYN/iVTXsqDUTX3lLgJ27zf+N22tjlqf/uDvP7MfnNdu55tgenPfoO0weXpHzlvz9GTdnU4kQ1uqeTkt8VMKK/OmMEEq35TXRHELpEOZYI8o49mJj2GNmFLY3S/Wu3ftiAjrsUbPvgqlLmHbZ0bQpa8a0y46Onav2k7Xm613MW7GecXOqaN3SznzQsjnPfbCOj9dv5z9fWhnb71UTRaljj+IHYH8mipC+AAwDbgFeEZEx+U2SIwrJuih5+9Jp1PEEMNt6y3Qbh3IxjDPMsUapONuYX7Uutty0M751vX15C0Y8vgSATbvMvpoddQzs2YHT+3VhYM8OsWNbWzeI3zjwAA5qU8aPTz6EMYP7cu/5/RkzuC9/+3ILAN3atWJY/65MHl4Rq1sP1rF7+VZs4lkIZyRRiTLW/i5VvVNVx6jqUOCARkhXyeM1Dq2s3pqyZd0TvmQemvwEJ3tLJJT+xp1UYhomgJnMRpkrErltK7XZJsMajvzFdE8ovaWfls2a8dWO+A9bx/IWcf1DPWF7+JKjGNa/K63LmvPV9jp+8/pncefdelpfDmpTxi/OOowXbzzBFOsj1LEXExXTtWidQ0exSONQ1d+me46IdBORRSLylog02YqZp99fQ9vbX+Tp99cw4eWVjJ+7glFPL03aed5vNUbtaO+3Gr3pgIMi3OmO+XFiF2ZpJhNKCBfXxnAIksyS9FuapcDoZz+MWwJc+sR7sWWd1YU6hSsHdgegXUvzWh7YuozDOreO2zZiQHe27zZl/807d8fueec2LRnyzc7ssUbtzt17456n+99cFRNY75wxp1iL9ZSGDVqO9EjHaQkAInK4qn6S5mnnAX8EugBnAc+nG28xM79qHdfMWMrW2r3s2rOPG2YuZ2edeaLXba01wjQNGNrwa7pgzbmx/ye/Mo/yFs04b9phVE0z2wb3egmIF5WwFnF/q36w9btjeRkfxeLRBvu9UDqWl4Vu27SzLtSreiJP64m2LVhzbujcR8nGrDcVRx9+ttmGoW2799K6hakTbd0CZi6vBmDrbvPsVG+t5cbje3PfG59z5MHtWbR6M706lPNZzQ5zXO3eBr4U+nYqB6C8ZfO4Bsqvttbyyt83IBDX+6MUXSYWI2kLKXAykK6Q9gBWYOZ96pnooFSTYwXrRyqma2idiXd+on1RwkkUdhi9MZ1rj+8xu4GFt2XMBgZMMXP+hIXnF7uFVuz82xb4BNA7f0CX2bExzrH4ppmvFdPq96sVJiPCaro/UT8HkRJf1xrmjzLMZ2bYWPMo4cit9dflTYcRn8bSx28t6/3D+emcD7m3chXjh9RbfX+68hiumbGUJy4bSIfyMkbNXMbU7x/NB2u38KPnP+LmE3vx9LK1PHHZQI7r3ZGD2rViREU3ZldVM2pQLy4/pifj5lRx55n9eOsfm+J6bZzapyO/evVTJg+viBu4Mf6Mfg3CSZRuR/qIphgdISKnBTaNAg4BxqpqpEG8InIHRkg7A2Wq+pDd3g7Y8sUXX9CuXemW+P/ycTWjZy3n4UuO4rvf6lbo5DgcjjzQvn379sA2DRHNKEL6OaZgCqZa+lpVPTSdBIjI9UA5pmi/XFWft9sPBr5MJyyHw+EoIO1VtcHwsChFe8FUmW0BPsTUcabLi5huVHuAyb
7t6zDF/m0ZhOlwOByNTahWRbFIDwGaY4aIDgQmAvOBuzNodHI4HI4mRxQhFeBUjOW4HvgUY6WuU9XidjvkcDgcjUAUIZ0JrMT4IO0FnKKqmRTvHQ6Ho0kSpUN+H8x89p8ArwGdReS0kNb8okJEzhORZ0Sko4icJCLp+5pLHPZEEXlARMaKyEIRecZuv1BE/iYiD+Ygjh/YAQyVInK4iPxDRCrtvitEZImIPJWDeNbZOHqKyFQRGWu35+RafHlVISLvicg8W8pBRC73XdNUEVkmIv+eQRz+vDrKH0/YdYjIz6xbyHTjOUREakPu+1AReV9Eptr1ZiLyOxEZn0EcA+zglcUi0in47AbybIyN96dpxjFRRB6w/28QkT/a/3F5JSJzbZ5uE5E2ItJXRFaLSIcU4V8qIm94eRxI81U2zZPs+j12/Uq7fmw676rvPR8pIu+KyF9EpCwk3ttE5K8icq9dj3QtUYkipI8DQ+1vCKbRyPtflIhIW+BmVb1UVTcBNwJrcxR2P6CrXf1vVT0Z6GNv3vXARUCFGNeD2TBTVU8FNmCqUqaq6hC772LgTKCf99Bkgog0B16y4fYmvvEx62sJ5NVlwC+AamCgFdNzfYdfD5yCubZ08efV7f54gtdhn41MjYCxwN9peN9HAVcDh4hIZ+AqYJmq3ptBHGcAvwEWAMfie3YT5NmJQORxuv57IiJ9MCXMq3zhxfJKVc/H+NiYqarbMT43NqeKQ1WfwejD8TZ//Gm+CTgdON+uDwNOAm6w62cCkVx1+t9zYDAmr77G3IdgXn1PVY8DvueLN+W1RCWKkB5sx9vfpap3Aa3s/4m5SkQeGIx5GF4TkZMxD//uXASsqp8CM+x/FZFuwBpVrQP2Aa2ArUDHLONRESnHTIPdFhgiIqfb3dOAXwNv2ngzpSMwQERGquoi4FXfvqyvxZ9XmDr2dRhR6Il5yOf5jlXMy/diBvH486o2EE/wOq4Bpqcbh4h4buY3hNx379rWAQdjXtKbPesnTeZgxKw7pmug/9mNyzNgL+baIg+sCdyTczBiN9fmX9g9vw542J77ELApVRz2A/0x8Ffgu4E0e92HdlghbIbpzdPGxnEf5h5GIfaeY0ZLjgG22GsM5tULIjIZM8Iy8rVEJYqQniUiQ6wpPATzxSx2OgFTgJmYF/Px5Idnhv3qTQG84ugDwCOYxrlEkyumw2TgDlV9D/MFn2ytwyMw9dY97EObEaq6AWMF/khM7ww/ub6WuKiBC4DnvA0ichAwGkjbl4NlMnAHRlz88QSvYxCwOIPwrwCetGkN3nc/inn+zgVOzaDo2BMjyC2BW4l/duPyDLgf83wflGYcHp0wefYhxlILu+cnqGpa+aWqe4EK4EDg0kCa4w4FHgVeJ/6+RcX/np+Jacc5UMxAn2BeHYqpnsyLY4EoQno5xhS/HSOiV+YjITnmK8wXbhfma/0/mC/XyBzHMxJYoqqrAVS1EjOv1ZvZ9mgQkWNNkLrEhr0dM8S2FeaePIx5UPtkE4+1qGqADoHtleToWixfYqwsbxDGIcATmPsyGLgTmJSJhR3Iq7h4/NeB6cbXD3gQOF1EjkgjmkOB8RiBuA3ffffF2Q0jgt7zV0v6vpW+j2mTWIIRN/+zG5dnqjoDU1R+NVFgKfC/JxK85yLS3R6TNlZMWwDfIP4+b7GWaLmqblfV3wGTiLceM0n/OIylvQ44nobP19mq+nvgtGyqwxKiqk3uh3H19zLm5eljt1XmMPwhmK/3g8BSzFD7fpiv4vPAoTmI41ZM8agSIzKLMX13sQ/NBzauZlnEcQnwNsbSEkwxbqzdl5Nr8eVVBaaoN8+fZu++AMvttc7NMq9O88cTdh2Yj8+0DK+nMuS+DwXex9RjgzE4KoHpGYR/us2Ld4EeYc+uL89+ADwNdM7wnnTDWIOvAe2DeWWv4/6Q6++QIvyxwFvALOp7Bnlpvsrm1SS7PgaYihFW7/ylEa/D/57/1L4TlUCbkLyabP
P1oXSuJeovZfcnh8PhcCQnbX+kDofD4YjHCanD4XBkiRNSh8PhyBInpA6Hw5ElTkgdBUNEuqY+ypEMEekiIu49LjDuBjgKgu3Ll/YIowzi+a98x1FgriZ+KKSjADghdRSKCzB+bfNNwjnCmghPYcaYOwqIE9L9ABFZan8T7HqlGO9C00Skj4hMEJEzfevTvOPs8kERWS4i3/aFOc2GMUJE7haRd0TkHDuceEIgnGfEeBXq7kvWGcAbdv/HXvrEeDx63f462m1LRWS0iNwoxkPUNfa8j2wazg5eh93/HeBYEbnFK/6KyBoR6SoiN0m9l6NpInKEiLziS/8QX3pes+lpI8bb0LsicrHNx65S7wUqdp1ivFC9IyIrRaSHL98qvbwJieseETlRRM4X493r5/ac62webbbrsXxQ1WqggyveFxaX+fsHH2BGm/i5LLB+c6KTVfXHmCHC54WE8WfgKIwDjOsSnH8pZgz3EN/mQ4B/2f/rfOm7COOU5QmM1QpmtNXDqvooxlPQFXb7VxinGOOC12HrX+/GjL+uAybY/RsxfguOxwwZ9bgBM6RxF8b5iceFmPHuszFj0McAQ1T1Wbv/F5ghjsHrPAszmmYRZtx8Kg7BOAharKpzVXUQcLbdVwb8CjOaipB82IAZd+4oEE5I9w+C9zk4/vsY4qfYPsdnjbayFtckzJC8IJ3s+c9jvAeBEdQZ9vzOIvISRoD855cB20PC6wassb8e/h3Won7JH44aT0KeUPmv4zTMMNEaVX0EOFlEWgDLMCK6nXrB7Eb9XDx/Aw4H7rPr3TFDUK8EWmO8n+3wJes7wIch1znThhF1nunh2DHz1vp+i3pnJB3wuXwLyYfdRBNrR55wQtrEEZHDAL9j4M4Yn41+TsHUtXnM13rfp0djLMZxhLMZ48BjiKp6FtI06i3eM4FXgKBLuRqgp4gchRkn71GNcXTxDRr6kD1XVU/3bxCR1tQLuP86dmBdwYmIJzJt7bHtgYWYmW3BVDNMBVDVWlUdjnFMAmZ6nbtV9ThVfQ7YY+P0mIqxgoPX2QZjxUatB34EuF5EWmH8m55u8wKMz9H3k+RDJ3LoEs6RPk5Imz4PY7xUPYCxFB8C7gkcMxMjPGFUYcT0t5giZBzWOlsmIm+KyA9Czl+IEdWxxL/sSzGW3wMYd3Aez2GK2aOA/wuEtUxE3qZeYI/GOKx4LOQ6XsVYpd/G9A74A/VW20TiLbpXVXVNSNrBON642dafdgJ+D7whIt6H4nGMN65FgescQbhHoyMx9+QMbJWApdam8wabtkWAisgVwKeq6v+oxPLB1o2Wq+rOBOl3NALOaUkTR0QqPevS/7/QiEhvYKKqXpdFGEmvR4y7tlnAT7SJzngrImcD31HVXxc6LfszziJt+oxN8L+gqOo/yczBcjpxbMPMdtskRdTSGuMc2VFAnEXqcDgcWeIsUofD4cgSJ6QOh8ORJU5IHQ6HI0uckDocDkeWOCF1OByOLHFC6nA4HFny/0qtVdCWs3nUAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f40815f8>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"log_counts = np.log(counts + 1)\n",
"mean_log_counts = np.mean(log_counts, axis=1)\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_12.png', dpi=600) "
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAABfCAYAAAC6PE+FAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAG6RJREFUeJztnXmUVdWV/z8bKRCZBykVCWhQpDBGSZOkgwOIiTMJGTSaSWK6/SXSWf5kQdvdpKPGThRb2t9PXN0xRjD2StLGRBsQaY1QJI6hW9BoVYhETAiGKiYDFFgUuvuPc++r827d9955U03uz1pv3XfHM9x7v3efaR9RVQzDMIzS6dPVETAMw+jpmJAahmGUiQmpYRhGmZiQGoZhlIkJqWEYRpmYkBqGYZSJCalhGEaZmJAahmGUSUEhFZG7ReQXIrJLRNaKyMudETHDMIyeQohFeoaqng00AJ8C3qpulAzDMHoWIUL6N9FyEfADYEn1omMYhtHz6BtwzDYRuQB4DhgPbKxqjAzDMHoYIRbpT4HJwK+B3cD3qxojwzCMHkaIkA4AdgJv4izYECvWMAzjXYMUcqMnIrOA4d6mPaq6vKTARAQYBOxX899nGEYvIcS6/KSqXlWh8AYBe/fu3VuhyxmGYXQakmtHSNH+4yKyJvqtFZE1FYyY0c3Yub+V29duZuf+1q6OStH05LgbPZsQId2oqudGvxmqem7VY2V0GUvXb2XBykaWrt/a1VEpmtC4m+AalSZESB/3V0Tk61WKi9ENmDN1LIsumcScqWODz6mmMBVz7Xxx39S0j4vvfZ5NTft69MfC6J6ECOknRGS8OMYDX6hulIzOxherUYP6M3/GBEYN6p+6P41qClMx106Le8z1KxpY1djM9SsaSvpYGEY+Qhqbvgp8EzgWaKZ9pJPRS4jFCmD+jAlF748FqVxh2rm/laXrtzJn6tiMGM6qq6X+d7uYVVeb97hCLL60LrOMBdcwKkWIkP4JN6qpRlWXiMjEKsfJ6GTSxKqY/ZUSpjTBXt7QxKrGZqa/dySzcJbl5NGDuH3da1nHFWJi7WAe/cqHyo6jYaQRUrR/CNgOXBmt21j7XkJcZP/Rhm2samxmeUNT1n6ZtwJoF7Nc+ysRh537W7OK3PH2WXW1mW1x8XzDG3srUjT3w06rvgjdZhghQroLOBUYJiLzgZbqRsnoLDIWoJBXmGbV1XLRpNHMqqvNKyTJfUmhTTt3ydNbWLCykSVPb2HUoP4sWNnIqEH9M3Fb3tCUqfdcfGkdF00azZLZp+asCy2GRWs3s2BlI4vWbk6tiw3dZhghRfvZwNnAa0ATcEdVY2R0Gn7dZj5Rii3WqccPZWD/vpnid5JcdalxnWZL62FueuLV7P3x+LbEOLe0eteQ4rnMW4HecWmHbTtu+liHetUN2/Zmlktmn9qh+sKPQ5yGeL81VBk+IRbpvcAXgY/iWuzvrWqMjKqSZhXuajmU15I80PZ2Zulbp0lytYbns3znnnkCiy6ZxNwzT8g6x7dOi61CSEtjmiW5ZPapGQv35sc3saqxmZsf35Q5f1fLocyxi9Y46/X7z/+hItaw0bsIsUhvjJYPYF2fejy+1QiwYGUj9b/bxarGZiC98eaomiMyS7/xxye2BPO16p81fjjfenJzRoRjC7GYhqpcFqe/Lc0ynjN1LAtWNmaJ+MiB/Zj+3pGMHNiPR15x9b+PvNLE6ccP65AvG97Ym7WsBKX0PjC6J6FCOhj4nar+vrrRMaqB/8KmFZln1dUy/b0jcxZX5555AgP792XO1LHsajmUKQLnKuInic95tKGJda/tBmDq2KHc9PirtBw6zI3nn1J0mtIENSZNNGOhipcybwWLLpmUScN9l53G1Q++xPcvO42ZJ48GsvNlVl0t169oyHSjqgSFupUZPYeQov03ga+p6pxqR8aoDn6x1u+0Hv+fWDs4b3HVPydXC35MWj
H8modeYlVjMwcOHeaiSaOdGHl1o5Vo/U/G11/mwq+KuHzKWFra3uHyKWMz1Qp+vsT1sxNrB2fiW26rflpViPUK6JmEWKTLARWRPTjvJ2rj7XsWhTrMFypi5rJoQy3SWDP7HtEnU5T2rdy4AaqzKbf/a5pFueTpLcGWdlr4ZqX2TEIs0hnAj4ENwN+ZiPY88g2dhOwuSGmU2+Xnnk+fxkWTRjNlzNDUrk5dSTnWcGrjWo5eCGmkWZ82fLVnEiKki4FTgGOAB0RkZciFRWS7iNSLyDHlRNCoLGkv74HWt7OWSfyXu5DoBtFDXXonRTf5MZB5KzK9D5K9ENJI+0AV+ujF929T076iqwCs2qB6FBRSVZ0T/T6nqiep6iWFzhGRI4DHVHW6qm6vSEyNiuC/vPGLtfuA6+YTd3PKx879h7KWIcx9+GVWNTbz/B/2cMrogVww8ejSIt8FFGuxpjVqJa8T/w+1Pn0BjO/f9Ssaii4l2GCC6lGwjlREGnBDRCG8jnQ4MFlEZqvqw2XG0aggfh3nojWbuX3da4wbdiQAm3akD1rz6/3iY3Idm8bQI133qdd3H2TngTZuWPUb6q+dVk4yegVpdaRp9dV+vWns9+AbMyfk7WmRRqWcyxgdCSnatwL/AMwFzgupI1XVncA04FoRGVdeFI1K4hcdn9/6JgBHD+qX6ZieilfvN+/sEzh6YA3zzi5cdI154rc7Adh5oA2ANw+GW7O9naTFmmY1+pZr3Gvil6/vKXpgQKFqA6N0QoT0EeBi4GvAz0VkbsiFVbUNN05/WOnRM4qlmC457ztmMAAfes/wTNeeNK44YwwXTRrNFWeM4a5nfs+Oljbueia8S3HNEdlT3byx963gc99t+FZjLK6+AFpjVPckpI70JlVdqKpzVXUGcGShc0Tk0yLyNHAIeKkC8TQCKcbRxqhB/bKWufjRRjfW/kcbt2UchxTTMf3Q2+9krb958HDwue82CvWBjQc3+MNXja4npI70a8BlwGGc8P6s0Dmq+hDO/Z7RyaTVg+WqG7vi9DGs3/pnLjj5aG5fuzlnP9LNUX3o5h0tWcMqQxk37ChebtqfWX+nh7badyXxSK7YlSDAqsbmnKO7jM4lpEP+l1T1Q5CZl/55zCdpt6WYTuZxfRuQd6z9ysamzPK4IUdy+7rX2LEvvAvNgJrsgk9NzkltjUIsvrSOVY3NmWU5JIfZ2tj/0gmpI31ARNaJyM+BtcAPqhwno8LkKtrHnpy+MXNC3nq3k0YNzCzX/s41HMXLEPYl+qe+k+M4ozBxPXau+uxcw1dDunFZ96jSCWpsUtVzVPU8VZ0ObK5ynIwKk6uBIrZIV/92R97zv3X+RI4eWMO3zp9Ic9R/tLmIfqSv7zmQtV5MtYARRlIoSxFF/znJJbyV9ovQWwgR0m+LyK0icpaI/Bfw4WpHyugc4hcHJedLN2Lh6qyW+o+eNAogswzhk6dmD2770z5rKKk2pbTud5dhuz2RECF9EhgK/BR4Aucp3+hBFLJOrjhjTM6Xbs/BtqyW+uOHDwDILHMxYuHqzHL6e0dS06e9YtSqSKtPMX1GzcosnxAhBTeL6HwgfxnQ6JYU8lzvz4uUht9Sf8XpUZ/S08cA7YKZZM/BNvSOS9lzsI2vP/IKbV5T/dihZvEYvYuCrfaqej+AiNyuqvOrHyWj0vgt+YWcPPuWZMySp7Zw0xOv0tJ6mAOH3mZVYzOTRw8CnGDmEtPMNY/qxxteK//eHM5R/LB333JBh31p29L+G0ZnE2qRgvOSb/RwChXzfUsyg7Qv06bcyDqWjqJ24SmuPvXYQTUAfPD4oR3CHbFwdVbYyWvk2uafk4xHWlxGLFzd4WOR3BZ6TqFwuhorsnceBYVURB4CUNX/U/3oGFBdd2dpLvEWrd2cN7y506IJ6qadkDVhXC6Sgnbf+m0A/Gm/2/7s1j2pophvPde2NHyxS7tu8mNRSJBznVMonBBBTot3NSlFXAtNq22CHWaRTh
OR+6LfUhG5r+qxepdTqf58BX1XRtWWG7btzRue33DhT7kRyoCa7PWaPn2CRbEUUq3qLginkCCHWN3dQXyTFPN8NnxJaPhS1zQvdmbYISObPky2K15rdK0ylXJ3Fj/w/myY8dzyLa2HM9N9zKqrZXlDU/D0IfkmnkujLTG0fq+NtQfCrO54Xe+4FJm3Ilh8/frkSottMc9n3f1dNx64M8MOEdLTga8CNTgnJN8FbDbRKlLuXEIx8YPuz4aZ8Wwv2eHML8LCLJaavn1oa2sfz2RD7UsnVHzzVTvE24slPsd/bvI1EIZQ7Ee5uxIipDcA01W1VUT6A7/AudYzeggjB/ZjwcpG5s+YwNxpJzCwX99OdcM2uN8RHPCEtE8feLsHjxN9ZecsACaPWt5hG2iB/dDwJXglx35/WznEVQh+/WU+cU3rAZHclkvEk+FA7xHIUEKEdB3wiIg0AbVAfVVjZJSF373J96zelRw79EiaWto4QuBtheOHDmDLnoNdFp9Xds5KFTN/v0PzCmDadTSxP76OH04sPGn7C4XdTrhd74t0muhBxyqEXNuqQanWbHcipB/pDSJyJDAC2K2q5pW3Cwj1zOOLZ9rUyfH0Ijv2tbJo1uTg8MuxMIYc6Vqb4j75Ozpx8rVclmJHMesolL4oQkdBySdM8f74OqH7Q8NWsq3cZBrT0lPoA5JGyEenXeRdSoop7vv3pydT9JxNImLz2ncBSYHMJaq+eKbVtSb7gXZGEeyeT5/GKYvqGdBXOHBY0Qo2VxZ60XMJU3I/VN/yqjTJ9BQj/IWqIgqdEz83/v5YCnMV9/OloWfLaFjRPp6z6c/AJlUtPNWkUXF8gfRFNSmUhRqqlsw+letXNHTwcJ9PUPN1vwkh7io1+ZjBrP/jXupGD2L9H/cWOCs/xVqXRja5qiJCP0qQ+3lIs1LzMTzqH9eT61VD52y6iCLnbDLKx+/47PfljP2IzqqrLep6IxauZmLtYFY1NhfVD7TQ6KE03MskWeL2wJVTspYh58Tbk9smj1re4WWPt00etZy6+7XbiqjMW5ERD5+0bdUIJxd+vqXleZL4WUiGU2z+9/T6UQifs+kbxczZZFSGXB2fYz+iyxuaUs/LVaQqNJSzkv0N/Zcpvm4hp8S5XuRYHENe7s4mnygmrSt/W5p47L7lgtRz/HBy7Q8NJ1RcJ49azpljHyt4XDHh5OsgX26pp6spZqw9AKr6z4WOEZFaEXlWRH4pIr16jP5zW3Yx6bY1PLdlF//xwlYG3fAo33tmS/AQz3zDQXN5bQrxNRk/kElRLTSs0aeQcKXtT9tWSMCLtTgrTVKsQvYnxSopcLmE0iekGOtfO1R8c4WTL75paYyvU6xlmyu+dfdrh07y8bGdNRqtWoTUkWYhIier6m8LHHYx8O/AKOA84OES4tZteW7LLuY8+CJLL3s/V/5wA1t2H+TKH26geV8rLW3vcO3Dzm3cN1dv4sBtF3c43/8qP3rVq5n6zu886SYfiB/AZMfnPQfbsh7oOVPHZiatO/nWtZlzY1EasfCxzLn5Lbncrb3DB9TwytYLU7vf+PuT28jzQuw52Jaza1D80iZfqHzb0s4pdB1/vy8Y/v+Y5H7/uN5CWhqhvcEq3hY/f/79q0S9ZrH1qvmI36/uNrIpyUeAQkJ6HNAItAFjch1UKMH5xsnW3a+p54eeE0raOXPqnuSnDTPhRlgZb/SmMHr6mteY9t0To/h0vE4sHMMH1PDUspO4GBg99S0uXnZSznOeSolb87Xui9W8zN+vmQaYp7x4J7vqxCIyfEANu2kXzVh4YjF0L9Dy1HOAzP6O21ZkXrynPKH1i+v+OcMH1HQQqVxilrbN/++//IXO6Umt9F2NL5jx8xRybAh197f3BEijGHHsimGpopo/UBE5O7FpDjAOuE5VU+esF5GFOCEdCdSo6t3R9sHA3m3btjF4cM8t8f/q9V189eGX+dfZpzLsqH78/WO/4dsXnsLJozs3Tbv2t/
LAC3/kC1OOZ6RND2EYVWXIkCFDgP2aIpohQvoasCxexU3PfGKBc74MDMAV7V9S1Yej7ccCbxSbAMMwjG7CEFXdl9wYUrQXXKXFXuBlXJ1nIR7FdZs6DCz2tm/HFfv3B1zDMAyju5GqXSEW6TjgCNwQ0dOBm4HVwK0BjU6GYRi9nhAhFeAsnCXZjJvXXoDtqtp5g6YNwzC6KSFC+iCwCeeDdCwwTVVDiveGYRjvCkI65I/HzWf/W2ANMFJEzk5pze9SRORiEfmJiAwXkb8UkS0Vvv7NInKniFwnIs+IyE+i7Z8QkQ0iclcFwrg8GsRQLyIni8jrIlIf7btSRNaLyA/LDSe63vYonDHRFDLXRdsrkh4vv+pE5H9EZFVUukFErvDStVREXhSR/1tCGH5+neaHk5YOEfk7EVlWYnrGiUhryv2fISIviMjSaL2PiPyLiCwoIYzJ0UCW50RkRPI5TuTb3Cjcvy0yjJtF5M7o/9Ui8u/R/6z8EpGVUb7uF5GBInKCiPxeRIYFhPEZEVkX53Ui3p+P4n1LtH5btP65aH1KMe+u997PFpFficgTIlKTEu58EflvEVkUrQenJ4QQIb0PmBH9puMakeL/3QIRGQRco6qfUdU9wFeAP1Xw+hOA0dHq/1PVjwDjoxv2ZeCTQJ04d4Pl8KCqnoXrlSrAUlWdHu37FDATmBA/KKUiIkcAj0XXfg/ZjY5lpyeRX58F/hFoAk6PxPRC7/AvA9Nw6SsWP79u8MNJpiN6Rsr5+F8HvErH+z8H+AIwTkRGAp8HXlTVRSWEcS7wHVy34Cl4z3GOfPswEDxm1r8vIjIeV7r8vHe9TH6p6iU4/xoPqmoLzt/GmyHhqOpPcPrwwSiP/Hj/FXAOcEm0fhHwl8DV0fpMIMhVp//eA2fi8uvPuHuRzK+PqepfAB/zwg1KTwghQnpsNN7+JlW9Cegf/b+5UpGoAGfiHoA1IvIR3AN/qFIXV9XNwI+j/yoitcBWVW0D3gH6A/uA4WWGoyIyADf19SBguoicE+1eBnwb+EUUbjkMByaLyGxVfRZ40ttXdnr8/MLVrW/HCcIY3MO9yjtWcS/doyWE4+dXayKcZDq+CNxfSnpEZET0d2fK/Y/Ttx04FveCXhNbPkWyAidmx+C6CfrPcVa+AW/j0hc8qCZxXy7ACd3KKA/T7vtVwD3RuXcDe0LCiT7UvwH+G/hoIt5x96EDkRD2wfXuGRiFczvuXoaQee9xoyfnAnujdCbz6xERWYwbcVlUekIIEdLzRGR6ZApPx301uxsjgCXAg7gXsmoznUZfuiVAXBS9E/gerkFud67zimAxsFBV/wf31V4cWYYTcXXVx0UPasmo6k6cFXituF4ZPpVOT1bQwMeBn8UbRORo4K+Bgj4ccrAYWIgTFj+cZDqmAs+VGMaVwANRfJP330dxz+KFwFklFBvH4AS5HzCP7Oc4K9+AO3DP+tFFhhEzApdvL+OstLT7/iFVLTrPIlebdcBQ4DOJeGcdCtwLrCX7/oXiv/czce04Q8UN/Enm14m46skTSginICFCegXOFL8BJ6Kfq0ZEymQH7ov2Fu4L/f9xX6rZVQhrNrBeVX8PoKr1wCdwlmJZvRhEZIq7pK6Prt2CG2bbH3cf7sE9nOPLCSe6dhuwCxiW2F5PhdIT8QbOwooHY4wDfoC7P2cC3wBuKcXKTuRXVjh+OnDd9yYAdwHniMjEIoM6EViAE4f5ePffC7cWJ4Lxs9hK8TPuXoZrj1iPEzf/Oc7KN1X9Ma6Y/GSuixXAf2cked9F5JjomJKIxLQvcDzZ93tvZIkOUNUWVf0X4BayrcdS0nA9ztreDnyQjs/Z+ar6b8DZ5VaNpaKqPf6Hc+33X7iXZny0rb7CYUzHfbXvAjbi5q6agPsSPgycWIEw5uGKRPU4gXkO11+X6EH5dRRWnzLD+TTwNM7KElwR7rpoX0XS4+VXHa6It8qPd3x/gJei9K4sM7/O9sNJSwfuA7SsjDTVp9
z/GcALuPpscMZGPXB/Cdc/J8qPXwHHpT3HXr5dDvwHMLLE+1KLswTXAEOS+RWl446U9A8LCOM64JfAQ7T3DIrj/fkov26J1ucCS3HCGp+/MTAt/nv/t9H7UQ8MTMmvxVHe3l1sekJ+Bbs/GYZhGPkp2h+pYRiGkY0JqWEYRpmYkBqGYZSJCalhGEaZmJAanY6IjC58lJEPERklIvb+dhPsRhidStSHr6QRRkWG80/VDqOL+QLZQyCNLsSE1OhsPo7zZ1ttcs4V1kv4IW5sudENMCHtxYjIxuh3Y7ReL86z0DIRGS8iN4rITG99WXxctLxLRF4Skfd511wWXWOWiNwqIs+LyAXRMOIbE9f5iTiPQsd40ToXWBft/00cP3HejtZGv+HRto0i8tci8hVxHqK+GJ33ShSH85PpiPZ/AJgiIl+Li78islVERovIX0m7h6NlIjJRRH7uxX+6F581UXwGivMy9CsR+VSUj6Ol3QNUJp3ivFA9LyKbROQ4L9/q47xJCes2EfmwiFwizsPXP0TnXBXl0ZvReiYfVLUJGGbF++6B3YTeza9xo0x8PptYvybXyar6N7ihwck5pT8LPA6chnN+cVWO8z+DG7893ds8Dvhj9H+7F79P4hyz/ABntYIbbXWPqt6L8xB0ZbR9B84ZxvXJdET1r7fixl23ATdG+3fjfBd8EDdkNOZq3FDGt3DOT2I+gRvrvhw3/nwuMF1Vfxrt/0fc0MZkOs/DjaJ5FjdmvhDjcI6BnlPVlao6FTg/2lcDfAs3koqUfNiJG29udDEmpL2b5P1Njv0+g+yptS/wrNH+kcV1C24oXpIR0fkP4zwHgRPUH0fnjxSRx3AC5J9fA7SkXK8W2Br9jvN3RBb1Y/511HkQioXKT8fZuGGiu1T1e8BHRKQv8CJORFtoF8xa2ufg2QCcDNwerR+DG4L6OeAonNezA160PgC8nJLOB6NrhE58fynRePnI+v4l7Y5IhuG5ekvJh0OEibVRZUxIeykichLgOwUeifPV6DMNV9cWs1rb/Z++H2cxXk86b+Kcd0xX1dhCWka7xTsT+DmQdCe3CxgjIqfhxsnHNOEcXBxPR1+yF6rqOf4GETmKdgH303GAyA2ciMQiMyg6dgjwDG6GW3DVDEsBVLVVVS/FOSUBN63Orar6F6r6M+BwFGbMUpwVnEznQJwVG1oP/D3gyyLSH+fb9JwoL8D5G30hTz6MoIKu4IzSMSHtvdyD81R1J85SvBu4LXHMgzjhSaMBJ6b/jCtCZhFZZy+KyC9E5PKU85/Biep1ZL/sG3GW3504V3AxP8MVs+cA/5m41osi8jTtAvt+nKOK76ek40mcVfo+XO+A79Jutd1MtkX3pKpuTYk7OIcb10T1pyOAfwPWiUj8obgP55Hr2UQ6Z5HuyehU3D05l6hKIKI1iufVUdyeBVRErgQ2q6r/UcnkQ1Q3OkBVD+aIv9GJmNOSXoqI1MfWpf+/qxGR9wA3q+pVZVwjb3rEuWl7CPi69tKZbkXkfOADqvrtro6LYRZpb+a6HP+7FFX9A6U7WA4NYz9ultteKaIRR+GcIhvdALNIDcMwysQsUsMwjDIxITUMwygTE1LDMIwyMSE1DMMoExNSwzCMMjEhNQzDKJP/Bd6KXE9LT9rSAAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f40aa320>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Log-transform the RPKM-normalized counts (+1 avoids log(0))\n",
"log_counts = np.log(counts_rpkm + 1)\n",
"mean_log_counts = np.mean(log_counts, axis=1)  # mean over samples, per gene\n",
"log_gene_lengths = np.log(gene_lengths)\n",
"\n",
"#with plt.style.context('style/thinner.mplstyle'):\n",
"binned_boxplot(x=log_gene_lengths, y=mean_log_counts)\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_13.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 110,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADPCAYAAACnWlnGAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAGVlJREFUeJzt3XuUVdWV7/HvJDxtHooIREEKgfBSHn1VbFql0Ci2UWMSiSgZIkZMOrReopfcRBNC0DaORlvaV6soEDGKxkSjaWJTPoAYkDC6E4IENEJjEHlcUAKl8irm/WPtgqKgYJ06teucXfX7jFHjnH3Ofswq9pjstfdac5m7IyIitdek0AGIiGSdEqmISJ6USEVE8qREKiKSp4ImUgvamJkVMg4RkXw0LfDxWwPbt2/fXuAwRESOqsYLPjXtRUTypEQqIpKn6Ka9mTUB2gGtgJbuviZimylAW+DPwGjgNXf/fi1jFREpSlGJ1Mw2A1uB14Bdycc3H2WbnkBHYKe7P2hm04HfA0qkItKgxDbtuwO3ERJvT2DG0TZw93eBOQBmdhLhqvTntQtTRKR4xTbtbwD6Eq5KnyckxWjuvt7MegM/yy08EZHiF3tF2gHYC3QBvkEtrizdfSdQYmbNct1WROrelvJdTH39XbaU7zr6ynJEUVek7n5bPgcxs7uAUsLDpj357EtE6sbMpev4zq9WAjBxeM8CR5NtsQ+bHgROA/oDy4CO7n7q0bZz9/nA/DziE5GUjD2j60GvUnsWU4/UzBa5+1Az+w3wRWCeu5+e98HN2pCMbGrTpk2+uxMRSVPeI5tuTF7/BXgCeCDfiEREGorYRPo3ZnYu8FdgKjDEzF4zs9LUIhMRyYjYRDo8+SlNfoa5+3nJPdAGp6KigquuuoqTTz6ZJUuWUFpayqpVq/Z/P3nyZB5++GE2b95Mr1696NWrF1OnTj3ou0rz589n1KhRAEyfPp1evXoxevTog45Xuc1jjz3Grbfeyt69exk5ciTdu3dnxowDXXabNm1KSUkJZ511FgClpaX79zt58uSDjtW5c2cA1q5dy+mnn07//v1ZvXo1ADfffDPdunXj9ttvp2fPnjRr1owePXrw8ssvc80113DKKafQt29fli9fTnl5OX379qV79+788Y9/rMs/s0iDEduPdH7y6oT7BJ9PJZoa2C0v1fk+/Z5La/zulVdeoaKigr/85S/s27evxvU++eQTjj/+eBYuXMjAgQO57rrralx337593HfffaxatYpzzz2X1atX06NHj/3fl5eX8+STT7JgwQLmzZtH8+bNWb58OYMGDWLMmDEAdOrU6aBkCbBnzx6mTp3K1VdfzeGqEd5///1MmTKF9evX85Of/ITRo0ezaNEi1q5di7vzgx/8gJKSElasWEHLli256667mDt3LrNnz2bhwoWcdtpprFy5kqeffpoZM2Ywbdq0I/5dRRqj2ERayoEk6sAraQV0OEdKemlYvnw5Q4cOBaBJk3DRfvHFF9O/f3+efvrpQ9Zv3rw5AwYM4O233wZgypQpPP7448ycOXP/Olu3bmX16tX079+fHTt2sGXLloMS6b333sttt91Gu3btWLFiBWeeeSatW7fmhBNOYMOGDbRt25Z27dodcuyf/vSnDBs2DIATTjiB995776DvV61axQsvvADApZdeyvLlyznrrLMws8MmXoALLriAbdu28c4777B27VpGjRrFhg0bGD58ePTfUKQxiW3azwM2EkYmfYZ6TqT1bd++fYckmblz5/K5z32OJ5988rDbNGnShIqKCgAmTZrEpEmTuOOOO/Z/7+7079+fVatWsX79eoYMGXLQ9oMHD2bevHn7l6sff82aNZx00kmHHPe5557jyiuvBKBfv350796dvn378uGHH+4/7osvvsjq1auZNm3aYX+36srKyrj77ru5++67eeihhxgzZgyPPPLIEbcRacxiE+m/A5
8Ci4C5hAdODVafPn1YtGgREBJRpVatWrFnz6HjCSoqKli2bNlBV5jV1+3QoQObNm1i48aN7N69+5B9XHLJJezdu5eysjL69evH0qVL+fjjj9m8eTOdO3fm2WefPewV4dChQ2na9EDD4qmnnmLlypW0b98egL59+1JWVgbA7t276dOnD4sXL8bdOVLXtxYtWrBr1y4qKipo2bJljeuJSHwiXeDus4Hr3X0JsCLFmAruoosuYs+ePfTs2ZMFCxbQsWNHRowYQVlZ2f6rv0rLli3jlFNOYcSIEZx44om0b9+eKVOmMH78eCZMmLB/vSZNmnDHHXcwZMiQGpvIP/rRj7jzzju58MILKS8v59RTT+W73/0uCxYsoKysjG9+85sHrd+0adMj3pcFmDBhArNnz6ZHjx4sWbKEAQMGMHjwYHr06MFTTz11yPrdunXjwgsv5Pbbb+eGG25g3Lhx3HPPPYwbN44uXbrE/glFGpXYDvktgIFA5aVJubv/d94Hz3iH/Mr7h2+++WahQxGR9OXdIb8M+ArwOKEb1Mwjry4i0njEJtK97v5/gXXAcqAivZCyo6SkRFejIhKdSC9MXscBpwBHvjEnItKIxPYj7Wdm3wZOBDYDS9ILSUQkW2IT6UPAGOA9oCuhP2ne1Z9ERBqC2ETajjBnE4QnVyea2QwAd1czX0QatdjuT52AFlU/qnzj7u8dukXkwTPe/UlEGpUauz/VeEVqZucBH7j7KmAQYfrlpoTpmB93d80IKiLCkZ/a7wK+lby/Ffiyu58PXApMTDswEZGsqPGK1N1/a2anm9mrQC/gkaTYRSfgqHXtzGwK0BZYC3wVWO/uI+siaBGRYhJ7j/QEwvDQncBWd6+5SGdYvyfwf5L1v+3ubmZLgaFVZxHVPVIRyZC8h4iOBGYDc4BXzGz8kVZ293eTdUmSaCdgnaZiFpGGKLb70xh3HwJgoX2/BHgwZsNk/QeAb9cqQhGRIhebSGeb2Xxgb7LN7ByO8SVgaT7dpEREilnUPdJa7TjMMHo5oaL+OcA2Qj3Td6uso3ukIpIVNd4jTS2RxlAiFZEMyb1D/kFbmw0GjiMMFW0FtHT3GUfeSkSkcYi9R3o/IZHOBRYDH6cWkYhIxkR1f3L3s4GLgQ6ESlDr0wxKRCRLYpv2LxLK571KKO58TJpBiYhkSWyH/PuBHwMrCRXyz0wtIhGpF1vKdzH19XfZUr6r0KFkXmwifYtwj7SVu/8bEWPtRaS4zVy6ju/8aiUzl64rdCiZF5tInwM2AVcnyw+kE46I1JexZ3TlXy7py9gzuhY6lMyLfWq/FegPHGtmE9FTe5HM69C6BROH9yx0GA1CbCIdTZijaQ1h8jv1IRURScQm0pfcvbRywcxeA85LJSIRkYyJTaTvmtkswiyi3YD3U4tIRCRjohKpu19vZt0I89pvcvc16YYlIvXBbnkJv+fSQoeReVFP7c3se8B04IvAHDN7JNWo5BDq8ydSvGK7P10MjAAuc/czCU/wpR6pz59I8Yq9R7qHMDzUkgdNe9MLSQ7nsn6dmL96K5f161ToUESkmth7pHpCX2Av/mkTc1duprTH8UzspNqtIsUk9opUCqxy9IlGoYgUn9jqT+0I04Z8FtgA/NLdt6UZmBxMo1BEilfsw6ZfJK9Lk9fnj7aBmU0xs2lm9hkzm2lmE2oVoYhIkYtNpAbsAyqS1xrnLgEws55Ax2TxTHQLQUQasNhE+hWgOfB3QDNCM79GyUyhc5L3iwlP/EVEGqTYRHoj0IWQTLsCN6UWkdTIblEZWJFiFNvkvhyYwIEmfeHmcBYRKTKxifQ4oDR5b4REujCNgEREsiY2kV6b647dfT4wP3k/K9ftRUSyInZk04K0AxERyarYh037mZnGJ4qIVBE7sulBoC9hmpHjzWynu6uIoYgI8Vek/ZLCJb3d/QKgXYoxiYhkSmwi3ZiUz9uWvK5PMSYRkUyJfdh0lZmVkBQtcfe1KcYkIpIpsfdIZy
Zv3we6mJm5+7WpRSUikiGx/Uj7AVe4+zoz68qBalAiUmTaf/9lPvp0T/T6uQw9Pq5VMz6846LahNWgxSbSm4A7zawjsIkw9l5EitBHn+5JbWZQ1Xs4vNhE2gJ4rMpy8xRiERHJpNhEOjx5HQPMSt5rrL2ICPGJdCahWMlFHEikIiJCfCL9EaHi06oq769LKygRkSyJTaTjgXOAVu7+gpkdl2JMIiKZEjuy6SWgD3BrsvxMOuGIiGRPLtWftgBNzWwk8JmU4hERyZzYpv0o4BLgZ0Bb4MupRSQieVmx5TL+NCalfQOaaehQsYn0maT6k9SxtEahaARK49W/w4updshXGj1UbCL926TqEyRzNimx1o20RqFoBIpI/Ymt/nRsrjs2symE2wCPArMJQ0u/4O76D01EGpScpxqJYWY9gY7J4ihgEiGRDkrjeCIihRSVSM3s5Oo/R1rf3d8F5iSLJwIbgQ3ASXlFKyJShGLvkT4J9AZeBiqSz2ozsknNehFpcGLvkZ5rZucAVxCKO8/O4RgfAJ0J1fU/yDlCEclZWg8bj2vVLJX9Zl0uFfKd8MS+HzAFuCHyGHOAJwgzkC6rRYwikoNceoHYLS+l1lWqMYlt2k/mQCKFiCa6u88H5ieLp+cYV6ORVudpdZwWqT+xiXQ2dXOPVKpJq/O0Ok6L1J+op/bufi7h/ug2YCUHipeIiDR69XGPVESkQYtt2v/O3f+9csHMbkopHhGRzIkd2XStmXWzoAT4WnohiYhkS+wV6T8SntyfiKZjFhE5SGwi/aK7j61cMLM7gSXphCQiki2xifTzZvY68B7QDVAJvTqUxigUjUARqT+xifQqYCxhmOdmYHRqETUyGoUikn2xD5vaAS2Bbe4+CVVxEhHZLzaRTieMmR+aLP8wnXBERLInNpGuBP4J6GJmDwL/k15IIiLZEltGb6yZdQdmAJvcfU26YYmIZEfsENExwNeBvUAzM3ve3f811chERDIi9qn99e5+TuWCmS0GlEhFRIhPpLurTMcM0C7pV6ppmUWk0Yu9R3p+2oGIiGRV7D3SPxFmAoVQSk9XoiIiidim/aZ8EmfyxP9ZYDtwsbvvqu2+RESKTWwi7VztHik5JtYvAQ8BJcDZwKs5bCsiUtRiE+l17r64csHMSnM8zj6gBfAR0DHHbUVEilrsyKZ7qi3/c47HeQb4KmF01NYctxURKWqxV6Szku5O+5JtnsrlIO6+wczOB/4D+G1uIYpIWlRNrG7EXpHOdffh7n6+uw8DVudyEDM7Cfg58IC7f5xrkCJS997etIMvPLaEtzftKHQomRd7RXqnmX1AuKL8PrCYMMd9FHdfD3w59/BEJC3feO6PLFjzIR/v2sv88X9f6HAyLfaK9FVCTdKfA2WAipaIZJxXe5Xai02kEK5CJxIq5ItIxj16xQAu7tuRR68YUOhQMi+2ab8AuI7QdWk9MCutgESkfvTu1Ib/uH5IocNoEGKvSGcDc4GBwOvk+NReRIrPlvJdTH39XbaUa6BhvmIT6e/c/U3gXnd/A/h9ijGJSD2YuXQd3/nVSmYuXVfoUDIvtvrTLcnrs8nrTWkGJSLpG3tG14NepfbMvXDP7MysDbB9+/bttGnTpmBxiIhEsJq+iC2jdx4wkjBe3oCN7v69uolNRCTbYu+R3gvcB/QGJgOlKcUjIpI5sYn0GXdfCbwF3A/8v/RCEhHJlpzvkZrZycA6r4Obq7pHKiIZkvc90geBvoRRTccDnwKX1UloIiIZF9u075dUxO/t7hcAx6YYk4hIpsQm0g3JVCPbktf1KcYkIpIpsWPtX3T3OalGIiKSUbGJdJKZNa/6gbs/kUI8IlJPtpTvYubSdYw9oysdWrcodDiZFtu0tyqvxhGeXolINmisfd2JvSJ9ouoVqJn1TikeEaknGmtfd6L6kZrZG+5+dpXlBcncTfkdXP1IRSQ7amyJxzbtXzOz/zSzJ8xsHqFafvzRzfqb2WIze9PM2ueyrQ
SqHSlSvGLL6E0ys7bAMcA2d9+Z43HOA34MnAv8LfBKjts3epX3swAmDu9Z4GhEpKrYkU0PE6YZ6eHuA81suruPy+E4LxGKnTQF3sg5StH9LJEiFtu0HwjcCHxoZh2BXGfLOgnYCDQHOuS4rQAdWrdg4vCe6qYiUoRiE+kE4E5gFzAVGJ/jcb5KmMZ5KTAix21FRIpabPenFsAMwhTYRrhXmotfEMrv7QQuz3FbEZGiFptIhyevVwM/Td4vjD2Iuy8g99sBIiKZEJtI5wNtgPPdfUp64YiIZE9sIi0l1CD9enqhiIhkU+zDJoCWwFVm9kMzm5RWQHJ46pAvUrxiE+k64CKgGaGZPz+leKQGKjAhUrxim/YVwHTC9CJfAmaRw8MmyZ865IsUr9iiJcM40PUJwN0970SqoiUikiH5TX4HVK30ZISkqitSERHiE+nlhNFNoKLOIiIHiU2kxxCKjhiwErgnrYBERLImtoxen8r3ZnYa8Ajw+bSCEhHJktgyeiXANcCJwAYglxJ6IiINWmzT/kngNuB9oAthvP3QtIISEcmS2ET6EaG6/Tqga7IsIiLEj2z6ErAI2A38ltAxX0REiE+kTwGbgaeB3sBrqUUkIpIxsYl0PHADYb6lPRyoTyoi0ujFDhF9vdpH7u7n5X1wDREVkezIb4iouw8HMLPz3f3VuopKRKQhyKUeKYQuUDkzs38ys/lmttbMvlGbfYiIFKuoRGpm1ydvr6vNQdz9AXcvBd4B5tRmHyIixSq2H+l4M3sHwMxOBsi1jJ6Z9QPWuPtfcwtRRKS4xTbtXyDM21RKeGJfWotjXQb8shbbiUgKNH1N3YlNpP8GvEfo+rQWuK8WxyoFflOL7UQkBZq+pu7ENu1/DjwB/I4w1v55cu9L2sndy3PcRkRSoulr6k5sIjVgH2Hupn3Uorizuw/OdRsRSU+H1i2YOLxnocNoEGKb9l8BmgN/R5hJ9PLUIhIRyZjYRHojoUnfnFD96abUIhIRyZjazNkEmrdJRGS/2ETaCs3ZJCJyWLFj7ftWvjezAdTxnE07duyoq12JiKSibdu2bYByP0ylp6jqT2kxs88CHxQsABGR3LR190Ou/AqdSA3oDKh/qYhkQfFdkYqINAS5ltGTDDKzZmbWsdBxSMNmZm3MrG2h4ygEJdLG4e+BWwsdhDR4twB5z5yRRUqkKTCzIWb2JzNbbGY3JJ/dkywvNLP+ZlZqZtOqbPNjM1tqZv9a5bPvmdmsGo4xKymWfW2y3N3MfmFmZ5nZOWb2GzP7LzPrkO5vK3XNzH6WFEF/KzkvnjGznmY2wMweMbMSM/uzmTU3s0GV50hyfr1mZi+bWf/kszPM7A0zW2Bmw83sWjP7Q9Vzr8pxjzOzXyfnYffksz8k59mg5LxdbGZvmln75Dz/vZk9V69/oGLk7vqp4x9CpatphO5ly5PPZgGDgLOBxyrXqbJN5f3qPySvrYFfA7NqOMYT1ZZ/ARxXbV93E7qplQLvA/8NDEu+W0mYYvubhf576eew/76TgcuT96cCDwCPAicDJcAG4ObknJpV7dzpDryevF8IfLbKfr9eeQ4c4di3AtdVP88IIxwvq3Je3UcoXvQToEcS81vJedUJGAysABYDQwr9N03zR1ek6WoF7Kz2WfvDfIa7u5mdBixJPrqGcILWpImZTTazY8ysFTAMeMnMhiX7MqAfoWIXwHPAV4FvJ8u7gHMATf1S5Nz9LaAtsNvd/5J8vAQ4Hfibw6z/P0ArM/sMcKy7b6jydUtgnJmVHO5YZjYZ+Efg5eSjfmb29eT9S8CXCT1t3iAUMGoBfARU3oP/PjADuApoB5QRzuXv5PI7Z40SaXquAN4mXJlWegwYB9xVfWUzawn8mAMn3BnAmzXt3N2/Bqwn3Jc6jlDr9XoODOW9FXjU3bdX2ew94LNV9lEBfJIkYilunYHPVfvsAY7+H+Geqgvu/iAwBZh+uJXdfTJwJ6FAEcCZQKmZnQOcBGwk1N
zokOzjNmAksLXKbg46zwg1jLscJc5MUyJNz3PAlcBFVT673t0vdff3D7P+twjNqL+a2TFAT+B+YJiZ9a7hGO8TrnA/JFxp7CR0z+0MnOHuL1Rb/5RkG2B/P95j3f3T3H89qS9m9gVgPvA7M7ug8nN3X0RIsNXXLwF2Jv9RbjOzXtVW2UC4WqzJFsItA9x9H2HQTHtCi6YMWAqMcPcVhCb+CuDPVbY/6DwDugGbj/JrZlrsWHupBXf/jZndUlMzCrjczPoAzxDuNZ1gZt8CLnH3c5LtJrv722a2ALjQ3XeZWWtgLqFZdY277zSzPxPuw95LeEo/yMzmE65aVgMXEK4y/ndy7B6E5uHTdftbSwrGE5rKzQnN5hurfHcfofUDsMPMXgW8yjq3ALPMbBdhposvAgOBfwaodl71Bh4GjgW+ZmZnE1pPO4AfAtsI/7nvJJy7AwlXt7cmt5M2ApOS9UYCpxFKcJbSwJv26pCfEWb2a3f/hzrc3x/cfVBd7U+yqa7Pq2r7LiU8MJtwtHWzTk37DDCzZoT7ViJ1RudV3dEVqYhInnRFKiKSJyVSEZE8KZGKiORJiVREJE9KpCIieVIilUxKqmetTSoT/a9CxyONmxKpZNksdy8FRprZEjO7KEmwk5NSc7MqXys3qPysQPFKA6VEKpmWFHsZQKhpcG1ho5HGSolUsq49oe7l84SybhAS6pwq61ygWwCSJiVSybptwFJ3L3X3q5PPZgGjqqxTRiiaMbaeY5NGQolUMs3dPwGWJVO4XHmYVXYDQ4GHgF/Wa3DSaGisvYhInnRFKiKSJyVSEZE8KZGKiORJiVREJE9KpCIieVIiFRHJkxKpiEie/j+JtZC9C2wM6wAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f07a4ba8>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Pick two genes by row index and look up their names and lengths\n",
"gene_idxs = np.array([80, 186])\n",
"gene1, gene2 = gene_names[gene_idxs]\n",
"len1, len2 = gene_lengths[gene_idxs]\n",
"gene_labels = [f'{gene1}, {len1}bp', f'{gene2}, {len2}bp']\n",
"\n",
"# Log-transform raw and RPKM-normalized counts (+1 avoids log(0))\n",
"log_counts = list(np.log(counts[gene_idxs] + 1))\n",
"log_ncounts = list(np.log(counts_rpkm[gene_idxs] + 1))\n",
"\n",
"ax = class_boxplot(log_counts,\n",
"                   ['raw counts'] * 3,\n",
"                   labels=gene_labels)\n",
"ax.set_xlabel('Genes')\n",
"ax.set_ylabel('log gene expression counts over all samples')\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_14.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": 113,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAVIAAADQCAYAAABV2umIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAGghJREFUeJzt3Xl4FvW99/H3V4SEJWyNEiuUBAOFUEQriyDKUqGIlXIQKpYeAZGq7enjgvq4PTRuKHWpWurx2KNge3kVF0AKBqhHSdVa2wBSkKWAiMUDmEaRkIWA4ff8MZOQIEkmue9JMsnndV257sxklm/i+OU3v9Wcc4iISN2d0tABiIhEnRKpiEiMlEhFRGKkRCoiEqN6TaTmSTIzq8/7ioiE6dR6vl87ID8/P7+ebysiErMqC4B6tRcRiZESqYhIjJRIRURipEQqIhIjJVIRkRgpkYqIxEiJtBHLKyjh4TU7ySsoaehQRKQaSqSN2IKcPdy2YisLcvY0dCgiUo367pAvtTBjYLdKnyLSOFl9zkdqZkn4I5uSkpLq7b4iInGgkU0ijdnevXtJS0ujZ8+eXHXVVZSWltKjRw969erFuHHjOHjwILt37+b8888H4Ne//jW33HILAGeddRZ//vOfy681ZswYnn/++Qb5PZqrWidSM2tjZp3DCEakuTpy5AhdunRhx44dHDt2jFdffZWioiK2b9/Oeeedx/z588uP/fzzz3nqqaeYM2cOAIWFhbz00ksA5OXl8e6773Lw4MEG+T2aqxoTqZltMrO/mNlSM1sCLAB+WcM5ff1z3lPSFamdAQMGsGPHjvLtoUOHsmvXrvLtO++8k1tvvZX27dsD0LVrV95//32ccyxfvpyxY8d+JZFmZmby9NNPA5CSkgLAxo0bycjI4Nxzz+Wf//wnAH369CE9PZ0bb7wRgI8++oi2bdty+PBhdu/eTYsWLTh27BgFBQX06dOHtLQ0Nm7cCEB2djZJSUl06dKF22+/HYARI0YAMHXqVLKzs5k3bx49evRg4sSJHDt27KQxpKenk56ezvDhwyksLCQ7O5sePXowZMgQioqKKpXMp0+fzqpVq8jOzmbKlCnl99y2bRsLFy4sjwOodF5WVha9evVi9OjRHD16NKb/XhCsRPp94DUgD3gHuMs5N62Gc0YBD/rHfzumCEWakS+//JI1a9aQnp5evm/VqlX07t0bgJ07d5KTk8O0ad7/gocPHyYxMZFvf/vbrF+/npUrVzJx4kSCzLB2//3388QTT3DzzTfz2GOPAXDgwAF27NjB+vXr2bRpEwcPHqRt27a88cYbLFu2jNNOO438/HzatWvH1q1bmTt3Ls899xwApaWljB8/nnnz5lW6z7Zt21iyZAkAt9xyC9u3b2fDhg3k5uaeNIaCggJ27txJaWkpu3fvZsSIEezatYuzzz6bP/7xjzH+hT2ZmZmsXbuW008/nbfffjvm69XYau+c2wXcb2ZnAXcC15nZKOfcJ9WcthzI9K//TsxRitQjm7087td0j15W4zF///vf6datG/369WPSpElMmTKF3r17M2DAAB588EFyc3NJTk6msLCQ9evXc95551FYWEhCQgLjxo1j2bJlFBQUkJKSQmFhIS+//DJ33HEHt956K+Alj8cff7z8fps3b2bQoEHs37+/Up2qmTFkyBA2bdpE165dGTduHKtXr2bv3r2MHDmSQ4cO8cUXXzBlyhT27dvHyJEjAa+KoUOHDl/5vR599FGuuOIKAHbt2sX555/PgAEDSElJOWkMeXl5nHHGGXTt2pWMjAxWr17NTTfdRG5uLkOGDCn/W/Xu3Zt9+/aVl0SzsrLo3bt3eckW4NlnnyUrK4snn3yS1NTU8v3bt29n0KBBFBUVMWHChED/DatTYyI1s58CI4GNwCPOuZkBrnsmsB/oASQD1SVdkUYlSNILQ//+/cnKymLAgAEcOXKE5ORktm
3bVumYjh078sgjjzBz5kzWrVtHcXExrVq14qKLLmLGjBlcf/31JCQkUFxczOTJk5k8eTLgJdHMzEyuu+668ld78JLmyZxyyimUlpZSUFBAamoq69ato02bNiQkJFBUVMSzzz7LtGnTSEtLY9GiRYCXJM8888xK18nNzaV79+6kpaUB0LNnT/bu3cv48eM5cODASWNITk7mk08+Ydq0aaxYsYL77ruPxYsXV0r2/fv357333mP69Onl+8aNG8eiRYvKqxMAZs6cydSpU5k1a1Z5nAAdOnT4yt82FkFe7bsDXwAZwANmtszMVtRwzg+A14Ec4LuxhSjSfHTu3JkxY8awdOnSKo8ZNmwYffr04bnnnqO4uJhTTz2VxMREhg0bxsUXX0xiYiLFxcU13isjI4OcnBzWrl1L3759K/1s7dq19OzZs/w63bt3Z9CgQYBXnVBaWkpiYmL58UePHmXp0qXlpdMyW7du5dprry3fLioqIiEhgYSEBD766KMqY2jRogUtW7akpKTkK/eqrdatW3+lHjQ5OZkNGzZw5MiROl+3oiAd8h9wzlWquTazUTWcswT4FXAYiL3c3EzlFZSwIGcPMwZ2I7ldQkOHI/Vk1qxZ3HPPPdUeM2fOHMaPH8/gwYNp2bIlAC+++CIAH3zwAYcPH67xPnfddRdXXnklrVq1YtmyZYD3Wp2amsrAgQMZPHgwL7zwQvn9EhISuOGGGygsLGTWrFlMmjSJgwcPMm3aNB544AFSU1MZOnQo27dvL79H3759GTp0aHnd5s9//nOWLFlCv3796N+//0ljaNeuHV27diUtLY1LLrkEgO985zscOXKkvMEsiA4dOrBw4UJefvnlr9TbPvTQQ0ycOJHExERycnJo27Zt4OueTI0d8s3sNeBZ59wSM0sB/hPY55z7Sa1vpg75tfLwmp3ctmIrv/heH24dmV7zCSIxSklJYf/+/Q0dRmNVZYf8ICXSy4DZZvZ74HTgBufcB/GKTKo2PqML2R9+xviMLg0diohUI0gd6RvAOGAI0An4lZm9GWpUAsAftnxK1tZc/rDl04YORZoJlUbrJkj3p5E1HSPhUIlUJBqCjGz6qZm9Y2Z/NrPL6yMo8ahEKhINQepIr3TODTOzVnhdmhaHHJP4NI2eSDQEabXfjteh3oBzgA2Ac87V1AXqZNdSq72IRFXdW+2dc73iG4uISNMSpI50mJktNrMFZtbLrzNdXR/BSTjjvkUkvoLUkc7DG53UGW8Ckv+L1x1KREQI1o+0PXA9MAUoBLoBd4UZlIhIlAQpkc4EWvvfZ4cXiohINAVpbPpbfQQiIhJVWvxORCRGgRKpmc0xs9Vmtsnfvj/csEREoiNoifQSvJb6f5lZS2B0eCGJSNjyCkp4eM1O8gpKGjqUJiFoIp0HZAFfwxsiel9oEYlI6Bbk7OG2FVtZkLOnoUNpEoK02gOsB35cYbv6caUi0qhpHof4CppI3wNWnbDv6jjHIiL1JLldglZdiKOgibQE2AEcBNaqS5SIyHFB60in460ImgtcYWYLwwpIRCRqgpZIOzvnytaHfcXM/l9YAYmIRE3QEuntZpYAYGatge+FF5KISLQELZHeCyz1Z8n/EngovJBERKIlaCJdBRQAXwf2AW+HFpGISMQEfbVfCowA2gDDgWVhBSQiEjVBS6RfA94E9uDNRzo2tIhERCImaIn034ExwB144+x/WN3BZvYfZpZtZrvN7NoYYxQRadSCJtJhwE7gXeBD4MLqDnbOzXfOjQC2A4tiCVBEpLEL3P0Jb3x94DH2ZpYB7HLOHaxLYCIiURE0kdoJn0GMR41SItIMBE2kD1G7JApeK7+6SYlIkxeo1d4593wdrt3FOVdQh/NERCIltDWbnHPnhnVtEZHGJFCJ1MyexEu6SXhLMyc658aHGZiISFQELZFuBFoBnwEvUUM/UhGR5sScq7lHk5klA5cBg/BKpfc657bX+mZmSUB+fn4+SUlJtT29yeh89y
oOFB+N+3U7tW7J5/dr0JlISKpscK/Nmk2f4ZVMj+L1K9VSI3V0oPgo7tHL4n5dm7087tcUkZoFbbX/BoCZtQDaAglhBiUiEiVBG5tGAZOARH/Xp3jj7kVEmr2gjU2/BH4FfBO4B6+zvYiIEDyRvuic2wp8gJdQc8MLSUQkWoLWkc71P681s9OAb5nZVcCfnHMfhxmgiEhjF7SOdA3HZ34yoC9wS1hBiYhESdAS6ciK22a21Dn323BCEhGJlrqWSE8LLSIRkYgJ2iH/YqC1c67AzNKAT0KMSUQkUoIm0j8CR8ysPd7opm8Al4YWlYiEKq+ghAU5e5gxsBvJ7TS+JlZBE2kL59wlZvaSc+56M3sj1KiauM1549kyLYTrArVYDUaasQU5e7htxVYAbh2Z3sDRRF/QRPpDAOfcD/xtrQwag77JfwhtrL3SqAQxY2C3Sp8Sm6CJdJqZjQWS8Trjl+LVm4pIBCW3S1BJNI6Cjmya4JwbjjcD1CigXXghiUjY8gpKeHjNTvIKSho6lCYhaCK91//8HfAO8Fo44YhIfSirI12Qs6ehQ2kSgr7aZ5nZRcAhYA7wVnghiUjYVEcaX4EnLQFG473SXwSsDC0iEZGICZpIU4HXge3Am8DXzOwiv5QqIhGjV/v4Cvpq/xyV5yB9FRiJ12lRr/l1EMayIJ1at4z7NaVp0qt9fFW5+J2ZnQ4U+cNCk4DLgTOA/cAy59zntb6ZFr+rNZu9PJQ+pyJ6tmqtysXvqnu1HwOUdcD/PV7p9W9AMbA4bqGJiERcdYl0MTDZzG7CK4nuxsvInYEva7qwmV1qZi+bWad4BCoi0lhVWUfqnCs2s0uBS4BFwBDgMPAh8P3qLmpm7YBrnXPj4xiriEijVG2rvXPumHPuNeAV/9h0oB/QpYbrDgMyzOxNM0uJS6QiIo1U0O5PvwOygV/4ny/UcHxnYD7wEnBFHWMTEYmEoN2fDuB1xN8DdPO3q/MvIA34Akisc3QiIhEQtET6b8C7wBH/s9o6UuBtvMR7NV6fUxGRJivo4ndfAoEnc3bOHQa+W9egRESiJGiJtBIzq9N5IiJNUdBVRDfiTej8v0Br/7yJIcYlIhIZtakjzQIK8IaIzg4tIhGRiAmaSM/Ga4U/BmzBa40XERGCJ9JJQBugI14j0u9Ci0hEJGKCttpPNbPWQAfn3P6QYxIRiZRAJVIzexpvNNNqf/s3YQYlIhIlQV/t+wM/Az735yk9O7yQRESiJWgivRGYC5QADwM/CS0iEZGICTrWfpxzblrZhpnNBdaFE5KISLQETaQXm9ka4GOgOzAqvJBERKIlaCK9EpgBfB34FJgaWkQiIhETtI60A94Q0SeBtkDP0CISEYmYoIn0N3iJNBtYitfwJCIiBE+krYFv4Y21HwW0Ci0iEZGICVpHejXQDq9ECrAilGhERCIo6BBRdXUSEalC4AmazaylmaWYWcswA5LK3KOXNXQIIlKDoBM7zwXOw2twOt3M1jvn7gg1MhGRiAhaRzoKuMg5d8TMEvAWtxMREcCcczUfZDYBuB5oAXwJPOOcW1Lrm5klAfn5+fkkJSXV9nQRqUHnu1dxoPhoKNfu1Loln98/NpRrR4RV9YOgJdL1wI8rbNecfUWk3h0oPhpavbrNXh7KdZuCoIn0PWCV/73hJdKrQ4lIRCRigibSj4FMjhdtVSIVEfEFTaTb8BIpqEQqIlJJ0A75M8IOREQkqgJ3yK8tM9tvZtlmlhLWPUREGoOgHfL7ApcCif6uPOfcU9Uc3wJYqZKsiDQHQUukLwCbgPHAn4DpNRzfCehrZv9W99BERKIhaGPTW865lWY2A7gcKK7uYOdcnpldAKz0h5N+HGugIlKzzXnj2TKt5uPqdG1AHXZOLmhj0//xv/0R3lLMWwKcc9TMPgM64nWfEpGQ9U3+Q6gd8pVGTy5oHekdwEi8EU6jgPeBa6s5fhJwE7AL2Bh7mCIijVfg5ZiBi4DNzrkMM3unuoOdc6
8Ar8QanIhIFARtbDoKvAGYmb2JN3GJiIgQvER6r3MuO8xARESiKmgifczMbqy4wzn3VgjxiIhETtBE2gkYQeVJS5RIRUQInkifcc49WLZhZh1CikdEJHKCNjZ994TtpfEOREQkqoKWSHea2UK8jvXdgU9Ci0hEJGKCjmy6xsy6A2cA/3LOfRhuWCJSV2EtCdKptVZir0rQxe/mABcAZzrnvmVm9zvn7q71zbT4nUijYbOXhzactImqcvG7oHWkl+CNbso1s5bA6HhEJSLSFARNpPOALOBrwGLgvtAiEhGJmNoux+w4vmaTiIgQPJHe439+B/gf/3stficiQvBEmgkkAT2dc0qgIiIV1CaRFgO3hReKiEg0BU2k2Xj1oulmlg7gnPttWEGJiERJ0Fb7s4C7gV4hxiIiEklBE+kbwM1ABnAFUBRaRCIiERP01T4V79V+mb/dJpRoREQiKGiJVP1HRUSqEDSR3s7xJFqWVEVEhOCv9l8HpgIHgbXAb0KLSEQkYoJOo9fRzE4FOgJD8CZ2HhFiXCIikREokfpLi3wfr2S6D/j3MIMSEYmSoHWkS/DqRXP8bXXGFxHxBU2kBhwDSv1PNTaJiPiCJtLLgVbAUKAlMKGmE8ysu5mVxBCbiEgkBE2k9wIrnHNz8epIXwpwzo3AjroGJiISFUG7Pz0NPG5mHYF1wKTqDjazzv63eTHEJiISCUFLpPOBFLxW+wuAV2s4/ofA72KIS0QkMoL2Ix1ZcdvMWtRwSg9gGJBhZj92zj1Tx/hERBq9oCXSE91T3Q+dczc756YAW5RERaSpC5RIzeyaittB17R3zo2oQ0wiIpEStLHpp2a2veIO59xbIcQjIhI5QRPpqxwfW182nZ4SqYgIwetInwA+Bo4Cu4EnwwpIRCRqgibSxf7n3/zPpSHEIiL15B+fHmJcn9P5x6eHGjqUJiHoq73G2os0ITcv30LW1lwAXrtmcANHE31BE+nlwES8uUg/JcBYexFpvB67LKPSp8TGnKt5GSYzm0PlUqhzzt1b65uZJQH5+fn5JCUl1fZ0EZGGVOWbeNAS6QS8SUhqvKCISHMTtLGpNZCJN6JpCvBJWAGJSPjyCkp4eM1O8go002U8BB1r36fsezM7G/gv4OKwghKRcM1/5yPueX0HhSVfkjm2d0OHE3lBX+3LOec2oiQqEm12wqfEpNaJVESi7z8uSKNtq1OZMbBbQ4fSJARqtY/bzdRqLyLRVWX5va7T6ImIiE+JVEQkRkqkIiIxUiIVEYlRg7TaHzqkGWdEJFrat2+fBBS4k7TQ13er/RnA3nq7oYhIfLV3zn2lJFjfidTwlnUuqLebiojET8OXSEVEmiI1NjVRZtbSzE5v6DikeTCzJDNr39BxNBQl0qbrAuDOhg5Cmo3ZwKiGDqKhKJHGiZkNNrMtZvYXM/uxv+9Rf/stM+trZiPM7PEK5zxoZjlm9liFfXeY2cIq7rHQzLLNbLq/nWZmS8zsfDO70MzeNrN1ZpYc7m8rYTGzl81st5l94D8fL5pZupmdbWb/ZWapZrbDzFqZ2Tllz4r/nL1pZqvMrK+/b6CZvWNmfzKzkWY23cw2VHwGK9y3k5mt9J/HNH/fBv95O8d/fv9iZu+ZWWf/eX/fzF6p1z9QY+Wc01ccvvCWq34cr0vZJn/fQuAcYBjw32XHVDinrI56g//ZDlgJLKziHr89YXsJ0OmEaz2CNzvXCLx5Y9cDw/2fbQXeBa5r6L+Xvqp9ljKBCf733wLmA88A3wBSgX3Azf6ztfCEZygNWON//xZwRoXrzix7Fqq5953A1Sc+b8DPgPEVnq8ngZHA88BZfswf+M9XF+BcYDPwF2BwQ/9Nw/5SiTT+WgOHT9jX+ST7cM45M+sH/NXfdRXeg1mVU8ws08zamFlrYDiw3MyG+9cyIIPjq72+AvwAuMnfLgEuBK6tw+8lDcA59wHQHjjinPunv/uvwACg7UmO/whobWYtgI7OuX0VfpwIzD
Kz1JPdy8wygeuBVf6uDDOb6X+/HG/dthTgHbxFMBOAA0BZXfzdwHPAlUAH4HW8Z/q22vzOUaREGl+TgH/glUzL/DcwC3joxIPNLBF4kOMP2kDgvaou7pz7EfC/ePVRnYC3gWs4vgzMncAzzrn8Cqd9DJxR4RqlQJGfiCUaUoBeJ+ybT83/IB6tuOGc+zVwL/Cbkx3snMsE5uItcgkwCBhhZhcCZwL7gVZAsn+Nu4DJwGcVLlPpeQN2A11riDPylEjj6xXgCmBshX3XOOcuc86dbHmWn+C9Ph00szZAOvArYLiZfbOKe3yCV8L9HK+EcRivi24KMNA59+oJx/egwtIwfqm1o3OuuPa/ntQ3M7sUyAb+Zmajy/Y7597FS7AnHp8KHPb/wfzCzHqecMg+vNJiVfLwqgxwzh3DG0DTGe/N5nUgB/iuc24z3iv+ZmBHhfMrPW9AdyC3hl8z8jSxc5w55942s9lVvT4BE8ysN/AiXh3TaWb2E+B7zrkL/fMynXP/MLM/AWOccyVm1g7Iwnuduso5d9jMduDVw/4Sr5X+HDPLxiutfAiMxitd3ODf+yy818Lfx/e3lhD9FO9VuRXea/PPKvzsSby3IIBDZvYG4CocMxtYaGYlwBPA94H+wAMAJzxf3wSeBjoCPzKzYXhvUYeAnwNf4P0jfxjvGe6PV7q9069W2g/M8Y+bDPTDW8Z9BM3g1V4d8hsxM1vpnLskjtfb4Jw7J17Xk2iL9/N1wrVH4DWY3VjTsU2BXu0bKTNriVdfJRJ3er7iSyVSEZEYqUQqIhIjJVIRkRgpkYqIxEiJVEQkRkqkIiIxUiKVSPFn0Nrtz0p0XkPHIwJKpBJNC51zI4DJZvZXMxvrJ9hMf5q5hWWfZSeU7WugeKWJUyKVSPInfDkbb16D6Q0bjTR3SqQSVZ3x5rxcijelG3gJdVGFY0arCkDqgxKpRNUXQI5zboRz7of+voXAlArHvI43YcaMeo5NmhklUokk51wR8Hd/GZcrTnLIEWAo8BSwrF6Dk2ZHY+1FRGKkEqmISIyUSEVEYqREKiISIyVSEZEYKZGKiMRIiVREJEZKpCIiMfr/CgeLaPoyybsAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x295f30a0b00>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ax = class_boxplot(log_ncounts,\n",
"                   ['RPKM normalized'] * 3,\n",
"                   labels=gene_labels)\n",
"ax.set_xlabel('Genes')\n",
"ax.set_ylabel('log RPKM gene expression counts over all samples')\n",
"plt.tight_layout()\n",
"plt.savefig('pics/1_15.png', dpi=600)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}