Skip to content

Instantly share code, notes, and snippets.

@Gibbsdavidl
Last active December 6, 2018 00:05
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save Gibbsdavidl/dc257e66867a5f3bb8a6c6f351a633c9 to your computer and use it in GitHub Desktop.
Save Gibbsdavidl/dc257e66867a5f3bb8a6c6f351a633c9 to your computer and use it in GitHub Desktop.
ISB-CGC-Query_of_the_Month-November-2018.ipynb
Display the source blob
Display the rendered blob
Raw
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "ISB-CGC-Query_of_the_Month-November-2018.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/Gibbsdavidl/dc257e66867a5f3bb8a6c6f351a633c9/isb-cgc-query_of_the_month-november-2018.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"metadata": {
"id": "V6hHrkja94oQ",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"# **ISB-CGC Query of the Month, November 2018**\n",
"- Kawther Abdilleh, David L Gibbs\n",
"\n",
"For more Queries: https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/QueryOfTheMonthClub.html\n",
"\n",
"Part of the ISB-CGC: http://www.isb-cgc.org \n",
" "
]
},
{
"metadata": {
"id": "n97tZmtc93Vf",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### First we need to get authenticated. If you don't have a Google Cloud project, see https://cloud.google.com/dataproc/docs/guides/setup-project. There's $300 free credit available for new accounts!\n"
]
},
{
"metadata": {
"id": "FyrcP8Zs-VkB",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"from google.colab import auth\n",
"auth.authenticate_user()\n",
"print('Authenticated')"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "N9GQYflzapa9",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### ** Let's begin by querying our newly transformed vcf table, containing variants for the human chromosome 21. **\n",
"\n",
"For instructions on the BigQuery transformation see: https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/QueryOfTheMonthClub.html\n",
"\n",
"This first query demonstrates the 'magic' %%bigquery command. "
]
},
{
"metadata": {
"id": "FhNcRp87-uMe",
"colab_type": "code",
"outputId": "efbed783-069f-4329-9c6b-d3ff16bfe05b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 359
}
},
"cell_type": "code",
"source": [
"%%bigquery --project YOUR-PROJECT-ID df\n",
"select \n",
" reference_name,\n",
" start_position,\n",
" end_position\n",
"from\n",
" `isb-cgc.QotM.1000genomes`\n",
"limit \n",
" 10"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>reference_name</th>\n",
" <th>start_position</th>\n",
" <th>end_position</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>21</td>\n",
" <td>44061667</td>\n",
" <td>44105124</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>21</td>\n",
" <td>44061693</td>\n",
" <td>44104791</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>21</td>\n",
" <td>44075242</td>\n",
" <td>44076807</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>21</td>\n",
" <td>44063828</td>\n",
" <td>44063829</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>21</td>\n",
" <td>44081693</td>\n",
" <td>44081694</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>21</td>\n",
" <td>44076133</td>\n",
" <td>44076134</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>21</td>\n",
" <td>44095212</td>\n",
" <td>44095213</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>21</td>\n",
" <td>44059175</td>\n",
" <td>44059176</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>21</td>\n",
" <td>44097081</td>\n",
" <td>44097082</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>21</td>\n",
" <td>44080981</td>\n",
" <td>44080982</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" reference_name start_position end_position\n",
"0 21 44061667 44105124\n",
"1 21 44061693 44104791\n",
"2 21 44075242 44076807\n",
"3 21 44063828 44063829\n",
"4 21 44081693 44081694\n",
"5 21 44076133 44076134\n",
"6 21 44095212 44095213\n",
"7 21 44059175 44059176\n",
"8 21 44097081 44097082\n",
"9 21 44080981 44080982"
]
},
"metadata": {
"tags": []
},
"execution_count": 3
}
]
},
{
"metadata": {
"id": "mIESR-bL_nsf",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **Colaboratory notebooks allow us to leverage the power of Python and use pandas to create & save queries to datatables. **"
]
},
{
"metadata": {
"id": "rF1p92ah_m6V",
"colab_type": "code",
"outputId": "544b0ecd-cf68-4c5d-e41d-338d1224169b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"cell_type": "code",
"source": [
"\n",
"df = pd.io.gbq.read_gbq('''\n",
"select \n",
" reference_name as chr,\n",
" start_position,\n",
" end_position\n",
"from\n",
" `isb-cgc.QotM.1000genomes`\n",
"limit \n",
" 10\n",
"''', project_id=project_id, verbose=False, dialect='standard')\n",
"\n",
"df.head()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>chr</th>\n",
" <th>start_position</th>\n",
" <th>end_position</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>21</td>\n",
" <td>34466727</td>\n",
" <td>34466728</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>21</td>\n",
" <td>34472050</td>\n",
" <td>34472051</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>21</td>\n",
" <td>34434667</td>\n",
" <td>34434668</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>21</td>\n",
" <td>34425328</td>\n",
" <td>34425329</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>21</td>\n",
" <td>34455811</td>\n",
" <td>34455812</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" chr start_position end_position\n",
"0 21 34466727 34466728\n",
"1 21 34472050 34472051\n",
"2 21 34434667 34434668\n",
"3 21 34425328 34425329\n",
"4 21 34455811 34455812"
]
},
"metadata": {
"tags": []
},
"execution_count": 5
}
]
},
{
"metadata": {
"id": "b7s98d_UBQvx",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **We can filter variants so that we have variants with only a single alternate base. The alternate base is a RECORD type, which means it could have a list of variants. Here we use the ARRAY_LENGTH keyword**\n"
]
},
{
"metadata": {
"id": "ji9ODMVIBUIw",
"colab_type": "code",
"outputId": "46c9ff1f-500d-465e-982c-a803161f32e9",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"cell_type": "code",
"source": [
"\n",
"df = pd.io.gbq.read_gbq('''\n",
"#standardsql\n",
"SELECT\n",
" start_position,\n",
" reference_name,\n",
" reference_bases AS original,\n",
" alternate_bases[ORDINAL(1)].alt AS alt\n",
"FROM\n",
" `isb-cgc.QotM.1000genomes` AS v\n",
"WHERE\n",
" ARRAY_LENGTH(alternate_bases) = 1\n",
"LIMIT 10\n",
"'''\n",
", project_id=project_id, verbose=False, dialect='standard') \n",
" \n",
"df.head()\n",
" "
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>start_position</th>\n",
" <th>reference_name</th>\n",
" <th>original</th>\n",
" <th>alt</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>34452189</td>\n",
" <td>21</td>\n",
" <td>A</td>\n",
" <td>AT</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>34476092</td>\n",
" <td>21</td>\n",
" <td>A</td>\n",
" <td>AG</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>34484539</td>\n",
" <td>21</td>\n",
" <td>C</td>\n",
" <td>CTT</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>34483337</td>\n",
" <td>21</td>\n",
" <td>C</td>\n",
" <td>CG</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>34425328</td>\n",
" <td>21</td>\n",
" <td>G</td>\n",
" <td>GA</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" start_position reference_name original alt\n",
"0 34452189 21 A AT\n",
"1 34476092 21 A AG\n",
"2 34484539 21 C CTT\n",
"3 34483337 21 C CG\n",
"4 34425328 21 G GA"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"metadata": {
"id": "E_V-KnXaDRun",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **Then, further narrow it down to SNPs (a single nucleotide change).**"
]
},
{
"metadata": {
"id": "9fwuS1EhDLRw",
"colab_type": "code",
"outputId": "ecc7d143-c363-4e1d-ebde-c34cbb29089c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"cell_type": "code",
"source": [
"df = pd.io.gbq.read_gbq('''\n",
"SELECT\n",
" start_position,\n",
" reference_name,\n",
" reference_bases AS original,\n",
" alternate_bases[ORDINAL(1)].alt AS changed\n",
"FROM\n",
" `isb-cgc.QotM.1000genomes` AS v\n",
"WHERE\n",
" ARRAY_LENGTH(alternate_bases) = 1\n",
" AND alternate_bases[ORDINAL(1)].alt IN ('A','C','G','T')\n",
"ORDER BY start_position \n",
"LIMIT 10\n",
"''', project_id=project_id, verbose=False, dialect='standard') \n",
" \n",
"df.head()\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>start_position</th>\n",
" <th>reference_name</th>\n",
" <th>original</th>\n",
" <th>changed</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>9411238</td>\n",
" <td>21</td>\n",
" <td>G</td>\n",
" <td>A</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>9411244</td>\n",
" <td>21</td>\n",
" <td>C</td>\n",
" <td>A</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>9411263</td>\n",
" <td>21</td>\n",
" <td>A</td>\n",
" <td>C</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>9411266</td>\n",
" <td>21</td>\n",
" <td>G</td>\n",
" <td>T</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>9411301</td>\n",
" <td>21</td>\n",
" <td>G</td>\n",
" <td>T</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" start_position reference_name original changed\n",
"0 9411238 21 G A\n",
"1 9411244 21 C A\n",
"2 9411263 21 A C\n",
"3 9411266 21 G T\n",
"4 9411301 21 G T"
]
},
"metadata": {
"tags": []
},
"execution_count": 7
}
]
},
{
"metadata": {
"id": "OBSIw8VVDiXZ",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **What are the five most frequent mutations on chromosome 21? **"
]
},
{
"metadata": {
"id": "xjTZi8kGuK4L",
"colab_type": "code",
"outputId": "cc88334e-c592-40c1-db24-de0bc59d2c91",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"cell_type": "code",
"source": [
"df = pd.io.gbq.read_gbq('''\n",
"#standardsql\n",
"WITH\n",
"table1 AS (\n",
"SELECT\n",
" start_position,\n",
" reference_name,\n",
" CONCAT( reference_bases, '->', alternate_bases[ORDINAL(1)].alt) AS mutation\n",
"FROM\n",
" `isb-cgc.QotM.1000genomes` AS v\n",
"WHERE\n",
" ARRAY_LENGTH(alternate_bases) = 1\n",
" AND alternate_bases[ORDINAL(1)].alt IN ('A','C','G','T')\n",
")\n",
"\n",
"SELECT\n",
" mutation,\n",
" COUNT(mutation) AS num_mutations\n",
"FROM\n",
" table1\n",
"GROUP BY mutation\n",
"ORDER BY num_mutations DESC\n",
"LIMIT 5\n",
"\n",
"''', project_id=project_id, verbose=False, dialect='standard') \n",
" \n",
"df.head()\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>mutation</th>\n",
" <th>num_mutations</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>G-&gt;A</td>\n",
" <td>213081</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>C-&gt;T</td>\n",
" <td>212145</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>T-&gt;C</td>\n",
" <td>147252</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>A-&gt;G</td>\n",
" <td>141916</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>C-&gt;A</td>\n",
" <td>48410</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" mutation num_mutations\n",
"0 G->A 213081\n",
"1 C->T 212145\n",
"2 T->C 147252\n",
"3 A->G 141916\n",
"4 C->A 48410"
]
},
"metadata": {
"tags": []
},
"execution_count": 8
}
]
},
{
"metadata": {
"id": "89AdnC0YEDCo",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **Now, let's jump into a specific example. The gene DYRK1A on chromosome 21 has been shown to contribute to the development of leukemia. Researchers are studying it as a potential theraputic target. DYRK1A resides on chromosome 21 from position 37365790 to 37517450. Let's explore variants in this gene. **\n",
"\n",
"\n"
]
},
{
"metadata": {
"id": "7MEb7mADg-kI",
"colab_type": "code",
"outputId": "62811d69-4de0-41d1-d778-98d97adc7d08",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 80
}
},
"cell_type": "code",
"source": [
"df = pd.io.gbq.read_gbq('''\n",
"#standardsql\n",
"SELECT\n",
" COUNT(reference_name) AS num_variants\n",
"FROM\n",
" `isb-cgc.QotM.1000genomes` AS v\n",
"WHERE\n",
" reference_name = '21'\n",
" AND start_position BETWEEN 37365790\n",
" AND 37517450\n",
" ''', project_id=project_id, verbose=False, dialect='standard')\n",
"df.head()\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>num_variants</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>4485</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" num_variants\n",
"0 4485"
]
},
"metadata": {
"tags": []
},
"execution_count": 9
}
]
},
{
"metadata": {
"id": "u7BlVrstc5Zi",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **And how many samples do we have?**"
]
},
{
"metadata": {
"id": "TTkWuwqec7jF",
"colab_type": "code",
"outputId": "49dc532b-92fd-4014-8617-2372aca2112c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 80
}
},
"cell_type": "code",
"source": [
"df = pd.io.gbq.read_gbq('''\n",
" SELECT\n",
" COUNT(DISTINCT(call.name)) as num_samples\n",
" FROM\n",
" `isb-cgc.QotM.1000genomes`\n",
" JOIN\n",
" UNNEST(call) AS call\n",
" WHERE\n",
" reference_name = '21'\n",
" AND (start_position BETWEEN 37365790 AND 37517450)\n",
" \n",
" ''', project_id=project_id, verbose=False, dialect='standard')\n",
"df.head()\n",
"\n",
" "
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>num_samples</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2535</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" num_samples\n",
"0 2535"
]
},
"metadata": {
"tags": []
},
"execution_count": 10
}
]
},
{
"metadata": {
"id": "CiGEgi07j2x-",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### We have 2,535 samples and the total number of genomic positions mapping to the DYRK1A gene is 4,485. How many variants are there in this gene per sample? \n",
"\n",
"To answer this question, we'll need to start working with records. To do that, we'll build up to answering this question through a series of queries.\n",
"\n",
"Our first query will flatten the call record into three columns, the call.name, and both genotype calls (one for each chromosome), where a zero is the reference call, and an alternate call otherwise. You can see most of the listed entries are actually just homozygous refs. "
]
},
{
"metadata": {
"id": "wnvYVLlaZXR-",
"colab_type": "code",
"outputId": "1db02674-ebb9-45b6-a4ea-c7651b5e9c29",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"cell_type": "code",
"source": [
"df8 = pd.io.gbq.read_gbq('''\n",
" SELECT\n",
" call.name,\n",
" call.genotype[OFFSET(0)] g1,\n",
" call.genotype[OFFSET(1)] g2\n",
" FROM\n",
" `isb-cgc.QotM.1000genomes`\n",
" JOIN\n",
" UNNEST(call) AS call WITH OFFSET AS ci\n",
" WHERE\n",
" reference_name = '21'\n",
" AND start_position BETWEEN 37365790\n",
" AND 37517450\n",
" \n",
" LIMIT 10''', project_id=project_id, verbose=False, dialect='standard')\n",
"\n",
"df8.head()\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>name</th>\n",
" <th>g1</th>\n",
" <th>g2</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>HG00096</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>HG00097</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>HG00099</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>HG00100</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>HG00101</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" name g1 g2\n",
"0 HG00096 0 0\n",
"1 HG00097 0 0\n",
"2 HG00099 0 0\n",
"3 HG00100 0 0\n",
"4 HG00101 0 0"
]
},
"metadata": {
"tags": []
},
"execution_count": 17
}
]
},
{
"metadata": {
"id": "ThlsxJacZpbb",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**In the above result, we have 11369475 rows, which (as hoped for) is the product of 2535 * 4485!**\n",
"\n",
"What we need now, is to match up the sample IDs, and genomic positions.\n",
"\n",
"Here's an example of what we're looking for:\n",
"\n",
"\n",
"\n",
"```\n",
" sample chr pos mut g1 g2 \n",
"HG00096 21 37424669 C->A 0 1 \n",
"```\n",
"\n",
"See: http://www.internationalgenome.org/data-portal/sample/HG00096\n",
"\n",
"\n",
"Let's write a query to find it."
]
},
{
"metadata": {
"id": "MqJveC6PawQv",
"colab_type": "code",
"outputId": "6216a3ef-db41-4a60-b916-83b69e4ed633",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 97
}
},
"cell_type": "code",
"source": [
"df9 = pd.io.gbq.read_gbq('''\n",
"WITH\n",
"t1 as (\n",
"SELECT\n",
" reference_name,\n",
" start_position,\n",
" call.name,\n",
" call.genotype[OFFSET(0)] g1,\n",
" call.genotype[OFFSET(1)] g2\n",
" FROM\n",
" `isb-cgc.QotM.1000genomes`\n",
" JOIN\n",
" UNNEST(call) AS call WITH OFFSET AS ci\n",
" WHERE\n",
" reference_name = '21'\n",
" AND start_position BETWEEN 37365790\n",
" AND 37517450\n",
")\n",
"\n",
"select * from t1 where name = 'HG00096' and start_position = 37424669\n",
"\n",
"''', project_id=project_id, verbose=False, dialect='standard')\n",
"\n",
"df9.head()\n",
"\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"WARNING:google.auth._default:No project ID could be determined. Consider running `gcloud config set project` or setting the GOOGLE_CLOUD_PROJECT environment variable\n"
],
"name": "stderr"
},
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>reference_name</th>\n",
" <th>start_position</th>\n",
" <th>name</th>\n",
" <th>g1</th>\n",
" <th>g2</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>21</td>\n",
" <td>37424669</td>\n",
" <td>HG00096</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" reference_name start_position name g1 g2\n",
"0 21 37424669 HG00096 0 1"
]
},
"metadata": {
"tags": []
},
"execution_count": 18
}
]
},
{
"metadata": {
"id": "VHhn2PzcZppO",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"###**OK we should ready to answer the question: How many variants per sample in this gene?** \n",
"\n"
]
},
{
"metadata": {
"id": "i67HZWvBj1wm",
"colab_type": "code",
"outputId": "82a38bae-af4b-42dd-8433-e930eb32dd6d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"cell_type": "code",
"source": [
"df10 = pd.io.gbq.read_gbq('''\n",
"WITH\n",
"t1 AS (\n",
"SELECT\n",
" reference_name,\n",
" start_position,\n",
" call.name as sample,\n",
" call.genotype[OFFSET(0)] g1,\n",
" call.genotype[OFFSET(1)] g2\n",
" FROM\n",
" `isb-cgc.QotM.1000genomes`\n",
" JOIN\n",
" UNNEST(call) AS call WITH OFFSET AS ci\n",
" WHERE\n",
" reference_name = '21'\n",
" AND (start_position BETWEEN 37365790 AND 37517450)\n",
"),\n",
"\n",
"t2 AS (\n",
" SELECT\n",
" sample,\n",
" COUNT(sample) as N\n",
" FROM\n",
" t1\n",
" WHERE\n",
" g1 = 1 OR g2 = 1\n",
" GROUP BY\n",
" sample\n",
" )\n",
"\n",
"select * from t2 GROUP BY N, sample ORDER BY N\n",
"\n",
"''', project_id=project_id, verbose=False, dialect='standard')\n",
"\n",
"df10.head()\n",
"\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>sample</th>\n",
" <th>N</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>HG03836</td>\n",
" <td>42</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>NA19669</td>\n",
" <td>75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>HG00717</td>\n",
" <td>80</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>HG02655</td>\n",
" <td>85</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>HG02820</td>\n",
" <td>89</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" sample N\n",
"0 HG03836 42\n",
"1 NA19669 75\n",
"2 HG00717 80\n",
"3 HG02655 85\n",
"4 HG02820 89"
]
},
"metadata": {
"tags": []
},
"execution_count": 19
}
]
},
{
"metadata": {
"id": "awWUoYVQZpxQ",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**In the above result, we have the expected 2,535 number of rows (for each sample). The number of variants per sample ranges from 42 to 401. Let's make a histogram.**"
]
},
{
"metadata": {
"id": "A4P0KXrZmi6M",
"colab_type": "code",
"outputId": "7376a2cf-dd07-4f44-856c-921a5cd62f27",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 396
}
},
"cell_type": "code",
"source": [
"df10.hist()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f0588c982b0>]],\n",
" dtype=object)"
]
},
"metadata": {
"tags": []
},
"execution_count": 20
},
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAeQAAAFZCAYAAABJ+lxSAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4yLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvNQv5yAAAG8tJREFUeJzt3WtQlOfdx/HfyrKzRTfFJbtOsDk2\naphIIIx2qg1J8FSwfRpihKFUnVbixPFQm2KUEmvNZJp4nkRDazxgmKRW6trxodM0MIm1k+kQ0mQ7\nFNPMWH3RMZbArkFRYE1C9nnRdh+N6KLZw8XN9/NKbpa9r38u8Ju9F3dt4XA4LAAAkFQjkr0AAABA\nkAEAMAJBBgDAAAQZAAADEGQAAAxAkAEAMABBBoaJCRMmqLq6+pJjLS0tmj9/fpJWBOBiBBkYRv7y\nl7/o73//e7KXAWAABBkYRn784x/rmWeeSfYyAAyAIAPDSFFRkcLhsF577bVkLwXA5xBkYJiprq7W\n5s2bdeHChWQvBcBFCDIwzNx9992aPHmy9u7dm+ylALiIPdkLAJB4jz/+uObMmaOvfOUryV4KgP/g\nETIwDHm9Xn3ve9/T9u3bk70UAP9BkIFhauHChfrkk0+SvQwA/2Hj/ZABAEg+HiEDAGAAggwAgAEI\nMgAABiDIAAAYgCADAGCApL4wSCBwLuHnHD06TV1dvQk/byJZfUarzydZf0arzycxoxXEYz6Px3XF\nzw27R8h2e0qylxB3Vp/R6vNJ1p/R6vNJzGgFiZ5v2AUZAAATEWQAAAxAkAEAMABBBgDAAAQZAAAD\nEGQAAAxAkAEAMABBBgDAAAQZAAADRH3pzAMHDqihoSHy8dGjR/XrX/9a69atkyRNmDBBTz31lCRp\n9+7deu2112Sz2bRs2TI98MAD8Vk1AAAWEzXIJSUlKikpkSS9/fbb+sMf/qCf//znqq6u1j333KPK\nykr96U9/0h133KFXX31V+/fv1/nz51VeXq777rtPKSnWfmk1AABi4ZouWdfU1GjRokU6deqU7rnn\nHklSQUGBmpub1dLSovz8fDkcDrndbo0dO1bHjx+Py6IBALCaQb/b09/+9jfddNNNSklJ0Q033BA5\nnpGRoUAgoPT0dLnd7shxt9utQCCgCRMmXPE+R49OS8qLk1/t3TaswuozWn0+yfozRpvvfyr/N0Er\nuT6/2/JQ1NtYfQ8l68+YyPkGHWSfz6eHH374suPhcHjA21/p+MWS8bZdHo8rKW/7mEhWn9Hq80nW\nn9EK80VbvxVmjMbqM8Zjvpi8/WJLS4vuvfdeud1unTlzJnK8o6NDXq9XXq9XwWDwsuMAACC6QQW5\no6NDI0eOlMPhUGpqqu644w698847kqSmpibl5+fr61//uo4cOaKPP/5YHR0d6uzs1J133hnXxQMA\nYBWDumQdCAQueX64urpaa9eu1WeffaacnBxNnTpVklRaWqp58+bJZrNp3bp1GjGCf+YMAMBgDCrI\nEydO1O7duyMf33nnndq3b99lt5s/f77mz58fu9UBADBM8BAWAAADEGQAAAxAkAEAMABBBgDAAIN+\nYRAAwP9buP5wspcQVW3VtGQvAdeAR8gAABiAIAMAYACCDACAAQgyAAAGIMgAABiAIAMAYACCDACA\nAQgyAAAGIMgAABiAIAMAYACCDACAAQgyAAAGIMgAABiAIAMAYACCDACAAQgyAAAGIMgAABiAIAMA\nYACCDACAAQgyAAAGIMgAABiAIAMAYACCDACAAQgyAAAGIMgAABiAIAMAYAD7YG7U0NCg3bt3y263\n64c//KEmTJigVatWqb+/Xx6PR5s2bZLD4VBDQ4Pq6uo0YsQIlZaWqqSkJN7rBwDAEqIGuaurSzU1\nNTp48KB6e3u1fft2NTY2qry8XEVFRdq6dat8Pp+Ki4tVU1Mjn8+n1NRUzZ07VzNnzlR6enoi5gAA\nYEiLesm6ublZU6ZM0ahRo+T1evX000+rpaVF06dPlyQVFBSoublZra2tys7OlsvlktPpVF5envx+\nf9wHAADACqI+Qv7ggw8UCoW0ePFidXd3a/ny5err65PD4ZAkZWRkKBAIKBgMyu12R77O7XYrEAjE\nb+UAAFjIoJ5DPnPmjF544QX961//0oIFCxQOhyOfu/jPF7vS8YuNHp0muz1lkEuNHY/HlfBzJprV\nZ7T6fJL1Z7T6fCZIxH9jq+9jIueLGuSMjAzde++9stvtuuWWWzRy5EilpKQoFArJ6XSqo6NDXq9X\nXq9XwWAw8nWdnZ3Kzc296n13dfV+8QmukcfjUiBwLuHnTSSrz2j1+STrz2j1+UwR7//GVt/HeMx3\ntcBHfQ75vvvu01tvvaXPPvtMXV1d6u3t1dSpU9XY2ChJampqUn5+vnJyctTW1qbu7m719PTI7/dr\n0qRJsZsCAAALi/oIecyYMfrmN7+p0tJSSdKaNWuUnZ2t1atXq76+XpmZmSouLlZqaqoqKytVUVEh\nm82mpUuXyuWy9qUMAABiZVDPIZeVlamsrOySY3v37r3sdoWFhSosLIzNygAAGEZ4pS4AAAxAkAEA\nMABBBgDAAAQZAAADEGQAAAxAkAEAMABBBgDAAAQZAAADEGQAAAxAkAEAMABBBgDAAAQZAAADEGQA\nAAxAkAEAMABBBgDAAAQZAAADEGQAAAxAkAEAMABBBgDAAAQZAAADEGQAAAxAkAEAMABBBgDAAAQZ\nAAADEGQAAAxAkAEAMABBBgDAAAQZAAADEGQAAAxAkAEAMABBBgDAAAQZAAAD2KPdoKWlRStWrNC4\nceMkSePHj9ejjz6qVatWqb+/Xx6PR5s2bZLD4VBDQ4Pq6uo0YsQIlZaWqqSkJO4DAABgBVGDLElf\n+9rXtG3btsjHP/nJT1ReXq6ioiJt3bpVPp9PxcXFqqmpkc/nU2pqqubOnauZM2cqPT09bosHAMAq\nruuSdUtLi6ZPny5JKigoUHNzs1pbW5WdnS2XyyWn06m8vDz5/f6YLhYAAKsa1CPk48ePa/HixTp7\n9qyWLVumvr4+ORwOSVJGRoYCgYCCwaDcbnfka9xutwKBQHxWDQCAxUQN8m233aZly5apqKhIJ0+e\n1IIFC9Tf3x/5fDgcHvDrrnT8YqNHp8luT7mG5caGx+NK+DkTzeozWn0+yfozWn0+EyTiv7HV9zGR\n80UN8pgxYzR79mxJ0i233KIbb7xRbW1tCoVCcjqd6ujokNfrldfrVTAYjHxdZ2encnNzr3rfXV29\nX3D5187jcSkQOJfw8yaS1We0+nyS9We0+nymiPd/Y6vvYzzmu1rgoz6H3NDQoD179kiSAoGATp8+\nrTlz5qixsVGS1NTUpPz8fOXk5KitrU3d3d3q6emR3+/XpEmTYjQCAADWFvUR8rRp07Ry5Uq98cYb\n+uSTT7Ru3TplZWVp9erVqq+vV2ZmpoqLi5WamqrKykpVVFTIZrNp6dKlcrmsfSkDAIBYiRrkUaNG\naceOHZcd37t372XHCgsLVVhYGJuVAQAwjPBKXQAAGIAgAwBgAIIMAIABCDIAAAYgyAAAGIAgAwBg\nAIIMAIABCDIAAAYgyAAAGGBQb78IwFoWrj+c7CUA+BweIQMAYACCDACAAQgyAAAGIMgAABiAIAMA\nYACCDACAAQgyAAAGIMgAABiAIAMAYACCDACAAQgyAAAGIMgAABiAIAMAYACCDACAAQgyAAAGIMgA\nABiAIAMAYACCDACAAQgyAAAGIMgAABiAIAMAYACCDACAAQYV5FAopBkzZui3v/2t2tvbNX/+fJWX\nl2vFihX6+OOPJUkNDQ165JFHVFJSogMHDsR10QAAWM2ggvzLX/5SX/7ylyVJ27ZtU3l5ufbt26db\nb71VPp9Pvb29qqmp0UsvvaSXX35ZdXV1OnPmTFwXDgCAlUQN8okTJ3T8+HE9+OCDkqSWlhZNnz5d\nklRQUKDm5ma1trYqOztbLpdLTqdTeXl58vv9cV04AABWYo92gw0bNuinP/2pDh06JEnq6+uTw+GQ\nJGVkZCgQCCgYDMrtdke+xu12KxAIRD356NFpsttTrnft183jcSX8nIlm9RmtPp80PGZEfCXie8jq\n36eJnO+qQT506JByc3N18803D/j5cDh8Tcc/r6urd1C3iyWPx6VA4FzCz5tIVp/R6vNJw2NGxF+8\nv4es/n0aj/muFvirBvnIkSM6efKkjhw5og8//FAOh0NpaWkKhUJyOp3q6OiQ1+uV1+tVMBiMfF1n\nZ6dyc3NjNwEAABZ31SA/99xzkT9v375dY8eO1V//+lc1NjbqoYceUlNTk/Lz85WTk6M1a9aou7tb\nKSkp8vv9qq6ujvviAQCwiqjPIX/e8uXLtXr1atXX1yszM1PFxcVKTU1VZWWlKioqZLPZtHTpUrlc\n1n5eAQCAWBp0kJcvXx758969ey/7fGFhoQoLC2OzKgAAhhleqQsAAAMQZAAADECQAQAwAEEGAMAA\nBBkAAAMQZAAADECQAQAwAEEGAMAABBkAAAMQZAAADECQAQAwAEEGAMAABBkAAAMQZAAADECQAQAw\nAEEGAMAABBkAAAMQZAAADECQAQAwAEEGAMAABBkAAAMQZAAADGBP9gIAAPGxcP3hZC8hqtqqacle\ngjF4hAwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCADAGCAqC8M0tfXp6qqKp0+\nfVoXLlzQkiVLdNddd2nVqlXq7++Xx+PRpk2b5HA41NDQoLq6Oo0YMUKlpaUqKSlJxAwAAAx5UYP8\nxz/+URMnTtSiRYt06tQpLVy4UHl5eSovL1dRUZG2bt0qn8+n4uJi1dTUyOfzKTU1VXPnztXMmTOV\nnp6eiDkAABjSol6ynj17thYtWiRJam9v15gxY9TS0qLp06dLkgoKCtTc3KzW1lZlZ2fL5XLJ6XQq\nLy9Pfr8/vqsHAMAiBv1a1mVlZfrwww+1Y8cO/eAHP5DD4ZAkZWRkKBAIKBgMyu12R27vdrsVCARi\nv2IAACxo0EHev3+/3n//fT3xxBMKh8OR4xf/+WJXOn6x0aPTZLenDHYJMePxuBJ+zkSz+oxWn08a\nHjMCpn+fJ3J9UYN89OhRZWRk6KabblJWVpb6+/s1cuRIhUIhOZ1OdXR0yOv1yuv1KhgMRr6us7NT\nubm5V73vrq7eLz7BNfJ4XAoEziX8vIlk9RmtPp80PGYEJBn9fR6Pn8OrBT7qc8jvvPOOamtrJUnB\nYFC9vb2aOnWqGhsbJUlNTU3Kz89XTk6O2tra1N3drZ6eHvn9fk2aNClGIwAAYG1RHyGXlZXpySef\nVHl5uUKhkNauXauJEydq9erVqq+vV2ZmpoqLi5WamqrKykpVVFTIZrNp6dKlcrnMvhQBAIApogbZ\n6XRqy5Ytlx3fu3fvZccKCwtVWFgYm5UBADCM8EpdAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCAD\nAGAAggwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEIMgAABiDI\nAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEI\nMgAABiDIAAAYgCADAGAAggwAgAEIMgAABrAP5kYbN27Uu+++q08//VSPPfaYsrOztWrVKvX398vj\n8WjTpk1yOBxqaGhQXV2dRowYodLSUpWUlMR7/QAAWELUIL/11lv6xz/+ofr6enV1denhhx/WlClT\nVF5erqKiIm3dulU+n0/FxcWqqamRz+dTamqq5s6dq5kzZyo9PT0RcwAAMKRFvWQ9efJkPf/885Kk\nG264QX19fWppadH06dMlSQUFBWpublZra6uys7PlcrnkdDqVl5cnv98f39UDAGARUYOckpKitLQ0\nSZLP59P999+vvr4+ORwOSVJGRoYCgYCCwaDcbnfk69xutwKBQJyWDQCAtQzqOWRJev311+Xz+VRb\nW6tZs2ZFjofD4QFvf6XjFxs9Ok12e8pglxAzHo8r4edMNKvPaPX5pOExI2D693ki1zeoIL/55pva\nsWOHdu/eLZfLpbS0NIVCITmdTnV0dMjr9crr9SoYDEa+prOzU7m5uVe9366u3i+2+uvg8bgUCJxL\n+HkTyeozWn0+aXjMCEgy+vs8Hj+HVwt81EvW586d08aNG/Xiiy9GfkFr6tSpamxslCQ1NTUpPz9f\nOTk5amtrU3d3t3p6euT3+zVp0qQYjQAAgLVFfYT86quvqqurSz/60Y8ix9avX681a9aovr5emZmZ\nKi4uVmpqqiorK1VRUSGbzaalS5fK5TL7UgQAAKawhQfzZG+cJONSxXC4FGj1Ga0+nxT/GReuPxy3\n+wauRW3VtGQv4YqMu2QNAADijyADAGAAggwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEIMgAABiDI\nAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEI\nMgAABiDIAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCADAGAAggwAgAEIMgAABiDIAAAYgCADAGAA\nggwAgAEIMgAABhhUkI8dO6YZM2bolVdekSS1t7dr/vz5Ki8v14oVK/Txxx9LkhoaGvTII4+opKRE\nBw4ciN+qAQCwmKhB7u3t1dNPP60pU6ZEjm3btk3l5eXat2+fbr31Vvl8PvX29qqmpkYvvfSSXn75\nZdXV1enMmTNxXTwAAFZhj3YDh8OhXbt2adeuXZFjLS0teuqppyRJBQUFqq2t1e23367s7Gy5XC5J\nUl5envx+v6ZNmxanpQPmWrj+cLKXAGCIiRpku90uu/3Sm/X19cnhcEiSMjIyFAgEFAwG5Xa7I7dx\nu90KBAJXve/Ro9Nkt6dcz7q/EI/HlfBzJprVZ7T6fMBwYfrPciLXFzXI0YTD4Ws6frGurt4vevpr\n5vG4FAicS/h5E8nqM1p9PmA4MflnOR5/11wt8Nf1W9ZpaWkKhUKSpI6ODnm9Xnm9XgWDwchtOjs7\n5fV6r+fuAQAYdq4ryFOnTlVjY6MkqampSfn5+crJyVFbW5u6u7vV09Mjv9+vSZMmxXSxAABYVdRL\n1kePHtWGDRt06tQp2e12NTY2avPmzaqqqlJ9fb0yMzNVXFys1NRUVVZWqqKiQjabTUuXLo38ghcA\nALg6W3gwT/bGSTKeO0j2849D4bdva6vM/s34ZO/hYAyFfQZMYPLfN0PiOWQAABBbBBkAAAMQZAAA\nDECQAQAwAEEGAMAABBkAAAMQZAAADPCFX8sa1mP6v6H93ZaHkr0EAIg5HiEDAGAAggwAgAEIMgAA\nBiDIAAAYgCADAGAAggwAgAF4+8UEM/2fFAEA/l+s3x6St18EAMBwBBkAAAMQZAAADECQAQAwAEEG\nAMAABBkAAAMQZAAADECQAQAwAEEGAMAABBkAAAMQZAAADECQAQAwAEEGAMAABBkAAAMQZAAADECQ\nAQAwgD3Wd/jMM8+otbVVNptN1dXVuueee2J9CgAALCemQX777bf1z3/+U/X19Tpx4oSqq6tVX18f\ny1MAAGBJMb1k3dzcrBkzZkiSvvrVr+rs2bM6f/58LE8BAIAlxfQRcjAY1N133x352O12KxAIaNSo\nUbE8zRUtXH84IecBACDWYv4c8sXC4fBVP+/xuGJ6vt9teSim9wcAQKLE9JK11+tVMBiMfNzZ2SmP\nxxPLUwAAYEkxDfI3vvENNTY2SpLee+89eb3ehF2uBgBgKIvpJeu8vDzdfffdKisrk81m089+9rNY\n3j0AAJZlC0d7ohcAAMQdr9QFAIABCDIAAAaI6z97SraWlhatWLFC48aNkySNHz9ejz76qFatWqX+\n/n55PB5t2rRJDocjySu9dseOHdOSJUv0/e9/X/PmzVN7e/uAczU0NKiurk4jRoxQaWmpSkpKkr30\nQfv8jFVVVXrvvfeUnp4uSaqoqNCDDz44ZGfcuHGj3n33XX366ad67LHHlJ2dbbk9/PyMhw8ftswe\n9vX1qaqqSqdPn9aFCxe0ZMkS3XXXXZbZw4Hma2xstMz+XSwUCunb3/62lixZoilTpiRvD8MW9tZb\nb4WXL19+ybGqqqrwq6++Gg6Hw+EtW7aEf/WrXyVjaV9IT09PeN68eeE1a9aEX3755XA4PPBcPT09\n4VmzZoW7u7vDfX194W9961vhrq6uZC590AaacfXq1eHDhw9fdruhOGNzc3P40UcfDYfD4fBHH30U\nfuCBByy3hwPNaKU9/P3vfx/euXNnOBwOhz/44IPwrFmzLLWHA81npf272NatW8Nz5swJHzx4MKl7\nOOwuWbe0tGj69OmSpIKCAjU3Nyd5RdfO4XBo165d8nq9kWMDzdXa2qrs7Gy5XC45nU7l5eXJ7/cn\na9nXZKAZBzJUZ5w8ebKef/55SdINN9ygvr4+y+3hQDP29/dfdruhOuPs2bO1aNEiSVJ7e7vGjBlj\nqT0caL6BDNX5/uvEiRM6fvy4HnzwQUnJ/bvU8kE+fvy4Fi9erO9+97v685//rL6+vsgl6oyMDAUC\ngSSv8NrZ7XY5nc5Ljg00VzAYlNvtjtzmvy9lOhQMNKMkvfLKK1qwYIEef/xxffTRR0N2xpSUFKWl\npUmSfD6f7r//fsvt4UAzpqSkWGYP/6usrEwrV65UdXW15fZQunQ+yTo/g/+1YcMGVVVVRT5O5h5a\n+jnk2267TcuWLVNRUZFOnjypBQsWXPJ/6GGL/ouvK8011Od96KGHlJ6erqysLO3cuVMvvPCC7r33\n3ktuM9RmfP311+Xz+VRbW6tZs2ZFjltpDy+e8ejRo5bbw/379+v999/XE088ccnarbKHF89XXV1t\nqf07dOiQcnNzdfPNNw/4+UTvoaUfIY8ZM0azZ8+WzWbTLbfcohtvvFFnz55VKBSSJHV0dES9JDpU\npKWlXTbXQC9lOpTnnTJlirKysiRJ06ZN07Fjx4b0jG+++aZ27NihXbt2yeVyWXIPPz+jlfbw6NGj\nam9vlyRlZWWpv79fI0eOtMweDjTf+PHjLbN/knTkyBG98cYbKi0t1YEDB/SLX/wiqT+Hlg5yQ0OD\n9uzZI0kKBAI6ffq05syZE3l5z6amJuXn5ydziTEzderUy+bKyclRW1uburu71dPTI7/fr0mTJiV5\npddv+fLlOnnypKR/P88zbty4ITvjuXPntHHjRr344ouR31i12h4ONKOV9vCdd95RbW2tpH+/011v\nb6+l9nCg+dauXWuZ/ZOk5557TgcPHtRvfvMblZSUaMmSJUndQ0u/Utf58+e1cuVKdXd365NPPtGy\nZcuUlZWl1atX68KFC8rMzNSzzz6r1NTUZC/1mhw9elQbNmzQqVOnZLfbNWbMGG3evFlVVVWXzfXa\na69pz549stlsmjdvnr7zne8ke/mDMtCM8+bN086dO/WlL31JaWlpevbZZ5WRkTEkZ6yvr9f27dt1\n++23R46tX79ea9asscweDjTjnDlz9Morr1hiD0OhkJ588km1t7crFApp2bJlmjhx4oB/v1hlvrS0\nNG3atMkS+/d527dv19ixY3XfffclbQ8tHWQAAIYKS1+yBgBgqCDIAAAYgCADAGAAggwAgAEIMgAA\nBiDIAAAYgCADAGAAggwAgAH+D9Q/YIrR06v8AAAAAElFTkSuQmCC\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x7f0588cb50b8>"
]
},
"metadata": {
"tags": []
}
}
]
},
{
"metadata": {
"id": "i-5KgIf1Zp3S",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **So we can see most samples have around 300 variants in this gene.**\n",
"\n",
"\n"
]
},
{
"metadata": {
"id": "814Jb5DGUvqY",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **How many variants within DYRK1A gene for a particular sample that are shared by no other samples. These variants are called private variants.**\n",
"\n",
"One way to do this would be to filter out variant positions that have a single sample ID associated with it."
]
},
{
"metadata": {
"id": "v_eAFZ9DU6f5",
"colab_type": "code",
"outputId": "59b5a709-a870-4a09-f700-f7b67a79aa8b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 80
}
},
"cell_type": "code",
"source": [
"df10z = pd.io.gbq.read_gbq('''\n",
"\n",
"WITH\n",
"t1 AS (\n",
"SELECT\n",
" reference_name,\n",
" start_position,\n",
" call.name,\n",
" call.genotype[OFFSET(0)] g1,\n",
" call.genotype[OFFSET(1)] g2\n",
" FROM\n",
" `isb-cgc.QotM.1000genomes`\n",
" JOIN\n",
" UNNEST(call) AS call WITH OFFSET AS ci\n",
" WHERE\n",
" reference_name = '21'\n",
" AND (start_position BETWEEN 37365790 AND 37517450)\n",
"),\n",
"\n",
"t2 AS (\n",
"SELECT\n",
" start_position,\n",
" COUNT(start_position) as N\n",
"FROM\n",
" t1\n",
"WHERE\n",
" (g1 = 1 OR g2 = 1)\n",
"GROUP BY\n",
" start_position \n",
")\n",
"\n",
"select COUNT(*) private_vars from t2 WHERE N = 1\n",
"\n",
"''', project_id=project_id, verbose=False, dialect='standard')\n",
"\n",
"df10z.head()\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>private_vars</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1852</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" private_vars\n",
"0 1852"
]
},
"metadata": {
"tags": []
},
"execution_count": 23
}
]
},
{
"metadata": {
"id": "xT3pvBCbW-Ls",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **OK! The above result shows 1852 private variants out of 11,369,475 variants in our table.**"
]
},
{
"metadata": {
"id": "mkTI-bsjqOc9",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **Finally, let's do some statistics! Let's do Z-scores. Now we could just work with the pandas table we already have, but for the example, let's do it in BigQuery.**"
]
},
{
"metadata": {
"id": "ormLk6thqTE0",
"colab_type": "code",
"outputId": "8dd15854-0bc6-4649-e153-115a38916f86",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"cell_type": "code",
"source": [
"df11 = pd.io.gbq.read_gbq('''\n",
"WITH\n",
"t1 AS (\n",
"SELECT\n",
" reference_name,\n",
" start_position,\n",
" call.name as sample,\n",
" call.genotype[OFFSET(0)] g1,\n",
" call.genotype[OFFSET(1)] g2\n",
" FROM\n",
" `isb-cgc.QotM.1000genomes`\n",
" JOIN\n",
" UNNEST(call) AS call WITH OFFSET AS ci\n",
" WHERE\n",
" reference_name = '21'\n",
" AND (start_position BETWEEN 37365790 AND 37517450)\n",
"),\n",
"\n",
"t2 AS (\n",
" SELECT\n",
" sample,\n",
" COUNT(sample) as N\n",
" FROM\n",
" t1\n",
" WHERE\n",
" g1 = 1 OR g2 = 1\n",
" GROUP BY\n",
" sample\n",
" ),\n",
"\n",
"t3 AS (\n",
"SELECT\n",
" AVG(N) avgn, \n",
" STDDEV(N) stddevn\n",
"FROM\n",
" t2\n",
"), \n",
"\n",
"t4 AS (\n",
" SELECT\n",
" sample,\n",
" N,\n",
" avgn,\n",
" stddevn,\n",
" (N - avgn) / stddevn as Z_score\n",
" FROM\n",
" t2 CROSS JOIN t3\n",
")\n",
"\n",
"select * from t4\n",
"''', project_id=project_id, verbose=False, dialect='standard')\n",
"\n",
"df11.head()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>sample</th>\n",
" <th>N</th>\n",
" <th>avgn</th>\n",
" <th>stddevn</th>\n",
" <th>Z_score</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>HG01086</td>\n",
" <td>256</td>\n",
" <td>275.190533</td>\n",
" <td>58.873981</td>\n",
" <td>-0.325959</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>HG03342</td>\n",
" <td>256</td>\n",
" <td>275.190533</td>\n",
" <td>58.873981</td>\n",
" <td>-0.325959</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>HG03694</td>\n",
" <td>256</td>\n",
" <td>275.190533</td>\n",
" <td>58.873981</td>\n",
" <td>-0.325959</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>HG00110</td>\n",
" <td>256</td>\n",
" <td>275.190533</td>\n",
" <td>58.873981</td>\n",
" <td>-0.325959</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>HG00116</td>\n",
" <td>256</td>\n",
" <td>275.190533</td>\n",
" <td>58.873981</td>\n",
" <td>-0.325959</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" sample N avgn stddevn Z_score\n",
"0 HG01086 256 275.190533 58.873981 -0.325959\n",
"1 HG03342 256 275.190533 58.873981 -0.325959\n",
"2 HG03694 256 275.190533 58.873981 -0.325959\n",
"3 HG00110 256 275.190533 58.873981 -0.325959\n",
"4 HG00116 256 275.190533 58.873981 -0.325959"
]
},
"metadata": {
"tags": []
},
"execution_count": 24
}
]
},
{
"metadata": {
"id": "pSOCbwAjsXE1",
"colab_type": "code",
"outputId": "13891d7b-8212-4b1b-ec36-2996dee2db3c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 447
}
},
"cell_type": "code",
"source": [
"df11.hist()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f0585da1a20>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7f05860f14a8>],\n",
" [<matplotlib.axes._subplots.AxesSubplot object at 0x7f058606ab38>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7f0585f50898>]],\n",
" dtype=object)"
]
},
"metadata": {
"tags": []
},
"execution_count": 26
},
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAe8AAAFZCAYAAACxGqelAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4yLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvNQv5yAAAIABJREFUeJzt3Xt8FPW9//HXkmSJ0cWQmKWloj1a\nvJSGW6EUEIQESECsBAhCCrQHLHBABEQFYxSqrYRQqFyiIIqiWA0Ei9FSEpBgQUMqbB8I/moV+nhY\nUCG7EkgMgYRkfn9w3GO4JdnsZYa8n3+xszs77+9kP/thvrM7azMMw0BEREQso0WoA4iIiEjjqHmL\niIhYjJq3iIiIxah5i4iIWIyat4iIiMWoeYuIiFiMmrcAcOutt5Kenl5nWXFxMePGjQtRIhERuZTw\nUAcQ8/jwww/5f//v//HjH/841FFE5BI+++wzpk+fXmeZ2+3mjjvuYOnSpSFKJcGm5i1eDz74IE8/\n/TTr1q0LdRQRuYT27duzZcsW7+2SkhJSUlKYNGlSCFNJsGnaXLwGDx6MYRh13hhExNwyMjIYM2YM\nHTp0uORjzp49y2OPPUZSUhIDBw7k/vvv55tvvgFg06ZNJCUlkZSUxMMPP0xVVRUAf/3rXxk6dCjJ\nycmMHz+e//znPwAsX76cjIwMRo4cycsvv4xhGKxYsYKkpCT69+/P7373O2pqagI/8GZOzVvqSE9P\n5w9/+ANnzpwJdRQRqcf69evxeDxMmTLlso/btWsXR44cYcuWLRQUFPCjH/2If/zjHxw5coSFCxfy\nyiuvsGXLFiorK3nllVf48ssvefzxx8nOzmbLli3069ePJ554wvt87733Hs8//zy//vWveeutt9iy\nZQu5ubls3bqVw4cP8/rrrwd66M2emrfU0aFDB7p3785LL70U6igichmHDx9myZIlZGVlER5++TOg\nMTExHDp0iK1bt1JZWcnMmTPp06cP77//Pl26dKFNmzbYbDYWL17Mr3/9a95//3169OjBjTfeCEBq\nairFxcWcPXsWgE6dOhETEwNAYWEhI0aMwOFwEB4eTmpqKgUFBYEdvOict1xo1qxZDB8+nOuvvz7U\nUUTkImpra3n00Ue57777+NGPflTv4zt27EhGRgavvvoqc+bMISEhgXnz5lFaWkqrVq28j2vZsiXA\nBcsdDgeGYVBaWgrAtdde672vvLycF198kZycHABqamq8jV0CR81bLuB0OvnlL3/J8uXLcTqdoY4j\nIud5+eWXOXv2LBMmTGjwOsnJySQnJ3PixAnS09N58cUXadu2Lf/4xz+8j/nmm284ffo0sbGxdZaf\nPHmSFi1a0Lp16wue1+l0kpCQwNixY5s2KGkUTZvLRU2YMIHq6upQxxCR8xw8eJBVq1axcOFCWrRo\n2Fv4xo0byc7OBiA6OpqbbroJgDvvvBOXy8WRI0cwDIN58+aRm5tL79692bNnD4cPHwbgjTfeoHfv\n3hednk9MTOStt96isrLS+9g///nP/hiqXIaOvAWAf/3rX3Vut2zZku3bt4cojYhcyssvv0xVVRWT\nJ0+us9xut5OXl3fRdRITE0lPT2fQoEGEhYVx4403kpmZSXR0NE8++SS/+tWvCAsLIz4+nv/+7/+m\nZcuW/O53v2Pq1KlUV1dz/fXX89RTT130uQcMGMBnn31GSkoKADfccAO///3v/TtouYDNMAwj1CFE\nRESk4TRtLiIiYjGaNhcRuQJMmzaNQ4cOXfS+7Oxsbr755iAnkkDStLmIiIjFaNpcRETEYkw5be52\nl4c6wgVat46itPRUqGM0icZgDg0dQ1ycIwhpQq++er8S/uYN0RzG2RzGCL6Ns7H1riPvBgoPDwt1\nhCbTGMzhShhDMDWX/dUcxtkcxgjBGaeat4iIiMWoeYuIiFiMmreIiIjFqHmLiIhYjCk/bS5XhgmZ\n/rk2+pq5CX55HhEJLH/UvOq9YXTkLSIiYjFq3iIiIhaj5i0iImIxOuctpqdz5yLNh+q9YXTkLSIi\nYjFq3iIiIhaj5i0iImIxat4iIiIWo+YtIiJiMWreIiIiFqPmLSIiYjFq3iIiIhaj5i0iImIxPl9h\nLS8vjxdeeIHw8HAeeOABbr31Vh555BFqamqIi4tj0aJF2O128vLyWLt2LS1atGDUqFGkpqb6M7+I\nBIlqXsQ8fGrepaWlZGdns3HjRk6dOsXy5cvJz88nLS2NwYMHs2TJEnJzcxk2bBjZ2dnk5uYSERHB\nyJEjGThwINHR0f4eh4gEkGpexFx8mjYvKiqiZ8+eXHPNNTidTp566imKi4tJTEwEoH///hQVFbFv\n3z7i4+NxOBxERkbStWtXXC6XXwcgIoGnmhcxF5+OvI8cOcLp06eZMmUKZWVlTJ8+ncrKSux2OwCx\nsbG43W48Hg8xMTHe9WJiYnC73f5JLiJBo5oXMRefz3mfOHGCFStW8OWXXzJ+/HgMw/De991/f9el\nlp+vdesowsPDfI0WMHFxjlBHaLIrYQy+MtPYzZSloQJV8w2pdyvuL180l3EGQ6j3ZaC371Pzjo2N\npUuXLoSHh3PDDTdw9dVXExYWxunTp4mMjOTYsWM4nU6cTicej8e7XklJCZ07d673+UtLT/kSK6Di\n4hy43eWhjtEkV8IYmsIsY2/o3yHUbz7fFciar6/em8vrtrmMM1hCuS99+Vs2tt59Oud9xx13sHv3\nbmprayktLeXUqVP06tWL/Px8AAoKCujTpw+dOnVi//79lJWVUVFRgcvlolu3br5sUkRCSDUvYi4+\nHXm3adOGpKQkRo0aBUBGRgbx8fHMmTOHnJwc2rZty7Bhw4iIiGD27NlMnDgRm83GtGnTcDjMczQh\nIg2jmhcxF5vR0BPRQWTGqaMrYUor2GOYkLk9aNtqiDVzE0IdAbDmtHkg1bcvroTaa4grYZxmqvlQ\n1rtpp81FREQkdNS8RURELEbNW0RExGLUvEVERCxGzVtERMRi1LxFREQsRs1bRETEYtS8RURELEbN\nW0RExGLUvEVERCxGzVtERMRi1LxFREQsRs1bRETEYtS8RURELEbNW0RExGLUvEVERCxGzVtERMRi\n1LxFREQsRs1bRETEYsJDHUDMaULm9lBHEJEgUb1bj468RURELEbNW0RExGLUvEVERCxGzVtERMRi\n1LxFREQsRs1bRETEYtS8RURELKZJzfv06dMMGDCAN998k6+++opx48aRlpbGjBkzqKqqAiAvL48R\nI0aQmprKhg0b/BJaRIJP9S5iHk1q3s899xzXXnstAMuWLSMtLY0//elP3HjjjeTm5nLq1Cmys7N5\n+eWXefXVV1m7di0nTpzwS3ARCS7Vu4h5+Ny8Dx06xMGDB+nXrx8AxcXFJCYmAtC/f3+KiorYt28f\n8fHxOBwOIiMj6dq1Ky6Xyy/BRSR4VO8i5uLz5VEXLlzI448/zqZNmwCorKzEbrcDEBsbi9vtxuPx\nEBMT410nJiYGt9td73O3bh1FeHiYr9ECJi7OEeoITXYljMFXZhq7mbI0RKjr3Wr7y1fNZZzBEOp9\nGejt+9S8N23aROfOnWnXrt1F7zcMo1HLz1daesqXWAEVF+fA7S4PdYwmuRLG0BRmGXtD/w6hfvP5\nVqjrvbm8bpvLOIMllPvSl79lY+vdp+a9Y8cODh8+zI4dOzh69Ch2u52oqChOnz5NZGQkx44dw+l0\n4nQ68Xg83vVKSkro3LmzL5sUkRBRvYuYj0/N+5lnnvH+e/ny5fzgBz/gH//4B/n5+dxzzz0UFBTQ\np08fOnXqREZGBmVlZYSFheFyuUhPT/dbeBEJPNW7iPn47SdBp0+fzpw5c8jJyaFt27YMGzaMiIgI\nZs+ezcSJE7HZbEybNg2HwxxTgSLiO9W7SGjZjIaemAoiM573uRLORzVmDFfi7/uumZsQ6giA9c55\nB1p9++JKqL2GCOU4Ve/+FYxz3rrCmoiIiMWoeYuIiFiMmreIiIjFqHmLiIhYjJq3iIiIxah5i4iI\nWIzfvuctYnb++DqMWb5uJiKX56+vv5m15nXkLSIiYjFq3iIiIhaj5i0iImIxat4iIiIWo+YtIiJi\nMWreIiIiFqPmLSIiYjFq3iIiIhaj5i0iImIxat4iIiIWo+YtIiJiMWreIiIiFqPmLSIiYjFq3iIi\nIhaj5i0iImIxat4iIiIWo+YtIiJiMWreIiIiFqPmLSIiYjHhvq6YlZXF3r17OXv2LJMnTyY+Pp5H\nHnmEmpoa4uLiWLRoEXa7nby8PNauXUuLFi0YNWoUqamp/swvIkGimhcxD5+a9+7du/nss8/Iycmh\ntLSUlJQUevbsSVpaGoMHD2bJkiXk5uYybNgwsrOzyc3NJSIigpEjRzJw4ECio6P9PQ4RCSDVvIi5\n+DRt3r17d5YuXQpAq1atqKyspLi4mMTERAD69+9PUVER+/btIz4+HofDQWRkJF27dsXlcvkvvYgE\nhWpexFx8OvIOCwsjKioKgNzcXPr27cuuXbuw2+0AxMbG4na78Xg8xMTEeNeLiYnB7XbX+/ytW0cR\nHh7mS7SAiotzhDpCk10JYwglf+0/q/0dAlnzDal3q+0vXzWXcVqJr3+TQP8tfT7nDbBt2zZyc3NZ\ns2YNgwYN8i43DOOij7/U8vOVlp5qSqyAiItz4HaXhzpGk1wJYwg1f+y/hv4dzPhGHoiar6/em8vr\ntrmM02p8+Zv48rdsbL37/GnznTt3snLlSlavXo3D4SAqKorTp08DcOzYMZxOJ06nE4/H412npKQE\np9Pp6yZFJIRU8yLm4VPzLi8vJysri1WrVnk/iNKrVy/y8/MBKCgooE+fPnTq1In9+/dTVlZGRUUF\nLpeLbt26+S+9iASFal7EXHyaNt+8eTOlpaXMnDnTuywzM5OMjAxycnJo27Ytw4YNIyIigtmzZzNx\n4kRsNhvTpk3D4TDfVKCIXJ5qXsRcbEZDT0QHkRnP+1wJ56MaM4YJmdsDnMaa1sxNaPJzWPmcdyDU\nty+uhNpriFCOU/V+ab7UvKnPeYuIiEhoqHmLiIhYjJq3iIiIxah5i4iIWIyat4iIiMWoeYuIiFiM\nmreIiIjFqHmLiIhYjJq3iIiIxah5i4iIWIyat4iIiMWoeYuIiFiMT78qJualHxgQaT5U782XjrxF\nREQsplkfefvrf63++JlIEQks1btcSXTkLSIiYjFq3iIiIhaj5i0iImIxzfqct7/441yazqOJWIPO\nnYsZ6MhbRETEYtS8RURELMZmGIYR6hDnc7vLg7IdXeBAQqUhU6ZxcY4gJAm9+uo9Ls7hl/cE1buE\nSiDqXUfeIiIiFqPmLSIiYjFq3iIiIhaj5i0iImIxQfme99NPP82+ffuw2Wykp6fTsWPHYGxWREJE\nNS8SWAFv3n//+9/5/PPPycnJ4dChQ6Snp5OTkxPozYpIiKjmRQIv4NPmRUVFDBgwAICbb76ZkydP\n8s033wR6syISIqp5kcAL+JG3x+OhQ4cO3tsxMTG43W6uueaaJj2vvrMpYk6BqHnVu0hdQb+2eUOu\nCdOQL6u/vfgef8QRkQCrr+ZV7yKNF/Bpc6fTicfj8d4uKSkhLi4u0JsVkRBRzYsEXsCbd+/evcnP\nzwfg448/xul0NnnKXETMSzUvEngBnzbv2rUrHTp0YPTo0dhsNubNmxfoTYpICKnmRQLPlD9MIiIi\nIpemK6yJiIhYjJq3iIiIxah5f8enn37KgAEDWLduHQBfffUV48aNIy0tjRkzZlBVVQVAXl4eI0aM\nIDU1lQ0bNoQy8gWysrK49957GTFiBAUFBZYbQ2VlJTNmzGDs2LGkpqZSWFhouTF86/Tp0wwYMIA3\n33zTsmMwC4/HQ/fu3SkuLg51lIA4e/Ysc+bMYcyYMYwaNYo9e/aEOpLfPf3009x7772MHj2ajz76\nKNRxAub89+CAMcQwDMOoqKgwxo4da2RkZBivvvqqYRiGMXfuXGPz5s2GYRjG4sWLjddee82oqKgw\nBg0aZJSVlRmVlZXGXXfdZZSWloYyuldRUZFx3333GYZhGMePHzfuvPNOy43hL3/5i/H8888bhmEY\nR44cMQYNGmS5MXxryZIlxvDhw42NGzdadgxm8fDDDxspKSnG7t27Qx0lIHJzc4158+YZhmEYn376\nqTFixIjQBvKz4uJiY9KkSYZhGMbBgweNUaNGhThRYFzsPThQdOT9v+x2O6tXr8bpdHqXFRcXk5iY\nCED//v0pKipi3759xMfH43A4iIyMpGvXrrhcrlDFrqN79+4sXboUgFatWlFZWWm5MQwZMoTf/OY3\nwLmZjzZt2lhuDACHDh3i4MGD9OvXD7Dea8lMioqKuPrqq7nllltCHSVgfvGLX/Doo48C565Id+LE\niRAn8q/mcsnci70H19TUBGRbat7/Kzw8nMjIyDrLKisrsdvtAMTGxuJ2u/F4PMTExHgf8+2lH80g\nLCyMqKgoAHJzc+nbt6/lxvCt0aNH89BDD5Genm7JMSxcuJC5c+d6b1txDGZQVVVFdnY2s2bNCnWU\ngIqIiKBly5YArF27lqFDh4Y4kX95PB5at27tvX2lvtYv9h4cFhYWkG0F/fKoVmVc4ht1l1oeStu2\nbSM3N5c1a9YwaNAg73IrjeGNN97gn//8Jw8//HCdfFYYw6ZNm+jcuTPt2rW76P1WGEMobNiw4YLz\n/n379iU1NZVWrVqFKJX/XWyc06dPp0+fPrz22mt8/PHHrFy5MkTpguNKf61/9z04UNS8LyMqKorT\np08TGRnJsWPHcDqdF730Y+fOnUOYsq6dO3eycuVKXnjhBRwOh+XGcODAAWJjY/n+97/P7bffTk1N\nDVdffbWlxrBjxw4OHz7Mjh07OHr0KHa73XJ/h1BITU0lNTW1zrLRo0dTW1vLa6+9xn/+8x8++ugj\nli5dSvv27UOUsukuNk4419S3b9/Os88+S0RERAiSBU5zumTu+e/BgaJp88vo1auX9zKPBQUF9OnT\nh06dOrF//37KysqoqKjA5XLRrVu3ECc9p7y8nKysLFatWkV0dDRgvTHs2bPH+79Vj8fDqVOnLDeG\nZ555ho0bN7J+/XpSU1OZOnWq5cZgFm+88Qbr169n/fr19OvXj3nz5lm6cV/K4cOHeeONN1ixYoV3\n+vxK0lwumXux9+BA0ZH3/zpw4AALFy7kiy++IDw8nPz8fP7whz8wd+5ccnJyaNu2LcOGDSMiIoLZ\ns2czceJEbDYb06ZNC+j/rhpj8+bNlJaWMnPmTO+yzMxMMjIyLDOG0aNH89hjj5GWlsbp06d54okn\n+MlPfsKcOXMsM4aLmT59uuXHIIGzYcMGTpw4waRJk7zLXnzxRe/nJKyuuVwy92LvwQsXLqRt27Z+\n35YujyoiImIxmjYXERGxGDVvERERi1HzFhERsRg1bxEREYtR8xYREbEYNW8RERGLUfMWERGxGDVv\nERERi1HzFhG5Aqxfv/6S9w0cOJDi4uILlu/Zs4eEhIRAxpIAUfMWEbG4mpoasrKyQh1DgkjXNm8m\nNmzYwJo1a6ipqSEuLo6srCwGDx7Mjh07vL8p/fvf/56WLVsyffp0HnnkEVwuF+3bt+fHP/4xHo+H\nzMxMxo0bR0JCAgUFBRw5coTu3buzePFibDZbiEco0jycPXuWefPmsWfPHmpra7n11lspKyujvLyc\n5ORkVq9ezcmTJ5kzZw5nz57lzjvvrLP+s88+S05ODq1bt65z1F1VVUVWVhY7d+6kurqaUaNGMWXK\nFLKysjhz5gyPP/44AMePHychIYGdO3dyzz33MGnSJHJzczl69ChDhw6t8zv2Ejg68m4Gvv76a558\n8kleeuklCgoKuOGGG3j22Wfp0aMHhYWF3se9++67DB48mA0bNlBSUkJhYSFPPfUUb775Zp3n2759\nOy+99BL5+fns3r0bl8sV7CGJNFu7du3iyJEjbNmyhYKCAn70ox8xZMgQwsLC2LJlC+3atWP+/PmM\nHz+e/Px8unTpwpEjRwA4ePAgL7/8Mhs3bmTjxo3861//8j7v6tWrOXjwIG+//TbvvPMO+fn5FBYW\nkpycXOd9orCwkJ///OfeH9H58MMPycnJYePGjaxbt46jR48Gd4c0U2rezUBsbCx79+7le9/7HgDd\nunXj8OHDJCUlsX37duDcz/SFh4fToUMH9uzZQ1JSEuHh4fzgBz+44H/uycnJREZGEhUVxQ9/+EO+\n+uqroI9JpLmKiYnh0KFDbN26lcrKSmbOnMkdd9zhvf/MmTPs37+fIUOGAOfq9aqrrgLONdru3btz\n3XXXERYWxi9+8QvveoWFhaSlpXl/f/6ee+6hoKCAjh07YhgGn3zyCQBbt25l8ODB3vXuvvtuwsLC\naNOmDbGxsXo/CBI172agpqaGZcuWMWTIEJKSkvjjH/+IYRgMGDCA4uJizpw5w7Zt27wFWVZWVue3\naNu0aVPn+b77O7xhYWHU1NQEZyAiQseOHcnIyODVV1+ld+/ezJ49m/Lycu/9J06cAP6vTm02G61a\ntQLg5MmTdX529tvlcO63qBcsWEBycjLJycm88sorVFZWAjBo0CDeffddTp06hcvlIjEx0bue3g9C\nQ+e8m4HNmzezfft21q1bR0xMDOvXr+ftt98mOjqajh07UlRUxLZt21i0aBFwrhgrKiq867vd7lBF\nF5GL+LbBnjhxgvT0dF544QXvfddeey0A33zzDQ6Hg9raWk6ePAmca9bfbfSlpaXefzudTiZMmED/\n/v0v2F5SUhK///3vad++Pd27d6/TsCU0dOTdDHz99df84Ac/ICYmhtLSUv761796m3NSUhLr16+n\nurqa2267DYD4+HgKCgqora3lq6++4m9/+1so44vId2zcuJHs7GwAoqOjuemmm7Db7dTW1vLNN98Q\nGRnJbbfdxtatWwH4y1/+wpkzZwDo0qULe/fu5fjx49TU1JCXl+d93sTERDZs2EBNTQ2GYfDss896\na79Lly58/fXXvPnmm3WmzCV01LybgaFDh3LixAkGDhzI7NmzmTlzJkePHiUzM5OBAweyY8cOkpOT\nvY8fM2YMLVu2ZMCAAfz2t7/lrrvu0qfJRUwiMTGRjz/+mEGDBjF48GAOHjzInDlz+OlPf0r//v1x\nuVzMnz+f1atXk5SUxEcffcTNN98MwO23387o0aNJSUlh+PDhdO3a1fu8aWlptG3blrvuuovk5GQO\nHTrET3/6U+Dc1PuAAQMoKiq66JG5BJ/NMAwj1CHEfAzD8DbshQsXUlNTQ3p6eohTiYgI6MhbLuLd\nd99lxIgRVFVVUVFRwXvvvUfnzp1DHUtERP6XPrAmF+jXrx/vvfcegwcPpkWLFvTr16/OtLqIiISW\nps1FREQsRtPmIiIiFqPmLSIiYjGmPOftdpfX/yA/at06itLSU0Hd5qUoy6WZKU8wssTFOep/0BXA\nl3o302sBzJVHWS7NTHnOz9LYeteRNxAeHhbqCF7KcmlmymOmLM2R2fa/mfIoy6WZKU9Ts6h5i4iI\nWIyat4iIiMU06Jx3VlYWe/fu5ezZs0yePJnt27fz8ccfe395auLEifTr14+8vDzWrl1LixYtGDVq\nFKmpqVRXVzN37ly+/PJLwsLCWLBgAe3atQvooETEd6p3EfOrt3nv3r2bzz77jJycHEpLS0lJSeHn\nP/85Dz74YJ1r3J46dYrs7Gxyc3OJiIhg5MiRDBw4kMLCQlq1asXixYvZtWsXixcv5plnngnooETE\nN6p3EWuot3l3796djh07Aud+Tq6ysvKiv9e6b98+4uPjvb8V27VrV1wuF0VFRQwbNgyAXr166frY\nFjEhc3uTn2PN3AQ/JJFgUr03T/6od1DNB1O957zDwsKIiooCIDc3l759+xIWFsa6desYP348s2bN\n4vjx43g8HmJiYrzrxcTE4Ha76yxv0aIFNpuNqqqqAA1HRJpC9S5iDQ3+nve2bdvIzc1lzZo1HDhw\ngOjoaG6//Xaef/55VqxYQZcuXeo8/lJXXW3I1Vhbt44K+kf6zfSdWjNlaYpAjMNM+8ZMWfzNCvVu\ntv1vpjyhynKx7Zppv4C58jQlS4Oa986dO1m5ciUvvPACDoeDnj17eu9LSEhg/vz5JCUl4fF4vMtL\nSkro3LkzTqcTt9vNbbfdRnV1NYZhYLfbL7u9YH+JPi7OEfQLw1yKmbI0lb/HYaZ9E4wsoXqTsUK9\nm+m1AObKE8os52/XTPsFzJXn/Cx+v0hLeXk5WVlZrFq1yvtp0+nTp3P48GEAiouLad++PZ06dWL/\n/v2UlZVRUVGBy+WiW7du9O7dmy1btgBQWFhIjx49GhVQRIJH9S5iDfUeeW/evJnS0lJmzpzpXTZ8\n+HBmzpzJVVddRVRUFAsWLCAyMpLZs2czceJEbDYb06ZNw+FwMGTIED744APGjBmD3W4nMzMzoAMS\nEd+p3kWswZQ/CRrsaQ0zT6WEihk/bW6WfQNX9rR5sPmyH830WgBz5fElS6A+bW6m/QLmyhPwaXMR\nERExFzVvERERi1HzFhERsRg1bxEREYtR8xYREbEYNW8RERGLUfMWERGxGDVvERERi1HzFhERsRg1\nbxEREYtR8xYREbEYNW8RERGLUfMWERGxGDVvERERi1HzFhERsRg1bxEREYsJb8iDsrKy2Lt3L2fP\nnmXy5MnEx8fzyCOPUFNTQ1xcHIsWLcJut5OXl8fatWtp0aIFo0aNIjU1lerqaubOncuXX35JWFgY\nCxYsoF27doEel4j4SPUuYn71Nu/du3fz2WefkZOTQ2lpKSkpKfTs2ZO0tDQGDx7MkiVLyM3NZdiw\nYWRnZ5Obm0tERAQjR45k4MCBFBYW0qpVKxYvXsyuXbtYvHgxzzzzTDDGJiKNpHoXsYZ6p827d+/O\n0qVLAWjVqhWVlZUUFxeTmJgIQP/+/SkqKmLfvn3Ex8fjcDiIjIyka9euuFwuioqKGDhwIAC9evXC\n5XIFcDgi0hSqdxFrqPfIOywsjKioKAByc3Pp27cvu3btwm63AxAbG4vb7cbj8RATE+NdLyYm5oLl\nLVq0wGazUVVV5V3/Ylq3jiI8PKxJA2usuDhHULd3OWbK0hSBGIeZ9o2ZsviLlerdbPvfTHlCleVi\n2zXTfgFz5WlKlgad8wbYtm2xNjhIAAAVZ0lEQVQbubm5rFmzhkGDBnmXG4Zx0cc3dvl3lZaeamgs\nv4iLc+B2lwd1m5dipixN5e9xmGnfBCNLKN9kzF7vZnotgLnyhDLL+ds1034Bc+U5P0tj671Bnzbf\nuXMnK1euZPXq1TgcDqKiojh9+jQAx44dw+l04nQ68Xg83nVKSkq8y91uNwDV1dUYhnHZ/4WLSGip\n3kXMr97mXV5eTlZWFqtWrSI6Oho4dy4rPz8fgIKCAvr06UOnTp3Yv38/ZWVlVFRU4HK56NatG717\n92bLli0AFBYW0qNHjwAOR0SaQvUuYg31Tptv3ryZ0tJSZs6c6V2WmZlJRkYGOTk5tG3blmHDhhER\nEcHs2bOZOHEiNpuNadOm4XA4GDJkCB988AFjxozBbreTmZkZ0AGJiO9U7yLWYDMaclIqyIJ9TsLM\n50FCZULm9iY/x5q5CX5I8n/Msm/gyj/nHUy+7EczvRbAXHl8yeKPeocLa95M+wXMlSco57xFRETE\nPNS8RURELEbNW0RExGLUvEVERCxGzVtERMRi1LxFREQsRs1bRETEYtS8RURELEbNW0RExGLUvEVE\nRCxGzVtERMRi1LxFREQsRs1bRETEYtS8RURELEbNW0RExGIa1Lw//fRTBgwYwLp16wCYO3cud999\nN+PGjWPcuHHs2LEDgLy8PEaMGEFqaiobNmwAoLq6mtmzZzNmzBjGjh3L4cOHAzMSEfEL1buI+YXX\n94BTp07x1FNP0bNnzzrLH3zwQfr371/ncdnZ2eTm5hIREcHIkSMZOHAghYWFtGrVisWLF7Nr1y4W\nL17MM8884/+RiEiTqd5FrKHeI2+73c7q1atxOp2Xfdy+ffuIj4/H4XAQGRlJ165dcblcFBUVMXDg\nQAB69eqFy+XyT3IR8TvVu4g11Nu8w8PDiYyMvGD5unXrGD9+PLNmzeL48eN4PB5iYmK898fExOB2\nu+ssb9GiBTabjaqqKj8OQUT8RfUuYg31TptfzD333EN0dDS33347zz//PCtWrKBLly51HmMYxkXX\nvdTy72rdOorw8DBfovksLs4R1O1djpmyNEUgxmGmfWOmLIFk1no32/43U55QZbnYds20X8BceZqS\nxafm/d3zYQkJCcyfP5+kpCQ8Ho93eUlJCZ07d8bpdOJ2u7ntttuorq7GMAzsdvtln7+09JQvsXwW\nF+fA7S4P6jYvxUxZmsrf4zDTvglGFrO8yZix3s30WgBz5QlllvO3a6b9AubKc36Wxta7T18Vmz59\nuvdTpMXFxbRv355OnTqxf/9+ysrKqKiowOVy0a1bN3r37s2WLVsAKCwspEePHr5sUkRCRPUuYj71\nHnkfOHCAhQsX8sUXXxAeHk5+fj5jx45l5syZXHXVVURFRbFgwQIiIyOZPXs2EydOxGazMW3aNBwO\nB0OGDOGDDz5gzJgx2O12MjMzgzEuEfGB6l3EGmxGQ05KBVmwpzXMPJUSKhMytzf5OdbMTfBDkv9j\nln0DzWvaPNB82Y9mei2AufL4ksUf9Q4X1ryZ9guYK09Ips1FREQkdNS8RURELEbNW0RExGLUvEVE\nRCxGzVtERMRi1LxFREQsRs1bRETEYtS8RURELEbNW0RExGLUvEVERCxGzVtERMRi1LxFREQsRs1b\nRETEYtS8RURELEbNW0RExGIa1Lw//fRTBgwYwLp16wD46quvGDduHGlpacyYMYOqqioA8vLyGDFi\nBKmpqWzYsAGA6upqZs+ezZgxYxg7diyHDx8O0FBExB9U7yLmV2/zPnXqFE899RQ9e/b0Llu2bBlp\naWn86U9/4sYbbyQ3N5dTp06RnZ3Nyy+/zKuvvsratWs5ceIE77zzDq1ateL1119nypQpLF68OKAD\nEhHfqd5FrKHe5m2321m9ejVOp9O7rLi4mMTERAD69+9PUVER+/btIz4+HofDQWRkJF27dsXlclFU\nVMTAgQMB6NWrFy6XK0BDEZGmUr2LWEO9zTs8PJzIyMg6yyorK7Hb7QDExsbidrvxeDzExMR4HxMT\nE3PB8hYtWmCz2bzTbiJiLqp3EWsIb+oTGIbhl+Xf1bp1FOHhYU3K1VhxcY6gbu9yzJSlKQIxDjPt\nGzNlCRYz1bvZ9r+Z8oQqy8W2a6b9AubK05QsPjXvqKgoTp8+TWRkJMeOHcPpdOJ0OvF4PN7HlJSU\n0LlzZ5xOJ263m9tuu43q6moMw/D+L/5SSktP+RLLZ3FxDtzu8qBu81LMlKWp/D0OM+2bYGQxy5uM\nGevdTK8FMFeeUGY5f7tm2i9grjznZ2lsvfv0VbFevXqRn58PQEFBAX369KFTp07s37+fsrIyKioq\ncLlcdOvWjd69e7NlyxYACgsL6dGjhy+bFJEQUb2LmE+9R94HDhxg4cKFfPHFF4SHh5Ofn88f/vAH\n5s6dS05ODm3btmXYsGFEREQwe/ZsJk6ciM1mY9q0aTgcDoYMGcIHH3zAmDFjsNvtZGZmBmNcIuID\n1buINdiMhpyUCrJgT2uYeSolVCZkbm/yc6yZm+CHJP/HLPsGmte0eaD5sh/N9FoAc+XxJYs/6h0u\nrHkz7RcwV56QTJuLiIhI6Kh5i4iIWIyat4iIiMWoeYuIiFiMmreIiIjFqHmLiIhYjJq3iIiIxah5\ni4iIWIyat4iIiMWoeYuIiFiMmreIiIjFqHmLiIhYjJq3iIiIxah5i4iIWIyat4iIiMWoeYuIiFhM\nuC8rFRcXM2PGDNq3bw/ALbfcwn333ccjjzxCTU0NcXFxLFq0CLvdTl5eHmvXrqVFixaMGjWK1NRU\nvw5ARAJPNS9iLj41b4Cf/exnLFu2zHv70UcfJS0tjcGDB7NkyRJyc3MZNmwY2dnZ5ObmEhERwciR\nIxk4cCDR0dF+CS8iwaOaFzEPv02bFxcXk5iYCED//v0pKipi3759xMfH43A4iIyMpGvXrrhcLn9t\nUkRCSDUvEjo+H3kfPHiQKVOmcPLkSe6//34qKyux2+0AxMbG4na78Xg8xMTEeNeJiYnB7XbX+9yt\nW0cRHh7mazSfxMU5grq9yzFTlqYIxDjMtG/MlCUYAlXzvta72fa/mfKEKsvFtmum/QLmytOULD41\n7x/+8Ifcf//9DB48mMOHDzN+/Hhqamq89xuGcdH1LrX8fKWlp3yJ5bO4OAdud3lQt3kpZsrSVP4e\nh5n2TTCymOlNJpA170u9m+m1AObKE8os52/XTPsFzJXn/CyNrXefps3btGnDkCFDsNls3HDDDVx3\n3XWcPHmS06dPA3Ds2DGcTidOpxOPx+Ndr6SkBKfT6csmRSSEVPMi5uJT887Ly+PFF18EwO128/XX\nXzN8+HDy8/MBKCgooE+fPnTq1In9+/dTVlZGRUUFLpeLbt26+S+9iASFal7EXHyaNk9ISOChhx7i\n3Xffpbq6mvnz53P77bczZ84ccnJyaNu2LcOGDSMiIoLZs2czceJEbDYb06ZNw+Ewz1SgiDSMal7E\nXHxq3tdccw0rV668YPlLL710wbLk5GSSk5N92YyImIRqXsRcdIU1ERERi1HzFhERsRg1bxEREYtR\n8xYREbEYNW8RERGLUfMWERGxGDVvERERi1HzFhERsRg1bxEREYtR8xYREbEYNW8RERGLUfMWERGx\nGDVvERERi1HzFhERsRg1bxEREYvx6fe8G+vpp59m37592Gw20tPT6dixYzA2KyIhopoXCayAN++/\n//3vfP755+Tk5HDo0CHS09PJyckJ9GZFJERU8yKBF/Bp86KiIgYMGADAzTffzMmTJ/nmm28CvVkR\nCRHVvEjgBbx5ezweWrdu7b0dExOD2+0O9GZFJERU8yKBF5Rz3t9lGEa9j4mLcwQhSei3eSlmyPL2\n4ntCHeGizLBvvmWmLGZWX837uh/Ntv/NlKexWQJZ72baL2CuPE3JEvAjb6fTicfj8d4uKSkhLi4u\n0JsVkRBRzYsEXsCbd+/evcnPzwfg448/xul0cs011wR6syISIqp5kcAL+LR5165d6dChA6NHj8Zm\nszFv3rxAb1JEQkg1LxJ4NqMhJ6FFRETENHSFNREREYtR8xYREbGYoH9VLFiysrLYu3cvZ8+eZfLk\nybzzzjuUlpYCcOLECTp37sxTTz0FnPsqy5gxY+jduzfTp0+v8zz5+fmsWbOGiIgI2rRpw4IFC7Db\n7X7PMnnyZO6++25+8pOfANC6dWuWLVtW53k++eQT5s+fD8Ctt97Kb3/724Dsl4ZmefLJJ2nRogWt\nWrVi8eLFXHXVVSHL86033niD559/nu3bt4csS3l5ObNmzeLkyZO0adOGJUuWNPo105wVFxczY8YM\n2rdvD8Att9xCcnIyS5YsITw8nKioKLKysrj22msvWNfj8TB48GBWrFhBjx49QpblxRdfJC8vj/Dw\ncObNm+e3y8P6kufYsWOkp6dTVVVFbW0tjz76qPf16+8saWlpPPHEE9hsNn74wx8yf/58wsPrtplA\nXDrX1yzn1/ygQYNClgXg9OnTDB06lKlTpzJ8+PDLb8i4AhUVFRn33XefYRiGcfz4cePOO++sc//c\nuXONffv2eW/n5OQYw4cPN5YtW3bBc91xxx1GWVmZYRiGkZGRYbzzzjsByXL48GEjJSXlss81duxY\nb+4HH3zQ2LFjR8iy/PKXv/RmyczMNNatW9eoLP7OYxiG4fF4jAkTJhj9+/cPaZaFCxcaL730kmEY\nhrF8+fI6rzWp3+7du43p06fXWZaSkmIcOnTIMAzDeO6554xVq1ZddN2HH37YSElJMXbv3h2yLJ9+\n+qmRkpJiVFdXGwcOHDCWLl3qlyy+5snMzDRef/11wzAMY+/evcaECRMClmXKlCne96UVK1YYeXl5\nde4vLi42Jk2aZBiGYRw8eNAYNWpUyLLUV/PBzPKtJUuWGMOHDzc2btxY73auyGnz7t27s3TpUgBa\ntWpFZWUlNTU1APz73/+mvLzc+7+948eP8/bbbzN69OiLPld0dDRlZWUAlJWV1blylL+zXE5VVRVf\nfPGF97H9+/enqKgoJFkAVq5c6X1sTEwMJ06caFQWf+cBWLRoEQ888ECjc/g7S2FhIXfffTcA999/\nv36Uww9at27tfY2dPHnyonVYVFTE1VdfzS233BLSLIWFhQwePJjw8HA6dOjg82vSX3m+e78v72GN\n8fnnn3tf73369OH999+vc38wL51bX5bL1XywswAcOnSIgwcP0q9fvwY95xXZvMPCwoiKigIgNzeX\nvn37EhYWBsArr7zC2LFjvY9dtGgRs2bN8t5/voyMDFJSUkhMTKS2tpZevXoFLIvH4+GBBx5g9OjR\n5OXl1Xme0tJSWrVq5b0dGxvb6EtO+isL4P3e7qlTp3jrrbdITk5uVBZ/5ykuLqZly5Z06tSp0Tn8\nncXj8fD66697p8qqqqp8ytScHTx4kClTpjBmzBjef/990tPTmTZtGklJSezdu5eUlJQ6j6+qqiI7\nO5tZs2aFPMsXX3zBV199xcSJE/nVr37FJ598EtI8v/71r9m8eTPJyclkZGQwY8aMgGW55ZZbeO+9\n9wDYuXNnnYv1QGAvndvYLJer+WBnAVi4cCFz585t+EaaPklgXlu3bjVGjhzpnfY+c+aMMXToUO/9\nf//7343HHnvMMAzD2Lhx4wXT5jU1NcbQoUONzz//3KitrTUeeOABY9u2bQHJUl5ebuTm5hpVVVXG\n119/bSQkJBjHjh3z3n/06FHjnnvu8d5+//33jQcffDAkWb5VUVFhjB8/vkFTPIHMc+bMGSMtLc04\nceKEYRiGT9Pm/spiGIYRHx9vuFwuwzAM47HHHvPplEJzdvToUeMvf/mLUVtba3z++efGnXfeaYwb\nN87Ys2ePYRjnpoHXrl1bZ53ly5cbmzZtMgzDMObMmeO3aXNfsjz++OPGE088YdTW1hoffvihMXz4\ncL9k8TVPdna28eyzzxqGYRjbt283pk2bFrAsR44cMSZNmmSMGzfOWL58+QVT9BkZGcbWrVu9t0eP\nHm38+9//DkmWb51f86HI8uc//9nIzs42DMMwli1b1qD31Cv2A2s7d+5k5cqVvPDCCzgc564f++GH\nH9aZwnz33Xc5cOAAo0aN4vjx41RVVdGuXTuGDRsGnJtSB7jhhhsA6NmzJwcOHCAxMdHvWa655hpG\njBgBnPvf6E9+8hP+/e9/43Q6vcu+OzV97Ngx733BzgJw9uxZpk6dytChQ+v/YEWA8/zzn//E4/Hw\nm9/8Bjh3Oc5Zs2bxxz/+MehZAL7//e/TpUsX4NzVxoqLixuVo7lr06YNQ4YMAc7V3nXXXcenn37K\nT3/6UwB69erF22+/XWedXbt2UVtby2uvvcZ//vMfPvroI5YuXer90FAws1x33XXcdNNN2Gw2unXr\nxhdffNGkDE3N43K5mDlzJnDu9ejLB10bmqW2tpZVq1YB5+qppKSkzjqBunSuL1m+XX5+zYciy44d\nOzh8+DA7duzg6NGj2O12vve97112pveKnDYvLy8nKyuLVatWER0d7V2+f/9+brvtNu/tuXPnsmnT\nJtavX8/UqVNJTU31Nm44d67o5MmT3ia+f/9+brzxxoBk2b17NwsWLADOTUV/8skn/Nd//Zf3/oiI\nCG666Sb27NkDQEFBAX369AlJFoDVq1fzs5/9jNTU1EZlCESeTp06kZ+fz/r161m/fj1Op7PRjduf\n+6ZHjx7s3r0bOHd50PPvl8vLy8vjxRdfBMDtdvP1118TExPDwYMHgYvX4RtvvOH9+/fr14958+Y1\nuXH7mqVv377s2rULOHce8/vf/36TczQlz4033si+ffsA+Oijjxr9HtaYLOvXr2fHjh0AvPnmmyQk\nJNRZJ1CXzvUly6VqPhRZnnnmGTZu3Mj69etJTU1l6tSp9Z6ivSKPvDdv3kxpaan3f5tw7nyC2+32\nHkVfzt/+9jeOHDniPWc5ZcoU7HY7119/PXfddVdAsnTr1o1NmzZx7733UlNTw6RJk2jTpk2dLOnp\n6TzxxBPU1tbSqVOnRp9/92eW1157jeuvv977obkePXpw//33hyxPU/kzy8yZM3nooYdYtmwZ1113\nHVOnTm1yvuYkISGBhx56iHfffZfq6mrmz5+Pw+EgIyODiIgIrr32Wp5++mkA/ud//ofnnnvOVFk6\nd+7M3/72N+69914AnnjiiZDmmTx5Mo899hhbtmwB4LHHHgtYlnbt2vHII4+wfPlyunXr5v3w1axZ\ns1iwYEHALp3rS5ZL1Xzbtm2DniUyMrLR29HlUUVERCzmipw2FxERuZKpeYuIiFiMmreIiIjFqHmL\niIhYjJq3iIiIxah5i4iIWIyat4iIiMWoeYuIiFjM/wfTaXltR3ZT+gAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x7f0588b67c18>"
]
},
"metadata": {
"tags": []
}
}
]
},
{
"metadata": {
"id": "Kg78Hz_8w95E",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
""
]
},
{
"metadata": {
"id": "SbVMcj9EG8le",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **So, the above query calculated Z-scores for a single gene. Let's make another query where we construct a series of bins, one bin per million bases, and compute a Z-score per sample, per bin.**\n",
"\n",
"This is adapted from the Google tutorial (https://codelabs.developers.google.com/codelabs/genomics-vcfbq/index.html?index=..%2F..index#0\n",
")."
]
},
{
"metadata": {
"id": "t1b_AIWRA4QZ",
"colab_type": "code",
"outputId": "3098cd2e-af82-4bd7-c4c2-66e2ce8802e5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
}
},
"cell_type": "code",
"source": [
"\n",
"df13 = pd.io.gbq.read_gbq('''\n",
"\n",
"#standardsql\n",
"WITH ind AS (\n",
" -- count variants for each sample/ref/bin\n",
" SELECT\n",
" call.name AS sample, \n",
" reference_name AS ref, \n",
" FLOOR(start_position/1000000) AS bin, \n",
" COUNT(call.name) AS n\n",
" FROM `isb-cgc.QotM.1000genomes` \n",
" JOIN UNNEST(call) AS call\n",
" JOIN UNNEST(alternate_bases) AS alt\n",
" WHERE alt.alt != '<*>'\n",
" AND (call.genotype[OFFSET(0)] = 1 OR call.genotype[OFFSET(1)] = 1)\n",
" GROUP BY sample, ref, bin\n",
"),\n",
"\n",
"pop AS (\n",
" -- overall all samples in ref/bin\n",
" SELECT \n",
" ref, \n",
" bin, \n",
" AVG(n) AS pop_mu, \n",
" STDDEV(n) AS pop_sigma\n",
" FROM ind\n",
" GROUP BY ref, bin\n",
"),\n",
"\n",
"zscore AS (\n",
" SELECT \n",
" ind.sample, \n",
" ind.n AS ind_n,\n",
" (ind.n-pop.pop_mu)/pop.pop_sigma AS z, \n",
" pop.ref, \n",
" pop.bin, \n",
" pop.pop_mu, \n",
" pop.pop_sigma\n",
" FROM pop, ind\n",
" WHERE ind.ref = pop.ref AND ind.bin = pop.bin\n",
")\n",
"\n",
"SELECT * from zscore\n",
"ORDER BY ABS(Z) DESC\n",
"\n",
"''', project_id=project_id, verbose=False, dialect='standard') \n",
" \n",
"df13.head()\n",
"\n",
"\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>sample</th>\n",
" <th>ind_n</th>\n",
" <th>z</th>\n",
" <th>ref</th>\n",
" <th>bin</th>\n",
" <th>pop_mu</th>\n",
" <th>pop_sigma</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>HG02623</td>\n",
" <td>665</td>\n",
" <td>5.591835</td>\n",
" <td>21</td>\n",
" <td>10.0</td>\n",
" <td>515.824458</td>\n",
" <td>26.677389</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>HG02574</td>\n",
" <td>663</td>\n",
" <td>5.516865</td>\n",
" <td>21</td>\n",
" <td>10.0</td>\n",
" <td>515.824458</td>\n",
" <td>26.677389</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>HG02645</td>\n",
" <td>647</td>\n",
" <td>4.917106</td>\n",
" <td>21</td>\n",
" <td>10.0</td>\n",
" <td>515.824458</td>\n",
" <td>26.677389</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>HG01384</td>\n",
" <td>840</td>\n",
" <td>-4.729486</td>\n",
" <td>21</td>\n",
" <td>39.0</td>\n",
" <td>1712.270611</td>\n",
" <td>184.432426</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>HG03867</td>\n",
" <td>992</td>\n",
" <td>-4.715188</td>\n",
" <td>21</td>\n",
" <td>15.0</td>\n",
" <td>1923.495069</td>\n",
" <td>197.552059</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" sample ind_n z ref bin pop_mu pop_sigma\n",
"0 HG02623 665 5.591835 21 10.0 515.824458 26.677389\n",
"1 HG02574 663 5.516865 21 10.0 515.824458 26.677389\n",
"2 HG02645 647 4.917106 21 10.0 515.824458 26.677389\n",
"3 HG01384 840 -4.729486 21 39.0 1712.270611 184.432426\n",
"4 HG03867 992 -4.715188 21 15.0 1923.495069 197.552059"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"metadata": {
"id": "zVJu5ojlCxfK",
"colab_type": "code",
"outputId": "09e40cd2-c104-49b0-aaf6-7d274729b4ad",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 617
}
},
"cell_type": "code",
"source": [
"import matplotlib\n",
"matplotlib.rcParams['figure.figsize'] = [6, 8]\n",
"\n",
"df13.hist()"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7ff06e966390>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7ff06e496e10>],\n",
" [<matplotlib.axes._subplots.AxesSubplot object at 0x7ff06e363710>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7ff06e1bb550>],\n",
" [<matplotlib.axes._subplots.AxesSubplot object at 0x7ff06e07ee10>,\n",
" <matplotlib.axes._subplots.AxesSubplot object at 0x7ff06e07eef0>]],\n",
" dtype=object)"
]
},
"metadata": {
"tags": []
},
"execution_count": 10
},
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAHhCAYAAABnb2lAAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4yLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvNQv5yAAAIABJREFUeJzs3XtwVGWe//F359Ibo4kQ7GYHCxG8\nACVJkEJZgiC3QIjrmFUTQyo4Ojgry2VhjZJMT4T4s4BwySyiKAgC2TBAhtbRiCxhUGRwiXGhXRbc\nySLUbywmIKQ1MeRGQujfH/zSEG5JOpfuzvm8qqhKP3268z0nT/Pt85znPF+Ty+VyISIihhPg7QBE\nRMQ7lABERAxKCUBExKCUAEREDEoJQETEoJQAREQMSgnAzxQXFxMbG3tNe05ODlu3bvVCRCLir5QA\nuom0tDSmTp3q7TBEPPLf//3fTJ8+vU2viY2Npbi4uJMiMgYlAD+1dOlSJk+eTFxcHA6Hg4yMDN5+\n+20Axo8fz7Zt23j66ad55JFHyM7O9nK0IjcXFRXFe++95+0wDCfI2wFI25WWljJkyBDS09P5/e9/\nz//5P/+HQYMGNdvmP//zP8nPz8fpdDJhwgSee+45/vZv/9ZLEYvcXHFxMZmZmfz85z+nvLycM2fO\nUFJSQs+ePXn77bexWq0cPXqU9PR0Lly4wKOPPtqq933zzTdv+H6iMwC/9Dd/8zdMmTIFgClTpvDn\nP/+Z8+fPN9vm8ccfJzAwkN69e9OrVy9Onz7tjVBF2mzXrl3YbDb27NlDr169eP/99wHIysri2Wef\npbCwkAcffJC//vWv7Xo/UQLwSz169CAg4NKf7rbbbgOgsrKy2TZN7QCBgYE0NjZ2XYAi7TB8+HDu\nvPNOTCYTgwcP5vTp05w/f54jR44QHx8PQFxcHLfccovH7yeXKAH4oZ9++sn9c9N//Lfffru3whHp\nUGFhYe6fm768VFRUAJe/2JhMJsLDwz1+P7lECcAP1dXV8cc//hGAwsJCIiMjMZvNXo5KpPM0fcGp\nqqoC4OLFi82+CIlnlAD80IABA/j666+Ji4tj06ZNLFiwwNshiXSqkJAQBg0a5P7i88knn1xz3Uva\nTrOA/MyIESPYsWMHAPPnz3e3R0VFuX/+7LPPmr3m6sci/igrKwubzcbatWsZM2YM99xzj7dD8nsm\nFYQRETEmDQGJiBiUhoBExG+dOHGCWbNmXfe5e+65h9WrV3dxRP5FQ0AiIgalISAREYPymyGgsrJz\n3g6hTXr2DKW8vMbbYfgMXzkeFktYyxv5kCv7va8cw5tRjB2jI2O8WZ/XGUAnCQoK9HYIPkXHo/38\n4Rgqxo7RVTEqAYiIGJQSgIiIQbUqARw7doyJEyeyefNmAE6fPs20adNISUlh7ty51NfXA1BQUMBT\nTz1FYmIi27dvB6ChocFdrSo1NZWTJ08CUFJSQnJyMsnJySxcuLAz9k1ERG6ixQRQU1PD66+/zsiR\nI91tq1atIiUlhS1bttCvXz/sdjs1NTWsXr2aTZs2kZeXR25uLhUVFezYsYPw8HC2bt3KjBkzyMnJ\nAWDRokXYbDa2bdtGVVUV+/bt67y9FBGRa7Q4C8hsNrNu3TrWrVvnbisuLua1114DYNy4cWzYsIH+\n/fsTGRnpXnp12LBhOBwOioqKSEhIACAmJgabzUZ9fT2lpaXu9WvGjRtHUVFRq6v8eMsvs7Wmjj/Y\nkDHe2yFIF/H0M6k+ckmLCSAoKIigoOab1dbWupcf7tWrF2VlZTidTiIiItzbREREXNMeEBCAyWTC\n6XQ2W8u76T1upmfPUL+4ei/e529TPUW8pd33AdzoRuK2tLfmZmRfn7crvuNm94woOYhc5lECCA0N\npa6ujpCQEM6cOYPVasVqteJ0Ot3bnD17lqFDh2K1WikrK2PQoEE0NDTgcrmwWCzuCj+A+z1EfMWy\nZcs4dOgQFy5c4MUXX+Szzz7jm2++oUePHgBMnz6dsWPHUlBQQG5uLgEBASQlJZGYmEhDQwMZGRmc\nOnWKwMBAlixZQt++fSkpKSErKwuAgQMHuodRRbzFo2mgMTExFBYWArB7925Gjx5NdHQ0R44cobKy\nkurqahwOB8OHD2fUqFHs2rULgL179zJixAiCg4MZMGAABw8ebPYeIr7gyy+/5NtvvyU/P5/169ez\nePFiAF566SXy8vLIy8tj7Nixmvggfq/FM4CjR4+ydOlSSktLCQoKorCwkBUrVpCRkUF+fj59+vQh\nISGB4OBg0tLSmD59OiaTiVmzZhEWFkZ8fDwHDhxg6tSpmM1msrOzAbDZbCxYsICLFy8SHR1NTExM\np++sSGs89NBD7gkK4eHh1NbWXreO7OHDh7v9xAfp3lpMAEOGDCEvL++a9o0bN17TFhcXR1xcXLO2\nplPgq917771s2bKlLbGKdInAwEBCQ0MBsNvtjBkzhsDAQDZv3szGjRvp1asXr776aqdPfBDpbH6z\nGJxIV9uzZw92u50NGzZw9OhRevToweDBg3n33Xd56623ePDBB5tt39ETH+Da2W/+cBFbMXaMrohR\nCUDkOvbv38+aNWtYv349YWFhzW6EHD9+PFlZWUyePLnTJz5cOfvNYgnz+VVx/SFG8P3VhTvyOGo1\nUJE2OHfuHMuWLWPt2rXuWT9z5sxxL2NSXFzMfffdp4kP4vd0BiBylZ07d1JeXs68efPcbU8++STz\n5s3jlltuITQ0lCVLlhASEqKJD+LX/KYkpC+csmkpCP9ws9v8/WHs90pX9nt/GF7p6hi761IQGgIS\nEZFOpQQgImJQSgAiIgalBCAiYlBKACIiBqVpoCLiEzTLruvpDEBExKCUAEREDMqjIaDq6mrS09P5\n6aefaGhoYNasWVgslusWu1i/fj27du3CZDIxe/ZsHn30Uc6dO0daWhrnzp0jNDSUnJwc9y33IiLS\nNTxKAH/4wx/o378/aWlpnDlzhl/84hdYLBZsNhtRUVGkpaWxb98+BgwYwM6dO90FMFJSUnjkkUfI\nzc3l4Ycf5oUXXiA/P59169bxyiuvdPS+iYjITXg0BNSzZ0/3yoaVlZX06NHjusUuiouLGT16NGaz\nmYiICO68806OHz9OUVERsbGxzbYVEZGu5VECeOyxxzh16hSxsbGkpqYyf/786xa7aE3BjF69enH2\n7Nl27oaIiLSVR0NAH330EX369OG9996jpKTEvQpik64ojCFyI/624JuIt3iUABwOB4888ggAgwYN\n4vz581y4cMH9fFOxC6vVyv/9v//3uu1lZWWEhYV5VBhD5GZutopia5PDsmXLOHToEBcuXODFF18k\nMjKS+fPn09jYiMViYfny5ZjNZgoKCsjNzSUgIICkpCQSExNpaGggIyODU6dOuUui9u3bl5KSkutO\nlBDxFo+GgPr168fhw4cBKC0t5dZbb+Wee+65ptjF3/3d3/H5559TX1/PmTNnOHv2LPfee2+zghkq\njCG+5ssvv+Tbb78lPz+f9evXs3jxYlatWkVKSgpbtmyhX79+2O12ampqWL16NZs2bSIvL4/c3Fwq\nKirYsWMH4eHhbN26lRkzZpCTkwPAokWLsNls7kkR+/bt8/KeitF5dAbwzDPPYLPZSE1N5cKFC2Rl\nZWGxWK5b7CIpKYnU1FRMJhNZWVkEBAQwbdo0XnnlFVJSUggPD2f58uUdulMi7fHQQw+5JzSEh4dT\nW1tLcXGx+xv7uHHj2LBhA/379ycyMtI9/Dls2DAcDgdFRUUkJCQAEBMTg81mo76+/roTJR599FEv\n7KHIJR4lgFtvvZU33njjmvYtW7Zc0zZt2jSmTZt2zevffvttT371dekWculIgYGBhIaGAmC32xkz\nZgxffPEFZrMZaNskh4CAAEwmE06n87oTJUS8SWsBidzAnj17sNvtbNiwgUmTJrnb2zLJ4Ubtnk5+\n8IcL3IqxY3RFjEoAItexf/9+1qxZw/r16wkLCyM0NJS6ujpCQkKaTWZwOp3u15w9e5ahQ4e6JzkM\nGjSIhoYGXC4XFovFfe8M4NHkB5WE7Di+HqNKQop4yblz51i2bBlr1651L1ESExNDYWEhcHniQnR0\nNEeOHKGyspLq6mocDgfDhw9vNslh7969jBgxguDgYAYMGHDNRAkRb9IZgMhVdu7cSXl5OfPmzXO3\nZWdnk5mZSX5+Pn369CEhIYHg4GDS0tKYPn06JpPJfT9MfHw8Bw4cYOrUqZjNZrKzswGw2WzXnSgh\n4i0mV2sHI73sZqdDuggsV9qQMf6Gz/nD2O+Vruz3/jC80p4Yu/JzfLM+4gs0BCQiIp1KCUBExKCU\nAEREDEoJQETEoJQAREQMSglARMSglABERAxKCUBExKCUAEREDMrjpSAKCgpYv349QUFB/PM//zMD\nBw5sd8UkERHpOh6dAZSXl7N69Wq2bNnCmjVr+PTTTzukYpKIiHQdjxJAUVERI0eO5LbbbsNqtfL6\n669TXFzMhAkTgMvVjg4fPuyumBQSEtKsYlJsbCxwaZVFh8PRcXskIiKt4tEQ0F//+lfq6uqYMWMG\nlZWVzJkzh9ra2nZVTKqvr3e/XsTbjh07xsyZM3nuuedITU0lIyODb775xr089PTp0xk7dqyKwvsp\nTxae8/UF5Dzh8TWAiooK3nrrLU6dOsWzzz7brMJRR1RMutrVlZFEbqS9K37W1NTw+uuvM3LkyGbt\nL730EuPGjWu23erVq7Hb7QQHB/P0008TGxvL3r17CQ8PJycnhy+++IKcnBxWrlzpLgofFRVFWloa\n+/btU01g8SqPhoB69erFgw8+SFBQEHfddRe33nort956K3V1dQA3rZjU1N5UD7WpYlJL3/7Ly2so\nKzt33X8iV7pRP2ltXzGbzaxbt67Fil1tGeK8UVF4EW/yKAE88sgjfPnll1y8eJHy8nJqamraXTFJ\nxFcEBQUREhJyTfvmzZt59tln+Zd/+Rd+/PFHFYUXv+fREFDv3r2ZPHkySUlJAGRmZhIZGUl6enq7\nKiaJ+KonnniCHj16MHjwYN59913eeustHnzwwWbbqCj8Jf4Qoye6er98uih8cnIyycnJzdo2btx4\nzXZxcXHExcU1a2u6MCbiL668HjB+/HiysrKYPHmyisJfxR9i9FRX7pcqgon4kDlz5nDy5EkAiouL\nue+++1QUXvyeisKLXOXo0aMsXbqU0tJSgoKCKCwsJDU1lXnz5nHLLbcQGhrKkiVLCAkJUVF48Wsq\nCi/djorCe4+/FIX3RFfeB6AhIBER6VRKACIiBqUEICJiUEoAIiIGpQQgImJQSgAiIgalBCAiYlBK\nACIiBqUEICJiUEoAIiIGpbWARKTD+fqyDnJJu84A6urqmDhxIh988AGnT59m2rRppKSkMHfuXOrr\n6wEoKCjgqaeeIjExke3btwOXqoClpaUxdepUUlNT3assiohI12lXAnjnnXe4/fbbAVi1ahUpKSls\n2bKFfv36Ybfb3TVTN23aRF5eHrm5uVRUVLBjxw7Cw8PZunUrM2bMICcnp0N2RkREWs/jBHDixAmO\nHz/O2LFjgUtrpE+YMAG4XO+0LTVTRXzJsWPHmDhxIps3bwbokDPckpISdyGlhQsXemfHRK7gcQJY\nunQpGRkZ7se1tbXuwu5N9U7bUjO16QN1Iz17hmKxhF33n8iVbtRPWttXampqeP3115tVAeuIM9xF\nixZhs9nYtm0bVVVV7Nu3r1P2X6S1PEoAH374IUOHDqVv377Xfb4ttVFv1n6l8vIaysrOXfefyJVu\n1E9a21fMZjPr1q1rVrKxvWe49fX1lJaWEhUV1ew9RLzJo1lAn3/+OSdPnuTzzz/n+++/x2w2Exoa\nSl1dHSEhIe56p1artdU1U5vOHkS8LSgoiKCg5h+N9p7hOp1OwsPD3ds2vYeIN3mUAFauXOn++c03\n3+TOO+/k66+/prCwkCeeeMJd7zQ6OprMzEwqKysJDAzE4XBgs9moqqpi165djB492l0zVcRfdMQZ\nbmsL8fXsGUpQUKD7sT8MefpDjJ7o6v3qit/XYfcBzJkzh/T0dPLz8+nTpw8JCQkEBwe3qWaqiK9q\n7xmuxWKhoqLCvW3Te7SkvLzG/XN3Lwnp67pyv7qqJGS7E8CcOXPcP2/cuPGa5+Pi4oiLi2vWFhgY\nyJIlS9r7q0W6TExMTLvOcIODgxkwYAAHDx5k+PDh7N69m2nTpnl7t8TgdCewyFWOHj3K0qVLKS0t\nJSgoiMLCQlasWEFGRka7znBtNhsLFizg4sWLREdHExMT4+U9FaMzuVo7GOllNzsd0m3ncqUNGeNv\n+Jy/jU9f2e/9YXilKcbu+Jm8Wb/qaF01BKTF4EREDEoJQETEoJQAREQMSglARMSglABERAxK00BF\n/JAns2y6chaL+AedAYiIGJQSgIiIQSkBiIgYlBKAiIhBKQGIiBiUx7OAli1bxqFDh7hw4QIvvvgi\nkZGRzJ8/n8bGRiwWC8uXL8dsNlNQUEBubi4BAQEkJSWRmJhIQ0MDGRkZnDp1yr0y6I2qi4mISOfw\nKAF8+eWXfPvtt+Tn51NeXs4//MM/MHLkSFJSUpgyZQq//e1vsdvtJCQksHr1aux2O8HBwTz99NPE\nxsayd+9ewsPDycnJ4YsvviAnJ6dZkRkRX1NcXMzcuXO57777ALj//vt54YUX9KVH/JpHQ0APPfQQ\nb7zxBgDh4eHU1ta2u2aqiK97+OGHycvLIy8vj1dffbVDCsWLeJNHCSAwMJDQ0FAA7HY7Y8aMaXfN\n1Pr6+vbui0iX0pce8XftuhN4z5492O12NmzYwKRJk9ztHVEz9WpX10YVuZHOWvP/+PHjzJgxg59+\n+onZs2d3yJeepteLeIPHCWD//v2sWbOG9evXExYW1u6aqS19EK6sjSpyMzcrpOFpcrj77ruZPXs2\nU6ZM4eTJkzz77LM0Nja6n++MLz3QsUXhu6oYjr8V3WktFYX//86dO8eyZcvYtGkTPXr0ANpfM1XE\nl/Xu3Zv4+HgA7rrrLu644w6OHDnSqV96oGOLwndFNTF/qFrmKRWF//927txJeXk58+bNc7dlZ2eT\nmZnZrpqpIr6qoKCAsrIypk+fTllZGT/88ANPPvmkvvQYSHdcgM+jBPDMM8/wzDPPXNO+cePGa9ri\n4uKIi4tr1tY0DU7EX4wfP56XX36ZTz/9lIaGBrKyshg8eDDp6en60iN+S8tBi7TCbbfdxpo1a65p\n15ce8WdaCkJExKB0BiBiEJ6MYYPvj2OL53QGICJiUEoAIiIGpQQgImJQugYgIjfl6bUD8X06AxAR\nMSglABERg1ICEBExKCUAERGDUgIQETEoJQAREYNSAhARMSiv3gewePFiDh8+jMlkwmazERUV5c1w\nRDqd+rz4Eq8lgK+++orvvvuO/Px8Tpw4gc1mIz8/31vhiHQ69Xnj8fUiMl4bAioqKmLixIkA3HPP\nPfz0009UVVV5KxyRTqc+L77GawnA6XTSs2dP9+OIiAjKysq8FY5Ip1OfF1/jM2sBuVyumz5/s8LG\nH+c80dHhiHS6lvo8XNvvmx6rz3d/N/s/r6N47QzAarXidDrdj8+ePYvFYvFWOCKdTn1efI3XEsCo\nUaMoLCwE4JtvvsFqtXLbbbd5KxyRTqc+L77Ga0NAw4YN44EHHiA5ORmTycTChQu9FYpIl1CfF19j\ncrVmIFJERLod3QksImJQSgAiIgblM9NA/d2yZcs4dOgQFy5c4MUXXyQyMpL58+fT2NiIxWJh+fLl\nmM1mb4fZperq6vj7v/97Zs6cyciRIw1/PNrDF5aQOHbsGDNnzuS5554jNTWV06dPX/dvWlBQQG5u\nLgEBASQlJZGYmEhDQwMZGRmcOnWKwMBAlixZQt++fTs8xtZ+Dr0RY21tLRkZGfzwww+cP3+emTNn\nMmjQIO/G55J2Kyoqcr3wwgsul8vl+vHHH12PPvqoKyMjw7Vz506Xy+Vy5eTkuH73u995M0Sv+O1v\nf+t68sknXe+//76ORzsUFxe7/vEf/9Hlcrlcx48fdyUlJXV5DNXV1a7U1FRXZmamKy8vz+Vyua77\nN62urnZNmjTJVVlZ6aqtrXU99thjrvLyctcHH3zgysrKcrlcLtf+/ftdc+fO7fAYW/s59FaMn3zy\nievdd991uVwu11//+lfXpEmTvB6fhoA6wEMPPcQbb7wBQHh4OLW1tRQXFzNhwgQAxo0bR1FRkTdD\n7HInTpzg+PHjjB07FsDwx6M9fGEJCbPZzLp167Bare626/1NDx8+TGRkJGFhYYSEhDBs2DAcDgdF\nRUXExsYCEBMTg8Ph6PAYW/s59FaM8fHx/OpXvwLg9OnT9O7d2+vxKQF0gMDAQEJDQwGw2+2MGTOG\n2tpa9xBHr169DHfL/9KlS8nIyHA/NvrxaA9fWEIiKCiIkJCQZm3X+5s6nU4iIiKuifXK9oCAAEwm\nE/X19R0aY2s/h96MESA5OZmXX34Zm83m9fiUADrQnj17sNvtLFiwoFm7y2AzbT/88EOGDh16w/FJ\nox2PjuaLx+9GMbW1vSO09XPY1TFu27aNd955h1deeaXZ7/BGfEoAHWT//v2sWbOGdevWERYWRmho\nKHV1dQCcOXOm2alzd/f555/z6aefkpSUxPbt23n77bcNfTzay1eXkLje3/R6sTa1N521NDQ04HK5\nOmUSQGs+h96K8ejRo5w+fRqAwYMH09jYyK233urV+JQAOsC5c+dYtmwZa9eupUePHsClMbqm2/53\n797N6NGjvRlil1q5ciXvv/8+v//970lMTGTmzJmGPh7t5atLSFzvbxodHc2RI0eorKykuroah8PB\n8OHDGTVqFLt27QJg7969jBgxosPjae3n0FsxHjx4kA0bNgCXhvVqamq8Hp/uBO4A+fn5vPnmm/Tv\n39/dlp2dTWZmJufPn6dPnz4sWbKE4OBgL0bpHW+++SZ33nknjzzyCOnp6YY/Hp5asWIFBw8edC8h\nMWjQoC79/UePHmXp0qWUlpYSFBRE7969WbFiBRkZGdf8TXft2sV7772HyWQiNTWVn//85zQ2NpKZ\nmclf/vIXzGYz2dnZ/OxnP+vQGNvyOfRGjHV1dfzmN7/h9OnT1NXVMXv2bIYMGXLdz0VXxacEICJi\nUBoCEhExKCUAERGDUgIQETEoJQAREYNSAhARMSglABHpNubPn89nn33m7TD8hqaBiogYlM4AvKS4\nuJjHH3+c7OxsJk+ezPjx4/mv//ovzp8/z4IFC5g8eTJTpkwhOzubxsZGAAYOHMi//du/8cQTTzBy\n5Ei2bt3a4u/JyMjgX//1X5k2bRp/93d/x29/+1u2b9/O448/zvjx4/nv//5vAKZNm8ZHH33kft3V\nj0Vao6v69VdffcU//MM/EB8fz5QpU/j3f/93oHm//eCDDxg1ahQ///nP+eCDDxg4cKC7/Z//+Z9J\nS0tj7NixPP/88xw8eJDk5GRiYmLIz88H4OLFi7z22mvu/XjllVdoaGjojMPmNUoAXnTixAmioqIo\nLCzkn/7pn8jKyiI3N5fvv/+eTz75hD/84Q8cPHiQHTt2uF/z3Xff8dFHH/G73/2OxYsXU15e3uLv\n+dOf/sTatWv5t3/7N9avX8+PP/7Ixx9/zOTJk8nLy+vMXRQD6op+vXTpUn7961+zc+dO3nnnHfbs\n2dPs+YqKCl577TU2btzIhx9+yBdffNHs+f379zN79mx2797NiRMnWL9+Pb/73e9YtGgRb7/9NgB/\n/OMf3XH++7//O9988w07d+7soKPkG5QAvCg0NJQpU6YAMGnSJP785z9TWFhIUlKSe/ndxx9/nP/4\nj/9wv+app54CYMCAAfTv39/9Df5mYmJiCA0N5b777uPixYuMGzcOgPvvv5+zZ892wp6JkXVFv+7V\nqxcffvghJ06c4O677yYnJ6fZ84cPH+buu+/m/vvvJyAggKlTpzZ7/t5776V///6YzWb69evHI488\nQmBgYLPPxOTJk3n//fcJDg7mb/7mb4iMjOTkyZPtPj6+RAnAi8LDwzGZTO6fAaqqqrj99tvd29x+\n++388MMPzR5f+XNlZWWLv+fWW28FwGQyERAQ4F4zPSAggIsXL7Z/R0Su0BX9evHixdxyyy08//zz\nTJo0yb1IWpPKyspm79m7d+9mzzd9JqB5HYHAwED3Z+LHH38kPT2dyZMnExcXx6effuqTS3G3hxKA\nF1VUVLh//umnn4BLH5gr2ysqKrjjjjvcj688Na6oqGjWydvj6mTQFI9IW3VFv77jjjt49dVX+dOf\n/sSCBQv49a9/TXV1tfv52267jZqaGvdjT850//Vf/5WgoCA+/vhjdu3axaOPPtrm9/B1SgBeVFdX\n5x67LCwsZMiQIUyePBm73U5jYyM1NTV89NFHzTreJ598AlwaZ/3uu++Ijo7ukFgsFgslJSUAfP31\n1/zlL3/pkPcV4+nsft3Q0MC0adPc/6k/8MADBAUFERBw+b+zBx54gP/93//lu+++4+LFi9jt9jbv\nxw8//MD999+P2WympKSEr7/+ullS6Q6CvB2Akd15550cOnSI5cuX09DQwMqVKxk4cCAnT57kscce\nw2QyERcX5x5PhUvl4Z544gnOnDlDZmZmh50BPP/887z00kv86U9/4uGHH2bUqFEd8r5iPJ3dr4OD\ng3n66ad57rnngEtnr5mZmdxyyy3ubaxWKy+99BLPPvssd9xxB8nJyfzhD39o03788pe/JD09nQ8+\n+IDhw4eTnp7Ob37zG6KioprF7s90H4CXFBcXk5mZyR//+MdWv2bgwIHs27ePv/3bv+3EyEQ850v9\n2uVyua9FfPvtt6SkpPCf//mfHfo7/J2GgESk27lw4QKjR4/m8OHDAOzcuZOhQ4d6OSrfoyEgP1dU\nVMRrr7123ediYmKuKYwt4g/a26+DgoJYuHAh6enpuFwuLBYLixYt6oxQ/ZqGgEREDEpDQCIiBqUE\nICJiUH5zDaCs7Jy3Q+hSPXuGUl7eveYct0Vn7b/FEtbh79mZmvq9r/cHX48PfD9Gb/R5nQH4qKCg\nQG+H4FVG3/+r+frx8PX4wPdj9EZ8SgAiIgalBCAiYlCtugawbNkyDh06xIULF3jxxRf57LPP+Oab\nb+jRowcA06dPZ+zYsRQUFJD3b5wEAAAgAElEQVSbm0tAQABJSUkkJibS0NBARkYGp06dIjAwkCVL\nltC3b19KSkrIysoCLt0JeKM5vyLecHWfj4yMZP78+TQ2NmKxWFi+fDlms1l9Xvxai2cAX375Jd9+\n+y35+fmsX7+exYsXA/DSSy+Rl5dHXl4eY8eOpaamhtWrV7Np0yby8vLIzc2loqKCHTt2EB4eztat\nW5kxY4Z73e5FixZhs9nYtm0bVVVV7Nu3r3P3VKSVrtfnV61aRUpKClu2bKFfv37Y7Xb1efF7LZ4B\nPPTQQ0RFRQGXlnStra11l3K70uHDh4mMjCQs7NIV52HDhuFwOCgqKiIhIQG4dAefzWajvr6e0tJS\n9/uOGzeOoqKibrncanf1y+y2F97ekDG+EyLpeNfr88XFxe5v7OPGjWPDhg30799ffd5P+XL/9SQ2\n8Cy+FhPAlcUS7HY7Y8aMITAwkM2bN7Nx40Z69erFq6++itPpJCIiwv26iIgIysrKmrUHBARgMplw\nOp3uQhFwqbpPWVnZTePo2TPU56/idzR/m7LYkrbuj7f2/3p9/osvvsBsNgOX+2tn93lo3u99vT/4\nenzQvhi7Yv+6Or5W3wewZ88e7HY7GzZs4OjRo/To0YPBgwfz7rvv8tZbb/Hggw822/5GK0xcr701\nq1H48vzdzmCxhHW7ex/asj+dtf9t+ZBc2ecnTZrkbm9L375Re2tXYGnq977eH3w9Pmh/jJ29f50V\nX7vvA9i/fz9r1qxh3bp1hIWFMXLkSAYPHgzA+PHjOXbsGFarFafT6X7N2bNnsVqtWK1W9zedhoYG\n98JMV1YHOnPmDFartTWhiHSJq/t8aGgodXV1wOX+qj4v/q7FBHDu3DmWLVvG2rVr3bN+5syZ4y6O\nXFxczH333Ud0dDRHjhyhsrKS6upqHA4Hw4cPZ9SoUe56nXv37mXEiBEEBwczYMAADh48CMDu3bsZ\nPXp0Z+2jSJtcr8/HxMRQWFgIXO6v6vPi71ocAtq5cyfl5eXMmzfP3fbkk08yb948brnlFkJDQ1my\nZAkhISGkpaUxffp0TCYTs2bNIiwsjPj4eA4cOMDUqVMxm81kZ2cDYLPZWLBgARcvXiQ6OpqYmJjO\n20uRNrhen8/OziYzM5P8/Hz69OlDQkICwcHB6vPi1/xmOWhfH1/saL4+ptrZsyh84RqAL2g6Br7e\nH3w9Pmgeoy/OAmqKr6NnAWktIBERuYYSgIiIQSkBiIgYlBKAiIhB+U1BGPF/vnjhTcTIdAYgImJQ\nSgAiIgalISARkVbojkOYOgMQETEoJQAREYNSAhARMSglABERg1ICEBExqFbNAlq2bBmHDh3iwoUL\nvPjii0RGRjJ//nwaGxuxWCwsX74cs9lMQUEBubm5BAQEkJSURGJiIg0NDWRkZHDq1CkCAwNZsmQJ\nffv2paSkhKysLAAGDhzorrcqIiJdo8UzgC+//JJvv/2W/Px81q9fz+LFi1m1ahUpKSls2bKFfv36\nYbfbqampYfXq1WzatIm8vDxyc3OpqKhgx44dhIeHs3XrVmbMmEFOTg4AixYtwmazsW3bNqqqqti3\nb1+n76yIiFzWYgJ46KGHeOONNwAIDw+ntraW4uJiJkyYAMC4ceMoKiri8OHDREZGEhYWRkhICMOG\nDcPhcFBUVERsbCxwqaqSw+Ggvr6e0tJSoqKimr2HiIh0nRYTQGBgIKGhoQDY7XbGjBlDbW0tZrMZ\ngF69elFWVobT6SQiIsL9uoiIiGvaAwICMJlMOJ1OwsPD3ds2vYeIiHSdVt8JvGfPHux2Oxs2bGDS\npEnu9hsVFGtLe2uKkvXsGUpQUGAro+0e/K16VWfQMRDpPK1KAPv372fNmjWsX7+esLAwQkNDqaur\nIyQkhDNnzmC1WrFarTidTvdrzp49y9ChQ7FarZSVlTFo0CAaGhpwuVxYLBYqKirc2za9x82Ul9d4\nuIv+yR9K7HWFjj4GSigil7U4BHTu3DmWLVvG2rVr6dGjB3BpLL+wsBCA3bt3M3r0aKKjozly5AiV\nlZVUV1fjcDgYPnw4o0aNYteuXQDs3buXESNGEBwczIABAzh48GCz9xDxFceOHWPixIls3rwZgIyM\nDB5//HGmTZvGtGnT+PzzzwEoKCjgqaeeIjExke3btwPQ0NBAWloaU6dOJTU1lZMnTwJQUlJCcnIy\nycnJLFy40Cv7JXKlFs8Adu7cSXl5OfPmzXO3ZWdnk5mZSX5+Pn369CEhIYHg4GDS0tKYPn06JpOJ\nWbNmERYWRnx8PAcOHGDq1KmYzWays7MBsNlsLFiwgIsXLxIdHU1MTEzn7aVIG9TU1PD6668zcuTI\nZu0vvfQS48aNa7bd6tWrsdvtBAcH8/TTTxMbG8vevXsJDw8nJyeHL774gpycHFauXOme+RYVFUVa\nWhr79u3j0Ucf7erdE3FrMQE888wzPPPMM9e0b9y48Zq2uLg44uLimrU1zf2/2r333suWLVvaEqtI\nlzCbzaxbt45169bddLsrZ74BzWa+JSQkAJfOlm022w1nvikBiDfpTmCRqwQFBRESEnJN++bNm3n2\n2Wf5l3/5F3788UfNfBO/p3oAIq3wxBNP0KNHDwYPHsy7777LW2+9xYMPPthsm46e+QbNZ7/5+gVs\nX48Puj7Gtv6+9sTnyWuVAERa4crrAePHjycrK4vJkyd36sw3uDz7zddnhfl6fOCdGNvy+9ob341e\ne7PEoCEgkVaYM2eOezZPcXEx9913n2a+id/TGYDIVY4ePcrSpUspLS0lKCiIwsJCUlNTmTdvHrfc\ncguhoaEsWbKEkJAQzXwTv6YEIHKVIUOGkJeXd0375MmTr2nTzDfxZxoCEhExKCUAERGDUgIQETEo\nJQAREYNSAhARMSglABERg1ICEBExKCUAERGDatWNYMeOHWPmzJk899xzpKamkpGRwTfffOMuEDN9\n+nTGjh1LQUEBubm5BAQEkJSURGJiIg0NDWRkZHDq1Cn3DTJ9+/alpKSErKwsAAYOHMhrr73WaTsp\nIuINv8z+zNsh3FSLCUDFMUREuqcWh4CaimO0tHLhlcUxQkJCmhXHiI2NBS4Vx3A4HDcsjiEiIl2n\nxQSg4hgiIt2TR4vBeaM4xpWFMYzCHwpsdDYdA5HO41EC8EZxjKbCGEbhDwU2ukJHHwMlFJHLPJoG\nquIYIiL+r8UzABXHEBHpnkyu1lan9jKjDYf4+hBQV81v3pAxvkPfz9+GgJr6gK/3B1+PD5rH6Ovz\n8z1xo8+KagKLiMg1lABERAxKCUDkOo4dO8bEiRPZvHkzAKdPn2batGmkpKQwd+5c6uvrASgoKOCp\np54iMTGR7du3A9DQ0EBaWhpTp04lNTXVPWGipKSE5ORkkpOTWbhwoXd2TOQKSgAiV7ne8ierVq0i\nJSWFLVu20K9fP+x2u3v5k02bNpGXl0dubi4VFRXs2LGD8PBwtm7dyowZM8jJyQFwL3+ybds2qqqq\n2Ldvn7d2UQRQAhC5xvWWPykuLmbChAnA5aVLtPyJ+DuPbgQT3+TJzIaOnmXT0byxT0FBQQQFNf9o\n1NbWYjabgctLl2j5E+/rjrN5upISgEgbtWWZkxu1t3b29ZVLoPj6FFZfj6+78+T4KwGItEJoaCh1\ndXWEhIS4ly6xWq2duvwJXF4Cxdfn2ft6fEZwo+Ov+wBE2ikmJobCwkLg8tIlWv5E/J3OAESucr3l\nT1asWEFGRgb5+fn06dOHhIQEgoODtfyJ+DUlAJGrDBkyhLy8vGvaN27ceE1bXFwccXFxzdqaSp9e\n7d5772XLli0dF6hIO2kISETEoJQAREQMSglARMSgWpUAtC6KiEj302IC0LooIiLdU4sJQOuiiIh0\nTy1OA/WVdVGuvCXeKLri1vruuJaKliQQaZ123wfQVeuiNN0SbxS6td5zNztuSg4il3k0C6hpXRTg\npuuiNLU3fbtv77ooIiLScTw6A2haF+WJJ55oti5KZmYmlZWVBAYG4nA4sNlsVFVVsWvXLkaPHn3d\ndVGGDx/O7t27mTZtWkfvm0i31R2X/pau12IC0LooIiLdU4sJQOuiiIh0T7oTWETEoJQAREQMSglA\nRMSglABERAxKCUBExKC6RUUwzYkWEWk7nQGIiBiUEoCIiEF1iyEgkc5WXFzM3Llzue+++wC4//77\neeGFF5g/fz6NjY1YLBaWL1+O2WymoKCA3NxcAgICSEpKIjExkYaGBjIyMjh16pT75si+fft6ea/E\n6JQARFrp4YcfZtWqVe7Hv/71r0lJSWHKlCn89re/xW63k5CQwOrVq7Hb7QQHB/P0008TGxvL3r17\nCQ8PJycnhy+++IKcnBxWrlzpxb0R0RCQiMfaWxhJxNt0BiDSSsePH2fGjBn89NNPzJ49u92Fkerr\n692vv5ErCyG1t5ZBZ9dCaO/7P572UQdFYkyeHH8lAJFWuPvuu5k9ezZTpkzh5MmTPPvsszQ2Nrqf\n74jCSNfTVAipIwoEdWaBIRUw8r4bHf+bJQaPEoAuiInR9O7dm/j4eADuuusu7rjjDo4cOUJdXR0h\nISE3LYw0dOhQd2GkQYMGuQsjtfTtX6SzeXwN4OGHHyYvL4+8vDxeffVVVq1aRUpKClu2bKFfv37Y\n7XZqampYvXo1mzZtIi8vj9zcXCoqKtixYwfh4eFs3bqVGTNmkJOT05H7JNLhCgoKeO+99wAoKyvj\nhx9+4Mknn6SwsBCgWWGkI0eOUFlZSXV1NQ6Hg+HDhzNq1Ch27doF4C6MJOJtHTYEVFxczGuvvQZc\nuiC2YcMG+vfv774gBjS7IJaQkABcuiBms9k6KgyRTjF+/HhefvllPv30UxoaGsjKymLw4MGkp6e3\nqzCSiDd5nAC8cUFMxFtuu+021qxZc017ewsjiXiTRwnAGxfErpwN0RE6e0ZER/CHGH2RjptI63iU\nALxxQaxpNkRH8fUZC5pV4bmbHTclB5HLPLoIrAtiIiL+z6MzAF0QExHxfx4lAF0QExHxf1oLSETE\noJQAREQMSglARMSgtBiciEF4UjsbVD+7O9MZgIiIQSkBiIgYlGGHgDw9HfaETqFFxBfpDEBExKCU\nAEREDEoJQETEoJQAREQMSglARMSglABERAxKCUBExKC8eh/A4sWLOXz4MCaTCZvNRlRUlDfDEel0\n6vPiS7yWAL766iu+++478vPzOXHiBDabjfz8fG+FI9Lp1OfF13htCKioqIiJEycCcM899/DTTz9R\nVVXlrXBEOp36vPgaryUAp9NJz5493Y8jIiIoKyvzVjginU59XnyNz6wF5HK5bvq8xRJ2w+c+znmi\no8PxCTfb5+vprsehu2qpz0PzPnDlz774t25rf72aL+5Td+e1MwCr1YrT6XQ/Pnv2LBaLxVvhiHQ6\n9XnxNV5LAKNGjaKwsBCAb775BqvVym233eatcEQ6nfq8+BqvDQENGzaMBx54gOTkZEwmEwsXLvRW\nKCJdQn1efI3J1ZqBSBER6XZ0J7CIiEEpAYiIGJTPTAM1imPHjjFz5kyee+45UlNTOX36NPPnz6ex\nsRGLxcLy5csxm80UFBSQm5tLQEAASUlJJCYm0tDQQEZGBqdOnSIwMJAlS5bQt29fb+9SmyxbtoxD\nhw5x4cIFXnzxRSIjIw21/23la0tHtKf/doX29K/OVltbS0ZGBj/88APnz59n5syZDBo0yLvxuaTL\nVFdXu1JTU12ZmZmuvLw8l8vlcmVkZLh27tzpcrlcrpycHNfvfvc7V3V1tWvSpEmuyspKV21treux\nxx5zlZeXuz744ANXVlaWy+Vyufbv3++aO3eu1/bFE0VFRa4XXnjB5XK5XD/++KPr0UcfNdT+t1Vx\ncbHrH//xH10ul8t1/PhxV1JSklfjaW//7Wzt7V+d7ZNPPnG9++67LpfL5frrX//qmjRpktfj0xBQ\nFzKbzaxbtw6r1epuKy4uZsKECQCMGzeOoqIiDh8+TGRkJGFhYYSEhDBs2DAcDgdFRUXExsYCEBMT\ng8Ph8Mp+eOqhhx7ijTfeACA8PJza2lpD7X9b+drSEe3tv52tvf2rs8XHx/OrX/0KgNOnT9O7d2+v\nx6cE0IWCgoIICQlp1lZbW4vZbAagV69elJWV4XQ6iYiIcG/TtGTAle0BAQGYTCbq6+u7bgfaKTAw\nkNDQUADsdjtjxowx1P63la8tHdHe/tvZ2tu/ukpycjIvv/wyNpvN6/EpAfgQ1w1m5La13dft2bMH\nu93OggULmrUbZf895ev76yt/p47qX51l27ZtvPPOO7zyyivNfrc34lMC8LLQ0FDq6uoAOHPmDFar\n9bpLBjS1N30TaGhowOVyub89+Iv9+/ezZs0a1q1bR1hYmOH2vy38YemItvz9ukJ7+ldnO3r0KKdP\nnwZg8ODBNDY2cuutt3o1PiUAL4uJiXEvD7B7925Gjx5NdHQ0R44cobKykurqahwOB8OHD2fUqFHs\n2rULgL179zJixAhvht5m586dY9myZaxdu5YePXoAxtr/tvKHpSPa8vfrbO3tX53t4MGDbNiwAbg0\nvFdTU+P1+HQncBc6evQoS5cupbS0lKCgIHr37s2KFSvIyMjg/Pnz9OnThyVLlhAcHMyuXbt47733\nMJlMpKam8vOf/5zGxkYyMzP5y1/+gtlsJjs7m5/97Gfe3q1Wy8/P580336R///7utuzsbDIzMw2x\n/55YsWIFBw8edC8dMWjQIK/F0t7+29na2786W11dHb/5zW84ffo0dXV1zJ49myFDhpCenu61+JQA\nREQMSkNAIiIGpQQgImJQSgAiIgalBCAiYlBKACIiBqUEICJiUEoAIiIGpXoAfi43N5etW7e6H3/3\n3XesXr2a8ePHezEqEfEHuhGsG/n444/ZuHEj27Zt69Zr5IhIx1AC6CZOnjxJSkoKeXl53H333d4O\nR0T8gK4BdAMNDQ289NJLpKWl6T9/EWk1JYBuYOXKlfTr14+EhARvhyIifkQXgf3cgQMH2LNnD++/\n/763QxERP6NrAH7uF7/4Bf/7v//rXv8c4JlnnuH555/3YlQi4g+UAEREDErXAEREDKpVCaCuro6J\nEyfywQcfcPr0aaZNm0ZKSgpz586lvr4egIKCAp566ikSExPZvn07cGl2SlpaGlOnTiU1NZWTJ08C\nUFJSQnJyMsnJySxcuLCTdk1ERG6mVQngnXfe4fbbbwdg1apVpKSksGXLFvr164fdbqempobVq1ez\nadMm8vLyyM3NpaKigh07dhAeHs7WrVuZMWMGOTk5ACxatAibzca2bduoqqpi3759nbeHIiJyXS0m\ngBMnTnD8+HHGjh0LQHFxMRMmTABg3LhxFBUVcfjwYSIjIwkLCyMkJIRhw4bhcDgoKioiNjYWuFSc\n2eFwUF9fT2lpKVFRUc3eQ0REulaLCWDp0qVkZGS4H9fW1rqXGejVqxdlZWU4nU4iIiLc20RERFzT\nHhAQgMlkwul0Eh4e7t626T1ERKRr3TQBfPjhhwwdOpS+ffte9/kbTSBqS3trJyFduNDYqu1ERKR1\nbnoj2Oeff87Jkyf5/PPP+f777zGbzYSGhlJXV0dISAhnzpzBarVitVpxOp3u1509e5ahQ4ditVop\nKytj0KBBNDQ04HK5sFgsVFRUuLdteo+WlJfXuH+2WMIoKzvnyf52G0Y/Bp7uv8US1gnRiPinm54B\nrFy5kvfff5/f//73JCYmMnPmTGJiYigsLARg9+7djB49mujoaI4cOUJlZSXV1dU4HA6GDx/OqFGj\n2LVrFwB79+5lxIgRBAcHM2DAAA4ePNjsPUREpGu1eSmIOXPmkJ6eTn5+Pn369CEhIYHg4GDS0tKY\nPn06JpOJWbNmERYWRnx8PAcOHGDq1KmYzWays7MBsNlsLFiwgIsXLxIdHU1MTEyH75iIiNyc39wJ\nfOXpvtGHP0DHQENAIu2nO4FFRAxKq4H6qF9mf9bm12zIUBlIEWk9nQGIiBiUEoCIiEEpAYiIGJQS\ngIiIQSkBiIgYlBKAiIhBKQGIiBiUEoCIiEEpAYiIGJQSgIiIQSkBiIgYlBKAiIhBKQGIiBiUEoCI\niEEpAYiIGJQSgIiIQSkBiIgYlBKAiIhBKQGIiBhUizWBa2trycjI4IcffuD8+fPMnDmTQYMGMX/+\nfBobG7FYLCxfvhyz2UxBQQG5ubkEBASQlJREYmIiDQ0NZGRkcOrUKQIDA1myZAl9+/alpKSErKws\nAAYOHMhrr73W2fsqIiJXaPEMYO/evQwZMoTNmzezcuVKsrOzWbVqFSkpKWzZsoV+/fpht9upqalh\n9erVbNq0iby8PHJzc6moqGDHjh2Eh4ezdetWZsyYQU5ODgCLFi3CZrOxbds2qqqq2LdvX6fvrIiI\nXNZiAoiPj+dXv/oVAKdPn6Z3794UFxczYcIEAMaNG0dRURGHDx8mMjKSsLAwQkJCGDZsGA6Hg6Ki\nImJjYwGIiYnB4XBQX19PaWkpUVFRzd5DRES6TotDQE2Sk5P5/vvvWbNmDc8//zxmsxmAXr16UVZW\nhtPpJCIiwr19RETENe0BAQGYTCacTifh4eHubZveQ0REuk6rE8C2bdv485//zCuvvILL5XK3X/nz\nldrSfqNtr9SzZyhBQYHuxxZLWIuvMRqjHROj7a9IR2sxARw9epRevXrxs5/9jMGDB9PY2Mitt95K\nXV0dISEhnDlzBqvVitVqxel0ul939uxZhg4ditVqpaysjEGDBtHQ0IDL5cJisVBRUeHetuk9bqa8\nvMb9s8USRlnZOU/2t1sz0jHxtA8oaYhc1uI1gIMHD7JhwwYAnE4nNTU1xMTEUFhYCMDu3bsZPXo0\n0dHRHDlyhMrKSqqrq3E4HAwfPpxRo0axa9cu4NIF5REjRhAcHMyAAQM4ePBgs/cQEZGu0+IZQHJy\nMr/5zW9ISUmhrq6OBQsWMGTIENLT08nPz6dPnz4kJCQQHBxMWloa06dPx2QyMWvWLMLCwoiPj+fA\ngQNMnToVs9lMdnY2ADabjQULFnDx4kWio6OJiYnp9J0VEZHLTK7WDMD7gCtP940wBPTL7M/a/JoN\nGeM7IRLfpCEgkfbTncAiIgalBCAiYlBKACIiBqUEICJiUEoAIiIGpQQgImJQSgAiIgalBCAiYlBK\nACIiBqUEICJiUEoAIiIGpQQgImJQSgAiIgalBCAiYlBKACIiBtXqmsDiOU/W9hcR6Ww6AxARMSid\nAXQjqiImIm2hMwAREYNSAhARMSglABERg1ICEBExqFZdBF62bBmHDh3iwoULvPjii0RGRjJ//nwa\nGxuxWCwsX74cs9lMQUEBubm5BAQEkJSURGJiIg0NDWRkZHDq1CkCAwNZsmQJffv2paSkhKysLAAG\nDhzIa6+91pn7KSIiV2nxDODLL7/k22+/JT8/n/Xr17N48WJWrVpFSkoKW7ZsoV+/ftjtdmpqali9\nejWbNm0iLy+P3NxcKioq2LFjB+Hh4WzdupUZM2aQk5MDwKJFi7DZbGzbto2qqir27dvX6TsrIiKX\ntZgAHnroId544w0AwsPDqa2tpbi4mAkTJgAwbtw4ioqKOHz4MJGRkYSFhRESEsKwYcNwOBwUFRUR\nGxsLQExMDA6Hg/r6ekpLS4mKimr2HiIi0nVaHAIKDAwkNDQUALvdzpgxY/jiiy8wm80A9OrVi7Ky\nMpxOJxEREe7XRUREXNMeEBCAyWTC6XQSHh7u3rbpPW6mZ89QgoIC3Y8tlrA27KbciD8fR3+OXcQX\ntPpGsD179mC329mwYQOTJk1yt7tcrutu35b2G217pfLyGvfPFksYZWXnWnyNtMxfj6OnfUBJQ+Sy\nVs0C2r9/P2vWrGHdunWEhYURGhpKXV0dAGfOnMFqtWK1WnE6ne7XnD171t3e9O2+oaEBl8uFxWKh\noqLCvW3Te4iISNdpMQGcO3eOZcuWsXbtWnr06AFcGssvLCwEYPfu3YwePZro6GiOHDlCZWUl1dXV\nOBwOhg8fzqhRo9i1axcAe/fuZcSIEQQHBzNgwAAOHjzY7D1ERKTrtDgEtHPnTsrLy5k3b567LTs7\nm8zMTPLz8+nTpw8JCQkEBweTlpbG9OnTMZlMzJo1i7CwMOLj4zlw4ABTp07FbDaTnZ0NgM1mY8GC\nBVy8eJHo6GhiYmI6by9FROQaJldrBuB9wJXjvf52DcCXl4P218XgdA1ApP10J7CIiEEpAYiIGJQS\ngIiIQSkBiIgYlBKAiIhBKQGIiBiUEoCIiEEpAYiIGJQSgIiIQSkBiIgYlBKAiIhBKQGIiBiUEoCI\niEEpAYiIGJQSgIiIQSkBiIgYlBKAiIhBKQGIiBiUEoCIiEEpAYiIGJQSgIiIQbUqARw7doyJEyey\nefNmAE6fPs20adNISUlh7ty51NfXA1BQUMBTTz1FYmIi27dvB6ChoYG0tDSmTp1KamoqJ0+eBKCk\npITk5GSSk5NZuHBhZ+ybiIjcRIsJoKamhtdff52RI0e621atWkVKSgpbtmyhX79+2O12ampqWL16\nNZs2bSIvL4/c3FwqKirYsWMH4eHhbN26lRkzZpCTkwPAokWLsNlsbNu2jaqqKvbt29d5eykiItdo\nMQGYzWbWrVuH1Wp1txUXFzNhwgQAxo0bR1FREYcPHyYyMpKwsDBCQkIYNmwYDoeDoqIiYmNjAYiJ\nicHhcFBfX09paSlRUVHN3kNERLpOUIsbBAURFNR8s9raWsxmMwC9evWirKwMp9NJRESEe5uIiIhr\n2gMCAjCZTDidTsLDw93bNr2HiIh0nRYTQEtcLle722+07ZV69gwlKCjQ/dhiCWtlhHIz/nwc/Tl2\nEV/gUQIIDQ2lrq6OkJAQzpw5g9VqxWq14nQ63ducPXuWoUOHYrVaKSsrY9CgQTQ0NOByubBYLFRU\nVLi3bXqPmykvr3H/bLGEUVZ2zpPQ5Sr+ehw97QNKGiKXeTQNNCYmhsLCQgB2797N6NGjiY6O5siR\nI1RWVlJdXY3D4WD48D+BHQYAAAPySURBVOGMGjWKXbt2AbB3715GjBhBcHAwAwYM4ODBg83eQ0RE\nuk6LZwBHjx5l6dKllJaWEhQURGFhIStWrCAjI4P8/Hz69OlDQkICwcHBpKWlMX36dEwmE7NmzSIs\nLIz4+HgOHDjA1KlTMZvNZGdnA2Cz2ViwYAEXL14kOjqamJiYTt9ZERG5zORqzQC8D7jydN/fhoB+\nmf2Zt0O4oQ0Z470dgkc0BCTSfu2+CCz+zdPk5K+JQ0Qu01IQIiIGpQQgImJQSgAiIgalBCAiYlBK\nACIiBqUEICJiUEoAIiIGpQQgImJQSgAiIgalBCAiYlBKACIiBqUEICJiUEoAIiIGpdVA28CXl3UW\nEWkrnQGIiBiUEoCIiEEpAYiIGJQSgIiIQSkBiIgYlBKAiIhBeXUa6OLFizl8+DAmkwmbzUZUVJQ3\nwxERMRSvJYCvvvqK7777jvz8fE6cOIHNZiM/P99b4YiIGI7XhoCKioqYOHEiAPfccw8//fQTVVVV\n3gpHRMRwvHYG4HQ6eeCBB9yPIyIiKCsr47bbbuuS36+7etvHk+O3IWN8J0QiIp7ymaUgXC7XTZ+3\nWMJu+ritPs55ol2vF+9rbx8QMTqvDQFZrVacTqf78dmzZ7FYLN4KR0TEcLyWAEaNGkVhYSEA33zz\nDVartcuGf0RExItDQMOGDeOBBx4gOTkZk8nEwoULvRWKiIghmVwtDb6LiEi3pDuBRUQMSglARMSg\nfGYaaFt88MEHvPHGG9x1110AxMTE8E//9E9ejqprGHn5jOLiYubOnct9990HwP3338+rr77q5ahE\n/JdfJgCA+Ph40tPTvR1Gl9LyGfDwww+zatUqb4ch0i1oCMiPaPkMEelIfpsAvvrqK6ZPn84vfvEL\n/ud//sfb4XQJp9NJz5493Y+bls8wkuPHjzNjxgymTp3Kf/zHf3g7HBG/5vNDQNu3b2f79u3N2h57\n7DHmzJnD2LFj+frrr0lPT+fjjz/2UoTeY7QZvHfffTezZ89mypQpnDx5kmeffZbdu3djNpu9HZqI\nX/L5BJCYmEhiYuINn3/wwQf58ccfaWxsJDAwsAsj63pGXz6jd+/exMfHA3DXXXdxxx13cObMGfr2\n7evlyET8k18OAa1bt44dO3YAcOzYMSIiIrr9f/6g5TMKCgp47733ACgrK+OHH36gd+/eXo5KxH/5\n5Z3A33//Pa+88goul4sLFy4YajrkihUrOHjwoHv5jEGDBnk7pC5TVVXFyy+/TGVlJQ0NDcyePZtH\nH33U22GJ/L927KAGAAAGQph/1xOx14XWBAmzJgMAwN/kAgLgTwAAogQAIEoAAKIEACBKAACiBAAg\nSgAAog76lyuCu9FHXgAAAABJRU5ErkJggg==\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x7ff06e676d30>"
]
},
"metadata": {
"tags": []
}
}
]
},
{
"metadata": {
"id": "ocOT6GkbmVUi",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### **But what about the case when you have mulple alleles?**\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"To work with that, we need to select the right alternate allele using the genotype calls.\n",
"\n"
]
},
{
"metadata": {
"id": "lC00C1RYCyzs",
"colab_type": "code",
"outputId": "556230a1-5135-48fd-858f-914c1bab367f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 80
}
},
"cell_type": "code",
"source": [
"df14 = pd.io.gbq.read_gbq('''\n",
"\n",
"SELECT\n",
"reference_name as chr,\n",
"start_position,\n",
"reference_bases ,\n",
"--\n",
"-- if the genotype is 0, it's a ref call\n",
"-- else use the genotype call to index the alternative base\n",
"--\n",
"IF( (call.genotype[OFFSET(0)] = 0), \n",
" reference_bases, \n",
" alternate_bases[OFFSET(call.genotype[OFFSET(0)] - 1)].alt ) as alt1,\n",
"--\n",
"--\n",
"IF( (call.genotype[OFFSET(1)] = 0), \n",
" reference_bases, \n",
" alternate_bases[OFFSET(call.genotype[OFFSET(1)] - 1)].alt ) AS alt2,\n",
"--\n",
"-- then we're still unnesting a single column.\n",
"--\n",
"call.name,\n",
"call.genotype[OFFSET(0)] AS g1,\n",
"call.genotype[OFFSET(1)] AS g2\n",
"FROM\n",
" `isb-cgc.QotM.1000genomes`\n",
"JOIN\n",
" UNNEST(call) as call\n",
"WHERE\n",
" reference_name = '21'\n",
" AND call.name = 'HG00119'\n",
" AND start_position = 34434667\n",
"\n",
"''', project_id=project_id, verbose=False, dialect='standard'\n",
")\n",
"\n",
"df14.head()\n",
"\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>reference_name</th>\n",
" <th>start_position</th>\n",
" <th>reference_bases</th>\n",
" <th>alt1</th>\n",
" <th>alt2</th>\n",
" <th>name</th>\n",
" <th>g1</th>\n",
" <th>g2</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>21</td>\n",
" <td>34434667</td>\n",
" <td>G</td>\n",
" <td>T</td>\n",
" <td>G</td>\n",
" <td>HG00119</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" reference_name start_position reference_bases alt1 alt2 name g1 g2\n",
"0 21 34434667 G T G HG00119 2 0"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment