{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
},
"papermill": {
"duration": 7.350334,
"end_time": "2020-10-01T00:23:03.593488",
"environment_variables": {},
"exception": null,
"input_path": "__notebook__.ipynb",
"output_path": "__notebook__.ipynb",
"parameters": {},
"start_time": "2020-10-01T00:22:56.243154",
"version": "2.1.0"
},
"colab": {
"name": "getting-started-with-sql-and-bigquery.ipynb",
"provenance": [],
"include_colab_link": true
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/NehaKoppikar/4bef5c96d730aa56320acc7b5daaa4be/getting-started-with-sql-and-bigquery.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.010571,
"end_time": "2020-10-01T00:23:00.512872",
"exception": false,
"start_time": "2020-10-01T00:23:00.502301",
"status": "completed"
},
"tags": [],
"id": "TxfHJr0FvUyq"
},
"source": [
"# Introduction\n",
"\n",
"Structured Query Language, or **SQL**, is the programming language used with databases, and it is an important skill for any data scientist. In this course, you'll build your SQL skills using **BigQuery**, a web service that lets you apply SQL to huge datasets.\n",
"\n",
"In this lesson, you'll learn the basics of accessing and examining BigQuery datasets. After you have a handle on these basics, we'll come back to build your SQL skills.\n",
"\n",
"# Your first BigQuery commands\n",
"\n",
"To use BigQuery, we'll import the Python package below:"
]
},
{
"cell_type": "code",
"metadata": {
"execution": {
"iopub.execute_input": "2020-10-01T00:23:00.536917Z",
"iopub.status.busy": "2020-10-01T00:23:00.535991Z",
"iopub.status.idle": "2020-10-01T00:23:00.539032Z",
"shell.execute_reply": "2020-10-01T00:23:00.538361Z"
},
"papermill": {
"duration": 0.016762,
"end_time": "2020-10-01T00:23:00.539141",
"exception": false,
"start_time": "2020-10-01T00:23:00.522379",
"status": "completed"
},
"tags": [],
"id": "DW2EyI3-vUyt"
},
"source": [
"from google.cloud import bigquery"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.009199,
"end_time": "2020-10-01T00:23:00.558030",
"exception": false,
"start_time": "2020-10-01T00:23:00.548831",
"status": "completed"
},
"tags": [],
"id": "aeCy7MsKvUyt"
},
"source": [
"The first step in the workflow is to create a [`Client`](https://google-cloud.readthedocs.io/en/latest/bigquery/generated/google.cloud.bigquery.client.Client.html#google.cloud.bigquery.client.Client) object. As you'll soon see, this `Client` object will play a central role in retrieving information from BigQuery datasets."
]
},
{
"cell_type": "code",
"metadata": {
"execution": {
"iopub.execute_input": "2020-10-01T00:23:00.581237Z",
"iopub.status.busy": "2020-10-01T00:23:00.580592Z",
"iopub.status.idle": "2020-10-01T00:23:00.585278Z",
"shell.execute_reply": "2020-10-01T00:23:00.585787Z"
},
"papermill": {
"duration": 0.018446,
"end_time": "2020-10-01T00:23:00.585933",
"exception": false,
"start_time": "2020-10-01T00:23:00.567487",
"status": "completed"
},
"tags": [],
"id": "RiEkr6gwvUyu",
"outputId": "53a7cbab-046f-4c23-a565-48ac2d651eb7"
},
"source": [
"# Create a \"Client\" object\n",
"client = bigquery.Client()"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"Using Kaggle's public dataset BigQuery integration.\n"
],
"name": "stdout"
}
]
},
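{
"cell_type": "markdown",
"metadata": {},
"source": [
"The output above notes that Kaggle's BigQuery integration handles authentication for us. Outside of Kaggle, you would typically need to supply Google Cloud credentials yourself, for example via application default credentials and an explicit project. A minimal sketch (the project ID below is a placeholder):\n",
"\n",
"```python\n",
"# Outside Kaggle: authenticate first (e.g. `gcloud auth application-default login`),\n",
"# then point the client at your own Google Cloud project.\n",
"client = bigquery.Client(project=\"your-project-id\")  # placeholder project ID\n",
"```"
]
},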
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.011116,
"end_time": "2020-10-01T00:23:00.607493",
"exception": false,
"start_time": "2020-10-01T00:23:00.596377",
"status": "completed"
},
"tags": [],
"id": "uUzv7ZslvUyu"
},
"source": [
"We'll work with a dataset of posts on [Hacker News](https://news.ycombinator.com/), a website focusing on computer science and cybersecurity news.\n",
"\n",
"In BigQuery, each dataset is contained in a corresponding project. In this case, our `hacker_news` dataset is contained in the `bigquery-public-data` project. To access the dataset, \n",
"- We begin by constructing a reference to the dataset with the [`dataset()`](https://google-cloud.readthedocs.io/en/latest/bigquery/generated/google.cloud.bigquery.client.Client.dataset.html#google.cloud.bigquery.client.Client.dataset) method. \n",
"- Next, we use the [`get_dataset()`](https://google-cloud.readthedocs.io/en/latest/bigquery/generated/google.cloud.bigquery.client.Client.get_dataset.html#google.cloud.bigquery.client.Client.get_dataset) method, along with the reference we just constructed, to fetch the dataset."
]
},
{
"cell_type": "code",
"metadata": {
"execution": {
"iopub.execute_input": "2020-10-01T00:23:00.632642Z",
"iopub.status.busy": "2020-10-01T00:23:00.631890Z",
"iopub.status.idle": "2020-10-01T00:23:01.370548Z",
"shell.execute_reply": "2020-10-01T00:23:01.369892Z"
},
"papermill": {
"duration": 0.753439,
"end_time": "2020-10-01T00:23:01.370669",
"exception": false,
"start_time": "2020-10-01T00:23:00.617230",
"status": "completed"
},
"tags": [],
"id": "mkFmhB4yvUyv"
},
"source": [
"# Construct a reference to the \"hacker_news\" dataset\n",
"dataset_ref = client.dataset(\"hacker_news\", project=\"bigquery-public-data\")\n",
"\n",
"# API request - fetch the dataset\n",
"dataset = client.get_dataset(dataset_ref)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.009867,
"end_time": "2020-10-01T00:23:01.390649",
"exception": false,
"start_time": "2020-10-01T00:23:01.380782",
"status": "completed"
},
"tags": [],
"id": "gFR9wxOAvUyv"
},
"source": [
"Every dataset is just a collection of tables. You can think of a dataset as a spreadsheet file containing multiple tables, all composed of rows and columns.\n",
"\n",
"We use the `list_tables()` method to list the tables in the dataset."
]
},
{
"cell_type": "code",
"metadata": {
"execution": {
"iopub.execute_input": "2020-10-01T00:23:01.416456Z",
"iopub.status.busy": "2020-10-01T00:23:01.415825Z",
"iopub.status.idle": "2020-10-01T00:23:01.644601Z",
"shell.execute_reply": "2020-10-01T00:23:01.644115Z"
},
"papermill": {
"duration": 0.244405,
"end_time": "2020-10-01T00:23:01.644733",
"exception": false,
"start_time": "2020-10-01T00:23:01.400328",
"status": "completed"
},
"tags": [],
"id": "2k_4Aun8vUyv",
"outputId": "e7cb4b74-c643-4100-ebe5-be4d2c904405"
},
"source": [
"# List all the tables in the \"hacker_news\" dataset\n",
"tables = list(client.list_tables(dataset))\n",
"\n",
"# Print names of all tables in the dataset (there are four!)\n",
"for table in tables: \n",
" print(table.table_id)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"comments\n",
"full\n",
"full_201510\n",
"stories\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.011545,
"end_time": "2020-10-01T00:23:01.666482",
"exception": false,
"start_time": "2020-10-01T00:23:01.654937",
"status": "completed"
},
"tags": [],
"id": "ADdL7X3jvUyw"
},
"source": [
"Similar to how we fetched a dataset, we can fetch a table. In the code cell below, we fetch the `full` table in the `hacker_news` dataset."
]
},
{
"cell_type": "code",
"metadata": {
"execution": {
"iopub.execute_input": "2020-10-01T00:23:01.693343Z",
"iopub.status.busy": "2020-10-01T00:23:01.692646Z",
"iopub.status.idle": "2020-10-01T00:23:02.114215Z",
"shell.execute_reply": "2020-10-01T00:23:02.114676Z"
},
"papermill": {
"duration": 0.437628,
"end_time": "2020-10-01T00:23:02.114877",
"exception": false,
"start_time": "2020-10-01T00:23:01.677249",
"status": "completed"
},
"tags": [],
"id": "E1OeQVX0vUyw"
},
"source": [
"# Construct a reference to the \"full\" table\n",
"table_ref = dataset_ref.table(\"full\")\n",
"\n",
"# API request - fetch the table\n",
"table = client.get_table(table_ref)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.010228,
"end_time": "2020-10-01T00:23:02.135949",
"exception": false,
"start_time": "2020-10-01T00:23:02.125721",
"status": "completed"
},
"tags": [],
"id": "CvYMJzI5vUyw"
},
"source": [
"In the next section, you'll explore the contents of this table in more detail. For now, take the time to use the image below to consolidate what you've learned so far.\n",
"\n",
"![first_commands](https://i.imgur.com/biYqbUB.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.00992,
"end_time": "2020-10-01T00:23:02.156155",
"exception": false,
"start_time": "2020-10-01T00:23:02.146235",
"status": "completed"
},
"tags": [],
"id": "jVvV9kBZvUyx"
},
"source": [
"# Table schema\n",
"\n",
"The structure of a table is called its **schema**. We need to understand a table's schema to effectively pull out the data we want. \n",
"\n",
"In this example, we'll investigate the `full` table that we fetched above."
]
},
{
"cell_type": "code",
"metadata": {
"execution": {
"iopub.execute_input": "2020-10-01T00:23:02.184146Z",
"iopub.status.busy": "2020-10-01T00:23:02.183142Z",
"iopub.status.idle": "2020-10-01T00:23:02.186662Z",
"shell.execute_reply": "2020-10-01T00:23:02.187206Z"
},
"papermill": {
"duration": 0.020978,
"end_time": "2020-10-01T00:23:02.187360",
"exception": false,
"start_time": "2020-10-01T00:23:02.166382",
"status": "completed"
},
"tags": [],
"id": "zagQlC5PvUyx",
"outputId": "f7af561f-ec90-485f-f7c3-03cb8d91b4e7"
},
"source": [
"# Print information on all the columns in the \"full\" table in the \"hacker_news\" dataset\n",
"table.schema"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[SchemaField('title', 'STRING', 'NULLABLE', 'Story title', ()),\n",
" SchemaField('url', 'STRING', 'NULLABLE', 'Story url', ()),\n",
" SchemaField('text', 'STRING', 'NULLABLE', 'Story or comment text', ()),\n",
" SchemaField('dead', 'BOOLEAN', 'NULLABLE', 'Is dead?', ()),\n",
" SchemaField('by', 'STRING', 'NULLABLE', \"The username of the item's author.\", ()),\n",
" SchemaField('score', 'INTEGER', 'NULLABLE', 'Story score', ()),\n",
" SchemaField('time', 'INTEGER', 'NULLABLE', 'Unix time', ()),\n",
" SchemaField('timestamp', 'TIMESTAMP', 'NULLABLE', 'Timestamp for the unix time', ()),\n",
" SchemaField('type', 'STRING', 'NULLABLE', 'Type of details (comment, comment_ranking, poll, story, job, pollopt)', ()),\n",
" SchemaField('id', 'INTEGER', 'NULLABLE', \"The item's unique id.\", ()),\n",
" SchemaField('parent', 'INTEGER', 'NULLABLE', 'Parent comment ID', ()),\n",
" SchemaField('descendants', 'INTEGER', 'NULLABLE', 'Number of story or poll descendants', ()),\n",
" SchemaField('ranking', 'INTEGER', 'NULLABLE', 'Comment ranking', ()),\n",
" SchemaField('deleted', 'BOOLEAN', 'NULLABLE', 'Is deleted?', ())]"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.011316,
"end_time": "2020-10-01T00:23:02.212585",
"exception": false,
"start_time": "2020-10-01T00:23:02.201269",
"status": "completed"
},
"tags": [],
"id": "yYygS18dvUyx"
},
"source": [
"Each [`SchemaField`](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.schema.SchemaField.html#google.cloud.bigquery.schema.SchemaField) tells us about a specific column (which we also refer to as a **field**). In order, the information is:\n",
"\n",
"* The **name** of the column\n",
"* The **field type** (or datatype) in the column\n",
"* The **mode** of the column (`'NULLABLE'` means that a column allows NULL values, and is the default)\n",
"* A **description** of the data in that column\n",
"\n",
"The first field has the SchemaField:\n",
"\n",
"`SchemaField('by', 'string', 'NULLABLE', \"The username of the item's author.\",())`\n",
"\n",
"This tells us:\n",
"- the field (or column) is called `by`,\n",
"- the data in this field is strings, \n",
"- NULL values are allowed, and\n",
"- it contains the usernames corresponding to each item's author.\n",
"\n",
"We can use the [`list_rows()`](https://google-cloud.readthedocs.io/en/latest/bigquery/generated/google.cloud.bigquery.client.Client.list_rows.html#google.cloud.bigquery.client.Client.list_rows) method to check just the first five lines of of the `full` table to make sure this is right. (Sometimes databases have outdated descriptions, so it's good to check.) This returns a BigQuery [`RowIterator`](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.table.RowIterator.html) object that can quickly be converted to a pandas DataFrame with the [`to_dataframe()`](https://google-cloud.readthedocs.io/en/latest/bigquery/generated/google.cloud.bigquery.table.RowIterator.to_dataframe.html) method."
]
},
{
"cell_type": "code",
"metadata": {
"execution": {
"iopub.execute_input": "2020-10-01T00:23:02.241457Z",
"iopub.status.busy": "2020-10-01T00:23:02.240840Z",
"iopub.status.idle": "2020-10-01T00:23:02.940329Z",
"shell.execute_reply": "2020-10-01T00:23:02.940781Z"
},
"papermill": {
"duration": 0.717731,
"end_time": "2020-10-01T00:23:02.940928",
"exception": false,
"start_time": "2020-10-01T00:23:02.223197",
"status": "completed"
},
"tags": [],
"id": "IwR232OCvUyy",
"outputId": "1a314992-adce-41dc-fcc5-2fc6b2eb6cd6"
},
"source": [
"# Preview the first five lines of the \"full\" table\n",
"client.list_rows(table, max_results=5).to_dataframe()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>title</th>\n",
" <th>url</th>\n",
" <th>text</th>\n",
" <th>dead</th>\n",
" <th>by</th>\n",
" <th>score</th>\n",
" <th>time</th>\n",
" <th>timestamp</th>\n",
" <th>type</th>\n",
" <th>id</th>\n",
" <th>parent</th>\n",
" <th>descendants</th>\n",
" <th>ranking</th>\n",
" <th>deleted</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>1391699272</td>\n",
" <td>2014-02-06 15:07:52+00:00</td>\n",
" <td>story</td>\n",
" <td>7190711</td>\n",
" <td>NaN</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>True</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>Please elaborate how his divorce is either ill...</td>\n",
" <td>None</td>\n",
" <td>bauerd</td>\n",
" <td>None</td>\n",
" <td>1527013360</td>\n",
" <td>2018-05-22 18:22:40+00:00</td>\n",
" <td>comment</td>\n",
" <td>17128187</td>\n",
" <td>17128167.0</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>If it's the difference between shipping and no...</td>\n",
" <td>None</td>\n",
" <td>cheald</td>\n",
" <td>None</td>\n",
" <td>1326079846</td>\n",
" <td>2012-01-09 03:30:46+00:00</td>\n",
" <td>comment</td>\n",
" <td>3441358</td>\n",
" <td>3441154.0</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>I believe that political biases are blinding y...</td>\n",
" <td>None</td>\n",
" <td>learc83</td>\n",
" <td>None</td>\n",
" <td>1314393283</td>\n",
" <td>2011-08-26 21:14:43+00:00</td>\n",
" <td>comment</td>\n",
" <td>2929994</td>\n",
" <td>2929910.0</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>Relevant The Onion: &lt;a href=\"https:&amp;#x2F;&amp;#x2F...</td>\n",
" <td>None</td>\n",
" <td>Taylor_OD</td>\n",
" <td>None</td>\n",
" <td>1551482279</td>\n",
" <td>2019-03-01 23:17:59+00:00</td>\n",
" <td>comment</td>\n",
" <td>19286260</td>\n",
" <td>19285865.0</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" <td>None</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" title url text dead \\\n",
"0 None None None None \n",
"1 None None Please elaborate how his divorce is either ill... None \n",
"2 None None If it's the difference between shipping and no... None \n",
"3 None None I believe that political biases are blinding y... None \n",
"4 None None Relevant The Onion: <a href=\"https:&#x2F;&#x2F... None \n",
"\n",
" by score time timestamp type id \\\n",
"0 None None 1391699272 2014-02-06 15:07:52+00:00 story 7190711 \n",
"1 bauerd None 1527013360 2018-05-22 18:22:40+00:00 comment 17128187 \n",
"2 cheald None 1326079846 2012-01-09 03:30:46+00:00 comment 3441358 \n",
"3 learc83 None 1314393283 2011-08-26 21:14:43+00:00 comment 2929994 \n",
"4 Taylor_OD None 1551482279 2019-03-01 23:17:59+00:00 comment 19286260 \n",
"\n",
" parent descendants ranking deleted \n",
"0 NaN None None True \n",
"1 17128167.0 None None None \n",
"2 3441154.0 None None None \n",
"3 2929910.0 None None None \n",
"4 19285865.0 None None None "
]
},
"metadata": {
"tags": []
},
"execution_count": 7
}
]
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.01107,
"end_time": "2020-10-01T00:23:02.963253",
"exception": false,
"start_time": "2020-10-01T00:23:02.952183",
"status": "completed"
},
"tags": [],
"id": "v8l6AH2yvUyz"
},
"source": [
"The `list_rows()` method will also let us look at just the information in a specific column. If we want to see the first five entries in the `by` column, for example, we can do that!"
]
},
{
"cell_type": "code",
"metadata": {
"execution": {
"iopub.execute_input": "2020-10-01T00:23:02.991684Z",
"iopub.status.busy": "2020-10-01T00:23:02.991079Z",
"iopub.status.idle": "2020-10-01T00:23:03.405918Z",
"shell.execute_reply": "2020-10-01T00:23:03.405391Z"
},
"papermill": {
"duration": 0.4317,
"end_time": "2020-10-01T00:23:03.406028",
"exception": false,
"start_time": "2020-10-01T00:23:02.974328",
"status": "completed"
},
"tags": [],
"id": "yVSXvtz5vUyz",
"outputId": "3e15aefc-5178-483b-be47-402adc360933"
},
"source": [
"# Preview the first five entries in the \"by\" column of the \"full\" table\n",
"client.list_rows(table, selected_fields=table.schema[:1], max_results=5).to_dataframe()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>title</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>None</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" title\n",
"0 None\n",
"1 None\n",
"2 None\n",
"3 None\n",
"4 None"
]
},
"metadata": {
"tags": []
},
"execution_count": 8
}
]
},
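{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that `table.schema[:1]` simply slices off the first field in the schema, which here is the `title` column. If you want a specific column by name (say, `by`), one option is to filter the schema for that field before passing it to `selected_fields`. A minimal sketch:\n",
"\n",
"```python\n",
"# Select the \"by\" column by name rather than by position\n",
"by_field = [field for field in table.schema if field.name == \"by\"]\n",
"client.list_rows(table, selected_fields=by_field, max_results=5).to_dataframe()\n",
"```"
]
},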
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.011313,
"end_time": "2020-10-01T00:23:03.428976",
"exception": false,
"start_time": "2020-10-01T00:23:03.417663",
"status": "completed"
},
"tags": [],
"id": "xZt9s0HjvUy0"
},
"source": [
"# Disclaimer\n",
"Before we go into the coding exercise, a quick disclaimer for those who already know some SQL:\n",
"\n",
"**Each Kaggle user can scan 5TB every 30 days for free. Once you hit that limit, you'll have to wait for it to reset.**\n",
"\n",
"The commands you've seen so far won't demand a meaningful fraction of that limit. But some BiqQuery datasets are huge. So, if you already know SQL, wait to run SELECT queries until you've seen how to use your allotment effectively. If you are like most people reading this, you don't know how to write these queries yet, so you don't need to worry about this disclaimer.\n"
]
},
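{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you already know SQL and want to see how much of your allotment a query would use before running it, the client supports a dry run that estimates the bytes scanned without executing the query. A minimal sketch (the query below is only a hypothetical example):\n",
"\n",
"```python\n",
"# Estimate how many bytes a query would scan, without running it\n",
"dry_run_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)\n",
"query = \"SELECT score FROM `bigquery-public-data.hacker_news.full`\"  # hypothetical example\n",
"dry_run_job = client.query(query, job_config=dry_run_config)\n",
"print(\"This query would process {} bytes.\".format(dry_run_job.total_bytes_processed))\n",
"```"
]
},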
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.01127,
"end_time": "2020-10-01T00:23:03.451826",
"exception": false,
"start_time": "2020-10-01T00:23:03.440556",
"status": "completed"
},
"tags": [],
"id": "Ks4ruPCOvUy0"
},
"source": [
"# Your turn\n",
"Practice the commands you've seen to **[explore the structure of a dataset](https://www.kaggle.com/kernels/fork/1058477)** with crimes in the city of Chicago."
]
},
{
"cell_type": "markdown",
"metadata": {
"papermill": {
"duration": 0.011478,
"end_time": "2020-10-01T00:23:03.474909",
"exception": false,
"start_time": "2020-10-01T00:23:03.463431",
"status": "completed"
},
"tags": [],
"id": "WzE95RxgvUy1"
},
"source": [
"---\n",
"\n",
"\n",
"\n",
"\n",
"*Have questions or comments? Visit the [Learn Discussion forum](https://www.kaggle.com/learn-forum/161314) to chat with other Learners.*"
]
}
]
}