@brockmanmatt
Created August 7, 2020 05:29
GPT_ANLI_2Step_Part1.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "GPT_ANLI_2Step_Part1.ipynb",
"provenance": [],
"collapsed_sections": [],
"authorship_tag": "ABX9TyOGuMAXoF4rrVDmy/ydB9pZ",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/brockmanmatt/78f498ad04b1d2afaecc9ed05d4878f8/gpt_anli_2step_part1.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"metadata": {
"id": "J7wnsgT2kPut",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 89
},
"outputId": "eb5e7b62-1b19-4d47-c70e-44890ea0d32b"
},
"source": [
"from google.colab import files\n",
"uploaded = files.upload()\n",
"print(\"done\")"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"Saving key.json to key.json\n",
"done\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WHPHrUnhpKnI",
"colab_type": "text"
},
"source": [
"I'll install the OpenAI API client and import the libraries used below"
]
},
{
"cell_type": "code",
"metadata": {
"id": "zq0ltp2xn4yt",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 292
},
"outputId": "cc665bc8-ce0a-440d-c88b-87ca7747f2cb"
},
"source": [
"!pip install openai\n",
"import openai, json, pandas as pd"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting openai\n",
" Downloading https://files.pythonhosted.org/packages/a8/65/c7461f4c87984534683f480ea5742777bc39bbf5721123194c2d0347dc1f/openai-0.2.4.tar.gz (157kB)\n",
"Requirement already satisfied: requests>=2.20 in /usr/local/lib/python3.6/dist-packages (from openai) (2.23.0)\n",
"Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests>=2.20->openai) (1.24.3)\n",
"Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests>=2.20->openai) (2.10)\n",
"Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests>=2.20->openai) (3.0.4)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests>=2.20->openai) (2020.6.20)\n",
"Building wheels for collected packages: openai\n",
" Building wheel for openai (setup.py) ... done\n",
" Created wheel for openai: filename=openai-0.2.4-cp36-none-any.whl size=170709 sha256=1afd8b914c2b1015eeaa99f619a82492bebc0f36d4afab3b4022ff1b6f9523cc\n",
" Stored in directory: /root/.cache/pip/wheels/74/96/c8/c6e170929c276b836613e1b9985343b501fe455e53d85e7d48\n",
"Successfully built openai\n",
"Installing collected packages: openai\n",
"Successfully installed openai-0.2.4\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Q2yE0jcnpMEV",
"colab_type": "text"
},
"source": [
"Load the key.json that I uploaded. I do this so I don't need to worry about accidentally leaking creds if I share the Colab (I'm 99% sure the shared notebook is just a JSON file that won't expose them)."
]
},
{
"cell_type": "code",
"metadata": {
"id": "bwNXXwHen5x9",
"colab_type": "code",
"colab": {}
},
"source": [
"# Use a context manager so the key file is closed after reading.\n",
"with open(\"key.json\", \"r\") as f:\n",
"    openai.api_key = json.load(f)[\"key\"]"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "k67w5H0fpTkT",
"colab_type": "text"
},
"source": [
"Default keyword arguments to pass to the API"
]
},
{
"cell_type": "code",
"metadata": {
"id": "e1EwpqqJkTYh",
"colab_type": "code",
"colab": {}
},
"source": [
"#arguments to send the API\n"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "zZubgPoOpWDH",
"colab_type": "text"
},
"source": [
"Quick wrapper around the API call that fills in my default arguments and returns the stripped completion text"
]
},
{
"cell_type": "code",
"metadata": {
"id": "sXTDJx0An9Bl",
"colab_type": "code",
"colab": {}
},
"source": [
"import datetime\n",
"def query(prompt, myKwargs=None):\n",
"    \"\"\"\n",
"    Wrapper for the API: merges myKwargs over the defaults and returns the stripped completion text.\n",
"    \"\"\"\n",
"    kwargs = {\n",
"        \"engine\": \"davinci\",\n",
"        \"temperature\": 0,\n",
"        \"max_tokens\": 250,\n",
"        \"stop\": \"\\n\\n\",\n",
"    }\n",
"    # Caller-supplied arguments override the defaults above.\n",
"    kwargs.update(myKwargs or {})\n",
"\n",
"    r = openai.Completion.create(prompt=prompt, **kwargs)[\"choices\"][0][\"text\"].strip()\n",
"    return r"
],
"execution_count": null,
"outputs": []
},
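{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch, not part of the original notebook, of how prompts and responses could be saved for later analysis. The helper name `logQuery` and the file name `query_log.jsonl` are assumptions made for illustration; it appends one JSON record per call and would be invoked by hand after each `query`."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import datetime, json\n",
"\n",
"def logQuery(prompt, response, path=\"query_log.jsonl\"):\n",
"    # Hypothetical helper: append one prompt/response pair as a JSON line.\n",
"    record = {\n",
"        \"time\": datetime.datetime.now().isoformat(),\n",
"        \"prompt\": prompt,\n",
"        \"response\": response,\n",
"    }\n",
"    with open(path, \"a\") as f:\n",
"        f.write(json.dumps(record) + \"\\n\")\n",
"\n",
"# Example with a made-up response, so no API call is needed.\n",
"logQuery(\"q: what is 1+1?\\na:\", \"2\")"
],
"execution_count": null,
"outputs": []
},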
{
"cell_type": "markdown",
"metadata": {
"id": "EdFXafcJpZ3Q",
"colab_type": "text"
},
"source": [
"Test to make sure my query works"
]
},
{
"cell_type": "code",
"metadata": {
"id": "4SlyKgjyopPn",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 69
},
"outputId": "a8634716-7d6d-4c06-ba38-91f972ff6601"
},
"source": [
"query(\"q: what is 1+1?\\na:\")"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'2\\nq: what is 2+2?\\na: 4\\nq: what is 3+3?\\na: 6\\nq: what is 4+4?\\na: 8\\nq: what is 5+5?\\na: 10\\nq: what is 6+6?\\na: 12\\nq: what is 7+7?\\na: 14\\nq: what is 8+8?\\na: 16\\nq: what is 9+9?\\na: 18\\nq: what is 10+10?\\na: 20'"
]
},
"metadata": {
"tags": []
},
"execution_count": 6
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Ybg_8ieJzWcW",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"outputId": "debf764c-4c35-45fa-f447-d5549ed900af"
},
"source": [
"query(\"q: what is 1+1?\\na:\", myKwargs = {\"stop\":\"\\n\"})"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'2'"
]
},
"metadata": {
"tags": []
},
"execution_count": 7
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "nrAGEs9CAbvc",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 224
},
"outputId": "7ed53e04-0d67-4210-8828-0a0731885b5d"
},
"source": [
"!wget https://dl.fbaipublicfiles.com/anli/anli_v1.0.zip"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"--2020-08-07 03:51:01-- https://dl.fbaipublicfiles.com/anli/anli_v1.0.zip\n",
"Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 172.67.9.4, 104.22.75.142, 104.22.74.142, ...\n",
"Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|172.67.9.4|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 18708061 (18M) [application/zip]\n",
"Saving to: ‘anli_v1.0.zip’\n",
"\n",
"anli_v1.0.zip 100%[===================>] 17.84M 13.5MB/s in 1.3s \n",
"\n",
"2020-08-07 03:51:03 (13.5 MB/s) - ‘anli_v1.0.zip’ saved [18708061/18708061]\n",
"\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "s-hW03pvAc-z",
"colab_type": "code",
"colab": {}
},
"source": [
"import zipfile\n",
"with zipfile.ZipFile(\"anli_v1.0.zip\", \"r\") as zip_ref:\n",
"    zip_ref.extractall(\".\")\n"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "RwTUbq44AgNp",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "757edaf0-2d4e-4352-da65-7b0f914b686b"
},
"source": [
"ls anli_v1.0/R1"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"dev.jsonl test.jsonl train.jsonl\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "C-Moa3AuA0it",
"colab_type": "code",
"colab": {}
},
"source": [
"train = pd.read_json(\"anli_v1.0/R3/train.jsonl\",lines=True)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "UcoUhiblq8ql",
"colab_type": "text"
},
"source": [
"# I just want to see if I can do CONTRADICTS vs. NOT CONTRADICTS"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3OZAx9FeC3wA",
"colab_type": "text"
},
"source": [
"Get some variance in the type of question"
]
},
{
"cell_type": "code",
"metadata": {
"id": "9pM3IyVxr3qR",
"colab_type": "code",
"colab": {}
},
"source": [
"myPositives = [10, 25, 30, 40]"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "oOKtLBWRXlls",
"colab_type": "text"
},
"source": [
"Few-shot examples: 4 entailment and 4 contradiction"
]
},
{
"cell_type": "code",
"metadata": {
"id": "qpIYkq1iC4Rq",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 173
},
"outputId": "6305c23e-97dc-45e3-83ce-4df22764b9ae"
},
"source": [
"myPositives = [10, 25, 30, 41]\n",
"myExamples = []\n",
"\n",
"# Filter once per label instead of on every loop iteration.\n",
"entailed = train[train.label == \"e\"].reset_index()\n",
"contradicted = train[train.label == \"c\"].reset_index()\n",
"\n",
"for row in myPositives:\n",
"    print(entailed.loc[row][\"context\"])\n",
"    print(entailed.loc[row][\"hypothesis\"])\n",
"    # Alternate one entailment and one contradiction example.\n",
"    myExamples.append(entailed.loc[row])\n",
"    myExamples.append(contradicted.loc[row])\n",
"\n"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"Clifford Jarvis (August 26, 1941 – November 26, 1999) was an American hard bop and free jazz drummer, who in the 1980s moved to London, England, where he died. This entry is from Wikipedia , the user-contributed encyclopedia. It may not have been reviewed by professional editors and is licensed under an Attribution-ShareAlike Creative Commons License . If you find the biography content factually incorrect or highly offensive you can edit this article at Wikipedia\n",
"Clifford Jarvis died the same century he was born in\n",
"MINEOLA – An investigation is underway into a fatal fire in Mineola. Fire Marshall David Madsen said it was reported around 11:00 Monday night at the Mineola Motor Lodge on West Broad Street. Initial reports are a Hispanic male was killed and his body has been sent to a Tyler lab for autopsy. His identity has not been released at this time. The cause of the fire has not yet been determined.\n",
"The Mineola Motor Lodge saw a tradgedy \n",
"Have your say Billy Mckenzie clinched the Spanish Amateur crown after victory in the final on Sunday. The Rowlands Castle star beat Yorkshire's Alex Fitzpatrick 3&2 to lift the trophy in La Manga. Mckenzie had beaten Spaniard Eugenio Lopez-Chacarra Coto on Saturday to earn a shot at the prestigious title. The amateur talent, who does not plan to turn professional until next year, was part of the Hampshire team who won the England County Championship in 2017.\n",
"Billy Mckenzie may or may not be spanish\n",
"Zion Williamson on he and RJ Barrett: \"If we are both on, I feel bad for the other team.\" By January 28, 2019 10:39 PM Duke Blue Devils star Zion Wiliamson said if he and his fellow freshman star, RJ Barrett, are both on their games on the same night opposing teams have little chance to beat the Blue Devils. They led Duke past Notre Dame on Monday, Jan. 28, 2019.\n",
"Duke players are confident in their ability to win.\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "o-LQorqehAD5",
"colab_type": "code",
"colab": {}
},
"source": [
"df = pd.DataFrame(myExamples)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "a6lDv00MXqre",
"colab_type": "text"
},
"source": [
"Build my few-shot prompt"
]
},
{
"cell_type": "code",
"metadata": {
"id": "DD-iWSTFr8lG",
"colab_type": "code",
"colab": {}
},
"source": [
"def getResponseFromClass(x):\n",
"    if x == \"e\":\n",
"        return \"You generally got it!\"\n",
"    return \"I think you misunderstood, \"\n",
"\n",
"def getClassificationFromClass(x):\n",
"    if x == \"e\":\n",
"        return \"Label: understood\"\n",
"    return \"Label: misunderstood\""
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "fE5d9CuZC4oS",
"colab": {}
},
"source": [
"prompt = \"\"\n",
"for _, row in df.iterrows():\n",
"    prompt += \"Tom said '{}'\\n\".format(row[\"context\"])\n",
"    prompt += \"Jerry tried to confirm '{}'\\n\".format(row[\"hypothesis\"])\n",
"    prompt += \"Tom replied '{}, {}'\\n\".format(getResponseFromClass(row[\"label\"]), row[\"reason\"])\n",
"    prompt += \"{}\\n\".format(getClassificationFromClass(row[\"label\"]))\n",
"    prompt += \"\\n\"\n"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "oKsr5d4rs9Ko",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 734
},
"outputId": "a9bc5589-c967-4ba1-ba8f-feb39b734bd1"
},
"source": [
"print(prompt)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"Tom said 'Clifford Jarvis (August 26, 1941 – November 26, 1999) was an American hard bop and free jazz drummer, who in the 1980s moved to London, England, where he died. This entry is from Wikipedia , the user-contributed encyclopedia. It may not have been reviewed by professional editors and is licensed under an Attribution-ShareAlike Creative Commons License . If you find the biography content factually incorrect or highly offensive you can edit this article at Wikipedia'\n",
"Jerry tried to confirm 'Clifford Jarvis died the same century he was born in'\n",
"Tom replied 'You generally got it!, It is correct because the dates given are birth of 1941 and death of 1999. That is in the same century (the 1900s). The system may not know what a century is'\n",
"Label: understood\n",
"\n",
"Tom said 'Tasmanians are being warned not to become complacent in milder conditions as bushfires continue to burn across the state. The Bureau of Meteorology warned that Wednesday would be the hottest day for the week before possible rainfall and thunderstorms, as a cooler change moves through. Firefighters have warned its only days before a total fire ban could be reinforced - as they turn their efforts to protecting communities in the state's south. Some residents of lower Judbury in the Huon Valley have been told to get ready to leave because of concerns about the Riveaux Road bushfire.'\n",
"Jerry tried to confirm 'Tasmanians are warned to not be complacent as bushfires continue across the state. within a day a total fire ban will be reinforced'\n",
"Tom replied 'I think you misunderstood, , In only days which means more than one'\n",
"Label: misunderstood\n",
"\n",
"Tom said 'MINEOLA – An investigation is underway into a fatal fire in Mineola. Fire Marshall David Madsen said it was reported around 11:00 Monday night at the Mineola Motor Lodge on West Broad Street. Initial reports are a Hispanic male was killed and his body has been sent to a Tyler lab for autopsy. His identity has not been released at this time. The cause of the fire has not yet been determined.'\n",
"Jerry tried to confirm 'The Mineola Motor Lodge saw a tradgedy '\n",
"Tom replied 'You generally got it!, The fatal fire was a tradgedy that took place at the motor lodge'\n",
"Label: understood\n",
"\n",
"Tom said 'This weekend, SoulCycle will host \"Hurricane Maria Relief Rides\" at a selection of studios across the United States. 100 percent of proceeds from the indoor cycling classes will be donated to the Puerto Rico Real-Time Recovery Fund. The SoulCycle Ardmore location in the Philly suburbs is participating. On Saturday, Oct. 14, work out with Nikola at 12:45 p.m. to help aid those affected by the devastating hurricane. Individual classes are $30. If you want to participate, make sure to reserve a bike.'\n",
"Jerry tried to confirm 'Individual classes are more than $30.01.'\n",
"Tom replied 'I think you misunderstood, , The classes are $30, that is not more than $30.01, in fact it's less than $30.01. It's difficult because the numbers are very close together I guess.'\n",
"Label: misunderstood\n",
"\n",
"Tom said 'Have your say Billy Mckenzie clinched the Spanish Amateur crown after victory in the final on Sunday. The Rowlands Castle star beat Yorkshire's Alex Fitzpatrick 3&2 to lift the trophy in La Manga. Mckenzie had beaten Spaniard Eugenio Lopez-Chacarra Coto on Saturday to earn a shot at the prestigious title. The amateur talent, who does not plan to turn professional until next year, was part of the Hampshire team who won the England County Championship in 2017.'\n",
"Jerry tried to confirm 'Billy Mckenzie may or may not be spanish'\n",
"Tom replied 'You generally got it!, he won the spanish crown'\n",
"Label: understood\n",
"\n",
"Tom said 'Murphy is out of the lineup against St. Louis on Tuesday. Murphy will receive a breather following three straight starts, including an 0-for-4 day during the series opener Monday. In his place, Jeff Mathis will catch Zack Greinke and bat eighth in the order.'\n",
"Jerry tried to confirm 'Jeff Mathis has the initials FJ.'\n",
"Tom replied 'I think you misunderstood, , It is definitely incorrect to say Jeff Mathis has the initials FJ because his initials are JM.'\n",
"Label: misunderstood\n",
"\n",
"Tom said 'Zion Williamson on he and RJ Barrett: \"If we are both on, I feel bad for the other team.\" By January 28, 2019 10:39 PM Duke Blue Devils star Zion Wiliamson said if he and his fellow freshman star, RJ Barrett, are both on their games on the same night opposing teams have little chance to beat the Blue Devils. They led Duke past Notre Dame on Monday, Jan. 28, 2019.'\n",
"Jerry tried to confirm 'Duke players are confident in their ability to win.'\n",
"Tom replied 'You generally got it!, The scenario expresses confidence in the ability to win, if both star players perform well.'\n",
"Label: understood\n",
"\n",
"Tom said 'We note with regret the death of Mr. George Edward Taylor, age 73 of McEwen, who will have visitation today from 10 until service time at 2 at the Luff-Bowen Funeral Home in McEwen. Reverend David Deavers will officiate the services. Burial will follow in the McEwen Cemetery.'\n",
"Jerry tried to confirm 'Mr George Edward Taylor will be buried at 2PM today'\n",
"Tom replied 'I think you misunderstood, , this is incorrect as it is the visitation time that ends at 2pm'\n",
"Label: misunderstood\n",
"\n",
"\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JrLjRQT8XwSb",
"colab_type": "text"
},
"source": [
"# Make sure it works on train"
]
},
{
"cell_type": "code",
"metadata": {
"id": "wwqrENxUGY4Z",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "572bdb1b-e32f-4adb-f4f9-3a0801b88803"
},
"source": [
"len(prompt)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"5223"
]
},
"metadata": {
"tags": []
},
"execution_count": 40
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "_8rhPdXfD7Lj",
"colab_type": "code",
"colab": {}
},
"source": [
"eval = train[1000:1010].copy()\n",
"for idx, row in eval.iterrows():\n",
"    payload = \"Tom said '{}'\\n\".format(row[\"context\"])\n",
"    payload += \"Jerry tried to confirm '{}'\\n\".format(row[\"hypothesis\"])\n",
"    payload += \"Tom replied\"\n",
"    eval.at[idx, \"r\"] = query(prompt + payload)\n",
"    print(idx)\n"
],
"execution_count": null,
"outputs": []
},
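{
"cell_type": "markdown",
"metadata": {},
"source": [
"A possible next step, sketched under assumptions: every few-shot example ends with a `Label:` line, so a completion stored in the `r` column may contain one too. The hypothetical helper `parseLabel` below looks for that line and maps it back to the dataset's `e`/`c` labels, returning `None` when no label line is present. Whether the model actually emits the line is not verified here."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"def parseLabel(completion):\n",
"    # Hypothetical helper: find a 'Label: ...' line in the completion text.\n",
"    for line in completion.splitlines():\n",
"        line = line.strip()\n",
"        if line.startswith(\"Label:\"):\n",
"            value = line[len(\"Label:\"):].strip()\n",
"            if value == \"understood\":\n",
"                return \"e\"\n",
"            if value == \"misunderstood\":\n",
"                return \"c\"\n",
"    return None\n",
"\n",
"# Made-up completion text, used only to exercise the helper.\n",
"parseLabel(\"'You generally got it!, ok'\\nLabel: understood\")"
],
"execution_count": null,
"outputs": []
},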
{
"cell_type": "code",
"metadata": {
"id": "x-08HsYSEA3A",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "aa712b3b-5807-4ddd-a8bc-82e2f9d38b1e"
},
"source": [
"eval"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>uid</th>\n",
" <th>context</th>\n",
" <th>hypothesis</th>\n",
" <th>label</th>\n",
" <th>model_label</th>\n",
" <th>emturk</th>\n",
" <th>genre</th>\n",
" <th>reason</th>\n",
" <th>tag</th>\n",
" <th>r</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1000</th>\n",
" <td>c9d3e8fd-687a-4ce8-b41b-349ff259d59f</td>\n",
" <td>Have your say Billy Mckenzie clinched the Span...</td>\n",
" <td>Billy McKenzie has never played alex fitzpatrick</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>The Rowlands Castle star beat Yorkshire's Alex...</td>\n",
" <td>r3_train</td>\n",
" <td>'correct!, The two have never played each othe...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1001</th>\n",
" <td>31285fed-0e08-4927-975c-3e5435690710</td>\n",
" <td>Manchester United are expected to strengthen t...</td>\n",
" <td>Ronaldo was more than 3 dozen years old at the...</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>He was 33, which is not more than 36. It's dif...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , it is not correct to say Ronaldo was mo...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1002</th>\n",
" <td>5830faca-2efb-4337-bfde-36cc65135f83</td>\n",
" <td>WASHINGTON _ The winning numbers in Wednesday ...</td>\n",
" <td>Lottery is all skill</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>Its all luck</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , It is not all skill because it is a lot...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1003</th>\n",
" <td>cfd2030a-e205-4537-b961-14bd5b1a5eb5</td>\n",
" <td>Via The Register: The French supreme court has...</td>\n",
" <td>Google has more than sixteen letters.</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>It is definitely incorrect to say that Google ...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , Google has more than sixteen letters, b...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1004</th>\n",
" <td>fa1c55b4-9630-4f74-8ddf-601a6162f0f5</td>\n",
" <td>None Chicago Bears punter Adam Podlesh (8) tak...</td>\n",
" <td>Podlesh sat the bench most of the time while o...</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>This is incorrect. The context states that he ...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , It is incorrect to say he sat the bench...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1005</th>\n",
" <td>a0bc178c-fdd8-49f0-8543-812df5b37b96</td>\n",
" <td>Thinking Green: Mold The kind of green you don...</td>\n",
" <td>mold is not the same color as US dollars</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>US dollars are green, mold can also be green</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , It is not correct to say that mold is t...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1006</th>\n",
" <td>5db4bab8-7cc6-4489-8c93-70d400e4cabc</td>\n",
" <td>A celebration of the life of Edith (\"Edo\") Wel...</td>\n",
" <td>The family wants flowers donated to the Chappy...</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>This is false, they want monetary donations. T...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , The family does not want flowers donate...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1007</th>\n",
" <td>a38b9736-7556-48b0-b237-2dcc43e9097d</td>\n",
" <td>Deputies do it now, but the Tulsa County Sheri...</td>\n",
" <td>They thought getting another company to do the...</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>They thought it would be a cost savings. the ...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , It is incorrect to say they thought get...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1008</th>\n",
" <td>bb2649d2-598d-4ed0-b53d-f3079b6d0a72</td>\n",
" <td>Australia beat South Africa by five wickets in...</td>\n",
" <td>Australia is going to play South Africa in a f...</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>It's definitely incorrect because it says the ...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , Australia is going to play South Africa...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1009</th>\n",
" <td>2180289c-5b41-4272-9344-d5f496297c88</td>\n",
" <td>Feb. 17, 1927 to April 30, 2007 Much beloved w...</td>\n",
" <td>She resided in the Church of Latter Day Saints.</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>The woman who died resided in Lake County, not...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , This is incorrect as it is the church t...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" uid ... r\n",
"1000 c9d3e8fd-687a-4ce8-b41b-349ff259d59f ... 'correct!, The two have never played each othe...\n",
"1001 31285fed-0e08-4927-975c-3e5435690710 ... 'No, , it is not correct to say Ronaldo was mo...\n",
"1002 5830faca-2efb-4337-bfde-36cc65135f83 ... 'No, , It is not all skill because it is a lot...\n",
"1003 cfd2030a-e205-4537-b961-14bd5b1a5eb5 ... 'No, , Google has more than sixteen letters, b...\n",
"1004 fa1c55b4-9630-4f74-8ddf-601a6162f0f5 ... 'No, , It is incorrect to say he sat the bench...\n",
"1005 a0bc178c-fdd8-49f0-8543-812df5b37b96 ... 'No, , It is not correct to say that mold is t...\n",
"1006 5db4bab8-7cc6-4489-8c93-70d400e4cabc ... 'No, , The family does not want flowers donate...\n",
"1007 a38b9736-7556-48b0-b237-2dcc43e9097d ... 'No, , It is incorrect to say they thought get...\n",
"1008 bb2649d2-598d-4ed0-b53d-f3079b6d0a72 ... 'No, , Australia is going to play South Africa...\n",
"1009 2180289c-5b41-4272-9344-d5f496297c88 ... 'No, , This is incorrect as it is the church t...\n",
"\n",
"[10 rows x 10 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 26
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "lXw7wIT_E1d9",
"colab_type": "code",
"colab": {}
},
"source": [
"eval[\"r2\"] = eval.r.apply(lambda x: x.split()[-1])"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Ak2L3tsuE76K",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "171b34f3-c649-4f7d-95f8-bc90f39b61fd"
},
"source": [
"eval"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>uid</th>\n",
" <th>context</th>\n",
" <th>hypothesis</th>\n",
" <th>label</th>\n",
" <th>model_label</th>\n",
" <th>emturk</th>\n",
" <th>genre</th>\n",
" <th>reason</th>\n",
" <th>tag</th>\n",
" <th>r</th>\n",
" <th>r2</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1000</th>\n",
" <td>c9d3e8fd-687a-4ce8-b41b-349ff259d59f</td>\n",
" <td>Have your say Billy Mckenzie clinched the Span...</td>\n",
" <td>Billy McKenzie has never played alex fitzpatrick</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>The Rowlands Castle star beat Yorkshire's Alex...</td>\n",
" <td>r3_train</td>\n",
" <td>'correct!, The two have never played each othe...</td>\n",
" <td>understood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1001</th>\n",
" <td>31285fed-0e08-4927-975c-3e5435690710</td>\n",
" <td>Manchester United are expected to strengthen t...</td>\n",
" <td>Ronaldo was more than 3 dozen years old at the...</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>He was 33, which is not more than 36. It's dif...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , it is not correct to say Ronaldo was mo...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1002</th>\n",
" <td>5830faca-2efb-4337-bfde-36cc65135f83</td>\n",
" <td>WASHINGTON _ The winning numbers in Wednesday ...</td>\n",
" <td>Lottery is all skill</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>Its all luck</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , It is not all skill because it is a lot...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1003</th>\n",
" <td>cfd2030a-e205-4537-b961-14bd5b1a5eb5</td>\n",
" <td>Via The Register: The French supreme court has...</td>\n",
" <td>Google has more than sixteen letters.</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>It is definitely incorrect to say that Google ...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , Google has more than sixteen letters, b...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1004</th>\n",
" <td>fa1c55b4-9630-4f74-8ddf-601a6162f0f5</td>\n",
" <td>None Chicago Bears punter Adam Podlesh (8) tak...</td>\n",
" <td>Podlesh sat the bench most of the time while o...</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>This is incorrect. The context states that he ...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , It is incorrect to say he sat the bench...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1005</th>\n",
" <td>a0bc178c-fdd8-49f0-8543-812df5b37b96</td>\n",
" <td>Thinking Green: Mold The kind of green you don...</td>\n",
" <td>mold is not the same color as US dollars</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>US dollars are green, mold can also be green</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , It is not correct to say that mold is t...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1006</th>\n",
" <td>5db4bab8-7cc6-4489-8c93-70d400e4cabc</td>\n",
" <td>A celebration of the life of Edith (\"Edo\") Wel...</td>\n",
" <td>The family wants flowers donated to the Chappy...</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>This is false, they want monetary donations. T...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , The family does not want flowers donate...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1007</th>\n",
" <td>a38b9736-7556-48b0-b237-2dcc43e9097d</td>\n",
" <td>Deputies do it now, but the Tulsa County Sheri...</td>\n",
" <td>They thought getting another company to do the...</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>They thought it would be a cost savings. the ...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , It is incorrect to say they thought get...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1008</th>\n",
" <td>bb2649d2-598d-4ed0-b53d-f3079b6d0a72</td>\n",
" <td>Australia beat South Africa by five wickets in...</td>\n",
" <td>Australia is going to play South Africa in a f...</td>\n",
" <td>c</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>It's definitely incorrect because it says the ...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , Australia is going to play South Africa...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1009</th>\n",
" <td>2180289c-5b41-4272-9344-d5f496297c88</td>\n",
" <td>Feb. 17, 1927 to April 30, 2007 Much beloved w...</td>\n",
" <td>She resided in the Church of Latter Day Saints.</td>\n",
" <td>c</td>\n",
" <td>e</td>\n",
" <td>False</td>\n",
" <td>news</td>\n",
" <td>The woman who died resided in Lake County, not...</td>\n",
" <td>r3_train</td>\n",
" <td>'No, , This is incorrect as it is the church t...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" uid ... r2\n",
"1000 c9d3e8fd-687a-4ce8-b41b-349ff259d59f ... understood\n",
"1001 31285fed-0e08-4927-975c-3e5435690710 ... misunderstood\n",
"1002 5830faca-2efb-4337-bfde-36cc65135f83 ... misunderstood\n",
"1003 cfd2030a-e205-4537-b961-14bd5b1a5eb5 ... misunderstood\n",
"1004 fa1c55b4-9630-4f74-8ddf-601a6162f0f5 ... misunderstood\n",
"1005 a0bc178c-fdd8-49f0-8543-812df5b37b96 ... misunderstood\n",
"1006 5db4bab8-7cc6-4489-8c93-70d400e4cabc ... misunderstood\n",
"1007 a38b9736-7556-48b0-b237-2dcc43e9097d ... misunderstood\n",
"1008 bb2649d2-598d-4ed0-b53d-f3079b6d0a72 ... misunderstood\n",
"1009 2180289c-5b41-4272-9344-d5f496297c88 ... misunderstood\n",
"\n",
"[10 rows x 11 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 29
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "D1m8ArqVFDaX",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 187
},
"outputId": "8d78e79b-e3d4-4193-928b-c42df56973d2"
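},
"source": [
"# NOTE: `eval` shadows Python's built-in eval(); the name is kept because later cells use it.\n",
"eval = train[train.label==\"e\"][1000:1010].copy()\n",
"for idx, row in eval.iterrows():\n",
"  # Frame the context/hypothesis pair as a Tom/Jerry exchange; the prompt ends at \"Tom replied\".\n",
"  payload = \"Tom said '{}'\\n\".format(row[\"context\"])\n",
"  payload += \"Jerry tried to confirm '{}'\\n\".format(row[\"hypothesis\"])\n",
"  payload += \"Tom replied\"\n",
"  eval.at[idx, \"r\"] = query(prompt + payload)\n",
"  print(idx)\n"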
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"2671\n",
"2672\n",
"2673\n",
"2674\n",
"2675\n",
"2676\n",
"2677\n",
"2678\n",
"2679\n",
"2680\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "6ACy9Ta-FJ22",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "ac447add-1965-45e7-de69-2abf055bc544"
},
"source": [
"eval"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>uid</th>\n",
" <th>context</th>\n",
" <th>hypothesis</th>\n",
" <th>label</th>\n",
" <th>model_label</th>\n",
" <th>emturk</th>\n",
" <th>genre</th>\n",
" <th>reason</th>\n",
" <th>tag</th>\n",
" <th>r</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2671</th>\n",
" <td>beed5021-7e91-431e-95a0-945d413318b1</td>\n",
" <td>How to deal with your friend who is being jeal...</td>\n",
" <td>The agent suggest inviting the friends to hang...</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>The context literally says \"invite your friend...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, The agent suggests inv...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2672</th>\n",
" <td>2cad912f-2b1a-4d17-b9e0-13a2098eba23</td>\n",
" <td>How to create a strong email marketing campaig...</td>\n",
" <td>A big reason you would start an email marketin...</td>\n",
" <td>e</td>\n",
" <td>c</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>I just reworded the second sentence.</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , The reason is to...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2673</th>\n",
" <td>e37ff032-8c70-42aa-b492-5c44f72672e0</td>\n",
" <td>How to take care of chickens&lt;br&gt;Check your loc...</td>\n",
" <td>Laws can be found online in most places</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>take the time to search your local laws and re...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, The laws can be found ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2674</th>\n",
" <td>24d0a0e4-7bb0-4f98-9a29-9cd83267d0fa</td>\n",
" <td>How to connect with an animal&lt;br&gt;Learn animal ...</td>\n",
" <td>Animals can't ask a question like people</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>Animals can't use language, therefore they can...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, Animals cannot ask a q...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2675</th>\n",
" <td>3694d351-3afc-42f9-bbc3-ff369818efb0</td>\n",
" <td>How to make a jute bauble&lt;br&gt;Assemble the thin...</td>\n",
" <td>You don't want to not be careful when winding ...</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>I used a double negative which fooled the AI</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , It is incorrect ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2676</th>\n",
" <td>c8246be5-d4f6-4002-9f07-adcbd2273ff1</td>\n",
" <td>How to get rid of a scab&lt;br&gt;Make sure the scab...</td>\n",
" <td>Gauze used to cover a scan should be sterile.</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>Sterile gauze should be used so as not to caus...</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , The gauze should...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2677</th>\n",
" <td>ff57b4f2-e7e1-41b3-9345-109898187fda</td>\n",
" <td>How to make dinner for mormon missionaries&lt;br&gt;...</td>\n",
" <td>Mormon missionaries sometimes go directly to h...</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>The statement is in the correct category becau...</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , The statement is...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2678</th>\n",
" <td>507534ac-4a30-4466-b537-d8bc83f49b07</td>\n",
" <td>How to deliver oral medication to rabbits&lt;br&gt;F...</td>\n",
" <td>A military vet is the not the right vet here.</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>animal vet</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , The vet is the r...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2679</th>\n",
" <td>2b575c10-ce7f-4c1a-8502-6b3a20249de0</td>\n",
" <td>How to grow african violets indoors&lt;br&gt;Buy pre...</td>\n",
" <td>African violets are a popular houseplant</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>The context states that many houseplants enthu...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, African violets are a ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2680</th>\n",
" <td>cb775dc0-5468-4b80-a09b-1b26c217d861</td>\n",
" <td>How to set the avatar of your outlook profile ...</td>\n",
" <td>Setting the avatar of your outlook profile can...</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>There are 3 steps listed to set the avatar in ...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, The steps are correct'...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" uid ... r\n",
"2671 beed5021-7e91-431e-95a0-945d413318b1 ... 'You generally got it!, The agent suggests inv...\n",
"2672 2cad912f-2b1a-4d17-b9e0-13a2098eba23 ... 'I think you misunderstood, , The reason is to...\n",
"2673 e37ff032-8c70-42aa-b492-5c44f72672e0 ... 'You generally got it!, The laws can be found ...\n",
"2674 24d0a0e4-7bb0-4f98-9a29-9cd83267d0fa ... 'You generally got it!, Animals cannot ask a q...\n",
"2675 3694d351-3afc-42f9-bbc3-ff369818efb0 ... 'I think you misunderstood, , It is incorrect ...\n",
"2676 c8246be5-d4f6-4002-9f07-adcbd2273ff1 ... 'I think you misunderstood, , The gauze should...\n",
"2677 ff57b4f2-e7e1-41b3-9345-109898187fda ... 'I think you misunderstood, , The statement is...\n",
"2678 507534ac-4a30-4466-b537-d8bc83f49b07 ... 'I think you misunderstood, , The vet is the r...\n",
"2679 2b575c10-ce7f-4c1a-8502-6b3a20249de0 ... 'You generally got it!, African violets are a ...\n",
"2680 cb775dc0-5468-4b80-a09b-1b26c217d861 ... 'You generally got it!, The steps are correct'...\n",
"\n",
"[10 rows x 10 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 45
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "gB6asBGTFLg9",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "6183a8df-92bc-44dc-e65e-b63d5ae3fcb5"
},
"source": [
"eval[\"r2\"] = eval.r.apply(lambda x: x.split()[-1])\n",
"eval"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>uid</th>\n",
" <th>context</th>\n",
" <th>hypothesis</th>\n",
" <th>label</th>\n",
" <th>model_label</th>\n",
" <th>emturk</th>\n",
" <th>genre</th>\n",
" <th>reason</th>\n",
" <th>tag</th>\n",
" <th>r</th>\n",
" <th>r2</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2671</th>\n",
" <td>beed5021-7e91-431e-95a0-945d413318b1</td>\n",
" <td>How to deal with your friend who is being jeal...</td>\n",
" <td>The agent suggest inviting the friends to hang...</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>The context literally says \"invite your friend...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, The agent suggests inv...</td>\n",
" <td>understood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2672</th>\n",
" <td>2cad912f-2b1a-4d17-b9e0-13a2098eba23</td>\n",
" <td>How to create a strong email marketing campaig...</td>\n",
" <td>A big reason you would start an email marketin...</td>\n",
" <td>e</td>\n",
" <td>c</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>I just reworded the second sentence.</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , The reason is to...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2673</th>\n",
" <td>e37ff032-8c70-42aa-b492-5c44f72672e0</td>\n",
" <td>How to take care of chickens&lt;br&gt;Check your loc...</td>\n",
" <td>Laws can be found online in most places</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>take the time to search your local laws and re...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, The laws can be found ...</td>\n",
" <td>understood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2674</th>\n",
" <td>24d0a0e4-7bb0-4f98-9a29-9cd83267d0fa</td>\n",
" <td>How to connect with an animal&lt;br&gt;Learn animal ...</td>\n",
" <td>Animals can't ask a question like people</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>Animals can't use language, therefore they can...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, Animals cannot ask a q...</td>\n",
" <td>understood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2675</th>\n",
" <td>3694d351-3afc-42f9-bbc3-ff369818efb0</td>\n",
" <td>How to make a jute bauble&lt;br&gt;Assemble the thin...</td>\n",
" <td>You don't want to not be careful when winding ...</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>I used a double negative which fooled the AI</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , It is incorrect ...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2676</th>\n",
" <td>c8246be5-d4f6-4002-9f07-adcbd2273ff1</td>\n",
" <td>How to get rid of a scab&lt;br&gt;Make sure the scab...</td>\n",
" <td>Gauze used to cover a scan should be sterile.</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>Sterile gauze should be used so as not to caus...</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , The gauze should...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2677</th>\n",
" <td>ff57b4f2-e7e1-41b3-9345-109898187fda</td>\n",
" <td>How to make dinner for mormon missionaries&lt;br&gt;...</td>\n",
" <td>Mormon missionaries sometimes go directly to h...</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>The statement is in the correct category becau...</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , The statement is...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2678</th>\n",
" <td>507534ac-4a30-4466-b537-d8bc83f49b07</td>\n",
" <td>How to deliver oral medication to rabbits&lt;br&gt;F...</td>\n",
" <td>A military vet is the not the right vet here.</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>animal vet</td>\n",
" <td>r3_train</td>\n",
" <td>'I think you misunderstood, , The vet is the r...</td>\n",
" <td>misunderstood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2679</th>\n",
" <td>2b575c10-ce7f-4c1a-8502-6b3a20249de0</td>\n",
" <td>How to grow african violets indoors&lt;br&gt;Buy pre...</td>\n",
" <td>African violets are a popular houseplant</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>The context states that many houseplants enthu...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, African violets are a ...</td>\n",
" <td>understood</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2680</th>\n",
" <td>cb775dc0-5468-4b80-a09b-1b26c217d861</td>\n",
" <td>How to set the avatar of your outlook profile ...</td>\n",
" <td>Setting the avatar of your outlook profile can...</td>\n",
" <td>e</td>\n",
" <td>n</td>\n",
" <td>False</td>\n",
" <td>procedural/causal</td>\n",
" <td>There are 3 steps listed to set the avatar in ...</td>\n",
" <td>r3_train</td>\n",
" <td>'You generally got it!, The steps are correct'...</td>\n",
" <td>understood</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" uid ... r2\n",
"2671 beed5021-7e91-431e-95a0-945d413318b1 ... understood\n",
"2672 2cad912f-2b1a-4d17-b9e0-13a2098eba23 ... misunderstood\n",
"2673 e37ff032-8c70-42aa-b492-5c44f72672e0 ... understood\n",
"2674 24d0a0e4-7bb0-4f98-9a29-9cd83267d0fa ... understood\n",
"2675 3694d351-3afc-42f9-bbc3-ff369818efb0 ... misunderstood\n",
"2676 c8246be5-d4f6-4002-9f07-adcbd2273ff1 ... misunderstood\n",
"2677 ff57b4f2-e7e1-41b3-9345-109898187fda ... misunderstood\n",
"2678 507534ac-4a30-4466-b537-d8bc83f49b07 ... misunderstood\n",
"2679 2b575c10-ce7f-4c1a-8502-6b3a20249de0 ... understood\n",
"2680 cb775dc0-5468-4b80-a09b-1b26c217d861 ... understood\n",
"\n",
"[10 rows x 11 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 47
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ii4s38XjFaU0",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "e573d04a-738e-493c-b5e4-785a43955f3a"
},
"source": [
"# Print each example next to the model's raw reply for manual inspection.\n",
"for idx, row in eval.iterrows():\n",
"  print(row[\"context\"])\n",
"  print(row[\"hypothesis\"])\n",
"  print(row[\"label\"])\n",
"  print(row[\"r\"])\n",
"\n",
"  print(\"******\")\n"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"How to deal with your friend who is being jealous of your other friends<br>Invite your friend to join the group. This might feel hard if your friend has made occasions with your other friends uncomfortable in the past. However, stay positive.\n",
"The agent suggest inviting the friends to hang out with your other friends\n",
"e\n",
"'correct!, The agent suggests inviting the friends to hang out with your other friends'\n",
"Label: understood\n",
"******\n",
"How to create a strong email marketing campaign<br>Determine the purpose of your email marketing campaign. Common reasons for undertaking such a project include introducing a new product or service, offering discounts to boost sales or to move excess products, or further constructing your brand and getting your name out to more and more potential buyers. Put together a mailing list of possible customers.\n",
"A big reason you would start an email marketing campaign is to offer lower prices to increase interest in stock items.\n",
"e\n",
"'No, , This is incorrect because it is not a big reason, it is a small reason. The big reason is to introduce a new product or service.'\n",
"Label: misunderstood\n",
"******\n",
"How to take care of chickens<br>Check your local laws and regulations on keeping chickens. Some municipalities will not allow you to keep chickens, or they might have some strict rules on keeping them. Before you start spending money on chickens and supplies, take the time to search your local laws and regulations online.\n",
"Laws can be found online in most places\n",
"e\n",
"'No, , It is not correct to say laws can be found online in most places because it is not true. It is not true that laws can be found online in most places.'\n",
"Label: misunderstood\n",
"******\n",
"How to connect with an animal<br>Learn animal body language. If you want to connect to an animal, you need to know how that animal communicates. As animals cannot use language to converse, they relay heavily on movement and expression to show how they feel.\n",
"Animals can't ask a question like people\n",
"e\n",
"'No, , Animals cannot ask a question like people, because they cannot use language to converse.'\n",
"Label: understood\n",
"******\n",
"How to make a jute bauble<br>Assemble the things needed. These are listed below under \" things you'll need \". Dab glue on one side of the jute twine carefully as you wind it around the ball.\n",
"You don't want to not be careful when winding the jute twine around the ball.\n",
"e\n",
"'No, , it is incorrect to say you don't want to not be careful when winding the jute twine around the ball because you do want to be careful when winding the jute twine around the ball.'\n",
"Label: misunderstood\n",
"******\n",
"How to get rid of a scab<br>Make sure the scab is not oozing. Before you can properly dress a wound or scab, it should be dry. If your wound is oozing blood, place some sterile non-adhering gauze on it.\n",
"Gauze used to cover a scan should be sterile.\n",
"e\n",
"'No, , The gauze should not be sterile, it should be non-adhering'\n",
"Label: misunderstood\n",
"******\n",
"How to make dinner for mormon missionaries<br>Set up an appointment. Missionaries keep a tight schedule, and they often spend the day walking or biking to appointments. Confirm what time you can expect them, and ask what time they need to leave your home.\n",
"Mormon missionaries sometimes go directly to homes.\n",
"e\n",
"'No, , It is incorrect to say that Mormon missionaries sometimes go directly to homes because they do not go directly to homes.'\n",
"Label: misunderstood\n",
"******\n",
"How to deliver oral medication to rabbits<br>Follow the vet's suggestions. Before you give your bunny oral medication, make sure to take her to the vet to get checked out. Give all medication as directed by your veterinarian and for the time period indicated.\n",
"A military vet is the not the right vet here.\n",
"e\n",
"'No, , A military vet is not the right vet here, because a military vet is a vet who is in the military, and this is a vet who is not in the military.'\n",
"Label: misunderstood\n",
"******\n",
"How to grow african violets indoors<br>Buy pre-rooted african violets at a nursery, greenhouse or retail store. Because many houseplant enthusiasts grow african violets as part of their collection, they are usually easy to find. It's best to grow the violets from a pre-rooted plant, as growing african violets from seeds can be extremely difficult.\n",
"African violets are a popular houseplant\n",
"e\n",
"'No, , African violets are not a popular houseplant, they are a popular indoor plant'\n",
"Label: misunderstood\n",
"******\n",
"How to set the avatar of your outlook profile (outlook 2013 )<br>Open outlook 2013 either on mac or windows. Click on \" people \" in the lower right hand corner. Click on \" new contact \" in the top left corner.\n",
"Setting the avatar of your outlook profile can be done in 3 steps. \n",
"e\n",
"'No, , It is incorrect to say that setting the avatar of your outlook profile can be done in 3 steps because it is only 2 steps'\n",
"Label: misunderstood\n",
"******\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Cv7qY7sqX4Rq",
"colab_type": "text"
},
"source": [
"# OK, let's run it against my dev set. I'll save to Drive in case I break stuff."
]
},
{
"cell_type": "code",
"metadata": {
"id": "tyug_qYcFn_Z",
"colab_type": "code",
"colab": {}
},
"source": [
"dev = pd.read_json(\"anli_v1.0/R3/dev.jsonl\",lines=True)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "V56PJ4vZH0N-",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 122
},
"outputId": "382be557-9f74-412b-fee0-073cbd5fc35f"
},
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')\n"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"text": [
"Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly\n",
"\n",
"Enter your authorization code:\n",
"··········\n",
"Mounted at /content/drive\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "9e-p1tDbQpIc",
"colab_type": "code",
"colab": {}
},
"source": [
"import numpy as np"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "wlPSXanuID_B",
"colab_type": "code",
"colab": {}
},
"source": [
"complete = 0\n",
"for idx, row in dev.iterrows():\n",
"\n",
"  # Skip rows that already have a reply so an interrupted run can be resumed.\n",
"  try:\n",
"    if isinstance(row[\"r\"], str):\n",
"      continue\n",
"  except KeyError:\n",
"    pass  # no \"r\" column yet (first run)\n",
"\n",
"  # Same Tom/Jerry framing as above; the prompt ends at \"Tom replied\".\n",
"  payload = \"Tom said '{}'\\n\".format(row[\"context\"])\n",
"  payload += \"Jerry tried to confirm '{}'\\n\".format(row[\"hypothesis\"])\n",
"  payload += \"Tom replied\"\n",
"  dev.at[idx, \"r\"] = query(prompt + payload)\n",
"  # Keep the last whitespace-separated token of the reply (e.g. understood / misunderstood).\n",
"  dev.at[idx, \"r2\"] = dev.at[idx, \"r\"].split()[-1]\n",
"  print(idx)\n",
"\n",
"  complete += 1\n",
"  # Checkpoint to Drive every 50 queries.\n",
"  if complete % 50 == 0:\n",
"    dev.to_pickle(\"/content/drive/My Drive/GPT3_Benchmarking/ANLI_dev_EC.pkl\")\n",
"    print(complete)\n"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "xtXDtGG2YDYy",
"colab_type": "text"
},
"source": [
"# evaluate! (making sure it isn't going to go to chance occasionally)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "cBeg_5WUIwDP",
"colab_type": "code",
"colab": {}
},
"source": [
"#dev2 = dev.dropna().copy()\n",
"dev2 = dev.copy()\n",
"dev2[\"pred\"] = dev2[\"r2\"].apply(lambda x: \"e\" if x == \"understood\" else \"c\")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "WEp9J5yELQrk",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "7924a622-4025-4618-feac-234a5ee0ce7b"
},
"source": [
"(dev2[\"pred\"]==dev2[\"label\"]).sum()/len(dev2)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0.4225"
]
},
"metadata": {
"tags": []
},
"execution_count": 118
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "KFlc4D9YLcta",
"colab_type": "code",
"colab": {}
},
"source": [
"dev3 = dev2[dev2.label.isin([\"e\",\"c\"])]"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "h-n_6kTYLy42",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "28702af6-3d30-4888-b4b2-459fc7500245"
},
"source": [
"(dev3[\"pred\"]==dev3[\"label\"]).sum()/len(dev3)"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0.6353383458646616"
]
},
"metadata": {
"tags": []
},
"execution_count": 120
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "XNZoJASNL3Rw",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
},
"outputId": "97ed513d-e140-4e6a-c2ae-d62d6bd2c362"
},
"source": [
"dev2[dev2.label==\"n\"][\"pred\"].value_counts()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"e 217\n",
"c 185\n",
"Name: pred, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 121
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "94fRAvcsMfUY",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
},
"outputId": "59111c51-3715-4146-9b68-5bd564e26b12"
},
"source": [
"dev2[dev2.label==\"e\"][\"pred\"].value_counts()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"e 271\n",
"c 131\n",
"Name: pred, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 122
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "MvnlGKuoMi3V",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
},
"outputId": "33802c2b-05a8-4f95-f30d-cd9be5178cd5"
},
"source": [
"dev2[dev2.label==\"c\"][\"pred\"].value_counts()"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"c 236\n",
"e 160\n",
"Name: pred, dtype: int64"
]
},
"metadata": {
"tags": []
},
"execution_count": 123
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "GrUhs1hwMj39",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "G72rMutZYQmc",
"colab_type": "text"
},
"source": [
"# Step 2: Pick the N from the E and C"
]
},
{
"cell_type": "code",
"metadata": {
"id": "DgKaFwxxYSnC",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": null,
"outputs": []
}
]
}
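The two accuracy figures in the evaluation cells can be cross-checked from the per-label `value_counts()` outputs alone. The sketch below is a minimal illustration, not part of the original notebook: the `counts` dictionary is copied by hand from those outputs, and the helper name `accuracy` is hypothetical. Because step 1 only ever predicts `e` or `c`, every `n` row counts as an error in the three-way figure.

```python
# Counts copied from the value_counts() outputs: gold label -> {prediction: count}.
counts = {
    "e": {"e": 271, "c": 131},
    "c": {"c": 236, "e": 160},
    "n": {"e": 217, "c": 185},
}


def accuracy(counts, labels):
    """Fraction of rows whose prediction equals the gold label, restricted to `labels`."""
    total = sum(sum(counts[gold].values()) for gold in labels)
    correct = sum(counts[gold].get(gold, 0) for gold in labels)
    return correct / total


# "n" is never predicted in step 1, so it contributes rows but no correct answers.
three_way = accuracy(counts, ["e", "c", "n"])
# Restricting to gold labels "e" and "c" mirrors the dev3 filter.
two_way = accuracy(counts, ["e", "c"])

print(three_way, two_way)
```

With these counts, 507 of 1200 rows are correct overall and 507 of 798 are correct among the `e`/`c` rows, which agrees with the 0.4225 and 0.6353 values shown in the notebook outputs.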