GPL-Domain-Adaptation.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/jamescalam/d2c888775c87f9882bb7c379a96adbc8/gpl-domain-adaptation.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"# Domain Adaptation for Dense Retrieval\n",
"\n",
"Dense retrieval has the issue, that is doesn't work well for new words and new concepts which it hasn't seen during the pre-training or fine-tuning stage.\n",
"\n",
"In the below example, we demonstrate this for a simple query: \"**How is COVID-19 transmitted**\".\n",
"\n",
"As model, we use TAS-B: A DistilBERT model that achieves state-of-the-art performance on MS MARCO (500k queries from Bing Search Engine). Both DistilBERT and MS MARCO were created with data from 2018 and before, hence, it lacks the knowledge of any COVID-related information.\n",
"\n",
"In your example we use a small collection of just 4 documents. If you search with this model, you get the following results (dot-score & document):\n",
"- 94.84\tEbola is transmitted via direct contact with blood\n",
"- 92.87\tHIV is transmitted via sex or sharing needles\n",
"- 92.31\tCorona is transmitted via the air\n",
"- 91.54\tPolio is transmitted via contaminated water or food\n",
"\n",
"\n",
"As we see, the correct document is just ranked on 3rd place behind how Ebola and HIV are transmitted.\n",
"\n",
"## Efficient Domain Adaptation with GPL\n",
"This notebook demonstrates [Generative Pseudo Labeling (GPL)](https://arxiv.org/abs/2112.07577), an efficient approach to adapt existing dense retrieval models to new domains & data.\n",
"\n",
"We get a collection 10k scientific papers on COVID-19 and then fine-tune within 15-60 minutes (depending on your GPU) to include the new COVID knowledge into our model.\n",
"\n",
"If we search again with the updated model, we get the search results we would expect:\n",
"- Query: How is COVID-19 transmitted\n",
"- 97.70\tCorona is transmitted via the air\n",
"- 96.71\tEbola is transmitted via direct contact with blood\n",
"- 95.14\tPolio is transmitted via contaminated water or food\n",
"- 94.13\tHIV is transmitted via sex or sharing needles"
],
"metadata": {
"id": "lwzfvAYedPF3"
},
"id": "lwzfvAYedPF3"
},
{
"cell_type": "code",
"source": [
"!nvidia-smi"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "DDecS1iYdXaO",
"outputId": "83811fce-93cb-4b30-e1a6-c56e931a87ea"
},
"id": "DDecS1iYdXaO",
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Wed Mar 23 20:27:10 2022 \n",
"+-----------------------------------------------------------------------------+\n",
"| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |\n",
"|-------------------------------+----------------------+----------------------+\n",
"| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n",
"| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n",
"| | | MIG M. |\n",
"|===============================+======================+======================|\n",
"| 0 Tesla K80 Off | 00000000:00:04.0 Off | 0 |\n",
"| N/A 43C P8 28W / 149W | 0MiB / 11441MiB | 0% Default |\n",
"| | | N/A |\n",
"+-------------------------------+----------------------+----------------------+\n",
" \n",
"+-----------------------------------------------------------------------------+\n",
"| Processes: |\n",
"| GPU GI CI PID Type Process name GPU Memory |\n",
"| ID ID Usage |\n",
"|=============================================================================|\n",
"| No running processes found |\n",
"+-----------------------------------------------------------------------------+\n"
]
}
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ca440f9b",
"metadata": {
"id": "ca440f9b",
"outputId": "a139fa0f-fa02-489f-b1e0-3d0d9af897ff",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting sentence_transformers\n",
" Downloading sentence-transformers-2.2.0.tar.gz (79 kB)\n",
"\u001b[K |████████████████████████████████| 79 kB 3.3 MB/s \n",
"\u001b[?25hCollecting datasets\n",
" Downloading datasets-2.0.0-py3-none-any.whl (325 kB)\n",
"\u001b[K |████████████████████████████████| 325 kB 24.4 MB/s \n",
"\u001b[?25hCollecting pinecone_client\n",
" Downloading pinecone_client-2.0.8-py3-none-any.whl (149 kB)\n",
"\u001b[K |████████████████████████████████| 149 kB 47.6 MB/s \n",
"\u001b[?25hCollecting transformers<5.0.0,>=4.6.0\n",
" Downloading transformers-4.17.0-py3-none-any.whl (3.8 MB)\n",
"\u001b[K |████████████████████████████████| 3.8 MB 49.3 MB/s \n",
"\u001b[?25hRequirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from sentence_transformers) (4.63.0)\n",
"Requirement already satisfied: torch>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from sentence_transformers) (1.10.0+cu111)\n",
"Requirement already satisfied: torchvision in /usr/local/lib/python3.7/dist-packages (from sentence_transformers) (0.11.1+cu111)\n",
"Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from sentence_transformers) (1.21.5)\n",
"Requirement already satisfied: scikit-learn in /usr/local/lib/python3.7/dist-packages (from sentence_transformers) (1.0.2)\n",
"Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from sentence_transformers) (1.4.1)\n",
"Requirement already satisfied: nltk in /usr/local/lib/python3.7/dist-packages (from sentence_transformers) (3.2.5)\n",
"Collecting sentencepiece\n",
" Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)\n",
"\u001b[K |████████████████████████████████| 1.2 MB 53.5 MB/s \n",
"\u001b[?25hCollecting huggingface-hub\n",
" Downloading huggingface_hub-0.4.0-py3-none-any.whl (67 kB)\n",
"\u001b[K |████████████████████████████████| 67 kB 5.5 MB/s \n",
"\u001b[?25hRequirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch>=1.6.0->sentence_transformers) (3.10.0.2)\n",
"Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence_transformers) (3.6.0)\n",
"Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence_transformers) (2.23.0)\n",
"Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence_transformers) (2019.12.20)\n",
"Collecting sacremoses\n",
" Downloading sacremoses-0.0.49-py3-none-any.whl (895 kB)\n",
"\u001b[K |████████████████████████████████| 895 kB 34.3 MB/s \n",
"\u001b[?25hRequirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence_transformers) (4.11.3)\n",
"Collecting pyyaml>=5.1\n",
" Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)\n",
"\u001b[K |████████████████████████████████| 596 kB 43.3 MB/s \n",
"\u001b[?25hRequirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence_transformers) (21.3)\n",
"Collecting tokenizers!=0.11.3,>=0.11.1\n",
" Downloading tokenizers-0.11.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.5 MB)\n",
"\u001b[K |████████████████████████████████| 6.5 MB 38.1 MB/s \n",
"\u001b[?25hRequirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging>=20.0->transformers<5.0.0,>=4.6.0->sentence_transformers) (3.0.7)\n",
"Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from datasets) (1.3.5)\n",
"Collecting responses<0.19\n",
" Downloading responses-0.18.0-py3-none-any.whl (38 kB)\n",
"Requirement already satisfied: multiprocess in /usr/local/lib/python3.7/dist-packages (from datasets) (0.70.12.2)\n",
"Requirement already satisfied: dill in /usr/local/lib/python3.7/dist-packages (from datasets) (0.3.4)\n",
"Collecting aiohttp\n",
" Downloading aiohttp-3.8.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)\n",
"\u001b[K |████████████████████████████████| 1.1 MB 34.9 MB/s \n",
"\u001b[?25hCollecting xxhash\n",
" Downloading xxhash-3.0.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)\n",
"\u001b[K |████████████████████████████████| 212 kB 37.1 MB/s \n",
"\u001b[?25hCollecting fsspec[http]>=2021.05.0\n",
" Downloading fsspec-2022.2.0-py3-none-any.whl (134 kB)\n",
"\u001b[K |████████████████████████████████| 134 kB 36.9 MB/s \n",
"\u001b[?25hRequirement already satisfied: pyarrow>=5.0.0 in /usr/local/lib/python3.7/dist-packages (from datasets) (6.0.1)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->transformers<5.0.0,>=4.6.0->sentence_transformers) (2021.10.8)\n",
"Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->transformers<5.0.0,>=4.6.0->sentence_transformers) (3.0.4)\n",
"Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->transformers<5.0.0,>=4.6.0->sentence_transformers) (2.10)\n",
"Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->transformers<5.0.0,>=4.6.0->sentence_transformers) (1.24.3)\n",
"Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1\n",
" Downloading urllib3-1.25.11-py2.py3-none-any.whl (127 kB)\n",
"\u001b[K |████████████████████████████████| 127 kB 39.4 MB/s \n",
"\u001b[?25hCollecting dnspython>=2.0.0\n",
" Downloading dnspython-2.2.1-py3-none-any.whl (269 kB)\n",
"\u001b[K |████████████████████████████████| 269 kB 31.8 MB/s \n",
"\u001b[?25hCollecting loguru>=0.5.0\n",
" Downloading loguru-0.6.0-py3-none-any.whl (58 kB)\n",
"\u001b[K |████████████████████████████████| 58 kB 4.4 MB/s \n",
"\u001b[?25hRequirement already satisfied: python-dateutil>=2.5.3 in /usr/local/lib/python3.7/dist-packages (from pinecone_client) (2.8.2)\n",
"Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.5.3->pinecone_client) (1.15.0)\n",
"Collecting multidict<7.0,>=4.5\n",
" Downloading multidict-6.0.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (94 kB)\n",
"\u001b[K |████████████████████████████████| 94 kB 3.6 MB/s \n",
"\u001b[?25hCollecting aiosignal>=1.1.2\n",
" Downloading aiosignal-1.2.0-py3-none-any.whl (8.2 kB)\n",
"Collecting frozenlist>=1.1.1\n",
" Downloading frozenlist-1.3.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (144 kB)\n",
"\u001b[K |████████████████████████████████| 144 kB 51.4 MB/s \n",
"\u001b[?25hCollecting yarl<2.0,>=1.0\n",
" Downloading yarl-1.7.2-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (271 kB)\n",
"\u001b[K |████████████████████████████████| 271 kB 51.2 MB/s \n",
"\u001b[?25hCollecting asynctest==0.13.0\n",
" Downloading asynctest-0.13.0-py3-none-any.whl (26 kB)\n",
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.7/dist-packages (from aiohttp->datasets) (21.4.0)\n",
"Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /usr/local/lib/python3.7/dist-packages (from aiohttp->datasets) (2.0.12)\n",
"Collecting async-timeout<5.0,>=4.0.0a3\n",
" Downloading async_timeout-4.0.2-py3-none-any.whl (5.8 kB)\n",
"Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->transformers<5.0.0,>=4.6.0->sentence_transformers) (3.7.0)\n",
"Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas->datasets) (2018.9)\n",
"Requirement already satisfied: click in /usr/local/lib/python3.7/dist-packages (from sacremoses->transformers<5.0.0,>=4.6.0->sentence_transformers) (7.1.2)\n",
"Requirement already satisfied: joblib in /usr/local/lib/python3.7/dist-packages (from sacremoses->transformers<5.0.0,>=4.6.0->sentence_transformers) (1.1.0)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn->sentence_transformers) (3.1.0)\n",
"Requirement already satisfied: pillow!=8.3.0,>=5.3.0 in /usr/local/lib/python3.7/dist-packages (from torchvision->sentence_transformers) (7.1.2)\n",
"Building wheels for collected packages: sentence-transformers\n",
" Building wheel for sentence-transformers (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Created wheel for sentence-transformers: filename=sentence_transformers-2.2.0-py3-none-any.whl size=120747 sha256=e8bec25e4ccb7b9a3d83fdd0f98561338b3f7a18eb00271a7bb9a32e40550f6c\n",
" Stored in directory: /root/.cache/pip/wheels/83/c0/df/b6873ab7aac3f2465aa9144b6b4c41c4391cfecc027c8b07e7\n",
"Successfully built sentence-transformers\n",
"Installing collected packages: urllib3, multidict, frozenlist, yarl, pyyaml, asynctest, async-timeout, aiosignal, tokenizers, sacremoses, huggingface-hub, fsspec, aiohttp, xxhash, transformers, sentencepiece, responses, loguru, dnspython, sentence-transformers, pinecone-client, datasets\n",
" Attempting uninstall: urllib3\n",
" Found existing installation: urllib3 1.24.3\n",
" Uninstalling urllib3-1.24.3:\n",
" Successfully uninstalled urllib3-1.24.3\n",
" Attempting uninstall: pyyaml\n",
" Found existing installation: PyYAML 3.13\n",
" Uninstalling PyYAML-3.13:\n",
" Successfully uninstalled PyYAML-3.13\n",
"\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
"datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.\u001b[0m\n",
"Successfully installed aiohttp-3.8.1 aiosignal-1.2.0 async-timeout-4.0.2 asynctest-0.13.0 datasets-2.0.0 dnspython-2.2.1 frozenlist-1.3.0 fsspec-2022.2.0 huggingface-hub-0.4.0 loguru-0.6.0 multidict-6.0.2 pinecone-client-2.0.8 pyyaml-6.0 responses-0.18.0 sacremoses-0.0.49 sentence-transformers-2.2.0 sentencepiece-0.1.96 tokenizers-0.11.6 transformers-4.17.0 urllib3-1.25.11 xxhash-3.0.0 yarl-1.7.2\n"
]
}
],
"source": [
"!pip install sentence_transformers datasets pinecone_client"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5dbc894a",
"metadata": {
"id": "5dbc894a"
},
"outputs": [],
"source": [
"from sentence_transformers import SentenceTransformer, util, InputExample, losses\n",
"from torch.utils.data import DataLoader\n",
"from sentence_transformers import CrossEncoder\n",
"from transformers import AutoTokenizer, AutoModelForSeq2SeqLM\n",
"import tqdm\n",
"import random\n",
"from datasets import load_dataset\n",
"import pinecone"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3c447291",
"metadata": {
"id": "3c447291",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 401,
"referenced_widgets": [
"1a1a54fba1c64fa08cb498c3d44ffb7d",
"fea370531ccc4504a1a3373706378574",
"56f9c71edb034650b30b202e8ffbc48f",
"6c2ce6f3fff34e24ad6cb3a8f9ecb68e",
"5be7da4ebe324e5e80a5b5df121b1357",
"5ae2de1deabd45c484cbbab727219a46",
"528b358f0fc3460f831b432caff7ce1a",
"7b5d43e7ba9c41d699e6c58743922ce1",
"72bb3536051243249970fcce2e97576a",
"cc3afdb243b5446fa3ce3a72e08ddfa3",
"3f0375a254c045568a85401c50b4ba64",
"61766fdf2d624a82816dfd11bad86d13",
"1b0628147ac04954b2810333e9a44784",
"a7b04b4d755940eeaaf9166337093b8c",
"a40d0c68fe4b43ef84467511d2eccfe7",
"b0fe621a97f3449180d2c6d8c22dc9de",
"e99ad1a1f5b541809d9a2f19ff10846d",
"71d86604eb4742dcb4966df42be7801f",
"0805b3d4285142e08ae82f09944665bf",
"6d2d529c1ac54378b944ae4eaee56fcd",
"320bd409bf9f4e0fb0580e41d423c4d9",
"af1fdc87f14849eeb99fba2c38dff580",
"10043932feb34be8a9a0cd95a3d495d6",
"07ede0ede2a740a1a1fb32c4076af0c7",
"af2f0329158e4dae9278c664cc76696a",
"c85071546c0740919c3b0b4fe2abb7ac",
"c5e9f24cffa4480b8d05498247f6fb93",
"59997806751f484db9983ef4e7ac9319",
"f51501d34758419191d63411bce2fd19",
"f3cf53a6b6f54a32ac9e83b73b962fdc",
"d5b0f72025a64fb8a7dd689be300ddff",
"6e401197dee241659be86dfa4b0cbde6",
"1dd5eefb3dd24d3c8a308c05ee65750d",
"0ad0b526378349d8a8ba8b794df9b400",
"e68586c3381146b69f80a6450403a8b6",
"36f4464361364f3b9fc760bc8bb03293",
"f1f871fc9e394bdf979481c707fe6f6e",
"81bec02598e446ed8f298fd50ad52ec8",
"cc4e70c8cea249f08b462052315b3510",
"7a56d1c1ccd2488ca9f0d2d43b540c85",
"292216407e504481a2ebc38f86ed2c57",
"78578f6bb02a45d1b7cbdccff4a868ff",
"b29e0d98df1441d4ae513ec704e7614d",
"27599b5652c246fb82fde622d93b75f3",
"9106c4d055ed46069104dd693b945ed0",
"8e7c4ab9b92a4a309bb602db95d401c8",
"edddfc88c2554bb5a9486dbbc465bb9b",
"8a077ab30ce542b4b49a3f72bb198782",
"08816a3caf6840ddb84f368611bd8a39",
"281e8dc5110a4657b722ff8c7a01b431",
"da760b79c5484f589f22943aef1ef959",
"6e61a142f5ea4ab5868648726406f222",
"6f1588d783ab42a985162c1b968800ec",
"5030e93732484b7f81a5a53fbaefb8e6",
"6651ff4fac7f416e8fe965991e069902",
"4a11441e6b5f43059769a8dcdf232aab",
"0391fa5e89d1491c99d119afc471276a",
"d24edc24cd024286b7b454c1b5d62d30",
"c0ec3a9ef9a94330a986478141a43b9a",
"880a94f8947a4ab9852aa006e4ca3ad5",
"9f3e6a663fc646f9bbc682ba3f85b136",
"d0158ac0df474245b14fad106c4e70f2",
"cbb73a35f8284435a56fee1e45452ac0",
"381347d6943e47b6a97316d38f6fdd10",
"7365562ccd43428aae42c9c540bf076a",
"613a20e0d42f4455a8f5d8b75b0bb0c2",
"d1a28bd8761d41b38296963a3931f6ac",
"e63c273775b54c1ea3e6400b4a8037c3",
"c68edb8867ad401ca7ae67e5baa45ebc",
"ece10d2b8603406aa8817985063936ec",
"23f50683904e4049a4fe57a80cd765e7",
"c2e0db6b8ba244c0b24855151fe8494b",
"04e6538bbb4a45faad4cdf1d6e41c280",
"f4f4c87b9ea64f338bb16e2378f4ade3",
"f7b2e4ed3c784bd6aa927b452e0f5669",
"29093d3364184371bf7fd15688be3a68",
"de985cd9f5e2421a9ce617560f69b37b",
"230669c2735c41a98382267e6bde8192",
"0d1e63cfb1fb46c5951bb17af75f6faa",
"4b87aea01bb141408d722a031d3c03e5",
"95dcc74105634b60b97fec037a49e5ed",
"ec063797b40e44339c8884c8042fbe50",
"b8d8c7d73b91427395c8ae4b6a78031c",
"869129fd8b954a249c567c939b82ade7",
"7ddf13f71c45487bbe8bf3619e6ba540",
"bfb7cdeee24a435daea1c4ec0df9d3c0",
"96e487df0f394144ab4f76c4d2947dda",
"c5e78ce804e745d0a2e23513ed9801f8",
"e5b476ed4e414b46a83096b16ee0bc6a",
"99e34f9d4ffa417f988c9b3ec9ef6d12",
"5bebff3040c14b3ba9f67d48e1f22208",
"0a61b0b9f7714482b4a80f74a370044a",
"24df352c94b7486eb8729226d6089801",
"b9e07794b8a145459b70b2b2effae198",
"5f120feb532a4200836b0dd641cf83ee",
"233d0ec8f738410fab7add8e1663dfd3",
"de2045d01fec4bddb88cf79847e2fa28",
"2ccecaf718d74b278e3ef96dc3cbb539",
"3a182f3aa9ea4ef18e48d77cdf177b8e",
"dc6e7af411cc4e5e86f4bd5546280aed",
"12aead9e3e69470aa3effbcaa78ead07",
"2eb52c89e2724f059820e013a6f129b6",
"23dc95de8403478c8a46befa4e65578c",
"e0a3ed9826f9481f8f25926c3946e269",
"894770ded4a14d9abeef02f191a2ed5a",
"a5c5628d82dc43029e8769a649c9453a",
"4d7365140d1342c3b87a122a08023542",
"110d6b94b7f243e5b7899e49af5b2a3c",
"41c2be6da6ca4c73b5590c6af4daff83",
"b704305a36c046009a397b4e7ba7bf8d",
"5cffb63c6b3e4711b7cb5b4b4a002f08",
"4c82e5a3698b49f49e101e7b05705c64",
"28b1e772dd9d40869629a50f29796882",
"51a71de75a134074b5dcc042e1431bb4",
"86fe2e741383483f8b35b16f266a3e16",
"0cd6ef62b112470a969cc11642790871",
"9bc4810a889d4a499a07e40fdcac769b",
"c646c23225b24f959bd43b1978d74038",
"faacf4a5f2f743e598c9e7460138304f",
"37c102aabbdb4de097f6b945b2523ec4",
"481e9c60f6a04846bdb82a430b265dd7",
"b959e8a27fa848d498db81aaccac5fdb",
"fc306ca1f2a5483a87cdbc881665a087",
"74dfd2a65dac424f826e5c022a142fed",
"77346324e9424a04923a2ad23b5c5655",
"6897d5be3397443c97dcf360cde5eabf",
"78d696fc166c4f82a242b9cdc97bcf5c",
"a7254fdffd52480bb4faf65576adcca8",
"54ae2995a41a46c39d3b75af4f3ed180",
"83583109259644e0a7c23bcb42d39863",
"6ccf080674bc416c9825522d5684451c",
"52934653a0db435dbb8d2c869d52a0e5"
]
},
"outputId": "fe696812-ab08-47e3-a492-2a31b4ff608c"
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/690 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "1a1a54fba1c64fa08cb498c3d44ffb7d"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/3.95k [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "61766fdf2d624a82816dfd11bad86d13"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/548 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "10043932feb34be8a9a0cd95a3d495d6"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/122 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "0ad0b526378349d8a8ba8b794df9b400"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/229 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "9106c4d055ed46069104dd693b945ed0"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/265M [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "4a11441e6b5f43059769a8dcdf232aab"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/53.0 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "d1a28bd8761d41b38296963a3931f6ac"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/112 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "230669c2735c41a98382267e6bde8192"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/466k [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "e5b476ed4e414b46a83096b16ee0bc6a"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/547 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "dc6e7af411cc4e5e86f4bd5546280aed"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/232k [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "5cffb63c6b3e4711b7cb5b4b4a002f08"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/190 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "b959e8a27fa848d498db81aaccac5fdb"
}
},
"metadata": {}
}
],
"source": [
"# We load the TAS-B model, a state-of-the-art model trained on MS MARCO\n",
"max_seq_length = 200\n",
"model_name = \"msmarco-distilbert-base-tas-b\"\n",
"\n",
"org_model = SentenceTransformer(model_name)\n",
"org_model.max_seq_length = max_seq_length"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3d42c72a",
"metadata": {
"id": "3d42c72a",
"outputId": "5f50a198-083b-4973-e96e-00570ce66492",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Original Model\n",
"Query: How is COVID-19 transmitted\n",
"94.84\tEbola is transmitted via direct contact with blood\n",
"92.87\tHIV is transmitted via sex or sharing needles\n",
"92.31\tCorona is transmitted via the air\n",
"91.54\tPolio is transmitted via contaminated water or food\n"
]
}
],
"source": [
"# We define a simple query and some documents how diseases are transmitted\n",
"# As TAS-B was trained on rather out-dated data (2018 and older), it has now idea about COVID-19\n",
"# So in the below example, it fails to recognize the relationship between COVID-19 and Corona\n",
"\n",
"def show_examples(model):\n",
" query = \"How is COVID-19 transmitted\"\n",
" docs = [\n",
" \"Corona is transmitted via the air\",\n",
" \"Ebola is transmitted via direct contact with blood\",\n",
" \"HIV is transmitted via sex or sharing needles\",\n",
" \"Polio is transmitted via contaminated water or food\"\n",
" ]\n",
"\n",
" query_emb = model.encode(query)\n",
" docs_emb = model.encode(docs)\n",
" scores = util.dot_score(query_emb, docs_emb)[0]\n",
" doc_scores = sorted(zip(docs, scores), key=lambda x: x[1], reverse=True)\n",
"\n",
" print(\"Query:\", query)\n",
" for doc, score in doc_scores:\n",
" #print(doc, score)\n",
" print(f\"{score:0.02f}\\t{doc}\")\n",
" \n",
" \n",
"print(\"Original Model\")\n",
"show_examples(org_model)"
]
},
{
"cell_type": "markdown",
"id": "5e3744e9",
"metadata": {
"id": "5e3744e9"
},
"source": [
"# Get Some Data on COVID-19\n",
"We select 10k scientific publications (title + abstract) that are connected to COVID-19. As dataset we use [TREC-COVID-19](https://huggingface.co/datasets/nreimers/trec-covid)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b4af1063",
"metadata": {
"id": "b4af1063",
"outputId": "e36efea0-49b0-4930-e4dc-fa03650f507d",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206,
"referenced_widgets": [
"5e1d88ca38ef4cebb729f99eccd68bcd",
"25a0b085f3be444b82993d9351bb5870",
"a30fd0579e474420a1361cb13f7e5359",
"68b5aba36aad44618fce8a1d85ab38c1",
"39eebfbebc5e4832a128a5c7ba5ccd8c",
"dba0dce6128e4148ac05c40bc9f5e4ff",
"b70fd29acae24a9aa300e9d9ded761de",
"ba6b1cbb989f4f4680200ff20b4383c4",
"892efaf72bae4365ad9d3b3a229ce9b1",
"d89f53e596a04b5399d87e009fc4b4c8",
"76ca17fdd2214fc485c4695e75e16455",
"e74e1c800cdf4859834200cb02dc4f3e",
"a30461fa45ee4c358d2d08d09a3bee2d",
"dc942323e6ef464fbd780669d9de3f04",
"6928ef7c4f114363a2acb2f715bf83a6",
"c5db671386004902a0a3a88ec4243f56",
"a60b8f9198bd4c61bfd93c280833c41b",
"eee520c40e6946c7994e67dc53f60923",
"b22a345d588340c3939eb3cd795fbd93",
"a8888bf2b2ce4c0597d41ff3fa89feb9",
"41a8960515764b509978b58c7e0b21dc",
"507e0427ff6248d9bd2f0af041267053",
"6f4e8eafec8c4960bdb4f51c3d86b72b",
"197eaa880e4849d8bdde5f8b941aefe9",
"93e8c2ec44af478c8d2c436aa4f5c540",
"a539617674344c7e98a49350cc52f8e4",
"0113e7da88b64921ad3490af2978b10d",
"1ed9f295062d435fbc88d85da68facd5",
"8cda7f855c1949fa9b26c2fc85a4d788",
"e90b36c7bdc34a30bb4041725356e641",
"655acfa985ec463d89d61570c8f02713",
"08da3816bcc94e5d8248fec6f5a3ac23",
"e00fef93042a404082ab3e79d23c795d"
]
}
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Using custom data configuration nreimers--trec-covid-c0101bec5094a9fc\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Downloading and preparing dataset json/nreimers--trec-covid to /root/.cache/huggingface/datasets/json/nreimers--trec-covid-c0101bec5094a9fc/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b...\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading data files: 0%| | 0/1 [00:00<?, ?it/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "5e1d88ca38ef4cebb729f99eccd68bcd"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading data: 0%| | 0.00/73.5M [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "e74e1c800cdf4859834200cb02dc4f3e"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Extracting data files: 0%| | 0/1 [00:00<?, ?it/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "6f4e8eafec8c4960bdb4f51c3d86b72b"
}
},
"metadata": {}
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/nreimers--trec-covid-c0101bec5094a9fc/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b. Subsequent calls will reuse this data.\n",
"Len Corpus: 10000\n"
]
}
],
"source": [
"dataset = load_dataset('nreimers/trec-covid', split='train')\n",
"\n",
"corpus = []\n",
"for row in dataset:\n",
" if len(row['title']) > 20 and len(row['text']) > 100:\n",
" text = row['title']+\" \"+row['text']\n",
" \n",
" text_lower = text.lower()\n",
" \n",
" # The dataset also contains many papers on other diseases. To make the training in this demo\n",
" # more efficient, we focus on papers that talk about COVID.\n",
" if 'covid' in text_lower or 'corona' in text_lower or 'sars-cov-2' in text_lower:\n",
" corpus.append(text)\n",
" \n",
" if len(corpus) >= 10000:\n",
" break\n",
" \n",
"print(\"Len Corpus:\", len(corpus))"
]
},
{
"cell_type": "markdown",
"id": "85942063",
"metadata": {
"id": "85942063"
},
"source": [
"# Generate Queries\n",
"Next, for our 10k documents we generate possible queries a person might ask. Here we use the [doc2query/msmarco-t5-base-v1](https://huggingface.co/doc2query/msmarco-t5-base-v1)\n",
"\n",
"It is a T5 model trained on MS MARCO (with data from before 2018): Given a text passage, it generates a matching search query. Even though the model hasn't seen any COVID related content, it can still produce sensible queries by copying words from the input text."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "183ff7ab",
"metadata": {
"id": "183ff7ab",
"outputId": "d8d20d73-6c1c-4b29-a0ed-b58de3840e5a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 206,
"referenced_widgets": [
"97d8d067a6854955b9a8f3d75e94c0f2",
"7dc62abd59f84ca28a3c951ce0931a26",
"47739e4fb33245b3a1f1ab0a0ab330d4",
"23e41d891e444e04828219a2fb85b54f",
"634cfcb75bc84a529ea63cd340bfe3de",
"776a503db5804251b0c2f291cb7b8030",
"3b33b69413ed4763b46cd5466c233935",
"6c9e59428b964aeea0c25b761de176e1",
"0e587b53f894408bb469729665edeb60",
"85de60d9aacd4275ad0d12114fa861ea",
"358403dfddf44198a066f4fc8bd3c3b8",
"955a675a26304693b3a2704cdef5993e",
"7c8c0787f0f345f08b356e8df83b04be",
"57c48178db7547b3808cde15352eeee5",
"672104035a9b458bb081cc36a0d86526",
"7c1d1fd6164948b6be1d0973aeb2e6ca",
"f96b3b72850444599fd7616a00c7fb65",
"c0fc71340dae49098379a0e159a09ed0",
"9f1eac3bd25c4c14a80ae6cca65ca038",
"b3f68e64f50143d5a794d8a1cf45b5e2",
"3a98ad77c18845edaf94a2a5bdf21733",
"ae67b804807c4450aa6d420154afa420",
"cb49b0f2e56b4e15a21ff08f2229872c",
"3909f85807ac4d058bb89a5aff5a405a",
"2323312aaedd44c1afcce64549a96cfc",
"d95f48e0678c460480cdd03ef46756d5",
"b76bae5db2d641e0977517db11de1e2a",
"a3a1e1c6067f41868a0f123517bc82b0",
"6fc468ac4ef249edb17598b299d30442",
"b3dc4d95462f45708070da820af7f87b",
"7df60cc6baf5425daf633cdd65c026a9",
"edb48ba537024c479da4e385ddc6c6bf",
"9da86d6c455d4bc4826dcd19819bd1b4"
]
}
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"Using custom data configuration nreimers--trec-covid-generated-queries-c23aacba4e20cdbe\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Downloading and preparing dataset json/nreimers--trec-covid-generated-queries to /root/.cache/huggingface/datasets/json/nreimers--trec-covid-generated-queries-c23aacba4e20cdbe/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b...\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading data files: 0%| | 0/1 [00:00<?, ?it/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "97d8d067a6854955b9a8f3d75e94c0f2"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading data: 0%| | 0.00/6.62M [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "955a675a26304693b3a2704cdef5993e"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Extracting data files: 0%| | 0/1 [00:00<?, ?it/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "cb49b0f2e56b4e15a21ff08f2229872c"
}
},
"metadata": {}
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/nreimers--trec-covid-generated-queries-c23aacba4e20cdbe/0.0.0/ac0ca5f5289a6cf108e706efcf040422dbbfa8e658dee6a819f20d76bb84d26b. Subsequent calls will reuse this data.\n",
"Generated queries: 30000\n"
]
}
],
"source": [
"query_doc_pairs = []\n",
"load_queries_from_hub = True\n",
"\n",
"\n",
"# Generating the queries is quite slow in Colab due to the old GPU and the limited CPU.\n",
"# I pre-computed the queries and uploaded them to the HF dataset hub; here we just download them.\n",
"if load_queries_from_hub:\n",
"    generated_queries = load_dataset('nreimers/trec-covid-generated-queries', split='train')\n",
"    for row in generated_queries:\n",
"        query_doc_pairs.append([row['query'], row['doc']])\n",
"else:\n",
"    # Load doc2query model\n",
"    t5_name = 'doc2query/msmarco-t5-base-v1'\n",
"    t5_tokenizer = AutoTokenizer.from_pretrained(t5_name)\n",
"    t5_model = AutoModelForSeq2SeqLM.from_pretrained(t5_name).cuda()\n",
"\n",
"    batch_size = 32\n",
"    queries_per_doc = 3\n",
"\n",
"    for start_idx in tqdm.tqdm(range(0, len(corpus), batch_size)):\n",
"        corpus_batch = corpus[start_idx:start_idx+batch_size]\n",
"        enc_inp = t5_tokenizer(corpus_batch,\n",
"                               max_length=max_seq_length,\n",
"                               truncation=True,\n",
"                               padding=True,\n",
"                               return_tensors='pt')\n",
"\n",
"        # Sample several queries per document with nucleus (top-p) sampling\n",
"        outputs = t5_model.generate(\n",
"            input_ids=enc_inp['input_ids'].cuda(),\n",
"            attention_mask=enc_inp['attention_mask'].cuda(),\n",
"            max_length=64,\n",
"            do_sample=True,\n",
"            top_p=0.95,\n",
"            num_return_sequences=queries_per_doc,\n",
"        )\n",
"\n",
"        decoded_output = t5_tokenizer.batch_decode(outputs, skip_special_tokens=True)\n",
"\n",
"        # generate() returns queries_per_doc consecutive outputs per input document\n",
"        for idx, query in enumerate(decoded_output):\n",
"            corpus_id = idx // queries_per_doc\n",
"            query_doc_pairs.append([query, corpus_batch[corpus_id]])\n",
"\n",
"\n",
"print(\"Generated queries:\", len(query_doc_pairs))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d01e6788",
"metadata": {
"id": "d01e6788",
"outputId": "a4f17e8a-b695-4f8a-c0d1-2d37c5da60db",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"what is a coronavirus\n",
"Infection, Replication, and Transmission of Middle East Respiratory Syndrome Coronavirus in Alpacas Middle East respiratory syndrome coronavirus is a recently emerged pathogen associated with severe human disease. Zoonotic spillover from camels appears to play a major role in transmission. Because of logistic difficulties in working with dromedaries in containment, a more manageable animal model would be desirable. We report shedding and transmission of this virus in experimentally infected alpacas (n = 3) or those infected by contact (n = 3). Infectious virus was detected in all infected animals and in 2 of 3 in-contact animals. All alpacas seroconverted and were rechallenged 70 days after the original infection. Experimentally infected animals were protected against reinfection, and those infected by contact were partially protected. Necropsy specimens from immunologically naive animals (n = 3) obtained on day 5 postinfection showed virus in the upper respiratory tract. These data demonstrate efficient virus replication and animal-to-animal transmission and indicate that alpacas might be useful surrogates for camels in laboratory studies.\n"
]
}
],
"source": [
"# Search for a query that contains the word 'corona'\n",
"for query, doc in query_doc_pairs:\n",
"    if 'corona' in query:\n",
"        print(query)\n",
"        print(doc)\n",
"        break\n"
]
},
{
"cell_type": "markdown",
"id": "c882534e",
"metadata": {
"id": "c882534e"
},
"source": [
"# Mine Negatives\n",
"\n",
"To mine negatives, we can use a vector database such as Pinecone; all we need is an API key from [app.pinecone.io](https://app.pinecone.io). First, we encode our corpus into embeddings."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7313d58d",
"metadata": {
"id": "7313d58d",
"outputId": "45b328af-b0ee-480b-f83e-40db222b91eb",
"colab": {
"referenced_widgets": [
"c9e6d2d11e8b4f3b88619daf2acc0e4a",
"c2f1710c6cfd42f0a2d1fa8faef72983",
"1852cd227739407caf6472d045e99fed",
"c20ec806f6f44596949fc817b0409fc3",
"6acf8fb8c45649e1876964ca8b045842",
"0251f6a516de43cda0df37aba863f9c1",
"96426a1a9c4148f1ad07d1c59c21f5b6",
"ed4544dedb994729a2b35474773e585a",
"833ec3c127054617ad12bad8230fbcbf",
"4c2dbba608b34e19a86048e4e7845654",
"53300e76188e479cb091d37b5327fbc6"
],
"base_uri": "https://localhost:8080/",
"height": 67
}
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Batches: 0%| | 0/313 [00:00<?, ?it/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "c9e6d2d11e8b4f3b88619daf2acc0e4a"
}
},
"metadata": {}
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"torch.Size([10000, 768])"
]
},
"metadata": {},
"execution_count": 9
}
],
"source": [
"corpus_emb = org_model.encode(corpus, convert_to_tensor=True, show_progress_bar=True)\n",
"corpus_emb.shape"
]
},
{
"cell_type": "markdown",
"id": "6fe8be27",
"metadata": {
"id": "6fe8be27"
},
"source": [
"Then initialize a Pinecone index."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4172d247",
"metadata": {
"id": "4172d247"
},
"outputs": [],
"source": [
"with open('secret', 'r') as fp:\n",
"    API_KEY = fp.read().strip()  # strip any trailing newline from the key file\n",
"\n",
"pinecone.init(api_key=API_KEY, environment='us-west1-gcp')\n",
"\n",
"# create a new mining index if it does not exist\n",
"if 'negative-mine' not in pinecone.list_indexes():\n",
"    pinecone.create_index(\n",
"        'negative-mine', dimension=corpus_emb.shape[1],\n",
"        metric='dotproduct', pods=20, pod_type='p1'  # limit of pods=1 for free plan (more pods == faster mining)\n",
"    )\n",
"# connect\n",
"index = pinecone.Index('negative-mine')"
]
},
{
"cell_type": "markdown",
"id": "bcd139da-9f2d-441d-b8e3-488a70250fbb",
"metadata": {
"id": "bcd139da-9f2d-441d-b8e3-488a70250fbb"
},
"source": [
"| Pods (p1) | 10K mine time (mm:ss) |\n",
"| --- | --- |\n",
"| 1 | 22:00 |\n",
"| 5 | 09:44 |\n",
"| 10 | 05:34 |\n",
"| 20 | 03:15 |\n",
"| 30 | 03:11 |\n",
"\n",
"*(latency depends on network location and other variables)*"
]
},
{
"cell_type": "markdown",
"id": "fa812ff6",
"metadata": {
"id": "fa812ff6"
},
"source": [
"We then upload the embeddings to our index."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "54a5f006",
"metadata": {
"id": "54a5f006",
"outputId": "9d7f1e5f-9f3e-4373-82e4-fd4c95cae7fc",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"100%|██████████| 313/313 [01:33<00:00, 3.35it/s]\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'dimension': 768,\n",
" 'index_fullness': 0.0,\n",
" 'namespaces': {'': {'vector_count': 10000}}}"
]
},
"metadata": {},
"execution_count": 21
}
],
"source": [
"batch_size = 32\n",
"\n",
"for i in tqdm.tqdm(range(0, len(corpus_emb), batch_size)):\n",
"    i_end = min(i+batch_size, len(corpus_emb))\n",
"    batch_emb = corpus_emb[i:i_end, :].tolist()\n",
"    batch_ids = [str(x) for x in range(i, i_end)]\n",
"    # upload to index\n",
"    index.upsert(vectors=list(zip(batch_ids, batch_emb)))\n",
"index.describe_index_stats()"
]
},
{
"cell_type": "markdown",
"id": "598680f0-6634-4589-9ef3-840068d2d270",
"metadata": {
"id": "598680f0-6634-4589-9ef3-840068d2d270"
},
"source": [
"Now we perform the negative mining step to produce (query, positive, negative) triplets. The advantage of Pinecone in this step is clearer for larger datasets (10K is small): Pinecone can scale to millions (and even billions) of vectors, which typical exhaustive search cannot handle."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9be2c6e3-87ae-4178-af1c-fdaba80dd723",
"metadata": {
"tags": [],
"id": "9be2c6e3-87ae-4178-af1c-fdaba80dd723",
"outputId": "7a5de109-6d50-4fd3-9003-4859878e97e8",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"100%|██████████| 300/300 [07:11<00:00, 1.44s/it]\n"
]
}
],
"source": [
"batch_size = 100\n",
"triplets = []\n",
"\n",
"for i in tqdm.tqdm(range(0, len(query_doc_pairs), batch_size)):\n",
"    # embed queries and query pinecone in batches to minimize network latency\n",
"    i_end = min(i+batch_size, len(query_doc_pairs))\n",
"    queries = [pair[0] for pair in query_doc_pairs[i:i_end]]\n",
"    pos_docs = [pair[1] for pair in query_doc_pairs[i:i_end]]\n",
"    query_embs = org_model.encode(queries, convert_to_tensor=True, show_progress_bar=False)\n",
"    res = index.query(query_embs.tolist(), top_k=10)\n",
"    # iterate through queries and find negatives\n",
"    for query, pos_doc, query_res in zip(queries, pos_docs, res['results']):\n",
"        top_results = query_res['matches']\n",
"        # shuffle so the negative is a random pick from the top-10 hits\n",
"        random.shuffle(top_results)\n",
"        for hit in top_results:\n",
"            neg_doc = corpus[int(hit['id'])]\n",
"            if neg_doc != pos_doc:\n",
"                triplets.append([query, pos_doc, neg_doc])\n",
"                break"
]
},
{
"cell_type": "markdown",
"id": "7e28cbba",
"metadata": {
"id": "7e28cbba"
},
"source": [
"# Score with CrossEncoder\n",
"\n",
"The query generator can produce low-quality queries. Likewise, the `negative` found for a given `query` in the negative mining step can actually be quite relevant.\n",
"\n",
"To overcome this issue, we use a Cross-Encoder that scores both `(query, positive)` and `(query, negative)`; the difference between the two scores (the margin) becomes the training label.\n",
"\n",
"Note that this cross-encoder was also trained on MS MARCO and has not seen any COVID-related content. However, because it processes the query and the document jointly, it can match them word by word and is more robust to new, unseen words."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ad4b870e",
"metadata": {
"id": "ad4b870e",
"outputId": "b338b770-7f0c-437c-a842-a28250ef8817",
"colab": {
"referenced_widgets": [
"9ea9eb2f98f34dfb8dcaca146a1f14c9",
"0d6d4805a0fd49408c5cf03c39fc5494",
"e8f320a090eb48f2a157b754c73d62d1",
"f3f6717a608f453d93a10e8e3bd4f2ae",
"31054d97bcdb4b6aa2a74802993d6fcc",
"a5b42d22be764343add0994b62e2620f",
"6f33a3fd1d73415cbdf0ea6bc7adc2e4",
"5530ffc1bc924279abe50e4b061d8b08",
"9355cab45d64487984169bbcf0cdccd3",
"325cfb2389dc4907a7c512dfbf654437",
"63c3f1dbc69d42d98c349de4c4a98c05",
"18ccc3837c4342a988422cc62f71e770",
"1ecaad134a04493d8fd6195cedef9cb1",
"8ddd819073f94c6b9ad5b3197751108b",
"ad4ccb08d7594d3b99d2348cc12815c0",
"6851b3db244f4ae6a102da7694d23762",
"b4e800394ba94d0793236de8d5984754",
"ea546c58dda34ec4b093cf39ef2e63c3",
"c05cdfa7853e43bcbba5462eac5c6616",
"dd43adf0d4584434a1ffd0870ea01341",
"421c2edfcbbd4348a82621d68a18645d",
"79021962fd0b480ba387453f0692fb9e",
"7300f1c4f30048f79c157459420fe6e4",
"acb7feb0489f4132a02ae942f092d7c6",
"aab93a2e083d4c2eb412a80bf8957965",
"c23445b8d3164547a2dd6ec436294339",
"15fe4d82ad3b4faea494b86dcba3afb7",
"ab76d26978284f8090dacc381af2e668",
"947f8842233a406495073e0123d3c180",
"af8a20f7814943fc90e6d491efa212fb",
"4c9edb11ffa4487cacc7410a9088700e",
"cd961a8afc3c4a5fb026ac208f586b6f",
"24678cf1eda141ceac89baf92bec28c1",
"81794093f9a24c08a4471701905c5fb1",
"7af92a3045d64060af43de9c2c66c199",
"230b5741499f4632a6c33aa7f4cf664f",
"eae3a4aa61114711aef231d7229350ba",
"42d9d84bf4734d5fb83007d5258cad54",
"6ddba4b3609445628e738ae5412b0aa0",
"18feb4474ec94794949405309913847a",
"6b9242987f05455e831c93d2839025f3",
"6e67a58e73844801a58aa737793f25f6",
"6238bd229e934dfd8b2878b051f47a9c",
"dfb0f6790ef64471959e38b0e03c0ac9",
"ac4d50089c93437e9ea827a0228d52b5",
"46d7a07918c3407fb403c9ad809c49ea",
"6db84a167eec49ecbafc45879b949cfe",
"c23707b00ad84ea1a890c7285d43a39d",
"50dfc54f3e594bae8add98dee47fb90c",
"316b190e6bbc405d873facf75238167d",
"0269956d5a0c4deaa6497871c9e3d5a0",
"0f37d536b9c64686a9352d56c3fc28aa",
"70598568b3c344b99d732fd55bed3c98",
"d22e71e42a7f4522b1cb8d7e4be173d7",
"90c113d42c7c4315a18ef62985c38757"
],
"base_uri": "https://localhost:8080/",
"height": 195
}
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/794 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "9ea9eb2f98f34dfb8dcaca146a1f14c9"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/86.7M [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "18ccc3837c4342a988422cc62f71e770"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/316 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "7300f1c4f30048f79c157459420fe6e4"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/226k [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "81794093f9a24c08a4471701905c5fb1"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Downloading: 0%| | 0.00/112 [00:00<?, ?B/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "ac4d50089c93437e9ea827a0228d52b5"
}
},
"metadata": {}
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"100%|██████████| 30000/30000 [15:03<00:00, 33.21it/s]\n"
]
}
],
"source": [
"cross_model = CrossEncoder(\"cross-encoder/ms-marco-MiniLM-L-6-v2\")\n",
"\n",
"train_examples = []\n",
"for query, pos, neg in tqdm.tqdm(triplets):\n",
"    # Compute CrossEncoder scores for the (query, pos) and (query, neg) pairs\n",
"    scores = cross_model.predict([[query, pos], [query, neg]])\n",
"\n",
"    # Compute the difference (margin) between the (query, pos) and (query, neg) scores\n",
"    score_margin = scores[0] - scores[1]\n",
"    train_examples.append(InputExample(texts=[query, pos, neg], label=score_margin))"
]
},
{
"cell_type": "markdown",
"id": "49d484e5",
"metadata": {
"id": "49d484e5"
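},
"source": [
"# Update the Bi-Encoder Model\n",
"\n",
"We update the bi-encoder model with the new triplets `(query, positive, negative)` using `MarginMSELoss`, which trains the bi-encoder so that its own score margin between the positive and the negative matches the Cross-Encoder margin."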
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c2d10c07",
"metadata": {
"id": "c2d10c07",
"outputId": "9bacd9ba-686a-4198-e41f-b1dbeadc5086",
"colab": {
"referenced_widgets": [
"5880972256e04a8f9278bd3e9b114ff4",
"de9d35de48964031a967c76bfa0d8320",
"90955d546e7f4c8aa241f1dbd9116f0e",
"ffcc53a8c913484fa8fd0420f56f16f3",
"aa5a27b8d06d45c8abd6cd4727b5a515",
"1890064acf7840bbb01477b93c99a7eb",
"441325ecb6694f6ea942dcc95b46dc95",
"0dc2f431c2de4a89ba24c903c6e463a1",
"468d67e9d2dd4b559ed8c244307294d0",
"437aedbdaf0749a6a1e2cc900afc4974",
"54fea299432441418c63dd0580f50960",
"3b89d0cb064343b7a911fdfc1acb965d",
"f46fe4d74c664703845f56649e740d3b",
"cc9ed121dd9e4f8ea1f200bf57310ebf",
"aff4ede8a3ef41588e5bfbab275aba02",
"2eb5437227404dc39e18cbd05b4d7d43",
"8360abbddd6c40d79a1bdbe6203ad820",
"0b8c52a9957245ce8b016cd5e97678d6",
"c30969ced015487a92721a5b8155c06c",
"e42c0867f17749b59cdd26aaa2cd462f",
"c59fb2986c21441788d4c0ece18526be",
"de1397cdfe4045c8ac7773e9326b45be"
],
"base_uri": "https://localhost:8080/",
"height": 138
}
},
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": [
"/usr/local/lib/python3.7/dist-packages/transformers/optimization.py:309: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n",
" FutureWarning,\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Epoch: 0%| | 0/1 [00:00<?, ?it/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "5880972256e04a8f9278bd3e9b114ff4"
}
},
"metadata": {}
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Iteration: 0%| | 0/1875 [00:00<?, ?it/s]"
],
"application/vnd.jupyter.widget-view+json": {
"version_major": 2,
"version_minor": 0,
"model_id": "3b89d0cb064343b7a911fdfc1acb965d"
}
},
"metadata": {}
}
],
"source": [
"model = SentenceTransformer(model_name)\n",
"model.max_seq_length = max_seq_length\n",
"train_dataloader = DataLoader(train_examples, batch_size=16, drop_last=True, shuffle=True)\n",
"train_loss = losses.MarginMSELoss(model)\n",
"\n",
"# Tune the model\n",
"model.fit(train_objectives=[(train_dataloader, train_loss)],\n",
"          epochs=1,\n",
"          warmup_steps=int(len(train_dataloader)*0.1))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4c1c22ce",
"metadata": {
"id": "4c1c22ce",
"outputId": "ba4b4fd9-70d5-428c-caba-9c6cd79dd632",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Original Model\n",
"Query: How is COVID-19 transmitted\n",
"94.84\tEbola is transmitted via direct contact with blood\n",
"92.87\tHIV is transmitted via sex or sharing needles\n",
"92.31\tCorona is transmitted via the air\n",
"91.54\tPolio is transmitted via contaminated water or food\n",
"\n",
"\n",
"Adapted Model\n",
"Query: How is COVID-19 transmitted\n",
"98.01\tCorona is transmitted via the air\n",
"96.80\tEbola is transmitted via direct contact with blood\n",
"94.42\tHIV is transmitted via sex or sharing needles\n",
"94.29\tPolio is transmitted via contaminated water or food\n"
]
}
],
"source": [
"print(\"Original Model\")\n",
"show_examples(org_model)\n",
"\n",
"print(\"\\n\\nAdapted Model\")\n",
"show_examples(model)"
]
}
],
"metadata": {
"environment": {
"kernel": "python3",
"name": "common-cu110.m91",
"type": "gcloud",
"uri": "gcr.io/deeplearning-platform-release/base-cu110:m91"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.12"
},
"colab": {
"name": "GPL-Domain-Adaptation.ipynb",
"provenance": [],
"include_colab_link": true
},
"accelerator": "GPU",
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"1a1a54fba1c64fa08cb498c3d44ffb7d": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_fea370531ccc4504a1a3373706378574",
"IPY_MODEL_56f9c71edb034650b30b202e8ffbc48f",
"IPY_MODEL_6c2ce6f3fff34e24ad6cb3a8f9ecb68e"
],
"layout": "IPY_MODEL_5be7da4ebe324e5e80a5b5df121b1357"
}
},
"fea370531ccc4504a1a3373706378574": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_5ae2de1deabd45c484cbbab727219a46",
"placeholder": "​",
"style": "IPY_MODEL_528b358f0fc3460f831b432caff7ce1a",
"value": "Downloading: 100%"
}
},
"56f9c71edb034650b30b202e8ffbc48f": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_7b5d43e7ba9c41d699e6c58743922ce1",
"max": 690,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_72bb3536051243249970fcce2e97576a",
"value": 690
}
},
"6c2ce6f3fff34e24ad6cb3a8f9ecb68e": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_cc3afdb243b5446fa3ce3a72e08ddfa3",
"placeholder": "​",
"style": "IPY_MODEL_3f0375a254c045568a85401c50b4ba64",
"value": " 690/690 [00:00&lt;00:00, 16.1kB/s]"
}
},
"5be7da4ebe324e5e80a5b5df121b1357": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"5ae2de1deabd45c484cbbab727219a46": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"528b358f0fc3460f831b432caff7ce1a": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"7b5d43e7ba9c41d699e6c58743922ce1": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"72bb3536051243249970fcce2e97576a": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"cc3afdb243b5446fa3ce3a72e08ddfa3": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"3f0375a254c045568a85401c50b4ba64": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"61766fdf2d624a82816dfd11bad86d13": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_1b0628147ac04954b2810333e9a44784",
"IPY_MODEL_a7b04b4d755940eeaaf9166337093b8c",
"IPY_MODEL_a40d0c68fe4b43ef84467511d2eccfe7"
],
"layout": "IPY_MODEL_b0fe621a97f3449180d2c6d8c22dc9de"
}
},
"1b0628147ac04954b2810333e9a44784": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_e99ad1a1f5b541809d9a2f19ff10846d",
"placeholder": "​",
"style": "IPY_MODEL_71d86604eb4742dcb4966df42be7801f",
"value": "Downloading: 100%"
}
},
"a7b04b4d755940eeaaf9166337093b8c": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_0805b3d4285142e08ae82f09944665bf",
"max": 3952,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_6d2d529c1ac54378b944ae4eaee56fcd",
"value": 3952
}
},
"a40d0c68fe4b43ef84467511d2eccfe7": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_320bd409bf9f4e0fb0580e41d423c4d9",
"placeholder": "​",
"style": "IPY_MODEL_af1fdc87f14849eeb99fba2c38dff580",
"value": " 3.95k/3.95k [00:00&lt;00:00, 102kB/s]"
}
},
"b0fe621a97f3449180d2c6d8c22dc9de": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"e99ad1a1f5b541809d9a2f19ff10846d": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"71d86604eb4742dcb4966df42be7801f": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"0805b3d4285142e08ae82f09944665bf": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"6d2d529c1ac54378b944ae4eaee56fcd": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"320bd409bf9f4e0fb0580e41d423c4d9": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"af1fdc87f14849eeb99fba2c38dff580": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"10043932feb34be8a9a0cd95a3d495d6": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_07ede0ede2a740a1a1fb32c4076af0c7",
"IPY_MODEL_af2f0329158e4dae9278c664cc76696a",
"IPY_MODEL_c85071546c0740919c3b0b4fe2abb7ac"
],
"layout": "IPY_MODEL_c5e9f24cffa4480b8d05498247f6fb93"
}
},
"07ede0ede2a740a1a1fb32c4076af0c7": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_59997806751f484db9983ef4e7ac9319",
"placeholder": "​",
"style": "IPY_MODEL_f51501d34758419191d63411bce2fd19",
"value": "Downloading: 100%"
}
},
"af2f0329158e4dae9278c664cc76696a": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_f3cf53a6b6f54a32ac9e83b73b962fdc",
"max": 548,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_d5b0f72025a64fb8a7dd689be300ddff",
"value": 548
}
},
"c85071546c0740919c3b0b4fe2abb7ac": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_6e401197dee241659be86dfa4b0cbde6",
"placeholder": "​",
"style": "IPY_MODEL_1dd5eefb3dd24d3c8a308c05ee65750d",
"value": " 548/548 [00:00&lt;00:00, 15.3kB/s]"
}
},
"c5e9f24cffa4480b8d05498247f6fb93": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"59997806751f484db9983ef4e7ac9319": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"f51501d34758419191d63411bce2fd19": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"f3cf53a6b6f54a32ac9e83b73b962fdc": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"d5b0f72025a64fb8a7dd689be300ddff": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"6e401197dee241659be86dfa4b0cbde6": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"1dd5eefb3dd24d3c8a308c05ee65750d": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"0ad0b526378349d8a8ba8b794df9b400": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HBoxModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HBoxModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HBoxView",
"box_style": "",
"children": [
"IPY_MODEL_e68586c3381146b69f80a6450403a8b6",
"IPY_MODEL_36f4464361364f3b9fc760bc8bb03293",
"IPY_MODEL_f1f871fc9e394bdf979481c707fe6f6e"
],
"layout": "IPY_MODEL_81bec02598e446ed8f298fd50ad52ec8"
}
},
"e68586c3381146b69f80a6450403a8b6": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_cc4e70c8cea249f08b462052315b3510",
"placeholder": "​",
"style": "IPY_MODEL_7a56d1c1ccd2488ca9f0d2d43b540c85",
"value": "Downloading: 100%"
}
},
"36f4464361364f3b9fc760bc8bb03293": {
"model_module": "@jupyter-widgets/controls",
"model_name": "FloatProgressModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "FloatProgressModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "ProgressView",
"bar_style": "success",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_292216407e504481a2ebc38f86ed2c57",
"max": 122,
"min": 0,
"orientation": "horizontal",
"style": "IPY_MODEL_78578f6bb02a45d1b7cbdccff4a868ff",
"value": 122
}
},
"f1f871fc9e394bdf979481c707fe6f6e": {
"model_module": "@jupyter-widgets/controls",
"model_name": "HTMLModel",
"model_module_version": "1.5.0",
"state": {
"_dom_classes": [],
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "HTMLModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/controls",
"_view_module_version": "1.5.0",
"_view_name": "HTMLView",
"description": "",
"description_tooltip": null,
"layout": "IPY_MODEL_b29e0d98df1441d4ae513ec704e7614d",
"placeholder": "​",
"style": "IPY_MODEL_27599b5652c246fb82fde622d93b75f3",
"value": " 122/122 [00:00&lt;00:00, 1.48kB/s]"
}
},
"81bec02598e446ed8f298fd50ad52ec8": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"cc4e70c8cea249f08b462052315b3510": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"7a56d1c1ccd2488ca9f0d2d43b540c85": {
"model_module": "@jupyter-widgets/controls",
"model_name": "DescriptionStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "DescriptionStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"description_width": ""
}
},
"292216407e504481a2ebc38f86ed2c57": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"overflow_x": null,
"overflow_y": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": null
}
},
"78578f6bb02a45d1b7cbdccff4a868ff": {
"model_module": "@jupyter-widgets/controls",
"model_name": "ProgressStyleModel",
"model_module_version": "1.5.0",
"state": {
"_model_module": "@jupyter-widgets/controls",
"_model_module_version": "1.5.0",
"_model_name": "ProgressStyleModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "StyleView",
"bar_color": null,
"description_width": ""
}
},
"b29e0d98df1441d4ae513ec704e7614d": {
"model_module": "@jupyter-widgets/base",
"model_name": "LayoutModel",
"model_module_version": "1.2.0",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "1.2.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "1.2.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,