fastai/courses/dl2/Testing wt03 on Giro.ipynb
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "# Testing wt103 on Giro d'Italia\nThis notebook loads fastai's pre-trained WikiText-103 language model, which is based on Stephen Merity's AWD-LSTM paper, and feeds it some priming texts to see what it generates. The model can be found at https://www.kaggle.com/pamin2222/fastai-wt103"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Load the model"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Load the fastai library and set up the paths to the model files."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "from fastai.text import *\nimport html",
"execution_count": 1,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "PATH=Path('data/aclImdb/')\nPRE_PATH = PATH/'models'/'wt103'\nPRE_LM_PATH = PRE_PATH/'fwd_wt103.h5'",
"execution_count": 2,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "All string tokens were indexed for the model, so load the itos2 list, which maps each token index to its string, and create a reverse dictionary stoi2 that maps each string to its index (unknown strings map to -1). The model dimensions are:\n- n_tok is the vocabulary size\n- em_sz is the size of the embedding for each word\n- nh is the hidden layer size\n- nl is the number of layers"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "itos2 = pickle.load((PRE_PATH/'itos_wt103.pkl').open('rb'))\nstoi2 = collections.defaultdict(lambda:-1, {v:k for k,v in enumerate(itos2)})\n\nn_tok = len(itos2)",
"execution_count": 3,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "em_sz,nh,nl = 400,1150,3",
"execution_count": 4,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "markdown",
"source": "Note that the padding index is 1 (`_pad_`) and index 0 is the unknown token (`_unk_`)."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "itos2[:5]",
"execution_count": 5,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 5,
"data": {
"text/plain": "['_unk_', '_pad_', 'the', ',', '.']"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Use fastai to get a language model. Load the pre-trained weights and put them into the model. Since we are not doing any training, put the model into evaluation mode."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "m = get_language_model(n_tok, emb_sz=em_sz, nhid=nh, nlayers=nl, pad_token=1,\n dropout=0.1, dropouth=0.15, dropouti=0.25, dropoute=0.02, wdrop=0.2, tie_weights=True)",
"execution_count": 6,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "wgts = torch.load(PRE_LM_PATH, map_location=lambda storage, loc: storage)",
"execution_count": 7,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "m.load_state_dict(wgts)",
"execution_count": 8,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "m.eval()",
"execution_count": 9,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 9,
"data": {
"text/plain": "SequentialRNN(\n (0): RNN_Encoder(\n (encoder): Embedding(238462, 400, padding_idx=1)\n (encoder_with_dropout): EmbeddingDropout(\n (embed): Embedding(238462, 400, padding_idx=1)\n )\n (rnns): ModuleList(\n (0): WeightDrop(\n (module): LSTM(400, 1150, dropout=0.15)\n )\n (1): WeightDrop(\n (module): LSTM(1150, 1150, dropout=0.15)\n )\n (2): WeightDrop(\n (module): LSTM(1150, 400, dropout=0.15)\n )\n )\n (dropouti): LockedDropout(\n )\n (dropouths): ModuleList(\n (0): LockedDropout(\n )\n (1): LockedDropout(\n )\n (2): LockedDropout(\n )\n )\n )\n (1): LinearDecoder(\n (decoder): Linear(in_features=400, out_features=238462, bias=False)\n (dropout): LockedDropout(\n )\n )\n)"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Out of interest, count the parameters in the model: 115,596,400. Most of the parameters are in the embedding layer of the encoder, which is 238,462 x 400 = 95,384,800. This matrix is shared with the decoder."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "def get_n_params(model):\n    pp=0\n    for p in list(model.parameters()):\n        nn=1\n        for s in list(p.size()):\n            nn = nn*s\n        pp += nn\n    return pp",
"execution_count": 10,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "get_n_params(m)",
"execution_count": 11,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 11,
"data": {
"text/plain": "115596400"
},
"metadata": {}
}
]
},
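{
"metadata": {},
"cell_type": "markdown",
"source": "As a cross-check, the total can be rebuilt by hand from the layer shapes printed by `m.eval()`. This is only a sketch: it assumes each LSTM layer stores four gates, each with an input-to-hidden matrix, a hidden-to-hidden matrix and two bias vectors (the layout PyTorch's `nn.LSTM` uses), and that the decoder adds no parameters because its weights are tied to the embedding. `lstm_params` is a helper introduced here for illustration; it is not part of fastai. Under those assumptions the three LSTM layers come to 20,211,600 parameters, which together with the embedding gives the 115,596,400 reported above."
},
{
"metadata": {},
"cell_type": "code",
"source": "def lstm_params(n_in, n_hid):\n    # 4 gates, each with W_ih (n_hid x n_in), W_hh (n_hid x n_hid) and two bias vectors of size n_hid\n    return 4 * (n_hid * n_in + n_hid * n_hid + 2 * n_hid)\n\nemb = n_tok * em_sz  # embedding matrix, shared with the decoder\nrnn = lstm_params(em_sz, nh) + lstm_params(nh, nh) + lstm_params(nh, em_sz)\nemb, rnn, emb + rnn",
"execution_count": null,
"outputs": []
},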
{
"metadata": {},
"cell_type": "markdown",
"source": "## Initial test\n- Take a sentence, tokenize it and run it through the model (as a batch size of 1)\n- Take a look at the top five suggested continuation words\n- out[0] is a tensor containing n x n_tok values, where n is the number of input tokens and n_tok is the vocab size"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "inp = \"The Giro d ' Italia is a cycling race that\"\nidxs = np.array([stoi2[w] for w in Tokenizer().spacy_tok(inp.lower()) if stoi2[w.lower()]>0])\nout = m(V(T(idxs[None,:])))\nout[0].shape",
"execution_count": 12,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 12,
"data": {
"text/plain": "torch.Size([10, 238462])"
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "[ itos2[i] for i in to_np(out[0][-1]).argsort()[-5:][::-1]]",
"execution_count": 13,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 13,
"data": {
"text/plain": "['the', 'it', 'he', 'was', 'they']"
},
"metadata": {}
}
]
},
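{
"metadata": {},
"cell_type": "markdown",
"source": "The ranking above hides how confident the model is. A minimal sketch for turning the last row of scores into probabilities, assuming `out[0][-1]` holds unnormalised logits (the decoder printed earlier is a plain linear layer). `top_k_probs` is a hypothetical helper written for this notebook, not a fastai function."
},
{
"metadata": {},
"cell_type": "code",
"source": "def top_k_probs(logits, itos, k=5):\n    # numerically stable softmax over the whole vocabulary\n    e = np.exp(logits - logits.max())\n    probs = e / e.sum()\n    # indices of the k largest probabilities, highest first\n    top = probs.argsort()[-k:][::-1]\n    return [(itos[i], float(probs[i])) for i in top]\n\ntop_k_probs(to_np(out[0][-1]), itos2)",
"execution_count": null,
"outputs": []
},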
{
"metadata": {},
"cell_type": "markdown",
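"source": "## Run some tests\nCreate a helper function to predict the next 100 words, based on a priming input. At each step it greedily picks the most likely token, falling back to the second choice when the top one is `_unk_`."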
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
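"source": "def predictText(inp):\n    # tokenize the prompt, keeping only tokens found in the vocabulary (this also drops _unk_)\n    idxs = np.array([stoi2[w] for w in Tokenizer().spacy_tok(inp.lower()) if stoi2[w.lower()]>0])\n\n    # reset the hidden state, then prime the model with the prompt (shape: seq_len x 1)\n    m.reset()\n    res,*_ = m(V(idxs[:,None]))\n\n    print(inp,\"\\n\")\n    for i in range(100):\n        # take the top two candidates and use the second if the best is _unk_ (index 0)\n        n=res[-1].topk(2)[1]\n        n = n[1] if n.data[0]==0 else n[0]\n        print(itos2[n.data[0]], end=' ')\n        # feed the chosen token back in to get the next prediction\n        res,*_ = m(n[0].unsqueeze(0))\n    print('...')",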
"execution_count": 14,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "inp = \"The Giro d ' Italia is a cycling race that\"\npredictText(inp)",
"execution_count": 15,
"outputs": [
{
"output_type": "stream",
"text": "The Giro d ' Italia is a cycling race that \n\ntakes place in the summer . the race is a part of the national cycling race , which is a race of the same name . the race is a part of the race , and is the first race in the history of the race . the race was won by the reigning world champion , the reigning world champion , who won the race by a margin of two lengths . \n \n = = = race = = = \n \n the race was won by the reigning world champion , who had won the race in the previous ...\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "inp = \"Most varieties of bean grow either as an erect bush or as a climbing plant, but a few important kinds are of intermediate form. Dwarf and semiclimbers are grown extensively. When the climbing type is grown for its immature pods, artificial supports are necessary to facilitate harvesting. Varieties differ greatly in size, shape, colour, and fibrousness or tenderness of the immature pods. In general, varieties grown for dry mature seeds produce pods that are too fibrous to be eaten at any state of development. Most edible-podded beans produce relatively low yields of mature seeds, or seeds that are of low eating quality. Seed colours range from white through green, yellow, tan, pink, red, brown, and purple to black in solid colours and countless contrasting patterns. Seed shapes range from nearly spherical to flattened, elongated, and kidney-shaped. Pods are of various shades of green, yellow, red, and purple and splashed with red or purple; pod shapes range from flat to round, smooth to irregular, and straight to sharply curved; length ranges from 75 to 200 millimetres (3 to 8 inches) or more.\"\npredictText(inp)",
"execution_count": 16,
"outputs": [
{
"output_type": "stream",
"text": "Most varieties of bean grow either as an erect bush or as a climbing plant, but a few important kinds are of intermediate form. Dwarf and semiclimbers are grown extensively. When the climbing type is grown for its immature pods, artificial supports are necessary to facilitate harvesting. Varieties differ greatly in size, shape, colour, and fibrousness or tenderness of the immature pods. In general, varieties grown for dry mature seeds produce pods that are too fibrous to be eaten at any state of development. Most edible-podded beans produce relatively low yields of mature seeds, or seeds that are of low eating quality. Seed colours range from white through green, yellow, tan, pink, red, brown, and purple to black in solid colours and countless contrasting patterns. Seed shapes range from nearly spherical to flattened, elongated, and kidney-shaped. Pods are of various shades of green, yellow, red, and purple and splashed with red or purple; pod shapes range from flat to round, smooth to irregular, and straight to sharply curved; length ranges from 75 to 200 millimetres (3 to 8 inches) or more. \n\n\n the most common form of the flower is the flower spike , which is a characteristic of the flower spike . the flower is made up of a series of small , cylindrical , cylindrical , cylindrical , cylindrical stems that are up to 10 cm ( 3.9 in ) long and 2.5 cm ( 0.98 in ) wide . the flower spikes are arranged in a series of three or four pairs of flowers , and are arranged in a pattern of three pairs of flowers . the flowers are arranged in a series of three pairs of ...\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "inp = \"Hurricane Ivan was a large long Cape Verde hurricane that caused widespread damage in the Caribbean and United States The cyclone was\" \npredictText(inp)",
"execution_count": 17,
"outputs": [
{
"output_type": "stream",
"text": "Hurricane Ivan was a large long Cape Verde hurricane that caused widespread damage in the Caribbean and United States The cyclone was \n\nthe first hurricane to strike the united states since hurricane katrina in 2005 . \n the hurricane was the first hurricane to strike the united states since hurricane katrina in 2005 . the hurricane was the first hurricane to strike the united states since hurricane katrina in 2005 . the hurricane was the first hurricane to strike the united states since hurricane katrina in 2005 . the hurricane was the first hurricane to strike the united states since hurricane katrina in 2005 . the hurricane was the first hurricane to strike the united states since hurricane katrina in 2005 . ...\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "inp = \"Tour de France, the world’s most prestigious and most difficult bicycle race. Of the three foremost races (the others being the Giro d’Italia and the Vuelta a España), the Tour de France attracts the world’s best riders. Staged for three weeks each July—usually in some 20 daylong stages—the Tour typically comprises 20 professional teams of 9 riders each and covers some 3,600 km (2,235 miles), mainly in France, with occasional and brief visits to such countries as Belgium, Italy, Germany, and Spain. Although the race may start outside France—as was the case in 2007, when England hosted the opening stage for the first time—it always heads there quickly; the Tour is France’s premier annual sporting event and has deep cultural roots. It is watched by huge crowds from the roadside and is televised around the world as one of the supreme tests of athletic endurance. Part of the difficulty cyclists face in the Tour is that it is divided among time-trial racing and racing stages covering both flat land and great stretches of mountainous inclines. It is a rare cyclist who can perform well at both time trials and climbing, and those who can usually wear the yellow jersey (maillot jaune) of victory at the end of the race in Paris.\"\npredictText(inp)",
"execution_count": 18,
"outputs": [
{
"output_type": "stream",
"text": "Tour de France, the world’s most prestigious and most difficult bicycle race. Of the three foremost races (the others being the Giro d’Italia and the Vuelta a España), the Tour de France attracts the world’s best riders. Staged for three weeks each July—usually in some 20 daylong stages—the Tour typically comprises 20 professional teams of 9 riders each and covers some 3,600 km (2,235 miles), mainly in France, with occasional and brief visits to such countries as Belgium, Italy, Germany, and Spain. Although the race may start outside France—as was the case in 2007, when England hosted the opening stage for the first time—it always heads there quickly; the Tour is France’s premier annual sporting event and has deep cultural roots. It is watched by huge crowds from the roadside and is televised around the world as one of the supreme tests of athletic endurance. Part of the difficulty cyclists face in the Tour is that it is divided among time-trial racing and racing stages covering both flat land and great stretches of mountainous inclines. It is a rare cyclist who can perform well at both time trials and climbing, and those who can usually wear the yellow jersey (maillot jaune) of victory at the end of the race in Paris. \n\n\n the race was won by the french rider , jean - baptiste u_n , who won the race in the first stage . the race was won by french rider jean - baptiste u_n , who won the race by a margin of two seconds . \n \n = = = tour de france = = = \n \n the tour de france was the first race to be held in france . the race was won by the french rider , jean - baptiste u_n , who won the race by a margin of two seconds over the winner . ...\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "inp = \"The Giro d' Italia (Italian pronunciation: [ˈdʒiːro diˈtaːlja]; English: Tour of Italy;\\\nalso known as the Giro) is an annual multiple-stage bicycle race primarily held in Italy, while\\\nalso occasionally passing through nearby countries. The first race was organized in 1909 to increase\\\nsales of the newspaper La Gazzetta dello Sport; however it is currently run by RCS Sport. The race has\\\nbeen held annually since its first edition in 1909, except when it was stopped for the two world wars. \\\nAs the Giro gained prominence and popularity the race was lengthened, and the peloton expanded from primarily \\\nItalian participation to riders from all over the world.\"\npredictText(inp)",
"execution_count": 19,
"outputs": [
{
"output_type": "stream",
"text": "The Giro d' Italia (Italian pronunciation: [ˈdʒiːro diˈtaːlja]; English: Tour of Italy;also known as the Giro) is an annual multiple-stage bicycle race primarily held in Italy, whilealso occasionally passing through nearby countries. The first race was organized in 1909 to increasesales of the newspaper La Gazzetta dello Sport; however it is currently run by RCS Sport. The race hasbeen held annually since its first edition in 1909, except when it was stopped for the two world wars. As the Giro gained prominence and popularity the race was lengthened, and the peloton expanded from primarily Italian participation to riders from all over the world. \n\n\n the race was won by the italian rider , giovanni di u_n , who won the race in the first leg of the race . the race was won by italian rider giovanni u_n , who won the race by a margin of two lengths . \n \n = = = world tour = = = \n \n the tour de france was the first of the tour de france . the tour de france was won by the reigning world champion , the reigning world champion , who had won the tour de france in the previous year 's race ...\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "inp = \"The book was widely acclaimed by the critics.\"\npredictText(inp)",
"execution_count": 20,
"outputs": [
{
"output_type": "stream",
"text": "The book was widely acclaimed by the critics. \n\nthe book was published in the united states by the american library association on june 1 , 2006 . \n \n = = = critical response = = = \n \n the book received mixed reviews from critics . the new york times reviewer robert christgau called it \" a great book \" and \" a great book \" . he praised the book 's \" strong , strong , and well - written \" prose , and the \" excellent \" prose . he also praised the book 's \" strong , strong , and well - written \" prose . ...\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"name": "conda-env-fastai-py",
"display_name": "Python [conda env:fastai]",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.6.4",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"gist": {
"id": "",
"data": {
"description": "fastai/courses/dl2/Testing wt03 on Giro.ipynb",
"public": true
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}