{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "view-in-github"
   },
   "source": [
    "<a href=\"https://colab.research.google.com/github/sheikmohdimran/Experiments_2020/blob/master/NLP/SWT_fastai.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
ARCHITECTURE
- ADMIN initialisation
- {{[[TODO]]}} Deeper encoder, shallower decoder
- {{[[TODO]]}} Mish
- DONE? {{[[TODO]]}} Test impact of embedding tying (would need a shared vocab)
- {{[[TODO]]}} Use [[PreLayerNorm]]
- Try #ELU and #[[Shifted RELU]]
- Try the [[EDITOR]] transformer: https://jlibovicky.github.io/2020/12/12/MT-Weekly-Editor.html
- Gradient Adaptive Clipping
- Snake activation: https://twitter.com/EdwardDixon3/status/1360211045491617792?s=20
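One item on the list, Mish, is the activation x · tanh(softplus(x)); in a real model you would use `torch.nn.Mish`, but a minimal pure-Python sketch shows the math:

```python
import math

def softplus(x: float) -> float:
    # Numerically stable softplus: log(1 + e^x).
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def mish(x: float) -> float:
    # Mish(x) = x * tanh(softplus(x)); smooth, non-monotonic,
    # and bounded below, unlike ReLU.
    return x * math.tanh(softplus(x))
```

Like the Snake activation linked above, Mish keeps a small gradient for negative inputs instead of zeroing it out as ReLU does.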
https://colab.research.google.com/drive/1D6krVG0PPJR2Je9g5eN_2h6JP73_NUXz
If you want to use this Google Colab notebook to fine-tune your model, make sure
your training doesn't stop due to inactivity. A simple hack to prevent this is to paste the
following code into the browser console for this tab (right-click -> Inspect -> Console tab, then insert the code):
```
function ConnectButton(){
  console.log("Connect pushed");
  document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
}
setInterval(ConnectButton, 60000);
```
2021-04-06 10:29:46,421 INFO MainThread:30933 [internal.py:wandb_internal():88] W&B internal server running at pid: 30933, started at: 2021-04-06 10:29:46.421160
2021-04-06 10:29:46,423 DEBUG SenderThread:30933 [sender.py:send():160] send: header
2021-04-06 10:29:46,423 DEBUG HandlerThread:30933 [handler.py:handle_request():120] handle_request: check_version
2021-04-06 10:29:46,423 INFO WriterThread:30933 [datastore.py:open_for_write():77] open: /home/morgan/ml/projects/xlsr_finetune/notebooks/wandb/run-20210406_102945-d1isczie/run-d1isczie.wandb
2021-04-06 10:29:46,424 DEBUG SenderThread:30933 [sender.py:send():160] send: request
2021-04-06 10:29:46,424 DEBUG SenderThread:30933 [sender.py:send_request():169] send_request: check_version
2021-04-06 10:29:46,501 DEBUG SenderThread:30933 [sender.py:send():160] send: run
2021-04-06 10:29:46,718 INFO SenderThread:30933 [sender.py:_start_run_threads():651] run started: d1isczie with start time 1617701385
2021-04-06 10:29:46,718 DEBUG SenderThre
# This will download the config file and corpora needed for the spaCy GoEmotions tutorial:
# https://github.com/explosion/projects/blob/v3/tutorials/textcat_goemotions
import os  # spacy_dir (a pathlib.Path) is assumed to be defined earlier in the notebook

# Get CNN config
os.makedirs(os.path.join(spacy_dir/'training', 'cnn'), exist_ok=True)
cnn_cfg_url = "https://raw.githubusercontent.com/explosion/projects/v3/tutorials/textcat_goemotions/configs/cnn.cfg"
cnn_cfg = spacy_dir/'cnn.cfg'
!wget -q -O $cnn_cfg $cnn_cfg_url
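The `!wget` line only works inside a notebook; a portable pure-Python equivalent can be sketched with the standard library (the `fetch_config` name is hypothetical, not part of the tutorial):

```python
import urllib.request
from pathlib import Path

def fetch_config(url: str, dest) -> Path:
    # Create the destination's parent directory, then download
    # the file to it -- the same effect as `wget -q -O dest url`.
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, dest)
    return dest
```

For example, `fetch_config(cnn_cfg_url, spacy_dir / 'cnn.cfg')` would replace the shell escape above while working in any Python script.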
/srv/conda/envs/saturn/lib/python3.7/site-packages/torch/nn/functional.py:1204: UserWarning: Output 0 of BackwardHookFunctionBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is deprecated and will be forbidden starting version 1.6. You can remove this warning by cloning the output of the custom Function. (Triggered internally at /pytorch/torch/csrc/autograd/variable.cpp:547.)
  result = torch.relu_(input)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-7-8393222d813a> in <module>
----> 1 simple_train_single(**model_params)
<ipython-input-6-29dd15a1ccdf> in simple_train_single(bucket, prefix, batch_size, downsample_to, n_epochs, base_lr, pretr