import json
import datasets
import transformers
from datasets import ClassLabel, load_dataset
from huggingface_hub import (
    HfFolder,
    ModelFilter,
    hf_hub_download,
    list_models,
def get_grouped_params(model, no_decay=("bias", "LayerNorm.weight")):
    """Split model parameters into groups with and without weight decay."""
    params_with_wd, params_without_wd = [], []
    for n, p in model.named_parameters():
        # Parameters whose name contains a no_decay marker skip weight decay.
        if any(nd in n for nd in no_decay):
            params_without_wd.append(p)
        else:
            params_with_wd.append(p)
    # `args` is not defined in this snippet; it must exist in the enclosing scope.
    return [{'params': params_with_wd, 'weight_decay': args.weight_decay},
            {'params': params_without_wd, 'weight_decay': 0.0}]
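To see how the substring match on parameter names decides the grouping, here is a minimal sketch that needs no deep learning framework. The name `group_params`, the parameter names, and the string placeholders standing in for tensors are all hypothetical, and `weight_decay` is passed explicitly here rather than read from `args`.

```python
def group_params(named_params, weight_decay, no_decay=("bias", "LayerNorm.weight")):
    """Split (name, param) pairs into decayed and non-decayed groups."""
    with_wd, without_wd = [], []
    for name, param in named_params:
        # A parameter skips weight decay if any marker is a substring of its name.
        if any(marker in name for marker in no_decay):
            without_wd.append(param)
        else:
            with_wd.append(param)
    return [
        {"params": with_wd, "weight_decay": weight_decay},
        {"params": without_wd, "weight_decay": 0.0},
    ]


# Hypothetical parameter names; the string values stand in for tensors.
fake_named_params = [
    ("encoder.layer.0.attention.weight", "w_attn"),
    ("encoder.layer.0.attention.bias", "b_attn"),
    ("encoder.layer.0.LayerNorm.weight", "w_ln"),
    ("encoder.layer.0.LayerNorm.bias", "b_ln"),
]

groups = group_params(fake_named_params, weight_decay=0.1)
print(groups[0]["params"])  # parameters that receive weight decay
print(groups[1]["params"])  # biases and LayerNorm weights, no decay
```

With a real model, a list in this shape can be passed to an optimizer such as `torch.optim.AdamW`, which accepts per-group `weight_decay` values.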
from tqdm import tqdm
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"


def chunks(list_of_elements, batch_size):
    """Yield successive batch-sized chunks from list_of_elements."""
    for i in range(0, len(list_of_elements), batch_size):
        yield list_of_elements[i : i + batch_size]
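As a quick illustration of how `chunks` behaves when the list length is not a multiple of `batch_size`, the sketch below repeats the generator so it runs on its own and feeds it a hypothetical list of seven strings. The last batch is simply shorter than the others.

```python
def chunks(list_of_elements, batch_size):
    """Yield successive batch-sized chunks from list_of_elements."""
    for i in range(0, len(list_of_elements), batch_size):
        yield list_of_elements[i : i + batch_size]


# Hypothetical inputs, standing in for texts to be processed in batches.
sentences = [f"sentence {i}" for i in range(7)]

batches = list(chunks(sentences, batch_size=3))
for batch in batches:
    print(len(batch), batch)
```

Because `chunks` is a generator, batches are produced lazily, so it can be wrapped directly in `tqdm(...)` if a `total` is supplied.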
# `dfs` is assumed to map split names to pandas DataFrames with a `question` column.
for question_type in ["How", "What", "Is"]:
    for question in (
        dfs["train"][dfs["train"].question.str.startswith(question_type)]
        .sample(n=3, random_state=42)['question']):
        print(question)
from datasets import load_dataset


def validate_datasets(reference_dataset, new_dataset):
    """Validate the column names and rows of the new dataset"""
    splits = list(reference_dataset.keys())
    for split in splits:
        ref_dset = reference_dataset[split]
        new_dset = new_dataset[split]
        # Check column names agree
        ref_cols = set(ref_dset.column_names)
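The kind of per-split comparison started above can be sketched independently of the `datasets` library. The following is an illustration under stated assumptions, not the original function body: `find_mismatches` and `fake_split` are hypothetical names, and the stand-in objects only mimic the `column_names` and `num_rows` attributes that a `datasets.Dataset` exposes.

```python
from types import SimpleNamespace


def find_mismatches(reference_dataset, new_dataset):
    """Return human-readable differences between two split-to-dataset mappings."""
    problems = []
    for split, ref_dset in reference_dataset.items():
        if split not in new_dataset:
            problems.append(f"{split}: missing split")
            continue
        new_dset = new_dataset[split]
        # Compare column names as sets so that ordering does not matter.
        ref_cols = set(ref_dset.column_names)
        new_cols = set(new_dset.column_names)
        if ref_cols != new_cols:
            problems.append(f"{split}: columns differ by {sorted(ref_cols ^ new_cols)}")
        # Compare row counts.
        if ref_dset.num_rows != new_dset.num_rows:
            problems.append(f"{split}: {ref_dset.num_rows} rows vs {new_dset.num_rows}")
    return problems


def fake_split(columns, rows):
    """Hypothetical stand-in exposing only the attributes used above."""
    return SimpleNamespace(column_names=columns, num_rows=rows)


reference = {
    "train": fake_split(["question", "context"], 100),
    "test": fake_split(["question", "context"], 20),
}
new = {
    "train": fake_split(["question", "context", "id"], 100),
    "test": fake_split(["question", "context"], 19),
}

problems = find_mismatches(reference, new)
for problem in problems:
    print(problem)
```

Collecting differences into a list, rather than raising on the first one, is a design choice made for this sketch so that all mismatches are reported in one pass.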
{"data": [{"title": "B00001WRSJ", "paragraphs": [{"qas": [{"question": "What is the tonal balance of these headphones?", "id": "d0781d13200014aa25860e44da9d5ea7", "answers": [{"text": "I have been a headphone fanatic for thirty years", "answer_start": 0}], "is_impossible": false}], "context": "I have been a headphone fanatic for thirty years and have owned and used a variety of headphones over those years, to include Stax SR-5, Sennheiser HD-424 and HD-580. The Sony MDRV6 excells as the best value of any headphone that I've ever owned. They are especially good at producing natural-sounding deep bass, and the overall octave-to-octave balance is excellent. The sound quality is all in all comparable to other headphones that cost considerably more.The MDRV6 is especially well-suited for travel due to the collapsible design, and for noisy environments or for quiet environments such as a library where the sound emitted by open-back headphones would distract others.The MDRV6 is not quite as comfortable as some ot