This TPU VM cheatsheet uses and was tested with the following library versions:
Library | Version |
---|---|
JAX | 0.3.25 |
FLAX | 0.6.4 |
Datasets | 2.10.1 |
Transformers | 4.27.1 |
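To compare a local environment against the versions in the table, a minimal sketch along these lines may help. It uses only the standard library's `importlib.metadata`. The distribution names (`jax`, `flax`, `datasets`, `transformers`) are assumed to be the usual PyPI names, and the helpers `installed_version` and `report_mismatches` are hypothetical names introduced here, not part of the cheatsheet.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the table above. The keys are assumed to be the
# PyPI distribution names of the listed libraries.
TESTED_VERSIONS = {
    "jax": "0.3.25",
    "flax": "0.6.4",
    "datasets": "2.10.1",
    "transformers": "4.27.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report_mismatches(expected):
    """Map each package whose installed version differs to (expected, installed)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per library that is missing or at a different version.
    for package, (wanted, found) in report_mismatches(TESTED_VERSIONS).items():
        print(f"{package}: tested with {wanted}, installed {found or 'not installed'}")
```

An exact string comparison is deliberately strict here: other versions may well work, but they would not be the ones the cheatsheet was tested with.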
from huggingface_hub.hf_api import (  # type: ignore
    REPO_TYPES,
    REPO_TYPES_URL_PREFIXES,
    HfApi,
    _raise_for_status,
)

def update_repo_settings(
    hf_api: HfApi,
    repo_id: str,
#
# Author: Cody Buntain
# Date: 19 March 2020
#
# Description:
#   This code is an example of using the agreement package
#   in NLTK to calculate a number of agreement metrics on
#   a set of annotations. Currently, this code will work
#   with two annotators and multiple labels.
#   You can use Fleiss's Kappa or Krippendorff's Alpha if you
So in the midst of this era of "contextualized" language models, all named after Sesame Street characters and robots that transform into automobiles, there is this "Toronto Book Corpus" that points to this kinda recently influential paper:
Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. "Aligning books and movies: Towards story-like visual explanations by watching movies and reading books." In Proceedings of the IEEE international conference on computer vision, pp. 19-27.
Some might know my personal pet peeve about collecting translation datasets, but this BookCorpus has no translations, so why do I even care about it?
""" Implementation of Okapi BM25 with sklearn's TfidfVectorizer
Distributed as CC-0 (https://creativecommons.org/publicdomain/zero/1.0/)
"""
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy import sparse

class BM25(object):
# Requirements
#sudo apt-get install libcurl4-gnutls-dev # for RCurl on linux
#install.packages('RCurl')
#install.packages('RJSONIO')
library('RCurl')
library('RJSONIO')
query <- function(querystring) {
  h = basicTextGatherer()