As suggested in https://stackoverflow.com/q/21883108/610569:
def zipgrams(sequence, n):
    """ From https://stackoverflow.com/q/21883108/610569"""
    return zip(*[sequence[i:] for i in range(n)])
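A short usage sketch of the n-gram helper above. The example sentence and the variable names are illustrative assumptions, not taken from the source. Because the function slices its input, it expects a sliceable sequence (list, tuple or string) rather than a general iterator.

```python
def zipgrams(sequence, n):
    """ From https://stackoverflow.com/q/21883108/610569"""
    # zip() stops at the shortest slice, so only full n-grams are produced.
    return zip(*[sequence[i:] for i in range(n)])

# Hypothetical input: a whitespace-tokenized sentence.
tokens = "this is a foo bar".split()

bigrams = list(zipgrams(tokens, 2))
trigrams = list(zipgrams(tokens, 3))

print(bigrams)   # [('this', 'is'), ('is', 'a'), ('a', 'foo'), ('foo', 'bar')]
print(trigrams)  # [('this', 'is', 'a'), ('is', 'a', 'foo'), ('a', 'foo', 'bar')]
```

In Python 3, `zip` returns a lazy iterator, so wrap the result in `list()` if the n-grams need to be reused or indexed.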
$ make -j $(nproc)
Scanning dependencies of target nccl_install
Scanning dependencies of target marian_version
Scanning dependencies of target pathie-cpp
Scanning dependencies of target SQLiteCpp
Scanning dependencies of target libyaml-cpp
Scanning dependencies of target zlib
[ 0%] Running cpp protocol buffer compiler on sentencepiece_model.proto
[ 1%] Running cpp protocol buffer compiler on sentencepiece.proto
[ 2%] Running cpp protocol buffer compiler on sentencepiece_model.proto
[2019-08-15 08:31:02] [marian] Marian v1.7.8 c65c26d6 2019-08-11 18:27:00 +0100
[2019-08-15 08:31:02] [marian] Running on walle3 as process 24138 with command line:
[2019-08-15 08:31:02] [marian] /home/xyz/marian-dev/build/marian --model /disk2/models/xx-yy-r0/model.npz --type transformer --train-sets /disk2/data/xx-yy/train.sk /disk2/data/xx-yy/train.en --vocabs /disk2/models/xx-yy-r0/vocab.src.spm /disk2/models/xx-yy-r0/vocab.trg.spm --dim-vocabs 32000 32000 --mini-batch-fit --mini-batch 1000 --maxi-batch 1000 --valid-freq 10000 --save-freq 10000 --disp-freq 500 --valid-metrics ce-mean-words perplexity bleu-detok --valid-sets /disk2/data/xx-yy/valid.sk /disk2/data/xx-yy/valid.en --quiet-translation --beam-size 6 --normalize=0.6 --valid-mini-batch 16 --early-stopping 5 --cost-type=ce-mean-words --log /disk2/models/xx-yy-r0/train.log --valid-log /disk2/models/xx-yy-r0/valid.log --enc-depth 6 --dec-depth 6 --transformer-preprocess n --transformer-postprocess da --tied-embeddings-all --dim-emb 1024 --transforme
The Project Gutenberg EBook of The Adventures of Sherlock Holmes
by Sir Arthur Conan Doyle
(#15 in our series by Sir Arthur Conan Doyle)
Copyright laws are changing all over the world. Be sure to check the
copyright laws for your country before downloading or redistributing
this or any other Project Gutenberg eBook.
This header should be the first thing seen when viewing this Project
Gutenberg file. Please do not remove it. Do not change or edit the
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential([
    Dense(32, input_shape=(784,)),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
$ th
  ______             __   |  Torch7
 /_  __/__  ________/ /   |  Scientific computing for Lua.
  / / / _ \/ __/ __/ _ \  |
 /_/  \___/_/  \__/_//_/  |  https://github.com/torch
                          |  http://torch.ch
th> torch.Tensor{1,2,3}
 1
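# Assumed imports: Dictionary is taken to be gensim's, which provides patch_with_special_tokens.
import torch
from gensim.corpora import Dictionary
from torch.utils.data import Dataset

class ToxicDataset(Dataset):
    def __init__(self, texts, labels):
        self.texts = texts
        # Build the vocabulary once, then reserve ids 0 and 1 for the special tokens.
        special_tokens = {'<pad>': 0, '<unk>': 1}
        self.vocab = Dictionary(texts)
        self.vocab.patch_with_special_tokens(special_tokens)
        # Vectorize labels
        self.labels = torch.tensor(labels)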
import os
from argparse import Namespace
from collections import Counter
import json
import re
import string
import numpy as np
import pandas as pd
import torch
nationality | nationality_index | split | surname
---|---|---|---
Arabic | 15 | train | Totah
Arabic | 15 | train | Abboud
Arabic | 15 | train | Fakhoury
Arabic | 15 | train | Srour
Arabic | 15 | train | Sayegh
Arabic | 15 | train | Cham
Arabic | 15 | train | Haik
Arabic | 15 | train | Kattan
Arabic | 15 | train | Khouri
Language is never, ever, ever, random
ADAM KILGARRIFF
Abstract
Language users never choose words randomly, and language is essentially
non-random. Statistical hypothesis testing uses a null hypothesis, which