Vadim Kantorov (vadimkantorov): public gists

Block or report user

Report or block vadimkantorov

Hide content and notifications from this user.

Learn more about blocking users

Contact Support about this user’s behavior.

Learn more about reporting abuse

Report abuse
View GitHub Profile
@vadimkantorov
vadimkantorov / vad.py
Last active Feb 6, 2020
Applying WebRTC Voice Activity Detection (VAD) to an audio file and saving the result in a WAV file along with original audio for inspection with Audacity
import argparse
import subprocess
import numpy as np
import scipy.io.wavfile
import scipy.ndimage
import webrtcvad
parser = argparse.ArgumentParser()
parser.add_argument('--audio-path', '-i', required = True)
parser.add_argument('--sample-rate', '-r', type = int, default = 8_000, choices = [8_000, 16_000, 32_000, 48_000], help = 'Sample rate used to load and normalize audio (in Hz)')
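# The preview cuts off after the argument parser; below is a hedged sketch of how the
# rest of the pipeline could look. The ffmpeg decoding command, frame size, smoothing,
# output file name and the VAD aggressiveness are assumptions, not the gist's actual code.
args = parser.parse_args()
# decode to 16-bit mono PCM at the requested rate (webrtcvad requires 8/16/32/48 kHz mono s16)
pcm = subprocess.check_output(['ffmpeg', '-i', args.audio_path, '-f', 's16le', '-ac', '1', '-ar', str(args.sample_rate), '-'])
audio = np.frombuffer(pcm, dtype = np.int16)
vad = webrtcvad.Vad(3)  # aggressiveness 0..3; 3 is the most aggressive
frame_len = args.sample_rate // 100  # 10 ms frames, one of the durations webrtcvad accepts
num_frames = len(audio) // frame_len
is_speech = np.array([vad.is_speech(audio[i * frame_len : (i + 1) * frame_len].tobytes(), args.sample_rate) for i in range(num_frames)])
# scipy.ndimage can smooth spurious single-frame flips (an assumed use of the import above)
is_speech = scipy.ndimage.binary_closing(is_speech, np.ones(3, dtype = bool))
# expand the frame mask back to samples and save it as a second channel next to the audio,
# so both can be eyeballed together in Audacity
mask = np.repeat(is_speech, frame_len).astype(np.int16) * np.abs(audio[:num_frames * frame_len]).max()
scipy.io.wavfile.write('vad.wav', args.sample_rate, np.stack([audio[:len(mask)], mask], axis = -1))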
vadimkantorov / ctc_forward.py
Last active Jan 3, 2020
A primitive forward pass of CTC loss
# reimpl of forward pass from https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/LossCTC.cpp#L37
# a vectorized version in https://github.com/vadimkantorov/ctc
import torch
# supports only reduction = 'none' and does not support zero_infinity = True
def ctc_loss(log_probs, targets, input_lengths, target_lengths, blank = 0):
    # interleave the targets with blanks: [blank, t1, blank, t2, ..., blank]
    targets_ = torch.full((targets.shape[0], 2 * targets.shape[-1] + 1), blank, device = targets.device, dtype = targets.dtype)
    temporal_mask = torch.arange(targets.shape[-1], device = input_lengths.device, dtype = input_lengths.dtype).unsqueeze(0) < target_lengths.unsqueeze(1)
    targets_[:, 1::2] = temporal_mask * targets + (~temporal_mask) * targets_[:, 1::2]
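    # The preview ends here; what follows is a hedged sketch of the standard CTC log-alpha
    # recursion that the rest of the gist presumably implements. It assumes time-major
    # log_probs of shape (T, B, C) as in torch.nn.functional.ctc_loss and skips edge cases
    # such as empty targets.
    B, L = targets_.shape
    T = log_probs.shape[0]
    targets_ = targets_.long()
    log_alpha = torch.full((B, L), float('-inf'), device = log_probs.device, dtype = log_probs.dtype)
    log_alpha[:, 0] = log_probs[0, :, blank]
    log_alpha[:, 1] = log_probs[0].gather(-1, targets_[:, 1:2]).squeeze(-1)
    # a jump from s - 2 is allowed only when the label differs from the label two positions back
    allow_skip = torch.cat([torch.zeros(B, 2, dtype = torch.bool, device = targets_.device), targets_[:, 2:] != targets_[:, :-2]], dim = -1)
    neg_inf = torch.full_like(log_alpha, float('-inf'))
    alphas = [log_alpha]
    for t in range(1, T):
        prev = torch.stack([log_alpha, torch.cat([neg_inf[:, :1], log_alpha[:, :-1]], dim = -1), torch.where(allow_skip, torch.cat([neg_inf[:, :2], log_alpha[:, :-2]], dim = -1), neg_inf)])
        log_alpha = log_probs[t].gather(-1, targets_) + prev.logsumexp(dim = 0)
        alphas.append(log_alpha)
    alphas = torch.stack(alphas)  # (T, B, L)
    # the loss sums the two valid end states: the final label and the trailing blank
    batch_idx = torch.arange(B)
    l1 = alphas[input_lengths - 1, batch_idx, 2 * target_lengths]
    l2 = alphas[input_lengths - 1, batch_idx, 2 * target_lengths - 1]
    return -torch.stack([l1, l2]).logsumexp(dim = 0)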
vadimkantorov / ctc_alignment_targets.py
Last active Jan 3, 2020
An implementation of CTC re-formulation via cross-entropy with pseudo-labels, following "A Novel Re-weighting Method for Connectionist Temporal Classification"
# Vanilla CTC and CTC via cross-entropy are equal in value, and so are their gradients. In this reformulation it's easier to experiment with modifications of CTC.
# References on CTC regularization:
# "A Novel Re-weighting Method for Connectionist Temporal Classification", Li et al, https://arxiv.org/abs/1904.10619
# "Focal CTC Loss for Chinese Optical Character Recognition on Unbalanced Datasets", Feng et al, https://www.hindawi.com/journals/complexity/2019/9345861/
# "Improved training for online end-to-end speech recognition systems", Kim et al, https://arxiv.org/abs/1711.02212
import torch
import torch.nn.functional as F
## generate example data
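# The preview cuts off here; below is a hedged sketch of the core trick: since the gradient
# of CTC w.r.t. logits is softmax(logits) minus the alignment posteriors, the pseudo-labels
# fall out of a single autograd call and can be reused in a cross-entropy. Sizes are made up.
T, B, C = 16, 4, 8
logits = torch.randn(T, B, C, requires_grad = True)
targets = torch.randint(1, C, (B, 5))
input_lengths = torch.full((B,), T, dtype = torch.long)
target_lengths = torch.full((B,), 5, dtype = torch.long)
log_probs = F.log_softmax(logits, dim = -1)
ctc = F.ctc_loss(log_probs, targets, input_lengths, target_lengths, blank = 0, reduction = 'sum')
(grad_logits,) = torch.autograd.grad(ctc, logits, retain_graph = True)
alignment_targets = (log_probs.exp() - grad_logits).detach()
ctc_via_ce = (-alignment_targets * log_probs).sum()
print(ctc.item(), ctc_via_ce.item())  # equal per the note above, with matching gradients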
vadimkantorov / random_kitchen_sinks.py
Created Dec 11, 2019
A primitive implementation of (Convolutional) Random Kitchen Sinks in PyTorch
# Convolutional Kitchen Sinks: https://arxiv.org/abs/1706.00125
# Random Features for Large-Scale Kernel Machines, https://papers.nips.cc/paper/3182-random-features-for-large-scale-kernel-machines.pdf
# Historical review: http://www.argmin.net/2017/12/05/kitchen-sinks/
import math
import torch
class RandomKitchenSinks(torch.nn.Conv1d):
    def forward(self, x):
        return super().forward(x).cos()
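# A hedged usage sketch (channel sizes and kernel size are made up). Kitchen sinks keep
# their random projection fixed, so gradients are disabled and only a downstream linear
# readout would be trained.
rks = RandomKitchenSinks(in_channels = 40, out_channels = 512, kernel_size = 11)
for p in rks.parameters():
    p.requires_grad = False
x = torch.randn(8, 40, 100)  # (batch, channels, time)
features = rks(x)  # (8, 512, 90): random convolutional projections passed through cos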
vadimkantorov / git_rebase_pull_request.sh
Created Oct 16, 2019
How to rebase a PyTorch pull request (PR)
git remote add upstream https://github.com/pytorch/pytorch.git # do once
git fetch upstream master
# git rebase -i HEAD~2 # optional pick + squash
git rebase upstream/master
git push -f # force-push the rebased branch to your fork; the open PR picks it up automatically
vadimkantorov / find_domain_words.py
Last active Nov 5, 2019
Compare two ARPA language models with KenLM
# Usage: python3 find_domain_words.py --ours chats.arpa --theirs ru_wiyalen_no_punkt.arpa.binary > domain_words.txt
import argparse
import kenlm
parser = argparse.ArgumentParser()
parser.add_argument('--ours', required = True)
parser.add_argument('--theirs', required = True)
args = parser.parse_args()
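# The preview cuts off here; below is a hedged sketch of the comparison step. Where the
# candidate words come from is not shown, so a plain-text vocabulary file (one word per
# line) and the threshold are assumptions.
ours = kenlm.Model(args.ours)
theirs = kenlm.Model(args.theirs)
for line in open('vocab.txt'):
    word = line.strip()
    # kenlm scores are log10 probabilities; a large positive gap means the word is much
    # more likely under our in-domain model than under the generic one
    gap = ours.score(word, bos = False, eos = False) - theirs.score(word, bos = False, eos = False)
    if gap > 1.0:
        print(word)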
vadimkantorov / perlin.py
Last active Sep 5, 2019
Perlin noise in PyTorch
# ported from https://github.com/pvigier/perlin-numpy/blob/master/perlin2d.py
import torch
import math
def rand_perlin_2d(shape, res, fade = lambda t: 6*t**5 - 15*t**4 + 10*t**3):
    delta = (res[0] / shape[0], res[1] / shape[1])
    d = (shape[0] // res[0], shape[1] // res[1])
    grid = torch.stack(torch.meshgrid(torch.arange(0, res[0], delta[0]), torch.arange(0, res[1], delta[1])), dim = -1) % 1
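    # The preview ends here; a hedged sketch of the rest of the port, following the linked
    # perlin-numpy reference. It assumes shape is divisible by res so the tiled gradient
    # grids line up exactly.
    angles = 2 * math.pi * torch.rand(res[0] + 1, res[1] + 1)
    gradients = torch.stack((torch.cos(angles), torch.sin(angles)), dim = -1)
    tile_grads = lambda slice1, slice2: gradients[slice1[0]:slice1[1], slice2[0]:slice2[1]].repeat_interleave(d[0], 0).repeat_interleave(d[1], 1)
    dot = lambda grad, shift: (torch.stack((grid[..., 0] + shift[0], grid[..., 1] + shift[1]), dim = -1) * grad).sum(dim = -1)
    # dot products between corner gradients and offset vectors at the four cell corners
    n00 = dot(tile_grads([0, -1], [0, -1]), [0, 0])
    n10 = dot(tile_grads([1, None], [0, -1]), [-1, 0])
    n01 = dot(tile_grads([0, -1], [1, None]), [0, -1])
    n11 = dot(tile_grads([1, None], [1, None]), [-1, -1])
    t = fade(grid)
    # smooth bilinear blend of the corner contributions, scaled as in the numpy original
    return math.sqrt(2) * torch.lerp(torch.lerp(n00, n10, t[..., 0]), torch.lerp(n01, n11, t[..., 0]), t[..., 1])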
vadimkantorov / invconv1x1.py
Last active Aug 20, 2019
Invertible 1x1 convolution in pure PyTorch (extracted from Glow packages)
# Original code from OpenAI Glow: https://github.com/openai/glow/blob/master/model.py
# This impl is inspired by this PyTorch reference: https://github.com/rosinality/glow-pytorch/blob/master/model.py
# This impl does not include inverse() and log_abs_det_jacobian() computation.
import torch
class InvConvNd(torch.nn.Module):
    def __init__(self, in_channels, gain = 1e-3):
        super().__init__()
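        # Hedged continuation sketch: Glow initializes the 1x1 convolution with a random
        # rotation matrix (QR decomposition of a Gaussian), kept here as a plain learnable
        # parameter. How the gist actually uses `gain` is not visible in the preview.
        self.weight = torch.nn.Parameter(torch.linalg.qr(torch.randn(in_channels, in_channels))[0])

    def forward(self, x):
        # flattening all spatial dims into one lets the same conv1d call serve 1d/2d/3d inputs,
        # since a 1x1 convolution only mixes channels pointwise
        return torch.nn.functional.conv1d(x.flatten(start_dim = 2), self.weight.unsqueeze(-1)).view_as(x)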
vadimkantorov / larc.py
Last active Aug 13, 2019
LARC gradient clipping in PyTorch
# ported from https://github.com/NVIDIA/OpenSeq2Seq/blob/master/open_seq2seq/optimizers/optimizers.py
# paper: https://arxiv.org/abs/1708.03888
# more advanced PyTorch variant: https://github.com/NVIDIA/apex/blob/master/apex/parallel/LARC.py
# Usage: larc_(optimizer.param_groups, larc_eta = 1e-3)
import torch
def larc_(param_groups, larc_eta = 1e-3, larc_mode = 'clip', min_update = 1e-7, eps = 1e-7):
    for group in param_groups:
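        # The preview ends here; a hedged sketch of the loop body, following the linked
        # OpenSeq2Seq logic: a per-parameter trust ratio eta * ||w|| / ||g||, either clipped
        # against the learning rate ('clip' mode) or used directly as a scale ('scale' mode).
        lr = group['lr']
        for p in group['params']:
            if p.grad is None:
                continue
            v_norm, g_norm = p.norm(), p.grad.norm()
            larc_grad_update = larc_eta * v_norm / (g_norm + eps)
            if larc_mode == 'clip':
                larc_grad_update = torch.clamp(larc_grad_update / lr, max = 1.0)
            # min_update keeps the multiplier from collapsing to zero (assumed placement)
            p.grad.mul_(torch.clamp(larc_grad_update, min = min_update))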
vadimkantorov / novograd.py
Last active Aug 25, 2019
NovoGrad optimizer in PyTorch
# ported from https://github.com/NVIDIA/OpenSeq2Seq/blob/master/open_seq2seq/optimizers/novograd.py
# paper: https://arxiv.org/abs/1905.11286
# a more recent NVIDIA implementation in PyTorch: https://github.com/NVIDIA/DeepLearningExamples/blob/master/PyTorch/SpeechRecognition/Jasper/optimizers.py
import torch
class NovoGrad(torch.optim.Optimizer):
    def __init__(self, params, lr=1.0, betas = (0.95, 0.98), eps=1e-8, weight_decay=0.0, dampening=False):
        defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay, dampening=dampening)
        super(NovoGrad, self).__init__(params, defaults)
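    # The preview ends here; a hedged sketch of the step following the linked OpenSeq2Seq
    # code. NovoGrad's distinctive detail is that the second moment v is a per-tensor scalar
    # (||g||^2), unlike Adam's elementwise one.
    @torch.no_grad()
    def step(self, closure = None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            beta1, beta2 = group['betas']
            for p in group['params']:
                if p.grad is None:
                    continue
                g, state = p.grad, self.state[p]
                if len(state) == 0:
                    state['v'] = g.pow(2).sum()
                    state['m'] = g / (state['v'].sqrt() + group['eps']) + group['weight_decay'] * p
                else:
                    state['v'] = beta2 * state['v'] + (1 - beta2) * g.pow(2).sum()
                    update = g / (state['v'].sqrt() + group['eps']) + group['weight_decay'] * p
                    state['m'] = beta1 * state['m'] + ((1 - beta1) if group['dampening'] else 1.0) * update
                p.add_(state['m'], alpha = -group['lr'])
        return loss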