Shigeki Karita ShigekiKarita

@ShigekiKarita
ShigekiKarita / data.sh.diff
Created July 23, 2023 00:07
Diff of the espnet/egs2/*/tts1/local/ scripts between LibriTTS-R and LibriTTS
34c34
< data_url=www.openslr.org/resources/60
---
> data_url=www.openslr.org/resources/141
69c69
< cp ${db_root}/LibriTTS/SPEAKERS.txt data/local
---
> cp ${db_root}/LibriTTS/train-clean-100/SPEAKERS.txt data/local
@ShigekiKarita
ShigekiKarita / espnet_tts_docker.sh
Last active February 15, 2021 09:08
How to run pretrained ESPnet TTS in docker. See also https://espnet.github.io/espnet/docker.html
git clone https://github.com/espnet/espnet
cd espnet
git checkout v.0.9.7
# Prepare TTS input
echo "HELLO WORLD" > egs/ljspeech/tts1/text
# Run TTS in docker. If you have GPUs, use "espnet/espnet:gpu-cuda10.1-cudnn7-u18"
# https://hub.docker.com/r/espnet/espnet/tags
docker run --rm -v $(pwd)/utils:/espnet/utils -v $(pwd)/espnet:/espnet/espnet -v $(pwd)/egs:/espnet/egs \
# Count the number of parameters in the saved model.
import torch
# Run these commands to get this model.
# $ cd $ESPNET_ROOT/egs/mini_an4/asr1; ./run.sh
state = torch.load("./exp/train_nodev_pytorch_train/results/model.acc.best")
print(sum(v.numel() for v in state.values()))
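The one-liner above assumes the loaded checkpoint is a flat mapping from parameter names to tensors. As a minimal sketch, assuming a checkpoint might instead nest the weights under another key (an assumption for illustration, not something this gist states), the count can be written to tolerate both layouts. `count_parameters` and `FakeTensor` are hypothetical names; the stand-in class exists only so the example runs without PyTorch installed, and a real `torch.Tensor` would work the same way because it also provides `numel()`.

```python
from collections.abc import Mapping
from typing import Any


def count_parameters(state: Any) -> int:
    """Sum numel() over every tensor-like leaf of a possibly nested mapping."""
    if hasattr(state, "numel"):
        # Tensor-like leaf: count its elements.
        return int(state.numel())
    if isinstance(state, Mapping):
        # dict / OrderedDict: recurse into the values.
        return sum(count_parameters(v) for v in state.values())
    # Anything else (ints, strings, ...) contributes no parameters.
    return 0


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor that only provides numel()."""

    def __init__(self, *shape: int) -> None:
        self.shape = shape

    def numel(self) -> int:
        n = 1
        for dim in self.shape:
            n *= dim
        return n


flat = {"enc.weight": FakeTensor(4, 3), "enc.bias": FakeTensor(4)}
nested = {"model": flat, "epoch": 10}
print(count_parameters(flat))
print(count_parameters(nested))
```

With a real checkpoint, the argument would be the object returned by `torch.load(...)`; non-tensor entries such as an epoch counter are ignored rather than raising an error.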
210c210
< print B "( " . $cmd . ") 2>>$logfile >> $logfile";
---
> print B "( " . $cmd . ") |& tee -a $logfile";
#!/bin/bash
# Copyright 2019 Nagoya University (Masao Someki)
# Copyright 2019 Shigeki Karita
# Apache 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
. ./path.sh || exit 1;
. ./cmd.sh || exit 1;
# general configuration
@ShigekiKarita
ShigekiKarita / beam_search_func.py
Created August 17, 2019 08:44
functional implementation of beam search in ESPnet PR https://github.com/espnet/espnet/pull/1092
import logging
from typing import Dict
from typing import NamedTuple
from typing import Tuple
import torch
from espnet.nets.e2e_asr_common import end_detect
from espnet.nets.scorer_interface import PartialScorerInterface
from espnet.nets.scorer_interface import ScorerInterface
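The preview above stops at the imports, so the following is only a rough sketch of what a functional (immutable-state) beam search can look like, not the code from the PR. It avoids `torch` and the ESPnet scorer interfaces entirely; `Hypothesis`, `Scorer`, `expand`, `beam_search`, and `toy_scorer` are hypothetical names, and the stopping rule is a plain length limit rather than ESPnet's `end_detect`.

```python
import math
from typing import Callable, Dict, List, NamedTuple, Tuple


class Hypothesis(NamedTuple):
    """Immutable partial output: token ids and accumulated log-probability."""

    tokens: Tuple[int, ...]
    score: float


# Hypothetical scorer type: maps a token prefix to {next_token: log_prob}.
Scorer = Callable[[Tuple[int, ...]], Dict[int, float]]


def expand(hyp: Hypothesis, scorer: Scorer) -> List[Hypothesis]:
    """Return new hypotheses, one per next token; the input is not mutated."""
    return [
        Hypothesis(hyp.tokens + (tok,), hyp.score + logp)
        for tok, logp in scorer(hyp.tokens).items()
    ]


def beam_search(
    scorer: Scorer, sos: int, eos: int, beam_size: int, max_len: int
) -> List[Hypothesis]:
    """Return hypotheses that ended with eos, best first (may be empty)."""
    running = [Hypothesis((sos,), 0.0)]
    ended: List[Hypothesis] = []
    for _ in range(max_len):
        candidates = [c for h in running for c in expand(h, scorer)]
        best = sorted(candidates, key=lambda h: h.score, reverse=True)[:beam_size]
        ended += [h for h in best if h.tokens[-1] == eos]
        running = [h for h in best if h.tokens[-1] != eos]
        if not running:
            break
    return sorted(ended, key=lambda h: h.score, reverse=True)


def toy_scorer(prefix: Tuple[int, ...]) -> Dict[int, float]:
    """Toy model over ids 0=sos, 1=eos, 2, 3: one token, then most likely eos."""
    if len(prefix) == 1:
        return {2: math.log(0.6), 3: math.log(0.4)}
    return {1: math.log(0.9), 2: math.log(0.1)}


results = beam_search(toy_scorer, sos=0, eos=1, beam_size=2, max_len=4)
for hyp in results:
    print(hyp.tokens, round(math.exp(hyp.score), 2))
```

Because each step builds new `Hypothesis` tuples instead of updating state in place, every function here is a pure mapping from inputs to outputs, which is the property the gist description calls "functional".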
@ShigekiKarita
ShigekiKarita / beam_search.py
Last active August 17, 2019 09:10
alternative implementation in ESPnet PR https://github.com/espnet/espnet/pull/1092
import logging
from typing import Any
from typing import Dict
from typing import List
from typing import NamedTuple
from typing import Tuple
import torch
from espnet.nets.e2e_asr_common import end_detect
@ShigekiKarita
ShigekiKarita / ntt.md
Created August 8, 2019 14:17
NTT at INTERSPEECH2019

retrieved from https://interspeech2019.org/program/schedule/

Tutorials

[T6] Advanced methods for neural end-to-end speech processing – unification, integration, and implementation. Sunday, 15 September, 1400–1730, Hall 1. Takaaki Hori (Mitsubishi Electric Research Laboratories), Tomoki Hayashi (Department of Information Science, Nagoya University), Shigeki Karita (NTT Communication Science Laboratories), Shinji Watanabe (Center for Language and Speech Processing, Johns Hopkins University)

[T8] Microphone array signal processing and deep learning for speech enhancement – strong together. Sunday, 15 September, 1400–1730, Hall 11. Reinhold Haeb-Umbach (Department of Communications Engineering, Paderborn University), Tomohiro Nakatani (NTT Communication Science Laboratories)