@ddrscott
Created July 25, 2023 14:50
import torchaudio
from IPython.display import Audio
from speechbrain.pretrained import Tacotron2
from speechbrain.pretrained import HIFIGAN

# Initialize the TTS model (Tacotron2) and the vocoder (HiFi-GAN).
# Pretrained weights are downloaded into the savedir folders on first use.
tacotron2 = Tacotron2.from_hparams(source="speechbrain/tts-tacotron2-ljspeech", savedir="tmpdir_tts")
hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")

# Run the TTS model (text-to-spectrogram)
mel_output, mel_length, alignment = tacotron2.encode_text("This is an open-source toolkit for the development of speech technologies.")

# Run the vocoder (spectrogram-to-waveform)
waveforms = hifi_gan.decode_batch(mel_output)

# Play the result inline (notebook only); the LJSpeech models produce 22050 Hz audio
Audio(waveforms.squeeze().cpu().numpy(), rate=22050)
pip install torchaudio speechbrain
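
The snippet above only plays the audio inside a notebook. To keep the result as a file, here is a minimal sketch that writes mono float samples to a 16-bit PCM WAV using only the standard library. `write_wav` is a hypothetical helper, not part of SpeechBrain or torchaudio, and it assumes the samples lie roughly in [-1.0, 1.0] (values outside that range are clipped). The 22050 Hz default mirrors the rate used in the gist. Since torchaudio is already installed, `torchaudio.save` is probably the more direct route; this version just avoids depending on an audio backend.

```python
import struct
import wave


def write_wav(samples, path, sample_rate=22050):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    Hypothetical helper for illustration; returns the number of frames written.
    """
    # Clip each sample to [-1.0, 1.0], then scale to a signed 16-bit integer.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        # Little-endian signed shorts, one per sample.
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return len(pcm)


# With the gist's output this could be called as (assumption: a single utterance):
# write_wav(waveforms.squeeze().tolist(), "example_tts.wav")
```

`waveforms.squeeze().tolist()` flattens the single-utterance tensor into a plain list of floats, which is all the helper needs.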