TTS_example.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "TTS_example.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true,
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/erogol/97516ad65b44dbddb8cd694953187c5b/tts_example.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cjD0xW0cEMVT"
},
"source": [
"## Hands-on example for 🐸 [Coqui TTS](https://github.com/coqui-ai/TTS)\n",
"\n",
"This notebook trains Tacotron model on LJSpeech dataset."
]
},
{
"cell_type": "code",
"metadata": {
"id": "XGiNTMShZYvj"
},
"source": [
"# download LJSpeech dataset\n",
"!wget http://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2\n",
"# decompress\n",
"!tar -xjf LJSpeech-1.1.tar.bz2"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "__k0BrbfLQ-F"
},
"source": [
"# create train-val splits\n",
"!shuf LJSpeech-1.1/metadata.csv > LJSpeech-1.1/metadata_shuf.csv\n",
"!head -n 12000 LJSpeech-1.1/metadata_shuf.csv > LJSpeech-1.1/metadata_train.csv\n",
"!tail -n 1100 LJSpeech-1.1/metadata_shuf.csv > LJSpeech-1.1/metadata_val.csv"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "pyJwcU9pDUE-"
},
"source": [
"# get TTS to your local\n",
"!git clone https://github.com/coqui-ai/TTS"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "zV-vHTWyirQv"
},
"source": [
"# install espeak backend if you like to use phonemes instead of raw characters\n",
"!sudo apt-get install espeak-ng"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "xwvg3-nVDL5t"
},
"source": [
"%cd TTS\n",
"# install TTS requirements\n",
"!pip install -e ."
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "y7_Xao7uNOvX"
},
"source": [
"# load the default config file and update with the local paths and settings.\n",
"import json\n",
"from TTS.utils.io import load_config\n",
"CONFIG = load_config('/content/TTS/TTS/tts/configs/config.json')\n",
"CONFIG['datasets'][0]['path'] = '../LJSpeech-1.1/' # set the target dataset to the LJSpeech\n",
"CONFIG['audio']['stats_path'] = None # do not use mean and variance stats to normalizat spectrograms. Mean and variance stats need to be computed separately. \n",
"CONFIG['output_path'] = '../'\n",
"with open('config.json', 'w') as fp:\n",
" json.dump(CONFIG, fp)\n"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "8L3JjJOBErxq"
},
"source": [
"# pull the trigger\n",
"!CUDA_VISIBLE_DEVICES=\"0\" python TTS/bin/train_tacotron.py --config_path config.json | tee training.log"
],
"execution_count": null,
"outputs": []
}
]
}

@hypernote hypernote commented Aug 14, 2019

Thanks for sharing your setup!

I've received the following error after running it:

RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `bitwise_not()` operator instead

any ideas?


@erogol erogol commented Aug 15, 2019

It's caused by the updated version of PyTorch. It needs to be fixed on the TTS side.


@hypernote hypernote commented Aug 15, 2019

Thanks, @erogol. I reverted to torch 1.1.0 and it worked fine.
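
(For anyone hitting the same incompatibility, a minimal way to pin the older PyTorch in a Colab cell, assuming torch 1.1.0 is still installable from PyPI:)

!pip install torch==1.1.0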


@shad94 shad94 commented Nov 17, 2019

I am receiving such an error:

  File "train.py", line 13, in <module>
    from TTS.datasets.TTSDataset import MyDataset
ModuleNotFoundError: No module named 'TTS'

Where might I be making a mistake?

I am in the TTS project directory; LJSpeech is in a subdirectory of it.


@erogol erogol commented Nov 18, 2019

It should be working now.


@shad94 shad94 commented Nov 18, 2019

I don't know, maybe I am doing something wrong:

I added

!python setup.py develop

just in case, after json import.


Epoch 0/1000
Traceback (most recent call last):
File "train.py", line 704, in
main(args)
File "train.py", line 615, in main
global_step, epoch)
File "train.py", line 100, in train
for num_iter, data in enumerate(data_loader):
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in next
return self._process_data(data)
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 385, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/marta/Desktop/inż/TTS/tts_namespace/TTS/datasets/TTSDataset.py", line 164, in getitem
return self.load_data(idx)
File "/home/marta/Desktop/inż/TTS/tts_namespace/TTS/datasets/TTSDataset.py", line 110, in load_data
wav = np.asarray(self.load_wav(wav_file), dtype=np.float32)
File "/home/marta/Desktop/inż/TTS/tts_namespace/TTS/datasets/TTSDataset.py", line 68, in load_wav
audio = self.ap.load_wav(filename)
File "/home/marta/Desktop/inż/TTS/tts_namespace/TTS/utils/audio.py", line 239, in load_wav
x, sr = sf.read(filename)
File "/home/marta/anaconda3/lib/python3.7/site-packages/soundfile.py", line 257, in read
subtype, endian, format, closefd) as f:
File "/home/marta/anaconda3/lib/python3.7/site-packages/soundfile.py", line 627, in init
self._file = self._open(file, mode_int, closefd)
File "/home/marta/anaconda3/lib/python3.7/site-packages/soundfile.py", line 1182, in _open
"Error opening {0!r}: ".format(self.name))
File "/home/marta/anaconda3/lib/python3.7/site-packages/soundfile.py", line 1355, in _error_check
raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening '../LJSpeech-1.1/wavs/LJ002-0020.wav': System error.


@erogol erogol commented Nov 18, 2019

@shad I don't know what is wrong with your run, since I just re-ran the notebook and it worked all the way through.


@shad94 shad94 commented Nov 18, 2019

@erogol , is CUDA required?


@nmstoker nmstoker commented Nov 18, 2019

Is it possible that you don't have the file in location '../LJSpeech-1.1/wavs/LJ002-0020.wav' relative to the notebook's current directory?
I wouldn't jump to thinking it's due to CUDA (unless there's something specific in the trace that makes you suggest CUDA)

I reckon checking the file makes sense. Also turned out to be the cause for someone else with a very similar error message when using soundfile too:
bastibe/python-soundfile#227
(I admit they're on Windows but the other aspects seem similar)


@shad94 shad94 commented Nov 19, 2019

@nmstoker, this is pretty strange, but yeah, this file is missing :o and I am using the same package as others. So, should I just exclude it from the .csv files and it should be ok?
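
(A minimal sketch of one way to drop metadata rows whose wav file is missing, assuming the LJSpeech layout and the split files created in the notebook above; paths are illustrative:)

import os

dataset_dir = 'LJSpeech-1.1'  # assumed dataset location
for split in ('metadata_train.csv', 'metadata_val.csv'):
    path = os.path.join(dataset_dir, split)
    with open(path, encoding='utf-8') as fp:
        lines = fp.readlines()
    # keep only rows whose first field (the wav basename) exists on disk
    kept = [l for l in lines
            if os.path.isfile(os.path.join(dataset_dir, 'wavs', l.split('|')[0] + '.wav'))]
    with open(path, 'w', encoding='utf-8') as fp:
        fp.writelines(kept)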


@kurianbenoy kurianbenoy commented Jan 15, 2020

I don't know, maybe I am doing something wrong:

I added

!python setup.py develop

just in case, after json import.

Epoch 0/1000
Traceback (most recent call last):
File "train.py", line 704, in
main(args)
File "train.py", line 615, in main
global_step, epoch)
File "train.py", line 100, in train
for num_iter, data in enumerate(data_loader):
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 819, in next
return self._process_data(data)
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 385, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/marta/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/marta/Desktop/inż/TTS/tts_namespace/TTS/datasets/TTSDataset.py", line 164, in getitem
return self.load_data(idx)
File "/home/marta/Desktop/inż/TTS/tts_namespace/TTS/datasets/TTSDataset.py", line 110, in load_data
wav = np.asarray(self.load_wav(wav_file), dtype=np.float32)
File "/home/marta/Desktop/inż/TTS/tts_namespace/TTS/datasets/TTSDataset.py", line 68, in load_wav
audio = self.ap.load_wav(filename)
File "/home/marta/Desktop/inż/TTS/tts_namespace/TTS/utils/audio.py", line 239, in load_wav
x, sr = sf.read(filename)
File "/home/marta/anaconda3/lib/python3.7/site-packages/soundfile.py", line 257, in read
subtype, endian, format, closefd) as f:
File "/home/marta/anaconda3/lib/python3.7/site-packages/soundfile.py", line 627, in init
self._file = self._open(file, mode_int, closefd)
File "/home/marta/anaconda3/lib/python3.7/site-packages/soundfile.py", line 1182, in _open
"Error opening {0!r}: ".format(self.name))
File "/home/marta/anaconda3/lib/python3.7/site-packages/soundfile.py", line 1355, in _error_check
raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening '../LJSpeech-1.1/wavs/LJ002-0020.wav': System error.

I also faced the same issue in a Paperspace notebook, while the same thing worked fine in Google Colab.


@shad shad commented Jan 16, 2020

FYI: @shad94, not @shad ;)


@mightmay mightmay commented Apr 23, 2020

I am new, please help me.
After I finished the training I got:
> BEST MODEL (2.42450) : ../ljspeech-stft_params-April-23-2020_06+45AM-fab74dd/best_model.pth.tar

I only trained for 10 epochs because I just want to see what to do after training finishes.
How can I use this model to make predictions?
Thank you


@mightmay mightmay commented Apr 23, 2020

When I try to predict using this command,

python3 synthesize.py hello savedmodel/config.json savedmodel/best_model.pth.tar predictout.wav

I get error:

Traceback (most recent call last):
  File "synthesize.py", line 104, in <module>
    C = load_config(args.config_path)
  File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.1+fab74dd-py3.6.egg/TTS/utils/generic_utils.py", line 26, in load_config
    data = json.loads(input_str)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 1323 (char 1322)

@nmstoker nmstoker commented Apr 23, 2020

Hi @mightmay - not sure how you got here but there are details in the associated repo (https://github.com/mozilla/TTS).
If you want help, after carefully reading the instructions and trying some common-sense debugging yourself, then it's probably best to raise this question in the forum: https://github.com/mozilla/TTS

I don't want to prejudge, but remember that if you want people to help for free, your best chance of success is making it easy for them (even if it's a little extra effort for you), so include a bit more detail about your installation / setup and what sensible things you've tried already 🙂


@mightmay mightmay commented Apr 26, 2020

Thank you,
I found out that the JSON error is because there is a line in my config.json with an invalid character: "url": "tcp://localhost:54321"
So I changed it to "url": "tcp:\/\/localhost:54321"
and that solved the problem.


@OOps717 OOps717 commented May 4, 2020

I have tried to run this example several times, on both Colab and my computer, but faced the same error.
Here is the output:

/pytorch/torch/csrc/utils/python_arg_parser.cpp:756: UserWarning: This overload of add is deprecated:
	add(Number alpha, Tensor other)
Consider using one of the following signatures instead:
	add(Tensor other, *, Number alpha)
Traceback (most recent call last):
  File "train.py", line 724, in <module>
    main(args)
  File "train.py", line 640, in main
    global_step, epoch)
  File "train.py", line 202, in train
    grad_norm, grad_flag = check_update(model, c.grad_clip, ignore_stopnet=True)
  File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.1+2e2221f-py3.6.egg/TTS/utils/generic_utils.py", line 160, in check_update
    if np.isinf(grad_norm):
  File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 492, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
 ! Run is removed from ../ljspeech-stft_params-May-04-2020_09+12AM-2e2221f

Can somebody help me resolve it, as I have no idea what is wrong?
Thanks in advance


@thllwg thllwg commented May 6, 2020

@OOps717 I ran into the same error. However, mozilla/TTS#398 introduces a fix for it. It is merged into the dev branch but not yet into master. If you don't want to work on the dev branch, you can just change the corresponding line directly in Colab.


@golf1600 golf1600 commented May 13, 2020

My Colab gets disconnected after 12 hours. How can I continue the training?


@thllwg thllwg commented May 13, 2020

@golf1600 Colab is limited to 12 hours, as far as I know. What you could do is write the model checkpoints to your Google Drive and, after 12 hours, resume training from the latest checkpoint there.
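
(A rough sketch of that approach in Colab, assuming the notebook layout above; the run folder and checkpoint names are illustrative:)

from google.colab import drive
drive.mount('/content/drive')

# in the config cell, point the experiment output at Drive so checkpoints survive a disconnect
CONFIG['output_path'] = '/content/drive/My Drive/tts_runs/'

# after a disconnect, restore from the last saved checkpoint (folder/file names will differ)
!python train.py --restore_path "/content/drive/My Drive/tts_runs/<run-folder>/checkpoint_XXXX.pth.tar" --config_path config.json | tee training.log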


@golf1600 golf1600 commented May 13, 2020

@thllwg Thank you for your answer. How can I write out the model checkpoints; how could I implement that?


@nmstoker nmstoker commented May 13, 2020

@golf1600 - the checkpoints are already saved on the Colab instance, so it sounds like you need to Google how to save them from Colab to your local filesystem and then write some code to do that instead.


@d1gz0r d1gz0r commented May 20, 2020

Started training on Google Cloud Platform, on a Linux-based virtual machine. This is what I get when I start training:

> TRAINING (2020-05-20 17:09:30)
Traceback (most recent call last):
  File "train.py", line 676, in <module>
    main(args)
  File "train.py", line 591, in main
    global_step, epoch)
  File "train.py", line 193, in train
    grad_norm, _ = check_update(model, c.grad_clip, ignore_stopnet=True)
  File "/opt/conda/lib/python3.7/site-packages/TTS-0.0.2+9d7cb1e-py3.7.egg/TTS/utils/training.py", line 12, in check_update
    if torch.isinf(grad_norm):
  File "/opt/conda/lib/python3.7/site-packages/torch/functional.py", line 259, in isinf
    raise TypeError("The argument is not a tensor{}".format(repr(tensor)))
TypeError: The argument is not a tensor: 0.306020978823456
 ! Run is removed from /home/jupyter/output/ljspeech-May-20-2020_05+09PM-9d7cb1e

Might the issue be in the data? I use my own training data, almost identical to the LJSpeech dataset.


@KingArnaiz KingArnaiz commented May 21, 2020

[screenshot of the error]
Got this error


@GastonZalba GastonZalba commented May 26, 2020

@KingArnaiz The code was refactored. Change "utils.generic_utils" to "utils.io", or use an older version of the repo.
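
(i.e., assuming the refactored layout, the import in the config cell of this gist becomes:)

from TTS.utils.io import load_config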


@erogol erogol commented Jun 5, 2020

Now things should work :)


@kody-wen kody-wen commented Jun 5, 2020

I am receiving such an error:

  File "train.py", line 13, in <module>
    from TTS.datasets.TTSDataset import MyDataset
ModuleNotFoundError: No module named 'TTS'

Where I might be doing a mistake?

I am in TTS project directory, LJSpeech is in subfile in that directory

Check the symlink file in "tts_namespace". If it is not there, enable the symlink function of the current git project and reset it.


@kody-wen kody-wen commented Jun 5, 2020

image
Got this error

Your current path must be the root path of this git project (you should see the "utils" folder in it), or you can modify PYTHONPATH to work around it.
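
(A minimal sketch of the PYTHONPATH workaround in a notebook cell, assuming the repo was cloned to /content/TTS as in this gist:)

%env PYTHONPATH=/content/TTS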


@scripples scripples commented Jun 8, 2020

I'm having a bit of an issue pointing my train.py to a custom dataset, with this command throwing this error:

!python train.py --restore_path "/checkpoints/checkpoint_261000.pth.tar" --config_path config.json | tee training.log

Traceback (most recent call last):
File "train.py", line 676, in
main(args)
File "train.py", line 496, in main
meta_data_train, meta_data_eval = load_meta_data(c.datasets)
File "/content/drive/My Drive/ML/TTS/datasets/preprocess.py", line 20, in load_meta_data
meta_data_eval, meta_data_train = split_dataset(meta_data_train)
File "/content/drive/My Drive/ML/TTS/utils/generic_utils.py", line 78, in split_dataset
assert len(eval_split_size) > 0, " [!] You do not have enough samples to train. You need at least 100 samples."

So I'm pretty sure I'm not correctly pointing things to the data. I tried to mimic the LJSpeech hierarchy: my metadata is in mypath/dataset/voice (and within that metadata.csv, metadata_shuf, metadata_train and metadata_val), and the corresponding wavs are one level lower, in mypath/dataset/voice/wavs.

I'm running it from inside of mypath/TTS, and my config looks like this:

import json
from utils.io import load_config
CONFIG = load_config('config.json')
CONFIG['datasets'][0]['path'] = '../dataset/voice'
CONFIG['output_path'] = '/outdir'
with open('config.json', 'w') as fp:
    json.dump(CONFIG, fp)

What am I missing here? I feel like it's something pretty simple, I just don't know what.


@nmstoker nmstoker commented Jun 8, 2020

Hi @scripples - as it appears you're doing things that deviate from the gist above, it might be worth raising your point on the TTS forum here: https://discourse.mozilla.org/c/tts (and include your actual config file too), but before then it's also worth checking whether you can get it working with LJSpeech data first, before you start tweaking it to your own settings/dataset.


@cb010 cb010 commented Jun 10, 2020

I ran this and received the following error:

Traceback (most recent call last):
  File "train.py", line 676, in <module>
    main(args)
  File "train.py", line 496, in main
    meta_data_train, meta_data_eval = load_meta_data(c.datasets)
  File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.3+1d3c0c8-py3.6.egg/TTS/datasets/preprocess.py", line 20, in load_meta_data
    meta_data_eval, meta_data_train = split_dataset(meta_data_train)
  File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.3+1d3c0c8-py3.6.egg/TTS/utils/generic_utils.py", line 78, in split_dataset
    assert len(eval_split_size) > 0, " [!] You do not have enough samples to train. You need at least 100 samples."
TypeError: object of type 'int' has no len()

@safe-bug safe-bug commented Jun 11, 2020

I ran into the same problem as @cb010


@erogol erogol commented Jun 11, 2020

fixed now


@subramaniannk subramaniannk commented Jun 24, 2020

While trying to synthesize, I am getting the error below. I had run the code for 2 epochs and then replaced "best_model.pth" with one from a pre-trained model. Any help would be appreciated!
[screenshot of the error]


@erogol erogol commented Jun 24, 2020

The checkpoint does not match the current model definition.


@subramaniannk subramaniannk commented Jun 25, 2020

Can anyone please help me solve the above problem, or provide a method to synthesize speech from an already trained model?


@erogol erogol commented Jun 25, 2020

Check the wiki page of Mozilla TTS; we already provide examples of how to synthesize speech.
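
(For reference, the positional form used earlier in this thread looked roughly like the following; the script location and arguments have changed across versions, so treat it only as a sketch and check the wiki for the current syntax:)

!python synthesize.py "Text to be spoken." savedmodel/config.json savedmodel/best_model.pth.tar out.wav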


@ryantan1 ryantan1 commented Jul 29, 2020

@erogol, I'd appreciate your help with the following issue. I followed the guide, and it seems no trained model is generated.
------training process----------

Using CUDA: False
Number of GPUs: 0
Git Hash: 599c2db
Experiment folder: ../LJSpeech/ljspeech-July-29-2020_09+26AM-599c2db
Setting up Audio Processor...
| > sample_rate:22050
| > num_mels:80
| > min_level_db:-100
| > frame_shift_ms:None
| > frame_length_ms:None
| > ref_level_db:20
| > num_freq:513
| > power:1.5
| > preemphasis:0.0
| > griffin_lim_iters:60
| > signal_norm:True
| > symmetric_norm:True
| > mel_fmin:0
| > mel_fmax:8000.0
| > max_norm:4.0
| > clip_norm:True
| > do_trim_silence:True
| > trim_db:60
| > do_sound_norm:False
| > stats_path:None
| > hop_length:256
| > win_length:1024
| > n_fft:1024
Using model: Tacotron2
| > Num output units : 513

Model has 28947858 parameters

EPOCH: 0/1000

Number of output frames: 7

DataLoader initialization
| > Use phonemes: False
| > Number of instances : 12000
| > Max length sequence: 187
| > Min length sequence: 5
| > Avg length sequence: 98.47641666666667
| > Num. instances discarded by max-min (max=153, min=6) seq limits: 444
| > Batch group size: 0.

TRAINING (2020-07-29 09:26:52)

--> STEP: 24/180 -- GLOBAL_STEP: 25
| > decoder_loss: 3.86587 (3.92852)
| > postnet_loss: 5.71722 (5.89929)
| > stopnet_loss: 0.73470 (0.76719)
| > ga_loss: 0.05319 (0.07255)
| > loss: 9.63628
| > align_error: 0.98838 (0.98719)
| > avg_spec_len: 336.53125
| > avg_text_len: 59.109375
| > step_time: 20.76
| > loader_time: 0.01
| > lr: 0.00010
-----training end-------------
and in the LJSpeech folder, there are only two files:
.
└── ljspeech-July-29-2020_09+26AM-599c2db
├── config.json
├── events.out.tfevents.1595986011.ubuntu
└── test_audios

2 directories, 2 files

Thanks for your help~


@ryantan1 ryantan1 commented Jul 29, 2020

Hi, after looking into it, I found the training is killed after looping 28 to 30 times due to running out of memory. My free memory drops quickly, and eventually the system kills the training.
Could you tell me the memory requirement, or how to avoid the memory leak issue? Thanks a lot.


@erogol erogol commented Jul 29, 2020

It is impossible for me to find the problem and suggest a solution just from the comments above. Please open an issue in the repo with your system setup, your config file, and more info.


@ryantan1 ryantan1 commented Jul 29, 2020

@erogol, I have opened a new issue: free memory drops from 10G to 400M and the system eventually kills the training #471
If you need anything more, please let me know. Thanks a lot~


@RobertPieta RobertPieta commented Aug 1, 2020

Hello! When running the colab, a "no module named: segments" error occurs when running python setup.py install for TTS. Adding pip install segments allows the code to progress to training.

Running python train.py --config_path config.json | tee training.log results in the error Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!. Since the colab is just to provide an example, I tried prefixing the command with CUDA_VISIBLE_DEVICES=-1 to run only on the CPU.

This gets the colab to print > TRAINING with the date, but nothing happens after this. I tried running the colab code on my own Linux VM overnight with a GPU, and did not face the tensor issue above. However, 0 epochs were completed and nothing was logged past the first > TRAINING log. On my own VM, I installed the TTS requirements.txt using pip.

Questions:

  1. Is there an expected 8+ hour startup time before an epoch is completed/logged?
  2. Are there any unlisted/special Python package/OS requirements for training to work?
  3. Is there a way to get more logs from TTS? When I tried on my own VM, htop showed the code doing a lot of work, but nothing was being logged/saved.

@AdmiralBulldog AdmiralBulldog commented Aug 14, 2020

Hi, when I set the runtime type to GPU I get this error:


Traceback (most recent call last):
  
File "train.py", line 676, in <module>
    main(args)
  
File "train.py", line 591, in main
    global_step, epoch)
  
File "train.py", line 183, in train
    alignments, alignment_lengths, text_lengths)
  
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  
File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.3+6d6dca0-py3.6.egg/TTS/layers/losses.py", line 231, in forward
    ga_loss = self.criterion_ga(alignments, input_lens, alignment_lens)
  
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  
File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.3+6d6dca0-py3.6.egg/TTS/layers/losses.py", line 145, in forward
    ga_masks = self._make_ga_masks(ilens, olens).to(att_ws.device)
  
File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.3+6d6dca0-py3.6.egg/TTS/layers/losses.py", line 141, in _make_ga_masks
    ga_masks[idx, :olen, :ilen] = self._make_ga_mask(ilen, olen, self.sigma)
  
File "/usr/local/lib/python3.6/dist-packages/TTS-0.0.3+6d6dca0-py3.6.egg/TTS/layers/losses.py", line 155, in _make_ga_mask
    return 1.0 - torch.exp(-(grid_y / ilen - grid_x / olen) ** 2 / (2 * (sigma ** 2)))

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
 ! Run is removed from ../ljspeech-August-14-2020_03+38PM-6d6dca0

Why could this be? When I use CPU this error doesn't occur.

Edit:

Apparently torch.arange() returns a tensor on the CPU, so specifying the device like this seems to have fixed it:

[screenshot of the fix]
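
(Since the screenshot is not preserved here, a standalone sketch of the idea described above: create the index grids directly on the target device so later operations don't mix CPU and CUDA tensors. Variable names are illustrative, not the exact patched line in TTS/layers/losses.py.)

import torch

def make_ga_mask(ilen, olen, sigma, device):
    # build both index grids on the same device as the attention tensors
    grid_x, grid_y = torch.meshgrid(
        torch.arange(olen, dtype=torch.float32, device=device),
        torch.arange(ilen, dtype=torch.float32, device=device),
    )
    # guided-attention style mask, following the formula from the traceback above
    return 1.0 - torch.exp(-(grid_y / ilen - grid_x / olen) ** 2 / (2 * sigma ** 2))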


@CozyDoomer CozyDoomer commented Aug 18, 2020

@AdmiralBulldog Yep, I also ran into that error, and it's already fixed on the dev branch.
I think torch 1.6 changed the cross-device behavior.


@donbale donbale commented Sep 16, 2020

Hi, thank you for your work and these example notebooks. I am, however, receiving the error ModuleNotFoundError: No module named 'utils.io' whilst trying to replicate this notebook in my own environment, as well as when running it directly in Colab from this page. Not sure if this may result from 540d811dd52b5598a7cd21cbbcf197b0bfbeab62 by @erogol


@gisforgirard gisforgirard commented Sep 20, 2020

Hi, thank you your work and these example notebooks, I am however receiving error ModuleNotFoundError: No module named 'utils.io' whilst trying to replicate this notebook in my own environment as well as running direct in Colab from this page. Not sure if this may be resulting from 540d811dd52b5598a7cd21cbbcf197b0bfbeab62 by @erogol

try putting TTS.utils.io instead of just utils.io


@donbale donbale commented Sep 21, 2020

Thank you @gisforgirard that sorted it 👍


@maureenrnx maureenrnx commented Oct 24, 2020

Hi, thank you for this colab. I get this error every time I execute "python setup.py install", either on the Colab website or in my local installation. Can someone help me?

  File "/usr/local/lib/python3.6/dist-packages/setuptools/command/py36compat.py", line 34, in add_defaults
    self._add_defaults_ext()
  File "/usr/local/lib/python3.6/dist-packages/setuptools/command/py36compat.py", line 117, in _add_defaults_ext
    build_ext = self.get_finalized_command('build_ext')
  File "/usr/lib/python3.6/distutils/cmd.py", line 299, in get_finalized_command
    cmd_obj.ensure_finalized()
  File "/usr/lib/python3.6/distutils/cmd.py", line 107, in ensure_finalized
    self.finalize_options()
  File "/tmp/easy_install-69n3rvh1/pyworld-0.2.12/setup.py", line 29, in finalize_options
    type=str,
AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__'


@xoxoxo13102020 xoxoxo13102020 commented Nov 9, 2020

Hi, thank you for this colab, I get this error every time when I execute "python setup.py install" either on the colab website or in my local installation. Can someone help me?

File "/usr/local/lib/python3.6/dist-packages/setuptools/command/py36compat.py", line 34, in add_defaults self._add_defaults_ext() File "/usr/local/lib/python3.6/dist-packages/setuptools/command/py36compat.py", line 117, in _add_defaults_ext build_ext = self.get_finalized_command('build_ext') File "/usr/lib/python3.6/distutils/cmd.py", line 299, in get_finalized_command cmd_obj.ensure_finalized() File "/usr/lib/python3.6/distutils/cmd.py", line 107, in ensure_finalized self.finalize_options() File "/tmp/easy_install-69n3rvh1/pyworld-0.2.12/setup.py", line 29, in finalize_options type=str, AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__'

+1


@joshnatis joshnatis commented Nov 19, 2020

I also got the AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__' error while following the instructions.

I was able to fix it with the following:

$ pip install -r requirements.txt  # error from Tensorflow about numpy version
$ pip uninstall numpy
$ pip install numpy==1.18.5
$ pip install -r requirements.txt  # now it works
$ python setup.py install  # it works!

Running in venv Python 3.8.6 on Arch Linux


@Dumpling97 Dumpling97 commented Dec 12, 2020

I am trying to run all of this in a conda environment on Windows 10, in Windows PowerShell. So far everything went well (I didn't shuffle and used Get-Content to create the train-val splits, but that shouldn't matter, right?), but when I want to train the model something throws a really annoying decode error.

[screenshot of the decode error]

What I have tried to get rid of this:
-> changed open() instances in preprocess.py to include encoding='utf-8'; also tried utf-16: the error did not change in the slightest
-> tried to remove cp1252 so maybe it would use something else: failed to do anything
-> tried to set the error policy for the cp1252 decode() method from 'strict' to 'backslashreplace', which should not throw errors: failed to change anything
-> tried to remove special characters like é from the LJSpeech metadata.csv, but that also failed to do anything.

The only possible explanation is that I am looking in the completely wrong place to fix it, but other than preprocess.py I don't see where it would be useful to change anything.

I had a similar error with a different hex code (I think it was 0x80) when doing the with open('config.json','w') as fp: but that got fixed when I used with open('config.json','w',encoding='utf-8') as fp:

Would appreciate any help.
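
(For reference, the variant of the config-writing step that worked, as described above:)

import json

with open('config.json', 'w', encoding='utf-8') as fp:
    json.dump(CONFIG, fp)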


@DrameMariama DrameMariama commented Mar 31, 2021

I ran the notebook, but I'm facing the following error after num steps == save_step.
Any help, please?
I'm using Azure ML Studio.

  return np.power(10.0, x / self.spec_gain)
 ! Run is kept in ../ljspeech-ddc-March-31-2021_10+20AM-0c2150a
Traceback (most recent call last):
  File "TTS/bin/train_tacotron.py", line 672, in <module>
    main(args)
  File "TTS/bin/train_tacotron.py", line 638, in main
    scaler_st)
  File "TTS/bin/train_tacotron.py", line 314, in train
    train_audio = ap.inv_melspectrogram(const_spec.T)
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/wolofai/code/Users/mariama.drame/wolof_TTS/TTS/TTS/utils/audio.py", line 286, in inv_melspectrogram
    return self._griffin_lim(S**self.power)
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/wolofai/code/Users/mariama.drame/wolof_TTS/TTS/TTS/utils/audio.py", line 315, in _griffin_lim
    angles = np.exp(1j * np.angle(self._stft(y)))
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/wolofai/code/Users/mariama.drame/wolof_TTS/TTS/TTS/utils/audio.py", line 303, in _stft
    pad_mode=self.stft_pad_mode,
  File "/anaconda/envs/tts/lib/python3.6/site-packages/librosa/core/spectrum.py", line 215, in stft
    util.valid_audio(y)
  File "/anaconda/envs/tts/lib/python3.6/site-packages/librosa/util/utils.py", line 275, in valid_audio
    raise ParameterError('Audio buffer is not finite everywhere')
librosa.util.exceptions.ParameterError: Audio buffer is not finite everywhere

@Sadam1195 Sadam1195 commented Apr 23, 2021

Please check out this issue: coqui-ai/TTS#387 (comment)


@ahgarawani ahgarawani commented Apr 30, 2021

Hi. I was trying to continue my tacotron training run using:
!python TTS/bin/train_tacotron.py --continue_path ../ljspeech-ddc-April-29-2021_10+43PM-e9e0784/ | tee training.log

but I got this output:

2021-04-30 11:22:23.043015: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
Using CUDA: True
Number of GPUs: 1
Training continues for ../ljspeech-ddc-April-29-2021_10+43PM-e9e0784/
File "TTS/bin/train_tacotron.py", line 688, in
c = load_config(args.config_path)
File "/content/drive/MyDrive/GraduationProject/utils/TTS/TTS/utils/io.py", line 46, in load_config
data = read_json_with_comments(config_path)
File "/content/drive/MyDrive/GraduationProject/utils/TTS/TTS/utils/io.py", line 30, in read_json_with_comments
data = json.loads(input_str)
File "/usr/lib/python3.7/json/init.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 622 (char 621)

Can anyone help?


@Sadam1195 Sadam1195 commented Apr 30, 2021

Hi. I was trying to continue my tacotron training run using:
!python TTS/bin/train_tacotron.py --continue_path ../ljspeech-ddc-April-29-2021_10+43PM-e9e0784/ | tee training.log

but I got this output:
`

2021-04-30 11:22:23.043015: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
Using CUDA: True
Number of GPUs: 1
Training continues for ../ljspeech-ddc-April-29-2021_10+43PM-e9e0784/
File "TTS/bin/train_tacotron.py", line 688, in
c = load_config(args.config_path)
File "/content/drive/MyDrive/GraduationProject/utils/TTS/TTS/utils/io.py", line 46, in load_config
data = read_json_with_comments(config_path)
File "/content/drive/MyDrive/GraduationProject/utils/TTS/TTS/utils/io.py", line 30, in read_json_with_comments
data = json.loads(input_str)
File "/usr/lib/python3.7/json/init.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 622 (char 621)
`

Can anyone help?

I was getting the same kind of error. I fixed it by commenting out the following part in config.json:

  // DISTRIBUTED TRAINING
  // "distributed":{
  //    "backend": "nccl",
  //    "url": "tcp:\/\/localhost:54321"
  // },

That is, comment out the part containing the character reported at "Invalid control character at: line 1 column 622 (char 621)".

@ahgarawani


@ahgarawani ahgarawani commented May 1, 2021

Thank you. It now works. However, the training doesn't seem to continue; rather, it starts from epoch 0. Was that the case with you?
@Sadam1195


@Sadam1195 Sadam1195 commented May 1, 2021

Thank you. It now works. However, the training doesn't seem to continue it rather starts from epoch 0. Was that the case with you?
@Sadam1195

No, that wasn't the case for me. Do not comment out the whole line; instead, fix the character at the mentioned position, line 1 column 622 (char 621).
If that doesn't fix your problem, try using the --restore_path checkpoint.pth.tar flag and provide it with the original location of your config.
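
(A sketch of the second option, reusing the command form from earlier in this thread; the run folder and checkpoint names are illustrative:)

!python TTS/bin/train_tacotron.py --restore_path ../ljspeech-ddc-April-29-2021_10+43PM-e9e0784/checkpoint_XXXX.pth.tar --config_path config.json | tee training.log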


@SVincent SVincent commented Jun 4, 2021

Is it possible that the code was refactored again? If so, it broke the load_config step:

---------------------------------------------------------------------------

ImportError                               Traceback (most recent call last)

<ipython-input-7-1cccdd43e6e8> in <module>()
      1 # load the default config file and update with the local paths and settings.
      2 import json
----> 3 from TTS.utils.io import load_config
      4 CONFIG = load_config('/content/TTS/TTS/tts/configs/config.json')
      5 CONFIG['datasets'][0]['path'] = '../LJSpeech-1.1/'  # set the target dataset to the LJSpeech

ImportError: cannot import name 'load_config' from 'TTS.utils.io' (/content/TTS/TTS/utils/io.py)


---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

I thought load_config might have been moved from TTS.utils.io to TTS.config, but since the config.json also no longer exists, I've been kind of stuck at this point.


@ahgarawani ahgarawani commented Jun 4, 2021

Is it possible that the code was refactored again? If so, it broke the load_config step:

---------------------------------------------------------------------------

ImportError                               Traceback (most recent call last)

<ipython-input-7-1cccdd43e6e8> in <module>()
      1 # load the default config file and update with the local paths and settings.
      2 import json
----> 3 from TTS.utils.io import load_config
      4 CONFIG = load_config('/content/TTS/TTS/tts/configs/config.json')
      5 CONFIG['datasets'][0]['path'] = '../LJSpeech-1.1/'  # set the target dataset to the LJSpeech

ImportError: cannot import name 'load_config' from 'TTS.utils.io' (/content/TTS/TTS/utils/io.py)


---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

I thought load_config might have been moved from TTS.utils.io to TTS.config, but since the config.json also no longer exists, I've been kind of stuck at this point.

The load_config function is no longer in TTS.utils.io, but it is still in the mozilla repo, so maybe clone that instead:
git clone https://github.com/mozilla/TTS


@SVincent SVincent commented Jun 4, 2021

the load_config function is not in TTS.utils.io, but it is still in the mozilla repo maybe clone that instead.
git clone https://github.com/mozilla/TTS

I will probably give the mozilla TTS a try.
Still: "TTS.utils.io" and the "configs.json" file are both still referenced in the readme of the Coqui-ai TTS.

Edit: Well, and there's the fact that the Mozilla TTS also refers to this colab, which at the moment is broken.


@Sadam1195 Sadam1195 commented Jun 4, 2021

Is it possible that the code was refactored again? If so, it broke the load_config step:

---------------------------------------------------------------------------

ImportError                               Traceback (most recent call last)

<ipython-input-7-1cccdd43e6e8> in <module>()
      1 # load the default config file and update with the local paths and settings.
      2 import json
----> 3 from TTS.utils.io import load_config
      4 CONFIG = load_config('/content/TTS/TTS/tts/configs/config.json')
      5 CONFIG['datasets'][0]['path'] = '../LJSpeech-1.1/'  # set the target dataset to the LJSpeech

ImportError: cannot import name 'load_config' from 'TTS.utils.io' (/content/TTS/TTS/utils/io.py)


---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

I thought load_config might have been moved from TTS.utils.io to TTS.config, but since the config.json also no longer exists, I've been kind of stuck at this point.

I am not sure if this would be wise as https://github.com/mozilla/TTS is not maintained anymore.

the load_config function is not in TTS.utils.io, but it is still in the mozilla repo maybe clone that instead.
git clone https://github.com/mozilla/TTS

Maybe you can edit the code and change from TTS.utils.io import load_config to from TTS.config import load_config

Is it possible that the code was refactored again?

Yes. Config loading was refactored in the recent updates; maybe that could be causing the break.
@SVincent


@ahgarawani ahgarawani commented Jun 4, 2021

Hi, I have tried to run:
CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tacotron.py --config_path ../tacotron2/config.json | tee ../tacotron2/training.log

on a notebook instance on GCP instead of Colab, and I got this even though it works fine on Colab:
[screenshot of the error]

Does anybody have an idea what is going on?


@Sadam1195 Sadam1195 commented Jun 6, 2021

Hi I have tried to run:
CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tacotron.py --config_path ../tacotron2/config.json | tee ../tacotron2/training.log

on a notebook instance on gcp instead of colab and I got this even though it works fine on colab:
image

does anybody have an idea what is going on ?

Check the file directory paths. You have provided an invalid path in the config.
@ahgarawani


@turian turian commented Jun 10, 2021

@Sadam1195

from TTS.config import load_config

Which config.json should be used now?

These are the current options:

./TTS/speaker_encoder/configs/config.json
./tests/outputs/dummy_model_config.json
./tests/inputs/test_tacotron2_config.json
./tests/inputs/server_config.json
./tests/inputs/test_tacotron_bd_config.json
./tests/inputs/test_tacotron_config.json
./tests/inputs/test_vocoder_multiband_melgan_config.json
./tests/inputs/test_speaker_encoder_config.json
./tests/inputs/test_vocoder_audio_config.json
./tests/inputs/test_vocoder_wavernn_config.json
./tests/inputs/test_config.json

@Sadam1195 Sadam1195 commented Jun 10, 2021

@Sadam1195

from TTS.config import load_config

Which config.json should be used now?

These are the current options:

./TTS/speaker_encoder/configs/config.json
./tests/outputs/dummy_model_config.json
./tests/inputs/test_tacotron2_config.json
./tests/inputs/server_config.json
./tests/inputs/test_tacotron_bd_config.json
./tests/inputs/test_tacotron_config.json
./tests/inputs/test_vocoder_multiband_melgan_config.json
./tests/inputs/test_speaker_encoder_config.json
./tests/inputs/test_vocoder_audio_config.json
./tests/inputs/test_vocoder_wavernn_config.json
./tests/inputs/test_config.json

You can train the model using the following command, with the config located at https://github.com/coqui-ai/TTS/blob/main/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json from the latest version of the repo:

!CUDA_VISIBLE_DEVICES="0" python TTS/TTS/bin/train_tacotron.py --config_path ./tacotron2-DDC.json \
                                                          --coqpit.output_path ./Results  \
                                                          --coqpit.datasets.0.path ./riccardo_fasol/   \
                                                          --coqpit.audio.stats_path ./scale_stats.npy \

@turian


@tomchingas tomchingas commented Jun 20, 2021

It seems to run if you replace the second to last cell:

# load the default config file and update with the local paths and settings.
import json
from TTS.utils.io import load_config
CONFIG = load_config('/content/TTS/TTS/tts/configs/config.json')
CONFIG['datasets'][0]['path'] = '../LJSpeech-1.1/' # set the target dataset to the LJSpeech
CONFIG['audio']['stats_path'] = None # do not use mean and variance stats to normalizat spectrograms. Mean and variance stats need to be computed separately.
CONFIG['output_path'] = '../'
with open('config.json', 'w') as fp:
json.dump(CONFIG, fp)

With:

!python /content/TTS/TTS/bin/compute_statistics.py /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json /content/TTS/scale_stats.npy --data_path /content/LJSpeech-1.1/wavs/

And replace the last cell:

# pull the trigger
!CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tacotron.py --config_path config.json | tee training.log

With:

!CUDA_VISIBLE_DEVICES="0" python /content/TTS/TTS/bin/train_tacotron.py --config_path /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json \
                                                          --coqpit.output_path ./Results \
                                                          --coqpit.datasets.0.path /content/LJSpeech-1.1/ \
                                                          --coqpit.audio.stats_path /content/TTS/scale_stats.npy


@thomasvonl thomasvonl commented Jun 20, 2021

It seems to run if you replace the second to last cell:

# load the default config file and update with the local paths and settings. import json from TTS.utils.io import load_config CONFIG = load_config('/content/TTS/TTS/tts/configs/config.json') CONFIG['datasets'][0]['path'] = '../LJSpeech-1.1/' # set the target dataset to the LJSpeech CONFIG['audio']['stats_path'] = None # do not use mean and variance stats to normalizat spectrograms. Mean and variance stats need to be computed separately. CONFIG['output_path'] = '../' with open('config.json', 'w') as fp: json.dump(CONFIG, fp)

With:

!python /content/TTS/TTS/bin/compute_statistics.py /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json /content/TTS/scale_stats.npy --data_path /content/LJSpeech-1.1/wavs/

And replace the last cell:

# pull the trigger !CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tacotron.py --config_path config.json | tee training.log

With:

!CUDA_VISIBLE_DEVICES="0" python /content/TTS/TTS/bin/train_tacotron.py --config_path /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json \ --coqpit.output_path ./Results \ --coqpit.datasets.0.path /content/LJSpeech-1.1/ \ --coqpit.audio.stats_path /content/TTS/scale_stats.npy \

The first line seems fine, but when I run the second line something goes wrong; I get the following error message:

Using CUDA: True
Number of GPUs: 1
Mixed precision mode is ON
fatal: not a git repository (or any of the parent directories): .git
Git Hash: 0000000
Experiment folder: DEFINE THIS/ljspeech-ddc-June-20-2021_02+22PM-0000000
fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
File "/content/TTS/TTS/bin/train_tacotron.py", line 737, in
args, config, OUT_PATH, AUDIO_PATH, c_logger, tb_logger = init_training(sys.argv)
File "/content/TTS/TTS/utils/arguments.py", line 182, in init_training
config, OUT_PATH, AUDIO_PATH, c_logger, tb_logger = process_args(args)
File "/content/TTS/TTS/utils/arguments.py", line 168, in process_args
copy_model_files(config, experiment_path, new_fields)
File "/content/TTS/TTS/utils/io.py", line 42, in copy_model_files
copy_stats_path,
File "/usr/lib/python3.7/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'scale_stats.npy'

But the file exists in the directory, how should I solve this problem?


@Sadam1195 Sadam1195 commented Jun 20, 2021

It seems to run if you replace the second to last cell:
# load the default config file and update with the local paths and settings. import json from TTS.utils.io import load_config CONFIG = load_config('/content/TTS/TTS/tts/configs/config.json') CONFIG['datasets'][0]['path'] = '../LJSpeech-1.1/' # set the target dataset to the LJSpeech CONFIG['audio']['stats_path'] = None # do not use mean and variance stats to normalizat spectrograms. Mean and variance stats need to be computed separately. CONFIG['output_path'] = '../' with open('config.json', 'w') as fp: json.dump(CONFIG, fp)
With:
!python /content/TTS/TTS/bin/compute_statistics.py /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json /content/TTS/scale_stats.npy --data_path /content/LJSpeech-1.1/wavs/
And replace the last cell:
# pull the trigger !CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tacotron.py --config_path config.json | tee training.log
With:
!CUDA_VISIBLE_DEVICES="0" python /content/TTS/TTS/bin/train_tacotron.py --config_path /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json \ --coqpit.output_path ./Results \ --coqpit.datasets.0.path /content/LJSpeech-1.1/ \ --coqpit.audio.stats_path /content/TTS/scale_stats.npy \

The first line seems fine, but when I run the second line something goes wrong, I have get the following error message

Using CUDA: True
Number of GPUs: 1
Mixed precision mode is ON
fatal: not a git repository (or any of the parent directories): .git
Git Hash: 0000000
Experiment folder: DEFINE THIS/ljspeech-ddc-June-20-2021_02+22PM-0000000
fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
File "/content/TTS/TTS/bin/train_tacotron.py", line 737, in
args, config, OUT_PATH, AUDIO_PATH, c_logger, tb_logger = init_training(sys.argv)
File "/content/TTS/TTS/utils/arguments.py", line 182, in init_training
config, OUT_PATH, AUDIO_PATH, c_logger, tb_logger = process_args(args)
File "/content/TTS/TTS/utils/arguments.py", line 168, in process_args
copy_model_files(config, experiment_path, new_fields)
File "/content/TTS/TTS/utils/io.py", line 42, in copy_model_files
copy_stats_path,
File "/usr/lib/python3.7/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'scale_stats.npy'

But the file exists in the directory, how should I solve this problem?

Replace
!CUDA_VISIBLE_DEVICES="0" python /content/TTS/TTS/bin/train_tacotron.py --config_path /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json \ --coqpit.output_path ./Results \ --coqpit.datasets.0.path /content/LJSpeech-1.1/ \ --coqpit.audio.stats_path /content/TTS/scale_stats.npy \
with
!CUDA_VISIBLE_DEVICES="0" python /content/TTS/TTS/bin/train_tacotron.py --config_path ./content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json \ --coqpit.output_path ./Results \ --coqpit.datasets.0.path ./content/LJSpeech-1.1/ \ --coqpit.audio.stats_path ./content/TTS/scale_stats.npy \

@gbvssd


@thomasvonl thomasvonl commented Jun 21, 2021

It seems to run if you replace the second to last cell:
# load the default config file and update with the local paths and settings. import json from TTS.utils.io import load_config CONFIG = load_config('/content/TTS/TTS/tts/configs/config.json') CONFIG['datasets'][0]['path'] = '../LJSpeech-1.1/' # set the target dataset to the LJSpeech CONFIG['audio']['stats_path'] = None # do not use mean and variance stats to normalizat spectrograms. Mean and variance stats need to be computed separately. CONFIG['output_path'] = '../' with open('config.json', 'w') as fp: json.dump(CONFIG, fp)
With:
!python /content/TTS/TTS/bin/compute_statistics.py /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json /content/TTS/scale_stats.npy --data_path /content/LJSpeech-1.1/wavs/
And replace the last cell:
# pull the trigger !CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tacotron.py --config_path config.json | tee training.log
With:
!CUDA_VISIBLE_DEVICES="0" python /content/TTS/TTS/bin/train_tacotron.py --config_path /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json \ --coqpit.output_path ./Results \ --coqpit.datasets.0.path /content/LJSpeech-1.1/ \ --coqpit.audio.stats_path /content/TTS/scale_stats.npy \

The first line seems fine, but when I run the second line something goes wrong, I have get the following error message

Using CUDA: True
Number of GPUs: 1
Mixed precision mode is ON
fatal: not a git repository (or any of the parent directories): .git
Git Hash: 0000000
Experiment folder: DEFINE THIS/ljspeech-ddc-June-20-2021_02+22PM-0000000
fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
File "/content/TTS/TTS/bin/train_tacotron.py", line 737, in
args, config, OUT_PATH, AUDIO_PATH, c_logger, tb_logger = init_training(sys.argv)
File "/content/TTS/TTS/utils/arguments.py", line 182, in init_training
config, OUT_PATH, AUDIO_PATH, c_logger, tb_logger = process_args(args)
File "/content/TTS/TTS/utils/arguments.py", line 168, in process_args
copy_model_files(config, experiment_path, new_fields)
File "/content/TTS/TTS/utils/io.py", line 42, in copy_model_files
copy_stats_path,
File "/usr/lib/python3.7/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'scale_stats.npy'

But the file exists in the directory, how should I solve this problem?

Replace
!CUDA_VISIBLE_DEVICES="0" python /content/TTS/TTS/bin/train_tacotron.py --config_path /content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json \ --coqpit.output_path ./Results \ --coqpit.datasets.0.path /content/LJSpeech-1.1/ \ --coqpit.audio.stats_path /content/TTS/scale_stats.npy \
with
!CUDA_VISIBLE_DEVICES="0" python /content/TTS/TTS/bin/train_tacotron.py --config_path ./content/TTS/recipes/ljspeech/tacotron2-DDC/tacotron2-DDC.json \ --coqpit.output_path ./Results \ --coqpit.datasets.0.path ./content/LJSpeech-1.1/ \ --coqpit.audio.stats_path ./content/TTS/scale_stats.npy \

@gbvssd

I have tried the above command line but it does not work either. I think it is not a file path problem, but more of an argument parsing problem: the audio.stats_path is not parsed correctly, because in the code the attribute "config.audio.stats_path" is "scale_stats.npy", not "/content/TTS/scale_stats.npy".


@Sadam1195 Sadam1195 commented Jun 21, 2021

I have tried the above command line but it does not work either. I think it is not the file path problem, but more like some argument parsing problem, that the audio.stats_path do not parsing correctly, because in the code the attribute "config.audio.stats_path" is "scale_stats.npy" not the "/content/TTS/scale_stats.npy ".

Sorry, I missed that. You can double-check whether "/content/TTS/scale_stats.npy" exists or not; if it doesn't, don't use the config.audio.stats_path argument.
@gbvssd


@thomasvonl thomasvonl commented Jun 21, 2021

The file scale_stats.npy exists and the file path is correct. Doesn't the --coqpit.audio.stats_path argument set the stats file path? I tracked the error to the "utils/io.py" file and checked the value of "config.audio.stats_path"; I assume it should be "/content/TTS/scale_stats.npy", but it is "scale_stats.npy".
@Sadam1195


@Sadam1195 Sadam1195 commented Jun 22, 2021

The file scale_stats.npy exists and the file path is correct. Is the -- coqpit.addio.stats_path argument set the state file path? And I track the error to the "utils/io.py" file and test the value of "config.audio.stats_path", I assume it should be "/content/TTS/scale_stats.npy " but it is "scale_stats.npy".

Then maybe you are not using the right version of the repo. Let's get in touch on Discussions or Gitter, as this conversation will spam the gist.
@gbvssd


@ghost ghost commented Sep 23, 2021

Hello guys, I have some errors; can anyone guide me to solve them?
First try:
[screenshot]

Second try:
[screenshot]

Third try:
[screenshot]

What should I do now?

Thanks in advance to all the community members.
