Created
December 22, 2022 15:39
tao spectro_gen finetune -e /specs/spectro_gen/finetune.yaml -g 1 -k tlt_encode -r /results/spectro_gen/finetune -m /data/tts_en_fastpitch_v1.4.0/tts_en_fastpitch_align.nemo train_dataset=/data/GLaDOS/merged_train.json validation_dataset=/data/GLaDOS/manifest_val.json prior_folder=/results/spectro_gen/finetune/prior_folder trainer.max_epochs=200…
2022-12-22 10:36:33,950 [INFO] root: Registry: ['nvcr.io'] | |
2022-12-22 10:36:33,984 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-pyt | |
2022-12-22 10:36:33,993 [WARNING] tlt.components.docker_handler.docker_handler: | |
Docker will run the commands as root. If you would like to retain your | |
local host permissions, please add the "user":"UID:GID" in the | |
DockerOptions portion of the "/home/davesarmoury/.tao_mounts.json" file. You can obtain your | |
users UID and GID by using the "id -u" and "id -g" commands on the | |
terminal. | |
[NeMo W 2022-12-22 15:36:37 experimental:27] Module <class 'nemo.collections.tts.torch.tts_tokenizers.IPATokenizer'> is experimental, not ready for production and is not fully supported. Use at your own risk. | |
[NeMo W 2022-12-22 15:36:37 experimental:27] Module <class 'nemo.collections.tts.models.radtts.RadTTSModel'> is experimental, not ready for production and is not fully supported. Use at your own risk. | |
[NeMo W 2022-12-22 15:36:41 experimental:27] Module <class 'nemo.collections.tts.torch.tts_tokenizers.IPATokenizer'> is experimental, not ready for production and is not fully supported. Use at your own risk. | |
[NeMo W 2022-12-22 15:36:41 experimental:27] Module <class 'nemo.collections.tts.models.radtts.RadTTSModel'> is experimental, not ready for production and is not fully supported. Use at your own risk. | |
ANTLR runtime and generated code versions disagree: 4.8!=4.9.3 | |
ANTLR runtime and generated code versions disagree: 4.8!=4.9.3 | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] <frozen conv_ai.tts.spectro_gen.scripts.finetune>:285: UserWarning: | |
'finetune.yaml' is validated against ConfigStore schema with the same name. | |
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2. | |
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions. | |
[NeMo I 2022-12-22 15:36:42 <frozen core:20] Experiment configuration: | |
restore_from: /data/tts_en_fastpitch_v1.4.0/tts_en_fastpitch_align.nemo | |
exp_manager: | |
explicit_log_dir: /results/spectro_gen/finetune | |
exp_dir: null | |
name: finetuned-model | |
version: null | |
use_datetime_version: true | |
resume_if_exists: true | |
resume_past_end: false | |
resume_ignore_no_checkpoint: true | |
create_tensorboard_logger: true | |
summary_writer_kwargs: null | |
create_wandb_logger: false | |
wandb_logger_kwargs: null | |
create_checkpoint_callback: true | |
checkpoint_callback_params: | |
filepath: null | |
dirpath: null | |
filename: null | |
monitor: v_loss | |
verbose: true | |
save_last: true | |
save_top_k: 3 | |
save_weights_only: false | |
mode: min | |
every_n_epochs: 1 | |
prefix: null | |
postfix: .tlt | |
save_best_model: false | |
always_save_nemo: false | |
save_nemo_on_train_end: true | |
model_parallel_size: null | |
files_to_copy: null | |
log_step_timing: true | |
step_timing_kwargs: | |
reduction: mean | |
sync_cuda: false | |
buffer_size: 1 | |
log_local_rank_0_only: false | |
log_global_rank_0_only: false | |
trainer: | |
logger: false | |
checkpoint_callback: false | |
callbacks: null | |
default_root_dir: null | |
gradient_clip_val: 0.0 | |
process_position: 0 | |
num_nodes: 1 | |
gpus: 1 | |
auto_select_gpus: false | |
tpu_cores: null | |
log_gpu_memory: null | |
progress_bar_refresh_rate: 1 | |
enable_progress_bar: true | |
overfit_batches: 0.0 | |
track_grad_norm: -1 | |
check_val_every_n_epoch: 1 | |
fast_dev_run: false | |
accumulate_grad_batches: 1 | |
max_epochs: 200 | |
min_epochs: 1 | |
max_steps: null | |
min_steps: null | |
limit_train_batches: 1.0 | |
limit_val_batches: 1.0 | |
limit_test_batches: 1.0 | |
val_check_interval: 1.0 | |
flush_logs_every_n_steps: 100 | |
log_every_n_steps: 50 | |
accelerator: ddp | |
sync_batchnorm: false | |
precision: 16 | |
weights_summary: full | |
weights_save_path: null | |
num_sanity_val_steps: 2 | |
resume_from_checkpoint: null | |
profiler: null | |
benchmark: false | |
deterministic: false | |
auto_lr_find: false | |
replace_sampler_ddp: true | |
detect_anomaly: false | |
terminate_on_nan: false | |
auto_scale_batch_size: false | |
prepare_data_per_node: true | |
amp_backend: native | |
amp_level: null | |
plugins: null | |
move_metrics_to_cpu: false | |
multiple_trainloader_mode: max_size_cycle | |
limit_predict_batches: 1.0 | |
stochastic_weight_avg: false | |
gradient_clip_algorithm: norm | |
max_time: null | |
reload_dataloaders_every_n_epochs: 0 | |
ipus: null | |
devices: null | |
strategy: null | |
enable_checkpointing: true | |
enable_model_summary: true | |
train_ds: | |
dataset: | |
_target_: nemo.collections.tts.torch.data.TTSDataset | |
manifest_filepath: ${train_dataset} | |
max_duration: null | |
min_duration: 0.1 | |
int_values: false | |
normalize: true | |
sample_rate: ${sample_rate} | |
trim: true | |
trim_top_db: 50 | |
sup_data_path: ${prior_folder} | |
sup_data_types: ${sup_data_types} | |
n_window_stride: ${n_window_stride} | |
n_window_size: ${n_window_size} | |
pitch_fmin: ${pitch_fmin} | |
pitch_fmax: ${pitch_fmax} | |
pitch_mean: ${pitch_avg} | |
pitch_std: ${pitch_std} | |
pitch_norm: true | |
n_fft: 1024 | |
win_length: 1024 | |
hop_length: 256 | |
window: hann | |
n_mels: 80 | |
lowfreq: 0 | |
highfreq: 8000 | |
ignore_file: null | |
text_normalizer: | |
_target_: nemo_text_processing.text_normalization.normalize.Normalizer | |
lang: en | |
input_case: cased | |
whitelist: ${whitelist_path} | |
text_normalizer_call_kwargs: | |
verbose: false | |
punct_pre_process: true | |
punct_post_process: true | |
text_tokenizer: | |
_target_: nemo.collections.tts.torch.tts_tokenizers.EnglishPhonemesTokenizer | |
punct: true | |
stresses: true | |
chars: true | |
apostrophe: true | |
pad_with_space: true | |
g2p: | |
_target_: nemo.collections.tts.torch.g2ps.EnglishG2p | |
phoneme_dict: ${phoneme_dict_path} | |
heteronyms: ${heteronyms_path} | |
phoneme_probability: 0.5 | |
dataloader_params: | |
drop_last: false | |
shuffle: true | |
batch_size: 16 | |
num_workers: 12 | |
validation_ds: | |
dataset: | |
_target_: nemo.collections.tts.torch.data.TTSDataset | |
manifest_filepath: ${validation_dataset} | |
max_duration: null | |
min_duration: 0.1 | |
int_values: false | |
normalize: true | |
sample_rate: ${sample_rate} | |
trim: true | |
sup_data_path: ${prior_folder} | |
sup_data_types: ${sup_data_types} | |
n_window_stride: ${n_window_stride} | |
n_window_size: ${n_window_size} | |
pitch_fmin: ${pitch_fmin} | |
pitch_fmax: ${pitch_fmax} | |
pitch_mean: ${pitch_avg} | |
pitch_std: ${pitch_std} | |
pitch_norm: true | |
n_fft: 1024 | |
win_length: 1024 | |
hop_length: 256 | |
window: hann | |
n_mels: 80 | |
lowfreq: 0 | |
highfreq: 8000 | |
ignore_file: null | |
text_normalizer: | |
_target_: nemo_text_processing.text_normalization.normalize.Normalizer | |
lang: en | |
input_case: cased | |
whitelist: ${whitelist_path} | |
text_normalizer_call_kwargs: | |
verbose: false | |
punct_pre_process: true | |
punct_post_process: true | |
text_tokenizer: | |
_target_: nemo.collections.tts.torch.tts_tokenizers.EnglishPhonemesTokenizer | |
punct: true | |
stresses: true | |
chars: true | |
apostrophe: true | |
pad_with_space: true | |
g2p: | |
_target_: nemo.collections.tts.torch.g2ps.EnglishG2p | |
phoneme_dict: ${phoneme_dict_path} | |
heteronyms: ${heteronyms_path} | |
phoneme_probability: 0.5 | |
dataloader_params: | |
drop_last: false | |
shuffle: true | |
batch_size: 32 | |
num_workers: 12 | |
train_dataset: /data/GLaDOS/merged_train.json | |
validation_dataset: /data/GLaDOS/manifest_val.json | |
phoneme_dict_path: ??? | |
heteronyms_path: ??? | |
whitelist_path: ??? | |
sup_data_types: | |
- align_prior_matrix | |
- pitch | |
- speaker_id | |
optim: | |
name: adam | |
lr: 0.0002 | |
betas: | |
- 0.9 | |
- 0.98 | |
weight_decay: 1.0e-06 | |
encryption_key: '*****' | |
tlt_checkpoint_interval: 0 | |
sample_rate: 22050 | |
prior_folder: /results/spectro_gen/finetune/prior_folder | |
n_speakers: 2 | |
n_window_size: 1024 | |
n_window_stride: 256 | |
pitch_fmin: 80.0 | |
pitch_fmax: 2048.0 | |
pitch_avg: 165.458 | |
pitch_std: 40.1891 | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:287: LightningDeprecationWarning: Passing `Trainer(accelerator='ddp')` has been deprecated in v1.5 and will be removed in v1.7. Use `Trainer(strategy='ddp')` instead. | |
rank_zero_deprecation( | |
Using 16bit native Automatic Mixed Precision (AMP) | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py:52: LightningDeprecationWarning: Setting `max_steps = None` is deprecated in v1.5 and will no longer be supported in v1.7. Use `max_steps = -1` instead. | |
rank_zero_deprecation( | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:151: LightningDeprecationWarning: Setting `Trainer(checkpoint_callback=False)` is deprecated in v1.5 and will be removed in v1.7. Please consider using `Trainer(enable_checkpointing=False)`. | |
rank_zero_deprecation( | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:96: LightningDeprecationWarning: Setting `Trainer(progress_bar_refresh_rate=1)` is deprecated in v1.5 and will be removed in v1.7. Please pass `pytorch_lightning.callbacks.progress.TQDMProgressBar` with `refresh_rate` directly to the Trainer's `callbacks` argument instead. Or, to disable the progress bar pass `enable_progress_bar = False` to the Trainer. | |
rank_zero_deprecation( | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/callback_connector.py:191: LightningDeprecationWarning: Setting `Trainer(weights_summary=full)` is deprecated in v1.5 and will be removed in v1.7. Please pass `pytorch_lightning.callbacks.model_summary.ModelSummary` with `max_depth` directly to the Trainer's `callbacks` argument instead. | |
rank_zero_deprecation( | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:81: LightningDeprecationWarning: Setting `prepare_data_per_node` with the trainer flag is deprecated in v1.5.0 and will be removed in v1.7.0. Please set `prepare_data_per_node` in `LightningDataModule` and/or `LightningModule` directly instead. | |
rank_zero_deprecation( | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:572: LightningDeprecationWarning: Trainer argument `terminate_on_nan` was deprecated in v1.5 and will be removed in 1.7. Please use `Trainer(detect_anomaly=True)` instead. | |
rank_zero_deprecation( | |
GPU available: True, used: True | |
TPU available: False, using: 0 TPU cores | |
IPU available: False, using: 0 IPUs | |
HPU available: False, using: 0 HPUs | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:61: LightningDeprecationWarning: Setting `Trainer(flush_logs_every_n_steps=100)` is deprecated in v1.5 and will be removed in v1.7. Please configure flushing in the logger instead. | |
rank_zero_deprecation( | |
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used.. | |
`Trainer(limit_val_batches=1.0)` was configured so 100% of the batches will be used.. | |
`Trainer(limit_test_batches=1.0)` was configured so 100% of the batches will be used.. | |
`Trainer(limit_predict_batches=1.0)` was configured so 100% of the batches will be used.. | |
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch.. | |
[NeMo W 2022-12-22 15:36:42 exp_manager:422] There was no checkpoint folder at checkpoint_dir :/results/spectro_gen/finetune/checkpoints. Training from scratch. | |
[NeMo I 2022-12-22 15:36:42 exp_manager:286] Experiments will be logged at /results/spectro_gen/finetune | |
[NeMo I 2022-12-22 15:36:42 exp_manager:660] TensorboardLogger has been set up | |
[NeMo W 2022-12-22 15:36:42 nemo_logging:349] /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:2313: LightningDeprecationWarning: `Trainer.weights_save_path` has been deprecated in v1.6 and will be removed in v1.8. | |
rank_zero_deprecation("`Trainer.weights_save_path` has been deprecated in v1.6 and will be removed in v1.8.") | |
Loading pretrained model from /data/tts_en_fastpitch_v1.4.0/tts_en_fastpitch_align.nemo. | |
Setting number of speakers 2 | |
[NeMo W 2022-12-22 15:36:45 fastpitch:100] AudioToCharWithPriorAndPitchDataset class has been deprecated. No support for training or finetuning. Only inference is supported. | |
[NeMo W 2022-12-22 15:36:45 fastpitch:100] AudioToCharWithPriorAndPitchDataset class has been deprecated. No support for training or finetuning. Only inference is supported. | |
[NeMo E 2022-12-22 15:36:45 common:503] Model instantiation failed! | |
Target class: nemo.collections.tts.models.fastpitch.FastPitchModel | |
Error(s): src path does not exist or it is not a path in nemo file. src value I got was: scripts/tts_dataset_files/cmudict-0.7b_nv22.08. Absolute: /opt/nvidia/tools/scripts/tts_dataset_files/cmudict-0.7b_nv22.08 | |
Traceback (most recent call last): | |
File "<frozen eff.core.archive>", line 764, in extract | |
File "/opt/conda/lib/python3.8/tarfile.py", line 2060, in extract | |
tarinfo = self.getmember(member) | |
File "/opt/conda/lib/python3.8/tarfile.py", line 1782, in getmember | |
raise KeyError("filename %r not found" % name) | |
KeyError: "filename 'manifest.yaml' not found" | |
During handling of the above exception, another exception occurred: | |
Traceback (most recent call last): | |
File "<frozen eff.core.archive>", line 653, in restore_manifest | |
File "<frozen eff.core.archive>", line 767, in extract | |
File "/opt/conda/lib/python3.8/tarfile.py", line 2060, in extract | |
tarinfo = self.getmember(member) | |
File "/opt/conda/lib/python3.8/tarfile.py", line 1782, in getmember | |
raise KeyError("filename %r not found" % name) | |
KeyError: "filename './manifest.yaml' not found" | |
During handling of the above exception, another exception occurred: | |
Traceback (most recent call last): | |
File "<frozen core.connectors.save_restore_connector>", line 94, in restore_from | |
File "<frozen core.cookbooks.nemo_cookbook>", line 404, in restore_from | |
File "<frozen core.cookbooks.cookbook>", line 203, in validate_archive | |
File "<frozen eff.core.archive>", line 657, in restore_manifest | |
TypeError: The indicated file '/data/tts_en_fastpitch_v1.4.0/tts_en_fastpitch_align.nemo' is not an EFF archive | |
During handling of the above exception, another exception occurred: | |
Traceback (most recent call last): | |
File "/opt/NeMo/nemo/core/classes/common.py", line 482, in from_config_dict | |
instance = imported_cls(cfg=config, trainer=trainer) | |
File "/opt/NeMo/nemo/collections/tts/models/fastpitch.py", line 105, in __init__ | |
self._setup_tokenizer(tokenizer_conf) | |
File "/opt/NeMo/nemo/collections/tts/models/fastpitch.py", line 188, in _setup_tokenizer | |
g2p_kwargs["phoneme_dict"] = self.register_artifact( | |
File "/opt/NeMo/nemo/core/classes/modelPT.py", line 222, in register_artifact | |
return self._save_restore_connector.register_artifact(self, config_path, src, verify_src_exists) | |
File "/opt/NeMo/nemo/core/connectors/save_restore_connector.py", line 378, in register_artifact | |
raise FileNotFoundError( | |
FileNotFoundError: src path does not exist or it is not a path in nemo file. src value I got was: scripts/tts_dataset_files/cmudict-0.7b_nv22.08. Absolute: /opt/nvidia/tools/scripts/tts_dataset_files/cmudict-0.7b_nv22.08 | |
Error executing job with overrides: ['exp_manager.explicit_log_dir=/results/spectro_gen/finetune', 'trainer.gpus=1', 'restore_from=/data/tts_en_fastpitch_v1.4.0/tts_en_fastpitch_align.nemo', 'encryption_key=tlt_encode', 'train_dataset=/data/GLaDOS/merged_train.json', 'validation_dataset=/data/GLaDOS/manifest_val.json', 'prior_folder=/results/spectro_gen/finetune/prior_folder', 'trainer.max_epochs=200', 'n_speakers=2', 'pitch_fmin=80.0', 'pitch_fmax=2048.0', 'pitch_avg=165.458', 'pitch_std=40.1891', 'trainer.precision=16'] | |
An error occurred during Hydra's exception formatting: | |
AssertionError() | |
Traceback (most recent call last): | |
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 252, in run_and_report | |
assert mdl is not None | |
AssertionError | |
During handling of the above exception, another exception occurred: | |
Traceback (most recent call last): | |
File "</opt/conda/lib/python3.8/site-packages/nvidia_tao_pytorch/conv_ai/tts/spectro_gen/scripts/finetune.py>", line 3, in <module> | |
File "<frozen conv_ai.tts.spectro_gen.scripts.finetune>", line 285, in <module> | |
File "/opt/NeMo/nemo/core/config/hydra_runner.py", line 104, in wrapper | |
_run_hydra( | |
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 377, in _run_hydra | |
run_and_report( | |
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 294, in run_and_report | |
raise ex | |
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report | |
return func() | |
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 378, in <lambda> | |
lambda: hydra.run( | |
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 111, in run | |
_ = ret.return_value | |
File "/opt/conda/lib/python3.8/site-packages/hydra/core/utils.py", line 233, in return_value | |
raise self._return_value | |
File "/opt/conda/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job | |
ret.return_value = task_function(task_cfg) | |
File "<frozen conv_ai.tts.spectro_gen.scripts.finetune>", line 224, in main | |
File "/opt/NeMo/nemo/core/classes/modelPT.py", line 311, in restore_from | |
instance = cls._save_restore_connector.restore_from( | |
File "<frozen core.connectors.save_restore_connector>", line 94, in restore_from | |
File "/opt/NeMo/nemo/core/connectors/save_restore_connector.py", line 235, in restore_from | |
loaded_params = self.load_config_and_state_dict( | |
File "/opt/NeMo/nemo/core/connectors/save_restore_connector.py", line 158, in load_config_and_state_dict | |
instance = calling_cls.from_config_dict(config=conf, trainer=trainer) | |
File "/opt/NeMo/nemo/core/classes/common.py", line 504, in from_config_dict | |
raise e | |
File "/opt/NeMo/nemo/core/classes/common.py", line 496, in from_config_dict | |
instance = cls(cfg=config, trainer=trainer) | |
File "/opt/NeMo/nemo/collections/tts/models/fastpitch.py", line 105, in __init__ | |
self._setup_tokenizer(tokenizer_conf) | |
File "/opt/NeMo/nemo/collections/tts/models/fastpitch.py", line 188, in _setup_tokenizer | |
g2p_kwargs["phoneme_dict"] = self.register_artifact( | |
File "/opt/NeMo/nemo/core/classes/modelPT.py", line 222, in register_artifact | |
return self._save_restore_connector.register_artifact(self, config_path, src, verify_src_exists) | |
File "/opt/NeMo/nemo/core/connectors/save_restore_connector.py", line 378, in register_artifact | |
raise FileNotFoundError( | |
FileNotFoundError: src path does not exist or it is not a path in nemo file. src value I got was: scripts/tts_dataset_files/cmudict-0.7b_nv22.08. Absolute: /opt/nvidia/tools/scripts/tts_dataset_files/cmudict-0.7b_nv22.08 | |
[NeMo I 2022-12-22 15:36:46 <frozen core:166] Sending telemetry data. | |
[NeMo W 2022-12-22 15:36:46 <frozen core:176] Telemetry data couldn't be sent, but the command ran successfully. | |
[NeMo W 2022-12-22 15:36:46 <frozen core:177] [Error]: <urlopen error [Errno -2] Name or service not known> | |
[NeMo W 2022-12-22 15:36:46 <frozen core:181] Execution status: FAIL | |
2022-12-22 10:36:46,862 [INFO] tlt.components.docker_handler.docker_handler: Stopping container. |