|
2023-08-05 15:35:03 usage: tts [-h] [--list_models [LIST_MODELS]] |
|
2023-08-05 15:35:03 [--model_info_by_idx MODEL_INFO_BY_IDX] |
|
2023-08-05 15:35:03 [--model_info_by_name MODEL_INFO_BY_NAME] [--text TEXT] |
|
2023-08-05 15:35:03 [--model_name MODEL_NAME] [--vocoder_name VOCODER_NAME] |
|
2023-08-05 15:35:03 [--config_path CONFIG_PATH] [--model_path MODEL_PATH] |
|
2023-08-05 15:35:03 [--out_path OUT_PATH] [--use_cuda USE_CUDA] |
|
2023-08-05 15:35:03 [--vocoder_path VOCODER_PATH] |
|
2023-08-05 15:35:03 [--vocoder_config_path VOCODER_CONFIG_PATH] |
|
2023-08-05 15:35:03 [--encoder_path ENCODER_PATH] |
|
2023-08-05 15:35:03 [--encoder_config_path ENCODER_CONFIG_PATH] [--emotion EMOTION] |
|
2023-08-05 15:35:03 [--speakers_file_path SPEAKERS_FILE_PATH] |
|
2023-08-05 15:35:03 [--language_ids_file_path LANGUAGE_IDS_FILE_PATH] |
|
2023-08-05 15:35:03 [--speaker_idx SPEAKER_IDX] [--language_idx LANGUAGE_IDX] |
|
2023-08-05 15:35:03 [--speaker_wav SPEAKER_WAV [SPEAKER_WAV ...]] |
|
2023-08-05 15:35:03 [--gst_style GST_STYLE] |
|
2023-08-05 15:35:03 [--capacitron_style_wav CAPACITRON_STYLE_WAV] |
|
2023-08-05 15:35:03 [--capacitron_style_text CAPACITRON_STYLE_TEXT] |
|
2023-08-05 15:35:03 [--list_speaker_idxs [LIST_SPEAKER_IDXS]] |
|
2023-08-05 15:35:03 [--list_language_idxs [LIST_LANGUAGE_IDXS]] |
|
2023-08-05 15:35:03 [--save_spectogram SAVE_SPECTOGRAM] [--reference_wav REFERENCE_WAV] |
|
2023-08-05 15:35:03 [--reference_speaker_idx REFERENCE_SPEAKER_IDX] |
|
2023-08-05 15:35:03 [--progress_bar PROGRESS_BAR] [--source_wav SOURCE_WAV] |
|
2023-08-05 15:35:03 [--target_wav TARGET_WAV] [--voice_dir VOICE_DIR] |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 Synthesize speech on command line. |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 You can either use your trained model or choose a model from the provided list. |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 If you don't specify a model, it uses an LJSpeech-based English model. |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 ## Example Runs |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 ### Single Speaker Models |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - List provided models: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --list_models |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Query model info by idx: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --model_info_by_idx "<model_type>/<model_query_idx>" |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Query model info by full name: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --model_info_by_name "<model_type>/<language>/<dataset>/<model_name>" |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Run TTS with default models: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --text "Text for TTS" |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Run a TTS model with its default vocoder model: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --text "Text for TTS" --model_name "<model_type>/<language>/<dataset>/<model_name>" |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Run with specific TTS and vocoder models from the list: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --text "Text for TTS" --model_name "<model_type>/<language>/<dataset>/<model_name>" --vocoder_name "<model_type>/<language>/<dataset>/<model_name>" --out_path output/path/speech.wav |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Run your own TTS model (Using Griffin-Lim Vocoder): |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --text "Text for TTS" --model_path path/to/model.pth --config_path path/to/config.json --out_path output/path/speech.wav |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Run your own TTS and Vocoder models: |
|
2023-08-05 15:35:03 $ tts --text "Text for TTS" --model_path path/to/model.pth --config_path path/to/config.json --out_path output/path/speech.wav |
|
2023-08-05 15:35:03 --vocoder_path path/to/vocoder.pth --vocoder_config_path path/to/vocoder_config.json |
|
2023-08-05 15:35:03 |
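The single-speaker runs above can also be assembled from a script. The following is a minimal sketch, not part of the tool itself: `build_tts_command` is a hypothetical helper, and the check that the config file ends in `.json` is an assumption taken from the example paths in this help text. It uses only flags that appear in the usage line.

```python
from pathlib import Path
from typing import List, Optional


def build_tts_command(
    text: str,
    model_path: str,
    config_path: str,
    out_path: str,
    vocoder_path: Optional[str] = None,
    vocoder_config_path: Optional[str] = None,
) -> List[str]:
    """Build an argv list for the `tts` CLI (hypothetical helper)."""
    # Assumption: the config is a .json file and the checkpoint is not,
    # as in the example paths. This catches swapped arguments early.
    if Path(model_path).suffix == ".json" or Path(config_path).suffix != ".json":
        raise ValueError("model_path must be the checkpoint, config_path the JSON config")

    cmd = [
        "tts",
        "--text", text,
        "--model_path", model_path,
        "--config_path", config_path,
        "--out_path", out_path,
    ]

    # A custom vocoder needs both its checkpoint and its config.
    # With neither given, the help text says Griffin-Lim is used.
    if (vocoder_path is None) != (vocoder_config_path is None):
        raise ValueError("pass both vocoder_path and vocoder_config_path, or neither")
    if vocoder_path is not None and vocoder_config_path is not None:
        cmd += [
            "--vocoder_path", vocoder_path,
            "--vocoder_config_path", vocoder_config_path,
        ]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting problems with the text argument.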
|
2023-08-05 15:35:03 ### Multi-speaker Models |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - List the available speakers and choose a <speaker_id> among them: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --model_name "<model_type>/<language>/<dataset>/<model_name>" --list_speaker_idxs |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Run the multi-speaker TTS model with the target speaker ID: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --text "Text for TTS." --out_path output/path/speech.wav --model_name "<model_type>/<language>/<dataset>/<model_name>" --speaker_idx <speaker_id> |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 - Run your own multi-speaker TTS model: |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --text "Text for TTS" --out_path output/path/speech.wav --model_path path/to/model.pth --config_path path/to/config.json --speakers_file_path path/to/speaker.json --speaker_idx <speaker_id> |
|
2023-08-05 15:35:03 |
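The multi-speaker workflow above has two steps: list the speaker IDs, then synthesize with one of them. A minimal sketch of that pairing follows, assuming a hypothetical helper named `multi_speaker_commands`. The model name and speaker ID stay as placeholders because real values depend on the model you pick.

```python
from typing import List, Tuple


def multi_speaker_commands(
    model_name: str, text: str, out_path: str, speaker_idx: str
) -> Tuple[List[str], List[str]]:
    """Return (list_cmd, synth_cmd) argv lists for a multi-speaker model.

    Hypothetical helper; it only arranges flags shown in the help text.
    """
    if not text.strip():
        raise ValueError("text must not be empty")

    # Step 1: print the speaker IDs the model knows about.
    list_cmd = ["tts", "--model_name", model_name, "--list_speaker_idxs"]

    # Step 2: synthesize with one of the IDs printed by step 1.
    synth_cmd = [
        "tts",
        "--text", text,
        "--out_path", out_path,
        "--model_name", model_name,
        "--speaker_idx", speaker_idx,
    ]
    return list_cmd, synth_cmd


if __name__ == "__main__":
    # Placeholder values copied from the help text, not real model names.
    list_cmd, synth_cmd = multi_speaker_commands(
        "<model_type>/<language>/<dataset>/<model_name>",
        "Text for TTS.",
        "output/path/speech.wav",
        "<speaker_id>",
    )
    print(" ".join(list_cmd))
    print(" ".join(synth_cmd))
```

Run the first command, read a speaker ID from its output, and only then build the second. The speaker ID is not validated here because the valid set is known only to the model.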
|
2023-08-05 15:35:03 ### Voice Conversion Models |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 $ tts --out_path output/path/speech.wav --model_name "<model_type>/<language>/<dataset>/<model_name>" --source_wav <path/to/speaker/wav> --target_wav <path/to/reference/wav> |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 |
|
2023-08-05 15:35:03 options: |
|
2023-08-05 15:35:03 -h, --help show this help message and exit |
|
2023-08-05 15:35:03 --list_models [LIST_MODELS] |
|
2023-08-05 15:35:03 list available pre-trained TTS and vocoder models. |
|
2023-08-05 15:35:03 --model_info_by_idx MODEL_INFO_BY_IDX |
|
2023-08-05 15:35:03 model info using query format: <model_type>/<model_query_idx> |
|
2023-08-05 15:35:03 --model_info_by_name MODEL_INFO_BY_NAME |
|
2023-08-05 15:35:03 model info using query format: <model_type>/<language>/<dataset>/<model_name> |
|
2023-08-05 15:35:03 --text TEXT Text to generate speech. |
|
2023-08-05 15:35:03 --model_name MODEL_NAME |
|
2023-08-05 15:35:03 Name of one of the pre-trained TTS models in format <model_type>/<language>/<dataset>/<model_name> |
|
2023-08-05 15:35:03 --vocoder_name VOCODER_NAME |
|
2023-08-05 15:35:03 Name of one of the pre-trained vocoder models in format <model_type>/<language>/<dataset>/<model_name> |
|
2023-08-05 15:35:03 --config_path CONFIG_PATH |
|
2023-08-05 15:35:03 Path to model config file. |
|
2023-08-05 15:35:03 --model_path MODEL_PATH |
|
2023-08-05 15:35:03 Path to model file. |
|
2023-08-05 15:35:03 --out_path OUT_PATH Output wav file path. |
|
2023-08-05 15:35:03 --use_cuda USE_CUDA Run model on CUDA. |
|
2023-08-05 15:35:03 --vocoder_path VOCODER_PATH |
|
2023-08-05 15:35:03 Path to vocoder model file. If it is not defined, the model uses Griffin-Lim (GL) as vocoder. Make sure the vocoder library (WaveRNN) is installed first. |
|
2023-08-05 15:35:03 --vocoder_config_path VOCODER_CONFIG_PATH |
|
2023-08-05 15:35:03 Path to vocoder model config file. |
|
2023-08-05 15:35:03 --encoder_path ENCODER_PATH |
|
2023-08-05 15:35:03 Path to speaker encoder model file. |
|
2023-08-05 15:35:03 --encoder_config_path ENCODER_CONFIG_PATH |
|
2023-08-05 15:35:03 Path to speaker encoder config file. |
|
2023-08-05 15:35:03 --emotion EMOTION Emotion to condition the model with. Only available for 🐸Coqui Studio models. |
|
2023-08-05 15:35:03 --speakers_file_path SPEAKERS_FILE_PATH |
|
2023-08-05 15:35:03 JSON file for multi-speaker model. |
|
2023-08-05 15:35:03 --language_ids_file_path LANGUAGE_IDS_FILE_PATH |
|
2023-08-05 15:35:03 JSON file for multi-lingual model. |
|
2023-08-05 15:35:03 --speaker_idx SPEAKER_IDX |
|
2023-08-05 15:35:03 Target speaker ID for a multi-speaker TTS model. |
|
2023-08-05 15:35:03 --language_idx LANGUAGE_IDX |
|
2023-08-05 15:35:03 Target language ID for a multi-lingual TTS model. |
|
2023-08-05 15:35:03 --speaker_wav SPEAKER_WAV [SPEAKER_WAV ...] |
|
2023-08-05 15:35:03 wav file(s) to condition a multi-speaker TTS model with a Speaker Encoder. You can give multiple file paths. The d-vector is computed as their average. |
|
2023-08-05 15:35:03 --gst_style GST_STYLE |
|
2023-08-05 15:35:03 Wav path file for GST style reference. |
|
2023-08-05 15:35:03 --capacitron_style_wav CAPACITRON_STYLE_WAV |
|
2023-08-05 15:35:03 Wav path file for Capacitron prosody reference. |
|
2023-08-05 15:35:03 --capacitron_style_text CAPACITRON_STYLE_TEXT |
|
2023-08-05 15:35:03 Transcription of the reference. |
|
2023-08-05 15:35:03 --list_speaker_idxs [LIST_SPEAKER_IDXS] |
|
2023-08-05 15:35:03 List available speaker ids for the defined multi-speaker model. |
|
2023-08-05 15:35:03 --list_language_idxs [LIST_LANGUAGE_IDXS] |
|
2023-08-05 15:35:03 List available language ids for the defined multi-lingual model. |
|
2023-08-05 15:35:03 --save_spectogram SAVE_SPECTOGRAM |
|
2023-08-05 15:35:03 If true, save the raw spectrogram in out_path for further (vocoder) processing. |
|
2023-08-05 15:35:03 --reference_wav REFERENCE_WAV |
|
2023-08-05 15:35:03 Reference wav file to convert in the voice of the speaker_idx or speaker_wav |
|
2023-08-05 15:35:03 --reference_speaker_idx REFERENCE_SPEAKER_IDX |
|
2023-08-05 15:35:03 speaker ID of the reference_wav speaker (If not provided the embedding will be computed using the Speaker Encoder). |
|
2023-08-05 15:35:03 --progress_bar PROGRESS_BAR |
|
2023-08-05 15:35:03 If true shows a progress bar for the model download. Defaults to True |
|
2023-08-05 15:35:03 --source_wav SOURCE_WAV |
|
2023-08-05 15:35:03 Original audio file to convert in the voice of the target_wav |
|
2023-08-05 15:35:03 --target_wav TARGET_WAV |
|
2023-08-05 15:35:03 Target audio file whose voice the source_wav is converted to |
|
2023-08-05 15:35:03 --voice_dir VOICE_DIR |
|
2023-08-05 15:35:03 Voice dir for tortoise model |
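
The option list above shows three argument shapes: `--list_models [LIST_MODELS]` (value optional), `--use_cuda USE_CUDA` (value required), and `--speaker_wav SPEAKER_WAV [SPEAKER_WAV ...]` (one or more values). The sketch below shows how Python's `argparse` produces those shapes. It is an illustration under assumptions, not the tool's actual source: `str2bool`, `make_parser`, and the `tts-sketch` program name are hypothetical, and the defaults are guesses.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical helper: map common true/false spellings to a bool."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def make_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="tts-sketch")
    # nargs="?" makes the value optional, shown as "[LIST_MODELS]".
    # `const` is used when the flag is given with no value.
    parser.add_argument(
        "--list_models", type=str2bool, nargs="?", const=True, default=False
    )
    # No nargs: the flag must be followed by a value, e.g. "--use_cuda true".
    parser.add_argument("--use_cuda", type=str2bool, default=False)
    # nargs="+" accepts one or more values and collects them into a list.
    parser.add_argument("--speaker_wav", nargs="+", default=None)
    return parser


if __name__ == "__main__":
    args = make_parser().parse_args(
        ["--list_models", "--use_cuda", "true", "--speaker_wav", "a.wav", "b.wav"]
    )
    print(args.list_models, args.use_cuda, args.speaker_wav)
```

In this sketch a bare `--list_models` is enough, while `--use_cuda` needs an explicit value such as `true`.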