@sotelo
Created January 12, 2017 19:58
Pavoque data processing.
All the traces of the processing are in leto11.
1. Extract data with pavoque-repo
1.5 Convert the wav files to 16 kHz. I use: ch_wave -otype riff -F 16000 -o wav/${X} wav48/${X}
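The command in step 1.5 handles a single file ${X}. A minimal sketch of running it over a whole directory follows, assuming the 48 kHz files sit in wav48/ and the output goes to wav/ as in that command. The function name resample_all and the CH_WAVE override variable are hypothetical and are not part of the original processing.

```shell
# Hypothetical helper (not from the original notes): apply the ch_wave
# call from step 1.5 to every .wav file in a source directory.
# CH_WAVE may be set to point at a specific ch_wave binary.
resample_all() {
    src_dir=$1
    dst_dir=$2
    mkdir -p "$dst_dir"
    for f in "$src_dir"/*.wav; do
        [ -e "$f" ] || continue          # the glob matched nothing
        X=$(basename "$f")
        "${CH_WAVE:-ch_wave}" -otype riff -F 16000 -o "$dst_dir/$X" "$f"
    done
}
```

It would be called as: resample_all wav48 wav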
2. Copy the data to /Tmp/sotelo/data/german/raw
3. cd /Tmp/sotelo/results/merlin/egs/build_your_own_voice/s1
4. ./01_setup.sh pavoque
5. mkdir raw_data/pavoque
mkdir processed_data/pavoque
mkdir processed_data/pavoque/acoustic
6. cp -r /Tmp/sotelo/data/german/raw/wav raw_data/pavoque
7. Run merlin/misc/scripts/vocoder/world/extract_features_for_merlin.sh. Before running it, you need to change:
wav_dir=/Tmp/sotelo/results/merlin/egs/build_your_own_voice/s1/raw_data/pavoque/wav
out_dir=/Tmp/sotelo/results/merlin/egs/build_your_own_voice/s1/processed_data/pavoque/acoustic
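The two assignments above can be edited by hand. As a rough sketch, they could also be rewritten with sed, assuming the script assigns each variable at the start of a line as name=value (an assumption about the script layout, so check your copy first). The helper name set_var is hypothetical.

```shell
# Hypothetical helper: replace a line-start "name=..." assignment in a file.
# Assumes the new value contains no "|", "&" or backslash characters,
# since those are special inside this sed expression.
set_var() {
    file=$1
    name=$2
    value=$3
    sed "s|^${name}=.*|${name}=\"${value}\"|" "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&     # cat keeps the file's permissions
        rm -f "$file.tmp"
}
```

For example: set_var extract_features_for_merlin.sh wav_dir /Tmp/sotelo/results/merlin/egs/build_your_own_voice/s1/raw_data/pavoque/wav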
This will take a while. It is probably the most time-consuming step, so I start it first and do the rest while it runs.
Now we will process the text:
8. cd /Tmp/sotelo/data/german/raw
9. Process pavoque labels
https://gist.github.com/0745eb639dc9e5b22b83fbf0ef749ff5
10. cd /Tmp/sotelo/results/merlin/egs/build_your_own_voice/s1
11. cp /Tmp/sotelo/data/german/raw/utts.data raw_data/pavoque
12. cat raw_data/pavoque/utts.data | cut -d " " -f 2 > raw_data/pavoque/file_id_list.scp
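To see what the cut in step 12 extracts, here is a small sketch on one line. It assumes utts.data uses the Festival-style layout ( file_id "text" ), which the notes do not state explicitly; the sample line is made up.

```shell
# Made-up sample line in the assumed utts.data layout: ( file_id "text" )
sample='( utt_0001 "Guten Tag." )'

# Field 1 is "(", so field 2 (space-delimited) is the file id.
file_id=$(printf '%s\n' "$sample" | cut -d " " -f 2)
echo "$file_id"    # prints utt_0001
```

If the ids in your utts.data are not the second space-separated field, the -f value needs to change accordingly.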
(Wait for step 7 to finish before continuing.)
13. mkdir experiments/pavoque/acoustic_model/data
cp -r processed_data/pavoque/acoustic/mgc experiments/pavoque/acoustic_model/data
cp -r processed_data/pavoque/acoustic/lf0 experiments/pavoque/acoustic_model/data
cp -r processed_data/pavoque/acoustic/bap experiments/pavoque/acoustic_model/data
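Before moving on, it may be worth checking that every id in file_id_list.scp has all three feature files. The sketch below assumes the features are stored as mgc/ID.mgc, lf0/ID.lf0 and bap/ID.bap under the data directory (an assumption about the extraction output); the function name missing_features is hypothetical.

```shell
# Hypothetical check: print every feature file that is listed in the id
# list but missing on disk. Assumed layout: <data_dir>/<type>/<id>.<type>
missing_features() {
    list=$1
    data_dir=$2
    while IFS= read -r id; do
        [ -n "$id" ] || continue         # skip empty lines
        for feat in mgc lf0 bap; do
            [ -f "$data_dir/$feat/$id.$feat" ] || echo "$feat/$id.$feat"
        done
    done < "$list"
}
```

For example: missing_features raw_data/pavoque/file_id_list.scp experiments/pavoque/acoustic_model/data
No output means nothing is missing.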
14. Run the steps from 03_run_merlin.sh
Modify conf/global_settings.cfg:
Train=5442
Valid=0
Test=0
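With Valid=0 and Test=0, every utterance goes to training, so Train presumably equals the number of ids in file_id_list.scp. That is an assumption; the notes do not say where 5442 comes from. A minimal sketch for getting that count on your own data, with a hypothetical helper name count_ids:

```shell
# Hypothetical helper: count the non-empty lines of an id list,
# i.e. one per utterance id.
count_ids() {
    grep -c . "$1"
}

# Usage (path from step 12):
#   echo "Train=$(count_ids raw_data/pavoque/file_id_list.scp)"
```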
global_config_file=conf/global_settings.cfg
source $global_config_file
./scripts/prepare_config_files.sh $global_config_file
Modify conf/acoustic_pavoque.conf:
file_id_list: %(data)s/file_id_list.scp
[Outputs]
# dX should be 3 times X
mgc: 60
dmgc: 60
bap: 1
dbap: 1
lf0: 1
dlf0: 1
do_MLPG: False
# Main processes
AcousticModel : True
GenTestList : False
# sub-processes
NORMLAB : False
MAKECMP : True
NORMCMP : True
TRAINDNN : False
DNNGEN : False
GENWAV : False
CALMCD : False
./scripts/submit.sh ${MerlinDir}/src/run_merlin.py conf/acoustic_${Voice}.conf