nnsvs_kurobousuku_ai_song_db_dev_48k_world_training
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/taroushirani/d3cd377ecf525c94310251367edf0085/nnsvs_kurobousuku_ai_song_db_dev_48k_world_training.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "giSmNuVzDdcL"
},
"source": [
"# Setup"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "J76ifcTqBr88"
},
"source": [
"# Miscellaneous settings\n",
"## Make Google Drive accessible\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UAHxQFcMOBR2"
},
"outputs": [],
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7b6A4YPEOEjt"
},
"outputs": [],
"source": [
"!ln -s \"/content/drive/My Drive\" /content/gdrive"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "I7FTMYdyUzgW"
},
"source": [
"## Check GPU"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "C6wUuWIupBz3"
},
"outputs": [],
"source": [
"! nvidia-smi"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "K2ghaLcEOz5C"
},
"source": [
"\n",
"## Update numpy, numba and cython\n",
"If numpy is upgraded, restart the runtime when Colaboratory prompts you to."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "FCY9PjEUXT5i"
},
"outputs": [],
"source": [
"! pip install -U numpy numba cython"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ci9XLYz5RRp2"
},
"source": [
"## Install pysinsy (binary-indep version)\n",
"Installing pysinsy via pip currently fails, so we build and install it manually from source."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "BbzKX7A1PHia"
},
"outputs": [],
"source": [
"# ! pip install pysinsy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "T6zRCxRJXp5C"
},
"outputs": [],
"source": [
"! rm -rf pysinsy\n",
"! git clone https://github.com/r9y9/pysinsy.git\n",
"! cd pysinsy && git submodule update --init --recursive\n",
"#! cd pysinsy && pip install .\n",
"! cd pysinsy && python setup.py build_ext\n",
"! cd pysinsy && python setup.py install"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kC4w4OdDSrLx"
},
"source": [
"## Install nnmnkwii (development version)\n",
"Alternatively, run `pip install git+https://github.com/r9y9/nnmnkwii`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "tZwFN2dLTi8L"
},
"outputs": [],
"source": [
"! git clone -q https://github.com/r9y9/nnmnkwii\n",
"! cd nnmnkwii && pip install ."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "meDORyuKiA50"
},
"source": [
"## Install ParallelWaveGAN"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "un5lx-3DgXiA"
},
"outputs": [],
"source": [
"! git clone -b nnsvs https://github.com/nnsvs/ParallelWaveGAN.git\n",
"! cd ParallelWaveGAN && pip install ."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "24jlkljnTttP"
},
"source": [
"## Install NNSVS"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "UdRQ5pMuYtFj"
},
"outputs": [],
"source": [
"! git clone -b newdb_dev -q https://github.com/taroushirani/nnsvs\n",
"#! cd nnsvs && pip install .\n",
"! cd nnsvs && python setup.py install"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "iCTC2-NSmdSN"
},
"outputs": [],
"source": [
"# for dev branch\n",
"! pip install matplotlib mlflow optuna hydra-optuna-sweeper protobuf"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2DtPha5MX7G_"
},
"source": [
"## Install HN-UnifiedSourceFilterGAN"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "8osH9Y_IX6q_"
},
"outputs": [],
"source": [
"! git clone -b nnsvs https://github.com/nnsvs/HN-UnifiedSourceFilterGAN.git\n",
"! cd HN-UnifiedSourceFilterGAN && pip install ."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "niCoX2FDiazy"
},
"source": [
"## Install SiFiGAN"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Ec2Z1dj1iaAs"
},
"outputs": [],
"source": [
"! git clone -b nnsvs https://github.com/nnsvs/SiFiGAN.git\n",
"! cd SiFiGAN && pip install ."
]
},
{
"cell_type": "markdown",
"source": [
"## Install matplotlib<3.6.0"
],
"metadata": {
"id": "znzdUeJsCpnU"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "C66bGVblVkBC"
},
"outputs": [],
"source": [
"! pip install -U \"matplotlib<3.6.0\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zVs9OAsrVTve"
},
"source": [
"## Recipe setting"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "nJHNL9htIP4D"
},
"outputs": [],
"source": [
"RECIPE_ROOT=\"/content/nnsvs/recipes/kurobousuku_ai_song_db/dev-48k-world\""
]
},
{
"cell_type": "markdown",
"source": [
"### Install recipe-specific packages"
],
"metadata": {
"id": "zW-0baBhCvr4"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "XTb5MRRHUGAh"
},
"outputs": [],
"source": [
"! pip install jaconv utaupy"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kvC_xsSOWDUh"
},
"source": [
"# Data preparation\n",
"## Extract Kurobousuku_AI_Song -DB.zip"
]
},
{
"cell_type": "code",
"source": [
"! cd /content/gdrive && unzip -u Kurobousuku_AI_Song\\ -DB.zip"
],
"metadata": {
"id": "Gqu18UJzgMaz"
},
"execution_count": null,
"outputs": []
},
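{
"cell_type": "markdown",
"source": [
"## Rename and fix Kurobousuku_AI_Song -DB\n",
"The NNSVS scripts cannot handle spaces in paths correctly, so we rename the database directory to remove the space. The same cell deletes a nested duplicate archive and its extracted copy, and renames three misnamed WAV files so that each matches its directory name."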
],
"metadata": {
"id": "voEZMfdIhz3-"
}
},
{
"cell_type": "code",
"source": [
"! rm -rf /content/gdrive/Kurobousuku_AI_Song_DB\n",
"! mv /content/gdrive/Kurobousuku_AI_Song\\ -DB /content/gdrive/Kurobousuku_AI_Song_DB\n",
"! cd /content/gdrive/Kurobousuku_AI_Song_DB && rm kuroisorani_kagayakusoragamieru_yado_long/kuroisorani_kagayakusoragamieru_yado_long.zip\n",
"! cd /content/gdrive/Kurobousuku_AI_Song_DB && rm -rf kuroisorani_kagayakusoragamieru_yado_long/kuroisorani_kagayakusoragamieru_yado_long/\n",
"! cd /content/gdrive/Kurobousuku_AI_Song_DB && mv katatsumuri_h/katatsumuri_h..wav katatsumuri_h/katatsumuri_h.wav\n",
"! cd /content/gdrive/Kurobousuku_AI_Song_DB && mv high_speed_chanting_06_a_h/high_speed_chanting_06_a_l.wav high_speed_chanting_06_a_h/high_speed_chanting_06_a_h.wav\n",
"! cd /content/gdrive/Kurobousuku_AI_Song_DB && mv my_grandfather_s_clock_03/my_grandfather_s_clock_03_.wav my_grandfather_s_clock_03/my_grandfather_s_clock_03.wav"
],
"metadata": {
"id": "YihdbX8BhdlD"
},
"execution_count": null,
"outputs": []
},
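{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cleanup above can be sanity-checked with a short scan. This is a minimal sketch, not part of the recipe: it assumes that every song directory should contain a WAV file named after the directory, which is what the three renames above suggest but the database does not document here. It only prints findings and changes nothing."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"\n",
"# Assumption: each song directory holds a WAV named after the directory.\n",
"db_root = Path(\"/content/gdrive/Kurobousuku_AI_Song_DB\")\n",
"\n",
"for song_dir in sorted(p for p in db_root.iterdir() if p.is_dir()):\n",
"    # Spaces in names are what the NNSVS scripts cannot handle\n",
"    if \" \" in song_dir.name:\n",
"        print(f\"Space in directory name: {song_dir}\")\n",
"    # Report directories whose expected WAV file is absent\n",
"    if not (song_dir / f\"{song_dir.name}.wav\").exists():\n",
"        print(f\"Missing expected WAV: {song_dir.name}.wav\")"
]
},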
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "egbw2IYYw1DH"
},
"outputs": [],
"source": [
"! sed -i 's#\\~\\/data\\/Kurobousuku_AI_Song\\ -DB#\\/content\\/gdrive\\/Kurobousuku_AI_Song_DB#g' $RECIPE_ROOT/config.yaml"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "HO_LXEu1BGG8"
},
"outputs": [],
"source": [
"#! cd $RECIPE_ROOT && bash run.sh --stage -1 --stop-stage -1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "0bLMxIs2x2Qr"
},
"outputs": [],
"source": [
"#! cd $RECIPE_ROOT && rm -rf data\n",
"! cd $RECIPE_ROOT && bash run.sh --stage 0 --stop-stage 0"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "QnYwVu94gjF4"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 1 --stop-stage 1"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HH4AVfZlXa07"
},
"source": [
"# Save the extracted data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "SVsg4Xn8XIee"
},
"outputs": [],
"source": [
"! tar zcf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_data_extract.tgz $RECIPE_ROOT/dump $RECIPE_ROOT/data $RECIPE_ROOT/outputs"
]
},
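{
"cell_type": "markdown",
"metadata": {},
"source": [
"GNU tar strips the leading `/` from member names when it archives absolute paths, which is why the restore cell extracts with `-C /`. The following is a minimal sketch for confirming this by listing a few member names; it assumes the archive written by the cell above exists on the mounted drive."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import tarfile\n",
"\n",
"archive = \"/content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_data_extract.tgz\"\n",
"\n",
"# Open the gzip-compressed archive read-only and collect member names\n",
"with tarfile.open(archive, \"r:gz\") as tar:\n",
"    names = tar.getnames()\n",
"\n",
"print(f\"{len(names)} members\")\n",
"# Names are expected to be relative, e.g. starting with content/nnsvs/...\n",
"print(names[:5])"
]
},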
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "JNrPKWQBRgcx"
},
"outputs": [],
"source": [
"! tar zxf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_data_extract.tgz -C /"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8EFgAz7v8Ee6"
},
"source": [
"# Training\n",
"## Timelag model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "oUql0IwlkvEu"
},
"outputs": [],
"source": [
"! cd nnsvs && git checkout $RECIPE_ROOT/config.yaml\n",
"! cd nnsvs && git pull origin newdb_dev\n",
"! cd nnsvs && pip install .\n",
"! sed -i 's#\\~\\/data\\/Kurobousuku_AI_Song\\ -DB#\\/content\\/gdrive\\/Kurobousuku_AI_Song_DB#g' $RECIPE_ROOT/config.yaml"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "andN-LWzC9-l"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 2 --stop-stage 2"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DyPXU3RJ8MFp"
},
"source": [
"## Duration model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "js0FLYInDEGL"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 3 --stop-stage 3"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "bNClQ21AcUGm"
},
"outputs": [],
"source": [
"! tar zcf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_trained.tgz $RECIPE_ROOT/dump $RECIPE_ROOT/data $RECIPE_ROOT/exp $RECIPE_ROOT/outputs $RECIPE_ROOT/tensorboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "fKpJzYQwKdl-"
},
"outputs": [],
"source": [
"#! tar zxf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_trained.tgz -C /"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "KQ1u_vYdDFwm"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 4 --stop-stage 4"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "uCj64dxLrSWL"
},
"outputs": [],
"source": [
"! tar zcf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_trained.tgz $RECIPE_ROOT/dump $RECIPE_ROOT/data $RECIPE_ROOT/exp $RECIPE_ROOT/outputs $RECIPE_ROOT/tensorboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sVU6ra7TDHEZ"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 5 --stop-stage 5"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "h2IZOWtvFD9N"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 6 --stop-stage 6"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YQMmSBWEBZ9I"
},
"outputs": [],
"source": [
"! tar zcf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_trained.tgz $RECIPE_ROOT/dump $RECIPE_ROOT/data $RECIPE_ROOT/exp $RECIPE_ROOT/outputs $RECIPE_ROOT/tensorboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "WX2_fBwPATwg"
},
"outputs": [],
"source": [
"#! tar zxf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_trained.tgz -C /"
]
},
{
"cell_type": "markdown",
"source": [
"## Check tensorboard"
],
"metadata": {
"id": "vBD77iUKDh6v"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "YE0a85-ZvuZh"
},
"outputs": [],
"source": [
"%load_ext tensorboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "VMrHncH4vvW6"
},
"outputs": [],
"source": [
"%tensorboard --logdir $RECIPE_ROOT/tensorboard"
]
},
{
"cell_type": "markdown",
"source": [
"# Train SiFi-GAN"
],
"metadata": {
"id": "V74buHNwmTRQ"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1PrY3kUcT6zK"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 9 --stop-stage 9"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "CXPkLmzCVKEE"
},
"outputs": [],
"source": [
"! tar zcf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_trained.tgz $RECIPE_ROOT/dump $RECIPE_ROOT/data $RECIPE_ROOT/exp $RECIPE_ROOT/outputs $RECIPE_ROOT/tensorboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xne54CxThLnI"
},
"outputs": [],
"source": [
"! tar zxf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_trained.tgz -C /"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "OMfrXRLVrBsr"
},
"outputs": [],
"source": [
"#! sed -i 's#train_max_steps: 600000#train_max_steps: 200000#g' $RECIPE_ROOT/conf/train_sifigan/train/nnsvs_sifigan.yaml\n",
"#! sed -i 's#save_interval_steps: 100000#save_interval_steps: 30000#g' $RECIPE_ROOT/conf/train_sifigan/train/nnsvs_sifigan.yaml\n",
"#! sed -i 's#resume:#resume: 400000#g' $RECIPE_ROOT/conf/train_sifigan/train/nnsvs_sifigan.yaml"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "mwhYxr88clyv"
},
"outputs": [],
"source": [
"#! sed -i 's#batch_max_length: 48000#batch_max_length: 24000#g' $RECIPE_ROOT/conf/train_sifigan/data/nnsvs_world_sr48k.yaml"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"background_save": true
},
"id": "lKl2sRtkVogN"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 13 --stop-stage 13"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7TQa0vUjYTud"
},
"outputs": [],
"source": [
"! tar zcf /content/gdrive/nnsvs_kurobousuku_ai_song_db_dev_48k_world_trained.tgz $RECIPE_ROOT/dump $RECIPE_ROOT/data $RECIPE_ROOT/exp $RECIPE_ROOT/outputs $RECIPE_ROOT/tensorboard"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xkRkJtPNP2GE"
},
"source": [
"# Create the packed model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "BpBcB0qNO_Xs"
},
"outputs": [],
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 99 --stop-stage 99"
]
},
{
"cell_type": "markdown",
"source": [
"# Synthesis via SPSVS module"
],
"metadata": {
"id": "21Or63yrmLTa"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "I3dMaCqpP1PY"
},
"outputs": [],
"source": [
"import yaml\n",
"from os.path import join\n",
"\n",
"with open(join(RECIPE_ROOT, 'config.yaml'), 'r') as yml:\n",
"    config = yaml.load(yml, Loader=yaml.FullLoader)\n",
"\n",
"spk=config[\"spk\"]\n",
"timelag_model=config[\"timelag_model\"]\n",
"duration_model=config[\"duration_model\"]\n",
"acoustic_model=config[\"acoustic_model\"]\n",
"vocoder_model=config[\"vocoder_model\"]\n",
"packed_model_dir=join(RECIPE_ROOT, \"packed_models\", f\"{spk}_{timelag_model}_{duration_model}_{acoustic_model}_{vocoder_model}\")\n",
"print(packed_model_dir)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "HjDGh3w1PXUp"
},
"outputs": [],
"source": [
"from nnsvs.svs import SPSVS\n",
"import os\n",
"from os.path import basename, join, splitext, exists\n",
"from tqdm import tqdm\n",
"from nnmnkwii.io import hts\n",
"import numpy as np\n",
"import torch\n",
"from scipy.io import wavfile\n",
"\n",
"in_dir=\"/content/gdrive/sample_score_20230330\"\n",
"song_list_file=join(in_dir, \"song_list.txt\")\n",
"out_dir=join(RECIPE_ROOT, \"exp\", config[\"spk\"], \"synthesis\", \"svs\")\n",
"\n",
"os.makedirs(out_dir, exist_ok=True)\n",
"\n",
"# Load the packed model, using the GPU when one is available\n",
"engine=SPSVS(packed_model_dir, device=\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"\n",
"with open(song_list_file) as f:\n",
"    # Skip blank lines in the song list\n",
"    lines = list(filter(lambda s: len(s.strip()) > 0, f.readlines()))\n",
"    print(f\"Processing {len(lines)} utterances...\")\n",
"    for idx in tqdm(range(len(lines))):\n",
"        utt_id = lines[idx].strip()\n",
"        label_path = join(in_dir, f\"{utt_id}.lab\")\n",
"        if not exists(label_path):\n",
"            raise RuntimeError(f\"Label file does not exist: {label_path}\")\n",
"        # Load the full-context labels and round their start/end times\n",
"        labels = hts.load(label_path).round_()\n",
"\n",
"        wav, sr = engine.svs(labels,\n",
"            vocoder_type=\"auto\",\n",
"            post_filter_type=\"gv\",\n",
"            force_fix_vuv=True,\n",
"            segmented_synthesis=True\n",
"        )\n",
"\n",
"        # Write the synthesized waveform next to the other outputs\n",
"        out_wav_path = join(out_dir, f\"{utt_id}.wav\")\n",
"        wavfile.write(\n",
"            out_wav_path, rate=sr, data=wav\n",
"        )"
]
}
],
"metadata": {
"colab": {
"provenance": [],
"gpuClass": "premium",
"gpuType": "T4",
"mount_file_id": "12HbEBcuG8pRRDY0w9QECR16MVyqlitsN",
"authorship_tag": "ABX9TyPFBC/cEpmyG0FaNRg2YCDD",
"include_colab_link": true
},
"gpuClass": "premium",
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"accelerator": "GPU"
},
"nbformat": 4,
"nbformat_minor": 0
}