@taroushirani
Last active July 12, 2020 06:49
nnsvs_test_natsume_singing
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "nnsvs_test_natsume_singing",
"provenance": [],
"toc_visible": true,
"authorship_tag": "ABX9TyNUTQwvUu9ZChSaQGgmNsTo",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/taroushirani/d1c9a99f2c17026d116c976fa2270368/nnsvs_test_natsume_singing.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IxmW5IpLNCz3",
"colab_type": "text"
},
"source": [
"# NN-SVS で 夏目悠季/男性歌声データベースを使う\n",
"\n",
"ここでは[夏目悠李/男性歌声データベース](https://amanokei.hatenablog.com/)をNN-SVSで使用する方法について解説します.\n",
"\n",
"## 基本的な環境構築\n",
"\n",
"### Google Colaboratoryの設定\n",
"Google Colaboratory の基本的な使い方に関しては割愛します. \n",
"画面上のメニューから「編集」-「ノートブックの設定」を選びハードウェアアクセラレータを None から GPU に変更するのを忘れないようにしてください.\n",
"\n",
"### NumPy, Cythonのアップグレード\n",
"NumPy, Cythonは最初からColaboratoryにインストールされていますが, 最新版にアップグレードしておきます."
]
},
{
"cell_type": "code",
"metadata": {
"id": "FCY9PjEUXT5i",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 294
},
"outputId": "71edac26-9036-4c78-cbb2-1743fa2f2e23"
},
"source": [
"! pip install -U numpy cython"
],
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting numpy\n",
"\u001b[?25l Downloading https://files.pythonhosted.org/packages/93/0b/71ae818646c1a80fbe6776d41f480649523ed31243f1f34d9d7e41d70195/numpy-1.19.0-cp36-cp36m-manylinux2010_x86_64.whl (14.6MB)\n",
"\u001b[K |████████████████████████████████| 14.6MB 205kB/s \n",
"\u001b[?25hRequirement already up-to-date: cython in /usr/local/lib/python3.6/dist-packages (0.29.20)\n",
"\u001b[31mERROR: datascience 0.10.6 has requirement folium==0.2.1, but you'll have folium 0.8.3 which is incompatible.\u001b[0m\n",
"\u001b[31mERROR: albumentations 0.1.12 has requirement imgaug<0.2.7,>=0.2.5, but you'll have imgaug 0.2.9 which is incompatible.\u001b[0m\n",
"Installing collected packages: numpy\n",
" Found existing installation: numpy 1.18.5\n",
" Uninstalling numpy-1.18.5:\n",
" Successfully uninstalled numpy-1.18.5\n",
"Successfully installed numpy-1.19.0\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"application/vnd.colab-display-data+json": {
"pip_warning": {
"packages": [
"numpy"
]
}
}
},
"metadata": {
"tags": []
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bHIYMRLgNTd6",
"colab_type": "text"
},
"source": [
"### hts_engine_API, sinsy のビルド\n",
"\n",
"NN-SVS は楽譜 (musicxml ファイル) から歌声合成に必要な情報が書かれたラベルファイル (lab ファイル) に変換するために [Sinsy](http://sinsy.sourceforge.net/) の python wrapper である Pysinsy を使います. Pysinsy のインストールに必要な [hts_engine_API](http://hts-engine.sourceforge.net/), Sinsy をビルドしますが, それぞれ公式からのリリースではなく NN-SVS の作者である Ryuichi Yamamoto 氏の fork を使います."
]
},
{
"cell_type": "code",
"metadata": {
"id": "iV4ghgxzXaNt",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "85c51a60-22d7-4eb2-c94d-c48044356dfb"
},
"source": [
"! git clone -q https://github.com/r9y9/hts_engine_API\n",
"! cd hts_engine_API/src && ./waf configure --prefix=/usr/ && ./waf build > hts_engine_API_build.log 2>&1 && ./waf install\n",
"! git clone -q https://github.com/r9y9/sinsy\n",
"! cd sinsy/src/ && mkdir -p build && cd build && cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=/usr/ .. && make -j > sinsy_build.log 2>&1 && make install"
],
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"text": [
"\u001b[32m\u001b[0mSetting top to :\u001b[0m \u001b[0m\u001b[32m\u001b[32m/content/hts_engine_API/src\u001b[0m \u001b[0m\n",
"\u001b[32m\u001b[0mSetting out to :\u001b[0m \u001b[0m\u001b[32m\u001b[32m/content/hts_engine_API/src/build\u001b[0m \u001b[0m\n",
"\u001b[32m\u001b[0mChecking for waf version in 1.7.11-2.1.0 :\u001b[0m \u001b[0m\u001b[32m\u001b[32mok\u001b[0m \u001b[0m\n",
"\u001b[32m\u001b[0mChecking for 'gcc' (C compiler) :\u001b[0m \u001b[0m\u001b[32m\u001b[32m/usr/bin/gcc\u001b[0m \u001b[0m\n",
"\u001b[32m\u001b[0mChecking for header stdlib.h :\u001b[0m \u001b[0m\u001b[32m\u001b[32myes\u001b[0m \u001b[0m\n",
"\u001b[32m\u001b[0mChecking for header string.h :\u001b[0m \u001b[0m\u001b[32m\u001b[32myes\u001b[0m \u001b[0m\n",
"\n",
"hts_engine_API has been configured as follows:\n",
"\n",
"[Build information]\n",
"Package: hts_engine_API-1.0.9\n",
"build (compile on): x86_64-linux\n",
"host endian: little\n",
"Compiler: gcc\n",
"Compiler version: 7.5.0\n",
"CFLAGS: -O2 -Wall -fno-common -Wstrict-prototypes\n",
"\n",
"\u001b[32m'configure' finished successfully (0.937s)\u001b[0m\n",
"\u001b[32mWaf: Entering directory `/content/hts_engine_API/src/build'\u001b[0m\n",
"\u001b[32m\u001b[0m+ install \u001b[01;34m/usr/include/HTS_hidden.h\u001b[0m (from lib/HTS_hidden.h)\u001b[0m\n",
"\u001b[32m\u001b[0m+ install \u001b[01;34m/usr/include/HTS_engine.h\u001b[0m (from include/HTS_engine.h)\u001b[0m\n",
"\u001b[32m\u001b[0m+ symlink \u001b[01;34m/usr/lib/libhts_engine_API.so\u001b[0m (to libhts_engine_API.so.1.0.9)\u001b[0m\n",
"\u001b[32m\u001b[0m+ install \u001b[01;34m/usr/lib/libhts_engine_API.so.1.0.9\u001b[0m (from build/lib/libhts_engine_API.so)\u001b[0m\n",
"\u001b[32m\u001b[0m+ symlink \u001b[01;34m/usr/lib/libhts_engine_API.so.1\u001b[0m (to libhts_engine_API.so.1.0.9)\u001b[0m\n",
"\u001b[32m\u001b[0m+ install \u001b[01;34m/usr/bin/hts_engine\u001b[0m (from build/bin/hts_engine)\u001b[0m\n",
"\u001b[32m\u001b[0m+ install \u001b[01;34m/usr/lib/pkgconfig/hts_engine_API.pc\u001b[0m (from build/hts_engine_API.pc)\u001b[0m\n",
"\u001b[32mWaf: Leaving directory `/content/hts_engine_API/src/build'\u001b[0m\n",
"\u001b[32m'install' finished successfully (0.080s)\u001b[0m\n",
"-- The C compiler identification is GNU 7.5.0\n",
"-- The CXX compiler identification is GNU 7.5.0\n",
"-- Check for working C compiler: /usr/bin/cc\n",
"-- Check for working C compiler: /usr/bin/cc -- works\n",
"-- Detecting C compiler ABI info\n",
"-- Detecting C compiler ABI info - done\n",
"-- Detecting C compile features\n",
"-- Detecting C compile features - done\n",
"-- Check for working CXX compiler: /usr/bin/c++\n",
"-- Check for working CXX compiler: /usr/bin/c++ -- works\n",
"-- Detecting CXX compiler ABI info\n",
"-- Detecting CXX compiler ABI info - done\n",
"-- Detecting CXX compile features\n",
"-- Detecting CXX compile features - done\n",
"-- Configuring done\n",
"-- Generating done\n",
"-- Build files have been written to: /content/sinsy/src/build\n",
"[ 95%] Built target sinsy\n",
"[100%] Built target sinsy-bin\n",
"\u001b[36mInstall the project...\u001b[0m\n",
"-- Install configuration: \"Release\"\n",
"-- Installing: /usr/lib/libsinsy.so.0.9.2\n",
"-- Installing: /usr/lib/libsinsy.so.0.9\n",
"-- Installing: /usr/lib/libsinsy.so\n",
"-- Installing: /usr/bin/sinsy\n",
"-- Set runtime path of \"/usr/bin/sinsy\" to \"\"\n",
"-- Installing: /usr/include/sinsy\n",
"-- Installing: /usr/include/sinsy/LabelStrings.h\n",
"-- Installing: /usr/include/sinsy/ILabelOutput.h\n",
"-- Installing: /usr/include/sinsy/sinsy.h\n",
"-- Installing: /usr/lib/sinsy/dic\n",
"-- Installing: /usr/lib/sinsy/dic/japanese.shift_jis.table\n",
"-- Installing: /usr/lib/sinsy/dic/japanese.utf_8.table\n",
"-- Installing: /usr/lib/sinsy/dic/japanese.euc_jp.conf\n",
"-- Installing: /usr/lib/sinsy/dic/japanese.shift_jis.conf\n",
"-- Installing: /usr/lib/sinsy/dic/japanese.macron\n",
"-- Installing: /usr/lib/sinsy/dic/japanese.utf_8.conf\n",
"-- Installing: /usr/lib/sinsy/dic/japanese.euc_jp.table\n",
"-- Installing: /usr/lib/pkgconfig/sinsy.pc\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "88So6I-WNocS",
"colab_type": "text"
},
"source": [
"### Pysinsy, nnmnkwii, NN-SVS のインストール\n",
"\n",
"NN-SVS は音声合成システムを作るためのライブラリである [nnmnkwii](https://github.com/r9y9/nnmnkwii) 上に構築されています. Pysinsy, nnmnkwii, NN-SVS をインストールします."
]
},
{
"cell_type": "code",
"metadata": {
"id": "UdRQ5pMuYtFj",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 238
},
"outputId": "2f45d0dd-d2c6-4c16-e8d4-81e19543d5e3"
},
"source": [
"# Python dependencies\n",
"! git clone -q https://github.com/r9y9/pysinsy\n",
"! cd pysinsy && export SINSY_INSTALL_PREFIX=/usr/ && pip install -q .\n",
"! git clone -q https://github.com/r9y9/nnmnkwii\n",
"! cd nnmnkwii && pip install -q .\n",
"! git clone -q https://github.com/r9y9/nnsvs \n",
"! cd nnsvs && pip install -q .\n",
"! pip install -q jaconv"
],
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"text": [
" Building wheel for pysinsy (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
"\u001b[K |████████████████████████████████| 419kB 6.3MB/s \n",
"\u001b[K |████████████████████████████████| 368kB 14.7MB/s \n",
"\u001b[?25h Building wheel for nnmnkwii (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Building wheel for pysptk (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Building wheel for bandmat (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
"\u001b[K |████████████████████████████████| 81kB 4.1MB/s \n",
"\u001b[K |████████████████████████████████| 1.6MB 12.0MB/s \n",
"\u001b[K |████████████████████████████████| 81kB 8.1MB/s \n",
"\u001b[?25h Building wheel for nnsvs (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Building wheel for librosa (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Building wheel for pyworld (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Building wheel for jaconv (setup.py) ... \u001b[?25l\u001b[?25hdone\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SKszyhHhcGm7",
"colab_type": "text"
},
"source": [
"これでNN-SVSを使う準備が出来ました."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "o1Qe7mM2Nx9p",
"colab_type": "text"
},
"source": [
"## データとレシピの準備\n",
"### 夏目悠李/男声歌声データベースのダウンロード\n",
"\n",
"[夏目悠李/男声歌声データベース配布、始めました!【2020/7/5 23:53更新】](https://amanokei.hatenablog.com/entry/2020/04/30/230003) から利用規約に同意して Natsume_Singing_DB.zip をダウンロードします. ダウンロードされたファイルを Google Drive 上にアップロードします.\n",
"\n",
"### Google Driveのマウント\n",
"\n",
"Google Drive にアップロードしたファイルを Colaboratory から使うために Google Drive を Colaboratory にマウントします. 以下のセルを実行し, 表示されたリンクをクリックして Google アカウントへのアクセスを許可してください. 表示されたコードをコピーして Colaboratory 上に戻ってセル中の\"Enter your authorization code:\"の下の欄にペーストします."
]
},
{
"cell_type": "code",
"metadata": {
"id": "XpoUEOtIWx9X",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 122
},
"outputId": "633dbb5a-d800-471b-9997-1bb7cca965cf"
},
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')"
],
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"text": [
"Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly\n",
"\n",
"Enter your authorization code:\n",
"··········\n",
"Mounted at /content/drive\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BWq-G9ptO9kR",
"colab_type": "text"
},
"source": [
"一部のシェルスクリプトはファイル名にスペースが含まれていると上手く動作しないことがあります. /content/gdrive から Google Drive 上のファイルにアクセスできるようシンボリックリンクを貼ります.\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "yeCIXI50O3Qp",
"colab_type": "code",
"colab": {}
},
"source": [
"!ln -s \"/content/drive/My Drive\" /content/gdrive"
],
"execution_count": 5,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "VcvGd0A2PCV6",
"colab_type": "text"
},
"source": [
"これでさきほどアップロードした Natsume_Singing_DB.zip に /content/gdrive/Natsume_Singing_DB.zip からアクセスできるようになりました."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JmcKK61zPj4_",
"colab_type": "text"
},
"source": [
"### 夏目悠李/男声歌声データベースの解凍\n",
"\n",
"セルに以下のように入力して Natsume_Singing_DB.zip を解凍します."
]
},
{
"cell_type": "code",
"metadata": {
"id": "_QhtzUCKfvjc",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "27c52819-434f-40d5-8757-89abc053ee6f"
},
"source": [
"! cd /content/gdrive && unzip -u /content/gdrive/Natsume_Singing_DB.zip"
],
"execution_count": 6,
"outputs": [
{
"output_type": "stream",
"text": [
"Archive: /content/gdrive/Natsume_Singing_DB.zip\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kTxlto4DTGtS",
"colab_type": "text"
},
"source": [
"2020年7月8日現在夏目悠李/男声歌声データベースにはいくつかラベリング上のミスがあるのでパッチを当てます. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "_UHHfPc4THPE",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 510
},
"outputId": "3f67a4b6-0004-4594-922b-886c76dbd362"
},
"source": [
"! curl -O https://gist.githubusercontent.com/taroushirani/813380016fa27e7bb42408d246dad7e7/raw/aa985ce0b1dd865b440ede6adfbb78b87c3688c3/natsume_singing_fix.patch\n",
"! cd /content/gdrive/Natsume_Singing_DB/mono_label && perl -pi -e 's/\\r\\n/\\n/' *.lab \n",
"! cd /content/gdrive/Natsume_Singing_DB/ && patch -p1 < /content/natsume_singing_fix.patch"
],
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": [
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
" Dload Upload Total Spent Left Speed\n",
"\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 2007 100 2007 0 0 8397 0 --:--:-- --:--:-- --:--:-- 8397\n",
"patching file mono_label/37.lab\n",
"Reversed (or previously applied) patch detected! Assume -R? [n] \n",
"Apply anyway? [n] \n",
"Skipping patch.\n",
"1 out of 1 hunk ignored -- saving rejects to file mono_label/37.lab.rej\n",
"patching file mono_label/38.lab\n",
"Reversed (or previously applied) patch detected! Assume -R? [n] \n",
"Apply anyway? [n] \n",
"Skipping patch.\n",
"1 out of 1 hunk ignored -- saving rejects to file mono_label/38.lab.rej\n",
"patching file mono_label/5.lab\n",
"Reversed (or previously applied) patch detected! Assume -R? [n] \n",
"Apply anyway? [n] \n",
"Skipping patch.\n",
"1 out of 1 hunk ignored -- saving rejects to file mono_label/5.lab.rej\n",
"patching file mono_label/50.lab\n",
"Reversed (or previously applied) patch detected! Assume -R? [n] \n",
"Apply anyway? [n] \n",
"Skipping patch.\n",
"1 out of 1 hunk ignored -- saving rejects to file mono_label/50.lab.rej\n",
"patching file mono_label/6.lab\n",
"Reversed (or previously applied) patch detected! Assume -R? [n] \n",
"Apply anyway? [n] \n",
"Skipping patch.\n",
"patch unexpectedly ends in middle of line\n",
"2 out of 2 hunks ignored -- saving rejects to file mono_label/6.lab.rej\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "k1UPW4daPeWy",
"colab_type": "text"
},
"source": [
"### レシピのダウンロード\n",
"github から nnsvs_natsume_singing をダウンロードします. "
]
},
{
"cell_type": "code",
"metadata": {
"id": "4d3gou5qfhRv",
"colab_type": "code",
"colab": {}
},
"source": [
"! cd nnsvs/egs && git clone -q https://github.com/taroushirani/nnsvs_natsume_singing.git"
],
"execution_count": 8,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "j7dG_jgfQW_9",
"colab_type": "text"
},
"source": [
"夏目悠李/男声歌声データベース用のレシピがあるディレクトリに変数 RECIPE_ROOT でアクセスできるようにします."
]
},
{
"cell_type": "code",
"metadata": {
"id": "m-TVJk1SGW_n",
"colab_type": "code",
"colab": {}
},
"source": [
"RECIPE_ROOT=\"/content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/\""
],
"execution_count": 9,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "9XqfzxIPQrN3",
"colab_type": "text"
},
"source": [
"### /content/nnsvs/egs/natsume_singing/00-svs-world/run.sh の編集\n",
"夏目悠李/男声歌声データベース用のレシピは実際には /content/nnsvs/egs/natsume_singing/00-svs-world/run.sh というシェルスクリプトの形で提供されます. レシピに展開した Natsume_Singing_DB の場所を設定します. \n",
"\n",
"画面右端にある「ファイル」アイコンから /content/nnsvs/egs/natsume_singing/00-svs-world/run.sh を開き, 8 行目の db_root を直接編集するか, 以下のセルを実行します. \n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "hT4crM1wgCd7",
"colab_type": "code",
"colab": {}
},
"source": [
"! sed -i 's#[$]HOME\\/data#\\/content\\/gdrive#g' $RECIPE_ROOT/run.sh "
],
"execution_count": 10,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "KErdcntbRe26",
"colab_type": "text"
},
"source": [
"# 夏目悠季/男性歌声データベース用レシピを試す\n",
"## ステージ-1(データのダウンロード) の実行\n",
"ここからは基本的にレシピの提供するステージに沿って実行していきます. NN-SVSではステージ-1はデータのダウンロードですが, ライセンスの関係でこのレシピではデータを自動的にダウンロードするようにはしていません. 予め上記の指示に従って Natsume_Singing_DB.zip をダウンロードしておいてください.\n",
"\n",
"データのダウンロードを行わない代わりに, db_rootで設定された場所に解凍されたNatsume_Singing_DBが見つからない場合はダウンロードを促すメッセージを出力するように変更してあります."
]
},
{
"cell_type": "code",
"metadata": {
"id": "Veas7tJiGj8E",
"colab_type": "code",
"colab": {}
},
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage -1 --stop-stage -1"
],
"execution_count": 11,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "oe2yY-btSh91",
"colab_type": "text"
},
"source": [
"## ステージ0(データの準備)の実行\n",
"\n",
"夏目悠季/男性歌声データベースのデータをNN-SVSで扱いやすい形に整えます. Colaboratory の CPU はそれほど高速ではないので時間がかかります."
]
},
{
"cell_type": "code",
"metadata": {
"id": "8Ce6Fq2fGt17",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "f6ed6ca5-acef-4e4a-bbcf-ff210dfe80e9"
},
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 0 --stop-stage 0"
],
"execution_count": 12,
"outputs": [
{
"output_type": "stream",
"text": [
"stage 0: Data preparation\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"s U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"r U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel I is converted to voiced one. To use unvoiced vowels, the rylics \"sh I\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel I is converted to voiced one. To use unvoiced vowels, the rylics \"sh I\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel I is converted to voiced one. To use unvoiced vowels, the rylics \"sh I\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"s U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"s U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"b U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"s U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"p U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"s U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"b U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"s U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"p U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"b U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel U is converted to voiced one. To use unvoiced vowels, the rylics \"ts U\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel I is converted to voiced one. To use unvoiced vowels, the rylics \"sh I\" in musicxml should be written in Katakana.\n",
"Unvoiced vowel I is converted to voiced one. To use unvoiced vowels, the rylics \"ch I\" in musicxml should be written in Katakana.\n",
"end_time 109900000 of the phoneme m and start_time 111450000 of the phoneme o is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"a and a have the same start_time 745250000 and end_time 748200000. There seems to be a missing phoneme in mono_dtw.\n",
"\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 1019000000 of the phoneme k and start_time 1020900000 of the phoneme u is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"end_time 1096600000 of the phoneme n and start_time 1098450000 of the phoneme o is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 1968300000 of the phoneme u and start_time 1969500000 of the phoneme e is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"1.lab 32.0\n",
"Consecutive pau/sil-s are detected.\n",
"a and a have the same start_time 806050000 and end_time 812200000. There seems to be a missing phoneme in mono_dtw.\n",
"\n",
"10.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"11.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"12.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"14.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"15.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"16.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"17.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"18.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"19.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"20.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"21.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 233600000 of the phoneme t and start_time 237350000 of the phoneme e is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"22.lab 11.0\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 208550000 of the phoneme o and start_time 211100000 of the phoneme i is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"23.lab 7.0\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 1205100000 of the phoneme t and start_time 1208400000 of the phoneme e is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"24.lab 11.0\n",
"Consecutive pau/sil-s are detected.\n",
"25.lab 3.0\n",
"Consecutive pau/sil-s are detected.\n",
"26.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"28.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"29.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"30.lab 11.0\n",
"Consecutive pau/sil-s are detected.\n",
"31.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"32.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"33.lab 4.0\n",
"Consecutive pau/sil-s are detected.\n",
"34.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"35.lab 6.0\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 201200000 of the phoneme t and start_time 203500000 of the phoneme a is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"36.lab 5.0\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 501750000 of the phoneme ts and start_time 503300000 of the phoneme u is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"37.lab 6.0\n",
"Consecutive pau/sil-s are detected.\n",
"38.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"39.lab 3.0\n",
"Consecutive pau/sil-s are detected.\n",
"4.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"40.lab 4.0\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"41.lab 3.0\n",
"Consecutive pau/sil-s are detected.\n",
"42.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"44.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"45.lab 3.0\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"46.lab 4.0\n",
"Consecutive pau/sil-s are detected.\n",
"a and a have the same start_time 523850000 and end_time 528250000. There seems to be a missing phoneme in mono_dtw.\n",
"\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 2015550000 of the phoneme a and start_time 2016850000 of the phoneme t is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"47.lab 9.0\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"Consecutive pau/sil-s are detected.\n",
"48.lab 5.0\n",
"Consecutive pau/sil-s are detected.\n",
"49.lab 2.0\n",
"5.lab 3.0\n",
"Consecutive pau/sil-s are detected.\n",
"end_time 423700000 of the phoneme a and start_time 425100000 of the phoneme br is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"end_time 759000000 of the phoneme w and start_time 759700000 of the phoneme a is not the same. There seems to be a missing phoneme in sinsy_mono_round.\n",
"50.lab 22.0\n",
"Consecutive pau/sil-s are detected.\n",
"51.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"6.lab 3.0\n",
"Consecutive pau/sil-s are detected.\n",
"7.lab 2.0\n",
"Consecutive pau/sil-s are detected.\n",
"8.lab 3.0\n",
"Consecutive pau/sil-s are detected.\n",
"9.lab 2.0\n",
"1\n",
"10\n",
"11\n",
"12\n",
"14\n",
"15\n",
"16\n",
"17\n",
"18\n",
"19\n",
"20\n",
"21\n",
"22\n",
"23\n",
"24\n",
"25\n",
"26\n",
"28\n",
"29\n",
"30\n",
"31\n",
"32\n",
"33\n",
"34\n",
"35\n",
"36\n",
"37\n",
"38\n",
"39\n",
"4\n",
"40\n",
"41\n",
"42\n",
"44\n",
"45\n",
"46\n",
"47\n",
"48\n",
"49\n",
"5\n",
"50\n",
"51\n",
"6\n",
"7\n",
"8\n",
"9\n",
"1\n",
"10\n",
"11\n",
"12\n",
"14\n",
"15\n",
"16\n",
"17\n",
"18\n",
"19\n",
"20\n",
"21\n",
"22\n",
"23\n",
"24\n",
"25\n",
"26\n",
"28\n",
"29\n",
"30\n",
"31\n",
"32\n",
"33\n",
"34\n",
"35\n",
"36\n",
"37\n",
"38\n",
"39\n",
"4\n",
"40\n",
"41\n",
"42\n",
"44\n",
"45\n",
"46\n",
"47\n",
"48\n",
"49\n",
"5\n",
"50\n",
"51\n",
"6\n",
"7\n",
"8\n",
"9\n",
"1\n",
"10\n",
"11\n",
"12\n",
"14\n",
"15\n",
"16\n",
"17\n",
"18\n",
"19\n",
"20\n",
"21\n",
"22\n",
"23\n",
"24\n",
"25\n",
"26\n",
"28\n",
"29\n",
"30\n",
"31\n",
"32\n",
"33\n",
"34\n",
"35\n",
"36\n",
"37\n",
"38\n",
"39\n",
"4\n",
"40\n",
"41\n",
"42\n",
"44\n",
"45\n",
"46\n",
"47\n",
"48\n",
"49\n",
"5\n",
"50\n",
"51\n",
"6\n",
"7\n",
"8\n",
"9\n",
"1.lab: segment duration min 15.23, max 54.43, mean 38.53\n",
"10.lab: segment duration min 6.22, max 15.27, mean 8.76\n",
"11.lab: segment duration min 11.86, max 11.86, mean 11.86\n",
"12.lab: segment duration min 6.72, max 14.53, mean 7.57\n",
"14.lab: segment duration min 5.26, max 12.97, mean 6.63\n",
"15.lab: segment duration min 12.54, max 14.47, mean 13.50\n",
"16.lab: segment duration min 6.76, max 6.80, mean 6.78\n",
"17.lab: segment duration min 6.81, max 16.54, mean 8.11\n",
"18.lab: segment duration min 11.21, max 15.15, mean 13.18\n",
"19.lab: segment duration min 6.82, max 16.89, mean 8.63\n",
"20.lab: segment duration min 7.53, max 12.53, mean 8.45\n",
"21.lab: segment duration min 10.16, max 35.74, mean 20.96\n",
"22.lab: segment duration min 6.43, max 10.13, mean 8.07\n",
"23.lab: segment duration min 5.77, max 10.81, mean 7.45\n",
"24.lab: segment duration min 8.21, max 15.03, mean 9.69\n",
"25.lab: segment duration min 7.09, max 14.65, mean 10.55\n",
"26.lab: segment duration min 6.27, max 19.88, mean 12.41\n",
"28.lab: segment duration min 5.00, max 10.64, mean 7.24\n",
"29.lab: segment duration min 10.73, max 27.42, mean 20.01\n",
"30.lab: segment duration min 8.53, max 18.20, mean 11.92\n",
"31.lab: segment duration min 6.04, max 13.15, mean 8.71\n",
"32.lab: segment duration min 5.58, max 10.44, mean 7.27\n",
"33.lab: segment duration min 7.06, max 13.31, mean 8.88\n",
"34.lab: segment duration min 7.75, max 7.95, mean 7.88\n",
"35.lab: segment duration min 7.20, max 13.09, mean 8.50\n",
"36.lab: segment duration min 6.43, max 15.88, mean 8.21\n",
"37.lab: segment duration min 6.65, max 24.25, mean 10.91\n",
"38.lab: segment duration min 63.35, max 63.35, mean 63.35\n",
"39.lab: segment duration min 6.33, max 9.87, mean 7.67\n",
"4.lab: segment duration min 6.47, max 10.54, mean 7.18\n",
"40.lab: segment duration min 7.59, max 17.44, mean 12.21\n",
"41.lab: segment duration min 8.36, max 29.44, mean 18.90\n",
"42.lab: segment duration min 10.82, max 21.82, mean 12.71\n",
"44.lab: segment duration min 6.47, max 9.17, mean 8.41\n",
"45.lab: segment duration min 5.81, max 27.89, mean 13.82\n",
"46.lab: segment duration min 3.53, max 18.90, mean 13.23\n",
"47.lab: segment duration min 5.54, max 23.07, mean 12.59\n",
"48.lab: segment duration min 3.63, max 16.96, mean 9.54\n",
"49.lab: segment duration min 89.12, max 89.12, mean 89.12\n",
"5.lab: segment duration min 24.65, max 24.65, mean 24.65\n",
"50.lab: segment duration min 6.13, max 27.25, mean 13.38\n",
"51.lab: segment duration min 5.20, max 20.54, mean 11.61\n",
"6.lab: segment duration min 5.17, max 17.80, mean 12.60\n",
"7.lab: segment duration min 11.36, max 12.25, mean 12.00\n",
"8.lab: segment duration min 7.17, max 17.01, mean 9.21\n",
"9.lab: segment duration min 5.99, max 9.49, mean 6.92\n",
"1.lab: segment lengths: 15.23, 54.43, 52.86, 31.62, \n",
"10.lab: segment lengths: 15.27, 6.59, 13.65, 6.39, 6.67, 13.28, 6.46, 6.22, 6.69, 6.41, \n",
"11.lab: segment lengths: 11.86, \n",
"12.lab: segment lengths: 14.53, 6.72, 7.14, 6.93, 7.03, 6.98, 7.09, 6.96, 6.98, 6.98, 7.12, 6.95, 7.03, \n",
"14.lab: segment lengths: 12.97, 5.38, 5.38, 5.26, 5.39, 5.40, \n",
"15.lab: segment lengths: 12.54, 14.47, \n",
"16.lab: segment lengths: 6.79, 6.80, 6.76, \n",
"17.lab: segment lengths: 16.54, 6.81, 6.97, 6.87, 6.86, 6.86, 6.97, 6.98, \n",
"18.lab: segment lengths: 11.21, 15.15, \n",
"19.lab: segment lengths: 16.89, 6.97, 7.05, 6.82, 7.05, 6.98, \n",
"20.lab: segment lengths: 12.53, 12.29, 8.11, 7.83, 7.84, 8.01, 7.94, 7.79, 8.00, 8.06, 7.95, 8.02, 7.94, 7.53, 7.73, 7.67, \n",
"21.lab: segment lengths: 35.74, 10.16, 17.00, \n",
"22.lab: segment lengths: 8.30, 6.46, 10.13, 10.08, 10.07, 6.45, 6.64, 6.43, \n",
"23.lab: segment lengths: 5.77, 5.79, 10.81, \n",
"24.lab: segment lengths: 12.43, 8.21, 15.03, 8.29, 8.24, 10.62, 8.29, 8.22, 10.70, 8.26, 8.29, \n",
"25.lab: segment lengths: 12.92, 10.89, 14.65, 7.09, 7.21, \n",
"26.lab: segment lengths: 15.97, 6.27, 6.35, 19.88, 13.60, \n",
"28.lab: segment lengths: 9.13, 8.01, 5.08, 5.07, 5.08, 5.00, 7.78, 7.96, 10.64, 7.85, 8.06, \n",
"29.lab: segment lengths: 21.89, 27.42, 10.73, \n",
"30.lab: segment lengths: 15.18, 8.53, 9.10, 11.80, 18.20, 8.69, \n",
"31.lab: segment lengths: 13.15, 8.48, 8.61, 8.53, 10.55, 8.45, 8.63, 8.56, 8.88, 6.04, 8.50, 8.58, 8.51, 8.79, 6.37, \n",
"32.lab: segment lengths: 8.53, 6.81, 6.57, 10.44, 6.62, 6.76, 6.82, 5.58, \n",
"33.lab: segment lengths: 13.31, 7.38, 11.12, 7.06, 7.30, 7.13, \n",
"34.lab: segment lengths: 7.88, 7.95, 7.87, 7.75, 7.86, 7.95, 7.85, 7.90, \n",
"35.lab: segment lengths: 13.09, 11.18, 7.26, 7.31, 7.20, 7.26, 7.21, 7.52, \n",
"36.lab: segment lengths: 15.88, 6.60, 6.43, 6.61, 6.88, 6.86, \n",
"37.lab: segment lengths: 9.00, 7.19, 24.25, 10.87, 6.65, 9.71, 8.70, \n",
"38.lab: segment lengths: 63.35, \n",
"39.lab: segment lengths: 8.00, 9.87, 6.33, 6.49, \n",
"4.lab: segment lengths: 10.54, 6.54, 6.82, 6.63, 6.80, 6.47, 6.81, 6.84, \n",
"40.lab: segment lengths: 17.44, 7.59, 11.60, \n",
"41.lab: segment lengths: 8.36, 29.44, \n",
"42.lab: segment lengths: 11.01, 11.25, 10.92, 11.00, 10.82, 21.82, 12.16, \n",
"44.lab: segment lengths: 8.81, 6.47, 8.88, 8.70, 9.17, \n",
"45.lab: segment lengths: 18.97, 7.82, 8.62, 5.81, 27.89, \n",
"46.lab: segment lengths: 14.89, 13.47, 12.62, 12.58, 12.58, 12.33, 3.53, 13.47, 14.82, 18.90, 12.62, 16.97, \n",
"47.lab: segment lengths: 14.26, 12.34, 12.28, 7.84, 16.31, 16.34, 12.27, 12.35, 7.67, 5.54, 12.33, 7.66, 16.02, 23.07, \n",
"48.lab: segment lengths: 12.75, 16.96, 5.18, 13.43, 3.63, 12.04, 5.19, 13.46, 5.33, 6.35, 13.54, 9.07, 10.62, 5.96, \n",
"49.lab: segment lengths: 89.12, \n",
"5.lab: segment lengths: 24.65, \n",
"50.lab: segment lengths: 27.25, 10.29, 6.13, 9.85, \n",
"51.lab: segment lengths: 10.50, 10.19, 20.54, 5.20, \n",
"6.lab: segment lengths: 17.80, 5.17, 16.45, 10.98, \n",
"7.lab: segment lengths: 12.06, 11.50, 12.21, 11.96, 12.18, 12.03, 12.20, 11.93, 12.12, 11.36, 12.21, 12.25, \n",
"8.lab: segment lengths: 17.01, 7.17, 7.29, 7.21, 7.39, \n",
"9.lab: segment lengths: 6.09, 5.99, 6.10, 9.49, \n",
"Segmentation stats: min 3.53, max 89.12, mean 10.80\n",
"Total number of segments: 294\n",
"Prepare data for time-lag models\n",
" 0% 0/46 [00:00<?, ?it/s]1: Global offset (in sec): -0.045\n",
"1_seg0.lab offset (in sec): -0.034999999999999996\n",
"1.lab: 1/61 time-lags are excluded.\n",
"1_seg1.lab offset (in sec): -0.045\n",
"1.lab: 6/167 time-lags are excluded.\n",
"1_seg2.lab offset (in sec): -0.045\n",
"1.lab: 5/168 time-lags are excluded.\n",
"1_seg3.lab offset (in sec): -0.055\n",
"1.lab: 5/107 time-lags are excluded.\n",
"10: Global offset (in sec): -0.09999999999999999\n",
"10_seg0.lab offset (in sec): -0.08499999999999999\n",
"10_seg1.lab offset (in sec): -0.105\n",
"10_seg2.lab offset (in sec): -0.09999999999999999\n",
"10.lab: 6/26 time-lags are excluded.\n",
"10_seg3.lab offset (in sec): -0.08\n",
"10.lab: 1/10 time-lags are excluded.\n",
"10_seg4.lab offset (in sec): -0.13999999999999999\n",
"10.lab: 1/10 time-lags are excluded.\n",
"10_seg5.lab offset (in sec): -0.08\n",
"10.lab: 5/25 time-lags are excluded.\n",
"10_seg6.lab offset (in sec): -0.13999999999999999\n",
"10.lab: 2/10 time-lags are excluded.\n",
"10_seg7.lab offset (in sec): -0.11\n",
"10.lab: 3/10 time-lags are excluded.\n",
"10_seg8.lab offset (in sec): -0.11\n",
"10.lab: 4/15 time-lags are excluded.\n",
"10_seg9.lab offset (in sec): -0.11\n",
"10.lab: 2/11 time-lags are excluded.\n",
"11: Global offset (in sec): -2.175\n",
"11_seg0.lab offset (in sec): -2.17\n",
"11.lab: 6/39 time-lags are excluded.\n",
"12: Global offset (in sec): -3.88\n",
"12_seg0.lab offset (in sec): -3.8649999999999998\n",
"12.lab: 1/18 time-lags are excluded.\n",
"12_seg1.lab offset (in sec): -3.895\n",
"12_seg2.lab offset (in sec): -3.8899999999999997\n",
"12.lab: 1/12 time-lags are excluded.\n",
"12_seg3.lab offset (in sec): -3.8899999999999997\n",
"12_seg4.lab offset (in sec): -3.875\n",
"12_seg5.lab offset (in sec): -3.8699999999999997\n",
"12_seg6.lab offset (in sec): -3.8899999999999997\n",
"12.lab: 3/12 time-lags are excluded.\n",
"12_seg7.lab offset (in sec): -3.8699999999999997\n",
"12_seg8.lab offset (in sec): -3.8699999999999997\n",
"12.lab: 1/18 time-lags are excluded.\n",
"12_seg9.lab offset (in sec): -3.885\n",
"12_seg10.lab offset (in sec): -3.8899999999999997\n",
"12.lab: 2/12 time-lags are excluded.\n",
"12_seg11.lab offset (in sec): -3.88\n",
"12_seg12.lab offset (in sec): -3.8649999999999998\n",
"14: Global offset (in sec): -1.635\n",
"14_seg0.lab offset (in sec): -1.6199999999999999\n",
"14.lab: 3/16 time-lags are excluded.\n",
"14_seg1.lab offset (in sec): -1.645\n",
"14.lab: 1/13 time-lags are excluded.\n",
"14_seg2.lab offset (in sec): -1.6099999999999999\n",
"14.lab: 3/15 time-lags are excluded.\n",
"14_seg3.lab offset (in sec): -1.65\n",
"14_seg4.lab offset (in sec): -1.6099999999999999\n",
"14.lab: 2/15 time-lags are excluded.\n",
"14_seg5.lab offset (in sec): -1.645\n",
"15: Global offset (in sec): -1.98\n",
"15_seg0.lab offset (in sec): -1.9749999999999999\n",
"15.lab: 2/17 time-lags are excluded.\n",
"15_seg1.lab offset (in sec): -1.9849999999999999\n",
"15.lab: 3/38 time-lags are excluded.\n",
"16: Global offset (in sec): -1.865\n",
"16_seg0.lab offset (in sec): -1.855\n",
"16_seg1.lab offset (in sec): -1.8699999999999999\n",
"16.lab: 1/27 time-lags are excluded.\n",
"16_seg2.lab offset (in sec): -1.865\n",
"16.lab: 1/27 time-lags are excluded.\n",
"17: Global offset (in sec): -2.04\n",
"17_seg0.lab offset (in sec): -2.05\n",
"17_seg1.lab offset (in sec): -2.0349999999999997\n",
"17.lab: 2/12 time-lags are excluded.\n",
"17_seg2.lab offset (in sec): -2.045\n",
"17.lab: 1/12 time-lags are excluded.\n",
"17_seg3.lab offset (in sec): -2.05\n",
"17.lab: 2/12 time-lags are excluded.\n",
"17_seg4.lab offset (in sec): -2.03\n",
"17.lab: 3/12 time-lags are excluded.\n",
"17_seg5.lab offset (in sec): -2.06\n",
"17.lab: 1/12 time-lags are excluded.\n",
"17_seg6.lab offset (in sec): -2.02\n",
"17.lab: 2/12 time-lags are excluded.\n",
"17_seg7.lab offset (in sec): -2.03\n",
"17.lab: 2/12 time-lags are excluded.\n",
"18: Global offset (in sec): -0.09999999999999999\n",
"18_seg0.lab offset (in sec): -0.08499999999999999\n",
"18_seg1.lab offset (in sec): -0.11\n",
"18.lab: 3/40 time-lags are excluded.\n",
" 20% 9/46 [00:00<00:00, 89.02it/s]19: Global offset (in sec): -2.045\n",
"19_seg0.lab offset (in sec): -2.03\n",
"19_seg1.lab offset (in sec): -2.045\n",
"19.lab: 4/13 time-lags are excluded.\n",
"19_seg2.lab offset (in sec): -2.045\n",
"19.lab: 1/14 time-lags are excluded.\n",
"19_seg3.lab offset (in sec): -2.045\n",
"19.lab: 2/14 time-lags are excluded.\n",
"19_seg4.lab offset (in sec): -2.045\n",
"19_seg5.lab offset (in sec): -2.0549999999999997\n",
"19.lab: 1/15 time-lags are excluded.\n",
"20: Global offset (in sec): -2.295\n",
"20_seg0.lab offset (in sec): -2.245\n",
"20_seg1.lab offset (in sec): -2.315\n",
"20.lab: 3/19 time-lags are excluded.\n",
"20_seg2.lab offset (in sec): -2.27\n",
"20.lab: 3/13 time-lags are excluded.\n",
"20_seg3.lab offset (in sec): -2.29\n",
"20.lab: 3/13 time-lags are excluded.\n",
"20_seg4.lab offset (in sec): -2.275\n",
"20_seg5.lab offset (in sec): -2.3249999999999997\n",
"20_seg6.lab offset (in sec): -2.26\n",
"20.lab: 3/13 time-lags are excluded.\n",
"20_seg7.lab offset (in sec): -2.3\n",
"20_seg8.lab offset (in sec): -2.28\n",
"20_seg9.lab offset (in sec): -2.3\n",
"20_seg10.lab offset (in sec): -2.28\n",
"20_seg11.lab offset (in sec): -2.295\n",
"20.lab: 2/13 time-lags are excluded.\n",
"20_seg12.lab offset (in sec): -2.3049999999999997\n",
"20.lab: 2/13 time-lags are excluded.\n",
"20_seg13.lab offset (in sec): -2.27\n",
"20.lab: 3/13 time-lags are excluded.\n",
"20_seg14.lab offset (in sec): -2.26\n",
"20.lab: 2/13 time-lags are excluded.\n",
"20_seg15.lab offset (in sec): -2.28\n",
"20.lab: 2/13 time-lags are excluded.\n",
"21: Global offset (in sec): -1.8299999999999998\n",
"21_seg0.lab offset (in sec): -1.8199999999999998\n",
"21.lab: 5/67 time-lags are excluded.\n",
"21_seg1.lab offset (in sec): -1.845\n",
"21.lab: 3/25 time-lags are excluded.\n",
"21_seg2.lab offset (in sec): -1.8099999999999998\n",
"21.lab: 2/41 time-lags are excluded.\n",
"22: Global offset (in sec): -1.865\n",
"22_seg0.lab offset (in sec): -1.8499999999999999\n",
"22.lab: 2/12 time-lags are excluded.\n",
"22_seg1.lab offset (in sec): -1.89\n",
"22.lab: 1/25 time-lags are excluded.\n",
"22_seg2.lab offset (in sec): -1.865\n",
"22.lab: 2/36 time-lags are excluded.\n",
"22_seg3.lab offset (in sec): -1.875\n",
"22.lab: 1/35 time-lags are excluded.\n",
"22_seg4.lab offset (in sec): -1.88\n",
"22.lab: 1/36 time-lags are excluded.\n",
"22_seg5.lab offset (in sec): -1.8499999999999999\n",
"22.lab: 1/23 time-lags are excluded.\n",
"22_seg6.lab offset (in sec): -1.8399999999999999\n",
"22.lab: 1/25 time-lags are excluded.\n",
"22_seg7.lab offset (in sec): -1.865\n",
"22.lab: 1/26 time-lags are excluded.\n",
"23: Global offset (in sec): -2.23\n",
"23_seg0.lab offset (in sec): -2.235\n",
"23.lab: 1/13 time-lags are excluded.\n",
"23_seg1.lab offset (in sec): -2.225\n",
"23.lab: 3/28 time-lags are excluded.\n",
"23_seg2.lab offset (in sec): -2.2199999999999998\n",
"23.lab: 4/37 time-lags are excluded.\n",
"24: Global offset (in sec): -1.2\n",
"24_seg0.lab offset (in sec): -1.22\n",
"24.lab: 2/21 time-lags are excluded.\n",
"24_seg1.lab offset (in sec): -1.19\n",
"24_seg2.lab offset (in sec): -1.185\n",
"24.lab: 2/21 time-lags are excluded.\n",
"24_seg3.lab offset (in sec): -1.1949999999999998\n",
"24.lab: 1/20 time-lags are excluded.\n",
"24_seg4.lab offset (in sec): -1.19\n",
"24.lab: 1/22 time-lags are excluded.\n",
"24_seg5.lab offset (in sec): -1.18\n",
"24.lab: 4/22 time-lags are excluded.\n",
"24_seg6.lab offset (in sec): -1.2049999999999998\n",
"24.lab: 1/21 time-lags are excluded.\n",
"24_seg7.lab offset (in sec): -1.19\n",
"24.lab: 3/22 time-lags are excluded.\n",
"24_seg8.lab offset (in sec): -1.19\n",
"24.lab: 1/23 time-lags are excluded.\n",
"24_seg9.lab offset (in sec): -1.2\n",
"24_seg10.lab offset (in sec): -1.1949999999999998\n",
"24.lab: 2/21 time-lags are excluded.\n",
"25: Global offset (in sec): -2.015\n",
"25_seg0.lab offset (in sec): -2.005\n",
"25_seg1.lab offset (in sec): -2.01\n",
"25.lab: 1/25 time-lags are excluded.\n",
"25_seg2.lab offset (in sec): -2.02\n",
"25.lab: 3/29 time-lags are excluded.\n",
"25_seg3.lab offset (in sec): -2.005\n",
"25.lab: 1/18 time-lags are excluded.\n",
"25_seg4.lab offset (in sec): -1.9849999999999999\n",
"26: Global offset (in sec): -0.09\n",
"26_seg0.lab offset (in sec): -0.065\n",
"26.lab: 1/16 time-lags are excluded.\n",
"26_seg1.lab offset (in sec): -0.09999999999999999\n",
"26.lab: 1/15 time-lags are excluded.\n",
"26_seg2.lab offset (in sec): -0.11499999999999999\n",
"26.lab: 1/16 time-lags are excluded.\n",
"26_seg3.lab offset (in sec): -0.08\n",
"26.lab: 2/49 time-lags are excluded.\n",
"26_seg4.lab offset (in sec): -0.09\n",
"26.lab: 1/33 time-lags are excluded.\n",
"28: Global offset (in sec): -2.885\n",
"28_seg0.lab offset (in sec): -2.915\n",
"28_seg1.lab offset (in sec): -2.8899999999999997\n",
"28_seg2.lab offset (in sec): -2.885\n",
"28.lab: 2/8 time-lags are excluded.\n",
"28_seg3.lab offset (in sec): -2.895\n",
"28.lab: 1/8 time-lags are excluded.\n",
"28_seg4.lab offset (in sec): -2.8699999999999997\n",
"28_seg5.lab offset (in sec): -2.875\n",
"28_seg6.lab offset (in sec): -2.8449999999999998\n",
"28_seg7.lab offset (in sec): -2.895\n",
"28.lab: 3/12 time-lags are excluded.\n",
"28_seg8.lab offset (in sec): -2.905\n",
"28.lab: 1/17 time-lags are excluded.\n",
"28_seg9.lab offset (in sec): -2.8899999999999997\n",
"28.lab: 5/12 time-lags are excluded.\n",
"28_seg10.lab offset (in sec): -2.875\n",
"28.lab: 3/10 time-lags are excluded.\n",
" 39% 18/46 [00:00<00:00, 85.33it/s]29: Global offset (in sec): -2.8899999999999997\n",
"29_seg0.lab offset (in sec): -2.885\n",
"29.lab: 1/49 time-lags are excluded.\n",
"29_seg1.lab offset (in sec): -2.8899999999999997\n",
"29.lab: 3/82 time-lags are excluded.\n",
"29_seg2.lab offset (in sec): -2.8899999999999997\n",
"29.lab: 1/32 time-lags are excluded.\n",
"30.lab: 1/28 time-lags are excluded.\n",
"30.lab: 2/30 time-lags are excluded.\n",
"30.lab: 3/42 time-lags are excluded.\n",
"30.lab: 2/57 time-lags are excluded.\n",
"30.lab: 1/17 time-lags are excluded.\n",
"31: Global offset (in sec): -2.385\n",
"31_seg0.lab offset (in sec): -2.375\n",
"31.lab: 1/8 time-lags are excluded.\n",
"31_seg1.lab offset (in sec): -2.395\n",
"31.lab: 3/24 time-lags are excluded.\n",
"31_seg2.lab offset (in sec): -2.38\n",
"31.lab: 2/23 time-lags are excluded.\n",
"31_seg3.lab offset (in sec): -2.385\n",
"31.lab: 3/27 time-lags are excluded.\n",
"31_seg4.lab offset (in sec): -2.375\n",
"31.lab: 1/31 time-lags are excluded.\n",
"31_seg5.lab offset (in sec): -2.38\n",
"31.lab: 1/20 time-lags are excluded.\n",
"31_seg6.lab offset (in sec): -2.4\n",
"31_seg7.lab offset (in sec): -2.3649999999999998\n",
"31.lab: 1/26 time-lags are excluded.\n",
"31_seg8.lab offset (in sec): -2.385\n",
"31.lab: 1/28 time-lags are excluded.\n",
"31_seg9.lab offset (in sec): -2.395\n",
"31_seg10.lab offset (in sec): -2.38\n",
"31.lab: 2/20 time-lags are excluded.\n",
"31_seg11.lab offset (in sec): -2.405\n",
"31.lab: 3/23 time-lags are excluded.\n",
"31_seg12.lab offset (in sec): -2.38\n",
"31_seg13.lab offset (in sec): -2.385\n",
"31.lab: 5/28 time-lags are excluded.\n",
"31_seg14.lab offset (in sec): -2.385\n",
"31.lab: 1/16 time-lags are excluded.\n",
"32: Global offset (in sec): -1.91\n",
"32_seg0.lab offset (in sec): -1.93\n",
"32.lab: 4/25 time-lags are excluded.\n",
"32_seg1.lab offset (in sec): -1.9\n",
"32.lab: 1/25 time-lags are excluded.\n",
"32_seg2.lab offset (in sec): -1.9149999999999998\n",
"32_seg3.lab offset (in sec): -1.91\n",
"32.lab: 1/39 time-lags are excluded.\n",
"32_seg4.lab offset (in sec): -1.92\n",
"32_seg5.lab offset (in sec): -1.9049999999999998\n",
"32.lab: 1/21 time-lags are excluded.\n",
"32_seg6.lab offset (in sec): -1.885\n",
"32.lab: 3/27 time-lags are excluded.\n",
"32_seg7.lab offset (in sec): -1.91\n",
"33: Global offset (in sec): -2.065\n",
"33_seg0.lab offset (in sec): -2.0549999999999997\n",
"33_seg1.lab offset (in sec): -2.0549999999999997\n",
"33.lab: 1/18 time-lags are excluded.\n",
"33_seg2.lab offset (in sec): -2.065\n",
"33.lab: 2/27 time-lags are excluded.\n",
"33_seg3.lab offset (in sec): -2.08\n",
"33_seg4.lab offset (in sec): -2.065\n",
"33.lab: 2/25 time-lags are excluded.\n",
"33_seg5.lab offset (in sec): -2.04\n",
"33.lab: 2/14 time-lags are excluded.\n",
"34: Global offset (in sec): -2.19\n",
"34_seg0.lab offset (in sec): -2.1799999999999997\n",
"34_seg1.lab offset (in sec): -2.1799999999999997\n",
"34_seg2.lab offset (in sec): -2.205\n",
"34.lab: 2/25 time-lags are excluded.\n",
"34_seg3.lab offset (in sec): -2.19\n",
"34.lab: 1/27 time-lags are excluded.\n",
"34_seg4.lab offset (in sec): -2.185\n",
"34.lab: 1/24 time-lags are excluded.\n",
"34_seg5.lab offset (in sec): -2.19\n",
"34_seg6.lab offset (in sec): -2.195\n",
"34.lab: 3/26 time-lags are excluded.\n",
"34_seg7.lab offset (in sec): -2.19\n",
"34.lab: 1/27 time-lags are excluded.\n",
" 52% 24/46 [00:00<00:00, 72.42it/s]35: Global offset (in sec): -2.015\n",
"35_seg0.lab offset (in sec): -2.005\n",
"35_seg1.lab offset (in sec): -2.01\n",
"35.lab: 5/42 time-lags are excluded.\n",
"35_seg2.lab offset (in sec): -2.02\n",
"35.lab: 2/30 time-lags are excluded.\n",
"35_seg3.lab offset (in sec): -1.9949999999999999\n",
"35.lab: 1/28 time-lags are excluded.\n",
"35_seg4.lab offset (in sec): -2.015\n",
"35.lab: 1/30 time-lags are excluded.\n",
"35_seg5.lab offset (in sec): -2.02\n",
"35.lab: 1/29 time-lags are excluded.\n",
"35_seg6.lab offset (in sec): -2.015\n",
"35.lab: 1/30 time-lags are excluded.\n",
"35_seg7.lab offset (in sec): -2.01\n",
"35.lab: 3/31 time-lags are excluded.\n",
"36: Global offset (in sec): -2.0\n",
"36_seg0.lab offset (in sec): -1.9949999999999999\n",
"36_seg1.lab offset (in sec): -2.005\n",
"36_seg2.lab offset (in sec): -2.015\n",
"36.lab: 1/15 time-lags are excluded.\n",
"36_seg3.lab offset (in sec): -2.02\n",
"36_seg4.lab offset (in sec): -1.98\n",
"36_seg5.lab offset (in sec): -1.9949999999999999\n",
"36.lab: 1/15 time-lags are excluded.\n",
"37: Global offset (in sec): -1.9549999999999998\n",
"37_seg0.lab offset (in sec): -1.9649999999999999\n",
"37_seg1.lab offset (in sec): -1.94\n",
"37.lab: 1/20 time-lags are excluded.\n",
"37_seg2.lab offset (in sec): -1.94\n",
"37.lab: 4/64 time-lags are excluded.\n",
"37_seg3.lab offset (in sec): -1.9549999999999998\n",
"37.lab: 1/29 time-lags are excluded.\n",
"37_seg4.lab offset (in sec): -1.94\n",
"37.lab: 1/10 time-lags are excluded.\n",
"37_seg5.lab offset (in sec): -1.97\n",
"37.lab: 2/29 time-lags are excluded.\n",
"37_seg6.lab offset (in sec): -1.9649999999999999\n",
"38: Global offset (in sec): -1.91\n",
"38_seg0.lab offset (in sec): -1.9\n",
"38.lab: 30/225 time-lags are excluded.\n",
"39: Global offset (in sec): -1.825\n",
"39_seg0.lab offset (in sec): -1.835\n",
"39_seg1.lab offset (in sec): -1.8199999999999998\n",
"39.lab: 2/35 time-lags are excluded.\n",
"39_seg2.lab offset (in sec): -1.8199999999999998\n",
"39_seg3.lab offset (in sec): -1.8299999999999998\n",
"4: Global offset (in sec): -1.9349999999999998\n",
"4_seg0.lab offset (in sec): -1.9449999999999998\n",
"4_seg1.lab offset (in sec): -1.92\n",
"4.lab: 1/16 time-lags are excluded.\n",
"4_seg2.lab offset (in sec): -1.9649999999999999\n",
"4.lab: 1/15 time-lags are excluded.\n",
"4_seg3.lab offset (in sec): -1.9349999999999998\n",
"4.lab: 2/16 time-lags are excluded.\n",
"4_seg4.lab offset (in sec): -1.93\n",
"4_seg5.lab offset (in sec): -1.9249999999999998\n",
"4.lab: 2/15 time-lags are excluded.\n",
"4_seg6.lab offset (in sec): -1.94\n",
"4_seg7.lab offset (in sec): -1.9149999999999998\n",
"4.lab: 1/15 time-lags are excluded.\n",
"40: Global offset (in sec): -2.085\n",
"40_seg0.lab offset (in sec): -2.08\n",
"40_seg1.lab offset (in sec): -2.085\n",
"40.lab: 1/25 time-lags are excluded.\n",
"40_seg2.lab offset (in sec): -2.08\n",
"40.lab: 1/40 time-lags are excluded.\n",
"41: Global offset (in sec): -1.385\n",
"41_seg0.lab offset (in sec): -1.385\n",
"41_seg1.lab offset (in sec): -1.385\n",
"41.lab: 3/104 time-lags are excluded.\n",
"42: Global offset (in sec): -3.02\n",
"42_seg0.lab offset (in sec): -3.01\n",
"42.lab: 1/33 time-lags are excluded.\n",
"42_seg1.lab offset (in sec): -3.0149999999999997\n",
"42_seg2.lab offset (in sec): -3.01\n",
"42.lab: 2/33 time-lags are excluded.\n",
"42_seg3.lab offset (in sec): -3.0\n",
"42_seg4.lab offset (in sec): -3.045\n",
"42_seg5.lab offset (in sec): -3.03\n",
"42.lab: 2/58 time-lags are excluded.\n",
"42_seg6.lab offset (in sec): -3.02\n",
"42.lab: 2/25 time-lags are excluded.\n",
" 72% 33/46 [00:00<00:00, 75.56it/s]44: Global offset (in sec): -2.4899999999999998\n",
"44_seg0.lab offset (in sec): -2.465\n",
"44_seg1.lab offset (in sec): -2.485\n",
"44.lab: 1/17 time-lags are excluded.\n",
"44_seg2.lab offset (in sec): -2.485\n",
"44_seg3.lab offset (in sec): -2.5\n",
"44.lab: 1/24 time-lags are excluded.\n",
"44_seg4.lab offset (in sec): -2.5\n",
"44.lab: 3/25 time-lags are excluded.\n",
"45: Global offset (in sec): -1.8099999999999998\n",
"45_seg0.lab offset (in sec): -1.7999999999999998\n",
"45.lab: 2/44 time-lags are excluded.\n",
"45_seg1.lab offset (in sec): -1.7899999999999998\n",
"45_seg2.lab offset (in sec): -1.7999999999999998\n",
"45.lab: 1/26 time-lags are excluded.\n",
"45_seg3.lab offset (in sec): -1.8099999999999998\n",
"45_seg4.lab offset (in sec): -1.815\n",
"45.lab: 2/78 time-lags are excluded.\n",
"46: Global offset (in sec): -0.049999999999999996\n",
"46_seg0.lab offset (in sec): -0.055\n",
"46_seg1.lab offset (in sec): -0.045\n",
"46_seg2.lab offset (in sec): -0.045\n",
"46_seg3.lab offset (in sec): -0.04\n",
"46_seg4.lab offset (in sec): -0.06\n",
"46_seg5.lab offset (in sec): -0.065\n",
"46_seg6.lab offset (in sec): -0.08499999999999999\n",
"46_seg7.lab offset (in sec): -0.049999999999999996\n",
"46_seg8.lab offset (in sec): -0.055\n",
"46.lab: 1/57 time-lags are excluded.\n",
"46_seg9.lab offset (in sec): -0.04\n",
"46_seg10.lab offset (in sec): -0.055\n",
"46_seg11.lab offset (in sec): -0.055\n",
"46.lab: 2/49 time-lags are excluded.\n",
"47: Global offset (in sec): -1.76\n",
"47_seg0.lab offset (in sec): -1.73\n",
"47.lab: 1/43 time-lags are excluded.\n",
"47_seg1.lab offset (in sec): -1.77\n",
"47.lab: 2/40 time-lags are excluded.\n",
"47_seg2.lab offset (in sec): -1.785\n",
"47.lab: 1/20 time-lags are excluded.\n",
"47_seg3.lab offset (in sec): -1.7899999999999998\n",
"47_seg4.lab offset (in sec): -1.755\n",
"47.lab: 3/45 time-lags are excluded.\n",
"47_seg5.lab offset (in sec): -1.7449999999999999\n",
"47.lab: 2/40 time-lags are excluded.\n",
"47_seg6.lab offset (in sec): -1.755\n",
"47.lab: 3/42 time-lags are excluded.\n",
"47_seg7.lab offset (in sec): -1.7999999999999998\n",
"47.lab: 1/20 time-lags are excluded.\n",
"47_seg8.lab offset (in sec): -1.795\n",
"47.lab: 1/13 time-lags are excluded.\n",
"47_seg9.lab offset (in sec): -1.74\n",
"47_seg10.lab offset (in sec): -1.785\n",
"47_seg11.lab offset (in sec): -1.775\n",
"47.lab: 1/13 time-lags are excluded.\n",
"47_seg12.lab offset (in sec): -1.765\n",
"47.lab: 3/44 time-lags are excluded.\n",
"47_seg13.lab offset (in sec): -1.75\n",
"47.lab: 1/51 time-lags are excluded.\n",
"48: Global offset (in sec): -1.4849999999999999\n",
"48_seg0.lab offset (in sec): -1.48\n",
"48.lab: 2/53 time-lags are excluded.\n",
"48_seg1.lab offset (in sec): -1.4849999999999999\n",
"48_seg2.lab offset (in sec): -1.535\n",
"48_seg3.lab offset (in sec): -1.4749999999999999\n",
"48.lab: 6/59 time-lags are excluded.\n",
"48_seg4.lab offset (in sec): -1.48\n",
"48_seg5.lab offset (in sec): -1.48\n",
"48.lab: 1/54 time-lags are excluded.\n",
"48_seg6.lab offset (in sec): -1.515\n",
"48_seg7.lab offset (in sec): -1.4749999999999999\n",
"48.lab: 2/59 time-lags are excluded.\n",
"48_seg8.lab offset (in sec): -1.4849999999999999\n",
"48_seg9.lab offset (in sec): -1.48\n",
"48_seg10.lab offset (in sec): -1.4849999999999999\n",
"48.lab: 1/51 time-lags are excluded.\n",
"48_seg11.lab offset (in sec): -1.47\n",
"48.lab: 1/49 time-lags are excluded.\n",
"48_seg12.lab offset (in sec): -1.48\n",
"48.lab: 4/52 time-lags are excluded.\n",
"48_seg13.lab offset (in sec): -1.4849999999999999\n",
"49: Global offset (in sec): -3.445\n",
"49_seg0.lab offset (in sec): -3.44\n",
"49.lab: 37/160 time-lags are excluded.\n",
" 85% 39/46 [00:00<00:00, 63.19it/s]5: Global offset (in sec): -0.06\n",
"5_seg0.lab offset (in sec): -0.065\n",
"5.lab: 1/70 time-lags are excluded.\n",
"50: Global offset (in sec): -1.795\n",
"50_seg0.lab offset (in sec): -1.785\n",
"50.lab: 3/76 time-lags are excluded.\n",
"50_seg1.lab offset (in sec): -1.795\n",
"50_seg2.lab offset (in sec): -1.78\n",
"50_seg3.lab offset (in sec): -1.7999999999999998\n",
"51: Global offset (in sec): -1.4\n",
"51_seg0.lab offset (in sec): -1.395\n",
"51_seg1.lab offset (in sec): -1.4\n",
"51.lab: 1/32 time-lags are excluded.\n",
"51_seg2.lab offset (in sec): -1.405\n",
"51.lab: 2/63 time-lags are excluded.\n",
"51_seg3.lab offset (in sec): -1.395\n",
"6: Global offset (in sec): -3.9949999999999997\n",
"6_seg0.lab offset (in sec): -2.855\n",
"6.lab: 26/32 time-lags are excluded.\n",
"6_seg1.lab offset (in sec): -3.4099999999999997\n",
"6.lab: 8/14 time-lags are excluded.\n",
"6_seg2.lab offset (in sec): -4.16\n",
"6.lab: 40/50 time-lags are excluded.\n",
"6_seg3.lab offset (in sec): -5.12\n",
"6.lab: 24/32 time-lags are excluded.\n",
"7: Global offset (in sec): -0.13999999999999999\n",
"7_seg0.lab offset (in sec): -0.095\n",
"7.lab: 1/15 time-lags are excluded.\n",
"7_seg1.lab offset (in sec): -0.11\n",
"7.lab: 4/15 time-lags are excluded.\n",
"7_seg2.lab offset (in sec): -0.155\n",
"7.lab: 3/14 time-lags are excluded.\n",
"7_seg3.lab offset (in sec): -0.13999999999999999\n",
"7.lab: 3/14 time-lags are excluded.\n",
"7_seg4.lab offset (in sec): -0.12\n",
"7.lab: 1/16 time-lags are excluded.\n",
"7_seg5.lab offset (in sec): -0.16\n",
"7.lab: 2/14 time-lags are excluded.\n",
"7_seg6.lab offset (in sec): -0.12\n",
"7.lab: 3/14 time-lags are excluded.\n",
"7_seg7.lab offset (in sec): -0.125\n",
"7.lab: 3/14 time-lags are excluded.\n",
"7_seg8.lab offset (in sec): -0.13499999999999998\n",
"7.lab: 3/15 time-lags are excluded.\n",
"7_seg9.lab offset (in sec): -0.16499999999999998\n",
"7.lab: 2/14 time-lags are excluded.\n",
"7_seg10.lab offset (in sec): -0.185\n",
"7.lab: 1/14 time-lags are excluded.\n",
"7_seg11.lab offset (in sec): -0.13999999999999999\n",
"7.lab: 5/14 time-lags are excluded.\n",
"8: Global offset (in sec): -1.9849999999999999\n",
"8_seg0.lab offset (in sec): -1.98\n",
"8.lab: 4/75 time-lags are excluded.\n",
"8_seg1.lab offset (in sec): -2.025\n",
"8.lab: 2/29 time-lags are excluded.\n",
"8_seg2.lab offset (in sec): -1.98\n",
"8_seg3.lab offset (in sec): -2.015\n",
"8.lab: 1/27 time-lags are excluded.\n",
"8_seg4.lab offset (in sec): -1.9649999999999999\n",
"8.lab: 2/41 time-lags are excluded.\n",
"9: Global offset (in sec): -3.36\n",
"9_seg0.lab offset (in sec): -3.34\n",
"9.lab: 1/13 time-lags are excluded.\n",
"9_seg1.lab offset (in sec): -3.355\n",
"9.lab: 1/26 time-lags are excluded.\n",
"9_seg2.lab offset (in sec): -3.36\n",
"9.lab: 3/25 time-lags are excluded.\n",
"9_seg3.lab offset (in sec): -3.375\n",
"9.lab: 1/39 time-lags are excluded.\n",
"100% 46/46 [00:00<00:00, 71.58it/s]\n",
"Prepare data for duration models\n",
"100% 46/46 [00:00<00:00, 340.55it/s]\n",
"Prepare data for acoustic models\n",
"100% 46/46 [00:47<00:00, 1.04s/it]\n",
"train/dev/eval split\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FKCYftpSV8gN",
"colab_type": "text"
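},
"source": [
"## Running stage 1 (extracting time-lag, duration, linguistic, and acoustic features)\n",
"\n",
"This stage extracts the time-lag, duration, linguistic, and acoustic features needed for singing voice synthesis. It also takes a considerable amount of time, so if you plan to run singing voice synthesis experiments repeatedly, it may be worth moving the recipe under Google Drive so that the data survives even when the Colaboratory instance is reset."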
]
},
{
"cell_type": "code",
"metadata": {
"id": "tmI-uuTPGxxO",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "00eeb4f0-0f1a-41a8-a35b-5d234ec9e954"
},
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 1 --stop-stage 1"
],
"execution_count": 13,
"outputs": [
{
"output_type": "stream",
"text": [
"stage 1: Feature generation\n",
"[\u001b[36m2020-07-08 12:49:55,434\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - acoustic:\n",
" enabled: true\n",
" f0_ceil: 650\n",
" f0_floor: 90\n",
" frame_period: 5\n",
" interp_unvoiced_aperiodicity: true\n",
" label_dir: data/acoustic/label_phone_align\n",
" mgc_order: 59\n",
" num_windows: 3\n",
" question_path: null\n",
" relative_f0: true\n",
" subphone_features: coarse_coding\n",
" use_harvest: true\n",
" wav_dir: data/acoustic/wav\n",
"duration:\n",
" enabled: true\n",
" label_dir: data/duration/label_phone_align\n",
" question_path: null\n",
"log_f0_conditioning: true\n",
"out_dir: dump/natsumeyuuri/org/train_no_dev/\n",
"question_path: ./conf/jp_qst001_nnsvs.hed\n",
"timelag:\n",
" enabled: true\n",
" label_phone_align_dir: data/timelag/label_phone_align\n",
" label_phone_score_dir: data/timelag/label_phone_score\n",
" question_path: null\n",
"utt_list: data/list/train_no_dev.list\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 12:49:55,487\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/train_no_dev/in_timelag\u001b[0m\n",
"[\u001b[36m2020-07-08 12:49:55,488\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/train_no_dev/out_timelag\u001b[0m\n",
"[\u001b[36m2020-07-08 12:49:55,488\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/train_no_dev/in_duration\u001b[0m\n",
"[\u001b[36m2020-07-08 12:49:55,488\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/train_no_dev/out_duration\u001b[0m\n",
"[\u001b[36m2020-07-08 12:49:55,488\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/train_no_dev/in_acoustic\u001b[0m\n",
"[\u001b[36m2020-07-08 12:49:55,488\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/train_no_dev/out_acoustic\u001b[0m\n",
"[\u001b[36m2020-07-08 12:49:55,496\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Timelag linguistic feature dim: 420\u001b[0m\n",
"[\u001b[36m2020-07-08 12:49:55,496\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Timelag feature dim: 1\u001b[0m\n",
"100% 286/286 [00:05<00:00, 56.70it/s]\n",
"[\u001b[36m2020-07-08 12:50:00,553\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Duration linguistic feature dim: 420\u001b[0m\n",
"[\u001b[36m2020-07-08 12:50:00,554\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Duration feature dim: 1\u001b[0m\n",
"100% 286/286 [00:09<00:00, 30.96it/s]\n",
"[\u001b[36m2020-07-08 12:50:09,829\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Acoustic linguistic feature dim: 424\u001b[0m\n",
"[\u001b[36m2020-07-08 12:50:14,844\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Acoustic feature dim: 199\u001b[0m\n",
"100% 286/286 [28:02<00:00, 5.88s/it]\n",
"[\u001b[36m2020-07-08 13:18:19,806\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - acoustic:\n",
" enabled: true\n",
" f0_ceil: 650\n",
" f0_floor: 90\n",
" frame_period: 5\n",
" interp_unvoiced_aperiodicity: true\n",
" label_dir: data/acoustic/label_phone_align\n",
" mgc_order: 59\n",
" num_windows: 3\n",
" question_path: null\n",
" relative_f0: true\n",
" subphone_features: coarse_coding\n",
" use_harvest: true\n",
" wav_dir: data/acoustic/wav\n",
"duration:\n",
" enabled: true\n",
" label_dir: data/duration/label_phone_align\n",
" question_path: null\n",
"log_f0_conditioning: true\n",
"out_dir: dump/natsumeyuuri/org/dev/\n",
"question_path: ./conf/jp_qst001_nnsvs.hed\n",
"timelag:\n",
" enabled: true\n",
" label_phone_align_dir: data/timelag/label_phone_align\n",
" label_phone_score_dir: data/timelag/label_phone_score\n",
" question_path: null\n",
"utt_list: data/list/dev.list\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:19,854\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/dev/in_timelag\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:19,854\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/dev/out_timelag\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:19,854\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/dev/in_duration\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:19,855\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/dev/out_duration\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:19,855\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/dev/in_acoustic\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:19,855\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/dev/out_acoustic\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:19,874\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Timelag linguistic feature dim: 420\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:19,875\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Timelag feature dim: 1\u001b[0m\n",
"100% 4/4 [00:00<00:00, 42.69it/s]\n",
"[\u001b[36m2020-07-08 13:18:20,008\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Duration linguistic feature dim: 420\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:20,008\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Duration feature dim: 1\u001b[0m\n",
"100% 4/4 [00:00<00:00, 21.44it/s]\n",
"[\u001b[36m2020-07-08 13:18:20,253\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Acoustic linguistic feature dim: 424\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:25,235\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Acoustic feature dim: 199\u001b[0m\n",
"100% 4/4 [00:21<00:00, 5.38s/it]\n",
"[\u001b[36m2020-07-08 13:18:48,856\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - acoustic:\n",
" enabled: true\n",
" f0_ceil: 650\n",
" f0_floor: 90\n",
" frame_period: 5\n",
" interp_unvoiced_aperiodicity: true\n",
" label_dir: data/acoustic/label_phone_align\n",
" mgc_order: 59\n",
" num_windows: 3\n",
" question_path: null\n",
" relative_f0: true\n",
" subphone_features: coarse_coding\n",
" use_harvest: true\n",
" wav_dir: data/acoustic/wav\n",
"duration:\n",
" enabled: true\n",
" label_dir: data/duration/label_phone_align\n",
" question_path: null\n",
"log_f0_conditioning: true\n",
"out_dir: dump/natsumeyuuri/org/eval/\n",
"question_path: ./conf/jp_qst001_nnsvs.hed\n",
"timelag:\n",
" enabled: true\n",
" label_phone_align_dir: data/timelag/label_phone_align\n",
" label_phone_score_dir: data/timelag/label_phone_score\n",
" question_path: null\n",
"utt_list: data/list/eval.list\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:48,902\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/eval/in_timelag\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:48,902\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/eval/out_timelag\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:48,902\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/eval/in_duration\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:48,902\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/eval/out_duration\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:48,903\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/eval/in_acoustic\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:48,903\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - mkdirs: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/org/eval/out_acoustic\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:48,947\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Timelag linguistic feature dim: 420\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:48,948\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Timelag feature dim: 1\u001b[0m\n",
"100% 4/4 [00:00<00:00, 42.80it/s]\n",
"[\u001b[36m2020-07-08 13:18:49,136\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Duration linguistic feature dim: 420\u001b[0m\n",
"[\u001b[36m2020-07-08 13:18:49,137\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Duration feature dim: 1\u001b[0m\n",
"100% 4/4 [00:00<00:00, 23.93it/s]\n",
"[\u001b[36m2020-07-08 13:18:49,434\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Acoustic linguistic feature dim: 424\u001b[0m\n",
"[\u001b[36m2020-07-08 13:19:05,246\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Acoustic feature dim: 199\u001b[0m\n",
"100% 4/4 [00:27<00:00, 6.96s/it]\n",
"[2020-07-08 13:19:34,276][nnsvs][INFO] - list_path: train_list.txt\n",
"out_path: dump/natsumeyuuri/org/in_timelag_scaler.joblib\n",
"scaler:\n",
" class: sklearn.preprocessing.MinMaxScaler\n",
" params: {}\n",
"verbose: 100\n",
"\n",
"[2020-07-08 13:19:34,498][nnsvs][INFO] - data min:\n",
"[ 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 1. 0. 0. 0. 0.\n",
" 0. 0. 1. 0. 0. 0.\n",
" 0. 0. 0. 1. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 1. 1. 1. 1.\n",
" 1. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 4.7004805 4.7004805 4.7004805\n",
" 1. 1. -1. -1. -1. -1.\n",
" -1. 1. 1. 1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. 0. 1. 6. 4. 0.\n",
" 0. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1. ]\n",
"[2020-07-08 13:19:34,502][nnsvs][INFO] - data max:\n",
"[ 0. 1. 0. 1. 1. 1.\n",
" 0. 1. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 0. 0. 1. 0.\n",
" 0. 0. 0. 0. 1. 0.\n",
" 0. 0. 0. 0. 1. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 1.\n",
" 0. 0. 1. 0. 1. 0.\n",
" 0. 1. 0. 0. 1. 0.\n",
" 0. 0. 0. 0. 1. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 1. 0. 1. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 1. 0. 0. 0. 0. 1.\n",
" 1. 1. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 0. 0. 1.\n",
" 0. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 0. 1. 1.\n",
" 1. 0. 1. 0. 1. 1.\n",
" 1. 1. 1. 0. 1. 0.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 1. 1. 0.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 0. 0. 1.\n",
" 0. 0. 1. 1. 1. 0.\n",
" 1. 1. 1. 0. 1. 1.\n",
" 1. 0. 1. 0. 1. 1.\n",
" 1. 1. 1. 0. 1. 0.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 1. 1. 0.\n",
" 1. 0. 1. 1. 1. 0.\n",
" 0. 1. 0. 0. 0. 0.\n",
" 0. 0. 1. 0. 0. 0.\n",
" 0. 0. 0. 1. 0. 0.\n",
" 0. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 0.\n",
" 0. 0. 1. 1. 0. 0.\n",
" 1. 0. 0. 0. 1. 1.\n",
" 0. 0. 1. 0. 0. 0.\n",
" 0. 0. 1. 1. 1. 1.\n",
" 1. 0. 0. 0. 1. 1.\n",
" 0. 0. 1. 6.375586 6.375586 6.375586\n",
" 1. 3. -1. 1. 3. 1.\n",
" 1. 3. 1. 1. 3. 1.\n",
" 1. 11. 0. 1. 284. 180.\n",
" 11. 0. 1. 380. 199. 1.\n",
" 1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. 11. 0. 1. 284. 180. ]\n",
"'dump/natsumeyuuri/org/in_timelag_scaler.joblib' -> 'dump/natsumeyuuri/norm/in_timelag_scaler.joblib'\n",
"[2020-07-08 13:19:35,576][nnsvs][INFO] - list_path: train_list.txt\n",
"out_path: dump/natsumeyuuri/org/in_duration_scaler.joblib\n",
"scaler:\n",
" class: sklearn.preprocessing.MinMaxScaler\n",
" params: {}\n",
"verbose: 100\n",
"\n",
"[2020-07-08 13:19:35,791][nnsvs][INFO] - data min:\n",
"[ 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 1. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 1. 1. 1. 1.\n",
" 1. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 4.7004805 4.7004805 4.7004805\n",
" 1. 1. -1. -1. -1. -1.\n",
" -1. 1. 1. 1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. 0. 1. 6. 4. 0.\n",
" 0. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1. ]\n",
"[2020-07-08 13:19:35,795][nnsvs][INFO] - data max:\n",
"[ 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 0. 0. 1.\n",
" 0. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 0. 1. 1.\n",
" 1. 0. 1. 0. 1. 1.\n",
" 1. 1. 1. 0. 1. 0.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 0.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 0. 0. 1.\n",
" 0. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 0. 1. 1.\n",
" 1. 0. 1. 0. 1. 1.\n",
" 1. 1. 1. 0. 1. 0.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 0.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 0. 0. 1.\n",
" 0. 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 0. 1. 1.\n",
" 1. 0. 1. 0. 1. 1.\n",
" 1. 1. 1. 0. 1. 0.\n",
" 1. 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1. 0.\n",
" 1. 0. 1. 1. 1. 0.\n",
" 0. 1. 0. 0. 0. 0.\n",
" 0. 0. 1. 0. 0. 0.\n",
" 0. 0. 0. 1. 0. 0.\n",
" 0. 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0. 0.\n",
" 0. 0. 1. 1. 0. 0.\n",
" 1. 0. 0. 0. 1. 1.\n",
" 0. 0. 1. 0. 0. 0.\n",
" 0. 0. 1. 1. 1. 1.\n",
" 1. 0. 0. 0. 1. 1.\n",
" 0. 0. 1. 6.375586 6.375586 6.375586\n",
" 3. 3. -1. 1. 3. 1.\n",
" 1. 3. 1. 1. 3. 1.\n",
" 1. 11. 0. 1. 284. 180.\n",
" 11. 0. 1. 499. 199. 1.\n",
" 1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. 11. 0. 1. 284. 180. ]\n",
"'dump/natsumeyuuri/org/in_duration_scaler.joblib' -> 'dump/natsumeyuuri/norm/in_duration_scaler.joblib'\n",
"[2020-07-08 13:19:36,869][nnsvs][INFO] - list_path: train_list.txt\n",
"out_path: dump/natsumeyuuri/org/in_acoustic_scaler.joblib\n",
"scaler:\n",
" class: sklearn.preprocessing.MinMaxScaler\n",
" params: {}\n",
"verbose: 100\n",
"\n",
"[2020-07-08 13:19:38,235][nnsvs][INFO] - data min:\n",
"[ 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 1. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 1. 1. 1. 1.\n",
" 1. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 4.7004805 4.7004805 4.7004805\n",
" 1. 1. -1. -1. -1. -1.\n",
" -1. 1. 1. 1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. 0. 1. 6. 4. 0.\n",
" 0. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1. -1.\n",
" 0.04404987 0.45900714 0.04404987 2. ]\n",
"[2020-07-08 13:19:38,238][nnsvs][INFO] - data max:\n",
"[ 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0.\n",
" 1. 1. 0. 1. 1.\n",
" 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 0. 0. 0. 1.\n",
" 0. 0. 1. 1. 1.\n",
" 1. 1. 1. 1. 0.\n",
" 1. 1. 1. 0. 1.\n",
" 0. 1. 1. 1. 1.\n",
" 1. 0. 1. 0. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 0. 1. 0. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 0. 1. 1. 0.\n",
" 1. 1. 0. 1. 1.\n",
" 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 0. 0.\n",
" 1. 0. 0. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 0. 1. 1. 1. 0.\n",
" 1. 0. 1. 1. 1.\n",
" 1. 1. 0. 1. 0.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 0. 1. 0. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 1. 1.\n",
" 0. 1. 1. 0. 1.\n",
" 1. 0. 1. 1. 0.\n",
" 1. 1. 0. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 1. 0. 0.\n",
" 0. 1. 0. 0. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 0. 1. 1. 1.\n",
" 0. 1. 0. 1. 1.\n",
" 1. 1. 1. 0. 1.\n",
" 0. 1. 1. 1. 1.\n",
" 1. 1. 1. 1. 1.\n",
" 1. 1. 0. 1. 0.\n",
" 1. 1. 1. 0. 0.\n",
" 1. 0. 0. 0. 0.\n",
" 0. 0. 1. 0. 0.\n",
" 0. 0. 0. 0. 1.\n",
" 0. 0. 0. 0. 1.\n",
" 1. 0. 1. 1. 0.\n",
" 1. 1. 0. 0. 0.\n",
" 0. 1. 1. 0. 0.\n",
" 1. 0. 0. 0. 1.\n",
" 1. 0. 0. 1. 0.\n",
" 0. 0. 0. 0. 1.\n",
" 1. 1. 1. 1. 0.\n",
" 0. 0. 1. 1. 0.\n",
" 0. 1. 6.375586 6.375586 6.375586\n",
" 3. 3. -1. 1. 3.\n",
" 1. 1. 3. 1. 1.\n",
" 3. 1. 1. 11. 0.\n",
" 1. 284. 180. 11. 0.\n",
" 1. 499. 199. 1. 1.\n",
" -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1.\n",
" -1. -1. -1. -1. -1.\n",
" 11. 0. 1. 284. 180.\n",
" 0.99733615 0.99733615 0.99733615 981. ]\n",
"'dump/natsumeyuuri/org/in_acoustic_scaler.joblib' -> 'dump/natsumeyuuri/norm/in_acoustic_scaler.joblib'\n",
"[2020-07-08 13:19:39,311][nnsvs][INFO] - list_path: train_list.txt\n",
"out_path: dump/natsumeyuuri/org/out_timelag_scaler.joblib\n",
"scaler:\n",
" class: sklearn.preprocessing.StandardScaler\n",
" params: {}\n",
"verbose: 100\n",
"\n",
"[2020-07-08 13:19:39,591][nnsvs][INFO] - mean:\n",
"[0.23795306]\n",
"[2020-07-08 13:19:39,591][nnsvs][INFO] - std:\n",
"[9.27282119]\n",
"'dump/natsumeyuuri/org/out_timelag_scaler.joblib' -> 'dump/natsumeyuuri/norm/out_timelag_scaler.joblib'\n",
"[2020-07-08 13:19:40,637][nnsvs][INFO] - list_path: train_list.txt\n",
"out_path: dump/natsumeyuuri/org/out_duration_scaler.joblib\n",
"scaler:\n",
" class: sklearn.preprocessing.StandardScaler\n",
" params: {}\n",
"verbose: 100\n",
"\n",
"[2020-07-08 13:19:40,895][nnsvs][INFO] - mean:\n",
"[43.76150836]\n",
"[2020-07-08 13:19:40,895][nnsvs][INFO] - std:\n",
"[64.27170444]\n",
"'dump/natsumeyuuri/org/out_duration_scaler.joblib' -> 'dump/natsumeyuuri/norm/out_duration_scaler.joblib'\n",
"[2020-07-08 13:19:41,954][nnsvs][INFO] - list_path: train_list.txt\n",
"out_path: dump/natsumeyuuri/org/out_acoustic_scaler.joblib\n",
"scaler:\n",
" class: sklearn.preprocessing.StandardScaler\n",
" params: {}\n",
"verbose: 100\n",
"\n",
"[2020-07-08 13:19:43,855][nnsvs][INFO] - mean:\n",
"[ 4.64074122e+00 2.36490091e+00 -1.58990466e-01 5.14933699e-01\n",
" -1.77912186e-01 3.60614091e-01 2.66937812e-01 1.04247289e-01\n",
" -3.16227925e-01 1.43981457e-01 1.95465795e-01 -1.28800495e-01\n",
" -1.90585595e-01 -4.19427045e-02 6.54676636e-02 2.97859619e-02\n",
" -1.05932140e-01 3.31236756e-02 1.88468686e-02 -2.27164258e-02\n",
" -4.53876326e-02 3.24207903e-02 -5.11192245e-02 3.01600793e-02\n",
" -1.33435141e-02 1.05566074e-02 -1.79424091e-03 4.34900695e-02\n",
" -2.05882369e-02 3.11503845e-02 -3.27573735e-02 1.93395694e-02\n",
" -8.38780110e-03 1.75530066e-02 3.97169231e-03 1.06912915e-02\n",
" -1.04552372e-02 1.48556620e-02 -1.39698087e-02 -1.59251321e-03\n",
" -7.05910276e-03 6.56423231e-03 -1.29188631e-02 9.88494625e-03\n",
" -8.53729762e-03 6.84611001e-03 -3.84054139e-03 3.52971508e-03\n",
" 1.21358065e-03 -2.62873343e-03 1.06140773e-03 2.22081275e-03\n",
" -5.19978032e-03 6.03406245e-03 -2.95951773e-03 -1.66447412e-03\n",
" 2.15354837e-03 -1.72967632e-03 3.41692521e-03 -9.44870071e-04\n",
" 5.50690809e-04 6.85485877e-05 -9.48835349e-06 8.26642202e-05\n",
" -2.94783314e-05 1.84951589e-06 2.92270778e-06 -3.63921631e-06\n",
" -3.99202146e-05 1.20258953e-05 1.58654208e-05 1.07583807e-05\n",
" -1.53247177e-05 -9.61468745e-07 1.04593260e-05 2.54422230e-05\n",
" 9.20840919e-06 3.81392464e-05 1.03768475e-05 -1.78405389e-06\n",
" 5.19067996e-06 7.19195578e-06 -1.20812146e-05 6.93589054e-06\n",
" -3.46939841e-06 -9.02837489e-07 5.18912428e-06 2.84572287e-08\n",
" -4.39541759e-06 5.03119296e-06 -9.63806323e-06 -3.95756785e-06\n",
" -2.10449568e-06 -2.53727131e-06 -5.32826401e-06 -2.77524287e-06\n",
" 1.79144604e-06 4.60637258e-07 -5.83894371e-06 -3.98612345e-06\n",
" -5.58187393e-06 -1.14019079e-06 -2.05172363e-06 -1.71025053e-06\n",
" -2.47143453e-06 2.06957531e-06 5.63689942e-07 1.53722788e-06\n",
" -3.28729865e-07 -1.16342964e-06 2.22808702e-07 -1.77380767e-06\n",
" -4.30856038e-07 6.27810277e-08 -2.11892854e-06 8.02380050e-07\n",
" 5.10580090e-07 8.85622091e-07 5.00562999e-07 -3.61247739e-07\n",
" -3.16052942e-03 -2.08981736e-03 -1.05741404e-04 -5.59133586e-04\n",
" -2.38266776e-05 -3.21908003e-04 -2.21197077e-04 -1.68650965e-04\n",
" 6.91876429e-05 -1.65131340e-04 -1.14296917e-04 4.44166906e-05\n",
" 5.48925324e-05 3.03901295e-07 -9.62459105e-06 -1.90083583e-05\n",
" 6.32518060e-06 -7.19413975e-05 -2.10196224e-05 5.42787571e-06\n",
" 2.79592054e-05 -3.48436281e-05 3.22531368e-05 -1.00312473e-05\n",
" 1.02631577e-05 -3.77273468e-06 4.08524758e-06 -1.13513651e-05\n",
" 7.48649746e-06 -7.89200157e-06 2.28097431e-05 -4.82981807e-06\n",
" 1.23560765e-05 8.24363507e-06 9.42945905e-07 4.38880642e-06\n",
" 7.53063635e-06 -5.60235796e-06 1.13300489e-05 1.65413285e-05\n",
" 2.64375223e-06 5.58458416e-06 5.52970438e-06 8.92369062e-07\n",
" 5.31007552e-06 -1.64782458e-06 -2.41179784e-06 -3.35181671e-06\n",
" 1.44949671e-06 -3.29900620e-06 5.54212078e-06 1.90294412e-08\n",
" 5.40869410e-06 -2.79129449e-06 5.46155989e-06 -4.44701866e-06\n",
" 1.94706027e-06 -2.47356458e-06 -1.31018957e-06 -1.29031957e-07\n",
" -2.90930075e-02 -1.48538243e-06 7.31948694e-05 7.84865979e-01\n",
" -1.21242435e+01 -1.10908962e+01 -7.34181734e+00 -5.98785801e+00\n",
" -5.47605453e+00 2.39656421e-04 1.29227356e-04 2.21978089e-04\n",
" 1.29956867e-04 1.17481992e-04 4.36718031e-03 4.25666600e-03\n",
" 3.61734450e-03 3.44563763e-03 3.35453975e-03]\n",
"[2020-07-08 13:19:43,858][nnsvs][INFO] - std:\n",
"[2.17110161 1.01338015 0.42388615 0.34438559 0.31511081 0.31864768\n",
" 0.39287894 0.35304934 0.3569349 0.21900406 0.21579584 0.22063633\n",
" 0.22675224 0.1630887 0.1762798 0.1551389 0.18886363 0.23715598\n",
" 0.14191278 0.14501201 0.14657777 0.12052753 0.1383641 0.14552778\n",
" 0.12661402 0.11505465 0.10911111 0.10484299 0.10124558 0.10564508\n",
" 0.09343382 0.09157652 0.08402689 0.08692488 0.08947434 0.08803283\n",
" 0.08212961 0.07818825 0.07508797 0.07753582 0.07650855 0.07234036\n",
" 0.07073195 0.07049731 0.06976848 0.06975993 0.06719174 0.0641771\n",
" 0.06319781 0.0620175 0.06069556 0.05945351 0.0579459 0.05752475\n",
" 0.05648237 0.05460216 0.05314542 0.05140386 0.04972594 0.04925705\n",
" 0.25465517 0.13264129 0.08678362 0.07276618 0.07246756 0.0657121\n",
" 0.06917631 0.06501204 0.06603596 0.0571071 0.05683291 0.05731878\n",
" 0.05854761 0.05344074 0.05215591 0.05020919 0.05031205 0.05032898\n",
" 0.0474075 0.04595883 0.04468876 0.04344021 0.04317529 0.04271798\n",
" 0.04134648 0.04027005 0.03920432 0.03872529 0.03769312 0.03713775\n",
" 0.0362179 0.03557505 0.03475884 0.03432438 0.03378033 0.0333161\n",
" 0.03257929 0.03201833 0.03142295 0.03099697 0.03038045 0.0299323\n",
" 0.02950929 0.02899529 0.02854955 0.02813444 0.02768798 0.02734721\n",
" 0.02689235 0.02649494 0.02606806 0.02562619 0.0252101 0.02481346\n",
" 0.02447802 0.02418621 0.02382361 0.02353925 0.02320889 0.02288853\n",
" 0.50477279 0.2413281 0.19218715 0.18245304 0.17834084 0.17111994\n",
" 0.17110343 0.16742169 0.1641049 0.15615855 0.15251791 0.15083373\n",
" 0.15148013 0.14519339 0.14078501 0.1358786 0.13278455 0.131165\n",
" 0.12633892 0.12099969 0.11798039 0.1150881 0.11242747 0.10990481\n",
" 0.10696791 0.10418726 0.10150307 0.09991989 0.09815174 0.09593898\n",
" 0.09417448 0.09271212 0.09054355 0.08934562 0.08746449 0.08610281\n",
" 0.08492568 0.08334709 0.08166688 0.08060801 0.07911255 0.07782299\n",
" 0.07697597 0.07588219 0.0745447 0.07358737 0.07260206 0.07154533\n",
" 0.0705257 0.06954967 0.06856918 0.06768245 0.06672478 0.06570451\n",
" 0.06477378 0.06409615 0.06324016 0.06231462 0.06158501 0.06098052\n",
" 0.14476426 0.02130803 0.02666572 0.41091529 7.79762092 7.84467499\n",
" 5.02137775 3.87934261 3.43344914 1.47926192 1.76856296 1.72409346\n",
" 1.59837153 1.5160182 3.17293363 4.34528529 4.55520061 4.37137008\n",
" 4.2364771 ]\n",
"'dump/natsumeyuuri/org/out_acoustic_scaler.joblib' -> 'dump/natsumeyuuri/norm/out_acoustic_scaler.joblib'\n",
"[\u001b[36m2020-07-08 13:19:44,946\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/train_no_dev/in_timelag/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/train_no_dev/in_timelag/\n",
"scaler_path: dump/natsumeyuuri/org/in_timelag_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 286/286 [00:00<00:00, 1028.84it/s]\n",
"[\u001b[36m2020-07-08 13:19:46,369\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/train_no_dev/in_duration/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/train_no_dev/in_duration/\n",
"scaler_path: dump/natsumeyuuri/org/in_duration_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 286/286 [00:00<00:00, 964.94it/s]\n",
"[\u001b[36m2020-07-08 13:19:47,886\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/train_no_dev/in_acoustic/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/train_no_dev/in_acoustic/\n",
"scaler_path: dump/natsumeyuuri/org/in_acoustic_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 286/286 [00:03<00:00, 83.71it/s]\n",
"[\u001b[36m2020-07-08 13:19:52,669\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/train_no_dev/out_timelag/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/train_no_dev/out_timelag/\n",
"scaler_path: dump/natsumeyuuri/org/out_timelag_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 286/286 [00:00<00:00, 1083.54it/s]\n",
"[\u001b[36m2020-07-08 13:19:54,070\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/train_no_dev/out_duration/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/train_no_dev/out_duration/\n",
"scaler_path: dump/natsumeyuuri/org/out_duration_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 286/286 [00:00<00:00, 991.01it/s]\n",
"[\u001b[36m2020-07-08 13:19:55,482\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/train_no_dev/out_acoustic/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/train_no_dev/out_acoustic/\n",
"scaler_path: dump/natsumeyuuri/org/out_acoustic_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 286/286 [00:12<00:00, 22.69it/s]\n",
"[\u001b[36m2020-07-08 13:20:15,964\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/dev/in_timelag/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/dev/in_timelag/\n",
"scaler_path: dump/natsumeyuuri/org/in_timelag_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 123.13it/s]\n",
"[\u001b[36m2020-07-08 13:20:17,087\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/dev/in_duration/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/dev/in_duration/\n",
"scaler_path: dump/natsumeyuuri/org/in_duration_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 137.08it/s]\n",
"[\u001b[36m2020-07-08 13:20:18,208\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/dev/in_acoustic/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/dev/in_acoustic/\n",
"scaler_path: dump/natsumeyuuri/org/in_acoustic_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 56.66it/s]\n",
"[\u001b[36m2020-07-08 13:20:19,393\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/dev/out_timelag/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/dev/out_timelag/\n",
"scaler_path: dump/natsumeyuuri/org/out_timelag_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 138.46it/s]\n",
"[\u001b[36m2020-07-08 13:20:20,513\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/dev/out_duration/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/dev/out_duration/\n",
"scaler_path: dump/natsumeyuuri/org/out_duration_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 118.42it/s]\n",
"[\u001b[36m2020-07-08 13:20:21,646\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/dev/out_acoustic/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/dev/out_acoustic/\n",
"scaler_path: dump/natsumeyuuri/org/out_acoustic_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 66.11it/s]\n",
"[\u001b[36m2020-07-08 13:20:22,821\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/eval/in_timelag/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/eval/in_timelag/\n",
"scaler_path: dump/natsumeyuuri/org/in_timelag_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 128.80it/s]\n",
"[\u001b[36m2020-07-08 13:20:23,949\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/eval/in_duration/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/eval/in_duration/\n",
"scaler_path: dump/natsumeyuuri/org/in_duration_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 124.83it/s]\n",
"[\u001b[36m2020-07-08 13:20:25,080\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/eval/in_acoustic/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/eval/in_acoustic/\n",
"scaler_path: dump/natsumeyuuri/org/in_acoustic_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 58.56it/s]\n",
"[\u001b[36m2020-07-08 13:20:26,306\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/eval/out_timelag/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/eval/out_timelag/\n",
"scaler_path: dump/natsumeyuuri/org/out_timelag_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 122.08it/s]\n",
"[\u001b[36m2020-07-08 13:20:27,444\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/eval/out_duration/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/eval/out_duration/\n",
"scaler_path: dump/natsumeyuuri/org/out_duration_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 119.09it/s]\n",
"[\u001b[36m2020-07-08 13:20:28,587\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/org/eval/out_acoustic/\n",
"inverse: false\n",
"num_workers: 4\n",
"out_dir: dump/natsumeyuuri/norm/eval/out_acoustic/\n",
"scaler_path: dump/natsumeyuuri/org/out_acoustic_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 55.97it/s]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Sv81QbPqWSeH",
"colab_type": "text"
},
"source": [
"## ステージ 2-4 (タイムラグモデル, 継続長モデル, 音響モデルの学習) の実行\n",
"ここまでは特に GPU を必要としない処理でしたが, ここからは GPU が使えると処理が高速になります. ハードウェアアクセラレータの設定を None から GPU に変更するのを忘れていた場合は, 画面上のメニューから「編集」-「ノートブックの設定」を選びハードウェアアクセラレータをGPUに変更してください(ここで変更した場合は残念ながら最初からやり直しになります).\n",
"\n",
"ステージ 2 がタイムラグモデル, ステージ 3 が継続長モデル, ステージ 4 が音響モデルの学習になります."
]
},
{
"cell_type": "code",
"metadata": {
"id": "ulblx4TgbU_d",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "f8b817d3-3fb1-4ab6-f84d-724c4d1e903e"
},
"source": [
"! cd $RECIPE_ROOT && bash run.sh --stage 2 --stop-stage 4"
],
"execution_count": 14,
"outputs": [
{
"output_type": "stream",
"text": [
"stage 2: Training time-lag model\n",
"+ nnsvs-train data.train_no_dev.in_dir=dump/natsumeyuuri/norm/train_no_dev/in_timelag/ data.train_no_dev.out_dir=dump/natsumeyuuri/norm/train_no_dev/out_timelag/ data.dev.in_dir=dump/natsumeyuuri/norm/dev/in_timelag/ data.dev.out_dir=dump/natsumeyuuri/norm/dev/out_timelag/ model=timelag train.out_dir=exp/natsumeyuuri/timelag data.batch_size=8 resume.checkpoint=\n",
"[\u001b[36m2020-07-08 13:20:32,227\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn:\n",
" benchmark: false\n",
" deterministic: false\n",
"data:\n",
" batch_size: 8\n",
" dev:\n",
" in_dir: dump/natsumeyuuri/norm/dev/in_timelag/\n",
" out_dir: dump/natsumeyuuri/norm/dev/out_timelag/\n",
" num_workers: 2\n",
" pin_memory: true\n",
" train_no_dev:\n",
" in_dir: dump/natsumeyuuri/norm/train_no_dev/in_timelag/\n",
" out_dir: dump/natsumeyuuri/norm/train_no_dev/out_timelag/\n",
"model:\n",
" has_dynamic_features:\n",
" - false\n",
" netG:\n",
" class: nnsvs.model.FeedForwardNet\n",
" params:\n",
" dropout: 0.5\n",
" hidden_dim: 128\n",
" in_dim: 420\n",
" num_layers: 2\n",
" out_dim: 1\n",
" stream_sizes:\n",
" - 1\n",
" stream_weights:\n",
" - 1\n",
"optim:\n",
" lr_scheduler:\n",
" name: StepLR\n",
" params:\n",
" gamma: 0.5\n",
" step_size: 20\n",
" optimizer:\n",
" name: Adam\n",
" params:\n",
" betas:\n",
" - 0.5\n",
" - 0.999\n",
" lr: 0.001\n",
" weight_decay: 0.0\n",
"resume:\n",
" checkpoint: null\n",
" load_optimizer: false\n",
"train:\n",
" checkpoint_epoch_interval: 20\n",
" nepochs: 50\n",
" out_dir: exp/natsumeyuuri/timelag\n",
" stream_wise_loss: false\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:32,228\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn.deterministic: False\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:32,228\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn.benchmark: False\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,047\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 41, 420]), torch.Size([8, 41, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,048\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 56, 420]), torch.Size([8, 56, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,058\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 19, 420]), torch.Size([8, 19, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,065\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 163, 420]), torch.Size([8, 163, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,082\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 60, 420]), torch.Size([8, 60, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,082\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 39, 420]), torch.Size([8, 39, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,112\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 101, 420]), torch.Size([8, 101, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,113\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 29, 420]), torch.Size([8, 29, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,123\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 102, 420]), torch.Size([8, 102, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,123\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 56, 420]), torch.Size([8, 56, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,141\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 123, 420]), torch.Size([8, 123, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,142\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 57, 420]), torch.Size([8, 57, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,155\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 34, 420]), torch.Size([8, 34, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,161\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 69, 420]), torch.Size([8, 69, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,183\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 48, 420]), torch.Size([8, 48, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,183\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 57, 420]), torch.Size([8, 57, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,199\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 62, 420]), torch.Size([8, 62, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,202\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 60, 420]), torch.Size([8, 60, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,222\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 195, 420]), torch.Size([8, 195, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,231\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 79, 420]), torch.Size([8, 79, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,235\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 76, 420]), torch.Size([8, 76, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,258\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 161, 420]), torch.Size([8, 161, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,258\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 42, 420]), torch.Size([8, 42, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,279\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 33, 420]), torch.Size([8, 33, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,280\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 34, 420]), torch.Size([8, 34, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,296\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 57, 420]), torch.Size([8, 57, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,297\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 40, 420]), torch.Size([8, 40, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,312\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 60, 420]), torch.Size([8, 60, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,313\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 50, 420]), torch.Size([8, 50, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,334\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 40, 420]), torch.Size([8, 40, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,335\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 47, 420]), torch.Size([8, 47, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,350\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 40, 420]), torch.Size([8, 40, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,354\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 71, 420]), torch.Size([8, 71, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,366\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 50, 420]), torch.Size([8, 50, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,372\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 26, 420]), torch.Size([8, 26, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,376\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([6, 28, 420]), torch.Size([6, 28, 1]), torch.Size([6])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,452\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([4, 61, 420]), torch.Size([4, 61, 1]), torch.Size([4])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:41,480\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Start utterance-wise training...\u001b[0m\n",
" 0% 0/50 [00:00<?, ?it/s][\u001b[36m2020-07-08 13:20:42,256\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 1]: loss 0.7413898830612501\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:42,328\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 1]: loss 0.417951375246048\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:42,333\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.417951375246048: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/best_loss.pth\u001b[0m\n",
" 2% 1/50 [00:00<00:41, 1.17it/s][\u001b[36m2020-07-08 13:20:42,911\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 2]: loss 0.616187627116839\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:42,988\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 2]: loss 0.41615161299705505\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:42,994\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.41615161299705505: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/best_loss.pth\u001b[0m\n",
" 4% 2/50 [00:01<00:38, 1.26it/s][\u001b[36m2020-07-08 13:20:43,558\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 3]: loss 0.58352855924103\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:43,643\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 3]: loss 0.4274834096431732\u001b[0m\n",
" 6% 3/50 [00:02<00:35, 1.33it/s][\u001b[36m2020-07-08 13:20:44,225\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 4]: loss 0.5712668846050898\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:44,300\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 4]: loss 0.44552746415138245\u001b[0m\n",
" 8% 4/50 [00:02<00:33, 1.38it/s][\u001b[36m2020-07-08 13:20:44,872\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 5]: loss 0.5728822474678358\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:44,957\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 5]: loss 0.4063112735748291\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:44,962\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.4063112735748291: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/best_loss.pth\u001b[0m\n",
" 10% 5/50 [00:03<00:31, 1.42it/s][\u001b[36m2020-07-08 13:20:45,544\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 6]: loss 0.5595674821072154\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:45,620\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 6]: loss 0.39501121640205383\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:45,625\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.39501121640205383: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/best_loss.pth\u001b[0m\n",
" 12% 6/50 [00:04<00:30, 1.44it/s][\u001b[36m2020-07-08 13:20:46,215\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 7]: loss 0.5497937641210027\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:46,289\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 7]: loss 0.44509804248809814\u001b[0m\n",
" 14% 7/50 [00:04<00:29, 1.46it/s][\u001b[36m2020-07-08 13:20:46,866\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 8]: loss 0.5482326919833819\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:46,945\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 8]: loss 0.38926035165786743\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:46,950\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.38926035165786743: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/best_loss.pth\u001b[0m\n",
" 16% 8/50 [00:05<00:28, 1.48it/s][\u001b[36m2020-07-08 13:20:47,523\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 9]: loss 0.5387894420160187\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:47,600\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 9]: loss 0.45804721117019653\u001b[0m\n",
" 18% 9/50 [00:06<00:27, 1.50it/s][\u001b[36m2020-07-08 13:20:48,186\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 10]: loss 0.5376900682846705\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:48,261\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 10]: loss 0.402455598115921\u001b[0m\n",
" 20% 10/50 [00:06<00:26, 1.50it/s][\u001b[36m2020-07-08 13:20:48,875\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 11]: loss 0.5277857788734965\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:48,954\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 11]: loss 0.39304760098457336\u001b[0m\n",
" 22% 11/50 [00:07<00:26, 1.48it/s][\u001b[36m2020-07-08 13:20:49,533\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 12]: loss 0.5103132989671495\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:49,611\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 12]: loss 0.4348149597644806\u001b[0m\n",
" 24% 12/50 [00:08<00:25, 1.49it/s][\u001b[36m2020-07-08 13:20:50,210\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 13]: loss 0.518002906607257\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:50,286\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 13]: loss 0.4713843762874603\u001b[0m\n",
" 26% 13/50 [00:08<00:24, 1.49it/s][\u001b[36m2020-07-08 13:20:50,883\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 14]: loss 0.5004223643077744\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:50,962\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 14]: loss 0.430740088224411\u001b[0m\n",
" 28% 14/50 [00:09<00:24, 1.49it/s][\u001b[36m2020-07-08 13:20:51,538\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 15]: loss 0.49967945035960937\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:51,618\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 15]: loss 0.4012521505355835\u001b[0m\n",
" 30% 15/50 [00:10<00:23, 1.50it/s][\u001b[36m2020-07-08 13:20:52,211\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 16]: loss 0.5079488332072893\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:52,286\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 16]: loss 0.4130418598651886\u001b[0m\n",
" 32% 16/50 [00:10<00:22, 1.50it/s][\u001b[36m2020-07-08 13:20:52,867\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 17]: loss 0.49579934527476627\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:52,947\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 17]: loss 0.3959328532218933\u001b[0m\n",
" 34% 17/50 [00:11<00:21, 1.50it/s][\u001b[36m2020-07-08 13:20:53,533\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 18]: loss 0.48978524406750995\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:53,609\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 18]: loss 0.43996357917785645\u001b[0m\n",
" 36% 18/50 [00:12<00:21, 1.50it/s][\u001b[36m2020-07-08 13:20:54,188\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 19]: loss 0.488814497159587\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:54,263\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 19]: loss 0.39926302433013916\u001b[0m\n",
" 38% 19/50 [00:12<00:20, 1.51it/s][\u001b[36m2020-07-08 13:20:54,841\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 20]: loss 0.48092027836375767\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:54,926\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 20]: loss 0.39583373069763184\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:54,930\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/checkpoint_epoch0020.pth\u001b[0m\n",
" 40% 20/50 [00:13<00:19, 1.51it/s][\u001b[36m2020-07-08 13:20:55,504\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 21]: loss 0.471134714782238\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:55,582\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 21]: loss 0.40722694993019104\u001b[0m\n",
" 42% 21/50 [00:14<00:19, 1.52it/s][\u001b[36m2020-07-08 13:20:56,172\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 22]: loss 0.45937356766727233\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:56,253\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 22]: loss 0.4129467308521271\u001b[0m\n",
" 44% 22/50 [00:14<00:18, 1.51it/s][\u001b[36m2020-07-08 13:20:56,833\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 23]: loss 0.4620760314994388\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:56,920\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 23]: loss 0.41880175471305847\u001b[0m\n",
" 46% 23/50 [00:15<00:17, 1.51it/s][\u001b[36m2020-07-08 13:20:57,493\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 24]: loss 0.45393017679452896\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:57,571\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 24]: loss 0.4083588123321533\u001b[0m\n",
" 48% 24/50 [00:16<00:17, 1.51it/s][\u001b[36m2020-07-08 13:20:58,150\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 25]: loss 0.4550911659995715\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:58,227\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 25]: loss 0.45512476563453674\u001b[0m\n",
" 50% 25/50 [00:16<00:16, 1.52it/s][\u001b[36m2020-07-08 13:20:58,804\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 26]: loss 0.45073308961258995\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:58,882\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 26]: loss 0.4087487757205963\u001b[0m\n",
" 52% 26/50 [00:17<00:15, 1.52it/s][\u001b[36m2020-07-08 13:20:59,453\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 27]: loss 0.44208161160349846\u001b[0m\n",
"[\u001b[36m2020-07-08 13:20:59,534\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 27]: loss 0.428731232881546\u001b[0m\n",
" 54% 27/50 [00:18<00:15, 1.52it/s][\u001b[36m2020-07-08 13:21:00,107\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 28]: loss 0.44123029294941163\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:00,187\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 28]: loss 0.4306964874267578\u001b[0m\n",
" 56% 28/50 [00:18<00:14, 1.53it/s][\u001b[36m2020-07-08 13:21:00,772\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 29]: loss 0.4446003900633918\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:00,868\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 29]: loss 0.39595672488212585\u001b[0m\n",
" 58% 29/50 [00:19<00:13, 1.51it/s][\u001b[36m2020-07-08 13:21:01,476\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 30]: loss 0.43509911083512837\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:01,558\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 30]: loss 0.40864595770835876\u001b[0m\n",
" 60% 30/50 [00:20<00:13, 1.49it/s][\u001b[36m2020-07-08 13:21:02,136\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 31]: loss 0.44589999152554405\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:02,213\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 31]: loss 0.43367576599121094\u001b[0m\n",
" 62% 31/50 [00:20<00:12, 1.50it/s][\u001b[36m2020-07-08 13:21:02,793\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 32]: loss 0.43148256921105915\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:02,874\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 32]: loss 0.4466657042503357\u001b[0m\n",
" 64% 32/50 [00:21<00:11, 1.50it/s][\u001b[36m2020-07-08 13:21:03,445\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 33]: loss 0.42904651827282375\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:03,521\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 33]: loss 0.4393444359302521\u001b[0m\n",
" 66% 33/50 [00:22<00:11, 1.52it/s][\u001b[36m2020-07-08 13:21:04,102\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 34]: loss 0.43163617245025104\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:04,177\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 34]: loss 0.419528067111969\u001b[0m\n",
" 68% 34/50 [00:22<00:10, 1.52it/s][\u001b[36m2020-07-08 13:21:04,737\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 35]: loss 0.42265744507312775\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:04,817\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 35]: loss 0.4340464770793915\u001b[0m\n",
" 70% 35/50 [00:23<00:09, 1.53it/s][\u001b[36m2020-07-08 13:21:05,397\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 36]: loss 0.4279438886377547\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:05,478\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 36]: loss 0.42154547572135925\u001b[0m\n",
" 72% 36/50 [00:23<00:09, 1.53it/s][\u001b[36m2020-07-08 13:21:06,050\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 37]: loss 0.4244021284911368\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:06,134\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 37]: loss 0.41723060607910156\u001b[0m\n",
" 74% 37/50 [00:24<00:08, 1.53it/s][\u001b[36m2020-07-08 13:21:06,710\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 38]: loss 0.4156242062648137\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:06,787\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 38]: loss 0.42691853642463684\u001b[0m\n",
" 76% 38/50 [00:25<00:07, 1.53it/s][\u001b[36m2020-07-08 13:21:07,378\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 39]: loss 0.42667213992940056\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:07,458\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 39]: loss 0.4312533438205719\u001b[0m\n",
" 78% 39/50 [00:25<00:07, 1.52it/s][\u001b[36m2020-07-08 13:21:08,036\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 40]: loss 0.4192444210251172\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:08,124\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 40]: loss 0.42829224467277527\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:08,129\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/checkpoint_epoch0040.pth\u001b[0m\n",
" 80% 40/50 [00:26<00:06, 1.51it/s][\u001b[36m2020-07-08 13:21:08,720\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 41]: loss 0.4015681801570786\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:08,801\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 41]: loss 0.4208395779132843\u001b[0m\n",
" 82% 41/50 [00:27<00:05, 1.50it/s][\u001b[36m2020-07-08 13:21:09,386\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 42]: loss 0.40194503797425163\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:09,463\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 42]: loss 0.41528749465942383\u001b[0m\n",
" 84% 42/50 [00:27<00:05, 1.50it/s][\u001b[36m2020-07-08 13:21:10,040\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 43]: loss 0.40110940403408474\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:10,121\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 43]: loss 0.4339146316051483\u001b[0m\n",
" 86% 43/50 [00:28<00:04, 1.51it/s][\u001b[36m2020-07-08 13:21:10,720\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 44]: loss 0.40033632268508273\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:10,799\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 44]: loss 0.45581310987472534\u001b[0m\n",
" 88% 44/50 [00:29<00:04, 1.50it/s][\u001b[36m2020-07-08 13:21:11,406\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 45]: loss 0.39911411371495986\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:11,485\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 45]: loss 0.45169803500175476\u001b[0m\n",
" 90% 45/50 [00:30<00:03, 1.49it/s][\u001b[36m2020-07-08 13:21:12,085\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 46]: loss 0.3944881310065587\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:12,167\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 46]: loss 0.43547898530960083\u001b[0m\n",
" 92% 46/50 [00:30<00:02, 1.48it/s][\u001b[36m2020-07-08 13:21:12,752\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 47]: loss 0.39371995793448555\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:12,830\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 47]: loss 0.4457336366176605\u001b[0m\n",
" 94% 47/50 [00:31<00:02, 1.49it/s][\u001b[36m2020-07-08 13:21:13,408\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 48]: loss 0.39456136690245736\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:13,484\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 48]: loss 0.4572252333164215\u001b[0m\n",
" 96% 48/50 [00:32<00:01, 1.50it/s][\u001b[36m2020-07-08 13:21:14,074\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 49]: loss 0.3898561903172069\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:14,149\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 49]: loss 0.4443695545196533\u001b[0m\n",
" 98% 49/50 [00:32<00:00, 1.50it/s][\u001b[36m2020-07-08 13:21:14,732\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 50]: loss 0.3966614678502083\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:14,809\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 50]: loss 0.42609718441963196\u001b[0m\n",
"100% 50/50 [00:33<00:00, 1.50it/s]\n",
"[\u001b[36m2020-07-08 13:21:14,814\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/checkpoint_epoch0050.pth\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:14,816\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - The best loss was 0.38926035165786743\u001b[0m\n",
"+ set +x\n",
"stage 3: Training phoneme duration model\n",
"+ nnsvs-train data.train_no_dev.in_dir=dump/natsumeyuuri/norm/train_no_dev/in_duration/ data.train_no_dev.out_dir=dump/natsumeyuuri/norm/train_no_dev/out_duration/ data.dev.in_dir=dump/natsumeyuuri/norm/dev/in_duration/ data.dev.out_dir=dump/natsumeyuuri/norm/dev/out_duration/ model=duration train.out_dir=exp/natsumeyuuri/duration data.batch_size=8 resume.checkpoint=\n",
"[\u001b[36m2020-07-08 13:21:16,739\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn:\n",
" benchmark: false\n",
" deterministic: false\n",
"data:\n",
" batch_size: 8\n",
" dev:\n",
" in_dir: dump/natsumeyuuri/norm/dev/in_duration/\n",
" out_dir: dump/natsumeyuuri/norm/dev/out_duration/\n",
" num_workers: 2\n",
" pin_memory: true\n",
" train_no_dev:\n",
" in_dir: dump/natsumeyuuri/norm/train_no_dev/in_duration/\n",
" out_dir: dump/natsumeyuuri/norm/train_no_dev/out_duration/\n",
"model:\n",
" has_dynamic_features:\n",
" - false\n",
" netG:\n",
" class: nnsvs.model.LSTMRNN\n",
" params:\n",
" bidirectional: true\n",
" dropout: 0.5\n",
" hidden_dim: 64\n",
" in_dim: 420\n",
" num_layers: 2\n",
" out_dim: 1\n",
" stream_sizes:\n",
" - 1\n",
" stream_weights:\n",
" - 1\n",
"optim:\n",
" lr_scheduler:\n",
" name: StepLR\n",
" params:\n",
" gamma: 0.5\n",
" step_size: 20\n",
" optimizer:\n",
" name: Adam\n",
" params:\n",
" betas:\n",
" - 0.5\n",
" - 0.999\n",
" lr: 0.001\n",
" weight_decay: 0.0\n",
"resume:\n",
" checkpoint: null\n",
" load_optimizer: false\n",
"train:\n",
" checkpoint_epoch_interval: 20\n",
" nepochs: 50\n",
" out_dir: exp/natsumeyuuri/duration\n",
" stream_wise_loss: false\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:16,739\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn.deterministic: False\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:16,740\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn.benchmark: False\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,035\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 45, 420]), torch.Size([8, 45, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,043\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 129, 420]), torch.Size([8, 129, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,060\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 371, 420]), torch.Size([8, 371, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,060\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 100, 420]), torch.Size([8, 100, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,078\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 75, 420]), torch.Size([8, 75, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,078\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 62, 420]), torch.Size([8, 62, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,090\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 141, 420]), torch.Size([8, 141, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,095\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 103, 420]), torch.Size([8, 103, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,115\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 113, 420]), torch.Size([8, 113, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,115\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 64, 420]), torch.Size([8, 64, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,136\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 203, 420]), torch.Size([8, 203, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,136\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 105, 420]), torch.Size([8, 105, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,153\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 67, 420]), torch.Size([8, 67, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,153\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 70, 420]), torch.Size([8, 70, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,170\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 69, 420]), torch.Size([8, 69, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,173\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 115, 420]), torch.Size([8, 115, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,191\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 188, 420]), torch.Size([8, 188, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,194\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 46, 420]), torch.Size([8, 46, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,204\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 78, 420]), torch.Size([8, 78, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,216\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 99, 420]), torch.Size([8, 99, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,221\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 99, 420]), torch.Size([8, 99, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,243\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 92, 420]), torch.Size([8, 92, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,243\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 114, 420]), torch.Size([8, 114, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,264\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 109, 420]), torch.Size([8, 109, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,264\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 74, 420]), torch.Size([8, 74, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,296\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 326, 420]), torch.Size([8, 326, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,297\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 144, 420]), torch.Size([8, 144, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,317\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 109, 420]), torch.Size([8, 109, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,317\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 87, 420]), torch.Size([8, 87, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,346\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 320, 420]), torch.Size([8, 320, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,347\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 83, 420]), torch.Size([8, 83, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,367\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 105, 420]), torch.Size([8, 105, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,367\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 72, 420]), torch.Size([8, 72, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,390\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 75, 420]), torch.Size([8, 75, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,390\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 307, 420]), torch.Size([8, 307, 1]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,396\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([6, 126, 420]), torch.Size([6, 126, 1]), torch.Size([6])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,473\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([4, 122, 420]), torch.Size([4, 122, 1]), torch.Size([4])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:19,500\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Start utterance-wise training...\u001b[0m\n",
" 0% 0/50 [00:00<?, ?it/s][\u001b[36m2020-07-08 13:21:22,408\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 1]: loss 0.9788151217831506\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:22,516\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 1]: loss 0.23424497246742249\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:22,525\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.23424497246742249: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 2% 1/50 [00:03<02:28, 3.02s/it][\u001b[36m2020-07-08 13:21:25,496\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 2]: loss 0.6184629404710399\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:25,594\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 2]: loss 0.14174586534500122\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:25,605\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.14174586534500122: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 4% 2/50 [00:06<02:25, 3.04s/it][\u001b[36m2020-07-08 13:21:28,526\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 3]: loss 0.4116517191545831\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:28,625\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 3]: loss 0.11850419640541077\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:28,635\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.11850419640541077: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 6% 3/50 [00:09<02:22, 3.04s/it][\u001b[36m2020-07-08 13:21:31,547\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 4]: loss 0.3519180185265011\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:31,647\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 4]: loss 0.10702089220285416\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:31,658\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.10702089220285416: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 8% 4/50 [00:12<02:19, 3.03s/it][\u001b[36m2020-07-08 13:21:34,581\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 5]: loss 0.29083782248198986\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:34,685\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 5]: loss 0.1230258196592331\u001b[0m\n",
" 10% 5/50 [00:15<02:16, 3.03s/it][\u001b[36m2020-07-08 13:21:37,650\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 6]: loss 0.28506931145158076\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:37,755\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 6]: loss 0.11724185198545456\u001b[0m\n",
" 12% 6/50 [00:18<02:13, 3.04s/it][\u001b[36m2020-07-08 13:21:40,771\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 7]: loss 0.2607448755039109\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:40,871\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 7]: loss 0.08807096630334854\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:40,882\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.08807096630334854: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 14% 7/50 [00:21<02:11, 3.07s/it][\u001b[36m2020-07-08 13:21:43,871\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 8]: loss 0.24339092605643803\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:43,977\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 8]: loss 0.10315831750631332\u001b[0m\n",
" 16% 8/50 [00:24<02:09, 3.08s/it][\u001b[36m2020-07-08 13:21:46,842\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 9]: loss 0.20792523378299343\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:46,946\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 9]: loss 0.16103094816207886\u001b[0m\n",
" 18% 9/50 [00:27<02:04, 3.04s/it][\u001b[36m2020-07-08 13:21:49,884\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 10]: loss 0.2087123818281624\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:49,993\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 10]: loss 0.07116121053695679\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:50,003\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.07116121053695679: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 20% 10/50 [00:30<02:01, 3.05s/it][\u001b[36m2020-07-08 13:21:52,807\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 11]: loss 0.1953940186649561\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:52,914\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 11]: loss 0.062331072986125946\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:52,924\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.062331072986125946: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 22% 11/50 [00:33<01:57, 3.01s/it][\u001b[36m2020-07-08 13:21:55,885\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 12]: loss 0.15693934655023944\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:55,994\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 12]: loss 0.05399143323302269\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:56,005\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.05399143323302269: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 24% 12/50 [00:36<01:55, 3.03s/it][\u001b[36m2020-07-08 13:21:58,831\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 13]: loss 0.14568777309937608\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:58,938\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 13]: loss 0.04817579686641693\u001b[0m\n",
"[\u001b[36m2020-07-08 13:21:58,953\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.04817579686641693: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 26% 13/50 [00:39<01:51, 3.01s/it][\u001b[36m2020-07-08 13:22:01,799\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 14]: loss 0.12911761179566383\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:01,905\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 14]: loss 0.04236113280057907\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:01,916\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.04236113280057907: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 28% 14/50 [00:42<01:47, 2.99s/it][\u001b[36m2020-07-08 13:22:04,845\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 15]: loss 0.12083490430894825\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:04,950\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 15]: loss 0.03692256286740303\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:04,961\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.03692256286740303: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 30% 15/50 [00:45<01:45, 3.01s/it][\u001b[36m2020-07-08 13:22:08,038\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 16]: loss 0.10706003134449323\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:08,141\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 16]: loss 0.04401819407939911\u001b[0m\n",
" 32% 16/50 [00:48<01:44, 3.06s/it][\u001b[36m2020-07-08 13:22:11,194\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 17]: loss 0.10751809479875697\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:11,293\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 17]: loss 0.03161386027932167\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:11,304\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.03161386027932167: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 34% 17/50 [00:51<01:42, 3.09s/it][\u001b[36m2020-07-08 13:22:14,205\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 18]: loss 0.10668584259433879\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:14,309\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 18]: loss 0.07530204951763153\u001b[0m\n",
" 36% 18/50 [00:54<01:38, 3.07s/it][\u001b[36m2020-07-08 13:22:17,176\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 19]: loss 0.1171399179018206\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:17,278\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 19]: loss 0.02967849001288414\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:17,289\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.02967849001288414: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 38% 19/50 [00:57<01:34, 3.04s/it][\u001b[36m2020-07-08 13:22:20,246\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 20]: loss 0.09101046570059326\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:20,349\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 20]: loss 0.03702346235513687\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:20,359\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/checkpoint_epoch0020.pth\u001b[0m\n",
" 40% 20/50 [01:00<01:31, 3.05s/it][\u001b[36m2020-07-08 13:22:23,287\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 21]: loss 0.07408795919683245\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:23,390\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 21]: loss 0.03391813114285469\u001b[0m\n",
" 42% 21/50 [01:03<01:28, 3.04s/it][\u001b[36m2020-07-08 13:22:26,305\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 22]: loss 0.0726237690800594\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:26,418\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 22]: loss 0.03123507834970951\u001b[0m\n",
" 44% 22/50 [01:06<01:25, 3.04s/it][\u001b[36m2020-07-08 13:22:29,413\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 23]: loss 0.06813227809551689\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:29,534\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 23]: loss 0.031177451834082603\u001b[0m\n",
" 46% 23/50 [01:10<01:22, 3.06s/it][\u001b[36m2020-07-08 13:22:32,480\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 24]: loss 0.07268113223835826\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:32,600\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 24]: loss 0.04517199471592903\u001b[0m\n",
" 48% 24/50 [01:13<01:19, 3.06s/it][\u001b[36m2020-07-08 13:22:35,542\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 25]: loss 0.06666368877308236\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:35,655\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 25]: loss 0.038387589156627655\u001b[0m\n",
" 50% 25/50 [01:16<01:16, 3.06s/it][\u001b[36m2020-07-08 13:22:38,526\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 26]: loss 0.06579243686671059\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:38,630\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 26]: loss 0.04000464454293251\u001b[0m\n",
" 52% 26/50 [01:19<01:12, 3.04s/it][\u001b[36m2020-07-08 13:22:41,553\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 27]: loss 0.06716252490878105\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:41,664\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 27]: loss 0.02943587303161621\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:41,675\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.02943587303161621: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 54% 27/50 [01:22<01:09, 3.04s/it][\u001b[36m2020-07-08 13:22:44,565\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 28]: loss 0.058823291729721755\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:44,663\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 28]: loss 0.034122880548238754\u001b[0m\n",
" 56% 28/50 [01:25<01:06, 3.02s/it][\u001b[36m2020-07-08 13:22:47,576\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 29]: loss 0.0629075369797647\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:47,679\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 29]: loss 0.026834849268198013\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:47,690\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.026834849268198013: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 58% 29/50 [01:28<01:03, 3.02s/it][\u001b[36m2020-07-08 13:22:50,687\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 30]: loss 0.06447314522746536\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:50,789\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 30]: loss 0.030460676178336143\u001b[0m\n",
" 60% 30/50 [01:31<01:00, 3.05s/it][\u001b[36m2020-07-08 13:22:53,736\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 31]: loss 0.06063674726626939\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:53,840\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 31]: loss 0.04267503693699837\u001b[0m\n",
" 62% 31/50 [01:34<00:57, 3.05s/it][\u001b[36m2020-07-08 13:22:56,861\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 32]: loss 0.05852408614009619\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:56,977\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 32]: loss 0.03663172572851181\u001b[0m\n",
" 64% 32/50 [01:37<00:55, 3.07s/it][\u001b[36m2020-07-08 13:22:59,774\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 33]: loss 0.05029191847683655\u001b[0m\n",
"[\u001b[36m2020-07-08 13:22:59,887\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 33]: loss 0.031059743836522102\u001b[0m\n",
" 66% 33/50 [01:40<00:51, 3.03s/it][\u001b[36m2020-07-08 13:23:02,876\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 34]: loss 0.0554132031587263\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:02,989\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 34]: loss 0.04145012050867081\u001b[0m\n",
" 68% 34/50 [01:43<00:48, 3.05s/it][\u001b[36m2020-07-08 13:23:05,984\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 35]: loss 0.056521674514644675\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:06,095\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 35]: loss 0.02637258730828762\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:06,106\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.02637258730828762: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 70% 35/50 [01:46<00:46, 3.07s/it][\u001b[36m2020-07-08 13:23:09,091\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 36]: loss 0.05734956067883306\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:09,197\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 36]: loss 0.046009570360183716\u001b[0m\n",
" 72% 36/50 [01:49<00:43, 3.08s/it][\u001b[36m2020-07-08 13:23:12,195\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 37]: loss 0.048293033304313816\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:12,297\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 37]: loss 0.04870763048529625\u001b[0m\n",
" 74% 37/50 [01:52<00:40, 3.08s/it][\u001b[36m2020-07-08 13:23:15,227\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 38]: loss 0.047517083688742585\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:15,330\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 38]: loss 0.03859198838472366\u001b[0m\n",
" 76% 38/50 [01:55<00:36, 3.07s/it][\u001b[36m2020-07-08 13:23:18,387\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 39]: loss 0.0485696009774175\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:18,490\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 39]: loss 0.02744862623512745\u001b[0m\n",
" 78% 39/50 [01:58<00:34, 3.10s/it][\u001b[36m2020-07-08 13:23:21,461\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 40]: loss 0.04585965981500016\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:21,564\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 40]: loss 0.027844272553920746\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:21,574\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/checkpoint_epoch0040.pth\u001b[0m\n",
" 80% 40/50 [02:02<00:30, 3.09s/it][\u001b[36m2020-07-08 13:23:24,614\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 41]: loss 0.04179505605457558\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:24,717\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 41]: loss 0.04385361447930336\u001b[0m\n",
" 82% 41/50 [02:05<00:27, 3.11s/it][\u001b[36m2020-07-08 13:23:27,660\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 42]: loss 0.03881238421632184\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:27,763\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 42]: loss 0.02866148203611374\u001b[0m\n",
" 84% 42/50 [02:08<00:24, 3.09s/it][\u001b[36m2020-07-08 13:23:30,704\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 43]: loss 0.04025567897285024\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:30,811\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 43]: loss 0.03752688318490982\u001b[0m\n",
" 86% 43/50 [02:11<00:21, 3.08s/it][\u001b[36m2020-07-08 13:23:33,786\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 44]: loss 0.03953476110473275\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:33,892\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 44]: loss 0.02555997297167778\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:33,902\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.02555997297167778: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 88% 44/50 [02:14<00:18, 3.08s/it][\u001b[36m2020-07-08 13:23:36,940\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 45]: loss 0.037666411915173136\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:37,047\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 45]: loss 0.03175856173038483\u001b[0m\n",
" 90% 45/50 [02:17<00:15, 3.10s/it][\u001b[36m2020-07-08 13:23:40,013\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 46]: loss 0.038618513020790286\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:40,128\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 46]: loss 0.032282426953315735\u001b[0m\n",
" 92% 46/50 [02:20<00:12, 3.09s/it][\u001b[36m2020-07-08 13:23:43,158\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 47]: loss 0.0381338927998311\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:43,266\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 47]: loss 0.03129419684410095\u001b[0m\n",
" 94% 47/50 [02:23<00:09, 3.11s/it][\u001b[36m2020-07-08 13:23:46,258\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 48]: loss 0.03790443173299233\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:46,374\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 48]: loss 0.028950635343790054\u001b[0m\n",
" 96% 48/50 [02:26<00:06, 3.11s/it][\u001b[36m2020-07-08 13:23:49,169\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 49]: loss 0.03582709272288614\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:49,275\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 49]: loss 0.025026710703969002\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:49,285\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.025026710703969002: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/best_loss.pth\u001b[0m\n",
" 98% 49/50 [02:29<00:03, 3.05s/it][\u001b[36m2020-07-08 13:23:52,236\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 50]: loss 0.03447776919023858\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:52,338\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 50]: loss 0.02567560411989689\u001b[0m\n",
"100% 50/50 [02:32<00:00, 3.06s/it]\n",
"[\u001b[36m2020-07-08 13:23:52,348\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/checkpoint_epoch0050.pth\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:52,354\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - The best loss was 0.025026710703969002\u001b[0m\n",
"+ set +x\n",
"stage 4: Training acoustic model\n",
"+ nnsvs-train data.train_no_dev.in_dir=dump/natsumeyuuri/norm/train_no_dev/in_acoustic/ data.train_no_dev.out_dir=dump/natsumeyuuri/norm/train_no_dev/out_acoustic/ data.dev.in_dir=dump/natsumeyuuri/norm/dev/in_acoustic/ data.dev.out_dir=dump/natsumeyuuri/norm/dev/out_acoustic/ model=acoustic train.out_dir=exp/natsumeyuuri/acoustic data.batch_size=8 resume.checkpoint=\n",
"[\u001b[36m2020-07-08 13:23:54,199\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn:\n",
" benchmark: false\n",
" deterministic: false\n",
"data:\n",
" batch_size: 8\n",
" dev:\n",
" in_dir: dump/natsumeyuuri/norm/dev/in_acoustic/\n",
" out_dir: dump/natsumeyuuri/norm/dev/out_acoustic/\n",
" num_workers: 2\n",
" pin_memory: true\n",
" train_no_dev:\n",
" in_dir: dump/natsumeyuuri/norm/train_no_dev/in_acoustic/\n",
" out_dir: dump/natsumeyuuri/norm/train_no_dev/out_acoustic/\n",
"model:\n",
" has_dynamic_features:\n",
" - true\n",
" - true\n",
" - false\n",
" - true\n",
" netG:\n",
" class: nnsvs.model.Conv1dResnet\n",
" params:\n",
" dropout: 0.1\n",
" hidden_dim: 128\n",
" in_dim: 424\n",
" num_layers: 6\n",
" out_dim: 199\n",
" num_windows: 3\n",
" stream_sizes:\n",
" - 180\n",
" - 3\n",
" - 1\n",
" - 15\n",
" stream_weights: null\n",
"optim:\n",
" lr_scheduler:\n",
" name: StepLR\n",
" params:\n",
" gamma: 0.5\n",
" step_size: 20\n",
" optimizer:\n",
" name: Adam\n",
" params:\n",
" betas:\n",
" - 0.5\n",
" - 0.999\n",
" lr: 0.001\n",
" weight_decay: 0.0\n",
"resume:\n",
" checkpoint: null\n",
" load_optimizer: false\n",
"train:\n",
" checkpoint_epoch_interval: 20\n",
" nepochs: 50\n",
" out_dir: exp/natsumeyuuri/acoustic\n",
" stream_wise_loss: false\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:54,199\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn.deterministic: False\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:54,200\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - cudnn.benchmark: False\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:57,037\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3291, 424]), torch.Size([8, 3291, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:57,038\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2931, 424]), torch.Size([8, 2931, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:57,225\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2584, 424]), torch.Size([8, 2584, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:57,226\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2525, 424]), torch.Size([8, 2525, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:57,727\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3779, 424]), torch.Size([8, 3779, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:57,728\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2692, 424]), torch.Size([8, 2692, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:58,019\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3395, 424]), torch.Size([8, 3395, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:58,020\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3402, 424]), torch.Size([8, 3402, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:58,231\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2451, 424]), torch.Size([8, 2451, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:58,232\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2516, 424]), torch.Size([8, 2516, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:58,479\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2709, 424]), torch.Size([8, 2709, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:58,965\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 5888, 424]), torch.Size([8, 5888, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:58,966\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3194, 424]), torch.Size([8, 3194, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:59,533\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 7148, 424]), torch.Size([8, 7148, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:59,533\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3399, 424]), torch.Size([8, 3399, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:59,605\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2656, 424]), torch.Size([8, 2656, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:59,605\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2432, 424]), torch.Size([8, 2432, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:23:59,856\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 4614, 424]), torch.Size([8, 4614, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:00,334\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3030, 424]), torch.Size([8, 3030, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:00,381\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 4378, 424]), torch.Size([8, 4378, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:00,716\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 4930, 424]), torch.Size([8, 4930, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:01,035\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3641, 424]), torch.Size([8, 3641, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:01,217\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2895, 424]), torch.Size([8, 2895, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:02,343\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 6324, 424]), torch.Size([8, 6324, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:02,344\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 10571, 424]), torch.Size([8, 10571, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:02,922\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3047, 424]), torch.Size([8, 3047, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:02,923\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3488, 424]), torch.Size([8, 3488, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:04,009\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 4851, 424]), torch.Size([8, 4851, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:04,011\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2454, 424]), torch.Size([8, 2454, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:06,206\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 17824, 424]), torch.Size([8, 17824, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:06,207\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 10885, 424]), torch.Size([8, 10885, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:06,961\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 2516, 424]), torch.Size([8, 2516, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:09,708\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 12670, 424]), torch.Size([8, 12670, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:09,708\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3054, 424]), torch.Size([8, 3054, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:10,439\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([8, 3307, 424]), torch.Size([8, 3307, 199]), torch.Size([8])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:10,440\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([6, 5484, 424]), torch.Size([6, 5484, 199]), torch.Size([6])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:10,666\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - torch.Size([4, 4107, 424]), torch.Size([4, 4107, 199]), torch.Size([4])\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:10,700\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Start utterance-wise training...\u001b[0m\n",
" 0% 0/50 [00:00<?, ?it/s][\u001b[36m2020-07-08 13:24:35,700\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 1]: loss 0.9712603870365355\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:36,033\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 1]: loss 1.0082499980926514\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:36,060\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 1.0082499980926514: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 2% 1/50 [00:25<20:42, 25.35s/it][\u001b[36m2020-07-08 13:24:49,136\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 2]: loss 0.9302392502625784\u001b[0m\n",
"[\u001b[36m2020-07-08 13:24:49,462\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 2]: loss 1.0185375213623047\u001b[0m\n",
" 4% 2/50 [00:38<17:24, 21.77s/it][\u001b[36m2020-07-08 13:25:04,623\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 3]: loss 0.9178364657693439\u001b[0m\n",
"[\u001b[36m2020-07-08 13:25:04,959\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 3]: loss 0.986689567565918\u001b[0m\n",
"[\u001b[36m2020-07-08 13:25:04,996\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.986689567565918: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 6% 3/50 [00:54<15:35, 19.90s/it][\u001b[36m2020-07-08 13:25:17,242\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 4]: loss 0.9141632268826166\u001b[0m\n",
"[\u001b[36m2020-07-08 13:25:17,572\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 4]: loss 0.996056854724884\u001b[0m\n",
" 8% 4/50 [01:06<13:34, 17.70s/it][\u001b[36m2020-07-08 13:25:28,513\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 5]: loss 0.9147483077314165\u001b[0m\n",
"[\u001b[36m2020-07-08 13:25:28,846\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 5]: loss 0.9841826558113098\u001b[0m\n",
"[\u001b[36m2020-07-08 13:25:28,878\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9841826558113098: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 10% 5/50 [01:18<11:50, 15.78s/it][\u001b[36m2020-07-08 13:25:39,927\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 6]: loss 0.9078965253300137\u001b[0m\n",
"[\u001b[36m2020-07-08 13:25:40,262\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 6]: loss 0.9945037364959717\u001b[0m\n",
" 12% 6/50 [01:29<10:36, 14.46s/it][\u001b[36m2020-07-08 13:25:50,699\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 7]: loss 0.9073386291662852\u001b[0m\n",
"[\u001b[36m2020-07-08 13:25:51,034\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 7]: loss 1.010166049003601\u001b[0m\n",
" 14% 7/50 [01:40<09:34, 13.36s/it][\u001b[36m2020-07-08 13:26:05,667\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 8]: loss 0.9082122047742208\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:06,003\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 8]: loss 0.9847797155380249\u001b[0m\n",
" 16% 8/50 [01:55<09:41, 13.84s/it][\u001b[36m2020-07-08 13:26:17,190\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 9]: loss 0.9052143196264902\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:17,529\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 9]: loss 0.9708256125450134\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:17,565\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9708256125450134: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 18% 9/50 [02:06<08:59, 13.16s/it][\u001b[36m2020-07-08 13:26:28,942\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 10]: loss 0.9005531320969263\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:29,269\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 10]: loss 0.9693053960800171\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:29,304\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9693053960800171: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 20% 10/50 [02:18<08:29, 12.73s/it][\u001b[36m2020-07-08 13:26:40,762\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 11]: loss 0.9040981514586343\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:41,095\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 11]: loss 0.9651496410369873\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:41,138\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9651496410369873: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 22% 11/50 [02:30<08:06, 12.46s/it][\u001b[36m2020-07-08 13:26:52,056\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 12]: loss 0.902095165517595\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:52,397\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 12]: loss 0.9566709995269775\u001b[0m\n",
"[\u001b[36m2020-07-08 13:26:52,432\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9566709995269775: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 24% 12/50 [02:41<07:40, 12.11s/it][\u001b[36m2020-07-08 13:27:03,879\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 13]: loss 0.9002875222100152\u001b[0m\n",
"[\u001b[36m2020-07-08 13:27:04,215\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 13]: loss 0.9774976968765259\u001b[0m\n",
" 26% 13/50 [02:53<07:24, 12.01s/it][\u001b[36m2020-07-08 13:27:14,930\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 14]: loss 0.8936160422033734\u001b[0m\n",
"[\u001b[36m2020-07-08 13:27:15,258\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 14]: loss 0.9530430436134338\u001b[0m\n",
"[\u001b[36m2020-07-08 13:27:15,287\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9530430436134338: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 28% 14/50 [03:04<07:02, 11.73s/it][\u001b[36m2020-07-08 13:27:26,370\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 15]: loss 0.8957512709829543\u001b[0m\n",
"[\u001b[36m2020-07-08 13:27:26,703\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 15]: loss 0.962577760219574\u001b[0m\n",
" 30% 15/50 [03:15<06:47, 11.64s/it][\u001b[36m2020-07-08 13:27:37,424\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 16]: loss 0.8915612035327487\u001b[0m\n",
"[\u001b[36m2020-07-08 13:27:37,762\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 16]: loss 0.9533936381340027\u001b[0m\n",
" 32% 16/50 [03:27<06:29, 11.46s/it][\u001b[36m2020-07-08 13:27:48,458\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 17]: loss 0.8913761360777749\u001b[0m\n",
"[\u001b[36m2020-07-08 13:27:48,796\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 17]: loss 0.9621612429618835\u001b[0m\n",
" 34% 17/50 [03:38<06:14, 11.33s/it][\u001b[36m2020-07-08 13:28:00,102\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 18]: loss 0.8909211489889357\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:00,448\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 18]: loss 0.9578041434288025\u001b[0m\n",
" 36% 18/50 [03:49<06:05, 11.43s/it][\u001b[36m2020-07-08 13:28:11,303\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 19]: loss 0.8944023566113578\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:11,637\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 19]: loss 0.9569771885871887\u001b[0m\n",
" 38% 19/50 [04:00<05:52, 11.36s/it][\u001b[36m2020-07-08 13:28:22,463\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 20]: loss 0.8892432633373473\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:22,795\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 20]: loss 0.949061930179596\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:22,825\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.949061930179596: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:22,851\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/checkpoint_epoch0020.pth\u001b[0m\n",
" 40% 20/50 [04:12<05:39, 11.32s/it][\u001b[36m2020-07-08 13:28:34,077\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 21]: loss 0.8810498416423798\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:34,401\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 21]: loss 0.9506465196609497\u001b[0m\n",
" 42% 21/50 [04:23<05:30, 11.38s/it][\u001b[36m2020-07-08 13:28:45,326\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 22]: loss 0.8841317147016525\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:45,653\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 22]: loss 0.9442095160484314\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:45,683\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9442095160484314: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 44% 22/50 [04:34<05:17, 11.35s/it][\u001b[36m2020-07-08 13:28:56,826\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 23]: loss 0.8789101723167632\u001b[0m\n",
"[\u001b[36m2020-07-08 13:28:57,182\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 23]: loss 0.9469482898712158\u001b[0m\n",
" 46% 23/50 [04:46<05:07, 11.40s/it][\u001b[36m2020-07-08 13:29:07,995\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 24]: loss 0.8827470491329829\u001b[0m\n",
"[\u001b[36m2020-07-08 13:29:08,328\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 24]: loss 0.9495368003845215\u001b[0m\n",
" 48% 24/50 [04:57<04:54, 11.32s/it][\u001b[36m2020-07-08 13:29:19,834\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 25]: loss 0.8823114915026559\u001b[0m\n",
"[\u001b[36m2020-07-08 13:29:20,166\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 25]: loss 0.9500911235809326\u001b[0m\n",
" 50% 25/50 [05:09<04:46, 11.48s/it][\u001b[36m2020-07-08 13:29:31,058\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 26]: loss 0.8807980351977878\u001b[0m\n",
"[\u001b[36m2020-07-08 13:29:31,385\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 26]: loss 0.944189727306366\u001b[0m\n",
"[\u001b[36m2020-07-08 13:29:31,415\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.944189727306366: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 52% 26/50 [05:20<04:33, 11.41s/it][\u001b[36m2020-07-08 13:29:42,616\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 27]: loss 0.878420329756207\u001b[0m\n",
"[\u001b[36m2020-07-08 13:29:42,950\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 27]: loss 0.945614218711853\u001b[0m\n",
" 54% 27/50 [05:32<04:23, 11.45s/it][\u001b[36m2020-07-08 13:29:53,630\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 28]: loss 0.8754172258906894\u001b[0m\n",
"[\u001b[36m2020-07-08 13:29:53,971\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 28]: loss 0.9548240303993225\u001b[0m\n",
" 56% 28/50 [05:43<04:09, 11.32s/it][\u001b[36m2020-07-08 13:30:05,244\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 29]: loss 0.879935599035687\u001b[0m\n",
"[\u001b[36m2020-07-08 13:30:05,579\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 29]: loss 0.9470394253730774\u001b[0m\n",
" 58% 29/50 [05:54<03:59, 11.41s/it][\u001b[36m2020-07-08 13:30:16,106\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 30]: loss 0.8777108970615599\u001b[0m\n",
"[\u001b[36m2020-07-08 13:30:16,439\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 30]: loss 0.9476612210273743\u001b[0m\n",
" 60% 30/50 [06:05<03:44, 11.24s/it][\u001b[36m2020-07-08 13:30:27,807\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 31]: loss 0.879016121228536\u001b[0m\n",
"[\u001b[36m2020-07-08 13:30:28,146\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 31]: loss 0.9467730522155762\u001b[0m\n",
" 62% 31/50 [06:17<03:36, 11.38s/it][\u001b[36m2020-07-08 13:30:38,976\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 32]: loss 0.8759454621209039\u001b[0m\n",
"[\u001b[36m2020-07-08 13:30:39,309\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 32]: loss 0.9443545937538147\u001b[0m\n",
" 64% 32/50 [06:28<03:23, 11.32s/it][\u001b[36m2020-07-08 13:30:49,675\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 33]: loss 0.8773131204975976\u001b[0m\n",
"[\u001b[36m2020-07-08 13:30:49,998\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 33]: loss 0.9458482265472412\u001b[0m\n",
" 66% 33/50 [06:39<03:09, 11.13s/it][\u001b[36m2020-07-08 13:31:01,380\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 34]: loss 0.8765087657504611\u001b[0m\n",
"[\u001b[36m2020-07-08 13:31:01,712\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 34]: loss 0.9425597786903381\u001b[0m\n",
"[\u001b[36m2020-07-08 13:31:01,748\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9425597786903381: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 68% 34/50 [06:51<03:01, 11.31s/it][\u001b[36m2020-07-08 13:31:13,219\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 35]: loss 0.8787198944224252\u001b[0m\n",
"[\u001b[36m2020-07-08 13:31:13,546\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 35]: loss 0.9408955574035645\u001b[0m\n",
"[\u001b[36m2020-07-08 13:31:13,577\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9408955574035645: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 70% 35/50 [07:02<02:52, 11.47s/it][\u001b[36m2020-07-08 13:31:26,460\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 36]: loss 0.8759144792954127\u001b[0m\n",
"[\u001b[36m2020-07-08 13:31:26,784\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 36]: loss 0.9459250569343567\u001b[0m\n",
" 72% 36/50 [07:16<02:47, 11.99s/it][\u001b[36m2020-07-08 13:31:37,454\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 37]: loss 0.8750185850593779\u001b[0m\n",
"[\u001b[36m2020-07-08 13:31:37,794\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 37]: loss 0.9437361359596252\u001b[0m\n",
" 74% 37/50 [07:27<02:32, 11.70s/it][\u001b[36m2020-07-08 13:31:48,974\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 38]: loss 0.8739786297082901\u001b[0m\n",
"[\u001b[36m2020-07-08 13:31:49,306\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 38]: loss 0.9524400234222412\u001b[0m\n",
" 76% 38/50 [07:38<02:19, 11.64s/it][\u001b[36m2020-07-08 13:32:00,587\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 39]: loss 0.8758373094929589\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:00,923\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 39]: loss 0.9428941011428833\u001b[0m\n",
" 78% 39/50 [07:50<02:07, 11.63s/it][\u001b[36m2020-07-08 13:32:12,799\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 40]: loss 0.8744848403665755\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:13,132\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 40]: loss 0.9442757964134216\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:13,159\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/checkpoint_epoch0040.pth\u001b[0m\n",
" 80% 40/50 [08:02<01:58, 11.82s/it][\u001b[36m2020-07-08 13:32:24,206\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 41]: loss 0.8678607145945231\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:24,537\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 41]: loss 0.9397162199020386\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:24,568\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9397162199020386: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 82% 41/50 [08:13<01:45, 11.69s/it][\u001b[36m2020-07-08 13:32:34,951\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 42]: loss 0.8692252768410577\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:35,282\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 42]: loss 0.9396857023239136\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:35,313\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.9396857023239136: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 84% 42/50 [08:24<01:31, 11.41s/it][\u001b[36m2020-07-08 13:32:46,636\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 43]: loss 0.869785401556227\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:46,967\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 43]: loss 0.9420670866966248\u001b[0m\n",
" 86% 43/50 [08:36<01:20, 11.48s/it][\u001b[36m2020-07-08 13:32:57,992\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 44]: loss 0.8687263776858648\u001b[0m\n",
"[\u001b[36m2020-07-08 13:32:58,329\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 44]: loss 0.9404141306877136\u001b[0m\n",
" 88% 44/50 [08:47<01:08, 11.45s/it][\u001b[36m2020-07-08 13:33:09,410\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 45]: loss 0.8677939805719588\u001b[0m\n",
"[\u001b[36m2020-07-08 13:33:09,757\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 45]: loss 0.9410929083824158\u001b[0m\n",
" 90% 45/50 [08:59<00:57, 11.44s/it][\u001b[36m2020-07-08 13:33:20,737\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 46]: loss 0.8687533156739341\u001b[0m\n",
"[\u001b[36m2020-07-08 13:33:21,066\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 46]: loss 0.9409741163253784\u001b[0m\n",
" 92% 46/50 [09:10<00:45, 11.40s/it][\u001b[36m2020-07-08 13:33:32,805\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 47]: loss 0.8697140763203303\u001b[0m\n",
"[\u001b[36m2020-07-08 13:33:33,137\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 47]: loss 0.938656210899353\u001b[0m\n",
"[\u001b[36m2020-07-08 13:33:33,169\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [Best loss 0.938656210899353: checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/best_loss.pth\u001b[0m\n",
" 94% 47/50 [09:22<00:34, 11.61s/it][\u001b[36m2020-07-08 13:33:44,353\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 48]: loss 0.8698295619752672\u001b[0m\n",
"[\u001b[36m2020-07-08 13:33:44,694\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 48]: loss 0.9392961263656616\u001b[0m\n",
" 96% 48/50 [09:33<00:23, 11.59s/it][\u001b[36m2020-07-08 13:33:56,072\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 49]: loss 0.8685698029067781\u001b[0m\n",
"[\u001b[36m2020-07-08 13:33:56,402\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 49]: loss 0.940766453742981\u001b[0m\n",
" 98% 49/50 [09:45<00:11, 11.62s/it][\u001b[36m2020-07-08 13:34:07,290\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [train_no_dev] [Epoch 50]: loss 0.866032212972641\u001b[0m\n",
"[\u001b[36m2020-07-08 13:34:07,628\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - [dev] [Epoch 50]: loss 0.9434857368469238\u001b[0m\n",
"100% 50/50 [09:56<00:00, 11.94s/it]\n",
"[\u001b[36m2020-07-08 13:34:07,657\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Checkpoint is saved at /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/checkpoint_epoch0050.pth\u001b[0m\n",
"[\u001b[36m2020-07-08 13:34:07,676\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - The best loss was 0.938656210899353\u001b[0m\n",
"+ set +x\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "L-30YYpIXaqo",
"colab_type": "text"
},
"source": [
"## ステージ 5-6 (歌声合成) の実行\n",
"\n",
"ステージ 5 は学習したモデルを使った歌声合成に必要なパラメータの計算, ステージ 6 がそのパラメータを使った歌声合成になります. \n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "YnPWRwTgqUhg",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "26aece5c-1cee-4850-ee0e-0e83643bb89a"
},
"source": [
"# stage5-6の実行(モデルデータの出力および歌声合成)\n",
"! cd $RECIPE_ROOT && bash run.sh --stage 5 --stop-stage 6"
],
"execution_count": 15,
"outputs": [
{
"output_type": "stream",
"text": [
"stage 5: Generation features from timelag/duration/acoustic models\n",
"+ nnsvs-generate model.checkpoint=exp/natsumeyuuri/timelag/latest.pth model.model_yaml=exp/natsumeyuuri/timelag/model.yaml out_scaler_path=dump/natsumeyuuri/norm/out_timelag_scaler.joblib in_dir=dump/natsumeyuuri/norm/dev/in_timelag/ out_dir=exp/natsumeyuuri/timelag/predicted/dev/latest/\n",
"[\u001b[36m2020-07-08 13:34:13,519\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/norm/dev/in_timelag/\n",
"model:\n",
" checkpoint: exp/natsumeyuuri/timelag/latest.pth\n",
" model_yaml: exp/natsumeyuuri/timelag/model.yaml\n",
"out_dir: exp/natsumeyuuri/timelag/predicted/dev/latest/\n",
"out_scaler_path: dump/natsumeyuuri/norm/out_timelag_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 389.71it/s]\n",
"+ set +x\n",
"+ nnsvs-generate model.checkpoint=exp/natsumeyuuri/duration/latest.pth model.model_yaml=exp/natsumeyuuri/duration/model.yaml out_scaler_path=dump/natsumeyuuri/norm/out_duration_scaler.joblib in_dir=dump/natsumeyuuri/norm/dev/in_duration/ out_dir=exp/natsumeyuuri/duration/predicted/dev/latest/\n",
"[\u001b[36m2020-07-08 13:34:18,066\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/norm/dev/in_duration/\n",
"model:\n",
" checkpoint: exp/natsumeyuuri/duration/latest.pth\n",
" model_yaml: exp/natsumeyuuri/duration/model.yaml\n",
"out_dir: exp/natsumeyuuri/duration/predicted/dev/latest/\n",
"out_scaler_path: dump/natsumeyuuri/norm/out_duration_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 43.27it/s]\n",
"+ set +x\n",
"+ nnsvs-generate model.checkpoint=exp/natsumeyuuri/acoustic/latest.pth model.model_yaml=exp/natsumeyuuri/acoustic/model.yaml out_scaler_path=dump/natsumeyuuri/norm/out_acoustic_scaler.joblib in_dir=dump/natsumeyuuri/norm/dev/in_acoustic/ out_dir=exp/natsumeyuuri/acoustic/predicted/dev/latest/\n",
"[\u001b[36m2020-07-08 13:34:22,765\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/norm/dev/in_acoustic/\n",
"model:\n",
" checkpoint: exp/natsumeyuuri/acoustic/latest.pth\n",
" model_yaml: exp/natsumeyuuri/acoustic/model.yaml\n",
"out_dir: exp/natsumeyuuri/acoustic/predicted/dev/latest/\n",
"out_scaler_path: dump/natsumeyuuri/norm/out_acoustic_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 21.36it/s]\n",
"+ set +x\n",
"+ nnsvs-generate model.checkpoint=exp/natsumeyuuri/timelag/latest.pth model.model_yaml=exp/natsumeyuuri/timelag/model.yaml out_scaler_path=dump/natsumeyuuri/norm/out_timelag_scaler.joblib in_dir=dump/natsumeyuuri/norm/eval/in_timelag/ out_dir=exp/natsumeyuuri/timelag/predicted/eval/latest/\n",
"[\u001b[36m2020-07-08 13:34:27,505\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/norm/eval/in_timelag/\n",
"model:\n",
" checkpoint: exp/natsumeyuuri/timelag/latest.pth\n",
" model_yaml: exp/natsumeyuuri/timelag/model.yaml\n",
"out_dir: exp/natsumeyuuri/timelag/predicted/eval/latest/\n",
"out_scaler_path: dump/natsumeyuuri/norm/out_timelag_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 400.67it/s]\n",
"+ set +x\n",
"+ nnsvs-generate model.checkpoint=exp/natsumeyuuri/duration/latest.pth model.model_yaml=exp/natsumeyuuri/duration/model.yaml out_scaler_path=dump/natsumeyuuri/norm/out_duration_scaler.joblib in_dir=dump/natsumeyuuri/norm/eval/in_duration/ out_dir=exp/natsumeyuuri/duration/predicted/eval/latest/\n",
"[\u001b[36m2020-07-08 13:34:32,028\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/norm/eval/in_duration/\n",
"model:\n",
" checkpoint: exp/natsumeyuuri/duration/latest.pth\n",
" model_yaml: exp/natsumeyuuri/duration/model.yaml\n",
"out_dir: exp/natsumeyuuri/duration/predicted/eval/latest/\n",
"out_scaler_path: dump/natsumeyuuri/norm/out_duration_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 44.39it/s]\n",
"+ set +x\n",
"+ nnsvs-generate model.checkpoint=exp/natsumeyuuri/acoustic/latest.pth model.model_yaml=exp/natsumeyuuri/acoustic/model.yaml out_scaler_path=dump/natsumeyuuri/norm/out_acoustic_scaler.joblib in_dir=dump/natsumeyuuri/norm/eval/in_acoustic/ out_dir=exp/natsumeyuuri/acoustic/predicted/eval/latest/\n",
"[\u001b[36m2020-07-08 13:34:36,611\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - in_dir: dump/natsumeyuuri/norm/eval/in_acoustic/\n",
"model:\n",
" checkpoint: exp/natsumeyuuri/acoustic/latest.pth\n",
" model_yaml: exp/natsumeyuuri/acoustic/model.yaml\n",
"out_dir: exp/natsumeyuuri/acoustic/predicted/eval/latest/\n",
"out_scaler_path: dump/natsumeyuuri/norm/out_acoustic_scaler.joblib\n",
"verbose: 100\n",
"\u001b[0m\n",
"100% 4/4 [00:00<00:00, 7.31it/s]\n",
"+ set +x\n",
"stage 6: Synthesis waveforms\n",
"+ nnsvs-synthesis question_path=conf/jp_qst001_nnsvs.hed timelag.checkpoint=exp/natsumeyuuri/timelag/latest.pth timelag.in_scaler_path=dump/natsumeyuuri/norm/in_timelag_scaler.joblib timelag.out_scaler_path=dump/natsumeyuuri/norm/out_timelag_scaler.joblib timelag.model_yaml=exp/natsumeyuuri/timelag/model.yaml duration.checkpoint=exp/natsumeyuuri/duration/latest.pth duration.in_scaler_path=dump/natsumeyuuri/norm/in_duration_scaler.joblib duration.out_scaler_path=dump/natsumeyuuri/norm/out_duration_scaler.joblib duration.model_yaml=exp/natsumeyuuri/duration/model.yaml acoustic.checkpoint=exp/natsumeyuuri/acoustic/latest.pth acoustic.in_scaler_path=dump/natsumeyuuri/norm/in_acoustic_scaler.joblib acoustic.out_scaler_path=dump/natsumeyuuri/norm/out_acoustic_scaler.joblib acoustic.model_yaml=exp/natsumeyuuri/acoustic/model.yaml utt_list=./data/list/dev.list in_dir=data/acoustic/label_phone_score/ out_dir=exp/natsumeyuuri/synthesis/dev/latest/label_phone_score ground_truth_duration=false\n",
"[\u001b[36m2020-07-08 13:34:41,740\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - acoustic:\n",
" checkpoint: exp/natsumeyuuri/acoustic/latest.pth\n",
" has_dynamic_features:\n",
" - true\n",
" - true\n",
" - false\n",
" - true\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_acoustic_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/acoustic/model.yaml\n",
" num_windows: 3\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_acoustic_scaler.joblib\n",
" post_filter: true\n",
" question_path: null\n",
" relative_f0: true\n",
" stream_sizes:\n",
" - 180\n",
" - 3\n",
" - 1\n",
" - 15\n",
" subphone_features: coarse_coding\n",
"device: cuda\n",
"duration:\n",
" checkpoint: exp/natsumeyuuri/duration/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_duration_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/duration/model.yaml\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_duration_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"frame_period: 5\n",
"ground_truth_duration: false\n",
"in_dir: data/acoustic/label_phone_score/\n",
"label_path: null\n",
"log_f0_conditioning: true\n",
"out_dir: exp/natsumeyuuri/synthesis/dev/latest/label_phone_score\n",
"out_wav_path: null\n",
"question_path: conf/jp_qst001_nnsvs.hed\n",
"sample_rate: 48000\n",
"timelag:\n",
" allowed_range:\n",
" - -20\n",
" - 20\n",
" allowed_range_rest:\n",
" - -40\n",
" - 40\n",
" checkpoint: exp/natsumeyuuri/timelag/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_timelag_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/timelag/model.yaml\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_timelag_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"utt_list: ./data/list/dev.list\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:34:44,077\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Processes 4 utterances...\u001b[0m\n",
"100% 4/4 [00:23<00:00, 5.93s/it]\n",
"+ set +x\n",
"+ nnsvs-synthesis question_path=conf/jp_qst001_nnsvs.hed timelag.checkpoint=exp/natsumeyuuri/timelag/latest.pth timelag.in_scaler_path=dump/natsumeyuuri/norm/in_timelag_scaler.joblib timelag.out_scaler_path=dump/natsumeyuuri/norm/out_timelag_scaler.joblib timelag.model_yaml=exp/natsumeyuuri/timelag/model.yaml duration.checkpoint=exp/natsumeyuuri/duration/latest.pth duration.in_scaler_path=dump/natsumeyuuri/norm/in_duration_scaler.joblib duration.out_scaler_path=dump/natsumeyuuri/norm/out_duration_scaler.joblib duration.model_yaml=exp/natsumeyuuri/duration/model.yaml acoustic.checkpoint=exp/natsumeyuuri/acoustic/latest.pth acoustic.in_scaler_path=dump/natsumeyuuri/norm/in_acoustic_scaler.joblib acoustic.out_scaler_path=dump/natsumeyuuri/norm/out_acoustic_scaler.joblib acoustic.model_yaml=exp/natsumeyuuri/acoustic/model.yaml utt_list=./data/list/dev.list in_dir=data/acoustic/label_phone_align/ out_dir=exp/natsumeyuuri/synthesis/dev/latest/label_phone_align ground_truth_duration=true\n",
"[\u001b[36m2020-07-08 13:35:10,136\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - acoustic:\n",
" checkpoint: exp/natsumeyuuri/acoustic/latest.pth\n",
" has_dynamic_features:\n",
" - true\n",
" - true\n",
" - false\n",
" - true\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_acoustic_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/acoustic/model.yaml\n",
" num_windows: 3\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_acoustic_scaler.joblib\n",
" post_filter: true\n",
" question_path: null\n",
" relative_f0: true\n",
" stream_sizes:\n",
" - 180\n",
" - 3\n",
" - 1\n",
" - 15\n",
" subphone_features: coarse_coding\n",
"device: cuda\n",
"duration:\n",
" checkpoint: exp/natsumeyuuri/duration/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_duration_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/duration/model.yaml\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_duration_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"frame_period: 5\n",
"ground_truth_duration: true\n",
"in_dir: data/acoustic/label_phone_align/\n",
"label_path: null\n",
"log_f0_conditioning: true\n",
"out_dir: exp/natsumeyuuri/synthesis/dev/latest/label_phone_align\n",
"out_wav_path: null\n",
"question_path: conf/jp_qst001_nnsvs.hed\n",
"sample_rate: 48000\n",
"timelag:\n",
" allowed_range:\n",
" - -20\n",
" - 20\n",
" allowed_range_rest:\n",
" - -40\n",
" - 40\n",
" checkpoint: exp/natsumeyuuri/timelag/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_timelag_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/timelag/model.yaml\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_timelag_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"utt_list: ./data/list/dev.list\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:35:12,460\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Processes 4 utterances...\u001b[0m\n",
"100% 4/4 [00:23<00:00, 5.79s/it]\n",
"+ set +x\n",
"+ nnsvs-synthesis question_path=conf/jp_qst001_nnsvs.hed timelag.checkpoint=exp/natsumeyuuri/timelag/latest.pth timelag.in_scaler_path=dump/natsumeyuuri/norm/in_timelag_scaler.joblib timelag.out_scaler_path=dump/natsumeyuuri/norm/out_timelag_scaler.joblib timelag.model_yaml=exp/natsumeyuuri/timelag/model.yaml duration.checkpoint=exp/natsumeyuuri/duration/latest.pth duration.in_scaler_path=dump/natsumeyuuri/norm/in_duration_scaler.joblib duration.out_scaler_path=dump/natsumeyuuri/norm/out_duration_scaler.joblib duration.model_yaml=exp/natsumeyuuri/duration/model.yaml acoustic.checkpoint=exp/natsumeyuuri/acoustic/latest.pth acoustic.in_scaler_path=dump/natsumeyuuri/norm/in_acoustic_scaler.joblib acoustic.out_scaler_path=dump/natsumeyuuri/norm/out_acoustic_scaler.joblib acoustic.model_yaml=exp/natsumeyuuri/acoustic/model.yaml utt_list=./data/list/eval.list in_dir=data/acoustic/label_phone_score/ out_dir=exp/natsumeyuuri/synthesis/eval/latest/label_phone_score ground_truth_duration=false\n",
"[\u001b[36m2020-07-08 13:35:37,898\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - acoustic:\n",
" checkpoint: exp/natsumeyuuri/acoustic/latest.pth\n",
" has_dynamic_features:\n",
" - true\n",
" - true\n",
" - false\n",
" - true\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_acoustic_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/acoustic/model.yaml\n",
" num_windows: 3\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_acoustic_scaler.joblib\n",
" post_filter: true\n",
" question_path: null\n",
" relative_f0: true\n",
" stream_sizes:\n",
" - 180\n",
" - 3\n",
" - 1\n",
" - 15\n",
" subphone_features: coarse_coding\n",
"device: cuda\n",
"duration:\n",
" checkpoint: exp/natsumeyuuri/duration/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_duration_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/duration/model.yaml\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_duration_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"frame_period: 5\n",
"ground_truth_duration: false\n",
"in_dir: data/acoustic/label_phone_score/\n",
"label_path: null\n",
"log_f0_conditioning: true\n",
"out_dir: exp/natsumeyuuri/synthesis/eval/latest/label_phone_score\n",
"out_wav_path: null\n",
"question_path: conf/jp_qst001_nnsvs.hed\n",
"sample_rate: 48000\n",
"timelag:\n",
" allowed_range:\n",
" - -20\n",
" - 20\n",
" allowed_range_rest:\n",
" - -40\n",
" - 40\n",
" checkpoint: exp/natsumeyuuri/timelag/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_timelag_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/timelag/model.yaml\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_timelag_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"utt_list: ./data/list/eval.list\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:35:40,226\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Processes 4 utterances...\u001b[0m\n",
"100% 4/4 [00:25<00:00, 6.41s/it]\n",
"+ set +x\n",
"+ nnsvs-synthesis question_path=conf/jp_qst001_nnsvs.hed timelag.checkpoint=exp/natsumeyuuri/timelag/latest.pth timelag.in_scaler_path=dump/natsumeyuuri/norm/in_timelag_scaler.joblib timelag.out_scaler_path=dump/natsumeyuuri/norm/out_timelag_scaler.joblib timelag.model_yaml=exp/natsumeyuuri/timelag/model.yaml duration.checkpoint=exp/natsumeyuuri/duration/latest.pth duration.in_scaler_path=dump/natsumeyuuri/norm/in_duration_scaler.joblib duration.out_scaler_path=dump/natsumeyuuri/norm/out_duration_scaler.joblib duration.model_yaml=exp/natsumeyuuri/duration/model.yaml acoustic.checkpoint=exp/natsumeyuuri/acoustic/latest.pth acoustic.in_scaler_path=dump/natsumeyuuri/norm/in_acoustic_scaler.joblib acoustic.out_scaler_path=dump/natsumeyuuri/norm/out_acoustic_scaler.joblib acoustic.model_yaml=exp/natsumeyuuri/acoustic/model.yaml utt_list=./data/list/eval.list in_dir=data/acoustic/label_phone_align/ out_dir=exp/natsumeyuuri/synthesis/eval/latest/label_phone_align ground_truth_duration=true\n",
"[\u001b[36m2020-07-08 13:36:08,224\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - acoustic:\n",
" checkpoint: exp/natsumeyuuri/acoustic/latest.pth\n",
" has_dynamic_features:\n",
" - true\n",
" - true\n",
" - false\n",
" - true\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_acoustic_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/acoustic/model.yaml\n",
" num_windows: 3\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_acoustic_scaler.joblib\n",
" post_filter: true\n",
" question_path: null\n",
" relative_f0: true\n",
" stream_sizes:\n",
" - 180\n",
" - 3\n",
" - 1\n",
" - 15\n",
" subphone_features: coarse_coding\n",
"device: cuda\n",
"duration:\n",
" checkpoint: exp/natsumeyuuri/duration/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_duration_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/duration/model.yaml\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_duration_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"frame_period: 5\n",
"ground_truth_duration: true\n",
"in_dir: data/acoustic/label_phone_align/\n",
"label_path: null\n",
"log_f0_conditioning: true\n",
"out_dir: exp/natsumeyuuri/synthesis/eval/latest/label_phone_align\n",
"out_wav_path: null\n",
"question_path: conf/jp_qst001_nnsvs.hed\n",
"sample_rate: 48000\n",
"timelag:\n",
" allowed_range:\n",
" - -20\n",
" - 20\n",
" allowed_range_rest:\n",
" - -40\n",
" - 40\n",
" checkpoint: exp/natsumeyuuri/timelag/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: dump/natsumeyuuri/norm/in_timelag_scaler.joblib\n",
" model_yaml: exp/natsumeyuuri/timelag/model.yaml\n",
" out_scaler_path: dump/natsumeyuuri/norm/out_timelag_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"utt_list: ./data/list/eval.list\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:36:10,588\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Processes 4 utterances...\u001b[0m\n",
"100% 4/4 [00:26<00:00, 6.58s/it]\n",
"+ set +x\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QnA0F2LHXmxl",
"colab_type": "text"
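},
"source": [
"Running stage 6 synthesizes singing voices for the songs held out for evaluating training (songs 50 and 51 in this recipe). The results are written under $RECIPE_ROOT/exp/natsumeyuuri/synthesis. To listen to them interactively in Colaboratory, enter the following in a cell.\n"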
]
},
{
"cell_type": "code",
"metadata": {
"id": "w_Wy9DNLQhLv",
"colab_type": "code",
"colab": {}
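},
"source": [
"import IPython\n",
"from IPython.display import Audio\n",
"from glob import glob\n",
"from os.path import join\n",
"\n",
"# Sampling rate of the synthesized audio (matches sample_rate in the synthesis config)\n",
"sample_rate = 48000\n",
"# Collect every WAV file under the recipe's exp directory\n",
"synthesized_wav_paths = sorted(glob(join(RECIPE_ROOT, \"exp/**/*.wav\"), recursive=True))\n",
"\n",
"# Show an audio player for each file\n",
"for wav_path in synthesized_wav_paths:\n",
"    print(wav_path)\n",
"    IPython.display.display(Audio(wav_path, rate=sample_rate))"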
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "kR4Vk31NYIeD",
"colab_type": "text"
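},
"source": [
"## Synthesizing arbitrary singing voices from the trained models\n",
"As the recipe shows, NN-SVS synthesizes singing voices by passing nnsvs-synthesis the trained parameters and a list of the label files to synthesize (HTS full-context label files generated from the musical score, not the monophone label files used to specify phoneme positions).\n",
"\n",
"Create a folder named sample in Google Drive and upload the scores of the songs you want to synthesize (from Colaboratory it appears as /content/gdrive/sample).\n",
"\n",
"Use pysinsy to convert the scores to HTS full-context label files.\n"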
]
},
{
"cell_type": "code",
"metadata": {
"id": "uPPsi7pHbSXL",
"colab_type": "code",
"colab": {}
},
"source": [
"import pysinsy\n",
"from glob import glob\n",
"from os.path import basename, join, splitext\n",
"\n",
"sample_dir = \"/content/gdrive/sample\"\n",
"song_list_file = join(sample_dir, \"song_list.txt\")\n",
"\n",
"sinsy = pysinsy.sinsy.Sinsy()\n",
"# Set language to Japanese\n",
"assert sinsy.setLanguages(\"j\", join(RECIPE_ROOT, \"dic\"))\n",
"\n",
"song_list = []\n",
"musicxml_files = glob(join(sample_dir, \"*.xml\"))\n",
"for musicxml_file in musicxml_files:\n",
"    assert sinsy.loadScoreFromMusicXML(musicxml_file)\n",
"    # Generate full-context (not monophone) labels\n",
"    is_mono = False\n",
"    labels = sinsy.createLabelData(is_mono, 1, 1).getData()\n",
"    # Write the labels to <song_name>.lab next to the score\n",
"    song_name = splitext(basename(musicxml_file))[0]\n",
"    song_list.append(song_name)\n",
"    lab_file_path = join(sample_dir, song_name + \".lab\")\n",
"    with open(lab_file_path, \"w\") as f:\n",
"        f.write(\"\\n\".join(labels))\n",
"\n",
"    # Clear the loaded score before processing the next file\n",
"    sinsy.clearScore()\n",
"\n",
"# Save the song names as the utterance list passed to nnsvs-synthesis\n",
"with open(song_list_file, \"w\") as f:\n",
"    f.write(\"\\n\".join(song_list))"
],
"execution_count": 18,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "AdzKn88wav-c",
"colab_type": "text"
},
"source": [
"Pass nnsvs-synthesis the models trained earlier and the list of songs to synthesize. The output destination is specified with out_dir; here it is set to $RECIPE_ROOT/exp/natsumeyuuri/synthesis/sample."
]
},
{
"cell_type": "code",
"metadata": {
"id": "RPzBOcckZqaM",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"outputId": "caa7798d-8b66-44f9-8565-3ef302823a5c"
},
"source": [
"spk=\"natsumeyuuri\"\n",
"exp_dir=join(RECIPE_ROOT, \"exp\", spk)\n",
"dump_dir=join(RECIPE_ROOT, \"dump\")\n",
"dump_org_dir=join(dump_dir, spk, \"org\")\n",
"dump_norm_dir=join(dump_dir, spk, \"norm\")\n",
"conf_dir=join(RECIPE_ROOT, \"conf\")\n",
"out_dir=join(exp_dir, \"synthesis/sample\")\n",
"\n",
"! nnsvs-synthesis question_path=$conf_dir/jp_qst001_nnsvs.hed \\\n",
"timelag.checkpoint=$exp_dir/timelag/latest.pth \\\n",
"timelag.in_scaler_path=$dump_norm_dir/in_timelag_scaler.joblib \\\n",
"timelag.out_scaler_path=$dump_norm_dir/out_timelag_scaler.joblib \\\n",
"timelag.model_yaml=$exp_dir/timelag/model.yaml \\\n",
"duration.checkpoint=$exp_dir/duration/latest.pth \\\n",
"duration.in_scaler_path=$dump_norm_dir/in_duration_scaler.joblib \\\n",
"duration.out_scaler_path=$dump_norm_dir/out_duration_scaler.joblib \\\n",
"duration.model_yaml=$exp_dir/duration/model.yaml \\\n",
"acoustic.checkpoint=$exp_dir/acoustic/latest.pth \\\n",
"acoustic.in_scaler_path=$dump_norm_dir/in_acoustic_scaler.joblib \\\n",
"acoustic.out_scaler_path=$dump_norm_dir/out_acoustic_scaler.joblib \\\n",
"acoustic.model_yaml=$exp_dir/acoustic/model.yaml \\\n",
"utt_list=$song_list_file \\\n",
"in_dir=$sample_dir \\\n",
"out_dir=$out_dir \\\n",
"ground_truth_duration=false\n"
],
"execution_count": 19,
"outputs": [
{
"output_type": "stream",
"text": [
"[\u001b[36m2020-07-08 13:40:25,903\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - acoustic:\n",
" checkpoint: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/latest.pth\n",
" has_dynamic_features:\n",
" - true\n",
" - true\n",
" - false\n",
" - true\n",
" in_scaler_path: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/norm/in_acoustic_scaler.joblib\n",
" model_yaml: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/acoustic/model.yaml\n",
" num_windows: 3\n",
" out_scaler_path: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/norm/out_acoustic_scaler.joblib\n",
" post_filter: true\n",
" question_path: null\n",
" relative_f0: true\n",
" stream_sizes:\n",
" - 180\n",
" - 3\n",
" - 1\n",
" - 15\n",
" subphone_features: coarse_coding\n",
"device: cuda\n",
"duration:\n",
" checkpoint: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/norm/in_duration_scaler.joblib\n",
" model_yaml: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/duration/model.yaml\n",
" out_scaler_path: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/norm/out_duration_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"frame_period: 5\n",
"ground_truth_duration: false\n",
"in_dir: /content/gdrive/sample\n",
"label_path: null\n",
"log_f0_conditioning: true\n",
"out_dir: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/synthesis/sample\n",
"out_wav_path: null\n",
"question_path: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/conf/jp_qst001_nnsvs.hed\n",
"sample_rate: 48000\n",
"timelag:\n",
" allowed_range:\n",
" - -20\n",
" - 20\n",
" allowed_range_rest:\n",
" - -40\n",
" - 40\n",
" checkpoint: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/latest.pth\n",
" has_dynamic_features:\n",
" - false\n",
" in_scaler_path: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/norm/in_timelag_scaler.joblib\n",
" model_yaml: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/exp/natsumeyuuri/timelag/model.yaml\n",
" out_scaler_path: /content/nnsvs/egs/nnsvs_natsume_singing/00-svs-world/dump/natsumeyuuri/norm/out_timelag_scaler.joblib\n",
" question_path: null\n",
" stream_sizes:\n",
" - 1\n",
"utt_list: /content/gdrive/sample/song_list.txt\n",
"verbose: 100\n",
"\u001b[0m\n",
"[\u001b[36m2020-07-08 13:40:28,238\u001b[0m][\u001b[34mnnsvs\u001b[0m][\u001b[32mINFO\u001b[0m] - Processes 2 utterances...\u001b[0m\n",
"100% 2/2 [02:58<00:00, 89.04s/it]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "T2Id2zV2c1GB",
"colab_type": "text"
},
"source": [
"To listen to the results interactively in Colaboratory, enter the following in a cell."
]
},
{
"cell_type": "code",
"metadata": {
"id": "7dlkdPCQeT-O",
"colab_type": "code",
"colab": {}
},
"source": [
"# List the WAV files written directly under out_dir\n",
"synthesized_wav_paths = sorted(glob(join(out_dir, \"*.wav\")))\n",
"\n",
"# Show an audio player for each file\n",
"for wav_path in synthesized_wav_paths:\n",
"    print(wav_path)\n",
"    IPython.display.display(Audio(wav_path, rate=sample_rate))"
],
"execution_count": null,
"outputs": []
}
]
}