johndpope/jukebox_extended_notebook.ipynb

## jukebox_extended_notebook.ipynb
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "Interacting with Jukebox",
      "provenance": [],
      "collapsed_sections": [],
      "machine_shape": "hm"
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uq8uLwZCn0BV",
        "colab_type": "text"
      },
      "source": [
        "IMPORTANT NOTE ON SYSTEM REQUIREMENTS:\n",
        "\n",
        "If you are connecting to a hosted runtime, make sure it has a P100 GPU (optionally run !nvidia-smi to confirm). Go to Edit>Notebook Settings to set this.\n",
        "\n",
        "CoLab may first assign you a lower memory machine if you are using a hosted runtime.  If so, the first time you try to load the 5B model, it will run out of memory, and then you'll be prompted to restart with more memory (then return to the top of this CoLab).  If you continue to have memory issues after this (or run into issues on your own home setup), switch to the 1B model.\n",
        "\n",
        "If you are using a local GPU, we recommend V100 or P100 with 16GB GPU memory for best performance. For GPU’s with less memory, we recommend using the 1B model and a smaller batch size throughout.  \n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "8qEqdj8u0gdN",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!nvidia-smi -L"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VAMZK4GNA_PM",
        "colab_type": "text"
      },
      "source": [
        "Mount Google Drive to save sample levels as they are generated."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ZPdMgaH_BPGN",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from google.colab import drive\n",
        "drive.mount('/content/gdrive')"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "source": [
        "Setup"
      ],
      "cell_type": "markdown",
      "metadata": {}
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "!pip install git+https://github.com/openai/jukebox.git\n",
        "!git clone https://github.com/openai/jukebox.git\n",
        "!git clone https://gist.github.com/3615a96d1c25ec8d9fb8bcaacf647e8b.git\n",
        "!mv 3615a96d1c25ec8d9fb8bcaacf647e8b/sample_extended.py jukebox/jukebox/"
      ]
    },
    {
      "source": [
        "#### NOTE: This ipynb provides additional functionality compared to the default opanai notebook. These differences are enumerated below, but also it is worth noting that here the `levels` parameter actually works to stop the process after the specified level(s) have completed. You will notice that the examples below all use --levels=1 when doing bottom level ('level 2') sampling, and --levels=3 for upsampling, though --levels=2 should be used if you only need to do the middle level for whatever reason.\n",
        "\n",
        "#### This ipynb will also reduce or increase the number of `n_samples` as provided in each command so beware that mismatching `n_samples` won't be enforced and instead the number of samples will be automatically reduced. If increasing, the existing codes are duplicated  up to the new `n_samples` which may cause memory issues as well."
      ],
      "cell_type": "markdown",
      "metadata": {}
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "o7CzSiv0MmFP",
        "colab_type": "text"
      },
      "source": [
        "# Sampling\n",
        "---\n",
        "\n",
        "To sample normally, run the following command. Model can be `5b`, `5b_lyrics`, `1b_lyrics`\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b --levels=1 --sample_length_in_seconds=20 \\\n",
        "--total_sample_length_in_seconds=180 --sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
      ]
    },
    {
      "source": [
        "The above generates the first `sample_length_in_seconds` seconds of audio from a song of total length `total_sample_length_in_seconds`.\n",
        "\n",
        "To continue sampling from already generated codes for a longer duration, you can run"
      ],
      "cell_type": "markdown",
      "metadata": {}
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_continued --levels=1 --mode=continue \\\n",
        "--codes_file=sample_1b/level_2/data.pth.tar --sample_length_in_seconds=40 --total_sample_length_in_seconds=180 \\\n",
        "--sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
      ]
    },
    {
      "source": [
        "If you stopped sampling at only the first level and want to upsample the saved codes, you can run"
      ],
      "cell_type": "markdown",
      "metadata": {}
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_upsamples --levels=3 --mode=upsample \\\n",
        "--codes_file=sample_1b/level_2/data.pth.tar --sample_length_in_seconds=20 --total_sample_length_in_seconds=180 \\\n",
        "--sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
      ]
    },
    {
      "source": [
        "If you want to prompt the model with your own creative piece or any other music, first save them as wave files and run"
      ],
      "cell_type": "markdown",
      "metadata": {}
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_prompted --levels=1 --mode=primed \\\n",
        "--audio_file=path/to/recording.wav,awesome-mix.wav,fav-song.wav,etc.wav --prompt_length_in_seconds=12 \\\n",
        "--sample_length_in_seconds=20 --total_sample_length_in_seconds=180 --sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
      ]
    },
    {
      "source": [
        "This ipynb also includes an additional mode, truncate, that lets you remove unwanted seconds from the end of sampling output. Consider the continuation example above which produced a 40 second sample. If we reuse that codes file in truncate mode we can remove any unwanted audio at the end of the sample and then continue like normal. Notice that `--sample_length_in_seconds` is reduced by 5 in this example."
      ],
      "cell_type": "markdown",
      "metadata": {}
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_truncated --levels=3 --mode=truncate \\\n",
        "--codes_file=sample_1b/level_2/data.pth.tar --sample_length_in_seconds=35 --total_sample_length_in_seconds=180 \\\n",
        "--sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
      ]
    },
    {
      "source": [
        "Each of these examples matches the README as provided by openai, but this ipynb needs additional functionality in order to change sample metadata from the command line. This means that each level's prompts, lyrics, and temperature are provided as hyperparameters for the purpose of this notebook. These parameters are listed below along with an example.\n",
        "\n",
        "```\n",
        "l2_meta_artist: The artist prompt for level 2\n",
        "l2_meta_genre: The genre prompt for level 2\n",
        "l2_meta_lyrics: The lyrics for level 2\n",
        "```\n",
        "```\n",
        "l1_meta_artist: The artist prompt for level 1\n",
        "l1_meta_genre: The genre prompt for level 1\n",
        "l1_meta_lyrics: The lyrics for level 1\n",
        "```\n",
        "```\n",
        "l0_meta_artist: The artist prompt for level 0\n",
        "l0_meta_genre: The genre prompt for level 0\n",
        "l0_meta_lyrics: The lyrics for level 0\n",
        "```\n",
        "```\n",
        "temperature: The temperature for level 2\n",
        "l1_temperature: The temperature for level 1\n",
        "10_temperature: The temperature for level 0\n",
        "```\n",
        "```\n",
        "pref_codes: Prefer specific codes on continuation or upsample (*see example below)\n",
        "```\n",
        "If you do not provide these parameters in the above cells, you will be using the defaults (artist=unknown, genre=unknown, lyrics=\"\"). A correct example with these things specified looks like:"
      ],
      "cell_type": "markdown",
      "metadata": {}
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_raekwon --levels=1 --sample_length_in_seconds=20 \\\n",
        "--total_sample_length_in_seconds=180 --sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125 \\\n",
        "--temperature=0.98 --l2_meta_artist=raekwon --l2_meta_genre=psychedelic --l2_meta_lyrics='hello world good raekwon lyrics'"
      ]
    },
    {
      "source": [
        "Afterwards if we find one or more codes that we would like to continue on, we can then specify the `pref_codes` parameter in continuation mode and discard the unwanted codes. For example, if we found that we like samples 1, 3, and 5 from the previous command, we can use `--pref_codes=1,3,5`"
      ],
      "cell_type": "markdown",
      "metadata": {}
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_raekwon_continued --levels=1 --sample_length_in_seconds=40 \\\n",
        "--total_sample_length_in_seconds=180 --sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125 \\\n",
        "--temperature=0.98 --l2_meta_artist=raekwon --l2_meta_genre=psychedelic --l2_meta_lyrics='hello world good raekwon lyrics' \\\n",
        "--mode=continue --codes_file=sample_1b_raekwon/level_2/data.pth.tar --pref_codes=1,3,5\n",
        "\n",
        "# Or if we only liked one set of codes then we should also add a comma to ensure the data is passed as a tuple to python: e.g. --pref_codes=3,"
      ]
    },
    {
      "source": [
        "# Training\n",
        "---\n",
        "\n",
        "Not yet implemented\n"
      ],
      "cell_type": "markdown",
      "metadata": {}
    }
  ]
}
	{
	"nbformat": 4,
	"nbformat_minor": 0,
	"metadata": {
	"colab": {
	"name": "Interacting with Jukebox",
	"provenance": [],
	"collapsed_sections": [],
	"machine_shape": "hm"
	},
	"kernelspec": {
	"name": "python3",
	"display_name": "Python 3"
	},
	"accelerator": "GPU"
	},
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "uq8uLwZCn0BV",
	"colab_type": "text"
	},
	"source": [
	"IMPORTANT NOTE ON SYSTEM REQUIREMENTS:\n",
	"\n",
	"If you are connecting to a hosted runtime, make sure it has a P100 GPU (optionally run !nvidia-smi to confirm). Go to Edit>Notebook Settings to set this.\n",
	"\n",
	"CoLab may first assign you a lower memory machine if you are using a hosted runtime. If so, the first time you try to load the 5B model, it will run out of memory, and then you'll be prompted to restart with more memory (then return to the top of this CoLab). If you continue to have memory issues after this (or run into issues on your own home setup), switch to the 1B model.\n",
	"\n",
	"If you are using a local GPU, we recommend V100 or P100 with 16GB GPU memory for best performance. For GPU’s with less memory, we recommend using the 1B model and a smaller batch size throughout. \n",
	"\n"
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "8qEqdj8u0gdN",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"!nvidia-smi -L"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "VAMZK4GNA_PM",
	"colab_type": "text"
	},
	"source": [
	"Mount Google Drive to save sample levels as they are generated."
	]
	},
	{
	"cell_type": "code",
	"metadata": {
	"id": "ZPdMgaH_BPGN",
	"colab_type": "code",
	"colab": {}
	},
	"source": [
	"from google.colab import drive\n",
	"drive.mount('/content/gdrive')"
	],
	"execution_count": 0,
	"outputs": []
	},
	{
	"source": [
	"Setup"
	],
	"cell_type": "markdown",
	"metadata": {}
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"!pip install git+https://github.com/openai/jukebox.git\n",
	"!git clone https://github.com/openai/jukebox.git\n",
	"!git clone https://gist.github.com/3615a96d1c25ec8d9fb8bcaacf647e8b.git\n",
	"!mv 3615a96d1c25ec8d9fb8bcaacf647e8b/sample_extended.py jukebox/jukebox/"
	]
	},
	{
	"source": [
	"#### NOTE: This ipynb provides additional functionality compared to the default opanai notebook. These differences are enumerated below, but also it is worth noting that here the `levels` parameter actually works to stop the process after the specified level(s) have completed. You will notice that the examples below all use --levels=1 when doing bottom level ('level 2') sampling, and --levels=3 for upsampling, though --levels=2 should be used if you only need to do the middle level for whatever reason.\n",
	"\n",
	"#### This ipynb will also reduce or increase the number of `n_samples` as provided in each command so beware that mismatching `n_samples` won't be enforced and instead the number of samples will be automatically reduced. If increasing, the existing codes are duplicated up to the new `n_samples` which may cause memory issues as well."
	],
	"cell_type": "markdown",
	"metadata": {}
	},
	{
	"cell_type": "markdown",
	"metadata": {
	"id": "o7CzSiv0MmFP",
	"colab_type": "text"
	},
	"source": [
	"# Sampling\n",
	"---\n",
	"\n",
	"To sample normally, run the following command. Model can be `5b`, `5b_lyrics`, `1b_lyrics`\n"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b --levels=1 --sample_length_in_seconds=20 \\\n",
	"--total_sample_length_in_seconds=180 --sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
	]
	},
	{
	"source": [
	"The above generates the first `sample_length_in_seconds` seconds of audio from a song of total length `total_sample_length_in_seconds`.\n",
	"\n",
	"To continue sampling from already generated codes for a longer duration, you can run"
	],
	"cell_type": "markdown",
	"metadata": {}
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_continued --levels=1 --mode=continue \\\n",
	"--codes_file=sample_1b/level_2/data.pth.tar --sample_length_in_seconds=40 --total_sample_length_in_seconds=180 \\\n",
	"--sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
	]
	},
	{
	"source": [
	"If you stopped sampling at only the first level and want to upsample the saved codes, you can run"
	],
	"cell_type": "markdown",
	"metadata": {}
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_upsamples --levels=3 --mode=upsample \\\n",
	"--codes_file=sample_1b/level_2/data.pth.tar --sample_length_in_seconds=20 --total_sample_length_in_seconds=180 \\\n",
	"--sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
	]
	},
	{
	"source": [
	"If you want to prompt the model with your own creative piece or any other music, first save them as wave files and run"
	],
	"cell_type": "markdown",
	"metadata": {}
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_prompted --levels=1 --mode=primed \\\n",
	"--audio_file=path/to/recording.wav,awesome-mix.wav,fav-song.wav,etc.wav --prompt_length_in_seconds=12 \\\n",
	"--sample_length_in_seconds=20 --total_sample_length_in_seconds=180 --sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
	]
	},
	{
	"source": [
	"This ipynb also includes an additional mode, truncate, that lets you remove unwanted seconds from the end of sampling output. Consider the continuation example above which produced a 40 second sample. If we reuse that codes file in truncate mode we can remove any unwanted audio at the end of the sample and then continue like normal. Notice that `--sample_length_in_seconds` is reduced by 5 in this example."
	],
	"cell_type": "markdown",
	"metadata": {}
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_truncated --levels=3 --mode=truncate \\\n",
	"--codes_file=sample_1b/level_2/data.pth.tar --sample_length_in_seconds=35 --total_sample_length_in_seconds=180 \\\n",
	"--sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125"
	]
	},
	{
	"source": [
	"Each of these examples matches the README as provided by openai, but this ipynb needs additional functionality in order to change sample metadata from the command line. This means that each level's prompts, lyrics, and temperature are provided as hyperparameters for the purpose of this notebook. These parameters are listed below along with an example.\n",
	"\n",
	"```\n",
	"l2_meta_artist: The artist prompt for level 2\n",
	"l2_meta_genre: The genre prompt for level 2\n",
	"l2_meta_lyrics: The lyrics for level 2\n",
	"```\n",
	"```\n",
	"l1_meta_artist: The artist prompt for level 1\n",
	"l1_meta_genre: The genre prompt for level 1\n",
	"l1_meta_lyrics: The lyrics for level 1\n",
	"```\n",
	"```\n",
	"l0_meta_artist: The artist prompt for level 0\n",
	"l0_meta_genre: The genre prompt for level 0\n",
	"l0_meta_lyrics: The lyrics for level 0\n",
	"```\n",
	"```\n",
	"temperature: The temperature for level 2\n",
	"l1_temperature: The temperature for level 1\n",
	"10_temperature: The temperature for level 0\n",
	"```\n",
	"```\n",
	"pref_codes: Prefer specific codes on continuation or upsample (*see example below)\n",
	"```\n",
	"If you do not provide these parameters in the above cells, you will be using the defaults (artist=unknown, genre=unknown, lyrics=\"\"). A correct example with these things specified looks like:"
	],
	"cell_type": "markdown",
	"metadata": {}
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_raekwon --levels=1 --sample_length_in_seconds=20 \\\n",
	"--total_sample_length_in_seconds=180 --sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125 \\\n",
	"--temperature=0.98 --l2_meta_artist=raekwon --l2_meta_genre=psychedelic --l2_meta_lyrics='hello world good raekwon lyrics'"
	]
	},
	{
	"source": [
	"Afterwards if we find one or more codes that we would like to continue on, we can then specify the `pref_codes` parameter in continuation mode and discard the unwanted codes. For example, if we found that we like samples 1, 3, and 5 from the previous command, we can use `--pref_codes=1,3,5`"
	],
	"cell_type": "markdown",
	"metadata": {}
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"!python ./jukebox/jukebox/sample_extended.py --model=1b_lyrics --name=sample_1b_raekwon_continued --levels=1 --sample_length_in_seconds=40 \\\n",
	"--total_sample_length_in_seconds=180 --sr=44100 --n_samples=16 --hop_fraction=0.5,0.5,0.125 \\\n",
	"--temperature=0.98 --l2_meta_artist=raekwon --l2_meta_genre=psychedelic --l2_meta_lyrics='hello world good raekwon lyrics' \\\n",
	"--mode=continue --codes_file=sample_1b_raekwon/level_2/data.pth.tar --pref_codes=1,3,5\n",
	"\n",
	"# Or if we only liked one set of codes then we should also add a comma to ensure the data is passed as a tuple to python: e.g. --pref_codes=3,"
	]
	},
	{
	"source": [
	"# Training\n",
	"---\n",
	"\n",
	"Not yet implemented\n"
	],
	"cell_type": "markdown",
	"metadata": {}
	}
	]
	}