@nateraw
Last active April 13, 2024 21:11
songstarter-v0-2-demo.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"machine_shape": "hm",
"gpuType": "A100",
"authorship_tag": "ABX9TyPN2wiQy975sqo8qrLuiKZa",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/nateraw/0cb4c242b70af10044e9ae73f4617c86/songstarter-v0-2-demo.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"source": [
"%%capture\n",
"! pip install git+https://github.com/facebookresearch/audiocraft\n",
"! pip install torchvision==0.16.0\n",
"! pip install hf-transfer"
],
"metadata": {
"id": "U9mIvEuyANKI"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"import os\n",
"\n",
"# enables faster downloads from hugging face\n",
"os.environ[\"HF_HUB_ENABLE_HF_TRANSFER\"] = \"1\""
],
"metadata": {
"id": "rOmrzpFOBy9G"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "WsugXWQuAMjQ"
},
"outputs": [],
"source": [
"import torchaudio\n",
"from audiocraft.models import MusicGen\n",
"from audiocraft.data.audio import audio_write\n",
"from huggingface_hub import hf_hub_download\n",
"\n",
"model = MusicGen.get_pretrained('nateraw/musicgen-songstarter-v0.2')"
]
},
{
"cell_type": "markdown",
"source": [
"Generate music from text descriptions"
],
"metadata": {
"id": "XxkbC2puEBVq"
}
},
{
"cell_type": "code",
"source": [
"model.set_generation_params(duration=10) # generate 10 seconds.\n",
"descriptions = ['acoustic, guitar, melody, trap, d minor, 90 bpm'] * 3\n",
"wav = model.generate(descriptions) # generates 3 samples.\n",
"\n",
"for idx, one_wav in enumerate(wav):\n",
"    # Will save under {idx}.wav, with loudness normalization at -14 dB LUFS.\n",
"    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy=\"loudness\", loudness_compressor=True)"
],
"metadata": {
"id": "uHnG93FZA5Lz"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"from IPython.display import Audio\n",
"\n",
"# Listen to the first sample\n",
"Audio('0.wav')"
],
"metadata": {
"id": "M3x2Kg4lCVhy"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"We trained the model for text-to-music, but since the base model had melody conditioning, we get that for free.\n",
"\n",
"That means you can generate samples that follow the melody of another audio clip you supply.\n",
"\n",
"Let's first listen to our melody sample:"
],
"metadata": {
"id": "cghgF0-7DR8K"
}
},
{
"cell_type": "code",
"source": [
"melody_path = hf_hub_download(\"nateraw/musicgen-songstarter-v0.2\", 'assets/bach.mp3')\n",
"Audio(melody_path)"
],
"metadata": {
"id": "u5QiT6svDASW"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"Now let's take that Bach organ sample and remix it into a hip-hop guitar sample."
],
"metadata": {
"id": "4TRqtV13D1bj"
}
},
{
"cell_type": "code",
"source": [
"melody, sr = torchaudio.load(melody_path)\n",
"# Generate using the melody from the given audio and the provided descriptions.\n",
"wav = model.generate_with_chroma(descriptions, melody[None].expand(3, -1, -1), sr)\n",
"\n",
"for idx, one_wav in enumerate(wav):\n",
"    # Will save under {idx}_bach.wav, with loudness normalization at -14 dB LUFS.\n",
"    audio_write(f'{idx}_bach', one_wav.cpu(), model.sample_rate, strategy=\"loudness\", loudness_compressor=True)"
],
"metadata": {
"id": "_XzkcSVQA4mt"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Listen to the first melody-conditioned sample\n",
"Audio('0_bach.wav')"
],
"metadata": {
"id": "WLrYymKPCbjR"
},
"execution_count": null,
"outputs": []
}
]
}
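
The prompt used in the notebook is a comma-separated list of tags followed by a key and a tempo ('acoustic, guitar, melody, trap, d minor, 90 bpm'). A minimal sketch of assembling such prompts is shown below. `build_prompt` is a hypothetical helper, not part of audiocraft, and the tag order (tags, then key, then bpm) is an assumption taken from that single example rather than a documented requirement of the model.

```python
def build_prompt(tags, key=None, bpm=None):
    """Join tags, an optional key, and an optional tempo into one prompt string.

    Hypothetical helper: the ordering mirrors the notebook's example prompt.
    """
    # Normalize tags to lowercase and drop empty entries.
    parts = [t.strip().lower() for t in tags if t.strip()]
    if key:
        parts.append(key.strip().lower())
    if bpm is not None:
        parts.append(f"{int(bpm)} bpm")
    return ", ".join(parts)


prompt = build_prompt(["acoustic", "guitar", "melody", "trap"], key="D minor", bpm=90)

# One identical prompt per requested sample, as in the notebook.
descriptions = [prompt] * 3
```

The resulting `descriptions` list can be passed to `model.generate(descriptions)` in place of the hard-coded list in the notebook.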