Bash command to train and run music_gen_interaction_RTML demo!
The run.sh bash file should do everything for you; you just need an audio sample (in wav format) and NVIDIA Docker installed (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker)
Download the run.sh script and give it executable rights:
chmod +x run.sh
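If you prefer to fetch the script from the command line, something along these lines should work (the exact raw URL is an assumption based on the usual gist URL pattern; copying the file from this page works just as well):

wget https://gist.githubusercontent.com/previtus/fc014f3aa27b13a077e37c598d7e65af/raw/run.sh
chmod +x run.sh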
Then simply run
./run.sh sample.wav 16
This will train a model for 16 epochs and then run an interactive real-time handler with it.
Alternatively, you can adjust the number of training epochs (16 is used above to allow for a fast demo, but 300 is a better value for a high-quality model):
./run.sh sample.wav 300
Finally you can also skip the training and only run your pretrained model (this assumes you ran the previous commands already and only want to get the demo running again).
./run.sh sample.wav 300 True
More about the project: see https://github.com/ual-cci/music_gen_interaction_RTML and https://hub.docker.com/repository/docker/previtus/demo-lstm-music-gen
To clean up (and start with a new audio file), simply rename (or delete) the created "music_gen_interaction_RTML" folder.
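For example, renaming keeps the previously trained model around in case you want it again later (the new folder name below is just a suggestion):

mv music_gen_interaction_RTML music_gen_interaction_RTML_old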
#!/bin/bash
echo "[***] Automated script to train our system on an audio sample and then run a real-time interaction demo! [***]"
echo "- Settings: "
echo " - Source audio file (use wav only, defaults to sample.wav): ${1-sample.wav}";
echo " - Number of epochs (default for a demo is 300, which is optimal for good quality): ${2-300}";
echo " - Skip training and only run the real-time demo: ${3:-False}";
echo ""
echo "- Author: 2021 Vit Ruzicka <https://github.com/previtus>"
echo "[***] ================================================================================================== [***]"
mkdir -p music_gen_interaction_RTML
cd music_gen_interaction_RTML || exit 1
echo "Working in directory:"
pwd
# Note: Make sure that nvidia Docker is installed. We can test it with:
# sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
# We need to get the sound sample - provide your own wav (or an mp3 converted to wav), or let's download a demo file
#ffmpeg -i sample.mp3 sample.wav
#wget -O sample.wav https://www2.cs.uic.edu/~i101/SoundFiles/StarWars60.wav
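# If the source is stereo or uses an unusual sample rate, ffmpeg can also downmix/resample while converting
# (the 44100 Hz mono target below is only an assumption, not a documented requirement of this project):
#ffmpeg -i sample.mp3 -ar 44100 -ac 1 sample.wav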
# Let's assume that we have a file called "sample.wav" in the current directory
# Prepare folders (these will be shared with the script inside Docker - and allow us to send the music sample in and then get the final model out)
mkdir -p __custom_music_samples/sample
cp "../${1-sample.wav}" __custom_music_samples/sample/
mkdir -p __custom_saved_models
mkdir -p data
# Docker root rights
xhost +SI:localuser:root
# (Optionally) Let's try the basic functionality (command from Step 1) - this might take a couple of minutes, but should end up playing a sound
# Note1: the jackd binding sometimes has issues if any other sound is playing, so please stop other audio before running this demo
# Note2: Exit with ctrl-c
# sudo docker run -it --rm --privileged=true --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --env="QT_X11_NO_MITSHM=1" -v $(pwd)/data/:/home/music_gen_interaction_RTML/data/ --gpus all --device=/dev/snd:/dev/snd previtus/demo-lstm-music-gen
# Train a model on our sample (this is a command from Step 3 with slightly altered path)
# Note: Training a model will take some time (also depending on the number of epochs we selected) and in the end will save the model files into the "__custom_saved_models" directory (which is available outside of the Docker).
if [ "${3:-False}" = "False" ];
then
sudo docker run -it --rm --privileged=true --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --env="QT_X11_NO_MITSHM=1" -v "$(pwd)/data/":/home/music_gen_interaction_RTML/data/ -v "$(pwd)/__custom_saved_models/":/home/music_gen_interaction_RTML/__saved_models/ -v "$(pwd)/__custom_music_samples/":/home/music_gen_interaction_RTML/__music_samples/ --gpus all --device=/dev/snd:/dev/snd previtus/demo-lstm-music-gen python training_handler.py -target_file __music_samples/sample/${1-sample.wav} -amount_epochs ${2-300} -batch_size 64
fi
# (Important!) Do not rename the model files (namely do not remove the "Model_" from the name - there are some checks that need it present)...
# Now we can finally run the trained model in real-time (command from Step 2):
# Note: you don't have to change any path in this command - it will simply select the trained model and the source audio file from the prepared folders.
# Note2: Exit with ctrl-c
sudo docker run -it --rm --privileged=true --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --env="QT_X11_NO_MITSHM=1" -v "$(pwd)/data/":/home/music_gen_interaction_RTML/data/ -v "$(pwd)/__custom_saved_models/":/home/music_gen_interaction_RTML/__saved_models/ -v "$(pwd)/__custom_music_samples/":/home/music_gen_interaction_RTML/__music_samples/ --gpus all --device=/dev/snd:/dev/snd previtus/demo-lstm-music-gen
# And that's it!
# If you want a higher quality model, train for a larger number of epochs - try starting with 300. You can also adjust the batch size (as a rule of thumb, bigger is better and faster to train, as long as it fits into your GPU's memory) or further play with the model architecture.
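# For example, to retrain with different values, re-run just the training step from above with adjusted flags
# (the concrete numbers below are only a suggestion and the batch size must fit into your GPU memory):
# sudo docker run -it --rm --privileged=true --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --env="QT_X11_NO_MITSHM=1" -v "$(pwd)/data/":/home/music_gen_interaction_RTML/data/ -v "$(pwd)/__custom_saved_models/":/home/music_gen_interaction_RTML/__saved_models/ -v "$(pwd)/__custom_music_samples/":/home/music_gen_interaction_RTML/__music_samples/ --gpus all --device=/dev/snd:/dev/snd previtus/demo-lstm-music-gen python training_handler.py -target_file __music_samples/sample/sample.wav -amount_epochs 300 -batch_size 128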
xhost -SI:localuser:root
# Usage reminder (run these from the directory that contains run.sh and your audio sample, not from inside this script):
#   chmod +x run.sh
#   ./run.sh sample.wav 300
# ^^^ This will train our system on sample.wav for 300 epochs and end up running a real-time demo.
# For jackd's sake don't play any audio while the demo is starting up.