MIDI piano transcription from any recording

Using a couple of Python repos, you can transcribe any piano recording to MIDI. Each uses its own environment, which is straightforward to set up with conda/mamba.

Both steps incur some loss, which affects the quality of the end result, but any reasonably hi-fi recording from the late '60s onward should give acceptable results. Super useful for dumping into Syn/Neothesia.

  1. Start with a folder containing WAV files of the desired songs.

  2. Use Music-Source-Separation-Training's Demucs4HT 6-stem model inference to isolate the piano stems. For me this was:

python inference.py --model_type htdemucs --config_path configs/config_htdemucs_6stems.yaml --start_check_point htdemucs4-6.th --input_folder YT-Rips/ --store_dir results/
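The separation step typically writes one subfolder of stems per track under the store directory. A small sketch for gathering just the piano stems into a flat folder for the next step — the `results/<song>/piano.wav` layout and the `collect_piano_stems` helper are assumptions here; adjust the glob to match what your run actually produces:

```python
import shutil
from pathlib import Path

def collect_piano_stems(results_dir: str, dest_dir: str) -> list:
    """Copy each song's piano stem into one flat folder,
    renaming it after the song so filenames stay meaningful.
    Assumes a results/<song>/piano.wav layout."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for stem in sorted(Path(results_dir).glob("*/piano.wav")):
        target = dest / f"{stem.parent.name}.wav"
        shutil.copy2(stem, target)
        copied.append(target)
    return copied
```

Point the transcription script's `--input_folder` at the destination folder afterwards.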
  3. Then feed the piano stems into piano_transcription_inference. Here's a simple inference script that runs it on folders instead of single files:
import argparse
from pathlib import Path
from tqdm import tqdm
from time import time
from torch import cuda
from piano_transcription_inference import PianoTranscription, sample_rate, load_audio


def inference(args):
    output_midi_path = Path(args.output_folder)
    output_midi_path.mkdir(parents=True, exist_ok=True)  # ensure the output folder exists
    device = 'cuda' if args.cuda else 'cpu'

    transcribe_time = time()

    transcriptor = PianoTranscription(device=device, checkpoint_path=None)
    # checkpoint_path: None for default path, str for downloaded checkpoint path

    for audio_path in tqdm(sorted(Path(args.input_folder).glob("*.wav"))):
        # Load, transcribe, and write out to MIDI file
        audio, _ = load_audio(str(audio_path), sr=sample_rate, mono=True)
        midi_path = output_midi_path / f"{audio_path.stem}.mid"
        transcribed_dict = transcriptor.transcribe(audio, str(midi_path))
        # transcribed_dict holds the detected note/pedal events if you want to inspect them

    print('Transcribe time: {:.3f} s'.format(time() - transcribe_time))

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Batch piano-to-MIDI transcription')
    parser.add_argument('--input_folder', '-i', type=str, required=True)
    parser.add_argument('--output_folder', '-o', type=str, required=True)
    parser.add_argument('--cuda', '-c', action='store_true', default=False)
    args = parser.parse_args()
    if args.cuda:
        assert cuda.is_available()
    inference(args)
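To sanity-check a batch run, you can confirm that each output file carries the standard MIDI header chunk (`MThd`) using only the standard library — `check_midi_outputs` below is a hypothetical helper, not part of either repo:

```python
from pathlib import Path

def check_midi_outputs(folder: str) -> int:
    """Return the number of .mid files in `folder` that begin
    with the standard MIDI header chunk ('MThd')."""
    ok = 0
    for path in sorted(Path(folder).glob("*.mid")):
        with open(path, "rb") as f:
            if f.read(4) == b"MThd":
                ok += 1
    return ok
```

If the count doesn't match the number of input WAVs, check the console output for tracks that failed to transcribe.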