Skip to content

Instantly share code, notes, and snippets.

Last active June 12, 2024 01:25
Show Gist options
  • Save 0xdevalias/7f4a5c31758e04aea5c2f5520e53accb to your computer and use it in GitHub Desktop.
Save 0xdevalias/7f4a5c31758e04aea5c2f5520e53accb to your computer and use it in GitHub Desktop.
Some notes on Audio pitch correction (eg. autotune, melodyne, etc)

Audio Pitch Correction (eg. autotune, melodyne, etc)

Some notes on Audio pitch correction (eg. autotune, melodyne, etc).

Table of Contents


    • Melodyne

    • Melodyne grants you unrivaled access to all the musical details in your recordings and samples – note by note. This is made possible by a sophisticated analysis that delves deeply into your recordings and samples, and recognizes and understands the musical relationships within them: the individual notes and their characteristics, the scales, keys and chords, the timing, the tempo, the tone color. And with Melodyne you can edit all these things intuitively. With vocals, but every type of instrument as well – including polyphonic ones, such as the piano and guitar.

FLStudio NewTone

Synchronarts RePitch

      • RePitch - Natural Vocal Pitch Editing Plug-in: Powerful, easy to use and exceptionally transparent, RePitch sets a new standard for vocal pitch editing tools. revoice pro pitch tuning alignment vocals instrument timing
      • Revoice Pro 4 - The Ultimate Timing & Pitch Toolbox: Premium vocal tuning, time correction and doubled tracks for music production or ADR just got a lot easier, faster, and better sounding.
      • VocAlign Ultra - Advanced Timing & Pitch Alignment Plugin: Take plug-in based vocal matching to a whole new level with unparalleled control and total flexibility.
      • VocAlign Project - The World's #1 Audio Alignment Plugin: The basic and affordable version of VocAlign still saves hours of work and delivers fantastic results.
  • Fast Vocal Tuning In Ableton Live With RePitch:


Ableton Live

    • Audio Clips, Tempo, and Warping

    • Live is capable of time-warping samples while streaming them from disk so as to synchronize them to the current Live Set’s tempo. This happens without affecting the pitch, which can be changed independently.

      • Time-Warping Samples

      • An audio clip’s warping properties are set in the Audio tab/panel, which is a sub-section of the Clip View.

      • The most significant control here is the Warp switch, which toggles an audio clip’s warping on or off.

      • When the Warp switch is off, Live plays the sample at its original, “normal“ tempo, irrespective of the current Live Set’s tempo. This is useful for samples that have no inherent rhythmic structure: percussion hits, atmospheres, sound effects, spoken word and the like. Turn the Warp switch on to play rhythmically structured samples (such as sample loops, music recordings, complete music pieces, etc.) in sync with the current song tempo.

      • Warp Markers

      • Think of a sample as a rubber-band that you want to pin to a (musical time) ruler. In Live, the pins are called Warp Markers. A Warp Marker locks a specific point in the sample to a specific place in the measure. You can use any number of Warp Markers to create an arbitrary mapping of the piece’s inherent rhythm to a musical meter.

      • When you first load a sample into Live, Live automatically analyzes the audio and finds its transients. These are the points in the audio where notes or events begin, and are usually good places to put Warp Markers. Transients appear as small markers at the top of the Sample Editor after zooming in.

      • As you mouse over transients, temporary “pseudo“ Warp Markers appear. These have the same shape as regular Warp Markers, but they’re gray. Double-clicking or dragging a pseudo Warp Marker creates an actual Warp Marker or, if there are no Warp Markers later in the clip, changes the tempo for the clip segment.

      • You can also select a range of time and create Warp Markers at all of the transients within the range via the Create menu’s “Insert Warp Markers“ command. If there are no transients within your time selection, a Warp Marker will be created at the end of the selection.

      • When you import a sample that represents a well-cut musical loop of 1,2,4 or 8 bars in length, Live usually makes the correct assumptions to play the loop in sync with the chosen tempo. It creates two Warp Markers, one at the sample’s beginning and one at the end. The Seg. BPM field displays Live’s guess of the loop’s tempo; if you happen to know the tempo of the loop, you can type it in here. Sometimes Live’s guess of the original tempo is wrong by half or double. If so, correct this by clicking on the buttons labeled ×2 and ÷2, respectively.

      • When importing a loop that has not been edited into a well-cut loop, Live will play it out of sync. Suppose there is a portion of silence at the sample beginning, prior to the first beat. You can easily correct this by placing a Warp Marker at the beginning of the audio and dragging it so that it lines up with the beginning of bar one in the timeline. Likewise, you can eliminate silence after the actual loop end by placing a Warp Marker at the sample’s right edge.

      • Adjusting for Good Stretching Quality

      • Live offers a number of time-stretching modes to accommodate all sorts of audio material. Each clip’s time-stretching mode and associated parameters are set in the Clip View’s Audio tab/panel. The warp modes are different varieties of granular resynthesis techniques. Granular resynthesis achieves time compression and expansion by repeating and skipping over parts of the sample (the “grains“). The warp modes differ in the selection of grains, as well as in the details of overlapping and crossfading between grains.

      • Beats Mode works best for material where rhythm is dominant (e.g., drum loops as well as most pieces of electronic dance music). The granulation process is optimized to preserve transients in the audio material. Use the Preserve control to preserve divisions in the sample as boundaries when warping. For the most accurate results, particularly with percussive material, choose Transients. This setting uses the positions of the analyzed (or user-created) transients to determine warping behavior. To preserve specific beat divisions regardless of the sample’s contents, choose one of the fixed note values. For some interesting rhythmic artifacts, choose large note values in conjunction with pitch transposition.

      • Tones Mode serves well for stretching material with a more or less clear pitch structure, such as vocals, monophonic instruments and basslines.

      • Texture Mode works well for sound textures with an ambiguous pitch contour (e.g., polyphonic orchestral music, noise, atmospheric pads, etc.). It also offers rich potential for manipulating all kinds of sounds in a creative way.

      • In Re-Pitch Mode, Live doesn’t really time-stretch or compress the music; instead, it adjusts the playback rate to create the desired amount of stretching. In other words, to speed up playback by a factor of 2, it’s transposed up an octave.

      • Complex Mode is a warping method specifically designed to accommodate composite signals that combine the characteristics covered by other Warp Modes; it works well for warping entire songs, which usually contain beats, tones and textures. Complex Mode is a rather CPU-intensive function, using approximately ten times the CPU resources required by the other Warp Modes. As such, you may want to freeze tracks (see ‘Track Freeze’) where Complex Mode is used or record (see ‘Recording New Clips’) the results as a new clip to use as a substitute.

      • Complex Pro Mode uses a variation of the algorithm found in Complex mode, and may offer even better results (although with an increase in CPU usage.) Like Complex Mode, Complex Pro works especially well with polyphonic textures or whole songs. The Formants slider adjusts the extent to which the formants of the sample are compensated when transposing. At 100%, the original formants will be preserved, which allows for large changes in transposition while maintaining the sample’s original tonal quality. Note that this slider has no effect if the sample is played back untransposed. The Envelope slider also influences the spectral characteristics of the material. The default setting of 128 should work well for most audio. For very high-pitched samples, you may have better results with lower Envelope values. Likewise, low-pitched material may sound better with higher values.

    • Step 1: Transpose

      • The transpose knob allows for pitch shifting an audio clip up to forty-eight semitones above or beneath the original audio sample’s pitch.

      • The detune function allows you to tune the sample between semitones by 50 degrees (called ‘cents‘).

    • Step 2: Use The Warp Functions

      • When you change the pitch of a sample, sometimes this results in unwanted tempo changes.

      • To fix this, you need to transpose the sample with the warp mode set to on and then cycle through the warp algorithms to find one that naturally pitch shifts the sample.

    • Step 3: Check The Pitch

      • Apply Ableton’s Tuner Effect to the track with the sample that you have just transposed. Then playback the sample and check the pitch using the Tuner.

    • Step 4: Adjust The Pitch Using The Detune Function

    • How do you modulate pitch in Ableton?

      • Select the clip that you want to modulate. In the envelopes box on the Clip View, select the drop-down menu from the Control Chooser and select Transposition Modulation. Set your automation points. This, in a nutshell, is how you modulate pitch in Ableton.

    • Ableton Live Tutorial: Pitch and Transposition

    • Creating a pitch rising effect in Ableton

    • In Ableton Live, there are at least two easy ways to do that using built-in devices: Ping Pong Delay and Simpler. They give slightly different results, so choose whatever better suit your needs.

    • Method #1: Ping Pong Delay

      • First things off, we need to take an audio sample which we will use for the processing.

      • Put this sample to a new Audio Channel, and add Ping Pong Delay on top. By default, Ping Pong use an algorithm called “Fade”, we need to change it to “Repitch”.

      • From now on the modulation of the “Beat Swing” parameter will affect the sample’s pitch. Change it to the maximum value of 33.3%, and draw automation down to the end, at -33.3%

      • The effect itself is fine, but as you can hear the sound fades out over time, and we don’t need it. To fix this, simply turn on the freeze function, a small square “F”. Now the delay effect will last infinitely as long as the freeze is turned on

    • Method #2: Simpler

      • Now let’s take a look at the alternative method. It requires a few more steps, but I like it more. I’ll put the same source sample to a new MIDI Channel, Ableton automatically creates a Simpler device. By default, Simpler has some parameters that we don’t need, let’s change it in four clicks:

        • Turn on the “Loop” button. With this, we can use a single MIDI note in order to repeat the sound.
        • Turn off the “Snap”. Snap to grid a nice feature, but to make the effect smoother, we don’t it here.
        • Change Warp method to a “Tones”. Other algorithms can work too, but I found this one is better in this case.
      • Now create a new MIDI clip, and draw a single MIDI note up to the full length. Make sure to put it on C3 — this is a default note for most samplers where a sample is played with the original pitch.

      • Now comes the most interesting part. Select the Simpler and press ⌘+G (or click with right mouse button on the title and select Group), it wraps the device into Instrument Rack. Then click on the top left button to open a Macro section

      • Then we have to make a MIDI mapping on the length and transpose parameters. To do that, do right click on the Length parameter → Map to Macro 1, and right-click on the Transpose → Map to S Length

      • By default, it maps the maximum values of the parameters from left to right direction. It means that the maximum amount of the Macro knob (127) equal to 100% sample Length, and +48 semitones of Transpose. But we want quite the opposite, at 100% sample Length pitch should remain unchanged while reducing the Length should drop down the Pitch.

      • To do so, click on the “Map” button near to the Instrument Rack title, it opens the Macro Mapping window. Then right-click on the Transpose parameter and click “Invert Range”, set the maximum value at 0, and minimum up to your taste — I’ll set +36 semitones, which equally to 3 octaves.

      • Now just draw an automation curve of the Macro 1 parameter, and enjoy the result

  • How to automate clip pitch in Ableton Live 11 (YouTube):
  • Ableton Vocoder Tutorial (YouTube):
  • Ableton Tutorial: Vocoder Tips And Tricks (Vocals, Bass Design, Percussion) (YouTube):


    • Pitchbender is a visual, note based audio editing webapp. You can "autotune" your audio by shifting notes around.

      • A compilation of pitch detection algorithms for Javascript. Supports both the browser and node.

      • Provided pitch-finding algorithms

        • YIN - The best balance of accuracy and speed, in my experience. Occasionally provides values that are wildly incorrect.
        • AMDF - Slow and only accurate to around +/- 2%, but finds a frequency more consistenly than others.
        • Dynamic Wavelet - Very fast, but struggles to identify lower frequencies.
        • YIN w/ FFT (coming soon)
        • Goertzel (coming soon)
        • Mcleod (coming soon)
      • A Web Audio framework for making interactive music in the browser

      • Tone.js is a Web Audio framework for creating interactive music in the browser. The architecture of Tone.js aims to be familiar to both musicians and audio programmers creating web-based audio applications. On the high-level, Tone offers common DAW (digital audio workstation) features like a global transport for synchronizing and scheduling events as well as prebuilt synths and effects. Additionally, Tone provides high-performance building blocks to create your own synthesizers, effects, and complex control signals.


MXTune (and AutoTalent/SoundTouch algorithms)

Autotalent (algorithm)

    • Autotalent began as the result of a week of recreational signal processing in May 2009. It's a real-time pitch correction plugin. You specify the notes that a singer is allowed to hit, and Autotalent makes sure that they do. You can also use Autotalent for more exotic effects, like the Cher / T-Pain effect, making your voice sound like a chiptune, adding artificial vibrato, or messing with your formants. Autotalent can also be used as a harmonizer that knows how to sing in the scale with you. Or, you can use Autotalent to change the scale of a melody between major and minor or to change the musical mode.

    • It consists mainly of a pitch detector and a pitch shifter. The pitch detector figures out what pitch the person sang, and based upon this and the values of the various controls, the pitch shifter is instructed to shift the pitch up or down by some appropriate amount, resulting in an output signal that's in tune. Both the pitch detector and pitch shifter are designed to operate on monophonic signals. The pitch detector finds the pitch period via an autocorrelation method, and the pitch shifter uses a time-domain overlap-add technique that's synchronous with the pitch period of the input, which tends to have significantly fewer artifacts than, e.g. phase-vocoder based techniques for single-pitch sources.

SoundTouch (algorithm)

    • SoundTouch is an open-source audio processing library for changing the Tempo, Pitch and Playback Rates of audio streams or audio files. The library additionally supports estimating stable beats-per-minute rates for audio tracks.

      • Tempo (time stretch): Changes the sound to play at faster or slower tempo than originally without affecting the sound pitch.
      • Pitch (key): Changes the sound pitch or key while keeping the original tempo (speed).
      • Playback Rate: Changes both tempo and pitch together as if a vinyl disc was played at different RPM rate.
    • A JavaScript library for manipulating WebAudio Contexts, specifically for time-stretching a pitch change

    • AudioWorkletNode and AudioWorkletProcessor implementing SoundTouchJS

        • The AudioWorkletNode interface of the Web Audio API represents a base class for a user-defined AudioNode, which can be connected to an audio routing graph along with other nodes. It has an associated AudioWorkletProcessor, which does the actual audio processing in a Web Audio rendering thread.

Libraries, Algorithms, etc

    • A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

    • Basic Pitch is a Python library for Automatic Music Transcription (AMT), using lightweight neural network developed by Spotify's Audio Intelligence Lab.

    • aubio is a tool designed for the extraction of annotations from audio signals. Its features include segmenting a sound file before each of its attacks, performing pitch detection, tapping the beat and producing midi streams from live audio.

    • Because these tasks are difficult, we thought it was important to gather them in a dedicated library. To increase the fun, we have made these algorithms work in a causal way, so as to be used in real time applications with as low delay as possible. Functions can be used offline in sound editors and software samplers, or online in audio effects and virtual instruments.

      • a library for audio and music analysis

      • aubio is a library to label music and sounds. It listens to audio signals and attempts to detect events. For instance, when a drum is hit, at which frequency is a note, or at what tempo is a rhythmic melody.

    • brew install aubio
    • Open-source library and tools for audio and music analysis, description and synthesis

    • Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval. It contains an extensive collection of algorithms, including audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, a large variety of spectral, temporal, tonal, and high-level music descriptors, and tools for inference with deep learning models. Essentia is cross-platform and designed with a focus on optimization in terms of robustness, computational speed, and low memory usage, which makes it efficient for many industrial applications. The library includes Python and JavaScript bindings as well as various command-line tools and third-party extensions, which facilitate its use for fast prototyping and allow setting up research experiments very rapidly.

      • Similarity: Analyze audio and compute features to find similar sounds or music tracks.
      • Classification: Classify sounds or music based on computed audio features.
      • Deep learning inference: Use data-driven TensorFlow models for a wide range applications from music annotation to synthesis.
      • Mood detection: Find if a song is happy, sad, aggressive or relaxed.
      • Key detection: Find a key of a music piece.
      • Onset detection: Detect onsets (and transients) in an audio signal.
      • Segmentation: Split audio into homogeneous segments that sound alike.
      • Beat tracking: Estimate beat positions and tempo (BPM) of a song.
      • Melody extraction: Estimate pitch in monophonic and polyphonic audio.
      • Audio fingerprinting: Extract fingerprints from any audio source using the Chromaprint algorithm.
      • Cover song detection: Identify covers and different versions of the same music piece.
      • Spectral analysis: Analyze spectral shape of an audio signal.
      • Loudness metering: Use various loudness meters including algorithms compliant with the EBU R128 broadcasting standard.
      • Audio problems detection: Identify possible audio quality problems in music recordings.
      • Voice analysis: Voice activity detection and characterization.
      • Synthesis: Analyze, transform and synthesize sounds using spectral modeling approaches.
      • Here is the complete list of algorithms which you can access from the Python interface. The C++ interface allows access to the same algorithms, and also some more which are templated and hence are not available in python.

      • This page provides a list of pre-trained models available in Essentia for various music and audio analysis tasks.

      • C++ library for audio and music analysis, description and synthesis, including Python bindings

    • A Python library for working with audio

    • pedalboard is a Python library for working with audio: reading, writing, rendering, adding effects, and more. It supports most popular audio file formats and a number of common audio effects out of the box, and also allows the use of VST3® and Audio Unit formats for loading third-party software instruments and effects.

    • pedalboard was built by Spotify's Audio Intelligence Lab to enable using studio-quality audio effects from within Python and TensorFlow. Internally at Spotify, pedalboard is used for data augmentation to improve machine learning models and to help power features like Spotify's AI DJ and AI Voice Translation. pedalboard also helps in the process of content creation, making it possible to add effects to audio without using a Digital Audio Workstation.

    • A Real-Time Audio Processing Framework in Java

    • TarsosDSP is a Java library for audio processing. Its aim is to provide an easy-to-use interface to practical music processing algorithms implemented, as simply as possible, in pure Java and without any other external dependencies. The library tries to hit the sweet spot between being capable enough to get real tasks done but compact and simple enough to serve as a demonstration on how DSP algorithms works.

    • TarsosDSP features an implementation of a percussion onset detector and a number of pitch detection algorithms: YIN, the Mcleod Pitch method and a “Dynamic Wavelet Algorithm Pitch Tracking” algorithm. Also included is a Goertzel DTMF(Dual tone multi frequency) decoding algorithm, a time stretch algorithm (WSOLA), resampling, filters, simple synthesis, some audio effects, and a pitch shifting algorithm.

    • Rubber Band Library is a high quality software library for audio time-stretching and pitch-shifting. It permits you to change the tempo and pitch of an audio stream or recording dynamically and independently of one another.

    • Rubber Band Library is a C++ library intended for use by developers creating their own application programs. It can be integrated into apps for any desktop or mobile platform. It also includes a simple, free command-line utility that you can use to make adjustments to the speed and pitch of existing audio files.



See Also

My Other Related Deepdive Gist's and Projects

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment