Last active
September 25, 2020 01:33
Parsing song titles on YouTube
/* eslint-disable */
// YouTube Music Uploader Hall of Shame
// Trying to grok the range of malformed input in song title strings
[
  'BPC335 - Maxime Iko "Concilium"', // wrong order (catalog number before everything else), extra info (catalog number), bad separator `"`, bad extra separator `-`
  '"Pollution" by Tom Lehrer', // wrong order (`Artist - Title` reversed), quotes, bad separator `by`
  'DIS IZ WHY I\'M HOT (zef remix) - Die Antwoord', // `Artist - Title` reversed, bad case
  'Man with no name - Teleport (Original mix). HQ', // bad case, noisy `(Original mix)`, extra info `HQ`, bad extra separator `.`
  'Varg — Under Beige Nylon', // uneven spaces, bad separator `—`
  'varg - under beige nylon - 46bpm', // bad case, bad extra separator `-`, extra info `46bpm`
  'Kangding Ray AMBER DECAY', // no separator, bad case
  'Falling in drop C.', // no artist at all, dubious punctuation `.`
  'Asa Moto - Playtime - DEEWEE030', // bad extra separator `-`, extra info (catalogue number)
  'Voodoo People - Quadsep - 1995', // bad extra separator `-`, extra info (year)
  'Teste - The Wipe (5am Synaptic) - Plus 8 Records - 1992', // bad extra separator `-`, extra info (label, year)
  'Varg | I Did Not Always Appear This Way [Ascetic House 2015]', // bad separator `|`, label and year unseparated inside brackets
  'Pig&Dan -The Saint Job San (Lee Van Dowski Remix)', // uneven spaces, incorrect artist spelling
  'NATHAN FAKE, THE TURTLE (HARD ISLANDS, 2009)', // bad case, bad separator `,`, extra info (album, year)
  'Ambi Sessions 12/11 {Ambient Techno-Tribal-Dub Techno-Meditative}', // no artist, ambiguous date, extra info (genres), bad extra separator `-`
  'PILLDRIVER // PITCH HIKER', // bad case, bad separator `//`
  'Wu-Tang Clan -- One Blood instrumental', // bad separator, bad extra info (no parens, wrong case)
  'Mobb Deep "Peer Pressure"', // bad separator
  // OK these next ones are not so bad
  'The Prodigy - Voodoo People ( Parasense Rmx )', // spaces around parens
  'Bernstein - Álom (Original Mix)', // uneven spaces, noisy `(Original Mix)`
  'Causa - Stages (Forthcoming Artikal Music UK)' // junk info in parens where `Remix` is expected
];
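To make the separator problem concrete, here is a minimal sketch of a first-pass splitter for strings like the ones above. It is not part of the original gist: `SEPARATORS` and `splitArtistTitle` are hypothetical names, and the sketch assumes the `Artist <separator> Title` order. That means it will get the reversed and catalog-number-first entries wrong, and it makes no attempt to fix case or strip noise such as `(Original mix)`.

```javascript
// Hypothetical helper, not from the original gist.
// Separators are tried in order, most specific first.
const SEPARATORS = [
  /\s+--\s+/,        // double hyphen: 'Wu-Tang Clan -- One Blood instrumental'
  /\s+\/\/\s+/,      // double slash: 'PILLDRIVER // PITCH HIKER'
  /\s+[—–|]\s+/,     // em dash, en dash or pipe surrounded by spaces
  /\s*-\s+|\s+-\s*/, // hyphen with a space on at least one side ('Wu-Tang' is left alone)
];

function splitArtistTitle(raw) {
  // Collapse runs of whitespace so uneven spacing does not matter.
  const text = raw.trim().replace(/\s+/g, ' ');

  for (const sep of SEPARATORS) {
    const parts = text.split(sep);
    if (parts.length >= 2) {
      // Assumes `Artist - Title` order; anything after that is kept as extra info.
      return { artist: parts[0], title: parts[1], extra: parts.slice(2) };
    }
  }

  // No separator found: try `Artist "Title"`.
  const quoted = text.match(/^(.+?)\s+"(.+)"$/);
  if (quoted) {
    return { artist: quoted[1], title: quoted[2], extra: [] };
  }

  // Give up on finding an artist and treat the whole string as the title.
  return { artist: null, title: text, extra: [] };
}
```

For example, `splitArtistTitle('varg - under beige nylon - 46bpm')` yields `varg` as the artist, `under beige nylon` as the title and `46bpm` as extra info, while `'Falling in drop C.'` comes back with a `null` artist. Entries such as `'BPC335 - Maxime Iko "Concilium"'` or `'Kangding Ray AMBER DECAY'` would need extra heuristics beyond this sketch.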
Related: minimaxir/big-list-of-naughty-strings