@fahim
Created July 16, 2012 20:45
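#!/usr/bin/env ruby
# encoding: utf-8
module Normalizers
  # Parses a raw "Artist - Title" string into tags
  # (:artist, :title, :features, :is_remix, :remixed_by, :producer).
  class SongDataNormalizer
    # def initialize(object)
    # @object = object
    # @tags = {}
    # end
    # end
    # class TitleNormalizer
    DASHES = '\p{Pd}' # Unicode dash punctuation (hyphen, en dash, em dash, ...)
    STOPPERS = '' + DASHES

    def initialize(text)
      # Work on a copy so the in-place edits never touch the caller's string
      # and a frozen string does not raise.
      @text = text.dup
      @tags = {}
    end
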
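    def scan!
      remove_quotes!
      # Split on the first dash surrounded by whitespace. The same pattern is
      # used for the test and the split, so an em dash can no longer pass the
      # test and then fail to split (which left the title nil and raised).
      token = /\s\p{Pd}\s/
      if @text =~ token
        parts = @text.split(token, 2)
        normalized = [parts[0], parts[1]]
        [0, 1].each do |index|
          extract_features(normalized[index])
          extract_remix(normalized[index])
          extract_producer(normalized[index])
        end
        # Strip the whitespace left behind by the removed bracketed tags.
        @tags[:artist] = normalized[0].strip
        @tags[:title] = normalized[1].strip
      end
      @tags.inspect
    end
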
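    def remove_quotes!
      # \A and \Z are not anchors inside a character class; Ruby reads them as
      # the literal letters "A" and "Z", so the old pattern deleted those too.
      @text.gsub!(/"/, '')
    end
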
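    def extract_features(text)
      # Accept only "ft", "feat" or "featuring" at a word boundary. The looser
      # f[ea]{0,2}t also matched "Fat", "Feet" and the "ft" inside "Left".
      if match = text.match(/[\[\(]?\s*\b(?:feat(?:uring)?|ft)\.?\s([^\[\]\)\(]+)[\[\]\)\(]?/i)
        @tags[:features] ||= match[1].strip
        text.gsub!(match[0], '')
      end
    end
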
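    def extract_remix(text)
      # Matches bracketed tags such as " (DJ X Remix)" or " [Club Mix]"; the
      # text before the keyword is stored as :remixed_by.
      if match = text.match(/\s[\(\[](.*?)\s?(Remix|Rmx|Mix)[\)\]]/i)
        # Group 2 is mandatory in the pattern, so a match always means a remix.
        @tags[:is_remix] = true
        @tags[:remixed_by] = match[1]
        text.gsub!(match[0], '')
      end
    end
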
    def extract_producer(text)
      # Matches bracketed credits preceded by whitespace, such as
      # " (Produced by X)" or " [prod. X]".
      if match = text.match(/\s[\(\[]\s?(Produced by|prod[^\s]+)\s?(.*?)[\)\]]/i)
        @tags[:producer] = match[2]
        text.gsub!(match[0], '')
      end
    end
  end
end
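
The dash handling in `scan!` can be tried in isolation. The following is a minimal standalone sketch, not part of the gist: `split_artist_title` and `SEPARATOR` are hypothetical names, and the sample strings are made up. It illustrates that `\p{Pd}` covers the hyphen, the en dash and the em dash, and that limiting the split to two parts keeps any later dash inside the title.

```ruby
# encoding: utf-8

# Hypothetical separator (assumption): any Unicode dash with whitespace on
# both sides.
SEPARATOR = /\s\p{Pd}\s/

# Hypothetical helper (not in the gist): returns a hash with :artist and
# :title, or nil when the text contains no separator.
def split_artist_title(text)
  return nil unless text =~ SEPARATOR

  # Limit 2 splits only on the first separator.
  artist, title = text.split(SEPARATOR, 2)
  { artist: artist.strip, title: title.strip }
end

p split_artist_title('Artist A - Song One')
p split_artist_title("Artist B \u2013 Song Two")      # en dash
p split_artist_title("Artist C \u2014 Song - Part 2") # em dash, later hyphen kept
p split_artist_title('No separator here')
```

Returning `nil` for unsplittable input is a choice made for this sketch only; the class above leaves `@tags` empty in that case.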