@bwerdschinski
Created May 19, 2015 12:08
Adventures in Machine Learning with Taylor Swift and Lady Gaga
require 'nokogiri'
require 'classifier'
require 'open-uri'
@skynet = Classifier::Bayes.new 'lady_gaga', 'taylor_swift'
def train_skynet(url, trainer)
  # Grab the HTML from the url (URI.open is needed on Ruby 3.0+,
  # where open-uri no longer patches Kernel#open)
  html = URI.open(url)
  # Extract the lyrics from the #lyrics-body-text div
  lyrics = Nokogiri::HTML(html).css("#lyrics-body-text").text
  # Filter text to ensure only alphanumeric characters and
  # appropriate punctuation are used
  lyrics.gsub!(/[^A-Za-z0-9,.'\s]/, ' ')
  # Feed the cleaned lyrics to skynet via the given trainer method
  @skynet.send(trainer, lyrics)
end
# From http://www.metrolyrics.com/taylor-swift-lyrics.html
taylor_swift_urls = [
  "http://www.metrolyrics.com/i-knew-you-were-trouble-lyrics-taylor-swift.html",
  "http://www.metrolyrics.com/teardrops-on-my-guitar-lyrics-taylor-swift.html",
  "http://www.metrolyrics.com/we-are-never-ever-getting-back-together-lyrics-taylor-swift.html"
]
taylor_swift_urls.each do |url|
  train_skynet(url, :train_taylor_swift)
end
# From http://www.metrolyrics.com/lady-gaga-lyrics.html
lady_gaga_urls = [
  "http://www.metrolyrics.com/bad-romance-lyrics-lady-gaga.html",
  "http://www.metrolyrics.com/telephone-lyrics-lady-gaga.html",
  "http://www.metrolyrics.com/pokerface-lyrics-lady-gaga.html"
]
lady_gaga_urls.each do |url|
  train_skynet(url, :train_lady_gaga)
end
puts @skynet.classify "Shake it off, shake it off!"
puts @skynet.classify "At least that's what people say mmm, that's what people say mmm"
puts @skynet.classify "But I won't stop until that boy is mine"
puts @skynet.classify "Papa-paparazzi"
puts @skynet.classify "I'll follow you until you love me"
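
The script above needs network access and the classifier gem. To see roughly what happens when a Bayes classifier is trained and queried, here is a minimal offline sketch in plain Ruby. `ToyBayes` is a hypothetical stand-in written for illustration (multinomial naive Bayes with add-one smoothing and equal priors); it is not the gem's actual implementation, and the training strings are made-up placeholders rather than scraped lyrics.

```ruby
# Toy multinomial naive Bayes with add-one (Laplace) smoothing.
# NOTE: illustrative stand-in only; NOT how the classifier gem is implemented.
class ToyBayes
  def initialize(*categories)
    # category => { word => count }
    @counts = categories.to_h { |c| [c, Hash.new(0)] }
  end

  # Count every word of the text under the given category.
  def train(category, text)
    tokenize(text).each { |word| @counts.fetch(category)[word] += 1 }
  end

  # Return the category with the highest summed log-likelihood.
  # Priors are assumed equal, so they are left out of the score.
  def classify(text)
    words = tokenize(text)
    vocab = @counts.values.flat_map(&:keys).uniq.size
    best = @counts.max_by do |_category, counts|
      total = counts.values.sum
      words.sum { |w| Math.log((counts[w] + 1.0) / (total + vocab)) }
    end
    best.first
  end

  private

  # Lowercase, drop everything except letters, digits, apostrophes
  # and whitespace, then split into words.
  def tokenize(text)
    text.downcase.gsub(/[^a-z0-9'\s]/, ' ').split
  end
end

toy = ToyBayes.new(:lady_gaga, :taylor_swift)
# Made-up training strings (assumption: placeholders, not real scraped pages)
toy.train(:lady_gaga, "rah rah romance, want your bad romance")
toy.train(:taylor_swift, "we are never ever getting back together")

puts toy.classify("never ever ever")
puts toy.classify("bad romance")
```

With only one short string per artist the result depends entirely on word overlap, which is also why the real script's predictions are only as good as the three songs per artist it was trained on.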