@dszeto
Last active December 12, 2015 00:08
Import MovieLens 100k data set from http://www.grouplens.org/node/73 to PredictionIO 0.5.0
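require "predictionio"

if ARGV.length < 2
  abort("Usage: import_ml.rb <app key> <movie lens data file>")
end

# Connect to a local PredictionIO API server using the given app key.
client = PredictionIO::Client.new(ARGV[0], 50, "http://localhost:8000")

# Track the distinct user and item IDs seen in the ratings file.
users = {}
items = {}

File.open(ARGV[1]) do |f|
  f.each_line do |tsv|
    # Each line holds: user ID, item ID, rating, timestamp.
    fields = tsv.chomp.split(/\s+/)
    next if fields.length < 3 # skip blank or malformed lines

    # Record the rating asynchronously on behalf of the identified user.
    client.identify(fields[0])
    client.arecord_action_on_item("rate", fields[1], "pio_rate" => fields[2])
    users[fields[0]] = 1
    items[fields[1]] = 1
  end
end

# Queue asynchronous creation of every distinct user and item.
users.each_key { |k| client.acreate_user(k) }
items.each_key { |k| client.acreate_item(k, "movies") }

# Wait until all queued asynchronous requests have completed.
while client.pending_requests > 0
  puts "Remaining: #{client.pending_requests}"
  sleep(5)
end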
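
The script above splits each line on whitespace and passes the fields straight to the client. As a minimal sketch, assuming the input is the tab-separated `u.data` file from MovieLens 100k (user ID, item ID, rating, timestamp), the parsing step could be pulled out and validated before anything is sent. `Rating` and `parse_rating_line` are hypothetical names for illustration only; they are not part of the original script or of the PredictionIO SDK.

```ruby
# Hypothetical helper: not part of the original script or the PredictionIO SDK.
# Holds one parsed line of the MovieLens 100k u.data file.
Rating = Struct.new(:user_id, :item_id, :rating, :timestamp)

# Parses one tab-separated line into a Rating.
# Returns nil when the line does not have four fields or when the
# rating is not a single digit from 1 to 5.
def parse_rating_line(line)
  fields = line.chomp.split("\t")
  return nil unless fields.length == 4

  user_id, item_id, rating, timestamp = fields
  return nil unless rating.match?(/\A[1-5]\z/)

  # IDs stay as strings because the import script passes them on as-is.
  Rating.new(user_id, item_id, rating.to_i, timestamp.to_i)
end

parsed = parse_rating_line("196\t242\t3\t881250949\n")
puts parsed.to_a.inspect # prints ["196", "242", 3, 881250949]
```

In the import loop, a `nil` result would be a natural place to skip the line instead of sending a partial record.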