@gregelin
Created September 13, 2009 11:39
Ruby parse csv file
# Command line parse csv file
require 'csv'
gf = "ucp_top_500_pages.txt"
af = "ucp_top_500_091209_categorized.txt"
# build_views (defined below) must be loaded before these lines run.
src = File.read(gf)
views = build_views(src)
src2 = File.read(af)
views2 = build_views(src2)
# Join the two files: for each page path, append its category row to its page-view row.
views3 = {}
views.each_key do |k|
  puts k
  next if k.nil? || views[k].nil? || views2[k].nil?
  views3[k] = views[k] + views2[k]
end
# "/ucp_channeldoc.cfm/1/15/11383/11383-11383/2848"=>["/ucp_channeldoc.cfm/1/15/11383/11383-11383/2848", "20", "17", "156.25", "0.875", "0.8", "0", "/ucp_channeldoc.cfm/1/15/11383/11383-11383/2848", "Sports / Leisure", "UCP: Sports / Leisure"],
# >> views3["/ucp_generaldoc.cfm/1/3/6633/6633-6633/6256"][8]
# => "About Organization"
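# Build a hash of parsed rows keyed by the first column (the page path).
# The original used CSV.parse_row from the Ruby 1.8 CSV library;
# CSV.parse with col_sep is the equivalent call in Ruby 1.9 and later.
def build_views(src)
  views = {}
  CSV.parse(src, col_sep: "\t") do |parsed|
    next if parsed.empty? || parsed[0].nil?
    puts "Parsed #{parsed.size} cells."
    puts parsed[0]
    p parsed
    views[parsed[0]] = parsed
  end
  views
end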
# Other parts of code
# Total the page views (column 1) per category (column 8 of the joined row).
views4 = Hash.new(0)
views.each_key do |k|
  next if views3[k].nil?
  category = views3[k][8]
  pv = views3[k][1]
  puts "#{pv} #{category}"
  views4[category] += pv.to_i
end
views4.each do |category, total|
  puts "#{total}, #{category}"
end
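
The script above depends on two local files, so here is a minimal self-contained sketch of the same join-and-total idea on inline data. The sample rows, the constants `PAGES_TSV`, `CATEGORIES_TSV` and `TOTALS`, and the helpers `index_rows` and `views_by_category` are hypothetical and not part of the original gist. It assumes the category sits in column 1 of the categories file, which corresponds to index 8 once a 7-column page row and a category row are concatenated as above.

```ruby
require 'csv'

# Hypothetical sample data standing in for the two tab-separated files.
PAGES_TSV = "/a\t20\t17\n/b\t5\t4\n/c\t7\t6\n"
CATEGORIES_TSV = "/a\tSports / Leisure\n/b\tAbout Organization\n/c\tSports / Leisure\n"

# Index each parsed row by its first column (the page path).
def index_rows(tsv)
  CSV.parse(tsv, col_sep: "\t").each_with_object({}) do |row, index|
    index[row[0]] = row unless row.empty? || row[0].nil?
  end
end

# Sum page views (column 1 of the pages data) per category
# (column 1 of the categories data), skipping paths without a category.
def views_by_category(pages_tsv, categories_tsv)
  pages = index_rows(pages_tsv)
  categories = index_rows(categories_tsv)
  totals = Hash.new(0)
  pages.each do |path, row|
    next unless categories.key?(path)
    totals[categories[path][1]] += row[1].to_i
  end
  totals
end

TOTALS = views_by_category(PAGES_TSV, CATEGORIES_TSV)
TOTALS.each { |category, views| puts "#{views}, #{category}" }
```

Using `Hash.new(0)` avoids the `has_key?` branch, since a missing category starts at zero.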