@akostadinov
Last active January 22, 2021 18:11
Fix Facebook's broken exported UTF-8
require 'json'
require 'uri'
require 'csv'
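# The export writes each UTF-8 byte as its own \u00XX escape (e.g. "é" as \u00c3\u00a9).
# Too bad we lack variable-size lookbehind: group 1 captures the preceding character
# (or a run of escaped backslashes) so that an escaped backslash before "u" is not matched.
bytes_re = /((?:\\\\)+|[^\\])(?:\\u[0-9a-f]{4})+/
friends_txt = File.read('friends.json').gsub(bytes_re) do |bad_unicode|
  # Rewrite \u00XX as \xXX, evaluate it as a Ruby string literal to get the real bytes,
  # then re-escape as JSON and drop the surrounding quotes.
  # bytes_re limits what reaches eval to \uXXXX escapes.
  $1 + eval(%Q{"#{bad_unicode[$1.size..-1].gsub('\u00', '\x')}"}).to_json[1...-1]
end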
friends = JSON.parse(friends_txt)["friends"]
result = friends.map { |friend|
  friend.tap { |f|
    f["since"] = Time.at(f.delete("timestamp")).to_s
    f["status"] = ""
    # people-search link for the friend's name; URI::HTTPS.build sets the https scheme
    f["url"] = URI::HTTPS.build(host: "www.facebook.com", path: "/search/people", query: URI.encode_www_form({ q: f["name"] }))
  }
}
# export to a CSV to import into some spreadsheet program
CSV.open("data.csv", "w", headers: result.first.keys, write_headers: true) do |csv|
  result.each { |h| csv << h.values }
end
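
The script repairs the escapes in the raw text before parsing. A possible alternative, sketched here under assumptions, is to parse first and then repair each string: if every code point in a string fits in one byte, reinterpret those code points as bytes and keep the result only when it is valid UTF-8. `fix_mojibake` and the `sample` document are hypothetical names and data, not part of the original script. Unlike the regex approach, this sketch leaves a string untouched when it mixes broken sequences with genuine characters above U+00FF, and it does not repair hash keys.

```ruby
require 'json'

# Hypothetical helper: walks a parsed JSON structure and repairs strings
# whose UTF-8 bytes were each exported as a separate \u00XX escape.
def fix_mojibake(value)
  case value
  when String
    # Only strings made entirely of code points <= 0xFF can be byte-for-byte mojibake.
    return value unless value.each_char.all? { |c| c.ord <= 0xFF }
    # Map each code point to the byte of the same value, then read the bytes as UTF-8.
    fixed = value.encode(Encoding::ISO_8859_1).force_encoding(Encoding::UTF_8)
    # Legitimate Latin-1 text (e.g. a lone U+00E9) is not valid UTF-8 as bytes; keep it as is.
    fixed.valid_encoding? ? fixed : value
  when Array then value.map { |v| fix_mojibake(v) }
  when Hash  then value.transform_values { |v| fix_mojibake(v) }
  else value
  end
end

# Made-up sample in the shape the script expects (single quotes keep the \u escapes literal).
sample = '{"friends":[{"name":"Andr\u00c3\u00a9","timestamp":1500000000}]}'
friends = fix_mojibake(JSON.parse(sample))["friends"]
puts friends.first["name"] # prints "André"
```

This avoids `eval` altogether, at the cost of the limitations noted above.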