@kardeiz
Created October 26, 2012 15:12
Convert relative links to absolute in HTML
#!/usr/bin/env ruby
# encoding: utf-8
# This works pretty well, but can't yet go up the directory tree (i.e. "..").
# It also doesn't handle hrefs/srcs generated by JavaScript.
# I was surprised there wasn't a ready-made solution online (that I could quickly find).
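require 'open-uri'
require 'nokogiri'
# Newer ActiveSupport releases expect 'active_support' to be loaded first;
# only the blank? extension is needed here.
require 'active_support'
require 'active_support/core_ext/object/blank'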
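def convert_to_abs(base_uri, rel_uri)
  # Note: can't follow up (i.e. "..")
  base_uri = URI.parse(base_uri)
  rel_uri = URI.parse(rel_uri)
  # A relative reference parses as a plain URI::Generic; URLs with a known
  # scheme (http, mailto, ...) parse as a subclass and are left untouched.
  # An empty path means a fragment- or query-only reference, also left alone.
  if rel_uri.instance_of?(URI::Generic) && !rel_uri.path.blank?
    URI.join(base_uri, rel_uri).to_s
  else
    rel_uri.to_s
  end
end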
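url_lookup = "http://library.tcu.edu/govweb/"
# Kernel#open no longer opens URLs on Ruby 3.0+; open-uri provides URI.open.
doc = Nokogiri::HTML(URI.open(url_lookup))
# Rewrite every href and src attribute in the document in place.
doc.xpath("//@href | //@src").each do |x|
  x.value = convert_to_abs(url_lookup, x.value)
end
puts doc.to_html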
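
The resolution step can be tried on its own, without Nokogiri or a network request. The sketch below is a stdlib-only variant and not the gist's exact code: the name `absolutize` is hypothetical, `path.to_s.empty?` stands in for ActiveSupport's `blank?`, and the `rescue` for malformed references is an added assumption about how such input should be handled. The sample paths are made up for illustration. On current Ruby versions `URI.join` does appear to resolve a single leading `..` segment, which may be worth checking against the limitation noted in the script.

```ruby
require 'uri'

# Hypothetical stdlib-only variant of convert_to_abs (illustrative sketch).
def absolutize(base, ref)
  parsed = URI.parse(ref)
  # Relative references parse as plain URI::Generic; an empty path means
  # the reference is fragment- or query-only, so it is returned unchanged.
  if parsed.instance_of?(URI::Generic) && !parsed.path.to_s.empty?
    URI.join(base, ref).to_s
  else
    ref
  end
rescue URI::InvalidURIError
  # Assumption: references the parser rejects (e.g. containing spaces)
  # are passed through untouched rather than aborting the whole run.
  ref
end

base = "http://library.tcu.edu/govweb/"
puts absolutize(base, "page.html")          # http://library.tcu.edu/govweb/page.html
puts absolutize(base, "/css/site.css")      # http://library.tcu.edu/css/site.css
puts absolutize(base, "../images/logo.png") # http://library.tcu.edu/images/logo.png
puts absolutize(base, "#top")               # #top
```

Note that the trailing slash on the base URL matters: without it, `URI.join` treats `govweb` as a file name and resolves relative paths against the parent directory.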