Convert relative links to absolute in HTML

gistfile1.rb
Ruby
#!/usr/bin/env ruby
# encoding: utf-8
 
# This works pretty well for static markup: URI.join resolves relative
# references (including "..") against the base URL.
# It doesn't work on JS-generated hrefs/srcs.
# I was surprised there wasn't a ready-made solution online (that I could quickly find)

require 'open-uri'
require 'nokogiri'
require 'active_support'
require 'active_support/core_ext/object/blank' # provides blank?

def convert_to_abs(base_uri, rel_uri)
  base_uri = URI.parse(base_uri)
  rel_uri = URI.parse(rel_uri)
  # Rewrite only generic references that carry a path; absolute URLs,
  # mailto: links and bare fragments such as "#top" pass through unchanged.
  if rel_uri.instance_of?(URI::Generic) && !rel_uri.path.blank?
    URI.join(base_uri, rel_uri).to_s
  else
    rel_uri.to_s
  end
end
 
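url_lookup = "http://library.tcu.edu/govweb/"
# Kernel#open no longer opens URLs on Ruby 3.0+; open-uri provides URI.open.
doc = Nokogiri::HTML(URI.open(url_lookup))

# Rewrite every href and src attribute in place.
doc.xpath("//@href | //@src").each do |x|
  x.value = convert_to_abs(url_lookup, x.value)
end

puts doc.to_html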
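
The resolution step can be tried without Nokogiri, ActiveSupport, or a network connection. Below is a minimal sketch using only the standard library's `uri`. The helper name `absolutize`, the `BASE` constant, and the sample references are made up for illustration and are not part of the gist. It assumes `URI.join` follows the usual RFC 3986 resolution rules, so parent-directory and root-relative references should resolve as well as plain relative ones.

```ruby
require 'uri'

# Hypothetical helper (not from the gist): resolve +ref+ against +base+
# using only the standard library.
def absolutize(base, ref)
  uri = URI.parse(ref)
  # Leave the reference alone if it already has a scheme (http:, mailto:, ...)
  # or has no path component (for example a bare fragment like "#top").
  return ref if uri.scheme || uri.path.to_s.empty?

  URI.join(base, ref).to_s
rescue URI::InvalidURIError
  ref # values that do not parse as URIs are returned untouched
end

# Same base URL as the script above; the references are invented samples.
BASE = 'http://library.tcu.edu/govweb/'

['page.html', '../images/logo.png', '/index.html',
 'http://example.com/', '#top'].each do |ref|
  puts "#{ref} -> #{absolutize(BASE, ref)}"
end
```

Note that the trailing slash on the base URL matters: without it, the last path segment is treated as a file name and is dropped during resolution.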
