@link0ff
Last active November 28, 2021 18:55
Convert Firefox/Chromium Bookmarks HTML export files to Org-mode format
#!/usr/bin/env ruby
# firefox_bookmarks_html_to_org.rb
# convert HTML files exported from Firefox/Chromium/Chrome/Brave/... Bookmarks
# to https://orgmode.org/ format
#
# Copyright (C) 2021 Juri Linkov <juri@linkov.net>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
# Usage: ruby firefox_bookmarks_html_to_org.rb bookmarks-date.html > bookmarks-date.org
# where bookmarks-date.html is exported from Firefox by:
# 1. Type Control-Shift-O
# 2. Click "Import and Backup -> Export Bookmarks to HTML..." and "Save"
require 'nokogiri'
require 'cgi'
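# Normalize the export before parsing: strip carriage returns, close each
# <DT><A ...>...</A> entry explicitly, drop <p> tags, and join all lines.
source = ARGF.read
  .gsub(/\r$/, '')
  .gsub(/<DT><A.+?<\/A>$/, '\\0</DT>')
  .gsub(/<p>/, '')
  .gsub(/\n+[ \t]*/, '')
html = Nokogiri.HTML(source)
# Report parse errors on stderr so they do not end up in the Org output
STDERR.puts html.errors#.last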
# Print the links of this list first, then recurse into its subfolders.
def process(html, level)
  html.children.each do |c|
    # DT A: a bookmark link
    if (a=c.children[0])&.name == 'a'
      puts "- #{a.text}"
      puts " #{begin CGI.unescape(a.attr('href')).gsub(/ /, '%20') rescue a.attr('href') end}"
      print " [#{Time.at(a.attr('add_date').to_i)}]"
      print " [#{Time.at(a.attr('last_modified').to_i)}]"
      puts
    elsif c.name == 'hr'
      # HR: a separator between bookmarks
      puts " "
    end
  end
  html.children.each_with_index do |c, i|
    # DT H3: a folder heading, one star per nesting level
    if (h3=c.children[0])&.name == 'h3'
      puts "#{'*' * (level < 1 ? 1 : level)} #{h3.text}"
      print " [#{Time.at(h3.attr('add_date').to_i)}]"
      print " [#{Time.at(h3.attr('last_modified').to_i)}]"
      puts
      # DL: the folder contents follow the heading as the next sibling;
      # skip the recursion if the heading has no following list
      dl = html.children[i+1]
      process(dl, level + 1) if dl
    end
  end
end
process(html.css('html > body > dl'), 1)
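
The two text transformations in the script above can be tried in isolation, without Nokogiri. The following is a minimal sketch using only the standard library; the helper names `normalize_export` and `org_url` are hypothetical and do not appear in the original script, and the sample input is made up for illustration.

```ruby
require 'cgi'

# Hypothetical helper: the same gsub chain the script applies to the
# exported HTML before handing it to the parser.
def normalize_export(text)
  text
    .gsub(/\r$/, '')                       # strip carriage returns
    .gsub(/<DT><A.+?<\/A>$/, '\\0</DT>')   # close each bookmark entry
    .gsub(/<p>/, '')                       # drop stray <p> tags
    .gsub(/\n+[ \t]*/, '')                 # join lines, drop indentation
end

# Hypothetical helper: decode percent-escapes for readability, but keep
# spaces encoded so the URL stays a single token in the Org file.
# Falls back to the raw value if decoding fails.
def org_url(href)
  CGI.unescape(href).gsub(/ /, '%20')
rescue StandardError
  href
end

# Made-up sample in the shape of a bookmarks export
sample = "<DL><p>\r\n    <DT><A HREF=\"https://example.org/\" ADD_DATE=\"1\">Example</A>\r\n</DL><p>\r\n"
puts normalize_export(sample)
puts org_url('https://example.org/%7Euser/my%20file')
```

Note that `CGI.unescape` also turns `+` into a space, so a literal `+` in a URL comes out as `%20` after this step.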