@bfroehle
Forked from henare/mw-to-gollum.rb
Last active December 10, 2015 22:08
#!/usr/bin/env python
import re
import requests
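# Fetch the index of all wiki pages.
response = requests.get('http://wiki.ipython.org/Special:AllPages')
response.raise_for_status()
# Use .text (str) rather than .content (bytes) so the str patterns below work on Python 3.
all_pages_table = re.search(r'<table class="mw-allpages-table-chunk">(.*)</table>',
                            response.text, re.DOTALL).group()
# Each link in the table carries the page title and its relative URL.
titles = re.findall(r'title="([^"]*?)"', all_pages_table)
hrefs = re.findall(r'href="([^"]*?)"', all_pages_table)  # collected but not used below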
print("Go to http://wiki.ipython.org/Special:Export and export the following:")
print('\n'.join(titles))
#!/usr/bin/env ruby
require 'rubygems'
require 'hpricot'
require 'gollum'
require 'open-uri'

# Local Gollum repository that receives the imported pages.
wiki = Gollum::Wiki.new('ipython.wiki')

# XML dump produced by Special:Export.
# Kernel#open with a URL relies on open-uri; Ruby 3.0+ requires URI.open instead.
doc = Hpricot(open('https://dl.dropbox.com/s/5ogy0xg2lkuo1rf/IPython-20130110012347.xml'))

doc.search('/mediawiki/page').each do |el|
  # Subpage titles contain '/', so replace it with '-'.
  title = el.at('title').inner_text.tr('/', '-')
  content = el.at('text').inner_text
  kind = :mediawiki
  commit = { :message => "Import MediaWiki page #{title} into Gollum",
             :name => 'Bradley M. Froehle',
             :email => 'brad.froehle@gmail.com' }

  # Pages wrapped entirely in <rst>...</rst> are stored as reStructuredText.
  m = content.match(/^\s*<rst>(.*)<\/rst>\s*$/m)
  if m
    kind = :rest
    content = m[1]
  end

  begin
    puts "Writing page #{title}"
    wiki.write_page(title, kind, content, commit)
  rescue Gollum::DuplicatePageError
    # Skip pages that already exist in the Gollum wiki.
    puts "Duplicate #{title}"
  end
end
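
The per-page conversion in the Ruby loop (replace `/` in the title, then detect a page wrapped in `<rst>...</rst>`) can be tried on its own without Gollum or an XML dump. The following is a minimal Python sketch of that step only. The function name `convert_page` and the string labels `'mediawiki'` and `'rest'` are hypothetical stand-ins for the Ruby symbols, not part of either script.

```python
import re

# Python equivalent of the Ruby pattern /^\s*<rst>(.*)<\/rst>\s*$/m:
# Ruby's /m lets '.' match newlines (re.DOTALL), and Ruby's ^ and $
# always match at line boundaries (re.MULTILINE).
RST_WRAPPER = re.compile(r'^\s*<rst>(.*)</rst>\s*$', re.DOTALL | re.MULTILINE)


def convert_page(title, text):
    """Return (title, kind, content) as the Ruby loop would pass to write_page.

    Hypothetical helper for illustration only.
    """
    # Subpage titles contain '/', which the Ruby script replaces with '-'.
    safe_title = title.replace('/', '-')
    match = RST_WRAPPER.search(text)
    if match:
        # The page is wrapped in <rst> tags: keep only the inner text.
        return safe_title, 'rest', match.group(1)
    return safe_title, 'mediawiki', text


if __name__ == '__main__':
    print(convert_page('Cookbook/Example', '<rst>\nTitle\n=====\n</rst>\n'))
    print(convert_page('Main Page', "== Welcome ==\nSome '''wiki''' text."))
```

A page is treated as reStructuredText only when the `<rst>` wrapper is matched; anything else is passed through unchanged as MediaWiki markup.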