@kreeger
Created March 5, 2012 22:43
Ruby: Convert MediaWiki to Markdown
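#!/usr/bin/env ruby
# Convert a MediaWiki file to Markdown: WikiCloth renders the wiki markup to
# HTML, Tidy cleans the HTML, and Pandoc converts it to Markdown.
require 'rubygems'
require 'optparse'
require 'maruku'
require 'wikicloth'
require 'tidy_ffi'
require 'pandoc-ruby'

parser = OptionParser.new do |o|
  o.banner = "Usage: #{File.basename(__FILE__)} infile.mediawiki"
end
parser.parse!

# The input file is required; print the usage banner and exit if it is missing.
abort parser.banner if ARGV.empty?
filename = File.expand_path(ARGV[0])

# Render the MediaWiki source to HTML.
wiki = WikiCloth::Parser.new(:data => IO.read(filename))

# Clean the HTML with Tidy, keeping only the body contents.
tidy = TidyFFI::Tidy.new(wiki.to_html)
tidy.options.show_body_only = true
tidy.options.indent = 1
html = tidy.clean

# Convert the cleaned HTML to Markdown with Pandoc.
pan = PandocRuby.new(html, :from => :html, :to => :markdown)
data = pan.convert

# Strip the "[edit]" section links left in the converted headings.
header = /\[\[edit\]\(\?section\=(?:.*)\)\] /
cleaned = data.gsub(header, '')
puts cleaned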
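
The last step of the script is plain string cleanup, so it can be tried without any of the gems installed. Below is a minimal sketch using the same `header` pattern; the `sample` text is made up for illustration and only assumes that the converted Markdown carries an `[[edit](?section=...)] ` prefix on heading lines, which is what the pattern targets.

```ruby
# Same pattern as in the script above.
header = /\[\[edit\]\(\?section\=(?:.*)\)\] /

# Hypothetical Markdown, shaped like what the pattern expects to find.
sample = <<~MARKDOWN
  [[edit](?section=Overview)] Overview
  --------

  Some body text with a [link](http://example.com).
MARKDOWN

# Remove the edit-link prefix; ordinary links are left alone.
cleaned = sample.gsub(header, '')
puts cleaned
```

Because `.` does not match a newline in Ruby by default, the greedy `.*` stays within a single line, so each heading is cleaned independently.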
@JamesMcMahon

Gemfile

source 'https://rubygems.org'

gem 'maruku'
gem 'wikicloth'
gem 'tidy_ffi'
gem 'pandoc-ruby'

Pandoc also lets you do this online at http://johnmacfarlane.net/pandoc/try/. That's what I ended up using when I found out I couldn't brew install Pandoc.

@todgru

todgru commented Oct 24, 2016

Note to anyone trying to run this on Ubuntu: Tidy and Pandoc must be installed on your system.

sudo apt-get install tidy pandoc -y
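
A quick way to see whether those system packages are in place before running the script is to look for their executables on `PATH`. The sketch below uses only the Ruby standard library; `find_executable` and `missing_tools` are hypothetical helper names, not part of the gist. It also assumes that finding a `tidy` binary is a reasonable proxy for the Tidy library that `tidy_ffi` loads, which may not hold on every system.

```ruby
# Return the full path of an executable found on the given PATH string, or nil.
def find_executable(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Return the names from the list that could not be found on PATH.
def missing_tools(names, path = ENV.fetch('PATH', ''))
  names.reject { |name| find_executable(name, path) }
end

missing = missing_tools(%w[tidy pandoc])
warn "Missing from PATH: #{missing.join(', ')}" unless missing.empty?
```

If anything is reported missing, the `apt-get` line above is the fix on Ubuntu.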
