@jasonm23
Forked from jystewart/example.rb
Created June 17, 2010 01:38
require 'rubygems'
require 'html2markdown'
first_block = <<END
<div id="wikicontent" style="padding:0 3em 1.2em 0">
<p><img src="http://cjcat2266.googlepages.com/Emitterlogo.png"> </p><h1><a name="Emitter_is_now_version_2.1">Emitter is now version 2.1</a><a href="#Emitter_is_now_version_2.1" class="section_anchor">¶</a></h1><p><a href="http://emitter.googlecode.com/svn/trunk/docs/index.html" rel="nofollow">Documentation</a> </p><p><a href="http://cjcat.blogspot.com/2009/06/using-tortoisesvn-to-check-out-files.html" rel="nofollow">How to check out the latest source files from the SVN repository</a> </p><hr><h2><a name="Emitter_Video_Tutorials_are_now_available_on_!!">Emitter Video Tutorials are now available on YouTube!!</a><a href="#Emitter_Video_Tutorials_are_now_available_on_!!" class="section_anchor">¶</a></h2><p><a href="http://www.youtube.com/view_play_list?p=84AC3DE6772538E4" rel="nofollow"><img src="http://www.sabredefence.com/images/youtube_logo.gif"></a> </p><p>Check out the complete playlist <a href="http://www.youtube.com/view_play_list?p=84AC3DE6772538E4" rel="nofollow">here</a>. </p><hr><h2><a name="Another_Particle_Engine_Project">Another Particle Engine Project</a><a href="#Another_Particle_Engine_Project" class="section_anchor">¶</a></h2><p><a href="http://code.google.com/p/stardust-particle-engine/" rel="nofollow"><img src="http://cjcat2266.googlepages.com/StardustLogoMediumShadowed.png"></a> </p><p><a href="http://code.google.com/p/stardust-particle-engine/" rel="nofollow">Stardust Particle Engine</a> is another particle engine project I'm working on, having much more features and extensibility. </p><hr><h2><a name="A_thorough_manual_will_be_available_soon.">A thorough manual will be available soon.</a><a href="#A_thorough_manual_will_be_available_soon." class="section_anchor">¶</a></h2><p>Here's a WIP manual. </p><p>It's already completely covered basic usage and every parameter in the Particle class. </p><p>And it also includes working code snippets. 
</p><p><a href="http://homepage.ntu.edu.tw/~b95901008/bbsfiles/AS3CS4/Emitter/Emitter%202.0%20manual.swf" rel="nofollow">Go to the WIP manual</a> </p><hr><h1><a name="Version_2.0_Screenshots">Version 2.0 Screenshots</a><a href="#Version_2.0_Screenshots" class="section_anchor">¶</a></h1><p></p><table><tbody><tr><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://1.bp.blogspot.com/_4-LtXwX7Yuo/SY3BJy2aY4I/AAAAAAAAATg/GOndG0_yl48/s400/bomb.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://3.bp.blogspot.com/_4-LtXwX7Yuo/SY3BJ12gB6I/AAAAAAAAATo/46X8ue0otTU/s400/snow.PNG"></td></tr> </tbody></table><p></p><hr><h1><a name="Components_available_now">Components available now</a><a href="#Components_available_now" class="section_anchor">¶</a></h1><h2><a name="Point_source_component">Point source component</a><a href="#Point_source_component" class="section_anchor">¶</a></h2><p>Use the Emitter 2.0 engine without even writing a single line of code! </p><p></p><table><tbody><tr><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://4.bp.blogspot.com/_4-LtXwX7Yuo/SY1zfIisVJI/AAAAAAAAATY/fZXXESyZsmU/s400/point+source+component.PNG"></td></tr> </tbody></table><p></p><h2><a name="Performance_monitor_component">Performance monitor component</a><a href="#Performance_monitor_component" class="section_anchor">¶</a></h2><p>Helps you get an idea of your application's performance. </p><p></p><table><tbody><tr><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://1.bp.blogspot.com/_4-LtXwX7Yuo/SY1zfAlR1eI/AAAAAAAAATQ/sEZJirM43Og/s400/performance+monitor+component.PNG"></td></tr> </tbody></table><p></p><hr><h2><a name="Version_2.0_Features:">Version 2.0 Features:</a><a href="#Version_2.0_Features:" class="section_anchor">¶</a></h2><p>Easy-to-use API. </p><p>A more comprehensive structure. </p><p>Behavior and behavior triggers. </p><p>Particles can be custom MovieClip symbols. </p><p>Multiple sources for a single emitter. 
</p><p>Four kinds of basic-shaped sources. </p><p>DisplayObjectSource for custom-shaped particle source. </p><p>Bursting - sudden massive particle spawning. </p><p>Gravitation simulation. </p><p>Bubblemotion simulation. </p><p>Deflector simulation. </p><hr><h1><a name="Version_1.0_Screenshots">Version 1.0 Screenshots</a><a href="#Version_1.0_Screenshots" class="section_anchor">¶</a></h1><p></p><table><tbody><tr><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp3.blogger.com/_4-LtXwX7Yuo/SErPLIC3J5I/AAAAAAAAAGM/HxK1Ts5xUrc/s400/Emitter.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp2.blogger.com/_4-LtXwX7Yuo/SE5Q65rTJFI/AAAAAAAAAH8/Ee7YoGkQ5H4/s400/LE.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp2.blogger.com/_4-LtXwX7Yuo/SEwsS_0J6yI/AAAAAAAAAG8/5aaF4cTumbg/s400/circ.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp2.blogger.com/_4-LtXwX7Yuo/SEwsR-Ux3WI/AAAAAAAAAG0/8TxocGtZLtU/s400/rect.PNG"></td></tr> </tbody></table><p></p><p></p><table><tbody><tr><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp2.blogger.com/_4-LtXwX7Yuo/SGcPizeCMVI/AAAAAAAAAJE/JqpluCB8u4c/s400/LDD.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp2.blogger.com/_4-LtXwX7Yuo/SE1jr7AsMDI/AAAAAAAAAHU/f-giHE46Ec0/s400/gravity.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp2.blogger.com/_4-LtXwX7Yuo/SE403-PXqvI/AAAAAAAAAH0/0FK9llYhjAg/s400/ranbow.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp2.blogger.com/_4-LtXwX7Yuo/SFAdR5NVZvI/AAAAAAAAAIM/_cq7mjwFpoM/s400/Missiles.PNG"></td></tr> </tbody></table><p></p><p></p><table><tbody><tr><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp3.blogger.com/_4-LtXwX7Yuo/SFPWMcyhCLI/AAAAAAAAAIk/d4wfgmPdhUM/s400/Storm.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img 
src="http://bp2.blogger.com/_4-LtXwX7Yuo/SGZd2BVW2KI/AAAAAAAAAIs/1Zjz50G3Dm8/s400/Bubbles.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp0.blogger.com/_4-LtXwX7Yuo/SGczWrXljnI/AAAAAAAAAJM/7GYAZp4CEsM/s400/WS2.PNG"></td><td style="border: 1px solid #aaa; padding: 5px;"><img src="http://bp1.blogger.com/_4-LtXwX7Yuo/SH5PiD-tgaI/AAAAAAAAAJw/e_KhOLGwjsc/s400/CJWp.PNG"></td></tr> </tbody></table><p></p>
</div>
END
parser = HTMLToMarkdownParser.new
parser.feed(first_block)
puts parser.to_markdown
require 'html2textile'
first_block = <<END
<div class="column span-3">
<h3 class="storytitle entry-title" id="post-312">
<a href="http://jystewart.net/process/2007/11/converting-html-to-textile-with-ruby/" rel="bookmark">Converting HTML to Textile with Ruby</a>
</h3>
<p>
<span>23 November 2007</span>
(<abbr class="updated" title="2007-11-23T19:51:54+00:00">7:51 pm</abbr>)
</p>
<p>
By <span class="author vcard fn">James Stewart</span>
<br />filed under:
<a href="http://jystewart.net/process/category/snippets/" title="View all posts in Snippets" rel="category tag">Snippets</a>
<br />tagged: <a href="http://jystewart.net/process/tag/content-management/" rel="tag">content management</a>,
<a href="http://jystewart.net/process/tag/conversion/" rel="tag">conversion</a>,
<a href="http://jystewart.net/process/tag/html/" rel="tag">html</a>,
<a href="http://jystewart.net/process/tag/python/" rel="tag">Python</a>,
<a href="http://jystewart.net/process/tag/ruby/" rel="tag">ruby</a>,
<a href="http://jystewart.net/process/tag/textile/" rel="tag">textile</a>
</p>
<div class="feedback">
<script src="http://feeds.feedburner.com/~s/jystewart/iLiN?i=http://jystewart.net/process/2007/11/converting-html-to-textile-with-ruby/" type="text/javascript" charset="utf-8"></script>
</div>
</div>
END
parser = HTMLToTextileParser.new
parser.feed(first_block)
puts parser.to_textile
require 'sgml-parser'
# A class to convert HTML to markdown. Based on html2textile.rb by James Stewart
# Read more at http://jystewart.net/process/2007/11/converting-html-to-textile-with-ruby
#
# Authors:: Jasonm23@gmail.com
# License:: Distributed under the same terms as Ruby
# This class is an implementation of an SGMLParser designed to convert
# HTML to markdown.
#
# Example usage:
#
# require 'html2markdown'
# parser = HTMLToMarkdownParser.new
# parser.feed(input_html)
# puts parser.to_markdown
#
class HTMLToMarkdownParser < SGMLParser
attr_accessor :result
attr_accessor :in_block
attr_accessor :data_stack
attr_accessor :a_href
attr_accessor :in_ul
attr_accessor :in_ol
attr_accessor :in_pre
def initialize(verbose=nil)
@output = String.new
self.in_block = false
self.result = []
self.data_stack = []
super(verbose)
end
# Normalise space in the same manner as HTML: any run of whitespace
# characters is replaced with a single space.
# Inside a pre block, whitespace is left alone and each line is indented
# by four spaces so that Markdown renders it as a code block.
def normalise_space(s)
if in_pre
s.to_s.gsub(/^/, '    ')
else
s.to_s.gsub(/\s+/x, ' ')
end
end
def make_block_start_pair(tag, attributes)
attributes = attrs_to_hash(attributes)
write("\n\n#{tag} ")
start_capture(tag)
end
def make_block_end_pair
stop_capture_and_write
write("\n\n")
end
def make_quicktag_start_pair(tag, wrapchar, attributes)
attributes = attrs_to_hash(attributes)
write([" ", "#{wrapchar}"])
start_capture(tag)
end
def make_quicktag_end_pair(wrapchar)
stop_capture_and_write
write([wrapchar, " "])
end
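# Append output fragments. Array(d) accepts a String, an Array, or nil;
# String#to_a was removed in Ruby 1.9.
def write(d)
if self.data_stack.size < 2
self.result += Array(d)
else
self.data_stack[-1] += Array(d)
end
end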
def start_capture(tag)
self.in_block = tag
self.data_stack.push([])
end
def stop_capture_and_write
self.in_block = false
self.write(self.data_stack.pop)
end
def handle_data(data)
write(normalise_space(data).strip) unless data.nil? or data == ''
end
%w[1 2 3 4 5 6].each do |num|
define_method "start_h#{num}" do |attributes|
write("\n\n")
make_block_start_pair("#" * num.to_i, attributes)
end
define_method "end_h#{num}" do
make_block_end_pair
end
end
PAIRS = { 'blockquote' => '> ' }
QUICKTAGS = { 'b' => '**', 'strong' => '__', 'i' => '*', 'em' => '_' }
PAIRS.each do |key, value|
define_method "start_#{key}" do |attributes|
make_block_start_pair(value, attributes)
end
define_method "end_#{key}" do
make_block_end_pair
end
end
QUICKTAGS.each do |key, value|
define_method "start_#{key}" do |attributes|
make_quicktag_start_pair(key, value, attributes)
end
define_method "end_#{key}" do
make_quicktag_end_pair(value)
end
end
def start_code(attrs)
if !self.in_pre
write(" `")
start_capture("code")
end
end
def end_code
if !self.in_pre
write("` ")
stop_capture_and_write
end
end
def start_pre(attrs)
self.in_pre = true
start_capture("pre")
# Four-space indent so Markdown treats the first line as a code block.
write("\n\n    ")
end
def end_pre
self.in_pre = false
stop_capture_and_write
write("\n\n\n\n")
end
def start_ol(attrs)
self.in_ol = true
end
def end_ol
self.in_ol = false
write("\n\n")
end
def start_ul(attrs)
self.in_ul = true
end
def end_ul
self.in_ul = false
write("\n")
end
def start_li(attrs)
if self.in_ol
write("1. ")
else
write("* ")
end
start_capture("li")
end
def end_li
stop_capture_and_write
write("\n")
end
def start_a(attrs)
attrs = attrs_to_hash(attrs)
self.a_href = attrs['href']
if self.a_href
# Skip in-page anchors (href="#..."); only real links become [text](url).
if self.a_href.match(/^#.*/) == nil
write(" [")
start_capture("a")
else
self.a_href = nil
end
end
end
def end_a
if self.a_href
stop_capture_and_write
write(["](", self.a_href, ") "])
self.a_href = false
end
end
def attrs_to_hash(array)
array.inject({}) { |collection, part| collection[part[0].downcase] = part[1]; collection }
end
def start_img(attrs)
attrs = attrs_to_hash(attrs)
write([" ![", attrs["alt"], "](", attrs["src"], ") "])
end
def end_img
end
def start_br(attrs)
write("\n")
end
def start_hr(attrs)
write("\n\n- - -\n\n")
end
# Return the markdown after processing
def to_markdown
result.join
end
end
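# A minimal standalone sketch of the capture-stack logic used by write,
# start_capture and stop_capture_and_write above. CaptureBuffer is a
# hypothetical name used only for illustration; it is not part of the gist.
# It assumes Ruby 1.9 or later, where String#to_a no longer exists, so the
# sketch uses Array(d) to accept either a String or an Array of fragments.

```ruby
# Hypothetical illustration class; not part of html2markdown.rb.
class CaptureBuffer
  attr_reader :result

  def initialize
    @result = []
    @data_stack = []
  end

  # With fewer than two open captures, fragments go straight to the result;
  # otherwise they are buffered in the innermost capture.
  def write(d)
    if @data_stack.size < 2
      @result += Array(d)
    else
      @data_stack[-1] += Array(d)
    end
  end

  def start_capture
    @data_stack.push([])
  end

  # Pop the innermost capture and hand its fragments to the enclosing level.
  def stop_capture_and_write
    write(@data_stack.pop)
  end
end

buffer = CaptureBuffer.new
buffer.write("\n\n# ")             # heading marker
buffer.start_capture               # <h1>
buffer.write('Title')
buffer.write([' [', 'link text'])  # an Array of fragments works too
buffer.start_capture               # nested <a>
buffer.write('nested')
buffer.stop_capture_and_write      # </a>
buffer.stop_capture_and_write      # </h1>
puts buffer.result.join.inspect
```

# Because fragments are always appended in document order, the joined
# result reads the same whether a fragment was buffered or written directly.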
require 'sgml-parser'
# A class to convert HTML to textile. Based on the python parser
# found at http://aftnn.org/content/code/html2textile/
#
# Read more at http://jystewart.net/process/2007/11/converting-html-to-textile-with-ruby
#
# Author:: James Stewart (mailto:james@jystewart.net)
# Copyright:: Copyright (c) 2007 James Stewart
# License:: Distributed under the same terms as Ruby
# This class is an implementation of an SGMLParser designed to convert
# HTML to textile.
#
# Example usage:
# parser = HTMLToTextileParser.new
# parser.feed(input_html)
# puts parser.to_textile
class HTMLToTextileParser < SGMLParser
attr_accessor :result
attr_accessor :in_block
attr_accessor :data_stack
attr_accessor :a_href
attr_accessor :in_ul
attr_accessor :in_ol
@@permitted_tags = ["pre", "code"]
@@permitted_attrs = []
def initialize(verbose=nil)
@output = String.new
self.in_block = false
self.result = []
self.data_stack = []
super(verbose)
end
# Normalise space in the same manner as HTML. Any substring of multiple
# whitespace characters will be replaced with a single space char.
def normalise_space(s)
s.to_s.gsub(/\s+/x, ' ')
end
def build_styles_ids_and_classes(attributes)
idclass = ''
idclass += attributes['class'] if attributes.has_key?('class')
idclass += "\##{attributes['id']}" if attributes.has_key?('id')
idclass = "(#{idclass})" if idclass != ''
style = attributes.has_key?('style') ? "{#{attributes['style']}}" : ""
"#{idclass}#{style}"
end
def make_block_start_pair(tag, attributes)
attributes = attrs_to_hash(attributes)
class_style = build_styles_ids_and_classes(attributes)
write("#{tag}#{class_style}. ")
start_capture(tag)
end
def make_block_end_pair
stop_capture_and_write
write("\n\n")
end
def make_quicktag_start_pair(tag, wrapchar, attributes)
attributes = attrs_to_hash(attributes)
class_style = build_styles_ids_and_classes(attributes)
write([" ", "#{wrapchar}#{class_style}"])
start_capture(tag)
end
def make_quicktag_end_pair(wrapchar)
stop_capture_and_write
write([wrapchar, " "])
end
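# Append output fragments. Array(d) accepts a String, an Array, or nil;
# String#to_a was removed in Ruby 1.9.
def write(d)
if self.data_stack.size < 2
self.result += Array(d)
else
self.data_stack[-1] += Array(d)
end
end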
def start_capture(tag)
self.in_block = tag
self.data_stack.push([])
end
def stop_capture_and_write
self.in_block = false
self.write(self.data_stack.pop)
end
def handle_data(data)
write(normalise_space(data).strip) unless data.nil? or data == ''
end
%w[1 2 3 4 5 6].each do |num|
define_method "start_h#{num}" do |attributes|
make_block_start_pair("h#{num}", attributes)
end
define_method "end_h#{num}" do
make_block_end_pair
end
end
PAIRS = { 'blockquote' => 'bq', 'p' => 'p' }
QUICKTAGS = { 'b' => '*', 'strong' => '*',
'i' => '_', 'em' => '_', 'cite' => '??', 's' => '-',
'sup' => '^', 'sub' => '~', 'code' => '@', 'span' => '%'}
PAIRS.each do |key, value|
define_method "start_#{key}" do |attributes|
make_block_start_pair(value, attributes)
end
define_method "end_#{key}" do
make_block_end_pair
end
end
QUICKTAGS.each do |key, value|
define_method "start_#{key}" do |attributes|
make_quicktag_start_pair(key, value, attributes)
end
define_method "end_#{key}" do
make_quicktag_end_pair(value)
end
end
def start_ol(attrs)
self.in_ol = true
end
def end_ol
self.in_ol = false
write("\n")
end
def start_ul(attrs)
self.in_ul = true
end
def end_ul
self.in_ul = false
write("\n")
end
def start_li(attrs)
if self.in_ol
write("# ")
else
write("* ")
end
start_capture("li")
end
def end_li
stop_capture_and_write
write("\n")
end
def start_a(attrs)
attrs = attrs_to_hash(attrs)
self.a_href = attrs['href']
if self.a_href
write(" \"")
start_capture("a")
end
end
def end_a
if self.a_href
stop_capture_and_write
write(["\":", self.a_href, " "])
self.a_href = false
end
end
def attrs_to_hash(array)
array.inject({}) { |collection, part| collection[part[0].downcase] = part[1]; collection }
end
def start_img(attrs)
attrs = attrs_to_hash(attrs)
write([" !", attrs["src"], "! "])
end
def end_img
end
def start_tr(attrs)
end
def end_tr
write("|\n")
end
def start_td(attrs)
write("|")
start_capture("td")
end
def end_td
stop_capture_and_write
write("|")
end
def start_br(attrs)
write("\n")
end
def unknown_starttag(tag, attrs)
if @@permitted_tags.include?(tag)
write(["<", tag])
attrs.each do |key, value|
if @@permitted_attrs.include?(key)
write([" ", key, "=\"", value, "\""])
end
end
# Close the opening tag so permitted elements are emitted as valid markup.
write(">")
end
end
def unknown_endtag(tag)
if @@permitted_tags.include?(tag)
write(["</", tag, ">"])
end
end
# Return the textile after processing
def to_textile
result.join
end
# UNCONVERTED PYTHON METHODS
#
# def handle_charref(self, tag):
# self._write(unichr(int(tag)))
#
# def handle_entityref(self, tag):
# if self.entitydefs.has_key(tag):
# self._write(self.entitydefs[tag])
#
# def handle_starttag(self, tag, method, attrs):
# method(dict(attrs))
#
end
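# A small standalone sketch of how build_styles_ids_and_classes above turns
# HTML class, id and style attributes into a Textile modifier string.
# textile_modifiers is a hypothetical free-function name used only for this
# illustration; the attribute values are taken from the example HTML in
# example.rb.

```ruby
# Hypothetical standalone version of build_styles_ids_and_classes.
# Expects a Hash with lower-case String keys, as attrs_to_hash produces.
def textile_modifiers(attributes)
  idclass = ''
  idclass += attributes['class'] if attributes.key?('class')
  idclass += '#' + attributes['id'] if attributes.key?('id')
  idclass = "(#{idclass})" unless idclass.empty?
  style = attributes.key?('style') ? "{#{attributes['style']}}" : ''
  "#{idclass}#{style}"
end

# <h3 class="storytitle entry-title" id="post-312"> becomes a Textile block start.
heading_mods = textile_modifiers('class' => 'storytitle entry-title', 'id' => 'post-312')
heading_start = "h3#{heading_mods}. "

# A style-only attribute set yields just the {...} part.
style_mods = textile_modifiers('style' => 'padding: 5px;')

# No relevant attributes yields an empty modifier.
plain_mods = textile_modifiers({})

puts heading_start
puts style_mods
```

# The block start is assembled the same way make_block_start_pair does it:
# tag name, then the modifier string, then ". ".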
# A parser for SGML, using the derived class as static DTD.
class SGMLParser
# Regular expressions used for parsing:
Interesting = /[&<]/
Incomplete = Regexp.compile('&([a-zA-Z][a-zA-Z0-9]*|#[0-9]*)?|' +
'<([a-zA-Z][^<>]*|/([a-zA-Z][^<>]*)?|' +
'![^<>]*)?')
Entityref = /&([a-zA-Z][-.a-zA-Z0-9]*)[^-.a-zA-Z0-9]/
Charref = /&#([0-9]+)[^0-9]/
Starttagopen = /<[>a-zA-Z]/
Endtagopen = /<\/[<>a-zA-Z]/
Endbracket = /[<>]/
Special = /<![^<>]*>/
Commentopen = /<!--/
Commentclose = /--[ \t\n]*>/
Tagfind = /[a-zA-Z][a-zA-Z0-9.-]*/
Attrfind = Regexp.compile('[\s,]*([a-zA-Z_][a-zA-Z_0-9.-]*)' +
'(\s*=\s*' +
"('[^']*'" +
'|"[^"]*"' +
'|[-~a-zA-Z0-9,./:+*%?!()_#=]*))?')
Entitydefs =
{'lt'=>'<', 'gt'=>'>', 'amp'=>'&', 'quot'=>'"', 'apos'=>'\''}
def initialize(verbose=false)
@verbose = verbose
reset
end
def reset
@rawdata = ''
@stack = []
@lasttag = '???'
@nomoretags = false
@literal = false
end
def has_context(gi)
@stack.include? gi
end
def setnomoretags
@nomoretags = true
@literal = true
end
def setliteral(*args)
@literal = true
end
def feed(data)
@rawdata << data
goahead(false)
end
def close
goahead(true)
end
def goahead(_end)
rawdata = @rawdata
i = 0
n = rawdata.length
while i < n
if @nomoretags
handle_data(rawdata[i..(n-1)])
i = n
break
end
j = rawdata.index(Interesting, i)
j = n unless j
if i < j
handle_data(rawdata[i..(j-1)])
end
i = j
break if (i == n)
if rawdata[i] == ?< #
if rawdata.index(Starttagopen, i) == i
if @literal
handle_data(rawdata[i, 1])
i += 1
next
end
k = parse_starttag(i)
break unless k
i = k
next
end
if rawdata.index(Endtagopen, i) == i
k = parse_endtag(i)
break unless k
i = k
@literal = false
next
end
if rawdata.index(Commentopen, i) == i
if @literal
handle_data(rawdata[i,1])
i += 1
next
end
k = parse_comment(i)
break unless k
i += k
next
end
if rawdata.index(Special, i) == i
if @literal
handle_data(rawdata[i, 1])
i += 1
next
end
k = parse_special(i)
break unless k
i += k
next
end
elsif rawdata[i] == ?& #
if rawdata.index(Charref, i) == i
i += $&.length
handle_charref($1)
i -= 1 unless rawdata[i-1] == ?;
next
end
if rawdata.index(Entityref, i) == i
i += $&.length
handle_entityref($1)
i -= 1 unless rawdata[i-1] == ?;
next
end
else
raise RuntimeError, 'neither < nor & ??'
end
# We get here only if incomplete matches but
# nothing else
match = rawdata.index(Incomplete, i)
unless match == i
handle_data(rawdata[i, 1])
i += 1
next
end
j = match + $&.length
break if j == n # Really incomplete
handle_data(rawdata[i..(j-1)])
i = j
end
# end while
if _end and i < n
handle_data(@rawdata[i..(n-1)])
i = n
end
@rawdata = rawdata[i..-1]
end
def parse_comment(i)
rawdata = @rawdata
if rawdata[i, 4] != '<!--'
raise RuntimeError, 'unexpected call to parse_comment'
end
match = rawdata.index(Commentclose, i)
return nil unless match
matched_length = $&.length
j = match
handle_comment(rawdata[i+4..(j-1)])
j = match + matched_length
return j-i
end
def parse_starttag(i)
rawdata = @rawdata
j = rawdata.index(Endbracket, i + 1)
return nil unless j
attrs = []
if rawdata[i+1] == ?> #
# SGML shorthand: <> == <last open tag seen>
k = j
tag = @lasttag
else
match = rawdata.index(Tagfind, i + 1)
unless match
raise RuntimeError, 'unexpected call to parse_starttag'
end
k = i + 1 + ($&.length)
tag = $&.downcase
@lasttag = tag
end
while k < j
break unless rawdata.index(Attrfind, k)
matched_length = $&.length
attrname, rest, attrvalue = $1, $2, $3
if not rest
attrvalue = '' # was: = attrname
elsif (attrvalue[0] == ?' && attrvalue[-1] == ?') or
(attrvalue[0] == ?" && attrvalue[-1] == ?")
attrvalue = attrvalue[1..-2]
end
attrs << [attrname.downcase, attrvalue]
k += matched_length
end
if rawdata[j] == ?> #
j += 1
end
finish_starttag(tag, attrs)
return j
end
def parse_endtag(i)
rawdata = @rawdata
j = rawdata.index(Endbracket, i + 1)
return nil unless j
tag = (rawdata[i+2..j-1].strip).downcase
if rawdata[j] == ?> #
j += 1
end
finish_endtag(tag)
return j
end
def finish_starttag(tag, attrs)
method = 'start_' + tag
if self.respond_to?(method)
@stack << tag
handle_starttag(tag, method, attrs)
return 1
else
method = 'do_' + tag
if self.respond_to?(method)
handle_starttag(tag, method, attrs)
return 0
else
unknown_starttag(tag, attrs)
return -1
end
end
end
def finish_endtag(tag)
if tag == ''
found = @stack.length - 1
if found < 0
unknown_endtag(tag)
return
end
else
unless @stack.include? tag
method = 'end_' + tag
unless self.respond_to?(method)
unknown_endtag(tag)
end
return
end
found = @stack.index(tag) #or @stack.length
end
while @stack.length > found
tag = @stack[-1]
method = 'end_' + tag
if respond_to?(method)
handle_endtag(tag, method)
else
unknown_endtag(tag)
end
@stack.pop
end
end
def parse_special(i)
rawdata = @rawdata
match = rawdata.index(Endbracket, i+1)
return nil unless match
matched_length = $&.length
handle_special(rawdata[i+1..(match-1)])
return match - i + matched_length
end
def handle_starttag(tag, method, attrs)
self.send(method, attrs)
end
def handle_endtag(tag, method)
self.send(method)
end
def report_unbalanced(tag)
if @verbose
print '*** Unbalanced </' + tag + '>', "\n"
# The class defines no stack accessor, so read the instance variable.
print '*** Stack:', @stack.inspect, "\n"
end
end
def handle_charref(name)
n = Integer(name)
if !(0 <= n && n <= 255)
unknown_charref(name)
return
end
handle_data(n.chr)
end
def handle_entityref(name)
table = Entitydefs
if table.include?(name)
handle_data(table[name])
else
unknown_entityref(name)
return
end
end
def handle_data(data)
end
def handle_comment(data)
end
def handle_special(data)
end
def unknown_starttag(tag, attrs)
end
def unknown_endtag(tag)
end
def unknown_charref(ref)
end
def unknown_entityref(ref)
end
end
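# A minimal standalone sketch of the handler lookup that finish_starttag
# performs above: try start_<tag>, then do_<tag>, and fall back to
# unknown_starttag. TinyTagDispatcher, dispatch_start and the log array are
# hypothetical names for illustration only. The sketch leaves out the tag
# stack that SGMLParser maintains for start_ handlers.

```ruby
# Hypothetical illustration class; not part of sgml-parser.rb.
class TinyTagDispatcher
  attr_reader :log

  def initialize
    @log = []
  end

  # Look up a handler by naming convention and call it with the attributes.
  def dispatch_start(tag, attrs)
    %w[start_ do_].each do |prefix|
      method = prefix + tag
      return send(method, attrs) if respond_to?(method)
    end
    unknown_starttag(tag, attrs)
  end

  # Handler for a paired tag.
  def start_em(attrs)
    @log << '_'
  end

  # Handler for an empty tag.
  def do_br(attrs)
    @log << "\n"
  end

  # Fallback for tags without a handler.
  def unknown_starttag(tag, attrs)
    @log << "<#{tag}?>"
  end
end

dispatcher = TinyTagDispatcher.new
dispatcher.dispatch_start('em', [])
dispatcher.dispatch_start('br', [])
dispatcher.dispatch_start('blink', [])
p dispatcher.log
```

# Subclasses such as the two converters only need to define methods that
# follow the naming convention; no registration table is required.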