@brixen
Created April 28, 2010 18:34
# encoding: utf-8
unless RUBY_VERSION =~ /^1\.9/
  $KCODE = 'u'
  puts "$KCODE == " + $KCODE
else
  puts "Encoding == " + "".encoding.name
end
s = "a *b* c *d*<"
# Extracted from redcloth3
re = /(^|[>\s\(]) # sta
(?!\-\-)
(\*\*|\*|\?\?|\-|__|_|%|\+|\^|~|) # oqs
(\*) # qtag
(\w|[^\s].*?[^\s]) # content
(?!\-\-)
\*
(\*\*|\*|\?\?|\-|__|_|%|\+|\^|~|) # oqa
(?=[[:punct:]]|\s|\)|$)/x
s.scan(re) { |m| p m }
# Attempt to simplify the regex
re = /(\*) # qtag
([^\*]) # content
\*
(?=[[:punct:]]|\s)/x
s.scan(re) { |m| p m }
$ ruby1.8.7 -v red.rb
ruby 1.8.7 (2010-01-10 patchlevel 249) [i686-darwin9.8.0]
$KCODE == UTF8
[" ", "", "*", "b", ""]
[" ", "", "*", "d", ""]
["*", "b"]
["*", "d"]
$ ruby1.9 -v red.rb
ruby 1.9.2dev (2010-04-28 trunk 27536) [i386-darwin9.8.0]
Encoding == UTF-8
[" ", "", "*", "b", ""]
["*", "b"]
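
In the 1.9 run above, the `*d*` match is missing from both scans. The only thing that sets it apart from `*b*` is the character the lookahead sees: `<` instead of a space. This suggests that `[[:punct:]]` does not match `<` on that build when the source is UTF-8. A minimal sketch of one possible workaround, assuming ASCII punctuation is all the lookahead needs to recognise, is to spell the ranges out explicitly. `ASCII_PUNCT`, `EMPHASIS_RE` and `scan_emphasis` are hypothetical names for illustration and do not come from redcloth3.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of redcloth3): the four ASCII punctuation
# ranges written out, so the result does not depend on how the regex
# engine defines [[:punct:]] for the current encoding.
#   !-/   0x21..0x2F
#   :-@   0x3A..0x40 (includes <)
#   [-`   0x5B..0x60
#   {-~   0x7B..0x7E
ASCII_PUNCT = /[!-\/:-@\[-`{-~]/

# Same shape as the simplified regex above; only the lookahead differs.
EMPHASIS_RE = /(\*)      # qtag
               ([^\*])   # content
               \*
               (?=#{ASCII_PUNCT}|\s)/x

# Returns an array of [qtag, content] pairs.
def scan_emphasis(text)
  text.scan(EMPHASIS_RE)
end

p scan_emphasis("a *b* c *d*<")
```

This prints `[["*", "b"], ["*", "d"]]`. The explicit class trades Unicode punctuation coverage for predictable behaviour, so it only fits if the surrounding markup is expected to be ASCII.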