Skip to content

Instantly share code, notes, and snippets.

@jordansissel
Created March 3, 2011 20:35
Show Gist options
  • Select an option

  • Save jordansissel/853495 to your computer and use it in GitHub Desktop.

Select an option

Save jordansissel/853495 to your computer and use it in GitHub Desktop.
Performance testing Ruby with syslog parsing
x86_64-linux/1.8.7: duration: 4.339848 / rate: 230.422816651643 / iterations: 1000
Thread ID: 70362254016940
Total: 4.340000
%self total self wait child calls name
41.24 4.34 1.79 0.00 2.55 1000 Syslog3164#parse
27.19 1.72 1.18 0.00 0.54 71000 Kernel#loop
12.67 0.55 0.55 0.00 0.00 786000 Array#[]
4.61 0.20 0.20 0.00 0.00 385000 Fixnum#+
2.76 0.12 0.12 0.00 0.00 239000 Fixnum#-
2.30 0.10 0.10 0.00 0.00 220000 Fixnum#==
1.84 0.08 0.08 0.00 0.00 176000 Fixnum#<
1.84 0.08 0.08 0.00 0.00 194000 Fixnum#<=
1.15 0.05 0.05 0.00 0.00 137000 Fixnum#>>
0.92 0.04 0.04 0.00 0.00 211000 Fixnum#>
0.92 0.04 0.04 0.00 0.00 52000 Fixnum#<<
0.69 0.05 0.03 0.00 0.02 4000 Syslog3164#popstring
0.46 0.02 0.02 0.00 0.00 58000 Fixnum#&
0.46 0.02 0.02 0.00 0.00 57000 Fixnum#~
0.23 0.01 0.01 0.00 0.00 1000 String#unpack
0.23 0.01 0.01 0.00 0.00 29000 Kernel#===
0.23 0.01 0.01 0.00 0.00 5000 String#[]
0.23 0.01 0.01 0.00 0.00 4000 Array#push
0.00 0.00 0.00 0.00 0.00 1 Class#new
0.00 0.00 0.00 0.00 0.00 1 Float#coerce
0.00 0.00 0.00 0.00 0.00 1 String#==
0.00 0.00 0.00 0.00 0.00 4002 <Class::Object>#allocate
0.00 0.00 0.00 0.00 0.00 2 <Class::Time>#allocate
0.00 0.00 0.00 0.00 0.00 1000 Array#length
0.00 0.00 0.00 0.00 0.00 1 Time#-
0.00 0.00 0.00 0.00 0.00 2 IO#write
0.00 4.34 0.00 0.00 4.34 1 Kernel#load
0.00 0.01 0.00 0.00 0.01 4000 Syslog3164#mark
0.00 0.00 0.00 0.00 0.00 2 Float#to_s
0.00 0.00 0.00 0.00 0.00 20 Module#attr_accessor
0.00 0.00 0.00 0.00 0.00 10 Module#private
0.00 0.00 0.00 0.00 0.00 1000 Kernel#binding
0.00 0.00 0.00 0.00 0.00 4000 Array#pop
0.00 0.00 0.00 0.00 0.00 1 Kernel#puts
0.00 4.34 0.00 0.00 4.34 1 Integer#times
0.00 0.00 0.00 0.00 0.00 2 <Class::Time>#now
0.00 0.00 0.00 0.00 0.00 1000 String#to_i
0.00 0.00 0.00 0.00 0.00 1 Fixnum#/
0.00 0.00 0.00 0.00 0.00 1 Float#/
0.00 0.00 0.00 0.00 0.00 2 Time#initialize
0.00 4.34 0.00 0.00 4.34 1 Global#[No method]
0.00 0.00 0.00 0.00 0.00 1 Class#inherited
0.00 0.00 0.00 0.00 0.00 1 Syslog3164#initialize
0.00 0.00 0.00 0.00 0.00 18 Module#method_added
0.00 0.00 0.00 0.00 0.00 1 Fixnum#to_s
0.00 0.00 0.00 0.00 0.00 5000 Array#size
0.00 0.00 0.00 0.00 0.00 28 Kernel#singleton_method_added
x86_64-linux/1.9.2: duration: 3.311163941 / rate: 302.00860417016725 / iterations: 1000
Thread ID: 11772940
Total: 3.270000
%self total self wait child calls name
62.69 3.25 2.05 0.00 1.20 1000 Syslog3164#parse
25.99 1.09 0.85 0.00 0.24 71000 Kernel#loop
3.67 0.12 0.12 0.00 0.00 137000 Fixnum#>>
2.45 0.08 0.08 0.00 0.00 52000 Fixnum#<<
2.14 0.07 0.07 0.00 0.00 57000 Fixnum#~
1.53 0.05 0.05 0.00 0.00 58000 Fixnum#&
0.31 3.27 0.01 0.00 3.26 1 Kernel#load
0.31 3.26 0.01 0.00 3.25 1 Integer#times
0.31 0.03 0.01 0.00 0.02 4000 Syslog3164#popstring
0.31 0.01 0.01 0.00 0.00 4000 Array#pop
0.31 0.01 0.01 0.00 0.00 5000 String#[]
0.00 0.00 0.00 0.00 0.00 2 IO#write
0.00 0.00 0.00 0.00 0.00 1 Time#-
0.00 0.00 0.00 0.00 0.00 1000 String#to_i
0.00 0.00 0.00 0.00 0.00 1 Numeric#quo
0.00 0.00 0.00 0.00 0.00 1 Rational#/
0.00 0.00 0.00 0.00 0.00 1 Rational#to_f
0.00 0.00 0.00 0.00 0.00 1 Fixnum#fdiv
0.00 0.00 0.00 0.00 0.00 1 Float#/
0.00 0.00 0.00 0.00 0.00 2 Float#to_s
0.00 0.00 0.00 0.00 0.00 4000 Array#push
0.00 0.00 0.00 0.00 0.00 4000 Syslog3164#mark
0.00 0.00 0.00 0.00 0.00 1 Fixnum#-
0.00 0.00 0.00 0.00 0.00 1 Fixnum#/
0.00 0.00 0.00 0.00 0.00 1000 Kernel#binding
0.00 0.00 0.00 0.00 0.00 1000 String#unpack
0.00 0.00 0.00 0.00 0.00 1 Fixnum#to_s
0.00 0.00 0.00 0.00 0.00 1 Kernel#puts
0.00 0.00 0.00 0.00 0.00 10 Module#private
0.00 0.00 0.00 0.00 0.00 28 BasicObject#singleton_method_added
0.00 0.00 0.00 0.00 0.00 1 Syslog3164#initialize
0.00 0.00 0.00 0.00 0.00 1 <Class::BasicObject>#allocate
0.00 0.00 0.00 0.00 0.00 1 Class#new
0.00 0.00 0.00 0.00 0.00 2 Fixnum#+
0.00 0.00 0.00 0.00 0.00 2 Time#initialize
0.00 0.00 0.00 0.00 0.00 2 <Class::Time>#allocate
0.00 0.00 0.00 0.00 0.00 2 <Class::Time>#now
0.00 0.00 0.00 0.00 0.00 18 Module#method_added
0.00 0.00 0.00 0.00 0.00 20 Module#attr_accessor
0.00 0.00 0.00 0.00 0.00 1 Class#inherited
0.00 0.00 0.00 0.00 0.00 162 Hash#default
0.00 0.00 0.00 0.00 0.00 4001 <Class::Range>#allocate
0.00 0.00 0.00 0.00 0.00 2 IO#set_encoding
0.00 0.00 0.00 0.00 0.00 1 IO#puts
0.00 3.27 0.00 0.00 3.27 2 Global#[No method]
%%{
# How to use this file.
# Run: ragel -R syslog_protocol.rl
# This will generate 'syslog_protocol.rb' which you can run to do this perf test.
machine syslog_rfc3164;
# Keep the current buffere position in instance variable '@pos' rather than
# the default local variable 'p'
variable p @pos ;
# notes:
# '@pos' is the current index in the string being parsed.
# 'cs' is the current state (ragel is a state machine)
#
# Literal ">" means enter actions
# Literal "%" means leaving action.
#
# In general, if we care about matched values, we will
# mark at the enter (which pushes the current position on the stack)
# and popstring at the leaving.
pri = ( "<" [0-9]{1,3} ">" ) >{mark("pri")}
%{
@pri = popstring("pri")[1...-1].to_i # trim '<' and '>'
# pri == (facility * 8) + severity
# TODO(sissel): maybe just make these getter functions.
@severity = @pri & 8
@facility = @pri >> 1
} ;
month = ( "Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun"
| "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec" ) ;
day = ((" "? [1-9]) | ([12] [0-9]) | ("3" [01])) ;
hour = (([01] [0-9]) | "2" [0-4]) ;
minute = ([0-5][0-9]) ;
second = ([0-5][0-9]) ;
time = ( hour ":" minute ":" second ) ;
timestamp = ( month " " day " " time )
>{mark("timestamp")}
%{ @timestamp = popstring("timestamp") };
# RFC 3164 says "Domain Name MUST NOT be included in the HOSTNAME field"
# But we'll ignore that because that's a stupid requirement and I'm
# not really convinced people follow it.
# The grammar for hostname here should match any ipv4, ipv6, or hostname.
hostname = ([A-Za-z0-9_.:]+) >{mark("hostname")} %{ @hostname = popstring("hostname") };
header = timestamp " " hostname ;
message = (32..127)+ >{mark("message")} %{ @message = popstring("message") } ;
# RFC 3164 section 4.1.2
payload = ( pri header " " message ) ;
main := payload
$err {
# Compute line and column of the cursor (@pos)
$stderr.puts "Error at line #{self.line(string, @pos)}, column #{self.column(string, @pos)}: #{string[@pos .. -1].inspect}"
# TODO(sissel): Note what we were expecting?
} ;
}%%
class Syslog3164
# syslog attributes
attr_accessor :pri
attr_accessor :facility
attr_accessor :severity
attr_accessor :timestamp
attr_accessor :hostname
attr_accessor :message
#attr_accessor :eof
def initialize
# BEGIN RAGEL DATA
%% write data;
# END RAGEL DATA
end
def parse(string)
@markstack = []
@string = string
# Have to reset @pos because Ragel doesn't.
@pos = 0
data = string.unpack("c*")
# Ragel needs to be told how long 'data' is so it knows when to trigger EOF.
eof = data.size
# BEGIN RAGEL INIT
%% write init;
# END RAGEL INIT
# Save local scope so convenience functions can access it.
@binding = binding
begin
# BEGIN RAGEL EXEC
%% write exec;
# END RAGEL EXEC
rescue => e
# TODO(sissel): report error appropriately?
raise e
end
if cs < self.syslog_rfc3164_first_final
$stderr.puts "Error at line #{self.line(string, @pos)}, column #{self.column(string, @pos)}: #{string[@pos .. -1].inspect}"
raise "Invalid payload: #{string}"
end
end # def parse
def line(str, pos)
return str[0 .. pos].count("\n") + 1
end
def column(str, pos)
return str[0 .. pos].split("\n").last.length
end
# Pop a string off the stack. Requires you've called 'mark' at least once prior.
#
# 'info' is an optional argument that can aid in debugging; give a string
# that identifies why or what you are marking.
def popstring(info)
# pop a string based on the last mark
if @markstack.size == 0
raise "No previous mark, cannot pop string."
end
# Get the current parser position.
#pos = eval("@pos", @binding)
value = @string[@markstack.pop ... @pos]
#puts "Popping string for #{info.inspect} (#{@markstack.size} pops left): #{value.inspect}"
return value
end # def popstring
# Push the current parser position onto the stack
# 'info' is an optional argument that can aid in debugging; give a string
# that identifies why or what you are marking.
def mark(info=nil)
#puts "Marking #{@pos} for #{info.inspect}"
@markstack.push(@pos)
end
end # class Syslog3164
if __FILE__ == $0
# This is the perf test.
# It will parse the same syslog message over and over.
start = Time.now
count = 100000
rfc3164 = Syslog3164.new
count.times do |i|
begin
rfc3164.parse("<12>Mar 1 15:43:35 snack kernel: Kernel logging (proc) stopped.")
rescue => e
puts "Error on attempt #{i + 1}"
raise e
end
end
duration = Time.now - start
version = "#{RUBY_PLATFORM}/#{RUBY_VERSION}"
version += "/#{JRUBY_VERSION}" if RUBY_PLATFORM == "java"
puts "#{version}: duration: #{duration} / rate: #{count / duration} / iterations: #{count}"
end
Allow (say, 1% to 3%) variance in results given these tests were run on my desktop which had other stuff running on it; 5-minute load average before/after test was 0.57 on a 4 core system.
YARV 1.9.2 and JRuby 1.6.0RC2 w/ --1.9 --fast had very similar performance.
MRI/YARV:
x86_64-linux/1.8.7: duration: 93.67088 / rate: 1067.5676368152 / iterations: 100000
x86_64-linux/1.9.2: duration: 31.507847164 / rate: 3173.8125261143596 / iterations: 100000
JRuby (no optimizations enabled, just defaults, 'ruby syslog_protocol.rb')
java/1.8.7/1.5.6: duration: 46.08 / rate: 2170.13888888889 / iterations: 100000
java/1.8.7/1.6.0.RC2: duration: 41.407 / rate: 2415.05059530997 / iterations: 100000
JRuby with --fast: (ruby --fast syslog_protocol.rb)
java/1.8.7/1.5.6: duration: 34.291 / rate: 2916.21708319967 / iterations: 100000
java/1.9.2dev/1.5.6: duration: 97.597 / rate: 1024.62165845262 / iterations: 100000
java/1.8.7/1.6.0.RC2: duration: 35.647 / rate: 2805.28515723623 / iterations: 100000
java/1.9.2/1.6.0.RC2: duration: 31.753 / rate: 3149.30872673448 / iterations: 100000
JRuby with all the optimizations enabled (ruby --fast -J-Djruby.compile.fastest syslog_protocol.rb)
java/1.8.7/1.6.0.RC2: duration: 33.892 / rate: 2950.54880207719 / iterations: 100000
java/1.9.2dev/1.5.6: DID NOT PERFORM THIS TEST
java/1.8.7/1.5.6: duration: 33.583 / rate: 2977.69704910222 / iterations: 100000
java/1.9.2/1.6.0.RC2: duration: 33.881 / rate: 2951.50674419291 / iterations: 100000
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment