jordansissel/Why JRuby.md

## Why JRuby.md

      
Long story short: I'm totally open to supporting more Rubies if possible. Details follow.

Related issue: http://code.google.com/p/logstash/issues/detail?id=37

Summary:

* Ruby core and stdlib change violently, without notice and without backwards compatibility. I want nothing of that.
* I need a cross-Ruby date library that isn't part of stdlib (see previous point) and is also good.
* I need an easy way to use multiple CPUs that is cross-Ruby (threads are not it).

Details:

Mainly, the Ruby core/stdlib API changes between Ruby 1.8 and 1.9 are very
poorly done. Some are documented while others are not. Some changes make
sense, while others do not. That was the main reason for originally deciding
to use JRuby.

JRuby lets me use Java libraries in place of crappy Ruby ones. For example,
there are some undocumented changes to datetime between Ruby 1.8 and 1.9, so
the logstash 'date' filter uses Joda-Time instead of Ruby's stdlib datetime.

Further, JRuby's performance options are currently much better than those of
MRI or YARV. At worst, in benchmarks, JRuby performs on par with YARV 1.9.2,
but since JRuby has actual threads, we can use more CPUs more easily and
pretty much beat plain Ruby.

Additionally, the Java debugging tools are quite excellent: jvisualvm, jstack, etc.

Lastly, I can very easily ship a single 'executable' that should work on most platforms with Java (see the monolithic jar logstash releases). I can't easily do this with other Rubies.

There are some parts of logstash that explicitly require Java currently: the
date filter, ElasticSearch support, and thread support.

The code is also only tested under Ruby 1.8.7, and the performance difference
between JRuby and MRI 1.8.7 is pretty huge. It might get better if you try
REE, but that's not really the same Ruby everyone's going to have.

The date filter can be made Ruby-friendly if someone writes a non-crappy date
parsing library in Ruby. The ones that ship with stdlib are not fast or safe to
use (Ruby core changes them wildly without notice).

ElasticSearch support is much faster in JRuby on the JVM than it was in pure
Ruby, because we are now using the Java API for ElasticSearch. Previously we
were using the HTTP/REST API via EventMachine and em-http-request, which
has much lower throughput.

Lastly, JRuby supports proper threading, so logstash can process events on
multiple CPU cores. MRI and YARV Ruby cannot do this without forking and
message passing.
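
To make the threading point concrete, here is a minimal sketch of a worker pool that pulls events off a queue and processes them on several threads. It is not logstash's actual code: `process_event` and `process_in_parallel` are hypothetical names invented for this illustration, and the "filter" just upcases a message. It uses only core `Thread` and `Queue`.

```ruby
require "thread" # needed for Queue on old Rubies; harmless on newer ones

# Hypothetical stand-in for a filter stage: takes an event hash, returns a new one.
def process_event(event)
  event.merge("message" => event["message"].upcase)
end

# Run process_event over all events using a fixed pool of worker threads.
# Results come back in completion order, not input order.
def process_in_parallel(events, worker_count = 4)
  input = Queue.new
  output = Queue.new
  events.each { |event| input << event }
  worker_count.times { input << :done } # one stop marker per worker

  workers = Array.new(worker_count) do
    Thread.new do
      while (event = input.pop) != :done
        output << process_event(event)
      end
    end
  end
  workers.each(&:join)

  # All workers have finished, so draining the output queue cannot block.
  Array.new(output.size) { output.pop }
end

events = (1..8).map { |i| { "id" => i, "message" => "line #{i}" } }
results = process_in_parallel(events)
puts results.sort_by { |event| event["id"] }.map { |event| event["message"] }.inspect
```

The same script runs on MRI and YARV too, but there the interpreter lock lets only one thread execute Ruby code at a time, so CPU-bound work like this stays on a single core. On JRuby the workers are real JVM threads and can spread across cores.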

The downside of using JRuby is a possibly larger memory footprint.

Again, I'm open to supporting non-JRuby Rubies, but there need to be answers for some of the above.