@jbrisbin /gist:1444077
Reactor-based framework versus Node.js streaming

I've been hacking away recently at a JVM framework for building asynchronous, non-blocking applications using a variation of the venerable Reactor pattern. The core of the framework is currently in Java. I started with Scala, then went with Java, and am now considering Scala again for the core. What can I say: I'm a grass-is-greener waffler! :) Whatever the core ends up in, the framework understands how to invoke Groovy Closures, Scala anonymous functions, and Clojure functions, so you can use it directly from those languages without needing wrappers.

I've been continually micro-benchmarking this framework because I feel that the JVM is a better foundation on which to build highly-concurrent, highly-scalable, C100K applications than V8 or Ruby. The problem so far has been that no good tools exist for JVM developers to leverage the excellent performance and manageability of the JVM. This yet-to-be-publicly-released framework is an effort to give Java, Groovy, Scala, [X JVM language] developers access to an easy-to-use programming model that removes the need to use synchronization and worry about concurrency issues, while making it easy to respond to events in multiple threads if that is more efficient for your application. Unlike the strict single-threadedness of Node.js, this framework gives you a choice of single-threaded efficiency or multi-threaded parallelism.

The benchmark below is of this reactor-based framework that uses the old-school Java NIO FileChannel.transferTo (sendfile) method to stream data from the filesystem to the client. The Node.js application uses streaming to pipe a file directly to the client.
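For readers who haven't used it, here is a minimal sketch of what streaming a file with FileChannel.transferTo can look like in plain Java. This is not the framework's code: the class name TransferToSketch and the method sendFile are made up for illustration, and the sketch assumes a blocking target channel. Whether the JVM actually uses sendfile underneath depends on the operating system and the kind of target channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of the framework described in this post.
final class TransferToSketch {

  // Streams the file at `source` into `target` and returns the number of bytes sent.
  // Assumes `target` is a blocking channel (for example, a blocking SocketChannel).
  static long sendFile(Path source, WritableByteChannel target) throws IOException {
    try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
      long size = in.size();
      long position = 0;
      // transferTo may move fewer bytes than requested, so keep going until done.
      while (position < size) {
        long sent = in.transferTo(position, size - position, target);
        if (sent <= 0) {
          break; // nothing more could be transferred (e.g. the file shrank underneath us)
        }
        position += sent;
      }
      return position;
    }
  }
}
```

The loop matters because a single transferTo call is allowed to send only part of the requested range.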

The Groovy code looks like this:

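import java.nio.file.Files
import java.nio.file.Paths

// `resources` (the base directory) and `contentType` are defined elsewhere in the script
def server = new HttpServer(3000)
  .on("/lib/{resource}**", {HttpMessage request ->
    // Map the captured path parameter onto a file under the resources directory
    def file = request.pathParam("resource")
    def path = Paths.get(resources, file)
    if (Files.exists(path)) {
      request.respond(StandardHttpResponses.ok(contentType, path))
    } else {
      request.respond(StandardHttpResponses.notFound(request.uri().path))
    }
  })
  .start()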

A similar application could be built with pure Java using annotations. The POJO delegate would look something like:

@On("/lib/{resource}**") @Get
public void serveStatic(HttpMessage request) {
  String file = request.pathParam("resource");
  Path path = Paths.get(resources, file);
  if (Files.exists(path)) {
    request.respond(StandardHttpResponses.ok(contentType, path));
  } else {
    request.respond(StandardHttpResponses.notFound(request.uri().path));
  }
}

The Node.js code looks like this:

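var http = require("http");
var fs = require("fs");
var path = require("path");

http.createServer(
  function (req, res) {
    // Resolve the requested URL against the local "static" directory
    var pth = path.join("static", req.url);
    var rs = fs.createReadStream(pth);
    // A missing or unreadable file surfaces as an "error" event
    rs.on("error", function() {
      res.writeHead(404);
      res.end();
    });
    // "open" is the documented fs.ReadStream event, emitted once the file descriptor is available
    rs.once("open", function() {
      res.writeHead(200, {'Content-Type': 'application/octet-stream'});
    });
    rs.pipe(res);
  }
).listen(3001, "127.0.0.1");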

When I ran Apache Bench against these servers (100 concurrent users downloading a 1MB file), the JVM framework handily out-performed the Node.js version. By handily, I mean it was more than twice as fast and supported twice the throughput:

Node.js:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    3   1.0      3       4
Processing:   246  252   2.3    252     254
Waiting:       21   32   6.3     34      41
Total:        248  254   2.4    255     257

Percentage of the requests served within a certain time (ms)
  50%    255
  66%    256
  75%    256
  80%    256
  90%    257
  95%    257
  98%    257
  99%    257
 100%    257 (longest request)

JVM framework:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    3   0.9      3       4
Processing:    64  118  17.1    123     135
Waiting:        4   16   9.2     13      43
Total:         67  121  17.2    126     138

Percentage of the requests served within a certain time (ms)
  50%    126
  66%    132
  75%    135
  80%    135
  90%    136
  95%    138
  98%    138
  99%    138
 100%    138 (longest request)

From the gallery: "No fair! You're using multiple threads!"

Since setting up 4 Node.js processes and configuring the load balancing was more than I wanted to take on just for a simple microbenchmark, I dropped the concurrent users to 1 and re-ran the tests:

Node.js:

Requests per second:    270.12 [#/sec] (mean)
Time per request:       3.702 [ms] (mean)
Time per request:       3.702 [ms] (mean, across all concurrent requests)
Transfer rate:          276612.41 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:     3    4   2.3      3      32
Waiting:        0    1   1.0      0      25
Total:          3    4   2.3      3      32
WARNING: The median and mean for the waiting time are not within a normal deviation
        These results are probably not that reliable.

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      3
  75%      4
  80%      4
  90%      4
  95%      4
  98%      6
  99%     15
 100%     32 (longest request)

JVM framework:

Requests per second:    228.89 [#/sec] (mean)
Time per request:       4.369 [ms] (mean)
Time per request:       4.369 [ms] (mean, across all concurrent requests)
Transfer rate:          234428.13 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     2    4   0.5      4       8
Waiting:        0    0   0.1      0       3
Total:          2    4   0.5      4       8

Percentage of the requests served within a certain time (ms)
  50%      4
  66%      4
  75%      5
  80%      5
  90%      5
  95%      5
  98%      5
  99%      6
 100%      8 (longest request)

With only a single concurrent client, the JVM did slightly better on response-time consistency (a much smaller standard deviation, 0.5 ms versus 2.3 ms, with the same rounded mean of 4 ms), while Node.js delivered about 18% more throughput (270.12 versus 228.89 requests per second).

JVM haters will never be convinced

The point of this is not to bash Node.js. It's a great platform for some applications that can benefit from the things it does well. It's also fine for dogmatic JVM-haters to dismiss any of these tests as flawed or irrelevant. They'll never be open-minded enough to take an honest look at what the JVM can do for this new class of applications that need C100K capabilities but would benefit from the decades of engineering that's gone into the JVM and the plethora of management tools and operational experience with the platform.

Combining the JVM with a better framework for writing non-blocking, evented applications is the goal. These tests just confirm for me that that goal is achievable and that there is benefit to be had from such a framework. It also tells me that the weaknesses of the JVM for C100K applications have more to do with the programming model than they do with the JVM itself. If I were to tune the JVM running these tests, rather than use the default settings, I could likely get even better numbers than these. The JVM has a boatload of knobs to turn that I simply didn't take the time to tweak.

@mraleph

it would be very helpful if you also included node.js version.

@tblachowicz

This is interesting, indeed. What is the fully-qualified class name of the HttpServer class you use in the Groovy implementation? I like the DSL used there to specify request handlers!

@lojikil
From the gallery: "No fair! You're using multiple threads!"

Since setting up 4 Node.js processes and configuring the load balancing was more than I wanted to take on just for a simple microbenchmark, I dropped the concurrent users to 1 and re-ran the tests:

Depending on the version of Node.js you're using, you could just use Node's cluster to help you out with this.

@kitd

Interesting. Deft Server does something similar and has some favourable benchmarks as well.

BTW, your annotations look very similar to those for JAX-RS / Jersey. You could swap to support those.

As regards concurrency, I think the JVM struggles to handle it easily. You either need synchronization, copy-on-write support or fast serialization/deserialization. And you also need a SW designer who understands each perfectly and when to use it, and software that will never change from needing one type to another. Good luck with that!

@thejh

Does your java stuff do some kind of in-RAM caching?

@burtonator

Great post... you made one error though.

Your JVM code blocks in the event loop. It will have to stat() the file which will block.

It may be that the underlying OS has cached the inode for this file in your test, but on a production system the VFS might evict the inode and your performance will dive.

Also, ALL of the initial requests are blocked until the inode has been cached.

@burtonator

Also, I think BOTH tests are unfair.

Couldn't you just drop the number of threads Java is using to 1? Then re-run it with concurrent users? That should mean that Java is running on one core, just like Node.js.

@JustANull

Why should he have to limit one language to make up for the failings of another? While he certainly shouldn't purposefully optimize one, it is only fair to let it do whatever it wants on its own

@burtonator

@whitewater ... You should benchmark against how it would normally be written in the optimal form in that language. Otherwise, why not just handicap the results until you get what you want from the benchmark?

Blocking in the IO thread is not a good idea, and in Java the right way to do this would be to do an async stat by using an executor.
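A rough sketch of that suggestion, for illustration only: the existence check (which stats the file) is handed to an executor so that the calling thread is not blocked. The names AsyncStatSketch and existsAsync are hypothetical, and CompletableFuture assumes Java 8 or later, which is newer than the JDK 7 discussed in this thread.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical helper for illustration; not taken from the framework under discussion.
final class AsyncStatSketch {

  // Runs the potentially blocking Files.exists check on `ioPool`
  // instead of on the thread that calls this method.
  static CompletableFuture<Boolean> existsAsync(Path path, Executor ioPool) {
    return CompletableFuture.supplyAsync(() -> Files.exists(path), ioPool);
  }
}
```

A handler would then continue in a callback, for example existsAsync(path, ioPool).thenAccept(found -> { /* respond with 200 or 404 */ }), rather than waiting on the result.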

@JustANull

@burtonator I wasn't trying to say that it shouldn't be optimized, I guess I miswrote what I was trying to say... More of a "don't specially optimize for serving one megabyte files to one hundred clients". I completely agree with your argument, though.

@jbrisbin
Owner

Thanks for all the discussion and suggestions! :) I'm not so much interested in a horserace between Node.js and the JVM. My entire effort with this framework is to give Java and JVM developers a way to write non-blocking, potentially asynchronous applications, without worrying about concurrency issues and synchronization. Those things don't apply for the vast majority of this framework because the code is, as far as is possible, executed by the same thread that started the whole task.

A secondary (or even tertiary) concern is how this JVM framework compares to other non-blocking frameworks. Node.js is probably the most popular and most-bandied about at the moment, so it makes sense to see how the JVM compares to V8. The point of these tests is that it compares quite favorably and that the handicap is not the technical limitations of the foundation, but the application frameworks and programming models that are available to take advantage of all this performance and engineering.

This framework is (hopefully) soon to be publicly admitted to. We'll decide on a name (likely some variation of "Something Reactor" to reflect its origins as an implementation of the Reactor pattern for the JVM). Multiple annotation models will be supported, including things like JAX-RS, Spring MVC, these custom annotations, etc... The logic used to map methods to events is pluggable, and different implementations can do things differently. Currently, the scheme I demonstrated is the only one I've had time to implement. :) It also supports Groovy, Scala, and Clojure natively (i.e. no wrappers required, simply assign a Closure or whatever as an event handler and the framework knows how to invoke that).

@jbrisbin
Owner

@KitD You actually don't need any of that stuff because almost all of your asynchronous work can actually be scheduled to be run on the originating thread. You just need a competent framework to do that sort of thing. I'm using HawtDispatch at the moment, but we're talking about maybe using a Disruptor RingBuffer or some other similar abstraction.

I'm at the Basho offices in San Francisco at the moment and we were just having a discussion about this very topic not 15 minutes ago. :)

@bloodredsun

If you are looking to "give Java and JVM developers a way to write non-blocking, potentially asynchronous applications, without worrying about concurrency issues and synchronization" I'm surprised that you have not mentioned Actor libraries, especially since your Scala experience means that you would have come across Akka. Akka would look miserably complicated for this simple example, but for more complex examples it does a great job of hiding the complexity of concurrency while giving the best of both the event and threaded worlds.

@jbrisbin
Owner

The library I'm writing actually is, in a sense, an Actor library. But a purely Actor model isn't, IMO, flexible enough. You still have two different "kinds" of programming models when writing an application that uses Actors. By that, I mean, there is the very imperative use of an actor (actorRef ! msg) and when you're doing asynchronous or messaging, there is a different model (subscriber.publish(...)). My goal is to combine both and make it easy for a programmer to respond to external events (coming from a message broker, maybe) and internal events (coming from another thread) using a common API since the two tasks do not, by necessity, have to be done differently (reactor.on("event", handler) and reactor.emit("event", payload) covers both situations and is self-documenting).

In a Scala version of this code, I actually defined a bang method (!). It didn't offer anything more (syntactically) than reactor.emit("event", payload), though. The emit call is more characters to type, of course. :) But Java can't define such a construct, so it doesn't matter anyway.
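To make the on/emit idea concrete, here is a minimal sketch of what such an API could look like in plain Java. It is an illustration only and not the framework's implementation: the class ReactorSketch, its constructor, and the choice to pass in an Executor as the dispatcher are all assumptions, and it uses Java 8 lambdas and functional interfaces.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical illustration only; the real framework's classes are not shown in this thread.
final class ReactorSketch {
  private final Map<String, List<Consumer<Object>>> handlers = new ConcurrentHashMap<>();
  private final Executor dispatcher;

  // `dispatcher` decides which thread runs the handlers.
  ReactorSketch(Executor dispatcher) {
    this.dispatcher = dispatcher;
  }

  // Registers `handler` for `event`; returns this so calls can be chained.
  ReactorSketch on(String event, Consumer<Object> handler) {
    handlers.computeIfAbsent(event, key -> new CopyOnWriteArrayList<>()).add(handler);
    return this;
  }

  // Hands `payload` to every handler registered for `event`, via the dispatcher.
  ReactorSketch emit(String event, Object payload) {
    List<Consumer<Object>> registered =
        handlers.getOrDefault(event, Collections.<Consumer<Object>>emptyList());
    for (Consumer<Object> handler : registered) {
      dispatcher.execute(() -> handler.accept(payload));
    }
    return this;
  }
}
```

With a single-threaded executor as the dispatcher, handlers run one at a time in submission order, which is one simple way to get ordered tasks without concurrent access. Usage would be along the lines of new ReactorSketch(executor).on("event", payload -> System.out.println(payload)).emit("event", "hello").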

@jbrisbin
Owner

@burtonator This code is not based on an event loop. It is multi-threaded but tasks are ordered so there isn't any concurrent access (unless you intentionally do that). I've found that it's more performant to do certain blocking operations, particularly when it comes to file IO. It's not always easy to determine where the line is between "I need to do this in another thread" and "it's okay to do that operation in this thread". I certainly wouldn't try and issue an HTTP request like this. But using the traditional, blocking FileChannel from another thread (which is what's happening under the covers), is orders of magnitude faster than using the JDK 7 AsynchronousFileChannel. The same holds true for sockets. The traditional synchronous sockets are more performant than the asynchronous ones.

The gist of this (no pun intended) is that arbitrarily enforcing a "never, ever block on anything" policy is certainly a more "pure" approach, but it is not as performant as the mixed, blocking (but in another thread) and non-blocking (which is also in another thread, you just don't see it directly :) pragmatic approach I've taken with this framework.

At some point, it's impossible to be purely non-blocking: there are only so many threads (one per processor in my case), a thread has to actually perform the work, and context switching is relatively expensive for some operations. My benchmarks have shown me that proper use of blocking IO can give you significant performance gains while not introducing concurrency artifacts.

@adamfisk

Looks great -- would love to take it for a spin if you release the source.

@purplefox

Jon,

how does your "new framework" differ from http://purplefox.github.com/vert.x/ ?

@jbrisbin
Owner