@courtneycouch
Created May 10, 2012 13:28
vertx server test
// Node.js server: reads foo.html once, then serves it from memory.
var http = require('http');
var fs = require('fs');
var util = require('util');
var fileCache;
var sendFile = function(conn, file) {
  conn.writeHead(200, {"Content-Type": "text/html", "Content-Length": file.length});
  conn.write(file);
  conn.end();
}
http.createServer(function (req, res) {
  if (fileCache == undefined) {
    fs.readFile("foo.html", function(err, file) {
      fileCache = file;
      sendFile(res, fileCache);
    });
  } else {
    sendFile(res, fileCache);
  }
}).listen(8080, 'localhost');
load('vertx.js')
var fileCache;
var sendFile = function(req, file) {
  req.response.headers["Content-Length"] = file.length()
  req.response.headers["Content-Type"] = "text/html"
  req.response.end(file)
}
vertx.createHttpServer().requestHandler(function(req) {
  if (fileCache == undefined) {
    vertx.fileSystem.readFile("httpperf/foo.html", function(err, file) {
      fileCache = file;
      sendFile(req, fileCache);
    });
  } else {
    sendFile(req, fileCache);
  }
}).listen(8080, 'localhost');
@courtneycouch
Author

Using these to test vert.x I get the following:

vert.x:

39890 Rate: count/sec: 3289.4736842105262 Average rate: 2958.1348708949613
42901 Rate: count/sec: 2656.924609764198 Average rate: 2936.994475653248
45952 Rate: count/sec: 3277.613897082924 Average rate: 2959.610027855153

node.js:

38439 Rate: count/sec: 4603.748766853009 Average rate: 4474.62212856734
41469 Rate: count/sec: 4620.4620462046205 Average rate: 4485.278159589091
44469 Rate: count/sec: 4666.666666666667 Average rate: 4497.515122894601

Using m1.small instance on EC2 and both nodejs and vert.x using a single core.
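
For scale, here is a quick check of the gap between the two runs. It uses only the last "Average rate" value posted above for each server and says nothing about other workloads or hardware.

```javascript
// Last "Average rate" value from each run above (requests per second).
const vertxAvg = 2959.610027855153;
const nodeAvg = 4497.515122894601;

// Ratio of node.js throughput to vert.x throughput in this one test.
const ratio = nodeAvg / vertxAvg;
console.log('node.js / vert.x = ' + ratio.toFixed(2));
```

That works out to roughly a 1.5x difference on this single-core run.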

@courtneycouch
Author

Wanted to also point out that the above changes to Vert.x don't change the performance at all. It's clearly caching the file as well under the hood.

@grnadav

grnadav commented May 10, 2012

Thank god for the sanity. I saw there was a gotcha somewhere in their (biased) benchmark

@fcsonline

Good reply!

@purplefox

Neither Vert.x nor the JVM caches the file. This is BS.

@purplefox

  1. The benchmark is invalid since code has been added to cache the file explicitly. This breaks the benchmark: in a real web server you would never do this. If files get updated in the file system while the server is running, then the updated file should be returned. Remove the explicit caching code and Vert.x greatly outperforms Node.js. This tells you that Node does not take account of OS level caching.
  2. Running a benchmark on a single core cloud VM is plain stupid. No-one who cares about performance will ever do this, they will run with multiple cores. If you do this, then Vert.x outperforms Node.js even further.
  3. This is running Vert.x JS server. If you run the Vert.x Java server, performance will be even higher still.
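
For reference, the uncached variant described in point 1 could look something like this on the Node.js side. This is a minimal sketch, not the code from the original benchmark; `makeUncachedHandler` is a hypothetical helper name, and the error branch is an addition for illustration.

```javascript
const http = require('http');
const fs = require('fs');

// Build a request handler that re-reads the file on every request,
// so edits on disk are visible to the next client (no explicit cache).
function makeUncachedHandler(filePath) {
  return function (req, res) {
    fs.readFile(filePath, function (err, file) {
      if (err) {
        res.writeHead(500, {'Content-Type': 'text/plain'});
        res.end('read error');
        return;
      }
      res.writeHead(200, {'Content-Type': 'text/html', 'Content-Length': file.length});
      res.end(file);
    });
  };
}

// Usage (not started here):
// http.createServer(makeUncachedHandler('foo.html')).listen(8080, 'localhost');
```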

@courtneycouch
Author

  1. The benchmark is valid because both vert.x and node.js are doing the same workload. If both are doing the same workload then it's totally valid. I could have them both calculating prime numbers, both resizing images, or both doing anything. They are both reading a file, explicitly caching it then serving it. Comparing performance for this operation is completely valid. Just because you don't like the results doesn't make it invalid. Sounds like sour grapes.
  2. I ran this with multiple cores here: https://gist.github.com/2657432 and nodejs outperformed vert.x still. So the multiple core argument is invalid. That's even using the cluster functionality which isn't yet ready for prime time. I suspect if I used haproxy to load balance multiple instances I'd get even better numbers with nodejs.
  3. Possibly. I don't know.

@purplefox

purplefox commented May 11, 2012 via email

@purplefox

I'm sure I could pick some other pointless piece of code that ran better in Node.js too. Well done :)

@courtneycouch
Author

Explicitly reading a file for each request in nodejs is equally pointless. Good job on explicitly designing a benchmark to give nodejs poor results.

Perhaps use something like https://github.com/bnoordhuis/node-mmap to get an apples-to-apples comparison. In any case, your responses here sound like sour grapes.

@purplefox

I wasn't aware that doing what a web server does (serves files from filesystem) is pointless, but I'll take your word for it ;)

@courtneycouch
Author

Serving the same single file over and over is pretty unrealistic. Let's say there were 10k different files and each request served one at random. I suspect node's read would load them faster (provided the files are small). Then if you did want mmap type access, there are node modules to handle that type of IO. IO isn't a one-size-fits-all solution. I suspect mmap based access through an mmap module would give you results identical to mine above.

Also, if someone was really serving the same file over and over, I suspect it would be kept in memory on the app server... or perhaps memcached... or redis... or perhaps use something that excels at static file delivery.. nginx.. or use a cdn.. In any case, I think we should just agree to disagree on what makes a relevant comparison. I don't see this getting anywhere.

Anyway, I do agree the tests above are trivial. My opinion is that they are no more so than the ones on your blog. In some situations vert.x will be faster I'm sure, and in some node.js will be faster (I suspect nodejs will be faster in most situations, but that's opinion since there really are no in-depth benchmarks comparing the two). It's a matter of figuring out which is the better solution for your problem space.

Cheers

@purplefox

I actually agree with your last paragraph :)

We shouldn't read too much into these kinds of benchmarks. Well, that's pretty much the first thing I said in the original posting (disclaimer), right? ;)

I guess one thing we can agree on is that Vert.x and Node.js are both pretty good, and certainly good enough for most users.

@tipiirai

Obviously we should rely on good benchmarks and ignore bad ones.

Good benchmarks compare the same workload, as @courtneycouch's does, and the one on your blog

http://vertxproject.wordpress.com/

should be ignored. It's essentially a comparison of direct file system access vs cached file access. It is absolutely not a comparison between vertx and node. It feels like a marketing attempt to get some attention.

This definitely does not mean that vertx is a bad product. It's a breath of fresh air for the Java community. Java is a bloated environment where simplicity is absent. Your product brings in the new spirit that exists in other languages. I suggest you focus on the features you summarize in your nice manual

http://vertx.io/manual.html

and make it stable as it seems to crash on large clusters

https://gist.github.com/2657432

But you should definitely stop giving false promises and hiding truths about the product. There is no way you can beat c/v8 in performance with java/rhino. You need to shine on the parts that make the difference, like language independence. That's really a good one.

@egeozcan

My question would be, if JVM does smart caching of the file (I'd bet it'd start serving the new version if the file changed), how can we achieve the same with node? Does Connect do this?

@itwars

itwars commented May 11, 2012

Hi all,

Could someone monitor the RAM consumption? Java eats more RAM than Node.js, so on a really big app perf will be even better with node.js.
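
On the Node.js side, one simple way to watch memory during a run is `process.memoryUsage()`. The helper names below are made up for this sketch, and it only covers the Node process; the JVM would need its own tooling to get a comparable number.

```javascript
// Resident set size of the current Node.js process, in megabytes.
function rssMegabytes() {
  return process.memoryUsage().rss / (1024 * 1024);
}

// Log memory once per second. unref() keeps this timer from holding
// the process open after the server itself has shut down.
function logMemoryEverySecond() {
  return setInterval(function () {
    console.log('rss: ' + rssMegabytes().toFixed(1) + ' MB');
  }, 1000).unref();
}
```

Calling `logMemoryEverySecond()` next to the `http.createServer(...)` call in the node server would print a reading every second while the load test runs.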

@purplefox

I doubt the JVM is caching the file. Most probably it's using memory mapped files, so it's the OS handling the caching. I'd need to check the JVM source though.

@tipiirai

There are many options for caching - even node itself provides this mechanism

http://nodejs.org/docs/latest/api/fs.html#fs_fs_watchfile_filename_options_listener

This benchmark simply shows that node.js outperforms vertx when serving a cached file via http. Nothing more.
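
A rough sketch of what a cache built on `fs.watchFile` might look like: the file is kept in memory and dropped when its modification time changes. `createWatchedCache` is a hypothetical helper, the one-second polling interval is an arbitrary choice, and a change can go unnoticed until the next poll.

```javascript
const fs = require('fs');

// Keep a file's contents in memory and invalidate the copy when the
// file's mtime changes. fs.watchFile polls stat() at the given interval.
function createWatchedCache(filePath) {
  let cached = null;
  fs.watchFile(filePath, { interval: 1000 }, function (curr, prev) {
    if (curr.mtime.getTime() !== prev.mtime.getTime()) {
      cached = null; // next get() re-reads from disk
    }
  });
  return {
    // Calls cb(err, buffer), reading from disk only on a cache miss.
    get: function (cb) {
      if (cached !== null) return cb(null, cached);
      fs.readFile(filePath, function (err, data) {
        if (err) return cb(err);
        cached = data;
        cb(null, data);
      });
    },
    // Stop polling so the process can exit.
    close: function () {
      fs.unwatchFile(filePath);
    }
  };
}
```

In the node server above, `cache.get(function (err, file) { ... })` would take the place of the `fileCache == undefined` check.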

@heapwolf

"I doubt the JVM is caching the file. Most probably..." -- spoken like a true computer scientist. lol.

ghost commented May 11, 2012

About the BS called "nobody runs on a single core":
Whenever you rent hosted machines, you rent a virtual machine. The host may have 12 CPU cores, but the VM itself just shows 8000 MHz, so the maximum performance is limited per hosting package. My virtual machine can use 8000 MHz, but I can't see which core my load is processed on. In general, most hosting providers don't show the guest OS how many CPUs are available, and this is why nodejs shines so bright.

@courtneycouch
Author

@egeozcan I've not gone through what netty is doing exactly, but I believe the key difference is that nodejs is using read() and vertx is using mmap() on the OS level. There are modules out there that provide mmap bindings (https://github.com/bnoordhuis/node-mmap for example) and, as @tipiirai said, there are lots of caching options available as well. There are also static file serving solutions with all the problems worked out (https://github.com/cloudhead/node-static).

One thing to note, however: if indeed the difference is that nodejs is using read() and vertx is using mmap() (I know for a fact nodejs uses read()), then nodejs will be more performant for one-off small file reads. So if you only need to read a file once, or rarely, and the files are not HUGE, then nodejs file reading will perform better. There is a cost to using memory-mapped OS file reads, so keep that in mind if you switch to using mmap bindings.

@courtneycouch
Author

@fibric you make a good point. Also, node still shines on multicore setups (https://gist.github.com/2657432) as well ;) It's somewhat of a myth that nodejs can't be set up to take advantage of multiple cores. The tests there were even using node cluster, which is known to be quite slow. I suspect haproxy + independent node instances would scale linearly with the cores (provided the processes are bound to individual cores).
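
For readers who haven't used it, a minimal sketch of the cluster approach follows. It is not the code from the linked gist; `startClustered` and `isPrimaryProcess` are names invented here, and the haproxy setup with independent instances is not shown.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Newer Node releases expose cluster.isPrimary; older ones only have isMaster.
function isPrimaryProcess(c) {
  return c.isPrimary !== undefined ? c.isPrimary : c.isMaster;
}

// In the primary process, fork one worker per CPU. In each worker,
// run an HTTP server; the workers share the listening port.
function startClustered(handler, port) {
  if (isPrimaryProcess(cluster)) {
    os.cpus().forEach(function () {
      cluster.fork();
    });
  } else {
    http.createServer(handler).listen(port, 'localhost');
  }
}
```

Calling `startClustered(handler, 8080)` with the same request handler as the single-process server spreads requests across the workers.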

@purplefox

Note the mmap is probably only used when readFile() is used. If you use streams I don't believe it uses memory mapping, also (as mentioned before) if you use sendFile(), vert.x will avoid copying through userspace altogether.
This is highly efficient, but you only really see the benefits for larger files (which is why I didn't use it in the benchmark).

@egeozcan

@courtneycouch I didn't expect such a detailed explanation... thank you very much!

@courtneycouch
Author

@tipiirai exactly. This is hardly a useful benchmark. Was merely meant to show that there's more to the story than the vert.x benchmarks show which has everyone saying "vert.x is 5x faster than nodejs!" which simply isn't true.

@courtneycouch
Author

The moral of the story here is: there are many different ways to handle disk IO on the OS level. You explicitly pick the bindings you want to use with nodejs via modules. You need to be aware of the pros and cons of the IO you are doing. The flaw in the original benchmarks that inspired me to do this is that the two benchmarks were using two different IO methods on the OS level and trying to compare them. It was more of a benchmark of read() vs mmap() on reading a single small file over and over again than it was a benchmark of node and vertx. I don't think @purplefox realized that. I suspect he isn't familiar enough with how node handles IO to realize he was comparing apples and oranges.

There are plenty of benchmarks comparing read() and mmap() and when it's faster to use one or the other. Reading the same small file over and over again is a bad use case for read() and you can see that in the results on the original vert.x benchmarks.

You need to pick the IO bindings based on the type of data access. Don't always trust that a language will understand what kind of IO you want to do when you just call whatever file read method it offers. Find out what bindings it's really using on the OS level to see if those fit your scenario (that is, if you are worried about IO performance).
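
One rough way to check this for your own files is to time the access pattern directly. The sketch below compares re-reading a file on every iteration with reusing an in-memory copy. It uses synchronous reads to keep the timing loop simple, which is not how a server would do it, it does not cover mmap, and `timeIt` and `compareReadStrategies` are hypothetical helper names.

```javascript
const fs = require('fs');

// Run fn `iterations` times and return the elapsed time in milliseconds.
function timeIt(fn, iterations) {
  const start = process.hrtime();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  const elapsed = process.hrtime(start); // [seconds, nanoseconds]
  return elapsed[0] * 1e3 + elapsed[1] / 1e6;
}

// Compare reading the file from disk each time with reusing a cached Buffer.
function compareReadStrategies(filePath, iterations) {
  const cached = fs.readFileSync(filePath);
  return {
    rereadMs: timeIt(function () { return fs.readFileSync(filePath).length; }, iterations),
    cachedMs: timeIt(function () { return cached.length; }, iterations)
  };
}
```

For example, `console.log(compareReadStrategies('foo.html', 10000))` prints both timings for the file used in the servers above.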

@egeozcan

btw vertx-server.js has a missing comma at line 10, before the anonymous fn parameter.

@isaacs

isaacs commented May 11, 2012

@courtneycouch You can save a syscall in the node server if you do res.end(file) rather than res.write(file);res.end()
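
Applied to the `sendFile` helper in the node server above, that suggestion would look something like this (a sketch of the change, not a re-run of the benchmark):

```javascript
// Same response as before, but the body goes out with end() directly
// instead of a separate write() followed by end().
function sendFile(res, file) {
  res.writeHead(200, {'Content-Type': 'text/html', 'Content-Length': file.length});
  res.end(file);
}
```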

@courtneycouch
Author

Yea I didn't really spend any time on this.. just a few minutes changing the IO silliness (oh and the other benchmark showing node on multiple cores). I really should have tried mmap as well but really I should be spending my time doing actual stuff instead of making benchmarks hah. I think I made my point though that vertx isn't simply 5x faster than node as is being propagated.

ghost commented May 12, 2012

I found a blog post about couchdb on my readitlater list, but the post is really about benchmarking.
http://jan.prima.de/~jan/plok/archives/175-Benchmarks-You-are-Doing-it-Wrong.html
