- Asynchronous code is how you write low-resource, high-concurrency servers. See http://www.kegel.com/c10k.html.
- Node embracing async from the get-go means that servers are low-resource, high-concurrency by default.
- Async + JavaScript = Perfect fit for an event loop.
- When it comes to threads vs. an event loop, there are times when each has the advantage.
- But there are things very hard or impossible to do with threads.
- WebSockets are difficult to do properly with threads. That's one example where non-blocking IO (async) has a major advantage.
- RAM usage is another factor, especially when we're talking about the physical hardware required to run your application, which translates to real dollars.
- It depends on your application in the end:
- If you're mostly waiting on IO the entire time, an event loop shines.
- If you're doing CPU-intensive tasks like calculating prime numbers, then you want threads.
- Evented programming solves a different problem than progressive programming. It is not inherently better, just different.
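The I/O-bound case from the list above can be sketched roughly as follows. This is only an illustration: `fakeIo` is a made-up stand-in that uses a timer in place of a real database query or socket read, and the counts and delays are arbitrary.

```javascript
// Hypothetical stand-in for an I/O wait: resolves with `name` after `ms` milliseconds.
function fakeIo(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(name), ms));
}

async function main() {
  const started = Date.now();
  // Start 1000 waits at once; no thread is parked on any of them.
  const waits = Array.from({ length: 1000 }, (_, i) => fakeIo(`req-${i}`, 100));
  const results = await Promise.all(waits);
  console.log(`${results.length} waits finished in about ${Date.now() - started} ms`);
  return results.length;
}

main();
```

All of the waits overlap on a single thread, so the total time should be close to one wait rather than a thousand of them. A CPU-bound loop in place of `fakeIo` would not overlap this way, which is the point of the prime-number bullet.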
Maybe you could have two threads per client. One constantly reading, one constantly writing. I tried a long time ago in Java before using the nio classes, but maybe that was just my inexperience with Java and threaded socket programming. But doesn't the non-deterministic behavior of WebSockets essentially make it so?
For example, this, quite literally, I could not see how it could work, when interacting with a WebSocket server that's sending/reading messages non-deterministically.
Another thing to consider is how asynchronous programming affects the thinking of the programmer and the resulting design itself. Thinking in terms of functional, asynchronous programming means that the programmer will, by necessity, be thinking about things like decoupling components, the nature of race conditions, etc.; I'm sure there are other paradigm-related things I could come up with upon further thought. Many front-end developers already think that way because of years of experience with JavaScript and jQuery or other event-heavy frameworks; Node brings that paradigm shift to the server side, which leads naturally to thought processes that are advantageous in solving modern engineering problems.
Are you asking why Node.js is async (because Ryan Dahl wanted it to be), or why it's good that it's async?
One of the reasons I think async is good, is it allows an app to have a common/global scope (DB connection pools, in-memory content/stats/etc) and multiple workers/servers sharing the same space. (So you can have the same app run very long/heavy IO tasks while simultaneously serving data without interruption.) I'm sure that's possible with some Java applications, but compared to a PHP environment - where every line of code and every bit of needed data has to be loaded fresh into memory on every request - it's a wonderful improvement. PHP frameworks (like Drupal) try to get around that with cron scripts, but that's a very clunky solution and one worker can't easily talk to another.
In the introduction video on the nodejs.org start page, Ryan Dahl said something like:
Node is really good at idling.
Well, with async programming you can actually get work done during that idle time.
In other frameworks, and in languages other than JavaScript, this is very hard or impossible.
I'd try to explicitly define the class of problems we're trying to solve here. We want to write applications which:
- can serve many requests at the same time,
- need to respond to many small requests from the same client over a long period of time,
- need to do lots of I/O,
- and require very little time on the CPU.
Evented servers have been shown to be an excellent way to write high-concurrency, low-resource servers. We can, in comparison to a threaded approach:
- use less memory,
- waste fewer CPU cycles during context switches,
- and write simpler programs.
The first and last of the above list are arguably the most important. Memory is a much scarcer resource in comparison to CPU time. When programs need to ensure that only one thread can operate on a given resource at a time, and when they operate non-deterministically, it makes them harder to write, read, and understand.
There are other ways to do concurrency. This way is easy, efficient, and fun.
It's actually all about code complexity, and explicit continuations are one of the simplest concepts in programming. Node follows CPS (continuation-passing style), probably without even knowing about it.
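To make the CPS point concrete, here is a minimal sketch. The function names are invented for illustration; the only idea being shown is that a CPS function hands its result to an explicit continuation instead of returning it.

```javascript
// Direct style: the result comes back through the return value.
function addDirect(a, b) {
  return a + b;
}

// CPS: the result is handed to an explicit continuation `k`.
function addCps(a, b, k) {
  k(a + b);
}

function squareCps(x, k) {
  k(x * x);
}

// (1 + 2) squared, written as a chain of continuations.
addCps(1, 2, (sum) => {
  squareCps(sum, (result) => {
    console.log(result); // 9
  });
});
```

Node's callbacks have the same shape; the difference is that the continuation is usually invoked later, once the I/O has completed.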
As for the other comments, it's not that hard to do the same in other languages, and sometimes it's even easier, as in Perl (FYI, the libev author is a Perl programmer).
@TooTallNate Taking your example of the web-sockets code--I'm really not sure that you'd use that in a situation where you needed to handle messaging patterns that aren't strictly request-response. For example, you can just have your client thread call select(2) to listen for events on the socket or notifications from other threads in your process (although it's relatively common to use a pipe(2) for that).
@russfrank I'd say that Simplicity is a function of the abstractions you have, and how well they're used. You could well make the argument that introducing a callback for each I/O operation can potentially make your call graph more complex if you're not careful.
@zzzcpan True--One thing I think is something of a shame in Node.js land is the lack of abstractions to make the CPS transformation easier. For example, CPS maps very nicely onto the Deferred Monad, and a lot of Scala libraries (eg: http://akka.io/ or http://dispatch.databinder.net/Dispatch.html ) make use of this pattern extensively. But that said, I missed the part (early in its history, I understand) when Node.js switched from using Promises by default to callbacks.
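As a rough sketch of how a callback API maps onto a promise or deferred value: `doubleLater` and `toPromise` below are hypothetical helpers, and the only assumption is Node's usual error-first callback convention.

```javascript
// Hypothetical callback-style operation using Node's error-first convention.
function doubleLater(n, callback) {
  setImmediate(() => {
    if (typeof n !== 'number') {
      callback(new TypeError('expected a number'));
      return;
    }
    callback(null, n * 2);
  });
}

// Wrap an error-first callback function so that it returns a Promise instead.
function toPromise(fn) {
  return (...args) => new Promise((resolve, reject) => {
    fn(...args, (err, value) => (err ? reject(err) : resolve(value)));
  });
}

const doubleAsync = toPromise(doubleLater);

// Chaining replaces nesting: each .then receives the previous step's result.
doubleAsync(5)
  .then((ten) => doubleAsync(ten))
  .then((twenty) => console.log(twenty)); // 20
```

The wrapper is the whole transformation: the continuation that would have been passed as the last argument becomes the promise's `resolve`/`reject` pair.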
Personally, I find the questions around why Node.js has become so popular, and why now more interesting.
@cstorey It absolutely makes it more complex than straight synchronous code. I meant to compare the relative complexity of locking in a threaded system and continuation passing in an evented system.
I'd also like to note that it wasn't my assumption that the concepts in my previous comment were novel to anyone in the discussion; I was just suggesting an alternative way to word the English in the 'why async' argument. That is, by explicitly stating: here is what we are trying to solve; then, here is why our solution works.
@russfrank you are forgetting about spawning new threads, joining them, dealing with lock contention, because locking doesn't really work, the whole bunch of synchronization strategies to replace locking, cache lines, thread local memory, affinity and so on.
Multithreaded programming is too complex for most people and should be used in one case only - to implement actor model :)
@zzzcpan absolutely, I was grossly oversimplifying (for brevity's sake).
@russfrank That is not a binary choice. For example, you can use a cooperative threading model (by putting everything under a single lock), which has the big advantage that you don't need to think about concurrency within your code fragments, and has the same disadvantage as node.js' model: you only use one core. We've been there (in a proprietary setting): we used the same asynchronous, callback-driven programming model to do I/O multiplexing, and at some point doing everything in callbacks pointing to the next one becomes just too tedious for some things. (That was in C; compared to that, node.js is much simpler to handle because of garbage collection and true lexical scoping.)
At some time we added the capability to start extra threads (cooperative ones) to be able to do a loop as, well, a loop instead of a data structure with a counter and proper next callback selection. The thread's stack serves as implicit state; and this turned out to be quite helpful.
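The difference between "a loop as a data structure with a counter" and "a loop as a loop" can be sketched in JavaScript. This is not the C code described above; `step` is a made-up async operation, and `async`/`await` is used here only as a rough analogue of a cooperative thread whose stack holds the loop state.

```javascript
// Hypothetical async step: reports i * i through a callback on a later tick.
function step(i, cb) {
  setImmediate(() => cb(i * i));
}

// Callback style: the loop becomes explicit state (a counter and an accumulator)
// plus a function that decides which callback runs next.
function sumSquaresCallbacks(n, done) {
  let i = 0;
  let total = 0;
  function next() {
    if (i === n) {
      done(total);
      return;
    }
    step(i, (value) => {
      total += value;
      i += 1;
      next();
    });
  }
  next();
}

// async/await: the loop is a loop again. The suspended function keeps
// `i` and `total` for us, much as a thread's stack would.
async function sumSquaresAwait(n) {
  let total = 0;
  for (let i = 0; i < n; i += 1) {
    total += await new Promise((resolve) => step(i, resolve));
  }
  return total;
}

sumSquaresCallbacks(4, (total) => console.log(total)); // 14
sumSquaresAwait(4).then((total) => console.log(total)); // 14
```

Both versions still run on one core and yield only at well-defined points, which is the cooperative property described above.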
Only we did it back when single-core was still the norm; node.js is a bit across the grain for current architectures in that regard.
But as we now know, loops are bad, and higher-order functions are good, so there should be a way to emulate such looping with, like, map(), and when we assume that the function argument to map needs to be not a simple function, but one that gets the input value and a result callback, there should be some async_map(fct,list,rescb) to be used like async_map (function (v, cb) { cb (2 * v); }, [1, 2, 3, 4], result_receiver).
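One possible shape for such an `async_map`, as a minimal sketch. It assumes the signature given above, with `fct(value, cb)` delivering its result by calling `cb(result)` exactly once, and it has no error handling; both are simplifications.

```javascript
// Hypothetical helper matching the signature sketched above.
// Calls fct on every element, collects results in input order,
// and calls rescb once with the full array.
function async_map(fct, list, rescb) {
  const results = new Array(list.length);
  let pending = list.length;
  if (pending === 0) {
    rescb(results);
    return;
  }
  list.forEach((value, index) => {
    fct(value, (result) => {
      results[index] = result; // keep input order even if callbacks finish out of order
      pending -= 1;
      if (pending === 0) rescb(results);
    });
  });
}

function result_receiver(results) {
  console.log(results); // [ 2, 4, 6, 8 ]
}

async_map(function (v, cb) { cb(2 * v); }, [1, 2, 3, 4], result_receiver);
```

Because results are stored by index, the same helper works when `fct` calls back later (for example from a timer or an I/O completion) and the callbacks arrive in a different order.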
Why are websockets impossible with threads?