Non-blocking IO demystified
This article is about the inner workings of non-blocking servers, that is, servers that don't block a thread per connected client.
While some use the terms "asynchronous" and "non-blocking" simply as synonyms for "fast", there seems to be little understanding of what they actually mean.
The basic component of non-blocking code is an event-like interface:
    onRequest: function(request) {
      response = … // generate response
      respond(response)
    }
The interesting part is what the underlying framework does to provide this interface. The answer is a message loop running under the covers:
    while(true) {
      connection = poll_connection_from_OS()
      request = read_request_from(connection)
      onRequest(request) // call user code
    }
There are two observations to make at this point:
- the server runs entirely on a single thread
- if your onRequest code does something fast and purely CPU-bound, this approach is efficient
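To make the loop concrete, here is a minimal runnable sketch in JavaScript. It is only an illustration: an in-memory array stands in for the operating system, the loop stops when that array is empty instead of running forever, and the names runMessageLoop and upperCaseHandler are hypothetical, not part of any real framework.

```javascript
// Toy message loop: an in-memory array stands in for the connections
// that a real server would obtain from the operating system.
function runMessageLoop(incomingConnections, onRequest) {
  const responses = [];
  // A real loop would be while(true); this one stops when the queue is empty.
  while (incomingConnections.length > 0) {
    // Stands in for poll_connection_from_OS()
    const connection = incomingConnections.shift();
    // Stands in for read_request_from(connection)
    const request = { id: connection.id, body: connection.body };
    // Call user code; it hands the response back through respond().
    onRequest(request, (response) => {
      responses.push({ id: request.id, response });
    });
  }
  return responses;
}

// A fast, CPU-bound handler: the case where a single thread is enough.
function upperCaseHandler(request, respond) {
  respond(request.body.toUpperCase());
}

const served = runMessageLoop(
  [{ id: 1, body: "hello" }, { id: 2, body: "world" }],
  upperCaseHandler
);
console.log(served);
```

Everything here happens on one thread, and each request is fully handled before the next one is picked up.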
Now comes a "surprise". Of course, what you would usually do in your onRequest code is something like this:
    onRequest: function(request) {
      obj = db.read_object(request.params("id"))
      respond(json(obj))
    }
The problem is that db.read_object is a blocking call. There is absolutely no way the thread can continue until it finishes, because the method must return the database object. And remember, we are still on the single thread running the message loop.
Therefore, if a thousand clients come at the same time, they will be served one by one, the last one waiting for the 999 database calls ahead of it to complete. In other words, the throughput of our server is terribly low: it can never exceed one request per database round trip.
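A back-of-the-envelope simulation shows the effect. The 10 ms database latency is an assumed figure chosen only for illustration, and simulateBlockingLoop is a hypothetical helper that advances a virtual clock instead of really waiting.

```javascript
// Simulate a single-threaded loop whose handler blocks on the database.
// Time is virtual: nothing actually sleeps.
function simulateBlockingLoop(clientCount, dbLatencyMs) {
  let clock = 0;
  const completionTimes = [];
  for (let client = 0; client < clientCount; client++) {
    // The only thread is stuck until the database call returns.
    clock += dbLatencyMs;
    completionTimes.push(clock);
  }
  return completionTimes;
}

const DB_LATENCY_MS = 10; // assumed, for illustration only
const completions = simulateBlockingLoop(1000, DB_LATENCY_MS);
const lastClientWaitMs = completions[completions.length - 1];
const throughputPerSecond = 1000 / DB_LATENCY_MS;

console.log(`last client answered after ${lastClientWaitMs} ms`);
console.log(`throughput: ${throughputPerSecond} requests per second`);
```

With these assumed numbers the last client waits 10 seconds, and the server tops out at 100 requests per second no matter how fast the CPU is.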
So what's the solution to this problem? Well, here it is:
    onRequest: function(request) {
      db.read_object(request.params("id"), function(obj) {
        respond(json(obj))
      })
      // returns immediately!
    }
The whole difference is that the db library is itself non-blocking. db.read_object stores the passed callback function in some data structure and returns immediately, so our (single!) main thread can happily continue accepting requests. The db object then runs its own message loop internally (on its own thread, so our server has two threads now). In that internal loop, the db object polls for responses from the external database and calls back our function when one arrives.
Now, if a thousand clients come at the same time, a thousand requests to the database will be fired almost instantly and remembered by the db object, and a response will be sent back to each individual client as soon as the database returns the object that client requested.
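The bookkeeping inside such a library can be sketched as follows. This is a simplified model, not the API of any real database client: NonBlockingDb, readObject and handleDatabaseResponse are hypothetical names, and the library's second thread is replaced by explicit calls to handleDatabaseResponse so that the example stays deterministic.

```javascript
// Sketch of a non-blocking client: readObject only records the callback.
class NonBlockingDb {
  constructor() {
    this.pending = new Map(); // token -> callback awaiting a response
    this.nextToken = 0;
    this.sentQueries = []; // stands in for queries written to the database
  }

  // Returns immediately; the result arrives later through the callback.
  readObject(id, callback) {
    const token = this.nextToken++;
    this.pending.set(token, callback);
    this.sentQueries.push({ token, id });
    return token;
  }

  // One iteration of the client's internal loop: a response has arrived
  // from the external database, so look up and invoke the stored callback.
  handleDatabaseResponse(token, obj) {
    const callback = this.pending.get(token);
    if (callback === undefined) return; // unknown or already answered
    this.pending.delete(token);
    callback(obj);
  }
}

const db = new NonBlockingDb();
const responded = [];

function onRequest(request) {
  db.readObject(request.id, (obj) => responded.push(JSON.stringify(obj)));
  // returns immediately
}

// A thousand clients arrive at once: all queries are fired, none answered yet.
for (let i = 0; i < 1000; i++) onRequest({ id: i });
const pendingAfterBurst = db.pending.size;
const respondedAfterBurst = responded.length;

// Later the database answers, here in reverse order; each client gets its reply.
for (const { token, id } of [...db.sentQueries].reverse()) {
  db.handleDatabaseResponse(token, { id, name: `object-${id}` });
}
console.log(pendingAfterBurst, respondedAfterBurst, responded.length);
```

Right after the burst there are 1000 pending callbacks and no responses yet. Once the simulated database has answered, all 1000 clients have been served and the pending map is empty again.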
Now this is the awesome non-blocking IO that everyone is talking about. The server really is handling a thousand clients "in parallel" using only two threads.
Of course, there will still be one thousand open sockets, but we managed to handle them all using only two threads.
Appendix - what to do when you only have a blocking client library:
If the only API your client library gives you is 'obj = db.read_object(id)', you will essentially need to do the following:
    onRequest: function(request) {
      threadPool.queue(function() {
        obj = db.read_object(request.params("id"))
        respond(json(obj))
      })
    }
This way you free the IO thread (the one running the message loop) to accept more incoming requests, but your server is now blocking, since it uses one thread per client (each of those threads simply sits idle, waiting for the response from the database). If many requests come at once, the thread pool will start new threads. Once the maximum number of threads is reached, further calls are just queued until a thread completes and becomes available, which seriously limits throughput.
The takeaway from this article is: in order for your request handling code to be non-blocking, it has to be composed entirely of non-blocking API calls. A "non-blocking" server toolkit by itself does not guarantee high concurrency/throughput.
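The queueing effect can be estimated with another virtual-clock simulation. The pool size of 50 and the 10 ms latency are assumptions made up for this example, and simulateThreadPool is a hypothetical helper that models the pool rather than starting real threads.

```javascript
// Simulate a bounded pool of blocking workers using a virtual clock.
// All clients are assumed to arrive at time 0.
function simulateThreadPool(clientCount, maxThreads, dbLatencyMs) {
  // workerFreeAt[w] is the time at which thread w is next idle.
  const workerFreeAt = new Array(maxThreads).fill(0);
  const completionTimes = [];
  for (let client = 0; client < clientCount; client++) {
    // Pick the thread that becomes available first; queued calls wait for it.
    let earliest = 0;
    for (let w = 1; w < maxThreads; w++) {
      if (workerFreeAt[w] < workerFreeAt[earliest]) earliest = w;
    }
    // That thread then blocks for the full duration of the database call.
    const finishedAt = workerFreeAt[earliest] + dbLatencyMs;
    workerFreeAt[earliest] = finishedAt;
    completionTimes.push(finishedAt);
  }
  return completionTimes;
}

const POOL_SIZE = 50; // assumed maximum number of threads
const LATENCY_MS = 10; // assumed database latency
const pooledTimes = simulateThreadPool(1000, POOL_SIZE, LATENCY_MS);
const lastPooledMs = pooledTimes[pooledTimes.length - 1];

console.log(`last client answered after ${lastPooledMs} ms`);
```

With these assumed numbers the last of 1000 simultaneous clients is answered after 200 ms, and every one of the 50 threads spends that whole time idle inside a database call. With a pool of one thread the model degenerates to the fully blocking loop.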