Every app has to scale at one time or another.
Traditional techniques such as spawning a thread per response, creating a socket per incoming connection, or using a thread-pool pattern make it difficult to scale (please note that this is not an absolute statement, as scalability largely depends on the type of problem your app is solving).
The problem with having many threads is that it can lead to performance problems due to the overhead of context switching and complex concurrency schemes. Most of the server's time is spent context-switching between requests, while the threads handling event listeners do not read or write data frequently.
To reduce the overhead of context switching, the concept of non-blocking I/O is put into practice.
The reactor pattern is the asynchronous, non-blocking model that Node uses for I/O. This is one of the earliest papers where the reactor pattern is discussed in detail.
Imagine writing a web service. Typical web server tasks include:
- Read request
- Decode request
- Process service
- Encode reply
- Send reply
Each task differs in nature and cost. Most tasks involve I/O, whether reading from or writing to a database, a disk/filesystem, or a computational service; they all tend to be blocking operations. This means the processor can spend most of its time idle, waiting for I/O operations to complete. What if we could delegate an operation to a handler, continue with other tasks, and get back to it when it finishes? Event-driven I/O uses this idea via the reactor pattern technique, though many systems differ in design.
There are two important actors in the architecture of reactor pattern:
Fig 1.0 : Basic skeleton of a Reactor pattern
The reason I call the components Actors is that they don't have shared state; communication is done by message passing/notifications (though this depends on the implementation).
Handler : Performs non-blocking actions
Reactor : Responds to I/O events by dispatching the appropriate handler.
A basic dispatch() in the reactor implementation would be a single-threaded event loop that dispatches events on handles (e.g. sockets, file descriptors) to event handlers:
```
select(handlers);
foreach h in handlers loop
    h.handle_event(type)
end loop
```
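That dispatch loop can be sketched as a toy, synchronous reactor. The names here (Reactor, register, dispatch, handleEvent) are my own, and the readiness set is supplied by the caller to stand in for select()/epoll:

```javascript
// A toy reactor: handles are mapped to handlers, and a dispatch loop
// calls the handler of each handle reported as "ready".
class Reactor {
  constructor() {
    this.handlers = new Map(); // handle -> event handler
  }
  register(handle, handler) {
    this.handlers.set(handle, handler);
  }
  // Stand-in for select()/epoll: the caller supplies the ready set.
  dispatch(readyHandles) {
    for (const handle of readyHandles) {
      const handler = this.handlers.get(handle);
      if (handler) handler.handleEvent(handle); // h.handle_event(type)
    }
  }
}

// Usage: register two handlers and dispatch a batch of ready events.
const log = [];
const reactor = new Reactor();
reactor.register('socket-1', { handleEvent: (h) => log.push(`read ${h}`) });
reactor.register('socket-2', { handleEvent: (h) => log.push(`write ${h}`) });
reactor.dispatch(['socket-1', 'socket-2']);
console.log(log);
```

The key property is that dispatching is generic: the reactor knows nothing about what any handler does, it only routes ready handles to the handlers registered for them.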
So when requests arrive at the server, they are serviced one at a time and dispatched to their handlers as fast as possible. When the code does some I/O, it gets the asynchronous treatment of being "called back" when it finishes; until then, the loop services another request. This avoids holding memory and processing hostage, and keeps your CPU utilization high all the time.
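Here is a small sketch of that "getting back" behaviour, using setImmediate to simulate a slow I/O completion (the request names are hypothetical):

```javascript
// Request A starts a (simulated) slow I/O; request B is serviced in the
// meantime; A's reply is sent only once its I/O callback fires.
const order = [];

function handleRequestA() {
  order.push('A: start I/O');
  // The callback is deferred to a later event-loop turn.
  setImmediate(() => order.push('A: I/O done, send reply'));
}

function handleRequestB() {
  order.push('B: serviced');
}

handleRequestA();
handleRequestB(); // B does not wait for A's I/O

setImmediate(() => console.log(order));
```

B is serviced before A's reply goes out, even though A arrived first: the event loop moves on while A's I/O is pending instead of blocking on it.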
- Separation of concerns - Components are modular; event handlers are separated from the low-level dispatching mechanism, and handlers can be composed easily because application-independent mechanisms are decoupled from application-specific policies. In other words, handler objects need not be aware of how events are dispatched.
- No thread context switching
- Concurrency is simplified because everything runs in a single thread, so you never risk multiple threads accessing the same mutable state.
- Non-pre-emptive model - Handlers cannot take a long time; a long-running handler stalls the entire event loop.
- Difficult to get started with, and later hard to debug: it is not always clear why a particular handler was invoked, and it may be difficult to reproduce the computation preceding a fault.
- One faulty event handler and your entire server is down.
This pattern is the foundation of the event-processing structure implemented in Node.js, Ruby's EventMachine, JBoss Netty, Apache MINA, and Python's Twisted event-driven I/O libraries.