Wednesday, February 29, 2012

Managing callback "spaghetti" in node.js

Although node.js' evented I/O model works really well for writing TCP/HTTP servers, which are inherently event-driven domains, sometimes you just want to execute a number of steps in sequence, wait for them all to complete, and then take some action.

Let's take the example of adding some sample data to a test mongo database. This is a pattern you see a lot. The test framework, be it mocha, vows, nodeunit or whatever, doesn't matter. There is a before hook, and you want to populate the database with a nice set of initial data for your test.

Now normally, you would end up with what I call callback "spaghetti": a bunch of async calls, each taking the next step as a callback, until finally you end up with a tangled, difficult-to-follow piece of code. And it's always the same pattern over and over again. To give an example, here is some demo code using mongojs to populate a mongo db for testing:
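A minimal sketch of the shape such code tends to take. To keep it runnable without a database, `fakeCollection` is a hypothetical in-memory stand-in that imitates only a `save(doc, callback)` method, and the documents are made up for illustration.

```javascript
'use strict';

// Hypothetical in-memory stand-in for a mongojs collection, so the sketch
// runs without a database. Only save(doc, callback) is imitated.
function fakeCollection() {
  var docs = [];
  return {
    docs: docs,
    save: function (doc, callback) {
      // Defer like real I/O would, then report success node-style.
      setImmediate(function () {
        docs.push(doc);
        callback(null, doc);
      });
    }
  };
}

var users = fakeCollection();

// Each insert can only start inside the previous insert's callback,
// so every extra document adds another level of nesting.
function populate(done) {
  users.save({ name: 'alice' }, function (err) {
    if (err) { return done(err); }
    users.save({ name: 'bob' }, function (err) {
      if (err) { return done(err); }
      users.save({ name: 'carol' }, function (err) {
        if (err) { return done(err); }
        done(null);
      });
    });
  });
}

populate(function (err) {
  if (err) { throw err; }
  console.log('inserted ' + users.docs.length + ' documents');
});
```

Three inserts already mean three levels of indentation, and a real fixture set usually has a lot more than three.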

Now as you can clearly see, this very quickly gets unwieldy and annoying to work with. Luckily, there is a pattern to get around this. It is mentioned in the book Secrets of the JavaScript Ninja by John Resig (thanks Dawid for the tip). On a basic level, it has the following form:
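What follows is a sketch of the general idea rather than the book's own listing. The step bodies just simulate async work with `setTimeout`, and the `results` array is only there to show the order things ran in.

```javascript
'use strict';

var results = [];

// Shared callback: take the next step off the front of the queue and run it.
function callback() {
  var next = steps.shift();
  if (next) {
    next();
  } else {
    console.log('all steps completed: ' + results.join(', '));
  }
}

// Each step does its (simulated) async work, then calls the shared callback.
var steps = [
  function () {
    setTimeout(function () { results.push('first'); callback(); }, 10);
  },
  function () {
    setTimeout(function () { results.push('second'); callback(); }, 10);
  },
  function () {
    setTimeout(function () { results.push('third'); callback(); }, 10);
  }
];

callback(); // kick off the sequence
```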

So basically, you have an array of functions. These functions each do their work, and then call the shared "callback" function. The shared callback function simply removes the first function from the array and executes it. Next time, it again removes the first item and executes it, and so on until the array is empty. At this point your sequence of steps is completed.

This is a very nice pattern, and something that I know I've come up with in some form or another a couple of times already. But I always love it when someone actually comes along and "solidifies" the pattern. I don't know what the pattern is called, I just know that I like it.

So, I thought, what about trying to make it even more generic? What if you could have a function that takes a "done" function and a list of "step" functions, where each step takes a callback, does its async work, and then calls the callback that was passed in to it? So, I came up with this:
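The following is a reconstruction sketched from the description in this post, not a verbatim listing. One detail is an assumption on my part: whatever the last step passes to the callback is forwarded to the done function.

```javascript
'use strict';

// Runs the step functions in order. Each step receives the shared callback
// as its first argument, followed by whatever the previous step passed on.
function waitForAll(done /*, step1, step2, ... */) {
  var steps = Array.prototype.slice.call(arguments, 1);

  // Shared callback: take the next prepared step off the front of the queue
  // and run it with the arguments the previous step handed over.
  function callback() {
    var next = actions.shift();
    if (next) {
      next.apply(null, arguments);
    } else {
      // Assumption: the last step's arguments are forwarded to done.
      done.apply(null, arguments);
    }
  }

  // Partially apply the shared callback as the first argument of every step.
  var actions = steps.map(function (step) {
    return step.bind(null, callback);
  });

  callback(); // start the sequence
}

module.exports = waitForAll;
```

Using `bind` for the partial application keeps the shared callback itself tiny: it only has to shift the next action and apply the incoming arguments to it.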

This node module exports a single function, waitForAll. This function takes a variable number of arguments (as is natural with any JS function). The first argument is the "done" function, which gets called upon completion of all the async steps. All the other arguments are treated as async step functions. Inside the function, the list of actions and the in-between callback function get created. During preparation, the in-between callback is partially applied as the first argument of every step function, so each step already has its first argument bound to the internally created callback. All that is left is for the shared callback to capture the arguments it was called with and apply them to the next partially applied step. This allows us to pass data from one async step to the next. Consider the following test:

As you can see, the steps are now simply arguments passed to the waitForAll function, with the first argument being the done function. Each step gets the callback, plus all parameters passed in from the previous step. It is the single duty of each step to ensure that the callback() function gets called when its work is done.

This means that the first example becomes much simpler to comprehend and manage:
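Again a sketch under the same assumptions: `fakeCollection` is a hypothetical in-memory stand-in for a mongojs collection, `waitForAll` is the assumed reconstruction inlined for completeness, and error handling is left out to keep the shape visible.

```javascript
'use strict';

// Assumed reconstruction of waitForAll, inlined so the sketch is standalone.
function waitForAll(done) {
  var steps = Array.prototype.slice.call(arguments, 1);
  function callback() {
    var next = actions.shift();
    (next || done).apply(null, arguments);
  }
  var actions = steps.map(function (step) {
    return step.bind(null, callback);
  });
  callback();
}

// Hypothetical in-memory stand-in for a mongojs collection.
function fakeCollection() {
  var docs = [];
  return {
    docs: docs,
    save: function (doc, callback) {
      setImmediate(function () {
        docs.push(doc);
        callback(null, doc);
      });
    }
  };
}

var users = fakeCollection();

// One flat list of steps instead of one nesting level per insert.
// Note: the (err, doc) results are passed along but ignored in this sketch.
waitForAll(
  function () {
    console.log('inserted ' + users.docs.length + ' documents');
  },
  function (callback) { users.save({ name: 'alice' }, callback); },
  function (callback) { users.save({ name: 'bob' }, callback); },
  function (callback) { users.save({ name: 'carol' }, callback); }
);
```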

A big win for readability in my opinion.
