Tuesday, April 24, 2012

Mocha is a feature-rich JavaScript test framework running on Node and the browser, making asynchronous testing simple and fun. Mocha tests run serially, allowing for flexible and accurate reporting, while mapping uncaught exceptions to the correct test cases. --- http://visionmedia.github.com/mocha/

Okay, the modern fashion is for developers to write unit tests alongside their code.  In the Node community several unit testing frameworks have been developed with different approaches in mind.  While at Yahoo I was using YUITest because, well, YUI is Yahoo's thing and we thought it would be best to use Yahoo's thing.  While it worked fairly well on Node, it had a major issue in that it did little to help us test asynchronous code.  I haven't looked at many of the other unit test frameworks, but I recently came across Mocha, which looks pretty nice and has a clean solution for testing asynchronous code.

Installs with npm:  $ npm install -g mocha

An example of a test case (the .should assertions come from the should.js library, which the test file must load):
describe('Array', function(){
  describe('#indexOf()', function(){
    it('should return -1 when the value is not present', function(){
      [1,2,3].indexOf(5).should.equal(-1);
      [1,2,3].indexOf(0).should.equal(-1);
    })
  })
})

Or

describe('Context', function(){
  beforeEach(function(){
    this.calls = ['before'];
  })

  describe('nested', function(){
    beforeEach(function(){
      this.calls.push('before two');
    })

    it('should work', function(){
      this.calls.should.eql(['before', 'before two']);
      this.calls.push('test');
    })

    after(function(){
      this.calls.should.eql(['before', 'before two', 'test']);
      this.calls.push('after two');
    })
  })

  after(function(){
    this.calls.should.eql(['before', 'before two', 'test', 'after two']);
  })
})

What we see is that the "describe" function is a container for test cases, and the "it" function is a test case, read as "it should _____".

There are some extra functions, before(), after(), beforeEach(), and afterEach(), which are called at various stages in the test suite execution.  The functions before and after would be called at the beginning and end of executing a describe block, while beforeEach and afterEach are called before and after each test case.
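
To make the ordering concrete, here is a toy sketch of how the hooks of a single describe block interleave with its test cases.  It is not Mocha's implementation; runBlock and the shape of the block object are hypothetical names made up for the illustration.

```javascript
// Toy illustration of hook ordering for one describe block.
// NOT Mocha's implementation: runBlock and the block shape are hypothetical.
function runBlock(block) {
  const log = [];
  const call = (hooks) => (hooks || []).forEach((hook) => hook(log));

  call(block.before);            // once, before the first test case
  for (const test of block.tests) {
    call(block.beforeEach);      // before every test case
    test(log);
    call(block.afterEach);       // after every test case
  }
  call(block.after);             // once, after the last test case
  return log;
}

const order = runBlock({
  before: [(log) => log.push('before')],
  beforeEach: [(log) => log.push('beforeEach')],
  afterEach: [(log) => log.push('afterEach')],
  after: [(log) => log.push('after')],
  tests: [(log) => log.push('test 1'), (log) => log.push('test 2')],
});

console.log(order.join(' > '));
```

With two test cases, the before and after hooks each appear once in the log, while beforeEach and afterEach each appear twice.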

The examples so far use the BDD style of writing tests.  Mocha also supports other styles; see the website for more details.

Now, what about asynchronous code?

describe('User', function(){
  describe('#save()', function(){
    it('should save without error', function(done){
      var user = new User('Luna');
      user.save(function(err){
        if (err) throw err;
        done();
      });
    })
  })
})

Basically, the test case either throws an exception, or calls done().  If the test function doesn't take a function argument, as in the earlier examples, then Mocha doesn't treat it as an asynchronous test.  However if the test function does take a function argument, then the test is not finished until either an exception is thrown or the done() function is called.
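
One plausible way for a runner to tell the two kinds of test apart is to look at how many arguments the test function declares.  The sketch below is a toy, not Mocha's code, and runTest and report are hypothetical names.  It only handles exceptions thrown synchronously; an exception thrown later, inside a callback, needs the uncaught-exception mapping mentioned in the Mocha description at the top.

```javascript
// Toy runner for a single test case; hypothetical, not Mocha's code.
function runTest(fn, report) {
  let finished = false;
  const finish = (err) => {
    if (finished) return;        // ignore repeated done() calls
    finished = true;
    report(err || null);
  };

  try {
    if (fn.length > 0) {
      fn(finish);                // declares an argument: wait for done()
    } else {
      fn();                      // no argument: finished when it returns
      finish();
    }
  } catch (err) {
    finish(err);                 // a synchronous throw fails the test
  }
}

const results = [];
runTest(function () {}, (err) => results.push(['sync', err]));
runTest(function (done) { setTimeout(done, 10); }, (err) => results.push(['async', err]));

// At this point only the synchronous test has reported.
console.log(results.length);
```

The asynchronous test reports only once its timer fires and done() is called.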

This makes it relatively easy to encapsulate asynchronous code within a test case.

However there is an issue with this example.  See it?  What if the callback inside the test case is never called?  No exception is thrown and done() is never called, so the test case never completes on its own; the only way Mocha can move past it is to give up after a timeout.

There are two ways to fix this:


  1. Use the --timeout option so that if a test case does not finish within the timeout period, the test case will fail
  2. Call this.timeout(500) inside the test function if a given test needs a different timeout from the global timeout
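
The idea behind a per-test timeout can be sketched in a few lines.  This is an illustration only, not how Mocha implements it; runWithTimeout and report are hypothetical names.

```javascript
// Hypothetical helper, not part of Mocha: fail a done-style test
// if done() has not been called within `ms` milliseconds.
function runWithTimeout(fn, ms, report) {
  let settled = false;
  const settle = (err) => {
    if (settled) return;         // first outcome wins
    settled = true;
    clearTimeout(timer);
    report(err || null);
  };
  const timer = setTimeout(
    () => settle(new Error('timeout of ' + ms + 'ms exceeded')),
    ms
  );
  try {
    fn(settle);                  // the test receives settle as its done()
  } catch (err) {
    settle(err);
  }
}

// A test whose callback never fires is reported as a failure after 50 ms.
runWithTimeout(function (done) { /* done is never called */ }, 50, (err) => {
  console.log(err ? 'failed: ' + err.message : 'passed');
});
```

Whichever happens first, done() or the timer, decides the outcome, so a test that never calls back turns into an ordinary failure instead of stalling the run.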

Mocha also includes a long list of output formats, called reporters, that govern the output printed on the console.  The reporters vary from extremely terse (a line of dots) to very verbose, with color coded output, etc.  While nice for the developer, pretty output on the terminal isn't so good for gathering long-term test results data to look at test failure trends etc.  For that, Mocha supports machine-readable formats such as TAP and JSON.  It doesn't seem to support JUnit format output for some reason, but TAP and JSON should be enough for reporting.

An interesting reporter is the doc reporter that provides a nice annotated HTML page of the test collections, and the "should" phrases.




Saturday, February 25, 2012

Converting FLV to MP3 using ffmpeg and the Node.js commander package

I just read this blog post saying that while Node.js has the stereotype that it's not for heavy number crunching stuff like video conversion, you can do video conversion anyway.  (see http://www.hacksparrow.com/flv-to-mp3-converter-in-node-js.html)  With the script in that blog post, converting an FLV to MP3 is as simple as this:

$ [sudo] npm install flv2mp3 -g

Then

$ flv2mp3 -f jump.flv
$ flv2mp3 -f ~/dwhelper/jump.flv -o ~/mp3s/

Intrigued, I looked at the source and found that, as expected, the script uses the ffmpeg command line tool under the covers to do the heavy lifting.  Hence, it doesn't prove the claim that you can do heavy work like video conversion in Node itself.  But what was interesting was the implementation.

...
var program = require('commander');
...
program
  .version('0.0.2')
  .option('-f, --file <path>', 'FLV file path')
  .option('-o, --out [path]', 'Output directory')
  .parse(process.argv);
...

Commander is well worth exploring because it looks like it makes it trivial to implement command line tools.  Looking through the examples it appears straightforward to implement two sorts of command line options processing, those with "-" for options, and those where the second word on the command line is a subcommand name.
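
To show the difference between the two styles without depending on Commander itself, here is a hand-rolled sketch.  It is NOT Commander's API; parseCli and its list of value-taking options are hypothetical, and the "convert" subcommand is invented for the example.

```javascript
// Hand-rolled sketch of the two command-line styles; NOT Commander's API.
// parseCli and valueOptions are hypothetical names.
function parseCli(argv, valueOptions) {
  const result = { command: null, options: {}, args: [] };
  for (let i = 0; i < argv.length; i++) {
    const word = argv[i];
    if (word.startsWith('-')) {
      const name = word.replace(/^-+/, '');
      // Options listed in valueOptions consume the next word as their value;
      // any other option is treated as a boolean flag.
      result.options[name] = valueOptions.includes(name) ? argv[++i] : true;
    } else if (result.command === null) {
      result.command = word;     // first bare word is the subcommand
    } else {
      result.args.push(word);    // remaining bare words are arguments
    }
  }
  return result;
}

// Style 1: dash options only, as in `flv2mp3 -f jump.flv`
console.log(parseCli(['-f', 'jump.flv'], ['f', 'o']));

// Style 2: subcommand first, as in a made-up `tool convert jump.flv -o out`
console.log(parseCli(['convert', 'jump.flv', '-o', 'out'], ['f', 'o']));
```

A library like Commander takes care of this bookkeeping, plus help text and version output, which is why the flv2mp3 option handling fits in a handful of lines.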

Node.js available for Windows via ChocolateyGallery

Because I freed myself of Windows a long time ago, I am completely ignorant of the state of the art in living with Windows without going insane.  I probably just offended the target audience, so I'll stop there.

In case you don't know (as I didn't), ChocolateyGallery is "somewhat like apt-get but built for Windows".  Or, to my mind, it's like MacPorts, but built for Windows.  Anyway, it apparently has a large suite of open source packages available for easy install.  See http://chocolatey.org/ for more info on the thing.

At http://chocolatey.org/packages/nodejs.install/0.6.11 we see that Node.js installation is as simple as

C:\> cinst nodejs.install


Monday, February 20, 2012

Node.js w/o the JavaScript & V8? Node Native

When Ryan Dahl designed Node.js on top of V8 to use the JavaScript language, it gave us both advantages and disadvantages. A project by Daniel Kang is looking to make a pure C++ version of Node with no JavaScript and no V8. According to an article on The Register, the project is called Node.js Native, but I don't see how it can properly be named "Node.js" if it doesn't include JavaScript.  Shouldn't its name be "Node.c++" ??  Having visited the project page (see link below), we see the proper name of the project is "Node.native", not "Node.js Native".  Hey Register guys, tut tut tut, we expect better of you than this level of inaccuracy.

This isn't the first time Node.js has served as conceptual fodder for a similar platform written in a different language.  I noted a while ago a project that was formerly called Node.x but now has a new name that I've forgotten.  That project is implemented in Java and supports developing asynchronous software in any of the languages that run on the JVM.

I haven't looked closely at Node.native, so here's a bit of handwaving prognostication.

By using JavaScript/V8 for Node.js we gained the advantage of a high level language, and most importantly we do not have to worry so much about memory leaks.  It's still possible to write memory leaks in JavaScript, but much harder.  By contrast, C++ is also a high level language, but one that's more complex than JavaScript, and most importantly C++ programmers have a tough time dealing with memory leaks.

Daniel Kang, the founder of the Node.native project, is doing this for performance reasons.  That's another of the tradeoffs, because V8 imposes a performance overhead.  No matter how blisteringly fast you believe V8 to be, some people will always think that any interpreted or dynamically compiled platform is slower than a natively compiled language like C++.  Further, there are applications such as encoding video streams where JavaScript isn't appropriate.  Obviously this is a matter of using the right tool for the job.  The typical web application could well be fast enough when written in JavaScript and Node.js, while a video transcoding server should be written in C++.

That is - the JavaScript language offers many programming advantages but it isn't the be-all-end-all of programming languages.

Another way to crack this nut is to work on facilitating integration of native libraries.  Here's another tradeoff with Node.js, in that there's a slew of native coded libraries available for all kinds of things, but to use those libraries in Node.js requires building a wrapper library to make its commands available as functions in a Node module.  Clearly with Node.native you don't have this consideration, instead you just link the library into the process, unless the library doesn't cooperate very well with asynchronous programming.

#include <iostream>
#include <native/native.h>
using namespace native::http;

int main() {
    http server;
    if(!server.listen("0.0.0.0", 8080, [](request& req, response& res) {
        res.set_status(200);
        res.set_header("Content-Type", "text/plain");
        res.end("C++ FTW\n");
    })) return 1; // Failed to run server.

    std::cout << "Server running at http://0.0.0.0:8080/" << std::endl;
    return native::run();
}


Friday, February 17, 2012

Potential for integrating Node.js with Drupal and speed up Drupal page processing

Besides having enough experience with Node.js to write the book linked in the sidebar, I've also spent a lot of time building and configuring Drupal websites.  I've been pondering the possibilities for marrying Node with Drupal and have also seen a few projects spring up with that purpose.  However, the core issue is that Drupal page processing is not an asynchronous process like Node's request handling; instead Drupal implements the typical synchronous model of starting at the beginning and going to the end step by step.  You know, the model we're trying to get away from by adopting Node.


Four Kitchens is one of the big Drupal development shops, and Elliot Foster posted recently on their blog a sketch of an idea for using Drupal and Node together.  See http://fourkitchens.com/blog/2012/02/07/nodejs-drupal

Because the post is focused on talking to Drupal developers, it spends the first half of the writing discussing what Node is, and what asynchronous programming is.


Elliot then suggests one plausible use is for Node to be a back end gateway to some third party API service.  


The problem, as he says, is that a third party API service can be slow.  This doesn't have to be a third party service, right?  Any back end API service could be slow.  In any case, because Drupal uses a synchronous model, if the Drupal page load accesses a slow API service, that slow service will slow down rendering the page from Drupal and give a bad user experience.  


He suggests a Drupal 7 mechanism called "Drupal Queues", an object model inside Drupal that stores a queue of requests to be handled at some later time.  I don't know much about this, but clearly it's a bit of a hack needed to overlay some form of asynchronicity on Drupal.

In any case it would let a Drupal page request hand off a request for a third party API to a Node.js based gateway widget.  The Node based widget immediately replies that it got the request, and the Drupal page processing proceeds with building the page.  Some time later, when the third party API replies to the Node based gateway, the gateway turns around and notifies Drupal.

The implementation will be some little service sitting on http://localhost:port/path that implements the proxy/gateway to the third party API.
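
The core of such a gateway, accept now and notify later, can be sketched as follows.  This is my own illustration, not code from the Four Kitchens post; createGateway, callApi and notifyDrupal are hypothetical names, and the slow API is simulated with a timer.

```javascript
// Hypothetical sketch of the gateway's core logic. callApi stands in for a
// real third-party client and notifyDrupal for a real callback into Drupal.
function createGateway(callApi, notifyDrupal) {
  let nextId = 1;
  return function accept(request) {
    const id = nextId++;
    // Start the slow call but do not wait for it.
    callApi(request, (err, result) => {
      notifyDrupal({ id: id, error: err ? err.message : null, result: result });
    });
    // Reply immediately so Drupal can carry on building the page.
    return { accepted: true, id: id };
  };
}

// Simulated slow API: answers after 50 ms.
const slowApi = (request, cb) => setTimeout(() => cb(null, request.toUpperCase()), 50);
const notifications = [];
const accept = createGateway(slowApi, (msg) => notifications.push(msg));

const reply = accept('lookup');
// The reply is available before the API has answered.
console.log(reply, notifications.length);
```

In a real service the accept function would be called from an HTTP request handler (for example one created with Node's http.createServer), and notifyDrupal would make an HTTP request back to Drupal or write into its queue.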



Wednesday, February 8, 2012

Followup on removing of isolates from Node.js 0.7.3

A couple of days ago I noted that a commit to the Node.js development tree had removed the isolates feature that had been expected for Node.js 0.8.x.  Over on the node-users mailing list Isaac Schlueter posted an explanation of "why".

Why was this feature planned?

The Isolates feature was intended to make it possible to run
child_process.fork() in a thread, rather than a full process. The
justification was to make it cheaper to spin up new child node
instances, as well as allowing for fast message-passing using shared
memory in binary addons, while retaining the semantics of node's
child_process implementation by keeping them in completely isolated v8
instances.

Why was it removed?

... ultimately turned out to
cause too much instability in node's internal functionality to justify
continuing with it at this time. It requires a lot of complexity to
be added to libuv and node, and isn't likely to yield enough gains to
be worth the investment.

One of those disappointed saw the justification for Isolates as being that it "was going to make Node more able to do intense CPU-bound operations without blocking everything else, a limitation that is one of Node's biggest criticisms."

Let's stop and explain this a bit, because it's something I cover in my book, Node Web Development.  A few months ago there was a blog post using the Fibonacci calculation (as I do in Node Web Development) to demonstrate the problem.  Basically, a long-running calculation blocks the event loop, preventing the Node.js process from doing its event processing job.  In my book I described two ways to get around this:  a) refactoring the algorithm to dispatch sub-calculations via the event dispatch mechanism, or b) distributing the calculation to a back-end process.
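
Approach (a) can be sketched like this.  It is a minimal illustration rather than the exact code from the book, and it assumes a Node version that provides setImmediate.

```javascript
// Straightforward recursive version: blocks the event loop until it returns.
function fibonacci(n) {
  return n <= 2 ? 1 : fibonacci(n - 1) + fibonacci(n - 2);
}

// Same recursion, but each step is scheduled with setImmediate so that
// pending I/O events get a turn between sub-calculations.
function fibonacciAsync(n, done) {
  if (n <= 2) {
    done(1);
    return;
  }
  setImmediate(() => {
    fibonacciAsync(n - 1, (a) => {
      setImmediate(() => {
        fibonacciAsync(n - 2, (b) => done(a + b));
      });
    });
  });
}

// Both versions compute the same value; only the scheduling differs.
fibonacciAsync(10, (value) => {
  console.log(value === fibonacci(10));
});
```

The asynchronous version is slower overall because of the scheduling overhead, but the process stays responsive to other events while the calculation is in progress.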

The question at this point is whether Node should strive to be a do-everything be-everything platform, or whether it should focus on the thing it does best (extremely fast event driven I/O processing)?

If everyone coming to Node.js knows that long-running calculations require special handling, then is it a problem?  For example people generally don't use a hammer to brush their teeth because everybody knows that hammers are for bashing things.  In other words, you use the best tool for the job and what Node strives to do is be a tool for extremely fast event driven I/O processing.

Isaac responded to the above criticism by saying you can still launch a child process to push the long-running calculation to another process.  Nothing about child_process.fork has changed, other than that it won't be implemented with Isolates.  The cost is that spinning up a child process this way stays a bit more expensive.

Ben Noordhuis suggested this:

Retrofitting thread safety onto a code base that wasn't designed for
it leaves a very wide margin for obscure bugs. Offset against the
potential benefits (which were questionable and probably not
bottlenecks to most people*) the choice was not hard to make.

And several others chimed in saying "stability and debugging first".





Monday, February 6, 2012

Good bye isolates, Node.js hardly knew ye

A feature on the Node.js 0.8.x roadmap was isolates, which would allow for threads-like features without the complexity of threads.  That's because isolates have no shared state, while allowing for multiple independent code execution streams to run in parallel within the same process.

This morning a check-in to the Node development tree backed out the support for isolates.  The explanation is pasted in below.

Revert support for isolates. It was decided that the performance benefits that isolates offer (faster spin-up times for worker processes, faster inter-worker communication, possibly a lower memory footprint) are not actual bottlenecks for most people and do not outweigh the potential stability issues and intrusive changes to the code base that first-class support for isolates requires. Hence, this commit backs out all isolates-related changes. Good bye, isolates. We hardly knew ye.