Tuesday, April 15, 2014

AkashaCMS v0.3.0 released - major rearchitecting, plugins, improvements, much more planned for v0.4.x

I'm pleased to announce that AkashaCMS has reached version 0.3.0.  This version has been over a year in development and isn't quite what I'd intended, but it reflects the actual needs that arose during the past year.  The primary change was to rearchitect AkashaCMS to support plugins, a move which allowed the creation of several useful plugins.

AkashaCMS is a "content management system", written in Node.js, for producing static HTML websites.  The goal is that AkashaCMS websites be built with modern HTML5, CSS3 and JavaScript technologies, while having the performance advantages of straight HTML files.  For some thoughts on why one should use AkashaCMS (or other static HTML CMSs) see: Static HTML website builders (AkashaCMS, etc) slashes web hosting costs to the bone

Plugins - Extensibility for AkashaCMS

What's an AkashaCMS plugin?  The website lists the current set of plugins.  There's no central registry of plugins, but I urge anyone who's built a general purpose plugin to send documentation to put on the website.

A website declares the plugins it uses in config.js like so:

    plugins: [
        require('akashacms-breadcrumbs'),
        require('akashacms-booknav'),
        require('akashacms-embeddables'),
        require('akashacms-social-buttons'),
        require('akashacms-tagged-content'),
        require('akashacms-theme-bootstrap'),
        require('akashacms-theme-boilerplate')
    ],

An AkashaCMS plugin is simply a module that exports a function with this signature:

module.exports.config = function(akasha, config) { ... }

This function is called during AkashaCMS initialization, and it will typically manipulate the arrays in config.js.  Typical plugins provide assets, layouts, partials, or functions.  The locations of any files provided by a plugin are what get pushed into the configuration.
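To make that concrete, here is a minimal sketch of a plugin module.  The array names (root_partials, root_assets, root_layouts) follow the pattern of a site's config.js, but treat them as assumptions and check a real plugin's source for the exact names.

var path = require('path');

// A hypothetical plugin exporting the config function.  It adds its own
// directories to the site configuration; pushing onto the end of each
// array means the website's own directories, listed first, take
// precedence, which is how a site can override a plugin's templates.
module.exports.config = function(akasha, config) {
    config.root_partials.push(path.join(__dirname, 'partials'));
    config.root_assets.push(path.join(__dirname, 'assets'));
    config.root_layouts.push(path.join(__dirname, 'layout'));
};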

There are a number of facets to this, such as allowing a website to override features provided by a plugin.  For example, a plugin could provide a template in layout/youtube.html.ejs; if the website implements a file of the same name, AkashaCMS uses the website's version of the template rather than the plugin's.

AkashaCMS has a plugin named "builtin" that is invisibly included, providing a bunch of useful functions and templates, many of which have to do with page metadata.  It also provides a base page template adapted from the HTML5 Boilerplate framework.

PHP?  Wait, I thought this was for Static HTML websites!

Another major improvement is one I snuck in a couple weeks ago.  You can now create PHP files using AkashaCMS's template processing system.   The result is that, just as with HTML files in AkashaCMS today, you write the PHP for the core part of the page and then wrap it with page layout templates.

You could always have created a file named foobar.php, in either the documents or assets directory, and AkashaCMS would copy it directly to the rendered website.   Now you can write a file named index.php.ejs, and the file will be processed using EJS and output as index.php.  At the moment only EJS is supported with PHP files, not Kernel.

Yeah, AkashaCMS was envisioned as creating static HTML websites, and now it supports creating PHP files.  That's a bit of a divergence from the original intent, and I don't know whether it compromises AkashaCMS's purity.  The feature arose directly from my own needs: I ported one of my sites, thereikipage.com, which had a few PHP scripts.  Those scripts had to use the same theming as the rest of the site, and it seemed best to process them through the same template files used for the rest of the site.

The result turned out very well, and it fits naturally with AkashaCMS.  You simply write a snippet of PHP code, add an AkashaCMS frontmatter block that declares a layout file, and voila: your script automatically inherits the layout/theme of the website, partials are available, and so on.
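For example, a document named servertime.php.ejs might look something like this sketch.  The exact frontmatter delimiters and the layout name are assumptions here, so check the AkashaCMS documentation for the real format.

layout: page.html.ejs
title: Server time
---
<h1>Server time</h1>
<p>It is now <?php echo date('r'); ?> on the server.</p>

At build time AkashaCMS wraps this in the declared layout; at request time the web server executes the PHP.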

What's planned for AkashaCMS 0.4.x?

As 0.3.x proved, my intentions at the beginning are unlikely to be an accurate prediction of what 0.4.x will actually look like.  Still, I've already entered a large number of issues for the v0.4.x milestone, reflecting the ideas I think are needed.

Major areas are:

Filters: Generally, this is about manipulating the HTML before rendering is finished.  I want to do things like automatically add rel=nofollow to certain outbound links, or append a little "external link" icon next to outbound links.  There's potentially a huge number of filters that could be written.

The likely implementation is that the rendering pipeline will send messages to plugins; a plugin function would receive the rendered HTML and respond with the manipulated HTML.
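As a sketch of the idea (the function name and signature here are hypothetical, nothing is a committed API; cheerio is just one convenient module for manipulating HTML):

var cheerio = require('cheerio');

// Hypothetical filter: add rel="nofollow" to outbound links.
// Naively treats any absolute http/https URL as outbound.
module.exports.filterHTML = function(config, html) {
    var $ = cheerio.load(html);
    $('a[href^="http"]').each(function() {
        $(this).attr('rel', 'nofollow');
    });
    return $.html();
};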

YAML: I need to research this better, but I understand YAML is a data markup doohickey similar in purpose to the frontmatter AkashaCMS uses today, but more comprehensive.  There are times the frontmatter format feels too limiting.  For example, the akashacms-tagged-content plugin is an initial stab at vocabularies and tags, and it'd be very useful to have a better way of declaring a list of tag names than a simplistic comma-separated list.
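For instance, a hypothetical YAML frontmatter (not something AkashaCMS supports today) could declare tags as a real list:

title: Installing MongoDB on Mac OS X Mavericks
layout: blogpost.html.ejs
tags:
  - MongoDB
  - MacPorts
  - Node.js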

Merge/Minify JS/CSS:  This is important for improving site speed, by reducing the number of individual file requests and the size of JS/CSS assets.  There's some complexity: the set of JS/CSS files is not necessarily the same across the whole site, so potentially each page needs its own aggregated/minified JS/CSS file.  I might, again, punt on doing this.

RSS/Blog/Podcast: Theoretically AkashaCMS could build a blog or podcast website.  I have in mind some method of declaring a group of content files to be treated as a blog; an RSS (or Atom) feed would be generated from that group of files, as would a river-of-news index page.

I'd like AkashaCMS to support multiple "blog" clusters per website - for example, to support both a blog and a podcast on the same site.

Rebuild only what's needed: Currently AkashaCMS deletes everything in the output directory, then renders the entire site.  That's a simple way to ensure the output is clean every time and contains exactly what the site is supposed to have.  But it's wasteful to rebuild everything: if you edit one file and want to test the change, why wait for the entire site to rebuild?  AkashaCMS can build an individual file, but the feature is imperfect, and what if several files need rebuilding?

Thursday, April 10, 2014

What's the best recommended format or directory structure for an Express project in Node.js?

The Express framework for Node.js offers great support for developing applications.  It has a good routing system, support for multiple templating engines, and an interesting system called "middleware" for plugging support modules into certain URLs of your website.  Say you want certain paths to require user credentials, and other paths to be open for anybody?  You simply insert user-authentication middleware into the desired routes.
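For example, a minimal sketch (requireUser and the routes are invented for illustration, and assume session middleware is configured):

var express = require('express');
var app = express();

// Hypothetical middleware: pass the request along only if a user
// is present in the session; otherwise answer 401.
function requireUser(req, res, next) {
    if (req.session && req.session.user) next();
    else res.status(401).send('Not authorized');
}

// Only /account requires credentials; / is open to anybody.
app.get('/account', requireUser, function(req, res) {
    res.send('your account details');
});
app.get('/', function(req, res) {
    res.send('hello, world');
});

app.listen(3000);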

One thing the Express framework does not do is specify a layout for your application.   Instead we have complete flexibility to either organize it ourselves, or shoot ourselves in the foot.

For my book Node Web Development (see links in sidebar) I thought about this quite a bit, and came to this sort of organization:
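In outline it looks something like this (a sketch reconstructed from the description below; models-xyzzy stands for one or more concrete model directories):

app.js          Express glue that weaves the application together
cluster.js      load-sharing across CPU cores with the cluster module
workforce.js    the worker side of the clustering support
models-xyzzy/   Model: one directory per supported database engine
routes/         Controller: route handler modules
views/          View: templates, very little code
public/         static assets such as images and stylesheets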

Certain files that weave the whole application together live in the top-level directory.  This particular application can use the cluster module to load-share over multiple CPU cores, so in addition to the app.js that contains all the Express glue there are cluster.js and workforce.js for the clustering support.

The child directories are split into Model (models-xyzzy), View (views), and Controller (routes) roles, to follow the MVC pattern.  This may or may not be a good idea, but I thought it would be good for the book to demonstrate one way of implementing that pattern.  The "public" directory contains static assets like images and stylesheets.  The "views" directory contains templates and very little code.

Having multiple models directories meant the application could run on top of different database engines.  Each model is required to support the same API, and the choice of model is controlled by the code in app.js.

Because the Controller code in the routes directory had to work with any of the models, the route modules did not directly load the model modules.  That is, routes/users.js did not require('../models-memory/users').  Instead, the models were require'd in app.js (require('./models-xyzzy/users')) and passed into the route modules through a configuration function, like so:

var rUsers = require('./routes/users');
rUsers.config({ model: require('./models-xyzzy/users') });
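The receiving side, in routes/users.js, would look something like this sketch (the findUser API and the route handler are made up for illustration):

// routes/users.js -- sketch of the configuration-function pattern
var model; // injected by app.js

module.exports.config = function(options) {
    model = options.model;
};

// Route handlers call whichever model implementation was injected.
module.exports.show = function(req, res) {
    model.findUser(req.params.id, function(err, user) {
        if (err) res.status(404).send(err.message);
        else res.render('user', { user: user });
    });
};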

The authors of Express did not, uh, express an Opinion one way or the other on the Best Practice for organizing the modules in an Express application.  At the end of the day it's up to you.  But you're almost certainly working with other developers, and your team needs to be on the same page about these decisions.  Rather than hunt for a universal best practice, it's better to follow whatever convention your team settles on.

Friday, January 17, 2014

Installing MongoDB on Mac OS X Mavericks for Node.js development

This is part three of a series of blog posts exploring the set-up of a MAMP-like development environment for Node.js on Mac OS X.  Earlier we looked at setting up Node.js on a Mac OS X machine, as well as using forever to keep server processes running.  In this installment we'll look at setting up a MongoDB instance.

MongoDB is an extremely popular NoSQL database that's document-centric and offers a lot of flexibility.  The Node.js community has developed several Mongo drivers for use with Node.  My book (see links in sidebar) includes a section demonstrating the Mongoose ORM library.  Both the MongoDB and Mongoose projects are sponsored by 10gen.

The first question to ask yourself is, why set up a local MongoDB instance at all?  There are several companies offering free or low cost MongoDB database services.  You could just do your software development work against one of those providers, and skip the headache of running your own server.

That's a great point, and it's worth considering http://www.mongohq.com for your MongoDB needs.  But it's not that simple.  For example, in my book I'd listed two other companies, neither of whom now offers MongoDB hosting services.  You'd be putting your development environment at the mercy of third parties who might quit the business.  And do you trust your data to a third party?

In any case, assuming you've decided to set up a local MongoDB instance, how do you go about it?

One way is to go to http://www.mongodb.org/downloads and get the Mac OS X download.  They also have installation instructions covering both the binary download and a source install using Homebrew.

What I've done instead, on my computer, is install it with MacPorts.

Why install using a package manager rather than the downloadable binary?  So that the package manager will automagically update the software as new releases come out.

With MacPorts it's real simple to do:

$ sudo port install mongodb
Password:
--->  Computing dependencies for mongodb
--->  Dependencies to be installed: libpcap scons snappy
... lots of output as it builds other stuff
--->  Building mongodb
--->  Staging mongodb into destroot
--->  Creating launchd control script
###########################################################
# A startup item has been generated that will aid in
# starting mongodb with launchd. It is disabled
# by default. Execute the following command to start it,
# and to cause it to launch at startup:
#
# sudo port load mongodb
###########################################################
--->  Installing mongodb @2.4.8_1
--->  Activating mongodb @2.4.8_1

MongoDB is installed at this point, but not running.  As they note in the installation instructions (link above), you can simply run mongod with a local data directory, without messing around with getting Mac OS X to autostart the mongod process on system restart.  But, fortunately, MacPorts makes it easy to autostart MongoDB.

$ sudo port load mongodb
Password:
$ ps -eaf | grep mongo
    0 35355     1   0 12:24PM ??         0:00.01 /opt/local/bin/daemondo --label=mongodb --start-cmd sudo -u _mongo /opt/local/bin/mongod --dbpath /opt/local/var/db/mongodb --logpath /opt/local/var/log/mongodb/mongodb.log --logappend ; --pid=exec
    0 35356 35355   0 12:24PM ??         0:00.01 sudo -u _mongo /opt/local/bin/mongod --dbpath /opt/local/var/db/mongodb --logpath /opt/local/var/log/mongodb/mongodb.log --logappend
  506 35357 35356   0 12:24PM ??         0:00.11 /opt/local/bin/mongod --dbpath /opt/local/var/db/mongodb --logpath /opt/local/var/log/mongodb/mongodb.log --logappend
  501 35364   485   0 12:24PM ttys001    0:00.00 grep mongo

Now, MongoDB is running, and will auto restart when the computer starts.

You can play with the mongo install like so:

mainmini:t david$ mongo
MongoDB shell version: 2.4.8
connecting to: test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
 http://docs.mongodb.org/
Questions? Try the support group
 http://groups.google.com/group/mongodb-user
> 
> 
> help
 db.help()                    help on db methods
 db.mycoll.help()             help on collection methods
 sh.help()                    sharding helpers
 rs.help()                    replica set helpers
 help admin                   administrative help
 help connect                 connecting to a db help
 help keys                    key shortcuts
 help misc                    misc things to know
 help mr                      mapreduce

 show dbs                     show database names
 show collections             show collections in current database
 show users                   show users in current database
 show profile                 show most recent system.profile entries with time >= 1ms
 show logs                    show the accessible logger names
 show log [name]              prints out the last segment of log in memory, 'global' is default
 use <db_name>                set current database
 db.foo.find()                list objects in collection foo
 db.foo.find( { a : 1 } )     list objects in foo where a == 1
 it                           result of the last line evaluated; use to further iterate
 DBQuery.shellBatchSize = x   set default number of items to display on shell
 exit                         quit the mongo shell

The Getting Started notes are also useful for verifying the install went well.

Over on the Mongoose website they have a quick start guide.
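Their quick start boils down to something like this sketch (run npm install mongoose first, and assume the MongoDB server we just set up is running on localhost):

var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/test');

// A minimal model with a single field.
var Cat = mongoose.model('Cat', { name: String });

var kitty = new Cat({ name: 'Zildjian' });
kitty.save(function (err) {
    if (err) console.error(err);
    else console.log('meow');
    mongoose.disconnect();
});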

There are plenty of other MongoDB tools available in the npm registry: https://npmjs.org/search?q=mongodb
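For example, the official mongodb driver (npm install mongodb) connects to the local server like so (a minimal sketch; the collection name and document are invented):

var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost:27017/test', function(err, db) {
    if (err) throw err;
    db.collection('test_insert').insert({ hello: 'world' }, function(err, docs) {
        if (err) throw err;
        console.log('inserted', docs);
        db.close();
    });
});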

Friday, January 3, 2014

Managing Node.js servers on Mac OS X with forever - works best for development

If, like me, you're doing Node.js development on a Mac, you might have a yearning for a tool like MAMP but which works for Node.  A couple weeks ago I wrote a blog post covering the first step, setting up a Node and npm instance on your computer.   If you don't know what MAMP is, go read that blog post, and then come back here.  What I want to go over today is a way to manage/monitor one or more Node processes on your computer.

Forever is a simple CLI-oriented tool to ensure that a Node process runs continuously.  Its functionality is similar to the init daemon on Linux systems, except it doesn't run at system boot-up time.  On a Mac, launchd serves the same purpose.  Out of the box forever doesn't integrate with either, but it's possible to write a little wrapper script.

Installation is straightforward:

$ sudo npm install forever -g

While forever has an API, we'll be using it as a command line tool.  Type this to get a list of options:

$ forever --help

It has start, stop and restart commands to manage processes, as well as a list command to show the current processes.

Among the example scripts that demonstrate using forever is a simple server that we can play with:

var util = require('util'),
    http = require('http'),
    argv = require('optimist').argv;

var port = argv.p || argv.port || 8080;

http.createServer(function (req, res) {
  console.log(req.method + ' request: ' + req.url);
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.write('hello, i know nodejitsu.');
  res.end();
}).listen(port);

/* server started */
util.puts('> hello world running on port ' + port);

Note that I've modified it to run on port 8080, because on the Mac an Apache process is running on port 80 (assuming you've enabled Web Sharing in the control panel).

A required bit of setup is to run this:

$ npm install optimist

And then you start the server this way:

$ forever start server.js 
warn:    --minUptime not set. Defaulting to: 1000ms
warn:    --spinSleepTime not set. Your script will exit if it does not stay up for at least 1000ms
info:    Forever processing file: server.js

And then you can visit http://localhost:8080/ in your browser.  If you have multiple applications you're developing it's simple enough to ensure that, for development, they're running on different ports, eh?  Maybe.

You can query the status of servers this way:

$ forever list
info:    Forever processes running
data:        uid  command             script    forever pid   logfile                        uptime       
data:    [0] 4NdQ /opt/local/bin/node server.js 48843   48844 /Users/david/.forever/4NdQ.log 0:0:2:11.320 

The [0] indicates the script index, and is used in some of the forever commands to identify the script to act on.  For example we can view the logfile this way:

$ forever logs 0
data:    server.js:48844 - > hello world running on port 8080

Or stop the server:

$ forever stop 0
info:    Forever stopped process:
data:        uid  command             script    forever pid   logfile                        uptime       
[0] 4NdQ /opt/local/bin/node server.js 48843   48844 /Users/david/.forever/4NdQ.log 0:0:4:11.846 
$ forever list
info:    No forever processes running

Since this is about developing Node applications, we want to be able to edit our code and automatically reload the application.  Forever makes this fairly easy, with two different methods.

One way is:

$ forever restart 0

Which restarts and reloads the application.  Alternatively, we can instruct forever to watch for changes and reload automatically:

$ forever start -w server.js server.js 

Then, after making a change, we can reload the page to see it take effect, and also see this in the logs.

$ forever logs 0
data:    server.js:49219 - > hello world running on port 8080
data:    server.js:49219 - GET request: /
data:    server.js:49219 - error: restarting script because /Users/david/t/server.js changed
data:    server.js:49219 - error: Forever detected script was killed by signal: SIGKILL
data:    server.js:49219 - error: Forever restarting script for 1 time
data:    server.js:49219 - > hello world running on port 8080

While running the processes look like this:

$ ps -eaf | grep node
501 48939     1   0  3:48PM ??  0:00.50 /opt/local/bin/node /opt/local/lib/node_modules/forever/bin/monitor server.js
501 48943 48939   0  3:49PM ??  0:00.12 /opt/local/bin/node /Users/david/server.js

Notice that the parent of the forever monitor process is process #1, which on Mac OS X is launchd.  It means we can log out, log back in, and the process will still be there.  However, if we reboot the system the process doesn't restart.  Maybe we want launchd to start it at boot, while still using forever to manage it.

I found a solution over on stackoverflow - http://stackoverflow.com/questions/18604119/osx-launchd-plist-for-node-forever-process

One uses a plist of this shape:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>KeepAlive</key>
        <dict>
            <key>SuccessfulExit</key>
            <false/>
        </dict>
        <key>Label</key>
        <string>com.davidherron.testapp</string>
        <key>ProgramArguments</key>
        <array>
            <string>/opt/local/bin/node</string>
            <string>/opt/local/bin/forever</string>
            <string>-a</string>
            <string>-l</string>
            <string>/var/log/com.davidherron.testapp.log</string>
            <string>-e</string>
            <string>/var/log/com.davidherron.testapp_error.log</string>
            <string>-w</string>
            <string>/Users/david/t/server.js</string>
            <string>/Users/david/t/server.js</string>
        </array>
        <key>RunAtLoad</key>
        <true/>
        <key>StartInterval</key>
        <integer>3600</integer>
    </dict>
</plist>

Save it as /Library/LaunchDaemons/com.davidherron.testapp.plist then make sure it's owned correctly:

$ sudo chown root /Library/LaunchDaemons/com.davidherron.testapp.plist
$ sudo chmod 0644 /Library/LaunchDaemons/com.davidherron.testapp.plist

The idea is to be able to launch it this way:

$ sudo launchctl load /Library/LaunchDaemons/com.davidherron.testapp.plist 

Unfortunately, while it runs, we're unable to view the processes using forever.  And after a while the process crashed, but forever did not restart it.

After a while of ditzing around with this, it appears that forever and launchd are not a good combination.  Maybe someone has figured out how to run forever under launchd and can leave a comment below.

In the meantime I'll say that forever is fine for development use, on a Mac.

Tuesday, December 17, 2013

How do function(err,data) callbacks work in Node?

Those new to Node programming might find the callback function pattern a little difficult to understand.  For example how does the data get into the arguments to the function call, and how does the callback function get called?

For example, fs.readFile requires that you pass in a function with this signature:  function(err,data)

Does that mean the function parameters have to be called err and data?  What if the function has more parameters? 

The Node.js runtime follows several conventions, one of which is the order of parameters in callback functions.  The convention makes it more straightforward to write callback functions and otherwise interoperate with the Node runtime.

Convention:  The first parameter, typically called err, is given an Error object if there was an error; otherwise it is null.

Convention:  The last parameter is a callback function, if one is needed for the called function to notify the caller of results or errors.

Because Node.js uses an asynchronous programming model, typical constructs like try/catch, exceptions, and return values simply do not work for asynchronous calls.  The asynchronously called function can throw all the exceptions it wants, but they won't be caught by a try/catch located at the site of the function call.
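A concrete sketch with fs.readFile makes both points visible.  Note that the parameter names are arbitrary; only their order matters:

var fs = require('fs');

// The names could just as well be "err" and "data";
// Node only cares that the error comes first.
fs.readFile('/etc/hosts', 'utf8', function(anError, theContents) {
    if (anError) {
        // Errors arrive here -- a try/catch wrapped around
        // the fs.readFile call would never see them.
        console.error('read failed: ' + anError.message);
        return;
    }
    console.log(theContents);
});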

The arguments required for each callback function depend on the needs of the function being called.  You have to consult the documentation for each function.

Callback functions are invoked when a function needs to return data to the caller, send errors to the caller, or collaborate with code provided by the caller.

Node.js is a big win at PayPal

In my book Node Web Development (see sidebar), I spent the first chapter trying to sell the reader on using JavaScript on the server.  That's because JavaScript isn't among the typical server-side languages, so everyone scratches their head wondering why they should use JS on the server.  I suggested that, theoretically, one big win would come from front-end and back-end coding both being in the same language, JavaScript, making it possible for front-end engineers to talk with server engineers in the same language.  Or perhaps even be the same person.

A blog post from PayPal validates that theory very well. 

The author of that post says PayPal is looking for "full-stack engineers" who can code both front-end and back-end stuff.  He even described the boundary between browser and server code as "artificial."

They like Express, but found it so flexible that they wanted some conventions for PayPal's applications.  They came up with a library, kraken.js, to house those conventions.

To test whether Node.js would work at PayPal, they chose to take their "account overview page", one of the most trafficked pages on PayPal's service, and recode it in Node.  To hedge their bets, they had a second team work on a parallel Java implementation.

The Node team had a slow start because they had to build up some infrastructure for Node.js to work in PayPal, e.g. sessions, centralized logging, keystores.  Even with a two-month lag before they could start work on the actual application, they beat the Java team to the end goal.  Further, the Node implementation had these gains:

  • Built almost twice as fast with fewer people
  • Written in 33% fewer lines of code
  • Constructed with 40% fewer files
And performance was great:

  • Double the requests per second vs. the Java application. This is even more interesting because our initial performance results were using a single core for the node.js application compared to five cores in Java. We expect to increase this divide further.
  • 35% decrease in the average response time for the same page. This resulted in the pages being served 200ms faster— something users will definitely notice.

How to generate unique temporary file names in Node

Often we want to write data to a file, don't really care what the file name is, but want to be sure our code is the only code accessing that data.  A trivial algorithm might be to form a file name like:
/tmp/tf(PID)(Count)
The process ID is unique to our process, and a simple counter gives some assurance that two pieces of code won't step on each other's temporary files.  But what about a security vulnerability from untrusted, nefarious code?  This file name pattern is very predictable, and nefarious code could subvert the good code with bad data.
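In code, that trivial scheme is a one-liner, which is exactly the problem (a sketch for illustration only):

var counter = 0;
// Unique within this process, but trivially predictable --
// an attacker can guess the next name and create it first.
function trivialTempName() {
    return '/tmp/tf' + process.pid + '-' + (counter++);
}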

In one of my AkashaCMS plugins I needed to create temporary files and after some searching chose to use the temporary module.

Its usage is straightforward:

var tmp = require('temporary');
var file = new tmp.File();
var dir = new tmp.Dir();

console.log(file.path); // the generated temporary file path
console.log(dir.path);  // the generated temporary directory path

file.unlink();
dir.rmdir();

In other words, if you need a new temporary file or directory just instantiate a new object, do stuff with it, then delete it when done.

Another module, https://github.com/bruce/node-temp, does a bit more.  It tracks the files and directories created, so you can clean them up when done.  It also integrates with Grunt so it can be used in building things.

Another module, https://npmjs.org/package/tmp, is closer to the module I chose for AkashaCMS.