Gergely Nemeth's Picture

Gergely Nemeth

Node.js and microservices, organizer of @Oneshotbudapest @nodebp @jsconfbp

79 posts

Node.js Post-Mortem Diagnostics & Debugging

Post-mortem diagnostics & debugging comes into the picture when you want to figure out what went wrong with your Node.js application in production.

In this chapter of Node.js at Scale we will take a look at node-report, a core project which aims to help you to do post-mortem diagnostics & debugging.

The node-report diagnostics module

The purpose of the module is to produce a human-readable diagnostics summary file. It is meant to be used in both development and production environments.

The generated report includes:

  • JavaScript and native stack traces,
  • heap statistics,
  • system information,
  • resource usage,
  • loaded libraries.

Currently node-report supports Node.js v4, v6, and v7 on AIX, Linux, MacOS, SmartOS, and Windows.

Adding it to your project just takes an npm install and require:

npm install node-report --save  
//index.js
require('node-report')  

Once you add node-report to your application, it will automatically listen on unhandled exceptions and fatal error events, and will trigger a report generation. Report generation can also be triggered by sending a USR2 signal to the Node.js process.

Use cases of node-report

Diagnostics of exceptions

For the sake of simplicity, imagine you have the following endpoint in one of your applications:

function myListener(request, response) {  
  switch (request.url) {
  case '/exception':
    throw new Error('*** exception.js: uncaught exception thrown from function myListener()');
  }
}

This code simply throws an exception once the /exception route handler is called. To make sure we get the diagnostics information, we have to add the node-report module to our application, as shown previously.

require('node-report')  
function my_listener(request, response) {  
  switch (request.url) {
  case '/exception':
    throw new Error('*** exception.js: uncaught exception thrown from function my_listener()');
  }
}

Let's see what happens once the endpoint gets called! Our report just got written into a file:

Writing Node.js report to file: node-report.20170506.100759.20988.001.txt  
Node.js report completed  


Need assistance running Node.js in production?

Expert help when you need it the most
I want help


The header

Once you open the file, you'll get something like this:

=================== Node Report ===================

Event: exception, location: "OnUncaughtException"  
Filename: node-report.20170506.100759.20988.001.txt  
Dump event time:  2017/05/06 10:07:59  
Module load time: 2017/05/06 10:07:53  
Process ID: 20988  
Command line: node demo/exception.js

Node.js version: v6.10.0  
(ares: 1.10.1-DEV, http_parser: 2.7.0, icu: 58.2, modules: 48, openssl: 1.0.2k, 
 uv: 1.9.1, v8: 5.1.281.93, zlib: 1.2.8)

node-report version: 2.1.2 (built against Node.js v6.10.0, 64 bit)

OS version: Darwin 16.4.0 Darwin Kernel Version 16.4.0: Thu Dec 22 22:53:21 PST 2016; root:xnu-3789.41.3~3/RELEASE_X86_64

Machine: Gergelys-MacBook-Pro.local x86_64  

You can think of this part as a header for your diagnostics summary - it includes..

  • the main event why the report was created,
  • how the Node.js application was started (node demo/exception.js),
  • what Node.js version was used,
  • the host operating system,
  • and the version of node-report itself.

The stack traces

The next part of the report includes the captured stack traces, both for JavaScript and the native part:

=================== JavaScript Stack Trace ===================
Server.myListener (/Users/gergelyke/Development/risingstack/node-report/demo/exception.js:19:5)  
emitTwo (events.js:106:13)  
Server.emit (events.js:191:7)  
HTTPParser.parserOnIncoming [as onIncoming] (_http_server.js:546:12)  
HTTPParser.parserOnHeadersComplete (_http_common.js:99:23)  

In the JavaScript part, you can see..

  • the stack trace (which function called which one with line numbers),
  • and where the exception occurred.

In the native part, you can see the same thing - just on a lower level, in the native code of Node.js

=================== Native Stack Trace ===================
 0: [pc=0x103c0bd50] nodereport::OnUncaughtException(v8::Isolate*) [/Users/gergelyke/Development/risingstack/node-report/api.node]
 1: [pc=0x10057d1c2] v8::internal::Isolate::Throw(v8::internal::Object*, v8::internal::MessageLocation*) [/Users/gergelyke/.nvm/versions/node/v6.10.0/bin/node]
 2: [pc=0x100708691] v8::internal::Runtime_Throw(int, v8::internal::Object**, v8::internal::Isolate*) [/Users/gergelyke/.nvm/versions/node/v6.10.0/bin/node]
 3: [pc=0x3b67f8092a7] 
 4: [pc=0x3b67f99ab41] 
 5: [pc=0x3b67f921533] 

Heap and garbage collector metrics

You can see in the heap metrics how each heap space performed during the creation of the report:

  • new space,
  • old space,
  • code space,
  • map space,
  • large object space.

These metrics include:

  • memory size,
  • committed memory size,
  • capacity,
  • used size,
  • available size.

To better understand how memory handling in Node.js works, check out the following articles:

=================== JavaScript Heap and GC ===================
Heap space name: new_space  
    Memory size: 2,097,152 bytes, committed memory: 2,097,152 bytes
    Capacity: 1,031,680 bytes, used: 530,736 bytes, available: 500,944 bytes
Heap space name: old_space  
    Memory size: 3,100,672 bytes, committed memory: 3,100,672 bytes
    Capacity: 2,494,136 bytes, used: 2,492,728 bytes, available: 1,408 bytes

Total heap memory size: 8,425,472 bytes  
Total heap committed memory: 8,425,472 bytes  
Total used heap memory: 4,283,264 bytes  
Total available heap memory: 1,489,426,608 bytes

Heap memory limit: 1,501,560,832  

Resource usage

The resource usage section includes metrics on..

  • CPU usage,
  • the size of the resident set size,
  • information on page faults,
  • and the file system activity.
=================== Resource usage ===================
Process total resource usage:  
  User mode CPU: 0.119704 secs
  Kernel mode CPU: 0.020466 secs
  Average CPU Consumption : 2.33617%
  Maximum resident set size: 21,965,570,048 bytes
  Page faults: 13 (I/O required) 5461 (no I/O required)
  Filesystem activity: 0 reads 3 writes

System information

The system information section includes..

  • environment variables,
  • resource limits (like open files, CPU time or max memory size)
  • and loaded libraries.

Diagnostics of fatal errors

The node-report module can also help once you have a fatal error, like your application runs out of memory.

By default, you will get an error message something like this:

<--- Last few GCs --->

   23249 ms: Mark-sweep 1380.3 (1420.7) -> 1380.3 (1435.7) MB, 695.6 / 0.0 ms [allocation failure] [scavenge might not succeed].
   24227 ms: Mark-sweep 1394.8 (1435.7) -> 1394.8 (1435.7) MB, 953.4 / 0.0 ms (+ 8.3 ms in 231 steps since start of marking, biggest step 1.2 ms) [allocation failure] [scavenge might not succeed].

On its own, this information is not that helpful. You don't know the context, or what was the state of the application. With node-report, it gets better.

First of all, in the generated post-mortem diagnostics summary you will have a more descriptive event:

Event: Allocation failed - JavaScript heap out of memory, location: "MarkCompactCollector: semi-space copy, fallback in old gen"  

Secondly, you will get the native stack trace - that can help you to understand better why the allocation failed.

Diagnostics of blocking operations

Imagine you have the following loops which block your event loop. This is a performance nightmare.

var list = []  
for (let i = 0; i < 10000000000; i++) {  
  for (let j = 0; i < 1000; i++) {
    list.push(new MyRecord())
  }
  for (let j=0; i < 1000; i++) {
    list[j].id += 1
    list[j].account += 2
  }
  for (let j = 0; i < 1000; i++) {
    list.pop()
  }
}

With node-report you can request reports even when your process is busy, by sending the USR2 signal. Once you do that you will receive the stack trace, and you will see in a minute where your application spends time.

(Examples are taken for the node-report repository)

The API of node-report

Triggering report generation programmatically

The creation of the report can also be triggered using the JavaScript API. This way your report will be saved in a file, just like when it was triggered automatically.

const nodeReport = require('node-report')  
nodeReport.triggerReport()  

Getting the report as a string

Using the JavaScript API, the report can also be retrieved as a string.

const nodeReport = require('nodereport')  
const report = nodeReport.getReport()  

Using without automatic triggering

If you don't want to use automatic triggers (like the fatal error or the uncaught exception) you can opt-out of them by requiring the API itself - also, the file name can be specified as well:

const nodeReport = require('node-report/api')  
nodeReport.triggerReport('name-of-the-report')  

Contribute

If you feel like making Node.js even better, please consider joining the Postmortem Diagnostics working group, where you can contribute to the module.

The Postmortem Diagnostics working group is dedicated to the support and improvement of postmortem debugging for Node.js. It seeks to elevate the role of postmortem debugging for Node, to assist in the development of techniques and tools, and to make techniques and tools known and available to Node.js users.

In the next chapter of the Node.js at Scale series, we will discuss profiling Node.js Applications. If you have any questions, please let me know in the comments section below.

Mastering the Node.js Core Modules - The File System & fs Module

In this article, we'll take a look at the File System core module, File Streams and some fs module alternatives.


Imagine that you just got a task, which you have to complete using Node.js..

The problem you face seems to be easy, but instead of checking the official Node.js docs, you head over to Google or npm and search for a module that can do the job for you.

While this is totally okay; sometimes the core modules could easily do the trick for you.

In this new Mastering the Node.js Core Modules series you can learn what hidden/barely known features the core modules have, and how you can use them. We will also mention modules that extend their behaviors and are great additions to your daily development flow.

The Node.js fs module

File I/O is provided by simple wrappers around standard POSIX functions. To use the fs module you have to require it with require('fs'). All the methods have asynchronous and synchronous forms.

The asynchronous API

// the async api
const fs = require('fs')

fs.unlink('/tmp/hello', (err) => {  
  if (err) {
    return console.log(err)
  }
  console.log('successfully deleted /tmp/hello')
})

You should always use the asynchronous API when developing production code, as it won't block the event loop so you can build performant applications.

The synchronous API

// the sync api
const fs = require('fs')

try {  
  fs.unlinkSync('/tmp/hello')
} catch (ex) {
  console.log(ex)
}

console.log('successfully deleted /tmp/hello');  

You should only use the synchronous API when building proof of concept applications, or small CLIs.

Node.js File Streams

One of the things we see a lot is that developers barely take advantage of file streams.

What are Node.js streams, anyways?

Streams are a first-class construct in Node.js for handling data. There are three main concepts to understand:

  • source - the object where your data comes from,
  • pipeline - where your data passes through (you can filter, or modify it here),
  • sink - where your data ends up.

For more information check Substack's Stream Handbook.

As the core fs module does not expose a feature to copy files, you can easily do it with streams:

// copy a file
const fs = require('fs')  
const readableStream = fs.createReadStream('original.txt')  
var writableStream = fs.createWriteStream('copy.txt')

readableStream.pipe(writableStream)  

You could ask - why should I do it when it is just a cp command away?

The biggest advantage in this case to use streams is the ability to transform the files - you could easily do something like this to decompress a file:

const fs = require('fs')  
const zlib = require('zlib')

fs.createReadStream('original.txt.gz')  
  .pipe(zlib.createGunzip())
  .pipe(fs.createWriteStream('original.txt'))


Expert Node.js Support

Need help with core modules and modules from npm?
Learn more


When not to use fs.access

The purpose of the fs.access method is to check if a user have permissions for the given file or path, something like this:

fs.access('/etc/passwd', fs.constants.R_OK | fs.constants.W_OK, (err) => {  
  if (err) {
    return console.error('no access')
  }
  console.log('access for read/write')
})

Constants exposed for permission checking:

  • fs.constants.F_OK - to check if the path is visible to the calling process,
  • fs.constants.R_OK - to check if the path can be read by the process,
  • fs.constants.W_OK - to check if the path can be written by the process,
  • fs.constants.X_OK - to check if the path can be executed by the process.

However, please note that using fs.access to check for the accessibility of a file before calling fs.open, fs.readFile or fs.writeFile is not recommended.

The reason is simple - if you do so, you will introduce a race condition. Between you check and the actual file operation, another process may have already changed that file.

Instead, you should open the file directly, and handle error cases there.

Caveats about fs.watch

With the fs.watch method, you can listen on changes of a file or a directory.

However, the fs.watch API is not 100% consistent across platforms, and on some systems, it is not available at all:

Note, that the recursive option is only supported on OS X and Windows, but not on Linux.

Also, the fileName argument in the watch callback is not always provided (as it is only supported on Linux and Windows), so you should prepare for fallbacks if it is undefined:

fs.watch('some/path', (eventType, fileName) => {  
  if (!filename) {
    //filename is missing, handle it gracefully
  } 
})

Useful fs modules from npm

There are some very useful modules maintained by the community which extends the functionality of the fs module.

graceful-fs

The graceful-fs is a drop-in replacement for the core fs module, with some improvements:

  • queues up open and readdir calls, and retries them once something closes if there is an EMFILE error from too many file descriptors,
  • ignores EINVAL and EPERM errors in chown, fchown or lchown if the user isn't root,
  • makes lchmod and lchown become noops, if not available,
  • retries reading a file if read results in EAGAIN error.

You can start using it just like the core fs module, or alternatively by patching the global module.

// use as a standalone module
const fs = require('graceful-fs')

// patching the global one
const originalFs = require('fs')  
const gracefulFs = require('graceful-fs')  
gracefulFs.gracefulify(originalFs)  

mock-fs

The mock-fs module allows Node's built-in fs module to be backed temporarily by an in-memory, mock file system. This lets you run tests against a set of mock files or directories.

Start using the module is as easy as:

const mock = require('mock-fs')  
const fs = require('fs')

mock({  
  'path/to/fake/dir': {
    'some-file.txt': 'file content here',
    'empty-dir': {}
  },
  'path/to/some.png': new Buffer([8, 6, 7, 5, 3, 0, 9])
})

fs.exists('path/to/fake/dir', function (exists) {  
  console.log(exists)
  // will output true
})

lockfile

File locking is a way to restrict access to a file by allowing only one process access at any specific time. This can prevent race condition scenarios.

Adding lockfiles using the lockfile module is striaghforward:

const lockFile = require('lockfile')

lockFile.lock('some-file.lock', function (err) {  
  // if the err happens, then it failed to acquire a lock.
  // if there was not an error, then the file was created,
  // and won't be deleted until we unlock it.

  // then, some time later, do:
  lockFile.unlock('some-file.lock', function (err) {

  })
})

Conclusion

I hope this was a useful explanation of the Node.js file system and its' possibilities.

If you have any questions about the topic, please let me know in the comments section below.

How to Debug Node.js with the Best Tools Available

Debugging - the process of finding and fixing defects in software - can be a challenging task to do in all languages. Node.js is no exception.

Luckily, the tooling for finding these issues improved a lot in the past period. Let's take a look at what options you have to find and fix bugs in your Node.js applications!

We will dive into two different aspects of debugging Node.js applications - the first one will be logging, so you can keep an eye on production systems, and have events from there. After logging, we will take a look at how you can debug your applications in development environments.

Logging in Node.js

Logging takes place in the execution of your application to provide an audit trail that can be used to understand the activity of the system and to diagnose problems to find and fix bugs.

For logging purposes, you have lots of options when building Node.js applications. Some npm modules are shipped with built in logging that can be turned on when needed using the debug module. For your own applications, you have to pick a logger too! We will take a look at pino.

Before jumping into logging libraries, let's take a look what requirements they have to fulfil:

  • timestamps - it is crucial to know which event happened when,
  • formatting - log lines must be easily understandable by humans, and straightforward to parse for applications,
  • log destination - it should be always the standard output/error, applications should not concern themselves with log routing,
  • log levels - log events have different severity levels, in most cases, you won't be interested in debug or info level events.

The debug module of Node.js

Recommendation: use for modules published on npm

Let's see how it makes your life easier! Imagine that you have a Node.js module that sends serves requests, as well as send out some.

// index.js
const debugHttpIncoming = require('debug')('http:incoming')  
const debugHttpOutgoing = require('debug')('http:outgoing')

let outgoingRequest = {  
  url: 'https://risingstack.com'
}

// sending some request
debugHttpOutgoing('sending request to %s', outgoingRequest.url)

let incomingRequest = {  
  body: '{"status": "ok"}'
}

// serving some request
debugHttpOutgoing('got JSON body %s', incomingRequest.body)  

Once you have it, start your application this way:

DEBUG=http:incoming,http:outgoing node index.js  

The output will be something like this:

Output of Node.js Debugging

Also, the debug module supports wildcards with the * character. To get the same result we got previously, we simply could start our application with DEBUG=http:* node index.js.

What's really nice about the debug module is that a lot of modules (like Express or Koa) on npm are shipped with it - as of the time of writing this article more than 14.000 modules.

The pino logger module

Recommendation: use for your applications when performance is key

Pino is an extremely fast Node.js logger, inspired by bunyan. In many cases, pino is over 6x faster than alternatives like bunyan or winston:

benchWinston*10000:     2226.117ms  
benchBunyan*10000:      1355.229ms  
benchDebug*10000:       445.291ms  
benchLogLevel*10000:    322.181ms  
benchBole*10000:        291.727ms  
benchPino*10000:        269.109ms  
benchPinoExtreme*10000: 102.239ms  

Getting started with pino is straightforward:

const pino = require('pino')()

pino.info('hello pino')  
pino.info('the answer is %d', 42)  
pino.error(new Error('an error'))  

The above snippet produces the following log lines:

{"pid":28325,"hostname":"Gergelys-MacBook-Pro.local","level":30,"time":1492858757722,"msg":"hello pino","v":1}
{"pid":28325,"hostname":"Gergelys-MacBook-Pro.local","level":30,"time":1492858757724,"msg":"the answer is 42","v":1}
{"pid":28325,"hostname":"Gergelys-MacBook-Pro.local","level":50,"time":1492858757725,"msg":"an error","type":"Error","stack":"Error: an error\n    at Object.<anonymous> (/Users/gergelyke/Development/risingstack/node-js-at-scale-debugging/pino.js:5:12)\n    at Module._compile (module.js:570:32)\n    at Object.Module._extensions..js (module.js:579:10)\n    at Module.load (module.js:487:32)\n    at tryModuleLoad (module.js:446:12)\n    at Function.Module._load (module.js:438:3)\n    at Module.runMain (module.js:604:10)\n    at run (bootstrap_node.js:394:7)\n    at startup (bootstrap_node.js:149:9)\n    at bootstrap_node.js:509:3","v":1}

The Built-in Node.js Debugger module

Node.js ships with an out-of-process debugging utility, accessible via a TCP-based protocol and built-in debugging client. You can start it using the following command:

$ node debug index.js

This debugging agent is a not a fully featured debugging agent - you won't have a fancy user interface, however, simple inspections are possible.

You can add breakpoints to your code by adding the debugger statement into your codebase:

const express = require('express')  
const app = express()

app.get('/', (req, res) => {  
  debugger
  res.send('ok')
})

This way the execution of your script will be paused at that line, then you can start using the commands exposed by the debugging agent:

  • cont or c - continue execution,
  • next or n - step next,
  • step or s - step in,
  • out or o - step out,
  • repl - to evaluate script's context.

V8 Inspector Integration for Node.js

The V8 inspector integration allows attaching Chrome DevTools to Node.js instances for debugging by using the Chrome Debugging Protocol.

V8 Inspector can be enabled by passing the --inspect flag when starting a Node.js application:

$ node --inspect index.js

In most cases, it makes sense to stop the execution of the application at the very first line of your codebase and continue the execution from that. This way you won't miss any command execution.

$ node --inspect-brk index.js


I recommend watching this video in full-screen mode to get every detail!

How to Debug Node.js with Visual Studio Code

Most modern IDEs have some support for debugging applications - so does VS Code. It has built-in debugging support for Node.js.

What you can see below, is the debugging interface of VS Code - with the context variables, watched expressions, call stack and breakpoints.

VS Code Debugging Layout Image credit: Visual Studio Code

One of the most valuable features of the integrated Visual Studio Code debugger is the ability to add conditional breakpoints. With conditional breakpoints, the breakpoint will be hit whenever the expression evaluates to true.

If you need more advanced settings for VS Code, it comes with a configuration file, .vscode/launch.json which describes how the debugger should be launched. The default launch.json looks something like this:

{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "node",
            "request": "launch",
            "name": "Launch Program",
            "program": "${workspaceRoot}/index.js"
        },
        {
            "type": "node",
            "request": "attach",
            "name": "Attach to Port",
            "address": "localhost",
            "port": 5858
        }
    ]
}

For advanced configuration settings of launch.json go to https://code.visualstudio.com/docs/editor/debugging#_launchjson-attributes.

For more information on debugging with Visual Studio Code, visit the official site: https://code.visualstudio.com/docs/editor/debugging.

Next Up

If you have any questions about debugging, please let me know in the comments section.

In the next episode of the Node.js at Scale series, we are going to talk about Node.js Post-Mortem Diagnostics & Debugging.

Announcing Free Node.js Monitoring & Debugging with Trace

Today, we’re excited to announce that Trace, our Node.js monitoring & debugging tool is now free for open-source projects.

What is Trace?

We launched Trace a year ago with the intention of helping developers looking for a Node.js specific APM which is easy to use and helps with the most difficult aspects of building Node projects, like..

  • finding memory leaks in a production environment
  • profiling CPU usage to find bottlenecks
  • tracing distributed call-chains
  • avoiding security leaks & bad npm packages

.. and so on.

Node.js Monitoring with Trace by RisingStack - Performance Metrics chart

Why are we giving it away for free?

We use a ton of open-source technology every day, and we are also the maintainers of some.

We know from experience that developing an open-source project is hard work, which requires a lot of knowledge and persistence.

Trace will save a lot of time for those who use Node for their open-source projects.

How to get started with Trace?

  1. Visit trace.risingstack.com and sign up - it's free.
  2. Connect your app with Trace.
  3. Head over to this form and tell us a little bit about your project.

Done. Your open-source project will be monitored for free as a result.

If you need help with Node.js Monitoring & Debugging..

Just drop us a tweet at @RisingStack if you have any additional questions about the tool or the process.

If you'd like to read a little bit more about the topic, I recommend to read our previous article The Definitive Guide for Monitoring Node.js Applications.

One more thing

At the same time of making Trace available for open-source projects, we're announcing our new line of business at RisingStack:

Commercial Node.js support, aimed at enterprises with Node.js applications running in a production environment.

RisingStack now helps to bootstrap and operate Node.js apps - no matter what life cycle they are in.


Disclaimer: We retain the exclusive right to accept or deny your application to use Trace by RisingStack for free.

Mastering the Node.js CLI & Command Line Options

Node.js comes with a lot of CLI options to expose built-in debugging & to modify how V8, the JavaScript engine works.

In this post, we have collected the most important CLI commands to help you become more productive.

Accessing Node.js CLI Options

To get a full list of all available Node.js CLI options in your current distribution of Node.js, you can access the manual page from the terminal using:

$ man node

Usage: node [options] [ -e script | script.js ] [arguments]  
       node debug script.js [arguments] 

Options:  
  -v, --version         print Node.js version
  -e, --eval script     evaluate script
  -p, --print           evaluate script and print result
  -c, --check           syntax check script without executing
...

As you can see in the first usage section, you have to provide the optional options before the script you want to run.

Take the following file:

console.log(new Buffer(100))  

To take advantage of the --zero-fill-buffers option, you have to run your application using:

$ node --zero-fill-buffers index.js

This way the application will produce the correct output, instead of random memory garbage:

<Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >  

CLI Options

Now as we saw how you instruct Node.js to use CLI options, let's see what other options are there!

--version or -v

Using the node --version, or short, node -v, you can print the version of Node.js you are using.

$ node -v
v6.10.0  

--eval or -e

Using the --eval option, you can run JavaScript code right from your terminal. The modules which are predefined in REPL can also be used without requiring them, like the http or the fs module.

$ node -e 'console.log(3 + 2)'
5  

--print or -p

The --print option works the same way as the --eval, however it prints the result of the expression. To achieve the same output as the previous example, we can simply leave the console.log:

$ node -p '3 + 2'
5  

--check or -c

Available since v4.2.0

The --check option instructs Node.js to check the syntax of the provided file, without actually executing it.

Take the following example again:

console.log(new Buffer(100)  

As you can see, a closing ) is missing. Once you run this file using node index.js, it will produce the following output:

/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js:1
(function (exports, require, module, __filename, __dirname) { console.log(new Buffer(100)
                                                                                        ^
SyntaxError: missing ) after argument list  
    at Object.exports.runInThisContext (vm.js:76:16)
    at Module._compile (module.js:542:28)
    at Object.Module._extensions..js (module.js:579:10)
    at Module.load (module.js:487:32)
    at tryModuleLoad (module.js:446:12)
    at Function.Module._load (module.js:438:3)
    at Module.runMain (module.js:604:10)
    at run (bootstrap_node.js:394:7)

Using the --check option you can check for the same issue, without executing the script, using node --check index.js. The output will be similar, except you won't get the stack trace, as the script never ran:

/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js:1
(function (exports, require, module, __filename, __dirname) { console.log(new Buffer(100)
                                                                                        ^
SyntaxError: missing ) after argument list  
    at startup (bootstrap_node.js:144:11)
    at bootstrap_node.js:509:3

The --check option can come handy, when you want to see if your script is syntactically correct, without executing it.


Expert help when you need it the most

Commercial Node.js Support by RisingStack
Learn more


--inspect[=host:port]

Available since v6.3.0

Using node --inspect will activate the inspector on the provided host and port. If they are not provided, the default is 127.0.0.1:9229. The debugging tools attached to Node.js instances communicate via a tcp port using the Chrome Debugging Protocol.

--inspect-brk[=host:port]

Available since v7.6.0

The --inspect-brk has the same functionality as the --inspect option, however it pauses the execution at the first line of the user script.

$ node --inspect-brk index.js 
Debugger listening on port 9229.  
Warning: This is an experimental feature and could change at any time.  
To start debugging, open the following URL in Chrome:  
    chrome-devtools://devtools/bundled/inspector.html?experiments=true&v8only=true&ws=127.0.0.1:9229/86dd44ef-c865-479e-be4d-806d622a4813

Once you have ran this command, just copy and paste the URL you got to start debugging your Node.js process.

--zero-fill-buffers

Available since v6.0.0

Node.js can be started using the --zero-fill-buffers command line option to force all newly allocated Buffer instances to be automatically zero-filled upon creation. The reason to do so is that newly allocated Buffer instances can contain sensitive data.

It should be used when it is necessary to enforce that newly created Buffer instances cannot contain sensitive data, as it has significant impact on performance.

Also note, that some Buffer constructors got deprecated in v6.0.0:

  • new Buffer(array)
  • new Buffer(arrayBuffer[, byteOffset [, length]])
  • new Buffer(buffer)
  • new Buffer(size)
  • new Buffer(string[, encoding])

Instead, you should use Buffer.alloc(size[, fill[, encoding]]), Buffer.from(array), Buffer.from(buffer), Buffer.from(arrayBuffer[, byteOffset[, length]]) and Buffer.from(string[, encoding]).

You can read more on the security implications of the Buffer module on the Synk blog.

--prof-process

Using the --prof-process, the Node.js process will output the v8 profiler output.

To use it, first you have to run your applications using:

node --prof index.js  

Once you ran it, a new file will be placed in your working directory, with the isolate- prefix.

Then, you have to run the Node.js process with the --prof-process option:

node --prof-process isolate-0x102001600-v8.log > output.txt  

This file will contain metrics from the V8 profiler, like how much time was spent in the C++ layer, or in the JavaScript part, and which function calls took how much time. Something like this:

[C++]:
   ticks  total  nonlib   name
     16   18.4%   18.4%  node::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&)
      4    4.6%    4.6%  ___mkdir_extended
      2    2.3%    2.3%  void v8::internal::String::WriteToFlat<unsigned short>(v8::internal::String*, unsigned short*, int, int)
      2    2.3%    2.3%  void v8::internal::ScavengingVisitor<(v8::internal::MarksHandling)1, (v8::internal::LoggingAndProfiling)0>::ObjectEvacuationStrategy<(v8::internal::ScavengingVisitor<(v8::internal::MarksHandling)1, (v8::internal::LoggingAndProfiling)0>::ObjectContents)1>::VisitSpecialized<24>(v8::internal::Map*, v8::internal::HeapObject**, v8::internal::HeapObject*)

[Summary]:
   ticks  total  nonlib   name
      1    1.1%    1.1%  JavaScript
     70   80.5%   80.5%  C++
      5    5.7%    5.7%  GC
      0    0.0%          Shared libraries
     16   18.4%          Unaccounted

To get a full list of Node.js CLI options, check out the official documentation here.


V8 Options

You can print all the available V8 options using the --v8-options command line option.

Currently V8 exposes more than a 100 command line options - here we just picked a few to showcase some of the functionality they can provide. Some of these options can drastically change how V8 behaves, use them with caution!

--harmony

With the harmony flag, you can enable all completed harmony features.

--max_old_space_size

With this option, you can set the maximum size of the old space on the heap, which directly affects how much memory your process can allocate.

This setting can come handy when you run in low memory environments.

--optimize_for_size

With this option, you can instruct V8 to optimize the memory space for size - even if the application gets slower.

Just as the previous option, it can be useful in low memory environments.


Environment Variables

NODE_DEBUG=module[,…]

Setting this environment variable enables the core modules to print debug information. You can run the previous example like this to get debug information on the module core component (instead of module, you can go for http, fs, etc...):

$ NODE_DEBUG=module node index.js

The output will be something like this:

MODULE 7595: looking for "/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js" in ["/Users/gergelyke/.node_modules","/Users/gergelyke/.node_libraries","/Users/gergelyke/.nvm/versions/node/v6.10.0/lib/node"]  
MODULE 7595: load "/Users/gergelyke/Development/risingstack/mastering-nodejs-cli/index.js" for module "."  

NODE_PATH=path

Using this setting, you can add extra paths for the Node.js process to search modules in.

OPENSSL_CONF=file

Using this environment variable, you can load an OpenSSL configuration file on startup.

For a full list of supported environment variables, check out the official Node.js docs.

Let's Contribute to the CLI related Node Core Issues!

As you can see, the CLI is a really useful tool which becomes better with each Node version!

If you'd like to contribute to its advancement, you can help by checking out the currently open issues at https://github.com/nodejs/node/labels/cli !

Node.js Command Line Interface CLI Issues

Node.js War Stories: Debugging Issues in Production

In this article, you can read stories from Netflix, RisingStack & nearForm about Node.js issues in production - so you can learn from our mistakes and avoid repeating them. You'll also learn what methods we used to debug these Node.js issues.

Special shoutout to Yunong Xiao of Netflix, Matteo Collina of nearForm & Shubhra Kar from Strongloop for helping us with their insights for this post!


At RisingStack, we have accumulated a tremendous experience of running Node apps in production in the past 4 years - thanks to our Node.js consulting, training and development business.

As well as the Node teams at Netflix & nearForm we picked up the habit of always writing thorough postmortems, so the whole team (and now the whole world) could learn from the mistakes we made.

Netflix & Debugging Node: Know your Dependencies

Let's start with a slowdown story from Yunong Xiao, which happened with our friends at Netflix.

The trouble started with the Netflix team noticing that their applications response time increased progressively - some of their endpoints' latency increased with 10ms every hour.

This was also reflected in the growing CPU usage.

Netflix debugging Nodejs in production with the Request latency graph Request latencies for each region over time - photo credit: Netflix

At first, they started to investigate whether the request handler is responsible for slowing things down.

After testing it in isolation, it turned out that the request handler had a constant response time around 1ms.

So the problem was not that, and they started to suspect that probably it's deeper in the stack.

The next thing Yunong & the Netflix team tried are CPU flame graphs and Linux Perf Events.

Flame graph of Netflix Nodejs slowdown Flame graph or the Netflix slowdown - photo credit: Netflix

What you can see in the flame graph above is that

  • it has high stacks (which means lot of function calls)
  • and the boxes are wide (meaning we are spending quite some time in those functions).

After further inspection, the team found that Express's router.handle and router.handle.next has lots of references.

The Express.js source code reveals a couple of interesting tidbits:

  • Route handlers for all endpoints are stored in one global array.
  • Express.js recursively iterates through and invokes all handlers until it finds the right route handler.

Before revealing the solution of this mystery, we have to get one more detail:

Netflix's codebase contained a periodical code that ran every 6 minutes and grabbed new route configs from an external resource and updated the application's route handlers to reflect the changes.

This was done by deleting old handlers and adding new ones. Accidentally, it also added the same static handler all over again - even before the API route handlers. As it turned out, this caused the extra 10ms response time hourly.

Takeaways from Netflix's Issue

  • Always know your dependencies - first, you have to fully understand them before going into production with them.
  • Observability is key - flame graphs helped the Netflix engineering team to get to the bottom of the issue.

Read the full story here: Node.js in Flames.


Expert help when you need it the most

Commercial Node.js Support by RisingStack
Learn more


RisingStack CTO: "Crypto takes time"

You may have already heard to story of how we broke down the monolithic infrastructure of Trace (our Node.js monitoring solution) into microservices from our CTO, Peter Marton.

The issue we'll talk about now is a slowdown which affected Trace in production:

As the very first versions of Trace ran on a PaaS, it used the public cloud to communicate with other services of ours.

To ensure the integrity of our requests, we decided to sign all of them. To do so, we went with Joyent's HTTP signing library. What's really great about it, is that the request module supports HTTP signature out of the box.

This solution was not only expensive, but it also had a bad impact on our response times.

network delay in nodejs request visualized by trace The network delay built up our response times - photo: Trace

As you can see on the graph above, the given endpoint had a response time of 180ms, however from that amount, 100ms was just the network delay between the two services alone.

As the first step, we migrated from the PaaS provider to use Kubernetes. We expected that our response times would be a lot better, as we can leverage internal networking.

We were right - our latency improved.

However, we expected better results - and a lot bigger drop in our CPU usage. The next step was to do CPU profiling, just like the guys at Netflix:

crypto sign function taking up cpu time

As you can see on the screenshot, the crypto.sign function takes up most of the CPU time, by consuming 10ms on each request. To solve this, you have two options:

  • if you are running in a trusted environment, you can drop request signing,
  • if you are in an untrusted environment, you can scale up your machines to have stronger CPUs.

Takeaways from Peter Marton

  • Latency in-between your services has a huge impact on user experience - whenever you can, leverage internal networking.
  • Crypto can take a LOT of time.

nearForm: Don't block the Node.js Event Loop

React is more popular than ever. Developers use it for both the frontend and the backend, or they even take a step further and use it to build isomorphic JavaScript applications.

However, rendering React pages can put some heavy load on the CPU, as rendering complex React components is CPU bound.

When your Node.js process is rendering, it blocks the event loop because of its synchronous nature.

As a result, the server can become entirely unresponsive - requests accumulate, which all puts load on the CPU.

What can be even worse is that even those requests will be served which no longer have a client - still putting load on the Node.js application, as Matteo Collina of nearForm explains.

It is not just React, but string operations in general. If you are building JSON REST APIs, you should always pay attention to JSON.parse and JSON.stringify.

As Shubhra Kar from Strongloop (now Joyent) explained, parsing and stringifying huge payloads can take a lot of time as well (and blocking the event loop in the meantime).

function requestHandler(req, res) {  
  const body = req.rawBody
  let parsedBody
  try {
    parsedBody = JSON.parse(body)
  }
  catch(e) {
     res.end(new Error('Error parsing the body'))
  }
  res.end('Record successfully received')
}

Simple request handler

The example above shows a simple request handler, which just parses the body. For small payloads, it works like a charm - however, if the JSON's size can be measured in megabytes, the execution time can be seconds instead of milliseconds. The same applies for JSON.stringify.

To mitigate these issues, first, you have to know about them. For that, you can use Matteo's loopbench module, or Trace's event loop metrics feature.

With loopbench, you can return a status code of 503 to the load balancer, if the request cannot be fulfilled. To enable this feature, you have to use the instance.overLimit option. This way ELB or NGINX can retry it on a different backend, and the request may be served.

Once you know about the issue and understand it, you can start working on fixing it - you can do it either by leveraging Node.js streams or by tweaking the architecture you are using.

Takeaways from nearForm

  • Always pay attention to CPU bound operations - the more you have, to more pressure you put on your event loop.
  • String operations are CPU-heavy operations

Debugging Node.js Issues in Production

I hope these examples from Netflix, RisingStack & nearForm will help you to debug your Node.js apps in Production.

If you'd like to learn more, I recommend checking out these recent posts which will help you to deepen your Node knowledge:

If you have any questions, please let us know in the comments!

The Definitive Guide for Monitoring Node.js Applications

In the previous chapters of Node.js at Scale we learned how you can get Node.js testing and TDD right, and how you can use Nightwatch.js for end-to-end testing.

In this article, we will learn about running and monitoring Node.js applications in Production. Let's discuss these topics:

  • What is monitoring?
  • What should be monitored?
  • Open-source monitoring solutions
  • SaaS and On-premise monitoring offerings

What is Node.js Monitoring?

Monitoring is observing the quality of a software over time. The available products and tools we have in this industry usually go by the term Application Performance Monitoring or APM in short.

If you have a Node.js application in a staging or production environment, you can (and should) do monitoring on different levels:

You can monitor

  • regions,
  • zones,
  • individual servers and,
  • of course, the Node.js software that runs on them.

In this guide we will deal with the software components only, as if you run in a cloud environment, the others are taken care for you usually.

What should be monitored?

Each application you write in Node.js produces a lot of data about its' behavior.

There are different layers where an APM tool should collect data from. The more of them covered, the more insights you'll get about your system's behavior.

  • Service level
  • Host level
  • Instance (or process) level

The list you can find below collects the most crucial problems you'll run into while you maintain a Node.js application in production. We'll also discuss how monitoring helps to solve them and what kind of data you'll need to do so.

Problem 1.: Service Downtimes

If your application is unavailable, your customers can't spend money on your sites. If your API's are down, your business partners and services depending on them will fail as well because of you.

We all know how cringeworthy it is to apologize for service downtimes.

Your topmost priority should be preventing failures and providing 100% availability for your application.

Running a production app comes with great responsibility.

Node.js APM's can easily help you detecting and preventing downtimes, since they usually collect service level metrics.

This data can show if your application handles requests properly, although it won't always help to tell if your public sites or API's are available.

To have a proper coverage on downtimes, we recommend to set up a pinger as well which can emulate user behavior and provide foolproof data on availability. If you want to cover everything, don't forget to include different regions like the US, Europe and Asia too.

Problem 2.: Slow Services, Terrible Response Times

Slow response times have a huge impact on conversion rate, as well as on product usage. The faster your product is the more customers and user satisfaction you'll have.

Usually, all Node.js APM's can show if your services are slowing down, but interpreting that data requires further analysis.

I recommend doing two things to find the real reasons for slowing services.

  • Collect data on a process level too. Check out each instance of a service to figure out what happens under the hood.
  • Request CPU profiles when your services slow down and analyze them to find the faulty functions.

Eliminating performance bottlenecks enables you to scale your software more efficiently and also to optimize your budget.

Problem 3.: Solving Memory Leaks is Hard

Our Node.js Consulting & Development expertise allowed us to build huge enterprise systems and help developers making them better.

What we see constantly is that Memory Leaks in Node.js applications are quite frequent and that finding out what causes them is among the greatest struggles Node developers face.

This impression is backed with data as well. Our Node.js Developer Survey showed that Memory Leaks cause a lot of headache for even the best engineers.

To find memory leaks, you have to know exactly when they happen.

Some APM's collect memory usage data which can be used to recognize a leak. What you should look for is the steady growth of memory usage which ends up in a service crash & restart (as Node runs out of memory after 1,4 Gigabytes).

Node.js memory leak shown in Trace, the node.js monitoring tool

If your APM collects data on the Garbage Collector as well, you can look for the same pattern. As extra objects in a Node app's memory pile up, the time spent with Garbage Collection increases simultaneously. This is a great indicator of the Memory Leak.

After figuring out that you have a leak, request a memory heapdump and look for the extra objects!

This sounds easy in theory but can be challenging in practice.

What you can do is request 2 heapdumps from your production system with a Monitoring tool, and analyze these dumps with Chrome's DevTools. If you look for the extra objects in comparison mode, you'll end up seeing what piles up in your app's memory.

If you'd like a more detailed rundown on these steps, I wrote one article about finding a Node.js memory leak in Ghost, where I go into more details.

Problem 4.: Depending on Code Written by Anonymus

Most of the Node.js applications heavily rely on npm. We can end up with a lot of dependencies written by developers of unknown expertise and intentions.

Roughly 76% of Node shops use vulnerable packages, while open source projects regularly grow stale, neglecting to fix security flaws.

There are a couple of possible steps to lower the security risks of using npm packages.

  1. Audit your modules with the Node Security Platform CLI
  2. Look for unused dependencies with the depcheck tool
  3. Use the npm stats API, or browse historic stats on npm-stat.com to find out if others using a package
  4. Use the npm view <pkg> maintainers command to avoid packages maintained by only a few
  5. Use the npm outdated command or Greenkeeper to learn whether you're using the latest version of a package.

Going through these steps can consume a lot of your time, so picking a Node.js Monitoring Tool which can warn you about insecure dependencies is highly recommended.

Problem 6.: Email Alerts often go Unnoticed

Let's be honest. We are developers who like spending time writing code - not going through our email account every 10 minutes..

According to my experience, email alerts are usually unread and it's very easy to miss out on a major outage or problem if we depend only on them.

Email is a subpar method to learn about issues in production.

I guess that you also don't want to watch dashboards for potential issues 24/7. This is why it is important to look for an APM with great alerting capabilities.

What I recommend is to use pager systems like opsgenie or pagerduty to learn about critical issues. Pair up the monitoring solution of your choice with one of these systems if you'd like to know about your alerts instantly.

A few alerting best-practices we follow at RisingStack:

  • Always keep alerting simple and alert on symptoms
  • Aim to have as few alerts as possible - associated with end-user pain
  • Alert on high response time and error rates as high up in the stack as possible

Problem 7.: Finding Crucial Errors in the Code

If a feature is broken on your site, it can prevent customers from achieving their goals. Sometimes it can be a sign of bad code quality. Make sure you have proper test coverage for your codebase and a good QA process (preferably automated).

If you use an APM that collects errors from your app then you'll be able to find the ones which occur more often.

The more data your APM is accessing, better the chances of finding and fixing critical issues. We recommend to use a monitoring tool which collects and visualises stack traces as well - so you'll be able to find the root causes of errors in a distributed system.


In the next part of the article, I will show you one open-source, and one SaaS / on-premises Node.js monitoring solution that will help you operate your applications.

Prometheus - an Open-Source, General Purpose Monitoring Platform

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud.

Prometheus was started in 2012, and since then, many companies and organizations have adopted the tool. It is a standalone open source project and maintained independently of any company.

In 2016, Prometheus joined the Cloud Native Computing Foundation, right after Kubernetes.

The most important features of Prometheus are:

  • a multi-dimensional data model (time series identified by metric name and key/value pairs),
  • a flexible query language to leverage this dimensionality,
  • time series collection happens via a pull model over HTTP by default,
  • pushing time series is supported via an intermediary gateway.

Node.js monitoring with prometheus

As you could see from the previous features, Prometheus is a general purpose monitoring solution, so you can use it with any language or technology you prefer.

Check out the official Prometheus getting started pages if you'd like to give it a try.

Before you start monitoring your Node.js services, you need to add instrumentation to them via one of the Prometheus client libraries.

For this, there is a Node.js client module, which you can find here. It supports histograms, summaries, gauges and counters.

Essentially, all you have to do is require the Prometheus client, then expose its output at an endpoint:

const Prometheus = require('prom-client')  
const server = require('express')()

server.get('/metrics', (req, res) => {  
  res.end(Prometheus.register.metrics())
})

server.listen(process.env.PORT || 3000)  

This endpoint will produce an output, that Prometheus can consume - something like this:

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1490433285  
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 33046528  
# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0.000089751  
# HELP nodejs_active_handles_total Number of active handles.
# TYPE nodejs_active_handles_total gauge
nodejs_active_handles_total 4  
# HELP nodejs_active_requests_total Number of active requests.
# TYPE nodejs_active_requests_total gauge
nodejs_active_requests_total 0  
# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v4.4.2",major="4",minor="4",patch="2"} 1  

Of course, these are just the default metrics which were collected by the module we have used - you can extend it with yours. In the example below we collect the number of requests served:

const Prometheus = require('prom-client')  
const server = require('express')()

const PrometheusMetrics = {  
  requestCounter: new Prometheus.Counter('throughput', 'The number of requests served')
}

server.use((req, res, next) => {  
  PrometheusMetrics.requestCounter.inc()
  next()
})

server.get('/metrics', (req, res) => {  
  res.end(Prometheus.register.metrics())
})

server.listen(3000)  

Once you run it, the /metrics endpoint will include the throughput metrics as well:

# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1490433805  
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 25120768  
# HELP nodejs_eventloop_lag_seconds Lag of event loop in seconds.
# TYPE nodejs_eventloop_lag_seconds gauge
nodejs_eventloop_lag_seconds 0.144927586  
# HELP nodejs_active_handles_total Number of active handles.
# TYPE nodejs_active_handles_total gauge
nodejs_active_handles_total 0  
# HELP nodejs_active_requests_total Number of active requests.
# TYPE nodejs_active_requests_total gauge
nodejs_active_requests_total 0  
# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v4.4.2",major="4",minor="4",patch="2"} 1  
# HELP throughput The number of requests served
# TYPE throughput counter
throughput 5  

Once you have exposed all the metrics you have, you can start querying and visualizing them - for that, please refer to the official Prometheus query documentation and the vizualization documentation.

As you can imagine, instrumenting your codebase can take quite some time - since you have to create your dashboard and alerts to make sense of the data. While sometimes these solutions can provide greater flexibility for your use-case than hosted solutions, it can take months to implement them & then you have to deal with operating them as well.

If you have the time to dig deep into the topic, you'll be fine with it.

Meet Trace - our SaaS, and On-premises Node.js Monitoring Tool

As we just discussed, running your own solution requires domain knowledge, as well as expertise on how to do proper monitoring. You have to figure out what aggregation to use for what kind of metrics, and so on..

This is why it can make a lot of sense to go with a hosted monitoring solution - whether it is a SaaS product or an on-premises offering.

At RisingStack, we are developing our own Node.js Monitoring Solution, called Trace. We built all the experience into Trace which we gained through the years of providing professional Node services.

What's nice about Trace, is that you have all the metrics you need with only adding a single line of code to your application - so it really takes only a few seconds to get started.

require([email protected]/trace')  

After this, the Trace collector automatically gathers your application's performance data and visualizes it for you in an easy to understand way.

Just a few things Trace is capable to do with your production Node app:

  1. Send alerts about Downtimes, Slow services & Bad Status Codes.
  2. Ping your websites and API's with an external service + show APDEX metrics.
  3. Collect data on service, host and instance levels as well.
  4. Automatically create a (10 second-long) CPU profile in a production environment in case of a slowdown.
  5. Collect data on memory consumption and garbage collection.
  6. Create memory heapdumps automatically in case of a Memory Leak in production.
  7. Show errors and stack traces from your application.
  8. Visualize whole transaction call-chains in a distributed system.
  9. Show how your services communicate with each other on a live map.
  10. Automatically detect npm packages with security vulnerabilities.
  11. Mark new deployments and measure their effectiveness.
  12. Integrate with Slack, Pagerduty, and Opsgenie - so you'll never miss an alert.

Although Trace is currently a SaaS solution, we'll make an on-premises version available as well soon.

It will be able to do exactly the same as the cloud version, but it will run on Amazon VPC or in your own datacenter. If you're interested in it, let's talk!

Summary

I hope that in this chapter of Node.js at Scale I was able to give useful advice about monitoring your Node.js application. In the next article, you will learn how to debug Node.js applications in an easy way.

Node.js End-to-End Testing with Nightwatch.js

In this article, we are going to take a look at how you can do end-to-end testing with Node.js, using Nightwatch.js, a Node.js powered end-to-end testing framework.

In the previous chapter of Node.js at Scale, we discussed Node.js Testing and Getting TDD Right. If you did not read that article, or if you are unfamiliar with unit testing and TDD (test-driven development), I recommend checking that out before continuing with this article.

What is Node.js end-to-end testing?

Before jumping into example codes and learning to implement end-to-end testing for a Node.js project, it's worth exploring what end-to-end tests really are.

First of all, end-to-end testing is part of the black-box testing toolbox. This means that as a test writer, you are examining functionality without any knowledge of internal implementation. So without seeing any source code.

Secondly, end-to-end testing can also be used as user acceptance testing, or UAT. UAT is the process of verifying that the solution actually works for the user. This process is not focusing on finding small typos, but issues that can crash the system, or make it dysfunctional for the user.

Enter Nightwatch.js

Nightwatch.js enables you to "write end-to-end tests in Node.js quickly and effortlessly that run against a Selenium/WebDriver server".

Nightwatch is shipped with the following features:

  • a built-in test runner,
  • can control the selenium server,
  • support for hosted selenium providers, like BrowserStack or SauceLabs,
  • CSS and Xpath selectors.

Installing Nightwatch

To run Nightwatch locally, we have to do a little bit of extra work - we will need a standalone Selenium server locally, as well as a webdriver, so we can use Chrome/Firefox to test our applications locally.

With these three tools, we are going to implement the flow this diagram shows below.

node.js end-to-end testing with nightwatch.js flowchart Photo credit: nightwatchjs.org

STEP 1: Add Nightwatch

You can add Nightwatch to your project simply by running npm install nightwatch --save-dev.

This places the Nightwatch executable in your ./node_modules/.bin folder, so you don't have to install it globally.

STEP 2: Download Selenium

Selenium is a suite of tools to automate web browsers across many platforms.

Prerequisite: make sure you have JDK installed, with at least version 7. If you don't have it, you can grab it from here.

The Selenium server is a Java application which is used by Nightwatch to connect to various browsers. You can download the binary from here.

Once you have downloaded the JAR file, create a bin folder inside your project, and place it there. We will set up Nightwatch to use it, so you don't have to manually start the Selenium server.

STEP 3: Download Chromedriver

ChromeDriver is a standalone server which implements the W3C WebDriver wire protocol for Chromium.

To grab the executable, head over to the downloads section, and place it to the same bin folder.

STEP 4: Configuring Nightwatch.js

The basic Nightwatch configuration happens through a json configuration file.

Let's create a nightwatch.json file, and fill it with:

{
  "src_folders" : ["tests"],
  "output_folder" : "reports",

  "selenium" : {
    "start_process" : true,
    "server_path" : "./bin/selenium-server-standalone-3.3.1.jar",
    "log_path" : "",
    "port" : 4444,
    "cli_args" : {
      "webdriver.chrome.driver" : "./bin/chromedriver"
    }
  },

  "test_settings" : {
    "default" : {
      "launch_url" : "http://localhost",
      "selenium_port"  : 4444,
      "selenium_host"  : "localhost",
      "desiredCapabilities": {
        "browserName": "chrome",
        "javascriptEnabled": true,
        "acceptSslCerts": true
      }
    }
  }
}

With this configuration file, we told Nightwatch where can it find the binary of the Selenium server and the Chromedriver, as well as the location of the tests we want to run.


You shouldn't rely only on e2e testing for QA. Trace helps you to find all issues before your users do.

Node.js monitoring & debugging from the experts of RisingStack
Learn more


Quick Recap

So far, we have installed Nightwatch, downloaded the standalone Selenium server, as well as the Chromedriver. With these steps, you have all the necessary tools to create end-to-end tests using Node.js and Selenium.

Writing your first Nightwatch Test

Let's add a new file in the tests folder, called homepage.js.

We are going to take the example from the Nightwatch getting started guide. Our test script will go to Google, search for Rembrandt, and check the Wikipedia page:

module.exports = {  
  'Demo test Google' : function (client) {
    client
      .url('http://www.google.com')
      .waitForElementVisible('body', 1000)
      .assert.title('Google')
      .assert.visible('input[type=text]')
      .setValue('input[type=text]', 'rembrandt van rijn')
      .waitForElementVisible('button[name=btnG]', 1000)
      .click('button[name=btnG]')
      .pause(1000)
      .assert.containsText('ol#rso li:first-child',
        'Rembrandt - Wikipedia')
      .end()
  }
}

The only thing left to do is to run Nightwatch itself! For that, I recommend adding a new script into our package.json's scripts section:

"scripts": {
  "test-e2e": "nightwatch"
}

The very last thing you have to do is to run the tests using this command:

npm run test-e2e  

If everything goes well, your test will open up Chrome, then Google and Wikipedia.

Nightwatch.js in Your Project

Now as you understood what end-to-end testing is, and how you can set up Nightwatch, it is time to start adding it to your project.

For that, you have to consider some aspects - but please note, that there are no silver bullets here. Depending on your business needs, you may answer the following questions differently:

  • Where should I run? On staging? On production? When don I build my containers?
  • What are the test scenarios I want to test?
  • When and who should write end-to-end tests?

Summary & Next Up

In this chapter of Node.js at Scale we have learned:

  • how to set up Nightwatch,
  • how to configure it to use a standalone Selenium server,
  • and how to write basic end-to-end tests.

In the next chapter, we are going to explore how you can monitor production Node.js infrastructures.

Digital Transformation with the Node.js Stack

In this article, we explore the 9 main areas of digital transformation and show what are the benefits of implementing Node.js. At the end, we’ll lay out a Digital Transformation Roadmap to help you get started with this process.

Note, that implementing Node.js is not the goal of a digital transformation project - it is just a great tool that opens up possibilities that any organization can take advantage of.


Digital transformation is achieved by using modern technology to radically improve the performance of business processes, applications, or even whole enterprises.

One of the available technologies that enable companies to go through a major performance shift is Node.js and its ecosystem. It is a tool that grants improvement opportunities that organizations should take advantage of:

  • Increased developer productivity,
  • DevOps or NoOps practices,
  • and shipping software to production in brief time using the proxy approach,

just to mention a few.

The 9 Areas of Digital Transformation

Digital Transformation projects can improve a company in nine main areas. The following elements were identified as a result of an MIT research on digital transformation, where they interviewed 157 executives from 50 companies (typically $1 billion or more in annual sales).

#1. Understanding your Customers Better

Companies are heavily investing in systems to understand specific market segments and geographies better. They have to figure out what leads to customer happiness and customer dissatisfaction.

Many enterprises are building analytics capabilities to better understand their customers. Information derived this way can be used for data-driven decisions.

#2. Achieving Top-Line Growth

Digital transformation can also be used to enhance in-person sales conversations. Instead of paper-based presentations or slides, salespeople can use great looking, interactive presentations, like tablet-based presentations.

Understanding customers better helps enterprises to transform and improve the sales experience with more personalized sales and customer service.

#3. Building Better Customer Touch Points

Customer service can be improved tremendously with new digital services. For example, by introducing new channels for the communication. Instead of going to a local branch of a business, customers can talk to support through Twitter of Facebook.

Self-service digital tools can be developed which both save time for the customer while saving money for the company.

#4. Process Digitization

With automation companies can focus their employees on more strategic tasks, innovation or creativity rather than repetitive efforts.

#5. Worker Enablement

Virtualization of individual work (the work process is separated from the location of work) have become enablers for knowledge sharing. Information and expertise is accessible in real-time for frontline employees.

#6. Data-Driven Performance Management

With the proper analytical capabilities, decisions can be made on real data and not on assumptions.

Digital transformation is changing the way how strategic decisions are made. With new tools strategic planning sessions can include more stakeholders, not just a small group.

#7. Digitally Extended Businesses

Many companies extend their physical offerings with digital ones. Examples include:

  • news outlets augmenting their print offering with digital content,
  • FMCG companies extending to e-commerce.

#8. New Digital Businesses

Companies not just extend their current offerings with digital transformation, but also coming up with new digital products that complement the traditional ones. Examples may include connected devices, like GPS trackers that can now report activity to the cloud and provide value to the customers through recommendations.

#9. Digital Globalization

Global shared services, like shared finance or HR enable organizations to build truly global operations.


The Digital Transformation Benefits of Implementing Node.js

Organizations are looking for the most effective way of digital transformation - among these companies, Node.js is becoming the de facto technology for building out digital capabilities.

Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient.

In other words: Node.js offers you the possibility to write servers using JavaScript with an incredible performance.

Increased Developer Productivity with Same-Language Stacks

When PayPal started using Node.js, they reported an 2x increase in productivity compared to the previous Java stack. How is that even possible?

NPM, the Node.js package manager, has an incredible amount of modules that can be used instantly. This saves a lot of development effort for the development team.

Secondly, as Node.js applications are written using JavaScript, front-end developers can also easily understand what's going on and make necessary changes.

This saves you valuable time again as developers will use the same language on the entire stack.

100% Business Availability, even with Extreme Load

Around 1.5 billion dollars are being spent online in the US on a single day on Black Friday, each year.

It is crucial that your site can keep up with the traffic. This is why Walmart, one of the biggest retailers is using Node.js to serve 500 million pageviews on Black Friday, without a hitch.

Fast Apps = Satisfied Customers

As your velocity increases because of the productivity gains, you can ship features/products sooner. Products that run faster result in better user experience.

Kissmetric's study showed that 40% of people abandon a website that takes more than 3 seconds to load, and 47% of consumers expect a web page to load in 2 seconds or less.

To read more on the benefits of using Node.js, you can download our Node.js is Enterprise Ready ebook.


Your Digital Transformation Roadmap with Node.js

As with most new technologies introduced to a company, it’s worth taking baby-steps first with Node.js as well. As a short framework for introducing Node.js, we recommend the following steps:

  • building a Node.js core team,
  • picking a small part of the application to be rewritten/extended using Node.js,
  • extending the scope of the project to the whole organization.

Step 1 - Building your Node.js Core Team

The core Node.js team will consist of people with JavaScript experience for both the backend and the frontend. It’s not crucial that the backend engineers have any Node.js experience, the important aspect is the vision they bring to the team.

Introducing Node.js is not just about JavaScript - it has to include members of the operations team as well, joining the core team.

The introduction of Node.js to an organization does not stop at excelling Node.js - it also means adding modern DevOps or NoOps practices, including but not limited to continuous integration and delivery.

Step 2 - Embracing The Proxy Approach

To incrementally replace old systems or to extend their functionality easily, your team can use the proxy approach.

For the features or functions you want to replace, create a small and simple Node.js application and proxy some of your load to the newly built Node.js application. This proxy does not necessarily have to be written in Node.js. With this approach, you can easily benefit from modularized, service-oriented architecture.

Another way to use proxies is to write them in Node.js and make them to talk with the legacy systems. This way you have the option to optimize the data sent being sent. PayPal was one of the first adopter of Node.js at scale, and they started with this proxy approach as well.

The biggest advantages of these solutions are that you can put Node.js into production in a short amount of time, measure your results, and learn from them.

Step 3 - Measure Node.js, Be Data-Driven

For the successful introduction of Node.js during a digital transformation project, it is crucial to set up a series of benchmarks to compare the results between the legacy system and the new Node.js applications. These data points can be response times, throughput or memory and CPU usage.

Orchestrating The Node.js Stack

As mentioned previously, introducing Node.js does not stop at excelling Node.js itself, but introducing continuous integration and delivery are crucial points as well.

Also, from an operations point of view, it is important to add containers to ship applications with confidence.

For orchestration, to operate the containers containing the Node.js applications we encourage companies to adopt Kubernetes, an open-source system for automating deployment, scaling, and management of containerized applications.

RisingStack and Digital Transformation with Node.js

RisingStack enables amazing companies to succeed with Node.js and related technologies to stay ahead of the competition. We provide professional Node.js development and consulting services from the early days of Node.js, and help companies like Lufthansa or Cisco to thrive with this technology.

10 Best Practices for Writing Node.js REST APIs

In this article we cover best practices for writing Node.js REST APIs, including topics like naming your routes, authentication, black-box testing & using proper cache headers for these resources.

One of the most popular use-cases for Node.js is to write RESTful APIs using it. Still, while we help our customers to find issues in their applications with Trace, our Node.js monitoring tool we constantly experience that developers have a lot of problems with REST APIs.

I hope these best-practices we use at RisingStack can help:

#1 - Use HTTP Methods & API Routes

Imagine, that you are building a Node.js RESTful API for creating, updating, retrieving or deleting users. For these operations HTTP already has the adequate toolset: POST, PUT, GET, PATCH or DELETE.

As a best practice, your API routes should always use nouns as resource identifiers. Speaking of the user's resources, the routing can look like this:

  • POST /user or PUT /user:/id to create a new user,
  • GET /user to retrieve a list of users,
  • GET /user/:id to retrieve a user,
  • PATCH /user/:id to modify an existing user record,
  • DELETE /user/:id to remove a user.

#2 - Use HTTP Status Codes Correctly

If something goes wrong while serving a request, you must set the correct status code for that in the response:

  • 2xx, if everything was okay,
  • 3xx, if the resource was moved,
  • 4xx, if the request cannot be fulfilled because of a client error (like requesting a resource that does not exist),
  • 5xx, if something went wrong on the API side (like an exception happened).

If you are using Express, setting the status code is as easy as res.status(500).send({error: 'Internal server error happened'}). Similarly with Restify: res.status(201).

For a full list, check the list of HTTP status codes

#3 - Use HTTP headers to Send Metadata

To attach metadata about the payload you are about to send, use HTTP headers. Headers like this can be information on:

  • pagination,
  • rate limiting,
  • or authentication.

A list of standardized HTTP headers can be found here.

If you need to set any custom metadata in your headers, it was a best practice to prefix them with X. For example, if you were using CSRF tokens, it was a common (but non-standard) way to name them X-Csrf-Token. However with RFC 6648 they got deprecated. New APIs should make their best effort to not use header names that can conflict with other applications. For example, OpenStack prefixes its headers with OpenStack:

OpenStack-Identity-Account-ID  
OpenStack-Networking-Host-Name  
OpenStack-Object-Storage-Policy  

Note that the HTTP standard does not define any size limit on the headers; however, Node.js (as of writing this article) imposes an 80KB size limit on the headers object for practical reasons.

" Don't allow the total size of the HTTP headers (including the status line) to exceed HTTP_MAX_HEADER_SIZE. This check is here to protect embedders against denial-of-service attacks where the attacker feeds us a never-ending header that the embedder keeps buffering."

From the Node.js HTTP parser

#4 - Pick the right framework for your Node.js REST API

It is important to pick the framework that suits your use-case the most.

Express, Koa or Hapi

Express, Koa and Hapi can be used to create browser applications, and as such, they support templating and rendering - just to name a few features. If your application needs to provide the user-facing side as well, it makes sense to go for them.

Restify

On the other hand, Restify is focusing on helping you build REST services. It exists to let you build "strict" API services that are maintainable and observable. Restify also comes with automatic DTrace support for all your handlers.

Restify is used in production in major applications like npm or Netflix.

#5 - Black-Box Test your Node.js REST APIs

One of the best ways to test your REST APIs is to treat them as black boxes.

Black-box testing is a method of testing where the functionality of an application is examined without the knowledge of its internal structures or workings. So none of the dependencies are mocked or stubbed, but the system is tested as a whole.

One of the modules that can help you with black-box testing Node.js REST APIs is supertest.

A simple test case which checks if a user is returned using the test runner mocha can be implemented like this:

const request = require('supertest')

describe('GET /user/:id', function() {  
  it('returns a user', function() {
    // newer mocha versions accepts promises as well
    return request(app)
      .get('/user')
      .set('Accept', 'application/json')
      .expect(200, {
        id: '1',
        name: 'John Math'
      }, done)
  })
})

You may ask: how does the data gets populated into the database which serves the REST API?

In general, it is a good approach to write your tests in a way that they make as few assumptions about the state of the system as possible. Still, in some scenarios you can find yourself in a spot when you need to know what is the state of the system exactly, so you can make assertions and achieve higher test coverage.

So based on your needs, you can populate the database with test data in one of the following ways:

  • run your black-box test scenarios on a known subset of production data,
  • populate the database with crafted data before the test cases are run.

Of course, black-box testing does not mean that you don't have to do unit testing, you still have to write unit tests for your APIs.


Node.js Monitoring and Debugging from the Experts of RisingStack

Improve your REST APIs with Trace
Learn more


#6 - Do JWT-Based, Stateless Authentication

As your REST APIs must be stateless, so does your authentication layer. For this, JWT (JSON Web Token) is ideal.

JWT consists of three parts:

  • Header, containing the type of the token and the hashing algorithm
  • Payload, containing the claims
  • Signature (JWT does not encrypt the payload, just signs it!)

Adding JWT-based authentication to your application is very straightforward:

const koa = require('koa')  
const jwt = require('koa-jwt')

const app = koa()

app.use(jwt({  
  secret: 'very-secret' 
}))

// Protected middleware
app.use(function *(){  
  // content of the token will be available on this.state.user
  this.body = {
    secret: '42'
  }
})

After that, the API endpoints are protected with JWT. To access the protected endpoints, you have to provide the token in the Authorization header field.

curl --header "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ" my-website.com  

One thing that you could notice is that the JWT module does not depend on any database layer. This is the case because all JWT tokens can be verified on their own, and they can also contain time to live values.

Also, you always have to make sure that all your API endpoints are only accessible through a secure connection using HTTPS.

In a previous article, we explained web authentication methods in details - I recommend to check it out!

#7 - Use Conditional Requests

Conditional requests are HTTP requests which are executed differently depending on specific HTTP headers. You can think of these headers as preconditions: if they are met, the requests will be executed in a different way.

These headers try to check whether a version of a resource stored on the server matches a given version of the same resource. Because of this reason, these headers can be:

  • the timestamp of the last modification,
  • or an entity tag, which differs for each version.

These headers are:

  • Last-Modified (to indicate when the resource was last modified),
  • Etag (to indicate the entity tag),
  • If-Modified-Since (used with the Last-Modified header),
  • If-None-Match (used with the Etag header),

Let's take a look at an example!

The client below did not have any previous versions of the doc resource, so neither the If-Modified-Since, nor the If-None-Match header was applied when the resource was sent. Then, the server responds with the Etag and Last-Modified headers properly set.

Node.js RESTfu API with conditional request, without previous versions

From the MDN Conditional request documentation

The client can set the If-Modified-Since and If-None-Match headers once it tries to request the same resource - since it has a version now. If the response would be the same, the server simply responds with the 304 - Not Modified status and does not send the resource again.

Node.js RESTfu API with conditional request, with previous versions

From the MDN Conditional request documentation

#8 - Embrace Rate Limiting

Rate limiting is used to control how many requests a given consumer can send to the API.

To tell your API users how many requests they have left, set the following headers:

  • X-Rate-Limit-Limit, the number of requests allowed in a given time interval
  • X-Rate-Limit-Remaining, the number of requests remaining in the same interval,
  • X-Rate-Limit-Reset, the time when the rate limit will be reset.

Most HTTP frameworks support it out of the box (or with plugins). For example, if you are using Koa, there is the koa-ratelimit package.

Note, that the time window can vary based on different API providers - for example, GitHub uses an hour for that, while Twitter 15 minutes.

#9 - Create a Proper API Documentation

You write APIs so others can use them, benefit from them. Providing an API documentation for your Node.js REST APIs are crucial.

The following open-source projects can help you with creating documentation for your APIs:

Alternatively, if you want to use a hosted products, you can go for Apiary.

#10 - Don't Miss The Future of APIs

In the past years, two major query languages for APIs arose - namely GraphQL from Facebook and Falcor from Netflix. But why do we even need them?

Imagine the following RESTful resource request:

/org/1/space/2/docs/1/collaborators?include=email&page=1&limit=10

This can get out of hand quite easily - as you'd like to get the same response format for all your models all the time. This is where GraphQL and Falcor can help.

About GraphQL

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools. - Read more here.

About Falcor

Falcor is the innovative data platform that powers the Netflix UIs. Falcor allows you to model all your backend data as a single Virtual JSON object on your Node server. On the client, you work with your remote JSON object using familiar JavaScript operations like get, set, and call. If you know your data, you know your API. - Read more here.

Amazing REST APIs for Inspiration

If you are about to start developing a Node.js REST API or creating a new version of an older one, we have collected four real-life examples that are worth checking out:

I hope that now you have a better understanding of how APIs should be written using Node.js. Please let me know in the comments if you miss anything!