Skip to content

Instantly share code, notes, and snippets.

@tothandras
Last active August 7, 2017 14:05
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save tothandras/bc06910063015b8891ff0fbb909c68c2 to your computer and use it in GitHub Desktop.
Save tothandras/bc06910063015b8891ff0fbb909c68c2 to your computer and use it in GitHub Desktop.
Logging

Logging

Logs are an important source to get insights what's happening when your code runs. It can help to understand how the application works. It can also help to discover errors, but there is no one need to look at this output immediately, it's available for later analysis if needed. Although, it is possible to set up alerts based on logs.

Log format

Here we need to ask ourself, what kind of information can help us to find issues and to understand the call chain, data flow in an application?

  • timestamp: when the given event happened, it helps to establish a timeline of events
  • name or location: what is the source of the log
  • message: a short description of the event
  • context, metadata: can be an Error and/or an Object holding additional data of the context (user id, query parameters, request body, etc.). KEEP IN MIND: all secrets, confidential and private data should be omitted (password, cookie header, authentication token, bank card information, etc.). These information must be masked out:
    const logContext = _.merge({}, data, { user: { password: 'OMITTED' } })
  • one event per line (except: stack traces)

The format of the logs is a JSON string following Stackdrivers LogEntry format:

{
  "logName": string,
  "resource": {
    object(MonitoredResource)
  },
  "timestamp": string,
  "receiveTimestamp": string,
  "severity": enum(LogSeverity),
  "insertId": string,
  "httpRequest": {
    object(HttpRequest)
  },
  "labels": {
    string: string,
    ...
  },
  "operation": {
    object(LogEntryOperation)
  },
  "trace": string,
  "sourceLocation": {
    object(LogEntrySourceLocation)
  },

  // Union field payload can be only one of the following:
  "protoPayload": {
    "@type": string,
    field1: ...,
    ...
  },
  "textPayload": string,
  "jsonPayload": {
    object
  },
  // End of list of possible types for union field payload.
}

Multiple log levels

One of the most important feature for any logger is the ability to distinguish logs based on their importance. It also enables us to configure what severity to log out in different environments. In staging or production environment only info severity and above should be turned on. Check out Stackdriver's documentation for more details.

  • emergency: The system is unusable, person must take an action immediately. Examples:

    • failed to start server on the given port
    • failed to start server due to missing environment variables or bad configuration
  • alert: A person must take an action immediately. The system is in distress, customers are probably being affected (or will soon be). Examples:

  • critical: Critical events cause more severe problems or outages and it probably requires human intervention. Examples:

    • the server can't handle load
    • database is unreachable
  • error: Error events are likely to cause problems, an unexpected technical or business event happened. Examples:

    • 5xx internal server error and its cause
    • most of errors in try/catch blocks
  • warning: Warning events might cause problems. Examples:

    • use of deprecated APIs
    • poor use of API
  • notice: Normal but significant events, such as start up, shut down, or a configuration change. Examples:

    • system life cycle events (server is listening on a port, received kill signal, etc.)
  • info: Routine information, such as ongoing status or performance. For important information, things we want to see at high volume in case we need to analyse an issue. Examples:

    • session life cycle events (login, logout)
    • significant boundary events (database calls, remote API calls, etc.)
  • debug: For not very important, but potentially helpful events in tracking the flow through the system and isolating issues. It should be only turned on when developing locally or for a short period of time. Examples:

    • HTTP requests, responses
    • messages received on a queue
    • states of some data being processed
    • entry/exit of non-trivial methods and decision points

Destination of logs

Although, many libraries (including winston) support multiple transports, the application should always use standard output as the target. It's not the application's responsibility to route or to save logs. Each running process writes its event stream, unbuffered, to stdout.

During local development, the developer will view this stream in the foreground of their terminal to observe the application's behaviour.

In staging or production deployments, each process' stream will be captured by the execution environment, collated together with all other streams from the app, and routed to one or more final destinations for viewing and long-term archival. These archival destinations are not visible to or configurable by the app, and instead are completely managed by the execution environment.

Node.js packages and examples

A multi-transport async logging library for Node.js

$ npm install -S winston
$ yarn add winston

Configure winston

Getting the logger level from environment variable and use a custom formatter:

'use strict'

const joi = require('joi')
const winston = require('winston')
const fp = require('lodash/fp')

const envVarsSchema = joi.object({
  NODE_ENV: joi.string()
    .allow(['development', 'production', 'test'])
    .default('production'),
  LOGGER_LEVEL: joi.string()
    .allow(['emergency', 'alert', 'critical', 'error', 'warning', 'notice', 'info', 'debug'])
    .when('NODE_ENV', {
      is: 'development',
      then: joi.default('debug'),
    })
    .when('NODE_ENV', {
      is: 'production',
      then: joi.default('info'),
    })
    .when('NODE_ENV', {
      is: 'test',
      then: joi.default('critical'),
    }),
})
  .unknown().required()

const { value: envVars, error } = joi.validate(process.env, envVarsSchema, { abortEarly: false })
if (error) {
  // Don't expose environment variables on error
  // eslint-disable-next-line no-underscore-dangle
  delete error._object
  throw error
}

const config = {
  env: envVars.NODE_ENV,
  logger: {
    level: envVars.LOGGER_LEVEL,
    transports: [
      new (winston.transports.Console)({
        formatter(options) {
          const metadata = options.meta || {}
          const entryFields = ['httpRequest', 'labels', 'operation', 'sourceLocation']

          const entryFieldsInMetadata = fp.pick(entryFields, metadata)
          const metadataFields = fp.omit(entryFields, metadata)

          const entry = Object.assign({
            severity: options.level.toUpperCase(),
            timestamp: new Date().toISOString(),
            jsonPayload: {
              message: options.message,
              metadata: metadataFields,
            },
          }, entryFieldsInMetadata)

          return JSON.stringify(entry).trim()
        },
      }),
    ],
    levels: {
      emergency: 0,
      alert: 1,
      critical: 2,
      error: 3,
      warning: 4,
      notice: 5,
      info: 6,
      debug: 7,
    },
    colors: {
      emergency: 'red',
      alert: 'yellow',
      critical: 'red',
      error: 'red',
      warning: 'yellow',
      notice: 'cyan',
      info: 'green',
      debug: 'blue',
    },
  },
}

module.exports = config

Configure the global winston logger:

'use strict'

const winston = require('winston')
const config = require('./config')

winston.configure(config.logger)
// FIXME (andras): changing levels in `configure` is broken (https://github.com/winstonjs/winston/issues/954)
winston.setLevels(config.logger.levels)
winston.addColors(config.logger.colors)

It is also possible to create a logger instance:

'use strict'

const winston = require('winston')
const config = require('./config')

const logger = new winston.Logger(config.logger)
module.exports = logger

Set up logging for ERROR events:

'use strict'

const logger = require('winston')

// It is not safe to resume normal operation after 'uncaughtException'.
// Read more: https://nodejs.org/api/process.html#process_event_uncaughtexception
process.on('uncaughtException', (error) => {
  logger.alert('Uncaught exception', error)
  process.exit(1)
})

// A Promise is rejected and no error handler is attached.
// Read more: https://nodejs.org/api/process.html#process_event_unhandledrejection
process.on('unhandledRejection', (reason, promise) => {
  logger.critical('Unhandled promise rejection', { reason, promise })
})

Examples

Work in progress...

'use strict'

const { promisify } = require('util')
const logger = require('winston')
const config = require('./config')
const server = require('./server')

const serverListen = promisify(server.listen).bind(server)

serverListen(config.port)
  .then(() => {
    const { port } = server.address()
    logger.notice(`Server is listening on port ${port}`)
  })

server.on('error', (err) => {
  logger.error('Server error happened during start', err)
  process.exit(1)
})

Super fast, all natural json logger

Work in progress...

$ npm install -S pino
$ yarn add pino
@alghimo
Copy link

alghimo commented Aug 1, 2017

Looks cool :) Some comments / suggestions:

  • We could add ALERT level or similar. This doesn't need to be an error per se, but something we might want to monitor or take action on.
  • Add FATAL level: Any error that presumably will lead the application to die. For instance, disk is full (and your app is using disk)
  • For ERROR level:
    • Anything that will prevent the expected response to be generated, but will allow the application to keep running
  • For WARN level:
    • 5xx internal server error and its cause: This should be an ERROR, not a warning.

@tothandras
Copy link
Author

tothandras commented Aug 1, 2017

Thanks for the input! Yes, I agree, we should consider splitting ERROR into more fine grained levels. The severity names can be changed as well. Basically what I see:

  • my ERROR level is your FATAL/ALERT
  • my WARN level is your ERROR

@alghimo
Copy link

alghimo commented Aug 1, 2017

Indeed, my levels are probably shifted by one level, although for me calling a deprecated API, for instance, is still a warning. I guess for me the line between WARN and ERROR is wether the system will respond or not (that's why I would put the 5xx as ERROR).
Maybe one thing to also look at is how applications / libraries that we use but do not maintain are logging. In my case (Spark / Hadoop), you can just ignore pretty much all of the WARN messages unless you need to debug.

@tothandras
Copy link
Author

@alghimo Okay! I'll update the list to follow Stackdriver's log levels. It's more fine grained.
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment