This format is intended to strike a balance between human and machine readability.
It aims to codify existing logging conventions and nail down some formerly thorny or ambiguous cases, while leaning on a familiar data model (JSON).
Summary: mostly JSON, with a couple of restrictions, a couple of extensions, and a couple of changes.
- A log message is a JSON object, omitting the enclosing curly braces.
- Strings can appear unquoted if they look like C identifiers (only alphanumeric characters and underscore; can't begin with a digit). Note that the encodings of values null, true, and false also fit this description, which means that if you want the string “null” you have to encode it with quotes (as "null").
- We use equals instead of colon, and leave out the commas.
- There is an RFC 3339 timestamp literal format with optional nanoseconds. These values cannot be unambiguously represented in JSON (where the best you can do is pun them with strings).
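The rules in the list above can be sketched as a small encoder. This is a minimal illustration in Python, not a reference implementation: the names `encode_line`, `encode_value`, and `encode_string` are made up for this example, and it assumes that object keys follow the same quoting rule as string values and that timestamps arrive as timezone-aware `datetime` values (which carry microseconds, not nanoseconds).

```python
import json
import re
from datetime import datetime, timedelta, timezone

# A bare string must look like a C identifier...
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
# ...and must not collide with the encodings of null, true, and false.
RESERVED = {"null", "true", "false"}


def encode_string(s):
    """Emit a string bare when it looks like an identifier, else JSON-quoted."""
    if IDENT.match(s) and s not in RESERVED:
        return s
    return json.dumps(s)


def encode_value(v):
    """Encode one value; nested objects keep their braces, arrays their brackets."""
    if v is None:
        return "null"
    if isinstance(v, bool):  # test bool before int: bool is a subclass of int
        return "true" if v else "false"
    if isinstance(v, (int, float)):
        return json.dumps(v)
    if isinstance(v, str):
        return encode_string(v)
    if isinstance(v, datetime):
        if v.tzinfo is None:
            raise ValueError("timestamp needs a UTC offset")
        return v.isoformat()  # assumption: emitted as an unquoted literal
    if isinstance(v, (list, tuple)):
        return "[" + " ".join(encode_value(x) for x in v) + "]"
    if isinstance(v, dict):
        return "{" + encode_pairs(v) + "}"
    raise TypeError(f"cannot encode {type(v).__name__}")


def encode_pairs(obj):
    """key=value pairs separated by spaces: equals instead of colon, no commas."""
    return " ".join(f"{encode_string(k)}={encode_value(v)}" for k, v in obj.items())


def encode_line(obj):
    """A log message is an object without the enclosing curly braces."""
    return encode_pairs(obj)


print(encode_line({"dyno": "web", "metadata": {"stack": ["line 1", "line 2"], "msg": "null"}}))
# prints: dyno=web metadata={stack=["line 1" "line 2"] msg="null"}
```

Note that the string "null" keeps its quotes while a real null would be written bare, which is the case the identifier rule calls out.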
For example, these JSON objects:
{"dyno":"web", "cmd":"thin -R config.ru -p 3000 start", "scale":3, "err":null}
{"dyno":"web", "metadata":{"stack":["line 1", "line 2"], "msg":"null"}}
{"t":"2012-06-19T11:02:47.123456789-0400", "machine":"awondo", "event":"start"}
are written in this format as:
dyno=web cmd="thin -R config.ru -p 3000 start" scale=3 err=null
dyno=web metadata={stack=["line 1" "line 2"] msg="null"}
t=2012-06-19T11:02:47.123456789-0400 machine=awondo event=start
Open Question: Should we allow unquoted strings of the form 3ms? Currently hermes generates such messages.
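To make the 3ms question concrete, here is a rough sketch of how a bare token would be classified under the rules as currently written. The `classify` function is hypothetical, it borrows the JSON number grammar for numbers, and it deliberately ignores quoted strings and timestamp literals.

```python
import re

# Unquoted strings: C identifiers only (cannot begin with a digit).
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
# Numbers: assumed here to follow the JSON number grammar.
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def classify(token):
    """Classify a bare (unquoted) token under the current rules."""
    if token in ("null", "true", "false"):
        return "literal"
    if IDENT.match(token):
        return "string"
    if NUMBER.fullmatch(token):
        return "number"
    return "invalid"


for token in ("web", "3", "null", "3ms"):
    print(token, classify(token))
# web string
# 3 number
# null literal
# 3ms invalid
```

Under these assumptions 3ms is neither an identifier nor a number, so accepting it unquoted would need an explicit extension to one of the two rules.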
Open Question: Should we require timestamps to be in UTC?
Forthcoming.
Forthcoming.
@kr: seems you've got a handle on the EBNF format; I just saw it here: http://godoc.org/github.com/kr/logfmt . What are the latest updates on logfmt?
@asenchi: after 9 months of thinking (ok not constantly) I think the simpler logfmt k/v format is better than the one with types. I'm thinking of retiring the lines format as a failed experiment.
Parsing is easy to make robust with logfmt, which is important when consuming diverse sources. I also found that types aren't always desirable. For example, in a scenario where logs are indexed in ElasticSearch, the first time a key appears the index is created with the given value type, but there is no guarantee that the next log entry won't have the same key with a different type. Aside from that, ElasticSearch also needs a fixed set of keys, or you run the risk of blowing up your memory with arbitrary indexes (happened to me). Once you've decided on the keys, you might as well choose the type all the values are going to be cast to.
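A rough sketch of that last idea, assuming plain untyped k/v pairs: every value is parsed as a string, then cast according to a fixed per-key schema chosen by the consumer. The names `parse_pairs`, `coerce`, and `SCHEMA` are hypothetical and are not part of kr/logfmt; the parser here is a simplified regex, not the real logfmt grammar.

```python
import re

# key=value where the value is either a double-quoted string or a run of
# non-space characters. Simplified on purpose.
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_pairs(line):
    """Parse a k/v log line into a dict of strings; no types are inferred."""
    fields = {}
    for key, raw in PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1].replace('\\"', '"')
        fields[key] = raw
    return fields


def coerce(fields, schema):
    """Keep only the keys named in the schema and cast each to its fixed type."""
    out = {}
    for key, cast in schema.items():
        if key not in fields:
            continue
        try:
            out[key] = cast(fields[key])
        except ValueError:
            out[key] = None  # value does not fit the chosen type
    return out


# Hypothetical schema: the fixed set of keys to index, with one type each.
SCHEMA = {"dyno": str, "scale": int}

line = 'dyno=web cmd="thin -R config.ru -p 3000 start" scale=3 err=null'
print(coerce(parse_pairs(line), SCHEMA))
# prints: {'dyno': 'web', 'scale': 3}
```

Because the key set and types are fixed by the consumer, a later entry such as scale=3ms cannot change the type of an indexed field; here it simply becomes None.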