@kr

kr/logfmt.md Secret

Created November 10, 2012 00:45
Log format description

logfmt

This format is intended to strike a balance between human and machine readability.

It aims to codify existing logging conventions and nail down some formerly thorny or ambiguous cases, while leaning on a familiar data model (JSON).

Summary: mostly JSON, with a couple of restrictions, a couple of extensions, and a couple of changes.

  • A log message is a JSON object, omitting the enclosing curly braces.
  • Strings can appear unquoted if they look like C identifiers (only alphanumeric characters and underscore; can't begin with a digit). Note that the encodings of values null, true, and false also fit this description, which means that if you want the string “null” you have to encode it with quotes (as "null").
  • We use equals instead of colon, and leave out the commas.
  • There is an RFC 3339 timestamp literal format with optional nanoseconds. These values cannot be unambiguously represented in JSON (where the best you can do is pun them with strings).
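
A minimal sketch of the summary rules above, in Python. It is not the author's implementation: `encode_value` and `encode_pairs` are hypothetical names, keys are assumed to follow the same quoting rule as string values, and the timestamp literal is left out because a decoded JSON object only carries timestamps as strings.

```python
import json
import re

# Strings that look like C identifiers may appear unquoted.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
# These identifiers are the encodings of JSON literals, so the
# strings themselves must stay quoted.
RESERVED = {"null", "true", "false"}


def encode_value(v):
    """Encode one value following the summary rules (hypothetical helper)."""
    if isinstance(v, str):
        if IDENT.match(v) and v not in RESERVED:
            return v
        return json.dumps(v)
    if isinstance(v, dict):
        # Nested objects keep their braces; only the top level drops them.
        return "{" + encode_pairs(v) + "}"
    if isinstance(v, list):
        # Array elements are separated by spaces instead of commas.
        return "[" + " ".join(encode_value(x) for x in v) + "]"
    # None, booleans, and numbers keep their JSON encoding.
    return json.dumps(v)


def encode_pairs(obj):
    """Join key=value pairs with spaces (assumption: keys quote like strings)."""
    return " ".join(f"{encode_value(k)}={encode_value(v)}" for k, v in obj.items())
```

With these assumptions, `encode_pairs({"dyno": "web", "scale": 3, "err": None})` yields `dyno=web scale=3 err=null`, the string "null" stays quoted, and a value such as `3ms` is quoted because it begins with a digit.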

Example JSON

{"dyno":"web", "cmd":"thin -R config.ru -p 3000 start", "scale":3, "err":null}
{"dyno":"web", "metadata":{"stack":["line 1", "line 2"], "msg":"null"}}
{"t":"2012-06-19T11:02:47.123456789-04:00", "machine":"awondo", "event":"start"}

Equivalent logfmt

dyno=web cmd="thin -R config.ru -p 3000 start" scale=3 err=null
dyno=web metadata={stack=["line 1" "line 2"] msg="null"}
t=2012-06-19T11:02:47.123456789-04:00 machine=awondo event=start
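
Reading a flat line back can be sketched as follows. This Python is an assumption-laden sketch, not a grammar (the draft still lists that as forthcoming): `parse_flat` and `parse_value` are hypothetical helpers that handle only flat lines with unquoted identifier keys, do not support nested objects or arrays, and keep timestamps as plain strings.

```python
import json
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
LITERALS = {"null": None, "true": True, "false": False}
# One pair: optional whitespace, an identifier key, "=", then either a
# JSON-style quoted string or a run of characters without spaces or
# quotes. The value must be followed by whitespace or the end of line.
PAIR = re.compile(
    r'\s*([A-Za-z_][A-Za-z0-9_]*)=("(?:[^"\\]|\\.)*"|[^\s"]*)(?=\s|\Z)'
)


def parse_value(raw):
    """Decode one raw token (hypothetical helper)."""
    if raw in LITERALS:
        return LITERALS[raw]
    if IDENT.fullmatch(raw):
        return raw  # bare identifier: an unquoted string
    try:
        return json.loads(raw)  # quoted strings and numbers
    except ValueError:
        return raw  # anything else (e.g. a timestamp) stays a string


def parse_flat(line):
    """Parse a flat logfmt line into a dict; raise ValueError on leftovers."""
    out = {}
    pos = 0
    while True:
        m = PAIR.match(line, pos)
        if m is None:
            break
        out[m.group(1)] = parse_value(m.group(2))
        pos = m.end()
    if line[pos:].strip():
        raise ValueError(f"cannot parse at offset {pos}: {line[pos:]!r}")
    return out
```

The sketch treats any unquoted token that is neither a literal, an identifier, nor a number as a string, which is one possible answer to how tokens like timestamps should be read, not a decision the draft has made.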

Open Question: Should we allow unquoted strings of the form 3ms? Currently hermes generates such messages.

Open Question: Should we require timestamps to be in UTC?

Grammar

Forthcoming.

Key Naming Conventions

Forthcoming.

@kr
Author

kr commented Feb 28, 2013

Oh wow @zimbatm, that's really cool. I hope you're prepared to make some
changes if necessary as this draft evolves. It's definitely not final or anything. :)

@zimbatm

zimbatm commented Mar 19, 2013

@kr: Sure, Parslet is easy :)

Actually, I didn't get your notification, and since activity seemed pretty slow I started my own log format initiative. I hope you don't mind :/ It wasn't really to own it, but more because I think this is an awesome idea and I would love to get a cross-language specification.

@zimbatm

zimbatm commented Mar 19, 2013

Actually I'm going to add you to the project if you don't mind

@whatupdave

Wrote a quick log generator in Go here: https://github.com/whatupdave/dlog

@asenchi

asenchi commented Apr 3, 2013

@zimbatm Nice work. I'll be sending some ideas around agreed-upon keys, but thus far I think you are on the right track. Unfortunately, I haven't had much time to work on a format internally, but I am glad that there is a more community-driven response here.

Let's work to get this in solid shape. @kr Heroku still has what I would consider to be the most thorough implementation of "logs as data", so it would be great if you or someone else could work on this as well.

Thanks again @zimbatm

@zimbatm

zimbatm commented Dec 22, 2013

@kr: it seems you've got a handle on the EBNF format; I just saw it here (http://godoc.org/github.com/kr/logfmt). What are the latest updates on logfmt?

@asenchi: After 9 months of thinking (OK, not constantly), I think the simpler logfmt k/v format is better than the one with types. I'm thinking of retiring the lines format as a failed experiment.

Parsing is easy to make robust with logfmt, which is important when consuming diverse sources. I also found that types aren't always desirable. For example, when logs are indexed in ElasticSearch, the index is created with the value type seen the first time a key appears, but there is no guarantee that the next log entry won't have the same key with a different type. Aside from that, ElasticSearch also needs a fixed set of keys, or you run the risk of blowing up your memory with arbitrary indexes (happened to me). Once you've decided on the keys, you might as well choose the type all the values are going to be cast to.
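
The idea in the comment above, fixing the set of keys and the type each one is cast to before indexing, can be sketched as follows. This is an illustrative Python sketch only: `SCHEMA`, its keys, and `cast_pairs` are hypothetical, and it assumes the log line has already been split into untyped string pairs.

```python
# Hypothetical schema: the fixed set of keys and the single type that
# every value for that key is cast to before indexing.
SCHEMA = {"scale": int, "elapsed": float, "dyno": str}


def cast_pairs(pairs, schema=SCHEMA):
    """Keep only known keys and cast each value to its declared type."""
    out = {}
    for key, raw in pairs.items():
        cast = schema.get(key)
        if cast is None:
            continue  # drop keys outside the fixed set
        try:
            out[key] = cast(raw)
        except (TypeError, ValueError):
            out[key] = None  # value cannot be cast to the declared type
    return out
```

Dropping unknown keys and mapping uncastable values to `None` are design choices made for this sketch; a real pipeline might reject or quarantine such entries instead.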
