Created
December 22, 2014 11:34
DDN: Son of Son of Dysfunction
; ddn.ddn
;
; This document describes the Dynamic Data Notation (DDN). DDN is a superset
; of JavaScript Object Notation and fills a similar role. It differs from JSON
; in a few important ways:
;
; * DDN supports comments. Semicolons (;), hash marks (#) and C++ style
; "slash slash" digraphs (//) all begin a "to the end of the line" style
; comment. C-style "slash splat" (/*) and "splat slash" (*/) digraphs
; enclose bounded comments.
;
; * Dates are first-class types. That means you can specify dates directly
; in RFC 3339 format.
;
; * UTF-8. It's UTF-8 turtles all the way down.
;
; * Primitive types as keys. Any primitive type (null, boolean, integer,
; float or date) can be a key in an associative array.
;
; * Concatenation of vectors. If you put two strings or two arrays next to
; each other with only white space between them, they're concatenated
; together.
{
null : "Note that the null item can be the key for an associative array.",
false : "True and false can also be keys for associative arrays.",
true : "And remember; characters that normally denote comments "
"are interpreted as members of strings inside quotes /* and double "
"quotes.*/" /* This is a comment, however. */,
0 : "Did you notice we placed three strings right next to each other?",
1 : "When the parser sees multiple vectors next to each other, it "
"simply concatenates them.",
2 : [ 0, 1, 2 ] [3, 4, 5], ; This works with Arrays and Objects as well.
"foo" : "Using numbers or strings as indexes into \"normal\" or "
"associative arrays works pretty much the same as with JSON.",
3.0 : "Associative arrays can have keys with any primitive type.",
3.1 : "But watch out, keys are converted into canonical string form when "
"used as an index. This means the string '3' and the number 3 "
"reference the same element, but the integer 3 and the floating "
"point value 3.0 reference different elements, because their "
"canonical string values are different.",
3.2 : "We don't retain precision when creating canonical string "
"representations, so '3.0' and '3.00' both refer to the same value.",
3.20 : "By default, parsers MUST NOT raise an exception if two keys are "
"identical, however.",
3.20 : "When deserialized, the last key:value association encountered by "
"the parser will be the one in the deserialized object.",
3.20 : "This is to reduce the impact on running systems in loosely-coupled "
"systems. Implementations MAY implement an advisory service that "
"emits an event when a duplicate key is found.",
; In JavaScript, this would look something like this:
; var parser = new DDN();
; parser.on( 'duplicate', function( e ) { console.log( 'dup!' ); } );
; parser.parse( '{ "a":"once", "a":"upon", "a":"a", "a":"time" }' );
2012-12-21 : "Did we mention dates?",
2012-12-21T00:00:00Z : "Or times?",
2012-12 : "Or partial dates?",
2012-12-21T00:00 : "Or partial times?",
2012-12-21T00:00:00.000Z : "Times can have fractions of seconds.",
4294967297 : "Hold on. Did I just specify an integer greater than 32 bits?",
18446744073709551617 : "Why yes, I believe I did.",
79228162514264337593543950336 : "Uh oh. This is getting freaky.",
4 : "Okay. Why does that work? Shouldn't that throw an error?",
4.1 : "You would think it would. But it doesn't, because DDN describes "
"a transfer syntax, not a type system.",
4.2 : "This means DDN defines a parser that knows how to tell where "
"numbers and strings and dates and key:value pairs begin and end, "
"but it doesn't mean it requires ints be 32 bits wide. (more on "
"this later.)",
5.0 : <<EOS
One of the coolest things Bash does is support Here Docs. This means that when
the parser sees the sequence << (less than, less than) followed by a symbol,
followed by a new line, it interprets everything until newline - symbol as a
string.
So we're actually defining a string right here. And as long as we don't have
newline followed by EOS, it's going to keep on shoveling characters into the
string.
We can even add bits of DDN syntax. It doesn't matter. { we're in a string }
EOS
"",
5.1 : "But once we hit that newline-symbol sequence, we're back into "
"regular parsing mode.",
5.2 : <<"And you can have spaces in your symbols"
Bonus points if you figured out what the quotes around the symbol do. | |
And you can have spaces in your symbols, | |
5.3 : <<2012-12-21T00:00:00.000Z
Okay. This looks like the Here-doc initiator is a date object. It's not. We're
not that crazy. It's just a string.
2012-12-21T00:00:00.000Z,
5.4 : <<2012-12-21T00:00:00.000Z
You can re-use here-doc initiator strings. <<BUT_THEY_DON'T_NEST | |
2012-12-21T00:00:00.000Z, | |
5.5 : <<EOS,
You can put a comma in your here-doc terminator, but that looks very, very
confusing, IMHO. In the line below one comma is part of the here-doc symbol and
the next comma is there because I need a comma between this array element and
the next.
EOS,,
6.0 : "So... what about arrays? Yes. We have them.",
6.1 : [ 'this', 'is', 'a', 'typical', 'array', 0 ],
6.2 : "Each element in an array can be any type.",
7.0 : "But we have 'packed arrays' to represent arrays with all the same "
"type of thing.",
7.1 : [[ 'this', 'is', 'a', 'packed', 'array', 'with', 'just', 'strings']],
7.2 : [[ 0xFF, 0xFE, 0xFD ]], ; this is a packed array of 8 bit chars.
7.3 : [[ 16 | 1, 2, 3, 4 ]], ; this is a packed array of 16-bit values.
7.4 : "the number in between the double square brace and the bar "
"can be any numeric value that's a multiple of 8. (defaulting "
"to 8.)",
8.0 : [[ | TGludXggaGVsaXVtIDMuMi4wLTQtNjg2LXBhZSAjMSBTTVAgRGViaWFuIDMuMi42
My0yK2RlYjd1MSBpNjg2IEdOVS9MaW51eAo= ]],
8.1 : "if there's no value between the double-square-brace and the bar, "
"we assume it's base64."
}
.small
[
"so this is weird. we just terminated the associative array and are ",
"starting a new 'regular' array. What's up with that? ",
"",
"Well. It turns out, the parser will return an array of objects if it sees ",
"more than one."
]
.tiny
32
.large
<<EOS
So if you parsed this, you would get five objects in an array: an associative | |
array, a regular array, an integer, a string and another associative array. | |
You're probably also wondering what all those .small, .tiny, .large symbols | |
are. Remember I said DDN doesn't specify the size of integers? That's only | |
partially true. The parser doesn't REQUIRE values to fit in a specific size, | |
but it can communicate to the receiver its intent to follow certain type
sizes. | |
By default, the parser is in "indeterminate mode." This means there are no | |
limits to the sizes of things it transmits. If you include the symbol '.tiny' | |
in the parse stream, you are signaling your intent to only send 'tiny' data. | |
Tiny integers are 8 bits wide. Tiny floats are 16 bits (half precision.) | |
Tiny strings are no more than 255 characters long. small and large modes have | |
these limits: | |
        integer   float
        -------   -------
small   32 bit    64 bit (double precision)
large   64 bit    128 bit (quad precision)
EOS | |
.indeterminate
{
0 : "So there's one more thing to talk about. And it's going to annoy a ",
1 : "lot of people cause it makes parsing a little harder.",
2 : "If you put a bar character between two associative arrays, it ",
3 : "merges them."
}
|{
4 : "So... the contents of this associative array are merged into the ",
5 : "previous associative array.",
6 : null
}
|{
6 : "It's okay to have the same keys in the two arrays, the keys defined ",
7 : "later replace the keys defined earlier. So the 6:null key:value pair ",
8 : "in the array above would be replaced by the 6:string pair from this ",
9 : "array."
}
|{
10 : "It's a bit of a pain to code this on 8 bit microcontrollers, so it's ",
11 : "disabled in .tiny mode.",
}
|{
12 : "The main reason for this feature is to enable 'poor man's journaling.'",
13 : "While an 8-bit micro might not want to parse it, it's not too bad ",
14 : "for a typical 32 or 64 bit system. So your 32 bit system would speak ",
15 : ".tiny to the 8 bit system. But the 8 bit system would speak .small to ",
16 : "the 32/64 bit server. And since it's speaking to a system that ",
17 : "doesn't have a problem parsing it, the microcontroller has the ",
18 : "option of having the remote system merge the associative arrays."
}
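The key rules in the file above (canonical string keys, last association wins, an advisory 'duplicate' event) can be sketched roughly like this in JavaScript. This is only an illustration under assumptions: `canonicalKey` and `KeyedStore` are hypothetical names, not part of any DDN implementation, and the exact canonical form of a float (trailing zeros dropped, one fractional digit kept) is a guess based on the statement that '3.0' and '3.00' refer to the same value.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: map the raw text of a key token to its canonical
// string. Assumption: integers lose leading zeros, floats lose trailing
// zeros but keep one fractional digit so they stay distinct from integers.
function canonicalKey(token) {
  const text = String(token).trim();
  if (/^-?\d+$/.test(text)) {
    // BigInt keeps arbitrarily wide integers intact.
    return BigInt(text).toString();
  }
  const m = /^(-?)(\d+)\.(\d+)$/.exec(text);
  if (m) {
    const whole = BigInt(m[2]).toString();
    const fraction = m[3].replace(/0+$/, '') || '0';
    return m[1] + whole + '.' + fraction;
  }
  // null, booleans, dates and plain strings are used as written.
  return text;
}

// Hypothetical container: the last key:value association wins, and a
// 'duplicate' event is emitted instead of raising an exception.
class KeyedStore extends EventEmitter {
  constructor() {
    super();
    this.entries = new Map();
  }

  set(token, value) {
    const key = canonicalKey(token);
    if (this.entries.has(key)) {
      this.emit('duplicate', { key, previous: this.entries.get(key), value });
    }
    this.entries.set(key, value);
  }
}
```

With this sketch, the tokens 3.2 and 3.20 land on the same key, while 3 and 3.0 stay separate, which matches the behaviour the file describes.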
was also just thinking... DDN as described here is a transfer syntax, maybe we should rename it to be something like "FOO transfer syntax" and have a different document describing the processing expectations so there's a clear dividing line between transfer syntax and type expectations.
Hey Kent. thanks for the comments. let me start with a list of problems i'm trying to solve (besides the obvious one of communicating a serialized dynamic data structure.)
Having both "to the end of the line" comments and bounded comments, and the choice of their delimiters, is really just a matter of personal taste. I originally used parentheses for bounded delimiters, but most people found that a little distracting (except FORTH programmers.)
warning: You're about to do something dangerous.
" is the data you're trying to move. This example is easy enough, but when you decide to include a longer string (like a description of why something is dangerous,) you wind up with strings that span lines in your text editor and are a little less easy to understand. So... for example... consider this JSON:
{
"success": false,
"description": "
The capability your system provided has expired.
For more information on web capabilities, please see <a href="http://example.com/docs/webcaps.html\">A Brief Introduction to Web Capabilities
"}
And compare it to:
{
"success": false,
"description":
"
"
""The capability your system provided has expired."
"
"
"
""For more information on web capabilities, please see "
"<a href="http://example.com/docs/webcaps.html\">A Brief Introduction to Web Capabilities"
"
}
or even:
{
"success": false,
"description": <<EOS
The capability your system provided has expired.
For more information on web capabilities, please see A Brief Introduction to Web Capabilities
EOS
}
The directives (.tiny, .small, .large, .indeterminate) are used to signal the sender's intent not to send values that violate certain type constraints. DDN doesn't REQUIRE endpoints to adhere to this promise, but it does allow system builders to signal their intent to the consumer of the serialized form. This is in keeping with the "provide mechanism, not policy" concept.
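A receiver that wants to check the promise could do something like the following. Treat it as a rough sketch: `integerFits` is a hypothetical helper, and since the notes don't say whether the integer widths are signed, this assumes signed two's-complement ranges.

```javascript
// Integer widths per directive, taken from the sizes listed in the file.
// null means "no limit" (indeterminate mode).
const INTEGER_BITS = new Map([
  ['.tiny', 8],
  ['.small', 32],
  ['.large', 64],
  ['.indeterminate', null],
]);

// Hypothetical check: does an integer (number, BigInt or digit string)
// fit the width the sender promised? Assumes signed two's complement.
function integerFits(value, mode) {
  if (!INTEGER_BITS.has(mode)) {
    throw new RangeError('unknown directive: ' + mode);
  }
  const bits = INTEGER_BITS.get(mode);
  if (bits === null) {
    return true;
  }
  const v = BigInt(value);
  const limit = 1n << BigInt(bits - 1);
  return v >= -limit && v < limit;
}
```

Because the directive is advisory, a consumer would more likely log or flag a value that fails this check than reject the whole message.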
{
"en0": {
"auto": false,
"type": "dhcp"
}
}
|{
"en0": {
"auto": true
},
"wlan0": {
"auto": true,
"type": "dhcp"
}
}
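To make the merge concrete, here is roughly what a receiver might do with the two objects above. One loud assumption: the notes only say that later keys replace earlier keys, so this sketch replaces whole values and does not recurse into nested associative arrays. `mergeAssociative` is a made-up name for illustration.

```javascript
// Hypothetical shallow merge for bar-separated associative arrays:
// keys from later objects replace keys from earlier ones wholesale.
function mergeAssociative(...objects) {
  return Object.assign({}, ...objects);
}

const first = {
  en0: { auto: false, type: 'dhcp' },
};
const second = {
  en0: { auto: true },
  wlan0: { auto: true, type: 'dhcp' },
};

// Under the shallow assumption, en0 ends up as { auto: true } and loses
// its "type" entry, because the whole value is replaced.
const merged = mergeAssociative(first, second);
```

If nested values were meant to merge recursively, en0 would keep "type": "dhcp" instead. That is a processing expectation the transfer syntax itself doesn't settle.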
And then taking points one by one...
"2014-12-17T14:00:05Z"
"2014-12-17T13:00:05-01:00"
Also. should probably point out that support for leap seconds is currently a MAY and not a MUST. We expect the underlying system to properly interpret leap seconds. Interestingly, one of the places we used the "any type as a key" was in an array of leap seconds for which a particular action had taken place:
{
2012-06-30T23:59:60Z: true,
1997-06-30T23:59:60Z: false
}
<<FOO

There's a leading newline right before this line.
FOO
For a trailing newline, do this:
<<FOO
There's a trailing newline right after this line.

FOO
And this has neither:
<<FOO
Neither a trailer nor a leader be.
FOO
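The newline behaviour falls out of the "everything until newline-symbol" rule. A minimal sketch of a here-doc reader, assuming the input starts at the `<<` and that quotes around the symbol are simply stripped (`readHereDoc` is a hypothetical name, and a real parser would also have to check what follows the terminator):

```javascript
// Hypothetical here-doc reader. The body runs from just after the
// opener's newline up to, but not including, the newline that precedes
// the terminating symbol.
function readHereDoc(text) {
  if (!text.startsWith('<<')) {
    throw new SyntaxError('expected "<<"');
  }
  const eol = text.indexOf('\n');
  if (eol === -1) {
    throw new SyntaxError('here-doc opener must end with a newline');
  }
  let symbol = text.slice(2, eol);
  // Assumption: double quotes around the symbol only group it.
  if (symbol.length >= 2 && symbol.startsWith('"') && symbol.endsWith('"')) {
    symbol = symbol.slice(1, -1);
  }
  // Search from the opener's newline so an empty body is allowed.
  const end = text.indexOf('\n' + symbol, eol);
  if (end === -1) {
    throw new SyntaxError('unterminated here-doc');
  }
  const bodyStart = eol + 1;
  return {
    value: text.slice(bodyStart, Math.max(bodyStart, end)),
    rest: text.slice(end + 1 + symbol.length),
  };
}
```

A blank line right after the opener becomes a leading newline in the string, and a blank line right before the terminator becomes a trailing one. With neither, the string has no newline at either end.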
Which is, IMHO, a little easier to comprehend than python's
"""Oh hey, this line has a trailing newline."""
"""This line does not have a trailing newline"""
or is it
"""This line does not have a trailing newline"""
I can never remember if python triple-quote strings require you to escape double quotes or not. But given enough time, python's
Turns out I don't really personally need auto catenation of arrays. and now that i think about it, the auto-catenation rules make it so you can't have two vector types in sequence at the top level. Maybe requiring the auto-catenation character for strings as well as arrays? hmm... i have to evangelize that change since it will require changing deployed code, but i think an argument could be made for explicitly identifying where concatenation occurs.