@DaelonSuzuka
Last active June 16, 2020 22:47
JSON printing module written for small embedded systems with serial ports.
#ifndef _JSON_PRINT_H_
#define _JSON_PRINT_H_
/* ************************************************************************** */
/* json_node.h contains the definition of the json_node_t data structure and
its elements, as well as instructions and examples on how to construct your
own C representations for JSON objects.
*/
#include "json_node.h"
/* ************************************************************************** */
/* A pointer to a function that can print null-terminated strings. This keeps
json_print() from needing to know about system features like serial port
availability.
Any function that can handle pointers to null-terminated strings is valid,
whether it sends the string out a serial port, logs it to file, or even
throws it away entirely.
*/
typedef void (*printer_t)(const char *);
/* json_print() creates a JSON object using 'nodeList' and prints it using the
provided 'destination' function pointer. This allows json_print() to target
different serial ports if the system has them.
If providing the destination argument on every call proves cumbersome, you
can write a wrapper macro that fills in the destination you want.
Example:
    #define my_json_print(nodeList) json_print(fixedDestination, nodeList)
json_print() provides several safety features to help keep its output
well-formed; they're described next to braceDepth and recursionCount in
json_print.c.
*/
extern void json_print(printer_t destination, const json_node_t *nodeList);
#endif // _JSON_PRINT_H_
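The printer_t comment above says any function that accepts a null-terminated string is a valid destination. As a minimal sketch of that idea, here is a printer that collects output in a RAM buffer instead of sending it out a serial port, which can be handy for testing on a PC. The names capture, buffer_printer(), buffer_reset(), and print_greeting() are hypothetical and not part of the module; print_greeting() only stands in for json_print() so the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same signature as printer_t in json_print.h, repeated so this compiles alone. */
typedef void (*printer_t)(const char *);

/* Hypothetical capture buffer; the name and size are assumptions. */
static char capture[128];
static size_t captureLength = 0;

/* A printer that appends to 'capture' instead of writing to a serial port.
   Text that doesn't fit is truncated rather than overflowing the buffer. */
static void buffer_printer(const char *string) {
    assert(string != NULL);

    size_t room = sizeof(capture) - 1 - captureLength;
    size_t length = strlen(string);
    if (length > room) {
        length = room;
    }

    memcpy(&capture[captureLength], string, length);
    captureLength += length;
    capture[captureLength] = '\0';
}

/* Empties the capture buffer before the next message. */
static void buffer_reset(void) {
    captureLength = 0;
    capture[0] = '\0';
}

/* Stand-in for json_print(): emits a fixed message through 'destination'. */
static void print_greeting(printer_t destination) {
    destination("{");
    destination("\"hello\":");
    destination("\"world\"");
    destination("}");
}
```

With the real module, the same printer would be passed as the first argument to json_print() in place of a UART routine, and 'capture' could then be compared against the expected JSON text.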
#include "json_print.h"
#include <stdint.h>
#include <stdio.h>
/* ************************************************************************** */
/* This is a pointer to the printer function that was passed as the first
argument to json_print(). This could have been passed down the call tree as
an argument, and some people probably would have preferred that, but since
evaluate_node_list() can end up being called recursively AND the value of
'out' can't change in the middle of that, I think it makes more sense to pull
this out to file scope to help limit stack growth. The trade-off is that
json_print() is not reentrant.
For reference, printer_t is declared in json_print.h as:
    typedef void (*printer_t)(const char *);
*/
static printer_t out;
/* ************************************************************************** */
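/* JSON strings must be surrounded by double-quotes, and since double-quotes
inside C string literals have to be escaped, we'll provide a function that
adds the surrounding quotes for you.
Note that the string itself is printed as-is: any '"' or '\' characters
inside it are NOT escaped.
*/
static void print_json_string(const char *string) {
    out("\"");
    out(string);
    out("\"");
}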
/* -------------------------------------------------------------------------- */
static void evaluate_node_list(const json_node_t *nodeList); // forward dec
/* Any non-control node is evaluated here.
A node has two elements: a type and a void pointer to its contents.
Evaluating a node involves looking at the type of that node, performing the
relevant typecast on the contents pointer, converting those contents into a
string, and then printing it.
The conversion to string is done using snprintf() instead of printf(), because
printf() has a fixed destination: wherever putc() sends its output. Since the
output of json_print() is retargetable, we format with snprintf(), put the
result in a local char buffer, and then print that char buffer using the
function pointer that was given to json_print() and copied into 'out'.
snprintf() is used rather than sprintf() so that an unexpectedly long number
is truncated instead of overflowing the buffer.
*/
static void evaluate_node(const json_node_t *node) {
    char buffer[50] = {0};

    switch (node->type) {
    case nNodeList:
        evaluate_node_list((const json_node_t *)node->contents);
        return;
    case nKey:
        print_json_string((const char *)node->contents);
        out(":");
        return;
    case nString:
        print_json_string((const char *)node->contents);
        return;
    case nFloat:
        // nFloat nodes point to a float, which is promoted to double for "%f"
        snprintf(buffer, sizeof(buffer), "%f", *(const float *)node->contents);
        out(buffer);
        return;
    case nU16:
        // the cast keeps the argument matched to "%u" on any int width
        snprintf(buffer, sizeof(buffer), "%u",
                 (unsigned int)*(const uint16_t *)node->contents);
        out(buffer);
        return;
    case nU32:
        // the cast keeps the argument matched to "%lu" on any long width
        snprintf(buffer, sizeof(buffer), "%lu",
                 (unsigned long)*(const uint32_t *)node->contents);
        out(buffer);
        return;
    case nNull:
        out("null");
        return;
    default: // type not supported
        out("null");
        return;
    }
}
/* -------------------------------------------------------------------------- */
/* braceDepth:
JSON objects are delimited by curly braces: '{' and '}'. It's REQUIRED for
these braces to match. To help make this easier, evaluate_node_list() keeps
track of every curly brace it prints. This has two major behavioral
consequences:
    1) Printing a '}' that matches the very first opening brace means that
       this JSON object MUST be over. Therefore, if that happens, we'll
       print the brace and return. If there were still nodes in the list,
       then the JSON object was defined improperly.
    2) Encountering a control node containing "\e" means that we've hit the
       end of the node list we're evaluating and should return. Assuming a
       simple node list (a list without any nNodeList nodes), hitting "\e"
       implies that we're completely done. However, if there are still
       unmatched open braces, we haven't produced valid JSON output.
       Therefore, before returning, we'll print close braces until we've
       balanced all the open braces.
*/
static uint8_t braceDepth = 0;
/* recursionCount:
In the description of braceDepth (see above), we learned how
evaluate_node_list() adds missing close braces when it hits the end of a
node list (i.e., a control node containing "\e"). There's a big problem with
this: a node list can have a reference to another node list!
This is an important feature for writing the definitions of JSON objects, so
you can factor out common sections of related JSON objects, or just break
a big definition into smaller, easier to read pieces.
This presents a problem when evaluating a list, because now you need to know
whether you're looking at the 'parent' list or a 'child' list before you
can decide what to do when you hit "\e".
One solution I tried was adding another control node symbol, so parent lists
would end with "\e" and child lists would end with "\b". This was a bad
solution because it made it harder to write object definitions, and it meant
you had to decide up front whether a list could be included in another list.
Instead, if we increment recursionCount at the beginning of every node list,
whether it's the parent or the child, and we decrement it every time we
evaluate a "\e" node, we'll know the difference between the end of a parent
list and the end of a child list: recursionCount only drops back to zero at
the "\e" that ends the parent list.
Now, we can control the previously described brace matching feature and only
do it when we're at the end of the starting (parent) node list.
*/
static uint8_t recursionCount = 0;
/* evaluate_node_list() builds a JSON string by iterating through an array of
json_node_t's.
This function handles control nodes and keeping track of the structure of
the JSON object we're printing, and delegates most of the formatting and
printing to evaluate_node().
! evaluate_node_list() can be called recursively !
One of the features of the nodal JSON definition is the ability to reference
other lists of nodes. There's a special node type that indicates this and
contains a pointer to the list to be included. When evaluate_node() hits one
of these special nodes, it will call evaluate_node_list() and pass it a
pointer to the included list.
! This function provides NO PROTECTION against circular inclusion. !
If listA includes listB, and listB includes listA, these functions will
follow the trail until the system overflows its stack and (hopefully)
triggers a RESET. It's your responsibility to make sure your JSON node lists
aren't dangerous.
*/
static void evaluate_node_list(const json_node_t *list) {
    const json_node_t *currentNode;
    const json_node_t *nextNode;

    recursionCount++;

    while (1) {
        currentNode = list;
        nextNode = ++list;

        if (currentNode->type == nControl) {
            char controlChar = ((const char *)currentNode->contents)[0];

            switch (controlChar) {
            case '{':
                braceDepth++;
                out("{");
                break;
            case '}':
                // don't let a stray '}' wrap braceDepth around to 255
                if (braceDepth > 0) {
                    braceDepth--;
                }
                out("}");
                if (braceDepth == 0) {
                    return;
                }
                if (nextNode->type != nControl) {
                    out(",");
                }
                break;
            case '\e': // ESC (0x1B); '\e' is a GCC/Clang extension, not ISO C
                recursionCount--;
                if (recursionCount == 0) {
                    // end of the parent list: close any braces still open
                    while (braceDepth > 0) {
                        braceDepth--;
                        out("}");
                    }
                }
                return;
            }
        } else {
            evaluate_node(currentNode);

            // printing a comma is a little complicated, see [0] for more info
            if (nextNode->type == nKey) {
                out(",");
            } else if (nextNode->type == nNodeList) {
                const json_node_t *lookAhead = nextNode->contents;
                if (lookAhead->type == nKey) {
                    out(",");
                }
            }
        }
    }
}
/* -------------------------------------------------------------------------- */
/* A JSON object is printed by stepping through its definition one node at a
time until we're done. Each node is evaluated one by one and its contents
converted to a string and passed to the printing function defined as
'destination'.
*/
void json_print(printer_t destination, const json_node_t *nodeList) {
    // copy the destination pointer to a static variable, so we don't have to
    // keep passing it down the stack
    out = destination;

    // Reset the internal state used to keep track of the JSON structure
    braceDepth = 0;
    recursionCount = 0;

    evaluate_node_list(nodeList);
}
/* ************************************************************************** */
/* [0] Additional information on JSON commas
JSON requires a comma between two key:value pairs, like this:

    necessary and correct
               v
    { key:value, key:value }

So we could follow every value with a comma, but then we'd get this:

    invalid and illegal (why can't JSON just ignore this?)
                          v
    { key:value, key:value, }

So instead let's look ahead and print a comma if the NEXT node is a key:

    invalid and illegal (this one is more reasonable)
      v
    { , key:value, key:value }

This is actually easier to fix than the first failure. Let's write the rule
in pseudo-C (you'll need to know about node types to understand this):

    if(currentNode.type != nControl && nextNode.type == nKey){
        print_comma();
    }

Example of an invalid place to print a comma:

    currentNode IS a control node
    v
    { key:value, key:value }
      ^
      nextNode IS a key node

Example of a valid place to print a comma:

    currentNode IS NOT a control node
          v
    { key:value, key:value }
                 ^
                 nextNode IS a key node

There's one problem left: an nNodeList is a node that contains a pointer to
another list of nodes, which is seamlessly included into the current node
list. This is a great feature when you're writing the JSON definitions, but it
makes comma printing slightly harder. Here are the revised pseudo-C rules:

    if(currentNode.type != nControl){
        if(nextNode.type == nKey){
            print_comma();
        } else if(nextNode.type == nNodeList &&
                  firstNodeOf(nextNode).type == nKey){
            print_comma();
        }
    }

This operation involves a pretty crunchy multiple dereference, so that's why
the actual function has an extra pointer named 'lookAhead'.
You might notice that the comma check in evaluate_node_list() only looks at
nextNode, but our rules mention the CURRENT node too! What gives?
The check lives in the 'else' branch that handles non-control nodes, so by
the time it runs we already know that currentNode isn't a control node.
Commas that follow a closing brace are handled separately, in the '}' case.
*/
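The comma rule from note [0] can also be written as a small standalone predicate, which makes it easy to check against a few node pairs. This is only a sketch: needs_comma() is a hypothetical helper that does not exist in json_print.c, the types are trimmed-down copies of the ones in json_node.h, and it covers only the rule for non-control nodes (the separate comma check in the '}' case is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trimmed-down copies of the json_node.h types, so this compiles alone. */
typedef enum { nControl, nNodeList, nKey, nString, nU16 } node_type_t;

typedef struct {
    node_type_t type;
    const void *contents;
} json_node_t;

/* Hypothetical helper: returns true when a comma belongs between 'current'
   and the node that follows it, according to the rules in note [0]. */
static bool needs_comma(const json_node_t *current, const json_node_t *next) {
    assert(current != NULL && next != NULL);

    // never print a comma directly after a control node such as "{"
    if (current->type == nControl) {
        return false;
    }

    // a value followed by a key: the classic case
    if (next->type == nKey) {
        return true;
    }

    // a value followed by an included list: peek at that list's first node
    if (next->type == nNodeList) {
        const json_node_t *lookAhead = next->contents;
        return lookAhead->type == nKey;
    }

    return false;
}
```

Keeping the decision in one pure function like this means the rule can be exercised without a printer attached, although the gist itself inlines the same checks inside evaluate_node_list().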
#ifndef _JSON_NODE_H_
#define _JSON_NODE_H_
/* ************************************************************************** */
/* JSON node type
The contents of a json_node_t (see below) are in a void pointer, and can't be
correctly dereferenced without knowing what type to expect. This enum contains
all the possible types you can expect in a node.
[*] these types aren't currently implemented, sorry
*/
typedef enum {
    nControl,  // structural marker: "{", "}", or "\e"
    nNodeList, // pointer to another array of nodes
    nObject,   // [*]
    nKey,      // string, printed with a trailing ':'
    nString,   // string, printed inside double quotes
    nFloat,    // floating-point number
    nDouble,   // [*]
    nU8,       // [*]
    nU16,      // unsigned 16-bit integer
    nU24,      // [*]
    nU32,      // unsigned 32-bit integer
    nS8,       // [*]
    nS16,      // [*]
    nS24,      // [*]
    nS32,      // [*]
    nBool,     // [*]
    nNull,     // prints null, contents are ignored
} node_type_t;
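/* JSON node
A JSON object is represented as an array of node structs. The printer only
ever reads through 'contents', so it's declared as a pointer to const; this
lets a node point at const data, including other const node lists.
*/
typedef struct {
    node_type_t type;
    const void *contents;
} json_node_t;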
/* ************************************************************************** */
/* JSON Nodes:
This file describes a C data structure that can be used to represent a JSON
object so that it can be serialized and printed out a UART. The JSON object
is described as an array of json_node_t structs, each one corresponding to
a token in the original JSON object.
* Let's start with an example. *
C data that you're trying to print:
    char billsName[] = "Bill";
    uint16_t billsAge = 56;
(An nString node must point directly at the characters, so billsName is an
array here rather than a pointer variable whose address we'd have to take.)
Example node representation:
    const json_node_t jsonBill[] = {
        {nControl, "{"},       // 2
        {nKey, "name"},        // 3
        {nString, billsName},  // 4
        {nKey, "age"},         // 5
        {nU16, &billsAge},     // 6
        {nControl, "}"},       // 7
        {nControl, "\e"},      //
    };
    const json_node_t jsonPerson[] = {
        {nControl, "{"},       // 0
        {nKey, "person"},      // 1
        {nNodeList, jsonBill}, //
        {nControl, "\e"},      // 9
    };
Resulting JSON object, with each 'token' numbered:
    0 1         2 3       4       5      6  7 8 9
    { "person": { "name": "Bill", "age": 56 } } end
JSON is a hierarchical key:value format. A 'value' can be a JSON data type,
but it can also be another key:value pair, or multiple key:value pairs.
These structures can be nested arbitrarily deep, like this:
    {key:val}
    {key: {key:val}}            <- note the double closing braces
    {key: {key:val, key:val}}
Because of this property, in order to evaluate a node, you need to know
where that node lives in the overall hierarchy. This hierarchy is described
explicitly using nControl nodes, and implicitly using nKey nodes.
* nControl: (a pointer to a C string literal)
Control nodes provide contextual information about the structure of
the JSON object being printed. Control nodes can indicate one of
several different actions:
    "{"  - prints a '{', starting a new object
    "}"  - prints a '}', ending the current object
    "\e" - the end of the node array
Now, I've actually played a trick on you. Go back and reread the
example, matching each token in the output to the node that
generates it. Don't worry, I'll wait.
Find it?
Token 8 in the output doesn't have a node. Don't panic, your JSON
serializer hasn't developed sentience (yet), it just keeps track of
every time it prints a '{' or '}', and if there are open braces
that don't have matching closing braces by the time it hits the final
"\e" control node, it'll produce them automatically. This feature makes
it very slightly more pleasant to write short JSON definitions.
* nKey: (a pointer to a C string literal)
An nKey node is identical to an nString(described later) node,
except that when evaluated, it's followed by a colon (':').
There's an additional special node type, that can be used to "factor out" a
section of a node array. This can be used if you feel that your JSON
description is getting too long to read clearly, or if you have multiple
JSON descriptions that contain identical sections and you don't want to
repeat yourself.
* nNodeList: (a pointer to another array of nodes)
This node is a pointer to another array of nodes, which are simply
inserted in place of the current node. This insertion process
doesn't add any additional JSON punctuation, so if you want the
included nodes to be in their own object, you'll need to explicitly
add nControl nodes describing the structure.
The remaining node types don't have any special behaviour, they're simply a
reference to some C data. There are node types that correspond to each of
the JSON data types.
    number  - nFloat, nDouble, nUxx, nSxx
    string  - nString
    boolean - nBool[*]
    array   - nArray[*]
    object  - nObject[*]
    null    - nNull
[*] This node type isn't implemented yet. Sorry. (nArray doesn't even have an
entry in node_type_t yet.) Of the number types, only nFloat, nU16, and nU32
are implemented so far.
* nFloat, nDouble, nUxx, nSxx: (a pointer to a C 'number')
JSON doesn't understand different 'types' of numbers because it's
derived from Javascript, which is dynamically and weakly typed.
C is statically typed and quite particular, because in order to
interact with a variable (especially via a pointer), it needs to know
how much memory it occupies and how to interpret that memory.
Because the contents of a node are stored as a void pointer,
accessing those contents requires casting the pointer to a
particular type so that C knows how to access it. JSON's attitude of
"it's a number, lol" isn't going to be sufficient. Therefore, we need
a node type for each of the C primitive data types:
    floats        - nFloat, nDouble
    unsigned ints - nU8, nU16, nU24, nU32
    signed ints   - nS8, nS16, nS24, nS32
* nString: (a pointer to a C string literal)
JSON strings are delimited by double quotes ('"'). Since it would be
really annoying to require every string literal to include its own
escaped quotes, the printer handles adding those quotes for you.
* nBool:
I'm not sure how you'd really have a pointer to a 1-bit bool, so this
node type will need some additional work to make it functional.
* nArray:
This is quite a bit more complicated than a simple data type. An
array of primitives would require defining the length of the array
and the data type of its elements. This could be handled by making
the nArray node point to some kind of struct containing the length,
type, and a pointer to the actual array to be printed.
I'm totally stumped on how I'd represent an array of JSON objects,
which is a very common pattern.
* nObject:
Most of the reason to use this JSON type is handled by using control
nodes.
* nNull:
This one's pretty easy, it just prints "null" with no quotes.
*/
/* ************************************************************************** */
/* JSON array
A JSON node only contains a single void pointer, which isn't enough
information to properly reference an array in C. You also need to know the
length of the array, the type of its elements, and the location of the array
itself.
node_type_t contains several values that don't make sense in this context,
but since it's defined right there(^), we'll use it anyways.
This feature actually isn't implemented yet, but I'm leaving this here as a
starting point for when I do.
*/
typedef struct {
    int length;
    node_type_t type;
    void *array;
} json_array_t;
/* ************************************************************************** */
#endif // _JSON_NODE_H_
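Since json_array_t is described above but not implemented, here is one possible sketch of how a printer might walk it for arrays of primitives. Everything here is an assumption made for illustration: print_json_array(), sketch_printer(), and sketchOutput are hypothetical names, only nU16 and nU32 elements are handled, and the type definitions are trimmed copies so the block compiles by itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Trimmed copies of the module's types, so this compiles alone. */
typedef void (*printer_t)(const char *);
typedef enum { nU16, nU32, nNull } node_type_t;

typedef struct {
    int length;
    node_type_t type;
    void *array;
} json_array_t;

/* Hypothetical test destination: collects output in a small buffer. */
static char sketchOutput[64];

static void sketch_printer(const char *string) {
    size_t room = sizeof(sketchOutput) - strlen(sketchOutput) - 1;
    strncat(sketchOutput, string, room);
}

/* Hypothetical helper: prints 'array' as a JSON array of numbers.
   Element types it doesn't know about are printed as null. */
static void print_json_array(printer_t out, const json_array_t *array) {
    char buffer[16];

    assert(out != NULL && array != NULL);

    out("[");
    for (int i = 0; i < array->length; i++) {
        if (i > 0) {
            out(","); // commas go between elements, never after the last
        }
        switch (array->type) {
        case nU16:
            snprintf(buffer, sizeof(buffer), "%u",
                     (unsigned int)((const uint16_t *)array->array)[i]);
            break;
        case nU32:
            snprintf(buffer, sizeof(buffer), "%lu",
                     (unsigned long)((const uint32_t *)array->array)[i]);
            break;
        default:
            snprintf(buffer, sizeof(buffer), "null");
            break;
        }
        out(buffer);
    }
    out("]");
}
```

An nArray node could then carry a pointer to a json_array_t in its contents. This does not address the harder case mentioned above, an array of JSON objects.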