Adapting HDT for the Internet of Things
Linked Data/RDF would be an elegant way to achieve semantic interoperability for the Internet of Things. RDF itself does not define how the data is encoded. The most common formats (JSON-LD and Turtle) use text encoding, but the Internet of Things requires a more efficient binary encoding. HDT is a binary format that is already widely adopted for data dumps. It is very compact, and optional indexes allow fast look-ups. However, the current HDT specification is tailored to the data dump use case, and the reference C++ library implements all features of HDT, which could make it too big for constrained devices. With small changes, HDT should also be suitable for the bandwidth and resource limitations of the Internet of Things. A new or simplified library should be implemented. The required changes to the HDT specification should be documented so that they can be easily merged into the existing specification or used as an extension to it.
The header size should be reduced.
A simple header structure could be defined that contains only the required information and skips fields like
HDT Cookie and
The "metadata about the dataset in plain RDF format" can be skipped. The indexes are also not required.
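To give a feel for how small such a header could be, here is a minimal sketch in Python. The layout (one version byte, one flags byte, a 32-bit statement count) is purely an assumption for illustration and is not taken from the HDT specification. The names `HEADER_FORMAT`, `FLAG_QUADS`, `pack_header` and `unpack_header` are hypothetical.

```python
import struct
from typing import Tuple

# Hypothetical compact header layout (NOT part of the HDT specification):
#   1 byte   format version
#   1 byte   flags (bit 0: dataset contains quads)
#   4 bytes  number of statements, little-endian unsigned integer
HEADER_FORMAT = "<BBI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # fixed size, no padding

FLAG_QUADS = 0x01


def pack_header(version: int, flags: int, statement_count: int) -> bytes:
    """Serialize the hypothetical header into a fixed-size byte string."""
    return struct.pack(HEADER_FORMAT, version, flags, statement_count)


def unpack_header(data: bytes) -> Tuple[int, int, int]:
    """Read the header back from the first HEADER_SIZE bytes of a message."""
    return struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])


header = pack_header(version=1, flags=FLAG_QUADS, statement_count=42)
print(len(header), unpack_header(header))
```

A fixed-size header like this can be parsed on a constrained device without a general RDF parser. Which fields are actually required would have to be decided as part of the specification work.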
Some use cases may require support for quads.
Named graphs can be used to distinguish between metadata and the actual data.
Structures for quads should be defined, similar to the
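One possible shape for a quad structure is sketched below in Python. It follows HDT's general approach of replacing RDF terms with integer IDs from a dictionary, and adds a fourth component for the graph. The single shared ID space, the use of ID 0 for the default graph, and all names (`TermDictionary`, `encode_quad`, the `ex:` terms) are assumptions made for this illustration, not definitions from the HDT specification.

```python
from typing import Dict, List, Optional, Tuple

# A quad as four integer IDs: (graph, subject, predicate, object).
Quad = Tuple[int, int, int, int]

# Assumption for this sketch: ID 0 denotes the default graph.
DEFAULT_GRAPH = 0


class TermDictionary:
    """Maps RDF terms to integer IDs, starting at 1.

    Simplification: one ID space is shared by all terms.
    """

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._terms: List[str] = []

    def encode(self, term: str) -> int:
        """Return the ID of a term, assigning the next free ID if it is new."""
        if term not in self._ids:
            self._terms.append(term)
            self._ids[term] = len(self._terms)
        return self._ids[term]

    def decode(self, term_id: int) -> str:
        """Return the term that belongs to an ID."""
        return self._terms[term_id - 1]


def encode_quad(d: TermDictionary, graph: Optional[str],
                s: str, p: str, o: str) -> Quad:
    """Encode one statement; graph=None places it in the default graph."""
    g = DEFAULT_GRAPH if graph is None else d.encode(graph)
    return (g, d.encode(s), d.encode(p), d.encode(o))


d = TermDictionary()
# Actual data in the default graph, metadata in a named graph.
data = encode_quad(d, None, "ex:sensor1", "ex:temperature", '"21.5"')
meta = encode_quad(d, "ex:metadata", "ex:sensor1", "ex:location", '"kitchen"')
print(data, meta)
```

Keeping the graph as the first component means a receiver can separate metadata from data by looking at a single integer per statement.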
Canonical form (optional)
For cryptographic signatures, a canonical form would be required. Any definition in the specification that allows more than one way to encode the same RDF data should be tightened so that each dataset has exactly one valid encoding.
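The following Python sketch illustrates why a unique encoding matters for signatures. It assumes triples are already encoded as integer IDs, removes duplicates, sorts them in a fixed order, and hashes the resulting bytes. The function names and the 32-bit big-endian layout are hypothetical. The sketch also assumes that the dictionary assigns IDs deterministically and it does not address blank node labelling, both of which a real canonical form would have to cover.

```python
import hashlib
import struct
from typing import Iterable, Tuple

# A triple as three integer IDs: (subject, predicate, object).
Triple = Tuple[int, int, int]


def canonical_bytes(triples: Iterable[Triple]) -> bytes:
    """Serialize triples in one fixed way: deduplicated, sorted, big-endian."""
    unique_sorted = sorted(set(triples))
    return b"".join(struct.pack(">III", s, p, o) for s, p, o in unique_sorted)


def digest(triples: Iterable[Triple]) -> str:
    """SHA-256 over the canonical bytes; this is the value one would sign."""
    return hashlib.sha256(canonical_bytes(triples)).hexdigest()


# The same graph, given in a different order and with a duplicate statement.
a = [(2, 1, 3), (1, 1, 2), (1, 1, 2)]
b = [(1, 1, 2), (2, 1, 3)]
print(digest(a) == digest(b))
```

Without the sort and deduplication step, the two inputs would serialize to different bytes and a signature over one would not verify against the other, even though they describe the same RDF data.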