@internetimagery
Forked from anonymous/thoughts.txt
Last active February 13, 2018 11:00
ramblings

Thoughts on a key-value backed asset store.

  • A key-value store provides maximum simplicity, maximum portability, and data safety.
  • A key-value store used in this manner (immutable) could be hard to maintain, e.g. cleaning up old data.
  • The key-value store is assumed to conform to map[string][]byte.
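
To make the map[string][]byte assumption concrete, here is a minimal sketch of what the backend contract might look like. The names Store, MemStore, ErrNotFound and ErrExists are hypothetical and not part of these notes. Refusing overwrites in Put is also an assumption, made to reflect the immutable usage described above.

```go
package main

import (
	"errors"
	"sync"
)

// Hypothetical error values; the notes do not define any.
var (
	ErrNotFound = errors.New("key not found")
	ErrExists   = errors.New("key already exists")
)

// Store is the smallest contract the notes assume: map[string][]byte semantics.
type Store interface {
	Get(key string) ([]byte, error)
	Put(key string, value []byte) error
	Has(key string) (bool, error)
}

// MemStore is an in-memory Store backed by a plain map, handy for tests.
type MemStore struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func NewMemStore() *MemStore {
	return &MemStore{data: make(map[string][]byte)}
}

func (m *MemStore) Get(key string) ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.data[key]
	if !ok {
		return nil, ErrNotFound
	}
	// Return a copy so callers cannot mutate stored data.
	return append([]byte(nil), v...), nil
}

// Put stores value under key. Assumption: keys are write-once (immutable use).
func (m *MemStore) Put(key string, value []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.data[key]; ok {
		return ErrExists
	}
	m.data[key] = append([]byte(nil), value...)
	return nil
}

func (m *MemStore) Has(key string) (bool, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	_, ok := m.data[key]
	return ok, nil
}
```

Any real backend (filesystem, object storage, an embedded database) would only need to satisfy the same three methods.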

Data abstracted at various levels.

Base level types:

Blob

Contains the binary data itself. The key is a structured combination of file size, hash value and hash type (e.g. "10002-sdfsfj2342ljsdlfs-SHA1"), so the key can be used to verify data integrity directly if required. A straightforward data type. Blobs can be deleted during purges of old content: a cleanup tool working at the higher levels runs through and gathers links to all content that should exist, and any leftovers can be purged.
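
As an illustration of the key scheme, the sketch below builds a key of the form size-hash-type and checks data against it. BlobKey and VerifyBlob are hypothetical names. Lowercase hex encoding of the digest is an assumption, since the notes only give a made-up example key.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// BlobKey builds "<size>-<hex digest>-SHA1" for data.
// Assumption: the digest is hex encoded; the notes do not say.
func BlobKey(data []byte) string {
	sum := sha1.Sum(data)
	return fmt.Sprintf("%d-%s-SHA1", len(data), hex.EncodeToString(sum[:]))
}

// VerifyBlob checks data against the size, digest and hash type encoded in key.
func VerifyBlob(key string, data []byte) error {
	parts := strings.Split(key, "-")
	if len(parts) != 3 {
		return fmt.Errorf("malformed blob key %q", key)
	}
	size, err := strconv.Atoi(parts[0])
	if err != nil {
		return fmt.Errorf("bad size in key %q: %w", key, err)
	}
	if parts[2] != "SHA1" {
		return fmt.Errorf("unsupported hash type %q", parts[2])
	}
	// The size check is cheap and catches truncated transfers before hashing.
	if size != len(data) {
		return fmt.Errorf("size mismatch: key says %d, got %d", size, len(data))
	}
	if BlobKey(data) != key {
		return fmt.Errorf("digest mismatch for key %q", key)
	}
	return nil
}
```

For example, BlobKey([]byte("abc")) yields "3-a9993e364706816aba3e25717850c26c9cd0d89d-SHA1", and VerifyBlob returns an error if the stored bytes no longer match that key.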

Header

(Consider making this the only base data type, with versioning as base functionality. E.g. what happens if a blob transfer fails?)

Metadata file (json? yaml? csv? etc). These are versioned assets and follow special rules. Key names can be essentially anything. Each key is appended with "v%03d" (e.g. v001, v023, v640). Accessing the latest version is unintuitive: start at v001 and increment, checking whether each key exists, until a lookup fails. The last key that existed is the latest version. Done this way, it works with the most basic possible key-value tech. Local caching of the latest known version is important. Enhancements can be added where the backend supports them, degrading gracefully to this approach where it does not. Versions always increase by 1. Headers are never cleaned up individually during data purging, as they are links in the chain that leads to the latest data. An entire chain can be purged. Transfers in progress need to be tracked so that two simultaneous attempts at creating a version cannot use the same key (race condition). Failed transfers can be retried.
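
The probing and tracking described above can be sketched roughly as follows. VersionKey, LatestVersion and Claims are hypothetical names, and the exists callback stands in for whatever lookup the backend offers. Appending the suffix directly to the name with no separator is an assumption. The Claims type only guards writers inside one process; coordinating several machines would need backend support.

```go
package main

import (
	"fmt"
	"sync"
)

// VersionKey appends the "v%03d" suffix to a header name.
// Assumption: no separator between name and suffix.
func VersionKey(name string, version int) string {
	return fmt.Sprintf("%sv%03d", name, version)
}

// LatestVersion probes upward from a known version until a key is missing.
// known is a locally cached version that is assumed to exist, or 0 if none.
// It returns 0 when no version exists at all.
func LatestVersion(exists func(key string) (bool, error), name string, known int) (int, error) {
	latest := known
	for {
		ok, err := exists(VersionKey(name, latest+1))
		if err != nil {
			return 0, err
		}
		if !ok {
			return latest, nil
		}
		latest++
	}
}

// Claims tracks version keys that have a transfer in progress, so two
// writers in the same process cannot pick the same key.
type Claims struct {
	mu       sync.Mutex
	inFlight map[string]bool
}

func NewClaims() *Claims {
	return &Claims{inFlight: make(map[string]bool)}
}

// TryClaim reserves key and reports false if another transfer already holds it.
func (c *Claims) TryClaim(key string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.inFlight[key] {
		return false
	}
	c.inFlight[key] = true
	return true
}

// Release frees key after the transfer finishes or fails, allowing a retry.
func (c *Claims) Release(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.inFlight, key)
}
```

Passing the cached version as known is what keeps the probe short: with a fresh cache it costs a single failed lookup instead of one lookup per historical version.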

Low level types:

Assets

Consist of a Header type and a Blob type. The Header contains all metadata relating to the content, such as filepath, creation dates, permissions, users, etc. This can be customized somewhat for system features, but is fairly rigid as it is at system level. Assets can be "saved over" by writing a new version of the Header with the new information, linking to a new Blob with the new data.

Essential tags:
  • blob: tag that has a reference to a corresponding Blob type key.
  • type: tag that specifies "asset".
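
A rough sketch of an asset header and the "save over" step, assuming JSON as the metadata format (the notes leave the format open). AssetHeader, ParseAssetHeader, SaveOver and the meta field are hypothetical. The put callback stands in for the backend write.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AssetHeader is one version of an asset's metadata.
// Only the type and blob tags come from the notes; Meta is a placeholder.
type AssetHeader struct {
	Type string            `json:"type"`           // always "asset"
	Blob string            `json:"blob"`           // key of the linked Blob
	Meta map[string]string `json:"meta,omitempty"` // filepath, dates, users, ...
}

// ParseAssetHeader decodes a header and checks the two essential tags.
func ParseAssetHeader(raw []byte) (AssetHeader, error) {
	var h AssetHeader
	if err := json.Unmarshal(raw, &h); err != nil {
		return h, fmt.Errorf("decode header: %w", err)
	}
	if h.Type != "asset" || h.Blob == "" {
		return h, fmt.Errorf("not a valid asset header: type=%q blob=%q", h.Type, h.Blob)
	}
	return h, nil
}

// SaveOver writes header version latest+1 pointing at a new blob key and
// returns the header key it wrote. Older versions are left untouched.
func SaveOver(put func(key string, value []byte) error, name string, latest int,
	blobKey string, meta map[string]string) (string, error) {
	raw, err := json.Marshal(AssetHeader{Type: "asset", Blob: blobKey, Meta: meta})
	if err != nil {
		return "", err
	}
	key := fmt.Sprintf("%sv%03d", name, latest+1)
	if err := put(key, raw); err != nil {
		return "", err
	}
	return key, nil
}
```

Writing the Blob before the Header would mean a failed blob transfer never leaves a header pointing at missing data, though that ordering is a design choice rather than something these notes settle.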

Groups

Consist of a single Header type. The Header contains all metadata relating to the group: names, dates, permissions, etc. Groups can be modified by writing a new version of the Header with the new information.

Essential tags:
  • children: tag that links to a list of Header types (assets and other groups).
  • type: tag that specifies "group".
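
With groups linking to children and assets linking to blobs, the "gather links to content that should exist" half of a cleanup tool could be sketched as a graph walk. This assumes JSON headers and that the children tag holds full versioned header keys; both are assumptions, as are the names Header and LiveBlobs. Any blob key not in the returned set would be a candidate for purging.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Header covers both low level types; which fields are set depends on Type.
type Header struct {
	Type     string   `json:"type"`               // "asset" or "group"
	Blob     string   `json:"blob,omitempty"`     // assets only
	Children []string `json:"children,omitempty"` // groups only
}

// LiveBlobs walks from a root header key and returns the set of blob keys
// reachable through groups and assets.
func LiveBlobs(get func(key string) ([]byte, error), root string) (map[string]bool, error) {
	live := map[string]bool{}
	seen := map[string]bool{} // guards against shared children and cycles
	stack := []string{root}
	for len(stack) > 0 {
		key := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[key] {
			continue
		}
		seen[key] = true

		raw, err := get(key)
		if err != nil {
			return nil, fmt.Errorf("read %q: %w", key, err)
		}
		var h Header
		if err := json.Unmarshal(raw, &h); err != nil {
			return nil, fmt.Errorf("decode %q: %w", key, err)
		}
		switch h.Type {
		case "asset":
			live[h.Blob] = true
		case "group":
			stack = append(stack, h.Children...)
		default:
			return nil, fmt.Errorf("header %q has unknown type %q", key, h.Type)
		}
	}
	return live, nil
}
```

Note that this only marks blobs reachable from the header versions it visits; whether blobs referenced by older header versions should also count as live is a policy question the notes leave open.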

Mid level types:

Media

Consists of an Asset type that contains a config file with metadata about an asset, or multiple assets that are linked. Contains media types. Highly customizable.
