@lbfalvy
Last active April 23, 2022 01:20

title: Fast streaming file access protocol
docname: draft-fsfap-01
date: 2022-04-22

ipr: trust200902
area: Network Protocols
wg: Group 7
kw: Internet-Draft
cat: std
submissionType: IETF

coding: us-ascii
pi:
  toc: yes
  symrefs: yes

author:
  - ins: xxx
    name: Olliver
    org: University of Surrey
    email: foo@example.com
  - ins: O'Driscoll
    name: Liam O'Driscoll
    org: University of Surrey
    email: foo@example.com
  - ins: Bethlenfalvy
    name: Lawrence Bethlenfalvy
    org: University of Surrey
    email: lbfalvy@protonmail.com

entity:
  SELF: "[RFCXXXX]"

--- abstract

This protocol provides a fast, scalable way to traverse directories and download or stream large files from a server. The protocol is entirely connectionless: it operates purely on small, redundant requests and responses, and relies on concurrency to achieve higher performance than strictly ordered TCP-based protocols.

--- middle

Terminology

In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119 {{!RFC2119}}.

Description

Nodes are divided into slices of 371 bytes for delivery, which clients need not request in order. All nodes are indexed by 4-byte integers. For the sake of extensibility, implementations SHOULD pick numbers from the lower half of the address space, with the upper half containing virtual nodes used by extensions.
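As an illustration of the slicing and addressing rules above, the following is a minimal sketch and not part of the protocol. The names `SLICE_SIZE`, `split_into_slices` and `is_extension_node` are hypothetical, and the sketch assumes that "upper half" means unsigned 32-bit IDs with the top bit set.

```python
SLICE_SIZE = 371  # bytes of node content carried per slice


def split_into_slices(content: bytes) -> list:
    """Cut a node's content into consecutive slices of at most SLICE_SIZE bytes."""
    return [content[i:i + SLICE_SIZE] for i in range(0, len(content), SLICE_SIZE)]


def is_extension_node(node_id: int) -> bool:
    """Assumption: the upper half of the address space is 0x80000000..0xFFFFFFFF."""
    if not 0 <= node_id <= 0xFFFFFFFF:
        raise ValueError("node IDs are 4-byte integers")
    return node_id >= 0x80000000
```

Every slice except possibly the last one is exactly 371 bytes long, which is what lets a client compute byte offsets from slice indices alone.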

Nodes fall into two categories: folders and files. The content of a file is unconstrained binary data; its decoding is left entirely to the client application.

Folders contain UTF-8 encoded, newline-separated text, where every line is an entry. The fields of an entry are colon-separated. Entries referring to files contain the file's node ID, then the file name. Entries referring to subfolders contain the word "folder", then the node ID, then the folder name. Names may consist of any Unicode character other than colons or newlines. Folders MUST include a trailing newline.

Example:

    1:A.txt
    4:b.md
    22:c.py
    folder:3:src

Folders MAY include an entry with the name ".." referring to their parent folder, if they have one, and they MAY include an entry with the name "." referring to themselves. Folders MUST NOT define "." as anything other than the current node.

Extensions MAY define custom metadata or other entries using entries of different shape. Additional metadata defined using extensions MUST include a trailing colon, whereas custom entries MAY include a filename in their last field. Clients MAY read the last field in unrecognised lines as an object of unknown type. Clients MUST NOT try to read objects of unknown type as files.
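The entry rules above can be sketched as a small parser. This is an illustrative sketch under stated assumptions, not a reference implementation: `Entry` and `parse_folder` are hypothetical names, node IDs are assumed to be written in decimal as in the example, and metadata lines (those with a trailing colon) are simply skipped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    kind: str                # "file", "folder", or "unknown"
    node_id: Optional[int]   # None for objects of unknown type
    name: str


def _is_decimal(text: str) -> bool:
    # Assumption: node IDs are plain ASCII decimal numbers.
    return text.isascii() and text.isdigit()


def parse_folder(data: bytes) -> List[Entry]:
    entries = []
    for line in data.decode("utf-8").split("\n"):
        if not line:
            continue  # the mandatory trailing newline leaves an empty last element
        fields = line.split(":")
        if len(fields) == 2 and _is_decimal(fields[0]) and fields[1]:
            entries.append(Entry("file", int(fields[0]), fields[1]))
        elif len(fields) == 3 and fields[0] == "folder" and _is_decimal(fields[1]):
            entries.append(Entry("folder", int(fields[1]), fields[2]))
        elif fields[-1] == "":
            continue  # trailing colon: extension metadata, no object to expose
        else:
            # Unrecognised shape: expose the last field as an object of unknown
            # type, which must never be read as a file.
            entries.append(Entry("unknown", None, fields[-1]))
    return entries
```

Keeping unknown entries separate from files means a client can still display them without ever requesting their content.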

Node 0 is a folder; this folder is the root of the file system. Clients MUST NOT assume that any node other than node 0 is defined. Node -1 (node 0xFFFFFFFF) is a newline-separated list of options used by extensions. Extensions that define custom nodes MUST specify the indices of these in this node. Servers that don't use any extensions MAY omit this node.

UDP Interface

The protocol uses UDP for transmission of messages, which are always at most 400 bytes long. Every message is delivered as a single datagram.

Format

For ease of implementation, all messages have the same set of header fields; unused fields are set to 0.

 0      7 8     15 16    23 24    31
+--------+--------+--------+--------+
|             checksum              |
+--------+--------+--------+--------+
|  type  | slice index
+--------+--------+--------+--------+
         | last slice index
+--------+--------+--------+--------+
         | node index
+--------+--------+--------+--------+
         | body length     | body...
+--------+--------+--------+...
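The layout above can be sketched with Python's `struct` module. The draft does not state a byte order, so this sketch assumes network byte order (big-endian) with no padding between fields. `HEADER`, `pack_message` and `unpack_message` are hypothetical names, and the checksum is passed in by the caller.

```python
import struct

# Assumption: big-endian, unpadded. Fields in order:
# checksum (4), type (1), slice index (4), last slice index (4),
# node index (4), body length (2) = 19 bytes in total.
HEADER = struct.Struct("!IBIIIH")


def pack_message(checksum, msg_type, slice_index, last_slice_index, node_index, body=b""):
    """Serialise the header fields followed by the body."""
    header = HEADER.pack(checksum, msg_type, slice_index,
                         last_slice_index, node_index, len(body))
    return header + body


def unpack_message(datagram):
    """Split a datagram into its header fields and body."""
    (checksum, msg_type, slice_index,
     last_slice_index, node_index, body_len) = HEADER.unpack_from(datagram)
    body = datagram[HEADER.size:HEADER.size + body_len]
    if len(body) != body_len:
        raise ValueError("datagram is shorter than its body length field claims")
    return checksum, msg_type, slice_index, last_slice_index, node_index, body
```

Under these assumptions a full slice occupies 19 + 371 = 390 bytes, which fits within the 400-byte message limit.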

Meaning of header fields

  • checksum

    The checksum is calculated by dividing the rest of the message into 4-byte blocks, summing them and then taking the 32-bit 2's complement value of the result.

  • type

    The message type has three values:

    0: request
    1: response
    2: error

    Requests are packets sent from clients to servers; responses are packets sent by servers to clients as a result of requests. Errors are sent in response to malformed requests.

  • slice index, last slice index

    The index of the requested slice, and the index of the last slice of the file. The latter is specified to enable the client to pick an optimal loading strategy based on the file size. In requests, the last slice index is set to zero. In responses, the last slice index is set to 1 if the file is empty; otherwise it is set to the index of the last valid (non-empty) slice. If the requested file doesn't exist, the last slice index is set to zero. Responses to out-of-bounds and malformed requests still carry the request's slice index and the file's last slice index.

  • node index

    The index of the requested node. Together with the slice index, this can be used to match the response or error to a previously made request.

  • body length

    The number of bytes following this field that contain data from the file. This MUST be 371 if the requested slice isn't the last slice of the file, and 0 if the requested slice is out of bounds, or if the file is empty. In requests and errors the body length is set to zero.
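The checksum rule described above can be sketched as follows. The draft leaves two details open, so the sketch makes two labelled assumptions: blocks are read as big-endian unsigned integers, and a short final block is padded with zero bytes. `compute_checksum` and `verify` are hypothetical names.

```python
def compute_checksum(rest: bytes) -> int:
    """Negated 32-bit sum of the 4-byte blocks of everything after the checksum field."""
    # Assumption: a short final block is zero-padded; the text does not say.
    padded = rest + b"\x00" * (-len(rest) % 4)
    # Assumption: each block is a big-endian unsigned integer.
    total = sum(int.from_bytes(padded[i:i + 4], "big")
                for i in range(0, len(padded), 4))
    # 32-bit two's complement of the sum.
    return -total & 0xFFFFFFFF


def verify(datagram: bytes) -> bool:
    """True if the leading 4-byte checksum matches the rest of the datagram."""
    if len(datagram) < 4:
        return False
    return int.from_bytes(datagram[:4], "big") == compute_checksum(datagram[4:])
```

With this definition, adding the checksum to the sum of the remaining blocks yields zero modulo 2^32, so a receiver can check a message in a single pass.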

User Interface

A user interface SHOULD allow

  • adapting a UDP host as a folder
  • listing and traversing folders
  • streaming files

Any other functionality provided by a user interface SHOULD handle the case where the specified host doesn't support the required extensions, and implementations MUST verify that the server supports an extension before attempting to use it.

Error handling

Clients are free to choose their own timeouts according to network characteristics and the perceived significance of delays or missing slices. For example, on a network with low delays but high packet loss, a client might choose a low timeout, since a response that didn't arrive quickly is probably lost. A media player might choose a low timeout of a few tens of milliseconds when there's little preloaded data left, to ensure uninterrupted playback. On the other hand, a long-running download in the background might choose timeouts of up to several seconds to minimise load on the server, since delays are insignificant and the packet might just be delayed.
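One way a client might track outstanding requests and decide which to repeat is sketched below. It relies on the fact that a response or error carries the node index and slice index of the request it answers. `PendingRequests` and its methods are hypothetical names, and the timeout value is entirely the client's choice.

```python
import time


class PendingRequests:
    """Tracks outstanding slice requests, keyed by (node index, slice index)."""

    def __init__(self, timeout: float):
        self.timeout = timeout  # seconds, chosen by the client
        self._sent_at = {}      # (node_index, slice_index) -> send time

    def sent(self, node_index, slice_index, now=None):
        """Record that a request was sent (or sent again)."""
        when = time.monotonic() if now is None else now
        self._sent_at[(node_index, slice_index)] = when

    def answered(self, node_index, slice_index) -> bool:
        """Mark a response as received; False means it matched no outstanding request."""
        return self._sent_at.pop((node_index, slice_index), None) is not None

    def expired(self, now=None):
        """Requests older than the timeout, which the client may want to send again."""
        now = time.monotonic() if now is None else now
        return sorted(key for key, when in self._sent_at.items()
                      if now - when >= self.timeout)
```

A media player could create this with a timeout of a few tens of milliseconds, while a background download could use several seconds; the bookkeeping is the same in both cases.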

Caching

Clients are free to choose a caching strategy in accordance with the nature of the data. Some file formats may include a timestamp in the header of the file, or timestamps can be included by extensions in the folder listings. Servers MUST store modified versions of files under the same index as the old version, or remove the old index. Caching clients MAY assume that an index that previously held a file and now holds different content is the new content of the same file, without traversing the file tree again.

Security

This standard does not define any security measures or encryption, but files can be encrypted and/or signed with the signatures stored in extension-defined nodes. Encryption and signing schemes SHOULD support streaming, for example by storing separate signatures for every slice or every few slices.
