Record

A record is a tree of byte sequences and optional hashes, primarily used to store public keys, envelopes, messages, or other data.

A record may look as follows:

(root)
  title
    Mountain hike
  time
    start
      2015-08-05 09:00:00 UTC
    end
      2015-08-05 17:00:00 UTC
  confirmed attendees
    John  # 34bf..7e
    Bob   # a529..10

Structure

Each tree node of a record contains a byte sequence, an optional hash, and a list of child nodes.

Nodes are ordered, but the order does not always matter. The root node is not stored.
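The tree structure can be modelled in a few lines of Python. This is only an illustrative sketch: the `Node` class and its `add` method are hypothetical names, not part of any existing library, and the attendee hashes are zero-filled placeholders because the example above elides them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One record node: a byte sequence, an optional hash, ordered children."""
    data: bytes = b""
    hash: Optional[bytes] = None  # 32-byte object hash, or None
    children: List["Node"] = field(default_factory=list)

    def add(self, data: bytes = b"", hash: Optional[bytes] = None) -> "Node":
        """Append a child node and return it, so calls can be chained."""
        child = Node(data, hash)
        self.children.append(child)
        return child


PLACEHOLDER_HASH = bytes(32)  # stands in for a real object hash

# The example record shown earlier. The root itself carries no data.
root = Node()
root.add(b"title").add(b"Mountain hike")
time = root.add(b"time")
time.add(b"start").add(b"2015-08-05 09:00:00 UTC")
time.add(b"end").add(b"2015-08-05 17:00:00 UTC")
attendees = root.add(b"confirmed attendees")
attendees.add(b"John", hash=PLACEHOLDER_HASH)
attendees.add(b"Bob", hash=PLACEHOLDER_HASH)
```

Children are kept in a list, so their order is preserved even where the application does not care about it.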

Byte sequences

Byte sequences may contain arbitrary data. Their encoding and interpretation are protocol- or application-specific. The following encodings are suggested:

Data type               Byte sequence
Text                    A UTF-8 sequence of Unicode characters.
Boolean true            A non-zero-length byte sequence, usually the single byte 0x79 (ASCII "y").
Boolean false           A zero-length byte sequence.
Integer with sign       A big-endian signed integer of arbitrary length.
Integer without sign¹   A big-endian unsigned integer of arbitrary length.
Fixed-point number      A signed integer shifted by a predefined number of bits.
Floating-point number   A single-precision (4 bytes) or double-precision (8 bytes) IEEE floating point number.
Date                    A signed integer, denoting the number of milliseconds that have passed since epoch.
  1. Unsigned integers are discouraged, as they are easily confused with signed integers, which may lead to errors that go unnoticed for a long time.
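A few of the suggested encodings could be implemented as follows. This is a sketch under stated assumptions: all function names are hypothetical, "epoch" is taken to mean the Unix epoch (1970-01-01 00:00:00 UTC), and integers are written with at least one byte, although the text above allows any length.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumption: Unix epoch


def encode_text(text: str) -> bytes:
    return text.encode("utf-8")


def encode_bool(value: bool) -> bytes:
    # True is the single byte 0x79 ("y"), False is the empty sequence.
    return b"y" if value else b""


def decode_bool(data: bytes) -> bool:
    # Any non-zero-length byte sequence counts as true.
    return len(data) > 0


def encode_signed(number: int) -> bytes:
    # Big-endian two's complement with enough bytes to hold the sign bit.
    length = (number.bit_length() + 8) // 8
    return number.to_bytes(length, "big", signed=True)


def decode_signed(data: bytes) -> int:
    return int.from_bytes(data, "big", signed=True)


def encode_date(moment: datetime) -> bytes:
    # Whole milliseconds since the epoch; expects a timezone-aware datetime.
    return encode_signed((moment - EPOCH) // timedelta(milliseconds=1))


def decode_date(data: bytes) -> datetime:
    return EPOCH + timedelta(milliseconds=decode_signed(data))
```

Because the integer length is arbitrary, a decoder should accept any length rather than expect a fixed width.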

Since the overhead of a node is small, it is usually not worth packing multiple values into a single node; creating a small subtree is easier. Arrays of a fixed-length data type, such as arrays of 4-byte integers, are an exception.

Linking objects

Other objects are linked by adding their hash. If the objects are encrypted, the corresponding AES key is stored in the byte sequence:

Data type                     Byte sequence        Hash
Link to an object             -                    Object hash
Link to an encrypted object   AES key (32 bytes)   Object hash

Dictionaries

Records, or parts thereof, may be interpreted as key-value dictionaries:

(root or parent)
  key 1
    value 1
  key 2
    value 2
  ...

Key nodes usually hold a short ASCII sequence (preferably using lowercase characters, spaces and dashes only) and must be unique. Their order is not significant.

The corresponding value is stored in the child nodes. Often, there is only one such value node for each key.
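Reading a subtree as a dictionary might look like the sketch below. The `Node` class and the helpers `as_dict` and `get_text` are hypothetical names introduced for illustration only. Duplicate keys are rejected because keys must be unique.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    data: bytes = b""
    hash: Optional[bytes] = None
    children: List["Node"] = field(default_factory=list)


def as_dict(parent: Node) -> Dict[bytes, List[bytes]]:
    """Map each key node's bytes to the byte sequences of its value nodes."""
    result: Dict[bytes, List[bytes]] = {}
    for key_node in parent.children:
        if key_node.data in result:
            raise ValueError(f"duplicate key: {key_node.data!r}")
        result[key_node.data] = [value.data for value in key_node.children]
    return result


def get_text(parent: Node, key: str, default: str = "") -> str:
    """Return the first value of a key as text, or a default if absent."""
    values = as_dict(parent).get(key.encode("ascii"), [])
    return values[0].decode("utf-8") if values else default


root = Node(children=[
    Node(b"title", children=[Node(b"Mountain hike")]),
    Node(b"confirmed attendees", children=[Node(b"John"), Node(b"Bob")]),
])
print(get_text(root, "title"))  # prints "Mountain hike"
```

Values are returned as lists because a key may have more than one value node, even though a single value is the common case.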

Tables

Records, or parts thereof, may be interpreted as tables:

(root or parent)
  primary key of row 1
    content of cell A1
    content of cell B1
    ...
  primary key of row 2
    content of cell A2
    content of cell B2
    ...
  ...

The first row may contain column headers.
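A table view can be derived in much the same way. In this sketch, `Node` and `as_table` are hypothetical names, and whether the first row is a header is passed in by the caller, since the record itself does not say.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Row = Tuple[bytes, List[bytes]]  # (primary key, cell contents)


@dataclass
class Node:
    data: bytes = b""
    hash: Optional[bytes] = None
    children: List["Node"] = field(default_factory=list)


def as_table(parent: Node, has_header: bool = False) -> Tuple[Optional[Row], List[Row]]:
    """Return (header row or None, data rows) in node order."""
    rows: List[Row] = [
        (row.data, [cell.data for cell in row.children])
        for row in parent.children
    ]
    header = rows.pop(0) if has_header and rows else None
    return header, rows


table = Node(children=[
    Node(b"name", children=[Node(b"start"), Node(b"end")]),
    Node(b"hike", children=[Node(b"09:00"), Node(b"17:00")]),
])
header, rows = as_table(table, has_header=True)
```

Unlike the dictionary view, row and cell order is kept, because cell positions identify the columns.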

Serialization draft

A record object has the following structure:

Field    Part of                 Size
H        Condensation header     4 bytes
Hashes   Condensation header     H ⨯ 32 bytes
Nodes    Encrypted object data

Nodes are stored in depth-first traversal order. The root node is omitted. Each node is encoded as follows:

Field           Size
F               1 byte
L               0‒8 bytes
K               0‒4 bytes
Byte sequence   B bytes

The bits of F have the following meaning:

Bits       Value   Meaning
0‒4        00000   B = 0, L has 0 bytes
           ...
           11101   B = 29, L has 0 bytes
           11110   B = 30 + L, L has 1 byte
           11111   B = L, L has 8 bytes
5          0       No hash, K has 0 bytes
           1       Hash, K has 4 bytes
6 and 7    1       Node has children
           1       Node has more siblings

The values for bits 0‒4 are written with bit 4 first.

Bits 6 and 7 along with the node sequence encode the tree structure.

L and K are stored as unsigned big-endian integers of the indicated length. K is the index of a hash in the object header. Multiple nodes may refer to the same hash.

Merge semantics

Merge semantics are application- or protocol-specific.