Objects, Accounts, and Stores
Condensation is a complete data system. It can
- store and retrieve data like a database or a file system,
- send data to other users or devices like messengers,
- and share and synchronize data like cloud services.
Condensation follows a distributed actor and message-passing approach, and encrypts all data end-to-end.
With Condensation, data is stored as trees of objects. An object consists of a list of H hashes pointing to other objects, followed by a sequence of bytes (the data part).
H thus denotes the number of hashes preceding the data. The data part often holds a record, and is usually encrypted.
Objects are identified by their SHA-256 hash (32 bytes).
Objects are inherently immutable. Changing an object's content yields a different hash, and therefore makes it a different object.
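The relation between an object's content and its identifier can be sketched as follows. This is a minimal illustration, not the reference implementation: the byte layout (a 4-byte big-endian count H, then H hashes of 32 bytes each, then the data) is an assumption made for the example, and the function names `serialize_object` and `object_hash` are hypothetical.

```python
import hashlib
import struct

HASH_LENGTH = 32  # SHA-256 digest size in bytes


def serialize_object(hashes: list[bytes], data: bytes) -> bytes:
    """Pack an object as: H (assumed 4-byte big-endian) | H hashes | data."""
    for h in hashes:
        if len(h) != HASH_LENGTH:
            raise ValueError("each hash must be exactly 32 bytes")
    return struct.pack(">I", len(hashes)) + b"".join(hashes) + data


def object_hash(serialized: bytes) -> bytes:
    """The object's identifier: the SHA-256 hash of its serialized bytes."""
    return hashlib.sha256(serialized).digest()


# A leaf without children, and a parent pointing to that leaf.
leaf = serialize_object([], b"hello")
parent = serialize_object([object_hash(leaf)], b"parent record")

# Changing a single byte of content yields a different object.
changed = serialize_object([], b"hello!")
```

Because the identifier is derived from the content, `leaf` and `changed` necessarily have different hashes, which is what makes objects immutable in practice.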
Through the hash list, an object may span a tree.
Trees can be arbitrarily large, and may form blockchains, binary search trees, or other data structures.
Naturally, the object hash also serves as the identifier of the tree it spans.
Since trees are immutable, they cannot be modified directly. To modify data, a new tree is derived from the current tree:
If the tree is structured wisely, a large part of the old tree's data can be reused in the new tree. This yields efficient data modification, and efficient versioning.
Several modifications can be applied at the same time, leading to the equivalent of a database transaction.
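Deriving a new tree while reusing unchanged subtrees can be sketched as below. The in-memory dictionary `objects`, the helper names `put`, `children`, and `replace_child`, and the byte layout (4-byte big-endian hash count, then the hashes, then the data) are assumptions made for this illustration only.

```python
import hashlib
import struct

objects: dict[bytes, bytes] = {}  # hypothetical in-memory object store


def put(hashes: list[bytes], data: bytes) -> bytes:
    """Store an object (assumed layout: count | hashes | data); return its hash."""
    serialized = struct.pack(">I", len(hashes)) + b"".join(hashes) + data
    digest = hashlib.sha256(serialized).digest()
    objects[digest] = serialized
    return digest


def children(digest: bytes) -> list[bytes]:
    """Read the hash list of a stored object."""
    serialized = objects[digest]
    (count,) = struct.unpack(">I", serialized[:4])
    return [serialized[4 + 32 * i: 36 + 32 * i] for i in range(count)]


def replace_child(root: bytes, index: int, new_child: bytes, data: bytes) -> bytes:
    """Derive a new root that differs from the old one in a single child."""
    hashes = children(root)
    hashes[index] = new_child
    return put(hashes, data)


# Version 1: a root with two leaves.
leaf_a = put([], b"chapter 1")
leaf_b = put([], b"chapter 2")
root_v1 = put([leaf_a, leaf_b], b"book")

# Version 2: only the second leaf changes; the first leaf is shared.
leaf_b2 = put([], b"chapter 2, revised")
root_v2 = replace_child(root_v1, 1, leaf_b2, b"book")
```

Both versions remain readable through their root hashes, and only two new objects (the revised leaf and the new root) had to be stored for the second version.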
Remembering a tree hash is enough to refer to an arbitrary amount of data. Hence – besides storing objects – an actor (user) just needs to manage a few hashes to store, publish, and share data. For that, each actor owns an account with three boxes. Each box keeps a set of hashes:
- The public box points to the actor's public information. It should contain a single hash.
- The private box points to the actor's private data, and should contain a single hash as well.
- The message box collects messages sent by other actors, and may contain an arbitrary number of hashes.
In contrast to objects and trees, boxes are mutable. Hashes can be added and removed.
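An account and its three boxes can be modelled as three mutable sets of hashes. The class `Account` and the method `replace_single` below are hypothetical names chosen for this sketch; updating a single-hash box by adding the new hash before removing the old one is likewise an assumption about how such a box might be maintained.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Account:
    """An account with three boxes, each modelled as a mutable set of hashes."""
    public: set[bytes] = field(default_factory=set)
    private: set[bytes] = field(default_factory=set)
    messages: set[bytes] = field(default_factory=set)

    def replace_single(self, box_name: str, new_hash: bytes) -> None:
        """Point the public or private box at a new tree."""
        if box_name not in ("public", "private"):
            raise ValueError("only the public and private boxes hold a single hash")
        box = getattr(self, box_name)
        stale = box - {new_hash}
        box.add(new_hash)             # add the new hash first ...
        box.difference_update(stale)  # ... then remove whatever was there before


account = Account()
v1 = hashlib.sha256(b"private tree v1").digest()  # stand-ins for tree hashes
v2 = hashlib.sha256(b"private tree v2").digest()
account.replace_single("private", v1)
account.replace_single("private", v2)

# The message box simply accumulates hashes.
account.messages.add(hashlib.sha256(b"message 1").digest())
account.messages.add(hashlib.sha256(b"message 2").digest())
```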
Objects and accounts are kept on a Condensation store.
The objects make up the bulk of the data, and are organized as a hash table. The account list has a slightly more sophisticated structure, but only deals with 32-byte hashes.
A Condensation store can be accessed through five functions:
- get an object
- put an object
- list a box
- add a hash (or envelope) to a box
- remove a hash from a box
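The five functions listed above map naturally onto a small interface. The following in-memory store is only a sketch: the class name `MemoryStore`, the method names, and the box labels are hypothetical, and unlike a real store it neither validates object structure nor handles envelopes or access control.

```python
import hashlib


class MemoryStore:
    """Hypothetical in-memory store exposing the five functions listed above."""

    def __init__(self):
        self.objects = {}  # object hash -> object bytes
        self.boxes = {}    # (account hash, box label) -> set of hashes

    def get_object(self, object_hash):
        """Return the object bytes, or None if the object is not stored."""
        return self.objects.get(object_hash)

    def put_object(self, object_bytes):
        """Store an object under its SHA-256 hash and return that hash."""
        object_hash = hashlib.sha256(object_bytes).digest()
        self.objects[object_hash] = object_bytes
        return object_hash

    def list_box(self, account, box):
        """Return a copy of the set of hashes currently in the box."""
        return set(self.boxes.get((account, box), set()))

    def add_to_box(self, account, box, object_hash):
        self.boxes.setdefault((account, box), set()).add(object_hash)

    def remove_from_box(self, account, box, object_hash):
        self.boxes.get((account, box), set()).discard(object_hash)


store = MemoryStore()
alice = hashlib.sha256(b"alice public key object").digest()  # stand-in account id
h = store.put_object(b"some object bytes")
store.add_to_box(alice, "messages", h)
listed = store.list_box(alice, "messages")
store.remove_from_box(alice, "messages", h)
```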
The store may be running on top of a raw disk, a file system, a database, or a large-scale storage system.
Condensation uses end-to-end encryption. Public data is signed, while private data and messages are encrypted and signed by the actor producing that data.
For that, each actor generates an RSA 2048 key pair.
While the private key is kept in a safe place on the device, the public key is serialized as an object and uploaded to all stores the actor uses. The hash of the public key object serves as unique identifier of the actor and its accounts.
Condensation objects may be symmetrically encrypted as follows:
The data section is encrypted using AES-256 in CTR mode with a random 256-bit key. The CTR counter starts at 0, and is incremented by 1 for each AES block (16 bytes). The last block may be truncated.
The hash list remains unencrypted, so that stores can determine which objects are in use.
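The counter handling described above can be sketched independently of the cipher. Python's standard library does not include AES, so the function below takes the block encryption as a parameter; in real use this would be AES-256 single-block encryption from a vetted cryptography library. The names `ctr_transform` and `toy_block_function` are hypothetical, the toy function is neither AES nor secure and exists only to make the example runnable, and encoding the counter as a 16-byte big-endian integer is an assumption.

```python
import hashlib
from typing import Callable

BLOCK_SIZE = 16  # AES block size in bytes


def ctr_transform(block_encrypt: Callable[[bytes], bytes], data: bytes) -> bytes:
    """Apply CTR mode: XOR data with the encrypted counter blocks 0, 1, 2, ...

    Encryption and decryption are the same operation.
    """
    output = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        counter = offset // BLOCK_SIZE  # starts at 0, incremented by 1 per block
        # Assumption: the counter is encoded as a 16-byte big-endian integer.
        keystream = block_encrypt(counter.to_bytes(BLOCK_SIZE, "big"))
        chunk = data[offset:offset + BLOCK_SIZE]  # the last chunk may be shorter
        output.extend(c ^ k for c, k in zip(chunk, keystream))
    return bytes(output)


def toy_block_function(key: bytes) -> Callable[[bytes], bytes]:
    """Stand-in for AES-256 block encryption. NOT secure and NOT AES."""
    return lambda block: hashlib.sha256(key + block).digest()[:BLOCK_SIZE]


key = bytes(range(32))  # a real key would be 32 random bytes
encrypt_block = toy_block_function(key)
plaintext = b"a record that is not block-aligned"
ciphertext = ctr_transform(encrypt_block, plaintext)
recovered = ctr_transform(encrypt_block, ciphertext)
```

Since CTR mode only XORs a keystream onto the data, the ciphertext has exactly the length of the plaintext, and only the data part of an object would be passed through this transformation.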
Tree encryption and signing
Trees are encrypted and signed as follows:
The root object is an envelope with a content hash and a corresponding AES key encrypted for each recipient actor. In addition, envelopes contain the sender's signature of the content hash, and – if the content is located elsewhere – a store URL.
All objects below the root store the AES keys of their children within the encrypted data part.
Hence, any recipient can parse the envelope, verify its signature, decrypt the AES key of the content object, and then work its way down to read the whole tree.