# Merging preliminary draft, to be adapted to semilattice theory

Data is often modified in several places and then merged together at a later point in time. In a distributed setting, a conflict-free merge operation must fulfill three criteria:

- Commutativity: Merging A ⊕ B must yield the same as merging B ⊕ A.
- Associativity: Merging (A ⊕ B) ⊕ C must yield the same as A ⊕ (B ⊕ C).
- Stability: Merging A ⊕ A must yield A.

Commutativity and associativity are necessary because the different data versions have no intrinsic order, i.e. one cannot say a priori that A is more important than B.

Stability is necessary because data versions may be merged multiple times, such as in the following example:

B has been derived from A, and can therefore be written as B = A ⊕ ΔB, where ΔB denotes the changes. The same goes for C. Merging B and C therefore becomes:

D = B ⊕ C = (A ⊕ ΔB) ⊕ (A ⊕ ΔC) = A ⊕ A ⊕ ΔB ⊕ ΔC = A ⊕ ΔB ⊕ ΔC

Stability can be obtained by keeping the history of all changes, i.e., if it is known that both B and C have been derived from A, the merge algorithm can compensate for that. However, keeping the history of all changes is often impractical, and can be complex.

## History-free merge operations

Fortunately, there are two merge operations that do not require any history: *set union*, and *top-N* merge.

### Set union merge

Combining two sets using set union is commutative, associative, and stable.

This merge operation is typically used for dictionary keys.

### Top-N merge

If the elements in the two sets have an order, one can throw away all but the top *N* elements after union merge.

As a special case, one can keep the most recent element only (*N* = 1, ordering by date).