Mesh Storage Layer
The mesh storage layer provides fault-tolerant, encrypted data persistence across the Zentalk network. This chapter describes the complete architecture: from client-side encryption through Reed-Solomon erasure coding, Kademlia DHT-based distribution, capacity management, and automatic self-healing.
Design Goals
The mesh storage layer must satisfy five requirements simultaneously:
- Confidentiality: Stored data must be unreadable by mesh node operators, even if they have full physical access to the storage hardware.
- Availability: Data must remain accessible even if multiple nodes fail simultaneously.
- Durability: Data must not be lost even under adverse network conditions (node departures, hardware failures, network partitions).
- Efficiency: Storage overhead must be minimized relative to the fault tolerance provided.
- Scalability: The system must handle growth from hundreds to millions of users without architectural changes.
Client-Side Encryption
All data is encrypted on the user's device before transmission to the mesh network. The mesh nodes never receive plaintext data.
Encryption Design
Algorithm: AES-256-GCM (Authenticated Encryption with Associated Data)
The encryption key is derived from the user's identity key through PBKDF2-SHA256 with a cryptographically random salt and high iteration count, ensuring that the key derivation is computationally expensive to brute-force. A fresh random salt is generated per encryption operation, preventing key reuse across different data objects.
The output is a self-contained binary blob comprising a version identifier, the salt (required for key re-derivation during decryption), an initialization vector (GCM nonce), the ciphertext, and an authentication tag. This format ensures that any mesh node or network observer receives only an opaque encrypted payload -- the mesh is a blind courier that stores and forwards data whose content it cannot read and whose integrity it cannot compromise without detection.
AES-256-GCM provides both confidentiality (the ciphertext reveals no information about the plaintext) and integrity (any modification of the ciphertext is detected upon decryption via the authentication tag). These properties hold regardless of the mesh node operator's intentions or capabilities.
Reed-Solomon Erasure Coding
Erasure Coding Principle
Reed-Solomon codes are maximum distance separable (MDS) error-correcting codes that achieve the theoretical optimum for the trade-off between redundancy and fault tolerance [Reed and Solomon, 1960]. An code can recover the original data symbols from any of the encoded symbols, tolerating the loss of up to symbols.
Zentalk uses a Reed-Solomon code over : encrypted data is split into 10 data shards, 5 parity shards are computed through linear combination over the finite field, and the resulting 15 shards are distributed across 15 independent validator nodes via Kademlia. Any 10 of the 15 shards are sufficient to reconstruct the original data. This is mathematically optimal — no code with the same parameters can tolerate more than 5 simultaneous failures.
The encoding and recovery processes rely on the algebraic properties of Vandermonde matrices over , which guarantee that any submatrix of the generator matrix is invertible. Recovery is therefore a deterministic linear algebra operation: select any 10 available shards, construct the corresponding submatrix, invert it, and multiply to recover all original data shards. The mathematical details are standard [see Plank, 2013 for a comprehensive treatment].
Efficiency
| Strategy | Overhead | Fault Tolerance |
|---|---|---|
| No redundancy | 0 failures | |
| replication | 2 failures | |
| 5 failures |
Reed-Solomon tolerates 2.5x more failures than triple replication while using half the storage. This makes it the optimal choice for a decentralized network where node failures are routine.
Kademlia DHT Distribution
XOR Distance Metric
The Kademlia Distributed Hash Table uses the XOR metric to define "distance" between keys and node identifiers:
Properties:
- (identity)
- for (positivity)
- (symmetry)
- (XOR transitivity, NOT triangle inequality)
Note that XOR distance does not satisfy the standard triangle inequality in the additive sense. Instead, it satisfies the ultrametric property: . This stronger property is what makes Kademlia routing efficient -- the distance to any target can only decrease or stay the same as lookup progresses through the routing table [Maymounkov and Mazieres 2002].
A unique structural property of XOR distance is that for any point and any distance , there exists exactly one point such that . This ensures that lookups converge deterministically and uniquely.
Storage Key Generation
Each piece of data to be stored is assigned a deterministic storage key:
The deterministic key ensures that any node can independently compute where a piece of data should be stored, enabling decentralized routing without a central directory. Because the key is derived from the owner's address and a data-type identifier, storage locations are reproducible -- any node in the network can locate any user's data by computing the same hash.
Shard Placement
Given a storage key and 15 encoded shards, the placement algorithm queries the DHT for the nearest nodes to the storage key (querying more candidates than needed to account for offline or full nodes). Candidates are filtered by capacity state using a multi-tier fallback strategy: the algorithm prefers nodes with healthy capacity, falls back to moderately loaded nodes, and as a last resort stores shards locally. Each of the 15 shards is placed on a distinct node to maximize independence of failure domains. Shard locations are recorded in metadata so that any future retrieval or repair operation can locate them.
Node Discovery and Routing Table
Each node maintains a Kademlia routing table organized into -buckets:
- One bucket per bit position in the -bit address space
- Each bucket holds up to peers (a configurable protocol parameter)
- Bucket contains peers with distance where
Nodes discover peers through an initial bootstrap phase (connecting to well-known entry points), periodic refresh (querying random keys in each bucket to discover new peers), and continuous connection maintenance (health-checking connected peers and replacing stale entries).
Capacity Management
Each node monitors its storage utilization and announces its state to the network through a graduated capacity model. As utilization increases, the node's behavior becomes progressively more restrictive -- from accepting all traffic at low utilization, through deprioritizing new shard placement, to rejecting all new data when near capacity. This graduated response ensures that the network sheds load gracefully rather than experiencing abrupt failures.
Nodes periodically publish their capacity state to the DHT, allowing other nodes to make informed placement decisions. When a node approaches its capacity limits, a rebalancing process migrates its least-accessed shards to healthier nodes, verifying integrity at each step. The rebalancing is rate-limited to prevent oscillation and network congestion.
Anti-Entropy and Self-Healing
Health Monitoring
The anti-entropy service continuously monitors shard health by periodically scanning registered chunks in batches, prioritizing those that have not been checked recently. For each chunk, the service queries all shard locations and counts available shards. Health is classified on a graduated scale: all shards present (no action), minor degradation (monitor), moderate degradation (queue repair), minimum viable shards (urgent repair), or below reconstruction threshold (unrecoverable; alert and log).
Automatic Repair
When a chunk's health degrades below the monitoring threshold, the repair process is triggered. The algorithm retrieves the minimum required shards from surviving nodes in parallel, uses Reed-Solomon decoding to reconstruct all shards (any sufficient due to the MDS property), and places the missing shards on new nodes selected from the DHT -- preferring nodes that are geographically diverse and in Normal capacity state. Hash integrity is verified after each transfer and chunk metadata is updated with the new shard locations. Failed repairs are retried with exponential backoff up to a bounded maximum number of attempts.
Adaptive Media Chunking
Large media files (images, videos, documents) use an additional chunking layer before Reed-Solomon encoding. The chunk size adapts to the file size -- smaller files are transmitted as a single unit, while larger files are split into chunks scaled to balance transfer efficiency against per-chunk overhead. A protocol-defined maximum file size prevents abuse.
Integrity Verification
Each chunked media upload produces a signed manifest that records the hash of every chunk and a checksum of the original file. The manifest is signed with the sender's Ed25519 identity key, preventing malicious mesh nodes from modifying it. This creates a three-layer integrity guarantee:
- Manifest signature: detects manifest modification by any mesh node
- Per-chunk hashes: detects individual chunk replacement or corruption
- File checksum: detects reordering or partial substitution of chunks
During download, the recipient verifies the manifest signature, checks each chunk's hash, decrypts and reassembles the file, and verifies the final checksum -- ensuring end-to-end integrity from sender to recipient with no trust placed in any intermediate node.
Scalability Analysis
Steady-State Property
Due to TTL-based expiration, the mesh reaches a bounded steady state where data ingestion rate equals data expiration rate. After the longest TTL period has elapsed, every byte entering the network is matched by a byte expiring:
where is the daily data ingestion rate and is the longest TTL in the system. This remains constant regardless of how long the network operates.
Storage does not grow unboundedly. It stabilizes at approximately and remains there as long as usage patterns are stable. The per-node storage burden decreases as more nodes join the network, since the aggregate data is distributed across a larger set of participants. This bounded growth model ensures that the hardware requirements for node operation do not increase over time, preserving the economic accessibility of participation.