Cashew Data Foundation
The data layer that makes Lattice a foundational SDK. Merkle dictionaries, arrays, and sets that are verifiable, diffable, lazily loadable, selectively encryptable, and provable — the primitives for both web3 state and verifiable data applications.
.package(url: "https://github.com/pumperknickle/cashew.git", from: "1.0.0")
Node Protocol
The base protocol for all Merkle structures. A node is a tree of content-addressed children.
public protocol Node: Codable, LosslessStringConvertible, Sendable {
func get(property: PathSegment) -> (any Header)?
func properties() -> Set<PathSegment>
func set(properties: [PathSegment: any Header]) -> Self
}
You implement three methods. The protocol extension provides everything else:
| Provided Method | Description |
|---|---|
resolve(paths:fetcher:) | Lazily load specific paths from storage |
storeRecursively(storer:) | Persist entire subtree to storage |
transform(transforms:) | Apply batched mutations |
proof(paths:fetcher:) | Generate sparse Merkle proofs |
encrypt(encryption:) | Selectively encrypt subtrees |
diff(from:) | Compute structural diff against another version |
Scalar Protocol
Leaf nodes with no children. Provides no-op defaults for all tree-walking methods:
public protocol Scalar: Node { }
String values in dictionaries and sets are scalars.
Header & Content Addressing
A Header is a content-addressed reference to a Node. It holds a CID (SHA-256 hash) and optionally the deserialized node. Headers that haven't been loaded yet have node == nil — this is the basis of lazy loading.
public protocol Header: Codable, LosslessStringConvertible, Sendable {
associatedtype NodeType: Node
var rawCID: String { get }
var node: NodeType? { get }
var encryptionInfo: EncryptionInfo? { get }
}
Creating Headers
// From a node (CID computed automatically)
let header = HeaderImpl(node: myNode)
print(header.rawCID) // SHA-256 of serialized node
// Lazy reference (CID only, node not loaded)
let lazy = HeaderImpl<MyNode>(rawCID: "bafy...")
lazy.node // nil — must resolve to load
// Encrypted header
let encrypted = HeaderImpl(node: myNode, key: symmetricKey)
Resolving (Loading)
// Load this header's node from storage
let resolved = try await header.resolve(fetcher: myFetcher)
resolved.node // now loaded
// Load entire subtree recursively
let full = try await header.resolveRecursive(fetcher: myFetcher)
CID Format
CIDs follow the IPFS CIDv1 specification: multibase + multicodec + multihash. The hash function is SHA2-256. Encrypted headers serialize as "enc:keyHash:iv:cid".
MerkleDictionary
A key-value map backed by a compressed radix trie. Every mutation produces a new root with a new CID — previous versions remain valid and reachable.
public protocol MerkleDictionary: Node {
associatedtype ValueType
associatedtype ChildType: RadixHeader
}
Mutations
All mutations are immutable — they return a new dictionary with the change applied:
var dict = MerkleDictionaryImpl<String>()
// Insert a new key
dict = dict.inserting(key: "alice", value: "100")
// Update an existing key
dict = dict.mutating(key: "alice", value: "200")
// Delete a key
dict = dict.deleting(key: "alice")
Queries
| Method | Return | Description |
|---|---|---|
get(key:) | ValueType? | Retrieve value by key |
allKeys() | Set<String> | All keys in the dictionary |
allKeysAndValues() | [String: ValueType] | Full materialization |
sortedKeys(limit:after:) | [String] | Paginated sorted keys (cursor-based) |
sortedKeysAndValues(limit:after:) | [(key, value)] | Paginated sorted entries |
count | Int | Number of entries |
Radix Trie Internals
Under the hood, the dictionary is a compressed radix trie (Patricia trie). Each RadixNode has:
prefix: the compressed edge label (path compression eliminates single-child chains)value: optional leaf value at this nodechildren:[Character: RadixHeader]map to child nodes
Each child is a RadixHeader — a content-addressed reference. Only the parts of the trie you access need to be loaded from storage.
MerkleArray
An ordered collection built on top of MerkleDictionary. Indices are 256-bit binary strings, giving deterministic ordering in the trie.
var arr = MerkleArrayImpl<String>()
arr = arr.append("first")
arr = arr.append("second")
arr = arr.append("third")
arr.get(at: 0) // "first"
arr.first() // "first"
arr.last() // "third"
arr.count // 3
Methods
| Method | Description |
|---|---|
append(_:) | Add element at index count |
append(contentsOf:) | Concatenate another array |
get(at:) | Retrieve element by index |
first(), last() | Accessor shortcuts |
mutating(at:value:) | Update element at index |
deleting(at:) | Remove element (swaps with last for O(1)) |
MerkleSet
A set backed by MerkleDictionary with empty string values. Supports full set algebra.
var set = MerkleSetImpl()
set = set.insert("alice")
set = set.insert("bob")
set.contains("alice") // true
set.members() // {"alice", "bob"}
Set Algebra
| Method | Description |
|---|---|
insert(_:) | Add member |
remove(_:) | Remove member |
contains(_:) | Membership test |
union(_:) | Elements in either set |
intersection(_:) | Elements in both sets |
subtracting(_:) | Elements in self but not other |
symmetricDifference(_:) | Elements in exactly one set |
Lazy Resolution
The key insight: a Merkle tree of millions of entries can be represented by a single root CID. You only load the branches you actually access. Resolution strategies control how deep to fetch:
enum ResolutionStrategy {
case targeted // fetch one header only
case recursive // fetch entire subtree
case list // fetch trie structure, leave leaf values lazy
case range(after:limit:) // sorted cursor-based range
}
Fetcher Protocol
Implement this to connect Cashew to any storage backend — including an Acorn CAS chain:
public protocol Fetcher: Sendable {
func fetch(rawCid: String) async throws -> Data
}
public protocol Storer {
func store(rawCid: String, data: Data) throws
}
Fetcher backed by an Acorn CompositeCASWorker gives you Merkle structures that lazily load from memory → disk → network. A Storer backed by the same chain persists mutations across all tiers.
Path-Based Resolution
// Only load the "alice" branch of a dictionary
let paths = ArrayTrie<ResolutionStrategy>()
.setting(path: ["alice"], value: .targeted)
let resolved = try await root.resolve(paths: paths, fetcher: fetcher)
resolved.get(key: "alice") // loaded
resolved.get(key: "bob") // still nil (not resolved)
Selective Encryption
Encrypt specific subtrees while leaving others in plaintext. Different branches can use different keys.
enum EncryptionStrategy {
case targeted(SymmetricKey) // encrypt only this value
case list(SymmetricKey) // encrypt trie structure (one level)
case recursive(SymmetricKey) // encrypt entire subtree
}
EncryptionInfo
public struct EncryptionInfo: Sendable {
public let keyHash: String // SHA-256 of the key (for lookup)
public let iv: String // base64 AES-GCM nonce
}
Encrypted headers serialize as "enc:keyHash:iv:cid". The KeyProvider protocol maps key hashes back to keys for decryption:
public protocol KeyProvider {
func key(for keyHash: String) -> SymmetricKey?
}
"accounts/alice/balance" with Alice's key and "accounts/bob/balance" with Bob's key — all in the same tree. Nodes without encryption are publicly verifiable; encrypted nodes require the correct key to read but their CIDs remain valid for proof verification.
Sparse Merkle Proofs
Generate minimal subtrees that prove specific facts about the data without revealing the entire structure.
enum SparseMerkleProof {
case insertion // prove key doesn't exist (safe to insert)
case mutation // prove key exists (can be updated)
case deletion // prove key exists with neighbors
case existence // prove key exists
}
Example
// Prove that "alice" exists in the dictionary
let proofPaths = ArrayTrie<SparseMerkleProof>()
.setting(path: ["alice"], value: .existence)
let proof = try await root.proof(paths: proofPaths, fetcher: fetcher)
// `proof` is a minimal subtree containing only the nodes
// needed to verify "alice" exists, with the same root CID
A verifier can recompute the root CID from the proof subtree and compare it against the known root. If they match, the proof is valid. No trusted third party needed.
Diffing
Compute the exact set of changes between two versions of a structure:
let changes = old.diff(from: new)
changes.inserted // [key: value] of new entries
changes.deleted // [key: value] of removed entries
changes.modified // [key: ModifiedEntry] with old/new values
changes.changeCount
CashewDiff
public struct CashewDiff {
public var inserted: [String: String]
public var deleted: [String: String]
public var modified: [String: ModifiedEntry]
public var changeCount: Int
}
public struct ModifiedEntry {
public var old: String
public var new: String
public var children: CashewDiff // recursive for nested structures
}
Diffing is structural — it walks both tries simultaneously and only visits nodes whose CIDs differ. Identical subtrees are skipped entirely, making it efficient even for very large structures.
Query Language
Cashew includes a built-in query language for reading and mutating structures without writing Swift code:
// Get a value
let (_, result) = try root.query("get 'alice'")
// Pipeline: get alice's sub-dictionary, then get balance
let (_, result) = try root.query("get 'alice' | get 'balance'")
// Mutations
let (newRoot, _) = try root.query("insert 'carol' '500'")
let (newRoot, _) = try root.query("update 'alice' '300'")
let (newRoot, _) = try root.query("delete 'bob'")
// Enumeration
let (_, result) = try root.query("keys")
let (_, result) = try root.query("count")
Expressions
| Expression | Result Type | Description |
|---|---|---|
get 'key' | .value | Retrieve value by key |
get_at 0 | .value | Retrieve array element by index |
keys | .list | All keys |
values | .list | All values |
sorted_keys 10 | .list | First 10 sorted keys |
count | .count | Number of entries |
contains 'key' | .bool | Membership test |
first / last | .value | Array endpoints |
insert 'k' 'v' | .ok | Insert key-value pair |
update 'k' 'v' | .ok | Update existing key |
delete 'k' | .ok | Remove key |
append 'v' | .ok | Append to array |
Pipelines
Chain expressions with | to traverse nested structures:
get 'accounts' | get 'alice' | get 'balance'
Query Plans
For efficiency, queries can be compiled into a CashewPlan that batches transforms and pre-computes resolution paths:
let plan = CashewPlan.compile(expressions)
let paths = plan.resolutionPaths() // only fetch what's needed
let (newRoot, result) = try await root.execute(plan: plan, fetcher: fetcher)
Transforms
For programmatic batch mutations, use Transform with an ArrayTrie to apply multiple changes in a single pass:
enum Transform {
case insert(String) // insert new value
case update(String) // update existing value
case delete // remove value
}
let transforms = ArrayTrie<Transform>()
.setting(path: ["alice"], value: .update("300"))
.setting(path: ["carol"], value: .insert("500"))
.setting(path: ["bob"], value: .delete)
let newRoot = root.transform(transforms: transforms)
Dependencies
| Package | Purpose |
|---|---|
swift-crypto | SHA-256 hashing, AES-GCM encryption |
swift-cid | IPFS CIDv1 content identifiers |
swift-multicodec | Codec identifiers |
swift-multihash | Self-describing hashes |
swift-collections | Standard collection types |
ArrayTrie | Path-based trie for resolution/transform/proof specs |