Cashew Data Foundation

The data layer that makes Lattice a foundational SDK. Merkle dictionaries, arrays, and sets that are verifiable, diffable, lazily loadable, selectively encryptable, and provable — the primitives for both web3 state and verifiable data applications.

.package(url: "https://github.com/pumperknickle/cashew.git", from: "1.0.0")
Why Cashew matters
Cashew is the foundational data layer of Lattice. Every piece of state in the system — account balances, application state, transaction histories — is a Cashew Merkle structure. This means all state is inherently content-addressed (tamper-evident), lazily loadable (only fetch what you need), and diffable (compute exact changesets between versions). Acorn stores the raw bytes; Cashew gives those bytes structure and meaning.

Node Protocol

The base protocol for all Merkle structures. A node is a tree of content-addressed children.

public protocol Node: Codable, LosslessStringConvertible, Sendable {
    func get(property: PathSegment) -> (any Header)?
    func properties() -> Set<PathSegment>
    func set(properties: [PathSegment: any Header]) -> Self
}

You implement three methods. The protocol extension provides everything else:

Provided MethodDescription
resolve(paths:fetcher:)Lazily load specific paths from storage
storeRecursively(storer:)Persist entire subtree to storage
transform(transforms:)Apply batched mutations
proof(paths:fetcher:)Generate sparse Merkle proofs
encrypt(encryption:)Selectively encrypt subtrees
diff(from:)Compute structural diff against another version

Scalar Protocol

Leaf nodes with no children. Provides no-op defaults for all tree-walking methods:

public protocol Scalar: Node { }

String values in dictionaries and sets are scalars.

A Header is a content-addressed reference to a Node. It holds a CID (SHA-256 hash) and optionally the deserialized node. Headers that haven't been loaded yet have node == nil — this is the basis of lazy loading.

public protocol Header: Codable, LosslessStringConvertible, Sendable {
    associatedtype NodeType: Node

    var rawCID: String { get }
    var node: NodeType? { get }
    var encryptionInfo: EncryptionInfo? { get }
}

Creating Headers

// From a node (CID computed automatically)
let header = HeaderImpl(node: myNode)
print(header.rawCID) // SHA-256 of serialized node

// Lazy reference (CID only, node not loaded)
let lazy = HeaderImpl<MyNode>(rawCID: "bafy...")
lazy.node // nil — must resolve to load

// Encrypted header
let encrypted = HeaderImpl(node: myNode, key: symmetricKey)

Resolving (Loading)

// Load this header's node from storage
let resolved = try await header.resolve(fetcher: myFetcher)
resolved.node // now loaded

// Load entire subtree recursively
let full = try await header.resolveRecursive(fetcher: myFetcher)

CID Format

CIDs follow the IPFS CIDv1 specification: multibase + multicodec + multihash. The hash function is SHA2-256. Encrypted headers serialize as "enc:keyHash:iv:cid".

MerkleDictionary

A key-value map backed by a compressed radix trie. Every mutation produces a new root with a new CID — previous versions remain valid and reachable.

public protocol MerkleDictionary: Node {
    associatedtype ValueType
    associatedtype ChildType: RadixHeader
}

Mutations

All mutations are immutable — they return a new dictionary with the change applied:

var dict = MerkleDictionaryImpl<String>()

// Insert a new key
dict = dict.inserting(key: "alice", value: "100")

// Update an existing key
dict = dict.mutating(key: "alice", value: "200")

// Delete a key
dict = dict.deleting(key: "alice")

Queries

MethodReturnDescription
get(key:)ValueType?Retrieve value by key
allKeys()Set<String>All keys in the dictionary
allKeysAndValues()[String: ValueType]Full materialization
sortedKeys(limit:after:)[String]Paginated sorted keys (cursor-based)
sortedKeysAndValues(limit:after:)[(key, value)]Paginated sorted entries
countIntNumber of entries

Radix Trie Internals

Under the hood, the dictionary is a compressed radix trie (Patricia trie). Each RadixNode has:

Each child is a RadixHeader — a content-addressed reference. Only the parts of the trie you access need to be loaded from storage.

MerkleArray

An ordered collection built on top of MerkleDictionary. Indices are 256-bit binary strings, giving deterministic ordering in the trie.

var arr = MerkleArrayImpl<String>()

arr = arr.append("first")
arr = arr.append("second")
arr = arr.append("third")

arr.get(at: 0)  // "first"
arr.first()     // "first"
arr.last()      // "third"
arr.count       // 3

Methods

MethodDescription
append(_:)Add element at index count
append(contentsOf:)Concatenate another array
get(at:)Retrieve element by index
first(), last()Accessor shortcuts
mutating(at:value:)Update element at index
deleting(at:)Remove element (swaps with last for O(1))

MerkleSet

A set backed by MerkleDictionary with empty string values. Supports full set algebra.

var set = MerkleSetImpl()
set = set.insert("alice")
set = set.insert("bob")

set.contains("alice")  // true
set.members()          // {"alice", "bob"}

Set Algebra

MethodDescription
insert(_:)Add member
remove(_:)Remove member
contains(_:)Membership test
union(_:)Elements in either set
intersection(_:)Elements in both sets
subtracting(_:)Elements in self but not other
symmetricDifference(_:)Elements in exactly one set

Lazy Resolution

The key insight: a Merkle tree of millions of entries can be represented by a single root CID. You only load the branches you actually access. Resolution strategies control how deep to fetch:

enum ResolutionStrategy {
    case targeted             // fetch one header only
    case recursive             // fetch entire subtree
    case list                  // fetch trie structure, leave leaf values lazy
    case range(after:limit:)   // sorted cursor-based range
}

Fetcher Protocol

Implement this to connect Cashew to any storage backend — including an Acorn CAS chain:

public protocol Fetcher: Sendable {
    func fetch(rawCid: String) async throws -> Data
}

public protocol Storer {
    func store(rawCid: String, data: Data) throws
}
Cashew + Acorn
A Fetcher backed by an Acorn CompositeCASWorker gives you Merkle structures that lazily load from memory → disk → network. A Storer backed by the same chain persists mutations across all tiers.

Path-Based Resolution

// Only load the "alice" branch of a dictionary
let paths = ArrayTrie<ResolutionStrategy>()
    .setting(path: ["alice"], value: .targeted)

let resolved = try await root.resolve(paths: paths, fetcher: fetcher)
resolved.get(key: "alice") // loaded
resolved.get(key: "bob")   // still nil (not resolved)

Selective Encryption

Encrypt specific subtrees while leaving others in plaintext. Different branches can use different keys.

enum EncryptionStrategy {
    case targeted(SymmetricKey)   // encrypt only this value
    case list(SymmetricKey)       // encrypt trie structure (one level)
    case recursive(SymmetricKey)  // encrypt entire subtree
}

EncryptionInfo

public struct EncryptionInfo: Sendable {
    public let keyHash: String  // SHA-256 of the key (for lookup)
    public let iv: String       // base64 AES-GCM nonce
}

Encrypted headers serialize as "enc:keyHash:iv:cid". The KeyProvider protocol maps key hashes back to keys for decryption:

public protocol KeyProvider {
    func key(for keyHash: String) -> SymmetricKey?
}
Path-based encryption
You can encrypt "accounts/alice/balance" with Alice's key and "accounts/bob/balance" with Bob's key — all in the same tree. Nodes without encryption are publicly verifiable; encrypted nodes require the correct key to read but their CIDs remain valid for proof verification.

Sparse Merkle Proofs

Generate minimal subtrees that prove specific facts about the data without revealing the entire structure.

enum SparseMerkleProof {
    case insertion   // prove key doesn't exist (safe to insert)
    case mutation    // prove key exists (can be updated)
    case deletion    // prove key exists with neighbors
    case existence   // prove key exists
}

Example

// Prove that "alice" exists in the dictionary
let proofPaths = ArrayTrie<SparseMerkleProof>()
    .setting(path: ["alice"], value: .existence)

let proof = try await root.proof(paths: proofPaths, fetcher: fetcher)
// `proof` is a minimal subtree containing only the nodes
// needed to verify "alice" exists, with the same root CID

A verifier can recompute the root CID from the proof subtree and compare it against the known root. If they match, the proof is valid. No trusted third party needed.

Diffing

Compute the exact set of changes between two versions of a structure:

let changes = old.diff(from: new)

changes.inserted   // [key: value] of new entries
changes.deleted    // [key: value] of removed entries
changes.modified   // [key: ModifiedEntry] with old/new values
changes.changeCount

CashewDiff

public struct CashewDiff {
    public var inserted: [String: String]
    public var deleted: [String: String]
    public var modified: [String: ModifiedEntry]
    public var changeCount: Int
}

public struct ModifiedEntry {
    public var old: String
    public var new: String
    public var children: CashewDiff  // recursive for nested structures
}

Diffing is structural — it walks both tries simultaneously and only visits nodes whose CIDs differ. Identical subtrees are skipped entirely, making it efficient even for very large structures.

Query Language

Cashew includes a built-in query language for reading and mutating structures without writing Swift code:

// Get a value
let (_, result) = try root.query("get 'alice'")

// Pipeline: get alice's sub-dictionary, then get balance
let (_, result) = try root.query("get 'alice' | get 'balance'")

// Mutations
let (newRoot, _) = try root.query("insert 'carol' '500'")
let (newRoot, _) = try root.query("update 'alice' '300'")
let (newRoot, _) = try root.query("delete 'bob'")

// Enumeration
let (_, result) = try root.query("keys")
let (_, result) = try root.query("count")

Expressions

ExpressionResult TypeDescription
get 'key'.valueRetrieve value by key
get_at 0.valueRetrieve array element by index
keys.listAll keys
values.listAll values
sorted_keys 10.listFirst 10 sorted keys
count.countNumber of entries
contains 'key'.boolMembership test
first / last.valueArray endpoints
insert 'k' 'v'.okInsert key-value pair
update 'k' 'v'.okUpdate existing key
delete 'k'.okRemove key
append 'v'.okAppend to array

Pipelines

Chain expressions with | to traverse nested structures:

get 'accounts' | get 'alice' | get 'balance'

Query Plans

For efficiency, queries can be compiled into a CashewPlan that batches transforms and pre-computes resolution paths:

let plan = CashewPlan.compile(expressions)
let paths = plan.resolutionPaths() // only fetch what's needed
let (newRoot, result) = try await root.execute(plan: plan, fetcher: fetcher)

Transforms

For programmatic batch mutations, use Transform with an ArrayTrie to apply multiple changes in a single pass:

enum Transform {
    case insert(String)   // insert new value
    case update(String)   // update existing value
    case delete            // remove value
}
let transforms = ArrayTrie<Transform>()
    .setting(path: ["alice"], value: .update("300"))
    .setting(path: ["carol"], value: .insert("500"))
    .setting(path: ["bob"], value: .delete)

let newRoot = root.transform(transforms: transforms)

Dependencies

PackagePurpose
swift-cryptoSHA-256 hashing, AES-GCM encryption
swift-cidIPFS CIDv1 content identifiers
swift-multicodecCodec identifiers
swift-multihashSelf-describing hashes
swift-collectionsStandard collection types
ArrayTriePath-based trie for resolution/transform/proof specs