Cashew Data Foundation

The data layer that makes Lattice a foundational SDK. Merkle dictionaries, arrays, and sets that are verifiable, diffable, lazily loadable, selectively encryptable, and provable — the primitives for both web3 state and verifiable data applications.

.package(url: "https://github.com/pumperknickle/cashew.git", from: "1.0.0")

Why Cashew matters

Cashew is the foundational data layer of Lattice. Every piece of state in the system — account balances, application state, transaction histories — is a Cashew Merkle structure. This means all state is inherently content-addressed (tamper-evident), lazily loadable (only fetch what you need), and diffable (compute exact changesets between versions). Acorn stores the raw bytes; Cashew gives those bytes structure and meaning.

Node Protocol

The base protocol for all Merkle structures. A node is a tree of content-addressed children.

public protocol Node: Codable, LosslessStringConvertible, Sendable {
    func get(property: PathSegment) -> (any Header)?
    func properties() -> Set<PathSegment>
    func set(properties: [PathSegment: any Header]) -> Self
}

You implement three methods. The protocol extension provides everything else:

Provided Method	Description
`resolve(paths:fetcher:)`	Lazily load specific paths from storage
`storeRecursively(storer:)`	Persist entire subtree to storage
`transform(transforms:)`	Apply batched mutations
`proof(paths:fetcher:)`	Generate sparse Merkle proofs
`encrypt(encryption:)`	Selectively encrypt subtrees
`diff(from:)`	Compute structural diff against another version

Scalar Protocol

Leaf nodes with no children. Provides no-op defaults for all tree-walking methods:

public protocol Scalar: Node { }

String values in dictionaries and sets are scalars.

A Header is a content-addressed reference to a Node. It holds a CID (SHA-256 hash) and optionally the deserialized node. Headers that haven't been loaded yet have node == nil — this is the basis of lazy loading.

public protocol Header: Codable, LosslessStringConvertible, Sendable {
    associatedtype NodeType: Node

    var rawCID: String { get }
    var node: NodeType? { get }
    var encryptionInfo: EncryptionInfo? { get }
}

Creating Headers

// From a node (CID computed automatically)
let header = HeaderImpl(node: myNode)
print(header.rawCID) // SHA-256 of serialized node

// Lazy reference (CID only, node not loaded)
let lazy = HeaderImpl<MyNode>(rawCID: "bafy...")
lazy.node // nil — must resolve to load

// Encrypted header
let encrypted = HeaderImpl(node: myNode, key: symmetricKey)

Resolving (Loading)

// Load this header's node from storage
let resolved = try await header.resolve(fetcher: myFetcher)
resolved.node // now loaded

// Load entire subtree recursively
let full = try await header.resolveRecursive(fetcher: myFetcher)

CID Format

CIDs follow the IPFS CIDv1 specification: multibase + multicodec + multihash. The hash function is SHA2-256. Encrypted headers serialize as "enc:keyHash:iv:cid".

MerkleDictionary

A key-value map backed by a compressed radix trie. Every mutation produces a new root with a new CID — previous versions remain valid and reachable.

public protocol MerkleDictionary: Node {
    associatedtype ValueType
    associatedtype ChildType: RadixHeader
}

Mutations

All mutations are immutable — they return a new dictionary with the change applied:

var dict = MerkleDictionaryImpl<String>()

// Insert a new key
dict = dict.inserting(key: "alice", value: "100")

// Update an existing key
dict = dict.mutating(key: "alice", value: "200")

// Delete a key
dict = dict.deleting(key: "alice")

Queries

Method	Return	Description
`get(key:)`	`ValueType?`	Retrieve value by key
`allKeys()`	`Set<String>`	All keys in the dictionary
`allKeysAndValues()`	`[String: ValueType]`	Full materialization
`sortedKeys(limit:after:)`	`[String]`	Paginated sorted keys (cursor-based)
`sortedKeysAndValues(limit:after:)`	`[(key, value)]`	Paginated sorted entries
`count`	`Int`	Number of entries

Radix Trie Internals

Under the hood, the dictionary is a compressed radix trie (Patricia trie). Each RadixNode has:

prefix: the compressed edge label (path compression eliminates single-child chains)
value: optional leaf value at this node
children: [Character: RadixHeader] map to child nodes

Each child is a RadixHeader — a content-addressed reference. Only the parts of the trie you access need to be loaded from storage.

MerkleArray

An ordered collection built on top of MerkleDictionary. Indices are 256-bit binary strings, giving deterministic ordering in the trie.

var arr = MerkleArrayImpl<String>()

arr = arr.append("first")
arr = arr.append("second")
arr = arr.append("third")

arr.get(at: 0)  // "first"
arr.first()     // "first"
arr.last()      // "third"
arr.count       // 3

Methods

Method	Description
`append(_:)`	Add element at index `count`
`append(contentsOf:)`	Concatenate another array
`get(at:)`	Retrieve element by index
`first()`, `last()`	Accessor shortcuts
`mutating(at:value:)`	Update element at index
`deleting(at:)`	Remove element (swaps with last for O(1))

MerkleSet

A set backed by MerkleDictionary with empty string values. Supports full set algebra.

var set = MerkleSetImpl()
set = set.insert("alice")
set = set.insert("bob")

set.contains("alice")  // true
set.members()          // {"alice", "bob"}

Set Algebra

Method	Description
`insert(_:)`	Add member
`remove(_:)`	Remove member
`contains(_:)`	Membership test
`union(_:)`	Elements in either set
`intersection(_:)`	Elements in both sets
`subtracting(_:)`	Elements in self but not other
`symmetricDifference(_:)`	Elements in exactly one set

Lazy Resolution

The key insight: a Merkle tree of millions of entries can be represented by a single root CID. You only load the branches you actually access. Resolution strategies control how deep to fetch:

enum ResolutionStrategy {
    case targeted             // fetch one header only
    case recursive             // fetch entire subtree
    case list                  // fetch trie structure, leave leaf values lazy
    case range(after:limit:)   // sorted cursor-based range
}

Fetcher Protocol

Implement this to connect Cashew to any storage backend — including an Acorn CAS chain:

public protocol Fetcher: Sendable {
    func fetch(rawCid: String) async throws -> Data
}

public protocol Storer {
    func store(rawCid: String, data: Data) throws
}

Cashew + Acorn

A Fetcher backed by an Acorn CompositeCASWorker gives you Merkle structures that lazily load from memory → disk → network. A Storer backed by the same chain persists mutations across all tiers.

Path-Based Resolution

// Only load the "alice" branch of a dictionary
let paths = ArrayTrie<ResolutionStrategy>()
    .setting(path: ["alice"], value: .targeted)

let resolved = try await root.resolve(paths: paths, fetcher: fetcher)
resolved.get(key: "alice") // loaded
resolved.get(key: "bob")   // still nil (not resolved)

Selective Encryption

Encrypt specific subtrees while leaving others in plaintext. Different branches can use different keys.

enum EncryptionStrategy {
    case targeted(SymmetricKey)   // encrypt only this value
    case list(SymmetricKey)       // encrypt trie structure (one level)
    case recursive(SymmetricKey)  // encrypt entire subtree
}

EncryptionInfo

public struct EncryptionInfo: Sendable {
    public let keyHash: String  // SHA-256 of the key (for lookup)
    public let iv: String       // base64 AES-GCM nonce
}

Encrypted headers serialize as "enc:keyHash:iv:cid". The KeyProvider protocol maps key hashes back to keys for decryption:

public protocol KeyProvider {
    func key(for keyHash: String) -> SymmetricKey?
}

Path-based encryption

You can encrypt "accounts/alice/balance" with Alice's key and "accounts/bob/balance" with Bob's key — all in the same tree. Nodes without encryption are publicly verifiable; encrypted nodes require the correct key to read but their CIDs remain valid for proof verification.

Sparse Merkle Proofs

Generate minimal subtrees that prove specific facts about the data without revealing the entire structure.

enum SparseMerkleProof {
    case insertion   // prove key doesn't exist (safe to insert)
    case mutation    // prove key exists (can be updated)
    case deletion    // prove key exists with neighbors
    case existence   // prove key exists
}

Example

// Prove that "alice" exists in the dictionary
let proofPaths = ArrayTrie<SparseMerkleProof>()
    .setting(path: ["alice"], value: .existence)

let proof = try await root.proof(paths: proofPaths, fetcher: fetcher)
// `proof` is a minimal subtree containing only the nodes
// needed to verify "alice" exists, with the same root CID

A verifier can recompute the root CID from the proof subtree and compare it against the known root. If they match, the proof is valid. No trusted third party needed.

Diffing

Compute the exact set of changes between two versions of a structure:

let changes = old.diff(from: new)

changes.inserted   // [key: value] of new entries
changes.deleted    // [key: value] of removed entries
changes.modified   // [key: ModifiedEntry] with old/new values
changes.changeCount

CashewDiff

public struct CashewDiff {
    public var inserted: [String: String]
    public var deleted: [String: String]
    public var modified: [String: ModifiedEntry]
    public var changeCount: Int
}

public struct ModifiedEntry {
    public var old: String
    public var new: String
    public var children: CashewDiff  // recursive for nested structures
}

Diffing is structural — it walks both tries simultaneously and only visits nodes whose CIDs differ. Identical subtrees are skipped entirely, making it efficient even for very large structures.

Query Language

Cashew includes a built-in query language for reading and mutating structures without writing Swift code:

// Get a value
let (_, result) = try root.query("get 'alice'")

// Pipeline: get alice's sub-dictionary, then get balance
let (_, result) = try root.query("get 'alice' | get 'balance'")

// Mutations
let (newRoot, _) = try root.query("insert 'carol' '500'")
let (newRoot, _) = try root.query("update 'alice' '300'")
let (newRoot, _) = try root.query("delete 'bob'")

// Enumeration
let (_, result) = try root.query("keys")
let (_, result) = try root.query("count")

Expressions

Expression	Result Type	Description
`get 'key'`	`.value`	Retrieve value by key
`get_at 0`	`.value`	Retrieve array element by index
`keys`	`.list`	All keys
`values`	`.list`	All values
`sorted_keys 10`	`.list`	First 10 sorted keys
`count`	`.count`	Number of entries
`contains 'key'`	`.bool`	Membership test
`first` / `last`	`.value`	Array endpoints
`insert 'k' 'v'`	`.ok`	Insert key-value pair
`update 'k' 'v'`	`.ok`	Update existing key
`delete 'k'`	`.ok`	Remove key
`append 'v'`	`.ok`	Append to array

Pipelines

Chain expressions with | to traverse nested structures:

get 'accounts' | get 'alice' | get 'balance'

Query Plans

For efficiency, queries can be compiled into a CashewPlan that batches transforms and pre-computes resolution paths:

let plan = CashewPlan.compile(expressions)
let paths = plan.resolutionPaths() // only fetch what's needed
let (newRoot, result) = try await root.execute(plan: plan, fetcher: fetcher)

Transforms

For programmatic batch mutations, use Transform with an ArrayTrie to apply multiple changes in a single pass:

enum Transform {
    case insert(String)   // insert new value
    case update(String)   // update existing value
    case delete            // remove value
}

let transforms = ArrayTrie<Transform>()
    .setting(path: ["alice"], value: .update("300"))
    .setting(path: ["carol"], value: .insert("500"))
    .setting(path: ["bob"], value: .delete)

let newRoot = root.transform(transforms: transforms)

Dependencies

Package	Purpose
`swift-crypto`	SHA-256 hashing, AES-GCM encryption
`swift-cid`	IPFS CIDv1 content identifiers
`swift-multicodec`	Codec identifiers
`swift-multihash`	Self-describing hashes
`swift-collections`	Standard collection types
`ArrayTrie`	Path-based trie for resolution/transform/proof specs

Cashew Data Foundation

Node Protocol

Scalar Protocol

Header & Content Addressing

Creating Headers

Resolving (Loading)

CID Format

MerkleDictionary

Mutations

Queries

Radix Trie Internals

MerkleArray

Methods

MerkleSet

Set Algebra

Lazy Resolution

Fetcher Protocol

Path-Based Resolution

Selective Encryption

EncryptionInfo

Sparse Merkle Proofs

Example

Diffing

CashewDiff

Query Language

Expressions

Pipelines

Query Plans

Transforms

Dependencies