Hashing Strategy

To figure out what changed between two commits, symtrace needs to identify each piece of code and track it across versions. It does this by computing four different fingerprints (hashes) for every node in the syntax tree.

The four hashes

Hash	What it captures	Why it matters
Structural	The shape of the code (nesting, children)	Detects moves — same shape, different location
Content	The actual source text	Detects any text change, no matter how small
Identity	The shape with names replaced by placeholders	Detects renames — same structure, different names
Context	Parent node and depth in the tree	Detects when code is re-parented or restructured
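
To make the table concrete, here is a minimal sketch of how four such fingerprints could be computed over a toy syntax tree. Everything in it is an assumption made for illustration: the `Node` class, the function names, the `$NAME` placeholder, and the exact inputs fed to each hash are hypothetical and are not symtrace's internals. It also uses `hashlib.blake2b` from the Python standard library as a stand-in for BLAKE3.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                      # node type, e.g. "function" or "identifier"
    text: str = ""                 # source text; only leaves carry it in this toy tree
    children: List["Node"] = field(default_factory=list)


def digest(*parts: str) -> str:
    # Stand-in for BLAKE3 (assumption): blake2b ships with Python.
    h = hashlib.blake2b(digest_size=16)
    for part in parts:
        h.update(part.encode("utf-8"))
        h.update(b"\x00")          # separator so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def structural_hash(node: Node) -> str:
    # Shape only: node kinds and nesting, no source text.
    return digest("S", node.kind, *(structural_hash(c) for c in node.children))


def content_hash(node: Node) -> str:
    # Shape plus the exact source text of every node.
    return digest("C", node.kind, node.text, *(content_hash(c) for c in node.children))


def identity_hash(node: Node) -> str:
    # Like content_hash, but identifier names are replaced by a placeholder.
    text = "$NAME" if node.kind == "identifier" else node.text
    return digest("I", node.kind, text, *(identity_hash(c) for c in node.children))


def context_hash(parent_kind: str, depth: int) -> str:
    # Where the node sits: its parent's kind and its depth in the tree.
    return digest("X", parent_kind, str(depth))


def make_function(name: str) -> Node:
    # Toy tree for a function that returns the literal 1.
    return Node("function", children=[
        Node("identifier", name),
        Node("return", children=[Node("number", "1")]),
    ])


old, new = make_function("total"), make_function("sum_all")
print("structural match:", structural_hash(old) == structural_hash(new))
print("content match:   ", content_hash(old) == content_hash(new))
print("identity match:  ", identity_hash(old) == identity_hash(new))
```

In this sketch `old` and `new` differ only in the function name, so their structural and identity hashes agree while their content hashes do not.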

Why four?

A single fingerprint can’t distinguish between different kinds of changes. For example, suppose a function is moved and also edited slightly, in a way that preserves its shape:

The structural hash still matches (same shape), so it’s a move candidate.
The content hash differs (body changed), so it’s not a pure move.
The identity hash matches (names didn’t change), so it’s not a rename.
The context hash differs (different position), confirming the relocation.

Combining all four lets symtrace classify the change accurately instead of falling back to “deleted + inserted.”
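
The reasoning above can be sketched as a comparison of two hash tuples. This is only an illustration under stated assumptions: the `Hashes` type, the `compare` and `classify` functions, and the label strings are hypothetical, and only the cases spelled out in this section are labeled. symtrace's actual decision rules may be richer.

```python
from typing import Dict, NamedTuple


class Hashes(NamedTuple):
    structural: str
    content: str
    identity: str
    context: str


def compare(old: Hashes, new: Hashes) -> Dict[str, bool]:
    # For each of the four hashes, record whether old and new agree.
    return {name: getattr(old, name) == getattr(new, name) for name in Hashes._fields}


def classify(old: Hashes, new: Hashes) -> str:
    same = compare(old, new)
    if all(same.values()):
        return "unchanged"
    if same["content"] and not same["context"]:
        # Identical text in a different place.
        return "pure move"
    if same["structural"] and not same["content"] and not same["context"]:
        # Same shape, different text, different place: the example above.
        return "moved and edited"
    # Anything this sketch does not model.
    return "other"


# Hypothetical hash values standing in for real digests.
before = Hashes(structural="s1", content="c1", identity="i1", context="x1")
after = Hashes(structural="s1", content="c2", identity="i1", context="x2")
print(compare(before, after))
print(classify(before, after))
```

A classifier with only a content hash would see two unrelated nodes here; the per-hash comparison is what lets the pair be linked and labeled.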

Performance

symtrace hashes with BLAKE3, a cryptographic hash function that is fast and collision-resistant. When incremental parsing is enabled (the default), hashes for unchanged parts of the code are reused instead of being recomputed.
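
The idea of reusing hashes can be illustrated with a small memoization sketch. It assumes that an incremental parser hands back the same node objects for unchanged subtrees, and it stores each hash on the node itself. The `Node` class, the `cached_hash` field, and the `stats` counters are hypothetical, and `hashlib.blake2b` again stands in for BLAKE3. How symtrace actually keys its cache is not described here.

```python
import hashlib


class Node:
    def __init__(self, kind, text="", children=()):
        self.kind = kind
        self.text = text
        self.children = tuple(children)
        self.cached_hash = None    # filled in the first time the node is hashed


def content_hash(node, stats):
    # Reuse the stored hash if this exact node was hashed before.
    if node.cached_hash is not None:
        stats["reused"] += 1
        return node.cached_hash
    stats["computed"] += 1
    h = hashlib.blake2b(digest_size=16)   # stand-in for BLAKE3 (assumption)
    h.update(node.kind.encode("utf-8"))
    h.update(b"\x00")
    h.update(node.text.encode("utf-8"))
    for child in node.children:
        h.update(bytes.fromhex(content_hash(child, stats)))
    node.cached_hash = h.hexdigest()
    return node.cached_hash


def function(name, value):
    return Node("function", children=[Node("identifier", name), Node("number", value)])


# Old version: a module with two functions.
f, g = function("f", "1"), function("g", "2")
old_root = Node("module", children=[f, g])
first = {"computed": 0, "reused": 0}
content_hash(old_root, first)

# New version: g was edited, f is the same node object as before.
new_root = Node("module", children=[f, function("g", "3")])
second = {"computed": 0, "reused": 0}
content_hash(new_root, second)
print(first, second)
```

On the second pass only the new root and the edited function are hashed; the untouched function `f` is served from its cached value without visiting its children.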