Hashing Strategy

To figure out what changed between two commits, symtrace needs to identify each piece of code and track it across versions. It does this by computing four different fingerprints (hashes) for every node in the syntax tree.

| Hash | What it captures | Why it matters |
| --- | --- | --- |
| Structural | The shape of the code (nesting, children) | Detects moves: same shape, different location |
| Content | The actual source text | Detects any text change, no matter how small |
| Identity | The shape with names replaced by placeholders | Detects renames: same structure, different names |
| Context | Parent node and depth in the tree | Detects when code is re-parented or restructured |
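
As a rough illustration of the four fingerprints, here is a minimal Python sketch. It is not symtrace's implementation: the `Node` class, the helper names, and the choice of which tokens feed each hash are all assumptions made for the example, and `hashlib.blake2b` stands in for BLAKE3 only because it ships with Python's standard library.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical syntax-tree node, not symtrace's real data model."""
    kind: str                                   # e.g. "function", "identifier"
    token: str = ""                             # token text for leaf nodes
    source: str = ""                            # raw source text covered by the node
    children: list["Node"] = field(default_factory=list)


def digest(*parts: str) -> str:
    # Stand-in hash: symtrace uses BLAKE3; blake2b is used here only
    # because it is available in the standard library.
    h = hashlib.blake2b(digest_size=16)
    for part in parts:
        h.update(part.encode("utf-8") + b"\x00")  # separator between parts
    return h.hexdigest()


def structural_hash(node: Node) -> str:
    # Assumption: node kinds, tokens, and nesting; formatting is ignored.
    return digest("S", node.kind, node.token, *map(structural_hash, node.children))


def identity_hash(node: Node) -> str:
    # Same walk, but every identifier is replaced by a placeholder.
    token = "_" if node.kind == "identifier" else node.token
    return digest("I", node.kind, token, *map(identity_hash, node.children))


def content_hash(node: Node) -> str:
    # Hash of the raw source text, so any textual edit changes it.
    return digest("C", node.source)


def context_hash(parent_kind: str, depth: int) -> str:
    # Parent node kind plus depth in the tree.
    return digest("X", parent_kind, str(depth))


def func(name: str) -> Node:
    """Build a toy tree for `def <name>(x): return x`."""
    body = Node("return", children=[Node("identifier", token="x")])
    return Node(
        "function",
        source=f"def {name}(x): return x",
        children=[Node("identifier", token=name), Node("identifier", token="x"), body],
    )


old, renamed = func("area"), func("size")
```

In this sketch, `old` and `renamed` share an identity hash but differ in their structural and content hashes, which is the signature of a rename. Placing the same function under a different parent or at a different depth changes only the context hash.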

No single fingerprint can tell every kind of change apart. For example, if a function is both moved and slightly edited:

  • The structural hash still matches (same shape), so it’s a move candidate.
  • The content hash differs (body changed), so it’s not a pure move.
  • The identity hash matches (names didn’t change), so it’s not a rename.
  • The context hash differs (different position), confirming the relocation.

Combining all four lets symtrace classify the change accurately instead of falling back to “deleted + inserted.”
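
The way the four comparisons combine can be sketched as a small decision function. The rule order, the label strings, and the `Fingerprints` type below are assumptions for illustration only; symtrace's actual classification logic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprints:
    """Hypothetical container for the four hashes of one node."""
    structural: str
    content: str
    identity: str
    context: str


def classify(old: Fingerprints, new: Fingerprints) -> str:
    same_shape = old.structural == new.structural
    same_text = old.content == new.content
    same_identity = old.identity == new.identity
    same_context = old.context == new.context

    if same_text:
        # Identical source text: either untouched or a pure move.
        return "unchanged" if same_context else "moved"
    if same_shape:
        # Same shape, different text: an edit, possibly combined with a move.
        return "modified" if same_context else "moved+modified"
    if same_identity:
        # Only names differ: a rename, possibly combined with a move.
        return "renamed" if same_context else "moved+renamed"
    # Nothing lines up: fall back to treating it as a delete plus an insert.
    return "deleted+inserted"


# The moved-and-edited function from the example above:
# structural and identity hashes match, content and context hashes differ.
before = Fingerprints(structural="s1", content="c1", identity="i1", context="x1")
after = Fingerprints(structural="s1", content="c2", identity="i1", context="x2")
result = classify(before, after)
```

Here `result` is `"moved+modified"`. Only when every comparison fails does the sketch fall back to `"deleted+inserted"`.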

symtrace uses BLAKE3 for hashing, which is fast and collision-resistant. When incremental parsing is enabled (the default), hashes are reused for unchanged parts of the code, avoiding redundant work.
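
One way hash reuse could work is to memoize hashes per subtree, so that subtrees carried over unchanged by the incremental parser are never rehashed. The sketch below assumes each node has a stable `node_id` that survives reparsing. That field, the `HashCache` class, and the use of `blake2b` in place of BLAKE3 are hypothetical choices for the example, not symtrace's API.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int          # hypothetical stable id; a reused subtree keeps its id
    kind: str
    token: str = ""
    children: list["Node"] = field(default_factory=list)


class HashCache:
    """Memoizes one hash per node id so reused subtrees are not rehashed."""

    def __init__(self) -> None:
        self.by_id: dict[int, bytes] = {}
        self.computed = 0     # digests actually computed, kept for illustration

    def tree_hash(self, node: Node) -> bytes:
        cached = self.by_id.get(node.node_id)
        if cached is not None:
            return cached
        # Stand-in for BLAKE3 (blake2b ships with Python).
        h = hashlib.blake2b(digest_size=16)
        h.update(node.kind.encode() + b"\x00" + node.token.encode() + b"\x00")
        for child in node.children:
            h.update(self.tree_hash(child))
        self.computed += 1
        self.by_id[node.node_id] = h.digest()
        return self.by_id[node.node_id]


unchanged = Node(2, "function", children=[Node(3, "identifier", "a")])
old_root = Node(1, "module", children=[
    unchanged,
    Node(4, "function", children=[Node(5, "literal", "1")]),
])
# The new tree shares `unchanged` with the old one, as an incremental parser would.
new_root = Node(6, "module", children=[
    unchanged,
    Node(7, "function", children=[Node(8, "literal", "2")]),
])

cache = HashCache()
old_hash = cache.tree_hash(old_root)
after_old = cache.computed
new_hash = cache.tree_hash(new_root)
after_new = cache.computed
```

Hashing the old tree computes five digests. Hashing the new tree adds only three, for the new root and the edited function with its literal, because the shared `unchanged` subtree is served from the cache.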