How It Works
When you run symtrace, it goes through a straightforward pipeline to turn two Git commits into a meaningful diff.
Pipeline overview
Git commits (A and B)
        |
        v
Read changed files       -- pull file contents from both commits
        |
        v
Skip unchanged files     -- if a file didn't change, skip it
        |
        v
Parse into syntax trees  -- understand the code structure
        |
        v
Hash each code node      -- fingerprint every function, class, variable, etc.
        |
        v
Match old vs. new        -- figure out what moved, renamed, changed, etc.
        |
        v
Track across files       -- detect cross-file moves and renames
        |
        v
Classify the commit      -- label it as feature, bugfix, refactor, etc.
        |
        v
Output results           -- colored terminal or JSON

What happens at each step
Reading files
symtrace uses libgit2 to open the repository and extract file contents from both commits. It never shells out to the git command.
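Once both versions of the files are in memory, the "skip unchanged files" step from the pipeline overview can be pictured as a comparison of content digests. The following is a minimal Python sketch, not symtrace's own code: `changed_paths` and the path-to-bytes dictionaries are hypothetical stand-ins for whatever the libgit2 layer returns. A Git-backed implementation could plausibly compare blob IDs instead of rehashing; that is an assumption, not something this page states.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content fingerprint used to decide whether a file changed."""
    return hashlib.sha256(data).hexdigest()


def changed_paths(old: dict[str, bytes], new: dict[str, bytes]) -> list[str]:
    """Return the sorted paths that were added, removed, or modified."""
    paths = set(old) | set(new)
    return sorted(
        p for p in paths
        if p not in old or p not in new or digest(old[p]) != digest(new[p])
    )


# Hypothetical snapshots of commits A and B.
old_files = {"a.py": b"x = 1\n", "b.py": b"y = 2\n"}
new_files = {"a.py": b"x = 1\n", "b.py": b"y = 3\n", "c.py": b"z = 4\n"}

print(changed_paths(old_files, new_files))  # ['b.py', 'c.py']
```

Only the paths returned here would need to be parsed; `a.py` is identical in both snapshots and is dropped before any further work.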
Parsing
Each changed file is parsed into a syntax tree using tree-sitter. This turns raw source code into a structured representation that symtrace can reason about. Files are parsed in parallel for speed, and results are cached so repeated runs are faster.
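The combination of parallel parsing and caching can be sketched as follows. This is an illustration under stated assumptions, not symtrace's implementation: Python's built-in `ast` module stands in for tree-sitter, the cache is an in-memory dictionary keyed by a content hash (a cache that speeds up repeated runs would have to be persisted somewhere), and `parse_cached` and `parse_all` are hypothetical names.

```python
import ast
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Cache of parsed trees, keyed by the SHA-256 of the source text.
_cache: dict[str, ast.AST] = {}


def parse_cached(source: str) -> ast.AST:
    """Parse source text, reusing an earlier tree for identical content."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    tree = _cache.get(key)
    if tree is None:
        tree = ast.parse(source)  # stand-in for a tree-sitter parse
        _cache[key] = tree
    return tree


def parse_all(sources: dict[str, str]) -> dict[str, ast.AST]:
    """Parse every file through a thread pool and return path -> tree."""
    with ThreadPoolExecutor() as pool:
        trees = pool.map(parse_cached, sources.values())
        return dict(zip(sources.keys(), trees))


trees = parse_all({
    "a.py": "def f():\n    return 1\n",
    "b.py": "x = 2\n",
})
print(sorted(trees))  # ['a.py', 'b.py']
```

Keying the cache by content rather than by path means a file that is byte-identical to one already seen reuses the existing tree. Threads in CPython show the structure of the approach rather than true CPU parallelism.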
Hashing
Every node in the syntax tree (functions, classes, variables, blocks) gets a set of fingerprint hashes. These hashes capture the node's shape, content, name, and position separately, which is how symtrace can tell a move, a rename, and a modification apart. See Hashing Strategy for more.
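To see why separate hashes help, consider a toy version of the idea. The Python sketch below is an assumption-laden illustration, not the scheme described in Hashing Strategy: the `Node` fields, the `fingerprints` function, and the choice of what feeds each hash are all hypothetical. In particular, "shape" is approximated here by blanking out every word in the body.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str   # e.g. "function"
    name: str
    body: str   # source text of the body, excluding the name
    path: str
    line: int


def _h(*parts: str) -> str:
    """Short hex digest over the given parts."""
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()[:12]


def fingerprints(node: Node) -> dict[str, str]:
    """One hash per aspect of the node (hypothetical breakdown)."""
    return {
        # Shape: body with every identifier and literal replaced by "_".
        "shape": _h(node.kind, re.sub(r"\w+", "_", node.body)),
        "content": _h(node.kind, node.body),
        "name": _h(node.name),
        "position": _h(node.path, str(node.line)),
    }


def differing(a: Node, b: Node) -> list[str]:
    """Names of the fingerprints that differ between two nodes."""
    fa, fb = fingerprints(a), fingerprints(b)
    return sorted(k for k in fa if fa[k] != fb[k])


orig = Node("function", "total", "return a + b", "math.py", 10)
moved = Node("function", "total", "return a + b", "math.py", 40)
renamed = Node("function", "sum_two", "return a + b", "math.py", 10)
modified = Node("function", "total", "return a + c", "math.py", 10)

print(differing(orig, moved))     # ['position']
print(differing(orig, renamed))   # ['name']
print(differing(orig, modified))  # ['content']
```

Each kind of change disturbs a different subset of the hashes, so the subset that differs is itself the signal.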
Matching
A multi-phase algorithm compares old and new nodes to classify each change as MOVE, RENAME, MODIFY, INSERT, or DELETE. It starts with exact matches and progressively loosens criteria. See Matching Algorithm for the phase breakdown.
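The "exact first, then looser" idea can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm documented in Matching Algorithm: the three phases, their order, and the `Node` fields are assumptions chosen to keep the example small, and raw strings are compared where a real implementation would compare hashes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    body: str
    line: int


def match(old: list[Node], new: list[Node]) -> list[tuple]:
    """Return (kind, old_name, new_name) records for the detected changes."""
    changes: list[tuple] = []
    old_left, new_left = list(old), list(new)

    def pair(key, label) -> None:
        # Index the still-unmatched new nodes by this phase's key.
        index: dict = {}
        for n in new_left:
            index.setdefault(key(n), []).append(n)
        for o in list(old_left):
            candidates = index.get(key(o))
            if candidates:
                n = candidates.pop(0)
                old_left.remove(o)
                new_left.remove(n)
                kind = label(o, n)
                if kind:
                    changes.append((kind, o.name, n.name))

    # Phase 1: same name and body; only the position may differ.
    pair(lambda x: (x.name, x.body),
         lambda o, n: None if o.line == n.line else "MOVE")
    # Phase 2: same body under a different name.
    pair(lambda x: x.body, lambda o, n: "RENAME")
    # Phase 3: same name with a different body.
    pair(lambda x: x.name, lambda o, n: "MODIFY")
    # Whatever is still unmatched was removed or added.
    changes += [("DELETE", o.name, None) for o in old_left]
    changes += [("INSERT", None, n.name) for n in new_left]
    return changes


old_nodes = [Node("load", "read()", 1), Node("save", "write()", 5),
             Node("parse", "tokens()", 9), Node("legacy", "old()", 13)]
new_nodes = [Node("save", "write()", 1), Node("load_file", "read()", 5),
             Node("parse", "tokens(strict)", 9), Node("fresh", "new()", 13)]

for change in match(old_nodes, new_nodes):
    print(change)
```

Running the strictest phase first matters: a node that matches exactly is consumed before a looser phase can pair it with a merely similar candidate.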
Cross-file tracking
If a function moved from one file to another, or was renamed across modules, symtrace detects it and reports it as a cross-file operation.
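One way to picture cross-file tracking is as a second pass over per-file results: a deletion in one file and an insertion with the same body in another file are paired up. The Python sketch below assumes that approach; the `Change` record, the `cross_file` function, and the `CROSS_FILE_MOVE` and `CROSS_FILE_RENAME` labels are hypothetical and do not come from symtrace's output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    kind: str   # "DELETE" or "INSERT", as produced per file
    path: str
    name: str
    body: str


def cross_file(changes: list[Change]) -> list[tuple[str, str, str]]:
    """Pair deletions with same-body insertions that live in another file."""
    inserts: dict[str, list[Change]] = {}
    for c in changes:
        if c.kind == "INSERT":
            inserts.setdefault(c.body, []).append(c)

    results = []
    for d in changes:
        if d.kind != "DELETE":
            continue
        for i in inserts.get(d.body, []):
            if i.path != d.path:
                inserts[d.body].remove(i)  # each insertion is used once
                label = ("CROSS_FILE_MOVE" if i.name == d.name
                         else "CROSS_FILE_RENAME")
                results.append(
                    (label, f"{d.path}:{d.name}", f"{i.path}:{i.name}"))
                break
    return results


per_file = [
    Change("DELETE", "util.py", "slugify", "return s.lower()"),
    Change("INSERT", "text.py", "slugify", "return s.lower()"),
    Change("DELETE", "db.py", "connect", "return open_db()"),
    Change("INSERT", "storage.py", "open_connection", "return open_db()"),
    Change("INSERT", "new.py", "hello", "return 1"),
]

for op in cross_file(per_file):
    print(op)
```

The unmatched insertion in `new.py` stays an ordinary INSERT; only pairs that span two different paths are promoted to cross-file operations.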
Commit classification
Based on the types of changes found, symtrace labels the commit (feature, bugfix, refactor, cleanup, formatting_only) with a confidence percentage.
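This page does not describe how the label or the confidence figure is computed, so the following Python sketch is purely illustrative. It assumes a simple vote: each operation type counts toward one label, the label with the most votes wins, and the confidence is that label's share of all votes. The mapping from operations to labels and the `classify` function are invented for the example.

```python
from collections import Counter


def classify(ops: list[str]) -> tuple[str, int]:
    """Return (label, confidence percent) for a list of operation kinds."""
    counts = Counter(ops)
    # Hypothetical mapping from operation kinds to commit labels.
    scores = {
        "feature": counts["INSERT"],
        "refactor": counts["MOVE"] + counts["RENAME"],
        "bugfix": counts["MODIFY"],
        "cleanup": counts["DELETE"],
    }
    total = sum(scores.values())
    if total == 0:
        # Files changed but no structural operation was found.
        return ("formatting_only", 100)
    label = max(scores, key=scores.get)
    return (label, round(100 * scores[label] / total))


print(classify(["MOVE", "RENAME", "MOVE", "MODIFY"]))  # ('refactor', 75)
print(classify([]))                                    # ('formatting_only', 100)
```

A real classifier would likely weigh more signals than raw counts; the point here is only that the label and the percentage can both fall out of the same tally.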
Output
Results are printed as colored terminal output by default, or as structured JSON with --json. See Output Formats for details.