Performance Benchmarking
SSI uses a two-tier benchmarking system to track performance over time and investigate regressions.
Tier 1: Timing
./scripts/benchmark-timing.py
Runs timing measurements across benchmark sites. Takes approximately 4–7 minutes. Results are stored in benchmarks/timing-history.toml and published to the ssi-benchmarks Codeberg repo.
Run after every commit to main to maintain a performance history.
Tier 2: Profiling
./scripts/benchmark-profiling.py
Runs DWARF profiling, generates flamegraphs, and produces differential analysis against a baseline. Takes approximately 15–20 minutes. Requires Tier 1 data for the current version. Results are published to the ssi-benchmarks Codeberg repo.
Run when timing measurements show changes worth investigating.
Benchmark Sites
Benchmark sites are located in tests/test-examples/ alongside the functional test sites. Each benchmark site is sized to produce meaningful timing measurements and exercises specific SSI features.
Validation Performance (v0.89+)
v0.89 separated validation logic from the build pipeline, which significantly improved build speed:
Impact:
- Build speed: 5–7% improvement in standard deployments
- Up to 44.9% improvement observed in some configurations (79.01ms → 43.51ms average)
Architecture change: Validation now runs only when explicitly requested via ssi validate, removing the overhead from the standard build path.
# Before: validation always active
ssi deploy site/ output/ # 79.01ms average
# After: validation optional
ssi deploy site/ output/ # 43.51ms average (validation disabled)
ssi validate site/ # Full validation on demand
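The 44.9% figure follows directly from the two averages quoted above. As a quick check, here is a small sketch of the arithmetic; the helper name `percent_improvement` is made up for illustration and is not part of SSI or its benchmark scripts.

```python
def percent_improvement(before_ms: float, after_ms: float) -> float:
    """Relative reduction in time, as a percentage of the 'before' value."""
    return (before_ms - after_ms) / before_ms * 100.0


# Averages quoted in this section: 79.01ms before, 43.51ms after.
print(f"{percent_improvement(79.01, 43.51):.1f}%")  # prints 44.9%
```

The improvement is measured relative to the old time, so a drop from 79.01ms to 43.51ms is a 44.9% reduction, not a 44.9% increase in throughput.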
Common Bottlenecks
Profiling has identified the following bottlenecks and mitigations:
- File I/O: Batch operations where possible
- String allocations: Reuse buffers in hot paths
- Path resolution: Cache resolved paths
Token matching uses the Aho-Corasick algorithm (via the daachorse crate), which scans for all configured tokens in a single pass, so it is not a bottleneck.
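To illustrate why a single pass is enough, here is a minimal textbook sketch of Aho-Corasick in Python. It is not SSI's code and not how daachorse is implemented internally (daachorse uses a compact double-array representation). The token set is the classic teaching example, not SSI's token syntax.

```python
from collections import deque


def build_automaton(tokens):
    """Build goto/fail/output tables for a set of tokens (illustrative only)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> tokens that end when this state is reached
    for token in tokens:
        state = 0
        for ch in token:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(token)

    # Breadth-first pass: compute failure links level by level.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the fallback state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Scan text once and return (start_index, token) for every match."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for token in out[state]:
            matches.append((i - len(token) + 1, token))
    return matches


automaton = build_automaton(["he", "she", "his", "hers"])
print(find_all("ushers", automaton))
# [(1, 'she'), (2, 'he'), (2, 'hers')]
```

The scan never moves backwards through the input, so its cost grows with the length of the text plus the number of matches rather than with the number of tokens.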
Interpreting Results
- Single runs can be noisy; watch patterns over time
- System load affects timing measurements
- Use instruction counts alongside wall time for more stable comparisons
- Profile with flamegraphs before optimizing
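Because single runs are noisy, it helps to summarize several runs rather than compare individual measurements. The sketch below shows one way to do that. `time_runs` and `summarize` are hypothetical helpers, not functions from the SSI benchmark scripts, and the sample values are made up.

```python
import statistics
import time


def time_runs(fn, repeats=10):
    """Call fn repeatedly and return wall-clock samples in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms):
    """Median and minimum are less sensitive to outliers than the mean."""
    return {
        "median_ms": statistics.median(samples_ms),
        "min_ms": min(samples_ms),
        "stdev_ms": statistics.stdev(samples_ms) if len(samples_ms) > 1 else 0.0,
    }


# Made-up samples: one run (90.0) was disturbed by system load.
print(summarize([40.0, 42.0, 44.0, 90.0, 41.0])["median_ms"])  # prints 42.0
```

In this example the single disturbed run pulls the mean up to 51.4ms while the median stays at 42.0ms, which is why the median is usually the better number to track over time.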
Troubleshooting
Inconsistent timing results
- Ensure the system is idle during benchmarking
- Disable CPU frequency scaling
- Close unnecessary applications
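On Linux, one way to check whether frequency scaling is likely to affect a run is to read the active cpufreq governor for each core from sysfs. The following is a sketch under that assumption. `read_governors` is a hypothetical helper, not part of the SSI scripts, and it returns an empty result on systems that do not expose these files.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every core that exposes cpufreq."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(cpu_root).glob(pattern)):
        # path looks like .../cpu0/cpufreq/scaling_governor
        governors[path.parts[-3]] = path.read_text().strip()
    return governors


for cpu, governor in read_governors().items():
    note = "" if governor == "performance" else "  (may add timing noise)"
    print(f"{cpu}: {governor}{note}")
```

A governor other than `performance` does not make results invalid, but it is a plausible cause of run-to-run variation.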
Missing baseline data
- Run ./scripts/benchmark-timing.py to generate
- Commit the updated history file
- Ensure version numbers are current