The proof, not the pitch — measured, not estimated

Reproduced locally on SigMap v7.30: whole-repo extraction across 405 repositories, 51 real coding tasks answered with vs without SigMap, and a BM25 re-ranker that lifts retrieval. We also A/B-tested a real agent (Devin) — and report that result honestly below, win or not. Every figure is a real measurement; the full method and raw data are open.

~99% fewer tokens (98.7% overall, 405 repos)
96× cheaper context (51 real coding tasks)
82.4% retrieval hit@5 (BM25 re-ranker, +7 pts)
Methodology & raw data (open repo) ↗

1 · Token reduction at scale — 321 repositories

1,765,696,549 → 23,427,118 tokens: a 98.7% overall reduction (95.6% average per repo). 84 of 405 repos use languages SigMap doesn't yet parse and are excluded from the headline.

321 repos supported
95.6% avg reduction
98.7% overall reduction
100/100 avg health

Reduction by language:

Python: 94.8% (80 repos)
TypeScript: 95.4% (42 repos)
Rust: 96.9% (38 repos)
Go: 96.5% (38 repos)
JavaScript: 93.9% (31 repos)
Java: 96.6% (25 repos)
PHP: 94.6% (14 repos)
Ruby: 96.4% (13 repos)
C#: 96.9% (10 repos)
Kotlin: 95.2% (9 repos)
Swift: 98.3% (8 repos)
Dart: 94.1% (6 repos)
Scala: 97.1% (4 repos)
Svelte: 91.4% (2 repos)
Vue: 94.9% (1 repo)
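
The overall figure and the per-repo average differ because they weight repos differently: the overall number comes from summed tokens, so large repos count for more, while the average treats every repo equally. A minimal Python sketch of both calculations follows. The totals are the ones reported above; the `sample_repos` values are hypothetical and exist only to show how the two numbers can diverge.

```python
def reduction(before: int, after: int) -> float:
    """Fraction of tokens removed: 1 - after / before."""
    return 1 - after / before

# Totals reported above for the supported repos.
TOTAL_BEFORE = 1_765_696_549
TOTAL_AFTER = 23_427_118
overall = reduction(TOTAL_BEFORE, TOTAL_AFTER)

# Hypothetical (before, after) token counts for three repos, illustration only.
sample_repos = [(1_000_000, 10_000), (50_000, 4_000), (20_000, 2_000)]

# Unweighted mean of per-repo reductions: every repo counts equally.
per_repo = [reduction(b, a) for b, a in sample_repos]
average = sum(per_repo) / len(per_repo)

# Pooled reduction: sum tokens first, so large repos dominate.
pooled = reduction(
    sum(b for b, _ in sample_repos),
    sum(a for _, a in sample_repos),
)

print(f"overall (reported totals): {overall:.1%}")
print(f"sample average per repo:   {average:.1%}")
print(f"sample pooled:             {pooled:.1%}")
```

In the hypothetical sample the pooled value exceeds the unweighted average because the largest repo also compresses best, which is the same direction as the gap between the two reported figures.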

2 · Real coding tasks — 51 tasks, with vs without SigMap

For each task we measured the tokens an LLM needs to answer using the whole repo versus only the files SigMap ranks. Tokens are model-reported; cost is derived.

99.2% fewer tokens
96× cheaper
$1.73 → $0.018 cost (51 tasks)
62.7% right file in top-5
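
As a rough illustration of how the "right file in top-5" and "× cheaper" figures can be computed, here is a minimal Python sketch. It assumes each task has one known target file and a ranked file list; the function names and the sample tasks are hypothetical and are not SigMap's actual evaluation code. Only the two dollar amounts come from the figures above.

```python
def hit_at_k(ranked_files: list[str], target: str, k: int = 5) -> bool:
    """True if the target file appears among the first k ranked files."""
    return target in ranked_files[:k]

def hit_rate(results: list[tuple[list[str], str]], k: int = 5) -> float:
    """Share of tasks whose target file is ranked in the top k."""
    hits = sum(hit_at_k(ranked, target, k) for ranked, target in results)
    return hits / len(results)

# Hypothetical tasks: (ranked file list, file that actually answers the task).
tasks = [
    (["auth.py", "db.py", "api.py"], "db.py"),                   # hit at rank 2
    (["a.py", "b.py", "c.py", "d.py", "e.py", "f.py"], "f.py"),  # miss, rank 6
]
rate = hit_rate(tasks)

# Cost ratio from the reported totals for the 51 tasks.
COST_WHOLE_REPO = 1.73
COST_WITH_SIGMAP = 0.018
cost_ratio = COST_WHOLE_REPO / COST_WITH_SIGMAP

print(f"hit@5 on sample tasks: {rate:.1%}")
print(f"cost ratio: {cost_ratio:.0f}x")
```

If all 51 tasks are counted, a 62.7% hit rate corresponds to 32 tasks with the right file in the top 5; that count is inferred from the percentage rather than stated in the results.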

3 · Does it also make a real agent faster? — we tested it honestly

The savings above are about context size and cost, and they're deterministic. A separate question is whether smaller context also makes an autonomous agent finish faster. We A/B-tested one (Devin) on these tasks — A = task only, B = SigMap context first — at 3 reps each. The honest result: too close to call.

8.4 min without SigMap (avg, completed runs)
8.0 min with SigMap (avg, completed runs)
≈ Tie, within measurement noise

What this means: this is not evidence that SigMap slows an agent down; the two are statistically even. The agent's run-to-run variance is simply large (some runs exceeded our 30-min measurement cap), so we can't yet claim a speedup or a slowdown. An early single run looked like a big win (~61%) but didn't hold up across 3 reps, so we're showing you the real result, not that number. SigMap's proven value is the ~99% smaller, 96× cheaper, better-retrieved context above; what your agent does with that head start is the open question we're still measuring.
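
One common way to check whether a small gap between two arms is "within noise" is Welch's t statistic, which compares the difference in means against the run-to-run spread. The sketch below is an assumption about how such a check could be done, not the method used for the result above, and the run times are hypothetical placeholders rather than the measured runs.

```python
from math import sqrt
from statistics import mean, stdev

def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t statistic for two independent samples with unequal variances."""
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    standard_error = sqrt(var_a / len(a) + var_b / len(b))
    return (mean(a) - mean(b)) / standard_error

# Hypothetical completion times in minutes, 3 reps per arm (NOT the real data).
without_context = [6.0, 8.0, 11.0]
with_context = [5.5, 8.5, 10.0]

t = welch_t(without_context, with_context)
print(f"mean without: {mean(without_context):.1f} min")
print(f"mean with:    {mean(with_context):.1f} min")
print(f"Welch t:      {t:.2f}")
```

A |t| well below 1 means the difference in means is smaller than its own standard error. With only 3 reps per arm, a far larger gap would be needed before a speedup could be claimed with any confidence.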

What these numbers don't say