The proof, not the pitch — measured, not estimated

Reproduced locally on SigMap v7.30: whole-repo extraction across 405 repositories, 51 real coding tasks answered with vs without SigMap, and a BM25 re-ranker that lifts retrieval. We also A/B-tested a real agent (Devin) — and report that result honestly below, win or not. Every figure is a real measurement; the full method and raw data are open.

~99%: fewer tokens; 98.7% overall, 405 repos
96×: cheaper context; 51 real coding tasks
82.4%: retrieval hit@5; BM25 re-ranker, +7pts

Methodology & raw data (open repo) ↗

1 · Token reduction at scale — 321 repositories

1,765,696,549 → 23,427,118 tokens: a 98.7% overall reduction (95.6% average per repo). 84 of 405 repos use languages SigMap doesn't yet parse and are excluded from the headline.

321: repos supported
95.6%: avg reduction
98.7%: overall reduction
100/100: avg health

Python: 94.8%, 80 repos
TypeScript: 95.4%, 42 repos
Rust: 96.9%, 38 repos
(unlabeled): 96.5%, 38 repos
JavaScript: 93.9%, 31 repos
Java: 96.6%, 25 repos
PHP: 94.6%, 14 repos
Ruby: 96.4%, 13 repos
(unlabeled): 96.9%, 10 repos
Kotlin: 95.2%, 9 repos
Swift: 98.3%, 8 repos
Dart: 94.1%, 6 repos
Scala: 97.1%, 4 repos
Svelte: 91.4%, 2 repos
Vue: 94.9%, 1 repo
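
The section quotes two different reductions (98.7% overall, 95.6% average per repo), and the gap between them is easy to misread. The sketch below shows one plausible reading: "overall" pools every token before dividing, so large repositories dominate, while "average" gives each repo equal weight. The two totals come from the text above; the three-repo sample and all function names are hypothetical and are not SigMap's own code.

```python
# Totals quoted in this section; everything else here is illustrative.
TOKENS_BEFORE = 1_765_696_549
TOKENS_AFTER = 23_427_118


def reduction(before: int, after: int) -> float:
    """Fraction of tokens removed, e.g. 0.987 for a 98.7% reduction."""
    return 1.0 - after / before


def overall_reduction(repos: list[tuple[int, int]]) -> float:
    """Pool all tokens first, then compute a single reduction."""
    total_before = sum(before for before, _ in repos)
    total_after = sum(after for _, after in repos)
    return reduction(total_before, total_after)


def average_reduction(repos: list[tuple[int, int]]) -> float:
    """Compute one reduction per repo, then take the unweighted mean."""
    return sum(reduction(before, after) for before, after in repos) / len(repos)


headline = reduction(TOKENS_BEFORE, TOKENS_AFTER)
print(f"overall reduction from the quoted totals: {headline:.1%}")

# Hypothetical (before, after) token counts: one large repo and two small ones.
sample = [(10_000_000, 100_000), (50_000, 4_000), (20_000, 2_000)]
print(f"pooled: {overall_reduction(sample):.1%}")
print(f"per-repo mean: {average_reduction(sample):.1%}")
```

In the hypothetical sample the pooled figure is higher than the per-repo mean because the largest repo also compresses best, which is the same direction as the gap reported above.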

2 · Real coding tasks — 51 tasks, with vs without SigMap

For each task we measured the tokens an LLM needs to answer using the whole repo versus only the files SigMap ranks. Tokens are model-reported; cost is derived.

99.2%: fewer tokens
96×: cheaper
$1.73 → $0.018: cost (51 tasks)
62.7%: right file in top-5
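
Since cost is derived rather than measured, it helps to see the arithmetic. The sketch below checks that the two quoted dollar totals give the 96× factor, and shows how a cost could be derived from a token count. The per-million-token price and the 500,000-token example are hypothetical placeholders, not the rates or counts used in the benchmark.

```python
def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Derive a cost from a token count and a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


def savings_factor(cost_without: float, cost_with: float) -> float:
    """How many times cheaper the second cost is than the first."""
    return cost_without / cost_with


# Dollar totals quoted above for the 51 tasks.
factor = savings_factor(1.73, 0.018)
print(f"{factor:.0f}x cheaper")

# Hypothetical price, for illustration only.
HYPOTHETICAL_USD_PER_MILLION = 3.00
example_cost = cost_usd(500_000, HYPOTHETICAL_USD_PER_MILLION)
print(f"500,000 tokens at the hypothetical rate: ${example_cost:.2f}")
```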

3 · Does it also make a real agent faster? — we tested it honestly

The savings above are about context size and cost, and they're deterministic. A separate question is whether smaller context also makes an autonomous agent finish faster. We A/B-tested one (Devin) on these tasks — A = task only, B = SigMap context first — at 3 reps each. The honest result: too close to call.

8.4m: no SigMap (avg, completed)
8.0m: with SigMap (avg, completed)
≈ Tie: within measurement noise

What this means: this is not evidence that SigMap slows an agent down; the two arms are statistically even. The agent's run-to-run variance is simply large (some runs exceeded our 30-min measurement cap), so we can't yet claim a speedup or a slowdown. An early single run looked like a big win (~61%) but didn't hold up across 3 reps, so we're showing you the real result, not that number. SigMap's proven value is the ~99% smaller, 96× cheaper, better-retrieved context above; what your agent does with that head start is the open question we're still measuring.
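
To make "within measurement noise" concrete, here is a minimal sketch of one way to compare two arms of an A/B test: a Welch-style t statistic computed with the standard library. It is not the analysis used for the result above. The per-run durations are hypothetical, chosen only so that their means match the reported 8.4 and 8.0 minutes, and the threshold of 2.0 is a rough rule of thumb; with only 3 reps per arm a formal test would need a stricter cutoff.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a: list[float], b: list[float]) -> float:
    """Difference of means divided by its standard error (unequal variances)."""
    diff = mean(a) - mean(b)
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if standard_error == 0:
        return 0.0 if diff == 0 else float("inf")
    return diff / standard_error


def within_noise(a: list[float], b: list[float], threshold: float = 2.0) -> bool:
    """True when the gap between the arms is small relative to run-to-run spread."""
    return abs(welch_t(a, b)) < threshold


# Hypothetical run times in minutes, 3 reps per arm (means 8.4 and 8.0).
no_sigmap = [6.1, 8.4, 10.7]
with_sigmap = [5.5, 8.0, 10.5]

print(f"t = {welch_t(no_sigmap, with_sigmap):.2f}")
print("within noise" if within_noise(no_sigmap, with_sigmap) else "clear difference")
```

With spreads this wide, a 0.4-minute gap in the means is far smaller than the standard error, which is why a result like this is reported as a tie rather than a speedup.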

What these numbers don't say

• 84 of 405 repos use languages SigMap doesn't yet parse (Clojure/Lua/C/C++/Haskell); they are excluded from the headline, not hidden.
• Retrieval precision is the ceiling: even the BM25 re-ranker (82.4% hit@5) sometimes surfaces a neighbour file, not the exact target.
• We did NOT find a reproducible agent wall-clock speedup: a 3-rep Devin A/B came out within noise (8.4 vs 8.0 min on completed runs), with high variance and some sessions exceeding our 30-min cap. The token/cost savings above are deterministic; the agent-speed question is still open.
• Devin's ACUs (its billing unit) aren't exposed by the API, only on its dashboard, so cost-per-task on the agent side is not yet measured.
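
For readers unfamiliar with the hit@5 metric in the retrieval caveat, the sketch below shows the usual definition: a task counts as a hit when its target file appears among the first k ranked results. The function name, file paths, and rankings are hypothetical examples, not data from the benchmark.

```python
def hit_at_k(rankings: list[list[str]], targets: list[str], k: int = 5) -> float:
    """Share of tasks whose target file appears in the top-k ranked files."""
    hits = sum(
        1 for ranked, target in zip(rankings, targets) if target in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical ranked results for three tasks.
rankings = [
    ["auth/login.py", "auth/session.py"],
    ["db/models.py", "db/query.py"],  # neighbour files only, target is missed
    ["api/routes.py"],
]
targets = ["auth/session.py", "db/schema.py", "api/routes.py"]

print(f"hit@5: {hit_at_k(rankings, targets):.1%}")
print(f"hit@1: {hit_at_k(rankings, targets, k=1):.1%}")
```

The second task illustrates the caveat: the ranker surfaces files from the right directory but not the exact target, so it counts as a miss.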