How the measurement was made
Capture one real SDK request, count three valid variants, then subtract adjacent results.
Capture the request
Run the Agent SDK's query() against a loopback endpoint and record the outgoing Messages request before inference. The capture uses isolated settings and no claude_code preset.
- Capture procedure: SDK request capture
- Observed request: Captured SDK request
- Provenance record: Capture provenance
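The capture step can be sketched with a small loopback server that records each POST body and then refuses the request, so nothing reaches inference. This is a minimal illustration using only the Python standard library, not the investigation's actual capture code: the names `CaptureHandler`, `captured`, and `start_capture_server` are hypothetical, and how the SDK is pointed at the loopback address is left out because the source does not describe that mechanism.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store; each entry holds one recorded request.
captured = []


class CaptureHandler(BaseHTTPRequestHandler):
    """Record every POST and answer with an error so nothing is forwarded."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        captured.append({
            "path": self.path,
            "headers": dict(self.headers),
            "body": json.loads(raw) if raw else None,
        })
        # Reply with a client error: the request has been recorded and
        # no model is ever called.
        payload = json.dumps({
            "type": "error",
            "error": {"type": "invalid_request_error", "message": "capture only"},
        }).encode()
        self.send_response(400)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        # Silence the default per-request logging.
        pass


def start_capture_server():
    """Bind to 127.0.0.1 on a free port and serve in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), CaptureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After one client request, `captured[0]["body"]` holds the parsed JSON payload, which can be written to disk as the observed request.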
Count three valid requests
Because count_tokens accepts complete Messages requests, each valid variant removes one category from the same captured request while holding everything else fixed.
- A. Full captured request: all captured countable context. 31,432 tokens
- B. Without tool definitions: the same request with tools removed. 1,787 tokens
- C. Without tools or skill descriptions: the same request with both removed. 301 tokens
- Measurement specification: Measurement plan
- Measurement ledger: Raw count records
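The three variants can be sketched as copies of one captured request with a single category removed at each step. This is an illustration under assumptions: `build_variants` and `count_variants` are hypothetical helpers, and because the source does not say where skill descriptions sit inside the request, the caller supplies a `strip_skill_descriptions` function. The counter is injected as well, so the sketch runs without network access.

```python
import copy


def build_variants(captured_request, strip_skill_descriptions):
    """Return variants A, B, and C derived from one captured request.

    `strip_skill_descriptions` is a caller-supplied function (an assumption
    of this sketch) that removes skill descriptions from a request dict and
    returns the result.
    """
    # A: the full captured request, untouched.
    full = copy.deepcopy(captured_request)

    # B: the same request with tool definitions removed.
    without_tools = copy.deepcopy(captured_request)
    without_tools.pop("tools", None)

    # C: the same request with tools and skill descriptions removed.
    without_tools_or_skills = strip_skill_descriptions(copy.deepcopy(without_tools))

    return {"A": full, "B": without_tools, "C": without_tools_or_skills}


def count_variants(variants, count_fn):
    """Apply one counting function to every variant so settings stay fixed."""
    return {name: count_fn(request) for name, request in variants.items()}
```

With the Anthropic Python SDK, `count_fn` could wrap `client.messages.count_tokens(...)` and read `.input_tokens` from the result, after dropping any captured fields that the counting endpoint does not accept.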
Subtract adjacent counts
Subtract adjacent results to isolate the marginal count of each removed category. The three derived values reconstruct the full request.
- Tools included: 31,432 - 1,787 = 29,645 tokens
- Skill descriptions: 1,787 - 301 = 1,486 tokens
- Base request: 301 tokens remaining
Removing the user prompt from the full request returned 31,428 tokens, a 4-token difference from 31,432. This supports the headline but is outside the additive accounting.
- Derivation procedure: Token measurement
- Generated result: Machine-readable result
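The subtraction above can be checked in a few lines. The sketch uses only the counts reported in this section; the variable names are illustrative.

```python
# Counts reported for the three valid variants.
counts = {"A": 31_432, "B": 1_787, "C": 301}

# Adjacent differences isolate each removed category.
tools = counts["A"] - counts["B"]    # tool definitions
skills = counts["B"] - counts["C"]   # skill descriptions
base = counts["C"]                   # what remains after both removals

# The three derived values reconstruct the full request.
assert tools + skills + base == counts["A"]

# Side check outside the additive accounting: the user prompt.
no_prompt = 31_428
prompt_delta = counts["A"] - no_prompt

print(tools, skills, base, prompt_delta)  # 29645 1486 301 4
```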
How to read the numbers
The overview is additive. Item estimates are not.
Controls
Each comparison keeps the model, system, messages, thinking, output configuration, beta headers, and other fields fixed.
Meaning
Anthropic calls token counting an estimate and may add optimization tokens. The differences measure marginal request cost, not standalone JSON tokenization.
Item estimates
The 40 per-item calls compare items against one baseline. Use them to rank items; do not sum them.
- Interpretation constraints: Primary sources
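A toy model shows why per-item estimates taken against one baseline can rank items yet fail to add up. Everything here is hypothetical: `toy_count` stands in for the real counter, the item sizes are invented, and the fixed `shared_overhead` is one assumed reason for non-additivity, not a description of how Anthropic counts tokens. The sketch also assumes each item is added alone to an empty baseline.

```python
def toy_count(items, base=300, shared_overhead=50):
    """Hypothetical counter: a fixed overhead appears once any item is present."""
    if not items:
        return base
    return base + shared_overhead + sum(items.values())


# Invented item sizes, for illustration only.
item_sizes = {"tool_a": 120, "tool_b": 40, "tool_c": 75}

baseline = toy_count({})

# One call per item, each compared against the same baseline.
estimates = {
    name: toy_count({name: size}) - baseline
    for name, size in item_sizes.items()
}

# Ranking is meaningful: every estimate carries the same overhead.
ranking = sorted(estimates, key=estimates.get, reverse=True)

# Summing is not: the overhead is counted once per item instead of once.
summed = sum(estimates.values())
group_delta = toy_count(item_sizes) - baseline

print(ranking)              # ['tool_a', 'tool_c', 'tool_b']
print(summed, group_delta)  # 385 285
```

In this toy case the sum overstates the group cost by the overhead times the number of extra items, which is why the group totals come from adjacent whole-request counts instead.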
Reproduce the investigation
One entrypoint rebuilds the evidence chain: it captures the request, counts the variants, and rebuilds the publication.
- Replay entrypoint: Pipeline entrypoint