GitHub Slashes Agent Workflow Token Spend up to 62% with Daily Audits and MCP Pruning
GitHub's token cost optimization techniques for agentic workflows are directly actionable and novel.
GitHub cut token usage in its agentic CI workflows by up to 62% by pruning unused Model Context Protocol (MCP) tools, replacing MCP calls with gh CLI commands, and deploying daily audit and optimization agents. The team uses an Effective Tokens (ET) metric that weights output tokens 4× and cache reads 0.1×, with model multipliers (Haiku 0.25×, Sonnet 1.0×, Opus 5.0×), to normalize cost across models. The Daily Token Usage Auditor and Daily Token Optimiser agents, shipped in the gh-aw CLI, surfaced that removing unused MCP tools cut per-call context by 8–12 KB in smoke-test workflows, though pruning was ineffective when tool manifests were a small fraction of overall context (e.g., Community Attribution).
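As a rough illustration of how the ET weighting described above could be computed, here is a minimal Python sketch. The output weight, cache-read weight, and model multipliers come from the summary. The 1.0× weight for uncached input tokens and the choice to multiply the weighted sum by the model multiplier are assumptions, since the summary does not spell out the full formula. The `Usage` type and `effective_tokens` function are hypothetical names, not part of gh-aw.

```python
from dataclasses import dataclass

# Weights stated in the summary.
OUTPUT_WEIGHT = 4.0
CACHE_READ_WEIGHT = 0.1
# Assumption: uncached input tokens are the 1.0x baseline (not stated in the summary).
INPUT_WEIGHT = 1.0

# Model multipliers stated in the summary.
MODEL_MULTIPLIER = {"haiku": 0.25, "sonnet": 1.0, "opus": 5.0}


@dataclass(frozen=True)
class Usage:
    """Token counts for one agent run (hypothetical structure)."""
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int = 0


def effective_tokens(usage: Usage, model: str) -> float:
    """Return a model-normalized token figure.

    Assumption: the model multiplier scales the weighted token sum.
    """
    weighted = (
        INPUT_WEIGHT * usage.input_tokens
        + OUTPUT_WEIGHT * usage.output_tokens
        + CACHE_READ_WEIGHT * usage.cache_read_tokens
    )
    return MODEL_MULTIPLIER[model] * weighted


# Example run: 10,000 input, 1,000 output, 50,000 cache-read tokens.
run = Usage(input_tokens=10_000, output_tokens=1_000, cache_read_tokens=50_000)
for model in ("haiku", "sonnet", "opus"):
    print(model, effective_tokens(run, model))
```

Under these assumptions the same run is 20 times more expensive on Opus than on Haiku, which is the kind of cross-model comparison a normalized metric is meant to make visible.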
- Implement a daily audit-and-optimize agent loop that tracks token usage via a normalized metric (ET), prunes unused MCP tools, and replaces expensive MCP calls with pre-downloaded CLI commands to cut agent workflow token usage by up to 62%.
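The pruning step in the takeaway above can be sketched as a simple comparison between the tools a workflow exposes and the tools its agent actually called. This is a hypothetical illustration, not the gh-aw implementation: the `unused_tools` function, the tool names, and the call-log format are all invented for the example.

```python
from collections import Counter


def unused_tools(configured: set[str], call_log: list[str]) -> list[str]:
    """Return configured tool names that never appear in the call log."""
    calls = Counter(call_log)
    return sorted(tool for tool in configured if calls[tool] == 0)


# Hypothetical tool names and call log, for illustration only.
configured = {"list_issues", "get_pull_request", "search_code", "create_comment"}
call_log = ["list_issues", "create_comment", "list_issues"]

print(unused_tools(configured, call_log))
```

A daily audit could run a check like this across recent runs and flag tools that are never called as candidates for removal. As the summary notes, the payoff depends on how much of the context the tool manifests occupy.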
For engineers building LLM agent pipelines in CI, this approach provides a proven pattern (proxy-level token observability plus automated optimization agents) for systematically reducing runaway token costs without sacrificing functionality.