Your coding agent burned 60% of its tokens before it even started coding.
Your coding agent is a glutton, re-reading the same files every session like it's an all-you-can-eat buffet. Lettuce serves it a smaller, smarter plate. Same brain, much lighter token bill.
Every session, your agent starts blind.
Open Cursor, Claude Code, Copilot, or anything else, and the agent walks into your repo with zero memory of it. So it greps. It opens files at random. It re-reads the same modules it read yesterday, last week, an hour ago. Then it writes the line of code you asked for.
Blind grep + file reads
The agent has no map, so it scans. The bigger the repo, the more it scans.
Time wasted before coding
Most of the wall-clock time in a session goes to the agent finding its way around, not to the edit you asked for.
You pay for the rediscovery
60% of input tokens, on average, go to context the agent could have looked up once.
Index once. Ask, don't grep.
Lettuce is a hosted MCP server. It indexes your repo into a graph of symbols, callers, imports, and dependencies, then exposes small tools your agent calls instead of scanning files.
Index your repo
Add a repo from the dashboard. Lettuce builds a graph: every function, class, file, the edges between them, and a one-line summary of each.
Connect your agent
Paste one MCP URL + key into Claude Code, Cursor, Copilot, Windsurf, Cline, OpenCode, or anything else that speaks MCP. It takes about 60 seconds.
The agent asks the map
Instead of grepping, the agent calls understand, callers, search. One call returns file:line + signature + callers, not 10k lines of file dumps.
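To make the idea of a symbol graph concrete, here is a toy sketch in Python. It is not Lettuce's indexer: it handles a single Python file with the standard-library ast module, it only tracks direct calls by plain name, and the function names index_source and lookup_callers are invented for illustration. What it does show is the trade described above: one lookup returns callers with file:line instead of a file dump.

```python
import ast
from collections import defaultdict


def index_source(path, source):
    """Index one Python file: where each function lives and who calls whom."""
    tree = ast.parse(source)
    symbols = {}                  # function name -> "path:line"
    callers = defaultdict(set)    # callee name -> names of calling functions
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols[node.name] = f"{path}:{node.lineno}"
            # Record every plain-name call made inside this function body.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callers[inner.func.id].add(node.name)
    return symbols, callers


def lookup_callers(symbols, callers, name):
    """Answer 'where does X get called?' from the index, without re-reading files."""
    return sorted(f"{c} ({symbols[c]})" for c in callers.get(name, ()) if c in symbols)


source = (
    "def parse(x):\n"
    "    return x\n"
    "\n"
    "def load(p):\n"
    "    return parse(p)\n"
    "\n"
    "def main():\n"
    "    load('a')\n"
    "    parse('b')\n"
)
symbols, callers = index_source("app.py", source)
print(lookup_callers(symbols, callers, "parse"))
# ['load (app.py:4)', 'main (app.py:7)']
```

The index is built once; every later question is a dictionary lookup. A production graph would also need imports, classes, methods, and cross-file edges, which this sketch leaves out.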
A webhook on your repo re-indexes on every push. Your agent never reads a stale map.
No new workflow for your devs. The tools just show up in their agent: add a short instruction to the system prompt and you're done.
Measured, not modeled: a real Claude Code agent solving real GitHub issues, run with and without Lettuce. See the run.
Why ship Lettuce to your team
If your engineers run coding agents (Claude Code, Cursor, Copilot, anything), Lettuce is a one-day install that pays back in week one. No new tool to learn, no migration, no model change.
Big repos stop being a problem
A 200k-file monorepo isn't browsable in 200k tokens. The graph collapses it into the 5 symbols the agent actually needs. PRs that used to time out now finish.
Predictable, lower agent spend
Same prompts, 60% fewer input tokens. Across a team of 20 devs running agents daily, that's a five-figure line item, without changing models or rate-limiting.
Faster PR turnaround
17% less wall-clock time per task in our benchmark. Your devs ship more PRs per hour because the agent stops re-reading the same files.
Onboard new hires faster
A new engineer's agent walks into your codebase with the same map a senior's agent has. They ask 'where does X get called?' and get a real answer in one call, not a 20-minute scavenger hunt.
No code change, no risk
It's an MCP server. Your devs paste a URL + key into their agent config. If Lettuce disappears tomorrow, their agent just goes back to grepping.
Enterprise-ready
Self-host in your VPC, SSO, audit logs, per-team budgets. Talk to us about a rollout.
Get started
Two paths in: connect your agent over MCP, or talk to us if you're rolling Lettuce out across a team.
Connect your agent
For individual devs and small teams. Add a repo from the dashboard, point your agent at the Lettuce MCP endpoint, watch the token bill drop. No CLI to install.
# Add a repo from the dashboard, then:
$ claude mcp add --transport http lettuce \
    https://coze.clickclick.cloud/mcp \
    --header "Authorization: Bearer cwz_..."
For companies
Self-hosted indexer, SSO, audit logs, per-repo budgets, and a human you can email when something breaks. We'll size it to your codebase and your agent fleet.
- Deploy in your VPC or ours
- SAML / OIDC SSO + SCIM
- Per-team token caps & usage reports
- Dedicated Slack channel + onboarding
Works with every coding agent you use
Lettuce is just an MCP server. If your agent speaks MCP, it can connect, usually in under a minute. Pick yours and copy the snippet.
Claude Code
Register the server with one CLI command. No config file to edit.
Cursor
Drop the server into .cursor/mcp.json (project) or ~/.cursor/mcp.json (global).
VS Code · Copilot
Add to .vscode/mcp.json, switch Copilot to Agent mode, and the tools show up.
Windsurf
Cascade reads ~/.codeium/windsurf/mcp_config.json. Paste the server there and hit Refresh.
Cline
Configure from the Cline side panel: choose Remote (HTTP) and add the bearer header.
OpenCode
Add the lettuce entry to mcp.json in the project or user config dir, then restart.
Codex CLI
Bridge stdio-only clients to the HTTP endpoint with mcp-remote. This works for any stdio agent.
Your own agent
Any MCP-capable client works. Point it at the endpoint with a bearer token and you're done.
Building your own agent? If it can call MCP tools, it works. See all connection guides.
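For a hand-rolled client, the request that carries a tool call looks roughly like the sketch below. It is built on MCP's JSON-RPC tools/call method and the endpoint shown in the Claude Code command on this page. The helper name build_tool_call, the placeholder key, and the argument name "symbol" are assumptions for illustration; check the tool schemas your server advertises for the real argument names. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Endpoint taken from the Claude Code command on this page.
MCP_URL = "https://coze.clickclick.cloud/mcp"


def build_tool_call(token, tool, arguments, request_id=1):
    """Build (but do not send) an MCP tools/call request over HTTP."""
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # MCP's HTTP transport may answer with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
    )


# "YOUR_KEY" and the "symbol" argument are hypothetical placeholders.
req = build_tool_call("YOUR_KEY", "callers", {"symbol": "parse_config"})
print(req.get_method(), req.full_url)
```

A real session starts with an initialize handshake before any tools/call, so in practice an MCP client SDK is the easier route; this only shows what travels over the wire.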
What that means in dollars
Drop in last month's AI agent bill. We apply the same input-token reduction we measured on real GitHub issues. There's no need to guess at sessions, developers, or models, because the ratio holds regardless.
Whatever your team is paying Anthropic, OpenAI, Cursor, etc. for agent usage right now. Pull it from last month's invoice.
The savings ratio comes from the published benchmark (Claude Code solving real GitHub issues with and without Lettuce; same methodology, see below). It's applied flat to whatever you're spending today, because Lettuce cuts input tokens regardless of the model on the other end.
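The calculator's arithmetic, as described here, is a single multiplication. A minimal sketch, assuming the 60% benchmark ratio is applied flat to the whole bill; the function name and the example bill of $12,000 are made up for illustration:

```python
INPUT_TOKEN_REDUCTION = 0.60  # ratio from the published benchmark


def estimated_monthly_savings(monthly_bill_usd, reduction=INPUT_TOKEN_REDUCTION):
    """Apply the benchmark reduction flat to last month's agent bill."""
    if monthly_bill_usd < 0:
        raise ValueError("bill must be non-negative")
    return round(monthly_bill_usd * reduction, 2)


# Hypothetical invoice total, for illustration only.
print(estimated_monthly_savings(12_000))  # 7200.0
```

Because the ratio is applied flat, the estimate scales linearly with the bill and treats the full invoice as input-token spend.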
How we compute "tokens saved"
We don't ask you for your "before" spend and we don't make it up. Every Lettuce tool call writes two numbers to the same row in our database:
- served: measured. The exact byte size of the response Lettuce returned to your agent, converted to tokens.
- baseline: modeled. What the agent would have read from source to answer the same question without us.
tokens saved is max(baseline - served, 0) summed across every call. The clamp matters: if a call ever returns more than its baseline, the row contributes zero, never a negative "saving."
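The clamped sum is short enough to write out. A minimal sketch of the formula above; the ToolCall record and the per-call numbers are invented for illustration and are not rows from the real database:

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    tool: str
    served: int    # measured: tokens actually returned to the agent
    baseline: int  # modeled: tokens the agent would have read without the tool


def tokens_saved(calls):
    """Sum max(baseline - served, 0) over every call."""
    return sum(max(c.baseline - c.served, 0) for c in calls)


# Made-up example rows.
calls = [
    ToolCall("callers", served=300, baseline=2_000),       # saves 1,700
    ToolCall("read_snippet", served=500, baseline=4_000),  # saves 3,500
    ToolCall("find_symbol", served=900, baseline=600),     # over baseline: counts as 0
]
print(tokens_saved(calls))  # 5200
```

Without the clamp, the third row would subtract 300 from the total; with it, an expensive call can only fail to help, never reduce the reported saving.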
How baseline is computed per tool:
- read_snippet: near-exact. The baseline is the full file the slice came from; the saving is everything we trimmed off.
- navigation tools (find_symbol, explain_symbol, callers): conservative. The baseline is the combined source span of the symbols the call returned: the lines the agent would have had to read to locate them. This ignores the surrounding file the agent would also have pulled in, so we under-count. The real saving is higher.
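The two baseline rules can be sketched as follows. The bytes-to-tokens conversion is an assumption: the page says byte size is converted to tokens but does not give the factor, so the 4 bytes per token used here is a placeholder, and all function names are hypothetical.

```python
from typing import Iterable

BYTES_PER_TOKEN = 4  # ASSUMPTION: the real conversion factor is not stated


def to_tokens(text: str) -> int:
    """Approximate a token count from UTF-8 byte size."""
    return len(text.encode("utf-8")) // BYTES_PER_TOKEN


def read_snippet_baseline(full_file: str) -> int:
    """read_snippet: the baseline is the whole file the slice came from."""
    return to_tokens(full_file)


def navigation_baseline(symbol_sources: Iterable[str]) -> int:
    """Navigation tools: the baseline is only the source spans of the returned symbols."""
    return sum(to_tokens(src) for src in symbol_sources)


# Synthetic inputs, for illustration only.
full_file = "x" * 4000
snippet = full_file[:400]
print(read_snippet_baseline(full_file) - to_tokens(snippet))  # 900
print(navigation_baseline(["a" * 400, "b" * 800]))            # 300
```

The navigation baseline counts only the symbol spans, not the files around them, which is why that estimate is the conservative one.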
The calculator on this page uses the same numbers, averaged across 99 real GitHub issues solved by Claude Code with and without Lettuce (14,969,244 baseline tokens vs. 5,934,465 served, a 60% reduction). See the full run.
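The headline percentage can be re-derived from the two totals quoted above. A quick check, using only the published figures:

```python
baseline_tokens = 14_969_244  # total baseline tokens across the benchmark
served_tokens = 5_934_465     # total served tokens across the benchmark
issues = 99

saved = baseline_tokens - served_tokens
reduction = saved / baseline_tokens

print(saved)                            # 9034779
print(f"{reduction:.1%}")               # 60.4%
print(round(baseline_tokens / issues))  # 151204 baseline tokens per issue
print(round(served_tokens / issues))    # 59944 served tokens per issue
```

The quoted 60% is this figure rounded down to a whole percent.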
Proof
We don't ask you to trust the pitch. Every run, every issue, every token count is published, including the cases where we lost.