Security & architecture

Your code shape. Not your code.

Lettuce indexes the structure of your repo — symbols, calls, types — so your coding agent can ask instead of grep. Pick the deployment model that fits your threat model. We'll show you the blueprint.

TL;DR for the security team

Four facts.

No full-file source ever ships to a model.

We extract symbols, signatures, call edges, and short code chunks. Whole files don't leave the graph builder.

Two deployment modes.

Managed cloud (we run it) or self-hosted (you run it, license-gated, with outbound traffic limited to a license heartbeat and a telemetry ping). Same product, different perimeter.

Per-tenant isolation, scoped tokens.

Every MCP request authenticates with a tenant-scoped bearer key. Repos, graphs, and embeddings are partitioned by tenant ID.

Bring your own everything (self-hosted).

OIDC IdP, secret store (Vault, AWS SM, GCP SM), git host, OTLP collector. Outbound flows from your cluster: a signed license heartbeat and a daily telemetry ping (version, license ID, aggregate counts) — each disabled with one env var.

Architecture blueprints

Two deployment models. Pick your perimeter.

Each diagram is component-level. Boxes are services Lettuce ships or third-party components you provide. Arrows are the actual data flows.

[Diagram: managed cloud blueprint. Your perimeter: git host webhook (push events), developer's agent (MCP client), admin browser (OAuth sign-in). Lettuce managed cloud: API gateway (TLS, auth, MCP), graph builder (ephemeral clone), metadata DB (tenants, repos), graph store (symbols, edges), embeddings store (vector index). External model provider: model API (chunks → vectors). Flow labels: push, MCP, OAuth, write, embed, read, chunks.]
Components
What each box does. No vendor names — the wiring is what matters.
  • Git host webhook: Pushes to your repo trigger a webhook into the API gateway.
  • API gateway: Public ingress. TLS-terminated. Authenticates MCP clients via tenant bearer tokens, signs in operators via OAuth.
  • Graph builder: Clones the repo into ephemeral storage, parses it, builds the symbol graph, then deletes the working tree.
  • Graph + metadata DB: Stores symbols, edges, summaries, and tenant rows. Partitioned by tenant ID.
  • Embeddings store: Vector index over short code chunks for retrieval.
  • Model provider: External LLM API used only for embeddings and one-line symbol summaries. Receives chunks, not whole files.
What leaves your perimeter
  • Webhook payloads from your git host (commit SHA, refs).
  • The cloned repo contents, fetched by the graph builder over the git host's API.
  • Short code chunks sent to the embedding model.
  • Symbol names + signatures sent to the summarisation model.
What never leaves
  • Full-file source code as a blob. Whole files stay in the graph builder; only chunks and structural metadata flow downstream.
  • Binaries, vendor directories, lockfiles, env files (dropped at parse).
  • Anything outside the repos you connect.
  • Your agent's chat history with the model — Lettuce only sees MCP tool calls.
Indexing

What we extract from your repos.

The graph builder is a parser, not a backup tool. Here's the shape of what ends up in the index.

Indexed
Structural metadata + short retrieval chunks.
  • Function / class / method names and signatures
  • Import edges, call edges, type references
  • Docstrings + leading comments (verbatim)
  • One-line LLM-generated summary per symbol
  • Short code chunks (function-sized) for vector search
  • File paths and language tags
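As a rough illustration of the list above, one indexed symbol could be modelled like this. The class and field names are hypothetical and chosen for this sketch; they are not Lettuce's actual schema. The function-sized retrieval chunk is assumed to live separately in the vector index.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class SymbolRecord:
    """Hypothetical shape of one indexed symbol (illustrative field names)."""
    name: str
    kind: str                 # "function", "class", or "method"
    signature: str
    file_path: str
    language: str
    docstring: str = ""       # docstring or leading comment, kept verbatim
    summary: str = ""         # one-line LLM-generated summary
    calls: list[str] = field(default_factory=list)      # call edges
    imports: list[str] = field(default_factory=list)    # import edges
    type_refs: list[str] = field(default_factory=list)  # type references


record = SymbolRecord(
    name="parse_config",
    kind="function",
    signature="def parse_config(path: str) -> Config",
    file_path="app/config.py",
    language="python",
    docstring="Load and validate the config file.",
    summary="Reads a config file and returns a validated Config.",
    calls=["open", "Config.from_dict"],
    type_refs=["Config"],
)
```

Note that the record carries structure and short text only; there is no field that holds a whole file.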
Dropped at the door
Filtered before anything writes to disk.
  • Binaries, images, archives, model weights
  • Vendored / generated directories (node_modules, vendor/, dist/, build/)
  • .env, .env.* files and dotfile secret bundles
  • Lockfiles > 1 MB (kept as path-only)
  • Anything matched by .gitignore
  • Files over the size threshold
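A minimal sketch of how filter rules like these might be expressed, assuming repo-relative POSIX paths. The suffix list, lockfile names, the general size threshold, and the treatment of small lockfiles are assumptions made for illustration, and .gitignore matching is left out for brevity.

```python
from pathlib import PurePosixPath

SKIP_DIRS = {"node_modules", "vendor", "dist", "build"}
# Illustrative subset of binary, image, archive, and model-weight suffixes.
BINARY_SUFFIXES = {".png", ".jpg", ".gif", ".zip", ".tar", ".gz",
                   ".so", ".exe", ".bin", ".pt", ".onnx"}
LOCKFILE_NAMES = {"package-lock.json", "yarn.lock", "poetry.lock", "Cargo.lock"}
LOCKFILE_LIMIT = 1_000_000   # 1 MB, as in the list above
MAX_FILE_BYTES = 2_000_000   # assumed value; the real threshold is not stated


def classify(path: str, size: int) -> str:
    """Return "index", "path_only", or "drop" for a repo-relative path."""
    p = PurePosixPath(path)
    # Any parent directory in the skip list drops the file.
    if SKIP_DIRS.intersection(p.parts[:-1]):
        return "drop"
    # .env and .env.* files are never read.
    if p.name == ".env" or p.name.startswith(".env."):
        return "drop"
    if p.suffix.lower() in BINARY_SUFFIXES:
        return "drop"
    # Large lockfiles keep their path but not their contents.
    if p.name in LOCKFILE_NAMES:
        return "path_only" if size > LOCKFILE_LIMIT else "index"
    if size > MAX_FILE_BYTES:
        return "drop"
    return "index"
```

Running the check on path and size alone means a dropped file never has to be opened.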
Authentication & access

One model. Two enforcement points.

Operators sign in with OAuth / OIDC

Managed cloud uses GitHub or GitLab OAuth. Self-hosted federates to your OIDC IdP. No Lettuce-managed passwords.

Agents authenticate with scoped bearer tokens

MCP clients send a cwz_… key per request. Tokens are tenant-scoped, revocable from the dashboard, and never embedded in code we ship.
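To make the per-request authentication concrete, here is a sketch of an MCP tool call carrying the key as a bearer token. It assumes an HTTP transport; the endpoint URL, the LETTUCE_API_KEY environment variable, and the find_symbol tool name are hypothetical. Only the JSON-RPC "tools/call" method comes from the MCP specification. The request is built but not sent.

```python
import json
import os
import urllib.request


def build_mcp_request(endpoint: str, api_key: str, tool: str,
                      arguments: dict) -> urllib.request.Request:
    """Build (but do not send) an MCP tools/call request with a bearer key."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The tenant-scoped key travels on every request.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_mcp_request(
    "https://lettuce.example.com/mcp",          # hypothetical endpoint
    os.environ.get("LETTUCE_API_KEY", ""),      # hypothetical variable name
    "find_symbol",                              # hypothetical tool name
    {"name": "parse_config"},
)
```

Reading the key from the environment keeps it out of source control, which matches the point about tokens never being embedded in code.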

Per-tenant isolation

Every row carries a tenant ID. Repos, graphs, embeddings, and audit logs are partitioned. Cross-tenant reads are not in the query path.
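One way to picture "not in the query path" is a data-access function where the tenant ID is always a required, bound parameter. This sketch uses an in-memory SQLite table with an invented schema; it illustrates the pattern and is not Lettuce's storage layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE symbols ("
    "tenant_id TEXT NOT NULL, repo TEXT NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO symbols VALUES (?, ?, ?)",
    [
        ("tenant-a", "api", "parse_config"),
        ("tenant-b", "web", "render_page"),
    ],
)


def symbols_for(conn: sqlite3.Connection, tenant_id: str, name_prefix: str):
    """Look up symbols for exactly one tenant.

    The tenant filter is part of the only query this function can run,
    so a caller has no way to read another tenant's rows through it.
    """
    rows = conn.execute(
        "SELECT repo, name FROM symbols "
        "WHERE tenant_id = ? AND name LIKE ? ORDER BY name",
        (tenant_id, name_prefix + "%"),
    )
    return rows.fetchall()
```

In this sketch the tenant ID would come from the authenticated token, never from request input.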

Admin endpoints are gated

The /admin surface (license, members, ops) sits behind an admin-only role check. Audit log records every state-changing call.
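The role check and audit trail could be sketched as a decorator around state-changing handlers. Everything here is hypothetical (the user dictionary, the action names, the in-memory log); it only illustrates the "check role, then record the call" ordering.

```python
import functools
from datetime import datetime, timezone

AUDIT_LOG = []  # stand-in for a durable audit store


class Forbidden(Exception):
    """Raised when a non-admin calls an admin-only handler."""


def admin_only(action: str):
    """Gate a handler behind the admin role and audit successful calls."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(user, *args, **kwargs):
            if user.get("role") != "admin":
                raise Forbidden(f"{action} requires the admin role")
            result = fn(user, *args, **kwargs)
            # Record the state change only after it succeeded.
            AUDIT_LOG.append({
                "at": datetime.now(timezone.utc).isoformat(),
                "actor": user["id"],
                "action": action,
            })
            return result
        return wrapper
    return decorator


@admin_only("members.remove")
def remove_member(user, member_id):
    return f"removed {member_id}"
```

A production version would likely also log denied attempts; this sketch records only completed state changes.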

Compliance posture

What we do today. What we're working on.

We'd rather tell you the truth than fail your due-diligence questionnaire after the fact.

Today
Shipping and verifiable.
  • TLS-only ingress on the managed cloud
  • Per-tenant isolation in DB, graphs, and embeddings
  • Scoped revocable API keys for every MCP client
  • Self-hosted deployment for customers who can't move code off-prem
  • Cosign keyless-signed images + SPDX-JSON SBOMs (self-hosted releases)
  • Pluggable SecretStore — Vault, AWS SM, GCP SM
  • OpenTelemetry to your OTLP collector (self-hosted)
In progress
Honest list. Ask us for an ETA.
  • Working: SOC 2 Type I. Not held today. Sequencing it after the self-hosted GA.
  • Working: SAML SSO and SCIM provisioning (OIDC ships today).
  • Working: Exportable audit log for the managed cloud.
  • Working: Customer-managed encryption keys (CMEK) for the managed graph store.
  • Not yet: We don't hold ISO 27001, HIPAA, or FedRAMP. If you need any of these, self-host inside the boundary you already certified.

Want the long version?

Send us your security questionnaire, threat model, or just the three questions you actually need answered. We'll meet you wherever your DD process starts.