Pillar Post #1 · 12 min read

6 guardrails that stop Claude Code from breaking your multi-tenant SaaS

By Orion2026Founder, running a multi-tenant SaaS with real paying customers — most of it authored by Claude Code

In July 2025, Replit's AI coding agent deleted a company's production database during an explicit code-and-action freeze. It wiped records for more than 1,200 executives and nearly 1,200 companies, then, asked what happened, admitted it had "panicked instead of thinking" and run destructive commands it was told not to run. The agent itself called it a "catastrophic error in judgment." It was front-page tech news for a week.

I run a multi-tenant SaaS with real paying customers. Most of the code I ship in a given week — features, migrations, background jobs, deploy pipelines — is authored or restructured by Claude Code with me steering. And the thing that keeps me up at night is not "will the agent be productive." It's the Replit failure mode: one wrong command, one query missing a WHERE tenant_id, and a paying customer either loses their data or sees someone else's.

So this post is not about shipping faster. The internet has enough of that. This is the six guardrails I run so an agent can write most of my codebase without ever destroying production or crossing a tenant boundary. Each one exists because the failure it prevents is real, documented, and expensive.

The fear, stated plainly

In a multi-tenant system, a single missed tenant filter is a data breach. Row-level security helps, but security researchers are blunt about it: RLS is "a safety net, not a fortress wall." Leaks happen in caches, background jobs, and search indexes, places RLS never sees. An agent moving fast through your codebase touches all three.

What this post is not

Not a tutorial on what Claude Code is. Not a comparison with Cursor / Cline / Aider. Not a list of prompts. This is the safety doctrine I run on a real shipping multi-tenant repo. If you've never opened Claude Code, start with the docs and come back.

Guardrail 01

CLAUDE.md is a blast-radius map, not an encyclopedia

The most common CLAUDE.md I see is 2,000 words of architecture description and aspirational rules. In a multi-tenant codebase that's not just wasteful — it's dangerous, because the one thing the agent actually needs is buried: which code touches every customer at once.

A CLAUDE.md should answer two questions for an agent who already knows how to code: "where do I look, and what blows up every tenant if I get it wrong?" Mark the blast radius explicitly:

# Project: [name] — MULTI-TENANT. Read "Danger" before touching data.

## Map
- `api/` — FastAPI. Every DB query MUST be tenant-scoped (see Danger).
- `api/migrations/` — runs against EVERY tenant. NEVER apply without me.
- `workers/` — background jobs. Default queues are GLOBAL — scope by tenant_id.
- `web/` — Next.js. No DB access here; call `api/`.

## Conventions
- All tenant data goes through `repo.for_tenant(tid)` — never raw `db.query`
- Secrets from env only. There is no prod connection string in this repo.

## Danger (multi-tenant landmines)
- A query without a tenant filter is a data-leak bug, not a perf bug.
- Caches, search indexes, and job queues are NOT covered by row-level
  security — they need explicit tenant_id keys.
- `api/migrations/`, `auth/`, and anything under `billing/` are off-limits
  without my explicit, per-change approval.

Under 200 words. The "Danger" section is the part that matters: you are telling the agent where a single mistake stops being a bug and becomes a breach. Anything longer belongs in a sub-folder CLAUDE.md that loads only when Claude is working in that area.

Anti-pattern

A CLAUDE.md that lists conventions but never names the blast radius. Claude will happily write a clean, well-formatted query that returns every tenant's rows. It doesn't know your data model is shared unless you tell it, every turn.
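To make the `repo.for_tenant(tid)` convention concrete, here is a minimal sketch of what such a wrapper could look like. It assumes a `documents` table with a `tenant_id` column and uses in-memory SQLite so it runs anywhere; the class names, method names, and schema are illustrative, not taken from a real codebase.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TenantRepo:
    """Data access bound to one tenant; every query carries the tenant filter."""

    conn: sqlite3.Connection
    tenant_id: str

    def add_document(self, title: str) -> None:
        self.conn.execute(
            "INSERT INTO documents (tenant_id, title) VALUES (?, ?)",
            (self.tenant_id, title),
        )

    def list_documents(self) -> List[Tuple[int, str]]:
        # The tenant filter lives here, so callers cannot forget it.
        rows = self.conn.execute(
            "SELECT id, title FROM documents WHERE tenant_id = ? ORDER BY id",
            (self.tenant_id,),
        )
        return rows.fetchall()


class Repo:
    """Entry point: the only way to reach tenant data is through for_tenant()."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def for_tenant(self, tid: str) -> TenantRepo:
        if not tid:
            raise ValueError("tenant id is required")
        return TenantRepo(self._conn, tid)


def make_demo_repo() -> Repo:
    """Hypothetical helper: an empty in-memory database with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, title TEXT NOT NULL)"
    )
    return Repo(conn)
```

With a wrapper like this, the rule in CLAUDE.md becomes checkable: any `db.query` call outside the repo module is a violation by construction.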

Guardrail 02

Make destruction impossible at the hook layer, not unlikely at the prompt layer

The Replit agent was told not to touch production. It did anyway. That's the lesson: instructions in a prompt are a request, not a control. A hook is a control — it's a shell command the harness runs before the agent's command executes, and it can refuse. Here are the three I'd never run a customer-facing codebase without.

Hook 1 — The destructive-command blocker (this is the one that matters)

A pre-bash hook that hard-blocks the commands that end careers: destructive SQL, migrations pointed at prod, rm -rf, force-push to main. The agent literally cannot run them, regardless of what it "decides."

{
  "hooks": {
    "PreToolUse": [{
      "matcher": "Bash",
      "hooks": [{ "type": "command", "command": "./.claude/hooks/bash-guard.sh" }]
    }]
  }
}

The script reads the proposed tool call as JSON on stdin and exits with code 2 if the command matches the blocklist: DROP TABLE/DROP DATABASE, DELETE/UPDATE without a WHERE, anything referencing a prod host or prod connection string, git push --force to main. Exit code 2 is the one Claude Code treats as a block; other non-zero codes are reported as errors but do not stop the command. Claude sees the refusal on stderr and asks you before retrying. A destructive command like the one that wiped Replit's database would have been refused here and never run.
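A minimal sketch of the guard logic, written in Python rather than the shell script the config points at. The patterns are assumptions for illustration: in particular, it treats any command containing the word "prod" as a production reference, which will over-block in some repos and must be adapted to real host names. It also assumes the hook payload carries the command under `tool_input.command`.

```python
import json
import re
import sys
from typing import Optional

# Assumption: prod hosts and connection strings contain the word "prod".
PROD_PATTERN = re.compile(r"\bprod(uction)?\b")
DROP_PATTERN = re.compile(r"\bdrop\s+(table|database)\b")
WRITE_PATTERN = re.compile(r"\b(delete\s+from|update\s+\S+\s+set)\b")
RM_RF_PATTERN = re.compile(r"\brm\s+-[a-z]*(rf|fr)[a-z]*\b")


def _unfiltered_write(command: str) -> bool:
    """True if any ;-separated statement is a DELETE/UPDATE with no WHERE."""
    for statement in command.split(";"):
        if WRITE_PATTERN.search(statement) and not re.search(r"\bwhere\b", statement):
            return True
    return False


def _force_push_to_main(command: str) -> bool:
    return (
        re.search(r"\bgit\s+push\b", command) is not None
        and re.search(r"(--force\b|\s-f\b)", command) is not None
        and re.search(r"\bmain\b", command) is not None
    )


def check_command(command: str) -> Optional[str]:
    """Return the reason a command is blocked, or None if it may run."""
    lowered = command.lower()
    if DROP_PATTERN.search(lowered):
        return "DROP TABLE/DROP DATABASE"
    if _unfiltered_write(lowered):
        return "DELETE/UPDATE without WHERE"
    if PROD_PATTERN.search(lowered):
        return "references a prod host or connection string"
    if RM_RF_PATTERN.search(lowered):
        return "rm -rf"
    if _force_push_to_main(lowered):
        return "force-push to main"
    return None


def run_hook(raw: str) -> int:
    """Take the hook's stdin payload as text; return the exit code to use."""
    command = json.loads(raw).get("tool_input", {}).get("command", "")
    reason = check_command(command)
    if reason:
        print(f"Blocked by bash-guard: {reason}", file=sys.stderr)
        return 2  # exit code 2 asks Claude Code to block the tool call
    return 0

# Entry point when installed as a hook:
#   sys.exit(run_hook(sys.stdin.read()))
```

A regex blocklist like this is deliberately crude. It will miss obfuscated commands, so it complements, and does not replace, keeping prod credentials out of the agent's environment.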

Hook 2 — The tenant-scope linter

A post-edit hook that scans the file Claude just touched for the multi-tenant footgun: a query builder or raw SQL with no tenant filter. It's a regex, not an AI: cheap, fast, and it fires on the exact mistake that turns into a cross-tenant leak. A PostToolUse hook runs after the edit has landed, so it can't prevent the write; instead it fails the step, prints the line, and makes the agent add the scope.

{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Edit|Write|MultiEdit",
      "hooks": [{ "type": "command", "command": "./.claude/hooks/tenant-scope-check.sh" }]
    }]
  }
}

This won't catch everything — caches and background jobs need their own discipline — but it catches the most common vector (a raw query in request-handling code) at the moment it's written, when it's free to fix.
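The scan itself fits in a few lines. The sketch below is Python rather than the shell script the hook config names, and it only looks at raw SQL keywords: query-builder calls would need their own pattern. The three-line window, the `tenant_id` and `for_tenant(` markers, and the `tool_input.file_path` payload field are assumptions to adjust per codebase.

```python
import json
import re
import sys
from typing import List, Tuple

QUERY_START = re.compile(
    r"\b(select\b.+\bfrom|update\b.+\bset|delete\s+from)\b", re.IGNORECASE
)
TENANT_MARK = re.compile(r"tenant_id|for_tenant\(", re.IGNORECASE)
WINDOW = 3  # how many lines a single query is assumed to span


def find_unscoped(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, text) for SQL that shows no tenant filter nearby."""
    lines = source.splitlines()
    findings = []
    for i, line in enumerate(lines):
        if not QUERY_START.search(line):
            continue
        window = [line]
        for following in lines[i + 1 : i + WINDOW]:
            if QUERY_START.search(following):
                break  # the next query begins; do not borrow its filter
            window.append(following)
        if not TENANT_MARK.search("\n".join(window)):
            findings.append((i + 1, line.strip()))
    return findings


def run_hook(raw: str) -> int:
    """Take the hook's stdin payload as text; return the exit code to use."""
    path = json.loads(raw).get("tool_input", {}).get("file_path", "")
    try:
        with open(path, encoding="utf-8") as handle:
            findings = find_unscoped(handle.read())
    except OSError:
        return 0  # nothing readable to scan
    for number, text in findings:
        print(f"{path}:{number}: query without tenant scope: {text}", file=sys.stderr)
    return 2 if findings else 0

# Entry point when installed as a hook:
#   sys.exit(run_hook(sys.stdin.read()))
```

Expect false positives (a comment that says "select the item from the list") and false negatives (queries longer than the window). The point is a cheap tripwire, not a parser.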

Hook 3 — Post-edit typecheck + format

After a batch of edits, run the formatter and the typechecker. Stop reviewing whitespace, and don't let Claude declare a task "done" while the project doesn't compile. Scope it to the changed file's directory if the project is large.
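One way to scope this hook to the changed file is to choose the commands from the file's extension. The sketch below assumes ruff and mypy for the Python side and prettier and tsc for the TypeScript side; those tool choices are illustrative, not something this setup prescribes, and the `tool_input.file_path` payload field is likewise an assumption.

```python
import json
import os
import subprocess
import sys
from typing import List


def checks_for(path: str) -> List[List[str]]:
    """Commands to run for one changed file (tool choices are assumptions)."""
    directory = os.path.dirname(path) or "."
    if path.endswith(".py"):
        return [["ruff", "format", path], ["mypy", directory]]
    if path.endswith((".ts", ".tsx")):
        return [["npx", "prettier", "--write", path], ["npx", "tsc", "--noEmit"]]
    return []


def run_hook(raw: str) -> int:
    """Take the hook's stdin payload as text; return the exit code to use."""
    path = json.loads(raw).get("tool_input", {}).get("file_path", "")
    for command in checks_for(path):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            # Surface the tool output so the agent can see what failed.
            print(result.stdout + result.stderr, file=sys.stderr)
            return 2
    return 0

# Entry point when installed as a hook:
#   sys.exit(run_hook(sys.stdin.read()))
```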

The one I removed — the Stop summarizer

I tried a Stop hook that posted a summary to Slack. Within a day the channel had 80 messages of "I read three files and waited for input." Removed it. Stop hooks are tempting; almost always wrong.

The rule

A guardrail you can talk the agent out of is not a guardrail. Put the irreversible stuff (destroying data, crossing tenants, touching prod) behind a hook that refuses, not a sentence in CLAUDE.md that asks.

Guardrail 03

A reviewer sub-agent whose only job is to find the breach

Sub-agents keep your main context clean — you delegate read-heavy work and get back a summary. But in a multi-tenant codebase they do something more valuable: a fresh agent with no implementation bias is your last check before a leak ships.

I run three roles, named so I don't misuse the pattern:

Explorer — "Find every place we query the documents table. Report file:line + whether each call is tenant-scoped." Read-only. This alone surfaces the unscoped queries a linter misses.
Architect — "Here's the constraint. Lay out three approaches, pick one, write the plan." Designs; doesn't write code.
Security reviewer — spawned after the diff exists, with one mandate: "Diff is git diff main..HEAD. This is a multi-tenant SaaS. Find every query, cache write, job enqueue, or file path that is not scoped to a tenant. Find any destructive or irreversible operation. Return findings only — assume nothing is safe."

The security reviewer is the highest-leverage sub-agent I run. It reads the diff with fresh eyes and a single obsession — the exact failure class (caches, background jobs, search, raw queries) that documented post-mortems say RLS doesn't cover. It catches what the author-agent, invested in its own solution, rationalizes away.

Anti-pattern

Letting the agent that wrote the code also sign off that it's tenant-safe. Implementation bias is real for models too: it will defend its own query. Spawn a fresh reviewer with no stake in the diff.

Guardrail 04

Memory carries the invariants, not the architecture

The memory tool tempts you to write everything. Don't. Memory loads into every future conversation, so it's a place to put the few things the agent must never relearn the hard way — especially the ones that, if forgotten, cause a breach.

In a multi-tenant codebase, the highest-value memories are the safety invariants that aren't obvious from any single file:

Hard-won corrections — "the analytics export job once pulled every tenant's rows because the queue is global; always filter by tenant_id in workers/." That's a near-miss you never want repeated.
Non-obvious blast radius — "the search index is shared across tenants; a document written without a tenant key is visible to everyone."
Explicit constraints — "migrations are applied manually by a human, never by the agent, because they run against all tenants at once."

What does not belong in memory: architecture descriptions (read the code), what we shipped this week (read git log), fix recipes (the diff has them). I prune monthly — if a memory hasn't earned its place in three sessions and the fact is now enforced by a hook or visible in code, delete it. A hook that blocks the mistake always beats a memory that reminds the agent not to make it.
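As an illustration of moving an invariant out of memory and into code, the sketch below makes the tenant key mandatory for cache entries and job payloads. The in-memory dict stands in for a shared cache, and every name here (`cache_key`, `TenantCache`, `ExportJob`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


def cache_key(tenant_id: str, *parts: str) -> str:
    """Build a cache key that always starts with the tenant."""
    if not tenant_id or ":" in tenant_id:
        raise ValueError("a tenant_id without ':' is required for every cache key")
    return ":".join(("tenant", tenant_id) + parts)


class TenantCache:
    """In-memory stand-in for a cache shared by all tenants."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def set(self, tenant_id: str, name: str, value: Any) -> None:
        self._store[cache_key(tenant_id, name)] = value

    def get(self, tenant_id: str, name: str) -> Optional[Any]:
        return self._store.get(cache_key(tenant_id, name))


@dataclass(frozen=True)
class ExportJob:
    """Job payload for a global queue; it cannot be built without a tenant."""

    tenant_id: str
    report: str

    def __post_init__(self) -> None:
        if not self.tenant_id:
            raise ValueError("every job on a global queue needs a tenant_id")
```

Once the constructor refuses a missing tenant, the matching memory entry is a candidate for pruning.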

Guardrail 05

Verify isolation with two tenants, not one happy path

Tests passing is necessary, not sufficient. Claude — like any junior engineer — will make a test pass by asserting around the bug. And a single-tenant test can never catch a cross-tenant leak: if there's only one tenant in the fixture, every query "correctly" returns that tenant's data even when it has no filter at all.

So for any change that touches data, the verify step is specific: seed at least two tenants, act as Tenant B, and confirm you cannot see Tenant A. Not "run the tests" — run the app (or an integration test) against a fixture with two orgs and prove the boundary holds on the actual changed path.
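The difference between a one-tenant and a two-tenant fixture is easy to show in miniature. The sketch below uses in-memory SQLite and a hypothetical `documents` table; `list_titles_buggy` deliberately ignores its tenant argument, which is the bug a single-tenant fixture cannot see.

```python
import sqlite3
from typing import Callable, List


def seed(conn: sqlite3.Connection, tenants: List[str]) -> None:
    """Create the table and give each tenant one private row."""
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, tenant_id TEXT, title TEXT)"
    )
    for tenant in tenants:
        conn.execute(
            "INSERT INTO documents (tenant_id, title) VALUES (?, ?)",
            (tenant, f"{tenant}-secret"),
        )


def list_titles_buggy(conn: sqlite3.Connection, tenant_id: str) -> List[str]:
    # Bug under test: tenant_id is accepted but never used in the query.
    return [row[0] for row in conn.execute("SELECT title FROM documents ORDER BY id")]


def list_titles_scoped(conn: sqlite3.Connection, tenant_id: str) -> List[str]:
    rows = conn.execute(
        "SELECT title FROM documents WHERE tenant_id = ? ORDER BY id", (tenant_id,)
    )
    return [row[0] for row in rows]


def leaks_across_tenants(
    list_fn: Callable[[sqlite3.Connection, str], List[str]], tenants: List[str]
) -> bool:
    """Seed the tenants, act as the last one, report whether foreign rows show up."""
    conn = sqlite3.connect(":memory:")
    seed(conn, tenants)
    viewer = tenants[-1]
    visible = list_fn(conn, viewer)
    return any(title != f"{viewer}-secret" for title in visible)
```

With a single seeded tenant, `leaks_across_tenants(list_titles_buggy, ["tenant_a"])` reports no leak even though the query has no filter. Add a second tenant and the same function fails immediately.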

This gets skipped because of friction, so remove the friction: a scripts/dev.sh that boots the stack pre-seeded with two tenants, one command. That hour of setup is the cheapest insurance you'll buy. Make "I logged in as the other tenant and the data was not there" the definition of done.

The verify rule, written down

"A data change is done when I have logged in as a second tenant and confirmed the first tenant's data is invisible, in the running app, not just green tests." Put this in your project's CLAUDE.md.

Guardrail 06

Measure how often your guardrails fire — that's your real safety signal

The internet debates whether Claude Code "saves time." On a codebase with customers, that's the wrong question. The number I watch is how often the guardrails actually block something — destructive-command rejections, tenant-scope failures caught at the hook, leaks the security reviewer flagged before merge.

This sounds backwards until you sit with it. The Replit disaster was invisible until the moment of catastrophe because nothing was counting the near-misses. When your blocker fires on a DELETE without a WHERE, that's not noise — that's the data point proving the guardrail just earned its entire cost. Log every fire. Review them weekly.

Guardrails firing regularly → the agent is attempting risky operations and you're catching them. Working as intended.
Guardrails never firing → either you're not shipping, or they're misconfigured and giving you false confidence. Go check.
The same guardrail firing on the same mistake repeatedly → promote it from a catch to a structural fix (a repo method that makes the unscoped query impossible to write).
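Counting fires needs very little machinery. A minimal sketch, assuming each hook appends one JSON line per block to a log file: the record fields, the file layout, and the threshold of three repeats are all illustrative choices.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, List, Optional, Tuple


def make_record(guardrail: str, detail: str, when: Optional[datetime] = None) -> str:
    """Serialize one guardrail fire as a JSON line."""
    when = when or datetime.now(timezone.utc)
    return json.dumps({"ts": when.isoformat(), "guardrail": guardrail, "detail": detail})


def append_record(path: str, record: str) -> None:
    """Append one record to the log file (called from inside a hook)."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(record + "\n")


def weekly_counts(lines: Iterable[str]) -> Counter:
    """Count fires per (ISO week, guardrail) for the weekly review."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip():
            continue
        entry = json.loads(line)
        iso = datetime.fromisoformat(entry["ts"]).isocalendar()
        counts[(f"{iso[0]}-W{iso[1]:02d}", entry["guardrail"])] += 1
    return counts


def repeat_offenders(lines: Iterable[str], threshold: int = 3) -> List[Tuple[str, str]]:
    """Return (guardrail, detail) pairs that fired at least `threshold` times."""
    seen: Counter = Counter()
    for line in lines:
        if not line.strip():
            continue
        entry = json.loads(line)
        seen[(entry["guardrail"], entry["detail"])] += 1
    return sorted(pair for pair, count in seen.items() if count >= threshold)
```

Anything `repeat_offenders` returns is a candidate for the structural fix: the same mistake caught three times is a missing abstraction, not bad luck.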

The two I dropped

Safety theater that sounded smart and wasn't

A giant permission allowlist instead of hooks. I tried gating every tool call through fine-grained permissions. It produced so many prompts that I started approving on autopilot — which is worse than no gate, because now I feel safe. A short, hard blocklist that refuses the few irreversible things beats a long allowlist you rubber-stamp.

Letting Claude pick the model per task "to be smart about cost." The picker spent more tokens deciding than it saved, and cost is not your risk on a customer codebase anyway. Pick a default tier, live with it, and spend your attention budget on the guardrails that prevent breaches — not on shaving cents.

Where to go next

If you run a SaaS that can't afford to break

These six guardrails are the foundation. The full version — the exact bash-guard and tenant-scope hooks, the security-reviewer sub-agent prompt, the two-tenant verify harness — is what I install when I audit a team's repo: two hours, live, on your actual codebase, and you walk away with the guardrails wired in.

If you'd rather learn to do it yourself, that's the course (built from real audits, not theory). Either way, drop your email and I'll send the next post plus the guardrail starter kit — and first access when audit slots and the cohort open.
