Multi-tenant billing without leaks

If you’re building a B2B SaaS product for the Indian market, you will eventually be asked to be multi-tenant. Agencies will want to resell you. Builders will want sub-accounts for their brokers. Hospitals will want isolation between their outpatient and IPD calling teams. Single-tenant architectures lose every one of these deals.

This post is the architecture we settled on for DigitalCallers — what works, what we got wrong, and the specific traps in the data model.

The two-account problem

The naïve multi-tenant model has one account per customer. Every record has an account_id. Every API query filters by it. Done. This works for most B2B SaaS but breaks immediately for agencies.

An agency’s use case: “I run AI calling for 10 builders. I want one dashboard where I see all 10 in aggregate, but each builder should only see their own data and pay their own per-minute rate.”

If you give the agency one account with all 10 builders mixed in, you can’t isolate per-builder. If you give each builder their own account, the agency has 10 separate dashboards. Both are wrong.

Parent + sub-accounts

The right model is a parent account with sub-accounts. The agency is the parent. Each builder is a sub-account. A user’s session is scoped to the sub-account they belong to. The parent account’s admins can view-as a sub-account or roll up across all of them.

Three things make this work:

Account hierarchy in the data model — every account has an optional parent_account_id
Middleware-level filtering — every API query injects an account_id WHERE clause based on the session, never trusting URL params
Parent-aware aggregation — agency dashboards roll up via WHERE account_id IN (...sub_accounts) when explicitly requested
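
To make the three pieces above concrete, here is a minimal sketch using an in-memory SQLite database. The table layout, the function names (visible_account_ids, fetch_calls), and the rollup flag are illustrative assumptions, not the actual DigitalCallers schema. The point it shows is that the account filter is derived from the session alone, and the parent rollup only happens when explicitly asked for.

```python
import sqlite3

def visible_account_ids(conn, session_account_id, rollup=False):
    """Account IDs a session may read: its own, plus direct sub-accounts on explicit rollup."""
    ids = [session_account_id]
    if rollup:
        rows = conn.execute(
            "SELECT id FROM accounts WHERE parent_account_id = ?",
            (session_account_id,),
        ).fetchall()
        ids += [row[0] for row in rows]
    return ids

def fetch_calls(conn, session_account_id, rollup=False):
    """List calls, always filtered by the session's account, never by a caller-supplied ID."""
    ids = visible_account_ids(conn, session_account_id, rollup)
    placeholders = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT id, account_id FROM calls WHERE account_id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()

# Hypothetical demo data: one agency (1) with two builder sub-accounts (2 and 3).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_account_id INTEGER REFERENCES accounts(id)
);
CREATE TABLE calls (
    id INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES accounts(id)
);
INSERT INTO accounts VALUES (1, 'agency', NULL), (2, 'builder-a', 1), (3, 'builder-b', 1);
INSERT INTO calls VALUES (10, 2), (11, 3), (12, 3);
""")
```

In this sketch a sub-account session that passes rollup=True gains nothing, because it has no children. The parent sees sub-account data only through the explicit rollup path.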

Operating account vs. billing account

The next subtlety is billing. When a Clicks Bazaar agency operator dispatches a call on behalf of Koravi Developers, who gets billed?

Wrong answer: the agency. They’d eat the cost.

Right answer: Koravi.

We split this into two columns:

account_id — the OPERATING account. Who ran the dispatch. Who shows up in the audit log.
client_account_id — the BILLING account. Who pays. Who the per-minute rate is looked up against.

The webhook handler resolves the billing account by precedence:

effective_client_aid = (
    data.client_account_id        # per-dispatch override
    or agent.client_account_id    # agent's stored default
    or aid                        # fallback: the operating account
)

We added this on 2026-05-01 after a customer noticed that test calls from the agency’s admin panel were being billed at platform cost (~₹2.80/min) instead of the per-client rate (₹5/min). The misattribution had been silent for weeks.
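
A runnable sketch of that precedence might look like the following. The Agent dataclass, the dict-shaped webhook payload, and the function name resolve_billing_account are assumptions made for illustration; only the three-step order comes from the rule above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Agent:
    # Hypothetical stand-in for the stored agent record.
    client_account_id: Optional[int] = None

def resolve_billing_account(data: dict, agent: Agent, operating_account_id: int) -> int:
    """Pick the billing account: per-dispatch override, then agent default, then operating account."""
    for candidate in (data.get("client_account_id"), agent.client_account_id):
        if candidate is not None:
            return candidate
    return operating_account_id
```

The explicit None checks are a design choice for the sketch: a chained "or" would also skip a legitimate falsy ID such as 0.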

The boot-time billing canary

Silent billing breakage is the worst kind. By the time a customer notices “our cost report doesn’t match”, you’ve already lost trust.

We added a billing canary that runs synthetically at every Flask boot. It computes a per-minute rate against a known-good test record and fails loudly if the result doesn’t match expectation. It runs in milliseconds. It catches breakage in 30 seconds, not 30 days.
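
As a rough illustration of the idea, a canary can be as small as one known input with one known answer. Everything named here (the RATES table, bill_call, run_billing_canary, per-second proration) is a hypothetical stand-in rather than our production code; the real check runs against a stored test record.

```python
from decimal import Decimal

# Hypothetical rate table in rupees per minute; "canary-client" is a synthetic record.
RATES = {"canary-client": Decimal("5.00")}

def bill_call(client_account_id: str, duration_seconds: int) -> Decimal:
    """Cost of a call at the client's per-minute rate (per-second proration is assumed)."""
    rate = RATES[client_account_id]
    return (rate * duration_seconds / 60).quantize(Decimal("0.01"))

def run_billing_canary(bill=bill_call) -> None:
    """Call once at startup; raises so the process fails loudly instead of billing wrong."""
    expected = Decimal("7.50")  # 90 seconds at 5.00 per minute
    actual = bill("canary-client", 90)
    if actual != expected:
        raise RuntimeError(f"billing canary failed: expected {expected}, got {actual}")
```

Calling run_billing_canary() before the app starts serving means a broken rate lookup stops the deploy rather than shipping.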

Pair this with a health endpoint:

GET /api/admin/billing-health
  → { status: "ok" | "degraded" | "broken",
      missing_billing_rows: int,
      recent_billing_failures: int }

Put a pageable monitor on it and route the alerts to Slack. Now you know if billing is drifting before any customer does.
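
One way the numbers behind such an endpoint could be computed is sketched below. The table names (calls, billing, billing_failures) and the thresholds that map counts to "ok", "degraded", or "broken" are assumptions for illustration, and the sketch counts all recorded failures rather than applying a time window.

```python
import sqlite3

def billing_health(conn, degraded_at=1, broken_at=10):
    """Summarise billing drift; thresholds are illustrative, not production values."""
    missing = conn.execute(
        "SELECT COUNT(*) FROM calls c "
        "LEFT JOIN billing b ON b.call_id = c.id "
        "WHERE b.call_id IS NULL"
    ).fetchone()[0]
    failures = conn.execute("SELECT COUNT(*) FROM billing_failures").fetchone()[0]
    worst = max(missing, failures)
    if worst >= broken_at:
        status = "broken"
    elif worst >= degraded_at:
        status = "degraded"
    else:
        status = "ok"
    return {
        "status": status,
        "missing_billing_rows": missing,
        "recent_billing_failures": failures,
    }
```

A Flask route would only need to return this dict as JSON; the monitor then pages on anything other than "ok".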

Webhook signing as a contract, not a feature

DigitalCallers is calling-only. The CRM is your customer’s source of truth. The integration contract between us and them is webhooks. If a webhook gets faked or replayed, billing data is wrong.

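HMAC-SHA256 signature on every webhook. Header: X-DigitalCallers-Signature: hex(hmac_sha256(secret, body)). The CRM verifies on receipt. Replays are detectable because the signed payload includes a delivered_at timestamp: the receiver can reject any delivery whose timestamp falls outside a short freshness window.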
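
For a receiving CRM, verification could look roughly like this. The sketch assumes the body is JSON and that delivered_at is a Unix timestamp in seconds; the 300-second window and the function names are likewise illustrative choices, not part of the contract described above.

```python
import hashlib
import hmac
import json
import time

def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret: bytes, body: bytes, signature: str,
           max_age_seconds: float = 300, now: float = None) -> bool:
    """Accept a webhook only if the signature matches and delivered_at is fresh."""
    expected = sign(secret, body)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return False
    try:
        delivered_at = float(json.loads(body)["delivered_at"])
    except (ValueError, KeyError, TypeError):
        return False
    current = time.time() if now is None else now
    return abs(current - delivered_at) <= max_age_seconds
```

The signature is checked over the raw bytes before any parsing, so a tampered body is rejected without its contents ever being trusted.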

Per-account secrets. The agency parent has its own. Each sub-account has its own. We rotate them on demand. We deliver test pings (test.ping event) from the admin UI so customers can verify their handler before going live.

Auto-disable after 10 consecutive delivery failures. The customer’s admin gets emailed. We don’t want a webhook in a known-broken state silently dropping events for weeks.
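
The auto-disable rule is simple enough to sketch in a few lines. The WebhookEndpoint dataclass, record_delivery, and the notify_admin callback are hypothetical names; the only detail taken from the rule above is the threshold of 10 consecutive failures.

```python
from dataclasses import dataclass

MAX_CONSECUTIVE_FAILURES = 10

@dataclass
class WebhookEndpoint:
    url: str
    enabled: bool = True
    consecutive_failures: int = 0

def record_delivery(endpoint: WebhookEndpoint, succeeded: bool,
                    notify_admin=lambda endpoint: None) -> None:
    """Update the failure streak; disable and notify once when the streak hits the limit."""
    if succeeded:
        endpoint.consecutive_failures = 0
        return
    endpoint.consecutive_failures += 1
    if endpoint.enabled and endpoint.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
        endpoint.enabled = False
        notify_admin(endpoint)  # e.g. send the email to the customer's admin
```

Any success resets the streak, so an endpoint that fails intermittently is never disabled by this rule alone.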

Cross-tenant isolation, provable

The hardest claim in multi-tenant SaaS is “clients can’t see each other’s data”. Customers ask for proof. Most vendors hand-wave (“we follow best practices”). We can’t hand-wave because we sell to compliance-conscious sectors.

Our automated isolation test does six things:

Spins up a synthetic agency with 2 sub-accounts (Account A and Account B)
Seeds each with calls, contacts, agents, and recordings — clearly labeled
Logs in as Account A’s user and tries to fetch every endpoint with Account B’s IDs
Asserts every response is 403 / 404 / empty
Logs in as the agency parent and verifies it CAN see both sub-accounts (only via the rollup endpoints)
Outputs a JSON report we can hand to compliance teams

This runs on every CI build. Customers get a copy of the latest report on request. We sleep better.
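
The core of such a test, stripped down to the cross-account probe and the report, might be sketched like this. RECORDS and fetch are stand-ins for a seeded database and a real authenticated HTTP client, and the sketch only checks status codes, not empty response bodies.

```python
import json

# Hypothetical seeded data: resource ID -> owning sub-account.
RECORDS = {"call-a1": "A", "call-b1": "B", "contact-b1": "B"}

def fetch(session_account: str, resource_id: str) -> int:
    """Stand-in for an authenticated GET; returns an HTTP status code."""
    return 200 if RECORDS.get(resource_id) == session_account else 404

def isolation_report(attacker: str, victim: str, fetch=fetch) -> dict:
    """Request every resource the victim owns while logged in as the attacker."""
    probes = [
        {"resource": rid, "status": fetch(attacker, rid)}
        for rid, owner in sorted(RECORDS.items())
        if owner == victim
    ]
    leaks = [p for p in probes if p["status"] not in (403, 404)]
    return {"attacker": attacker, "victim": victim, "probes": probes, "passed": not leaks}

if __name__ == "__main__":
    print(json.dumps(isolation_report("A", "B"), indent=2))
```

Passing fetch in as a parameter lets the same probe loop run against a fake in unit tests and against the real API in CI.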

What we got wrong

Three lessons from doing this in production:

1. SQLite + FUSE + concurrent writers is fragile

Our DB lives on a FUSE-mounted SQLite file. When Flask, our calling infrastructure worker, AND a Python admin script all wrote concurrently, the file got corrupted (“database disk image is malformed”). We lost half a day recovering from a backup.

Lesson: before any admin script that writes to digitalcallers.db, stop both Flask and the worker. Take a backup. Verify integrity after. Long-term we’re migrating to PostgreSQL.
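
The backup and verify steps can be scripted with Python's standard sqlite3 module. This is a minimal sketch with hypothetical function names; it assumes Flask and the worker have already been stopped, and it does not stop them for you.

```python
import sqlite3

def integrity_ok(db_path: str) -> bool:
    """True if SQLite's own integrity check reports a clean file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        return False  # e.g. "database disk image is malformed"
    finally:
        conn.close()
    return rows == [("ok",)]

def backup_database(db_path: str, backup_path: str) -> None:
    """Copy the database with SQLite's online backup API, then verify the copy."""
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    if not integrity_ok(backup_path):
        raise RuntimeError(f"backup at {backup_path} failed the integrity check")
```

A wrapper script would call backup_database("digitalcallers.db", ...) first, run the admin change, then call integrity_ok on the live file before restarting anything.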

2. Werkzeug’s dev auto-reload kills mid-call webhooks

Don’t run production on Werkzeug’s dev server. Its auto-reload triggers on any file save and drops in-flight webhook deliveries. We lost calls because of this. Production now runs on gunicorn via a run_prod.sh launcher; ServerOn auto-detects and prefers it over app.py.

3. Idempotency at the dispatch layer

One UI click should produce exactly one dispatched call. We had a bug where worker auto-reload would re-dispatch a queued call, producing 2-3 calls per click. The fix is an idempotency guard in the dispatch handler — checking the request hash and short-circuiting duplicates within a 5-second window.
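
A minimal in-process version of that guard is sketched below. The DispatchGuard class and its method names are illustrative; hashing the canonical JSON of the request is one assumed way to build the request hash. With more than one worker process, the seen-set would need to live in a shared store rather than in memory.

```python
import hashlib
import json
import time

class DispatchGuard:
    """Reject a dispatch request identical to one seen within the last few seconds."""

    def __init__(self, window_seconds: float = 5.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self._seen = {}  # request hash -> time first seen

    def should_dispatch(self, request: dict) -> bool:
        now = self.clock()
        # Forget hashes older than the window so the dict does not grow forever.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.window}
        key = hashlib.sha256(
            json.dumps(request, sort_keys=True).encode()
        ).hexdigest()
        if key in self._seen:
            return False  # duplicate inside the window: short-circuit
        self._seen[key] = now
        return True
```

The dispatch handler would call should_dispatch first and return early on False, so a replayed queue entry never reaches the telephony layer.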

What this gets you

The agency-grade multi-tenant model lets us serve three different customer shapes from the same product:

Single SMB — one account, no parent, no sub-accounts. Looks single-tenant to them.
Mid-market with multiple business units — one parent, several sub-accounts. They get the same control plane the agencies use.
Reseller agency — parent + many sub-accounts, each with its own per-minute markup, white-label branding, and isolated dashboards.

None of this is obvious from the outside. You only notice when you go to onboard your fifth customer and the system already does what you need. Multi-tenancy isn’t a feature — it’s a structural decision you make on day 1 or pay for in a rewrite.

If you’re evaluating DigitalCallers for an agency rollout, the agency demo walks through the parent dashboard, sub-account creation, per-client billing setup, and the cross-tenant isolation report.

Book the agency demo →
