
Shard tenants across servers and rebalance live

Outcome

Tenant databases are placed across multiple Postgres Flexible Servers according to capacity. Existing tenants can be moved between servers without data loss.

Prerequisites

  • The PLATFORM_ADMIN role.
  • More than one identity.db_server row, or you're about to add the second.
  • For moves: pg_dump and pg_restore on PATH at the server's major version.
  • Both source and target server admin secrets resolvable via Key Vault.

Concepts

capacity_hint — the soft cap

identity.db_server.capacity_hint is the soft upper bound on active + read-only tenants on one server. Set it from the connection budget:

capacity_hint = floor(server_max_connections / TENANT_POOL_MAX / safety_factor)

Recommended safety_factor = 2 leaves headroom for connection spikes (background migrations, impersonation, ad-hoc tooling).

Example: a server with max_connections=500, default TENANT_POOL_MAX=5, safety factor 2 → capacity_hint = 50.

capacity_hint = NULL means unbounded, which is acceptable only for the dev server. Offboarding tenants do not count against capacity.
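
As a quick check when sizing a server, the hint can be computed directly from the server's own max_connections. A minimal sketch, assuming the default TENANT_POOL_MAX of 5 and a safety factor of 2 (adjust both literals to your deployment):

    -- Run against the Postgres Flexible Server you are sizing.
    -- 5 = TENANT_POOL_MAX (assumed default), 2.0 = safety_factor; adjust as needed.
    SELECT
      current_setting('max_connections')::int AS server_max_connections,
      floor(current_setting('max_connections')::int / 5 / 2.0)::int AS suggested_capacity_hint;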

Placement algorithm

provision-tenant without --db-server calls the placement service:

  1. Select all identity.db_server rows with is_active = true, joined to each server's live count of non-offboarding tenants.
  2. Drop any row where current_tenant_count >= capacity_hint.
  3. Pick the server with the lowest count; tie-break by name ASC so the result is deterministic.

If the operator passes --db-server <name>, placement only considers that server. If it's at capacity, provisioning fails with E_NO_CAPACITY — it does not silently fall back. Operators who name a server should hear "that server is full" and act.
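
A minimal SQL sketch of that selection, assuming tenants are tracked in an identity.tenant table with db_server_id and status columns (hypothetical names for illustration; the real placement service in rcm-core may differ):

    -- Pick the least-loaded active server that still has headroom.
    -- identity.tenant, db_server_id and status are assumed names, not the actual schema.
    SELECT s.name,
           count(t.id) FILTER (WHERE t.status <> 'offboarding') AS current_tenant_count
    FROM identity.db_server s
    LEFT JOIN identity.tenant t ON t.db_server_id = s.id
    WHERE s.is_active
    GROUP BY s.id, s.name, s.capacity_hint
    HAVING s.capacity_hint IS NULL
        OR count(t.id) FILTER (WHERE t.status <> 'offboarding') < s.capacity_hint
    ORDER BY current_tenant_count, s.name
    LIMIT 1;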

Add a new server

  1. Stand up the Postgres Flexible Server in the desired region (Azure Portal or az postgres flexible-server create).

  2. Store the admin connection string in Key Vault:

    az keyvault secret set \
    --vault-name rcm-kv-prod \
    --name db-admin-use2 \
    --value 'postgres://admin:***@pgbouncer-use2.internal:6432/postgres'
  3. Insert the identity.db_server row:

    INSERT INTO identity.db_server (
      name, pgbouncer_host, pgbouncer_port,
      admin_secret_ref, region, capacity_hint, is_active
    ) VALUES (
      'pg-us-east-2', 'pgbouncer-use2.internal', 6432,
      'db-admin-use2', 'eastus2', 50, true
    );
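
A quick sanity check that the new server is visible and active, using only the columns from the insert above:

    SELECT name, region, capacity_hint, is_active
    FROM identity.db_server
    ORDER BY name;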

Preview placement (place-tenant)

pnpm --filter @rcm/rcm-core place-tenant
pnpm --filter @rcm/rcm-core place-tenant --preferred-server pg-us-east-2
pnpm --filter @rcm/rcm-core place-tenant --format json

| Exit | Meaning |
| --- | --- |
| 0 | Placement succeeded; prints the chosen server and alternatives |
| 1 | no_capacity — every active server at capacity_hint |
| 2 | preferred_not_found or preferred_at_capacity |
| 3 | No active db_server rows |
| 4 | Argument error or master-connection failure |

Rebalance: move-tenant

UI surface: Platform Admin → Tenants → <tenant> → Metrics → Move tenant drives the same flow. The dialog previews decidePlacement for the picked target, exposes dryRun / dropSource / noReadOnlyGate, and polls GET /platform/jobs/:jobId for move_started / dry_run / move_complete phases.

The CLI below is the canonical break-glass path: it needs only Postgres super-user access plus pg_dump/pg_restore on the operator's PATH, while the UI requires the rcm-core process itself to have those binaries on PATH. Both paths share the same library, audit events, and rollback semantics.

Move flow

Steps

  1. Dry-run first:

    pnpm --filter @rcm/rcm-core move-tenant \
    --slug acme --target-db-server pg-us-east-2 --dry-run
  2. Run the move:

    pnpm --filter @rcm/rcm-core move-tenant \
    --slug acme --target-db-server pg-us-east-2
  3. Confirm with the parity probe (the move runs it automatically; you can re-run by hand if you want belt + braces):

    SELECT
      (SELECT count(*) FROM identity.organization) AS org_count,
      (SELECT count(*) FROM security.app_user) AS user_count,
      (SELECT count(*) FROM members.member) AS member_count;
  4. After 7+ days, drop the source DB if you've confirmed nothing rolled back. Default behaviour leaves the source around — pass --drop-source on the move only if you're certain.

Rollback semantics

| Failure point | Behaviour |
| --- | --- |
| Before Key Vault flip | Auto-rolls back: drops the target DB, restores tenant status, writes TENANT_MOVE_FAILED. Source untouched. |
| Between Key Vault flip and master update | Documented-ambiguous: loud log + E_POST_FLIP_INCONSISTENT. Manual reconciliation: re-run the master update, OR flip Key Vault back to the source URL. Source is always retained by default. |
| After master update | Move complete. Roll back by running the move in reverse. |

Per-server monitoring

| Metric | Meaning |
| --- | --- |
| rcm_tenants_on_server{server} | Non-offboarding tenant count per server |
| rcm_db_server_capacity_hint{server} | Soft cap (0 = NULL = unbounded) |
| rcm_connection_budget_remaining_per_server{server} | max(0, server_connection_limit − tenant_count × tenant_pool_max) |

| Alert | When | Action |
| --- | --- | --- |
| Server fill ratio > 0.9 for 1 h | A server is filling up | Capacity-planning ticket; raise capacity_hint after a budget check, or stand up another server. |
| rcm_connection_budget_remaining_per_server < 20 sustained | Connection budget tight | Either raise max_connections on the server (Azure Portal restart) or move tenants off. |
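
For an ad-hoc view of the same fill ratio from the master database rather than Prometheus, a sketch under the same assumption of a hypothetical identity.tenant table with db_server_id and status columns:

    -- Fill ratio per active server; NULL capacity_hint (unbounded) yields a NULL ratio.
    SELECT s.name,
           count(t.id) FILTER (WHERE t.status <> 'offboarding') AS tenants,
           s.capacity_hint,
           round(count(t.id) FILTER (WHERE t.status <> 'offboarding')::numeric
                 / NULLIF(s.capacity_hint, 0), 2) AS fill_ratio
    FROM identity.db_server s
    LEFT JOIN identity.tenant t ON t.db_server_id = s.id
    WHERE s.is_active
    GROUP BY s.id, s.name, s.capacity_hint
    ORDER BY fill_ratio DESC NULLS LAST;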

Troubleshooting

| Symptom / error_code | Cause | Fix |
| --- | --- | --- |
| E_NO_CAPACITY | All active servers at capacity_hint | Add a server (above) or raise capacity_hint after a connection-budget check. |
| E_BINARY_MISSING | pg_dump or pg_restore not on PATH | Install postgresql-client matching the server major version. |
| E_POST_FLIP_INCONSISTENT | Failure between KV flip and master update | Re-run the master update OR flip KV back. Source is always retained. |
| Move stuck at "row count parity check" | Source had concurrent writes despite read_only | Confirm read_only propagated (cache TTL); re-run with a longer status-cache wait. |

Next

1.8 — PITR-based tenant restore