Multi-Accounts

Design & phased rollout — splitting the conflated user_id into identity, ownership, and membership.

Tracking: A-3402 Design PR: #2056 Label: accounts Status snapshot: 2026-06-11

§0Principles

Separate identity from ownership

Split the conflated user_id into three concepts and sort every table, key, field, and service onto exactly one: identity (user_id — login, MFA, JWT, sessions, audit actor), ownership (account_id — balances, positions, fills, margin, risk, fees, ledger), and membership (account_permissions — granular can_* flags). → §2a, §2b

Authorize every account explicitly

Every money-touching request carries an explicit account_id and passes one centralized authorize(user, account, action) against account_permissions. This inverts the old guard from "must equal the user's default" to "must be a permission the user holds" — the watershed after which multi-account is a config-flip, not a code change. → §2e, §2f

One account per request — never a session, never a fan-out

A money read keys on exactly one account_id (the explicit one, or the derived default); account context is per-request, never sticky per-session (the JWT stays identity-scoped). Multi-account views are composed client-side (N calls / an account selector), never via server-side aggregation over the user's readable set. → §2e

Default-account resolution is permanent but edge-only

A request that omits account_id resolves to the user's default account (the one whose id equals the user's bit pattern), so the forward resolver as_default_account_id never goes away. But its legitimacy is positional: allowed only at the API edge, only as the input to an authorize check; forbidden in service internals and as a stand-in for an authorized account. → §2c, §3.5, §1 step 10

At the EP3 seam, we use the EP3 user and ignore the EP3 account

EP3 offers two identifiers per participant; we choose to model on only the EP3 user and map AX account_id → EP3 user (1:1, never to an AX user). The EP3 account comes along for the ride — unused, un-keyed, un-decoded (hence the generate_ep3_account_name twin is deleted, not account-keyed). The seam is value-preserving (default accounts keep their byte-identical participant strings), and the order's actor-of-record lives AX-side via order correlation, never reconstructed from the participant. → §2a, §1 step 10e

§1Progress / Tracker

Live snapshot — phase status and the bar below are computed from GitHub PR state at build time (2026-06-11). 7/15 phases done · 47%.

Done — 7 In progress — 4 Not started — 4
1Class 0 additive columndone

Add trading_accounts.ubo_user_id (nullable FK to users), backfilled 1:1 from creator user for customer accounts and NULL for system accounts. Unblocks admin GUI navigation ("show me Andrew's accounts") and gives compliance a single back-reference. Default-account resolution stays bit-pattern-derived (§3.5), so no users.primary_account_id column is needed today.

  • #2058 add ubo_user_id + account_id columns (multi-account prep)
  • Superseded / related
  • #1935 track system users as real rows (system-account UBO invariants)
2Class A Postgres PK rewrites — starting with current_balancesdone

A-2946 already added account_id NOT NULL to that table with a 1:1 backfill, so only the keying and call-site cutover remain. Then monthly_loss_limits and loan_to_value_monitors each get an additive account_id column + backfill before their PK swap. Mechanical — same value, different key column. No semantics change while 1:1 holds.

  • #2058 added account_id to current_balances (prep)
  • #2059 rekey current_balances PK to account_id
  • #2108 rekey loss-limit & LTV tables + admin endpoints
  • #2109 GUI: loss-limit / LTV admin screens account-keyed
  • Superseded / related
  • #1899 earlier current_balances rekey (superseded by #2059) → kevin/a-2946-rekey-fanout-rekey
  • #2084 earlier loss-limit / LTV rekey (superseded by #2108)
3Redis cutover for risk:* keysdone

Rename + producer/consumer swap + backfill. Easy because values already carry account_id.

  • #2116 rekey risk snapshots from user_id to account_id
4ClickHouse drain-rekey for short-TTL tablesdone

(risk_snapshots, risk_snapshots_1h): cut writers over to *_new tables keyed on account_id, let TTL drain the old tables, repoint MV bodies. No bulk INSERT SELECT. Cheap.

  • #2120 rekey short-TTL risk_snapshot tables to account_id
5ClickHouse rebuild-rekey for long-lived tablesdone

(positions, transactions, risk_snapshots_1d): build *_new with the new ORDER BY, batched INSERT SELECT by partition, atomic RENAME. Rekey the reader queries for the account_id-leading tables (positions, risk_snapshots*) to filter/LIMIT 1 BY on account_id in the same step — see §2c.2; deferring them silently turns leading-column prunes into full scans. (transactions reads stay on user_id for now — perf-neutral, key leads with timestamp — and re-scope in step 6.) Also reseed sequence_number generators in trade-engine2 / settlement-engine to be per-account.

  • #2121 rekey positions / transactions / risk_snapshots_1d
  • #2147 per-account positions sequence_number
  • Superseded / related
  • #2055 combined positions / risk / transactions rekey (exploratory fan-out branch, reworked into #2121)
  • #1864 combined positions + risk slice → kevin/a-2946-user-id-to-account-id-3
  • #1866 positions slice
  • #1867 risk-snapshots slice
  • #1869 orders + risk rekey → kevin/a-2946-user-id-to-account-id-3
  • #1820 first ClickHouse rekey attempt → kevin/a-2946-user-id-to-account-id-3
  • #1850 combined rekey (stale draft)
  • #1851 orders / order-history rekey (stale draft) → kevin/a-2946-rekey-fanout-rekey
  • #1852 fills / trades / volumes rekey (stale draft) → kevin/a-2946-rekey-fanout-rekey
  • #1855 transactions / realized-pnl rekey (stale draft) → kevin/a-2946-rekey-fanout-rekey
6Invert the gateway guard — the watershedin progress

Replace validate_requested_account_id with a real ACL check against account_permissions. Add authorize(user, account, action) and call it on every money-touching route. This is the moment the system stops being 1:1 in spirit. This is also where the user-scoped money reads re-scope to a single accountquery_realized_pnl_by_user, query_monthly_realized_pnl_by_user[s][_symbol], get_monthly_sandbox_usage, the funding-transaction history, and the transactions ChTransaction::query listing all answer "for a user" today; under multi-account each resolves to one account_id: the explicit one (authorized via authorize) or, when omitted, the user's default account. No implicit fan-out over the readable-account set (§2e) — this removes the set-resolving helpers (authorize_account_ids / authorize_account_set_with_key); multi-account views are composed client-side or via future account_ids-plural endpoints. Until this step the single-account reads stay correct on user_id because 1:1 holds.

Enforcement fallout: flipping the guard on exposed accounts seeded before per-account permissions existed — they had no account_permissions rows at all, so their (valid) tokens started 403ing on account and trading endpoints (A-3623). #2353 closes the gap with an idempotent one-off backfill granting each account's UBO full_access, and seeds the loan_demo permission row so fresh DB inits aren't born broken.

  • #2180 enforce account_permissions on money-touching routes (authorize())
  • #2183 rekey user-keyed money reads onto a single account_id
  • #2353 backfill account_permissions for pre-watershed accounts (valid-token 403, A-3623)
7Class B Postgres PK rewritesdone

(blockchain_deposit_addresses, leaderboard) — riskier because they have historical rows; do behind a backfill.

  • #2222 rekey blockchain_deposit_addresses and leaderboard to account_id
8Protocol cleanup (additive, non-breaking — ship early)done

Real WhoAmIResponse.accounts (done); single-account_id stream scoping via WsQueryParams.account_id (done — not account_ids fan-out, §2e item 3); and the optional account_id on CancelOrderRequest/ReplaceOrderRequest so the REST cancel/replace paths can select the cid namespace (done — superfluous for oid — globally unique — but load-bearing for client_order_id, whose index is (UserId, ClientOrderId) today and becomes (AccountId, ClientOrderId); the WS path already supplies this via the connection, REST has nowhere else to put it). All three named additive items are merged (#2264 landed 2026-06-07); any further additive shape gaps surfaced by the SDK audit ride alongside the API-key scoping in step 9. All additive — landed ahead of the breaking bump.

  • #1734 account_id plumbing incl. WhoAmIResponse.accounts
  • #2224 single-account WS stream scoping
  • #2264 optional account_id on cancel / replace requests
8bPublic SDK breaking bump — the 1:1-break cutover ("ship once")not started

Bound to the step-6 watershed: must land before the first multi-account customer, because that is the moment a bare user_id on a money/position/ledger/risk payload becomes ambiguous (§2e). One major version, one deprecation window: drop Fill.user_id, AdminTrade.maker_user_id/taker_user_id, Position.user_id, Transaction.user_id, FundingTransaction.user_id; rename UserRiskSnapshot → AccountRiskSnapshot and drop its user_id; keep OrderDetails.user_id with its meaning re-pinned to actor-of-record (non-breaking doc shift, shippable even earlier). account_id companions already exist on every dropped type, so account-keyed consumers are unblocked today; the GUI and external consumers move to account_id first (actor recoverable via order_id → order_log, owner via account → ubo_user_id). Trade (market data) untouched. See the §2e decision table. No PRs yet — the SDK still carries every listed user_id field (verified 2026-06-11). Timing-critical.

9API key reworkin progress

Schema migration to (user, account, scoped_perms). New keys mint with explicit account; old keys grandfathered to the user's default account. #2227 remains in draft.

  • #2227 account-scoped API keys
10Account-key the servicesin progress

This step is not "delete as_default_account_id". The 1:1 era left ~210 call sites that reinterpret a UserId as an AccountId or vice-versa. They are not one change: they are several different questions the bit pattern answered for free. The forward resolution (user → their default account) is permanent and stays (§3.5); the work is per-service account-keying that removes the reverse derivations and the user-keyed service internals. Decomposition:

Concern Where Disposition
a. Interface/internals rekey user→account risk-engine (+ verbatim twin risk-engine-bitnomial), transaction-engine, trade-engine2 rekey the service's own state/interface to AccountId; the edge still passes the default account
b. CH denormalized user_id columns transactions, positions, trades, risk_snapshot_user_equity deferred to the single step-13 sweep — services keep emitting user_id as a passthrough
c. Close-only / freeze user-keying risk-monitor, order-gateway, api-gateway spun out as step 10b
d. Redis / pub-sub channel keying risk-monitor trigger channel, risk_snapshot_updated consumers per-channel; the publisher emits account_id
e. EP3 participant ↔ account (bidirectional) provisioning (api-gateway/auth.rs, admin-cli), order submission (order-gateway client_participant_id), fills (trade-engine2 participant decode) the participant namespace is account-keyed end to end: mint via generate_ep3_username(account_id: AccountId) (delete the UserId overload + the generate_ep3_account_name twin), stamp the account's participant on submission, decode Ep3Username::from_full_path → AccountId. Value-preserving for default accounts (§2a). Source the trade/fill actor-of-record from order correlation, not the participant (§2e)
f. Reverse → UBO lookups (legitimate) account→username, Db*::pseudonym() read trading_accounts.ubo_user_id (replica-cached on hot paths), not the bit pattern
g. Edge default-account resolution order-gateway, api-gateway, admin-cli as_default_account_id stays — this is its permanent home
h. Test fixtures every /tests/, sdk-internal unit tests (~97 sites) explicit test-only helper; 1:1 is correct forever in fixtures

The one exception (a CH user_id drop pulled forward): risk-engine's slice drops ChRiskSnapshotRow.user_id and the column on risk_snapshots/_1h/_1d (+ rebuild the 3 MVs) in the same PR — free, because those tables lead their sort key with account_id and the reader already GROUP BY account_id (§2c.2). Every other CH user_id column stays for step 13.

The done-bit is not "the function is gone"; it is: no money/risk service keys its own state on user_id, and no code derives a user from an account except for a legitimate UBO/identity reason (concern f). as_default_account_id survives at the edge; from_default_account_id survives only as the deprecated shim until concerns a/c/f retire its last callers. EP3 seam (concern e) not started — generate_ep3_username still takes UserId; the reverse from_default_account_id is down to 10 refs / 6 files (verified 2026-06-11).

  • #2259 account-key transaction-engine / api-gateway interface (concern a)
  • #2260 account-key trade-engine2 fee-rate state (concern a)
  • #2261 risk-engine: account-key the calculator internal state (concern a)
  • #2262 risk-monitor: account-key the risk-snapshot channel consumer (concern d)
  • Superseded / related
  • #1831 earlier risk-engine calculator rekey (superseded by #2261) → kevin/a-2946-rekey-redux
10bAccount-key close-only/freezein progress

users.is_close_only/is_frozen are enforced user-keyed today: order-gateway reads users_replica…is_close_only to gate order entry, and risk-monitor breach/cure writes the user row (via the reverse derivation). Move the truth to trading_accounts.is_close_only/is_frozen (both columns already exist): risk-monitor breach/cure writes the account, order-gateway enforcement reads an account-keyed replica, api-gateway admin freeze targets the account. Then drop the users columns (the Class C / db/postgres/1.sql:30 TODO). This is what removes risk-monitor's reverse derivation — it is not separately convertible. Resolves §3.6. Sequence: after step 6; can run parallel to steps 7–10.

  • #1839 replace users.is_close_only with a close_only_holds table
  • #1840 admin: close-only Holds tab + per-symbol surfacing → loc/a-2650-close-only-holds
  • Superseded / related
  • #1824 per-(user, symbol) close-only enforcement (abandoned) → loc/a-2650-extend-monthly-loss-limits-to-per-product-pnl
11Account-creation messages in the public SDKnot started

CreateAccount, ListAccounts, GrantPermission, RevokePermission. No PRs yet.

12TigerBeetle binding swapnot started

To AccountId::derive_tb_account_id. Default accounts retain their bit pattern so ledger history reconciles; new accounts use fresh ULIDs. No PRs yet — derive_tb_account_id still defined on UserId (verified 2026-06-11).

13Drop the denormalized user_id — ClickHouse columns onlynot started

Terminal cleanup, after step 10 — once nothing keys on or filters by user_id. The public SDK breaking changes are not in this step — they ride the single 1:1-break bump (step 8b), gated to land before the first multi-account customer. What remains in this step is purely internal ClickHouse: both classes from §2c.2 except the risk_snapshots family, already dropped in step 10's risk-engine slice — the rekeyed tables (positions, transactions) that kept user_id through the 1:1 era, and the column-only tables (order_log, trades, historical_orders) — for each, decide drop-outright vs. keep-as-actor-stamp (audit). Migrates the last stragglers off user_id: the transactions reads left perf-neutral in step 5 (get_monthly_sandbox_usage, the realized-PnL queries, ChFundingTransactionFilters, query_by_reference_id) and the admin-cli reconciliation reads. Nothing structural depends on this; it's hygiene that makes the conflation un-reintroducible.

Notebook

Reference design — the detailed mechanics behind the tracker.

§2aThe mental-model shift

Today the system conflates two things into one identifier (user_id):

  • Identitywho is making a request (a human, a session, an actor for audit).
  • Ownershipwhose money the request moves (balances, positions, margin, fills, P&L, statements).

The new model needs to split these. Every concept in the system is one of:

Bucket Scope Examples
Identity user_id login, MFA, JWT, API keys, sessions, audit "actor", notifications
Ownership account_id balances, positions, fills, orders, margin, risk limits, statements, ledger, deposits, fee schedule, close-only/frozen flags
Membership (user_id, account_id, perms) "Andrew can trade account ACME-1; Andrew can read-only account ACME-2"

The single hardest discipline: every money-touching API call must take an explicit account_id, and the gateway must authorize (user_id, account_id, action) against account_permissions. No more silent default.

The EP3 seam is account-scoped. The exchange (EP3) knows only participants; a participant maps 1:1 to an AX account, never to an AX user. trading_accounts.ep3_username/ep3_account is the sole binding — users carries no EP3 identity once §2c.1 Class C lands. The participant namespace therefore decouples entirely from AX user identity: provisioning mints an account-keyed participant, order submission stamps the account's participant, and fills decode the participant back to an account_id. The 1:1 era hid this because the participant was minted from the user's bit pattern — and because that bit pattern is the default account's id (AccountId(self.0)), the switch is value-preserving: existing participant strings are byte-identical ({prefix}.{ulid} with the same ULID), so no EP3-side rename or re-provisioning; only genuinely-new (non-default) accounts get fresh participants. Terminal state: a single Ep3UserManager::generate_ep3_username(account_id: AccountId) — the UserId overload and the redundant generate_ep3_account_name twin are both deleted, and the inbound Ep3Username::from_full_path → UserId decode becomes → AccountId. The actor (who placed the order) is no longer recoverable from the participant — it is carried AX-side via order correlation, not reconstructed from the EP3 seam (§2e).

UserId::as_default_account_id() (forward: user → their default account) and UserId::from_default_account_id() (reverse: account → user) are the inventory of every place that conflation still lives — ~210 call sites. The two are not symmetric: the forward resolution is legitimate and permanent (a request that omits account_id resolves to the user's default account — the one whose id equals the user's; §3.5), so as_default_account_id is never deleted. The reverse is the real smell — code deriving "the user behind this account," which only the 1:1 era made free. Step 10 (§1) is therefore per-service account-keying, not a function deletion.

§2bConcepts that need rethinking

Secretly account-level, masquerading as user-level today

  • users.is_close_only, users.is_frozen, users.ep3_username, users.ep3_account, users.maker_fee, users.taker_fee — schema already has the TODO. These move to trading_accounts exclusively.
  • API keys — currently per-user with implicit access to "the user's account". Need to become per-(user, accounts).
  • TigerBeetle ledger IDs — UserId::derive_tb_account_id(ledger) constructs the TB ledger account id from user_id bit pattern. The ledger of record is bound to user, not account. This must become AccountId::derive_tb_account_id(ledger).
  • Risk snapshots — risk:{user_id} in Redis, but the value already carries account_id. The key is the thing that's wrong. user_id should be deprecated and then dropped.
  • Drop-copy / position / fill / balance streams — currently filtered by user_id; should be filtered by the set of account_ids the user can read.
  • Loss limits, LTV monitors, deposit addresses, leaderboard rows — composite-PK'd on user_id but semantically account-level.

Genuinely user-level — stay that way

  • users, password_reset_codes, user_risk_profiles, onboarding_applications, documents_meta, KYC, MFA, session tokens, audit "who did this".
  • Notifications addressed to a human (email/push), not to an account.

Has to be invented (in protocol, not just DB)

  • Every money-touching request carries an explicit account_id (or account_ids: Vec<AccountId> for reads/subscriptions); there is no session-level "active account" state. Today there's no protocol-level way to say "I'm Andrew, acting on behalf of ACME-1" — the request body has to grow that field. WhoAmIResponse.accounts is multi-aware but the rest of the surface isn't. Sticky session state is rejected on purpose: it makes "I traded in the wrong account" reconstructable only from logs+state-at-the-time instead of just the request body, and it adds an invalidation axis (permission revocation, account close) that explicit parameters don't have.

§2cDatabase primary keys

Two stores to migrate: Postgres (transactional state of record) and ClickHouse (append-only time-series for orders, fills, positions, risk snapshots). They have very different rekey economics — Postgres PKs are an integrity constraint, ClickHouse ORDER BY is the on-disk layout — so they need separate plans.

2c.1 · Postgres

The schema is in surprisingly good shape — trading_accounts and account_permissions already exist. The work splits into additive schema, PK rewrites, and one cleanup pass.

Class 0: additive columns (do these first; unblock the rest):

  • trading_accounts.ubo_user_id CHAR(16) REFERENCES users(id) — the Ultimate Beneficial Owner. See §3.3. Nullable: system accounts (e.g., AX_LENDING_USER_ID) have no beneficial owner and should be NULL; customer-owned accounts have a UBO. Backfill: 1:1 from the user that created the account, NULL for system accounts.

No users.primary_account_id column. The "user's default account" is derived, not stored: it's the account whose id shares the user's ULID bit pattern (the TigerBeetle-preservation invariant from below). Zero-cost to compute, stable for the life of the user, and doesn't require a new NOT NULL column. See §3.5.

Class A: drop user_id, key on account_id (simple, mechanical):

  • current_balances — PK user_idaccount_id (A-2946 already added the account_id column NOT NULL with a 1:1 backfill; only the keying remains)
  • monthly_loss_limits — FK user_idaccount_id (needs additive column add + backfill first)
  • loan_to_value_monitors — PK user_idaccount_id (needs additive column add + backfill first)

Class B: composite PKs that need user_id swapped for account_id (data migration):

  • blockchain_deposit_addresses(user_id, blockchain, asset)(account_id, blockchain, asset). Deposit addresses fund accounts.
  • leaderboard(metric, cadence, period_start, user_id)(..., account_id). Ranks trading entities, not humans.

Class C: shed duplicated columns once trading_accounts is canonical (cleanup, off the critical path):

  • users: drop ep3_username, ep3_account, is_close_only, is_frozen, maker_fee, taker_fee. (Schema already TODOs this at db/postgres/1.sql:30.) Pure hygiene — does not block anything; can ship before, after, or in parallel with any rekey.

No change (identity tables): users, api_keys (see §2f — different reason), user_risk_profiles, password_reset_codes, onboarding_applications, etc.

Key invariant to preserve through migration. TigerBeetle ledger IDs are constructed from a 128-bit value seeded by a ULID. For the default account of an existing user, keep account_id = user_id's bit pattern so historical TB rows still reconcile. Additional accounts get fresh ULIDs. This is exactly the shape the current code provides — as_default_account_id() returns AccountId(self.0). Bake that as a backfill rule. The function itself is not deleted: forward default-account resolution (user → their default account) is permanent (§3.5). What goes in step 10 is the reverse (from_default_account_id, account → user) and the user-keyed service internals.

2c.2 · ClickHouse

The A-2946 backfill (db/clickhouse/a-2946-account-id-backfill.sql) has already added account_id columns to every relevant table and backfilled them 1:1 from user_id. The materialized views (risk_snapshots_1h_mv, risk_snapshots_1d_mv) were updated atomically via MODIFY QUERY to carry the column through. So columns are in place; what remains is the rekey — that script's preamble explicitly defers it ("the follow-up rekey is a separate migration"). That's the work below.

Why this is meaningfully different from Postgres: MergeTree sort/primary keys are the physical part layout. You cannot ALTER … MODIFY ORDER BY to move user_id out of the sort key — you must build a new table with the new ORDER BY, INSERT SELECT everything across, swap names, drop the old. For multi-billion-row tables (order_log, trades, positions, transactions) this is a real operation: hours of I/O, dual disk-space, and a brief writer pause at swap time.

Tables where user_id is already only a column (no rekey, just stop writing it after Postgres cutover):

  • historical_orders — sort key (completion_time_ns, order_id).
  • order_log — sort key (event_timestamp_ns, order_id).
  • trades — sort key (symbol, timestamp_ns, trade_id); both maker/taker carry both ids.

For these, the column stays for one release as a denormalized convenience, then we either drop user_id outright or keep it as an "actor of record" stamp (cheap, useful for audit). Either way: no on-disk rebuild.

The three rekeyed tables below (positions, transactions, risk_snapshots_1d) likewise keep their user_id column through the 1:1 era as denormalized actor-of-record data — only the sort key flips. That has a consequence for readers: once the sort key leads with account_id, any query that still filters/LIMIT 1 BYs on user_id loses the leading-column prune and full-scans. So reader-query rekey is coupled to each storage rekey, not deferrable — every WHERE user_id/LIMIT 1 BY user_id against an account_id-leading table (positions, risk_snapshots, risk_snapshots_1h, risk_snapshots_1d) must move to account_id in the same step that flips the key. (transactions is the exception: its key leads with timestamp, not account_id, so its user_id reads are perf-neutral and only need migrating for the semantic/cleanup reasons below.) The eventual user_id-column drop for all of these — rekeyed and column-only alike — is its own terminal step (§1 step 13), with one exception: risk_snapshots/_1h/_1d have their user_id drop pulled forward into the risk-engine slice of step 10. Those tables already lead their sort key with account_id and the api-gateway reader already GROUP BY account_id, so dropping the column (and rebuilding the 3 MVs) is free to do alongside risk-engine's internal rekey. Every other CH user_id column stays for the step-13 sweep.

Tables that need a true rekey (user_id is in the sort/primary key):

Table Today Target Notes
positions PRIMARY KEY (user_id, timestamp_ns, sequence_number, symbol) (account_id, timestamp_ns, sequence_number, symbol) Largest by row count; will dominate the migration window.
transactions PRIMARY KEY (timestamp, user_id, sequence_number) (timestamp, account_id, sequence_number) The "primary unique ID" doc comment ((user_id, sequence_number)) becomes (account_id, sequence_number) — see invariant note below.
risk_snapshots PRIMARY KEY (user_id, timestamp_ns) (account_id, timestamp_ns) 7-day TTL — can drain rather than rebuild.
risk_snapshots_1h ORDER BY (user_id, bucket_timestamp_ns) (account_id, bucket_timestamp_ns) 90-day TTL — can drain or rebuild; MV body must be updated.
risk_snapshots_1d ORDER BY (user_id, bucket_timestamp_ns) (account_id, bucket_timestamp_ns) No TTL — must rebuild. MV body must be updated.

Rekey playbook (per table, applied uniformly):

  1. CREATE TABLE … _new with the new ORDER BY / PRIMARY KEY.
  2. For MV-fed tables (risk_snapshots_1h, _1d): drop the MV, recreate it pointing at _new. For source-of-truth tables fed by service writes (positions, transactions, risk_snapshots): the writer must dual-write or be paused.
  3. INSERT INTO …_new SELECT * FROM … in time-bucketed batches (by partition where partitioned, by month otherwise). Verify row counts match per partition.
  4. RENAME TABLE … TO …_old, …_new TO … (atomic in a single statement).
  5. Optionally DROP TABLE …_old after a soak.

For TTL'd tables (risk_snapshots, risk_snapshots_1h) there's a cheaper path: stop writing on the old key, start writing on the new key in a new table from cutover, let the TTL drain the old one. Skips the INSERT SELECT entirely. Worth using.

Invariants that change meaning, not just column names:

  • positions.sequence_number — doc says "globally unique per user, i.e., not tracked per symbol." Once accounts can be multi-user, this needs to be per account, not per user. Where the writer generates it (trade-engine2 / settlement-engine) must reseed per account.
  • transactions.sequence_number + transactions.event_id — the doc identifies (user_id, sequence_number) as the primary unique key with event_id as backup. The new primary unique key is (account_id, sequence_number). The backup event_id semantics don't change but the uniqueness contract migrates.
  • risk_snapshots rows currently fan out one-per-user; in the multi-account world they fan out one-per-account. A user's multi-account "risk view" is composed client-side (one query per chosen account) — the server emits one row per account and never fans out over a permission set (§2e, no implicit fan-out). The producer (risk-engine buying_power_updater.rs) already loops on account_id.

Out of scope: funding_rates, candles_*, bbo_candles_*, market_snapshots, benchmark_prices, index_prices, invariants_log — none reference user/account. monitor.* schema is service-owned and unaffected.

Analytics DB (db/clickhouse/analytics.sql) is symbol/event-keyed throughout — no user/account in any table — so no rekey, no column adds.

§2dRedis shape changes

The Redis surface is narrow — almost all of it is in rs/sdk-internal/src/redis/redis_keys.rs:

Today Tomorrow Reason
risk:{user_id} risk:{account_id} Risk is money, not identity
risk:{user_id}:{symbol} risk:{account_id}:{symbol} Same
session:token:{hash} unchanged Token = identity artifact
session:user:{user_id} unchanged Token index per human

Nothing scans these with wildcards, and the value (RedisUserRiskSnapshot) already carries both ids, so the cutover is a single producer/consumer swap plus a backfill. No Lua/MULTI dependencies cross the user/account boundary.

The Redis pubsub channel risk_snapshot_updated is global, so it doesn't shift. But any future "subscribe to a user's risk" should be expressed as "subscribe to an account's risk" — a multi-account user's view is composed client-side, one subscription per chosen account, never a server-side fan-out over a permission set (§2e).

§2eProtocol & request shape

The SDK already half-supports this. Most message types (Fill, Position, Balance, UserRiskSnapshot, AdminTrade) carry both user_id and account_id. The shape is mostly right — the enforcement is what's wrong — with one exception: user_id on the trade/fill types (Fill, AdminTrade) is scheduled for removalaccount_id becomes the sole identity (see the outbound note at the end of this section and §1 step 8b). The gateway guard in api-gateway/src/utils.rs:106 actively rejected any client-supplied account_id that didn't equal the user's default. That guard inverted (step 6, merged):

  • Before: "account_id must equal user's default."
  • After: "account_id must exist in account_permissions for this user with can_<action> true."

Request-shape decisions, each made explicitly:

  1. Optional account_id, defaults to the user's default account. Keeps every existing SDK call working unchanged (the default account is user_id's bit pattern, so this is a no-op until users actually have a second account). This default is permanent — there is no primary_account_id and no mutable "make this my default" knob; every user has exactly one default account (id == user id) forever, used or not. The foot-gun concern — "I misclicked and traded the wrong account" — is real but lives at the client layer: the CLI/GUI for multi-account users should refuse to send without an explicit choice. The server accepts the omission and resolves it deterministically via as_default_account_id. One nuance to document on every endpoint: the default is fixed by the bit pattern, so adding a second account to a user never changes which account their bare-account_id-less calls land on (still the default account).
    No implicit fan-out (decided). Omission resolves to the single default account — never a server-side aggregate over the user's readable-account set. A money read keys on exactly one account_id: the explicit one (authorized via authorize) or the default. Reading N accounts is N explicit calls (or a future account_ids-plural endpoint — a deliberately later decision). Rationale: server fan-out makes a bare read's cost unbounded in the user's grant set (multi-account scan plus per-row account_permissions resolution) and turns a cheap keyed lookup into an expensive union. Account is a top-level context, surfaced as an account selector in the GUI header bar: the active account is chosen explicitly and sent on every money read; "switch account" is a selector change, not a server-inferred set. The consumer composes multi-account views client-side.
  2. JWT claims: add nothing. Keep JWT identity-scoped. Account context is per-request, not per-session — a single session can act on multiple accounts (treasurer pattern).
  3. Subscriptions/streams: each stream scopes to a single account_id (optional; omission resolves to the user's default account), supplied by the connection out-of-band — e.g. WsQueryParams.account_id (rs/sdk protocol/order_gateway.rs), already shipped (#2224). The server enforces the scope server-side (rejecting an account the user cannot read); never trust a client-supplied filter. There is no account_ids: Vec<AccountId> server-side fan-out — a multi-account view is N connections, one per account, composed client-side (the same no-fan-out rule as item 1). There is no plural account_ids subscription.
  4. WhoAmIResponse.accounts becomes the real list (from account_permissions), not the synthetic 1:1.
  5. Cancel/replace by cid need an optional account_id; by oid they don't. A server order id (oid) is globally unique, so CancelOrderRequest/ReplaceOrderRequest referencing one need no account. But client_order_id (cid) is namespace-scoped — the open-orders index is (UserId, ClientOrderId) today and becomes (AccountId, ClientOrderId) — so resolving a cid requires knowing which account's namespace. The WS path gets this from the connection (WsQueryParams.account_id); the REST /cancel-order and /replace-order paths have no connection scope, so the request grows an optional account_id (omission → default account). Additive and non-breaking — shipped in step 8 (#2264), not the breaking bump.

For events flowing out (drop-copy, fills, position deltas), the payload key is account_id — the owner whose position moved. A user_id may also ride along, but its meaning must be pinned: it is the actor of record (who placed the order), which under multi-account is not the owner and not the UBO (a proxy/algo trades an account it doesn't own). The trap to avoid: once the EP3 seam is account-only (§2a, §1 step 10e), the actor is no longer recoverable from the participant, so it must be sourced from order correlation (order_id → order_log, which holds the authenticating user) — never from the participant parse, and never silently backfilled from trading_accounts.ubo_user_id (that would reinterpret "the trader" as "the legal owner" exactly when they diverge).

Public SDK consequence (the breaking bump, bound to the 1:1 break — not step 13). Several public types in rs/sdk carry a user_id alongside account_id. While 1:1 holds, user_id == account_id's bit pattern, so the field is harmless and consumers can already key on account_id. The moment a user can own/trade a second account, a bare user_id on a money/position/ledger payload becomes ambiguous (owner? actor? UBO?) — so the breaking cleanup is not terminal hygiene; it is a correctness deadline pinned to the 1:1 break (the step-6 watershed, before the first multi-account customer). All *user_id fields below are Serialize/Deserialize and carry utoipa::ToSchema (documented REST/WS + OpenAPI), so this is a single, deliberate major SDK bump with one deprecation window — shipped once, then the plumbing moves underneath on our timeline.

The split is by what the object is about, not blanket removal:

Public type (rs/sdk) Today Decision at the 1:1 break
Fill (protocol/common.rs) user_id + account_id DROP user_id; account_id is the sole identity
AdminTrade (protocol/api_gateway.rs) maker_user_id/taker_user_id + maker_account_id/taker_account_id DROP both *_user_id; the *_account_id companions are sole identity
Position (protocol/api_gateway.rs) user_id + account_id DROP user_id
Transaction (protocol/api_gateway.rs) user_id + account_id DROP user_id
FundingTransaction (protocol/api_gateway.rs) user_id + account_id DROP user_id
UserRiskSnapshot (protocol/api_gateway.rs) user_id + account_id RENAME → AccountRiskSnapshot and DROP user_id — the type name itself encodes the conflation (risk is account-scoped; §2c.2, §2d)
OrderDetails (protocol/order_gateway.rs) user_id + account_id KEEP user_id, meaning re-pinned to actor-of-record (who placed the order), distinct from account_id = owner. An order has exactly one authenticating actor, so the field is unambiguous on an order object even under multi-account.
FillDetails (protocol/order_gateway.rs) account_id only already account-only — no change (it is the terminal shape)
Trade (protocol/marketdata_publisher.rs) no identity unaffected — never carried identity

Why OrderDetails keeps it but Fill/Position/ledger/risk don't: an order is the act of one authenticating session, so "actor of record" is a real, single-valued property of that object. A money/position/ledger/risk row is a fact about the account; its actor (if ever needed) is recovered by joining order_id → order_log (which holds the authenticating user), and the legal owner by joining account → trading_accounts.ubo_user_id. Stamping a bare user_id on those rows just re-creates the conflation. (Rejected for the dropped types: keeping user_id silently meaning UBO — actively wrong under proxy trading; or actor-of-record — needless surface area now that account_id is the key and the actor is recoverable via order correlation.)

The internal ChTradeRow.maker_user_id/taker_user_id stamps stop being derived from the participant in step 10e; the public field removals/rename above ride the single 1:1-break bump in §1 step 8b.

§2fAuthorization & API keys

account_permissions already has the right granularity: can_list, can_read, can_set_limits, can_reduce_or_close, can_trade. Two design choices to lock in:

Centralize the check. One function — authorize(user_id, account_id, Action::Trade) — called at the gateway boundary on every money-touching route. Replace the contents of validate_requested_account_id with a real ACL lookup. The as_default_account_id call sites become the test set: every one of them is a place where a permission check should now run. (Shipped in step 6 — #2180. One operational lesson already learned: enforcement is only as good as the permission rows; accounts seeded before account_permissions existed had none and 403'd until the #2353 backfill.)

API keys are the hardest sub-design. Today api_keys has user_id only. Two viable shapes:

  • (a) Key belongs to a user, inherits the user's permissions at request time. Simple, but a leaked key is as powerful as the user's full account set, and revoking access to one account doesn't revoke the key.
  • (b) Key belongs to a (user, account_id, scoped_perms) tuple. A user can mint N keys, each with a subset of their permissions on one account. Recommended — matches how every other exchange API does this, and matches a "read-only key for the accountant, trade-only key for the bot" workflow that customers will want immediately.

Recommendation: (b). Schema change: api_keys gets account_id (NOT NULL FK) and per-key permission flags mirroring account_permissions, with a constraint that key perms ⊆ user perms at issuance time. In flight as #2227 (step 9, draft).

§2gSequencing summary & off-critical-path work

The watershed has been crossed. Step 6 (authorize() against account_permissions) is merged, so multi-account is now a config-flip per customer rather than a code change. The one timing-critical item remaining is the public-SDK breaking bump (step 8b), which must land before the first multi-account customer.

Steps 1–6 are the "make it work for 1:1 in the new shape" phase and shipped safely. Step 6 is the watershed — once the gateway authorizes against account_permissions instead of "must equal default," multi-account is a config-flip per customer rather than a code change. The public-SDK breaking bump (step 8b) is gated to that watershed, not to the end: it must land before the first customer's second account exists, because the dropped user_id fields stay value-correct only while 1:1 holds — once it breaks they are ambiguous. That is the whole "ship the public API early and once" discipline: the additive surface (step 8) went out first, the single breaking bump (8b) at the 1:1 break, then the internal plumbing (steps 10–13) moves underneath on our own timeline. Steps 4 and 5 (ClickHouse) ran in parallel with the Postgres work but had to complete before step 10 (the service account-keying), since the writers stamping account_id into CH rows still used the 1:1 derivation. Step 13 (internal ClickHouse user_id column drops, no public surface) is strictly last. Step 10 does not delete as_default_account_id: the forward default-account resolver is permanent (§3.5). What step 10 retires is the reverse derivation (from_default_account_id) and user-keyed service state; step 10b additionally moves close-only/freeze off the user.

Off the critical path (ship anytime)

  • Drop the duplicate users columns (ep3_*, maker_fee, taker_fee — pure cleanup; is_close_only/is_frozen are retired by step 10b, not here). Does not block any step above.
  • Admin GUI concept-shift (see §3.7). Earliest start: as soon as step 1 lands (it has). Run as a parallel PR series.

§3Design Questions

  1. Cross-account margin / netting — if one user can act on several accounts, are those accounts margined jointly?
    • Not yet answered Leaning no — independent per account (a joint legal entity gets fuzzy). Needs to become a stated invariant: it decides whether current_balances keys on account_id alone (independent) or needs a margin_group_id (joint).
  2. Account lifecycle — who creates accounts beyond onboarding, and is there a public-SDK account-creation message?
    • Not yet answered Onboarding still creates (user, account) atomically; no admin-creates / self-serve flow and no public-SDK account-creation message yet. The surface lands in step 11.
  3. First-class owner — should trading_accounts carry an explicit owner column?
    • Answered — affirmative Add ubo_user_id CHAR(16) REFERENCES users(id) (nullable) — the Ultimate Beneficial Owner (KYC/tax/compliance counterparty), distinct from the symmetric, operational account_permissions. Nullable for system accounts (AX_LENDING_USER_ID); backfilled 1:1 from the creating user for customer accounts. A stepping-stone to a future first-class customer entity, at which point ubo_user_id migrates to customer_id and is dropped. UBO is single-valued; joint ownership is a future customer-entity concern.
  4. Audit shape — should audit rows carry account_id (and the authorizing permission)?
    • Not yet answered Minimum viable: every audit row gains account_id. Richer: also record the account_permission row that authorized the action, so revoking a permission yields a clean "after this point, X could no longer do Y" story. Not yet committed to either.
  5. Default account for legacy clients — does an omitted account_id resolve to a single, derived default account (no stored primary_account_id)?
    • Answered — affirmative account_id stays optional and resolves to the user's default account even after they have several (§2e). The default is derived, not stored — the account whose id shares the user's ULID bit pattern (already preserved for TigerBeetle history, §2c.1). No users.primary_account_id, ever; the forward resolver as_default_account_id is therefore permanent (§1 step 10, concern g). A mutable "active account" preference, if ever wanted, is a client/session concern — never a stored column that changes ownership math.
  6. Frozen / close-only semantics — is the truth at the account level (dropping the user-level columns)?
    • Answered — affirmative Per-account is the truth; per-user is a derived "all my accounts are frozen" view. Move enforcement to trading_accounts.is_close_only/is_frozen and drop the users columns (§1 step 10b).
  7. Admin GUI concept-shift — should money-bearing admin workflows flip from user-centric to account-centric?
    • Answered — affirmative Money operations flip to account-centric ("freeze ACME-1"); identity operations stay user-centric (MFA, password, KYC). ubo_user_id makes the admin GUI navigable ("show me Andrew's accounts" → WHERE ubo_user_id = Andrew; audit lines carry both). The trader GUI gets an account selector in the header — what makes no-implicit-fan-out ergonomic (§2e) — shown only when permissions authorize more than the default account. Its own workstream: a parallel PR series, earliest start once step 1 lands (it has).