Multi-Accounts
Design & phased rollout — splitting the conflated user_id into identity, ownership, and membership.
§0Principles
Separate identity from ownership
Split the conflated user_id into three concepts and sort every table, key, field, and service onto exactly one: identity (user_id — login, MFA, JWT, sessions, audit actor), ownership (account_id — balances, positions, fills, margin, risk, fees, ledger), and membership (account_permissions — granular can_* flags). → §2a, §2b
Authorize every account explicitly
Every money-touching request carries an explicit account_id and passes one centralized authorize(user, account, action) against account_permissions. This inverts the old guard from "must equal the user's default" to "must be a permission the user holds" — the watershed after which multi-account is a config-flip, not a code change. → §2e, §2f
One account per request — never a session, never a fan-out
A money read keys on exactly one account_id (the explicit one, or the derived default); account context is per-request, never sticky per-session (the JWT stays identity-scoped). Multi-account views are composed client-side (N calls / an account selector), never via server-side aggregation over the user's readable set. → §2e
Default-account resolution is permanent but edge-only
A request that omits account_id resolves to the user's default account (the one whose id equals the user's bit pattern), so the forward resolver as_default_account_id never goes away. But its legitimacy is positional: allowed only at the API edge, only as the input to an authorize check; forbidden in service internals and as a stand-in for an authorized account. → §2c, §3.5, §1 step 10
At the EP3 seam, we use the EP3 user and ignore the EP3 account
EP3 offers two identifiers per participant; we choose to model on only the EP3 user and map AX account_id → EP3 user (1:1, never to an AX user). The EP3 account comes along for the ride — unused, un-keyed, un-decoded (hence the generate_ep3_account_name twin is deleted, not account-keyed). The seam is value-preserving (default accounts keep their byte-identical participant strings), and the order's actor-of-record lives AX-side via order correlation, never reconstructed from the participant. → §2a, §1 step 10e
§1Progress / Tracker
Live snapshot — phase status and the bar below are computed from GitHub PR state at build time (2026-06-11). 7/15 phases done · 47%.
1Class 0 additive columndone›
Add trading_accounts.ubo_user_id (nullable FK to users), backfilled 1:1 from creator user for customer accounts and NULL for system accounts. Unblocks admin GUI navigation ("show me Andrew's accounts") and gives compliance a single back-reference. Default-account resolution stays bit-pattern-derived (§3.5), so no users.primary_account_id column is needed today.
2Class A Postgres PK rewrites — starting with current_balancesdone›
A-2946 already added account_id NOT NULL to that table with a 1:1 backfill, so only the keying and call-site cutover remain. Then monthly_loss_limits and loan_to_value_monitors each get an additive account_id column + backfill before their PK swap. Mechanical — same value, different key column. No semantics change while 1:1 holds.
- #2058 added
account_idtocurrent_balances(prep) - #2059 rekey
current_balancesPK toaccount_id - #2108 rekey loss-limit & LTV tables + admin endpoints
- #2109 GUI: loss-limit / LTV admin screens account-keyed
- Superseded / related
- #1899 earlier
current_balancesrekey (superseded by #2059) → kevin/a-2946-rekey-fanout-rekey - #2084 earlier loss-limit / LTV rekey (superseded by #2108)
3Redis cutover for risk:* keysdone›
Rename + producer/consumer swap + backfill. Easy because values already carry account_id.
- #2116 rekey risk snapshots from
user_idtoaccount_id
4ClickHouse drain-rekey for short-TTL tablesdone›
(risk_snapshots, risk_snapshots_1h): cut writers over to *_new tables keyed on account_id, let TTL drain the old tables, repoint MV bodies. No bulk INSERT SELECT. Cheap.
- #2120 rekey short-TTL
risk_snapshottables toaccount_id
5ClickHouse rebuild-rekey for long-lived tablesdone›
(positions, transactions, risk_snapshots_1d): build *_new with the new ORDER BY, batched INSERT SELECT by partition, atomic RENAME. Rekey the reader queries for the account_id-leading tables (positions, risk_snapshots*) to filter/LIMIT 1 BY on account_id in the same step — see §2c.2; deferring them silently turns leading-column prunes into full scans. (transactions reads stay on user_id for now — perf-neutral, key leads with timestamp — and re-scope in step 6.) Also reseed sequence_number generators in trade-engine2 / settlement-engine to be per-account.
- #2121 rekey positions / transactions / risk_snapshots_1d
- #2147 per-account positions
sequence_number - Superseded / related
- #2055 combined positions / risk / transactions rekey (exploratory fan-out branch, reworked into #2121)
- #1864 combined positions + risk slice → kevin/a-2946-user-id-to-account-id-3
- #1866 positions slice
- #1867 risk-snapshots slice
- #1869 orders + risk rekey → kevin/a-2946-user-id-to-account-id-3
- #1820 first ClickHouse rekey attempt → kevin/a-2946-user-id-to-account-id-3
- #1850 combined rekey (stale draft)
- #1851 orders / order-history rekey (stale draft) → kevin/a-2946-rekey-fanout-rekey
- #1852 fills / trades / volumes rekey (stale draft) → kevin/a-2946-rekey-fanout-rekey
- #1855 transactions / realized-pnl rekey (stale draft) → kevin/a-2946-rekey-fanout-rekey
6Invert the gateway guard — the watershedin progress›
Replace validate_requested_account_id with a real ACL check against account_permissions. Add authorize(user, account, action) and call it on every money-touching route. This is the moment the system stops being 1:1 in spirit. This is also where the user-scoped money reads re-scope to a single account — query_realized_pnl_by_user, query_monthly_realized_pnl_by_user[s][_symbol], get_monthly_sandbox_usage, the funding-transaction history, and the transactions ChTransaction::query listing all answer "for a user" today; under multi-account each resolves to one account_id: the explicit one (authorized via authorize) or, when omitted, the user's default account. No implicit fan-out over the readable-account set (§2e) — this removes the set-resolving helpers (authorize_account_ids / authorize_account_set_with_key); multi-account views are composed client-side or via future account_ids-plural endpoints. Until this step the single-account reads stay correct on user_id because 1:1 holds.
Enforcement fallout: flipping the guard on exposed accounts seeded before per-account permissions existed — they had no account_permissions rows at all, so their (valid) tokens started 403ing on account and trading endpoints (A-3623). #2353 closes the gap with an idempotent one-off backfill granting each account's UBO full_access, and seeds the loan_demo permission row so fresh DB inits aren't born broken.
7Class B Postgres PK rewritesdone›
(blockchain_deposit_addresses, leaderboard) — riskier because they have historical rows; do behind a backfill.
- #2222 rekey
blockchain_deposit_addressesandleaderboardtoaccount_id
8Protocol cleanup (additive, non-breaking — ship early)done›
Real WhoAmIResponse.accounts (done); single-account_id stream scoping via WsQueryParams.account_id (done — not account_ids fan-out, §2e item 3); and the optional account_id on CancelOrderRequest/ReplaceOrderRequest so the REST cancel/replace paths can select the cid namespace (done — superfluous for oid — globally unique — but load-bearing for client_order_id, whose index is (UserId, ClientOrderId) today and becomes (AccountId, ClientOrderId); the WS path already supplies this via the connection, REST has nowhere else to put it). All three named additive items are merged (#2264 landed 2026-06-07); any further additive shape gaps surfaced by the SDK audit ride alongside the API-key scoping in step 9. All additive — landed ahead of the breaking bump.
8bPublic SDK breaking bump — the 1:1-break cutover ("ship once")not started›
Bound to the step-6 watershed: must land before the first multi-account customer, because that is the moment a bare user_id on a money/position/ledger/risk payload becomes ambiguous (§2e). One major version, one deprecation window: drop Fill.user_id, AdminTrade.maker_user_id/taker_user_id, Position.user_id, Transaction.user_id, FundingTransaction.user_id; rename UserRiskSnapshot → AccountRiskSnapshot and drop its user_id; keep OrderDetails.user_id with its meaning re-pinned to actor-of-record (non-breaking doc shift, shippable even earlier). account_id companions already exist on every dropped type, so account-keyed consumers are unblocked today; the GUI and external consumers move to account_id first (actor recoverable via order_id → order_log, owner via account → ubo_user_id). Trade (market data) untouched. See the §2e decision table. No PRs yet — the SDK still carries every listed user_id field (verified 2026-06-11). Timing-critical.
9API key reworkin progress›
Schema migration to (user, account, scoped_perms). New keys mint with explicit account; old keys grandfathered to the user's default account. #2227 remains in draft.
- #2227 account-scoped API keys
10Account-key the servicesin progress›
This step is not "delete as_default_account_id". The 1:1 era left ~210
call sites that reinterpret a UserId as an AccountId or vice-versa. They
are not one change: they are several different questions the bit pattern
answered for free. The forward resolution (user → their default account)
is permanent and stays (§3.5); the work is per-service
account-keying that removes the reverse derivations and the user-keyed
service internals. Decomposition:
| Concern | Where | Disposition |
|---|---|---|
| a. Interface/internals rekey user→account | risk-engine (+ verbatim twin risk-engine-bitnomial), transaction-engine, trade-engine2 | rekey the service's own state/interface to AccountId; the edge still passes the default account |
b. CH denormalized user_id columns |
transactions, positions, trades, risk_snapshot_user_equity |
deferred to the single step-13 sweep — services keep emitting user_id as a passthrough |
| c. Close-only / freeze user-keying | risk-monitor, order-gateway, api-gateway | spun out as step 10b |
| d. Redis / pub-sub channel keying | risk-monitor trigger channel, risk_snapshot_updated consumers |
per-channel; the publisher emits account_id |
| e. EP3 participant ↔ account (bidirectional) | provisioning (api-gateway/auth.rs, admin-cli), order submission (order-gateway client_participant_id), fills (trade-engine2 participant decode) |
the participant namespace is account-keyed end to end: mint via generate_ep3_username(account_id: AccountId) (delete the UserId overload + the generate_ep3_account_name twin), stamp the account's participant on submission, decode Ep3Username::from_full_path → AccountId. Value-preserving for default accounts (§2a). Source the trade/fill actor-of-record from order correlation, not the participant (§2e) |
| f. Reverse → UBO lookups (legitimate) | account→username, Db*::pseudonym() |
read trading_accounts.ubo_user_id (replica-cached on hot paths), not the bit pattern |
| g. Edge default-account resolution | order-gateway, api-gateway, admin-cli | as_default_account_id stays — this is its permanent home |
| h. Test fixtures | every /tests/, sdk-internal unit tests (~97 sites) |
explicit test-only helper; 1:1 is correct forever in fixtures |
The one exception (a CH user_id drop pulled forward): risk-engine's
slice drops ChRiskSnapshotRow.user_id and the column on
risk_snapshots/_1h/_1d (+ rebuild the 3 MVs) in the same PR — free,
because those tables lead their sort key with account_id and the reader
already GROUP BY account_id (§2c.2). Every other CH user_id
column stays for step 13.
The done-bit is not "the function is gone"; it is: no money/risk service
keys its own state on user_id, and no code derives a user from an account
except for a legitimate UBO/identity reason (concern f).
as_default_account_id survives at the edge; from_default_account_id
survives only as the deprecated shim until concerns a/c/f retire its last
callers. EP3 seam (concern e) not started — generate_ep3_username still
takes UserId; the reverse from_default_account_id is down to 10 refs / 6
files (verified 2026-06-11).
- #2259 account-key transaction-engine / api-gateway interface (concern a)
- #2260 account-key trade-engine2 fee-rate state (concern a)
- #2261 risk-engine: account-key the calculator internal state (concern a)
- #2262 risk-monitor: account-key the risk-snapshot channel consumer (concern d)
- Superseded / related
- #1831 earlier risk-engine calculator rekey (superseded by #2261) → kevin/a-2946-rekey-redux
10bAccount-key close-only/freezein progress›
users.is_close_only/is_frozen are enforced user-keyed today: order-gateway reads users_replica…is_close_only to gate order entry, and risk-monitor breach/cure writes the user row (via the reverse derivation). Move the truth to trading_accounts.is_close_only/is_frozen (both columns already exist): risk-monitor breach/cure writes the account, order-gateway enforcement reads an account-keyed replica, api-gateway admin freeze targets the account. Then drop the users columns (the Class C / db/postgres/1.sql:30 TODO). This is what removes risk-monitor's reverse derivation — it is not separately convertible. Resolves §3.6. Sequence: after step 6; can run parallel to steps 7–10.
11Account-creation messages in the public SDKnot started›
CreateAccount, ListAccounts, GrantPermission, RevokePermission. No PRs yet.
12TigerBeetle binding swapnot started›
To AccountId::derive_tb_account_id. Default accounts retain their bit pattern so ledger history reconciles; new accounts use fresh ULIDs. No PRs yet — derive_tb_account_id still defined on UserId (verified 2026-06-11).
13Drop the denormalized user_id — ClickHouse columns onlynot started›
Terminal cleanup, after step 10 — once nothing keys on or filters by user_id. The public SDK breaking changes are not in this step — they ride the single 1:1-break bump (step 8b), gated to land before the first multi-account customer. What remains in this step is purely internal ClickHouse: both classes from §2c.2 except the risk_snapshots family, already dropped in step 10's risk-engine slice — the rekeyed tables (positions, transactions) that kept user_id through the 1:1 era, and the column-only tables (order_log, trades, historical_orders) — for each, decide drop-outright vs. keep-as-actor-stamp (audit). Migrates the last stragglers off user_id: the transactions reads left perf-neutral in step 5 (get_monthly_sandbox_usage, the realized-PnL queries, ChFundingTransactionFilters, query_by_reference_id) and the admin-cli reconciliation reads. Nothing structural depends on this; it's hygiene that makes the conflation un-reintroducible.
Notebook
Reference design — the detailed mechanics behind the tracker.
§2aThe mental-model shift
Today the system conflates two things into one identifier (user_id):
- Identity — who is making a request (a human, a session, an actor for audit).
- Ownership — whose money the request moves (balances, positions, margin, fills, P&L, statements).
The new model needs to split these. Every concept in the system is one of:
| Bucket | Scope | Examples |
|---|---|---|
| Identity | user_id |
login, MFA, JWT, API keys, sessions, audit "actor", notifications |
| Ownership | account_id |
balances, positions, fills, orders, margin, risk limits, statements, ledger, deposits, fee schedule, close-only/frozen flags |
| Membership | (user_id, account_id, perms) |
"Andrew can trade account ACME-1; Andrew can read-only account ACME-2" |
The single hardest discipline: every money-touching API call must take an
explicit account_id, and the gateway must authorize
(user_id, account_id, action) against account_permissions. No more silent
default.
trading_accounts.ep3_username/ep3_account is the sole binding — users carries no EP3 identity once §2c.1 Class C lands. The participant namespace therefore decouples entirely from AX user identity: provisioning mints an account-keyed participant, order submission stamps the account's participant, and fills decode the participant back to an account_id. The 1:1 era hid this because the participant was minted from the user's bit pattern — and because that bit pattern is the default account's id (AccountId(self.0)), the switch is value-preserving: existing participant strings are byte-identical ({prefix}.{ulid} with the same ULID), so no EP3-side rename or re-provisioning; only genuinely-new (non-default) accounts get fresh participants. Terminal state: a single Ep3UserManager::generate_ep3_username(account_id: AccountId) — the UserId overload and the redundant generate_ep3_account_name twin are both deleted, and the inbound Ep3Username::from_full_path → UserId decode becomes → AccountId. The actor (who placed the order) is no longer recoverable from the participant — it is carried AX-side via order correlation, not reconstructed from the EP3 seam (§2e).
UserId::as_default_account_id() (forward: user → their default account) and
UserId::from_default_account_id() (reverse: account → user) are the
inventory of every place that conflation still lives — ~210 call sites. The
two are not symmetric: the forward resolution is legitimate and
permanent (a request that omits account_id resolves to the user's
default account — the one whose id equals the user's;
§3.5), so as_default_account_id is never deleted. The
reverse is the real smell — code deriving "the user behind this account,"
which only the 1:1 era made free. Step 10 (§1) is therefore
per-service account-keying, not a function deletion.
§2bConcepts that need rethinking
Secretly account-level, masquerading as user-level today
users.is_close_only,users.is_frozen,users.ep3_username,users.ep3_account,users.maker_fee,users.taker_fee— schema already has the TODO. These move totrading_accountsexclusively.- API keys — currently per-user with implicit access to "the user's account". Need to become per-
(user, accounts). - TigerBeetle ledger IDs —
UserId::derive_tb_account_id(ledger)constructs the TB ledger account id fromuser_idbit pattern. The ledger of record is bound to user, not account. This must becomeAccountId::derive_tb_account_id(ledger). - Risk snapshots —
risk:{user_id}in Redis, but the value already carriesaccount_id. The key is the thing that's wrong.user_idshould be deprecated and then dropped. - Drop-copy / position / fill / balance streams — currently filtered by
user_id; should be filtered by the set ofaccount_ids the user can read. - Loss limits, LTV monitors, deposit addresses, leaderboard rows — composite-PK'd on
user_idbut semantically account-level.
Genuinely user-level — stay that way
users,password_reset_codes,user_risk_profiles,onboarding_applications,documents_meta, KYC, MFA, session tokens, audit "who did this".- Notifications addressed to a human (email/push), not to an account.
Has to be invented (in protocol, not just DB)
- Every money-touching request carries an explicit
account_id(oraccount_ids: Vec<AccountId>for reads/subscriptions); there is no session-level "active account" state. Today there's no protocol-level way to say "I'm Andrew, acting on behalf of ACME-1" — the request body has to grow that field.WhoAmIResponse.accountsis multi-aware but the rest of the surface isn't. Sticky session state is rejected on purpose: it makes "I traded in the wrong account" reconstructable only from logs+state-at-the-time instead of just the request body, and it adds an invalidation axis (permission revocation, account close) that explicit parameters don't have.
§2cDatabase primary keys
Two stores to migrate: Postgres (transactional state of record) and
ClickHouse (append-only time-series for orders, fills, positions, risk
snapshots). They have very different rekey economics — Postgres PKs are an
integrity constraint, ClickHouse ORDER BY is the on-disk layout — so they
need separate plans.
2c.1 · Postgres
The schema is in surprisingly good shape — trading_accounts and
account_permissions already exist. The work splits into additive schema, PK
rewrites, and one cleanup pass.
Class 0: additive columns (do these first; unblock the rest):
trading_accounts.ubo_user_id CHAR(16) REFERENCES users(id)— the Ultimate Beneficial Owner. See §3.3. Nullable: system accounts (e.g.,AX_LENDING_USER_ID) have no beneficial owner and should beNULL; customer-owned accounts have a UBO. Backfill: 1:1 from the user that created the account, NULL for system accounts.
No users.primary_account_id column. The "user's default account" is
derived, not stored: it's the account whose id shares the user's ULID bit
pattern (the TigerBeetle-preservation invariant from below). Zero-cost to
compute, stable for the life of the user, and doesn't require a new NOT NULL
column. See §3.5.
Class A: drop user_id, key on account_id (simple, mechanical):
current_balances— PKuser_id→account_id(A-2946 already added theaccount_idcolumn NOT NULL with a 1:1 backfill; only the keying remains)monthly_loss_limits— FKuser_id→account_id(needs additive column add + backfill first)loan_to_value_monitors— PKuser_id→account_id(needs additive column add + backfill first)
Class B: composite PKs that need user_id swapped for account_id (data migration):
blockchain_deposit_addresses—(user_id, blockchain, asset)→(account_id, blockchain, asset). Deposit addresses fund accounts.leaderboard—(metric, cadence, period_start, user_id)→(..., account_id). Ranks trading entities, not humans.
Class C: shed duplicated columns once trading_accounts is canonical (cleanup, off the critical path):
users: dropep3_username,ep3_account,is_close_only,is_frozen,maker_fee,taker_fee. (Schema already TODOs this atdb/postgres/1.sql:30.) Pure hygiene — does not block anything; can ship before, after, or in parallel with any rekey.
No change (identity tables): users, api_keys (see §2f —
different reason), user_risk_profiles, password_reset_codes,
onboarding_applications, etc.
account_id = user_id's bit pattern so historical TB rows still reconcile. Additional accounts get fresh ULIDs. This is exactly the shape the current code provides — as_default_account_id() returns AccountId(self.0). Bake that as a backfill rule. The function itself is not deleted: forward default-account resolution (user → their default account) is permanent (§3.5). What goes in step 10 is the reverse (from_default_account_id, account → user) and the user-keyed service internals.2c.2 · ClickHouse
The A-2946 backfill (db/clickhouse/a-2946-account-id-backfill.sql) has
already added account_id columns to every relevant table and backfilled them
1:1 from user_id. The materialized views (risk_snapshots_1h_mv,
risk_snapshots_1d_mv) were updated atomically via MODIFY QUERY to carry
the column through. So columns are in place; what remains is the rekey —
that script's preamble explicitly defers it ("the follow-up rekey is a
separate migration"). That's the work below.
Why this is meaningfully different from Postgres: MergeTree sort/primary
keys are the physical part layout. You cannot ALTER … MODIFY ORDER BY to
move user_id out of the sort key — you must build a new table with the new
ORDER BY, INSERT SELECT everything across, swap names, drop the old. For
multi-billion-row tables (order_log, trades, positions, transactions)
this is a real operation: hours of I/O, dual disk-space, and a brief writer
pause at swap time.
Tables where user_id is already only a column (no rekey, just stop writing it after Postgres cutover):
historical_orders— sort key(completion_time_ns, order_id).order_log— sort key(event_timestamp_ns, order_id).trades— sort key(symbol, timestamp_ns, trade_id); both maker/taker carry both ids.
For these, the column stays for one release as a denormalized convenience,
then we either drop user_id outright or keep it as an "actor of record"
stamp (cheap, useful for audit). Either way: no on-disk rebuild.
The three rekeyed tables below (positions, transactions,
risk_snapshots_1d) likewise keep their user_id column through the 1:1
era as denormalized actor-of-record data — only the sort key flips. That has a
consequence for readers: once the sort key leads with account_id, any query
that still filters/LIMIT 1 BYs on user_id loses the leading-column prune
and full-scans. So reader-query rekey is coupled to each storage rekey, not
deferrable — every WHERE user_id/LIMIT 1 BY user_id against an
account_id-leading table (positions, risk_snapshots,
risk_snapshots_1h, risk_snapshots_1d) must move to account_id in the
same step that flips the key. (transactions is the exception: its key leads
with timestamp, not account_id, so its user_id reads are perf-neutral
and only need migrating for the semantic/cleanup reasons below.) The eventual
user_id-column drop for all of these — rekeyed and column-only alike — is
its own terminal step (§1 step 13), with one exception:
risk_snapshots/_1h/_1d have their user_id drop pulled forward into
the risk-engine slice of step 10. Those tables already lead their sort key
with account_id and the api-gateway reader already GROUP BY account_id, so
dropping the column (and rebuilding the 3 MVs) is free to do alongside
risk-engine's internal rekey. Every other CH user_id column stays for the
step-13 sweep.
Tables that need a true rekey (user_id is in the sort/primary key):
| Table | Today | Target | Notes |
|---|---|---|---|
positions |
PRIMARY KEY (user_id, timestamp_ns, sequence_number, symbol) |
(account_id, timestamp_ns, sequence_number, symbol) |
Largest by row count; will dominate the migration window. |
transactions |
PRIMARY KEY (timestamp, user_id, sequence_number) |
(timestamp, account_id, sequence_number) |
The "primary unique ID" doc comment ((user_id, sequence_number)) becomes (account_id, sequence_number) — see invariant note below. |
risk_snapshots |
PRIMARY KEY (user_id, timestamp_ns) |
(account_id, timestamp_ns) |
7-day TTL — can drain rather than rebuild. |
risk_snapshots_1h |
ORDER BY (user_id, bucket_timestamp_ns) |
(account_id, bucket_timestamp_ns) |
90-day TTL — can drain or rebuild; MV body must be updated. |
risk_snapshots_1d |
ORDER BY (user_id, bucket_timestamp_ns) |
(account_id, bucket_timestamp_ns) |
No TTL — must rebuild. MV body must be updated. |
Rekey playbook (per table, applied uniformly):
CREATE TABLE … _newwith the newORDER BY/PRIMARY KEY.- For MV-fed tables (
risk_snapshots_1h,_1d): drop the MV, recreate it pointing at_new. For source-of-truth tables fed by service writes (positions,transactions,risk_snapshots): the writer must dual-write or be paused. INSERT INTO …_new SELECT * FROM …in time-bucketed batches (by partition where partitioned, by month otherwise). Verify row counts match per partition.RENAME TABLE … TO …_old, …_new TO …(atomic in a single statement).- Optionally
DROP TABLE …_oldafter a soak.
For TTL'd tables (risk_snapshots, risk_snapshots_1h) there's a cheaper
path: stop writing on the old key, start writing on the new key in a new table
from cutover, let the TTL drain the old one. Skips the INSERT SELECT
entirely. Worth using.
Invariants that change meaning, not just column names:
positions.sequence_number— doc says "globally unique per user, i.e., not tracked per symbol." Once accounts can be multi-user, this needs to be per account, not per user. Where the writer generates it (trade-engine2 / settlement-engine) must reseed per account.transactions.sequence_number+transactions.event_id— the doc identifies(user_id, sequence_number)as the primary unique key withevent_idas backup. The new primary unique key is(account_id, sequence_number). The backupevent_idsemantics don't change but the uniqueness contract migrates.risk_snapshotsrows currently fan out one-per-user; in the multi-account world they fan out one-per-account. A user's multi-account "risk view" is composed client-side (one query per chosen account) — the server emits one row per account and never fans out over a permission set (§2e, no implicit fan-out). The producer (risk-enginebuying_power_updater.rs) already loops onaccount_id.
Out of scope: funding_rates, candles_*, bbo_candles_*,
market_snapshots, benchmark_prices, index_prices, invariants_log —
none reference user/account. monitor.* schema is service-owned and
unaffected.
Analytics DB (db/clickhouse/analytics.sql) is symbol/event-keyed
throughout — no user/account in any table — so no rekey, no column adds.
§2dRedis shape changes
The Redis surface is narrow — almost all of it is in
rs/sdk-internal/src/redis/redis_keys.rs:
| Today | Tomorrow | Reason |
|---|---|---|
risk:{user_id} |
risk:{account_id} |
Risk is money, not identity |
risk:{user_id}:{symbol} |
risk:{account_id}:{symbol} |
Same |
session:token:{hash} |
unchanged | Token = identity artifact |
session:user:{user_id} |
unchanged | Token index per human |
Nothing scans these with wildcards, and the value (RedisUserRiskSnapshot)
already carries both ids, so the cutover is a single producer/consumer swap
plus a backfill. No Lua/MULTI dependencies cross the user/account boundary.
The Redis pubsub channel risk_snapshot_updated is global, so it doesn't
shift. But any future "subscribe to a user's risk" should be expressed as
"subscribe to an account's risk" — a multi-account user's view is composed
client-side, one subscription per chosen account, never a server-side fan-out
over a permission set (§2e).
§2eProtocol & request shape
The SDK already half-supports this. Most message types (Fill, Position,
Balance, UserRiskSnapshot, AdminTrade) carry both user_id and
account_id. The shape is mostly right — the enforcement is what's wrong —
with one exception: user_id on the trade/fill types (Fill, AdminTrade)
is scheduled for removal — account_id becomes the sole identity (see the
outbound note at the end of this section and §1 step 8b). The
gateway guard in api-gateway/src/utils.rs:106 actively rejected any
client-supplied account_id that didn't equal the user's default. That guard
inverted (step 6, merged):
- Before: "
account_idmust equal user's default." - After: "
account_idmust exist inaccount_permissionsfor this user withcan_<action>true."
Request-shape decisions, each made explicitly:
- Optional
account_id, defaults to the user's default account. Keeps every existing SDK call working unchanged (the default account isuser_id's bit pattern, so this is a no-op until users actually have a second account). This default is permanent — there is noprimary_account_idand no mutable "make this my default" knob; every user has exactly one default account (id == user id) forever, used or not. The foot-gun concern — "I misclicked and traded the wrong account" — is real but lives at the client layer: the CLI/GUI for multi-account users should refuse to send without an explicit choice. The server accepts the omission and resolves it deterministically viaas_default_account_id. One nuance to document on every endpoint: the default is fixed by the bit pattern, so adding a second account to a user never changes which account their bare-account_id-less calls land on (still the default account).No implicit fan-out (decided). Omission resolves to the single default account — never a server-side aggregate over the user's readable-account set. A money read keys on exactly oneaccount_id: the explicit one (authorized viaauthorize) or the default. Reading N accounts is N explicit calls (or a futureaccount_ids-plural endpoint — a deliberately later decision). Rationale: server fan-out makes a bare read's cost unbounded in the user's grant set (multi-account scan plus per-rowaccount_permissionsresolution) and turns a cheap keyed lookup into an expensive union. Account is a top-level context, surfaced as an account selector in the GUI header bar: the active account is chosen explicitly and sent on every money read; "switch account" is a selector change, not a server-inferred set. The consumer composes multi-account views client-side. - JWT claims: add nothing. Keep JWT identity-scoped. Account context is per-request, not per-session — a single session can act on multiple accounts (treasurer pattern).
- Subscriptions/streams: each stream scopes to a single
account_id(optional; omission resolves to the user's default account), supplied by the connection out-of-band — e.g.WsQueryParams.account_id(rs/sdkprotocol/order_gateway.rs), already shipped (#2224). The server enforces the scope server-side (rejecting an account the user cannot read); never trust a client-supplied filter. There is noaccount_ids: Vec<AccountId>server-side fan-out — a multi-account view is N connections, one per account, composed client-side (the same no-fan-out rule as item 1). There is no pluralaccount_idssubscription. WhoAmIResponse.accountsbecomes the real list (fromaccount_permissions), not the synthetic 1:1.- Cancel/replace by
cidneed an optionalaccount_id; byoidthey don't. A server order id (oid) is globally unique, soCancelOrderRequest/ReplaceOrderRequestreferencing one need no account. Butclient_order_id(cid) is namespace-scoped — the open-orders index is(UserId, ClientOrderId)today and becomes(AccountId, ClientOrderId)— so resolving acidrequires knowing which account's namespace. The WS path gets this from the connection (WsQueryParams.account_id); the REST/cancel-orderand/replace-orderpaths have no connection scope, so the request grows an optionalaccount_id(omission → default account). Additive and non-breaking — shipped in step 8 (#2264), not the breaking bump.
For events flowing out (drop-copy, fills, position deltas), the payload key
is account_id — the owner whose position moved. A user_id may also ride
along, but its meaning must be pinned: it is the actor of record (who
placed the order), which under multi-account is not the owner and
not the UBO (a proxy/algo trades an account it doesn't own). The trap to
avoid: once the EP3 seam is account-only (§2a,
§1 step 10e), the actor is no longer recoverable from the
participant, so it must be sourced from order correlation
(order_id → order_log, which holds the authenticating user) — never from the
participant parse, and never silently backfilled from
trading_accounts.ubo_user_id (that would reinterpret "the trader" as "the
legal owner" exactly when they diverge).
rs/sdk carry a user_id alongside account_id. While 1:1 holds, user_id == account_id's bit pattern, so the field is harmless and consumers can already key on account_id. The moment a user can own/trade a second account, a bare user_id on a money/position/ledger payload becomes ambiguous (owner? actor? UBO?) — so the breaking cleanup is not terminal hygiene; it is a correctness deadline pinned to the 1:1 break (the step-6 watershed, before the first multi-account customer). All *user_id fields below are Serialize/Deserialize and carry utoipa::ToSchema (documented REST/WS + OpenAPI), so this is a single, deliberate major SDK bump with one deprecation window — shipped once, then the plumbing moves underneath on our timeline.
The split is by what the object is about, not blanket removal:
Public type (rs/sdk) |
Today | Decision at the 1:1 break |
|---|---|---|
Fill (protocol/common.rs) |
user_id + account_id |
DROP user_id; account_id is the sole identity |
AdminTrade (protocol/api_gateway.rs) |
maker_user_id/taker_user_id + maker_account_id/taker_account_id |
DROP both *_user_id; the *_account_id companions are sole identity |
Position (protocol/api_gateway.rs) |
user_id + account_id |
DROP user_id |
Transaction (protocol/api_gateway.rs) |
user_id + account_id |
DROP user_id |
FundingTransaction (protocol/api_gateway.rs) |
user_id + account_id |
DROP user_id |
UserRiskSnapshot (protocol/api_gateway.rs) |
user_id + account_id |
RENAME → AccountRiskSnapshot and DROP user_id — the type name itself encodes the conflation (risk is account-scoped; §2c.2, §2d) |
OrderDetails (protocol/order_gateway.rs) |
user_id + account_id |
KEEP user_id, meaning re-pinned to actor-of-record (who placed the order), distinct from account_id = owner. An order has exactly one authenticating actor, so the field is unambiguous on an order object even under multi-account. |
FillDetails (protocol/order_gateway.rs) |
account_id only |
already account-only — no change (it is the terminal shape) |
Trade (protocol/marketdata_publisher.rs) |
no identity | unaffected — never carried identity |
Why OrderDetails keeps it but Fill/Position/ledger/risk don't: an
order is the act of one authenticating session, so "actor of record" is a
real, single-valued property of that object. A money/position/ledger/risk
row is a fact about the account; its actor (if ever needed) is recovered
by joining order_id → order_log (which holds the authenticating user), and
the legal owner by joining account → trading_accounts.ubo_user_id.
Stamping a bare user_id on those rows just re-creates the conflation.
(Rejected for the dropped types: keeping user_id silently meaning UBO —
actively wrong under proxy trading; or actor-of-record — needless surface area
now that account_id is the key and the actor is recoverable via order
correlation.)
The internal ChTradeRow.maker_user_id/taker_user_id stamps stop being
derived from the participant in step 10e; the public field removals/rename
above ride the single 1:1-break bump in §1 step 8b.
§2fAuthorization & API keys
account_permissions already has the right granularity: can_list,
can_read, can_set_limits, can_reduce_or_close, can_trade. Two design
choices to lock in:
Centralize the check. One function —
authorize(user_id, account_id, Action::Trade) — called at the gateway
boundary on every money-touching route. Replace the contents of
validate_requested_account_id with a real ACL lookup. The
as_default_account_id call sites become the test set: every one of them is a
place where a permission check should now run. (Shipped in step 6 — #2180.
One operational lesson already learned: enforcement is only as good as the
permission rows; accounts seeded before account_permissions existed had
none and 403'd until the #2353 backfill.)
API keys are the hardest sub-design. Today api_keys has user_id only.
Two viable shapes:
- (a) Key belongs to a user, inherits the user's permissions at request time. Simple, but a leaked key is as powerful as the user's full account set, and revoking access to one account doesn't revoke the key.
- (b) Key belongs to a
(user, account_id, scoped_perms)tuple. A user can mint N keys, each with a subset of their permissions on one account. Recommended — matches how every other exchange API does this, and matches a "read-only key for the accountant, trade-only key for the bot" workflow that customers will want immediately.
Recommendation: (b). Schema change: api_keys gets account_id (NOT NULL FK)
and per-key permission flags mirroring account_permissions, with a
constraint that key perms ⊆ user perms at issuance time. In flight as #2227
(step 9, draft).
§2gSequencing summary & off-critical-path work
authorize() against account_permissions) is merged, so multi-account is now a config-flip per customer rather than a code change. The one timing-critical item remaining is the public-SDK breaking bump (step 8b), which must land before the first multi-account customer.
Steps 1–6 are the "make it work for 1:1 in the new shape" phase and shipped
safely. Step 6 is the watershed — once the gateway authorizes against
account_permissions instead of "must equal default," multi-account is a
config-flip per customer rather than a code change. The public-SDK breaking
bump (step 8b) is gated to that watershed, not to the end: it must land
before the first customer's second account exists, because the dropped
user_id fields stay value-correct only while 1:1 holds — once it breaks they
are ambiguous. That is the whole "ship the public API early and once"
discipline: the additive surface (step 8) went out first, the single breaking
bump (8b) at the 1:1 break, then the internal plumbing (steps 10–13) moves
underneath on our own timeline. Steps 4 and 5 (ClickHouse) ran in parallel
with the Postgres work but had to complete before step 10 (the service
account-keying), since the writers stamping account_id into CH rows still
used the 1:1 derivation. Step 13 (internal ClickHouse user_id column drops,
no public surface) is strictly last. Step 10 does not delete
as_default_account_id: the forward default-account resolver is permanent
(§3.5). What step 10 retires is the reverse derivation
(from_default_account_id) and user-keyed service state; step 10b
additionally moves close-only/freeze off the user.
Off the critical path (ship anytime)
- Drop the duplicate
userscolumns (ep3_*,maker_fee,taker_fee— pure cleanup;is_close_only/is_frozenare retired by step 10b, not here). Does not block any step above. - Admin GUI concept-shift (see §3.7). Earliest start: as soon as step 1 lands (it has). Run as a parallel PR series.
§3Design Questions
-
Cross-account margin / netting — if one user can act on several accounts, are those accounts margined jointly?
- Not yet answered Leaning no — independent per account (a joint legal entity gets fuzzy). Needs to become a stated invariant: it decides whether
current_balanceskeys onaccount_idalone (independent) or needs amargin_group_id(joint).
- Not yet answered Leaning no — independent per account (a joint legal entity gets fuzzy). Needs to become a stated invariant: it decides whether
-
Account lifecycle — who creates accounts beyond onboarding, and is there a public-SDK account-creation message?
- Not yet answered Onboarding still creates
(user, account)atomically; no admin-creates / self-serve flow and no public-SDK account-creation message yet. The surface lands in step 11.
- Not yet answered Onboarding still creates
-
First-class owner — should
trading_accountscarry an explicit owner column?- Answered — affirmative Add
ubo_user_id CHAR(16) REFERENCES users(id)(nullable) — the Ultimate Beneficial Owner (KYC/tax/compliance counterparty), distinct from the symmetric, operationalaccount_permissions. Nullable for system accounts (AX_LENDING_USER_ID); backfilled 1:1 from the creating user for customer accounts. A stepping-stone to a future first-classcustomerentity, at which pointubo_user_idmigrates tocustomer_idand is dropped. UBO is single-valued; joint ownership is a future customer-entity concern.
- Answered — affirmative Add
-
Audit shape — should audit rows carry
account_id(and the authorizing permission)?- Not yet answered Minimum viable: every audit row gains
account_id. Richer: also record theaccount_permissionrow that authorized the action, so revoking a permission yields a clean "after this point, X could no longer do Y" story. Not yet committed to either.
- Not yet answered Minimum viable: every audit row gains
-
Default account for legacy clients — does an omitted
account_idresolve to a single, derived default account (no storedprimary_account_id)?- Answered — affirmative
account_idstays optional and resolves to the user's default account even after they have several (§2e). The default is derived, not stored — the account whose id shares the user's ULID bit pattern (already preserved for TigerBeetle history, §2c.1). Nousers.primary_account_id, ever; the forward resolveras_default_account_idis therefore permanent (§1 step 10, concern g). A mutable "active account" preference, if ever wanted, is a client/session concern — never a stored column that changes ownership math.
- Answered — affirmative
-
Frozen / close-only semantics — is the truth at the account level (dropping the user-level columns)?
- Answered — affirmative Per-account is the truth; per-user is a derived "all my accounts are frozen" view. Move enforcement to
trading_accounts.is_close_only/is_frozenand drop theuserscolumns (§1 step 10b).
- Answered — affirmative Per-account is the truth; per-user is a derived "all my accounts are frozen" view. Move enforcement to
-
Admin GUI concept-shift — should money-bearing admin workflows flip from user-centric to account-centric?
- Answered — affirmative Money operations flip to account-centric ("freeze ACME-1"); identity operations stay user-centric (MFA, password, KYC).
ubo_user_idmakes the admin GUI navigable ("show me Andrew's accounts" →WHERE ubo_user_id = Andrew; audit lines carry both). The trader GUI gets an account selector in the header — what makes no-implicit-fan-out ergonomic (§2e) — shown only when permissions authorize more than the default account. Its own workstream: a parallel PR series, earliest start once step 1 lands (it has).
- Answered — affirmative Money operations flip to account-centric ("freeze ACME-1"); identity operations stay user-centric (MFA, password, KYC).