@ngerakines
Created May 7, 2026 17:23
atproto pds impl planning

atproto-pds: Foundational Design Document for a Rust ATProtocol Personal Data Server

Author of design brief: Nick Gerakines (CTO, Graze Social; founder, Smoke Signal & Lexicon Garden; author, AIP OAuth server; AT Protocol Community Fund member).

Target home: tangled.org/ngerakines.me/atproto-crates, as a new crate atproto-pds.

Date: May 1, 2026 (revision 2).

Architectural North Star: A low-latency, highly-performant Rust PDS that is fully conformant to the existing reference implementations and is architected from day zero to support permissioned data spaces as a first-class concern, grounded in the concrete design laid out in bluesky-social/atproto/docs/superpowers/specs/2026-04-22-permissioned-data-pds-design.md (Daniel Holmgren's PDS implementation design, hereafter "the Spaces Design Spec"). The Spaces Design Spec supersedes the earlier Permissioned Data Diary blog posts as the authoritative source for protocol mechanics; the diary is retained only as conceptual background.

Revision 2 changelog: All permissioned-data sections (§1.4, §2.8, §3.6, §4.5, §5.5, §6.5, §7.3, §8.7, §9.8, §10.6, §15, §17, §19) rewritten to match the Spaces Design Spec. Specifically: spaces use SetHash (XOR-SHA256 placeholder for ECMH/ltHash) not bare ECMH; commits use HKDF-derived HMAC with per-commit random IKM for deniability; XRPC namespace is com.atproto.space.* (not community.lexicon.space.*); inter-PDS notification is notifyWrite/notifyMembership (not firehose redaction); credentials are exchanged via a two-step MemberGrant → SpaceCredential JWT flow.


1. Architectural Overview

1.1 What a PDS Is, Concretely

A PDS in atproto is, in network terms, simultaneously:

  1. A repository host: it stores one or more atproto repositories (one per account), including the Merkle Search Tree (MST), the signed commit chain, all DAG-CBOR records, and referenced blobs. With the Spaces Design Spec, a PDS additionally hosts permissioned repos — one per (owner, space-type, space-key) triple the user participates in.
  2. An identity custodian: it manages did:plc rotation keys for hosted accounts (when the user delegates), holds the active atproto signing key, and submits PLC operations on behalf of the user.
  3. An XRPC service host: it exposes the com.atproto.* HTTP API (and well-known endpoints) used by clients to read/write the repo, manage the account, perform sync, and operate identity. Permissioned data adds the com.atproto.space.* namespace.
  4. An OAuth Authorization Server and Resource Server (per atproto.com/specs/oauth): it issues DPoP-bound access/refresh tokens via PAR + PKCE, hosts the authorization UI, publishes /.well-known/oauth-authorization-server and /.well-known/oauth-protected-resource (RFC 9728), and validates inbound resource requests.
  5. A firehose emitter: it publishes com.atproto.sync.subscribeRepos events (#commit, #identity, #account, #sync, #info) over a WebSocket, durably sequenced. Permissioned writes are never emitted on this stream; they propagate via point-to-point notifyWrite / notifyMembership push instead (§6.5).
  6. A request proxy: via the Atproto-Proxy HTTP header and inter-service auth (service auth JWTs minted with the account's signing key), the PDS forwards authenticated client requests to AppViews, labelers, chat services, and feed generators.
  7. A moderation node: it enforces account-level and record-level takedowns, hosts admin endpoints, forwards moderation reports, and may subscribe to labelers.
  8. A space credential issuer & verifier: per the Spaces Design Spec, the PDS mints MemberGrant JWTs for its own accounts (when an OAuth-bound app requests them) and verifies inbound SpaceCredential JWTs presented by remote apps requesting permissioned reads.

1.2 Boundaries with Other Services

  • Relays (e.g., bsky.network, Blacksky's relay, cerulea, the Sync 1.1 reference relay) crawl PDSes via requestCrawl/listHosts and aggregate firehoses. As of Sync 1.1, relays no longer need archival storage; they validate via inductive proofs. The PDS is authoritative for repo state. Relays are unaffected by permissioned data — they only see the public realm.
  • AppViews (api.bsky.app, Tangled, Smoke Signal, Frontpage, Leaflet, Bookhive, etc.) consume the firehose and build app-specific indices. The PDS does not implement AppView logic but does proxy requests to them via service auth. "Syncing apps" in the Spaces Design Spec are AppView-equivalent services that hold space credentials and pull permissioned data via getRepoOplog from member PDSes.
  • Labelers (Ozone and other labeler implementations) emit labels out-of-band; the PDS may subscribe to forward labels to clients and/or to enforce auto-takedowns.
  • Feed generators are externally hosted; the PDS proxies and authenticates client requests to them.
  • PLC Directory (plc.directory and replicas) holds did:plc documents. The PDS reads it for DID resolution and writes to it for genesis ops, rotation, deactivation, and migration.

1.3 Data Flow

  • Public ingress (writes): Client → DPoP-bound HTTPS → XRPC handler → authn/authz middleware → lexicon validation → repo write transaction (block store + MST mutate + commit signing) → outbox sequencer → firehose subscribers and requestCrawl'd relays.
  • Permissioned ingress (writes): Client → DPoP-bound HTTPS → com.atproto.space.createRecord/putRecord/deleteRecord/applyWrites → authn (OAuth, scoped to space) → SpaceRepo formatCommit → SpaceTransactor applyRepoCommit (writes to space_record, updates space_repo rev/setHash, appends space_record_oplog) → fire-and-forget notifyWrite to space owner's PDS → space owner's PDS relays to each registered syncing app via notifyWrite.
  • Public egress (reads): Client → XRPC handler → cache → block store / MST traversal → JSON or DAG-CBOR response. Sync endpoints stream CAR files. The firehose continually flushes #commit and #identity events from the durable outbox.
  • Permissioned egress (reads): Apps holding a SpaceCredential → com.atproto.space.getRecord/listRecords/getRepoState/getRepoOplog against the member's PDS, or getMemberState/getMemberOplog against the owner's PDS → PDS verifies the SpaceCredential by resolving the space owner's DID doc → response.

1.4 Permissioned Data: Architectural Implications Throughout

The Spaces Design Spec defines a space as an authorization and sync boundary for permissioned records representing a shared social context. A space includes records from many users, each storing their own records on their own PDS. Concretely:

  • Space identity: ats://<ownerDid>/<spaceType>/<spaceKey>. The URI scheme ats:// is provisional in the spec.
  • Owner DID is the root of trust — owns the member list, signs SpaceCredentials.
  • Member DIDs participate by writing records to their own PDS, scoped to a space URI. Membership is enforced at the read/sync boundary, not at write time — a user can write records scoped to any space URI on their own PDS; consumers check the member list when ingesting.
  • Storage is split: each user's per-actor SQLite database gets new tables (space, space_member_state, space_repo, space_record, space_member, space_record_oplog, space_member_oplog, space_credential_recipient). The public repo is unaffected.
  • Sync is oplog-based, not MST-diff-based. Each space has a per-user space_record_oplog ordered by (rev, idx) (mirroring applyWrites atomic-batch semantics). Apps pull getRepoOplog?since=<rev> and replay; on setHash mismatch, fall back to full resync via listRecords.
  • Commitment is via SetHash — currently XOR-of-SHA256 placeholder, to be replaced by ECMH or ltHash before production. The owner has a member-list SetHash; each participant has a per-space record SetHash.
  • Commit authentication is HKDF-derived HMAC with random IKM per commit, signed by the user's atproto signing key, with SpaceContext { spaceDid, spaceType, spaceKey, userDid, scope: 'records' | 'members', rev } as HKDF info. Random IKM gives deniability (a commit cannot be re-attributed to its signer outside the original verification context); scope ensures domain separation between record commits and member-list commits.
  • Credentials: a two-step MemberGrant (member's PDS → app, signed by member, scoped to client ID + lxm) + SpaceCredential (owner's PDS → app, signed by owner, valid 2–4 hours) JWT exchange.
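Because the space URI is the addressing primitive for every permissioned realm, a small parser is a natural first building block. The following is a minimal sketch, assuming the provisional ats:// form stays a fixed three-segment path; the SpaceUri and SpaceUriError names are hypothetical and not taken from the spec.

```rust
/// Parsed form of a space URI. The `ats://` scheme is provisional in the spec,
/// so this type is an illustrative assumption, not a stable format.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpaceUri {
    pub owner_did: String,
    pub space_type: String,
    pub space_key: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SpaceUriError {
    BadScheme,
    BadSegmentCount,
    EmptySegment,
    BadOwnerDid,
}

impl SpaceUri {
    /// Parses `ats://<ownerDid>/<spaceType>/<spaceKey>`.
    pub fn parse(input: &str) -> Result<Self, SpaceUriError> {
        let rest = input.strip_prefix("ats://").ok_or(SpaceUriError::BadScheme)?;
        let parts: Vec<&str> = rest.split('/').collect();
        if parts.len() != 3 {
            return Err(SpaceUriError::BadSegmentCount);
        }
        if parts.iter().any(|p| p.is_empty()) {
            return Err(SpaceUriError::EmptySegment);
        }
        // Only a shallow DID check; full DID syntax validation is out of scope here.
        if !parts[0].starts_with("did:") {
            return Err(SpaceUriError::BadOwnerDid);
        }
        Ok(SpaceUri {
            owner_did: parts[0].to_string(),
            space_type: parts[1].to_string(),
            space_key: parts[2].to_string(),
        })
    }
}

impl std::fmt::Display for SpaceUri {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "ats://{}/{}/{}", self.owner_did, self.space_type, self.space_key)
    }
}
```

Keeping the three components as separate fields means the same value can serve as the space_record primary-key prefix and as the source of the SpaceContext fields.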

This means atproto-pds must, from the start, treat the storage, sync, authorization, and notification layers as multi-realm: the public repo is one realm; permissioned spaces are additional realms each with their own commit primitive (SetHash + signed HMAC), sync protocol (oplog + setHash verification), addressing (ats://), notification mechanism (notifyWrite push), and OAuth scope. The TS, Go (indigo, cocoon), and Rust (rsky, tranquil-pds) reference implementations all model the public repo as the single source of truth — the Spaces Design Spec defines exactly how to extend, and atproto-pds should generalize from the bottom up. Every subsystem section below identifies the specific extension point.

The "sidecar pattern" (Nick's terminology, from the Smoke Signal architecture and his blog on authoritative/unauthoritative references) maps to the spec's records-by-strongRef pattern: an authoritative community.lexicon.location.address record can live in a permissioned space, and a public community.lexicon.calendar.event record references it via strongRef. Readers without space credentials see the strongRef but cannot dereference. The PDS implements this by routing record fetches through a realm-aware resolver that returns 404/403 for unauthorized permissioned-record requests.


2. Account Management

2.1 Account Creation

com.atproto.server.createAccount parameters: email, handle, did?, inviteCode?, password?, recoveryKey?, plcOp?, verificationCode?, verificationPhone?. Behaviors required for conformance:

  • Invite codes: Optional, controlled by PDS_INVITE_REQUIRED. Reference TS PDS, cocoon, rsky-pds, and tranquil-pds all support invite issuance, single-use enforcement, and per-account interval-based reissuance (PDS_INVITE_INTERVAL, default 7 days).
  • Email verification: Most implementations issue a verificationCode token by email; cocoon treats SMTP as optional. Tranquil-pds extends this to multi-channel delivery (email/discord/telegram/signal). The spec is permissive, so atproto-pds should treat verification as pluggable.
  • Handle assignment: Either a subdomain of PDS_SERVICE_HANDLE_DOMAINS (e.g., alice.pds.example.com) or a user-provided handle resolvable to the new DID. PDS must verify handle ownership (DNS TXT _atproto.<handle> or https://<handle>/.well-known/atproto-did) before assignment.
  • DID creation paths: (a) PDS creates a did:plc genesis op signed by the PDS's rotation key and (optionally) the user-supplied recoveryKey; (b) PDS accepts a pre-signed plcOp (used for migration); (c) did:web where the PDS hosts /.well-known/did.json for managed subdomains, or the user provides their own did:web.
  • Migration mode: When did and plcOp are supplied with a service-auth JWT issued by the prior PDS (com.atproto.server.getServiceAuth with lxm=com.atproto.server.createAccount), the new account is created in deactivated state pending repo import.

2.2 Account States

Per the Account Hosting specification: active, deactivated, takendown, suspended, plus the implicit deleted state. State changes emit #account firehose events with active boolean and optional status reason. getRepoStatus and checkAccountStatus expose state.

  • deactivated: voluntary; identity may still be served, repo not accessible to public sync.
  • takendown: moderator action; blocks reads and writes; emits firehose #account active=false status=takendown.
  • suspended: temporary admin action.
  • deleted: hard delete after the optional deleteAfter grace window of deactivateAccount.
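The mapping from account state to the #account firehose event can be sketched as a small enum. This is a hedged illustration: the AccountState and AccountEvent types are hypothetical, and only the active boolean and status string are modeled.

```rust
/// Hosting states described above. `Deleted` is the implicit terminal state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AccountState {
    Active,
    Deactivated,
    Takendown,
    Suspended,
    Deleted,
}

/// Minimal shape of an `#account` event body (seq and time omitted).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AccountEvent {
    pub did: String,
    pub active: bool,
    pub status: Option<&'static str>,
}

impl AccountState {
    /// Status reason carried on the event; absent while the account is active.
    pub fn status(self) -> Option<&'static str> {
        match self {
            AccountState::Active => None,
            AccountState::Deactivated => Some("deactivated"),
            AccountState::Takendown => Some("takendown"),
            AccountState::Suspended => Some("suspended"),
            AccountState::Deleted => Some("deleted"),
        }
    }

    /// Builds the event a state transition into `self` would emit.
    pub fn to_event(self, did: &str) -> AccountEvent {
        AccountEvent {
            did: did.to_string(),
            active: self == AccountState::Active,
            status: self.status(),
        }
    }
}
```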

2.3 Deletion & Export

  • com.atproto.server.requestAccountDelete → emails a token; deleteAccount consumes it and tombstones records, deletes blobs, and purges credentials. Reference TS PDS retains a stub; cocoon performs full purge.
  • Public data export is via com.atproto.sync.getRepo (CAR), listBlobs + getBlob for media, and app.bsky.actor.getPreferences for private prefs.
  • Permissioned data export is a new requirement: com.atproto.space.listSpaces + per-space listRecords + getMemberState/getMemberOplog for owned spaces. There is currently no spec-defined CAR-equivalent for permissioned repos; per-space SQLite-table dump is the pragmatic export.

2.4 Email, Password, Phone

  • com.atproto.server.requestEmailUpdate, confirmEmail, updateEmail, requestPasswordReset, resetPassword. Phone verification is in some lexicons (legacy), but cocoon and tranquil-pds support TOTP/WebAuthn as superset features.
  • Password hashing varies across implementations (TS PDS uses scrypt; rsky-pds uses bcrypt; tranquil-pds uses argon2). atproto-pds should default to argon2id.

2.5 App Passwords vs OAuth Sessions

  • createAppPassword, listAppPasswords, revokeAppPassword. App passwords mint long-lived JWT sessions with restricted scope (privileged: false excludes chat, etc.). They are deprecated for new clients; OAuth is the future.
  • OAuth sessions issue DPoP-bound access tokens (15–30 min) and refresh tokens (single-use rotation, longer-lived per the OAuth spec).
  • Space credentials are NOT app passwords — they are short-lived (2–4 hour) JWTs bound to a specific space + client ID, separate from the account's auth tokens. See §15.

2.6 Service Auth & Inter-Service Auth

com.atproto.server.getServiceAuth mints a short-lived JWT signed with the account's atproto signing key, with claims iss=did, aud=did:web:appview.example, lxm=<NSID-of-method>, exp. This is used:

  • by the PDS itself when proxying client calls,
  • by clients calling AppViews directly,
  • for inter-PDS account migration,
  • for inter-service trust where DPoP is not appropriate,
  • for notifyWrite and notifyMembership push notifications between member's PDS, owner's PDS, and syncing apps (per Spaces Design Spec).

The TS PDS and cocoon both require the lxm claim; the older indigo PDS does not. atproto-pds should follow the strict behavior and reject tokens without lxm.
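A strict claim check along these lines might look as follows. This is a sketch under stated assumptions: the claims are already decoded, the signature has already been verified against the issuer's DID document, and the ServiceAuthClaims, ClaimError, and check_service_auth names are hypothetical.

```rust
/// Decoded claims of a service-auth JWT. Signature verification against the
/// issuer's signing key is assumed to have happened before this check.
#[derive(Debug, Clone)]
pub struct ServiceAuthClaims {
    pub iss: String,
    pub aud: String,
    pub lxm: Option<String>,
    pub exp: u64, // seconds since the UNIX epoch
}

#[derive(Debug, PartialEq, Eq)]
pub enum ClaimError {
    WrongAudience,
    Expired,
    MissingLxm,
    WrongLxm,
}

/// Strict validation: `aud` must be this service, the token must be unexpired,
/// and `lxm` must be present and equal to the method being invoked.
pub fn check_service_auth(
    claims: &ServiceAuthClaims,
    own_did: &str,
    method_nsid: &str,
    now_unix: u64,
) -> Result<(), ClaimError> {
    if claims.aud != own_did {
        return Err(ClaimError::WrongAudience);
    }
    if claims.exp <= now_unix {
        return Err(ClaimError::Expired);
    }
    match claims.lxm.as_deref() {
        None => Err(ClaimError::MissingLxm),
        Some(lxm) if lxm != method_nsid => Err(ClaimError::WrongLxm),
        Some(_) => Ok(()),
    }
}
```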

2.7 PLC Operations & Migration

  • com.atproto.identity.requestPlcOperationSignature (email-token gate), signPlcOperation (PDS signs with held rotation key), submitPlcOperation (forward to PLC directory), getRecommendedDidCredentials (returns the credentials the PDS would set if it controlled the DID).
  • Account migration sequence (per atproto.com/guides/account-migration): old PDS issues service auth → new PDS createAccount (with did + plcOp, deactivated) → com.atproto.repo.importRepo (CAR upload) → listMissingBlobs + uploadBlob loop → getPreferences/putPreferences → signPlcOperation to rotate keys/services → submitPlcOperation → activateAccount on new + deactivateAccount (with optional deleteAfter) on old.
  • Permissioned data migration is not yet specified. The Spaces Design Spec is silent on migration; this is a known open question. atproto-pds should design a com.atproto.space.exportSpaces / com.atproto.space.importSpaces (provisional NSIDs) capability and feed feedback upstream when migration semantics are addressed.

2.8 Permissioned Data Hooks in Account Management

  • The space table on the per-actor store records every space the user owns or is a member of. Account creation initializes this table as empty. Account deletion must cascade through space, space_repo, space_record, space_member, space_record_oplog, space_member_oplog, space_credential_recipient.
  • Space ownership is tied to the account. Account deletion of a space owner orphans the spaces — there is no cross-account ownership transfer in the spec. The PDS should warn the user at deletion time.
  • App passwords MUST NOT be used to write to permissioned spaces; only OAuth sessions with appropriate scopes can write. This is enforced in the auth verifier.
  • A removed member retains their space_record rows (per the spec: "the user's data remains intact — they may want to rejoin"). Garbage collection is left as an admin policy.

3. Repository Storage

3.1 MST & Repo v3

Per atproto.com/specs/repository: the repo is a key/value MST keyed by <collection>/<rkey> (UTF-8 bytes), values are CIDs of DAG-CBOR-encoded records. MST nodes have a deterministic structure (left subtree CID, entries with prefix-compressed keys, right subtree CIDs). The current binary format is v3. Fanout/leading-zero parameter and SHA-256 hashing are mandatory.

Commit object fields: did, version (currently 3), data (root MST CID), rev (TID, monotonic), prev (deprecated/null), prevData (previous root MST CID, required for inductive sync), and sig (raw bytes signature over the unsigned commit serialized as DRISL/DAG-CBOR). DRISL CBOR (deterministic) is used for the canonical encoding for both signing and CID generation.

3.2 CID, DAG-CBOR, DRISL

CIDs use SHA-256, DAG-CBOR codec for structured data, raw codec for blobs. The "blessed" CID form is base32 with b prefix in string contexts, raw bytes in CBOR. atproto-dasl already implements DRISL encoding, CID computation, CARv1, and block storage backends in the existing workspace and is the foundation for atproto-pds's repo layer.

3.3 Block Storage

There is no spec mandate on storage backend. The reference implementations diverge significantly:

Implementation | Block store | Account DB | Blob store
TS reference (@atproto/pds) | SQLite per-account (one DB file per repo) | SQLite shared accounts.db | Disk filesystem
indigo PDS (Go, deprecated for PDS) | carstore (CAR shards on disk + gorm metadata) | Postgres or SQLite | Filesystem
rsky-pds (Rust) | Postgres (shared) | Postgres | S3-compatible
cocoon (Go) | SQLite block store (default), Postgres optional | SQLite or Postgres | SQLite blob store, optional S3
tranquil-pds (Rust) | Postgres (required) | Postgres | Filesystem (default), optional S3, optional Valkey cache

The Spaces Design Spec assumes per-actor SQLite (it specifies actor-store/space/sql-repo-storage.ts). This is the strongest hint yet that the upstream direction continues to favor per-account isolation, because the space tables piggy-back on the per-actor DB. atproto-pds should support both per-actor SQLite and a unified Fjall keyspace, with the per-actor model recommended for spec-fidelity.

Recommendations for atproto-pds:

  • Define a BlockStore trait and ship multiple backends. Per-account SQLite gives strong isolation and trivially supports per-account compaction/takedown but limits cross-account batch reads. Shared Postgres unifies operations and supports horizontal scaling but trades latency. Fjall (LSM, pure Rust, ~3.5s compile, ~2.2 MB binary) is a strong default for embedded single-host PDS deployments — low-latency reads, compactable, no JNI/C++.
  • The MST mutation path is the hot path. Cache the working MST as an in-memory Arc<Mst> per active repo, persist nodes lazily on commit, and use a write-ahead-log (Fjall's manifest, or a separate WAL crate) for crash safety.
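One possible shape for the BlockStore trait is sketched below, with an in-memory backend standing in for SQLite, Postgres, or Fjall. The method names and the synchronous signatures are assumptions made for brevity; a production trait would likely be async and return a Result.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Raw CID bytes used as the block key (hypothetical alias for illustration).
pub type CidBytes = Vec<u8>;

/// Backend-agnostic block storage. Each backend implements the same two
/// primitives; `has` gets a default implementation in terms of `get`.
pub trait BlockStore: Send + Sync {
    fn get(&self, cid: &[u8]) -> Option<Vec<u8>>;
    /// Writes all blocks of one commit together.
    fn put_many(&self, blocks: Vec<(CidBytes, Vec<u8>)>);
    fn has(&self, cid: &[u8]) -> bool {
        self.get(cid).is_some()
    }
}

/// In-memory backend, useful for tests and as a reference for other backends.
#[derive(Default)]
pub struct MemoryBlockStore {
    blocks: RwLock<HashMap<CidBytes, Vec<u8>>>,
}

impl BlockStore for MemoryBlockStore {
    fn get(&self, cid: &[u8]) -> Option<Vec<u8>> {
        self.blocks.read().unwrap().get(cid).cloned()
    }

    fn put_many(&self, blocks: Vec<(CidBytes, Vec<u8>)>) {
        // One write-lock acquisition per batch keeps a commit's blocks together.
        let mut guard = self.blocks.write().unwrap();
        for (cid, data) in blocks {
            guard.insert(cid, data);
        }
    }
}
```

Handlers that hold a Box<dyn BlockStore> or Arc<dyn BlockStore> never need to know which backend is configured, which is the point of the trait.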

3.4 Commit Signing

Use the K-256 (secp256k1) or P-256 (secp256r1) atproto signing key from the DID document. Sign SHA256(DRISL(unsigned_commit)). Note signatures are normalized to low-S form. K-256 is the default for did:plc; P-256 is also valid. The rotation key is a separate key used only for PLC operations and never signs commits.

3.5 CAR Files

CARv1, mimetype application/vnd.ipld.car. For full repo export, root[0] = current commit CID. For diffs (firehose #commit payload), root[0] = current commit CID, and the body contains exactly the new and modified blocks created after the revision named in the since field. Implementation must be streaming both ways (importRepo can be multi-GB).

CAR is the public-repo format only. Permissioned repos do not use CAR — they use SQL tables and oplog streams (§7.3).

3.6 Permissioned Data: Storage Layer (Spaces Design Spec)

The Spaces Design Spec specifies these per-actor-store tables. atproto-pds should mirror them exactly, modulo Rust naming conventions:

space — every space the user owns or is a member of.

Column | Type | Notes
uri | TEXT PK | ats://<ownerDid>/<spaceType>/<spaceKey>
is_owner | INTEGER (bool) |
is_member | INTEGER (bool) | Set via notifyMembership
created_at | TEXT (ISO8601) |

space_member_state — owner-only: member list commitment.

Column | Type | Notes
space | TEXT PK, FK |
set_hash | BLOB nullable | SetHash over member DIDs
rev | TEXT nullable | Member list TID

The spec emphasizes: this table only exists for owned spaces. Presence of the row is the truth-value of "I own this space" — no nullable columns on space.

space_repo — per-space record commitment for this user.

Column | Type | Notes
space | TEXT PK, FK |
set_hash | BLOB nullable |
rev | TEXT nullable |

space_record — actual records.

Column | Type | Notes
space | TEXT |
collection | TEXT | NSID
rkey | TEXT |
cid | TEXT |
value | BLOB | DAG-CBOR record
repo_rev | TEXT | rev at write
indexed_at | TEXT |

PK (space, collection, rkey). Index (space, repo_rev) for since-queries.

space_member — owner-only: actual member list. PK (space, did). Columns: space, did, member_rev, added_at.

space_record_oplog — record-mutation log per space. PK (space, rev, idx). Columns: space, rev, idx, action (create/update/delete), collection, rkey, cid, prev.

space_member_oplog — owner-only: member-mutation log. PK (space, rev, idx). Columns: space, rev, idx, action (add/remove), did.

space_credential_recipient — owner-only: tracks services issued credentials, for notifyWrite fan-out. PK (space, service_did). Columns: space, service_did, service_endpoint, last_issued_at.

Atomic batch semantics: A single applyWrites-style batch produces multiple oplog entries sharing a rev (TID) but with monotonically increasing idx. The owning space_repo.rev (or space_member_state.rev) is set to that commit's rev. This mirrors public-repo applyWrites.
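The batch rule can be sketched in memory as follows: every write in a batch shares one rev, idx increases from zero within the batch, and the repo's rev advances to the batch rev. The type names are hypothetical, the prev column is omitted for brevity, and rev ordering relies on TIDs sorting lexicographically.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    Create,
    Update,
    Delete,
}

/// One write inside an applyWrites-style batch.
#[derive(Debug, Clone)]
pub struct Write {
    pub action: Action,
    pub collection: String,
    pub rkey: String,
    pub cid: Option<String>, // None for deletes
}

/// Row shape of space_record_oplog (the `space` and `prev` columns are omitted).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OplogEntry {
    pub rev: String,
    pub idx: u32,
    pub action: Action,
    pub collection: String,
    pub rkey: String,
    pub cid: Option<String>,
}

/// In-memory stand-in for one space's space_repo row plus its oplog.
#[derive(Debug, Default)]
pub struct SpaceRepoLog {
    pub rev: Option<String>,
    pub entries: Vec<OplogEntry>,
}

impl SpaceRepoLog {
    /// Appends one atomic batch. The rev must sort after the current rev.
    pub fn apply_batch(&mut self, rev: &str, writes: Vec<Write>) -> Result<(), String> {
        if let Some(current) = &self.rev {
            if rev <= current.as_str() {
                return Err(format!("rev {rev} does not advance past {current}"));
            }
        }
        for (idx, w) in writes.into_iter().enumerate() {
            self.entries.push(OplogEntry {
                rev: rev.to_string(),
                idx: idx as u32,
                action: w.action,
                collection: w.collection,
                rkey: w.rkey,
                cid: w.cid,
            });
        }
        self.rev = Some(rev.to_string());
        Ok(())
    }

    /// Entries strictly after `since`, in (rev, idx) order; None returns everything.
    pub fn since(&self, since: Option<&str>) -> Vec<&OplogEntry> {
        self.entries
            .iter()
            .filter(|e| since.map_or(true, |r| e.rev.as_str() > r))
            .collect()
    }
}
```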

Block storage and DAG-CBOR: Records in space_record.value are DAG-CBOR blobs identical in encoding to public records. The CID is computed identically. The MST is not used. SetHash takes its place as the cryptographic commitment.

SetHash element format (per spec):

  • Records: "<collection>/<rkey>:<cid>" byte string → SHA-256 → XOR-fold into accumulator.
  • Members: the DID string itself → SHA-256 → XOR-fold.

The XOR-of-SHA256 placeholder is explicitly to be replaced by ECMH or ltHash before production. atproto-pds should implement the SetHash interface as a trait so swapping the underlying primitive is a one-crate change.
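A trait boundary of that kind could look like the sketch below. The XOR accumulator is generic over the digest function so the example stays dependency-free; toy_digest is a non-cryptographic stand-in used only to exercise the code and is NOT SHA-256. A real build would pass a SHA-256 function instead. All names here are hypothetical.

```rust
/// Abstract set commitment, so ECMH or ltHash can replace XOR later.
pub trait SetHash {
    fn add(&mut self, element: &[u8]);
    fn remove(&mut self, element: &[u8]);
    fn digest(&self) -> [u8; 32];
}

/// XOR-fold accumulator over a caller-supplied 32-byte digest function.
pub struct XorSetHash<D: Fn(&[u8]) -> [u8; 32]> {
    acc: [u8; 32],
    hash: D,
}

impl<D: Fn(&[u8]) -> [u8; 32]> XorSetHash<D> {
    pub fn new(hash: D) -> Self {
        XorSetHash { acc: [0u8; 32], hash }
    }

    fn fold(&mut self, element: &[u8]) {
        let h = (self.hash)(element);
        for (a, b) in self.acc.iter_mut().zip(h.iter()) {
            *a ^= b;
        }
    }
}

impl<D: Fn(&[u8]) -> [u8; 32]> SetHash for XorSetHash<D> {
    // XOR is its own inverse, so add and remove are the same operation.
    fn add(&mut self, element: &[u8]) {
        self.fold(element);
    }
    fn remove(&mut self, element: &[u8]) {
        self.fold(element);
    }
    fn digest(&self) -> [u8; 32] {
        self.acc
    }
}

/// Record element format from the spec: "<collection>/<rkey>:<cid>".
pub fn record_element(collection: &str, rkey: &str, cid: &str) -> Vec<u8> {
    format!("{collection}/{rkey}:{cid}").into_bytes()
}

/// Stand-in digest for demonstration only (FNV-style mixing, NOT SHA-256).
pub fn toy_digest(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    let mut state: u64 = 0xcbf29ce484222325;
    for (i, slot) in out.iter_mut().enumerate() {
        for &b in data {
            state = (state ^ b as u64).wrapping_mul(0x100000001b3);
        }
        state = (state ^ i as u64).wrapping_mul(0x100000001b3);
        *slot = (state >> 24) as u8;
    }
    out
}
```

The properties the sync protocol relies on (insertion order does not matter, and removing an element undoes adding it) hold for any digest function, which is why the accumulator can be tested independently of the hash choice.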

Sidecar attachments: Not in the Spaces Design Spec. If atproto-pds wants to support sidecar/off-protocol attachments to public records (Nick's community.lexicon.preference.ai use case), the cleanest mapping is: store the attached private record as a normal permissioned record in a personal space owned by the user, and use strongRef from the public record. This requires no new storage primitive — it's an application-level pattern over the spec.

3.7 Spec Open Questions (Storage Layer)

The Spaces Design Spec flags several TODOs that affect storage:

  • SetHash algorithm: XOR placeholder, will move to ECMH or ltHash. Trait-based abstraction.
  • Oplog retention policy: How long must a PDS keep oplog entries? Currently unspecified. atproto-pds should default to retain-forever (with optional admin compaction) and expose the retention window as a configuration option.
  • URI scheme: ats://<ownerDid>/<spaceType>/<spaceKey> is provisional.

4. Record Operations

4.1 Public Repo Endpoints

Endpoint | Behavior
com.atproto.repo.createRecord | Validates lexicon, generates rkey (TID by default), inserts into MST, commits, sequences firehose event. Supports swapCommit and optional rkey.
com.atproto.repo.putRecord | Idempotent upsert with swapRecord and swapCommit for optimistic concurrency.
com.atproto.repo.deleteRecord | Tombstone in MST, blob ref-count decrement, optional swapRecord/swapCommit.
com.atproto.repo.applyWrites | Atomic batch (create / update / delete) producing a single commit. Mandatory for migration replays and high-throughput clients.
com.atproto.repo.getRecord | Public, no auth; returns record + CID + URI.
com.atproto.repo.listRecords | Public; pagination via cursor (rkey-bounded).
com.atproto.repo.describeRepo | Public; returns DID, handle, collections, validity.
com.atproto.repo.listMissingBlobs | Auth; lists blobs referenced by records but not present locally.
com.atproto.repo.uploadBlob | Auth; see §4.4.
com.atproto.repo.importRepo | Auth; CAR import, indexes records, regenerates commit signed by the new PDS's signing key.

4.2 Lexicon Validation

Two spec-defined modes: strict (records must validate against resolved lexicons) and lenient (records that fail to resolve a lexicon are still accepted; only structural validity is checked). Indigo's lexicon.ValidateFlags defines AllowLegacyBlob, AllowLenientDatetime, RequireDataInUnknownUnions. atproto-pds should default to lenient on writes (to allow community lexicons that haven't propagated) while exposing strict mode as a config knob, mirroring TS PDS behavior.
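The difference between the two modes comes down to how an unresolved lexicon is treated. A minimal sketch, assuming the structural check and schema lookup happen elsewhere and assuming that a record failing a lexicon that did resolve is rejected in both modes; the type and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValidationMode {
    Strict,
    Lenient,
}

/// Outcome of looking up and applying the record's lexicon.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SchemaCheck {
    /// Lexicon resolved and the record conforms.
    Valid,
    /// Lexicon resolved and the record violates it.
    Invalid(String),
    /// Lexicon could not be resolved (e.g. not yet published).
    Unresolved,
}

/// Decides whether a write is accepted under the configured mode.
pub fn accept_write(
    mode: ValidationMode,
    structurally_valid: bool,
    schema: &SchemaCheck,
) -> Result<(), String> {
    // Structural validity is required in every mode.
    if !structurally_valid {
        return Err("record is not structurally valid".to_string());
    }
    match (mode, schema) {
        (_, SchemaCheck::Valid) => Ok(()),
        (_, SchemaCheck::Invalid(why)) => Err(format!("schema violation: {why}")),
        (ValidationMode::Lenient, SchemaCheck::Unresolved) => Ok(()),
        (ValidationMode::Strict, SchemaCheck::Unresolved) => {
            Err("lexicon could not be resolved".to_string())
        }
    }
}
```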

The existing atproto-lexicon crate handles NSID validation, recursive resolution (DNS TXT _lexicon.<nsid> then HTTPS), and schema parsing.

Permissioned records use the same lexicon system. Records written via com.atproto.space.createRecord carry $type and validate identically to public records. The XRPC layer enforces the realm separation; the lexicon layer doesn't care which realm the record lives in.

4.3 TID, Rkey, Swap Concurrency

TIDs are 13-char base32-sortable timestamps + clock id; per spec they are monotonically increasing within a repo. The existing atproto-record crate has a TID generator. Swap mechanics: swapCommit is the parent commit CID the client believes the repo is at; if the actual current commit differs the request fails with InvalidSwap. swapRecord is the prior record CID for putRecord/deleteRecord.
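For reference, the TID layout can be sketched without any dependencies: a 64-bit value with the top bit zero, 53 bits of microseconds since the UNIX epoch, and a 10-bit clock id, rendered as 13 characters of the sortable base32 alphabet. The TidClock type below is a hypothetical illustration of per-repo monotonicity and is not the atproto-record API.

```rust
/// Sortable base32 alphabet used by TIDs.
const ALPHABET: &[u8; 32] = b"234567abcdefghijklmnopqrstuvwxyz";

/// Encodes a timestamp (microseconds) and clock id as a 13-character TID.
pub fn encode_tid(micros: u64, clock_id: u16) -> String {
    // Top bit stays zero: 53 bits of timestamp, then 10 bits of clock id.
    let value = ((micros & ((1u64 << 53) - 1)) << 10) | (clock_id as u64 & 0x3ff);
    (0..13)
        .map(|i| {
            let shift = 60 - 5 * i;
            ALPHABET[((value >> shift) & 31) as usize] as char
        })
        .collect()
}

/// Hands out strictly increasing TIDs even if the wall clock stalls or steps back.
pub struct TidClock {
    clock_id: u16,
    last_micros: u64,
}

impl TidClock {
    pub fn new(clock_id: u16) -> Self {
        TidClock { clock_id, last_micros: 0 }
    }

    /// `now_micros` is supplied by the caller so the logic stays deterministic.
    pub fn next(&mut self, now_micros: u64) -> String {
        let micros = if now_micros > self.last_micros {
            now_micros
        } else {
            self.last_micros + 1
        };
        self.last_micros = micros;
        encode_tid(micros, self.clock_id)
    }
}
```

Because the alphabet is in ascending ASCII order and the encoding is fixed-width, comparing two TIDs as strings gives the same result as comparing the underlying integers, which is what makes rev comparisons in SQL and in the oplog cheap.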

For permissioned repos, the equivalent of swapCommit is "the rev of the prior commit" (since there is no commit CID — the SetHash is the commitment, not a content-addressed object). The Spaces Design Spec does not explicitly call out swap semantics for permissioned writes; atproto-pds should implement swapRev as a direct analog and propose this upstream.
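The proposed swapRev check reduces to a comparison between the value the client supplied and the repo's current rev. A hedged sketch, assuming swapRev is optional like swapCommit and that a space with no commits yet has a null rev; swapRev itself is this document's proposal and does not appear in the spec.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum SwapError {
    /// Mirrors the public-repo InvalidSwap error.
    InvalidSwap {
        expected: String,
        actual: Option<String>,
    },
}

/// Optimistic-concurrency guard for a permissioned write.
///
/// `swap_rev` is the rev the client believes the space repo is at (None means
/// the client did not ask for a check). `current_rev` is space_repo.rev, which
/// is None before the first commit.
pub fn check_swap_rev(swap_rev: Option<&str>, current_rev: Option<&str>) -> Result<(), SwapError> {
    match swap_rev {
        None => Ok(()),
        Some(expected) if Some(expected) == current_rev => Ok(()),
        Some(expected) => Err(SwapError::InvalidSwap {
            expected: expected.to_string(),
            actual: current_rev.map(str::to_string),
        }),
    }
}
```

The check has to run inside the same transaction that appends the oplog entries; otherwise two concurrent writers could both pass it.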

4.4 Blobs

Per atproto.com/specs/blob:

  • uploadBlob returns { $type: blob, ref: { $link: CID }, mimeType, size }. CID is bafkrei… (raw codec, SHA-256). Empty blob is technically valid but typically rejected by app lexicons.
  • Server may sniff Content-Type, must reject if Content-Length mismatches.
  • Temporary state: blobs not yet referenced by any record; servers should garbage-collect after a grace window (TS PDS: configurable, default ~hours; cocoon: hourly cron; rsky-pds: hourly).
  • Lexicon validation enforces MIME type and size at reference time, not upload time. app.bsky.embed.images requires image/* and ≤1,000,000 bytes per image.
  • When the last referencing record is deleted, the blob is GC'd. Deleted account → all blobs purged.
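Reference-time enforcement of MIME type and size can be sketched as a check of a blob reference against the constraints declared by the referencing lexicon field. The types below are hypothetical, and treating an empty accept list as "any type" is an assumption.

```rust
/// The parts of a blob reference relevant to constraint checking.
pub struct BlobRef {
    pub mime_type: String,
    pub size: u64,
}

/// Constraints a lexicon blob field may declare.
pub struct BlobConstraint {
    /// MIME patterns such as "image/*" or "image/png"; empty means any type.
    pub accept: Vec<String>,
    /// Maximum size in bytes, if declared.
    pub max_size: Option<u64>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum BlobError {
    MimeNotAccepted,
    TooLarge,
}

/// Matches a MIME type against an exact pattern or a "type/*" glob.
fn mime_matches(pattern: &str, mime: &str) -> bool {
    if pattern == "*/*" {
        return true;
    }
    match pattern.strip_suffix("/*") {
        Some(kind) => mime.split_once('/').map(|(t, _)| t) == Some(kind),
        None => pattern == mime,
    }
}

/// Runs when a record referencing the blob is written, not at upload time.
pub fn check_blob(blob: &BlobRef, constraint: &BlobConstraint) -> Result<(), BlobError> {
    let mime_ok = constraint.accept.is_empty()
        || constraint.accept.iter().any(|p| mime_matches(p, &blob.mime_type));
    if !mime_ok {
        return Err(BlobError::MimeNotAccepted);
    }
    if let Some(max) = constraint.max_size {
        if blob.size > max {
            return Err(BlobError::TooLarge);
        }
    }
    Ok(())
}
```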

Blobs in permissioned records: The Spaces Design Spec does not separately address blobs for permissioned records — implicitly, blobs uploaded via the existing com.atproto.repo.uploadBlob path are referenced by permissioned records via the same blob type. Access control is at the record layer, not the blob layer. A blob CID referenced from a permissioned record is still served by com.atproto.sync.getBlob to anyone who knows the CID. For applications that need blob-level access control, the recommended pattern is to encrypt the blob payload application-side; the PDS stores ciphertext.

4.5 Permissioned Records (com.atproto.space.*)

Per the Spaces Design Spec, the new procedure/query namespace mirrors com.atproto.repo.* but with a required space parameter:

Endpoint | Type | Auth | Description
com.atproto.space.createRecord | procedure | OAuth (member) | Create record in permissioned repo; appends oplog entry; triggers notifyWrite.
com.atproto.space.putRecord | procedure | OAuth | Idempotent upsert.
com.atproto.space.deleteRecord | procedure | OAuth |
com.atproto.space.applyWrites | procedure | OAuth | Atomic batch.
com.atproto.space.getRecord | query | Dual auth: user OAuth or SpaceCredential |
com.atproto.space.listRecords | query | Dual auth |

Every request includes a space parameter (the URI). After every write, the PDS performs a fire-and-forget notifyWrite to the space owner's PDS. The owner's PDS then relays via notifyWrite to each entry in space_credential_recipient (the syncing apps that have been issued credentials).

Write-time membership check: Per spec, the PDS does NOT enforce membership at write time. A user can write to any space URI on their own PDS; consumers will discover and ignore non-members at read time by checking the owner's member list. This is intentional — it keeps writes local and avoids a mandatory round-trip to the owner's PDS on every write.
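On the consumer side, the read-time check amounts to partitioning pulled writes by whether their author is on the owner's member list. A small sketch with hypothetical types, assuming the consumer has already fetched the member DIDs from the owner's PDS:

```rust
use std::collections::HashSet;

/// A write pulled from some PDS's oplog for a given space.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PulledWrite {
    pub author_did: String,
    pub collection: String,
    pub rkey: String,
}

/// Splits pulled writes into (accepted, ignored) using the owner's member list.
/// Writes from non-members are ignored rather than treated as errors, because
/// the writing PDS is allowed to accept them.
pub fn split_by_membership(
    members: &HashSet<String>,
    writes: Vec<PulledWrite>,
) -> (Vec<PulledWrite>, Vec<PulledWrite>) {
    writes
        .into_iter()
        .partition(|w| members.contains(&w.author_did))
}
```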

Authoritative-public references to permissioned records: Public records may strongRef permissioned records. The public reader sees the strongRef but cannot dereference (the reference returns 404/403 without space credentials). This is Nick's "authoritative reference to permissioned data" pattern, supported naturally by the spec.


5. Identity & DID Operations

5.1 did:plc

The PDS interacts with the PLC directory (plc.directory or a configurable endpoint via PDS_DID_PLC_URL) to:

  1. Genesis op: Compose {type: plc_operation, rotationKeys: [pds_rotation_key, optional_user_recovery_key], verificationMethods: {atproto: did:key:...}, alsoKnownAs: ["at://handle"], services: {atproto_pds: {type: AtprotoPersonalDataServer, endpoint: pds_url}}}, sign with rotation key, POST.
  2. Rotation: When updating handle, signing key, services, or rotation keys themselves. The prev field links to the previous op's CID.
  3. Tombstone: When deleting an account, optionally publish a tombstone op.
  4. Recovery: Within a 72-hour window after a rotation, a higher-priority rotation key can override.

The existing atproto-identity crate handles DID resolution, document parsing, P-256/K-256 keys; atproto-plc (in the workspace) handles operation construction. atproto-pds must extend these with the PDS-specific roles: signing PLC ops on behalf of users (gated by requestPlcOperationSignature token), submitting them, and caching DID documents.

5.2 did:web

Conformance: PDS must support hosting did:web:<pds-domain>:<account> style documents at /.well-known/did.json for managed subdomains, and must accept users whose DIDs are did:web they self-host. Not all reference PDSes are equal here:

  • TS reference PDS: limited did:web support, primarily for the service DID.
  • cocoon: full did:web + did:plc support.
  • tranquil-pds + vicwalker fork: full did:web with PDS-hosted subdomains and BYO domains.
  • rsky-pds: did:plc only.

atproto-pds should match cocoon/tranquil's superset.

5.3 Handle Resolution

Both methods required:

  1. DNS TXT at _atproto.<handle> containing did=did:plc:....
  2. HTTPS GET https://<handle>/.well-known/atproto-did returning the DID as plain text.

Conflict policy: per spec, both should agree; if they disagree, treat as unresolved. Indigo's resolver returns both; cocoon prefers DNS; TS PDS prefers DNS with HTTPS fallback. atproto-pds should query both in parallel and require agreement (configurable to allow either).
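The reconciliation step after both lookups return can be sketched as a pure function. Two assumptions are made here and are not stated in the text above: a method that returns nothing does not count as a disagreement, and the permissive policy prefers the DNS result when the two differ. The names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HandlePolicy {
    /// If both methods answer, they must return the same DID.
    RequireAgreement,
    /// If both methods answer differently, prefer DNS (assumption).
    AllowEither,
}

/// Combines the DNS TXT and HTTPS well-known results into one DID, if any.
pub fn reconcile_handle(
    dns: Option<&str>,
    https: Option<&str>,
    policy: HandlePolicy,
) -> Option<String> {
    match (dns, https) {
        (Some(a), Some(b)) if a == b => Some(a.to_string()),
        (Some(a), Some(_)) => match policy {
            HandlePolicy::RequireAgreement => None,
            HandlePolicy::AllowEither => Some(a.to_string()),
        },
        // Only one method produced an answer.
        (Some(a), None) | (None, Some(a)) => Some(a.to_string()),
        (None, None) => None,
    }
}
```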

5.4 Key Hierarchy

  • Rotation keys: held by PDS for managed accounts (and optionally by user as recovery key). K-256 or P-256. Used only for PLC ops.
  • Atproto signing key: per-account, held by PDS. K-256 or P-256. Signs commits, MemberGrants, and SpaceCredentials.
  • DPoP keys: per-OAuth-session, held by client. P-256 only (ES256). Bound to access tokens.
  • OAuth client authentication keys (confidential clients only): P-256, JWK published in client metadata.

atproto-pds should use atproto-identity::keys for all key handling. Hardware-backed key storage (HSM, secure enclave, KMS) should be a pluggable interface.

5.5 Permissioned Data Identity Hooks

The Spaces Design Spec uses fewer key types than the diary posts implied. Specifically:

  • There is no per-reader HMAC key. The earlier diary discussion of "per-reader keying" is not what the implementation does. Instead, each commit uses a fresh random IKM with HKDF deriving an HMAC key over SpaceContext. The HMAC tag travels with the commit. This gives deniability (a commit cannot be re-attributed without seeing the IKM) without per-reader keying.
  • Member commits and record commits use different scope values ('members' vs 'records') in SpaceContext, providing domain separation so a record commit cannot be replayed as a member commit.
  • MemberGrants and SpaceCredentials are signed with the user's atproto signing key, the same key used for public commits. No new signing key.
  • Owner-of-space DID resolution must be cached aggressively — every SpaceCredential verification requires resolving the owner's DID doc. The existing atproto-identity cache handles this, but TTLs should be tuned with stale-while-revalidate to avoid latency stalls.

6. Firehose / Event Stream

6.1 com.atproto.sync.subscribeRepos

WebSocket endpoint with framed CBOR messages (Header { op, t } + body). Required event types:

| Type | Purpose | Sync 1.1 status |
|---|---|---|
| #commit | Repo update with CAR diff, ops list, prevData, since, rev, seq, time, blocks | required |
| #sync | Force-set repo state without diff (recovery from drift) | required (Sync 1.1) |
| #identity | DID doc / handle change | required |
| #account | Account state change (active bool + status) | required |
| #info | Out-of-band info (e.g., OutdatedCursor) | required |
| #handle | Handle-only update | deprecated in favor of #identity |
| #migrate | Account moved | deprecated |
| #tombstone | Account deleted | deprecated in favor of #account |

6.2 Sequence & Cursor

seq is a strictly increasing 64-bit integer per PDS. Subscribers pass cursor=<seq> to resume; if the server has GC'd events past the cursor, it sends #info OutdatedCursor and disconnects. Outbox retention is implementation-defined; TS PDS keeps all events forever (limited by disk); cocoon supports configurable retention; tranquil retains 24h by default.
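The resume decision described above can be written as a pure function of the cursor and the outbox bounds. This is a sketch under assumptions: ResumePlan and plan_resume are hypothetical names, the cursor is taken to mean "last seq the subscriber has seen", the outbox is assumed to hold at least one event, and the FutureCursor branch is an addition for cursors ahead of anything emitted.

```rust
/// What the server should do for a new subscriber (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ResumePlan {
    /// No cursor supplied: attach at the live tail.
    LiveOnly,
    /// Replay retained events with seq greater than this value, then go live.
    ReplayAfter(u64),
    /// Events right after the cursor were already GC'd: send `#info OutdatedCursor`.
    OutdatedCursor,
    /// Cursor is ahead of the newest emitted seq.
    FutureCursor,
}

/// `oldest_retained` and `latest` are the smallest and largest seq still in the outbox.
pub fn plan_resume(cursor: Option<u64>, oldest_retained: u64, latest: u64) -> ResumePlan {
    match cursor {
        None => ResumePlan::LiveOnly,
        Some(c) if c > latest => ResumePlan::FutureCursor,
        // The next event the subscriber needs is c + 1; if that is gone, it has a gap.
        Some(c) if c.saturating_add(1) < oldest_retained => ResumePlan::OutdatedCursor,
        Some(c) => ResumePlan::ReplayAfter(c),
    }
}
```

A cursor equal to `oldest_retained - 1` still resumes cleanly, because the first event the subscriber is missing is the oldest one retained.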

6.3 Sync 1.1 Inductive Firehose

Per bluesky-social/proposals/0006-sync-iteration and the docs.bsky.app/blog/relay-sync-updates post:

  • #commit includes prevData (prior MST root CID).
  • Each repoOp includes prev (previous record CID) for updates and deletes.
  • The CAR slice contains exactly the blocks needed to invert the operations and verify against prevData.
  • Subscribers can validate signatures + structure without retaining repo state, using only the prior prevData.

atproto-pds should emit fully-conformant Sync 1.1 events from day one.

6.4 Outbox & Backpressure

The TS PDS sequencer is a single-writer SQLite outbox table. Cocoon uses a similar pattern with PostgreSQL LISTEN/NOTIFY. rsky-pds uses Postgres polling. For low latency in Rust:

  • Use a tokio broadcast channel for live subscribers fed by the writer transaction commit hook.
  • Use Fjall or a rolling SQLite file for durable outbox (cursor-resumable).
  • Implement per-subscriber bounded queues; drop slow consumers with ConsumerTooSlow (sent as an error frame followed by close; the subscribeRepos lexicon lists it as an error rather than an #info name).
  • A relay subscribing for backfill should be served paginated CAR slices via getRepo with since=<rev> first, then attached to the live stream.
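The bounded-queue-and-drop behaviour from the list above can be illustrated without any async runtime. The sketch below deliberately swaps in std::sync::mpsc::sync_channel for the tokio broadcast channel so that it stays dependency-free; Fanout and its methods are hypothetical names, and a real implementation would send the ConsumerTooSlow frame before closing the socket of each dropped subscriber.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Fans events out to subscribers, each with its own bounded queue.
pub struct Fanout<T: Clone> {
    capacity: usize,
    next_id: u64,
    subscribers: Vec<(u64, SyncSender<T>)>,
}

impl<T: Clone> Fanout<T> {
    /// `capacity` is the per-subscriber queue depth; it should be at least 1.
    pub fn new(capacity: usize) -> Self {
        Fanout { capacity, next_id: 0, subscribers: Vec::new() }
    }

    /// Registers a subscriber and returns its id plus the receiving end.
    pub fn subscribe(&mut self) -> (u64, Receiver<T>) {
        let (tx, rx) = sync_channel(self.capacity);
        let id = self.next_id;
        self.next_id += 1;
        self.subscribers.push((id, tx));
        (id, rx)
    }

    /// Delivers `event` without blocking. Subscribers whose queue is full
    /// (too slow) or whose receiver is gone are removed; their ids are returned.
    pub fn publish(&mut self, event: T) -> Vec<u64> {
        let mut dropped = Vec::new();
        self.subscribers.retain(|(id, tx)| match tx.try_send(event.clone()) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
                dropped.push(*id);
                false
            }
        });
        dropped
    }

    pub fn subscriber_count(&self) -> usize {
        self.subscribers.len()
    }
}
```

The writer never blocks on a slow reader: `try_send` either enqueues or reports the queue as full, which is the signal to disconnect that consumer.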

6.5 Permissioned Data and the Firehose

Permissioned writes are not on the public firehose. This is one of the most important architectural facts in the Spaces Design Spec. The earlier diary speculation about omission/redaction/cipher modes does not appear in the implementation design. Instead:

  • A permissioned write triggers no #commit event.
  • Instead, the member's PDS does fire-and-forget notifyWrite to the space owner's PDS.
  • The space owner's PDS looks up space_credential_recipient and relays notifyWrite to each registered syncing app.
  • Each syncing app then pulls the actual ops via getRepoOplog against the member's PDS, presenting its SpaceCredential.

This is a push-then-pull model. The push (notifyWrite) is a low-cost notification; the pull (getRepoOplog) carries the actual data and is gated by SpaceCredential verification. The firehose is reserved for the public realm.

Spec open questions on this path:

  • notifyWrite fan-out failure: What happens when the space owner can't reach a syncing app? Currently fire-and-forget per spec; may need retry/backoff. atproto-pds should implement bounded retry with exponential backoff and a dead-letter log so the operator can see persistent push failures.
  • Service endpoint discovery: How does the space owner learn a syncing app's notification endpoint? The spec flags this TBD — either resolved from the app's DID doc or provided in the SpaceCredential request. atproto-pds should support both: prefer the app's DID doc service entry, fall back to a value provided at credential-issuance time.
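The bounded retry with exponential backoff proposed for notifyWrite fan-out might look like the following. It is a sketch under assumptions: RetryPolicy, NextStep, and after_failure are hypothetical names, the delay doubles after each failure, and the max_backoff cap is an addition that the spec does not mention.

```rust
use std::time::Duration;

/// Retry settings for outbound notifications (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct RetryPolicy {
    /// Total delivery attempts before giving up.
    pub max_attempts: u32,
    /// Delay before the second attempt.
    pub initial_backoff: Duration,
    /// Upper bound on any single delay (an assumption, not from the spec).
    pub max_backoff: Duration,
}

#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    /// Wait this long, then try again.
    RetryAfter(Duration),
    /// Attempts exhausted: record the notification in the dead-letter log.
    DeadLetter,
}

impl RetryPolicy {
    /// `failed_attempts` is how many attempts have failed so far (1 after the first failure).
    pub fn after_failure(&self, failed_attempts: u32) -> NextStep {
        if failed_attempts >= self.max_attempts {
            return NextStep::DeadLetter;
        }
        // Delay doubles with each failure: initial, 2x, 4x, ...
        let exponent = failed_attempts.saturating_sub(1);
        let factor = 1u32.checked_shl(exponent).unwrap_or(u32::MAX);
        let delay = self.initial_backoff.saturating_mul(factor).min(self.max_backoff);
        NextStep::RetryAfter(delay)
    }
}
```

With five attempts and a one-second initial delay, the waits are 1s, 2s, 4s, and 8s, after which the notification is dead-lettered for the operator to inspect.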

atproto-pds should NOT introduce a community.lexicon.space.subscribeSpace WebSocket unless and until the spec adds one. The push-then-pull model is the official direction.


7. Sync Protocol

7.1 Public Sync Endpoints

| Endpoint | Auth | Behavior |
|---|---|---|
| getRepo | none | Stream full repo or since-cursor diff as CARv1. |
| getRepoStatus | none | {did, active, status, rev} |
| getLatestCommit | none | {cid, rev} |
| getRecord | none | CAR containing the record + MST proof path |
| getBlocks | none | CAR for arbitrary CID list (within repo) |
| listRepos | none | Paginated DIDs hosted with rev |
| listReposByCollection | none | (Sync 1.1, optional) Paginated DIDs that contain records of an NSID |
| listBlobs | auth (PDS) / none (other hosts vary) | CIDs of blobs for a DID |
| getBlob | none | Raw blob bytes |
| requestCrawl | none | Tells a relay to start crawling this PDS |
| notifyOfUpdate | none (deprecated) | Hint for legacy relays |
| subscribeRepos | none (WS) | See §6 |

7.2 CAR Streaming

All CAR-producing endpoints must stream chunked, not buffer-all. For a full repo of 1M records this is the difference between feasible and OOM.

7.3 Permissioned Sync (com.atproto.space.*)

Per the Spaces Design Spec, sync is oplog + setHash, not CAR. Endpoints:

| Endpoint | Called on | Auth | Description |
|---|---|---|---|
| getRepoState | Member's PDS | SpaceCredential | Current {setHash, rev} for this member's permissioned repo. |
| getRepoOplog | Member's PDS | SpaceCredential | Oplog ops after the rev given in since, plus current {setHash, rev}. |
| getMemberState | Owner's PDS | SpaceCredential | Current {setHash, rev} for member list. |
| getMemberOplog | Owner's PDS | SpaceCredential | Member-list oplog ops after the rev given in since, plus current {setHash, rev}. |

Sync algorithm (apps):

  1. App holds SpaceCredential.
  2. For each member DID in the latest member list, app calls getRepoOplog?since=<last_seen_rev> on that member's PDS.
  3. PDS returns ops [{rev, idx, action, collection, rkey, cid, prev}, ...] plus current {setHash, rev}.
  4. App replays ops locally. For create/update ops, app calls getRecord on the member's PDS to fetch record content.
  5. App computes its local SetHash and compares to the returned setHash.
  6. If mismatch: full resync via listRecords, recompute from scratch.

Why the mismatch fallback matters: SetHash is order-independent (XOR/ECMH/ltHash all are), so the protocol survives op reordering, but if the PDS has compacted oplog entries beyond the app's since cursor, the app's replay will be incomplete. The setHash mismatch is the trigger for full resync.
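Steps 3 to 6 of the sync algorithm can be sketched as a replay over a local map followed by a digest comparison. This is an illustration under assumptions: Op, LocalRepo, replay, local_set_hash, and needs_full_resync are hypothetical names, the element hash is passed in by the caller (SHA-256 in the placeholder scheme) so the example needs no crypto dependency, rev strings are assumed to sort lexicographically, and the getRecord fetch from step 4 is omitted.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action { Create, Update, Delete }

/// One oplog entry as returned by getRepoOplog (fields trimmed for the sketch).
#[derive(Debug, Clone)]
pub struct Op {
    pub rev: String,
    pub idx: u32,
    pub action: Action,
    pub collection: String,
    pub rkey: String,
    pub cid: Option<String>, // assumed absent for deletes
}

/// Local view of one member's permissioned repo: (collection, rkey) -> cid.
pub type LocalRepo = BTreeMap<(String, String), String>;

/// Applies ops in (rev, idx) order.
pub fn replay(repo: &mut LocalRepo, ops: &[Op]) {
    let mut ordered: Vec<&Op> = ops.iter().collect();
    ordered.sort_by(|a, b| (a.rev.as_str(), a.idx).cmp(&(b.rev.as_str(), b.idx)));
    for op in ordered {
        let key = (op.collection.clone(), op.rkey.clone());
        match (op.action, &op.cid) {
            (Action::Delete, _) => { repo.remove(&key); }
            (_, Some(cid)) => { repo.insert(key, cid.clone()); }
            (_, None) => {} // malformed create/update: ignored in this sketch
        }
    }
}

/// XOR of the element digests, using the "collection/rkey:cid" element format.
pub fn local_set_hash<D: Fn(&[u8]) -> [u8; 32]>(repo: &LocalRepo, digest: D) -> [u8; 32] {
    let mut acc = [0u8; 32];
    for ((collection, rkey), cid) in repo {
        let element = format!("{}/{}:{}", collection, rkey, cid);
        let h = digest(element.as_bytes());
        for (a, b) in acc.iter_mut().zip(h.iter()) {
            *a ^= *b;
        }
    }
    acc
}

/// True when the replayed state does not match the PDS's setHash (step 6).
pub fn needs_full_resync<D: Fn(&[u8]) -> [u8; 32]>(
    repo: &LocalRepo,
    remote_set_hash: &[u8],
    digest: D,
) -> bool {
    local_set_hash(repo, digest).as_slice() != remote_set_hash
}
```

Because the digest is order-independent, the comparison says nothing about which op was missed; it only signals that the local state has diverged and a listRecords rebuild is needed.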

Authoritative-public-record reads from permissioned context: When a permissioned record strongRefs a public record, the app already has access to the public record via normal com.atproto.repo.getRecord. No special handling.

Authoritative-permissioned-record reads from public context: When a public record strongRefs a permissioned record, public readers (without SpaceCredential) cannot dereference. The PDS returns 404 or a structured "permissioned" error; atproto-pds should standardize on RecordPermissioned as the error name and document it for upstream feedback.

7.4 Spec Open Questions (Sync)

The Spaces Design Spec flags:

  • Oplog retention: vague "backfill window." atproto-pds should default to retain-forever and expose admin compaction.
  • Credential expiration: 2–4 hours, exact default TBD. atproto-pds should default to 3 hours, configurable.

8. XRPC Server

8.1 Routing

All XRPC endpoints under /xrpc/{nsid}. Procedures use POST with JSON or CBOR body; queries use GET with query string params. Subscriptions use WebSocket GET with CBOR-framed messages. The existing atproto-xrpcs crate provides the framework: axum routing, JWT extractors, DID resolution. It needs PDS-specific extensions:

  • DPoP nonce issuance and validation (DPoP-Nonce header rotation per RFC 9449).
  • Service auth verification with lxm matching.
  • App-password JWT validation.
  • SpaceCredential JWT validation (new — see §15).
  • MemberGrant JWT validation (new — see §15).
  • Rate limit middleware.
  • Lexicon-driven request/response validation.

8.2 Content Negotiation

Request and response bodies are either CBOR (application/cbor, or application/vnd.ipld.car for CAR files) or JSON (application/json). Subscriptions use CBOR-only frames.

8.3 Authentication Middleware Stack

The middleware must:

  1. Extract the Authorization header.
  2. Detect token type by JWT typ header: at+jwt (OAuth), dpop+jwt (DPoP proof), space_member_grant, space_credential, or unmarked (App Password / service auth).
  3. Route to the appropriate verifier:
    • OAuth: verify DPoP header, JWT thumbprint match, nonce, JWT signature against the PDS auth key.
    • App Password: signed with PDS JWT secret.
    • Service auth: signed with sender DID's signing key, verify via DID doc, check lxm and aud.
    • MemberGrant: signed with member's atproto signing key, aud is space owner DID, lxm=com.atproto.space.getSpaceCredential, clientId matches the requesting app.
    • SpaceCredential: signed with space owner's atproto signing key, space matches the requested resource.
  4. Construct an AuthContext { did, scope, app_password_id?, oauth_session_id?, service_audience?, space?, client_id? }.
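Step 2 of the middleware, routing on the JWT typ header, reduces to a small classifier. The sketch assumes the typ value has already been extracted from the decoded JWT header; TokenKind and classify_typ are hypothetical names, and treating a plain "JWT" typ the same as a missing one is an assumption rather than something the spec states.

```rust
/// Which verifier a bearer token should be routed to (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    OAuthAccess,     // typ = at+jwt
    DpopProof,       // typ = dpop+jwt
    MemberGrant,     // typ = space_member_grant
    SpaceCredential, // typ = space_credential
    /// No typ (or a generic one): app password or service auth, decided later
    /// by inspecting claims such as `lxm` and `aud`.
    Unmarked,
    /// A typ this server does not understand; reject the request.
    Unknown,
}

/// Classifies a token by the `typ` value from its JWT header.
pub fn classify_typ(typ: Option<&str>) -> TokenKind {
    // Media-type style values are compared case-insensitively.
    let normalized = typ.map(|t| t.trim().to_ascii_lowercase());
    match normalized.as_deref() {
        None | Some("") | Some("jwt") => TokenKind::Unmarked,
        Some("at+jwt") => TokenKind::OAuthAccess,
        Some("dpop+jwt") => TokenKind::DpopProof,
        Some("space_member_grant") => TokenKind::MemberGrant,
        Some("space_credential") => TokenKind::SpaceCredential,
        Some(_) => TokenKind::Unknown,
    }
}
```

Each variant would map onto one of the composable axum extractors, so an unknown typ fails fast before any signature work is attempted.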

The Spaces Design Spec explicitly notes the PDS auth verifier gets new paths beyond the existing OAuth / service auth verifiers. atproto-pds should implement these as separate axum extractors that compose.

8.4 Rate Limiting

TS PDS uses an in-memory token bucket per IP + DID. Tranquil-pds supports distributed rate limiting via Valkey. atproto-pds should ship in-memory by default and an optional Redis/Valkey backend, with configurable limits per endpoint family (write, read, sync, OAuth, space writes, space reads).

Permissioned-realm rate limits should be tunable separately because their access pattern is different — frequent getRepoOplog polling from many syncing apps can saturate a member's PDS if not bounded.
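An in-memory token bucket of the kind described above can be sketched in a few lines. This is not the TS PDS implementation: TokenBucket is a hypothetical type, and the clock is passed in explicitly so the behaviour can be tested deterministically.

```rust
use std::time::Instant;

/// A single bucket; a limiter would keep one per (IP, DID, endpoint family) key.
pub struct TokenBucket {
    capacity: f64,
    refill_per_sec: f64,
    tokens: f64,
    last: Instant,
}

impl TokenBucket {
    /// Starts full. `capacity` is the burst size, `refill_per_sec` the sustained rate.
    pub fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        let capacity = f64::from(capacity);
        TokenBucket { capacity, refill_per_sec, tokens: capacity, last: now }
    }

    /// Takes one token if available; returns false when the caller should get HTTP 429.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        // Refill in proportion to elapsed time, never above capacity.
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        if now > self.last {
            self.last = now;
        }
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Giving space reads their own bucket parameters is what lets getRepoOplog polling be bounded separately from ordinary repo traffic.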

8.5 Errors

Per atproto convention: HTTP 400/401/403/429/500 with JSON {error: "ErrorName", message: "Human description"}. Lexicons enumerate possible error names. atproto-xrpcs already defines this shape.

New error names introduced by the Spaces Design Spec (anticipated): SpaceNotFound, NotSpaceMember, NotSpaceOwner, InvalidSpaceCredential, InvalidMemberGrant, OplogGap (returned by getRepoOplog if since is older than retained oplog).

8.6 Endpoint Inventory (com.atproto.* minimum for PDS)

server.*: createSession, refreshSession, deleteSession, getSession, createAccount, deleteAccount, requestAccountDelete, activateAccount, deactivateAccount, checkAccountStatus, describeServer, getServiceAuth, requestEmailUpdate, confirmEmail, updateEmail, requestPasswordReset, resetPassword, requestEmailConfirmation, createInviteCode, createInviteCodes, getAccountInviteCodes, createAppPassword, listAppPasswords, revokeAppPassword, reserveSigningKey.

identity.*: resolveHandle, updateHandle, getRecommendedDidCredentials, requestPlcOperationSignature, signPlcOperation, submitPlcOperation, resolveDid, resolveIdentity, refreshIdentity.

repo.*: createRecord, putRecord, deleteRecord, applyWrites, getRecord, listRecords, describeRepo, uploadBlob, importRepo, listMissingBlobs.

sync.*: getRepo, getRepoStatus, getLatestCommit, getRecord, getBlocks, getBlob, listBlobs, listRepos, listReposByCollection, requestCrawl, notifyOfUpdate (legacy), subscribeRepos (WS).

moderation.*: createReport (typically forwarded to moderation service via service auth).

admin.* (auth=admin): getAccountInfo, getAccountInfos, searchAccounts, disableAccountInvites, enableAccountInvites, disableInviteCodes, getInviteCodes, getSubjectStatus, updateSubjectStatus, sendEmail, updateAccountEmail, updateAccountHandle, updateAccountPassword, deleteAccount.

space.* (per Spaces Design Spec):

  • Record CRUD (member's PDS, OAuth): createRecord, putRecord, deleteRecord, applyWrites, getRecord (dual-auth), listRecords (dual-auth).
  • Space management (owner's PDS, OAuth): createSpace, getSpace, listSpaces, addMember, removeMember, getMembers.
  • Credential flow: getMemberGrant (member's PDS), getSpaceCredential (owner's PDS).
  • Sync (SpaceCredential): getRepoState, getRepoOplog (member's PDS), getMemberState, getMemberOplog (owner's PDS).
  • Notifications (service auth): notifyMembership (owner → member), notifyWrite (member → owner → syncing apps; same endpoint on both relay hops).

8.7 Permissioned Data XRPC Hooks

  • The space parameter is required on every com.atproto.space.* write/read endpoint; the auth middleware must verify the OAuth scope grants access to that specific space (or that a SpaceCredential covers it).
  • The getRecord/listRecords endpoints under com.atproto.space.* are explicitly dual auth per spec: either user OAuth (own PDS, own data) or SpaceCredential (remote app, syncing). The middleware must detect and route accordingly.
  • notifyWrite is the same NSID on both directions of relay (member→owner and owner→syncing-app). The PDS must implement both the server side (receive notifyWrite, dispatch to internal subscribers) and the client side (send notifyWrite outbound after writes). Same for notifyMembership.

9. OAuth Provider

(Nick has authored AIP and is comfortable here; this section calls out PDS-specific points.)

9.1 Mandatory Mechanisms

  • PAR (RFC 9126) at /oauth/par. Required.
  • PKCE (RFC 7636) with S256. Required.
  • DPoP (RFC 9449) for both Authorization Server and Resource Server requests, including server-issued nonces (DPoP-Nonce header) and per-request rotation. ES256 (P-256) only.
  • Confidential client authentication via private_key_jwt (client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer).
  • Public client support for browser SPAs and native mobile.

9.2 Discovery Documents

  • /.well-known/oauth-authorization-server (RFC 8414): issuer, authorization_endpoint, token_endpoint, par endpoint, jwks_uri, supported scopes, supported challenge methods (S256 only), supported DPoP signing alg values (ES256).
  • /.well-known/oauth-protected-resource (RFC 9728): resource (the PDS URL), authorization_servers (often the same PDS, or a separate entryway).

9.3 Client Metadata

Clients are identified by client_id = a fully-qualified https:// URL pointing to a JSON metadata document. The PDS fetches this at PAR time, validates redirect_uris, application_type, grant_types, response_types=["code"], dpop_bound_access_tokens=true, scope (must include atproto), and (for confidential clients) jwks or jwks_uri.

9.4 Authorization Endpoint

/oauth/authorize — accepts request_uri from PAR. Renders consent UI. The UI must:

  • Show client name, logo, requested scopes (with human-readable permission set descriptions).
  • Show the authenticated identity (login if needed).
  • Allow approve/deny.
  • Issue authorization code on approve, redirect to client's redirect_uri with code, state, iss.

9.5 Token Endpoint

/oauth/token. Grant types: authorization_code and refresh_token. Refresh tokens are single-use: each refresh issues a new refresh token and invalidates the previous one. Access tokens are JWTs signed by the PDS with claims iss, aud=did:web:pds.example.com, sub=did:plc:..., client_id, scope, cnf.jkt=<dpop-thumbprint>, iat, and exp (15-30 min lifetime recommended).

9.6 Revocation

/oauth/revoke per RFC 7009. Revoke refresh and bound access tokens.

9.7 Scopes and Permission Sets

Per atproto.com/specs/permission:

  • atproto — required base scope (proves identity, no other privileges).
  • transition:generic — broad legacy-equivalent scope (post, like, follow, etc.).
  • transition:chat.bsky — DM access.
  • Granular: repo:<nsid>?action=..., rpc:<lxm>?aud=..., blob:<mime>/<size>, account?attr=email&action=read, etc.
  • Permission sets: include:<nsid-of-set> resolves a published lexicon permission-set into multiple granular permissions.

The PDS must:

  • Resolve permission sets via the lexicon resolution system (24h stale, 90d expiration recommended).
  • Cache resolved sets with stale-while-revalidate.
  • Re-compute scopes on token refresh (so lexicon updates propagate without re-consent).

9.8 Permissioned Data Scopes

The Spaces Design Spec does not yet finalize OAuth scope strings for spaces, but the implementation pattern implies them clearly. Candidate scope syntax (to propose upstream if not already settled):

  • space:<spaceType>?action=read — read access to spaces of a given type (the user's own permissioned repos for that NSID).
  • space:<spaceType>?action=write — write access.
  • space:<spaceType>?action=manage — for space owners only: create/delete spaces, manage member list.
  • rpc:com.atproto.space.getMemberGrant?aud=<owner-did> — permission to mint member grants for a specific space owner.

The consent UI must surface space-typed scope grants as "Read your group records" / "Write to your group records" rather than raw NSIDs. This is a UX requirement implicit in the spec's emphasis on user understanding.
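To make the candidate syntax concrete, here is a sketch of parsing a space scope and mapping it to consent text. The scope grammar is the proposal above, not a finalized spec; SpaceScope, SpaceAction, parse_space_scope, and consent_label are hypothetical names, and the wording of the "manage" label is invented for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpaceAction { Read, Write, Manage }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpaceScope {
    pub space_type: String, // an NSID such as app.bsky.group
    pub action: SpaceAction,
}

/// Parses `space:<spaceType>?action=<read|write|manage>`; returns None for anything else.
pub fn parse_space_scope(scope: &str) -> Option<SpaceScope> {
    let rest = scope.strip_prefix("space:")?;
    let (space_type, query) = rest.split_once('?')?;
    // Minimal sanity check only; full NSID validation is out of scope here.
    if space_type.is_empty() || !space_type.contains('.') {
        return None;
    }
    let raw_action = query.split('&').find_map(|pair| pair.strip_prefix("action="))?;
    let action = match raw_action {
        "read" => SpaceAction::Read,
        "write" => SpaceAction::Write,
        "manage" => SpaceAction::Manage,
        _ => return None,
    };
    Some(SpaceScope { space_type: space_type.to_string(), action })
}

/// Human-readable consent text, so the UI never shows a raw NSID.
pub fn consent_label(scope: &SpaceScope) -> &'static str {
    match scope.action {
        SpaceAction::Read => "Read your group records",
        SpaceAction::Write => "Write to your group records",
        SpaceAction::Manage => "Manage your groups and their members", // illustrative wording
    }
}
```

A real consent screen would choose the noun ("group") from the space type's lexicon description rather than hard-coding it.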

9.9 Authorization UI

PDS-hosted; should be a server-rendered template (Askama or similar in Rust) for security (no JS injection surface for credentials). cocoon and tranquil-pds both ship Svelte/HTML UIs; tranquil's is the most polished. atproto-pds should keep this minimal but extensible.

9.10 Conformance Notes

  • TS PDS: complete OAuth implementation, reference for spec.
  • cocoon: complete, includes JWKs in metadata fix branch (hailey/support-jwks-in-metadata).
  • rsky-pds: partial, in flux.
  • tranquil-pds: complete + adds 2FA/WebAuthn/passkeys (superset).
  • atproto-pds should match cocoon + tranquil baseline.

10. Moderation & Safety

10.1 Account Takedown

com.atproto.admin.updateSubjectStatus with subject={did}, takedown={applied: true, ref: "..."}. The PDS must:

  • Block all writes from that account.
  • Block public reads of repo/blobs (return AccountTakedown error).
  • Emit #account active=false status=takendown on firehose.
  • Optionally notify upstream relays.

10.2 Record-Level Takedown

Same endpoint with subject={uri, cid}. PDS hides the record from getRecord, omits from listRecords (or returns with takedown flag), but the record technically remains in the repo. getRepo includes it (so relays can still verify the chain) but a separate label/status indicates suppression. This is implementation-divergent — TS PDS removes from the firehose path entirely on takedown; cocoon hides on read; rsky-pds is selective.

10.3 Email/Handle Banning

Admin-configurable blocklist for new account creation. All five reference implementations support this.

10.4 Reports

com.atproto.moderation.createReport is per-spec a forwarded call: the PDS proxies to a configured moderation service (Ozone) via service auth. Cocoon explicitly comments that this should be proxied, not implemented locally. atproto-pds follows this: a moderation service DID must be configured (PDS_REPORT_SERVICE_DID in §12.1), and reports forward via Atproto-Proxy.

10.5 Label Subscription

A PDS may subscribe to one or more labelers (e.g., Bluesky's Ozone) and propagate labels to clients via app.bsky.actor.getProfile-style hydration. This is optional; most PDSes do not do it directly (it's done by AppView). atproto-pds should support a LabelService pluggable interface for future use.

10.6 Permissioned Data Moderation

The Spaces Design Spec is silent on moderation of permissioned content — explicitly out of scope ("Out of scope: application coordination"). Realistic stance for atproto-pds:

  • Account-level takedown of a space owner: blocks all com.atproto.space.* operations against that owner's PDS, including SpaceCredential issuance. Existing members lose access.
  • Account-level takedown of a member: their permissioned writes still happen on their own PDS, but notifyWrite to the owner's PDS will be rejected (the member is taken down), so the data is effectively orphaned. Apps will not pull from them.
  • Record-level takedown of permissioned records: requires admin override. Since the space is access-controlled, the PDS admin needs to either be in the member list (default-deny-friendly admin account) or use a new admin-bypass auth path. atproto-pds should expose com.atproto.admin.takedownSpaceRecord (provisional) as an admin-authenticated endpoint that suppresses a record from space_record reads regardless of credentials, and audits the action.
  • Spam detection in permissioned spaces requires PDS-local heuristics (write rate, fan-out patterns) since AppViews can't see content. Recommended: opt-in admin telemetry on space write rates, no content inspection by default.
  • Reports: com.atproto.moderation.createReport continues to work, but the moderation service may be unable to view the reported permissioned content without space credentials. The PDS should attach a SpaceCredential to the forwarded report iff the reporter is willing to grant it (this is tricky and probably needs upstream design — flag for spec feedback).

11. Email & Notifications

11.1 SMTP

Required for verification, password reset, PLC operation signature requests, account migration confirmations, account takedown notices. TS PDS uses nodemailer; cocoon uses a Go SMTP client; tranquil-pds supports SMTP plus discord/telegram/signal.

atproto-pds should use lettre and ship templates for: verify_email, confirm_email_update, reset_password, confirm_account_delete, plc_op_signature, account_migration_initiated, takedown_notice. A pluggable Mailer trait allows non-SMTP backends.

11.2 Bounces

Optional but recommended: SES/SendGrid webhook ingestion to mark addresses undeliverable and prevent further send attempts.

11.3 Permissioned Data Notifications

notifyWrite and notifyMembership (Spaces Design Spec) are inter-service notifications, not user-facing emails. They are HTTPS POSTs with service-auth JWTs. atproto-pds should treat them as a separate subsystem from email (the Notifier trait), with their own retry/backoff and dead-letter logging.

A user-facing "you were added to / removed from a space" notification (email or app push) is not in the spec but is good UX. atproto-pds should optionally trigger an email on notifyMembership ingest, behind a config flag.


12. Configuration & Deployment

12.1 Configuration Sources

Precedence (matching tranquil): env vars > --config <path> > /etc/atproto-pds/config.toml > built-in defaults. Required envvars:

  • PDS_HOSTNAME, PDS_SERVICE_DID (commonly did:web:<hostname>).
  • PDS_DATA_DIRECTORY and storage URLs (PDS_DB_URL, PDS_BLOCK_STORE_URL, PDS_BLOB_STORE_URL).
  • PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX (or P-256 variant), PDS_REPO_SIGNING_KEY_K256_PRIVATE_KEY_HEX (or per-account).
  • PDS_JWT_SECRET, PDS_OAUTH_KEY_JWK.
  • PDS_DID_PLC_URL, PDS_BSKY_APP_VIEW_URL, PDS_BSKY_APP_VIEW_DID, PDS_REPORT_SERVICE_URL, PDS_REPORT_SERVICE_DID.
  • PDS_CRAWLERS — comma-separated relay URLs to requestCrawl after each commit.
  • PDS_INVITE_REQUIRED, PDS_INVITE_INTERVAL.
  • PDS_EMAIL_SMTP_URL, PDS_EMAIL_FROM_ADDRESS.
  • PDS_ADMIN_PASSWORD.
  • PDS_SERVICE_HANDLE_DOMAINS (e.g., .pds.example.com).
  • PDS_SPACE_CREDENTIAL_TTL_SECONDS (default 10800 = 3h, per spec's 2–4h window).
  • PDS_SPACE_OPLOG_RETENTION_DAYS (default unlimited; admin compaction available).
  • PDS_SPACE_NOTIFY_RETRY_MAX_ATTEMPTS (default 5), PDS_SPACE_NOTIFY_RETRY_INITIAL_BACKOFF_MS (default 1000).
  • TLS-related settings are optional (a reverse proxy is expected to terminate TLS).
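The precedence chain can be modelled as an ordered list of key/value layers, highest priority first. This is a sketch under assumptions: Layer, lookup, and space_credential_ttl_seconds are hypothetical names, and it assumes config-file keys have already been normalized to the same names as the environment variables.

```rust
use std::collections::HashMap;

/// One configuration source, already flattened to key/value strings.
pub type Layer = HashMap<String, String>;

/// Returns the value from the first layer that defines `key`.
/// Callers pass layers highest precedence first:
/// env vars, then the --config file, then /etc/atproto-pds/config.toml.
pub fn lookup<'a>(key: &str, layers: &[&'a Layer]) -> Option<&'a str> {
    layers
        .iter()
        .copied()
        .find_map(|layer| layer.get(key).map(|value| value.as_str()))
}

/// Reads PDS_SPACE_CREDENTIAL_TTL_SECONDS, falling back to the built-in 3h default.
pub fn space_credential_ttl_seconds(layers: &[&Layer]) -> Result<u64, String> {
    match lookup("PDS_SPACE_CREDENTIAL_TTL_SECONDS", layers) {
        Some(raw) => raw
            .parse::<u64>()
            .map_err(|e| format!("invalid PDS_SPACE_CREDENTIAL_TTL_SECONDS: {}", e)),
        None => Ok(10_800),
    }
}
```

A malformed value in a higher-precedence layer is reported as an error rather than silently falling through to a lower layer, which keeps misconfiguration visible at startup.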

12.2 Reverse Proxy

PDS expects to be fronted by a reverse proxy (Caddy, nginx, Traefik) handling TLS, large request bodies (>1GB for importRepo), and WebSocket upgrade (subscribeRepos). Required proxied paths if running multiple services on one domain: /xrpc/*, /.well-known/atproto-did, /.well-known/oauth-protected-resource, /.well-known/oauth-authorization-server, /.well-known/did.json, /oauth/*.

12.3 Backups

  • SQLite/Fjall: snapshot to S3 hourly; tranquil-pds does this in-process.
  • Postgres: external pg_dump or provider-managed backups.
  • Blobs: S3 versioning + replication.
  • The PDS should expose an admin /admin/backup endpoint to trigger snapshots.
  • Per-actor SQLite is critical for permissioned data backups: a single account's full state (public + all spaces) lives in one DB file, simplifying export and migration scenarios.

12.4 Observability

  • /xrpc/_health returns {version, status: "ok"}.
  • Prometheus metrics on a separate port (e.g., 2471), at /metrics.
  • Structured tracing via tracing + tracing-subscriber with OpenTelemetry export.
  • Required metrics: request rate by NSID, p50/p99 latency, write transaction time, MST node cache hit rate, firehose subscriber count, lag.
  • Permissioned-realm metrics: spaces owned, spaces joined, oplog write rate per space, notifyWrite retries / DLQ depth, SpaceCredential issuance rate, oplog gap fallback rate.

12.5 Multi-Tenancy

The TS reference PDS hosts thousands of accounts on one process; per-account SQLite scales acceptably until OS file handle limits. atproto-pds with Fjall could go further by sharing one keyspace partitioned by DID. For very large deployments, sharded multi-process via consistent hashing on DID is the path; this is what Bluesky's mushroom PDSes do.


13. Crypto Requirements

| Algorithm | Use |
|---|---|
| K-256 (secp256k1, ES256K) | atproto signing keys (default), PLC rotation keys, MemberGrant + SpaceCredential signing |
| P-256 (secp256r1, ES256) | atproto signing keys (alt), PLC rotation keys, OAuth client keys, DPoP keys (mandatory) |
| Ed25519 | NOT supported in atproto |
| SHA-256 | Commit hashing, blob CID, PKCE, DPoP thumbprints, SetHash element hashing |
| Argon2id | Password storage (recommended) |
| HKDF-SHA-256 | Permissioned-repo commit HMAC key derivation (per-commit random IKM, SpaceContext as info) |
| HMAC-SHA-256 | Permissioned-repo commit authentication tag |
| ECMH or ltHash (future) | SetHash production primitive (replaces XOR-SHA256 placeholder) |

Low-S signature normalization required. Use k256 and p256 Rust crates (already used in atproto-identity). atproto-record handles repo commit signing.

The Spaces Design Spec's commit construction:

ikm := random(32 bytes)                         // per-commit, fresh
hmac_key := HKDF-Extract-then-Expand(ikm, info=SpaceContext-cbor)
tag := HMAC-SHA-256(hmac_key, setHash || rev)
sig := Sign-ECDSA(user_signing_key, tag || rev)
commit := { setHash, rev, ikm, tag, sig }

The IKM is included in the commit so a verifier with the relevant SpaceContext can recompute and check; outside that context, the IKM is meaningless. This is the deniability mechanism.


14. Lexicon Handling

14.1 Resolution

Lexicons are resolved by:

  1. Built-in: shipped with the PDS for com.atproto.*, app.bsky.*, etc., including the new com.atproto.space.* lexicons once published.
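  2. Network: NSID com.example.foo → DNS TXT _lexicon.example.com (the NSID with its final name segment dropped and the remaining authority reversed) → resolves to a DID → PDS of that DID → getRecord for com.atproto.lexicon.schema/com.example.foo (the record key is the full NSID).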
  3. Cache: 24h stale, 90d expiry recommended.

The existing atproto-lexicon crate handles this.

14.2 Validation Pipeline

On record write: parse JSON → infer NSID from $type → resolve schema → validate → encode as DAG-CBOR → hash for CID → MST insert (public realm) or space_record insert (permissioned realm).

14.3 Permissioned Lexicons

community.lexicon.* namespace is the de-facto location for community-defined schemas including:

  • community.lexicon.location.* (Smoke Signal addresses, geo).
  • community.lexicon.calendar.* (events).
  • community.lexicon.preference.ai (Nick's WIP, AI-related preferences).

Per the Spaces Design Spec, space type is itself an NSID, and applications declare what NSIDs are used within a given space type. There is no requirement that space-typed records live under community.lexicon.*; app.bsky.group is the example space type in the spec, suggesting space types live in app namespaces, with records inside a space drawing from broader vocabularies.

The PDS validates records the same way regardless of realm. The permissioned/public distinction is at the transport layer, not the schema layer.


15. Permissioned Data Spaces (Authoritative)

This section consolidates the Spaces Design Spec into an atproto-pds implementer's reference. Where the spec is silent, we note the open question and propose a default.

15.1 Conceptual Model

A space is an authorization and sync boundary. It is identified by:

ats://<ownerDid>/<spaceType>/<spaceKey>

(ats:// URI scheme is provisional in the spec.)

Components:

  • Owner DID: root of trust for the space. Holds the canonical member list and signs SpaceCredentials.
  • Space type: NSID describing modality (e.g., app.bsky.group).
  • Space key: arbitrary string differentiating multiple spaces of the same type under the same owner (e.g., default, or a TID-like identifier).
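A parser for the three-part identifier might look like this. It is a sketch under assumptions: SpaceUri is a hypothetical type, the ats:// scheme is provisional per the spec, and the validation is intentionally shallow (the owner must look like a DID, and the space key may not contain a slash).

```rust
use std::fmt;

/// Parsed form of ats://<ownerDid>/<spaceType>/<spaceKey>.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpaceUri {
    pub owner_did: String,
    pub space_type: String,
    pub space_key: String,
}

impl SpaceUri {
    /// Returns None when the string is not a well-formed space URI.
    pub fn parse(input: &str) -> Option<Self> {
        let rest = input.strip_prefix("ats://")?;
        let mut parts = rest.splitn(3, '/');
        let owner_did = parts.next()?;
        let space_type = parts.next()?;
        let space_key = parts.next()?;
        // Shallow checks only; full DID and NSID validation would live elsewhere.
        if !owner_did.starts_with("did:")
            || space_type.is_empty()
            || space_key.is_empty()
            || space_key.contains('/')
        {
            return None;
        }
        Some(SpaceUri {
            owner_did: owner_did.to_string(),
            space_type: space_type.to_string(),
            space_key: space_key.to_string(),
        })
    }
}

impl fmt::Display for SpaceUri {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "ats://{}/{}/{}", self.owner_did, self.space_type, self.space_key)
    }
}
```

Parsing once at the XRPC boundary and passing the typed value inward keeps the owner DID, space type, and space key from being re-split in the storage layer.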

A member is a DID listed in the owner's space_member table. Membership grants read access to all members' permissioned repos for that space, conditional on presenting a valid SpaceCredential to each member's PDS.

Each member who participates in a space hosts a per-space permissioned repo on their own PDS, containing records they have written scoped to that space. The repo is committed via a SetHash + signed HMAC tag, not an MST.

15.2 What is NOT in the Spec (Out of Scope)

The Spaces Design Spec explicitly excludes:

  • Application coordination (allow/deny lists for which apps can be syncing apps, app routing).
  • Delegated / sub-accounts (apps signing on behalf of users without OAuth).

atproto-pds should therefore avoid premature design for these. They are likely future protocol layers built on top of spaces.

15.3 Storage Layer

See §3.6 for full table schemas. Summary: per-actor SQLite gets eight new tables. Key invariants:

  • space_member_state row exists ⟺ user owns the space.
  • space_repo row tracks user's record-set commitment per space.
  • Atomic batches share a rev and use monotonic idx.
  • Removed members retain their space_record rows (data preserved across re-joins).

15.4 The @atproto/space Package and Its Rust Equivalent

The Spaces Design Spec defines a @atproto/space TypeScript package containing:

  • SetHash (set-hash.ts) — XOR-SHA256 placeholder for ECMH/ltHash. Order-independent set digest.
  • Commit (commit.ts) — createCommit(setHash, context, keypair) / verifyCommit(context, commit). HKDF-derived HMAC + ECDSA signature, with SpaceContext as HKDF info for domain separation.
  • SpaceRepo class — manages a single user's permissioned repo within a space. formatCommit / applyCommit / getRecord / listRecords / listCollections.
  • SpaceMembers class (owner-only) — manages member set. Same commit structure, different scope ('members' vs 'records').
  • SpaceCredential and MemberGrant — JWT issuance and verification.
  • Storage interfaces SpaceRepoStorage and SpaceMembersStorage, with in-memory and SQLite implementations.

Rust equivalent: a new crate atproto-space (within atproto-crates) mirroring this structure:

crates/atproto-space/
  src/
    lib.rs
    types.rs
    error.rs
    set_hash.rs        // SetHash trait + XorSha256SetHash impl
    commit.rs          // create_commit / verify_commit + SpaceContext
    space_repo.rs      // SpaceRepo<S: SpaceRepoStorage>
    space_members.rs   // SpaceMembers<S: SpaceMembersStorage>
    credential.rs      // create/verify MemberGrant + SpaceCredential
    storage/
      mod.rs           // SpaceRepoStorage + SpaceMembersStorage traits
      memory.rs        // in-memory impls for testing

The atproto-pds crate then provides the SQLite-backed SpaceRepoStorage and SpaceMembersStorage impls (analogous to the spec's actor-store/space/sql-repo-storage.ts and sql-members-storage.ts), wired into the per-actor store.

15.5 SetHash

pub trait SetHash: Sized + Clone {
    fn empty() -> Self;
    fn add(&mut self, element: &[u8]);
    fn remove(&mut self, element: &[u8]);
    fn digest(&self) -> Vec<u8>;
}

pub struct XorSha256SetHash([u8; 32]);
// add: self.0 ^= sha256(element); remove: same (XOR is self-inverse)

When ECMH or ltHash is settled upstream, swap the impl. Element formats per spec:

  • Records: format!("{}/{}:{}", collection, rkey, cid).as_bytes()
  • Members: did.as_bytes()
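The XOR construction and the record element format can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `placeholder_digest` is a hypothetical, non-cryptographic stand-in for SHA-256 (a real build would call a SHA-256 implementation), and `XorSetHash` and `record_element` are illustrative names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in for SHA-256 (assumption, NOT cryptographic): four seeded 64-bit
/// hashes concatenated into 32 bytes so the sketch needs no external crates.
fn placeholder_digest(element: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for lane in 0..4usize {
        let mut hasher = DefaultHasher::new();
        hasher.write_u8(lane as u8);
        hasher.write(element);
        out[lane * 8..lane * 8 + 8].copy_from_slice(&hasher.finish().to_be_bytes());
    }
    out
}

/// XOR accumulator with the same shape as `XorSha256SetHash`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct XorSetHash([u8; 32]);

impl XorSetHash {
    pub fn empty() -> Self {
        XorSetHash([0u8; 32])
    }

    /// XOR the element digest into the accumulator.
    pub fn add(&mut self, element: &[u8]) {
        self.toggle(element);
    }

    /// Identical to `add`, because XOR is its own inverse.
    pub fn remove(&mut self, element: &[u8]) {
        self.toggle(element);
    }

    pub fn digest(&self) -> Vec<u8> {
        self.0.to_vec()
    }

    fn toggle(&mut self, element: &[u8]) {
        let digest = placeholder_digest(element);
        for (acc, byte) in self.0.iter_mut().zip(digest.iter()) {
            *acc ^= *byte;
        }
    }
}

/// Record element in the `collection/rkey:cid` format listed above.
pub fn record_element(collection: &str, rkey: &str, cid: &str) -> Vec<u8> {
    format!("{}/{}:{}", collection, rkey, cid).into_bytes()
}
```

Because XOR cancels, adding the same element twice returns the accumulator to its previous value, so callers have to guarantee set semantics (each element present at most once) before calling `add`.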

15.6 Commits

pub struct SpaceContext {
    pub space_did: String,
    pub space_type: String,
    pub space_key: String,
    pub user_did: String,
    pub scope: CommitScope, // Records | Members
    pub rev: String,
}

pub struct Commit {
    pub set_hash: Vec<u8>,
    pub rev: String,
    pub ikm: [u8; 32],   // per-commit random
    pub tag: [u8; 32],   // HMAC-SHA-256
    pub sig: Vec<u8>,    // ECDSA over canonical bytes
}

pub fn create_commit(
    set_hash: &[u8],
    context: &SpaceContext,
    keypair: &SigningKey,
) -> Commit;

pub fn verify_commit(context: &SpaceContext, commit: &Commit, pubkey: &VerifyingKey) -> bool;

Domain separation via scope is critical: a record commit verified with scope: Members must fail. Test this explicitly (the spec calls it out as a Layer 1 unit test).
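One way to make the scope part of the HKDF info is to serialize every SpaceContext field with a length prefix. The encoding below is an assumption for illustration only (the spec's canonical byte layout is not reproduced here), and `hkdf_info` is a hypothetical helper name.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CommitScope {
    Records,
    Members,
}

pub struct SpaceContext {
    pub space_did: String,
    pub space_type: String,
    pub space_key: String,
    pub user_did: String,
    pub scope: CommitScope,
    pub rev: String,
}

/// Hypothetical HKDF `info` encoding: each field is written as a 4-byte
/// big-endian length followed by its UTF-8 bytes, so field boundaries are
/// unambiguous and the scope always contributes to the derived key.
pub fn hkdf_info(ctx: &SpaceContext) -> Vec<u8> {
    let scope = match ctx.scope {
        CommitScope::Records => "records",
        CommitScope::Members => "members",
    };
    let fields = [
        ctx.space_did.as_str(),
        ctx.space_type.as_str(),
        ctx.space_key.as_str(),
        ctx.user_did.as_str(),
        scope,
        ctx.rev.as_str(),
    ];
    let mut out = Vec::new();
    for field in fields.iter() {
        out.extend_from_slice(&(field.len() as u32).to_be_bytes());
        out.extend_from_slice(field.as_bytes());
    }
    out
}
```

With distinct info bytes per scope, the HMAC key derived for a Records commit differs from the one derived for a Members commit, which is what makes the cross-scope verification fail.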

15.7 MemberGrant and SpaceCredential

JWT shapes per spec:

MemberGrant (member → app, member's PDS issues):

{
  "header": { "alg": "ES256K", "typ": "space_member_grant" },
  "payload": {
    "iss": "<member-did>",
    "aud": "<owner-did>",
    "space": "ats://<owner-did>/<space-type>/<space-key>",
    "clientId": "<oauth-client-id>",
    "lxm": "com.atproto.space.getSpaceCredential",
    "iat": 1740000000,
    "exp": 1740000300
  }
}

~5 minute TTL. Signed with member's atproto signing key. Verified by resolving member's DID doc.

SpaceCredential (owner → app, owner's PDS issues):

{
  "header": { "alg": "ES256K", "typ": "space_credential" },
  "payload": {
    "iss": "<owner-did>",
    "space": "ats://<owner-did>/<space-type>/<space-key>",
    "clientId": "<oauth-client-id>",
    "iat": 1740000000,
    "exp": 1740010800
  }
}

2–4h TTL (atproto-pds defaults to 3h, configurable). Signed with owner's atproto signing key.
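The claim checks the owner's PDS performs on a MemberGrant, after the signature has been verified, might look like the following sketch. Signature verification and DID resolution are deliberately omitted; `check_member_grant`, `GrantError`, and the 30-second `CLOCK_SKEW_SECS` allowance are assumptions introduced here, not values from the spec.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum GrantError {
    Expired,
    NotYetValid,
    WrongAudience,
    WrongSpace,
    WrongLxm,
}

/// Decoded MemberGrant payload, mirroring the JSON shape above.
pub struct MemberGrantClaims {
    pub iss: String,
    pub aud: String,
    pub space: String,
    pub client_id: String,
    pub lxm: String,
    pub iat: u64,
    pub exp: u64,
}

pub const GET_SPACE_CREDENTIAL: &str = "com.atproto.space.getSpaceCredential";

/// Hypothetical tolerance for clock drift between the two PDSes.
pub const CLOCK_SKEW_SECS: u64 = 30;

/// Validates claims only; the caller is assumed to have verified the
/// signature against the member's DID document already.
pub fn check_member_grant(
    claims: &MemberGrantClaims,
    owner_did: &str,
    space: &str,
    now: u64,
) -> Result<(), GrantError> {
    if now >= claims.exp {
        return Err(GrantError::Expired);
    }
    if claims.iat > now + CLOCK_SKEW_SECS {
        return Err(GrantError::NotYetValid);
    }
    if claims.aud != owner_did {
        return Err(GrantError::WrongAudience);
    }
    if claims.space != space {
        return Err(GrantError::WrongSpace);
    }
    if claims.lxm != GET_SPACE_CREDENTIAL {
        return Err(GrantError::WrongLxm);
    }
    Ok(())
}
```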

15.8 XRPC Endpoints (com.atproto.space.*)

Full inventory consolidated from §8.6 and the Spaces Design Spec:

Record CRUD (member's PDS, OAuth):

  • createRecord (procedure) — body {space, collection, rkey?, record, swapRev?}
  • putRecord (procedure) — body {space, collection, rkey, record, swapRev?, swapRecord?}
  • deleteRecord (procedure) — body {space, collection, rkey, swapRev?, swapRecord?}
  • applyWrites (procedure) — body {space, writes: [...]}, atomic batch
  • getRecord (query, dual auth — OAuth or SpaceCredential) — params {space, collection, rkey, cid?}
  • listRecords (query, dual auth) — params {space, collection, limit, cursor}

Space management (owner's PDS, OAuth):

  • createSpace (procedure) — body {spaceType, spaceKey} → returns {uri}
  • getSpace (query) — params {uri}
  • listSpaces (query) — params {filter: 'owned' | 'member' | 'all', limit, cursor}
  • addMember (procedure) — body {space, did}
  • removeMember (procedure) — body {space, did}
  • getMembers (query) — params {space, limit, cursor}

Credential flow:

  • getMemberGrant (procedure, member's PDS, OAuth+clientId) — body {space, clientId} → returns {grant: <jwt>}
  • getSpaceCredential (procedure, owner's PDS, MemberGrant) — body {grant} → returns {credential: <jwt>} and registers app in space_credential_recipient

Sync (SpaceCredential):

  • getRepoState (query, member's PDS) — params {space} → returns {setHash, rev}
  • getRepoOplog (query, member's PDS) — params {space, since?, limit} → returns {ops, setHash, rev}
  • getMemberState (query, owner's PDS) — params {space} → returns {setHash, rev}
  • getMemberOplog (query, owner's PDS) — params {space, since?, limit} → returns {ops, setHash, rev}

Notifications (service auth):

  • notifyMembership (procedure, on the member's PDS, called by the owner) — body {space, isMember}
  • notifyWrite (procedure, on the owner's PDS and on each syncing app) — body {space, member, rev}

notifyWrite is the same NSID on both relay hops. The owner's PDS receives, looks up space_credential_recipient, and invokes the same NSID on each registered service endpoint.

15.9 Request Flows

Flow A — Member writes a record:

  1. App → member's PDS: com.atproto.space.createRecord {space, collection, record} (OAuth).
  2. PDS opens actor-store transaction.
  3. Loads SpaceRepo via SqlRepoStorage.
  4. repo.format_commit([{action: Create, collection, rkey, record}]) — computes new SetHash, signs commit.
  5. transactor.apply_repo_commit(space, commit_data) — writes space_record row, updates space_repo.set_hash and space_repo.rev, appends space_record_oplog entry.
  6. Fire-and-forget notifyWrite to space owner's PDS.
  7. Returns {uri, cid, commit: {rev}}.

Flow B — App obtains a SpaceCredential:

  1. App has OAuth session with a member user (bound to clientId).
  2. App → member's PDS: getMemberGrant {space, clientId} (OAuth).
  3. Member's PDS verifies user has membership knowledge for space (or trusts the user's claim — spec is permissive here), creates and signs grant JWT.
  4. App → owner's PDS: getSpaceCredential {grant}.
  5. Owner's PDS verifies grant signature (resolves member's DID doc), confirms member is in space_member, checks lxm matches.
  6. Owner's PDS creates and signs SpaceCredential JWT.
  7. Owner's PDS records app's service DID + endpoint in space_credential_recipient.
  8. Returns {credential}.

Flow C — App syncs a member's permissioned repo:

  1. App holds SpaceCredential.
  2. App → member's PDS: getRepoOplog {space, since: <last_seen_rev>} with SpaceCredential.
  3. PDS verifies SpaceCredential by resolving owner's DID doc, checking signature and expiration, confirming requested space matches credential's space claim.
  4. PDS confirms is_member=true in space table for this user.
  5. PDS returns {ops: [...], setHash, rev}.
  6. App replays ops, fetches new/updated record content via getRecord {space, collection, rkey}.
  7. App computes its local SetHash from cumulative state, compares to returned setHash.
  8. On mismatch: full resync via listRecords, recompute from scratch.
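Steps 6 to 8 of Flow C can be sketched from the syncing app's side as follows. This is an illustration under stated assumptions: the `Op` shape, `Replica`, and `reconcile` are hypothetical names, and `placeholder_hash` is a 64-bit, non-cryptographic stand-in for the SHA-256-based SetHash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::Hasher;

/// Hypothetical oplog entry: `path` is "collection/rkey".
#[derive(Clone, Debug)]
pub enum Op {
    Put { path: String, cid: String },
    Delete { path: String },
}

#[derive(Debug, PartialEq, Eq)]
pub enum SyncOutcome {
    InSync,
    NeedsFullResync,
}

/// Stand-in for SHA-256 (assumption, NOT cryptographic).
fn placeholder_hash(element: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(element);
    hasher.finish()
}

/// The app's local copy of one member's permissioned repo.
#[derive(Default)]
pub struct Replica {
    records: BTreeMap<String, String>, // path -> cid
}

impl Replica {
    /// Replay oplog entries in order.
    pub fn apply(&mut self, ops: &[Op]) {
        for op in ops {
            match op {
                Op::Put { path, cid } => {
                    self.records.insert(path.clone(), cid.clone());
                }
                Op::Delete { path } => {
                    self.records.remove(path);
                }
            }
        }
    }

    /// Recompute the set digest from cumulative state (XOR of element hashes).
    pub fn set_hash(&self) -> u64 {
        self.records.iter().fold(0, |acc, (path, cid)| {
            acc ^ placeholder_hash(format!("{}:{}", path, cid).as_bytes())
        })
    }

    /// Apply ops, then compare against the digest the PDS returned.
    pub fn reconcile(&mut self, ops: &[Op], remote_set_hash: u64) -> SyncOutcome {
        self.apply(ops);
        if self.set_hash() == remote_set_hash {
            SyncOutcome::InSync
        } else {
            SyncOutcome::NeedsFullResync
        }
    }
}
```

On `NeedsFullResync` the app would discard the replica and rebuild it from listRecords, as in step 8.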

Flow D — Space owner adds a member:

  1. App → owner's PDS: addMember {space, did} (OAuth).
  2. PDS loads SpaceMembers via SqlMembersStorage.
  3. members.format_commit([{action: Add, did}]).
  4. transactor.apply_member_commit(space, commit_data) — writes space_member row, updates space_member_state.set_hash/.rev, appends space_member_oplog.
  5. Fire-and-forget notifyMembership {space, isMember: true} to new member's PDS.
  6. Returns success.

Flow E — Member's PDS receives notifyMembership:

  1. Owner's PDS → member's PDS: notifyMembership {space, isMember: true} (service auth).
  2. PDS verifies service-auth JWT against owner's DID.
  3. PDS upserts space row: is_member=true (creates row if needed). Per spec: this is necessary so the PDS knows to accept space credentials for that space and serve permissioned repo data to authorized requesters.
  4. Optionally email/push the user.

When isMember: false, the PDS sets is_member=false but does NOT delete the space row or any records (per spec).
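The upsert-without-delete behaviour can be illustrated with an in-memory stand-in for the space table. The types below are hypothetical and exist only to show the invariant; the real storage is the per-actor SQLite schema in §3.6.

```rust
use std::collections::HashMap;

/// In-memory stand-in for one row of the `space` table plus its records.
#[derive(Default, Debug)]
pub struct SpaceRow {
    pub is_member: bool,
    pub records: Vec<String>,
}

#[derive(Default)]
pub struct ActorSpaces {
    rows: HashMap<String, SpaceRow>,
}

impl ActorSpaces {
    /// Upsert: create the row if needed, then set the flag.
    /// Records are never touched, whatever the new flag value is.
    pub fn notify_membership(&mut self, space: &str, is_member: bool) {
        self.rows.entry(space.to_string()).or_default().is_member = is_member;
    }

    pub fn put_record(&mut self, space: &str, rkey: &str) {
        self.rows
            .entry(space.to_string())
            .or_default()
            .records
            .push(rkey.to_string());
    }

    /// Whether SpaceCredentials for this space should be accepted.
    pub fn accepts_credentials(&self, space: &str) -> bool {
        self.rows.get(space).map_or(false, |row| row.is_member)
    }

    pub fn record_count(&self, space: &str) -> usize {
        self.rows.get(space).map_or(0, |row| row.records.len())
    }
}
```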

15.10 Auth Verifier (Three Paths)

Per the Spaces Design Spec, the PDS auth verifier gets three paths:

  1. User auth — existing OAuth flow. Used for record CRUD and space management on the user's own PDS.
  2. Space credential auth — new. Verifies SpaceCredential JWT for sync endpoints. Resolves space owner's DID doc, checks signing key, validates expiration, confirms requested space matches credential's space claim.
  3. Service auth — existing pattern. Used for inter-PDS notifications (notifyWrite, notifyMembership).

The middleware must distinguish these cleanly. Use the JWT typ header (at+jwt, space_credential, space_member_grant, no-typ for service auth) as the primary discriminator.
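A minimal sketch of that discriminator, operating on an already-decoded typ header value. `AuthPath` and `classify` are hypothetical names, and rejecting unrecognized typ values outright is an assumption rather than something the spec states.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum AuthPath {
    UserOAuth,
    SpaceCredential,
    MemberGrant,
    ServiceAuth,
    /// Unrecognized `typ`; assumed here to be rejected rather than guessed at.
    Unknown,
}

/// Pick the verification path from the JWT `typ` header, if present.
pub fn classify(typ: Option<&str>) -> AuthPath {
    match typ {
        Some("at+jwt") => AuthPath::UserOAuth,
        Some("space_credential") => AuthPath::SpaceCredential,
        Some("space_member_grant") => AuthPath::MemberGrant,
        None => AuthPath::ServiceAuth,
        Some(_) => AuthPath::Unknown,
    }
}
```

Classification only selects which verifier runs; each path still has to validate signature, expiration, and audience on its own terms.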

15.11 Spec Open Questions Tracked

The spec flags these TODOs that affect atproto-pds:

| Open question | atproto-pds interim default |
|---|---|
| URI scheme ats:// | Use as specified; abstract behind SpaceUri type for future migration. |
| SetHash algorithm (XOR placeholder → ECMH/ltHash) | SetHash trait + XorSha256SetHash default; pluggable. |
| Credential expiration window (2–4h) | Default 3h, configurable. |
| Oplog retention | Retain forever by default; admin compaction available; emit OplogGap error if since precedes retained range. |
| notifyWrite fan-out failure | Bounded retry with exponential backoff (5 attempts, 1s/2s/4s/8s/16s); dead-letter log. |
| Member grant signing key | User's atproto signing key (per spec); document in code. |
| Service endpoint for notifications | Try app DID-doc service entry first; fall back to value provided in getSpaceCredential request. |

Each of these should be a tracked issue in the atproto-crates repo and revisited as the spec firms up.
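The interim notifyWrite retry default (5 attempts, doubling from 1s) reduces to a small schedule function. `backoff_schedule` is a hypothetical helper; how the delays are interleaved with attempts, and what is written to the dead-letter log, is left to the notifier.

```rust
use std::time::Duration;

/// Exponential backoff delays: base, 2*base, 4*base, ... for `attempts` entries.
/// Saturating arithmetic keeps large attempt counts from overflowing.
pub fn backoff_schedule(attempts: u32, base: Duration) -> Vec<Duration> {
    (0..attempts)
        .map(|i| base.saturating_mul(2u32.saturating_pow(i)))
        .collect()
}
```

With the defaults above, `backoff_schedule(5, Duration::from_secs(1))` yields 1s, 2s, 4s, 8s, 16s, for a worst-case total wait of 31 seconds before the notification is dead-lettered.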

15.12 Testing Strategy (Adapted from Spec's §5)

Layer 1: atproto-space unit tests

  • SpaceRepo: CRUD, batch writes, format_commit/apply_commit, SetHash correctness, error cases.
  • SpaceMembers: add/remove, SetHash over DIDs, commit signing/verification, duplicate add, remove non-member.
  • SetHash: order-independence, add/remove inverse, consistency.
  • Domain separation: a commit signed with scope: Records must fail verification with scope: Members.
  • Credential/Grant: create, verify, reject expired, reject wrong space, reject tampered, verify lxm binding.

Layer 2: atproto-pds integration tests

  • Spin up a real PDS instance. Test XRPC endpoints directly.
  • Record CRUD via com.atproto.space.*.
  • Verify rev advances on each write.
  • Verify SetHash correctness after operation sequences.
  • Space management auth rejection (non-owner → addMember/removeMember).
  • Credential flow happy path and rejections.

Layer 3: Multi-PDS sync tests

  • Two PDSes (PDS-A hosts owner + member-1, PDS-B hosts member-2). A test client acting as the syncing app.
  • Happy-path sync: owner creates space, adds members; members write; app obtains credential; app syncs.
  • Incremental sync with since.
  • Sync recovery: simulated oplog gap → setHash mismatch → full resync via listRecords.
  • Member lifecycle: removal mid-sync.
  • Notification flow: notifyWrite reaches owner → relayed to mock app endpoint.

What we are NOT testing (per spec):

  • ECMH (XOR placeholder only until upstream settles).
  • Application-level write semantics (out of spec scope).
  • Delegated accounts / app coordination.
  • Performance at scale.

16. Conformance Gaps Across Implementations

| Area | TS @atproto/pds | indigo PDS (Go, deprecated) | rsky-pds (Rust) | tranquil-pds (Rust) | cocoon (Go) |
|---|---|---|---|---|---|
| Sync 1.1 (prevData, #sync) | full | partial | partial | full | full |
| did:web for accounts | service-only | none | none | full (BYO + hosted) | full |
| OAuth provider | full | none | partial | full + 2FA/passkeys | full |
| Permission sets resolution | full | n/a | partial | full | partial |
| importRepo | full but slow | broken/stale | works | works | "use with caution" |
| Per-account SQLite | yes | n/a | no (Postgres) | no (Postgres) | yes (default), Postgres opt-in |
| Blob store: S3 | no (filesystem) | filesystem | yes | optional | optional |
| SMTP / multi-channel notifs | SMTP only | none | mailgun | SMTP + discord/telegram/signal | SMTP |
| Account delegation | no | no | no | yes | no |
| WebAuthn / TOTP | no | no | no | yes | no |
| Built-in admin web UI | minimal | no | no | full Svelte UI | minimal |
| listReposByCollection (Sync 1.1) | partial | no | no | partial | partial |
| Inductive verification on inbound CAR | partial | no | no | yes | yes |
| lxm-required service auth | yes | no | yes | yes | yes |
| Permissioned data (com.atproto.space.*) | in-progress reference | none | none | none | none |
| @atproto/space package | in-progress | none | none | none | none |
| Sidecar / off-protocol attachments | none | none | none | none | none |

atproto-pds opportunity: @atproto/pds is the only implementation currently building toward the Spaces Design Spec, and it's reference-quality, not production-quality (per the spec's own goals: "Not production-ready — focused on correctness and protocol exploration"). atproto-pds can be the second implementation overall, the first in Rust, and architected for production performance from the start. Conformance must track the TS reference closely during the design's settling period.

Notable interop issues observed in the wild:

  • Account migration "blob loss" — some PDSes don't refuse activateAccount if listMissingBlobs is non-empty; cocoon README warns "use with extreme caution."
  • DPoP nonce edge cases — cocoon shipped hailey/fix-dpop-nonce-err after upstream issues; rsky-pds had a TLS provider init panic.
  • Identity caching after migration — Blacksky AppView fork notes that staleTTL of 1h causes JWT verification failures during the migration window; the resolver needs a cache-bypass on signature failure.
  • Atproto-Proxy header: TS PDS, cocoon honor; some early Go PDSes don't.
  • Lexicon validation strictness varies — many PDSes silently drop unknown union members on read.

atproto-pds should adopt the strictest reasonable defaults and expose lenient mode as opt-in.


17. Rust Crate Integration Plan

17.1 Existing Crates in atproto-crates

| Crate | Purpose | PDS use |
|---|---|---|
| atproto-dasl | DRISL CBOR, CID, CARv1, block storage backends, RASL retrieval | Public-repo block layer, CAR import/export |
| atproto-identity | DID resolution (plc/web/key), handle resolution, P-256/P-384/K-256 keys | Identity resolution, key handling |
| atproto-attestation | CID-first attestation utilities | Sidecar/permissioned record attestations |
| atproto-record | TID, AT-URI, datetime, CID for records | Record creation hot path |
| atproto-lexicon | NSID validation, schema resolution (DNS+HTTPS) | Write-time lexicon validation |
| atproto-repo | MST encoding/decoding, commit structures, tree diffing, configurable verification | Public repo subsystem |
| atproto-oauth | OAuth 2.0 + DPoP/PKCE/JWT | OAuth provider primitives |
| atproto-oauth-aip | AIP-style authorization-code + PAR + token exchange | OAuth flow orchestration |
| atproto-oauth-axum | Axum web handlers for OAuth endpoints, JWKS, client metadata | OAuth HTTP layer |
| atproto-client | DPoP/Bearer/session HTTP client, XRPC | Outbound calls (PLC, mod service, AppView, notifyWrite/notifyMembership push) |
| atproto-xrpcs | XRPC service framework: JWT extractors, DID-based authn, axum middleware | XRPC server foundation |
| atproto-jetstream | Jetstream client | Useful for testing, not core PDS |
| atproto-tap | TAP consumer (verified events with backfill) | Reference for outbox mechanics |
| atproto-extras | Facets, rich text | Validation helpers for records |
| atproto-plc | DID:plc operation construction, signing | PLC operations |

17.2 New Crate: atproto-space

Added to atproto-crates to mirror the @atproto/space TS package. See §15.4 for module layout. Public API:

// Set commitments
pub trait SetHash { ... }
pub struct XorSha256SetHash;  // default impl, swappable

// Commit construction
pub struct SpaceContext { ... }
pub enum CommitScope { Records, Members }
pub struct Commit { set_hash, rev, ikm, tag, sig }
pub fn create_commit(...) -> Commit;
pub fn verify_commit(...) -> bool;

// Per-space repo manager
pub struct SpaceRepo<S: SpaceRepoStorage> { ... }

// Per-space member manager (owner only)
pub struct SpaceMembers<S: SpaceMembersStorage> { ... }

// JWT credentials
pub struct MemberGrant { ... }
pub struct SpaceCredential { ... }
pub fn create_member_grant(...) -> MemberGrant;
pub fn verify_member_grant(...) -> Result<MemberGrant, ...>;
pub fn create_space_credential(...) -> SpaceCredential;
pub fn verify_space_credential(...) -> Result<SpaceCredential, ...>;

// Storage traits
#[async_trait]
pub trait SpaceRepoStorage { ... }
#[async_trait]
pub trait SpaceMembersStorage { ... }

pub mod memory { /* in-memory impls for testing */ }

The atproto-pds crate then provides SQLite-backed (or pluggable-backend) implementations of SpaceRepoStorage and SpaceMembersStorage.

17.3 Crates Needing Extension

  • atproto-repo: must support Sync 1.1 inductive proofs end-to-end (already largely there), expose a streaming CAR writer, support inductive verification on inbound (importRepo), and add a tree-diff API that emits Sync 1.1 op lists with prev CIDs.
  • atproto-dasl: BlockStore trait should grow per-realm tagging if we want a unified block store across public + permissioned (alternatively, treat permissioned as fully separate via atproto-space's storage traits).
  • atproto-xrpcs: needs DPoP middleware extractor (currently has Bearer/service auth only), per-endpoint scope enforcement table, rate-limit middleware, plus extractors for SpaceCredential and MemberGrant JWTs distinguishing by typ header.
  • atproto-oauth/atproto-oauth-axum: needs server-side flows (the AIP variant is largely complete), the consent UI templating, permission-set resolution and caching, JWKS rotation, and space-typed scopes in the consent UI.
  • atproto-identity: needs a KeyStore trait abstraction so HSM/KMS backends are pluggable; needs a per-account signing key generator.
  • atproto-lexicon: needs an "authority-tagged" resolver mode for permissioned-only schemas.
  • atproto-plc: confirm full coverage of the operation set; may need tombstone op support.

17.4 Net-New Functionality Required in atproto-pds

  1. Account database: accounts table, app passwords, OAuth sessions, invite codes, email tokens, PLC operation signature tokens. Pluggable backend (SQLite default, Postgres optional).
  2. Per-actor store: per-account SQLite file (or namespace) holding both public-repo state and the eight Spaces tables. The actor-store transactor wraps both.
  3. Sequencer / outbox for the public firehose.
  4. Firehose server (com.atproto.sync.subscribeRepos) WebSocket framing.
  5. Mailer: SMTP + pluggable trait.
  6. Notifier (separate from Mailer): outbound notifyWrite/notifyMembership HTTP client with bounded retry + DLQ.
  7. Admin endpoints + UI: the com.atproto.admin.* namespace and a minimal HTML dashboard. Includes admin tooling for permissioned-record takedown audit log.
  8. Authorization UI: server-rendered consent screen using Askama templates, with space-typed scope rendering.
  9. Rate limiter: in-memory token bucket + optional Redis/Valkey, with per-realm bucket families.
  10. Service auth issuer: short-lived JWT minting for getServiceAuth.
  11. MemberGrant issuer (uses account's signing key).
  12. SpaceCredential issuer (owner's PDS only).
  13. Auth verifier dispatching across OAuth / App Password / service auth / SpaceCredential / MemberGrant based on JWT typ.
  14. CLI / installer: atproto-pds binary plus admin tooling for invite issuance, account reset, takedown, space inspection (list spaces, dump oplog, show member list).
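For item 9, the in-memory token bucket can be sketched as below. Time is passed in explicitly (seconds as `f64`) so the logic is testable without a clock; `TokenBucket` and its method names are hypothetical, and per-realm bucket families would be a map of these keyed by realm and caller.

```rust
/// Single token bucket. One token is consumed per request.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: f64,
}

impl TokenBucket {
    /// Starts full at time `now` (seconds).
    pub fn new(capacity: u32, refill_per_sec: f64, now: f64) -> Self {
        TokenBucket {
            capacity: capacity as f64,
            tokens: capacity as f64,
            refill_per_sec,
            last_refill: now,
        }
    }

    /// Refill based on elapsed time, then try to take one token.
    pub fn try_acquire(&mut self, now: f64) -> bool {
        // Ignore clocks that move backwards instead of refilling negatively.
        let elapsed = (now - self.last_refill).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = self.last_refill.max(now);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```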

17.5 Performance-Critical Tech Stack

  • Async runtime: tokio (multi-threaded). Use tokio-uring for the blob store path on Linux for io_uring read perf.
  • HTTP framework: axum (matches existing atproto-xrpcs). Add tower-http for rate limit, CORS, compression.
  • WebSocket: tokio-tungstenite integrated with axum.
  • Storage: sqlx for SQLite/Postgres (per-actor SQLite is the recommended default to match the spec). fjall available for unified-keyspace deployments. s3 crate for blob storage.
  • CBOR: serde_ipld_dagcbor (already in atproto-dasl).
  • Crypto: k256, p256, argon2, hmac, sha2, hkdf, jose-jwt (or josekit).
  • Email: lettre.
  • Templates: askama (compile-time, fast).
  • Tracing: tracing + tracing-subscriber + optional OpenTelemetry exporter.

18. Performance & Low-Latency Design

18.1 Hot Paths

Public read hot path (getRecord, listRecords):

  1. Resolve DID → repo handle (in-memory LRU, ~1 µs).
  2. Locate record CID via MST traversal (one MST root cache hit + log-N node fetches; with caching, single-digit-µs to sub-ms).
  3. Fetch DAG-CBOR block from BlockStore.
  4. Decode and serialize JSON.

Target: p99 < 5 ms in-process for cached records.

Permissioned read hot path (com.atproto.space.getRecord):

  1. SpaceCredential verification: parse JWT, resolve owner DID doc (cached), verify ECDSA, check space claim, check expiration. Cached fast path: ~50 µs. Cold DID-doc fetch: 10–100 ms (caching is essential).
  2. Lookup space_record by (space, collection, rkey) PK in the per-actor SQLite. Single-digit ms.
  3. Decode and serialize.

Target: p99 < 10 ms for cached credentials.

Public write hot path (createRecord):

  1. Authn middleware (DPoP verify ~50 µs ES256, JWT verify ~10 µs).
  2. Lexicon resolve (cache hit ~µs; cache miss = network, slow path).
  3. DAG-CBOR encode + CID compute (~10 µs for typical post).
  4. MST mutate (in-memory copy-on-write, ~10–100 µs).
  5. Sign commit (K-256 ECDSA ~50 µs).
  6. Persist blocks + commit + outbox row in one Fjall batch / SQLite transaction (~100 µs–1 ms).
  7. Notify firehose channel (lock-free broadcast).

Target: p99 < 20 ms.

Permissioned write hot path (com.atproto.space.createRecord):

  1. Authn (OAuth + space scope check).
  2. Lexicon validate.
  3. CBOR encode + CID compute.
  4. SetHash add: SHA-256 + XOR (or future ECMH op). XOR is ~ns; ECMH is slower (~10–100 µs depending on curve).
  5. HKDF-SHA-256 key derivation + HMAC-SHA-256 tag (~10 µs).
  6. ECDSA sign (~50 µs).
  7. SQLite transaction: insert space_record, update space_repo, append space_record_oplog (~100 µs–1 ms).
  8. Spawn notifyWrite outbound call (fire-and-forget, off-path).

Target: p99 < 15 ms (note: faster than public write because no MST traversal).

Public sync hot path (subscribeRepos):

  1. Live tail = single broadcast channel recv + frame encode → ws write.
  2. Backfill = streaming CAR read from block store.

Permissioned sync hot path (getRepoOplog):

  1. SpaceCredential verify (cached fast path).
  2. SQLite range query on space_record_oplog ordered by (rev, idx) since cursor.
  3. Stream rows + return {setHash, rev}.
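The range scan, together with the OplogGap default from §15.11, can be modelled with an ordered map keyed by (rev, idx). This is an in-memory sketch, not the SQLite query: `Oplog`, `compact_through`, and the string-typed ops are hypothetical, and it relies on revs being TIDs, which sort lexicographically in time order.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Returned when `since` points before the retained part of the log.
#[derive(Debug, PartialEq, Eq)]
pub struct OplogGap {
    pub compacted_through: String,
}

/// In-memory model of `space_record_oplog`, ordered by (rev, idx).
#[derive(Default)]
pub struct Oplog {
    entries: BTreeMap<(String, u32), String>,
    compacted_through: Option<String>,
}

impl Oplog {
    /// All ops of one atomic batch share `rev` and get a monotonic `idx`.
    pub fn append_batch(&mut self, rev: &str, ops: &[&str]) {
        for (idx, op) in ops.iter().enumerate() {
            self.entries.insert((rev.to_string(), idx as u32), op.to_string());
        }
    }

    /// Admin compaction: drop every entry with rev <= `rev`.
    pub fn compact_through(&mut self, rev: &str) {
        self.entries.retain(|(r, _), _| r.as_str() > rev);
        self.compacted_through = Some(rev.to_string());
    }

    /// Entries with rev strictly greater than `since`, up to `limit`.
    pub fn since(
        &self,
        since: Option<&str>,
        limit: usize,
    ) -> Result<Vec<(String, u32, String)>, OplogGap> {
        if let Some(floor) = &self.compacted_through {
            let behind = match since {
                None => true,
                Some(s) => s < floor.as_str(),
            };
            if behind {
                return Err(OplogGap { compacted_through: floor.clone() });
            }
        }
        let lower: Bound<(String, u32)> = match since {
            Some(s) => Bound::Excluded((s.to_string(), u32::MAX)),
            None => Bound::Unbounded,
        };
        Ok(self
            .entries
            .range::<(String, u32), _>((lower, Bound::Unbounded))
            .take(limit)
            .map(|((rev, idx), op)| (rev.clone(), *idx, op.clone()))
            .collect())
    }
}
```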

18.2 Caching

  • Identity cache: did → did doc, did → handle, handle → did. TTL: 24h with stale-while-revalidate; bypass on signature-verification failure (per Blacksky lesson). Critical for SpaceCredential verification.
  • Lexicon cache: 24h stale, 90d expiration; permission sets same.
  • Repo head cache: did → (commit CID, rev, MST root). Updated on every commit. Bounded LRU per process.
  • MST node cache: CID → MST node, large LRU shared across accounts.
  • SpaceCredential cache: per-(credential-jti-or-hash) → verified-decoded form. TTL = credential remaining lifetime. Avoids re-resolving the owner's DID doc on every request from the same app.
  • Member-list cache: (space) → member set. Invalidated on addMember/removeMember/inbound notifyMembership.
  • Block presence Bloom filter to short-circuit "do we have this CID" checks during sync.
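The SpaceCredential cache entry above, with its TTL tied to the credential's remaining lifetime, might be sketched like this. The key is whatever stable identifier the verifier derives (jti or a hash of the token); `CredentialCache` and `VerifiedCredential` are hypothetical names, and timestamps are Unix seconds passed in by the caller.

```rust
use std::collections::HashMap;

/// Already-verified, decoded credential.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct VerifiedCredential {
    pub space: String,
    pub client_id: String,
    pub exp: u64,
}

#[derive(Default)]
pub struct CredentialCache {
    entries: HashMap<String, VerifiedCredential>,
}

impl CredentialCache {
    /// Cache only credentials that still have lifetime left.
    pub fn insert(&mut self, key: String, cred: VerifiedCredential, now: u64) {
        if cred.exp > now {
            self.entries.insert(key, cred);
        }
    }

    /// Return the entry if unexpired; evict it otherwise.
    pub fn get(&mut self, key: &str, now: u64) -> Option<VerifiedCredential> {
        let live = self.entries.get(key).filter(|c| c.exp > now).cloned();
        if live.is_none() {
            self.entries.remove(key);
        }
        live
    }
}
```

A production cache would also bound its size (for example with an LRU policy); that is omitted here to keep the expiry logic visible.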

18.3 Concurrency

  • Single-writer per repo: serialize writes per-DID via a per-DID mutex (sharded DashMap<Did, Mutex<()>>).
  • Single-writer per space (per user): serialize writes per-(DID, space-URI) similarly.
  • Reads are unconstrained.
  • Firehose sequencer is a single tokio task fed by an MPSC channel shared by all writers.
  • Outbound notifyWrite calls run on a separate tokio task pool with bounded concurrency to avoid DoS-amplifying when one space has many members.
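The single-writer rule can be sketched with a keyed lock registry. To stay dependency-free this uses a `Mutex<HashMap<..>>` from the standard library as a stand-in for the sharded DashMap named above; `WriteLocks` is a hypothetical name, and the key can be a DID or a (DID, space-URI) string.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Registry handing out one mutex per key, created on first use.
#[derive(Default)]
pub struct WriteLocks {
    inner: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl WriteLocks {
    /// Returns the lock for `key`. The registry lock is held only for the
    /// lookup, so writers to different keys do not block each other.
    pub fn lock_for(&self, key: &str) -> Arc<Mutex<()>> {
        let mut map = self.inner.lock().unwrap();
        map.entry(key.to_string()).or_default().clone()
    }
}
```

A writer calls `lock_for(did)`, holds the returned mutex for the duration of the commit, and drops the guard afterwards. In async code the inner lock would typically be an async-aware mutex so the guard can be held across await points.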

18.4 Storage Tradeoffs

| Backend | Pros | Cons | When to use |
|---|---|---|---|
| Per-account SQLite | strong isolation, easy backup/migration, atomic per-account, direct match to Spaces Design Spec, cheap takedown (rm file) | many file descriptors, hard to do cross-account scans, slower for very-many accounts | small-to-medium PDS (≤10k accounts), recommended default |
| Single Fjall keyspace | very low write latency, compactable, single-binary deploy | requires custom Spaces table-equivalent layout, diverges from spec | single-host PDS, performance-priority |
| Postgres | replication, ops familiarity, query power | added latency vs embedded, ops complexity | multi-tenant SaaS PDS, many accounts |
| Hybrid (Fjall public blocks + per-account SQLite for Spaces) | fast public path + spec-fidelity for Spaces | two systems to back up | advanced |

Recommended default for atproto-pds: per-account SQLite for both public actor store and Spaces tables, matching the spec exactly. Optimize within SQLite (WAL mode, mmap, statement caching) before reaching for alternative backends.

18.5 Permissioned Data Performance

  • SetHash with XOR-SHA256 placeholder: O(1) add/remove, single-digit µs. ECMH/ltHash will be slower (curve ops 10–100 µs) but still O(1).
  • Oplog sync is O(ops since cursor), comparable to MST diff. The PK (space, rev, idx) makes range scans efficient.
  • Per-reader credential verification: HMAC + ECDSA both µs-scale; the dominant cost is the owner's DID doc fetch, mitigated by aggressive caching.
  • Cross-space aggregation ("show me all my permissioned posts across spaces") needs an index on (actor, space, repo_rev DESC) to be fast. Plan this from day one.

19. Build & Release Plan

  1. Phase 0 — Scaffold: New crates/atproto-pds/ and crates/atproto-space/ in atproto-crates. Wire to existing crates. Extend atproto-repo for Sync 1.1.
  2. Phase 1 — atproto-space crate: SetHash trait + XorSha256 impl, SpaceContext, Commit (HKDF+HMAC+ECDSA), SpaceRepo, SpaceMembers, MemberGrant, SpaceCredential, in-memory storage. Layer-1 unit tests including domain-separation tests.
  3. Phase 2 — Public realm read-only PDS: getRecord, listRecords, describeRepo, getRepo, getBlob, listBlobs, subscribeRepos (read-only). Validate against Bluesky firehose by replaying.
  4. Phase 3 — Account management & public writes: createAccount (PLC-managed), session/app password, createRecord/putRecord/deleteRecord/applyWrites, uploadBlob. Emit Sync 1.1 firehose.
  5. Phase 4 — OAuth provider: PAR/PKCE/DPoP, consent UI, permission sets. atproto-oauth-aip integration.
  6. Phase 5 — Account migration: getServiceAuth, importRepo, activateAccount, deactivateAccount, requestPlcOperationSignature, signPlcOperation, submitPlcOperation.
  7. Phase 6 — Moderation, admin, deployment: Admin endpoints, takedown, reports proxy, Docker image, install scripts.
  8. Phase 7 — Permissioned realm full implementation: SQLite-backed SpaceRepoStorage/SpaceMembersStorage, all com.atproto.space.* endpoints, notifyWrite/notifyMembership outbound + inbound, dual-auth middleware, space-scoped OAuth scopes, admin tooling for spaces. Layer-2 and Layer-3 tests including multi-PDS sync.
  9. Phase 8 — Production hardening: Switch SetHash to ECMH/ltHash when upstream settles; add OplogGap recovery telemetry; performance tuning; cross-implementation interop tests against the TS reference.

Conformance gating: at each phase, run against Bluesky's interop-test-files, bluesky-social/atproto-interop-tests, goat --verify flags, and atproto-tap (Sync 1.1 reference consumer). For permissioned data specifically, run interop tests against the TS @atproto/pds Spaces implementation as soon as it lands.


20. Closing Notes

The strategic value of atproto-pds is sharper now that we have the Spaces Design Spec in hand. Three things matter most:

  1. Per-actor SQLite as the default storage layout, matching the spec exactly. This is the single biggest decision — every other implementation decision flows from it. Spaces tables piggy-back on the per-actor DB, which is only architecturally clean if the per-actor DB exists in the first place.
  2. atproto-space as a sibling crate, mirroring @atproto/space. This keeps the protocol primitives (SetHash, Commit, SpaceRepo, SpaceMembers, MemberGrant, SpaceCredential) reusable outside the PDS — apps and AppViews will need them too. It also gives us a cleaner test boundary.
  3. Sync 1.1 + Spaces from day one, not as retrofits. The TS reference is building both in parallel; Rust has the chance to do likewise without the legacy weight.

The Spaces Design Spec is explicitly correctness-first, not production-first ("Not production-ready — focused on correctness and protocol exploration"). atproto-pds can be the production-ready Spaces implementation. To do that, track the upstream spec in lockstep, file issues against ambiguities (especially the seven open questions in §15.11), and pick conservative defaults that won't paint us into a corner when the spec firms up. The existing atproto-crates workspace is exceptionally well-positioned for this — identity, lexicon, DASL, OAuth, and PLC primitives are already strong; the PDS plus atproto-space are the integration layer on top.

Plan: atproto-pds Crate Introduction

Author: drafted by Nick Gerakines (May 1, 2026), distilled from two source documents now committed alongside this plan:

  • atproto-pds-design.md — Foundational Design Document (revision 2). The architectural north star: defines what a PDS is, the eight subsystems (account, repo, identity, firehose, sync, XRPC, OAuth, moderation), per-section permissioned-data hooks (§1.4, §2.8, §3.6, §4.5, §5.5, §6.5, §7.3, §8.7, §9.8, §10.6), the consolidated Spaces section (§15), the integration plan against existing atproto-crates (§17), the performance targets (§18), and the build & release plan (§19). Every "per spec" or "per design" reference in this plan resolves into this document.
  • atproto-pds-references.md — Reference Companion (Compass artifact). Section-for-section URL/file/quote citations backing the design document: AT Protocol specs, lexicon paths, reference implementations (TS @atproto/pds, indigo, rsky, tranquil-pds, cocoon), Daniel Holmgren's Permissioned Data Diary series, the Spring 2026 roadmap, and the existing atproto-crates workspace inventory. Use this when you need to verify an upstream claim or find the exact source URL for a quoted spec passage.

Target: new crates/atproto-pds (PDS server library + pds binary) and supporting crates/atproto-space (permissioned-data primitives) within the ngerakines.me/atproto-crates workspace. Scope: a low-latency, fully spec-conformant Rust Personal Data Server architected to support Sync 1.1 (proposal: bluesky-social/proposals/0006-sync-iteration, see references §6, §19) and the Spaces Design Spec (Daniel Holmgren's 2026-04-22-permissioned-data-pds-design.md, see references §15) from day zero. License: MIT (matches workspace).

How to read this plan: every section ends with cross-references in the form [design §N](atproto-pds-design.md), [references §N](atproto-pds-references.md). Treat the design document as the what and why, the references document as the where to verify, and this plan as the how and when.


0. TL;DR

We add two new workspace members and surgically extend six existing crates. The PDS shipping target is a pds binary that sits behind a Caddy/nginx reverse proxy and exposes the full com.atproto.{server,identity,repo,sync,admin,moderation,space}.* XRPC surface plus an OAuth 2.1 authorization server with PAR/PKCE/DPoP. Storage is profile-selected at compile time: per-actor SQLite is the default (matching the upstream Spaces Design Spec exactly, design §3.6, design §15.3), and fjall is a shipping alternative behind the fjall Cargo feature (§11). Implementation is staged across nine phases (§7) that each end at a runnable, testable milestone, never carrying half-finished features between phases.

The single most consequential decision is the storage profile: per-actor SQLite as default for spec-conformance, with fjall as the alternative for low-latency single-host deployments. Every other layering choice flows from it.


1. Two New Crates

1.1 crates/atproto-space

A protocol-primitives crate that mirrors the upstream @atproto/space TypeScript package (design §15.4, references §15 — @atproto/space package). Kept separate from atproto-pds so AppViews, syncing apps, and CLI tooling can depend on space mechanics without pulling in the server (design §20 closing notes #2).

crates/atproto-space/
  Cargo.toml
  README.md
  src/
    lib.rs               // re-exports
    errors.rs            // error-atproto-space-{domain}-{n}
    types.rs             // SpaceUri, SpaceType, SpaceKey, CommitScope, OplogEntry, OpAction
    set_hash.rs          // trait SetHash + XorSha256SetHash default impl
    commit.rs            // SpaceContext, Commit, create_commit, verify_commit
    space_repo.rs        // SpaceRepo<S: SpaceRepoStorage>
    space_members.rs     // SpaceMembers<S: SpaceMembersStorage>
    credential.rs        // MemberGrant + SpaceCredential JWT shapes; create/verify
    storage/
      mod.rs             // traits SpaceRepoStorage, SpaceMembersStorage
      memory.rs          // in-memory impls (testing only)

Public API (target):

pub trait SetHash: Sized + Clone + Send + Sync {
    fn empty() -> Self;
    fn add(&mut self, element: &[u8]);
    fn remove(&mut self, element: &[u8]);
    fn digest(&self) -> [u8; 32];
}
pub struct XorSha256SetHash([u8; 32]);

pub enum CommitScope { Records, Members }

pub struct SpaceContext {
    pub space_did: String,        // owner DID
    pub space_type: String,       // NSID
    pub space_key: String,
    pub user_did: String,         // committer
    pub scope: CommitScope,
    pub rev: String,              // TID
}

pub struct Commit {
    pub set_hash: [u8; 32],
    pub rev: String,
    pub ikm: [u8; 32],            // per-commit fresh, gives deniability
    pub tag: [u8; 32],            // HMAC-SHA-256
    pub sig: Vec<u8>,             // ECDSA over canonical bytes
}

pub fn create_commit(
    set_hash: &[u8; 32],
    context: &SpaceContext,
    keypair: &KeyData,            // re-uses atproto-identity::KeyData
) -> Result<Commit, SpaceError>;

pub fn verify_commit(
    context: &SpaceContext,
    commit: &Commit,
    pubkey: &KeyData,
) -> Result<(), SpaceError>;

pub struct SpaceRepo<S: SpaceRepoStorage> { /* … */ }
impl<S: SpaceRepoStorage> SpaceRepo<S> {
    pub async fn format_commit(&self, ops: &[Op]) -> Result<CommitData, SpaceError>;
    pub async fn apply_commit(&self, commit: CommitData) -> Result<(), SpaceError>;
    pub async fn get_record(&self, collection: &str, rkey: &str) -> Result<Option<Record>, SpaceError>;
    pub async fn list_records(&self, collection: &str, cursor: Option<&str>, limit: u32) -> Result<RecordPage, SpaceError>;
    pub async fn list_collections(&self) -> Result<Vec<String>, SpaceError>;
}

pub struct SpaceMembers<S: SpaceMembersStorage> { /* … */ }
impl<S: SpaceMembersStorage> SpaceMembers<S> {
    pub async fn format_commit(&self, ops: &[MemberOp]) -> Result<CommitData, SpaceError>;
    pub async fn apply_commit(&self, commit: CommitData) -> Result<(), SpaceError>;
    pub async fn list_members(&self, cursor: Option<&str>, limit: u32) -> Result<MemberPage, SpaceError>;
    pub async fn is_member(&self, did: &str) -> Result<bool, SpaceError>;
}

pub struct MemberGrant { /* … */ }
pub struct SpaceCredential { /* … */ }

pub fn create_member_grant(...) -> Result<String /* JWT */, SpaceError>;
pub fn verify_member_grant(jwt: &str, owner_did: &str, member_pubkey: &KeyData) -> Result<MemberGrant, SpaceError>;
pub fn create_space_credential(...) -> Result<String, SpaceError>;
pub fn verify_space_credential(jwt: &str, owner_pubkey: &KeyData) -> Result<SpaceCredential, SpaceError>;

Dependencies:

  • atproto-identity (key handling, DID parsing)
  • atproto-record (TID generation, AT-URI parsing)
  • atproto-dasl (DAG-CBOR for record value BLOB encoding/decoding)
  • hkdf, hmac, sha2, k256, p256, ecdsa, rand, serde, serde_bytes, thiserror, async-trait, tracing, chrono

Error namespace: error-atproto-space-<domain>-<n> per workspace convention.

Source anchors: design §15 (full Spaces section), design §13 — commit construction, references §15 — Spaces Design Spec, Diaries 1–4.

1.2 crates/atproto-pds

The PDS server library and pds binary.

crates/atproto-pds/
  Cargo.toml
  README.md
  src/
    lib.rs                        // crate-level docs, re-exports for embedders
    config.rs                     // PdsConfig (env > --config > /etc/atproto-pds/config.toml)
    errors.rs                     // error-atproto-pds-{domain}-{n}
    keys.rs                       // KeyStore impls (FileKeyStore default; HSM/KMS via cargo features). The trait itself lives in atproto-identity (per §2.5).
    realm.rs                      // Realm enum (Public, Space(SpaceUri))

    account/                      // §2 of design
      mod.rs
      manager.rs                  // AccountManager (creation, lifecycle)
      session.rs                  // app-password JWT sessions
      app_password.rs
      invite.rs
      email_token.rs
      plc_op_token.rs
      state.rs                    // active|deactivated|takendown|suspended|deleted

    actor_store/                  // per-account store: SQLite (default) or fjall (alt)
      mod.rs                      // ActorStore trait + profile dispatch
      transactor.rs               // ActorTransactor: open tx, mutate, commit
      sql/                        // feature = "sqlite"
        mod.rs
        schema/
          public.sql              // accounts piece of the account DB
          repo.sql                // public repo MST blocks, commits
          sequencer.sql           // outbox events
          space.sql               // 8 Spaces tables (per spec)
          migrations/
        migrations.rs             // migration runner
        sql_repo_storage.rs       // BlockStorage impl over SQLite for public repo
        sql_space_repo_storage.rs // SpaceRepoStorage impl
        sql_space_members_storage.rs // SpaceMembersStorage impl
      fjall/                      // feature = "fjall" — Ramjet-style keyspace layout (§11)
        mod.rs
        keyspaces.rs              // 8 keyspaces: repo_blocks, repo_records, outbox, space_records, space_repo, space_members, space_member_oplog, space_record_oplog
        keys.rs                   // binary key encoders (null-byte hierarchies)
        batch_writer.rs           // mpsc → WriteBatch, default 500-record / 100ms batches
        fjall_repo_storage.rs     // BlockStorage impl
        fjall_space_repo_storage.rs
        fjall_space_members_storage.rs

    repo/                         // public realm
      mod.rs
      writer.rs                   // single-writer-per-DID mutate path
      reader.rs                   // getRecord, listRecords, getRepo, getBlocks
      blob.rs                     // blob upload, GC, ref counting
      import.rs                   // importRepo (CAR ingest, inductive verify)
      sync_v1_1.rs                // prevData, #sync event helpers

    sequencer/                    // §6
      mod.rs
      outbox.rs                   // durable seq table
      broadcaster.rs              // tokio broadcast channel
      subscribe_repos.rs          // WS handler for #commit/#sync/#identity/#account
      crawler.rs                  // PDS_CRAWLERS requestCrawl fan-out

    space/                        // §15 — wires atproto-space into HTTP
      mod.rs
      service.rs                  // SpaceService: createSpace, addMember, etc.
      writer.rs                   // permissioned record writes (single-writer per (DID, space))
      reader.rs                   // getRecord/listRecords (dual auth)
      sync.rs                     // getRepoState/getRepoOplog/getMemberState/getMemberOplog
      credential.rs               // mint MemberGrant / SpaceCredential, register recipients
      notifier.rs                 // outbound notifyWrite/notifyMembership w/ bounded retry + DLQ
      receiver.rs                 // inbound notifyWrite/notifyMembership handlers
      takedown.rs                 // admin-bypass record suppression

    identity/                     // §5
      mod.rs
      plc.rs                      // request/sign/submit PLC ops, recommended creds
      didweb.rs                   // host did.json, accept BYO did:web
      handle.rs                   // PDS-managed handle subdomains; verify-then-assign

    oauth/                        // §9
      mod.rs                      // glues atproto-oauth + atproto-oauth-axum
      par.rs                      // PAR endpoint
      authorize.rs                // /oauth/authorize (consent UI)
      token.rs                    // /oauth/token, refresh rotation
      revoke.rs
      jwks.rs
      metadata.rs                 // /.well-known/oauth-{authorization-server,protected-resource}
      consent_ui/                 // Askama templates
      permission_set.rs           // resolve include:<nsid> permission sets, swr cache

    xrpc/                         // §8 — axum routers and handlers
      mod.rs
      auth_extractor.rs           // unified verifier dispatching by JWT typ
      rate_limit.rs               // tower-http layer + Valkey backend (optional)
      handlers/
        server/                   // com.atproto.server.*
        identity/                 // com.atproto.identity.*
        repo/                     // com.atproto.repo.*
        sync/                     // com.atproto.sync.*
        admin/                    // com.atproto.admin.*
        moderation/               // com.atproto.moderation.* (proxy)
        space/                    // com.atproto.space.* (record CRUD, sync, mgmt, notify)
      proxy.rs                    // Atproto-Proxy header handling

    mailer/                       // §11
      mod.rs                      // Mailer trait
      smtp.rs                     // lettre impl
      templates/                  // Askama templates per email type

    blobstore/                    // §3 (blob bytes only — block store is in actor_store)
      mod.rs                      // BlobStore trait
      disk.rs
      s3.rs                       // optional feature

    admin/                        // §10
      mod.rs
      takedown.rs
      reports.rs                  // forward to MODERATION_SERVICE
      ui/                         // minimal HTML dashboard (Askama)

    metrics.rs                    // Prometheus, /metrics
    health.rs                     // /xrpc/_health
    tracing.rs                    // OTel exporter wiring

    bin/
      pds.rs                      // the `pds` binary (server)
      atproto-pds-admin.rs        // admin CLI (separate binary per D6/G17)

The pds binary is the production target. It:

  1. Reads config (env > --config > /etc/atproto-pds/config.toml).
  2. Initializes Mailer, BlobStore, KeyStore, BlockStore (rooted at the per-actor SQLite directory, or at the fjall keyspace directory under the fjall profile), Sequencer, IdentityCache, LexiconCache, RateLimiter, Notifier (Spaces).
  3. Builds the axum router: /xrpc/* (including the spec's /xrpc/_health), /oauth/*, /.well-known/*, the /_alive and /_ready probes (D15), and /metrics (separate listener, default port 2471).
  4. Spawns Sequencer flush loop, Spaces NotifyWrite worker pool, blob GC scheduler.
  5. Listens on the configured HTTP port; expects to be fronted by Caddy/nginx with TLS termination and WebSocket upgrade.

Dependencies (additions to workspace):

  • New: sqlx (with the sqlite and postgres features, plus runtime-tokio and tls-rustls), fjall (feature-gated alternative storage profile, see §11), lettre, askama, argon2, hkdf, hmac, tower, tower-http, prometheus, opentelemetry, opentelemetry-otlp (optional), aws-sdk-s3 (feature-gated), dashmap.
  • WebSockets: tokio-websockets (already a workspace dependency, used by atproto-jetstream and atproto-tap). The subscribeRepos firehose handler binds to it for consistency with the rest of the workspace; we do not add tokio-tungstenite even though it's the more common axum pairing — sharing one WebSocket stack across the workspace simplifies fuzzing, dependency auditing, and TLS provider configuration.
  • Compression: zstd (already a workspace dependency for atproto-jetstream). We reuse atproto-jetstream's zstd-dictionary handling for the firehose outbox compression path, mirroring Ramjet's pattern (see §11).
  • Existing workspace re-uses: atproto-{dasl,repo,record,identity,lexicon,oauth,oauth-aip,oauth-axum,xrpcs,client,attestation}, atproto-space, axum, tokio, serde, chrono, secrecy, thiserror, anyhow, tracing, reqwest, k256, p256, sha2, clap.

Features:

  • default = ["sqlite", "smtp", "metrics", "hickory-dns"]
  • sqlite — per-actor SQLite via sqlx (default; spec-conformance profile)
  • fjall — fjall-backed actor store (Ramjet-style partition layout; see §11). Opt-in compile: install with cargo install atproto-pds --no-default-features --features fjall,smtp,metrics,hickory-dns.
  • postgres — Postgres backend for the accounts DB only (per-actor stores stay SQLite or fjall regardless; see §5.2)
  • s3 — S3 blob storage
  • valkey — Redis/Valkey-backed rate limiter, JTI replay filter, and denylist cuckoo filter
  • otel — OpenTelemetry tracing exporter
  • clap — required for the pds and atproto-pds-admin binaries

The sqlite and fjall features are mutually exclusive at compile time. The default build is SQLite-only; fjall users build a separate binary. Both backends are tested in CI by running the test matrix twice (one binary built per profile). This is operationally simpler than a runtime switch with both backends linked: smaller binary, no dead code paths, and the per-deployment decision is locked at build time.
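One way to enforce the exclusivity is a compile-time guard in lib.rs, paired with a small helper that reports which profile the binary was built with. The sketch below is only an illustration: the feature names sqlite and fjall come from this plan, while compiled_storage_profile and check_profile are hypothetical helpers, not existing workspace APIs.

```rust
// Sketch only. `sqlite` and `fjall` are the Cargo features named in this plan;
// the two helper functions are hypothetical.

// Refuse to build when both storage profiles are enabled at once.
#[cfg(all(feature = "sqlite", feature = "fjall"))]
compile_error!("features `sqlite` and `fjall` are mutually exclusive; enable exactly one");

/// Name of the storage profile compiled into this build, if any.
pub fn compiled_storage_profile() -> Option<&'static str> {
    if cfg!(feature = "fjall") {
        Some("fjall")
    } else if cfg!(feature = "sqlite") {
        Some("sqlite")
    } else {
        None
    }
}

/// Startup check: the configured profile must match the compiled one.
pub fn check_profile(configured: &str) -> Result<(), String> {
    match compiled_storage_profile() {
        Some(p) if p == configured => Ok(()),
        Some(p) => Err(format!(
            "configured profile `{configured}` but binary was built with `{p}`"
        )),
        None => Err("binary was built without a storage profile".to_string()),
    }
}
```

With this shape, a profile mismatch surfaces at startup rather than on the first storage call, which is the fail-fast behaviour §5.1 asks of PDS_STORAGE_PROFILE.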

Error namespace: error-atproto-pds-<domain>-<n> (domains: account, repo, sync, space, oauth, admin, auth, config, storage, notify).

Source anchors: design §1 (overview), design §17.4 (net-new functionality), references §1 (architectural overview), references §17 (existing crates inventory).


2. Required Extensions to Existing Crates

These changes are surgical and additive — none break public API. Each is anchored to design §17.3 ("Crates Needing Extension") and the corresponding references-doc citation for the upstream reason.

2.1 atproto-repo (Sync 1.1 conformance — design §6.3, references §6)

| Change | Why |
| --- | --- |
| Add prev_data: Option<Cid> to Commit and UnsignedCommit (crates/atproto-repo/src/repo/commit.rs:44) | Sync 1.1 inductive verification: relays no longer need archival storage. (references §6: Sync 1.1 proposal; references §6: Relay v1.1 rollout) |
| Implement MstDiff::ops_with_prev_cids() returning Vec<RepoOp { action, path, cid, prev: Option<Cid> }> | Required for #commit payload Sync 1.1 conformance and for applyWrites semantics. (design §6.3) |
| Streaming CarWriter extension that takes a Stream<Item = CarBlock> for repo export | Avoid OOM on multi-GB exports. (May already exist; verify by reading carwriter code.) (design §7.2) |
| Inductive verify entry point verify_inductive(prev_data: Cid, blocks: &[CarBlock]) -> Result<Cid> | Used by importRepo and by relay-side validation flows in tests. (references §16: Conformance Gaps; "Inductive verification on inbound CAR") |

2.2 atproto-dasl (no required change; optional)

The existing BlockStorage trait, MemoryStorage, DiskStorage, and SpillableBuffer are sufficient for the public-repo layer. We provide a SQL-backed BlockStorage impl inside atproto-pds::actor_store::sql::sql_repo_storage. No change to atproto-dasl is required for Phases 0–8.

2.3 atproto-xrpcs (auth dispatch + DPoP — design §8.3, design §15.10)

| Change | Why |
| --- | --- |
| New extractor variants in authorization.rs keyed off JWT typ header: OAuthAccess, AppPassword, ServiceAuth, MemberGrant, SpaceCredential | The PDS auth verifier has up to five token shapes (design §8.3, design §15.10). |
| Add a DPoP middleware that validates the DPoP header (RFC 9449) including server-issued nonce rotation | Currently only Bearer / service auth is supported. (references §9: RFC 9449; references §16: DPoP nonce edge cases) |
| Add lxm enforcement for service-auth JWTs | Strict spec; cocoon and TS PDS already require it. (design §2.6) |
| Provide a realm_scope helper that maps an AuthContext to "may write to (account, realm)" decisions | Centralizes the public/Space/admin authorization logic. (design §1.4) |

2.4 atproto-oauth & atproto-oauth-axum (server-side completeness — design §9, references §9)

| Change | Why |
| --- | --- |
| Server-side flows: PAR endpoint handler, /authorize endpoint, /token (with single-use refresh rotation), /revoke | Today, atproto-oauth-aip covers the client side of the AIP flow; the server side has primitives but no full handlers. (references §9: TS oauth-provider; references §9: AIP) |
| Permission-set resolver with stale-while-revalidate cache (24h stale, 90d expiry) | Required by atproto.com/specs/permission (references §15: Permission spec). |
| Add space-typed scope variants: space:<spaceType>?action=read\|write\|manage, rpc:com.atproto.space.getMemberGrant?aud=<owner> (in scopes.rs) | Provisional names per design §9.8; we propose upstream simultaneously. |
| Consent UI templates (Askama) callable from atproto-pds::oauth::authorize | The PDS hosts the authorization UI (design §9.9). |
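To make the provisional space scope concrete, here is a minimal parser sketch for the space:<spaceType>?action=... form. It assumes, by analogy with other query-style scopes, that several actions are expressed by repeating the action parameter. The type and function names (SpaceAction, SpaceScope, parse_space_scope) are hypothetical, and the scope syntax itself may change upstream.

```rust
// Sketch only: the scope syntax is provisional (design §9.8) and every name
// below is hypothetical. Repeating `action=` for several actions is an assumption.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SpaceAction {
    Read,
    Write,
    Manage,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpaceScope {
    pub space_type: String, // NSID of the space type
    pub actions: Vec<SpaceAction>,
}

pub fn parse_space_scope(scope: &str) -> Result<SpaceScope, String> {
    let rest = scope.strip_prefix("space:").ok_or("not a space scope")?;
    let (space_type, query) = rest.split_once('?').ok_or("missing ?action= query")?;
    // Cheap shape check only; full NSID validation belongs to the record crate.
    if space_type.is_empty() || !space_type.contains('.') {
        return Err(format!("`{space_type}` does not look like an NSID"));
    }
    let mut actions = Vec::new();
    for pair in query.split('&') {
        let (key, value) = pair.split_once('=').ok_or("malformed query pair")?;
        if key != "action" {
            return Err(format!("unknown parameter `{key}`"));
        }
        let action = match value {
            "read" => SpaceAction::Read,
            "write" => SpaceAction::Write,
            "manage" => SpaceAction::Manage,
            other => return Err(format!("unknown action `{other}`")),
        };
        // Keep first-seen order and drop duplicates.
        if !actions.contains(&action) {
            actions.push(action);
        }
    }
    Ok(SpaceScope { space_type: space_type.to_string(), actions })
}
```

Rejecting unknown parameters and actions outright, rather than ignoring them, keeps a consent screen from silently granting less or more than the client asked for.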

2.5 atproto-identity (key store abstraction — design §5.4, design §17.3)

| Change | Why |
| --- | --- |
| New KeyStore trait with FileKeyStore default + room for HSM/KMS backends | The PDS holds multiple long-lived secrets (PLC rotation key, per-account signing keys); pluggable storage is non-negotiable for production (design §5.4). |
| Per-account signing-key generator helper | One key per account; needs to be generated on createAccount (design §2.1). |

2.6 atproto-lexicon (built-in catalog for com.atproto.space.*; design §14)

| Change | Why |
| --- | --- |
| Bundle the new com.atproto.space.* lexicon JSONs once published; allow override via config so we can ship before upstream merges | Authoritative validation of permissioned writes (design §4.5, design §15.8). |
| Add a "trust authority pin" mode to the resolver | Production PDSes may not want to honor lexicon updates from arbitrary DIDs without admin approval (references §14: Lexicon resolution mechanics). |

2.7 atproto-plc (separately published; verify only — references §5)

Confirm tombstone op support (design §5.1); if missing, add it. No other changes.


3. Decision Log (load-bearing choices)

Each row's "Rationale" column gives the design or references anchor.

| # | Decision | Rationale (anchored) | Reversibility |
| --- | --- | --- | --- |
| D1 | Per-actor SQLite is the default storage profile for both public-repo state and the eight Spaces tables; fjall is a co-equal shipping alternative (see D13 and §11). | SQLite matches Spaces Design Spec exactly (actor-store/space/sql-repo-storage.ts), which is non-negotiable for v0.15.0 spec-conformance positioning. Fjall is shipped alongside, behind a per-deployment profile, because Ramjet (§11) demonstrates the keyspace patterns work at firehose scale. Cheap takedown, trivial backup. design §3.3, design §15.3, design §20 #1, references §3 | Hard: per-deployment profile choice should be considered final at install time; cross-profile migration is custom tooling |
| D2 | atproto-space as a separate crate rather than a module of atproto-pds | Apps and AppViews need the primitives; cleaner test boundary; mirrors @atproto/space. design §15.4, design §20 #2, references §15: @atproto/space | Easy: internal-only refactor |
| D3 | Two SetHash impls behind a trait: XorSha256SetHash (placeholder, matches spec default) and EcmhSetHash via the first-party ecmh-rs crate (production target). | Spec explicitly flags ECMH or ltHash as future replacement; ecmh-rs already implements ECMH per Maitin-Shepard et al. on Ristretto255/P-256/K-256, MIT/Apache-2.0, authored by Nick. Shipping both from Phase 1 lets us do interop today and flip the default after upstream settles. See §10 below. design §3.6, design §15.5, design §15.11 | Easy: single crate change |
| D4 | OAuth 2.1 server: full handlers in atproto-pds::oauth reusing atproto-oauth primitives, not in a separate crate | The handlers are PDS-shaped (need actor store, key store, consent UI); splitting them to a new crate would force ~30 dependency arrows. design §9.9, design §17.3 | Medium |
| D5 | axum HTTP framework, tokio runtime, sqlx SQL access | Aligns with atproto-xrpcs and tranquil-pds; sqlx has compile-time query check that catches schema drift early. design §17.5, design §18, references §16: tranquil-pds, references §18 | Hard once schemas land |
| D6 | Two binaries: pds (server) and atproto-pds-admin (admin CLI), separate rather than a subcommand. | Keeps server process simple; admin operations need not be co-deployed; matches bluesky-social/pds's pdsadmin.sh operational model and lets admin be installed without the server binary on ops workstations (G17 resolution). design §17.4 #14 | Easy |
| D7 | argon2id for password hashing | Modern recommendation; tranquil-pds already uses it; TS PDS still on scrypt. design §2.4 | Easy on new accounts, harder on migration |
| D8 | Default OAuth access token TTL 15 min, refresh single-use rotation, 30 days | Matches reference TS PDS. design §9.5 | Trivial config |
| D9 | Default SpaceCredential TTL 3h (within spec's 2–4h window) | Median value within spec range. design §15.7, design §15.11 | Trivial config |
| D10 | NotifyWrite delivery: 5 attempts, exponential backoff 1/2/4/8/16s, then DLQ | Spec is silent; these are conservative defaults. design §6.5, design §15.11 | Trivial config |
| D11 | Strict lexicon validation by default, lenient mode opt-in via config | Adopt cocoon/TS-PDS strictest reasonable stance. design §4.2, design §16 closing notes | Trivial config |
| D12 | Firehose retention forever by default with admin compaction | Matches TS PDS; cocoon and tranquil differ; we side with the largest deployment. design §6.2 | Easy to flip |
| D13 | Fjall is a first-class shipping backend, behind a fjall Cargo feature, not excluded. | Ramjet has battle-tested fjall on atproto-shaped workloads (firehose ingestion, versioned records by rev, DID-keyed identity cache, batch writer with WriteBatch atomicity). Single binary, no schema migrations, pure Rust. Adopting the Ramjet keyspace layout (records, events, meta, repo_state, did_to_doc, handle_to_did, blobs, blob_meta) gives us a tested map to follow. SQLite remains the spec-conformance default for D1, but operators who want low-latency single-host deployments can pick fjall at install time. See §11. design §3.3, design §18.4, references §18: Fjall | Per-deployment choice; both backends maintained in parallel. |
| D14 | tokio-websockets for the subscribeRepos firehose, not tokio-tungstenite as the design doc recommends | tokio-websockets is already the workspace WebSocket stack (used by atproto-jetstream and atproto-tap). Sharing one stack across the workspace simplifies fuzzing surface, dependency audit, TLS provider configuration, and version churn. The design doc's §17.5 recommendation of tokio-tungstenite is the more common axum pairing but predates the workspace's standardization on tokio-websockets. | Easy: single integration crate change |
| D15 | Split health probes: /_alive, /_ready, plus spec /xrpc/_health | Lifted from lexicon-garden. Matches k8s convention; lets ops decide between restart and load-balancer-removal responses. See §12.5. | Trivial: three handlers |
| D16 | Graceful shutdown via CancellationToken + TaskTracker is mandatory | A PDS holds open WebSockets and background workers. Without coordinated shutdown, restarts drop firehose events between commit and broadcast. Pattern lifted from lexicon-garden. See §12.6. | Hard to retrofit; bake in from Phase 0 |
| D17 | build.rs exposes BUILD_REV | Used in User-Agent, _health version, OAuth ETag seed. Cache busting and operational debuggability. Pattern from lexicon-garden. See §12.8. | Trivial: single build.rs |
| D18 | Strict-validate-on-load configuration with all errors collected, not failed-on-first | Avoids the "fix one var, restart, hit the next" cycle in production deploys. Pattern from lexicon-garden. See §12.10. | Trivial: ConfigError enum shape |

Each decision corresponds to a paragraph of justification in atproto-pds-design.md. Open spec questions are tracked as GitHub issues created in Phase 0 (see §14 below); their interim defaults match D8–D12 and are listed in design §15.11.
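As an illustration of the D10 defaults, the following sketch computes the next step after a failed notifyWrite delivery. It reads "5 attempts" as five retries after the initial send, which reproduces the 1/2/4/8/16 s series; under a stricter reading that counts the initial send, the 16 s step would drop out. RetryPolicy, NextStep, and after_failure are hypothetical names.

```rust
use std::time::Duration;

// Sketch only: names are hypothetical; the two fields mirror
// PDS_SPACE_NOTIFY_RETRY_MAX_ATTEMPTS and PDS_SPACE_NOTIFY_RETRY_INITIAL_BACKOFF_MS.

#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    /// Schedule another delivery after this delay.
    RetryAfter(Duration),
    /// Give up and move the notification to the dead-letter queue.
    DeadLetter,
}

pub struct RetryPolicy {
    pub max_attempts: u32,
    pub initial_backoff: Duration,
}

impl RetryPolicy {
    /// `failures` is the number of failed deliveries so far (1 after the first failure).
    pub fn after_failure(&self, failures: u32) -> NextStep {
        if failures > self.max_attempts {
            return NextStep::DeadLetter;
        }
        // Delay doubles with each failure: initial * 2^(failures - 1).
        let exponent = failures.saturating_sub(1);
        let factor = 1u32.checked_shl(exponent).unwrap_or(u32::MAX);
        NextStep::RetryAfter(self.initial_backoff.saturating_mul(factor))
    }
}
```

Because the function is pure, the notifier can persist only the failure count and the computed next-attempt time, and the schedule can be unit-tested without a clock.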


4. Module-Level Plan: atproto-space

Source anchors throughout this section: design §15.4–§15.7 for the public API, design §13 for cryptographic construction, references §15 for upstream context.

4.1 SetHash (set_hash.rs) — design §3.6, design §15.5

  • trait SetHash per §1.1 above.
  • Two concrete impls shipped from Phase 1 (decision D3, motivated in §10):
    • XorSha256SetHash([u8; 32]): add and remove both XOR sha256(element) into the accumulator (XOR is self-inverse → add+remove cancels). Matches the spec's current placeholder; chosen as default for upstream interop until ECMH or ltHash is settled.
    • EcmhSetHash<B>: thin adapter over the first-party ecmh-rs crate (RistrettoEcmh for the strongest variant; P256Ecmh / K256Ecmh available via features for parity with atproto signing curves). ecmh-rs implements ECMH per Maitin-Shepard et al. (2016) and exposes insert/remove/to_bytes plus +/- operator overloads — a direct fit. The adapter normalizes the element format ({collection}/{rkey}:{cid} for records, did for members) and the digest format (32 or 33 bytes per backend).
  • Element format helpers in commit.rs (the same byte form is hashed regardless of which SetHash impl is in use, so the abstraction holds):
    • records: format!("{collection}/{rkey}:{cid}") UTF-8 bytes
    • members: did.as_bytes()
  • Property tests run against both impls so neither regresses: order-independence (a sequence of N adds has the same digest after any permutation), add+remove inverse, add of (k1) then (k2) then remove of (k1) equals just (k2). A cross-impl test confirms the abstraction is order-independent for both — the digest values won't match between impls (they're different primitives), but the algebraic invariants must.
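The algebraic invariants listed above can be seen in a small XOR accumulator. This is a sketch under a loud assumption: element_hash is a non-cryptographic stand-in built on the standard library hasher so the example needs no external crates; the real XorSha256SetHash would call SHA-256 there instead. XorSetHash, record_element, and member_element are illustrative names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

pub trait SetHash: Sized + Clone + Send + Sync {
    fn empty() -> Self;
    fn add(&mut self, element: &[u8]);
    fn remove(&mut self, element: &[u8]);
    fn digest(&self) -> [u8; 32];
}

/// STAND-IN for SHA-256 (assumption for this sketch). NOT cryptographic.
fn element_hash(element: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, chunk) in out.chunks_mut(8).enumerate() {
        let mut h = DefaultHasher::new();
        (i as u64).hash(&mut h); // vary the input per 8-byte lane
        element.hash(&mut h);
        chunk.copy_from_slice(&h.finish().to_le_bytes());
    }
    out
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct XorSetHash([u8; 32]);

impl XorSetHash {
    /// XOR the element hash into the accumulator; used by both add and remove.
    fn toggle(&mut self, element: &[u8]) {
        for (acc, byte) in self.0.iter_mut().zip(element_hash(element)) {
            *acc ^= byte;
        }
    }
}

impl SetHash for XorSetHash {
    fn empty() -> Self { XorSetHash([0u8; 32]) }
    fn add(&mut self, element: &[u8]) { self.toggle(element) }
    fn remove(&mut self, element: &[u8]) { self.toggle(element) }
    fn digest(&self) -> [u8; 32] { self.0 }
}

/// Element bytes for a record: "{collection}/{rkey}:{cid}".
pub fn record_element(collection: &str, rkey: &str, cid: &str) -> Vec<u8> {
    format!("{collection}/{rkey}:{cid}").into_bytes()
}

/// Element bytes for a member: the DID itself.
pub fn member_element(did: &str) -> Vec<u8> {
    did.as_bytes().to_vec()
}
```

One consequence of XOR being self-inverse is that adding the same element twice also cancels, so callers have to keep the underlying set free of duplicates; a property test for that case is worth adding alongside the three invariants.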

4.2 Commits (commit.rs) — design §13, design §15.6

The construction (per design §13):

ikm   := random(32 bytes)                              // per-commit
hkdf_info := DAG-CBOR(SpaceContext)
hmac_key  := HKDF-Extract-then-Expand(SHA256, ikm, info=hkdf_info, len=32)
tag       := HMAC-SHA-256(hmac_key, set_hash || rev)
sig_bytes := tag || rev.as_bytes()
sig       := ECDSA-low-S(user_signing_key, SHA256(sig_bytes))

Commit { set_hash, rev, ikm, tag, sig } is what is persisted (and what notifyWrite carries a reference to).

verify_commit(context, commit, pubkey) recomputes hmac_key from commit.ikm + DAG-CBOR(context), recomputes tag over set_hash || rev, asserts equality with commit.tag, then verifies the ECDSA signature over SHA256(tag || rev).

Critical test: a commit with scope = Records MUST fail verification when the verifier passes scope = Members. Domain separation is the entire reason scope exists in SpaceContext (design §15.6, design §15.12 Layer 1 — domain separation).

4.3 SpaceRepo and SpaceMembers — design §15.4

These are thin orchestrators over the storage trait — they handle:

  • TID generation for new revs (re-use atproto-record::tid)
  • Per-element SetHash add/remove
  • Building OplogEntry rows with monotonic idx per batch
  • format_commit → in-memory pending commit; apply_commit → persist via storage tx

4.4 Storage traits — design §3.6

#[async_trait]
pub trait SpaceRepoStorage: Send + Sync {
    async fn current_state(&self, space: &SpaceUri) -> Result<RepoState, SpaceError>;
    async fn get_record(&self, space: &SpaceUri, collection: &str, rkey: &str) -> Result<Option<RecordRow>, SpaceError>;
    async fn list_records(&self, space: &SpaceUri, collection: &str, cursor: Option<&str>, limit: u32) -> Result<RecordPage, SpaceError>;
    async fn list_collections(&self, space: &SpaceUri) -> Result<Vec<String>, SpaceError>;
    async fn apply_commit(&self, space: &SpaceUri, commit: PreparedCommit) -> Result<(), SpaceError>;
    async fn read_oplog(&self, space: &SpaceUri, since: Option<&str>, limit: u32) -> Result<OplogPage, SpaceError>;
}

#[async_trait]
pub trait SpaceMembersStorage: Send + Sync { /* analogous; member-list scoped */ }

In-memory impls live in atproto_space::storage::memory for testing and for AppView consumers that don't persist (e.g., a sync-time index).

4.5 Credentials (credential.rs) — design §15.7

JWT shapes per design §15.7. Signing uses KeyData from atproto-identity so the same crate handles ES256K and ES256.

MemberGrant.iat/exp enforce a 5-minute TTL; SpaceCredential.iat/exp use the configured TTL (default 3h, per design §15.11). Verification resolves the issuer's DID document via a caller-supplied resolver closure (so this crate has no direct network deps).
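The iat/exp rules can be captured in one small check shared by both token types. The sketch below is an assumption-laden illustration: check_window, WindowError, and the leeway parameter are hypothetical, and signature verification is out of scope here.

```rust
// Sketch only: names and the clock-skew leeway are assumptions, not spec text.

/// MemberGrant lifetime: 5 minutes.
pub const MEMBER_GRANT_TTL_SECS: u64 = 5 * 60;
/// Default SpaceCredential lifetime: 3 hours (PDS_SPACE_CREDENTIAL_TTL_SECONDS).
pub const DEFAULT_SPACE_CREDENTIAL_TTL_SECS: u64 = 3 * 60 * 60;

#[derive(Debug, PartialEq, Eq)]
pub enum WindowError {
    Malformed,   // exp is not after iat
    TtlTooLong,  // exp - iat exceeds the allowed lifetime
    NotYetValid, // iat is in the future beyond the leeway
    Expired,     // now is at or past exp
}

/// All times are Unix seconds.
pub fn check_window(
    iat: u64,
    exp: u64,
    now: u64,
    max_ttl_secs: u64,
    leeway_secs: u64,
) -> Result<(), WindowError> {
    if exp <= iat {
        return Err(WindowError::Malformed);
    }
    if exp - iat > max_ttl_secs {
        return Err(WindowError::TtlTooLong);
    }
    if iat > now.saturating_add(leeway_secs) {
        return Err(WindowError::NotYetValid);
    }
    if now >= exp {
        return Err(WindowError::Expired);
    }
    Ok(())
}
```

Checking the declared lifetime (exp minus iat) separately from expiry means an issuer cannot mint a long-lived grant simply by setting a distant exp.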


5. Module-Level Plan: atproto-pds

Source anchors: each subsystem maps directly to a section of atproto-pds-design.md, as called out per subsection below. Refer to references §17 — atproto-crates workspace for the existing-crate inventory.

5.1 Configuration (config.rs) — design §12.1

Precedence: env > --config <path> > /etc/atproto-pds/config.toml > built-in defaults.
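The precedence chain amounts to a first-hit lookup over ordered layers. A minimal sketch, assuming each source has already been flattened into a string map (Layer and resolve are hypothetical names):

```rust
use std::collections::HashMap;

// Sketch only: `Layer` and `resolve` are hypothetical. Each source (environment,
// --config file, /etc/atproto-pds/config.toml, built-in defaults) is assumed to be
// flattened into a key/value map before lookup.

pub struct Layer<'a> {
    pub name: &'static str,
    pub values: &'a HashMap<String, String>,
}

/// Returns the winning layer name and value; `layers` is ordered highest priority first.
pub fn resolve(key: &str, layers: &[Layer<'_>]) -> Option<(&'static str, String)> {
    layers
        .iter()
        .find_map(|layer| layer.values.get(key).map(|v| (layer.name, v.clone())))
}
```

Returning the layer name along with the value makes it cheap to log where each effective setting came from at startup.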

Required:

  • PDS_HOSTNAME, PDS_SERVICE_DID (default did:web:<hostname>)
  • PDS_STORAGE_PROFILE (default sqlite; options sqlite | fjall. Must match the compiled-in storage feature; mismatch → fail-fast with ConfigError::StorageProfileMismatch.)
  • PDS_DATA_DIRECTORY (root for per-actor SQLite files OR fjall keyspace dir, depending on profile), PDS_BLOB_STORE_* (disk path or S3 config)
  • PLC rotation key: exactly one of PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX or PDS_PLC_ROTATION_KEY_P256_PRIVATE_KEY_HEX must be set (G14). Both set → ConfigError::AmbiguousRotationKey. Neither set → ConfigError::MissingRotationKey. The chosen curve becomes the algorithm for all PDS-managed PLC ops.
  • PDS_JWT_SECRET (HMAC for app-password JWTs)
  • PDS_OAUTH_KEYS_JWK_SET (G13: a JSON array of JWKs ordered oldest-to-newest; the last entry is the active signer for new tokens; all are valid for verifying outstanding tokens. Rotation = append a new key, then later remove the oldest after all tokens signed by it have expired. ES256 or ES256K.)
  • PDS_DID_PLC_URL (default https://plc.directory)
  • PDS_BSKY_APP_VIEW_URL, PDS_BSKY_APP_VIEW_DID (G25: used by Atproto-Proxy routing — incoming app.bsky.* XRPC requests with the proxy header are forwarded to PDS_BSKY_APP_VIEW_URL with a service-auth JWT signed by the user's signing key, aud=PDS_BSKY_APP_VIEW_DID, lxm=<requested-NSID>)
  • PDS_REPORT_SERVICE_URL, PDS_REPORT_SERVICE_DID
  • PDS_CRAWLERS (comma-separated relay URLs)
  • PDS_INVITE_REQUIRED (bool), PDS_INVITE_INTERVAL
  • PDS_EMAIL_SMTP_URL, PDS_EMAIL_FROM_ADDRESS
  • PDS_ADMIN_PASSWORD
  • PDS_SERVICE_HANDLE_DOMAINS (e.g., .pds.example.com)
  • PDS_SPACE_CREDENTIAL_TTL_SECONDS (default 10800)
  • PDS_SPACE_OPLOG_RETENTION_DAYS (default unlimited)
  • PDS_SPACE_NOTIFY_RETRY_MAX_ATTEMPTS (default 5)
  • PDS_SPACE_NOTIFY_RETRY_INITIAL_BACKOFF_MS (default 1000)
  • PDS_NOTIFY_MEMBERSHIP_EMAIL (default false — opt-in user-facing email on notifyMembership ingress, G22)
  • PDS_OAUTH_ACCESS_TOKEN_TTL_SECONDS (default 900)
  • PDS_OAUTH_REFRESH_TOKEN_TTL_SECONDS (default 2592000 = 30d)

Conditional (when feature valkey is enabled, G24):

  • PDS_VALKEY_URL — connection string for Valkey/Redis. Required if compiled with valkey. Used for the JTI replay filter (§12.2), denylist cuckoo filter (§12.3), and sliding-window rate limiter (§12.4). Without valkey, all three fall back to in-memory bounded-LRU implementations with the same semantics but per-process state.
  • PDS_VALKEY_KEY_PREFIX (default atproto-pds:) — namespace for all PDS keys, lets multiple PDS instances share a Valkey instance without colliding.

Validated up-front at startup with a single ConfigError thiserror enum that collects every issue before failing — no "fix one var, restart, hit the next" cycle. Pattern lifted from lexicon-garden per §12.10. Inspiration: tranquil-pds and cocoon configuration patterns (references §12).
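A collect-everything validator could look roughly like the following. It is a std-only sketch: the variant names AmbiguousRotationKey, MissingRotationKey, and StorageProfileMismatch come from this plan, while ConfigIssue, MissingVar, and validate are hypothetical, and the real type would derive thiserror::Error. Only a handful of the variables from the list above are checked.

```rust
use std::collections::HashMap;

// Sketch only: plain std instead of thiserror; `ConfigIssue`, `MissingVar` and
// `validate` are hypothetical, and only a subset of variables is checked.

#[derive(Debug, PartialEq, Eq)]
pub enum ConfigIssue {
    MissingVar(&'static str),
    AmbiguousRotationKey,
    MissingRotationKey,
    StorageProfileMismatch { configured: String, compiled: &'static str },
}

/// Every issue found in one validation pass.
#[derive(Debug, PartialEq, Eq)]
pub struct ConfigError(pub Vec<ConfigIssue>);

const K256: &str = "PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX";
const P256: &str = "PDS_PLC_ROTATION_KEY_P256_PRIVATE_KEY_HEX";

pub fn validate(vars: &HashMap<String, String>, compiled: &'static str) -> Result<(), ConfigError> {
    let mut issues = Vec::new();

    // Keep going after the first problem so the operator sees all of them at once.
    for name in ["PDS_HOSTNAME", "PDS_JWT_SECRET", "PDS_ADMIN_PASSWORD"] {
        if !vars.contains_key(name) {
            issues.push(ConfigIssue::MissingVar(name));
        }
    }

    // Exactly one rotation key curve may be configured.
    match (vars.contains_key(K256), vars.contains_key(P256)) {
        (true, true) => issues.push(ConfigIssue::AmbiguousRotationKey),
        (false, false) => issues.push(ConfigIssue::MissingRotationKey),
        _ => {}
    }

    // The configured profile (default sqlite) must match the compiled-in backend.
    let configured = vars.get("PDS_STORAGE_PROFILE").map(String::as_str).unwrap_or("sqlite");
    if configured != compiled {
        issues.push(ConfigIssue::StorageProfileMismatch {
            configured: configured.to_string(),
            compiled,
        });
    }

    if issues.is_empty() { Ok(()) } else { Err(ConfigError(issues)) }
}
```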

5.2 Account database — design §2

Single shared SQLite (or Postgres — see Postgres note below) DB for cross-account state. Tables:

  • account (did PK, handle, email, email_confirmed_at, password_hash, created_at, state, signing_key_ref, pds_managed_rotation)
  • app_password (id, did, name, password_hash, privileged, created_at)
  • oauth_session (id, did, client_id, dpop_jkt, scope, issued_at, expires_at, refreshed_at)
  • invite_code (code PK, created_by_did, available_uses, used_by, created_at, disabled)
  • email_token (token PK, did, purpose, expires_at)
  • plc_op_token (token PK, did, expires_at)
  • signing_key (id PK, did, algorithm, key_ref, created_at) — key_ref is an opaque string handle resolvable by the configured KeyStore; the actual private bytes never live in this table.
  • service_auth_blacklist (jti PK, expires_at) — admin-revoked JWT JTIs.
  • notify_attempt (id PK, target_service_did, target_endpoint, payload_cbor, nsid ('notifyWrite' | 'notifyMembership'), attempt_count, last_attempt_at, next_attempt_at, state ('pending' | 'in_flight' | 'dead'), last_error) — notifier DLQ; index on (state, next_attempt_at).
  • denylist (hash_metro64 PK, kind ('did' | 'handle' | 'nsid' | 'email'), created_at, note) — durable backing for the denylist cuckoo filter; stores only MetroHash64 of the identifier (see §12.13).
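As an illustration of how a notify_attempt row could move between 'pending' and 'dead', the sketch below computes the delay that would feed next_attempt_at. It assumes a doubling backoff (the document says only "exponential"), uses the documented defaults of 5 attempts and 1000 ms as inputs, and the `NextStep` and `next_step` names are hypothetical.

```rust
use std::time::Duration;

/// What the notifier does after a failed delivery attempt (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum NextStep {
    /// Row stays 'pending'; next_attempt_at = now + delay.
    RetryAfter(Duration),
    /// Row moves to state 'dead' (the DLQ).
    Dead,
}

/// `attempt_count` is the number of attempts already made, including the one
/// that just failed. Assumes the delay doubles on each retry.
pub fn next_step(attempt_count: u32, max_attempts: u32, initial_backoff_ms: u64) -> NextStep {
    if attempt_count >= max_attempts {
        return NextStep::Dead;
    }
    // delay = initial * 2^(attempt_count - 1), saturating instead of overflowing.
    let exponent = attempt_count.saturating_sub(1).min(32);
    let delay_ms = initial_backoff_ms.saturating_mul(1u64 << exponent);
    NextStep::RetryAfter(Duration::from_millis(delay_ms))
}
```

With the defaults, the delays after failures one through four are 1 s, 2 s, 4 s, and 8 s, and the fifth failure moves the row to 'dead'.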

KeyStore model (G7 resolution): per-account signing-key bytes never live in the SQL signing_key table. The row holds only a key_ref (e.g., file:abc123, kms:projects/foo/keyRings/bar/cryptoKeys/baz) that the configured KeyStore resolves to actual key material at sign-time. This puts the threat-model boundary at the KeyStore: a DB dump leaks no key material at all; only a separate compromise of the KeyStore (filesystem, KMS credentials, HSM session) can produce key bytes. The default FileKeyStore writes one file per key under PDS_DATA_DIRECTORY/keys/<key_ref> with mode 0600. HSM and KMS backends are pluggable.
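A rough sketch of how a key_ref string might be split into a backend and a locator before the KeyStore resolves it. The `KeyRef`, `KeyRefError`, `parse_key_ref`, and `file_key_path` names are hypothetical; so are the assumptions that the file name on disk is the part after `file:` and that file locators are restricted to a conservative character set to rule out path traversal.

```rust
use std::path::{Path, PathBuf};

/// Parsed form of the opaque key_ref column (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum KeyRef<'a> {
    File(&'a str),
    Kms(&'a str),
}

#[derive(Debug, PartialEq)]
pub enum KeyRefError {
    MissingScheme,
    UnknownScheme(String),
    EmptyLocator,
    UnsafeFileName,
}

/// Split "scheme:locator" and validate the locator for the chosen backend.
pub fn parse_key_ref(raw: &str) -> Result<KeyRef<'_>, KeyRefError> {
    let (scheme, locator) = raw.split_once(':').ok_or(KeyRefError::MissingScheme)?;
    if locator.is_empty() {
        return Err(KeyRefError::EmptyLocator);
    }
    match scheme {
        "file" => {
            // Assumption: file locators are simple names, never paths.
            let safe = locator
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
            if safe {
                Ok(KeyRef::File(locator))
            } else {
                Err(KeyRefError::UnsafeFileName)
            }
        }
        "kms" => Ok(KeyRef::Kms(locator)),
        other => Err(KeyRefError::UnknownScheme(other.to_string())),
    }
}

/// Where a FileKeyStore-style backend would look for the key material.
pub fn file_key_path(data_dir: &Path, name: &str) -> PathBuf {
    data_dir.join("keys").join(name)
}
```

Because the SQL row carries only the reference, nothing in this code path touches key bytes; that happens inside whichever backend the parsed variant selects.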

Two JTI tracking systems (G21 note): the service_auth_blacklist table holds admin-revoked JTIs (revoke action: the token was valid but is now blocked). It is distinct from the JTI replay filter (§12.2), which prevents reuse of an unrevoked, valid token within its lifetime. Both checks run; either one rejects the token.

Postgres feature (G11 resolution): the accounts DB schema uses dialect-agnostic SQL; sqlx queries work on both SQLite and Postgres. The postgres Cargo feature swaps the connection pool. Per-actor stores remain SQLite or fjall regardless of accounts-DB backend choice — Postgres is only relevant to the shared accounts DB. Postgres is the right choice for multi-tenant SaaS deployments where the accounts DB is the contention bottleneck (high session-creation rate, hot OAuth refresh path); for single-host deployments, SQLite is simpler. Migrations are dialect-aware where they need to be (e.g., AUTOINCREMENT vs. BIGSERIAL); sqlx's migration tool handles per-dialect migration directories.

Two profiles ship in v0.15.0; the operator picks one at install time. The trait surface (ActorStore, BlockStorage, SpaceRepoStorage, SpaceMembersStorage) is identical between them — see §11 for backend selection criteria and a backend-mapping table.

Profile A: SQLite (default). One SQLite file per account at PDS_DATA_DIRECTORY/actors/<sha256(did)>.sqlite. Tables:

Public-realm:

  • commit (cid PK, rev, data_cid, prev_cid, prev_data_cid, signature_blob, created_at)
  • repo_block (cid PK, data BLOB) — DAG-CBOR record + MST node bodies
  • repo_record (uri, cid, collection, rkey, rev, indexed_at) PK (collection, rkey)
  • repo_blob_ref (record_uri, blob_cid, mime_type, size) for blob ref-counting
  • outbox (seq PK AUTOINCREMENT, did, event_type, payload, created_at)

Spaces realm (per design §3.6):

  • space (uri PK, is_owner, is_member, created_at)
  • space_member_state (space PK FK, set_hash, rev)
  • space_repo (space PK FK, set_hash, rev)
  • space_record (space, collection, rkey, cid, value BLOB, repo_rev, indexed_at) PK (space, collection, rkey), idx (space, repo_rev)
  • space_member (space, did, member_rev, added_at) PK (space, did)
  • space_record_oplog (space, rev, idx, action, collection, rkey, cid, prev) PK (space, rev, idx)
  • space_member_oplog (space, rev, idx, action, did) PK (space, rev, idx)
  • space_credential_recipient (space, service_did, service_endpoint, last_issued_at) PK (space, service_did)

ActorTransactor opens a transaction over the per-actor DB, handles both realms in one tx where needed (e.g., when a public record references blob CIDs that arrived in the same uploadBlob batch), and emits outbox events on commit.

Profile B: fjall. The same logical schema mapped onto Ramjet-style keyspaces — see §11.2. Atomicity is provided by WriteBatch. Multi-keyspace atomicity (e.g., touching space_record and space_record_oplog in one commit) requires fjall's cross-partition write semantics; we wrap them in ActorTransactor::commit(). getRepoOplog-style range queries become prefix scans, which fjall does natively. The trade-offs are documented in §11.3.

5.4 Sequencer / firehose — design §6, design §6.4

The Sequencer topology depends on storage profile:

  • SQLite profile: per-actor Sequencer task. Each per-actor SQLite has its own outbox table and its own Sequencer task that drains it. Writers for that DID push to a per-DID bounded mpsc channel; the per-DID Sequencer persists with a monotonic seq (scoped to that DID) and broadcasts to subscribers attached to that DID. Cross-DID seq ordering is not maintained — subscribeRepos clients receive events as a merged stream from all per-actor broadcasters.
  • fjall profile: single global Sequencer task draining a single outbox partition. seq is process-global. Cross-DID ordering is maintained naturally.

Both profiles use tokio::sync::broadcast for live-tail fan-out. Subscribers:

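  • Live-tail: attach to broadcast (under SQLite, this means subscribing to many per-DID broadcasters; the WS handler multiplexes); on lag → close the connection with the ConsumerTooSlow error. (#info OutdatedCursor is reserved for a requested cursor that is older than the retained backlog, not for slow consumers.)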
  • Backfill: paginate from outbox by seq (per-DID under SQLite, global under fjall) until caught up, then attach to broadcast.

Crawler fan-out: after each commit, async POST requestCrawl to PDS_CRAWLERS with bounded concurrency.

Permissioned writes are not sequenced into the firehose (per design §6.5). Their notifications flow through atproto-pds::space::notifier, a parallel subsystem.

Topology rationale: under SQLite, per-actor Sequencer fits the per-actor isolation story — no cross-actor lock contention, takedown-by-rm-file remains trivial. Under fjall, a single shared outbox partition is the natural shape; one Sequencer task draining it is simpler than coordinating per-DID prefix scans. The asymmetry is intentional and tests must verify equivalent firehose behavior under both profiles (Layer-2: subscribe to firehose, write across N DIDs, assert all events delivered in seq order per-DID).
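The Layer-2 assertion mentioned above ("all events delivered in seq order per-DID") might be checked with a helper along these lines. It is a sketch only: the `Event` struct and `first_out_of_order` function are hypothetical stand-ins for whatever the test fixture actually decodes from the firehose.

```rust
use std::collections::HashMap;

/// Minimal view of a firehose event, for the ordering check only.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub did: String,
    pub seq: u64,
}

/// Returns the first event whose seq does not strictly increase relative to
/// the previous event for the same DID, or None if per-DID order holds.
/// Cross-DID interleaving is deliberately not checked.
pub fn first_out_of_order(stream: &[Event]) -> Option<&Event> {
    let mut last_seq: HashMap<&str, u64> = HashMap::new();
    for event in stream {
        let previous = last_seq.get(event.did.as_str()).copied();
        if let Some(previous) = previous {
            if event.seq <= previous {
                return Some(event);
            }
        }
        last_seq.insert(event.did.as_str(), event.seq);
    }
    None
}
```

The same helper applies to both profiles: under fjall the global seq trivially satisfies it, while under SQLite it accepts any interleaving of the per-DID streams.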

5.5 Auth verifier — design §8.3, design §15.10

xrpc::auth_extractor::AuthVerifier — an axum extractor that:

  1. Parses Authorization: Bearer <token>. Returns AuthContext::Anonymous if missing.
  2. Decodes JWT header without verifying.
  3. Branches on typ:
    • at+jwt → OAuth path: verify DPoP, JWT thumbprint match, nonce, signature against PDS_OAUTH_KEY_JWK. Output: OAuthAuth { did, scope, client_id, jkt, session_id }.
    • dpop+jwt → standalone DPoP proof; never seen as primary auth. Reject as misuse.
    • space_member_grant → resolve iss DID, verify with member's atproto signing key, check aud=owner_did, check lxm=com.atproto.space.getSpaceCredential, check clientId matches request's authenticated client. Output: MemberGrantAuth { member_did, owner_did, client_id, space }.
    • space_credential → resolve iss DID (= space owner), verify with owner's signing key, check space claim. Output: SpaceCredentialAuth { owner_did, space, client_id }.
    • unmarked → app-password path (verify HMAC against PDS_JWT_SECRET) or service-auth path (verify against issuer DID, check lxm, check aud). Disambiguate by iss claim format (DID = service auth; account+id = app password).
  4. Per-route guards (separate extractors) further enforce: realm match, scope coverage, ownership.
  5. JTI replay protection at every JWT-verify step (per §12.2): check the jti against the Valkey-backed (or in-memory fallback) bloom/cuckoo filter, reject as replay on hit, insert with TTL = exp - iat on miss. Required by RFC 9449 for DPoP and by our security baseline for OAuth, MemberGrant, SpaceCredential, and service-auth.
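Step 3's branch on typ can be sketched as a pure classification function that runs before any signature work. The `AuthPath`, `AuthReject`, and `classify` names are hypothetical, and two behaviors are assumptions rather than statements from the design: "unmarked" is read as a missing typ or the generic value "JWT", and any other unrecognized typ is rejected.

```rust
/// Which verification path a bearer token should take (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum AuthPath {
    Anonymous,
    OAuth,
    MemberGrant,
    SpaceCredential,
    ServiceAuth,
    AppPassword,
}

#[derive(Debug, PartialEq)]
pub enum AuthReject {
    /// A DPoP proof was presented as the primary credential.
    DpopProofAsBearer,
    /// Assumption: unrecognized typ values are rejected outright.
    UnknownTyp(String),
}

/// Dispatch only; signature, DPoP, claim, and JTI checks happen afterwards.
/// `typ` comes from the unverified JWT header, `iss` from the unverified payload.
pub fn classify(
    bearer_present: bool,
    typ: Option<&str>,
    iss: &str,
) -> Result<AuthPath, AuthReject> {
    if !bearer_present {
        return Ok(AuthPath::Anonymous);
    }
    match typ {
        Some("at+jwt") => Ok(AuthPath::OAuth),
        Some("dpop+jwt") => Err(AuthReject::DpopProofAsBearer),
        Some("space_member_grant") => Ok(AuthPath::MemberGrant),
        Some("space_credential") => Ok(AuthPath::SpaceCredential),
        // "Unmarked" tokens: a DID issuer means service auth, anything else
        // is treated as an app-password session token.
        None | Some("JWT") => {
            if iss.starts_with("did:") {
                Ok(AuthPath::ServiceAuth)
            } else {
                Ok(AuthPath::AppPassword)
            }
        }
        Some(other) => Err(AuthReject::UnknownTyp(other.to_string())),
    }
}
```

Keeping the dispatch free of I/O makes the branch table easy to unit-test separately from DID resolution and key lookup.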

5.6 Rate limiter — design §8.4

tower::Service middleware with per-IP and per-DID buckets. Defaults (mirroring TS PDS, see references §8 — TS):

  • Global per-IP: 3000/5 min
  • Per-account writes: 1666/h
  • Per-account images: 50/h
  • OAuth /par: 30/min
  • Spaces getRepoOplog: 1000/min per (DID, space) — to allow polling apps.

Backends: InMemoryRateLimiter (default — token-bucket) and ValkeyRateLimiter (feature-gated — sliding-window via Redis ZSET, lifted from lexicon-garden per §12.4). Sliding-window is fairer for clients that burst at minute boundaries.
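To make the sliding-window semantics concrete, here is a small in-memory sketch. It is not the ValkeyRateLimiter (which the text describes as ZSET-backed) and not the default token-bucket limiter; the `SlidingWindowLimiter` type is hypothetical and keeps timestamps in a `VecDeque` where the Valkey version would keep them in a sorted set.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// In-memory sliding-window limiter keyed by an arbitrary string
/// (an IP, a DID, or a "(DID, space)" pair rendered as text).
pub struct SlidingWindowLimiter {
    limit: usize,
    window: Duration,
    hits: HashMap<String, VecDeque<Instant>>,
}

impl SlidingWindowLimiter {
    pub fn new(limit: usize, window: Duration) -> Self {
        Self { limit, window, hits: HashMap::new() }
    }

    /// Returns true and records the hit if `key` is under its limit at `now`.
    pub fn check(&mut self, key: &str, now: Instant) -> bool {
        let queue = self.hits.entry(key.to_string()).or_default();
        // Evict hits that have slid out of the window ending at `now`.
        while let Some(&oldest) = queue.front() {
            if now.duration_since(oldest) >= self.window {
                queue.pop_front();
            } else {
                break;
            }
        }
        if queue.len() < self.limit {
            queue.push_back(now);
            true
        } else {
            false
        }
    }
}
```

Passing `now` explicitly keeps the limiter deterministic in tests; with the OAuth /par default of 30 per minute, the 31st request inside any 60-second span is refused regardless of where the minute boundary falls.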

5.7 Mailer — design §11

Mailer trait + SmtpMailer (lettre, see references §11). Templates:

  • verify_email, confirm_email_update, reset_password, confirm_account_delete, plc_op_signature, account_migration_initiated, takedown_notice, space_membership_added, space_membership_removed.

5.8 Spaces subsystem (space/) — design §15, design §6.5, design §7.3

Wires atproto-space into HTTP:

  • service.rs:

    • createSpace: insert space row with is_owner=1, init space_member_state (empty SetHash), assign space.uri.
    • getSpace, listSpaces: filter by is_owner / is_member.
    • addMember / removeMember: SpaceMembers::format_commit → apply_commit → outbound notifyMembership to that DID's PDS via Notifier.
  • writer.rs: per-(DID, space) mutex; calls into SpaceRepo::format_commit/apply_commit; emits notifyWrite to space owner's PDS post-commit.

  • reader.rs: dual-auth check. OAuth is sufficient for own-PDS reads (a user's app reading the user's own permissioned records on that user's PDS). SpaceCredential is the only path for cross-PDS reads (a syncing app on PDS-X reading data hosted on PDS-Y), and is needed only when the requesting service is on a different PDS than the data owner (G34). Requests route to the own actor store or to the cross-PDS proxy as appropriate.

  • sync.rs: getRepoState, getRepoOplog, getMemberState, getMemberOplog. Returns OplogGap if since < retained range.

  • credential.rs: handlers for getMemberGrant (issued by member's PDS) and getSpaceCredential (issued by owner's PDS). The latter also writes into space_credential_recipient.

  • notifier.rs: outbound HTTP client with bounded retry, exponential backoff, and a DLQ. Resolves syncing-app endpoints from DID doc service entries, falling back to the value provided in the getSpaceCredential request.

  • receiver.rs: inbound notifyWrite and notifyMembership handlers. Both are service-auth-protected. Sender-DID rules (G35):

    • On the member's PDS receiving notifyWrite: accept only if the issuer DID is the owner DID of the named space.
    • On the owner's PDS receiving notifyWrite: accept only if the issuer DID is a member DID of the named space (entry exists in space_member).
    • On the consumer service receiving notifyWrite: accept only if the issuer DID is the owner DID of a space the consumer holds an unexpired SpaceCredential for.
    • On the member's PDS receiving notifyMembership: accept only if the issuer DID is the owner DID of the named space.
    • All other senders → 403 with InvalidNotificationSender.

    On notifyMembership ingress, optionally email the user via the configured Mailer using the space_membership_added / space_membership_removed templates — controlled by the PDS_NOTIFY_MEMBERSHIP_EMAIL config flag (default off, per design §11.3, G22).

  • takedown.rs: admin-only com.atproto.admin.takedownSpaceRecord endpoint with audit log.
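The sender-DID rules listed for receiver.rs (G35) reduce to a small decision table. The sketch below encodes them as a pure function; `Role`, `Notification`, `SpaceView`, `sender_allowed`, and `enforce` are hypothetical names, and the service-auth signature check is assumed to have already succeeded before this runs.

```rust
/// Where the receiving handler is running, relative to the named space.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role {
    MemberPds,
    OwnerPds,
    ConsumerService,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Notification {
    Write,
    Membership,
}

/// What the receiver knows locally about the named space (hypothetical view).
pub struct SpaceView<'a> {
    pub owner_did: &'a str,
    pub member_dids: &'a [&'a str],
    pub has_unexpired_credential: bool,
}

/// Apply the sender-DID rules to an already-authenticated issuer DID.
pub fn sender_allowed(
    role: Role,
    kind: Notification,
    issuer_did: &str,
    space: &SpaceView<'_>,
) -> bool {
    match (role, kind) {
        // A member's PDS only trusts the space owner, for both notification kinds.
        (Role::MemberPds, _) => issuer_did == space.owner_did,
        // The owner's PDS accepts writes only from current members.
        (Role::OwnerPds, Notification::Write) => {
            space.member_dids.iter().any(|member| *member == issuer_did)
        }
        // A consumer needs the owner as issuer and a live SpaceCredential.
        (Role::ConsumerService, Notification::Write) => {
            issuer_did == space.owner_did && space.has_unexpired_credential
        }
        // Every other combination is rejected.
        _ => false,
    }
}

/// Map a rejection to the documented HTTP status and error name.
pub fn enforce(allowed: bool) -> Result<(), (u16, &'static str)> {
    if allowed {
        Ok(())
    } else {
        Err((403, "InvalidNotificationSender"))
    }
}
```

Expressing the rules as one match makes the default-deny arm explicit, which is the property the 403 fallback depends on.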

5.9 OAuth subsystem (oauth/) — design §9

Builds on atproto-oauth primitives (references §9):

  • PAR endpoint (RFC 9126) at /oauth/par.
  • /oauth/authorize server-rendered consent UI (Askama). Uses permission_set.rs to expand include:<nsid> scopes; renders human-readable descriptions including space-scope variants.
  • /oauth/token: authorization_code and refresh_token grants. Refresh = single-use rotation, persisted in oauth_session.
  • /oauth/revoke (RFC 7009).
  • /oauth/jwks exposes the PDS's signing public key (JWKS rotation supported).
  • /.well-known/oauth-authorization-server and /.well-known/oauth-protected-resource (RFC 8414, RFC 9728).
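The single-use refresh rotation noted for /oauth/token can be illustrated with an in-memory store. Everything here is a sketch under assumptions: `RefreshStore` is hypothetical, tokens are minted from a counter purely for readability (a real server would use unpredictable random values), and revoking the whole session when a spent token is replayed is a common hardening choice that the design text does not itself state.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum RefreshError {
    UnknownToken,
    /// A spent token was presented again; the session has been revoked.
    Reused,
}

/// token -> (session_id, spent). Spent tokens are kept so reuse is detectable.
pub struct RefreshStore {
    tokens: HashMap<String, (String, bool)>,
    next_id: u64,
}

impl RefreshStore {
    pub fn new() -> Self {
        Self { tokens: HashMap::new(), next_id: 0 }
    }

    fn mint(&mut self, session_id: &str) -> String {
        // Illustration only: counter-based tokens are predictable.
        self.next_id += 1;
        let token = format!("rt-{}", self.next_id);
        self.tokens.insert(token.clone(), (session_id.to_string(), false));
        token
    }

    /// Issue the first refresh token for a new session.
    pub fn issue(&mut self, session_id: &str) -> String {
        self.mint(session_id)
    }

    /// Exchange a refresh token for a new one; each token works exactly once.
    pub fn refresh(&mut self, presented: &str) -> Result<String, RefreshError> {
        let (session_id, spent) = match self.tokens.get(presented) {
            Some((session_id, spent)) => (session_id.clone(), *spent),
            None => return Err(RefreshError::UnknownToken),
        };
        if spent {
            // Assumption: replay of a spent token revokes the whole session.
            self.tokens.retain(|_, (owner, _)| *owner != session_id);
            return Err(RefreshError::Reused);
        }
        if let Some(entry) = self.tokens.get_mut(presented) {
            entry.1 = true;
        }
        Ok(self.mint(&session_id))
    }
}
```

In the real server the same state would live in the oauth_session table so that rotation survives restarts.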

5.10 Admin subsystem (admin/) — design §10

  • Endpoints: getAccountInfo, getAccountInfos, searchAccounts, disableAccountInvites, enableAccountInvites, disableInviteCodes, getInviteCodes, getSubjectStatus, updateSubjectStatus, sendEmail, updateAccountEmail, updateAccountHandle, updateAccountPassword, deleteAccount, plus takedownSpaceRecord (design §10.6).
  • Minimal HTML dashboard (Askama) at /admin — Basic-auth gated by PDS_ADMIN_PASSWORD. Lists accounts, shows takedown queue, exposes atproto-pds-admin-equivalent actions (references §12 — bluesky-social/pds pdsadmin.sh).
  • Public-record takedown behavior (G15): when updateSubjectStatus flags a record takendown:
    • Hidden from com.atproto.repo.getRecord and com.atproto.repo.listRecords (return RecordTakendown error code).
    • Included in com.atproto.sync.getRepo so relay verification of the MST chain still works (the record CID is still in the tree; relays must not break on takedown).
    • Tagged with takedown=true in firehose #commit ops (downstream AppViews can drop their indexed copies on next consumption).
    • Matches cocoon's hide-on-read pattern with the firehose-tag addition for downstream awareness.
  • Account-level takedown behavior: blocks all writes (creates, puts, deletes, blob uploads, OAuth /token exchanges) at the auth-extractor layer; blocks public reads with AccountTakendown; emits #account active=false status=takendown on the firehose. Account migration paths (importRepo, activateAccount) are blocked for takendown DIDs (an admin must un-takedown first).
  • Takedown enforcement modules (G30): enforcement code lives at the entry point of each affected module — repo::reader::*, repo::writer::*, space::reader::*, space::writer::*, account::manager::*, oauth::token::issue, space::receiver::*. Each consults account::manager::takedown_set (Valkey-backed cuckoo filter under §12.3, in-memory fallback otherwise) on the entry path; rejection happens before any storage IO.
  • Denylist enforcement uses the cuckoo-filter pattern from §12.3 — Redis-backed cuckoo filter keyed by MetroHash64(DID | handle | NSID | email). Checked on createAccount, repo::writer, space::writer/receiver, space::notifier. Durable backing in the accounts DB denylist table stores only MetroHash64 hashes, not plaintext (privacy-preserving per §12.13).

5.11 Observability — design §12.4

  • Health probes (per §12.5):
    • /_alive — process-up liveness probe.
    • /_ready — dependencies-reachable readiness probe (account DB, blob store, Mailer, Valkey if enabled, PLC directory).
    • /xrpc/_health — spec-compliant {version, status: "ok"} (where version includes BUILD_REV per §12.8).
  • Graceful shutdown wires every spawned task to a top-level CancellationToken and joins via TaskTracker with a configurable deadline (per §12.6). Mandatory for clean firehose subscriber drain.
  • Tokio task instrumentation via tokio-metrics::TaskMonitor per labeled task class (per §12.7) — exposes poll counts, durations, scheduling delays for the sequencer, notifier worker pool, blob GC, and SWR refreshers.
  • Prometheus on a separate listener (default port 2471). Required metrics:
    • HTTP: rate, latency (p50/p99) per NSID
    • Repo writes: tx duration, MST cache hit rate
    • Firehose: subscriber count, lag, dropped
    • Spaces: notifyWrite retries, DLQ depth, oplog gap fallback rate, SpaceCredential issuance rate, spaces owned/joined per account
    • OAuth: PAR rate, token mint rate, refresh rotations, revocations
  • tracing + tracing-subscriber env-filter; OTel exporter via the otel feature.

6. Test Strategy

Three layers, mirroring design §15.12. External conformance feeds in at every layer (references §19).

Coverage targets (G32):

  • Layer 1 (atproto-space unit tests): line coverage ≥85%; branch coverage tracked but not gated.
  • Layer 2 (atproto-pds integration tests): no line-coverage target — measured by XRPC endpoint coverage (every NSID in §8.6 of the design hit by at least one happy-path and one error-path test) and by realm coverage (every test runs against both storage profiles in CI).
  • Layer 3 (multi-PDS interop): scenario coverage tracked in §6.3 — five named scenarios must pass.

6.1 Layer 1 — atproto-space unit tests (Phase 1 entry condition) — design §15.12 Layer 1

  • SetHash: order-independence, add/remove inverse, consistency.
  • Commit: round-trip create→verify, IKM uniqueness, domain-separation test (Records vs Members), tampered-tag rejection, tampered-rev rejection, expired-pubkey rejection.
  • SpaceRepo: CRUD, batch atomic semantics, prev linkage on update/delete.
  • SpaceMembers: add/remove, idempotency, duplicate-add error, remove-non-member error, SetHash over DIDs.
  • MemberGrant / SpaceCredential: create→verify, expired, tampered, wrong-aud, wrong-space, wrong-clientId, wrong-lxm, missing claim.

Property tests (proptest) on SetHash and oplog reconstruction: any permutation of the same N ops converges to the same SetHash digest.
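The two SetHash properties named above (order-independence and add/remove inverse) can be seen in a toy XOR construction. This sketch is deliberately not the XorSha256SetHash: to stay dependency-free it uses the standard library's `DefaultHasher` as a stand-in for SHA-256, it yields a 64-bit digest, and the `SetHash` trait shape shown here is an assumption about the crate's surface.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed minimal trait surface: an incrementally updatable set digest.
pub trait SetHash: Default + PartialEq {
    fn add(&mut self, element: &[u8]);
    fn remove(&mut self, element: &[u8]);
}

/// Toy XOR set hash. NOT cryptographic: DefaultHasher stands in for SHA-256.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct ToyXorSetHash(u64);

fn element_hash(element: &[u8]) -> u64 {
    // DefaultHasher::new() uses fixed keys, so this is deterministic.
    let mut hasher = DefaultHasher::new();
    element.hash(&mut hasher);
    hasher.finish()
}

impl SetHash for ToyXorSetHash {
    fn add(&mut self, element: &[u8]) {
        self.0 ^= element_hash(element);
    }

    fn remove(&mut self, element: &[u8]) {
        // XOR is its own inverse, so remove is the same operation as add.
        self.0 ^= element_hash(element);
    }
}

/// Fold a sequence of elements into a digest, in the order given.
pub fn digest_of<H: SetHash>(elements: &[&str]) -> H {
    let mut digest = H::default();
    for element in elements {
        digest.add(element.as_bytes());
    }
    digest
}
```

Because XOR is commutative and self-inverse, the order of adds never matters and a remove exactly undoes an add. The same self-inverse property means adding an element twice cancels it out, which is one reason the XOR form is only a placeholder.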

6.2 Layer 2 — atproto-pds integration tests (Phase 7 entry condition) — design §15.12 Layer 2

tests/integration/ with one running PDS process per test (random port, temp data dir). Tests run against a PdsHandle test fixture that spins up a PDS process under either profile, configured via PDS_STORAGE_PROFILE. CI runs the full Layer-2 matrix twice — once per profile binary build (G18: profiles are mutually exclusive at compile time, so each profile gets its own CI entry); each individual test is profile-agnostic and never reaches into storage internals (G20).

Test isolation pattern lifted from lexicon-garden per §12.9:

  • SQLite profile: #[sqlx::test] macro for parallel-isolated DB tests with compile-time-checked queries (used for unit tests of SQL impls; integration tests use PdsHandle instead).
  • fjall profile: integration tests use a with_fjall_actor_store(|store| async move { ... }).await helper function that allocates a tmpdir keyspace and tears it down at end-of-test (G12: helper function rather than a proc macro avoids needing a separate proc-macro crate).

For each profile:

  • Public-repo CRUD via XRPC client (atproto-client).
  • OAuth happy path: PAR → consent → token → call.
  • AppPassword session → call.
  • Service auth → call → lxm rejection.
  • Spaces happy path: createAccount A & B, A creates space, A addMember B, B writes record, B getRepoOplog self.
  • Spaces error path: non-owner addMember → 403.
  • SpaceCredential flow: third party C with OAuth at A obtains MemberGrant from A, exchanges with owner-of-space (also A here in single-host case) for SpaceCredential, calls B's getRepoOplog with credential.
  • Account migration: A → A' on second PDS; full sequence per atproto.com/guides/account-migration; listMissingBlobs empties before activate.

6.3 Layer 3 — Multi-PDS interop tests (Phase 8 entry condition) — design §15.12 Layer 3

Spin up three PDSes in parallel via tokio tasks (PDS-A hosts space owner + member-1, PDS-B hosts member-2, PDS-C hosts member-3). A test app holds OAuth sessions on each.

  • Happy-path multi-PDS sync (owner adds members across PDSes, each member writes, app pulls with credential).
  • Incremental sync with since.
  • OplogGap recovery: simulate compaction, confirm fallback to listRecords.
  • Member lifecycle: removeMember mid-sync → notifyMembership(false) reaches member; member sees is_member=0 but data preserved.
  • notifyWrite reliability: kill syncing-app endpoint, observe retries; bring it back, observe success.
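The OplogGap recovery scenario can be sketched from both sides: the server-side check and the client's fallback decision. This is illustrative only. Revisions are modeled as plain `u64` values rather than real rev strings, and `check_since`, `plan_sync`, `OplogGap`, and `SyncPlan` are hypothetical names.

```rust
/// Error returned when the requested `since` predates the retained oplog.
#[derive(Debug, PartialEq)]
pub struct OplogGap {
    pub oldest_retained: u64,
}

/// What a syncing app does next.
#[derive(Debug, PartialEq)]
pub enum SyncPlan {
    /// Pull oplog entries after `since`.
    Incremental { since: u64 },
    /// Discard local state and rebuild from listRecords.
    FullResync,
}

/// Server side: reject a `since` that falls below the retained range.
pub fn check_since(since: u64, oldest_retained: u64) -> Result<(), OplogGap> {
    if since < oldest_retained {
        Err(OplogGap { oldest_retained })
    } else {
        Ok(())
    }
}

/// Client side: no cursor, or a cursor the server can no longer serve,
/// both lead to a full resync.
pub fn plan_sync(local_cursor: Option<u64>, oldest_retained: u64) -> SyncPlan {
    match local_cursor {
        None => SyncPlan::FullResync,
        Some(since) => match check_since(since, oldest_retained) {
            Ok(()) => SyncPlan::Incremental { since },
            Err(_) => SyncPlan::FullResync,
        },
    }
}
```

In the Layer-3 test, "simulate compaction" corresponds to raising `oldest_retained` past the app's stored cursor and asserting that the app takes the FullResync branch.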

6.4 External conformance — references §19


7. Phased Build Plan

Each phase ends at a runnable, demoable milestone. No phase begins until the prior phase's tests pass. CI gates listed for each. Phase ordering and content tracks design §19 "Build & Release Plan."

Phase 0 — Scaffold (~1 week) — design §19 Phase 0

  • Add crates/atproto-pds and crates/atproto-space to workspace Cargo.toml.
  • Stub lib.rs, errors.rs, Cargo.toml, README.md for both.
  • Define ActorStore trait in atproto-pds::actor_store::mod (no impls yet, just the surface the rest of the crate depends on). G10 sequencing: BlockStorage already exists in atproto-dasl; SpaceRepoStorage/SpaceMembersStorage arrive in Phase 1 with atproto-space; ActorStore is atproto-pds-local and ships now.
  • Add crates/atproto-pds/build.rs exposing BUILD_REV (per §12.8). Lift shape from lexicon-garden.
  • Wire the graceful-shutdown skeleton in pds::main — top-level CancellationToken, TaskTracker, signal handler — even though there are no tasks yet to wire to it. Bake in from Phase 0 to avoid retrofitting (D16).
  • Open GitHub issues for each spec open question (design §15.11).
  • Extend atproto-repo::Commit with prev_data: Option<Cid> (Sync 1.1, design §3.1).
  • Add prev field to MstDiff op output.
  • Implement atproto-repo::verify_inductive(prev_data: Cid, blocks: &[CarBlock]) -> Result<Cid> (Sync 1.1 inductive verification, design §3.1, design §7.2). Used by Phase 5's importRepo. Phase-0 unit tests cover happy path and tampered-block rejection.
  • fjall WriteBatch cross-partition atomicity spike: build a small standalone proof that confirms WriteBatch writes across multiple PartitionHandles are atomic on crash. If it fails, escalate D13 — fjall is unsuitable as a co-equal backend without this guarantee. (G36 de-risking)

CI gate: cargo build, cargo fmt --check, cargo clippy -- -D warnings, existing tests still pass; BUILD_REV env populated at compile; verify_inductive unit tests pass; fjall WriteBatch spike documented in Phase 0 PR.

Phase 1 — atproto-space (~2 weeks) — design §19 Phase 1

  • Implement SetHash trait.
  • Implement XorSha256SetHash (placeholder, default).
  • Implement EcmhSetHash adapter over ecmh-rs (feature-gated ecmh, on by default). Pin ecmh-rs to a known-good revision; vendor a bench harness to track digest computation cost vs the XOR placeholder.
  • Implement SpaceContext, Commit, create_commit, verify_commit with HKDF+HMAC+ECDSA construction.
  • Implement SpaceRepo, SpaceMembers over storage traits.
  • Implement MemberGrant, SpaceCredential JWT shapes.
  • In-memory storage impls.
  • Layer-1 unit tests including domain-separation tests, run against both SetHash impls.

CI gate: Layer-1 tests pass against both XorSha256SetHash and EcmhSetHash; coverage ≥85% on atproto-space; benchmark report committed showing per-element add/remove cost for both impls.

Phase 2 — Public realm read-only PDS (~3 weeks) — design §19 Phase 2

  • atproto-pds::actor_store with public-realm tables only.
  • atproto-pds::repo::reader for getRecord, listRecords, describeRepo, getRepo, getBlob, listBlobs, getBlocks, getLatestCommit, getRepoStatus.
  • atproto-pds::sequencer read-only; subscribeRepos WS handler that streams from outbox.
  • A way to manually import an existing CAR into an actor store (admin CLI).
  • Validate by replaying a snapshot of the Bluesky firehose and serving it back.

CI gate: smoke test serving a small replayed repo to a goat-style verifier.

Phase 3 — Account management & public writes (~3 weeks) — design §19 Phase 3

  • account::manager with PLC-managed account creation, including recoveryKey parameter handling — when supplied by the user, the PLC genesis op's rotationKeys array is [pds_rotation_key, recoveryKey] (G16, design §2.1).
  • app_password, session, invite, email_token flows.
  • repo::writer for createRecord, putRecord, deleteRecord, applyWrites, uploadBlob.
  • Sync 1.1 #commit events emitted with prevData and per-op prev.
  • Sync 1.1 #sync event emission and WS framing (G6, design §6.1): emitted on recovery from drift (sequencer detects gap), on account migration completion, and on admin-initiated repo state force-set. The subscribeRepos handler in atproto-pds::sequencer::subscribe_repos frames #sync as a top-level message type alongside #commit/#identity/#account/#info.
  • Mailer (SMTP) wired.

CI gate: Layer-2 tests for public-realm CRUD + session auth pass; #sync event delivered when sequencer simulates a drift recovery.

Phase 4 — OAuth provider (~3 weeks) — design §19 Phase 4

  • PAR / authorize / token / revoke / jwks / metadata.
  • DPoP middleware in atproto-xrpcs.
  • Permission-set resolver with SWR cache.
  • Consent UI (Askama).

CI gate: Layer-2 OAuth happy-path test passes; revoked token rejected; refresh rotation enforced.

Phase 5 — Account migration (~2 weeks) — design §19 Phase 5, design §2.7

  • getServiceAuth, importRepo (CAR ingest with inductive verify), activateAccount, deactivateAccount.
  • requestPlcOperationSignature, signPlcOperation, submitPlcOperation.
  • End-to-end migration test between two PDS processes.

CI gate: Layer-2 migration test passes; listMissingBlobs empties before activate succeeds.

Phase 6 — Moderation, admin, deployment (~2 weeks) — design §19 Phase 6, design §10

  • admin.* endpoints + minimal HTML dashboard.
  • moderation.createReport proxy via service auth.
  • Takedown enforcement at read/write boundaries.
  • Dockerfile additions for pds binary — multi-stage per §12.12 (Node-build → Rust-build → distroless runtime).
  • Compose example (deploy/compose.yaml) modeled on bluesky-social/pds.
  • atproto-pds-admin sibling binary (per D6 / G17 — separate, not a subcommand).

CI gate: Layer-2 admin tests pass; takedown blocks reads/writes; smoke test of compose stack stands up cleanly.

Phase 7 — Permissioned realm + fjall backend (~5 weeks, biggest phase) — design §19 Phase 7, design §15, §11

  • actor_store Spaces tables + migrations (SQLite profile).
  • actor_store::sql::sql_space_repo_storage and sql_space_members_storage impls (against atproto-space traits).
  • actor_store::fjall profile: ship the Ramjet-style keyspace layout from §11.2, including all BlockStorage/SpaceRepoStorage/SpaceMembersStorage impls. Layer-2 tests run twice (once per profile). +1 week vs the SQLite-only timeline; this is the new "fjall is co-equal" decision (D13).
  • space::service (createSpace/addMember/etc.).
  • space::writer (CRUD with single-writer-per-(DID,space)).
  • space::reader (dual auth).
  • space::sync (getRepoState, getRepoOplog, getMemberState, getMemberOplog; OplogGap errors).
  • space::credential (getMemberGrant, getSpaceCredential, recipient registration).
  • space::notifier outbound + space::receiver inbound (notifyWrite, notifyMembership).
  • xrpc::auth_extractor extended for MemberGrant + SpaceCredential typ headers.
  • Space-typed OAuth scopes in atproto-oauth::scopes and consent UI rendering.
  • Admin: takedownSpaceRecord.

CI gate: Layer-2 Spaces tests pass under both storage profiles (CRUD, auth rejection, credential flow). Layer-3 multi-PDS happy-path passes.

Phase 8 — Production hardening (~3 weeks) — design §19 Phase 8, design §18

  • Promote EcmhSetHash to default if upstream has picked ECMH (re-using the impl already shipped in Phase 1). If upstream picks ltHash, add a third impl LthashSetHash and promote that; the trait abstraction makes this a single-PR change. If upstream has not picked, stay on XOR + flag and document the migration path.
  • Evaluate riblt-rs-backed oplog reconciliation as an opt-in optimization for getRepoOplog recovery (see §10). Spec-default fallback is full listRecords resync; RIBLT can short-circuit that for large oplogs. Behind a feature flag and a per-tenant config — never the only path. Defer if Phase 8 budget is tight; this is a 1.x roadmap item, not a 0.15.0 blocker.
  • Bench SQLite vs. fjall storage profiles head-to-head on the public-repo write hot path, the getRepoOplog read hot path, and the firehose ingestion path. Publish numbers in the v0.15.0 release notes. Inform the v0.16.x default-flip decision (open question 13 in §14).
  • Performance tuning: SQLite WAL+mmap, MST node cache sizing, broadcast channel sizing, blob streaming, fjall block-cache sizing.
  • OplogGap telemetry; full-resync fallback testing.
  • Cross-implementation interop tests against TS @atproto/pds Spaces (when it ships).
  • Load testing harness; document throughput ceilings.
  • Security review: verify low-S signature normalization everywhere; verify password hash parameters; review DPoP nonce rotation; review JWT replay defenses (jti tracking); audit ecmh-rs (no third-party audit exists; we are the upstream so we own the risk).
  • Publish atproto-pds 0.15.0 along with atproto-space 0.15.0; rev workspace versions per rust-release skill.

CI gate: Layer-3 multi-PDS sync + recovery tests pass; published crates install cleanly.

Aggregate timeline: ~24 weeks (≈6 months) of focused work. Phases sum to 1+2+3+3+3+2+2+5+3 = 24 (Phase 7 grew by one week to include the fjall backend).


8. Dependencies, Risks, and Mitigations

8.1 Cross-crate ordering

Phase 0 → adds prev_data to atproto-repo (foundational)
Phase 1 → atproto-space lands → Phase 2,3,4,5,6,7 all depend on it being stable
Phase 4 (OAuth server) → blocks Phase 7 OAuth-scope work but not Phase 7's space service if app-password auth is used in Layer-2 tests
Phase 7 → depends on Phase 4 for the full dual-auth story
Phase 8 → assumes upstream ECMH decision; if not made, ship XOR + flag

8.2 External dependencies and assumptions — design §16, references §16

| Risk | Mitigation |
| --- | --- |
| Spaces Design Spec changes mid-stream | Trait-based abstractions for SetHash, JWT shapes, storage. File issues against ambiguities (design §15.11) and track them in the repo. |
| com.atproto.space.* lexicons not yet published | Bundle locally, allow override via config; sync to upstream as soon as published. (design §14.1) |
| TS @atproto/pds Spaces implementation slipping | Ship our own with synthetic Layer-3 tests; bilateral interop happens in Phase 8, not Phase 7. (references §16) |
| Per-actor SQLite scaling beyond 10k accounts | Document the ceiling; advanced operators can opt into a hybrid backend post-1.0. (design §12.5, design §18.4) |
| DPoP nonce edge cases (cocoon's lesson) | Lift cocoon's hailey/fix-dpop-nonce-err test cases into our DPoP test suite. (references §16, Notable interop issues) |
| Identity caching after migration (Blacksky's lesson) | Bypass cache on signature failure; document this. (references §16, Notable interop issues) |
| Lexicon strictness divergence | Ship strict by default; document the lenient knob. (design §4.2) |

8.3 Out of scope (intentionally) — design §15.2

  • E2EE on top of permissioned data (a future protocol layer; Diary 1 defers to "can be layered on").
  • Application coordination / app allow/deny lists for sync (design §15.2, spec out of scope).
  • Delegated / sub-accounts (design §15.2, spec out of scope).
  • Permissioned-data account migration — the spec is silent (design §2.7); we'll prototype com.atproto.space.exportSpaces/importSpaces provisional NSIDs in Phase 8 only if upstream signals interest.
  • Block backends beyond the two shipped profiles, SQLite and fjall (e.g., a unified Postgres store): documented as future work but not built in initial scope (design §18.4).
  • Built-in WebAuthn/TOTP (tranquil-pds territory, references §16) — out of initial scope; revisit post-1.0.

9. Conformance Matrix (target at v0.15.0 release) — design §16, references §16

| Area | atproto-pds 0.15.0 target |
| --- | --- |
| Sync 1.1 (prevData, per-op prev, #sync) | full (Phase 0 inductive verifier; Phase 3 emission of all four event types) |
| did:web for accounts (BYO + hosted) | full |
| did:plc accounts (genesis, rotate, tombstone) | full |
| OAuth 2.1 server (PAR/PKCE/DPoP, refresh rotation) | full |
| Permission set resolution (24h SWR, 90d expiry) | full |
| importRepo with inductive verification | full |
| Per-account SQLite | yes (default storage profile) |
| fjall storage profile (Ramjet-style keyspaces) | yes (alternative profile, feature fjall) |
| Blob store: disk default, S3 feature | full |
| SMTP mailer | full; pluggable trait |
| lxm-required service auth | yes (strict) |
| listReposByCollection | full |
| com.atproto.admin.* | full |
| com.atproto.moderation.createReport | proxied |
| com.atproto.space.* (CRUD + sync + mgmt + notify + credentials) | full per Spaces Design Spec |
| atproto-space crate | full |
| Dual-auth getRecord/listRecords | full |
| notifyWrite / notifyMembership outbound + inbound | full, with retry + DLQ |
| Space-scoped OAuth | full (provisional names; tracking upstream) |
| Sidecar / strongRef into permissioned records | supported via existing strongRef + 404/403 dereference |
| Account migration | full |
| WebAuthn/TOTP | not in 0.15.0 |
| Account delegation | not in 0.15.0 |

10. Adjacent first-party crates: ecmh-rs and riblt-rs

Two existing Rust crates authored by Nick Gerakines are directly relevant to this plan. Reviewing them upfront avoids reinventing primitives and frames how they slot into the phased build.

10.1 ecmh-rs — production target for SetHash (load-bearing)

Repository: tangled.org/ngerakines.me/ecmh-rs
License: MIT OR Apache-2.0
Status: active development; no third-party audit; no published benchmarks.

What it provides: A clean ECMH (Elliptic Curve Multiset Hash, Maitin-Shepard et al. 2016) implementation with three curve backends — RistrettoEcmh (default, 32-byte digest), P256Ecmh (33-byte, feature-gated), K256Ecmh (33-byte, feature-gated). Public API exposes new, insert(&[u8]), remove(&[u8]), from_element, from_elements, to_bytes / from_bytes, plus operator overloads (+, -, +=, -=). A CurveBackend trait permits custom backends. WebAssembly bindings exist for browser-side verification.

Why it matters here: design §3.6 and design §15.5 explicitly call for replacing the XOR-SHA256 placeholder with ECMH or ltHash before production. ecmh-rs is the in-house implementation of ECMH — the exact primitive the spec contemplates — already with the right operator surface (insert/remove/serialize) and the same curve choices the rest of atproto signs with (P-256, K-256). There is no other first-party path that gets us there.

Decision: Ship both impls in Phase 1 (XorSha256SetHash + EcmhSetHash). Default to XOR until upstream picks; flip the default in Phase 8 if upstream picks ECMH. The trait abstraction in §4.1 is what makes the flip cheap.

Design notes for the EcmhSetHash adapter (in crates/atproto-space/src/set_hash.rs):

  • Backend choice: default to RistrettoEcmh (strongest curve, smallest digest, no awkward 33-byte alignment); expose Cargo features ecmh-p256 and ecmh-k256 for installations that want to share key material with their atproto signing keys (purely operational — semantically the curve choice does not bind to the signing key).
  • Element preprocessing: ecmh-rs::insert already calls hash_to_point internally on raw bytes. We pass the spec-defined element bytes directly ({collection}/{rkey}:{cid} for records, did for members) — no double-hashing, no preprocessing.
  • Digest format: persist to_bytes() as a length-prefixed BLOB in space_repo.set_hash and space_member_state.set_hash. The 32-vs-33-byte difference between Ristretto and the NIST/secp variants doesn't matter at the SQL layer because we already store as BLOB, but tests must check round-trip across the trait boundary.
  • Cross-impl invariant: EcmhSetHash::digest() must NOT be expected to equal XorSha256SetHash::digest() for the same input — they are different primitives. The interop story is: a PDS with one impl cannot directly compare digests with a PDS with the other. This is why the upstream picks one global default; we follow it.
  • Hash-to-curve method (try-and-increment vs SWU/Elligator) is an internal detail of ecmh-rs we should pin in our test suite via known-answer tests against ecmh-rs's own examples — so a future ecmh-rs change to a different curve mapping fails our CI.
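The trait boundary these notes rely on can be sketched as follows. This is a minimal illustration rather than the crate's actual API: the `SetHash` trait shape, the `XorToySetHash` type, and the `record_element` helper are hypothetical names, and std's `DefaultHasher` stands in for SHA-256 so the block needs no dependencies. It only demonstrates the two properties the adapter depends on: insertion order does not affect the digest, and `remove` undoes `insert`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical trait shape for an incremental multiset digest.
pub trait SetHash {
    fn insert(&mut self, element: &[u8]);
    fn remove(&mut self, element: &[u8]);
    fn digest(&self) -> Vec<u8>;
}

/// Toy XOR accumulator. A 64-bit std hash stands in for SHA-256 here;
/// it is NOT collision resistant and its output must not be persisted.
#[derive(Default, Debug, Clone, PartialEq, Eq)]
pub struct XorToySetHash(u64);

fn element_hash(element: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    element.hash(&mut hasher);
    hasher.finish()
}

impl SetHash for XorToySetHash {
    fn insert(&mut self, element: &[u8]) {
        self.0 ^= element_hash(element);
    }

    // XOR is its own inverse, so removal is the same operation as insertion.
    fn remove(&mut self, element: &[u8]) {
        self.0 ^= element_hash(element);
    }

    fn digest(&self) -> Vec<u8> {
        self.0.to_be_bytes().to_vec()
    }
}

/// Element bytes for a record, in the `{collection}/{rkey}:{cid}` form.
pub fn record_element(collection: &str, rkey: &str, cid: &str) -> Vec<u8> {
    format!("{collection}/{rkey}:{cid}").into_bytes()
}
```

An ECMH-backed implementation would satisfy the same trait with curve-point addition and subtraction in place of XOR. Note that under XOR, inserting the same element twice cancels out, which is a known limitation of XOR accumulators.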

Risks and mitigations:

| Risk | Mitigation |
|---|---|
| ecmh-rs has no external audit | We are the upstream; treat the audit budget as ours. Phase 8 security review explicitly includes ecmh-rs. |
| Hash-to-curve method is undocumented | Pin known-answer tests against the current behavior. |
| Performance unknown vs XOR (~ns/op) — ECMH curve ops are typically 10–100 µs/element | Bench-and-publish in Phase 1. If pathological, the SetHash trait still lets a deployment fall back to the XOR placeholder per-instance. |
| Upstream picks ltHash instead | Add LthashSetHash impl in Phase 8 — same trait, third impl. Two days of work + tests. |

10.2 riblt-rs — opt-in optimization for oplog-gap recovery (not load-bearing)

Repository: github.com/ngerakines/riblt-rs
License: MIT
Status: Rust port of the reference Go implementation of "Practical Rateless Set Reconciliation" (Yang, Gilad, Alizadeh, arxiv 2402.02668); no audit, sparse benchmarks.

What it provides: Rateless Invertible Bloom Lookup Tables for set reconciliation. Two parties holding sets A and B can determine the symmetric difference A △ B by exchanging coded symbols whose total size is proportional to |A △ B|, not to |A| or |B|. The crate is generic over a Symbol trait (xor + hash); supplies Encoder<T>, Decoder<T>, Sketch<T>, CodedSymbol<T>, HashedSymbol<T>, RandomMapping. Includes CLI tooling and an experimental Quinn/QUIC peer demo.

Why it might matter here: design §7.3 defines the spec's recovery model when a syncing app's since cursor predates the retained oplog — the SetHash mismatch detection triggers a full resync via listRecords. For very large permissioned repos, that's expensive. RIBLT could let the syncing app and the member's PDS reconcile by exchanging coded symbols, transferring data proportional only to what's actually missing. The same mechanic also fits the space_member set reconciliation between owner and member PDSes.

Why it is NOT in the load-bearing path:

  1. The spec is unambiguous that the fallback is listRecords, not a custom reconciliation protocol. Adopting RIBLT mid-protocol creates a non-spec interop divergence — a TS @atproto/pds syncing app would not understand getRepoOplogReconcile (or whatever we'd call it).
  2. RIBLT's correctness depends on a symbol hash that is not homomorphic with respect to XOR. Our oplog entries are content-addressed (CIDs); using the CID as the symbol hash works, but care is needed not to leak cryptographic structure.
  3. There is no third-party audit, no documented production deployment of riblt-rs, and no Rust-language interop with the Go reference for cross-implementation testing.

Decision: Track as a Phase 8 experimental optimization, behind a riblt Cargo feature, gated by a per-tenant config flag. Never the only sync path. Document the trade-off (faster recovery on large diffs vs. non-spec extension) and propose upstream simultaneously — if the spec adopts a reconciliation primitive, our impl is ready; if it doesn't, the feature stays opt-in. Add a corresponding open-question entry in §14.

Possible future use cases beyond oplog recovery (all "if we get there" — not committed):

  • Member-list reconciliation between two PDSes that disagree about the membership of a space (space_member table reconciliation).
  • Public-realm listMissingBlobs optimization for migration: instead of the existing list-then-fetch loop, exchange coded symbols of blob CID sets.
  • Cross-PDS firehose backfill drift detection.

Each of these is a 1.x or 2.x research item. None are in scope for the 0.15.0 release.

Crate workspace integration:

If adopted in Phase 8, add riblt-rs as a crates/atproto-pds Cargo feature only. We do not want it leaking into atproto-space, where it would couple the protocol primitives crate to a non-spec optimization. The feature surface lives in atproto-pds::space::sync_riblt (a sibling to sync.rs), behind compile-time and runtime gates.

10.3 Adoption summary

| Crate | Phase | Default | Required at 0.15.0? |
|---|---|---|---|
| ecmh-rs | Phase 1 (impl), Phase 8 (default flip) | impl shipped Phase 1; default depends on upstream | Yes — must ship the impl even if not default |
| riblt-rs | Phase 8 (experimental) | off | No — feature-gated, opt-in |

11. Storage Backend Selection: SQLite vs. fjall (Ramjet patterns)

11.1 Recap of D1 + D13

The PDS ships two storage profiles behind a unified trait surface (ActorStore, BlockStorage, SpaceRepoStorage, SpaceMembersStorage):

  • SQLite — default, matches the Spaces Design Spec exactly (actor-store/space/sql-repo-storage.ts and sql-members-storage.ts map 1:1 to our SQL tables in §5.3 Profile A).
  • fjall — co-equal alternative, modelled on the keyspace layout in Ramjet (Nick's firehose-ingestion service that has battle-tested fjall on atproto-shaped workloads).

Both are shipped in v0.15.0 with full Layer-2 test coverage. Operators pick one at install time by setting the PDS_STORAGE_PROFILE config (sqlite | fjall); cross-profile migration is custom tooling (out of scope for v0.15.0). design §3.3, design §18.4.

11.2 Mapping PDS data to fjall partitions

Terminology: under fjall, a single Keyspace is opened at startup; each row in the table below is a separate PartitionHandle within that keyspace. Atomicity (cross-partition) is provided by WriteBatch against the parent Keyspace. We adopt fjall's terminology — partition — throughout this section, even though Ramjet's docs sometimes use keyspace loosely.

Ramjet's partition layout informs ours. The naming below is atproto-pds's; the patterns are imported directly from Ramjet. The table is split into the three logical realms — public repo, spaces, accounts.

Public repo realm:

| Partition | Key encoding | Value | Maps to SQL table | Notes |
|---|---|---|---|---|
| repo_block | cid_bytes | DAG-CBOR block | repo_block | The public-repo blockstore. CIDs are 36 bytes (CIDv1 + sha256). |
| repo_record | did\0collection\0rkey\0rev | cid_bytes (for current) or empty (tombstone) | repo_record | Versioned by rev; latest = last entry in prefix scan over did\0collection\0rkey\0. Mirrors Ramjet directly. |
| repo_commit | did\0rev | DAG-CBOR commit object | commit | Prefix scan gives commit history by rev. |
| repo_blob_ref | did\0record_uri\0blob_cid | {mime_type, size} | repo_blob_ref | For blob ref-counting and GC. |
| outbox | seq:u64BE | DAG-CBOR firehose event | outbox | Big-endian u64 for natural lexicographic ordering. Ramjet's events partition pattern. Single global outbox under fjall (vs. per-actor outbox table under SQLite); see §5.4 for the Sequencer topology rationale. |

Spaces realm:

| Partition | Key encoding | Value | Maps to SQL table | Notes |
|---|---|---|---|---|
| space | space_uri | {is_owner, is_member, created_at} | space | |
| space_repo | space_uri | {set_hash, rev} | space_repo | Per-user record commitment. Exists for every space the user participates in. |
| space_member_state | space_uri | {set_hash, rev} | space_member_state | Owner-only member-list commitment. Separate partition from space_repo because an owner-participant has both rows under SQL; collapsing them collides on the space_uri key. |
| space_record | space_uri\0collection\0rkey | {cid, value_cbor, repo_rev, indexed_at} | space_record | |
| space_record_oplog | space_uri\0rev\0idx:u32BE | {action, collection, rkey, cid, prev} | space_record_oplog | Multi-column PK encoded into the key. |
| space_member_oplog | space_uri\0rev\0idx:u32BE | {action, did} | space_member_oplog | |
| space_member | space_uri\0did | {member_rev, added_at} | space_member | |
| space_credential_recipient | space_uri\0service_did | {service_endpoint, last_issued_at} | space_credential_recipient | |
| notify_attempt | next_attempt_at_ms_be\0id_bytes | {target_service_did, target_endpoint, payload_cbor, nsid, attempt_count, last_error, state} | notify_attempt | Notifier DLQ; range-scan by due-time. See §5.8 notifier. |

Accounts realm:

| Partition | Key encoding | Value | Maps to SQL table | Notes |
|---|---|---|---|---|
| account | did_bytes | {handle, email, email_confirmed_at, password_hash, created_at, state, signing_key_ref, pds_managed_rotation} | account | Note: signing_key_ref not private_key_blob — see §5.2 for the KeyStore-backed model (G7). |
| account_handle_index | handle_bytes_lower | did_bytes | (SQL has UNIQUE constraint on handle) | |
| account_email_index | email_bytes_lower | did_bytes | (SQL has UNIQUE constraint on email) | |
| app_password | did_bytes\0app_password_id | {name, password_hash, privileged, created_at} | app_password | |
| oauth_session | session_id_bytes | {did, client_id, dpop_jkt, scope, issued_at, expires_at, refreshed_at} | oauth_session | |
| oauth_session_by_did | did_bytes\0session_id | empty | (SQL index) | Secondary index for "list sessions for DID." |
| invite_code | code_bytes | {created_by_did, available_uses, used_by, created_at, disabled} | invite_code | |
| email_token | token_bytes | {did, purpose, expires_at} | email_token | |
| plc_op_token | token_bytes | {did, expires_at} | plc_op_token | |
| service_auth_blacklist | jti_bytes | expires_at_ms_be | service_auth_blacklist | Admin-revoked JTIs; TTL-trimmed. Distinct from the JTI replay filter (§12.2) — see §5.2 note on dual JTI tracking. |

Cache + meta:

| Partition | Key encoding | Value | Maps to SQL table | Notes |
|---|---|---|---|---|
| meta | string key | varied | (multiple shared singletons) | Holds: sequencer cursor, schema version, zstd dictionary metadata, RIBLT sketch cache (if §10.2 adopted). Direct port of Ramjet's meta partition. |
| did_doc_cache | did_bytes | timestamped JSON | (process-level cache table; would be in-memory under SQLite) | Identity cache, lifted from Ramjet. Promotes the LRU into durable storage so a restart doesn't lose the warm cache. |
| handle_to_did | handle_bytes | DID string | (n/a in SQLite profile — derived) | Lifted from Ramjet. |
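The key encodings in the tables above can be sketched with plain byte manipulation. The helper names below (`outbox_key`, `join_nul`, `oplog_key`, `record_versions_prefix`) are hypothetical and not part of fjall or Ramjet; the sketch assumes no key component contains a NUL byte and that rev values sort lexicographically in commit order.

```rust
/// Key for the `outbox` partition: big-endian so byte order equals numeric order.
pub fn outbox_key(seq: u64) -> [u8; 8] {
    seq.to_be_bytes()
}

/// Joins string components with a NUL separator, as in `did\0collection\0rkey`.
/// Assumes no component contains a NUL byte.
pub fn join_nul(parts: &[&str]) -> Vec<u8> {
    let mut key = Vec::new();
    for (i, part) in parts.iter().enumerate() {
        if i > 0 {
            key.push(0);
        }
        key.extend_from_slice(part.as_bytes());
    }
    key
}

/// Key for `space_record_oplog`: `space_uri\0rev\0idx:u32BE`.
pub fn oplog_key(space_uri: &str, rev: &str, idx: u32) -> Vec<u8> {
    let mut key = join_nul(&[space_uri, rev]);
    key.push(0);
    key.extend_from_slice(&idx.to_be_bytes());
    key
}

/// Prefix for scanning every version of one record in `repo_record`.
/// The trailing NUL keeps rkey "rk" from also matching rkey "rk2".
pub fn record_versions_prefix(did: &str, collection: &str, rkey: &str) -> Vec<u8> {
    let mut prefix = join_nul(&[did, collection, rkey]);
    prefix.push(0);
    prefix
}
```

Because an LSM tree orders keys bytewise, `outbox_key(255) < outbox_key(256)` holds, whereas the little-endian encodings of the same values would sort the other way round. The same reasoning makes the `idx` suffix of an oplog key sort numerically within one rev.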

11.2.1 Ramjet patterns adopted regardless of backend

Some Ramjet patterns are valuable even under the SQLite profile because they're about ingestion, not storage:

  • Batch writer with mpsc + atomic commit (Ramjet default: 500 records or 100 ms timeout). The firehose-replay path for accounts that subscribe to many sources benefits from this. Under SQLite, "atomic commit" is BEGIN ... COMMIT; under fjall, WriteBatch. Same shape, different primitive.
  • zstd dictionary compression for outbox events. atproto-jetstream already does this; we reuse its dictionary handling for the firehose outbox path. Reduces storage footprint by ~3-4× per Ramjet's published numbers.
  • Identity cache as a durable artifact, not just a process-LRU. Restart-warm cache is meaningful for OAuth and SpaceCredential verification hot paths.
  • Per-DID RIBLT sketches (if §10.2 is adopted) cached in the meta partition under riblt:<did>:<collection> keys. Auto-invalidated on record write.
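The batch-writer pattern from the first bullet can be sketched with a size-or-timeout drain loop. This is an illustration under assumptions: it uses `std::sync::mpsc` rather than an async channel to stay dependency-free, `collect_batch` is a hypothetical name, and the atomic commit of the returned batch (a SQL transaction or a fjall batch) is left to the caller.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Drains up to `max_len` items (expected to be >= 1) from `rx`,
/// waiting at most `max_wait` after the first item arrives.
/// Returns an empty Vec only when the channel is closed and drained.
pub fn collect_batch<T>(rx: &Receiver<T>, max_len: usize, max_wait: Duration) -> Vec<T> {
    let mut batch = Vec::new();

    // Block for the first item so an idle writer does not spin.
    match rx.recv() {
        Ok(item) => batch.push(item),
        Err(_) => return batch, // all senders dropped, nothing buffered
    }

    // The timeout clock starts when the batch opens, not per item.
    let deadline = Instant::now() + max_wait;
    while batch.len() < max_len {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(item) => batch.push(item),
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    batch
}
```

With the defaults quoted above, a writer loop would call `collect_batch(&rx, 500, Duration::from_millis(100))`, commit the result in one transaction, and stop when an empty batch comes back.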

11.2.2 Atomicity boundaries

The Spaces Design Spec mandates atomic batches: a single applyWrites produces multiple oplog entries sharing a rev with monotonically-increasing idx, and the space_repo.rev is updated to that commit's rev. Under either backend:

  • SQLite: one BEGIN ... COMMIT covers writes to space_record, space_repo, space_record_oplog together. Trivial.
  • fjall: one WriteBatch covers writes across all the relevant partitions. Fjall's WriteBatch is cross-partition atomic per the docs; verify in the test suite (Layer 1) that a partial-batch crash recovers cleanly.

11.3 Trade-offs and when to pick which

| Concern | SQLite | fjall |
|---|---|---|
| Spec fidelity (Spaces Design Spec uses SQL schemas) | ✅ direct match | ⚠️ pattern-equivalent, schema-divergent |
| Single binary, no external runtime | ⚠️ requires sqlx + sqlite link | ✅ pure Rust, embedded |
| Schema migrations | requires migration tool (sqlx migrate) | implicit (forward-compat key encoding decisions) |
| Compile-time query check | ✅ sqlx | |
| Ad-hoc admin queries (joins, aggregates) | ✅ sqlite3 CLI | ❌ requires custom inspection tooling (Ramjet has ramjet-data for this; we'd ship analogous) |
| Per-account isolation (rm to takedown) | ✅ one file per actor | ⚠️ shared partitions — takedown is delete-by-prefix |
| Range-scan performance for oplog | depends on indexes | ✅ prefix-scan native to LSM |
| Write throughput at high firehose rates | ⚠️ WAL bottleneck | ✅ Ramjet demonstrates firehose-scale |
| Backup/snapshot | ✅ file copy | ✅ fjall has snapshot APIs |
| Operational familiarity | ✅ universal | ⚠️ team must learn fjall ops |

Recommendations by deployment shape:

  • Reference / spec-conformance / public hosted PDS for Bluesky-style traffic: SQLite. Match the spec schema, ship interop quickly.
  • Single-host, latency-priority, low-account-count: fjall. Lower write latency, single binary, no SQL dependencies.
  • Multi-tenant SaaS PDS: SQLite (per-actor isolation outweighs raw throughput) — or hybrid (fjall for the firehose outbox + identity cache, SQLite for actor stores).

11.4 Phase placement

The trait surface comes from three crates (atproto-dasl, atproto-pds, atproto-space); traits, implementations, and benchmarks land across the phases below (G10 sequencing fix):

  • Phase 0: BlockStorage already exists in atproto-dasl (no change). Define ActorStore trait skeleton in atproto-pds::actor_store::mod (no impls yet; just the surface the rest of atproto-pds integrates against).
  • Phase 1: atproto-space ships, defining SpaceRepoStorage and SpaceMembersStorage traits in atproto-space::storage. In-memory impls live alongside for testing.
  • Phase 2: SQLite impls of ActorStore and BlockStorage ship (read-only PDS). Adopt the batch-writer + identity-cache patterns from Ramjet under SQLite — these are backend-agnostic.
  • Phase 7: SQLite impls of SpaceRepoStorage/SpaceMembersStorage ship; all fjall impls ship for both realms. Layer-2 tests run twice (one CI matrix entry per storage profile).
  • Phase 8: bench both profiles head-to-head on the public-repo write hot path and the getRepoOplog read hot path. Publish numbers; let operators decide informed.

11.5 What we are NOT taking from Ramjet

  • Ramjet's domain (firehose consumer with selective tracking) is different from a PDS. We do not adopt its tracked vs forwarded collection split, its consumer-group partitioning, or its RepoState.denied state machine. Those are firehose-aggregator semantics that do not map to a PDS.
  • Ramjet has no MST. A PDS does. The MST and CAR-export paths are unique to the PDS, untouched by Ramjet patterns.
  • Ramjet's CAR-parsing backfill workers are not directly applicable; we have importRepo instead. The streaming-CAR-decode shape is similar but the trigger and target are different.

11.6 Open questions tracked in §14

  • Cross-profile migration tooling (SQLite ↔ fjall) is out of scope for v0.15.0. Track as a 1.x roadmap item.
  • Schema versioning under fjall: a meta:schema_version key gates startup. Forward-compat key-encoding changes need explicit migration code per change.
  • Whether to default-enable both Cargo features and only let runtime config pick (current plan), or split into two binaries pds-sqlite / pds-fjall (rejected — operational pain).

12. Operational Patterns Lifted from lexicon-garden

lexicon-garden (Nick's production lexicon-schema-browsing service) is the closest operational sibling to a PDS in the workspace: it does XRPC routing, OAuth, lexicon validation, identity resolution, admin tooling, Ramjet event ingestion, and metrics-instrumented HTTP serving on the same axum/tokio/sqlx stack we're using. Several patterns there have been load-tested in production and should be lifted into atproto-pds rather than redesigned. This section enumerates them with destination subsystem and rationale; the rest of the plan cross-references back here.

12.1 Middleware stack ordering (axum)

Lexicon-garden's middleware stack, top-down (first-touched outermost layer inward to the handler):

  1. CORS (tower_http::cors::CorsLayer) — outermost so preflight responses precede auth.
  2. JTI replay protection — see §12.2.
  3. Rate limiting — see §12.4.
  4. Admin auth (basic-auth gated, only on /admin/*).
  5. Request filter (block known security-scanner User-Agents and obvious probes).
  6. Metrics collection — wraps everything below so failed traces still get counted.
  7. Tracing (tower_http::trace::TraceLayer).

Adopt directly in atproto-pds::xrpc::router_build(). Metrics outside tracing is the load-bearing detail — a request that fails to start a span still increments counters.

12.2 JTI replay protection middleware (security)

A Redis-backed bloom or cuckoo filter on JWT jti claims. On verify: check membership (reject as replay if hit), then insert with TTL = exp - iat. Lexicon-garden's exact shape — lift verbatim into atproto-pds::xrpc::auth_extractor::JtiReplayGuard.

Required for:

In-memory fallback for the no-Valkey deployment: bounded LRU per token-type with the same TTL semantics. Crash loses replay state, which is acceptable since tokens are short-lived.
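A minimal sketch of that in-memory fallback follows. `JtiReplayGuard` and `check_and_insert` are hypothetical shapes, time is passed in as Unix seconds so the logic is deterministic, and eviction picks the entry closest to expiry instead of maintaining a true LRU order, which is a simplification.

```rust
use std::collections::HashMap;

/// Bounded in-memory replay guard: maps jti -> expiry (Unix seconds).
pub struct JtiReplayGuard {
    seen: HashMap<String, u64>,
    capacity: usize,
}

impl JtiReplayGuard {
    pub fn new(capacity: usize) -> Self {
        Self { seen: HashMap::new(), capacity }
    }

    pub fn len(&self) -> usize {
        self.seen.len()
    }

    /// Returns true on first sighting of `jti`, false if it is a replay.
    /// The entry is remembered for `exp - iat` seconds from `now`.
    pub fn check_and_insert(&mut self, jti: &str, iat: u64, exp: u64, now: u64) -> bool {
        // Drop entries whose TTL has elapsed.
        self.seen.retain(|_, expires_at| *expires_at > now);

        if self.seen.contains_key(jti) {
            return false;
        }

        // At capacity: evict the entry that would expire soonest.
        if self.seen.len() >= self.capacity {
            let victim = self
                .seen
                .iter()
                .min_by_key(|(_, expires_at)| **expires_at)
                .map(|(key, _)| key.clone());
            if let Some(key) = victim {
                self.seen.remove(&key);
            }
        }

        let ttl = exp.saturating_sub(iat);
        self.seen.insert(jti.to_owned(), now + ttl);
        true
    }
}
```

Eviction under pressure means an evicted jti could be replayed undetected, so the capacity should be sized to the expected number of live tokens.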

12.3 Cuckoo filter for denylist (admin / moderation)

Lexicon-garden uses a Redis-backed cuckoo filter keyed by MetroHash64(DID | handle | NSID). O(1) check; no plaintext denylist on disk (privacy-preserving). Avoids the read-amplification of doing SQL lookups on every hot-path write.

Adopt in:

  • account::manager::create_account — block account creation against denylisted email/handle/DID before the PLC genesis op.
  • repo::writer — block writes from taken-down accounts as the first authz check.
  • space::writer/space::receiver — block space participation by denylisted DIDs (cheap pre-flight).
  • space::notifier — drop outbound notifyWrite to denylisted recipient services.

Implementation: pull the cuckoofilter crate behind the valkey Cargo feature; ship in-memory fallback for the no-Redis deployment.

12.4 Sliding-window rate limiter via Redis ZSET

Lexicon-garden uses a Redis ZSET-based sliding-window rate limiter (members are request timestamps; trim < window-start; cap on count). Strictly fairer than fixed-window for clients that burst at minute boundaries.

Adopt in atproto-pds::xrpc::rate_limit::ValkeyRateLimiter. The in-memory default can stay token-bucket (D8-era), but the distributed backend should be sliding-window. Update §5.6 of this plan accordingly.
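The sliding-window logic itself is independent of Redis. The sketch below models one client's window with a `VecDeque` of timestamps in place of a ZSET; `SlidingWindow` and `allow` are hypothetical names, and a distributed backend would run the same trim, count, and add steps against the shared store.

```rust
use std::collections::VecDeque;

/// Sliding-window limiter for a single client key.
pub struct SlidingWindow {
    window_ms: u64,
    limit: usize,
    hits: VecDeque<u64>, // request timestamps in ms, oldest first
}

impl SlidingWindow {
    pub fn new(window_ms: u64, limit: usize) -> Self {
        Self { window_ms, limit, hits: VecDeque::new() }
    }

    /// Returns true and records the request if it fits in the window.
    /// `now_ms` is expected to be non-decreasing across calls.
    pub fn allow(&mut self, now_ms: u64) -> bool {
        let window_start = now_ms.saturating_sub(self.window_ms);

        // Trim timestamps that fell out of the window (ZREMRANGEBYSCORE analogue).
        while let Some(&oldest) = self.hits.front() {
            if oldest < window_start {
                self.hits.pop_front();
            } else {
                break;
            }
        }

        // Cap on count (ZCARD analogue); rejected requests are not recorded.
        if self.hits.len() >= self.limit {
            return false;
        }
        self.hits.push_back(now_ms);
        true
    }
}
```

Unlike a fixed window, a client that spends its budget late in one minute cannot get a fresh budget the instant the next minute starts; capacity returns only as individual old requests age out.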

12.5 Health probes: /_ready and /_alive (split, not single)

Lexicon-garden splits readiness from liveness — the k8s probe convention. Replaces the plan's single /_health:

  • /_alive — returns 200 if the process is responsive. Used by liveness probes; failure → restart.
  • /_ready — returns 200 if all dependencies are reachable (account DB, blob store, Mailer connection, Valkey if enabled, PLC directory). Used by readiness probes; failure → remove from load balancer rotation but do not restart.
  • /xrpc/_health — stays as the spec-compliant variant returning {version, status: "ok"}.

Update §5.11 of this plan accordingly.

12.6 Graceful shutdown via CancellationToken + TaskTracker

A PDS holds open WebSocket subscribers (subscribeRepos), in-flight HTTP requests, and background workers (sequencer, notifier, blob GC, permission-set SWR). Lexicon-garden's pattern: a top-level tokio_util::sync::CancellationToken is cloned into every spawned task; a TaskTracker joins them on shutdown with a configurable deadline.

Adopt in pds::main. Without this, restarts can drop firehose events between commit-and-broadcast (subscribers reconnect with a stale cursor and the relay flags OutdatedCursor).

12.7 Tokio task instrumentation (TaskMonitor)

Lexicon-garden uses tokio_metrics::TaskMonitor (from the tokio-metrics crate) per labeled task class to surface poll counts, poll durations, and scheduling delays as Prometheus metrics. Catches "starved task" pathologies that present as latency spikes with no obvious cause.

Adopt in pds::metrics. Tasks to instrument:

  • Sequencer flush loop (must never starve — drives firehose latency)
  • Notifier worker pool (notifyWrite outbound)
  • Blob GC scheduler
  • OAuth permission-set SWR refresher
  • Identity-cache SWR refresher

12.8 Build-time BUILD_REV via build.rs

A build.rs script computes a git rev hash (with a fallback to build timestamp on a non-git tarball) and exposes it as env!("BUILD_REV"). Used for:

  • User-Agent header on every outbound HTTP request (notifier, PLC client, app-view proxy, mailer): atproto-pds/<crate-version>+<BUILD_REV>.
  • /xrpc/_health version field.
  • ETag seed for OAuth metadata responses (/.well-known/oauth-*) and consent UI static assets.
  • Cache busting for the admin dashboard's CSS/JS.

Lift build.rs shape from lexicon-garden. Add to Phase 0 scaffold.
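A build script along these lines might look like the sketch below. It is not lexicon-garden's actual build.rs: the function names, the `ts<seconds>` fallback format, and the choice of `git rev-parse --short HEAD` are assumptions. A real build.rs would call `emit_build_rev()` from its `main`.

```rust
use std::process::Command;
use std::time::{SystemTime, UNIX_EPOCH};

/// Asks git for the short HEAD hash; None outside a git checkout.
fn git_rev() -> Option<String> {
    let output = Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    let rev = String::from_utf8(output.stdout).ok()?.trim().to_owned();
    (!rev.is_empty()).then_some(rev)
}

/// Picks the git rev when available, else a build-timestamp fallback.
pub fn build_rev(git: Option<String>, unix_secs: u64) -> String {
    git.unwrap_or_else(|| format!("ts{unix_secs}"))
}

/// Emits BUILD_REV so crate code can read it with env!("BUILD_REV").
pub fn emit_build_rev() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    println!("cargo:rustc-env=BUILD_REV={}", build_rev(git_rev(), now));
    println!("cargo:rerun-if-changed=.git/HEAD");
}

/// User-Agent in the `atproto-pds/<crate-version>+<BUILD_REV>` form.
pub fn user_agent(crate_version: &str, build_rev: &str) -> String {
    format!("atproto-pds/{crate_version}+{build_rev}")
}
```

In the server crate, the User-Agent would then be assembled once at startup from `env!("CARGO_PKG_VERSION")` and `env!("BUILD_REV")`.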

12.9 #[sqlx::test] macro for integration tests

Lexicon-garden uses sqlx's compile-time-checked test macro for parallel-isolated DB tests — each test gets its own pool with per-test schema isolation. Compile-time query check catches schema drift the moment a migration lands.

Adopt in Layer-2 test suite under the sqlite profile. For the fjall profile, ship an analogous #[atproto_pds::test] proc macro that allocates a tmpdir keyspace and tears it down at end-of-test. The test bodies must be backend-agnostic (using only the ActorStore trait surface) so the same test runs under both profiles in CI.

12.10 Configuration validation on load

Lexicon-garden's Config::from_env() validates everything up-front and returns a ConfigError (thiserror enum) listing every missing/invalid var. Fail-fast with all errors at once, not one at a time. Critical for ops — no "fix one env var, restart, hit the next" cycle.

Adopt in atproto-pds::config::PdsConfig::from_env(). Bad config → process exits at startup with a single structured error block listing every issue. Update §5.1 accordingly.
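The collect-everything-then-fail shape can be sketched as below. The field set is illustrative: `PDS_STORAGE_PROFILE` comes from §11.1, while `PDS_HOSTNAME`, `PDS_PORT`, and the `from_vars` entry point (which takes a map so it can be tested without touching the process environment) are assumptions. A hand-written `Display` stands in for the thiserror derive to keep the block dependency-free.

```rust
use std::collections::HashMap;
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum ConfigIssue {
    Missing(&'static str),
    Invalid { var: &'static str, reason: String },
}

#[derive(Debug, PartialEq)]
pub struct ConfigError(pub Vec<ConfigIssue>);

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "{} configuration problem(s):", self.0.len())?;
        for issue in &self.0 {
            match issue {
                ConfigIssue::Missing(var) => writeln!(f, "  {var}: missing")?,
                ConfigIssue::Invalid { var, reason } => writeln!(f, "  {var}: {reason}")?,
            }
        }
        Ok(())
    }
}

#[derive(Debug, PartialEq)]
pub enum StorageProfile { Sqlite, Fjall }

#[derive(Debug, PartialEq)]
pub struct PdsConfig {
    pub hostname: String,
    pub port: u16,
    pub storage_profile: StorageProfile,
}

impl PdsConfig {
    /// Checks every variable before returning, so all problems surface at once.
    pub fn from_vars(vars: &HashMap<String, String>) -> Result<Self, ConfigError> {
        let mut issues = Vec::new();

        let hostname = vars.get("PDS_HOSTNAME").cloned();
        if hostname.is_none() {
            issues.push(ConfigIssue::Missing("PDS_HOSTNAME"));
        }

        let port = match vars.get("PDS_PORT").map(|v| v.parse::<u16>()) {
            Some(Ok(port)) => Some(port),
            Some(Err(e)) => {
                issues.push(ConfigIssue::Invalid { var: "PDS_PORT", reason: e.to_string() });
                None
            }
            None => {
                issues.push(ConfigIssue::Missing("PDS_PORT"));
                None
            }
        };

        // Absent means the default profile (sqlite).
        let storage_profile = match vars.get("PDS_STORAGE_PROFILE").map(String::as_str) {
            None | Some("sqlite") => Some(StorageProfile::Sqlite),
            Some("fjall") => Some(StorageProfile::Fjall),
            Some(other) => {
                let reason = format!("expected sqlite or fjall, got {other}");
                issues.push(ConfigIssue::Invalid { var: "PDS_STORAGE_PROFILE", reason });
                None
            }
        };

        match (hostname, port, storage_profile) {
            (Some(hostname), Some(port), Some(storage_profile)) if issues.is_empty() => {
                Ok(PdsConfig { hostname, port, storage_profile })
            }
            _ => Err(ConfigError(issues)),
        }
    }
}
```

A `from_env()` wrapper would collect `std::env::vars()` into the map, call `from_vars`, and print the `ConfigError` before exiting non-zero.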

12.11 Storage-backed identity resolver with TTL + SWR

Lexicon-garden's identity resolver: 12h TTL on DID document cache, on-miss network fallback, stale-while-revalidate (return stale value, trigger background refresh) on near-expiry. Bypass cache on signature-verification failure (Blacksky's lesson, references §16).

Already implicit in the did_doc_cache fjall partition (§11.2); make the TTL and SWR semantics explicit in pds::identity::resolver.
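One way to make those semantics explicit is a three-way lookup result, sketched below. The 12h TTL comes from the text; the one-hour refresh-ahead threshold used to define "near-expiry", and the `DidDocCache`, `Lookup`, and `invalidate` names, are assumptions. Time is passed in as Unix seconds to keep the logic deterministic.

```rust
use std::collections::HashMap;

pub const TTL_SECS: u64 = 12 * 60 * 60;
/// Assumed threshold: entries in their last hour count as near-expiry.
pub const REFRESH_AHEAD_SECS: u64 = 60 * 60;

#[derive(Debug, PartialEq)]
pub enum Lookup {
    /// Serve directly.
    Fresh(String),
    /// Serve, and have the caller trigger a background refresh.
    Stale(String),
    /// Nothing usable cached: resolve over the network.
    Miss,
}

#[derive(Default)]
pub struct DidDocCache {
    entries: HashMap<String, (String, u64)>, // did -> (document JSON, fetched_at)
}

impl DidDocCache {
    pub fn put(&mut self, did: &str, doc_json: &str, now: u64) {
        self.entries.insert(did.to_owned(), (doc_json.to_owned(), now));
    }

    pub fn get(&self, did: &str, now: u64) -> Lookup {
        match self.entries.get(did) {
            None => Lookup::Miss,
            Some((doc, fetched_at)) => {
                let age = now.saturating_sub(*fetched_at);
                if age >= TTL_SECS {
                    Lookup::Miss
                } else if age >= TTL_SECS - REFRESH_AHEAD_SECS {
                    Lookup::Stale(doc.clone())
                } else {
                    Lookup::Fresh(doc.clone())
                }
            }
        }
    }

    /// Cache bypass for signature-verification failures: forget the
    /// entry so the next lookup resolves from the network.
    pub fn invalidate(&mut self, did: &str) {
        self.entries.remove(did);
    }
}
```

On a signature failure the resolver would call `invalidate`, re-resolve, and retry verification once against the fresh document before rejecting.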

12.12 Multi-stage Dockerfile

Lexicon-garden's Dockerfile pattern:

  • Stage 1 (Node 22-slim): build any frontend assets (we have minimal — admin dashboard CSS — but the stage is here for future).
  • Stage 2 (Rust slim): compile the Rust binary in release mode with --features embed.
  • Stage 3 (gcr.io/distroless/cc-debian12): runtime image with just the binary plus static assets. Distroless minimizes attack surface.

Lift directly. Update Phase 6 deployment artifact.

12.13 Storage-backed admin denylist with privacy-preserving hashing

Beyond the cuckoo filter (§12.3), the durable backing store for the denylist hashes the actual identifier with MetroHash64 before persisting, so a database leak does not expose the raw blocklist. Adopt for account::manager's denylist table (whichever profile).

12.14 What we are NOT taking from lexicon-garden

  • Minijinja: we use Askama (compile-time, faster, no runtime template engine dep). The embed/reload feature toggle is unnecessary under Askama.
  • Vite + TypeScript frontend: a PDS does not need a JS build pipeline. The consent UI and admin dashboard are server-rendered Askama, with optional small islands of vanilla JS shipped as static files.
  • TimescaleDB: not needed for v0.15.0 — no time-series analytics requirements. Revisit in 1.x for ops dashboards.
  • Schema enrichment pipeline + dependency graph: domain-specific to lexicon serving.
  • D3 visualization: domain-specific to lexicon-garden's UI.

13. Outputs of This Plan

When the plan is executed, the workspace ends up with:

  • 2 new crates: atproto-space (protocol primitives) and atproto-pds (server).
  • 2 new binaries: pds (production server) and atproto-pds-admin (admin CLI), per D6/G17.
  • Surgical extensions to atproto-repo (Sync 1.1 prev_data + verify_inductive), atproto-xrpcs (DPoP + 5-token auth dispatch + JTI replay guard), atproto-oauth/atproto-oauth-axum (server flows + space scopes), atproto-identity (KeyStore trait), atproto-lexicon (com.atproto.space.* catalog).
  • Test suites at three layers including external interop, run twice in CI (one matrix entry per storage profile).
  • Production deployment artifacts: multi-stage Dockerfile, compose example, configuration reference doc.
  • Tracked issues for each Spaces Design Spec open question, with conservative interim defaults committed in code.

Versioning convention (G31): both new crates ship at workspace-locked SemVer (atproto-space 0.15.0 and atproto-pds 0.15.0 alongside the rest of the workspace at 0.15.0). The version number does not signal maturity — both are brand-new. Maturity is signalled in the README of each crate via an "experimental — protocol surfaces may change" badge, and by the open issues tracking spec churn (§14). This matches the existing workspace pattern of synchronized release versions across all crates.

The strategic positioning at completion: atproto-pds is the second PDS implementation overall to ship Spaces, the first in Rust, and the only one architected for production performance from day zero.


14. Open Questions to Track in atproto-crates Issues

These are mirrored from design §15.11 and should each become a tracked issue at the start of Phase 0. The corresponding upstream URL/source for each is listed in references §15. Items 11–14 are local to this plan (introduced by §10 and §11).

  1. URI scheme ats://<ownerDid>/<spaceType>/<spaceKey> — abstract behind SpaceUri to future-proof.
  2. SetHash algorithm — XOR placeholder until upstream picks ECMH or ltHash.
  3. SpaceCredential expiration window (2–4h) — default 3h, configurable.
  4. Oplog retention policy — retain forever default; admin compaction; emit OplogGap if since precedes retained range.
  5. notifyWrite fan-out failure semantics — bounded retry + DLQ + admin visibility.
  6. MemberGrant signing key — user's atproto signing key per spec; documented.
  7. Service endpoint discovery for notifications — DID-doc service entry first, then provided value.
  8. Permissioned-data account migration — provisional com.atproto.space.exportSpaces/importSpaces, only if upstream signals interest.
  9. Permissioned record moderation — provisional com.atproto.admin.takedownSpaceRecord; design feedback to be filed.
  10. RecordPermissioned error name for public-context dereference of permissioned strongRefs — propose upstream.
  11. ecmh-rs audit and benchmark gap — first-party crate, no third-party audit, no published benchmarks. Phase 1 ships the bench harness; Phase 8 includes audit budget. Track as a workspace-level dependency-risk issue, separate from spec questions.
  12. riblt-rs adoption decision — non-spec reconciliation extension. Track in a single issue: proposal: rateless reconciliation for getRepoOplog gap recovery. Decision blocked on (a) Phase 8 perf data showing listRecords resync is genuinely a problem, (b) upstream signal on whether the spec wants to standardize a reconciliation primitive.
  13. fjall vs. SQLite default flip — track Phase 8 benchmark results; if fjall is decisively faster on the hot paths and operationally tractable for our reference deployments, propose flipping the default in v0.16.x. Default-flip would require a working cross-profile migration tool first.
  14. Cross-profile migration tooling — SQLite → fjall and back. Out of scope for v0.15.0; track as 1.x roadmap. Implementation strategy: walk SQL tables → emit canonical export (CAR for blocks, JSONL for tables) → re-ingest under target profile.

15. Source Documents (committed alongside this plan)

The two documents this plan distills from live in the repository root:

  • atproto-pds-design.md — atproto-pds: Foundational Design Document for a Rust ATProtocol Personal Data Server (revision 2, May 1 2026, by Nick Gerakines). 20 sections covering architectural overview, account management, repository storage, record operations, identity & DID, firehose, sync, XRPC, OAuth provider, moderation, email, configuration, crypto, lexicon handling, permissioned data spaces (the longest section), conformance comparison, Rust crate integration, performance, build & release, and closing notes. Cite as [design §N](atproto-pds-design.md).
  • atproto-pds-references.md — Reference companion: source citations and specification index. Mirrors the design document section-for-section with locatable URLs, file paths, source excerpts, and specification quotations. Includes the appendix "Quick lookup index" of primary URLs. Cite as [references §N](atproto-pds-references.md).

Both documents are snapshots (the references document declares itself "current as of May 1, 2026"). Forward-looking material (the Spaces Design Spec, the Spring 2026 roadmap, Diary 4) is itself tentative — re-pull upstream before finalizing protocol-facing decisions, per the references document's own caveat. When upstream sources move, update those snapshots; do not silently rewrite this plan against newer drafts without flagging the diff.

atproto-pds Reference Companion: Source Citations and Specification Index

Companion document to the atproto-pds foundational design document for the atproto-crates workspace at https://tangled.org/ngerakines.me/atproto-crates. This reference is organized to mirror the design document section-for-section. Each section provides locatable URLs, file paths, source excerpts, and specification quotations that back the claims and decisions in the design document.


1. Architectural Overview — References

Primary specification: AT Protocol overview

  • https://atproto.com/specs/atp — top-level protocol specification.

    "Network: client-server and server-server HTTP APIs are described with lexicon schemas (XRPC), as are WebSocket event streams. Individual records can be globally referenced with AT-URIs. Personal Data Servers (PDS) host accounts, handle key management, manage repositories, provide authentication, and proxy client HTTP requests. Relays aggregate from many PDS hosts and output a unified full-network firehose."

Community wiki PDS reference

  • https://atproto.wiki/en/wiki/reference/core-architecture/pds

    "PDSes handle the complete lifecycle of user accounts… resolve and maintain the connection between a user's handle and Decentralized Identifier (DID)… handle the user's cryptographic keys - both the AT Protocol signing key used to authenticate repository changes, and the PLC rotation key used for identity operations… host and manage user repositories. They maintain the Merkle Search Tree data structure that stores user records, handling mutations and generating diffs."

TypeScript reference PDS

  • https://github.com/bluesky-social/atproto/tree/main/packages/pds — implementation root.
  • https://github.com/bluesky-social/atproto — workspace; packages/pds, packages/repo, packages/identity, packages/lexicon, packages/sync, packages/xrpc-server, packages/crypto, packages/oauth-provider, services/pds.
  • https://github.com/bluesky-social/pds — deployment distribution (installer.sh, compose.yaml, ACCOUNT_MIGRATION.md, pdsadmin.sh).

Indigo (Go) reference: bigsky relay & atproto packages

  • https://github.com/bluesky-social/indigo/blob/main/HACKING.md

    "carstore: library for storing repo data in CAR files on disk, plus a metadata SQL db; events: types, codegen CBOR helpers, and persistence; lex: implements codegen for Lexicons; mst: merkle search tree implementation; pds: PDS server implementation; plc: implementation of a fake PLC server, and a PLC client; repo: implements atproto repo on top of a blockstore; repomgr: wraps many repos with a single carstore backend."

  • https://github.com/bluesky-social/indigo/blob/main/cmd/bigsky/README.md — relay README; subscribes to PDS hosts, outputs combined firehose.

rsky (Rust)

  • https://github.com/blacksky-algorithms/rsky/blob/main/README.md

    "rsky-pds: 'Personal Data Server', hosting repo content for atproto accounts. It differs from the canonical Typescript implementation by using Postgres instead of SQLite, s3 compatible blob storage instead of on-disk, and mailgun for emailing."

What a PDS does — discussion #2350

  • https://github.com/bluesky-social/atproto/discussions/2350

    "PDS instances host accounts… The defining feature of a PDS is hosting and managing public atproto repositories for accounts… PDS instances store arbitrary Lexicon-defined private preferences… Generic mechanism for proxying specific XRPC endpoints on to other network services may be added."

Tap (sync reference) and Roadmap context

  • https://atproto.com/blog/2026-spring-roadmap — Spring 2026 roadmap; permissioned data, account experience, sync 1.1 follow-through.
  • https://atproto.com/blog/introducing-tap — pull-based subscriber and backfill reference.

Spring 2026 self-host context (handle wildcard, _atproto record, healthcheck)

  • https://github.com/bluesky-social/pds: installer.sh, compose.yaml.
  • https://atproto.com/guides/self-hosting.

2. Account Management — References

Lexicons (file paths in atproto repo)

  • https://github.com/bluesky-social/atproto/tree/main/lexicons/com/atproto/server/
    • createAccount.json, createSession.json, refreshSession.json, deleteSession.json, getSession.json, describeServer.json, createInviteCode.json, createInviteCodes.json, getAccountInviteCodes.json, requestPasswordReset.json, resetPassword.json, updateEmail.json, requestEmailUpdate.json, requestEmailConfirmation.json, confirmEmail.json, activateAccount.json, deactivateAccount.json, deleteAccount.json, requestAccountDelete.json, checkAccountStatus.json, getServiceAuth.json, reserveSigningKey.json, listAppPasswords.json, createAppPassword.json, revokeAppPassword.json.

Specification: Account lifecycle

  • https://atproto.com/specs/account — accounts spec.
  • https://atproto.com/guides/account-lifecycle

    "PDS emits an #account event… Relay updates local account status for the repo, and passes through the #account event… account migration should result in three events coming from the relay: an #identity (from new PDS), an #account (from new PDS), and a #commit (from new PDS)."

Migration

  • https://github.com/bluesky-social/pds/blob/main/ACCOUNT_MIGRATION.md

    "In order to create an account, you first need to prove to the new PDS that you're in control of the DID… You can obtain this through calling com.atproto.server.getServiceAuth from your old PDS… You can then upload those exact bytes to your new PDS through com.atproto.repo.importRepo."

  • https://atproto.com/guides/account-migration — official migration guide.

TS implementation

  • https://github.com/bluesky-social/atproto/tree/main/packages/pds/src/account-manager: AccountManager with sub-classes for invite codes, app passwords, sessions, email tokens.

Indigo PDS

  • https://github.com/bluesky-social/indigo/tree/main/pds — Go PDS handlers.

rsky-pds

  • https://github.com/blacksky-algorithms/rsky/tree/main/rsky-pds — Rust PDS handlers (Rocket-based).

cocoon (Go)

  • https://tangled.org/hailey.at/cocoon — implements com.atproto.server.activateAccount, checkAccountStatus, confirmEmail, createAccount, createInviteCode(s), deactivateAccount, deleteAccount, deleteSession, describeServer, getAccountInviteCodes, getServiceAuth, refreshSession, requestAccountDelete, requestEmailConfirmation/Update, requestPasswordReset, reserveSigningKey, resetPassword, updateEmail. Notes "not going to add app passwords."

tranquil-pds

  • https://tangled.org/tranquil.farm/tranquil-pds — superset of reference PDS: passkeys/2FA (WebAuthn/FIDO2, TOTP, backup codes), SSO, did:web hosting, multi-channel notifications, granular OAuth scopes, app passwords, account delegation.
  • Forks: https://tangled.org/tjh.dev/tranquil-pds, https://tangled.org/bas.sh/tranquil-pds, https://tangled.org/vicwalker.dev.br/tranquil-pds.

3. Repository Storage — References

Specification

  • https://atproto.com/specs/repository

    "At a high level, the repository MST is a key/value mapping where the keys are non-empty byte arrays, and the values are CID links to records. The MST data structure should be fully reproducible from such a mapping of bytestrings-to-CIDs, with exactly reproducible root CID hash." "The standard repository export format for atproto repositories is CAR v1, which have file suffix .car and mimetype application/vnd.ipld.car. This aligns with the DASL CAR specification." "Repo paths currently have a fixed structure of <collection>/<record-key>. This means a valid, normalized Namespace ID (NSID), followed by a /, followed by a valid Record Key."

  • https://atproto.com/specs/data-model

    "When data needs to be authenticated (signed), referenced (linked by content hash), or stored efficiently, it is encoded in Concise Binary Object Representation (CBOR)… The specific normalized subset of CBOR used in the atproto data model is called DRISL (which is successor to DAG-CBOR)."

  • https://atproto.com/guides/data-repos — high-level data repo guide.
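The fixed `<collection>/<record-key>` path structure quoted above can be illustrated with a small Rust sketch. The function names are hypothetical, and the NSID and Record Key checks are deliberately simplified assumptions rather than full implementations of those specifications.

```rust
/// Split a repo path of the form `<collection>/<record-key>`.
/// Returns `None` when either half fails the simplified checks below.
fn split_repo_path(path: &str) -> Option<(&str, &str)> {
    let (collection, rkey) = path.split_once('/')?;
    if is_plausible_nsid(collection) && is_plausible_rkey(rkey) {
        Some((collection, rkey))
    } else {
        None
    }
}

/// Simplified NSID check (assumption): at least three dot-separated,
/// non-empty segments made of ASCII letters, digits, or hyphens.
fn is_plausible_nsid(s: &str) -> bool {
    let segments: Vec<&str> = s.split('.').collect();
    segments.len() >= 3
        && segments.iter().all(|seg| {
            !seg.is_empty() && seg.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
        })
}

/// Simplified Record Key check (assumption): 1 to 512 characters from a
/// restricted ASCII set, and never the literal values "." or "..".
fn is_plausible_rkey(s: &str) -> bool {
    !s.is_empty()
        && s.len() <= 512
        && s != "."
        && s != ".."
        && s.chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_' | ':' | '~'))
}
```

A production implementation would more likely rely on typed NSID and record-key parsers than on ad hoc string checks like these.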

TypeScript packages

  • https://github.com/bluesky-social/atproto/tree/main/packages/repo — MST, commit, CAR, sync diffs.
    • Key files: src/mst/mst.ts, src/repo.ts, src/sync/, src/storage/, src/cid-set.ts.

Indigo Go

  • https://github.com/bluesky-social/indigo/tree/main/atproto/repo — refactored repo package.
  • https://github.com/bluesky-social/indigo/tree/main/repo — older repo.
  • https://github.com/bluesky-social/indigo/tree/main/mst — MST library.
  • https://github.com/bluesky-social/indigo/tree/main/carstore — on-disk CAR shard store.
  • https://pkg.go.dev/github.com/bluesky-social/indigo/repomgr: RepoManager.CreateRecord, BatchWrite, ImportNewRepo, HandleExternalUserEvent.

rsky (Rust)

  • https://github.com/blacksky-algorithms/rsky/tree/main/rsky-repo — MST and repo logic; rsky-pds uses it.

ngerakines atproto-crates: atproto-repo, atproto-dasl

  • https://tangled.org/ngerakines.me/atproto-crates/tree/main/crates/atproto-repo — MST encode/decode, commit structures, tree diffing.
  • https://tangled.org/ngerakines.me/atproto-crates/tree/main/crates/atproto-dasl — DRISL DAG-CBOR encoding, CID computation, CAR v1 archives.

MST/CIDs explainer

  • https://github.com/bluesky-social/atproto/discussions/2644

    "atproto uses CIDs in a very restricted way: basically all CIDs are generated with the same hash today, the format flexibility is so we can evolve in the future."

Data validation guide

  • https://atproto.com/guides/data-validation

    "A reasonable maximum record size limit (MAX_CBOR_RECORD_SIZE) is 1 MiByte… subscribeRepos Lexicon limits #commit message block size to 2,000,000 bytes."

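As a minimal sketch of how the two limits quoted above might be enforced, the following Rust fragment checks byte lengths before accepting a record or emitting a commit. The constant for the commit limit, the error type, and the function names are hypothetical. Treating both limits as inclusive is an assumption.

```rust
/// Recommended maximum encoded record size from the data validation guide.
const MAX_CBOR_RECORD_SIZE: usize = 1024 * 1024; // 1 MiByte

/// Block size limit for #commit messages quoted from the subscribeRepos Lexicon.
/// The constant name is hypothetical.
const MAX_COMMIT_BLOCKS_SIZE: usize = 2_000_000;

#[derive(Debug, PartialEq)]
enum SizeError {
    RecordTooLarge { len: usize },
    CommitBlocksTooLarge { len: usize },
}

/// Reject a CBOR-encoded record that exceeds the recommended limit.
fn check_record_size(cbor: &[u8]) -> Result<(), SizeError> {
    if cbor.len() > MAX_CBOR_RECORD_SIZE {
        Err(SizeError::RecordTooLarge { len: cbor.len() })
    } else {
        Ok(())
    }
}

/// Reject a commit CAR slice that would not fit in a #commit message.
fn check_commit_blocks_size(car_slice: &[u8]) -> Result<(), SizeError> {
    if car_slice.len() > MAX_COMMIT_BLOCKS_SIZE {
        Err(SizeError::CommitBlocksTooLarge { len: car_slice.len() })
    } else {
        Ok(())
    }
}
```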

4. Record Operations — References

Lexicons in lexicons/com/atproto/repo/

  • https://github.com/bluesky-social/atproto/tree/main/lexicons/com/atproto/repo/
    • createRecord.json, putRecord.json, deleteRecord.json, getRecord.json, listRecords.json, applyWrites.json, describeRepo.json, uploadBlob.json, importRepo.json, listMissingBlobs.json.

Blob spec

  • https://atproto.com/specs/blob — blob upload/reference rules.

XRPC HTTP API spec

  • https://atproto.com/specs/xrpc

    "PDS implementations are free to restrict blob uploads as they see fit. For example, they may have a global maximum size or restricted set of allowed MIME types."

TS handlers

  • https://github.com/bluesky-social/atproto/tree/main/packages/pds/src/api/com/atproto/repo — handler files for each lexicon.
  • packages/pds/src/api/com/atproto/repo/applyWrites.ts, createRecord.ts, putRecord.ts, deleteRecord.ts, uploadBlob.ts, importRepo.ts, listMissingBlobs.ts.

Indigo handlers

  • github.com/bluesky-social/indigo/pds: RepoCreateRecord, RepoPutRecord, RepoDeleteRecord, RepoApplyWrites handlers.

rsky-pds handlers

  • github.com/blacksky-algorithms/rsky/tree/main/rsky-pds/src/repository and src/apis/com/atproto/repo.

cocoon handlers (verified via README)

  • com.atproto.repo.applyWrites, createRecord, putRecord, deleteRecord, describeRepo, getRecord, importRepo ("Works 'okay'. Use with extreme caution."), listRecords, listMissingBlobs.

5. Identity & DID Operations — References

Specifications

  • https://atproto.com/specs/did
  • https://atproto.com/specs/handle
  • https://github.com/did-method-plc/did-method-plc — did:plc method spec.
  • https://atproto.com/guides/identity

    "Handles are DNS names. They are resolved using DNS TXT records or an HTTP well-known endpoint, and must be confirmed by a matching entry in the DID document."

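The handle resolution rule quoted above can be sketched in Rust without performing any network I/O. The helper names are hypothetical. The `_atproto` TXT name, the `did=` value prefix, the `/.well-known/atproto-did` path, and the `at://` form of `alsoKnownAs` entries follow the handle specification as I understand it, so verify them against the spec before relying on this.

```rust
/// DNS name to query for a TXT record when resolving a handle.
fn dns_txt_name(handle: &str) -> String {
    format!("_atproto.{}", handle.to_ascii_lowercase())
}

/// HTTPS well-known URL to fetch when resolving a handle.
fn well_known_url(handle: &str) -> String {
    format!("https://{}/.well-known/atproto-did", handle.to_ascii_lowercase())
}

/// Extract a DID from a TXT record value of the form `did=<did>`.
fn did_from_txt(value: &str) -> Option<&str> {
    value
        .strip_prefix("did=")
        .filter(|did| did.starts_with("did:"))
}

/// Confirm the handle against the `alsoKnownAs` entries of the resolved
/// DID document. Without this step the resolution must not be trusted.
fn is_confirmed(handle: &str, also_known_as: &[&str]) -> bool {
    also_known_as.iter().any(|aka| {
        aka.strip_prefix("at://")
            .map_or(false, |h| h.eq_ignore_ascii_case(handle))
    })
}
```

A resolver would try DNS and HTTP, take the DID from whichever succeeds, fetch the DID document, and accept the result only if `is_confirmed` returns true.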
TS

  • https://github.com/bluesky-social/atproto/tree/main/packages/identity: IdResolver, HandleResolver, DidResolver (plc, web).

Indigo

  • github.com/bluesky-social/indigo/atproto/identity: DefaultDirectory, Lookup, PDSEndpoint(), base resolution.
  • github.com/bluesky-social/indigo/atproto/syntax: ParseAtIdentifier, DID, Handle, NSID, TID, AT-URI types.
  • github.com/bluesky-social/indigo/plc — PLC client.

rsky-identity

  • github.com/blacksky-algorithms/rsky/tree/main/rsky-identity.

ngerakines atproto-identity (Rust)

  • https://crates.io/crates/atproto-identity, https://lib.rs/crates/atproto-identity

    "Multi-method DID resolution (plc, web, key), DNS/HTTP handle resolution, PLC directory operations, and P-256/P-384/K-256 cryptographic operations."

  • Binaries: atproto-identity-resolve, atproto-identity-key, atproto-identity-sign, atproto-identity-validate, atproto-identity-plc-audit, atproto-identity-plc-fork-viz, plus unified atpdid CLI added in 0.14.

atproto-plc crate

  • https://crates.io/crates/atproto-plc — did-method-plc implementation with WASM.

6. Firehose / Event Stream — References

Lexicons

  • https://github.com/bluesky-social/atproto/blob/main/lexicons/com/atproto/sync/subscribeRepos.json: #commit, #sync, #identity, #account event types.

Specs

  • https://atproto.com/specs/event-stream — WebSocket framing, CBOR, sequence cursors.
  • https://atproto.com/specs/sync

    "#commit messages contain both a repo diff (CAR slice), and an array of record operations. The operations can be applied in reverse… If the list of operations is complete, the root of the tree should be exactly that of the previous commit object of the repository."

Sync 1.1 proposal

  • https://github.com/bluesky-social/proposals/tree/main/0006-sync-iteration

    "include additional metadata and MST blocks in firehose #commit messages to enable per-commit MST validation ('operation inversion' or 'inductive firehose'); hard size limits on repository diffs (#commit messages), and removing the tooBig flag; new #sync firehose event type to declare the current repository state."

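The "operation inversion" idea quoted above can be sketched in Rust using a sorted map as a stand-in for the MST. This is an illustration under assumptions, not the validation algorithm itself: a real validator inverts operations against the partial MST carried in the commit's CAR slice and compares root CIDs, and the `Op` shape here (with a `prev` CID on updates and deletes) is a hypothetical simplification of the firehose schema.

```rust
use std::collections::BTreeMap;

/// Key/value view of a repository: repo path -> record CID (as text).
type RepoMap = BTreeMap<String, String>;

/// Hypothetical record operation shape; `prev` is the CID being replaced.
#[derive(Clone, Debug)]
enum Op {
    Create { path: String, cid: String },
    Update { path: String, prev: String, cid: String },
    Delete { path: String, prev: String },
}

/// Apply one operation going forward in time.
fn apply(map: &mut RepoMap, op: &Op) {
    match op {
        Op::Create { path, cid } | Op::Update { path, cid, .. } => {
            map.insert(path.clone(), cid.clone());
        }
        Op::Delete { path, .. } => {
            map.remove(path);
        }
    }
}

/// Undo one operation, failing if the current state does not match it.
fn invert(map: &mut RepoMap, op: &Op) -> Result<(), String> {
    match op {
        Op::Create { path, cid } => {
            if map.get(path) != Some(cid) {
                return Err(format!("create of {path} does not match current state"));
            }
            map.remove(path);
        }
        Op::Update { path, prev, cid } => {
            if map.get(path) != Some(cid) {
                return Err(format!("update of {path} does not match current state"));
            }
            map.insert(path.clone(), prev.clone());
        }
        Op::Delete { path, prev } => {
            if map.contains_key(path) {
                return Err(format!("deleted path {path} is still present"));
            }
            map.insert(path.clone(), prev.clone());
        }
    }
    Ok(())
}

/// Undo a full operation list in reverse order to recover the prior state.
fn invert_all(map: &mut RepoMap, ops: &[Op]) -> Result<(), String> {
    ops.iter().rev().try_for_each(|op| invert(map, op))
}
```

If the operation list is complete, the map after `invert_all` equals the previous commit's mapping, which is the map-level analogue of the root CID check described in the quotation.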
Relay v1.1 rollout

  • https://atproto.com/blog/relay-updates-sync-v1-1

    "the sync v1.1 changes to the #commit message schema are supported; #commit messages validated with MST inversion (controlled by 'lenient mode' flag); new com.atproto.sync.listHosts endpoint; com.atproto.sync.getRepo endpoint implemented as HTTP redirect to PDS instance… The relay can handle thousands of messages per second using on the order of 2 vCPU cores, 12 GByte of RAM, and 30 Mbps."

TS

  • https://github.com/bluesky-social/atproto/tree/main/packages/pds/src/sequencer: Sequencer, outboxes, message persistence.

Indigo

  • github.com/bluesky-social/indigo/events — event scheduler & dispatch.

rsky firehose

  • github.com/blacksky-algorithms/rsky/tree/main/rsky-pds/src/sequencer.

7. Sync Protocol — References

Lexicons in lexicons/com/atproto/sync/

  • getRepo.json, getRepoStatus.json, getLatestCommit.json, getRecord.json, getBlocks.json, listRepos.json, listReposByCollection.json, getBlob.json, listBlobs.json, requestCrawl.json, notifyOfUpdate.json, subscribeRepos.json, listHosts.json.

Spec

  • https://atproto.com/specs/sync

    "This endpoint is not authenticated, and returns all repo records, MST nodes, and the current signed commit object, all in a single CAR file."

TS sync routes

  • packages/pds/src/api/com/atproto/sync/*.ts — handlers for each endpoint.

Indigo sync handlers

  • github.com/bluesky-social/indigo/pds/handlers.go and github.com/bluesky-social/indigo/cmd/relay.

rsky-pds sync handlers

  • github.com/blacksky-algorithms/rsky/tree/main/rsky-pds/src/apis/com/atproto/sync.

cocoon: implements full sync set

  • Per README.md: getBlob, getBlocks, getLatestCommit, getRecord, getRepoStatus, getRepo, listBlobs, listRepos, requestCrawl, subscribeRepos.

8. XRPC Server — References

Specs

  • https://atproto.com/specs/xrpc
  • https://atproto.com/specs/lexicon

TS

  • https://github.com/bluesky-social/atproto/tree/main/packages/xrpc-server: Server, lexicon-driven validation.
  • packages/xrpc — client.

Indigo

  • github.com/bluesky-social/indigo/xrpc — XRPC client.

Rust crates

  • atproto-xrpcs (https://crates.io/crates/atproto-xrpcs): JWT authorization extractors for Axum handlers, DID-based issuer verification.
  • Example service: atproto-xrpcs-helloworld — DID:web identity, service document generation, JWT auth pattern. CLI: atpxrpc for client-side calls.

Endpoint inventory

  • https://github.com/bluesky-social/atproto/tree/main/lexicons/com/atproto/ — full enumeration: admin/, identity/, label/, lexicon/, moderation/, repo/, server/, sync/, temp/.

9. OAuth Provider — References

Spec

  • https://atproto.com/specs/oauth

    "DPoP (with mandatory server issued nonces) is required to bind auth tokens to specific client software instances… Pushed Authentication Requests (PAR) are used to streamline the authorization request flow… 'Confidential' clients use JWTs signed with a secret key."

Permissions

  • https://atproto.com/specs/permission

    "A permission can be represented in a string format (for direct use as an OAuth scope), or as a JSON object (for use in permission sets). Permission sets are published as public lexicon schemas."

RFCs (referenced by spec)

  • RFC 8414 (OAuth 2.0 Authorization Server Metadata)
  • RFC 9126 (PAR)
  • RFC 9449 (DPoP)
  • RFC 7636 (PKCE)
  • RFC 9728 (OAuth 2.0 Protected Resource Metadata; formerly draft-ietf-oauth-resource-metadata)
  • RFC 7591 (Dynamic Client Registration)
  • RFC 7009 (Token Revocation)
  • RFC 7519 (JWT)

TS

  • https://github.com/bluesky-social/atproto/tree/main/packages/oauth: oauth-provider, oauth-client, oauth-client-node, oauth-client-browser, oauth-types.

AIP

  • https://github.com/graze-social/aip — "ATmosphere Authentication, Identity, and Permission Proxy" (Rust); OAuth 2.1 Authorization Server with PKCE/PAR/DPoP, dynamic client registration, in-memory/SQLite/Postgres storage.

ngerakines atproto-oauth crates

  • atproto-oauth — DPoP (RFC 9449), PKCE (RFC 7636), JWT, secure storage abstractions.
  • atproto-oauth-aip — IdP authorization-code flow with PAR + AT Protocol session management.
  • atproto-oauth-axum — Axum handlers for callbacks, JWKS endpoints, client metadata.

Wiki / blog references

  • https://atproto.wiki/ — OAuth article.
  • https://docs.bsky.app/blog/oauth-atproto
  • https://docs.bsky.app/docs/advanced-guides/oauth-client
  • https://atproto.com/blog/oauth-improvements — DPoP htu, scope changes (transition:email).

10. Moderation & Safety — References

Lexicons

  • lexicons/com/atproto/admin/: createCommunicationTemplate, deleteAccount, disableAccountInvites, disableInviteCodes, getAccountInfo(s), getInviteCodes, getSubjectStatus, searchAccounts, sendEmail, updateAccountEmail, updateAccountHandle, updateAccountPassword, updateSubjectStatus.
  • lexicons/com/atproto/moderation/createReport.json.

Labels

  • https://atproto.com/specs/label — DRISL/CBOR signed labels with uri, cid, val, neg, src, cts, exp, sig.

Ozone

  • https://github.com/bluesky-social/ozone — moderation tooling.

TS

  • packages/pds/src/api/com/atproto/admin/ — admin handlers.
  • packages/pds/src/api/com/atproto/moderation/createReport.ts — typically proxied.

cocoon's pattern

  • README: "com.atproto.moderation.createReport (Note: this should be handled by proxying, not actually implemented in the PDS)."

Indigo / rsky moderation handlers

  • Indigo equivalents in pds/; rsky equivalents in rsky-pds/src/apis/com/atproto/admin and moderation.

hailey atproto-ruleset (Osprey rules)

  • https://tangled.org/hailey.at/atproto-ruleset — labeler rules used by an actual production labeler.

11. Email & Notifications — References

TS mailer

  • packages/pds/src/mailer/ — Nodemailer-based SMTP.

cocoon SMTP

  • COCOON_SMTP_USER/PASS/HOST/PORT/EMAIL/NAME env variables; SMTP-based.

tranquil-pds multi-channel

  • README: "multi-channel communication: you can be notified via email, discord, telegram, and signal for verification and alerts."
  • Crates: tranquil-pds/crates/tranquil-auth/, etc.

Rust mail library

  • https://crates.io/crates/lettre — SMTP/Sendmail email library.

12. Configuration & Deployment — References

Self-hosting

  • https://atproto.com/guides/self-hosting
  • https://github.com/bluesky-social/pds/blob/main/installer.sh
    COMPOSE_URL="https://raw.githubusercontent.com/bluesky-social/pds/main/compose.yaml"
    PDSADMIN_URL="https://raw.githubusercontent.com/bluesky-social/pds/main/pdsadmin.sh"
    REQUIRED_DOCKER_PACKAGES="containerd.io docker-ce docker-ce-cli docker-compose-plugin"
    
    systemd unit calls docker compose --file ${PDS_DATADIR}/compose.yaml up --detach.
  • https://github.com/bluesky-social/pds/blob/main/compose.yaml — Caddy + PDS + watchtower.

ENV variables: PDS_HOSTNAME, PDS_JWT_SECRET, PDS_ADMIN_PASSWORD, PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX, PDS_DATA_DIRECTORY, PDS_BLOBSTORE_DISK_LOCATION, PDS_DID_PLC_URL, PDS_BSKY_APP_VIEW_URL, PDS_REPORT_SERVICE_URL, PDS_CRAWLERS, PDS_EMAIL_SMTP_URL.

tranquil-pds config

  • https://tangled.org/tranquil.farm/tranquil-pds/blob/main/example.toml — TOML config; precedence: env > --config > /etc/tranquil-pds/config.toml > defaults. PDS_HOSTNAME, DATABASE_URL, INVITE_CODE_REQUIRED, ENABLE_PDS_HOSTED_DID_WEB, PLC_DIRECTORY_URL, TRANQUIL_PDS_ALLOW_INSECURE_SECRETS.
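The precedence chain above (env > --config > /etc/tranquil-pds/config.toml > defaults) amounts to a first-match lookup across ordered layers. The Rust sketch below shows that idea only. It is not tranquil-pds code, the `Layer` type and helper names are hypothetical, and real layers would be populated from the environment and parsed TOML files.

```rust
use std::collections::HashMap;

/// One configuration source, already flattened to key/value strings.
type Layer = HashMap<String, String>;

/// Return the value from the first layer that defines `key`.
/// Layers must be passed from highest precedence to lowest.
fn resolve<'a>(key: &str, layers: &[&'a Layer]) -> Option<&'a str> {
    for &layer in layers {
        if let Some(value) = layer.get(key) {
            return Some(value.as_str());
        }
    }
    None
}

/// Convenience constructor used only to keep examples short.
fn layer(pairs: &[(&str, &str)]) -> Layer {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

With layers ordered as `[&env, &cli_file, &system_file, &defaults]`, a key set in the environment wins over the same key in any file, and a key absent everywhere resolves to `None`.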

cocoon config

  • https://tangled.org/hailey.at/cocoon (.env.example): COCOON_DID, COCOON_HOSTNAME, COCOON_CONTACT_EMAIL, COCOON_RELAYS, COCOON_ADMIN_PASSWORD, COCOON_SESSION_SECRET, COCOON_DB_TYPE (sqlite/postgres), COCOON_S3_*, COCOON_SMTP_*. Key generation: init-keys.sh runs cocoon create-rotation-key and cocoon create-private-jwk.

rsky-pds config

  • Postgres-based, S3 blob storage, Mailgun email; configured via env (DATABASE_URL, S3_*, MAILGUN_*).

13. Crypto Requirements — References

Spec

  • https://atproto.com/specs/cryptography

    "Two elliptic curves are currently supported, and implementations are expected to fully support both: p256 elliptic curve (NIST P-256, secp256r1, prime256v1)… k256 elliptic curve (NIST K-256, secp256k1)." "Data encryption is not used directly in the protocol."

TS

  • packages/crypto: Secp256k1Keypair, P256Keypair, signature verification, did:key encoding.

Indigo

  • github.com/bluesky-social/indigo/atproto/crypto.

rsky-crypto

  • github.com/blacksky-algorithms/rsky/tree/main/rsky-crypto.

Rust crates

  • https://crates.io/crates/k256 — secp256k1.
  • https://crates.io/crates/p256 — NIST P-256.
  • atproto-identity (ngerakines) — wraps both, supports P-256/P-384/K-256, DID:key multibase.

bnewbold cryptography notes

  • https://gist.github.com/bnewbold — atproto cryptography notes, low-S signature normalization.

14. Lexicon Handling — References

Spec

  • https://atproto.com/specs/lexicon — schema language; types: query, procedure, subscription, record, object, params, string, integer, boolean, bytes, cid-link, blob, array, union, ref, unknown, token.

Lexicon community

  • https://lexicon.community/ — shared lexicon dev.

TS

  • packages/lexicon — validators; packages/lex-cli — codegen.
  • packages/lex/lex — major refresh in roadmap (2026 Spring): "new lex tool for resolving published lexicons and generating types."

Indigo

  • github.com/bluesky-social/indigo/atproto/lexicon, cmd/lexgen.

rsky-lexicon

  • github.com/blacksky-algorithms/rsky/tree/main/rsky-lexicon.

ngerakines atproto-lexicon

  • https://crates.io/crates/atproto-lexicon — DNS-based resolution, recursive lookup, NSID validation; CLI atproto-lexicon-resolve.
  • Lexicon resolution mechanics: TXT records on _lexicon.<NSID-as-hostname> plus published schema records.
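To make the `_lexicon.<NSID-as-hostname>` mapping concrete, here is a small Rust sketch that derives the DNS name to query from an NSID. The function name is hypothetical, and the rule it encodes (drop the final name segment, reverse the remaining authority segments, prefix `_lexicon.`) reflects my reading of the Lexicon resolution spec and should be checked against it.

```rust
/// Derive the TXT record name used to find the authority for an NSID.
/// Returns `None` for inputs with fewer than three segments or any
/// empty segment; full NSID syntax validation is out of scope here.
fn lexicon_dns_name(nsid: &str) -> Option<String> {
    let mut segments: Vec<&str> = nsid.split('.').collect();
    if segments.len() < 3 || segments.iter().any(|s| s.is_empty()) {
        return None;
    }
    segments.pop(); // drop the name segment, e.g. "post"
    segments.reverse(); // authority segments become a hostname
    Some(format!("_lexicon.{}", segments.join(".")))
}
```

For the NSID `app.bsky.feed.post` this yields `_lexicon.feed.bsky.app`. The TXT value then identifies the DID whose repository publishes the schema record.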

Lexicon Garden

  • https://lexicon.garden/ — community-hosted shared lexicon repository.

15. Permissioned Data Spaces — References

Authoritative spec (provided URL): https://github.com/bluesky-social/atproto/blob/f592188e7ed4720450ac0aef877331dbc9998d8d/docs/superpowers/specs/2026-04-22-permissioned-data-pds-design.md — internal Bluesky design document for permissioned-data PDS implementation. It defines:

  • The @atproto/space package layered into packages/space/ with submodules for: space-core (data structures, ECMH, member list, repo commit), space-server (XRPC route handlers, sync log, write notifications), space-client (consumer SDK).
  • Data model tables (per Diary 4): space_owner (DID), space_type (NSID), space_key (skey), member list with (DID, read|write) tuples; permissioned-repo per (user, space).
  • XRPC endpoints (newly added under com.atproto.space.*): space.create, space.getMembers, space.updateMembers, space.transferOwnership, space.requestSpaceCredential, space.getRepo, space.getRepoCommit, space.getOpsLog, space.notifyWrite, space.applyWrites.
  • JWT shapes: short-lived (~2-4h) space credentials signed by space-owner key; service-auth-style member grant tokens bound to client ID.
  • Testing strategy: ECMH commitment determinism tests, member-list sync interop, write-notification routing.
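The determinism tests mentioned above rest on one property of a set hash: the digest depends only on which elements are present, not on the order of writes, and adding or removing an element is a single cheap operation. The Rust sketch below illustrates that property with an XOR accumulator over per-element digests. It is not ECMH and not cryptographically secure: it swaps in the standard library's `DefaultHasher` as a stand-in for a real hash, all names are hypothetical, and because XOR cancels, adding the same element twice removes it, so it models a set rather than a multiset.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// XOR accumulator over per-element digests (illustrative stand-in only).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct XorSetHash(u64);

/// Digest one (path, cid) element with a non-cryptographic hash.
fn element_digest(path: &str, cid: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    (path, cid).hash(&mut hasher);
    hasher.finish()
}

impl XorSetHash {
    /// Fold an element into the accumulator in constant time.
    fn add(&mut self, path: &str, cid: &str) {
        self.0 ^= element_digest(path, cid);
    }

    /// Remove an element; with XOR this is the same operation as `add`.
    fn remove(&mut self, path: &str, cid: &str) {
        self.0 ^= element_digest(path, cid);
    }
}
```

A determinism test in this style inserts the same elements in different orders, or adds and then removes an extra element, and asserts that the resulting accumulators are equal.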

Diary 1 — To Encrypt or Not (Feb 11, 2026)

  • https://dholms.leaflet.pub/3meluqcwky22a

    "When you post in a private subreddit with 50,000 members, you're not worried about the server operator reading your post. Your goal isn't to keep the content secret, it's to keep unauthorized users from viewing the content. In other words, you're thinking about access control, not encryption." "Permissioned data is about access and data flow… E2EE is about cryptographic confidentiality. You can layer the second on top of the first."

Diary 2 — Buckets (Feb 26, 2026)

  • https://dholms.leaflet.pub/3mfrsbcn2gk2a

    "A bucket is a named container that holds records and has a single authoritative ACL… When you post into a bucket, your post inherits the ACL of that bucket." "A bucket is a bit like a repository. But it isn't the public repository and (spoiler) probably doesn't use an MST."

Diary 3 — Your Bucket, My Data

  • https://dholms.leaflet.pub/3mguviy6iks2a — referenced from Diary 4; addresses "data lives on member PDSes vs. centralized space host" tradeoffs.

Interlude — Spaces (rename)

  • https://dholms.leaflet.pub/3mhbuoc64xk2a — renames "buckets" to "permission spaces."

Diary 4 — The Big Picture (Mar 20, 2026)

  • https://dholms.leaflet.pub/3mhj6bcqats2o — full architectural sketch:

    "A permission space (or space for short) is an authorization and sync boundary for permissioned records representing a shared social context… Each user stores their own records for a given space on their own PDS. The space exists not as a physical container but as a coordination concept." "Each space has a single member list. Each entry is a (DID, read|write) tuple. Write access is inclusive of read access. This is the only ACL data structure." "We use ECMH (Elliptic Curve Multiset Hash), a set hash where adding or removing an element is a single point operation rather than a full recompute of the hash… The ECMH for a permissioned repo is authenticated using a randomly generated and transient HMAC key, which is in turn signed by the user's atproto signing key. The commit is composed of the ECMH hash, the computed HMAC, the HMAC key, and the signature." "To read records from a space, a reader needs a space credential. A space credential is a stateless authorization token issued by the space owner: short-lived (~2-4 hour expiration); scoped to a specific space; asymmetrically signed by the space owner's key; usable with any member PDS." "Sync is pull-based. Applications are responsible for staying in sync with all member PDSes. PDSes assist by sending lightweight write notifications to prompt pulls when new data is written." "Spaces can be configured as 'default allow' or 'default deny' for service access… Default allow is considered the natural choice for spaces. It supports atmospheric interoperation."

    Address scheme proposal: six-tuple (space owner DID, space type NSID, space key, user DID, collection NSID, record key); ats:// likely URI scheme.
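The member list rule quoted above (one list per space, `(DID, read|write)` entries, write implying read) maps onto a very small data structure. The Rust sketch below is a hypothetical illustration of that rule only. It ignores space credentials, persistence, and sync, and none of the type or method names come from the Spaces Design Spec.

```rust
use std::collections::BTreeMap;

/// Access level for one member; write is inclusive of read.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Access {
    Read,
    Write,
}

/// The single member list of a space: DID -> access level.
#[derive(Default)]
struct MemberList {
    entries: BTreeMap<String, Access>,
}

impl MemberList {
    /// Add a member or change an existing member's access level.
    fn set(&mut self, did: &str, access: Access) {
        self.entries.insert(did.to_string(), access);
    }

    /// Remove a member entirely.
    fn remove(&mut self, did: &str) {
        self.entries.remove(did);
    }

    /// Any listed member may read, regardless of level.
    fn can_read(&self, did: &str) -> bool {
        self.entries.contains_key(did)
    }

    /// Only members with write access may write.
    fn can_write(&self, did: &str) -> bool {
        matches!(self.entries.get(did), Some(Access::Write))
    }
}
```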

Spring 2026 Roadmap

  • https://atproto.com/blog/2026-spring-roadmap

    "We are referring to this as 'permissioned data,' meaning non-public data with explicit access control. Several teams have been working in parallel to implement extensions to the protocol for non-public data, including Blacksky, Northsky, and Habitat. A sketch design proposal has been published… Shipping Permissioned Data will require updates to PDS implementations, SDKs, written specifications, moderation tooling, and more."

Permission spec (OAuth scope side)

  • https://atproto.com/specs/permission — string and JSON forms; account:repo, repo:<NSID>, rpc:<NSID>, blob:<mime>/<subtype>, account:identity. Permission set NSID hierarchy and resolution. Cache: 24h stale, 90d expiration recommended.
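The string form of a permission can be split into its parts with a few lines of Rust. This sketch assumes a `<resource>:<positional>?<key>=<value>&...` shape, which is my simplified reading of the permission spec. The struct and function names are hypothetical, and percent-decoding, repeated parameters, and per-resource validation are all omitted.

```rust
/// Parsed pieces of a permission scope string (simplified assumption).
#[derive(Debug, PartialEq)]
struct Permission<'a> {
    resource: &'a str,
    positional: Option<&'a str>,
    params: Vec<(&'a str, &'a str)>,
}

/// Split a scope string into resource, positional value, and parameters.
/// Returns `None` for an empty resource, an empty positional value,
/// or a parameter without `=`.
fn parse_permission(scope: &str) -> Option<Permission<'_>> {
    let (head, query) = match scope.split_once('?') {
        Some((head, query)) => (head, Some(query)),
        None => (scope, None),
    };
    let (resource, positional) = match head.split_once(':') {
        Some((resource, positional)) => (resource, Some(positional)),
        None => (head, None),
    };
    if resource.is_empty() || positional == Some("") {
        return None;
    }
    let mut params = Vec::new();
    if let Some(query) = query {
        for pair in query.split('&') {
            let (key, value) = pair.split_once('=')?;
            params.push((key, value));
        }
    }
    Some(Permission { resource, positional, params })
}
```

For example, `repo:app.bsky.feed.post?action=create` parses to resource `repo`, positional `app.bsky.feed.post`, and one parameter `action=create`.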

Related: rsky cypher

  • https://github.com/blacksky-algorithms/rsky/tree/main/cypher — Blacksky's parallel exploration of namespaces, ACLs.

16. Conformance Implementation Comparison — References

TypeScript reference (atproto)

  • URL: https://github.com/bluesky-social/atproto
  • License: dual MIT / Apache 2.0
  • Top dirs: packages/{pds,bsky,repo,identity,lexicon,sync,xrpc,xrpc-server,crypto,oauth,*}, services/{pds,bsky,ozone}, lexicons/, interop-test-files/.

bluesky-social/pds (deployment)

  • URL: https://github.com/bluesky-social/pds
  • Files: installer.sh, compose.yaml, pdsadmin.sh, ACCOUNT_MIGRATION.md, pds.env.

Indigo (Go)

  • URL: https://github.com/bluesky-social/indigo
  • Top dirs: cmd/{bigsky,relay,goat,lexgen,palomar,tap,...}, pds/, repomgr/, carstore/, mst/, repo/, events/, plc/, atproto/{identity,lexicon,repo,syntax,crypto}, xrpc/, api/{atproto,bsky}.
  • License: dual MIT / Apache 2.0.

rsky (Rust, Blacksky)

  • URL: https://github.com/blacksky-algorithms/rsky
  • Top crates: rsky-pds, rsky-relay, rsky-feedgen, rsky-repo, rsky-identity, rsky-lexicon, rsky-crypto, rsky-syntax, rsky-common, rsky-satnav (CAR explorer), cypher (permissioned exploration).
  • Differs from TS: Postgres (not SQLite), S3 blob storage (not on-disk), Mailgun email.

tranquil-pds (Rust)

  • URL: https://tangled.org/tranquil.farm/tranquil-pds
  • License: AGPL-3.0-or-later (code) + CC-BY-SA-4.0 (docs)
  • Top dirs: crates/{tranquil-auth, tranquil-config, tranquil-db, tranquil-store, tranquil-pds-frontend,...}, frontend/, migrations/, deploy/, observability/.
  • Variants: https://tangled.org/tjh.dev/tranquil-pds, https://tangled.org/bas.sh/tranquil-pds, https://tangled.org/vicwalker.dev.br/tranquil-pds.
  • Distinctives: passkeys/2FA (WebAuthn/FIDO2, TOTP, backup codes, trusted devices), SSO, did:web hosting, multi-channel notifications, granular OAuth scopes, app passwords with granular permissions, account delegation, web UI.
  • Recent commits show heavy focus on tranquil-store (durable LSM-style persistence with cargo-fuzz), zombie websocket fixes, MST verification.
  • Postgres-required; optional Valkey for distributed rate limiting.

cocoon (Go, Hailey)

  • URL: https://tangled.org/hailey.at/cocoon (mirror: https://github.com/haileyok/cocoon)
  • License: MIT
  • Top dirs: cmd/, internal/, server/, oauth/, plc/, identity/, models/, metrics/, recording_blockstore/, sqlite_blockstore/, contrib/.
  • README: "Cocoon is a PDS implementation in Go. It is highly experimental, and is not ready for any production use."
  • DB: SQLite by default; PostgreSQL optional. Blob storage: SQLite blockstore by default; S3 optional.
  • Caddyfile (auto HTTPS via Let's Encrypt) included; recent v0.9.0 work moves to "new repo lib".

Other variants and forks

  • https://github.com/dkbhadeshiya/atproto-pds — fork with OpenTelemetry instrumentation.
  • https://github.com/burningtree/atproto-pds, https://github.com/ansh/bluesky-pds — minor forks of bluesky-social/pds.
  • Blacksky fork of atproto: blacksky-algorithms/atproto — appview optimizations.

17. Rust Crate References — atproto-crates workspace

Workspace: https://tangled.org/ngerakines.me/atproto-crates. License: MIT. Origin: extracted from smokesignal.events. The README lists 17 crates:

| Crate | Tangled path | crates.io | Notes |
| --- | --- | --- | --- |
| atproto-dasl | crates/atproto-dasl | https://crates.io/crates/atproto-dasl | DASL framework: CID, DRISL DAG-CBOR, CAR v1, RASL, BDASL, Web Tiles. CLI: atpcid. |
| atproto-identity | crates/atproto-identity | https://crates.io/crates/atproto-identity | DID resolution (plc/web/key), handle resolution (DNS/HTTP), PLC operations, P-256/P-384/K-256. CLIs: atproto-identity-{resolve,key,sign,validate,plc-audit,plc-fork-viz}, atpdid. |
| atproto-attestation | crates/atproto-attestation | https://crates.io/crates/atproto-attestation | CID-first attestation (inline & remote). CLI: atproto-attestation-{sign,verify}. |
| atproto-record | crates/atproto-record | https://crates.io/crates/atproto-record | TID gen, AT-URI parsing, CID generation. CLIs: atproto-record-cid, atptid. |
| atproto-repo | crates/atproto-repo | https://crates.io/crates/atproto-repo | MST encode/decode, commit structures, tree diffing. CLIs: atproto-repo-{car,mst}. |
| atproto-lexicon | crates/atproto-lexicon | https://crates.io/crates/atproto-lexicon | NSID validation, recursive resolution, DNS-based discovery. CLI: atproto-lexicon-resolve. |
| atproto-oauth | crates/atproto-oauth | https://crates.io/crates/atproto-oauth | DPoP (RFC 9449), PKCE (RFC 7636), JWT, storage abstractions. CLI: atproto-oauth-service-token. |
| atproto-oauth-aip | crates/atproto-oauth-aip | https://crates.io/crates/atproto-oauth-aip | AIP authorization-code flow with PAR, token exchange, AT session. |
| atproto-oauth-axum | crates/atproto-oauth-axum | https://crates.io/crates/atproto-oauth-axum | Axum handlers for callbacks, JWKS, client metadata. CLI: atproto-oauth-tool. |
| atproto-client | crates/atproto-client | https://crates.io/crates/atproto-client | HTTP client (DPoP/Bearer/sessions), XRPC ops, repo management. CLIs: atproto-client-{auth,app-password,dpop,put-record}. |
| atproto-xrpcs | crates/atproto-xrpcs | https://crates.io/crates/atproto-xrpcs | XRPC service framework with Axum integration. JWT extractors. |
| atproto-xrpcs-helloworld | crates/atproto-xrpcs-helloworld | | Reference XRPC service with DID:web, JWT auth. |
| atpxrpc | crates/atpxrpc | | XRPC CLI client with persistent session. |
| atproto-jetstream | crates/atproto-jetstream | https://crates.io/crates/atproto-jetstream | Jetstream consumer with Zstd. CLI: atproto-jetstream-consumer. |
| atproto-tap | crates/atproto-tap | | TAP service consumer with MST integrity, backfill. CLIs: atproto-tap-{client,extras}. |
| atpmcp | crates/atpmcp | | MCP server for DAG-CBOR CID generation. |
| atproto-extras | crates/atproto-extras | https://crates.io/crates/atproto-extras | Facet parsing, rich text. CLI: atproto-extras-parse-facets. |

atproto-plc is a separately published crate at https://crates.io/crates/atproto-plc ("did-method-plc implementation for ATProto with WASM support"). The workspace is currently versioned 0.14.x (the CHANGELOG shows release 0.14.5; the atproto-tools Docker tag follows the workspace version).
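The atproto-record row above mentions TID generation. As a rough illustration of what that involves, here is a std-only sketch. It is not the atproto-record API: `encode_tid`, `decode_tid`, and `next_micros` are hypothetical names. It assumes the TID layout from the atproto record-key spec: a 64-bit integer with a zero top bit, 53 bits of microseconds since the Unix epoch, and a 10-bit clock identifier, encoded as 13 base32-sortable characters.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Base32-sortable alphabet used for TIDs.
const ALPHABET: &[u8; 32] = b"234567abcdefghijklmnopqrstuvwxyz";

/// Hypothetical helper: pack a microsecond timestamp and a 10-bit clock id
/// into 64 bits (top bit zero) and encode them as 13 base32-sortable chars.
pub fn encode_tid(micros: u64, clock_id: u16) -> String {
    let value = ((micros & ((1u64 << 53) - 1)) << 10) | u64::from(clock_id & 0x3ff);
    (0..13u32)
        .map(|i| {
            // 13 chars x 5 bits = 65 bits, so the first char carries only 4 bits.
            let shift = 60 - 5 * i;
            ALPHABET[((value >> shift) & 0x1f) as usize] as char
        })
        .collect()
}

/// Hypothetical helper: decode a TID back into (micros, clock_id).
/// Returns None for wrong length, unknown characters, or a first
/// character whose value would not fit in 64 bits.
pub fn decode_tid(tid: &str) -> Option<(u64, u16)> {
    if tid.len() != 13 {
        return None;
    }
    let mut value: u64 = 0;
    for (i, byte) in tid.bytes().enumerate() {
        let digit = ALPHABET.iter().position(|&c| c == byte)? as u64;
        if i == 0 && digit >= 16 {
            return None;
        }
        value = (value << 5) | digit;
    }
    Some(((value >> 10) & ((1u64 << 53) - 1), (value & 0x3ff) as u16))
}

/// Hypothetical helper: current time in microseconds, bumped so that every
/// call returns a value strictly greater than the previous one in `last`.
pub fn next_micros(last: &mut u64) -> u64 {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_micros() as u64)
        .unwrap_or(0);
    *last = now.max(*last + 1);
    *last
}
```

Under these assumptions, `encode_tid(0, 0)` yields `2222222222222`, and TIDs produced from increasing timestamps sort lexicographically in the same order. A PDS would typically feed `next_micros` into `encode_tid` so that record keys never repeat or go backwards within one process.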

Related projects by Nick Gerakines

  • AIP (Graze): https://github.com/graze-social/aip.
  • atproto-tools Docker image: https://hub.docker.com/r/ngerakines/atproto-tools — bundled CLIs.
  • Smoke Signal: https://smokesignal.events/ — events/RSVP atproto app, source of extracted crates.
  • Lexicon Garden: https://lexicon.garden/ — community lexicons.
  • Ramjet, atproto-attestation, ecmh-rs — additional Gerakines projects.
  • magazi: https://tangled.org/ngerakines.me/magazi — content gating using atproto identity & cryptographic proofs.
  • pds.js: https://tangled.org/ngerakines.me/pds.js — minimal JavaScript PDS.

atproto.blue (Python SDK reference)

  • https://atproto.blue/ — Python SDK docs; relevant API surface for cross-implementation parity checks.

18. Performance & Low-Latency — References

Storage engine

  • https://github.com/fjall-rs/fjall

    "Fjall is a log-structured embeddable key-value storage engine (think RocksDB) written in Rust… LSM-tree-based storage similar to RocksDB; range & prefix searching; multiple keyspaces (column families) with cross-keyspace atomic semantics; built-in compression (default = LZ4); serializable transactions (optional); key-value separation for large blob use cases (optional)."

  • Fjall 3.0 announcement: https://fjall-rs.github.io/post/fjall-3/.
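To make the "range & prefix searching" property concrete, the sketch below models an ordered key-value keyspace with `std::collections::BTreeMap` as a stand-in. It does not use Fjall's API. The key layout (`<did>\0<collection>\0<rkey>`) and the names `record_key` and `list_rkeys` are assumptions made for illustration, not something taken from the design or from Fjall.

```rust
use std::collections::BTreeMap;

/// Hypothetical key layout: `<did>\0<collection>\0<rkey>`.
/// The NUL separator keeps `app.bsky.feed.post` from matching
/// a longer collection name that merely shares the same prefix.
pub fn record_key(did: &str, collection: &str, rkey: &str) -> String {
    format!("{did}\0{collection}\0{rkey}")
}

/// Return every rkey stored under one (did, collection) prefix, in key order.
/// An LSM engine with prefix iteration would serve the same query shape.
pub fn list_rkeys(
    store: &BTreeMap<String, Vec<u8>>,
    did: &str,
    collection: &str,
) -> Vec<String> {
    let prefix = format!("{did}\0{collection}\0");
    store
        // Seek to the first key at or after the prefix...
        .range(prefix.clone()..)
        // ...and stop as soon as keys no longer share it.
        .take_while(|(key, _)| key.starts_with(&prefix))
        .map(|(key, _)| key[prefix.len()..].to_string())
        .collect()
}
```

Because the map is ordered, listing one collection touches only the keys in that prefix range rather than the whole store, which is the access pattern a list-records style endpoint needs. Combined with sortable record keys, the results also come back in creation order without a separate index.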

Async runtime

  • https://tokio.rs/ — async runtime.

Web framework

  • https://github.com/tokio-rs/axum — used in atproto-xrpcs and atproto-oauth-axum.

SQL access

  • https://github.com/launchbadge/sqlx — used by tranquil-pds (.sqlx/ query cache visible).

Crypto

  • https://crates.io/crates/k256, https://crates.io/crates/p256 — RustCrypto curves.

Performance discussions

  • https://atproto.com/blog/relay-updates-sync-v1-1: relay handles "thousands of messages per second using on the order of 2 vCPU cores, 12 GByte of RAM, and 30 Mbps of inbound and outbound bandwidth."
  • https://github.com/bluesky-social/atproto/discussions/2350 — design space for PDS scaling.

19. Build & Release Plan — References

Interop tests

  • https://github.com/bluesky-social/atproto-interop-tests — test vectors for repo, MST, identity. (Referenced from Spring 2026 roadmap: "There are some basic existing test vectors.")
  • https://github.com/bluesky-social/atproto/tree/main/interop-test-files — language-neutral test files in main repo.

goat CLI

  • https://github.com/bluesky-social/goat — CLI for CAR files, firehose, APIs, lexicons. Verify-flags suitable for PDS testing.

atproto-tap (Bluesky reference)

  • https://atproto.com/blog/introducing-tap — sync 1.1 reference implementation in Go.

Experimental conformance

  • https://tangled.org/alice.mosphere.at/atproto-smoke — protocol smoke tests.

atproto-tap (ngerakines Rust crate)

  • Useful for verifying atproto-pds correctness against TAP-validated event streams.


20. Closing Notes — References

Community wiki and forums

  • https://atproto.wiki/ — ATProto community wiki.
  • https://discourse.atprotocol.community/ — community forum.
  • https://atprotocol.dev/ — community hub & tech talks.

Conferences (informational only)

  • AtmosphereConf 2025 (Seattle), AtmosphereConf 2026 (Vancouver, March 26–29, 2026) — https://atmosphereconf.org/.
  • ATScience 2026 — https://atproto.science/events/atmosphere2026/.

Discord and chat (named only)

  • ATProtocol Touchers Discord (community).
  • Tangled Discord (chat.tangled.org).
  • Fjall-rs Discord (storage engine).

Nick Gerakines blog posts

  • Smoke Signal updates: ATProto Tech Talk (July 24, 2025) — https://atprotocol.dev/.
  • Sidecar pattern, authoritative vs unauthoritative records, ATProtocol attestations, CIDs in ATProtocol, record hydration — published on Leaflet/smokesignal.events blog. Notable themes:
    • "Sidecar pattern" — applications run companion services that read/write specific records.
    • "Authoritative vs unauthoritative records" — PDS as source-of-truth for repo records; AppViews hold derived/unauthoritative state.
    • Attestations leveraging atproto-attestation crate's CID-first inline & remote signature workflows.
  • gist: https://gist.github.com/ngerakines/e5daf2f5cd075504352cf8d54229c4e5 — Running your own atproto pds.

Acknowledged but external sources for context

  • Daniel Holmgren's Permissioned Data Diary series at https://dholms.leaflet.pub/.
  • Bryan Newbold's Rust adenosine implementation (early clean-room Rust validator).
  • mackuba's atproto blog index: https://mackuba.eu/2025/11/18/atproto-blog-posts/ — comprehensive list of all atproto specs/proposals/blog posts.

Appendix A — Quick lookup index

| Topic | Primary URL |
| --- | --- |
| Repository spec | https://atproto.com/specs/repository |
| Sync spec | https://atproto.com/specs/sync |
| Sync 1.1 proposal | https://github.com/bluesky-social/proposals/tree/main/0006-sync-iteration |
| OAuth spec | https://atproto.com/specs/oauth |
| Permission spec | https://atproto.com/specs/permission |
| Cryptography spec | https://atproto.com/specs/cryptography |
| Account migration | https://github.com/bluesky-social/pds/blob/main/ACCOUNT_MIGRATION.md |
| Self-hosting | https://atproto.com/guides/self-hosting |
| Spring 2026 Roadmap | https://atproto.com/blog/2026-spring-roadmap |
| Spaces design (Diary 4) | https://dholms.leaflet.pub/3mhj6bcqats2o |
| Permissioned-Data PDS Design | https://github.com/bluesky-social/atproto/blob/f592188e7ed4720450ac0aef877331dbc9998d8d/docs/superpowers/specs/2026-04-22-permissioned-data-pds-design.md |
| atproto-crates workspace | https://tangled.org/ngerakines.me/atproto-crates |
| Tranquil PDS | https://tangled.org/tranquil.farm/tranquil-pds |
| Cocoon | https://tangled.org/hailey.at/cocoon |
| TS reference | https://github.com/bluesky-social/atproto |
| Indigo (Go) | https://github.com/bluesky-social/indigo |
| rsky (Rust) | https://github.com/blacksky-algorithms/rsky |
| AIP (OAuth) | https://github.com/graze-social/aip |
| Fjall storage | https://github.com/fjall-rs/fjall |

This reference is current as of May 1, 2026. Where speculative or future-state information is cited (e.g., the permissioned-data design at the f59218 commit, Diary 4, Spring 2026 roadmap forward-looking sections), the language used in those sources is itself tentative ("rough proposal," "lower confidence," "subject to change," "we expect"). Implementers should treat such material as a target sketch rather than a stable contract, and re-pull the design document and diary posts before finalizing protocol-facing decisions.
