P / 01 · 2026
Case study

Pigeon

Multi-tenant notifications platform

A notifications product with a backend SDK for sending events and a frontend SDK for reading them, built to make in-app notifications simple to add and reliable to run.

(I) — Overview

Why Pigeon exists.

Most teams start notifications as a database table and a frontend badge. Pigeon treats them as a cross-cutting product surface from the first commit.

Backends should call one client to send. Frontends should mount a provider and render a bell that stays live, survives reconnects, and reads optimistically. Customers should receive signed webhooks with retries that leave evidence. None of that should be assembled from scratch inside the consuming app.

The result is a TypeScript monorepo: an API, a worker, a web dashboard, a Node SDK, a React SDK, and a demo that exercises the full loop end-to-end.

Runtime services · 5
Realtime channel · 1
SDKs shipped · 2
Worker lanes · 3
Auth paths · 2
Schema families · 5
(II) — Product surface

What the platform includes.

Node SDK
  • send() · sendBatch() · createUserToken()
  • Validates inputs before network
  • Configurable transport and timeout behavior
  • Typed API · network · validation errors
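The "validate before network" behavior can be sketched as a pure check that runs before any request is made. The field names (`recipientId`, `type`, `idempotencyKey`) are illustrative, not the SDK's real API surface:

```typescript
// Hypothetical shape of the Node SDK's pre-network validation.
type SendInput = {
  recipientId: string;
  type: string;
  payload?: Record<string, unknown>;
  idempotencyKey?: string;
};

type ValidationError = { field: string; message: string };

// Returns a list of typed errors; an empty list means the request may
// proceed to the network. No I/O happens in this step.
function validateSend(input: SendInput): ValidationError[] {
  const errors: ValidationError[] = [];
  if (!input.recipientId?.trim()) {
    errors.push({ field: "recipientId", message: "recipientId is required" });
  }
  if (!input.type?.trim()) {
    errors.push({ field: "type", message: "type is required" });
  }
  if (input.idempotencyKey !== undefined && input.idempotencyKey.length > 255) {
    errors.push({ field: "idempotencyKey", message: "idempotencyKey too long" });
  }
  return errors;
}
```

Failing locally with a typed error keeps a malformed payload from ever consuming a network round trip or a rate-limit token.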
React SDK
  • PigeonProvider · useNotifications() · NotificationBell
  • Token caching with automatic refresh
  • SSE reconnect with backoff
  • Optimistic mark, mark-all, archive
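The optimistic mutations can be modeled as a pure reducer over the inbox state, a minimal sketch with an assumed notification shape, not the SDK's actual internals:

```typescript
// Assumed minimal notification shape for illustration.
type Notification = { id: string; readAt: string | null; archived: boolean };

type Action =
  | { kind: "mark"; id: string }
  | { kind: "markAll" }
  | { kind: "archive"; id: string };

// Apply the change to local state immediately; the SDK would reconcile
// against the server response (or roll back) afterwards.
function apply(items: Notification[], action: Action, now: string): Notification[] {
  switch (action.kind) {
    case "mark":
      return items.map((n) => (n.id === action.id && !n.readAt ? { ...n, readAt: now } : n));
    case "markAll":
      return items.map((n) => (n.readAt ? n : { ...n, readAt: now }));
    case "archive":
      return items.map((n) => (n.id === action.id ? { ...n, archived: true } : n));
  }
}
```

Keeping the update pure makes rollback trivial: on a failed request, the previous array is still in hand.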
API · Worker
  • Typed HTTP API with request tracing
  • Background workers split by responsibility
  • Schema-level idempotency guarantees
  • Signed webhooks · attempt log per send
Dashboard
  • Projects · environments · members
  • API key management with safe storage
  • Logs · users · templates · webhooks
  • Dashboard auth and role-based access
(III) — Architecture

How the system is
put together.

The Node SDK sends. The API persists the request, hands work to background processing, and returns. Workers render, publish live updates, and dispatch signed webhooks. The React SDK opens an SSE stream and resumes cleanly after reconnects.

[Architecture diagram] 01 Backend (api key) → 02 API (ingest + reads) → 03 Worker (async processing) → 04 Webhook (customer endpoint). Frontend: React SDK — provider · bell · live inbox. Storage and coordination: PostgreSQL as the source of truth, Redis for queues, fanout, and replay. Flows: send → enqueue → dispatch · persist · deliver · stream. Backend sends, API accepts, worker delivers, frontend stays live.
(IV) — Request flow

From send
to delivery.

Accept fast, deliver async, fan out everywhere. The send returns after durable persistence and job handoff. The slower work stays out of the caller's request path.
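The accept path can be sketched with in-memory stand-ins for Postgres and the job queue; only the ordering matters here, the names and shapes are illustrative:

```typescript
// Assumed minimal row shape for illustration.
type NotificationRow = { id: string; recipientId: string; status: "accepted" };

const rows: NotificationRow[] = [];            // stands in for Postgres
const jobs: { notificationId: string }[] = []; // stands in for the job queue

function acceptSend(recipientId: string): NotificationRow {
  const row: NotificationRow = {
    id: `ntf_${rows.length + 1}`,
    recipientId,
    status: "accepted",
  };
  rows.push(row);                        // 1. durable write first
  jobs.push({ notificationId: row.id }); // 2. hand slow work to the background lane
  return row;                            // 3. respond without rendering or fanout
}
```

Rendering, realtime publish, and webhook dispatch all happen later, driven by the queued job, so a slow template or a slow customer endpoint never stretches the caller's request.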

[Request flow diagram] Phases: I — Accept · II — Deliver · III — Fanout. Actors: Backend (api key) · API (ingest) · Postgres (durable) · Realtime (queue · fanout) · Worker (async) · Frontend (SSE) · Webhook (customer).
  01 notification request
  02 persist notification
  03 queue background work
  04 accepted
  05 worker picks up job
  06 render and mark delivered
  07 publish live event
  08 fanout for connected clients
  09 SSE event · replay aware
  10 signed webhook
(V) — Tenancy model

Projects, environments,
and scope.

Projects are top-level. Each project has development and production environments with their own scoped credentials. Almost everything that matters, including keys, users, notifications, templates, and webhooks, lives behind an environment, not just a project.
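Environment scoping is easiest to see in miniature. This in-memory sketch mirrors a uniqueness constraint on (env_id, idem_key): the same idempotency key can exist in two environments, but never twice in one:

```typescript
// Stands in for a UNIQUE (env_id, idem_key) constraint in the schema.
const seen = new Set<string>();

// Returns true if the key is new within the environment, false on a duplicate.
function claimIdempotencyKey(envId: string, idemKey: string): boolean {
  const scoped = `${envId}:${idemKey}`;
  if (seen.has(scoped)) return false; // duplicate send within this environment
  seen.add(scoped);
  return true;
}
```

Because the environment is part of the key, a development integration replaying test sends can never collide with production traffic.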

[Tenancy diagram]
  01 Project — top-level workspace (Members · Invites are project-scoped)
  02a development — scoped keyset · environment signing secret
  02b production — scoped keyset · environment signing secret
  Environment-scoped resources: API keys (env_id, prefix) · End users (env_id, ext_id) · Notifications (env_id, idem_key) · Templates (env_id, type) · Webhooks (env_id, url)
(VI) — Authentication

Separate trust
boundaries.

Backends authenticate with long-lived API keys. Frontends use short-lived client tokens minted per end user. Different blast radius and different rate budgets, but one API enforces both.

[Trust boundary diagram]
A — Trusted backend · API key · long-lived, trusted backend only (project_key_••••, scoped credential)
  • Storage — hashed and scoped
  • Verify — validated per request
  • Resolves — project · environment context
  • Rotation — create / revoke; history kept
B — Frontend · user token · short-lived, minted per end user
  • Claims — user (recipient identity) · scope (tenant boundary) · ttl (short lifetime) · auth (read access)
  • Mint — issued by a trusted backend
  • Sign — environment-scoped secret
  • Refresh — automatic on expiry
  • Scope — reads · mutations · realtime
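A minimal sketch of a short-lived token signed with an environment-scoped secret, payload plus HMAC. The encoding and claim names are assumptions for illustration, not the shipped token format:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Mint: base64url payload with an expiry, signed with the environment secret.
function mintToken(userId: string, ttlSeconds: number, secret: string, nowMs = Date.now()): string {
  const body = Buffer.from(
    JSON.stringify({ userId, exp: nowMs + ttlSeconds * 1000 }),
  ).toString("base64url");
  const sig = createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${sig}`;
}

// Verify: recompute the HMAC in constant time, then check expiry.
function verifyToken(token: string, secret: string, nowMs = Date.now()): { userId: string } | null {
  const [body, sig] = token.split(".");
  if (!body || !sig) return null;
  const expected = createHmac("sha256", secret).update(body).digest("base64url");
  const a = Buffer.from(sig);
  const b = Buffer.from(expected);
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  const payload = JSON.parse(Buffer.from(body, "base64url").toString());
  if (payload.exp <= nowMs) return null; // expired: the React SDK would refresh here
  return { userId: payload.userId };
}
```

The mint step runs only on a trusted backend holding the environment secret; the frontend only ever carries the result, which is why its blast radius stays small.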
(VII) — Data model

How tenancy shows up in the schema.

The schema is organized around a simple rule: human collaboration lives at the project level, while credentials, recipients, notifications, templates, and webhooks are scoped per environment. That keeps tenancy visible in the data model instead of depending on scattered application checks.

01 — Identity & session
  • users
  • sessions
  • accounts
  • verifications

dashboard auth and session state

02 — Project & collaboration
  • projects
  • project_members
  • project_invites
  • environments

workspace and membership boundaries

03 — Credentials & recipients
  • api_keys
  • end_users

credential storage and recipient identity

04 — Notifications & templates
  • notifications
  • templates

delivery records, content, and idempotency

05 — Webhooks & attempts
  • webhook_endpoints
  • webhook_delivery_attempts

delivery destinations and attempt history

(VIII) — Realtime delivery

Live delivery
with replay recovery.

Live fanout and reconnect replay are handled separately. Connected clients receive updates immediately, while reconnecting clients get a short recovery window so brief network drops do not turn into missed notifications.
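The replay side can be sketched as a bounded buffer keyed by a monotonically increasing cursor. Capacity and cursor format here are illustrative assumptions:

```typescript
type StreamEvent = { cursor: number; data: string };

// Bounded per-user replay window: enough to cover a network blip,
// deliberately not an event ledger.
class ReplayBuffer {
  private events: StreamEvent[] = [];
  private nextCursor = 1;

  constructor(private readonly capacity: number) {}

  publish(data: string): StreamEvent {
    const event = { cursor: this.nextCursor++, data };
    this.events.push(event);
    if (this.events.length > this.capacity) this.events.shift(); // evict oldest
    return event;
  }

  // On reconnect the client presents its last seen cursor and receives
  // everything it missed that is still inside the window.
  replaySince(cursor: number): StreamEvent[] {
    return this.events.filter((e) => e.cursor > cursor);
  }
}
```

Live fanout publishes to connected clients directly; the buffer only exists so a reconnecting client can catch up before rejoining the live stream.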

[SSE timeline · per-user connection · live fanout + replay buffer]
  open — stream connect
  ping — keepalive
  event — notification event
  ping — keepalive
  drop — network blip (reconnect window)
  reopen — resume cursor
  replay — short replay buffer
  live — fanout resumes
(IX) — Reliability model

What happens after accept.

I — Send
  • Node SDK validates and posts
  • API authenticates the key
  • Resolves the correct tenant scope
  • Writes the notification record
  • Hands work to the background lane
II — Deliver
  • Worker drains the queue
  • Loads the right template
  • Renders title and body
  • Marks delivered
  • Publishes realtime · dispatches webhooks
III — Stream
  • React SDK requests a short-lived token
  • Opens an SSE connection
  • Receives live events
  • Replays missed events after reconnect
  • Updates the inbox optimistically
IV — Dispatch
  • Worker logs a pending attempt
  • Signs the payload
  • POSTs the customer endpoint
  • Records success or failure
  • Retries with backoff
V — Maintain
  • Scheduled maintenance runs off the hot path
  • Deletes old data under a retention policy
  • Attempt history kept for audit
  • Cleanup stays bounded and predictable
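The signing step in the dispatch lane can be sketched as an HMAC over the raw request body. The header scheme (`sha256=<hex>`) is an assumption for illustration, not the shipped wire format:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sender side: sign the raw body with the environment signing secret and
// carry the result in a header on the webhook POST.
function signPayload(body: string, secret: string): string {
  return `sha256=${createHmac("sha256", secret).update(body).digest("hex")}`;
}

// Receiver side: recompute over the raw body and compare in constant time.
function verifySignature(body: string, secret: string, header: string): boolean {
  const expected = Buffer.from(signPayload(body, secret));
  const actual = Buffer.from(header);
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

Signing the raw bytes (rather than a re-serialized object) is what lets a customer endpoint verify without worrying about JSON key ordering.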
(X) — Engineering tradeoffs

What I optimized for.

Strong choices
  • Idempotency at the schema layer

    Duplicate sends are blocked at the persistence layer, not just discouraged in application code.

  • Durable state has one home

    Transient coordination stays separate from the system of record, which keeps recovery and reasoning much simpler.

  • Write before enqueue

    The notification commits before background processing begins, so failures surface clearly instead of disappearing into side effects.

  • Webhook attempts are first-class

    Webhook delivery leaves a traceable attempt history, which matters for debugging and support.

Deliberate tradeoffs
  • SSE over WebSockets

    The product needs one-way delivery, not full duplex messaging. SSE keeps the auth and reconnect story simpler for this scope.

  • At-least-once, not exactly-once

    Workers may retry a job after partial success. Idempotency keys carry the weight at the boundaries.

  • Short replay buffer

    Reconnect history exists to smooth over short disconnects, not to act as a permanent event ledger.

  • Single-region, by design

    No multi-region or active-active deployment story yet. The MVP picks scope over posture.
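The at-least-once retry posture pairs naturally with a capped exponential backoff. The base delay, cap, and growth factor below are illustrative values, not the shipped schedule:

```typescript
// Capped exponential backoff: 1s, 2s, 4s, ... up to a ceiling.
// attempt is 1-based (first retry = attempt 1).
function backoffMs(attempt: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}
```

Because every retry writes an attempt record, a flapping customer endpoint shows up as a visible history of failures and delays rather than as silent duplicate suppression.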

(XI) — Risks

Where the design can fail.

  1. Slow side-effects on a hot path

    Rendering, fanout, and logging cannot sit on the caller's request. The send has to return fast or every integration feels the slowest dependency.

  2. Realtime across reconnects

    Live fanout does not retain history. Reconnects need a cursor and a replay window the SDK can drive without consumer code.

  3. Misbehaving webhook receivers

    Endpoints time out, 5xx, or 200-then-crash. Delivery has to retry, and every attempt has to leave evidence.

  4. Two callers, two trust boundaries

    Backends authenticate with long-lived keys, frontends with short-lived tokens. Different blast radius, different rate budgets, one API.

(XII) — Current limits

What the product does not cover yet.

  • Channels

    In-app + webhooks today. No email, SMS, or push.

  • Deployment

    Single-region in spirit. No multi-region or DR story shipped.

  • Identity

    Basic dashboard authentication only. No SSO or enterprise identity layer yet.

(XIII) — Outcomes

What the product made easier.

Outcome
Two-line backend integration

One Node SDK call sends a notification, another mints a frontend token. Validation and typed errors live at the boundary, not inside the app.

Outcome
Drop-in React inbox

A provider, a hook, and a bell. Token caching, reconnects, and optimistic reads are handled inside the SDK, not in consumer code.

Outcome
Reconnect-safe realtime

Live events stream into the SDK while reconnects resume from the last seen event. Backgrounded tabs and brief network loss stop dropping notifications.

(XIV) — Key lessons

What I would carry forward.

  1. The SDK is the product

    A platform is only as good as the libraries integrators actually touch. Two clean SDKs do more for adoption than another feature behind a flag.

  2. Realtime is two problems

    Live delivery and replay want different storage. Solve them separately and they compose.

  3. Tenancy belongs in the schema

    Project and environment scoping in indexes and uniqueness constraints made the rest of the system easier to reason about than enforcing it in code paths.

  4. Keep the write path boring

    Slow work goes to a queue, fast work stays inline. That line is the architecture.
