P / 01 · 2026
Case study

Pigeon

Multi-tenant notifications platform

A notifications product with a backend SDK for sending events and a frontend SDK for reading them, built to make in-app notifications simple to add and reliable to run.

(I) — Overview

Why Pigeon exists.

Most teams start notifications as a database table and a frontend badge. Pigeon treats them as a cross-cutting product surface from the first commit.

Backends should call one client to send. Frontends should mount a provider and render a bell that stays live, survives reconnects, and reads optimistically. Customers should receive signed webhooks with retries that leave evidence. None of that should be assembled from scratch inside the consuming app.

The result is a TypeScript monorepo: an API, a worker, a web dashboard, a Node SDK, a React SDK, and a demo that exercises the full loop end-to-end.

Runtime services · 5
Realtime channel · 1
SDKs shipped · 2
Worker lanes · 3
Auth paths · 2
Schema families · 5
(II) — Product surface

What the platform includes.

Node SDK
  • send() · sendBatch() · createUserToken()
  • Validates inputs before network
  • Configurable transport and timeout behavior
  • Typed API · network · validation errors
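The "validate before network" behavior can be sketched as a pure check that runs before any request is made. The field names (`recipientId`, `type`, `idempotencyKey`) are illustrative, not the SDK's real API surface:

```typescript
// Hypothetical shape of the Node SDK's pre-network validation.
type SendInput = {
  recipientId: string;
  type: string;
  payload?: Record<string, unknown>;
  idempotencyKey?: string;
};

type ValidationError = { field: string; message: string };

// Returns a list of typed errors; an empty list means the request may
// proceed to the network. No I/O happens in this step.
function validateSend(input: SendInput): ValidationError[] {
  const errors: ValidationError[] = [];
  if (!input.recipientId?.trim()) {
    errors.push({ field: "recipientId", message: "recipientId is required" });
  }
  if (!input.type?.trim()) {
    errors.push({ field: "type", message: "type is required" });
  }
  if (input.idempotencyKey !== undefined && input.idempotencyKey.length > 255) {
    errors.push({ field: "idempotencyKey", message: "idempotencyKey too long" });
  }
  return errors;
}
```

Failing locally with a typed error keeps a malformed payload from ever consuming a network round trip or a rate-limit token.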
React SDK
  • PigeonProvider · useNotifications() · NotificationBell
  • Token caching with automatic refresh
  • SSE reconnect with backoff
  • Optimistic mark, mark-all, archive
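The optimistic mutations can be modeled as a pure reducer over the inbox state, a minimal sketch with an assumed notification shape, not the SDK's actual internals:

```typescript
// Assumed minimal notification shape for illustration.
type Notification = { id: string; readAt: string | null; archived: boolean };

type Action =
  | { kind: "mark"; id: string }
  | { kind: "markAll" }
  | { kind: "archive"; id: string };

// Apply the change to local state immediately; the SDK would reconcile
// against the server response (or roll back) afterwards.
function apply(items: Notification[], action: Action, now: string): Notification[] {
  switch (action.kind) {
    case "mark":
      return items.map((n) => (n.id === action.id && !n.readAt ? { ...n, readAt: now } : n));
    case "markAll":
      return items.map((n) => (n.readAt ? n : { ...n, readAt: now }));
    case "archive":
      return items.map((n) => (n.id === action.id ? { ...n, archived: true } : n));
  }
}
```

Keeping the update pure makes rollback trivial: on a failed request, the previous array is still in hand.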
API · Worker
  • Typed HTTP API with request tracing
  • Background workers split by responsibility
  • Schema-level idempotency guarantees
  • Signed webhooks · attempt log per send
Dashboard
  • Projects · environments · members
  • API key management with safe storage
  • Logs · users · templates · webhooks
  • Dashboard auth and role-based access
(III) — Architecture

How the system is
put together.

The Node SDK sends. The API persists the request, hands work to background processing, and returns. Workers render, publish live updates, and dispatch signed webhooks. The React SDK opens an SSE stream and resumes cleanly after reconnects.

[Architecture diagram] 01 Backend (api key) → 02 API (ingest + reads) → 03 Worker (async processing) → 04 Webhook (customer endpoint). Frontend: React SDK — provider · bell · live inbox. Storage and coordination: PostgreSQL as the source of truth, Redis for queues, fanout, and replay. Flows: send → enqueue → dispatch · persist · deliver · stream. Backend sends, API accepts, worker delivers, frontend stays live.
(IV) — Request flow

From send
to delivery.

Accept fast, deliver async, fan out everywhere. The send returns after durable persistence and job handoff. The slower work stays out of the caller's request path.
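The accept path can be sketched with in-memory stand-ins for Postgres and the job queue; only the ordering matters here, the names and shapes are illustrative:

```typescript
// Assumed minimal row shape for illustration.
type NotificationRow = { id: string; recipientId: string; status: "accepted" };

const rows: NotificationRow[] = [];            // stands in for Postgres
const jobs: { notificationId: string }[] = []; // stands in for the job queue

function acceptSend(recipientId: string): NotificationRow {
  const row: NotificationRow = {
    id: `ntf_${rows.length + 1}`,
    recipientId,
    status: "accepted",
  };
  rows.push(row);                        // 1. durable write first
  jobs.push({ notificationId: row.id }); // 2. hand slow work to the background lane
  return row;                            // 3. respond without rendering or fanout
}
```

Rendering, realtime publish, and webhook dispatch all happen later, driven by the queued job, so a slow template or a slow customer endpoint never stretches the caller's request.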

[Request flow diagram] Phases: I — Accept · II — Deliver · III — Fanout. Actors: Backend (api key) · API (ingest) · Postgres (durable) · Realtime (queue · fanout) · Worker (async) · Frontend (SSE) · Webhook (customer).
  01 notification request
  02 persist notification
  03 queue background work
  04 accepted
  05 worker picks up job
  06 render and mark delivered
  07 publish live event
  08 fanout for connected clients
  09 SSE event · replay aware
  10 signed webhook
(V) — Tenancy model

Projects, environments,
and scope.

Projects are top-level. Each project has development and production environments with their own scoped credentials. Almost everything that matters, including keys, users, notifications, templates, and webhooks, lives behind an environment, not just a project.
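Environment scoping is easiest to see in miniature. This in-memory sketch mirrors a uniqueness constraint on (env_id, idem_key): the same idempotency key can exist in two environments, but never twice in one:

```typescript
// Stands in for a UNIQUE (env_id, idem_key) constraint in the schema.
const seen = new Set<string>();

// Returns true if the key is new within the environment, false on a duplicate.
function claimIdempotencyKey(envId: string, idemKey: string): boolean {
  const scoped = `${envId}:${idemKey}`;
  if (seen.has(scoped)) return false; // duplicate send within this environment
  seen.add(scoped);
  return true;
}
```

Because the environment is part of the key, a development integration replaying test sends can never collide with production traffic.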

[Tenancy diagram]
  01 Project — top-level workspace (Members · Invites are project-scoped)
  02a development — scoped keyset · environment signing secret
  02b production — scoped keyset · environment signing secret
  Environment-scoped resources: API keys (env_id, prefix) · End users (env_id, ext_id) · Notifications (env_id, idem_key) · Templates (env_id, type) · Webhooks (env_id, url)
(VI) — Authentication

Separate trust
boundaries.

Backends authenticate with long-lived API keys. Frontends use short-lived client tokens minted per end user. Different blast radius and different rate budgets, but one API enforces both.

[Trust boundary diagram]
A — Trusted backend · API key · long-lived, trusted backend only (project_key_••••, scoped credential)
  • Storage — hashed and scoped
  • Verify — validated per request
  • Resolves — project · environment context
  • Rotation — create / revoke; history kept
B — Frontend · user token · short-lived, minted per end user
  • Claims — user (recipient identity) · scope (tenant boundary) · ttl (short lifetime) · auth (read access)
  • Mint — issued by a trusted backend
  • Sign — environment-scoped secret
  • Refresh — automatic on expiry
  • Scope — reads · mutations · realtime
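A minimal sketch of a short-lived token signed with an environment-scoped secret, payload plus HMAC. The encoding and claim names are assumptions for illustration, not the shipped token format:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Mint: base64url payload with an expiry, signed with the environment secret.
function mintToken(userId: string, ttlSeconds: number, secret: string, nowMs = Date.now()): string {
  const body = Buffer.from(
    JSON.stringify({ userId, exp: nowMs + ttlSeconds * 1000 }),
  ).toString("base64url");
  const sig = createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${sig}`;
}

// Verify: recompute the HMAC in constant time, then check expiry.
function verifyToken(token: string, secret: string, nowMs = Date.now()): { userId: string } | null {
  const [body, sig] = token.split(".");
  if (!body || !sig) return null;
  const expected = createHmac("sha256", secret).update(body).digest("base64url");
  const a = Buffer.from(sig);
  const b = Buffer.from(expected);
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  const payload = JSON.parse(Buffer.from(body, "base64url").toString());
  if (payload.exp <= nowMs) return null; // expired: the React SDK would refresh here
  return { userId: payload.userId };
}
```

The mint step runs only on a trusted backend holding the environment secret; the frontend only ever carries the result, which is why its blast radius stays small.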
(VII) — Data model

How tenancy shows up in the schema.

The schema is organized around a simple rule: human collaboration lives at the project level, while credentials, recipients, notifications, templates, and webhooks are scoped per environment. That keeps tenancy visible in the data model instead of depending on scattered application checks.

01 — Identity & session
  • users
  • sessions
  • accounts
  • verifications

dashboard auth and session state

02 — Project & collaboration
  • projects
  • project_members
  • project_invites
  • environments

workspace and membership boundaries

03 — Credentials & recipients
  • api_keys
  • end_users

credential storage and recipient identity

04 — Notifications & templates
  • notifications
  • templates

delivery records, content, and idempotency

05 — Webhooks & attempts
  • webhook_endpoints
  • webhook_delivery_attempts

delivery destinations and attempt history

(VIII) — Realtime delivery

Live delivery
with replay recovery.

Live fanout and reconnect replay are handled separately. Connected clients receive updates immediately, while reconnecting clients get a short recovery window so brief network drops do not turn into missed notifications.
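The replay side can be sketched as a bounded buffer keyed by a monotonically increasing cursor. Capacity and cursor format here are illustrative assumptions:

```typescript
type StreamEvent = { cursor: number; data: string };

// Bounded per-user replay window: enough to cover a network blip,
// deliberately not an event ledger.
class ReplayBuffer {
  private events: StreamEvent[] = [];
  private nextCursor = 1;

  constructor(private readonly capacity: number) {}

  publish(data: string): StreamEvent {
    const event = { cursor: this.nextCursor++, data };
    this.events.push(event);
    if (this.events.length > this.capacity) this.events.shift(); // evict oldest
    return event;
  }

  // On reconnect the client presents its last seen cursor and receives
  // everything it missed that is still inside the window.
  replaySince(cursor: number): StreamEvent[] {
    return this.events.filter((e) => e.cursor > cursor);
  }
}
```

Live fanout publishes to connected clients directly; the buffer only exists so a reconnecting client can catch up before rejoining the live stream.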

[SSE timeline · per-user connection · live fanout + replay buffer]
  open — stream connect
  ping — keepalive
  event — notification event
  ping — keepalive
  drop — network blip (reconnect window)
  reopen — resume cursor
  replay — short replay buffer
  live — fanout resumes
(IX) — Reliability model

What happens after accept.

I — Send
  • Node SDK validates and posts
  • API authenticates the key
  • Resolves the correct tenant scope
  • Writes the notification record
  • Hands work to the background lane
II — Deliver
  • Worker drains the queue
  • Loads the right template
  • Renders title and body
  • Marks delivered
  • Publishes realtime · dispatches webhooks
III — Stream
  • React SDK requests a short-lived token
  • Opens an SSE connection
  • Receives live events
  • Replays missed events after reconnect
  • Updates the inbox optimistically
IV — Dispatch
  • Worker logs a pending attempt
  • Signs the payload
  • POSTs the customer endpoint
  • Records success or failure
  • Retries with backoff
V — Maintain
  • Scheduled maintenance runs off the hot path
  • Deletes old data under a retention policy
  • Attempt history kept for audit
  • Cleanup stays bounded and predictable
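The signing step in the dispatch lane can be sketched as an HMAC over the raw request body. The header scheme (`sha256=<hex>`) is an assumption for illustration, not the shipped wire format:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sender side: sign the raw body with the environment signing secret and
// carry the result in a header on the webhook POST.
function signPayload(body: string, secret: string): string {
  return `sha256=${createHmac("sha256", secret).update(body).digest("hex")}`;
}

// Receiver side: recompute over the raw body and compare in constant time.
function verifySignature(body: string, secret: string, header: string): boolean {
  const expected = Buffer.from(signPayload(body, secret));
  const actual = Buffer.from(header);
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

Signing the raw bytes (rather than a re-serialized object) is what lets a customer endpoint verify without worrying about JSON key ordering.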
(X) — Engineering tradeoffs

What I optimized for.

Strong choices
  • Idempotency at the schema layer

    Duplicate sends are blocked at the persistence layer, not just discouraged in application code.

  • Durable state has one home

    Transient coordination stays separate from the system of record, which keeps recovery and reasoning much simpler.

  • Write before enqueue

    The notification commits before background processing begins, so failures surface clearly instead of disappearing into side effects.

  • Webhook attempts are first-class

    Webhook delivery leaves a traceable attempt history, which matters for debugging and support.

Deliberate tradeoffs
  • SSE over WebSockets

    The product needs one-way delivery, not full duplex messaging. SSE keeps the auth and reconnect story simpler for this scope.

  • At-least-once, not exactly-once

    Workers may retry a job after partial success. Idempotency keys carry the weight at the boundaries.

  • Short replay buffer

    Reconnect history exists to smooth over short disconnects, not to act as a permanent event ledger.

  • Single-region, by design

    No multi-region or active-active deployment story yet. The MVP picks scope over posture.
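The at-least-once retry posture pairs naturally with a capped exponential backoff. The base delay, cap, and growth factor below are illustrative values, not the shipped schedule:

```typescript
// Capped exponential backoff: 1s, 2s, 4s, ... up to a ceiling.
// attempt is 1-based (first retry = attempt 1).
function backoffMs(attempt: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}
```

Because every retry writes an attempt record, a flapping customer endpoint shows up as a visible history of failures and delays rather than as silent duplicate suppression.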

(XI) — Risks

Where the design can fail.

  1. Slow side-effects on a hot path

    Rendering, fanout, and logging cannot sit on the caller's request. The send has to return fast or every integration feels the slowest dependency.

  2. Realtime across reconnects

    Live fanout does not retain history. Reconnects need a cursor and a replay window the SDK can drive without consumer code.

  3. Misbehaving webhook receivers

    Endpoints time out, 5xx, or 200-then-crash. Delivery has to retry, and every attempt has to leave evidence.

  4. Two callers, two trust boundaries

    Backends authenticate with long-lived keys, frontends with short-lived tokens. Different blast radius, different rate budgets, one API.

(XII) — Current limits

What the product does not cover yet.

  • Channels

    In-app + webhooks today. No email, SMS, or push.

  • Deployment

    Single-region in spirit. No multi-region or DR story shipped.

  • Identity

    Basic dashboard authentication only. No SSO or enterprise identity layer yet.

(XIII) — Outcomes

What the product made easier.

Outcome
Two-line backend integration

One Node SDK call sends a notification, another mints a frontend token. Validation and typed errors live at the boundary, not inside the app.

Outcome
Drop-in React inbox

A provider, a hook, and a bell. Token caching, reconnects, and optimistic reads are handled inside the SDK, not in consumer code.

Outcome
Reconnect-safe realtime

Live events stream into the SDK while reconnects resume from the last seen event. Backgrounded tabs and brief network loss stop dropping notifications.

(XIV) — Key lessons

What I would carry forward.

  1. The SDK is the product

    A platform is only as good as the libraries integrators actually touch. Two clean SDKs do more for adoption than another feature behind a flag.

  2. Realtime is two problems

    Live delivery and replay want different storage. Solve them separately and they compose.

  3. Tenancy belongs in the schema

    Project and environment scoping in indexes and uniqueness constraints made the rest of the system easier to reason about than enforcing it in code paths.

  4. Keep the write path boring

    Slow work goes to a queue, fast work stays inline. That line is the architecture.
