P / 02 · 2025 — 2026
Case study

Media Pipeline

Long-form to short-form, automated

A Django, FastAPI, Celery, and RabbitMQ media pipeline that repurposes long-form video into near-ready short-form deliverables, using durable run/job/output records, artifact-first stage boundaries, PydanticAI-driven analysis, and Remotion on AWS Lambda for parallel rendering.

Product preview
(I) — The premise

Ready to post, ready in minutes.

Manual social clipping took about four weeks per campaign. This pipeline turns a two-hour podcast into 30 to 40 branded social clips in roughly one hour.

Before this, enterprises were paying editors to sit through hours of recording, find the usable moments, cut them down, caption them, and export multiple aspect ratios by hand. The target here was not draft quality. It was near-ready output at production throughput.

The system now lands around 95% ready-to-post quality by pushing each run through twelve cooperating stages that can fail, resume, and reuse completed work without losing their place.

Branded clips / run
30-40
Turnaround time
1 hr
Ready-to-post quality
95%
Time reduction
4wk→1hr
Pipeline stages
12
Parallel export
Lambda
(II) — System map

Five cooperating parts.

01 · django + queues
Control Plane
  • Django APIs receive repurpose requests
  • Celery + RabbitMQ drive async execution
  • FastAPI serves AI-heavy endpoints
  • Owns run state and render completion
02 · analysis + orchestration
Pipeline Runtime
  • Selects the active workflow
  • Triggers stages when prerequisites are ready
  • Records job-level progress and failure
  • Uses PydanticAI + LLMs to analyze and score moments
03 · short-form renderer
Render Runtime
  • Consumes prepared clip payloads
  • Renders branded 9:16, 1:1, and 16:9 variants
  • Fans out exports through Remotion on AWS Lambda
  • Hands files back to product storage
04 · authoring + publish
Style Plane
  • Authors subtitle and overlay systems
  • Pins client-specific branding at runtime
  • Publishes immutable creative packages
  • Defines runtime-safe contracts
05 · durable state
State Layer
  • Run record (one per source video)
  • Job record (one per stage)
  • Output record (one per short)
  • Execution metadata + logs
(III) — DAG

Twelve stages,
explicit edges.

Multiple transcription providers produce timestamped text. Segment detection and scene analysis establish candidate boundaries. PydanticAI-driven LLM passes look for high-energy moments, speaker changes, topic completeness, hook strength, quotability, and likely engagement. Preparation and face tracking then shift to clip-local assets, with the heavier vision path deployed separately on Cloud Run. Composition writes the render index. Rendering fans out through Remotion on AWS Lambda, and the analysis boundary closes before delivery does.

Ingest → Plan → Prepare → Compose → Render

01 · Transcoding · normalize
02 · Transcription · speech to text
03 · Segment Detect · boundaries
04 · Planning · editorial plan
05 · Enhancement Plan · creative
06 · Prepare Media · async handoff
07 · Framing Analysis · per clip
08 · Audio Planning · short-local
09 · Enhancement Build · bundle
10 · Prepare Finalize · merge
11 · Rendering setup · Composition · render bundle
12 · Rendering · fanout

Edges annotated: delegated async callback · analysis to delivery
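The LLM scoring passes described above are the most schema-sensitive part of the plan, which is where PydanticAI earns its place. A minimal sketch of what such a scoring pass can look like; the model choice, field names, and prompt are illustrative, not the production schema:

```python
from pydantic import BaseModel, Field
from pydantic_ai import Agent

class MomentScore(BaseModel):
    """One candidate moment, scored by an editorial LLM pass."""
    start_s: float
    end_s: float
    hook_strength: float = Field(ge=0.0, le=1.0)
    quotability: float = Field(ge=0.0, le=1.0)
    topic_complete: bool
    rationale: str

class MomentBatch(BaseModel):
    moments: list[MomentScore]

# Structured output is enforced by PydanticAI: a malformed LLM response is
# retried against the schema instead of leaking into the plan artifact.
scorer = Agent(
    "openai:gpt-4o",           # model choice is illustrative
    output_type=MomentBatch,   # named `result_type` on older pydantic-ai releases
    system_prompt="Score each candidate segment for short-form potential.",
)

def score_window(transcript_window: str) -> MomentBatch:
    # `.output` is `.data` on older pydantic-ai releases
    return scorer.run_sync(transcript_window).output
```

Because the scored plan is itself an artifact, an editorially weak batch can be regenerated without touching transcription or scene analysis upstream.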
(IV) — Records

Three lifecycles,
one spine.

The run is the top-level control object. Each stage becomes a durable job. Each rendered short becomes its own output. The tight job state model — pending, in_progress, completed, failed — keeps orchestration predictable while still allowing retries, subtree resets, and reuse of completed work.

Persistent state · three durable records

01 · Run record · one source video being repurposed
States: started → analysis → analysis-complete → fully generated (ok) · delivery failed (fail)
Carries: execution id · workflow name · started / completed / failed timestamps · append-only progress logs · composition index location

02 · Job record · one stage in the workflow
States: pending → in_progress → completed (ok) · failed (fail)
Carries: stage type · execution id · delegated platform ids · error details for recovery · artifact namespace

03 · Output record · one rendered short
States: pending → rendering → rendered (ok) · failed (fail)
Carries: render operation id · render bucket · namespace · progress % · final video asset ref
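A minimal Django sketch of the job record's tight state model; the states mirror the record above, while the surrounding fields are illustrative:

```python
from django.db import models

class JobState(models.TextChoices):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"

class Job(models.Model):
    """One stage in the workflow; one durable row per stage execution."""
    run = models.ForeignKey("Run", on_delete=models.CASCADE, related_name="jobs")
    stage_type = models.CharField(max_length=64)
    state = models.CharField(
        max_length=16, choices=JobState.choices, default=JobState.PENDING
    )
    artifact_namespace = models.CharField(max_length=255)
    error_details = models.JSONField(null=True, blank=True)  # kept for recovery
```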
(V) — Artifact contract

Stages don't call,
they publish.

Each stage reads the artifacts it depends on, computes, writes new structured data or media to durable storage, and marks its job complete. The orchestrator triggers what comes next. Delegated work, retries, and reuse all fall out of that contract. Debugging becomes reading the directory.

Participants: Control Plane (run record) · Orchestrator (workflow graph) · Pipeline Stage (job record) · Artifact Store (structured data + media) · Render Runtime (short-form renderer)

01 · Start pipeline execution
02 · Trigger eligible stage
03 · Write stage artifacts
04 · Mark job complete
05 · Trigger dependent stage
06 · Read upstream artifacts
07 · Write composition index
08 · Trigger render fanout
09 · Write rendered shorts

Every stage boundary is an artifact, not an in-memory hop.
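In code, the contract stays small. A sketch of the stage loop, assuming the Job model sketched earlier; load_artifacts, compute, store_artifact, and trigger_dependents are hypothetical stand-ins for the real storage and orchestration layers:

```python
def run_stage(job: Job) -> None:
    """The whole stage contract: read upstream artifacts, compute,
    publish new artifacts, mark the job complete."""
    job.state = JobState.IN_PROGRESS
    job.save(update_fields=["state"])
    try:
        upstream = load_artifacts(job)                  # read dependencies
        result = compute(job.stage_type, upstream)      # pure computation
        store_artifact(job.artifact_namespace, result)  # publish to durable storage
    except Exception as exc:
        job.state = JobState.FAILED
        job.error_details = {"error": str(exc)}
        job.save(update_fields=["state", "error_details"])
        raise
    job.state = JobState.COMPLETED
    job.save(update_fields=["state"])
    trigger_dependents(job.run)  # the orchestrator decides what runs next
```

Nothing downstream holds a reference to this process; a retry or a delegated worker that produces the same artifacts is indistinguishable to the orchestrator.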
(VI) — Render fanout

Webhook or poll,
whichever wins.

The composition index spawns one render job per short. Remotion packages client branding, captions, and aspect-ratio variants, then AWS Lambda fans those renders out in parallel. Each output completes on its own schedule, signaled either by an async webhook or by background polling. Finalization is modeled as a terminal-state transition, so a missed callback is recovered by the poll and a duplicate signal becomes a no-op.

Render fanout · webhook + polling · terminal-state transition, idempotent

IN: Composition index · one payload per short (five shorts in this example)
Shorts 01-05: output record · rendering
Webhook: push · low-latency
Polling: fallback · resilient
OUT: Finalize · copy · publish · video record

Whichever signal arrives first wins; the other is a no-op. The terminal-state transition handles the race.
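The race collapses into one conditional update. A sketch, assuming an Output model and OutputState choices mirroring the output record above; publish_video_record is a hypothetical finalization helper:

```python
def finalize_render(output_id: int, terminal_state: str) -> bool:
    """Terminal-state transition: whichever signal lands first wins.

    The filter only matches a row still in a non-terminal state, so the
    UPDATE is atomic at the database: the second signal (a duplicate
    webhook, or the poll arriving after the webhook) matches zero rows
    and becomes a no-op.
    """
    moved = Output.objects.filter(
        id=output_id,
        state=OutputState.RENDERING,    # only move out of a non-terminal state
    ).update(state=terminal_state)
    if moved:
        publish_video_record(output_id)  # copy + publish runs exactly once
    return bool(moved)
```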
(VII) — Ownership

Authoring out,
production in.

The style and enhancement control plane is a separate subsystem. It authors and publishes immutable creative contracts: subtitle systems, overlays, audio treatment, and visual rules. The pipeline pins them by version at runtime. Creative quality control stays out of the production DAG.

A - Authoring · Style Control Plane
  • Authoring · Preview · creative quality control
  • Publish Immutable · versioned · auditable

B - Production · Main Pipeline
  • Control Plane · run record
  • Workflow Orchestration · 12-stage DAG
  • Plan / Prepare / Compose · clip-local
  • Run / Job / Output state · durable

C - Delivery · Render Runtime
  • Short-form Composition · render bundle
  • Output Delivery · object storage handoff

Edges: publish → composition index → completion
Creative authoring stays outside the production DAG.
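Pinning happens once, at run start. A sketch of resolving a published package into an immutable reference; the StylePackage model and field names are illustrative:

```python
def pin_style(run: Run, client_id: int) -> None:
    """Resolve the client's latest *published* creative package once, at run
    start, and record the immutable version on the run."""
    package = (
        StylePackage.objects
        .filter(client_id=client_id, status="published")
        .latest("version")
    )
    run.metadata["style_package"] = {
        "id": package.id,
        "version": package.version,  # pinned version, never "latest" at render time
    }
    run.save(update_fields=["metadata"])
```

Every stage that touches creative reads the pinned version from the run, so a package published mid-run can never split one execution across two styles.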
(VIII) — Operational behavior

Reuse, rerun, recover.

Reuse of completed work
Completed jobs are first-class

The trigger layer can attach a new execution to compatible completed jobs for the same source asset and stage type. No separate cache. The orchestrator keys off completed contracts, not task invocations. Latency drops, cost drops, and the graph continues from the reused boundary.
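A sketch of that trigger-layer check, keyed on source asset and stage type; the attach_completed helper and field names are illustrative:

```python
def resolve_stage(run: Run, stage_type: str) -> Job:
    """Attach this execution to a compatible completed job if one exists
    for the same source asset and stage type; otherwise create a fresh job."""
    done = Job.objects.filter(
        run__source_asset=run.source_asset,
        stage_type=stage_type,
        state=JobState.COMPLETED,
    ).first()
    if done is not None:
        run.attach_completed(done)  # record the reused boundary on the run
        return done                 # the graph continues from this contract
    return Job.objects.create(run=run, stage_type=stage_type)
```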

Enhancement-only rerun
Replay from a controlled subtree

Style override reruns enter at the enhancement-planning boundary rather than at the start. The override lives in the run metadata, and downstream composition and rendering regenerate. The DAG is not just executable. It is partially replayable.
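A sketch of the subtree replay, using stage names from the DAG; the metadata field and trigger_eligible helper are illustrative:

```python
ENHANCEMENT_SUBTREE = [
    "enhancement_plan",   # 05 · the re-entry boundary
    "enhancement_build",  # 09
    "prepare_finalize",   # 10
    "composition",        # 11
    "rendering",          # 12
]

def rerun_enhancement(run: Run, style_override: dict) -> None:
    """Replay only the creative subtree; upstream completed jobs are reused."""
    run.metadata["style_override"] = style_override  # the override lives on the run
    run.save(update_fields=["metadata"])
    run.jobs.filter(stage_type__in=ENHANCEMENT_SUBTREE).update(
        state=JobState.PENDING
    )
    trigger_eligible(run)  # re-enters at enhancement planning, not at ingest
```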

Failure boundaries
Workflow ≠ delivery

Workflow completion = analysis + composition done. Delivery completion = every output reached a terminal render outcome. Any failed required stage marks the workflow failed. Per-clip render failures roll up into delivery state, not workflow state.
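The two checks stay separate queries. A sketch, assuming the run, job, and output relations from the records section; the required flag is illustrative:

```python
def workflow_complete(run: Run) -> bool:
    """Workflow completion: every required stage reached `completed`
    (analysis + composition done)."""
    return not run.jobs.filter(required=True).exclude(
        state=JobState.COMPLETED
    ).exists()

def delivery_complete(run: Run) -> bool:
    """Delivery completion: every output in a terminal render outcome.
    A per-clip render failure rolls up here, not into workflow state."""
    return not run.outputs.exclude(
        state__in=(OutputState.RENDERED, OutputState.FAILED)
    ).exists()
```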

(IX) — Failure points

Where it could break.

01 · Inconsistent source media

Strange codecs, broken containers, missing audio. Transcoding has to be defensive enough that downstream stages can assume a stable input.

02 · Long jobs failing late

An hour of analysis cannot collapse on a render error. Resumability has to live at the artifact boundary, not at the task level.

03 · Delegated stages and missed callbacks

Prepare Media and Rendering both depend on external systems. Polling is the safety net so a dropped webhook doesn't strand a run.

04 · Editorial drift in generated plans

Plans can be technically valid but editorially uneven. Treating the plan as an explicit contract, with style binding and enhancement separation, keeps reruns cheap when only the creative layer needs to move.

(X) — Outcomes

What the architecture buys.

Outcome
Four weeks down to one hour

What used to take a campaign-long editorial loop now compresses into a single processing run, because transcription, planning, preparation, and rendering all execute as resumable async stages.

Outcome
95% ready-to-post output

Candidate clips are not just cut on timestamps. They are scored for hook strength, emotional tone, quotability, and completion, then rendered with client branding and captions already applied.

Outcome
30 to 40 clips in one run

Parallel Lambda export turns per-clip rendering into fanout instead of a queue of serial renders, so dozens of outputs land in minutes rather than hours.

(XI) — Learnings

What stayed useful.

01 · Make the contract the artifact

When stages publish to durable storage, retries, reuse, and delegated execution all become trivial. The orchestrator stops caring about who computed what.

02 · Separate clip selection from creative

Holding clip selection and creative treatment in different stages made enhancement-only reruns possible without rebuilding the whole graph.

03 · Two completion signals beat one

Webhooks reduce time-to-finalize on the happy path; polling survives missed callbacks. Treat finalization as a terminal-state transition and the race resolves itself.

04 · Pin creative as runtime input

Style is not a render flag. It is a versioned contract published from a separate control plane. The pipeline pins it for consistency, auditability, and safer reruns.
