Saga Pattern: Distributed Transactions Without 2PC — ContentBuffer guide

Saga Pattern: Distributed Transactions Without 2PC

K
Kodetra Technologies··10 min read Advanced

Summary

Coordinate multi-service writes with orchestration, choreography, and compensating transactions.

If you have ever shipped a feature that touches three services in one user action — charge a card, decrement inventory, send a confirmation email — you have run head-first into the dirty secret of microservices: the database transaction stops at the service boundary. Two-phase commit (2PC) used to paper over that gap, but in 2026 it is effectively dead in cloud-native stacks. It locks resources across services, falls over when any participant is slow, and is unsupported by the message brokers and managed databases most teams build on.

The saga pattern is the answer that survived. Instead of one global transaction, a saga is a sequence of local transactions, each in its own service, glued together with events or commands. When a step fails, the saga runs compensating transactions that semantically undo the earlier steps. No global locks, no XA driver, no coordinator that has to stay up for the entire workflow.

This guide is the production version of that pattern. We will walk through both flavors — choreography and orchestration — with runnable Python sketches you can paste into a service, then dig into the trade-offs that decide which one fits your system. Expect honest opinions on idempotency, retries, ordering, and the failure modes that bite teams in their first month live.

Prerequisites

  • Comfort with at least two cooperating services and a message broker (Kafka, RabbitMQ, NATS, SQS).
  • Working knowledge of database transactions, including the difference between ACID locally and BASE across services.
  • A grasp of idempotency: an operation that produces the same result if applied once or many times.
  • Familiarity with eventual consistency. Sagas do not give you read-after-write across services for free.

The Core Idea: Local Transactions + Compensations

Picture a checkout flow. The user clicks Buy, and you must do four things across four services:

  1. OrderService creates an order in PENDING state.
  2. PaymentService charges the card.
  3. InventoryService reserves stock.
  4. ShippingService creates a shipment.

Each step writes to a different database. There is no global transaction that wraps all four. The saga insight is to accept that and to define, for each forward step, a compensation that semantically reverses it. If shipment creation fails, you do not roll back inventory — you call a ReleaseStock compensation. If payment fails after stock is reserved, you call RefundPayment and ReleaseStock. Compensations are themselves local transactions, and they must be idempotent and almost-always-succeed: a compensation that can fail puts your saga into a manual-recovery state.

Forward steps and compensations form a pair. Write them together. If you cannot describe how to undo a step, you cannot include it in a saga.

Flavor 1: Choreography (Decentralized)

In a choreographed saga, there is no central brain. Each service listens for events on the bus and emits its own events when it finishes. The flow lives in the network of subscriptions, not in any one place. This is the natural fit for an event-driven shop already on Kafka or NATS.

Here is a minimal sketch of the InventoryService participating in a checkout saga. It reacts to PaymentSucceeded, reserves stock, and emits the next event. On failure, it emits a compensation trigger.

# inventory_service/handlers.py
import uuid
from kafka import KafkaConsumer, KafkaProducer
import json, db

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def handle_payment_succeeded(event):
    order_id = event["order_id"]
    items = event["items"]

    # Idempotency guard: have we processed this saga step already?
    if db.has_processed("inventory", order_id):
        return

    try:
        with db.transaction() as tx:
            for item in items:
                tx.execute(
                    "UPDATE stock SET reserved = reserved + %s "
                    "WHERE sku = %s AND available - reserved >= %s",
                    (item["qty"], item["sku"], item["qty"]),
                )
                if tx.rowcount == 0:
                    raise OutOfStock(item["sku"])
            tx.execute(
                "INSERT INTO saga_log(saga, order_id, step) "
                "VALUES (%s, %s, %s)",
                ("inventory", order_id, "RESERVED"),
            )

        producer.send("saga.events", {
            "type": "StockReserved",
            "order_id": order_id,
            "items": items,
            "event_id": str(uuid.uuid4()),
        })

    except OutOfStock as e:
        producer.send("saga.events", {
            "type": "StockReservationFailed",
            "order_id": order_id,
            "reason": f"out_of_stock:{e.sku}",
            "event_id": str(uuid.uuid4()),
        })

# Compensation handler — fired when a downstream step fails.
def handle_shipment_failed(event):
    order_id = event["order_id"]

    # Compensations must be idempotent.
    if db.has_compensated("inventory", order_id):
        return

    with db.transaction() as tx:
        items = tx.fetch_reserved_items(order_id)
        for item in items:
            tx.execute(
                "UPDATE stock SET reserved = reserved - %s WHERE sku = %s",
                (item["qty"], item["sku"]),
            )
        tx.execute(
            "INSERT INTO saga_log(saga, order_id, step) "
            "VALUES (%s, %s, %s)",
            ("inventory", order_id, "COMPENSATED"),
        )

Three details deserve attention. The reservation update uses an atomic UPDATE ... WHERE available - reserved >= qty so the SQL engine rejects races without an explicit lock. The handler checks has_processed before doing anything — Kafka, like every broker worth using, gives you at-least-once delivery, so duplicates are normal. And the saga state is written in the same local transaction as the business change, which is the heart of the transactional outbox trick: either both happen, or neither does.

Choreography shines when you have three or four steps and a culture of small services owning their data. It dies when you have nine steps, branching logic, or a need to ask 'where is order 12345 right now?' — the answer is scattered across nine services and a Kafka topic, and the true source is the union of their logs.

Flavor 2: Orchestration (Centralized Coordinator)

An orchestrated saga puts the workflow in one place. A coordinator service (or a workflow engine like Temporal, AWS Step Functions, or Camunda) sends explicit commands to participants and waits for replies. The participants do not know about each other; they only know the orchestrator.

In 2026 most teams reach for a workflow engine here rather than rolling their own. Temporal in particular has eaten this niche because it gives you durable state, automatic retries, and timer support without you running a Postgres-backed coordinator. Here is a Temporal-style orchestrator in Python that captures the structure clearly.

# checkout_workflow.py
from datetime import timedelta
from temporalio import workflow
from temporalio.common import RetryPolicy
from activities import (
    create_order, charge_card, reserve_stock, create_shipment,
    refund_payment, release_stock, cancel_order,
)

@workflow.defn
class CheckoutSaga:
    @workflow.run
    async def run(self, order: dict) -> str:
        retry = RetryPolicy(
            initial_interval=timedelta(seconds=1),
            maximum_attempts=5,
        )
        compensations = []

        try:
            order_id = await workflow.execute_activity(
                create_order, order,
                start_to_close_timeout=timedelta(seconds=10),
                retry_policy=retry,
            )
            compensations.append(("cancel_order", {"order_id": order_id}))

            payment_id = await workflow.execute_activity(
                charge_card,
                {"order_id": order_id, "amount": order["total"]},
                start_to_close_timeout=timedelta(seconds=15),
                retry_policy=retry,
            )
            compensations.append(("refund_payment", {"payment_id": payment_id}))

            await workflow.execute_activity(
                reserve_stock,
                {"order_id": order_id, "items": order["items"]},
                start_to_close_timeout=timedelta(seconds=10),
                retry_policy=retry,
            )
            compensations.append(("release_stock", {"order_id": order_id}))

            await workflow.execute_activity(
                create_shipment,
                {"order_id": order_id, "address": order["address"]},
                start_to_close_timeout=timedelta(seconds=20),
                retry_policy=retry,
            )

            return order_id

        except Exception as exc:
            workflow.logger.warning("saga failed, compensating: %s", exc)
            # Run compensations in reverse order. They MUST be idempotent.
            for name, args in reversed(compensations):
                await workflow.execute_activity(
                    name, args,
                    start_to_close_timeout=timedelta(seconds=30),
                    retry_policy=RetryPolicy(maximum_attempts=10),
                )
            raise

Read it once and the structure jumps out: a list of forward activities, a stack of compensations pushed after each success, and an except clause that drains the stack in reverse. Because Temporal persists workflow state on every step, the orchestrator can crash, restart, and pick up exactly where it left off — including in the middle of a compensation. That durability is what you are buying when you adopt a workflow engine; building it on top of a hand-rolled coordinator and Postgres is possible but easy to get wrong.

Choreography vs Orchestration: How to Choose

Both patterns are correct. The trade-off is about visibility, coupling, and operational maturity. The matrix below summarizes how I push teams in design reviews.

DimensionChoreographyOrchestration
CouplingLoose (services know events)Tight to orchestrator (services know commands)
VisibilityDistributed across logs and topicsCentralized — one workflow ID tells the whole story
Adding a stepEdit one new subscriberEdit the workflow definition
Branching/conditionalHard — duplicates logic across servicesTrivial — it is just if in the workflow
Failure debuggingHard at 5+ stepsStraightforward — one trace per saga
Operational depsJust a brokerBroker plus a workflow engine
Best fitSmall flows, event-native domainLong, branching, money-touching flows

A reasonable rule: start with choreography for flows up to about four steps, switch to orchestration the moment you need branching, timers ("cancel if not paid in 30 minutes"), or a single place to ask "why did this order fail?". Hybrid is fine — choreography for the fan-out parts, orchestration for the critical path.

Production Pitfalls That Will Bite You

1. Forgetting idempotency at every step

At-least-once delivery is the default everywhere — Kafka, SNS, RabbitMQ, Temporal activities. Every handler will be called more than once at some point. Without idempotency, retries silently double-charge cards and double-decrement inventory. The cheapest fix is a deduplication table keyed on (saga_id, step_id, event_id), written inside the same local transaction as the side effect.

2. Compensations that can fail forever

A failed compensation is an outage. Design compensations to be safer than the forward step: refund-by-creating-credit-note rather than refund-by-reversing-charge, release-stock-by-decrementing-counter rather than by calling out to a third-party fulfillment system. If a compensation can genuinely fail, escalate to a human queue with the saga ID and current state.

3. Cross-saga isolation (no, you do not have it)

Sagas are not isolated from each other. While saga A is mid-flight, saga B can read the half-applied state. This is the famous "dirty reads in sagas" problem. Two countermeasures: use semantic locks (a RESERVED state on the order that other sagas treat as unavailable), or design the read side to tolerate inconsistency by reading from a projection that is only updated on saga completion.

4. Missing or wrong timeouts

Without a timeout, a stuck participant freezes the entire saga indefinitely. Every forward step needs a timeout, after which the saga either retries or compensates. In Temporal that is start_to_close_timeout; in a hand-rolled choreography it is a scheduled "saga heartbeat" job that looks for sagas older than N minutes in non-terminal states.

5. Treating compensations as exact inverses

A refund is not the inverse of a charge — it is a separate action with its own audit trail and tax implications. Cancel-shipment is not the inverse of create-shipment if the shipment has already been picked. Document each compensation as its own first-class operation, not as step_X.undo().

Observability: Knowing Where Every Saga Is

The number one operational complaint about sagas is "I have no idea where order 12345 is right now." Three pieces of plumbing fix this, and you should put them in on day one rather than retrofit them after your first outage.

First, generate a saga_id at the entry point and propagate it as a header on every event and command. Treat it the same as a trace ID — in fact, you can just use the W3C traceparent for this and let OpenTelemetry do the work. Every span in every service then carries the same saga ID, and a single Jaeger or Datadog query gives you the full path.

Second, write a saga_log row in every participating service on every state transition, in the same local transaction as the business change. The schema is small and the same everywhere:

CREATE TABLE saga_log (
    saga_id     UUID NOT NULL,
    service     TEXT NOT NULL,
    step        TEXT NOT NULL,
    status      TEXT NOT NULL,         -- STARTED | OK | FAILED | COMPENSATED
    event_id    UUID NOT NULL,         -- for idempotency
    payload     JSONB,
    created_at  TIMESTAMPTZ DEFAULT now(),
    PRIMARY KEY (saga_id, service, step, event_id)
);
CREATE INDEX ON saga_log (saga_id, created_at);

Third, build a small read-model that joins saga logs from every service into a single view. In choreography this is mandatory because no one service has the whole picture; in orchestration the workflow engine gives you most of it for free, but a unified view across multiple workflows still pays. A simple periodic ETL into a saga_state table is enough to drive a status page and an on-call dashboard.

When NOT to Use a Saga

Sagas are not free. They cost code complexity, operational surface area, and a real cognitive tax on every engineer who has to reason about them. Reach for one only when the problem genuinely needs cross-service consistency.

  • Single-service flows. If everything happens inside one service, use a database transaction. A saga inside one service is just a worse transaction.
  • Read-only or eventually-consistent flows. If your steps are queries or projections, you do not need compensations. Use a fan-out or a materialized view.
  • True ACID requirements. If money cannot be even briefly miscounted (regulatory ledgers, exchange order books), put the data in one service with a real transaction. Sagas are eventually consistent by definition.
  • Tiny teams without ops maturity. If your team cannot reliably run a Kafka cluster or a Temporal deployment, a saga will be a worse problem than the duplicated work you are trying to avoid. Ship the monolith first.

The most common mistake is reaching for a saga in a system that did not need to be a microservice in the first place. If a single workflow touches five services and they all share the same release cadence and team, you have a distributed monolith. Merging two of those services often kills the saga entirely and leaves you with cleaner code.

Quick Reference

ConceptWhat it meansWhen you need it
Local transactionAn ACID transaction inside one serviceEvery saga step
CompensationA local transaction that semantically undoes a previous stepEvery step except the last
Idempotency keyA unique ID stored alongside the side effect, used to detect duplicatesEvery handler that mutates state
Saga logPer-saga state machine persisted in a databaseBoth flavors, mandatory for orchestration
Semantic lockA status flag others treat as 'busy'When dirty reads cause user-visible bugs
Workflow engineDurable execution platform (Temporal, Step Functions, Camunda)Orchestration at any non-trivial scale
Outbox patternWrite event + business change in one local TX, ship event from a pollerChoreography on top of any DB + broker

Next Steps

  • Pick the smallest multi-service flow you currently run and draw the saga: forward steps, compensations, and the trigger event for each.
  • Add a saga_log table to each participating service with columns (saga_id, step, status, event_id, updated_at). Backfill it from your existing logs.
  • If you are already on Kafka or NATS, prototype the choreography flavor first — it is closer to what you have. If you have more than five steps or any branching, prototype Temporal in parallel.
  • Write at least one chaos test per saga: kill the participant after the forward step succeeds but before it acks. The saga should resume cleanly on restart.
  • Wire your tracing: every saga should have a single trace ID that spans every service. This single change will pay for itself the first time something breaks at 2 AM.

Sagas are not magic. They trade global consistency for service autonomy and ask you to think hard about every step's failure mode. Do that work once, and you have a workflow that scales horizontally, survives partial outages, and gives you a real audit trail. Skip it, and you ship double-charged customers.

Comments

Subscribe to join the conversation...

Be the first to comment