Distributed Transactions - The Saga Pattern

14 May 2026·3 min read

1. The Background: The Monolithic Safety Net

In a traditional monolith (one application backed by one database), transactions are easy. You rely on the database's ACID properties. If a user buys a laptop, you wrap the whole process in a single SQL transaction:

BEGIN TRANSACTION
Create the Order.
Deduct $1,000 from the user's account.
Deduct 1 Laptop from Inventory.
COMMIT (Save permanently).

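If the inventory check fails (out of stock!), the application issues a ROLLBACK (or the database aborts the transaction on error), and every change made since BEGIN is discarded. The $1,000 deduction was never visible to anyone else, so it is as if it never happened. It is "All or Nothing."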
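Here is a minimal sketch of that safety net using Python's built-in sqlite3 module. The table names, the buy_laptop and make_db helpers, and the starting balance of $1,500 are assumptions made up for illustration; the point is only that a failure in the last step undoes the first two.

```python
import sqlite3

def make_db(stock):
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT)")
    conn.execute("CREATE TABLE accounts (user_id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 1500)")
    conn.execute("INSERT INTO inventory VALUES ('laptop', ?)", (stock,))
    conn.commit()
    return conn

def buy_laptop(conn, user_id, price):
    """Run all three steps in one local transaction; return True if committed."""
    try:
        # "with conn" commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "INSERT INTO orders (user_id, item) VALUES (?, 'laptop')", (user_id,)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE user_id = ?",
                (price, user_id),
            )
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - 1 "
                "WHERE item = 'laptop' AND stock > 0"
            )
            if cur.rowcount == 0:  # no row updated means nothing was in stock
                raise RuntimeError("out of stock")
        return True
    except RuntimeError:
        return False
```

Calling buy_laptop(make_db(stock=0), 1, 1000) returns False and leaves the balance at 1500 with no order row, because the order insert and the deduction were rolled back together with the failed inventory update.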

2. The Problem: The Microservice Nightmare

Many large tech companies (Netflix, Uber, Amazon) have moved much of their backend from monoliths to microservices. The Order Service, Payment Service, and Inventory Service are three separate codebases, deployed independently and usually on different machines, each with its own private database.

The Disaster: A user clicks "Buy."

The Order Service creates an order. (Success)
The Payment Service deducts $1,000 from the credit card. (Success)
The Inventory Service checks for laptops... but they are out of stock. (Failure!)

You cannot run a single SQL ROLLBACK across three different databases over the network. Each service has already committed its own local transaction, and the Payment database has no idea the Inventory database even exists. You have successfully taken $1,000 from a customer for a laptop that does not exist, and nothing undoes it automatically.
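To make the disaster concrete, here is a small simulation in which three separate in-memory SQLite databases stand in for the three services. The schema, the make_services and naive_checkout helpers, and the $1,500 starting balance are all hypothetical. Each step commits on its own connection, so the inventory failure can only roll back the inventory database.

```python
import sqlite3

def make_services(stock):
    """Create three independent databases, one per (hypothetical) service."""
    orders_db = sqlite3.connect(":memory:")
    orders_db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")

    payments_db = sqlite3.connect(":memory:")
    payments_db.execute("CREATE TABLE accounts (user_id INTEGER PRIMARY KEY, balance INTEGER)")
    payments_db.execute("INSERT INTO accounts VALUES (1, 1500)")
    payments_db.commit()

    inventory_db = sqlite3.connect(":memory:")
    inventory_db.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER)")
    inventory_db.execute("INSERT INTO inventory VALUES ('laptop', ?)", (stock,))
    inventory_db.commit()
    return orders_db, payments_db, inventory_db

def naive_checkout(orders_db, payments_db, inventory_db, price=1000):
    """Three local transactions with nothing tying them together."""
    with orders_db:  # commits in the orders database only
        orders_db.execute("INSERT INTO orders (status) VALUES ('CREATED')")
    with payments_db:  # commits in the payments database only
        payments_db.execute(
            "UPDATE accounts SET balance = balance - ? WHERE user_id = 1", (price,)
        )
    with inventory_db:  # a failure here rolls back the inventory database only
        cur = inventory_db.execute(
            "UPDATE inventory SET stock = stock - 1 "
            "WHERE item = 'laptop' AND stock > 0"
        )
        if cur.rowcount == 0:
            raise RuntimeError("out of stock")
```

With stock=0, naive_checkout raises RuntimeError, yet the order row and the $1,000 deduction are already committed in their own databases and stay there.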

3. The Solutions: 2PC vs. The Saga

Engineers created two main ways to solve this. One is the "old school" way, and the other is the approach most microservice systems favor today.

Solution A: Two-Phase Commit / 2PC (The "Hold Your Breath" Strategy)

This relies on a central "Coordinator" server.

Phase 1 (Prepare): The Coordinator calls all three services: "Are you ready? Lock your rows!" Each database locks the rows it needs, durably records that it is able to commit, and replies "Ready."
Phase 2 (Commit): If everyone replied "Ready," the Coordinator says, "Okay, everyone commit!" If anyone said no, it tells everyone to abort.
The Fatal Flaw: This is slow. Every participant holds its locks from the moment it votes "Ready" until the Coordinator's decision arrives, so other purchases that touch the same rows have to wait. Worse, if the Coordinator crashes after the participants have voted but before it announces the decision, they cannot safely commit or abort on their own and stay locked until the Coordinator recovers. In CAP theorem terms, 2PC chooses Consistency over Availability.
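The two phases can be sketched in a few lines of in-memory Python. This is a toy model, not a real protocol implementation: the Participant class, its can_commit flag, and the state names are invented for illustration, and there is no networking, logging, or crash recovery.

```python
class Participant:
    """Toy stand-in for one service's database (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "IDLE"  # IDLE -> PREPARED -> COMMITTED, or ABORTED

    def prepare(self):
        # A real participant would lock rows and durably log its vote here.
        if self.can_commit:
            self.state = "PREPARED"
            return True
        self.state = "ABORTED"
        return False

    def commit(self):
        self.state = "COMMITTED"

    def abort(self):
        self.state = "ABORTED"


def two_phase_commit(participants):
    """Coordinator logic: collect every vote, then broadcast one decision."""
    # Phase 1 (Prepare): ask every participant to vote.
    votes = [p.prepare() for p in participants]

    # Phase 2 (Commit or Abort): commit only if the vote was unanimous.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMITTED"
    for p in participants:
        p.abort()
    return "ABORTED"
```

The window that hurts is the gap between the two phases: a participant in the PREPARED state is holding its locks and can do nothing but wait for the Coordinator's decision.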

Solution B: The Saga Pattern (The Modern Standard)

Instead of one massive, locked transaction, a Saga is a sequence of completely independent, local database transactions.

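There is no magical ROLLBACK. Instead, if step 3 fails, the system must automatically execute Compensating Transactions for the steps that have already committed, usually in reverse order. A Compensating Transaction is a brand-new, explicitly written piece of code that undoes the business effect of an earlier step. It cannot erase history, so it acts as an "Apology."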

How a Saga handles the failure:

Order Service runs a local transaction: Status = PENDING.
Payment Service runs a local transaction: Charges $1,000.
Inventory Service tries to run its local transaction, but fails (Out of stock!).
The Rollback (The Apologies): The Inventory Service fires a "Failure Event" to a message broker (like Kafka).
The Payment Service hears the failure and runs its Compensating Transaction: Issues a $1,000 Refund.
The Order Service hears the failure and runs its Compensating Transaction: Changes Order Status to CANCELLED.
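The flow above can be sketched with a tiny in-memory event bus standing in for the message broker. Everything here is an assumption made for illustration: the EventBus class, the topic names, the CONFIRMED status for the happy path, and the single shared state dictionary (in a real system each service would own its data in its own database, and delivery would be asynchronous).

```python
from collections import defaultdict

class EventBus:
    """In-memory stand-in for a broker such as Kafka; delivers synchronously."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, event):
        for handler in list(self.handlers[topic]):
            handler(event)


def run_checkout(stock, balance=1500, price=1000):
    """Run one checkout saga and return the final state of all three services."""
    bus = EventBus()
    state = {"order": None, "balance": balance, "stock": stock}

    # Order Service: local transaction plus its compensation.
    def create_order(event):
        state["order"] = "PENDING"
        bus.publish("order_created", event)

    def confirm_order(event):
        state["order"] = "CONFIRMED"

    def cancel_order(event):  # compensating transaction
        state["order"] = "CANCELLED"

    # Payment Service: local transaction plus its compensation.
    def charge(event):
        state["balance"] -= event["price"]
        bus.publish("payment_charged", event)

    def refund(event):  # compensating transaction
        state["balance"] += event["price"]

    # Inventory Service: succeeds or fires the failure event.
    def reserve(event):
        if state["stock"] > 0:
            state["stock"] -= 1
            bus.publish("inventory_reserved", event)
        else:
            bus.publish("inventory_failed", event)

    bus.subscribe("checkout_requested", create_order)
    bus.subscribe("order_created", charge)
    bus.subscribe("payment_charged", reserve)
    bus.subscribe("inventory_reserved", confirm_order)
    bus.subscribe("inventory_failed", refund)
    bus.subscribe("inventory_failed", cancel_order)

    bus.publish("checkout_requested", {"price": price})
    return state
```

With stock=0 the saga ends with the order CANCELLED and the balance back at 1500; with stock=1 it ends with the order CONFIRMED, the balance at 500, and the stock at 0. Note that the refund is a second, separate change to the balance, not an erasure of the first one.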

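The data is eventually consistent, no lock was ever held across services, and each service stayed fast because it only waited on its own short local transaction. The trade-off is that intermediate states are visible for a moment: the customer may see a $1,000 charge followed by a $1,000 refund.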