Saga Pattern in Distributed Systems

Riza Farheen
Developer Advocate
July 24, 2023

Distributed systems make it possible to build architectures that are easy to maintain and scale. In such systems, data management is spread across multiple services. This is typical of a microservices-based architecture, where a single business use case spans multiple microservices, each with its own local datastore and local transactions.

A saga is a sequence of such local transactions across multiple services. The pattern was introduced in 1987 in a paper by Hector Garcia-Molina and Kenneth Salem of Princeton University, who defined a saga as a long-lived transaction that can be written as a sequence of transactions that can be interleaved with other transactions.

In this blog, we dive deeper into the concept of the saga pattern and shed light on how Orkes Conductor operates based on this pattern. Orkes Conductor, built over the battle-tested Conductor OSS, is an orchestration platform for building distributed applications and implementing saga patterns for microservices.

Saga Pattern - Orchestration & Choreography

The Saga pattern can be implemented in two main ways: orchestration and choreography.

In the choreography approach, each microservice consumes an event, performs the required activity, and emits an event for the next service to consume. There is no centralized coordinator, which makes the communication between services harder to follow and manage. In the orchestration approach, all the services are linked to a centralized coordinator that invokes them in a predefined order to complete the business flow.
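To make the contrast concrete, here is a minimal in-memory sketch of both styles. Everything in it is hypothetical and for illustration only: the `EventBus` class, the event names, and the three service functions are assumptions, not part of Conductor or of the sample application.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-memory publish/subscribe bus (hypothetical, illustration only)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._handlers[event_name].append(handler)

    def publish(self, event_name, payload):
        # Deliver the event to every service that subscribed to it.
        for handler in self._handlers[event_name]:
            handler(payload)


def run_choreography():
    """Each service reacts to an event and emits the next event itself."""
    bus = EventBus()
    log = []

    def booking_service(payload):
        log.append("booking")
        bus.publish("ride_booked", payload)

    def driver_service(payload):
        log.append("driver")
        bus.publish("driver_assigned", payload)

    def payment_service(payload):
        log.append("payment")

    bus.subscribe("ride_requested", booking_service)
    bus.subscribe("ride_booked", driver_service)
    bus.subscribe("driver_assigned", payment_service)

    bus.publish("ride_requested", {"riderId": 1})
    return log


def run_orchestration():
    """A central coordinator owns the order and calls each service in turn."""
    log = []
    services = [
        ("booking", lambda payload: log.append("booking")),
        ("driver", lambda payload: log.append("driver")),
        ("payment", lambda payload: log.append("payment")),
    ]
    payload = {"riderId": 1}
    for _name, call_service in services:
        call_service(payload)
    return log


print(run_choreography())
print(run_orchestration())
```

Both functions run the same three steps in the same order. The difference is where the order lives: in the choreography version it is spread across the subscriptions, while in the orchestration version it sits in one list owned by the coordinator.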

In my previous blog on event-driven architecture, I've discussed Orchestration & Choreography approaches in detail.

Why and When Do You Implement a Saga Pattern in Your Application Architecture?

The Saga pattern is considered essential when:

  • Your application involves multiple steps spanning different services, databases, etc.
  • Partial execution is not acceptable. That is, a rollback is mandatory when one of the services fails.

Compensation Transaction & State Management in Saga Pattern

Implementing a saga pattern in your application architecture brings several benefits. Let’s look at the two main advantages of the saga pattern in a microservices architecture:

Compensation Transaction

The Compensation Transaction, a core component of the Saga Pattern, aligns with the “Do It All or Do Nothing” principle. This implies that either all the transactions complete successfully or, if any service encounters an error, the steps that have already completed are rolled back so that the system returns to its initial state.

If we take the case of a cab booking application, various scenarios, such as the driver and user canceling the trip without proper notification or the payment being declined while the ride is marked as completed, can disrupt the seamless operation of the application. However, these challenges can be addressed by implementing compensation mechanisms for each microservice. These mechanisms help to rectify and resolve such issues, ensuring the smooth functioning of the overall application.

Workflow to handle compensation

In Conductor, applications are built as workflows. While defining a workflow, you can set a failure/compensation workflow that is triggered when your main workflow fails. This powerful concept allows developers to build complex compensating workflows that handle database compensations or state management.

Consider a situation where you are booking a cab using an application. The main steps in booking a cab are:

[Figure: A simplistic model of a cab booking application]

When a user searches for a cab, the application collects the user's details and initiates the transaction. The next step is to assign a driver for the ride. If no driver is available in the area, then a compensating action should be taken to inform the user.

Now, if the driver is available and confirms the ride, the transaction moves to the next step, which is the payment. The user is prompted to choose the payment method, and if the payment fails for any reason, rollback actions should be carried out: notifying the user of the failed payment, or initiating a refund if the amount was already processed. Subsequently, the ride should be removed from the driver's list and canceled for the user. All of these steps should work in sync for the application to function smoothly.
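The flow described above can be sketched as a small saga runner. This is a minimal illustration in plain Python, not how Conductor implements it: `run_saga`, `StepFailed`, the step functions, and the `card_declined` flag are all hypothetical names chosen for this example.

```python
class StepFailed(Exception):
    """Raised by a step when its local transaction cannot complete."""


def run_saga(steps, context):
    """Run (name, action, compensation) steps in order.

    If a step raises StepFailed, the compensations of the steps that already
    completed are run in reverse order and the saga reports "COMPENSATED".
    """
    completed = []  # compensations for the steps that have succeeded so far
    for name, action, compensation in steps:
        try:
            action(context)
        except StepFailed as error:
            context["log"].append(f"{name} failed: {error}")
            for undo in reversed(completed):
                undo(context)
            return "COMPENSATED"
        completed.append(compensation)
    return "COMPLETED"


# Hypothetical local transactions and their compensations.
def book_ride(ctx):
    ctx["log"].append("book_ride")


def cancel_booking(ctx):
    ctx["log"].append("cancel_booking")


def assign_driver(ctx):
    ctx["log"].append("assign_driver")


def unassign_driver(ctx):
    ctx["log"].append("unassign_driver")


def make_payment(ctx):
    if ctx.get("card_declined"):
        raise StepFailed("card declined")
    ctx["log"].append("make_payment")


def refund_payment(ctx):
    ctx["log"].append("refund_payment")


CAB_BOOKING_STEPS = [
    ("book_ride", book_ride, cancel_booking),
    ("assign_driver", assign_driver, unassign_driver),
    ("make_payment", make_payment, refund_payment),
]


def book_cab(card_declined):
    context = {"log": [], "card_declined": card_declined}
    status = run_saga(CAB_BOOKING_STEPS, context)
    return status, context["log"]


print(book_cab(card_declined=True))
```

In this sketch the failed step's own compensation is not run, because the payment never went through and there is nothing to refund. A step that can fail after charging the user would need to issue the refund itself or be split into smaller steps.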

Let’s see how you can create this application using Orkes Conductor.

The sample cab booking application looks like this:

[Figure: workflow diagram of the sample cab booking application]

The complete code used to define the application is available in the GitHub repository: https://github.com/conductor-sdk/conductor-examples-saga-pattern

Let’s see how the workflow progresses:

  • The initial booking process is implemented as a series of Simple tasks: booking a ride, assigning a driver, making the payment, and confirming the ride.
  • It is followed by a fork-join task that handles the notification process. The fork-join task has two forks, one to notify the driver and one to notify the user. Each notification is also implemented as a Simple task in Orkes Conductor.
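The shape of such a workflow definition can be sketched as follows. The task names and reference names here are hypothetical placeholders (the actual definition lives in the GitHub repository). Only the workflow name, the `SIMPLE`, `FORK_JOIN`, and `JOIN` task types, and the `forkTasks` and `joinOn` fields follow Conductor's workflow definition format.

```python
import json


def simple_task(name):
    """Build a SIMPLE task entry; the reference name is derived from the name."""
    return {"name": name, "taskReferenceName": f"{name}_ref", "type": "SIMPLE"}


def build_booking_workflow():
    # Hypothetical task names, chosen only to mirror the steps described above.
    notify_driver = simple_task("notify_driver")
    notify_user = simple_task("notify_user")
    return {
        "name": "cab_service_saga_booking_wf",
        "version": 1,
        "tasks": [
            simple_task("book_ride"),
            simple_task("assign_driver"),
            simple_task("make_payment"),
            simple_task("confirm_ride"),
            {
                "name": "notify_fork",
                "taskReferenceName": "notify_fork_ref",
                "type": "FORK_JOIN",
                # Each inner list is one fork; both forks run in parallel.
                "forkTasks": [[notify_driver], [notify_user]],
            },
            {
                "name": "notify_join",
                "taskReferenceName": "notify_join_ref",
                "type": "JOIN",
                # Wait for the last task of each fork before continuing.
                "joinOn": [
                    notify_driver["taskReferenceName"],
                    notify_user["taskReferenceName"],
                ],
            },
        ],
    }


print(json.dumps(build_booking_workflow(), indent=2))
```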

Let’s run the application now!

Note: Ensure that you have JDK 17 and SQLite installed on your system.

  1. Clone the project locally onto your system.
  2. Update the application.properties file with your access key.
orkes.access.key=your_key_id
orkes.access.secret=your_key_secret
orkes.conductor.server.url=https://play.orkes.io/api

Note: If you are running Conductor locally, replace orkes.conductor.server.url with your Conductor server URL.

  3. Provide permissions for the application to access the workflow and tasks.
  4. By default, conductor.worker.all.domain is set to ‘saga’. Change it to a different name to avoid conflicts with the workflows and workers spun up by others in Orkes Playground.
  5. Run your application from the project root using the following command.
mvn spring-boot:run

Next, you need to create a booking request!

You can do this in two ways:

  • By calling the triggerRideBookingFlow API from within the application.
 curl --location 'http://localhost:8081/triggerRideBookingFlow' \
 --header 'Content-Type: application/json' \
 --data '{
     "pickUpLocation": "150 East 52nd Street, New York, NY 10045",
     "dropOffLocation": "120 West 81st Street, New York, NY 10012",
     "riderId": 1
 }'

  • By generating a JWT token and then calling the Orkes API to execute the workflow.

Once the JWT token is generated, you need to make an HTTP request from Postman/cURL using the following command:

curl --location 'https://play.orkes.io/api/workflow' \
--header "Content-Type: application/json" \
--header 'X-Authorization: <JWT Token>' \
--request POST \
--data '{
    "name": "cab_service_saga_booking_wf",
    "version": 1,
    "input": {
        "pickUpLocation": "250 East 52nd Street, New York, NY 10045",
        "dropOffLocation": "120 West 81st Street, New York, NY 10012",
        "riderId": 1
    },
    "taskToDomain": {
        "*": "saga"
    }
}'

Replace <JWT Token> with your JWT token, and update the input parameters and the task-to-domain mapping as required.
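If you prefer to make the same call from code, the cURL request above can be mirrored with Python's standard library. This is only a sketch: `build_start_workflow_request` is a hypothetical helper, the token is assumed to have been generated already, and the function builds the request without sending it.

```python
import json
import urllib.request


def build_start_workflow_request(server_url, jwt_token, pickup, dropoff,
                                 rider_id, domain="saga"):
    """Build (but do not send) the POST request shown in the cURL command."""
    body = {
        "name": "cab_service_saga_booking_wf",
        "version": 1,
        "input": {
            "pickUpLocation": pickup,
            "dropOffLocation": dropoff,
            "riderId": rider_id,
        },
        # Route all tasks to workers registered under this domain.
        "taskToDomain": {"*": domain},
    }
    return urllib.request.Request(
        url=f"{server_url}/workflow",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Authorization": jwt_token,
        },
        method="POST",
    )


request = build_start_workflow_request(
    "https://play.orkes.io/api",
    "<JWT Token>",
    "250 East 52nd Street, New York, NY 10045",
    "120 West 81st Street, New York, NY 10012",
    1,
)
print(request.get_method(), request.full_url)

# To actually send it (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```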

The cab booking application is up and running smoothly! 🚘

Here’s what a compensation action for the cab booking application looks like:

[Figure: A simplistic model of compensation transaction in a cab booking application]

Since there are multiple services with separate databases, a local ACID transaction or a two-phase commit (2PC) may not be an ideal fit. Therefore, using the Saga pattern, the cab booking application can ensure that the entire booking process is executed consistently and can be compensated if any step fails.

With Orkes Conductor, while defining a workflow, you can set a failureWorkflow that is triggered when your main workflow fails. In your workflow definition, add the name of the workflow to be run when the current workflow fails:

"failureWorkflow": "<name of the workflow to be run on failure>",

Here is a sample compensation workflow for the cab booking application we discussed.

[Figure: sample compensation workflow for the cab booking application]

The cancellation workflow is defined so that, whenever a task fails, the relevant steps are taken to roll back the tasks that have already completed.

If the first service, the booking service, fails, the compensation is triggered like this:

Now, suppose the workflow fails at the payment service due to payment issues. The compensation workflow then runs like this:

Here, the payment is canceled in the payment service, and the driver assignment is then removed from the cab assignment service. Finally, the booking is canceled and the driver is removed from the booking service, and both the driver and the user are notified.

Ta-da 🎊! That’s how you roll back the completed services in your application using Orkes Conductor.

State Management

The second main advantage of the saga pattern is state management, that is, keeping track of the progress and state of the entire distributed transaction. In distributed applications, failures can occur at any point; therefore, it is essential to maintain the overall consistency of the transaction.

The saga pattern saves information such as the current step or task being executed, the data associated with the transaction, any compensating actions to be taken in case of failures, and the overall status of the application.

In an orchestrator-based saga pattern, the central orchestrator maintains the application's state. By effectively managing the state in a saga pattern, the system can ensure that complex distributed transactions are executed reliably and that overall data consistency is maintained despite failures.
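As a rough illustration of the kind of record an orchestrator might keep, the sketch below tracks the current step, the transaction data, the compensations to run on failure, and the overall status. The `SagaState` class and its field names are hypothetical and do not reflect Conductor's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class SagaState:
    """State an orchestrator could persist for one distributed transaction."""

    steps: list
    data: dict = field(default_factory=dict)
    current_index: int = 0
    status: str = "RUNNING"
    pending_compensations: list = field(default_factory=list)

    @property
    def current_step(self):
        # None once every step has completed.
        if self.current_index < len(self.steps):
            return self.steps[self.current_index]
        return None

    def complete_current_step(self, compensation):
        """Record how to undo the finished step, then move to the next one."""
        self.pending_compensations.append(compensation)
        self.current_index += 1
        if self.current_index == len(self.steps):
            self.status = "COMPLETED"

    def fail_current_step(self, reason):
        """Mark the saga as failed and return the compensations to run,
        most recent first."""
        self.status = "FAILED"
        self.data["failure_reason"] = reason
        return list(reversed(self.pending_compensations))


state = SagaState(steps=["book_ride", "assign_driver", "make_payment"])
state.complete_current_step("cancel_booking")
state.complete_current_step("unassign_driver")
print(state.current_step, state.status)
print(state.fail_current_step("card declined"))
```

Because the record holds everything needed to resume or undo the transaction, it can be queried at any point to answer "where is this booking, and what happens if it fails now?".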

With Orkes Conductor, the workflow can be queried at any point to get the transaction status, and it acts as an aggregator across multiple services and their local backends. With its native error-handling capabilities, a robust orchestrator like Orkes Conductor can be a solution for implementing saga patterns in your distributed applications.

Summing Up

The Saga pattern is efficient for handling distributed transactions in applications with multiple services. With the saga pattern, businesses can ensure data consistency and reliable transaction execution, even in the case of failures.

At Orkes, we offer a powerful orchestration platform, ‘Orkes Conductor’, for building your distributed applications at lightning speed. With Conductor, enterprises can automate their business processes, mitigating potential issues and thus improving overall customer satisfaction.

Whether you’re running a financial institution, healthcare institution, or any business that involves an application with multiple services, implementing the Saga pattern with Orkes Conductor can help streamline your business workflows. Embracing this powerful pattern can improve the overall efficiency of your distributed applications.

We offer Orkes Cloud, the enterprise version of Conductor, on all major cloud platforms, including AWS, Azure, and GCP. In addition, if you want to try Conductor for free, you can use Orkes Playground, a free developer sandbox.

Please feel free to contact us on our community Slack for any queries.
