
Comparing Distributed Transaction Patterns for Microservices

As a consulting architect at Red Hat, I've had the privilege of working on legions of customer projects. Every customer brings their own challenges but I've found some commonalities. One thing most customers want to know is how to coordinate writes to more than one system of record. Answering this question typically involves a long explanation of dual writes, distributed transactions, modern alternatives, and the possible failure scenarios and drawbacks of each approach. Typically, this is the moment when a customer realizes that splitting a monolithic application into microservices is a long and complicated journey, and usually requires tradeoffs.

Rather than go down the rabbit hole of discussing transactions in-depth, this article summarizes the main approaches and patterns for coordinating writes to multiple resources. I’m aware that you might have good or bad past experiences with one or more of these approaches. But in practice, in the right context and with the right constraints, all of these methods work fine. Tech leads are responsible for choosing the best approach for their context.

Note: If you are interested in dual writes, watch my Red Hat Summit 2021 session, where I covered the dual write challenges in depth. You can also skim through the slides from my presentation. Currently, I am involved with Red Hat OpenShift Streams for Apache Kafka, a fully managed Apache Kafka service. It takes less than a minute to start and is completely free during the trial period. Give it a try and help us shape it with your early feedback. If you have questions or comments about this article, reach out to me on Twitter @bibryam. Now, let's get started.

The dual write problem

The single indicator that you may have a dual write problem is the need to write to more than one system of record reliably. This requirement might not be obvious, and it can manifest itself in different ways in the distributed systems design process. For example:

  • You have chosen the best tools for each job and now you have to update a NoSQL database, a search index, and a cache as part of a single business transaction.
  • The service you have designed has to update its database and also send a notification to another service about the change.
  • You have business transactions that span multiple service boundaries.
  • You may have to implement service operations as idempotent because consumers have to retry failed invocations.

For this article, we'll use a single example scenario to evaluate the various approaches to handling dual writes in distributed transactions. Our scenario is a client application that invokes a microservice on a mutating operation. Service A has to update its database, but it also has to call Service B on a write operation, as illustrated in Figure 1. The actual type of the database and the protocol of the service-to-service interaction are irrelevant for our discussion, as the problem remains the same.

Figure 1: The dual write problem in microservices.

A small but critical clarification explains why there are no simple solutions to this problem. If Service A writes to its database and then sends a notification to a queue for Service B (let's call it a local-commit-then-publish approach), there is still a chance the application won't work reliably. While Service A writes to its database and then sends the message to the queue, there is a small probability of the application crashing after the commit to the database and before the second operation, which would leave the system in an inconsistent state. If the message is sent before writing to the database (let's call this approach publish-then-local-commit), there is a possibility of the database write failing, or of timing issues where Service B receives the event before Service A has committed the change to its database. In either case, this scenario involves dual writes to a database and a queue, which is the core problem we are going to explore. In the next sections, I will discuss the various implementation approaches available today for this always-present challenge.
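To make the failure window concrete, here is a minimal sketch of the local-commit-then-publish flow. The type names (OrderRepository, EventPublisher) are hypothetical illustrations, not part of the example above:

```java
// A minimal sketch of local-commit-then-publish; all type names are hypothetical.
interface OrderRepository { void save(String orderId); }   // commits to Database A
interface EventPublisher { void publish(String event); }   // sends to the message queue

class ServiceA {
    private final OrderRepository repository;
    private final EventPublisher publisher;

    ServiceA(OrderRepository repository, EventPublisher publisher) {
        this.repository = repository;
        this.publisher = publisher;
    }

    void createOrder(String orderId) {
        // Step 1: local transaction commits the change to Database A.
        repository.save(orderId);

        // If the process crashes here, Database A holds the new order but the
        // notification for Service B is never sent. The two writes are not
        // atomic -- that gap is the dual write problem.

        // Step 2: publish the notification that Service B consumes.
        publisher.publish("OrderCreated:" + orderId);
    }
}
```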

The modular monolith

Developing your application as a modular monolith might seem like a hack or going backward in architectural evolution, but I have seen it work fine in practice. It is not a microservices pattern but an exception to the microservices rule that can be combined cautiously with microservices. When strong write consistency is the driving requirement, more important even than the ability to deploy and scale microservices independently, then you could go with the modular monolith architecture.

Having a monolithic architecture does not imply that the system is poorly designed or bad. It does not say anything about quality. As the name suggests, it is a system designed in a modular way with exactly one deployment unit. Note that this is a purposefully designed and implemented modular monolith, which is different from an accidentally created monolith that grows over time. In a purposeful modular monolith architecture, every module follows the microservices principles. Each module encapsulates all the access to its data, but the operations are exposed and consumed as in-memory method calls.

The architecture of a modular monolith

With this approach, you have to convert both microservices (Service A and Service B) into library modules that can be deployed into a shared runtime. You then make both microservices share the same database instance. Because the services are written and deployed as libraries in a common runtime, they can participate in the same transactions. Because the modules share a database instance, you can use a local transaction to commit or rollback all changes at once. There are also differences around the deployment method because we want the modules to be deployed as libraries within a bigger deployment, and to participate in existing transactions.

Even in a monolithic architecture, there are ways to isolate the code and data. For example, you can segregate the modules into separate packages, build modules, and source code repositories, which can be owned by different teams. You can do partial data isolation by grouping tables by naming convention, schemas, database instances, or even by database servers. The diagram in Figure 2, inspired by Axel Fontaine's talk on majestic modular monoliths, illustrates the different code- and data-isolation levels in applications.

 Figure 2: Levels of code and data isolation for applications.

The last piece of the puzzle is to use a runtime and a wrapper service capable of consuming other modules and including them in the context of an existing transaction. All of these constraints make the modules more tightly coupled than typical microservices, but the benefit is that the wrapper service can start a transaction, invoke the library modules to update their databases, and commit or roll back the transaction as one operation, without concerns about partial failure or eventual consistency.
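As a minimal sketch of what that wrapper could look like, assuming a Spring-based shared runtime with a single DataSource and transaction manager, the modules are invoked as in-memory calls inside one local transaction. The ModuleA and ModuleB names are illustrative:

```java
// A hedged sketch of a wrapper service in a modular monolith: both library
// modules share one DataSource, so one local transaction covers both calls.
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

interface ModuleA { void updateOrder(String orderId); }     // writes its own group of tables
interface ModuleB { void updateShipment(String orderId); }  // writes its own group of tables

@Service
class WrapperService {
    private final ModuleA moduleA;
    private final ModuleB moduleB;

    WrapperService(ModuleA moduleA, ModuleB moduleB) {
        this.moduleA = moduleA;
        this.moduleB = moduleB;
    }

    // One local transaction spans both in-memory module calls: either both
    // sets of tables are updated, or the whole change is rolled back.
    @Transactional
    public void processOrder(String orderId) {
        moduleA.updateOrder(orderId);
        moduleB.updateShipment(orderId);
    }
}
```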

In our example, illustrated in Figure 3, we have converted Service A and Service B into libraries and deployed them into a shared runtime, or one of the services could act as the shared runtime. The database tables of both services also live in a single shared database instance, separated into groups of tables managed by the respective library services.

 Figure 3: Modular monolith with a shared database.

Benefits and drawbacks of the modular monolith

In some industries, it turns out that the benefits of this architecture are far more important than the faster delivery and pace of change that are so highly valued elsewhere. Table 1 summarizes the benefits and drawbacks of the modular monolith architecture.

Table 1: Benefits and drawbacks of the modular monolith architecture.
Benefits
  • Simple transaction semantics, with local transactions ensuring data consistency, read-your-writes semantics, rollbacks, and so on.
Drawbacks
  • A shared runtime prevents us from independently deploying and scaling modules, and prevents failure isolation.
  • The logical separation of tables in a single database is not strong. With time, it can turn into a shared integration layer.
  • Sharing a runtime and a transaction context requires coordination during the development stage and increases the coupling between services.
Examples
  • Runtimes such as Apache Karaf and WildFly that allow modular and dynamic deployment of services.
  • Apache Camel’s direct and direct-vm components allow exposing operations for in-memory invocations and preserve transaction contexts within a JVM process.
  • Apache Isis is one of the best examples of the modular monolith architecture. It enables domain-driven application development by automatically generating a UI and REST APIs for your Spring Boot applications.
  • Apache OFBiz is another example of a modular monolith and service-oriented architecture (SOA). It is a comprehensive enterprise resource planning system with hundreds of tables and services that can automate enterprise business processes. Despite its size, its modular architecture allows developers to quickly understand and customize it.

Two-phase commit

Distributed transactions based on the two-phase commit (2PC) protocol are typically the last resort, used in a variety of situations:

  • When writes to disparate resources cannot be eventually consistent.
  • When we have to write to heterogeneous data sources.
  • When exactly-once message processing is required and we cannot refactor a system and make its operations idempotent.
  • When integrating with third-party black-box systems or legacy systems that implement the two-phase commit specification.

In all of these situations, when scalability is not a concern, we might consider distributed transactions an option.

Implementing the two-phase commit architecture

The technical requirements for two-phase commit are that you need a distributed transaction manager, such as Narayana, and a reliable storage layer for the transaction logs. You also need DTP XA-compatible data sources with associated XA drivers that are capable of participating in distributed transactions, such as relational databases, message brokers, and caches. If you are lucky enough to have the right data sources but run in a dynamic environment, such as Kubernetes, you also need an operator-like mechanism to ensure there is only a single instance of the distributed transaction manager. The transaction manager must be highly available and must always have access to the transaction log.

For implementation, you could explore a Snowdrop Recovery Controller that uses the Kubernetes StatefulSet pattern for singleton purposes and persistent volumes to store transaction logs. In this category, I also include specifications such as Web Services Atomic Transaction (WS-AtomicTransaction) for SOAP web services. What all of these technologies have in common is that they implement the XA specification and have a central transaction coordinator.
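As a rough illustration of what these technologies provide, here is a hedged sketch of a JTA transaction spanning a database and a message broker. It assumes a runtime (such as WildFly or a Narayana-enabled application) that supplies the UserTransaction and XA-capable resources; the table, queue, and resource names are illustrative:

```java
// A hedged sketch of an XA transaction across Database A and a JMS broker,
// coordinated by a JTA transaction manager.
import jakarta.jms.ConnectionFactory;
import jakarta.jms.JMSContext;
import jakarta.transaction.UserTransaction;
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;

class TwoPhaseCommitSketch {

    void saveAndNotify(UserTransaction tx, DataSource xaDataSource,
                       ConnectionFactory xaConnectionFactory, String orderId) throws Exception {
        tx.begin();
        try (Connection db = xaDataSource.getConnection();
             JMSContext jms = xaConnectionFactory.createContext()) {

            // 1. Write to Database A through an XA-aware data source.
            try (PreparedStatement stmt =
                         db.prepareStatement("INSERT INTO orders(id) VALUES (?)")) {
                stmt.setString(1, orderId);
                stmt.executeUpdate();
            }

            // 2. Send the message through an XA-aware JMS connection factory.
            jms.createProducer().send(jms.createQueue("orders"), "OrderCreated:" + orderId);

            // 3. The transaction manager runs the two-phase commit across both resources.
            tx.commit();
        } catch (Exception e) {
            tx.rollback();
            throw e;
        }
    }
}
```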

In our example, shown in Figure 4, Service A is using distributed transactions to commit all changes to its database and a message to a queue without leaving any chance for duplicates or lost messages. Similarly, Service B can use distributed transactions to consume the messages and commit to Database B in a single transaction without any duplicates. Or, Service B can choose not to use distributed transactions, but use local transactions and implement the idempotent consumer pattern. For the record, a more appropriate example for this section would be using WS-AtomicTransaction to coordinate the writes to Database A and Database B in a single transaction and avoid eventual consistency altogether. But that approach is even less common these days than what I've described.

Figure 4: Two-phase commit spanning between a database and a message broker.

Benefits and drawbacks of the two-phase commit architecture

The two-phase commit protocol offers similar guarantees to local transactions in the modular monolith approach, but with a few exceptions. Because there are two or more separate data sources involved in an atomic update, they may fail in a different manner and block the transaction. But thanks to its central coordinator, it is still easy to discover the state of the distributed system compared to the other approaches I will discuss.

Table 2 summarizes the benefits and drawbacks of this approach.

Table 2: Benefits and drawbacks of two-phase commit.
Benefits
  • Standard-based approach with out-of-the-box transaction managers and supporting data sources.
  • Strong data consistency for the happy scenarios.
Drawbacks
  • Scalability constraints.
  • Possible recovery failures when the transaction manager fails.
  • Limited data source support.
  • Storage and singleton requirements in dynamic environments.
Examples
  • The Jakarta Transactions API (formerly Java Transaction API)
  • WS-AtomicTransaction
  • JTS/IIOP
  • eBay’s GRIT
  • Atomikos
  • Narayana
  • Message brokers such as Apache ActiveMQ
  • Relational data sources that implement the XA spec
  • In-memory data stores such as Infinispan

Orchestration

With a modular monolith, we use local transactions and we always know the state of the system. With distributed transactions based on the two-phase commit protocol, we also guarantee a consistent state. The only exception would be an unrecoverable failure that involved the transaction coordinator. But what if we wanted to ease the consistency requirements while still knowing the state of the overall distributed system and coordinating from a single place? In this case, we might consider an orchestration approach, where one of the services acts as the coordinator and orchestrator of the overall distributed state change. The orchestrator service has the responsibility to call other services until they reach the desired state or take corrective actions if they fail. The orchestrator uses its local database to keep track of state changes, and it is responsible for recovering any failures related to state changes.

Implementing an orchestration architecture

The most popular implementations of the orchestration technique are BPMN specification implementations such as the jBPM and Camunda projects. The need for such systems doesn't disappear with overly distributed architectures such as microservices or serverless; on the contrary, it increases. For proof, we can look to newer stateful orchestration engines that do not follow a specification but provide similar stateful behavior, such as Netflix's Conductor, Uber's Cadence, and Apache Airflow. Serverless stateful functions such as AWS Step Functions, Azure Durable Functions, and Azure Logic Apps are in this category as well. There are also open source libraries that allow you to implement stateful coordination and rollback behavior, such as Apache Camel's Saga pattern implementation and the NServiceBus Saga capability. The many homegrown systems implementing the Saga pattern are also in this category.

 

Figure 5: Orchestrating distributed transactions between two services.

In our example diagram, shown in Figure 5, we have Service A acting as the stateful orchestrator, responsible for calling Service B and recovering from failures through a compensating operation if needed. The crucial characteristic of this approach is that Service A and Service B have local transaction boundaries, but Service A has the knowledge and the responsibility to orchestrate the overall interaction flow. That is why its transaction boundary touches the Service B endpoints. In terms of implementation, we could set this up with synchronous interactions, as shown in the diagram, or using a message queue between the services (in which case you could use a two-phase commit, too).
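The following simplified, hypothetical sketch shows the shape of such an orchestrator: Service A persists the flow state in its own database, calls Service B, and triggers a compensating operation when the call ultimately fails. The StateStore and ServiceBClient abstractions are illustrative and not tied to any of the products listed below:

```java
// A hypothetical orchestrator sketch: local state tracking plus compensation.
interface StateStore {                        // backed by Service A's local database
    void save(String requestId, String state);
}

interface ServiceBClient {
    void update(String requestId);            // forward operation on Service B
    void compensate(String requestId);        // corrective/rollback operation on Service B
}

class OrchestratorServiceA {
    private final StateStore stateStore;
    private final ServiceBClient serviceB;

    OrchestratorServiceA(StateStore stateStore, ServiceBClient serviceB) {
        this.stateStore = stateStore;
        this.serviceB = serviceB;
    }

    void process(String requestId) {
        stateStore.save(requestId, "A_UPDATED");      // local transaction in Database A

        try {
            serviceB.update(requestId);               // ideally idempotent so it can be retried
            stateStore.save(requestId, "COMPLETED");
        } catch (Exception e) {
            serviceB.compensate(requestId);           // roll the global state back
            stateStore.save(requestId, "COMPENSATED");
        }
    }
}
```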

Benefits and drawbacks of orchestration

Orchestration is an eventually consistent approach that may involve retries and rollbacks to get the distributed system into a consistent state. While it avoids the need for distributed transactions, orchestration requires the participating services to offer idempotent operations in case the coordinator has to retry an operation. Participating services also must offer recovery endpoints in case the coordinator decides to roll back and fix the global state. The big advantage of this approach is the ability to drive heterogeneous services that might not support distributed transactions into a consistent state by using only local transactions. The coordinator and the participating services need only local transactions, and it is always possible to discover the state of the system by asking the coordinator, even if it is in a partially consistent state. Doing that is not possible with the other approaches I will describe.

Table 3: Benefits and drawbacks of orchestration.
Benefits
  • Coordinates state among heterogeneous distributed components.
  • No need for XA transactions.
  • Known distributed state at the coordinator level.
Drawbacks
  • Complex distributed programming model.
  • May require idempotency and compensating operations from the participating services.
  • Eventual consistency.
  • Possibly unrecoverable failures during compensations.
Examples
  • jBPM
  • Camunda
  • MicroProfile Long Running Actions
  • Conductor
  • Cadence
  • Step Functions
  • Durable Functions
  • Apache Camel Saga pattern implementation
  • NServiceBus Saga pattern implementation
  • The CNCF Serverless Workflow specification
  • Homegrown implementations

Choreography

As you've seen in the discussion so far, a single business operation can result in multiple calls among services, and it can take an indeterminate amount of time before a business transaction is processed end-to-end. To manage this, the orchestration pattern uses a centralized controller service that tells the participants what to do.

An alternative to orchestration is choreography, which is a style of service coordination where participants exchange events without a centralized point of control. With this pattern, each service performs a local transaction and publishes events that trigger local transactions in other services. Each component of the system participates in decision-making about a business transaction's workflow, instead of relying on a central point of control. Historically, the most common implementation for the choreography approach was using an asynchronous messaging layer for the service interactions. Figure 6 illustrates the basic architecture of the choreography pattern.

 
Figure 6: Service choreography through a messaging layer.

Choreography with a dual write

For message-based choreography to work, we need each participating service to execute a local transaction and trigger the next service by publishing a command or event to a messaging infrastructure. Similarly, other participating services have to consume a message and perform a local transaction. That in itself is a dual-write problem within a higher-level dual-write problem. When we develop a messaging layer with a dual write to implement the choreography approach, we could design it as a two-phase commit that spans a local database and a message broker. I covered that approach earlier. Alternatively, we might use a publish-then-local-commit or local-commit-then-publish pattern:

  • Publish-then-local-commit: We could try to publish a message first and then commit the local transaction. While this option might sound fine, it has practical challenges. For example, very often you need to publish an ID that is generated by the local transaction commit, which won't yet be available to publish. Also, the local transaction might fail, but we cannot roll back the published message. This approach lacks read-your-writes semantics, and it is an impractical solution for most use cases.
  • Local-commit-then-publish: A slightly better approach would be to commit the local transaction first and then publish the message. This has a small probability of failure occurring after the local transaction has been committed and before publishing the message. But even in that case, you could design your services to be idempotent and retry the operation. That would mean committing the local transaction again and then publishing the message. This approach can work if you control the downstream consumers and can make them idempotent, too; see the sketch after this list. It's also a pretty good implementation option overall.
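As a rough illustration of that second option, here is a hedged sketch of local-commit-then-publish with retries. It assumes both the local write and the downstream consumer are idempotent; the InventoryRepository and EventPublisher names are hypothetical:

```java
// Local-commit-then-publish with idempotent retries; all names are illustrative.
interface InventoryRepository { void upsert(String itemId, int quantity); }  // idempotent local write
interface EventPublisher { void publish(String eventId, String payload); }   // consumer dedups by eventId

class CommitThenPublishHandler {
    private final InventoryRepository repository;
    private final EventPublisher publisher;

    CommitThenPublishHandler(InventoryRepository repository, EventPublisher publisher) {
        this.repository = repository;
        this.publisher = publisher;
    }

    void handle(String eventId, String itemId, int quantity) {
        // On a crash between the two steps the whole handler is re-run: the upsert
        // commits the same state again and the consumer ignores the duplicate event,
        // so repeating both operations is safe.
        repository.upsert(itemId, quantity);                   // 1. local transaction first
        publisher.publish(eventId, itemId + ":" + quantity);   // 2. then publish
    }
}
```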

Choreography without a dual write

The various ways of implementing choreography without a dual write all constrain every service to write to only a single data source with a local transaction, and nowhere else. Let's see how that could work.

Let’s say Service A receives a request and writes it to Database A, and nowhere else. Service B periodically polls Service A and detects new changes. When it reads the change, Service B updates its own database with the change and also the index or timestamp up to which it picked up the changes. The critical part here is the fact that both services only write to their own database and commit with a local transaction. This approach, illustrated in Figure 7, can be described as service choreography, or we could describe it using the good old data pipeline terminology. The possible implementation options are more interesting.


Figure 7: Service choreography through polling.

The simplest scenario is for Service B to connect to the Service A database and read the tables owned by Service A. The industry tries to avoid that level of coupling with shared tables, however, and for a good reason: Any change in Service A's implementation and data model could break Service B. We can make a few gradual improvements to this scenario, for example by using the Outbox pattern and giving Service A a table that acts as a public interface. This table could contain only the data Service B requires, and it could be designed to be easy to query and track for changes. If that is not good enough, a further improvement would be for Service B to ask Service A for any changes through an API management layer rather than connecting directly to Database A.
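To make the outbox-table variant concrete, here is a minimal JDBC sketch of Service B's polling step. The outbox, replicated_orders, and poll_offset tables (and the PostgreSQL-style upsert) are illustrative assumptions, not part of the original example:

```java
// A polling consumer sketch: Service B reads new rows from Service A's outbox
// table and commits the copied data and its read offset in one local transaction.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class ServiceBPoller {

    void pollOnce(Connection serviceADb, Connection serviceBDb, long lastSeenId) throws SQLException {
        try (PreparedStatement query = serviceADb.prepareStatement(
                "SELECT id, aggregate_id, payload FROM outbox WHERE id > ? ORDER BY id")) {
            query.setLong(1, lastSeenId);

            try (ResultSet rs = query.executeQuery()) {
                serviceBDb.setAutoCommit(false);
                while (rs.next()) {
                    applyChange(serviceBDb, rs.getString("aggregate_id"), rs.getString("payload"));
                    lastSeenId = rs.getLong("id");
                }
                storeOffset(serviceBDb, lastSeenId);   // remember how far we have read
                serviceBDb.commit();                   // single local transaction in Database B
            }
        }
    }

    private void applyChange(Connection db, String aggregateId, String payload) throws SQLException {
        try (PreparedStatement update = db.prepareStatement(
                "INSERT INTO replicated_orders(id, payload) VALUES (?, ?) "
                        + "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload")) {
            update.setString(1, aggregateId);
            update.setString(2, payload);
            update.executeUpdate();
        }
    }

    private void storeOffset(Connection db, long offset) throws SQLException {
        try (PreparedStatement update = db.prepareStatement(
                "UPDATE poll_offset SET last_seen_id = ?")) {
            update.setLong(1, offset);
            update.executeUpdate();
        }
    }
}
```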

Fundamentally, all of these variations suffer from the same drawback: Service B has to poll Service A continuously. Doing this can lead to unnecessary continuous load on the system or unnecessary delay in picking up the changes. Polling a microservice for changes is a hard sell, so let’s see what we can do to further improve this architecture.

Choreography with Debezium

One way to improve a choreography architecture and make it more attractive is to introduce a tool like Debezium, which we can use to perform change data capture (CDC) using Database A’s transaction log. Figure 8 illustrates this approach.

 Figure 8: Service choreography with change data capture.

Debezium can monitor a database's transaction log, perform any necessary filtering and transformation, and deliver relevant changes into an Apache Kafka topic. This way, Service B can listen to generic events in a topic rather than polling Service A's database or APIs. Swapping database polling for streaming changes, and introducing a queue between the services, makes the distributed system more reliable and scalable, and it opens up the possibility of introducing other consumers for new use cases. Using Debezium offers an elegant way to implement the Outbox pattern for orchestration- or choreography-based Saga pattern implementations.

A side effect of this approach is that it introduces the possibility of Service B receiving duplicate messages. This can be addressed by making the service idempotent, either at the business logic level or with a technical deduplicator (using something like Apache ActiveMQ Artemis's duplicate message detection or Apache Camel's idempotent consumer pattern).
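A deduplicator can be as simple as tracking the identifiers of already-processed events. The sketch below is a minimal in-memory illustration, assuming every change event carries a unique ID; a production setup would keep the processed IDs in a durable store or use the broker-level features mentioned above:

```java
// A minimal technical deduplicator for Service B; not a specific product's API.
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class IdempotentConsumer {
    // In practice this set would live in a durable store shared by all instances.
    private final Set<String> processedEventIds = Collections.synchronizedSet(new HashSet<>());

    void onEvent(String eventId, Runnable businessLogic) {
        // add() returns false when the id was already present, i.e. a duplicate delivery.
        if (processedEventIds.add(eventId)) {
            businessLogic.run();
        }
        // Duplicates are silently dropped, which makes redelivery from the topic safe.
    }
}
```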

Choreography with event sourcing

Event sourcing is another implementation of the service choreography approach. With this pattern, the state of an entity is stored as a sequence of state-changing events. When there is a new update, rather than updating the entity's state, a new event is appended to the list of events. Appending new events to an event store is an atomic operation done in a local transaction. The beauty of this approach, shown in Figure 9, is that the event store also behaves like a message queue for other services to consume updates.

Figure 9: Service choreography through event sourcing.

Our example, when converted to use event sourcing, would store client requests in an append-only event store. Service A can reconstruct its current state by replaying the events. The event store also needs to allow Service B to subscribe to the same update events. With this mechanism, Service A uses its storage layer also as the communication layer with other services. While this mechanism is very neat and solves the problem of reliably publishing events whenever the state change occurs, it introduces a new programming style unfamiliar to many developers and additional complexity around state reconstruction and message compaction, which require specialized data stores.
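The following simplified sketch illustrates the two sides of that idea: state changes are only ever appended to an event store, and the current state is rebuilt by replaying the events. The AccountEvent hierarchy and InMemoryEventStore are illustrative stand-ins, not a specific event sourcing product:

```java
// A toy event sourcing sketch: append-only writes plus state reconstruction by replay.
import java.util.ArrayList;
import java.util.List;

sealed interface AccountEvent permits Deposited, Withdrawn {}
record Deposited(long amount) implements AccountEvent {}
record Withdrawn(long amount) implements AccountEvent {}

class InMemoryEventStore {
    private final List<AccountEvent> events = new ArrayList<>();

    synchronized void append(AccountEvent event) {   // the only write operation: append
        events.add(event);
    }

    synchronized List<AccountEvent> readAll() {      // other services can subscribe/read from here
        return List.copyOf(events);
    }
}

class Account {
    // Reconstruct the current balance by replaying the full event history.
    static long currentBalance(InMemoryEventStore store) {
        long balance = 0;
        for (AccountEvent event : store.readAll()) {
            if (event instanceof Deposited d) balance += d.amount();
            else if (event instanceof Withdrawn w) balance -= w.amount();
        }
        return balance;
    }
}
```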

Benefits and drawbacks of choreography

Regardless of the mechanism used to retrieve data changes, the choreography approach decouples writes, allows independent service scalability, and improves overall system resiliency. The downside of this approach is that the flow of decision-making is decentralized and it is hard to discover the globally distributed state. Discovering the state of a request requires querying multiple data sources, which can be challenging with a large number of services. Table 4 summarizes the benefits and drawbacks of this approach.

Table 4: Benefits and drawbacks of choreography.
Benefits
  • Decouples implementation and interaction.
  • No central transaction coordinator.
  • Improved scalability and resilience characteristics.
  • Near real-time interactions.
  • Less overhead on the system with Debezium and similar tools.
Drawbacks
  • The global system state and coordination logic is scattered across all participants.
  • Eventual consistency.
Examples
  • Change data capture with Debezium and Apache Kafka
  • The Outbox pattern
  • Event sourcing

Parallel pipelines

With the choreography pattern, there is no central place to query the state of the system, but there is a sequence of services that propagates the state through the distributed system. Choreography creates a sequential pipeline of processing services, so we know that when a message reaches a certain step of the overall process, it has passed all the previous steps. What if we could loosen this constraint and process all the steps independently? In this scenario, Service B could process a request regardless of whether Service A had processed it or not.

With parallel pipelines, we add a router service that accepts requests and forwards them to Service A and Service B through a message broker in a single local transaction. From this step onward, as shown in Figure 10, both services can process the requests independently and in parallel.

Figure 10: Processing through parallel pipelines.

While this pattern is very simple to implement, it is only applicable to situations where there is no temporal binding between the services. For example, Service B should be able to process the request regardless of whether Service A has processed the same request. Also, this approach requires either an additional router service or the client to be aware of both Service A and Service B so it can target the messages.
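Since Table 5 below lists Apache Camel's multicast as an example, here is a hedged sketch of what the router step could look like as a Camel route. The endpoint URIs are illustrative and assume a configured ActiveMQ component:

```java
// A hedged sketch of the parallel pipelines router using Camel's multicast EIP.
import org.apache.camel.builder.RouteBuilder;

class ParallelPipelinesRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("activemq:queue:incoming-requests")
            // The router forwards every request to both services; each branch is an
            // independent pipeline that is processed in parallel.
            .multicast()
                .parallelProcessing()
                .to("activemq:queue:service-a-requests", "activemq:queue:service-b-requests")
            .end();
    }
}
```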

Listen to yourself

There is a lighter alternative to this approach, known as the Listen to yourself pattern, where one of the services also acts as the router. With this alternative approach, when Service A receives a request, it would not write to its database but would instead publish the request into the messaging system, where it is targeted to Service B, and to itself. Figure 11 illustrates this pattern.

 Figure 11: The Listen to yourself pattern.

The reason for not writing to the database first is to avoid the dual write. Once a message is in the messaging system, it goes to Service B, and it also goes back to Service A in a completely separate transaction context. With that twist of the processing flow, Service A and Service B can independently process the request and write to their respective databases.

Benefits and drawbacks of parallel pipelines

Table 5 summarizes the benefits and drawbacks of using parallel pipelines.

Table 5: Benefits and drawbacks of parallel pipelines.
Benefits Simple, scalable architecture for parallel processing.
Drawbacks Requires temporally decoupled processing steps; hard to reason about the global system state.
Examples Apache Camel’s multicast and splitter with parallel processing.

How to choose a distributed transactions strategy

As you might have already guessed from this article, there is no right or wrong pattern for handling distributed transactions in a microservices architecture. Every pattern has its pros and cons. Each pattern solves some problems while generating others in turn. The chart in Figure 12 offers a short summary of the main characteristics of the dual write patterns I've discussed.

 Figure 12: Characteristics of dual write patterns.

Whatever approach you choose, you will need to explain and document the motivation behind the decision and the long-lasting architectural consequences of your choice. You will also need to get support from the teams that will implement and maintain the system in the long term. I like to organize and evaluate the approaches described in this article based on their data consistency and scalability attributes, as shown in Figure 13.
 

 Figure 13: Relative data consistency and scalability characteristics of dual write patterns.

As a good starting point, we could evaluate the various approaches from the most scalable and highly available to the least scalable and available ones.

High: Parallel pipelines and choreography

If your steps are temporally decoupled, then it could make sense to run them with the parallel pipelines approach. The chances are you can apply this pattern for certain parts of the system, but not for all of them. Next, assuming there is a temporal coupling between the processing steps, and certain operations and services have to happen before others, you might consider the choreography approach. Using service choreography, it is possible to create a scalable, event-driven architecture where messages flow from service to service through a decentralized choreography process. In this case, Outbox pattern implementations with Debezium and Apache Kafka (such as Red Hat OpenShift Streams for Apache Kafka) are particularly interesting and gaining traction.

Medium: Orchestration and two-phase commit

If choreography is not a good fit, and you need a central point that is responsible for coordination and decision-making, then you might consider orchestration. This is a popular architecture, with standards-based and custom open source implementations available. While a standards-based implementation may force you to use certain transaction semantics, a custom orchestration implementation allows you to make a trade-off between the desired data consistency and scalability.

Low: Modular monolith

If you are moving further left on the spectrum, most likely you have a very strong need for data consistency and you are ready to pay for it with significant tradeoffs. In this case, distributed transactions through two-phase commit will work with certain data sources, but they are difficult to implement reliably in dynamic cloud environments designed for scalability and high availability. In that case, you can go all the way to the good old modular monolith approach, accompanied by practices learned from the microservices movement. This approach ensures the highest data consistency, but at the price of runtime and data source coupling.

Conclusion

In a sizable distributed system with tens of services, there won’t be a single approach that works for all, but a few of these combined and applied for different contexts. You might have a few services deployed on a shared runtime for exceptional requirements around data consistency. You might choose a two-phase commit for integration with a legacy system that supports JTA. You might orchestrate a complex business process, and also use choreography and parallel processing for the rest of the services. In the end, it doesn't matter what strategy you pick; what matters is choosing a strategy deliberately for the right reasons, and executing it.

This post was originally published on Red Hat Developers. To read the original post, check here.

Turning Microservices Inside-Out

There is a fantastic talk by Martin Kleppmann called “Turning the database inside-out”. Once watched, it can change your perspective on databases and event logs irreversibly. While I agree with the outlined limitations of databases and the benefits of event logs, I'm not convinced of the practicality of replacing databases with event logs.

I believe the same design principles used for turning databases inside-out should instead be applied at a higher level, the service design level, to ensure microservices stream changes from the inside out. With that twist, within the services we can keep using traditional databases for what they are best at, efficiently working with mutable state, and use event logs to reliably propagate changes among services. With the help of frameworks such as Debezium, which can act as the connecting tissue between databases and event logs, we can benefit from time-tested and familiar database technology and from modern event logs, such as Red Hat's managed Apache Kafka service, at the same time. This inside-out mindset requires a deliberate focus on offering outbound APIs in microservices to stream all relevant state changes and domain events from within the service to the outside world. This merging of the microservices movement with emerging event-driven trends is what I call turning the microservices data inside-out.

Microservices API types

To build up this idea, I will look at microservices from the point of view of the different API types they provide and consume. A common way to describe microservices is as independently deployed components, built around a business domain, that own their data and are exposed over APIs. That is very similar to how databases are described in the talk mentioned above: a black box with a single API that goes in and out.


Data flowing from microservices’ inbound to outbound APIs

I believe a better way to think about microservices is one where every microservice is composed of inbound and outbound APIs through which the data flows, and a meta API that describes them. While inbound APIs are well known today, outbound APIs are not used as much, and the responsibilities of the meta APIs are spread across various tools and proliferating microservices technologies. To make the inside-out approach work, we need to make outbound and meta APIs first-class microservices constructs and improve the tooling and practices around these areas.

Inbound APIs

Inbound APIs are what every microservice has today in the form of service endpoints. These APIs are outside-in, and they allow outside systems to interact with the service directly through commands and queries or indirectly through events.

Inbound APIs are the norm in microservices today

In terms of implementation, these are typically REST-based APIs that offer mutating or read-only operations for synchronous interactions, fronted by a load-balancing gateway. They can also be implemented as queues for asynchronous command-based interactions, or as topics for event-based interactions. The responsibilities and governance of these APIs are well understood, and they form the majority of the microservices API landscape today.

Outbound APIs

What I refer to as outbound APIs here are the interactions that originate from within the service and go out to other services and systems. The majority of these are queries and commands initiated by the service and targeted at dependent services owned by somebody else. What I also put under this category are the outbound events that originate from within the service. Outbound events are different from the queries and commands targeted at a particular endpoint, because an outbound event is defined by the service without concrete knowledge of its existing and possible future recipients. Regardless of the indirect nature of the API, there is still the expectation that these events are generated predictably and reliably for any significant change that happens within the service (typically caused by inbound interactions). Today, outbound events are often an afterthought. They are either created for the needs of a specific consumer that depends on them, or they are added later in the service lifecycle, not by the service owners but by other teams responsible for data replication. In both cases, the possible use cases of outbound events remain limited, which diminishes their potential.

The challenging part with outbound events is implementing a uniform and reliable notification mechanism for any change that happens within the service. To apply this approach uniformly in every microservice and for any kind of database, the tools have to be non-intrusive and developer-friendly. The lack of good frameworks that support this pattern, and the lack of proven practices and standards, are impediments preventing the adoption of outbound events as a common top-level microservices construct.

Outbound events implemented through change data capture

To implement outbound events, you could include the logic of updating a database and publishing an event to a messaging system in your application code, but that leads to the well-known dual-write problem. Or you could try to replace the traditional database with an event log, or use specialized event sourcing platforms. But if you consider that your most valuable resources in a project are the people and their proven tools and practices, replacing a fundamental component such as the database with something different will have a significant impact.

A better approach is to keep using the relational databases and all the surrounding tools and practices that have served us fine for decades, and complement your database with a connecting tissue such as Debezium (disclaimer: I'm the product manager for Debezium at Red Hat and I'm biased about it). I believe the best implementation approach for outbound events is the outbox pattern, which uses a single transaction to both perform the normal database update dictated by the service logic and insert a message into a dedicated outbox table within the same database. Once the transaction is written to the database's transaction log, Debezium picks up the outbox message from the log and sends it to Apache Kafka. This gives us nice properties such as "read your own writes" semantics, where a subsequent query to the service returns the newly persisted record, and at the same time we get reliable, asynchronous propagation of changes via Apache Kafka. Debezium can selectively capture changes from the database transaction logs, transform them, and publish them into Kafka in a uniform way, acting as the outbound eventing interface of the services. Debezium can be embedded into Java application runtimes as a library, or decoupled as a sidecar. It is a plug-and-play component you add to your service, regardless of whether it is a legacy service or created from scratch. It is the missing configuration-based outbound eventing API for any service.
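The following JDBC sketch shows the essence of the outbox pattern as described above: one local transaction applies the business change and inserts an event row into an outbox table, which Debezium later streams from the transaction log. The table and column names follow common outbox conventions but are illustrative assumptions:

```java
// An outbox pattern sketch: business write and outbox insert in one local transaction.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;

class OutboxWriter {

    void createOrder(Connection db, String orderId, String orderJson) throws SQLException {
        db.setAutoCommit(false);
        try (PreparedStatement order = db.prepareStatement(
                     "INSERT INTO orders(id, payload) VALUES (?, ?)");
             PreparedStatement outbox = db.prepareStatement(
                     "INSERT INTO outbox(id, aggregate_type, aggregate_id, type, payload) "
                             + "VALUES (?, ?, ?, ?, ?)")) {

            // 1. The normal business write.
            order.setString(1, orderId);
            order.setString(2, orderJson);
            order.executeUpdate();

            // 2. The event destined for other services, written in the same transaction.
            outbox.setString(1, UUID.randomUUID().toString());
            outbox.setString(2, "Order");
            outbox.setString(3, orderId);
            outbox.setString(4, "OrderCreated");
            outbox.setString(5, orderJson);
            outbox.executeUpdate();

            db.commit();   // both rows become visible atomically; Debezium streams the outbox row
        } catch (SQLException e) {
            db.rollback();
            throw e;
        }
    }
}
```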

Meta APIs

Today, meta APIs describe the inbound and outbound APIs, enabling their governance, discovery, and consumption. They are implemented in siloed tools around a specific technology. In my definition, an OpenAPI definition for a REST endpoint published to an API portal is an example of a meta API. An AsyncAPI definition for a messaging topic that is published to a schema registry is an example of a meta API, too. The schema change topic to which Debezium publishes database schema change events (which are different from the data change events) is an example of a meta API as well. There are various capabilities in other tools that describe data structures and the APIs serving them, and these can all be classified as meta APIs. So in my definition, meta APIs are all the artifacts that allow different stakeholders to work with the service and enable other systems to use the inbound and outbound APIs.

The evolving responsibilities of Meta APIs

One of the fundamental design principles of microservices is to make them independently updatable and deployable. But today, there are still significant amounts of coordination required among service owners for upgrades that involve API changes. Service owners need better meta API tools to subscribe to updates from dependent services and prepare for changes in a timely manner. The meta API tools need to be integrated more deeply into development and operational activities to increase agility. Meta API tools today are siloed, passive, and disparate across the technology stack. Instead, meta API tools need to reflect the changing nature of service interactions towards an event-driven approach and play a more proactive role in automating some of the routine tasks of the development and operational teams.

Emerging trends

The rise of outbound events

Outbound events are already present as the preferred integration method for most modern platforms. Most cloud services emit events. Many data sources (such as CockroachDB changefeeds and MongoDB change streams) and even file systems (for example, Ceph notifications) can emit state change events. Custom-built microservices are not an exception here. Emitting state change or domain events is the most natural way for modern microservices to fit uniformly among the event-driven systems they are connected to, in order to benefit from the same tooling and practices. Outbound events are bound to become a top-level microservices design construct for many reasons. Designing services with outbound events can help replicate data during an application modernization process. Outbound events are also the enabler for implementing elegant inter-service interactions through the Outbox pattern, and for complex business transactions that span multiple services using a non-blocking Saga implementation. Outbound events fit nicely into the Distributed Data Mesh architecture, where a service is designed with its data consumers in mind. Data mesh claims that for data to fuel innovation, its ownership must be federated among domain data owners who are accountable for providing their data as products… In short, rather than having a centralized data engineering team replicate data from every microservice through an ETL process, it is better if microservices are owned jointly by developers and data engineers who design the services to make the data available in the first place. What better way to do that than with outbound events and real-time data streaming through Debezium, Apache Kafka, and a schema registry?

To sum up, outbound events align microservices with the Unix philosophy, where “the output of every program becomes the input of a yet unknown program”. To future-proof your services, you have to design them in a way that lets data flow from the inbound to the outbound APIs. This allows all the services to be developed and operated uniformly using modern event-oriented tools and patterns, and it unlocks yet unknown future uses of the data exposed through events.

Convergence of meta API tools

With the increasing adoption of event-driven architectures and faster pace of service evolution, the responsibilities and the importance of meta APIs are growing too. The scope of meta API tools is no longer limited to synchronous APIs but includes asynchronous APIs too. The meta APIs are expanding towards enabling faster development cycles by ensuring safe schema evolution through compatibility checks, notifications for updates, code generation for bindings, test simulations, and so forth. As a consumer of a service, I want to discover existing endpoints and data formats, the API compatibility rules, limits, and SLAs the service complies with in one place. At the same time, I want to get notifications for any changes that are coming, any deprecations, updates to the APIs, or any new APIs the service is going to offer that might be of interest to me. Not only that, developers are challenged to ship code faster and faster, and modern API tools can automate the process of schema and event structure discovery. Once a schema is discovered and added to the registry, a developer can quickly generate code bindings for their language and start developing in an IDE. Then, other tools could use the meta API definitions and generate tests and mocks, and simulate load by emitting dummy events with something like Microcks or even Postman. At runtime, the contextual information available in the meta APIs can enable the platforms I’m running the application on to inject the connection credentials, register it with monitoring tools, and so on.

Overall, the role of meta APIs is evolving towards playing a more active part in the asynchronous interaction ecosystem by automating some of the coordination activities among service owners, increasing developer productivity, and automating operations teams' tasks. And for that to become a reality, the different tools containing API metadata, code generation, test simulation, and environment management must converge, standardize, and integrate better.

Standardization of the event-driven space

While event-driven architecture (EDA) has a long history, recent drivers such as cloud adoption, microservices architecture, and a faster pace of change have amplified the relevance and adoption of EDA. Similar to the consolidation and standardization that happened with Kubernetes and its ecosystem in the platform space, there is consolidation and community-driven standardization happening in the event-driven space around Apache Kafka. Let's look at a few concrete examples.

Apache Kafka has reached the point of becoming the de facto standard platform for event streaming, the same way AWS S3 is for object storage and Kubernetes is for container orchestration. Kafka has a huge community behind it, a large open source ecosystem of tools and services, and possibly the largest adoption as eventing infrastructure by modern digital organizations. There are all kinds of self-hosted Kafka offerings, managed services by boutique companies, cloud providers, and recently by Red Hat too (Red Hat OpenShift Streams for Apache Kafka is a managed Kafka service I'm involved with, and I'd love to hear your feedback). Kafka as an API for log-based messaging is so widespread that even non-Kafka projects such as Pulsar, Redpanda, and Azure Event Hubs offer compatibility with it. Kafka today is more than a third-party architectural dependency. Kafka influences how services are designed and implemented, it dictates how systems are scaled and made highly available, and it drives how users consume data in real time. But Kafka alone is like a bare Kubernetes platform without any pods. Let's see what else in the Kafka ecosystem is a must-have complement and is becoming a de facto standard too.

A schema registry is as important for asynchronous APIs as an API manager is for synchronous APIs. In many streaming scenarios, the event payload contains structured data that both the producer and consumer need to understand and validate. A schema registry provides a central repository and a common governance framework for schema documents and enables applications to adhere to these contracts. Today there are registries such as Apicurio by Red Hat, Karapace by Aiven, and registries by Cloudera, Lenses, Confluent, Azure, AWS, and more. While schema registries are increasing in popularity and consolidating in their capabilities and practices around schema management, they vary in licensing restrictions. Not only that, schema registries tend to leak into client applications in the form of Kafka serializers/deserializers (SerDes), converters, and other client dependencies. So the need for an open and vendor-neutral standard where the implementations can be swapped has been apparent for a while. And the good news is that a Schema Registry API standard proposal exists in CNCF, and a few registries, such as Apicurio and Azure Schema Registry, have already started to follow it. Complementing the open source Kafka API with an open source schema registry API and common governance practices feels right, and I expect adoption and consolidation in this space to grow, making the whole meta API concept a cornerstone of event-driven architectures.

Similar to EDA, the concept of change data capture (CDC) is not new. But the recent drivers around event-driven systems and the increasing demand for access to real-time data are building momentum for transaction-log-driven event streaming tools. Today, there are many closed source, point-and-click tools (such as Striim, HVR, and Qlik) that rely on the same transaction log concept to replicate data point-to-point. There are cloud services such as AWS DMS, Oracle GoldenGate Cloud Service, and Google Datastream that will stream your data into their services (but never in the opposite direction). There are also many databases and key-value stores that stream changes. The need for an open source and vendor-neutral CDC standard that different vendors can follow, and that downstream change-event consumers can rely on, is growing. To succeed, such a standard has to be managed by a vendor-neutral foundation and be part of a larger related ecosystem. The closest thing that exists today is CNCF, which is already home to the AsyncAPI, CloudEvents, Schema Registry, and Serverless Workflow specifications. Today, by far, the leading open source project in the CDC space is Debezium. Debezium is used by major companies, embedded into cloud services from Google, Heroku, Confluent, Aiven, and Red Hat, embedded into multiple open source projects, and used by many proprietary solutions that we won't ever know about. If you are looking for a standard in this domain, the closest de facto standard is Debezium. To clarify, by a CDC standard I don't mean an API for data sources to emit changes. I mean standard conventions for data sources and connecting tissues such as Debezium to follow when converting database transaction logs into events. That includes data mapping (from database field types into JSON/Avro types), data structures (for example, Debezium's Before/After message structure), snapshotting, partitioning of tables into topics and primary keys into topic partitions, transaction demarcation indicators, and so forth. If you are going heavy on CDC, using Debezium will ensure consistent semantics for mapping database transaction log entries into Apache Kafka events that are uniform across data sources.

Specifications and implementation around the Apache Kafka ecosystem

There are already a few existing specifications from the event-driven space at CNCF that are gaining traction.

  • AsyncAPI is OpenAPI's equivalent for event-driven applications, and it recently joined CNCF. It offers a specification to document your event-driven systems in order to maintain consistency and governance across different teams and tools.
  • CloudEvents (also part of CNCF) aims to eliminate the metadata challenge by specifying mandatory metadata information into what could be called a standard envelope. It also offers libraries for multiple programming languages for multiple protocols, which streamlines interoperability.
  • OpenTelemetry (another CNCF sandbox project) standardizes the creation and management of trace information that reveals the end-to-end path of events through multiple applications.
  • CNCF Serverless Workflow is a vendor-neutral specification for coordinating asynchronous stateless and stateful interactions.
  • The service registry proposal in CNCF that we discussed above.

Whether we call it standardization, community adoption or something else, we cannot deny the consolidation process around event-driven constructs and the rise of some open source projects as de facto standards.

Summary

Microservices focus on encapsulating the data that belongs to a business domain and exposing it over as minimal an API as possible. But that is changing. Data going out of a service is as important as the data going into it. Exposing data in microservices can no longer be an afterthought. Siloed and inaccessible data wrapped in a highly decoupled microservice is of limited value. There are new users of data, and possibly yet unknown future users, that will demand access to discoverable, understandable, real-time data. To satisfy the needs of these users, microservices have to turn their data inside-out and be designed with outbound APIs that can emit data, and meta APIs that make the consumption of data a self-service activity. Projects such as Apache Kafka, Debezium, and the schema registries are natural enablers of this architecture and, with the help of the various open source asynchronous specifications, are turning into the de facto choice for implementing future-proof, event-driven microservices.

This article was originally published on InfoQ here.

Which is better: A monolithic Kafka cluster vs many?

Apache Kafka is designed for performance and large volumes of data. Kafka's append-only log format, sequential I/O access, and zero copying all support high throughput with low latency. Its partition-based data distribution lets it scale horizontally to hundreds of thousands of partitions.

Because of these capabilities, it can be tempting to use a single monolithic Kafka cluster for all of your eventing needs. Using one cluster reduces your operational overhead and development complexities to a minimum. But is "a single Kafka cluster to rule them all" the ideal architecture, or is it better to split Kafka clusters?

To answer that question, we have to consider the segregation strategies for maximizing performance and optimizing cost while increasing Kafka adoption. We also have to understand the impact of consuming Kafka as a service on a public cloud versus managing it yourself on-premises. (Are you looking to experiment with Kafka? Get started in minutes with a no-cost Kafka service trial.) This article explores these questions and more, offering a structured way to decide whether or not to segregate Kafka clusters in your organization. Figure 1 summarizes the questions explored in this article.

 


Figure 1. A mind map for Apache Kafka cluster segregation strategies shows the concerns that can drive a multiple-cluster setup.

Benefits of a monolithic Kafka cluster

To start, let's explore some of the benefits of using a single, monolithic Kafka cluster. Note that by this I don't mean literally a single Kafka cluster for all environments, but a single production Kafka cluster for the entire organization. The different environments would still typically be fully isolated with their respective Kafka clusters. A single production Kafka cluster is simpler to use and operate and is a no-brainer as a starting point.

Global event hub

Many companies are sold on the idea of having a single "Kafka backbone" and the value they can get from it. The possibility of combining data from different topics from across the company arbitrarily in response to future and yet unknown business needs is a huge motivation. As a result, some organizations end up using Kafka as a centralized enterprise service bus (ESB) where they put all their messages under a single cluster. The chain of streaming applications is deeply interconnected.

This approach can work for companies with a small number of applications and development teams, and with no hard departmental data boundaries that are enforced in large corporations by business and regulatory forces. (Note that this singleton Kafka environment expects no organizational boundaries.)

The monolithic setup reduces thinking about event boundaries, speeds up development, and works well until an operational or a process limitation kicks in.

No technical constraints

Certain technical features are available only within a single Kafka cluster. For example, a common pattern used by stream processing applications is to perform read-process-write operations in a sequence, without any tolerance for errors that could lead to duplicates or loss of messages. To address that strict requirement, Kafka offers transactions that ensure that each message is consumed from the source topic and published to a target topic with exactly-once processing semantics. That guarantee is possible only when the source and target topics are within the same Kafka cluster.
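For illustration, here is a condensed sketch of that read-process-write loop using Kafka's transactional producer API. The topic names and the processing step are placeholders, and the producer is assumed to be configured with a transactional.id (and the consumer with read_committed isolation and auto-commit disabled):

```java
// A read-process-write loop with Kafka transactions: output records and consumed
// offsets are committed together, giving exactly-once processing within one cluster.
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExactlyOnceProcessor {

    void run(KafkaConsumer<String, String> consumer, KafkaProducer<String, String> producer) {
        consumer.subscribe(List.of("source-topic"));
        producer.initTransactions();   // requires transactional.id in the producer config

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            if (records.isEmpty()) continue;

            producer.beginTransaction();
            try {
                for (ConsumerRecord<String, String> record : records) {
                    // Placeholder "processing" step.
                    producer.send(new ProducerRecord<>("target-topic", record.key(),
                            record.value().toUpperCase()));
                }
                // The consumed offsets are committed as part of the same transaction,
                // which is what gives the exactly-once read-process-write guarantee.
                producer.sendOffsetsToTransaction(nextOffsets(records), consumer.groupMetadata());
                producer.commitTransaction();
            } catch (Exception e) {
                producer.abortTransaction();
            }
        }
    }

    private Map<TopicPartition, OffsetAndMetadata> nextOffsets(ConsumerRecords<String, String> records) {
        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        for (TopicPartition tp : records.partitions()) {
            List<ConsumerRecord<String, String>> partitionRecords = records.records(tp);
            long next = partitionRecords.get(partitionRecords.size() - 1).offset() + 1;
            offsets.put(tp, new OffsetAndMetadata(next));
        }
        return offsets;
    }
}
```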

A consumer group, such as a Kafka Streams-based application, can process data from a single Kafka cluster only. Therefore, multi-topic subscriptions or load balancing across the consumers in a consumer group are possible only within a single Kafka cluster. In a multi-Kafka setup, enabling such stream processing requires data replication across clusters.

Each Kafka cluster has a unique URL, a few authentication mechanisms, Kafka-wide authorization configurations, and other cluster-level settings. With a single cluster, all applications can make the same assumptions, use the same configurations, and send all events to the same location. These are all good technical reasons for sharing a single Kafka cluster whenever possible.

Lower cost of ownership

I assume that you use Kafka because you have a huge volume of data, or you want to do low latency asynchronous interactions, or take advantage of both of these with added high availability—not because you have modest data needs and Kafka is a fashionable technology. Offering high-volume, low-latency Kafka processing in a production environment has a significant cost. Even a lightly used Kafka cluster deployed for production purposes requires three to six brokers and three to five ZooKeeper nodes. The components should be spread across multiple availability zones for redundancy.

Note: ZooKeeper is being phased out of Kafka (replaced by the KRaft-based controller quorum), but its coordination role still has to be performed by the cluster.

You have to budget for base compute, networking, storage, and operating costs for every Kafka cluster. This cost applies whether you self-manage a Kafka cluster on-premises with something like Strimzi or consume Kafka as a service. There are attempts at "serverless" Kafka offerings that try to be more creative and hide the cost per cluster in other cost lines, but somebody still has to pay for resources.

Generally, running and operating multiple Kafka clusters costs more than a single larger cluster. There are exceptions to this rule, where you achieve local cost optimizations by running a cluster at the point where the data and processing happens or by avoiding replication of large volumes of non-critical data, and so on.

Benefits of multiple Kafka clusters

Although Kafka can scale beyond the needs of a single team, it is not designed for multi-tenancy. Sharing a single Kafka cluster across multiple teams and different use cases requires precise application and cluster configuration, a rigorous governance process, standard naming conventions, and best practices for preventing abuse of the shared resources. Using multiple Kafka clusters is an alternative approach to address these concerns. Let's explore a few of the reasons that you might choose to implement multiple Kafka clusters.

Operational decoupling

Kafka's sweet spot is real-time messaging and distributed data processing. Providing that at scale requires operational excellence. Here are a few manageability concerns that apply to operating Kafka.

Workload criticality

Not all Kafka clusters are equal. A batch processing Kafka cluster that can be populated from source again and again with derived data doesn't have to replicate data into multiple sites for higher availability. An ETL data pipeline can afford more downtime than a real-time messaging infrastructure for frontline applications. Segregating workloads by service availability and data criticality helps you pick the most suitable deployment architecture, optimize infrastructure costs, and direct the right level of operating attention to every workload.

Maintainability

The larger a cluster gets, the longer it can take to upgrade and expand the cluster due to rolling restarts, data replication, and rebalancing. In addition to the length of the change window, the time when the change is performed might also be important. A customer-facing application might have an upgrade window that differs from a customer service application. Using separate Kafka clusters allows faster upgrades and more control over the time and the sequence of rolling out a change.

Regulatory compliance

Regulations and certifications typically leave no room for compromise. You might have to host a Kafka cluster on a specific cloud provider or region. You might have to allow access only to support personnel from a specific country. All personally identifiable information (PII) data might have to be on a particular cluster with short retention, separate administrative access, and network segmentation. You might want to hold the data encryption keys for specific clusters. The larger your company is, the longer the requirements list gets.

Tenant isolation

The secret to happy application coexistence on shared infrastructure is having good primitives for access, resource, and logical isolation. Unlike Kubernetes, Kafka has no concept like namespaces for enforcing quotas and access control or avoiding topic naming collisions. Let's explore some of the resulting challenges for isolating tenants.

Resource isolation

Although Kafka has mechanisms to control resource use, it doesn't prevent a bad tenant from monopolizing the cluster resources. Storage size can be controlled per topic through retention settings, but cannot be limited for a group of topics corresponding to an application or tenant. Network utilization can be enforced through quotas, but only at the client connection level. And there is no way to prevent an application from creating an unlimited number of topics or partitions until the whole cluster grinds to a halt.

All of that means you have to enforce these resource control mechanisms while operating at different granularity levels, and enforce additional conventions for the healthy coexistence of multiple teams on a single cluster. An alternative is to assign separate Kafka clusters to each functional area and use cluster-level resource isolation.
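
As a rough illustration of that granularity mismatch, here is a sketch using the Kafka Admin API (the broker address, client ID, and topic name are hypothetical): the network quota is attached to a client ID, while the storage cap can only be set topic by topic, not per tenant.

```java
import org.apache.kafka.clients.admin.*;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.quota.ClientQuotaAlteration;
import org.apache.kafka.common.quota.ClientQuotaEntity;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class TenantLimits {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");

        try (Admin admin = Admin.create(props)) {
            // Network quota: throttle clients identifying as "team-a-app" to ~5 MB/s in each direction.
            ClientQuotaEntity clientEntity = new ClientQuotaEntity(Map.of(ClientQuotaEntity.CLIENT_ID, "team-a-app"));
            ClientQuotaAlteration throttle = new ClientQuotaAlteration(clientEntity, List.of(
                    new ClientQuotaAlteration.Op("producer_byte_rate", 5_000_000.0),
                    new ClientQuotaAlteration.Op("consumer_byte_rate", 5_000_000.0)));
            admin.alterClientQuotas(List.of(throttle)).all().get();

            // Storage: retention can be capped per topic, but there is no per-tenant aggregate limit.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "team-a.orders");
            AlterConfigOp retentionCap = new AlterConfigOp(
                    new ConfigEntry("retention.bytes", "1073741824"), AlterConfigOp.OpType.SET); // 1 GiB
            admin.incrementalAlterConfigs(Map.of(topic, List.of(retentionCap))).all().get();
        }
    }
}
```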

Security boundary

Kafka's access control with the default authorization mechanism (ACLs) is more flexible than the quota mechanism and can apply to multiple resources at once through pattern matching. But you have to ensure good naming convention hygiene. The structure for topic name prefixes becomes part of your security policy.
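
For example, a prefixed ACL created through the Admin API ties a team's permissions to a topic name prefix, which is exactly why the prefix structure becomes part of the security policy. The principal and prefix below are hypothetical.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

import java.util.List;
import java.util.Properties;

public class PrefixedAcls {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");

        try (Admin admin = Admin.create(props)) {
            // Grant the payments service read and write access to every topic
            // whose name starts with "payments." and nothing else.
            ResourcePattern paymentsTopics = new ResourcePattern(ResourceType.TOPIC, "payments.", PatternType.PREFIXED);
            AclBinding allowRead = new AclBinding(paymentsTopics,
                    new AccessControlEntry("User:payments-svc", "*", AclOperation.READ, AclPermissionType.ALLOW));
            AclBinding allowWrite = new AclBinding(paymentsTopics,
                    new AccessControlEntry("User:payments-svc", "*", AclOperation.WRITE, AclPermissionType.ALLOW));
            admin.createAcls(List.of(allowRead, allowWrite)).all().get();
        }
    }
}
```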

ACLs control which users can perform which actions on which resources, but a user with admin access to a Kafka instance has access to all the topics in that Kafka instance. With multiple clusters, each team can have admin rights only to their Kafka instance.

The alternative is to ask someone with admin rights to edit the ACLs and update topic permissions for you. Nobody likes having to open a ticket to another team to get a project rolling.

Logical decoupling

A single cluster shared across multiple teams and applications with different needs can quickly get cluttered and difficult to navigate. You might have teams that need very few topics and others that generate hundreds of them. Some teams might even generate topics on the fly from existing data sources by turning microservices inside-out. You might need hundreds of granular ACLs for some applications that are less trusted, and coarse-grained ACLs for others. You might have a large number of producers and consumers. In the absence of namespaces, properties, and labels that can be used for logical segregation of resources, the only option left is to use naming conventions creatively.

Use case optimization

So far we have looked at the manageability and multi-tenancy needs that apply to most shared platforms. Next, we will look at a few examples of Kafka cluster segregation for specific use cases. The goal of this section is to list the long tail of reasons for segregating Kafka clusters, which varies for every organization, and to demonstrate that there is no "wrong" reason for creating another Kafka cluster.

Data locality

Data has gravity, meaning that a useful dataset tends to attract related services and applications. The larger a dataset is, the harder it is to move around. Data can originate from a constrained or offline environment, preventing it from streaming into the cloud. Large volumes of data might reside in a specific region, making it economically unfeasible to replicate the data to other locations. Therefore, you might create separate Kafka clusters per region, per cloud provider, or even at the edge to benefit from data's gravitational characteristics.

Fine-tuning

Fine-tuning is the process of precisely adjusting the parameters of a system to fit certain objectives. In the Kafka world, the primary interactions that an application has with a cluster center on the concept of topics. And while every topic has its own fine-tuning configurations, there are also cluster-wide settings that apply to all applications.

For instance, cluster-wide configurations such as the default replication factor (RF) and the number of in-sync replicas (ISR) apply to all topics unless explicitly overridden per topic. In addition, some constraints apply to the whole cluster and all users, such as the allowed authentication and authorization mechanisms, IP whitelists, the maximum message size, whether dynamic topic creation is allowed, and so on.
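
As a small sketch of that split (the topic name and setting values are illustrative), per-topic overrides can diverge from the cluster defaults at creation time via the Admin API, while the cluster-wide constraints listed above cannot be overridden this way; that difference is what pushes some workloads onto their own cluster.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class PurposeBuiltTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");

        try (Admin admin = Admin.create(props)) {
            // Override the cluster defaults for this one topic: replication factor,
            // minimum in-sync replicas, and retention.
            NewTopic auditLog = new NewTopic("audit.events", 12, (short) 3)
                    .configs(Map.of(
                            "min.insync.replicas", "2",
                            "retention.ms", "604800000")); // 7 days
            admin.createTopics(List.of(auditLog)).all().get();
        }
    }
}
```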

Therefore, you might create separate clusters for large messages, less-secure authentication mechanisms, and other oddities to localize and isolate the effect of such configurations from the rest of the tenants.

Domain ownership

Previous sections described examples of cluster segregation to address data and application concerns, but what about business domains? Aligning Kafka clusters by business domain can enforce ownership and give users more responsibilities. Domain-specific clusters can offer more freedom to the domain owners and reduce reliance on a central team. This division can also reduce cross-cluster data replication needs because most joins are likely to happen within the boundaries of a business domain.

Purpose-built

Kafka clusters can be created and configured for a particular use case. Some clusters might be born while modernizing existing legacy applications and others created while implementing event-driven distributed transaction patterns. Some clusters might be created to handle unpredictable loads, whereas others might be optimized for stable and predictable processing.

For example, Wise uses separate Kafka clusters for stream processing with topic compaction enabled, separate clusters for service communication with short message retention, and a logging cluster for log aggregation. Netflix uses separate clusters for producers and consumers. The so-called fronting clusters are responsible for getting messages from all applications and buffering, while consumer clusters contain only a subset of the data needed for stream processing.

These decisions for classifying clusters are based on high-level criteria, but you might also have low-level criteria for separate clusters. For example, to benefit from page caching at the operating-system level, you might create a separate cluster for consumers that re-read topics from the beginning each time. The separate cluster would prevent any disruption of the page caches for real-time consumers that read data from the current head of each topic. You might also create a separate cluster for the odd use case of a single topic that uses the whole cluster. The reasons can be endless.

Summary

The argument "one thing to rule them all" has been used for pretty much any technology: mainframes, databases, application servers, ESBs, Kubernetes, cloud providers, and so on. But generally, the principle falls apart. At some point, decentralizing and scaling with multiple instances offer more benefits than continuing with one centralized instance. Then a new threshold is reached, and the technology cycle starts to centralize again, which sparks the next phase of innovation. Kafka is following this historical pattern.

In this article, we looked at common motivations for growing a monolithic Kafka cluster along with reasons for splitting it out. But not all points apply to all organizations in every circumstance. Every organization has different business goals and execution strategies, team structure, application architecture, and data processing needs. Every organization is at a different stage of its journey to the hybrid cloud, a cloud-based architecture, edge computing, data mesh—you name it.

You might run on-premises Kafka clusters for good reason and give more weight to the operational concerns you have to deal with. Software-as-a-Service (SaaS) offerings such as Red Hat OpenShift Streams for Apache Kafka can provision a Kafka cluster with a single click and remove the concerns around maintainability, workload criticality, and compliance. With such services, you might pay more attention to governance, logical isolation, and controlling data locality.

If you have a reasonably sized organization, you will have hybrid and multi-cloud Kafka deployments and a new set of concerns around optimizing and reusing Kafka skills, patterns, and best practices across the organization. These concerns are topics for another article.

I hope this guide provides a way to structure your decision-making process for segregating Kafka clusters. Follow me at @bibryam to join my journey of learning Apache Kafka. This post was originally published on Red Hat Developers. To read the original post, check here.

Application modernization patterns with Apache Kafka, Debezium, and Kubernetes

We build our computers the way we build our cities—over time, without a plan, on top of ruins.

Ellen Ullman wrote this in 1998, but it applies just as much today to the way we build modern applications; that is, over time, with short-term plans, on top of legacy software. In this article, I will introduce a few patterns and tools that I believe work well for thoughtfully modernizing legacy applications and building modern event-driven systems.

Note: If you prefer video, you can watch my talk on YouTube about application modernization patterns, which this post is based on.

Application modernization in context

Application modernization refers to the process of taking an existing legacy application and modernizing its infrastructure—the internal architecture—to improve the velocity of new feature delivery, improve performance and scalability, expose the functionality for new use cases, and so on. Luckily, there is already a good classification of modernization and migration types, as shown in Figure 1.

Figure 1: Three modernization types and the technologies we might use for them.

Depending on your needs and appetite for change, there are a few levels of modernization:

  • Retention: The easiest thing you can do is to retain what you have and ignore the application's modernization needs. This makes sense if the needs are not yet pressing.
  • Retirement: Another thing you could do is retire and get rid of the legacy application. That is possible if you discover the application is no longer being used.
  • Rehosting: The next thing you could do is to rehost the application, which typically means taking an application as-is and hosting it on new infrastructure, such as a public cloud, or even on Kubernetes through something like KubeVirt. This is not a bad option if your application cannot be containerized, but you still want to reuse your Kubernetes skills, best practices, and infrastructure to manage a virtual machine as a container.
  • Replatforming: When changing the infrastructure is not enough and you are doing a bit of alteration at the edges of the application without changing its architecture, replatforming is an option. Maybe you are changing the way the application is configured so that it can be containerized, or moving from a legacy Java EE runtime to an open source runtime. Here, you could use a tool like windup to analyze your application and return a report with what needs to be done.
  • Refactoring: Much application modernization today focuses on migrating monolithic, on-premises applications to a cloud-native microservices architecture that supports faster release cycles. That involves refactoring and rearchitecting your application, which is the focus of this article.

For this article, we will assume we are working with a monolithic, on-premises application, which is a common starting point for modernization. The approach discussed here could also apply to other scenarios, such as a cloud migration initiative.

Challenges of migrating monolithic legacy applications

Deployment frequency is a common challenge for migrating monolithic legacy applications. Another challenge is scaling development so that more developers and teams can work on a common code base without stepping on each other’s toes. Scaling the application to handle an increasing load in a reliable way is another concern. On the other hand, the expected benefits from a modernization include reduced time to market, increased team autonomy on the codebase, and dynamic scaling to handle the service load more efficiently. Each of these benefits offsets the work involved in modernization. Figure 2 shows an example infrastructure for scaling a legacy application for increased load.

Figure 2: Refactoring a legacy application into event-driven microservices.

Envisioning the target state and measuring success

For our use case, the target state is an architectural style that follows microservices principles using open source technologies such as Kubernetes, Apache Kafka, and Debezium. We want to end up with independently deployable services modeled around a business domain. Each service should own its own data, emit its own events, and so on.

When we plan for modernization, it is also important to consider how we will measure the outcomes or results of our efforts. For that purpose, we can use metrics such as lead time for changes, deployment frequency, time to recovery, concurrent users, and so on.

The next sections will introduce three design patterns and three open source technologies—Kubernetes, Apache Kafka, and Debezium—that you can use to migrate from brown-field systems toward green-field, modern, event-driven services. We will start with the Strangler pattern.

The Strangler pattern

The Strangler pattern is the most popular technique used for application migrations. Martin Fowler introduced and popularized this pattern under the name of Strangler Fig Application, which was inspired by a type of fig that seeds itself in the upper branches of a tree and gradually evolves around the original tree, eventually replacing it. The parallel with application migration is that our new service is initially set up to wrap the existing system. In this way, the old and the new systems can coexist, giving the new system time to grow and potentially replace the old system. Figure 3 shows the main components of the Strangler pattern for a legacy application migration.

Figure 3: The Strangler pattern in a legacy application migration.

The key benefit of the Strangler pattern is that it allows low-risk, incremental migration from a legacy system to a new one. Let’s look at each of the main steps involved in this pattern.

Step 1: Identify functional boundaries

The very first question is where to start the migration. Here, we can use domain-driven design to help us identify aggregates and the bounded contexts where each represents a potential unit of decomposition and a potential boundary for microservices. Or, we can use the event storming technique created by Alberto Brandolini to gain a shared understanding of the domain model. Other important considerations here would be how these models interact with the database and what work is required for database decomposition. Once we have a list of these factors, the next step is to identify the relationships and dependencies between the bounded contexts to get an idea of the relative difficulty of the extraction.

Armed with this information, we can proceed with the next question: Do we want to start with the service that has the least amount of dependencies, for an easy win, or should we start with the most difficult part of the system? A good compromise is to pick a service that is representative of many others and can help us build a good technology foundation. That foundation can then serve as a base for estimating and migrating other modules.

Step 2: Migrate the functionality

For the Strangler pattern to work, we must be able to clearly map inbound calls to the functionality we want to move. We must also be able to redirect these calls to the new service and back if needed. Depending on the state of the legacy application, the client applications, and other constraints, this interception might be straightforward or difficult. Here are the options to weigh:

  • The easiest option would be to change the client application and redirect inbound calls to the new service. Job done.
  • If the legacy application uses HTTP, then we’re off to a good start. HTTP is very amenable to redirection and we have a wealth of transparent proxy options to choose from.
  • In practice, it is likely that our application will not only be using REST APIs, but will also have SOAP, FTP, RPC, or some kind of traditional messaging endpoints. In this case, we may need to build a custom protocol translation layer with something like Apache Camel.

Interception is a potentially dangerous slippery slope: If we start building a custom protocol translation layer that is shared by multiple services, we risk adding too much intelligence to the shared proxy that services depend on. This would move us away from the "smart endpoints and dumb pipes" mantra. A better option is to use the Sidecar pattern, illustrated in Figure 4.

Figure 4: The Sidecar pattern.

Rather than placing custom proxy logic in a shared layer, make it part of the new service. But rather than embedding the custom proxy in the service at compile-time, we use the Kubernetes sidecar pattern and make the proxy a runtime binding activity. With this pattern, legacy clients use the protocol-translating proxy and new clients are offered the new service API. Inside the proxy, calls are translated and directed to the new service. That allows us to reuse the proxy if needed. More importantly, we can easily decommission the proxy when it is no longer needed by legacy clients, with minimal impact on the newer services.
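
As a sketch of what the protocol-translating proxy might contain (the endpoints and service names are hypothetical), a small Apache Camel route running in the sidecar container could accept calls on the legacy-facing endpoint and forward them to the new service's API. It would run inside a normal Camel runtime such as camel-main, Quarkus, or Spring Boot.

```java
import org.apache.camel.builder.RouteBuilder;

public class LegacyOrderProxyRoute extends RouteBuilder {
    @Override
    public void configure() {
        // Accept calls on the endpoint that legacy clients already use...
        from("jetty:http://0.0.0.0:8080/legacy/orders")
            .removeHeaders("CamelHttp*") // drop inbound routing headers before forwarding
            // ...and forward them to the new service's REST API.
            .to("http://orders-service:8080/api/v1/orders?bridgeEndpoint=true");
    }
}
```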

Step 3: Migrate the database

Once we have identified the functional boundary and the interception method, we need to decide how we will approach database strangulation—that is, separating our legacy database from application services. We have a few paths to choose from.

Database first

In a database-first approach, we separate the schema first, which could potentially impact the legacy application. For example, a SELECT might require pulling data from two databases, and an UPDATE can lead to the need for distributed transactions. This option requires changes to the source application and doesn’t help us demonstrate progress in the short term. That is not what we are looking for.

Code first

A code-first approach lets us get to independently deployed services quickly and reuse the legacy database, but it could give us a false sense of progress. Separating the database can turn out to be challenging and can hide future performance bottlenecks. But it is a move in the right direction, and it can help us discover the data ownership and what needs to be split at the database layer later.

Code and database together

Working on the code and database together can be difficult to aim for from the get-go, but it is ultimately the end state we want to get to. Regardless of how we do it, we want to end up with a separate service and database; starting with that in mind will help us avoid refactoring later.

 


Figure 4.1: Database strangulation strategies

Having a separate database requires data synchronization. Once again, we can choose from a few common technology approaches.

Triggers

Most databases allow us to execute custom behavior when data is changed. In some cases, that could even be calling a web service and integrating with another system. But how triggers are implemented and what we can do with them varies between databases. Another significant drawback here is that using triggers requires changing the legacy database, which we might be reluctant to do.

Queries

We can use queries to regularly check the source database for changes. The changes are typically detected with implementation strategies such as timestamps, version numbers, or status column changes in the source database. Regardless of the implementation strategy, polling always leads to a dilemma: poll frequently and create overhead on the source database, or poll rarely and miss some updates. While queries are simple to implement and use, this approach has significant limitations. It is unsuitable for mission-critical applications with frequent database changes.
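
A minimal sketch of query-based change detection, assuming a hypothetical customer table with a last_updated timestamp column, makes the tradeoff visible: the poll interval is a direct compromise between load on the source database and change latency, and deleted rows are never observed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;

public class PollingChangeDetector {
    public static void main(String[] args) throws Exception {
        Timestamp lastSeen = Timestamp.valueOf("1970-01-01 00:00:00");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://legacy-db:5432/app", "reader", "secret")) {
            while (true) {
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT id, name, last_updated FROM customer WHERE last_updated > ? ORDER BY last_updated")) {
                    ps.setTimestamp(1, lastSeen);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            // Forward the changed row to the new service or a message broker here.
                            lastSeen = rs.getTimestamp("last_updated");
                        }
                    }
                }
                Thread.sleep(5_000); // the poll interval: more load on the source vs. more change latency
            }
        }
    }
}
```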

Log readers

Log readers identify changes by scanning the database transaction log files. Log files exist for database backup and recovery purposes and provide a reliable way to capture all changes, including DELETEs. Using log readers is the least disruptive option because they require no modification to the source database and they add no query load. The main downside of this approach is that there is no common standard for the transaction log files, and we'll need specialized tools to process them. This is where Debezium fits in.

 


Figure 4.2: Data synchronization patterns

Before moving on to the next step, let's see how using Debezium with the log reader approach works.

Change data capture with Debezium

When an application writes to the database, changes are recorded in log files, and then the database tables are updated. For MySQL, the log file is the binlog; for PostgreSQL, it is the write-ahead log (WAL); and for MongoDB, it's the oplog. The good news is that Debezium has connectors for different databases, so it does the hard work of understanding the format of all of these log files for us. Debezium can read the log files and produce generic, abstract change events describing the data changes, and publish them into a messaging system such as Apache Kafka. Figure 5 shows Debezium connectors as the interface for a variety of databases.
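
The typical deployment runs a Debezium connector on Kafka Connect and publishes the change events to Kafka topics. To illustrate the mechanics in code, here is a sketch using Debezium's embedded engine with the PostgreSQL connector; the connection details are placeholders, and some property names vary between Debezium versions.

```java
import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LegacyCdcTap {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("name", "legacy-cdc");
        props.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");
        props.setProperty("database.hostname", "legacy-db");
        props.setProperty("database.port", "5432");
        props.setProperty("database.user", "debezium");
        props.setProperty("database.password", "secret");
        props.setProperty("database.dbname", "app");
        props.setProperty("plugin.name", "pgoutput");
        props.setProperty("topic.prefix", "legacy");
        props.setProperty("table.include.list", "public.customer"); // stream only the tables being migrated
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
        props.setProperty("offset.flush.interval.ms", "10000");

        ExecutorService executor = Executors.newSingleThreadExecutor();
        // The engine tails the write-ahead log and hands us JSON change events.
        try (DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
                .using(props)
                .notifying(event -> System.out.println(event.value()))
                .build()) {
            executor.execute(engine);
            TimeUnit.MINUTES.sleep(1); // run for a minute in this demo
        }
        executor.shutdown();
    }
}
```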

Figure 5: Debezium connectors in a microservices architecture.

Debezium is the most widely used open source change data capture (CDC) project with multiple connectors and features that make it a great fit for the Strangler pattern.

Why is Debezium a good fit for the Strangler pattern?

One of the most important reasons to consider the Strangler pattern for migrating monolithic legacy applications is reduced risk and the ability to fall back to the legacy application. Similarly, Debezium is completely transparent to the legacy application, and it doesn’t require any changes to the legacy data model. Figure 6 shows Debezium in an example microservices architecture.

Figure 6: Debezium deployment in a hybrid-cloud environment.

With a minimal configuration to the legacy database, we can capture all the required data. So at any point, we can remove Debezium and fall back to the legacy application if we need to.

Debezium features that support legacy migrations

Here are some of Debezium's specific features that support migrating a monolithic legacy application with the Strangler pattern:

  • Snapshots: Debezium can take a snapshot of the current state of the source database, which we can use for bulk data imports. Once a snapshot is completed, Debezium will start streaming the changes to keep the target system in sync.
  • Filters: Debezium lets us pick which databases, tables, and columns to stream changes from. With the Strangler pattern, we are not moving the whole application.
  • Single message transformation (SMT): This feature can act like an anti-corruption layer, protecting our new data model from legacy naming and data formats, and even letting us filter out obsolete data.
  • Using Debezium with a schema registry: We can use a schema registry such as Apicurio with Debezium for schema validation, and also use it to enforce version compatibility checks when the source database model changes. This can prevent changes from the source database from impacting and breaking the new downstream message consumers.
  • Using Debezium with Apache Kafka: There are many reasons why Debezium and Apache Kafka work well together for application migration and modernization. Guaranteed ordering of database changes, message compaction, the ability to re-read changes as many times as needed, and tracking transaction log offsets are all good examples of why we might choose to use these tools together.

Step 4: Releasing services

With that quick overview of Debezium, let’s see where we are with the Strangler pattern. Assume that, so far, we have done the following:

  • Identified a functional boundary.
  • Migrated the functionality.
  • Migrated the database.
  • Deployed the service into a Kubernetes environment.
  • Migrated the data with Debezium and kept Debezium running to synchronize ongoing changes.

At this point, there is not yet any traffic routed to the new services, but we are ready to release the new services. Depending on our routing layer's capabilities, we can use techniques such as dark launching, parallel runs, and canary releasing to reduce or remove the risk of rolling out the new service, as shown in Figure 7.

Figure 7: Directing read traffic to the new service.

What we can also do here is to only direct read requests to our new service initially, while continuing to send the writes to the legacy system. This is required as we are replicating changes in a single direction only.

When we see that the read operations are going through without issues, we can then direct the write traffic to the new service. At this point, if we still need the legacy application to operate for whatever reason, we will need to stream changes from the new services toward the legacy application database. Next, we'll want to stop any write or mutating activity in the legacy module and stop the data replication from it. Figure 8 illustrates this part of the pattern implementation.

Figure 8: Directing read and write traffic to the new service.

Since we still have legacy read operations in place, we are continuing the replication from the new service to the legacy application. Eventually, we'll stop all operations in the legacy module and stop the data replication. At this point, we will be able to decommission the migrated module.

We've had a broad look at using the Strangler pattern to migrate a monolithic legacy application, but we are not quite done with modernizing our new microservices-based architecture. Next, let’s consider some of the challenges that come later in the modernization process and how Debezium, Apache Kafka, and Kubernetes might help.

After the migration: Modernization challenges

The most important reason to consider using the Strangler pattern for migration is the reduced risk. This pattern gives value steadily and allows us to demonstrate progress through frequent releases. But migration alone, without enhancements or new “business value” can be a hard sell to some stakeholders. In the longer-term modernization process, we also want to enhance our existing services and add new ones. With modernization initiatives, very often, we are also tasked with setting the foundation and best practices for building modern applications that will follow. By migrating more and more services, adding new ones, and in general by transitioning to the microservices architecture, new challenges will come up, including the following:

  • Automating the deployment and operating a large number of services.
  • Performing dual-writes and orchestrating long-running business processes in a reliable and scalable manner.
  • Addressing the analytical and reporting needs.

These are all challenges that might not have existed in the legacy world. Let's explore how we can address a few of them using a combination of design patterns and technologies.

Challenge 1: Operating event-driven services at scale

While peeling off more and more services from the legacy monolithic application, and also creating new services to satisfy emerging business requirements, the need for automated deployments, rollbacks, placements, configuration management, upgrades, and self-healing becomes apparent. These are exactly the features that make Kubernetes a great fit for operating large-scale microservices. Figure 9 illustrates such an architecture.

Figure 9: A sample event-driven architecture on top of Kubernetes.

When we are working with event-driven services, we will quickly find that we need to automate and integrate with an event-driven infrastructure—which is where Apache Kafka and other projects in its ecosystem might come in. Moreover, we can use Kubernetes Operators to help automate the management of Kafka and the following supporting services:

  • Apicurio Registry provides an Operator for managing Apicurio Schema Registry on Kubernetes.
  • Strimzi offers Operators for managing Kafka and Kafka Connect clusters declaratively on Kubernetes.
  • KEDA (Kubernetes Event-Driven Autoscaling) offers workload auto-scalers for scaling up and down services that consume from Kafka. So, if the consumer lag passes a threshold, the Operator will start more consumers up to the number of partitions to catch up with message production.
  • Knative Eventing offers event-driven abstractions backed by Apache Kafka.

Note: Kubernetes not only provides a target platform for application modernization but also allows you to grow your applications on top of the same foundation into a large-scale event-driven architecture. It does that through automation of user workloads, Kafka workloads, and other tools from the Kafka ecosystem. That said, not everything has to run on your Kubernetes. For example, you can use a fully managed Apache Kafka or a schema registry service from Red Hat and automatically bind it to your application using Kubernetes Operators. Creating a multi-availability-zone (multi-AZ) Kafka cluster on Red Hat OpenShift Streams for Apache Kafka takes less than a minute and is completely free during our trial period. Give it a try and help us shape it with your early feedback.

Now, let’s see how we can meet the remaining two modernization challenges using design patterns.

Challenge 2: Avoiding dual-writes

Once you build a couple of microservices, you quickly realize that the hardest part about them is data. As part of their business logic, microservices often have to update their local data store. At the same time, they also need to notify other services about the changes that happened. This challenge is not so obvious in the world of monolithic applications and legacy distributed transactions. How can we avoid or resolve this situation the cloud-native way? The answer is to only modify one of the two resources—the database—and then drive the update of the second one, such as Apache Kafka, in an eventually consistent manner. Figure 10 illustrates this approach.

Figure 10: The Outbox pattern.

Using the Outbox pattern with Debezium lets services execute these two tasks in a safe and consistent manner. Instead of directly sending a message to Kafka when updating the database, the service uses a single transaction to both perform the normal update and insert the message into a specific outbox table within its database. Once the transaction has been written to the database’s transaction log, Debezium can pick up the outbox message from there and send it to Apache Kafka. This approach gives us very nice properties. By synchronously writing to the database in a single transaction, the service benefits from "read your own writes" semantics, where a subsequent query to the service will return the newly persisted record. At the same time, we get reliable, asynchronous, propagation to other services via Apache Kafka. The Outbox pattern is a proven approach for avoiding dual-writes for scalable event-driven microservices. It solves the inter-service communication challenge very elegantly without requiring all participants to be available at the same time, including Kafka. I believe Outbox will become one of the foundational patterns for designing scalable event-driven microservices.
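
Here is a minimal sketch of the write side of the pattern using plain JDBC against PostgreSQL. The purchase_order table, the connection details, and the payloads are hypothetical; the outbox columns follow the layout that Debezium's outbox event router expects (id, aggregatetype, aggregateid, type, payload).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.UUID;

public class OrderService {

    public void placeOrder(String orderId, String orderJson) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/orders", "app", "secret")) {
            conn.setAutoCommit(false); // one local transaction for both writes

            // 1. The normal business write.
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO purchase_order (id, payload) VALUES (?, ?::jsonb)")) {
                ps.setString(1, orderId);
                ps.setString(2, orderJson);
                ps.executeUpdate();
            }

            // 2. The event, written to the outbox table in the same transaction.
            //    Debezium picks it up from the transaction log and publishes it to Kafka.
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO outbox (id, aggregatetype, aggregateid, type, payload) VALUES (?, ?, ?, ?, ?::jsonb)")) {
                ps.setObject(1, UUID.randomUUID());
                ps.setString(2, "order");
                ps.setString(3, orderId);
                ps.setString(4, "OrderCreated");
                ps.setString(5, orderJson);
                ps.executeUpdate();
            }

            conn.commit(); // either both rows are committed or neither is
        }
    }
}
```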

Challenge 3: Long-running transactions

While the Outbox pattern solves the simpler inter-service communication problem, it is not sufficient alone for solving the more complex long-running, distributed business transactions use case. The latter requires executing multiple operations across multiple microservices and applying consistent all-or-nothing semantics. A common example for demonstrating this requirement is the booking-a-trip use case consisting of multiple parts where the flight and accommodation must be booked together. In the legacy world, or with a monolithic architecture, you might not be aware of this problem as the coordination between the modules is done in a single process and a single transactional context. The distributed world requires a different approach, as illustrated in Figure 11.

Figure 11: The Saga pattern implemented with Debezium.

The Saga pattern offers a solution to this problem by splitting up an overarching business transaction into a series of multiple local database transactions, which are executed by the participating services. Generally, there are two ways to implement distributed sagas:
  • Choreography: In this approach, one participating service sends a message to the next one after it has executed its local transaction.
  • Orchestration: In this approach, one central coordinating service coordinates and invokes the participating services.

Communication between the participating services might be either synchronous, via HTTP or gRPC, or asynchronous, via messaging such as Apache Kafka.

The cool thing here is that you can implement sagas using Debezium, Apache Kafka, and the Outbox pattern. With these tools, it is possible to take advantage of the orchestration approach and have one place to manage the flow of a saga and check the status of the overarching saga transaction. We can also combine orchestration with asynchronous communication to decouple the coordinating service from the availability of participating services and even from the availability of Kafka. That gives us the best of both worlds: orchestration and asynchronous, non-blocking, parallel communication with participating services, without temporal coupling.
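
As a rough sketch of the orchestration side, under the same hypothetical table layout as the outbox example above, the orchestrator reacts to a reply event (consumed from Kafka) by updating its saga log and enqueuing the next command through the outbox in one local transaction, which Debezium then relays asynchronously.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.UUID;

public class TripSagaOrchestrator {

    // Invoked by a Kafka consumer when the payment service's reply event arrives.
    // The saga log update and the next command are committed together.
    public void onPaymentCompleted(Connection conn, String sagaId, String tripJson) throws Exception {
        conn.setAutoCommit(false);

        // Record where the saga stands now.
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE saga SET current_step = 'BOOK_FLIGHT', status = 'IN_PROGRESS' WHERE id = ?")) {
            ps.setString(1, sagaId);
            ps.executeUpdate();
        }

        // Enqueue the next command for the flight-booking service via the outbox.
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO outbox (id, aggregatetype, aggregateid, type, payload) VALUES (?, ?, ?, ?, ?::jsonb)")) {
            ps.setObject(1, UUID.randomUUID());
            ps.setString(2, "flight-booking");
            ps.setString(3, sagaId);
            ps.setString(4, "BookFlightCommand");
            ps.setString(5, tripJson);
            ps.executeUpdate();
        }

        conn.commit();
    }
}
```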

Combining the Outbox pattern with the Saga pattern is an awesome, event-driven implementation option for the long-running business transactions use case in the distributed services world. See Saga Orchestration for Microservices Using the Outbox Pattern (InfoQ) for a detailed description. Also see an implementation example of this pattern on GitHub.

Conclusion

The Strangler pattern, Outbox pattern, and Saga pattern can help you migrate from brown-field systems, but at the same time, they can help you build green-field, modern, event-driven services that are future-proof.

Kubernetes, Apache Kafka, and Debezium are open source projects that have turned into de facto standards in their respective fields. You can use them to create standardized solutions with a rich ecosystem of supporting tools and best practices.

The one takeaway from this article is the realization that modern software systems are like cities: They evolve over time, on top of legacy systems. Using proven patterns, standardized tools, and open ecosystems will help you create long-lasting systems that grow and change with your needs.

This post was originally published on Red Hat Developers. To read the original post, check here. 

About Me