Choosing Between ActiveMQ and Kafka for Messaging Infrastructure ~ Bilgin Ibryam (@bibryam)

The term asynchronous means “not occurring at the same time” and in the context of distributed systems and messaging it implies that the processing of a request occurs at an arbitrary point in time. There are many advantages of asynchronous interactions over synchronous ones but also new challenges introduced by it. In this post, we will focus on a few specific considerations for choosing a suitable asynchronous messaging infrastructure for implementing event-driven systems. Let’s see a few of the subtle differences between asynchronous interaction styles.

Message Business Value

Not all messages are created equal. Some are valid and valuable only for a short period of time and obsolete later. Some are valuable until they are consumed regardless of the time passed. And some messages are valid and useful for repeated consumption. Considering the validity and the value of messages relative to time and consumption rate, we can qualify interaction styles between services into the following categories:

Message types by business value

Volatile

These are ephemeral messages where the value is time-bound. Valuable now, but not in the short future. There is no point in storing events that are useless in the future and using messaging systems with such characteristics gives the best performance with the lowest latency possible as the disk is skipped. In such a scenario, the system is aware of the connected consumers and the event disseminated to all consumers online at the time of publication. If a consumer is disconnected, the messaging system forgets about these consumers. What is important in such a system is the ability to handle a large number of dynamic clients with low latency interaction needs such as IoT devices.

Durable

However, in some situations you want the messaging system to be aware of the consumers and store the messages while the consumer is not available. That is a traditional message broker which will hold on to the messages for the consumers that he knows about and allow the consumers to re-connect and consume the events that were produced in his absence. Once an event is consumed by all the interested parties, it will discard the messages. Here the broker knows about registered consumers and messages are stored durably until read by all registered consumers. Here the goal is to do reliable messaging among services with strong ordering and delivery guarantees.

Replayable

Here, the messaging system is not aware of the consumers that are interested in the event. It simply stores the events published to a stream for some time or until capacity is reached. Then a consumer can come along at any time, connect and consume the events and perhaps replay the stream from the beginning. Consumers can move back and forth in the stream as required and replay the messages repeatedly. Here, the driving force is extreme scalability combined with the ability to replay messages for existing or new consumers.

Message Semantics

Apart from the technical characteristics of the messages, it is important to distinguish the language we use, the semantic aspects, and the intent of the interactions. Some messages are targeted for a specific consumer and demand concrete actions. Some are querying the latest state of a system without requiring a state change. And some notify the world about a change that has happened in the source system. From a messaging semantic perspective, there are the following types of messages:

Message types by semantics

Command

A command is a request for action that usually leads to a state change on a known target system. Typically there is a response indicating that action was completed and even there might be a result associated with it. When a response is expected, commands are typically implemented over synchronous protocols such as HTTP, but it is possible to implement request/response or fire and forget style commands over asynchronous messaging systems. With a command based asynchronous messages, there is some coupling between the source and the target systems in the form of command semantics.

Query

A query is like a command, but it is a read-only interaction that does not lead to a state change. By its very nature, a query expects a response, and it is common to see synchronous implementations here. But asynchronous and non-blocking implementations over messaging systems and even fire and forget style interactions for long-running operations where a response is written to a different location are common too.

Event

An event is a notification that something has changed. A system sends event notifications to notify other systems for a change in its domain. An event is different from a command in that often the event emitting system doesn’t expect an answer at all. In addition to being asynchronous, event messages are not targeted to a specific recipient and thus, they enable even further decoupling. Similar to other asynchronous interactions, events are implemented as messages on queues, which are often called streams. Martin Fowler covers in-depth the different types of events in this talk.

Summary

One approach you can take is to follow the Law of the Instrument approach defined by Maslow as “If the only tool you have is a hammer, treat everything as if it were a nail." You could certainly use a classic message broker such as Apache ActiveMQ to implement the different interaction styles. It would be a familiar technology to many and easier to start with, but hard to implement some use cases such as replayable messaging. Or you could take the other extreme and try to use Apache Kafka for everything. It would require a larger amount of hardware resources and human effort to manage it, but it would cover the replayable messaging and extreme scalability needs. While both of the above approaches are fine to start with, when you have a large number of services with different messaging needs, using the right tool for the right job is a better option. We can map the above-described messaging patterns to see what messaging infrastructure is best suited for each.

Mapping messaging subtleties to different messaging infrastructures

We at Red Hat love any open source technology. That is why we included Apache Qpid, Apache ActiveMQ Artemis, and Apache Kafka in our Red Hat AMQ product and let the customer choose the right tool for the right job. There are many other aspects to consider when choosing the right tool, I hope this post will help you get there one step closer.

This post was originally published on Red Hat Developers. To read the original post, check here.