Choosing Between ActiveMQ and Kafka for Messaging Infrastructure

The term asynchronous means “not occurring at the same time” and, in the context of distributed systems and messaging, it implies that a request is processed at some arbitrary later point in time. Asynchronous interactions have many advantages over synchronous ones, but they also introduce new challenges. In this post, we will focus on a few specific considerations for choosing a suitable asynchronous messaging infrastructure for implementing event-driven systems. Let’s look at a few of the subtle differences between asynchronous interaction styles.

Message Business Value

Not all messages are created equal. Some are valid and valuable only for a short period of time and become obsolete afterwards. Some are valuable until they are consumed, regardless of how much time has passed. And some messages are valid and useful for repeated consumption. Considering the validity and the value of messages relative to time and consumption rate, we can classify interaction styles between services into the following categories:


Message types by business value

Volatile

These are ephemeral messages whose value is time-bound: valuable now, but not in the near future. There is no point in storing events that will be useless later, and messaging systems built for this case give the best performance and the lowest latency because they skip the disk. In such a scenario, the system is aware of the connected consumers, and each event is disseminated to all consumers that are online at the time of publication. If a consumer disconnects, the messaging system forgets about it. What matters in such a system is the ability to handle a large number of dynamic clients with low-latency interaction needs, such as IoT devices.

Durable

However, in some situations you want the messaging system to be aware of the consumers and to store messages while a consumer is unavailable. This is the traditional message broker, which holds on to messages for the consumers it knows about and allows them to reconnect and consume the events that were produced in their absence. Here the broker knows about registered consumers, and messages are stored durably until they have been read by all of them; once an event is consumed by all interested parties, the broker discards it. The goal is reliable messaging among services with strong ordering and delivery guarantees.

Replayable

Here, the messaging system is not aware of the consumers that are interested in the event. It simply stores the events published to a stream for some time or until capacity is reached. Then a consumer can come along at any time, connect and consume the events and perhaps replay the stream from the beginning. Consumers can move back and forth in the stream as required and replay the messages repeatedly. Here, the driving force is extreme scalability combined with the ability to replay messages for existing or new consumers.
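To make the three categories concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the retention behavior described above, not how ActiveMQ, Kafka, or any real broker is implemented; the class and method names (VolatileTopic, DurableQueue, ReplayableLog) are made up for this example.

```python
class VolatileTopic:
    """Delivers each message only to consumers connected at publish time."""

    def __init__(self):
        self.connected = {}  # consumer name -> list acting as its inbox

    def connect(self, name):
        self.connected[name] = []
        return self.connected[name]

    def disconnect(self, name):
        self.connected.pop(name, None)  # the topic forgets the consumer

    def publish(self, message):
        for inbox in self.connected.values():
            inbox.append(message)  # nothing is stored for absent consumers


class DurableQueue:
    """Stores messages per registered consumer until that consumer reads them."""

    def __init__(self):
        self.pending = {}  # consumer name -> unread messages

    def register(self, name):
        self.pending.setdefault(name, [])

    def publish(self, message):
        for backlog in self.pending.values():
            backlog.append(message)

    def consume(self, name):
        messages, self.pending[name] = self.pending[name], []
        return messages  # once read, the broker no longer keeps them


class ReplayableLog:
    """Keeps an append-only log; consumers choose their own read offset."""

    def __init__(self):
        self.log = []

    def publish(self, message):
        self.log.append(message)

    def read(self, offset=0):
        return self.log[offset:]  # reading never removes anything
```

The difference shows up in who tracks state: the volatile topic tracks only live connections, the durable queue tracks a backlog per known consumer, and the replayable log tracks nothing about consumers at all, which is what lets a new consumer start from offset 0.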

Message Semantics

Apart from the technical characteristics of the messages, it is important to distinguish the language we use, the semantic aspects, and the intent of the interactions. Some messages are targeted for a specific consumer and demand concrete actions. Some are querying the latest state of a system without requiring a state change. And some notify the world about a change that has happened in the source system. From a messaging semantic perspective, there are the following types of messages:



Message types by semantics

Command

A command is a request for action that usually leads to a state change on a known target system. Typically there is a response indicating that the action was completed, and there might even be a result associated with it. When a response is expected, commands are typically implemented over synchronous protocols such as HTTP, but it is also possible to implement request/response or fire-and-forget style commands over asynchronous messaging systems. With command-based asynchronous messages, there is some coupling between the source and the target systems in the form of command semantics.

Query

A query is like a command, but it is a read-only interaction that does not lead to a state change. By its very nature, a query expects a response, and it is common to see synchronous implementations here. But asynchronous, non-blocking implementations over messaging systems are common too, as are fire-and-forget style interactions for long-running operations where the response is written to a different location.

Event

An event is a notification that something has changed. A system sends event notifications to inform other systems of a change in its domain. An event differs from a command in that the emitting system often doesn’t expect an answer at all. In addition to being asynchronous, event messages are not targeted at a specific recipient, and thus they enable even further decoupling. Similar to other asynchronous interactions, events are implemented as messages on queues, which are often called streams. Martin Fowler covers the different types of events in depth in this talk.
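The three semantic types can be contrasted in a few lines of Python. This is a deliberately simplified, in-process sketch; the Account class, its method names, and the BalanceChanged event shape are hypothetical and exist only to show the difference in intent.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int = 0
    subscribers: list = field(default_factory=list)  # callbacks interested in events

    def handle_command(self, amount):
        """Command: a request for action that changes state and returns a result."""
        self.balance += amount
        self._emit({"type": "BalanceChanged", "balance": self.balance})
        return "ok"

    def handle_query(self):
        """Query: a read-only interaction that returns the current state."""
        return self.balance

    def _emit(self, event):
        """Event: a notification for whoever subscribed; no reply is expected."""
        for notify in self.subscribers:
            notify(event)
```

Note that the command and the query both name their target (the account), while the event is sent without knowing who, if anyone, is listening.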

Summary

One approach is to follow the Law of the Instrument, attributed to Maslow: “If the only tool you have is a hammer, treat everything as if it were a nail.” You could certainly use a classic message broker such as Apache ActiveMQ to implement all the different interaction styles. It would be a familiar technology to many and easier to start with, but some use cases, such as replayable messaging, would be hard to implement. Or you could take the other extreme and try to use Apache Kafka for everything. It would require more hardware resources and more human effort to manage, but it would cover the replayable messaging and extreme scalability needs. While both of the above approaches are fine to start with, when you have a large number of services with different messaging needs, using the right tool for the right job is a better option. We can map the messaging patterns described above to see which messaging infrastructure is best suited for each.

Mapping messaging subtleties to different messaging infrastructures

We at Red Hat love any open source technology. That is why we included Apache Qpid, Apache ActiveMQ Artemis, and Apache Kafka in our Red Hat AMQ product and let the customer choose the right tool for the right job. There are many other aspects to consider when choosing the right tool, and I hope this post helps you get one step closer.
This post was originally published on Red Hat Developers. To read the original post, check here.

What is Application Performance Monitoring (APM)?

This is a guest post by freelance editor and copywriter Laila Mahran.

With Application Performance Monitoring, you can track key performance metrics of a web application in production. APM is often thought of as a ‘second wave’ of performance monitoring techniques, preceded by traditional host-based monitoring. Let’s dive in.
Host-based monitoring focuses on indicators such as:
  • Storage
  • Memory
  • CPU
  • Network utilization
Application monitoring goes a step further and focuses on the actual “end-user” metrics of an application in real-time such as:
  • Code-level errors
  • Slowdowns in response times
  • Error rates

How does this APM magic work?

Application Performance Monitoring tools can work in several ways. The most common are:
  • An agent process that is deployed alongside a web application that hooks into the application runtime to collect telemetry data from the process
  • Specialized web appliances that inspect Layer 7 traffic to generate telemetry
Either mechanism can be combined with synthetic monitoring, in which an external application generates synthetic traffic and sends it to the application at predefined intervals to measure performance. Compared with other monitoring types, the main difference to highlight is that APM telemetry is generated by inspecting the application runtime and the performance metrics that it exposes.
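As a rough idea of what "hooking into the application runtime" means, the following Python sketch wraps a request handler to record call counts, errors, and latency. Real APM agents instrument frameworks automatically and ship data to a backend; everything here (the TELEMETRY dictionary, the monitored decorator, the checkout handler) is a hypothetical stand-in.

```python
import functools
import time

# Hypothetical in-process telemetry store: handler name -> collected metrics.
TELEMETRY = {}


def monitored(func):
    """Wrap a request handler to record call count, errors, and latency."""
    stats = TELEMETRY.setdefault(
        func.__name__, {"calls": 0, "errors": 0, "total_seconds": 0.0}
    )

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            stats["errors"] += 1  # count the failure, then let it propagate
            raise
        finally:
            stats["calls"] += 1
            stats["total_seconds"] += time.perf_counter() - start

    return wrapper


@monitored
def checkout(cart_total):
    """Example handler; stands in for a real web request handler."""
    if cart_total < 0:
        raise ValueError("cart total cannot be negative")
    return f"charged {cart_total}"
```

After a few calls, TELEMETRY["checkout"] holds the per-handler error rate and average latency, which is the kind of application-level signal that host metrics such as CPU or memory cannot provide.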

Can APM help me?

Traditional host monitoring can leave you stuck, no closer to an answer. Application Performance Monitoring is designed to answer the questions that host monitoring can’t. While understanding the raw resource utilization of your application is useful, it doesn’t give you much information when you’re trying to track down why a specific request has high latency, why a particular transaction against your database is failing, or how your application performs under load.
Let’s take a look at common questions asked on a daily basis.
  1. What are the implications of this issue for the end-user experience?
  2. Where is this high latency coming from?
  3. What caused that outage?
  4. Why are we getting an error here?
  5. Why is this transaction failing?
  6. Can we find the root cause of this substandard user experience?
Have you asked yourself these questions before? If you’re nodding your head furiously, you can look to APM to provide the answer.

Monitoring vs. Management: What’s the difference?

Application Performance Management applies to a suite of applications while Application Performance Monitoring applies to a single application. An application performance management tool is able to aggregate and compare multiple types of metrics across multiple applications and services in order to pinpoint performance issues and regressions in your suite of applications. On the other hand, Application Performance Monitoring looks at the code-level to ensure each step is monitored thoroughly.

Is Network Monitoring different?

Network monitoring tries to detect issues with an application by focusing on routers or by collecting telemetry from network devices such as switches. If you’re looking to get a complete picture, network monitoring requires stitching together information from each line. However, this approach doesn’t provide sufficient resolution or information for modern applications, especially when the application itself may be running behind a variety of proxies or service routers which themselves run on virtualized networking equipment.

APM vs. Observability: What’s the difference?

You’ve heard the hype around observability, but how is it different from APM? Observability is a holistic approach to fully understanding your application performance, as well as a shared set of practices and terminology that helps communicate performance across your organization. While observability helps you navigate from effect to cause, APM falls short of answering “unknown unknowns,” the questions that you didn’t think to ask ahead of time. This is the reason APM is currently being eclipsed by observability.
Observability stands out for its ability to answer questions about modern, microservice-based application architectures, where you will often contend with serverless components, polyglot services, and container-based deployments running on Kubernetes. It also provides a shared language that standardizes communication around performance. This way you can focus on measuring service level objectives and service level indicators, which are more broadly applicable to, and more easily interpreted for, your unique application architecture than simple throughput or health checks.
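To illustrate what measuring against service level objectives looks like in practice, here is a small Python sketch. It assumes a simple availability SLI (successful requests divided by total requests) and the common "error budget" framing; the function names and the sample numbers are illustrative, not taken from any particular tool.

```python
def availability_sli(good_requests, total_requests):
    """SLI: fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return good_requests / total_requests


def error_budget_remaining(sli, slo_target):
    """Fraction of the error budget still unspent; negative means the SLO is breached."""
    allowed_failure = 1.0 - slo_target
    if allowed_failure <= 0:
        raise ValueError("an SLO target of 100% leaves no error budget")
    actual_failure = 1.0 - sli
    return (allowed_failure - actual_failure) / allowed_failure


# Example: 9,990 of 10,000 requests succeeded against a 99.5% objective.
sli = availability_sli(9_990, 10_000)
budget = error_budget_remaining(sli, slo_target=0.995)
```

In this example the service has failed 0.1% of requests while the objective tolerates 0.5%, so roughly four fifths of the budget is left. The same calculation applies whether the service is a container, a function, or a monolith, which is why these indicators travel well across architectures.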

Is Application Performance Monitoring worth it?

Instead of depending on second- or third-order metrics about host or network utilization to understand your application’s performance, APM collects real-time performance data from the perspective of an end user. Another bonus: APM provides real-time results for database queries and page load times in a way that’s not possible with host-based monitoring. This information can be invaluable in understanding how your application performs under load or while trying to track down bugs in your software. APM solutions also provide alerting to IT Operations, Site Reliability Engineers, DevOps teams, and others so they can quickly troubleshoot performance issues and slowdowns.

Operators and Sidecars Are the New Model for Software Delivery

Today’s developers are expected to develop resilient and scalable distributed systems: systems that are easy to patch in the face of security concerns and easy to upgrade incrementally with low risk, and systems that benefit from the software reuse and innovation of the open source model. Achieving all of this across different languages, using a variety of application frameworks with embedded libraries, is not possible.

Recently I’ve blogged about “Multi-Runtime Microservices Architecture” where I have explored the needs of distributed systems such as lifecycle management, advanced networking, resource binding, state abstraction and how these abstractions have been changing over the years. I also spoke about “The Evolution of Distributed Systems on Kubernetes” covering how Kubernetes Operators and the sidecar model are acting as the primary innovation mechanisms for delivering the same distributed system primitives.

On both occasions, the main takeaway was the prediction that software application architectures on Kubernetes are moving towards the sidecar model managed by operators. Sidecars and operators could become a mainstream software distribution and consumption model and in some cases even replace software libraries and frameworks as we know them.

The sidecar model allows the composition of applications written in different languages to deliver joint value, faster and without the runtime coupling. Let’s see a few concrete examples of sidecars and operators, and then we will explore how this new software composition paradigm could impact us.

Out-of-Process Smarts on the Rise

In Kubernetes, the sidecar is one of the core design patterns, achieved easily by organizing multiple containers in a single Pod. The Pod construct ensures that the containers are always placed on the same node and can cooperate by interacting over the network, the file system, or other IPC methods. Operators, in turn, allow the automation, management, and integration of the sidecars with the rest of the platform. The sidecars represent a language-agnostic, scalable data plane offering distributed primitives to custom applications, and the operators represent their centralized management and control plane.
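As a sketch of what "multiple containers in a single Pod" looks like, the Python snippet below builds a Pod manifest with an application container and a sidecar that share a volume. The field names follow the Kubernetes Pod API, but the Pod name, container names, image names, and mount paths are all hypothetical.

```python
import json

# Hypothetical images and names; any two cooperating containers would do.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app-with-sidecar"},
    "spec": {
        # An emptyDir volume lives as long as the Pod and is visible to both containers.
        "volumes": [{"name": "shared-logs", "emptyDir": {}}],
        "containers": [
            {
                "name": "app",
                "image": "example.com/my-app:1.0",
                "volumeMounts": [{"name": "shared-logs", "mountPath": "/var/log/app"}],
            },
            {
                "name": "log-forwarder",  # the sidecar
                "image": "example.com/log-forwarder:1.0",
                "volumeMounts": [
                    {"name": "shared-logs", "mountPath": "/logs", "readOnly": True}
                ],
            },
        ],
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

Because both containers are declared in the same Pod spec, they are scheduled together and can exchange data through the shared volume or over localhost. Kubernetes accepts JSON manifests as well as YAML, so the printed output could be applied to a cluster as-is once real images are substituted.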

Let’s look at a few popular manifestations of the sidecar model.

Envoy

Service meshes such as Istio, Consul, and others use transparent service proxies such as Envoy to deliver enhanced networking capabilities for distributed systems. Envoy can improve security, enable advanced traffic management, improve resilience, and add deep monitoring and tracing features. Not only that, it understands more and more Layer 7 protocols such as Redis, MongoDB, MySQL and, most recently, Kafka. It has also added response caching capabilities and even WebAssembly support that will enable all kinds of custom plugins. Envoy is an example of how a transparent service proxy adds advanced networking capabilities to a distributed system without including them in the runtime of the distributed application components.

Skupper

In addition to the typical service mesh, there are also projects, such as Skupper, that ship application networking capabilities through an external agent. Skupper solves multicluster Kubernetes communication challenges through a Layer 7 virtual network and offers advanced routing and connectivity capabilities. But rather than embedding Skupper into the business service runtime, it runs an instance per Kubernetes namespace which acts as a shared sidecar.

Cloudstate

Cloudstate is another example of the sidecar model, but this time for providing stateful abstractions for the serverless development model. It offers stateful primitives over gRPC for Event Sourcing, CQRS, Pub/Sub, Key/Value stores, and other use cases. Again, it is an example of sidecars and operators in action, this time for the serverless programming model.

Dapr

Dapr is a relatively young project started by Microsoft, and it also uses the sidecar model to provide developer-focused distributed system primitives. Dapr offers abstractions for state management, service invocation and fault handling, resource bindings, pub/sub, distributed tracing, and others. Even though there is some overlap between the capabilities provided by Dapr and a service mesh, the two are very different in nature. Envoy with Istio is injected and runs transparently to the service, and it represents an operational tool. Dapr, on the other hand, has to be called explicitly from the application runtime over HTTP or gRPC, and it is an explicit sidecar targeted at developers. It is a library of distributed primitives that is distributed and consumed as a sidecar, a model that may become very attractive for developers consuming distributed capabilities.
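To show what "called explicitly over HTTP" means, the Python sketch below builds the requests an application might send to a local Dapr sidecar's state API. It assumes the sidecar listens on Dapr's default HTTP port 3500 and that a state store component named statestore has been configured; both are assumptions about the environment. The requests are only constructed here, not sent, so the snippet runs without a sidecar present.

```python
import json
import urllib.request

DAPR_HTTP_PORT = 3500       # assumption: the sidecar uses Dapr's default HTTP port
STATE_STORE = "statestore"  # assumption: name of a configured state store component


def save_state_request(key, value):
    """Build (but do not send) the request that stores one key/value pair."""
    url = f"http://localhost:{DAPR_HTTP_PORT}/v1.0/state/{STATE_STORE}"
    body = json.dumps([{"key": key, "value": value}]).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": "application/json"}
    )


def get_state_request(key):
    """Build (but do not send) the request that reads one key back."""
    url = f"http://localhost:{DAPR_HTTP_PORT}/v1.0/state/{STATE_STORE}/{key}"
    return urllib.request.Request(url, method="GET")
```

With a sidecar running, passing either request to urllib.request.urlopen would perform the call. The application never links a Redis, Cassandra, or other store-specific client library; which store sits behind the name is decided in the sidecar's configuration.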

Camel K

Apache Camel is a mature integration library that is reinventing itself on Kubernetes. Its subproject Camel K relies heavily on the operator model to improve the developer experience and integrate deeply with the Kubernetes platform. While Camel K does not rely on a sidecar, through its CLI and operator it is able to reuse the same application container and execute any local code modification in a remote Kubernetes cluster in less than a second. This is another example of developer-targeted software consumption through the operator model.

More to Come

And these are only some of the pioneering projects exploring various approaches through sidecars and operators. More work is being done to reduce the networking overhead introduced by container-based distributed architectures, such as the Data Plane Development Kit (DPDK), a userspace application that bypasses the layers of the Linux kernel networking stack and accesses the network hardware directly. There is work in the Kubernetes project to create sidecar containers with more granular lifecycle guarantees. There are new Java projects based on GraalVM, such as Quarkus, that reduce resource consumption and application startup time, which makes more workloads attractive for sidecars. All of these innovations will make the sidecar model more attractive and enable the creation of even more such projects.

Sidecars Providing Distributed Systems Primitives

I would not be surprised to see projects emerge around more specific use cases: stateful orchestration of long-running processes, such as Business Process Model and Notation (BPMN) engines, in sidecars; job schedulers in sidecars; stateless integration engines, i.e. Enterprise Integration Patterns implementations, in sidecars; data abstractions and data federation engines in sidecars; OAuth2/OpenID proxies in sidecars; scalable database connection pools for serverless workloads in sidecars; application networks as sidecars; and so on. But why would software vendors and developers switch to this model? Let’s see a few of the benefits it provides.

Runtimes with Control Planes over Libraries

If you are a software vendor today, you have probably already considered offering your software to potential users as an API or a SaaS-based solution. This is the fastest software consumption model and a no-brainer to offer when possible. Depending on the nature of the software, you may also be distributing it as a library or a runtime framework. Maybe it is time to consider whether it can be offered as a container with an operator too. This mechanism of distributing software, and the resulting architecture, has some unique benefits that the library mechanism cannot offer.

Supporting Polyglot Consumers

By making libraries consumable through open protocols and standards, you open them up to all programming languages. A library that runs as a sidecar and is consumable over HTTP using a text format such as JSON does not require any specific client runtime library. Even when gRPC and Protobuf are used for low-latency and high-performance interactions, it is still easier to generate such clients than to include third-party custom libraries in the application runtime and implement certain interfaces.

Application Architecture Agnostic

The explicit sidecar architecture (as opposed to the transparent one) is a way of consuming software capabilities as a separate runtime behind a developer-focused API. It is an orthogonal feature that can be added to any application, whether that is monolithic, microservices-based, functions-based, actor-based, or anything in between. It can sit next to a monolith in a less dynamic environment, or next to every microservice in a dynamic cloud-based environment. It is trivial to create sidecars on Kubernetes, and doable on many other software orchestration platforms too.

Tolerant to Release Impedance Mismatch

Business logic is always custom and developed in house. Distributed system primitives are well-known commodity features, consumed off the shelf as either platform features or runtime libraries. You might be consuming software for state abstractions, messaging clients, networking resiliency, monitoring, and so on from third-party open source projects or companies. These third parties have their own release cycles, critical fixes, and CVE patches that impact your software release cycles too. When third-party libraries are consumed as a separate runtime (a sidecar), the upgrade process is simpler because the library sits behind an API and is not coupled with your application runtime. The release impedance mismatch between your team and the vendors of the third-party libraries you consume becomes easier to manage.

Control Plane Included Mentality

When a feature is consumed as a library, it is included in your application runtime and it becomes your responsibility to understand how it works, how to configure, monitor, tune and upgrade. That is because the language runtimes (such as the JVM) and the runtime frameworks (such as Spring Boot or application servers) dictate how a third-party library can be included, configured, monitored and upgraded.
When a software capability is consumed as a separate runtime (such as a sidecar or standalone container) it comes with its own control plane in the form of a Kubernetes operator.

That has a lot of benefits, as the control plane understands the software it manages (the operand) and comes with all the necessary management intelligence that would otherwise be distributed as documentation and best practices. What’s more, operators also integrate deeply with Kubernetes and offer a unique blend of platform integration and operand management intelligence out of the box. Operators are created by the same developers who create the operands, so they understand the internals of the containerized features and know how best to operate them. Operators are executable SREs in containers, and the number of operators and their capabilities is increasing steadily, with more operators and marketplaces coming up.

Software Distribution and Consumption in the Future

Software Distributed as Sidecars with Control Planes

Let’s say you are a software provider of a Java framework. You may distribute it as an archive or a Maven artifact. Maybe you have gone a step further and you distribute a container image. In either case, in today’s cloud-native world, that is not good enough. The users still have to know how to patch and upgrade a running application with zero downtime. They have to know what to back up and how to restore its state. They have to know how to configure their monitoring and alerting thresholds. They have to know how to detect and recover from complex failures. They have to know how to tune an application based on the current load profile.

In all of these and similar scenarios, intelligent control planes in the form of Kubernetes operators are the answer. An operator encapsulates platform and domain knowledge of an application in a declaratively configured component to manage the workload.
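At the heart of an operator is a reconciliation loop that compares the declared (desired) state with what is actually running and derives the steps to converge. The following Python function is a minimal sketch of that idea only; it does not use any Kubernetes client, and the state fields (version, replicas) and action names are hypothetical.

```python
def reconcile(desired, observed):
    """Compare declared and actual state; return the actions needed to converge."""
    if observed is None:
        # Nothing is running yet, so create the workload as declared.
        return [("create", desired["replicas"], desired["version"])]

    actions = []
    if observed["version"] != desired["version"]:
        # Domain knowledge lives here: upgrade before changing the replica count.
        actions.append(("rolling_upgrade", observed["version"], desired["version"]))
    if observed["replicas"] != desired["replicas"]:
        actions.append(("scale", observed["replicas"], desired["replicas"]))
    return actions
```

For example, reconcile({"version": "2.0", "replicas": 3}, {"version": "1.9", "replicas": 3}) yields a single rolling upgrade action, and calling it again once the observed state matches yields an empty list. A real operator runs this comparison continuously and encodes far more operational knowledge (backups, failure recovery, tuning) in the same place.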

Let’s assume that you are providing a software library that is included in consumer applications as a dependency. Maybe it is the client-side library of the backend framework described above. If it is in Java, for example, you may have certified it to run on a JEE server and provided Spring Boot Starters, Builders, Factories, and other implementations that are all hidden behind a clean Java interface. You may have even backported it to .NET too.

With Kubernetes operators and sidecars, all of that is hidden from the consumer. The factory classes are replaced by the operator, and the only configuration interface is a YAML file for the custom resource. The operator is then responsible for configuring the software and the platform so that users can consume it as an explicit sidecar or a transparent proxy. In all cases, your application is available for consumption over a remote API and is fully integrated with the platform features and even other dependent operators. Let’s see how that happens.

Software Consumed over Remote APIs Rather than Embedded Libraries

One way to think about sidecars is as something similar to the composition over inheritance principle in OOP, but in a polyglot context. It is a different way of organizing the application responsibilities by composing capabilities from different processes rather than including them in a single application runtime as dependencies. When you consume software as a library, you instantiate a class and call its methods, passing some value objects. When you consume it as an out-of-process capability, you access a local process. In this model, methods are replaced with APIs, in-process method invocations with HTTP or gRPC invocations, and value objects with something like CloudEvents. This is a change from application servers to Kubernetes as the distributed runtime, from language-specific interfaces to remote APIs, from in-memory calls to HTTP, from value objects to CloudEvents, and so on.
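As a sketch of replacing a value object with a CloudEvent, the Python function below wraps a payload in a CloudEvents 1.0 JSON envelope. The attribute names (specversion, id, source, type, time, datacontenttype, data) come from the CloudEvents specification; the event type, source, and payload values are made up for illustration.

```python
import json
import uuid
from datetime import datetime, timezone


def make_cloud_event(event_type, source, data):
    """Wrap a payload in a CloudEvents 1.0 JSON envelope."""
    return {
        # Required context attributes.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


# Hypothetical event: what would have been an Order value object in-process.
event = make_cloud_event(
    "com.example.order.placed", "/shop/checkout", {"sku": "A-1", "qty": 2}
)
payload = json.dumps(event)
```

The resulting payload can be posted to a sidecar over HTTP (in the specification's structured mode, with the content type application/cloudevents+json). Any language that can produce JSON can emit it, which is the polyglot property the in-process value object lacked.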

This requires software providers to distribute containers along with controllers to operate them. It also calls for IDEs that are capable of building and debugging multiple runtime services locally, CLIs for quickly deploying code changes into Kubernetes and configuring the control planes, and compilers that can decide what to compile into a custom application runtime, what capabilities to consume from a sidecar, and what to consume from the orchestration platform.

Software consumers and providers ecosystem

In the longer term, this will lead to the consolidation of standardized APIs for consuming common primitives in sidecars. Rather than language-specific standards and APIs, we will have polyglot APIs. For example, rather than the Java Database Connectivity (JDBC) API, the caching API for Java (JCache), and the Java Persistence API (JPA), we will have polyglot APIs over HTTP using something like CloudEvents: sidecar-centric APIs for messaging, caching, reliable networking, cron jobs and timer scheduling, resource bindings (connectors to other APIs and protocols), idempotency, SAGAs, and so on. All of these capabilities will be delivered with the management layer included in the form of operators, and even wrapped with self-service UIs. The operators are key enablers here, as they will make this even more distributed architecture easy to manage and self-operate on Kubernetes. The management interface of the operator is defined by the CustomResourceDefinition and represents another public-facing API that remains application-specific.

This is a big shift in mentality to a different way of distributing and consuming software, driven by the speed of delivery and operability. It is a shift from single-runtime to multi-runtime application architectures. It is a shift similar to what the hardware industry had to go through, from single-core to multicore platforms, when Moore’s law ended. It is a shift that is slowly happening as all the pieces of the puzzle fall into place: we have uniformly adopted and standardized containers, we have a de facto standard for orchestration in Kubernetes, possibly improved sidecars coming soon, rapid operator adoption, CloudEvents as a widely agreed standard, light runtimes such as Quarkus, and so on. With the foundation in place, applications, productivity tools, practices, standardized APIs, and the ecosystem will come too.

This post was originally published at The New Stack here.
