Blogroll

Microservices are a Commodity

I'm a big fan of Simon Wardley. I'm not able to follow everything he writes about, but even the old writings are very interesting and somehow it all makes sense in retrospective. If you haven't read anything from him, this video is an excellent start (or go back a few years to the longer talk). In this article I'll try to make sense about what is happening in Microservices world using Wardly's theory (and diagrams).

How things evolve?

Any idea, product, system, etc., starts with its genesis, and if it is successful, it evolves, others copy it and create new custom solutions from it. If it is still successful, it diffuses further and others create new products which get improved, extended, and becomes widespread and available, “ubiquitous”, well understood and more of a commodity. This lifecycle can be observed in many successful products when looked over the years such as computers, mobiles, virtualization/cloud, etc.

If we think about the Microservices, the architectural style, the projects that support it, platforms that were born from it, containers, DevOps practice, etc... each of them is at a certain stage in the diagram above. But as a whole the Microservices movement now is a pretty widespread, well understood concept, and turning into commodity already. And there are many indications confirming that, starting from number of publications, conferences, books, confirmed success stories on production, etc. No doubt about it, not any longer.

How did we get here?

The Microservices genesis started 5-6 years ago with Fred George and James Lewis from ThoughtWorks sharing their ideas. In the next few months Thoughtworks did lot of thinking, writing, and talking about it, while Netflix did a lot of hacking and created the first generation of Microservices libraries. Most of those libraries were still not very popular, and usable by the wider developer community, and only the pioneers and startup minded companies would try them at times. Then SpringSource joined the bandwagon, they wrapped and packaged the Netflix libraries into products and made all the custom build solutions accessible and easy to consume for Java developers. In the meantime, all this interest in Microservices drove further innovation and containers were born. That brought in another wave of innovation, more funding, shuffling, new set of tools, which made the DevOps theory a practise.



Containers being the primary means for deploying Microservices, soon created the need for container orchestration i.e. Cloud Native platforms. And today, the Cloud Native landscape is in transition, taking its next shape. If you look around there are multiple Cloud Native platforms, each of which started its journey from different point in time and a unique value proposition , but slowly getting into a common feature set, similar concepts and even standards. For example the feature parity of platforms such as AWS ECS, Kubernetes, Apache Mesos, Cloud Foundry, are getting close, each being feature rich, used in production, and comparable primitives. As you can see from the diagram above, now what becomes important as technology strategy is to bet on platforms with open standards, open source, large community and high chance of long term success. That means for example choosing a container runtime that is OCI compliant, choosing a tracing tool that is based on Open Tracing standard rather than custom implementation, supporting the industry standard logging and monitoring solutions, supported by companies that are good in commodity products.

Organization Types

According to Wardly, there are three types of people/teams/organizations and each is good at certain stages of the evolutions:
  • Pioneers are good in exploring uncharted territories and undiscovered concepts. They turn into life the crazy ideas.
  • Settlers are good in turning the half baked prototype into something useful for a larger audience. They build trust, understanding and refine the concept. They turn the prototype into a product, make it manufacturable and turn it profitable.
  • Town Planners are good in taking something and industrialise it taking advantage of economies of scale. They build the trusted platforms of the future which requires immense skill. They find ways to make things faster, better, smaller, more efficient, more economic and good enough.




    With this definition and the above table showing the characteristics of each type of organisation, we can make the following hypothetical classification:
    • Netflix are definitely the Pioneers. The creative, path finder people they have, the way the company is set around experimentation, uncertainty, their culture around freedom, responsibility, everything they have brought into the microservice services world makes them pioneers.
    • SpringSource for me are more the settler type. They already had a popular Java stack, and they managed to spot the trend in Microservices, and created a good consumable product in the form of Spring Boot and Spring Cloud.
    • Amazon, Google, Microsoft are the town planners. They may come late, but they come well prepared, with the long term strategy defined, with web scale solutions and unbeatable pricing. Platforms such as Kubernetes, ECS (not entirely sure about the latter as it is pretty closed) are build on over 10 years of experience, and indented to last long, and become the undustry standard.
    One important takeway from this section is that, not everything invented by pioneers is meant for general consumption. Pioneers move fast, and unless your organisation has similar characteristics, it might be difficult to follow all the time. On the other hand, town planners create products and services which are interoperable and based on open standards. That in the longer term becomes an important axes of freedom. 

    Conclusion

    In the Microservices world things are moving from uncharted to industrialised direction. Most of the activities are not that chaotic, uncertain and unpredictable. It is almost turning into a boring and dull activity to plan, design and implement Microservices. And since it is an industrialised job with a low margin, the choice of tool, and ability to swap those platforms plays a significant role.


    Last but not least, a nice side effect from this evolution is that we should hear less about Conway's Law, the two pizzas, and circuit breaker during conferences, and should hear more about managing Microservices at scale, automation, business value, serverless and new uncharted ideas from the pioneers in our industry.

    Spring Cloud for Microservices Compared to Kubernetes

    (This post was originally published on Red Hat Developers, the community to learn, code, and share faster. To read the original post, click here.)
    Spring Cloud and Kubernetes both claim to be the best environment for developing and running Microservices, but they are both very different in nature and address different concerns. In this article we will look at how each platform is helping in delivering Microservice based architectures (MSA), which areas they are good at, and how to use the best of both worlds in order to succeed in the Microservices journey.

    Background Story

    Recently I read a great article about building Microservice Architectures With Spring Cloud and Docker by A. Lukyanchikov. If you haven't read it, you should, as it gives a comprehensive overview of what it takes to create a simple Microservices based system using Spring Cloud. In order to build a scalable and resilient Microservices system that could grow to tens or hundreds of services, it must be centrally managed and governed with the help of a tool set that has extensive build time and runtime capabilities. With Spring Cloud, that involves implementing both functional services (such as statistics service, account service and notification service) and supporting infrastructure services (such as log analysis, configuration server, service discovery, auth service). A diagram describing such a MSA using Spring Cloud is below:
    Infrastructure Services
    MSA with Spring Cloud (by A. Lukyanchikov)
    This diagram covers the runtime aspects of the system, but doesn't touch on the packaging, continuous integration, scaling, high availability, and self-healing which are also very important in the MSA world. Assuming that the majority of Java developers are familiar with Spring Cloud, in this article we will draw a parallel and see how Kubernetes relates to Spring Cloud by addressing these additional concerns.

    Microservices Concerns

    Rather than doing a feature by feature comparison, let's take a look at wider Microservices concerns and see how Spring Cloud and Kubernetes approach those. The good thing about MSA today is that it is an architectural style with well understood benefits and trade-offs. Microservices enable strong module boundaries, with independent deployment and technology diversity. But they come at the cost of developing distributed systems and significant operational overhead. A key success factor is to focus on being surrounded by tools that will help you address as many MSA concerns as possible. Making the starting process quick and easy is important, but the journey to production is a long one, and you need to be this tall to get there.
    Microservices Concerns
    In the diagram above, we can see a list with the most common technical concerns (we are not covering the non-technical concerns such as organisation structure, culture and so on) that have to be addressed in a MSA.

    Technology Mapping

    The two platforms, Spring Cloud and Kubernetes, are very different and there is no direct feature parity between them. If we map each MSA concern to the technology/project used to address it in both platforms, we come up with the following table.
    Spring Cloud and Kubernetes Technologies
    The main takeaways from the above table are:
    • Spring Cloud has a rich set of well integrated Java libraries to address all runtime concerns as part of the application stack. As a result, the Microservices themselves have libraries and runtime agents to do client side service discovery, load balancing, configuration update, metrics tracking, etc. Patterns such as singleton clustered services and batch jobs are managed in the JVM too.
    • Kubernetes is polyglot, doesn't target only the Java platform, and addresses the distributed computing challenges in a generic way for all languages. It provides services for configuration management, service discovery, load balancing, tracing, metrics, singletons, scheduled jobs on the platform level and outside of the application stack. The application doesn't need any library or agents for client side logic and it can be written in any language.
    • In some areas both platforms rely on similar third party tools. For example the ELK and EFK stacks, tracing libraries, etc. Some libraries such as Hystrix, Spring Boot are useful equally well on both environments. There are areas where both platforms are complementary and can be combined together to create a more powerful solution (KubeFlix and Spring Cloud Kubernetes are such examples).

    Microservices Requirements

    In order to demonstrate the scope of each project, here is a table with (almost) end-to-end MSA requirements starting from the hardware on the bottom, up to the DevOps and self service experience at the top, and how it relates to Spring Cloud and Kubernetes platforms.
    Microservices Requirements
    In some cases both projects address the same requirements using different approaches and in some areas one project may be stronger than the other. But there is also a sweet spot where both platforms are complementary to each other and can be combined for a superior Microservices experience. For example Spring Boot provides Maven plugins for building single jar application packages. That combined with Docker and Kubernetes declarative Deployments and Scheduling capabilities makes running Microservice a breeze. Similarly, Spring Cloud has in-application libraries for creating resilient, fault tolerant Microservices using Hystrix (with bulkhead and circuit breaker patterns) and Ribbon (for load balancing). But that alone is not enough, and when it is combined with Kubernetes health checks, process restarts and auto-scaling capabilities turns Microservices into a true antifragile system.

    Strengths and Weaknesses

    Since both platforms are not directly comparable feature by feature, and rather than digging into each item, here are the (summarized) advantages and disadvantages of each platform.

    Spring Cloud

    Spring Cloud provides tools for developers to quickly build some of the common patterns in distributed systems such as configuration management, service discovery, circuit breakers, routing, etc. It is build on top of Netflix OSS libraries, written in Java, for Java developers.

    Strengths

    • The unified programing model offered by the Spring Platform itself, and rapid application creation abilities of Spring Boot, give developers a great Microservice development experience. For example, with few annotations you can create a Config Server, and few more annotation you can get the client libraries to configure your services.
    • There are a rich selection of libraries covering the majority of runtime concerns. Since all libraries are written in Java, it offers multiple features, greater control and fine tuning options.
    • The different Spring Cloud libraries are well integrated with one another. For example a Feign client will also use Hystrix for Circuit Breaking, and Ribbon for load balancing requests. Everything is annotation driven, making it easy to develop for Java developers.

    Weaknesses

    • One of the major advantages of the Spring Cloud is also its drawback - it is limited to Java only. A strong motivation for the MSA is the ability to interchange technology stacks, libraries and even languages when required. That is not possible with Spring Cloud. If you want to consume Spring Cloud/Netflix OSS infrastructure services such as configuration management, service discovery, load balancing, the solution is not elegant. The Netflix Prana project implements the sidecar pattern to exposes Java-based client libraries over HTTP to make it possible for applications written in Non-JVM languages that exist in the NetflixOSS eco-system, but it is not very elegant.
    • There is too much responsibility for Java developers to care about and the Java applications to handle. Each Microservice needs to run various clients for configuration retrieval, service discovery and load balancing. It is easy to set those up, but that doesn't hide the buildtime and runtime dependencies to the environment. For example, developers can create a Config Server with @EnableConfigServer annotation easily, but that is only the happy path. Every time developers want to run a single Microservice, they need to have the Config Server up and running. For a controlled environment, developers have to think about making the Config Server highly available and since it can be backed by Git or Svn, they need shared file system for it. Similarly for service discovery, developers need to start Eureka Server first. For a controlled environment, they need to cluster it with multiple instances on each AZ, etc. It feels like as a Java developers have to build and manage a non-trivial Microservices platform in addition to implementing all the functional services.
    • Spring Cloud alone has a shorter scope in the Microservices journey, and developers will also need to consider automated deployments, scheduling, resource management, process isolation, self healing, build pipelines, etc. for a complete Micorservices experience. For this point, I think it is not fair to compare Spring Cloud alone to Kubernetes, and a more fair comparison would be between Spring Cloud + Cloud Foundry (or Docker Swarm) and Kubernetes. But that also means that for a complete end-to-end Microservices experience, Spring Cloud must be supplemented with an application platform like Kubernetes itself.

    Kubernetes

    Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It is polyglot and provides primitives for provisioning, running, scaling and managing distributed systems.

    Strengths

    • Kubernetes is a polyglot and language agnostic container management platform that is capable of running both cloud native and traditional containerized applications. The services it provides such as configuration management, service discovery, load balancing, metrics collection, log aggregation are consumable by a variety of languages. This allows having one platform in the organisation that can be used by multiple teams (including Java developers using Spring framework) and serve multiple purposes: application development, testing environments, build environments (to run source control system, build server, artifact repositories), etc.
    • When compared to Spring Cloud, Kubernetes addresses a wider set of MSA concerns. In addition to providing runtime services, Kubernetes also lets you provision environments, set resource constraints, RBAC, manage application lifecycle, enable autoscaling and self healing (behaving almost like an antifragile platform).
    • Kubernetes technology is based on Google's 15 years of R&D and experience of managing containers. In addition, with close to 1000 committers, it is one of the most active Open Source communities on Github.

    Weaknesses

    • Kubernetes is polyglot and as such its services and primitives are generic and not optimised for different platforms such as Spring Cloud for JVM. For example configurations are passed to applications as environment variables or a mounted file system. It doesn't have the fancy configuration updating capabilities offered by Spring Cloud Config.
    • Kubernetes is not a developer focused platform. It is intended to be used by DevOps minded IT personnel. As such, Java developers need to learn some new concepts and be open for learning new ways of solving problems. Despite it being super easy to start a developer instance of Kubernetes using MiniKube, there is a significant operation overhead to install a highly available Kubernetes cluster manually.
    • Kubernetes is still a relatively new platform (2 years old) and it is still actively developed and growing. Therefore there are many new features added with every release which may be difficult to keep up with. The good news is that, this has been envisaged, and the API is extensible and backward compatible.

    Best of Both Worlds

    As you have seen both platforms have strengths in certain areas, and things to improve upon in other areas. Spring Cloud is a quick to start with, developer friendly platform, whereas Kubernetes is DevOps friendly, with a steeper learning curve, but covers a wider range of Microservices concerns. Here is a summary of those points.
    Strengths and Weaknesses
    Both frameworks address a different range of MSA concerns, and they do it in a fundamentally different way. The Spring Cloud approach is trying to solve every MSA challenge inside the JVM, whereas the Kubernetes approach is trying to make the problem disappear for the developers by solving it at platform level. Spring Cloud is very powerful inside the JVM, and Kubernetes is powerful in managing those JVMs. As such, it feels like a natural progression to combine them and benefit from best parts of both projects.
    Spring Cloud backed by Kubernetes
    With such a combination, Spring provides the application packaging, and Docker and Kubernetes provides the deployment and Scheduling. Spring provides in-application bulkheading through Hystrix thread pools, and Kubernetes provides bulkheading through resource, process and namespace isolation. Spring provides health endpoint for every microservice, and Kubernetes performs the healthchecks and traffic routing to healthy services. Spring externalizes and updates configurations, and Kubernetes distributes the configurations to every Microservice. And this list goes on and on.

    My Favourite Microservices Stack
    What about my favourite Microservices platform? I like them both. I like the developer experience offered by the Spring framework. It is all annotation driven, and there are libraries covering all kind of functional requirements. I also like Apache Camel (rather that Spring Integration) for anything to do with integration, connectors, messaging, routing, resilience and fault tolerance at the application level. Then for anything to do with clustering and managing multiple application instances, I prefer the magical Kubernetes powers. And whenever there is an overlap of functionality, such as for service discovery, load balancing, configuration management, I try to use the polyglot primitives offered by Kubernetes.

    A geeky laptop skin

    Recently we had a company meetup where the team leads talked about the Red Hat culture, mission and values. Inspired by all the talks and the stickers on the table that day, I've created my first laptop skin. And the result was this:
    Red Hat - the Open Source Catalyst
    In case you want to create something similar here the steps I took:

    1. Pick some nerdy words. I picked all the middleware project names that Red Hat contributes to (I've never realized there were that many).  Removed some that are not so cool any longer (EJB3) and added few other hot projects (Kubernetes, Apache isis, OFBiz).

    2. Created the cloud tag using the above words and the shadowman picture at Tagul. You can log in and edit my existing desing for a quicker sticker.

    3. Exported the image from step 2 and created a Mac skin at wrappz which costs around £15. (The following discount code should give you 20% off Bilgin20 and some royalties for me).

    Attached is the transparent image used for the skin and the editable version is at Tagul.

    From Fragile to Antifragile Software

    (This post was originally published on Red Hat Developers, the community to learn, code, and share faster. To read the original post, click here.)

    One of my favourite books is Antifragile by Nassim Taleb where the author talks about things that gain from disorder. Nacim introduces the concept of antifragility which is similar to hormesis in biology or creative destruction (accessible version creative-destruction) in economics and analyses it characteristics in great details. If you find this topic interesting, there are also other authors who have examined the same phenomenon in different industries such as Gary Hamel, C. S. Holling, Jan Husdal. The concept of antifragile is the opposite of the fragile. A fragile thing such as a package of wine glasses is easily broken when dropped but an antifragile object would benefit from such stress. So rather than marking such a box with "Handle with Care", it would be labelled "Please Mishandle" and the wine would get better with each drop (would be awesome woulnd't it).

    It didn't take long for the concept of antifragility to be used also for describing some of the software development principles and architectural styles. Some would say that SOLID prinsiples are antifragile, some would say that microservices are antifragile, and some would say software systems cannot be antifragile ever. This article is my take on the subject.

    According to Taleb, fragility, robustness, resilience and antifragility are all very different. Fragility involves loss and penalisation from disorder. Robustness is enduring to stress with no harm nor gain. Resilience involves adapting to stress and staying the same. And antifragility involves gain and benefit from disorder. If we try to relate these concepts and their characteristics to software systems, one way to define them would be as the following.

    Disorder 

    Different systems are affected by different kind of disorder, such as stress, time, change, volatility, debt, etc. For software systems, the main disorder is the change. The business is in a constant changing environment and the software needs to adapt to the business needs quickly. That is implementing new requirements, changes to existing functionality, even creating new business opportunities through innovation. A software system has to change all the time, otherwise it is obsolete.
    Apart from development time challenges, there are also runtime challenges for software systems too. Software systems are created and exists to add value by running in a production environment. And while doing so, they are under stress by end users and other systems. This is another kind of disorder, that software systems have to deal with.

    Fragile 

    This property describes systems that suffer when put under stress. Imagine a software project that is not easy to change at development time. For example if it is not easy to extend, modify and deploy to the production environment. Or a system that is not able to handle unexpected user inputs or external system failures and breaks easily. That's is a fragile system that is harmed by stress and penalised by change, a good example of fragile.

    Robust

    This is a system that can continue functioning in the presence of internal and external challenges without adaptation. Every system is robust up to a level. For example a bottle is robust until it reaches the level of breaking point. A software system can be made robust to handle unanticipated user input, or failures in external systems. For example handling NPE in Java, using try-catch-finally statements to handle unreliable invocations, having a thread pool to handle concurrent users, creating network connections using timeouts, are all examples of robustness for a software system. But a robust system doesn't adapt to a chaining environment and when the stress and change threshold is reached it would break. The qualities that define the robustness vary from system to system. An ATM for example needs to be robust and not fail in the middle of a transaction, whereas, a media streaming service can drop a frame or two, as long as it continues streaming under stress.

    Resilient

    A system is resilient when it can adapt to internal and external challenges by changing its method of operation. The key here is that the system is responding to stress by changing its internal behaviour rather than resisting stress with a predefined buffer.
    • A typical example here is the circuit breaker pattern which changes its internal state to adapt to the external system behaviour to protect itself. 
    • Another example would be using a retry mechanism with some backoff algorithm to handle transient failures in external systems.
    • A different technique for creating resilient systems is through graceful degradation, both on the UI and the server side. Rendering UI based on the user agent capabilities, or failing fast on the server side and fallback to some default values are commonly used techniques to adapt to failures.
    • Systems with self healing and autorepair capabilities are another example of resiliency. These systems are self-aware and can detect abnormalities and take corrective actions. For example Kubernetes/OpenShift will perform regular liveness checks for the running Docker containers and if they detect any anomalies they will restart the container and perform necessary backoffs until the application stabilises. This is another mechanism to cope with stress and improve application resilience.

    Overview

    Before looking at the next level of software evolution - the antifragile systems, let's visualise and summarise different kind of software system characteristics.
    • A fragile system is difficult to modify, and cannot cope with a changing environment. Even if it provides some value when used in a stable non changing environment, when faced with further stress and change, it quickly turns into a liability. Many organisations have applications (mainframes for example) which are impossible to change, very expensive to maintain, but still running on high cost as they are very critical to the business.
    • A robust system is the one that is implemented with certain buffers to handle change and stress. So when the stress level increases, it can withstand it for up to a level without losing its capabilities and still provide a good value. But a robust system does not adapt, and if the stress and change levels continue raising, such a system can also stops providing benefit and may turn into liability and run on loss.
    • A resilient system can handle more stress and change as it is designed and implemented with stress in mind and adaptability features. Even if it is not benefiting from stress, it can survive lot's of different kind of stress and change and provide value up to a greater degree.
    • An antifragile system is created with change in mind and it feeds from stress and change. It is much harder to create such a system (it is not a software system but a social-technical system) but once it is in place, it drives the business based on change, and even creates the change.

    Antifragile

    Many things in life are antifragile, such as the human body. When stressed at the right level, a muscle or bone wold come back stronger. But can a software system be antifragile? Certainly there are some tools, platforms, architectural styles, methodologies that can help create software with antifragile characteristics. Let's see some of the more popular ones.

    Auto scaling feature allows applications to handle increasing load by creating more instances of the application. To achieve that, the software system have to able to measure and then react to change and stress. Some good examples here are AWS autoscaling of EC2 instances at infrastructure level, and OpenShift autoscaling of application containers for the application level. This is a feature that transitions applications from resiliency to antifragility since the software system is shifting resources from one part of the system into another to respond to stress.

    Microservices. According to Taleb, at times of stress, the large is doomed to breaking. And that phenomen has been observed with mammals, corporations, administrations, etc. In software and large projects, this behaviour has been observed even more often. The bigger a software project is, the harder it becomes to change and react to stress. Microservices is an architecture styles that allows easier change by having autonomous services with well defined APIs - features that allow change. Russ Miles is a strong believer and proponent of Antifragile Software through Microservices (here is an intro video from him).

    Chaos engineering is a technique to create antifragility by evolving systems to survive chaos. Rather than waiting for things to break at the worst possible time, the idea of chaos engineering is to proactively inject failures in order to be prepared when disaster strikes. Netflix’s Simian Army is a very good materialization of this technique, designed to generate failures and help isolate system’s weaknesses.

    Continuous deployments to a production environment creates continuous partial system failures and forces organisations to react better to failures through redundancy, rolling upgrades, rollbacks, and avoiding single points of failure. Other techniques such as canary release, blue-green deployments, are used to reduce the risk of introducing new software into production environment. Some other methods such as A/B testing even allows experimenting with change and measuring its effect, in order to gain from change.

    The human element

    Antifragility is not an universal characteristics. Different systems are antigragile towards different kind of disorder. For example Chaos Monkey will make your system antifragile towards EC2 deaths, and autoscaller will make your system respond to specific type of load. But your systems will not be antifragile towards other kinds of stress. And if you look above at the different ways of introdusing antifragility into software systems, all of them are means for making the social-technical system antifragile (and not only the software system):
    • Chaos engineering forces human feedback to the injected randomness and makes the system antifragile.
    • Microservices alone do not make a software system antifragile. But microservices combined with appropriate organisational and team structure enables antifragility. If microservices are a way for architecting applications into autonomous structures, DevOps is a means for organizing teams into similar structures. You need both in order to benefit from them and gain from disorder.
    • Continuous deployment pipeline is a tool that allows teams to react to stress faster, introduce or retract change faster and generally use the stress as the driver.
    • Similarly, iterative development is not enough to benefit from changing environment. But iterative development with open and honest retrospective rituals is.

    Innovate

    If it takes few weeks to create a new developer environment, you can not react to change. If it takes three months to release a new feature, you can not react to change. If your ops team is watching the metrics dashboard and manually scaling applications up and down, you cannot embrace change. If the team is hardly catching up with the change, there is no way to gain from change. But once you put an appropriate organisational structure, the right tools and culture in place, then you can start gaining from change. Then you can afford having Friday Hackathons, then you can start exploring open source projects and start contributing to them, then you can start open sourcing your internal projects and benefit from a community, and generally be the change itself. And why not the Netflix or the Amazon of tomorrow.

    Visualizing Integration Applications

    (This post was originally published on Red Hat Middleware Blog. To read the original post, click here.)
    Since I've changed role and started performing architect duties, I have to draw more boxes and arrows than write code. There are ways to fight that, like contributing to open source projects during sleepless nights, POCs, demos, but drawing boxes to express architectures and designs is still big part of it. This post is about visualising distributed messaging/SOA/microservices applications in agile (this term has lost its meaning, but cannot find a better one in this case) environments. What I like about the software industry in recent years is that the majority of organisations I've worked with, value the principles behind lean and agile software development methodologies. As long as it is practical, everyone strives to deliver working software (rather than documentation), deliver fast (rather than plan for a long time), eliminate waste, respond to change, etc. And there are management practises such as Scrum and Kanban, and Technical Practises from Extreme programming (XP) methodology such as unit testing, pair programing, and other practises such as CI, CD, DevOps to help implement the aforementioned principles. In this line of thinking, I decided to put together a summary of the design tools and diagrams I find useful in my day to day job while working with distributed Systems.

    Issues with 4+1 View Model and death by UML

    Every project kicks off with big ambitions, but there is never enough time to do things perfectly, and at the end we have to deliver whatever works. And that is a good thing, it is the way the environment helps us avoid gold plating, YAGNI, KISS, etc. so we do just enough and adapt to chance.

    Looking back, I can say that most of the diagrams I've seen around are inspired by 4+1 view model of Philippe Kruchten which has Logical, Development, Process and Physical views.
    4+1 View Model
    4+1 View Model
    I quite like the ideas and the motivation behind this framework: using separate views and perspectives to address specific set of constraints and targeting the different stakeholders. That is a great way of describing complex software architectures. But I have two issues with using this model for integration applications.

    Diagram applicability

    Typically these views are expressed through UML, and for each view, you have to use one or more UML diagrams. The fact that I have to use 15 types of UML diagrams to communicate and express a system architecture in an accessible way, defeats its purpose.
      Death by UML
      Death by UML
      With such a complexity, the chances are that there are only one or two people in the whole organisation who has the tools to create, ability to understand and maintain these diagrams. And having hard to interpret, out of date diagrams is as useful as having out of date gibberish documentation. These diagrams are too complex and with limited value, and very quickly they turns into liability that you have to maintain rather than asset expressing the state of a constantly changing system.
      Another big drawback is that the existing UML diagram types are primarily focused on describing object-oriented architectures rather than Pipes and Filters architectures. The essence of messaging applications is around interaction styles, routing, data flow rather than structure. Class, object, component, package, and other diagrams are of less value for describing a Pipes and Filters based processing flows. Behavioural UML diagrams such as activity and sequence get closer, but still cannot express easily concepts such filtering and content based routing which are fundamental part of integration applications.

      View applicability 

      Having different set of views for a system, to address different concerns is a great way of expressing intend. But the existing views of 4+1 model doesn't reflect the way we develop and deploy software nowadays. The idea that you have a logical view first, which then leads to development and process view, and those lead to physical view is not always the case. The systems development life cycle, is not following the (waterfall) sequence of requirement gathering, designing, implementing and maintaining.
        Software Development Lifecycle
        Software Development Lifecycle
        Instead other development methodologies such as agile, prototyping, synchronise and stabilise, spike and stabilise are used too. In addition to the process, the stakeholders are changing too. With practises such as DevOps, developers have to know about the final physical deployment model, operations have to know about the application processing flows too. Modern architectures such as microservices affect the views too. Knowing one microservice is in a plethora of a services is not very useful. Knowing too much about all the services is not practical either. Having the right abstraction level to have a system wide view with just enough details becomes vital.

        Practical Visualisation for Integration Applications

        There are some good tips for drawing in general. The closest thing that has been working for me is described by Simon Brown as C4 model. (You should also get a free copy of Simon's awesome The Art of Visualising Software Architecture book). In his model, Simon is talking about the importance of a common set of abstractions rather than common notation (such as UML) and then using simple set of diagrams for different level of abstractions: system context diagram, container diagram, component diagram and class diagram. I quite like this "Outside-In" approach, where you first have 10000 foot view and with each next level, going deeper with more detailed views.
        C4 is also not an exact match for middleware/integration applications either, but it is getting closer. If we were to use C4 model, then system context diagram would be one box that says ESB (or middleware, MOM, microservices) with tens of arrows from north to south. Not very useful. Container diagram is quite close, but the term container is so overloaded (VM, application container, docker container) which makes it less useful for communication. Component and class diagrams are also not a good fit as Pipes and Filter architectures are focused around Enterprise Integration Patterns, rather than classes and packages.
        So at the end, what is it that worked for me? It is the following 3 types of diagrams which abbreviate as SSD (not as cool as C4):  System Context Diagram, Service Design Diagram and Deployment Diagram.

        System Context Diagram

        The aim of this model is to show all the services (whether they are SOA, Microservices) with their inputs and outputs. Ideally having the external systems on the north, the services in the middle section, and internal services in the south. Or you could use both external and internal services on both side of the middleware layer as shown below. Also having the protocol (such as HTTP, JMS, file) on the arrows, with the data format (XML, JSON, CSV) gives useful context too, but it is not mandatory. If there are too many services, you can leave the protocol and the data format for the service level diagrams. I use the direction of the arrow to indicate which service is initiating the call rather than the data flow direction.
        System Context Diagram
        System Context Diagram
        Having such a diagram gives a good overview of the scope of a distributed system. We can see all the services, the internal and external dependencies, the types of interaction (with protocol and data format), and the call initiator.

        Service Design Diagram

        The aim of this diagram is to show what is going on in each box representing a middleware service from the System Context Diagram. And the best diagram for this is to use EIP icons and connect those as message flows. A service may have a number of flows, support a number of protocols, implement real time and/or batch behaviour.

        Service Design Diagram
        Service Design Diagram
        At this level, we want to show all possible data flows implemented by a specific service, from any source to any destination.

        Deployment Diagram

        The previous two diagrams are the logical views of the system as a whole and each service separately. With the deployment diagram, we want to show where each service is going to be deployed. May be there will be multiple instances of the same service running on multiple hosts. May be some services will be active on one host, and passive on the other. May be there will be a load balancer fronting the services, etc.
        Deployment Diagram
        Deployment Diagram
        The deployment diagram is supposed to show how individual services and the system as a whole relates to the host systems (regardless whether that is physical or virtual).

        What tools do I use?

        The System Context and the Deployment Diagrams are composed only of boxes and arrows and do not require any specail tools. For the Service Design Diagram, you will need a tool that has the Enterprise Integration Pattern icons installed. So far, I have seen the following tools with EIP icon support:
        • Mac OS: OmniGraffle with icons from graffletopia.This is what I've used to create all the diagrams for Camel Design Patterns book.
        • Windows: Enterprise Architect by Sparx Systems with EIP icons by Harald Westphal. There are also MS Visio Stencils here.
        • Web: LucidCharts which ships EIP icons by default. That is the easiest to use tool and accessible from everywhere. It is my favourite tool (with MS Visio import/export options for Windows users), and has free account plans to start with.
        • Web: DrawIO another web tool with EIP icons. The beauty of this tool is that it forces you to use your own storage options for the diagrams, such as: google drive, dropbox, locally, etc. So you own the diagrams and keep them safe.
        • Canva: not specifically for EIP diagrams, but in general this site has a nice templates and photos to use in various presentations. Check it out.
        Other development tools that could be also used for creating EIP diagrams are:

        System Context Diagram is useful to show the system wide scope and reach of the services, Service Design Diagram is good for describing what a service does, and Deployment Diagram is useful mapping all that into something physical. In IT, we can expand work and fill up all the available time with things to do. I'm sure given more time, we can invent 10 more useful views. But w/o the above three, I cannot imagine myself describing an integration application. As Antoine de Saint-Exupery put it long ago: "Perfection is finally attained not when there is no longer anything to add but when there is no longer anything to take away."

        About Me