MuleSoft and event-driven architecture
- June 19, 2022
The objective of this document is to provide an in-depth analysis of how event-driven architecture (EDA) can be used within the Anypoint Platform, especially when designing APIs based on API-led connectivity principles, while stressing that EDA isn’t a silver bullet for every scenario. This document therefore aims to help architects and developers narrow down the use cases where EDA can be implemented to gain the maximum benefit, and it offers a glimpse of how EDA can be used to solve everyday, real-world challenges and deliver event-driven services.
EDA definition
What is an event?
At the most basic level, an event is an occurrence or a change in the state of a system that is of particular importance. Events are immutable pieces of information. They cover a wide spectrum, from a simple keystroke or mouse click to an application-generated event, an airport gate change or a financial transaction.
Why do we care about it?
Synchronous request/reply patterns are useful when we know exactly what to ask for and what to expect, but they introduce tight coupling. In the modern era, data is the new oil, yet organizations often struggle to connect their many sources of data, and data silos are the biggest obstacle to digital transformation. Modern enterprises require data to be shared among multiple systems, and synchronous request/response patterns become a limiting factor or bottleneck in such situations.
In today’s world, we have a plethora of events generated from various sources, such as the Internet of Things (IoT), applications, people and bots, to name a few. These events should be correlated and integrated in a rapid, agile way to make them more predictive, preventive and actionable in order to keep abreast of the competition.
On top of this, enterprises with monolithic legacy applications increasingly need to adopt cloud capabilities, integrate with modern Software as a Service (SaaS) applications and open up critical data silos. This is where EDA helps.
What is EDA?
For the real-world problems described above, the answer comes in the form of event-driven architecture. It is primarily a software architecture, or design pattern, in which a software component responds to one or more event notifications. Decoupled applications publish and subscribe to events asynchronously, usually through a combination of a modern event broker and an iPaaS solution such as MuleSoft.
What’s the future of EDA?
Events never travel; they just occur. Everything else needs to be portable. The more each service can stand alone, the more resilient and fault-tolerant the system is. The economics of this programming approach make broader adoption in the field likely.
Gartner identified event-driven architecture as a top tech trend for 2018 and predicted that by 2022, event notifications will form part of more than 60% of new digital business solutions. By 2022, more than 50% of business organizations will participate in event-driven digital business ecosystems.
Hyper-automation is about automating anything that can be automated. For example, the #1 use case for artificial intelligence is process automation, according to Gartner. On the path to hyper-automation is a spectrum of technologies, from robotic process automation (RPA) to artificial intelligence (AI). Event processing via APIs and event-driven architecture is seen as an underlying technology that will underpin the march toward hyper-automation.
EDA in Anypoint API-led connectivity
API-led connectivity is a standardized way to connect data and applications with reusable, composable APIs designed to perform a specific role, such as:
- System API: unlocking data from back-end systems
- Process API: composing unlocked data into processes and handling business logic and transformation
- Experience API: delivering an experience for an intended audience
Each of these Anypoint APIs can publish its events to a common queue or topic, gaining the advantages of event-driven services and becoming more autonomous. However, certain rules or best practices need to be followed in order to maintain the API-led connectivity approach, as detailed below:
Best practices
- Any API that publishes events should define its own queues.
- Destinations belong logically to the same API-led connectivity tier (System APIs publish system events, Process APIs publish process events, Experience APIs publish experience events).
- Events should be consumed by lower tiers only, not the other way round: a System API may consume events published by a Process API, but a Process API shouldn’t consume events published by a System API.
- A System API shouldn’t consume directly from an Experience API, bypassing the Process API.
How to implement EDA in API-led connectivity
With the help of an example of API-led connectivity, let’s discuss common scenarios where EDA can be implemented, irrespective of the broker, which can be internal (VM queues, Anypoint MQ) or external (Solace, Kafka or RabbitMQ).
Let’s take a simple example where a user triggers an event or notification and doesn’t expect a response. We have the following APIs based on the API-led connectivity approach.
- An experience API (EAPI-1) is exposed to an application/user. This layer takes in the request, does basic checks and calls the Process layer API (PAPI-1).
- Process layer API (PAPI-1) transforms the data and submits it to the System API (SAPI-1).
- SAPI-1 validates data against a database. Once successfully validated, it calls the next process API (PAPI-2).
- PAPI-2 converts data into the required format and calls the respective System APIs, which connect to back-end systems.
- SAPI-2 has sole responsibility to update data into Salesforce.
- SAPI-3 sends analytical data to Splunk.
With the above example, let’s discuss where we can fit EDA within API-led connectivity to improve the overall user experience and overcome the drawbacks of tight coupling caused by synchronous REST request/response calls. The idea is to use synchronous communication where necessary and use eventing as much as possible (except in scenarios where an immediate response is required).
Latency and user experience
Problem: In a chain of microservices, the overall response time is the sum of the response times of all the APIs in the chain. In API-led connectivity, this can pose a problem if any of the process or system APIs is slow. In the above scenario, if the database response is slow, the delay ripples all the way back to the experience API.
Solution: With the introduction of eventing, the experience API responds as soon as it has delivered the message to the message broker, as sketched below. Guaranteed delivery is ensured by the messaging layer, which persists the message until the database update succeeds.
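As a minimal, hedged sketch (the endpoint path, queue name and configuration names below are illustrative assumptions, and global configs and namespace declarations are omitted), the experience API could publish the event to Anypoint MQ and respond immediately, while a system API consumes the event and performs the slow database write:

```xml
<!-- Experience API (EAPI-1): accept the request, hand the event to the broker, respond at once -->
<flow name="eapi-orders-flow">
    <http:listener config-ref="HTTP_Listener_config" path="/orders"/>

    <!-- basic checks / enrichment would happen here -->

    <!-- deliver the event to the broker; downstream processing is now decoupled -->
    <anypoint-mq:publish config-ref="Anypoint_MQ_Config" destination="order-events"/>

    <!-- the HTTP response is returned as soon as the publish succeeds,
         without waiting for the database update -->
    <set-payload value='{"status": "ACCEPTED"}' mimeType="application/json"/>
</flow>

<!-- System API (SAPI-1): consume the event and perform the (potentially slow) database update -->
<flow name="sapi-order-persist-flow">
    <anypoint-mq:subscriber config-ref="Anypoint_MQ_Config" destination="order-events"/>
    <db:insert config-ref="Database_Config">
        <db:sql>INSERT INTO orders (payload) VALUES (:payload)</db:sql>
        <db:input-parameters>#[{ payload: write(payload, "application/json") }]</db:input-parameters>
    </db:insert>
</flow>
```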
Benefits
- Experience API doesn’t need to wait for all the back-end processes to complete execution, thus providing a better user experience.
- Eventual consistency is guaranteed.
- It is useful in scenarios that involve long-running processes.
Error handling
Problem: Consider a scenario where one service calls or depends on another service, and the called service is unavailable or errors out. In this case, the responsibility for retry, error-handling or rollback logic most often falls on the calling service. This logic might need to be duplicated if the service depends on multiple APIs that each require different error handling, adding complexity and maintenance overhead along with precious worker core consumption.
Solution: With the introduction of eventing, the calling API only needs to connect to the message broker and publish the event to a queue or topic; the broker takes care of persistence and redelivery, as sketched below.
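For illustration, here is a hedged sketch of the consumer side using the Anypoint MQ connector’s manual acknowledgement mode (the queue, configuration and sub-flow names are assumptions): the message stays on the queue until the downstream call succeeds, and on failure it is returned to the broker for redelivery, so neither the publisher nor the consumer needs hand-written retry logic.

```xml
<!-- System API (e.g. SAPI-2): consume the event and update the dependent system -->
<flow name="sapi-salesforce-sync-flow">
    <!-- MANUAL mode: the message is only removed from the queue when we explicitly ack it -->
    <anypoint-mq:subscriber config-ref="Anypoint_MQ_Config"
                            destination="customer-events"
                            acknowledgementMode="MANUAL"/>
    <try>
        <!-- call the dependent system (Salesforce in this example) -->
        <flow-ref name="update-salesforce-subflow"/>

        <!-- success: acknowledge so the broker can discard the message -->
        <anypoint-mq:ack config-ref="Anypoint_MQ_Config" ackToken="#[attributes.ackToken]"/>

        <error-handler>
            <on-error-continue>
                <!-- failure: negatively acknowledge so the broker redelivers the message later -->
                <anypoint-mq:nack config-ref="Anypoint_MQ_Config" ackToken="#[attributes.ackToken]"/>
            </on-error-continue>
        </error-handler>
    </try>
</flow>
```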
Benefits
- Calling service doesn’t need to worry whether the message has been received by the consumer. The message broker takes care of persisting the message until it’s successfully consumed by the consumer service, thus increasing the overall reliability and availability.
- Overall developer effort and worker core/memory utilization are reduced, as the complex retry logic and data persistence are offloaded to the messaging layer.
- If there are multiple instances of target or consumer, the message can be re-delivered to the next available instance if the first instance is down.
- The overall user experience is improved.
- Redelivery is a common feature of message brokers, so Mule workers are spared the costly CPU needed to implement redelivery and retry logic in code.
Scalability and resource utilization
Problem: Imagine a scenario where PAPI-2 involves complex transformations and business logic. Especially during peak hours, this can cause a ripple effect that spikes the resource utilization of all the upstream, synchronously dependent services. The effect cascades across all the services because one heavily loaded service becomes a bottleneck in a synchronous request/response layout. This in turn may demand a scale-up of resources across the entire chain of services, which isn’t cost-efficient.
Solution: Decouple PAPI-2 and scale it independently rather than scaling up the entire chain of services.
Benefits
- Since PAPI-2 is decoupled, you need to increase resources or scale out only PAPI-2 and not the entire chain of APIs as now the cascading effect is negated.
- You get a better user experience as the impact is not cascaded to experience API.
- If the incoming rate of messages is higher than the rate at which PAPI-2 can process these messages, the message broker can throttle and persist messages for PAPI-2 to consume at its own pace without impacting other services.
- It allows a more granular vertical scale-up of PAPI-2.
- It allows horizontal scaling of PAPI-2, as each message can be consumed by separate instances of PAPI-2 in a round-robin fashion.
- It allows load-balancing of incoming payload across multiple instances of PAPI-2.
Reusability
Problem: Imagine a scenario where an application sends customer data that is received by the experience API, forwarded to the process API, and finally written to a database by the system API. Some time later, the same data is required by other systems.
Solution: For use cases where data reuse is expected, the publishing API can publish the data to a topic, from where messages can be broadcast to any number of new APIs that require the same data, as is the case in many enterprises today. A sketch follows below.
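As a hedged illustration with Anypoint MQ (the exchange, queue and flow names are assumptions), the producer publishes once to a message exchange; each interested API gets its own queue bound to that exchange (the bindings themselves are configured in the Anypoint MQ console), so new consumers can be added without changing the producer:

```xml
<!-- Producer (invoked from the main processing flow via flow-ref):
     publish once to an exchange instead of a specific queue -->
<flow name="papi-customer-publish-flow">
    <anypoint-mq:publish config-ref="Anypoint_MQ_Config" destination="customer-data-exchange"/>
</flow>

<!-- Consumers: each subscribes to its own queue bound to the exchange -->
<flow name="fraud-detection-consumer-flow">
    <anypoint-mq:subscriber config-ref="Anypoint_MQ_Config" destination="customer-data-fraud-queue"/>
    <logger level="INFO" message="Received customer event for fraud scoring"/>
</flow>

<flow name="analytics-consumer-flow">
    <anypoint-mq:subscriber config-ref="Anypoint_MQ_Config" destination="customer-data-analytics-queue"/>
    <logger level="INFO" message="Received customer event for analytics"/>
</flow>
```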
Benefits
- It drastically reduces development time, as new APIs just need to plug in to the message broker and subscribe to data.
- It complements the reusability principle of API-led connectivity and can act as a reliable backbone in Anypoint network.
- Most modern message brokers, such as Solace PubSub+ and Kafka, have built-in replay capabilities that can replay all previous messages from the beginning, or from a particular timestamp or message ID, for new applications to consume, eliminating the need to build separate APIs to sync data.
- It is also useful in a variety of scenarios where you can plug in and listen to live data for fraud detection, audit, analytics and trend analysis without impacting the main flow or reinventing the wheel.
Common EDA patterns
Apart from the above-mentioned scenarios, below are some common EDA patterns that can be implemented within API-led connectivity to achieve robust architectural designs. We won’t go into depth, as these patterns are widely discussed, and it’s up to the architect to decide how best to utilize them within the API-led connectivity framework to achieve maximum benefit.
CQRS pattern: Command Query Responsibility Segregation (CQRS) separates the responsibilities of commands and queries in a system. That means slicing application logic vertically and segregating state mutation (commands, or writes) from data retrieval (queries, or reads).
Saga pattern: Used for implementing transactions that span different APIs. Each business transaction that spans multiple services is implemented as a saga: a sequence of local transactions. Each local transaction updates the database and publishes a message or event to trigger the next local transaction in the saga. If a local transaction fails because it violates a business rule, the saga executes a series of compensating transactions that undo the changes made by the preceding local transactions. There are two ways of coordinating sagas:
- Choreography: Each local transaction publishes domain events that trigger local transactions in other services.
- Orchestration: An orchestrator (object) tells the participants what local transactions to execute.
Event sourcing: Event sourcing persists the state of a business entity, such as an order or a customer, as a sequence of state-changing events. Whenever the state of a business entity changes, a new event is appended to the list of events. Since saving an event is a single operation, it’s inherently atomic. The application reconstructs an entity’s current state by replaying the events.
Database per service: One of the core characteristics of the microservices architecture is the loose coupling of services. However, there may be scenarios where data needs to be synced across different databases. In this case, a copy of the data is published to an event broker by the owning API and consumed at the other end by another API that updates its own database.
AsyncAPI
Just as we define a RESTful API Modeling Language (RAML) or OpenAPI Specification (OAS) definition for REST-based APIs, we have the AsyncAPI specification for event-driven APIs. It’s the industry standard for defining asynchronous APIs, and its goal is to make event-driven architectures as easy to build and maintain as REST APIs.
The AsyncAPI Specification is a project used to describe and document event-driven APIs in a machine-readable format. It’s protocol-agnostic, so you can use it for APIs that work over any protocol. The AsyncAPI Specification defines a set of files required to describe such an API. These files can then be used to create utilities, such as documentation, integration and/or testing tools.
The AsyncAPI specification does not assume any kind of software topology, architecture or pattern. Therefore, a server MAY be a message broker, a web server or any other kind of computer program capable of sending and/or receiving data. However, AsyncAPI offers a mechanism called “bindings” that aims to help with more specific information about the protocol and/or the topology.
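For instance, a channel definition could carry an AMQP binding so consumers know which concrete queue sits behind the logical channel. The snippet below is only an illustration, and the channel, queue name and message reference are assumptions:

```yaml
channels:
  customer/created:
    bindings:
      amqp:
        is: queue
        queue:
          name: customer-created-queue
          durable: true
    subscribe:
      message:
        $ref: '#/components/messages/CustomerCreated'
```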
Ref: AsyncAPI Initiative for event-driven APIs
Designing AsyncAPI in Anypoint
An AsyncAPI specification can be defined in Anypoint Design Center in YAML or JSON (JavaScript Object Notation) format. At a high level, the document provides details of the message schemas, the producer or consumer application, the channels (queues), broker details and security. When published to Exchange, this provides developers with the necessary information to publish or subscribe to events.
Currently, the option to scaffold it into Anypoint Studio isn’t available. However, there are multiple tools available to import the specification and generate code for popular languages.
Here are the high-level steps:
- Navigate to Design Center and click on Create New > New AsyncAPI.
- Select format YAML or JSON.
- Define the AsyncAPI and publish it to Exchange.
High-level AsyncAPI definition
Below is a simple snippet of an AsyncAPI definition to get started.
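This sketch is illustrative only; the title, server, channel and message fields are placeholder assumptions rather than a prescribed template:

```yaml
asyncapi: '2.0.0'
info:
  title: Customer Events API
  version: '1.0.0'
  description: Publishes an event whenever a customer record is created or updated.
servers:
  production:
    url: mq.example.org:5671
    protocol: amqps
    description: Illustrative broker endpoint
channels:
  customer/updated:
    description: Carries customer change events.
    subscribe:
      summary: Receive customer change events.
      message:
        $ref: '#/components/messages/CustomerChanged'
components:
  messages:
    CustomerChanged:
      name: CustomerChanged
      contentType: application/json
      payload:
        type: object
        properties:
          customerId:
            type: string
          status:
            type: string
            enum: [CREATED, UPDATED]
          timestamp:
            type: string
            format: date-time
```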
MuleSoft connectors for EDA
Anypoint Platform provides a plethora of connectors on Exchange that enable developers to connect to multiple message/event brokers and messaging protocols. Some of the common ones are:
- Google Pub/Sub Connector – Mule 4 (mulesoft.com)
- Amazon Kinesis Data Streams Connector – Mule 4 (mulesoft.com)
- Solace PubSub+ Connector – Mule 4 (mulesoft.com)
- AMQP Connector – Mule 4 (mulesoft.com)
- Anypoint MQ Connector – Mule 4 (mulesoft.com)
- IBM MQ Connector – Mule 4 (mulesoft.com)
- JMS Connector – Mule 4 (mulesoft.com)
- MQTT Connector (mulesoft.com)
- RabbitMQ Connector – Mule 3 (mulesoft.com)
- Apache Kafka Connector – Mule 4 (mulesoft.com)
Event-driven architecture can enhance API-led connectivity. The traditional request-driven model and the event-driven model are complementary. This combination of events and APIs gives rise to solutions that are more reliable, loosely coupled, scalable, reusable, robust, load-balanced and fault-tolerant, while enhancing the overall user experience. It also makes the architecture future-proof, as new services requiring the same data can easily be incorporated into the Anypoint network.
It’s not a silver bullet, but the idea is to use synchronous communication where necessary and eventing as much as possible. Architects should consider adopting EDA at the beginning of any new project to make full use of its benefits.
— By Tariq M. Syed