MuleSoft For Each, Parallel For Each and batch processing comparison

August 31, 2020

As we know, MuleSoft provides For Each, Parallel For Each and Batch Processing to process a list of records. In this technical article, we'll compare them to see which use cases each of them is suited for.

What is the For Each scope?

  • The For Each scope splits a collection into elements, processes them iteratively through the processors embedded in the scope and then returns the original message to the flow.

[Figure: For Each scope]
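
As an illustration, here is a minimal sketch of a Mule 4 flow using the For Each scope. The flow name, payload and logger messages are placeholders for this example, not part of any official sample:

```xml
<!-- Minimal For Each sketch (illustrative names; assumes the payload is a list) -->
<flow name="forEachDemoFlow">
    <!-- batchSize groups records per iteration; the default is 1 (one record at a time) -->
    <foreach collection="#[payload]" batchSize="1">
        <!-- Inside the scope, payload is the current element -->
        <logger level="INFO" message="#['Processing record: ' ++ write(payload, 'application/json')]"/>
    </foreach>
    <!-- After the scope, the original collection is restored as the payload -->
    <logger level="INFO" message="#['Back to the original collection of ' ++ (sizeOf(payload) as String) ++ ' records']"/>
</flow>
```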

What is the Parallel For Each scope?

  • The Parallel For Each scope enables you to process a collection of messages by splitting the collection into parts that are simultaneously processed in separate routes within the scope of any limitation configured for concurrent processing. After all messages are processed, the results are aggregated following the same order they were in before the split and then the flow continues.
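
For comparison, here is a minimal sketch of the Parallel For Each scope (available from Mule 4.2 onwards). The maxConcurrency value and names are illustrative:

```xml
<!-- Minimal Parallel For Each sketch (illustrative names; assumes the payload is a list) -->
<flow name="parallelForEachDemoFlow">
    <!-- maxConcurrency caps how many routes run at the same time -->
    <parallel-foreach collection="#[payload]" maxConcurrency="4">
        <logger level="INFO" message="#['Processing record: ' ++ write(payload, 'application/json')]"/>
    </parallel-foreach>
    <!-- After the scope, the payload is the accumulated list of results, in the original order -->
    <!-- A failure in any route raises a MULE:COMPOSITE_ROUTING error instead of stopping the other routes -->
</flow>
```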

What is a Batch Job?

  • Mule allows you to process messages in batches. You can initiate a batch job scope, which splits messages into individual records, performs actions upon each record and then reports on the results and potentially pushes the processed output to other systems or queues.

[Figure: Batch job]
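
A minimal Batch Job sketch for Mule 4 is shown below. The job name, step names and the aggregator size are illustrative, and the configuration assumes the Batch module namespace (xmlns:batch) is declared:

```xml
<!-- Minimal Batch Job sketch (illustrative names) -->
<flow name="batchDemoFlow">
    <batch:job jobName="recordProcessingJob">
        <batch:process-records>
            <batch:step name="processStep">
                <logger level="INFO" message="#['Processing record: ' ++ write(payload, 'application/json')]"/>
            </batch:step>
            <batch:step name="groupStep">
                <!-- Record grouping: the aggregator hands this block a list of up to 100 records -->
                <batch:aggregator size="100">
                    <logger level="INFO" message="#['Aggregated ' ++ (sizeOf(payload) as String) ++ ' records']"/>
                </batch:aggregator>
            </batch:step>
        </batch:process-records>
        <batch:on-complete>
            <!-- Here the payload is a BatchJobResult with counters such as successfulRecords and failedRecords -->
            <logger level="INFO" message="#['Successful: ' ++ (payload.successfulRecords as String) ++ ', failed: ' ++ (payload.failedRecords as String)]"/>
        </batch:on-complete>
    </batch:job>
</flow>
```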

Comparison between For Each, Parallel For Each and batch processing

|                   | For Each                                                              | Parallel For Each                                                        | Batch Processing                  |
|-------------------|-----------------------------------------------------------------------|--------------------------------------------------------------------------|-----------------------------------|
| Execution Support | Mule 3.x onwards                                                      | Mule 4.2 onwards                                                         | Mule 3.x onwards                  |
| Graphical Support | Mule 3.x onwards                                                      | Mule 4.3 onwards                                                         | Mule 3.x onwards                  |
| Execution Pattern | Synchronous                                                           | Synchronous                                                              | Asynchronous                      |
| Execution Order   | Sequential                                                            | Parallel                                                                 | Parallel                          |
| Record Grouping   | Possible using batch size                                             | Not possible                                                             | Possible using batch aggregator   |
| Error Handling    | Stops processing if the error is not handled                          | Does not stop processing, but raises a MULE:COMPOSITE_ROUTING error      | Behaviour can be configured       |
| Suitable For      | Sequential processing                                                 | Synchronous parallel processing                                          | Asynchronous parallel processing  |
| # of Records      | Small data sets                                                       | Medium data sets                                                         | Large data sets                   |
| Output            | Original payload (custom logic required to collect per-record output) | Accumulated payload                                                      | Original payload                  |
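
As a sketch of the configurable batch error handling mentioned in the table, the maxFailedRecords attribute on the job and the acceptPolicy attribute on a step control how a Batch Job reacts to failed records. The names and values below are only examples:

```xml
<!-- Illustrative error-handling settings on a Batch Job -->
<batch:job jobName="tolerantJob" maxFailedRecords="-1"> <!-- -1 = keep processing no matter how many records fail -->
    <batch:process-records>
        <batch:step name="mainStep">
            <logger level="INFO" message="#['Processing record']"/>
        </batch:step>
        <batch:step name="failedRecordsStep" acceptPolicy="ONLY_FAILURES">
            <!-- This step runs only for records that failed in earlier steps -->
            <logger level="WARN" message="#['Handling a failed record']"/>
        </batch:step>
    </batch:process-records>
</batch:job>
```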

For Each use cases

  • Sequential processing required
  • Synchronous processing required
  • Small data set
  • Processing of records in batch required
  • Process records only if previous records are processed successfully

Parallel For Each use cases

  • Synchronous processing required with parallelism
  • Medium data set
  • Accumulated output required
  • Process records irrespective of previous records status

Batch Job use cases

  • Asynchronous processing required
  • Ordering of process records not needed
  • Large data set
  • Processing logic is complex and filtering is optional
  • Process records irrespective of previous records status

Conclusion

In general, the number of records and the required behaviour (synchronous or asynchronous) determine which option to choose. For a medium number of records, however, the choice between Parallel For Each and Batch Job is mostly governed by whether we want accumulated output. If you choose Parallel For Each because your use case requires accumulated output, remember that a large accumulated payload can cause Java Virtual Machine (JVM) OutOfMemory issues.

— By Mohammad Mazhar Ansari