1. Batch Processing
Mule can process messages in batches. Within an application, you
can initiate a batch job, which is a block of code that splits
messages into individual records, performs actions upon each
record, then reports on the results and optionally pushes the
processed output to other systems or queues. This functionality is
particularly useful when working with streaming input or when
engineering "near real-time" data integration between SaaS
applications.
2. • Batch processing is particularly useful in the
following scenarios:
• integrating data sets, small or large, streaming or not,
to process records in parallel
• synchronizing data sets between business applications,
such as syncing contacts between NetSuite and
Salesforce, effecting "near real-time" data integration
• extracting, transforming and loading (ETL) information
into a target system, such as uploading data from a flat
file (CSV) to Hadoop
• handling large quantities of incoming data from an API
into a legacy system
3. • A batch job is a top-level element in Mule
that exists outside all Mule flows. A batch job
splits large messages into records, which Mule
then processes asynchronously; just as flows
process messages, batch jobs process
records.
4. • A batch job contains one or more batch steps which, in
turn, contain any number of message processors that
act upon records as they move through the batch job.
During batch processing, you can use record-level
variables (recordVars) and MEL expressions to enrich,
route or otherwise act upon records.
• The heart of Mule's batch processing functionality lies
within the batch job. In an application, the batch job
element exists outside the context of any regular Mule
flow. It is a block of code that contains one or
more batch steps which, as the label implies, process
records one step at a time, in sequential order.
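
The structure described above can be sketched in Mule 3 XML configuration. This is a minimal sketch rather than a complete application: the job name, the step names, the record variable isValid and the email field on each record are all hypothetical, and namespace declarations and the element that triggers the job are omitted.

```xml
<!-- Hypothetical batch job: a top-level element, declared outside any flow -->
<batch:job name="contactSyncBatch">
    <batch:process-records>
        <!-- Step 1: enrich each record with a record-level variable.
             Assumes each record is a map with an "email" key. -->
        <batch:step name="validateStep">
            <batch:set-record-variable variableName="isValid"
                                       value="#[payload.email != null]"/>
        </batch:step>
        <!-- Step 2: a record reaches this step after step 1, and is
             accepted only if its record variable is true -->
        <batch:step name="logStep"
                    accept-expression="#[recordVars['isValid']]">
            <logger level="INFO" message="Processing record: #[payload]"/>
        </batch:step>
    </batch:process-records>
</batch:job>
```

The accept-expression on the second step shows one way a record variable set in an earlier step can route records in a later one.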
11. Batch Job vs. Batch Job Instance
• Though defined in context above, it’s worth elaborating upon the
terms batch job and batch job instance as they relate to each other.
• A batch job is the top-level element in an application in which Mule
processes a message payload as a batch of records. The term batch
job is inclusive of all four phases of processing: Input, Load and
Dispatch, Process, and On Complete.
• A batch job instance is an occurrence in a Mule application
resulting from the execution of a batch job in a Mule flow. Mule
creates the batch job instance in the Load and Dispatch phase, and
the instance exists for as long as it takes to process each record
in the batch.
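
One way to picture the distinction, as a sketch under stated assumptions: the batch:job element below is the batch job (the definition), while each message that reaches batch:execute in the flow results in a new batch job instance. The job, step and flow names are hypothetical, namespace declarations are omitted, and the flow is assumed to be invoked with a collection as its payload.

```xml
<!-- The batch job: defined once, outside any flow.
     No Input phase is configured in this sketch. -->
<batch:job name="reportBatch">
    <!-- Process phase: each record passes through the step -->
    <batch:process-records>
        <batch:step name="onlyStep">
            <logger level="INFO" message="Record: #[payload]"/>
        </batch:step>
    </batch:process-records>
    <!-- On Complete phase: the payload here is the job result summary -->
    <batch:on-complete>
        <logger level="INFO"
                message="Total: #[payload.totalRecords], failed: #[payload.failedRecords]"/>
    </batch:on-complete>
</batch:job>

<!-- Each execution of this flow starts a separate batch job instance.
     The message source (or calling flow) is left out of the sketch. -->
<flow name="triggerFlow">
    <batch:execute name="reportBatch"/>
</flow>
```

The Load and Dispatch phase has no element of its own in the configuration; Mule performs it implicitly when the job is executed.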