Introduction to Spring Integration and Spring Batch
© 2012 SpringOne 2GX. All rights reserved. Do not distribute without permission.
Gary Russell, Staff Engineer, SpringSource; @gprussell
Tuesday, December 18, 12
What we will cover...
• Spring Integration
• Spring Batch
• Using Spring Batch and Spring Integration together
Spring Integration
Integration Styles
• Business to Business Integration (B2B)
• Inter Application Integration (EAI)
• Intra Application Integration
[Diagram labels: JVM, JVM, EAI, External Business Partner, B2B, Core Messaging]
Integration Styles
• File Transfer
• Shared Database
• Remoting
• Messaging
Common Patterns
Retrieve → Parse → Transform → Transmit
Enterprise Integration Patterns
• By Gregor Hohpe & Bobby Woolf
• Published 2003
• Collection of well-known patterns
• http://www.eaipatterns.com/eaipatterns.html
“Spring Integration provides an extension
of the Spring programming model
to support the well-known enterprise
integration patterns.”
What is Spring Integration?
• Light-weight messaging framework
• Provides an adapter-based platform for communicating with external
systems
• Pipes and Filters at the core of Spring Integration’s architecture
–Endpoint (Filter)
–Channel (Pipe)
–Message
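The pipes-and-filters idea can be sketched in plain Java without any Spring classes. This is only an illustration under stated assumptions: `PipesAndFiltersSketch`, `runFilter`, and `process` are hypothetical names, a `BlockingQueue` stands in for a channel (pipe), and a `Function` stands in for an endpoint (filter). It is not the Spring Integration API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical illustration only; these are not Spring Integration classes.
class PipesAndFiltersSketch {

    // An endpoint (filter): drains the input pipe, applies the handler,
    // and puts each result on the output pipe.
    static <I, O> void runFilter(BlockingQueue<I> in, Function<I, O> handler, BlockingQueue<O> out) {
        I item;
        while ((item = in.poll()) != null) {
            out.add(handler.apply(item));
        }
    }

    // Wires two filters together with three pipes: trim, then upper-case.
    static List<String> process(List<String> payloads) {
        BlockingQueue<String> input = new LinkedBlockingQueue<>(payloads);
        BlockingQueue<String> trimmed = new LinkedBlockingQueue<>();
        BlockingQueue<String> output = new LinkedBlockingQueue<>();

        runFilter(input, String::trim, trimmed);
        runFilter(trimmed, String::toUpperCase, output);

        return new ArrayList<>(output);
    }

    public static void main(String[] args) {
        // prints [HELLO, WORLD]
        System.out.println(process(Arrays.asList(" hello ", "world ")));
    }
}
```

Because each filter only knows its input and output pipe, a stage can be inserted, removed, or replaced without touching the others, which is the decoupling the slide refers to.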
Advantages
• Provides building blocks to implement systems that:
– are loosely coupled (logically or physically)
– are event driven (EDA)
– have a staged event-driven architecture (SEDA)
• Sophisticated support for synchronous / asynchronous messaging
Configuration
• XML Namespace Support
• Annotation Support (e.g. @Transformer, @Router, @ServiceActivator)
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:int="http://www.springframework.org/schema/integration"
       xsi:schemaLocation="..">
    ...
    <int:channel id="orders"/>
    <int:splitter input-channel="orders" expression="payload.items"
                  output-channel="drinks"/>
    <int:channel id="drinks"/>
    <int:router input-channel="drinks"
                expression="payload.iced ? 'coldDrinks' : 'hotDrinks'"/>
    ...
</beans>
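The flow configured above splits an order into its items and routes each item by its `iced` flag. A rough plain-Java equivalent might look like the sketch below. The `Order` and `Drink` types are assumptions (the slide does not show the payload classes), and a map of lists stands in for the `coldDrinks` and `hotDrinks` channels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the payload types are not defined in the slides.
class SplitterRouterSketch {

    static final class Drink {
        final String name;
        final boolean iced;
        Drink(String name, boolean iced) { this.name = name; this.iced = iced; }
    }

    static final class Order {
        final List<Drink> items;
        Order(List<Drink> items) { this.items = items; }
    }

    // Splitter: one order becomes one message per item
    // (the XML uses expression="payload.items").
    static List<Drink> split(Order order) {
        return new ArrayList<>(order.items);
    }

    // Router: picks a channel name per item, mirroring
    // expression="payload.iced ? 'coldDrinks' : 'hotDrinks'".
    static String route(Drink drink) {
        return drink.iced ? "coldDrinks" : "hotDrinks";
    }

    // Runs the flow and collects drink names per channel name.
    static Map<String, List<String>> dispatch(Order order) {
        Map<String, List<String>> channels = new LinkedHashMap<>();
        for (Drink drink : split(order)) {
            channels.computeIfAbsent(route(drink), k -> new ArrayList<>()).add(drink.name);
        }
        return channels;
    }

    public static void main(String[] args) {
        Order order = new Order(Arrays.asList(new Drink("latte", false), new Drink("frappe", true)));
        System.out.println(dispatch(order));
    }
}
```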
Spring Integration DSLs
• Scala
• Groovy
// Scala DSL
val messageFlow =
  filter { payload: String => payload == "World" } -->
  transform { payload: String => "Hello " + payload } -->
  handle { payload: String => println(payload) }

messageFlow.send("World")

// Groovy DSL
def builder = new IntegrationBuilder()
def flow = builder.messageFlow {
    transform { payload -> payload.toUpperCase() }
    filter { payload -> payload == "HELLO" }
    handle { payload -> payload }
}
assert flow.sendAndReceive("hello") == "HELLO"
assert flow.sendAndReceive("world") == null
What is in a Message?
• Unit of information
• Encapsulates data
• Passed between endpoints
• Consists of headers
– contain data relevant to the messaging system
• and a payload
– actual data for the receiver
– depending on the use case: POJO instances or serialized data
What is in a Message?
package org.springframework.integration;

public interface Message<T> {
    MessageHeaders getHeaders();
    T getPayload();
}
Message Headers
• Message ID (automatically generated UUID)
• Timestamp
• Correlation Id
• Reply Channel
• Error Channel
• Expiration Date
• Priority
• ...
• Add your own headers using a Header Enricher
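To make the header/payload split concrete, here is a minimal sketch of an immutable message with generated headers and a header-enriching step. `SketchMessage` and `withHeader` are hypothetical names, and the header keys `id` and `timestamp` are assumptions chosen for the example; this is not the Spring Integration `Message` implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for a message; not the Spring Integration Message type.
final class SketchMessage<T> {
    private final Map<String, Object> headers;
    private final T payload;

    SketchMessage(T payload, Map<String, Object> extraHeaders) {
        Map<String, Object> h = new HashMap<>(extraHeaders);
        // Every message gets a fresh id and timestamp (assumed header names),
        // overwriting any values passed in.
        h.put("id", UUID.randomUUID());
        h.put("timestamp", System.currentTimeMillis());
        this.headers = Collections.unmodifiableMap(h);
        this.payload = payload;
    }

    Map<String, Object> getHeaders() { return headers; }

    T getPayload() { return payload; }

    // Header enricher: returns a new message carrying one more header,
    // leaving the original message untouched.
    SketchMessage<T> withHeader(String name, Object value) {
        Map<String, Object> h = new HashMap<>(headers);
        h.put(name, value);
        return new SketchMessage<>(payload, h);
    }

    public static void main(String[] args) {
        SketchMessage<String> message = new SketchMessage<>("order-1", new HashMap<String, Object>());
        SketchMessage<String> enriched = message.withHeader("priority", 5);
        // prints order-1 5
        System.out.println(enriched.getPayload() + " " + enriched.getHeaders().get("priority"));
    }
}
```

Keeping the message immutable means an enricher cannot change a message that another consumer is still reading; it produces a new one instead.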
Function of a Message
• Command Message
• Event Message
• Document Message
What is a Channel?
• Channels connect producers and consumers (decoupling)
• MessageChannel Interface:
–PollableChannel (Polling Consumer)
–SubscribableChannel (Event Driven)
• Implementations:
–DirectChannel
–PublishSubscribeChannel
–QueueChannel
–PriorityChannel
–RendezvousChannel
–ExecutorChannel
<int:channel id="input">
    <int:queue capacity="10"/>
</int:channel>
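The difference between a pollable and a subscribable channel can be sketched with JDK classes alone. In this hypothetical example (none of these class names are Spring Integration types), the pollable channel buffers up to ten messages like the queue configuration above, while the subscribable channel hands each message to its subscribers on the sender's thread. Delivering to every subscriber is a simplification, closer in spirit to a publish-subscribe channel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

// Hypothetical illustration; not the Spring Integration channel classes.
class ChannelSketch {

    // Pollable channel: the sender returns immediately, the consumer polls later.
    static class PollableSketchChannel {
        private final BlockingQueue<String> queue;

        PollableSketchChannel(int capacity) {
            this.queue = new ArrayBlockingQueue<>(capacity);
        }

        // Returns false when the buffer is full.
        boolean send(String message) { return queue.offer(message); }

        // Returns null when no message is waiting.
        String receive() { return queue.poll(); }
    }

    // Subscribable channel: each send is handed straight to the subscribers
    // on the sender's thread.
    static class SubscribableSketchChannel {
        private final List<Consumer<String>> subscribers = new ArrayList<>();

        void subscribe(Consumer<String> handler) { subscribers.add(handler); }

        void send(String message) { subscribers.forEach(s -> s.accept(message)); }
    }

    public static void main(String[] args) {
        PollableSketchChannel input = new PollableSketchChannel(10);
        input.send("order-1");
        System.out.println(input.receive()); // prints order-1

        SubscribableSketchChannel events = new SubscribableSketchChannel();
        events.subscribe(m -> System.out.println("got " + m));
        events.send("order-2"); // prints got order-2
    }
}
```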
What is an Endpoint?
• Polling or event-driven
• Inbound or outbound
• Unidirectional (Channel Adapter) or bidirectional (Gateway)
• Internal or external (application context)
<inbound-channel-adapter/>
<outbound-channel-adapter/>
<inbound-gateway/>
<outbound-gateway/>
<gateway/>
<service-activator/>
Router
• Message Router
• Content-based router
• Recipient list router (with selectors)
• Payload type router
• Header value router
• Exception type router
Transformer
• Delegating via ref/method
• Spring Expression Language
• Groovy, JRuby, Jython, JavaScript
• Object-to-JSON / JSON-to-Object
• Payload serializing/deserializing
• File-to-bytes, File-to-String
• JAXB, JiBX, Castor, XMLBeans, XStream
• XPath, XSLT
• Object XML Marshalling/Unmarshalling (Spring OXM)
• ...
Spring Integration Components
• Claim Check (In/Out)
• Content Enricher
–Header Enricher
–Payload Enricher
• Control Bus
• Delayer
• JMX Support
• Message Handler Chain
• Messaging Bridge
• Resequencer
• Service Activator
• Scripting support (JSR 223)
– Ruby/JRuby, JavaScript ...
• Groovy
• Message History
• Message Store
– JDBC, Redis, MongoDB, GemFire
• Wire Tap
• ...
Adapters
• AMQP/RabbitMQ
• AWS*
• File/Resource
• FTP/FTPS/SFTP
• GemFire
• HTTP (REST)
• JDBC
• JMS
• JMX
• JPA
• MongoDB
• POP3/IMAP/SMTP
• Print*
• Redis
• RMI
• RSS/Atom
• SMB*
• Splunk*
• Spring Application Events
• Stored Procedures
• TCP/UDP
• Twitter
• Web Services (SOAP or POX)
• XMPP
• XPath
• XQuery*
• ...
* Spring Integration Extensions Project
Tooling - Spring Tool Suite (STS)
• Namespace Support
• Visualization
• 4 Spring Integration-specific STS templates
–Simple Template (Core Components only)
–File Polling Template (File Adapter)
–WAR Template (Uses Twitter Adapter)
–Adapter Template (Create your own components)
Samples
• https://github.com/SpringSource/spring-integration-samples
• Contains 50 Samples and Applications
• Several Categories:
–Basic
–Intermediate
–Advanced
–Applications
Books
• Just Spring Integration
• Pro Spring Integration
• Spring Integration in Action
What’s new in Spring Integration 2.2
• JPA support
– http://blog.springsource.org/2012/10/05/whats-new-in-spring-integration-2-2-part-3-jpa-support/
• MongoDB adapters
– http://blog.springsource.org/2012/09/24/whats-new-in-spring-integration-2-2-rc1-part-1-mongodb/
• Transaction synchronization
– http://blog.springsource.org/2012/09/26/whats-new-in-spring-integration-2-2-part-2-transaction-synchronization/
• Retry, Circuit Breaker and ExpressionEvaluatingRequestHandlerAdvice
– http://blog.springsource.org/2012/10/09/spring-integration-2-2-retry-and-more/
• Redis Store Adapters
Source Code
• https://github.com/SpringSource/spring-integration
• https://github.com/SpringSource/spring-integration-samples
• https://github.com/SpringSource/spring-integration-extensions
• https://github.com/SpringSource/spring-integration-templates
• https://github.com/SpringSource/spring-integration-dsl-groovy
• https://github.com/SpringSource/spring-integration-dsl-scala
Contribute
• Post questions and answers on the forums
–http://forum.springsource.org/forumdisplay.php?42-Integration
• Create JIRA issues
–https://jira.springsource.org/browse/INT
• Submit Pull Requests - Contributor Guidelines:
– github.com/SpringSource/spring-integration/wiki/Contributor-Guidelines
• New Spring Integration Extensions Repository
Spring Batch
Batch Jobs
Differ from online/real-time processing applications:
• Long-running
– Often outside office hours
• Non-interactive
– Often include logic for handling errors or restarts
• Process large volumes of data
– More than fits in memory or a single transaction
Batch and offline processing
• Close of business processing
– Order processing
– Business reporting
– Account reconciliation
• Import/export handling
– a.k.a. ETL jobs (Extract-Transform-Load)
– Instrument/position import
– Data warehouse synchronization
• Large-scale output jobs
– Loyalty scheme emails
– Bank statements
Job and Step
Chunk-Oriented Processing
• Input and output operations can be grouped together
• The reader collects items into a chunk before they are output: chunk-oriented processing
• Optional ItemProcessor
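The chunk loop described above can be sketched in plain Java. This is a simplified illustration, not Spring Batch code: `ChunkSketch` and `runStep` are hypothetical names, an `Iterator` plays the reader, a `Function` plays the optional processor, and a `Consumer` of a list plays the writer. Transactions and error handling are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration of the chunk loop; not Spring Batch classes.
class ChunkSketch {

    // Reads items one at a time, processes each, and hands every full chunk
    // (and the final partial chunk) to the writer. Returns the chunk count.
    static <I, O> int runStep(Iterator<I> reader, Function<I, O> processor,
                              Consumer<List<O>> writer, int chunkSize) {
        int chunks = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk)); // write the whole chunk at once
                chunk.clear();
                chunks++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(new ArrayList<>(chunk));     // trailing partial chunk
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5);
        int chunks = ChunkSketch.<Integer, Integer>runStep(
                input.iterator(),
                x -> x * 10,
                chunk -> System.out.println("writing " + chunk),
                2);
        System.out.println(chunks + " chunks");
    }
}
```

With five items and a chunk size of two, the writer is called three times: two full chunks and one partial chunk.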
JobLauncher
Simple File Load Job
More Complex Use Cases
• It's very common to use an off-the-shelf reader
and writer
• More complex jobs often require custom readers
or writers
• ItemProcessor is often used if there's a need to
delegate to existing business logic
• Use a writer if it's more efficient to process a
complete chunk
Job and Step in Context
JobRepository and Batch Metadata
ExecutionContext
• We need to know where a failure occurred to restart a batch
process
• Job Repository metadata is used to determine the step at
which the failure occurred
• Application Code (in reader/writer) needs to maintain state
within a step (e.g. current chunk)
• Spring Batch can supply that data during restart to facilitate
repositioning
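The restart mechanism can be illustrated with a reader that saves its position and repositions itself from the saved state. In this sketch a plain `Map` stands in for the execution context, and `RestartableReaderSketch` and the key `reader.position` are hypothetical. The `open` and `update` methods are loosely modeled on the callbacks of Spring Batch's `ItemStream`, but this is not that interface.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; a plain Map stands in for the execution context.
class RestartableReaderSketch {
    private static final String POSITION_KEY = "reader.position"; // assumed key name

    private final List<String> items;
    private int position;

    RestartableReaderSketch(List<String> items) {
        this.items = items;
    }

    // On (re)start: reposition from the saved context, or begin at 0.
    void open(Map<String, Object> context) {
        Object saved = context.get(POSITION_KEY);
        position = saved == null ? 0 : (Integer) saved;
    }

    // Returns the next item, or null when the input is exhausted.
    String read() {
        return position < items.size() ? items.get(position++) : null;
    }

    // Called at chunk boundaries so the current state can be persisted.
    void update(Map<String, Object> context) {
        context.put(POSITION_KEY, position);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c", "d");
        Map<String, Object> context = new HashMap<>();

        RestartableReaderSketch first = new RestartableReaderSketch(items);
        first.open(context);
        first.read();
        first.read();
        first.update(context); // state saved after the first chunk of two items

        // Simulated restart: a new reader resumes from the saved position.
        RestartableReaderSketch second = new RestartableReaderSketch(items);
        second.open(context);
        System.out.println(second.read()); // prints c
    }
}
```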
Common Batch Idioms
• Batch jobs typically process large amounts of homogeneous
input
• Makes iteration a common concern: Repeat
• Transient errors during processing may require a Retry of
an input item
• Some input may be invalid; the job may want to Skip it without
failing
• Some errors should fail the job execution, allowing one to fix
the problem and Restart the job instance where it left off
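The Retry and Skip idioms can be sketched as a small processing loop. Everything here is hypothetical (`RetrySkipSketch`, `TransientException`, `InvalidItemException`, `processAll`); Spring Batch expresses these ideas through configurable policies, which this sketch does not attempt to reproduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of retry and skip; not Spring Batch classes.
class RetrySkipSketch {

    static class TransientException extends RuntimeException {}

    static class InvalidItemException extends RuntimeException {}

    // Processes every item; retries transient failures up to maxAttempts,
    // skips invalid items, and lets any other exception fail the run.
    static <I, O> List<O> processAll(List<I> items, Function<I, O> processor,
                                     int maxAttempts, List<I> skipped) {
        List<O> results = new ArrayList<>();
        for (I item : items) {
            int attempt = 0;
            while (true) {
                attempt++;
                try {
                    results.add(processor.apply(item));
                    break;
                } catch (TransientException e) {
                    if (attempt >= maxAttempts) {
                        throw e;           // retries exhausted: fail the run
                    }
                } catch (InvalidItemException e) {
                    skipped.add(item);     // skip this item and move on
                    break;
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> skipped = new ArrayList<>();
        List<String> results = RetrySkipSketch.<String, String>processAll(
                Arrays.asList("a", "", "b"),
                s -> {
                    if (s.isEmpty()) {
                        throw new InvalidItemException();
                    }
                    return s.toUpperCase();
                },
                3, skipped);
        System.out.println(results + " skipped=" + skipped.size());
    }
}
```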
Spring Batch
• Spring Batch supports these common concerns
• Abstracts them in the framework
– Job business logic doesn't need to care about details
• Allows for simple configuration with pluggable strategies
Business Logic Delegation – Spring Application
Spring Batch Admin
• Sub project of Spring Batch
• Provides a web UI and RESTful interface to manage batch
processes
• Manager, Resources, Sample WAR
– Deployed with batch job(s) as single app to be able to control &
monitor jobs
– Or monitors external jobs only via shared database
Scaling and Parallel Processing
• First Rule:
– Use the simplest technique to get the job done in the required
time
– Do not optimize/parallelize unnecessarily
• Options:
– Multi-threaded Step (single process)
– Parallel Steps (single process)
– Remote Chunking of Step (multi process)
– Partitioning a Step (single or multi process)
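As a rough illustration of the single-process partitioning option, the sketch below splits an index range into contiguous partitions and processes each on its own worker thread. `PartitionSketch` and `run` are hypothetical names, the per-item work is a placeholder counter, and none of this is the Spring Batch partitioning API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of partitioning within one process.
class PartitionSketch {

    // Splits [0, total) into `partitions` contiguous ranges, processes each
    // range on its own worker thread, and returns the combined item count.
    static long run(int total, int partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            int size = (total + partitions - 1) / partitions; // ceiling division
            for (int p = 0; p < partitions; p++) {
                final int from = Math.min(total, p * size);
                final int to = Math.min(total, from + size);
                Callable<Long> worker = () -> {
                    long processed = 0;
                    for (int i = from; i < to; i++) {
                        processed++; // placeholder for real per-item work
                    }
                    return processed;
                };
                futures.add(pool.submit(worker));
            }
            long sum = 0;
            for (Future<Long> future : futures) {
                sum += future.get(); // waits for each partition to finish
            }
            return sum;
        } catch (Exception e) {
            throw new IllegalStateException("partition failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(10, 3)); // prints 10
    }
}
```

In line with the first rule on the slide, a sketch like this is only worth the added complexity when a single-threaded step cannot finish in the required time.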
Using Spring Batch and Spring Integration Together
Launching batch jobs through messages
• Event-Driven execution of the JobLauncher
• Spring Integration retrieves the data (e.g. file system, FTP, ...)
• Easy to support separate input sources simultaneously
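The message-driven launch described above boils down to a transformer that turns a file payload into a launch request, followed by an endpoint that hands the request to a launcher. The sketch below shows that shape in plain Java. `FileToJobLaunchSketch`, `LaunchRequest`, and the `input.file` parameter key are assumptions made for illustration; they are not the Spring Batch or Spring Integration types.

```java
import java.io.File;
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-ins; not the Spring Batch or Spring Integration types.
class FileToJobLaunchSketch {

    static final class LaunchRequest {
        final String jobName;
        final Map<String, String> parameters;

        LaunchRequest(String jobName, Map<String, String> parameters) {
            this.jobName = jobName;
            this.parameters = parameters;
        }
    }

    // Transformer: turns an incoming file payload into a launch request.
    // The "input.file" parameter key is an assumption for this sketch.
    static LaunchRequest toRequest(File file) {
        return new LaunchRequest("importPayments",
                Collections.singletonMap("input.file", file.getAbsolutePath()));
    }

    // Launching endpoint: passes the request to whatever launcher is supplied
    // and returns its status.
    static String launch(LaunchRequest request, Function<LaunchRequest, String> launcher) {
        return launcher.apply(request);
    }

    public static void main(String[] args) {
        LaunchRequest request = toRequest(new File("payments.csv"));
        String status = launch(request, r -> "STARTED:" + r.jobName);
        System.out.println(status); // prints STARTED:importPayments
    }
}
```

Because the transformer only depends on receiving a file, any inbound adapter that produces file payloads (file system, FTP, and so on) can feed the same launch path.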
[Diagram labels: FTP, Inbound Channel Adapter, File, Transformer, JobLaunchRequest, JobLauncher]
Providing feedback with informational messages
• Spring Batch provides support for listeners:
– StepListener
– ChunkListener
– JobExecutionListener
<batch:job id="importPayments">
    ...
    <batch:listeners>
        <batch:listener ref="notificationExecutionsListener"/>
    </batch:listeners>
</batch:job>

<int:gateway id="notificationExecutionsListener"
    service-interface="o.s.batch.core.JobExecutionListener"
    default-request-channel="jobExecutions"/>
Externalizing batch process execution
• Use Spring Integration inside of Batch jobs, e.g.:
– ItemProcessor
– ItemWriter
• Offload complex processing
• Asynchronous processing support:
– AsyncItemProcessor
– AsyncItemWriter
• Externalize chunk processing using ChunkMessageChannelItemWriter
Business Logic Delegation – Spring Integration
Questions?
Thank You!!

Russell 2012 introduction to spring integration and spring batch

  • 1. Introduction to Spring Integration and Spring Batch © 2012 SpringOne 2GX. All rights reserved. Do not distribute without permission. Gary Russell, Staff Engineer, SpringSource; @gprussell Tuesday, December 18, 12
  • 2. What we will cover... • Spring Integration • Spring Batch • Using Spring Batch and Spring Integration together 2 Tuesday, December 18, 12
  • 4. Integration Styles • Business to Business Integration (B2B) • Inter Application Integration (EAI) • Intra Application Integration 4 JVM JVM EAI External Business Partner B2B Core Messaging Tuesday, December 18, 12
  • 5. Integration Styles • File Transfer • Shared Database • Remoting • Messaging 5 Tuesday, December 18, 12
  • 6. Common Patterns 6 Retrieve Parse Transform Transmit Tuesday, December 18, 12
  • 7. Enterprise Integration Patterns • By Gregor Hohpe & Bobby Woolf • Published 2003 • Collection of well-known patterns • http://www.eaipatterns.com/eaipatterns.html 7 Tuesday, December 18, 12
  • 8. 8 “Spring Integration provides an extension of the Spring programming model to support the well-known enterprise integration patterns. Tuesday, December 18, 12
  • 9. What is Spring Integration? • Light-weight messaging framework • Provides an adapter-based platform for communicating with external systems • Pipes and Filters at the core of Spring Integration’s architecture –Endpoint (Filter) –Channel (Pipe) –Message 9 Tuesday, December 18, 12
  • 10. Advantages • Provides building blocks to implement systems that are: – are loosely Coupled (Logically or Physically) – are Event Driven (EDA) – have a staged event-driven architecture (SEDA) • Sophisticated support for synchronous / asynchronous messaging 10 Tuesday, December 18, 12
  • 11. Configuration • XML Namespace Support • Annotation Support (e.g. @Transformer, @Router, @ServiceActivator) 11 <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:int="http://www.springframework.org/schema/integration" xsi:schemaLocation=".."> ... <int:channel id="orders"/> <int:splitter input-channel="orders" expression="payload.items" output-channel="drinks"/> <int:channel id="drinks"/> <int:router input-channel="drinks" expression="payload.iced ? 'coldDrinks' : 'hotDrinks'"/> ... </beans> Tuesday, December 18, 12
  • 12. Spring Integration DSLs • Scala • Groovy 12 val messageFlow = filter {payload: String => payload == "World"} --> transform { payload: String => "Hello " + payload} --> handle { payload: String => println(payload) } messageFlow.send("World") def builder = new IntegrationBuilder() def flow = builder.messageFlow { transform {payload->payload.toUpperCase()} filter {payload-> payload =="HELLO"} handle {payload->payload} } assert flow.sendAndReceive("hello") == "HELLO" assert flow.sendAndReceive("world") == null Tuesday, December 18, 12
  • 13. What is in a Message? • Unit of information • Encapsulates data • Passed between endpoints • Consists of headers – contains data relevant to the messaging system • and a payload – actual data for the receiver – depending on use-cases: POJO instances or serialized data 13 Tuesday, December 18, 12
  • 14. What is in a Message? 14 package org.springframework.integration; public interface Message<T> { MessageHeaders getHeaders(); T getPayload(); } Tuesday, December 18, 12
  • 15. Message Headers • Message ID (automatically generated UUID) • Timestamp • Correlation Id • Reply Channel • Error Channel • Expiration Date • Priority • ... • Add your own headers using a Header Enricher 15 Tuesday, December 18, 12
  • 16. Function of a Message • Command Message • Event Message • Document Message 16 E D C Tuesday, December 18, 12
  • 17. What is a Channel? • Channels connect producers and consumers (decoupling) • MessageChannel Interface: –PollableChannel (Polling Consumer) –SubscribableChannel (Event Driven) • Implementations: –DirectChannel –PublishSubscribeChannel –QueueChannel –PriorityChannel –RendezvousChannel –ExecutorChannel 17 <int:channel id="input"> <int:queue capacity="10"/> </int:channel> Tuesday, December 18, 12
  • 18. What is an Endpoint? • Polling or event-driven • Inbound or outbound • Unidirectional (Channel Adapter) or bidirectional (Gateway) • Internal or external (application context) 18 <inbound-channel-adapter/> <outbound-channel-adapter/> <inbound-gateway/> <outbound-gateway/> <gateway/> <service-activator/> Tuesday, December 18, 12
  • 19. Router • Message Router • Content-based router • Recipient list router (with selectors) • Payload type router • Header value router • Exception type router
  • 20. Transformer • Delegating via ref/method • Spring Expression Language • Groovy, JRuby, Jython, JavaScript • Object-to-JSON / JSON-to-Object • Payload serializing/deserializing • File-to-bytes, File-to-String • JAXB, JiBX, Castor, XMLBeans, XStream • XPath, XSLT • Object XML Marshalling/Unmarshalling (Spring OXM) • ...
  • 21. Spring Integration Components • Claim Check (In/Out) • Content Enricher –Header Enricher –Payload Enricher • Control Bus • Delayer • JMX Support • Message Handler Chain • Messaging Bridge • Resequencer • Service Activator • Scripting support (JSR 223) – Ruby/JRuby, JavaScript ... • Groovy • Message History • Message Store – JDBC, Redis, MongoDB, GemFire • Wire Tap • ...
  • 22. Adapters • AMQP/RabbitMQ • AWS* • File/Resource • FTP/FTPS/SFTP • GemFire • HTTP (REST) • JDBC • JMS • JMX • JPA • MongoDB • POP3/IMAP/SMTP • Print* • Redis • RMI • RSS/Atom • SMB* • Splunk* • Spring Application Events • Stored Procedures • TCP/UDP • Twitter • Web Services (SOAP or POX) • XMPP • XPath • XQuery* • ... * Spring Integration Extensions Project
  • 23. Tooling - Spring Tool Suite (STS) • Namespace Support • Visualization • 4 Spring Integration specific STS Templates –Simple Template (Core Components only) –File Polling Template (File Adapter) –War Template (Uses Twitter Adapter) –Adapter Template (Create your own components)
  • 24. Samples • https://github.com/SpringSource/spring-integration-samples • Contains 50 Samples and Applications • Several Categories: –Basic –Intermediate –Advanced –Applications
  • 25. Books • Just Spring Integration • Pro Spring Integration • Spring Integration in Action
  • 26. What’s new in Spring Integration 2.2 • JPA support – http://blog.springsource.org/2012/10/05/whats-new-in-spring-integration-2-2-part-3-jpa-support/ • MongoDB adapters – http://blog.springsource.org/2012/09/24/whats-new-in-spring-integration-2-2-rc1-part-1-mongodb/ • Transaction synchronization – http://blog.springsource.org/2012/09/26/whats-new-in-spring-integration-2-2-part-2-transaction-synchronization/ • Retry, Circuit Breaker and ExpressionEvaluatingRequestHandlerAdvice – http://blog.springsource.org/2012/10/09/spring-integration-2-2-retry-and-more/ • Redis Store Adapters
  • 27. Source Code • https://github.com/SpringSource/spring-integration • https://github.com/SpringSource/spring-integration-samples • https://github.com/SpringSource/spring-integration-extensions • https://github.com/SpringSource/spring-integration-templates • https://github.com/SpringSource/spring-integration-dsl-groovy • https://github.com/SpringSource/spring-integration-dsl-scala
  • 28. Contribute • Post questions and answers on the forums –http://forum.springsource.org/forumdisplay.php?42-Integration • Create JIRA issues –https://jira.springsource.org/browse/INT • Submit Pull Requests - Contributor Guidelines: – github.com/SpringSource/spring-integration/wiki/Contributor-Guidelines • New Spring Integration Extensions Repository
  • 30. Batch Jobs • Differ from online/real-time processing applications: • Long-running – Often outside office hours • Non-interactive – Often include logic for handling errors or restarts • Process large volumes of data – More than fits in memory or a single transaction
  • 31. Batch and offline processing • Close of business processing – Order processing – Business reporting – Account reconciliation • Import/export handling – a.k.a. ETL jobs (Extract-Transform-Load) – Instrument/position import – Data warehouse synchronization • Large-scale output jobs – Loyalty scheme emails – Bank statements
  • 32. Job and Step
  • 33. Chunk-Oriented Processing • Input and output can be grouped together • Input collects items before outputting them as a chunk: chunk-oriented processing • Optional ItemProcessor
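The read, process, write cycle described on this slide can be sketched as a plain-Java loop. This is a simplified illustration under stated assumptions, not the Spring Batch step implementation: `ChunkLoop`, `run`, and the parameter name `commitInterval` are hypothetical, and the sketch leaves out the transactions, restart metadata, and error handling a real step would add around the loop.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical chunk loop: read items one by one, write them a chunk at a time.
final class ChunkLoop {
    // Returns the number of chunks handed to the writer.
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int commitInterval) {
        int chunks = 0;
        List<O> chunk = new ArrayList<>(commitInterval);
        while (reader.hasNext()) {
            // Items are read and processed individually...
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == commitInterval) {
                // ...but written together once the chunk is full.
                writer.accept(new ArrayList<>(chunk));
                chunk.clear();
                chunks++;
            }
        }
        if (!chunk.isEmpty()) {
            // Flush the final, partial chunk.
            writer.accept(new ArrayList<>(chunk));
            chunks++;
        }
        return chunks;
    }
}
```

With five input items and a `commitInterval` of 2, the writer is called three times: two full chunks and one chunk holding the last item.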
  • 35. Simple File Load Job
  • 36. More Complex Use Cases • It's very common to use an off-the-shelf reader and writer • More complex jobs often require custom readers or writers • ItemProcessor is often used if there's a need to delegate to existing business logic • Use a writer if it's more efficient to process a complete chunk
  • 37. Job and Step in Context
  • 38. JobRepository and Batch Metadata
  • 39. ExecutionContext • We need to know where a failure occurred to restart a batch process • Job Repository metadata is used to determine the step at which the failure occurred • Application Code (in reader/writer) needs to maintain state within a step (e.g. current chunk) • Spring Batch can supply that data during restart to facilitate repositioning
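How a reader might keep and recover its position can be sketched in plain Java. The sketch is loosely modeled on the open/update callbacks a restartable reader receives, but it is only an illustration: `RestartableReader`, the key `reader.position`, and the use of a plain `Map` in place of the persisted execution context are assumptions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical restartable reader. The Map plays the role of the
// execution context that the framework would persist between runs.
final class RestartableReader {
    private static final String KEY = "reader.position"; // assumed key name
    private final List<String> items;
    private int position;

    RestartableReader(List<String> items) {
        this.items = items;
    }

    // Called before reading: reposition from saved state, if there is any.
    void open(Map<String, Object> context) {
        Object saved = context.get(KEY);
        position = (saved instanceof Integer) ? (Integer) saved : 0;
    }

    // Returns the next item, or null when the input is exhausted.
    String read() {
        return position < items.size() ? items.get(position++) : null;
    }

    // Called at each chunk boundary so the position survives a failure.
    void update(Map<String, Object> context) {
        context.put(KEY, position);
    }
}
```

If a run reads two items, calls `update`, then fails while handling the third, a new reader opened with the same context resumes at the third item, so the uncommitted work is repeated rather than lost.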
  • 40. Common Batch Idioms • Batch jobs typically process large amounts of homogeneous input • Makes iteration a common concern: Repeat • Transient errors during processing may require a Retry of an input item • Some input may not be valid, may want to Skip it without failing • Some errors should fail the job execution, allowing one to fix the problem and Restart the job instance where it left off
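The Repeat, Retry, and Skip idioms can be combined in a short plain-Java sketch. It is not how Spring Batch implements them: `RetrySkipDemo`, `TransientException`, and the choice of `IllegalArgumentException` as the marker for invalid input are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical retry/skip loop, not Spring Batch's implementation.
final class RetrySkipDemo {
    // Marker for errors worth retrying (assumed for this sketch).
    static class TransientException extends RuntimeException {
        TransientException(String msg) { super(msg); }
    }

    static <I, O> List<O> process(List<I> input, Function<I, O> logic,
                                  int maxAttempts, List<I> skipped) {
        List<O> results = new ArrayList<>();
        for (I item : input) {                          // Repeat over the input
            for (int attempt = 1; ; attempt++) {
                try {
                    results.add(logic.apply(item));
                    break;                              // success, next item
                } catch (TransientException e) {        // Retry the same item
                    if (attempt >= maxAttempts) {
                        throw e;                        // give up: fail the run
                    }
                } catch (IllegalArgumentException e) {  // Skip invalid input
                    skipped.add(item);
                    break;
                }
            }
        }
        return results;
    }
}
```

Keeping these policies in one loop outside the business function is the point of the idiom: the function passed as `logic` only transforms an item and throws when something goes wrong.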
  • 41. Spring Batch • Spring Batch supports these common concerns • Abstracts them in the framework – Job business logic doesn't need to care about details • Allows for simple configuration with pluggable strategies
  • 42. Business Logic Delegation – Spring Application
  • 43. Spring Batch Admin • Subproject of Spring Batch • Provides Web UI and RESTful interface to manage batch processes • Manager, Resources, Sample WAR – Deployed with batch job(s) as single app to be able to control & monitor jobs – Or monitors external jobs only via shared database
  • 44. Scaling and Parallel Processing • First Rule: – Use the simplest technique to get the job done in the required time – Do not optimize/parallelize unnecessarily • Options: – Multi-threaded Step (single process) – Parallel Steps (single process) – Remote Chunking of Step (multi process) – Partitioning a Step (single or multi process)
  • 45. Using Spring Batch and Spring Integration Together
  • 46. Launching batch jobs through messages • Event-Driven execution of the JobLauncher • Spring Integration retrieves the data (e.g. file system, FTP, ...) • Easy to support separate input sources simultaneously
    [Diagram: FTP Inbound Channel Adapter -> File -> Transformer -> JobLaunchRequest -> JobLauncher]
  • 47. Providing feedback with informational messages • Spring Batch provides support for listeners: – StepListener – ChunkListener – JobExecutionListener
      <batch:job id="importPayments">
        ...
        <batch:listeners>
          <batch:listener ref="notificationExecutionsListener"/>
        </batch:listeners>
      </batch:job>
      <int:gateway id="notificationExecutionsListener"
          service-interface="o.s.batch.core.JobExecutionListener"
          default-request-channel="jobExecutions"/>
  • 48. Externalizing batch process execution • Use Spring Integration inside of Batch jobs, e.g.: – ItemProcessor – ItemWriter • Offload complex processing • Asynchronous processing support: – AsyncItemProcessor – AsyncItemWriter • Externalize chunk processing using ChunkMessageChannelItemWriter
  • 49. Business Logic Delegation – Spring Integration