Spring Batch
Situation
About me
Taher Abdelmohsen
Java Full-Stack Software Engineer
Problem Definition
• Migrating 1M new users into our Subscription Engine
• Batch migration: no user interaction needed while it runs
• Legacy tool:
migrates 20,000 users in ~7 hours
1,000,000 users -> 350 hours -> 14.58 days
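The estimate above is a linear extrapolation from the measured sample. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and it assumes the legacy tool's throughput stays constant as the volume grows.

```java
import java.time.Duration;

// Hypothetical helper, not part of the legacy tool or of Spring Batch.
final class MigrationEstimate {
    // Linear extrapolation: assumes constant throughput regardless of volume.
    static Duration estimate(long totalUsers, long sampleUsers, Duration sampleTime) {
        double batches = (double) totalUsers / sampleUsers;
        return Duration.ofSeconds(Math.round(batches * sampleTime.getSeconds()));
    }

    public static void main(String[] args) {
        Duration d = estimate(1_000_000, 20_000, Duration.ofHours(7));
        System.out.println(d.toHours() + " hours");
        System.out.println(d.toHours() / 24.0 + " days");
    }
}
```

With the figures from the slide (20,000 users in 7 hours), the call yields 350 hours, which is about 14.58 days.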
Options
Legacy Java Tool vs. Spring Batch Framework
Spring Batch
the leading batch processing framework on the JVM
• Benefits:
Job flow state machine
Transaction handling
Declarative I/O
Robust error handling (retry, skip, fail)
Scalability options
Battle tested
Built on Spring
What is Spring?
• Spring is an open-source application framework and inversion-of-control
container for Java. The framework's core features can be used by any Java
application, and extensions are available for building web applications,
enterprise applications, and more.
Spring Batch Job
• Job Repository
• Transitions
• Decisions
• Nested Jobs
• Job Parameters
[Diagram: Job (e.g. EndOfDay) 1 -> * JobInstance (e.g. "5/12/2018") 1 -> * JobExecution (one for each attempt)]
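The Job / JobInstance / JobExecution relationship can be modelled in a few lines of plain Java. This is only a sketch: `JobInstanceSketch` is a hypothetical in-memory stand-in for a job repository, not Spring Batch's API. It assumes that the job name plus its parameters identify an instance, that every run adds one execution, and that a completed instance cannot be run again.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a job repository; not Spring Batch's API.
final class JobInstanceSketch {
    // One entry per job instance (job name + parameters); one status per execution.
    private final Map<String, List<String>> executionsByInstance = new HashMap<>();

    /** Records one physical run (an execution) of the instance identified by name and parameters. */
    void run(String jobName, Map<String, String> params, boolean succeeds) {
        String instanceKey = key(jobName, params);
        List<String> attempts =
                executionsByInstance.computeIfAbsent(instanceKey, k -> new ArrayList<>());
        if (attempts.contains("COMPLETED")) {
            throw new IllegalStateException("Instance already completed: " + instanceKey);
        }
        attempts.add(succeeds ? "COMPLETED" : "FAILED");
    }

    int instanceCount() {
        return executionsByInstance.size();
    }

    int executionCount(String jobName, Map<String, String> params) {
        return executionsByInstance
                .getOrDefault(key(jobName, params), Collections.<String>emptyList())
                .size();
    }

    // Sorted parameters, so equal parameters always map to the same instance.
    private static String key(String jobName, Map<String, String> params) {
        return jobName + new TreeMap<>(params);
    }
}
```

Running "EndOfDay" for "5/12/2018" once with a failure and once with success gives one instance with two executions; a third run with the same date is rejected, while a new date creates a second instance.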
Step processing
• Tasklet
• Chunk based:
Chunk Based Step
1. Item Readers
• Reading from a database
• Flat files
• XML, JSON
• Multiple resources
2. Item Writer
3. Item Processor
• Filtering items
• Validating items
• Composite processors
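The read / process / write cycle of a chunk-based step can be sketched in plain Java. This is an illustration under stated assumptions, not Spring Batch's implementation: `ChunkLoopSketch` and its parameters are hypothetical, the reader is modelled as an `Iterator`, and a processor that returns `null` is treated as filtering the item out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical plain-Java model of a chunk-oriented step; not Spring Batch's API.
final class ChunkLoopSketch {
    /**
     * Reads and processes items one at a time, then hands each chunk's
     * output to the writer in a single call. Returns the number of write calls.
     */
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int writes = 0;
        while (reader.hasNext()) {
            List<O> outputs = new ArrayList<>(chunkSize);
            // Read up to chunkSize items, processing each as it is read.
            for (int read = 0; read < chunkSize && reader.hasNext(); read++) {
                O out = processor.apply(reader.next());
                if (out != null) {      // sketch assumption: null means "filtered out"
                    outputs.add(out);
                }
            }
            if (!outputs.isEmpty()) {
                writer.accept(outputs); // one write per chunk, not one per item
                writes++;
            }
        }
        return writes;
    }
}
```

With 45 input items and a chunk size of 20, the writer is called three times, with 20, 20, and 5 items.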
Job Features
1. Error Handling:
• Restartability
• Retry
• Skip
2. Scaling a Job:
• Multiple threads
• Async ItemProcessor in combination with Async ItemWriter
• Partitioning
• Remote chunking
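The retry and skip items above can be illustrated with a small plain-Java loop. This is a sketch only: `RetrySkipSketch`, `maxAttempts`, and `skipLimit` are hypothetical names, and it assumes that every `RuntimeException` is retryable and that an item is skipped once its attempts are exhausted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical plain-Java model of retry and skip; not Spring Batch's API.
final class RetrySkipSketch {
    /**
     * Applies fn to each item, retrying up to maxAttempts times. Items that
     * still fail are skipped; exceeding skipLimit fails the whole run.
     */
    static <I, O> List<O> process(List<I> items, Function<I, O> fn,
                                  int maxAttempts, int skipLimit) {
        List<O> results = new ArrayList<>();
        int skipped = 0;
        for (I item : items) {
            RuntimeException last = null;
            boolean done = false;
            for (int attempt = 1; attempt <= maxAttempts && !done; attempt++) {
                try {
                    results.add(fn.apply(item));
                    done = true;
                } catch (RuntimeException e) {
                    last = e;           // remember the failure and retry
                }
            }
            if (!done && ++skipped > skipLimit) {
                throw new IllegalStateException("Skip limit exceeded", last);
            }
        }
        return results;
    }
}
```

A transient failure succeeds on a later attempt, a permanently failing item is skipped, and the run fails only when more items are skipped than the limit allows.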
Other features
• Scheduling a job
• Launching jobs via messages
Sample Step Definition
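@Bean
public Step step1() throws Exception {
    // Chunk-oriented step: items are read and processed one by one,
    // and written 20 at a time.
    // If the reader and processor are typed, chunk(...) needs a matching
    // type witness, e.g. .<In, Out>chunk(20) (In and Out are placeholders).
    return stepBuilderFactory
            .get("step1")
            .chunk(20)
            .reader(fileItemReader)
            .processor(subscriptionItemProcessor)
            .writer(fileItemWriter)
            .build();
}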
Sample Job Definition
@Bean
public Job buildMigrationJob() throws Exception {
    return jobBuilderFactory
            .get("migrationJob")
            .start(step1)
            .on("COMPLETED").to(step2)
            .from(step2).on("COMPLETED").end() // finish the job as COMPLETED
            .end()                             // close the flow definition
            .build();
}
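The job definition above reads as a small transition table: each step's exit status selects the next step or ends the job. The following plain-Java sketch models that idea. `FlowSketch` is hypothetical and not Spring Batch's API; treating an unmatched exit status as a failed job is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical plain-Java model of a job flow; not Spring Batch's API.
final class FlowSketch {
    // "step|exitStatus" -> name of the next step, or "END" to finish the job
    private final Map<String, String> transitions = new HashMap<>();

    FlowSketch on(String step, String exitStatus, String next) {
        transitions.put(step + "|" + exitStatus, next);
        return this;
    }

    /** Runs steps from start, following transitions; returns the visited steps plus the final status. */
    List<String> run(String start, Map<String, Supplier<String>> steps) {
        List<String> trace = new ArrayList<>();
        String current = start;
        while (true) {
            trace.add(current);
            String exitStatus = steps.get(current).get();
            String next = transitions.get(current + "|" + exitStatus);
            if (next == null) {         // sketch assumption: no matching transition fails the job
                trace.add("FAILED");
                return trace;
            }
            if (next.equals("END")) {
                trace.add("COMPLETED");
                return trace;
            }
            current = next;
        }
    }
}
```

Registering `step1 -> step2` and `step2 -> END` on "COMPLETED" mirrors the migration job: when both steps complete, the trace is step1, step2, COMPLETED; when step1 reports another status, the run stops after step1.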
Solution Result
• The process finished in less than 12 hours
• Time was left to handle the failed items
• The project replaced the old tool and was used in more than 6 other migration processes
• Sessions were organized to hand over the tool to the new engineers
• Enhancements are planned for the next versions
Comparison
Spring Batch
- Time: < 12 hours
- Clean code
- Less code
- Easy to learn
- Less complex
Legacy Tool
- Time: 14.58 days (estimated)
- More code
- More complex
Contact Me
• El-sherouk city, Cairo, Egypt
• +201023842575
• Taher.ayoub90@gmail.com
Any Questions?

Spring batch showCase


Editor's Notes

  1. JSR-352: the standardization of batch processing on the JVM. Transaction management: if you have a file with a million records, you do not want to process it all in one transaction. Spring Batch provides chunk-based steps that process the file chunk by chunk, track the processing state, and, if an error happens, record where processing stopped so that a restart resumes from that point. Declarative I/O: Spring Batch provides a collection of readers and writers for files, XML, JSON, JDBC, JPA, and more, so you can focus on the business logic. Scalability: on a single JVM or across multiple JVMs. Battle tested: in use since 2008 across many verticals (finance, retail, government), with mission-critical applications running on Spring Batch in production, so its components are well tested. Built on Spring: all the facilities of Spring are also available in Spring Batch, including Spring Initializr, Spring Boot, the application context, configuration, IDE integration, and testing utilities.
  2. Before talking about Spring Batch: has anyone here heard of the Spring Framework? It is a Java framework, but I think it is well known even to engineers with no Java background. Its modules include: an inversion-of-control container (dependency injection); an aspect-oriented programming framework; a data access framework; transaction management; a model-view-controller (MVC) framework; convention-over-configuration rapid application development (Spring Boot, Spring Roo); a batch framework; and an integration framework.
  3. Transitions: Spring Batch is a state machine, so we need to configure how to move from state to state and from step to step, and under which conditions. For example, when "step1" completes, do we go to "step2" or "step3"? We also need to configure the terminal state of the job itself: did it finish as successful, failed, stopped, and so on. Decisions: a decider orchestrates the job's steps based on the output of the step logic itself, so the flow depends not on the step's termination state but on what the step logic produced. Nested jobs: a job can be nested inside a step, which makes it cleaner to compose jobs instead of building one huge, complex job. How does it work? The parent job waits for the child job to complete. If the child job succeeds, the step is considered successful and the parent job continues; if the child job fails, the step is considered failed and the parent job stops, as expected. Job parameters: it is often useful to supply additional configuration at run time, and Spring Batch provides a mechanism for passing parameters to a job so that you can customize its configuration. Job parameters are also used to identify a job instance. As the diagram shows, there is a job (the conceptual job), a job can have many job instances, and a job instance can have many executions. A job instance is a logical run: if I have an "EndOfDay" job that should run once per day, I get one instance per day, so I pass a parameter for each day that identifies the new job instance. Each time I physically run the job I get a job execution, with the same parameters I passed to the job instance. In our case, for example, we needed to pass the input file path as a job parameter. A job instance can run to completion only once.
  4. This sequence diagram shows how a chunk-based step works: it reads records one by one and passes each to the processor until the chunk is complete, and then the writer writes the whole chunk's output at once. Reading and processing item by item helps with error handling, while writing all the output at once is done for performance: it is better to execute one insert statement than one per item.
  5. In chunk-based steps the ItemReader is responsible for providing the input to the step. Cursor item reader: very simple and stateful, so if something goes wrong the step can restart from the point of failure without issues. It is not thread safe, though: the result set has only one cursor, which causes problems if it is called from multiple threads. It is configured with the SQL (setSql), a data source, and a row mapper. Paging item reader: the key difference is that it is thread safe. If an error happens in a page, the whole page is considered failed and becomes the starting point of the next restart. Multiple resources: each file is read to the end before moving on to the next one. You tell Spring Batch which resources to read and it handles the rest, and you can keep track of each record's source. Managing state within chunk-based steps: this facility makes the retry and skip features possible. The job execution and step execution state is saved in the job repository, and the state of the step's components is saved as well, so you can save the execution state of the reader or the writer, so if an error occurred while reading or processing, the
  6. As a batch processing framework, Spring Batch provides many robust ways to handle errors. Relying on the job repository, Spring Batch knows the job or step where the last run failed and resumes from there if the job is restarted, that is, re-run with the same parameters. For example, take a job of 1M records with 3 steps and a chunk size of 1000, where step1 succeeded and step2 failed at record 500,250. If we restart the job, it skips step1 and the first 500,000 records that were processed successfully in step2, and resumes from record 500,001 in step2. Spring Batch also gives the ability to retry an action if something goes wrong.
  7. As you can see, the code is very readable and easy to understand even before going into details. Here we define a step that can be injected into any subsequent job definition. The implementation relies on the builder and factory design patterns: we use "stepBuilderFactory" to get a builder object, which builds the step from the supplied parameters. So we are building a step named "step1". It is a chunk-based step, not a tasklet, with a chunk size of 20 records. It uses the reader "fileItemReader" and the processor "subscriptionItemProcessor", and the output is written by the writer "fileItemWriter". Finally, we call the builder's build method to build the step.
  8. end(): a terminal state indicating that the job has finished successfully, so it cannot be restarted with the same arguments. There are also fail(), meaning that the job failed at that step, and stop(), indicating that we stopped the execution programmatically. These are terminal states implying that the job did not finish successfully, so it can be restarted and will resume from the point where it failed. stopAndRestart(step3) stops the job and, on restart, resumes at step3.
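The restart arithmetic in note 6 can be checked with a few lines of plain Java. This is a sketch under one assumption, that only whole chunks are committed, so a failure inside a chunk rolls that chunk back; `RestartPointSketch` is a hypothetical name, not a Spring Batch class.

```java
// Hypothetical helper illustrating the restart example; not Spring Batch's API.
final class RestartPointSketch {
    /**
     * Returns the first record (1-based) to re-read after a failure,
     * assuming only whole chunks before the failing record were committed.
     */
    static long firstRecordToReprocess(long failedRecord, int chunkSize) {
        long committed = ((failedRecord - 1) / chunkSize) * chunkSize;
        return committed + 1;
    }

    public static void main(String[] args) {
        // Figures from the note: chunk size 1000, failure at record 500,250.
        System.out.println(firstRecordToReprocess(500_250, 1000));
    }
}
```

For a failure at record 500,250 with a chunk size of 1000, 500 full chunks (500,000 records) were committed, so processing resumes at record 500,001.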