Spring Day | Behind the Scenes at Spring Batch | Dave Syer

Published on

2011-10-31 | 01:30 PM - 02:15 PM
Spring Batch has a large user base and a good track record in production systems, but what is it all really about, and why does it work? This presentation provides a short bootstrap to get a new user started with the Batch domain, showing the key concepts and explaining the benefits of the framework. Then it goes into a deeper dive and looks at what holds it all together, with a close look at some of the most important but least understood features, including restart, retry and transactions.

Published in: Technology, Education

  1. Inside Spring Batch. Dave Syer, VMware, JAX London 2011. Copyright 2007 SpringSource. Copying, publishing or distributing without express written permission is prohibited.
  2. Overview
     • Very quick start with Spring Batch
     • Spring Batch Admin
     • State management – thread isolation
     • Retry and skip
     • Restart and transactions
  3. Processing the Same File Twice…
  4. Spring Batch (layer diagram)
     • Application: business logic
     • Batch Core: quality of service, auditability, management information
     • Batch Infrastructure: re-usable low level stuff (flat files, XML files, database keys)
  5. Spring Batch Admin (layer diagram)
     • Layers: Application, Batch Core, Batch Infrastructure
     • Runtime services (JSON and Java) plus optional UI
  6. Simple Sequential Sample
         <job id="job">
             <step id="businessStep">
                 <tasklet>
                     <chunk reader="itemGenerator" writer="itemWriter" commit-interval="1"/>
                 </tasklet>
                 <next on="FAILED" to="recoveryStep"/>
                 <end on="*"/>
             </step>
             <step id="recoveryStep">
                 <tasklet ref="reportFailure"/>
             </step>
         </job>
  7. Item-Oriented Processing
     • Input-output can be grouped together = Item-Oriented Processing
     • Participants in the diagram: Step, ItemReader, ItemWriter
     • Sequence: execute() is called on the Step; the Step calls read() on the ItemReader and receives an item (repeat, retry, etc.); the Step calls write(items) on the ItemWriter; the Step returns an ExitStatus
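
The read/write grouping on this slide can be sketched in plain Java. This is only an illustration of the idea: the `Reader` and `Writer` interfaces, the `execute` method, and the class name are hypothetical stand-ins, not the Spring Batch API, and there is no transaction, retry, or exit status handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ChunkLoopSketch {

    /** Hypothetical reader: returns the next item, or null when input is exhausted. */
    interface Reader<T> {
        T read();
    }

    /** Hypothetical writer: receives a whole chunk of items at once. */
    interface Writer<T> {
        void write(List<T> items);
    }

    /** Reads items one at a time, writes them in chunks, returns the number of chunks written. */
    static <T> int execute(Reader<T> reader, Writer<T> writer, int commitInterval) {
        int chunks = 0;
        boolean exhausted = false;
        while (!exhausted) {
            List<T> chunk = new ArrayList<>();
            // Accumulate up to commitInterval items before handing them to the writer.
            while (chunk.size() < commitInterval) {
                T item = reader.read();
                if (item == null) {
                    exhausted = true;
                    break;
                }
                chunk.add(item);
            }
            if (!chunk.isEmpty()) {
                writer.write(chunk);
                chunks++;
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        Iterator<String> input = Arrays.asList("a", "b", "c", "d", "e").iterator();
        List<List<String>> written = new ArrayList<>();
        Reader<String> reader = () -> input.hasNext() ? input.next() : null;
        Writer<String> writer = written::add;
        int chunks = execute(reader, writer, 2);
        System.out.println(chunks + " chunks: " + written);
    }
}
```

With a commit interval of 2 and five input items, the writer is called three times, the last time with a single item.
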
  8. Job Configuration and Execution
     • Job: the EndOfDay Job
     • JobParameters: schedule.date = 2007/05/05
     • JobInstance (many per Job): the EndOfDay Job for 2007/05/05
     • JobExecution (many per JobInstance): the first attempt at the EndOfDay Job for 2007/05/05
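
The relationship on this slide (one job, one instance per distinct set of parameters, one execution per attempt) can be sketched as follows. The class, the `launch` method, and the string key are hypothetical simplifications for illustration and do not reflect how Spring Batch stores this data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class JobDomainSketch {

    /** One attempt at running a job instance (hypothetical, simplified). */
    static final class Execution {
        final String instanceKey;
        final int attempt;

        Execution(String instanceKey, int attempt) {
            this.instanceKey = instanceKey;
            this.attempt = attempt;
        }
    }

    // Job instance key -> every execution attempted for that instance.
    private final Map<String, List<Execution>> executionsByInstance = new HashMap<>();

    /** A job instance is identified by the job name plus its parameters (sorted for stability). */
    static String instanceKey(String jobName, Map<String, String> parameters) {
        return jobName + "|" + new TreeMap<>(parameters);
    }

    /** Launching with the same name and parameters adds an attempt to the same instance. */
    Execution launch(String jobName, Map<String, String> parameters) {
        String key = instanceKey(jobName, parameters);
        List<Execution> attempts = executionsByInstance.computeIfAbsent(key, k -> new ArrayList<>());
        Execution execution = new Execution(key, attempts.size() + 1);
        attempts.add(execution);
        return execution;
    }

    int instanceCount() {
        return executionsByInstance.size();
    }

    public static void main(String[] args) {
        JobDomainSketch repository = new JobDomainSketch();
        Map<String, String> fifth = Collections.singletonMap("schedule.date", "2007/05/05");
        Map<String, String> sixth = Collections.singletonMap("schedule.date", "2007/05/06");
        System.out.println(repository.launch("EndOfDay", fifth).attempt);
        System.out.println(repository.launch("EndOfDay", fifth).attempt);
        System.out.println(repository.launch("EndOfDay", sixth).attempt);
        System.out.println(repository.instanceCount() + " instances");
    }
}
```
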
  9. State Management
     • Isolation – thread safety
     • Retry and skip
     • Restart
  10. Thread Isolation: StepScope
      File writer needs to be step scoped so it can flush and close the output stream:
          <bean class="org.sfw...FlatFileItemWriter" scope="step">
              <property name="resource">
                  <value>/data/#{jobName}-#{stepName}.csv</value>
              </property>
          </bean>
      Because it is step scoped, the writer has access to the StepContext and can replace these patterns with runtime values.
  11. Step Scope Responsibilities
      • Create beans for the duration of a step
      • Respect Spring bean lifecycle metadata (e.g. InitializingBean at start of step, DisposableBean at end of step)
      • Recognise StepScopeAware components and inject the StepContext
      • Allow stateful components in a multithreaded environment
      • Well-known internal services recognised automatically
  12. Quality of Service
      • Stuff happens:
        – Item fails
        – Job fails
      • Failures can be
        – Transient – try again and see if you succeed
        – Skippable – ignore it and maybe come back to it later
        – Fatal – need manual intervention
      • Mark a job execution as FAILED
      • When it restarts, pick up where you left off
      • All framework concerns: not business logic
  13. Quality of Service Sample
          <step id="step1">
              <tasklet>
                  <chunk reader="itemGenerator" writer="itemWriter"
                         commit-interval="1" retry-limit="3" skip-limit="10">
                      ...
                  </chunk>
              </tasklet>
          </step>
  14. Retry and the Transaction
          REPEAT(while more input) {
              chunk = ACCUMULATE(size=500) {    // Chunk Provider
                  input;
              }
              RETRY {                           // Chunk Processor
                  TX {
                      for (item : chunk) { process; }
                      write;
                  }
              }
          }
  15. Retry and Skip: Failed Processor
      Skip is just an exhausted retry.
          RETRY(up to n times) {
              TX {
                  for (item : chunk) { process; }
                  write;
              }
          } RECOVER {
              TX {
                  for (item : successful) { process; }
                  write;
                  skip(item);
              }
          }
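
A minimal Java sketch of the pseudocode on this slide, under stated assumptions: the class, `processChunk`, and `Result` are hypothetical names, a local buffer stands in for the transaction (it is discarded on failure and handed back on success), and only a single failing item per chunk is handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class RetrySkipSketch {

    /** Outcome of one chunk: what was written and which items were skipped. */
    static final class Result<I, O> {
        final List<O> written;
        final List<I> skipped;

        Result(List<O> written, List<I> skipped) {
            this.written = written;
            this.skipped = skipped;
        }
    }

    static <I, O> Result<I, O> processChunk(List<I> chunk, Function<I, O> processor, int retryLimit) {
        int culprit = -1;
        // RETRY: each attempt starts from an empty buffer, like a rolled-back transaction.
        for (int attempt = 1; attempt <= retryLimit; attempt++) {
            List<O> buffer = new ArrayList<>();
            int index = 0;
            try {
                for (; index < chunk.size(); index++) {
                    buffer.add(processor.apply(chunk.get(index)));
                }
                return new Result<>(buffer, new ArrayList<>()); // "commit"
            } catch (RuntimeException e) {
                culprit = index; // remember which item failed, then try again
            }
        }
        // RECOVER: retries are exhausted, so process the other items and skip the culprit.
        // A failure on a different item here is not handled in this sketch and propagates.
        List<O> buffer = new ArrayList<>();
        List<I> skipped = new ArrayList<>();
        for (int i = 0; i < chunk.size(); i++) {
            if (i == culprit) {
                skipped.add(chunk.get(i));
            } else {
                buffer.add(processor.apply(chunk.get(i)));
            }
        }
        return new Result<>(buffer, skipped);
    }

    public static void main(String[] args) {
        Function<Integer, Integer> processor = i -> {
            if (i == 3) {
                throw new IllegalStateException("bad item " + i);
            }
            return i * 10;
        };
        Result<Integer, Integer> result = processChunk(Arrays.asList(1, 2, 3, 4), processor, 3);
        System.out.println("written=" + result.written + " skipped=" + result.skipped);
    }
}
```
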
  16. Flushing: ItemWriter
          import java.util.List;
          import org.springframework.batch.item.ItemWriter;

          public class RewardWriter implements ItemWriter<Reward> {
              public void write(List<? extends Reward> rewards) {
                  // do stuff to output Reward records
                  // and flush changes here…
              }
          }
  17. Retry and Skip: Failed Write
          RETRY(up to n times) {
              TX {
                  for (item : chunk) { process; }
                  write;
              }
          } RECOVER {                           // Scanning for failed item
              for (item : chunk) {
                  TX {
                      process;
                      write;
                  } CATCH {
                      skip(item);
                  }
              }
          }
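
When a whole-chunk write fails, the failing item is not known, which is why the slide falls back to scanning one item at a time. A rough Java sketch of that idea follows. The class and `writeWithScan` are hypothetical names, the processing step is omitted, and the writer is assumed to be all-or-nothing (it throws before storing anything, standing in for a rolled-back transaction).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

public class WriteScanSketch {

    /** Writes the chunk, retrying on failure, then scans item by item. Returns the skipped items. */
    static <T> List<T> writeWithScan(List<T> chunk, Consumer<List<T>> writer, int retryLimit) {
        // RETRY: write the whole chunk in one go.
        for (int attempt = 1; attempt <= retryLimit; attempt++) {
            try {
                writer.accept(chunk);
                return Collections.emptyList();
            } catch (RuntimeException e) {
                // Assumed rolled back: nothing from this attempt was stored.
            }
        }
        // RECOVER: one item per "transaction", so a failure identifies the bad item.
        List<T> skipped = new ArrayList<>();
        for (T item : chunk) {
            try {
                writer.accept(Collections.singletonList(item));
            } catch (RuntimeException e) {
                skipped.add(item);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<String> database = new ArrayList<>();
        // All-or-nothing writer: rejects the whole list if any item is invalid.
        Consumer<List<String>> writer = items -> {
            if (items.contains("bad")) {
                throw new IllegalStateException("constraint violation");
            }
            database.addAll(items);
        };
        List<String> skipped = writeWithScan(Arrays.asList("a", "bad", "c"), writer, 3);
        System.out.println("database=" + database + " skipped=" + skipped);
    }
}
```
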
  18. Restart and Partial Failure
      • Store state to enable restart
      • What happens to the business data on error?
      • What happens to the restart data?
      • Goal: they all need to roll back together
  19. Partial Failure: Piggyback the Business Transaction
          JOB {
              STEP {
                  REPEAT(while more input) {
                      TX {                          // inside business transaction
                          REPEAT(size=500) {
                              input;
                              output;
                          }
                          FLUSH and UPDATE;         // database: persist context data for next execution
                      }
                  }
              }
          }
  20. ItemStream
      • Participants in the diagram: Step, ItemStream, JobRepository
      • execute() is called on the Step
      • The Step calls open(executionContext) on the ItemStream
      • Repeat, retry, etc.: the Step calls update(executionContext) on the ItemStream (called before commit), then save(executionContext) on the JobRepository
      • The Step calls close() on the ItemStream
      • The Step returns an ExitStatus
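
The open/update/close lifecycle on this slide is what makes restart possible: state saved at each commit is read back when the next execution opens the stream. A small Java sketch follows, with the caveat that the `Stream` interface, `ListReader`, the context key, and the use of a plain `Map` as the execution context are all hypothetical stand-ins for the real Spring Batch types.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RestartableReaderSketch {

    /** Hypothetical lifecycle mirroring the slide: open, update before each commit, close. */
    interface Stream {
        void open(Map<String, Object> executionContext);
        void update(Map<String, Object> executionContext);
        void close();
    }

    /** Reader over a list that records its position so a later execution can resume. */
    static final class ListReader implements Stream {
        private static final String KEY = "reader.position"; // hypothetical context key
        private final List<String> items;
        private int position;

        ListReader(List<String> items) {
            this.items = items;
        }

        @Override
        public void open(Map<String, Object> executionContext) {
            Object saved = executionContext.get(KEY);
            position = saved == null ? 0 : (Integer) saved; // resume, or start from the top
        }

        @Override
        public void update(Map<String, Object> executionContext) {
            executionContext.put(KEY, position); // called just before the chunk commits
        }

        @Override
        public void close() {
            // nothing to release in this sketch
        }

        String read() {
            return position < items.size() ? items.get(position++) : null;
        }
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c", "d");
        Map<String, Object> context = new HashMap<>(); // stands in for persisted restart data

        ListReader first = new ListReader(input);
        first.open(context);
        first.read();
        first.read();
        first.update(context); // chunk of two committed
        first.read();          // read in a chunk that never commits
        first.close();

        ListReader second = new ListReader(input);
        second.open(context);  // restart picks up after the last committed chunk
        System.out.println("resumed at: " + second.read());
    }
}
```

Because the position is only saved at commit time, the item read in the failed chunk is read again after restart, which matches the goal that business data and restart data roll back together.
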
  21. Overview
      • Very quick start with Spring Batch
      • Spring Batch Admin
      • State management – thread isolation
      • Retry and skip
      • Restart and transactions
  22. Q&A