Give your little scripts big wings: Using cron in the cloud with Amazon Simple Workflow



Most developers write them and every company has them – a vast library of small and large scripts that are designed to run on a scheduled basis. These background angels help keep the lights on and the doors open. They’ve been built up over time and are forgotten little heroes that are only remembered when the machines they live on fail. They are scattered throughout a company’s IT infrastructure and do important things.
In this session, we will explain how to use Ruby on Simple Workflow to quickly build a system that schedules scripts, runs them on time, retries them if they fail, and stores the history of their execution. You will walk away from this session with an understanding of how Simple Workflow brings resiliency, concurrency, and tracking to your applications.

Published in: Technology


  1. 1. Give your little scripts big wings: Using cron in the cloud with Amazon Simple Workflow
      Asad Jawahar, Senior Technical Program Manager
  2. 2. For applications with multiple connected steps…
      • Amazon Simple Workflow provides the building blocks and a controller to reduce the complexity of infrastructure programming and state machinery.
      This helps customers focus on...
      • Algorithms and logic that make their organizations unique
  3. 3. Many Use Cases
  4. 4. Cron
      • Scheduled tasks
      • Why use SWF for Cron?
        – Failure handling
        – Scale
        – Lost tasks (office move, machine failure)
        – OS-provided cron not an option (shared hosting)
  5. 5. Roadmap
      • Code: the implementation of your control logic and steps
      • Deployment: worker processes, machines, SWF processing engine
      • Logical: process model, flow control, retry logic, etc.
  6. 6. Key Simple Workflow concepts
      • Activities (and workers): discrete steps in your application, processors
      • Workflows (and workers): logical grouping for multiple activities, processors
      • Deciders: control logic for workflows (execution order, retry policies, timer logic, etc.)
        – Decision tasks
      • Retained workflows: execution history for workflows, auditable and re-playable
      • Timers, signals: scheduled actions, workflow "interrupt requests"
  7. 7. SWF Concepts
      [Diagram: an image-processing workflow. Activities: Download, Encode, Apply DRM, Upload, with a "Copyright?" check in the decider logic. Amazon SWF sits between the workflow workers (decision tasks) and the activity workers (activity tasks) and holds the history.]
  8. 8. Responsibilities
      You
      • Step sequence logic ("secret sauce")
      • Discrete step logic
      • Workflow workers
      • Activity workers
      • I/O data
      SWF (AWS)
      • State of execution
      • Pending tasks
      • Execution history
      • Searching histories
      • Control engine
  9. 9. Cron in SWF
      class CronWorkflow
        extend Decider
        entry_point :start
        activity_client :activityClient do |options|
          options.prefix_name = "CronActivities"
        end
        def start(arg)
          while true do
            create_timer(86400) { activityClient.backup "myDB" }
          end
        end
      end

      class CronActivities
        extend Activity
        activity :backup
        def backup(arg)
          # backup script
        end
      end
  10. 10. Daily Build Process
      Ordinary cron, single point of failure
      [Diagram: Download files from S3 → Run build tools → Upload build artifacts to S3 → Delete local files → Send email]
  11. 11. Handling Failures
      [Diagram: Download files from S3 → Run build tools → Upload build artifacts to S3 → Delete local files → Send email, with two "Pass?" checks that branch on Yes / No]
  12. 12. Resilient Cron
      [Diagram: Job1, Job2, Job3, Job4]
  13. 13. Resilient Cron
      [Diagram: SWF and Resilient Cron, labeled: distributed asynchronous interactions, failure handling, scalability, latency, auditability]
      Coordination engine in the cloud
        – Requirements: coordinate scheduled tasks across distributed hosts; reliable messaging
        – SWF provides: stores and dispatches tasks; delivery guarantees
      Scalability
        – Requirements: add workers at any time; need stateless workers; state provided by another system
        – SWF provides: repository of distributed state; workers poll for work; exactly once delivery
      Fault Tolerance
        – Requirements: many types of failures; different mitigations; must react quickly
        – SWF provides: no loss of distributed state; supports timeouts on work; explicit failure modeling
      Audit History
        – Requirements: need visibility into process execution; retain records for investigation
        – SWF provides: index of executions; play-by-play history of execution; retention of completed executions
      Low Latency
        – Requirements: get work quickly; route tasks to specific workers
        – SWF provides: long polling for work; task routing through task lists
  14. 14. Introducing AWS Flow Framework (Ruby)
      • Ease-of-use library for Amazon SWF APIs
      • Uses standard Ruby constructs
      • Open source
      • Packaged as a gem
      • In Private Beta (stay tuned for release)
  15. 15. Benefits of AWS Flow Framework
      • Run and load balance work on many machines with minimal changes
      • Simplifies remote, long running, non-blocking steps
      • Uses Amazon SWF to:
        – Keep track of progress
        – Trigger your code to execute next steps
      • Failure handling built in
      • Easily evolve your logic without adding complexity
  16. 16. Hourly Build – Logical Control Flow
      wait (1 hour)
      copy files
      run build task
      upload files
      delete local files
      send email
      if (failed) retry up to 3 times
      repeat
      [Diagram: Download files from S3 → Run build tools → Upload build artifacts to S3 → Delete local files → Send email]
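The control flow on this slide can be sketched in plain Ruby. This is only an illustration under stated assumptions: it makes no SWF calls, and `with_retries`, `run_build`, and the step lambdas are hypothetical names that do not come from the Flow Framework. It assumes the retry applies to the copy, build, and upload steps, and that cleanup and email run regardless of the outcome.

```ruby
# Plain-Ruby sketch; nothing here talks to SWF.
# Re-runs the block until it succeeds or max_attempts is reached.
def with_retries(max_attempts)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end

# steps is a hash of callables (hypothetical stand-ins for the activities).
def run_build(steps, log)
  with_retries(3) do |attempt|
    log << "attempt #{attempt}"
    steps.fetch(:copy).call
    steps.fetch(:build).call
    steps.fetch(:upload).call
  end
ensure
  # Cleanup and notification run whether or not the build succeeded.
  steps.fetch(:delete).call
  steps.fetch(:email).call
end
```

Wrapping `run_build` in a loop with a one-hour wait would give the "repeat" step; in SWF that wait would be a timer rather than a sleeping process.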
  17. 17. Build Cron Workflow – Execution Flow
      [Diagram: Your app, the SWF console, or the CLI starts an execution of CronWorkflow. The decider long-polls Amazon SWF for decision tasks and returns decisions on what tasks to schedule, when, and in what order. Worker 1 and Worker 2 long-poll for activity tasks.]
      Decisions:
      1. Schedule download [shared]
      2. Schedule build [worker1]
      3. Schedule upload [worker1]
      4. Schedule delete files [worker1]
      5. Schedule email [worker2]
      6. Execution complete
      Activity results: 1. /tmp, worker1; 2. Built; 3. Uploaded; 4. Deleted; 5. Email sent
      Execution history: input data, download complete, build complete, upload complete, delete local file, email sent
  18. 18. Hourly Build – Decider
      class BuildWorkflow
        extend Decider
        entry_point :start
        activity_client :client do |options|
          options.prefix_name = "BuildActivities"
        end
        def start(source_bucket, target_bucket)
          while true do
            create_timer(25) { start_build(source_bucket, target_bucket) }
          end
        end
        def start_build(source, target)
          begin
            dir =, target)
          ensure
            client.delete(dir)
            client.send_email(dir)
          end
        end
      end
  19. 19. Task Routing
      def start_build(source_bucket, target_bucket)
        activity_client :client do |options|
          options.prefix_name = "BuildActivities"
        end
        host_specific_task_list, dir = do |options|
          options.default_task_list = host_specific_task_list
        end
        client.upload(dir, target_bucket) do |options|
          options.default_task_list = host_specific_task_list
        end
        client.delete(dir) do |options|
          options.default_task_list = host_specific_task_list
        end
      end
      end
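The routing idea on this slide is that the first step runs on any host, and later steps that need the downloaded files are sent to a task list that only that host polls. A minimal in-memory sketch, assuming nothing about the Flow Framework API: `TaskLists`, the list names `"shared"` and `"host-a"`, and the step symbols are all hypothetical.

```ruby
# In-memory stand-in for SWF task lists: one FIFO queue per list name.
class TaskLists
  def initialize
    @queues = Hash.new { |hash, name| hash[name] = [] }
  end

  def schedule(task_list, task)
    @queues[task_list] << task
  end

  # A worker only receives tasks from the list it polls; nil means no work.
  def poll(task_list)
    @queues[task_list].shift
  end
end

lists = TaskLists.new
lists.schedule("shared", :download)

# Any worker may take the shared download; here a worker on "host-a" does,
# and reports the name of the task list that only it polls.
task = lists.poll("shared")
host_specific_task_list = "host-a" # hypothetical value returned by the download step

# Steps that depend on the local files are routed to that host's list.
[:build, :upload, :delete].each do |step|
  lists.schedule(host_specific_task_list, step)
end
```

A worker polling a different list never sees the follow-up steps, which is what keeps the build on the machine that holds the files.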
  20. 20. Exponential Retry
      def start_build(source_bucket, target_bucket)
        activity_client :client do |options|
          options.prefix_name = "BuildActivities"
        end
        dir = client.exponential_retry(:download, source_bucket) do |options|
          options.maximum_attempts = (:upload, dir, target_bucket) do |options|
          options.maximum_attempts = 3
        end
        client.exponential_retry(:delete, dir) do |options|
          options.maximum_attempts = 3
        end
      end
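What "exponential retry" means can be shown without the framework: each failed attempt waits longer than the previous one before trying again, up to a maximum number of attempts. The sketch below is plain Ruby under stated assumptions. `backoff_delays` and `call_with_backoff` are hypothetical helpers, and the initial interval of 1 second and coefficient of 2 are illustrative values, not the Flow Framework's defaults.

```ruby
# Hypothetical helper, not part of the AWS Flow Framework.
# One wait precedes each retry, so N attempts need N - 1 waits.
def backoff_delays(maximum_attempts:, initial_interval: 1, backoff_coefficient: 2)
  (0...(maximum_attempts - 1)).map { |n| initial_interval * backoff_coefficient**n }
end

# Runs the block, sleeping for the next delay after each failure.
# sleeper is injectable so callers can avoid real sleeping.
def call_with_backoff(maximum_attempts:, sleeper: ->(seconds) { sleep(seconds) }, **opts)
  delays = backoff_delays(maximum_attempts: maximum_attempts, **opts)
  begin
    yield
  rescue StandardError
    delay = delays.shift
    raise if delay.nil? # attempts exhausted: surface the last error
    sleeper.call(delay)
    retry
  end
end
```

With `maximum_attempts: 3` the helper waits twice at most, doubling the wait the second time, and re-raises the error if the third attempt also fails.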
  21. 21. Activities
      class BuildActivities
        extend Activity
        activity :download, :build, :upload, :delete do |options|
          options.default_task_list = "list1"
          options.version = "1"
          options.default_task_heartbeat_timeout = "3600"
          options.default_task_schedule_to_close_timeout = "30"
          options.default_task_schedule_to_start_timeout = "30"
          options.default_task_start_to_close_timeout = "30"
        end
        def download(bucket)
          puts bucket
        end
        def build(dir)
          puts dir
        end
        def upload(dir, bucket)
          puts bucket
        end
        def delete(dir)
          puts dir
        end
      end
  22. 22. Multiple builds in parallel
      • Parent Cron workflow kicks off child Build workflows
      • Child workflow
        – A workflow started from another workflow
        – Runs independently with its own history
        – Invocation similar to activities
        – Factors functionality into reusable components
      • Flow and SWF can run child workflows and activities in parallel
  23. 23. Multiple builds in parallel
      [Diagram: Cron Workflow (parent) starts Build Workflow (OS A) and Build Workflow (OS B). Each build workflow runs: Download files from S3 → Run build tools → Upload build artifacts to S3 → Delete local files → Send email]
  24. 24. Multiple builds in parallel
      class CronWorkflow
        extend Decider
        entry_point :start
        def start(w_source_bucket, w_target_bucket, l_source_bucket, l_target_bucket)
          while true do
            workflow_client :w_client do |options|
              "BuildWorkflow"
              options.task_list = "win"
            end
            workflow_client :l_client do |options|
              "BuildWorkflow"
              options.task_list = "linux"
            end
            create_timer(arg) do
              result1 = w_client.send_async :start, w_source_bucket, w_target_bucket
              result2 = l_client.send_async :start, l_source_bucket, l_target_bucket
              wait_for_all(result1, result2)
            end
            continue_as_new
          end
        end
      end
  25. 25. Concurrency in Ruby Flow
      • Decider is single threaded
      • Blocking semantics by default
      • send_async for asynchronous execution
        – Returns a Future
        – Cedes control when waited on
        – Uses fibers (requires Ruby 1.9 or better)
        – Code looks similar to synchronous code
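How a single-threaded decider can "cede control when waited on" is easier to see with a toy example built on Ruby's core `Fiber` class. This is a sketch of the general technique, not the Flow Framework's implementation: `ToyFuture` and `wait_for` are hypothetical names, and the real framework's Future and scheduling are more involved.

```ruby
# Toy future: a slot that is filled in later.
class ToyFuture
  attr_reader :value

  def initialize
    @set = false
  end

  def set(value)
    @value = value
    @set = true
  end

  def set?
    @set
  end
end

# Must be called from inside a fiber: parks the fiber until the future is set.
def wait_for(future)
  Fiber.yield until future.set?
  future.value
end

log = []
result = ToyFuture.new

decider = Fiber.new do
  log << "scheduled build"
  # Reads like a blocking call, but hands control back to the caller.
  log << "build returned #{wait_for(result)}"
end

decider.resume           # runs until the decider waits on the future
log << "decider parked"
result.set("ok")         # stands in for the activity completing
decider.resume           # the decider continues where it left off
```

The decider body reads top to bottom like synchronous code, yet the thread was free to do other work between the two `resume` calls.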
  26. 26. Continuous workflows
      • SWF allows a workflow to stay open up to 1 year
      • Workflow history grows over time as events get added
      • Large history => latency
      • Create new runs to keep history size in check
      class BuildWorkflow
        extend Decider
        entry_point :start
        def start(source_bucket, target_bucket)
          while true do
            create_timer(3600) { start_build }
            continue_as_new(source_bucket, target_bucket)
          end
        end
      end
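The effect of starting a new run can be illustrated with a small simulation. It is a plain-Ruby sketch under stated assumptions: `Run`, `simulate`, the event strings, and the per-run event limit are hypothetical, and real SWF history lives in the service rather than in your process.

```ruby
# Hypothetical simulation of run histories; not an SWF API.
Run = Struct.new(:input, :history)

def simulate(ticks:, max_events_per_run:, input:)
  runs = [Run.new(input, [])]
  ticks.times do |i|
    # Each scheduled build adds events to the current run's history.
    runs.last.history << "timer fired #{i}" << "build #{i} done"
    # "Continue as new": close this run and start a fresh one with the same input.
    runs << Run.new(input, []) if runs.last.history.size >= max_events_per_run
  end
  runs
end
```

However many builds are scheduled, no single run's history grows past the limit, because the work carries on in a fresh run with the same input.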
  27. 27. Activity Worker
      • Hosts activity implementation
      • Polls SWF for tasks and dispatches to your code
      • Uses two thread pools
        – Polling
        – Running activity tasks
      • Activity implementation must be thread safe
      activity_worker =, domain, task_list)
      activity_worker.add_activities_implementation(BuildActivities)
      activity_worker.start
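The split between polling and running tasks can be sketched with core Ruby threads and queues. This is an illustration of the pattern only, assuming an in-process `Queue` in place of an SWF task list: `ToyActivityWorker`, `EchoActivities`, and the `:stop` sentinel are hypothetical and are not the Flow Framework's worker classes.

```ruby
# Toy worker: one polling thread hands tasks to a pool of runner threads.
class ToyActivityWorker
  attr_reader :results

  def initialize(task_list, implementation, pool_size: 2)
    @task_list = task_list           # Queue standing in for an SWF task list
    @implementation = implementation # shared by all runners, so it must be thread safe
    @pool_size = pool_size
    @handoff = Queue.new
    @results = Queue.new
  end

  # Starts the threads and returns them so the caller can join.
  def start
    poller = Thread.new do
      while (task = @task_list.pop) != :stop
        @handoff << task
      end
      @pool_size.times { @handoff << :stop } # tell every runner to finish
    end
    runners = Array.new(@pool_size) do
      Thread.new do
        while (task = @handoff.pop) != :stop
          name, *args = task
          @results << [name, @implementation.public_send(name, *args)]
        end
      end
    end
    [poller, *runners]
  end
end

# Hypothetical activity implementation with no shared mutable state.
class EchoActivities
  def download(bucket)
    "downloaded #{bucket}"
  end

  def build(dir)
    "built #{dir}"
  end
end

task_list = Queue.new
task_list << [:download, "sources"]
task_list << [:build, "/tmp/build"]
task_list << :stop

worker = ToyActivityWorker.new(task_list, EchoActivities.new)
worker.start.each(&:join)

results = []
results << worker.results.pop until worker.results.empty?
```

Because several runner threads call the same implementation object, any state it keeps must be safe to touch concurrently, which is the point of the thread-safety bullet above. The order of `results` is not guaranteed.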
  28. 28. Workflow Worker
      • Hosts workflow implementation
      • Polls SWF for tasks and dispatches to your code
      • Uses a single thread pool for polling and running tasks
        – Your logic should be lightweight and deterministic
      • Delegate heavy lifting to activities
      worker =, domain, task_list)
      worker.add_workflow_implementation_type(CronWorkflow)
      worker.add_workflow_implementation_type(BuildWorkflow)
      worker.start
  29. 29. Starting an execution
      factory = workflow_factory swf, domain do |options|
        options.workflow_name = "CronWorkflow"
        options.execution_start_to_close_timeout = 3600
        options.task_list = task_list
        options.task_start_to_close_timeout = 3600
        options.child_policy = :request_cancel
      end
      client = factory.get_client
      workflow_execution = client.start("sources", "binaries")
  30. 30. Learn More
      • Amazon SWF
      • AWS Flow Framework in Ruby – TBD
      • Application Samples
      • Webinar Videos
        – Introduction to Amazon SWF
        – Using the AWS Flow Framework
      • AWS Flow Framework source on GitHub – TBD