ScaleFast Grid And Flow
A Python-based grid computing project with process workflow built in. Deploy and manage simple through to complex business processes across a distributed network of dedicated or on-demand commodity computers. Run command-line apps and native Python, Java and .Net code.

Published in: Technology
  • Scheduling Tools: ActiveBatch
  • Issues in the top grouping can be addressed by tools like ActiveBatch
  • Support staff can now see, in very granular detail, where a process failed. It is now easier to determine the cause of a failure: resource issues, bad static data, or bugs in code.
  • There is also an instance of Flow that can be easily integrated into an Enterprise Workflow/Automation Application
  • Transcript

    • 1. Grid and Flow
      By Robert Betts
    • 2. The Offering
      A distributed, stable and well synchronised platform for grid computing.
      A methodology, toolset and library for deploying an efficient, co-ordinated and parallel platform.
    • 3. Current Operational Challenges
      Processing of large datasets open to high rates of failure
      Processing takes a long time to complete
      Processing is often executed sequentially
      Technology bottlenecks e.g. 32 bit software often only supports between 2 and 3 GB RAM
      64 bit processes can hog available resources
      Most tools can’t exploit multi core configurations
      Dedicated hardware allocated to accommodate the maximum processing load
      Difficult to audit and trace processing problems
    • 4. With ScaleFast you can ...
      Centralise all business processes
      Define hierarchical processes with step inter-dependencies
      Parallelise the running of processes and steps
      Re-run failed processes from any point.
      Automatically split large process steps
      Make use of multi core/processor computers
      Distribute jobs across multiple computers
      Make use of user workstations and other idle computing resources
    • 5. ScaleFast Grid
      Distributed Computing Grid
      A distributor and worker nodes
      Implements map/reduce
      Workers can run on user workstations or dedicated infrastructure
      Can be easily deployed to a cloud platform
      Supports the native running of Python, Java and .Net
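The split, map and reduce idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ScaleFast's API: `split`, `run_job`, `map_task` and `reduce_results` are hypothetical names, and a local thread pool stands in for the grid's worker nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split(items: Sequence[T], chunk_size: int) -> List[Sequence[T]]:
    """Split one large step's input into fixed-size chunks (one chunk per task)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def run_job(
    items: Sequence[T],
    map_task: Callable[[Sequence[T]], R],
    reduce_results: Callable[[List[R]], R],
    chunk_size: int = 1000,
    workers: int = 4,
) -> R:
    """Map each chunk on a pool of workers, then reduce the partial results."""
    chunks = split(items, chunk_size)
    # Assumption: a thread pool stands in for remote worker nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_task, chunks))  # results keep chunk order
    return reduce_results(partials)


# Hypothetical job: sum a list of 10 000 made-up trade amounts.
trade_amounts = list(range(1, 10001))
total = run_job(trade_amounts, map_task=sum, reduce_results=sum)
```

Keeping each chunk small is also what lets a step that would exhaust memory as a single task run as many modest ones.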
    • 6. ScaleFast Flow
      Process Workflow Engine
      Processes are made up of individual jobs which have inter-dependencies
      The output of one job can be the input of the next job
      Processes have notifications based on success or failure
      Flow has a built in scheduler which can be triggered by:
      Time with multiple time zone support
      User Interface
      Processes can be restarted from any point of failure
      Processes can be made up of sub processes
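A minimal sketch of the ideas on this slide (jobs with inter-dependencies, one job's output feeding the next, and resuming after a failure) might look like the following. The `Process` class and the job names are hypothetical and are not Flow's actual interface; this sketch only resumes from the failed step, whereas the slide describes restarting from any point.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List, Optional


class Process:
    """Hypothetical container for jobs and their dependencies."""

    def __init__(self) -> None:
        self.jobs: Dict[str, Callable[[Dict[str, object]], object]] = {}
        self.deps: Dict[str, List[str]] = {}
        self.outputs: Dict[str, object] = {}  # results of completed jobs
        self.failed: Optional[str] = None     # name of the job that failed, if any

    def add_job(self, name, func, depends_on=()) -> None:
        self.jobs[name] = func
        self.deps[name] = list(depends_on)

    def run(self) -> bool:
        """Run jobs in dependency order, skipping jobs that already completed."""
        self.failed = None
        for name in TopologicalSorter(self.deps).static_order():
            if name in self.outputs:
                continue  # finished in an earlier attempt, do not re-run
            inputs = {dep: self.outputs[dep] for dep in self.deps[name]}
            try:
                self.outputs[name] = self.jobs[name](inputs)
            except Exception:
                self.failed = name
                return False
        return True


attempts = {"transform": 0}


def load(_inputs):
    return [1, 2, 3]


def transform(inputs):
    attempts["transform"] += 1
    if attempts["transform"] == 1:
        raise RuntimeError("simulated resource failure")
    return [x * 10 for x in inputs["load"]]


def report(inputs):
    return sum(inputs["transform"])


process = Process()
process.add_job("load", load)
process.add_job("transform", transform, depends_on=["load"])
process.add_job("report", report, depends_on=["transform"])

first_ok = process.run()       # stops at "transform"
failed_step = process.failed
second_ok = process.run()      # resumes at "transform"; "load" is not re-run
```

Storing each job's output is what makes the resume cheap: completed steps are skipped and their results are handed to the steps that depend on them.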
    • 7. Common Use Cases
      Reporting and data processing
      Stabilising processes that fail due to resource constraints
      Speeding up processes that take a long time to run
      Improve and/or balance resource utilisation
      Process orchestration and scheduling
      Co-ordinating processes with event based synchronisation
      Parameter and data flow between process steps
      Centralisation and versioning of processes
      Reducing support administration with full process auditing
      General processing and application development
      Any application/process that would benefit from parallelism
      Risk Management and P&L Processing
      Distributed Computations
    • 8. Case Study 1 – Hedge Fund
      • Trade volumes of 10 000 per day
      • Reports continuously failing
      • Reporting taking longer to run each day
      • System support occupies a full-time resource with additional assistance frequently required
      • Overnight failures push EOD processing to t+2 (SLA at t+1 am)
      • Fund considers:
      Adding head count with full-time EOD support resources
      Purchasing additional hardware
      Purchasing a scheduling and automation product
      • Process failures reduced significantly
      • Fine grained audit trail of all processes
      • EOD processing time reduced from 8 hours to 50 minutes
      • Hardware freed up
      • No additional headcount required
    • Case Study 2 – Hedge Fund
      • Fixed Income Risk Project
      New trading system implementation for Risk Management and P&L
      Requirement to take all Risk processing in house
      • After trading system implemented
      EOD processes become more complex and onerous
      Reports begin to fail
      • Fund considers:
      Head count requirement in supporting new trading system
      Purchasing of hardware for additional processing
      • Process failures reduced significantly
      • Fine grained audit trail of all processes
      • EOD processing time reduced from 4 hours to 30 minutes
      • Hardware freed up
      • Risk Engine built on top of Grid and Flow.
      • Scenario analysis report with 50 scenarios across 2000 positions runs in under 5 minutes
    • Case Study 3 – Bank
      • Key EOD reports failing due to resource constraints
      • Tried shell scripts to split reports
      • Tried refactoring reports
      • Reports still took 7 hours to run and were taking longer to run each day
      • A single failure required a complete restart
      • A failure and restart would result in SLA failures to all downstream systems
      • Bank considers:
      Purchasing additional hardware
      Re-assessing support requirements
      • Process failures reduced significantly
      • On failure, reporting process can now resume from any step
      • Completed reports are now processed in 40 minutes
      • Fine grained audit trail of process to aid support staff
      • Hardware freed up for other projects
    • Other Uses
      Monte Carlo framework for pricing exotic structured credit instruments
      Risk Management Processing
      General application processing
      Process synchronisation
      Loading and parsing large datasets
    • 33. SCALEFAST Architecture
      Flow stores, versions and schedules workflows which are predefined and synchronised grid jobs.
      Grid Clients are any processes able to submit grid jobs.
      Grid Distributor receives job requests and maps the reduced jobs as tasks across workers.
      Grid Workers request and process job tasks
      [Diagram labels: Grid Client 0 ... Q; Grid Distributor; Worker 0 ... X; Server 0 ... N; Local disk; Shared Storage]
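The slide says workers request tasks from the distributor rather than having work pushed to them. A rough in-process sketch of that pull model follows. It is an assumption-laden illustration, not the ScaleFast implementation: `Distributor`, `submit_job`, `request_task` and `report_result` are hypothetical names, and threads stand in for worker nodes on separate servers.

```python
import queue
import threading
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Task = Tuple[int, Callable[[Sequence[int]], int], Sequence[int]]


class Distributor:
    """Hypothetical in-process distributor holding a queue of tasks."""

    def __init__(self) -> None:
        self._tasks: "queue.Queue[Task]" = queue.Queue()
        self._results: Dict[int, int] = {}
        self._lock = threading.Lock()

    def submit_job(self, func, chunks) -> None:
        """A grid client submits a job; each chunk becomes one task."""
        for task_id, chunk in enumerate(chunks):
            self._tasks.put((task_id, func, chunk))

    def request_task(self) -> Optional[Task]:
        """Called by a worker; returns None when no work is left."""
        try:
            return self._tasks.get_nowait()
        except queue.Empty:
            return None

    def report_result(self, task_id: int, result: int) -> None:
        with self._lock:
            self._results[task_id] = result

    def results(self) -> List[int]:
        """Partial results ordered by task id."""
        return [self._results[i] for i in sorted(self._results)]


def worker(distributor: Distributor) -> None:
    """Pull tasks until the distributor has none left."""
    while True:
        task = distributor.request_task()
        if task is None:
            break
        task_id, func, chunk = task
        distributor.report_result(task_id, func(chunk))


distributor = Distributor()
distributor.submit_job(sum, [[1, 2], [3, 4], [5, 6]])
threads = [threading.Thread(target=worker, args=(distributor,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
partials = distributor.results()
```

With a pull model, a faster or less loaded worker simply asks for more tasks, which is one way a grid can use idle workstations without a central view of each machine's load.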
    • 34. Flow GUI
      A simple process example with 3 steps
      Processes can have multiple branch dependencies e.g. 1 to many and many to 1
      Processes can be built up from sub processes
      Flow highlights the status of the individual steps
      By clicking on a step, you are redirected to the Grid for further details.
      Processes can be paused and restarted.
      On a process failure, it can be restarted at any step in the process.
    • 35. Grid Job Details GUI
      Parameters, status and details of a grid job
      Individual tasks can be drilled down into
      Stderr and Stdout can be accessed and queried across all tasks
      Input parameters, context and output visible at job or task level
    • 36. Grid Summary GUI
      High level view on the Grid status and activity
      View active worker nodes
      View job activity and history