Distributed Task Execution

Does your architecture need distributed task processing? This presentation will help you define criteria for choosing an implementation.

    Presentation Transcript

    • Distributed Task Execution
      Vitalii Tymchyshyn
      tivv00@gmail.com
    • Participants
      ● Task source
      ● Tasks
      ● Task distribution server(s)
      ● Task executors
      ● Monitoring & reporting subsystem
    • Task source types & main problems
      Dynamic task source
      ● Pull vs Push source
      ● Buffering
      ● Flow control
      Large set of tasks to execute
      ● Batching
      ● Restart on failures
      ● Progress & problems monitoring
      ● Tail slowdown
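Buffering, flow control, and batching from the slide above can be illustrated together. The following is only a minimal sketch, assuming a push-style source and pull-style executors inside one process; `run_pipeline`, `STOP`, and all parameter names are hypothetical and do not come from the presentation. A bounded queue acts as the buffer, and its blocking `put` is the flow control: the source cannot outrun the executors.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel: tells exactly one executor to exit


def run_pipeline(tasks, buffer_size=4, batch_size=3, workers=2):
    """Push tasks through a bounded buffer; executors pull them in batches."""
    buf = queue.Queue(maxsize=buffer_size)  # bounded buffer between source and executors
    batches = []                            # stands in for "work done"
    lock = threading.Lock()

    def executor():
        stopping = False
        while not stopping:
            batch = []
            item = buf.get()                # block until at least one item arrives
            while True:
                if item is STOP:
                    stopping = True
                    break
                batch.append(item)
                if len(batch) == batch_size:
                    break
                try:
                    item = buf.get_nowait()  # take more only if already buffered
                except queue.Empty:
                    break
            if batch:
                with lock:
                    batches.append(batch)

    threads = [threading.Thread(target=executor) for _ in range(workers)]
    for t in threads:
        t.start()
    for task in tasks:
        buf.put(task)                       # blocks while the buffer is full (flow control)
    for _ in threads:
        buf.put(STOP)                       # one sentinel per executor
    for t in threads:
        t.join()
    return batches


if __name__ == "__main__":
    for batch in run_pipeline(range(10)):
        print(batch)
```

Batch contents depend on thread timing, so the printed grouping varies between runs; every task still appears in exactly one batch.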
    • Tasks variety
      ● Uniform vs different
      ● Idempotent vs Transactional
      ● Fat vs Thin
      ● Fire and forget vs Tasks with result
    • Uniform tasks
      [Diagram: Source, Distributor, and three Executors]
    • Differently typed tasks
      [Diagram: Source, Distributor, two Executors of type A, and two Executors of type B]
    • Use idempotent tasks!
      ● This means that a task executed twice does no harm (usually the second call is a no-op)
      ● This makes batching a lot easier
      ● This makes the task source/coordinator a lot simpler, especially in the distributed case
      ● Making a task idempotent is usually quite easy
      ● This gives you quite strong guarantees
      ● A transaction spanning two transactional resources is not guaranteed
    • Regular model
      We need to perform a task to move $X from A to B
      1. Take the task and remove it from the queue
      2. Start a database transaction
      3. Perform the move
      4. Commit the database transaction
      If anything goes wrong between (1) and (4), your move is lost
    • Transactional model
      We need to perform a task to move $X from A to B
      1. Take the task
      2. Start a database transaction
      3. Perform the move
      4. Commit the database transaction
      5. Ack the task
      If anything goes wrong between (4) and (5), your move is duplicated
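The failure windows of the regular and transactional models can be reproduced with a small simulation. This is a sketch under simplifying assumptions: a plain list stands in for the queue, another list stands in for the committed database state, and `Crash`, `regular_model`, `transactional_model`, and `run` are hypothetical names introduced only for illustration.

```python
class Crash(Exception):
    """Simulated executor failure."""


def regular_model(pending, ledger, crash=False):
    task = pending.pop(0)            # 1. take the task and remove it from the queue
    if crash:
        raise Crash("failed before commit")
    ledger.append(task)              # 2-4. transaction performs the move and commits


def transactional_model(pending, ledger, crash=False):
    task = pending[0]                # 1. take the task; it stays queued until acked
    ledger.append(task)              # 2-4. transaction performs the move and commits
    if crash:
        raise Crash("failed between commit and ack")
    pending.pop(0)                   # 5. ack the task


def run(model, crash_first_attempt):
    """Process one task, optionally crashing on the first attempt."""
    pending = [("T1", "A", "B", 10)]
    ledger = []
    attempt = 0
    while pending:
        try:
            model(pending, ledger, crash=(crash_first_attempt and attempt == 0))
        except Crash:
            pass                     # the executor restarts and polls the queue again
        attempt += 1
    return ledger


if __name__ == "__main__":
    print(len(run(regular_model, True)))        # 0: the move is lost
    print(len(run(transactional_model, True)))  # 2: the move is duplicated
```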
    • Idempotent model
      We need to perform task T to move $X from A to B
      1. Take the task
      2. Start a database transaction
      3. Create a move with id T, or skip if the move already exists
      4. Commit the database transaction
      5. Ack the task
      If anything goes wrong between (4) and (5), the task will be tried once more and do no harm
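Step 3 of the idempotent model can be sketched with a primary key on the task id. This example assumes SQLite and a hypothetical two-table schema (`accounts`, `moves`); neither is taken from the presentation, which does not name a database. The insert into `moves` and the balance updates share one transaction, so a retried task finds its id already recorded and skips the updates.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    db.execute(
        "CREATE TABLE moves (task_id TEXT PRIMARY KEY, src TEXT, dst TEXT, amount INTEGER)"
    )
    db.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
    db.commit()
    return db


def apply_move(db, task_id, src, dst, amount):
    """Apply the move once; return False if task_id was already processed."""
    with db:  # commits on success, rolls back if an exception escapes
        cur = db.execute(
            "INSERT OR IGNORE INTO moves VALUES (?, ?, ?, ?)",
            (task_id, src, dst, amount),
        )
        if cur.rowcount == 0:        # the row already existed, so skip the move
            return False
        db.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        db.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        return True


def balances(db):
    return dict(db.execute("SELECT name, balance FROM accounts"))


if __name__ == "__main__":
    db = make_db()
    print(apply_move(db, "T", "A", "B", 10))  # True: first delivery performs the move
    print(apply_move(db, "T", "A", "B", 10))  # False: redelivery is a no-op
    print(balances(db))                       # {'A': 90, 'B': 10}
```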
    • Dealing with fat tasks
      ● Most queues do not tolerate fat tasks
      ● Move the fat to storage
      ● Don't forget to GC your fat
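One way to read "move the fat to storage" is to keep the large payload outside the queue and enqueue only a reference to it. The sketch below assumes an in-memory dictionary as the storage and a list as the queue; `BlobStore`, `submit`, and `execute` are hypothetical names. The `gc` method covers payloads whose tasks never completed.

```python
import time
import uuid


class BlobStore:
    """Hypothetical payload storage; a real deployment might use a file or object store."""

    def __init__(self):
        self._blobs = {}  # key -> (created_at, payload)

    def put(self, payload, now=None):
        key = uuid.uuid4().hex
        self._blobs[key] = (time.time() if now is None else now, payload)
        return key

    def get(self, key):
        return self._blobs[key][1]

    def delete(self, key):
        self._blobs.pop(key, None)

    def gc(self, max_age, now=None):
        """Delete payloads older than max_age seconds; return how many were removed."""
        now = time.time() if now is None else now
        stale = [k for k, (created, _) in self._blobs.items() if now - created > max_age]
        for key in stale:
            del self._blobs[key]
        return len(stale)

    def __len__(self):
        return len(self._blobs)


def submit(store, task_queue, payload):
    """Store the fat payload and enqueue a thin task that references it."""
    task_queue.append({"payload_key": store.put(payload)})


def execute(store, task_queue):
    """Run one task: fetch the payload, do the work, then free the payload."""
    task = task_queue.pop(0)
    payload = store.get(task["payload_key"])
    result = len(payload)            # placeholder for the real work
    store.delete(task["payload_key"])
    return result
```

Deleting the payload after execution handles the normal path; periodic `gc` calls handle payloads left behind by failed or abandoned tasks.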
    • What results can a task generate?
      ● Task finished indicator
      ● OK / ERROR outcome
      ● Various task statistics to be stored / aggregated (e.g. time taken / resources used)
      ● Business-level task result
      This means that a business-level "fire and forget" task is usually not "fire and forget" at the operations level
    • Task tail problem
      [Chart: axes labeled speed and time]
    • Task tail problem
      80% of tasks complete in 20% of the time :)
      ● Let an average task take 10 seconds
      ● Let a slow task take 100 seconds
      ● Let 1% of tasks be slow
      ● Then, with 63% probability, at least one of the last 100 tasks is slow
      ● In that case the last 100 tasks take 100 seconds, no matter the number of executors
      ● Even with an executor for each task, the whole set takes 100 seconds
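The 63% figure follows from treating each task as independently slow with probability 1%: the chance that none of 100 tasks is slow is 0.99^100, so the chance of at least one slow task is 1 - 0.99^100. A short check, with hypothetical function names and the independence assumption made explicit:

```python
def p_at_least_one_slow(n_tasks, p_slow):
    """Probability that at least one of n_tasks is slow, assuming independence."""
    return 1.0 - (1.0 - p_slow) ** n_tasks


def makespan_with_executor_per_task(durations):
    """With one executor per task, everything finishes when the slowest task does."""
    return max(durations)


if __name__ == "__main__":
    print(round(p_at_least_one_slow(100, 0.01), 3))           # 0.634
    print(makespan_with_executor_per_task([10] * 99 + [100]))  # 100
```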
    • Task tail problem solutions
      ● Start slow tasks early
      ● If slowness is variable, start the slow task multiple times in parallel
      ● Cut your tail
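Starting a task multiple times in parallel can be sketched with the standard `concurrent.futures` module: submit several copies and use whichever finishes first. This is an illustration only, assuming thread-based executors in a single process; `run_speculatively` and `flaky_task` are hypothetical names. Since every copy may run to completion, the task should be idempotent.

```python
import concurrent.futures as cf
import time


def run_speculatively(task, copies, pool):
    """Run `copies` instances of task(copy_index); return the first finished result."""
    futures = [pool.submit(task, i) for i in range(copies)]
    done, pending = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
    for future in pending:
        future.cancel()  # stops only copies that have not started; running ones continue
    return next(iter(done)).result()


def flaky_task(copy_index):
    """Hypothetical task whose first copy happens to be slow."""
    time.sleep(0.3 if copy_index == 0 else 0.01)
    return copy_index


if __name__ == "__main__":
    pool = cf.ThreadPoolExecutor(max_workers=3)
    winner = run_speculatively(flaky_task, copies=3, pool=pool)
    print(winner != 0)  # True: a fast copy finished before the slow one
    pool.shutdown(wait=False)
```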
    • ActiveMQ (JMS)
      ● Easy for "fire and forget" tasks
      ● Problematic with a lot of "in-flight" tasks
      ● A pain to configure (does not work well out of the box)
      ● Complex to monitor/control a single task
    • Zookeeper as distributor for a dynamic task source
      ● Done with a small custom module
      ● Push task source / pull task executor design
      ● Task priorities / timeouts / reprocessing logic are easy with a custom module
      ● Task monitoring / reporting is done by the task source
    • Custom task execution solution
      ● Pluggable task sources, supporting JMS / RDBMS / plain file sources for different deployments
      ● Complex task reprocessing scheme with timeouts & customizable affinity
      ● RDBMS source provides per-task execution information
      ● "Killer" task detection
      ● Configurable batching for high-speed processing
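The presentation does not describe how its "killer" task detection works. One plausible reading, sketched below purely as an assumption, is to count how many times a task has crashed its executor and to quarantine it once a threshold is reached, so it stops taking executors down on every retry. `KillerTaskDetector`, its methods, and the default threshold of 3 are all hypothetical.

```python
from collections import Counter


class KillerTaskDetector:
    """Hypothetical sketch: quarantine tasks that repeatedly crash their executor."""

    def __init__(self, max_crashes=3):
        self.max_crashes = max_crashes
        self._crashes = Counter()    # task_id -> number of crashed attempts
        self.quarantined = set()

    def should_run(self, task_id):
        """Return False for tasks that have been flagged as killers."""
        return task_id not in self.quarantined

    def record_crash(self, task_id):
        """Call when an executor died while running task_id."""
        self._crashes[task_id] += 1
        if self._crashes[task_id] >= self.max_crashes:
            self.quarantined.add(task_id)

    def record_success(self, task_id):
        """Call when task_id finished; earlier crashes were probably not its fault."""
        self._crashes.pop(task_id, None)
```

A distributor could consult `should_run` before reprocessing a timed-out task and route quarantined tasks to manual inspection instead.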
    • Q&A
      Questions are welcome!
      ● Voice (red ale preferred ☺)
      ● tivv00@gmail.com
      ● @tivv00
      ● tinyurl.com/LinkedTIVV