Dataflow: the concurrency/parallelism architecture you need

An informal investigation/tutorial on the dataflow architecture for Java and Groovy as presented at DevoxxUK 2014.

Code presented is on GitHub: https://github.com/russel/MeanStdDev.git

Dataflow: the concurrency/parallelism architecture you need Presentation Transcript

  • 1. Dataflow: The Concurrency/Parallelism Architecture You Need. Russel Winder, @russel_winder, http://www.russel.org.uk, russel@winder.org.uk. #devoxxuk #dataflowrules. Copyright © 2014 Russel Winder.
  • 2. What is Dataflow?
  • 3. What are (in computing†):
    Concurrency: structuring the solution and code such that multiple parts may execute independently and possibly even at the same time.
    Parallelism: executing multiple parts of a system at the same time on different processors so as to get things working faster.
    † In natural language these words have very different meanings.
  • 4. What is Dataflow? An architecture comprising channels that allow data to flow from one operator to another, where each operator has multiple input channels and multiple output channels, and executes code only in response to the arrival of data on the inputs.
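
The definition on slide 4 can be sketched in plain Java, using a `BlockingQueue` as a stand-in for a channel and a thread as a stand-in for an operator. This is only an illustration under assumptions, not code from the talk: the class name `DataflowSketch`, the `adder` operator, and the `END` end-of-stream marker are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class DataflowSketch {
    // Hypothetical end-of-stream marker; assumes Integer.MIN_VALUE never occurs as data.
    private static final Integer END = Integer.MIN_VALUE;

    /** Operator with two input channels and one output channel: emits a + b per pair of arrivals. */
    static Thread adder(BlockingQueue<Integer> a, BlockingQueue<Integer> b, BlockingQueue<Integer> out) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Integer x = a.take(); // blocks until data arrives on input a
                    Integer y = b.take(); // blocks until data arrives on input b
                    if (x.equals(END) || y.equals(END)) {
                        out.put(END);     // propagate end-of-stream downstream
                        return;
                    }
                    out.put(x + y);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    /** Builds the one-operator network, feeds both inputs, and collects the output. */
    static List<Integer> run(int[] xs, int[] ys) {
        BlockingQueue<Integer> a = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> b = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> out = new LinkedBlockingQueue<>();
        Thread op = adder(a, b, out);
        try {
            for (int x : xs) a.put(x);
            a.put(END);
            for (int y : ys) b.put(y);
            b.put(END);
            List<Integer> results = new ArrayList<>();
            for (Integer v = out.take(); !v.equals(END); v = out.take()) {
                results.add(v);
            }
            op.join();
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new int[] {1, 2, 3}, new int[] {10, 20, 30}));
    }
}
```

The operator never polls or shares variables with its neighbours: it runs only when `take()` delivers a value, which is the "executes code only in response to the arrival of data" part of the definition.
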
  • 5. Historically: dataflow computers had values flowing between operators that calculate new values to pass to other operators. Dataflow hardware didn't take off, but the architecture works at various scales. Reference: J R Gurd, C C Kirkham, I Watson, The Manchester Prototype Dataflow Computer, CACM 28(1), 1985-01.
  • 6. Dataflow diagrams have been an integral part of the analysis and design of information systems since the 1970s. Reference: T de Marco, Structured Analysis and System Specification, Yourdon Press, NY, 1978.
  • 8. Dataflow and Functional: operators seem like they might be pure functions, but they are not necessarily so. Operators may have internal state. Operators may be referentially transparent, but they may not be. Operators may even have side effects.
  • 9. Dataflow is an event-based architecture.
  • 10. Dataflow systems are (possibly) reactive systems, which would make them exceedingly trendy even if the idea is very old.
  • 11. Dataflow systems have no† shared memory. † Or at least should have none.
  • 12. Diagram labels: operator, channel.
  • 13. Dataflow systems are message passing systems.
  • 14. Each operator must† be single threaded. † Or at least should be.
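
Slides 8 to 14 together describe an operator that may hold internal state yet needs no locks, because only its own thread touches that state and all communication is by message. A minimal Java sketch of that idea follows. It is an assumption-laden illustration rather than the talk's code: the name `StatefulOperatorSketch` is hypothetical, and the operator is told in advance how many values to expect, which a real network would signal differently.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class StatefulOperatorSketch {
    /** Sends each input to a running-mean operator and returns the means it emits. */
    static double[] runningMeans(double[] inputs) {
        BlockingQueue<Double> in = new LinkedBlockingQueue<>();
        BlockingQueue<Double> out = new LinkedBlockingQueue<>();
        final int count = inputs.length; // simplification: the operator knows the stream length

        Thread operator = new Thread(() -> {
            double sum = 0.0; // internal state, confined to this one thread, so no locking
            try {
                for (int seen = 1; seen <= count; seen++) {
                    sum += in.take();    // react to the arrival of a message
                    out.put(sum / seen); // emit the mean of everything seen so far
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        operator.start();

        try {
            for (double x : inputs) in.put(x);
            double[] means = new double[count];
            for (int i = 0; i < count; i++) means[i] = out.take();
            operator.join();
            return means;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runningMeans(new double[] {2.0, 4.0, 6.0})));
    }
}
```

The operator is not a pure function, since its output depends on everything it has already received, but the state never escapes the thread that owns it.
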
  • 15. Dataflow Frameworks:
    Scala: Future.
    Akka: dataflow variables, aka Promise; deprecated in favour of Async.
    Java: pre-8, Future; 8+, CompletableFuture, aka Promise.
  • 16. Architectural Issue: each of the aforementioned frameworks assumes that each operator creates a single value. Communication is by dataflow variables: each dataflow variable is a thread-safe, single-assignment variable.
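
To make the "dataflow variable" of slide 16 concrete, here is a small sketch using Java 8's `CompletableFuture`, which slide 15 names as Java's promise type. The class and method names are hypothetical. The sketch shows two things: a value that is computed only once both of its inputs have been bound, and a second `complete` call that has no effect.

```java
import java.util.concurrent.CompletableFuture;

final class DataflowVariableSketch {
    /** z is bound once both x and y have been bound, regardless of which thread binds them. */
    static int sumOfTwoVariables() {
        CompletableFuture<Integer> x = new CompletableFuture<>();
        CompletableFuture<Integer> y = new CompletableFuture<>();
        CompletableFuture<Integer> z = x.thenCombine(y, Integer::sum);

        // Bind the inputs from other threads; the order of binding does not matter.
        CompletableFuture.runAsync(() -> x.complete(10));
        CompletableFuture.runAsync(() -> y.complete(5));

        return z.join(); // blocks until z has a value
    }

    /** complete() returns false, and leaves the value unchanged, if the variable is already bound. */
    static boolean secondAssignmentIsIgnored() {
        CompletableFuture<Integer> v = new CompletableFuture<>();
        boolean first = v.complete(1);
        boolean second = v.complete(2);
        return first && !second && v.join() == 1;
    }

    public static void main(String[] args) {
        System.out.println(sumOfTwoVariables());
        System.out.println(secondAssignmentIsIgnored());
    }
}
```

Each variable here carries exactly one value, which is the architectural issue the slide raises: a stream of values needs a queue-like channel rather than a promise.
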
  • 19. GPars has dataflow variables (promises) and tasks, and so can do everything Akka and Java can offer. It also has DataflowQueue, and so can create real dataflow networks.
  • 20. One does like to code… doesn't one?
  • 21. We need a problem…
  • 22. A Problem: calculate the mean and standard deviation of a data sample. $\bar{x} = \frac{1}{n} \sum_{i=0}^{n} x_i$, $s = \sqrt{\frac{1}{n-1} \sum_{i=0}^{n} (x_i - \bar{x})^2}$
  • 23. Amend the Problem: $s = \sqrt{\frac{1}{n-1} \left( \left( \sum_{i=0}^{n} x_i^2 \right) - n \bar{x} \bar{x} \right)}$, $\bar{x} = \frac{1}{n} \sum_{i=0}^{n} x_i$
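
The amended formula matters for dataflow because the sum of the values and the sum of their squares no longer depend on each other, so they can be computed by independent operators. The sketch below is one possible Java rendering using `CompletableFuture`, taking n to be the number of samples. It is not the code from the talk's GitHub repository, and the name `MeanStdDevSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;

final class MeanStdDevSketch {
    /** Returns {mean, sample standard deviation}; requires at least two values. */
    static double[] meanStdDev(double[] xs) {
        final int n = xs.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two values");
        }

        // Two independent operators: neither needs the other's result.
        CompletableFuture<Double> sum =
            CompletableFuture.supplyAsync(() -> Arrays.stream(xs).sum());
        CompletableFuture<Double> sumOfSquares =
            CompletableFuture.supplyAsync(() -> Arrays.stream(xs).map(x -> x * x).sum());

        // The mean fires when the sum arrives.
        CompletableFuture<Double> mean = sum.thenApply(s -> s / n);

        // The standard deviation fires when both the sum of squares and the mean have arrived.
        CompletableFuture<Double> stdDev = sumOfSquares.thenCombine(
            mean, (sq, m) -> Math.sqrt((sq - n * m * m) / (n - 1)));

        return new double[] {mean.join(), stdDev.join()};
    }

    public static void main(String[] args) {
        double[] result = meanStdDev(new double[] {1.0, 2.0, 3.0, 4.0, 5.0});
        System.out.println("mean = " + result[0] + ", stddev = " + result[1]);
    }
}
```

With the original formula from slide 22, the squared-deviation sum could not start until the mean was known, which forces the two passes over the data to run one after the other.
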
  • 24. Code
  • 25. Code Example: switch to using an IDE for this.
  • 26. Summary
  • 27. Summary. Dataflow is an architecture: event-driven, single-threaded operators communicating by message passing using channels. Dataflow is an easement: synchronization is inherent in the model, and there is no shared memory, so all deadlocks are trivial.
  • 28. Dataflow is a way of harnessing concurrency and parallelism in easy-to-program ways.
  • 29. GPars is usable from Java as well as Groovy.
  • 30. Testing is really Groovy with Spock.
  • 31. Dataflow is an architecture of code you need to know.
  • 32. Q & A
  • 33. Dataflow: The Concurrency/Parallelism Architecture You Need. Russel Winder, @russel_winder, http://www.russel.org.uk, russel@winder.org.uk.