Dataflow: the concurrency/parallelism architecture you need

An informal investigation/tutorial on the dataflow architecture for Java and Groovy as presented at DevoxxUK 2014.

Code presented is on GitHub: https://github.com/russel/MeanStdDev.git

Published in: Technology


Transcript

  • 1. Dataflow: The Concurrency/Parallelism Architecture You Need. Russel Winder, @russel_winder, http://www.russel.org.uk, russel@winder.org.uk. #devoxxuk #dataflowrules. Copyright © 2014 Russel Winder.
  • 2. What is Dataflow?
  • 3. What are (in computing†): Concurrency: structuring solution and code such that multiple parts may execute independently and possibly even at the same time. Parallelism: executing multiple parts of a system at the same time on different processors so as to get things working faster. † In natural language these words have very different meanings.
  • 4. What is Dataflow? An architecture comprising channels allowing data to flow from one operator to another, where each operator has multiple input channels and multiple output channels, and executes code only in response to the arrival of data on the inputs.
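
The definition on slide 4 can be illustrated with plain `java.util.concurrent` types. This is only a minimal sketch, not code from the talk: it assumes a channel can be modelled as a `BlockingQueue` and an operator as a single thread that blocks until a value has arrived on each input. The names `AdderOperatorSketch`, `startAdder`, and `run` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: channels are BlockingQueues, and an operator is one
// thread that executes only when data has arrived on its input channels.
class AdderOperatorSketch {

    // Starts an operator with two input channels and one output channel.
    static Thread startAdder(BlockingQueue<Integer> left,
                             BlockingQueue<Integer> right,
                             BlockingQueue<Integer> out) {
        Thread operator = new Thread(() -> {
            try {
                while (true) {
                    int a = left.take();   // blocks until a value arrives
                    int b = right.take();  // blocks until a value arrives
                    out.put(a + b);        // emit the result downstream
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interruption stops the operator
            }
        });
        operator.setDaemon(true);
        operator.start();
        return operator;
    }

    // Feeds pairs of values through the operator and collects the outputs.
    static List<Integer> run(int[] lefts, int[] rights) {
        BlockingQueue<Integer> left = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> right = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> out = new LinkedBlockingQueue<>();
        Thread operator = startAdder(left, right, out);
        try {
            for (int i = 0; i < lefts.length; i++) {
                left.put(lefts[i]);
                right.put(rights[i]);
            }
            List<Integer> results = new ArrayList<>();
            for (int i = 0; i < lefts.length; i++) {
                results.add(out.take());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            operator.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new int[] {1, 2, 3}, new int[] {10, 20, 30})); // [11, 22, 33]
    }
}
```

The operator never polls or shares variables with its producers; all coordination happens through the blocking `take` and `put` calls on the channels.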
  • 5. Historically, dataflow computers: values flowing between operators that calculate new values to pass to other operators. Dataflow hardware didn't take off, but the architecture works at various scales. Reference: J R Gurd, C C Kirkham, I Watson, The Manchester Prototype Dataflow Computer, CACM 28(1), 1985-01.
  • 6. Dataflow diagrams have been an integral part of analysis and design of information systems since the 1970s. Reference: T DeMarco, Structured Analysis and System Specification, Yourdon Press, NY, 1978.
  • 8. Dataflow and Functional: Operators seem like they might be pure functions, but they are not necessarily so: operators may have internal state. Operators may be referentially transparent, but they may not be. Operators may even have side effects.
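
To make the point about internal state concrete, here is a small sketch (again an illustration, not code from the talk) of an operator that keeps a running total. The class name `RunningTotalSketch` and the use of `BlockingQueue` as the channel type are assumptions of this example. The same input value produces a different output each time, so the operator is not referentially transparent, yet its state is confined to one thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a stateful operator: it emits the running total of
// everything it has received so far.
class RunningTotalSketch {

    static List<Long> run(long... inputs) {
        BlockingQueue<Long> in = new LinkedBlockingQueue<>();
        BlockingQueue<Long> out = new LinkedBlockingQueue<>();

        Thread operator = new Thread(() -> {
            long total = 0; // internal state, touched only by this operator's thread
            try {
                while (true) {
                    total += in.take();
                    out.put(total);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interruption stops the operator
            }
        });
        operator.setDaemon(true);
        operator.start();

        try {
            for (long value : inputs) {
                in.put(value);
            }
            List<Long> results = new ArrayList<>();
            for (int i = 0; i < inputs.length; i++) {
                results.add(out.take());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            operator.interrupt();
        }
    }

    public static void main(String[] args) {
        // The input 2 arrives three times and yields three different outputs.
        System.out.println(run(2, 2, 2)); // [2, 4, 6]
    }
}
```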
  • 9. Dataflow is an event-based architecture.
  • 10. Dataflow systems are (possibly) reactive systems. Which would make them exceedingly trendy even if the idea is very old.
  • 11. Dataflow systems have no† shared memory. † Or at least should have none.
  • 12. Diagram labels: operator, channel.
  • 13. Dataflow systems are message passing systems.
  • 14. Each operator must† be single threaded. † Or at least should be.
  • 15. Dataflow Frameworks
      Scala:
        – Future
      Akka:
        – Dataflow variables, aka Promise
        – Deprecated in favour of Async
      Java:
        – Pre-8, Future
        – 8+, CompletableFuture, aka Promise
  • 16. Architectural Issue: Each of the aforementioned frameworks assumes that each operator creates a single value. Communication is by dataflow variables: each dataflow variable is a thread-safe single-assignment variable.
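
Slide 15 names `CompletableFuture` as the Java 8 promise, and slide 16 describes a dataflow variable as a thread-safe single-assignment variable. The sketch below shows both properties using only the standard library; the class and method names are hypothetical and the example is not taken from the talk.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: a CompletableFuture used as a dataflow variable.
class DataflowVariableSketch {

    // z depends on x and y; it is computed only once both have been bound,
    // regardless of which thread binds them or in which order.
    static int combined() {
        CompletableFuture<Integer> x = new CompletableFuture<>();
        CompletableFuture<Integer> y = new CompletableFuture<>();
        CompletableFuture<Integer> z = x.thenCombine(y, Integer::sum);

        new Thread(() -> x.complete(10)).start();
        new Thread(() -> y.complete(5)).start();

        return z.join(); // blocks until z has a value
    }

    // Single assignment: a second complete() is rejected and returns false.
    static boolean secondAssignmentAccepted() {
        CompletableFuture<Integer> x = new CompletableFuture<>();
        x.complete(10);
        return x.complete(99);
    }

    public static void main(String[] args) {
        System.out.println(combined());                 // 15
        System.out.println(secondAssignmentAccepted()); // false
    }
}
```

Each future carries exactly one value, which is the limitation slide 16 points at: a stream of values needs a queue-like channel rather than a single variable.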
  • 19. GPars… Has dataflow variables (promises) and tasks and so can do everything Akka and Java can offer. Has DataflowQueue, and so can create real dataflow networks.
  • 20. One does like to code… doesn't one?
  • 21. We need a problem…
  • 22. A Problem: Calculate the mean and standard deviation of a data sample. $\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i$, $s = \sqrt{\frac{1}{n-1} \sum_{i=0}^{n-1} (x_i - \bar{x})^2}$
  • 23. Amend the Problem: $s = \sqrt{\frac{1}{n-1} \left( \left( \sum_{i=0}^{n-1} x_i^2 \right) - n \bar{x}^2 \right)}$, $\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i$
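
The amended formula matters because the sum and the sum of squares can be accumulated independently in a single pass, which suits a dataflow network. The following is a hedged sketch of that idea in plain Java, not the code from the GitHub repository mentioned above. It assumes the sample size is known in advance and at least 2, models channels as `BlockingQueue`s, and uses hypothetical names (`MeanStdDevSketch`, `summer`, `meanAndStdDev`).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch: two accumulating operators fed by separate channels,
// with their single results combined as dataflow variables.
class MeanStdDevSketch {

    // An operator that consumes n values from its input channel and yields
    // the sum of f(x) over them as a single-assignment result.
    static CompletableFuture<Double> summer(BlockingQueue<Double> in, int n,
                                            DoubleUnaryOperator f) {
        return CompletableFuture.supplyAsync(() -> {
            double total = 0.0;
            try {
                for (int i = 0; i < n; i++) {
                    total += f.applyAsDouble(in.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return total;
        });
    }

    // Returns {mean, standard deviation}; assumes xs.length >= 2.
    static double[] meanAndStdDev(double[] xs) {
        int n = xs.length;
        BlockingQueue<Double> toSum = new LinkedBlockingQueue<>();
        BlockingQueue<Double> toSumOfSquares = new LinkedBlockingQueue<>();

        CompletableFuture<Double> sum = summer(toSum, n, x -> x);
        CompletableFuture<Double> sumOfSquares = summer(toSumOfSquares, n, x -> x * x);

        // Fan each sample out to both operators.
        for (double x : xs) {
            toSum.add(x);
            toSumOfSquares.add(x);
        }

        // The final operator fires once both inputs have arrived.
        return sum.thenCombine(sumOfSquares, (s, sq) -> {
            double mean = s / n;
            double stdDev = Math.sqrt((sq - n * mean * mean) / (n - 1));
            return new double[] {mean, stdDev};
        }).join();
    }

    public static void main(String[] args) {
        double[] result = meanAndStdDev(new double[] {2, 4, 4, 4, 5, 5, 7, 9});
        System.out.println("mean = " + result[0] + ", s = " + result[1]);
    }
}
```

One caveat on the design: the single-pass form subtracts two potentially large, nearly equal quantities, so it can lose precision on data whose mean is large relative to its spread. The two-pass form on slide 22 is numerically safer when that is a concern.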
  • 24. Code
  • 25. Code Example: Switch to using an IDE for this.
  • 26. Summary
  • 27. Summary: Dataflow is an architecture: event-driven, single-threaded operators communicating by message passing using channels. Dataflow is an easement: synchronization is inherent in the model, and there is no shared memory, so all deadlocks are trivial.
  • 28. Dataflow is a way of harnessing concurrency and parallelism in easy-to-program ways.
  • 29. GPars is usable from Java as well as Groovy.
  • 30. Testing is really Groovy with Spock.
  • 31. Dataflow is an architecture of code you need to know.
  • 32. Q & A
  • 33. Dataflow: The Concurrency/Parallelism Architecture You Need. Russel Winder, @russel_winder, http://www.russel.org.uk, russel@winder.org.uk.