Using Flow-based programming
... to write Tools and Workflows
for Scientific Computing
Go Stockholm Conference Oct 6, 2018
Samuel Lampa | @saml (slack) | @smllmp (twitter)
Ex - Dept. of Pharm. Biosci, Uppsala University
Savantic AB | RIL Partner AB
About the speaker
● Name: Samuel Lampa
● PhD in Pharm. Bioinformatics from UU (since 1 week)
● Researched: Workflow tools based on flow-based programming, used to build
predictive models for drug discovery
● Previously: HPC sysadmin & developer, Web developer, etc.,
M.Sc. in molecular biotechnology engineering
● Next week: R&D Engineer at Savantic AB
● Also: AfricArxiv and RIL Partner AB
Read more about my research →
Flow-based … what?
Flow-based programming (FBP)
Note: Doesn’t need to
be done visually though!
FBP in brief
● Black box, asynchronously running processes
● Data exchange across predefined connections
between named ports (with bounded buffers) by
message passing only
● Connections specified separately from processes
● Processes can be reconnected endlessly to form
different applications without having to be changed
The Central Dogma of Biology …
… from DNA to RNA to Proteins
Image credits: Nicolle Rager, National Science Foundation. License: Public domain
(Figure labels: Amino acids, Cell nucleus)
FBP vs Dataflow
“FBP is a particular form of dataflow
programming based on bounded buffers,
information packets with defined lifetimes,
named ports, and separate definition
of connections”
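The "bounded buffers" part of this definition maps directly onto buffered Go channels. The following is a minimal sketch, not code from the talk: `trySend` is a hypothetical helper that uses a non-blocking send to show when a connection's buffer is full, which is the point where a normal (blocking) send would make the upstream process wait.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether the packet
// fitted into the connection's buffer.
func trySend(conn chan<- string, ip string) bool {
	select {
	case conn <- ip:
		return true
	default:
		return false // buffer full: a blocking send would wait here
	}
}

func main() {
	conn := make(chan string, 2) // a connection with room for 2 packets

	fmt.Println(trySend(conn, "ip1")) // true
	fmt.Println(trySend(conn, "ip2")) // true
	fmt.Println(trySend(conn, "ip3")) // false, the buffer is full

	<-conn // the downstream process consumes one packet ...

	fmt.Println(trySend(conn, "ip3")) // true, ... which frees one slot
}
```

In a real network the sender would simply block instead of dropping the packet, so a slow consumer automatically slows down its producers.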
Benefits abound
● Change of connection wiring
without rewriting components
● Inherently concurrent - suited
for the multi-core CPU world
● Testing, monitoring and logging are very
easy: just plug in a mock, logging
or debugging component.
● Etc etc ...
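The "just plug in a logging component" point can be sketched in plain Go. This is an illustration under assumptions, not code from the talk: `source`, `upper`, `logTap` and `runNetwork` are hypothetical names. The tap sits between two components and neither of them has to change.

```go
package main

import (
	"fmt"
	"strings"
)

// source emits the given strings and then closes its out-channel.
func source(items ...string) <-chan string {
	out := make(chan string, 16)
	go func() {
		defer close(out)
		for _, s := range items {
			out <- s
		}
	}()
	return out
}

// upper is a hypothetical processing component that upper-cases each string.
func upper(in <-chan string) <-chan string {
	out := make(chan string, 16)
	go func() {
		defer close(out)
		for s := range in {
			out <- strings.ToUpper(s)
		}
	}()
	return out
}

// logTap is a hypothetical logging component: it reports each packet to
// logf and passes it on unchanged.
func logTap(in <-chan string, logf func(string)) <-chan string {
	out := make(chan string, 16)
	go func() {
		defer close(out)
		for s := range in {
			logf(s)
			out <- s
		}
	}()
	return out
}

// runNetwork wires source -> logTap -> upper and drains the result.
func runNetwork(items ...string) (results, logged []string) {
	tapped := logTap(source(items...), func(s string) { logged = append(logged, s) })
	for s := range upper(tapped) {
		results = append(results, s)
	}
	return results, logged
}

func main() {
	results, logged := runNetwork("acgt", "ttga")
	fmt.Println(results) // [ACGT TTGA]
	fmt.Println(logged)  // [acgt ttga]
}
```

Removing the tap again is a one-line change in the wiring (`upper(source(...))`), which is the property the slide refers to.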
Invented by J. Paul Morrison at IBM in the late 1960s
FBP in Go: GoFlow
by Vladimir Sibirov @sibiroff (twitter)
FBP in plain Go
(almost) without frameworks?
Generator functions
Adapted from Rob Pike’s slides:
package main
import "fmt"

func main() {
    c := generateInts(10) // Call function to get a channel
    for v := range c {    // … and loop over it
        fmt.Println(v) // e.g. print each value
    }
}

func generateInts(max int) <-chan int { // Return a channel of ints
    c := make(chan int)
    go func() { // Init go-routine inside function
        defer close(c)
        for i := 0; i <= max; i++ {
            c <- i
        }
    }()
    return c // Return the channel
}
Chaining generator functions 1/2
func reverse(cin chan string) chan string {
    cout := make(chan string)
    go func() {
        defer close(cout)
        for s := range cin { // Loop over in-chan
            cout <- reverseString(s) // Send on out-chan (reverseString: a plain string helper)
        }
    }()
    return cout
}
Chaining generator functions 2/2
// Chain the generator functions
dna := generateDNA() // Generator func of strings
rev := reverse(dna)
compl := complement(rev)
// Drive the chain by reading from last channel
for dnaString := range compl {
    fmt.Println(dnaString) // e.g. print each result
}
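Put together, the chained generator functions could look like the following complete program. It is a sketch under assumptions: the example sequences, the helpers `reverseString` and `complementString`, and the wrapper `revComplAll` are made up for illustration and are not taken from the slides.

```go
package main

import "fmt"

// generateDNA is a hypothetical generator emitting a few example sequences.
func generateDNA() chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, s := range []string{"AAAGCCC", "GTGGGGG", "ACCTGTT"} {
			out <- s
		}
	}()
	return out
}

// reverseString reverses an ASCII string.
func reverseString(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

// complementString swaps A<->T and C<->G and leaves other bytes unchanged.
func complementString(s string) string {
	pairs := map[byte]byte{'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
	b := []byte(s)
	for i, c := range b {
		if p, ok := pairs[c]; ok {
			b[i] = p
		}
	}
	return string(b)
}

// apply wraps a string function into a channel-to-channel stage.
func apply(cin chan string, f func(string) string) chan string {
	cout := make(chan string)
	go func() {
		defer close(cout)
		for s := range cin {
			cout <- f(s)
		}
	}()
	return cout
}

func reverse(cin chan string) chan string    { return apply(cin, reverseString) }
func complement(cin chan string) chan string { return apply(cin, complementString) }

// revComplAll drives the chain by reading from the last channel.
func revComplAll() []string {
	var res []string
	for s := range complement(reverse(generateDNA())) {
		res = append(res, s)
	}
	return res
}

func main() {
	fmt.Println(revComplAll()) // [GGGCTTT CCCCCAC AACAGGT]
}
```

Nothing runs until the final `range` loop starts receiving, since every stage blocks on its unbuffered out-channel.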
Problems with the generator approach
● Inputs not named in connection code (no keyword arguments)
● Multiple return values depend on positional arguments:
leftPart, rightPart := splitInHalves(chanOfStrings)
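To make the second problem concrete, here is a minimal sketch of what such a `splitInHalves` might look like. The body, the `feed` and `collect` helpers and the buffer size are assumptions for illustration. The two outputs have the same type, so swapping the names on the left-hand side compiles without any warning.

```go
package main

import "fmt"

// feed is a hypothetical helper that emits the given strings on a channel.
func feed(items ...string) <-chan string {
	out := make(chan string, 16)
	go func() {
		defer close(out)
		for _, s := range items {
			out <- s
		}
	}()
	return out
}

// splitInHalves returns two channels; which one is "left" is only defined
// by its position in the return list.
func splitInHalves(in <-chan string) (<-chan string, <-chan string) {
	// Buffered so that the caller may drain one output before the other
	// (for up to 16 items in this sketch).
	left := make(chan string, 16)
	right := make(chan string, 16)
	go func() {
		defer close(left)
		defer close(right)
		for s := range in {
			mid := len(s) / 2
			left <- s[:mid]
			right <- s[mid:]
		}
	}()
	return left, right
}

// collect drains a channel into a slice.
func collect(ch <-chan string) []string {
	var res []string
	for s := range ch {
		res = append(res, s)
	}
	return res
}

func main() {
	leftPart, rightPart := splitInHalves(feed("AAAATTTT", "CCGG"))
	fmt.Println(collect(leftPart), collect(rightPart)) // [AAAA CC] [TTTT GG]

	// Swapped by mistake: compiles fine, but the names are now wrong.
	rightPart, leftPart = splitInHalves(feed("AAAATTTT", "CCGG"))
	fmt.Println(collect(leftPart), collect(rightPart)) // [TTTT GG] [AAAA CC]
}
```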
Could we emulate named ports?
package main
import "fmt"

type P struct {
    in  chan string // Channels as struct fields, to act as “named ports”
    out chan string
}

func NewP() *P { // Initialize a new component
    return &P{
        in:  make(chan string, 16),
        out: make(chan string, 16),
    }
}

func (p *P) Run() {
    defer close(p.out)
    for s := range p.in { // Refer to struct fields when reading ...
        p.out <- s // ... and writing
    }
}
Could we emulate named ports?
func main() {
    p1 := NewP()
    p2 := NewP()
    p2.in = p1.out // Connect dependencies here, by assigning to same chan
    go p1.Run()
    go p2.Run()
    go func() { // Feed the input of the network
        defer close(p1.in)
        for i := 0; i <= 10; i++ {
            p1.in <- "Hej"
        }
    }()
    for s := range p2.out { // Drive the chain from the main go-routine
        fmt.Println(s)
    }
}
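The struct-field approach also gives names to multiple outputs. The sketch below is an assumption-laden illustration, not code from the talk: `Splitter`, `Sink` and `runSplitNetwork` are hypothetical. Each output is wired by its port name, so the connection code documents itself.

```go
package main

import (
	"fmt"
	"sync"
)

// Splitter is a hypothetical component with one in-port and two named out-ports.
type Splitter struct {
	In       chan string
	OutLeft  chan string
	OutRight chan string
}

func NewSplitter() *Splitter {
	return &Splitter{
		In:       make(chan string, 16),
		OutLeft:  make(chan string, 16),
		OutRight: make(chan string, 16),
	}
}

func (p *Splitter) Run() {
	defer close(p.OutLeft)
	defer close(p.OutRight)
	for s := range p.In {
		mid := len(s) / 2
		p.OutLeft <- s[:mid]
		p.OutRight <- s[mid:]
	}
}

// Sink is a hypothetical end component that stores what it receives.
type Sink struct {
	In    chan string
	Items []string
}

func NewSink() *Sink { return &Sink{In: make(chan string, 16)} }

func (p *Sink) Run() {
	for s := range p.In {
		p.Items = append(p.Items, s)
	}
}

// runSplitNetwork wires the network, runs it and returns what each sink got.
func runSplitNetwork(inputs ...string) (left, right []string) {
	split, leftSink, rightSink := NewSplitter(), NewSink(), NewSink()

	// Connections are specified separately from the components, by port name.
	leftSink.In = split.OutLeft
	rightSink.In = split.OutRight

	var wg sync.WaitGroup
	for _, run := range []func(){split.Run, leftSink.Run, rightSink.Run} {
		wg.Add(1)
		go func(run func()) {
			defer wg.Done()
			run()
		}(run)
	}
	for _, s := range inputs {
		split.In <- s
	}
	close(split.In)
	wg.Wait()
	return leftSink.Items, rightSink.Items
}

func main() {
	left, right := runSplitNetwork("AAAATTTT", "CCGG")
	fmt.Println(left, right) // [AAAA CC] [TTTT GG]
}
```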
Add almost no additional code, and get:
Real-world use of FlowBase
● RDF → (Semantic) MediaWiki XML
● Import via MediaWiki XML import
● Code:
● Paper:
Connecting dependencies with FlowBase
ttlFileRead.OutTriple = aggregator.In
aggregator.Out = indexCreator.In
indexCreator.Out = indexFanOut.In
indexFanOut.Out["serialize"] = indexToAggr.In
indexFanOut.Out["conv"] = triplesToWikiConverter.InIndex
indexToAggr.Out = triplesToWikiConverter.InAggregate
triplesToWikiConverter.OutPage = xmlCreator.InWikiPage
xmlCreator.OutTemplates = templateWriter.In
xmlCreator.OutProperties = propertyWriter.In
xmlCreator.OutPages = pageWriter.In
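The `Out["serialize"]` and `Out["conv"]` lines suggest a fan-out component whose out-ports live in a map keyed by port name. How FlowBase implements this is not shown on the slide, so the following is only a guess at the idea: `FanOut`, `Sink` and `runFanOut` are hypothetical, and every incoming item is copied to all connected out-ports.

```go
package main

import (
	"fmt"
	"sync"
)

// FanOut is a hypothetical component that copies each incoming string to
// all of its named out-ports.
type FanOut struct {
	In  chan string
	Out map[string]chan string
}

func NewFanOut() *FanOut {
	return &FanOut{In: make(chan string, 16), Out: make(map[string]chan string)}
}

func (p *FanOut) Run() {
	for s := range p.In {
		for _, out := range p.Out {
			out <- s
		}
	}
	for _, out := range p.Out {
		close(out)
	}
}

// Sink is a hypothetical end component that stores what it receives.
type Sink struct {
	In    chan string
	Items []string
}

func NewSink() *Sink { return &Sink{In: make(chan string, 16)} }

func (p *Sink) Run() {
	for s := range p.In {
		p.Items = append(p.Items, s)
	}
}

// runFanOut wires one fan-out to two sinks and returns what each received.
func runFanOut(inputs ...string) (serialized, converted []string) {
	fan, ser, conv := NewFanOut(), NewSink(), NewSink()

	// Connect by assignment, in the same style as the wiring listed above.
	fan.Out["serialize"] = ser.In
	fan.Out["conv"] = conv.In

	var wg sync.WaitGroup
	for _, run := range []func(){fan.Run, ser.Run, conv.Run} {
		wg.Add(1)
		go func(run func()) {
			defer wg.Done()
			run()
		}(run)
	}
	for _, s := range inputs {
		fan.In <- s
	}
	close(fan.In)
	wg.Wait()
	return ser.Items, conv.Items
}

func main() {
	serialized, converted := runFanOut("t1", "t2")
	fmt.Println(serialized, converted) // [t1 t2] [t1 t2]
}
```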
Taking it further: Port structs
(So far only used in SciPipe, not yet FlowBase)
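One way to picture port structs: instead of bare channels, each port is a small struct that owns or references a channel and offers methods for connecting. The sketch below is hypothetical and not SciPipe's implementation; only the `From` naming is loosely modelled on the `In(...).From(...)` style of connecting ports. A benefit of the extra type is that the network can check whether a port is connected before anything runs.

```go
package main

import "fmt"

// InPort is a hypothetical in-port that owns the channel of a connection.
type InPort struct {
	Name string
	ch   chan string
}

// OutPort is a hypothetical out-port that sends to a connected in-port.
type OutPort struct {
	Name   string
	remote *InPort
}

func NewInPort(name string) *InPort   { return &InPort{Name: name, ch: make(chan string, 16)} }
func NewOutPort(name string) *OutPort { return &OutPort{Name: name} }

// From connects this in-port to an upstream out-port.
func (in *InPort) From(out *OutPort) { out.remote = in }

// Chan exposes the receiving side of the in-port.
func (in *InPort) Chan() <-chan string { return in.ch }

// Connected reports whether the out-port has been wired to an in-port.
func (out *OutPort) Connected() bool { return out.remote != nil }

// Send passes a string to the connected in-port.
func (out *OutPort) Send(s string) { out.remote.ch <- s }

// Close signals that no more data will be sent.
func (out *OutPort) Close() { close(out.remote.ch) }

// runPortDemo sends items through one connection and collects them.
func runPortDemo(items ...string) []string {
	out := NewOutPort("dna")
	in := NewInPort("in")
	in.From(out)

	go func() {
		defer out.Close()
		for _, s := range items {
			out.Send(s)
		}
	}()

	var got []string
	for s := range in.Chan() {
		got = append(got, s)
	}
	return got
}

func main() {
	unwired := NewOutPort("unused")
	fmt.Println(unwired.Connected())    // false
	fmt.Println(runPortDemo("A", "CG")) // [A CG]
}
```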
Write Scientific Workflows in Go
● Define processes with shell command patterns
● Atomic writes, Restartable workflows, Caching
● Automatic file naming
● Audit logging
● Workflow graph plotting
● Intro & Docs:
● Preprint paper:
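Atomic writes and restartable workflows are commonly built on a write-then-rename pattern. The sketch below shows that general technique only; it is not SciPipe's implementation, and `writeAtomically`, `needsRun`, `atomicWriteDemo` and the `.tmp` suffix are assumptions. An output path exists only once the write has finished, so a restarted run can skip every task whose output is already there.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAtomically writes to a temporary path first and renames it to the
// final path only after the write succeeded.
func writeAtomically(path string, data []byte) error {
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// needsRun reports whether the final output is missing, i.e. whether the
// task that produces it still has to run.
func needsRun(path string) bool {
	_, err := os.Stat(path)
	return os.IsNotExist(err)
}

// atomicWriteDemo runs the pattern in a throw-away directory and reports
// the state before and after the write.
func atomicWriteDemo() (before, after, tmpLeft bool, err error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return false, false, false, err
	}
	defer os.RemoveAll(dir)

	out := filepath.Join(dir, "hello.txt")
	before = needsRun(out)
	if err = writeAtomically(out, []byte("Hello\n")); err != nil {
		return before, false, false, err
	}
	_, statErr := os.Stat(out + ".tmp")
	return before, needsRun(out), statErr == nil, nil
}

func main() {
	before, after, tmpLeft, err := atomicWriteDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println(before, after, tmpLeft) // true false false
}
```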
● Workflow
  ● Keeps track of dependency graph
● Process
  ● Added to workflows
  ● Long-running
  ● Typically one per operation
● Task
  ● Spawned by processes
  ● Executes just one shell command or custom Go function
  ● Typically one task spawned per operation on a set of input files
● Information Packet (IP)
  ● Most common data type passed between processes
File IP
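The relation between processes, tasks and IPs can be sketched as follows. This is an illustration under assumptions, not SciPipe code: `FileIP` and `runProcess` are hypothetical. A long-running process receives file IPs, spawns one task per IP, and a counting semaphore caps how many tasks run at once, similar in spirit to a "max concurrent tasks" setting.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// FileIP is a hypothetical information packet referring to a file by path.
type FileIP struct{ Path string }

// runProcess is a hypothetical long-running process: it spawns one task per
// incoming IP, with at most maxTasks tasks running at the same time.
func runProcess(in <-chan FileIP, maxTasks int, task func(FileIP) FileIP) []FileIP {
	sem := make(chan struct{}, maxTasks) // counting semaphore
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		outs []FileIP
	)
	for ip := range in {
		sem <- struct{}{} // wait for a free task slot
		wg.Add(1)
		go func(ip FileIP) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			res := task(ip)
			mu.Lock()
			outs = append(outs, res)
			mu.Unlock()
		}(ip)
	}
	wg.Wait()
	// Tasks finish in arbitrary order, so sort for a stable result.
	sort.Slice(outs, func(i, j int) bool { return outs[i].Path < outs[j].Path })
	return outs
}

func main() {
	in := make(chan FileIP, 3)
	for _, p := range []string{"a.txt", "b.txt", "c.txt"} {
		in <- FileIP{Path: p}
	}
	close(in)

	outs := runProcess(in, 2, func(ip FileIP) FileIP {
		// A real task would run a shell command or Go function here.
		return FileIP{Path: strings.TrimSuffix(ip.Path, ".txt") + ".rev.txt"}
	})
	fmt.Println(outs) // [{a.rev.txt} {b.rev.txt} {c.rev.txt}]
}
```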
“Hello World” in SciPipe
package main

import (
    // Import the SciPipe package, aliased to 'sp'
    sp "github.com/scipipe/scipipe"
)

func main() {
    // Init workflow with a name, and max concurrent tasks
    wf := sp.NewWorkflow("hello_world", 4)

    // Initialize processes and set output file paths
    hello := wf.NewProc("hello", "echo 'Hello ' > {o:out}")
    hello.SetOut("out", "hello.txt")
    world := wf.NewProc("world", "echo $(cat {i:in}) World >> {o:out}")
    world.SetOut("out", "{i:in|%.txt}_world.txt")

    // Connect network
    world.In("in").From(hello.Out("out"))

    // Run workflow
    wf.Run()
}
● Import SciPipe
● Set up any default variables
or data, handle flags etc
● Initiate workflow
● Create processes
● Define outputs and paths
● Connect outputs to inputs
(dependencies / data flow)
● Run the workflow
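The output path patterns deserve a closer look. As read from the examples, `{i:in}` stands for the path of the file arriving on in-port `in`, and the `|%.txt` modifier trims a trailing `.txt` from it. The sketch below mimics only that behaviour in plain Go: `expandOutPath` and its regular expression are hypothetical and are not how SciPipe itself parses patterns.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// placeholder matches {i:NAME} and {i:NAME|%SUFFIX}.
var placeholder = regexp.MustCompile(`\{i:(\w+)(?:\|%([^}]+))?\}`)

// expandOutPath is a hypothetical helper that fills in-port placeholders in
// an output path pattern, optionally trimming a suffix from the input path.
func expandOutPath(pattern string, inPaths map[string]string) string {
	return placeholder.ReplaceAllStringFunc(pattern, func(m string) string {
		parts := placeholder.FindStringSubmatch(m)
		path := inPaths[parts[1]]
		// parts[2] is empty when no "|%SUFFIX" modifier was given.
		return strings.TrimSuffix(path, parts[2])
	})
}

func main() {
	in := map[string]string{"in": "hello.txt"}
	fmt.Println(expandOutPath("{i:in|%.txt}_world.txt", in)) // hello_world.txt
	fmt.Println(expandOutPath("{i:in}.bak", in))             // hello.txt.bak
}
```

Deriving output names from input names in this way is what keeps the files of a long chain of processes easy to trace back by eye.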
Writing SciPipe workflows
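package main

import "github.com/scipipe/scipipe"

const dna = "AAAGCCCGTGGGGGACCTGTTC" // Example input (assumed; not shown on the slide)

func main() {
    wf := scipipe.NewWorkflow("DNA Base Complement Workflow", 4)
    makeDNA := wf.NewProc("Make DNA", "echo "+dna+" > {o:dna}")
    makeDNA.SetOut("dna", "dna.txt")
    complmt := wf.NewProc("Base Complement", "cat {i:in} | tr ATCG TAGC > {o:compl}")
    complmt.SetOut("compl", "{i:in|%.txt}.compl.txt")
    reverse := wf.NewProc("Reverse", "cat {i:in} | rev > {o:rev}")
    reverse.SetOut("rev", "{i:in|%.txt}.rev.txt")
    // Connect outputs to inputs (dependencies / data flow)
    complmt.In("in").From(makeDNA.Out("dna"))
    reverse.In("in").From(complmt.Out("compl"))
    wf.Run() // Run the workflow
}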
Running it
go run revcompl.go
Dependency graph plotting
Structured audit log
(Hierarchical JSON)
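"Hierarchical" here can be pictured as each output's audit record embedding the audit records of the files it was produced from. The sketch below illustrates that shape with `encoding/json`. The `AuditInfo` type and its field names are hypothetical and do not reproduce SciPipe's actual audit log format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AuditInfo is a hypothetical audit record: what was run to produce a file,
// plus the records of all upstream files, keyed by path.
type AuditInfo struct {
	Process  string                `json:"process"`
	Command  string                `json:"command"`
	Upstream map[string]*AuditInfo `json:"upstream,omitempty"`
}

// helloWorldAudit builds the record for a two-step hello/world workflow.
func helloWorldAudit() *AuditInfo {
	hello := &AuditInfo{Process: "hello", Command: "echo 'Hello ' > hello.txt"}
	return &AuditInfo{
		Process:  "world",
		Command:  "echo $(cat hello.txt) World >> hello_world.txt",
		Upstream: map[string]*AuditInfo{"hello.txt": hello},
	}
}

// toJSON renders a record as indented JSON.
func toJSON(a *AuditInfo) (string, error) {
	b, err := json.MarshalIndent(a, "", "  ")
	return string(b), err
}

// fromJSON parses a record back from JSON.
func fromJSON(s string) (*AuditInfo, error) {
	var a AuditInfo
	err := json.Unmarshal([]byte(s), &a)
	return &a, err
}

func main() {
	s, err := toJSON(helloWorldAudit())
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Because the full upstream history travels with every output file, a report generator only needs the audit record of the final file to reconstruct the whole chain.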
Turn Audit log into TeX/PDF report
TeX template by Jonathan Alvarsson @jonalv
Benefits of SciPipe - Thanks to Go + FBP
● Intuitive behaviour: Like conveyor belts & stations in a factory.
● Flexible: Combine command-line programs with Go components
● Custom file naming: Easy to manually browse output files
● Portable: Distribute as Go code or as compiled executable files
● Easy to debug: Use any Go debugging tools or even just println()
● Powerful audit logging: Stream outputs via UNIX FIFO files
● Efficient & Parallel: Fast code + Efficient use of multi-core CPU
More info at:
Thank you for your time!

More Related Content

What's hot

Migrating from drupal to plone with transmogrifier
Migrating from drupal to plone with transmogrifierMigrating from drupal to plone with transmogrifier
Migrating from drupal to plone with transmogrifier
Clayton Parker
Python Coroutines, Present and Future
Python Coroutines, Present and FuturePython Coroutines, Present and Future
Python Coroutines, Present and Future
Something about Golang
Something about GolangSomething about Golang
Something about Golang
Anton Arhipov
Laying Pipe with Transmogrifier
Laying Pipe with TransmogrifierLaying Pipe with Transmogrifier
Laying Pipe with Transmogrifier
Clayton Parker
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
Ian Ozsvald
Transmogrifier: Migrating to Plone with less pain
Transmogrifier: Migrating to Plone with less painTransmogrifier: Migrating to Plone with less pain
Transmogrifier: Migrating to Plone with less pain
Lennart Regebro
Don't do this
Don't do thisDon't do this
Don't do this
Richard Jones
05 pig user defined functions (udfs)
05 pig user defined functions (udfs)05 pig user defined functions (udfs)
05 pig user defined functions (udfs)
Subhas Kumar Ghosh
When RegEx is not enough
When RegEx is not enoughWhen RegEx is not enough
When RegEx is not enough
Nati Cohen
Geeks Anonymes - Le langage Go
Geeks Anonymes - Le langage GoGeeks Anonymes - Le langage Go
Geeks Anonymes - Le langage Go
Geeks Anonymes
Reversing the dropbox client on windows
Reversing the dropbox client on windowsReversing the dropbox client on windows
Reversing the dropbox client on windows
Naughty And Nice Bash Features
Naughty And Nice Bash FeaturesNaughty And Nice Bash Features
Naughty And Nice Bash Features
Nati Cohen
Dts x dicoding #2 memulai pemrograman kotlin
Dts x dicoding #2 memulai pemrograman kotlinDts x dicoding #2 memulai pemrograman kotlin
Dts x dicoding #2 memulai pemrograman kotlin
Ahmad Arif Faizin
Go Concurrency Basics
Go Concurrency Basics Go Concurrency Basics
Go Concurrency Basics
Go for the paranoid network programmer, 3rd edition
Go for the paranoid network programmer, 3rd editionGo for the paranoid network programmer, 3rd edition
Go for the paranoid network programmer, 3rd edition
Eleanor McHugh
Painless Data Storage with MongoDB & Go
Painless Data Storage with MongoDB & Go Painless Data Storage with MongoDB & Go
Painless Data Storage with MongoDB & Go
Steven Francia
Journeys with Transmogrifier and friends or How not to get stuck in the Plone...
Journeys with Transmogrifier and friends or How not to get stuck in the Plone...Journeys with Transmogrifier and friends or How not to get stuck in the Plone...
Journeys with Transmogrifier and friends or How not to get stuck in the Plone...
Daniel Jowett
Go Concurrency
Go ConcurrencyGo Concurrency
Go Concurrency
06 file processing
06 file processing06 file processing
06 file processing
Issay Meii
JavaOne 2015 - Having fun with Javassist
JavaOne 2015 - Having fun with JavassistJavaOne 2015 - Having fun with Javassist
JavaOne 2015 - Having fun with Javassist
Anton Arhipov

What's hot (20)

Migrating from drupal to plone with transmogrifier
Migrating from drupal to plone with transmogrifierMigrating from drupal to plone with transmogrifier
Migrating from drupal to plone with transmogrifier
Python Coroutines, Present and Future
Python Coroutines, Present and FuturePython Coroutines, Present and Future
Python Coroutines, Present and Future
Something about Golang
Something about GolangSomething about Golang
Something about Golang
Laying Pipe with Transmogrifier
Laying Pipe with TransmogrifierLaying Pipe with Transmogrifier
Laying Pipe with Transmogrifier
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
Transmogrifier: Migrating to Plone with less pain
Transmogrifier: Migrating to Plone with less painTransmogrifier: Migrating to Plone with less pain
Transmogrifier: Migrating to Plone with less pain
Don't do this
Don't do thisDon't do this
Don't do this
05 pig user defined functions (udfs)
05 pig user defined functions (udfs)05 pig user defined functions (udfs)
05 pig user defined functions (udfs)
When RegEx is not enough
When RegEx is not enoughWhen RegEx is not enough
When RegEx is not enough
Geeks Anonymes - Le langage Go
Geeks Anonymes - Le langage GoGeeks Anonymes - Le langage Go
Geeks Anonymes - Le langage Go
Reversing the dropbox client on windows
Reversing the dropbox client on windowsReversing the dropbox client on windows
Reversing the dropbox client on windows
Naughty And Nice Bash Features
Naughty And Nice Bash FeaturesNaughty And Nice Bash Features
Naughty And Nice Bash Features
Dts x dicoding #2 memulai pemrograman kotlin
Dts x dicoding #2 memulai pemrograman kotlinDts x dicoding #2 memulai pemrograman kotlin
Dts x dicoding #2 memulai pemrograman kotlin
Go Concurrency Basics
Go Concurrency Basics Go Concurrency Basics
Go Concurrency Basics
Go for the paranoid network programmer, 3rd edition
Go for the paranoid network programmer, 3rd editionGo for the paranoid network programmer, 3rd edition
Go for the paranoid network programmer, 3rd edition
Painless Data Storage with MongoDB & Go
Painless Data Storage with MongoDB & Go Painless Data Storage with MongoDB & Go
Painless Data Storage with MongoDB & Go
Journeys with Transmogrifier and friends or How not to get stuck in the Plone...
Journeys with Transmogrifier and friends or How not to get stuck in the Plone...Journeys with Transmogrifier and friends or How not to get stuck in the Plone...
Journeys with Transmogrifier and friends or How not to get stuck in the Plone...
Go Concurrency
Go ConcurrencyGo Concurrency
Go Concurrency
06 file processing
06 file processing06 file processing
06 file processing
JavaOne 2015 - Having fun with Javassist
JavaOne 2015 - Having fun with JavassistJavaOne 2015 - Having fun with Javassist
JavaOne 2015 - Having fun with Javassist

Similar to Using Flow-based programming to write tools and workflows for Scientific Computing in Go

scala-gopher: async implementation of CSP for scala
scala-gopher:  async implementation of CSP  for  scalascala-gopher:  async implementation of CSP  for  scala
scala-gopher: async implementation of CSP for scala
Ruslan Shevchenko
Linux Systems Programming: Inter Process Communication (IPC) using Pipes
Linux Systems Programming: Inter Process Communication (IPC) using PipesLinux Systems Programming: Inter Process Communication (IPC) using Pipes
Linux Systems Programming: Inter Process Communication (IPC) using Pipes
Golang basics for Java developers - Part 1
Golang basics for Java developers - Part 1Golang basics for Java developers - Part 1
Golang basics for Java developers - Part 1
Robert Stern
Os lab final
Os lab finalOs lab final
Os lab final
Mykola Novik
Dev8d 2011-pipe2 py
Dev8d 2011-pipe2 pyDev8d 2011-pipe2 py
Dev8d 2011-pipe2 py
Tony Hirst
Gore: Go REPL
Gore: Go REPLGore: Go REPL
Gore: Go REPL
Hiroshi Shibamura
NetPonto - The Future Of C# - NetConf Edition
NetPonto - The Future Of C# - NetConf EditionNetPonto - The Future Of C# - NetConf Edition
NetPonto - The Future Of C# - NetConf Edition
Paulo Morgado
Future vs. Monix Task
Future vs. Monix TaskFuture vs. Monix Task
Future vs. Monix Task
Hermann Hueck
Incredible Machine with Pipelines and Generators
Incredible Machine with Pipelines and GeneratorsIncredible Machine with Pipelines and Generators
Incredible Machine with Pipelines and Generators
PySpark with Juypter
PySpark with JuypterPySpark with Juypter
PySpark with Juypter
Li Ming Tsai
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides:  Let's build macOS CLI Utilities using SwiftMobileConf 2021 Slides:  Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
Diego Freniche Brito
Puppi. Puppet strings to the shell
Puppi. Puppet strings to the shellPuppi. Puppet strings to the shell
Puppi. Puppet strings to the shell
Alessandro Franceschi
Apache Beam de A à Z
 Apache Beam de A à Z Apache Beam de A à Z
Apache Beam de A à Z
Paris Data Engineers !
A Recovering Java Developer Learns to Go
A Recovering Java Developer Learns to GoA Recovering Java Developer Learns to Go
A Recovering Java Developer Learns to Go
Matt Stine
Diseño y Desarrollo de APIs
Diseño y Desarrollo de APIsDiseño y Desarrollo de APIs
Diseño y Desarrollo de APIs
Raúl Neis
Go serving: Building server app with go
Go serving: Building server app with goGo serving: Building server app with go
Go serving: Building server app with go
Hean Hong Leong
Node.js basics
Node.js basicsNode.js basics
Node.js basics
Ben Lin
모던자바의 역습
모던자바의 역습모던자바의 역습
모던자바의 역습
DoHyun Jung
CP3108B (Mozilla) Sharing Session on Add-on SDK
CP3108B (Mozilla) Sharing Session on Add-on SDKCP3108B (Mozilla) Sharing Session on Add-on SDK
CP3108B (Mozilla) Sharing Session on Add-on SDK

Similar to Using Flow-based programming to write tools and workflows for Scientific Computing in Go (20)

scala-gopher: async implementation of CSP for scala
scala-gopher:  async implementation of CSP  for  scalascala-gopher:  async implementation of CSP  for  scala
scala-gopher: async implementation of CSP for scala
Linux Systems Programming: Inter Process Communication (IPC) using Pipes
Linux Systems Programming: Inter Process Communication (IPC) using PipesLinux Systems Programming: Inter Process Communication (IPC) using Pipes
Linux Systems Programming: Inter Process Communication (IPC) using Pipes
Golang basics for Java developers - Part 1
Golang basics for Java developers - Part 1Golang basics for Java developers - Part 1
Golang basics for Java developers - Part 1
Os lab final
Os lab finalOs lab final
Os lab final
Dev8d 2011-pipe2 py
Dev8d 2011-pipe2 pyDev8d 2011-pipe2 py
Dev8d 2011-pipe2 py
Gore: Go REPL
Gore: Go REPLGore: Go REPL
Gore: Go REPL
NetPonto - The Future Of C# - NetConf Edition
NetPonto - The Future Of C# - NetConf EditionNetPonto - The Future Of C# - NetConf Edition
NetPonto - The Future Of C# - NetConf Edition
Future vs. Monix Task
Future vs. Monix TaskFuture vs. Monix Task
Future vs. Monix Task
Incredible Machine with Pipelines and Generators
Incredible Machine with Pipelines and GeneratorsIncredible Machine with Pipelines and Generators
Incredible Machine with Pipelines and Generators
PySpark with Juypter
PySpark with JuypterPySpark with Juypter
PySpark with Juypter
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides:  Let's build macOS CLI Utilities using SwiftMobileConf 2021 Slides:  Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
Puppi. Puppet strings to the shell
Puppi. Puppet strings to the shellPuppi. Puppet strings to the shell
Puppi. Puppet strings to the shell
Apache Beam de A à Z
 Apache Beam de A à Z Apache Beam de A à Z
Apache Beam de A à Z
A Recovering Java Developer Learns to Go
A Recovering Java Developer Learns to GoA Recovering Java Developer Learns to Go
A Recovering Java Developer Learns to Go
Diseño y Desarrollo de APIs
Diseño y Desarrollo de APIsDiseño y Desarrollo de APIs
Diseño y Desarrollo de APIs
Go serving: Building server app with go
Go serving: Building server app with goGo serving: Building server app with go
Go serving: Building server app with go
Node.js basics
Node.js basicsNode.js basics
Node.js basics
모던자바의 역습
모던자바의 역습모던자바의 역습
모던자바의 역습
CP3108B (Mozilla) Sharing Session on Add-on SDK
CP3108B (Mozilla) Sharing Session on Add-on SDKCP3108B (Mozilla) Sharing Session on Add-on SDK
CP3108B (Mozilla) Sharing Session on Add-on SDK

More from Samuel Lampa

Linked Data for improved organization of research data
Linked Data  for improved organization  of research dataLinked Data  for improved organization  of research data
Linked Data for improved organization of research data
Samuel Lampa
How to document computational research projects
How to document computational research projectsHow to document computational research projects
How to document computational research projects
Samuel Lampa
Reproducibility in Scientific Data Analysis - BioScience Seminar
Reproducibility in Scientific Data Analysis - BioScience SeminarReproducibility in Scientific Data Analysis - BioScience Seminar
Reproducibility in Scientific Data Analysis - BioScience Seminar
Samuel Lampa
Batch import of large RDF datasets into Semantic MediaWiki
Batch import of large RDF datasets into Semantic MediaWikiBatch import of large RDF datasets into Semantic MediaWiki
Batch import of large RDF datasets into Semantic MediaWiki
Samuel Lampa
SciPipe - A light-weight workflow library inspired by flow-based programming
SciPipe - A light-weight workflow library inspired by flow-based programmingSciPipe - A light-weight workflow library inspired by flow-based programming
SciPipe - A light-weight workflow library inspired by flow-based programming
Samuel Lampa
Vagrant, Ansible and Docker - How they fit together for productive flexible d...
Vagrant, Ansible and Docker - How they fit together for productive flexible d...Vagrant, Ansible and Docker - How they fit together for productive flexible d...
Vagrant, Ansible and Docker - How they fit together for productive flexible d...
Samuel Lampa
iRODS Rule Language Cheat Sheet
iRODS Rule Language Cheat SheetiRODS Rule Language Cheat Sheet
iRODS Rule Language Cheat Sheet
Samuel Lampa
AddisDev Meetup ii: Golang and Flow-based Programming
Using Flow-based programming to write tools and workflows for Scientific Computing in Go

  • 1. Using Flow-based programming ... to write Tools and Workflows for Scientific Computing. Go Stockholm Conference, Oct 6, 2018. Samuel Lampa | @saml (slack) | @smllmp (twitter) | Ex - Dept. of Pharm. Biosci, Uppsala University | Savantic AB | RIL Partner AB
  • 2. About the speaker
      ● Name: Samuel Lampa
      ● PhD in Pharm. Bioinformatics from UU (since 1 week)
      ● Researched: Flow-based programming-based workflow tools to build predictive models for drug discovery
      ● Previously: HPC sysadmin & developer, web developer, etc.; M.Sc. in molecular biotechnology engineering
      ● Next week: R&D Engineer at Savantic AB
      ● Also: AfricArxiv and RIL Partner AB
  • 3. Read more about my research →
  • 9. Flow-based programming (FBP). Note: Doesn’t need to be done visually though!
  • 10. FBP in brief
      ● Black box, asynchronously running processes
      ● Data exchange across predefined connections between named ports (with bounded buffers), by message passing only
      ● Connections specified separately from processes
      ● Processes can be reconnected endlessly to form different applications without having to be changed internally
  • 14. The Central Dogma of Biology … from DNA to RNA to Proteins. [Figure: DNA → mRNA → protein; labels: RNA polymerase, ribosome, amino acids, cell nucleus, cell.] Image credits: Nicolle Rager, National Science Foundation. License: Public domain
  • 15. FBP vs Dataflow: “FBP is a particular form of dataflow programming based on bounded buffers, information packets with defined lifetimes, named ports, and separate definition of connections”
  • 16. Benefits abound
      ● Change of connection wiring without rewriting components
      ● Inherently concurrent: suited for the multi-core CPU world
      ● Testing, monitoring and logging are very easy: just plug in a mock, logging or debugging component
      ● Etc. etc. ...
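The "just plug in a logging component" benefit can be illustrated with a minimal sketch in plain Go. This is not taken from the talk: the names `produce`, `tap` and `collect` are hypothetical, and the buffer size of 4 is an arbitrary choice. The point is only that the tap sits on a connection and neither the upstream nor the downstream code changes.

```go
package main

import "fmt"

// produce is a hypothetical source component that emits the given items.
func produce(items []string) <-chan string {
	out := make(chan string, 4) // bounded buffer
	go func() {
		defer close(out)
		for _, it := range items {
			out <- it
		}
	}()
	return out
}

// tap is a hypothetical logging component: it forwards every packet
// unchanged and reports it through the log callback.
func tap(name string, in <-chan string, log func(string)) <-chan string {
	out := make(chan string, 4)
	go func() {
		defer close(out)
		for s := range in {
			log(name + ": " + s)
			out <- s
		}
	}()
	return out
}

// collect drains a channel into a slice.
func collect(in <-chan string) []string {
	var res []string
	for s := range in {
		res = append(res, s)
	}
	return res
}

func main() {
	src := produce([]string{"AAAG", "CCCG"})
	// The tap is wired into the connection; src and collect are untouched.
	logged := tap("src", src, func(msg string) { fmt.Println(msg) })
	fmt.Println(collect(logged))
}
```

Removing the tap again is a one-line change in the wiring (`collect(src)`), which is the property the slide is pointing at.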
  • 17. FBP was invented by J. Paul Morrison at IBM in the late 1960s
  • 18. FBP in Go: GoFlow, by Vladimir Sibirov @sibiroff (twitter)
  • 19. FBP in plain Go (almost) without frameworks?
  • 20. Generator functions (adapted from Rob Pike’s slides)
      func main() {
          c := generateInts(10) // Call function to get a channel
          for v := range c {    // … and loop over it
              fmt.Println(v)
          }
      }

      func generateInts(max int) <-chan int { // Return a channel of ints
          c := make(chan int)
          go func() { // Init go-routine inside function
              defer close(c)
              for i := 0; i <= max; i++ {
                  c <- i
              }
          }()
          return c // Return the channel
      }
  • 21. Chaining generator functions 1/2
      func reverse(cin chan string) chan string {
          cout := make(chan string)
          go func() {
              defer close(cout)
              for s := range cin {         // Loop over in-chan
                  cout <- reverseString(s) // Send on out-chan (reverseString: hypothetical helper that reverses one string)
              }
          }()
          return cout
      }
  • 22–23. Chaining generator functions 2/2
      // Chain the generator functions
      dna := generateDNA() // Generator func of strings
      rev := reverse(dna)
      compl := complement(rev)

      // Drive the chain by reading from last channel
      for dnaString := range compl {
          fmt.Println(dnaString)
      }
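The two chaining slides show fragments only. A self-contained sketch of the whole chain might look as follows; `generateDNA`, `reverse` and `complement` are named on the slides, while their bodies, the helper `pipe`, the per-string helpers and the example sequences are assumptions made for illustration.

```go
package main

import "fmt"

// generateDNA emits a couple of made-up example sequences.
func generateDNA() <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, s := range []string{"AAAGCCCGTGGGGGACCTGTTC", "ACGT"} {
			out <- s
		}
	}()
	return out
}

// reverseString reverses an ASCII string byte by byte.
func reverseString(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

// complementString swaps A<->T and C<->G, leaving other bytes untouched.
func complementString(s string) string {
	b := []byte(s)
	for i, c := range b {
		switch c {
		case 'A':
			b[i] = 'T'
		case 'T':
			b[i] = 'A'
		case 'C':
			b[i] = 'G'
		case 'G':
			b[i] = 'C'
		}
	}
	return string(b)
}

// pipe (hypothetical helper) wraps a per-string function as a
// channel-to-channel stage running in its own goroutine.
func pipe(in <-chan string, f func(string) string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for s := range in {
			out <- f(s)
		}
	}()
	return out
}

func reverse(in <-chan string) <-chan string    { return pipe(in, reverseString) }
func complement(in <-chan string) <-chan string { return pipe(in, complementString) }

func main() {
	// Chain the generator functions
	dna := generateDNA()
	rev := reverse(dna)
	compl := complement(rev)

	// Drive the chain by reading from the last channel
	for dnaString := range compl {
		fmt.Println(dnaString)
	}
}
```

Because every stage closes its output when its input is exhausted, the final `range` loop terminates without any explicit shutdown signalling.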
  • 24. Problems with the generator approach
      ● Inputs not named in connection code (no keyword arguments)
      ● Multiple return values depend on positional arguments:
          leftPart, rightPart := splitInHalves(chanOfStrings)
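To make the positional-return problem concrete, here is a minimal sketch of what a `splitInHalves` generator could look like. Only the call signature appears on the slide; the body, the buffer size and the input data below are assumptions. Nothing in the call site says which returned channel is "left" and which is "right", so swapping them compiles without complaint.

```go
package main

import "fmt"

// splitInHalves returns two channels. The caller has to remember that the
// first one carries the left halves and the second one the right halves.
func splitInHalves(in <-chan string) (<-chan string, <-chan string) {
	left := make(chan string, 16)
	right := make(chan string, 16)
	go func() {
		defer close(left)
		defer close(right)
		for s := range in {
			mid := len(s) / 2
			left <- s[:mid]
			right <- s[mid:]
		}
	}()
	return left, right
}

func main() {
	chanOfStrings := make(chan string, 2)
	chanOfStrings <- "AAAGCC"
	chanOfStrings <- "GTTC"
	close(chanOfStrings)

	// Only the position in the return list tells the two outputs apart.
	leftPart, rightPart := splitInHalves(chanOfStrings)
	for l := range leftPart {
		r := <-rightPart
		fmt.Println(l, r)
	}
}
```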
  • 25. Could we emulate named ports?
      type P struct {
          in  chan string // Channels as struct fields, to act as “named ports”
          out chan string
      }

      func NewP() *P { // Initialize a new component
          return &P{
              in:  make(chan string, 16),
              out: make(chan string, 16),
          }
      }

      func (p *P) Run() {
          defer close(p.out)
          for s := range p.in { // Refer to struct fields when reading ...
              p.out <- s        // ... and writing
          }
      }
  • 26. Could we emulate named ports?
      func main() {
          p1 := NewP()
          p2 := NewP()
          p2.in = p1.out // Connect dependencies here, by assigning to same chan
          go p1.Run()
          go p2.Run()
          go func() { // Feed the input of the network
              defer close(p1.in)
              for i := 0; i <= 10; i++ {
                  p1.in <- "Hej"
              }
          }()
          for s := range p2.out { // Drive the chain from the main go-routine
              fmt.Println(s)
          }
      }
  • 27. Add almost no additional code, and get:
  • 28. Real-world use of FlowBase
      ● RDF → (Semantic) MediaWiki XML
      ● Import via MediaWiki XML import
      ● Code:
      ● Paper:
  • 29. Connecting dependencies with FlowBase
      ttlFileRead.OutTriple = aggregator.In
      aggregator.Out = indexCreator.In
      indexCreator.Out = indexFanOut.In
      indexFanOut.Out["serialize"] = indexToAggr.In
      indexFanOut.Out["conv"] = triplesToWikiConverter.InIndex
      indexToAggr.Out = triplesToWikiConverter.InAggregate
      triplesToWikiConverter.OutPage = xmlCreator.InWikiPage
      xmlCreator.OutTemplates = templateWriter.In
      xmlCreator.OutProperties = propertyWriter.In
      xmlCreator.OutPages = pageWriter.In
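The wiring above uses a fan-out component whose out-ports are addressed by name (`Out["serialize"]`, `Out["conv"]`). A minimal sketch of how such a component could be built with plain channels is shown below. It is not FlowBase's actual code: the `FanOut` type, `NewFanOut` and the buffer size are assumptions made for illustration.

```go
package main

import "fmt"

// FanOut copies every incoming packet to each of its named out-ports.
type FanOut struct {
	In  chan string
	Out map[string]chan string
}

// NewFanOut creates a fan-out component with one out-port per given name.
func NewFanOut(names ...string) *FanOut {
	out := make(map[string]chan string)
	for _, n := range names {
		out[n] = make(chan string, 16)
	}
	return &FanOut{In: make(chan string, 16), Out: out}
}

// Run forwards packets until In is closed, then closes all out-ports.
func (f *FanOut) Run() {
	defer func() {
		for _, c := range f.Out {
			close(c)
		}
	}()
	for s := range f.In {
		for _, c := range f.Out {
			c <- s
		}
	}
}

func main() {
	fan := NewFanOut("serialize", "conv")
	go fan.Run()

	fan.In <- "triple-1"
	fan.In <- "triple-2"
	close(fan.In)

	for _, name := range []string{"serialize", "conv"} {
		for s := range fan.Out[name] {
			fmt.Println(name, s)
		}
	}
}
```

With this layout, connecting a downstream component is the same assignment as on the slide: replace the map entry with the downstream in-channel before `Run` is started.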
  • 30. Taking it further: Port structs
      ttlFileRead.OutTriple().To(aggregator.In())
      aggregator.Out().To(indexCreator.In())
      indexCreator.Out().To(indexToAggr.In())
      indexCreator.Out().To(triplesToWikiConverter.InIndex())
      indexToAggr.Out().To(triplesToWikiConverter.InAggregate())
      triplesToWikiConverter.OutPage().To(xmlCreator.InWikiPage())
      xmlCreator.OutTemplates().To(templateWriter.In())
      xmlCreator.OutProperties().To(propertyWriter.In())
      xmlCreator.OutPages().To(pageWriter.In())
      (So far only used in SciPipe, not yet FlowBase)
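One way the `Out().To(In())` style could be realized is to wrap channels in small port structs, so that connecting becomes a method call instead of a field assignment. The sketch below is an assumption-laden illustration, not SciPipe's real port implementation: `InPort`, `OutPort`, the `Upper` component and the single-receiver `To` are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// InPort owns the channel that a component reads from.
type InPort struct{ Chan chan string }

// OutPort holds the channel that a component writes to once connected.
type OutPort struct{ Chan chan string }

// NewInPort creates an in-port with a bounded buffer.
func NewInPort() *InPort { return &InPort{Chan: make(chan string, 16)} }

// To connects this out-port to an in-port by sharing its channel.
// This sketch supports a single receiver per out-port only.
func (o *OutPort) To(in *InPort) { o.Chan = in.Chan }

// Upper is a hypothetical component that upper-cases each packet.
type Upper struct {
	in  *InPort
	out *OutPort
}

func NewUpper() *Upper             { return &Upper{in: NewInPort(), out: &OutPort{}} }
func (p *Upper) In() *InPort       { return p.in }
func (p *Upper) Out() *OutPort     { return p.out }

// Run processes packets until the in-port is closed.
func (p *Upper) Run() {
	defer close(p.out.Chan)
	for s := range p.in.Chan {
		p.out.Chan <- strings.ToUpper(s)
	}
}

func main() {
	up := NewUpper()
	sink := NewInPort()
	up.Out().To(sink) // Connect before starting the component

	go up.Run()
	up.In().Chan <- "acgt"
	close(up.In().Chan)

	for s := range sink.Chan {
		fmt.Println(s)
	}
}
```

Ports as structs leave room for extras that bare channels cannot carry, such as checking at startup that every port has been connected.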
  • 31. SciPipe: Write Scientific Workflows in Go
      ● Define processes with shell command patterns
      ● Atomic writes, restartable workflows, caching
      ● Automatic file naming
      ● Audit logging
      ● Workflow graph plotting
      ● Intro & Docs:
      ● Preprint paper:
  • 32. SciPipe
      ● Workflow
          ● Keeps track of dependency graph
      ● Process
          ● Added to workflows
          ● Long-running
          ● Typically one per operation
      ● Task
          ● Spawned by processes
          ● Executes just one shell command or custom Go function
          ● Typically one task spawned per operation on a set of input files
      ● Information Packet (IP)
          ● Most common data type passed between processes
      [Diagram labels: Workflow, Process, Task, File IP]
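The relationship between a long-running process, the short-lived tasks it spawns, and a cap on concurrently running tasks can be sketched in plain Go. This is an illustration under assumptions, not SciPipe's implementation: the `IP` struct, `runTasks` and the semaphore-based limit are hypothetical stand-ins for the concepts listed above.

```go
package main

import (
	"fmt"
	"sync"
)

// IP is a stand-in for an information packet that refers to a file.
type IP struct{ Path string }

// runTasks plays the role of a process: it spawns one task per incoming IP
// and lets at most maxTasks of them run at the same time.
func runTasks(in <-chan IP, maxTasks int, task func(IP) IP) []IP {
	sem := make(chan struct{}, maxTasks) // counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	var results []IP

	for ip := range in {
		sem <- struct{}{} // blocks while maxTasks tasks are running
		wg.Add(1)
		go func(ip IP) {
			defer wg.Done()
			defer func() { <-sem }()
			out := task(ip)
			mu.Lock()
			results = append(results, out)
			mu.Unlock()
		}(ip)
	}
	wg.Wait()
	return results
}

func main() {
	in := make(chan IP, 3)
	for _, p := range []string{"a.txt", "b.txt", "c.txt"} {
		in <- IP{Path: p}
	}
	close(in)

	// The task here only derives an output path; a real task would run
	// a shell command or a Go function on the file.
	done := runTasks(in, 2, func(ip IP) IP { return IP{Path: ip.Path + ".done"} })
	fmt.Println(len(done), "tasks finished")
}
```

The order of the results is not deterministic, since tasks finish independently of each other.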
  • 33–40. “Hello World” in SciPipe
      package main

      import (
          // Import the SciPipe package, aliased to 'sp'
          sp ""
      )

      func main() {
          // Init workflow with a name, and max concurrent tasks
          wf := sp.NewWorkflow("hello_world", 4)

          // Initialize processes and set output file paths
          hello := wf.NewProc("hello", "echo 'Hello ' > {o:out}")
          hello.SetOut("out", "hello.txt")

          world := wf.NewProc("world", "echo $(cat {i:in}) World >> {o:out}")
          world.SetOut("out", "{i:in|%.txt}_world.txt")

          // Connect network
          world.In("in").From(hello.Out("out"))

          // Run workflow
          wf.Run()
      }

      Steps:
      ● Import SciPipe
      ● Set up any default variables or data, handle flags etc.
      ● Initiate workflow
      ● Create processes
      ● Define outputs and paths
      ● Connect outputs to inputs (dependencies / data flow)
      ● Run the workflow
  • 41–47. Writing SciPipe workflows
      package main

      import (
          ""
      )

      const dna = "AAAGCCCGTGGGGGACCTGTTC"

      func main() {
          wf := scipipe.NewWorkflow("DNA Base Complement Workflow", 4)

          makeDNA := wf.NewProc("Make DNA", "echo "+dna+" > {o:dna}")
          makeDNA.SetOut("dna", "dna.txt")

          complmt := wf.NewProc("Base Complement", "cat {i:in} | tr ATCG TAGC > {o:compl}")
          complmt.SetOut("compl", "{i:in|%.txt}.compl.txt")

          reverse := wf.NewProc("Reverse", "cat {i:in} | rev > {o:rev}")
          reverse.SetOut("rev", "{i:in|%.txt}.rev.txt")

          complmt.In("in").From(makeDNA.Out("dna"))
          reverse.In("in").From(complmt.Out("compl"))

          wf.Run()
      }

      Steps:
      ● Import SciPipe
      ● Set up any default variables or data, handle flags etc.
      ● Initiate workflow
      ● Create processes
      ● Define outputs and paths
      ● Connect outputs to inputs (dependencies / data flow)
      ● Run the workflow
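The output path patterns in this workflow, such as `{i:in|%.txt}.compl.txt`, build each output file name from the input file name. Assuming that the `%.txt` modifier trims a trailing `.txt` from the input path before the rest of the pattern is appended (an assumption about SciPipe's behaviour, not something stated on the slides), the resulting file names can be traced with a small plain-Go sketch. The helper `resolveOutPath` is hypothetical and not part of SciPipe.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveOutPath mimics a pattern of the form "{i:in|%<trim>}<ext>":
// it trims the given suffix from the input path and appends the extension.
// This is an illustrative stand-in, not SciPipe's own path formatter.
func resolveOutPath(inPath, trim, ext string) string {
	return strings.TrimSuffix(inPath, trim) + ext
}

func main() {
	dna := "dna.txt" // Fixed output path of the "Make DNA" process
	compl := resolveOutPath(dna, ".txt", ".compl.txt")
	rev := resolveOutPath(compl, ".txt", ".rev.txt")

	fmt.Println(compl)
	fmt.Println(rev)
}
```

Under that assumption, each processing step leaves a trace in the file name, which is what makes the output directory easy to browse by hand.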
  • 48. Running it: go run revcompl.go
  • 51. Turn Audit log into TeX/PDF report (TeX template by Jonathan Alvarsson @jonalv)
  • 52. Benefits of SciPipe - Thanks to Go + FBP
      ● Intuitive behaviour: Like conveyor belts & stations in a factory
      ● Flexible: Combine command-line programs with Go components
      ● Custom file naming: Easy to manually browse output files
      ● Portable: Distribute as Go code or as compiled executable files
      ● Easy to debug: Use any Go debugging tools or even just println()
      ● Powerful audit logging: Stream outputs via UNIX FIFO files
      ● Efficient & Parallel: Fast code + efficient use of multi-core CPU
  • 54. Thank you for your time! Using Flow-based programming ... to write Tools and Workflows for Scientific Computing. Talk at Go Stockholm Conference, Oct 6, 2018. Samuel Lampa | @saml (slack) | @smllmp (twitter) | Dept. of Pharm. Biosci, Uppsala University