Using Flow-based programming to write tools and workflows for Scientific Computing in Go

Samuel Lampa, PhD Student in Pharmaceutical Bioinformatics
Using Flow-based programming
... to write Tools and Workflows
for Scientific Computing
Go Stockholm Conference Oct 6, 2018
Samuel Lampa | bionics.it | @saml (slack) | @smllmp (twitter)
Ex - Dept. of Pharm. Biosci, Uppsala University | www.farmbio.uu.se | pharmb.io
Savantic AB savantic.se | RIL Partner AB rilpartner.com
About the speaker
● Name: Samuel Lampa
● PhD in Pharm. Bioinformatics from UU / pharmb.io (since 1 week)
● Researched: Flow-based programming-based workflow tools to build predictive models for drug discovery
● Previously: HPC sysadmin & developer, web developer, etc.; M.Sc. in molecular biotechnology engineering
● Next week: R&D Engineer at Savantic AB (savanticab.com)
● Also: AfricArxiv (africarxiv.org) and RIL Partner AB (rilpartner.com)
Read more about my research
bit.ly/samlthesis →
(bionics.it/posts/phdthesis)
Flow-based … what?
Flow-based programming (FBP)
Note: Doesn’t need to be done visually though!
FBP in brief
● Black-box, asynchronously running processes
● Data exchange across predefined connections between named ports (with bounded buffers), by message passing only
● Connections specified separately from processes
● Processes can be reconnected endlessly to form different applications without having to be changed internally
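The bounded buffers mentioned above map quite naturally onto Go's buffered channels. The following is a minimal sketch of that idea, not code from any FBP library; trySend is a hypothetical helper introduced only to make the back-pressure visible without blocking.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether the packet fit
// into the connection's bounded buffer. (Hypothetical helper, for illustration.)
func trySend(conn chan<- string, packet string) bool {
	select {
	case conn <- packet:
		return true
	default:
		return false // Buffer full: a plain send would block here (back-pressure)
	}
}

func main() {
	conn := make(chan string, 2) // A connection with a bounded buffer of 2 packets

	fmt.Println(trySend(conn, "a")) // Fits in the buffer
	fmt.Println(trySend(conn, "b")) // Fits in the buffer
	fmt.Println(trySend(conn, "c")) // Buffer is full, so this send is refused

	<-conn                          // A downstream process consumes one packet ...
	fmt.Println(trySend(conn, "c")) // ... which frees up a slot again
}
```

In a real network a process would normally use a plain blocking send, so that a slow downstream process automatically slows down the upstream one.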
The Central Dogma of Biology …
… from DNA to RNA to Proteins
[Figure: diagram of a cell, with the labels Cell, Cell nucleus, DNA, RNA polymerase, mRNA, Ribosome, Amino acids and Protein]
Image credits: Nicolle Rager, National Science Foundation. License: Public domain
FBP vs Dataflow
“FBP is a particular form of dataflow programming based on bounded buffers, information packets with defined lifetimes, named ports, and separate definition of connections”
Benefits abound
● Change the connection wiring without rewriting components
● Inherently concurrent: suited for the multi-core CPU world
● Testing, monitoring and logging are very easy: just plug in a mock, logging or debugging component
● Etc.
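As an illustration of the "just plug in a logging component" point, here is a minimal sketch in plain Go. The logTap component and its logf parameter are hypothetical names chosen for this example; they are not part of GoFlow, FlowBase or SciPipe.

```go
package main

import "fmt"

// logTap forwards every packet from in to the returned channel unchanged,
// logging each one on the way. (Hypothetical component, for illustration only.)
func logTap(name string, in <-chan string, logf func(format string, args ...interface{})) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for s := range in {
			logf("[%s] %s\n", name, s)
			out <- s
		}
	}()
	return out
}

func main() {
	// An upstream process producing a few packets
	src := make(chan string)
	go func() {
		defer close(src)
		for _, s := range []string{"ACGT", "TTGA"} {
			src <- s
		}
	}()

	printf := func(format string, args ...interface{}) { fmt.Printf(format, args...) }

	// Plug the tap in between two processes without changing either of them
	for s := range logTap("tap", src, printf) {
		_ = s // A downstream process would consume s here
	}
}
```

Because the tap has the same channel-in, channel-out shape as any other stage, it can be inserted or removed by changing only the wiring.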
Invented by J. Paul Morrison at IBM in the late 1960s
jpaulmorrison.com/fbp
FBP in Go: GoFlow
github.com/trustmaster/goflow
by Vladimir Sibirov @sibiroff (twitter)
FBP in plain Go
(almost) without frameworks?
Generator functions
Adapted from Rob Pike’s slides: talks.golang.org/2012/concurrency.slide#25
package main

import "fmt"

func main() {
    c := generateInts(10) // Call function to get a channel
    for v := range c {    // … and loop over it
        fmt.Println(v)
    }
}

func generateInts(max int) <-chan int { // Return a channel of ints
    c := make(chan int)
    go func() { // Start a goroutine inside the function
        defer close(c)
        for i := 0; i <= max; i++ {
            c <- i
        }
    }()
    return c // Return the channel
}
Chaining generator functions 1/2
func reverse(cin chan string) chan string {
    cout := make(chan string)
    go func() {
        defer close(cout)
        for s := range cin {         // Loop over in-chan
            cout <- reverseString(s) // Send on out-chan; reverseString (not shown) reverses a single string
        }
    }()
    return cout
}
Chaining generator functions 2/2
// Chain the generator functions
dna := generateDNA() // Generator func of strings
rev := reverse(dna)
compl := complement(rev)
// Drive the chain by reading from last channel
for dnaString := range compl {
    fmt.Println(dnaString)
}
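For reference, the chain can be put together as one runnable program. This is a sketch under assumptions: the slides do not show generateDNA or complement, so their bodies here, as well as the helpers reverseString, complementString and mapStrings and the example DNA strings, are made up for illustration.

```go
package main

import "fmt"

// generateDNA emits a few hard-coded DNA strings (hypothetical example data).
func generateDNA() chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, s := range []string{"AAAGCCC", "GTGGGGG", "ACCTGTTC"} {
			out <- s
		}
	}()
	return out
}

// reverseString reverses a single ASCII string.
func reverseString(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

// complementString swaps each base for its complement (A<->T, C<->G).
func complementString(s string) string {
	pairs := map[byte]byte{'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
	b := []byte(s)
	for i, c := range b {
		if p, ok := pairs[c]; ok {
			b[i] = p
		}
	}
	return string(b)
}

// mapStrings wraps a plain string function in a channel-to-channel stage.
func mapStrings(in chan string, f func(string) string) chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for s := range in {
			out <- f(s)
		}
	}()
	return out
}

func reverse(in chan string) chan string    { return mapStrings(in, reverseString) }
func complement(in chan string) chan string { return mapStrings(in, complementString) }

func main() {
	// Chain the generator functions, then drive the chain from the last channel
	for dnaString := range complement(reverse(generateDNA())) {
		fmt.Println(dnaString)
	}
}
```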
Problems with the generator approach
● Inputs are not named in the connection code (no keyword arguments)
● Multiple return values are told apart only by their position:
    leftPart, rightPart := splitInHalves(chanOfStrings)
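To make the second problem concrete, here is a sketch of what such a two-output generator function might look like. The slide only shows the call, so this body of splitInHalves is a hypothetical implementation written for illustration.

```go
package main

import "fmt"

// splitInHalves sends the first half of every incoming string on the first
// returned channel, and the second half on the second one.
// (Hypothetical implementation of the function named on the slide.)
func splitInHalves(in <-chan string) (<-chan string, <-chan string) {
	left := make(chan string)
	right := make(chan string)
	go func() {
		defer close(left)
		defer close(right)
		for s := range in {
			mid := len(s) / 2
			left <- s[:mid]
			right <- s[mid:]
		}
	}()
	return left, right
}

func main() {
	chanOfStrings := make(chan string)
	go func() {
		defer close(chanOfStrings)
		chanOfStrings <- "AAAATTTT"
		chanOfStrings <- "CCGG"
	}()

	// Only the position tells us which channel is which. Writing
	// rightPart, leftPart := ... instead would compile just as happily.
	leftPart, rightPart := splitInHalves(chanOfStrings)
	for l := range leftPart {
		r := <-rightPart // Read in the same order as the halves are sent
		fmt.Println(l, r)
	}
}
```

Both return values have the same type, so the compiler cannot catch a swapped assignment; that is the gap that named ports are meant to close.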
Could we emulate named ports?
type P struct {
    in  chan string // Channels as struct fields, to act as “named ports”
    out chan string
}

func NewP() *P { // Initialize a new component
    return &P{
        in:  make(chan string, 16),
        out: make(chan string, 16),
    }
}

func (p *P) Run() {
    defer close(p.out)
    for s := range p.in { // Refer to struct fields when reading ...
        p.out <- s        // ... and writing
    }
}
Could we emulate named ports?
func main() {
    p1 := NewP()
    p2 := NewP()
    p2.in = p1.out // Connect dependencies here, by assigning to same chan

    go p1.Run()
    go p2.Run()

    go func() { // Feed the input of the network
        defer close(p1.in)
        for i := 0; i <= 10; i++ {
            p1.in <- "Hej"
        }
    }()

    for s := range p2.out { // Drive the chain from the main go-routine
        fmt.Println(s)
    }
}
Add almost no additional code, and get:
flowbase.org
Real-world use of FlowBase
● RDF → (Semantic) MediaWiki XML
● Import via MediaWiki XML import
● Code: github.com/rdfio/rdf2smw
● Paper: bit.ly/rdfiopub
Connecting dependencies with FlowBase
ttlFileRead.OutTriple = aggregator.In
aggregator.Out = indexCreator.In
indexCreator.Out = indexFanOut.In
indexFanOut.Out["serialize"] = indexToAggr.In
indexFanOut.Out["conv"] = triplesToWikiConverter.InIndex
indexToAggr.Out = triplesToWikiConverter.InAggregate
triplesToWikiConverter.OutPage = xmlCreator.InWikiPage
xmlCreator.OutTemplates = templateWriter.In
xmlCreator.OutProperties = propertyWriter.In
xmlCreator.OutPages = pageWriter.In
github.com/rdfio/rdf2smw/blob/e7e2b3/main.go#L100-L125
Taking it further: Port structs
ttlFileRead.OutTriple().To(aggregator.In())
aggregator.Out().To(indexCreator.In())
indexCreator.Out().To(indexToAggr.In())
indexCreator.Out().To(triplesToWikiConverter.InIndex())
indexToAggr.Out().To(triplesToWikiConverter.InAggregate())
triplesToWikiConverter.OutPage().To(xmlCreator.InWikiPage())
xmlCreator.OutTemplates().To(templateWriter.In())
xmlCreator.OutProperties().To(propertyWriter.In())
xmlCreator.OutPages().To(pageWriter.In())
(So far only used in SciPipe, not yet FlowBase)
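One way such port structs could be built is sketched below. This is an assumption-laden illustration, not SciPipe's actual implementation: the InPort, OutPort and Source types, their fields and the buffer size are all hypothetical.

```go
package main

import "fmt"

// InPort and OutPort are hypothetical port structs wrapping a channel.
type InPort struct {
	Chan chan string
}

type OutPort struct {
	Chan chan string
}

// To connects this out-port to an in-port by letting both refer to the
// same bounded channel.
func (o *OutPort) To(in *InPort) {
	ch := make(chan string, 16)
	o.Chan = ch
	in.Chan = ch
}

// Source is a small example component with a single out-port.
type Source struct {
	out *OutPort
}

func NewSource() *Source        { return &Source{out: &OutPort{}} }
func (s *Source) Out() *OutPort { return s.out }

// Run sends a few packets on the out-port and then closes it.
func (s *Source) Run() {
	defer close(s.out.Chan)
	for _, v := range []string{"a", "b", "c"} {
		s.out.Chan <- v
	}
}

func main() {
	src := NewSource()
	sink := &InPort{}

	src.Out().To(sink) // Connection code reads almost like a sentence

	go src.Run()
	for s := range sink.Chan {
		fmt.Println(s)
	}
}
```

This simple version supports only one in-port per out-port. Connecting the same out-port to several in-ports, as done with indexCreator.Out() in the listing, would need additional machinery such as a list of remote ports.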
SciPipe
Write Scientific Workflows in Go
● Define processes with shell command patterns
● Atomic writes, Restartable workflows, Caching
● Automatic file naming
● Audit logging
● Workflow graph plotting
● Intro & Docs: scipipe.org
● Preprint paper: doi.org/10.1101/380808
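A common way to get atomic writes is to write to a temporary file and rename it to the final path only once the write has succeeded. The sketch below shows that general technique; it is an assumption about how such a feature can be done, and SciPipe's own implementation may differ. The names writeAtomically and demo are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAtomically writes data to a temporary file in the target directory and
// renames it to its final path only after the write has succeeded, so that a
// file at the final path is never half-written.
func writeAtomically(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // Cleans up on failure; fails harmlessly after a successful rename

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo writes a file atomically into a fresh temp directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "hello.txt")
	if err := writeAtomically(path, []byte("Hello\n")); err != nil {
		return "", err
	}
	content, err := os.ReadFile(path)
	return string(content), err
}

func main() {
	content, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(content)
}
```

With this pattern, the mere existence of an output file can be taken as a sign that it is complete, which is what makes skipping already finished steps on restart safe.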
SciPipe
● Workflow
    ● Keeps track of dependency graph
● Process
    ● Added to workflows
    ● Long-running
    ● Typically one per operation
● Task
    ● Spawned by processes
    ● Executes just one shell command or custom Go function
    ● Typically one task spawned per operation on a set of input files
● Information Packet (IP)
    ● Most common data type passed between processes
[Figure: diagram with the labels Workflow, Process, File IP and Task]
“Hello World” in SciPipe
package main

import (
    // Import the SciPipe package, aliased to 'sp'
    sp "github.com/scipipe/scipipe"
)

func main() {
    // Init workflow with a name, and max concurrent tasks
    wf := sp.NewWorkflow("hello_world", 4)

    // Initialize processes and set output file paths
    hello := wf.NewProc("hello", "echo 'Hello ' > {o:out}")
    hello.SetOut("out", "hello.txt")

    world := wf.NewProc("world", "echo $(cat {i:in}) World >> {o:out}")
    world.SetOut("out", "{i:in|%.txt}_world.txt")

    // Connect network
    world.In("in").From(hello.Out("out"))

    // Run workflow
    wf.Run()
}
● Import SciPipe
● Set up any default variables
or data, handle flags etc
● Initiate workflow
● Create processes
● Define outputs and paths
● Connect outputs to inputs
(dependencies / data flow)
● Run the workflow
Writing SciPipe workflows
package main

import (
    "github.com/scipipe/scipipe"
)

const dna = "AAAGCCCGTGGGGGACCTGTTC"

func main() {
    wf := scipipe.NewWorkflow("DNA Base Complement Workflow", 4)

    makeDNA := wf.NewProc("Make DNA", "echo "+dna+" > {o:dna}")
    makeDNA.SetOut("dna", "dna.txt")

    complmt := wf.NewProc("Base Complement", "cat {i:in} | tr ATCG TAGC > {o:compl}")
    complmt.SetOut("compl", "{i:in|%.txt}.compl.txt")

    reverse := wf.NewProc("Reverse", "cat {i:in} | rev > {o:rev}")
    reverse.SetOut("rev", "{i:in|%.txt}.rev.txt")

    complmt.In("in").From(makeDNA.Out("dna"))
    reverse.In("in").From(complmt.Out("compl"))

    wf.Run()
}
● Import SciPipe
● Set up any default variables
or data, handle flags etc
● Initiate workflow
● Create processes
● Define outputs and paths
● Connect outputs to inputs
(dependencies / data flow)
● Run the workflow
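To check what the two shell steps compute, the same transformation can be written in plain Go. This is only a cross-check sketch and does not use SciPipe; the functions complement and reverse are hypothetical stand-ins for the `tr ATCG TAGC` and `rev` commands in the workflow.

```go
package main

import "fmt"

const dna = "AAAGCCCGTGGGGGACCTGTTC"

// complement mirrors `tr ATCG TAGC`: A<->T and C<->G, other bytes unchanged.
func complement(s string) string {
	out := make([]byte, len(s))
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case 'A':
			out[i] = 'T'
		case 'T':
			out[i] = 'A'
		case 'C':
			out[i] = 'G'
		case 'G':
			out[i] = 'C'
		default:
			out[i] = s[i]
		}
	}
	return string(out)
}

// reverse mirrors `rev` for a single line of ASCII text.
func reverse(s string) string {
	out := []byte(s)
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return string(out)
}

func main() {
	// Same order as in the workflow: first complement, then reverse
	fmt.Println(reverse(complement(dna)))
}
```

For the DNA constant above this yields GAACAGGTCCCCCACGGGCTTT, which is the sequence the final output file of the workflow would be expected to contain.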
Running it
go run revcompl.go
Dependency graph plotting
Structured audit log
(Hierarchical JSON)
Turn Audit log into TeX/PDF report
TeX template by Jonathan Alvarsson @jonalv
Benefits of SciPipe - Thanks to Go + FBP
● Intuitive behaviour: Like conveyor belts & stations in a factory.
● Flexible: Combine command-line programs with Go components
● Custom file naming: Easy to manually browse output files
● Portable: Distribute as Go code or as compiled executable files
● Easy to debug: Use any Go debugging tools or even just println()
● Powerful audit logging: Stream outputs via UNIX FIFO files
● Efficient & Parallel: Fast code + Efficient use of multi-core CPU
More info at:
scipipe.org
Thank you for your time!
Using Flow-based programming
... to write Tools and Workflows for Scientific Computing
Talk at Go Stockholm Conference Oct 6, 2018
Samuel Lampa | bionics.it | @saml (slack) | @smllmp (twitter)
Dept. of Pharm. Biosci, Uppsala University | www.farmbio.uu.se | pharmb.io
Samuel Lampa2.2K views
Flow based programming an overview by Samuel Lampa
Flow based programming   an overviewFlow based programming   an overview
Flow based programming an overview
Samuel Lampa4.6K views
Python Generators - Talk at PySthlm meetup #15 by Samuel Lampa
Python Generators - Talk at PySthlm meetup #15Python Generators - Talk at PySthlm meetup #15
Python Generators - Talk at PySthlm meetup #15
Samuel Lampa1.4K views
The RDFIO Extension - A Status update by Samuel Lampa
The RDFIO Extension - A Status updateThe RDFIO Extension - A Status update
The RDFIO Extension - A Status update
Samuel Lampa1.3K views
My lightning talk at Go Stockholm meetup Aug 6th 2013 by Samuel Lampa
My lightning talk at Go Stockholm meetup Aug 6th 2013My lightning talk at Go Stockholm meetup Aug 6th 2013
My lightning talk at Go Stockholm meetup Aug 6th 2013
Samuel Lampa1.7K views
Hooking up Semantic MediaWiki with external tools via SPARQL by Samuel Lampa
Hooking up Semantic MediaWiki with external tools via SPARQLHooking up Semantic MediaWiki with external tools via SPARQL
Hooking up Semantic MediaWiki with external tools via SPARQL
Samuel Lampa4.1K views
Thesis presentation Samuel Lampa by Samuel Lampa
Thesis presentation Samuel LampaThesis presentation Samuel Lampa
Thesis presentation Samuel Lampa
Samuel Lampa715 views
3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse by Samuel Lampa
3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
3rd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
Samuel Lampa1.3K views
2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse by Samuel Lampa
2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
2nd Proj. Update: Integrating SWI-Prolog for Semantic Reasoning in Bioclipse
Samuel Lampa828 views

Recently uploaded

Automated Remote sensing GPS satellite system for managing resources and moni... by
Automated Remote sensing GPS satellite system for managing resources and moni...Automated Remote sensing GPS satellite system for managing resources and moni...
Automated Remote sensing GPS satellite system for managing resources and moni...Khalid Abdel Naser Abdel Rahim
7 views1 slide
Programmable Logic Devices : SPLD and CPLD by
Programmable Logic Devices : SPLD and CPLDProgrammable Logic Devices : SPLD and CPLD
Programmable Logic Devices : SPLD and CPLDUsha Mehta
44 views54 slides
Renewal Projects in Seismic Construction by
Renewal Projects in Seismic ConstructionRenewal Projects in Seismic Construction
Renewal Projects in Seismic ConstructionEngineering & Seismic Construction
12 views8 slides
Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R... by
Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R...Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R...
Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R...IJCNCJournal
5 views25 slides
taylor-2005-classical-mechanics.pdf by
taylor-2005-classical-mechanics.pdftaylor-2005-classical-mechanics.pdf
taylor-2005-classical-mechanics.pdfArturoArreola10
40 views808 slides
REPORT Data Science EXPERT LECTURE.doc by
REPORT Data Science EXPERT LECTURE.docREPORT Data Science EXPERT LECTURE.doc
REPORT Data Science EXPERT LECTURE.docParulkhatri11
7 views9 slides

Recently uploaded(20)

Programmable Logic Devices : SPLD and CPLD by Usha Mehta
Programmable Logic Devices : SPLD and CPLDProgrammable Logic Devices : SPLD and CPLD
Programmable Logic Devices : SPLD and CPLD
Usha Mehta44 views
Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R... by IJCNCJournal
Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R...Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R...
Trust Metric-Based Anomaly Detection via Deep Deterministic Policy Gradient R...
IJCNCJournal5 views
taylor-2005-classical-mechanics.pdf by ArturoArreola10
taylor-2005-classical-mechanics.pdftaylor-2005-classical-mechanics.pdf
taylor-2005-classical-mechanics.pdf
ArturoArreola1040 views
REPORT Data Science EXPERT LECTURE.doc by Parulkhatri11
REPORT Data Science EXPERT LECTURE.docREPORT Data Science EXPERT LECTURE.doc
REPORT Data Science EXPERT LECTURE.doc
Parulkhatri117 views
Programmable Switches for Programmable Logic Devices by Usha Mehta
Programmable Switches for Programmable Logic DevicesProgrammable Switches for Programmable Logic Devices
Programmable Switches for Programmable Logic Devices
Usha Mehta37 views
Different type of computer networks .pptx by nazmul1514788
Different  type of computer networks .pptxDifferent  type of computer networks .pptx
Different type of computer networks .pptx
nazmul151478820 views
2023-12 Emarei MRI Tool Set E2I0501ST (TQ).pdf by Philipp Daum
2023-12 Emarei MRI Tool Set E2I0501ST (TQ).pdf2023-12 Emarei MRI Tool Set E2I0501ST (TQ).pdf
2023-12 Emarei MRI Tool Set E2I0501ST (TQ).pdf
Philipp Daum6 views
Ansari: Practical experiences with an LLM-based Islamic Assistant by M Waleed Kadous
Ansari: Practical experiences with an LLM-based Islamic AssistantAnsari: Practical experiences with an LLM-based Islamic Assistant
Ansari: Practical experiences with an LLM-based Islamic Assistant
M Waleed Kadous13 views
Integrating Sustainable Development Goals (SDGs) in School Education by SheetalTank1
Integrating Sustainable Development Goals (SDGs) in School EducationIntegrating Sustainable Development Goals (SDGs) in School Education
Integrating Sustainable Development Goals (SDGs) in School Education
SheetalTank120 views
GDSC Mikroskil Members Onboarding 2023.pdf by gdscmikroskil
GDSC Mikroskil Members Onboarding 2023.pdfGDSC Mikroskil Members Onboarding 2023.pdf
GDSC Mikroskil Members Onboarding 2023.pdf
gdscmikroskil75 views
Unlocking Research Visibility.pdf by KhatirNaima
Unlocking Research Visibility.pdfUnlocking Research Visibility.pdf
Unlocking Research Visibility.pdf
KhatirNaima11 views
Field Programmable Gate Arrays : Architecture by Usha Mehta
Field Programmable Gate Arrays : ArchitectureField Programmable Gate Arrays : Architecture
Field Programmable Gate Arrays : Architecture
Usha Mehta33 views
IRJET-Productivity Enhancement Using Method Study.pdf by SahilBavdhankar
IRJET-Productivity Enhancement Using Method Study.pdfIRJET-Productivity Enhancement Using Method Study.pdf
IRJET-Productivity Enhancement Using Method Study.pdf
SahilBavdhankar11 views
Design_Discover_Develop_Campaign.pptx by ShivanshSeth6
Design_Discover_Develop_Campaign.pptxDesign_Discover_Develop_Campaign.pptx
Design_Discover_Develop_Campaign.pptx
ShivanshSeth659 views
BCIC - Manufacturing Conclave - Technology-Driven Manufacturing for Growth by Innomantra
BCIC - Manufacturing Conclave -  Technology-Driven Manufacturing for GrowthBCIC - Manufacturing Conclave -  Technology-Driven Manufacturing for Growth
BCIC - Manufacturing Conclave - Technology-Driven Manufacturing for Growth
Innomantra 28 views

Using Flow-based programming to write tools and workflows for Scientific Computing in Go

  • 1. Using Flow-based programming ... to write Tools and Workflows for Scientific Computing Go Stockholm Conference Oct 6, 2018 Samuel Lampa | bionics.it | @saml (slack) | @smllmp (twitter) Ex - Dept. of Pharm. Biosci, Uppsala University | www.farmbio.uu.se | pharmb.io Savantic AB savantic.se | RIL Partner AB rilpartner.com
  • 2. About the speaker ● Name: Samuel Lampa ● PhD in Pharm. Bioinformatics from UU / pharmb.io (since 1 week) ● Researched: Flow-based programming-based workflow tools to build predictive models for drug discovery ● Previously: HPC sysadmin & developer, Web developer, etc., M.Sc. in molecular biotechnology engineering ● Next week: R&D Engineer at Savantic AB (savanticab.com) ● (Also: AfricArxiv (africarxiv.org) and RIL Partner AB (rilpartner.com))
  • 3. Read more about my research bit.ly/samlthesis → (bionics.it/posts/phdthesis)
  • 9. Flow-based programming (FBP) Note: Doesn’t need to be done visually though!
  • 10. FBP in brief ● Black box, asynchronously running processes ● Data exchange across predefined connections between named ports (with bounded buffers) by message passing only ● Connections specified separately from processes ● Processes can be reconnected endlessly to form different applications without having been changed internally
  • 14. The Central Dogma of Biology … from DNA to RNA to Proteins [Figure labels: DNA, mRNA, Protein, Amino acids, Ribosome, RNA polymerase, Cell nucleus, Cell] Image credits: Nicolle Rager, National Science Foundation. License: Public domain
  • 15. FBP vs Dataflow: “FBP is a particular form of dataflow programming based on bounded buffers, information packets with defined lifetimes, named ports, and separate definition of connections”
  • 16. Benefits abound ● Change of connection wiring without rewriting components ● Inherently concurrent: suited for the multi-core CPU world ● Testing, monitoring and logging very easy: just plug in a mock, logging or debugging component ● Etc. etc. ...
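The "just plug in a logging component" benefit can be illustrated in plain Go with channels. The following is only a sketch under stated assumptions: `source`, `upper`, `tap` and `runPipeline` are hypothetical names invented for this example and are not part of GoFlow, FlowBase or SciPipe. The point is that `tap` is inserted between two components without either of them being changed.

```go
package main

import (
	"fmt"
	"strings"
)

// source emits the given strings on a channel and then closes it.
func source(items ...string) <-chan string {
	out := make(chan string, 16)
	go func() {
		defer close(out)
		for _, s := range items {
			out <- s
		}
	}()
	return out
}

// upper stands in for any "real" processing component.
func upper(in <-chan string) <-chan string {
	out := make(chan string, 16)
	go func() {
		defer close(out)
		for s := range in {
			out <- strings.ToUpper(s)
		}
	}()
	return out
}

// tap is a hypothetical logging component: it forwards every message
// unchanged and hands a copy to the report callback.
func tap(in <-chan string, report func(string)) <-chan string {
	out := make(chan string, 16)
	go func() {
		defer close(out)
		for s := range in {
			report(s)
			out <- s
		}
	}()
	return out
}

// runPipeline wires source -> tap -> upper and returns both the final
// results and everything the tap observed on the way.
func runPipeline(items ...string) (results, seen []string) {
	tapped := tap(source(items...), func(s string) { seen = append(seen, s) })
	for s := range upper(tapped) {
		results = append(results, s)
	}
	return results, seen
}

func main() {
	results, seen := runPipeline("acgt", "ttga")
	fmt.Println("logged:", seen)
	fmt.Println("results:", results)
}
```

Removing the tap again only means changing the wiring line in `runPipeline`; `source` and `upper` stay untouched.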
  • 17. Invented by J. Paul Morrison at IBM in the late 1960s: jpaulmorrison.com(/fbp)
  • 18. FBP in Go: GoFlow, github.com/trustmaster/goflow by Vladimir Sibirov, @sibiroff (twitter)
  • 19. FBP in plain Go (almost) without frameworks?
  • 20. Generator functions
        Adapted from Rob Pike’s slides: talks.golang.org/2012/concurrency.slide#25

        package main

        import "fmt"

        func main() {
            c := generateInts(10) // Call function to get a channel
            for v := range c {    // … and loop over it
                fmt.Println(v)
            }
        }

        func generateInts(max int) <-chan int { // Return a channel of ints
            c := make(chan int)
            go func() { // Init go-routine inside function
                defer close(c)
                for i := 0; i <= max; i++ {
                    c <- i
                }
            }()
            return c // Return the channel
        }
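  • 21. Chaining generator functions 1/2

        func reverse(cin chan string) chan string {
            cout := make(chan string)
            go func() {
                defer close(cout)
                for s := range cin { // Loop over in-chan
                    // Send on out-chan. reverseString is an assumed helper
                    // (not shown on the slide) that reverses one string;
                    // the slide reused the name reverse here, which would
                    // not compile.
                    cout <- reverseString(s)
                }
            }()
            return cout
        }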
  • 22. Chaining generator functions 2/2

        // Chain the generator functions
        dna := generateDNA() // Generator func of strings
        rev := reverse(dna)
        compl := complement(rev)

        // Drive the chain by reading from last channel
        for dnaString := range compl {
            fmt.Println(dnaString)
        }
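To make the chaining idea concrete, here is a self-contained sketch of the whole chain. The slides only show fragments, so the bodies of `generateDNA` and `complement`, the variadic argument to `generateDNA`, and the helper `reverseString` are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// generateDNA emits the given sequences (the slide's version takes no
// arguments; the variadic parameter is an assumption for testability).
func generateDNA(seqs ...string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, s := range seqs {
			out <- s
		}
	}()
	return out
}

// reverseString reverses a single ASCII string (hypothetical helper).
func reverseString(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

// reverse reverses every string arriving on cin.
func reverse(cin <-chan string) <-chan string {
	cout := make(chan string)
	go func() {
		defer close(cout)
		for s := range cin {
			cout <- reverseString(s)
		}
	}()
	return cout
}

// baseComplement swaps A<->T and C<->G in one pass.
var baseComplement = strings.NewReplacer("A", "T", "T", "A", "C", "G", "G", "C")

// complement replaces each base with its complement.
func complement(cin <-chan string) <-chan string {
	cout := make(chan string)
	go func() {
		defer close(cout)
		for s := range cin {
			cout <- baseComplement.Replace(s)
		}
	}()
	return cout
}

func main() {
	// Chain the generator functions and drive the chain from the end.
	compl := complement(reverse(generateDNA("AAAGCC", "GTTC")))
	for s := range compl {
		fmt.Println(s) // prints GGCTTT, then GAAC
	}
}
```

Because every stage runs in its own goroutine and the channels are unbuffered, nothing is computed until the final `range` loop starts pulling values.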
  • 24. Problems with the generator approach ● Inputs not named in connection code (no keyword arguments) ● Multiple return values depend on positional arguments: leftPart, rightPart := splitInHalves(chanOfStrings)
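The positional-return problem can be seen in a small sketch. The slide only shows the call `splitInHalves(chanOfStrings)`; the implementation below and the `collect` helper are assumptions for illustration. Note that both returned channels have the same type, so swapping `leftPart` and `rightPart` at the call site would compile without any warning.

```go
package main

import "fmt"

// splitInHalves sends the first half of each string on the first returned
// channel and the second half on the second one. Which is which is only
// documented by position, not by name.
func splitInHalves(in <-chan string) (<-chan string, <-chan string) {
	left := make(chan string, 16)
	right := make(chan string, 16)
	go func() {
		defer close(left)
		defer close(right)
		for s := range in {
			mid := len(s) / 2
			left <- s[:mid]
			right <- s[mid:]
		}
	}()
	return left, right
}

// collect drains a channel into a slice (hypothetical helper).
func collect(ch <-chan string) []string {
	var items []string
	for s := range ch {
		items = append(items, s)
	}
	return items
}

func main() {
	in := make(chan string, 2)
	in <- "AAAGGG"
	in <- "CCTT"
	close(in)

	// Swapping the two names here would still compile.
	leftPart, rightPart := splitInHalves(in)

	// Draining one side first only works because the buffers (16) are
	// larger than the number of messages in this small example.
	fmt.Println(collect(leftPart))
	fmt.Println(collect(rightPart))
}
```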
  • 25. Could we emulate named ports?

        type P struct {
            in  chan string // Channels as struct fields, to act as “named ports”
            out chan string
        }

        func NewP() *P { // Initialize a new component
            return &P{
                in:  make(chan string, 16),
                out: make(chan string, 16),
            }
        }

        func (p *P) Run() {
            defer close(p.out)
            for s := range p.in { // Refer to struct fields when reading ...
                p.out <- s        // ... and writing
            }
        }
  • 26. Could we emulate named ports?

        func main() {
            p1 := NewP()
            p2 := NewP()
            p2.in = p1.out // Connect dependencies here, by assigning to same chan

            go p1.Run()
            go p2.Run()

            go func() { // Feed the input of the network
                defer close(p1.in)
                for i := 0; i <= 10; i++ {
                    p1.in <- "Hej"
                }
            }()

            for s := range p2.out { // Drive the chain from the main go-routine
                fmt.Println(s)
            }
        }
  • 27. Add almost no additional code, and get: flowbase.org
  • 28. Real-world use of FlowBase ● RDF → (Semantic) MediaWiki XML ● Import via MediaWiki XML import ● Code: github.com/rdfio/rdf2smw ● Paper: bit.ly/rdfiopub
  • 29. Connecting dependencies with FlowBase

        ttlFileRead.OutTriple = aggregator.In
        aggregator.Out = indexCreator.In
        indexCreator.Out = indexFanOut.In
        indexFanOut.Out["serialize"] = indexToAggr.In
        indexFanOut.Out["conv"] = triplesToWikiConverter.InIndex
        indexToAggr.Out = triplesToWikiConverter.InAggregate
        triplesToWikiConverter.OutPage = xmlCreator.InWikiPage
        xmlCreator.OutTemplates = templateWriter.In
        xmlCreator.OutProperties = propertyWriter.In
        xmlCreator.OutPages = pageWriter.In

        github.com/rdfio/rdf2smw/blob/e7e2b3/main.go#L100-L125
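The wiring above uses a fan-out component whose out-ports live in a map keyed by name. How such a component might look is sketched below. This is not the rdf2smw or FlowBase implementation: the `FanOut` type, its fields and the `broadcast` helper are assumptions written for this example, using only the channel-as-struct-field idea from the earlier slides.

```go
package main

import (
	"fmt"
	"sync"
)

// FanOut is a hypothetical component that copies every message from In
// to all channels registered in Out, keyed by port name.
type FanOut struct {
	In  chan string
	Out map[string]chan string
}

func NewFanOut() *FanOut {
	return &FanOut{
		In:  make(chan string, 16),
		Out: make(map[string]chan string),
	}
}

// Run forwards messages until In is closed, then closes all out-ports.
func (f *FanOut) Run() {
	defer func() {
		for _, ch := range f.Out {
			close(ch)
		}
	}()
	for s := range f.In {
		for _, ch := range f.Out {
			ch <- s
		}
	}
}

// broadcast wires a FanOut to one receiver per named port and returns
// what each receiver got (hypothetical helper for the demonstration).
func broadcast(items []string, ports ...string) map[string][]string {
	f := NewFanOut()
	for _, name := range ports {
		// Same style as the slide: assign a downstream channel to a named port.
		f.Out[name] = make(chan string, 16)
	}
	go f.Run()
	go func() {
		defer close(f.In)
		for _, s := range items {
			f.In <- s
		}
	}()

	received := make(map[string][]string)
	var mu sync.Mutex
	var wg sync.WaitGroup
	for name, ch := range f.Out {
		wg.Add(1)
		go func(name string, ch chan string) {
			defer wg.Done()
			for s := range ch {
				mu.Lock()
				received[name] = append(received[name], s)
				mu.Unlock()
			}
		}(name, ch)
	}
	wg.Wait()
	return received
}

func main() {
	got := broadcast([]string{"triple1", "triple2"}, "serialize", "conv")
	fmt.Println(got["serialize"], got["conv"])
}
```

All ports must be registered before `Run` is started, since the map is read concurrently afterwards.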
  • 30. Taking it further: Port structs

        ttlFileRead.OutTriple().To(aggregator.In())
        aggregator.Out().To(indexCreator.In())
        indexCreator.Out().To(indexToAggr.In())
        indexCreator.Out().To(triplesToWikiConverter.InIndex())
        indexToAggr.Out().To(triplesToWikiConverter.InAggregate())
        triplesToWikiConverter.OutPage().To(xmlCreator.InWikiPage())
        xmlCreator.OutTemplates().To(templateWriter.In())
        xmlCreator.OutProperties().To(propertyWriter.In())
        xmlCreator.OutPages().To(pageWriter.In())

        (So far only used in SciPipe, not yet FlowBase)
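One way to get a `To()`-style connection syntax is to wrap channels in small port structs. The sketch below is an assumption about how this could be done in plain Go and is not SciPipe's actual port implementation: `InPort`, `OutPort`, `Send`, `Close` and `Drain` are hypothetical names. It also shows why the explicit fan-out component is no longer needed in the wiring above: one out-port can simply be connected to several in-ports.

```go
package main

import "fmt"

// InPort wraps the channel a component reads from.
type InPort struct {
	ch chan string
}

func NewInPort() *InPort {
	return &InPort{ch: make(chan string, 16)}
}

// Drain reads from the port until it is closed.
func (p *InPort) Drain() []string {
	var items []string
	for s := range p.ch {
		items = append(items, s)
	}
	return items
}

// OutPort keeps a list of the in-ports it is connected to.
type OutPort struct {
	targets []*InPort
}

// To connects this out-port to an in-port. Calling it several times
// connects the same out-port to several receivers.
func (o *OutPort) To(in *InPort) {
	o.targets = append(o.targets, in)
}

// Send delivers a message to every connected in-port.
func (o *OutPort) Send(s string) {
	for _, t := range o.targets {
		t.ch <- s
	}
}

// Close signals end of data to all connected in-ports. This simple
// sketch assumes each in-port has exactly one upstream out-port.
func (o *OutPort) Close() {
	for _, t := range o.targets {
		close(t.ch)
	}
}

func main() {
	out := &OutPort{}
	inIndex := NewInPort()
	inAggregate := NewInPort()

	// Wiring reads like the slide: out.To(in)
	out.To(inIndex)
	out.To(inAggregate)

	out.Send("index-1")
	out.Send("index-2")
	out.Close()

	fmt.Println(inIndex.Drain())
	fmt.Println(inAggregate.Drain())
}
```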
  • 31. SciPipe Write Scientific Workflows in Go ● Define processes with shell command patterns ● Atomic writes, Restartable workflows, Caching ● Automatic file naming ● Audit logging ● Workflow graph plotting ● Intro & Docs: scipipe.org ● Preprint paper: doi.org/10.1101/380808
  • 32. SciPipe ● Workflow: keeps track of dependency graph ● Process: added to workflows; long-running; typically one per operation ● Task: spawned by processes; executes just one shell command or custom Go function; typically one task spawned per operation on a set of input files ● Information Packet (IP): most common data type passed between processes [Diagram labels: Workflow, Process, File IP, Task]
  • 33. “Hello World” in SciPipe

        package main

        import (
            // Import the SciPipe package, aliased to 'sp'
            sp "github.com/scipipe/scipipe"
        )

        func main() {
            // Init workflow with a name, and max concurrent tasks
            wf := sp.NewWorkflow("hello_world", 4)

            // Initialize processes and set output file paths
            hello := wf.NewProc("hello", "echo 'Hello ' > {o:out}")
            hello.SetOut("out", "hello.txt")

            world := wf.NewProc("world", "echo $(cat {i:in}) World >> {o:out}")
            world.SetOut("out", "{i:in|%.txt}_world.txt")

            // Connect network
            world.In("in").From(hello.Out("out"))

            // Run workflow
            wf.Run()
        }
  • 34-40. “Hello World” in SciPipe (same code as slide 33), with the steps labelled: ● Import SciPipe ● Set up any default variables or data, handle flags etc. ● Initiate workflow ● Create processes ● Define outputs and paths ● Connect outputs to inputs (dependencies / data flow) ● Run the workflow
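The second argument to `NewWorkflow` is described in the code comment as the maximum number of concurrent tasks. A common way to enforce such a cap in plain Go is a buffered channel used as a counting semaphore. The sketch below illustrates that general technique only; it is not taken from SciPipe's source, and `runBounded` is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runBounded runs all tasks, but never more than maxConcurrent at once.
// It returns the highest number of tasks observed running simultaneously.
func runBounded(tasks []func(), maxConcurrent int) int {
	sem := make(chan struct{}, maxConcurrent) // counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0

	for _, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrent tasks are running
		go func(t func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when done

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			t()

			mu.Lock()
			running--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	tasks := make([]func(), 10)
	for i := range tasks {
		tasks[i] = func() { time.Sleep(10 * time.Millisecond) }
	}
	peak := runBounded(tasks, 4)
	fmt.Println("peak concurrency:", peak)
}
```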
  • 41. Writing SciPipe workflows

        package main

        import (
            "github.com/scipipe/scipipe"
        )

        const dna = "AAAGCCCGTGGGGGACCTGTTC"

        func main() {
            wf := scipipe.NewWorkflow("DNA Base Complement Workflow", 4)

            makeDNA := wf.NewProc("Make DNA", "echo "+dna+" > {o:dna}")
            makeDNA.SetOut("dna", "dna.txt")

            complmt := wf.NewProc("Base Complement", "cat {i:in} | tr ATCG TAGC > {o:compl}")
            complmt.SetOut("compl", "{i:in|%.txt}.compl.txt")

            reverse := wf.NewProc("Reverse", "cat {i:in} | rev > {o:rev}")
            reverse.SetOut("rev", "{i:in|%.txt}.rev.txt")

            complmt.In("in").From(makeDNA.Out("dna"))
            reverse.In("in").From(complmt.Out("compl"))

            wf.Run()
        }

        ● Import SciPipe ● Set up any default variables or data, handle flags etc. ● Initiate workflow ● Create processes ● Define outputs and paths ● Connect outputs to inputs (dependencies / data flow) ● Run the workflow
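The output path patterns such as `{i:in|%.txt}.compl.txt` build each output file name from the path of the file on the named in-port. Assuming that the `%.txt` modifier strips a trailing `.txt` from that path (which is how the pattern reads, but is an assumption here), the resulting file names can be worked out with a small stand-alone sketch. `outPath` is a hypothetical helper for illustration, not a SciPipe function.

```go
package main

import (
	"fmt"
	"strings"
)

// outPath mimics the assumed effect of a pattern "{i:in|%<trim>}<suffix>":
// strip trimSuffix from the end of the input path, then append newSuffix.
func outPath(inPath, trimSuffix, newSuffix string) string {
	return strings.TrimSuffix(inPath, trimSuffix) + newSuffix
}

func main() {
	// "Hello World" workflow: hello.txt feeds the world process.
	fmt.Println(outPath("hello.txt", ".txt", "_world.txt")) // hello_world.txt

	// DNA workflow: dna.txt -> complement -> reverse.
	compl := outPath("dna.txt", ".txt", ".compl.txt")
	rev := outPath(compl, ".txt", ".rev.txt")
	fmt.Println(compl) // dna.compl.txt
	fmt.Println(rev)   // dna.compl.rev.txt
}
```

Under this assumption, each file name records the chain of operations that produced it, which fits the "easy to manually browse output files" point made later in the talk.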
  • 48. Running it go run revcompl.go
  • 51. Turn Audit log into TeX/PDF report TeX template by Jonathan Alvarsson @jonalv
  • 52. Benefits of SciPipe, thanks to Go + FBP ● Intuitive behaviour: Like conveyor belts & stations in a factory ● Flexible: Combine command-line programs with Go components ● Custom file naming: Easy to manually browse output files ● Portable: Distribute as Go code or as compiled executable files ● Easy to debug: Use any Go debugging tools or even just println() ● Powerful audit logging: Stream outputs via UNIX FIFO files ● Efficient & Parallel: Fast code + efficient use of multi-core CPU
  • 54. Thank you for your time! Using Flow-based programming ... to write Tools and Workflows for Scientific Computing Talk at Go Stockholm Conference Oct 6, 2018 Samuel Lampa | bionics.it | @saml (slack) | @smllmp (twitter) Dept. of Pharm. Biosci, Uppsala University | www.farmbio.uu.se | pharmb.io