Validation and Inference of Schema-Level
Workflow Data-Dependency Annotations
Shawn Bowers¹, Timothy McPhillips², Bertram Ludäscher²
¹Dept. of Computer Science, Gonzaga University
²School of Information Sciences, University of Illinois, Urbana-Champaign
IPAW 2018
Scientific Workflows and Provenance
A workflow specification modeled as a graph of computation
steps (nodes) and data/control flow (edges)
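[Figure: example specification graphs. One shows the step gen_boundary_region with data boundary_coordinates, user_map_marker_pos, and prism_data (file:data/112W36N.nc); the other shows steps gen and filter with data d1, d2, d3, and c.]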
Steps are often “black boxes” (invoke external programs)
Scientific Workflows and Provenance
During a workflow execution, systems
record “provenance” information ...
• invocation of steps
• data received/produced by steps
A workflow trace modeled as a graph
of invocations and corresponding data
• a trace is a specification instance
• capturing details of a workflow run
[Figure: example trace graphs with gen and filter invocations (e.g., filter:1, filter:2) and the data values passed between them]
Different traces of the same specification
Data Dependency Assumptions and Issues
Traces are used to infer the “lineage” of data products (*)
• e.g., all steps and inputs/outputs that led to an output
• assume all outputs “depend on” all inputs of a step
[Figure: example trace with gen and filter invocations]
However, the inferred “dependencies” can be incorrect and vague
1. some outputs might not “depend on” all inputs
2. outputs can depend on inputs differently (derivation, copy, ...)
(*) some systems provide APIs for steps to declare dependencies at runtime
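The "all outputs depend on all inputs" assumption can be sketched in Python as follows. The trace encoding (a list of invocations with named input and output items) and the helper `naive_lineage` are illustrative assumptions, not the format of any particular provenance system.

```python
from typing import NamedTuple

class Invocation(NamedTuple):
    """One recorded step invocation (hypothetical trace encoding)."""
    step: str
    inputs: frozenset
    outputs: frozenset

def naive_lineage(trace, item):
    """Return every data item that `item` transitively 'depends on',
    assuming each output of an invocation depends on all of its inputs."""
    lineage, frontier = set(), [item]
    while frontier:
        current = frontier.pop()
        for inv in trace:
            if current in inv.outputs:
                for src in inv.inputs - lineage:
                    lineage.add(src)
                    frontier.append(src)
    return lineage

# Hypothetical trace: gen reads a and writes b; filter reads b and c, writes d and e.
trace = [
    Invocation("gen:1", frozenset({"a"}), frozenset({"b"})),
    Invocation("filter:1", frozenset({"b", "c"}), frozenset({"d", "e"})),
]
print(sorted(naive_lineage(trace, "d")))
```

Under this assumption both d and e report c in their lineage, even if only one of them actually reads c inside the black-box step, and nothing distinguishes a copy from a derivation.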
Prospective (Schema-Level) Dependency Annotations
Our approach:
• Allow workflow authors to specify dependency patterns (annotations)
• Support different data dependency types
• Use dependency annotations to infer trace-level dependencies
Prior work:
• Allows dependency annotations for individual workflow steps
• Rules for extracting trace-level invocation dependencies
• Requires each step to be (fully) annotated
Current contributions focus on workflow design:
1. Allow partially annotated workflow specifications
2. Infer complete sets of (possible) annotations
3. Validate correctness of annotations
Workflow Specifications
Minimally, a workflow specification $W = (P, D, E)$ consists of
• a set $P$ of program blocks (computation steps)
• a set $D$ of data blocks (data items or containers)
• a set $E \subseteq P \times L \times D \times \{in, out\}$ of uniquely labeled edges
[Figure: program blocks p1 and p2 connected through data block d1 by edges labeled x1 and x2]
We use $in(p_i, x_i, d_i)$ and $out(p_j, x_j, d_j)$ for input and output edges
• where $x_i, x_j$ are labels in $L$
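A minimal Python sketch of this definition is given below. The class names `Edge` and `Workflow`, the `problems` check, and the wiring of the example (p1 writes d1 via x1, p2 reads d1 via x2) are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    """One element of E: (program block, label, data block, direction)."""
    program: str
    label: str
    data: str
    direction: str  # "in" or "out"

@dataclass(frozen=True)
class Workflow:
    programs: frozenset  # P
    data: frozenset      # D
    edges: frozenset     # E

    def problems(self):
        """Return a sorted list of violations; empty means W is well formed."""
        found = set()
        label_counts = Counter(e.label for e in self.edges)
        for e in self.edges:
            if e.program not in self.programs:
                found.add(f"{e.label}: unknown program block {e.program}")
            if e.data not in self.data:
                found.add(f"{e.label}: unknown data block {e.data}")
            if e.direction not in ("in", "out"):
                found.add(f"{e.label}: bad direction {e.direction}")
            if label_counts[e.label] > 1:
                found.add(f"{e.label}: label is not unique")
        return sorted(found)

# Assumed reading of the figure: p1 writes d1 via x1, and p2 reads d1 via x2.
w = Workflow(
    programs=frozenset({"p1", "p2"}),
    data=frozenset({"d1"}),
    edges=frozenset({Edge("p1", "x1", "d1", "out"), Edge("p2", "x2", "d1", "in")}),
)
print(w.problems())
```

Keeping labels unique is what lets annotations refer to edges by label alone.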
Dependency Annotations
Dependency annotations $A \subseteq L_{out} \times L_{in} \times T$ for a workflow $W$ ...
• associate dependency types $t \in T$ (more later)
• to input-output edge pairs of $W$ (identified by their labels)
We use dep_rule($x_i$, $x_j$, $t$) for annotations $x_i \xrightarrow{t} x_j$ (drawn in red)
[Figure: workflow with steps gen and filter, data blocks d1, d2, d3, and c, and edge labels n, r, v1, v2, and cutoff; annotations n → r (DependsOn), v1 → v2 (CopyOf), and cutoff → v2 (DependsOn)]
• dep_rule(n, r, depends_on), dep_rule(v1, v2, copy_of),
dep_rule(cutoff, v2, depends_on)
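The three annotations above can be written down as plain tuples, as in the following sketch. The edge wiring in `EDGES` is an assumed reading of the figure, and `block_for` and `is_block_level` are hypothetical helpers that tell block-level annotations apart from ones spanning several blocks.

```python
# Edges as (program, label, data, direction). The wiring is an assumed reading
# of the figure: d1 -n-> gen -r-> d2 -v1-> filter -v2-> d3, c -cutoff-> filter.
EDGES = {
    ("gen", "n", "d1", "in"),
    ("gen", "r", "d2", "out"),
    ("filter", "v1", "d2", "in"),
    ("filter", "cutoff", "c", "in"),
    ("filter", "v2", "d3", "out"),
}

# Annotations as (input label, output label, dependency type).
DEP_RULES = {
    ("n", "r", "depends_on"),
    ("v1", "v2", "copy_of"),
    ("cutoff", "v2", "depends_on"),
}

def block_for(label, direction):
    """Program block that owns the edge with this label and direction, if any."""
    for program, edge_label, _data, edge_direction in EDGES:
        if edge_label == label and edge_direction == direction:
            return program
    return None

def is_block_level(rule):
    """True if the rule pairs an input and an output of the same program block."""
    in_label, out_label, _dep_type = rule
    block = block_for(in_label, "in")
    return block is not None and block == block_for(out_label, "out")

for rule in sorted(DEP_RULES):
    print(rule, is_block_level(rule))
```

A rule such as ("n", "v2", ...) would pair an input of gen with an output of filter, so it is not block-level.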
Dependency Types
We consider five different dependency annotation types ... (*, †)
FlowsFrom: input present during invocation (e.g., a trigger)
DependsOn: output has control (statement) dependency on input
DerivedFrom: output has data (read-after-write) dependency on input
ValueOf: input value copied to the output (new data item)
SameAs: input copied to the output (same item “passed through”)
Ordered from weakest to strongest form of dependency ...
FlowsFrom $\prec$ DependsOn $\prec$ DerivedFrom $\prec$ ValueOf $\prec$ SameAs
Or as subclasses (e.g., FlowsFrom+ as “at least FlowsFrom”) ...
FlowsFrom+ $\sqsupseteq$ DependsOn+ $\sqsupseteq$ DerivedFrom+ $\sqsupseteq$ ValueOf+ $\sqsupseteq$ SameAs+
(*) Plus NotFlowsFrom, described later
(†) A more formal description is given in the paper
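The weakest-to-strongest ordering and the "at least" reading of each type can be sketched with an integer-ranked enum. The numeric ranks and the names `DepType`, `weaker_eq`, and `at_least` are assumptions for illustration; only the relative order of the five types comes from the slides.

```python
from enum import IntEnum

class DepType(IntEnum):
    """The five dependency types, ranked from weakest (1) to strongest (5)."""
    FLOWS_FROM = 1
    DEPENDS_ON = 2
    DERIVED_FROM = 3
    VALUE_OF = 4
    SAME_AS = 5

def weaker_eq(a, b):
    """True if type a is weaker than or equal to type b."""
    return a <= b

def at_least(t):
    """Members of the subclass t+: every type at least as strong as t."""
    return {u for u in DepType if u >= t}

print(sorted(u.name for u in at_least(DepType.DERIVED_FROM)))
# FlowsFrom+ contains DependsOn+ (the subclass view of the same ordering).
print(at_least(DepType.FLOWS_FROM) >= at_least(DepType.DEPENDS_ON))
```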
Reasoning using Dependency Composition
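Given two “connected” program blocks:
[Figure: program blocks p1 and p2 connected through data blocks d1, d2, and d3 via edges x1, x2, x3, and x4, with block annotations $t_i$ and $t_j$ and composite annotation $t$]
A composite (indirect) dependency $x_1 \xrightarrow{t} x_4$ is the weaker of the dependencies $x_1 \xrightarrow{t_i} x_2$ and $x_3 \xrightarrow{t_j} x_4$
$\mathrm{dep\_rule}(x_1, x_2, t_i) \wedge \mathrm{dep\_rule}(x_3, x_4, t_j) \wedge t_i \preceq t_j \leftrightarrow \mathrm{dep\_rule}(x_1, x_4, t_i)$
$\mathrm{dep\_rule}(x_1, x_2, t_i) \wedge \mathrm{dep\_rule}(x_3, x_4, t_j) \wedge t_j \preceq t_i \leftrightarrow \mathrm{dep\_rule}(x_1, x_4, t_j)$
This extends to longer “chains” of connected program blocks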
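Composition along a chain amounts to taking the weakest type on the chain, as in this sketch. The list `ORDER` and the helpers `compose` and `compose_chain` are illustrative names, not taken from the prototype.

```python
from functools import reduce

# Dependency types from weakest to strongest.
ORDER = ["FlowsFrom", "DependsOn", "DerivedFrom", "ValueOf", "SameAs"]
RANK = {t: i for i, t in enumerate(ORDER)}

def compose(t1, t2):
    """Composite of two consecutive dependencies: the weaker of the two."""
    return t1 if RANK[t1] <= RANK[t2] else t2

def compose_chain(types):
    """Composite type of a non-empty chain of block-level dependencies."""
    return reduce(compose, types)

print(compose("SameAs", "DependsOn"))
print(compose_chain(["DerivedFrom", "SameAs", "ValueOf"]))
```

Because taking the weaker type is associative and commutative, the order in which a chain is folded does not change the result.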
Dependency Composition with Multiple Paths
When multiple annotation “paths” exist ...
[Figure: workflow with program blocks p1 to p4, data blocks d1 to d5, and edge labels x1 to x9, in which two annotation paths connect the same input and output; the block annotations shown are DerivedFrom, FlowsFrom, DerivedFrom, SameAs, and DerivedFrom]
The composite annotation type is the strongest type of the paths
• the top path implies FlowsFrom
• the bottom path implies DerivedFrom
• the inferred type is DerivedFrom (i.e., “at least DerivedFrom”)
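With several paths, each path first composes to its weakest type, and the composite is then the strongest of those results. The sketch below assumes, for illustration only, which annotations lie on the top and bottom paths; `path_type` and `composite_type` are hypothetical helper names.

```python
# Dependency types from weakest to strongest.
ORDER = ["FlowsFrom", "DependsOn", "DerivedFrom", "ValueOf", "SameAs"]
RANK = {t: i for i, t in enumerate(ORDER)}

def path_type(path):
    """Type implied by one path: the weakest annotation along it."""
    return min(path, key=RANK.__getitem__)

def composite_type(paths):
    """Composite type over all paths: the strongest of the per-path types."""
    return max((path_type(p) for p in paths), key=RANK.__getitem__)

# Assumed assignment of the figure's annotations to the two paths.
top = ["DerivedFrom", "FlowsFrom", "DerivedFrom"]
bottom = ["DerivedFrom", "SameAs", "DerivedFrom"]
print(path_type(top), path_type(bottom), composite_type([top, bottom]))
```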
Use Case 1: Infer Composite Dependencies
Given annotations on blocks (steps), find composite annotations
• helps verify intent and construction of workflow
• e.g., that certain outputs are derived from inputs
[Figure: workflow with steps normalize and filter, data blocks d1 to d5, and edge labels xrange, x1 to x4, and xcutoff; annotations shown include DependsOn, SameAs, and DerivedFrom]
Inferred annotations shown in blue
Use Case 2: Constraining Dependency Annotations
Add annotations to constrain choices
• e.g., may know the output should be derived from the input
• which can guide (constrain) block-level annotation choices
• or guide the workflow design itself
[Figure: chain of program blocks p1 and p2 over data blocks d1, d2, and d3 (edges x1 to x4); the composite annotation is DerivedFrom, and each block-level annotation is marked “DerivedFrom, ValueOf, or SameAs?”]
Use Case 3: Validating Dependency Annotations
Ensure annotations are compatible
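• e.g., detect lower-level (block) annotations that are not consistent with a composite annotation (shown in purple)
[Figure: a “generate sample” step (edges xin, xtype, xiter, and xout; data d1, d2, dtype, and diter) carrying a DerivedFrom annotation, shown alongside its “initial sample” and “perturb” steps (edges xtype, n, x1, x2, s, and xiter), whose block annotations include DependsOn and DerivedFrom; a schematic with blocks p1 and p2 between din and dout]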
Dependency Reasoning Prototype Implementation
Answer-Set Programming (ASP) prototype in Potassco (clingo)
High level idea: use a generate-and-test algorithm
(i) “guess” annotations for non-annotated input-output pairs
(ii) ensure annotations satisfy composition rules
(iii) ensure annotations satisfy “strongest-path” constraint
Result is all possible and complete annotation sets (possible worlds)
(iv) find all annotations common to all worlds
(v) report possible choices for remaining input-output pairs
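Steps (i) to (v) can be imitated by brute-force enumeration, as in the Python sketch below for a two-block chain whose composite annotation is known to be DerivedFrom. This stands in for the clingo solver and is not the prototype's code; the pair names and the helpers `consistent`, `possible_worlds`, and `summarize` are assumptions. With a single path, the strongest-path constraint coincides with the composition test.

```python
from itertools import product

# Dependency types from weakest to strongest.
ORDER = ["FlowsFrom", "DependsOn", "DerivedFrom", "ValueOf", "SameAs"]
RANK = {t: i for i, t in enumerate(ORDER)}

# Input-output pairs of a two-block chain: p1 maps x1 to x2, p2 maps x3 to x4,
# and (x1, x4) is the composite pair spanning both blocks.
PAIRS = [("x1", "x2"), ("x3", "x4"), ("x1", "x4")]

def consistent(world):
    """Composition test: the composite type must be the weaker block type."""
    weaker = min(world[("x1", "x2")], world[("x3", "x4")], key=RANK.__getitem__)
    return world[("x1", "x4")] == weaker

def possible_worlds(given):
    """(i)-(iii): guess a type for each unannotated pair, keep consistent sets."""
    open_pairs = [p for p in PAIRS if p not in given]
    worlds = []
    for guess in product(ORDER, repeat=len(open_pairs)):
        world = {**given, **dict(zip(open_pairs, guess))}
        if consistent(world):
            worlds.append(world)
    return worlds

def summarize(worlds):
    """(iv)-(v): annotations shared by all worlds, and choices for the rest."""
    common, choices = {}, {}
    for pair in PAIRS:
        seen = {w[pair] for w in worlds}
        if len(seen) == 1:
            common[pair] = seen.pop()
        else:
            choices[pair] = sorted(seen, key=RANK.__getitem__)
    return common, choices

worlds = possible_worlds({("x1", "x4"): "DerivedFrom"})
common, choices = summarize(worlds)
print(len(worlds), common, choices)
```

For this input, each block-level pair is left with the choices DerivedFrom, ValueOf, or SameAs, since anything weaker would weaken the composite.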
Prototype Implementation (cont)
The following “choice rule” guesses annotations
{dep_rule(I,O,R) : dep_type(R)} = 1 :- up_stream(I,O).
The up_stream relation finds all possible input-output pairs
up_stream(I,O) :- in(I,P,_), out(O,P,_).
up_stream(I,O) :- in(I,P1,_), out(O1,P1,D1),
in(I2,P2,D1), up_stream(I2,O).
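For readers less familiar with ASP, the two up_stream rules can be read as a fixpoint computation. The Python sketch below mirrors them over (label, program, data) facts; the function name and the two-block example facts are assumptions for illustration.

```python
def up_stream(ins, outs):
    """Compute all (input label, output label) pairs with the input upstream
    of the output. ins and outs hold (label, program, data) facts."""
    # Base rule: an input and an output of the same program block.
    pairs = {(i, o) for (i, p, _) in ins for (o, q, _) in outs if p == q}
    changed = True
    while changed:
        changed = False
        # Recursive rule: I enters P1, P1 writes D1, I2 reads D1,
        # and I2 is already known to be upstream of O.
        for (i, p1, _) in ins:
            for (_o1, q1, d1) in outs:
                if q1 != p1:
                    continue
                for (i2, _p2, d2) in ins:
                    if d2 != d1:
                        continue
                    for (i3, o) in list(pairs):
                        if i3 == i2 and (i, o) not in pairs:
                            pairs.add((i, o))
                            changed = True
    return pairs

# Hypothetical two-block chain: d1 -x1-> p1 -x2-> d2 -x3-> p2 -x4-> d3.
ins = {("x1", "p1", "d1"), ("x3", "p2", "d2")}
outs = {("x2", "p1", "d2"), ("x4", "p2", "d3")}
print(sorted(up_stream(ins, outs)))
```

The choice rule then guesses exactly one dependency type for each of these pairs.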
The following constraint ensures composition rules are satisfied
:- dep_rule(I,O,R), not valid_dep_path(I,O,R).
Prototype Implementation (cont)
The valid_dep_path relation finds valid compositions
valid_dep_path(I,O,R) :- in(I,P,_), out(O,P,_),
dep_rule(I,O,R).
valid_dep_path(I,O,R) :- in(I,P,_), out(O1,P,_), O != O1,
dep_rule(I,O1,R1), connected(O1,I1),
I != I1, valid_dep_path(I1,O,R2),
compose(R1,R2,R).
The connected relation ensures an output is connected to an input
connected(O,I) :- out(O,_,D), in(I,_,D).
compose computes composition (where weaker_eq implements $\preceq$)
compose(R1,R2,R1) :- weaker_eq(R1,R2).
compose(R1,R2,R2) :- weaker_eq(R2,R1).
Prototype Implementation (cont)
Finally, the following constraint ensures “strongest” paths
:- dep_rule(I,O,R), valid_dep_path(I,O,R1),
weaker_eq(R,R1), R != R1.
Recently added NotFlowsFrom type (e.g., for subworkflows)
• Required only minimal changes: NotFlowsFrom $\prec$ FlowsFrom
• Full subworkflow support not yet implemented (future work)
[Figure: example with program blocks p1 and p2, data blocks d1 to d4, and edges x1 to x4]
Preliminary Performance Results
(1) Increase the depth of the workflow (2-50 steps) and % of block annotations
[Figure: workflow template with blocks ps, ds, ..., pe, de]
(2) Increase the width of the workflow (2-50 steps) and % of block annotations
[Figure: workflow template with blocks ..., pe, de]
Future Work
Add dependency annotations to YesWorkflow’s annotation types
• combine schema-level support and extend trace-level support
Apply schema-level dependency annotations to workflows in YW
• we can now do this, e.g., for paleocar (with NotFlowsFrom)
• extend annotation types as needed
Develop specialized reasoning support (as needed)
• ASP great for prototyping!
• but can improve performance with dedicated implementation
Dr. Shawn Bowers presenting the paper on July 10th, 2018 at IPAW, King’s College, London, UK.

State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
 
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
一比一原版(Glasgow毕业证书)格拉斯哥大学毕业证如何办理
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 

Validation and Inference of Schema-Level Workflow Data-Dependency Annotations

  • 1. Validation and Inference of Schema-Level Workflow Data-Dependency Annotations. Shawn Bowers (1), Timothy McPhillips (2), Bertram Ludäscher (2). (1) Dept. of Computer Science, Gonzaga University; (2) School of Information Sciences, University of Illinois, Urbana-Champaign. IPAW 2018.
  • 2. Scientific Workflows and Provenance
    A workflow specification is modeled as a graph of computation steps (nodes) and data/control flow (edges).
    [Figure: example workflow graphs, one with step gen_boundary_region and data boundary_coordinates, user_map_marker_pos, and prism_data (file:data/112W36N.nc); one with steps gen and filter and data d1, d2, d3, and c]
    Steps are often “black boxes” (they invoke external programs).
  • 3. Scientific Workflows and Provenance
    During a workflow execution, systems record “provenance” information:
    - invocations of steps
    - data received/produced by steps
    A workflow trace is modeled as a graph of invocations and corresponding data:
    - a trace is a specification instance
    - capturing details of a workflow run
    [Figure: different traces of the same specification, showing invocations of gen and filter]
  • 4. Data Dependency Assumptions and Issues
    Traces are used to infer the “lineage” of data products (*):
    - e.g., all steps and inputs/outputs that led to an output
    - assume all outputs “depend on” all inputs of a step
    [Figure: example trace with invocations of gen and filter]
    However, the inferred “dependencies” can be incorrect and vague:
    1. some outputs might not “depend on” all inputs
    2. outputs can depend on inputs differently (derivation, copy, ...)
    (*) Some systems provide APIs for steps to declare dependencies at runtime.
  • 7. Prospective (Schema-Level) Dependency Annotations
    Our approach:
    - Allow workflow authors to specify dependency patterns (annotations)
    - Support different data dependency types
    - Use dependency annotations to infer trace-level dependencies
    Prior work:
    - Allows dependency annotations for individual workflow steps
    - Rules for extracting trace-level invocation dependencies
    - Requires each step to be (fully) annotated
    Current contributions focus on workflow design:
    1. Allow partially annotated workflow specifications
    2. Infer complete sets of (possible) annotations
    3. Validate correctness of annotations
  • 8. Workflow Specifications
    Minimally, a workflow specification W = (P, D, E) consists of:
    - a set P of program blocks (computation steps), e.g., p1
    - a set D of data blocks (data items or containers), e.g., d1
    - a set E ⊆ P × L × D × {in, out} of uniquely labeled edges
    [Figure: program blocks p1 and p2 connected through data block d1 by edges labeled x1 and x2]
    We write in(pi, xi, di) and out(pj, xj, dj) for input and output edges, where xi, xj are labels in L.
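The definition of W = (P, D, E) can be sketched as a small data structure. This is an illustrative Python sketch, not the authors' implementation (their prototype is written in ASP); the names `Edge` and `make_workflow` are hypothetical, and the example assumes p1 writes d1 and p2 reads it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One element of E, a tuple in P x L x D x {in, out}."""
    program: str    # program block p in P
    label: str      # edge label x in L
    data: str       # data block d in D
    direction: str  # "in" or "out"


def make_workflow(edges):
    """Build W = (P, D, E) from a list of edges, enforcing unique labels."""
    labels = [e.label for e in edges]
    if len(labels) != len(set(labels)):
        raise ValueError("edge labels must be unique")
    for e in edges:
        if e.direction not in ("in", "out"):
            raise ValueError(f"bad direction: {e.direction!r}")
    programs = {e.program for e in edges}
    data = {e.data for e in edges}
    return programs, data, set(edges)


# Assumed reading of the slide's figure: p1 writes d1 (x1), p2 reads d1 (x2).
W = make_workflow([
    Edge("p1", "x1", "d1", "out"),
    Edge("p2", "x2", "d1", "in"),
])
```

Deriving P and D from the edge list keeps the three components consistent by construction, at the cost of not representing isolated blocks.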
  • 9. Dependency Annotations
    Dependency annotations A ⊆ Lout × Lin × T for a workflow W:
    - associate dependency types t ∈ T (more later)
    - with input-output edge pairs of W (identified by their labels)
    We write dep_rule(xi, xj, t) for an annotation xi -[t]-> xj (drawn in red).
    [Figure: steps gen and filter with data d1, d2, d3, and c; edges labeled cutoff, n, r, v1, v2; annotations DependsOn, CopyOf, DependsOn]
    - dep_rule(n, r, depends_on), dep_rule(v1, v2, copy_of), dep_rule(cutoff, v2, depends_on)
  • 12. Dependency Types
    We consider five different dependency annotation types (*, †):
    - FlowsFrom: input present during invocation (e.g., a trigger)
    - DependsOn: output has control (statement) dependency on input
    - DerivedFrom: output has data (read-after-write) dependency on input
    - ValueOf: input value copied to the output (new data item)
    - SameAs: input copied to the output (same item “passed through”)
    Ordered from weakest to strongest form of dependency:
    FlowsFrom ≺ DependsOn ≺ DerivedFrom ≺ ValueOf ≺ SameAs
    Or as subclasses (e.g., FlowsFrom+ as “at least FlowsFrom”):
    FlowsFrom+ ⊒ DependsOn+ ⊒ DerivedFrom+ ⊒ ValueOf+ ⊒ SameAs+
    (*) Plus NotFlowsFrom, described later.
    (†) A more formal description is given in the paper.
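One way to picture the weakest-to-strongest ordering and the “at least” classes is an ordered enumeration. The sketch below is an illustration in Python rather than the prototype's ASP; the numeric ranks and the helper names `weaker_eq` and `at_least` are assumptions made for the example.

```python
from enum import IntEnum


class DepType(IntEnum):
    """The five dependency types, ranked from weakest (1) to strongest (5)."""
    FLOWS_FROM = 1
    DEPENDS_ON = 2
    DERIVED_FROM = 3
    VALUE_OF = 4
    SAME_AS = 5


def weaker_eq(a, b):
    """True if a is weaker than or equal to b in the ordering."""
    return a <= b


def at_least(t):
    """The class t+ ("at least t"): t together with every stronger type."""
    return {u for u in DepType if u >= t}
```

Under this reading, `at_least(DepType.SAME_AS)` is contained in `at_least(DepType.FLOWS_FROM)`, which matches the subclass chain on the slide.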
  • 13. Reasoning using Dependency Composition
    Given two “connected” program blocks:
    [Figure: p1 with edges x1 (to d1) and x2 (to d2); p2 with edges x3 (to d2) and x4 (to d3); annotations ti on (x1, x2), tj on (x3, x4), and composite t on (x1, x4)]
    A composite (indirect) dependency x1 -[t]-> x4 is the weaker of the dependencies x1 -[ti]-> x2 and x3 -[tj]-> x4:
    dep_rule(x1, x2, ti) ∧ dep_rule(x3, x4, tj) ∧ ti ⪯ tj ↔ dep_rule(x1, x4, ti)
    dep_rule(x1, x2, ti) ∧ dep_rule(x3, x4, tj) ∧ tj ⪯ ti ↔ dep_rule(x1, x4, tj)
    This extends to longer “chains” of connected program blocks.
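The composition rule amounts to taking the weaker of two types, folded along a chain. A minimal Python sketch under that reading follows; it is illustrative only, and `compose_chain` is a hypothetical helper that does not appear in the slides.

```python
from functools import reduce

# Types listed from weakest to strongest, as on the Dependency Types slide.
TYPES = ["FlowsFrom", "DependsOn", "DerivedFrom", "ValueOf", "SameAs"]
RANK = {t: i for i, t in enumerate(TYPES)}


def compose(t1, t2):
    """Composite of two consecutive dependencies: the weaker of the two."""
    return t1 if RANK[t1] <= RANK[t2] else t2


def compose_chain(types):
    """Composite along a non-empty chain of connected program blocks."""
    return reduce(compose, types)
```

For instance, a SameAs step followed by a DerivedFrom step composes to DerivedFrom, and adding a DependsOn step anywhere in the chain lowers the result to DependsOn.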
  • 14. Dependency Composition with Multiple Paths
    When multiple annotation “paths” exist:
    [Figure: workflow with program blocks p1 to p4, data blocks d1 to d5, and edges x1 to x9, annotated with FlowsFrom, DerivedFrom, and SameAs along two paths]
    The composite annotation type is the strongest type of the paths:
    - the top path implies FlowsFrom
    - the bottom path implies DerivedFrom
    - the inferred type is DerivedFrom (i.e., “at least DerivedFrom”)
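The multi-path rule combines two steps: each path contributes its weakest annotation, and the composite is the strongest of those per-path results. The Python sketch below illustrates this. The per-edge annotations assigned to the two paths are assumed for the example (the figure's exact assignment is not recoverable from the transcript); only the per-path outcomes, FlowsFrom and DerivedFrom, come from the slide.

```python
# Types listed from weakest to strongest.
TYPES = ["FlowsFrom", "DependsOn", "DerivedFrom", "ValueOf", "SameAs"]
RANK = {t: i for i, t in enumerate(TYPES)}


def path_type(path):
    """Composite type of a single path: its weakest annotation."""
    return min(path, key=RANK.__getitem__)


def composite_type(paths):
    """Composite type over several paths: the strongest per-path type."""
    return max((path_type(p) for p in paths), key=RANK.__getitem__)


# Assumed annotations along each path (hypothetical assignment).
top_path = ["DerivedFrom", "FlowsFrom", "DerivedFrom"]
bottom_path = ["DerivedFrom", "SameAs", "DerivedFrom"]
inferred = composite_type([top_path, bottom_path])
```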
  • 15. Use Case 1: Infer Composite Dependencies
    Given annotations on blocks (steps), find composite annotations:
    - helps verify intent and construction of the workflow
    - e.g., that certain outputs are derived from inputs
    [Figure: steps normalize and filter with data d1 to d5 and edges xrange, x1, x2, x3, x4, xcutoff; annotations DependsOn, SameAs, and DerivedFrom; inferred annotations shown in blue]
  • 16. Use Case 2: Constraining Dependency Annotations
    Add annotations to constrain choices:
    - e.g., we may know the output should be derived from the input
    - which can guide (constrain) block-level annotation choices
    - or guide the workflow design itself
    [Figure: p1 and p2 with data d1, d2, d3 and edges x1 to x4; composite annotation DerivedFrom; each block-level annotation marked “DerivedFrom, ValueOf, or SameAs?”]
  • 17. Use Case 3: Validating Dependency Annotations
    Ensure annotations are compatible:
    - e.g., lower-level (block) annotations are not consistent with a composite annotation (shown in purple)
    [Figure: a “generate sample” step annotated DerivedFrom, shown alongside a lower-level workflow with steps “initial sample” and “perturb” annotated DependsOn and DerivedFrom]
  • 18. Dependency Reasoning Prototype Implementation
    Answer-Set Programming (ASP) prototype in Potassco (clingo).
    High-level idea: use a generate-and-test algorithm:
    (i) “guess” annotations for non-annotated input-output pairs
    (ii) ensure annotations satisfy composition rules
    (iii) ensure annotations satisfy the “strongest-path” constraint
    The result is all possible and complete annotation sets (possible worlds):
    (iv) find all annotations common to all worlds
    (v) report possible choices for remaining input-output pairs
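The generate-and-test idea can be illustrated by brute force on a two-step chain. The sketch below is Python rather than the prototype's ASP, and it hard-codes an assumed workflow (p1 with edges x1, x2 feeding p2 with edges x3, x4); because that workflow has a single path, steps (ii) and (iii) collapse into one check. The names `possible_worlds` and `summarize` are hypothetical.

```python
from itertools import product

# Types listed from weakest to strongest.
TYPES = ["FlowsFrom", "DependsOn", "DerivedFrom", "ValueOf", "SameAs"]
RANK = {t: i for i, t in enumerate(TYPES)}

# Input-output pairs of the assumed chain: two block-level, one composite.
PAIRS = [("x1", "x2"), ("x3", "x4"), ("x1", "x4")]


def weaker(t1, t2):
    return t1 if RANK[t1] <= RANK[t2] else t2


def possible_worlds(known):
    """Enumerate complete annotation sets consistent with `known`."""
    worlds = []
    for guess in product(TYPES, repeat=len(PAIRS)):       # (i) guess
        world = dict(zip(PAIRS, guess))
        if any(world[p] != t for p, t in known.items()):  # keep given annotations
            continue
        # (ii)/(iii): with one path, the composite must be the weaker type.
        if world[("x1", "x4")] != weaker(world[("x1", "x2")], world[("x3", "x4")]):
            continue
        worlds.append(world)
    return worlds


def summarize(worlds):
    """(iv) annotations common to all worlds; (v) remaining choices."""
    common, choices = {}, {}
    for pair in PAIRS:
        options = {w[pair] for w in worlds}
        if len(options) == 1:
            common[pair] = next(iter(options))
        else:
            choices[pair] = options
    return common, choices


worlds = possible_worlds({("x1", "x4"): "DerivedFrom"})
common, choices = summarize(worlds)
```

With only the composite annotation given, each block-level pair is left with the choices DerivedFrom, ValueOf, or SameAs, as in Use Case 2. An empty list of worlds signals incompatible annotations, which is the validation scenario of Use Case 3.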
  • 19. Prototype Implementation (cont.)
    The following “choice rule” guesses annotations:
        {dep_rule(I,O,R) : dep_type(R)} = 1 :- up_stream(I,O).
    The up_stream relation finds all possible input-output pairs:
        up_stream(I,O) :- in(I,P,_), out(O,P,_).
        up_stream(I,O) :- in(I,P1,_), out(O1,P1,D1), in(I2,P2,D1), up_stream(I2,O).
    The following constraint ensures composition rules are satisfied:
        :- dep_rule(I,O,R), not valid_dep_path(I,O,R).
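For readers less familiar with ASP, the two up_stream rules describe a reachability closure. The Python sketch below mirrors them as a fixpoint computation. It is an illustration and not part of the prototype; the facts follow the rules' argument order (label, program, data), and the example workflow is assumed.

```python
def up_stream(ins, outs):
    """Input-output label pairs (I, O) such that O is reachable from I.

    `ins` and `outs` are sets of (label, program, data) facts.
    """
    # Base rule: an input and an output of the same program block.
    pairs = {(i, o) for (i, p, _) in ins for (o, q, _) in outs if p == q}
    # Recursive rule, iterated until no new pair is found: I reaches O if
    # I's block writes data D1 that some input I2 reads, and I2 reaches O.
    while True:
        new = {
            (i, o)
            for (i, p1, _) in ins
            for (o1, q1, d1) in outs if q1 == p1
            for (i2, _, d2) in ins if d2 == d1
            for (j, o) in pairs if j == i2
        } - pairs
        if not new:
            return pairs
        pairs |= new


# Assumed chain: p1 reads d1 (x1), writes d2 (x2); p2 reads d2 (x3), writes d3 (x4).
ins = {("x1", "p1", "d1"), ("x3", "p2", "d2")}
outs = {("x2", "p1", "d2"), ("x4", "p2", "d3")}
result = up_stream(ins, outs)
```

On this chain the closure contains the two block-level pairs plus the composite pair (x1, x4), which are exactly the pairs the choice rule must annotate.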
  • 20. Prototype Implementation (cont.)
    The valid_dep_path relation finds valid compositions:
        valid_dep_path(I,O,R) :- in(I,P,_), out(O,P,_), dep_rule(I,O,R).
        valid_dep_path(I,O,R) :- in(I,P,_), out(O1,P,_), O != O1,
            dep_rule(I,O1,R1), connected(O1,I1), I != I1,
            valid_dep_path(I1,O,R2), compose(R1,R2,R).
    The connected relation ensures an output is connected to an input:
        connected(O,I) :- out(O,_,D), in(I,_,D).
    compose computes composition (where weaker_eq implements ⪯):
        compose(R1,R2,R1) :- weaker_eq(R1,R2).
        compose(R1,R2,R2) :- weaker_eq(R2,R1).
  • 21. Prototype Implementation (cont.)
    Finally, the following constraint ensures “strongest” paths:
        :- dep_rule(I,O,R), valid_dep_path(I,O,R1), weaker_eq(R,R1), R != R1.
    Recently added NotFlowsFrom type (e.g., for subworkflows):
    - required only minimal changes: NotFlowsFrom ≺ FlowsFrom
    - full subworkflow support not yet implemented (future work)
    [Figure: workflow with program blocks p1 and p2, data blocks d1 to d4, and edges x1 to x4]
  • 22. Preliminary Performance Results
    (1) Increase the depth of the workflow (2-50 steps) and the percentage of block annotations
    [Figure: deep test workflow with blocks ps, ds, ..., pe, de]
    (2) Increase the width of the workflow (2-50 steps) and the percentage of block annotations
    [Figure: wide test workflow with blocks ..., pe, de]
  • 23. Future Work
    Add dependency annotations to YesWorkflow’s annotation types:
    - combine schema-level support and extend trace-level support
    Apply schema-level dependency annotations to workflows in YesWorkflow (YW):
    - we can now do this, e.g., for paleocar (with NotFlowsFrom)
    - extend annotation types as needed
    Develop specialized reasoning support (as needed):
    - ASP is great for prototyping
    - but performance can be improved with a dedicated implementation
  • 24. Dr. Shawn Bowers presenting the paper on July 10th, 2018 at IPAW, King’s College, London, UK.