SlideShare a Scribd company logo
1 of 29
Requirements for Supporting
the Iterative Exploration of
Scientific Workflow Variants
Lucas Carvalho1
, Bakinam T. Essawy2
,
Daniel Garijo3
, Claudia B. Medeiros1
, Yolanda Gil3
lucas.carvalho@ic.unicamp.br
1
University of Campinas, Institute of Computing, Brazil
2
University of Virginia, Department of Civil and Environmental Engineering, U.S.A
3
University of Southern California, Information Sciences Institute, U.S.A
Problem: Iterative Exploration in Science
Trial-and-Error
Exploration
Improvements
& Corrections
Model BModel A
Model A
Versions:
1.0 2.0 2.1 3.0
Model A
Diff Data
Diff versions
Diff models
Start:
Model A v2.1
Trial-and-Error
Exploration
Improvements
& Corrections
Model B
Versions:
1.0 2.0 2.0 3.0
Model B
Diff Data
Current:
Model B v3.0
Diff versions
2
Capturing Analyses as Workflows
James_Rivanna_RCH.tifRecharge
Model
MODFLOW-NWT v1.0.2
Software component
MODFLOW-NWT.exe 3
Workflow Variants
new
variant
4
Software Components and Interfaces
Software package
encapsulates a set of related
functions (or data). E.g.,
FloPy v.3.3.6
Interface regards the
function encapsulated by a
software component,
namely, its inputs, outputs,
parameters and data types.
5
Motivating Scenarios
1. Same i/o, replacing model version
i. Bug fixes
ii. More accurate results
2. Different i/o, same model
i. Additional inputs/outputs
ii. Replace inputs/outputs
3. Different models, diff i/o, diff
steps
i. New model is known, it has been
selected already
ii. New model has to be chosen among
several candidates
Model A
Versions:
1.0 2.0 2.1 3.0
Diff versions
Diff Data
Model A
Model BModel A
Diff models
6
Motivating Scenario 1:
Same i/o, replacing model version
Subcases:
1. Bug Fixes
2. More accurate results
new
variant
7
Motivating Scenario 1:
Same i/o, replacing model version
Subcases:
1. Bug Fixes
2. More accurate results
Workflow Variant
• From version 1.0.2
• To version 1.0.3
Changes in the MODFLOW-NWT component
Compare
executions
Compare: functionalities,
interface, quality, ...
MODFLOW-NWT
Versions:
1.0 ... 1.0.2 1.0.3 ... 1.1.3
Diff versions
8
Motivating Scenario 1:
Requirements
R1 – Version descriptions need to capture  useful metadata of
the software interface.
R2 – Scientists need to understand differences in interfaces
between software versions.
R3 – Scientists need to be alerted about relevant updates of
software used in their workflows.
R4 – Workflow descriptions need to capture the software,
software version, and functions used in the implementation
of workflow components.
R5 - Scientists need to understand how new workflow
variants can be used to correct errors in prior results.
9
Motivating Scenario 1:
Requirements
R6 – Scientists should be able to easily replace a component of
the workflow with a new one when the interfaces of the
components are the same.
R7 - Given a software package that can be used to create
many workflow components, scientists need to easily figure
out how to implement new variants of a workflow
component with newer versions of that package.
R8 – Scientists should be able to easily create new variants of
workflow components and relate them to each other.
R9 – Scientists should be able to easily create new workflow
variants and relate them to each other.
10
Motivating Scenario 1:
Requirements
R10 – Scientists should be able to relate changes in software
to specific workflow results, so it is clear how new software
versions affect calculated variables to produce wrong values.
R11 – Version descriptions need to capture bug fixes and
known bugs and relate them to software features and input
and output file variables.
R12 – Scientists need a summarization of changes between a
given software version and a newer version to understand
their differences without need to understand the changes
associated to each version in between those.
R13 – Scientists need to understand any incompatibilities
between versions of different software packages used to
implement a workflow component.
11
Motivating Scenario 2
Different i/o, same model
Subcases:
1. Additional inputs/outputs
2. Replace inputs/outputs
12
Motivating Scenario 2
Different i/o, same model
Subcases:
1. Additional inputs/outputs
new
variant
Motivating Scenario 2
Different i/o, same model
Subcases:
2. Replace inputs/outputs
new
variant
Motivating Scenario 2
Different i/o, same model
Subcases:
2. Replace inputs/outputs
Changes to add snowmelt
- MODFLOW-NWT => Infiltration package
- Recharge and Infiltration are incompatible
- Recharge + Snowmelt => Infiltration
+ Snowmelt Data
MODFLOW-NWT
Motivating Scenario 2
Requirements
R14 – Scientists need to easily find software or methods
that process a specific data input. 
R15 – Scientists need to easily find workflow components
for data conversion.
R16 – Scientists need to know whether a workflow
variant is still valid. 
R17 – Version descriptions to capture incompatible
packages or libraries used to implement a workflow
component.
R18 – Scientists need to be able to understand the
differences between two workflow variants.
16
Motivating Scenario 3
Different i/o, diff models and diff steps
Subcases:
1. New model is known, it has been selected already.
new
variant
Motivating Scenario 3
Different i/o, diff models and diff steps
Subcases:
1. New model is known, it has been selected already.
MIKE-SHEMODFLOW
Diff models
Changes to use MIKE-SHE
Motivating Scenario 3
Different i/o, diff models and diff steps
Subcases:
2. New model has to be chosen among several
candidates.
MIKE-SHEMODFLOW
Diff models
PIHM TOPOFLOW
19
Motivating Scenario 3
Requirements
R19 – Version descriptions need to capture assumptions used
in software.
R20 – Version descriptions need to capture computational
methods in a software and their features and interface. 
R21 – Workflow components, inputs, outputs or parameters in
new workflow variants that are no longer needed need to be
removed.
R22 – Scientists need to assess and compare the effort in
creating new workflow variants that represent a significant
departure from previous ones.
R23 – Scientists need to find and compare equivalent models.
R24 – Scientists need to understand the commonalities and
differences between the variables used and generated by
distinct computational models.
20
Summary of Requirements
1. Workflow component metadata
2. Workflow updates
3. Workflow comparisons
21
Summary of Requirements
Workflow component metadata
Representation and metadata of software regarding its versions,
interfaces and functionalities.
Representation and metadata for workflows, workflow variants
and workflow components.
Requirement Scenarios
S
1
S
2
S
3
R1 – Version descriptions need to capture  useful metadata of software interface. X X X
R2 – Scientists need to understand differences in interfaces between the same or distinct software and
software version.
X X
R3 – Scientists need to be alerted about relevant changes that influence how a software works. X
R4 – Workflow descriptions need to capture the software, software version, and functions used in the
implementation of workflow components.
X X X
R8 – Scientists should be able to create new variants of workflow components and relate them to each
other.
X X X
R9 – Scientists should be able to easily create new workflow variants and relate them to each other. X X X
R10 – Scientists should be able to relate changes in software to specific variables, so it is clear which
versions affect calculated variables to produce wrong values.
X
22
Summary of Requirements
Workflow component metadata
Requirement Scenarios
S1 S
2
S3
R11 – Version descriptions need to capture bug fixes and known bugs and relate them to software
features and input and output file variables.
X
R12 – Scientists need a summarization of changes between a given software version and a target
version to understand their differences without need to understand the changes associated to each
version.
X
R13 – Version descriptions to capture the dependency between software versions used in workflows. X
R17 – Version descriptions to capture incompatible inputs that cannot be used at the same time in a
workflow component.
X X
R19 – Version descriptions need to capture assumptions used in software. X
R20 – Version descriptions need to capture computational methods in software and their features and
interface.
x x X
R24 – Scientists need to understand the commonalities and differences between the variables used and
generated by distinct computational models.
x
23
Summary of Requirements
Workflow Updates
The creation of new workflow variants by replacing, adding, or
removing workflow components
The propagation of the effects of those changes throughout the
structure of the workflow and the validation of the new variants.
Requirement Scenarios
S1 S2 S3
R6 – Scientists should be able to easily replace a component of the workflow with a new one when the
interface of the components is the same.
X
R7 – Given a software package that can be used to create many workflow components, scientists need
to easily figure out how to implement new variants of a workflow component with newer versions of the
software.
X
R14 – Scientists need to easily find a software component that process a specific data input. X X
R15 – Scientists need to easily find workflow components for data conversion. X X
R16 – Scientists need to know whether the workflow variant is valid. X X
R21 – No longer needed components, inputs, outputs or parameters in workflow  variants need to be
removed after changes.
X
R22 – Scientists need to assess and compare the effort in creating new workflow variants that
represent a significant departure from previous ones.
X
R23 – Scientists need to find and compare equivalent methods. X 24
Summary of Requirements
Requirement Scenarios
S1 S2 S3
R5 – Scientists need to understand effects to the results of changes performed in workflows. X X X
R18 – Scientists need to be able to understand the differences between two workflow variants. X X X
Workflow Comparisons
The comparison between different software versions,
software packages, workflow variants and workflow runs.
25
Discussion and Research Directions
1. Describing workflow components and their underlying
software
• The creation and adaptation of existing ontologies to capture
information about software versions and variants.
1. Managing and tracking workflow variants and their
differences
• How to compare workflow components and workflow variants
and present these results in a usable way.
• The use of multi-media narratives that combine text, graphic
diagrams, and visualizations to explain in a human-readable.
• Narrative easily customized to the reader’s level of expertise
and interest.
• Manage histories of creation and evolution of workflow variants
26
Discussion and Research Directions
3. Designing an interactive framework to support scientists in
the exploration and experimentation process through
workflow variants
• Leverage workflow reuse and composition to support the
creation of workflow variants.
• Mechanisms to identify critical and non-critical components in
workflows.
27
Conclusions
The need to support scientists in exploring different
experiment designs over time.
Several scenarios where an initial workflow is modified
to create workflow variants by replacing, adding or
removing workflow steps in Hydrology.
Requirements generic enough for other domains.
Research directions to address the requirements.
28
Acknowledgments
US National Science Foundation (NSF)
Sao Paulo Research Foundation (FAPESP)
IS-GEO (Intelligent Systems for GeoSciences)
29

More Related Content

Similar to Requirements for Supporting the Iterative Exploration of Scientific Workflow Variants

Software Engineering with Objects (M363) Final Revision By Kuwait10
Software Engineering with Objects (M363) Final Revision By Kuwait10Software Engineering with Objects (M363) Final Revision By Kuwait10
Software Engineering with Objects (M363) Final Revision By Kuwait10Kuwait10
 
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...dgarijo
 
A Comparative Study of Forward and Reverse Engineering
A Comparative Study of Forward and Reverse EngineeringA Comparative Study of Forward and Reverse Engineering
A Comparative Study of Forward and Reverse Engineeringijsrd.com
 
Determan SummerSim_submit_rev3
Determan SummerSim_submit_rev3Determan SummerSim_submit_rev3
Determan SummerSim_submit_rev3John Determan
 
Fundamentals of software development
Fundamentals of software developmentFundamentals of software development
Fundamentals of software developmentPratik Devmurari
 
Software enginneering
Software enginneeringSoftware enginneering
Software enginneeringchirag patil
 
[2015/2016] Modern development paradigms
[2015/2016] Modern development paradigms[2015/2016] Modern development paradigms
[2015/2016] Modern development paradigmsIvano Malavolta
 
System analsis and design
System analsis and designSystem analsis and design
System analsis and designRizwan Kabir
 
LIFT: A Legacy InFormation retrieval Tool
LIFT: A Legacy InFormation retrieval ToolLIFT: A Legacy InFormation retrieval Tool
LIFT: A Legacy InFormation retrieval ToolKellyton Brito
 
Adam_Mcconnell_SPR11_v3
Adam_Mcconnell_SPR11_v3Adam_Mcconnell_SPR11_v3
Adam_Mcconnell_SPR11_v3Adam McConnell
 
Partitioned Based Regression Verification
Partitioned Based Regression VerificationPartitioned Based Regression Verification
Partitioned Based Regression VerificationAung Thu Rha Hein
 
Programming the Network Data Plane
Programming the Network Data PlaneProgramming the Network Data Plane
Programming the Network Data PlaneC4Media
 
Modern development paradigms
Modern development paradigmsModern development paradigms
Modern development paradigmsIvano Malavolta
 
Software Development : Jeremy Gleason Iscope Digital
Software Development : Jeremy Gleason Iscope DigitalSoftware Development : Jeremy Gleason Iscope Digital
Software Development : Jeremy Gleason Iscope DigitalIscope Digital
 

Similar to Requirements for Supporting the Iterative Exploration of Scientific Workflow Variants (20)

Intro-Soft-Engg-2.pptx
Intro-Soft-Engg-2.pptxIntro-Soft-Engg-2.pptx
Intro-Soft-Engg-2.pptx
 
Software Engineering with Objects (M363) Final Revision By Kuwait10
Software Engineering with Objects (M363) Final Revision By Kuwait10Software Engineering with Objects (M363) Final Revision By Kuwait10
Software Engineering with Objects (M363) Final Revision By Kuwait10
 
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...Scientific Software Registry Collaboration Workshop: From Software Metadata r...
Scientific Software Registry Collaboration Workshop: From Software Metadata r...
 
A Comparative Study of Forward and Reverse Engineering
A Comparative Study of Forward and Reverse EngineeringA Comparative Study of Forward and Reverse Engineering
A Comparative Study of Forward and Reverse Engineering
 
Determan SummerSim_submit_rev3
Determan SummerSim_submit_rev3Determan SummerSim_submit_rev3
Determan SummerSim_submit_rev3
 
Fundamentals of software development
Fundamentals of software developmentFundamentals of software development
Fundamentals of software development
 
Software enginneering
Software enginneeringSoftware enginneering
Software enginneering
 
[2015/2016] Modern development paradigms
[2015/2016] Modern development paradigms[2015/2016] Modern development paradigms
[2015/2016] Modern development paradigms
 
Ikc 2015
Ikc 2015Ikc 2015
Ikc 2015
 
SDLC
SDLCSDLC
SDLC
 
System analsis and design
System analsis and designSystem analsis and design
System analsis and design
 
cheatsheet.pdf
cheatsheet.pdfcheatsheet.pdf
cheatsheet.pdf
 
LIFT: A Legacy InFormation retrieval Tool
LIFT: A Legacy InFormation retrieval ToolLIFT: A Legacy InFormation retrieval Tool
LIFT: A Legacy InFormation retrieval Tool
 
Adam_Mcconnell_SPR11_v3
Adam_Mcconnell_SPR11_v3Adam_Mcconnell_SPR11_v3
Adam_Mcconnell_SPR11_v3
 
C.V. of Santanu Chatterjee
C.V. of Santanu ChatterjeeC.V. of Santanu Chatterjee
C.V. of Santanu Chatterjee
 
Partitioned Based Regression Verification
Partitioned Based Regression VerificationPartitioned Based Regression Verification
Partitioned Based Regression Verification
 
Programming the Network Data Plane
Programming the Network Data PlaneProgramming the Network Data Plane
Programming the Network Data Plane
 
Modern development paradigms
Modern development paradigmsModern development paradigms
Modern development paradigms
 
poster_3.0
poster_3.0poster_3.0
poster_3.0
 
Software Development : Jeremy Gleason Iscope Digital
Software Development : Jeremy Gleason Iscope DigitalSoftware Development : Jeremy Gleason Iscope Digital
Software Development : Jeremy Gleason Iscope Digital
 

More from Lucas Augusto Carvalho

Pexincha: Startup Weekend São Paulo Retail+Tech 2018
Pexincha: Startup Weekend São Paulo Retail+Tech 2018Pexincha: Startup Weekend São Paulo Retail+Tech 2018
Pexincha: Startup Weekend São Paulo Retail+Tech 2018Lucas Augusto Carvalho
 
Converting Scripts into Reproducible Workflow Research Objects
Converting Scripts into Reproducible Workflow Research ObjectsConverting Scripts into Reproducible Workflow Research Objects
Converting Scripts into Reproducible Workflow Research ObjectsLucas Augusto Carvalho
 
Google Analytics: Como explorar suas estatísticas para fazer mais negócios.
Google Analytics: Como explorar suas estatísticas para fazer mais negócios.Google Analytics: Como explorar suas estatísticas para fazer mais negócios.
Google Analytics: Como explorar suas estatísticas para fazer mais negócios.Lucas Augusto Carvalho
 
TV Digital interativa - Projeto TeouVi
TV Digital interativa - Projeto TeouViTV Digital interativa - Projeto TeouVi
TV Digital interativa - Projeto TeouViLucas Augusto Carvalho
 
Palestra - Symfony Framework MVC PHP 5
Palestra - Symfony Framework MVC PHP 5Palestra - Symfony Framework MVC PHP 5
Palestra - Symfony Framework MVC PHP 5Lucas Augusto Carvalho
 
Ferramentas Para Acessibilidade Na Web
Ferramentas Para Acessibilidade Na WebFerramentas Para Acessibilidade Na Web
Ferramentas Para Acessibilidade Na WebLucas Augusto Carvalho
 
Rascunho do Seminário sobre Acessibilidade na Web
Rascunho do Seminário sobre Acessibilidade na WebRascunho do Seminário sobre Acessibilidade na Web
Rascunho do Seminário sobre Acessibilidade na WebLucas Augusto Carvalho
 

More from Lucas Augusto Carvalho (14)

Pexincha: Startup Weekend São Paulo Retail+Tech 2018
Pexincha: Startup Weekend São Paulo Retail+Tech 2018Pexincha: Startup Weekend São Paulo Retail+Tech 2018
Pexincha: Startup Weekend São Paulo Retail+Tech 2018
 
NiW: Notebooks into Workflows
NiW: Notebooks into WorkflowsNiW: Notebooks into Workflows
NiW: Notebooks into Workflows
 
Converting Scripts into Reproducible Workflow Research Objects
Converting Scripts into Reproducible Workflow Research ObjectsConverting Scripts into Reproducible Workflow Research Objects
Converting Scripts into Reproducible Workflow Research Objects
 
Conhecendo a rede social Google+
Conhecendo a rede social Google+Conhecendo a rede social Google+
Conhecendo a rede social Google+
 
Empreendedorismo Digital - versão 2
Empreendedorismo Digital - versão 2Empreendedorismo Digital - versão 2
Empreendedorismo Digital - versão 2
 
Empreendedorismo Digital
Empreendedorismo DigitalEmpreendedorismo Digital
Empreendedorismo Digital
 
Sistemas de Recomendação na web
Sistemas de Recomendação na webSistemas de Recomendação na web
Sistemas de Recomendação na web
 
Google Analytics: Como explorar suas estatísticas para fazer mais negócios.
Google Analytics: Como explorar suas estatísticas para fazer mais negócios.Google Analytics: Como explorar suas estatísticas para fazer mais negócios.
Google Analytics: Como explorar suas estatísticas para fazer mais negócios.
 
TV Digital interativa - Projeto TeouVi
TV Digital interativa - Projeto TeouViTV Digital interativa - Projeto TeouVi
TV Digital interativa - Projeto TeouVi
 
Palestra - SEO - Otimização Busca
Palestra - SEO - Otimização BuscaPalestra - SEO - Otimização Busca
Palestra - SEO - Otimização Busca
 
Palestra - Symfony Framework MVC PHP 5
Palestra - Symfony Framework MVC PHP 5Palestra - Symfony Framework MVC PHP 5
Palestra - Symfony Framework MVC PHP 5
 
Ferramentas Para Acessibilidade Na Web
Ferramentas Para Acessibilidade Na WebFerramentas Para Acessibilidade Na Web
Ferramentas Para Acessibilidade Na Web
 
Seminário Final
Seminário FinalSeminário Final
Seminário Final
 
Rascunho do Seminário sobre Acessibilidade na Web
Rascunho do Seminário sobre Acessibilidade na WebRascunho do Seminário sobre Acessibilidade na Web
Rascunho do Seminário sobre Acessibilidade na Web
 

Recently uploaded

Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPirithiRaju
 
Solution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsSolution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsHajira Mahmood
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxTwin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxEran Akiva Sinbar
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPirithiRaju
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2John Carlo Rollon
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Forest laws, Indian forest laws, why they are important
Forest laws, Indian forest laws, why they are importantForest laws, Indian forest laws, why they are important
Forest laws, Indian forest laws, why they are importantadityabhardwaj282
 

Recently uploaded (20)

Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
Solution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsSolution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutions
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxTwin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
 
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdfPests of safflower_Binomics_Identification_Dr.UPR.pdf
Pests of safflower_Binomics_Identification_Dr.UPR.pdf
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2Evidences of Evolution General Biology 2
Evidences of Evolution General Biology 2
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Forest laws, Indian forest laws, why they are important
Forest laws, Indian forest laws, why they are importantForest laws, Indian forest laws, why they are important
Forest laws, Indian forest laws, why they are important
 

Requirements for Supporting the Iterative Exploration of Scientific Workflow Variants

  • 1. Requirements for Supporting the Iterative Exploration of Scientific Workflow Variants Lucas Carvalho1 , Bakinam T. Essawy2 , Daniel Garijo3 , Claudia B. Medeiros1 , Yolanda Gil3 lucas.carvalho@ic.unicamp.br 1 University of Campinas, Institute of Computing, Brazil 2 University of Virginia, Department of Civil and Environmental Engineering, U.S.A 3 University of Southern California, Information Sciences Institute, U.S.A
  • 2. Problem: Iterative Exploration in Science Trial-and-Error Exploration Improvements & Corrections Model BModel A Model A Versions: 1.0 2.0 2.1 3.0 Model A Diff Data Diff versions Diff models Start: Model A v2.1 Trial-and-Error Exploration Improvements & Corrections Model B Versions: 1.0 2.0 2.0 3.0 Model B Diff Data Current: Model B v3.0 Diff versions 2
  • 3. Capturing Analyses as Workflows James_Rivanna_RCH.tifRecharge Model MODFLOW-NWT v1.0.2 Software component MODFLOW-NWT.exe 3
  • 5. Software Components and Interfaces Software package encapsulates a set of related functions (or data). E.g., FloPy v.3.3.6 Interface regards the function encapsulated by a software component, namely, its inputs, outputs, parameters and data types. 5
  • 6. Motivating Scenarios 1. Same i/o, replacing model version i. Bug fixes ii. More accurate results 2. Different i/o, same model i. Additional inputs/outputs ii. Replace inputs/outputs 3. Different models, diff i/o, diff steps i. New model is known, it has been selected already ii. New model has to be chosen among several candidates Model A Versions: 1.0 2.0 2.1 3.0 Diff versions Diff Data Model A Model BModel A Diff models 6
  • 7. Motivating Scenario 1: Same i/o, replacing model version Subcases: 1. Bug Fixes 2. More accurate results new variant 7
  • 8. Motivating Scenario 1: Same i/o, replacing model version Subcases: 1. Bug Fixes 2. More accurate results Workflow Variant • From version 1.0.2 • To version 1.0.3 Changes in the MODFLOW-NWT component Compare executions Compare: functionalities, interface, quality, ... MODFLOW-NWT Versions: 1.0 ... 1.0.2 1.0.3 ... 1.1.3 Diff versions 8
  • 9. Motivating Scenario 1: Requirements R1 – Version descriptions need to capture  useful metadata of the software interface. R2 – Scientists need to understand differences in interfaces between software versions. R3 – Scientists need to be alerted about relevant updates of software used in their workflows. R4 – Workflow descriptions need to capture the software, software version, and functions used in the implementation of workflow components. R5 - Scientists need to understand how new workflow variants can be used to correct errors in prior results. 9
  • 10. Motivating Scenario 1: Requirements R6 – Scientists should be able to easily replace a component of the workflow with a new one when the interfaces of the components are the same. R7 - Given a software package that can be used to create many workflow components, scientists need to easily figure out how to implement new variants of a workflow component with newer versions of that package. R8 – Scientists should be able to easily create new variants of workflow components and relate them to each other. R9 – Scientists should be able to easily create new workflow variants and relate them to each other. 10
  • 11. Motivating Scenario 1: Requirements R10 – Scientists should be able to relate changes in software to specific workflow results, so it is clear how new software versions affect calculated variables to produce wrong values. R11 – Version descriptions need to capture bug fixes and known bugs and relate them to software features and input and output file variables. R12 – Scientists need a summarization of changes between a given software version and a newer version to understand their differences without need to understand the changes associated to each version in between those. R13 – Scientists need to understand any incompatibilities between versions of different software packages used to implement a workflow component. 11
  • 12. Motivating Scenario 2 Different i/o, same model Subcases: 1. Additional inputs/outputs 2. Replace inputs/outputs 12
  • 13. Motivating Scenario 2 Different i/o, same model Subcases: 1. Additional inputs/outputs new variant
  • 14. Motivating Scenario 2 Different i/o, same model Subcases: 2. Replace inputs/outputs new variant
  • 15. Motivating Scenario 2 Different i/o, same model Subcases: 2. Replace inputs/outputs Changes to add snowmelt - MODFLOW-NWT => Infiltration package - Recharge and Infiltration are incompatible - Recharge + Snowmelt => Infiltration + Snowmelt Data MODFLOW-NWT
  • 16. Motivating Scenario 2 Requirements R14 – Scientists need to easily find software or methods that process a specific data input.  R15 – Scientists need to easily find workflow components for data conversion. R16 – Scientists need to know whether a workflow variant is still valid.  R17 – Version descriptions to capture incompatible packages or libraries used to implement a workflow component. R18 – Scientists need to be able to understand the differences between two workflow variants. 16
  • 17. Motivating Scenario 3 Different i/o, diff models and diff steps Subcases: 1. New model is known, it has been selected already. new variant
  • 18. Motivating Scenario 3 Different i/o, diff models and diff steps Subcases: 1. New model is known, it has been selected already. MIKE-SHEMODFLOW Diff models Changes to use MIKE-SHE
  • 19. Motivating Scenario 3 Different i/o, diff models and diff steps Subcases: 2. New model has to be chosen among several candidates. MIKE-SHEMODFLOW Diff models PIHM TOPOFLOW 19
  • 20. Motivating Scenario 3 Requirements R19 – Version descriptions need to capture assumptions used in software. R20 – Version descriptions need to capture computational methods in a software and their features and interface.  R21 – Workflow components, inputs, outputs or parameters in new workflow variants that are no longer needed need to be removed. R22 – Scientists need to assess and compare the effort in creating new workflow variants that represent a significant departure from previous ones. R23 – Scientists need to find and compare equivalent models. R24 – Scientists need to understand the commonalities and differences between the variables used and generated by distinct computational models. 20
  • 21. Summary of Requirements 1. Workflow component metadata 2. Workflow updates 3. Workflow comparisons 21
  • 22. Summary of Requirements Workflow component metadata Representation and metadata of software regarding its versions, interfaces and functionalities. Representation and metadata for workflows, workflow variants and workflow components. Requirement Scenarios S 1 S 2 S 3 R1 – Version descriptions need to capture  useful metadata of software interface. X X X R2 – Scientists need to understand differences in interfaces between the same or distinct software and software version. X X R3 – Scientists need to be alerted about relevant changes that influence how a software works. X R4 – Workflow descriptions need to capture the software, software version, and functions used in the implementation of workflow components. X X X R8 – Scientists should be able to create new variants of workflow components and relate them to each other. X X X R9 – Scientists should be able to easily create new workflow variants and relate them to each other. X X X R10 – Scientists should be able to relate changes in software to specific variables, so it is clear which versions affect calculated variables to produce wrong values. X 22
  • 23. Summary of Requirements Workflow component metadata Requirement Scenarios S1 S 2 S3 R11 – Version descriptions need to capture bug fixes and known bugs and relate them to software features and input and output file variables. X R12 – Scientists need a summarization of changes between a given software version and a target version to understand their differences without need to understand the changes associated to each version. X R13 – Version descriptions to capture the dependency between software versions used in workflows. X R17 – Version descriptions to capture incompatible inputs that cannot be used at the same time in a workflow component. X X R19 – Version descriptions need to capture assumptions used in software. X R20 – Version descriptions need to capture computational methods in software and their features and interface. x x X R24 – Scientists need to understand the commonalities and differences between the variables used and generated by distinct computational models. x 23
  • 24. Summary of Requirements Workflow Updates The creation of new workflow variants by replacing, adding, or removing workflow components The propagation of the effects of those changes throughout the structure of the workflow and the validation of the new variants. Requirement Scenarios S1 S2 S3 R6 – Scientists should be able to easily replace a component of the workflow with a new one when the interface of the components is the same. X R7 – Given a software package that can be used to create many workflow components, scientists need to easily figure out how to implement new variants of a workflow component with newer versions of the software. X R14 – Scientists need to easily find a software component that process a specific data input. X X R15 – Scientists need to easily find workflow components for data conversion. X X R16 – Scientists need to know whether the workflow variant is valid. X X R21 – No longer needed components, inputs, outputs or parameters in workflow  variants need to be removed after changes. X R22 – Scientists need to assess and compare the effort in creating new workflow variants that represent a significant departure from previous ones. X R23 – Scientists need to find and compare equivalent methods. X 24
  • 25. Summary of Requirements Requirement Scenarios S1 S2 S3 R5 – Scientists need to understand effects to the results of changes performed in workflows. X X X R18 – Scientists need to be able to understand the differences between two workflow variants. X X X Workflow Comparisons The comparison between different software versions, software packages, workflow variants and workflow runs. 25
  • 26. Discussion and Research Directions 1. Describing workflow components and their underlying software • The creation and adaptation of existing ontologies to capture information about software versions and variants. 1. Managing and tracking workflow variants and their differences • How to compare workflow components and workflow variants and present these results in a usable way. • The use of multi-media narratives that combine text, graphic diagrams, and visualizations to explain in a human-readable. • Narrative easily customized to the reader’s level of expertise and interest. • Manage histories of creation and evolution of workflow variants 26
  • 27. Discussion and Research Directions 3. Designing an interactive framework to support scientists in the exploration and experimentation process through workflow variants • Leverage workflow reuse and composition to support the creation of workflow variants. • Mechanisms to identify critical and non-critical components in workflows. 27
  • 28. Conclusions The need to support scientists in exploring different experiment designs over time. Several scenarios where an initial workflow is modified to create workflow variants by replacing, adding or removing workflow steps in Hydrology. Requirements generic enough for other domains. Research directions to address the requirements. 28
  • 29. Acknowledgments US National Science Foundation (NSF) Sao Paulo Research Foundation (FAPESP) IS-GEO (Intelligent Systems for GeoSciences) 29

Editor's Notes

  1. TODO: Relate each author to the corresponding affiliation
  2. Workflow components (steps) + Data + Dataflow Inputs, output and intermediate data
  3. Workflow variant: changing software components, interfaces and data preparation steps. A workflow variant is a workflow in which components have been changed compared to the initial workflow. Moreover, a change may be propagated to other components.
  4. Source Software Component: https://en.wikipedia.org/wiki/Component-based_software_engineering
  5. R1: metadata for software interface R2: diff interfaces R3: Alert of changes in software R4: Info about software components in workflows R5: Effect of changes in results R6: Easily replace a component without changes in I/O R7: Easily create variants from software packages R8/9: Create variant and relate to others R10: Changes in software to specific variables R11: Bugs, fixes and relate to specific software parts R12: Summarization of changes R13: Compatibility btw software versions used in diff workflow steps
  6. R1: metadata for software interface R2: diff interfaces R3: Alert of changes in software R4: Info about software components in workflows R5: Effect of changes in results R6: Easily replace a component without changes in I/O R7: Easily create variants from software packages R8/9: Create variant and relate to others R10: Changes in software to specific variables R11: Bugs, fixes and relate to specific software parts R12: Summarization of changes R13: Compatibility btw software versions used in diff workflow steps
  7. R14: Find software components that process a specific data input R15: Find software components for data conversion R16: Check if workflow variant is still valid. R17: Capture incompatible inputs for software components. R18: Differences between workflow variants (delta)
  8. R14: Find software components that process a specific data input R15: Find software components for data conversion R16: Check if workflow variant is still valid. R17: Capture incompatible inputs for software components. R18: Differences between workflow variants (delta)
  9. R14: Find software components that process a specific data input R15: Find software components for data conversion R16: Check if workflow variant is still valid. R17: Capture incompatible inputs for software components. R18: Differences between workflow variants (delta)
  10. R14: Find software components that process a specific data input R15: Find software components for data conversion R16: Check if workflow variant is still valid. R17: Capture incompatible inputs for software components. R18: Differences between workflow variants (delta)
  11. R19: Capture assumptions used in software components. R20: Capture functions and their functionalities and interface. R21: Remove no longer needed parts of the workflow variant. R22: Effort to create workflow variants R23: Find and compare equivalent software components. R24: Diff between variables present in input and outputs
  12. R19: Capture assumptions used in software components. R20: Capture functions and their functionalities and interface. R21: Remove no longer needed parts of the workflow variant. R22: Effort to create workflow variants R23: Find and compare equivalent software components. R24: Diff between variables present in input and outputs
  13. R19: Capture assumptions used in software components. R20: Capture functions and their functionalities and interface. R21: Remove no longer needed parts of the workflow variant. R22: Effort to create workflow variants R23: Find and compare equivalent software components. R24: Diff between variables present in input and outputs
  14. Table: go through it very quickly
  15. Table: go through it very quickly