eScience 2014, Guarujá (Brasil). Abstract: Workflow reuse is a major benefit of workflow systems and shared workflow repositories, but there are barely any studies that quantify the degree of reuse of workflows or the practical barriers that may stand in the way of successful reuse. In our own work, we hypothesize that defining workflow fragments improves reuse, since end-to-end workflows may be very specific and only partially reusable by others. This paper reports on a study of the current use of workflows and workflow fragments in labs that use the LONI Pipeline, a popular workflow system used mainly for neuroimaging research that enables users to define and reuse workflow fragments. We present an overview of the benefits of workflows and workflow fragments reported by users in informal discussions. We also report on a survey of researchers in a lab that has the LONI Pipeline installed, asking them about their experiences with reuse of workflow fragments and the actual benefits they perceive. This leads to quantifiable indicators of the reuse of workflows and workflow fragments in practice. Finally, we discuss barriers to further adoption of workflow fragments and workflow reuse that motivate further work.
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
1. Date: 22/10/2014
Workflow Reuse in Practice:
A Study of Neuroimaging Pipeline Users
Daniel Garijo *, Oscar Corcho *, Yolanda Gil Ŧ, Meredith N. Braskieⱡ, Derrek Hibarⱡ, Xue Huaⱡ, Neda Jahanshadⱡ, Paul Thompsonⱡ, and Arthur W. Togaⱡ
* Universidad Politécnica de Madrid,
Ŧ USC Information Sciences Institute,
ⱡ USC Laboratory of Neuroimaging
2. Main Contributions
•Highlight the benefits of workflows and workflow fragments reported by users in a neuroscience research lab
•Survey of workflow users
•Quantitative perspective on the identified benefits.
IEEE eScience 2014. Guarujá, Brasil
[Figure: diagram with the labels "Create, collaborate", "repository", "reuse", "repurpose"]
3. Background
•Workflows are software artifacts that capture computational experiments
•Addition to paper publication
•Provenance of results
•Reuse
•Existing repositories of workflows (Galaxy, myExperiment, the LONI Pipeline, CrowdLabs, etc.)
•Sharing workflows
•Exploring existing workflows
•PROBLEMS to address:
•How does workflow reuse happen in a research lab environment?
•Are workflow fragments more useful than workflows?
4. Use case: The LONI Pipeline
Workflow system for neuroimaging analysis
http://pipeline.loni.usc.edu/explore/library-navigator/
5. Why LONI Pipeline?
•Need for reuse
•Grouping Tools
•Manual annotation of workflow fragments
•Workflow Miner
6. Approach
Discussions with scientists
User survey
Collect responses from users
21 responses
Discuss results
7. Possible benefits of workflows and workflow fragments
•Sharing workflows with collaborators
•Time savings
•Copy & paste fragments of workflows
•Reuse existing workflows
•Teaching
•Reduce the learning curve of new students
•Visualization
•Simplify workflows
•Design for modularity
•Highlight the most relevant steps of a workflow
8. Possible benefits of workflows and workflow fragments (2)
•Design for understandability
•Design for standardization
•Debugging
•Provenance exploration
•Paper writing
•Linking papers to pipelines
•Reproducibility and inspectability
10. Writing and Sharing Code
•Writing code is considered very important for this area of research.
•Sharing code is not considered to be as important.
11. Adopting a Workflow System
The overwhelming majority of respondents found the workflow system useful.
•Creation of workflows.
12. Adopting a workflow system: workflow size
•Scientists seem to prefer workflows of fewer than 10 steps
[Chart: x-axis "Number of workflow components" with bins 1-5, 5-10, 10-20, >20; y-axis ticks from 0 to 14]
13. Reusing workflows
•Respondents answered that creating workflows is very useful
•Reuse of workflows was seen as less useful
•Reuse is not the only reason why workflows are created
•Reusing workflows from a user’s prior work is considered as useful as reusing workflows from others
14. Reusing workflows (2)
According to the respondents, the major benefits of workflows include:
• Time savings
•Organizing and storing code
• Having a visualization of the overall analysis
•Facilitating reproducibility
Workflows save time: 13
Easier to track and debug complex code: 9
Convenient way to organize/store code: 11
Help write more organized code: 6
Help make code more modular/reusable: 4
Help make methods more understandable: 8
Visualization of overall analysis: 11
Workflows facilitate reproducibility: 10
15. Reusing workflows (3)
•The overwhelming majority of respondents said workflows are useful for both non-programmers and for teaching new students
Non-programmers can use them: 20
New students can easily learn: 19
No need for others to re-implement code: 14
Adoption of standard ways to do things: 9
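To put the counts on this slide in proportion, they can be expressed as shares of the 21 survey responses mentioned in the Approach slide. The sketch below is only illustrative: it assumes each count is a number of respondents out of those 21, which the slides do not state explicitly, and the names `TOTAL_RESPONSES`, `counts`, and `share` are made up for this example.

```python
# Counts transcribed from this slide. TOTAL_RESPONSES comes from the
# "21 responses" note in the Approach slide. Reading each count as
# "respondents out of 21" is an assumption, not something the slides state.
TOTAL_RESPONSES = 21

counts = {
    "Non-programmers can use them": 20,
    "New students can easily learn": 19,
    "No need for others to re-implement code": 14,
    "Adoption of standard ways to do things": 9,
}


def share(count, total=TOTAL_RESPONSES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(n) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

Under that assumption, roughly 95% and 90% of respondents agreed with the first two statements, which is consistent with the "overwhelming majority" wording above.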
16. Reusing workflows (4)
•Respondents gave no compelling reasons for not sharing workflows
•Respondents gave no compelling reasons for not reusing workflows from others
Others would not want to use them: 1
Others ask too many questions of the creators: 2
Workflows from others are difficult to understand: 3
It is difficult to understand how to prepare data for a workflow: 3

Workflows from others are difficult to understand: 4
It is difficult to understand how to prepare data for a workflow: 2
Workflows created by others are too specific: 1
It is hard to take workflows created by others and make them work: 2
17. Reusing groupings
•Reuse is not the only reason why groupings are created. Unlike with workflows, reusing groupings from one’s own work is considered more useful than reusing groupings from others
18. Reusing groupings (2)
•Most respondents agreed that groupings help simplify workflows. Groupings also make workflows more understandable by others
•Other grouping benefits:
•Time savings
•Help make code modular and understandable, more so than workflows
•Seen as useful to non-programmers and students
Visualization of the analysis: 10
To simplify workflows that are complex overall: 12
To make workflows more understandable to others: 12
Groupings save time: 12
Help make code more modular/reusable: 10
Help make methods more understandable: 7
19. Reusing groupings (3)
Very few respondents gave reasons for not sharing groupings or for not reusing groupings from others. In general, workflows are considered more useful than groupings. On the other hand, more respondents said that groupings help make their code more modular and understandable.
Others would not want to use them: 0
Others ask too many questions of the creators: 1
Workflows from others are difficult to understand: 4
It is difficult to understand how to prepare data for a grouping: 1

Groupings from others are difficult to understand: 2
It is difficult to understand how to prepare data for a grouping: 3
Groupings created by others are too specific: 1
It is hard to take groupings created by others and make them work: 4
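The comparison between workflows and groupings can be checked against the counts reported on the "Reusing workflows (2)" and "Reusing groupings (2)" slides. The sketch below is a rough illustration only: pairing "Workflows save time" with "Groupings save time", and "Visualization of overall analysis" with "Visualization of the analysis", is an assumption about which statements correspond, and the names `paired_counts` and `difference` are hypothetical.

```python
# (workflows, groupings) counts transcribed from the two benefit slides.
# Which statements correspond across the two slides is an assumption
# made for this illustration.
paired_counts = {
    "Saves time": (13, 12),
    "Visualization of the analysis": (11, 10),
    "Help make code more modular/reusable": (4, 10),
    "Help make methods more understandable": (8, 7),
}


def difference(pair):
    """Return groupings count minus workflows count for one statement."""
    workflows, groupings = pair
    return groupings - workflows


differences = {label: difference(pair) for label, pair in paired_counts.items()}

# Statements on which groupings received more responses than workflows.
groupings_lead = [label for label, d in differences.items() if d > 0]

for label, d in differences.items():
    print(f"{label}: {d:+d}")
```

With these counts, groupings lead only on the modularity/reusability statement. The groupings slide additionally reports 12 responses for "To make workflows more understandable to others", a statement with no direct counterpart on the workflows slide.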
20. Paper Writing
Workflows are not systematically linked to publications
•Most respondents believe that the link between a workflow and a publication is kept in private laboratory notes rather than in a publicly accessible form
21. Discussion
Workflows have a clear benefit to the lab. There are important directions of future research suggested by this work:
•Improve the use of groupings.
•If users had more assistance in specifying and finding groupings, workflows and fragments might be reused more
•Debugging and checking results
•Better mechanisms to handle checking intermediate execution results would allow users to define larger workflows
•Better documentation of workflows.
•Documentation of workflows tends to be private and scattered, and not usually linked to papers
•Facilitating workflow publication and linking to papers
•Papers provide important context and documentation for workflows
22. Conclusions
•Contributions:
•Highlight the benefits of workflows and workflow fragments reported by users in a neuroscience research lab
•Quantitative survey of the benefits by workflow users
•Our work can be expanded by
•Validating our findings with more respondents
•Reflecting the experience level of the respondents in the questionnaire
•Including statistics on grouping usage in the workflows they create
•There are clear opportunities to develop best practices for designing workflow components and modularizing code, encouraging standards adoption, and facilitating understanding by other users
All materials used and the survey are available at: http://purl.org/net/wfSurvey-eScience2014
23. Who are we?
•Daniel Garijo, Oscar Corcho: Ontology Engineering Group, UPM
•Yolanda Gil: Information Sciences Institute, USC
•Meredith N. Braskie, Derrek Hibar, Xue Hua, Neda Jahanshad, Paul Thompson, Arthur W. Toga: USC Laboratory of Neuro Imaging