SlideShare a Scribd company logo
1 of 11
Download to read offline
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
1
Tool-Supported Fault
Localization in Spreadsheets:
Limitations of Current
Evaluation Practice
Birgit Hofer, Franz Wotawa
Dietmar Jannach, Thomas Schmitz
Kostyantyn Shchekotykhin
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
2
An Overview of Limitations
of Current Evaluation Practice
1
2
3
Lack of benchmarks systems
Usability and user acceptance
Field research
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
3
Benchmark Systems – Current Situation
 There is no public data set for spreadsheet fault
localization
 Researcher create own benchmark systems
 Take existing corpus (e.g. EUSES [FR05]) or collect
individual spreadsheets
 Apply mutation operators, e.g. [AE09] on them or
manually inject faults
[FR05] M. Fisherand G. Rothermel:“The EUSES spreadsheetcorpus:Ashared resource forsupporting
experimentationwith spreadsheetdependability mechanisms.”1stWorkshop on End-User
Software Engineering,2005.
[AE09] R.Abraham and M.Erwig. Mutation Operators forSpreadsheets.IEEE Transactionson Software
Engineering,2009.
1
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
4
Some Examples I
 Hofer et. al [HRW13]
o “… we are evaluating the … approaches by means of the EUSES
spreadsheet corpus. We skipped around 240 Excel 5.0
spreadsheets that are not compatible with our implementation, …
o we removed all spreadsheets containing less than 5 formulas …
o we automaticallycreated up to five first-order mutants. Amutant of a
spreadsheet is created by randomly choosing a formula cell of the
spreadsheet and applying a mutation operator on it. According to the
classification of spreadsheet mutation operators ofAbraham and
Erwig, we used the following mutation operators …”
 Jannach and Schmitz [JS14]
o “For the performance analysis, we selected a number of artificial and
real-world spreadsheetsin which we manuallyinjected faults.”
[HRW13] B. Hofer, A. Riboira, F. Wotawa, and R. Abreu, E. Getzner: “On the Empirical Evaluation of Fault Localization
Techniques for Spreadsheets.” FASE 2013.
[JS14] D. Jannach and T. Schmitz: “Model-based diagnosis of spreadsheet programs - A constraint-based debugging
approach.” Automated Software Engineering, Springer, 2014.
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
5
Some examples II
 Abraham and Erwig [AE08]
o “… we use spreadsheets that have been used in previous
empirical studies. The spreadsheets have been picked to include as
many different kinds of formulas, and formulas with branching …
o We generate mutant spreadsheets by seeding faults in the original
spreadsheets using the mutation operators given in Table 1. The
mutation operators have been designed to reflect errors reported in
spreadsheet literature …”
 Außerlechner et al. [AFW13]
o “Since MINION is not able to deal with Real numbers …, we created a
specific spreadsheet corpus that contains spreadsheets with Integer
values only … Whereas some of the spreadsheets are artificially
created, 21 spreadsheets are real-life programs … “
[AE08] R. Abraham, and M. Erwig: “Test-Driven Goal-Directed Debugging in Spreadsheets.” IEEE Symposioum on
Visual Languages and Human-Centric Computing, 2008.
[AFW13] S. Ausserlechner et al.: “The Right Choice Matters! SMT Solving Substantially Improves Model-Based
Debugging of Spreadsheets.” QSIC 2013.
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
6
Current Situation - Consequences
 Each research group uses own data set
 rarely made publicly available
 often made to fit the evaluated approach
 comparison of approaches difficult
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
7
We need a corpus that contains …
 Real world spreadsheets
 Large spreadsheets, not toy examples
 Spreadsheets with real faults, not only seeded faults
 Input-/output relations that reveal the fault
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
8
Ways to get there
 Laboratory: spreadsheet construction exercises
 Excellent starting point: Kooper Corpus [AP10]
 Larger spreadsheets
 Different domains and exercises
 Real life
[AP10] S.Aurigemma,and R.Panko:“The detection of human spreadsheeterrors by humans versus
inspection (auditing)software,”CoRR,2010.
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
9
Usability and User Acceptance
 Mostly offline experiments
 Information from the user required, e.g.
 Correctness of values
 Expected values
 Specification of several test cases
Is a user willing / able to provide these inputs?
 User studies are necessary to answer these questions.
2
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
10
Field research
 Setting
 Laboratory experiments vs. everyday use
 Participant
 Students vs. managers
 Scenario
 Artificial problem vs. real problem
3
B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
„ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“
11
Proposals for future work
 Improve comparability and reproducibility
Develop common benchmark system
 Focus on usability and user acceptance
Make user studies
 Focus on real life scenarios (not only laboratory experiments)
Make field research, questionnaires …

More Related Content

Similar to Tool-Supported 2014 07 sems_limitations_evaluation_practice

Field of Study and Research Methods for an Effect of Cognitive and Informatio...
Field of Study and Research Methods for an Effect of Cognitive and Informatio...Field of Study and Research Methods for an Effect of Cognitive and Informatio...
Field of Study and Research Methods for an Effect of Cognitive and Informatio...Yury Solonitsyn
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCarole Goble
 
Open Access Statistics: An Examination how to Generate Interoperable Usage In...
Open Access Statistics: An Examination how to Generate Interoperable Usage In...Open Access Statistics: An Examination how to Generate Interoperable Usage In...
Open Access Statistics: An Examination how to Generate Interoperable Usage In...Daniel Beucke
 
Chi2012 analysis in practical usability evaluation web
Chi2012 analysis in practical usability evaluation webChi2012 analysis in practical usability evaluation web
Chi2012 analysis in practical usability evaluation webAsbjørn Følstad
 
Impact your Library UX with Contextual Inquiry
Impact your Library UX with Contextual InquiryImpact your Library UX with Contextual Inquiry
Impact your Library UX with Contextual InquiryRachel Vacek
 
201209 An Introduction to Building Affective-Driven Self-Adaptive Software
201209 An Introduction to Building Affective-Driven Self-Adaptive Software 201209 An Introduction to Building Affective-Driven Self-Adaptive Software
201209 An Introduction to Building Affective-Driven Self-Adaptive Software Javier Gonzalez-Sanchez
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New ScienceAnita de Waard
 
Gp technologybuilds july2011
Gp technologybuilds july2011Gp technologybuilds july2011
Gp technologybuilds july2011bodaceacat
 
Gp technologybuilds july2011
Gp technologybuilds july2011Gp technologybuilds july2011
Gp technologybuilds july2011bodaceacat
 
From Bugs to Decision Support - Selected Research Highlights
From Bugs to Decision Support - Selected Research HighlightsFrom Bugs to Decision Support - Selected Research Highlights
From Bugs to Decision Support - Selected Research HighlightsMarkus Borg
 
SurveyDesignTutorial_Session1-1.pdf
SurveyDesignTutorial_Session1-1.pdfSurveyDesignTutorial_Session1-1.pdf
SurveyDesignTutorial_Session1-1.pdfEssamAlnatsheh
 
The Symbiotic Nature of Provenance and Workflow
The Symbiotic Nature of Provenance and WorkflowThe Symbiotic Nature of Provenance and Workflow
The Symbiotic Nature of Provenance and WorkflowEric Stephan
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainabilityDaniel S. Katz
 
ECE695DVisualAnalyticsprojectproposal (2)
ECE695DVisualAnalyticsprojectproposal (2)ECE695DVisualAnalyticsprojectproposal (2)
ECE695DVisualAnalyticsprojectproposal (2)Shweta Gupte
 
Usability Testing in Federal Libraries: A Case Study
Usability Testing in Federal Libraries: A Case StudyUsability Testing in Federal Libraries: A Case Study
Usability Testing in Federal Libraries: A Case Studynullhandle
 
Using Bibliometrics to Keep Up with the Joneses
Using Bibliometrics to Keep Up with the JonesesUsing Bibliometrics to Keep Up with the Joneses
Using Bibliometrics to Keep Up with the JonesesChristina Pikas
 
Proactive Displays IIIA 20080627
Proactive Displays IIIA 20080627Proactive Displays IIIA 20080627
Proactive Displays IIIA 20080627Joe McCarthy
 

Similar to Tool-Supported 2014 07 sems_limitations_evaluation_practice (20)

Field of Study and Research Methods for an Effect of Cognitive and Informatio...
Field of Study and Research Methods for an Effect of Cognitive and Informatio...Field of Study and Research Methods for an Effect of Cognitive and Informatio...
Field of Study and Research Methods for an Effect of Cognitive and Informatio...
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teams
 
Open Access Statistics: An Examination how to Generate Interoperable Usage In...
Open Access Statistics: An Examination how to Generate Interoperable Usage In...Open Access Statistics: An Examination how to Generate Interoperable Usage In...
Open Access Statistics: An Examination how to Generate Interoperable Usage In...
 
Chi2012 analysis in practical usability evaluation web
Chi2012 analysis in practical usability evaluation webChi2012 analysis in practical usability evaluation web
Chi2012 analysis in practical usability evaluation web
 
Lopez
LopezLopez
Lopez
 
Impact your Library UX with Contextual Inquiry
Impact your Library UX with Contextual InquiryImpact your Library UX with Contextual Inquiry
Impact your Library UX with Contextual Inquiry
 
201209 An Introduction to Building Affective-Driven Self-Adaptive Software
201209 An Introduction to Building Affective-Driven Self-Adaptive Software 201209 An Introduction to Building Affective-Driven Self-Adaptive Software
201209 An Introduction to Building Affective-Driven Self-Adaptive Software
 
Analytical Survey on Bug Tracking System
Analytical Survey on Bug Tracking SystemAnalytical Survey on Bug Tracking System
Analytical Survey on Bug Tracking System
 
Looking for Data: Finding New Science
Looking for Data: Finding New ScienceLooking for Data: Finding New Science
Looking for Data: Finding New Science
 
Gp technologybuilds july2011
Gp technologybuilds july2011Gp technologybuilds july2011
Gp technologybuilds july2011
 
Gp technologybuilds july2011
Gp technologybuilds july2011Gp technologybuilds july2011
Gp technologybuilds july2011
 
From Bugs to Decision Support - Selected Research Highlights
From Bugs to Decision Support - Selected Research HighlightsFrom Bugs to Decision Support - Selected Research Highlights
From Bugs to Decision Support - Selected Research Highlights
 
SurveyDesignTutorial_Session1-1.pdf
SurveyDesignTutorial_Session1-1.pdfSurveyDesignTutorial_Session1-1.pdf
SurveyDesignTutorial_Session1-1.pdf
 
The Symbiotic Nature of Provenance and Workflow
The Symbiotic Nature of Provenance and WorkflowThe Symbiotic Nature of Provenance and Workflow
The Symbiotic Nature of Provenance and Workflow
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainability
 
ECE695DVisualAnalyticsprojectproposal (2)
ECE695DVisualAnalyticsprojectproposal (2)ECE695DVisualAnalyticsprojectproposal (2)
ECE695DVisualAnalyticsprojectproposal (2)
 
Usability Testing in Federal Libraries: A Case Study
Usability Testing in Federal Libraries: A Case StudyUsability Testing in Federal Libraries: A Case Study
Usability Testing in Federal Libraries: A Case Study
 
Anu2018
Anu2018Anu2018
Anu2018
 
Using Bibliometrics to Keep Up with the Joneses
Using Bibliometrics to Keep Up with the JonesesUsing Bibliometrics to Keep Up with the Joneses
Using Bibliometrics to Keep Up with the Joneses
 
Proactive Displays IIIA 20080627
Proactive Displays IIIA 20080627Proactive Displays IIIA 20080627
Proactive Displays IIIA 20080627
 

More from semsworkshop

Spreadsheets are models too - Richard Paige at Sems 2014
Spreadsheets are models too - Richard Paige at Sems 2014Spreadsheets are models too - Richard Paige at Sems 2014
Spreadsheets are models too - Richard Paige at Sems 2014semsworkshop
 
MDSheet - Model driven spreadsheets - Jacome Cunha at Sems 2014
MDSheet - Model driven spreadsheets - Jacome Cunha at Sems 2014MDSheet - Model driven spreadsheets - Jacome Cunha at Sems 2014
MDSheet - Model driven spreadsheets - Jacome Cunha at Sems 2014semsworkshop
 
Dependence tracing techniques for spreadsheets - Sohon Roy at Sems 2014
Dependence tracing techniques for spreadsheets - Sohon Roy at Sems 2014Dependence tracing techniques for spreadsheets - Sohon Roy at Sems 2014
Dependence tracing techniques for spreadsheets - Sohon Roy at Sems 2014semsworkshop
 
A spreadsheet cell-meaning model for testing - Daniel Kulesz at Sems 2014
A spreadsheet cell-meaning model for testing - Daniel Kulesz at Sems 2014A spreadsheet cell-meaning model for testing - Daniel Kulesz at Sems 2014
A spreadsheet cell-meaning model for testing - Daniel Kulesz at Sems 2014semsworkshop
 
Using a visual language to create better spreadsheets - Bas Jansen at Sems 2014
Using a visual language to create better spreadsheets - Bas Jansen at Sems 2014Using a visual language to create better spreadsheets - Bas Jansen at Sems 2014
Using a visual language to create better spreadsheets - Bas Jansen at Sems 2014semsworkshop
 
2014 07 sems_debugging_models
2014 07 sems_debugging_models2014 07 sems_debugging_models
2014 07 sems_debugging_modelssemsworkshop
 
High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014
High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014
High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014semsworkshop
 

More from semsworkshop (7)

Spreadsheets are models too - Richard Paige at Sems 2014
Spreadsheets are models too - Richard Paige at Sems 2014Spreadsheets are models too - Richard Paige at Sems 2014
Spreadsheets are models too - Richard Paige at Sems 2014
 
MDSheet - Model driven spreadsheets - Jacome Cunha at Sems 2014
MDSheet - Model driven spreadsheets - Jacome Cunha at Sems 2014MDSheet - Model driven spreadsheets - Jacome Cunha at Sems 2014
MDSheet - Model driven spreadsheets - Jacome Cunha at Sems 2014
 
Dependence tracing techniques for spreadsheets - Sohon Roy at Sems 2014
Dependence tracing techniques for spreadsheets - Sohon Roy at Sems 2014Dependence tracing techniques for spreadsheets - Sohon Roy at Sems 2014
Dependence tracing techniques for spreadsheets - Sohon Roy at Sems 2014
 
A spreadsheet cell-meaning model for testing - Daniel Kulesz at Sems 2014
A spreadsheet cell-meaning model for testing - Daniel Kulesz at Sems 2014A spreadsheet cell-meaning model for testing - Daniel Kulesz at Sems 2014
A spreadsheet cell-meaning model for testing - Daniel Kulesz at Sems 2014
 
Using a visual language to create better spreadsheets - Bas Jansen at Sems 2014
Using a visual language to create better spreadsheets - Bas Jansen at Sems 2014Using a visual language to create better spreadsheets - Bas Jansen at Sems 2014
Using a visual language to create better spreadsheets - Bas Jansen at Sems 2014
 
2014 07 sems_debugging_models
2014 07 sems_debugging_models2014 07 sems_debugging_models
2014 07 sems_debugging_models
 
High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014
High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014
High-performance sheet-defined functions in Excel - Peter Sestoft at Sems 2014
 

Recently uploaded

Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 

Recently uploaded (20)

Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 

Tool-Supported 2014 07 sems_limitations_evaluation_practice

  • 1. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 1 Tool-Supported Fault Localization in Spreadsheets: Limitations of Current Evaluation Practice Birgit Hofer, Franz Wotawa Dietmar Jannach, Thomas Schmitz Kostyantyn Shchekotykhin
  • 2. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 2 An Overview of Limitations of Current Evaluation Practice 1 2 3 Lack of benchmarks systems Usability and user acceptance Field research
  • 3. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 3 Benchmark Systems – Current Situation  There is no public data set for spreadsheet fault localization  Researcher create own benchmark systems  Take existing corpus (e.g. EUSES [FR05]) or collect individual spreadsheets  Apply mutation operators, e.g. [AE09] on them or manually inject faults [FR05] M. Fisherand G. Rothermel:“The EUSES spreadsheetcorpus:Ashared resource forsupporting experimentationwith spreadsheetdependability mechanisms.”1stWorkshop on End-User Software Engineering,2005. [AE09] R.Abraham and M.Erwig. Mutation Operators forSpreadsheets.IEEE Transactionson Software Engineering,2009. 1
  • 4. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 4 Some Examples I  Hofer et. al [HRW13] o “… we are evaluating the … approaches by means of the EUSES spreadsheet corpus. We skipped around 240 Excel 5.0 spreadsheets that are not compatible with our implementation, … o we removed all spreadsheets containing less than 5 formulas … o we automaticallycreated up to five first-order mutants. Amutant of a spreadsheet is created by randomly choosing a formula cell of the spreadsheet and applying a mutation operator on it. According to the classification of spreadsheet mutation operators ofAbraham and Erwig, we used the following mutation operators …”  Jannach and Schmitz [JS14] o “For the performance analysis, we selected a number of artificial and real-world spreadsheetsin which we manuallyinjected faults.” [HRW13] B. Hofer, A. Riboira, F. Wotawa, and R. Abreu, E. Getzner: “On the Empirical Evaluation of Fault Localization Techniques for Spreadsheets.” FASE 2013. [JS14] D. Jannach and T. Schmitz: “Model-based diagnosis of spreadsheet programs - A constraint-based debugging approach.” Automated Software Engineering, Springer, 2014.
  • 5. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 5 Some examples II  Abraham and Erwig [AE08] o “… we use spreadsheets that have been used in previous empirical studies. The spreadsheets have been picked to include as many different kinds of formulas, and formulas with branching … o We generate mutant spreadsheets by seeding faults in the original spreadsheets using the mutation operators given in Table 1. The mutation operators have been designed to reflect errors reported in spreadsheet literature …”  Außerlechner et al. [AFW13] o “Since MINION is not able to deal with Real numbers …, we created a specific spreadsheet corpus that contains spreadsheets with Integer values only … Whereas some of the spreadsheets are artificially created, 21 spreadsheets are real-life programs … “ [AE08] R. Abraham, and M. Erwig: “Test-Driven Goal-Directed Debugging in Spreadsheets.” IEEE Symposioum on Visual Languages and Human-Centric Computing, 2008. [AFW13] S. Ausserlechner et al.: “The Right Choice Matters! SMT Solving Substantially Improves Model-Based Debugging of Spreadsheets.” QSIC 2013.
  • 6. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 6 Current Situation - Consequences  Each research group uses own data set  rarely made publicly available  often made to fit the evaluated approach  comparison of approaches difficult
  • 7. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 7 We need a corpus that contains …  Real world spreadsheets  Large spreadsheets, not toy examples  Spreadsheets with real faults, not only seeded faults  Input-/output relations that reveal the fault
  • 8. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 8 Ways to get there  Laboratory: spreadsheet construction exercises  Excellent starting point: Kooper Corpus [AP10]  Larger spreadsheets  Different domains and exercises  Real life [AP10] S.Aurigemma,and R.Panko:“The detection of human spreadsheeterrors by humans versus inspection (auditing)software,”CoRR,2010.
  • 9. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 9 Usability and User Acceptance  Mostly offline experiments  Information from the user required, e.g.  Correctness of values  Expected values  Specification of several test cases Is a user willing / able to provide these inputs?  User studies are necessary to answer these questions. 2
  • 10. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 10 Field research  Setting  Laboratory experiments vs. everyday use  Participant  Students vs. managers  Scenario  Artificial problem vs. real problem 3
  • 11. B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa: „ Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice“ 11 Proposals for future work  Improve comparability and reproducibility Develop common benchmark system  Focus on usability and user acceptance Make user studies  Focus on real life scenarios (not only laboratory experiments) Make field research, questionnaires …