B. Hofer, D. Jannach, T. Schmitz, K. Shchekotykhin, and F. Wotawa:
“Tool-supported fault localization in spreadsheets: Limitations of current evaluation practice”
Tool-Supported Fault
Localization in Spreadsheets:
Limitations of Current
Evaluation Practice
Birgit Hofer, Franz Wotawa
Dietmar Jannach, Thomas Schmitz
Kostyantyn Shchekotykhin
An Overview of Limitations
of Current Evaluation Practice
1. Lack of benchmark systems
2. Usability and user acceptance
3. Field research
Benchmark Systems – Current Situation
• There is no public data set for spreadsheet fault localization
• Researchers create their own benchmark systems
  o Take an existing corpus (e.g. EUSES [FR05]) or collect individual spreadsheets
  o Apply mutation operators (e.g. [AE09]) to them or manually inject faults
[FR05] M. Fisher and G. Rothermel: “The EUSES spreadsheet corpus: A shared resource for supporting
experimentation with spreadsheet dependability mechanisms.” 1st Workshop on End-User
Software Engineering, 2005.
[AE09] R. Abraham and M. Erwig: “Mutation Operators for Spreadsheets.” IEEE Transactions on Software
Engineering, 2009.
Some Examples I
• Hofer et al. [HRW13]
  o “… we are evaluating the … approaches by means of the EUSES
    spreadsheet corpus. We skipped around 240 Excel 5.0
    spreadsheets that are not compatible with our implementation, …
  o we removed all spreadsheets containing less than 5 formulas …
  o we automatically created up to five first-order mutants. A mutant of a
    spreadsheet is created by randomly choosing a formula cell of the
    spreadsheet and applying a mutation operator on it. According to the
    classification of spreadsheet mutation operators of Abraham and
    Erwig, we used the following mutation operators …”
• Jannach and Schmitz [JS14]
  o “For the performance analysis, we selected a number of artificial and
    real-world spreadsheets in which we manually injected faults.”
[HRW13] B. Hofer, A. Riboira, F. Wotawa, R. Abreu, and E. Getzner: “On the Empirical Evaluation of Fault Localization
Techniques for Spreadsheets.” FASE 2013.
[JS14] D. Jannach and T. Schmitz: “Model-based diagnosis of spreadsheet programs - A constraint-based debugging
approach.” Automated Software Engineering, Springer, 2014.
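The mutant-creation procedure quoted from [HRW13] can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the sheet representation (a dict from cell name to content), the function names, and the single operator-swap mutation are all assumptions made for the example. The swap stands in for the mutation operator catalogue the quoted papers draw on.

```python
import random

# Hypothetical operator swaps; a stand-in for a real mutation operator catalogue.
SWAPS = {"+": "-", "-": "+", "*": "/", "/": "*"}


def is_formula(content):
    """A cell counts as a formula cell if its content starts with '='."""
    return isinstance(content, str) and content.startswith("=")


def mutate_formula(formula, rng):
    """Swap one randomly chosen arithmetic operator; return None if there is none."""
    positions = [i for i, ch in enumerate(formula) if ch in SWAPS]
    if not positions:
        return None
    i = rng.choice(positions)
    return formula[:i] + SWAPS[formula[i]] + formula[i + 1:]


def first_order_mutants(sheet, max_mutants=5, min_formulas=5, seed=0):
    """Return up to max_mutants (cell, mutant_sheet) pairs.

    Each mutant differs from the original sheet in exactly one formula cell.
    Sheets with fewer than min_formulas formula cells are filtered out.
    """
    rng = random.Random(seed)
    formula_cells = sorted(c for c, v in sheet.items() if is_formula(v))
    if len(formula_cells) < min_formulas:
        return []
    mutants = []
    for _ in range(max_mutants):
        cell = rng.choice(formula_cells)  # randomly chosen formula cell
        mutated = mutate_formula(sheet[cell], rng)
        if mutated is None:
            continue  # nothing to mutate in this formula
        mutant = dict(sheet)
        mutant[cell] = mutated
        mutants.append((cell, mutant))
    return mutants


sheet = {
    "A1": "2", "A2": "3",
    "B1": "=A1+A2", "B2": "=A1*A2", "B3": "=B1-B2",
    "B4": "=B1/A1", "B5": "=B2+B3",
}
for cell, mutant in first_order_mutants(sheet):
    print(cell, sheet[cell], "->", mutant[cell])
```

The sketch does not deduplicate mutants, so the same cell may be mutated more than once across the returned sheets.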
Some Examples II
• Abraham and Erwig [AE08]
  o “… we use spreadsheets that have been used in previous
    empirical studies. The spreadsheets have been picked to include as
    many different kinds of formulas, and formulas with branching …
  o We generate mutant spreadsheets by seeding faults in the original
    spreadsheets using the mutation operators given in Table 1. The
    mutation operators have been designed to reflect errors reported in
    spreadsheet literature …”
• Außerlechner et al. [AFW13]
  o “Since MINION is not able to deal with Real numbers …, we created a
    specific spreadsheet corpus that contains spreadsheets with Integer
    values only … Whereas some of the spreadsheets are artificially
    created, 21 spreadsheets are real-life programs …”
[AE08] R. Abraham and M. Erwig: “Test-Driven Goal-Directed Debugging in Spreadsheets.” IEEE Symposium on
Visual Languages and Human-Centric Computing, 2008.
[AFW13] S. Außerlechner et al.: “The Right Choice Matters! SMT Solving Substantially Improves Model-Based
Debugging of Spreadsheets.” QSIC 2013.
Current Situation - Consequences
• Each research group uses its own data set
  o rarely made publicly available
  o often made to fit the evaluated approach
• Comparison of approaches is difficult
We need a corpus that contains …
• Real-world spreadsheets
• Large spreadsheets, not toy examples
• Spreadsheets with real faults, not only seeded faults
• Input/output relations that reveal the fault
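One way to make these requirements concrete is to picture what a single corpus entry might record. The sketch below is purely illustrative: the class name, the field names, and the example values are hypothetical and not taken from any existing corpus. It only shows how an input/output relation can "reveal" a fault, namely when an observed output differs from the expected one.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkEntry:
    """Hypothetical record for one faulty spreadsheet in a shared corpus."""
    name: str
    faulty_cells: set          # cells known to contain the fault
    inputs: dict               # input cell -> value
    expected_outputs: dict     # output cell -> value the correct sheet would yield
    observed_outputs: dict     # output cell -> value the faulty sheet yields
    fault_origin: str = "real"  # "real" or "seeded"

    def failing_outputs(self):
        """Output cells whose observed value deviates from the expected one."""
        return sorted(
            cell for cell, value in self.expected_outputs.items()
            if self.observed_outputs.get(cell) != value
        )

    def reveals_fault(self):
        """The input/output relation reveals the fault if some output fails."""
        return bool(self.failing_outputs())


entry = BenchmarkEntry(
    name="budget-example",     # made-up example data
    faulty_cells={"C3"},
    inputs={"A1": 4, "A2": 6},
    expected_outputs={"D1": 10, "D2": 4},
    observed_outputs={"D1": 12, "D2": 4},
)
print(entry.failing_outputs(), entry.reveals_fault())
```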
Ways to get there
• Laboratory: spreadsheet construction exercises
  o Excellent starting point: Kooper Corpus [AP10]
  o Larger spreadsheets
  o Different domains and exercises
• Real life
[AP10] S. Aurigemma and R. Panko: “The detection of human spreadsheet errors by humans versus
inspection (auditing) software,” CoRR, 2010.
Usability and User Acceptance
• Mostly offline experiments
• Information from the user required, e.g.
  o Correctness of values
  o Expected values
  o Specification of several test cases
Is a user willing / able to provide these inputs?
→ User studies are necessary to answer these questions.
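To see how much a tool may be asking of its user, the three kinds of input listed above can be written down as a small data structure. This is only a sketch under assumed names (`UserTestCase`, `total_user_effort`); counting judgments is a crude, hypothetical proxy for user effort, not a metric proposed in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class UserTestCase:
    """Hypothetical container for what a user must supply for one test case."""
    inputs: dict                                          # input cell -> value
    correct_cells: set = field(default_factory=set)       # values judged correct
    incorrect_cells: set = field(default_factory=set)     # values judged wrong
    expected_values: dict = field(default_factory=dict)   # cell -> expected value

    def judgments(self):
        """Number of individual statements the user has to make."""
        return (len(self.correct_cells)
                + len(self.incorrect_cells)
                + len(self.expected_values))


def total_user_effort(test_cases):
    """Sum of judgments over all test cases a technique requires."""
    return sum(tc.judgments() for tc in test_cases)


cases = [
    UserTestCase(inputs={"A1": 1}, correct_cells={"B1"}, incorrect_cells={"B2"}),
    UserTestCase(inputs={"A1": 5}, incorrect_cells={"B2"}, expected_values={"B2": 10}),
]
print(total_user_effort(cases))
```

Whether users are willing to provide even this much is exactly the open question the slide raises.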
Field research
• Setting
  o Laboratory experiments vs. everyday use
• Participant
  o Students vs. managers
• Scenario
  o Artificial problem vs. real problem
Proposals for future work
• Improve comparability and reproducibility
  o Develop a common benchmark system
• Focus on usability and user acceptance
  o Conduct user studies
• Focus on real-life scenarios (not only laboratory experiments)
  o Conduct field research, questionnaires …
