SlideShare a Scribd company logo
1 of 27
` BugCache Predicting Defects Sung Kim • MIT Tom Zimmermann • Saarland University Jim Whitehead • UC Santa Cruz Andreas Zeller • Saarland University
The Problem How should we allocate our resources for quality assurance? Which files should we focus on?
Which files are most bug-prone? The Problem
Where are bugs? Temporal locality: Defected files are  likely to have more soon. [Ostrand, Weyuker] Spatial locality: In nearby other bugs!   [Zimmermann et al.] In modified files! [Nagappan et al.] In new files! [Graves et al.]
Our Solution ,[object Object],[object Object],Cache
Bug Cache 10% files most defect-prone all files pre-fetch replacement Near by: co changes load
Outline ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Bug Cache load if missed load if missed pre-fetch A Fix  change Non-fix change Fix  change Change history B C
Cache Model Miss Cache size: 2 A B C C
Cache Update Parameter: Block size (neighborhood size) ,[object Object],[object Object],File Number of common changes with  . 1 4 0 C A B D 4 B
Cache Model Hit Miss Miss Cache size: 2 Block size:  2 Hit A B C A D C B B A C A B Which one should be replaced?
Replacement Policies ,[object Object],[object Object],[object Object],Parameter: Replacement Policy
Cache Model Hit Miss Miss Cache size: 2 Block size:  2 Hit Replacement: BUG A B C A D C B B A C A B Block size:  1 Cache size: 2 File LRU CHANGE BUG -5 2 2 -3 3 1 B C BUG 2 1 (replace)
Pre-fill and pre-fetch ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Parameter: Pre-fetch size
Cache Model Hit Miss Miss Cache size: 2 Block size:  2 Replacement: BUG Pre-fetch size: 1 A B C A D C B B A C A B Hit rate = #Hits / #Defects = 25% Pre-fill Pre-fetch Miss D Pre-fetch
Evaluation PostgreSQL jEdit Mozilla Columba
Hit Rates  Cache size = 10%  Block/pre-fetch size = 50% of the cache size Replacement policy = LRU
Exhaustive Evaluation ,[object Object],[object Object],[object Object],[object Object]
Function Level  Default vs Optimal Options   Cache size = 10% of all functions/methods
Function Level  Optimal Hit Rates Project Function Apache 1.3 Columba Eclipse JEdit Mozilla PostgreSQL Subversion 2,113 8,428 33,214 5,489 8,203 8,659 3,693 Cache size = 10% of all functions/methods Hit rate 62% 68% 72% 49% 55% 59% 46% Block 15% 57% 20% 85% 41% 29% 71% Pre-fetch 17% 20% 4% 8% 14% 17% 14% Replace BUG BUG BUG BUG LRU LRU BUG
File Level  Default vs Optimal Options   Cache size = 10% of all files
File Level  Optimal Hit Rates Project Files Apache 1.3 Columba Eclipse JEdit Mozilla PostgreSQL Subversion 154 1,428 3,330 420 396 598 255 Cache size = 10% of all files Hit rate 82% 83% 95% 85% 88% 79% 73% Block 50% 59% 20% 23% 23% 22% 42% Pre-fetch 0% 0% 0% 0% 0% 0% 0% Replace LRU BUG LRU LRU LRU LRU LRU
Related Work In previous work, 10% predicts 44%~78%  20% predicts 71~93% 10% BugCache predicts 73~95%
Summary
BugCache Predicting Defects hit rates of 73%~95%  ,[object Object],[object Object],[object Object]
` BugCache Predicting Defects Sung Kim • MIT Tom Zimmermann • Saarland University Jim Whitehead • UC Santa Cruz Andreas Zeller • Saarland University
Changes that lead to problems  as indicated by  later  fixes. Bug-introducing Changes ... if (foo!=null) { foo.bar(); ... FIX ... if (foo==null) { foo.bar(); ... BUG-INTRODUCING later fixed

More Related Content

Viewers also liked (12)

[B5]memcached scalability-bag lru-deview-100
[B5]memcached scalability-bag lru-deview-100[B5]memcached scalability-bag lru-deview-100
[B5]memcached scalability-bag lru-deview-100
 
Resetting the-LRU-Replacement-Norm
Resetting the-LRU-Replacement-NormResetting the-LRU-Replacement-Norm
Resetting the-LRU-Replacement-Norm
 
1st LRU RAC Partners Meeting: FPE Organizational Structure and Mandates
1st LRU RAC Partners Meeting: FPE Organizational Structure and Mandates1st LRU RAC Partners Meeting: FPE Organizational Structure and Mandates
1st LRU RAC Partners Meeting: FPE Organizational Structure and Mandates
 
Lru Counter
Lru CounterLru Counter
Lru Counter
 
Lru Stack
Lru StackLru Stack
Lru Stack
 
Lru Algorithm
Lru AlgorithmLru Algorithm
Lru Algorithm
 
Page replacement
Page replacementPage replacement
Page replacement
 
Virtual memory
Virtual memoryVirtual memory
Virtual memory
 
cache memory
cache memorycache memory
cache memory
 
Cache memory
Cache memoryCache memory
Cache memory
 
Virtual memory ppt
Virtual memory pptVirtual memory ppt
Virtual memory ppt
 
Cache memory presentation
Cache memory presentationCache memory presentation
Cache memory presentation
 

Similar to Predicting Faults from Cached History

A tale of bug prediction in software development
A tale of bug prediction in software developmentA tale of bug prediction in software development
A tale of bug prediction in software developmentMartin Pinzger
 
advanced Google file System
advanced Google file Systemadvanced Google file System
advanced Google file Systemdiptipan
 
Continuous Performance Testing for Microservices
Continuous Performance Testing for MicroservicesContinuous Performance Testing for Microservices
Continuous Performance Testing for MicroservicesVincenzo Ferme
 
Advance google file system
Advance google file systemAdvance google file system
Advance google file systemLalit Rastogi
 
Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...Antonio Cesarano
 
Introduction to performance tuning perl web applications
Introduction to performance tuning perl web applicationsIntroduction to performance tuning perl web applications
Introduction to performance tuning perl web applicationsPerrin Harkins
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairClaire Le Goues
 
The Impact of Tangled Code Changes
The Impact of Tangled Code ChangesThe Impact of Tangled Code Changes
The Impact of Tangled Code ChangesKim Herzig
 
Web Performance Automation - NY Web Performance Meetup
Web Performance Automation - NY Web Performance MeetupWeb Performance Automation - NY Web Performance Meetup
Web Performance Automation - NY Web Performance MeetupStrangeloop
 
Enery efficient data prefetching
Enery efficient data prefetchingEnery efficient data prefetching
Enery efficient data prefetchingHimanshu Koli
 
GFS - Google File System
GFS - Google File SystemGFS - Google File System
GFS - Google File Systemtutchiio
 
Google File System
Google File SystemGoogle File System
Google File SystemDreamJobs1
 
Using Developer Information as a Prediction Factor
Using Developer Information as a Prediction FactorUsing Developer Information as a Prediction Factor
Using Developer Information as a Prediction FactorTim Menzies
 
Scalable Apache for Beginners
Scalable Apache for BeginnersScalable Apache for Beginners
Scalable Apache for Beginnerswebhostingguy
 
AgileMidwest2018-Becker-DatabasesAndCattle
AgileMidwest2018-Becker-DatabasesAndCattleAgileMidwest2018-Becker-DatabasesAndCattle
AgileMidwest2018-Becker-DatabasesAndCattleJason Tice
 
MongoDB Memory Management Demystified
MongoDB Memory Management DemystifiedMongoDB Memory Management Demystified
MongoDB Memory Management DemystifiedMongoDB
 

Similar to Predicting Faults from Cached History (20)

A tale of bug prediction in software development
A tale of bug prediction in software developmentA tale of bug prediction in software development
A tale of bug prediction in software development
 
ZGC-SnowOne.pdf
ZGC-SnowOne.pdfZGC-SnowOne.pdf
ZGC-SnowOne.pdf
 
advanced Google file System
advanced Google file Systemadvanced Google file System
advanced Google file System
 
Continuous Performance Testing for Microservices
Continuous Performance Testing for MicroservicesContinuous Performance Testing for Microservices
Continuous Performance Testing for Microservices
 
Advance google file system
Advance google file systemAdvance google file system
Advance google file system
 
Overview of the Data Processing Error Analysis System (DPEAS)
Overview of the Data Processing Error Analysis System (DPEAS)Overview of the Data Processing Error Analysis System (DPEAS)
Overview of the Data Processing Error Analysis System (DPEAS)
 
Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...Cluster based storage - Nasd and Google file system - advanced operating syst...
Cluster based storage - Nasd and Google file system - advanced operating syst...
 
Introduction to performance tuning perl web applications
Introduction to performance tuning perl web applicationsIntroduction to performance tuning perl web applications
Introduction to performance tuning perl web applications
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
 
The Impact of Tangled Code Changes
The Impact of Tangled Code ChangesThe Impact of Tangled Code Changes
The Impact of Tangled Code Changes
 
Web Performance Automation - NY Web Performance Meetup
Web Performance Automation - NY Web Performance MeetupWeb Performance Automation - NY Web Performance Meetup
Web Performance Automation - NY Web Performance Meetup
 
Tuning Java Servers
Tuning Java Servers Tuning Java Servers
Tuning Java Servers
 
tittle
tittletittle
tittle
 
Enery efficient data prefetching
Enery efficient data prefetchingEnery efficient data prefetching
Enery efficient data prefetching
 
GFS - Google File System
GFS - Google File SystemGFS - Google File System
GFS - Google File System
 
Google File System
Google File SystemGoogle File System
Google File System
 
Using Developer Information as a Prediction Factor
Using Developer Information as a Prediction FactorUsing Developer Information as a Prediction Factor
Using Developer Information as a Prediction Factor
 
Scalable Apache for Beginners
Scalable Apache for BeginnersScalable Apache for Beginners
Scalable Apache for Beginners
 
AgileMidwest2018-Becker-DatabasesAndCattle
AgileMidwest2018-Becker-DatabasesAndCattleAgileMidwest2018-Becker-DatabasesAndCattle
AgileMidwest2018-Becker-DatabasesAndCattle
 
MongoDB Memory Management Demystified
MongoDB Memory Management DemystifiedMongoDB Memory Management Demystified
MongoDB Memory Management Demystified
 

More from Sung Kim

DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningDeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningSung Kim
 
Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Sung Kim
 
Time series classification
Time series classificationTime series classification
Time series classificationSung Kim
 
Tensor board
Tensor boardTensor board
Tensor boardSung Kim
 
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...Sung Kim
 
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Sung Kim
 
A Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesA Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesSung Kim
 
Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Sung Kim
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSung Kim
 
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Sung Kim
 
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Sung Kim
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...Sung Kim
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)Sung Kim
 
Source code comprehension on evolving software
Source code comprehension on evolving softwareSource code comprehension on evolving software
Source code comprehension on evolving softwareSung Kim
 
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
A Survey on  Dynamic Symbolic Execution  for Automatic Test GenerationA Survey on  Dynamic Symbolic Execution  for Automatic Test Generation
A Survey on Dynamic Symbolic Execution for Automatic Test GenerationSung Kim
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect PredictionSung Kim
 
MSR2014 opening
MSR2014 openingMSR2014 opening
MSR2014 openingSung Kim
 
Personalized Defect Prediction
Personalized Defect PredictionPersonalized Defect Prediction
Personalized Defect PredictionSung Kim
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSTAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSung Kim
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learningSung Kim
 

More from Sung Kim (20)

DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningDeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
 
Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)
 
Time series classification
Time series classificationTime series classification
Time series classification
 
Tensor board
Tensor boardTensor board
Tensor board
 
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
 
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
 
A Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesA Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution Techniques
 
Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled Datasets
 
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
 
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
 
Source code comprehension on evolving software
Source code comprehension on evolving softwareSource code comprehension on evolving software
Source code comprehension on evolving software
 
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
A Survey on  Dynamic Symbolic Execution  for Automatic Test GenerationA Survey on  Dynamic Symbolic Execution  for Automatic Test Generation
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect Prediction
 
MSR2014 opening
MSR2014 openingMSR2014 opening
MSR2014 opening
 
Personalized Defect Prediction
Personalized Defect PredictionPersonalized Defect Prediction
Personalized Defect Prediction
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSTAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash Reproduction
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learning
 

Recently uploaded

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdfChristopherTHyatt
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Predicting Faults from Cached History

  • 1. ` BugCache Predicting Defects Sung Kim • MIT Tom Zimmermann • Saarland University Jim Whitehead • UC Santa Cruz Andreas Zeller • Saarland University
  • 2. The Problem How should we allocate our resources for quality assurance? Which files should we focus on?
  • 3. Which files are most bug-prone? The Problem
  • 4. Where are bugs? Temporal locality: Defected files are likely to have more soon. [Ostrand, Weyuker] Spatial locality: In nearby other bugs! [Zimmermann et al.] In modified files! [Nagappan et al.] In new files! [Graves et al.]
  • 5.
  • 6. Bug Cache 10% files most defect-prone all files pre-fetch replacement Near by: co changes load
  • 7.
  • 8. Bug Cache load if missed load if missed pre-fetch A Fix change Non-fix change Fix change Change history B C
  • 9. Cache Model Miss Cache size: 2 A B C C
  • 10.
  • 11. Cache Model Hit Miss Miss Cache size: 2 Block size: 2 Hit A B C A D C B B A C A B Which one should be replaced?
  • 12.
  • 13. Cache Model Hit Miss Miss Cache size: 2 Block size: 2 Hit Replacement: BUG A B C A D C B B A C A B Block size: 1 Cache size: 2 File LRU CHANGE BUG -5 2 2 -3 3 1 B C BUG 2 1 (replace)
  • 14.
  • 15. Cache Model Hit Miss Miss Cache size: 2 Block size: 2 Replacement: BUG Pre-fetch size: 1 A B C A D C B B A C A B Hit rate = #Hits / #Defects = 25% Pre-fill Pre-fetch Miss D Pre-fetch
  • 16. Evaluation PostgreSQL jEdit Mozilla Columba
  • 17. Hit Rates Cache size = 10% Block/pre-fetch size = 50% of the cache size Replacement policy = LRU
  • 18.
  • 19. Function Level Default vs Optimal Options Cache size = 10% of all functions/methods
  • 20. Function Level Optimal Hit Rates Project Function Apache 1.3 Columba Eclipse JEdit Mozilla PostgreSQL Subversion 2,113 8,428 33,214 5,489 8,203 8,659 3,693 Cache size = 10% of all functions/methods Hit rate 62% 68% 72% 49% 55% 59% 46% Block 15% 57% 20% 85% 41% 29% 71% Pre-fetch 17% 20% 4% 8% 14% 17% 14% Replace BUG BUG BUG BUG LRU LRU BUG
  • 21. File Level Default vs Optimal Options Cache size = 10% of all files
  • 22. File Level Optimal Hit Rates Project Files Apache 1.3 Columba Eclipse JEdit Mozilla PostgreSQL Subversion 154 1,428 3,330 420 396 598 255 Cache size = 10% of all files Hit rate 82% 83% 95% 85% 88% 79% 73% Block 50% 59% 20% 23% 23% 22% 42% Pre-fetch 0% 0% 0% 0% 0% 0% 0% Replace LRU BUG LRU LRU LRU LRU LRU
  • 23. Related Work In previous work, 10% predicts 44%~78% 20% predicts 71~93% 10% BugCache predicts 73~95%
  • 25.
  • 26. ` BugCache Predicting Defects Sung Kim • MIT Tom Zimmermann • Saarland University Jim Whitehead • UC Santa Cruz Andreas Zeller • Saarland University
  • 27. Changes that lead to problems as indicated by later fixes. Bug-introducing Changes ... if (foo!=null) { foo.bar(); ... FIX ... if (foo==null) { foo.bar(); ... BUG-INTRODUCING later fixed

Editor's Notes

  1. Usage Coupling