Zhen Ming Jiang, Ahmed E. Hassan
Software Analysis and Intelligence (SAIL) Lab
Queen’s University, Canada
Gilbert Hamann, Parminder Flora
Enterprise Performance Engineering,
Research In Motion (RIM), Canada
Abstracting Execution Logs to Execution
Events for Enterprise Applications
How many types of “errors” are there?
■ One RIM application generates 1.6 million
log lines (in 8 hours) and 23,000 lines
contain “fail” or “failure”
– 319 execution events in total, of which 16
contain “fail” or “failure”
Events Frequency
Error occurred during purchasing, item=$v 500
Error! Cannot retrieve catalogs for user=$v 300
Authentication error for user=$v 100
1. User checkout for accountID(Tom), item=100
2. User checkout for accountID(Jenny), item=100
3. Item shipped for accountID(Tom), item=100
4. User checkout for accountID(John), item=100
Abstracting Log Lines to
Execution Events
Events Lines
User checkout for accountID($v), item=$v 1, 2, 4
Item shipped for accountID($v), item=$v 3
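A minimal sketch of the mapping shown above, assuming two hand-written rules (one for `accountID(...)` and one for `item=...`). The function names `to_event` and `group_by_event` are hypothetical and only illustrate the idea of collapsing log lines into events:

```python
import re
from collections import defaultdict

def to_event(line):
    # Assumed rules: mask the account name and the item value with $v.
    line = re.sub(r"accountID\([^)]*\)", "accountID($v)", line)
    return re.sub(r"item=\S+", "item=$v", line)

def group_by_event(lines):
    # Map each execution event to the (1-based) log lines that produce it.
    groups = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        groups[to_event(line)].append(number)
    return dict(groups)

log = [
    "User checkout for accountID(Tom), item=100",
    "User checkout for accountID(Jenny), item=100",
    "Item shipped for accountID(Tom), item=100",
    "User checkout for accountID(John), item=100",
]
for event, numbers in group_by_event(log).items():
    print(event, numbers)
```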
Clone Detection Approach
- Parameterized Token Matching
Running CCFinder on Logs
■ Won’t work for large files
■ Unsatisfying results
■ Because log lines do not have
– Delimiters like “;” or “}”
– Keywords like “if”, “for”
Working Example
1. Start check out
2. Paid for, item=bag, quantity=1, amount=100
3. Paid for, item=book, quantity=3, amount=150
4. Check out, total amount is 250
5. Check out done
Our Log Abstraction
Approach
3_0_1 1. Start check out
3_0_2 5. Check out done
5_1_1 4. Check out, total amount=$v
8_3_1 2. Paid for, item=$v, quantity=$v, amount=$v
8_3_1 3. Paid for, item=$v, quantity=$v, amount=$v
Anonymize
1. Start check out
2. Paid for, item=bag, quantity=1, amount=100
3. Paid for, item=book, quantity=3, amount=150
4. Check out, total amount is 250
5. Check out done
1. Start check out
2. Paid for, item=$v, quantity=$v, amount=$v
3. Paid for, item=$v, quantity=$v, amount=$v
4. Check out, total amount=$v
5. Check out done
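The Anonymize step can be sketched as a list of rewrite rules. The two rules below ("key=value" and "key is value") are assumptions inferred from the working example, not the authors' actual rule set, and the "is" rule would over-match on ordinary English sentences:

```python
import re

# Hypothetical anonymization rules: (pattern, replacement).
RULES = [
    (re.compile(r"(\w+)=[^\s,]+"), r"\1=$v"),     # e.g. item=bag
    (re.compile(r"(\w+) is [^\s,]+"), r"\1=$v"),  # e.g. amount is 250
]

def anonymize(line):
    # Apply every rule in order; dynamic values become the $v placeholder.
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line

for line in [
    "Start check out",
    "Paid for, item=bag, quantity=1, amount=100",
    "Check out, total amount is 250",
]:
    print(anonymize(line))
```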
Tokenize
1. Start check out
2. Paid for, item=$v, quantity=$v, amount=$v
3. Paid for, item=$v, quantity=$v, amount=$v
4. Check out, total amount=$v
5. Check out done
(3, 0) 1. Start check out
5. Check out done
(5, 1) 4. Check out, total amount=$v
(8, 3) 2. Paid for, item=$v, quantity=$v, amount=$v
3. Paid for, item=$v, quantity=$v, amount=$v
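The Tokenize step places each anonymized line in a bin keyed by (number of words, number of parameters). A rough sketch, assuming tokens are separated by whitespace, commas, and "=" (an assumption that reproduces the counts on this slide); `bin_key` and `bin_lines` are illustrative names:

```python
import re
from collections import defaultdict

def bin_key(line):
    # Split on whitespace, "," and "="; count all tokens and $v placeholders.
    tokens = re.findall(r"[^\s,=]+", line)
    return (len(tokens), tokens.count("$v"))

def bin_lines(lines):
    # Group (line number, text) pairs by their (words, parameters) key.
    bins = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        bins[bin_key(line)].append((number, line))
    return dict(bins)

anonymized = [
    "Start check out",
    "Paid for, item=$v, quantity=$v, amount=$v",
    "Paid for, item=$v, quantity=$v, amount=$v",
    "Check out, total amount=$v",
    "Check out done",
]
for key, entries in sorted(bin_lines(anonymized).items()):
    print(key, [number for number, _ in entries])
```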
Categorize
3_0_1 1. Start check out
3_0_2 5. Check out done
5_1_1 4. Check out, total amount=$v
8_3_1 2. Paid for, item=$v, quantity=$v, amount=$v
8_3_1 3. Paid for, item=$v, quantity=$v, amount=$v
(3, 0) 1. Start check out
5. Check out done
(5, 1) 4. Check out, total amount=$v
(8, 3) 2. Paid for, item=$v, quantity=$v, amount=$v
3. Paid for, item=$v, quantity=$v, amount=$v
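The Categorize step can be sketched as follows, assuming that identical anonymized lines within a bin share one event and that event IDs take the form words_params_sequence as on the slide. The function name `categorize` and the input layout are assumptions:

```python
def categorize(bins):
    # bins: {(words, params): [(line number, anonymized text), ...]}
    # Returns {event id: (template, [line numbers])}.
    events = {}
    for (words, params), entries in sorted(bins.items()):
        ids = {}  # template text -> event id within this bin
        for number, text in entries:
            if text not in ids:
                ids[text] = f"{words}_{params}_{len(ids) + 1}"
            events.setdefault(ids[text], (text, []))[1].append(number)
    return events

bins = {
    (3, 0): [(1, "Start check out"), (5, "Check out done")],
    (5, 1): [(4, "Check out, total amount=$v")],
    (8, 3): [
        (2, "Paid for, item=$v, quantity=$v, amount=$v"),
        (3, "Paid for, item=$v, quantity=$v, amount=$v"),
    ],
}
for event_id, (template, numbers) in categorize(bins).items():
    print(event_id, numbers, template)
```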
Reconcile
5_0_1 Start processing for user Jen
5_0_2 Start processing for user Tom
5_0_3 Start processing for user Henry
5_0_4 Start processing for user Jack
5_0_5 Start processing for user Peter
5_0_1 Start processing for user $v
Reconcile
(6, 2) User shopping basket contains: 1, 2
(7, 3) User shopping basket contains: 1, 2, 3
(8, 4) User shopping basket contains: 1, 2, 3, 4
6_2_1 User shopping basket contains: $v
6_2_1 User shopping basket contains: $v
7_3_1 User shopping basket contains: $v
8_4_1 User shopping basket contains: $v
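One possible reading of the first Reconcile example ("Start processing for user ...") is: when enough events in the same bin differ only in a single token position, merge them and replace that token with $v. The sketch below implements just that heuristic; the threshold `min_group=5` and the function name `reconcile` are assumptions, and the second example (merging across bins) is not covered:

```python
from collections import defaultdict

def reconcile(templates, min_group=5):
    # templates: distinct event templates from one bin (same token count).
    if not templates:
        return []
    rows = [t.split() for t in templates]
    width = len(rows[0])
    merged = {}  # row index -> reconciled template
    for pos in range(width):
        groups = defaultdict(list)
        for idx, row in enumerate(rows):
            if idx in merged or len(row) != width:
                continue
            # Mask one position and group rows that agree everywhere else.
            key = tuple(row[:pos] + ["$v"] + row[pos + 1:])
            groups[key].append(idx)
        for key, members in groups.items():
            distinct = {rows[idx][pos] for idx in members}
            if len(distinct) >= min_group:
                for idx in members:
                    merged[idx] = " ".join(key)
    result = []
    for idx, template in enumerate(templates):
        text = merged.get(idx, template)
        if text not in result:
            result.append(text)
    return result

users = ["Jen", "Tom", "Henry", "Jack", "Peter"]
print(reconcile([f"Start processing for user {u}" for u in users]))
```

With fewer than `min_group` variants the templates are left untouched, which keeps genuinely different events (such as "Start check out" and "Check out done") apart.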
Measuring the Performance
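\[ PC(e) = \{A, B, C\}, \quad PF(e) = \{D, E\}, \quad PM(e) = \{F\} \]
\[ \mathit{precision}(e) = \frac{|PC(e)|}{|PC(e)| + |PF(e)|} = \frac{3}{3 + 2} = 60\% \]
\[ \mathit{recall}(e) = \frac{|PC(e)|}{|PC(e)| + |PM(e)|} = \frac{3}{3 + 1} = 75\% \]
\[ \mathit{Average\ Precision} = \frac{1}{k} \times \sum_{i=1}^{k} \mathit{precision}(e_i) \]
\[ \mathit{Average\ Recall} = \frac{1}{k} \times \sum_{i=1}^{k} \mathit{recall}(e_i) \]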
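The slide's example can be checked with a few lines of code, taking precision as |PC| / (|PC| + |PF|) and recall as |PC| / (|PC| + |PM|), averaged over k events. The function names are illustrative, and representing PC, PF, and PM as Python sets is an assumption:

```python
def precision(pc, pf):
    # |PC(e)| / (|PC(e)| + |PF(e)|)
    return len(pc) / (len(pc) + len(pf))

def recall(pc, pm):
    # |PC(e)| / (|PC(e)| + |PM(e)|)
    return len(pc) / (len(pc) + len(pm))

def average(values):
    # (1 / k) * sum over the k events
    values = list(values)
    return sum(values) / len(values)

pc, pf, pm = {"A", "B", "C"}, {"D", "E"}, {"F"}
print(f"precision = {precision(pc, pf):.0%}")
print(f"recall    = {recall(pc, pm):.0%}")
```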
Measuring the Performance
- Getting the Correct Execution Events
■ Simply searching for “printf” or “System.out” won’t work
■ We use
– Internationalization file
– Random sampling
Case Study
RIM App 1      723,608
RIM App 2    1,688,876
LoadSim         67,651
Blue Gene/L  2,994,986
■ 4 Applications
■ Other similar log abstraction tools
– Terrify
– SLCT
SLCT
■ Uses Frequent Itemset Mining
Performance Comparison
Discussion
- SLCT Performance
■ SLCT performance is not high, because
– Infrequent log lines are not abstracted
– Does not further abstract line patterns
Discussion
- Our heuristics
■ Adjusting our heuristics
– Anonymization rules
– Reconcile step
Conclusions
How many types of “errors” are
there?
Events Frequency
Error occurred during purchasing, item=$v 500
Error! Cannot retrieve catalogs for user=$v 300
Authentication error for user=$v 100
Our Log Abstraction
Approach
3_0_1 1. Start check out
3_0_2 5. Check out done
5_1_1 4. Check out, total amount=$v
8_3_1 2. Paid for, item=$v, quantity=$v, amount=$v
8_3_1 3. Paid for, item=$v, quantity=$v, amount=$v
Measuring the Performance
Performance Comparison
