Mining Development Knowledge to
Understand and Support Software
Logging Practices
Heng Li
Supervisor: Dr. Ahmed E. Hassan
Software Analysis & Intelligence Lab (SAIL)
Queen’s University, Canada
Developers insert logging code that
produces log messages at runtime
2
Log()
Logging
code
Log
messages
Software
system
Log.info("Stopping server on " + port);
2016-07-23 17:56:16 INFO Stopping server on 8032
Log messages record valuable runtime information
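The relationship between the logging statement and the log message above can be sketched as follows. This is a minimal illustration, not the logging library used in the studied systems: the class name LogLineSketch and the helper format are hypothetical, and the timestamp layout is simply assumed from the example message.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class LogLineSketch {
    // Timestamp layout assumed from the example log message on the slide.
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Builds one log line: timestamp, log level, then the developer-written message.
    static String format(LocalDateTime when, String level, String message) {
        return when.format(TS) + " " + level + " " + message;
    }

    public static void main(String[] args) {
        int port = 8032; // dynamic value that is only known at runtime
        LocalDateTime when = LocalDateTime.of(2016, 7, 23, 17, 56, 16);
        System.out.println(format(when, "INFO", "Stopping server on " + port));
    }
}
```

The static text ("Stopping server on ") comes from the logging code, while the timestamp, level, and port value are filled in at runtime.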
Diagnose
failures
Logging is critical for software maintenance
Detect
anomalies
Log messages are widely used in software
maintenance efforts
3
Understand
runtime
behaviors
Fu et al., Contextual analysis of program logs
for understanding system behaviors. MSR ‘13
Yuan et al., Sherlog: Error diagnosis by
connecting clues from run-time logs. ASPLOS ‘10
Xu et al., Detecting large-scale system
problems by mining console logs. SOSP ‘09
Developers have difficulties deciding on
appropriate logging code
4
“A lot of log
noise”
“Slowing
down perf
by 20%”
“Missing an
error log”
Developers spend a significant amount of effort
maintaining their logging code
§ Logging practices in open source projects
[Yuan et al., 2012; Chen and Jiang, 2017]
§ Logging practices in industry
[Shang et al., 2014; Fu et al., 2014]
Prior
work
Development knowledge explains
the development of logging code
5
− LOG.info(msg);
+ LOG.warn(msg);
To help users
identify a problem
LOG.warn(msg);
What How Why
Change history, Source code, Issue reports
Thesis statement
Development knowledge can help us understand
current logging practices and develop useful tools
to support such logging practices
6
Change history, Source code, Issue reports
Development knowledge
Mining development knowledge to
understand and support logging practices
7
Developers’
logging concerns?
[TSE under review]
Where to log?
When to update
log? How to log?
[EMSE 2018]
[EMSE 2017] [EMSE 2017]
Error
Warn
Info
Developers communicate their logging
concerns in issue reports
9
Logging cost: performance overhead
Remove a logging statement
Developers communicate their logging
concerns in issue reports
10
Add a logging statement
Logging benefit:
exposing runtime problems
We study logging-related issue reports to
understand developers’ logging concerns
11
Logging
issue
reports
Logging
concerns
Automated
& manual
filtering
Qualitative
analysis
What are developers’ logging concerns?
12
Logging Benefits
§ Assisting in debugging
§ Providing runtime perf
§ Exposing runtime problems
§ Bookkeeping
§ Showing execution progress
Logging Costs
§ Excessive log information
§ Exposing unnecessary details
§ Misleading end users
§ Performance overhead
§ Exposing sensitive info
Research opportunities: leverage the benefits, minimize the costs
Categories are shown by frequency.
Mining development knowledge to
understand and support logging practices
13
Developers’
logging concerns?
[TSE under review]
Where to log?
When to update
log? How to log?
[EMSE 2018]
[EMSE 2017] [EMSE 2017]
Error
Warn
Info
10 categories of
logging concerns
(e.g., misleading users)
Some code topics are more likely to need
logging statements
15
Examples of JIRA issues that require developers to log
the topic of “connections”
[EMSE 2018]
Can code topics explain where to log?
Topic: “connection”
Logging statement
[EMSE 2018]
16
We extract the code topics and logging statements for
each code snippet (method level)
We use LDA to extract code topics
Logging statement
[EMSE 2018]
17
Tokenization
Topic model
(LDA)
queue, connection
Topic: “connection”
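The tokenization step that precedes the topic model can be sketched as below. The slides do not spell out the exact preprocessing used in the study, so this is only an assumed, simplified version: it splits a code snippet on non-letter characters and camelCase boundaries and lowercases the result. The class name TokenizeSketch and the sample snippet are hypothetical, and LDA itself is not implemented here.

```java
import java.util.ArrayList;
import java.util.List;

class TokenizeSketch {
    // Splits source text into lowercase word tokens.
    // Assumption: identifiers are camelCase; digits and punctuation are dropped.
    static List<String> tokenize(String code) {
        List<String> tokens = new ArrayList<>();
        // Split at lower-to-upper case boundaries or at runs of non-letter characters.
        for (String part : code.split("(?<=[a-z])(?=[A-Z])|[^A-Za-z]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical method body fragment; the tokens would feed a topic model.
        System.out.println(tokenize("queue.closeConnection(timeoutMs);"));
    }
}
```

Tokens such as "queue" and "connection" are the kind of words that a topic model would then group into a topic like “connection”.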
A small number of topics are much more
likely to be logged
Topic: “connection”
Logging statement
The most log-intensive topics usually capture
communication between machines (e.g., ”connection”) or
interactions between threads (e.g., “thread interruption”)
[EMSE 2018]
18
We combine both the structure and topic
info to explain where to log
Topic: “connection”
Logging statement
Structure info: lines of
code, complexity, control
flow statements, etc.
[EMSE 2018]
19
We combine both the structure and topic
info to explain where to log
Topic: “connection”
Logging statement
Structure info: lines of
code, complexity, control
flow statements, etc.
LASSO model
[EMSE 2018]
20
Code topics bring additional explanatory
power (up to 13% AUC improvement)
21
[Bar chart: the performance (AUC) of our LASSO models, comparing structure info with structure & topic info; y-axis AUC from 0.5 (random guess) to 1. Bar values in extraction order: 0.82, 0.86, 0.80, 0.86, 0.83, 0.96, 0.87, 0.94, 0.90, 0.90, 0.88, 0.99.]
[EMSE 2018]
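For readers unfamiliar with the metric, AUC can be read as the probability that a model scores a randomly chosen positive example (here, a logged method) above a randomly chosen negative one, which is why 0.5 corresponds to a random guess. The sketch below computes it by direct pairwise comparison. It is an illustration of the metric only, not the evaluation code of the study; the class name AucSketch and the sample scores are made up.

```java
class AucSketch {
    // AUC as the fraction of (positive, negative) pairs in which the positive
    // example receives the higher score; ties count as half a win.
    static double auc(double[] scores, boolean[] labels) {
        double wins = 0.0;
        long pairs = 0;
        for (int i = 0; i < scores.length; i++) {
            if (!labels[i]) continue;          // i ranges over positives
            for (int j = 0; j < scores.length; j++) {
                if (labels[j]) continue;       // j ranges over negatives
                pairs++;
                if (scores[i] > scores[j]) {
                    wins += 1.0;
                } else if (scores[i] == scores[j]) {
                    wins += 0.5;
                }
            }
        }
        if (pairs == 0) {
            throw new IllegalArgumentException("need at least one positive and one negative");
        }
        return wins / pairs;
    }

    public static void main(String[] args) {
        // Hypothetical predicted probabilities and ground-truth labels.
        double[] scores = {0.9, 0.7, 0.4, 0.2};
        boolean[] labels = {true, false, true, false};
        System.out.println("AUC = " + auc(scores, labels));
    }
}
```

The quadratic loop is chosen for clarity; a rank-based computation would be preferable for large datasets.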
Mining development knowledge to
understand and support logging practices
22
Developers’
logging concerns?
[TSE under review]
Where to log?
When to update
log? How to log?
[EMSE 2018]
[EMSE 2017] [EMSE 2017]
Error
Warn
Info
Logging varies
across code topics
Developers have difficulty making
appropriate log changes
24
Developers usually forget to change logging code when
they change their code; in many cases, logging code is
written as “after-thoughts” after a failure happens and
logs are needed [Yuan et al., 2012]
Commit n Commit n+1
Code
changes
Log
changes
Version k
Debugging
difficulties
Code change history
Maintenance
efforts
Learning from the code change history to
provide log change suggestions
25
[EMSE 2017]
Code Code Log Code Log
?
Commit 1 Commit 2 Commit n…
Code changes
without log
changes
Code changes
with log
changes
Do we need to
change logs?
Code change history
LOG?
Providing automated suggestions for log
changes when developers change the code
26
Random Forest
Classifier
Log change
suggestions
Three dimensions
25 metrics
Change
metrics
Historical
metrics
Product
metrics
[EMSE 2017]
Code
Our models can effectively suggest whether
a log change is needed
27
[Bar chart: the performance (AUC) of our Random Forest models; y-axis AUC from 0.5 (random guess) to 1. Bar values in extraction order: 0.84, 0.91, 0.86, 0.88.]
[EMSE 2017]
LOG?
The source code and code changes are
important for explaining log changes
28
Log change
suggestions
Three dimensions
25 metrics
Change
metrics
Historical
metrics
Product
metrics
[EMSE 2017]
Code
Explain
Mining development knowledge to
understand and support logging practices
29
Developers’
logging concerns?
[TSE under review]
Where to log?
When to update
log? How to log?
[EMSE 2018]
[EMSE 2017] [EMSE 2017]
Error
Warn
Info
The source code &
code changes can
explain log changes
Log levels are used to disable some verbose
log messages while enabling important ones
31
Log levels, from more verbose (lower levels) to less verbose (higher levels):
Trace, Debug, Info, Warn, Error, Fatal
Log.error("message") (log level: error)
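How a level threshold disables verbose messages while keeping important ones can be sketched as below. The six level names are taken from the slide; the class name LogLevelSketch, the isEnabled helper, and the INFO threshold are assumptions for illustration and do not reproduce any particular logging library.

```java
class LogLevelSketch {
    // Declared from most verbose (lowest) to least verbose (highest).
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR, FATAL }

    // A message is emitted only if its level is at or above the configured threshold.
    static boolean isEnabled(Level messageLevel, Level threshold) {
        return messageLevel.compareTo(threshold) >= 0;
    }

    public static void main(String[] args) {
        Level threshold = Level.INFO; // hypothetical production setting
        for (Level level : Level.values()) {
            System.out.println(level + " enabled: " + isEnabled(level, threshold));
        }
    }
}
```

With the threshold at INFO, Trace and Debug messages are suppressed, which is why choosing too high a level for a noisy statement, or too low a level for an important one, has visible consequences.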
Improper log levels can have many
negative impacts
32
“…tends to generate a lot
of log noise…”
“These warnings worry
users”
Developers spend much effort adjusting log levels
[Yuan et al., 2012]
Learning from the code change history to
provide log level suggestions
33
[EMSE 2017]
Commit 1 Commit 2 Commit n…
Code change history
Log.warn(msg) Log.info(msg) Log. ? (msg)
Log.error(msg)
Which log level
to use?
Providing automated suggestions for log
levels when developers add logging code
34
Logging statement metrics
Containing block metrics
Containing file metrics
Code change metrics
Historical change metrics
Trace
Debug
Info
Warn
Error
Fatal
Ordinal
Regression
Model
[EMSE 2017]
Ordinal regression models can effectively
model log levels
35
[Bar chart: the performance (AUC) of our Ordinal Regression Models; y-axis AUC from 0.5 (random guess) to 0.9. Bar values in extraction order: 0.76, 0.78, 0.81, 0.75.]
[EMSE 2017]
The content of a logging statement and the
containing block/file explain its log level
36
Logging statement metrics
Containing block metrics
Containing file metrics
Code change metrics
Historical change metrics
Trace
Debug
Info
Warn
Error
Fatal
[EMSE 2017]
Explain
Mining development knowledge to
understand and support logging practices
37
Developers’
logging concerns?
[TSE under review]
Where to log?
When to update
log? How to log?
[EMSE 2018]
[EMSE 2017] [EMSE 2017]
Error
Warn
Info
The log content &
containing blocks/files
can explain log levels
Mining development knowledge to
understand and support logging practices
38
Developers’
logging concerns?
[TSE under review]
Where to log?
When to update
log? How to log?
[EMSE 2018]
[EMSE 2017] [EMSE 2017]
Logging varies
across code topics
Error
Warn
Info
The source code &
code changes can
explain log changes
The log content &
containing blocks/files
can explain log levels
10 categories of
logging concerns
(e.g., misleading users)
References
§ Fu, Q., Lou, J. G., Lin, Q., Ding, R., Zhang, D., and Xie, T. (2013). Contextual analysis of program logs for
understanding system behaviors. In Proceedings of the 10th Working Conference on Mining Software
Repositories, MSR ’13, pages 397–400.
§ Xu, W., Huang, L., Fox, A., Patterson, D., and Jordan, M. I. (2009). Detecting large-scale system problems by
mining console logs. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles,
SOSP ’09, pages 117–132.
§ Yuan, D., Mai, H., Xiong, W., Tan, L., Zhou, Y., and Pasupathy, S. (2010). Sherlog: Error diagnosis by connecting
clues from run-time logs. In Proceedings of the 15th International Conference on Architectural Support for
Programming Languages and Operating Systems, ASPLOS ’10, pages 143–154.
§ Yuan, D., Park, S., and Zhou, Y. (2012). Characterizing logging practices in open source software. In Proceedings
of the 34th International Conference on Software Engineering, ICSE ’12, pages 102–112.
§ Chen, B. and Jiang, Z. M. J. (2017). Characterizing logging practices in Java-based open source software projects
– a replication study in apache software foundation. Empirical Software Engineering, 22(1):330–374.
§ Shang, W., Jiang, Z. M., Adams, B., Hassan, A. E., Godfrey, M. W., Nasser, M., and Flora, P. (2014). An
exploratory study of the evolution of communicated information about the execution of large software
systems. Journal of Software: Evolution and Process, 26(1):3–26.
§ Fukushima, T., Kamei, Y., McIntosh, S., Yamashita, K., and Ubayashi, N. (2014). An empirical study of just-in-time
defect prediction using cross-project models. In Proceedings of the 11th Working Conference on Mining
Software Repositories, MSR 2014, pages 172–181.
39
Extra slides
40
Log()
Literature review
41
Mining
logging
code
Mining log messages
Improving
logging
code
Log()
Mining log messages
42
Understanding runtime behaviors
[Fu et al., 2013; Hassan et al., 2008; Shang et al., 2013]
Detecting anomaly conditions
[Xu et al., 2008, 2009; Fu et al., 2009; Jiang et al., 2008]
Diagnosing system failures
[Yuan et al., 2010; Syer et al., 2013]
Prior work highlights the importance of improving
logging quality
Mining logging code
43
Logging practices in open source projects
[Yuan et al., 2012; Chen and Jiang, 2017]
Logging practices in industry
[Fu et al., 2014; Pecchia et al., 2015]
Evolution of logging code
[Shang et al., 2011; Kabinna et al., 2016]
Log()
Developers spend much effort maintaining their logging
Software logging is a common practice
Improving logging code: proactive logging
44
Proactively adding logging info in the source
code
[Yuan et al., 2011, 2012; Zhao et al., 2017]
Log()
Producing excessive log information
Developers’ expertise and concerns are not considered
Improving logging code: learning to log
45
Learning statistical models to suggest where
to log
[Zhu et al., 2015; Lal and Sureka, 2016; Jia et al., 2018]
Ignoring logging patterns (e.g., log level, stack trace)
Log()
Focusing on one dim. of dev. knowledge (source code)
Providing logging suggestions as a post-dev. process
Logging stack traces can grow log files
very fast
46
Log.warn(msg) Log.warn(msg, e)
Logging a log
message + full stack
trace
Logging a log
message
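The size difference between the two forms can be sketched as follows. This is an assumed, simplified rendering of what a logger writes in each case, not the output format of a specific library; the class name StackTraceSketch, the render helpers, and the sample exception are hypothetical.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;

class StackTraceSketch {
    // Message only, as in Log.warn(msg).
    static String render(String msg) {
        return "WARN " + msg;
    }

    // Message plus the full stack trace, as in Log.warn(msg, e).
    static String render(String msg, Throwable e) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        e.printStackTrace(writer);
        writer.flush();
        return "WARN " + msg + System.lineSeparator() + buffer;
    }

    public static void main(String[] args) {
        Throwable e = new IOException("connection reset"); // hypothetical failure
        String withoutTrace = render("Failed to send heartbeat");
        String withTrace = render("Failed to send heartbeat", e);
        System.out.println("message only:       " + withoutTrace.length() + " chars");
        System.out.println("message with trace: " + withTrace.length() + " chars");
    }
}
```

Every stack frame adds a line, so a statement that logs the exception in a frequently executed path multiplies the log volume accordingly.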
Developers have difficulty deciding
whether to log stack traces
47
Missing stack trace
Improper logging
of stack trace
Learning from existing source code to
suggest whether to log a stack trace
48
Source
code
Source
code
Log(msg) Log(msg, e)
Source
code
Log(msg, ?)
Random Forest
Classifier
Log the
stack trace?
Six dimensions of
features
Log(msg, e)
Our models can effectively suggest whether
a stack trace is needed
49
[Bar chart: the performance (AUC) of our Random Forest models; y-axis AUC from 0.5 (random guess) to 1. Bar values in extraction order: 0.85, 0.94, 0.90, 0.86.]

More Related Content

Similar to Mining Development Knowledge to Understand and Support Software Logging Practices

Cse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solutionCse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solutionShobha Kumar
 
Msr2021 tutorial-di penta
Msr2021 tutorial-di pentaMsr2021 tutorial-di penta
Msr2021 tutorial-di penta
Massimiliano Di Penta
 
Logging service design
Logging service designLogging service design
Logging service design
Wittawas Wisarnkanchana
 
Ni week 2018 LLAMA presentation
Ni week 2018 LLAMA presentationNi week 2018 LLAMA presentation
Ni week 2018 LLAMA presentation
DMC, Inc.
 
on log messages
on log messageson log messages
on log messages
Laurence Chen
 
Maturity of-code-mgmt-2016-04-06
Maturity of-code-mgmt-2016-04-06Maturity of-code-mgmt-2016-04-06
Maturity of-code-mgmt-2016-04-06
Bogusz Jelinski
 
MuleSoft Nashik Meetup#5 - JSON Logger and Externalize Logs
MuleSoft Nashik Meetup#5 - JSON Logger and Externalize LogsMuleSoft Nashik Meetup#5 - JSON Logger and Externalize Logs
MuleSoft Nashik Meetup#5 - JSON Logger and Externalize Logs
Jitendra Bafna
 
Machine Learning to Turbo-Charge the Ops Portion of DevOps
Machine Learning to Turbo-Charge the Ops Portion of DevOpsMachine Learning to Turbo-Charge the Ops Portion of DevOps
Machine Learning to Turbo-Charge the Ops Portion of DevOps
Deborah Schalm
 
Solve cross cutting concerns with aspect oriented programming (aop)
Solve cross cutting concerns with aspect oriented programming (aop)Solve cross cutting concerns with aspect oriented programming (aop)
Solve cross cutting concerns with aspect oriented programming (aop)
Siva Prasad Rao Janapati
 
MicroStrategy Design Challenges - Tips and Best Practices
MicroStrategy Design Challenges - Tips and Best PracticesMicroStrategy Design Challenges - Tips and Best Practices
MicroStrategy Design Challenges - Tips and Best Practices
BiBoard.Org
 
Line Of Code(LOC) In Software Engineering By NADEEM AHMED FROM DEPALPUR
Line Of Code(LOC) In Software Engineering By NADEEM AHMED FROM DEPALPURLine Of Code(LOC) In Software Engineering By NADEEM AHMED FROM DEPALPUR
Line Of Code(LOC) In Software Engineering By NADEEM AHMED FROM DEPALPUR
NA000000
 
Doctor ZedGe @InsideTrack Rome #sitROME
Doctor ZedGe @InsideTrack Rome #sitROMEDoctor ZedGe @InsideTrack Rome #sitROME
Doctor ZedGe @InsideTrack Rome #sitROME
sergio.ferrari
 
Improving the Performance of Database-Centric Applications Through Program An...
Improving the Performance of Database-Centric Applications Through Program An...Improving the Performance of Database-Centric Applications Through Program An...
Improving the Performance of Database-Centric Applications Through Program An...
Concordia University
 
Just-in-time Detection of Protection-Impacting Changes on WordPress and Media...
Just-in-time Detection of Protection-Impacting Changes on WordPress and Media...Just-in-time Detection of Protection-Impacting Changes on WordPress and Media...
Just-in-time Detection of Protection-Impacting Changes on WordPress and Media...
Amine Barrak
 
ISV Error Handling With Spring '21 Update
ISV Error Handling With Spring '21 UpdateISV Error Handling With Spring '21 Update
ISV Error Handling With Spring '21 Update
CodeScience
 
Cascon06 tooldemo.ppt
Cascon06 tooldemo.pptCascon06 tooldemo.ppt
Cascon06 tooldemo.ppt
Yann-Gaël Guéhéneuc
 
Why Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdfWhy Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdf
Datacademy.ai
 
ICSME2014
ICSME2014ICSME2014
ICSME2014
swy351
 
Association Rule Mining Scheme for Software Failure Analysis
Association Rule Mining Scheme for Software Failure AnalysisAssociation Rule Mining Scheme for Software Failure Analysis
Association Rule Mining Scheme for Software Failure Analysis
Editor IJMTER
 

Similar to Mining Development Knowledge to Understand and Support Software Logging Practices (20)

Cse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solutionCse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solution
 
Msr2021 tutorial-di penta
Msr2021 tutorial-di pentaMsr2021 tutorial-di penta
Msr2021 tutorial-di penta
 
Logging service design
Logging service designLogging service design
Logging service design
 
Ni week 2018 LLAMA presentation
Ni week 2018 LLAMA presentationNi week 2018 LLAMA presentation
Ni week 2018 LLAMA presentation
 
on log messages
on log messageson log messages
on log messages
 
Maturity of-code-mgmt-2016-04-06
Maturity of-code-mgmt-2016-04-06Maturity of-code-mgmt-2016-04-06
Maturity of-code-mgmt-2016-04-06
 
MuleSoft Nashik Meetup#5 - JSON Logger and Externalize Logs
MuleSoft Nashik Meetup#5 - JSON Logger and Externalize LogsMuleSoft Nashik Meetup#5 - JSON Logger and Externalize Logs
MuleSoft Nashik Meetup#5 - JSON Logger and Externalize Logs
 
Machine Learning to Turbo-Charge the Ops Portion of DevOps
Machine Learning to Turbo-Charge the Ops Portion of DevOpsMachine Learning to Turbo-Charge the Ops Portion of DevOps
Machine Learning to Turbo-Charge the Ops Portion of DevOps
 
Solve cross cutting concerns with aspect oriented programming (aop)
Solve cross cutting concerns with aspect oriented programming (aop)Solve cross cutting concerns with aspect oriented programming (aop)
Solve cross cutting concerns with aspect oriented programming (aop)
 
Abcxyz
AbcxyzAbcxyz
Abcxyz
 
MicroStrategy Design Challenges - Tips and Best Practices
MicroStrategy Design Challenges - Tips and Best PracticesMicroStrategy Design Challenges - Tips and Best Practices
MicroStrategy Design Challenges - Tips and Best Practices
 
Line Of Code(LOC) In Software Engineering By NADEEM AHMED FROM DEPALPUR
Line Of Code(LOC) In Software Engineering By NADEEM AHMED FROM DEPALPURLine Of Code(LOC) In Software Engineering By NADEEM AHMED FROM DEPALPUR
Line Of Code(LOC) In Software Engineering By NADEEM AHMED FROM DEPALPUR
 
Doctor ZedGe @InsideTrack Rome #sitROME
Doctor ZedGe @InsideTrack Rome #sitROMEDoctor ZedGe @InsideTrack Rome #sitROME
Doctor ZedGe @InsideTrack Rome #sitROME
 
Improving the Performance of Database-Centric Applications Through Program An...
Improving the Performance of Database-Centric Applications Through Program An...Improving the Performance of Database-Centric Applications Through Program An...
Improving the Performance of Database-Centric Applications Through Program An...
 
Just-in-time Detection of Protection-Impacting Changes on WordPress and Media...
Just-in-time Detection of Protection-Impacting Changes on WordPress and Media...Just-in-time Detection of Protection-Impacting Changes on WordPress and Media...
Just-in-time Detection of Protection-Impacting Changes on WordPress and Media...
 
ISV Error Handling With Spring '21 Update
ISV Error Handling With Spring '21 UpdateISV Error Handling With Spring '21 Update
ISV Error Handling With Spring '21 Update
 
Cascon06 tooldemo.ppt
Cascon06 tooldemo.pptCascon06 tooldemo.ppt
Cascon06 tooldemo.ppt
 
Why Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdfWhy Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdf
 
ICSME2014
ICSME2014ICSME2014
ICSME2014
 
Association Rule Mining Scheme for Software Failure Analysis
Association Rule Mining Scheme for Software Failure AnalysisAssociation Rule Mining Scheme for Software Failure Analysis
Association Rule Mining Scheme for Software Failure Analysis
 

More from SAIL_QU

Studying the Integration Practices and the Evolution of Ad Libraries in the G...
Studying the Integration Practices and the Evolution of Ad Libraries in the G...Studying the Integration Practices and the Evolution of Ad Libraries in the G...
Studying the Integration Practices and the Evolution of Ad Libraries in the G...
SAIL_QU
 
Studying the Dialogue Between Users and Developers of Free Apps in the Google...
Studying the Dialogue Between Users and Developers of Free Apps in the Google...Studying the Dialogue Between Users and Developers of Free Apps in the Google...
Studying the Dialogue Between Users and Developers of Free Apps in the Google...
SAIL_QU
 
Improving the testing efficiency of selenium-based load tests
Improving the testing efficiency of selenium-based load testsImproving the testing efficiency of selenium-based load tests
Improving the testing efficiency of selenium-based load tests
SAIL_QU
 
Studying User-Developer Interactions Through the Distribution and Reviewing M...
Studying User-Developer Interactions Through the Distribution and Reviewing M...Studying User-Developer Interactions Through the Distribution and Reviewing M...
Studying User-Developer Interactions Through the Distribution and Reviewing M...
SAIL_QU
 
Studying online distribution platforms for games through the mining of data f...
Studying online distribution platforms for games through the mining of data f...Studying online distribution platforms for games through the mining of data f...
Studying online distribution platforms for games through the mining of data f...
SAIL_QU
 
Understanding the Factors for Fast Answers in Technical Q&A Websites: An Empi...
Understanding the Factors for Fast Answers in Technical Q&A Websites: An Empi...Understanding the Factors for Fast Answers in Technical Q&A Websites: An Empi...
Understanding the Factors for Fast Answers in Technical Q&A Websites: An Empi...
SAIL_QU
 
Investigating the Challenges in Selenium Usage and Improving the Testing Effi...
Investigating the Challenges in Selenium Usage and Improving the Testing Effi...Investigating the Challenges in Selenium Usage and Improving the Testing Effi...
Investigating the Challenges in Selenium Usage and Improving the Testing Effi...
SAIL_QU
 
The Impact of Task Granularity on Co-evolution Analyses
The Impact of Task Granularity on Co-evolution AnalysesThe Impact of Task Granularity on Co-evolution Analyses
The Impact of Task Granularity on Co-evolution Analyses
SAIL_QU
 
A Framework for Evaluating the Results of the SZZ Approach for Identifying Bu...
A Framework for Evaluating the Results of the SZZ Approach for Identifying Bu...A Framework for Evaluating the Results of the SZZ Approach for Identifying Bu...
A Framework for Evaluating the Results of the SZZ Approach for Identifying Bu...
SAIL_QU
 
How are Discussions Associated with Bug Reworking? An Empirical Study on Open...
How are Discussions Associated with Bug Reworking? An Empirical Study on Open...How are Discussions Associated with Bug Reworking? An Empirical Study on Open...
How are Discussions Associated with Bug Reworking? An Empirical Study on Open...
SAIL_QU
 
A Study of the Relation of Mobile Device Attributes with the User-Perceived Q...
A Study of the Relation of Mobile Device Attributes with the User-Perceived Q...A Study of the Relation of Mobile Device Attributes with the User-Perceived Q...
A Study of the Relation of Mobile Device Attributes with the User-Perceived Q...
SAIL_QU
 
A Large-Scale Study of the Impact of Feature Selection Techniques on Defect C...
A Large-Scale Study of the Impact of Feature Selection Techniques on Defect C...A Large-Scale Study of the Impact of Feature Selection Techniques on Defect C...
A Large-Scale Study of the Impact of Feature Selection Techniques on Defect C...
SAIL_QU
 
Studying the Dialogue Between Users and Developers of Free Apps in the Google...
Studying the Dialogue Between Users and Developers of Free Apps in the Google...Studying the Dialogue Between Users and Developers of Free Apps in the Google...
Studying the Dialogue Between Users and Developers of Free Apps in the Google...
SAIL_QU
 
What Do Programmers Know about Software Energy Consumption?
What Do Programmers Know about Software Energy Consumption?What Do Programmers Know about Software Energy Consumption?
What Do Programmers Know about Software Energy Consumption?
SAIL_QU
 
Threshold for Size and Complexity Metrics: A Case Study from the Perspective ...
Threshold for Size and Complexity Metrics: A Case Study from the Perspective ...Threshold for Size and Complexity Metrics: A Case Study from the Perspective ...
Threshold for Size and Complexity Metrics: A Case Study from the Perspective ...
SAIL_QU
 
Revisiting the Experimental Design Choices for Approaches for the Automated R...
Revisiting the Experimental Design Choices for Approaches for the Automated R...Revisiting the Experimental Design Choices for Approaches for the Automated R...
Revisiting the Experimental Design Choices for Approaches for the Automated R...
SAIL_QU
 
Measuring Program Comprehension: A Large-Scale Field Study with Professionals
Measuring Program Comprehension: A Large-Scale Field Study with ProfessionalsMeasuring Program Comprehension: A Large-Scale Field Study with Professionals
Measuring Program Comprehension: A Large-Scale Field Study with Professionals
SAIL_QU
 
On the Unreliability of Bug Severity Data
On the Unreliability of Bug Severity DataOn the Unreliability of Bug Severity Data
On the Unreliability of Bug Severity Data
SAIL_QU
 
On the Link Between Mobile App Quality and User Reviews
On the Link Between Mobile App Quality and User ReviewsOn the Link Between Mobile App Quality and User Reviews
On the Link Between Mobile App Quality and User Reviews
SAIL_QU
 
Mining Software Engineering Data
Mining Software Engineering DataMining Software Engineering Data
Mining Software Engineering Data
SAIL_QU
 

More from SAIL_QU (20)

Studying the Integration Practices and the Evolution of Ad Libraries in the G...
Studying the Integration Practices and the Evolution of Ad Libraries in the G...Studying the Integration Practices and the Evolution of Ad Libraries in the G...
Studying the Integration Practices and the Evolution of Ad Libraries in the G...
 
Studying the Dialogue Between Users and Developers of Free Apps in the Google...
Studying the Dialogue Between Users and Developers of Free Apps in the Google...Studying the Dialogue Between Users and Developers of Free Apps in the Google...
Studying the Dialogue Between Users and Developers of Free Apps in the Google...
 
Improving the testing efficiency of selenium-based load tests
Improving the testing efficiency of selenium-based load testsImproving the testing efficiency of selenium-based load tests
Improving the testing efficiency of selenium-based load tests
 
Studying User-Developer Interactions Through the Distribution and Reviewing M...
Studying User-Developer Interactions Through the Distribution and Reviewing M...Studying User-Developer Interactions Through the Distribution and Reviewing M...
Studying User-Developer Interactions Through the Distribution and Reviewing M...
 
Studying online distribution platforms for games through the mining of data f...
Studying online distribution platforms for games through the mining of data f...Studying online distribution platforms for games through the mining of data f...
Studying online distribution platforms for games through the mining of data f...
 
Understanding the Factors for Fast Answers in Technical Q&A Websites: An Empi...
Understanding the Factors for Fast Answers in Technical Q&A Websites: An Empi...Understanding the Factors for Fast Answers in Technical Q&A Websites: An Empi...
Understanding the Factors for Fast Answers in Technical Q&A Websites: An Empi...
 
Investigating the Challenges in Selenium Usage and Improving the Testing Effi...
Investigating the Challenges in Selenium Usage and Improving the Testing Effi...Investigating the Challenges in Selenium Usage and Improving the Testing Effi...
Investigating the Challenges in Selenium Usage and Improving the Testing Effi...
 
The Impact of Task Granularity on Co-evolution Analyses
The Impact of Task Granularity on Co-evolution AnalysesThe Impact of Task Granularity on Co-evolution Analyses
The Impact of Task Granularity on Co-evolution Analyses
 
A Framework for Evaluating the Results of the SZZ Approach for Identifying Bu...
A Framework for Evaluating the Results of the SZZ Approach for Identifying Bu...A Framework for Evaluating the Results of the SZZ Approach for Identifying Bu...
A Framework for Evaluating the Results of the SZZ Approach for Identifying Bu...
 
How are Discussions Associated with Bug Reworking? An Empirical Study on Open...
How are Discussions Associated with Bug Reworking? An Empirical Study on Open...How are Discussions Associated with Bug Reworking? An Empirical Study on Open...
How are Discussions Associated with Bug Reworking? An Empirical Study on Open...
 
A Study of the Relation of Mobile Device Attributes with the User-Perceived Q...
Mining Development Knowledge to Understand and Support Software Logging Practices

  • 1. Mining Development Knowledge to Understand and Support Software Logging Practices Heng Li Supervisor: Dr. Ahmed E. Hassan Software Analysis & Intelligence Lab (SAIL) Queen’s University, Canada
  • 2. Developers insert logging code that produces log messages at runtime. Logging code in the software system, such as Log.info("Stopping server on " + port);, produces a log message such as: 2016-07-23 17:56:16 INFO Stopping server on 8032. Log messages record valuable runtime information.
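As a rough illustration of how a logging statement turns into a log message, the sketch below uses Python's standard logging module as a stand-in for the Java logging library shown on the slide. The logger name, the format string, and the helper make_logger are assumptions made for this example, not part of the original deck.

```python
import io
import logging


def make_logger(stream):
    """Build a logger that writes 'timestamp LEVEL message' lines to stream."""
    logger = logging.getLogger("server_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # keep repeated calls from stacking handlers
    logger.propagate = False  # do not forward records to the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s %(levelname)s %(message)s",
            datefmt="%Y-%m-%d %H:%M:%S",
        )
    )
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
log = make_logger(stream)
port = 8032
# The logging code: a level (info), static text, and a dynamic value.
log.info("Stopping server on %d", port)
log_line = stream.getvalue().strip()
print(log_line)
```

The printed line has the same shape as the message on the slide: a timestamp, the level, then the static text with the port filled in.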
  • 3. Logging is critical for software maintenance. Log messages are widely used in software maintenance efforts: to understand runtime behaviors (Fu et al., Contextual analysis of program logs for understanding system behaviors. MSR ’13), to diagnose failures (Yuan et al., Sherlog: Error diagnosis by connecting clues from run-time logs. ASPLOS ’10), and to detect anomalies (Xu et al., Detecting large-scale system problems by mining console logs. SOSP ’09).
  • 4. Developers have difficulty deciding on appropriate logging code. Example complaints: “A lot of log noise”; “Slowing down perf by 20%”; “Missing an error log”. Developers spend significant effort maintaining their logging code. Prior work: § Logging practices in open source projects [Yuan et al., 2012; Chen and Jiang, 2017] § Logging practices in industry [Shang et al., 2014; Fu et al., 2014]
  • 5. Development knowledge explains the development of logging code. What (source code): LOG.warn(msg); How (change history): − LOG.info(msg); + LOG.warn(msg); Why (issue reports): “To help users identify a problem”.
  • 6. Thesis statement: Development knowledge (source code, change history, issue reports) can help us understand current logging practices and develop useful tools to support such logging practices.
  • 7. Mining development knowledge to understand and support logging practices. Roadmap: Developers’ logging concerns? [TSE under review]; Where to log? [EMSE 2018]; When to update log? [EMSE 2017]; How to log (Error, Warn, Info)? [EMSE 2017]
  • 9. Developers communicate their logging concerns in issue reports. Example: removing a logging statement because of a logging cost (performance overhead).
  • 10. Developers communicate their logging concerns in issue reports. Example: adding a logging statement for a logging benefit (exposing runtime problems).
  • 11. We study logging-related issue reports to understand developers’ logging concerns. Automated and manual filtering yields the logging issue reports; qualitative analysis of those reports yields the logging concerns.
  • 12. What are developers’ logging concerns? Logging benefits (research opportunity: leverage): § Assisting in debugging § Providing runtime perf § Exposing runtime problems § Bookkeeping § Showing execution progress. Logging costs (research opportunity: minimize): § Excessive log information § Exposing unnecessary details § Misleading end users § Performance overhead § Exposing sensitive info. The concerns are arranged along a frequency axis.
  • 13. Mining development knowledge to understand and support logging practices. Roadmap as on slide 7, now with the finding for developers’ logging concerns: 10 categories of logging concerns (e.g., misleading users).
  • 15. Some code topics are more likely to need logging statements. Examples of JIRA issues that require developers to log the topic of “connections” [EMSE 2018]
  • 16. Can code topics explain where to log? We extract the code topics (e.g., “connection”) and the logging statements for each code snippet (method level) [EMSE 2018]
  • 17. We use LDA to extract code topics. Code snippet with a logging statement → tokenization (e.g., queue, connection) → topic model (LDA) → topic: “connection” [EMSE 2018]
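The tokenization step that feeds the topic model can be sketched as follows. The slide does not spell out the preprocessing, so the identifier splitting, the tiny stop-word list, and the sample Java method are all assumptions chosen for illustration; the resulting bag of words is the kind of input an LDA implementation would consume. Words inside string literals are also picked up by this simple pattern.

```python
import re
from collections import Counter

# Hypothetical stop list: a few Java keywords that carry no topic signal.
JAVA_STOPWORDS = {
    "public", "private", "void", "int", "new", "return", "if", "else",
    "try", "catch", "throws", "null", "this", "string",
}


def tokenize_method(source):
    """Split identifiers in a code snippet into lowercase word tokens."""
    identifiers = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    tokens = []
    for ident in identifiers:
        # Split camelCase and acronyms: IOException -> IO, Exception.
        for part in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", ident):
            word = part.lower()
            if len(word) > 1 and word not in JAVA_STOPWORDS:
                tokens.append(word)
    return tokens


# Hypothetical method-level snippet, in the spirit of the "connection" topic.
snippet = """
void closeConnection(Socket clientSocket) throws IOException {
    LOG.info("Closing connection to " + clientSocket);
    connectionQueue.remove(clientSocket);
    clientSocket.close();
}
"""

tokens = tokenize_method(snippet)
counts = Counter(tokens)
print(counts.most_common(3))
```

Running a topic model over such token counts for every method is what would let a method be labeled with a dominant topic such as “connection”.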
  • 18. A small number of topics are much more likely to be logged. The most log-intensive topics usually capture communication between machines (e.g., “connection”) or interactions between threads (e.g., “thread interruption”) [EMSE 2018]
  • 19. We combine both the structure and topic info to explain where to log. Topic info: e.g., “connection”. Structure info: lines of code, complexity, control flow statements, etc. [EMSE 2018]
  • 20. We combine both the structure and topic info to explain where to log. Topic info (e.g., “connection”) and structure info (lines of code, complexity, control flow statements, etc.) are the inputs to a LASSO model that explains the presence of a logging statement [EMSE 2018]
  • 21. Code topics bring additional explanatory power (up to 13% AUC improvement). [Figure: bar chart, “The performance (AUC) of our LASSO models”, comparing structure info against structure & topic info; AUC axis from 0.5 (random guess) to 1; bar labels as extracted: 0.82, 0.86, 0.8, 0.86, 0.83, 0.96, 0.87, 0.94, 0.9, 0.9, 0.88, 0.99] [EMSE 2018]
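For readers unfamiliar with the AUC metric reported on this slide, here is a minimal sketch of how it can be computed from labels and model scores: it is the probability that a randomly chosen positive example (here, a method that has a logging statement) receives a higher score than a randomly chosen negative one, with ties counted as half. The function name and the toy data are made up for illustration and are not the thesis data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive (e.g., method is logged), 0 for negative.
    scores: model outputs, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: six methods, three of which contain logging statements.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(round(auc(toy_labels, toy_scores), 3))
```

A model that assigns the same score to every method gets an AUC of 0.5, which is the “random guess” baseline on the chart.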
  • 22. Mining development knowledge to understand and support logging practices. Roadmap as on slide 7, now with the finding for “Where to log?” [EMSE 2018]: logging varies across code topics.
  • 24. Developers have difficulty making appropriate log changes. Developers usually forget to change logging code when they change their code; in many cases, logging code is written as an “after-thought” after a failure happens and logs are needed [Yuan et al., 2012]. Figure labels: code change history (commit n, commit n+1, version k); code changes; log changes; debugging difficulties; maintenance efforts.
  • 25. Learning from the code change history to provide log change suggestions. The code change history (commit 1, commit 2, …, commit n) contains code changes with log changes and code changes without log changes. For a new code change: do we need to change logs? [EMSE 2017]
  • 26. Providing automated suggestions for log changes when developers change the code. A Random Forest classifier takes 25 metrics along three dimensions (change metrics, historical metrics, product metrics) and outputs log change suggestions [EMSE 2017]
  • 27. Our models can effectively suggest whether a log change is needed. [Figure: bar chart, “The performance (AUC) of our Random Forest models”; AUC axis from 0.5 (random guess) to 1; bar labels as extracted: 0.84, 0.91, 0.86, 0.88] [EMSE 2017]
  • 28. The source code and code changes are important for explaining log changes. The 25 metrics along three dimensions (change metrics, historical metrics, product metrics) are used to explain the log change suggestions [EMSE 2017]
  • 29. Mining development knowledge to understand and support logging practices. Roadmap as on slide 7, now with the finding for “When to update log?” [EMSE 2017]: the source code & code changes can explain log changes.
  • 31. Log levels are used to disable some verbose log messages while enabling important ones. Levels from more verbose (lower levels) to less verbose (higher levels): Trace, Debug, Info, Warn, Error, Fatal. The log level is part of the logging statement, e.g., Log.error("message").
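The filtering effect of log levels can be sketched with Python's standard logging module, used here as an analogue of the Java-style levels on the slide. Python's built-in levels are DEBUG, INFO, WARNING, ERROR, and CRITICAL, so there is no TRACE in this sketch. The handler class, the logger name, and the messages are assumptions made for the example.

```python
import logging


class ListHandler(logging.Handler):
    """Collect emitted records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append((record.levelname, record.getMessage()))


def emit_all(threshold):
    """Log one message per level; return the levels that pass the threshold."""
    logger = logging.getLogger("level_demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)
    logger.setLevel(threshold)  # messages below this level are dropped
    logger.debug("cache lookup took 3 ms")
    logger.info("server started")
    logger.warning("disk usage is high")
    logger.error("failed to open connection")
    return [level for level, _ in handler.records]


print(emit_all(logging.DEBUG))    # verbose configuration
print(emit_all(logging.WARNING))  # only the less verbose levels remain
```

Choosing too low a level for a message therefore makes it appear as noise in verbose configurations, while choosing too high a level makes it appear even when operators only want serious problems.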
  • 32. Improper log levels can have many negative impacts. “…tends to generate a lot of log noise…”; “These warnings worry users”. Developers spend much effort adjusting log levels [Yuan et al., 2012]
  • 33. Learning from the code change history to provide log level suggestions. The code change history (commit 1, commit 2, …, commit n) contains logging statements such as Log.warn(msg), Log.info(msg), and Log.error(msg). For a new logging statement Log.?(msg): which log level to use? [EMSE 2017]
  • 34. Providing automated suggestions for log levels when developers add logging code. An ordinal regression model takes logging statement metrics, containing block metrics, containing file metrics, code change metrics, and historical change metrics, and suggests one of the levels Trace, Debug, Info, Warn, Error, Fatal [EMSE 2017]
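To make the idea of ordinal regression over log levels concrete, the sketch below shows how a fitted proportional-odds (cumulative logit) model would turn a feature vector into a probability for each ordered level. The slide only names an “Ordinal Regression Model”; the proportional-odds form, the two features, the weights, and the cutpoints are all hypothetical values chosen for illustration, not the fitted model from the study.

```python
import math

# Ordered from most verbose (lowest) to least verbose (highest).
LEVELS = ["trace", "debug", "info", "warn", "error", "fatal"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def level_probabilities(features, weights, cutpoints):
    """Proportional-odds model: P(level <= k) = sigmoid(cutpoint_k - score)."""
    score = sum(w * x for w, x in zip(weights, features))
    cumulative = [sigmoid(c - score) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)  # P(level == k)
        previous = value
    return probabilities


def predict_level(features, weights, cutpoints):
    probabilities = level_probabilities(features, weights, cutpoints)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return LEVELS[best]


# Hypothetical features: [is inside a catch block, message mentions a failure]
WEIGHTS = [2.0, 1.5]                    # assumed, not fitted
CUTPOINTS = [-3.0, -1.0, 1.0, 3.0, 5.0]  # assumed, must be increasing

print(predict_level([0, 0], WEIGHTS, CUTPOINTS))
print(predict_level([1, 1], WEIGHTS, CUTPOINTS))
```

Unlike a plain multi-class classifier, this formulation uses the ordering of the levels: a higher score shifts probability mass toward the less verbose levels as a whole.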
  • 35. Ordinal regression models can effectively model log levels. [Figure: bar chart, “The performance (AUC) of our Ordinal Regression Models”; AUC axis from 0.5 (random guess) to 0.9; bar labels as extracted: 0.76, 0.78, 0.81, 0.75] [EMSE 2017]
  • 36. The content of a logging statement and the containing block/file explain its log level. Of the five metric dimensions (logging statement metrics, containing block metrics, containing file metrics, code change metrics, historical change metrics), these are the ones that explain the chosen level among Trace, Debug, Info, Warn, Error, Fatal [EMSE 2017]
  • 37. Mining development knowledge to understand and support logging practices. Roadmap as on slide 7, now with the finding for “How to log?” [EMSE 2017]: the log content & containing blocks/files can explain log levels.
  • 38. Mining development knowledge to understand and support logging practices. Summary: Developers’ logging concerns? [TSE under review]: 10 categories of logging concerns (e.g., misleading users). Where to log? [EMSE 2018]: logging varies across code topics. When to update log? [EMSE 2017]: the source code & code changes can explain log changes. How to log (Error, Warn, Info)? [EMSE 2017]: the log content & containing blocks/files can explain log levels.
  • 39. References § Fu, Q., Lou, J. G., Lin, Q., Ding, R., Zhang, D., and Xie, T. (2013). Contextual analysis of program logs for understanding system behaviors. In Proceedings of the 10th Working Conference on Mining Software Repositories, MSR ’13, pages 397–400. § Xu, W., Huang, L., Fox, A., Patterson, D., and Jordan, M. I. (2009). Detecting large-scale system problems by mining console logs. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP ’09, pages 117–132. § Yuan, D., Mai, H., Xiong, W., Tan, L., Zhou, Y., and Pasupathy, S. (2010). Sherlog: Error diagnosis by connecting clues from run-time logs. In Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’10, pages 143–154. § Yuan, D., Park, S., and Zhou, Y. (2012). Characterizing logging practices in open source software. In Proceedings of the 34th International Conference on Software Engineering, ICSE ’12, pages 102–112. § Chen, B. and Jiang, Z. M. J. (2017). Characterizing logging practices in Java-based open source software projects – a replication study in Apache Software Foundation. Empirical Software Engineering, 22(1):330–374. § Shang, W., Jiang, Z. M., Adams, B., Hassan, A. E., Godfrey, M. W., Nasser, M., and Flora, P. (2014). An exploratory study of the evolution of communicated information about the execution of large software systems. Journal of Software: Evolution and Process, 26(1):3–26. § Fukushima, T., Kamei, Y., McIntosh, S., Yamashita, K., and Ubayashi, N. (2014). An empirical study of just-in-time defect prediction using cross-project models. In Proceedings of the 11th Working Conference on Mining Software Repositories, MSR 2014, pages 172–181.
  • 41. Literature review: mining log messages; mining logging code; improving logging code.
  • 42. Mining log messages. Understanding runtime behaviors [Fu et al., 2013; Hassan et al., 2008; Shang et al., 2013]. Detecting anomaly conditions [Xu et al., 2008, 2009; Fu et al., 2009; Jiang et al., 2008]. Diagnosing system failures [Yuan et al., 2010; Syer et al., 2013]. Prior work highlights the importance of improving logging quality.
  • 43. Mining logging code. Logging practices in open source projects [Yuan et al., 2012; Chen and Jiang, 2017]. Logging practices in industry [Fu et al., 2014; Pecchia et al., 2015]. Evolution of logging code [Shang et al., 2011; Kabinna et al., 2016]. Software logging is a common practice, and developers spend much effort maintaining their logging code.
  • 44. Improving logging code: proactive logging. Proactively adding logging info in the source code [Yuan et al., 2011, 2012; Zhao et al., 2017]. Limitations: producing excessive log information; developers’ expertise and concerns are not considered.
  • 45. Improving logging code: learning to log. Learning statistical models to suggest where to log [Zhu et al., 2015; Lal and Sureka, 2016; Jia et al., 2018]. Limitations: ignoring logging patterns (e.g., log level, stack trace); focusing on one dimension of development knowledge (source code); providing logging suggestions as a post-development process.
  • 46. Logging stack traces can grow log files very fast. Log.warn(msg) logs a log message; Log.warn(msg, e) logs a log message plus the full stack trace.
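The size difference between the two calls on this slide can be observed with a small experiment. The sketch uses Python's standard logging module as an analogue: passing exc_info=True plays the role of passing the exception e to Log.warn in the Java-style API. The logger name, the message, and the failing call are assumptions made for the example.

```python
import io
import logging


def log_failure(with_stack_trace):
    """Log one warning for a caught exception; return the text written."""
    stream = io.StringIO()
    logger = logging.getLogger("stack_demo")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.WARNING)
    logger.addHandler(logging.StreamHandler(stream))
    try:
        int("not a number")  # any failing call would do
    except ValueError:
        # exc_info=True appends the traceback, like Log.warn(msg, e).
        logger.warning("Could not parse port", exc_info=with_stack_trace)
    return stream.getvalue()


message_only = log_failure(with_stack_trace=False)
with_trace = log_failure(with_stack_trace=True)
print(len(message_only.splitlines()), "line without the stack trace")
print(len(with_trace.splitlines()), "lines with the stack trace")
```

Even for this shallow call stack, the stack trace multiplies the number of lines written per event, which is why the decision to log it matters for frequently executed code.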
  • 47. Developers have difficulty deciding whether to log stack traces. Examples: a missing stack trace; improper logging of a stack trace.
  • 48. Learning from existing source code to suggest whether to log a stack trace. Existing source code contains logging statements without a stack trace, Log(msg), and with one, Log(msg, e). For a new statement Log(msg, ?), a Random Forest classifier with six dimensions of features suggests whether to log the stack trace.
  • 49. Our models can effectively suggest whether a stack trace is needed. [Figure: bar chart, “The performance (AUC) of our Random Forest models”; AUC axis from 0.5 (random guess) to 1; bar labels as extracted: 0.85, 0.94, 0.9, 0.86]