Which log level should developers choose for a new logging statement?
Journal-first Presentation | Empirical Software Engineering
Heng Li, Weiyi Shang, Ahmed E. Hassan
Logs are usually the only resource for
diagnosing field issues
2
A verbosity level can be assigned to
each logging statement
3
Log.info("Log message")
Trace, Debug, Info, Warn, Error, Fatal
Log levels are used to disable verbose log messages while keeping important ones enabled
4
Trace, Debug, Info, Warn, Error, Fatal
More verbose levels (lower levels): toward Trace
Less verbose levels (higher levels): toward Fatal
Log levels are used to disable verbose log messages while keeping important ones enabled
5
System/module level setting: LogLevel = Warn
Trace, Debug, Info, Warn, Error, Fatal
More verbose levels (lower levels): toward Trace
Less verbose levels (higher levels): toward Fatal
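
A minimal sketch of how a system- or module-level setting such as LogLevel = Warn might behave, assuming the usual convention that a logging statement is emitted only when its own level is at or above the configured level. The `Level` enum and the `is_enabled` helper are hypothetical names for illustration; they do not come from the slides or from a specific logging library.

```python
from enum import IntEnum


class Level(IntEnum):
    """Log levels ordered from most verbose (lowest) to least verbose (highest)."""
    TRACE = 0
    DEBUG = 1
    INFO = 2
    WARN = 3
    ERROR = 4
    FATAL = 5


def is_enabled(statement_level, configured_level):
    """A statement is emitted only if its level is at or above the configured one."""
    return statement_level >= configured_level


configured = Level.WARN  # corresponds to the setting LogLevel = Warn
enabled = [level.name for level in Level if is_enabled(level, configured)]
disabled = [level.name for level in Level if not is_enabled(level, configured)]

print("enabled: ", enabled)
print("disabled:", disabled)
```

Under this assumption, Trace, Debug, and Info statements are suppressed, which is why a statement logged at too low a level can hide important information and one logged at too high a level can add noise.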
Improper log levels can have many
negative impacts
6
“…tends to generate a lot of log noise…”
“These warnings worry users, especially first time users”
Researchers and industrial experts
highlighted the challenge of choosing proper
log levels
7
Developers spend much effort adjusting log levels
Severity levels are often
used inaccurately
[Oliner et al. CACM’12]
[Yuan et al. ICSE’12]
We want to suggest log levels when
developers add a new logging statement
8
Add a new logging statement:
+ Logger.level();
Which log level should be used?
We study the code change history
of four subject systems
9
Over 2 M lines of code
Over 13 K logging statements
Added 17 K logging statements in history
No single log level dominates the others
10
[Figure: number of logging statements per log level (trace, debug, info, warn, error, fatal), one panel per subject system. Bar labels as extracted:]
DirectoryServer: 13 (0%), 2330 (47%), 549 (11%), 430 (9%), 1655 (33%)
Hadoop: 185 (4%), 1257 (24%), 1839 (35%), 1149 (22%), 722 (14%), 73 (1%)
Hama: 2 (0%), 477 (29%), 541 (33%), 169 (10%), 408 (25%), 32 (2%)
Qpid: 648 (14%), 1603 (36%), 985 (22%), 459 (10%), 799 (18%), 2 (0%)
Developers spend much effort
adjusting log levels
11
491 logging statements had at least one log level change

Rows: initial log level. Columns: log level after changes.

         Trace  Debug  Info  Warn  Error  Fatal
Trace        0     25     2     0      0      0
Debug       16      9    41     3      7      1
Info         8    211     7    13      4      0
Warn         0     23    35     0     16      3
Error        0     12    23    23      2      4
Fatal        0      1     0     1      1      0
Developers spend much effort
adjusting log levels
12
72% of the log level changes
are from a higher level to a lower level
Developers spend much effort
adjusting log levels
13
78% of the log level changes
are between adjacent levels
Developers spend much effort
adjusting log levels
14
51% of the log level changes
are between “info” and “debug” levels
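
The percentages quoted on these slides can be recomputed from the change matrix. The sketch below assumes that rows are the initial log level and columns are the log level after changes, which is the reading under which the 72% figure comes out; the variable names are illustrative only.

```python
LEVELS = ["Trace", "Debug", "Info", "Warn", "Error", "Fatal"]

# Rows: initial log level; columns: log level after changes (counts from the slides).
CHANGES = [
    [0, 25, 2, 0, 0, 0],
    [16, 9, 41, 3, 7, 1],
    [8, 211, 7, 13, 4, 0],
    [0, 23, 35, 0, 16, 3],
    [0, 12, 23, 23, 2, 4],
    [0, 1, 0, 1, 1, 0],
]

n = len(LEVELS)
cells = [(i, j, CHANGES[i][j]) for i in range(n) for j in range(n)]

total = sum(count for _, _, count in cells)
# Initial level higher (less verbose) than the level after changes.
higher_to_lower = sum(count for i, j, count in cells if i > j)
# Initial and final levels are neighbours in the ordering.
adjacent = sum(count for i, j, count in cells if abs(i - j) == 1)
debug, info = LEVELS.index("Debug"), LEVELS.index("Info")
info_debug = CHANGES[debug][info] + CHANGES[info][debug]


def percent(part):
    """Share of all changed logging statements, rounded to a whole percent."""
    return round(100 * part / total)


print(total, percent(higher_to_lower), percent(adjacent), percent(info_debug))
```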
Which log level should developers choose
for a new logging statement?
15
RQ1: How well can we model the log
levels of logging statements?
RQ2: What are the important factors for
determining the log level of a logging
statement?
We derive 22 metrics from 5 dimensions
to model log levels
17
Logging statement metrics
Containing block metrics
Containing file metrics
Code change metrics
Historical change metrics
The metrics are extracted for each logging statement at the time when the logging statement is added.
We exclude the logging statements with subsequent log level changes.
Order is important
18
Trace < Debug < Info < Warn < Error < Fatal
(lower levels on the left, higher levels on the right)
We use an ordinal regression model for ordinal responses
Ordinal Regression Model
19
Inputs: logging statement metrics, containing block metrics, containing file metrics, code change metrics, historical change metrics
Output: log level (Trace, Debug, Info, Warn, Error, or Fatal)
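
To illustrate how an ordinal regression model turns metrics into a log level, here is a minimal sketch of one common formulation, the proportional-odds (cumulative logit) model. The slides do not state which link function or fitting procedure the study used, so this form is an assumption, and the weights, cutpoints, and feature values below are hypothetical numbers, not results from the study.

```python
import math

LEVELS = ["Trace", "Debug", "Info", "Warn", "Error", "Fatal"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def level_probabilities(features, weights, cutpoints):
    """Proportional-odds model: P(level <= j) = sigmoid(cutpoints[j] - score)."""
    score = sum(w * x for w, x in zip(weights, features))
    # Cumulative probabilities for the first five levels; the last is always 1.
    cumulative = [sigmoid(c - score) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


# Hypothetical parameters (assumptions for illustration only).
weights = [0.8, -0.5]                    # one weight per metric
cutpoints = [-3.0, -1.0, 1.0, 2.5, 4.0]  # increasing; one fewer than the levels
features = [1.0, 0.5]                    # metric values for one logging statement

probs = level_probabilities(features, weights, cutpoints)
predicted = LEVELS[probs.index(max(probs))]

for name, p in zip(LEVELS, probs):
    print(f"{name}: {p:.3f}")
print("suggested level:", predicted)
```

Because a single score is compared against ordered cutpoints, the model respects the ordering of the levels, which a plain multi-class classifier would ignore.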
An ordinal regression model can effectively model log levels
20
[Figure: The performance (AUC) of a within-project evaluation, with a reference line for a random guess.]
AUC for the four subject systems: 0.76, 0.78, 0.81, 0.75
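
As background for reading these numbers, the sketch below computes a binary AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The slides do not say how AUC was extended to six log levels, so this two-class version is only an illustration of the metric itself; the scores and labels are made up.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
perfect = auc([0.1, 0.2, 0.8, 0.9], labels)        # every positive outranks every negative
uninformative = auc([0.5, 0.5, 0.5, 0.5], labels)  # scores carry no information
partial = auc([0.1, 0.8, 0.2, 0.9], labels)        # one pair is ranked wrongly

print(perfect, uninformative, partial)
```

An uninformative scorer lands at 0.5, which is why a random guess serves as the baseline for the reported values.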
An ordinal regression model can effectively model log levels
21
[Figure: The performance (AUC) of a cross-project evaluation.]
Within-project AUC: 0.76, 0.78, 0.81, 0.75
Cross-project AUC: 0.72, 0.76, 0.80, 0.71
We use a Wald $\chi^2$ test to measure the variable importance
23
Wald $\chi^2$ test: tests the significance of a variable against the null hypothesis that the coefficient is equal to zero:
$H_0: \theta = 0$
Joint Wald $\chi^2$ test: tests the joint impact of a group of variables on a model's fit:
$H_0: \theta_0 = \theta_1 = \cdots = \theta_k$
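
A minimal sketch of how Wald statistics of this kind are commonly computed, assuming the textbook forms: for one coefficient, the squared ratio of the estimate to its standard error; for a group, the quadratic form of the estimates with the inverse of their covariance matrix, with the joint null taken to be that every coefficient in the group is zero. The estimates, standard errors, and covariance values are hypothetical; the slides do not give them.

```python
import math


def wald_single(estimate, std_error):
    """Wald chi-square statistic and p-value (1 d.f.) for H0: theta = 0."""
    w = (estimate / std_error) ** 2
    p = math.erfc(math.sqrt(w / 2.0))  # survival function of chi-square with 1 d.f.
    return w, p


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def wald_joint(estimates, covariance):
    """Joint Wald statistic theta^T V^{-1} theta for a group of coefficients."""
    v_inv_theta = solve(covariance, estimates)
    return sum(t * z for t, z in zip(estimates, v_inv_theta))


# Hypothetical inputs (assumptions for illustration only).
w, p = wald_single(2.0, 1.0)
joint = wald_joint([1.0, 1.0], [[2.0, 1.0], [1.0, 2.0]])
print(w, round(p, 4), joint)
```

A larger statistic means the variable, or the group of variables, contributes more to the model, which is how a joint statistic can be used to rank the five dimensions of metrics.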
The most influential factors for log levels
are different for different projects
24
[Figure: The joint importance of each dimension of metrics, measured by the Wald $\chi^2$ statistic. Legend entries recovered: logging statement metrics, containing block metrics.]
Bar labels as extracted: 889, 189, 96, 224, 640, 894, 577, 723, 258, 549, 91, 1134, 5, 0, 0, 78, 0, 67, 0, 0
http://hengli.org
hengli@cs.queensu.ca


Editor's Notes

  • #6 Log levels help both developers and users trade off the rich information in logs against the associated overhead.
  • #19 The ordinal regression model is used to predict an ordinal dependent variable, i.e., a variable with categorical values where the relative ordering between different values is important. We use ordinal regression models for automated log level prediction because the log level has a small number (e.g., six) of categorical values and the relative ordering among them is important. Hence, neither a logistic regression model nor a classification model is as appropriate as an ordinal regression model.
  • #21 Need to update the data
  • #22 Need to update the data