Logging, used to record system events and security breaches as well as more informational yet essential aspects of software features, is pervasive. Given the high transactionality of today's software, logging effectiveness can be reduced by information overload. Log levels help alleviate this problem by associating a priority with logs so that they can later be filtered. As software evolves, however, the levels of logs documenting surrounding feature implementations may also require modification, as features once deemed important may have decreased in urgency and vice versa. We present an automated approach that assists developers in evolving levels of such (feature) logs. The approach, based on mining Git histories and manipulating a degree of interest (DOI) model, transforms source code to revitalize feature log levels based on the “interestingness” of the surrounding code. Built upon JGit and Mylyn, the approach is implemented as an Eclipse IDE plug-in and evaluated on 18 Java projects with ~3 million lines of code and ~4K log statements. Our tool successfully analyzes 99.22% of logging statements, increases log level distributions by ~20%, and increases the focus of logs in bug fix contexts ~83% of the time. Moreover, pull (patch) requests were integrated into large and popular open-source projects. The results indicate that the approach is promising in assisting developers in evolving feature log levels.
Automated Evolution of Feature Logging Statement Levels Using Git Histories and Degree of Interest
1. Automated Evolution of Feature Logging Statement Levels Using Git Histories and Degree of Interest
Science of Computer Programming, Volume 214, 1 Feb 2022, 102724
Yiming Tang (1), Allan Spektor (2), Raffi Khatchadourian (2, 3), Mehdi Bagherzadeh (4)
(1) Concordia University, Canada
(2) City University of New York (CUNY) Hunter College, USA
(3) City University of New York (CUNY) Graduate Center, USA
(4) Oakland University, USA
IEEE International Conference on Software Analysis, Evolution & Re-engineering
March 17, 2022, Honolulu, HI, USA (remote)
2. Introduction Motivation Approach Evaluation Conclusion Logging Issues
Logging in Modern Software in the Big Data Era
Logging is pervasive in modern software.
Big data systems deal with high volumes of transactions.
Source code is tangled with scattered logging statements capturing important event information.
Logging is essential for reporting security and privacy breaches.
Yiming Tang, Allan Spektor, Raffi Khatchadourian, Mehdi Bagherzadeh Automated Evolution of Feature Logging Statement Levels 2 / 12
3. Feature Logging Statements
Modern software is also feature-heavy, implementing hundreds of features.
Logging statements, although more informational, also capture important aspects of feature implementations.
They are useful for validating feature implementations and diagnosing unintended interactions with other features.
4. Logging Issues
Source: Stuart Pilbrow / CC BY-SA
(https://creativecommons.org/licenses/by-sa/2.0)
Too much logging causes information overload.
Makes postmortem analysis difficult.
Understanding system behavior in production and diagnosing problems can be challenging.
Also challenging during development, as logs pertaining to auxiliary features are tangled with those under current development.
5. Feature Logging Statement Level Evolution
Logging statements are typically associated with a log level.
The level dictates whether the log is emitted at run time.
Example
logger.log(Level.FINER, "Health:" + systemHealthStatus());
Outputs system health iff the run time level of logger ≤ Level.FINER.
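The filtering rule in the example can be checked directly against java.util.logging (JUL). The following is a minimal sketch, not code from the talk: the class name LevelFilterDemo, the logger name "demo.health", and the body of systemHealthStatus() are assumptions made for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Minimal sketch of run-time level filtering with java.util.logging (JUL).
class LevelFilterDemo {
    // Hypothetical stand-in for the method used on the slide.
    static String systemHealthStatus() {
        return "OK"; // placeholder value, assumed for illustration
    }

    // True iff a record at statementLevel passes a logger set to loggerLevel.
    static boolean wouldEmit(Level loggerLevel, Level statementLevel) {
        Logger logger = Logger.getLogger("demo.health");
        logger.setLevel(loggerLevel);
        return logger.isLoggable(statementLevel);
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("demo.health");
        logger.setLevel(Level.FINEST);
        // Passes the logger's level check; attached handlers apply their own level as well.
        logger.log(Level.FINER, "Health:" + systemHealthStatus());

        System.out.println(wouldEmit(Level.FINEST, Level.FINER)); // logger level below FINER
        System.out.println(wouldEmit(Level.INFO, Level.FINER));   // logger level above FINER
    }
}
```

Note that handlers carry their own level, so a record that passes the logger's check may still be dropped by a handler configured at a higher level.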
As software evolves, the levels of logging statements associated with surrounding feature implementations may also need to be modified.
Ideally, feature log levels would evolve with the system as it is developed.
Higher log levels (e.g., INFO) would be assigned to logs corresponding to features with more current stakeholder interest.
Lower log levels (e.g., FINEST) would be assigned to those with less interest.
6. Automation Approach Overview
Figure: Logging Level rejuvenation approach overview (details in paper).
Automatically evolve feature logging statement levels.
Mine Git repositories to discover the “interestingness” of code surrounding feature logging statements.
Adapt Mylyn degree of interest (DOI) model [Kersten and Murphy, 2005].
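One way to picture how an “interestingness” value could translate into a log level is to split the observed DOI range into equal-width buckets, one per level. The sketch below illustrates that idea only; the class DoiToLevel, the choice of participating levels, and the equal-width partitioning are assumptions, and the paper should be consulted for the scheme the tool actually uses.

```java
import java.util.logging.Level;

// Illustrative mapping from a DOI value to a JUL level (assumed scheme).
class DoiToLevel {
    // Ordered from least to most prominent; which levels participate is an assumption.
    static final Level[] LEVELS = {
        Level.FINEST, Level.FINER, Level.FINE, Level.CONFIG, Level.INFO
    };

    // Maps doi, expected to lie in [minDoi, maxDoi], to one of LEVELS using equal-width buckets.
    static Level levelFor(double doi, double minDoi, double maxDoi) {
        if (maxDoi <= minDoi) {
            return LEVELS[0]; // degenerate range: fall back to the lowest level
        }
        double fraction = (doi - minDoi) / (maxDoi - minDoi);
        int index = (int) (fraction * LEVELS.length);
        // Clamp so that doi == maxDoi (or out-of-range input) stays within the array.
        index = Math.max(0, Math.min(LEVELS.length - 1, index));
        return LEVELS[index];
    }
}
```

With these assumptions, code enclosing a logging statement with the highest DOI in the project maps to INFO and the lowest maps to FINEST.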
7. What is Mylyn?
Standard Eclipse Integrated Development Environment (IDE) plug-in.
Focuses graphical components of the IDE.
Only “interesting” artifacts related to the currently active task are revealed [Kersten and Murphy, 2006].
The more interaction with an artifact (e.g., file), the more prominent it appears in the IDE.
Less recently used artifacts appear less prominently.
8. Mylyn Adaptation
Programmatically manipulate a DOI model using Git code changes.
Transform source code to “rejuvenate” feature logging statement levels.
Pull logs related to features whose implementations are worked on more often and more recently to the forefront.
Push logs related to features whose implementations are worked on less often and less recently to the background.
Goals
Reduce information overload.
Support system evolution.
Automatically bring more relevant features to developers’ attention and vice versa.
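The pull and push behavior described above depends on both how often and how recently code is changed. A minimal sketch of such a model follows, assuming commits are replayed oldest first, every existing value decays by a fixed factor per commit, and each edited method receives a fixed bump. The class CommitDoiModel and the constants DECAY and BUMP are hypothetical; this is not Mylyn's API or the tool's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Toy degree-of-interest model driven by a commit history (assumed scheme).
class CommitDoiModel {
    private static final double DECAY = 0.5; // assumed per-commit decay factor
    private static final double BUMP = 1.0;  // assumed increment per edit

    private final Map<String, Double> doi = new HashMap<>();

    // Replays one commit: decay all known methods, then bump the edited ones.
    void applyCommit(Iterable<String> editedMethods) {
        doi.replaceAll((method, value) -> value * DECAY);
        for (String method : editedMethods) {
            doi.merge(method, BUMP, Double::sum);
        }
    }

    // Returns the current DOI of a method, or 0.0 if it was never edited.
    double doiOf(String method) {
        return doi.getOrDefault(method, 0.0);
    }
}
```

Under this scheme a method edited in the latest commits ends up with a higher value than one edited equally often but earlier, which is the ordering the rejuvenation relies on.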
12. Implementation
Implemented as an open-source plug-in to the Eclipse IDE.
May also be used with popular build systems via plug-ins.
Supports two popular logging frameworks, SLF4J and JUL.
Integrates with JGit and Mylyn.
Available at https://git.io/fjlTY.
13. Research Questions
1. How applicable is our tool to real-world open-source software, and how does it behave on such software?
2. Does our tool help developers focus on feature implementation bugs?
3. Do developers find the results acceptable? What is the impact of our tool?
14. Evaluation Overview
18 Java projects, ~3 MLOC, and ~4K logging statements.
Fully-automated analysis running time:
10.66 secs per analyzed logging statement.
0.89 secs per KLOC changed.
Developers do not actively think about how their logging statement levels evolve with their software.
Successfully analyzes 99.26% of candidate logging statements.
Increases log level distributions by an average of ~20%.
Ideally transforms log levels in bug contexts ~83% of the time.
Preliminary pull request study: requests were successfully integrated into 2 large and popular open-source projects (comparable to related work [S. Li et al., 2018]).
More details in the paper!
15. Conclusion
Feature logging statements document important values and track progress of feature implementations.
As interest in features evolves, feature logging levels may also require modification to combat information overload.
Our approach discovers and rectifies mismatches between feature interest levels and logging levels.
Results show that the technique is promising in alleviating the burden of manually evolving logging levels.
Future Work
Expand pull request study.
Issue wide-scale developer surveys.
Enhance feature logging statement classification heuristics with ML.
17. Appendix Additional Material
For Further Reading I
Apache Software Foundation (2020). Log4j. Log4j 2 Architecture. url: http://logging.apache.org/log4j/2.x/manual/architecture.html#Logger_Hierarchy (visited on 06/12/2020).
Chen, Boyuan and Zhen Ming (Jack) Jiang (2017). “Characterizing and Detecting Anti-Patterns in the Logging Code”. In: International Conference on Software Engineering. ICSE ’17. Buenos Aires, Argentina: IEEE Press, pp. 71–81. isbn: 9781538638682. doi: 10.1109/ICSE.2017.15.
Eclipse Foundation, Inc. (2020). JGit. url: http://eclip.se/gF (visited on 03/02/2020).
Hassani, Mehran et al. (Mar. 2018). “Studying and detecting log-related issues”. In: Empirical Software Engineering. issn: 1573-7616. doi: 10.1007/s10664-018-9603-z. url: https://doi.org/10.1007/s10664-018-9603-z.
He, Pinjia et al. (2018). “Characterizing the Natural Language Descriptions in Software Logging Statements”. In: International Conference on Automated Software Engineering. ASE 2018. Montpellier, France: ACM, pp. 178–189. isbn: 9781450359375. doi: 10.1145/3238147.3238193.
Kabinna, Suhas et al. (Feb. 2018). “Examining the Stability of Logging Statements”. In: Empirical Software Engineering 23.1, pp. 290–333. issn: 1382-3256. doi: 10.1007/s10664-017-9518-0.
Kersten, Mik and Gail C. Murphy (2005). “Mylar: a degree-of-interest model for IDEs”. In: International Conference on Aspect-Oriented Software Development. Chicago, Illinois: ACM, pp. 159–168. isbn: 1-59593-042-6. doi: 10.1145/1052898.1052912.
Kersten, Mik and Gail C. Murphy (2006). “Using Task Context to Improve Programmer Productivity”. In: ACM Symposium on the Foundations of Software Engineering. SIGSOFT ’06/FSE-14. Portland, Oregon, USA: ACM, pp. 1–11. isbn: 1-59593-468-5. doi: 10.1145/1181775.1181777.
Li, Heng, Weiyi Shang, and Ahmed E. Hassan (Aug. 2017). “Which Log Level Should Developers Choose for a New Logging Statement?” In: Empirical Software Engineering 22.4, pp. 1684–1716. issn: 1382-3256. doi: 10.1007/s10664-016-9456-2.
18. For Further Reading II
Li, Shanshan et al. (2018). “Logtracker: Learning Log Revision Behaviors Proactively from Software Evolution History”. In: International Conference on Program Comprehension. ICPC ’18. Gothenburg, Sweden: ACM, pp. 178–188. isbn: 978-1-4503-5714-2. doi: 10.1145/3196321.3196328.
Oracle (2018). Logger (Java SE 10 & JDK 10). url: http://docs.oracle.com/javase/10/docs/api/java/util/logging/Logger.html (visited on 02/29/2020).
19. Related Work
Source: Jonathan Joseph Bondhus / CC BY-SA
(https://creativecommons.org/licenses/by-sa/3.0)
Existing approaches [Chen and Jiang, 2017; Hassani et al., 2018; He et al., 2018; Kabinna et al., 2018; H. Li et al., 2017] are inclined to focus on either new logging statements or log messages.
Logger hierarchies [Apache Software Foundation, 2020; Oracle, 2018] may help but still require manual maintenance.
20. Rename Refactorings & Copying
Program elements (e.g., methods) changed in Git may no longer exist in current project version.
Must process rename refactorings.
Maintain a data structure that associates rename relationships between program elements, e.g., method signatures.
Use lightweight refactoring approximations.
Use copy detection features of Git at the file level.
New copy “inherits” old DOI values.
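The rename-tracking data structure can be pictured as a map from old to new method signatures that is followed transitively, so that a change recorded against a historical name is credited to the element that exists today. This is a minimal sketch under that assumption; the class RenameTracker and its method names are hypothetical and are not taken from the tool.

```java
import java.util.HashMap;
import java.util.Map;

// Tracks rename relationships between program elements such as method signatures.
class RenameTracker {
    private final Map<String, String> renamedTo = new HashMap<>();

    // Records that oldSignature was renamed to newSignature in some commit.
    void recordRename(String oldSignature, String newSignature) {
        renamedTo.put(oldSignature, newSignature);
    }

    // Follows rename links to the most recent known signature.
    String resolve(String signature) {
        String current = signature;
        int hops = 0;
        // The hop bound guards against accidental cycles in the recorded renames.
        while (renamedTo.containsKey(current) && hops++ < renamedTo.size()) {
            current = renamedTo.get(current);
        }
        return current;
    }
}
```

A DOI update mined for an old signature would then be applied to resolve(oldSignature), which is one way a renamed or copied element could “inherit” earlier DOI values.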
21. Classifying Feature Logging Statements
Logging levels are often used to differentiate various logging “categories” (e.g., severe errors, security breaches).
Need to distinguish between these and feature logs.
Derive a set of heuristics based on first-hand developer interactions.
Also distinguish between less-critical debugging logs (e.g., tracing) using a keyword-based approach.
Goals
Focus on only manipulating logging statements tied to features to better align them with developers’ current interests.
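A classification heuristic of this kind could combine a level check with a keyword check. The sketch below is only an illustration: the class FeatureLogClassifier, the WARNING threshold for “category” logs, and the keyword list are all assumptions, since the slides do not enumerate the heuristics or keywords actually used.

```java
import java.util.List;
import java.util.Locale;
import java.util.logging.Level;

// Heuristic filter separating feature logs from category and tracing logs (assumed rules).
class FeatureLogClassifier {
    // Hypothetical keywords hinting at tracing-style debugging logs.
    static final List<String> TRACING_KEYWORDS = List.of("trace", "entering", "exiting");

    static boolean isFeatureLog(Level level, String message) {
        // Assumption: WARNING and above denote error categories, not feature interest.
        if (level.intValue() >= Level.WARNING.intValue()) {
            return false;
        }
        String lower = message.toLowerCase(Locale.ROOT);
        // Substring matching is deliberately simple and may produce false positives.
        return TRACING_KEYWORDS.stream().noneMatch(lower::contains);
    }
}
```

Only statements for which isFeatureLog returns true would then be candidates for level transformation, leaving error and tracing logs untouched.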