SlideShare a Scribd company logo
Tracing Back Log Data to
its Log Statement:
From Research to Practice
Daan Schipper (Adyen)
Maurício F. Aniche, Arie van Deursen (Delft University of Technology)
DEV OPS
20170101160001 Adyen version: ******
20170101160002 Starting TX/amt=10001/currency=978
20170101160003 Starting EMV
20170101160004 EMV started
20170101160005 Magswipe opened
20170101160006 CTLS started
20170101160007 Transaction initialised
20170101160008 Run TX as EMV transaction
20170101160009 Application selected app:******
20170101160010 read_application_data succeeded
20170101160011 data_authentication succeeded
20170101160012 validate 0
20170101160013 DCC rejected
20170101160014 terminal_risk_management succeeded
20170101160015 verify_card_holder succeeded
20170101160016 generate_first_ac succeeded
20170101160017 Authorizing online
20170101160018 Data returned by the host succeeded
20170101160019 Transaction authorized by card
20170101160020 Approved receipt printed
20170101160021 pos_result_code:APPROVED
20170101160022 Final status: Approved
20170101160001 Adyen version: ******
20170101160002 Starting TX/amt=10001/currency=978
20170101160003 Starting EMV
20170101160004 EMV started
20170101160005 Magswipe opened
20170101160006 CTLS started
20170101160007 Transaction initialised
20170101160008 Run TX as EMV transaction
20170101160009 Application selected app:******
20170101160010 read_application_data succeeded
20170101160011 data_authentication succeeded
20170101160012 validate 0
20170101160013 DCC rejected
20170101160014 terminal_risk_management succeeded
20170101160015 verify_card_holder succeeded
20170101160016 generate_first_ac succeeded
20170101160017 Authorizing online
20170101160018 Data returned by the host succeeded
20170101160019 Transaction authorized by card
20170101160020 Approved receipt printed
20170101160021 pos_result_code:APPROVED
20170101160022 Final status: Approved
20170101160001 Adyen version: ******
20170101160002 Starting TX/amt=10001/currency=978
20170101160003 Starting EMV
20170101160004 EMV started
20170101160005 Magswipe opened
20170101160006 CTLS started
20170101160007 Transaction initialised
20170101160008 Run TX as EMV transaction
20170101160009 Application selected app:******
20170101160010 read_application_data succeeded
20170101160011 data_authentication succeeded
20170101160012 validate 0
20170101160013 DCC rejected
20170101160014 terminal_risk_management succeeded
20170101160015 verify_card_holder succeeded
20170101160016 generate_first_ac succeeded
20170101160017 Authorizing online
20170101160018 Data returned by the host succeeded
20170101160019 Transaction authorized by card
20170101160020 Approved receipt printed
20170101160021 pos_result_code:APPROVED
20170101160022 Final status: Approved
20170101160001 Adyen version: ******
20170101160002 Starting TX/amt=10001/currency=978
20170101160003 Starting EMV
20170101160004 EMV started
20170101160005 Magswipe opened
20170101160006 CTLS started
20170101160007 Transaction initialised
20170101160008 Run TX as EMV transaction
20170101160009 Application selected app:******
20170101160010 read_application_data succeeded
20170101160011 data_authentication succeeded
20170101160012 validate 0
20170101160013 DCC rejected
20170101160014 terminal_risk_management succeeded
20170101160015 verify_card_holder succeeded
20170101160016 generate_first_ac succeeded
20170101160017 Authorizing online
20170101160018 Data returned by the host succeeded
20170101160019 Transaction authorized by card
20170101160020 Approved receipt printed
20170101160021 pos_result_code:APPROVED
20170101160022 Final status: Approved
20170101160001 Adyen version: ******
20170101160002 Starting TX/amt=10001/currency=978
20170101160003 Starting EMV
20170101160004 EMV started
20170101160005 Magswipe opened
20170101160006 CTLS started
20170101160007 Transaction initialised
20170101160008 Run TX as EMV transaction
20170101160009 Application selected app:******
20170101160010 read_application_data succeeded
20170101160011 data_authentication succeeded
20170101160012 validate 0
20170101160013 DCC rejected
20170101160014 terminal_risk_management succeeded
20170101160015 verify_card_holder succeeded
20170101160016 generate_first_ac succeeded
20170101160017 Authorizing online
20170101160018 Data returned by the host succeeded
20170101160019 Transaction authorized by card
20170101160020 Approved receipt printed
20170101160021 pos_result_code:APPROVED
20170101160022 Final status: Approved
log.info(”Customer “ + customer +
“ paying ” + paymentValue);
[2019-02-03 15:43:24] [MagicPayment.java][L456] Customer Maurício paying 235.67
Class name Line number
In practice…
• Adyen can’t log the class and the line number that
originates a log message.
• Developers ‘grep’ the source code
• Can we automatically detect where origin of the log
lines?
• W. Xu, L. Huang, A. Fox, D. Patterson, and M. I. Jordan.
Detecting large-scale system problems by mining console
logs. In Proceedings of the ACM SIGOPS 22nd symposium
on Operating systems principles, pages 117–132. ACM,
2009.
Match
message to
template
Source
code
Find log
statements
Create
template
Enrich
template
Create index
Template
database
Logs
Link
Template creation process Matching process
query
Templates
Match
message to
template
Source
code
Find log
statements
Create
template
Enrich
template
Create index
Template
database
Logs
Link
Template creation process Matching process
query
Templates
log.info(“Customer “ + customer + “ ID “ + id); à “Customer .* paying d*”
Match
message to
template
Source
code
Find log
statements
Create
template
Enrich
template
Create index
Template
database
Logs
Link
Template creation process Matching process
query
Templates
RQ1: Accuracy
• We collect 100k messages from a week day.
• These 100k messages pointed to 676 different locations in the source
code.
• 95% CL, 5% CI sample = 245 links.
• Manually investigation = 97.6% accuracy (239 out of 245 correct links)
• 99.8% accuracy in the original paper (two projects: HDFS and Darkstar,
millions of log lines, # of log statements not specified).
RQ2: Performance
Creating the ASTs of the entire
codebase is the most expensive
part of the process.
Step Time (minutes) Percentage
Finding log statements 37:25 92.1%
Creating template 00:32 1.3%
Enriching template 02:27 5.9%
Creating index 00:16 0.7%
Total 40:42 100%
Related work: not really reported.
Overall process: 200 HDFS nodes (with aggressive logging)
over 48 hours => 3 minutes with 50 nodes, or less than 10
minutes with 10 nodes.
When does it fail?
• JSON-based logs
• The developer logs a JSON (that is created in runtime) and thus the log
statement is simply “log.info(json)”, making our template inaccurate.
• Unknown logging method
• Developers create their own logging methods, which our tool can’t recognize.
• Log strings created on-the-fly
• Some log messages are too complex and developers create them by means of
multiple line of code (e.g. String log = “content1”; log = log + “content 2”, …),
which makes the analysis harder.
• Related work: “Almost all of these (failed) messages contain long
string variables.” => same for us.
Summary
• Logging the class and line number that originates a log message can be
expensive.
• Heuristics to make this link are available.
• We evaluated Xu et al’s proposal in an industry dataset:
• High accuracy (~97%)
• Reasonable performance, where the high cost comes from AST generation.
• Complex log statements make the analysis hard.
Schipper, Aniche, van Deursen. Tracing Back Log Data to its Log Statement:
From Research to Practice.
Contact: Daan.Schipper@adyen.com, M.FinavaroAniche@tudelft.nl,
Arie.VanDeursen@tudelft.nl
Paper: http://bit.ly/msr19-tracing-log-statements

More Related Content

Similar to Tracing Back Log Data to its Log Statement: From Research to Practice

Global e payment system ppt
Global e payment system pptGlobal e payment system ppt
Global e payment system ppt
Salman Khaja
 
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
NETWAYS
 
DataArt Innovation Showcase Blockchain Billing
DataArt Innovation Showcase Blockchain BillingDataArt Innovation Showcase Blockchain Billing
DataArt Innovation Showcase Blockchain Billing
Alan Quayle
 
Patent Blockchain People Bank of China (PBOC)
Patent Blockchain People Bank of China (PBOC)Patent Blockchain People Bank of China (PBOC)
Patent Blockchain People Bank of China (PBOC)
Rein Mahatma
 
ATM and NDC Overview for eduction on ATM technology.pptx
ATM and NDC Overview for eduction on ATM technology.pptxATM and NDC Overview for eduction on ATM technology.pptx
ATM and NDC Overview for eduction on ATM technology.pptx
KarthikeyanSaddyS
 
Modern EPOS System for Retail business
Modern EPOS System for Retail businessModern EPOS System for Retail business
Modern EPOS System for Retail business
Vivek Siddhartha
 
VTC Pay Presentation
VTC Pay PresentationVTC Pay Presentation
VTC Pay Presentation
Lucas Nguyen
 
VTC Pay Presentation
VTC Pay PresentationVTC Pay Presentation
VTC Pay Presentation
Lucas Nguyen
 
Automated processing of client incoming payment v3
Automated processing of client incoming payment v3Automated processing of client incoming payment v3
Automated processing of client incoming payment v3
Marius Diaconu
 
PPS.pptx this ppt is for coding your problems and to do ppt for new students ...
PPS.pptx this ppt is for coding your problems and to do ppt for new students ...PPS.pptx this ppt is for coding your problems and to do ppt for new students ...
PPS.pptx this ppt is for coding your problems and to do ppt for new students ...
ragishettyanilkumar
 
Have your cake and eat it too, further dispelling the myths of the lambda arc...
Have your cake and eat it too, further dispelling the myths of the lambda arc...Have your cake and eat it too, further dispelling the myths of the lambda arc...
Have your cake and eat it too, further dispelling the myths of the lambda arc...
Dimos Raptis
 
3452 - Managing your applications
3452 - Managing your applications3452 - Managing your applications
3452 - Managing your applications
Timothy McCormick
 
5-LEC- 5.pptxTransport Layer. Transport Layer Protocols
5-LEC- 5.pptxTransport Layer.  Transport Layer Protocols5-LEC- 5.pptxTransport Layer.  Transport Layer Protocols
5-LEC- 5.pptxTransport Layer. Transport Layer Protocols
ZahouAmel1
 
WSO2 Complex Event Processor
WSO2 Complex Event ProcessorWSO2 Complex Event Processor
WSO2 Complex Event Processor
Sriskandarajah Suhothayan
 
Automating the Waterway Inspection Reporting Customer Success Story
Automating the Waterway Inspection Reporting Customer Success StoryAutomating the Waterway Inspection Reporting Customer Success Story
Automating the Waterway Inspection Reporting Customer Success Story
Safe Software
 
Mobile Data Terminal M710W Presentation
Mobile Data Terminal M710W PresentationMobile Data Terminal M710W Presentation
Mobile Data Terminal M710W Presentation
Harvey Wang
 
Sujith ~ cross applications
Sujith ~ cross applicationsSujith ~ cross applications
Sujith ~ cross applications
Kranthi Kumar
 
ProjectReport_Subhayu
ProjectReport_SubhayuProjectReport_Subhayu
ProjectReport_Subhayu
Subhayu Chakravorty
 
The Future of Payments
The Future of PaymentsThe Future of Payments
The Future of Payments
Netcetera
 
Fraud Detection in Real-time @ Apache Big Data Con
Fraud Detection in Real-time @ Apache Big Data ConFraud Detection in Real-time @ Apache Big Data Con
Fraud Detection in Real-time @ Apache Big Data Con
Seshika Fernando
 

Similar to Tracing Back Log Data to its Log Statement: From Research to Practice (20)

Global e payment system ppt
Global e payment system pptGlobal e payment system ppt
Global e payment system ppt
 
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
 
DataArt Innovation Showcase Blockchain Billing
DataArt Innovation Showcase Blockchain BillingDataArt Innovation Showcase Blockchain Billing
DataArt Innovation Showcase Blockchain Billing
 
Patent Blockchain People Bank of China (PBOC)
Patent Blockchain People Bank of China (PBOC)Patent Blockchain People Bank of China (PBOC)
Patent Blockchain People Bank of China (PBOC)
 
ATM and NDC Overview for eduction on ATM technology.pptx
ATM and NDC Overview for eduction on ATM technology.pptxATM and NDC Overview for eduction on ATM technology.pptx
ATM and NDC Overview for eduction on ATM technology.pptx
 
Modern EPOS System for Retail business
Modern EPOS System for Retail businessModern EPOS System for Retail business
Modern EPOS System for Retail business
 
VTC Pay Presentation
VTC Pay PresentationVTC Pay Presentation
VTC Pay Presentation
 
VTC Pay Presentation
VTC Pay PresentationVTC Pay Presentation
VTC Pay Presentation
 
Automated processing of client incoming payment v3
Automated processing of client incoming payment v3Automated processing of client incoming payment v3
Automated processing of client incoming payment v3
 
PPS.pptx this ppt is for coding your problems and to do ppt for new students ...
PPS.pptx this ppt is for coding your problems and to do ppt for new students ...PPS.pptx this ppt is for coding your problems and to do ppt for new students ...
PPS.pptx this ppt is for coding your problems and to do ppt for new students ...
 
Have your cake and eat it too, further dispelling the myths of the lambda arc...
Have your cake and eat it too, further dispelling the myths of the lambda arc...Have your cake and eat it too, further dispelling the myths of the lambda arc...
Have your cake and eat it too, further dispelling the myths of the lambda arc...
 
3452 - Managing your applications
3452 - Managing your applications3452 - Managing your applications
3452 - Managing your applications
 
5-LEC- 5.pptxTransport Layer. Transport Layer Protocols
5-LEC- 5.pptxTransport Layer.  Transport Layer Protocols5-LEC- 5.pptxTransport Layer.  Transport Layer Protocols
5-LEC- 5.pptxTransport Layer. Transport Layer Protocols
 
WSO2 Complex Event Processor
WSO2 Complex Event ProcessorWSO2 Complex Event Processor
WSO2 Complex Event Processor
 
Automating the Waterway Inspection Reporting Customer Success Story
Automating the Waterway Inspection Reporting Customer Success StoryAutomating the Waterway Inspection Reporting Customer Success Story
Automating the Waterway Inspection Reporting Customer Success Story
 
Mobile Data Terminal M710W Presentation
Mobile Data Terminal M710W PresentationMobile Data Terminal M710W Presentation
Mobile Data Terminal M710W Presentation
 
Sujith ~ cross applications
Sujith ~ cross applicationsSujith ~ cross applications
Sujith ~ cross applications
 
ProjectReport_Subhayu
ProjectReport_SubhayuProjectReport_Subhayu
ProjectReport_Subhayu
 
The Future of Payments
The Future of PaymentsThe Future of Payments
The Future of Payments
 
Fraud Detection in Real-time @ Apache Big Data Con
Fraud Detection in Real-time @ Apache Big Data ConFraud Detection in Real-time @ Apache Big Data Con
Fraud Detection in Real-time @ Apache Big Data Con
 

More from Maurício Aniche

Can ML help software developers? (TEQnation 2022)
Can ML help software developers? (TEQnation 2022)Can ML help software developers? (TEQnation 2022)
Can ML help software developers? (TEQnation 2022)
Maurício Aniche
 
Pragmatic software testing education - SIGCSE 2019
Pragmatic software testing education - SIGCSE 2019Pragmatic software testing education - SIGCSE 2019
Pragmatic software testing education - SIGCSE 2019
Maurício Aniche
 
Test Automation Day 2018
Test Automation Day 2018Test Automation Day 2018
Test Automation Day 2018
Maurício Aniche
 
Code smells in MVC applications (Dutch Spring meetup)
Code smells in MVC applications (Dutch Spring meetup)Code smells in MVC applications (Dutch Spring meetup)
Code smells in MVC applications (Dutch Spring meetup)
Maurício Aniche
 
A Collaborative Approach to Teach Software Architecture - SIGCSE 2017
A Collaborative Approach to Teach Software Architecture - SIGCSE 2017A Collaborative Approach to Teach Software Architecture - SIGCSE 2017
A Collaborative Approach to Teach Software Architecture - SIGCSE 2017
Maurício Aniche
 
Code quality in MVC systems - BENEVOL 2016
Code quality in MVC systems - BENEVOL 2016Code quality in MVC systems - BENEVOL 2016
Code quality in MVC systems - BENEVOL 2016
Maurício Aniche
 
A Validated Set of Smells for MVC Architectures - ICSME 2016
A Validated Set of Smells for MVC Architectures - ICSME 2016A Validated Set of Smells for MVC Architectures - ICSME 2016
A Validated Set of Smells for MVC Architectures - ICSME 2016
Maurício Aniche
 
SATT: Tailoring Code Metric Thresholds for Different Software Architectures (...
SATT: Tailoring Code Metric Thresholds for Different Software Architectures (...SATT: Tailoring Code Metric Thresholds for Different Software Architectures (...
SATT: Tailoring Code Metric Thresholds for Different Software Architectures (...
Maurício Aniche
 
DNAD 2015 - Métricas de código, pra que te quero?
DNAD 2015 - Métricas de código, pra que te quero?DNAD 2015 - Métricas de código, pra que te quero?
DNAD 2015 - Métricas de código, pra que te quero?
Maurício Aniche
 
Como eu aprendi que testar software é importante?
Como eu aprendi que testar software é importante?Como eu aprendi que testar software é importante?
Como eu aprendi que testar software é importante?
Maurício Aniche
 
Proposta: Métricas e Heurísticas para Detecção de Problemas em Aplicações Web
Proposta: Métricas e Heurísticas para Detecção de Problemas em Aplicações WebProposta: Métricas e Heurísticas para Detecção de Problemas em Aplicações Web
Proposta: Métricas e Heurísticas para Detecção de Problemas em Aplicações Web
Maurício Aniche
 
Efeitos da Prática de Revisão de Código na Caelum: Um Estudo Preliminar em Du...
Efeitos da Prática de Revisão de Código na Caelum: Um Estudo Preliminar em Du...Efeitos da Prática de Revisão de Código na Caelum: Um Estudo Preliminar em Du...
Efeitos da Prática de Revisão de Código na Caelum: Um Estudo Preliminar em Du...
Maurício Aniche
 
Test-Driven Development serve pra mim?
Test-Driven Development serve pra mim?Test-Driven Development serve pra mim?
Test-Driven Development serve pra mim?
Maurício Aniche
 
O que estamos temos feito com mineração de repositório de código no IME?
O que estamos temos feito com mineração de repositório de código no IME?O que estamos temos feito com mineração de repositório de código no IME?
O que estamos temos feito com mineração de repositório de código no IME?
Maurício Aniche
 
MetricMiner: Supporting Researchers in Mining Software Repositories - SCAM 2013
MetricMiner: Supporting Researchers in Mining Software Repositories - SCAM 2013MetricMiner: Supporting Researchers in Mining Software Repositories - SCAM 2013
MetricMiner: Supporting Researchers in Mining Software Repositories - SCAM 2013
Maurício Aniche
 
Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study - W...
Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study - W...Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study - W...
Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study - W...
Maurício Aniche
 
Minicurso sobre Evolução de Software no CBSoft 2011
Minicurso sobre Evolução de Software no CBSoft 2011Minicurso sobre Evolução de Software no CBSoft 2011
Minicurso sobre Evolução de Software no CBSoft 2011
Maurício Aniche
 
MTD2014 - Are The Methods In Your DAOs in the Right Place? A Preliminary Study
MTD2014 - Are The Methods In Your DAOs in the Right Place? A Preliminary StudyMTD2014 - Are The Methods In Your DAOs in the Right Place? A Preliminary Study
MTD2014 - Are The Methods In Your DAOs in the Right Place? A Preliminary Study
Maurício Aniche
 
[TDC 2014] Métricas de código, pra que te quero?
[TDC 2014] Métricas de código, pra que te quero?[TDC 2014] Métricas de código, pra que te quero?
[TDC 2014] Métricas de código, pra que te quero?
Maurício Aniche
 
Code coverage for MSR Researches [Work in Progress]
Code coverage for MSR Researches [Work in Progress]Code coverage for MSR Researches [Work in Progress]
Code coverage for MSR Researches [Work in Progress]
Maurício Aniche
 

More from Maurício Aniche (20)

Can ML help software developers? (TEQnation 2022)
Can ML help software developers? (TEQnation 2022)Can ML help software developers? (TEQnation 2022)
Can ML help software developers? (TEQnation 2022)
 
Pragmatic software testing education - SIGCSE 2019
Pragmatic software testing education - SIGCSE 2019Pragmatic software testing education - SIGCSE 2019
Pragmatic software testing education - SIGCSE 2019
 
Test Automation Day 2018
Test Automation Day 2018Test Automation Day 2018
Test Automation Day 2018
 
Code smells in MVC applications (Dutch Spring meetup)
Code smells in MVC applications (Dutch Spring meetup)Code smells in MVC applications (Dutch Spring meetup)
Code smells in MVC applications (Dutch Spring meetup)
 
A Collaborative Approach to Teach Software Architecture - SIGCSE 2017
A Collaborative Approach to Teach Software Architecture - SIGCSE 2017A Collaborative Approach to Teach Software Architecture - SIGCSE 2017
A Collaborative Approach to Teach Software Architecture - SIGCSE 2017
 
Code quality in MVC systems - BENEVOL 2016
Code quality in MVC systems - BENEVOL 2016Code quality in MVC systems - BENEVOL 2016
Code quality in MVC systems - BENEVOL 2016
 
A Validated Set of Smells for MVC Architectures - ICSME 2016
A Validated Set of Smells for MVC Architectures - ICSME 2016A Validated Set of Smells for MVC Architectures - ICSME 2016
A Validated Set of Smells for MVC Architectures - ICSME 2016
 
SATT: Tailoring Code Metric Thresholds for Different Software Architectures (...
SATT: Tailoring Code Metric Thresholds for Different Software Architectures (...SATT: Tailoring Code Metric Thresholds for Different Software Architectures (...
SATT: Tailoring Code Metric Thresholds for Different Software Architectures (...
 
DNAD 2015 - Métricas de código, pra que te quero?
DNAD 2015 - Métricas de código, pra que te quero?DNAD 2015 - Métricas de código, pra que te quero?
DNAD 2015 - Métricas de código, pra que te quero?
 
Como eu aprendi que testar software é importante?
Como eu aprendi que testar software é importante?Como eu aprendi que testar software é importante?
Como eu aprendi que testar software é importante?
 
Proposta: Métricas e Heurísticas para Detecção de Problemas em Aplicações Web
Proposta: Métricas e Heurísticas para Detecção de Problemas em Aplicações WebProposta: Métricas e Heurísticas para Detecção de Problemas em Aplicações Web
Proposta: Métricas e Heurísticas para Detecção de Problemas em Aplicações Web
 
Efeitos da Prática de Revisão de Código na Caelum: Um Estudo Preliminar em Du...
Efeitos da Prática de Revisão de Código na Caelum: Um Estudo Preliminar em Du...Efeitos da Prática de Revisão de Código na Caelum: Um Estudo Preliminar em Du...
Efeitos da Prática de Revisão de Código na Caelum: Um Estudo Preliminar em Du...
 
Test-Driven Development serve pra mim?
Test-Driven Development serve pra mim?Test-Driven Development serve pra mim?
Test-Driven Development serve pra mim?
 
O que estamos temos feito com mineração de repositório de código no IME?
O que estamos temos feito com mineração de repositório de código no IME?O que estamos temos feito com mineração de repositório de código no IME?
O que estamos temos feito com mineração de repositório de código no IME?
 
MetricMiner: Supporting Researchers in Mining Software Repositories - SCAM 2013
MetricMiner: Supporting Researchers in Mining Software Repositories - SCAM 2013MetricMiner: Supporting Researchers in Mining Software Repositories - SCAM 2013
MetricMiner: Supporting Researchers in Mining Software Repositories - SCAM 2013
 
Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study - W...
Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study - W...Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study - W...
Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study - W...
 
Minicurso sobre Evolução de Software no CBSoft 2011
Minicurso sobre Evolução de Software no CBSoft 2011Minicurso sobre Evolução de Software no CBSoft 2011
Minicurso sobre Evolução de Software no CBSoft 2011
 
MTD2014 - Are The Methods In Your DAOs in the Right Place? A Preliminary Study
MTD2014 - Are The Methods In Your DAOs in the Right Place? A Preliminary StudyMTD2014 - Are The Methods In Your DAOs in the Right Place? A Preliminary Study
MTD2014 - Are The Methods In Your DAOs in the Right Place? A Preliminary Study
 
[TDC 2014] Métricas de código, pra que te quero?
[TDC 2014] Métricas de código, pra que te quero?[TDC 2014] Métricas de código, pra que te quero?
[TDC 2014] Métricas de código, pra que te quero?
 
Code coverage for MSR Researches [Work in Progress]
Code coverage for MSR Researches [Work in Progress]Code coverage for MSR Researches [Work in Progress]
Code coverage for MSR Researches [Work in Progress]
 

Recently uploaded

Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
TIPNGVN2
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 

Recently uploaded (20)

Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 

Tracing Back Log Data to its Log Statement: From Research to Practice

  • 1. Tracing Back Log Data to its Log Statement: From Research to Practice Daan Schipper (Adyen) Maurício F. Aniche, Arie van Deursen (Delft University of Technology)
  • 2. DEV OPS 20170101160001 Adyen version: ****** 20170101160002 Starting TX/amt=10001/currency=978 20170101160003 Starting EMV 20170101160004 EMV started 20170101160005 Magswipe opened 20170101160006 CTLS started 20170101160007 Transaction initialised 20170101160008 Run TX as EMV transaction 20170101160009 Application selected app:****** 20170101160010 read_application_data succeeded 20170101160011 data_authentication succeeded 20170101160012 validate 0 20170101160013 DCC rejected 20170101160014 terminal_risk_management succeeded 20170101160015 verify_card_holder succeeded 20170101160016 generate_first_ac succeeded 20170101160017 Authorizing online 20170101160018 Data returned by the host succeeded 20170101160019 Transaction authorized by card 20170101160020 Approved receipt printed 20170101160021 pos_result_code:APPROVED 20170101160022 Final status: Approved 20170101160001 Adyen version: ****** 20170101160002 Starting TX/amt=10001/currency=978 20170101160003 Starting EMV 20170101160004 EMV started 20170101160005 Magswipe opened 20170101160006 CTLS started 20170101160007 Transaction initialised 20170101160008 Run TX as EMV transaction 20170101160009 Application selected app:****** 20170101160010 read_application_data succeeded 20170101160011 data_authentication succeeded 20170101160012 validate 0 20170101160013 DCC rejected 20170101160014 terminal_risk_management succeeded 20170101160015 verify_card_holder succeeded 20170101160016 generate_first_ac succeeded 20170101160017 Authorizing online 20170101160018 Data returned by the host succeeded 20170101160019 Transaction authorized by card 20170101160020 Approved receipt printed 20170101160021 pos_result_code:APPROVED 20170101160022 Final status: Approved 20170101160001 Adyen version: ****** 20170101160002 Starting TX/amt=10001/currency=978 20170101160003 Starting EMV 20170101160004 EMV started 20170101160005 Magswipe opened 20170101160006 CTLS started 20170101160007 Transaction initialised 20170101160008 Run TX as EMV transaction 20170101160009 Application selected app:****** 20170101160010 read_application_data succeeded 20170101160011 data_authentication succeeded 20170101160012 validate 0 20170101160013 DCC rejected 20170101160014 terminal_risk_management succeeded 20170101160015 verify_card_holder succeeded 20170101160016 generate_first_ac succeeded 20170101160017 Authorizing online 20170101160018 Data returned by the host succeeded 20170101160019 Transaction authorized by card 20170101160020 Approved receipt printed 20170101160021 pos_result_code:APPROVED 20170101160022 Final status: Approved 20170101160001 Adyen version: ****** 20170101160002 Starting TX/amt=10001/currency=978 20170101160003 Starting EMV 20170101160004 EMV started 20170101160005 Magswipe opened 20170101160006 CTLS started 20170101160007 Transaction initialised 20170101160008 Run TX as EMV transaction 20170101160009 Application selected app:****** 20170101160010 read_application_data succeeded 20170101160011 data_authentication succeeded 20170101160012 validate 0 20170101160013 DCC rejected 20170101160014 terminal_risk_management succeeded 20170101160015 verify_card_holder succeeded 20170101160016 generate_first_ac succeeded 20170101160017 Authorizing online 20170101160018 Data returned by the host succeeded 20170101160019 Transaction authorized by card 20170101160020 Approved receipt printed 20170101160021 pos_result_code:APPROVED 20170101160022 Final status: Approved 20170101160001 Adyen version: ****** 20170101160002 Starting TX/amt=10001/currency=978 20170101160003 Starting EMV 20170101160004 EMV started 20170101160005 Magswipe opened 20170101160006 CTLS started 20170101160007 Transaction initialised 20170101160008 Run TX as EMV transaction 20170101160009 Application selected app:****** 20170101160010 read_application_data succeeded 20170101160011 data_authentication succeeded 20170101160012 validate 0 20170101160013 DCC rejected 20170101160014 terminal_risk_management succeeded 20170101160015 verify_card_holder succeeded 20170101160016 generate_first_ac succeeded 20170101160017 Authorizing online 20170101160018 Data returned by the host succeeded 20170101160019 Transaction authorized by card 20170101160020 Approved receipt printed 20170101160021 pos_result_code:APPROVED 20170101160022 Final status: Approved
  • 3. log.info(”Customer “ + customer + “ paying ” + paymentValue); [2019-02-03 15:43:24] [MagicPayment.java][L456] Customer Maurício paying 235.67 Class name Line number
  • 4.
  • 5. In practice… • Adyen can’t log the class and the line number that originates a log message. • Developers ‘grep’ the source code • Can we automatically detect where origin of the log lines? • W. Xu, L. Huang, A. Fox, D. Patterson, and M. I. Jordan. Detecting large-scale system problems by mining console logs. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, pages 117–132. ACM, 2009.
  • 6. Match message to template Source code Find log statements Create template Enrich template Create index Template database Logs Link Template creation process Matching process query Templates
  • 7. Match message to template Source code Find log statements Create template Enrich template Create index Template database Logs Link Template creation process Matching process query Templates log.info(“Customer “ + customer + “ ID “ + id); à “Customer .* paying d*”
  • 8. Match message to template Source code Find log statements Create template Enrich template Create index Template database Logs Link Template creation process Matching process query Templates
  • 9. RQ1: Accuracy • We collect 100k messages from a week day. • These 100k messages pointed to 676 different locations in the source code. • 95% CL, 5% CI sample = 245 links. • Manually investigation = 97.6% accuracy (239 out of 245 correct links) • 99.8% accuracy in the original paper (two projects: HDFS and Darkstar, millions of log lines, # of log statements not specified).
  • 10. RQ2: Performance Creating the ASTs of the entire codebase is the most expensive part of the process. Step Time (minutes) Percentage Finding log statements 37:25 92.1% Creating template 00:32 1.3% Enriching template 02:27 5.9% Creating index 00:16 0.7% Total 40:42 100% Related work: not really reported. Overall process: 200 HDFS nodes (with aggressive logging) over 48 hours => 3 minutes with 50 nodes, or less than 10 minutes with 10 nodes.
  • 11. When does it fail? • JSON-based logs • The developer logs a JSON (that is created in runtime) and thus the log statement is simply “log.info(json)”, making our template inaccurate. • Unknown logging method • Developers create their own logging methods, which our tool can’t recognize. • Log strings created on-the-fly • Some log messages are too complex and developers create them by means of multiple line of code (e.g. String log = “content1”; log = log + “content 2”, …), which makes the analysis harder. • Related work: “Almost all of these (failed) messages contain long string variables.” => same for us.
  • 12. Summary • Logging the class and line number that originates a log message can be expensive. • Heuristics to make this link are available. • We evaluated Xu et al’s proposal in an industry dataset: • High accuracy (~97%) • Reasonable performance, where the high cost comes from AST generation. • Complex log statements make the analysis hard. Schipper, Aniche, van Deursen. Tracing Back Log Data to its Log Statement: From Research to Practice. Contact: Daan.Schipper@adyen.com, M.FinavaroAniche@tudelft.nl, Arie.VanDeursen@tudelft.nl Paper: http://bit.ly/msr19-tracing-log-statements