#BugWars @appperf
Bug Wars: Episode IV – A New Hope
Ben Cripps & Martin Pinner
@appperf
www.applicationperformance.com/blogs
www.applicationperformance.com
Copyright © 2018 Application Performance Ltd. All rights reserved.
EPISODE IV – A NEW HOPE
WHO ARE WE?
Martin Pinner: Over 25 years' experience in software development. He is a
founder and director of Application Performance and WebTuna. He specialises
in performance tuning and writing performance-monitoring software,
particularly for web sites.
Ben Cripps: Software consultant for Application Performance Ltd. Ben has a
proven 20-year development and management background, with experience of
delivering, maintaining and managing high-profile, high-value branded solutions.
TIME SPENT FIXING BUGS IN 2017
268 YEARS, EIGHT MONTHS, 2 WEEKS, 3 DAYS, 8 HOURS AND 46 MINUTES
~215 ROUND TRIPS TO MARS!
Sources: Tricentis – 2018 Software Fails Watch, Mars One - FAQ
WHAT IS THE COST OF A BUG?
1. Imagine a loan company having a bug.
2. This bug resulted in 60% of loans not being collected on time. What impact do you
think it would have?
WELL, IT HAS HAPPENED BEFORE!
1. Bug introduced as part of a £21.6m department overhaul.
2. The bug cost £120 million in profit loss in 2017.
3. The company lost £1.7 billion in market capital in just one day.
Sources: Tricentis – 2018 Software Fails Watch, The Register – Provident Financial Software Woes Share Price Crash
TOTAL COST
£1.8 BILLION!
WILL IT HAPPEN AGAIN?
Oh hang on…
Looks familiar?
DEVELOPER VISIBILITY
PRODUCTION VISIBILITY
1. Locked down environments.
2. Cannot see under the covers.
3. Real world users – unexpected usage patterns.
In production we have to rely on log files.
Aren’t they the answer?
Sources: OverOps – Swallowed Exceptions: The Silent Killer of Java Applications
How do we handle exceptions?
1. SWALLOWED EXCEPTIONS: THE SILENT KILLER
20% of catch blocks are empty!
[Bar chart: number of instances of each catch-block action, axis 0 to 4,000,000. Labels in extraction order: LOG STATEMENTS, PRINT STACK TRACE, EMPTY BLOCK, THROW EXCEPTIONS, SYSTEM.OUT.PRINTLN. Values in extraction order: 2,943,327; 2,394,629; 3,067,863; 3,608,309; 900,806.]
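To make the difference between a swallowed exception and a handled one concrete, here is a minimal Java sketch. The class and method names (SwallowedExceptionDemo, parseSwallowed, parseOrThrow) are hypothetical and not from the talk; the sketch uses only java.util.logging from the JDK.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example class; names are illustrative only.
final class SwallowedExceptionDemo {
    private static final Logger LOG =
            Logger.getLogger(SwallowedExceptionDemo.class.getName());

    // Anti-pattern: the empty catch block hides the failure and a default is returned.
    static int parseSwallowed(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // nothing logged, nothing rethrown: the error leaves no trace
        }
        return 0;
    }

    // Alternative: log the exception with its stack trace and the offending input,
    // then rethrow a wider abstraction for a caller further up the stack.
    static int parseOrThrow(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            LOG.log(Level.WARNING, "Could not parse quantity, raw=" + raw, e);
            throw new IllegalArgumentException("Invalid quantity: " + raw, e);
        }
    }

    public static void main(String[] args) {
        // The swallowed version silently yields the default value.
        System.out.println(parseSwallowed("12x"));
        try {
            parseOrThrow("12x");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the first method, a caller cannot tell bad input from a genuine zero; with the second, the log keeps the stack trace and the caller still gets to decide what to do.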
Sources: https://blog.takipi.com/github-research-over-50-of-java-logging-statements-are-written-wrong/
How do we handle exceptions?
2. INFO / TRACE / DEBUG LOGGING TURNED OFF
2/3 of logs are switched off in production.
[Chart: share of logging statements by level. Trace 5%, Info 30%, Debug 28%, Warn 14%, Error 23%, Fatal 0%.]
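The effect of a production log threshold can be sketched in a few lines of Java. This is an illustration only, using java.util.logging, where FINE plays the role of DEBUG, WARNING of WARN and SEVERE of ERROR; the class name LogLevelDemo, the logger name and the messages are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical example class; names are illustrative only.
final class LogLevelDemo {

    // Collects every record that survives the logger's level threshold.
    static final class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override public void publish(LogRecord record) {
            messages.add(record.getLevel() + ": " + record.getMessage());
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Emits one message per level and returns those that were actually logged.
    static List<String> emitAt(Level threshold) {
        Logger logger = Logger.getLogger("demo.orders." + threshold.getName());
        logger.setUseParentHandlers(false); // keep the demo output out of the console
        CapturingHandler handler = new CapturingHandler();
        logger.addHandler(handler);
        logger.setLevel(threshold);

        logger.fine("loaded order from cache"); // debug-style detail
        logger.info("order accepted");
        logger.warning("payment retry");
        logger.severe("payment failed");

        logger.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println("threshold ALL:     " + emitAt(Level.ALL));
        System.out.println("threshold WARNING: " + emitAt(Level.WARNING));
    }
}
```

With the threshold at WARNING, a typical production setting, the two lower-level messages never reach the handler, which is exactly the context you tend to want when reconstructing what led up to a failure.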
How do we handle exceptions?
3. VARIABLES NOT LOGGED
52% of log messages do not have any variable state.
Sources: https://blog.takipi.com/github-research-over-50-of-java-logging-statements-are-written-wrong/
[Chart: share of log messages by number of variables logged. 0: 52%, 1: 34%, 2: 11%, 3: 3%, 4: 1%, 5+: 1%.]
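As a small illustration of what "variable state" in a log message means, the Java sketch below contrasts a message with no variables against one that records the values in scope. Everything here is hypothetical: the class VariableStateDemo, the loan-collection scenario and the field names loanId, attempt and amountPence are invented for the example.

```java
import java.util.Locale;
import java.util.logging.Logger;

// Hypothetical example class; the scenario and field names are invented.
final class VariableStateDemo {
    private static final Logger LOG =
            Logger.getLogger(VariableStateDemo.class.getName());

    // A message with no variable state: it says that something failed, but not for whom.
    static String withoutState() {
        return "Failed to collect payment";
    }

    // The same message carrying the variables that were in scope at the time.
    static String withState(String loanId, int attempt, long amountPence) {
        return String.format(Locale.ROOT,
                "Failed to collect payment: loanId=%s attempt=%d amountPence=%d",
                loanId, attempt, amountPence);
    }

    public static void main(String[] args) {
        LOG.warning(withoutState());
        LOG.warning(withState("L-1001", 3, 12500L));
    }
}
```

The second message costs one extra line to write, yet it is the only one of the two that lets someone reading the log identify the affected loan and retry.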
Sources: https://blog.takipi.com/5-ways-developers-waste-more-than-20-of-their-work-week/
How do we handle exceptions?
4. TOO MUCH TIME PRODUCTION DEBUGGING
25% of developer time spent fixing bugs.
WE’VE ALL BEEN THERE!
QUICK RECAP
Despite best efforts, bugs can and will happen.
Your chosen IDE provides the power to debug issues effectively.
Production, on the other hand, is typically locked down.
Exception logging strategies are often lacking.
IMAGINE THIS VISIBILITY IN PRODUCTION…
WITH THIS LEVEL OF CATEGORISATION…
WANT TO KNOW MORE?
• www.applicationperformance.com/overops
• blog.applicationperformance.com/topic/overops
WEBINAR 16th MAY:
How to Accelerate Delivery of Reliable Software
https://land.overops.com/webinar-java-jedi-powers/
COME SEE US ON STAND B4
Ask about our FREE eBook and Whitepaper

Devoxx - Bug Wars: Episode IV - A New Hope


Editor's Notes

  • #4 Hello, and welcome to our “Bug Wars” presentation. I’m Ben Cripps, and to my SIDE is my colleague Martin. In this presentation, I’m going to look at bugs from a managerial view, be it project manager or development manager, while Martin is going to discuss bugs from a pizza-loving developer’s view.
  • #5 Have you ever wondered how much time is spent fixing bugs? As a manager, that has to be one of my worries: bugs can bring heat on me and the team, and can burn through a lot of time. According to Tricentis, in 2017 we spent close to 269 years fixing bugs. 269. In that time alone, Tesla’s roadster would have made 215 round-trips to Mars. Now imagine what you could have created with that amount of time. Maybe some new functionality? Exciting new development projects? Cutting-edge proofs of concept? Maybe self-investment: training, development practices, methodologies etc. It certainly makes me think about my own missed opportunities. Sources: Tricentis Software Fail Watch (2018) - https://www.tricentis.com/blog/2018/02/01/how-to-avoid-the-tricentis-software-fail-watch/; Mars One - https://www.mars-one.com/faq/mission-to-mars/how-long-does-it-take-to-travel-to-mars
  • #6 Other than time, what other costs do bugs have? The obvious one is financial. Let’s imagine there is a loan company that found a bug that resulted in 60% of their loans not being collected on time. It already sounds quite serious, right?
  • #7 Well, this happened in the real world, and the bug was introduced as part of a department overhaul. It caused a lot of financial damage. Firstly, for 2017 it cost Provident Financial £120m in lost profit. But it also hit their market value, which dropped a whopping £1.7bn in one day. Total cost was £1.8bn, plus the chief exec’s resignation. But there is more than just a financial and time cost. There is the potential brand or client damage; maybe internal trust will be lost in you or the rest of your development team. Sub-prime loan firm Provident Financial: chief exec Peter Crook resigned as the company's share price tumbled from £17.42 to just £4.50.
  • #8 Will it ever happen again?
  • #9 It has, and sadly it will continue to do so. And the later you find a bug in the SDLC, the more it will cost in time, reputation and money. TSB have been suffering recently too, migrating to their own new platform. Intensive national news coverage, with public social media posts on Twitter, Facebook etc. Days of outages which will no doubt cost millions of pounds in damages, compensation and most likely lost customers! Displaying on-screen errors to the customer: java.lang.NullPointerException, BeanCreationNotAllowedException and more. All on public display, stoked by news outlets and the power of social media. Not only that, TSB bank managers are reporting staff are close to collapse. 1.9m digital banking customers without access to their account. 40,000 complaints, 13x their usual levels. An interview with the TSB CEO about resolution time introduced doubt over the technical team's capability to resolve the issue by their predicted time. A soul-destroying move for the already under-pressure, under-fire department, and it destroys team morale and inter-company trust. Bridges were burnt! Immense company, inter-departmental and development pressure. Sources: https://news.sky.com/story/tsb-boss-in-switching-row-at-mps-hearing-over-it-meltdown-11356085
  • #10 This picture is of me drinking a hard-earnt glass of champagne with our client and my CEO, convincing them all is well with the new solution we’ve just released. Meanwhile, next door the QA team are finding new bugs, pushing them down into the basement for Martin and the team to deal with ASAP. No matter your best efforts, some form of bug will always slip through the net. So how do you capture and log exceptions? More importantly, how do you analyse and prioritise these? So, what is the magic answer to bugs … in one word, “visibility”. But as a development team, how can we achieve that? Here’s Martin with the answers!
  • #11 The IDE experience while developing – everything is to hand: editor (syntax colour highlighting), compiler, debugger, unit tests, code analysers and other tools.
  • #12 Production environment is usually locked down. Developers cannot see under the covers. Real users use the application in ways that were not anticipated.
  • #14 20% of errors never make it to the logs in production. Research shows that 20% of catch blocks are empty! That means swallowing the exception without any trace. These are the silent killers of Java applications. Research by OverOps into half a million Java projects on GitHub showed three groups: (1) documenting what happened, through logging, printing a stack trace or printing out information to the console; (2) rethrowing an exception, probably a wider abstraction that one of the methods further up the call stack would know how to handle; and (3) nothing: an empty block, simply swallowing the exception without any trace. It looks like the empty block happens at least as often as logging. Source: OverOps, https://blog.takipi.com/swallowed-exceptions-the-silent-killer-of-java-applications/
  • #15 Nearly two-thirds of logging statements are turned off in production: INFO and DEBUG logging accounts for 57.8% of logging statements and these are turned off in production. And in pre-prod environments where these are turned on, you will probably find you can't reproduce the issue. Source: OverOps, https://blog.takipi.com/swallowed-exceptions-the-silent-killer-of-java-applications/
  • #16 More than 50% of logging statements don’t include ANY information about the variable state at the time of an error: 52.2% of log messages don't include any variables and 33.5% include only 1 variable. Given how many variables are likely to be in scope and could have been responsible for the error, chances are the logs won't have the detail you need. Overall, logs provide useful information in only 13% of cases. Source: OverOps, https://blog.takipi.com/swallowed-exceptions-the-silent-killer-of-java-applications/
  • #17 We looked into the most time-consuming tasks for developers. It turns out, more than 25% of their time, on average, is spent troubleshooting.
  • #20 Imagine a tool that helps you visualise and understand why exceptions are occurring. The good news is that it exists now. Similar in concept to a UNIX core file. Exception details including type, frequency, deployment, stack trace, source code and parameter values.
  • #21 Management view. Categorisation, filtering, prioritisation, when introduced, re-introduced etc. Assign errors to the correct team via e-mail alerts or integrations with ticketing systems, such as JIRA. Even add a gatekeeper to your CI/CD pipeline, e.g. Jenkins.
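
The idea in notes #20 and #21 of grouping exceptions by type and frequency can be sketched in plain Java. To be clear, this is not how OverOps or any other product works; it is a toy tally, and the class ExceptionTally and its methods are hypothetical names chosen for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical example class: a toy in-memory tally, not a monitoring product.
final class ExceptionTally {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    // Category key: exception type plus the method at the top of the stack trace.
    static String keyFor(Throwable t) {
        StackTraceElement[] frames = t.getStackTrace();
        String where = frames.length > 0
                ? frames[0].getClassName() + "." + frames[0].getMethodName()
                : "unknown";
        return t.getClass().getName() + " at " + where;
    }

    // Records one occurrence of the exception under its category.
    void record(Throwable t) {
        counts.computeIfAbsent(keyFor(t), k -> new LongAdder()).increment();
    }

    // Number of occurrences seen so far for a category key.
    long countFor(String key) {
        LongAdder adder = counts.get(key);
        return adder == null ? 0 : adder.sum();
    }

    // Number of distinct categories seen so far.
    int distinctCategories() {
        return counts.size();
    }

    public static void main(String[] args) {
        ExceptionTally tally = new ExceptionTally();
        for (String raw : new String[] {"1", "x", "y"}) {
            try {
                Integer.parseInt(raw);
            } catch (NumberFormatException e) {
                tally.record(e); // two of the three inputs end up here
            }
        }
        tally.counts.forEach((key, n) -> System.out.println(n.sum() + " x " + key));
    }
}
```

Even a tally this simple turns a wall of stack traces into a short list ordered by frequency, which is the kind of view a manager needs in order to prioritise and assign fixes.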