
Plumbr case study


  1. TECHNICAL OBSTACLES WHEN BUILDING PLUMBR
       • Nikita Salnikov-Tarnovski
       • Monday, April 1, 13
  2. AGENDA
       • Who we were and who we are
       • Object lifecycle with little overhead
       • Graph analysis in low memory
       • The problem of quitting
  3. OUR BACKGROUND
       • 2 developers: Nikita Salnikov-Tarnovski, @iNikem, and Vladimir Šor, @vovencij
       • 10+ years in custom software house Nortal
       • Mostly Java EE development
       • Web sites, backend systems, batch processes
  4. NEW PROBLEM
       • Memory leaks
       • 130,000 monthly searches for OutOfMemoryError in Google
       • 20,000 monthly unique visitors on our site http://plumbr.eu
       • 400 monthly downloads
       • 1700+ leaks discovered
  5. PLUMBR
       • Automated performance consultant
       • Giving you the exact location of the leak with enough information to fix it
       • The foundation is based on machine learning trained on 500,000 memory snapshots
       • From 3,000 different applications
       • Finding 88% of the existing leaks
       • Quality only going up with the additional data gathered each day
  6. PLUMBR AGENT...
       • JVM TI agents
       • Both Java and native, OS-specific
       • Welcome, malloc and free!
       • JNI code for communication between them
  7. ... WATCHES YOU
       • We monitor object creation and disposal
       • On-the-fly bytecode instrumentation
       • Hooks into GC events
  8. OBJECT MONITORING I
       • Java agent registers a java.lang.instrument.ClassFileTransformer
       • Modifies bytecode as classes are loaded
       • Using the ASM library
       • To capture all newly created objects
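The registration step on this slide can be sketched with the JDK's own `java.lang.instrument` API. This is a minimal illustration, not Plumbr's code: the class name `AllocationTrackingAgent`, the `isInstrumentable` filter and the `candidates` counter are hypothetical, and the actual bytecode rewriting (done with ASM according to the slide) is left as a comment.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

final class AllocationTrackingAgent {

    /** Hypothetical filter: skip core classes the agent should leave alone. */
    static boolean isInstrumentable(String internalName) {
        return internalName != null
                && !internalName.startsWith("java/")
                && !internalName.startsWith("sun/");
    }

    static final class Transformer implements ClassFileTransformer {
        /** Counts classes that would have been rewritten (illustration only). */
        final AtomicInteger candidates = new AtomicInteger();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (!isInstrumentable(className)) {
                return null; // null tells the JVM to keep the class unchanged
            }
            candidates.incrementAndGet();
            // A real agent would rewrite classfileBuffer here (for example with
            // ASM) and return the modified bytes. This sketch changes nothing.
            return null;
        }
    }

    /** Called by the JVM before main() when started with -javaagent:<jar>. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new Transformer());
    }
}
```

For the JVM to call `premain`, the agent jar's manifest has to name this class in its `Premain-Class` attribute.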
  9. PROBLEMS
       • Different compilers produce slightly different bytecode
       • Some classes are too fragile or broken already
       • new and chain of <init>
       • Clone, deserialization, reflection
  10. OBJECT MONITORING II
       • We keep some data about each live object
       • Creating and associating that data takes time
       • On every object creation!
  11. OBJECT MONITORING II
       • If you cannot do in-process, do it off-process
  12. PROBLEMS
       • BlockingQueues are slow
       • Locks are slow
       • Atomic* are slow!
       • No existing library
       • Even Disruptor doesn’t suit
       • We’ve written a no-guarantee-lock-free-many-producers-one-consumer buffer
       • Concurrent programming IS hard
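The slides do not show the buffer itself. The following is only a guess at what "no-guarantee" could mean, sketched as a ring buffer whose producers write with no locks and no atomics, so that concurrent writes may overwrite each other and the consumer may see events late or not at all. The name `LossyEventBuffer` and the use of `int` event ids are assumptions made for illustration.

```java
import java.util.function.IntConsumer;

final class LossyEventBuffer {
    private final int[] slots;
    private final int mask;
    private int writeIndex; // plain field on purpose: racy, updates may be lost
    private int readIndex;  // touched only by the single consumer

    LossyEventBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new int[capacity];
        mask = capacity - 1;
    }

    /** Producer side: no lock, no CAS. Two racing producers can lose an event. */
    void offer(int event) {
        int i = writeIndex;
        slots[i & mask] = event;
        writeIndex = i + 1;
    }

    /** Consumer side: hands over what is visible, skipping overwritten slots. */
    int drain(IntConsumer sink) {
        int end = writeIndex;
        int start = readIndex;
        if (end - start > slots.length) {
            start = end - slots.length; // older events were overwritten
        }
        int delivered = 0;
        for (int i = start; i != end; i++) {
            sink.accept(slots[i & mask]);
            delivered++;
        }
        readIndex = end;
        return delivered;
    }
}
```

The trade-off is deliberate: the hot path on every object creation costs two plain memory writes, at the price of tolerating dropped or delayed events. Code that cannot tolerate loss would need volatile or atomic accesses, which the slide lists as too slow for this use.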
  13. MORE PROBLEMS
       • Have to store all that object-related data somewhere
       • Java Collections are too fat
       • No lock-free thread-safe reading
       • We use Trove to save memory
       • Hand-written clone with dirty check
       • Testing persistent immutable data structures
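To illustrate why primitive collections are leaner than `java.util.HashMap<Long, Integer>`, here is a minimal open-addressing map from `long` keys to `int` values. It is not Trove's API and not Plumbr's code. `LongIntMap` is a hypothetical name, and the sketch supports only `put`, `get` and `size`. Each entry costs one `long`, one `int` and one `boolean`, with no boxed objects and no per-entry node.

```java
final class LongIntMap {
    private long[] keys;
    private int[] values;
    private boolean[] used;
    private int size;

    LongIntMap(int expectedSize) {
        int capacity = 16;
        while (capacity < expectedSize * 2) {
            capacity <<= 1; // keep capacity a power of two
        }
        keys = new long[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    private static int slot(long key, int mask) {
        long h = key * 0x9E3779B97F4A7C15L; // multiplicative hash to spread bits
        return (int) (h >>> 32) & mask;
    }

    void put(long key, int value) {
        if ((size + 1) * 2 > keys.length) {
            grow(); // stay at or below 50% load so probing always terminates
        }
        int mask = keys.length - 1;
        int i = slot(key, mask);
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask; // linear probing
        }
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    int get(long key, int missing) {
        int mask = keys.length - 1;
        int i = slot(key, mask);
        while (used[i]) {
            if (keys[i] == key) {
                return values[i];
            }
            i = (i + 1) & mask;
        }
        return missing;
    }

    int size() {
        return size;
    }

    private void grow() {
        long[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        int capacity = oldKeys.length * 2;
        keys = new long[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                put(oldKeys[i], oldValues[i]);
            }
        }
    }
}
```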
  14. LEAK HUNTING
       • When leaks are detected, we need to find out who is holding them
       • Paths to GC roots
       • While the application is still running
  15. PROBLEMS
       • Java objects have no incoming refs
       • You can walk the heap in C code
       • But that stops the world
       • Standard heap dump loses information
       • So we make a custom heap dump
       • And traverse the reference graph on it
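Because an object only knows what it points to, finding out who holds it means inverting the reference graph first. A minimal sketch of that inversion follows, assuming the dump has already been reduced to integer object ids and two parallel arrays of edges. The format of Plumbr's custom dump is not described in the slides, so `IncomingRefs`, `from` and `to` are hypothetical names.

```java
import java.util.Arrays;

final class IncomingRefs {
    /** Referrers of object v are sources[offsets[v]] .. sources[offsets[v + 1] - 1]. */
    final int[] offsets;
    final int[] sources;

    /** Edge e means: object from[e] holds a reference to object to[e]. */
    IncomingRefs(int objectCount, int[] from, int[] to) {
        offsets = new int[objectCount + 1];
        for (int target : to) {
            offsets[target + 1]++;            // count incoming edges per object
        }
        for (int v = 0; v < objectCount; v++) {
            offsets[v + 1] += offsets[v];     // prefix sums give each object's range
        }
        sources = new int[to.length];
        int[] next = Arrays.copyOf(offsets, objectCount);
        for (int e = 0; e < to.length; e++) {
            sources[next[to[e]]++] = from[e]; // place each referrer in its range
        }
    }

    int[] referrersOf(int object) {
        return Arrays.copyOfRange(sources, offsets[object], offsets[object + 1]);
    }
}
```

Two flat `int` arrays hold the whole inverted graph, which matters when the analysis has to share a heap with the running application.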
  16. STILL PROBLEMS
       • We’ve tried many graph traversal libraries
       • And NoSQL solutions
       • All of them work, somewhat
       • If you give them gigs of memory
       • But we have to do this on-site, while the application is still running
       • We needed a memory-sensitive solution
  17. ONE MORE BICYCLE
       • We’ve written our own specialized version of Dijkstra path searching
       • Again, we had to replace many Java Collections with more memory-efficient implementations
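The slides do not describe how the Dijkstra variant was specialized. As a stand-in, the sketch below assumes every reference has the same weight, in which case Dijkstra's algorithm reduces to a breadth-first search, and it uses plain `int` arrays instead of Java Collections. The names `PathToGcRoot`, `offsets`, `sources` and `isRoot` are hypothetical; the incoming references are assumed to be stored so that the holders of object `v` are `sources[offsets[v]]` up to `sources[offsets[v + 1] - 1]`.

```java
import java.util.Arrays;

final class PathToGcRoot {

    /**
     * Returns the shortest chain of holders from the nearest GC root down to
     * `start` (root first, `start` last), or an empty array if no root holds it.
     */
    static int[] find(int[] offsets, int[] sources, boolean[] isRoot, int start) {
        int n = isRoot.length;
        int[] parent = new int[n];  // parent[v] = object v was reached from
        Arrays.fill(parent, -1);    // -1 marks "not visited yet"
        int[] queue = new int[n];   // each object is enqueued at most once
        int head = 0;
        int tail = 0;
        parent[start] = start;
        queue[tail++] = start;
        while (head < tail) {
            int v = queue[head++];
            if (isRoot[v]) {
                return buildPath(parent, start, v);
            }
            for (int e = offsets[v]; e < offsets[v + 1]; e++) {
                int holder = sources[e];
                if (parent[holder] == -1) {
                    parent[holder] = v;
                    queue[tail++] = holder;
                }
            }
        }
        return new int[0];
    }

    private static int[] buildPath(int[] parent, int start, int root) {
        int length = 1;
        for (int v = root; v != start; v = parent[v]) {
            length++;
        }
        int[] path = new int[length];
        int v = root;
        for (int i = 0; i < length; i++) {
            path[i] = v;
            v = parent[v];
        }
        return path;
    }
}
```

Working memory is two `int` arrays sized by the number of objects, regardless of how many references the graph contains.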
  18. TIME TO DIE
       • Plumbr runs inside the JVM alongside an application
       • It isn’t the main actor, just a supporter
       • So Plumbr must be ready to quit whenever the main application wishes
  19. WHEN JVM QUITS
       • It turns out the JVM is quite survivable
       • No shutdown notification or something like that
       • It just quits when there are no more non-daemon threads
       • And some threads live for far too long
  20. PROBLEMS
       • Plumbr’s own threads
       • Threads from libraries that Plumbr uses
       • ExecutorService with daemon thread factory
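The last bullet can be sketched with the standard `java.util.concurrent` API: an executor whose worker threads are daemons does not keep the JVM alive after the application's own threads finish. The class name `DaemonThreads` and the thread-name prefix are illustrative choices, not taken from the slides.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class DaemonThreads {

    /** Creates threads named prefix-1, prefix-2, ... and marks them as daemons. */
    static ThreadFactory factory(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread thread = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            thread.setDaemon(true); // daemon threads never block JVM exit
            return thread;
        };
    }

    /** An executor whose single worker will not prevent the JVM from quitting. */
    static ExecutorService newSingleThreadExecutor(String prefix) {
        return Executors.newSingleThreadExecutor(factory(prefix));
    }
}
```

This covers only executors the agent creates itself. Threads started inside third-party libraries need the library to accept a `ThreadFactory`, or some other form of cleanup.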
  21. PROBLEMS
       • RMI Reaper Thread
       • Keeps the JVM alive as long as some JMX resources are in use
       • We must clean up after ourselves: MBeans, JMX connections, JMX servers
       • But when???
       • Implemented our own monitor thread with some heuristics
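The cleanup itself, as opposed to deciding when to run it, can be sketched with the standard `javax.management` API. The sketch covers only the MBean part. `JmxCleanup`, `AgentStats` and the object name passed in are hypothetical, and the monitor-thread heuristics mentioned on the slide are not shown because the slides do not describe them.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxCleanup {

    /** Standard MBean interface: must be public and named <Class>MBean. */
    public interface AgentStatsMBean {
        long getTrackedObjects();
    }

    /** Placeholder MBean exposing one read-only attribute. */
    public static final class AgentStats implements AgentStatsMBean {
        @Override
        public long getTrackedObjects() {
            return 0L;
        }
    }

    /** Registers the MBean and returns the name needed to remove it later. */
    static ObjectName register(String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName objectName = new ObjectName(name);
            server.registerMBean(new AgentStats(), objectName);
            return objectName;
        } catch (JMException e) {
            throw new IllegalStateException("could not register " + name, e);
        }
    }

    /** Safe to call more than once: does nothing if already unregistered. */
    static void unregister(ObjectName name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException("could not unregister " + name, e);
        }
    }
}
```

Keeping the returned `ObjectName` for everything the agent registers makes it possible to undo all registrations in one pass once the agent decides the application is shutting down.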
  22. PROBLEMS
       • Earlier versions used some Swing components, e.g. a systray icon
       • And the JVM will not quit while any Swing component is still displayable
       • We should dispose of them before quitting
       • Again, when???
  23. CONCLUSION
       • Don’t spend all your time writing web components or web-services or Swing
       • There is more to Java than that
       • There are many Java libraries but not enough
