Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the
expected performance of the company. We caution you that such statements reflect our current expectations and
estimates based on factors currently known to us and that actual events or results could differ materially. For important
factors that may cause actual results to differ from those contained in our forward-looking statements, please review
our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time
and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or
accurate information. We do not assume any obligation to update any forward-looking statements we may make. In
addition, any information about our roadmap outlines our general product direction and is subject to change at any
time without notice. It is for informational purposes only and shall not be incorporated into any contract or other
commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include
any such feature or functionality in a future release.
From Middle Earth
Make Splunk Apps & Add-ons
JVM background
JVM = Java Virtual Machine
First appeared in the early 90’s
Now the dominant runtime for enterprise
applications
Application Servers Enterprise Service Buses Databases
NoSQL Distributed Big Data Web Servers
Directory Servers Search Engines Build Systems
Gaming Platforms Trading Systems Reservation Systems
Core Banking Messaging Infrastructure Proprietary Systems
Many JVM Variants
Oracle HotSpot, OpenJDK: more or less the same. OpenJDK is the reference implementation for Java 7/8
Oracle JRockit: getting merged into HotSpot
IBM J9, Azul Zing: commercial alternatives
JVMs are here to stay
It’s not just about the “J”
Big Data frameworks
You should care about getting insights into your
JVM data
Standard Log Files
Oct 21, 2013 4:42:15 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 1153 ms
Oct 21, 2013 4:42:15 PM org.apache.catalina.core.StandardService startInternal
INFO: Starting service Catalina
Application logs that are part of the product
Developer logs for any code that was deployed
Written to local disk or network storage
Garbage Collector Logs
54.736: [Full GC 54.737:
[Tenured: 172798K->18092K(174784K), 2.3792658 secs] 257598K->18092K(259584K),
[Perm : 20476K->20476K(20480K)], 2.4715398 secs] [Times: user=0.56 sys=0.05, real=0.07 secs]
Generated by way of arguments passed to the JVM at startup
-verbose:gc
-Xloggc:/home/damien/jvm_logs/gc.log
-XX:+PrintGC
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
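Each generation entry in a GC log line carries the occupancy before and after the collection, the capacity and the pause time. The following is a minimal sketch of pulling those figures out, assuming the log layout shown in the sample above. The class GcLogSketch and its members are hypothetical names used for illustration only, and the pattern covers entries shaped like the Tenured one.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts one generation's figures from a GC log line. */
final class GcLogSketch {

    /** Occupancy before/after the collection and capacity (all in KB), plus pause time in seconds. */
    static final class GenerationStats {
        final String generation;
        final long beforeKb;
        final long afterKb;
        final long capacityKb;
        final double seconds;

        GenerationStats(String generation, long beforeKb, long afterKb, long capacityKb, double seconds) {
            this.generation = generation;
            this.beforeKb = beforeKb;
            this.afterKb = afterKb;
            this.capacityKb = capacityKb;
            this.seconds = seconds;
        }

        /** Memory reclaimed by this collection, in KB. */
        long freedKb() {
            return beforeKb - afterKb;
        }
    }

    // Matches entries shaped like "[Tenured: 172798K->18092K(174784K), 2.3792658 secs]".
    private static final Pattern GENERATION = Pattern.compile(
            "\\[(\\w+)\\s*: (\\d+)K->(\\d+)K\\((\\d+)K\\), ([\\d.]+) secs\\]");

    /** Returns the first matching generation entry on the line, or empty if there is none. */
    static Optional<GenerationStats> parse(String line) {
        Matcher m = GENERATION.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new GenerationStats(
                m.group(1),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Double.parseDouble(m.group(5))));
    }

    public static void main(String[] args) {
        parse("[Tenured: 172798K->18092K(174784K), 2.3792658 secs] 257598K->18092K(259584K),")
                .ifPresent(s -> System.out.println(
                        "generation=" + s.generation + " freed_kb=" + s.freedKb() + " pause_secs=" + s.seconds));
    }
}
```

If the logs are indexed in Splunk, the same extraction would more naturally be done with field extractions at search time; the Java version is only meant to show what the numbers in each entry mean.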
SplunkJavaLogging
Sometimes you can’t write to a file or deploy a UF (Universal Forwarder)
Appenders for Java Util Logging, Log4J, Logback
Simply add a logging appender to your logging configuration file
Splunk SDK for Java
Use the SDK from any JVM language: Java, Groovy, Scala, etc.
Send log events via REST, UDP or TCP directly to Splunk from your code
Spring Integration adapters available on GitHub
SDK available from dev.splunk.com
Go horizontal to achieve real scale
Indexer Cluster
UFs with JMX Mod Input
Monitored JVMs
In my experience, the data sources shown so far will be
thorough enough for most of your JVM monitoring use cases.
But what if you want to go deeper?
SplunkJavaAgent
Dynamic Byte Code Injection Agent for JVMs
No need to change any code to get APM level insights
This is all you pass to the JVM at startup:
-javaagent:splunkagent.jar
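For context on what a -javaagent jar does, here is a minimal sketch of the standard java.lang.instrument hook that such an agent plugs into. It is not the SplunkJavaAgent implementation, which this presentation does not show. The class names LoadLoggingAgent and LoadRecorder are hypothetical, and the transformer only records class names rather than injecting any bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/**
 * Hypothetical agent skeleton. When packaged as a real agent, declare the class public in its
 * own file and name it in the Premain-Class attribute of the jar manifest.
 */
final class LoadLoggingAgent {

    /** Names of the classes the JVM has offered to the transformer so far. */
    static final Queue<String> LOADED = new ConcurrentLinkedQueue<>();

    /** Sees every class as it is loaded; a real agent would rewrite the bytes here. */
    static final class LoadRecorder implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (className != null) {
                // The JVM passes internal names such as "java/lang/String".
                LOADED.add(className.replace('/', '.'));
            }
            return null; // null tells the JVM to keep the original bytecode
        }
    }

    /** Called by the JVM before the application's main method when -javaagent is used. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoadRecorder());
    }

    public static void main(String[] args) {
        // Direct call for demonstration only; normally the JVM invokes the transformer.
        new LoadRecorder().transform(null, "com/example/Foo", null, null, new byte[0]);
        System.out.println(LOADED);
    }
}
```

The point of the design is that the monitored application needs no code changes: the JVM hands each class to the transformer at load time, which is where method entry and exit tracing could be woven in.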
What raw metrics can you get?
Bytecode Injection
• class loading
• method execution
• method timings
• method call stack
• caught & uncaught exceptions
JMX
• running in local in-memory mode
Binary HPROF memory dumps decoded into text
• no more taking static HPROF dumps and then loading them into a separate memory analysis tool
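As an illustration of reading JMX data in local in-memory mode, the sketch below queries the platform MXBeans from inside the running JVM, with no remote connector involved. The class name LocalJmxSketch and the key names in the output are assumptions made for this example, not the agent's actual output format.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical example: reads a few platform MXBean attributes from within the same JVM. */
final class LocalJmxSketch {

    /** Returns heap and thread figures as space-separated key=value pairs. */
    static String snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        return "heap_used=" + heap.getUsed()
                + " heap_committed=" + heap.getCommitted()
                + " thread_count=" + threads
                + " uptime_ms=" + uptimeMs;
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```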
The JVM executes as a process
Let’s see what the host operating system can tell us: top
External programs that provide additional JVM insights: jstat
Index this data in Splunk and correlate
Opportunity
Reuse some of the data collection approaches in this presentation
and build out a specific JVM app
Publish it on apps.splunk.com
Make some money
Generate publicity for yourself / your company
Generate sales leads
Sell more Splunk licenses if you are also a partner
Do it because you love community collaboration
Become my beer buddy for life
Some JVM App Ideas
Splunk for JBoss
Splunk for Tomcat
Splunk for Solr
Splunk for Cassandra
Splunk for “Apache Project”
I am here to help you
From Auckland
Dev evangelist, ex-customer
5th Conf
Make apps, cut code
Thorough enterprise Java background in many verticals; enterprise and non-enterprise Java and other JVM languages
Everything is 100% open source: use, reuse, whatever.
Collaborate
Community
answers.splunk.com is the best place for support
The JVM is a runtime for code that is compiled to bytecode rather than to native code. The bytecode can be JIT-compiled at run time, so performance is good now.
Circa 1991, Dr. James Gosling at Sun started developing a technology for next generation smart devices/appliances
“Green” became “Oak” which became “Java”
Java 1.0 first appeared in January 1996.
The JVM is a virtual machine that runs programs that are compiled into Java bytecode
Available for many hardware and software platforms
17 years later, the JVM has evolved from a consumer device technology, to a browser-oriented technology with the explosion of the web, to now being deeply rooted in the enterprise software landscape on the server side and in the cloud
Overcame some limitations of C/C++
Language changes: generics, annotations, lambdas, streams, default methods
GC, JIT, bytecode, write once run anywhere
Ask crowd about JVM experience:
Devs
Operations
Have no idea but might after this preso
JRockit and HotSpot code are being integrated
OpenJDK is closely linked to HotSpot; Oracle JDK has a few other proprietary add-ons on top. OpenJDK is the Java reference implementation
Zing is commercial
Groovy, Scala, Clojure, JS: the JVM is evolving with the changing landscape of enterprise software
From app servers in the 00’s to Big Data frameworks today
Hadoop, Cassandra, 1000’s of nodes
The JVM footprint cross cuts the data centre and represents a massive source of valuable machine data
Large scale Application/Web Server clusters
Hadoop & Cassandra Node topologies in the 100’s and in some cases 1000’s
Segue into JVM data
Monitor with a UF = best practice
JVM generated logs by way of options
Try to use a UF if at all possible
Utility classes for formatting log events
Configurable in memory buffer to handle network outages
TCP and REST Appenders
SplunkLogEvent will log in best-practice semantic format
What if you can’t use logging appenders etc.?
Well, then you have to get down and dirty with your code!
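To make two of the notes above concrete (semantic key=value formatting and an in-memory buffer for network outages), here is a minimal sketch. It is not the SplunkJavaLogging code: the names LogEventSketch, format and BoundedBuffer are hypothetical, and the drop-oldest policy when the buffer is full is an assumption chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helpers for key=value log events and outage buffering. */
final class LogEventSketch {

    /** Renders fields as space-separated key="value" pairs, in the map's iteration order. */
    static String format(Map<String, ?> fields) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Map.Entry<String, ?> e : fields.entrySet()) {
            // Escape embedded double quotes so each value stays a single token.
            String value = String.valueOf(e.getValue()).replace("\"", "\\\"");
            joiner.add(e.getKey() + "=\"" + value + "\"");
        }
        return joiner.toString();
    }

    /** Holds at most 'capacity' events while the network is down; the oldest event is dropped first. */
    static final class BoundedBuffer {
        private final Deque<String> events = new ArrayDeque<>();
        private final int capacity;

        BoundedBuffer(int capacity) {
            if (capacity <= 0) {
                throw new IllegalArgumentException("capacity must be positive");
            }
            this.capacity = capacity;
        }

        synchronized void add(String event) {
            if (events.size() == capacity) {
                events.removeFirst(); // assumed policy: sacrifice the oldest event
            }
            events.addLast(event);
        }

        /** Returns the buffered events in arrival order and empties the buffer. */
        synchronized List<String> drain() {
            List<String> out = new ArrayList<>(events);
            events.clear();
            return out;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("name", "method_entered");
        fields.put("className", "com.example.Foo");
        BoundedBuffer buffer = new BoundedBuffer(100);
        buffer.add(format(fields));
        buffer.drain().forEach(System.out::println);
    }
}
```

A sender would call drain() once the connection is back and forward the returned events in order.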
Majority of useful data for monitoring will come from this layer
Open and easily extensible
Developers can simply create new MBeans
Vendor products (JBoss, Cassandra, Hadoop, etc.) ship with thorough MBean coverage, not MIBs
JConsole to browse MBeans
Enable your target JVM’s remote JMX interface, then test connectivity with JConsole
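On the note that developers can simply create new MBeans: the sketch below exposes one application counter as a standard MBean and reads it back through the platform MBean server, which is the same path JConsole or a JMX collector would use. The names CustomMBeanSketch, RequestCounter and the com.example.sketch domain are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical example of exposing an application metric as a standard MBean. */
final class CustomMBeanSketch {

    /** Standard MBean convention: a public interface named after the implementation class plus "MBean". */
    public interface RequestCounterMBean {
        long getRequestCount(); // surfaces as the read-only attribute "RequestCount"
    }

    public static final class RequestCounter implements RequestCounterMBean {
        private final AtomicLong count = new AtomicLong();

        public void increment() {
            count.incrementAndGet();
        }

        @Override
        public long getRequestCount() {
            return count.get();
        }
    }

    /** Registers the counter under the given name (if that name is free) and reads the attribute via JMX. */
    static long registerAndRead(RequestCounter counter, String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(objectName);
            if (!server.isRegistered(name)) {
                server.registerMBean(counter, name);
            }
            return (Long) server.getAttribute(name, "RequestCount");
        } catch (JMException e) {
            throw new IllegalStateException("JMX registration or lookup failed", e);
        }
    }

    public static void main(String[] args) {
        RequestCounter counter = new RequestCounter();
        counter.increment();
        System.out.println("request_count="
                + registerAndRead(counter, "com.example.sketch:type=RequestCounter"));
    }
}
```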
Splunkcowboy.com for demo
As many config files as you want
Can make output CIM compliant
Also MX4J connectivity
Run a Splunk UF locally with the target JVM. It can connect using the remote JMX interface or direct process attachment.
Each tier scales out horizontally.
Can overcome firewall issues that are sometimes inherent with Java RMI
Deploy Splunk for JMX components and configurations with Splunk Deployment Server, Puppet or Chef.
The JVM is not the only possible bottleneck; stdin/stdout and the Splunk indexing pipeline can be too
AppDynamics has an integration
HPROF:
Binary JVM dumps that allow for deeper JVM resource inspection
Typical use case is diagnosing memory issues after JVM crashes with java.lang.OutOfMemoryError
sourcetype=splunk_java_agent name=method* |
transaction appName,appID,threadID,className,methodName startswith=method_entered endswith=method_exited |
stats
min(duration) as "Min Execution Time"
max(duration) as "Max Execution Time"
avg(duration) as "Avg Execution Time"
count as "Total Calls"
sum(duration) as "Cumulative Execution Time"
by className,methodName
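To spell out what the stats clause in the search above computes, here is a small Java sketch of the same per-method aggregation. It assumes each call's duration has already been derived (the job the transaction command does by pairing method_entered with method_exited). MethodTimingSketch and Call are hypothetical names.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of grouping call durations by class and method. */
final class MethodTimingSketch {

    /** One completed method call with its measured duration. */
    static final class Call {
        final String className;
        final String methodName;
        final double duration;

        Call(String className, String methodName, double duration) {
            this.className = className;
            this.methodName = methodName;
            this.duration = duration;
        }
    }

    /** Returns min, max, average, count and sum of durations keyed by "className.methodName". */
    static Map<String, DoubleSummaryStatistics> summarize(List<Call> calls) {
        Map<String, DoubleSummaryStatistics> byMethod = new TreeMap<>();
        for (Call c : calls) {
            byMethod.computeIfAbsent(c.className + "." + c.methodName, k -> new DoubleSummaryStatistics())
                    .accept(c.duration);
        }
        return byMethod;
    }

    public static void main(String[] args) {
        List<Call> calls = Arrays.asList(
                new Call("Foo", "bar", 1.0),
                new Call("Foo", "bar", 3.0),
                new Call("Foo", "baz", 2.0));
        summarize(calls).forEach((method, s) -> System.out.println(
                method + " min=" + s.getMin() + " max=" + s.getMax() + " avg=" + s.getAverage()
                        + " count=" + s.getCount() + " sum=" + s.getSum()));
    }
}
```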
Correlate this OS data across your JVM and application events; for example, your JVM may have hung because of CPU starvation caused by some other process thrashing
Could even go lower if running virtual and get the VM/Hypervisor data
Created a wrapper script to dynamically get the PID of the Java process
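The wrapper script itself is not shown here. As a stand-in, the sketch below does the PID lookup in Java using the ProcessHandle API (Java 9 and later), which is a different technique from a shell wrapper. JavaPidSketch and its methods are hypothetical names, and matching on an executable called java or java.exe is an assumption that will miss JVMs started through custom launchers.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical helper: finds the PIDs of running JVM processes visible to this user. */
final class JavaPidSketch {

    /** True if the executable path ends in "java" or "java.exe". */
    static boolean looksLikeJava(String command) {
        String path = command.replace('\\', '/');
        String exe = path.substring(path.lastIndexOf('/') + 1).toLowerCase(Locale.ROOT);
        return exe.equals("java") || exe.equals("java.exe");
    }

    /** PIDs of processes whose executable looks like a JVM launcher. */
    static List<Long> findJavaPids() {
        return ProcessHandle.allProcesses()
                // command() is empty when the OS does not expose it for that process
                .filter(p -> p.info().command().map(JavaPidSketch::looksLikeJava).orElse(false))
                .map(ProcessHandle::pid)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Each PID could then be handed to an external tool such as jstat.
        System.out.println("java_pids=" + findJavaPids());
    }
}
```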
Custom handlers for pre-processing and reformatting if you want
Not going into details here; there are other sessions you can attend.