This document provides an agenda and overview for a presentation on application logging best practices. It discusses the reality of growing volumes of event log data from various sources. While event logs currently have some structure, the structure is inconsistent and non-standard. The document argues that with proper interpretation, event logs can provide real business value by offering insights into operations, security, business intelligence, and customer experience. It presents an approach using Splunk to gain intelligence from application logs quickly through late structure binding and schema-less searching versus traditional analytics with early structure binding. The document also discusses liberating application data through purposeful semantic logging with clear key-value pairs.
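The "purposeful semantic logging with clear key-value pairs" idea can be sketched in a few lines. This is a minimal illustration of the pattern, not Splunk's API: the `format_event`/`log_event` helpers and the `orders` logger name are hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("orders")

def format_event(event, **fields):
    # Render one event as explicit key=value pairs so a schema-on-read
    # tool (Splunk included) can extract the fields later with no custom parser.
    pairs = " ".join(f'{k}="{v}"' for k, v in sorted(fields.items()))
    return f'event="{event}" {pairs}'

def log_event(event, **fields):
    # One event per line, late structure binding: the schema lives in the
    # event itself, not in the consumer.
    log.info(format_event(event, **fields))

log_event("order_placed", user="alice", order_id=1234, amount_usd=42.50, status="ok")
```

Because every field is self-describing, a later search can filter on `status="ok"` or sum `amount_usd` without anyone having agreed on a schema up front.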
Best Practices for Building Robust Data Platform with Apache Spark and Delta – Databricks
This talk focuses on the journey of technical challenges, trade-offs, and ground-breaking achievements involved in building performant and scalable pipelines, drawn from experience working with our customers.
Building a data pipeline to ingest data into Hadoop in minutes using Streamse... – Guglielmo Iozzia
Slides from my talk at the Hadoop User Group Ireland meetup on June 13th 2016: building a data pipeline to ingest data from sources of different nature into Hadoop in minutes (and no coding at all) using the Open Source Streamsets Data Collector tool.
This is a run-through at a 200 level of the Microsoft Azure Big Data Analytics for the Cloud data platform based on the Cortana Intelligence Suite offerings.
Cloud Experience: Data-driven Applications Made Simple and Fast – Databricks
A complex real-time data workflow implementation is very challenging. This session will describe the architecture of a data platform that provides a single, secure, high-performance system that can be deployed in a hybrid cloud architecture. We will present how to support simultaneous, consistent, and high-performance access through multiple industry open-source and cloud-compatible standards of streaming, table, TSDB, object, and file APIs. A new serverless technology is also used in the architecture to support dynamic and flexible implementations. The presenter will also outline how the platform was integrated with the Spark ecosystem, including AI and ML tools, to simplify the development process.
Machine-generated data is one of the fastest-growing and most complex areas of big data. It's also one of the most valuable, containing a definitive record of all user transactions, customer behavior, machine behavior, security threats, fraudulent activity and more. Join us as we explore the basics of machine data analysis and highlight techniques to help you turn your organization's machine data into valuable insights. This introductory workshop includes a hands-on (bring your laptop) demonstration of Splunk's technology and covers use cases both inside and outside IT. Learn why more than 13,000 customers in over 110 countries use Splunk to make business, government, and education more efficient, secure, and profitable.
Machine-generated data is one of the fastest-growing and most complex areas of big data. It's also one of the most valuable, containing some of the most important insights: where things went wrong, how to optimize the customer experience, the fingerprints of fraud. Join us as we explore the basics of machine data analysis and highlight techniques to help you turn your organization's machine data into valuable insights—across IT and the business. This introductory workshop includes a hands-on (bring your laptop) demonstration of Splunk's technology and covers use cases both inside and outside IT. Learn why more than 13,000 customers in over 110 countries use Splunk to make their organizations more efficient, secure, and profitable.
To view a recording of this webinar, use the URL below:
http://wso2.com/library/webinars/2016/06/analytics-in-your-enterprise/
Big data spans many fields and brings together technologies like distributed systems, machine learning, statistics and Internet of Things (IoT). It has now become a multi-billion dollar industry with use cases ranging from targeted advertising and fraud detection to product recommendations and market surveys.
Some use cases, such as urban planning, can be slower (done in batch mode), while others, such as the stock market, need results in milliseconds (done in a streaming fashion). Different technologies are used for each case: MapReduce for batch analytics, complex event processing for real-time analytics, and machine learning for predictive analytics. Furthermore, the type of analysis ranges from basic statistics to complicated prediction models.
This webinar will discuss the big data landscape, including:
Concepts, use cases and technologies
Capabilities and applications of the WSO2 analytics platform
WSO2 Data Analytics Server
WSO2 Complex Event Processor
WSO2 Machine Learner
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen... – confluent
In this talk we’ll look at the relationship between three of the most disruptive software engineering paradigms: event sourcing, stream processing and serverless. We’ll debunk some of the myths around event sourcing. We’ll look at the inevitability of event-driven programming in the serverless space, and we’ll see how stream processing links these two concepts together with a single ‘database for events’. As the story unfolds we’ll dive into some use cases, examine the practicalities of each approach (particularly the stateful elements), and finally extrapolate how their future relationship is likely to unfold. Key takeaways include: the different flavors of event sourcing and where their value lies; the difference between stream processing at the application and infrastructure levels; the relationship between stream processors and serverless functions; and the practical limits of storing data in Kafka and stream processors like KSQL.
Presentation by Smart ERP Solutions providing a hands-on deep dive into the PeopleSoft Alert Framework. The Alerts feature, which is a PeopleSoft Enterprise Component, enables you to alert your organization to errors, changes, and stalled transactions. It is a tool that is not limited to developers: if you can write a PeopleSoft Query, you can create an Alert. With alerts, you can scan PeopleSoft tables and receive alerts when exceptions are found. These alerts can include a link to the PeopleSoft page where you can review or correct the issue. In this session, we take a detailed look at how to set up alerts, how to take advantage of some of the different options, and provide real-world examples of how alerts can help you be proactive in your business.
This session takes an in-depth look at:
- Trends in stream processing
- How streaming SQL has become a standard
- The advantages of Streaming SQL
- Ease of development with streaming SQL: Graphical and Streaming SQL query editors
- Business value of streaming SQL and its related tools: Domain-specific UIs
- Scalable deployment of streaming SQL: Distributed processing
Implement a Universal Data Distribution Architecture to Manage All Streaming ... – Timothy Spann
Implement a Universal Data Distribution Architecture to Manage All Streaming Data
Cloudera Partner SkillUp
Tim Spann
Principal Developer Advocate in Data In Motion for Cloudera
tspann@cloudera.com
Using Apache NiFi, Apache Kafka and Apache Flink in a hybrid environment
Cloudera DataFlow
Cloudera Streams Messaging Manager
Cloudera SQL Stream Builder
SplunkLive! Frankfurt 2018 - Data Onboarding Overview – Splunk
Presented at SplunkLive! Frankfurt 2018:
Splunk Data Collection Architecture
Apps and Technology Add-ons
Demos / Examples
Best Practices
Resources and Q&A
“Lights Out” Configuration using Tivoli Netcool AutoDiscovery Tools – Antonio Rolle
Review why a CMDB is essential to and is the foundation of your BSM strategy
Outline the known challenges that require planning at the outset of a CMDB initiative
Drill down into the approach and lessons learned in the initial stages of a CMDB rollout for one of the largest financial institutions in North America
Learn how you can automate your offline IT asset management processes so you can ensure data security, efficiency, standardized processes and more!
Learn about our tape management solutions at www.bandl.com/solutions/tape-management/
Learn more about our offline IT asset management solution at www.bandl.com/solutions/assetaware/
Kalix: Tackling the Cloud to Edge Continuum – Jonas Bonér
Read this blog for an overview of Kalix:
https://www.kalix.io/blog/kalix-move-to-the-cloud-extend-to-the-edge-go-beyond
Abstract:
What will the future of the Cloud and Edge look like for us as developers? We have great infrastructure nowadays, but that only solves half of the problem. The Serverless developer experience shows the way, but it’s clear that FaaS is not the final answer. What we need is a programming model and developer UX that takes full advantage of new Cloud and Edge infrastructure, allowing us to build general-purpose applications, without needless complexity.
What if you only had to think about your business logic, public API, and how your domain data is structured, not worry about how to store and manage it? What if you could not only be serverless but become “databaseless” and forget about databases, storage APIs, and message brokers?
Instead, what if your data just existed wherever it needed to be, co-located with the service and its user, at the edge, in the cloud, or in your own private network—always there and available, always correct and consistent? Where the data is injected into your services on an as-needed basis, automatically, timely, efficiently, and intelligently.
Services, powered with this “data plane” of application state—attached to and available throughout the network—can run anywhere in the world: from the public Cloud to 10,000s of PoPs out at the Edge of the network, in close physical proximity to its users, where the co-location of state, processing, and end user ensures ultra-low latency and high throughput.
Sounds exciting? Let me show you how we are making this vision a reality building a distributed real-time Data Plane PaaS using technologies like Akka, Kubernetes, gRPC, Linkerd, and more.
Federal Webinar: Improve IT Service Management and help meet Federal Standards – SolarWinds
The Federal Sales Engineering team discussed how service management can be improved by leveraging our integrated help desk and remote support solutions. They also reviewed and demonstrated our powerful, budget-friendly tools for mapping, network troubleshooting, syslog management, and more.
During this interactive webinar, attendees learned about:
Connecting to remote computers directly from help desk trouble tickets, with easy access to integrated IT asset information for faster troubleshooting, using Web Help Desk® and Dameware® Remote Support
Using Network Topology Mapper to discover the IT assets on your network, including Layer 2 and Layer 3 topology data
Improving compliance with log retention policies using Kiwi Syslog® Server
Automatically backing up and performing configuration changes with Kiwi CatTools®
Troubleshooting a range of network issues with Engineer’s Toolset™ (such as IP address and DHCP scope monitoring, port scanning, etc.)
Securely managing file transfers with Serv-U® MFT
Generating a custom Ruby SDK for your web service or Rails API using Smithy – g2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Kubernetes & AI - Beauty and the Beast!?! @KCD Istanbul 2024 – Tobias Schneck
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our lovely cloud-native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and provide a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premises strategy we may need to apply it to our own infrastructure from an enterprise perspective. I will give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I already have working for real.
DevOps and Testing slides at DASA Connect – Kari Kakkonen
Slides by me and Rik Marselis at the DASA Connect conference on 30 May 2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... – UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
JMeter webinar - integration with InfluxDB and Grafana – RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Connector Corner: Automate dynamic content and events by pushing a button – DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Securing your Kubernetes cluster: a step-by-step guide to success! – KatiaHIMEUR1
Today, after several years of existence, backed by an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Accelerate your Kubernetes clusters with Varnish Caching – Thijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... – James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
2. Agenda
§ Reality of Event Logging
§ Liberating Application Data
§ Operational Best Practices
§ Data Enrichment | Other Data Sources
§ More Developer Tools
4. The Accelerating Pace of Data: Volume | Velocity | Variety | Variability
GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops
Machine data is the fastest growing, most complex, most valuable area of big data
5. Event Logs Suck
Sources: Online Services, Web Services, Servers, Security, GPS Location, Storage, Desktops, Networks, Packaged Applications, Custom Applications, Messaging, Telecoms, Online Shopping Cart, Web Clickstreams, Databases, Energy Meters, Call Detail Records, Smartphones and Devices, RFID (On-Premises | Private Cloud | Public Cloud)
§ They have some structure
§ Structure is not consistent
§ Structure is non-standard
§ Keys can be stored separately
§ High volume, growing every day
§ Hard to access
§ Take up tons of space
§ Clog up the network
6. Event Logs Suck
050818 16:19:31 2 Query UPDATE xar_session_info SET xar_vars='XARSVuid|i:2;XARSVrand|i:343223999;XARSVuaid|s:2:"29";XARSVbrowsername|s:9:"Netscape6";XARSVbrowserversion|s:3:"5.0";XARSVosname|s:7:"Unknown";XARSVosversion|s:7:"Unknown";XARSVnavigationLocale|s:11:"en_US.utf-8";SPLUNKAPP_IP|N;', xar_lastused = 1124407171 WHERE xar_sessid = 'll7joq442223fl6h07v3f3vpd2'
10 Query UPDATE xar_session_info SET xar_vars='XARSVuid|i:2;XARSVrand|i:89426315;XARSVuaid|s:2:"29";XARSVbrowsername|s:9:"Netscape6";XARSVbrowserversion|s:3:"5.0";XARSVosname|s:7:"Unknown";XARSVosversion|s:7:"Unknown";XARSVnavigationLocale|s:11:"en_US.utf-8";SPLUNKAPP_IP|N;', xar_lastused = 1124407193 WHERE xar_sessid = 't2idg584t1co0scgj40qnnm'
31 Connect caveuser@web2.int.splunk.com on cave
Jun 2 13:36:50 DEBUG[1826]: Setting NAT on RTP to 0
Jun 2 13:36:50 DEBUG[1826]: Check for res for 5008office
Jun 2 13:36:50 DEBUG[1826]: Call from user '5008office' is 1 out of 0
Jun 2 13:36:50 DEBUG[1826]: build_route: Contact hop: <sip:5008office@10.1.1.132:5060>
Jun 2 13:36:50 VERBOSE[10887]: -- Executing Macro("SIP/5008office-dfbd", …
8. Gold Mine of Information
• Ensure system security
• Meet compliance mandates
• Customer behavior and experience
• Product and service usage
• End-to-end transaction visibility
A definitive record of activity and behavior; important insight for IT and the business.
Example (User, IP, Action, Login Result):
10.2.1.44 - [25/Sep/2009:09:52:30 -0700] type=USER_LOGIN msg=audit(1253898008.056:199891): user pid=25702 uid=0 auid=4294967295 msg='acct="TAYLOR": exe="/usr/sbin/sshd" (hostname=?, addr=10.2.1.48, terminal=sshd res=failed)'
Example (User, IP, Product, Category):
10.2.1.80 - - [25/Jan/2010:09:52:30 -0700] "GET /petstore/product.screen?product_id=AV-CB-01 HTTP/1.1" 200 9967 "http://category.screen?category_id=BIRDS" "Mozilla/5.0 (Linux)" "JSESSIONID=xZDTK81Gjq9gJLGWnt2NXrJ2tpGZb1
11. The Mighty Application Log
Operations
§ How many transactions are failing?
§ Which specific transactions are failing?
§ Is system performance falling behind?
Security
§ Who is accessing the app? When?
§ What activity looks suspicious?
§ Is the application behaving as expected?
Business Intelligence
§ What is the purchase volume over time?
§ How do purchases compare to last month?
§ How are customers affected by app issues?
Social/Mobile
§ How is the customer experience?
§ Are transactions taking too long?
§ Where are transactions happening?
12. Traditional Analytics: Early Structure Binding
Structure → Data
§ Schema created at design time
§ Queries understood at design time
§ Homogenous
§ Data must fit into tables or be converted to tables
§ Data must match constraints
SELECT customers.* FROM customers
WHERE customers.customer_id NOT IN
  (SELECT customer_id FROM orders
   WHERE year(orders.order_date) = 2004)
13. Analytics with Splunk: Late Structure Binding
Data → Structure
§ Schema-less
§ Structure created at search time
§ Queries executed ad hoc
§ Heterogeneous
§ Constantly changing
§ No conversion required
§ No constraints
14. Gain Intelligence Quickly
Early Structure Binding: decide the question to ask → design the schema → normalize data + write DB insertion code → create SQL & feed into analytics tool
§ Days – Weeks – Months
§ Destructive
Late Structure Binding: write semantic events → collect → create searches, reports & graphs
§ Minutes
§ Non-Destructive
16. Current State
§ You have no control over other systems' events
§ You have full control over events that YOU write
§ Most events are written by developers to help them debug
§ Some events are written to form an audit trail
17. Logging with Purpose
§ Logging for Debugging
  § Troubleshoot application problems
  § Identify trends
  § Categorize issues
§ Semantic Logging
  § Record the state of business processes
  § Examples: web clicks, financial trades, cell phone connections, audit trails, etc.
void submitPurchase(purchaseId) {
  log.info("action=submitPurchaseStart, purchaseId=%d", purchaseId)
  // These calls throw an exception on failure:
  submitToCreditCard(...)
  generateInvoice(...)
  generateFulfillmentOrder(...)
  log.info("action=submitPurchaseCompleted, purchaseId=%d", purchaseId)
}
18. Liberating Log Data – In a Nutshell
§ Use clear key-value pairs
§ Create events humans can read
§ Use developer-friendly formats
§ Use timestamps for every event
§ Use unique identifiers (IDs)
§ Log in text format
§ Log more than debug events
§ Use categories
§ Identify the source
§ Minimize multi-line events
19. Use Clear Key-Value Pairs
§ Create structure from unstructured data
  § Use space- or comma-delimited pairs
  § Wrap values containing spaces in quotes
§ Automatic field extraction
  § Self-describing; does not require regular expressions to parse
  § Keys are stored alongside field values
  § No additional configuration work for the Splunk Admin or Knowledge Manager
Example (Good):
log.debug("orderstatus=error, errorcode=454, user=%d, transactionid=%s", userId, transId)
Example (Bad):
log.debug("error %d 454 - %s", userId, transId)
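A small helper makes the key-value convention easy to apply everywhere. This is a sketch in Python (the deck's examples are pseudocode); the function name `kv_event` is an assumption, not part of the original.

```python
def kv_event(**fields):
    """Render fields as comma-delimited key=value pairs, wrapping
    any value that contains a space in quotes so automatic field
    extraction keeps the whole value together."""
    parts = []
    for key, value in fields.items():
        value = str(value)
        if " " in value:
            value = '"%s"' % value
        parts.append("%s=%s" % (key, value))
    return ", ".join(parts)
```

For example, `kv_event(orderstatus="error", errorcode=454)` yields `orderstatus=error, errorcode=454`, and a value like `"Jane Doe"` comes out quoted.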
20. Create Human-Readable Events
§ Use ASCII format
  § Avoid complex encodings
  § Avoid formats that require arbitrary code to decipher
§ Use consistent formatting
  § Separate events with different formats into individual files
21. Create Human-Readable Events
§ Avoid binary data
  § Binary data is compressed, but requires decoding and does not segment
  § Splunk cannot meaningfully search or analyze binary data
§ If data must be in binary format:
  § Provide a tool to easily convert it to ASCII
  § Create a custom Splunk search command to decode binary segments inline
§ Place textual metadata in the event
  § For example, do not log the binary data of a JPG file, but do log its image size, creation tool, username, camera, GPS location, etc.
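The "textual metadata instead of bytes" advice can be sketched as follows; this is an illustrative Python helper (the name `upload_event` and its fields are assumptions), recording only searchable metadata about a binary file.

```python
import os

def upload_event(path, username):
    """Describe a binary upload with textual metadata (filename and
    size) instead of embedding the bytes themselves in the log."""
    return "action=fileUpload, user=%s, filename=%s, bytes=%d" % (
        username, os.path.basename(path), os.path.getsize(path))
```

The event stays ASCII, segments cleanly, and still answers questions like "who uploaded the largest files?"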
22. Use Developer-Friendly Formats
§ JSON and XML are readable by humans and machines
§ Seamless parsing by most programming languages, right in the browser
§ Useful for capturing hierarchy or membership, and self-describing
§ Easily interpreted by Splunk's spath command
{"widget": {
    "text": [
        {"data": "Click here", "size": 36},
        {"data": "Learn more", "size": 37},
        {"data": "Help", "size": 38}
    ]
}}
date       size data
---------- ---- ----------
2014-08-12 36   Click here
           37   Learn more
           38   Help
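Emitting such events from code is a one-liner with the standard `json` module. A minimal sketch, assuming one self-describing JSON object per line (the function name `json_event` is an assumption):

```python
import json

def json_event(**fields):
    # One JSON object per line keeps events easy to split, and the
    # nested structure can be walked with Splunk's spath command.
    # sort_keys makes the output deterministic for testing and diffing.
    return json.dumps(fields, sort_keys=True)
```

For example, `json_event(data="Click here", size=36)` produces `{"data": "Click here", "size": 36}`.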
23. Use Timestamps
§ Time is a first-class citizen
  § Timestamps are critical to understanding the sequence of events for debugging, analytics, and deriving transactions
  § Timestamps are automatically detected, but it is best to use an intelligent format
§ Timestamp dos
  § Use the most verbose granularity possible (microseconds), since events can otherwise become orphaned from the originating event
  § Place timestamps at the beginning of the event
  § Include a four-digit year
  § Include a time zone
§ Timestamp do nots
  § Do not use a time offset
Example (Good):
08/12/2014:09:16:35.842 GMT INFO key1=value1 key2=value2
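These dos can be encoded once in a formatting helper. An illustrative Python sketch (the name `event_prefix` is an assumption) with a four-digit year, microsecond granularity, and an explicit zone offset:

```python
from datetime import datetime, timezone

def event_prefix(now=None):
    """Render a timestamp for the start of an event: four-digit
    year, microsecond granularity, explicit time-zone offset."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%d %H:%M:%S.%f %z")
```

For a fixed instant, `event_prefix(datetime(2014, 8, 12, 9, 16, 35, 842000, tzinfo=timezone.utc))` returns `2014-08-12 09:16:35.842000 +0000`.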
24. Use Unique Identifiers (IDs)
§ More power for debugging and analytics
  § Examples: transaction IDs, user IDs
  § Used to find exact transactions
§ Carry unique IDs through multiple touch points
  § Avoid changing the format between modules or systems
  § Include transitive closures
Transaction:
transid=abcdef, ...
transid=abcdef, otherid=qrstuv, ...
otherid=qrstuv, ...
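The "transitive closure" shown above is simply one bridging event that carries both IDs, so searches can join the two halves of the transaction. A minimal sketch in Python (the function name `bridge_event` is an assumption):

```python
def bridge_event(transid, otherid):
    """When a downstream system assigns its own ID mid-transaction,
    log one event carrying both IDs so events tagged only with
    otherid can be joined back to the original transid."""
    return "transid=%s, otherid=%s" % (transid, otherid)
```

For example, `bridge_event("abcdef", "qrstuv")` yields `transid=abcdef, otherid=qrstuv`, matching the slide's diagram.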
25. Unique IDs Through Multiple Touch Points
Sources: Order Processing | Twitter Care | IVR | Middleware Error
IDs carried across them: Order ID, Customer ID, Product ID, Twitter ID, Company's Twitter ID, Customer's Tweet, Time Waiting On Hold
26. Minimize Multi-Line/Value Events
§ Multi-line/value events are less efficient
  § More difficult for software to parse
  § Generate many segments, which affects indexing/search speed and disk compression
§ Break multi-line events into separate events
§ Break multi-value fields into separate events for easier manipulation
Example (Good):
<TS> phonenumber=333-444-4444, app=angrybirds, installdate=xx/xx/xx
<TS> phonenumber=333-444-4444, app=facebook, installdate=yy/yy/yy
Example (Bad):
<TS> phonenumber=333-444-4444, app=angrybirds,facebook
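Turning the bad form into the good form is a simple fan-out: emit one event per value. An illustrative Python sketch (the name `install_events` and its fields are assumptions):

```python
def install_events(ts, phonenumber, apps):
    """One event per installed app, rather than a single event with
    a comma-separated multi-value app field."""
    return ["%s phonenumber=%s, app=%s" % (ts, phonenumber, app)
            for app in apps]
```

For example, `install_events("<TS>", "333-444-4444", ["angrybirds", "facebook"])` returns the two single-value events from the slide.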
27. Log More Than Debug Events
§ Log anything that can add value when aggregated and/or visualized
  § user actions
  § timing
  § transactions
  § audit trails
§ Log a category
  § Severity levels can aid navigation and baselining
§ Identify the source
  § Use the class, function, or filename
29. Operational Best Practices
§ Log locally to log files
  § Provides a local buffer
  § Non-blocking during network failures
  § Use syslog-ng or rsyslog + a Splunk forwarder for syslog data
§ Implement rotation policies
  § Logs take up space
  § Many compliance regulations require years of archival storage
  § Decide on destroying or backing up logs
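As one concrete way to get local buffering plus rotation, Python's standard library ships a size-based rotating handler. A minimal sketch; the file name and the 10 MB / five-archive numbers are placeholders, and real retention should follow your compliance and backup policy:

```python
import logging
from logging.handlers import RotatingFileHandler

# Rotate the local log file at roughly 10 MB, keeping five archives.
handler = RotatingFileHandler("app.log", maxBytes=10 * 1024 * 1024,
                              backupCount=5)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
app_log = logging.getLogger("app")
app_log.addHandler(handler)
app_log.setLevel(logging.INFO)
```

Writing locally first means a network outage never blocks the application; the forwarder picks the file up when connectivity returns.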
30. Operational Best Practices
§ Use Splunk forwarders
  § Data collection in real time
  § Tracks and maintains state
§ Enable collection of data over many channels: HTTP | queues | multicast | web services | databases
§ Collect events from everything, everywhere
  § Application logs | database logs | network data | configuration files | performance data | time-based data
  § More data captured = more visibility
32. Creating Value With Structured Data
DB Connect provides reliable, scalable, real-time integration between Splunk and traditional relational databases.
§ Enrich search results with additional business context
§ Easily import data into Splunk for deeper analysis
§ Integrate multiple DBs concurrently
§ Simple set-up, non-invasive and secure
Components: Java Bridge Server | JDBC | Database Lookup | Database Query | Connection Pooling
Databases: Microsoft SQL Server | Oracle Database | other databases
33. It's Hard to Turn Raw Data Into Refined Insights
§ Hadoop and NoSQL offer simple storage but hard analytics: difficult to explore, analyze, and visualize
§ Hard-to-staff skills: require months of labor by specialists with rare and expensive skill sets
§ Inflexible approaches: must predefine fixed schemas or program MapReduce jobs
A wide range of open source projects for analytics and data visualization sits on top of Hadoop (MapReduce & HDFS), YARN, and NoSQL data stores: Hive, Pig, Mahout, Sqoop, DataFu, Azkaban.
34. Integrated Analytics Platform for Diverse Data Stores
§ Full-featured, integrated product
§ Fast insights for everyone
§ Works with what you have today
Explore | Analyze | Visualize | Dashboards | Share
Bi-directional integration with Hadoop clusters (Hadoop client libraries, streaming resource libraries), NoSQL, and other data stores.
36. The Splunk Enterprise Platform
§ Core engine: collection, indexing, Search Processing Language, core functions
§ Content: inputs, apps, other content, SDK content
§ User and developer interfaces: web framework, REST API
37. What's Possible with the Splunk Enterprise Platform?
§ Power mobile apps
§ Log directly
§ Extract data
§ Customer dashboards
§ Integrate BI tools
§ Integrate platform services
Developer Platform
38. Powerful Platform for Enterprise Developers
Developers can customize and extend Splunk through:
§ REST API
§ Web framework: Simple XML, JavaScript, Django
§ SDKs: Ruby, C#, PHP
§ Data models
§ Search extensibility
§ Modular inputs
39. Splunk Software for Developers
§ Gain application intelligence
§ Build Splunk apps
§ Integrate and extend Splunk