How I Learned to Stop
Worrying and Love the Logs
DISCLAIMER
The bearing of a child takes nine months, no matter how many
women are assigned.
Brooks's Law: Adding manpower to a late software project
makes it later.
This second is the most dangerous system a man ever designs.
Never go to sea with two chronometers; take one or three.
There’s no silver bullet
Outline
Motivation
Objectives
Format
Best practices
Quiz
Cheats
Motivation
Your program is going to die anyway;
the least it can do is tell you who killed it.
Objective
Format
When?
Log date and time (international format)
Resolution - Milliseconds
processID / threadId
Where?
Application identifier e.g. name and version
Code location e.g. script name, module name
Geolocation
Window/form/page
Who?
Source address
User identity (if authenticated or otherwise known)
Or session context
What?
Type of event
Severity
security
General description
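The fields above can be combined into a single log line. The following is a minimal sketch, not a standard layout: the class name `LogLineSketch`, the `key=value` field names and the sample values are assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class LogLineSketch {
    // When: ISO-8601 timestamp in UTC with millisecond resolution.
    static final DateTimeFormatter TIMESTAMP =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

    // Builds one single-line event; the key names are hypothetical.
    static String format(Instant when, long pid, String app, String location,
                         String sourceAddress, String user,
                         String severity, String eventType, String description) {
        return String.join(" ",
            TIMESTAMP.format(when),            // when
            "pid=" + pid,
            "app=" + app,                      // where
            "loc=" + location,
            "src=" + sourceAddress,            // who
            "user=" + user,
            "sev=" + severity,                 // what
            "type=" + eventType,
            "msg=\"" + description + "\"");
    }

    public static void main(String[] args) {
        System.out.println(format(Instant.now(), ProcessHandle.current().pid(),
            "billing/1.4.2", "TransferService", "10.0.0.7", "alice",
            "INFO", "TRANSFER", "Transfer completed"));
    }
}
```

Keeping every field as a `key=value` pair on one line is one possible choice; it makes the events easy to search and count later.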
Message Priorities/Levels
Fatal, Error, Warn, Info, Debug, Trace
Emergency, Alert, Critical, Error, Warning, Notice, Informational, Debug
Warn, Info, Verbose, Debug, Silly
Severe, Warning, Info, Config, Fine, Finer, Finest
Message Priorities/Levels
•Fatal: Severe errors that cause premature termination.
•Error: Other runtime errors or unexpected conditions.
•Warn: Runtime situations that are undesirable or unexpected, but not necessarily "wrong".
•Info: Interesting runtime events (startup/shutdown).
•Debug: Detailed information on the flow through the system.
•Trace: More detailed information.
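A level only helps if the logger filters on it. The sketch below uses `java.util.logging` from the JDK, whose names (Severe, Warning, Info, Fine) differ from the Fatal/Error/Warn/Info/Debug/Trace names above; the mapping in the comments is an approximation, and the class and logger names are made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LevelThresholdSketch {
    // Returns a logger that drops every event below the given threshold.
    static Logger withThreshold(String name, Level threshold) {
        Logger logger = Logger.getLogger(name);
        logger.setLevel(threshold);
        return logger;
    }

    public static void main(String[] args) {
        Logger log = withThreshold("levels.sketch", Level.INFO);
        log.severe("Cannot open database, shutting down"); // roughly Fatal/Error
        log.warning("Retrying after timeout");             // roughly Warn
        log.info("Service started");                       // Info
        log.fine("Entering transfer()");                   // roughly Debug, dropped at INFO
        System.out.println(log.isLoggable(Level.FINE));    // prints false
    }
}
```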
•Input validation failures
•Output validation failures
•Authentication successes and failures
•Authorization (access control) failures
•Session management failures
•Application errors and system events
•Application and related systems start-ups and shut-downs, and logging initialization (starting, stopping or pausing)
•Use of higher-risk functionality
•Legal and other opt-ins
•Sequencing failure
•Excessive use
•Data changes
•Fraud and other criminal activities
•Suspicious, unacceptable or unexpected behavior
•Modifications to configuration
•Application code file and/or memory changes
W ?
W ?
Service origin
Flow marker
Best Practices
Log locally to files
Rotate log files to keep them current
Keep multi-line events to a minimum
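These three practices can be sketched with the JDK's `java.util.logging.FileHandler`, which rotates files by size. The file name pattern, the 1 MB limit, the count of 5 files and the `OneLineFormatter` class are assumptions chosen for illustration, not recommended values.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Formatter;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class RotatingFileLogSketch {
    // Writes each event on exactly one line: timestamp, level, message.
    static final class OneLineFormatter extends Formatter {
        @Override
        public String format(LogRecord record) {
            return record.getInstant() + " " + record.getLevel() + " "
                + formatMessage(record).replace('\n', ' ') + System.lineSeparator();
        }
    }

    // Logs to <dir>/app.0.log and rotates through 5 files of about 1 MB each.
    static Logger open(Path dir, String loggerName) {
        try {
            String pattern = dir.resolve("app.%g.log").toString(); // %g = generation number
            FileHandler handler = new FileHandler(pattern, 1_000_000, 5, true);
            handler.setFormatter(new OneLineFormatter());
            Logger logger = Logger.getLogger(loggerName);
            logger.setUseParentHandlers(false); // keep this sketch off the console
            logger.addHandler(handler);
            return logger;
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot open log file in " + dir, e);
        }
    }
}
```

Calling `RotatingFileLogSketch.open(dir, "app")` once at start-up and reusing the returned logger is the intended usage; the formatter flattens embedded newlines so that one event stays on one line.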
Collect events from everything, everywhere
•Application logs
•Database logs
•Network logs
•Configuration files
•Performance data (iostat, vmstat, ps, etc.)
•Anything that has a time component
Separate logs to areas of interest
Line.log
Uut-0141409124.log
Uut-0141409125.log
Uut-0141409126.log
Uut-0141409127.log
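One way to get a file layout like the one above is to give each area of interest its own logger and file. This is a sketch using `java.util.logging`; the logger names and the reading of the `Uut-` file names as one file per unit serial number are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

class PerUnitLogSketch {
    // Opens a logger that writes only to its own file in dir.
    static Logger open(Path dir, String loggerName, String fileName) {
        try {
            FileHandler handler = new FileHandler(dir.resolve(fileName).toString(), true);
            handler.setFormatter(new SimpleFormatter()); // FileHandler defaults to XML otherwise
            Logger logger = Logger.getLogger(loggerName);
            logger.setUseParentHandlers(false);          // do not duplicate events elsewhere
            logger.addHandler(handler);
            return logger;
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot open " + fileName, e);
        }
    }

    // Events that concern the whole line go to Line.log.
    static Logger lineLogger(Path dir) {
        return open(dir, "line", "Line.log");
    }

    // Events that concern one unit go to Uut-<serial>.log.
    static Logger unitLogger(Path dir, String serial) {
        return open(dir, "uut." + serial, "Uut-" + serial + ".log");
    }
}
```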
Don’t log and throw
try {
    // Code that generates an IOException
} catch (IOException e) {
    logger.fatal("Caught IOException", e);
    throw e;
}
Don’t catch and drop
try {
    // Code that generates an IOException
} catch (IOException e) {}
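A possible alternative to both anti-patterns is to add context and rethrow in the middle layers, then log exactly once at the boundary that decides what to do. The sketch below assumes `java.util.logging`; `writeToDevice`, `save`, `handleRequest` and `capture` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class HandleOnceSketch {
    static final Logger LOG = Logger.getLogger("handle.once.sketch");

    // Hypothetical low-level operation that always fails, for illustration.
    static void writeToDevice(String device) throws IOException {
        throw new IOException("device not ready");
    }

    // Middle layer: add context and rethrow, but do not log here.
    static void save(String device) {
        try {
            writeToDevice(device);
        } catch (IOException e) {
            throw new UncheckedIOException("Saving to " + device + " failed", e);
        }
    }

    // Boundary: the one place that logs, with the full cause chain attached.
    static boolean handleRequest(String device) {
        try {
            save(device);
            return true;
        } catch (UncheckedIOException e) {
            LOG.log(Level.SEVERE, "Request failed for device " + device, e);
            return false;
        }
    }

    // Collects records in memory so the number of logged events can be checked.
    static List<LogRecord> capture() {
        List<LogRecord> records = new ArrayList<>();
        LOG.setUseParentHandlers(false);
        LOG.addHandler(new Handler() {
            @Override public void publish(LogRecord record) { records.add(record); }
            @Override public void flush() { }
            @Override public void close() { }
        });
        return records;
    }
}
```

With this layout a single failure produces a single log event, and that event still carries the original `IOException` as its cause.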
void transfer(Account fromAcc, Account toAcc, int amount, User user) {
    if (!isUserAuthorised(user, fromAcc)) {
        throw new UnauthorisedUserException();
    }
    if (fromAcc.getBalance() < amount) {
        throw new InsufficientFundsException();
    }
    fromAcc.withdraw(amount);
    toAcc.deposit(amount);
    database.commitChanges(); // Atomic operation.
}
void transfer(Account fromAcc, Account toAcc, int amount, User user) {
    logger.info("Transferring money...");
    if (!isUserAuthorised(user, fromAcc)) {
        throw new UnauthorisedUserException();
    }
    if (fromAcc.getBalance() < amount) {
        throw new InsufficientFundsException();
    }
    fromAcc.withdraw(amount);
    toAcc.deposit(amount);
    database.commitChanges(); // Atomic operation.
}
void transfer(Account fromAcc, Account toAcc, int amount, User user) {
    logger.info("Transferring money from {} to {} amount {} user {}",
        fromAcc, toAcc, amount, user);
    if (!isUserAuthorised(user, fromAcc)) {
        logger.info("User has no permission.");
        throw new UnauthorisedUserException();
    }
    if (fromAcc.getBalance() < amount) {
        logger.info("Insufficient funds.");
        throw new InsufficientFundsException();
    }
    fromAcc.withdraw(amount);
    toAcc.deposit(amount);
    database.commitChanges(); // Atomic operation.
}
void transfer(Account fromAcc, Account toAcc, int amount, User user) {
    logger.info("Transferring money from {} to {} amount {} user {}",
        fromAcc, toAcc, amount, user);
    if (!isUserAuthorised(user, fromAcc)) {
        logger.info("User has no permission.");
        throw new UnauthorisedUserException();
    }
    if (fromAcc.getBalance() < amount) {
        logger.info("Insufficient funds.");
        throw new InsufficientFundsException();
    }
    fromAcc.withdraw(amount);
    toAcc.deposit(amount);
    database.commitChanges(); // Atomic operation.
    logger.info("Transaction successful.");
}
private String phone;

public String getPhone() {
    return (phone == null) ? "none" : phone;
}

public void setPhone(String arg, MfDate changeDate) {
    log(changeDate, this, "change of phone", phone, arg);
    phone = arg;
}

public void setPhone(String arg) {
    setPhone(arg, MfDate.today());
}

private static void log(MfDate validDate, Customer customer, String description,
                        Object oldValue, Object newValue) {
    try {
        // Tab-separated fields, one change per line.
        logfile().write(validDate.toString() + customer.name() + "\t" + description +
            "\t" + oldValue + "\t" + newValue + "\t" + MfDate.today() + "\n");
        logfile().flush();
    } catch (IOException e) {
        throw new ApplicationException("Unable to write to log");
    }
}
LOG.debug(arg.toString());
LOG.debug("Method called with arg {}", arg);
LOG.debug("{}", arg);
LOG.debug("Something interesting is happening: event=" + event.type + ", message=" + message);
LOG.debug("Something interesting is happening: event={0}, message={1}", event.type, message);
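The point of the placeholder forms is that the message is only built when the level is enabled. The slides do not name the logging library, so the sketch below shows the same idea with `java.util.logging`, which uses `{0}`-style placeholders and also accepts a `Supplier`; the `describe` helper and the `BUILT` counter are hypothetical and exist only to make the effect observable.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

class LazyMessageSketch {
    // Counts how often the message string is actually built.
    static final AtomicInteger BUILT = new AtomicInteger();

    // Stands in for an expensive toString() or string concatenation.
    static String describe(String eventType, String message) {
        BUILT.incrementAndGet();
        return "Something interesting is happening: event=" + eventType + ", message=" + message;
    }

    static void logEvent(Logger log, String eventType, String message) {
        // Supplier form: describe() runs only if FINE is enabled.
        log.fine(() -> describe(eventType, message));
        // Placeholder form: arguments are substituted only when the record is formatted.
        log.log(Level.FINE, "event={0}, message={1}", new Object[] {eventType, message});
    }
}
```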
try { doSomething(arg); ... }
catch (SomeException ex) { }
try { doSomething(arg); ... }
catch (SomeException ignore) {
LOG.warn(
"Failed to do something with {}, ignoring it",
arg); }
try { doSomething(arg);}
catch (SomeException ex) {
LOG.warn(ex.getMessage()); }
Log Manager
Count something
57,465 Exceptions per day
57,465 Errors per day
57,465 Transactions per day
Enumerate logger errors
LOGGER.error("001 - Error writing to device");
LOGGER.error("002 - Error writing to device");
LOGGER.error("003 - Error writing to device");
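To keep such numbers unique, they can be kept in one place instead of being typed into each message. The sketch below is one possible arrangement; the enum constants `WRITE_HEADER`, `WRITE_PAYLOAD` and `WRITE_CHECKSUM` are hypothetical call sites invented for illustration.

```java
import java.util.logging.Logger;

class ErrorCodeSketch {
    static final Logger LOGGER = Logger.getLogger("error.code.sketch");

    // Hypothetical catalogue: one constant per call site that can report the failure.
    enum ErrorCode {
        WRITE_HEADER("001"),
        WRITE_PAYLOAD("002"),
        WRITE_CHECKSUM("003");

        final String id;

        ErrorCode(String id) {
            this.id = id;
        }

        // Prefixes the message with the identifier of the call site.
        String format(String message) {
            return id + " - " + message;
        }
    }

    static void writePayload() {
        // Same text as the other call sites, but the prefix tells them apart.
        LOGGER.severe(ErrorCode.WRITE_PAYLOAD.format("Error writing to device"));
    }
}
```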
Use ALM markers in code
LOGGER.error("JIRA-1231 : threshold has been exceeded.");
LOGGER.warn("TFS-1231 : Map server not found. No tiles will be shown.");
57,465 JIRA-1231 per day
1 JIRA-1231 per day
0 JIRA-1231 last month
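Counts like these can be produced by scanning log lines for the markers. The sketch below assumes markers of the form `JIRA-<digits>` or `TFS-<digits>` and takes the log lines as an in-memory list; a log manager would normally do this for you.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerCountSketch {
    // Assumed marker format: JIRA-1231, TFS-1231 and similar.
    static final Pattern MARKER = Pattern.compile("\\b(?:JIRA|TFS)-\\d+\\b");

    // Returns how many times each marker occurs in the given log lines.
    static Map<String, Integer> countMarkers(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher matcher = MARKER.matcher(line);
            while (matcher.find()) {
                counts.merge(matcher.group(), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Running this per day before and after a fix is deployed gives the kind of trend shown above for a single ticket.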
Use categories and flow markers
LOGGER.info("MESSAGE-FLOW 7a-SUCCESS - {0} got ACK for {1}", sender, alert);
LOGGER.info("MESSAGE-FLOW 7b-RETRY - {0} will resend {1}", sender, alert);
57,465 Messages sent
351 Messages not sent
C f
Lack of common log formats
Not all activities are logged, or aggregated
Requires learning
High volume of junk
Thank you
