DevOps and
Performance
Why, How &
Best Practices
@grabnerandi
http://apmblog.compuware.com
What you may have heard about Austrians
And just very recently @ Euro Song Contest
How we would like the world to see us 
What we are also proud of 
What you should check out …
The stuff we did
when we were a Start Up
and we All were
Devs, Testers and Ops
YOU ARE NOT ALONE: Popularity on Google
Who is doing it? How many successful
deployments can they do?
300 Deployments / Year
50-60 Deployments / Day
10+ Deployments / Day
Every 11.6 seconds
More on Amazon's Story
75% fewer outages since 2006
90% fewer outage minutes
~0.001% of deployments cause a problem
Instantaneous automatic rollback
Deploying every 11.6s
Testing is Important – and gives Confidence
But are we ready for “The Real” world?
Measure Performance during the game
Ball Possession: 40 : 60
Fouls: 0 : 0
Score: 0 : 0
Minute 1 - 5
Measure Performance during the game
Minute 6 - 35
Ball Possession: 80 : 20
Fouls: 2 : 12
Score: 0 : 0
Deep Dive Analysis
Options “To Fix” the situation
Not always a happy ending 
Minute 90
Ball Possession: 80 : 20
Fouls: 4 : 25
Score: 3 : 0
FRUSTRATED FANS!!
How does that
relate to
Software?
From Deploy to …
Deploy -> Promotion/Event -> Problems -> Ops Playbook -> War Room
Timeline
The “War Room” – back then
“Houston, we have a problem”
NASA Mission Control Center, Apollo 13, 1970
The “War Room” – NOW
Facebook – December 2012
3 Situations on
WHY this happens,
HOW to avoid it
Image taken from https://www.scriptrock.com/blog/devops-whats-hype-about/
#Disconnected
Teams
“Teamwork” between Dev and Ops
SEV1 Problem in Production
Need access to log files
Where are they? Can’t get them
Need to increase log level
Can’t do! Can’t change config
files in prod!
“Solution”: Implement a Custom “On Demand”
Remote Logger
Implementation and Rollout
Implemented
Custom Logger
Worked well in
Load Testing
What happened?
~1 million Lock Exceptions in 30 mins
Root Cause: A special WebSphere Setting!
Log Service provides a synchronized
log file across ALL JVMs
Metrics: # Log Messages,
# Exceptions
Share: Same Server
Settings
DevOps: Agree on Data
for Troubleshooting
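A minimal sketch of how the two metrics named on this slide (# Log Messages, # Exceptions) could be collected from log output so that Dev and Ops look at the same numbers. The log line format, the `summarize_log` helper and the sample lines are assumptions made for illustration; real WebSphere logs look different.

```python
import re
from collections import Counter

# Assumed line format: "<LEVEL> <message>". This is a hypothetical format,
# not the format of any real application server log.
EXCEPTION_PATTERN = re.compile(r"\b\w+(?:Exception|Error)\b")


def summarize_log(lines):
    """Count log messages (total and per level) and lines mentioning an exception."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        level = line.split(maxsplit=1)[0]
        counts["messages"] += 1
        counts["level:" + level] += 1
        if EXCEPTION_PATTERN.search(line):
            counts["exceptions"] += 1
    return counts


# Made-up sample data for illustration only.
sample = [
    "INFO request handled",
    "ERROR LockException: log file is locked",
    "WARN slow response",
    "ERROR LockException: log file is locked",
]
summary = summarize_log(sample)
print(summary["messages"], summary["exceptions"])
```

Tracking these counts per build in load testing and in production would have made a jump to roughly a million lock exceptions visible early, provided both environments run with the same server settings.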
#No “Agile”
Deployment
Adonair
Load Spike resulted in Unavailability
Alternative: “GoDaddy goes DevOps”
1h before
SuperBowl KickOff
1h after
Game ended
Behind the Scenes
Metrics: Availability
Page Size, # Objects
# Hosts, # Connections
DevOps: “Feature”
Switches
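One way to picture a “feature” switch is a small in-memory sketch like the one below: expensive page parts sit behind named switches that Ops can turn off during a load spike without a new deployment. The class, the switch names and the page parts are all hypothetical and not taken from any real site.

```python
class FeatureSwitches:
    """Hypothetical in-memory switch store; a real one would be shared across servers."""

    def __init__(self, defaults):
        self._flags = dict(defaults)

    def is_on(self, name):
        # Unknown switches default to off, so a typo never enables a feature.
        return self._flags.get(name, False)

    def set(self, name, enabled):
        self._flags[name] = bool(enabled)


def render_landing_page(switches):
    """Return the page parts to render; optional parts depend on switches."""
    parts = ["headline", "product list"]
    if switches.is_on("promo_video"):
        parts.append("promo video")
    if switches.is_on("recommendations"):
        parts.append("recommendations")
    return parts


switches = FeatureSwitches({"promo_video": True, "recommendations": True})
full_page = render_landing_page(switches)

# During a load spike, Ops turns off the expensive parts without a redeploy.
switches.set("promo_video", False)
switches.set("recommendations", False)
light_page = render_landing_page(switches)
print(full_page, light_page)
```

The design choice here is that the lightweight page is the fallback: the core content always renders, and only the optional parts are switchable.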
#Push
without a Plan
Mobile Landing Page of Super Bowl Ad
434 Resources in total on that page:
230 JPEGs, 75 PNGs, 50 GIFs, …
Total size of ~20 MB
m.store.com redirects to www.store.com
ALL CSS and JS files are
redirected to the www domain
This is a lot of time “wasted”
especially on high latency mobile
connections
Critical Pages not Optimized!
Browse, Search and Product Info perform well
Critical Pages such as Shopping Cart are very slow …
… because they don’t follow best practices: 87 Requests, 28 Redirects, …
Metrics: Load Time,
# Resources (Images, …),
# HTTP 3xx, 4xx, 5xx
Dev: Build for Mobile
Test: Test on Mobile
Ops: Monitor Mobile
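The page-level metrics from this section (# Resources, # HTTP 3xx, 4xx, 5xx, page size) can be computed from a recorded list of requests. The sketch below assumes a simple `(url, status, size)` tuple per request; the `page_metrics` helper, the status codes and the byte sizes are made up for illustration, and only the two host names come from the slides.

```python
from collections import Counter


def page_metrics(requests):
    """Summarize a page load given (url, status_code, size_bytes) tuples."""
    by_class = Counter(f"{status // 100}xx" for _, status, _ in requests)
    return {
        "resources": len(requests),
        "total_bytes": sum(size for _, _, size in requests),
        "3xx": by_class["3xx"],
        "4xx": by_class["4xx"],
        "5xx": by_class["5xx"],
    }


# Hypothetical recording of a mobile page load that gets redirected to www.
sample = [
    ("http://m.store.com/", 302, 0),
    ("http://www.store.com/", 200, 50000),
    ("http://m.store.com/site.css", 301, 0),
    ("http://www.store.com/site.css", 200, 12000),
    ("http://www.store.com/missing.png", 404, 300),
]
metrics = page_metrics(sample)
print(metrics)
```

Every 3xx in this summary is an extra round trip, which is what hurts most on high latency mobile connections.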
# of Requests / User
# of Log Messages
# of Exceptions
# Objects Allocated
# Objects In Cache
Cache Hit Ratio
# of Images
# of SQLs
# SQLs per Request
Availability
# HTTP 3xx, 4xx
Page Size
Commit Stage
• Compile
• Execute Unit Test
• Code Analysis
• Build installers
Automated
Acceptance
Testing
Automated
Capacity
Testing
Manual testing
• Key showcases
• Exploratory testing
Release
Unit & Integration Tests
Functional Tests
Performance Tests
Production
Monitoring
Functional Tests
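A hedged sketch of how metrics such as # SQLs per Request or Page Size could act as a gate at each pipeline stage: compare the current build against a baseline and flag anything that got noticeably worse. The `check_build` function, the metric names, the 10% tolerance and all numbers are assumptions for illustration. The sketch also assumes that lower is better for every metric, which would not hold for something like Cache Hit Ratio.

```python
def check_build(metrics, baseline, tolerance=0.10):
    """Return metrics that exceed the baseline by more than the tolerance.

    Assumes lower values are better for every metric passed in.
    """
    regressions = {}
    for name, base in baseline.items():
        current = metrics.get(name)
        if current is None:
            continue  # metric not measured for this build
        if current > base * (1 + tolerance):
            regressions[name] = (base, current)
    return regressions


# Made-up numbers for illustration only.
baseline = {"sql_per_request": 4, "exceptions": 0, "page_size_kb": 800}
build = {"sql_per_request": 9, "exceptions": 0, "page_size_kb": 820}

failed = check_build(build, baseline)
print(failed)
```

A build with a non-empty result would be stopped before it reaches the next stage, so the regression is found in testing and not in the War Room.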
If we do all that
Which gives you more time for the really important
things in life …
Want MORE of these and more details?
http://apmblog.compuware.com
Recommended Book
https://itrevolution.wufoo.com/forms/phoenix-project-ebook-offer/
FREE Products & More Info
• dynaTrace Enterprise
– Full End-to-End Visibility in your Java, .NET, PHP Apps
– Sign up for a 15 Days Free Trial on http://compuwareapm.com
• dynaTrace AJAX Edition
– Browser Diagnostics for IE + FF
– Download @ http://ajax.dynatrace.com
• Our Blog: http://apmblog.compuware.com

DevOps and Performance - Why, How and Best Practices - DevOps Meetup Sydney

Editor's Notes

  • #15 MEASURE!!
  • #16 Who knows what that is? It’s the FIFA World Cup Trophy
  • #17 Teams are currently competing in the qualifications to compete in Brazil 2014
  • #18 This is “my” Austrian national soccer team. Their GOAL is to qualify for Brazil 2014. After the many failed attempts in the past we hired a new coach whose goal is to form a new team that PERFORMS well enough to qualify.
  • #19 In order to get there the team competed in many test games, which gave them a lot of confidence because they played against teams that were “easier” to beat. At the end of these tests we even started the qualification with some wins against teams that we expected to beat. So – at the end of these “test and easy qualification games” we thought: “ALL GOOD – THE ROAD IS OPEN FOR 2014 – NOT ONLY WILL WE QUALIFY BUT WE ALSO BELIEVE WE HAVE SUCH A STRONG TEAM THAT WE WILL ALSO DO WELL AT THE WORLD CUP”
  • #20 Then reality kicked in when we had our first “real competitor” – it was the first qualification game against a team whose quality is at the level we have to expect at the World Cup. The competing team was Germany and, based on these images, you can see how the game went.
  • #21 The coach is responsible for watching the game and seeing how things are going. Like other sports, soccer has a couple of Key Performance Indicators such as Ball Possession, Fouls and the actual Score. The first 5 minutes actually didn’t look too bad.
  • #22 After the first 5 minutes the game changes, with Germany taking over the game in their typical way. The KPIs make this very clear. The coach is responsible for reacting based on these values and on how the game goes.
  • #23 The coach should use more data for detailed analysis on what is going wrong in the game
  • #24 One of his options is to substitute players – or even change tactics. Does this succeed based on the KPIs that we have seen before?
  • #25 Well – not always. Just replacing players – putting some in that are faster in chasing the ball doesn’t always help
  • #28 Story: New Build Deployed on Thursday Evening. Everything runs smoothly on Friday Daytime. An Ad Campaign hits the Air Friday Night. The site crashes under load -> ALERTS GO OFF. Restarting Server -> SERVER DOESN’T START. Adding more Servers -> PROBLEM REMAINS. Calling in the “App Experts” and Pizza Delivery!
  • #32 Well – I guess there is not much more to say about this. The attitude between these teams doesn’t help in solving issues any faster.