Operational Analytics at Credit Suisse from ThousandEyes Connect

4,527 views

Darrell Westbury, Director of Operational Analytics at Credit Suisse, presents on how the global bank collects five types of IT operations data, analyzes it and uses it to derive insights.

Published in: Technology

  1. Operational Analytics. “Data is the New Soil” (David McCandless). October 16th 2015. Darrell Westbury, Director of Operational Analytics.
  2. About me: I’m Darrell Westbury. I work in Global Technology Services for Credit Suisse. Credit Suisse is a Fortune 200, Swiss-based investment bank with about 46,000 employees worldwide. I’ve worked at CS for ~7 years. I’ve held various roles: Head of Storage Ops for the Americas; Head of Capacity and Inventory Services; Director of Operational Analytics (current).
  3. “Data is the new Soil” (David McCandless)
  4. What is Operational Analytics? “Operations Analytics (OA) is an approach or method of applying big data principles and data analytics to the IT Operations realm.” “OA centers on discovering trends and patterns in high volume, complex and noisy IT systems data and making predictions that will help avoid impact to key services where possible and ‘recover quickly’* when issues do occur.” (* reduced MTTR)
  5. What Types of Data Does OA Target? Machine Data: system and application logs, system events, performance and capacity metrics. Wire Data: network packet captures that have been pre-decoded for ease of use. Agent Data: intercepted system calls and application method invocations. Synthetic Transactions: simulated views of a customer’s experience while interacting with a service. Human Maintained: asset inventories, lifecycle status, data classes, app names and the people who manage them, etc.
  6. So, What Does OA Actually Do? Phase I: Data Onboarding. Identify golden data sources; extract and transform (sometimes); load and ingest; manage data quality; reference data and maturity scales; accountable data owners; track progress with scorecards; remove any blockers; update data documentation. [Chart: EFFORT, 0% to 100%, showing Data Onboarding]
  7. So, What Does OA Actually Do? Phase II: Data Science. Identify trends and seasonal patterns in data; look for baselines and outliers using statistics and linear algebra techniques. [Chart: EFFORT, 0% to 100%, showing Data Onboarding and Data Science]
  8. So, What Does OA Actually Do? Phase III: Data Visualizations. [Chart: EFFORT, 0% to 100%, showing Data Onboarding, Data Science and Data Visualization]
  9. How are we using ThousandEyes? (1) Basic DNS Testing: alert on internet domain name resolution failures. (2) Port Listener Health: tickle a TCP port over the internet to ensure it’s listening and responding. (3) Data Path Testing: observe the end-to-end path through the internet to a target service and monitor for route changes, packet loss, latency and jitter. (4) Page Load Testing: ensure internet-facing web sites and services are responding correctly and consistently. (5) Synthetic Transactions: test authentication and site navigation; collect performance stats at the page object level.
  10. Detecting Internet Client Access Issues. ThousandEyes probes began reporting a 50-100% drop in authentication responses from a public-facing web service. Evidence of the issue was provided to Web Operations, who were unaware of it because all of their monitoring tools depicted what they believed to be a healthy and stable infrastructure (CPU, memory, disk and network I/O, capacity, performance, etc.).
  11. Evidence of Improved Performance. Implemented a synthetic transaction to compare the relative end-user experience of using the infrastructure with and without acceleration. Collected concrete evidence of a ~33% improvement in service time / latency when accessing an accelerated URL. Also able to demonstrate that service times became relatively more consistent.
  12. Insight into a Production Incident: Pre-Incident, Paths Look OK. [Timeline: 10:45 AM to 11:00 AM]
  13. Insight into a Production Incident: Alert Received, BGP Route Disruption. [Timeline: 10:45 AM to 11:00 AM]
  14. Insight into a Production Incident: Internet Paths are Failing. [Timeline: 10:45 AM to 11:00 AM]
  15. Insight into a Production Incident: No BGP Routes, Single Point of Failure (SPoF) Detected. [Timeline: 10:45 AM to 11:00 AM]
  16. Insight into a Production Incident: Path Failover to DR Site Successful. [Timeline: 10:45 AM to 11:00 AM]
  17. Some Final Thoughts. ThousandEyes lets us run multiple types of tests at various levels of sophistication and granularity from all over the world (DNS test, TCP port tickle, internet path test, page loads, full synthetic transactions). We see exactly how our clients are experiencing our services (internet path, BGP route health, packet loss, jitter and latency). We receive alerts on any significant variations from our established baselines (paths, routes, performance, service quality, etc.). We’re leveraging real, quantifiable data to take the guesswork and subjectivity off the table. We’re empowering important business decisions.
  18. Thank You. October 16th 2015. Darrell Westbury, Director of Operational Analytics.
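
Slide 7 mentions looking for baselines and outliers using statistics, and slide 17 mentions alerts on significant variations from established baselines. The deck does not say which method is used, so the following is only a minimal sketch of one common approach: a rolling mean and standard deviation with a z-score threshold. The function name `find_outliers`, the window size and the threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_outliers(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from a rolling baseline.

    The baseline for sample i is the `window` samples immediately before it.
    A sample is flagged when it lies more than `threshold` standard
    deviations from the baseline mean (hypothetical defaults).
    """
    outliers = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change at all counts as a deviation.
            if samples[i] != mu:
                outliers.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            outliers.append(i)
    return outliers
```

For example, with made-up latency samples `[100, 102, 98, 101, 99, 100, 250, 101]`, only the spike at index 6 is flagged. A production system would also need to account for the trends and seasonal patterns the slide refers to, which this sketch ignores.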
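
Slide 9 lists basic DNS testing and a TCP "port tickle" as the two simplest test types. The sketch below shows roughly what such checks amount to when run from a single machine with Python's standard `socket` module. It is not how ThousandEyes implements its tests, and the function names `dns_resolves` and `tcp_port_open` are hypothetical.

```python
import socket


def dns_resolves(hostname):
    """Return True if the hostname resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Resolution failed (unknown name, resolver unreachable, etc.).
        return False


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
```

A caller might run `dns_resolves("example.com")` followed by `tcp_port_open("example.com", 443)` on a schedule and alert when either returns `False`. A single vantage point only shows reachability from that one location, whereas the deck describes running tests from all over the world.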
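
Slide 11 reports a ~33% service time improvement and smoother service times with acceleration. The deck does not show the underlying calculation, so the sketch below assumes the simplest reading: the relative change in mean service time, with the coefficient of variation as one possible measure of consistency. Both function names are hypothetical.

```python
from statistics import mean, pstdev


def relative_improvement(baseline_ms, accelerated_ms):
    """Fractional reduction in mean service time (0.33 means 33% faster)."""
    base = mean(baseline_ms)
    return (base - mean(accelerated_ms)) / base


def coefficient_of_variation(samples_ms):
    """Standard deviation relative to the mean; lower means more consistent."""
    return pstdev(samples_ms) / mean(samples_ms)
```

With illustrative (not measured) samples such as `[300, 330, 270, 300]` ms without acceleration and `[200, 205, 195, 200]` ms with it, `relative_improvement` returns one third, and the coefficient of variation is lower for the accelerated samples.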
