How Morgan Stanley Uses Wire Data for IT
Operational Intelligence
Chris Kozlowski | Executive Director, Morgan Stanley
Interop NY 2013: How Morgan Stanley Uses Wire Data for IT Operational Intelligence

Data off the wire has long been recognized as the definitive source of truth for IT operations, especially as all modern applications and systems rely on the network. As the saying goes, “Packets don’t lie.” But packets alone aren’t enough. What is required is the ability to analyze and interpret wire data, which has never been available continuously, in real time, or in a way that was easily understood. Morgan Stanley is using a new technology that makes real-time wire data analytics accessible to everyone. With a view into all of the L2-L7 conversations occurring between distributed systems, IT teams can solve a number of operational challenges, including troubleshooting, monitoring security, optimization, and more.

Published in: Technology, Business

Transcript of "Interop NY 2013: How Morgan Stanley Uses Wire Data for IT Operational Intelligence"

  1. How Morgan Stanley Uses Wire Data for IT Operational Intelligence
     Chris Kozlowski | Executive Director, Morgan Stanley
     Raja Mukerji | Founder/President, ExtraHop Networks
  2. “The network is the computer.” – John Gage, Chief Researcher and Director of the Science Office for Sun Microsystems in 1984
  3. What Is Wire Data? [Diagram: OSI model stack between "Transmit Data" and "Receive Data": User, Application Layer, Presentation Layer, Session Layer, Transport Layer, Network Layer, Data Link Layer, Physical Layer, joined by the Physical Link]
  4. How We Used Wire Data
     •  Manually pulled from sniffers, massaged by experts
     •  Purpose-built devices requiring expertise
     •  Problems with this approach:
        •  Difficult to know where and when to deploy
        •  Sniffers hold a limited amount of data
        •  Learning curve is steep
  5. Making Wire Data Accessible
  6. Making Wire Data Accessible
     •  Up to 20 gigabits per second, with real-time dissection, analysis, baselining and alerting
     •  Layer 2 to Layer 7 analysis
     •  Ethernet, IP, and TCP
     •  Common enterprise application-layer protocols such as HTTP, DNS, CIFS, SQL, SSL, LDAP, ICA, FIX, FTP, NFS, SMTP, and MQ
     •  Intuitive interface into this data
     •  Historical view (weeks to months rather than hours to days)
     •  Drill-down into the data for that time context
     •  Shift perspective according to the protocol of interest, or the vantage point of a particular device or application
  7. Making Wire Data Accessible
  8. Making Wire Data Accessible
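The "baselining and alerting" capability listed above can be illustrated with a minimal Python sketch. This is not how the product described in the slides works; it only shows the general idea of comparing each new measurement against a trailing baseline. The function name, the window size, the three-sigma threshold, and the sample response times are all assumptions made for illustration.

```python
from statistics import mean, stdev


def baseline_alerts(samples, window=5, threshold=3.0):
    """Return the indexes of samples that deviate from a trailing baseline.

    The baseline for each sample is the mean of the previous `window`
    samples; a sample is flagged when it lies more than `threshold`
    standard deviations away from that mean. Both defaults are
    hypothetical tuning values.
    """
    alerts = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Guard against a perfectly flat baseline (sigma == 0).
        limit = threshold * max(sigma, 1e-9)
        if abs(samples[i] - mu) > limit:
            alerts.append(i)
    return alerts


# Hypothetical server response times in milliseconds, one per interval.
response_ms = [20, 21, 19, 20, 22, 21, 20, 95, 21, 20]
print(baseline_alerts(response_ms))
```

A trailing window keeps the baseline adaptive, at the cost of letting a single spike inflate the baseline for the next few intervals.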
  9. How We Are Using Wire Data Today
     •  Real-time analysis
     •  Historical look-back
     •  Trending and alerts
     •  Supplementing other instrumentation points
     •  First place where “eyes meet glass” when troubleshooting
  10. Use Cases
     •  Issue triage
     •  Correlation of events across layers
     •  Alerting
     •  Data mining
     •  Post-release review
     •  Capacity planning
  11. “War Stories”
     •  Payment Processing
     •  Virtual Packet Loss
     •  Database Schema Change
  12. Real-Time Operational Intelligence for Payment Processing. A large financial services firm saw many duplicate orders, but could not find the source of the problem or understand its scope. [Diagram: the Director of Ops asks "Why? Where? For whom? How frequently?"; the Developer, Virtual Administrator, Network, DBA, and Storage Administrator are each shown with a question mark]
  13. Webservers Don’t Log Application Payload. [Figure: example of a standard payment processing protocol (Orbital)] Parsing specific transaction details for monitoring every transaction across all applications was previously infeasible. Web servers don’t log payload data. Network monitoring is blind to it. Legacy APM tools would require agents, if they could do it at all.
  14. Driving Business Insight. For the first time, ExtraHop offers a simple, non-invasive way to monitor any transaction element in the payload for any application in real time and can log it to a solution like Splunk for analytics.
  15. Driving Business Insight. ExtraHop provides a real-time view of payment process time, errors, and performance across tiers. However, ExtraHop can’t index and search every transaction. ExtraHop “forwards” precise payment processing feeds to Splunk without agents or server-based forwarders.
  16. Driving Business Insight. All transactions previously un-loggable now become searchable analytics to answer the business question, “Who, when, how, and where did duplicate transactions occur?”
  17. Driving Business Insight. A simple search: transaction order-id | search eventcount >1 finds all business transactions that had a duplicate order ID. “It’s like being able to find a snowflake in an avalanche.” – Director of Operations, Large Financial Company. ExtraHop + Splunk can now answer nearly impossible questions without massive time and expense. For example, which customers and what orders were duplicated in the last 12 months, when did it happen and to whom?
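The logic behind the duplicate-order search above (group events by order ID, then keep the groups with more than one event) can be sketched in plain Python. This is an illustration of the grouping idea only, not the Splunk query itself or any ExtraHop API. The field names `order_id`, `customer`, and `time`, and the sample events, are hypothetical.

```python
from collections import defaultdict


def find_duplicate_orders(events):
    """Group events by order ID and return only IDs seen more than once.

    `events` is an iterable of dicts with a hypothetical "order_id" key.
    The result maps each duplicated order ID to its list of events.
    """
    groups = defaultdict(list)
    for event in events:
        groups[event["order_id"]].append(event)
    return {oid: evts for oid, evts in groups.items() if len(evts) > 1}


# Hypothetical payment events extracted from transaction payloads.
events = [
    {"order_id": "A100", "customer": "c1", "time": "2013-10-01T09:00:00"},
    {"order_id": "A101", "customer": "c2", "time": "2013-10-01T09:00:01"},
    {"order_id": "A100", "customer": "c1", "time": "2013-10-01T09:00:02"},
]

duplicates = find_duplicate_orders(events)
for order_id, evts in duplicates.items():
    print(order_id, [e["time"] for e in evts])
```

Because each group keeps the full events, the same result also answers the follow-up questions on the slide: which customer was affected and when each duplicate occurred.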
  18. Virtual Packet Loss: Shortly after some app servers are transitioned to VMs, users begin to complain of slower application performance. Blame the network. [Graphic label: virtualization]
  19. Virtual Packet Loss: CPU load on the app servers does not indicate a problem.
  20. Virtual Packet Loss: Infrastructure monitoring does not indicate any dropped packets on the switch ports or router links.
  21. Virtual Packet Loss: Spot checking a few flows in a packet sniffer does not reveal any problems.
  22. Virtual Packet Loss: TCP connection analysis reveals large numbers of retransmission timeouts (RTOs) to several VMs and higher levels of jitter on round-trip times.
  24. Virtual Packet Loss: VMware vCenter indicates multiple VMs on the same physical host heavily oversubscribing available memory and CPU.
  26. Virtual Packet Loss: Conclusion: Oversubscribed hypervisors may look exactly like a degraded network.
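The TCP connection analysis in this story, which surfaced what spot checks missed, amounts to aggregating RTO counts and round-trip-time variation per server across all connections. A minimal Python sketch of that aggregation follows. The record layout, the host names, the thresholds, and the choice of population standard deviation as the jitter measure are all assumptions for illustration, not details from the slides.

```python
from collections import defaultdict
from statistics import pstdev


def summarize_tcp(records):
    """Aggregate per-server RTO totals and round-trip-time jitter.

    `records` is an iterable of (server, rtt_ms, rto_count) tuples, one
    per observed connection. Jitter is taken here to be the population
    standard deviation of the round-trip times (an assumption).
    """
    rtts = defaultdict(list)
    rtos = defaultdict(int)
    for server, rtt_ms, rto_count in records:
        rtts[server].append(rtt_ms)
        rtos[server] += rto_count
    return {s: {"rtos": rtos[s], "jitter_ms": pstdev(rtts[s])} for s in rtts}


def suspects(summary, max_rtos=5, max_jitter_ms=10.0):
    """Return servers whose RTO total or jitter exceeds hypothetical limits."""
    return sorted(
        s for s, m in summary.items()
        if m["rtos"] > max_rtos or m["jitter_ms"] > max_jitter_ms
    )


# Hypothetical observations: one virtualized and one physical app server.
records = [
    ("vm-app-1", 2.0, 0), ("vm-app-1", 48.0, 4), ("vm-app-1", 3.0, 3),
    ("phys-app-1", 2.0, 0), ("phys-app-1", 2.5, 0), ("phys-app-1", 2.2, 1),
]

summary = summarize_tcp(records)
print(suspects(summary))
```

Looking at every connection rather than a few sampled flows is what makes the pattern visible: a handful of flows to an oversubscribed VM can look healthy while the aggregate does not.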
  27. Database Schema Change
     •  The app team rolls out a new version of the application after completing the latest sprint.
     •  Users complain that certain functions in the application are broken.
  28. Database Schema Change: Service checks indicate that the application is running normally, i.e., all green. Log files indicate that the application is rarely generating exceptions.
  29. Database Schema Change: Passive DB monitoring reveals that certain stored procedures began failing after the new rollout.
  30. Database Schema Change: DB error messages suggest that the failures are the result of a schema change.
  31. Database Schema Change: Conclusion: Production-level visibility is required to mitigate risk during new application roll-outs.
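The post-release check in this story, finding stored procedures that began failing only after the rollout, can be sketched as a before/after comparison of observed database calls. This is a minimal Python illustration under stated assumptions: the record layout, the procedure names, and the integer timestamps are hypothetical, and the passive monitoring that would produce such records is not shown.

```python
from collections import Counter


def newly_failing(calls, release_time):
    """Return stored procedures that fail after a release but not before.

    `calls` is an iterable of (timestamp, procedure, ok) tuples observed
    on the wire; `release_time` uses the same timestamp units.
    """
    before = Counter()
    after = Counter()
    for timestamp, procedure, ok in calls:
        if not ok:
            bucket = after if timestamp >= release_time else before
            bucket[procedure] += 1
    # Procedures that already failed before the release are excluded.
    return sorted(p for p in after if before[p] == 0)


# Hypothetical call log; the release goes live at timestamp 10.
calls = [
    (1, "get_account", True),
    (2, "post_payment", True),
    (3, "get_report", False),
    (11, "post_payment", False),
    (12, "post_payment", False),
    (13, "get_account", True),
    (14, "get_report", False),
]

print(newly_failing(calls, release_time=10))
```

Excluding procedures that were already failing before the release keeps the result focused on regressions introduced by the rollout.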
  32. Questions?