Monitoring Stock Market Trading Networks with nGenius


Published in: Business, Economy & Finance

  1. 1. Monitoring Real-Time Trading Networks with the nGenius Solution

Detecting and troubleshooting microbursts and out-of-sequence packets for latency-sensitive market data and trade/exchange applications and protocols.

Introduction

In the highly competitive investment services industry, differentiation can often be measured in milliseconds. The firms that can process and analyze the ever-growing volumes of market and trade data the fastest can execute trades more swiftly and pass this advantage on to their customers. In fact, according to Information Week, “a millisecond advantage in trading applications can be worth $100 million a year to a major brokerage firm.” [1] The ability to identify and monitor microbursts, latency and packet loss for critical market data and trade/exchange applications and the protocols associated with them, such as FIX, Multicast, TIBCO and others, is essential to the day-to-day operation of financial exchanges, trading houses, hedge funds, investment banks and other global financial institutions and market exchanges with trading desks.

Algorithmic trading is driving firms to re-architect their trading systems and networks for low latency and optimal performance. Firms are automating trade processes and adding throughput and processing power to their market data and trading systems. Market data delayed by tens of milliseconds may no longer be usable for trading purposes, so designing the network to minimize latency is also essential.

In addition, these networks support 24x7 access for customers checking account balances and current stock prices or validating orders that have been executed, and for brokers and traders viewing their clients’ investment and trading history or researching investment alternatives. This increased dependency on the network has brought many IT departments additional responsibilities and formidable challenges in managing, maintaining, and optimizing the performance of their valuable, global networks.

NetScout applies its high-definition CDM technology to identify and monitor application traffic flows, mapping them to the powerful real-time and historical views and reports available in nGenius® Performance Manager. This approach enables nGenius InfiniStream and nGenius Probes, when strategically placed throughout the organization’s network, to collect and deliver rich, actionable information on the utilization and conversations from market trading-specific application traffic. Network professionals can then proactively manage faults, monitor application responsiveness, plan network bandwidth changes, and rapidly troubleshoot problems.

Network Considerations

In the wake of widespread consolidations, today’s investment services organization’s network landscape typically spans multiple data centers and geographically dispersed sales offices. These locations, all connected by high-speed networks, host applications affecting high-value financial services activities, such as updating market trading data, verifying executed orders, accessing customer account information, and performing market research. Additionally, given the critical nature of the data involved, redundant and/or disaster recovery sites are well established to protect against natural or man-made disasters.

At the same time, the networks for investment services organizations are evolving and becoming more complicated, implementing SOA (Service-Oriented Architecture) and Web-based applications in support of customer relationship management (CRM) and enterprise resource planning (ERP) activities. Many financial service organizations are rolling out VoIP initiatives triggering other IT projects, such as VPNs,

[1] “Wall Street’s Quest to Process Data at the Speed of Light,” Information Week, April 23, 2007.
  2. 2. MPLS networks or Quality of Service (QoS) prioritization to maintain high-quality voice and ensure the delivery of mission-critical market trading applications. In the core, to support the speed and volume of data, users, and transactions, market trading firms rely on Gigabit EtherChannel and 10 Gigabit Ethernet links.

Communications between headquarters, data centers, and field sales offices typically occur over a variety of Wide Area Network (WAN) segments. Local sales office connections may be T1 or E1 circuits, while headquarters and data center connections are more likely OC-3/STM-1 or OC-12/STM-4 speeds.

Business Challenges

To generate revenues and remain competitive, it is essential for today’s market trading companies to ensure that all their products and services are available, at all times and across the global organization. Specific challenges include:

  • Maintaining competitive advantage by constantly adapting their trading strategies and increasing the speed of trading in the wake of higher market data volume, quoting in penny increments, and new regulatory demands facing the industry.
  • Avoiding delays in quotes, customer information, algorithmic trading, or on-line transactions. Because the network transmits funds, delays translate to lost revenue and unhappy customers.
  • Assimilating and incorporating all the changes in users and applications that follow mergers, acquisitions and consolidations.

Network Management Challenges

Maintaining the networks critical to their charter and the applications transported across them are key challenges facing market trading IT departments today. Some specific challenges include:

  • Troubleshooting network and application response time issues, with latency, packet loss, and out-of-sequence problems that result in performance degradations across the widely distributed network.
  • Maintaining the delivery of time-sensitive applications such as real-time stock price updates that depend on TIBCO used with IP Multicast to ensure reliability, FIX Protocol-based applications for electronic trading or Voice over IP to sales offices.
  • Supporting disaster recovery and business continuity plans for all aspects of the company’s business data and voice services throughout headquarters, data centers, and branch office facilities.

To meet the myriad challenges investment services companies face, a single unified performance management system is required: one that assures the efficient delivery of applications across the distributed network, troubleshoots network problems in real time and provides unprecedented visibility to control the cost of network operations. They need the nGenius Performance Management Solution.

Meeting the Need

The nGenius Solution meets the needs of market trading companies by providing the real-time and historical information necessary to keep the network running at peak performance with the lowest possible latency. Using nGenius InfiniStream, nGenius Probes and nGenius Collectors which have been strategically placed throughout the organization’s network to collect and deliver rich, actionable information on the utilization, conversations and packet details from market trading- or exchange-specific application traffic, network professionals can proactively manage faults, monitor application responsiveness, plan network bandwidth changes, and rapidly troubleshoot problems.

The nGenius Performance Manager provides global market trading companies with:

  • Real-time monitoring and historical reporting analysis (utilization, response time, hosts and conversations) of market trading applications, such as FIX, OPRA, MDP, PGM, etc.
  • Ability to identify IP Multicast groups as unique business applications as well as view the interaction between publishers and groups
  • Ability to view TIBCO statistics, including error traffic and retransmissions
  • Real-time microburst alarming
  • Robust error detection, such as monitoring out-of-sequence packets/retransmissions in real time, and the ability to generate an alarm when the gap exceeds a pre-defined value
  • Response time analysis which distinguishes network flight times versus application processing times
  • Integrated application-level monitoring and analysis with continuous packet capture for robust, post-event data mining
  • KPI-to-flow-to-packet approach to problem diagnosis for streamlined troubleshooting that lowers mean time to resolution (MTTR)

Broadest Visibility into Revenue-Impacting Market Trading Applications

The nGenius Solution, built on NetScout’s Common Data Model (CDM) architecture, provides extensive visibility into network resources usage and in-depth application identification and analysis. The nGenius Solution recognizes, discovers, and monitors all the applications in your network including more than 200
  3. 3. well-known applications; complex applications that use a range of ports, such as SAP, Citrix or .NET; custom applications designed specifically for your business; web-based applications identified by URLs; and next-generation communication applications, such as VoIP, video, instant messaging, and peer to peer.

Market trading applications monitored “out of the box” by the nGenius Solution include FIX Order Single, FIX Other, MDP, PGM, CTS, CQS, OPRA and Multicast Push and Monitoring. (See Figure 1 for a complete list).

Figure 1. Market trading applications monitored “out of the box” by the nGenius Solution include:

Multicast and TIBCO
  • TIBCO Multicast - 64101
  • TIBCO Message type 2 - 64102
  • TIBCO NAK - 64103
  • TIBCO Keep Alive - 64104
  • TIBCO NULL - 64105
  • TIBCO Unicast - 64106
  • TIBCO Message type 7 - 64107
  • TIBCO Message type 8 - 64108
  • TIBCO Other - 64120

FIX Protocol
  • FIX Order Single
  • FIX Other

Other Market Trading Applications
  • MDP (Market Data Platform)
  • PGM (Pragmatic General Multicast)
  • CTS (Consolidated Tape System)
  • CQS (Consolidated Quote System)
  • OPRA (Options Price Reporting Authority)

FIX Protocol Support

The Financial Information eXchange (FIX) protocol, which specifies the standards-based method of exchanging trade-related messages and real-time electronic communications, is crucial to many order management and trading systems. With broad adoption as the messaging standard, it is an essential service to the day-to-day operation of applications deployed by virtually every major stock exchange and investment bank with electronic trading, as well as by many global mutual funds, money managers, and investment firms.

The FIX Protocol Interface specification divides the message types into two categories: Session for Administrative messages, such as heartbeat, test request, and rejects; and Application for Business messages, such as Order Single and Execution Reports. The nGenius Solution collects and analyzes FIX traffic and displays it in two categories:

  • FIX Order Single displays the Application for Business message details for the actual order transaction traffic flows and conversations - specifically Order Single and Execution Report.
  • FIX Other monitors Session for Administrative messages, such as heartbeat, test requests and rejects.

[Figure] nGenius Newspaper Report for Response Time over Time and Response Time Distribution over Time. The nGenius Solution identifies FIX Order Single conversations and traffic flows separately from FIX Other messages, as illustrated in this response time report analyzing performance levels in this financial services network.
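The split between FIX Order Single and FIX Other described above can be illustrated with a minimal sketch. It is not NetScout's implementation: the function names are hypothetical, and it assumes classic tag=value FIX encoding with the SOH delimiter, using the MsgType (tag 35) codes from the public FIX specification (D = New Order Single, 8 = Execution Report, 0 = Heartbeat, 1 = Test Request, 3 = Reject). Treating every other message type as FIX Other is also an assumption.

```python
SOH = "\x01"  # standard field delimiter in tag=value FIX encoding

# MsgType (tag 35) values from the public FIX specification:
# "D" = New Order Single, "8" = Execution Report.
ORDER_SINGLE_TYPES = {"D", "8"}


def parse_fix(raw: str) -> dict:
    """Split a tag=value FIX message into a {tag: value} dict."""
    fields = {}
    for part in raw.strip(SOH).split(SOH):
        tag, sep, value = part.partition("=")
        if sep:  # skip fragments that are not tag=value pairs
            fields.setdefault(tag, value)
    return fields


def classify(raw: str) -> str:
    """Hypothetical classifier mirroring the two categories in the text."""
    msg_type = parse_fix(raw).get("35")
    if msg_type is None:
        return "Not FIX"  # no MsgType field found
    if msg_type in ORDER_SINGLE_TYPES:
        return "FIX Order Single"
    return "FIX Other"  # heartbeat, test request, reject, and anything else
```

A monitoring tool would apply such a classifier per packet or per reassembled message and then aggregate volume and response time for each category separately.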
  4. 4. The nGenius Solution uses innovative, deep-packet inspection to track and classify FIX protocol application traffic flows, collecting volume, utilization, users, and conversation statistics as FIX Order Single and FIX Other. It also collects response time metrics for FIX Order Single. These metrics provide the basis for ensuring optimal performance of FIX protocol applications, making it possible for IT staff to easily:

  • Evaluate FIX protocol application utilization by recognizing and understanding existing network patterns and identifying FIX protocol traffic flows and applications running over the network.
  • Examine traffic activity and maximize the performance for FIX Order Single specific transactions versus all other FIX traffic by pinpointing application degradations and analyzing application response-time metrics.
  • Analyze FIX Order Single and FIX Other protocol-based applications and all other applications simultaneously by monitoring and trending the patterns of application behavior side-by-side on segments and virtual circuits in the network.
  • Balance the delivery and bandwidth requirements of FIX protocol applications by monitoring and trending traffic patterns and demands to particular hosts and locations for efficient capacity planning purposes.

Support for IP Multicast and TIBCO

Market trading companies use IP Multicast to stream large amounts of data, such as stock price updates, to multiple sites simultaneously and to do so efficiently and in real time. TIBCO is used to ensure reliability of IP Multicast by providing congestion control, bandwidth management and error correction.

IP Multicast’s one-to-many, many-to-many or many-to-one nature creates an environment where it is difficult to distinguish between conversations. To properly manage the network and accommodate IP Multicast applications, network professionals require a clear picture of how publishers and groups converse. Most performance management systems do not monitor or report on IP Multicast activity and have limited ability to recognize or alarm on microbursts that disrupt streaming IP Multicasts. These microbursts occur in a short span of time and add additional levels of complexity, making them very difficult to isolate. As a result, IT organizations end up devoting a great deal of time ascertaining the location of a problem, rather than focusing on solving the problem itself.

With the ability to view host group class D IP addresses, NetScout’s nGenius Solution can track IP Multicast conversations, whether they are one-to-many, many-to-many or many-to-one. This ability to monitor individual IP Multicast conversations enables the network team to:

  • Identify and track IP Multicast groups and TIBCO protocol messages as unique business applications to identify which applications, servers and locations are most active
  • Manage real-time IP Multicast applications alongside all other networked applications in order to quickly troubleshoot interactions that could impact service delivery
  • View and alarm on critical IP Multicast parameters and other network conditions, including microbursts, in order to reduce performance degradations before they impact users
  • View the interaction between publishers and groups, as well as the applications used during the particular conversation
  • Join IP Multicast sessions and alarm on TIBCO retransmissions and other error types
  • Cost-effectively assure delivery of IP Multicast applications across a globally distributed network, troubleshoot problems in real time, optimize bandwidth and application delivery, and report to executives, business managers and IT staff worldwide.

[Figure] nGenius Performance Manager can monitor individual multicast conversations, revealing top conversations from source to group. This information is crucial for capacity planning and resource allocation.
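As a rough illustration of ranking multicast conversations from source to group, the sketch below keys traffic volume on (source, group) pairs and keeps only destinations in the class D range (224.0.0.0/4). The function name and the packet tuple layout are assumptions made for this example; nothing here reflects how nGenius stores or reports its data.

```python
import ipaddress
from collections import Counter


def top_multicast_conversations(packets, n=3):
    """Rank (source, group) pairs by bytes sent to class D destinations.

    packets: iterable of (src_ip, dst_ip, size_bytes) tuples
             (a hypothetical record layout for this sketch).
    """
    volume = Counter()
    for src, dst, size in packets:
        # is_multicast is True for IPv4 addresses in 224.0.0.0/4 (class D)
        if ipaddress.ip_address(dst).is_multicast:
            volume[(src, dst)] += size
    return volume.most_common(n)
```

Keying on the pair rather than on the group alone is what lets many-to-one and many-to-many traffic be separated into individual publisher-to-group conversations.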
  5. 5. Support for Other Market Trading Applications

In addition to supporting FIX and Multicast applications, the nGenius Solution supports other applications and protocols that are important to market trading organizations, including:

  • MDP (Market Data Platform) - a multicast distribution platform
  • PGM (Pragmatic General Multicast) - a multicast transport protocol used for multicast applications with basic reliability requirements. The protocol is intended for multicast applications that require ordered packets from a source device to multiple recipients.
  • CTS (Consolidated Tape System) - multicast protocol for processing of trade information.
  • CQS (Consolidated Quote System) - multicast protocol for processing of quote information.
  • OPRA (Options Price Reporting Authority) - provides last sale information and current options quotations from a committee of Participant Exchanges designated as the Options Price Reporting Authority.

The nGenius Performance Management Solution incorporates real-time and historical reporting of all applications - including FIX, other market trading applications, those delivered via IP Multicast and the everyday applications necessary for running any business - eliminating the need for separate products to manage each individual application. Displaying all applications side by side as they compete for bandwidth and resources, not just selected applications, provides total visibility and context into the networking environment. Microburst alarming and monitoring allow the nGenius Solution to pinpoint utilization spikes that last mere milliseconds and affect critical market trading applications and IP Multicast conversations. Therefore, network professionals gain the knowledge and insight they need to ensure service levels, detect and more quickly troubleshoot degradations, support intelligent capacity planning and report performance results to the organization.

Response Time Analysis

Response time is a primary indicator of network or application service quality and often the primary complaint of end users regarding the network. Tracking and isolating the sources of application response time problems are crucial because in trading transactions, milliseconds are worth millions. The nGenius Solution provides application response time metrics specifically for market trading applications listed in Figure 1. The response time analysis is provided with up to one-minute granularity to aid in establishing financial network services at the appropriate quality levels.

Unique KPI-to-Flow-to-Packet Approach

NetScout takes a unique approach to diagnosing network issues, which we call “KPI to Flow to Packet.” (See Figure 2) This approach allows users to view only the data they need to diagnose an issue and alleviates the need to sift through mounds of irrelevant data. In each case, a single right click of the mouse enables you to drill down into progressively deeper levels of detail:

  • KPI - Key Performance Indicators (KPIs) often measure aspects of the user experience, such as errors, jitter, response time, etc., and therefore are useful in providing early detection of performance issues.
  • Flows - Once detected, the nGenius CDM architecture allows users to view the applications, conversations, hosts and other flows associated with the specific KPI.
  • Packets - When needed, the nGenius System can provide visibility into the individual packet details (bounce charts, packet decodes, etc.) associated with each flow, providing sub-second granularity.

Consider this example: An nGenius Solution microburst alarm emails the network manager to inform him that a WAN link exceeded its capacity for 50ms starting at 14:15.606. Since this alarm is in real time and it provides the start time and end time for the microburst, when you launch link utilization over time, you can easily zero in on the 50 millisecond slice in question. A right click of the mouse allows you to launch a packet decode to determine what was contributing to the burst and if it caused any transactions to be incomplete.

Figure 2. Illustrated here is the unique KPI-to-Flow-to-Packet approach NetScout Systems takes to diagnose and resolve network issues.

  • DATA SET: KPIs (Session Errors, VoIP QoS Errors and Application Response Time, etc.) - FUNCTION: WARNING (Service Health Early Problem Indicators)
  • DATA SET: FLOWS (Applications, Services, Conversations, Utilization, Volume, etc.) - FUNCTION: ISOLATION (All Application Activity and Complex Service Relationships)
  • DATA SET: PACKETS (Headers and Payload) - FUNCTION: EVIDENCE (Definitive Sub-second Session and Transaction Details)
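The microburst example above (a link running over its threshold for a few tens of milliseconds) can be sketched as a windowed utilization check. This is a simplified illustration under stated assumptions, not the nGenius algorithm: the function name, the `(timestamp_us, size_bytes)` record layout, and the use of fixed, non-overlapping 5 ms windows are all choices made for this example.

```python
from collections import defaultdict


def find_microbursts(packets, link_bps, window_us=5000, threshold=1.0):
    """Return [(window_start_us, utilization)] for windows at or over threshold.

    packets:   iterable of (timestamp_us, size_bytes), timestamps as integer
               microseconds (hypothetical layout for this sketch).
    link_bps:  nominal link speed in bits per second.
    window_us: window length in microseconds (5000 us = 5 ms).
    threshold: fraction of link capacity that counts as a burst.
    """
    bits_per_window = defaultdict(int)
    for ts_us, size in packets:
        bits_per_window[ts_us // window_us] += size * 8

    # Bits the link can carry in one window at its nominal rate.
    capacity_bits = link_bps * window_us / 1_000_000

    bursts = []
    for idx in sorted(bits_per_window):
        utilization = bits_per_window[idx] / capacity_bits
        if utilization >= threshold:
            bursts.append((idx * window_us, utilization))
    return bursts
```

A one-second average over the same traffic would hide such a spike almost entirely, which is why the window length matters more than the threshold when looking for bursts that last only milliseconds.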
  6. 6. Proactive Performance Management with Sophisticated Problem Detection and Alarming

Studies show that in 50% to 75% of cases, network problems are discovered by the network users. When dealing with an application where degradations of milliseconds could severely impact the financial outcome, more effective approaches are necessary. Maintaining network performance for critical trading applications means detecting, diagnosing and rectifying issues that may impede their delivery. In market trading firms, this means identifying, in the earliest stages, problems with packet loss, out-of-sequence packets, latency or delay. The nGenius Solution is well positioned to monitor all the traffic in the network to quickly recognize a number of potential signs of network degradations.

[Figure] Microburst alarms are available in real time, provide the application and conversation details for the contributors, and allow you to drill down into the specific packets for the alarm time window.

Packet Loss or Out-of-Sequence Packets

NetScout’s Sequence Number KPI watches for gaps in the sequence numbers and generates an alarm when the gap exceeds a predefined value. OPRA, MDP and PGM are examples of protocols that use the Sequence Number KPI to track out-of-sequence packets. This KPI is also used to monitor packet loss for applications such as MDP that use a fixed offset for the location of the sequence number (for MDP, offset is 0).

The Sequence Number KPI provides/returns the following information:

  • Previous sequence number
  • Current sequence number
  • Gap in sequence numbers
  • Multicast address
  • Source address

Microburst Alarms

Power Alarms are time-over-threshold alarms as granular as 1 second that can be set on link utilization, as well as on specific applications’ utilization, such as FIX Order Single. A Microburst Alarm is a more granular version of the Power Alarm that alerts IT when link utilization has exceeded its threshold for as little as 5 milliseconds. It is useful in identifying microbursts of activity that are otherwise difficult to detect and that can slow transaction rates.

Microburst Alarms are set in nGenius Probes and nGenius InfiniStream for instantaneous alarming. This approach differs from most other products that must process the data before sending the alerts, which can delay notification for as much as 15 minutes. The Microburst alerts include evidence of the top applications, hosts and conversations that contributed to the alarm to aid in identifying the source of the problem.

nGenius® K2

nGenius K2 uses NetScout’s patented, statistical behavior modeling to detect abnormal changes in network and application behavior in order to deliver early warning of performance issues. The product automatically learns the network’s behavior patterns and identifies performance anomalies without the manual configuration and guesswork of setting thresholds. By using variously sized time slices, nGenius K2 is able to detect different classes of problems, including short-term spikes, sustained shifts, and subtle, long-term performance drifts that are virtually impossible to catch manually. Because nGenius K2 groups related incidents, it generates fewer, more intelligent alerts, allowing network operations to focus on only the most critical issues. In addition, by linking the Analytics Alarms with detailed diagnostics information, it facilitates faster diagnosis and more accurate interpretation of performance issues, shortening costly downtime. For these reasons, companies that use Enterprise Management Systems (EMS) such as HP Operations Manager or IBM Tivoli NetView value nGenius K2.

Network forensics for intermittent problem analysis

Traffic problems can severely disrupt the flow of critical market trading applications, and because they can happen in such a narrow time span, they are often hard to isolate. For investigating this type of intermittent performance issue, the nGenius InfiniStream, with up to 8TB of storage, continuously captures and stores
  7. 7. a complete packet-by-packet audit trail to support at-a-glance identification and quick resolution of difficult, subtle, and intermittent network problems.

In addition to providing post-mortem packet analysis by capturing network traffic 24x7 and storing it to disk, the nGenius InfiniStream simultaneously delivers real-time and historical monitoring for revealing key performance metrics such as traffic and application utilization, network talkers, conversations, error conditions, and response times. The tight integration of these two functions makes the nGenius Solution the most robust, functional and feature-rich solution for the market trading industry.

Global capacity planning

Having adequate capacity is particularly crucial for avoiding performance-degrading bottlenecks for market trading/exchange applications. The nGenius Solution provides the trended, historical information needed to ensure adequate bandwidth and to optimize the delivery of networked electronic trading services. Baseline and forecast reports enable network engineers to fine-tune traffic flows across the enterprise, understand the normal behavior patterns of various critical applications, and make informed decisions on where to invest resources.

[Figure] The ability to view 1-second peaks is an important tool for capacity planning.

Redundancy and Scalability

Lastly, enterprise-class scalability and redundancy are crucial to the investment services industry. Firms must maintain operations at all costs and under all circumstances. The nGenius Solution provides the following capabilities for continuous performance monitoring:

  • Scalability. Capable of being deployed in a distributed manner, the nGenius Servers work in parallel, rolling up data to a designated master server that provides a complete enterprise-wide view of networked application performance.
  • Disaster recovery. The nGenius Performance Manager Standby Server functions as a co-located backup with a main server to protect against a hardware or database failure. Alternatively, it can be deployed at a remote site for seamless business continuity in the event of a disaster.
  • Third-party EMS integration. By integrating the nGenius Solution with other industry-leading management and business service solutions often installed in investment services networks, such as HP (Operations Manager, Business Availability Center and Network Node Manager) or IBM (Tivoli NetView, Tivoli Netcool/OMNIbus and Tivoli Enterprise Console), performance monitoring capabilities are expanded, enabling better visibility and control over mission-critical networked and application resources.
  • Data Export. In addition, NetScout’s CDM information is available to other third-party applications via the Common Data Export (CDE) functionality. The CDE supports a set of interfaces based on industry standards that facilitates the attachment of a wide variety of higher-level tools that can be used to distribute information, integrate knowledge into various management platforms, and exercise resource control.

Conclusion

Market trading and exchange applications are the lifeblood of investment services organizations, thus their performance levels receive a great deal of attention. The nGenius Solution is uniquely positioned to provide comprehensive visibility into a broad range of market trading applications, as well as their associated hosts and conversations, response time performance, and error conditions. In addition, the nGenius Solution provides tightly integrated capabilities for packet capture and decode, early warning alarming, and historical reporting and trending. Armed with evidence from their own network’s day-to-day operations, IT organizations are better able to reduce MTTR as they troubleshoot degradations, respond to emerging utilization and performance problems, and plan for future bandwidth changes.
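The Sequence Number KPI described earlier under “Packet Loss or Out-of-Sequence Packets” can be illustrated with a small gap detector. This is a sketch under assumptions, not NetScout's logic: the class and field names are hypothetical, the gap is taken to mean the count of missing sequence numbers between two consecutive packets of one stream, and late or duplicate packets are simply ignored. The alarm record carries the five fields the text lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapAlarm:
    """The five fields the Sequence Number KPI is said to report."""
    previous_seq: int
    current_seq: int
    gap: int
    multicast_address: str
    source_address: str


class SequenceGapMonitor:
    """Tracks the last sequence number per (group, source) stream."""

    def __init__(self, max_gap=0):
        self.max_gap = max_gap  # alarm when the gap exceeds this value
        self._last = {}

    def observe(self, multicast_address, source_address, seq):
        key = (multicast_address, source_address)
        prev = self._last.get(key)
        if prev is None or seq > prev:
            self._last[key] = seq  # only ever move the high-water mark forward
        if prev is None or seq <= prev:
            return None  # first packet, duplicate, or late arrival (ignored here)
        gap = seq - prev - 1  # number of sequence numbers skipped
        if gap > self.max_gap:
            return GapAlarm(prev, seq, gap, multicast_address, source_address)
        return None
```

Calling `observe` with sequence numbers 100, 101 and then 105 on the same stream returns an alarm for the third call with a gap of 3; a production detector would also need to handle sequence number wrap-around and sender restarts.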
  8. 8. Case Studies

Fixing FIX Visibility

A leading U.S. brokerage firm recognized they had a critical need for more visibility into FIX protocol-based application traffic flows soon after they had publicly introduced a new FIX protocol-based, on-line foreign exchange trading platform, which allowed traders to customize currency pairs and settings, set ticket size limits, and provided streaming spot prices in all major currencies via a “one-click” browser-based system for fast market access.

The brokerage needed more robust, automated analysis of FIX protocol application performance than the simple packet decode would provide, so NetScout put the development effort for FIX protocol on the fast track. In the end, adding support for FIX protocol-based application utilization, hosts and conversation tracking, and response time analysis was crucial to supporting this new application, as was the combined monitoring and recording of nGenius AFMon for automated analysis and post-event forensics troubleshooting. Fortunately, NetScout’s CDM architecture is designed for easy extensibility, so adding new application and protocol support is a quick and painless process.

Abnormal delay with market trading application

An East Coast financial institution was experiencing intermittent delay problems with one of its market data applications and users were starting to complain. Using the nGenius Solution, they investigated the application response time and saw that the delay was attributable to the server, not the network. In a few short clicks, they were able to determine that all the clients were accessing the same remote server. The server was overwhelmed by the sheer number of the client connections and requests for market data and became a bottleneck. Once network operations identified the problem, the support team was able to redirect the clients to the right source for production data.

Misconfigured clients disrupting WAN service

An intermittent service disruption on the WAN link of a U.S. brokerage firm had network operations wondering why there was an increase in traffic volume and a corresponding increase in packet loss. Using the nGenius Solution, the network operations team was able to view all the applications on that link and their associated volume and was thus able to quickly identify the two applications contributing to the load. Further investigation allowed them to identify the clients associated with the traffic. It turned out that a couple of the clients were misconfigured to use a remote server instead of the local server, causing the increased load on the WAN link. Network operations made a quick call to the remote support team, which was able to redirect the clients to the local server and restore the WAN to normal operations.

About NetScout Systems

NetScout Systems provides advanced network and application service assurance solutions that deliver complete visibility into real-time, packet/flow-based operational intelligence. IT operators at the world’s largest enterprises, government agencies, and service providers use the Sniffer and nGenius solutions to troubleshoot service degradations faster and more efficiently in order to reduce MTTR.

Our world-renowned Sniffer and nGenius solutions include:

  • Intelligent Data Sources for high capacity, deep-packet recording and monitoring
  • Analysis Software for real-time and historical network and application performance management, troubleshooting, capacity planning, and reporting
  • Advanced Intelligence for early detection and in-depth analysis of complex or specialized application services
  • Comprehensive, global support, consulting and training services

Corporate Headquarters: 310 Littleton Road, Westford, MA 01886-4105. Phone: 978-614-4000. Toll Free: 888-999-5946

European Headquarters: NetScout Systems (UK) Ltd., 100 Pall Mall, London SW1Y 5HP, United Kingdom. Phone: +44 (0)20 7321 5660

Asia/Pacific Headquarters: Room 105, 17F/B, No. 167 TunHwa N. Road, Taipei, Taiwan. Phone: +886 2 2717 1999

©2008 NetScout Systems, Inc. All rights reserved. NetScout, the NetScout logo, Network General, the Network General logo, nGenius, Sniffer, InfiniStream, Business Container, Business Forensics, NetVigil and Quantiva are trademarks or registered trademarks of NetScout Systems, Inc. Other brands, product names and trademarks are property of their respective owners. NetScout reserves the right, at its sole discretion, to make changes at any time in its technical information and specifications, and service and support programs. WP0908_01revA 2008-09-04