In this talk, Andre Langevin discusses how Geode forms the core of many Wall Street derivative risk solutions. By externalizing risk from trading systems, Geode-based solutions provide cross-product risk management at speeds suitable for automated hedging, while eliminating the back-office costs associated with traditional solutions built on trading systems.
4. A Crash Course in Wall Street Trading
• Big Wall Street firms have “FICC” trading businesses organized by market:
• Each business will trade “cash” and derivative products, but traders specialize in one or the other
• There may be a team of traders working to trade a single product
• Trading systems are product specific and often highly specialized:
• May have up to 50 different booking points for transactions
• Multiple instances of the same trading system, deployed in different regions
• Electronic markets mean that there are often external booking points to consolidate
• Managing these businesses requires a consolidated view of risk:
• Risk factors span products and markets – it is not sufficient to just look at the risk by trade or book
• Risk measures must be both fast and detailed to be relevant on the trading floor
• Desk heads aggregate risk from individual trades to stay within desk limits for risk
• Business heads aggregate risk across desks to stay within the firm’s risk appetite and regulatory limits
FICC: “Fixed Income, Commodities & Currencies”
5. Calculating Risk
• What is the “risk” that we are trying to measure?
• Trades are valued by discounting their estimated future cash flows
• Discount factors are based on observable market data
• Movement in markets can change the value of your trades
• “Trade Risk” is the sensitivity of each trade to changes in market data
• Markets are represented using curves:
• A curve is defined using observable rates and prices and then “built” into a smooth, consistent “zero curve” using interpolation
• Consistency is paramount:
• Most firms have a proprietary math library used for valuation and risk
• Use the same market data in all calculations to avoid basis differences
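The valuation and risk bullets above can be sketched in a few lines. This is a minimal illustration, not a production math library: the curve points, the linear interpolation of zero rates, continuous compounding and the one basis point parallel bump are all assumptions chosen for the example.

```python
import math

# Hypothetical zero curve: (maturity in years, continuously compounded zero rate).
ZERO_CURVE = [(0.5, 0.010), (1.0, 0.015), (2.0, 0.020), (5.0, 0.030)]


def zero_rate(curve, t):
    """Linearly interpolate the zero rate at time t (flat beyond the end points)."""
    if t <= curve[0][0]:
        return curve[0][1]
    if t >= curve[-1][0]:
        return curve[-1][1]
    for (t0, r0), (t1, r1) in zip(curve, curve[1:]):
        if t0 <= t <= t1:
            weight = (t - t0) / (t1 - t0)
            return r0 + weight * (r1 - r0)


def present_value(cash_flows, curve):
    """Discount each (time, amount) cash flow with exp(-r(t) * t) and sum."""
    return sum(amount * math.exp(-zero_rate(curve, t) * t)
               for t, amount in cash_flows)


def parallel_bump(curve, bump):
    """Shift every zero rate on the curve by the same amount."""
    return [(t, r + bump) for t, r in curve]


def trade_risk(cash_flows, curve, bump=0.0001):
    """Bump-and-revalue: change in value when the whole curve moves by `bump`."""
    bumped = present_value(cash_flows, parallel_bump(curve, bump))
    return bumped - present_value(cash_flows, curve)


# Hypothetical trade: 2-year bond paying a 3% annual coupon on 100 notional.
bond = [(1.0, 3.0), (2.0, 103.0)]
pv = present_value(bond, ZERO_CURVE)
risk = trade_risk(bond, ZERO_CURVE)
```

Because both the base and the bumped valuation use the same curve object and the same interpolation, the sensitivity is free of the basis differences the slide warns about.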
6. Technology Solutions that Work Badly
• The easiest thing to do is just book all of your trades using one trading system!
• Trading systems are product specific for many very good reasons, so this idea is a non-starter
• How about booking all of the hedges into the primary trading system?
• Cash product systems can’t price derivatives, so you have to invent simple “proxies” for them
• Have to build live feeds from one trading system into another – or book duplicates by hand
• The back office has to remove the duplicates before settlement and accounting
• How about adding up all of the risk from each trading system into a report?
• Almost impossible to make the valuations consistent across systems:
• Different yield curve definitions, and different market data sources feeding curves
• Different math libraries, and often a technology challenge to make them run fast enough
• Different calculation methodologies (is relative risk up or down?)
• Difficult to achieve speed needed to accurately compute hedge requirements
Cash Products: Cash products are securities that are settled immediately for a cash payment, such as stocks and bonds.
7. Filling in the Details of the Design
Event-based cross-product risk system using Geode
8. PDX Integration Glue
• PDX serialization is an effective cross-language bridge:
• PDX data objects bridge solution components in Java, C# and C++
• Avoid language-specific data types (e.g. C# date types) that don’t
have equivalents in other languages
• Structure PDX objects to optimize performance:
• May want to externalize sub-objects or lists into separate objects
• Balance speed of lookup with memory consumption
• Need to consider cluster locality
• JSON is a good message format:
• PDX natively handles JSON, but not XML
• C# works well with JSON, so the calculation engine and the
downstream consumers should consume easily
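As an illustration of the advice above about avoiding language-specific types, the sketch below keeps a trade message portable by carrying the date as an ISO 8601 string inside JSON. The field names are hypothetical, and the example uses only the Python standard library; it does not call the PDX API.

```python
import json
from datetime import date


def to_message(trade):
    """Serialize a trade to JSON, converting the date to a portable ISO string."""
    portable = dict(trade)
    portable["trade_date"] = trade["trade_date"].isoformat()
    return json.dumps(portable, sort_keys=True)


def from_message(message):
    """Parse the JSON message and restore the date on the consumer side."""
    trade = json.loads(message)
    trade["trade_date"] = date.fromisoformat(trade["trade_date"])
    return trade


# Hypothetical trade record.
trade = {
    "trade_id": "T1",
    "notional": 1000000.0,
    "currency": "USD",
    "trade_date": date(2016, 3, 1),
}
message = to_message(trade)
restored = from_message(message)
```

A consumer written in C# or C++ sees only strings and numbers, so it needs no knowledge of the producer's date type.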
9. Designing and Naming Data Objects
• The trade data model serves two distinct purposes:
• Descriptive data is only used for aggregation and viewing
• Model parameters are only needed to calculate risk
• Can be split into two regions to optimize performance
• Market data should follow the calculation design:
• Model data to align to the calculation engine’s math library to reduce
format conversions downstream
• Use “dot” notation to give business-friendly keys to objects:
• Create compound keys like “USD.LIBOR.3M” and “USD.LIBOR.6M” to allow business users to “guess” a key easily – promotes use of Geode data in secondary applications and spreadsheets
• Values in the “dot” name are repeated as attributes of the object
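The “dot” key convention above might look like the following sketch. The class name, field names and rate values are hypothetical, and a plain dictionary stands in for a Geode region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateQuote:
    # The parts of the key are repeated as ordinary attributes of the object.
    currency: str
    index: str
    tenor: str
    rate: float

    @property
    def key(self):
        """Compound, business-friendly key such as USD.LIBOR.3M."""
        return ".".join((self.currency, self.index, self.tenor))


def parse_key(key):
    """Split a compound key back into its named parts."""
    currency, index, tenor = key.split(".")
    return {"currency": currency, "index": index, "tenor": tenor}


# A dictionary stands in for the market data region (illustrative rates).
region = {}
for quote in (RateQuote("USD", "LIBOR", "3M", 0.0231),
              RateQuote("USD", "LIBOR", "6M", 0.0252)):
    region[quote.key] = quote
```

A spreadsheet user who knows the currency, index and tenor can then construct the key without looking it up.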
10. Region Design
• Trade and market data regions:
• Both may be high velocity, but with a low number of contributors
• Curve definitions are updated slowly but used constantly
• Typically a curve embeds a list of rates – leave it denormalized if
rates are updated slowly
• If calculation engine supports it, create a second region to cache
built interest rate and credit curves (building a credit curve is 80%
of the valuation time for a credit default swap)
• Consider splitting model parameters from descriptive data to
reduce amount of data flowing to compute grid
• Foreign exchange quotes are typically small and updated daily
• Interest rates change slowly and are referenced constantly
• Computational results and aggregation:
• Risk results will be the largest and highest traffic region
• Pre-aggregate risk inside Geode to support lower powered
consumers (e.g. web pages)
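Pre-aggregation for lower powered consumers can be sketched as a roll-up of trade-level results by desk and risk factor. The record layout and numbers are hypothetical; inside Geode this logic would more likely run server-side, which the sketch does not attempt to show.

```python
from collections import defaultdict

# Hypothetical trade-level risk results: one row per trade and risk factor.
risk_results = [
    {"trade_id": "T1", "desk": "Rates", "risk_factor": "USD.LIBOR.3M", "dv01": -120.0},
    {"trade_id": "T2", "desk": "Rates", "risk_factor": "USD.LIBOR.3M", "dv01": 45.5},
    {"trade_id": "T3", "desk": "Rates", "risk_factor": "USD.LIBOR.6M", "dv01": -10.0},
    {"trade_id": "T4", "desk": "Credit", "risk_factor": "USD.LIBOR.3M", "dv01": 30.0},
]


def pre_aggregate(results):
    """Sum risk by (desk, risk factor) so consumers read a few rows, not every trade."""
    totals = defaultdict(float)
    for row in results:
        totals[(row["desk"], row["risk_factor"])] += row["dv01"]
    return dict(totals)


desk_risk = pre_aggregate(risk_results)
```

A web page showing desk-level risk then reads three small entries instead of scanning the full results region.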
11. Region Placement On the Geode Cluster
• Region placement optimizes the solution’s performance:
• Consider placement of market data and trades holistically to make the risk calculation efficient – keep the data each calculation needs on the same machine
• Partition the trade regions to balance the cluster:
• Partition trade region to maximize parallel execution during compute
• Use a business term (e.g. valuation curve, currency, industry) that can
be used to partition both the trade and market data sets
• Partition or replicate market data to optimize computations:
• Replicate interest rates and foreign exchange rates to all nodes
• Replicate or partition curve data to maximize collocation of trades with
their market data to minimize cross-member network traffic
• When using an external compute grid, this technique should also be
applied to the local Geode cache on the compute grid
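The collocation idea above can be illustrated by routing trades and curves with the same business key. This is a simplified stand-in, assuming a hash of the valuation curve name picks the member; in Geode this role is normally played by custom partitioning and colocated regions, whose API is not used here. Member names and trades are hypothetical.

```python
import zlib

# Hypothetical cluster members.
MEMBERS = ["server-1", "server-2", "server-3"]


def member_for(routing_key, members=MEMBERS):
    """Deterministically map a routing key to a member (stand-in for a resolver)."""
    return members[zlib.crc32(routing_key.encode("utf-8")) % len(members)]


trades = [
    {"trade_id": "T1", "valuation_curve": "USD.LIBOR.3M"},
    {"trade_id": "T2", "valuation_curve": "EUR.EURIBOR.6M"},
    {"trade_id": "T3", "valuation_curve": "USD.LIBOR.3M"},
]
curve_keys = ["USD.LIBOR.3M", "EUR.EURIBOR.6M"]

# Trades are routed by their valuation curve, curves by their own key, so a
# trade always lands on the member that also holds its curve.
trade_placement = {t["trade_id"]: member_for(t["valuation_curve"]) for t in trades}
curve_placement = {key: member_for(key) for key in curve_keys}
```

Small, widely used data such as exchange rates would be replicated to every member rather than routed this way.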
12. Getting Trade Data into Geode
• Message formats vary by product type:
• OTC derivatives are typically captured in XML documents
• Bond trading systems use FIX or similar (e.g. TOMS)
• Proprietary formats from legacy trading systems
• Broker messages in an application server:
• Transactional message consumer is best pattern
• XML-to-object parsing tools readily available
• Trade data capture is transactional:
• Best practice is to make end-to-end process a transaction, but may
need to split into two legs based on source of messages
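As a small illustration of XML-to-object parsing for trade capture, the sketch below turns an XML trade message into a plain dictionary. The document layout is invented for the example and is far simpler than any real OTC derivative schema; transaction handling around the message consumer is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified trade message.
TRADE_XML = """
<trade id="T1">
  <product>IRS</product>
  <notional currency="USD">1000000</notional>
  <curve>USD.LIBOR.3M</curve>
</trade>
"""


def parse_trade(xml_text):
    """Parse the XML message into a flat trade record ready to be stored."""
    root = ET.fromstring(xml_text.strip())
    notional = root.find("notional")
    return {
        "trade_id": root.get("id"),
        "product": root.findtext("product"),
        "currency": notional.get("currency"),
        "notional": float(notional.text),
        "curve": root.findtext("curve"),
    }


trade = parse_trade(TRADE_XML)
```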
13. Getting Market Data into Geode
• Market data feeds have many proprietary formats
• Market data is often exceptionally fast moving:
• Foreign exchange quotes for the major currency pairs can reach 70,000 messages/second
• Market data can also be very slow moving:
• Rate fixings like LIBOR are once per day
• Illiquid securities may not be quoted daily
• Conflate fast market data by sampling:
• Discard inbound ticks that don’t move the market sufficiently
• Sample down to a rate that your compute farm can accommodate
• External client required to conflate within message queue
• Gate market data into batches:
• Push complete update of all market data at pre-determined intervals
• Day open and close by trading location (NY, London, Hong Kong)
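The two conflation approaches above can be sketched as follows. The tick data and the minimum move threshold are hypothetical, and in practice this logic would sit in an external client in front of the cache, as the slide notes.

```python
def conflate(ticks, min_move):
    """Keep a tick only if it moves at least min_move from the last kept tick."""
    last_kept = {}
    kept = []
    for key, price in ticks:
        previous = last_kept.get(key)
        if previous is None or abs(price - previous) >= min_move:
            last_kept[key] = price
            kept.append((key, price))
    return kept


def gate_batch(ticks):
    """Gate an interval's ticks into one batch holding the latest price per key."""
    latest = {}
    for key, price in ticks:
        latest[key] = price
    return latest


# Hypothetical inbound ticks: (key, price) in arrival order.
ticks = [
    ("EUR.USD", 1.1000),
    ("EUR.USD", 1.10001),  # too small a move, discarded by conflate
    ("EUR.USD", 1.1004),
    ("GBP.USD", 1.3000),
    ("EUR.USD", 1.1005),   # only 1 pip from the last kept tick, discarded
]
kept = conflate(ticks, min_move=0.0002)
batch = gate_batch(ticks)
```

Thresholding preserves significant moves as they happen, while gating trades timeliness for a complete, consistent snapshot at each interval.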
14. Crunching Numbers on a Shared Grid
• Most trading firms have a proprietary math library:
• Developed by internal quantitative teams to ensure consistency
• Usually coded in C++ or C# to take advantage of Intel compute grid
• Pushing Geode events to an external compute grid:
• Typical compute grid has a “head node” or “broker”
• Use client-side Asynchronous Event Queue (“AEQ”) to collect events
for grid’s broker to process
• Stateless grid nodes can synchronously put results back to Geode
regions to ensure results are captured
• Caching locally on the grid to accelerate performance:
• Grid nodes can use Geode client-side caching proxies
• Use client-side region interest registration to ensure updates are
pushed to grid nodes
• Can use wildcards on keys (see dot notation)
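Wildcards on dot-notation keys can be pictured as pattern matching over the key space. The sketch below only emulates the matching locally with Python's `re` module; the keys are hypothetical, and the actual registration of interest would go through the Geode client API, which is not shown.

```python
import re


def keys_of_interest(keys, pattern):
    """Return the keys that fully match a regular expression, in sorted order."""
    regex = re.compile(pattern)
    return sorted(key for key in keys if regex.fullmatch(key))


# Hypothetical market data keys following the dot notation.
keys = ["USD.LIBOR.3M", "USD.LIBOR.6M", "EUR.EURIBOR.3M", "USD.OIS.1D"]

# All USD LIBOR tenors; the dots are escaped so they match literally.
usd_libor = keys_of_interest(keys, r"USD\.LIBOR\..*")

# All three-month rates, regardless of currency and index.
three_month = keys_of_interest(keys, r".*\.3M")
```

A grid node valuing USD swaps would subscribe to the first pattern and receive updates only for the curves it actually uses.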
15. Crunching Numbers Inside Geode
• Running the math inside Geode is dramatically faster:
• STAC Report Issue 0.99 in 2010 found that trade valuations running
inside GemFire 6.3 were 76 times faster than a traditional grid
• Using the Geode grid as a compute grid:
• Math library must be coded in Java (most are C++ or C#)
• Try to use function parameters to define data model
• Opportunities to cache frequently used derived results
• Using cache listeners to propagate valuation events:
• Use cache listener to detect data updates in regions that contain
valuation inputs (e.g. new trade, market data updates)
• Do not listen to “jittery” regions, such as exchange rates
• Encapsulate math into functions that cache listener can execute
• Ensure regions are partitioned in order to get parallel execution
across the grid
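The listener-driven flow above can be modelled with a toy in-memory region. Everything here is a stand-in: the `Region` class, the listener signature and the one-line valuation are hypothetical and are not Geode's cache listener or function execution API. The point is the wiring: trade and curve updates trigger revaluation, while the “jittery” exchange rate region has no listener.

```python
class Region:
    """Toy region: a dictionary that notifies listeners after every put."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.listeners = []

    def put(self, key, value):
        self.data[key] = value
        for listener in self.listeners:
            listener(self.name, key, value)


trades, curves, fx, results = (Region(n) for n in ("trades", "curves", "fx", "results"))
valuations = []  # log of trade ids, one entry per revaluation


def revalue(trade_id):
    """Stand-in for the math function: needs both the trade and its curve."""
    trade = trades.data[trade_id]
    curve = curves.data.get(trade["curve"])
    if curve is None:
        return
    results.data[trade_id] = trade["notional"] * curve["rate"]  # toy formula
    valuations.append(trade_id)


def on_trade(region_name, key, value):
    revalue(key)


def on_curve(region_name, key, value):
    for trade_id, trade in list(trades.data.items()):
        if trade["curve"] == key:
            revalue(trade_id)


trades.listeners.append(on_trade)
curves.listeners.append(on_curve)
# fx deliberately has no listener, so its updates never trigger valuations.

curves.put("USD.LIBOR.3M", {"rate": 0.02})
trades.put("T1", {"notional": 1000000, "curve": "USD.LIBOR.3M"})
fx.put("EUR.USD", 1.10)
curves.put("USD.LIBOR.3M", {"rate": 0.03})
```

In this run the trade is valued twice, once on booking and once on the curve update, and the exchange rate tick causes no work.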
16. Ticking Risk Views
• Roll-your-own client applications to view ticking risk:
• Desktop applications can use the client libraries to receive events
from the cluster using Continuous Queries, which can then be
displayed in real time
• Server hosted applications can use Continuous Queries or
Asynchronous Event Queues
• Integrating packaged products:
• Some specialty products handle streaming risk:
• Armanta Theatre™
• ION Enterprise Risk™
• Integrate using custom Java components
• The traders will always want spreadsheets:
• Write an Excel plug-in