Andre Langevin
March 9th 2016
Wall Street Derivative Risk Solutions
Using Apache Geode (Incubating)
Design Whiteboards for Solution Architects
Design Pattern
Event-based cross-product risk system using Geode
A Crash Course in Wall Street Trading
•  Big Wall Street firms have “FICC” trading business organized by market:
•  Each business will trade “cash” and derivative products, but traders specialize in one or the other
•  There may be a team of traders working to trade a single product
•  Trading systems are product specific and often highly specialized:
•  May have up to 50 different booking points for transactions
•  Multiple instances of the same trading system, deployed in different regions
•  Electronic markets mean that there are often external booking points to consolidate
•  Managing these businesses requires a consolidated view of risk:
•  Risk factors span products and markets – it is not sufficient to just look at the risk by trade or book
•  Risk measures must be both fast and detailed to be relevant on the trading floor
•  Desk heads aggregate risk from individual trades to stay within desk limits for risk
•  Business heads aggregate risk across desks to stay within the firm’s risk appetite and regulatory limits
FICC: “Fixed Income, Currencies & Commodities”
Calculating Risk
•  What is the “risk” that we are trying to measure?
•  Trades are valued by discounting their estimated future cash flows
•  Discount factors are based on observable market data
•  Movement in markets can change the value of your trades
•  “Trade Risk” is the sensitivity of each trade to changes in market data
•  Markets are represented using curves:
•  A curve is defined using observable rates and prices and then “built” into a smooth, consistent “zero
curve” using interpolation
•  Consistency is paramount:
•  Most firms have a proprietary math library used for valuation and risk
•  Use the same market data in all calculations to avoid basis differences
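The valuation and risk ideas above can be sketched in a few lines. This is a minimal illustration, not any firm's math library: it assumes continuous compounding, linear interpolation between curve pillars, and a one basis point parallel bump, and every name in it (`interpolate`, `present_value`, `parallel_bump`) is hypothetical.

```python
import math

def interpolate(curve, t):
    """Linearly interpolate a zero rate from sorted (tenor, rate) pillars.
    Flat extrapolation outside the pillars (an assumption of this sketch)."""
    if t <= curve[0][0]:
        return curve[0][1]
    if t >= curve[-1][0]:
        return curve[-1][1]
    for (t0, r0), (t1, r1) in zip(curve, curve[1:]):
        if t0 <= t <= t1:
            return r0 + (r1 - r0) * (t - t0) / (t1 - t0)

def present_value(cash_flows, curve):
    """Discount each (time, amount) cash flow with a continuously compounded rate."""
    return sum(amount * math.exp(-interpolate(curve, t) * t)
               for t, amount in cash_flows)

def parallel_bump(curve, bump=0.0001):
    """Shift every pillar rate by the same amount (default: one basis point)."""
    return [(t, r + bump) for t, r in curve]

# Hypothetical market data and trade: a two-year annual-coupon bond.
curve = [(1.0, 0.010), (2.0, 0.015), (5.0, 0.020)]
cash_flows = [(1.0, 50.0), (2.0, 1050.0)]

base_pv = present_value(cash_flows, curve)
# "Trade risk": change in value when the market data moves.
risk = present_value(cash_flows, parallel_bump(curve)) - base_pv
```

Because both valuations use the same curve object and the same interpolation, the difference reflects only the market move, which is the consistency point made above.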
Technology Solutions that Work Badly
•  The easiest thing to do is just book all of your trades using one trading system!
•  Trading systems are product specific for many very good reasons, so this idea is a non-starter
•  How about booking all of the hedges into the primary trading system?
•  Cash product systems can’t price derivatives, so you have to invent simple “proxies” for them
•  Have to build live feeds from one trading system into another – or book duplicates by hand
•  The back office has to remove the duplicates before settlement and accounting
•  How about adding up all of the risk from each trading system into a report?
•  Almost impossible to make the valuations consistent across systems:
•  Different yield curve definitions, and different market data sources feeding curves
•  Different math libraries, and often a technology challenge to make them run fast enough
•  Different calculation methodologies (is relative risk up or down?)
•  Difficult to achieve speed needed to accurately compute hedge requirements
Cash Products
Cash products are securities that are settled immediately for a cash payment, such as stocks and bonds.
Filling in the Details of the Design
Event-based cross-product risk system using Geode
PDX Integration Glue
•  PDX serialization is an effective cross-language bridge:
•  PDX data objects bridge solution components in Java, C# and C++
•  Avoid language-specific data types (e.g. C# date types) that don’t
have equivalents in other languages
•  Structure PDX objects to optimize performance:
•  May want to externalize sub-objects or lists into separate objects
•  Balance speed of lookup with memory consumption
•  Need to consider cluster locality
•  JSON is a good message format:
•  PDX natively handles JSON, but not XML
•  C# works well with JSON, so the calculation engine and the
downstream consumers should consume easily
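As a rough illustration of the advice on language-neutral types, the sketch below serializes a trade to JSON with the date carried as an ISO 8601 string rather than a language-specific date type. It uses only the Python standard library and does not call PDX; the field names (`tradeId`, `maturity`, and so on) are assumptions made for the example.

```python
import json
from datetime import date

def trade_to_json(trade_id, currency, notional, maturity):
    """Serialize a trade using only strings and numbers, so Java, C# and C++
    consumers can all read it. The date travels as an ISO 8601 string."""
    payload = {
        "tradeId": trade_id,
        "currency": currency,
        "notional": notional,
        "maturity": maturity.isoformat(),
    }
    return json.dumps(payload, sort_keys=True)

def trade_from_json(text):
    """Parse the message and restore the date on the consuming side."""
    data = json.loads(text)
    data["maturity"] = date.fromisoformat(data["maturity"])
    return data

message = trade_to_json("T-1001", "USD", 10000000.0, date(2021, 3, 9))
restored = trade_from_json(message)
```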
Designing and Naming Data Objects
•  The trade data model serves two distinct purposes:
•  Descriptive data is only used for aggregation and viewing
•  Model parameters are only needed to calculate risk
•  Can be split into two regions to optimize performance
•  Market data should follow the calculation design:
•  Model data to align to the calculation engine’s math library to reduce
format conversions downstream
•  Use “dot” notation to give business-friendly keys to objects:
•  Create compound keys like “USD.LIBOR.3M” and “USD.LIBOR.6M” to
allow business users to “guess” a key easily – promotes use of Geode
data in secondary applications and spreadsheets
•  Values in the “dot” name are repeated as attributes of the object
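A small sketch of the “dot” key convention, assuming a three-part currency/index/tenor key. The helper names `make_key` and `make_rate` are hypothetical; the point is that the key is derived from the same values that are stored as attributes on the object.

```python
def make_key(*parts):
    """Join key components with dots, e.g. ("usd", "libor", "3m") -> "USD.LIBOR.3M"."""
    for part in parts:
        if "." in part:
            raise ValueError("key components must not contain a dot: %r" % part)
    return ".".join(part.upper() for part in parts)

def make_rate(currency, index, tenor, value):
    """Build a rate object whose attributes repeat the values in its key."""
    return {
        "key": make_key(currency, index, tenor),
        "currency": currency.upper(),
        "index": index.upper(),
        "tenor": tenor.upper(),
        "value": value,
    }

rate = make_rate("usd", "libor", "3m", 0.0063)
```

Repeating the components as attributes means a consumer can filter on `currency` or `tenor` without parsing the key string.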
Region Design
•  Trade and market data regions:
•  Both may be high velocity, but with a low number of contributors
•  Curve definitions are updated slowly but used constantly
•  Typically a curve embeds a list of rates – leave it denormalized if
rates are updated slowly
•  If calculation engine supports it, create a second region to cache
built interest rate and credit curves (building a credit curve is 80%
of the valuation time for a credit default swap)
•  Consider splitting model parameters from descriptive data to
reduce amount of data flowing to compute grid
•  Foreign exchange quotes are typically small and updated daily
•  Interest rates change slowly and are referenced constantly
•  Computational results and aggregation:
•  Risk results will be the largest and highest traffic region
•  Pre-aggregate risk inside Geode to support lower powered
consumers (e.g. web pages)
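Pre-aggregation can be pictured as rolling trade-level results up to the desk and business levels before a lightweight consumer asks for them. The sketch below is plain Python over an in-memory list, not a Geode function; the record layout and the `aggregate` helper are assumptions.

```python
from collections import defaultdict

def aggregate(results, level):
    """Sum trade-level risk by the given attribute (e.g. "desk" or "business")."""
    totals = defaultdict(float)
    for result in results:
        totals[result[level]] += result["risk"]
    return dict(totals)

# Hypothetical trade-level risk results.
results = [
    {"trade": "T-1", "desk": "rates", "business": "FICC", "risk": 100.0},
    {"trade": "T-2", "desk": "rates", "business": "FICC", "risk": -40.0},
    {"trade": "T-3", "desk": "credit", "business": "FICC", "risk": 25.0},
]

by_desk = aggregate(results, "desk")
by_business = aggregate(results, "business")
```

A web page would then read the small `by_desk` or `by_business` totals instead of pulling every trade-level result.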
Region Placement On the Geode Cluster
•  Region placement optimizes the solution’s performance:
•  Consider placement of market data and trades holistically to make the
risk calculation efficient – keep the data each calculation needs on one machine
•  Partition the trades regions to balance the cluster:
•  Partition trade region to maximize parallel execution during compute
•  Use a business term (e.g. valuation curve, currency, industry) that can
be used to partition both the trade and market data sets
•  Partition or replicate market data to optimize computations:
•  Replicate interest rates and foreign exchange rates to all nodes
•  Replicate or partition curve data to maximize collocation of trades with
their market data to minimize cross-member network traffic
•  When using an external compute grid, this technique should also be
applied to the local Geode cache on the compute grid
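The idea of partitioning trades and market data on a shared business term can be sketched as follows. This is a toy hash-based router in plain Python, not Geode's partitioning machinery (in Geode the equivalent role is usually played by a custom partition resolver together with region colocation). The member names and the `member_for` and `place` helpers are hypothetical.

```python
import zlib

MEMBERS = ["server-1", "server-2", "server-3"]  # hypothetical cluster members

def member_for(routing_value, members=MEMBERS):
    """Map a business term (here, a currency) to a member with a stable hash."""
    bucket = zlib.crc32(routing_value.encode("utf-8"))
    return members[bucket % len(members)]

def place(entries, routing_field):
    """Decide where each entry lives, based on one of its attributes."""
    return {key: member_for(entry[routing_field]) for key, entry in entries.items()}

trades = {
    "T-1": {"currency": "USD", "notional": 1000000.0},
    "T-2": {"currency": "EUR", "notional": 2000000.0},
}
curves = {
    "USD.LIBOR.3M": {"currency": "USD"},
    "EUR.EURIBOR.3M": {"currency": "EUR"},
}

trade_placement = place(trades, "currency")
curve_placement = place(curves, "currency")
```

Because both data sets route on the same field, each trade lands on the same member as the curve it needs, so the valuation does not have to fetch market data across the network.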
Getting Trade Data into Geode
•  Message formats vary by product type:
•  OTC derivatives are usually captured in XML documents
•  Bond trading systems use FIX or similar (e.g. TOMS)
•  Proprietary formats from legacy trading systems
•  Broker messages in an application server:
•  Transactional message consumer is best pattern
•  XML-to-object parsing tools readily available
•  Trade data capture is transactional:
•  Best practice is to make end-to-end process a transaction, but may
need to split into two legs based on source of messages
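As an illustration of XML-to-object parsing, the sketch below turns a trade document into a plain dictionary using the Python standard library. The XML layout is invented for the example and is far simpler than a real OTC derivative confirmation; the `parse_trade` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified trade message.
SAMPLE = """
<trade id="T-1001">
  <product>IRS</product>
  <currency>USD</currency>
  <notional>10000000</notional>
  <maturity>2021-03-09</maturity>
</trade>
"""

def parse_trade(xml_text):
    """Convert a trade XML document into a flat dictionary ready for caching."""
    root = ET.fromstring(xml_text.strip())
    return {
        "tradeId": root.attrib["id"],
        "product": root.findtext("product"),
        "currency": root.findtext("currency"),
        "notional": float(root.findtext("notional")),
        "maturity": root.findtext("maturity"),
    }

trade = parse_trade(SAMPLE)
```

In a transactional consumer, the parse and the write to the cache would both complete before the message is acknowledged, so a failure in either step leaves the message on the queue.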
Getting Market Data into Geode
•  Market data feeds have many proprietary formats
•  Market data is often exceptionally fast moving:
•  Foreign exchange quotes for the major currency pairs can reach
70,000 messages/second
•  Market data can also be very slow moving:
•  Rate fixings like LIBOR are once per day
•  Illiquid securities may not be quoted daily
•  Conflate fast market data by sampling:
•  Discard inbound ticks that don’t move the market sufficiently
•  Sample down to a rate that your compute farm can accommodate
•  External client required to conflate within message queue
•  Gate market data into batches:
•  Push complete update of all market data at pre-determined intervals
•  Day open and close by trading location (NY, London, Hong Kong)
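Conflation by discarding small moves can be sketched as a filter that remembers the last published price per key. This is an illustrative, single-threaded version with an absolute threshold; the `Conflator` class and the threshold value are assumptions, and a production filter would likely use a per-instrument or relative threshold.

```python
class Conflator:
    """Drop ticks that move less than `threshold` from the last published price."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.last_published = {}

    def accept(self, key, price):
        previous = self.last_published.get(key)
        if previous is not None and abs(price - previous) < self.threshold:
            return False  # not a sufficient move: discard the tick
        self.last_published[key] = price
        return True

conflator = Conflator(threshold=0.0001)
ticks = [
    ("EUR.USD", 1.10000),
    ("EUR.USD", 1.10004),  # small move, dropped
    ("EUR.USD", 1.10020),  # larger move, published
    ("USD.JPY", 113.50),   # first tick for a new key, published
]
published = [(key, price) for key, price in ticks if conflator.accept(key, price)]
```

Comparing against the last published price, rather than the last received tick, stops a series of small moves from drifting unnoticed.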
Crunching Numbers on a Shared Grid
•  Most trading firms have a proprietary math library:
•  Developed by internal quantitative teams to ensure consistency
•  Usually coded in C++ or C# to take advantage of Intel compute grid
•  Pushing Geode events to an external compute grid:
•  Typical compute grid has a “head node” or “broker”
•  Use client-side Asynchronous Event Queue (“AEQ”) to collect events
for grid’s broker to process
•  Stateless grid nodes can synchronously put results back to Geode
regions to ensure results are captured
•  Caching locally on the grid to accelerate performance:
•  Grid nodes can use Geode client-side caching proxies
•  Use client-side region interest registration to ensure updates are
pushed to grid nodes
•  Can use wildcards on keys (see dot notation)
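The wildcard idea depends on the dot-notation keys. The sketch below shows the matching logic only, using a regular expression over a list of keys in plain Python. It does not call the Geode client (which offers regex-based interest registration for client regions); the `matching_keys` helper and the sample keys are assumptions.

```python
import re

def matching_keys(pattern, keys):
    """Return the keys that a regex-based interest registration would cover."""
    compiled = re.compile(pattern)
    return [key for key in keys if compiled.fullmatch(key)]

keys = ["USD.LIBOR.3M", "USD.LIBOR.6M", "EUR.EURIBOR.3M", "USD.OIS.1D"]

# A grid node valuing USD LIBOR trades only needs the USD LIBOR rates.
usd_libor = matching_keys(r"USD\.LIBOR\..*", keys)
```

Note that the dots in the key must be escaped in the pattern, since an unescaped dot matches any character.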
Crunching Numbers Inside Geode
•  Running the math inside Geode is dramatically faster:
•  STAC Report Issue 0.99 in 2010 found that trade valuations running
inside GemFire 6.3 were 76 times faster than a traditional grid
•  Using the Geode grid as a compute grid:
•  Math library must be coded in Java (most are C++ or C#)
•  Try to use function parameters to define data model
•  Opportunities to cache frequently used derived results
•  Using cache listeners to propagate valuation events:
•  Use cache listener to detect data updates in regions that contain
valuation inputs (e.g. new trade, market data updates)
•  Do not listen to “jittery” regions, such as exchange rates
•  Encapsulate math into functions that cache listener can execute
•  Ensure regions are partitioned in order to get parallel execution
across the grid
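The listener pattern above can be mimicked with a toy in-memory region that calls back on every put. This sketch is plain Python, not the Geode cache listener API, and the valuation is a placeholder (notional times rate) standing in for a real math library. `Region`, `value_trade`, `on_trade` and `on_curve` are all hypothetical names.

```python
class Region:
    """Toy stand-in for a cache region: a dict that notifies listeners on put."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.listeners = []

    def put(self, key, value):
        self.data[key] = value
        for listener in self.listeners:
            listener(key, value)

trades, curves, risk = Region("trades"), Region("curves"), Region("risk")

def value_trade(trade):
    """Placeholder valuation; returns None when the market data is missing."""
    rate = curves.data.get(trade["curve"])
    return None if rate is None else trade["notional"] * rate

def on_trade(key, trade):
    """A new or amended trade triggers a valuation of that trade only."""
    result = value_trade(trade)
    if result is not None:
        risk.put(key, result)

def on_curve(curve_key, _rate):
    """A market data update revalues every trade that references the curve."""
    for trade_key, trade in trades.data.items():
        if trade["curve"] == curve_key:
            risk.put(trade_key, value_trade(trade))

trades.listeners.append(on_trade)
curves.listeners.append(on_curve)

curves.put("USD.LIBOR.3M", 0.02)
trades.put("T-1", {"curve": "USD.LIBOR.3M", "notional": 1000.0})
trades.put("T-2", {"curve": "EUR.EURIBOR.3M", "notional": 500.0})
curves.put("USD.LIBOR.3M", 0.03)  # revalues T-1 only
```

No listener is attached to the `risk` region here, which keeps results from triggering further valuations.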
Ticking Risk Views
•  Roll-your-own client applications to view ticking risk:
•  Desktop applications can use the client libraries to receive events
from the cluster using Continuous Queries, which can then be
displayed in real time
•  Server hosted applications can use Continuous Queries or
Asynchronous Event Queues
•  Integrating packaged products:
•  Some specialty products handle streaming risk:
•  Armanta Theatre™
•  ION Enterprise Risk™
•  Integrate using custom Java components
•  The traders will always want spreadsheets:
•  Write an Excel plug-in
Join the Apache Geode Community!
•  Check out: http://geode.incubator.apache.org
•  Subscribe: user-subscribe@geode.incubator.apache.org
•  Download: http://geode.incubator.apache.org/releases/
Thank you!

#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode
