Grid Asia2008 Low Latency Data Grid

    Grid Asia2008 Low Latency Data Grid - Presentation Transcript

    1. Low Latency Data Grids in Finance. Jags Ramnarayan, Chief Architect, GemStone Systems [email_address]
    2. Background on GemStone Systems
      • Known for its Object Database technology since 1982
      • Now specializes in memory-oriented distributed data management
      • Over 200 installed customers in the Global 2000
      • Grid focus driven by:
        • Very high performance with predictable throughput, latency and availability
          • Capital markets
          • Large e-commerce portals – real time fraud
          • Federal intelligence
    3. Use of Grid computing in finance
      • Two primary areas in tier 1 investment banks
        • Risk Analytics
        • Pricing
    4. State of affairs – Risk Analytics
      • Deluge of data (market data, trade data, etc)
      • Overnight batch job doesn’t cut it
        • Want intra-day risk metrics
        • In some cases, real-time risk
      • Explosion in simulation scenarios
        • More accurate risk exposure
        • Compliance
      • Increasing number of smaller calculations
    5. State of affairs – Pricing (derivatives)
      • Too many products
      • Increasing complexity in products
        • Too many underliers
        • Many relationships
      • Hunger for latency reduction
        • Calculating the new price with lowest possible latency
        • Pushing the prices to distributed applications
    6. Where is the problem? (Diagram labels: Compute farm, Data warehouses, Relational databases)
      • Database/file access contention
        • Too many concurrent connections
        • Large database server bottlenecks on network
        • Query results are large, causing CPU bottlenecks
        • Even a parallel file system is throttled by disk speeds
      • Too much data transfer
        • Between tasks and jobs
        • Between Grid and file systems, databases
        • Data consistency issues
      (Diagram labels: File system; Grid Scheduler; a CPU-bound job turns into an IO-bound job)
    7. Data Fabric for Risk Analytics
      • When data is stored, it is transparently replicated and/or partitioned
      • Redundant storage can be in memory and/or on disk, which ensures continuous availability
      • Keep reference data replicated on many nodes; partition trade data
      • Machine nodes can be added dynamically to expand storage capacity or to handle increased client load
      • Pool memory (and disk) across the cluster; parallelize data access and computation to achieve very high aggregate throughput
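The replicate-versus-partition split on this slide can be illustrated with a minimal Python sketch. It is not GemFire's API: the `Node` and `Fabric` classes, the CRC32-modulo placement, and the `redundant_copies` parameter are all assumptions made for illustration, and the sketch ignores rebalancing when nodes join or leave.

```python
import zlib


class Node:
    """One member of the (hypothetical) fabric, holding data in memory."""

    def __init__(self, name):
        self.name = name
        self.reference = {}  # replicated data: a full copy lives on every node
        self.trades = {}     # partitioned data: only keys this node owns or backs up


class Fabric:
    def __init__(self, node_names, redundant_copies=1):
        if redundant_copies + 1 > len(node_names):
            raise ValueError("not enough nodes for the requested redundancy")
        self.nodes = [Node(n) for n in node_names]
        self.redundant_copies = redundant_copies

    def owners(self, key):
        # Stable hash picks the primary; the following nodes hold the backups.
        primary = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        count = self.redundant_copies + 1
        return [self.nodes[(primary + i) % len(self.nodes)] for i in range(count)]

    def put_reference(self, key, value):
        # Reference data is small and read-mostly, so copy it everywhere.
        for node in self.nodes:
            node.reference[key] = value

    def put_trade(self, key, value):
        # Trade data is large, so store it only on its primary and backups.
        for node in self.owners(key):
            node.trades[key] = value

    def get_trade(self, key, failed=()):
        # Read from the first owner that is still available.
        for node in self.owners(key):
            if node.name not in failed:
                return node.trades[key]
        raise KeyError(key)


fabric = Fabric(["n1", "n2", "n3", "n4"], redundant_copies=1)
fabric.put_reference("USD/JPY", {"type": "fx-pair"})
for i in range(100):
    fabric.put_trade(f"trade-{i}", {"id": i, "notional": 1000 * i})
```

With one redundant copy, each trade lives on two of the four nodes, so a read still succeeds when the primary is listed in `failed`, while every node can resolve reference data locally.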
    8. Data Fabric for Risk Analytics
      • TaskFlow: as results are generated, push events to compute nodes to initiate subsequent computation
      • Avoid bulk data transfer across tasks or jobs
      • Thousands of compute nodes can maintain a local cache of the most frequently used data; optionally use local disk for overflow
      • Move reference data to the local cache
      • Synchronous read-through and write-through, or asynchronous write-behind, to other data sources and sinks
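The read-through, write-through and write-behind options named on this slide can be sketched in a few lines of Python. The `CachingRegion` class, its `loader` and `writer` callbacks, and the explicit `flush()` call are hypothetical names chosen for illustration; a real fabric would typically drain the write-behind queue on a background thread rather than on demand.

```python
from collections import deque


class CachingRegion:
    """Local cache in front of a slower data source (hypothetical sketch)."""

    def __init__(self, loader, writer, write_behind=False):
        self.loader = loader            # called on a cache miss (read-through)
        self.writer = writer            # called to persist an update
        self.write_behind = write_behind
        self.cache = {}
        self.pending = deque()          # updates not yet written to the source

    def get(self, key):
        # Read-through: only a miss reaches the backing data source.
        if key not in self.cache:
            self.cache[key] = self.loader(key)
        return self.cache[key]

    def put(self, key, value):
        self.cache[key] = value
        if self.write_behind:
            # Write-behind: queue the update and return immediately.
            self.pending.append((key, value))
        else:
            # Write-through: persist synchronously before returning.
            self.writer(key, value)

    def flush(self):
        # Drain queued updates to the data source; returns how many were written.
        written = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.writer(key, value)
            written += 1
        return written


# Usage: a dict stands in for the database.
database = {"IBM": 120}
loads = []


def load(key):
    loads.append(key)
    return database[key]


def store(key, value):
    database[key] = value


region = CachingRegion(load, store, write_behind=True)
region.get("IBM")
region.get("IBM")       # served from the cache, no second load
region.put("IBM", 125)  # the database still holds 120 until flush()
```

The trade-off is latency against durability: write-behind keeps the caller off the database's critical path, but queued updates are lost if the node fails before they are flushed, unless the queue itself is held redundantly.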
    9. Move business logic to data
      (Diagram labels: functions f1, f2, … fn; FIFO Queue; Data fabric; Resources; Exec functions; Sept Trades; Function (f1); Function (f2))
      Submit (f1) -> AggregateHighValueTrades(<input data>, "where trades.month='Sept'")
        • Principle: Move task to computational resource with most of the relevant data before considering other nodes where data transfer becomes necessary
        • Parallel function execution service (“Map Reduce”)
        • Data dependency hints
          • Routing key, collection of keys, “where clause(s)”
        • Serial or parallel execution
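A rough Python sketch of the principle above: the function travels to the nodes that already hold the relevant data, runs there, and only the small partial results cross the network. The `DataNode` class, `execute_where_month`, the month-keyed layout and the 1,000,000 threshold are assumptions for illustration; `aggregate_high_value_trades` only echoes the `AggregateHighValueTrades` name from the slide and is not GemFire's function-execution API.

```python
from concurrent.futures import ThreadPoolExecutor


class DataNode:
    """A fabric member holding trades grouped by month (hypothetical layout)."""

    def __init__(self, name, trades_by_month):
        self.name = name
        self.trades_by_month = trades_by_month  # e.g. {"Sept": [trade, ...]}

    def execute(self, func, month):
        # The function runs next to the data; only its result is returned.
        return func(self.trades_by_month.get(month, []))


def aggregate_high_value_trades(trades, threshold=1_000_000):
    # "Map" step: sum the notional of trades at or above the threshold.
    return sum(t["notional"] for t in trades if t["notional"] >= threshold)


def execute_where_month(nodes, func, month, reducer, parallel=True):
    # Data dependency hint: route only to nodes that hold the requested month.
    targets = [n for n in nodes if month in n.trades_by_month]
    if not targets:
        return reducer([])
    if parallel:
        with ThreadPoolExecutor(max_workers=len(targets)) as pool:
            partials = list(pool.map(lambda n: n.execute(func, month), targets))
    else:
        partials = [n.execute(func, month) for n in targets]
    # "Reduce" step: combine the per-node partial results.
    return reducer(partials)


nodes = [
    DataNode("a", {"Sept": [{"notional": 2_000_000}, {"notional": 500_000}],
                   "Oct": [{"notional": 3_000_000}]}),
    DataNode("b", {"Sept": [{"notional": 1_500_000}]}),
    DataNode("c", {"Oct": [{"notional": 9_000_000}]}),
]

sept_total = execute_where_month(nodes, aggregate_high_value_trades, "Sept", sum)
```

Here node `c` never sees the request because it holds no September trades, and the serial and parallel paths return the same total.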
    10. Key lessons
      • Apps should think about capitalizing on memory across the Grid (it is abundant)
      • Keep IO cycles to minimum through main memory caching of operational data sets
        • Scavenge Grid memory and avoid data source access
      • Achieve linear scaling for your Grid apps by horizontally partitioning your data and behavior
        • Read Pat Helland's "Life beyond Distributed Transactions" ( http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf )
      • Get more info on the GemFire data fabric
        • http://www.gemstone.com/gemfire