Webinar: Managing Real Time Risk Analytics with MongoDB
will begin at 14:00 GMT / 7:00 AM PDT / 2:00 PM UTC

Audio should start immediately when you log into the event via
Audio Broadcast. You will need a VoIP headset and reliable
internet connection for Audio Broadcast. If you are having issues
connecting, please dial 1-877-668-4493; Access code: 666 722
454.

There is a Q&A following the webinar. You can enter questions in
the chat box to the Host and Presenter.

A recording of the webinar will be available 24 hours after the
event is complete.

For any other issues please email webinars@10gen.com.
Easy to Start, Easy to Develop, Easy to Scale


  Managing Real Time Risk Analytics with
              MongoDB

                  10gen, Inc.

                November 2012
@dmroberts

daniel.roberts@10gen.com
       Solution Architect
        Based in London
     http://www.10gen.com/
Last Time

•Document-Oriented
    •Dynamic schema
    •Agile
    •Flexible
•High Performance
•Highly Available
•Horizontal Scale Out

•High Volume Data Feeds
•Tick Data capture
•Risk Analytics
•Product Catalogs & Trade Capture
•P&L Reporting
•Portfolio Management
•Reference Data Management
•Quantitative Analysis
•Automated Trading
Key Features for reporting / analytics


•Dynamic Query Language
•Aggregation Framework
•Dynamic & Flexible Schemas
•Atomic Updates to documents
   •Upserts
•Horizontal Scale Out
•Map Reduce
•Hadoop Integration
Sharded Architecture
Risk Analytics & Reporting
Use Case:
•Collect and aggregate risk data
•Calculate risk / exposures
•Potentially real time
Why MongoDB?
•Collect data from a single or multiple sources
   •Different formats
•Documents used to create ‘pre-aggregated’ reports
   •Real Time
•Aggregation Framework for reporting
   •e.g. exposure for a counterparty
•Internal MR or Hadoop connector
   •Batch process risk data
Portfolio / Position reporting
Use Case:
•Store positions or portfolio information
•Query to find current positions/portfolios
•Query by client or trader
Why MongoDB?
•Customer/client may have many different products
•Aggregation Framework to calculate values and views
•Work on extremely large data sets
   •Current and historic data
Reporting / Analytics requirements


•How quickly do you need answers?
•How often do you need updates?
•Requirements will drive which methods to utilise.
   •Generally, the higher the latency tolerance, the greater the insight.

•Choices
   •Batch calculations - large complex data volumes
   •Pre-Aggregated - specific and very fast
   •Real-time calculations - As needed reports and calculations
Batch Processing

•MongoDB internal Map Reduce
•Hadoop Map Reduce with MongoDB connector

•Insight after batch run
   •For instance every hour or day
   •Output to documents/collection
   •Fast read once data produced

(Diagram: roll-up levels labelled raw, hourly, daily, monthly)

•Results not up to last millisecond
•Can generate insight from huge datasets
•Rolled up stats
   •Source collections -> reporting collection
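
The roll-up from a source collection to a reporting collection can be sketched in the map/reduce style. This is a minimal sketch under stated assumptions, not the presenter's code: the trade fields (security, ts, quantity) are hypothetical, and runMapReduce is only an in-memory stand-in so the example runs without a server. With MongoDB itself, equivalent map and reduce functions would be handed to the internal Map Reduce, with the output written to a reporting collection.

```javascript
// Hypothetical raw trades, standing in for documents in a source collection.
const trades = [
  { security: "ORCL", ts: new Date("2012-11-06T09:15:00Z"), quantity: 1000 },
  { security: "ORCL", ts: new Date("2012-11-06T09:45:00Z"), quantity: 500 },
  { security: "ORCL", ts: new Date("2012-11-06T10:05:00Z"), quantity: 250 },
  { security: "MSFT", ts: new Date("2012-11-06T09:30:00Z"), quantity: 700 }
];

// Map step: emit one (key, value) pair per trade, keyed by security and hour.
function mapTrade(trade, emit) {
  const hour = trade.ts.toISOString().slice(0, 13); // e.g. "2012-11-06T09"
  emit(trade.security + "|" + hour, { volume: trade.quantity, trades: 1 });
}

// Reduce step: combine all values for one key. The result has the same
// shape as the emitted values, so it could be reduced again with others.
function reduceVolumes(key, values) {
  return values.reduce(
    (acc, v) => ({ volume: acc.volume + v.volume, trades: acc.trades + v.trades }),
    { volume: 0, trades: 0 }
  );
}

// In-memory stand-in for a map/reduce run (hypothetical, NOT a MongoDB API).
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const out = {};
  for (const [key, values] of groups) out[key] = reduce(key, values);
  return out;
}

// "Reporting collection": one rolled-up entry per security and hour.
const hourly = runMapReduce(trades, mapTrade, reduceVolumes);
console.log(hourly);
```

The same pattern extends to daily or monthly roll-ups by shortening the time portion of the key.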
Sharded MongoDB + Hadoop
(Diagram: five shards, Shard 1 to Shard 5, each holding lettered data chunks; a Hadoop node is shown beneath each shard, with four further Hadoop nodes below.)
Use Query Language

•Query across documents using MongoDB JSON query language
   •Infer results in the application code.
   •Dynamic - but what happens when we have 1 billion documents?
•Indexing strategy is key
•Example:
   var data = db.pl.find({ positionId: 1234 })[0]
   // data now holds the first matching document:
            {
                 "_id" : ObjectId("50990a10fd421cb025407cb1"),
                 "positionId" : 1234,
                 "security" : "ORCL",
                 "quantity" : 1000,
                 "price" : 30.23,
                 "currency" : "USD"
            }
   // position value, calculated in the application
   data.price * data.quantity   // 30230.00
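
"Inferring results in the application code" can be sketched as follows. This is an illustrative sketch only: the array stands in for the documents a query would return, the first document is the one from the slide, and the other two documents and the function names are made up for the example.

```javascript
// Stand-in for documents returned by a query on the pl collection.
// Only the first document comes from the slide; the rest are hypothetical.
const positions = [
  { positionId: 1234, security: "ORCL", quantity: 1000, price: 30.23, currency: "USD" },
  { positionId: 1235, security: "MSFT", quantity: 200, price: 29.5, currency: "USD" },
  { positionId: 1236, security: "VOD", quantity: 4000, price: 1.5, currency: "GBP" }
];

// Value of a single position, as on the slide: price * quantity.
function positionValue(p) {
  return p.price * p.quantity;
}

// Sum position values per currency in the application.
function totalByCurrency(docs) {
  const totals = {};
  for (const p of docs) {
    totals[p.currency] = (totals[p.currency] || 0) + positionValue(p);
  }
  return totals;
}

const totals = totalByCurrency(positions);
console.log(totals);
```

This works well for a handful of documents. With very large collections, every matching document has to travel to the application first, which is why the later slides move the calculation into the schema or the database.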
Leverage schema design


•Group useful data together into documents
•Utilise upsert and $inc functionality of MongoDB
   •Pre-aggregate reports
   •Incrementing counters with $inc is lightweight.

•Fast, pre-calculated data
•Low latency retrieval
•http://docs.mongodb.org/manual/use-cases/pre-aggregated-reports/
Pre-Aggregated Reports -
Daily trade volumes


// daily buckets, each hour a sub-document
{ _id: "2012-11-06-1231", security: "ORCL",
  ts: ISODate("2012-11-06T00:00:00.000Z"),
  daily: 67345234,
  minute: { 0: { 0: 2034, 1: 735, ... 59: 2644 },
             ...
            7: { 0: 15434, 1: 334, ... 59: 64234 }
           }
}
// Increment counters in the document. Automatically
// insert a new document if none matches (upsert: true).
// Keys are not zero-padded: hour 4, minute 9 is "minute.4.9".
> db.trades.update(
   { _id: "2012-11-06-1231", security: "ORCL" },
   { $inc: { "minute.4.9": 1034, "daily": 1034 } },
   { upsert: true })
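
How an application might derive that update from an incoming trade can be sketched as below. Several things here are assumptions rather than anything stated on the slide: the trade field names, the function name buildVolumeUpdate, the reading of the _id as date plus a security identifier (1231), and the use of UTC for the hour and minute buckets.

```javascript
// Build the query and $inc update for one trade.
// Hypothetical trade fields: security, securityId, ts, quantity.
function buildVolumeUpdate(trade) {
  const day = trade.ts.toISOString().slice(0, 10); // e.g. "2012-11-06"
  const hour = trade.ts.getUTCHours();
  const minute = trade.ts.getUTCMinutes();

  // Increment the daily total and the counter for this hour and minute.
  const inc = { daily: trade.quantity };
  inc["minute." + hour + "." + minute] = trade.quantity;

  return {
    query: { _id: day + "-" + trade.securityId, security: trade.security },
    update: { $inc: inc }
  };
}

const spec = buildVolumeUpdate({
  security: "ORCL",
  securityId: 1231,
  ts: new Date("2012-11-06T04:09:30Z"),
  quantity: 1034
});
console.log(JSON.stringify(spec));

// In the shell this could then be applied as an upsert:
//   db.trades.update(spec.query, spec.update, { upsert: true })
```

Because the whole day lives in one document, reading a daily report is a single lookup by _id.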
Aggregation Framework
•Much simpler and faster than MongoDB map reduce
•Replaces common MR use cases in MongoDB
•Native operators in the MongoDB core

db.pl.aggregate([
    {$match:{"clientId" : 4321}},
    { $project :
       { value :
          { $multiply:["$quantity", "$price"] }
       }
    }
]);
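
To make the two stages concrete, here is a plain JavaScript equivalent that runs on an in-memory array. It is only an illustration of what $match and $project compute, not how MongoDB executes the pipeline. The sample documents are hypothetical, and positionId is carried through in place of the _id that $project would keep by default.

```javascript
// Hypothetical documents standing in for the pl collection.
const pl = [
  { positionId: 1234, clientId: 4321, quantity: 1000, price: 30.23 },
  { positionId: 1235, clientId: 4321, quantity: 200, price: 29.5 },
  { positionId: 1236, clientId: 9999, quantity: 4000, price: 1.5 }
];

const values = pl
  // $match: keep only the documents for one client
  .filter(d => d.clientId === 4321)
  // $project: output a computed field, value = quantity * price
  .map(d => ({ positionId: d.positionId, value: d.quantity * d.price }));

console.log(values);
```

The difference from the query-language approach is where the work happens: the pipeline runs inside the database, so only the projected values are returned to the application.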
Aggregation FIX Execution report

Calculating average price for a fulfilled order:

db.ExecutionReport.aggregate([
    // stage 1: keep execution reports for one instrument
    { $match : { "Instrument.Symbol" : "MSFT" } },
    // stage 2: one result per client order ID
    { $group :
        {
            _id : "$ClOrdID",
            reportsPerOrd : { $sum : 1 },
            totalNumOrdered : { $sum : "$OrderQtyData.OrderQty" },
            avgPrice : { $avg : "$AvgPx" }
        }
    }
]);
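
What the grouping computes can be checked against a small in-memory equivalent. This is a sketch for illustration: the sample execution reports and the function name averagePriceByOrder are hypothetical, and only the field names (ClOrdID, Instrument.Symbol, OrderQtyData.OrderQty, AvgPx) come from the slide.

```javascript
// Hypothetical execution reports using the field names from the slide.
const reports = [
  { ClOrdID: "A1", Instrument: { Symbol: "MSFT" }, OrderQtyData: { OrderQty: 100 }, AvgPx: 29.0 },
  { ClOrdID: "A1", Instrument: { Symbol: "MSFT" }, OrderQtyData: { OrderQty: 100 }, AvgPx: 30.0 },
  { ClOrdID: "B2", Instrument: { Symbol: "MSFT" }, OrderQtyData: { OrderQty: 50 }, AvgPx: 28.5 },
  { ClOrdID: "C3", Instrument: { Symbol: "ORCL" }, OrderQtyData: { OrderQty: 10 }, AvgPx: 30.23 }
];

function averagePriceByOrder(docs, symbol) {
  const groups = {};
  for (const r of docs) {
    if (r.Instrument.Symbol !== symbol) continue; // $match
    // $group by ClOrdID, accumulating a count and two sums
    const g = groups[r.ClOrdID] ||
      (groups[r.ClOrdID] = { reportsPerOrd: 0, totalNumOrdered: 0, pxSum: 0 });
    g.reportsPerOrd += 1;                          // $sum : 1
    g.totalNumOrdered += r.OrderQtyData.OrderQty;  // $sum of OrderQty
    g.pxSum += r.AvgPx;                            // numerator for $avg
  }
  return Object.keys(groups).map(id => ({
    _id: id,
    reportsPerOrd: groups[id].reportsPerOrd,
    totalNumOrdered: groups[id].totalNumOrdered,
    avgPrice: groups[id].pxSum / groups[id].reportsPerOrd
  }));
}

const byOrder = averagePriceByOrder(reports, "MSFT");
console.log(byOrder);
```

Note that, as written on the slide, totalNumOrdered adds OrderQty once per report, so an order with two reports of quantity 100 shows a total of 200.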
Summary

•A number of choices for aggregating data:
   •MongoDB Map Reduce
   •Pre-Compute - Schema Design
   •Hadoop Connector
   •Aggregation Framework
download at mongodb.org

                    @dmroberts
             daniel.roberts@10gen.com

Free online training - http://education.10gen.com/

www.meetup.com/London-MongoDB-User-Group/



Facebook: http://bit.ly/mongodb
Twitter: @dmroberts
LinkedIn: http://linkd.in/joinmongo
