Webinar: How Banks Use MongoDB as a Tick Database

Learn why MongoDB is spreading like wildfire across capital markets (and really every industry) and then focus in particular on how financial firms are enjoying the developer productivity, low TCO, and unlimited scale of MongoDB as a tick database for capturing, analyzing, and taking advantage of opportunities in tick data. This webinar illustrates how MongoDB can easily and quickly store variable data formats, like top and depth of book, multiple asset classes, and even news and social networking feeds. It will explore aggregating and analyzing tick data in real-time for automated trading or in batch for research and analysis and how auto-sharding enables MongoDB to scale with commodity hardware to satisfy unlimited storage and performance requirements.

Speaker notes:
  • Mention tick databases.
  • JSON document: contains key-value pairs of different types; values can also be arrays and other documents.
  • Because MongoDB updates a single document atomically, we can be sure the totals and the list of voters stay in sync.
  • comments is an array of JSON documents; we can query by fields inside embedded documents as well as by array members.
  • Secondary indexes, compound indexes, multikey indexes. Why is it important to keep all of a document together? Data locality.
  • Fewer reads because the data is stored together. Memory-mapped files, with caching handled by the OS, naturally keep the most frequently accessed data in RAM (for best performance, have enough RAM to fit the indexes and the working data set). Horizontal scaling is built into the product by design from the start.
  • Full deployment: as many mongos processes as you have app servers (for example). The config DBs are small but hold the critical information about where ranges of data are located across the shards.

    1. How Capital Markets Firms Use MongoDB as a Tick Database
        Matt Kalan, Sr. Solution Architect
        Email: Matt.kalan@10gen.com
        Twitter: @matthewkalan
    2. Agenda
        • MongoDB Introduction
        • FS Use Cases
        • Writing/Capturing Market Data
        • Reading/Analyzing Market Data
        • Performance, Scalability, & High Availability
        • Q&A
    3. Introduction: 10gen is the company behind MongoDB, the leading next-generation database
        • Document-Oriented
        • General Purpose
        • Open-Source
    4. 10gen Overview
        • 200+ employees
        • 500+ customers
        • Over $81 million in funding
        • Offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona and Sydney
    5. Database Landscape
        • No Automatic Joins
        • Document Transactions
        • Fast, Scalable Read/Writes
    6. MongoDB Business Benefits
        • Increased Developer Productivity
        • Better Customer Experience
        • Faster Time to Market
        • Lower TCO
    7. MongoDB Technical Benefits
        • Agile & Flexible: { author: "roger", date: new Date(), text: "Spirited Away", tags: ["Tezuka", "Manga"] }
        • High Performance: indexes, RAM
        • Highly Available: replica sets
        • Horizontally Scalable: sharding
    8. Most Common FS Use Cases
        1. Tick Data Capture & Analysis
        2. Reference Data Management
        3. Risk Analysis & Reporting
        4. Trade Repository
        5. Portfolio Reporting
    9. Tick Data Capture & Analysis: Requirements
        • Capture real-time market data (multi-asset, top of book, depth of book, even news)
        • Load historical data
        • Aggregate data into bars, daily, monthly intervals
        • Enable queries & analysis on raw ticks or aggregates
        • Drive backtesting or automated signals
    10. Tick Data Capture & Analysis: Why MongoDB?
        • High throughput => can capture real-time feeds for all products/asset classes needed
        • High scalability => all data and depth for all historical time periods can be captured
        • Flexible & range-based indexing => fast querying on time ranges and any fields
        • Aggregation Framework => can shape raw data into aggregates (e.g. ticks to bars)
        • Map-reduce capability (native MR or Hadoop Connector) => batch analysis looking for patterns and opportunities
        • Easy to use => native language drivers and JSON expressions that you can apply for most operational database needs as well
        • Low TCO => low software license cost and commodity hardware
    11. Writing/Capturing Tick Data
    12. High Level Trading Architecture [diagram]
        • Components: Exchanges/Markets/Brokers, Feed Handler, Capturing Application, News & social networking sources, Cached Static & Aggregated Data, Low Latency Applications, Higher Latency Trading Applications, Backtesting and Analysis Applications
        • Flows: Market Data, Orders, Trades/metrics
    13. High Level Trading Architecture [same diagram, with data types]
        • Top of book
        • Depth of book
        • Multi-asset
        • Derivatives (e.g. strips)
        • News (text, video)
        • Social Networking
    14. Top of book [e.g. equities]
        {
          _id: ObjectId("4e2e3f92268cdda473b628f6"),
          symbol: "DIS",
          timestamp: ISODate("2013-02-15 10:00"),
          bidPrice: 55.37,
          offerPrice: 55.58,
          bidQuantity: 500,
          offerQuantity: 700
        }
        > db.ticks.find({symbol: "DIS", bidPrice: {$gt: 55.36}})
    15. Depth of book
        {
          _id: ObjectId("4e2e3f92268cdda473b628f6"),
          symbol: "DIS",
          timestamp: ISODate("2013-02-15 10:00"),
          bidPrices: [55.37, 55.36, 55.35],
          offerPrices: [55.58, 55.59, 55.60],
          bidQuantities: [500, 1000, 2000],
          offerQuantities: [1000, 2000, 3000]
        }
        > db.ticks.find({bidPrices: {$gt: 55.36}})
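
The depth-of-book query above relies on how MongoDB compares a value against an array field: a document matches when at least one array element satisfies the condition. The following is a minimal in-memory sketch of that rule in plain JavaScript, with no database involved. The helper `matchesAnyBid` and the second sample tick are hypothetical and exist only for illustration.

```javascript
// In-memory illustration of array matching semantics (not MongoDB code).
// A comparison such as {bidPrices: {$gt: 55.36}} matches a document when
// at least one element of the array satisfies it.
const depthTicks = [
  { symbol: "DIS", bidPrices: [55.37, 55.36, 55.35] },
  { symbol: "XYZ", bidPrices: [55.3, 55.29, 55.28] }, // hypothetical second tick
];

// Hypothetical helper mirroring the $gt-on-array behaviour.
function matchesAnyBid(tick, threshold) {
  return tick.bidPrices.some((price) => price > threshold);
}

const hits = depthTicks.filter((tick) => matchesAnyBid(tick, 55.36));
console.log(hits.map((tick) => tick.symbol)); // [ 'DIS' ]
```

Only the best bid of 55.37 exceeds the threshold, which is enough for the whole DIS document to match.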
    16. or any way your app uses it
        {
          _id: ObjectId("4e2e3f92268cdda473b628f6"),
          symbol: "DIS",
          timestamp: ISODate("2013-02-15 10:00"),
          bids: [
            {price: 55.37, amount: 500},
            {price: 55.37, amount: 1000},
            {price: 55.37, amount: 2000}
          ],
          offers: [
            {price: 55.58, amount: 1000},
            {price: 55.58, amount: 2000},
            {price: 55.59, amount: 3000}
          ]
        }
        > db.ticks.find({"bids.price": {$gt: 55.36}})
    17. Synthetic spreads
        {
          _id: ObjectId("4e2e3f92268cdda473b628f6"),
          symbol: "DIS",
          timestamp: ISODate("2013-02-15 10:00"),
          spreadPrice: 0.58,
          leg1: {symbol: "CLM13", price: 97.34},
          leg2: {symbol: "CLK13", price: 96.92}
        }
        // Match on the embedded leg symbols with dot notation, all in one filter document
        > db.ticks.find({"leg1.symbol": "CLM13", "leg2.symbol": "CLK13", spreadPrice: {$gt: 0.50}})
    18. News
        {
          _id: ObjectId("4e2e3f92268cdda473b628f6"),
          symbol: "DIS",
          timestamp: ISODate("2013-02-15 10:00"),
          title: "Disney Earnings…",
          body: "Walt Disney Company reported…",
          tags: ["earnings", "media", "walt disney"]
        }
    19. Social networking
        {
          _id: ObjectId("4e2e3f92268cdda473b628f6"),
          timestamp: ISODate("2013-02-15 10:00"),
          twitterHandle: "jdoe",
          tweet: "Heard @DisneyPictures is releasing…",
          usernamesIncluded: ["DisneyPictures"],
          hashTags: ["movierumors", "disney"]
        }
    20. Aggregates (bars, daily, etc.)
        {
          _id: ObjectId("4e2e3f92268cdda473b628f6"),
          symbol: "DIS",
          // ISODate stores a real date value; Date(...) called without new returns a string
          openTS: ISODate("2013-02-15 10:00"),
          closeTS: ISODate("2013-02-15 10:05"),
          open: 55.36,
          high: 55.80,
          low: 55.20,
          close: 55.70
        }
    21. Querying/Analyzing Tick Data
    22. Architecture for Querying Data [diagram]
        • Data queried: ticks, bars, other analysis
        • Consumers: Research & Analysis Applications, Backtesting Applications, Higher Latency Trading Applications
    23. Index any fields: arrays, nested, etc.
        // Compound indexes
        > db.ticks.ensureIndex({symbol: 1, timestamp: 1})
        // Index on arrays
        > db.ticks.ensureIndex({bidPrices: -1})
        // Index on any depth
        > db.ticks.ensureIndex({"bids.price": 1})
        // Full text search
        > db.ticks.ensureIndex({tweet: "text"})
    24. Query for ticks by time; price threshold
        // Ticks for last month for media companies
        // (both bounds go in one timestamp condition; a repeated key would override the first)
        > db.ticks.find({
            symbol: {$in: ["DIS", "VIA", "CBS"]},
            timestamp: {$gte: ISODate("2013-01-01"), $lt: ISODate("2013-02-01")}
          })
        // Ticks when Disney's bid breached 55.50 this month
        > db.ticks.find({
            symbol: "DIS",
            bidPrice: {$gt: 55.50},
            timestamp: {$gt: ISODate("2013-02-01")}
          })
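
When filtering on a calendar month, note that a date-only value such as ISODate("2013-01-31") denotes midnight at the start of that day, so an inclusive upper bound placed there leaves out most of the last day. A half-open range (from the first of the month up to, but not including, the first of the next month) is one common way to avoid this. The sketch below shows the difference in plain JavaScript; the sample timestamps and the `inRange` helper are hypothetical.

```javascript
// Half-open date range [rangeStart, rangeEnd): includes every tick in
// January 2013, including those late on 31 January, and excludes 1 February.
const rangeStart = new Date("2013-01-01T00:00:00Z");
const rangeEnd = new Date("2013-02-01T00:00:00Z");

// Hypothetical helper: true when the timestamp falls inside the range.
function inRange(timestamp) {
  return timestamp >= rangeStart && timestamp < rangeEnd;
}

// Hypothetical sample timestamps.
const lateJanuary = new Date("2013-01-31T15:30:00Z");
const firstOfFebruary = new Date("2013-02-01T00:00:00Z");

// An inclusive upper bound at midnight on 31 January would miss lateJanuary.
const inclusiveMidnightBound = new Date("2013-01-31T00:00:00Z");

console.log(inRange(lateJanuary)); // true
console.log(inRange(firstOfFebruary)); // false
console.log(lateJanuary <= inclusiveMidnightBound); // false
```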
    25. Analyzing/Aggregating Options
        • Custom application code: run your queries, compute your results
        • Aggregation framework: declarative, pipeline-based approach
        • Native Map/Reduce in MongoDB: JavaScript functions distributed across cluster
        • Hadoop Connector: offline batch processing/computation
    26. Aggregate into min bars
        // Aggregate minute bars for Disney for this month
        db.ticks.aggregate([
          // Keep only this month's DIS ticks
          { $match: {symbol: "DIS", timestamp: {$gt: ISODate("2013-02-01")}} },
          // Break the timestamp into calendar parts to group on
          { $project: {
              year: {$year: "$timestamp"},
              month: {$month: "$timestamp"},
              day: {$dayOfMonth: "$timestamp"},
              hour: {$hour: "$timestamp"},
              minute: {$minute: "$timestamp"},
              second: {$second: "$timestamp"},
              timestamp: 1,
              price: 1 } },
          // Sort by time so $first and $last give the open and close
          { $sort: {timestamp: 1} },
          { $group: {
              _id: {year: "$year", month: "$month", day: "$day", hour: "$hour", minute: "$minute"},
              open: {$first: "$price"},
              high: {$max: "$price"},
              low: {$min: "$price"},
              close: {$last: "$price"} } }
        ])
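
To make the bucketing logic of this pipeline concrete, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code: `toMinuteBars` and the sample ticks are hypothetical, and it assumes each tick carries a `timestamp` and a `price` field, as the pipeline does.

```javascript
// In-memory sketch of ticks-to-bars aggregation (not MongoDB code).
// Assumes each tick has a Date `timestamp` and a numeric `price`.
function toMinuteBars(ticks) {
  // Sort by time so the first and last tick in each bucket are open and close.
  const sorted = [...ticks].sort((a, b) => a.timestamp - b.timestamp);
  const buckets = new Map();
  for (const tick of sorted) {
    // Truncate the timestamp to the start of its minute as the bucket key.
    const key = Math.floor(tick.timestamp.getTime() / 60000) * 60000;
    const bar = buckets.get(key);
    if (!bar) {
      buckets.set(key, {
        minute: new Date(key),
        open: tick.price,
        high: tick.price,
        low: tick.price,
        close: tick.price,
      });
    } else {
      bar.high = Math.max(bar.high, tick.price);
      bar.low = Math.min(bar.low, tick.price);
      bar.close = tick.price;
    }
  }
  // Map preserves insertion order, so bars come out in time order.
  return [...buckets.values()];
}

// Hypothetical sample ticks, deliberately out of order.
const sampleTicks = [
  { timestamp: new Date("2013-02-15T10:00:05Z"), price: 55.36 },
  { timestamp: new Date("2013-02-15T10:00:40Z"), price: 55.8 },
  { timestamp: new Date("2013-02-15T10:00:20Z"), price: 55.2 },
  { timestamp: new Date("2013-02-15T10:01:10Z"), price: 55.7 },
];

const minuteBars = toMinuteBars(sampleTicks);
console.log(minuteBars);
```

The sort step matters for the same reason as in the pipeline: open and close are simply the first and last prices seen in each minute.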
    27. Add analysis on the bars…
        // Then count the number of down bars (close below open);
        // append these stages to the end of the pipeline on the previous slide
        { $project: {
            downBar: {$lt: ["$close", "$open"]},
            timestamp: 1, open: 1, high: 1, low: 1, close: 1 } },
        { $group: { _id: "$downBar", sum: {$sum: 1} } }
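
The two stages above flag each bar as down or not and then count each group. A minimal in-memory equivalent in plain JavaScript, using hypothetical bars, might look like this:

```javascript
// In-memory sketch of counting down bars (close below open); not MongoDB code.
// Hypothetical sample bars.
const sampleBars = [
  { open: 55.36, close: 55.7 },
  { open: 55.7, close: 55.4 },
  { open: 55.4, close: 55.1 },
];

// Classify each bar by "is this a down bar?" and count each group,
// like $project with $lt followed by $group with {$sum: 1}.
const barCounts = { down: 0, notDown: 0 };
for (const bar of sampleBars) {
  if (bar.close < bar.open) barCounts.down += 1;
  else barCounts.notDown += 1;
}

console.log(barCounts); // { down: 2, notDown: 1 }
```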
    28. Map-Reduce Example: Sum
        // Map: emit one (symbol, bidPrice) pair per tick
        var mapFunction = function () {
          emit(this.symbol, this.bidPrice);
        };
        // Reduce: add up all prices emitted for a symbol
        var reduceFunction = function (symbol, priceList) {
          return Array.sum(priceList);
        };
        > db.ticks.mapReduce(mapFunction, reduceFunction, {out: "tickSums"})
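
For readers new to the pattern, the flow behind the map and reduce functions can be sketched in memory with plain JavaScript. This is an illustration only, not how MongoDB runs the job: `runMapReduce` is a hypothetical helper, `emit` is passed in as a parameter rather than provided globally, and the sample ticks are made up.

```javascript
// In-memory sketch of the map/reduce flow (not MongoDB's implementation).
function runMapReduce(docs, mapFn, reduceFn) {
  const emitted = new Map();
  // The map function calls emit(key, value); values are collected per key.
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // Run the map function with each document bound to `this`.
  for (const doc of docs) mapFn.call(doc, emit);
  // Reduce the collected values for every key to a single result.
  const results = {};
  for (const [key, values] of emitted) results[key] = reduceFn(key, values);
  return results;
}

// Hypothetical sample ticks.
const mrTicks = [
  { symbol: "DIS", bidPrice: 55.5 },
  { symbol: "DIS", bidPrice: 55.25 },
  { symbol: "CBS", bidPrice: 43.5 },
];

const sums = runMapReduce(
  mrTicks,
  function (emit) { emit(this.symbol, this.bidPrice); },
  (symbol, priceList) => priceList.reduce((total, price) => total + price, 0)
);

console.log(sums); // { DIS: 110.75, CBS: 43.5 }
```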
    29. Process Data on Hadoop
        • MongoDB's Hadoop Connector
        • Supports Map/Reduce, Streaming, Pig
        • MongoDB as input/output storage for Hadoop jobs; no need to go through HDFS
        • Leverage power of Hadoop ecosystem against operational data in MongoDB
    30. Performance, Scalability, and High Availability
    31. Why MongoDB is fast and scalable
        • Better data locality [diagram: Relational vs. MongoDB]
        • In-Memory Caching
        • Auto-Sharding: read/write scaling
    32. Auto-Sharding for Horizontal Scale [diagram: one mongod; key range Symbol: A…Z; Read/Write Scalability]
    33. Auto-Sharding for Horizontal Scale [diagram: two mongods; key ranges Symbol: A…J and Symbol: K…Z; Read/Write Scalability]
    34. Sharding [diagram: four mongods; key ranges Symbol: A…F, G…J, K…O, P…Z; Read/Write Scalability]
    35. [diagram: Application connects through three mongos routers to four shards keyed on Symbol ranges (A…F, G…J, K…O, P…Z) plus Time; each shard is a replica set with one Primary and two Secondaries]
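
The key ranges in these sharding diagrams can be read as a lookup table: each range of the shard key is owned by one shard, and a request is routed to whichever shard owns the key. The sketch below shows that lookup in plain JavaScript. It is a deliberate simplification: the shard names, the `chunkRanges` table and `shardForSymbol` are hypothetical, and real range boundaries apply to whole shard key values rather than first letters.

```javascript
// Hypothetical chunk map: each shard owns a contiguous range of first letters,
// mirroring the four key ranges shown in the diagram.
const chunkRanges = [
  { shard: "shard1", from: "A", to: "F" },
  { shard: "shard2", from: "G", to: "J" },
  { shard: "shard3", from: "K", to: "O" },
  { shard: "shard4", from: "P", to: "Z" },
];

// Hypothetical router: find the range that contains the symbol's first letter.
function shardForSymbol(symbol) {
  const first = symbol[0].toUpperCase();
  const match = chunkRanges.find((range) => first >= range.from && first <= range.to);
  if (!match) throw new Error(`no shard owns symbol ${symbol}`);
  return match.shard;
}

console.log(shardForSymbol("DIS")); // shard1
console.log(shardForSymbol("VIA")); // shard4
console.log(shardForSymbol("CBS")); // shard1
```

Because both reads and writes for a symbol land on one shard, adding shards and splitting ranges spreads the load across more machines.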
    36. 10gen Products and Services
        • Subscriptions: Professional Support, Enterprise Edition and Commercial License
        • Consulting: Expert Resources for All Phases of MongoDB Implementations
        • Training: Online and In-Person, for Developers and Administrators
    37. Summary
        • MongoDB is high performance for tick data
        • Scales horizontally automatically by auto-sharding
        • Fast, flexible querying, analysis, & aggregation
        • Dynamic schema can handle any data types
        • MongoDB has all these features with low TCO
        • 10gen can support you with anything discussed
    38. For More Information
        Resource: Location
        • MongoDB Downloads: www.mongodb.org/download
        • Free Online Training: education.10gen.com
        • Webinars and Events: www.10gen.com/events
        • White Papers: www.10gen.com/white-papers
        • Customer Case Studies: www.10gen.com/customers
        • Presentations: www.10gen.com/presentations
        • Documentation: docs.mongodb.org
        • Additional Info: info@10gen.com
    39. How Capital Markets Firms Use MongoDB as a Tick Database
        Matt Kalan, Sr. Solution Architect
        Email: Matt.kalan@10gen.com
        Twitter: @matthewkalan
