Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cassandra
 


Caserta Concepts' implementation team presented a solution that performs big data analytics on active trade data in real-time. They presented the core components – Storm for the real-time ingest, Cassandra, a NoSQL database, and others. For more information on future events, please check out http://www.casertaconcepts.com/.


Presentation Transcript

  • Big Data Warehousing Meetup, December 10, 2013: Real-time Trade Data Monitoring with Storm & Cassandra
  • Agenda: 7:00 Networking (grab a slice of pizza and a drink); 7:15 Welcome & Intro – Joe Caserta, President, Caserta Concepts; Author, The Data Warehouse ETL Toolkit (about the Meetup and about Caserta Concepts); 7:30 Cassandra – Elliott Cordo, Chief Architect, Caserta Concepts; 8:00 Storm – Noel Vega, Consultant, Caserta Concepts; Consultant, Dimension Data, LLC; 8:30-9:00 Q&A / More Networking
  • About the BDW Meetup • Big Data is a complex, rapidly changing landscape • We want to share our stories and hear about yours • Great networking opportunity for like-minded data nerds • Opportunities to collaborate on exciting projects • Founded by Caserta Concepts, Big Data Analytics, DW & BI Consulting • Next BDW Meetup: JANUARY 20
  • About Caserta Concepts: Focused Expertise – Big Data Analytics, Data Warehousing, Business Intelligence, Strategic Data Ecosystems. Industries Served – Financial Services, Healthcare / Insurance, Retail / eCommerce, Digital Media / Marketing, K-12 / Higher Education. Founded in 2001. President: Joe Caserta, industry thought leader, consultant, educator and co-author, The Data Warehouse ETL Toolkit (Wiley, 2004)
  • Caserta Concepts listed as one of the 20 Most Promising Data Analytics Consulting Companies. CIOReview looked at hundreds of data analytics consulting companies and shortlisted the ones at the forefront of tackling real analytics challenges. A distinguished panel comprising CEOs, CIOs, VCs, industry analysts and the editorial board of CIOReview selected the final 20.
  • Expertise & Offerings: Strategic Roadmap / Assessment / Consulting / Implementation; Big Data Analytics; Data Warehousing / ETL / Data Integration; BI / Visualization / Analytics
  • Client Portfolio: Finance & Insurance; Retail / eCommerce & Manufacturing; Education & Services
  • We are hiring. Does this word cloud excite you? Speak with us about our open positions: jobs@casertaconcepts.com
  • Why talk about Storm & Cassandra? [Architecture diagram] Traditional BI: ERP, Finance and Legacy sources feed ETL into a traditional EDW for ad-hoc/canned reporting. Big Data BI: a horizontally scalable big data cluster – Hadoop Distributed File System (HDFS) with MapReduce, Pig/Hive and Mahout across nodes N1–N5 – alongside a NoSQL database and Storm, an environment optimized for analytics and data science.
  • What is Storm • Distributed Event Processor • Real-time data ingestion and dissemination • In-Stream ETL • Reliably process unbounded streams of data • Storm is fast: Clocked it at over a million tuples per second per node • It is scalable, fault-tolerant, guarantees your data will be processed • Preferred technology for real-time big data processing by organizations worldwide: • Partial list at https://github.com/nathanmarz/storm/wiki/Powered-By • Incubator: • http://wiki.apache.org/incubator/StormProposal
  • Components of Storm • Spout – Collects data from upstream feeds and submits it for processing • Tuple – A collection of data that is passed within Storm • Bolt – Processes tuples (Transformations) • Stream – Identifies outputs from Spouts/Bolts • Storm usually outputs to a NoSQL database
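To make these components concrete, here is a minimal bolt sketch against the backtype.storm Java API of that era. The class name and the "client"/"notional" fields are invented for illustration; this is not the presented topology's code.

    import java.util.Map;
    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    // Hypothetical bolt: consumes a (client, notional) tuple from an upstream spout,
    // applies a trivial transformation, and emits a new tuple on its output stream.
    public class UppercaseClientBolt extends BaseRichBolt {
        private OutputCollector collector;

        @Override
        public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;   // keep the collector so execute() can emit and ack
        }

        @Override
        public void execute(Tuple input) {
            String client = input.getStringByField("client");
            double notional = input.getDoubleByField("notional");
            collector.emit(new Values(client.toUpperCase(), notional));  // transformed tuple downstream
            collector.ack(input);         // ack so Storm's processing guarantee is satisfied
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("client", "notional"));  // names this bolt's output stream
        }
    }

In a topology this would be wired with something like builder.setBolt("norm-bolt", new UppercaseClientBolt(), 4).fieldsGrouping("fix-spout", new Fields("client")), so that all tuples for a given client land on the same bolt task.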
  • Why NoSQL? • Performance: relational databases have a lot of features and overhead that we don't need in many cases (although we will miss some…) • Scalability: most relational databases scale vertically, which limits how large they can get; federation and sharding are awkward, manual processes • Agile • Sparse data / data with a lot of variation • Most NoSQL databases scale horizontally on commodity hardware
  • What is Cassandra? • Column families are the equivalent of a table in an RDBMS • The primary unit of storage is a column; columns are stored contiguously • Skinny rows: most like a relational database, except columns are optional and are not stored if omitted • Wide rows: rows can be billions of columns wide; used for time series, relationships and secondary indexes
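As a rough illustration of the two row shapes, here is a minimal sketch only: the contact point and keyspace are assumptions, and the table names are invented (loosely modeled on the summary tables shown later in the deck).

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class RowShapes {
        public static void main(String[] args) {
            // Assumed single local node and a pre-existing keyspace "trade_ks".
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("trade_ks");

            // Skinny row: closest to a relational table; omitted columns simply are not stored.
            session.execute("CREATE TABLE IF NOT EXISTS client_profile ("
                    + " client text PRIMARY KEY, region text, account_type text)");

            // Wide row: one storage row per client, one column per clustering-key value;
            // rows can grow very wide, which suits time series such as daily trade counts.
            session.execute("CREATE TABLE IF NOT EXISTS client_daily_summary ("
                    + " client text, date_id int, trade_count int,"
                    + " PRIMARY KEY (client, date_id))");

            cluster.close();
        }
    }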
  • REAL TIME TRADE DATA MONITORING Elliott Cordo Chief Architect, Caserta Concepts
  • The Use Case • Trade data (orders and executions) • High volume of incoming data • 500 thousand records per second • 12 billion messages per day • Required that data be aggregated and monitored in real time (end to end latency measured in 100's of ms) • Both raw messages and analytics stored, persisted to a database
  • The Data • Primarily FIX messages: Financial Information Exchange • Established in the early 90's as a standard for trade data communication; widely used throughout the industry • Basically a delimited file of variable attribute-value pairs • Looks something like this: 8=FIX.4.2 | 9=178 | 35=8 | 49=PHLX | 56=PERS | 52=20071123-05:30:00.000 | 11=ATOMNOCCC9990900 | 20=3 | 150=E | 39=E | 55=MSFT | 167=CS | 54=1 | 38=15 | 40=2 | 44=15 | 58=PHLX EQUITY TESTING | 59=0 | 47=C | 32=0 | 31=0 | 151=15 | 14=0 | 6=0 | 10=128 | • A single trade can comprise thousands of such messages, although a typical trade has about a dozen
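A minimal sketch of parsing such a message into a tag-to-value map, using the pipe delimiter as rendered on the slide (real FIX uses the SOH character); the class and method names are invented for the example.

    import java.util.HashMap;
    import java.util.Map;

    public class FixParser {
        // Splits a pipe-delimited string of tag=value pairs into a tag -> value map.
        public static Map<String, String> parse(String rawFix) {
            Map<String, String> tags = new HashMap<String, String>();
            for (String pair : rawFix.split("\\|")) {
                String trimmed = pair.trim();
                int eq = trimmed.indexOf('=');
                if (eq > 0) {
                    tags.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
                }
            }
            return tags;
        }

        public static void main(String[] args) {
            String msg = "8=FIX.4.2 | 35=8 | 55=MSFT | 38=15 | 44=15 | 52=20071123-05:30:00.000 |";
            Map<String, String> tags = parse(msg);
            // Tag 55 is the ticker symbol and tag 38 the order quantity in this sample.
            System.out.println(tags.get("55") + " qty=" + tags.get("38"));
        }
    }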
  • Additional Requirements • Linearly scalable • Highly available: no single point of failure, quick recovery • Quicker time to benefit • Processing guarantees: NO DATA IS LOST!
  • Some Sample Analytic Use Cases • Sum(Notional volume) by Ticker: Daily, Hourly, Minute • Average trade latency (Execution TS – Order TS) • Wash Sales (sell within x seconds of last buy) for same Client/Ticker
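As one illustration, the wash-sale check could be sketched as below: remember the last buy time per client/ticker and flag any sell arriving within the window. The class and its in-memory map are invented for the example; the actual implementation's state handling (Redis-backed, per the later slides) is more involved.

    import java.util.HashMap;
    import java.util.Map;

    public class WashSaleDetector {
        private final long windowMillis;                                           // the "within x seconds" window
        private final Map<String, Long> lastBuyTs = new HashMap<String, Long>();   // key: client|ticker

        public WashSaleDetector(long windowMillis) {
            this.windowMillis = windowMillis;
        }

        // side is "BUY" or "SELL"; returns true when a sell follows a buy within the window
        // for the same client/ticker.
        public boolean onTrade(String client, String ticker, String side, long tsMillis) {
            String key = client + "|" + ticker;
            if ("BUY".equals(side)) {
                lastBuyTs.put(key, tsMillis);
                return false;
            }
            Long buyTs = lastBuyTs.get(key);
            return buyTs != null && (tsMillis - buyTs) <= windowMillis;
        }
    }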
  • How has this system traditionally been handled? Typically by manually partitioning the application: having a number of independent systems and databases “dividing” the problem behind a message queue (Use Case 1, Partition A → Database A; Use Case 1, Partition B → Database B; Use Case 2, All Partitions → Database C). Main issues: • Growth requires changing these systems to accept the new partitioning scheme: development! • A lot of different applications replicating complex architecture, tons of boilerplate code • Performing analysis across the partitioning schemes is very difficult
  • Need to Establish a Platform-as-a-Service Architecture [Diagram: sensor data → Redis queue → Storm cluster → atomic data and aggregates → d3.js analytics, event monitors, low-latency analytics] • A Redis queue is used for ingestion • Storm is used for real-time ETL and outputs atomic data and derived data needed for analytics • Redis is used as a reference-data lookup cache and state • Real-time analytics are produced from the aggregated data • Higher-latency ad-hoc analytics are done in Hadoop using Pig and Hive
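A minimal sketch of what a Redis-fed spout might look like with the Jedis client: the host/port and the emitted field name are assumptions (the queue key name borrows from the summary slide), and production code would add reconnects, reliable acking and batching.

    import java.util.Map;
    import backtype.storm.spout.SpoutOutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichSpout;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Values;
    import redis.clients.jedis.Jedis;

    public class RedisQueueSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;
        private Jedis jedis;

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
            this.jedis = new Jedis("localhost", 6379);   // assumed Redis host/port
        }

        @Override
        public void nextTuple() {
            // Pop one raw message that an upstream producer pushed onto the ingestion list.
            String rawFix = jedis.lpop("dataSourceQueue01");   // key name borrowed from the summary slide
            if (rawFix != null) {
                collector.emit(new Values(rawFix));
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("rawFix"));
        }
    }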
  • Deeper Dive: Cassandra as an Analytic Database • Based on a blend of Dynamo and BigTable • Distributed, master-less • Super fast writes: can ingest lots of data! • Very fast reads • Why did we choose it: • Data throughput requirements • High availability • Simple expansion • Interesting data models for time series data (more on this later)
  • Design Practices • Cassandra does not support aggregation or joins, so the data model must be tuned to usage • Denormalize your data (flatten your primary dimensional attributes into your fact) • Storing the same data redundantly is OK. It might sound weird, but we've been doing this all along in the traditional world: modeling our data to make analytic queries simple!
  • Wide rows are our friends • Cassandra composite columns are powerful for analytic models • They facilitate multi-dimensional analysis • A wide-row table may have any number of rows and a variable number of columns (millions of columns) • [Example wide-row table: row keys ClientA, ClientB, ClientC, …; column keys are dates 20130101–20130105, …; cell values are trade counts] • And now with CQL3 we have “unpacked” wide rows into named columns: easy to work with!
  • More about wide rows! • The left-most column is the ROW KEY • It is the mechanism by which the row is distributed across the Cassandra cluster • Care must be taken to prevent hot spots: dates, for example, are generally not good candidates because all load will go to a given set of servers on a particular day! • Data can be filtered using equality and the “in” clause • The top row is the COLUMN KEY • There can be a variable number of columns • It is acceptable to have millions or even billions of columns in a table • Column keys are sorted and can accept a range query (greater than / less than) • [Same example wide-row table as above: clients as row keys, dates as column keys, trade counts as values]
    Create table Client_Daily_Summary (Client text, Date_ID int, Trade_Count int, Primary key (Client, Date_ID))
  • Traditional Cassandra Analytic Model: if we wanted to track trade count by day and hour, we could stream our ETL to two (or more) summary fact tables. [Example tables: Client_Daily_Summary keyed by Client with dates as column keys, and Client_Hourly_Summary keyed by Client|Date with hours as column keys]
    Sample analytic query: give me daily trade counts for ClientA between Jan 1 and Jan 3:
    Select Date_ID, Trade_Count from Client_Daily_Summary where Client='ClientA' and Date_ID >= 20130101 and Date_ID <= 20130103
    Sample analytic query: give me hourly trade counts for ClientA for one day between 9 and 11 AM:
    Select Hour, Trade_Count from Client_Hourly_Summary where Client_Date='ClientA|20131101' and Hour >= 900 and Hour <= 1100
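For reference, a hedged sketch of issuing the daily-summary query from Java with the DataStax driver of that era; the contact point and keyspace name are assumptions.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class DailySummaryQuery {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();  // assumed node
            Session session = cluster.connect("trade_ks");                             // assumed keyspace

            // Partition key (Client) restricted with '=', clustering key (Date_ID) with a range.
            ResultSet rs = session.execute(
                "SELECT Date_ID, Trade_Count FROM Client_Daily_Summary"
              + " WHERE Client = 'ClientA' AND Date_ID >= 20130101 AND Date_ID <= 20130103");

            for (Row row : rs) {
                System.out.println(row.getInt("Date_ID") + " -> " + row.getInt("Trade_Count"));
            }
            cluster.close();
        }
    }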
  • But there are other methods too • Assuming some level of client-side aggregation (and additive measures) we could also further unpack and leverage column keys using CQL 3, a slightly different use case:
    Create table Client_Ticker_Summary (Client text, Date_ID int, Ticker text, Trade_Count int, Notional_Volume float, Primary Key (Client, Date_ID, Ticker))
    The first column in the PK definition is the Row Key, aka Partition Key. Look at all this flexible SQL goodness:
    select * from Client_Ticker_Summary where Client in ('ClientA','ClientB')
    select * from Client_Ticker_Summary where Client in ('ClientA','ClientB') and Date_ID >= 20130101 and Date_ID <= 20130103
    select * from Client_Ticker_Summary where Client = 'ClientA' and Date_ID >= 20130101 and Date_ID <= 20130103
    select * from Client_Ticker_Summary where Client = 'ClientA' and Date_ID = 20130101 and Ticker in ('APPL','GE','PG')
    ALSO possible, but not recommended:
    select * from Client_Ticker_Summary where Date_ID > 20120101 allow filtering;
    select * from Client_Ticker_Summary where Date_ID = 20120101 and Ticker in ('APPL','GE') allow filtering;
  • Storing the Atomic Data 8=FIX.4.2 | 9=178 | 35=8 | 49=PHLX | 56=PERS | 52=20071123-05:30:00.000 | 11=ATOMNOCCC9990900 | 20=3 | 150=E | 39=E | 55=MSFT | 167=CS | 54=1 | 38=15 | 40=2 | 44=15 | 58=PHLX EQUITY TESTING | 59=0 | 47=C | 32=0 | 31=0 | 151=15 | 14=0 | 6=0 | 10=128 |
    • We must land all atomic data: • Persistence • Future replay (new metrics, corrections) • Drill-down capability / auditability
    • The sparse nature of the FIX data fits the Cassandra data model very well
    • We store only the tags actually present in the data, saving space; a few approaches depending on usage pattern:
    Create table Trades_Skinny (OrderID text PRIMARY KEY, Date_ID int, Ticker text, Client text, …many more columns)
    Create index ix_Date_ID on Trades_Skinny (Date_ID)
    Create table Trades_Wide (Order_ID text, Tag text, Value text, PRIMARY KEY (Order_ID, Tag))
    Create table Trades_Map (OrderID text PRIMARY KEY, Date_ID int, Ticker text, Client text, Tags map<text, text>)
    Create index ix_Date_ID on Trades_Map (Date_ID)
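As an example of the map-column approach, a sketch of inserting one order's parsed FIX tags with a prepared statement; the contact point, keyspace and sample values are assumptions, and error handling is omitted.

    import java.util.HashMap;
    import java.util.Map;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class StoreAtomicTrade {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();  // assumed node
            Session session = cluster.connect("trade_ks");                             // assumed keyspace

            // Only the FIX tags actually present in the message go into the map column.
            Map<String, String> tags = new HashMap<String, String>();
            tags.put("55", "MSFT");   // ticker
            tags.put("38", "15");     // order quantity

            PreparedStatement ps = session.prepare(
                "INSERT INTO Trades_Map (OrderID, Date_ID, Ticker, Client, Tags) VALUES (?, ?, ?, ?, ?)");
            session.execute(ps.bind("ATOMNOCCC9990900", 20071123, "MSFT", "ClientA", tags));

            cluster.close();
        }
    }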
  • Big data solutions usually employ multiple DB types. Some considerations: • Size-type requirements: • Volume: a disk-space requirement • Velocity: a message-rate requirement • Data-structure & query-pattern complexity: simple K/V pair vs. relational vs. … • C.A.P. theorem alignment: which two does your use case benefit from? • Value-add features: • API (interface: e.g. HTTP REST vs. client classes; power: e.g. mget, incrementBy) • Replication and/or H/A support (B.C./D.R.) • Support for data processing patterns (e.g. Riak has Map/Reduce; Redis zSets give Top-N) • Transaction support (Redis: MULTI; command list; EXEC) • and so on
  • Contact Elliott Cordo Principal Consultant, Caserta Concepts P: (855) 755-2246 x267 E: elliott@casertaconcepts.com info@casertaconcepts.com 1(855) 755-2246 www.casertaconcepts.com
  • DEEP-DIVE INTO STORM TOPOLOGY Noel Milton Vega Consultant, Dimension Data, LLC. Consultant, Caserta Concepts
  • Practical Deep Dive: Continuity-of-Service across Storm failures. An approach to making topologies more resilient to task failure. • Tasks in Storm are the units that do the actual work • Tasks can individually fail due to: • Resource starvation (OOM, CPU) • Unhandled exceptions • Timeouts (such as waiting for I/O) • and so on • Tasks also fail because parent Executors, Workers or Supervisors fail • Nimbus will spawn a replacement task, but in the context of C.o.S. is that enough? Answer: No. But maybe we can work around that. • My “storm-user” Google group question: http://bit.ly/1bsBooT
  • Storyboard: Continuity-of-Service. ACME Check Deposit Corp (H.Q.): Step 1: deposit client [A-I] checks, Step 2: update checkbook balance; Step 1: deposit client [J-R] checks, Step 2: update checkbook balance; Step 1: deposit client [S-Z] checks, Step 2: update checkbook balance. Blue: • Deposits a check for an [A-I] client, and is given a deposit receipt for it (Step 1) • Before he's able to journal the receipt to the check register journal, he quits (Step 2). 1) ACME H.Q. notices that [A-I] checks aren't being processed. Should the workload be redistributed? No! (exception policy). 2) Policy consequence: there's no difference before & after the event, so context has to be remembered: • The new hire's role is as check depositor for ACME (not a plumber for sub-company FOOBAR) • Their specific ACME role is to deposit checks for clients [A-I] • The role did have state: there's an aggregate check register, and an incomplete transaction
  • Storyboard: Continuity-of-Service. Why this example? It has the operational requirements of real-world use cases: • Distributed model (where processors are autonomous), suitable for Big Data • Specific failure / recovery requirements: • Incomplete transactions are completed • Aggregated state is remembered • Behavior persistence: same behavior before & after an exception event (stickiness)
  • Modeling this use-case story in Storm. Blue: • Deposits a batch of checks for clients [A-I] and is given a deposit receipt for them (Step 1) • Before he's able to journal the receipt to the check register journal, he quits (Step 2). 1) ACME H.Q. notices that [A-I] checks aren't being processed. Should the workload be redistributed? No! (by policy). 2) Policy consequence: there's no difference before & after the event, so context has to be remembered: • The role is check depositor for ACME (not a plumber for sister-company FOO) → a fields-grouped acmeBolt task • The specific ACME role is to deposit checks for clients [A-I] → the fields grouping for that acmeBolt task • The role did have state: an aggregate check register and an incomplete transaction → Java objects in the JVM associated with the acmeBolt task
  • Modeling this use-case story in Storm http://bit.ly/1bsBooT
  • What does Storm remember across task fail/restarts (if anything)? http://bit.ly/1bsBooT [Diagram: worker/executor slots hosting tasks t0, t1, t2 across supervisor nodes 1-of-3, 2-of-3 and 3-of-3, with one t0 task marked failed] • What is Storm's grouping/re-grouping policy? • Will replacement tasks use the same identifier?
  • Programmatically, what we're asking is this… http://bit.ly/1bsBooT [Diagram: the same worker/executor/task layout across supervisor nodes 1-of-3 through 3-of-3]
    // ===============================
    // Constructor.
    // ===============================
    public bolt01(Properties properties) { }

    // ===============================
    // prepare() method. Is identification remembered here?
    // ===============================
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) { }

    // ===============================
    // execute() method. Is grouping remembered here? (i.e. the redistribution policy)
    // ===============================
    public void execute(Tuple inTuple) { }
  • Lab behavior observations show Storm does remember… http://bit.ly/1bsBooT
    componentID = context.getThisComponentId();  // Defined in the topology class, e.g. bolt01
    taskID      = context.getThisTaskId();       // An integer between [1 - N], where N is the number of tasks, topology-wide.
    taskIndex   = context.getThisTaskIndex();    // An integer between [0 - (N-1)], where N is the number of tasks, component-wide.
    fqid = componentID + ".0" + Integer.toString(taskIndex);  // Ex: bolt02.05; spout01.03; bolt01.00
    [Table: within a component, task pointers taskPntr1 … taskPntrN map to task indexes 0 … N-1]
  • Quick digression …
  • Lab tests show Storm does remember, but what's missing? http://bit.ly/1bsBooT So in lab tests we observed the following behaviors in Storm: • It preserves the FQID (e.g. bolt01.02) before & after task failures. IDENTITY PERSISTENCE! • Tasks with a given FQID will receive the same grouping of data throughout the life of a topology. (Analogy: the new hire will be an ACME check depositor for clients [A-I].) And yet, something is still missing: while Storm can replay unprocessed tuples that timed out during the fail/restart period, it can't regenerate in-memory (in-JVM) aggregated state. What to do?
  • REDIS to the rescue :: Continuity-of-Service. Since we observed the following behaviors in Storm: • It preserves the FQID (e.g. bolt01.02) before & after task failures. IDENTITY PERSISTENCE! • Tasks with a given FQID will receive the same grouping of data throughout the life of a topology. …the FQID can be used as a Redis key prefix to persist and recover each task's state (next slide).
  • REDIS to the rescue :: Continuity-of-Service. The FQID is maintained across task fail/restarts (i.e. for the lifetime of the topology), and tuple grouping/partitioning is likewise maintained across task fail/restarts, so each task can use its FQID as a Redis key prefix to persist and recover its state:
    // ===============================
    // prepare() method
    // ===============================
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        [ ... snip ... ]
        this.componentID = context.getThisComponentId();   // e.g. bolt01; spout03
        this.taskIndex = context.getThisTaskIndex();       // [0-(N-1)], where N = number of component tasks
        this.fqid = this.componentID + ".0" + Integer.toString(this.taskIndex);  // e.g. bolt01.04; spout03.00
        this.redisKeyPrefix = this.fqid;   // Use your unique Fully Qualified I.D. as a Redis key prefix.
        // Establish connection to Redis [not shown], and recover lost data structures, if any.
        this.hashMap = this.jedisClient.hgetAll(this.redisKeyPrefix + "-myMap");  // e.g. bolt01.01-myMap
    }

    // ===============================
    // execute() method
    // ===============================
    public void execute(Tuple inTuple) {
        [ ... snip ... ]
        String customer = inTuple.getString(0);
        double balance = this.hashMap.containsKey(customer)
                ? Double.parseDouble(this.hashMap.get(customer)) : 0.0;   // Recovered, as necessary, in prepare().
        balance += Double.parseDouble(inTuple.getString(1));
        this.hashMap.put(customer, String.valueOf(balance));
        this.jedisClient.hset(this.redisKeyPrefix + "-myMap", customer, String.valueOf(balance));  // Persist the updated aggregate.
    }
  • Summary :: Storm / Redis and Continuity-of-Service [Diagram: a Redis master (host:6379) with a local read-only slave; spout tasks spout01.00 – spout01.05 consume from keys dataSourceQueue01 and dataSourceQueue02; fields grouping within a stream is based on field 1 of the tuple, routing to bolt tasks bolt01.00 – bolt01.02 and bolt02.00 – bolt02.02; each task persists its data structures under FQID-prefixed keys such as bolt01.02-dataStruct1 and bolt02.00-dataStruct2; a spout01.tupleAckHash maps tuple GUIDs (GUID1 … GUID-n) to tuples; note taskIndex vs. taskID] Redis data structures for state: Strings (byte arrays), Lists (two-way queues, as linked lists), Sets, Hashes, Sorted Sets (hashes with sorted values); serialize/de-serialize objects as JSON. Other in-memory solutions are possible, e.g. MemSQL.
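To make the "Sorted Sets for Top-N" point concrete, a small Jedis sketch that accumulates notional volume per ticker in one sorted set and reads back the top entries; the key name and values are invented for the example.

    import java.util.Set;
    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.Tuple;

    public class TopTickersByNotional {
        public static void main(String[] args) {
            Jedis jedis = new Jedis("localhost", 6379);   // assumed Redis host/port

            // Each trade increments its ticker's score inside a single sorted set for the day.
            jedis.zincrby("notional:20130101", 15000.0, "MSFT");
            jedis.zincrby("notional:20130101", 9500.0, "GE");
            jedis.zincrby("notional:20130101", 22000.0, "AAPL");

            // Top 3 tickers by accumulated notional volume, highest first.
            Set<Tuple> top = jedis.zrevrangeWithScores("notional:20130101", 0, 2);
            for (Tuple t : top) {
                System.out.println(t.getElement() + " -> " + t.getScore());
            }
            jedis.close();
        }
    }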
  • Noel Milton Vega Consultant, Dimension Data, LLC. Consultant, Caserta Concepts P: (212) 699-2660 E1: noel@casertaconcepts.com E2: nmvega@didata.us info@casertaconcepts.com 1(855) 755-2246 www.casertaconcepts.com
  • Q&A / THANK YOU 501 Fifth Ave 17th Floor New York, NY 10017 1-855-755-2246 info@casertaconcepts.com