  1. 1. From Moore to Metcalf: The Network as the Next Database Platform HPDC June 2007 Michael Franklin UC Berkeley & Truviso (formerly, Amalgamated Insight)
  2. 2. Outline <ul><li>Motivation </li></ul><ul><li>Stream Processing Overview </li></ul><ul><li>Micro-Architecture Issues </li></ul><ul><li>Macro-Architecture Issues </li></ul><ul><li>Conclusions </li></ul>
  3. 3. Moore’s Law vs. Shugart’s: The battle of the bottlenecks <ul><li>Moore : Exponential Processor and Memory improvement. </li></ul><ul><li>Shugart : Similar law for disk capacity . </li></ul><ul><li>The yin and yang of DBMS architecture: “ disk-bound ” or “ memory-bound ”? </li></ul><ul><ul><li>OR are DBMS platforms getting faster or slower relative to the data they need to process? </li></ul></ul><ul><ul><li>Traditionally, the answer dictates where you innovate. </li></ul></ul>
  4. 4. Metcalfe’s Law will drive more profound changes <ul><li>Metcalfe: “The value of a network grows with the square of the # of participants.” </li></ul><ul><li>Practical implication: all interesting data-centric applications become distributed. </li></ul><ul><ul><li>Already happening: </li></ul></ul><ul><ul><ul><li>Service-based architectures (and Grid!) </li></ul></ul></ul><ul><ul><ul><li>Web 2.0 </li></ul></ul></ul><ul><ul><ul><li>Mobile Computing </li></ul></ul></ul>
  5. 5. Bell’s law will amplify Metcalfe’s <ul><li>Bell: “Every decade, a new, lower cost, class of computers emerges, defined by platform, interface, and interconnect.” </li></ul><ul><ul><ul><li>Mainframes 1960s </li></ul></ul></ul><ul><ul><ul><li>Minicomputers 1970s </li></ul></ul></ul><ul><ul><ul><li>Microcomputers/PCs 1980s </li></ul></ul></ul><ul><ul><ul><li>Web-based computing 1990s </li></ul></ul></ul><ul><ul><ul><li>Devices (Cell phones, PDAs, wireless sensors, RFID) 2000s </li></ul></ul></ul>These devices enable a new generation of applications for Operational Visibility, monitoring, and alerting.
  6. 6. The Network as platform: Challenges <ul><li>Data Constantly “On-the-Move” </li></ul><ul><li>Increased Data Volume </li></ul><ul><li>Increased Heterogeneity & Sharing </li></ul><ul><li>Shrinking decision cycles </li></ul><ul><li>Increased data and decision complexity </li></ul>Data sources shown in the diagram: Clickstream, Barcodes, PoS System, RFID, Telematics, Mobile Devices, Transactional Systems, Information Feeds (XYZ 23.2; AAA 19; …), Sensors, Blogs/Web 2.0
  7. 7. The Network as platform: Implications <ul><li>Lots of challenges: </li></ul><ul><li>Integration (or “Dataspaces”) </li></ul><ul><li>Optimization/Planning/Adaptivity </li></ul><ul><li>Consistency/Master Data Mgmt </li></ul><ul><li>Continuity/Disaster Mgmt </li></ul><ul><li>Stream Processing (or data-on-the-move) </li></ul><ul><li>My current focus (and thus, the focus of this talk) is the last of these: Stream Processing. </li></ul>
  8. 8. Stream Processing <ul><li>My view: Stream Processing will become the 3rd leg of standard IT data management: </li></ul><ul><ul><li>OLAP split off from OLTP for historical reporting. </li></ul></ul><ul><ul><li>OLSA (On-line Stream Analytics) will handle: </li></ul></ul><ul><ul><ul><li>Monitoring </li></ul></ul></ul><ul><ul><ul><li>Alerting </li></ul></ul></ul><ul><ul><ul><li>Transformation </li></ul></ul></ul><ul><ul><ul><li>Real-time Visibility and Reporting </li></ul></ul></ul><ul><li>Note: CEP (Complex Event Processing) is a related, emerging technology. </li></ul>
  9. 9. Stream Processing + Grid? <ul><li>On-the-fly stream processing required for high-volume data/event generators. </li></ul><ul><li>Real-time event detection for coordination of distributed observations. </li></ul><ul><li>Wide-area sensing in environmental macroscopes. </li></ul>
  10. 10. Stream Processing - Overview
  11. 11. Turning Query Processing Upside Down. Traditional Database Approach (diagram labels: Data, Bulk Load, Data Warehouse, Queries, Results, Static Batch Reports): <ul><li>Batch ETL & load, query later </li></ul><ul><li>Poor RT monitoring, no replay </li></ul><ul><li>DB size affects query response </li></ul>Data Stream Processing Approach (diagram labels: Live Data Streams, Data Stream Processor, Results, Continuous, Visibility, Alerts): <ul><li>Always-on data analysis & alerts </li></ul><ul><li>RT Monitor & Replay to optimize </li></ul><ul><li>Consistent sub-second response </li></ul>
  12. 12. Example 1: Simple Stream Query <ul><li>A SQL smoothing filter to interpolate dropped RFID readings. </li></ul>Smoothing Filter: SELECT DISTINCT tag_id FROM RFID_stream [RANGE ‘5 sec’] GROUP BY tag_id (Figure: raw readings and smoothed output plotted over time.)
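The effect of the 5-second window in Example 1 can be sketched outside the stream engine. The following is a minimal Python illustration, not the engine's implementation: the function name `tags_in_window`, the `(timestamp, tag_id)` tuple layout, and the half-open window `(now - 5, now]` are all assumptions made for the example. A tag that was read at any point in the last 5 seconds is still reported, so short gaps in the raw readings disappear from the output.

```python
def tags_in_window(readings, now, window_sec=5.0):
    """Return the distinct tags read in the window (now - window_sec, now].

    readings: iterable of (timestamp, tag_id) pairs (hypothetical layout).
    The half-open window boundary is an assumption of this sketch.
    """
    return {tag for ts, tag in readings if now - window_sec < ts <= now}


# Tag "A" is read at t=0, 1 and 4 (readings at t=2 and t=3 were dropped);
# tag "B" is read only at t=0.
readings = [(0, "A"), (1, "A"), (4, "A"), (0, "B")]

print(tags_in_window(readings, now=3))   # both tags still count as present
print(tags_in_window(readings, now=6))   # only "A" was seen recently
```

A larger window hides longer dropouts but also delays the moment a tag is reported as gone, which is the trade-off the later adaptive smoothing slide addresses.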
  13. 13. Example 2 - Stream/Table Join: Every 3 seconds, compute the average transaction value of high-volume trades on S&P 500 stocks, over a 5-second “sliding window”. SELECT T.symbol, AVG(T.price*T.volume) FROM Trades T [RANGE ‘5 sec’ SLIDE ‘3 sec’], SANDP500 S WHERE T.symbol = S.symbol AND T.volume > 5000 GROUP BY T.symbol Annotations: Trades is the stream, SANDP500 is the table, and the bracketed RANGE/SLIDE expression is the window clause. Note: the output is also a stream.
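To make the RANGE/SLIDE semantics of Example 2 concrete, here is a small Python sketch that recomputes the aggregate every `slide_sec` seconds over the last `range_sec` seconds. It is an illustration under stated assumptions, not how a stream engine evaluates the query: the function name `sliding_avg_value`, the `(timestamp, symbol, price, volume)` tuple layout, integer timestamps, and the half-open window `(now - range, now]` are all hypothetical choices.

```python
from collections import defaultdict


def sliding_avg_value(trades, sp500, end, range_sec=5, slide_sec=3, min_volume=5000):
    """Emit (time, {symbol: avg(price * volume)}) once per slide interval.

    trades: list of (timestamp, symbol, price, volume) tuples (assumed layout).
    sp500:  set of symbols standing in for the SANDP500 table.
    """
    results = []
    for now in range(slide_sec, end + 1, slide_sec):
        totals = defaultdict(float)
        counts = defaultdict(int)
        for ts, symbol, price, volume in trades:
            in_window = now - range_sec < ts <= now
            # Join with the table and apply the volume predicate.
            if in_window and symbol in sp500 and volume > min_volume:
                totals[symbol] += price * volume
                counts[symbol] += 1
        results.append((now, {s: totals[s] / counts[s] for s in totals}))
    return results


trades = [
    (1, "IBM", 10.0, 6000),
    (2, "IBM", 20.0, 6000),
    (2, "XYZ", 5.0, 9000),   # not in the table, so the join drops it
    (4, "IBM", 30.0, 1000),  # volume too low
    (5, "IBM", 40.0, 7000),
]
for row in sliding_avg_value(trades, {"IBM"}, end=6):
    print(row)
```

Because the range (5 s) is longer than the slide (3 s), consecutive windows overlap: the trade at t=2 contributes to both outputs.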
  14. 14. Example 3 - Streaming View. Positive Suspense: Find the top 100 store-skus ordered by their decreasing positive suspense (inventory - sales). CREATE VIEW StoreSKU (store, sku, sales) AS (SELECT, P.sku, SUM(P.qty) AS sales FROM POSLog P [RANGE ‘1 day’ SLIDE ‘10 min’], Inventory I WHERE P.sku = I.sku AND = AND P.time > I.time GROUP BY, P.sku) SELECT (I.quantity - S.sales) AS positive_suspense FROM StoreSKU S, Inventory I WHERE = AND S.sku = I.sku ORDER BY positive_suspense DESC LIMIT 100
  15. 15. Application Areas <ul><li>Financial Services: Trading/Capital Mkts </li></ul><ul><li>SOA/Infrastructure Monitoring; Security </li></ul><ul><li>Physical (sensor) Monitoring </li></ul><ul><li>Fraud Detection/Prevention </li></ul><ul><li>Risk Analytics and Compliance </li></ul><ul><li>Location-based Services </li></ul><ul><li>Customer Relationship Management/Retail </li></ul><ul><li>Supply chain/Logistics </li></ul><ul><li>… </li></ul>
  16. 16. Real-Time Monitoring A Flex-based dashboard driven by multiple SQL queries.
  17. 17. The “Jellybean” Argument. Conventional Wisdom: “Can I afford real-time?” Do the benefits justify the cost? <ul><li>Reality: With stream query processing, real-time is cheaper than batch. </li></ul><ul><ul><li>minimize copies & query start-up overhead </li></ul></ul><ul><ul><li>takes load off expensive back-end systems </li></ul></ul><ul><ul><li>rapid application dev & maintenance </li></ul></ul>
  18. 18. Historical Context and status <ul><li>Early stuff: </li></ul><ul><ul><li>Data “Push”, Pub/Sub, Adaptive Query Proc. </li></ul></ul><ul><li>Lots of non-SQL approaches </li></ul><ul><ul><li>Rules systems (e.g., for Fraud Detection) </li></ul></ul><ul><ul><li>Complex Event Processing (CEP) </li></ul></ul><ul><li>Research Projects led to companies </li></ul><ul><ul><li>TelegraphCQ -> Truviso (Amalgamated) </li></ul></ul><ul><ul><li>Aurora -> Streambase </li></ul></ul><ul><ul><li>Streams -> Coral8 </li></ul></ul><ul><li>Big guys ready to jump in: BEA, IBM, Oracle, … </li></ul>
  19. 19. Requirements <ul><li>High Data Rates: 1K (SOA monitoring) up to 700K rec/sec (option trading) </li></ul><ul><li># queries: single digits to 10,000’s </li></ul><ul><li>Query complexity </li></ul><ul><ul><li>Full SQL + windows + events + analytics </li></ul></ul><ul><li>Persistence, replay, historical comparison </li></ul><ul><li>Huge range of Sources and Sinks </li></ul>
  20. 20. Stream QP: Micro-Architecture
  21. 21. Single Node Architecture (© 2007, Amalgamated Insight, Inc.). Diagram components: Ingress (Connectors, Transformations; XML, CSV, MQ, MSMQ, JDBC, .NET); Continuous Query Engine (Adaptive SQL Query Processor, Streaming SQL Query Processor, Concurrent Query Planner, Triggers/Rules, Active Data Replay Database); Egress (Connectors, Transformations; Proprietary APIs, XML, Message Bus, Alerts, Pub/Sub, Events); External Archive; Other CQE Instances.
  22. 22. Ingress Issues (performance) <ul><li>Must support high data rates </li></ul><ul><ul><li>700K ticks/second for FS </li></ul></ul><ul><ul><li>Wirespeed for networking/security </li></ul></ul><ul><li>Minimal latency </li></ul><ul><ul><li>FS trading particularly sensitive to this </li></ul></ul><ul><li>Fault tolerance </li></ul><ul><ul><li>Especially given remote sources </li></ul></ul><ul><li>Efficient (bulk) data transformation </li></ul><ul><ul><li>XML, text, binary, … </li></ul></ul><ul><li>Work well for both push and pull sources </li></ul>
  23. 23. Egress Issues (performance) <ul><li>Must support high data rates </li></ul><ul><li>Minimal latency </li></ul><ul><li>Fault tolerance </li></ul><ul><li>Efficient (bulk) data transformation </li></ul><ul><li>Buffering/Support for JDBC-style clients </li></ul><ul><li>Interaction with bulk warehouse loaders </li></ul><ul><li>Large-scale dissemination (Pub/Sub) </li></ul>
  24. 24. Query Processing (Single) <ul><li>Simple approach: </li></ul><ul><ul><li>Stream inputs are “scan” operators </li></ul></ul><ul><ul><li>Adapt operator plumbing to push/pull </li></ul></ul><ul><ul><ul><li>“Exchange” operators/Fjords </li></ul></ul></ul><ul><li>Need to run lots of these concurrently </li></ul><ul><ul><li>Index the queries? </li></ul></ul><ul><ul><li>Scheduling, Memory Mgmt. </li></ul></ul><ul><li>Must avoid I/O, cache misses to run at speed </li></ul><ul><li>Predicate push-down - a la Gigascope </li></ul>
  25. 25. QP (continued) <ul><li>Transactional/Correctness issues: </li></ul><ul><ul><li>Never-ending queries hold locks forever! </li></ul></ul><ul><ul><li>Need efficient heartbeat mechanism to keep things moving forward. </li></ul></ul><ul><ul><li>Dealing with corrections (e.g., in financial feeds). </li></ul></ul><ul><ul><li>Out-of-order/missing data </li></ul></ul><ul><ul><ul><li>“ ripples in the stream” can hurt clever scheduling mechanisms. </li></ul></ul></ul><ul><li>Integration with external code: </li></ul><ul><ul><li>Matlab, R, …, UDFs and UDAs </li></ul></ul>
  26. 26. Query Processing (Shared) <ul><li>Previous approach misses huge opportunity. </li></ul><ul><li>Individual execution leads to linear slowdown </li></ul><ul><ul><li>Until you fall off the memory cliff! </li></ul></ul><ul><li>Recall that we know all the queries </li></ul><ul><ul><li>we know when they will need data </li></ul></ul><ul><ul><li>we know what data they will need </li></ul></ul><ul><ul><li>we know what things they will compute </li></ul></ul><ul><li>Why run them individually (as if we didn’t know any of this)? </li></ul>
  27. 27. Shared Processing - The Überquery. No redundant modules = Super-Linear Query Scalability. Steps shown: (1) Form a “query plan” from the query text, e.g. SELECT T.symbol, AVG(T.price*T.volume) FROM Trades T [RANGE ‘5 sec’ SLIDE ‘3 sec’], SANDP500 S WHERE T.symbol = S.symbol AND T.volume > 5000 GROUP BY T.symbol. (2) The new query plan enters the system (the Shared Query Engine). (3) More queries arrive (SELECT … FROM … WHERE … GROUP BY …). (4) Queries get compiled into plans. (5) Each plan is folded into the global plan.
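One way to picture "no redundant modules" is a single shared filter that serves every registered query at once instead of one filter per query. The Python sketch below is a hypothetical illustration only: the class `SharedThresholdFilter` and its methods are invented for this example and are not taken from Truviso or TelegraphCQ. It assumes all queries have a predicate of the form `volume > threshold`, keeps the thresholds in one sorted index, and finds every query satisfied by a tuple with a single binary search.

```python
import bisect


class SharedThresholdFilter:
    """One sorted index shared by all queries of the form `volume > threshold`."""

    def __init__(self):
        self._thresholds = []  # kept sorted ascending
        self._query_ids = []   # parallel to _thresholds

    def add_query(self, query_id, threshold):
        # Queries can be added on the fly; insertion keeps the index sorted.
        i = bisect.bisect_right(self._thresholds, threshold)
        self._thresholds.insert(i, threshold)
        self._query_ids.insert(i, query_id)

    def matching_queries(self, volume):
        # Every query whose threshold is strictly below `volume` matches.
        i = bisect.bisect_left(self._thresholds, volume)
        return self._query_ids[:i]


shared = SharedThresholdFilter()
shared.add_query("q1", 5000)
shared.add_query("q2", 1000)
shared.add_query("q3", 10000)

print(shared.matching_queries(6000))  # queries with threshold below 6000
```

With this layout the per-tuple cost of the predicate grows with the logarithm of the number of queries plus the number of matches, rather than with one full predicate evaluation per query.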
  28. 28. Shared QP raises lots of new issues <ul><li>Scheduling based on data availability/location and work affinity. </li></ul><ul><li>Lots of bit-twiddling: need efficient bitmaps. </li></ul><ul><li>Query “folding” - how to combine (MQO) </li></ul><ul><li>On-the-fly query changes. </li></ul><ul><li>How does shared processing change the traditional architectural tradeoffs? </li></ul><ul><li>How to process across multiple: cores, dies, boxes, racks, rooms? </li></ul><ul><li>Refs: NiagaraCQ, CACQ, TelegraphCQ, Sailesh Krishnamurthy’s thesis </li></ul>
  29. 29. Archiving - Huge area <ul><li>Most streaming use-cases want access to historical information. </li></ul><ul><li>Compliance/Risk: also need to keep the data. </li></ul><ul><ul><li>Science apps need to keep raw data around too. </li></ul></ul><ul><li>In a high-volume streaming environment, going to disk is an absolute killer. </li></ul><ul><li>Obviously need clever techniques: </li></ul><ul><ul><ul><li>Sampling, Index update deferral, load shedding </li></ul></ul></ul><ul><ul><ul><li>Scheduling based on time-oriented queries </li></ul></ul></ul><ul><ul><ul><li>Good old buffering/prefetching </li></ul></ul></ul>
  30. 30. Stream QP: Macro-Architecture
  31. 31. HiFi - Taming the Data Flood Receptors Warehouses, Stores Dock doors, Shelves Regional Centers Headquarters Hierarchical Aggregation: Spatial & Temporal In-network Stream Query Processing and Storage Fast Data Path vs. Slow Data Path
  32. 32. Problem: Sensors are Noisy <ul><li>A simple RFID Experiment </li></ul><ul><li>2 adjacent shelves, 6 ft. wide </li></ul><ul><li>10 EPC-tagged items each, plus 5 moved between them </li></ul><ul><li>RFID antenna on each shelf </li></ul>
  33. 33. Shelf RFID - Ground Truth
  34. 34. Actual RFID Readings “ Restock every time inventory goes below 5”
  35. 35. VICE: Virtual Device Interface [Jeffery et al., Pervasive 2006, VLDBJ 07]. The “Virtual Device (VICE) API” is a natural place to hide much of the complexity arising from physical devices.
  36. 36. Query-based Data Cleaning Point Smooth CREATE VIEW smoothed_rfid_stream AS (SELECT receptor_id, tag_id FROM cleaned_rfid_stream [range by ’5 sec’, slide by ’5 sec’] GROUP BY receptor_id, tag_id HAVING count(*) >= count_T)
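The Smooth stage above keeps a (receptor, tag) pair only if it was read at least `count_T` times in the window. A minimal Python sketch of that rule follows; the function name `smooth`, the `(timestamp, receptor_id, tag_id)` tuple layout, and the half-open window boundary are assumptions made for illustration, and the default threshold of 2 is arbitrary.

```python
from collections import Counter


def smooth(readings, now, window_sec=5.0, count_t=2):
    """Return (receptor_id, tag_id) pairs read at least count_t times
    in the window (now - window_sec, now]."""
    counts = Counter(
        (receptor, tag)
        for ts, receptor, tag in readings
        if now - window_sec < ts <= now
    )
    return {pair for pair, n in counts.items() if n >= count_t}


readings = [
    (1, "shelf0", "t1"),
    (2, "shelf0", "t1"),
    (3, "shelf1", "t1"),  # a single stray read on the neighbouring shelf
    (4, "shelf0", "t2"),
]
print(smooth(readings, now=5, count_t=2))
```

Raising `count_t` suppresses more stray reads but also drops tags that are genuinely present yet read only occasionally.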
  37. 37. Query-based Data Cleaning Point Smooth Arbitrate CREATE VIEW arbitrated_rfid_stream AS (SELECT receptor_id, tag_id FROM smoothed_rfid_stream rs [range by ’5 sec’, slide by ’5 sec’] GROUP BY receptor_id, tag_id HAVING count(*) >= ALL (SELECT count(*) FROM smoothed_rfid_stream [range by ’5 sec’, slide by ’5 sec’] WHERE tag_id = rs.tag_id GROUP BY receptor_id))
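The Arbitrate stage above attributes each tag to the receptor that read it most often in the window (the `>= ALL` subquery). The following Python sketch illustrates that rule under assumptions: the function name `arbitrate` and the `(timestamp, receptor_id, tag_id)` tuple layout are hypothetical, and ties are kept for every tied receptor, which is how `>= ALL` behaves.

```python
from collections import Counter


def arbitrate(readings, now, window_sec=5.0):
    """Keep, for each tag, the receptor(s) with the highest read count
    in the window (now - window_sec, now]."""
    counts = Counter(
        (receptor, tag)
        for ts, receptor, tag in readings
        if now - window_sec < ts <= now
    )
    # Highest count seen for each tag across all receptors.
    best = {}
    for (receptor, tag), n in counts.items():
        best[tag] = max(best.get(tag, 0), n)
    return {(receptor, tag) for (receptor, tag), n in counts.items() if n == best[tag]}


readings = [
    (1, "shelf0", "t1"), (2, "shelf0", "t1"), (3, "shelf0", "t1"),
    (3, "shelf1", "t1"),                       # shelf0 wins tag t1
    (2, "shelf0", "t2"), (4, "shelf1", "t2"),  # tie on tag t2
]
print(arbitrate(readings, now=5))
```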
  38. 38. After Query-based Cleaning “ Restock every time inventory goes below 5”
  39. 39. Adaptive Smoothing [Jeffery et al. VLDB 2006]
  40. 40. SQL Abstraction Makes it Easy? <ul><li>Soft Sensors - e.g., </li></ul><ul><ul><li>“ LOUDMOUTH” sensor (VLDB 04) </li></ul></ul><ul><li>Quality and lineage </li></ul><ul><li>Optimization (power, etc.) </li></ul><ul><li>Pushdown of external validation information </li></ul><ul><li>Automatic/Adaptive query placement </li></ul><ul><li>Data archiving </li></ul><ul><li>Imperative processing </li></ul>
  41. 41. Some Challenges <ul><li>How to run across the full gamut of devices from motes to mainframes? </li></ul><ul><ul><li>What about running *really* in-the-network? </li></ul></ul><ul><li>Data/query placement and movement </li></ul><ul><ul><li>Adaptivity is key </li></ul></ul><ul><ul><li>“ Push down” is a small subset of this problem. </li></ul></ul><ul><ul><li>Sharing is also crucial here. </li></ul></ul><ul><li>Security, encryption, compression, etc. </li></ul><ul><li>Lots of issues due to devices and “physical world” problems. </li></ul>
  42. 42. It’s not just a sensor-net problem. Diagram labels: Edge Devices (PCs, PoS, Handhelds, Readers); Enterprise Apps (E-com, ERP, CRM, SCM); Transactional Data Stores (OLTP); Integration Bus; Batch Load; Analytical side (Enterprise Data Warehouse, Specialized Data Marts, OLAP, Business Intelligence, Data Mining, Reports, Analytics, Portal, Operational BI, Alerts, Dashboards). Problems called out: Distributed Data, Batch Latency, Exploding Data Volumes, Query Latency, Decision Latency.
  43. 43. Data Dissemination (Fan-Out) <ul><li>Many applications have large numbers of consumers. </li></ul><ul><li>Lots of interesting questions on large-scale pub/sub technology. </li></ul><ul><ul><li>Micro-scale: locality, scheduling, sharing, for huge numbers of subscriptions. </li></ul></ul><ul><ul><li>Macro-scale: dissemination trees, placement, sharing, … </li></ul></ul>
  44. 44. What to measure? (a research opportunity) <ul><li>High Data Rates/Throughput </li></ul><ul><ul><li>rec/sec; record size </li></ul></ul><ul><li>Number of concurrent queries. </li></ul><ul><li>Query complexity </li></ul><ul><li>Huge range of Sources and Sinks </li></ul><ul><ul><li>transformation and connector performance </li></ul></ul><ul><li>Minimal Benchmarking work so far: </li></ul><ul><ul><li>“ Linear Road” from Aurora group </li></ul></ul><ul><ul><li>CEP benchmark work by Pedro Bizarro </li></ul></ul>
  45. 45. Conclusions <ul><li>Two relevant trends: </li></ul><ul><ul><li>Metcalfe’s Law -> DB systems need to become more network-savvy. </li></ul></ul><ul><ul><li>Jim Gray and others have helped demonstrate the value of SQL to science. </li></ul></ul><ul><li>Stream query processing is where these two trends meet in the Grid world. </li></ul><ul><ul><li>A new (3rd) component of data management infrastructure. </li></ul></ul><ul><li>Lots of open research problems for the HPDC (and DB) community. </li></ul>
  46. 46. Resources <ul><li>Research Projects @ Berkeley </li></ul><ul><ul><li>TelegraphCQ - single-site stream processor </li></ul></ul><ul><ul><li>HiFi - Distributed/Hierarchical </li></ul></ul><ul><ul><li>see for links/papers </li></ul></ul><ul><li>Good jumping off point for CEP and related info: </li></ul><ul><li>The company: www.truviso.com </li></ul>