This presentation introduces the notion of Relational Streaming, its tie to “Big Data”, and how SQLstream turns the “What happened?” results of Big Data into “What is happening?!” through the power to “Query the Future”.
SQLstream was founded in 2003 with a vision of transforming the way streaming service, system and sensor data are processed. SQLstream is headquartered in San Francisco, with agents and partners in EMEA and APAC. Fontinalis Partners led our most recent funding round; Fontinalis is a venture capital firm with private equity roots and limited partners comprising some of the largest industrial families around the world. The company left stealth mode in 2008, after writing 1.2M lines of code to deliver the world’s first SQL standards-compliant stream-to-business platform. We have 8 patents, of which 4 are granted and 4 are pending.
Put simply, there is a new real-time data challenge facing many enterprises today. The common solution is to “store then query”: archive everything, then hope your queries can identify crucial trends in a timeframe of relevance. Data volumes are growing exponentially, making it too costly to analyze them with conventional technologies where you have to store all of the data, even if most of the data has a very limited “shelf life”. The costs come from needing specialized data warehousing systems to handle such large and ever-growing data volumes, and those license fees are almost always based on the volume of data stored. So why store everything if you are only really concerned with the results of analyses? At the same time, businesses are having to become nimbler and more agile. They need to consume and analyze data faster than their competitors, and fast enough to hold the attention of their customers while they are interacting with their product, service, systems or personnel.
Using SQLstream, an organization can now perform low-latency queries on data in flight, delivering immediate results. New or modified queries can be pushed into the system without a service outage. SQLstream can also act as a real-time filter to reduce the volume of low-value data stored in costly data archives. And because it uses standard SQL statements, it enables your development staff or partners to create value at a significantly reduced development cost.
SQLstream eliminates this pain by combining the features of relational databases with enterprise service bus concepts in a low-latency, in-memory implementation. SQLstream’s stream-to-business platform allows you to easily and incrementally plug in new sources of data, and similarly to plug in new destinations for results. Each consuming application creates a relational VIEW of the data that it needs to see. This VIEW is turned into a continuously executing real-time SQL query acting upon the live streaming data sources (including sources that represent the results of other streaming SQL queries) in order to deliver the stream of results that the business requires. These SQL queries are based on the 2003 ANSI and ISO SQL standards. They are ACTIVE queries in that they execute against the live streaming data while the data are still in flight, WITHOUT HAVING TO STORE THE DATA FIRST! SQLstream provides an ACTIVE relational database where hundreds of SQL VIEWS and QUERIES are continuously combing, indexing, aggregating and correlating massive volumes of data from hundreds or thousands of streaming sources, ALL WITHOUT STORING THE DATA FIRST. SQLstream assembles the record streams that each application has requested, so that all of the streaming source data can be reused and repurposed according to need, in real-time.
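To make the idea of many applications sharing one live source concrete, here is a minimal sketch in Python. It is not SQLstream code: the `StreamHub` class, its method names and the record fields are hypothetical, chosen only to illustrate how each consumer's view (a filter plus a projection) can be evaluated against every record as it arrives, with nothing stored first.

```python
class StreamHub:
    """Hypothetical hub: fans each incoming record out to registered views."""

    def __init__(self):
        self._views = []  # list of (where, select, sink) triples

    def create_view(self, where, select, sink):
        """Register a view: `where` filters records, `select` shapes them,
        and `sink` is the consuming application's callback."""
        self._views.append((where, select, sink))

    def publish(self, record):
        """Evaluate every view against one in-flight record.
        The record is never retained by the hub."""
        for where, select, sink in self._views:
            if where(record):
                sink(select(record))


# Two applications reuse the same source with different views.
errors, hosts = [], []
hub = StreamHub()
hub.create_view(lambda r: r["level"] == "ERROR", lambda r: r["msg"], errors.append)
hub.create_view(lambda r: True, lambda r: r["host"], hosts.append)

hub.publish({"host": "web1", "level": "INFO", "msg": "started"})
hub.publish({"host": "web2", "level": "ERROR", "msg": "disk full"})
```

After the two `publish` calls, `errors` holds only the error message while `hosts` holds both host names, which is the reuse-and-repurpose behaviour described above in miniature.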
SQLstream complements Hadoop / Map-Reduce solutions to answer both “What happened?” and “What is happening?”. Hadoop, with its phased approach, handles a lot of superscalar executions of 64MB chunks of data. When each phase is complete, and all the records are assembled and sorted from the chunks of records feeding upstream, the next phase can commence. SQLstream implements the Relational Streaming model. This can be visualized as a Directed Acyclic Graph of relational operators operating on streams of tuples synchronized around (normally) timestamps. This is very similar to an electronic logic circuit. The relational operators are the logic gates. The tuples are the binary data signals. Both propagate answers as soon as the results are available, with minimal latency. Both utilize dataflow execution and time synchronization where appropriate, and both offer pipelining parallelism and superscalar parallelism.
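The operator graph can be sketched with Python generators, where each generator plays the role of one relational operator and tuples flow through as soon as they are produced. This is an illustration only, not SQLstream's engine: the operator names (`select`, `project`, `tumbling_count`), the field names and the 10-second window are all assumptions made for the example.

```python
from collections import defaultdict


def select(rows, predicate):
    """Filter operator: pass through rows that satisfy the predicate."""
    for row in rows:
        if predicate(row):
            yield row


def project(rows, *cols):
    """Projection operator: keep only the named columns."""
    for row in rows:
        yield {c: row[c] for c in cols}


def tumbling_count(rows, key, width):
    """Aggregate operator: count rows per key in fixed windows of `width`
    seconds. A window's counts are emitted as soon as a row with a later
    window arrives, so timestamps synchronize the output."""
    window_start = None
    counts = defaultdict(int)
    for row in rows:
        start = row["ts"] - row["ts"] % width
        if window_start is not None and start != window_start:
            for k, n in sorted(counts.items()):
                yield {"window": window_start, key: k, "count": n}
            counts.clear()
        window_start = start
        counts[row[key]] += 1
    # The input is finite in this sketch, so flush the last open window.
    for k, n in sorted(counts.items()):
        yield {"window": window_start, key: k, "count": n}


readings = [
    {"ts": 0, "sensor": "a", "ok": True},
    {"ts": 3, "sensor": "b", "ok": False},
    {"ts": 5, "sensor": "a", "ok": True},
    {"ts": 12, "sensor": "a", "ok": True},
]

# Wire the operators into a small pipeline: select -> project -> aggregate.
pipeline = tumbling_count(
    project(select(readings, lambda r: r["ok"]), "ts", "sensor"),
    "sensor",
    10,
)
results = list(pipeline)
```

Because generators are lazy, no operator waits for an upstream phase to finish; each tuple is pulled through the whole chain one at a time, which is the pipelining half of the logic-circuit analogy.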
SQL was developed to elegantly process massive quantities of stored data. Using SQLstream, it works just as well in processing massive volumes of streaming data. It has proven scalability and sophisticated query optimizers. It enables rapid application development (a few SQL rules have immense power), and SQL skills are readily available in the marketplace. It allows easy migration of SQL queries and logic between databases, data warehouses and SQLstream. This query example shows how one would use SQLstream to find orders from New York that ship within one hour. The keyword STREAM maintains standards compatibility: without it the query would return a table, not a stream of results, and a stream of results can continue ad infinitum.
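The logic of the New York orders query can be sketched in Python as a time-windowed join between an order stream and a shipment stream. This is a stand-in for illustration, not the SQL on the slide: the function name `ships_within`, the event layout and the field names are assumptions, and events are assumed to arrive in timestamp order.

```python
from datetime import datetime, timedelta


def ships_within(events, city="New York", window=timedelta(hours=1)):
    """Yield (order_id, ship_time) for orders from `city` that ship within
    `window` of being placed.

    `events` is a timestamp-ordered iterable of dicts, either
    {"kind": "order", "id": ..., "city": ..., "ts": ...} or
    {"kind": "shipment", "order_id": ..., "ts": ...}.
    """
    pending = {}  # order id -> order time, for orders still inside the window
    for ev in events:
        now = ev["ts"]
        # Evict orders whose window has closed, so state stays bounded.
        for oid in [o for o, placed in pending.items() if now - placed > window]:
            del pending[oid]
        if ev["kind"] == "order":
            if ev["city"] == city:
                pending[ev["id"]] = now
        else:
            # A shipment matches only if its order is still pending.
            if pending.pop(ev["order_id"], None) is not None:
                yield ev["order_id"], now


t0 = datetime(2012, 1, 1, 9, 0)
events = [
    {"kind": "order", "id": 1, "city": "New York", "ts": t0},
    {"kind": "order", "id": 2, "city": "Boston", "ts": t0 + timedelta(minutes=5)},
    {"kind": "order", "id": 3, "city": "New York", "ts": t0 + timedelta(minutes=10)},
    {"kind": "shipment", "order_id": 1, "ts": t0 + timedelta(minutes=30)},
    {"kind": "shipment", "order_id": 2, "ts": t0 + timedelta(minutes=40)},
    {"kind": "shipment", "order_id": 3, "ts": t0 + timedelta(minutes=90)},
]
matches = list(ships_within(events))
```

Here only order 1 is reported: order 2 is not from New York, and order 3 ships 80 minutes after it was placed. Because the function is a generator, it would keep emitting matches for as long as events keep arriving, which is the sense in which a stream of results has no end.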
SQLstream operates on streams of data, which we term “S3 Data”, for Sensor, System and Service Data. The Sensor Data category includes vehicular sensor data, GPS data, RFID data, transportation data and engine data. Machine-to-machine networks, smart energy, manufacturing sensors and so forth all emit this type of sensor data. The System Data category, which some call Machine Data, covers log files emitted from applications and servers, and can be used for real-time security, fraud and compliance detection, as well as cloud computing monitoring, service level monitoring, and so forth. Finally, the Service Data category includes all manner of service data, ranging from SMS text messages, Call Detail Records (CDRs) and fraud logic alerts to real-time pricing and promotion for the “Active Internet”: context-dependent content streamed from low-latency relational streaming output.
We are one of only two closed source solutions within Mozilla. We power Mozilla’s real-time analytics. Note the “Powered by SQLstream” logo on the bottom right of their web display. See http://gigaom.com/cloud/dataflow-sqlstream/ for more information. The SQLstream application processes all of the log files from Mozilla download servers in real-time: parsing the files, streaming the data, mapping IP addresses to longitude and latitude, finding the nearest town, city or village, and performing a range of analytics on the streams to feed a Hadoop cluster (HBase) that displays historical information complemented by SQLstream’s real-time analytics.
Another SQLstream customer is a division of the Australian Government. SQLstream monitors the vehicular traffic on their road systems by processing vehicle and road sensor data streams, and determining congestion, instantaneous average speeds, and much more in real-time. The application monitors road traffic flows down to a granularity of 10-meter segments. The results are then displayed on Google Earth and Google Maps underlays. The customer has many more applications in planning stages, including real-time metrics and integration of data from all modes of transportation, continuously and in real-time, using SQLstream as the relational streaming engine.
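As a rough illustration of the kind of calculation involved, the following Python sketch keeps a sliding-window average speed per 10-meter road segment. It does not describe the customer's actual application: the reading layout (timestamp in seconds, position in meters along the road, speed in km/h), the 60-second window and the function name `segment_speeds` are all assumptions.

```python
from collections import defaultdict, deque

SEGMENT_M = 10  # segment length in meters, matching the granularity above


def segment_speeds(readings, window_s=60):
    """For each reading (ts, position_m, speed_kmh), in time order, yield
    (ts, segment, average speed in that segment over the last window_s
    seconds). Only the readings inside the window are kept in memory."""
    recent = defaultdict(deque)  # segment -> deque of (ts, speed)
    totals = defaultdict(float)  # segment -> running sum of speeds in window
    for ts, position_m, speed in readings:
        seg = int(position_m // SEGMENT_M)
        q = recent[seg]
        q.append((ts, speed))
        totals[seg] += speed
        # Drop readings that have aged out of this segment's window.
        while q[0][0] <= ts - window_s:
            _, old_speed = q.popleft()
            totals[seg] -= old_speed
        yield ts, seg, totals[seg] / len(q)


readings = [
    (0, 3.0, 60.0),   # segment 0
    (10, 7.5, 40.0),  # segment 0
    (20, 12.0, 80.0), # segment 1
    (70, 4.0, 30.0),  # segment 0, earlier readings have expired
]
averages = list(segment_speeds(readings))
```

Each new reading updates its segment's average immediately, so a congestion rule (for example, an average below some threshold) could be evaluated on every output tuple rather than on a stored batch.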
In conclusion, SQLstream offers a fundamental breakthrough in real-time data management. Customers and partners can create complex applications processing massive data volumes, easily reuse the results across multiple applications, and share streaming data sources in real-time across the enterprise. This delivers new insights into their existing voluminous but previously untamed data, with dramatically lower development and ownership costs. SQLstream is a key enabler for the real-time enterprise, or as we say, “Query the Future”.
Customers and Partners typically start with a small pilot and rapidly move on to develop and deploy key streaming applications. Given that the entire application is written in standards-based SQL, it is normally easy for the customer’s or the partner’s developers to take ownership. SQLstream and SQLstream partners are comfortable working with fixed-price deliverables and projects where that makes sense. We have a 5-day certification course for Partners, and then generally mentor a Partner through the first project.