SUPPORTING QUERYING ON MULTI-MILLION EVENTS PER SECOND
SPEAKER: Damian Black, CEO, SQLstream


Monday, July 30, 2012
Querying at Multi-million Events per second

Real-time Big Data through Relational Streaming

Copyright © 2012 SQLstream Inc.

Real-time Big Data through Relational Streaming

     So what is a Streaming Big Data Platform?

     ✓ Stream any data in, immediately stream out real-time answers.
     ✓ Continuously analyze and process massive data volumes.
     ✓ React in real-time to each and every new record.

     And what is Relational Streaming?

     ✓ A paradigm for processing Streaming Big Data tuples.
     ✓ Familiar relational expressions with automatic optimization.
     ✓ Relational queries executed continuously on a massively parallel scale.

Confidential and Trade Secret SQLstream Inc. © 2012

Comparison of Techniques for Scaling

                              Data Warehouses            Hadoop and HDFS            Relational Streaming

How are the tuples held?      Appear as a single         Appear as a single         Appear as a single
                              Fat Table                  Fat File                   Fat Stream

How do we change the data?    New tuples overwrite old   New tuples from old        New tuples from old
                              Old tuples are updated     Old tuples left alone      Old tuples left alone

What kind of cluster scale?   10s+                       1000s+                     1000s+
                              Shared state propagation   No shared state            No shared state




Parallel Processing done Hadoop style
     Historical queries: Finite tuple sets are mapped into finite tuple sets.

     Independent chunks: Need to break data into independent chunks.

     Procedural, phased: Procedural, step-wise process used.

     For example, great for sorting many years’ gaming scores under different keys.

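The phased, chunk-based style described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not Hadoop code: the `SCORES` data, the chunk size and all function names are hypothetical, and a real job would distribute the chunks across machines.

```python
from heapq import merge

# Hypothetical finite data set: (player, game, score) tuples.
SCORES = [
    ("ann", "chess", 310), ("bob", "chess", 120),
    ("ann", "go", 95), ("cy", "go", 410), ("bob", "go", 200),
]

def chunk(tuples, size):
    """Break the finite tuple set into independent chunks."""
    return [tuples[i:i + size] for i in range(0, len(tuples), size)]

def map_phase(chunks, key):
    """Phase 1: sort each chunk on its own (no state shared between chunks)."""
    return [sorted(c, key=key) for c in chunks]

def reduce_phase(sorted_chunks, key):
    """Phase 2: merge the sorted chunks into one finite, sorted tuple set."""
    return list(merge(*sorted_chunks, key=key))

def batch_sort(tuples, key, chunk_size=2):
    """Run both phases in order; nothing is emitted until the batch is complete."""
    return reduce_phase(map_phase(chunk(tuples, chunk_size), key), key)

# The same stored data sorted under two different keys.
by_score = batch_sort(SCORES, key=lambda t: -t[2])
by_player = batch_sort(SCORES, key=lambda t: (t[0], t[1]))
```

Each new sort key means another full pass over the stored data, which suits historical questions but not a continuously changing answer.
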
Parallel Processing with Relational Streaming
     Historical queries: Finite tuple sets are mapped into finite tuple sets.
     Continuous queries: Infinite tuple streams mapped to infinite tuple streams.

     Independent chunks: Need to break data into independent chunks.
     Ordered streams: Data are processed in the context of streams.

     Procedural, phased: Procedural, step-wise process used.
     Declarative, parallel: Declarative, fine-grained parallel processing.

     For example, great for giving the real-time leaderboard over a rolling minute.

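A rolling-minute leaderboard can be sketched as follows. This is a minimal single-process illustration in Python, not how a relational streaming engine implements it; the class name, the event shape `(timestamp, player, points)` and the assumptions that timestamps arrive in order and points are positive are all hypothetical.

```python
from collections import deque, defaultdict

class RollingLeaderboard:
    """Keeps per-player totals over the last `window` seconds of events."""

    def __init__(self, window=60.0):
        self.window = window
        self.events = deque()           # (timestamp, player, points) in arrival order
        self.totals = defaultdict(int)  # player -> points currently inside the window

    def push(self, ts, player, points):
        """Ingest one event and return the ranking as of that event.

        Assumes non-decreasing timestamps and positive point values.
        """
        self.events.append((ts, player, points))
        self.totals[player] += points
        # Expire events that have fallen out of the rolling window.
        while self.events and self.events[0][0] <= ts - self.window:
            _, old_player, old_points = self.events.popleft()
            self.totals[old_player] -= old_points
            if self.totals[old_player] == 0:
                del self.totals[old_player]
        # Highest total first; ties broken by player name.
        return sorted(self.totals.items(), key=lambda kv: (-kv[1], kv[0]))

board = RollingLeaderboard(window=60)
board.push(0, "ann", 10)
board.push(30, "bob", 25)
ranking = board.push(61, "ann", 5)   # ann's event at t=0 has expired
```

Every new record produces an updated answer, in contrast to the batch style where the answer appears only after the whole data set has been processed.
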

SQL: The only declarative dataflow language standard
        » The key to massive scale parallelism is Dataflow Execution

        » Hadoop provides Dataflow Execution, but only in waves:
                  » Each wave consists of a procedural execution phase
                  » Tuple sets are transformed to new tuple sets
                  » Tuple sets are chunked and shuffled over a “hash partition” scheme

        » Relational Streaming maximizes Dataflow Execution:
                  » SQL is a declarative language amenable to intelligent “superscalar” optimization
                  » Tuple streams are also shuffled using hash partitioning

        Relational Streaming allows both pipelining and parallel processing.

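Hash partitioning, which both approaches use to shuffle tuples, can be illustrated with a short Python sketch. The function names and the sample events are hypothetical, and CRC32 is chosen here only because it gives a stable hash across runs; neither Hadoop nor SQLstream is claimed to use it.

```python
import zlib
from collections import defaultdict

def partition_of(key, n_partitions):
    """Stable hash partition: the same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % n_partitions

def shuffle(tuples, key_index, n_partitions):
    """Route each tuple to the partition chosen by hashing its key field."""
    partitions = defaultdict(list)
    for t in tuples:
        partitions[partition_of(t[key_index], n_partitions)].append(t)
    return dict(partitions)

# Hypothetical (song_id, points) events spread over four workers.
EVENTS = [("song-1", 5), ("song-2", 3), ("song-1", 7), ("song-3", 1)]
PARTS = shuffle(EVENTS, key_index=0, n_partitions=4)
```

Because all tuples for one key land in the same partition and keep their arrival order there, each worker can aggregate its keys without sharing state with the others.
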
Tuple Processing: Hadoop versus Relational Streaming

        Hadoop style: data chunking, coarse-grained dataflow.

        Relational Streaming: DAGs of fine-grained dataflow.

Application Example: MMO Multiplayer Scoring
        » Many MMO servers streaming game action in real-time.
        » Streaming analytics maintained over varying time windows.
        » Aggregated and continuously sorted: streaming “order by”.

        [Diagram: many game servers, each emitting a stream.]



Streaming SQL: MMO Multiplayer Scoring
    -- Continuously maintain a time-decayed score per song.
    CREATE OR REPLACE PUMP "SONG_SCORE_PUMP" STOPPED AS INSERT INTO "S_SONG_SCORE" ("songId", "SCORE")
      SELECT STREAM
         "SONG_ID" AS "songId",
               -- Points from the last 7 days count in full; points from each
               -- earlier week are weighted 0.5, 0.25 and 0.125 respectively.
               SUM("POINTS") OVER "LAST_WEEK" +
               ((SUM("POINTS") OVER "LAST_2_WEEKS" - SUM("POINTS") OVER "LAST_WEEK") * 0.5) +
               ((SUM("POINTS") OVER "LAST_3_WEEKS" - SUM("POINTS") OVER "LAST_2_WEEKS") * 0.25) +
               ((SUM("POINTS") OVER "LAST_4_WEEKS" - SUM("POINTS") OVER "LAST_3_WEEKS") * 0.125) AS "SCORE"
        FROM "S_SONG_SCORE_CHANGE"
        -- One sliding window per look-back period, partitioned by song.
        WINDOW
              "LAST_WEEK" AS (PARTITION BY "SONG_ID" RANGE INTERVAL '7' DAY PRECEDING),
              "LAST_2_WEEKS" AS (PARTITION BY "SONG_ID" RANGE INTERVAL '14' DAY PRECEDING),
              "LAST_3_WEEKS" AS (PARTITION BY "SONG_ID" RANGE INTERVAL '21' DAY PRECEDING),
              "LAST_4_WEEKS" AS (PARTITION BY "SONG_ID" RANGE INTERVAL '28' DAY PRECEDING);




  » Millions of events per second
  » Real-time game scoring
  » Amazon EC2

  [Diagram: many game servers, each emitting a stream.]

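The arithmetic in the SELECT expression of the query above can be checked with a small Python sketch. It recomputes the score for one song from a list of `(timestamp, points)` events at a single instant, whereas the streaming query maintains it continuously per `SONG_ID`. The function names, the sample data and the inclusive treatment of window boundaries are assumptions made for illustration.

```python
from datetime import datetime, timedelta

WEEK = timedelta(days=7)
# Weights for points aged 0-1, 1-2, 2-3 and 3-4 weeks, as in the query.
WEIGHTS = [1.0, 0.5, 0.25, 0.125]

def window_sum(events, now, span):
    """Sum of points whose timestamp lies within `span` before `now` (inclusive)."""
    return sum(points for ts, points in events if now - span <= ts <= now)

def song_score(events, now):
    """Decayed score built from cumulative window sums, mirroring the SELECT expression."""
    sums = [window_sum(events, now, WEEK * k) for k in (1, 2, 3, 4)]
    score = sums[0] * WEIGHTS[0]
    for k in range(1, 4):
        # Difference of two cumulative sums isolates the points from week k+1 only.
        score += (sums[k] - sums[k - 1]) * WEIGHTS[k]
    return score

NOW = datetime(2012, 7, 30)
EVENTS = [
    (NOW - timedelta(days=1), 10),    # last week, weight 1
    (NOW - timedelta(days=10), 8),    # second week back, weight 0.5
    (NOW - timedelta(days=20), 4),    # third week back, weight 0.25
    (NOW - timedelta(days=25), 16),   # fourth week back, weight 0.125
    (NOW - timedelta(days=40), 100),  # older than 28 days, ignored
]
SCORE = song_score(EVENTS, NOW)  # 10 + 8*0.5 + 4*0.25 + 16*0.125
```
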
Relational Streaming / Hadoop Synergy

     » Relational Stream Processors (RSPs)
     » Co-located with Hadoop Servers to stream/re-stream local data
     » RSPs + Hadoop integrate Real-time and Historical processing:
               » Querying the future – Continuous ETL and Analytics (parallel pipelines)
               » Querying the past – Hadoop batch jobs on stored tuples (parallel batches)
               » Re-streaming and Re-querying (for example, scenario & sensitivity analyses)

     [Diagram: Hadoop & Relational Streaming Server. Hadoop stages: Split, Map, Combine, Sort, Reduce. Relational Streaming operators: Select, Project, Join, Agg, Order, Group.]
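One way to picture re-streaming for scenario analysis is to replay stored tuples through the same continuous logic with an adjusted input. The Python generators below are a loose illustration only; `running_totals`, `restream`, the `scale` parameter and the `STORED` data are hypothetical and do not represent the RSP or Hadoop interfaces.

```python
def running_totals(stream):
    """Continuous query: emit the updated total for a key after every tuple."""
    totals = {}
    for key, value in stream:
        totals[key] = totals.get(key, 0) + value
        yield key, totals[key]

def restream(stored, scale=1.0):
    """Replay stored tuples as a stream, optionally scaling values for a what-if scenario."""
    for key, value in stored:
        yield key, value * scale

# Hypothetical tuples previously landed in storage.
STORED = [("a", 2), ("b", 3), ("a", 4)]

baseline = list(running_totals(restream(STORED)))
scenario = list(running_totals(restream(STORED, scale=2.0)))
```

The query logic is written once and applied unchanged to live input or to replayed history; only the source of the stream differs.
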




Use Cases for S3 Data (Sensor x System x Service)
        » Sensor Data:
                  » Vehicle, GPS and transportation sensors

                  » M2M sensor networks

                  » Smart Energy sensors

        » System Data:
                  » Log file processing for real-time Security, Compliance, Fraud

                  » Cloud performance monitoring

                  » Service Level Monitoring

        » Service Data:
                  » SMS analysis, CDRs for billing, Fraud

                  » Real-time pricing and promotion for eCommerce

                  » Active Internet (real-time context-dependent content)




Relational Streaming – A New Data Management Quadrant

[Quadrant diagram. Vertical axis: High-level Declarative Language & Operation (top) to Low-level Procedural Language & Operation (bottom). Horizontal axis: Historical analysis, Periodic batches (left) to Continuous analysis, Real-time processing (right).]

Conclusions: Relational Streaming – the next “Big Data” frontier?

  Streaming Views: Any view of any data, in real-time, and all the time.
  Real-time Reaction: Harness real-time data and react and adapt in real-time.
  Massively Parallel: Deliver fine-grained parallelism on a massive scale.

  We already query historical data… let’s now query future data!

Thanks! Any questions?




Monday, July 30, 2012
Monday, July 30, 2012

More Related Content

Similar to SUPPORTING QUERYING ON MULTI-MILLION EVENTS PER SECOND from Structure:Data 2012

Polyglot persistence with no sql
Polyglot persistence with no sqlPolyglot persistence with no sql
Polyglot persistence with no sqlMichael Lehmann
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBasedarach
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera, Inc.
 
Microsoft - The Big Data opportunity
Microsoft - The Big Data opportunityMicrosoft - The Big Data opportunity
Microsoft - The Big Data opportunityLee Stott
 
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011darach
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Nathan Bijnens
 
Processing Big Data
Processing Big DataProcessing Big Data
Processing Big Datacwensel
 
Oracle+golden+gate+introduction
Oracle+golden+gate+introductionOracle+golden+gate+introduction
Oracle+golden+gate+introductionxiakaicd
 
Scalable vertical search engine with hadoop
Scalable vertical search engine with hadoopScalable vertical search engine with hadoop
Scalable vertical search engine with hadoopdatasalt
 
Integrating Hadoop Into the Enterprise
Integrating Hadoop Into the EnterpriseIntegrating Hadoop Into the Enterprise
Integrating Hadoop Into the EnterpriseDataWorks Summit
 
Hadoop Summit 2012 | Integrating Hadoop Into the Enterprise
Hadoop Summit 2012 | Integrating Hadoop Into the EnterpriseHadoop Summit 2012 | Integrating Hadoop Into the Enterprise
Hadoop Summit 2012 | Integrating Hadoop Into the EnterpriseCloudera, Inc.
 
Distributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdDistributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdSATOSHI TAGOMORI
 
Starfish: A Self-tuning System for Big Data Analytics
Starfish: A Self-tuning System for Big Data AnalyticsStarfish: A Self-tuning System for Big Data Analytics
Starfish: A Self-tuning System for Big Data AnalyticsGrant Ingersoll
 
Hadoop and its Ecosystem Components in Action
Hadoop and its Ecosystem Components in ActionHadoop and its Ecosystem Components in Action
Hadoop and its Ecosystem Components in ActionAndrew Brust
 
3. Sql Services 概览
3. Sql Services 概览3. Sql Services 概览
3. Sql Services 概览GaryYoung
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Jonathan Seidman
 
Bangalore cloudstack user group
Bangalore cloudstack user groupBangalore cloudstack user group
Bangalore cloudstack user groupShapeBlue
 
SQL Data Service Overview
SQL Data Service OverviewSQL Data Service Overview
SQL Data Service OverviewEric Nelson
 

Similar to SUPPORTING QUERYING ON MULTI-MILLION EVENTS PER SECOND from Structure:Data 2012 (20)

Os Pittaro
Os PittaroOs Pittaro
Os Pittaro
 
Polyglot persistence with no sql
Polyglot persistence with no sqlPolyglot persistence with no sql
Polyglot persistence with no sql
 
Complex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBaseComplex Er[jl]ang Processing with StreamBase
Complex Er[jl]ang Processing with StreamBase
 
Cloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for HadoopCloudera Impala: A modern SQL Query Engine for Hadoop
Cloudera Impala: A modern SQL Query Engine for Hadoop
 
Microsoft - The Big Data opportunity
Microsoft - The Big Data opportunityMicrosoft - The Big Data opportunity
Microsoft - The Big Data opportunity
 
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
StreamBase - Embedded Erjang - Erlang User Group London - 20th April 2011
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013
 
Processing Big Data
Processing Big DataProcessing Big Data
Processing Big Data
 
Oracle+golden+gate+introduction
Oracle+golden+gate+introductionOracle+golden+gate+introduction
Oracle+golden+gate+introduction
 
Scalable vertical search engine with hadoop
Scalable vertical search engine with hadoopScalable vertical search engine with hadoop
Scalable vertical search engine with hadoop
 
Integrating Hadoop Into the Enterprise
Integrating Hadoop Into the EnterpriseIntegrating Hadoop Into the Enterprise
Integrating Hadoop Into the Enterprise
 
Hadoop Summit 2012 | Integrating Hadoop Into the Enterprise
Hadoop Summit 2012 | Integrating Hadoop Into the EnterpriseHadoop Summit 2012 | Integrating Hadoop Into the Enterprise
Hadoop Summit 2012 | Integrating Hadoop Into the Enterprise
 
Distributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdDistributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentd
 
Starfish: A Self-tuning System for Big Data Analytics
Starfish: A Self-tuning System for Big Data AnalyticsStarfish: A Self-tuning System for Big Data Analytics
Starfish: A Self-tuning System for Big Data Analytics
 
Hadoop and its Ecosystem Components in Action
Hadoop and its Ecosystem Components in ActionHadoop and its Ecosystem Components in Action
Hadoop and its Ecosystem Components in Action
 
Introducing DynamoDB
Introducing DynamoDBIntroducing DynamoDB
Introducing DynamoDB
 
3. Sql Services 概览
3. Sql Services 概览3. Sql Services 概览
3. Sql Services 概览
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
 
Bangalore cloudstack user group
Bangalore cloudstack user groupBangalore cloudstack user group
Bangalore cloudstack user group
 
SQL Data Service Overview
SQL Data Service OverviewSQL Data Service Overview
SQL Data Service Overview
 

More from Gigaom

Structure 2014 - The strategic value of the cloud - Joe Weinman
Structure 2014 - The strategic value of the cloud - Joe WeinmanStructure 2014 - The strategic value of the cloud - Joe Weinman
Structure 2014 - The strategic value of the cloud - Joe WeinmanGigaom
 
Structure 2014 - The right and wrong way to scale - Rackspace
Structure 2014 - The right and wrong way to scale - RackspaceStructure 2014 - The right and wrong way to scale - Rackspace
Structure 2014 - The right and wrong way to scale - RackspaceGigaom
 
Structure 2014 - The future of cloud computing survey results
Structure 2014 - The future of cloud computing survey resultsStructure 2014 - The future of cloud computing survey results
Structure 2014 - The future of cloud computing survey resultsGigaom
 
Structure 2014 - Launchpad Competition
Structure 2014 - Launchpad CompetitionStructure 2014 - Launchpad Competition
Structure 2014 - Launchpad CompetitionGigaom
 
Structure 2014 - Disrupting the data center - Intel sponsor workshop
Structure 2014 - Disrupting the data center - Intel sponsor workshopStructure 2014 - Disrupting the data center - Intel sponsor workshop
Structure 2014 - Disrupting the data center - Intel sponsor workshopGigaom
 
Structure 2014 - Cloud trends - Battery
Structure 2014 - Cloud trends - BatteryStructure 2014 - Cloud trends - Battery
Structure 2014 - Cloud trends - BatteryGigaom
 
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...Gigaom
 
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...Gigaom
 
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit BendovStructure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit BendovGigaom
 
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...Gigaom
 
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA, Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA, Gigaom
 
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari GesherStructure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari GesherGigaom
 
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris HaddadStructure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris HaddadGigaom
 
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...Gigaom
 
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrathStructure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrathGigaom
 
Structure Data 2014: IS VIDEO BIG DATA?, Steve Russell
Structure Data 2014: IS VIDEO BIG DATA?, Steve RussellStructure Data 2014: IS VIDEO BIG DATA?, Steve Russell
Structure Data 2014: IS VIDEO BIG DATA?, Steve RussellGigaom
 
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan WaiteStructure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan WaiteGigaom
 
How Data is Remaking E-commerce - from Roadmap 2013
How Data is Remaking E-commerce - from Roadmap 2013How Data is Remaking E-commerce - from Roadmap 2013
How Data is Remaking E-commerce - from Roadmap 2013Gigaom
 
25 Favorite Experiences in Tech - from Roadmap 2013
25 Favorite Experiences in Tech - from Roadmap 201325 Favorite Experiences in Tech - from Roadmap 2013
25 Favorite Experiences in Tech - from Roadmap 2013Gigaom
 
How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013Gigaom
 

More from Gigaom (20)

Structure 2014 - The strategic value of the cloud - Joe Weinman
Structure 2014 - The strategic value of the cloud - Joe WeinmanStructure 2014 - The strategic value of the cloud - Joe Weinman
Structure 2014 - The strategic value of the cloud - Joe Weinman
 
Structure 2014 - The right and wrong way to scale - Rackspace
Structure 2014 - The right and wrong way to scale - RackspaceStructure 2014 - The right and wrong way to scale - Rackspace
Structure 2014 - The right and wrong way to scale - Rackspace
 
Structure 2014 - The future of cloud computing survey results
Structure 2014 - The future of cloud computing survey resultsStructure 2014 - The future of cloud computing survey results
Structure 2014 - The future of cloud computing survey results
 
Structure 2014 - Launchpad Competition
Structure 2014 - Launchpad CompetitionStructure 2014 - Launchpad Competition
Structure 2014 - Launchpad Competition
 
Structure 2014 - Disrupting the data center - Intel sponsor workshop
Structure 2014 - Disrupting the data center - Intel sponsor workshopStructure 2014 - Disrupting the data center - Intel sponsor workshop
Structure 2014 - Disrupting the data center - Intel sponsor workshop
 
Structure 2014 - Cloud trends - Battery
Structure 2014 - Cloud trends - BatteryStructure 2014 - Cloud trends - Battery
Structure 2014 - Cloud trends - Battery
 
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
Structure Data 2014: HOW MICRODATA CAN SAY A LOT ABOUT MACROECONOMICS, David ...
 
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
Structure Data 2014: QLIK SPONSOR WORKSHOP: ANALYTICS THE WAY NATURE INTENDED...
 
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit BendovStructure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
Structure Data 2014: FIVE MYTHS ABOUT BIG DATA, Amit Bendov
 
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
Structure Data 2014: AMID BILLIONS OF METRICS, YOUR SOFTWARE IS TRYING TO TEL...
 
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA, Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
Structure Data 2014: SISENSE SPONSOR WORKSHOP: ON BEER, CHIPS AND DATA,
 
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari GesherStructure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
Structure Data 2014: INVERTING 80/20: BEYOND BESPOKE BIG DATA, Ari Gesher
 
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris HaddadStructure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
Structure Data 2014: TRACKING A SOCCER GAME WITH BIG DATA, Chris Haddad
 
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
Structure Data 2014: TECH AGAINST HUMAN TRAFFICKING AND ILLICIT NETWORKS, Jus...
 
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrathStructure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
Structure Data 2014: DATA DRIVEN DESIGN AT FORMULA ONE SPEED, Geoff McGrath
 
Structure Data 2014: IS VIDEO BIG DATA?, Steve Russell
Structure Data 2014: IS VIDEO BIG DATA?, Steve RussellStructure Data 2014: IS VIDEO BIG DATA?, Steve Russell
Structure Data 2014: IS VIDEO BIG DATA?, Steve Russell
 
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan WaiteStructure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
Structure Data 2014: BIG DATA ANALYTICS RE-INVENTED, Ryan Waite
 
How Data is Remaking E-commerce - from Roadmap 2013
How Data is Remaking E-commerce - from Roadmap 2013How Data is Remaking E-commerce - from Roadmap 2013
How Data is Remaking E-commerce - from Roadmap 2013
 
25 Favorite Experiences in Tech - from Roadmap 2013
25 Favorite Experiences in Tech - from Roadmap 201325 Favorite Experiences in Tech - from Roadmap 2013
25 Favorite Experiences in Tech - from Roadmap 2013
 
How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013How Moore’s Law is Influencing Design - from Roadmap 2013
How Moore’s Law is Influencing Design - from Roadmap 2013
 


SUPPORTING QUERYING ON MULTI-MILLION EVENTS PER SECOND from Structure:Data 2012

  • 1. SUPPORTING QUERYING ON MULTI-MILLION EVENTS PER SECOND SPEAKER: Damian Black CEO SQLstream Monday, July 30, 2012
  • 2. Querying at Multi-million Events per Second: Real-time Big Data through Relational Streaming. Copyright © 2012 SQLstream Inc.
  • 3. Real-time Big Data through Relational Streaming
      So what is a Streaming Big Data Platform?
        ✓ Stream any data in, immediately stream out real-time answers.
        ✓ Continuously analyze and process massive data volumes.
        ✓ React in real-time to each and every new record.
      And what is Relational Streaming?
        ✓ A paradigm for processing Streaming Big Data tuples.
        ✓ Familiar relational expressions with automatic optimization.
        ✓ Relational queries executed continuously on a massively parallel scale.
      Confidential and Trade Secret SQLstream Inc. © 2012
  • 4. Comparison of Techniques for Scaling
                                  | Data Warehouses              | Hadoop and HDFS             | Relational Streaming
      How are the tuples held?    | Appear as a single Fat Table | Appear as a single Fat File | Appear as a single Fat Stream
      How do we change the data?  | New tuples overwrite old;    | New tuples from old;        | New tuples from old;
                                  | old tuples are updated       | old tuples left alone       | old tuples left alone
      What kind of cluster scale? | 10s+;                        | 1000s+;                     | 1000s+;
                                  | shared state propagation     | no shared state             | no shared state
  • 5. Parallel Processing done Hadoop
      Historical queries: Finite tuple sets are mapped into finite tuple sets.
      Independent chunks: Need to break data into independent chunks.
      Procedural, phased: Procedural, step-wise process used.
      For example, great for sorting many years' gaming scores under different keys.
  • 6. Parallel Processing with Relational
      Historical queries: Finite tuple sets are mapped into finite tuple sets.
      Continuous queries: Infinite tuple streams mapped to infinite tuple streams.
      Independent chunks: Need to break data into independent chunks.
      Ordered streams: Data are processed in the context of streams.
      Procedural, phased: Procedural, step-wise process used.
      Declarative, parallel: Declarative, fine-grained parallel processing.
      For example, great for giving the real-time leaderboard over a rolling minute.
  • 7. SQL: The only declarative dataflow language standard
      » The key to massive scale parallelism is Dataflow Execution
      » Hadoop provides Dataflow Execution, but only in waves:
          » Each wave consists of a procedural execution phase
          » Tuple sets are transformed to new tuple sets
          » Tuple sets are chunked and shuffled over a “hash partition” scheme
      » Relational Streaming maximizes Dataflow Execution:
          » SQL is a declarative language amenable to intelligent optimization
          » Tuple streams are shuffled also using hash partitioning
      Relational Streaming allows both pipelining and “superscalar” parallel processing.
  • 8. Tuple Processing: Hadoop versus Relational Streaming
      Hadoop style: data chunking, coarse-grained dataflow.
      Relational Streaming: DAGs of fine-grained dataflow.
  • 9. Application Example: MMO Multiplayer Scoring
      » Many MMO servers streaming game action in real-time.
      » Streaming analytics maintained over varying time windows.
      » Aggregated and continuously sorted: streaming “order by”.
      [Diagram: many servers, each emitting a stream.]
  • 10. Streaming SQL: MMO Multiplayer Scoring
      -- Continuously insert a decayed score per song into the output stream.
      CREATE OR REPLACE PUMP "SONG_SCORE_PUMP" STOPPED AS
      INSERT INTO "S_SONG_SCORE" ("songId", "SCORE")
      SELECT STREAM
          "SONG_ID" AS "songId",
          -- Latest week at full weight; each older week counts half as much as the one after it.
          SUM("POINTS") OVER "LAST_WEEK"
          + ((SUM("POINTS") OVER "LAST_2_WEEKS" - SUM("POINTS") OVER "LAST_WEEK") * 0.5)
          + ((SUM("POINTS") OVER "LAST_3_WEEKS" - SUM("POINTS") OVER "LAST_2_WEEKS") * 0.25)
          + ((SUM("POINTS") OVER "LAST_4_WEEKS" - SUM("POINTS") OVER "LAST_3_WEEKS") * 0.125)
              AS "SCORE"
      FROM "S_SONG_SCORE_CHANGE"
      WINDOW
          "LAST_WEEK"    AS (PARTITION BY "SONG_ID" RANGE INTERVAL '7' DAY PRECEDING),
          "LAST_2_WEEKS" AS (PARTITION BY "SONG_ID" RANGE INTERVAL '14' DAY PRECEDING),
          "LAST_3_WEEKS" AS (PARTITION BY "SONG_ID" RANGE INTERVAL '21' DAY PRECEDING),
          "LAST_4_WEEKS" AS (PARTITION BY "SONG_ID" RANGE INTERVAL '28' DAY PRECEDING);
      » Millions of events per second
      » Real-time game scoring
      » Amazon EC2
  • 11. Relational Streaming / Hadoop Synergy
      » Relational Stream Processors (RSPs)
          » Co-located with Hadoop Servers to stream/re-stream local data
      » RSPs + Hadoop integrate Real-time and Historical processing:
          » Querying the future: Continuous ETL and Analytics (parallel pipelines)
          » Querying the past: Hadoop batch jobs on stored tuples (parallel batches)
          » Re-streaming and Re-querying (for example, scenario & sensitivity analyses)
      [Diagram: Hadoop & Relational Streaming Server. Hadoop stages: Map, Split, Combine, Sort, Reduce. Relational Streaming operators: Select, Project, Join, Agg, Order, Group.]
  • 12. Use Cases for S3 Data (Sensor x System x Service)
      » Sensor Data:
          » Vehicle, GPS and transportation sensors
          » M2M sensor networks
          » Smart Energy sensors
      » System Data:
          » Log file processing for real-time Security, Compliance, Fraud
          » Cloud performance monitoring
          » Service Level Monitoring
      » Service Data:
          » SMS analysis, CDRs for billing, Fraud
          » Real-time pricing and promotion for eCommerce
          » Active Internet (real-time context-dependent content)
  • 13. Relational Streaming: A New Data Management Quadrant
      [Quadrant diagram. Axis labels: High-level Declarative Language & Operation; Low-level Procedural Language & Operation. Quadrant labels: Continuous analysis, Historical analysis, Periodic batches, Real-time processing.]
  • 14. Conclusions: Relational Streaming, the next “Big Data” frontier?
      Streaming Views: Any view of any data, in real-time, and all the time.
      Real-time Reaction: Harness real-time data and react and adapt in real-time.
      Massively Parallel: Deliver fine-grained parallelism on a massive scale.
      We already query historical data… let's now query future data!
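
The scoring query on slide 10 combines two ideas from the talk: sliding time windows partitioned by key, and a result emitted for every incoming row. As a rough illustration only, the Python sketch below recomputes the same weighted sum in memory. It is not SQLstream's implementation. The function names, the (timestamp, song_id, points) row layout, and the inclusive window bounds are assumptions made for this example, and it ignores how a real engine would treat rows that share a timestamp.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Weights from the slide-10 query: full weight for the latest week,
# then 0.5, 0.25 and 0.125 for each older week.
WEEK_WEIGHTS = [1.0, 0.5, 0.25, 0.125]


def window_sum(events, now, days):
    """Sum points for events in [now - days, now] (inclusive bounds assumed)."""
    start = now - timedelta(days=days)
    return sum(points for ts, points in events if start <= ts <= now)


def song_score(events, now):
    """Decayed score for one song, mirroring the SELECT expression."""
    # Cumulative sums over the last 7, 14, 21 and 28 days.
    sums = [window_sum(events, now, 7 * k) for k in range(1, 5)]
    score = sums[0] * WEEK_WEIGHTS[0]
    for k in range(1, 4):
        # The difference of two cumulative sums isolates one older week.
        score += (sums[k] - sums[k - 1]) * WEEK_WEIGHTS[k]
    return score


def score_stream(rows):
    """Yield (song_id, score) for each incoming row, like SELECT STREAM.

    rows must be ordered by timestamp; each row is (timestamp, song_id, points).
    """
    per_song = defaultdict(list)  # plays the role of PARTITION BY "SONG_ID"
    for ts, song_id, points in rows:
        history = per_song[song_id]
        history.append((ts, points))
        # Drop events older than the widest window (28 days).
        cutoff = ts - timedelta(days=28)
        history[:] = [(t, p) for t, p in history if t >= cutoff]
        yield song_id, song_score(history, ts)
```

Calling list(score_stream(rows)) on a timestamp-ordered list of rows returns one score per input row, which is the streaming behaviour the slide describes. A production engine would maintain the window sums incrementally rather than rescanning each song's history on every row.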