SnappyData
Building Continuous Applications Driven By Real Time Insights
Sudhir Menon, Founder, COO
Version 0.1 | © SnappyData Inc 2017
www.snappydata.io
 	
  	
  
Who Are We?
•  New Spark-based open source project started by Pivotal GemFire founders + engineers
•  Decades of in-memory data management experience
•  Focus on real-time, operational analytics: Spark-based OLTP+OLAP database
•  SnappyData is a spinout funded by Pivotal, GE, and GTD Capital

What Is A Continuous Application?
•  Applications that are intelligent, proactive, learn from past interactions, and are context-aware in their decision making
•  Fast and reliable ingestion capabilities
•  Support high memory density
•  Utilize memory to reduce response time
•  Support high concurrency
•  Work on live data
•  Support data mutability

Let's Discuss Some Use Cases
•  Market surveillance systems (trading exchanges, market makers)
•  Real-time scoring systems (product recommendations, real-time offers)
•  Telco analytics (location-based services, predictive analytics)
•  Sensor analytics (real-time alerting for parking management, lighting, etc.)
•  Ad analytics + ad placement systems
•  Credit card fraud
•  Detecting and stopping malware

Time Value of Information: Why Does It Matter?
•  Elapsed time from event occurrence to event analytics matters
•  Latency in using information for learning matters
•  Concurrency matters
•  Recovery time matters
•  User-kernel crossings matter
In short, liveness of data matters when it comes to making decisions based on current information.
 	
  	
  
Mixed Workloads Are Everywhere
[Diagram: three overlapping workloads]
•  Stream processing
•  Transactions (point lookups, small updates)
•  Interactive analytics
Where the workloads overlap:
•  Analytics on mutating data
•  Correlating and joining streams with large histories
•  Maintaining state or counters while ingesting streams
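As a rough illustration of the last overlap (maintaining state or counters while ingesting a stream), here is a minimal Python sketch. The event shape `(sensor_id, value)` and the helpers `ingest` and `running_mean` are hypothetical and are not SnappyData APIs; the sketch only shows mutable state being updated per event while staying available for point lookups.

```python
from collections import defaultdict

# Hypothetical event shape for illustration: (sensor_id, value).
events = [("s1", 20.0), ("s2", 31.5), ("s1", 22.0), ("s1", 27.0)]

# Mutable state that is updated on every event and can be read at any time.
counts = defaultdict(int)
totals = defaultdict(float)

def ingest(event):
    """Update the per-sensor counter and running total for one event."""
    sensor_id, value = event
    counts[sensor_id] += 1
    totals[sensor_id] += value

def running_mean(sensor_id):
    """Point lookup against the live state."""
    return totals[sensor_id] / counts[sensor_id]

for e in events:
    ingest(e)

print(counts["s1"], running_mean("s1"))  # 3 23.0
```

In a plain Python process this state is lost on restart and is invisible to other applications, which is the gap a shared, mutable store is meant to close.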
 	
  	
  
Mixed Workloads in Industrial IoT
[Diagram: IoT devices emit an event stream that drives several workloads]
•  Anomaly detection: score against models
•  Map sensors to tags, monitor temperature, send alerts
•  Correlate current temperature trend with history
•  Interact using dynamic queries
 	
  	
  
How Mixed Workloads Are Supported Today
[Diagram: Lambda architecture, numbered as on the slide]
1.  New data
2.  Batch layer: master dataset
3.  Serving layer: batch views
4.  Speed layer: real-time views
5.  Query
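To make the query step of the diagram concrete, here is a minimal Python sketch of a Lambda-style read, in which a query combines a precomputed batch view with a real-time view. The view contents, the URL keys, and the `query` function are hypothetical illustrations, not taken from the slides.

```python
# Hypothetical page-view counts, keyed by URL.
batch_view = {"/home": 1000, "/pricing": 250}   # precomputed by the batch layer
realtime_view = {"/home": 12, "/signup": 3}     # recent events from the speed layer

def query(url):
    """Answer a query by merging the batch view with the real-time view."""
    return batch_view.get(url, 0) + realtime_view.get(url, 0)

print(query("/home"))    # 1012
print(query("/signup"))  # 3
```

Even in this toy form, the application has to know about two views and how to merge them, which hints at the complexity the next slide describes.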
 	
  	
  
Lambda Architecture is Complex
[Diagram: source apps feeding Kafka, Storm, Cassandra, .....]
•  Complexity: learn and master multiple products, data models, disparate APIs, configs
•  Slower
•  Wasted resources

Can We Simplify & Optimize?
  
 	
  	
  	
  
How about a single clustered DB that can manage stream state, transactional data & run OLAP queries?
[Diagram labels: apps; framework for stream processing, etc.; stream processing; scalable writes, point reads, OLAP queries; tables; Txn; RDB; MPP DB; HDFS]

Our Solution: SnappyData
A Single Unified Cluster: OLTP + OLAP + Streaming for real-time analytics
 	
  	
  
Our Solution
•  Batch design, high throughput: rapidly maturing
•  Real-time design (low latency, HA, concurrency): matured over 13 years
•  Combined: a single unified HA cluster, OLTP + OLAP + streaming for real-time analytics
[Diagram label: Deep Scale, High Volume MPP DB]
 	
  	
  
A Remarkable Marriage
Two drastically different breeds of distributed systems . . .
•  Spark: a lineage-based system designed for high throughput
•  GemFire: a (consensus-driven) replication-based system designed for low latency
[Diagram label: Deep Scale, High Volume MPP DB]
 	
  	
  
What is unique
•  Deep fusion with Spark: an elastic, highly available in-memory store for OLTP, fused with Spark’s memory manager and the Catalyst/Tungsten engine. The store itself is exposed as native Spark DataFrames.
•  Extreme speed through CPU code generation and vectorization: extends Spark’s Tungsten engine with better code generation, colocation schemes, ..
•  Synopsis Data Engine: uses statistical techniques to reduce data by 100-1000x and answer queries in a fraction of the time and resources.
 	
  	
  
We transform Spark from this…
[Diagram: each user/app (USER 1 / APP 1, USER 2 / APP 2) runs its own Spark master and Spark execution (worker) with a framework for streaming SQL, ML… and an immutable cache, all loading from HDFS, SQL, NoSQL, and a deep-scale, high-volume MPP DB]
•  Cannot update
•  Repeated for each user/app
•  Bottleneck
 	
  	
  
… into an “always-on” hybrid database!
[Diagram: a Spark cluster of long-running Spark execution (worker) JVMs, each hosting the framework for streaming SQL, ML… together with an in-memory row + column store with indexing that is mutable and transactional, with shared-nothing persistence; a Spark driver, Spark jobs, and JDBC/ODBC clients connect to the cluster; history comes from HDFS, SQL, NoSQL, and a deep-scale, high-volume MPP DB]
  
 	
  	
  
Architecture
[Diagram: SnappyData server (Spark executor + store)]
•  Access: ODBC/JDBC, DataFrame, RDD, stream processing
•  Parser and query optimizer in front of OLAP, TXN, and the Synopsis Data Engine
•  Hybrid store: probabilistic, rows, columns, index, tables
•  Cluster manager & scheduler, distributed membership service, HA, add / remove server
•  Workloads range from low latency to high latency
 	
  	
  
Unified API
Spark’s DataFrame API allows for:
•  ML, graph, batch & streaming, SQL (selects)
SnappyData adds full SQL support and extends the DataFrame and DataSource APIs for:
•  Mutability semantics (DML & transactions)
•  Indexing
•  SQL-based streaming
  
 	
  	
  
High-Level Accuracy Guarantees
[Diagram labels: streams; aging; Snappy store (row store in memory, column store on disk); stratified samples; interactive query; continuous query; query engine with a pipelined bootstrapped operator; HAC; bias estimate; variance estimate; quality-certified approximate answers]
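The slide names stratified samples, a bootstrapped operator, and variance estimates without showing how they fit together. The following Python sketch is one generic way to combine them and is not SnappyData's implementation: the strata names, `SAMPLE_SIZE`, and both helper functions are assumptions made for illustration. It estimates a total from per-stratum samples and uses bootstrap resampling to attach an error estimate to the answer.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population, split into strata by a grouping column.
population = {
    "small_accounts": [rng.uniform(9, 11) for _ in range(1000)],
    "large_accounts": [rng.uniform(95, 105) for _ in range(100)],
}

# Keep a fixed-size sample per stratum, so rare groups stay represented.
SAMPLE_SIZE = 50
samples = {name: rng.sample(rows, SAMPLE_SIZE) for name, rows in population.items()}
sizes = {name: len(rows) for name, rows in population.items()}

def estimate_total(strata_samples):
    """Scale each stratum's sample mean by the stratum's true size."""
    return sum(sizes[name] * statistics.fmean(s) for name, s in strata_samples.items())

def bootstrap_stderr(strata_samples, rounds=200):
    """Resample within each stratum to estimate the spread of the answer."""
    estimates = []
    for _ in range(rounds):
        resampled = {name: rng.choices(s, k=len(s)) for name, s in strata_samples.items()}
        estimates.append(estimate_total(resampled))
    return statistics.stdev(estimates)

approx = estimate_total(samples)
stderr = bootstrap_stderr(samples)
exact = sum(sum(rows) for rows in population.values())
print(f"approx={approx:.0f} +/- {stderr:.0f}, exact={exact:.0f}")
```

The query touches 100 sampled rows instead of 1,100, and the bootstrap spread gives the caller a way to judge whether the approximate answer is good enough.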
 	
  	
  
Cloud Ready

Dealing with Credit Card Fraud
[Diagram: a credit card transaction stream enters a streaming application in the SnappyData cluster, which holds user history, a prediction model, and blacklisted cards and is backed by a data lake; outputs are notifications to the owner and to the merchant]

Preventing Bill Shock, Real Time Upgrades
The system detects approaching usage limits, notifies users, and gives them a chance to buy a one-time upgrade or a new plan, increasing loyalty & revenue.
[Diagram: a CDR stream enters a streaming application in the SnappyData cluster, which holds plan info and customers approaching their limit and is backed by a data lake; outputs are an immediate SMS to the customer and a scheduled callback through the call center]

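The bill-shock flow can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the deck's implementation: the plan limits, the 90% alert threshold, the record shape `(customer, megabytes)`, and the `on_cdr` function are all hypothetical.

```python
# Hypothetical plan limits in MB and the fraction of the limit that triggers an alert.
PLAN_LIMIT_MB = {"alice": 1000, "bob": 5000}
ALERT_FRACTION = 0.9

usage_mb = {}     # running usage per customer, updated as records arrive
notified = set()  # customers already alerted, to avoid repeated SMS

def on_cdr(customer, megabytes):
    """Process one usage record; return an alert the first time the threshold is crossed."""
    usage_mb[customer] = usage_mb.get(customer, 0) + megabytes
    limit = PLAN_LIMIT_MB[customer]
    if customer not in notified and usage_mb[customer] >= ALERT_FRACTION * limit:
        notified.add(customer)
        return f"SMS to {customer}: {usage_mb[customer]} of {limit} MB used, upgrade available"
    return None

cdr_stream = [("alice", 600), ("bob", 700), ("alice", 350), ("alice", 20)]
alerts = [msg for msg in (on_cdr(c, mb) for c, mb in cdr_stream) if msg]
print(alerts)
```

The sketch needs plan info (reference data), a per-customer running total (mutable state), and the incoming stream at the same time, which is the mixed workload the slide describes.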
Market Surveillance For Market Makers
[Diagram: stream ingestion and reference data feed the cluster, which alerts & notifies downstream systems and triggers investigations]
•  Stream analytics
•  Insider detection
•  Apply rules
•  Detect market manipulation
Capabilities shown: Spark Streaming, SQL querying, continuous queries, partitioned stream ingestion, summaries & alerts, messaging, machine learning

Connected Car Real Time Data Flow
[Diagram: a Kafka receiver feeds a streaming application in the SnappyData cluster, which holds vehicle time series data, vehicle history, driver history, system KPIs, and asset metadata; raw data is stored in HDFS/HBase for offline analysis; outputs are a custom summary dashboard and notifications to the owner]

Real Time Marketing Campaigns
A stream matching engine that uses customer history, current location, and relevant offers to target users effectively creates differentiation & generates revenue.
[Diagram: real-time offers from merchants and users by geo location are ingested as streams into a real-time matching engine, which draws on customer history and historical customer profiles and drives a notification sub-system that sends personalized campaigns to users]

Sensor Analytics
[Diagram: sensors producing 000’s data points/sec feed continuous real-time analysis]
•  Emergency shutdown
•  Tuning & optimization, monitor & control
•  Maintenance
•  Billing

RFQ Analytics
[Diagram: RFQs/trades/quotes streams arrive over a message bus for stream ingestion into SnappyData, alongside reference data loaded via ETL; results feed analytic dashboards]
•  OLAP and low-latency querying in SQL
•  Machine learning in Spark
 	
  	
  
Ad Analytics
•  1.5-2x faster ingestion, faster transactions
•  7-142x faster analytics (at 300M records)
 	
  	
  
Synopsis Data Engine
  
 	
  	
  
TPC-H
Average latency at 100 GB:
SnappyData   5.7s
MemSQL       12.0s
Spark        66.9s
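For a quick sense of scale, the latencies above can be turned into ratios. The pairing of each figure with a system follows the order in which they are listed on the slide, which is an assumption; the arithmetic itself is plain division.

```python
# Average latencies in seconds, paired with systems in the order listed on the slide (an assumption).
latency_s = {"SnappyData": 5.7, "MemSQL": 12.0, "Spark": 66.9}

baseline = latency_s["SnappyData"]
ratios = {system: seconds / baseline for system, seconds in latency_s.items()}

for system, ratio in ratios.items():
    print(f"{system}: {ratio:.1f}x the SnappyData latency")
```

Under that pairing, the MemSQL figure is about 2.1x and the Spark figure about 11.7x the SnappyData latency.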
THANK YOU!
Try it out: http://snappydata.io/download
  

Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 

SnappyData @ Seattle Spark Meetup

  • 1. SnappyData: Building Continuous Applications Driven By Real Time Insights. Version 0.1 | © SnappyData Inc 2017 | www.snappydata.io | Sudhir Menon, Founder, COO
  • 2. Who Are We? • New Spark-based open source project started by Pivotal GemFire founders + engineers • Decades of in-memory data management experience • Focus on real-time, operational analytics: Spark based OLTP+OLAP database • Spinout: SnappyData, funded by Pivotal, GE, GTD Capital
  • 3. What Is A Continuous Application? • Applications that are intelligent, proactive, learn from past interactions, and are context aware in their decision making • Fast and reliable ingestion capabilities • Support high memory density • Utilize memory to reduce response time • Support high concurrency • Work on live data • Support data mutability
  • 4. Let's Discuss Some Use Cases • Market Surveillance Systems (Trading exchanges, Market makers) • Real Time Scoring Systems (Product recommendations, real time offers) • Telco Analytics (Location based services, Predictive analytics) • Sensor Analytics (Real time alerting for parking management, lighting etc.) • Ad analytics + Ad placement systems • Credit Card Fraud • Detecting and Stopping Malware
  • 5. Time Value of Information: Why Does It Matter? • Elapsed time from event occurrence to event analytics matters • Latency in using information for learning matters • Concurrency matters • Recovery time matters • User-kernel crossings matter. In short, liveness of data matters when it comes to making decisions based on current information.
  • 6. Mixed Workloads Are Everywhere • Stream Processing • Transaction (point lookups, small updates) • Interactive Analytics • Analytics on mutating data • Correlating and joining streams with large histories • Maintaining state or counters while ingesting streams
  • 7. Mixed Workloads in Industrial IOT • IOT Devices • Event Stream • Anomaly detection: score against models • Map sensors to tags • Monitor temperature • Send alerts • Correlate current temperature trend with history • Interact using dynamic queries
  • 8. How Mixed Workloads Are Supported Today (diagram labels: New Data; Batch layer with Master Datasheet; Serving layer with Batch views; Speed layer with Real-time Views; Query)
  • 9. Lambda Architecture is Complex (diagram labels: SOURCE APPS; KAFKA; STORM; CASSANDRA; ...) • Complexity: learn and master multiple products, data models, disparate APIs, configs • Slower • Wasted resources
  • 10. Can We Simplify & Optimize?
  • 11. How about a single clustered DB that can manage stream state, transactional data & run OLAP queries? (diagram labels: Apps; Stream processing; Framework for Stream Processing, etc; RDB; MPP DB; HDFS; Tables; Txn; Scalable writes, point reads, OLAP queries)
  • 12. Our Solution: SnappyData. A Single Unified Cluster: OLTP + OLAP + Streaming for real-time analytics
  • 13. Our Solution (diagram labels: Deep Scale, High Volume MPP DB; Real-time design: low latency, HA, concurrency; Batch design, high throughput; Rapidly Maturing; Matured over 13 years; Single Unified HA Cluster: OLTP + OLAP + Streaming for real-time analytics)
  • 14. A Remarkable Marriage: two drastically different breeds of distributed systems . . . • A lineage-based system designed for high-throughput • A (consensus-driven) replication-based system designed for low-latency (diagram label: GEMFIRE)
  • 15. What is unique • Deep Fusion with Spark: Elastic, highly available in-memory store for OLTP fused with Spark's memory manager and the Catalyst/Tungsten engine. The store itself is exposed as native Spark data frames. • Extreme Speed through CPU code gen, vectorization: Extend Spark's Tungsten engine with better code generation, colocation schemes, .. • Synopses Data Engine: Use statistical techniques to reduce data by 100-1000x. Answer queries in a fraction of the time and resources.
  • 16. We transform Spark from this… (diagram labels, repeated for USER 1 / APP 1 and USER 2 / APP 2: SPARK MASTER; Spark Execution (Worker); Framework for streaming SQL, ML…; Immutable CACHE. Data sources: HDFS; SQL; NoSQL) • Cannot update • Repeated for each User/APP • Bottleneck
  • 17. … Into "an always-on hybrid database"! (diagram labels: HDFS; SQL; NoSQL; HISTORY; Spark Execution (Worker) JVM, long running; Framework for streaming SQL, ML…; Spark Driver; In-Memory ROW + COLUMN Store: mutable, transactional; Start with Indexing; SPARK Cluster; JDBC; ODBC; Spark Job; Shared Nothing Persistence)
  • 18. Architecture (diagram labels: Cluster Manager & Scheduler; Snappy Data Server (Spark Executor + Store); Parser; OLAP; TXN; Synopsis Data Engine; Distributed Membership Service; HA; Stream Processing; Data Frame; RDD; Low Latency; High Latency; HYBRID Store: Probabilistic, Rows, Columns, Index; Query Optimizer; Add / Remove Server; Tables; ODBC/JDBC)
  • 19. Unified API • Spark's DataFrame API allows for: ML, graph, batch & streaming, SQL (selects) • SnappyData adds full SQL support and extends DataFrame and DataSource APIs for: mutability semantics (DML & transactions); indexing; SQL-based streaming
  • 20. High-Level Accuracy Guarantees (diagram labels: Quality certified Approx Answers; Query Engine; HAC; Bias Estimate; Variance Estimate; STREAMS; Aging; SNAPPY STORE; Stratified Samples; Interactive Query; Continuous Query; Pipelined bootstrapped operator; Row store Memory; Column Store Disk)
  • 21. Cloud Ready
  • 22. Dealing with Credit Card Fraud (diagram labels: SnappyData Cluster; Credit Card transaction stream; User History; Prediction Model; Streaming Application; ...; Black Listed Cards; Data Lake; Notification to owner; Notification to merchant)
  • 23. Preventing Bill Shock, Real Time Upgrades: The system detects approaching usage limits, notifies users and gives them a chance to buy a one time upgrade or a new plan, increasing loyalty & revenue. (diagram labels: SnappyData Cluster; Customers Approaching Limit; Plan Info; CDR Stream; Schedule callback through call center; Streaming Application; Immediate SMS to customer; Data Lake)
  • 24. Market Surveillance For Market Makers (diagram labels: Stream Ingestion; Reference Data; Stream analytics; Insider detection; Apply Rules; Detect Market Manipulation; Alert & Notify Downstream Systems; Trigger Investigations; Spark Streaming; SQL Querying; Continuous Queries; Partitioned Stream Ingestion; Summaries & Alerts; Messaging; Machine Learning)
  • 25. Connected Car Real Time Data Flow (diagram labels: SnappyData Cluster; Kafka Receiver; Vehicle Time Series Data; Vehicle History; Driver History; Streaming Application; HDFS, HBase Raw Data Store; Custom Summary Dashboard; Notification to owner; ...; System KPIs; Asset Metadata)
  • 26. Real Time Marketing Campaigns: A stream matching engine that uses customer history, their current location and relevant offers to effectively target users creates differentiation & generates revenue. (diagram labels: Offline Analysis; REAL TIME MATCHING ENGINE; Customer History; Notification Sub-system; Historical Customer Profiles; User by Geo location; PERSONALIZED CAMPAIGNS TO USERS; Ingest Stream; REAL TIME OFFERS from Merchants)
  • 27. Sensor Analytics (diagram labels: 000's data points/sec; Emergency Shutdown; Tuning & Optimization, Monitor & Control; Continuous Real-time Analysis; Maintenance; Billing)
  • 28. SnappyData RFQ Analytics (diagram labels: RFQs/Trades/Quotes streams; Message Bus; Stream Ingestion; Reference Data; ETL; OLAP and Low Latency Querying in SQL; Machine Learning in Spark; Analytic Dashboards)
  • 29. Ad Analytics • 1.5-2x faster ingestion, faster trx • 7-142× faster analytics (at 300M records)
  • 30. Data Synopsis Engine
  • 31. TPCH, 100 GB, Avg Latency: SnappyData 5.7s; MemSQL 12.0s; Spark 66.9s
  • 32. THANK YOU! Try it out: http://snappydata.io/download
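
The Synopses Data Engine slides mention stratified samples with bias and variance estimates as the basis for approximate answers. As a rough illustration of that general statistical idea only, the sketch below draws a stratified sample and estimates a SUM with a standard error. It is not SnappyData's implementation or API; the function names, the `(campaign, bid)` row shape, and the 1% sampling fraction are all assumptions made for this example.

```python
import math
import random
from collections import defaultdict


def stratified_sample(rows, key, fraction, min_per_stratum=2, seed=0):
    """Sample each stratum independently so that rare groups stay represented.

    Returns {stratum: (population_size, sampled_rows)}.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = {}
    for stratum, members in strata.items():
        wanted = max(min_per_stratum, round(len(members) * fraction))
        n = min(len(members), wanted)
        sample[stratum] = (len(members), rng.sample(members, n))
    return sample


def estimate_sum(sample, value):
    """Estimate SUM(value) over the full data and its standard error."""
    total, variance = 0.0, 0.0
    for population_size, rows in sample.values():
        n = len(rows)
        values = [value(r) for r in rows]
        mean = sum(values) / n
        total += population_size * mean
        if 1 < n < population_size:
            s2 = sum((v - mean) ** 2 for v in values) / (n - 1)
            # Variance of the stratum total, with finite population correction.
            variance += population_size ** 2 * (1 - n / population_size) * s2 / n
    return total, math.sqrt(variance)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical rows: one large campaign and one rare, high-value campaign.
    rows = [("big", rng.uniform(1, 2)) for _ in range(10000)]
    rows += [("rare", rng.uniform(50, 60)) for _ in range(20)]
    sample = stratified_sample(rows, key=lambda r: r[0], fraction=0.01)
    estimate, std_error = estimate_sum(sample, value=lambda r: r[1])
    exact = sum(r[1] for r in rows)
    print(f"estimate={estimate:.1f} +/- {std_error:.1f}, exact={exact:.1f}")
```

Sampling per stratum, rather than uniformly over all rows, keeps the small "rare" group in the sample, which is why the estimate stays close to the exact sum while reading only about 1% of the rows.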