Bringing OLAP Fully Online
Analyze Changing Datasets in MemSQL and Spark with Pinterest Demo
Eric Frenkiel, MemSQL CEO
Rob Stepeck, Novus CTO
Yu Yang, Pinterest Software Engineer
Feb 19, 2015 • San Jose, CA
What’s in store for this presentation
▸MemSQL: The real-time database for transactions and
analytics
▸Case Study with Novus CTO, Rob Stepeck
▸New Developments in Spark
▸Advanced Analytics with Demo from Pinterest Software
Engineer, Yu Yang
THE REAL-TIME DATABASE FOR
TRANSACTIONS AND ANALYTICS
MemSQL Story
MemSQL Snapshot
▸Experienced Leadership
• Microsoft, Facebook, Oracle, Fusion-io
▸Inspired by Enterprise architecture gap
▸A real-time database for transactions
and analytics
• In-memory, distributed, SQL
▸Broad customer adoption across verticals
▸Top tier investors
Four ways your DBMS is holding you back
▸ETL (Extract, Transform, Load)
▸Analytic Latency
▸Synchronization
▸Copies of data
Source: Gartner Hybrid/Transactional/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation
The Real-Time Database for Transactions and Analytics
MemSQL Cluster
Data Loading and Queries
Aggregator
Nodes
Leaf
Nodes
Availability
Group 1
Availability
Group 2
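The split between aggregator and leaf nodes can be illustrated with a toy model. This is a minimal sketch, not MemSQL's implementation: the `Leaf` and `Aggregator` classes, the `user_id` shard key, and the CRC32 hashing are all assumptions made for illustration, and availability groups (replicas of leaf data) are left out.

```python
from collections import Counter
from zlib import crc32


class Leaf:
    """Hypothetical leaf node: stores one partition of the rows."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def count_by(self, column):
        # Each leaf computes a partial aggregate over its own partition only.
        return Counter(row[column] for row in self.rows)


class Aggregator:
    """Hypothetical aggregator node: routes writes and merges partial results."""

    def __init__(self, leaves):
        self.leaves = leaves

    def _leaf_for(self, shard_key):
        # Hash the shard key so a given key always lands on the same leaf.
        return self.leaves[crc32(str(shard_key).encode()) % len(self.leaves)]

    def insert(self, row, shard_key="user_id"):
        self._leaf_for(row[shard_key]).insert(row)

    def count_by(self, column):
        # Fan the query out to every leaf, then merge the partial counts.
        total = Counter()
        for leaf in self.leaves:
            total.update(leaf.count_by(column))
        return dict(total)


cluster = Aggregator([Leaf() for _ in range(4)])
events = [(1, "pin"), (2, "pin"), (3, "click"), (1, "click"), (4, "pin")]
for user_id, action in events:
    cluster.insert({"user_id": user_id, "action": action})

result = cluster.count_by("action")
print(result)
```

The point of the sketch is that clients talk only to the aggregator, while storage and partial computation are spread across the leaves.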
HOW NOVUS ENABLES INVESTORS TO
CONSISTENTLY MAXIMIZE THEIR
PERFORMANCE POTENTIAL USING
MEMSQL
Novus Case Study
Quick Background on Novus
Rob Stepeck
Chief Technology Officer
▸Investment acumen, risk, insights
and data management
▸$2 trillion in client assets
▸Used by 100 of the world’s top
investment managers and investors
▸Founded in 2007 by group of
investors, data scientists and
engineers
Before MemSQL
Problem:
▸ Write operations inefficient
▸ Loading data was a 24 hour
operation
▸ Failures could significantly impact
subsequent processes
▸ Loading client data degraded system
performance
▸ Scaling was non-trivial
▸ Prospect data integration trade-offs
MemSQL Implementation
Novus chose MemSQL based on the following data management requirements:
▸Reduce Latency
▸SQL Support
▸Scale with Ease
After MemSQL
Results:
▸ 24 hour data cycle down to several hours
▸ Scale is achieved by adding/removing
clusters with ease
▸ Learning curve is nonexistent
▸ Eliminated data ‘hand-holding’ so team
can focus on more important initiatives
▸ Sales are more effective because they can
use a customer’s actual data
Example: ‘Refresh a Client’
[Diagram labels: Raw Data; Convert to In-memory; Backing Store]
Before MemSQL: 90 Min.
After MemSQL: 2 Min.
NEW DEVELOPMENTS IN SPARK
MemSQL Spark Connector
Interest in Spark
▸Recent survey of 2100 developers
– 82% of users choose Spark to replace MapReduce
– 78% of users need faster processing of larger datasets
Source: Typesafe, APACHE SPARK - Preparing for the Next Wave of Reactive Big Data
Spark Data Processing Framework
▸Intuitive, concise, and expressive operations needed for analytics
Apache Spark
• Spark SQL
• Spark Streaming
• MLlib (machine learning)
• GraphX (graph)
Enterprises Seek Simple Ways to Use Spark
▸Spark with operational data stores delivers new use cases
▸In-memory, distributed databases such as MemSQL fit well
Understanding MemSQL and Spark
Cluster-wide Parallelization | Bi-Directional
MemSQL and Spark Use Cases
▸Operationalize models built in Spark
▸Stream and event processing
▸Live dashboards and automated reports
▸Extend MemSQL analytics
Operationalize Models Built in Spark
▸Process in Spark, persist to MemSQL
▸Go to production and iterate faster
[Diagram: Spark Cluster and MemSQL Cluster. Labels: Data into Spark; Model Creation; Model Persistence; Enterprise Consumption]
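The "process in Spark, persist to MemSQL" pattern can be sketched without either system. In this hedged example, a hand-computed least-squares fit stands in for model creation in Spark, and Python's built-in `sqlite3` stands in for MemSQL. The table names, the `demo` model name, and the training data are all hypothetical.

```python
import sqlite3

# Hypothetical training data: (feature, label) pairs lying on y = 2x + 1.
points = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

# "Model creation": one-feature ordinary least squares, computed by hand.
n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n
covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
variance = sum((x - mean_x) ** 2 for x, _ in points)
slope = covariance / variance
intercept = mean_y - slope * mean_x

# "Model persistence": store the coefficients in a SQL table.
# sqlite3 is only a stand-in for the operational database here.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE model (name TEXT PRIMARY KEY, slope REAL, intercept REAL)")
db.execute("INSERT INTO model VALUES (?, ?, ?)", ("demo", slope, intercept))

db.execute("CREATE TABLE observations (id INTEGER PRIMARY KEY, x REAL)")
db.executemany("INSERT INTO observations (x) VALUES (?)", [(5.0,), (10.0,)])

# "Enterprise consumption": applications score new rows with plain SQL.
scores = db.execute(
    "SELECT o.x, m.slope * o.x + m.intercept "
    "FROM observations o, model m WHERE m.name = ? ORDER BY o.id",
    ("demo",),
).fetchall()
print(scores)
```

Once the coefficients live in a table, any SQL client can apply the model, which is what lets a model move to production without a separate serving layer.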
Stream and Event processing
▸Structure event data on the fly
▸Pass to MemSQL for persistent, queryable format
[Diagram: Spark Cluster and MemSQL Cluster. Labels: Real-time Streaming Data; Data Transformation; Persistent, Queryable Format; Enterprise Consumption]
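Structuring events on the fly and passing them to a queryable store might look like the sketch below. It uses assumptions throughout: the JSON event shape, the `structure` and `process_batch` helpers, and the `events` table are hypothetical. A plain Python function stands in for a Spark Streaming micro-batch, and `sqlite3` stands in for MemSQL.

```python
import json
import sqlite3

# Hypothetical raw event lines, including two that cannot be structured.
raw_events = [
    '{"user": "u1", "type": "pin", "ts": 1424361600}',
    '{"user": "u2", "type": "click", "ts": 1424361601}',
    "not json",
    '{"user": "u1", "type": "click"}',
]


def structure(line):
    """Turn one raw event into a (user_id, event_type, ts) row, or None."""
    try:
        event = json.loads(line)
        return (event["user"], event["type"], int(event["ts"]))
    except (ValueError, KeyError, TypeError):
        # Malformed JSON or missing fields: drop the event in this sketch.
        return None


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (user_id TEXT, event_type TEXT, ts INTEGER)")


def process_batch(lines):
    """Structure one micro-batch and persist the valid rows."""
    rows = [row for row in map(structure, lines) if row is not None]
    db.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    db.commit()
    return len(rows)


loaded = process_batch(raw_events)

# Once persisted, the events are queryable with ordinary SQL.
per_type = db.execute(
    "SELECT event_type, COUNT(*) FROM events GROUP BY event_type ORDER BY event_type"
).fetchall()
print(loaded, per_type)
```

A real pipeline would likely route rejected events to a dead-letter store rather than drop them silently. That choice is outside what the slides describe.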
Extend MemSQL Analytics
▸The freshest data for analysis in Spark
▸Load from MemSQL to Spark and write results on return
[Diagram: Spark Cluster, MemSQL Cluster, and MemSQL Replicated Cluster. Labels: Applications, Data Streams; Interactive Analytics, Machine Learning; Access to Live Production Data; Real-time Replica]
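The load, analyze, and write-back round trip can be sketched as follows. This is an illustration under stated assumptions, not the MemSQL Spark Connector API: `sqlite3` stands in for MemSQL, in-process Python stands in for Spark, and the `requests` and `latency_summary` tables are hypothetical. The median is chosen as an example of an analysis that is convenient to run outside the database.

```python
import sqlite3
from collections import defaultdict
from statistics import median

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE requests (service TEXT, latency_ms REAL)")
db.executemany(
    "INSERT INTO requests VALUES (?, ?)",
    [("feed", 10.0), ("feed", 30.0), ("feed", 20.0),
     ("search", 5.0), ("search", 15.0)],
)

# Step 1: load the current rows from the database into the analytics engine.
by_service = defaultdict(list)
for service, latency in db.execute("SELECT service, latency_ms FROM requests"):
    by_service[service].append(latency)

# Step 2: run the analysis outside the database.
results = [(service, median(values)) for service, values in sorted(by_service.items())]

# Step 3: write the results back so SQL clients can read them.
db.execute("CREATE TABLE latency_summary (service TEXT PRIMARY KEY, median_ms REAL)")
db.executemany("INSERT INTO latency_summary VALUES (?, ?)", results)

summary = db.execute(
    "SELECT service, median_ms FROM latency_summary ORDER BY service"
).fetchall()
print(summary)
```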
Live Dashboards and Automated Reports
▸Serve live dashboards from MemSQL
▸Run custom reports on live data with Spark
[Diagram: Spark Cluster and MemSQL Cluster. Labels: Live Dashboards; Custom Reporting; Access to Live Production Data; SQL Transactions and Analytics]
REAL-TIME ANALYTICS IN PRACTICE
Pinterest Demo
Pinterest Demo
▸Yu Yang, Software Engineer at Pinterest
Realtime Analytics at Pinterest
[Diagram labels: Prototype; events; Kafka; App; Singer; Insights; Spark; Secor]
Why Spark
▸Pinterest has high traffic and an active community
▸Always looking for new ways to help users
▸Processing event data presents unique challenges
▸Spark is the leading processing framework for big data
deployments
▸Spark Streaming is ideal for real-time data structuring
How It Works
All at sub-second speed
27
Bringing olap fully online  analyze changing datasets in mem sql and spark with pinterest demo

More Related Content

What's hot

How Microsoft Built and Scaled Cosmos
How Microsoft Built and Scaled CosmosHow Microsoft Built and Scaled Cosmos
How Microsoft Built and Scaled Cosmos
SingleStore
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
HostedbyConfluent
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
SingleStore
 
Internet of Things and Multi-model Data Infrastructure
Internet of Things and Multi-model Data InfrastructureInternet of Things and Multi-model Data Infrastructure
Internet of Things and Multi-model Data Infrastructure
SingleStore
 
Real-Time Geospatial Intelligence at Scale
Real-Time Geospatial Intelligence at Scale Real-Time Geospatial Intelligence at Scale
Real-Time Geospatial Intelligence at Scale
SingleStore
 
Introducing MemSQL 4
Introducing MemSQL 4Introducing MemSQL 4
Introducing MemSQL 4
SingleStore
 
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
HostedbyConfluent
 
What every software engineer should know about streams and tables in kafka ...
What every software engineer should know about streams and tables in kafka   ...What every software engineer should know about streams and tables in kafka   ...
What every software engineer should know about streams and tables in kafka ...
confluent
 
Journey to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthJourney to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme Growth
SingleStore
 
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern ProgrammingKafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
confluent
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and Analytics
SingleStore
 
Getting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesGetting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming Architectures
SingleStore
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQL
SingleStore
 
The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)
Eva Tse
 
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
HostedbyConfluent
 
Real-Time, Geospatial, Maps by Neil Dahlke
Real-Time, Geospatial, Maps by Neil DahlkeReal-Time, Geospatial, Maps by Neil Dahlke
Real-Time, Geospatial, Maps by Neil Dahlke
SingleStore
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQL
SingleStore
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
HostedbyConfluent
 
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudGartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
SingleStore
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
SingleStore
 

What's hot (20)

How Microsoft Built and Scaled Cosmos
How Microsoft Built and Scaled CosmosHow Microsoft Built and Scaled Cosmos
How Microsoft Built and Scaled Cosmos
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
 
Internet of Things and Multi-model Data Infrastructure
Internet of Things and Multi-model Data InfrastructureInternet of Things and Multi-model Data Infrastructure
Internet of Things and Multi-model Data Infrastructure
 
Real-Time Geospatial Intelligence at Scale
Real-Time Geospatial Intelligence at Scale Real-Time Geospatial Intelligence at Scale
Real-Time Geospatial Intelligence at Scale
 
Introducing MemSQL 4
Introducing MemSQL 4Introducing MemSQL 4
Introducing MemSQL 4
 
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
DataOps Automation for a Kafka Streaming Platform (Andrew Stevenson + Spiros ...
 
What every software engineer should know about streams and tables in kafka ...
What every software engineer should know about streams and tables in kafka   ...What every software engineer should know about streams and tables in kafka   ...
What every software engineer should know about streams and tables in kafka ...
 
Journey to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthJourney to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme Growth
 
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern ProgrammingKafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
Kafka Summit NYC 2017 - Stream it Together: 3 Realities of Modern Programming
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and Analytics
 
Getting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesGetting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming Architectures
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQL
 
The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)The evolution of the big data platform @ Netflix (OSCON 2015)
The evolution of the big data platform @ Netflix (OSCON 2015)
 
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
 
Real-Time, Geospatial, Maps by Neil Dahlke
Real-Time, Geospatial, Maps by Neil DahlkeReal-Time, Geospatial, Maps by Neil Dahlke
Real-Time, Geospatial, Maps by Neil Dahlke
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQL
 
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
Use ksqlDB to migrate core-banking processing from batch to streaming | Mark ...
 
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudGartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
 

Similar to Bringing olap fully online analyze changing datasets in mem sql and spark with pinterest demo

SnappyData Toronto Meetup Nov 2017
SnappyData Toronto Meetup Nov 2017SnappyData Toronto Meetup Nov 2017
SnappyData Toronto Meetup Nov 2017
SnappyData
 
From Spark to Ignition: Fueling Your Business on Real-Time Analytics
From Spark to Ignition: Fueling Your Business on Real-Time AnalyticsFrom Spark to Ignition: Fueling Your Business on Real-Time Analytics
From Spark to Ignition: Fueling Your Business on Real-Time Analytics
SingleStore
 
Managing data analytics in a hybrid cloud
Managing data analytics in a hybrid cloudManaging data analytics in a hybrid cloud
Managing data analytics in a hybrid cloud
Karan Singh
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyData
Carlos Andrés García
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyData
VMware Tanzu
 
The Fast Path to Building Operational Applications with Spark
The Fast Path to Building Operational Applications with SparkThe Fast Path to Building Operational Applications with Spark
The Fast Path to Building Operational Applications with Spark
SingleStore
 
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data PlatformDeploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Rackspace
 
Intelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff PollockIntelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff Pollock
Jeffrey T. Pollock
 
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data FederationNRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB
 
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB
 
Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data Science
eRic Choo
 
IMCSummit 2015 - Day 1 IT Business Track - From Spark to Ignition
IMCSummit 2015 - Day 1 IT Business Track - From Spark to IgnitionIMCSummit 2015 - Day 1 IT Business Track - From Spark to Ignition
IMCSummit 2015 - Day 1 IT Business Track - From Spark to Ignition
In-Memory Computing Summit
 
CTO View: Driving the On-Demand Economy with Predictive Analytics
CTO View: Driving the On-Demand Economy with Predictive AnalyticsCTO View: Driving the On-Demand Economy with Predictive Analytics
CTO View: Driving the On-Demand Economy with Predictive Analytics
SingleStore
 
Machine learning model to production
Machine learning model to productionMachine learning model to production
Machine learning model to production
Georg Heiler
 
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld
 
Identifying the Potential of Near Data Processing for Apache Spark
Identifying the Potential of Near Data Processing for Apache SparkIdentifying the Potential of Near Data Processing for Apache Spark
Identifying the Potential of Near Data Processing for Apache Spark
Ahsan Javed Awan
 
Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...
 Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov... Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...
Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...
Databricks
 
Getting Spark ready for real-time, operational analytics
Getting Spark ready for real-time, operational analyticsGetting Spark ready for real-time, operational analytics
Getting Spark ready for real-time, operational analytics
airisData
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Precisely
 
DoneDeal - AWS Data Analytics Platform
DoneDeal - AWS Data Analytics PlatformDoneDeal - AWS Data Analytics Platform
DoneDeal - AWS Data Analytics Platform
martinbpeters
 

Similar to Bringing olap fully online analyze changing datasets in mem sql and spark with pinterest demo (20)

SnappyData Toronto Meetup Nov 2017
SnappyData Toronto Meetup Nov 2017SnappyData Toronto Meetup Nov 2017
SnappyData Toronto Meetup Nov 2017
 
From Spark to Ignition: Fueling Your Business on Real-Time Analytics
From Spark to Ignition: Fueling Your Business on Real-Time AnalyticsFrom Spark to Ignition: Fueling Your Business on Real-Time Analytics
From Spark to Ignition: Fueling Your Business on Real-Time Analytics
 
Managing data analytics in a hybrid cloud
Managing data analytics in a hybrid cloudManaging data analytics in a hybrid cloud
Managing data analytics in a hybrid cloud
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyData
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyData
 
The Fast Path to Building Operational Applications with Spark
The Fast Path to Building Operational Applications with SparkThe Fast Path to Building Operational Applications with Spark
The Fast Path to Building Operational Applications with Spark
 
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data PlatformDeploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
 
Intelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff PollockIntelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff Pollock
 
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data FederationNRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
 
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
 
Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data Science
 
IMCSummit 2015 - Day 1 IT Business Track - From Spark to Ignition
IMCSummit 2015 - Day 1 IT Business Track - From Spark to IgnitionIMCSummit 2015 - Day 1 IT Business Track - From Spark to Ignition
IMCSummit 2015 - Day 1 IT Business Track - From Spark to Ignition
 
CTO View: Driving the On-Demand Economy with Predictive Analytics
CTO View: Driving the On-Demand Economy with Predictive AnalyticsCTO View: Driving the On-Demand Economy with Predictive Analytics
CTO View: Driving the On-Demand Economy with Predictive Analytics
 
Machine learning model to production
Machine learning model to productionMachine learning model to production
Machine learning model to production
 
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
 
Identifying the Potential of Near Data Processing for Apache Spark
Identifying the Potential of Near Data Processing for Apache SparkIdentifying the Potential of Near Data Processing for Apache Spark
Identifying the Potential of Near Data Processing for Apache Spark
 
Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...
 Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov... Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...
Apache Spark for RDBMS Practitioners: How I Learned to Stop Worrying and Lov...
 
Getting Spark ready for real-time, operational analytics
Getting Spark ready for real-time, operational analyticsGetting Spark ready for real-time, operational analytics
Getting Spark ready for real-time, operational analytics
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
DoneDeal - AWS Data Analytics Platform
DoneDeal - AWS Data Analytics PlatformDoneDeal - AWS Data Analytics Platform
DoneDeal - AWS Data Analytics Platform
 

More from SingleStore

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data life
SingleStore
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS Ecosystem
SingleStore
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free Life
SingleStore
 
MemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastMemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks Webcast
SingleStore
 
An Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsAn Engineering Approach to Database Evaluations
An Engineering Approach to Database Evaluations
SingleStore
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed Architecture
SingleStore
 
Stream Processing with Pipelines and Stored Procedures
Stream Processing with Pipelines  and Stored ProceduresStream Processing with Pipelines  and Stored Procedures
Stream Processing with Pipelines and Stored Procedures
SingleStore
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017
SingleStore
 
Image Recognition on Streaming Data
Image Recognition  on Streaming DataImage Recognition  on Streaming Data
Image Recognition on Streaming Data
SingleStore
 
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSpark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
SingleStore
 
The State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondThe State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and Beyond
SingleStore
 
How Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementHow Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data Management
SingleStore
 
Teaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AITeaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AI
SingleStore
 
Gartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataGartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming Data
SingleStore
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
SingleStore
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber Scale
SingleStore
 
Machines and the Magic of Fast Learning
Machines and the Magic of Fast LearningMachines and the Magic of Fast Learning
Machines and the Magic of Fast Learning
SingleStore
 
Machines and the Magic of Fast Learning - Strata Keynote
Machines and the Magic of Fast Learning - Strata KeynoteMachines and the Magic of Fast Learning - Strata Keynote
Machines and the Magic of Fast Learning - Strata Keynote
SingleStore
 
Enabling Real-Time Analytics for IoT
Enabling Real-Time Analytics for IoTEnabling Real-Time Analytics for IoT
Enabling Real-Time Analytics for IoT
SingleStore
 
Driving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive AnalyticsDriving the On-Demand Economy with Predictive Analytics
Driving the On-Demand Economy with Predictive Analytics
SingleStore
 

More from SingleStore (20)

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data life
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS Ecosystem
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free Life
 
MemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastMemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks Webcast
 
An Engineering Approach to Database Evaluations
Building a Fault Tolerant Distributed Architecture
Stream Processing with Pipelines and Stored Procedures
Curriculum Associates Strata NYC 2017
Image Recognition on Streaming Data
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
The State of the Data Warehouse in 2017 and Beyond
How Database Convergence Impacts the Coming Decades of Data Management
Teaching Databases to Learn in the World of AI
Gartner Catalyst 2017: Image Recognition on Streaming Data
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Real-Time Analytics at Uber Scale
Machines and the Magic of Fast Learning
Machines and the Magic of Fast Learning - Strata Keynote
Enabling Real-Time Analytics for IoT
Driving the On-Demand Economy with Predictive Analytics

Bringing OLAP Fully Online: Analyze Changing Datasets in MemSQL and Spark with Pinterest Demo

  • 1. Bringing OLAP Fully Online Analyze Changing Datasets in MemSQL and Spark with Pinterest Demo Eric Frenkiel, MemSQL CEO Rob Stepeck, Novus CTO Yu Yang, Pinterest Software Engineer Feb 19, 2015 • San Jose, CA
  • 2. What’s in store for this presentation ▸MemSQL: The real-time database for transactions and analytics ▸Case Study with Novus CTO, Rob Stepeck ▸New Developments in Spark ▸Advanced Analytics with Demo from Pinterest Software Engineer, Yu Yang
  • 3. THE REAL-TIME DATABASE FOR TRANSACTIONS AND ANALYTICS MemSQL Story
  • 4. MemSQL Snapshot ▸Experienced Leadership • Microsoft, Facebook, Oracle, Fusion-io ▸Inspired by Enterprise architecture gap ▸A real-time database for transactions and analytics • In-memory, distributed, SQL ▸Broad customer adoption across verticals ▸Top tier investors
  • 5. Four ways your DBMS is holding you back ▸ETL (Extract, Transform, Load) ▸Analytic Latency ▸Synchronization ▸Copies of data Source: Gartner Hybrid/Transactional/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation
  • 6. The Real-Time Database for Transactions and Analytics MemSQL Cluster Data Loading and Queries Aggregator Nodes Leaf Nodes Availability Group 1 Availability Group 2
  • 7. HOW NOVUS ENABLES INVESTORS TO CONSISTENTLY MAXIMIZE THEIR PERFORMANCE POTENTIAL USING MEMSQL Novus Case Study
  • 8. Quick Background on Novus Rob Stepeck, Chief Technology Officer ▸Investment acumen, risk, insights and data management ▸$2 trillion in client assets ▸Used by 100 of the world’s top investment managers and investors ▸Founded in 2007 by a group of investors, data scientists and engineers
  • 9. Before MemSQL Problem: ▸ Write operations inefficient ▸ Loading data was a 24-hour operation ▸ Failures could significantly impact subsequent processes ▸ Loading client data degraded system performance ▸ Scaling was non-trivial ▸ Prospect data integration trade-offs
  • 10. MemSQL Implementation Novus chose MemSQL based on the following data management requirements: Reduce Latency, SQL Support, Scale with Ease
  • 11. After MemSQL Results: ▸ 24-hour data cycle down to several hours ▸ Scale is achieved by adding/removing clusters with ease ▸ Learning curve is nonexistent ▸ Eliminated data ‘hand-holding’ so team can focus on more important initiatives ▸ Sales are more effective because they can use a customer’s actual data
  • 12. Example: ‘Refresh a Client’ (convert raw data to in-memory backing store) Before MemSQL: 90 min. After MemSQL: 2 min.
  • 13. NEW DEVELOPMENTS IN SPARK MemSQL Spark Connector
  • 14. Interest in Spark ▸Recent survey of 2100 developers – 82% of users choose Spark to replace MapReduce – 78% of users need faster processing of larger datasets Source: Typesafe, APACHE SPARK - Preparing for the Next Wave of Reactive Big Data
  • 15. Spark Data Processing Framework ▸Intuitive, concise, and expressive operations needed for analytics. Apache Spark: Spark SQL, Spark Streaming, MLlib (machine learning), GraphX (graph)
  • 16. Enterprises Seek Simple Ways to Use Spark ▸Spark with operational data stores delivers new use cases ▸In-memory, distributed databases such as MemSQL fit well
  • 17. Understanding MemSQL and Spark Cluster-wide Parallelization | Bi-Directional
  • 18. MemSQL and Spark Use Cases ▸Operationalize models built in Spark ▸Stream and event processing ▸Live dashboards and automated reports ▸Extend MemSQL analytics
  • 19. Operationalize Models Built in Spark ▸Process in Spark, persist to MemSQL ▸Go to production and iterate faster. Diagram labels: MemSQL Cluster, Spark Cluster, Enterprise Consumption, Data into Spark, Model Creation, Model Persistence
  • 20. Stream and Event Processing ▸Structure event data on the fly ▸Pass to MemSQL for persistent, queryable format. Diagram labels: MemSQL Cluster, Spark Cluster, Enterprise Consumption, Real-time Streaming Data, Data Transformation, Persistent, Queryable Format
  • 21. Extend MemSQL Analytics ▸The freshest data for analysis in Spark ▸Load from MemSQL to Spark and write results on return. Diagram labels: MemSQL Cluster, Spark Cluster, Applications, Data Streams, Interactive Analytics, Machine Learning, MemSQL Replicated Cluster, Access to Live Production Data, Real-time Replica
  • 22. Live Dashboards and Automated Reports ▸Serve live dashboards from MemSQL ▸Run custom reports on live data with Spark. Diagram labels: MemSQL Cluster, Spark Cluster, Live Dashboards, Custom Reporting, Access to Live Production Data, SQL Transactions and Analytics
  • 23. REAL-TIME ANALYTICS IN PRACTICE Pinterest Demo
  • 24. Pinterest Demo ▸Yu Yang Software Engineer at Pinterest
  • 25. Realtime Analytics at Pinterest. Prototype diagram labels: events, Kafka, App, Singer, Insights, Spark, Secor
  • 26. Why Spark ▸Pinterest has high traffic and an active community ▸Always looking for new ways to help users ▸Processing event data presents unique challenges ▸Spark is the leading processing framework for big data deployments ▸Spark Streaming is ideal for real-time data structuring
  • 27. How It Works: All at sub-second speed
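
Slides 20 and 25 describe structuring event data on the fly and passing it to a SQL store in a persistent, queryable format. The idea can be sketched in plain Python under several loudly stated assumptions: the event fields (`user_id`, `action`, `ts`), the `events` table, and the helper names are hypothetical; a Python list stands in for the incoming stream; and an in-memory `sqlite3` database stands in for MemSQL only so the example runs on its own. It does not reproduce the Pinterest pipeline or the MemSQL Spark Connector API.

```python
import json
import sqlite3

# Hypothetical raw events, standing in for messages read from a stream.
RAW_EVENTS = [
    '{"user_id": 1, "action": "repin", "ts": 1424300000}',
    '{"user_id": 2, "action": "click", "ts": 1424300001}',
    'not valid json',
    '{"user_id": 1, "action": "repin", "ts": 1424300002}',
]


def structure_event(raw):
    """Turn one raw JSON string into a (user_id, action, ts) row, or None if malformed."""
    try:
        event = json.loads(raw)
        return (int(event["user_id"]), str(event["action"]), int(event["ts"]))
    except (ValueError, KeyError, TypeError):
        return None


def load_events(conn, raw_events):
    """Structure a batch of raw events and persist the valid ones; return rows written."""
    rows = [row for row in map(structure_event, raw_events) if row is not None]
    conn.executemany(
        "INSERT INTO events (user_id, action, ts) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def counts_by_action(conn):
    """Query the persisted events the way a live dashboard might."""
    cursor = conn.execute(
        "SELECT action, COUNT(*) FROM events GROUP BY action ORDER BY action"
    )
    return cursor.fetchall()


# In-memory sqlite3 database as a stand-in for the SQL store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, action TEXT, ts INTEGER)")
written = load_events(conn, RAW_EVENTS)
print(written, counts_by_action(conn))
```

Malformed input is dropped at the structuring step, so only well-formed rows reach the table and the aggregate query always runs against clean, typed columns. In a real deployment the batch would come from a streaming framework and the connection would point at the production database.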