Hybrid Transaction/Analytical Processing:
Beyond the Big Database Hype
Ali Hodroj
Vice President, Products and Strategy
Agenda
• Drivers for HTAP
• Emergence of insight-driven
transformation
• GigaSpaces Solution for HTAP
• Reference Architecture and Case Studies
About GigaSpaces
GigaSpaces provides cloud-native In-Memory
Compute middleware for mission-critical
applications.
GigaSpaces IMC serves more than 500 large
enterprises and ISVs, over 50 of which are
Fortune-listed.
Direct customers: 300+
Fortune / Organizations: 50+ / 500+
Large installations in production (OEM): 5,000+
ISVs: 25+
Why Hybrid
Transactional
Analytics
Processing?
$13.01: for every $1 a company spends on
data management and analytics, it gets
back $13.01
Source: MIT Sloan, Nucleus Research
The economic value of insight-driven transformation
74% of firms say they want to be data-driven,
but only 23% are successful
Source: Forbes: Actionable Insight: Missing Link between Data and Value
2x: companies are twice as likely to
outperform their peers if they use
advanced analytics
Source: MIT Sloan
[Figure: business value (positive to negative) plotted against time to act. Stages along the timeline: data and transactions created; extract, transform, load; run analytics (stale insights); decision made (outdated decisions); trigger action (irrelevant actions).]
Fast Data Analytics = Immediate Business Value
Data is generated in real-time, while analytics and insight fall behind
Insight-centric systems demand hyperscale analytics
(Case study: intelligent omni-channel commerce)
[Figure: latency spectrum from hours through minutes, seconds, sub-second, and milliseconds down to microseconds. Processing styles shown: batch, streaming, machine learning and event processing. Use cases shown: predictive search and user interfaces; real-time pricing; hyperlocal advertising; revenue and customer segmentation; product recommendations.]
In-Memory Computing enables HTAP
Clearing the hype:
HTAP and the big
(database)
monolith
Evolution of big databases towards HTAP
Traditional Relational Database (Traditional)
• Query engine for either transactional
OR analytics workloads
• Single storage engine
• Vertically scalable
In-Memory or MPP Database (HTAP)
• Single query engine for both workloads
• Multiple storage engines (row-based and
column-based)
• Leverages memory to speed up I/O
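A minimal sketch of why an HTAP database might keep both row-based and column-based storage engines. The order records and the function names (lookup_order, total_amount) are made up for illustration and do not come from any specific product.

```python
# Toy illustration only: the same three orders in a row layout and a column layout.
rows = [
    {"order_id": 1, "customer": "a", "amount": 20.0},
    {"order_id": 2, "customer": "b", "amount": 35.5},
    {"order_id": 3, "customer": "a", "amount": 12.5},
]

# Column layout: one list per field, aligned by position.
columns = {key: [row[key] for row in rows] for key in rows[0]}


def lookup_order(order_id):
    """Transactional access: fetch one whole record (the row layout suits this)."""
    return next(row for row in rows if row["order_id"] == order_id)


def total_amount():
    """Analytical access: scan a single field (the column layout suits this)."""
    return sum(columns["amount"])


print(lookup_order(2))
print(total_amount())
```

The row layout keeps each record together for point reads and writes, while the column layout lets an aggregate touch only the field it needs.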
Yet analytics evolved much faster
Insight-driven transformation requires:
• Applications with polyglot persistence
(microservices, multiple data sources)
• Analytics are mostly real-time,
streaming, and predictive
• Iterative data science – modeling against
live data for continuous machine and
deep learning
[Figure: business value (low to high) plotted against time horizon (past to future). Business Intelligence: historical reporting (What happened?). Data Science: predictive analytics (What will happen?) and prescriptive analytics (What should I do?). The chart also carries an HTAP label.]
The Open Source
and In-Memory
Insight Platform
Approach
HTAP = Spark + In-Memory Data Grid
• Spark: large-scale distributed analytics framework
• In-Memory Data Grid: unified, scale-out, low-latency data store
• Transactional capabilities: ACID, event-driven, rich data modeling
• Microservices
GigaSpaces XAP In-Memory Data Grid
• Elastic scale-out in-memory storage
(shared-nothing, linear scalability, elastic capacity)
• Low latency and high throughput
(co-located ops, event-driven, fast indexing)
• High availability and resiliency
(auto-healing, multi-data center replication, fault tolerance)
• Rich API and query language
(SQL, Spring, Java, .NET, C++)
• Geo-Spatial, Full Text
In-Memory Data Grid + Spark Convergence
Why Spark?
• Unified & Concise API
• Highly Flexible Data Store Integration
• Massive Community and Adoption
Why In-Memory Data Grid?
SQL-99, Polyglot Data & Search
• SQL-99
• Graph
• JSON
• POJO
• GeoSpatial
• Full Text
Multi-Tiered Data Storage
• RAM
• SSD/Flash
• Storage-Class Memory (3D XPoint)
Cloud-Native and Horizontally Scalable
Distributed In-Grid Analytics
• SQL
• Streaming
• Machine Learning
• Graph Processing
• Deep Learning
• Text mining
• Geospatial
• In-Memory Event-Driven Processing
• Distributed Tasks and Compute Grid
• Real-time Web Services
• In-Memory Aggregations
Advanced In-Grid Transactions and Analytics Processing
Embracing an open source analytics ecosystem
Pick your own fast data architecture (lambda, kappa) and co-locate transaction processing
[Diagram: Simplified Lambda Architecture (real-time + historical) built from Kafka, Spark, GigaSpaces, and Hadoop]
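As a rough illustration of the lambda pattern named above (a real-time view combined with a historical view), the sketch below uses plain Python in place of Kafka, Spark, or Hadoop. The event data and the function names (batch_view, speed_view, query) are hypothetical.

```python
from collections import Counter

# Hypothetical event stream: (event_id, product) purchase events.
events = [(1, "book"), (2, "pen"), (3, "book"), (4, "lamp"), (5, "book")]


def batch_view(all_events, up_to_id):
    """Batch layer: recompute counts over historical events up to a cutoff."""
    return Counter(product for event_id, product in all_events if event_id <= up_to_id)


def speed_view(all_events, after_id):
    """Speed layer: count only events that arrived after the last batch run."""
    return Counter(product for event_id, product in all_events if event_id > after_id)


def query(product, batch, speed):
    """Serving layer: merge both views so answers include the freshest events."""
    return batch[product] + speed[product]


last_batch_id = 3  # assumption: the batch job last ran after event 3
batch = batch_view(events, last_batch_id)
speed = speed_view(events, last_batch_id)
print(query("book", batch, speed))
```

A kappa-style design would drop the batch layer and recompute everything from the stream. The query side would look the same.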
Reference
Architecture
Unified HTAP Architecture
[Diagram: three-node cluster]
node 1: Spark master + Grid master
node 2: Spark worker + Grid partition
node 3: Spark worker + Grid partition
Spark workers: lightweight, small JVMs
Grid partitions: large JVMs, fast indexing
• Push-down predicates (ultra-low latency processing,
30x performance improvement)
• Stateful data-360 sharing across analytics jobs
• Data-locality for high throughput
• Five 9s High Availability
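The push-down and data-locality points above can be sketched with a toy partitioned store. This is not the GigaSpaces or Spark API. The partitions, records, and function names are assumptions chosen to show how filtering inside each partition reduces the number of records that have to move.

```python
# Hypothetical grid of three partitions, each holding (sensor_id, temperature) records.
partitions = [
    [(1, 70), (2, 95), (3, 60)],
    [(4, 101), (5, 55)],
    [(6, 88), (7, 99), (8, 40)],
]


def scan_without_pushdown(predicate):
    """Ship every record to the analytics side, then filter there."""
    shipped = [record for part in partitions for record in part]
    return [record for record in shipped if predicate(record)], len(shipped)


def scan_with_pushdown(predicate):
    """Evaluate the predicate inside each partition and ship only the matches."""
    shipped = [record for part in partitions for record in part if predicate(record)]
    return shipped, len(shipped)


def is_hot(record):
    """Example predicate: temperature above 90."""
    return record[1] > 90


result_a, moved_a = scan_without_pushdown(is_hot)
result_b, moved_b = scan_with_pushdown(is_hot)
print(result_a == result_b, moved_a, moved_b)
```

Both scans return the same records. The difference is how many records cross from the grid to the analytics job, which is where a push-down design would be expected to save time.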
Decoupled HTAP Architecture
[Diagram: a transactions In-Memory Data Grid (application developers) and an analytics In-Memory Data Grid (data scientists and analysts), connected by real-time replication. Labels: scoring models, trigger actions, events.]
• Useful when analytics are mostly batch or long-running queries.
• The analytics grid can be used for frequent model training (CPU intensive) without impacting transactional apps.
• Write-heavy (transactions) and read-heavy (analytics) workloads can scale independently.
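A minimal sketch of the decoupled layout, assuming a simple change log shipped from the transactional grid to the analytics grid. The Grid class, the write and replicate functions, and the order keys are hypothetical stand-ins, not a real replication API.

```python
from collections import deque


class Grid:
    """Toy in-memory key-value store standing in for one data grid."""

    def __init__(self):
        self.data = {}


transactions = Grid()      # serves write-heavy application traffic
analytics = Grid()         # serves read-heavy queries and model training
replication_log = deque()  # changes waiting to be shipped to the analytics grid


def write(key, value):
    """Apply a write on the transactional grid and record it for replication."""
    transactions.data[key] = value
    replication_log.append((key, value))


def replicate():
    """Drain the log in order into the analytics grid; return changes applied."""
    applied = 0
    while replication_log:
        key, value = replication_log.popleft()
        analytics.data[key] = value
        applied += 1
    return applied


write("order:1", 20.0)
write("order:2", 35.5)
write("order:1", 25.0)  # an update to an existing order
replicate()

# The analytical query runs only against the replica.
average = sum(analytics.data.values()) / len(analytics.data)
print(average)
```

Because the query reads only the replica, a long-running scan or training job does not compete with transactional writes, and each side can be sized separately.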
Case Studies
Case Study: Magic Software
IoT Hub + Predictive Analytics (Automotive Telematics)
Challenge:
• Implement predictive analytics and anomaly detection
• Expand insight context through customer/data-360
integration
• Trigger transactional workflows based on prediction criteria
Solution:
• Simplified HTAP with Streaming data pipeline (3 tiers)
• IoT streaming analytics with 9s high availability
“GigaSpaces enables our
customers to simplify and
accelerate telemetry
ingestion, to gain full
business value from IoT
adoption.”
Yuval Lavi, VP of Innovation
Magic Software
http://www.magicsoftware.com
Key Takeaways
By the end of this presentation, you will hopefully understand that:
➔ HTAP is not just a database problem!
Capturing business value from real-time apps requires more than a hybrid
database. Look into distributed analytics frameworks for speed of
innovation
➔ Hyperscale analytics require the combination of several tools
Open source analytics provides better long-term ROI for implementing both
BI analytics and data science, while reducing architecture complexity.
➔ Try it all out – It’s open source!
http://insightedge.io / http://gigaspaces.com
http://github.com/InsightEdge
http://insightedge.slack.com
hello@insightedge.io
Book a demo:
Q&A

Prosigns: Transforming Business with Tailored Technology Solutions
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 

Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype

  • 1. Hybrid Transaction/Analytical Processing: Beyond the Big Database Hype Ali Hodroj Vice President, Products and Strategy
  • 2. Agenda • Drivers for HTAP • Emergence of insight-driven transformation • GigaSpaces Solution for HTAP • Reference Architecture and Case Studies
  • 3. About GigaSpaces GigaSpaces provides Cloud native In-Memory Compute middleware for mission-critical applications. GigaSpaces IMC serves more than 500 large enterprises & ISVs, over 50 of which are Fortune-listed. Direct customers 300+ Fortune / Organizations 50+ / 500+ Large installations in production (OEM) 5,000+ ISVs 25+
  • 7. The economic value of insight-driven transformation • $13.01: what a company gets back for every $1 it spends on data management and analytics (Source: MIT Sloan, Nucleus Research) • 74% of firms say they want to be data-driven, but only 23% are successful (Source: Forbes: Actionable Insight: Missing Link between Data and Value) • 2x: companies are twice as likely to outperform their peers if they use advanced analytics (Source: MIT Sloan)
  • 8. Fast Data Analytics = Immediate Business Value. Data is generated in real time, while analytics and insight fall behind. [Chart: business value (positive to negative) against time to act. Stages: Data & Transactions Created; Extract, Transform, Load; Run Analytics (Stale Insights); Decision Made (Outdated Decisions); Trigger Action (Irrelevant Actions)]
  • 9. Insight-centric systems demand hyperscale analytics (Case study: intelligent omni-channel commerce). [Chart: latency scale of Hours, Minutes, Seconds, Sub-Second, Milliseconds, Microseconds. Processing styles: Batch; Streaming; Machine Learning & Event Processing. Use cases: Revenue, Customer Segmentation; Product Recommendations; Hyperlocal Advertising; Real-time Pricing; Predictive Search and User Interfaces]
  • 11. Clearing the hype: HTAP and the big (database) monolith
  • 12. Evolution of big databases towards HTAP. Traditional Relational Database (Traditional): • Query engine for either transactional OR analytics workloads • Single storage engine • Vertically scalable. In-Memory or MPP Database (HTAP): • Single query engine for both workloads • Multiple storage engines (row-based and column-based) • Leverages memory to speed up I/O
  • 13. Yet analytics evolved much faster. Insight-driven transformation requires: • Applications with polyglot persistence (microservices, multiple data sources) • Analytics are mostly real-time, streaming, and predictive • Iterative data science: modeling against live data for continuous machine and deep learning. [Chart: business value (low to high) against time horizon (past to future). Labels: Business Intelligence; Data Science; Historical reporting (What happened?); Predictive analytics and Prescriptive analytics (What will happen? What should I do?); (HTAP)]
  • 14. The Open Source and In-Memory Insight Platform Approach
  • 15. HTAP = Spark + In-Memory Data Grid. Spark: large-scale distributed analytics framework. In-Memory Data Grid: unified, scale-out, low-latency data store. Transactional capabilities: ACID, event-driven, rich data modeling, microservices
  • 16. GigaSpaces XAP In-Memory Data Grid • Elastic scale-out in-memory storage (shared-nothing, linear scalability, elastic capacity) • Low latency and high throughput (co-located ops, event-driven, fast indexing) • High availability and resiliency (auto-healing, multi-data center replication, fault tolerance) • Rich API and query language (SQL, Spring, Java, .NET, C++)
  • 18. In-Memory Data Grid + Spark Convergence (diagram labels: Geo-Spatial, Full Text)
  • 19. Why Spark? • Unified & Concise API • Highly Flexible Data Store Integration • Massive Community and Adoption
  • 20. Why In-Memory Data Grid? Cloud-Native and Horizontally Scalable. Multi-Tiered Data Storage: • RAM • SSD/Flash • Storage-Class Memory (3D XPoint). SQL-99, Polyglot Data & Search: • SQL '99 • Graph • JSON • POJO • Geospatial • Full Text. Distributed In-Grid Analytics: • SQL • Streaming • Machine Learning • Graph Processing • Deep Learning • Text Mining • Geospatial. Advanced In-Grid Transactions and Analytics Processing: • In-Memory Event-Driven Processing • Distributed Tasks and Compute Grid • Real-time Web Services • In-Memory Aggregations
  • 21. Embracing an open source analytics ecosystem. Pick your own fast data architecture (lambda, kappa) and co-locate transaction processing. [Diagram: Simplified Lambda Architecture (Realtime + Historical) with Kafka, Spark, GigaSpaces, and Hadoop]
  • 23. Unified HTAP Architecture. [Diagram: node 1 runs the Spark master and Grid master; nodes 2 and 3 each run a Spark worker and a Grid partition. Spark workers: lightweight workers, small JVMs. Grid partitions: large JVMs, fast indexing] • Push-down predicates (ultra-low latency processing, 30x performance improvement) • Stateful data-360 sharing across analytics jobs • Data-locality for high throughput • Five 9s High Availability
  • 24. Decoupled HTAP Architecture. [Diagram: a transactions In-Memory Data Grid (application developers) and an analytics In-Memory Data Grid (data scientists & analysts) linked by real-time replication; scoring models, trigger actions, and events flow between them] • Useful when analytics are mostly batch or long-running queries • Analytics grid can be used for frequent model training (CPU intensive) without impacting transactional apps • Flexibility to scale write-heavy (transactions) and read-heavy (analytics) workloads independently
  • 26. Case Study: Magic Software IoT Hub + Predictive Analytics (Automotive Telematics) Challenge: • Implement predictive analytics and anomaly detection • Expand insight context through customer/data-360 integration • Trigger transactional workflows based on prediction criteria Solution: • Simplified HTAP with Streaming data pipeline (3 tiers) • IoT streaming analytics with 9s high availability “GigaSpaces enables our customers to simplify and accelerate telemetry ingestion, to gain full business value from IoT adoption.” Yuval Lavi, VP of Innovation Magic Software http://www.magicsoftware.com
  • 27. Key Takeaways By the end of this presentation, you hopefully understood that: ➔ HTAP is not just a database problem! Capturing business value from real-time apps requires more than a hybrid database. Look into distributed analytics frameworks for speed of innovation ➔ Hyperscale analytics require the combination of several tools Open source analytics provide better long term ROI for implementing both BI analytics and Data Science, while reducing architecture complexity. ➔ Try it all out – It’s open source! http://insightedge.io / http://gigaspaces.com http://github.com/InsightEdge http://insightedge.slack.com hello@insightedge.io Book a demo:
  • 28. Q&A
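
Slide 23 lists push-down predicates among the benefits of the unified architecture. The sketch below illustrates the general idea only: a filter evaluated next to the data returns far fewer rows to the analytics worker than shipping everything and filtering afterwards. `GridPartition`, `scan_all`, and `scan_where` are hypothetical names invented for this illustration; they are not the GigaSpaces or Spark API, and the sketch makes no claim about the 30x figure quoted on the slide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, int]


@dataclass
class GridPartition:
    """Hypothetical stand-in for one data grid partition (not a real API)."""
    rows: List[Row] = field(default_factory=list)

    def scan_all(self) -> List[Row]:
        # No push-down: every row is handed to the analytics worker.
        return list(self.rows)

    def scan_where(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Push-down: the filter runs inside the partition, only matches leave it.
        return [row for row in self.rows if predicate(row)]


def build_partitions(n_partitions: int, rows_per_partition: int) -> List[GridPartition]:
    """Create partitions with synthetic rows whose amounts cycle through 0..99."""
    partitions = []
    for p in range(n_partitions):
        rows = [
            {"id": p * rows_per_partition + i, "amount": i % 100}
            for i in range(rows_per_partition)
        ]
        partitions.append(GridPartition(rows))
    return partitions


def is_large(row: Row) -> bool:
    """The predicate an analytics job wants to apply."""
    return row["amount"] >= 95


partitions = build_partitions(n_partitions=3, rows_per_partition=1000)

# Without push-down: ship all rows, then filter on the worker.
shipped_naive = [row for p in partitions for row in p.scan_all()]
result_naive = [row for row in shipped_naive if is_large(row)]

# With push-down: filter inside each partition, ship only the matches.
shipped_pushdown = [row for p in partitions for row in p.scan_where(is_large)]

print("rows shipped without push-down:", len(shipped_naive))
print("rows shipped with push-down:", len(shipped_pushdown))
```

Both paths produce the same result set; the difference is how many rows cross the boundary between the data store and the analytics worker.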

Editor's Notes

  1. We’re talking today about HTAP, and analytics in general, because the economic value of insight-driven transformation is undeniable. Recent research shows really interesting numbers for what you might call insight-driven businesses. From an ROI perspective, firms are seeing a return of roughly 1,300%. The majority of firms that have not yet become fully insight-driven, about 74%, already have plans for introducing analytics in every corner of their business. This is mainly due to the recognition that analytics are not only a means of differentiation but also a fast innovation engine, making firms twice as likely to outperform their peers.
  2. Which brings us to the business value of analytics. Recent years have seen the need for more real-time analytics. In addition, mobile and IoT have given rise to a new generation of applications characterized by heavy ingest rates (they produce large amounts of data in a short time) and by their need for more real-time analysis. Enterprises are pushing for more real-time analysis of their data to drive competitive advantage, and as such they need the ability to run analytics on their operational data as soon as possible. Becoming truly insight-driven and innovating like Amazon requires a departure from traditional analytics infrastructures.
  3. Speaking of Amazon, one interesting use case we see quite often in retail is the ability to become an omni-channel retailer, which requires what we call “hyperscale analytics”. Let’s take a look.
  4. Now, fortunately, there have been advances in distributed computing that help us realize this vision. Thanks to the declining price of RAM and advancements in SSD storage, in-memory computing is becoming mainstream. For those not familiar, in-memory computing means using RAM as the primary storage medium for business and analytics data, thereby largely eliminating disk and network I/O latency and operating at millisecond latencies with very high throughput. To understand HTAP, we first need to look into OLTP and OLAP systems and how they progressed over the years. Relational databases have been used for both transaction processing and analytics. However, OLTP and OLAP systems have very different characteristics. OLTP systems are identified by their individual record insert/delete/update statements, as well as point queries that benefit from indexes. One cannot think about OLTP systems without indexing support. OLAP systems, on the other hand, are updated in batches and usually require scans of the tables. Batch insertion into OLAP systems is an artifact of ETL (extract, transform, load) systems that consolidate and transform transactional data from OLTP systems into an OLAP environment for analysis.
  5. If you read Gartner’s report on HTAP, you’ll see that most of the vendors are actually classic database vendors. Now what does it mean to have an HTAP architecture? We see quite a lot of people fall into the trap of thinking about HTAP as acquiring a large, vertically scalable database (like SAP HANA, Oracle, or others).
  6. To understand HTAP, we first need to look into how databases evolved from the traditional OLTP vs. OLAP world to modern HTAP. As HTAP and real-time analytics became a necessity, we started seeing database vendors go outside their swim lanes to introduce built-in LRU caching mechanisms (using in-memory storage).
  7. The reality of insight-driven transformation is that it requires a wide scope of analytics. HTAP databases are simply focused on BI-type workloads (reporting queries).
  8. At the same time, the last decade has seen an explosion of big data and in-memory computing technologies, driven by new-generation applications. NoSQL or key-value stores, such as Voldemort, Cassandra, and RocksDB, offer fast inserts and lookups and very high scale-out, but they are limited in their query capabilities and offer only loose transactional guarantees (see Mohan’s tutorial [25]). There have also been many SQL-on-Hadoop offerings, including Hive [36], Big SQL [15], Impala [20], and Spark SQL [3], that provide analytics capabilities over large data sets, focusing on OLAP queries only and lacking transaction support. Although all these systems support queries over text and CSV files, their focus has been on columnar storage formats.
  9. HTAP solutions today follow a variety of design practices. One of the major design decisions HTAP systems have to make is whether or not to use the same engine for both OLTP and OLAP requests. One approach is to loosely couple separate OLTP and OLAP systems together for HTAP. It is then up to the applications to maintain the hybrid architecture. The operational data in the OLTP system is aged to the OLAP system using a standard ETL process. In fact, this is very common in the big data world, where applications use a fast key-value store like Cassandra for transactional workloads, and the operational data is groomed into Parquet or ORC files on HDFS for a SQL-on-Hadoop system to query. But as a result, there is a lag between the data the OLAP system can query and the data the OLTP system sees.
  10. A common API for all workloads, with the ability to tap into other data stores on demand. Data science is in high demand but short supply, so the ability to leverage existing know-how, capability, and production readiness eliminates a lot of pain points.
  11. First reason is speed: “In-memory computing (IMC) … provides transformational opportunities. The execution of certain types of hours-long batch processes can be squeezed into minutes or even seconds … Millions of events can be scanned in a matter of a few tens of milliseconds to detect correlations and patterns pointing at emerging opportunities and threats ‘as things happen.’” Besides that, in-memory data grids are proving to be very mature containers for real-time applications. While they started in finance and
  12. The goal is to provide a unified environment where application developers and data scientists can collaborate. Data science by itself is an iterative activity which requires a lot of trial and error.
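
Notes 4 and 9 contrast OLTP access (single-record writes and indexed point queries) with OLAP access (batch loads and table scans), and point out the lag that ETL introduces between the two. The following is a minimal sketch of that lag using two in-memory SQLite databases as stand-ins; the table names, the `place_order` and `run_etl` helpers, and the sample data are assumptions made up for illustration and do not come from the presentation.

```python
import sqlite3

# OLTP side: row-at-a-time writes and indexed point lookups.
oltp = sqlite3.connect(":memory:")
oltp.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
oltp.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

# OLAP side: a separate store that is refreshed only by a batch ETL job.
olap = sqlite3.connect(":memory:")
olap.execute("CREATE TABLE orders_fact (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")


def place_order(order_id, customer, amount):
    """OLTP-style write: one record per transaction."""
    with oltp:
        oltp.execute("INSERT INTO orders VALUES (?, ?, ?)", (order_id, customer, amount))


def run_etl():
    """Batch copy from the OLTP store into the OLAP store (full refresh)."""
    rows = oltp.execute("SELECT id, customer, amount FROM orders").fetchall()
    with olap:
        olap.execute("DELETE FROM orders_fact")
        olap.executemany("INSERT INTO orders_fact VALUES (?, ?, ?)", rows)


def total_revenue(conn, table):
    """OLAP-style query: an aggregate that scans the whole table."""
    return conn.execute(f"SELECT COALESCE(SUM(amount), 0) FROM {table}").fetchone()[0]


place_order(1, "alice", 40.0)
place_order(2, "bob", 60.0)
run_etl()                      # analytics store is now in sync
place_order(3, "alice", 25.0)  # arrives after the last ETL run

# Point query on the primary key: the typical OLTP access path.
point_lookup = oltp.execute("SELECT amount FROM orders WHERE id = ?", (3,)).fetchone()[0]

live_total = total_revenue(oltp, "orders")        # sees all three orders
stale_total = total_revenue(olap, "orders_fact")  # misses order 3 until the next ETL

print("live total:", live_total)
print("analytics total before next ETL:", stale_total)
```

Until `run_etl()` runs again, the analytics store reports a total that omits the newest order. That gap is what an HTAP design aims to close by running analytics against the operational data directly.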