Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype

Ali Hodroj
Vice President, Products and Strategy

Agenda
• Drivers for HTAP
• Emergence of insight-driven transformation
• GigaSpaces Solution for HTAP
• Reference Architecture and Case Studies

About GigaSpaces
GigaSpaces provides cloud-native In-Memory Compute (IMC) middleware for mission-critical applications.
GigaSpaces IMC serves more than 500 large enterprises and ISVs, over 50 of which are Fortune-listed.
• Direct customers: 300+
• Fortune / Organizations: 50+ / 500+
• Large installations in production (OEM): 5,000+
• ISVs: 25+

Why Hybrid Transactional Analytics Processing?

The economic value of insight-driven transformation
• $13.01 for every $1: for every dollar a company spends on data management and analytics, it gets back $13.01.
Source: MIT Sloan, Nucleus Research
• 74% of firms say they want to be data-driven, but only 23% are successful.
Source: Forbes: Actionable Insight: Missing Link between Data and Value
• 2x: companies are twice as likely to outperform their peers if they use advanced analytics.
Source: MIT Sloan

Fast Data Analytics = Immediate Business Value
Data is generated in real-time, while analytics and insight fall behind
[Figure: Business Value (positive to negative) plotted against Time to Act. Stages along the curve: Data & Transactions Created; Extract, Transform, Load; Run Analytics (Stale Insights); Decision Made (Outdated Decisions); Trigger Action (Irrelevant actions).]

Insight-centric systems demand hyperscale analytics
(Case study: intelligent omni-channel commerce)
[Figure: latency spectrum running Hours, Minutes, Seconds, Sub-Second, Milliseconds, Microseconds. Processing styles along the spectrum: Batch, Machine Learning & Event Processing, Streaming. Use cases shown: Predictive Search and User Interfaces, Real-time Pricing, Hyperlocal Advertising, Revenue, Customer Segmentation, Product Recommendations.]

In-Memory Computing enables HTAP

Clearing the hype: HTAP and the big (database) monolith

Evolution of big databases towards HTAP
Traditional Relational Database (Traditional)
• Query engine for either transactional OR analytics workloads
• Single storage engine
• Vertically scalable
In-Memory or MPP Database (HTAP)
• Single query engine for both workloads
• Multiple storage engines (row-based and column-based)
• Leverages memory to speed up I/O
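To make the row-based versus column-based distinction concrete, here is a minimal sketch in plain Python. It is not how any particular database implements its storage engines; the `Order` type, the sample data, and the helper names are all hypothetical. It only illustrates why a row layout suits point lookups (transactions) and a column layout suits scans over one field (analytics).

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    customer: str
    amount: float


# Hypothetical sample data, for illustration only.
orders = [
    Order(1, "alice", 20.0),
    Order(2, "bob", 35.5),
    Order(3, "alice", 12.5),
]

# Row-based layout: one whole record per key.
# A point read or update touches a single entry (transactional access).
row_store = {o.order_id: o for o in orders}

# Column-based layout: one list per field.
# An aggregate reads only the field it needs (analytical access).
column_store = {
    "order_id": [o.order_id for o in orders],
    "customer": [o.customer for o in orders],
    "amount": [o.amount for o in orders],
}


def lookup_order(order_id: int) -> Order:
    """Transactional-style access: fetch one complete record by key."""
    return row_store[order_id]


def total_revenue() -> float:
    """Analytical-style access: scan a single column."""
    return sum(column_store["amount"])
```

A hybrid engine keeps both layouts in sync so that each workload can use the one that fits it; this sketch builds them once and does not show that synchronization.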

Yet analytics evolved much faster
Insight-driven transformation requires:
• Applications with polyglot persistence (microservices, multiple data sources)
• Analytics are mostly real-time, streaming, and predictive
• Iterative data science: modeling against live data for continuous machine and deep learning
[Figure: Business Value (Low to High) plotted against Time Horizon (Past to Future). Business Intelligence: historical reporting (What happened?). Data Science: predictive analytics and prescriptive analytics (What will happen? What should I do?). The figure also carries an (HTAP) label.]

The Open Source and In-Memory Insight Platform Approach

HTAP = Spark + In-Memory Data Grid
• Spark: large-scale distributed analytics framework
• In-Memory Data Grid: unified, scale-out, low-latency data store
• Transactional capabilities: ACID, event-driven, rich data modeling
• Microservices

GigaSpaces XAP In-Memory Data Grid
• Elastic scale-out in-memory storage (shared-nothing, linear scalability, elastic capacity)
• Low latency and high throughput (co-located ops, event-driven, fast indexing)
• High availability and resiliency (auto-healing, multi-data center replication, fault tolerance)
• Rich API and query language (SQL, Spring, Java, .NET, C++)
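The shared-nothing idea behind a partitioned in-memory grid can be sketched in a few lines of Python. This is an illustrative toy, not the XAP API: the `Grid` class, its methods, and the CRC32-based routing are assumptions made for the example. Each key is owned by exactly one partition, so partitions never need to coordinate on a read or write.

```python
import zlib


class Grid:
    """Toy shared-nothing store: each key lives in exactly one partition."""

    def __init__(self, partition_count: int) -> None:
        # Each dict stands in for the memory of one grid partition.
        self.partitions = [dict() for _ in range(partition_count)]

    def _route(self, key: str) -> int:
        # Deterministic hash routing; CRC32 is an arbitrary choice here.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def write(self, key: str, value: object) -> None:
        self.partitions[self._route(key)][key] = value

    def read(self, key: str) -> object:
        return self.partitions[self._route(key)][key]


grid = Grid(partition_count=3)
for vehicle_id in ("car-1", "car-2", "car-3", "car-4"):
    grid.write(vehicle_id, {"status": "ok"})
```

Because routing depends only on the key, adding capacity means adding partitions and moving the keys whose route changes; the sketch leaves that rebalancing step out.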

In-Memory Data Grid + Spark Convergence
[Figure labels: Geo-Spatial, Full Text]

Why Spark?
• Unified & Concise API
• Highly Flexible Data Store Integration
• Massive Community and Adoption

Why In-Memory Data Grid?
SQL-99, Polyglot Data & Search
• SQL '99
• Graph
• JSON
• POJO
• GeoSpatial
• Full Text
Multi-Tiered Data Storage
• RAM
• SSD/Flash
• Storage-Class Memory (3D XPoint)
Cloud-Native and Horizontally Scalable
Distributed In-Grid Analytics
• SQL
• Streaming
• Machine Learning
• Graph Processing
• Deep Learning
• Text mining
• Geospatial
Advanced In-Grid Transactions and Analytics Processing
• In-Memory Event-Driven Processing
• Distributed Tasks and Compute Grid
• Real-time Web Services
• In-Memory Aggregations

Embracing an open source analytics ecosystem
Pick your own fast data architecture (lambda, kappa) and co-locate transaction processing
[Diagram: Simplified Lambda Architecture (Realtime + Historical). Components shown: Kafka, Spark, GigaSpaces, Hadoop.]
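The lambda pattern named on this slide combines a precomputed historical view with a view over recent events at query time. A minimal sketch in plain Python follows; it uses no Kafka, Spark, or Hadoop APIs, and the event shape and function names are hypothetical.

```python
from collections import Counter


def build_view(events: list) -> Counter:
    """Count purchases per product over a list of events."""
    return Counter(event["product"] for event in events)


def serve(batch_view: Counter, speed_view: Counter) -> Counter:
    """Serving layer: merge the historical and realtime views at query time."""
    return batch_view + speed_view


# Historical events, recomputed periodically by the batch layer.
historical = [{"product": "shoes"}, {"product": "hat"}, {"product": "shoes"}]
# Events that arrived after the last batch run, handled by the speed layer.
recent = [{"product": "shoes"}, {"product": "scarf"}]

merged = serve(build_view(historical), build_view(recent))
```

In a kappa architecture the batch layer disappears and the same counts are derived by replaying the event stream, so only the streaming code path has to be maintained.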

Unified HTAP Architecture
[Diagram: three nodes, each running a Spark process next to a grid process.
node 1: Spark master, Grid master
node 2: Spark worker, Grid partition
node 3: Spark worker, Grid partition
Spark side: lightweight workers, small JVMs. Grid side: large JVMs, fast indexing.]
• Push-down predicates (ultra-low latency processing, 30x performance improvement)
• Stateful data-360 sharing across analytics jobs
• Data-locality for high throughput
• Five 9s High Availability
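Predicate push-down means the filter runs where the data lives, so only matching rows travel to the analytics job. The sketch below shows the effect in plain Python with made-up telemetry rows; the function names are hypothetical, and it does not reproduce the 30x figure quoted on the slide, only the reduction in rows transferred.

```python
from typing import Callable

Row = dict

# Two grid partitions holding hypothetical telemetry rows.
partitions = [
    [{"id": 1, "speed": 50}, {"id": 2, "speed": 130}],
    [{"id": 3, "speed": 145}, {"id": 4, "speed": 60}],
]


def scan_without_pushdown(parts: list, predicate: Callable[[Row], bool]):
    """Ship every row to the analytics side, then filter there."""
    transferred = [row for part in parts for row in part]
    matches = [row for row in transferred if predicate(row)]
    return matches, len(transferred)


def scan_with_pushdown(parts: list, predicate: Callable[[Row], bool]):
    """Filter inside each partition; only matching rows are shipped."""
    transferred = [row for part in parts for row in part if predicate(row)]
    return transferred, len(transferred)


def speeding(row: Row) -> bool:
    return row["speed"] > 120


slow_result, slow_count = scan_without_pushdown(partitions, speeding)
fast_result, fast_count = scan_with_pushdown(partitions, speeding)
```

Both scans return the same rows; the difference is how many rows cross the boundary between the grid and the analytics workers, which is where co-located, indexed partitions save time.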

Decoupled HTAP Architecture
[Diagram: a Transactions In-Memory Data Grid (used by application developers) and an Analytics In-Memory Data Grid (used by data scientists and analysts), linked by realtime replication. Labels between the two sides: scoring models, trigger actions, events.]
• Useful when analytics are mostly batch or long-running queries.
• Analytics grid can be used for frequent model training (CPU intensive), without impacting transactional apps
• Flexibility to scale write-heavy (transactions) and read-heavy (analytics) workloads independently
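The decoupled layout can be pictured as two stores connected by a replication log: writes land on the transactional side, and the analytics side applies them asynchronously and runs its heavier computations on its own copy. The classes below are a hypothetical sketch in plain Python, not a GigaSpaces API, and the in-process queue stands in for realtime replication over the network.

```python
from collections import deque


class TransactionalGrid:
    """Write-heavy side: stores values and records each change for replication."""

    def __init__(self) -> None:
        self.data = {}
        self.replication_log = deque()

    def write(self, key: str, value: float) -> None:
        self.data[key] = value
        self.replication_log.append((key, value))


class AnalyticsGrid:
    """Read-heavy side: applies replicated changes, then serves analytics."""

    def __init__(self) -> None:
        self.data = {}

    def apply(self, log: deque) -> int:
        """Drain the replication log; returns the number of changes applied."""
        applied = 0
        while log:
            key, value = log.popleft()
            self.data[key] = value
            applied += 1
        return applied

    def average(self) -> float:
        """Stand-in for a CPU-intensive job such as model training."""
        return sum(self.data.values()) / len(self.data)


tx_grid = TransactionalGrid()
analytics_grid = AnalyticsGrid()
for key, value in (("sensor-a", 10.0), ("sensor-b", 20.0), ("sensor-c", 30.0)):
    tx_grid.write(key, value)
applied_count = analytics_grid.apply(tx_grid.replication_log)
```

Because the analytics job reads only its own copy, a long-running computation does not compete with transactional writes, and each side can be sized for its own workload.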

Case Study: Magic Software
IoT Hub + Predictive Analytics (Automotive Telematics)
Challenge:
• Implement predictive analytics and anomaly detection
• Expand insight context through customer/data-360 integration
• Trigger transactional workflows based on prediction criteria
Solution:
• Simplified HTAP with Streaming data pipeline (3 tiers)
• IoT streaming analytics with 9s high availability
“GigaSpaces enables our customers to simplify and accelerate telemetry ingestion, to gain full business value from IoT adoption.”
Yuval Lavi, VP of Innovation, Magic Software
http://www.magicsoftware.com

Key Takeaways
By the end of this presentation, you should understand that:
➔ HTAP is not just a database problem!
Capturing business value from real-time apps requires more than a hybrid database. Look into distributed analytics frameworks for speed of innovation.
➔ Hyperscale analytics require the combination of several tools
Open source analytics provide better long-term ROI for implementing both BI analytics and Data Science, while reducing architecture complexity.
➔ Try it all out – It’s open source!
http://insightedge.io / http://gigaspaces.com
http://github.com/InsightEdge
http://insightedge.slack.com
hello@insightedge.io