This document discusses key capabilities needed for real-time analytics. It notes that real-time data, combined with historical data, provides important context for decision making, and that building data pipelines with fewer systems and steps leads to greater scalability and reliability. The document outlines requirements for real-time analytics such as ingesting streaming data, powering analytic applications, delivering massive capacity, and guaranteeing performance, and emphasizes that the right architecture can incorporate multiple data sources and workloads.
2. Today’s Discussion
We’re awash in real-time data
Real-time data, combined with historical data, provides the most context for decision making
Building data pipelines with fewer systems and steps leads to greater scalability and reliability
CONFIDENTIAL
4. Real-Time Reality
Everything is trackable
Everything is shareable, often inadvertently
Consumer expectations demand real-time
5. Real-Time Reality of Yesterday’s Data Systems
No ability to easily capture real-time feeds
Too many disparate silos
Poor data cleanliness
Difficult data access (tooling, obscure languages)
Unpredictable performance and resource consumption
6. Real-Time Needs
Ingest on-the-fly data
• Natively from apps, Kafka/Spark, ETL tools, high speed loaders
Write groundbreaking analytic applications
• Custom dashboards, reporting
Deliver massive capacity
• With minimal node count
Guarantee performance
• Across thousands of users with reserved resources
Provide universal accessibility with ANSI SQL
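The "ingest on-the-fly data" need above amounts to buffering incoming events and flushing them to the warehouse in batches rather than row by row. A minimal sketch of that micro-batching pattern, using Python's built-in sqlite3 as a stand-in for an ANSI-SQL warehouse (the `readings` table and event shape are illustrative, not from the source):

```python
import sqlite3

def ingest_events(conn, events, batch_size=3):
    """Buffer incoming events and flush them as batched INSERTs.

    `events` is any iterable of (sensor_id, value) tuples -- a stand-in
    for records arriving from an app, a Kafka consumer, or a loader.
    """
    buf = []
    for event in events:
        buf.append(event)
        if len(buf) >= batch_size:
            conn.executemany("INSERT INTO readings VALUES (?, ?)", buf)
            buf.clear()
    if buf:  # flush the final partial batch
        conn.executemany("INSERT INTO readings VALUES (?, ?)", buf)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id TEXT, value REAL)")
ingest_events(conn, [("s1", 1.0), ("s2", 2.5), ("s1", 3.2), ("s3", 0.7)])
count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(count)  # 4
```

Batching is what makes the "100,000s of rows per second" ingest rates mentioned later plausible: the per-statement overhead is amortized across many rows.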
9. Real-Time Is Only Part of the Picture
An important moment, always fleeting
Challenging to incorporate context
A small view of the stream compared to the broad view over time
10. Incorporating Historical Data for Context
Business value lies in the right amount of history
• Hospitality
• Measure across annual visits
• Consumer goods
• Seasonal analytics
Both examples benefit from being able to incorporate real-time data
• Real-time offers to hospitality guests
• More efficient inventory management
12. Identifying The Right Capabilities
Ingest and data loading
• Direct from apps, Kafka/Spark, Change Data Capture from OLTP systems, ETL, YB Load
Data store scale and expansion
• Capacity, number of concurrent users, mixed workloads
Data accessibility
• Interactive applications, Ad Hoc SQL, Business critical reporting
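Change Data Capture, listed above as an ingest method, means replaying a stream of insert/update/delete events from an OLTP source onto an analytic copy. A minimal sketch of applying such a stream, with the event shape and `orders` table invented for illustration (real CDC tools emit their own formats):

```python
def apply_changes(table, changes):
    """Apply ordered CDC events to `table` (primary key -> row dict).

    Each event is a dict like {"op": "insert", "key": ..., "row": {...}};
    this shape is a hypothetical stand-in for a real CDC feed.
    """
    for ev in changes:
        if ev["op"] == "insert":
            table[ev["key"]] = ev["row"]
        elif ev["op"] == "update":
            table[ev["key"]].update(ev["row"])  # merge only changed columns
        elif ev["op"] == "delete":
            table.pop(ev["key"], None)
    return table

orders = {}
apply_changes(orders, [
    {"op": "insert", "key": 1, "row": {"status": "new", "total": 99.0}},
    {"op": "update", "key": 1, "row": {"status": "shipped"}},
    {"op": "insert", "key": 2, "row": {"status": "new", "total": 5.0}},
    {"op": "delete", "key": 2},
])
print(orders)  # {1: {'status': 'shipped', 'total': 99.0}}
```

Because events are applied in order, the analytic copy converges to the same state as the OLTP source without bulk re-extraction.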
13. Evolution of data pipeline architectures
Enterprise Data Warehouse model
• Consolidate one or multiple application data sets into a data warehouse
Desire to capture all Internet data led to adoption of a data lake
• However, MapReduce was challenging
SQL-as-a-Layer provides some relief
• But SQL on a file system IS NOT a data warehouse
14. Further evolution of data pipelines
Data Lake: data science; high value data moves to the EDW
EDW: serves a large number of enterprise analytics users
15. Incoming Data
Modern architecture for real-time analytics:
Structured and semi-structured data flows to the Enterprise Data Warehouse, serving 1000s of users (BI analysts, data engineers)
Unstructured data flows to the Data Lake for data science; high value data moves to the EDW
16. Real-Time Architecture Data Warehouse Attributes
Real-time feeds: ingest IoT or OLTP data, capturing 100,000s of rows per second
Interactive applications: serve short queries in under 100 milliseconds
Periodic bulk loads: capture terabytes of data, petabytes over time
Powerful analytics: respond to complex BI queries in just a few seconds
Load and transform: use existing ETL tools, including intensive push-down ELT
Business critical reporting: workload management for prioritized responses
PostgreSQL compatible
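The workload-management attribute above (sub-100-ms interactive queries coexisting with multi-second BI queries) can be sketched as priority scheduling: short interactive queries are dequeued ahead of long reports when both are waiting. The workload classes, priority values, and queries below are illustrative, not product behavior:

```python
import heapq
import itertools

# Lower number = higher priority; classes are hypothetical examples.
PRIORITY = {"interactive": 0, "reporting": 1, "batch": 2}

counter = itertools.count()  # tie-breaker preserves arrival order
queue = []

def submit(sql, workload):
    """Enqueue a query under its workload class."""
    heapq.heappush(queue, (PRIORITY[workload], next(counter), sql))

def next_query():
    """Dequeue the highest-priority waiting query."""
    return heapq.heappop(queue)[2]

submit("SELECT * FROM big_fact JOIN dims ...", "batch")
submit("SELECT balance FROM accounts WHERE id = 7", "interactive")
submit("SELECT region, SUM(sales) FROM fact GROUP BY region", "reporting")

print(next_query())  # the interactive query runs first
```

Reserving resources per workload class, as the earlier "guarantee performance" bullet describes, extends this idea from ordering to admission control.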
17. The Yellowbrick Data Warehouse
MPP scale-out architecture: start small, grow compute and storage
Modular purpose-built appliance; all-flash data warehouse
Capacity from tens of terabytes to petabytes
18. Yellowbrick deployments across hybrid cloud
Yellowbrick Data Warehouse
Enabling analytics anywhere
Today: on-premises data centers, private cloud, colocation, edge
2019: cloud, hybrid cloud
19. The Yellowbrick Impact: 6 full racks > 1 appliance (6 rack units)
3x-100x performance improvement
20. Real-World Use Cases
Risk analytics
• Fraud detection for e-commerce
Consumer financing
• Tracking loyalty points and
impact on balance sheet
Hospitality
• Real-time offers
22. Common Event Streams
Sources ideal for real-time applications and analytics:
Business Applications: customer orders, airline reservations, insurance claims, bank transactions, telco CDRs
Digital Information: clickstreams, social computing, customer call logs, news and weather feeds, IT and network logs, market data, email
Internet of Things: RFID, telemetry, SCADA, geolocation, machine logs
23. Getting ready for real-time analytics
Business applications: OLTP databases
Enterprise digital information: available via existing ETL procedures
Big data: clickstreams, IoT, machine logs
Consolidate multiple data integration patterns into fewer systems
24. Gartner on Data Integration Styles
Real-time analytics' popularity dwarfs its practice
Ideal solutions will handle multiple ingestion methods
For many workflows, the further "up the stream" you can grab the data, the better
Source: Gartner
Editor's Notes
https://twitter.com/jer_s/status/1113667343480045569
Jeremy Schneider (@jer_s), retweeted by PostgreSQL:
The relational model was invented to make it easier to build good apps. When people consider non-relational data stores they sometimes overlook the benefits of a relational approach. Platforms with things like consistency & transactions make better applications with simpler code.