Organizations run their day-to-day businesses on transactional applications and databases. On the other hand, they glean insights and make critical decisions using analytical databases and business intelligence tools.
Transactional workloads are assigned to database engines designed and tuned for high transactional throughput. Meanwhile, the big data generated by all those transactions requires analytics platforms that can load, store, and analyze large volumes of data at high speed, providing timely insights to the business.
Thus, conventional information architectures require two different database architectures and platforms: online transaction processing (OLTP) platforms to handle transactional workloads and online analytical processing (OLAP) engines to perform analytics and reporting.
Today, a particular focus and interest of operational analytics is streaming data ingest and analysis in real time. Some refer to operational analytics as hybrid transactional/analytical processing (HTAP), translytical, or hybrid operational analytic processing (HOAP). We'll address whether this model is a way to create efficiencies in our environments.
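As a quick illustration of the translytical idea, the sketch below uses an in-memory SQLite database purely as a stand-in for an HTAP engine: one engine takes the transactional write and immediately serves an analytical aggregate over the same live table, with no ETL hop between separate OLTP and OLAP platforms. The table and column names are hypothetical.

```python
import sqlite3

# One engine (SQLite standing in for a translytical store) handles both
# the transactional write path and the analytical read path.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        customer   TEXT,
        amount     REAL,
        created_at TEXT
    )
""")

# OLTP path: a short, ACID transaction as the order arrives.
with conn:
    conn.execute(
        "INSERT INTO orders (customer, amount, created_at) VALUES (?, ?, ?)",
        ("c-1001", 42.50, "2024-01-15T10:30:00"),
    )

# OLAP path: an aggregate over the live data, with no nightly batch.
top_customers = conn.execute("""
    SELECT customer, SUM(amount) AS revenue
    FROM orders
    GROUP BY customer
    ORDER BY revenue DESC
    LIMIT 10
""").fetchall()
print(top_customers)
```

Real translytical engines make this pattern work at scale by pairing a fast write path with a scan-friendly read path, a design the later slides return to.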
1. Assessing New Databases: Translytical Use Cases
Presented by: William McKnight
"#1 Global Influencer in Big Data" Thinkers360
President, McKnight Consulting Group
A 2-time Inc. 5000 Company
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
With William McKnight
2. McKnight Consulting Group Vendor Offerings
Enterprises | Analysts | Vendors
• Keynote/Webinar Presentations – Online & in-person. Great turnouts.
• White Paper Development – Use our unique voice to talk about a theme important to you and tie your product to it.
• Benchmark Services – Performance, ease of use, functionality, TCO. We've done 40+ benchmarks: TPCs and others; databases (analytical, operational) and related (lake, integration, APIs, etc.). Impactful.
• Day-in-the-Life Report – We go from zero to production and document the steps, creating comfort for the buyer to take the next step.
• Teardown – Comparing and grading vs. the competition across 50 +/- factors. Ideal for building product roadmaps.
• Competitive Education – We teach vendor competitive teams about the competition with half-day to one-day hands-on workshops per competitor.
• Technical Specification Development – e.g., deployment guide, best practices guide, reference architecture.
• Test Drives for demonstrations/booths – We build real-world, relatable test drives/demos you can use to show off features or performance.
• *NEW* McKnight Enterprise Contribution Ranking Report – We take an industry and assess market leaders against critical capabilities of the market. Industries available for prioritizing research.
• Total Addressable Market Report
3. OLTP vs OLAP
OLTP
• Process business interactions as they occur
• Support limited query
• Focus on IUD (insert/update/delete) of individual transactions
• Low latency and high throughput needed
• ACID compliance
• Normalized data model
OLAP
• Analytics/complex analysis
• Offload of processing from OLTP
• Dimensional data model
• Light data modification from source
• Complex queries, frequently long-running
• Large data accumulation
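To ground the modeling contrast in the two columns above, here is a small sketch (SQLite again, purely for illustration; every schema name is hypothetical): a normalized model tuned for individual transactions alongside a dimensional star model tuned for scans and aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# OLTP: normalized model, optimized for insert/update/delete of
# individual transactions.
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        product_id  INTEGER NOT NULL,
        quantity    INTEGER NOT NULL,
        sold_at     TEXT NOT NULL
    );
""")

# OLAP: dimensional (star) model, optimized for wide scans and aggregates.
conn.executescript("""
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INT, month INT, day INT);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT, name TEXT);
    CREATE TABLE fact_sales (
        date_key    INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity    INTEGER,
        revenue     REAL
    );
""")

# Typical OLTP access: a point write inside a short transaction.
with conn:
    conn.execute("INSERT INTO customer (name) VALUES (?)", ("Ada",))

# Typical OLAP access: a long-running aggregate across the fact table.
conn.execute("""
    SELECT d.year, p.category, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_date d    ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY d.year, p.category
""")
```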
4. Capability Requirements
• Analytics on live data, recent data, and historical data
• Real-time analytics calculated from across data domains
• Pre-calculated data
• Live analytics usable operationally
• A seamless platform
• Operational SLAs
5. Analytics Defined
• Analytics is the process of utilizing data to enhance business processes.
• Analytics goes deeper than simple knowledge; it has depth.
• There are analytic projects and…
• There are analytics added to projects.
7. Benefits of Real-Time Analytics
• Speed to Insight
• Customer Experience
• Operational Excellence
• Deeper Understanding
8. Translytical Use Cases
• Portfolio Management
• Wealth Management
• Fraud Analytics
• Risk Management
• Algorithmic Trading
• Crypto Exchange
• SC/IoT Analytics
• Real-Time Customer Experience
• Network Telemetry
• Geolocation Analysis
• Field Support Optimization
• Ad Optimization & Ad Serving
• Streaming Media Quality Analytics
• Real-Time Recommendations
• Video Games
• Telemetry Processing
• IoT & Smart Meter Analytics
• Predictive Maintenance
• Geospatial Tracking
9. Next Best Offer/Touch
• Need to incorporate not only analytics through last night, but also today, all morning, the last hour, and the last second into the screen render
• Need to incorporate not just the user's data but all users' data
– Need to correlate the user to other users instantly
• Only AI can operate at the needed scale
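A toy sketch of the point this slide makes, with plain dictionaries standing in for the feature store (all names are hypothetical): batch features computed "through last night" are blended with events from the last few seconds at render time, and the live signal wins.

```python
from collections import defaultdict

# Hypothetical stores: batch features computed overnight, plus events
# still arriving in the current session.
batch_features = {"user-42": {"fav_category": "electronics", "ltv": 812.0}}
session_events = defaultdict(list)

def record_event(user_id: str, event: dict) -> None:
    """Capture an in-session event (last hour / last second)."""
    session_events[user_id].append(event)

def next_best_offer(user_id: str) -> str:
    """Blend historical and live signals at screen-render time."""
    profile = batch_features.get(user_id, {})
    recent = session_events.get(user_id, [])
    # The live signal wins: a category browsed seconds ago outranks
    # last night's batch profile.
    for event in reversed(recent):
        if event.get("type") == "view":
            return f"offer:{event['category']}"
    return f"offer:{profile.get('fav_category', 'default')}"

record_event("user-42", {"type": "view", "category": "cameras"})
print(next_best_offer("user-42"))  # offer:cameras, not offer:electronics
```

In a real system the dictionaries would be tables in a translytical store, so the same engine that records the click also answers the render-time query.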
10. Financial Market
• Billions of API Requests Daily
• Need 5-10ms Average Query Response
• Data to include:
– Real-time and historical stock prices
– Cryptocurrencies, Forex, Commodities, Currencies, Premium Data
• Front-Office Traders Need Real-Time Analysis
11. Healthcare
• Genomic medicine
• Virtual visits
• Tele-health and AI Triage
• AI Diagnostics
• Robotics Automating Lab Work
12. Retailer
• Better, personalized product recommendations for consumers based on session data, historical order data, and trending products.
• Continuous and automatic retraining of the recommendation (ML) engine.
• Near real-time data integration from the retail application to the analytical platform.
• Identify potential compliance issues with customer data, classify and tag sensitive data with labels, and track how sensitive data is used from the data source to the reports.
• Integrate other systems, such as the SAP ERP and the email and instant messaging platforms, with the analytical solution to get a full 360° view of business operations and to improve customer satisfaction.
• Save on operational costs while offering the best customer experience, even during peak seasons such as Black Friday, Thanksgiving, Christmas, and Mother's Day.
13. Metaverse
• VR chairs, vests, scent generators, and better directional sound systems
• Avatars as fully virtual agents
• Surgical implants to the metaverse
14. Transportation
• Driverless and autonomous vehicles
• Floating or vertical warehouses delivering packages
• Urban transportation
• Airbus drone-like pop-up concept
15. Cameras and Audio Recording
• Cameras Will Be Abundant
• Person's Profile Will Be Evident
• Third-Party Analytics
• AI Will Decide …
16. Manufacturing
• Real-Time Dashboards
• Variety of data sources
• When they ingest data, they must recalculate the entire dataset because business rules change over time
• Cross-matching survey results at the team and individual level
• Need to know what impact various dimensions, such as product quality, support, cost, and more, have on their NPS score
• Processes that formerly required 10 steps are streamlined down to just one
17. Asset Management
• End-to-end asset visibility
• Needed one place to discover all assets in the environment
– With instant context around risk, vulnerability, threat assessment, and threat detection
• 100 billion events per day
– Devices, firewalls, IoT, multi-tenant, ServiceNow, and network traffic
18. Security Surveillance
• Goal to view all sites in a single, cloud-based package
– And offer analytics from video data
• Real-Time Insights
• Biggest challenge was scalability
19. Finance: Embedded
• Started with what was easy: prototype, ingest data, do basic reports
– Required replica sets
• Performance constraints on writes to PostgreSQL
• Had to do a bulk load of the data, and it was so time-consuming that certain data was skipped
20. eSports
• Need to offer real-time and historical live streaming data to analyze trends and performance across all genres, games, events, and channels
• Need to work with thousands of time series data points in complex multi-gigabyte aggregated queries
• Analytics speed is the top priority
• Understand spikes in viewership
21. Data Architecture Needs for Translytical Workloads
• Fast Streaming Ingest (millions of events/second); see the ingest sketch below
• Low Latency
• High Concurrency (thousands of concurrent users)
• Unlimited Storage
• Pipelines
• Transactional Consistency
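As a rough illustration of the ingest requirement above, the following sketch consumes a stream and micro-batches inserts into a database. It assumes the kafka-python package, a local broker, and an illustrative "events" topic; SQLite stands in for the translytical store, which in practice would be the scalable engine itself.

```python
# A minimal ingest sketch, assuming kafka-python and any DB-API 2.0
# connection; topic, table, and file names are illustrative. A real
# deployment would add tuning, retries, and exactly-once semantics.

import json
import sqlite3  # stand-in for the translytical database
from kafka import KafkaConsumer  # pip install kafka-python

conn = sqlite3.connect("translytical_db.sqlite")
conn.execute("CREATE TABLE IF NOT EXISTS events (ts TEXT, payload TEXT)")

consumer = KafkaConsumer(
    "events",                      # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

batch = []
for msg in consumer:
    batch.append((msg.value.get("ts"), json.dumps(msg.value)))
    if len(batch) >= 1000:         # micro-batch to sustain high event rates
        conn.executemany("INSERT INTO events VALUES (?, ?)", batch)
        conn.commit()              # transactional consistency per batch
        batch.clear()
```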
22. Data Architecture Ill-fit for Translytical
[Architecture diagram: logs (apps, web, devices), user tracking, operational metrics, offload data, and sensors feed raw and processed data topics (JSON, Avro); transactional/context data sits in an OLTP/ODS; ETL, or EL with the T in Spark, serves batch and low-latency applications and files; data reaches the data warehouse and data lake via reach-through, ETL/ELT, or stream processing, with queues and in-database analytics along the way.]
25. NoSQL for Operational Big Data
More data model flexibility
– Web Services as a data model
– No “schema first” requirement; load first
Faster time to insight from data acquisition
Relaxed ACID
– Eventual consistency
– Willing to trade consistency for availability
– ACID would crush things like storing clicks on Google
Low upfront software and development costs
Fault-tolerant redundancy
Linear Scaling to “webscale”
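A minimal sketch of the load-first, relaxed-consistency style, assuming pymongo and a local MongoDB instance; database, collection, and field names are illustrative.

```python
# Load-first sketch: no "schema first" requirement, and a relaxed
# write concern (acknowledged by the primary only) that trades
# consistency for availability. Assumes pymongo (pip install pymongo).

from pymongo import MongoClient, WriteConcern

client = MongoClient("mongodb://localhost:27017")

# w=1: acknowledge on the primary only, not a majority of replicas.
clicks = client["ops"].get_collection("clicks", write_concern=WriteConcern(w=1))

# Differently shaped documents land in the same collection; the
# schema is interpreted at read time, not enforced at load time.
clicks.insert_one({"user": "u1", "page": "/home"})
clicks.insert_one({"user": "u2", "page": "/cart", "items": 3, "ab_test": "B"})
```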
27. Single Product Architectures
• Single Table Storage for Transactions and Analytics
– Fast insert/update/delete (IUD) and query
– Simplified Data Architecture
– Reduced Data Movement
• Rowstore + Columnstore
28. Columnstore
• SingleStore uses two storage types internally: an in-memory rowstore and a disk-based columnstore
• Columnstore: disk-based, compressed, and optimized for analytical scans (a hedged DDL sketch follows)
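A hedged DDL sketch of the dual-store idea, assuming the singlestoredb Python client and recent SingleStore syntax, where a plain CREATE TABLE yields a columnstore ("Universal Storage") and CREATE ROWSTORE TABLE forces the in-memory rowstore; older versions use KEY (...) USING CLUSTERED COLUMNSTORE instead. Connection string, table, and column names are illustrative.

```python
# Sketch of SingleStore's two internal storage types (syntax varies by
# version). Assumes singlestoredb (pip install singlestoredb).

import singlestoredb as s2

conn = s2.connect("user:password@localhost:3306/demo")  # placeholder DSN
cur = conn.cursor()

# In-memory rowstore: fast point reads/writes for the transactional side.
cur.execute("""
    CREATE ROWSTORE TABLE IF NOT EXISTS session_state (
        session_id BIGINT PRIMARY KEY,
        cart JSON
    )
""")

# Disk-based columnstore (the default in recent versions): fast scans
# and aggregates for the analytical side of the same workload.
cur.execute("""
    CREATE TABLE IF NOT EXISTS orders (
        user_id BIGINT,
        ts DATETIME(6),
        amount DECIMAL(18, 2),
        SORT KEY (ts),
        SHARD KEY (user_id)
    )
""")
```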
29. Azure Synapse Analytics
[Architecture diagram, Azure real-time environment: an e-commerce website runs on Azure Kubernetes Service (AKS) with front-end and back-end services (cart, profile, products, stock) backed by Azure Cosmos DB (Core API) as the transactional database; Synapse Link automatically syncs Cosmos DB to its analytical store (HTAP, Parquet) with no ETL; Synapse Pipelines and ADLS Gen2 provide the data lake and historical data, fed by enterprise data sources; Azure Machine Learning handles ML model training and automatic deployment of the recommender to an Azure ML managed online endpoint; Microsoft Purview handles data management and governance, classifying and protecting sensitive data (customer profiles, etc.); Power BI reports and visualizes, alongside Power Apps, M365, and Dataverse via Synapse Link.]
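One concrete knob in this architecture is enabling the Cosmos DB analytical store that Synapse Link syncs to. A minimal sketch, assuming the azure-cosmos Python package and an account with Synapse Link enabled; the URL, key, database, and container names are placeholders.

```python
# Sketch: create a Cosmos DB container with the analytical store
# enabled so Synapse Link can sync it with no ETL.
# Assumes azure-cosmos (pip install azure-cosmos).

from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(
    "https://myaccount.documents.azure.com:443/",  # placeholder endpoint
    credential="<key>",                            # placeholder key
)
db = client.create_database_if_not_exists("shop")

orders = db.create_container_if_not_exists(
    id="orders",
    partition_key=PartitionKey(path="/userId"),
    analytical_storage_ttl=-1,  # -1: retain all data in the analytical store
)
```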
30. Amazon Redshift
[Architecture diagram, AWS real-time environment: an e-commerce website runs on Amazon Elastic Kubernetes Service (Amazon EKS) with front-end and back-end services (cart, profile, products, stock) backed by Amazon DynamoDB as the transactional database; AWS Glue handles data loading into the warehouse and the S3 data lake with historical data; Amazon SageMaker handles ML model training and automatic deployment of the recommender to a SageMaker model endpoint; data governance comes from AWS Partner and AWS Marketplace solutions.]
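For the loading path shown above, Redshift can also bulk-copy directly from DynamoDB. A hedged sketch, assuming the redshift-connector package and an IAM role permitted to read the table; every host, table, and ARN below is a placeholder.

```python
# Sketch: bulk path from the transactional store (DynamoDB) into the
# warehouse using Redshift's COPY ... FROM 'dynamodb://...'.
# Assumes redshift-connector (pip install redshift-connector).

import redshift_connector

conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="...",
)
cur = conn.cursor()
cur.execute("""
    COPY orders
    FROM 'dynamodb://Orders'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    READRATIO 50
""")
conn.commit()
```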
31. Single Product
[Architecture diagram: the same sources (logs from apps, web, and devices; user tracking; operational metrics; offload data; sensors) feed raw and processed data topics (JSON, Avro); ETL, or EL with the T in Spark, serves batch and low-latency applications and files; a single product plays the OLTP transactional/context role as well as the data warehouse and data lake roles, reached via reach-through or stream processing.]
32. Single Vendor Solutions
• SingleStore
• Oracle
• Snowflake Unistore
• Cassandra
• Azure
• AWS
• Google
33. Tweak on Traditional Architectures
[Architecture diagram: the same flow as the ill-fit architecture on slide 22 (sources into raw and processed data topics; OLTP/ODS transactional/context data; ETL or EL with the T in Spark; batch and low-latency applications; reach-through, ETL/ELT, or stream processing into the data warehouse and data lake), with an analytics layer added.]
35. Benchmark
• We found the single database competitive, and in fact in a winning position, for both transactional and analytical workloads.
– The use of a single database facilitates operational analytics and offers an efficient approach for any organization.
• For the TPC-H-like workload, it achieved a better geometric mean than both of the pure-play data warehouses.
• In the TPC-DS-like workload, an analytic DB was superior, both with and without maintenance: its 4.1 geometric mean outperformed the single DB without maintenance, and its 3.9 with maintenance likewise bested the single DB. (A worked example of the geometric mean follows this slide.)
• Given the vast superiority in transactional processing and the high competitiveness in analytic processing, the efficiencies of one database, the single DB, across the spectrum of enterprise needs should be considered.
• Platform costs favor the single DB by 1.9x over one analytic DB and 2.5x over the other in Year 1.
• Development costs are 2.5x-3x higher, and production costs are 2.1x-2.5x higher, for the analytic DBs.
• We calculated the annual costs of the platform stacks plus the time-effort costs (people, development, and production costs) and concluded that the single DB is 2x cheaper than one analytic stack and 2.5x cheaper than the other over 3 years running enterprise-equivalent workloads.
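For readers unfamiliar with the metric, the geometric mean used in these TPC-style comparisons is computed as below; the numbers are made up for illustration and are not the benchmark's data. Unlike the arithmetic mean, it keeps one extreme query from dominating the summary.

```python
# Geometric mean over per-query results, as used in TPC-style summaries.

import math

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

speedups = [1.2, 0.9, 8.0, 1.1, 2.0]  # per-query speedup ratios (made up)
print(f"arithmetic mean: {sum(speedups) / len(speedups):.2f}")  # 2.64
print(f"geometric mean:  {geometric_mean(speedups):.2f}")       # ~1.80
```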
36. Summary
• Applications are moving translytical as the lines between operational and analytical blur
• Analytics are deeper than simple knowledge; they have depth
• The need for real-time analytics drives the need for a translytical architecture
• There are examples in every industry
• Traditional architectures do not meet the requirements
• There are multiple-vendor, multiple-product/same-vendor, and single-product options
• Single-product solutions combine Rowstore + Columnstore
• Given the vast superiority in transactional processing and the high competitiveness in analytic processing, the efficiencies of one database, the single DB, across the spectrum of enterprise needs should be considered
37. Upcoming Topics
• Assessing New Database Capabilities: Multi-Model
• MLOps: Applying DevOps to Competitive Advantage
• 2023 Trends in Enterprise Analytics
• Showing ROI for your Analytic Project
• Architecture, Products and Total Cost of Ownership of the Leading Machine Learning Stacks
Second Thursday of Every Month, at 2:00 ET
38. Assessing New Databases: Translytical Use Cases
Presented by: William McKnight
“#1 Global Influencer in Big Data” Thinkers360
President, McKnight Consulting Group
A two-time Inc. 5000 company
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
#AdvAnalytics
Editor's Notes
HTAP, HOAP, Operlytical, Event-driven*
The dw is not dead, but it is dying.
Data lake.
Analytics needed even if they are not traditionally stored together (e.g. real-time customer event data alongside CRM data; network sensor data alongside marketing campaign management data)
Between the pre-calc and the live is design…. On-demand vs. Continuous Real-time Analytics
Projects that are classified as “analytics”. And there’s analytics added to projects. At some point, all projects are becoming analytics projects, which makes it fair to just measure the project ROI.
Red light…. Stop.
Car running low on gas, gas station 20 minutes away, home 30 minutes away; tomorrow is not as busy, so it can get gas in the morning.
Speed to Insight: The primary benefit of real-time analytics is of course speed. It speeds up time to insight and lets businesses work faster to make necessary changes to systems or act on any critical information discovered. This can help organizations not only flag potential problems and mitigate risk, but also seize opportunities when they matter.
Customer Experience: Real-time analytics can help businesses anticipate problems and streamline operations to improve the overall customer experience. These on-the-fly adjustments greatly influence customer interactions and can help improve the end-to-end experience.
Operational Excellence: Real-time analytics allows organizations to gain a clear view of the business and understand what needs to be done to address potential operational issues. It also allows users to understand what resources are available to make those changes.
Deeper Understanding: When there is a need for deeper analytics to make a business decision, real-time analytics can help compare real-time and historical data to inform the decision.
Most real-time architecture is about real-time ingest
Keywords: real-time, real-time analytics, operational excellence, operational analytics, real-time DW. Real-time analytics essentially means that data is provided for analysis almost immediately once it is collected.
Way of the future; no second store of data (DW)
Maybe somebody just became a correlated user
theme
Foreign exchange
Premium data comes from a growing community of curated partners, such as:
Wall Street Horizon
Fraud Factors
Audit Analytics
ValuEngine
Stocktwits
And much more
Recalls
Outbreaks
Latest findings
pandemic footprint
Human beings have roughly 20,500 genes, in DNA, housed in each and every one of the trillions of cells that make you who you are. What causes what action… it’s complicated. Batch analytics needed.
360 including the now
The metaverse is about simulation.
Avatars will be able to act, within tightly defined parameters, as our agents and our companions, and some may even be considered co-workers.
We will be unable to tell the difference between a virtualized real person and an AI-driven avatar.
We will virtually be able to travel the world and experience life on other planets, all from home. The metaverse will give a feeling of actually being there with your family/friends.
A parallel life in the metaverse. It has become absolutely necessary for your existence. It is very difficult to be operational outside of the metaverse. You are connected via multiple devices, wearables, and even brain chips. You live in a mixed reality where physical and digital converge. Many people opt to spend most of their day in virtual worlds where they can become whoever they want and live the way they always dreamed. Unlimited freedoms in their personal virtual worlds, no limits.
NFTs and crypto… take off later. Bitcoin will displace the US dollar as the primary form of global finance by 2050.
Traffic and weather – current and patterns. Constantly changing.
Imagine this: You walk into a furniture showroom virtually, and before you say anything, the store knows your name, employment status, car-buying history, and credit rating. ADD: where you’ve been today, the clothes you’re wearing, etc.
Already, data brokers such as Acxiom and LexisNexis compile reams of information on all of us. Clients can purchase a dossier on your criminal, consumer, and marital past. It’s only a matter of time before data brokers begin drawing from online-dating profiles and social-media posts as well.
Right now, clients have to log in and search for people by name or buy lists of people with certain traits. But as facial-recognition technology becomes more widespread, any device with a camera and the right software could automatically pull up your information.
Eventually, someone might be able to point a phone at you (or look at you through smart contact lenses) and see a bubble over your head marking you as unemployed or recently divorced. We’ll no longer be able to separate our work selves from our weekend selves. Instead our histories will come bundled as a pop-up on strangers’ screens.
With the advent of the Internet of Things, appliances and gadgets will monitor many aspects of our lives, from what we eat to what we flush. Devices we talk to will record and upload our conversations, as Amazon’s Echo already does. Even toys will make us vulnerable. Kids say the darndest things, and the talking Hello Barbie doll sends those things wirelessly to a third-party server, where they are analyzed by speech-recognition software and shared with vendors.
Even our thoughts could become hackable. The technology company Retinad can use the sensors on virtual-reality headsets to track users’ engagement. Future devices might integrate electrodes to measure brain waves. In August, Berkeley engineers announced that they had produced “neural dust,” implantable electrodes just a millimeter wide that can record brain activity for scientific or medical purposes.
Chicago police use an algorithm that analyzes arrest records, social networks, and other data to identify future criminals.
xxx previously had to run advanced analytics offline xxx. “if you looked at the dashboard and wanted to drill through, the waiting times were longer than 2 seconds. If it's not instant or very close to instant, it becomes painful. At that point, people just don’t do the analytics and valuable information is lost. If you don't use it, and if you don't analyze, you can’t find these things, you're not going to improve your business.”
The data sources could be almost anything, from databases to IoT devices.
They can now drill down into things like NPS to get at the root cause of a score, drive those insights back into the business, improve their scores, and most importantly, retain their end customers
In the past, xxx had to provide these analytic insights by moving data into SPSS, which was painfully slow. With a translytical approach, they can now slice and dice data in real time and, in the NPS example, instantly understand the validity of a data correlation.
Armis originally launched its platform using a PostgreSQL database. Over time, the time-based data set got too large for Postgres to handle. At this point the team migrated this data set from 400+ PostgreSQL databases into a huge Elasticsearch cluster (160 nodes). The entire data pipeline, including Elasticsearch, cost more than $1 million annually.
Embedded finance is when non-financial companies offer their customers access to credit through their technology platform. Customers can be individuals or businesses, and the credit can be offered by the company or by a third party.
The replicated data needed to be re-ingested, and the dashboard only refreshed once every 24 hours, leading to a serious and unacceptable lag in data freshness. Ant Money didn’t store a lot of information due to performance constraints on writes to PostgreSQL, and data from other partners was nearly impossible to obtain so Ant Money could enrich its first-party data. Ant Money had to do a bulk load of the data, and it was so time-consuming that certain data was skipped.
The customers are the biggest companies in the eSports industry: game publishers, eSports organizers, and other brands.
Help companies analyze eSports data so they can understand how they’re doing and ways they can optimize time and resources.
Both real-time and historical data were needed to provide the full context of the live streams and eSports events. The data ingestion pipeline included manual metadata input, third-party fact tables, and automated systems.
People on one side.
Most of the time post-op is “learning”
1 db solutions: Doing analytics with operational DBs or tying together multiple databases to power their applications with analytics
Is it more analytics needing operational, or operational needing analytics? Which way is it coming? It’s analytics trying to do operational and failing. Also, MySQL and PostgreSQL moving to analytics fail.
Real-time for access
Single source
Databricks
Lake + DW. All points of integration are points of failure.
Data lakes (cloud stg) emerged to handle raw data in a variety of formats on cheap storage for data science and machine learning, though lacked critical features from the world of data warehouses: they do not support transactions, they do not enforce data quality well, and their lack of consistency/isolation makes it almost impossible to mix appends and reads, and batch and streaming jobs.
There are a few key technology advancements that have enabled the data lakehouse:
metadata layers for data lakes to set up drill through paths
new query engine designs providing high-performance, SQL-like execution on data lakes
access for data science and machine learning tools.
The lake concerns itself with data quality, not offloads
all of the major data platform vendors have converged their messaging around the concept of a lakehouse architecture that takes the best attributes of traditional data warehouses and enables them to run on platforms with data- lake storage architectures.
Column stores, key-value stores, document stores
Data fit for NoSQL
LinkedIn: 80 million messages/sec
Readers don’t need to wait on writers. Each version of the row is stored as a fixed-size struct (variable-length fields are stored as pointers) according to the table schema, along with bookkeeping information such as the timestamp and the commit status of the version.
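A toy Python illustration of that note, not SingleStore's actual internals: writers append timestamped versions, and readers see the latest committed version at their snapshot, so reads never block on writes.

```python
# Toy multi-version rows: each version carries a timestamp and a
# commit flag; readers pick the latest committed version visible at
# their snapshot instead of waiting on in-flight writers.

import itertools

_clock = itertools.count(1)
versions = {}  # row_key -> list of (timestamp, committed, value)

def write(key, value, committed=True):
    versions.setdefault(key, []).append((next(_clock), committed, value))

def read(key, snapshot_ts):
    """Latest committed version visible at snapshot_ts."""
    visible = [v for ts, ok, v in versions.get(key, []) if ok and ts <= snapshot_ts]
    return visible[-1] if visible else None

write("row1", {"qty": 1})           # ts=1, committed
write("row1", {"qty": 2}, False)    # ts=2, uncommitted in-flight write
print(read("row1", snapshot_ts=2))  # -> {'qty': 1}; the reader is not blocked
```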
Oracle (Oracle has a dual store approach rather than a single store like SingleStoreDB's Universal Storage)
Snowflake has announced combining transactional and analytical data with Unistore.
SingleStore: what we encounter is customers trying to do analytics with operational DBs or tying together multiple databases to power their applications with analytics.
We also replace a lot of the 1st gen operational DBs (such as the MySQL, Postgres, RDS) and also augment data warehouses or Hadoop to power real-time analytics.
Microsoft solution
Microsoft Azure and Microsoft Intelligent Data Platform
Azure Kubernetes Service
Azure Cosmos DB
Synapse Link for Cosmos DB
Synapse Analytics, Synapse Pipelines, ADLS Gen2
Azure ML
Power BI
Microsoft Purview
AWS solution
Amazon Elastic Kubernetes Service (Amazon EKS)
Amazon DynamoDB
AWS Glue
Amazon Redshift, S3
Amazon SageMaker
For Data Governance: 3rd party Marketplace/Partner solutions
GCP solution
Google Kubernetes Engine (GKE)
Cloud Firestore
Cloud Data Fusion
BigQuery, Cloud Storage, Cloud Dataprep, Cloud Dataflow
Vertex AI Prediction, Vertex AI
For Data Governance: Dataplex, requires a separate Dataplex lake
Difference: one vendor vs. one product
Oracle has a dual-store approach rather than a single store like SingleStoreDB’s Universal Storage. You need (i) a basic Oracle DB license, (ii) the diagnostics + tuning pack, (iii) the Oracle RAC option, (iv) Exadata (for the columnar compression/performance), (v) the partitioning pack, and (vi) the Active Data Guard option.
Snowflake has announced combining transactional and analytical data with Unistore.
Some people on two sides.
1. Data lakes can be difficult to manage and govern due to their size and complexity. 2. Data lakes can be difficult to extract from regularly due to the variety and volume of data they contain.
Need to add a cache like Redis
The operational DB is often NoSQL
Organizations are often reluctant to attempt analyzing real-time data, fearing the analytical workload will hamper the performance of the operational work that has to be the priority.
2 analytic DBs, 1 single-DB solution
Some of these best practices you’ll see next month in the mature environment.