Assessing New
Databases: Translytical
Use Cases
Presented by: William McKnight
“#1 Global Influencer in Big Data” Thinkers360
President, McKnight Consulting Group
A 2-time Inc. 5000 Company
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
With William McKnight
McKnight Consulting Group Vendor Offerings
Enterprises
Analysts
Vendors
• Keynote/Webinar Presentations – Online & In-Person – Great turnouts.
• White Paper Development – Use our unique voice to talk about a theme important to you and tie your product to it.
• Benchmark Services – Performance, Ease-of-Use, Functionality, TCO. We’ve done 40+ benchmarks; TPCs, others; databases (analytical, operational) and related (lake, integration, APIs, etc.). Impactful.
• Day in the Life of Report – We go from zero to production and document the steps, creating comfort for the buyer to make the next step.
• Teardown – Comparing and grading vs. competition across roughly 50 factors. Ideal for building product roadmaps.
• Competitive Education – We teach vendor competitive teams about the competition with half-day to one-day hands-on workshops per competitor.
• Technical Specification Development – e.g., Deployment Guide, Best Practices Guide, Reference Architecture.
• Test Drives for demonstrations/booth – We build real-world relatable test drives/demos you can use to show off features or performance.
• *NEW* McKnight Enterprise Contribution Ranking Report – We’re taking an industry and assessing market leaders against critical capabilities of the market. Industries available for prioritizing research.
• Total Addressable Market Report.
OLTP vs OLAP
OLTP
• Process business
interactions as they occur
• Support limited query
• Focus on IUD/individual
transactions
• Low latency and high
throughput needed
• ACID compliance
• Normalized data model
OLAP
• Analytics/complex
analysis
• Offload of processing
from OLTP
• Dimensional data model
• Light data modification
from source
• Complex queries,
frequently long-running
• Large data accumulation
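The contrast above can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module, not a recommendation of any product; the `orders` table, its columns, and the function names are assumptions made up for the example. The first function is OLTP-style (one small insert in its own transaction); the second is OLAP-style (a scan and aggregate over accumulated rows).

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    # Hypothetical schema, invented for this illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
    )
    return conn


def record_order(conn: sqlite3.Connection, customer: str, region: str, amount: float) -> None:
    """OLTP-style work: a single small insert wrapped in a transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO orders (customer, region, amount) VALUES (?, ?, ?)",
            (customer, region, amount),
        )


def revenue_by_region(conn: sqlite3.Connection) -> dict:
    """OLAP-style work: scan many rows and aggregate them."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


conn = build_demo_db()
record_order(conn, "alice", "east", 20.0)
record_order(conn, "bob", "west", 15.0)
record_order(conn, "carol", "east", 5.0)
totals = revenue_by_region(conn)
```

Here both workloads hit the same table, which is the situation a translytical platform is meant to handle at scale.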
Capability Requirements
• Analytics on live data, recent data and
historical data
• Real-time analytics calculated from across
data domains
• Pre-calculated data
• Live analytics usable operationally
• A seamless platform
• Operational SLAs
Analytics Defined
• Analytics is the process of utilizing data to enhance
business processes.
• Analytics are deeper than simple knowledge; they have
depth.
• There are analytic projects and…
• There are analytics added to projects
Analytics Origin
Batch
• Broad Context
• Drives Reactions
• Action Options
• “Most People”
• Static Rules
Real-Time
• Immediate Context
• Activity not in Batch
• Dynamic Rules
Benefits of Real-Time Analytics
• Speed to Insight
• Customer Experience
• Operational Excellence
• Deeper Understanding
Translytical Use Cases
• Portfolio Management
• Wealth Management
• Fraud Analytics
• Risk Management
• Algorithmic Trading
• Crypto Exchange
• SC/IoT Analytics
• Real-Time Customer
Experience
• Network Telemetry
• Geolocation Analysis
• Field Support Optimization
• Ad Optimization & Ad
Serving
• Streaming Media Quality
Analytics
• Real-Time
Recommendations
• Video Games
• Telemetry Processing
• IoT & Smart Meter Analytics
• Predictive Maintenance
• Geospatial Tracking
Next Best Offer/Touch
• Need to incorporate not only analytics
through last night, but also today, all
morning, last hour and last second into
screen render
• Need to incorporate not just the user’s data
but all users’ data
– Need to correlate user to other users instantly
• Only AI can operate at the needed scale
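A minimal sketch of the idea above, assuming a nightly batch job has already produced per-product counts across all users and that today's events arrive as a simple list of product names. The function names and the count-based scoring are hypothetical simplifications; a production system would use a trained model rather than raw counts.

```python
from collections import Counter


def merged_scores(batch_counts: dict, live_events: list) -> Counter:
    """Combine counts computed through last night with events seen since then."""
    merged = Counter(batch_counts)
    merged.update(live_events)  # each live event adds 1 to its product's count
    return merged


def next_best_offer(batch_counts: dict, live_events: list, already_owned: set):
    """Pick the highest-scoring product the user does not already own."""
    scores = merged_scores(batch_counts, live_events)
    candidates = [
        (count, product)
        for product, count in scores.items()
        if product not in already_owned
    ]
    if not candidates:
        return None
    # Ties on count are broken by product name, purely for determinism.
    return max(candidates)[1]


# Hypothetical data: nightly aggregate plus two events from the last few seconds.
nightly = {"laptop": 5, "phone": 4}
today = ["phone", "phone"]
offer = next_best_offer(nightly, today, already_owned=set())
offer_for_phone_owner = next_best_offer(nightly, today, already_owned={"phone"})
```

The point of the sketch is that the live events change the answer: on the nightly data alone the laptop would win.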
Financial Market
• Billions of API Requests Daily
• Need 5-10ms Average Query Response
• Data to include:
– Real-time and historical stock price
– Cryptocurrencies, Forex, Commodities,
Currencies, Premium Data
• Front-Office Traders Need Real-Time
Analysis
Healthcare
• Genomic medicine
• Virtual visits
• Tele-health and AI Triage
• AI Diagnostics
• Robotics Automating Lab Work
Retailer
• Better & personalized product recommendations for the consumers based on session
data, historical order data, and trending products.
• Continuous and automatic retraining of the recommendation (ML) engine.
• Near real-time data integration from their retail application to the analytical
platform.
• Identify potential compliance issues with customer data, classify and tag sensitive
data with labels, and track how sensitive data is being used from the data source to
the reports.
• Integrate other systems such as their SAP ERP, email and instant messaging platforms
with the analytical solution to get a full 360-degree view of their business operations and
to improve customer satisfaction.
• Save on operational costs while offering the best customer experience even during
peak seasons such as Black Friday, Thanksgiving, Christmas, and Mother’s Day.
Metaverse
• VR chairs, vests, scent generators, and
better directional sound systems
• Avatars as fully virtual agents
• Surgical implants to the metaverse
Transportation
• Driverless and autonomous
• Floating or vertical warehouses delivering
packages
• Urban transportation
• Airbus drone-like popup concept
Cameras and Audio Recording
• Cameras Will Be Abundant
• Person’s Profile Will Be Evident
• Third-Party Analytics
• AI Will Decide …
Manufacturing
• Real-Time Dashboards
• Variety of data sources
• When they ingest data they must recalculate
the entire dataset because business rules
change over time
• Cross-matching survey results at the team and
individual level
• Need to know what impact various dimensions,
such as product quality, support, cost, and
more, have on their NPS
• Processes that formerly required 10 steps are
streamlined down to just one.
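The NPS breakdown mentioned above can be sketched as follows. This uses the standard Net Promoter Score definition (share of scores 9 to 10 minus share of scores 0 to 6, as a percentage); the response records, the `team` field, and the function names are hypothetical and chosen only for illustration.

```python
def nps(scores: list) -> float:
    """Net Promoter Score: percent promoters (9-10) minus percent detractors (0-6)."""
    if not scores:
        raise ValueError("at least one survey score is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


def nps_by_dimension(responses: list, dimension: str) -> dict:
    """Group survey responses by one dimension and compute NPS per group."""
    groups = {}
    for response in responses:
        groups.setdefault(response[dimension], []).append(response["score"])
    return {key: nps(values) for key, values in groups.items()}


# Hypothetical survey responses, invented for the example.
responses = [
    {"team": "A", "score": 10},
    {"team": "A", "score": 9},
    {"team": "A", "score": 3},
    {"team": "A", "score": 7},
    {"team": "B", "score": 6},
    {"team": "B", "score": 5},
]
by_team = nps_by_dimension(responses, "team")
```

When business rules change (for example, how responses map to teams), the whole grouping has to be recomputed, which is the recalculation cost the slide refers to.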
Asset Management
• End-to-end asset visibility
• Needed one place to discover all assets in
environment
– With instant context around risk, vulnerability,
threat assessment and threat detection
• 100 billion events per day
– devices, firewalls, IoT, multi-tenant, ServiceNow
and network traffic
Security Surveillance
• Goal to view all sites in a single, cloud-
based package
– And offer analytics from video data
• Real-Time Insights
• Biggest challenge was scalability
Finance: Embedded
• Started with something easy to prototype: ingest data,
do basic reports
– Required replica sets
• Performance constraints on writes to
PostgreSQL
• Had to do a bulk load of the data and it was
so time-consuming that certain data was
skipped
eSports
• Need to offer real-time and historical live
streaming data to analyze trends and
performance across all genres, games,
events, and channels
• Need to work with thousands of time series
data points in complex multi-gigabyte
aggregated queries.
• Analytics speed is the top priority
• Understand spikes in viewership
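As a rough illustration of the last point, a viewership spike can be flagged by comparing each sample with the average of the samples just before it. This is a deliberately simple sketch; the window size, the threshold factor, and the sample series are assumptions, not values from the use case.

```python
def find_spikes(viewers: list, window: int = 3, factor: float = 2.0) -> list:
    """Return indexes where viewership exceeds `factor` times the trailing average."""
    spikes = []
    for i in range(window, len(viewers)):
        baseline = sum(viewers[i - window:i]) / window  # mean of the previous `window` samples
        if baseline > 0 and viewers[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical per-minute concurrent viewer counts for one channel.
series = [100, 110, 90, 100, 400, 120, 110]
spike_indexes = find_spikes(series)
```

At the scale described above, the same comparison would typically be pushed down into the database as a windowed aggregate query rather than run in application code.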
Data Architecture Needs for Translytical
Workloads
• Fast Streaming Ingest (millions of
events/second)
• Low Latency
• High Concurrency (thousands of concurrent
users)
• Unlimited Storage
• Pipelines
• Transactional Consistency
Data Architecture Ill-fit for Translytical
[Diagram: traditional multi-component architecture. Elements shown: sources (logs from apps, web, and devices; user tracking; operational metrics; offload data; sensors; files; transactional/context data from OLTP/ODS); raw data topics (JSON, Avro) and processed data topics; data movement via batch ETL, EL with T in Spark, or stream processing; targets (data warehouse, data lake, operational data store, data lakehouse); consumers (low-latency applications, in-database analytics, queries via reach-through or ETL/ELT).]
NoSQL for Operational Big Data
More data model flexibility
– Web Services as a data model
– No “schema first” requirement; load first
Faster time to insight from data acquisition
Relaxed ACID
– Eventual consistency
– Willing to trade consistency for availability
– ACID would crush things like storing clicks on Google
Low upfront software and development costs
Fault-tolerant redundancy
Linear Scaling to “webscale”
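The "relaxed ACID" trade-off can be illustrated with a toy store in which writes land on one replica immediately and reach the others only when replication catches up. This is a teaching sketch with hypothetical class and method names; it does not model any particular NoSQL product.

```python
class EventuallyConsistentStore:
    """Toy key-value store: replica 0 accepts writes, others are updated on sync()."""

    def __init__(self, replica_count: int = 2) -> None:
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # queued (replica_index, key, value) updates

    def write(self, key: str, value) -> None:
        # The write is acknowledged as soon as the first replica has it.
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key: str, replica: int = 0):
        # A read from a lagging replica may return stale (or missing) data.
        return self.replicas[replica].get(key)

    def sync(self) -> None:
        # Apply queued updates in order; afterwards all replicas agree.
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("clicks", 1)
stale_read = store.read("clicks", replica=1)   # replication has not run yet
store.sync()
fresh_read = store.read("clicks", replica=1)
```

The window between `write` and `sync` is where consistency is traded for availability: the store keeps accepting writes, but readers may briefly see old data.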
Event-Driven Architectures
• Kafka Connector
• Realtime Pub/Sub Messaging Platform
• Edge Computing
• Selective Feed to a Data Lake
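A minimal in-process sketch of the publish/subscribe pattern with a selective feed to a data lake. The `Broker` class, the topic name, and the event fields are hypothetical; this is not the Kafka API, only an illustration of how subscribers can each decide which events to keep.

```python
from collections import defaultdict


class Broker:
    """Toy pub/sub broker: handlers registered per topic are called on publish."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(event)


alerts = []      # stands in for a low-latency application
data_lake = []   # stands in for long-term storage


def alert_on_hot_reading(event: dict) -> None:
    if event["temp_c"] > 90:
        alerts.append(event)


def selective_lake_feed(event: dict) -> None:
    # Only events explicitly marked for retention are written to the lake.
    if event.get("retain"):
        data_lake.append(event)


broker = Broker()
broker.subscribe("sensor", alert_on_hot_reading)
broker.subscribe("sensor", selective_lake_feed)
broker.publish("sensor", {"id": 1, "temp_c": 95, "retain": False})
broker.publish("sensor", {"id": 2, "temp_c": 40, "retain": True})
```

Each event is acted on as it arrives, and only a chosen subset is persisted, which keeps the lake from filling with data nobody will query.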
Single Product Architectures
• Single Table Storage for Transactions and
Analytics
– Fast IUD and Query
– Simplified Data Architecture
– Reduced Data Movement
• Rowstore + Columnstore
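To make the rowstore and columnstore distinction concrete, the sketch below holds the same records in both layouts. It is an illustration of the general idea only, with made-up trade records and hypothetical names; it does not describe how any specific product stores data internally.

```python
def to_columns(rows: list) -> dict:
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Hypothetical trade records.
rows = [
    {"id": 1, "symbol": "AAA", "price": 10.0},
    {"id": 2, "symbol": "BBB", "price": 20.0},
    {"id": 3, "symbol": "AAA", "price": 30.0},
]

# Row layout: each record is stored together, so fetching or updating one
# transaction by key touches a single entry (fast insert/update/delete).
row_index = {row["id"]: row for row in rows}
trade_2 = row_index[2]

# Column layout: each attribute is stored together, so an aggregate reads
# only the column it needs instead of every full record.
columns = to_columns(rows)
average_price = sum(columns["price"]) / len(columns["price"])
```

A single-product architecture aims to offer both access patterns over one copy of the data, which is where the reduced data movement comes from.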
Columnstore
• SingleStore uses two storage types
internally: an in-memory rowstore and a
disk-based columnstore
• Columnstore:
Azure Real-Time Environment
[Diagram: elements shown: E-Commerce Website (front-end and back-end: cart, profile, products, stock) on Azure Kubernetes Service (AKS); Transactional Database: Azure Cosmos DB (Core API); Azure Cosmos DB Analytical Store (Parquet, HTAP); Synapse Link, which enables automatic sync to the analytical store (no ETL); Data Processing: Azure Synapse Analytics; ADLS Gen2 (data lake) for data lake and historical data; Synapse Pipelines from enterprise data sources; Dataverse (Power Apps, M365) via Synapse Link; ML Model Training & Deployment: Azure Machine Learning, with automatic model deployment of the recommender to an Azure ML managed online endpoint; Data Management & Governance: Microsoft Purview to classify and protect sensitive data (customer profiles, etc.); Report & Visualize: Power BI.]
AWS Real-Time Environment
[Diagram: elements shown: E-Commerce Website (front-end and back-end: cart, profile, products, stock) on Amazon Elastic Kubernetes Service (Amazon EKS); Transactional Database: Amazon DynamoDB; Data Processing: Amazon Redshift; Data Loading: AWS Glue; S3 (data lake) for data lake and historical data; ML Model Training & Deployment: Amazon SageMaker, with automatic model deployment of the recommender to a SageMaker model endpoint.]
Data Governance:
• AWS Partner solutions
• AWS Marketplace solutions
Single Product
[Diagram: single-product architecture. Elements shown: sources (logs from apps, web, and devices; user tracking; operational metrics; offload data; sensors; files); raw data topics (JSON, Avro) and processed data topics; data movement via batch ETL, EL with T in Spark, or stream processing; transactional/context data (OLTP), data warehouse, and data lake; low-latency applications via reach-through.]
Single Vendor Solutions
• SingleStore
• Oracle
• Snowflake Unistore
• Cassandra
• Azure
• AWS
• Google
Tweak on Traditional Architectures
[Diagram: traditional architecture with an added analytics component. Elements shown: sources (logs from apps, web, and devices; user tracking; operational metrics; offload data; sensors; files; transactional/context data from OLTP/ODS); raw data topics (JSON, Avro) and processed data topics; data movement via batch ETL, EL with T in Spark, or stream processing; data warehouse and data lake; low-latency applications, in-database analytics, and queries via reach-through or ETL/ELT; Analytics.]
Multi-Vendor Architectures
• Spark, Cassandra, Elastic, Druid, MongoDB
• MySQL, Redis, DynamoDB
• Redis, Elastic, Spark, Flink
• Storm, Druid
• PostgreSQL, Elastic
Benchmark
• We found a single database competitive in operating effectively, putting
it in a winning position for both transactional and operational workloads.
– The use of a single database facilitates operational analytics and offers an efficient
approach for any organization.
• For the TPC-H-like workload, it obtained a better geometric mean than both of the
pure-play data warehouses.
• In the TPC-DS-like workload, an analytic db was superior, both with maintenance and
without maintenance. Its 4.1 geometric mean outperformed the 1 db without
maintenance, while its 3.9 with maintenance likewise bested the 1 db.
• Given the vast superiority in transactional processing and the high competitiveness in
analytic processing, the efficiencies of one database (the 1 db) across the spectrum
of enterprise needs should be considered.
• Platform costs favor the 1 db by 1.9x over 1 analytic db and 2.5x over the other in
Year 1.
• Development costs are 2.5x – 3x and Production Costs are 2.1x – 2.5x for the analytic
db.
• We calculated the annual costs of the platform stacks and the Time-Effort Costs
(People Costs, Development Costs and Production Costs) and concluded that the 1 db
is 2 times cheaper than 1 analytic stack and 2.5 times cheaper than the other over 3
years running enterprise-equivalent workloads.
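The geometric mean cited above is a common way to summarize per-query results so that no single long-running query dominates the summary. A minimal sketch of the calculation follows; the per-query values are invented for the example and are not figures from this benchmark.

```python
import math


def geometric_mean(values: list) -> float:
    """n-th root of the product of n positive values, computed via logarithms."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean is defined only for positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-query results, in seconds (not benchmark data).
query_seconds = [1.0, 4.0, 16.0]
summary = geometric_mean(query_seconds)  # cube root of 1 * 4 * 16 = 64, i.e. about 4.0
```

For comparison, the arithmetic mean of the same three values is 7.0, pulled upward by the slowest query.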
Summary
• Applications are moving translytical as the lines between operational and
analytical blur
• Analytics are deeper than simple knowledge; they have depth
• The need for real-time analytics drives the need for a translytical
architecture
• There are examples in every industry
• Traditional architectures do not meet the requirements
• There are multi-vendor, multi-product/same-vendor, and single-product
options
• Single product solutions combine Rowstore + Columnstore
• Given the vast superiority in transactional processing and the high
competitiveness in analytic processing, the efficiencies of one database (the
1 db) across the spectrum of enterprise needs should be considered
Upcoming Topics
• Assessing New Database Capabilities: Multi-Model
• MLOps: Applying DevOps to Competitive Advantage
• 2023 Trends in Enterprise Analytics
• Showing ROI for your Analytic Project
• Architecture, Products and Total Cost of Ownership of
the Leading Machine Learning Stacks
#AdvAnalytics

More Related Content

Similar to Assessing New Databases– Translytical Use Cases

Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
NoSQLmatters
 
Real time data integration best practices and architecture
Real time data integration best practices and architectureReal time data integration best practices and architecture
Real time data integration best practices and architecture
Bui Kiet
 
Data Science and Enterprise Engineering with Michael Finger and Chris Robison
Data Science and Enterprise Engineering with Michael Finger and Chris RobisonData Science and Enterprise Engineering with Michael Finger and Chris Robison
Data Science and Enterprise Engineering with Michael Finger and Chris Robison
Databricks
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
DATAVERSITY
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017
SingleStore
 
Leverage Machine Data
Leverage Machine DataLeverage Machine Data
Leverage Machine Data
Splunk
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Denodo
 
Igniting Audience Measurement at Time Warner Cable
Igniting Audience Measurement at Time Warner CableIgniting Audience Measurement at Time Warner Cable
Igniting Audience Measurement at Time Warner Cable
Tim Case
 
Microstrategy Overview
Microstrategy OverviewMicrostrategy Overview
Microstrategy Overview
Roberto Zerbini
 
Real Time Business Platform by Ivan Novick from Pivotal
Real Time Business Platform by Ivan Novick from PivotalReal Time Business Platform by Ivan Novick from Pivotal
Real Time Business Platform by Ivan Novick from Pivotal
VMware Tanzu Korea
 
Drive Smarter Decisions with Big Data Using Complex Event Processing
Drive Smarter Decisions with Big Data Using Complex Event ProcessingDrive Smarter Decisions with Big Data Using Complex Event Processing
Drive Smarter Decisions with Big Data Using Complex Event Processing
Perficient, Inc.
 
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
Nelson Petracek
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)
Denodo
 
How to Use Big Data to Transform IT Operations
How to Use Big Data to Transform IT OperationsHow to Use Big Data to Transform IT Operations
How to Use Big Data to Transform IT Operations
ExtraHop Networks
 
Which data should you move to Hadoop?
Which data should you move to Hadoop?Which data should you move to Hadoop?
Which data should you move to Hadoop?
Attunity
 
Chief AI Officer and AI Digital Transformation
Chief AI Officer and AI Digital TransformationChief AI Officer and AI Digital Transformation
Chief AI Officer and AI Digital Transformation
Value Amplify Consulting
 
Agile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachAgile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric Approach
SoftServe
 
Fractional Chief AI Officer Services For Hire
Fractional Chief AI Officer Services For HireFractional Chief AI Officer Services For Hire
Fractional Chief AI Officer Services For Hire
Value Amplify Consulting
 
Implementing Advanced Analytics Platform
Implementing Advanced Analytics PlatformImplementing Advanced Analytics Platform
Implementing Advanced Analytics Platform
Arvind Sathi
 

Similar to Assessing New Databases– Translytical Use Cases (20)

Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
 
Real time data integration best practices and architecture
Real time data integration best practices and architectureReal time data integration best practices and architecture
Real time data integration best practices and architecture
 
Data Science and Enterprise Engineering with Michael Finger and Chris Robison
Data Science and Enterprise Engineering with Michael Finger and Chris RobisonData Science and Enterprise Engineering with Michael Finger and Chris Robison
Data Science and Enterprise Engineering with Michael Finger and Chris Robison
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017
 
Leverage Machine Data
Leverage Machine DataLeverage Machine Data
Leverage Machine Data
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
 
Igniting Audience Measurement at Time Warner Cable
Igniting Audience Measurement at Time Warner CableIgniting Audience Measurement at Time Warner Cable
Igniting Audience Measurement at Time Warner Cable
 
Microstrategy Overview
Microstrategy OverviewMicrostrategy Overview
Microstrategy Overview
 
Real Time Business Platform by Ivan Novick from Pivotal
Real Time Business Platform by Ivan Novick from PivotalReal Time Business Platform by Ivan Novick from Pivotal
Real Time Business Platform by Ivan Novick from Pivotal
 
Operational-Analytics
Operational-AnalyticsOperational-Analytics
Operational-Analytics
 
Drive Smarter Decisions with Big Data Using Complex Event Processing
Drive Smarter Decisions with Big Data Using Complex Event ProcessingDrive Smarter Decisions with Big Data Using Complex Event Processing
Drive Smarter Decisions with Big Data Using Complex Event Processing
 
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
TIBCO Innovation Workshop Series: Reducing Decision Latency with Streaming An...
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)
 
How to Use Big Data to Transform IT Operations
How to Use Big Data to Transform IT OperationsHow to Use Big Data to Transform IT Operations
How to Use Big Data to Transform IT Operations
 
Which data should you move to Hadoop?
Which data should you move to Hadoop?Which data should you move to Hadoop?
Which data should you move to Hadoop?
 
Chief AI Officer and AI Digital Transformation
Chief AI Officer and AI Digital TransformationChief AI Officer and AI Digital Transformation
Chief AI Officer and AI Digital Transformation
 
Agile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric ApproachAgile Big Data Analytics Development: An Architecture-Centric Approach
Agile Big Data Analytics Development: An Architecture-Centric Approach
 
Fractional Chief AI Officer Services For Hire
Fractional Chief AI Officer Services For HireFractional Chief AI Officer Services For Hire
Fractional Chief AI Officer Services For Hire
 
Implementing Advanced Analytics Platform
Implementing Advanced Analytics PlatformImplementing Advanced Analytics Platform
Implementing Advanced Analytics Platform
 

More from DATAVERSITY

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
DATAVERSITY
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
DATAVERSITY
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
DATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
DATAVERSITY
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
DATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
DATAVERSITY
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
DATAVERSITY
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
DATAVERSITY
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
DATAVERSITY
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
DATAVERSITY
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
DATAVERSITY
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
DATAVERSITY
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
DATAVERSITY
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
DATAVERSITY
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
DATAVERSITY
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
DATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
DATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
DATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
DATAVERSITY
 

More from DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Assessing New Databases: Translytical Use Cases

  • 1. Assessing New Databases: Translytical Use Cases Presented by: William McKnight “#1 Global Influencer in Big Data” Thinkers360 President, McKnight Consulting Group A 2 time Inc. 5000 Company @williammcknight www.mcknightcg.com (214) 514-1444 Second Thursday of Every Month, at 2:00 ET With William McKnight
  • 2. Enterprises Analysts Vendors • Keynote/Webinar Presentations – Online & In-Person – Great turnouts. • White Paper Development – Use our unique voice to talk about a theme important to you and tie your product to it. • Benchmark Services – Performance, Ease-of-Use, Functionality, TCO. We’ve done 40+ benchmarks; TPCs, others; databases (analytical, operational) and related (lake, integration, APIs, etc.). Impactful. • Day in the Life of Report – We go from zero to production and document the steps, creating comfort for the buyer to make the next step. • Teardown – Comparing and grading vs. competition across 50 +/- factors. Ideal for building product roadmaps. • Competitive Education – We teach vendor competitive teams about the competition with ½ day – 1 day hands-on workshops per competitor. • Technical Specification Development – i.e., Deployment Guide, Best Practices Guide, Reference Architecture. • Test Drives for demonstrations/booth – We build real-world relatable test drives/demos you can use to show off features or performance. • *NEW* McKnight Enterprise Contribution Ranking Report – We’re taking an industry and assessing market leaders against critical capabilities of the market. Industries available for prioritizing research. • Total Addressable Market Report. McKnight Consulting Group Vendor Offerings
  • 3. OLTP vs OLAP OLTP: • Process business interactions as they occur • Support limited querying • Focus on IUD (insert/update/delete) of individual transactions • Low latency and high throughput needed • ACID compliance • Normalized data model OLAP: • Analytics/complex analysis • Offload of processing from OLTP • Dimensional data model • Light data modification from source • Complex queries, frequently long-running • Large data accumulation
  • 4. Capability Requirements • Analytics on live data, recent data and historical data • Real-time analytics calculated from across data domains • Pre-calculated data • Live analytics usable operationally • A seamless platform • Operational SLAs
  • 5. Analytics Defined • Analytics is the process of utilizing data to enhance business processes. • Analytics are deeper than simple knowledge; they have depth. • There are analytic projects, and… • There are analytics added to projects
  • 6. Analytics Origin Batch: • Broad Context • Drives Reactions • Action Options • “Most People” • Static Rules Real-Time: • Immediate Context • Activity not in Batch • Dynamic Rules
  • 7. Benefits of Real-Time Analytics • Speed to Insight • Customer Experience • Operational Excellence • Deeper Understanding
  • 8. Translytical Use Cases • Portfolio Management • Wealth Management • Fraud Analytics • Risk Management • Algorithmic Trading • Crypto Exchange • SC/IoT Analytics • Real-Time Customer Experience • Network Telemetry • Geolocation Analysis • Field Support Optimization • Ad Optimization & Ad Serving • Streaming Media Quality Analytics • Real-Time Recommendations • Video Games • Telemetry Processing • IoT & Smart Meter Analytics • Predictive Maintenance • Geospatial Tracking
  • 9. Next Best Offer/Touch • Need to incorporate not only analytics through last night, but also today, all morning, last hour and last second into screen render • Need to incorporate not just the user’s data but all users’ data – Need to correlate user to other users instantly • Only AI can operate at the needed scale
  • 10. Financial Market • Billions of API Requests Daily • Need 5-10ms Average Query Response • Data to include: – Real-time and historical stock price – Cryptocurrencies, Forex, Commodities, Currencies, Premium Data • Front-Office Traders Need Real-Time Analysis
  • 11. Healthcare • Genomic medicine • Virtual visits • Tele-health and AI Triage • AI Diagnostics • Robotics Automating Lab Work
  • 12. Retailer • Better and personalized product recommendations for consumers based on session data, historical order data, and trending products. • Continuous and automatic retraining of the recommendation (ML) engine. • Near real-time data integration from their retail application to the analytical platform. • Identify potential compliance issues with customer data, classify and tag sensitive data with labels, and track how sensitive data is being used from the data source to the reports. • Integrate other systems such as their SAP ERP, email and instant messaging platforms with the analytical solution to get a full 360 view over their business operations and to improve customer satisfaction. • Save in operational costs while offering the best customer experience even during peak seasons such as Black Friday, Thanksgiving, Christmas, and Mother’s Day.
  • 13. Metaverse • VR chairs, vests, scent generators, and better directional sound systems • Avatars as fully virtual agents • Surgical implants to the metaverse
  • 14. Transportation • Driverless and autonomous • Floating or vertical warehouses delivering packages • Urban transportation • Airbus drone-like popup concept
  • 15. Cameras and Audio Recording • Cameras Will Be Abundant • Person’s Profile Will Be Evident • Third-Party Analytics • AI Will Decide …
  • 16. Manufacturing • Real-Time Dashboards • Variety of data sources • When they ingest data they must recalculate the entire dataset because business rules change over time • Cross-matching survey results at the team and individual level • Need to know what impact various dimensions, such as product quality, support, cost, and more have on their NPS score • Processes that formerly required 10 steps are streamlined down to just one.
  • 17. Asset Management • End-to-end asset visibility • Needed one place to discover all assets in environment – With instant context around risk, vulnerability, threat assessment and threat detection • 100 billion events per day – devices, firewalls, IoT, multi-tenant, ServiceNow and network traffic
  • 18. Security Surveillance • Goal to view all sites in a single, cloud-based package – And offer analytics from video data • Real-Time Insights • Biggest challenge was scalability
  • 19. Finance: Embedded • Started with easy to prototype, ingest data, do basic reports – Required replica sets • Performance constraints on writes to PostgreSQL • Had to do a bulk load of the data and it was so time-consuming that certain data was skipped
  • 20. eSports • Need to offer real-time and historical live streaming data to analyze trends and performance across all genres, games, events, and channels • Need to work with thousands of time series data points in complex multi-gigabyte aggregated queries. • Analytics speed is the top priority • Understand spikes in viewership
  • 21. Data Architecture Needs for Translytical Workloads • Fast Streaming Ingest (millions of events/second) • Low Latency • High Concurrency (thousands of concurrent users) • Unlimited Storage • Pipelines • Transactional Consistency
  • 22. Data Architecture Ill-fit for Translytical (architecture diagram; labels: Logs (Apps, Web, Devices); User tracking; Operational Metrics; Offload data; Raw Data Topics (JSON, AVRO); Processed Data Topics; Sensors and/or Transactional/Context Data; OLTP/ODS; ETL or EL with T in Spark; Batch; Low Latency Applications; Files; In-database analytics; Reach through or ETL/ELT or Stream Processing; Data Warehouse; Data Lake)
  • 25. NoSQL for Operational Big Data • More data model flexibility – Web Services as a data model – No “schema first” requirement; load first • Faster time to insight from data acquisition • Relaxed ACID – Eventual consistency – Willing to trade consistency for availability – ACID would crush things like storing clicks on Google • Low upfront software and development costs • Fault-tolerant redundancy • Linear Scaling to “webscale”
  • 26. Event-Driven Architectures • Kafka Connector • Real-Time Pub/Sub Messaging Platform • Edge Computing • Selective Feed to a Data Lake
  • 27. Single Product Architectures • Single Table Storage for Transactions and Analytics – Fast IUD and Query – Simplified Data Architecture – Reduced Data Movement • Rowstore + Columnstore
  • 28. Columnstore • SingleStore uses two storage types internally: an in-memory rowstore and a disk-based columnstore • Columnstore: (diagram)
  • 29. Azure Real-Time Environment (architecture diagram; labels: E-Commerce Website on Azure Kubernetes Service (AKS), with Front-end and Back-end (Cart, Profile, Products, Stock); Transactional Database: Azure Cosmos DB Core API; Azure Cosmos DB Analytical Store (HTAP); Analytical Store (Parquet); Synapse Link: enables automatic sync to analytical store (no ETL); Data Processing: Azure Synapse Analytics; Synapse Pipelines; Data Lake + Historical Data: ADLS Gen2 (data lake); ML Model Training & Deployment: Azure Machine Learning; Automatic Model deployment; Deployed Recommender on an Azure ML managed online endpoint; Data Management & Governance: Microsoft Purview, to classify & protect sensitive data (customer profiles, etc.); Report & Visualize: Power BI; Power Apps; M365; Dataverse; Enterprise Data Sources)
  • 30. AWS Real-Time Environment (architecture diagram; labels: E-Commerce Website on Amazon Elastic Kubernetes Service (Amazon EKS), with Front-end and Back-end (Cart, Profile, Products, Stock); Transactional Database: Amazon DynamoDB; Data Loading: Amazon Glue; Data Processing: Amazon Redshift; Data Lake + Historical Data: S3 (data lake); ML Model Training & Deployment: Amazon SageMaker; Automatic Model deployment; Deployed Recommender on a SageMaker model endpoint; Data Governance: AWS Partner solutions, AWS Marketplace solutions)
  • 31. Single Product (architecture diagram; labels: Logs (Apps, Web, Devices); User tracking; Operational Metrics; Offload data; Raw Data Topics (JSON, AVRO); Processed Data Topics; Sensors and/or Transactional/Context Data; OLTP; ETL or EL with T in Spark; Batch; Low Latency Applications; Files; Reach through or Stream Processing; Data Warehouse; Data Lake)
  • 32. Single Vendor Solutions • SingleStore • Oracle • Snowflake Unistore • Cassandra • Azure • AWS • Google
  • 33. Tweak on Traditional Architectures (architecture diagram; labels: Logs (Apps, Web, Devices); User tracking; Operational Metrics; Offload data; Raw Data Topics (JSON, AVRO); Processed Data Topics; Sensors and/or Transactional/Context Data; OLTP/ODS; ETL or EL with T in Spark; Batch; Low Latency Applications; Files; In-database analytics; Reach through or ETL/ELT or Stream Processing; Data Warehouse; Data Lake; Analytics)
  • 34. Multi-Vendor Architectures • Spark, Cassandra, Elastic, Druid, MongoDB • MySQL, Redis, DynamoDB • Redis, Elastic, Spark, Flink • Storm, Druid • PostgreSQL, Elastic
  • 35. Benchmark • We found a single database (the “1 db”) competitive in operating effectively, in fact putting it in a winning position for both transactional and operational workloads. – The use of a single database facilitates operational analytics and offers an efficient approach for any organization. • For the TPC-H-like workload, it obtained a geometric mean better than both of the pure-play data warehouses. • In the TPC-DS-like workload, an analytic db was superior, both with maintenance and without maintenance. Its 4.1 geometric mean outperformed the 1 db without maintenance, while its 3.9 with maintenance likewise bested the 1 db. • Given the vast superiority in transactional processing and the high competitiveness in analytic processing, the efficiencies of one database (the 1 db) across the spectrum of enterprise needs should be considered. • Platform costs favor the 1 db by 1.9x over one analytic db and 2.5x over the other in Year 1. • Compared with the 1 db, development costs are 2.5x – 3x and production costs are 2.1x – 2.5x for the analytic dbs. • We calculated the annual costs of the platform stacks and the Time-Effort Costs (People Costs, Development Costs and Production Costs) and concluded that the 1 db is 2 times cheaper than one analytic stack and 2.5 times cheaper than the other over 3 years running enterprise-equivalent workloads.
  • 36. Summary • Applications are moving translytical as the lines between operational and analytical blur • Analytics are deeper than simple knowledge; they have depth • The need for real-time analytics drives the need for a translytical architecture • There are examples in every industry • Traditional architectures do not meet the requirements • There are multi-vendor, multi-product/same-vendor and single-product options • Single product solutions combine Rowstore + Columnstore • Given the vast superiority in transactional processing and the high competitiveness in analytic processing, the efficiencies of one database (the 1 db) across the spectrum of enterprise needs should be considered
  • 37. Upcoming Topics • Assessing New Database Capabilities: Multi-Model • MLOps: Applying DevOps to Competitive Advantage • 2023 Trends in Enterprise Analytics • Showing ROI for your Analytic Project • Architecture, Products and Total Cost of Ownership of the Leading Machine Learning Stacks Second Thursday of Every Month, at 2:00 ET
  • 38. Assessing New Databases: Translytical Use Cases Presented by: William McKnight “#1 Global Influencer in Big Data” Thinkers360 President, McKnight Consulting Group A 2 time Inc. 5000 Company @williammcknight www.mcknightcg.com (214) 514-1444 Second Thursday of Every Month, at 2:00 ET #AdvAnalytics
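
Slides 27 and 28 describe single-product architectures as "Rowstore + Columnstore" behind one table. The toy Python sketch below only illustrates that idea; it is not SingleStore's or any vendor's implementation. The `HybridTable` class, its method names and the sample orders are hypothetical. New rows land in an in-memory row buffer for fast point reads, `flush` moves them into per-column lists, and the aggregate reads both, so fresh writes are visible to analytics without a separate ETL step.

```python
class HybridTable:
    """Toy single-table store: a row buffer for writes, column lists for scans.

    Simplifying assumptions (not from the slides): inserts only, no updates
    or deletes, no durability, no concurrency control.
    """

    def __init__(self):
        self.buffer = {}        # rowstore part: order_id -> (customer, amount)
        self.col_ids = []       # columnstore part: one list per column
        self.col_customer = []
        self.col_amount = []

    def insert(self, order_id, customer, amount):
        # Transactional path: a single keyed write into the row buffer.
        self.buffer[order_id] = (customer, amount)

    def flush(self):
        # Background path: move buffered rows into the column lists.
        for order_id, (customer, amount) in self.buffer.items():
            self.col_ids.append(order_id)
            self.col_customer.append(customer)
            self.col_amount.append(amount)
        self.buffer.clear()

    def get(self, order_id):
        # Point lookup: check fresh rows first, then the flushed columns.
        if order_id in self.buffer:
            return self.buffer[order_id]
        i = self.col_ids.index(order_id)  # linear scan; raises ValueError if absent
        return (self.col_customer[i], self.col_amount[i])

    def total_by_customer(self):
        # Analytical path: scan only the two columns needed, then fold in
        # rows that have not been flushed yet.
        totals = {}
        for customer, amount in zip(self.col_customer, self.col_amount):
            totals[customer] = totals.get(customer, 0.0) + amount
        for customer, amount in self.buffer.values():
            totals[customer] = totals.get(customer, 0.0) + amount
        return totals


table = HybridTable()
table.insert(1, "ann", 10.0)
table.insert(2, "bob", 5.0)
table.flush()                  # rows 1 and 2 now live in the column lists
table.insert(3, "ann", 2.5)    # row 3 is still in the row buffer
print(table.get(3))
print(table.total_by_customer())
```

The point of the sketch is the last call: the aggregate includes order 3 even though it has not been flushed, which is the "reduced data movement" property the slide attributes to single-table storage.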

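The benchmark slide (35) summarizes each workload with a geometric mean. As a reminder of what that statistic is, here is a minimal Python sketch. The function name and the sample timings are made up for illustration and are not the benchmark's data; the slide does not say which per-query measurement was averaged.

```python
import math


def geometric_mean(values):
    """Return the n-th root of the product of n positive values."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    # Average the logarithms, then exponentiate; this avoids overflow
    # when multiplying many values together.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-query timings in seconds (NOT from the benchmark).
query_times = [1.0, 4.0, 16.0]
print(geometric_mean(query_times))           # close to 4.0
print(sum(query_times) / len(query_times))   # arithmetic mean: 7.0
```

Benchmarks often prefer the geometric mean because one very slow query pulls it up far less than it pulls up the arithmetic mean, as the two printed values suggest.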
Editor's Notes

  1. HTAP, HOAP, Operlytical, Event-driven*
  2. The dw is not dead, but it is dying. Data lake.
  3. Analytics needed even if they are not traditionally stored together (e.g. real-time customer event data alongside CRM data; network sensor data alongside marketing campaign management data) Between the pre-calc and the live is design…. On-demand vs. Continuous Real-time Analytics
  4. Projects that are classified as “analytics”. And there’s analytics added to projects. At some point, all projects are becoming analytic projects which makes it fair to just measure the project roi.
  5. Red light…. Stop. Car running low on gas, gas station 20 minutes away, home 30 minutes away, tomorrow not as busy can gas in morning.
  6. Speed to Insight: The primary benefit of real-time analytics is of course speed. It speeds up time to insight and lets businesses work faster to make necessary changes to systems or act on any critical information discovered. This can help organizations not only flag potential problems and mitigate risk, but also seize opportunities when they matter. Customer Experience: Real-time analytics can help businesses anticipate problems and streamline operations to improve the overall customer experience. These on-the-fly adjustments greatly influence customer interactions and can help improve the end-to-end experience. Operational Excellence: Real-time analytics allows organizations to gain a clear view of the business and understand what needs to be done to address potential operational issues. It also allows users to understand what resources are available to make those changes. Deeper Understanding: When there is a need for deeper analytics to make a business decision, real-time analytics can help compare real-time and historical data to inform the decision. Most r/t arch about r/t ingest
  7. Keywords: real-time, real-time analytics, operational excellence, operational analytics, real-time DW. Real-time analytics essentially means that data is provided for analysis almost immediately once it is collected. Way of the future; no 2nd store of data: DW
  8. Maybe somebody just became a correlated user theme
  9. Foreign exchange. Premium data comes from a growing community of curated partners, such as Wall Street Horizon, Fraud Factors, Audit Analytics, ValuEngine, Stocktwits, and many more.
  10. Recalls. Outbreaks. Latest findings. Pandemic footprint. Human beings have roughly 20,500 genes, in DNA, housed in each and every one of the trillions of cells that make you who you are. What causes what action… it’s complicated. Batch analytics needed.
  11. 360 incl the now
  12. MV is about simulation. Avatars able to act, within tightly defined parameters, as our agents, our companions, and some may even be considered co-workers. Unable to tell the difference between a virtualized real person and an AI-driven avatar. We will virtually be able to travel the world and experience life on other planets, all from home. Metaverse will give a feeling of actually being there with your family/friends. Parallel life in the Metaverse. It has become absolutely necessary for your existence. It is very difficult to be operational outside of the Metaverse. You are connected via multiple devices, wearables and even brain chips. You live in a mixed reality where physical and digital converge. Many people opt to spend most of their day in virtual worlds where they can become whoever they want and live the way they always dreamed. Unlimited freedoms in their personal virtual worlds, no limits. NFTs and crypto … take off later. Bitcoin will displace the US dollar as the primary form of global finance by 2050.
  13. Traffic and weather – current and patterns. Constantly changing.
  14. Imagine this: You walk into a furniture showroom virtually and before you say anything, the store knows your name, employment status, car-buying history, and credit rating. ADD: Where you’ve been today, clothes you’re wearing, etc. Already, data brokers such as Acxiom and LexisNexis compile reams of information on all of us. Clients can purchase a dossier on your criminal, consumer, and marital past. It’s only a matter of time before data brokers begin drawing from online-dating profiles and social-media posts as well. Right now, clients have to log in and search for people by name or buy lists of people with certain traits. But as facial-recognition technology becomes more widespread, any device with a camera and the right software could automatically pull up your information. Eventually, someone might be able to point a phone at you (or look at you through smart contact lenses) and see a bubble over your head marking you as unemployed or recently divorced. We’ll no longer be able to separate our work selves from our weekend selves. Instead our histories will come bundled as a pop-up on strangers’ screens. With the advent of the Internet of Things, appliances and gadgets will monitor many aspects of our lives, from what we eat to what we flush. Devices we talk to will record and upload our conversations, as Amazon’s Echo already does. Even toys will make us vulnerable. Kids say the darndest things, and the talking Hello Barbie doll sends those things wirelessly to a third-party server, where they are analyzed by speech-recognition software and shared with vendors. Even our thoughts could become hackable. The technology company Retinad can use the sensors on virtual-reality headsets to track users’ engagement. Future devices might integrate electrodes to measure brain waves.
In August, Berkeley engineers announced that they had produced “neural dust,” implantable electrodes just a millimeter wide that can record brain activity for scientific or medical purposes. Chicago police use an algorithm that analyzes arrest records, social networks, and other data to identify future criminals.
  15. xxx previously had to run advanced analytics offline xxx. “If you looked at the dashboard and wanted to drill through, the waiting times were longer than 2 seconds. If it's not instant or very close to instant, it becomes painful. At that point, people just don’t do the analytics and valuable information is lost. If you don't use it, and if you don't analyze, you can’t find these things, you're not going to improve your business.” The data sources could be almost anything, from databases to IoT devices. They can now drill down into things like NPS to get at the root cause of a score, drive those insights back into the business, improve their scores, and most importantly, retain their end customers. In the past, xxx had to provide these analytic insights by moving data into SPSS, which was painfully slow. With a translytical approach, they can now slice and dice data in real time, and in the NPS example, instantly understand the validity of a data correlation.
  16. Armis originally launched its platform using a PostgreSQL database. Over time, the time-based data set got too large for Postgres to handle. At this point the team migrated this data set from 400+ PostgreSQL databases into a huge Elasticsearch cluster (160 nodes). The entire data pipeline, including Elasticsearch, cost more than $1 million annually.
  17. Embedded finance is when non-financial companies offer their customers access to credit through their technology platform. Customers can be individuals or businesses, and the credit can be offered by the company or by a third party. The replicated data needed to be re-ingested, and the dashboard only refreshed once every 24 hours, leading to a serious and unacceptable lag in data freshness. Ant Money didn’t store a lot of information due to performance constraints on writes to PostgreSQL, and data from other partners, which Ant Money could have used to enrich its first-party data, was nearly impossible to obtain. Ant Money had to do a bulk load of the data and it was so time-consuming that certain data was skipped.
  18. The company’s customers are the biggest companies in the eSports industry: game publishers, eSports organizers, and other brands. It helps companies analyze eSports data so they can understand how they’re doing and ways they can optimize time and resources. Both real-time and historical data were needed to provide the full context of the live streams and eSports events. The data ingestion pipeline included manual metadata input, third-party fact tables, and automated systems.
  19. LinkedIn: 80 million events/sec + bulk loading. Parallel, high-scale streaming data ingest; immediate availability.
  20. People on 1 side. Most of the time post-op is “learning”. 1 db solutions: doing analytics with operational DBs or tying together multiple databases to power their applications with analytics. Is it more analytics needing operational or operational needing analytics; which way is it coming? It’s analytics trying to do operational and failing. Also MySQL and PostgreSQL trying to do analytics and failing.
  21. r/t for access Single source
  22. Databricks. Lake + dw. All points of integration are points of failure. Data lakes (cloud storage) emerged to handle raw data in a variety of formats on cheap storage for data science and machine learning, though they lacked critical features from the world of data warehouses: they do not support transactions, they do not enforce data quality well, and their lack of consistency/isolation makes it almost impossible to mix appends and reads, and batch and streaming jobs. There are a few key technology advancements that have enabled the data lakehouse: metadata layers for data lakes to set up drill-through paths; new query engine designs providing high-performance SQL-like execution on data lakes; and access for data science and machine learning tools. Lake concerns itself w/ DQ, not offloads. All of the major data platform vendors have converged their messaging around the concept of a lakehouse architecture that takes the best attributes of traditional data warehouses and enables them to run on platforms with data-lake storage architectures.
  23. Column stores, key-value stores, document stores. Data fit for NoSQL.
  24. LinkedIn: 80 million messages/sec
  25. Readers don’t need to wait on writers. Each version of the row is stored as a fixed-sized struct (variable-length fields are stored as pointers) according to the table schema, along with bookkeeping information such as the timestamp and the commit status of the version. Oracle has a dual store approach rather than a single store like SingleStoreDB's Universal Storage. Snowflake has announced combining transactional and analytical data with Unistore. SingleStore: what we encounter is customers trying to do analytics with operational DBs or tying together multiple databases to power their applications with analytics. We also replace a lot of the 1st gen operational DBs (such as MySQL, Postgres, RDS) and also augment data warehouses or Hadoop to power real-time analytics.
  26. Microsoft solution: Microsoft Azure and Microsoft Intelligent Data Platform; Azure Kubernetes Service; Azure Cosmos DB; Synapse Link for Cosmos DB; Synapse Analytics, Synapse Pipelines, ADLS Gen2; Azure ML; Power BI; Microsoft Purview
  27. AWS solution: Amazon Elastic Kubernetes Service (Amazon EKS); Amazon DynamoDB; Amazon Glue; Amazon Redshift, S3; Amazon SageMaker. For Data Governance: 3rd party Marketplace/Partner solutions
  28. GCP solution: Google Kubernetes Engine (GKE); Cloud Firestore; Cloud Data Fusion; BigQuery, Cloud Storage, Cloud Dataprep, Cloud Dataflow; Vertex AI Prediction, Vertex AI. For Data Governance: Dataplex, requires a separate Dataplex lake
  29. Diff: 1 vendor, 1 product
  30. Oracle has a dual store approach rather than a single store like SingleStoreDB's Universal Storage. You need (i) basic Oracle DB license, (ii) diag + tuning pack, (iii) Oracle RAC option, (iv) Exadata (for the columnar compression/performance), (v) partitioning pack, (vi) Active Data Guard option. Snowflake has announced combining transactional and analytical data with Unistore.
  31. Some people on 2 sides. 1. Data lakes can be difficult to manage and govern due to their size and complexity. 2. Data lakes can be difficult to extract from regularly due to the variety and volume of data they contain. Need to add a cache like Redis. Op db often NoSQL.
  32. Organizations are often reluctant to attempt analyzing real-time data, fearing the analytical workload will hamper the performance of the operational work that has to be the priority. 
  33. 2 analytic dbs, 1 “1 db” solution
  34. 2 analytic dbs, 1 “1 db” solution
  35. Some of these BP you’ll see next month in mature env
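
Note 25 says readers do not need to wait on writers because each row version is stored with bookkeeping such as a timestamp and a commit status. The Python sketch below is a minimal illustration of that multi-version idea under stated assumptions; it is not the storage layout of SingleStore or any other product. `Version`, `VersionedRow` and the integer timestamps are hypothetical names and simplifications.

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: dict              # the row contents for this version
    timestamp: int           # logical time at which the version was written
    committed: bool = False  # commit status; uncommitted versions are invisible


class VersionedRow:
    """Toy multi-version row: writers append versions, readers never block."""

    def __init__(self):
        self.versions = []   # ordered oldest to newest

    def write(self, value, timestamp):
        # A writer appends a new, not yet committed version instead of
        # overwriting the row in place.
        version = Version(value, timestamp)
        self.versions.append(version)
        return version

    def read(self, as_of):
        # A reader walks back from the newest version and returns the first
        # one that is committed and not newer than its snapshot time.
        for version in reversed(self.versions):
            if version.committed and version.timestamp <= as_of:
                return version.value
        return None


row = VersionedRow()
v1 = row.write({"balance": 100}, timestamp=1)
v1.committed = True
v2 = row.write({"balance": 80}, timestamp=2)   # write still in flight
print(row.read(as_of=5))   # the reader sees the last committed version
v2.committed = True
print(row.read(as_of=5))   # after commit, the newer version is visible
```

A reader arriving while the second write is in flight still gets the last committed balance, without waiting on the writer.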