Architecture, Products, and Total Cost of Ownership of the Leading Machine Learning Stacks

Architecture, Products
and Total Cost of
Ownership of the
Leading Machine
Learning Stacks
Presented by: William McKnight
“#1 Global Influencer in Big Data” Thinkers360
President, McKnight Consulting Group
A 2-time Inc. 5000 Company
linkedin.com/in/wmcknight/
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
With William McKnight

TELECOMMUNICATIONS
PHARMACEUTICAL
EDUCATION
CONSUMER PRODUCTS/RETAIL FINANCIAL INSURANCE/HEALTHCARE
GOVERNMENT AND UTILITIES
OTHER
PUBLISHING
McKnight Consulting Group Partial Client List

Performance Features
• Micro-partitions
• Clustering Keys
• Clustering Depth
• Multi-Clusters
• Transparent Materialized Views
• Search Optimization Service
• Query Acceleration Service

Individual Query Performance Feature
Comparison
Improves Clustering Materialized Views Search Opt. Service
Equality searches X X X
Range searches X X X
Sort operations X X
Substring and Regex X
VARIANT searches X
Geospatial X
Extra Costs
Compute X X X
Storage X X

Usability Features
• External Tables
• Dynamic Data Masking
• Time Travel and Fail Safe
• Semi-Structured Data
• Snowpipe
• Snowsight Dashboards
• Snowpark API
6

Warehouses
• 10 sizes
• Available in Standard
and Snowpark
• New Snowpark-
optimized with 16x
memory than
Standard (open
preview)
Size
XS
S
M
L
XL
2XL
3XL
4XL
5XL
6XL

Pricing
• Watch For:
– Concurrency and
price-per-
performance
– Effective Warehouses
(Multi-clusters)
– Add-on compute:
• Automatic
Clustering
• Materialized View
Refreshes
• Search
Optimization
• Query
Acceleration
– Time travel storage
• Discounts

(A) Snowflake ML Stack
Category
Dedicated Compute Snowflake
Storage Snowflake
Data Integration AWS Glue
Streaming Kafka Confluent Cloud
Spark Analytics Amazon EMR + Kinesis Spark
Data Lake Snowflake External Tables
Business Intelligence Tableau
Machine Learning Amazon SageMaker
Identity Management Amazon IAM
Data Catalog Amazon Glue Data Catalog

(A) Snowflake Machine Learning Stack
Azure Kubernetes Services (AKS)
Front-end
E-Commerce
Website
Back-end
Cart
Profile
Products
Stock
Deployed
Recommender
ML Model Training &
Deployment
Automatic
Model deployment
Databricks Databricks
Transactional
Database
Cloud Firestore
Data Loading
Data
Processing
Cloud Data Fusion
Snowflake
Data
Transformation
Data Lake +
Historical Data
Data Marts
Cloud Storage
(data lake)
MDM
Database
Talend
Data Governance:
• Partner Solutions
• Marketplace solutions

• Redshift Advisor
• Workload Management
• Concurrency Scaling
• Short Query Acceleration
12

Usability Features
• Redshift Spectrum (External Tables)
• Automated Materialized Views (AutoMV)
• Federated Queries
• Semi-Structured and SUPER Type
• Streaming Ingest with Kinesis
• Python UDF
• Redshift ML

Provisioned Clusters vs. Serverless
Provisioned Serverless
Managed Self managed Fully managed
Compute Choose node type and cluster size Workgroup
Storage Provisioned disk capacity Namespace
WLM User configured Not applicable
Concurrent scaling User enabled Not applicable
Scale out/up/down User-initiated cluster resize Not applicable
Pause/resume Manual Automatic
Compute billing Per second when not paused
$/hour rate
Per second when workloads run
RPU-hour rate
Storage billing $ per managed storage amount $ per GB-month used
More detailed comparison: https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-console-comparison.html

Cluster Sizes
AWS Type CPU/RAM Node Range Price Per Node
dc2.large 2 / 15 GB 1 – 32 $0.25
dc2.8xlarge 32 / 244 GB 2 – 128 $4.80
ra3.xlplus 4 / 32 GB 1 – 32 $1.09
ra3.4xlarge 12 / 96 GB 2 – 32 $3.26
ra3.16xlarge 48 / 384 GB 2 – 128 $13.04
Serverless (Base & Max RPUs) ? 32 – 512 RPUs* $0.36
*Redshift Processing Units are available in units of 8 (32, 40, 48, and so on, up to 512)

Pricing
• Price-per-
performance
• Watch For:
– Concurrency
Scaling
– Serverless
RPU Usage
– SageMaker
costs for
Redshift ML
• Discounts

Redshift ML Stack
Category
Dedicated Compute Amazon Redshift RA3
Storage Amazon Redshift Managed Storage
Data Integration AWS Glue
Streaming Amazon Kinesis Data Analytics
Spark Analytics Amazon EMR + Kinesis Spark
Data Lake Amazon Redshift Spectrum
Business Intelligence Amazon Quicksight
Machine Learning Amazon SageMaker
Data Catalog Amazon Glue Data Catalog

Amazon Elastic Kubernetes services (Amazon EKS)
Front-end
E-Commerce
Website
Back-end
Cart
Profile
Products
Stock
Deployed
Recommender
ML Model
Training &
Deployment
Automatic
Model
deployment
SageMaker
model
endpoint
Amazon
SageMaker
Transactional
Database
Amazon
Dynamo DB
Data
Loading
Amazon
Glue
Data
Processing
Amazon
Redshift
Data Lake
+ Historical
Data
S3
(data
lake)
Data Governance:
• AWS Partner Solutions
• AWS Marketplace
solutions
MDM
Database
Talend
AWS Machine Learning Stack

• Workload Management
• Estimated query plan (coming soon)
• Transparent materialized views
• Adaptive caching (recently use data on
NVMe)
• Azure Advisor

Usability Features
• External Data Sources
• Synapse Link
• SynapseML
21

Data Warehouse Units (DWU)
• Official: “a collection of analytic
resources…defined as a combination of
CPU, memory, and IO…[which] represents
an abstract, normalized measure of
compute resources and performance.”
• Increasing DWUs linearly improves
performance
DWUs
100
200
300
400
500
1000
1500
2000
2500
3000
5000
6000
7500
10000
15000
30000

Pricing
DWUs Price/hr
100 $1.20
200 $2.40
300 $3.60
400 $4.80
500 $6
1000 $12
1500 $18
2000 $24
2500 $30
3000 $36
5000 $60
6000 $72
7500 $90
10000 $120
15000 $180
30000 $360
Component Price
Serverless $5/TB processed
Dedicated $/hour >>>
1-year Reserved 37% discount
3-year Reserved 65% discount
Storage $23/TB-month
• Additional charges (per vCore-hour) for Synapse
Link, Data Explorer, and Spark Pools
• Pipelines priced by DIU-hour, runtime-hour, and
per activity run

Microsoft Synapse ML Stack
Category
Dedicated Compute Azure Synapse Analytics Workspace
Storage Azure Synapse Analytics SQL Pool
Data Integration Azure Data Factory (ADF)
Streaming
Azure Stream Analytics (for Analytics)
and Azure Event Hubs
Spark Analytics Big Data Analytics with Apache Spark
Data Lake Amazon Redshift Spectrum
Business Intelligence Amazon Quicksight
Machine Learning Amazon Sagemaker
Data Catalog Amazon Purview

Front-end
E-Commerce
Website
Back-end
Cart
Profile
Products
Stock
Deployed
Recommender
ML Model Runtime
Azure ML
managed online
endpoint
Azure Machine
Learning
Transactional
Database
Azure Cosmos
DB Core API
Analytical
Store (HTAP)
Azure Cosmos
DB Analytical
Store (Parquet)
Cognitive
Services
Sentiment
analysis on
product reviews
to enhance the
recommender
model
Synapse
Link
Enables
automatic
sync
to
analytical
store
(no
ETL)
Data
Processing
Azure Synapse Analytics
Data Lake +
Historical Data
ADL Gen2 Data Lake:
HTAP data, sentiment
data, historical order data
Automatic
Model
deployment
(MLOps)
Data Transformation &
ML Model Training
Azure Databricks Delta Live Tables SparkML
Microsoft
Purview
Data Management & Governance
Discover, classify, track lineage, and protect sensitive data
(customer profiles, etc.)
MDM
Database
Talend
Azure Machine Learning Stack

• BQ Architecture and Slots
• Clustering and Partitioning
• BI Engine

Usability Features
• BigQuery Omni – External Tables
• Time Travel
• Migration Service – SQL Translation
• Looker Studio
• Colab Notebooks
• BigQuery ML
28

Pricing
Compute
BigQuery Omni
On-demand $5 per TB $5 per TB
Flex $4.00/hr per
100 slots
$5.00/hr per
100 slots
Monthly
Commit*
$2.74/hr per
100 slots
$3.42/hr per
100 slots
Annual
Commit*
$2.33/hr per
100 slots
$2.91/hr per
100 slots
BI Engine $0.0416/hr per
GB
N/A
Storage1
Logical2 Physical3
Active $0.02/GB-
month
$0.04/GB-
month
Long-term4 $0.01/GB-
month
$0.02/GB-
month
Batch loading FREE
Streaming
inserts
$0.01 per 200MB
Storage API $0.025 per 1GB
1 You get to choose logical or physical billing
2 Logical = Uncompressed size (Time travel free)
3 Physical = Compressed size + Time travel
4 Table not modified in 90 days
*comes with some free BI Engine

Google BigQuery ML Stack
Category
Dedicated Compute Google BigQuery
Storage Google BigQuery Storage
Data Integration Google Dataflow (Batch)
Streaming Google Dataflow (Streaming)
Spark Analytics Google Dataproc
Data Lake Google BigQuery On-Demand Infrastructure
Business Intelligence Google BigQuery BI Engine
Machine Learning Google BigQuery ML
Identity Management Google Cloud IAM
Data Catalog Google Data Catalog

Front-end
E-Commerce
Website
Back-end
Cart
Profile
Products
Stock
Deployed
Recommender
ML Model Training &
Deployment
Automatic
Model deployment
Vertex AI Prediction Vertex AI
Data Governance
• Google Dataplex
Transactional
Database
Cloud
Firestore
Data Loading
Data
Processing
Cloud Data Fusion
BigQuery
Data
Transformation
Data Lake +
Historical
Data
Cloud
Dataprep
Cloud Dataflow
Cloud Storage
(data lake)
MDM
Database
Talend
Google Machine Learning Stack

Line Item Pricing (AWS)
Lookup CostCenter Category Platform Product Size UnitNode
Amazon Redshift ra3.4xlarge-Infrastructure Infrastructure
01-Dedicated
Compute AWS Amazon Redshift ra3.4xlarge 1-Medium ra3.4xlarge
Amazon Redshift ra3.16xlarge-Infrastructure Infrastructure
01-Dedicated
Compute AWS Amazon Redshift ra3.16xlarge 2-Large ra3.16xlarge
Amazon Redshift Managed Storage-Storage Storage 02-Storage AWS
Amazon Redshift Managed
Storage 1-Medium GB-month
Amazon Redshift Managed Storage-Storage Storage 02-Storage AWS
Amazon Redshift Managed
Storage 2-Large GB-month
AWS Glue-Software Software 03-Data Integration AWS AWS Glue 1-Medium DPU-Hour
AWS Glue-Software Software 03-Data Integration AWS AWS Glue 2-Large DPU-Hour
Amazon Kinesis Data Analytics-Infrastructure Infrastructure 04-Streaming AWS Amazon Kinesis Data Analytics 1-Medium KPU-Hour
Amazon Kinesis Data Analytics-Infrastructure Infrastructure 04-Streaming AWS Amazon Kinesis Data Analytics 2-Large KPU-Hour
Amazon Kinesis Data Analytics-Storage Storage 04-Streaming AWS Amazon Kinesis Data Analytics 1-Medium GB-month
Amazon Kinesis Data Analytics-Storage Storage 04-Streaming AWS Amazon Kinesis Data Analytics 2-Large GB-month
Amazon EMR-Infrastructure Infrastructure 05-Spark Analytics AWS Amazon EMR 1-Medium r5.4xlarge
Amazon EMR-Software Software 05-Spark Analytics AWS Amazon EMR 1-Medium EMR on r5.4xlarge
Amazon EMR-Infrastructure Infrastructure 05-Spark Analytics AWS Amazon EMR 2-Large r5.4xlarge
Amazon EMR-Software Software 05-Spark Analytics AWS Amazon EMR 2-Large EMR on r5.4xlarge
Amazon Kinesis-Shards Shards 05-Spark Analytics AWS Amazon Kinesis 1-Medium Shard-hour
Amazon Kinesis-Shards Shards 05-Spark Analytics AWS Amazon Kinesis 2-Large Shard-hour
Amazon Redshift Spectrum-Software Software 06-Data Exploration AWS Amazon Redshift Spectrum 1-Medium TB-month
Amazon Redshift Spectrum-Software Software 06-Data Exploration AWS Amazon Redshift Spectrum 2-Large TB-month
Amazon Redshift ra3.4xlarge-Infrastructure Infrastructure 06-Data Exploration AWS Amazon Redshift ra3.4xlarge 1-Medium ra3.4xlarge
Amazon Redshift ra3.4xlarge-Infrastructure Infrastructure 06-Data Exploration AWS Amazon Redshift ra3.4xlarge 2-Large ra3.4xlarge
Amazon EMR-Infrastructure Infrastructure 07-Data Lake AWS Amazon EMR 1-Medium r5.4xlarge
Amazon EMR-Software Software 07-Data Lake AWS Amazon EMR 1-Medium EMR on r5.4xlarge
Amazon EMR-Infrastructure Infrastructure 07-Data Lake AWS Amazon EMR 2-Large r5.4xlarge
Amazon EMR-Software Software 07-Data Lake AWS Amazon EMR 2-Large EMR on r5.4xlarge
Amazon Quicksight Readers-Licenses Licenses
08-Business
Intelligence AWS Amazon Quicksight Readers 1-Medium User-month
Amazon Quicksight Readers-Licenses Licenses
08-Business
Intelligence AWS Amazon Quicksight Readers 2-Large User-month
Amazon Quicksight Authors-Licenses Licenses
08-Business
Intelligence AWS Amazon Quicksight Authors 1-Medium User-month
Amazon Quicksight Authors-Licenses Licenses
08-Business
Intelligence AWS Amazon Quicksight Authors 2-Large User-month
Amazon SageMaker-Infrastructure Infrastructure 09-Machine Learning AWS Amazon SageMaker 1-Medium ml.r5.2xlarge
Amazon SageMaker-Software Software 09-Machine Learning AWS Amazon SageMaker 1-Medium ml.r5.2xlarge
Amazon SageMaker-Infrastructure Infrastructure 09-Machine Learning AWS Amazon SageMaker 2-Large ml.r5.2xlarge
Amazon SageMaker-Software Software 09-Machine Learning AWS Amazon SageMaker 2-Large ml.r5.2xlarge
Amazon IAM-Licenses Licenses
10-Identity
Management AWS Amazon IAM 1-Medium Included
Amazon IAM-Licenses Licenses
10-Identity
Management AWS Amazon IAM 2-Large Included
AWS Glue Data Catalog-Software Software 11-Data Catalog AWS AWS Glue Data Catalog 1-Medium 100K objects
AWS Glue Data Catalog-Software Software 11-Data Catalog AWS AWS Glue Data Catalog 2-Large 100K objects
34

Stack Cost by Use Case for Medium-Sized
Enterprises
• 1st Year of Project
• 1st Large Scale ML Project
• 1.3M – 3.2M
35

Stack Cost by Use Case for Large Size
Enterprises
• 1st Year of Project
• 1st Large Scale ML Project
• 3.4M – 8.5M
36

Project ROI & TCO
37
ROI =
Benefit
TCO Infrastructure Software
+
FTE
+
Consulting
+

Summary
• For large-sized enterprise projects, the stack cost typically ranges between $3.4M-$8.5M to
ensure successful deployment of ML-based projects into production, in addition to labor
expenses.
• The total cost of ownership of cloud analytics platforms scales up as the demand for analytics
at your company grows over time.
• Snowflake adopts a usage-based or consumption-based pricing model, where users are
charged based on the amount of data processed, resulting in higher costs for higher usage
levels.
• Redshift offers both provisioned clusters and serverless options to cater to different business
requirements.
• Synapse is available for purchase in DWU, which comprises a collection of analytic resources
that can be adjusted to meet the specific needs of the organization.
• BigQuery slots operate as virtual CPUs to ensure efficient data processing and analysis.
• While there are numerous technology stacks available, the ones mentioned here are just a few
examples.
• Dedicated Compute, Storage, Data Integration, Streaming, Spark Analytics, Data Lake,
Business Intelligence, Machine Learning, Identity Management, and Data Catalog are all
essential components of a modern data management and analytics ecosystem.
• Estimating the costs of building a technology stack can be a complex task and requires careful
consideration of various factors.
• It is recommended to seek reliable performance at a predictable price to ensure the
successful implementation of data management and analytics projects.
• The true measure of project efficacy is Return on Investment (ROI), and organizations should
strive to achieve positive ROI in their data management and analytics endeavors.

Architecture, Products, and Total Cost of Ownership of the Leading Machine Learning Stacks

Recommended

Recommended

More Related Content

Similar to Architecture, Products, and Total Cost of Ownership of the Leading Machine Learning Stacks

Similar to Architecture, Products, and Total Cost of Ownership of the Leading Machine Learning Stacks (20)

More from DATAVERSITY

More from DATAVERSITY (20)

Recently uploaded

Recently uploaded (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Learning Stacks