
Architecture, Products, and Total Cost of Ownership of the Leading Machine Learning Stacks

Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize applications and utilize machine learning. They need a comprehensive platform that offers multi-function data management and analytics to solve the enterprise’s most pressing data and analytic challenges in a streamlined fashion. In this research-based session, I’ll discuss the components of multiple modern enterprise analytics stacks (dedicated compute, storage, data integration, streaming, etc.) and focus on total cost of ownership. The complete machine learning infrastructure cost for the first modern use case at a midsize to large enterprise will be anywhere from $3 million to $22 million. Get this data point as you take the next steps on your journey into what will be the highest spend-and-return item for most companies over the next several years.

Presented by: William McKnight
“#1 Global Influencer in Big Data” Thinkers360
President, McKnight Consulting Group
A 2-time Inc. 5000 Company
linkedin.com/in/wmcknight/
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
With William McKnight
McKnight Consulting Group Partial Client List
TELECOMMUNICATIONS
PHARMACEUTICAL
EDUCATION
CONSUMER PRODUCTS/RETAIL
FINANCIAL
INSURANCE/HEALTHCARE
GOVERNMENT AND UTILITIES
OTHER
PUBLISHING
Performance Features
• Micro-partitions
• Clustering Keys
• Clustering Depth
• Multi-Clusters
• Transparent Materialized Views
• Search Optimization Service
• Query Acceleration Service
Individual Query Performance Feature Comparison
Improves Clustering Materialized Views Search Opt. Service
Equality searches X X X
Range searches X X X
Sort operations X X
Substring and Regex X
VARIANT searches X
Geospatial X
Extra Costs
Compute X X X
Storage X X
Usability Features
• External Tables
• Dynamic Data Masking
• Time Travel and Fail Safe
• Semi-Structured Data
• Snowpipe
• Snowsight Dashboards
• Snowpark API



7. Warehouses
• 10 sizes: XS, S, M, L, XL, 2XL, 3XL, 4XL, 5XL, 6XL
• Available in Standard and Snowpark-optimized
• New Snowpark-optimized warehouses with 16x the memory of Standard (open preview)
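
To make the ten sizes concrete, here is a minimal sketch of how size translates into credit burn. It assumes the commonly published rule for Standard warehouses (XS consumes 1 credit per hour and each larger size doubles that rate), which is not stated on the slide. The price per credit is a hypothetical placeholder, since the real figure depends on edition, region, and contract.

```python
# Warehouse sizes as listed on the slide, smallest to largest.
SIZES = ["XS", "S", "M", "L", "XL", "2XL", "3XL", "4XL", "5XL", "6XL"]


def credits_per_hour(size: str) -> int:
    """Credits per hour for a Standard warehouse.

    Assumption (not on the slide): XS burns 1 credit/hour and each
    larger size doubles the rate.
    """
    return 2 ** SIZES.index(size)


def hourly_cost(size: str, price_per_credit: float) -> float:
    """Hourly cost in USD for one running cluster of the given size."""
    return credits_per_hour(size) * price_per_credit


if __name__ == "__main__":
    ASSUMED_PRICE_PER_CREDIT = 3.00  # hypothetical; varies by edition, region, contract
    for size in SIZES:
        print(f"{size:>3}: {credits_per_hour(size):>3} credits/h, "
              f"${hourly_cost(size, ASSUMED_PRICE_PER_CREDIT):,.2f}/h")
```

Under this assumption the largest size costs 512 times as much per hour as the smallest, which is why right-sizing matters before any discount is negotiated.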

8. Pricing
• Watch For:
  – Concurrency and price-per-performance
  – Effective Warehouses (Multi-clusters)
  – Add-on compute:
    • Automatic Clustering
    • Materialized View Refreshes
    • Search Optimization
    • Query Acceleration
  – Time travel storage
• Discounts
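
The "watch for" items can be rolled into a rough monthly estimate. The sketch below is illustrative only: the class, field names, and every rate are hypothetical, and it simply treats multi-cluster use as a multiplier on warehouse credits, add-on features as extra credits, and Time Travel as extra storage.

```python
from dataclasses import dataclass, field


@dataclass
class MonthlyUsage:
    credits_per_hour: float            # set by warehouse size
    running_hours: float               # hours the warehouse is not suspended
    avg_active_clusters: float = 1.0   # multi-cluster "effective warehouses"
    addon_credits: dict = field(default_factory=dict)  # credits per add-on feature
    storage_tb: float = 0.0            # active data
    time_travel_tb: float = 0.0        # retained history billed as storage


def monthly_estimate(u: MonthlyUsage, price_per_credit: float,
                     price_per_tb_month: float) -> dict:
    """Break a month's bill into warehouse, add-on, and storage components."""
    warehouse = (u.credits_per_hour * u.running_hours
                 * u.avg_active_clusters * price_per_credit)
    addons = sum(u.addon_credits.values()) * price_per_credit
    storage = (u.storage_tb + u.time_travel_tb) * price_per_tb_month
    return {"warehouse": warehouse, "addons": addons, "storage": storage,
            "total": warehouse + addons + storage}


if __name__ == "__main__":
    usage = MonthlyUsage(
        credits_per_hour=8, running_hours=100, avg_active_clusters=2,
        addon_credits={"automatic_clustering": 100, "search_optimization": 100},
        storage_tb=8, time_travel_tb=2,
    )
    # Both rates below are placeholders, not quoted prices.
    print(monthly_estimate(usage, price_per_credit=2.0, price_per_tb_month=20.0))
```

Keeping the components separate makes it easier to see when add-on compute or multi-cluster scaling, rather than the base warehouse, is driving the bill.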

9. (A) Snowflake ML Stack

| Category | Product |
|---|---|
| Dedicated Compute | Snowflake |
| Storage | Snowflake |
| Data Integration | AWS Glue |
| Streaming | Kafka Confluent Cloud |
| Spark Analytics | Amazon EMR + Kinesis Spark |
| Data Lake | Snowflake External Tables |
| Business Intelligence | Tableau |
| Machine Learning | Amazon SageMaker |
| Identity Management | Amazon IAM |
| Data Catalog | Amazon Glue Data Catalog |

10. (A) Snowflake Machine Learning Stack
[Architecture diagram. Labels: Azure Kubernetes Services (AKS); Front-end; E-Commerce Website; Back-end; Cart; Profile; Products; Stock; Deployed Recommender ML Model; Training & Deployment; Automatic Model deployment; Databricks; Transactional Database; Cloud Firestore; Data Loading; Data Processing; Cloud Data Fusion; Snowflake; Data Transformation; Data Lake + Historical Data; Data Marts; Cloud Storage (data lake); MDM Database; Talend]
Data Governance:
• Partner Solutions
• Marketplace solutions

12. Performance Features
• Redshift Advisor
• Workload Management
• Concurrency Scaling
• Transparent Materialized Views
• Short Query Acceleration

13. Usability Features
• Redshift Spectrum (External Tables)
• Automated Materialized Views (AutoMV)
• Dynamic Data Masking
• Federated Queries
• Semi-Structured and SUPER Type
• Streaming Ingest with Kinesis
• Python UDF
• Redshift ML

14. Provisioned Clusters vs. Serverless

| | Provisioned | Serverless |
|---|---|---|
| Managed | Self managed | Fully managed |
| Compute | Choose node type and cluster size | Workgroup |
| Storage | Provisioned disk capacity | Namespace |
| WLM | User configured | Not applicable |
| Concurrent scaling | User enabled | Not applicable |
| Scale out/up/down | User-initiated cluster resize | Not applicable |
| Pause/resume | Manual | Automatic |
| Compute billing | Per second when not paused; $/hour rate | Per second when workloads run; RPU-hour rate |
| Storage billing | $ per managed storage amount | $ per GB-month used |

More detailed comparison: https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-console-comparison.html
  • 15. Cluster Sizes
      AWS Type | CPU/RAM | Node Range | Price Per Node
      dc2.large | 2 / 15 GB | 1 – 32 | $0.25
      dc2.8xlarge | 32 / 244 GB | 2 – 128 | $4.80
      ra3.xlplus | 4 / 32 GB | 1 – 32 | $1.09
      ra3.4xlarge | 12 / 96 GB | 2 – 32 | $3.26
      ra3.16xlarge | 48 / 384 GB | 2 – 128 | $13.04
      Serverless (Base & Max RPUs) | ? | 32 – 512 RPUs* | $0.36
      *Redshift Processing Units are available in units of 8 (32, 40, 48, and so on, up to 512)
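To see how the Cluster Sizes figures turn into a bill, here is a minimal Python sketch that multiplies node price by node count and running hours. It assumes the "Price Per Node" values are on-demand rates per node per hour and that the serverless $0.36 is per RPU-hour; the slide does not state either unit. The function names are hypothetical.

```python
# Figures copied from the Cluster Sizes slide.
# ASSUMPTION: each price is an on-demand rate per node per hour.
NODE_TYPES = {
    # node type: (min_nodes, max_nodes, price_per_node_hour)
    "dc2.large": (1, 32, 0.25),
    "dc2.8xlarge": (2, 128, 4.80),
    "ra3.xlplus": (1, 32, 1.09),
    "ra3.4xlarge": (2, 32, 3.26),
    "ra3.16xlarge": (2, 128, 13.04),
}

# ASSUMPTION: the serverless price on the slide is per RPU-hour.
SERVERLESS_PRICE_PER_RPU_HOUR = 0.36


def provisioned_cost(node_type: str, nodes: int, hours: float) -> float:
    """Compute cost of a provisioned cluster that runs for `hours` hours."""
    lo, hi, price = NODE_TYPES[node_type]
    if not lo <= nodes <= hi:
        raise ValueError(f"{node_type} supports {lo} to {hi} nodes, got {nodes}")
    return price * nodes * hours


def serverless_cost(rpus: int, active_hours: float) -> float:
    """Compute cost of a serverless workgroup for the hours it is active."""
    # The slide lists 32 to 512 RPUs in units of 8.
    if not 32 <= rpus <= 512 or rpus % 8 != 0:
        raise ValueError("RPUs must be a multiple of 8 between 32 and 512")
    return SERVERLESS_PRICE_PER_RPU_HOUR * rpus * active_hours
```

For example, `provisioned_cost("ra3.4xlarge", 2, 730)` estimates one month of a two-node cluster that is never paused, while `serverless_cost(32, 100)` estimates 100 active hours at the base capacity. Storage, concurrency scaling, and discounts are left out.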
  • 16. Pricing • Price-per-performance • Watch For: – Concurrency Scaling – Serverless RPU Usage – SageMaker costs for Redshift ML • Discounts
  • 17. Redshift ML Stack
      Category | Product
      Dedicated Compute | Amazon Redshift RA3
      Storage | Amazon Redshift Managed Storage
      Data Integration | AWS Glue
      Streaming | Amazon Kinesis Data Analytics
      Spark Analytics | Amazon EMR + Kinesis Spark
      Data Lake | Amazon Redshift Spectrum
      Business Intelligence | Amazon Quicksight
      Machine Learning | Amazon SageMaker
      Identity Management | Amazon IAM
      Data Catalog | AWS Glue Data Catalog
  • 18. AWS Machine Learning Stack • Amazon Elastic Kubernetes Service (Amazon EKS): Front-end (E-Commerce Website); Back-end (Cart, Profile, Products, Stock); Deployed Recommender • ML Model Training & Deployment: Amazon SageMaker; Automatic Model deployment; SageMaker model endpoint • Transactional Database: Amazon DynamoDB • Data Loading: AWS Glue • Data Processing: Amazon Redshift • Data Lake + Historical Data: S3 (data lake) • MDM Database: Talend • Data Governance: AWS Partner Solutions, AWS Marketplace solutions
  • 20. Performance Features • Workload Management • Estimated query plan (coming soon) • Transparent materialized views • Adaptive caching (recently used data on NVMe) • Azure Advisor
  • 21. Usability Features • Dynamic Data Masking • External Data Sources • Synapse Link • SynapseML
  • 22. Data Warehouse Units (DWU) • Official: “a collection of analytic resources…defined as a combination of CPU, memory, and IO…[which] represents an abstract, normalized measure of compute resources and performance.” • Increasing DWUs linearly improves performance • DWUs: 100, 200, 300, 400, 500, 1000, 1500, 2000, 2500, 3000, 5000, 6000, 7500, 10000, 15000, 30000
  • 23. Pricing
      DWUs | Price/hr
      100 | $1.20
      200 | $2.40
      300 | $3.60
      400 | $4.80
      500 | $6
      1000 | $12
      1500 | $18
      2000 | $24
      2500 | $30
      3000 | $36
      5000 | $60
      6000 | $72
      7500 | $90
      10000 | $120
      15000 | $180
      30000 | $360

      Component | Price
      Serverless | $5/TB processed
      Dedicated | $/hour (see DWU table)
      1-year Reserved | 37% discount
      3-year Reserved | 65% discount
      Storage | $23/TB-month
      • Additional charges (per vCore-hour) for Synapse Link, Data Explorer, and Spark Pools
      • Pipelines priced by DIU-hour, runtime-hour, and per activity run
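The DWU price list is linear at $1.20 per 100 DWU per hour, so the whole table reduces to one multiplication. The Python sketch below reproduces the listed hourly prices and applies the reserved discounts. It assumes a pool that runs around the clock for a year and that the discount applies directly to the hourly rate; the slide states neither. The function names are hypothetical.

```python
# DWU levels and prices copied from the Synapse pricing slide.
DWU_LEVELS = (100, 200, 300, 400, 500, 1000, 1500, 2000, 2500,
              3000, 5000, 6000, 7500, 10000, 15000, 30000)
PRICE_PER_100_DWU_HOUR = 1.20   # every row in the table is a multiple of this
SERVERLESS_PRICE_PER_TB = 5.00  # "$5/TB processed"
RESERVED_DISCOUNT = {"none": 0.0, "1-year": 0.37, "3-year": 0.65}
HOURS_PER_YEAR = 8760           # ASSUMPTION: the pool is never paused


def dedicated_hourly_price(dwu: int) -> float:
    """Hourly price of a dedicated pool at one of the listed DWU levels."""
    if dwu not in DWU_LEVELS:
        raise ValueError(f"{dwu} is not a listed DWU level")
    return round(dwu / 100 * PRICE_PER_100_DWU_HOUR, 2)


def dedicated_annual_cost(dwu: int, term: str = "none") -> float:
    """Yearly compute cost, optionally with a reserved-capacity discount.

    ASSUMPTION: the discount is taken straight off the hourly rate.
    """
    discount = RESERVED_DISCOUNT[term]
    return dedicated_hourly_price(dwu) * HOURS_PER_YEAR * (1 - discount)


def serverless_cost(tb_processed: float) -> float:
    """Serverless cost for a given volume of data processed."""
    return SERVERLESS_PRICE_PER_TB * tb_processed
```

Comparing `dedicated_annual_cost(1000)` with `dedicated_annual_cost(1000, "3-year")` shows how much of the compute bill depends on the reservation term. Storage, Spark pools, and pipeline charges are not modeled.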
  • 24. Microsoft Synapse ML Stack
      Category | Product
      Dedicated Compute | Azure Synapse Analytics Workspace
      Storage | Azure Synapse Analytics SQL Pool
      Data Integration | Azure Data Factory (ADF)
      Streaming | Azure Stream Analytics (for Analytics) and Azure Event Hubs
      Spark Analytics | Big Data Analytics with Apache Spark
      Data Lake | Amazon Redshift Spectrum
      Business Intelligence | Amazon Quicksight
      Machine Learning | Amazon SageMaker
      Identity Management | Amazon IAM
      Data Catalog | Microsoft Purview
  • 25. Azure Machine Learning Stack • Azure Kubernetes Services (AKS): Front-end (E-Commerce Website); Back-end (Cart, Profile, Products, Stock); Deployed Recommender • ML Model Runtime: Azure ML managed online endpoint • Azure Machine Learning • Transactional Database: Azure Cosmos DB Core API • Analytical Store (HTAP): Azure Cosmos DB Analytical Store (Parquet) • Cognitive Services: Sentiment analysis on product reviews to enhance the recommender model • Synapse Link: Enables automatic sync to analytical store (no ETL) • Data Processing: Azure Synapse Analytics • Data Lake + Historical Data: ADL Gen2 Data Lake (HTAP data, sentiment data, historical order data) • Automatic Model deployment (MLOps) • Data Transformation & ML Model Training: Azure Databricks, Delta Live Tables, SparkML • Data Management & Governance: Microsoft Purview (discover, classify, track lineage, and protect sensitive data such as customer profiles) • MDM Database: Talend
  • 27. Performance Features • BQ Architecture and Slots • Clustering and Partitioning • Transparent Materialized Views • BI Engine
  • 28. Usability Features • BigQuery Omni – External Tables • Time Travel • Migration Service – SQL Translation • Looker Studio • Colab Notebooks • BigQuery ML
  • 29. Pricing
      Plan | Compute | BigQuery Omni
      On-demand | $5 per TB | $5 per TB
      Flex | $4.00/hr per 100 slots | $5.00/hr per 100 slots
      Monthly Commit* | $2.74/hr per 100 slots | $3.42/hr per 100 slots
      Annual Commit* | $2.33/hr per 100 slots | $2.91/hr per 100 slots
      BI Engine | $0.0416/hr per GB | N/A

      Storage (1) | Logical (2) | Physical (3)
      Active | $0.02/GB-month | $0.04/GB-month
      Long-term (4) | $0.01/GB-month | $0.02/GB-month

      Batch loading | FREE
      Streaming inserts | $0.01 per 200MB
      Storage API | $0.025 per 1GB

      (1) You get to choose logical or physical billing
      (2) Logical = Uncompressed size (Time travel free)
      (3) Physical = Compressed size + Time travel
      (4) Table not modified in 90 days
      *comes with some free BI Engine
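One way to read the compute rows is as a break-even question: how many terabytes must a workload scan before a slot commitment becomes cheaper than on-demand? The Python sketch below works that out from the slide's native BigQuery prices. It assumes slots are bought in blocks of 100 (the unit the slide prices) and are billed for every hour in the period. The function names are hypothetical.

```python
# Native BigQuery compute prices copied from the pricing slide.
ON_DEMAND_PRICE_PER_TB = 5.00
SLOT_PRICE_PER_100_SLOT_HOUR = {
    "flex": 4.00,
    "monthly": 2.74,
    "annual": 2.33,
}


def on_demand_cost(tb_scanned: float) -> float:
    """On-demand cost for a given number of terabytes scanned."""
    return ON_DEMAND_PRICE_PER_TB * tb_scanned


def slot_cost(plan: str, slots: int, hours: float) -> float:
    """Cost of holding `slots` slots for `hours` hours under a slot plan.

    ASSUMPTION: slots are purchased in blocks of 100, matching the
    unit used on the slide.
    """
    if slots <= 0 or slots % 100 != 0:
        raise ValueError("slots must be a positive multiple of 100")
    return SLOT_PRICE_PER_100_SLOT_HOUR[plan] * (slots / 100) * hours


def breakeven_tb(plan: str, slots: int, hours: float) -> float:
    """Terabytes scanned at which on-demand cost equals the slot cost."""
    return slot_cost(plan, slots, hours) / ON_DEMAND_PRICE_PER_TB
```

For instance, `breakeven_tb("annual", 100, 730)` gives the monthly scan volume above which 100 annually committed slots cost less than on-demand, provided 100 slots can actually serve that workload. The sketch says nothing about query performance under either model.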
  • 30. Google BigQuery ML Stack
      Category | Product
      Dedicated Compute | Google BigQuery
      Storage | Google BigQuery Storage
      Data Integration | Google Dataflow (Batch)
      Streaming | Google Dataflow (Streaming)
      Spark Analytics | Google Dataproc
      Data Lake | Google BigQuery On-Demand Infrastructure
      Business Intelligence | Google BigQuery BI Engine
      Machine Learning | Google BigQuery ML
      Identity Management | Google Cloud IAM
      Data Catalog | Google Data Catalog
  • 31. Google Machine Learning Stack • Azure Kubernetes Services (AKS): Front-end (E-Commerce Website); Back-end (Cart, Profile, Products, Stock); Deployed Recommender • ML Model Training & Deployment: Vertex AI; Automatic Model deployment; Vertex AI Prediction • Data Governance: Google Dataplex • Transactional Database: Cloud Firestore • Data Loading: Cloud Data Fusion • Data Processing: BigQuery • Data Transformation: Cloud Dataprep, Cloud Dataflow • Data Lake + Historical Data: Cloud Storage (data lake) • MDM Database: Talend
  • 33. Sample Stack Cost Breakout
  • 34. Line Item Pricing (AWS)
      Lookup | CostCenter | Category | Platform | Product | Size | UnitNode
      Amazon Redshift ra3.4xlarge-Infrastructure | Infrastructure | 01-Dedicated Compute | AWS | Amazon Redshift ra3.4xlarge | 1-Medium | ra3.4xlarge
      Amazon Redshift ra3.16xlarge-Infrastructure | Infrastructure | 01-Dedicated Compute | AWS | Amazon Redshift ra3.16xlarge | 2-Large | ra3.16xlarge
      Amazon Redshift Managed Storage-Storage | Storage | 02-Storage | AWS | Amazon Redshift Managed Storage | 1-Medium | GB-month
      Amazon Redshift Managed Storage-Storage | Storage | 02-Storage | AWS | Amazon Redshift Managed Storage | 2-Large | GB-month
      AWS Glue-Software | Software | 03-Data Integration | AWS | AWS Glue | 1-Medium | DPU-Hour
      AWS Glue-Software | Software | 03-Data Integration | AWS | AWS Glue | 2-Large | DPU-Hour
      Amazon Kinesis Data Analytics-Infrastructure | Infrastructure | 04-Streaming | AWS | Amazon Kinesis Data Analytics | 1-Medium | KPU-Hour
      Amazon Kinesis Data Analytics-Infrastructure | Infrastructure | 04-Streaming | AWS | Amazon Kinesis Data Analytics | 2-Large | KPU-Hour
      Amazon Kinesis Data Analytics-Storage | Storage | 04-Streaming | AWS | Amazon Kinesis Data Analytics | 1-Medium | GB-month
      Amazon Kinesis Data Analytics-Storage | Storage | 04-Streaming | AWS | Amazon Kinesis Data Analytics | 2-Large | GB-month
      Amazon EMR-Infrastructure | Infrastructure | 05-Spark Analytics | AWS | Amazon EMR | 1-Medium | r5.4xlarge
      Amazon EMR-Software | Software | 05-Spark Analytics | AWS | Amazon EMR | 1-Medium | EMR on r5.4xlarge
      Amazon EMR-Infrastructure | Infrastructure | 05-Spark Analytics | AWS | Amazon EMR | 2-Large | r5.4xlarge
      Amazon EMR-Software | Software | 05-Spark Analytics | AWS | Amazon EMR | 2-Large | EMR on r5.4xlarge
      Amazon Kinesis-Shards | Shards | 05-Spark Analytics | AWS | Amazon Kinesis | 1-Medium | Shard-hour
      Amazon Kinesis-Shards | Shards | 05-Spark Analytics | AWS | Amazon Kinesis | 2-Large | Shard-hour
      Amazon Redshift Spectrum-Software | Software | 06-Data Exploration | AWS | Amazon Redshift Spectrum | 1-Medium | TB-month
      Amazon Redshift Spectrum-Software | Software | 06-Data Exploration | AWS | Amazon Redshift Spectrum | 2-Large | TB-month
      Amazon Redshift ra3.4xlarge-Infrastructure | Infrastructure | 06-Data Exploration | AWS | Amazon Redshift ra3.4xlarge | 1-Medium | ra3.4xlarge
      Amazon Redshift ra3.4xlarge-Infrastructure | Infrastructure | 06-Data Exploration | AWS | Amazon Redshift ra3.4xlarge | 2-Large | ra3.4xlarge
      Amazon EMR-Infrastructure | Infrastructure | 07-Data Lake | AWS | Amazon EMR | 1-Medium | r5.4xlarge
      Amazon EMR-Software | Software | 07-Data Lake | AWS | Amazon EMR | 1-Medium | EMR on r5.4xlarge
      Amazon EMR-Infrastructure | Infrastructure | 07-Data Lake | AWS | Amazon EMR | 2-Large | r5.4xlarge
      Amazon EMR-Software | Software | 07-Data Lake | AWS | Amazon EMR | 2-Large | EMR on r5.4xlarge
      Amazon Quicksight Readers-Licenses | Licenses | 08-Business Intelligence | AWS | Amazon Quicksight Readers | 1-Medium | User-month
      Amazon Quicksight Readers-Licenses | Licenses | 08-Business Intelligence | AWS | Amazon Quicksight Readers | 2-Large | User-month
      Amazon Quicksight Authors-Licenses | Licenses | 08-Business Intelligence | AWS | Amazon Quicksight Authors | 1-Medium | User-month
      Amazon Quicksight Authors-Licenses | Licenses | 08-Business Intelligence | AWS | Amazon Quicksight Authors | 2-Large | User-month
      Amazon SageMaker-Infrastructure | Infrastructure | 09-Machine Learning | AWS | Amazon SageMaker | 1-Medium | ml.r5.2xlarge
      Amazon SageMaker-Software | Software | 09-Machine Learning | AWS | Amazon SageMaker | 1-Medium | ml.r5.2xlarge
      Amazon SageMaker-Infrastructure | Infrastructure | 09-Machine Learning | AWS | Amazon SageMaker | 2-Large | ml.r5.2xlarge
      Amazon SageMaker-Software | Software | 09-Machine Learning | AWS | Amazon SageMaker | 2-Large | ml.r5.2xlarge
      Amazon IAM-Licenses | Licenses | 10-Identity Management | AWS | Amazon IAM | 1-Medium | Included
      Amazon IAM-Licenses | Licenses | 10-Identity Management | AWS | Amazon IAM | 2-Large | Included
      AWS Glue Data Catalog-Software | Software | 11-Data Catalog | AWS | AWS Glue Data Catalog | 1-Medium | 100K objects
      AWS Glue Data Catalog-Software | Software | 11-Data Catalog | AWS | AWS Glue Data Catalog | 2-Large | 100K objects
  • 35. Stack Cost by Use Case for Medium-Sized Enterprises • 1st Year of Project • 1st Large Scale ML Project • $1.3M – $3.2M
  • 36. Stack Cost by Use Case for Large-Sized Enterprises • 1st Year of Project • 1st Large Scale ML Project • $3.4M – $8.5M
  • 37. Project ROI & TCO • ROI = Benefit / TCO • TCO = Infrastructure + Software + FTE + Consulting
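The ROI slide can be read as a ratio of benefit to total cost of ownership, with TCO as the sum of infrastructure, software, FTE, and consulting costs. The Python sketch below encodes that reading. The class and function names are hypothetical, and treating ROI as a plain ratio (rather than net benefit over cost) is an interpretation of the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TCO:
    """Cost components named on the slide, all in the same currency."""
    infrastructure: float
    software: float
    fte: float         # internal labor
    consulting: float  # external labor

    @property
    def total(self) -> float:
        return self.infrastructure + self.software + self.fte + self.consulting


def roi(benefit: float, tco: TCO) -> float:
    """ROI as benefit divided by total cost of ownership.

    ASSUMPTION: a plain ratio, as the slide appears to show. A value
    above 1.0 means the benefit exceeds the cost.
    """
    if tco.total <= 0:
        raise ValueError("TCO must be positive")
    return benefit / tco.total
```

The stack costs quoted on the preceding slides correspond only to the infrastructure and software terms; the summary notes that labor is an additional expense, which is why `fte` and `consulting` are kept as separate fields.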
  • 38. Summary • For large-sized enterprise projects, the stack cost typically ranges between $3.4M-$8.5M to ensure successful deployment of ML-based projects into production, in addition to labor expenses. • The total cost of ownership of cloud analytics platforms scales up as the demand for analytics at your company grows over time. • Snowflake adopts a usage-based or consumption-based pricing model, where users are charged based on the amount of data processed, resulting in higher costs for higher usage levels. • Redshift offers both provisioned clusters and serverless options to cater to different business requirements. • Synapse is available for purchase in DWU, which comprises a collection of analytic resources that can be adjusted to meet the specific needs of the organization. • BigQuery slots operate as virtual CPUs to ensure efficient data processing and analysis. • While there are numerous technology stacks available, the ones mentioned here are just a few examples. • Dedicated Compute, Storage, Data Integration, Streaming, Spark Analytics, Data Lake, Business Intelligence, Machine Learning, Identity Management, and Data Catalog are all essential components of a modern data management and analytics ecosystem. • Estimating the costs of building a technology stack can be a complex task and requires careful consideration of various factors. • It is recommended to seek reliable performance at a predictable price to ensure the successful implementation of data management and analytics projects. • The true measure of project efficacy is Return on Investment (ROI), and organizations should strive to achieve positive ROI in their data management and analytics endeavors.