BigQuery best practices and recommendations to reduce costs with BI Engine, Slots, Materialized Views

Márton Kodok, Google Developer Expert / Senior Software Engineer / Team leader / Mentor at Reea
BigQuery best practices and recommendations to reduce costs with BI Engine, Slots, Materialized Views
Devfest Nantes, October 2022
Márton Kodok
Google Developer Expert at REEA.net
● Among the top 3 Romanians on Stack Overflow (201k reputation)
● Google Developer Expert on Cloud technologies (2016→)
● Champion of Google Cloud Innovators program (2021→)
● Crafting Web/Mobile backends at REEA.net
Articles: martonkodok.medium.com
Twitter: @martonkodok
Slideshare: martonkodok
StackOverflow: pentium10
GitHub: pentium10
Reduce BigQuery bills with BI Engine capacity orchestration @martonkodok
About me
1. Looking at a BigQuery billing report
2. What is BI Engine?
3. Obtaining per job billing stats
4. Enable and use BI Engine reservations
5. Using Cloud Workflows to orchestrate the right capacity
6. Lower bills and faster queries on Data Studio, BigQuery
7. Conclusions, articles
Agenda
Looking at a BigQuery billing report
Reduce BigQuery bills with BI Engine capacity orchestration
Article: https://medium.com/p/9e2634c84a82 @martonkodok
Cloud Workflows automating the BI Engine capacity size
Part #2
What is BI Engine?
“BI Engine is a fast, in-memory analysis service that integrates out of the box with BigQuery, Data Studio, Looker, Tableau, and Power BI.”
BI Engine architecture
1. It's a cache plugin for BigQuery: a managed, distributed, in-memory execution engine.
2. BI Engine reservations manage the memory allocation at the project billing level.
3. It caches only the columns and partitions that are queried or scanned. It does not cache the whole table.
4. Any BI solution or custom application that works with the BigQuery API, such as REST or the JDBC and ODBC drivers, can use BI Engine without any changes.
What does out-of-the-box mean?
Free for all: 1 TB free each month
On-demand: queries $5/TB, storage $20/TB
Flat-rate reservation slots: average $4 per hour; best is $1700 for 100 slots (yearly plan)
BI Engine: $0.0416 per GB/hour ($30.36 per GB/month)
BigQuery ML is excluded from this table.
Cost components in BigQuery and BI Engine
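The prices above can be turned into a quick break-even check. The sketch below is only illustrative: it takes the slide's $0.0416 per GB/hour and $5/TB figures as inputs and assumes a 730-hour month (an assumption, chosen because it roughly reproduces the $30.36 per GB/month figure). The function names are hypothetical.

```python
# Prices quoted on the slide; treat them as inputs, not as current list prices.
ON_DEMAND_USD_PER_TB = 5.0
BI_ENGINE_USD_PER_GB_HOUR = 0.0416
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def bi_engine_monthly_cost(capacity_gb: float) -> float:
    """Monthly cost in USD of a BI Engine reservation of the given size."""
    return capacity_gb * BI_ENGINE_USD_PER_GB_HOUR * HOURS_PER_MONTH


def break_even_tb(capacity_gb: float) -> float:
    """TB of on-demand scans BI Engine must absorb per month to pay for itself."""
    return bi_engine_monthly_cost(capacity_gb) / ON_DEMAND_USD_PER_TB


if __name__ == "__main__":
    for gb in (1, 5, 10):
        print(
            f"{gb} GB: ${bi_engine_monthly_cost(gb):.2f}/month, "
            f"break-even at {break_even_tb(gb):.2f} TB of avoided scans"
        )
```

Under these assumptions, each GB of reserved capacity has to avoid roughly 6 TB of on-demand scanning per month before it starts saving money.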
Part #3
Orchestrating the capacity size
“The aim is to dynamically adjust the size of the BI Engine reservation to get the lowest combined cost of BigQuery and BI Engine.”
1. Obtain the cost of your on-demand BigQuery usage
2. Set the BI Engine capacity in steps
3. Have a real-time sense of the savings to drive capacity automation up/down
4. Monitor the applied settings for optimal savings
Prerequisite:
Access to INFORMATION_SCHEMA or audit logs exported to BigQuery (better for longer history)
Biggest challenges
The query to get the recent costs for each job
1. The query uses a flat rate of 5 USD per TB to calculate the cost
2. At this point no optimization is in place, as the two columns (cost by processed bytes and cost by billed bytes) are the same
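The query itself is shown as an image in the slides, so the following is only a sketch of the calculation it describes. The SQL text is a hypothetical reconstruction (the `region-us` qualifier and the one-day window are assumptions), and the Python helpers `cost_usd` and `job_cost` are illustrative names that apply the flat 5 USD per TB rate to the processed and billed byte counts of one job row.

```python
from dataclasses import dataclass

BYTES_PER_TB = 1024 ** 4
ON_DEMAND_USD_PER_TB = 5.0  # flat rate used on the slide

# Hypothetical query text: region qualifier and time window are assumptions.
RECENT_JOBS_SQL = """
SELECT job_id, creation_time, total_bytes_processed, total_bytes_billed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE job_type = 'QUERY'
  AND creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
"""


@dataclass
class JobCost:
    job_id: str
    processed_usd: float  # what the job would cost by bytes processed
    billed_usd: float     # what the job costs by bytes actually billed

    @property
    def saved_usd(self) -> float:
        return self.processed_usd - self.billed_usd


def cost_usd(num_bytes) -> float:
    """Convert a byte count to USD at the flat on-demand rate (None counts as 0)."""
    return (num_bytes or 0) / BYTES_PER_TB * ON_DEMAND_USD_PER_TB


def job_cost(row: dict) -> JobCost:
    """Build the two cost columns for one row returned by the query."""
    return JobCost(
        job_id=row["job_id"],
        processed_usd=cost_usd(row.get("total_bytes_processed")),
        billed_usd=cost_usd(row.get("total_bytes_billed")),
    )


if __name__ == "__main__":
    row = {"job_id": "job_a", "total_bytes_processed": 2 * BYTES_PER_TB,
           "total_bytes_billed": 2 * BYTES_PER_TB}
    print(job_cost(row))  # both columns equal: no acceleration yet
```

Without BI Engine the two columns match and `saved_usd` is zero; once queries are accelerated, billed bytes drop below processed bytes and the difference is the saving.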
BigQuery savings based on billed vs processed bytes
BigQuery savings when BI Engine is properly sized
1. A BI Engine capacity resize needs 5 minutes to propagate
2. Savings show up as billed bytes being lower than processed bytes
Part #4
Optimize BI Engine effectiveness
1. BI Engine capacity might be too small
2. BigQuery queries are too complex
BI Engine turned on, but ineffective
“Not all BigQuery queries are accelerated.”
1. Detailed statistics on BI Engine are available through the job statistics API
2. Use the bq command-line tool to fetch job statistics
Acceleration statistics
1. bq show --format=prettyjson -j job_id
"statistics": {
  "creationTime": "1602175128902",
  "endTime": "1602175130700",
  "query": {
    "biEngineStatistics": {
      "biEngineMode": "DISABLED",
      "biEngineReasons": [
        {
          "code": "UNSUPPORTED_SQL_TEXT",
          "message": "Detected unsupported join type"
        }
      ]
    },
Acceleration statistics
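A small helper can pull the acceleration mode and the reasons out of that JSON. This is a minimal sketch using only the field names visible in the excerpt above; the function name `bi_engine_summary` and the `"UNKNOWN"` fallback are hypothetical, and the sample is the slide's excerpt wrapped in braces so that it parses.

```python
import json


def bi_engine_summary(job: dict) -> tuple[str, list[str]]:
    """Return (mode, reasons) from the JSON printed by `bq show -j`."""
    stats = job.get("statistics", {}).get("query", {}).get("biEngineStatistics", {})
    mode = stats.get("biEngineMode", "UNKNOWN")  # "UNKNOWN" is our own placeholder
    reasons = [
        f'{reason.get("code")}: {reason.get("message")}'
        for reason in stats.get("biEngineReasons", [])
    ]
    return mode, reasons


# The excerpt from the slide, closed off so it is valid JSON.
SAMPLE = json.loads("""
{
  "statistics": {
    "creationTime": "1602175128902",
    "endTime": "1602175130700",
    "query": {
      "biEngineStatistics": {
        "biEngineMode": "DISABLED",
        "biEngineReasons": [
          {"code": "UNSUPPORTED_SQL_TEXT",
           "message": "Detected unsupported join type"}
        ]
      }
    }
  }
}
""")

if __name__ == "__main__":
    mode, reasons = bi_engine_summary(SAMPLE)
    print(mode)
    for reason in reasons:
        print(" -", reason)
```

In practice the input would come from saving the `bq show --format=prettyjson -j job_id` output to a file and loading it with `json.load`.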
Use INFORMATION_SCHEMA to get acceleration statistics
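The slide shows the query as an image, so what follows is a hedged sketch rather than the author's query. The SQL text assumes the `bi_engine_statistics.bi_engine_mode` field of the jobs view, a `region-us` qualifier and a one-day window; `mode_shares` is a hypothetical helper that summarizes how many recent jobs ran in each mode, and `"NONE"` is our own label for jobs without BI Engine statistics.

```python
from collections import Counter

# Hypothetical query text: region qualifier and time window are assumptions.
ACCELERATION_SQL = """
SELECT job_id, bi_engine_statistics.bi_engine_mode AS bi_engine_mode
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE job_type = 'QUERY'
  AND creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
"""


def mode_shares(rows) -> dict:
    """Share of jobs per BI Engine mode; rows are dicts with a 'bi_engine_mode' key."""
    counts = Counter(row.get("bi_engine_mode") or "NONE" for row in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {mode: count / total for mode, count in counts.items()}


if __name__ == "__main__":
    rows = [
        {"job_id": "a", "bi_engine_mode": "FULL"},
        {"job_id": "b", "bi_engine_mode": "FULL"},
        {"job_id": "c", "bi_engine_mode": "PARTIAL"},
        {"job_id": "d", "bi_engine_mode": None},
    ]
    print(mode_shares(rows))
```

A high share of disabled or partial jobs is the signal to look at the reasons per job before adding more capacity.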
1. Investigate queries that have BI Engine acceleration reported as disabled or partial
2. Rewrite queries to perform better under the BI Engine optimizer
3. Use materialized views to join and flatten data to optimize their structure for BI Engine
4. Create short-lived (5m, 15m, 1h) temporary tables to improve caching efficiency
5. Increase the size of the BI Engine reservation until it is used effectively
6. Use Cloud Workflows and business logic to automate the size based on workload during the day
To have effective BI Engine acceleration
Leverage temporary, dedicated business scope tables
Dedicated table for scope
Use a scheduler to recreate it every 5m/15m/1h
Leverage clustering
Use Materialized Views to get latest rows from append-only tables
Trick to get the latest row using Materialized Views
2nd view to get rid of the arrays
Part #5
Orchestrating the capacity size
Cloud Workflows automating the BI Engine capacity size
1. Read the output of the billed vs processed bytes effectiveness query
2. Based on the benefit margin, map the step of the increase, e.g. 5 GB step, 1 GB step, 0.5 GB step
3. Do the math of the evaluation: how far can you stretch the BI Engine capacity and still keep the benefits
4. Map capacity to the time of day: more capacity over office hours, lower capacity during the night
5. Leverage BigQuery ML to write a time-series forecast based on historical data to drive the best BI Engine capacity for each “hour slot”
6. Stop increasing the capacity when the extra capacity costs more than the savings it brings
Cloud Workflow automation logic
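The decision logic in the list above can be sketched as a pure function, independent of how Cloud Workflows calls it. This is an illustration under stated assumptions, not the workflow from the article: the margin thresholds in `STEP_BY_MARGIN`, the 0.5 GB step down, and the `min_gb`/`max_gb` bounds are all hypothetical, and only the step sizes (5 GB, 1 GB, 0.5 GB) and the $0.0416 per GB/hour price come from the slides.

```python
BI_ENGINE_USD_PER_GB_HOUR = 0.0416  # price quoted in the slides

# Hypothetical mapping: (minimum hourly net benefit in USD, step in GB).
STEP_BY_MARGIN = [(5.0, 5.0), (1.0, 1.0), (0.0, 0.5)]


def net_benefit_per_hour(saved_usd_per_hour: float, capacity_gb: float) -> float:
    """On-demand savings minus the hourly cost of the reservation."""
    return saved_usd_per_hour - capacity_gb * BI_ENGINE_USD_PER_GB_HOUR


def next_capacity_gb(capacity_gb: float, saved_usd_per_hour: float,
                     min_gb: float = 1.0, max_gb: float = 100.0) -> float:
    """Pick the next reservation size from the current savings."""
    margin = net_benefit_per_hour(saved_usd_per_hour, capacity_gb)
    if margin < 0:
        # The reservation costs more than it saves: stop growing and step down.
        return max(min_gb, capacity_gb - 0.5)
    for threshold, step in STEP_BY_MARGIN:
        if margin >= threshold:
            # Bigger margin, bigger step; never exceed the configured ceiling.
            return min(max_gb, capacity_gb + step)
    return capacity_gb


if __name__ == "__main__":
    for saved in (10.0, 2.0, 0.5, 0.1):
        print(f"saved ${saved}/h at 10 GB -> {next_capacity_gb(10.0, saved)} GB")
```

A workflow run would feed `saved_usd_per_hour` from the billed vs processed bytes query, apply the returned size, and wait for the resize to propagate before measuring again. Office-hours mapping or a forecast could replace the fixed `max_gb` ceiling per hour slot.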
Data Studio aspects
1. Accelerated by BigQuery Engine icon
2. Faster dashboards
Cloud Monitoring
1. Create a chart plotting bigquerybiengine.googleapis.com/reservation/used_bytes over bigquerybiengine.googleapis.com/reservation/total_bytes
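The ratio behind that chart is simple to compute once the two metric values are at hand. The sketch below assumes the values have already been read from Cloud Monitoring; the function names and the 0.5 and 0.9 thresholds are hypothetical and only illustrate how the ratio could feed a sizing decision.

```python
def utilization(used_bytes: float, total_bytes: float) -> float:
    """Fraction of the BI Engine reservation in use (0.0 when nothing is reserved)."""
    if total_bytes <= 0:
        return 0.0
    return used_bytes / total_bytes


def sizing_hint(used_bytes: float, total_bytes: float,
                low: float = 0.5, high: float = 0.9) -> str:
    """Rough hint from utilization; the thresholds are illustrative assumptions."""
    ratio = utilization(used_bytes, total_bytes)
    if ratio >= high:
        return "nearly full: consider a larger reservation"
    if ratio < low:
        return "mostly idle: consider a smaller reservation"
    return "in range"


if __name__ == "__main__":
    gib = 2 ** 30
    print(utilization(3 * gib, 4 * gib), sizing_hint(3 * gib, 4 * gib))
```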
Article on medium.com
https://medium.com/p/9e2634c84a82
1. An easy, out-of-the-box way to optimize BigQuery costs is turning on BI Engine, which does not need code changes.
2. Leverage INFORMATION_SCHEMA stats to see underperforming queries, and try to optimize them.
3. Automate the right capacity size by using Cloud Workflows
4. Save precious development time, lower bills, faster queries
Conclusions
Thank you. Q&A.
Slides available on:
slideshare.net/martonkodok
Reea.net - Integrated web solutions driven by creativity
to deliver projects.
Twitter: @martonkodok
1 of 38

Recommended

Google BigQuery Best Practices by
Google BigQuery Best PracticesGoogle BigQuery Best Practices
Google BigQuery Best PracticesMatillion
1.5K views15 slides
Big query by
Big queryBig query
Big queryTanvi Parikh
3.3K views29 slides
What Is Power BI? | Introduction To Microsoft Power BI | Power BI Training | ... by
What Is Power BI? | Introduction To Microsoft Power BI | Power BI Training | ...What Is Power BI? | Introduction To Microsoft Power BI | Power BI Training | ...
What Is Power BI? | Introduction To Microsoft Power BI | Power BI Training | ...Edureka!
1.7K views22 slides
Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스... by
Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...
Amazon SageMaker 모델 빌딩 파이프라인 소개::이유동, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스...Amazon Web Services Korea
716 views40 slides
Microsoft power bi by
Microsoft power biMicrosoft power bi
Microsoft power bitechpro360
1.5K views6 slides
Power bi by
Power biPower bi
Power bijainema23
6.1K views30 slides

More Related Content

What's hot

Power BI Governance - Access Management, Recommendations and Best Practices by
Power BI Governance - Access Management, Recommendations and Best PracticesPower BI Governance - Access Management, Recommendations and Best Practices
Power BI Governance - Access Management, Recommendations and Best PracticesLearning SharePoint
1.2K views10 slides
Power bi (1)Power BI Online Training Hyderabad | power bi online training ben... by
Power bi (1)Power BI Online Training Hyderabad | power bi online training ben...Power bi (1)Power BI Online Training Hyderabad | power bi online training ben...
Power bi (1)Power BI Online Training Hyderabad | power bi online training ben...Big IT Trainings
375 views9 slides
SPS-Power BI Introduction by
SPS-Power BI IntroductionSPS-Power BI Introduction
SPS-Power BI IntroductionKerry Dirks MCPS MS
928 views65 slides
Power bi by
Power biPower bi
Power biLakshmi Prasanna Kottagorla
1K views17 slides
GPPB2020 - Milan - Power BI dataflows deep dive by
GPPB2020 - Milan - Power BI dataflows deep diveGPPB2020 - Milan - Power BI dataflows deep dive
GPPB2020 - Milan - Power BI dataflows deep diveRiccardo Perico
128 views44 slides
Power BI Full Course | Power BI Tutorial for Beginners | Edureka by
Power BI Full Course | Power BI Tutorial for Beginners | EdurekaPower BI Full Course | Power BI Tutorial for Beginners | Edureka
Power BI Full Course | Power BI Tutorial for Beginners | EdurekaEdureka!
3.6K views60 slides

What's hot(20)

Power BI Governance - Access Management, Recommendations and Best Practices by Learning SharePoint
Power BI Governance - Access Management, Recommendations and Best PracticesPower BI Governance - Access Management, Recommendations and Best Practices
Power BI Governance - Access Management, Recommendations and Best Practices
Learning SharePoint1.2K views
Power bi (1)Power BI Online Training Hyderabad | power bi online training ben... by Big IT Trainings
Power bi (1)Power BI Online Training Hyderabad | power bi online training ben...Power bi (1)Power BI Online Training Hyderabad | power bi online training ben...
Power bi (1)Power BI Online Training Hyderabad | power bi online training ben...
Big IT Trainings375 views
GPPB2020 - Milan - Power BI dataflows deep dive by Riccardo Perico
GPPB2020 - Milan - Power BI dataflows deep diveGPPB2020 - Milan - Power BI dataflows deep dive
GPPB2020 - Milan - Power BI dataflows deep dive
Riccardo Perico128 views
Power BI Full Course | Power BI Tutorial for Beginners | Edureka by Edureka!
Power BI Full Course | Power BI Tutorial for Beginners | EdurekaPower BI Full Course | Power BI Tutorial for Beginners | Edureka
Power BI Full Course | Power BI Tutorial for Beginners | Edureka
Edureka!3.6K views
Power BI Desktop | Power BI Tutorial | Power BI Training | Edureka by Edureka!
Power BI Desktop | Power BI Tutorial | Power BI Training | EdurekaPower BI Desktop | Power BI Tutorial | Power BI Training | Edureka
Power BI Desktop | Power BI Tutorial | Power BI Training | Edureka
Edureka!1.9K views
Power bi introduction by Bishwadeb Dey
Power bi introductionPower bi introduction
Power bi introduction
Bishwadeb Dey1.4K views
Best Practices For Workflow by Timothy Spann
Best Practices For WorkflowBest Practices For Workflow
Best Practices For Workflow
Timothy Spann89 views
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW... by Amazon Web Services
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Databases on AWS - The right tool for the right job - ADB203 - Santa Clara AW...
Amazon Web Services1.1K views
Power BI Charts Tutorial | Counter Strike Data Analysis using Power BI | Powe... by Edureka!
Power BI Charts Tutorial | Counter Strike Data Analysis using Power BI | Powe...Power BI Charts Tutorial | Counter Strike Data Analysis using Power BI | Powe...
Power BI Charts Tutorial | Counter Strike Data Analysis using Power BI | Powe...
Edureka!652 views
Microsoft Power BI | Brief Introduction | PPT by Sophia Smith
Microsoft Power BI | Brief Introduction | PPTMicrosoft Power BI | Brief Introduction | PPT
Microsoft Power BI | Brief Introduction | PPT
Sophia Smith3.2K views
Google BigQuery for Everyday Developer by Márton Kodok
Google BigQuery for Everyday DeveloperGoogle BigQuery for Everyday Developer
Google BigQuery for Everyday Developer
Márton Kodok1.4K views
AWS Initiate Brasil 2021 - Vantagens econômicas da nuvem - Andre Serafim by Amazon Web Services LATAM
AWS Initiate Brasil 2021 - Vantagens econômicas da nuvem - Andre SerafimAWS Initiate Brasil 2021 - Vantagens econômicas da nuvem - Andre Serafim
AWS Initiate Brasil 2021 - Vantagens econômicas da nuvem - Andre Serafim
bigquery.pptx by Harissh16
bigquery.pptxbigquery.pptx
bigquery.pptx
Harissh16420 views
Intro for Power BI by Martin X
Intro for Power BIIntro for Power BI
Intro for Power BI
Martin X1.7K views
Amazon SageMaker 모델 배포 방법 소개::김대근, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스페셜 웨비나 by Amazon Web Services Korea
Amazon SageMaker 모델 배포 방법 소개::김대근, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스페셜 웨비나Amazon SageMaker 모델 배포 방법 소개::김대근, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스페셜 웨비나
Amazon SageMaker 모델 배포 방법 소개::김대근, AI/ML 스페셜리스트 솔루션즈 아키텍트, AWS::AWS AIML 스페셜 웨비나
Retail Analytics and BI with Looker, BigQuery, GCP & Leigha Jarett by Daniel Zivkovic
Retail Analytics and BI with Looker, BigQuery, GCP & Leigha JarettRetail Analytics and BI with Looker, BigQuery, GCP & Leigha Jarett
Retail Analytics and BI with Looker, BigQuery, GCP & Leigha Jarett
Daniel Zivkovic479 views
Using MLOps to Bring ML to Production/The Promise of MLOps by Weaveworks
Using MLOps to Bring ML to Production/The Promise of MLOpsUsing MLOps to Bring ML to Production/The Promise of MLOps
Using MLOps to Bring ML to Production/The Promise of MLOps
Weaveworks5.4K views

Similar to BigQuery best practices and recommendations to reduce costs with BI Engine, Slots, Materialized Views

Supercharge your data analytics with BigQuery by
Supercharge your data analytics with BigQuerySupercharge your data analytics with BigQuery
Supercharge your data analytics with BigQueryMárton Kodok
189 views30 slides
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery by
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryMárton Kodok
358 views38 slides
Building Data Products with BigQuery for PPC and SEO (SMX 2022) by
Building Data Products with BigQuery for PPC and SEO (SMX 2022)Building Data Products with BigQuery for PPC and SEO (SMX 2022)
Building Data Products with BigQuery for PPC and SEO (SMX 2022)Christopher Gutknecht
572 views62 slides
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery by
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryMárton Kodok
417 views35 slides
Applying BigQuery ML on e-commerce data analytics by
Applying BigQuery ML on e-commerce data analyticsApplying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsMárton Kodok
1.5K views38 slides
Big Query Basics by
Big Query BasicsBig Query Basics
Big Query BasicsIdo Green
28.6K views33 slides

Similar to BigQuery best practices and recommendations to reduce costs with BI Engine, Slots, Materialized Views(20)

Supercharge your data analytics with BigQuery by Márton Kodok
Supercharge your data analytics with BigQuerySupercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuery
Márton Kodok189 views
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery by Márton Kodok
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
Márton Kodok358 views
Building Data Products with BigQuery for PPC and SEO (SMX 2022) by Christopher Gutknecht
Building Data Products with BigQuery for PPC and SEO (SMX 2022)Building Data Products with BigQuery for PPC and SEO (SMX 2022)
Building Data Products with BigQuery for PPC and SEO (SMX 2022)
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery by Márton Kodok
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Márton Kodok417 views
Applying BigQuery ML on e-commerce data analytics by Márton Kodok
Applying BigQuery ML on e-commerce data analyticsApplying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analytics
Márton Kodok1.5K views
Big Query Basics by Ido Green
Big Query BasicsBig Query Basics
Big Query Basics
Ido Green28.6K views
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi... by Márton Kodok
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
Márton Kodok648 views
Implementing google big query automation using google analytics data by Countants
Implementing google big query automation using google analytics dataImplementing google big query automation using google analytics data
Implementing google big query automation using google analytics data
Countants173 views
BigQuery ML - Machine learning at scale using SQL by Márton Kodok
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQL
Márton Kodok305 views
BigdataConference Europe - BigQuery ML by Márton Kodok
BigdataConference Europe - BigQuery MLBigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery ML
Márton Kodok147 views
AwReporting Update by marcwan
AwReporting UpdateAwReporting Update
AwReporting Update
marcwan987 views
BigQuery ML - Machine learning at scale using SQL by Márton Kodok
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQL
Márton Kodok1.1K views
Google Developer Group - Cloud Singapore BigQuery Webinar by Rasel Rana
Google Developer Group - Cloud Singapore BigQuery WebinarGoogle Developer Group - Cloud Singapore BigQuery Webinar
Google Developer Group - Cloud Singapore BigQuery Webinar
Rasel Rana762 views
Database performance improvement, a six sigma project (4 block) by nirav shah by Nirav Shah
Database performance improvement, a six sigma project (4 block) by nirav shah Database performance improvement, a six sigma project (4 block) by nirav shah
Database performance improvement, a six sigma project (4 block) by nirav shah
Nirav Shah5K views
Cherokee nation 2 day AIAD & DIAD - App in a day and Dashboard in day by Vishal Pawar
Cherokee nation 2 day AIAD & DIAD - App in a day and Dashboard in dayCherokee nation 2 day AIAD & DIAD - App in a day and Dashboard in day
Cherokee nation 2 day AIAD & DIAD - App in a day and Dashboard in day
Vishal Pawar103 views
Google BigQuery is the future of Analytics! (Google Developer Conference) by Rasel Rana
Google BigQuery is the future of Analytics! (Google Developer Conference)Google BigQuery is the future of Analytics! (Google Developer Conference)
Google BigQuery is the future of Analytics! (Google Developer Conference)
Rasel Rana710 views
DevTalks Keynote Powering interactive data analysis with Google BigQuery by Márton Kodok
DevTalks Keynote Powering interactive data analysis with Google BigQueryDevTalks Keynote Powering interactive data analysis with Google BigQuery
DevTalks Keynote Powering interactive data analysis with Google BigQuery
Márton Kodok258 views
Discover BigQuery ML, build your own CREATE MODEL statement by Márton Kodok
Discover BigQuery ML, build your own CREATE MODEL statementDiscover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statement
Márton Kodok73 views
What's New for Report Authors in Cognos 10.2 by Senturus
What's New for Report Authors in Cognos 10.2What's New for Report Authors in Cognos 10.2
What's New for Report Authors in Cognos 10.2
Senturus7.1K views

More from Márton Kodok

Gen Apps on Google Cloud PaLM2 and Codey APIs in Action by
Gen Apps on Google Cloud PaLM2 and Codey APIs in ActionGen Apps on Google Cloud PaLM2 and Codey APIs in Action
Gen Apps on Google Cloud PaLM2 and Codey APIs in ActionMárton Kodok
11 views55 slides
DevBCN Vertex AI - Pipelines for your MLOps workflows by
DevBCN Vertex AI - Pipelines for your MLOps workflowsDevBCN Vertex AI - Pipelines for your MLOps workflows
DevBCN Vertex AI - Pipelines for your MLOps workflowsMárton Kodok
67 views57 slides
Cloud Run - the rise of serverless and containerization by
Cloud Run - the rise of serverless and containerizationCloud Run - the rise of serverless and containerization
Cloud Run - the rise of serverless and containerizationMárton Kodok
64 views51 slides
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud by
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudVertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudMárton Kodok
1.2K views50 slides
Vertex AI: Pipelines for your MLOps workflows by
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsMárton Kodok
788 views31 slides
Cloud Workflows What's new in serverless orchestration and automation by
Cloud Workflows What's new in serverless orchestration and automationCloud Workflows What's new in serverless orchestration and automation
Cloud Workflows What's new in serverless orchestration and automationMárton Kodok
191 views38 slides

More from Márton Kodok(20)

Gen Apps on Google Cloud PaLM2 and Codey APIs in Action by Márton Kodok
Gen Apps on Google Cloud PaLM2 and Codey APIs in ActionGen Apps on Google Cloud PaLM2 and Codey APIs in Action
Gen Apps on Google Cloud PaLM2 and Codey APIs in Action
Márton Kodok11 views
DevBCN Vertex AI - Pipelines for your MLOps workflows by Márton Kodok
DevBCN Vertex AI - Pipelines for your MLOps workflowsDevBCN Vertex AI - Pipelines for your MLOps workflows
DevBCN Vertex AI - Pipelines for your MLOps workflows
Márton Kodok67 views
Cloud Run - the rise of serverless and containerization by Márton Kodok
Cloud Run - the rise of serverless and containerizationCloud Run - the rise of serverless and containerization
Cloud Run - the rise of serverless and containerization
Márton Kodok64 views
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud by Márton Kodok
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudVertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Márton Kodok1.2K views
Vertex AI: Pipelines for your MLOps workflows by Márton Kodok
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
Márton Kodok788 views
Cloud Workflows What's new in serverless orchestration and automation by Márton Kodok
Cloud Workflows What's new in serverless orchestration and automationCloud Workflows What's new in serverless orchestration and automation
Cloud Workflows What's new in serverless orchestration and automation
Márton Kodok191 views
Serverless orchestration and automation with Cloud Workflows by Márton Kodok
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud Workflows
Márton Kodok268 views
Serverless orchestration and automation with Cloud Workflows by Márton Kodok
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud Workflows
Márton Kodok371 views
Serverless orchestration and automation with Cloud Workflows by Márton Kodok
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud Workflows
Márton Kodok762 views
DevFest Romania 2020 Keynote: Bringing the Cloud to you. by Márton Kodok
DevFest Romania 2020 Keynote: Bringing the Cloud to you.DevFest Romania 2020 Keynote: Bringing the Cloud to you.
DevFest Romania 2020 Keynote: Bringing the Cloud to you.
Márton Kodok66 views
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer Expertig by Márton Kodok
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer ExpertigVibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer Expertig
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer Expertig
Márton Kodok150 views
Google Cloud Platform Solutions for DevOps Engineers by Márton Kodok
Google Cloud Platform Solutions  for DevOps EngineersGoogle Cloud Platform Solutions  for DevOps Engineers
Google Cloud Platform Solutions for DevOps Engineers
Márton Kodok1.1K views
GDG DevFest Romania - Architecting for the Google Cloud Platform by Márton Kodok
GDG DevFest Romania - Architecting for the Google Cloud PlatformGDG DevFest Romania - Architecting for the Google Cloud Platform
GDG DevFest Romania - Architecting for the Google Cloud Platform
Márton Kodok462 views
Next18 Extended Targu Mures - Bringing the Cloud to you by Márton Kodok
Next18 Extended Targu Mures - Bringing the Cloud to youNext18 Extended Targu Mures - Bringing the Cloud to you
Next18 Extended Targu Mures - Bringing the Cloud to you
Márton Kodok81 views
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon by Márton Kodok
6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon
Márton Kodok133 views
GCP - A felhőalapú architektúrák és szolgáltatások by Márton Kodok
GCP - A felhőalapú architektúrák és szolgáltatásokGCP - A felhőalapú architektúrák és szolgáltatások
GCP - A felhőalapú architektúrák és szolgáltatások
Márton Kodok157 views
GDG Heraklion - Architecting for the Google Cloud Platform by Márton Kodok
GDG Heraklion - Architecting for the Google Cloud PlatformGDG Heraklion - Architecting for the Google Cloud Platform
GDG Heraklion - Architecting for the Google Cloud Platform
Márton Kodok754 views
Efikot - Smart City, okos város - a jövőnk kulcsa by Márton Kodok
Efikot - Smart City, okos város - a jövőnk kulcsaEfikot - Smart City, okos város - a jövőnk kulcsa
Efikot - Smart City, okos város - a jövőnk kulcsa
Márton Kodok170 views
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery by Márton Kodok
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQueryGDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
Márton Kodok300 views
Making advanced analytics accessible to more companies by Márton Kodok
Making advanced analytics accessible to more companiesMaking advanced analytics accessible to more companies
Making advanced analytics accessible to more companies
Márton Kodok155 views

Recently uploaded

Generic or specific? Making sensible software design decisions by
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsBert Jan Schrijver
6 views60 slides
Software evolution understanding: Automatic extraction of software identifier... by
Software evolution understanding: Automatic extraction of software identifier...Software evolution understanding: Automatic extraction of software identifier...
BigQuery best practices and recommendations to reduce costs with BI Engine, Slots, Materialized Views

  • 1. BigQuery best practices and recommendations to reduce costs with BI Engine, Slots, Materialized Views. Devfest Nantes, October 2022. Márton Kodok, Google Developer Expert at REEA.net
  • 2. About me: ● Among the top 3 Romanians on Stack Overflow (201k reputation) ● Google Developer Expert on Cloud technologies (2016→) ● Champion of Google Cloud Innovators program (2021→) ● Crafting Web/Mobile backends at REEA.net. Articles: martonkodok.medium.com Twitter: @martonkodok Slideshare: martonkodok StackOverflow: pentium10 GitHub: pentium10. Reduce BigQuery bills with BI Engine capacity orchestration @martonkodok
  • 3. Agenda: 1. Looking at a BigQuery billing report 2. What is BI Engine? 3. Obtaining per job billing stats 4. Enable and use BI Engine reservations 5. Using Cloud Workflows to orchestrate the right capacity 6. Lower bills and faster queries on Data Studio, BigQuery 7. Conclusions, articles
  • 4. Looking at a BigQuery billing report
  • 5. Reduce BigQuery bills with BI Engine capacity orchestration Article: https://medium.com/p/9e2634c84a82 @martonkodok
  • 6. Cloud Workflows automating the BI Engine capacity size
  • 8. What is BI Engine? “BI Engine is a fast, in-memory analysis service that integrates out of the box with BigQuery, Data Studio, Looker, Tableau, Power BI.”
  • 9. BI Engine architecture
  • 10. What does out-of-the-box mean? 1. It's a cache plugin to BigQuery: a managed, distributed in-memory execution engine. 2. BI Engine reservations manage the memory allocation at the project billing level. 3. It caches only the columns and partitions that are queried or scanned. It does not cache the whole table. 4. Any BI solution or custom application that works with the BigQuery API, such as REST or JDBC and ODBC drivers, can use BI Engine without any changes.
  • 11. Cost components in BigQuery and BI Engine. Free-for-all: 1 TB free each month. On-demand: queries $5/TB, storage $20/TB. Flat-rate reservation slots: average $4 per hour, best is $1700 for 100 slots (yearly plan). BI Engine: $0.0416 per GB/hour ($30.36 per GB/month). BigQuery ML is excluded from this table.
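
The price points on this slide can be sanity-checked with a few lines of arithmetic. The sketch below is only illustrative: the prices are the 2022 figures quoted on the slide, while the 730 hours-per-month convention and the function names are assumptions that do not come from the talk.

```python
# Prices as quoted on the slide (2022); they may have changed since.
BI_ENGINE_USD_PER_GB_HOUR = 0.0416
ON_DEMAND_USD_PER_TB = 5.0

# Assumption: an average month has 8760 / 12 = 730 hours.
HOURS_PER_MONTH = 730


def bi_engine_monthly_cost(capacity_gb: float) -> float:
    """Monthly cost in USD of a BI Engine reservation of the given size."""
    return capacity_gb * BI_ENGINE_USD_PER_GB_HOUR * HOURS_PER_MONTH


def break_even_tb_per_month(capacity_gb: float) -> float:
    """On-demand terabytes per month that cost as much as the reservation."""
    return bi_engine_monthly_cost(capacity_gb) / ON_DEMAND_USD_PER_TB


if __name__ == "__main__":
    for gb in (1, 5, 10):
        print(gb, "GB:", round(bi_engine_monthly_cost(gb), 2), "USD/month,",
              "break-even at", round(break_even_tb_per_month(gb), 2), "TB/month")
```

Under these assumptions a reservation pays for itself once it removes more billed terabytes per month than the break-even figure.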
  • 13. “The aim is to dynamically adjust the size of the BI Engine to get the lowest combined cost of BigQuery and BI Engine.”
  • 14. Biggest challenges: 1. Obtain the cost of your on-demand BigQuery usage 2. Set the BI Engine capacity in steps 3. Have a real-time sense of the savings to drive capacity automation up/down 4. Monitor the applied settings for optimal savings. Prerequisite: access to INFORMATION_SCHEMA or audit logs exported to BigQuery (historically better)
  • 15. The query to get the recent costs for each job
  • 16. The query to get the recent costs for each job: 1. The query uses a flat rate of 5 USD per TB to calculate the cost 2. At this point no optimization is in place, as the two columns (processed and billed) are the same
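
The query itself is not reproduced in this transcript, but the per-job arithmetic it describes can be sketched as follows. This is a minimal illustration, not the author's query: `total_bytes_processed` and `total_bytes_billed` are the job-level byte counters exposed by BigQuery, while the binary-terabyte divisor and the function name are assumptions.

```python
USD_PER_TB = 5.0          # flat on-demand rate quoted on the slide
BYTES_PER_TB = 2 ** 40    # assumption: a terabyte is counted as 2^40 bytes


def job_costs(total_bytes_processed, total_bytes_billed):
    """Return the processed cost, billed cost and savings of one job in USD.

    None is treated as zero bytes, since a job row may lack a counter.
    """
    processed_usd = (total_bytes_processed or 0) / BYTES_PER_TB * USD_PER_TB
    billed_usd = (total_bytes_billed or 0) / BYTES_PER_TB * USD_PER_TB
    return {
        "processed_usd": processed_usd,
        "billed_usd": billed_usd,
        "savings_usd": processed_usd - billed_usd,
    }


if __name__ == "__main__":
    # Without BI Engine the two columns match, so savings are zero.
    print(job_costs(2 ** 40, 2 ** 40))
    # With acceleration, billed bytes drop below processed bytes.
    print(job_costs(2 ** 40, 2 ** 39))
```

Summing `savings_usd` over recent jobs gives the signal that the later slides use to drive capacity up or down.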
  • 17. BigQuery savings based on billed vs processed bytes
  • 18. BigQuery savings when BI Engine is properly sized: 1. A BI Engine capacity resize needs 5 minutes to propagate 2. Savings are calculated from billed bytes being lower than processed bytes
  • 20. BI Engine turned on, but ineffective: 1. BI Engine capacity might be too small 2. BigQuery queries are too complex
  • 21. “Not all BigQuery queries are accelerated.”
  • 22. Acceleration statistics: 1. Detailed statistics on BI Engine are available through the job statistics API 2. Use the bq command-line tool to fetch job statistics
  • 23. Acceleration statistics: bq show --format=prettyjson -j job_id
       "statistics": {
         "creationTime": "1602175128902",
         "endTime": "1602175130700",
         "query": {
           "biEngineStatistics": {
             "biEngineMode": "DISABLED",
             "biEngineReasons": [
               {
                 "code": "UNSUPPORTED_SQL_TEXT",
                 "message": "Detected unsupported join type"
               }
             ]
           },
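
To act on this output programmatically, the acceleration mode and reasons can be pulled out of the job JSON. The sketch below assumes the output of `bq show --format=prettyjson -j job_id` has been saved as text; the sample document is a trimmed, closed version of the excerpt on the slide, and the helper name is hypothetical.

```python
import json

# Trimmed sample based on the slide excerpt, closed so that it parses.
SAMPLE_JOB_JSON = """
{
  "statistics": {
    "query": {
      "biEngineStatistics": {
        "biEngineMode": "DISABLED",
        "biEngineReasons": [
          {"code": "UNSUPPORTED_SQL_TEXT",
           "message": "Detected unsupported join type"}
        ]
      }
    }
  }
}
"""


def bi_engine_summary(job: dict):
    """Return (mode, reasons) for a job dict, or (None, []) if absent."""
    stats = job.get("statistics", {}).get("query", {}).get("biEngineStatistics")
    if stats is None:
        return None, []
    reasons = [
        f"{reason.get('code')}: {reason.get('message')}"
        for reason in stats.get("biEngineReasons", [])
    ]
    return stats.get("biEngineMode"), reasons


if __name__ == "__main__":
    mode, reasons = bi_engine_summary(json.loads(SAMPLE_JOB_JSON))
    print(mode)
    for line in reasons:
        print(" -", line)
```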
  • 24. Use INFORMATION_SCHEMA to get acceleration statistics
  • 25. To have effective BI Engine acceleration: 1. Investigate queries that have BI Engine acceleration reported as disabled or partial 2. Rewrite queries to perform better under the BI Engine optimizer 3. Use materialized views to join and flatten data to optimize their structure for BI Engine 4. Create short-lived (5m, 15m, 1h) temporary tables to improve caching efficiency 5. Increase the size of the BI Engine reservation until it is used effectively 6. Use Cloud Workflows and business logic to automate the size based on workload during the day
  • 26. Leverage temporary, dedicated business scope tables: a dedicated table for each scope; use a scheduler to recreate it every 5m/15m/1h; leverage clustering
  • 27. Use Materialized Views to get latest rows from append-only tables: a trick to get the latest row using Materialized Views; a 2nd view to get rid of the arrays
  • 29. Cloud Workflows automating the BI Engine capacity size
  • 30. Cloud Workflows automating the BI Engine capacity size
  • 31. BigQuery savings when BI Engine is properly sized: 1. A BI Engine capacity resize needs 5 minutes to propagate 2. Savings are calculated from billed bytes being lower than processed bytes
  • 32. Cloud Workflow automation logic: 1. Read the output of the billed vs processed bytes effectiveness query 2. Based on the benefit margin, map the size of the increase, e.g. 5GB step, 1GB step, 0.5GB step 3. Evaluate with simple math how far you can stretch the BI Engine capacity and still benefit 4. Map capacity to the time of day: more capacity over office hours, lower capacity during the night 5. Leverage BigQuery ML to write a time-series forecast based on historical data to drive the best BI Engine capacity for each “hour slot” 6. Stop increasing the capacity when the extra reservation costs more than the savings it brings
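
The step-sizing logic above can be sketched in a few lines. This is not the workflow from the talk: it is written in Python rather than Cloud Workflows syntax, and the margin thresholds, the capacity ceiling and the function name are all hypothetical values chosen for illustration. Only the 5GB/1GB/0.5GB steps and the hourly BI Engine price come from the slides.

```python
BI_ENGINE_USD_PER_GB_HOUR = 0.0416  # price quoted in the talk (2022)


def next_capacity_gb(current_gb, hourly_savings_usd, max_gb=100.0):
    """Suggest the next BI Engine capacity from the observed hourly savings.

    hourly_savings_usd is the on-demand cost avoided in the last hour
    (processed cost minus billed cost). The margin thresholds below are
    hypothetical and would need tuning per project.
    """
    reservation_cost = current_gb * BI_ENGINE_USD_PER_GB_HOUR
    margin = hourly_savings_usd - reservation_cost
    if margin <= 0:
        # The reservation already costs as much as it saves: stop growing.
        return current_gb
    if margin > 1.0:
        step = 5.0
    elif margin > 0.25:
        step = 1.0
    else:
        step = 0.5
    return min(current_gb + step, max_gb)


if __name__ == "__main__":
    for savings in (0.2, 0.5, 1.0, 2.0):
        print(savings, "USD/h saved ->", next_capacity_gb(10, savings), "GB")
```

A real workflow would also shrink the reservation when the margin stays negative and would wait for the resize to propagate before measuring again.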
  • 33. Reduce BigQuery bills with BI Engine capacity orchestration
  • 34. Data Studio aspects: 1. Accelerated by BigQuery Engine icon 2. Faster dashboards
  • 35. Cloud Monitoring: 1. Create a chart plotting bigquerybiengine.googleapis.com/reservation/used_bytes 2. over bigquerybiengine.googleapis.com/reservation/total_bytes
  • 37. Conclusions: 1. An easy, out-of-the-box way to optimize BigQuery costs 2. by turning on BI Engine, which does not need code changes. 3. Leverage INFORMATION_SCHEMA stats to find underperforming queries and try to optimize them. 4. Automate the right capacity size by using Cloud Workflows 5. Save precious development time, lower bills, faster queries
  • 38. Thank you. Q&A. Slides available on: slideshare.net/martonkodok Reea.net - Integrated web solutions driven by creativity to deliver projects. Twitter: @martonkodok