Airbyte & Firebolt
Why?
Open-source data integration platform
AIRBYTE FEATURES
● DATA REPLICATION
○ Scheduled updates
○ Manual full refresh
○ Change Data Capture for databases
● TRANSFORMATION
○ Full control over the data
○ Normalized schemas
○ Custom transformation via dbt
● IN FULL CONTROL
○ Real-time monitoring
○ Notifications for failed syncs
○ Debugging autonomy
Airbyte
● Concept, installation, source, destination, configuration
○ Pandas under the hood
○ No way to synchronize folders, only individual files
● S3, CSV: 10K rows, 10M rows, 1K files with 10K rows each
● S3 to Snowflake
○ The 10M-row load does not work, possibly an AWS instance issue
○ Loading from a local file to Snowflake works
● Oracle RDS to Snowflake
○ Overwrite and incremental syncs
○ Custom transformations
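The two sync behaviours listed for Oracle RDS to Snowflake (replacing the destination versus loading only new rows) can be sketched in a few lines of Python. This is only an illustration under assumptions: the `updated_at` cursor column and the `sync_overwrite` / `sync_incremental` helpers are hypothetical names and are not Airbyte's API.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def sync_overwrite(source: List[Row]) -> List[Row]:
    """Full refresh: the destination becomes a fresh copy of the source."""
    return [dict(row) for row in source]


def sync_incremental(
    source: List[Row], destination: List[Row], cursor: int
) -> Tuple[List[Row], int]:
    """Append only rows whose cursor value is newer than the saved cursor."""
    new_rows = [dict(r) for r in source if r["updated_at"] > cursor]
    new_cursor = max([cursor] + [r["updated_at"] for r in new_rows])
    return destination + new_rows, new_cursor


# Hypothetical source table with an `updated_at` cursor column.
source = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 20},
]
dest, cursor = sync_incremental(source, [], cursor=0)  # first run copies both rows
source.append({"id": 3, "updated_at": 30})
dest, cursor = sync_incremental(source, dest, cursor)  # second run copies only id 3
print(len(dest), cursor)
```

The incremental variant only works if the source exposes a column that grows with every change; without one, a full overwrite is the fallback.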
Test
● EC2 instance with Airbyte
● Ingest data from S3 to Snowflake
○ 10K
○ 10M
● Ingest data from RDS Oracle DB to Snowflake
● Custom dbt transformation
Cloud Data Warehouse For Engineers
Firebolt
Source: Firebolt Live Product Showdown with Boaz Farkash
The Firebolt difference
A new focus on speed and efficiency at scale
● EFFICIENCY: Do more with less. 10x price-performance advantage through greater hardware efficiency and choice
● SPEED: Up to 182x faster at scale through optimized storage, indexing and engine
● SCALE: Elastic scale at speed across ETL, semi-structured data and thousands of users
Architecture
Source: Firebolt documentation
First try
Fact table (10M rows)
Second try
● Mock generated data
○ Star schema
○ 1M rows (fact)
○ 1K rows (dimensions)
● Parquet files in S3
● Loading data with a Firebolt script
● External, fact and dimension tables
● 7 different queries (joins, WHERE and LIKE clauses, and window functions)
● No aggregate or join indexes (Firebolt)
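As a rough idea of how mock star-schema data like this could be generated, here is a minimal Python sketch. It is not the script used for the test: the column names are borrowed from the benchmark queries, while the value ranges, the seed and the scaled-down fact size are assumptions. Writing the rows out as Parquet would need an extra library and is left out.

```python
import random
from datetime import date, timedelta

random.seed(42)  # fixed seed so the mock data is reproducible

N_DIM = 1_000    # rows per dimension, as on the slide
N_FACT = 10_000  # scaled down from 1M so the sketch runs quickly

# Dimension tables: only the columns the benchmark joins and filters touch.
dim_employee = [{"employee_id": i} for i in range(1, N_DIM + 1)]
dim_product = [
    {"product_id": i, "product_name": f"product_{i}"} for i in range(1, N_DIM + 1)
]


def random_date(start: date, end: date) -> date:
    """Pick a uniformly random date in [start, end]."""
    return start + timedelta(days=random.randint(0, (end - start).days))


# Fact table: every foreign key points at an existing dimension row.
fact_sales = [
    {
        "employee_id": random.randint(1, N_DIM),
        "product_id": random.randint(1, N_DIM),
        "price": round(random.uniform(1, 500), 2),
        "quantity": random.randint(1, 20),
        "my_date": random_date(date(2000, 1, 1), date(2100, 12, 31)),
    }
    for _ in range(N_FACT)
]

print(len(dim_employee), len(dim_product), len(fact_sales))
```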
Query 1 (994,969 rows)
SELECT dim_employee.employee_id, dim_product.product_id, dim_time.time_id, dim_store.store_id, dim_sales_type.sales_type_id, fs.price, fs.quantity, fs.yn
FROM fact_sales fs
INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
INNER JOIN dim_product ON fs.product_id=dim_product.product_id
INNER JOIN dim_time ON fs.time_id=dim_time.time_id
INNER JOIN dim_store ON fs.store_id=dim_store.store_id
INNER JOIN dim_sales_type ON fs.sales_type_id=dim_sales_type.sales_type_id;

Query 2 (531,438 rows)
SELECT dim_employee.employee_id, dim_product.product_id, dim_time.time_id, dim_store.store_id, dim_sales_type.sales_type_id, fs.price, fs.quantity, fs.my_date
FROM fact_sales fs
INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
INNER JOIN dim_product ON fs.product_id=dim_product.product_id
INNER JOIN dim_time ON fs.time_id=dim_time.time_id
INNER JOIN dim_store ON fs.store_id=dim_store.store_id
INNER JOIN dim_sales_type ON fs.sales_type_id=dim_sales_type.sales_type_id
WHERE fs.my_date BETWEEN '2050-01-01' AND '2100-12-31'

Query 3 (3,305 rows)
SELECT dim_employee.employee_id, dim_product.product_id, dim_time.time_id, dim_store.store_id, dim_sales_type.sales_type_id, fs.price, fs.quantity, fs.my_date, dim_product.product_name
FROM fact_sales fs
INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
INNER JOIN dim_product ON fs.product_id=dim_product.product_id
INNER JOIN dim_time ON fs.time_id=dim_time.time_id
INNER JOIN dim_store ON fs.store_id=dim_store.store_id
INNER JOIN dim_sales_type ON fs.sales_type_id=dim_sales_type.sales_type_id
WHERE fs.my_date BETWEEN '2050-01-01' AND '2100-12-31'
AND dim_product.product_name LIKE '%eto%'
ORDER BY dim_product.product_name

Query 4 (551,277 rows)
SELECT dim_employee.employee_id, dim_product.product_id, SUM(fs.price) AS total, MAX(fs.quantity) AS max_quantity
FROM fact_sales fs
INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
INNER JOIN dim_product ON fs.product_id=dim_product.product_id
GROUP BY dim_employee.employee_id, dim_product.product_id
ORDER BY total DESC

Query 5 (997,933 rows)
SELECT dim_employee.employee_id, dim_product.product_id, fs.category, fs.random,
RANK() OVER (PARTITION BY fs.category ORDER BY fs.random DESC) AS rank
FROM fact_sales fs
INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
INNER JOIN dim_product ON fs.product_id=dim_product.product_id
ORDER BY rank

Query 6 (552,133 rows)
SELECT DISTINCT dim_employee.employee_id, dim_product.product_id
, SUM(fs.price) OVER (PARTITION BY dim_employee.employee_id, dim_product.product_id) AS total
, MAX(fs.quantity) OVER (PARTITION BY dim_employee.employee_id, dim_product.product_id) AS max_quantity
FROM fact_sales fs
INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
INNER JOIN dim_product ON fs.product_id=dim_product.product_id
ORDER BY total DESC

Query 7 (552,133 rows)
SELECT DISTINCT fs.employee_id, fs.product_id
, SUM(fs.price) OVER (PARTITION BY dim_employee.employee_id, dim_product.product_id) AS total
, MAX(fs.quantity) OVER (PARTITION BY dim_employee.employee_id, dim_product.product_id) AS max_quantity
FROM fact_sales fs
INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
INNER JOIN dim_product ON fs.product_id=dim_product.product_id
ORDER BY total DESC
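Queries 4 and 6 compute the same per-employee, per-product totals in two ways: once with GROUP BY and once with DISTINCT plus window functions. The sketch below runs both shapes against a tiny in-memory SQLite database to show how they relate. SQLite is only a local stand-in for Firebolt and Snowflake, the five fact rows are invented for illustration, and window functions assume SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_employee (employee_id INTEGER PRIMARY KEY);
    CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY);
    CREATE TABLE fact_sales (
        employee_id INTEGER, product_id INTEGER, price INTEGER, quantity INTEGER
    );
    INSERT INTO dim_employee VALUES (1), (2);
    INSERT INTO dim_product  VALUES (10), (20);
    INSERT INTO fact_sales VALUES
        (1, 10, 5, 1), (1, 10, 7, 3), (1, 20, 2, 4), (2, 10, 9, 2), (2, 10, 1, 6);
    """
)

# Same shape as query 4: one output row per (employee, product) group.
group_by_sql = """
    SELECT e.employee_id, p.product_id,
           SUM(fs.price) AS total, MAX(fs.quantity) AS max_quantity
    FROM fact_sales fs
    INNER JOIN dim_employee e ON fs.employee_id = e.employee_id
    INNER JOIN dim_product  p ON fs.product_id  = p.product_id
    GROUP BY e.employee_id, p.product_id
"""

# Same shape as query 6: aggregate per partition, then collapse duplicates.
window_sql = """
    SELECT DISTINCT e.employee_id, p.product_id,
           SUM(fs.price)    OVER (PARTITION BY e.employee_id, p.product_id) AS total,
           MAX(fs.quantity) OVER (PARTITION BY e.employee_id, p.product_id) AS max_quantity
    FROM fact_sales fs
    INNER JOIN dim_employee e ON fs.employee_id = e.employee_id
    INNER JOIN dim_product  p ON fs.product_id  = p.product_id
"""

grouped = sorted(conn.execute(group_by_sql).fetchall())
windowed = sorted(conn.execute(window_sql).fetchall())
print(grouped)
print(grouped == windowed)
```

On this toy data both forms return the same rows. The window form first produces one row per fact row and only then removes duplicates, which is one reason it can serve as a heavier test case for an engine.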
Results
In the result charts, a value of 0 means the query ran out of memory (OOM) or failed.
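A timing loop that follows the same convention (a failed or out-of-memory query is recorded as 0) could look like the sketch below. It is an assumption about how such a benchmark might be scripted, not the harness used for these slides. The `time_query` helper is a hypothetical name, and SQLite stands in for the real warehouses.

```python
import sqlite3
import time
from typing import Dict


def time_query(conn: sqlite3.Connection, sql: str) -> float:
    """Return elapsed seconds for one query, or 0.0 if it fails."""
    start = time.perf_counter()
    try:
        conn.execute(sql).fetchall()  # fetch all rows so the whole query is timed
    except (sqlite3.Error, MemoryError):
        return 0.0
    return time.perf_counter() - start


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (price INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)", [(i, i % 7) for i in range(1000)]
)

queries: Dict[str, str] = {
    "ok": "SELECT SUM(price), MAX(quantity) FROM fact_sales",
    "broken": "SELECT * FROM missing_table",  # fails on purpose
}
results = {name: time_query(conn, sql) for name, sql in queries.items()}
print(results)
```

Recording failures as 0 keeps every query on the chart, but a reader has to know the convention, since a 0 could otherwise be mistaken for an instant result.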
● Primary Index
● Join Index
● Aggregate Index
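The general idea behind an aggregate index, answering an aggregation from precomputed per-group values instead of scanning the fact table, can be illustrated in plain Python. This is a conceptual sketch only: it does not describe how Firebolt implements its indexes, and the data and function names are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (employee_id, product_id, price): invented sample rows
fact_sales: List[Tuple[int, int, int]] = [
    (1, 10, 5), (1, 10, 7), (1, 20, 2), (2, 10, 9),
]


def scan_total(employee_id: int, product_id: int) -> int:
    """Without an index: scan every fact row for each request."""
    return sum(
        price
        for emp, prod, price in fact_sales
        if emp == employee_id and prod == product_id
    )


# Precomputed aggregate: totals are built once, at load time, per group key.
agg_index: Dict[Tuple[int, int], int] = defaultdict(int)
for emp, prod, price in fact_sales:
    agg_index[(emp, prod)] += price


def indexed_total(employee_id: int, product_id: int) -> int:
    """With the precomputed aggregate: one dictionary lookup, no scan."""
    return agg_index.get((employee_id, product_id), 0)


print(scan_total(1, 10), indexed_total(1, 10))
```

The trade-off is the usual one for precomputation: extra work and storage at load time in exchange for cheaper reads on the grouped columns.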
Firebolt vs Snowflake

Firebolt vs Snowflake

  • 4. AIRBYTE FEATURES Scheduled updates Manual full refresh Change Data Capture for databases DATA REPLICATION Full control over the data Normalized schemas Custom transformation via dbt TRANSFORMATION Real time monitoring Notification for failed syncs Debugging autonomy IN FULL CONTROL
  • 5. Airbyte ● Concept, installation, source, destination, configuration ○ Pandas under the hood ○ No way to synchronize folders, only files ● S3, CSV 10K, 10M, 1K files with 10K rows ● S3 to Snowflake ○ 10M does not work, maybe AWS instance issue? ○ From local file to Snowflake works ● Oracle RDS to Snowflake ○ Override and incrementals ○ Custom transformations
  • 6. Test ● EC2 instance with Airbyte ● Ingest data from S3 to Snowflake ○ 10K ○ 10M ● Ingest data from RDS Oracle DB to Snowflake ● Custom dbt transformation
  • 7. Cloud Data Warehouse For Engineers
  • 8. Firebolt Source: Firebolt Live Product Showdown with Boaz Farkash
  • 9. The Firebolt difference: a new focus on speed and efficiency at scale. EFFICIENCY Do more with less. 10x price performance advantage through greater HW efficiency & choice. SPEED Up to 182x faster speed at scale by optimizing storage, indexing and engine. SCALE Elastic scale at speed across ETL, semi-structured data and thousands of users.
  • 11. First try Fact table (10M rows)
  • 12. Second try ● Mock generated data ○ Star schema ○ 1M rows (fact) ○ 1K rows (dimensions) ● Parquet files in S3 ● Loading data with a Firebolt script ● External, fact and dimension tables ● 7 different queries (joins, WHERE and LIKE clauses, and window functions) ● No aggregate or join indexes (Firebolt)
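The setup on this slide can be approximated locally. The following is a minimal sketch, not the script used in the talk: it builds a scaled-down star schema in an in-memory SQLite database instead of Parquet files in S3. The table and column names are taken from the queries on the later slides; the function name `build_star_schema`, the row counts, the 2000 to 2100 date range, and the value distributions are assumptions made for illustration.

```python
import random
import sqlite3
import string
from datetime import date, timedelta

DIMENSIONS = ["employee", "product", "time", "store", "sales_type"]


def build_star_schema(conn, fact_rows=10_000, dim_rows=100, seed=42):
    """Create one fact table and five dimension tables filled with mock data."""
    rng = random.Random(seed)

    # Dimension tables: only dim_product carries an extra attribute,
    # because the slides filter on dim_product.product_name.
    for dim in DIMENSIONS:
        extra = ", product_name TEXT" if dim == "product" else ""
        conn.execute(f"CREATE TABLE dim_{dim} ({dim}_id INTEGER PRIMARY KEY{extra})")
        for key in range(1, dim_rows + 1):
            if dim == "product":
                name = "".join(rng.choices(string.ascii_lowercase, k=8))
                conn.execute("INSERT INTO dim_product VALUES (?, ?)", (key, name))
            else:
                conn.execute(f"INSERT INTO dim_{dim} VALUES (?)", (key,))

    conn.execute(
        """CREATE TABLE fact_sales (
               employee_id INTEGER, product_id INTEGER, time_id INTEGER,
               store_id INTEGER, sales_type_id INTEGER,
               price REAL, quantity INTEGER, yn TEXT, my_date TEXT,
               category TEXT, "random" REAL)"""
    )

    # Assumed date range: 2000-01-01 to 2100-12-31 (not stated in the slides).
    start = date(2000, 1, 1)
    span_days = (date(2100, 12, 31) - start).days
    rows = []
    for _ in range(fact_rows):
        keys = [rng.randint(1, dim_rows) for _ in DIMENSIONS]
        day = start + timedelta(days=rng.randint(0, span_days))
        rows.append((*keys, round(rng.uniform(1, 500), 2), rng.randint(1, 20),
                     rng.choice("YN"), day.isoformat(),
                     rng.choice(["A", "B", "C", "D"]), rng.random()))
    conn.executemany("INSERT INTO fact_sales VALUES (?,?,?,?,?,?,?,?,?,?,?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
fact_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
joined_count = conn.execute(
    """SELECT COUNT(*) FROM fact_sales fs
       INNER JOIN dim_employee ON fs.employee_id = dim_employee.employee_id
       INNER JOIN dim_product ON fs.product_id = dim_product.product_id"""
).fetchone()[0]
print(fact_count, joined_count)
```

In this sketch every fact row references an existing dimension key, so the inner joins keep all rows. The row counts reported on the query slides are slightly below the fact table size, so the original mock data presumably contained some keys without a match.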
  • 13. Query, statement, rows returned

    Query 1 (994,969 rows)
      SELECT dim_employee.employee_id, dim_product.product_id, dim_time.time_id, dim_store.store_id, dim_sales_type.sales_type_id, fs.price, fs.quantity, fs.yn
      FROM fact_sales fs
      INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
      INNER JOIN dim_product ON fs.product_id=dim_product.product_id
      INNER JOIN dim_time ON fs.time_id=dim_time.time_id
      INNER JOIN dim_store ON fs.store_id=dim_store.store_id
      INNER JOIN dim_sales_type ON fs.sales_type_id=dim_sales_type.sales_type_id;

    Query 2 (531,438 rows)
      SELECT dim_employee.employee_id, dim_product.product_id, dim_time.time_id, dim_store.store_id, dim_sales_type.sales_type_id, fs.price, fs.quantity, fs.my_date
      FROM fact_sales fs
      INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
      INNER JOIN dim_product ON fs.product_id=dim_product.product_id
      INNER JOIN dim_time ON fs.time_id=dim_time.time_id
      INNER JOIN dim_store ON fs.store_id=dim_store.store_id
      INNER JOIN dim_sales_type ON fs.sales_type_id=dim_sales_type.sales_type_id
      WHERE fs.my_date BETWEEN '2050-01-01' and '2100-12-31'

    Query 3 (3,305 rows)
      SELECT dim_employee.employee_id, dim_product.product_id, dim_time.time_id, dim_store.store_id, dim_sales_type.sales_type_id, fs.price, fs.quantity, fs.my_date, dim_product.product_name
      FROM fact_sales fs
      INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
      INNER JOIN dim_product ON fs.product_id=dim_product.product_id
      INNER JOIN dim_time ON fs.time_id=dim_time.time_id
      INNER JOIN dim_store ON fs.store_id=dim_store.store_id
      INNER JOIN dim_sales_type ON fs.sales_type_id=dim_sales_type.sales_type_id
      WHERE fs.my_date BETWEEN '2050-01-01' and '2100-12-31'
      AND dim_product.product_name like '%eto%'
      ORDER BY dim_product.product_name

    Query 4 (551,277 rows)
      SELECT dim_employee.employee_id, dim_product.product_id, sum(fs.price) as total, max(fs.quantity) max_quantity
      FROM fact_sales fs
      INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
      INNER JOIN dim_product ON fs.product_id=dim_product.product_id
      GROUP BY dim_employee.employee_id, dim_product.product_id
      ORDER BY total desc
  • 14. Query, statement, rows returned

    Query 5 (997,933 rows)
      SELECT dim_employee.employee_id, dim_product.product_id, fs.category, fs.random, RANK() OVER(PARTITION BY fs.category ORDER BY fs.random DESC) as rank
      FROM fact_sales fs
      INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
      INNER JOIN dim_product ON fs.product_id=dim_product.product_id
      ORDER BY rank

    Query 6 (552,133 rows)
      SELECT distinct dim_employee.employee_id, dim_product.product_id
        , sum(fs.price) over(partition by dim_employee.employee_id, dim_product.product_id) as total
        , max(fs.quantity) over(partition by dim_employee.employee_id, dim_product.product_id) max_quantity
      FROM fact_sales fs
      INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
      INNER JOIN dim_product ON fs.product_id=dim_product.product_id
      ORDER BY total desc

    Query 7 (552,133 rows)
      SELECT distinct fs.employee_id, fs.product_id
        , sum(fs.price) over(partition by dim_employee.employee_id, dim_product.product_id) as total
        , max(fs.quantity) over(partition by dim_employee.employee_id, dim_product.product_id) max_quantity
      FROM fact_sales fs
      INNER JOIN dim_employee ON fs.employee_id=dim_employee.employee_id
      INNER JOIN dim_product ON fs.product_id=dim_product.product_id
      ORDER BY total desc
  • 15. Results 0 = OOM or Failed
  • 16. Results 0 = OOM or Failed
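The "0 = OOM or Failed" convention in the result charts can be sketched as a small timing harness. This is an illustration only, not the benchmark code behind the slides: it runs against an in-memory SQLite table rather than Firebolt or Snowflake, and the helper `run_benchmark`, the sample rows, and the two sample queries are hypothetical.

```python
import sqlite3
import time


def run_benchmark(conn, queries):
    """Return {query_id: (seconds, rows)}; both values are 0 when a query fails."""
    results = {}
    for query_id, sql in queries.items():
        started = time.perf_counter()
        try:
            rows = len(conn.execute(sql).fetchall())
            elapsed = time.perf_counter() - started
        except (sqlite3.Error, MemoryError):
            # Same convention as the result slides: 0 marks OOM or failure.
            rows, elapsed = 0, 0
        results[query_id] = (elapsed, rows)
    return results


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (employee_id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)", [(1, 10.0), (1, 5.0), (2, 7.5)]
)

queries = {
    "ok": "SELECT employee_id, SUM(price) AS total FROM fact_sales GROUP BY employee_id",
    "broken": "SELECT missing_column FROM fact_sales",  # fails on purpose
}
results = run_benchmark(conn, queries)
for query_id, (seconds, rows) in results.items():
    print(query_id, seconds, rows)
```

Recording failures as 0 keeps every query on the chart, but a 0 must then be read as "no result", not as "instant".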
  • 17. Primary Index Join Index Aggregate Index