SlideShare a Scribd company logo
1 of 40
Download to read offline
© 2023
A Modern & Sustainable
Analytics Stack
W A L D
PyConDE / PyData 2023, April 17th
Florian Wilhelm
Mathematical Modelling
Modern Data Warehousing & Analytics
Personalisation & RecSys
Uncertainty Quantification & Causality
Python Data Stack
OSS Contributor & Creator of PyScaffold
Dr. Florian Wilhelm
• H E A D O F D ATA S C I E N C E
FlorianWilhelm.info
florian.wilhelm@inovex.de
Florian Wilhelm
@FlorianWilhelm
‣ Application Development (Web Platforms,
Mobile Apps, Smart Devices and Robotics, UI/
UX design,Backend Services)
‣ Data Management and Analytics (Business
Intelligence, Big Data, Searches, Data Science
and Deep Learning, Machine Perception and
Artificial Intelligence)
‣ Scalable IT-Infrastructures (IT Engineering,
Cloud Services, DevOps, Replatforming,
Security)
‣ Training and Coaching (inovex Academy)
is an innovation and quality-
driven IT project house with a
focus on digital transformation.
Using technology to
inspire our clients.
And ourselves.
Berlin · Karlsruhe · Pforzheim · Stuttgart · München · Köln · Hamburg · Erlangen
www.inovex.de
V S
Data Warehouse
Data Lake
V S
ETL
ELT
&
C O M PA R I S O N
Data Lake vs Data Warehouse
Data Lake Data Warehouse
Data Structure Raw Processed
Purpose of Data Not yet determined Currently in use
Users Data scientists /
engineers
Business professionals
Accessibility Highly accessible and
quick to update
More complicated and
costly to make changes
C O M PA R I S O N
ETL — Extract, Transform, Load
S TA G I N G A R E A
I N D ATA L A K E
T R A N S F O R M
S Q L R D B M S
N O S Q L D B M S
X M L F I L E S
S A A S P L AT F O R M
E X T R A C T
D ATA
WA R E H O U S E
L O A D
A N A LY T I C S
A N A L Y S E
C O M PA R I S O N
ELT — Extract, Load, Transform
D ATA
WA R E H O U S E
T R A N S F O R M
S Q L R D B M S
N O S Q L D B M S
X M L F I L E S
S A A S P L AT F O R M
E X T R A C T L O A D
A N A LY T I C S
A N A L Y S E
P R O G R E S S
The Evolution of Databricks & Snowflake
D ATA L A K E
& E T L
D ATA WA R E H O U S E
& E LT
P R O G R E S S
The Evolution of Databricks & Snowflake
D ATA B R I C K S
L A K E H O U S E
S N OW F L A K E
D ATA C L O U D
U N I F I C AT I O N
O F D ATA WA R E H O U S E & L A K E
W A L D
S T A C K
W A L D
S T A C K
WA L D S TA C K
A Modern & Sustainable Analytics Stack
Warehouse like
Snowflake, Google
BigQuery, Databricks
Airbyte for
Integration of
hundreds of data
sources
Lightdash for
Business
Intelligence
DBT for data
transformations
WA L D S TA C K
WALD based on Google Big Query
‣ fully managed enterprise data
warehouse
‣ only available on Google Cloud
Platform
‣ uses dbt-bigquery to run Python
UDFs with Dataproc using
PySpark
WA L D S TA C K
WALD based on Databricks
‣ uses Databrick’s SQL warehouse
‣ available on all major cloud
providers
‣ uses dbt-databricks to run Python
UDFs with PySpark
WA L D S TA C K
WALD based on Snowflake
‣ comes with all the benefits of
Snowflake, e.g. governance, easy
collaboration, security, near zero
maintenance
‣ runs on all major cloud providers
‣ tons of great documentation
‣ the easiest to use and full-
featured option to get started
with WALD
WA L D S TA C K
Modern Analytics Stack
DATA CLOUD
DATA SOURCES
E X T R A C T L O A D T R A N S F O R M A n a l y s e
SNOWPARK
WA L D S TA C K
Example
Use-Case
>find all code & details under
https://waldstack.org
F O R M U L A 1
Analysis and Position Prediction
1. Upload data using Airbyte to
Snowflake
2. Use dbt for some basic analysis
3. Use dbt with Snowpark for some
ML to predict future position
4. Visualise our results with
Lightdash
F O R M U L A 1
Rough Overview of the Data
‣ dimensions: drivers,
circuits, constructors,
races
‣ facts: lap_times,
pit_tops, results
F O R M U L A 1
Rough Overview of the Data
‣ dimensions: drivers,
circuits, constructors,
races
‣ facts: lap_times,
pit_tops, results
F O R M U L A 1
Rough Overview of the Data
‣ dimensions: drivers,
circuits, constructors,
races
‣ facts: lap_times,
pit_tops, results
F O R M U L A 1
Rough Overview of the Data
‣ dimensions: drivers,
circuits, constructors,
races
‣ facts: lap_times,
pit_tops, results
F O R M U L A 1
Rough Overview of the Data
‣ dimensions: drivers,
circuits, constructors,
races
‣ facts: lap_times,
pit_tops, results
F O R M U L A 1
Rough Overview of the Data
‣ dimensions: drivers,
circuits, constructors,
races
‣ facts: lap_times,
pit_tops, results
F O R M U L A 1
Rough Overview of the Data
‣ dimensions: drivers,
circuits, constructors,
races
‣ facts: lap_times,
pit_tops, results
F O R M U L A 1
Rough Overview of the Data
‣ dimensions: drivers,
circuits, constructors,
races
‣ facts: lap_times,
pit_tops, results
S N OW F L A K E
Snowflake Setup
‣ activate the Anaconda Packages
(Preview Feature)
‣ create a warehouse to be used
for dbt
‣ that’s it already!
A I R B Y T E
Ingesting the
Data with Airbyte
Simply select
‣ file as source
‣ Snowflake as destination
and hit Launch
P Y T H O N + S N OW PA R K
Setup Python/Snowpark for dbt
‣ just install along dbt also
‣ dbt-snowflake
‣ snowflake-snowpark-
python
‣ snowflake-connector-
python
‣ and configure Snowflake in your
profiles.yml
P Y T H O N + S N OW PA R K
Training a Python Model in dbt/Snowpark
D B T / S N OW PA R K
Predicting with a trained Model in dbt/Snowpark
L I G H T D A S H
Visualisation with Lightdash
Lightdash pulls dimensions and metrics from dbt.
Lightdash
‣ runs ad-hoc queries
‣ creates charts from queries
‣ creates dashboards from charts
‣ organises everything in spaces
What are the avg lap times split by year and grand prix?
M E T R I C D I M E N S I O N S
L I G H T D A S H
Main View of Lightdash
L I G H T D A S H
Analysing the Average Lap Times
L I G H T D A S H
Example of a Simple Dashboard
F O R M U L A 1
Analysis and Position Prediction
One more thing…
O N M O R E T H I N G O N …
Streamlit
‣ provides faster way to build and
share data apps
‣ was recently acquired by
Snowflake
‣ might be a nice addition to
Lightdash for interactive data
apps
Dr. Florian Wilhelm, 2023
WA L D S TA C K
Conclusion
The WALD stack combines 4 powerful
technologies that work very well
together.
The WALD stack is flexible enough for
many use-cases from analytics to
building a full-fledged end-to-end
data product at scale.
C H E E R S T O T H E C O M M U N I T Y
Credits & Resources
‣ python-snowpark-formula
Github repo by Hope Watson
from dbt labs
‣ kind support by Michael Gorkow
& Marko Cavar from Snowflake
‣ these awesome slides by
Michael Hofmann from inovex
© 2023
Thank you!
Dr. Florian Wilhelm
Head of Data Science
P y C o n D E / P y D a t a 2 0 2 3
inovex.de
florian.wilhelm@inovex.de
@inovexlife
@inovexgmbh
waldstack.org

More Related Content

What's hot

Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleDatabricks
 
Strata NY 2018: The deconstructed database
Strata NY 2018: The deconstructed databaseStrata NY 2018: The deconstructed database
Strata NY 2018: The deconstructed databaseJulien Le Dem
 
Data Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation CriteriaData Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation CriteriaScyllaDB
 
Data Engineer's Lunch #54: dbt and Spark
Data Engineer's Lunch #54: dbt and SparkData Engineer's Lunch #54: dbt and Spark
Data Engineer's Lunch #54: dbt and SparkAnant Corporation
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglyTyler Wishnoff
 
Data Modeling & Metadata for Graph Databases
Data Modeling & Metadata for Graph DatabasesData Modeling & Metadata for Graph Databases
Data Modeling & Metadata for Graph DatabasesDATAVERSITY
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling FundamentalsDATAVERSITY
 
Training Series - Build A Routing Web Application With OpenStreetMap, Neo4j, ...
Training Series - Build A Routing Web Application With OpenStreetMap, Neo4j, ...Training Series - Build A Routing Web Application With OpenStreetMap, Neo4j, ...
Training Series - Build A Routing Web Application With OpenStreetMap, Neo4j, ...Neo4j
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceDATAVERSITY
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureDmitry Anoshin
 
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...Spark Summit
 
The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...Pieter De Leenheer
 
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?Albert Hoitingh
 
Building Adaptive Systems with Wardley Mapping, Domain-Driven Design, and Tea...
Building Adaptive Systems with Wardley Mapping, Domain-Driven Design, and Tea...Building Adaptive Systems with Wardley Mapping, Domain-Driven Design, and Tea...
Building Adaptive Systems with Wardley Mapping, Domain-Driven Design, and Tea...Susanne Kaiser
 
REX sur une implantation SAFe - La complexité en TI et l'agilité d'entreprise
REX sur une implantation SAFe - La complexité en TI et l'agilité d'entrepriseREX sur une implantation SAFe - La complexité en TI et l'agilité d'entreprise
REX sur une implantation SAFe - La complexité en TI et l'agilité d'entrepriseEtienne Laverdière
 
Graph Thinking: Why it Matters
Graph Thinking: Why it MattersGraph Thinking: Why it Matters
Graph Thinking: Why it MattersNeo4j
 
Observability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineageObservability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineageDatabricks
 
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...Databricks
 

What's hot (20)

Architecting Agile Data Applications for Scale
Architecting Agile Data Applications for ScaleArchitecting Agile Data Applications for Scale
Architecting Agile Data Applications for Scale
 
Strata NY 2018: The deconstructed database
Strata NY 2018: The deconstructed databaseStrata NY 2018: The deconstructed database
Strata NY 2018: The deconstructed database
 
Data Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation CriteriaData Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation Criteria
 
Data Engineer's Lunch #54: dbt and Spark
Data Engineer's Lunch #54: dbt and SparkData Engineer's Lunch #54: dbt and Spark
Data Engineer's Lunch #54: dbt and Spark
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the Ugly
 
Data Modeling & Metadata for Graph Databases
Data Modeling & Metadata for Graph DatabasesData Modeling & Metadata for Graph Databases
Data Modeling & Metadata for Graph Databases
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
From Data Warehouse to Lakehouse
From Data Warehouse to LakehouseFrom Data Warehouse to Lakehouse
From Data Warehouse to Lakehouse
 
Training Series - Build A Routing Web Application With OpenStreetMap, Neo4j, ...
Training Series - Build A Routing Web Application With OpenStreetMap, Neo4j, ...Training Series - Build A Routing Web Application With OpenStreetMap, Neo4j, ...
Training Series - Build A Routing Web Application With OpenStreetMap, Neo4j, ...
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data Governance
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
 
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
Building a Data Warehouse for Business Analytics using Spark SQL-(Blagoy Kalo...
 
The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...
 
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
 
DMBOK and Data Governance
DMBOK and Data GovernanceDMBOK and Data Governance
DMBOK and Data Governance
 
Building Adaptive Systems with Wardley Mapping, Domain-Driven Design, and Tea...
Building Adaptive Systems with Wardley Mapping, Domain-Driven Design, and Tea...Building Adaptive Systems with Wardley Mapping, Domain-Driven Design, and Tea...
Building Adaptive Systems with Wardley Mapping, Domain-Driven Design, and Tea...
 
REX sur une implantation SAFe - La complexité en TI et l'agilité d'entreprise
REX sur une implantation SAFe - La complexité en TI et l'agilité d'entrepriseREX sur une implantation SAFe - La complexité en TI et l'agilité d'entreprise
REX sur une implantation SAFe - La complexité en TI et l'agilité d'entreprise
 
Graph Thinking: Why it Matters
Graph Thinking: Why it MattersGraph Thinking: Why it Matters
Graph Thinking: Why it Matters
 
Observability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineageObservability for Data Pipelines With OpenLineage
Observability for Data Pipelines With OpenLineage
 
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
 

Similar to WALD: A Modern & Sustainable Analytics Stack

How to Transform Into a Data-Driven Organization
How to Transform Into a Data-Driven OrganizationHow to Transform Into a Data-Driven Organization
How to Transform Into a Data-Driven OrganizationWarrenCruz3
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with SparkKrishna Sankar
 
Foundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureFoundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureInside Analysis
 
Puppet Camp Sydney 2014 - Evolving Design Patterns in AWS
Puppet Camp Sydney 2014 - Evolving Design Patterns in AWSPuppet Camp Sydney 2014 - Evolving Design Patterns in AWS
Puppet Camp Sydney 2014 - Evolving Design Patterns in AWSjohnpainter_id_au
 
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Denodo
 
Demystifying Data Virtualization: Why it’s Now Critical for Your Data Strategy
Demystifying Data Virtualization: Why it’s Now Critical for Your Data StrategyDemystifying Data Virtualization: Why it’s Now Critical for Your Data Strategy
Demystifying Data Virtualization: Why it’s Now Critical for Your Data StrategyDenodo
 
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataExclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataPentaho
 
Trivadis - Microsoft Transform your data estate with cloud, data and AI
Trivadis - Microsoft Transform your data estate with cloud, data and AITrivadis - Microsoft Transform your data estate with cloud, data and AI
Trivadis - Microsoft Transform your data estate with cloud, data and AITrivadis
 
Digital transformation with microsoft data and ai
Digital transformation with microsoft data and ai Digital transformation with microsoft data and ai
Digital transformation with microsoft data and ai MichaelRoenker
 
Predicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via HadoopPredicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via HadoopSkillspeed
 
The New Database Frontier: Harnessing the Cloud
The New Database Frontier: Harnessing the CloudThe New Database Frontier: Harnessing the Cloud
The New Database Frontier: Harnessing the CloudInside Analysis
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Denodo
 
Cloud Infrastructure: Changing Today's World
Cloud Infrastructure: Changing Today's WorldCloud Infrastructure: Changing Today's World
Cloud Infrastructure: Changing Today's WorldKirti Khanna
 
How to Build Cloud-based Microservice Environments with Docker and VoltDB
How to Build Cloud-based Microservice Environments with Docker and VoltDBHow to Build Cloud-based Microservice Environments with Docker and VoltDB
How to Build Cloud-based Microservice Environments with Docker and VoltDBVoltDB
 
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizonHadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizonDataWorks Summit/Hadoop Summit
 
Power of the Run Graph
Power of the Run GraphPower of the Run Graph
Power of the Run GraphVaticle
 
The Central Hub: Defining the Data Lake
The Central Hub: Defining the Data LakeThe Central Hub: Defining the Data Lake
The Central Hub: Defining the Data LakeEric Kavanagh
 
Process big data within an hour, with the OVH Public Cloud
Process big data within an hour, with the OVH Public CloudProcess big data within an hour, with the OVH Public Cloud
Process big data within an hour, with the OVH Public CloudOVHcloud
 

Similar to WALD: A Modern & Sustainable Analytics Stack (20)

How to Transform Into a Data-Driven Organization
How to Transform Into a Data-Driven OrganizationHow to Transform Into a Data-Driven Organization
How to Transform Into a Data-Driven Organization
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with Spark
 
Foundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information ArchitectureFoundation for Success: How Big Data Fits in an Information Architecture
Foundation for Success: How Big Data Fits in an Information Architecture
 
Puppet Camp Sydney 2014 - Evolving Design Patterns in AWS
Puppet Camp Sydney 2014 - Evolving Design Patterns in AWSPuppet Camp Sydney 2014 - Evolving Design Patterns in AWS
Puppet Camp Sydney 2014 - Evolving Design Patterns in AWS
 
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
 
Demystifying Data Virtualization: Why it’s Now Critical for Your Data Strategy
Demystifying Data Virtualization: Why it’s Now Critical for Your Data StrategyDemystifying Data Virtualization: Why it’s Now Critical for Your Data Strategy
Demystifying Data Virtualization: Why it’s Now Critical for Your Data Strategy
 
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR DataExclusive Verizon Employee Webinar: Getting More From Your CDR Data
Exclusive Verizon Employee Webinar: Getting More From Your CDR Data
 
Trivadis - Microsoft Transform your data estate with cloud, data and AI
Trivadis - Microsoft Transform your data estate with cloud, data and AITrivadis - Microsoft Transform your data estate with cloud, data and AI
Trivadis - Microsoft Transform your data estate with cloud, data and AI
 
Digital transformation with microsoft data and ai
Digital transformation with microsoft data and ai Digital transformation with microsoft data and ai
Digital transformation with microsoft data and ai
 
Predicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via HadoopPredicting Consumer Behaviour via Hadoop
Predicting Consumer Behaviour via Hadoop
 
The New Database Frontier: Harnessing the Cloud
The New Database Frontier: Harnessing the CloudThe New Database Frontier: Harnessing the Cloud
The New Database Frontier: Harnessing the Cloud
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
Cloud Infrastructure: Changing Today's World
Cloud Infrastructure: Changing Today's WorldCloud Infrastructure: Changing Today's World
Cloud Infrastructure: Changing Today's World
 
Data democratised
Data democratisedData democratised
Data democratised
 
What is CRATE?
What is CRATE?What is CRATE?
What is CRATE?
 
How to Build Cloud-based Microservice Environments with Docker and VoltDB
How to Build Cloud-based Microservice Environments with Docker and VoltDBHow to Build Cloud-based Microservice Environments with Docker and VoltDB
How to Build Cloud-based Microservice Environments with Docker and VoltDB
 
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizonHadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
 
Power of the Run Graph
Power of the Run GraphPower of the Run Graph
Power of the Run Graph
 
The Central Hub: Defining the Data Lake
The Central Hub: Defining the Data LakeThe Central Hub: Defining the Data Lake
The Central Hub: Defining the Data Lake
 
Process big data within an hour, with the OVH Public Cloud
Process big data within an hour, with the OVH Public CloudProcess big data within an hour, with the OVH Public Cloud
Process big data within an hour, with the OVH Public Cloud
 

More from Florian Wilhelm

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Unlocking the Power of Integer Programming
Unlocking the Power of Integer ProgrammingUnlocking the Power of Integer Programming
Unlocking the Power of Integer ProgrammingFlorian Wilhelm
 
Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!Florian Wilhelm
 
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...Florian Wilhelm
 
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...Florian Wilhelm
 
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...Florian Wilhelm
 
Uncertainty Quantification in AI
Uncertainty Quantification in AIUncertainty Quantification in AI
Uncertainty Quantification in AIFlorian Wilhelm
 
Performance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use casePerformance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use caseFlorian Wilhelm
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionFlorian Wilhelm
 
How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...Florian Wilhelm
 
Deep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle MarketplaceDeep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle MarketplaceFlorian Wilhelm
 
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...Florian Wilhelm
 
Declarative Thinking and Programming
Declarative Thinking and ProgrammingDeclarative Thinking and Programming
Declarative Thinking and ProgrammingFlorian Wilhelm
 
Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017Florian Wilhelm
 
PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19Florian Wilhelm
 
Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...Florian Wilhelm
 

More from Florian Wilhelm (16)

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Unlocking the Power of Integer Programming
Unlocking the Power of Integer ProgrammingUnlocking the Power of Integer Programming
Unlocking the Power of Integer Programming
 
Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!Forget about AI and do Mathematical Modelling instead!
Forget about AI and do Mathematical Modelling instead!
 
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
An Interpretable Model for Collaborative Filtering Using an Extended Latent D...
 
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
Honey I Shrunk the Target Variable! Common pitfalls when transforming the tar...
 
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
Matrix Factorization for Collaborative Filtering Is Just Solving an Adjoint L...
 
Uncertainty Quantification in AI
Uncertainty Quantification in AIUncertainty Quantification in AI
Uncertainty Quantification in AI
 
Performance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use casePerformance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use case
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
 
How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...How mobile.de brings Data Science to Production for a Personalized Web Experi...
How mobile.de brings Data Science to Production for a Personalized Web Experi...
 
Deep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle MarketplaceDeep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
Deep Learning-based Recommendations for Germany's Biggest Vehicle Marketplace
 
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
Deep Learning-based Recommendations for Germany's Biggest Online Vehicle Mark...
 
Declarative Thinking and Programming
Declarative Thinking and ProgrammingDeclarative Thinking and Programming
Declarative Thinking and Programming
 
Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017Which car fits my life? - PyData Berlin 2017
Which car fits my life? - PyData Berlin 2017
 
PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19PyData Meetup Berlin 2017-04-19
PyData Meetup Berlin 2017-04-19
 
Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...Explaining the idea behind automatic relevance determination and bayesian int...
Explaining the idea behind automatic relevance determination and bayesian int...
 

Recently uploaded

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdfChristopherTHyatt
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 

Recently uploaded (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

WALD: A Modern & Sustainable Analytics Stack

  • 1. © 2023 A Modern & Sustainable Analytics Stack W A L D PyConDE / PyData 2023, April 17th Florian Wilhelm
  • 2. Mathematical Modelling Modern Data Warehousing & Analytics Personalisation & RecSys Uncertainty Quantification & Causality Python Data Stack OSS Contributor & Creator of PyScaffold Dr. Florian Wilhelm • H E A D O F D ATA S C I E N C E FlorianWilhelm.info florian.wilhelm@inovex.de Florian Wilhelm @FlorianWilhelm
  • 3. ‣ Application Development (Web Platforms, Mobile Apps, Smart Devices and Robotics, UI/ UX design,Backend Services) ‣ Data Management and Analytics (Business Intelligence, Big Data, Searches, Data Science and Deep Learning, Machine Perception and Artificial Intelligence) ‣ Scalable IT-Infrastructures (IT Engineering, Cloud Services, DevOps, Replatforming, Security) ‣ Training and Coaching (inovex Academy) is an innovation and quality- driven IT project house with a focus on digital transformation. Using technology to inspire our clients. And ourselves. Berlin · Karlsruhe · Pforzheim · Stuttgart · München · Köln · Hamburg · Erlangen www.inovex.de
  • 4. V S Data Warehouse Data Lake V S ETL ELT &
  • 5. C O M PA R I S O N Data Lake vs Data Warehouse Data Lake Data Warehouse Data Structure Raw Processed Purpose of Data Not yet determined Currently in use Users Data scientists / engineers Business professionals Accessibility Highly accessible and quick to update More complicated and costly to make changes
  • 6. C O M PA R I S O N ETL — Extract, Transform, Load S TA G I N G A R E A I N D ATA L A K E T R A N S F O R M S Q L R D B M S N O S Q L D B M S X M L F I L E S S A A S P L AT F O R M E X T R A C T D ATA WA R E H O U S E L O A D A N A LY T I C S A N A L Y S E
  • 7. C O M PA R I S O N ELT — Extract, Load, Transform D ATA WA R E H O U S E T R A N S F O R M S Q L R D B M S N O S Q L D B M S X M L F I L E S S A A S P L AT F O R M E X T R A C T L O A D A N A LY T I C S A N A L Y S E
  • 8. P R O G R E S S The Evolution of Databricks & Snowflake D ATA L A K E & E T L D ATA WA R E H O U S E & E LT
  • 9. P R O G R E S S The Evolution of Databricks & Snowflake D ATA B R I C K S L A K E H O U S E S N OW F L A K E D ATA C L O U D U N I F I C AT I O N O F D ATA WA R E H O U S E & L A K E
  • 10. W A L D S T A C K
  • 11. W A L D S T A C K WA L D S TA C K A Modern & Sustainable Analytics Stack Warehouse like Snowflake, Google BigQuery, Databricks Airbyte for Integration of hundreds of data sources Lightdash for Business Intelligence DBT for data transformations
  • 12. WA L D S TA C K WALD based on Google Big Query ‣ fully managed enterprise data warehouse ‣ only available on Google Cloud Platform ‣ uses dbt-bigquery to run Python UDFs with Dataproc using PySpark
  • 13. WA L D S TA C K WALD based on Databricks ‣ uses Databrick’s SQL warehouse ‣ available on all major cloud providers ‣ uses dbt-databricks to run Python UDFs with PySpark
  • 14. WA L D S TA C K WALD based on Snowflake ‣ comes with all the benefits of Snowflake, e.g. governance, easy collaboration, security, near zero maintenance ‣ runs on all major cloud providers ‣ tons of great documentation ‣ the easiest to use and full- featured option to get started with WALD
  • 15. WA L D S TA C K Modern Analytics Stack DATA CLOUD DATA SOURCES E X T R A C T L O A D T R A N S F O R M A n a l y s e SNOWPARK
  • 16. WA L D S TA C K Example Use-Case >find all code & details under https://waldstack.org
  • 17. F O R M U L A 1 Analysis and Position Prediction 1. Upload data using Airbyte to Snowflake 2. Use dbt for some basic analysis 3. Use dbt with Snowpark for some ML to predict future position 4. Visualise our results with Lightdash
  • 18. F O R M U L A 1 Rough Overview of the Data ‣ dimensions: drivers, circuits, constructors, races ‣ facts: lap_times, pit_tops, results
  • 19. F O R M U L A 1 Rough Overview of the Data ‣ dimensions: drivers, circuits, constructors, races ‣ facts: lap_times, pit_tops, results
  • 20. F O R M U L A 1 Rough Overview of the Data ‣ dimensions: drivers, circuits, constructors, races ‣ facts: lap_times, pit_tops, results
  • 21. F O R M U L A 1 Rough Overview of the Data ‣ dimensions: drivers, circuits, constructors, races ‣ facts: lap_times, pit_tops, results
  • 22. F O R M U L A 1 Rough Overview of the Data ‣ dimensions: drivers, circuits, constructors, races ‣ facts: lap_times, pit_tops, results
  • 23. F O R M U L A 1 Rough Overview of the Data ‣ dimensions: drivers, circuits, constructors, races ‣ facts: lap_times, pit_tops, results
  • 24. F O R M U L A 1 Rough Overview of the Data ‣ dimensions: drivers, circuits, constructors, races ‣ facts: lap_times, pit_tops, results
  • 25. F O R M U L A 1 Rough Overview of the Data ‣ dimensions: drivers, circuits, constructors, races ‣ facts: lap_times, pit_tops, results
  • 26. S N OW F L A K E Snowflake Setup ‣ activate the Anaconda Packages (Preview Feature) ‣ create a warehouse to be used for dbt ‣ that’s it already!
  • 27. A I R B Y T E Ingesting the Data with Airbyte Simply select ‣ file as source ‣ Snowflake as destination and hit Launch
  • 28. P Y T H O N + S N OW PA R K Setup Python/Snowpark for dbt ‣ just install along dbt also ‣ dbt-snowflake ‣ snowflake-snowpark- python ‣ snowflake-connector- python ‣ and configure Snowflake in your profiles.yml
  • 29. P Y T H O N + S N OW PA R K Training a Python Model in dbt/Snowpark
  • 30. D B T / S N OW PA R K Predicting with a trained Model in dbt/Snowpark
  • 31. L I G H T D A S H Visualisation with Lightdash Lightdash pulls dimensions and metrics from dbt. Lightdash ‣ runs ad-hoc queries ‣ creates charts from queries ‣ creates dashboards from charts ‣ organises everything in spaces What are the avg lap times split by year and grand prix? M E T R I C D I M E N S I O N S
  • 32. L I G H T D A S H Main View of Lightdash
  • 33. L I G H T D A S H Analysing the Average Lap Times
  • 34. L I G H T D A S H Example of a Simple Dashboard
  • 35. F O R M U L A 1 Analysis and Position Prediction
  • 37. O N M O R E T H I N G O N … Streamlit ‣ provides faster way to build and share data apps ‣ was recently acquired by Snowflake ‣ might be a nice addition to Lightdash for interactive data apps
  • 38. Dr. Florian Wilhelm, 2023 WA L D S TA C K Conclusion The WALD stack combines 4 powerful technologies that work very well together. The WALD stack is flexible enough for many use-cases from analytics to building a full-fledged end-to-end data product at scale.
  • 39. C H E E R S T O T H E C O M M U N I T Y Credits & Resources ‣ python-snowpark-formula Github repo by Hope Watson from dbt labs ‣ kind support by Michael Gorkow & Marko Cavar from Snowflake ‣ these awesome slides by Michael Hofmann from inovex
  • 40. © 2023 Thank you! Dr. Florian Wilhelm Head of Data Science P y C o n D E / P y D a t a 2 0 2 3 inovex.de florian.wilhelm@inovex.de @inovexlife @inovexgmbh waldstack.org