Speeding Time to Insight with a
Modern ELT Approach:
Fivetran + Databricks SQL Service + dbt
Introductions
Sean Spediacci
Fivetran, Sr. Product Marketer
Amy Deora
Fishtown Analytics (keepers of dbt), Head of
Alliances
In 2010, the focus of many analytics teams was on
infrastructure, compute and storage for large data sets
● “How do we build / scale our ETL
infrastructure?”
● “How will we control storage
costs?”
● “How will we design a data
warehouse that is performant?”
“Big Data”
The landscape has shifted
Single Node Databases
● Constrained compute (on-premises)
● Perpetual licensing
● ETL outside the database
Scalable Databases
● Separation of storage + compute (on-premises)
● Subscription licensing
● Complex pipelines required for large datasets
Modern Cloud Destinations
● Auto-scaling cloud DBMS
● Usage-based licensing
● Separation of storage and compute allows for agile data pipelines
Pipeline architecture is adapting to modern cloud technology.
ETL → ELT
[Diagram: Sources → Extraction Process → Raw Data → Transformation (Modeling) Process → Clean, Usable Datasets]
Moving Data Into the Warehouse (EL) is a
highly automatable process.
Data Transformation is different for
every company - it cannot be fully
automated.
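The split between loading and transforming can be sketched in a few lines. This is a minimal illustration only, using Python's built-in sqlite3 as a stand-in warehouse; the table names raw_orders and orders_clean and the sample rows are hypothetical, not taken from the talk.

```python
import sqlite3


def run_elt(raw_rows):
    """Load raw rows untouched (EL), then transform with SQL inside the database (T)."""
    conn = sqlite3.connect(":memory:")  # stand-in for a cloud warehouse

    # Extract + Load: land the data as-is, with no cleaning on the way in.
    conn.execute(
        "CREATE TABLE raw_orders (id INTEGER, status TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)

    # Transform: the company-specific logic lives in SQL, next to the data.
    conn.execute(
        """
        CREATE TABLE orders_clean AS
        SELECT id,
               LOWER(TRIM(status))  AS status,
               amount_cents / 100.0 AS amount
        FROM raw_orders
        WHERE amount_cents IS NOT NULL
        """
    )
    rows = conn.execute(
        "SELECT id, status, amount FROM orders_clean ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


# Hypothetical raw data with inconsistent casing, padding and a missing amount.
raw = [(1, " Shipped ", 1250), (2, "PENDING", None), (3, "shipped", 300)]
clean = run_elt(raw)
print(clean)
```

The load step is generic and could be reused for any source, while the SELECT statement is the part each company would write for itself.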
The Modern Data Stack
Moving the data transformation step to the
warehouse has another benefit - democratizing
access to data.
The data transformation step is now more accessible to more
members of the data team - from BI developers to data
scientists.
No more waiting for an ETL process that happens out of sight.
About Fivetran
➔ Automatic Data Updates (DML)
➔ Automatic Schema Migrations (DDL)
➔ Automated Recovery from Failure (Idempotent)
➔ Micro-batched architecture
➔ Extensible Cloud Functions
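To make "idempotent" recovery concrete: a load is idempotent when replaying the same micro-batch after a failure leaves the destination in the same state. The sketch below shows the general idea with sqlite3 and a primary-key upsert. It is an assumption-laden illustration, not Fivetran's implementation; the customers table and apply_batch helper are hypothetical.

```python
import sqlite3


def apply_batch(conn, batch):
    """Write one micro-batch keyed on the primary key.

    INSERT OR REPLACE overwrites any existing row with the same id,
    so applying the same batch twice gives the same final table.
    """
    conn.executemany(
        "INSERT OR REPLACE INTO customers (id, email) VALUES (?, ?)", batch
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")

batch = [(1, "a@example.com"), (2, "b@example.com")]
apply_batch(conn, batch)
apply_batch(conn, batch)  # replay, as if the first attempt had been interrupted

rows = conn.execute("SELECT id, email FROM customers ORDER BY id").fetchall()
print(rows)
```

Because the write is keyed rather than appended, a retry cannot create duplicate rows, which is what makes automated recovery safe.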
About dbt
dbt is an open source transform tool that allows
anyone comfortable with SQL to author their own data
pipelines
● Users write SQL with “super powers” from Jinja, a Python templating language (e.g., loops, macros, local variables).
● Wraps the right DDL, DML around your SQL to materialize data
models in any warehouse.
● Infers the data lineage graph (DAG) as you code.
● Supports multiple environments, and git-based version control.
● Integrates testing into your pipeline.
● Automates documentation
dbt allows data engineers/analysts to work like
software developers
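Two of the points above, inferring the lineage graph and wrapping DDL around a SELECT, can be sketched roughly as follows. This is a toy illustration rather than dbt's actual internals: the model names, the regular expression, and the build_order and materialize helpers are hypothetical. Only the {{ ref('...') }} syntax is taken from dbt itself.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Matches dbt-style references such as {{ ref('stg_orders') }}.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")

# Hypothetical models: each one is just a SELECT statement.
models = {
    "stg_orders": "select id, customer_id from raw.orders",
    "stg_customers": "select id, name from raw.customers",
    "customer_orders": (
        "select c.name, o.id as order_id "
        "from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c on c.id = o.customer_id"
    ),
}


def build_order(models):
    """Infer dependencies from ref() calls and return a valid run order."""
    graph = {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


def materialize(name, sql):
    """Resolve refs to table names and wrap the SELECT in illustrative DDL."""
    compiled = REF_PATTERN.sub(lambda m: m.group(1), sql)
    return f"create or replace table {name} as\n{compiled}"


order = build_order(models)
print(order)
print(materialize("customer_orders", models["customer_orders"]))
```

The author only writes SELECT statements; the run order and the surrounding DDL fall out of the references between models.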
The modern data stack - with Fivetran, Delta Lake,
Databricks SQL Service and dbt
● Fivetran automates the integration with operational
systems with zero configuration required
● dbt provides a flexible data modeling environment with
best practices from DevOps.
● Connecting directly to the new Databricks SQL service
makes building your pipeline fast and easy.
Speeding time to insight with the modern stack
1. Ingest data via a Fivetran automatic connector.
2. Connect dbt to start transforming your data in-warehouse.
3. Use a Fivetran dbt package to jump-start your modeling process.
Demo: Instant Insight from GitHub data
Feedback
Your feedback is important to us.
Don’t forget to rate
and review the sessions.
