Welcome
Delivering rapid-fire analytics with Snowflake and Tableau
#data19
Harald Erb
Sales Engineer, Central Europe
Agenda
What is Snowflake?
Demo #1: Fresh Data for Tableau via Snowpipe
Demo #2: Monitor Account Utilization in Snowflake
Take away & Call to Action
What is Snowflake?
© 2019 Snowflake Computing Inc. All Rights Reserved
THE SNOWFLAKE TIMELINE
SQL Data Warehouse built for the cloud
Founded 2012 by industry veterans
Over 2,200 active customers
Raised over $950M in venture funding from leading investors
First customers 2014, general availability 2015
Gartner and Forrester “Leader”
Queries processed in Snowflake per day: > 60,000,000
Rows in largest single table: 68,000,000,000,000
Largest number of tables in a single DB: 200,000
Most data for a single customer: > 40 PB
Most users for a single customer: > 10,000
Benoit Dageville, Thierry Cruanes
KNOWN CHALLENGES…
Complexity: Manage both infrastructure and data
Limited Scalability: Can’t support all data, users and workloads
Diversity: Unable to consolidate siloed datasets
Inadequate Elasticity: Stuck with rigid, inflexible architectures
Rigid Cost: Forced to keep the lights on 24/7
NEW ARCHITECTURE FOR DATA WAREHOUSING
Multi-Cluster, Shared Data, in the Cloud
Traditional Architectures
Shared-Disk (SMP): Cluster of nodes with a single shared disk. Limited by disk size and I/O throughput (traditional DWs based on RDBMS).
Shared-Nothing (MPP): Cluster of nodes, each of which has its own disk, with data distributed across the nodes. Not elastic, because data must be redistributed when the cluster is resized (most MPP DWs, Hadoop).
Snowflake
Multi-Cluster, Shared Data: Multiple clusters, shared data. Compute power and storage scale independently of each other.
MULTI CLUSTER, SHARED DATA ARCHITECTURE
Cloud Storage Layer
Instant, automatic Scalability & Elasticity
Compute Layer
• Multiple Warehouses without resource contention
• Resize Warehouse instantly (scale up/down)
• Warehouse scales out automatically and elastically
Centralized Storage
REAL-WORLD USE CASE
Continuous Loading (4TB/day) from S3, <5min SLA: Virtual Warehouse, Medium
ETL & Maintenance: Virtual Warehouse, Large
Reporting (Segmented): Virtual Warehouse, 2X-Large
Interactive Dashboard (50% < 1s, 85% < 2s, 95% < 5s): Virtual Warehouse, Auto Scale – X-Large x 5
Prod DB: 3+ PB of raw data; 1.5 PB data stored in Database (8x compression ratio); 25M micro partitions
THE SNOWFLAKE DIFFERENCE
Performance: Multi petabyte-scale, up to 200x faster performance and 1/10th the cost
Concurrency: Multiple groups access data simultaneously with no performance degradation
Simplicity: Fully managed with a pay-as-you-go model. Works on any data
NEW FEATURES RELEASED
Sources:
snowflake.com/about/press-and-news
data.solita.fi/a-curated-list-of-new-snowflake-features-released-at-snowflake-summit-2019
> Snowflake Data Pipelines
• Auto-Ingest
• Streams and Tasks
• Snowflake Connector for Kafka
> Core Data Warehouse
• New web-based SQL Editor through acquisition of tech company Numeracy
• Materialized Views
• JavaScript Stored Procedures, hierarchical SQL
• External Tables, Hive Metastore integration, Credential-less external stages
> Multi-cloud strategy
• Snowflake on Google Cloud is set to launch in preview in Fall 2019
• Snowflake announced Database Replication and Database Failover. If a disaster occurs in one region or on one cloud service, businesses can immediately access and control Snowflake data they have replicated in a different region or cloud service.
> Secure Data Sharing
• Snowflake Data Exchange
Demo #1: Fresh Data for Tableau via Snowpipe
DEMO SCENARIO
DATA SCHEMA
Snowflake Web UI – SQL Editor
3rd Party SQL Editor (DBeaver)
3 years of historical data
~184,000,000 rows
DATA ARCHITECTURE
Data Feed Options with Snowflake
Data Sources: Corporate Applications, Databases, Cloud Services, Web, Devices
Feed tools: Extract, Load & Transform Tools (ELT); Extract, Transform & Load Tools (ETL); Database Migration Services; Data Flow Tools
Data Lake (Azure Blob, Amazon S3): Data Lake feeds via Snowpipe; Data Lake Live Query
Formats: Tables, CSV, JSON, XML, Avro, Parquet
Snowflake DW with Virtual Warehouses
• Snowpipe processes “messages” or files, structured or semi-structured
• Snowpipe is designed for continuous ingest, typically < 1 min latency
• Potential downstream ELT, e.g. hourly
• Time Travel can provide static vs. dynamic views
• (Future: downstream pipe processing, direct streaming connectivity)
creativecommons.tankerkoenig.de
LAST DATA LOAD = JUNE 7
LOADING ADDITIONAL FILES INTO AWS S3
USING AWS NOTIFICATIONS FOR SNOWPIPE
CREATING A SNOWPIPE TO LOAD DATA FROM S3
NEW DATA AUTOMATICALLY LOADED FROM S3
FRESH DATA READY FOR TABLEAU!
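The pipe-creation step in the demo sequence above can be sketched in a few lines. This is a minimal illustration, not the statement used in the demo: it only builds the SQL text for Snowflake's `CREATE PIPE ... AUTO_INGEST = TRUE AS COPY INTO ...` form. The names `fuel_pipe`, `fuel_prices` and `fuel_stage` are hypothetical, and running the statement needs a Snowflake connection that is not shown here.

```python
# Sketch only: builds the DDL text for an auto-ingest Snowpipe.
# All object names passed in are hypothetical placeholders.

ALLOWED_FILE_TYPES = {"CSV", "JSON", "AVRO", "PARQUET", "XML", "ORC"}


def _check_identifier(name: str) -> str:
    """Accept only simple identifiers (letters, digits, '_', '.')."""
    if not name.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected identifier: {name!r}")
    return name


def build_snowpipe_ddl(pipe: str, table: str, stage: str,
                       file_type: str = "CSV") -> str:
    """Return a CREATE PIPE statement that copies staged files into a table."""
    file_type = file_type.upper()
    if file_type not in ALLOWED_FILE_TYPES:
        raise ValueError(f"unsupported file type: {file_type!r}")
    return (
        f"CREATE PIPE {_check_identifier(pipe)} AUTO_INGEST = TRUE AS\n"
        f"  COPY INTO {_check_identifier(table)}\n"
        f"  FROM @{_check_identifier(stage)}\n"
        f"  FILE_FORMAT = (TYPE = '{file_type}')"
    )


if __name__ == "__main__":
    print(build_snowpipe_ddl("fuel_pipe", "fuel_prices", "fuel_stage"))
```

With auto-ingest, the S3 bucket's event notification is typically pointed at the queue that Snowflake reports for the pipe (the `notification_channel` column of `SHOW PIPES`), so that new files trigger a load without a scheduled job.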
Demo #2: Monitor Account Utilization in Snowflake
PAY FOR WHAT YOU USE… DOWN TO THE SECOND
[Charts: workload from morning through noon to night for ETL and Processing, Reporting, Ad-hoc Analytics, and Data Scientist]
Snowflake Web UI – Account Billing & Usage
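The per-second billing idea on this slide can be made concrete with a small calculation. The sketch below assumes Snowflake's published model as commonly documented: credits per hour double with each warehouse size, and each start or resume is billed for at least 60 seconds. Those rates and the minimum should be checked against current pricing documentation; the price per credit is a hypothetical parameter supplied by the caller.

```python
from typing import Iterable

# Assumed credits consumed per hour for each warehouse size.
CREDITS_PER_HOUR = {
    "X-SMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "X-LARGE": 16,
    "2X-LARGE": 32,
}

# Assumed minimum billed duration each time a warehouse starts or resumes.
MIN_BILLED_SECONDS = 60


def billed_credits(size: str, run_seconds: Iterable[float]) -> float:
    """Credits for a list of run durations, one entry per start/resume."""
    rate = CREDITS_PER_HOUR[size.upper()]
    total_seconds = 0.0
    for seconds in run_seconds:
        if seconds < 0:
            raise ValueError("run duration cannot be negative")
        # Per-second billing, but never less than the minimum per run.
        total_seconds += max(seconds, MIN_BILLED_SECONDS)
    return rate * total_seconds / 3600


def estimated_cost(size: str, run_seconds: Iterable[float],
                   price_per_credit: float) -> float:
    """Cost estimate; price_per_credit is a hypothetical contract price."""
    return billed_credits(size, run_seconds) * price_per_credit


if __name__ == "__main__":
    # A reporting warehouse that ran three times during the day.
    runs = [30, 900, 3600]
    print(billed_credits("MEDIUM", runs))
    print(estimated_cost("MEDIUM", runs, price_per_credit=2.0))
```

A warehouse that is suspended between the morning, noon and night peaks only accrues credits for the seconds it actually runs, which is the contrast the workload charts are drawing.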
Scott Smith's blog, incl. download of sample workbook: tableau.com/about/blog/2019/5/monitor-understand-snowflake-account-usage
CONNECT TO SNOWFLAKE DIRECTLY
AND ANALYZE/FORECAST ACCOUNT UTILIZATION
Take away & Call to Action
LEARN MORE: BEST PRACTICES
E-Paper Download: resources.snowflake.com/ebooks/best-practices-for-using-tableau-with-snowflake
E-Book Content:
• Creating efficient Tableau workbooks
• Connecting to Snowflake
• Working with semi-structured data
• Working with Snowflake Time Travel
• Working with Snowflake Data Sharing
• Implementing role-based security
• Using custom aggregations
• Scaling Snowflake warehouses
• Caching
• Other performance considerations
• Measuring performance
TRY: TABLEAU & SNOWFLAKE QUICK START
This Quick Start deploys Tableau Server in the Amazon Web Services (AWS) Cloud and configures it to work with Snowflake in about 30 minutes.
More information: aws.amazon.com/quickstart/architecture/tableau-snowflake/
SESSION TAKE AWAY
What you don’t have to worry about when working with the Snowflake Cloud Data Warehouse
Installing, provisioning and maintaining hardware and software:
• Snowflake is a cloud-built DW as a service.
• Just create an account and load some data.
• You can then just connect from Tableau and start querying.
Working out the capacity of your DW:
• Snowflake is a fully elastic platform, so it can scale to handle all of your data and all of your users.
• Just size your compute (virtual warehouses) up and down on the fly to handle peaks and lulls in your data usage.
• Turn your warehouses completely off to save money when they are not in use.
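Sizing a warehouse up or down and switching it off are single SQL statements in Snowflake. As a rough sketch, the helpers below only compose the `ALTER WAREHOUSE` text; the warehouse name `reporting_wh` is hypothetical, and sending the statements to Snowflake (for example through one of its connectors) is left out.

```python
# Sketch only: composes ALTER WAREHOUSE statements as text.
# The warehouse name used in the example is a hypothetical placeholder.

VALID_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE", "XXLARGE")


def _check_name(name: str) -> str:
    """Accept only simple identifiers (letters, digits, '_')."""
    if not name.replace("_", "").isalnum():
        raise ValueError(f"unexpected warehouse name: {name!r}")
    return name


def resize_warehouse_sql(name: str, size: str) -> str:
    """Statement that scales a warehouse up or down."""
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return f"ALTER WAREHOUSE {_check_name(name)} SET WAREHOUSE_SIZE = '{size}'"


def suspend_warehouse_sql(name: str) -> str:
    """Statement that switches a warehouse off immediately."""
    return f"ALTER WAREHOUSE {_check_name(name)} SUSPEND"


def auto_suspend_sql(name: str, idle_seconds: int) -> str:
    """Statement that suspends after idle time and resumes on the next query."""
    if idle_seconds <= 0:
        raise ValueError("idle_seconds must be positive")
    return (
        f"ALTER WAREHOUSE {_check_name(name)} "
        f"SET AUTO_SUSPEND = {idle_seconds} AUTO_RESUME = TRUE"
    )


if __name__ == "__main__":
    print(resize_warehouse_sql("reporting_wh", "large"))  # before the peak
    print(resize_warehouse_sql("reporting_wh", "small"))  # after the peak
    print(auto_suspend_sql("reporting_wh", 300))
    print(suspend_warehouse_sql("reporting_wh"))
```

Setting `AUTO_SUSPEND` together with `AUTO_RESUME` is usually the simpler choice for dashboard workloads, since nobody has to remember to switch the warehouse off.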
Learning new tools and a new query language:
• Snowflake is a fully ANSI SQL-compliant DW → existing skills carry over, and tools such as Tableau connect easily.
• Snowflake provides connectors for ODBC, JDBC, Python, Spark and Node.js.
• Even semi-structured data can be accessed via SQL.
Optimizing and maintaining your data:
• Snowflake is a highly scalable, columnar data platform that lets users run analytic queries quickly and easily.
• You do not need to index or distribute data across partitions; the platform manages this transparently.
• Snowflake also provides inherent data protection capabilities, so there is no need to worry about snapshots, backups or other administrative tasks.
Please complete the session survey from the My Evaluations menu in your Tableau Conference Europe 2019 app
Thank you!
harald.erb@snowflake.com