© 2023 Snowflake Inc. All Rights Reserved
FROM RAW DATA TO
INTERACTIVE DATA APP!
Powered by Snowpark Python
Challenges in Developing Data Pipelines
- Troubleshooting & debugging failed jobs
  - Multi-page stack trace
- Capacity management & resource sizing
- Setting up Infrastructure and Configs
  - Executor memory
  - Driver memory
  - # of executors
- Z-ordering, V-ordering, ABC-ordering
- Partitioning, Bucketing, Salting
ENTER SNOWPARK
Snowpark for Python
PYTHON • JAVA • SCALA
CLIENT SIDE LIBRARIES: DataFrame API
SERVER SIDE RUNTIMES: UDFs, Stored Procedures
Warehouses (Standard & Snowpark-Optimized)
Snowpark: Secure Deployment & Processing Of Non-SQL Code
CLIENT SIDE LIBRARIES
- DataFrame API: df.filter(df.state == 'WA') -> Query Translator -> SQL Query
- Python Functions & SProcs: @udf def detect_fraud() -> Object Serializer -> Python Bytecode
- Snowflake Connector for Python
SERVER SIDE RUNTIMES
- Processing Engine: SQL Engine, Python Secure Sandbox
- Built-in Anaconda Packages & more
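The DataFrame API path on this slide (a DataFrame call goes through a query translator and reaches the engine as a SQL query) can be illustrated with a toy model. This is a minimal sketch in plain Python, not Snowpark itself: ToyColumn, ToyDataFrame, to_sql and the TRANSACTIONS table are hypothetical names made up for the illustration, and real Snowpark translation is far more involved.

```python
# Toy illustration only. ToyColumn and ToyDataFrame are hypothetical names,
# not Snowpark classes. They mimic the idea that DataFrame calls are recorded
# lazily and turned into one SQL string when results are needed.


class ToyColumn:
    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        # A comparison builds a SQL predicate string instead of returning a bool.
        quoted = "'" + str(value).replace("'", "''") + "'"
        return f"{self.name} = {quoted}"


class ToyDataFrame:
    def __init__(self, table, predicates=()):
        self.table = table
        self.predicates = tuple(predicates)

    def __getattr__(self, name):
        # Only called for unknown attributes, so df.state becomes a column.
        if name.startswith("_"):
            raise AttributeError(name)
        return ToyColumn(name.upper())

    def filter(self, predicate):
        # Lazy: returns a new frame and executes nothing.
        return ToyDataFrame(self.table, self.predicates + (predicate,))

    def to_sql(self):
        sql = f"SELECT * FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(self.predicates)
        return sql


df = ToyDataFrame("TRANSACTIONS")  # hypothetical table name
wa = df.filter(df.state == "WA")
print(wa.to_sql())  # SELECT * FROM TRANSACTIONS WHERE STATE = 'WA'
```

The point of the sketch is the laziness: filter() only records the predicate, so the work happens where the data lives once the SQL is sent.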
DATA STREAMING
WITH
DYNAMIC TABLES
Streaming in Snowflake
PAIN POINTS: BEFORE
- COMPLEXITY: Managing dependencies, scheduling, and orchestration
- INEFFICIENCY: Rebuilding tables completely, no incremental materialization
- MANAGEABILITY: Brittle pipelines unable to react to changes upstream
BENEFITS: AFTER
- SIMPLIFIED PIPELINES: Native support for streaming and continuous batch data pipelines. Easy, declarative semantics and no orchestration required, no infrastructure management
- COST EFFECTIVE: Streaming ingest as much as 50% cheaper than files. Continuous incremental processing reduces wasted compute
- NATIVE TO DATA CLOUD: Expanding ecosystem in the Data Cloud with consistent and strong security, governance, and scalability
Streaming ≠ Instantaneous
[Chart: value to business (VALUE) against time between event creation and action (TIME: 1 sec, 1+ minutes, 6+ hours). Labeled regions: SUMMIT OF NOW, PEAK OF SOON AFTER, MOUNTAIN OF WISDOM, VALLEY OF IRRELEVANCE.]
Streaming ≠ Instantaneous
[Chart: cost to business (COST) against time between event creation and action (TIME: 1 sec, 1+ minutes, 6+ hours). Labeled regions: SUMMIT OF NOW, PEAK OF SOON AFTER, MOUNTAIN OF WISDOM, VALLEY OF IRRELEVANCE. Annotations: HIGH COST, LOW RETURN; LOW COST, UNTAPPED POTENTIAL.]
Streaming Pipelines at a Glance
INGEST: Apps & Services, OLTP, IoT, Kafka -> Rows, Files -> Snowpipe Auto-Ingest & Streaming
TRANSFORM: Tables, Dynamic Tables* (Python, Java, Scala, SQL)
DELIVER: Sharing, Replication, Native Apps, Worksheets, Dashboards, Serving, Unload
STORAGE, SCHEDULING, PROCESSING, GOVERNANCE
Release stages: In Dev, Private, Public*, GA
Ingestion Options
COPY
- Efficient bulk loading of files
- Control your own compute resources
- Deterministic latency
SNOWPIPE
- Continuous ingestion of files
- Serverless
- Median latency ~30s
SNOWPIPE STREAMING
- Near real-time ingestion of rowsets
- Client application needed
- < 5s median latency
SNOWPIPE: FILES & STREAMING
Sources: APPS & SERVICES, OLTP
BATCH (Files) -> COPY & Snowpipe
STREAMING (Rowsets, Kafka Topics) -> Snowpipe Streaming* & Kafka Connector
Destinations: BUSINESS INTELLIGENCE, MACHINE LEARNING, SHARING
SNOWPIPE
• Designed for batched rowsets as files
• Auto-scaled ingestion (10M files/10TB per hr)
• Deduplication with file tracking
SNOWPIPE STREAMING
• For rowsets with variable arrival frequency: insertRows()
• Focus on lower latency & cost
• Ordered ingestion within a channel
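Ordered ingestion within a channel is what lets a client resume safely after a restart. The sketch below is a plain-Python stand-in, not the Snowpipe Streaming SDK: ToyChannel, insert_rows, committed_offset and replay are hypothetical names, and the integer offset per batch is an assumption made for the illustration. It shows how tracking the last committed offset lets a replayed batch be skipped instead of ingested twice.

```python
# Toy illustration only: an in-memory stand-in for one ingestion channel.
# ToyChannel and its methods are hypothetical, not the Snowpipe Streaming SDK.


class ToyChannel:
    def __init__(self):
        self.rows = []
        self.committed_offset = -1  # nothing ingested yet

    def insert_rows(self, rows, offset_token):
        # Batches at or below the committed offset were already ingested,
        # so a replay after a client restart is skipped rather than duplicated.
        if offset_token <= self.committed_offset:
            return False
        self.rows.extend(rows)
        self.committed_offset = offset_token
        return True


def replay(channel, batches):
    """Send (offset, rows) batches in order and count how many were accepted."""
    accepted = 0
    for offset, rows in batches:
        if channel.insert_rows(rows, offset):
            accepted += 1
    return accepted


batches = [(0, [{"id": 1}]), (1, [{"id": 2}]), (2, [{"id": 3}])]
channel = ToyChannel()

first_run = replay(channel, batches[:2])  # client stops after two batches
second_run = replay(channel, batches)     # restart replays everything

print(first_run, second_run, [r["id"] for r in channel.rows])
```

In the second run only the third batch is accepted, so each row lands once even though two batches were sent twice.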
Streaming Use Cases
KAFKA / KINESIS SOURCES
- Use existing event hubs to source data
- Flexible latency-cost profiles
- Run transformations with all reference data instead of just single row transforms (ELT & ETL)
SECURITY & LOG ANALYTICS
- Powered by Snowflake apps
- Add full power of Snowflake analytics from day 1
- One place to query latest window of data & full history + reference data
- Proprietary (ISV-built) pipelines for continuous analysis
IOT / DEVICE LOGS
- Aggregated logs from devices
- No need to add event hubs if not needed
- Simple post-ingestion cleanup
CONNECTORS
- Ingest CDC streams with lower latency
- Ensure exactly-once semantics
- Sourced from OLTP DBs, SaaS apps
- Serverless so no clusters / stages to manage
Dynamic Tables Overview
CREATE DYNAMIC TABLE <name>
TARGET_LAG = <duration>
WAREHOUSE = <warehouse_name>
AS <select>
SELECT * FROM <name>
Store Results
Automatic Refreshes
Any Query!
NEW TABLE TYPE THAT
AUTOMATICALLY AND CONTINUOUSLY
MATERIALIZES THE RESULTS OF A QUERY
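To make the template above concrete, here is a small sketch that fills it in from Python. The helper name dynamic_table_ddl and every object name in the example (AD_SPEND_BY_CHANNEL, DEMO_WH, RAW_AD_SPEND, the channel and spend columns) are hypothetical; only the statement shape comes from the slide, with the lag written as a quoted duration such as '1 minute'.

```python
def dynamic_table_ddl(name, target_lag, warehouse, select_sql):
    """Fill the CREATE DYNAMIC TABLE template from the slide.

    Plain string formatting for illustration; it does no identifier
    validation, so it should only be fed trusted values.
    """
    return (
        f"CREATE DYNAMIC TABLE {name}\n"
        f"  TARGET_LAG = '{target_lag}'\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"  AS {select_sql}"
    )


# All object names below are made up for the example.
ddl = dynamic_table_ddl(
    name="AD_SPEND_BY_CHANNEL",
    target_lag="1 minute",
    warehouse="DEMO_WH",
    select_sql=(
        "SELECT channel, SUM(spend) AS total_spend "
        "FROM RAW_AD_SPEND GROUP BY channel"
    ),
)
print(ddl)
```

With a Snowpark session at hand, a string like this could be run with session.sql(ddl).collect(); after that the table is queried like any other with SELECT * FROM AD_SPEND_BY_CHANNEL.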
Dynamic Tables Overview
CONSISTENTLY FAST TO QUERY
- Immediate results
- Freshness within LAG
- Snapshot isolation
Key Features
DECLARATIVE DATA PIPELINES
Continuous data pipelines as easy as SELECT. Complex pipelines with hundreds of branches. Dynamic Tables manage the scheduling and orchestration.
SQL SUPPORT
Use any core SQL syntax to define transformations, including joins, unions, aggregations, window functions, group bys, filters, etc.
USER-DEFINED FRESHNESS
Controlled by a target lag for each table, for the sake of reduced cost and improved performance. Data freshness as low as 1 minute.
AUTOMATIC INCREMENTAL REFRESHES
Refresh only what's changed, even for complex queries, automatically (yes, including UPDATEs and DELETEs!).
SNAPSHOT ISOLATION
All Dynamic Tables in a DAG are refreshed consistently from aligned snapshots.
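The idea behind incremental refresh (touch only what changed, including updates and deletes) can be shown on a single GROUP BY aggregate. This is a conceptual sketch in plain Python, not how Snowflake implements Dynamic Tables: full_refresh, incremental_refresh, and the channel/spend rows are hypothetical. Each change is a pair (old_row, new_row), so an insert is (None, new), a delete is (old, None), and an update is (old, new).

```python
# Toy illustration only: maintains SUM(spend) and COUNT(*) per channel.
# State maps channel -> (total_spend, row_count).


def full_refresh(rows):
    """Rebuild the aggregate from every source row."""
    state = {}
    for row in rows:
        total, count = state.get(row["channel"], (0, 0))
        state[row["channel"]] = (total + row["spend"], count + 1)
    return state


def incremental_refresh(state, changes):
    """Apply only the changed rows to an existing aggregate."""
    state = dict(state)
    for old, new in changes:
        # Retract the old version of the row, then add the new one.
        for row, sign in ((old, -1), (new, +1)):
            if row is None:
                continue
            total, count = state.get(row["channel"], (0, 0))
            total, count = total + sign * row["spend"], count + sign
            if count == 0:
                state.pop(row["channel"], None)  # group is now empty
            else:
                state[row["channel"]] = (total, count)
    return state


source = [
    {"channel": "search", "spend": 100},
    {"channel": "video", "spend": 40},
    {"channel": "email", "spend": 10},
]
state = full_refresh(source)

changes = [
    (None, {"channel": "search", "spend": 50}),                         # INSERT
    ({"channel": "video", "spend": 40}, {"channel": "video", "spend": 60}),  # UPDATE
    ({"channel": "email", "spend": 10}, None),                          # DELETE
]
new_source = [
    {"channel": "search", "spend": 100},
    {"channel": "search", "spend": 50},
    {"channel": "video", "spend": 60},
]

refreshed = incremental_refresh(state, changes)
print(refreshed == full_refresh(new_source))
```

The incremental path reads three changed rows instead of the whole source, yet ends in the same state as a full rebuild, which is the cost argument the slide is making.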
FULL STACK
DATA ENGINEERING
WITH
SNOWPARK
Full Stack DE with Python
Let’s build a Data App!
Ad Spend Optimizer for Ski Gear Co.
THANK YOU!

 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
Quantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation ComputingQuantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation Computing
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 

From Raw Data to an Interactive Data App in an Hour: Powered by Snowpark Python

  • 1. © 2023 Snowflake Inc. All Rights Reserved FROM RAW DATA TO INTERACTIVE DATA APP! Powered by Snowpark Python
  • 3. Challenges in Developing Data Pipelines
      - Troubleshooting & debugging failed jobs
      - Multi-page stack trace
      - Capacity management & resource sizing
      - Setting up infrastructure and configs
      - Executor memory
      - Driver memory
      - # of executors
      - Z-ordering, V-ordering, ABC-ordering
      - Partitioning, Bucketing, Salting
  • 4. Challenges in Developing Data Pipelines
      - Troubleshooting a failed Spark job
      - Multi-page stack trace
      - Setting up infrastructure and configs
      - Executor memory
      - Driver memory
      - # of executors
      - Z-ordering, V-ordering, ABC-ordering
      - Partitioning, Bucketing, Salting
  • 5. Challenges Today
      - Troubleshooting a failed Spark job
      - Multi-page stack trace
      - Setting up infrastructure and configs
      - Executor memory
      - Driver memory
      - # of executors
      - Z-ordering, V-ordering, ABC-ordering
      - Partitioning, Bucketing, Salting
  • 14. ENTER SNOWPARK
  • 15. Snowpark for Python
      - Languages: Python, Java, Scala
      - Client-side libraries: DataFrame API
      - Server-side runtimes: UDFs, Stored Procedures
      - Warehouses (Standard & Snowpark-Optimized)
  • 16. Snowpark: Secure Deployment & Processing of Non-SQL Code & more
      - Client-side libraries: the DataFrame API, e.g. df.filter(df.state == 'WA'), passes through the Query Translator and is sent as a SQL query
      - Client-side libraries: Python functions & SProcs, e.g. @udf def detect_fraud(), pass through the Object Serializer and are sent as Python bytecode
      - Both paths use the Snowflake Connector for Python
      - Server-side runtimes: Processing Engine with the SQL Engine and the Python Secure Sandbox
      - Built-in Anaconda packages
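
The query-translation path on slide 16 can be illustrated with a toy sketch. This is not Snowpark's implementation or API: ToyDataFrame, Column, and Condition are hypothetical names invented here, and the CUSTOMERS table is an assumption. The sketch only shows the general idea that an expression such as df.filter(df.state == 'WA') can be recorded lazily on the client and rendered as a SQL query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """A deferred comparison (hypothetical helper), stored as data instead of evaluated."""
    column: str
    op: str
    value: object

    def to_sql(self) -> str:
        # Render strings as SQL literals, doubling embedded single quotes.
        if isinstance(self.value, str):
            literal = "'" + self.value.replace("'", "''") + "'"
        else:
            literal = str(self.value)
        return f"{self.column} {self.op} {literal}"


class Column:
    """Hypothetical column handle: == builds a Condition rather than a bool."""

    def __init__(self, name: str):
        self.name = name

    def __eq__(self, other):
        return Condition(self.name, "=", other)


class ToyDataFrame:
    """Toy lazy DataFrame: filter() records conditions, to_sql() renders them."""

    def __init__(self, table: str, conditions=()):
        self._table = table
        self._conditions = tuple(conditions)

    def __getattr__(self, name: str) -> Column:
        # Only reached for attributes that do not exist, e.g. df.state.
        if name.startswith("_"):
            raise AttributeError(name)
        return Column(name.upper())

    def filter(self, condition: Condition) -> "ToyDataFrame":
        # Return a new object; nothing is executed at this point.
        return ToyDataFrame(self._table, self._conditions + (condition,))

    def to_sql(self) -> str:
        sql = f"SELECT * FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(c.to_sql() for c in self._conditions)
        return sql


df = ToyDataFrame("CUSTOMERS")  # hypothetical table name
print(df.filter(df.state == "WA").to_sql())
# prints: SELECT * FROM CUSTOMERS WHERE STATE = 'WA'
```

In this sketch the client never touches row data; it only builds a query string, which is the part of the diagram labelled Query Translator.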
  • 17. DATA STREAMING WITH DYNAMIC TABLES
  • 18. Streaming in Snowflake
      - PAIN POINTS: BEFORE
      - COMPLEXITY: Managing dependencies, scheduling, and orchestration
      - INEFFICIENCY: Rebuilding tables completely, no incremental materialization
      - MANAGEABILITY: Brittle pipelines unable to react to changes upstream
      - BENEFITS: AFTER
      - SIMPLIFIED PIPELINES: Native support for streaming and continuous batch data pipelines. Easy, declarative semantics with no orchestration and no infrastructure management
      - COST EFFECTIVE: Streaming ingest as much as 50% cheaper than files. Continuous incremental processing reduces wasted compute
      - NATIVE TO DATA CLOUD: Expanding ecosystem in the Data Cloud with consistent and strong security, governance, and scalability
  • 19. Streaming ≠ Instantaneous
      - [Chart] Horizontal axis: TIME, time between event creation and action (1 sec, 1+ minutes, 6+ hours)
      - [Chart] Vertical axis: VALUE, value to business
      - [Chart] Labels: Summit of Now, Peak of Soon After, Mountain of Wisdom, Valley of Irrelevance
  • 20. Streaming ≠ Instantaneous
      - [Chart] Horizontal axis: TIME, time between event creation and action (1 sec, 1+ minutes, 6+ hours)
      - [Chart] Vertical axis: COST, cost to business
      - [Chart] Labels: Summit of Now, Peak of Soon After, Mountain of Wisdom, Valley of Irrelevance
      - [Chart] Callouts: High cost, low return; Low cost, untapped potential
  • 21. Streaming Pipelines at a Glance
      - Stages: INGEST, TRANSFORM, DELIVER
      - Layers: STORAGE, SCHEDULING, PROCESSING, GOVERNANCE
      - Sources: Apps & Services, OLTP, IoT, Kafka (rows, files)
      - Ingest: Snowpipe (Auto-Ingest & Streaming)
      - Transform: Tables, Dynamic Tables*
      - Deliver: Sharing, Replication, Native Apps, Worksheets, Dashboards, Serving, Unload
      - Processing languages: Python, Java, Scala, SQL
      - Release stages: In Dev, Private, Public*, GA
  • 22. Ingestion Options
      - COPY: Efficient bulk loading of files; control your own compute resources; deterministic latency
      - SNOWPIPE: Continuous ingestion of files; serverless; median latency ~30s
      - SNOWPIPE STREAMING: Near real-time ingestion of rowsets; client application needed; < 5s median latency
      - Release stages: In Dev, Private, Public, GA
  • 23. Snowpipe: Files & Streaming
      - Sources: Apps & Services, OLTP
      - BATCH (files): COPY & Snowpipe
      - STREAMING (rowsets, Kafka topics): Snowpipe Streaming* & Kafka Connector
      - Consumers: Business Intelligence, Machine Learning, Sharing
      - SNOWPIPE: Designed for batched rowsets as files; auto-scaled ingestion (10M files/10TB per hr); deduplication with file tracking
      - SNOWPIPE STREAMING: For rowsets with variable arrival frequency, using insertRows(); focus on lower latency & cost; ordered ingestion within a channel
      - Release stages: In Dev, Private, Public, GA
  • 24. Streaming Use Cases
      - Categories: CONNECTORS; KAFKA / KINESIS SOURCES; SECURITY & LOG ANALYTICS; IOT / DEVICE LOGS
      - Use existing event hubs to source data; flexible latency-cost profiles; run transformations with all reference data instead of just single row transforms (ELT & ETL)
      - Powered by Snowflake apps; add full power of Snowflake analytics from day 1; one place to query latest window of data & full history + reference data
      - Proprietary (ISV-built) pipelines for continuous analysis
      - Aggregated logs from devices; no need to add event hubs if not needed; simple post-ingestion cleanup
      - Ingest CDC streams with lower latency; ensure exactly once semantics; sourced from OLTP DBs, SaaS apps; serverless, so no clusters / stages to manage
  • 25. Dynamic Tables Overview
      - New table type that automatically and continuously materializes the results of a query
      - Definition: CREATE DYNAMIC TABLE <name> TARGET_LAG = <duration> WAREHOUSE = <warehouse_name> AS <select>
      - Query: SELECT * FROM <name>
      - Callouts: Store Results; Automatic Refreshes; Any Query!
      - Release stages: In Dev, Private, Public, GA
  • 26. Dynamic Tables Overview
      - Consistently fast to query
      - Immediate results; freshness within LAG; snapshot isolation
      - Definition: CREATE DYNAMIC TABLE <name> TARGET_LAG = <duration> WAREHOUSE = <warehouse_name> AS <select>
      - Query: SELECT * FROM <name>
      - Release stages: In Dev, Private, Public, GA
  • 27. Key Features
      - DECLARATIVE DATA PIPELINES: Continuous data pipelines as easy as SELECT. Complex pipelines with hundreds of branches. Dynamic Tables manage the scheduling and orchestration.
      - SQL SUPPORT: Use any core SQL syntax to define transformations, including joins, unions, aggregations, window functions, group bys, filters, etc.
      - USER-DEFINED FRESHNESS: Controlled by a target lag for each table, for the sake of reduced cost and improved performance. Data freshness as low as 1 minute.
      - AUTOMATIC INCREMENTAL REFRESHES: Refresh only what's changed, even for complex queries, automatically (yes, including UPDATEs and DELETEs!).
      - SNAPSHOT ISOLATION: All Dynamic Tables in a DAG are refreshed consistently from aligned snapshots.
      - Release stages: In Dev, Private, Public, GA
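
The CREATE DYNAMIC TABLE template from slides 25 and 26 can be filled in as follows. This is a minimal sketch that only builds the statement text and does not connect to Snowflake. The helper dynamic_table_ddl and the names SPEND_PER_CHANNEL, DEMO_WH, and campaign_spend are assumptions made for illustration, and the identifier and lag checks are deliberately simple rather than a full validation of Snowflake's syntax.

```python
import re

# Simple, deliberately strict patterns (assumptions for this sketch):
# unquoted identifiers, optionally qualified as db.schema.name.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")
# Target lag written as "<number> <unit>", e.g. "1 minute" or "6 hours".
_TARGET_LAG = re.compile(r"^\d+ (seconds?|minutes?|hours?|days?)$")


def dynamic_table_ddl(name: str, target_lag: str, warehouse: str, select_sql: str) -> str:
    """Render a CREATE DYNAMIC TABLE statement from its four parts."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unexpected table name: {name!r}")
    if not _IDENTIFIER.match(warehouse):
        raise ValueError(f"unexpected warehouse name: {warehouse!r}")
    if not _TARGET_LAG.match(target_lag):
        raise ValueError(f"unexpected target lag: {target_lag!r}")
    # Drop surrounding whitespace and a trailing semicolon from the query.
    query = select_sql.strip().rstrip(";").strip()
    return (
        f"CREATE DYNAMIC TABLE {name}\n"
        f"  TARGET_LAG = '{target_lag}'\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"AS\n"
        f"{query}"
    )


ddl = dynamic_table_ddl(
    name="SPEND_PER_CHANNEL",   # hypothetical table
    target_lag="1 minute",      # the slides state freshness as low as 1 minute
    warehouse="DEMO_WH",        # hypothetical warehouse
    select_sql="SELECT channel, SUM(cost) AS total_cost FROM campaign_spend GROUP BY channel;",
)
print(ddl)
```

Once such a statement has been run, the slides indicate that the table is read like any other table, with SELECT * FROM <name>, and that refresh scheduling is handled by Snowflake within the target lag.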
  • 28. FULL STACK DATA ENGINEERING WITH SNOWPARK
  • 29. Full Stack DE with Python
  • 30. Let’s build a Data App! Ad Spend Optimizer for Ski Gear Co.
  • 31. THANK YOU!