SlideShare a Scribd company logo
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Zach Christopherson, Redshift Database Engineer
July 13, 2016
Deep Dive on Amazon Redshift
Agenda
• Service overview
• Migration considerations
• Optimization:
• Schema/table design
• Data ingestion
• Maintenance and database tuning
• What’s new?
• Q & A (~15 minutes)
Service overview
Relational data warehouse
Massively parallel; petabyte scale
Fully managed
HDD and SSD platforms
$1,000/TB/year; starts at $0.25/hour
Amazon
Redshift
a lot faster
a lot simpler
a lot cheaper
Selected Amazon Redshift customers
Amazon Redshift system architecture
Leader node
• SQL endpoint
• Stores metadata
• Coordinates query execution
Compute nodes
• Local, columnar storage
• Executes queries in parallel
• Load, backup, restore via
Amazon S3; load from
Amazon DynamoDB, Amazon EMR, or SSH
Two hardware platforms
• Optimized for data processing
• DS2: HDD; scale from 2 TB to 2 PB
• DC1: SSD; scale from 160 GB to 326 TB
10 GigE
(HPC)
Ingestion
Backup
Restore
JDBC/ODBC
A deeper look at compute node architecture
Each node contains multiple slices
• DS2 – 2 slices on XL, 16 on 8 XL
• DC1 – 2 slices on L, 32 on 8 XL
Each slice is allocated CPU and
table data
Each slice processes a piece of
the workload in parallel
Leader Node
Amazon Redshift dramatically reduces I/O
Data compression
Zone maps
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
• Calculating SUM(Amount)
with row storage:
– Need to read
everything
– Unnecessary I/O
ID Age State Amount
Amazon Redshift dramatically reduces I/O
Data compression
Zone maps
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
• Calculating SUM(Amount)
with column storage:
– Only scan the
necessary blocks
ID Age State Amount
Amazon Redshift dramatically reduces I/O
Column storage
Data compression
Zone maps
• Columnar compression
– Effective due to like
data
– Reduces storage
requirements
– Reduces I/O
ID Age State Amount
analyze compression orders;
Table | Column | Encoding
--------+-------------+----------
orders | id | mostly32
orders | age | mostly32
orders | state | lzo
orders | amount | mostly32
Amazon Redshift dramatically reduces I/O
Column storage
Data compression
Zone maps
• In-memory block metadata
• Contains per-block MIN and MAX value
• Effectively prunes blocks that don’t
contain data for a given query
• Minimizes unnecessary I/O
ID Age State Amount
Migration considerations
Forklift = BAD
On Redshift
• Optimal throughput at 8-12 concurrency, 50 possible
• Upfront table design is critical, not able to add indexes
• DISTKEY and SORTKEY significantly influence performance
• PRIMARY KEY, FOREIGN KEY, UNIQUE constraints will
help
Caution: These are not enforced
• Compression is also for speed
On Redshift
• Optimized for large writes:
• Blocks are immutable
• Small write (~1-10 rows) has similar cost to a larger write
(~100 K rows)
• Partition temporal data with time series tables and
UNION ALL views
Optimization: schema design
Goals of distribution
• Distribute data evenly for parallel processing
• Minimize data movement
• Co-located joins
• Localized aggregations
Distribution key All
Node 1
Slice
1
Slice
2
Node 2
Slice
3
Slice
4
Node 1
Slice
1
Slice
2
Node 2
Slice
3
Slice
4
Full table data on first
slice of every node
Same key to same location
Node 1
Slice
1
Slice
2
Node 2
Slice
3
Slice
4
Even
Round robin
distribution
Choosing a distribution style
Key
• Large FACT tables
• Rapidly changing tables used
in joins
• Localize columns used within
aggregations
All
• Have slowly changing data
• Reasonable size (i.e., few
millions but not 100s of
millions of rows)
• No common distribution key
for frequent joins
• Typical use case: joined
dimension table without a
common distribution key
Even
• Tables not frequently joined or
aggregated
• Large tables without
acceptable candidate keys
Goals of sorting
• Physically sort data within blocks and throughout a table
• Enable rrscans to prune blocks by leveraging zone maps
• Optimal SORTKEY is dependent on:
• Query patterns
• Data profile
• Business requirements
COMPOUND
• Most common
• Well-defined filter criteria
• Time-series data
Choosing a SORTKEY
INTERLEAVED
• Edge cases
• Large tables (> billion rows)
• No common filter criteria
• Non time-series data
• Primarily as a query predicate (date, identifier, …)
• Optionally, choose a column frequently used for aggregates
• Optionally, choose same as distribution key column for most
efficient joins (merge join)
Compressing data
• COPY automatically analyzes and compresses data
when loading into empty tables
• ANALYZE COMPRESSION checks existing tables and
proposes optimal compression algorithms for each
column
• Changing column encoding requires a table rebuild
Automatic compression is a good thing (mostly)
“The definition of insanity is doing the same thing over and over again,
but expecting different results”.
- Albert Einstein
If you have a regular ETL process and you use temp tables or staging
tables, turn off automatic compression:
• Use ANALYZE COMPRESSION to determine the right encodings
• Bake those encodings into your DDL
• Use CREATE TABLE … LIKE
Automatic compression is a good thing (mostly)
• From the zone maps we know:
• Which blocks contain the
range
• Which row offsets to scan
• Highly compressed SORTKEYs:
• Many rows per block
• Large row offset
Skip compression on just the
leading column of the compound
SORTKEY
Optimization: ingest performance
Amazon Redshift loading data overview
AWS CloudCorporate data center
Amazon
DynamoDB
Amazon S3
Data
volume
Amazon EMR
Amazon
RDS
Amazon
Redshift
Amazon
Glacier
logs / files
Source DBs
VPN
Connection
AWS Direct
Connect
Amazon S3
Multipart Upload
AWS Import/
Export
Amazon EC2
or on-premises
(using SSH)
Parallelism is a function of load files
Each slice’s query processors can load one file at a time:
• Streaming decompression
• Parse
• Distribute
• Write
A single input file means
only one slice is ingesting data
Realizing only partial cluster usage as 6.25% of slices are active
2 4 6 8 10 12 141 3 5 7 9 11 13 15
Maximize throughput with multiple files
Use at least as many input files
as there are slices in the cluster
With 16 input files, all slices are
working so you maximize
throughput
COPY continues to scale linearly
as you add nodes
2 4 6 8 10 12 141 3 5 7 9 11 13 15
Optimization: performance tuning
Optimizing a database for querying
• Periodically check your table status
• Vacuum and analyze regularly
• SVV_TABLE_INFO
• Missing statistics
• Table skew
• Uncompressed columns
• Unsorted data
• Check your cluster status
• WLM queuing
• Commit queuing
• Database locks
Missing statistics
• Amazon Redshift query optimizer
relies on up-to-date statistics
• Statistics are necessary only for
data that you are accessing
• Updated stats important on:
• SORTKEY
• DISTKEY
• Columns in query predicates
Table skew
• Unbalanced workload
• Query completes as fast as the
slowest slice completes
• Can cause skew inflight:
• Temp data fills a single
node, resulting in query
failure
Table maintenance and status
Unsorted table
• Sortkey is just a guide, but
data actually needs to be
sorted
• VACUUM or DEEP COPY to
sort
• Scans against unsorted tables
continue to benefit from zone
maps:
• Load sequential blocks
WLM queue
Identify short/long-running queries
and prioritize them
Define multiple queues to route
queries appropriately
Default concurrency of 5
Leverage wlm_apex_hourly to tune
WLM based on peak concurrency
requirements
Cluster status: commits and WLM
Commit queue
How long is your commit queue?
• Identify needless transactions
• Group dependent statements
within a single transaction
• Offload operational workloads
• STL_COMMIT_STATS
What’s new?
Performance
Ease of use
Security
Analytics &
functionality
SOA
Recent launches Dynamic WLM parameters
Queue hopping for timed-out queries
Merge rows from staging to prod. table
2x improvement in query throughput
10x latency improvement for UNION ALL queries
Bzip2 format for ingestion
Table-level restore
10x improvement in vacuum perf
Default access privileges
Tag-based IAM access
IAM roles for COPY/UNLOAD
SAS connector enhancements,
Implicit conversion of SAS
queries to Redshift
DMS support from OLTP sources
Enhanced data ingestion from
Amazon Kinesis Firehose
Improved Data schema conversion
to Amazon ML
VACUUM and throughput improvements
• 2x query throughput improvement
• Improved memory management
• Optimizations for query plans against UNION ALL views
• Performance improvement for queries that stress the network
• 10x VACUUM improvement
• Improved backup performance
New features - ingestion
Backup option on CREATE TABLE
• For use on Staging tables for enhanced load performance
• Table will not be present on restore
Alter Table Append
• Appends rows to a target table by moving data from an existing source table
• Data in the source table is moved to matching columns in the target table
• You cannot run ALTER TABLE APPEND within a transaction block (BEGIN ... END)
New feature - Table Level Restore
aws redshift restore-table-from-cluster-snapshot --cluster-identifier mycluster-example
--new-table-name my-new-table
--snapshot-identifier my-snapshot-id
--source-database-name sample-database
--source-table-name my-source-table
Open source tools
https://github.com/awslabs/amazon-redshift-utils
https://github.com/awslabs/amazon-redshift-monitoring
https://github.com/awslabs/amazon-redshift-udfs
Admin scripts
Collection of utilities for running diagnostics on your cluster
Admin views
Collection of utilities for managing your cluster, generating schema DDL, etc.
ColumnEncodingUtility
Gives you the ability to apply optimal column encoding to an established
schema with data already loaded
Thank you!
Remember to complete
your evaluations!

More Related Content

What's hot

Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
Amazon Web Services
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
Amazon Web Services
 
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
Amazon Web Services
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
Amazon Web Services
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
Amazon Web Services
 
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
Amazon Web Services Korea
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon Kinesis
Amazon Web Services
 
Amazon Redshift
Amazon Redshift Amazon Redshift
Amazon Redshift
Amazon Web Services
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Emprovise
 
AWS Black Belt Online Seminar 2017 Amazon Relational Database Service (Amazon...
AWS Black Belt Online Seminar 2017 Amazon Relational Database Service (Amazon...AWS Black Belt Online Seminar 2017 Amazon Relational Database Service (Amazon...
AWS Black Belt Online Seminar 2017 Amazon Relational Database Service (Amazon...
Amazon Web Services Japan
 
Amazon Aurora
Amazon AuroraAmazon Aurora
Amazon Aurora
Amazon Web Services
 
AWS EBS
AWS EBSAWS EBS
AWS EBS
Mahesh Raj
 
AWS Storage - S3 Fundamentals
AWS Storage - S3 FundamentalsAWS Storage - S3 Fundamentals
AWS Storage - S3 Fundamentals
Piyush Agrawal
 
AWS Route53
AWS Route53AWS Route53
Intro to databricks delta lake
 Intro to databricks delta lake Intro to databricks delta lake
Intro to databricks delta lake
Mykola Zerniuk
 
Load & Unload Data TO and FROM Snowflake (By Faysal Shaarani)
Load & Unload Data TO and FROM Snowflake (By Faysal Shaarani)Load & Unload Data TO and FROM Snowflake (By Faysal Shaarani)
Load & Unload Data TO and FROM Snowflake (By Faysal Shaarani)Faysal Shaarani (MBA)
 
Serverless Databases - Amazon DynamoDB and Amazon Aurora Serverless - Demo
Serverless Databases - Amazon DynamoDB and Amazon Aurora Serverless - DemoServerless Databases - Amazon DynamoDB and Amazon Aurora Serverless - Demo
Serverless Databases - Amazon DynamoDB and Amazon Aurora Serverless - Demo
Amazon Web Services
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patterns
Dave Gardner
 
Introduction to Amazon DynamoDB
Introduction to Amazon DynamoDBIntroduction to Amazon DynamoDB
Introduction to Amazon DynamoDB
Amazon Web Services
 
Redshift performance tuning
Redshift performance tuningRedshift performance tuning
Redshift performance tuning
Carlos del Cacho
 

What's hot (20)

Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
(BDT401) Amazon Redshift Deep Dive: Tuning and Best Practices
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
개발자가 알아야 할 Amazon DynamoDB 활용법 :: 김일호 :: AWS Summit Seoul 2016
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon Kinesis
 
Amazon Redshift
Amazon Redshift Amazon Redshift
Amazon Redshift
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
 
AWS Black Belt Online Seminar 2017 Amazon Relational Database Service (Amazon...
AWS Black Belt Online Seminar 2017 Amazon Relational Database Service (Amazon...AWS Black Belt Online Seminar 2017 Amazon Relational Database Service (Amazon...
AWS Black Belt Online Seminar 2017 Amazon Relational Database Service (Amazon...
 
Amazon Aurora
Amazon AuroraAmazon Aurora
Amazon Aurora
 
AWS EBS
AWS EBSAWS EBS
AWS EBS
 
AWS Storage - S3 Fundamentals
AWS Storage - S3 FundamentalsAWS Storage - S3 Fundamentals
AWS Storage - S3 Fundamentals
 
AWS Route53
AWS Route53AWS Route53
AWS Route53
 
Intro to databricks delta lake
 Intro to databricks delta lake Intro to databricks delta lake
Intro to databricks delta lake
 
Load & Unload Data TO and FROM Snowflake (By Faysal Shaarani)
Load & Unload Data TO and FROM Snowflake (By Faysal Shaarani)Load & Unload Data TO and FROM Snowflake (By Faysal Shaarani)
Load & Unload Data TO and FROM Snowflake (By Faysal Shaarani)
 
Serverless Databases - Amazon DynamoDB and Amazon Aurora Serverless - Demo
Serverless Databases - Amazon DynamoDB and Amazon Aurora Serverless - DemoServerless Databases - Amazon DynamoDB and Amazon Aurora Serverless - Demo
Serverless Databases - Amazon DynamoDB and Amazon Aurora Serverless - Demo
 
Cassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patternsCassandra concepts, patterns and anti-patterns
Cassandra concepts, patterns and anti-patterns
 
Introduction to Amazon DynamoDB
Introduction to Amazon DynamoDBIntroduction to Amazon DynamoDB
Introduction to Amazon DynamoDB
 
Redshift performance tuning
Redshift performance tuningRedshift performance tuning
Redshift performance tuning
 

Viewers also liked

Amazon Redshift Masterclass
Amazon Redshift MasterclassAmazon Redshift Masterclass
Amazon Redshift Masterclass
Amazon Web Services
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
Amazon Web Services
 
Amazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and Optimization
Amazon Web Services
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
Amazon Web Services
 
Data Warehousing Infrastructure on Cloud
Data Warehousing Infrastructure on CloudData Warehousing Infrastructure on Cloud
Data Warehousing Infrastructure on Cloud
tdwiindia
 
Secure Content Delivery with AWS
Secure Content Delivery with AWSSecure Content Delivery with AWS
Secure Content Delivery with AWS
Amazon Web Services
 
AWS Keynote II - New Services Showcase: Connecting the Dots
AWS Keynote II - New Services Showcase: Connecting the DotsAWS Keynote II - New Services Showcase: Connecting the Dots
AWS Keynote II - New Services Showcase: Connecting the Dots
Amazon Web Services
 
AWS Security and Compliance
AWS Security and ComplianceAWS Security and Compliance
AWS Security and Compliance
Amazon Web Services
 
Another Day, Another Billion Packets
Another Day, Another Billion PacketsAnother Day, Another Billion Packets
Another Day, Another Billion Packets
Amazon Web Services
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing Performance
Amazon Web Services
 
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer ToolsDevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
Amazon Web Services
 
Deep Dive on Elastic Load Balancing
Deep Dive on Elastic Load BalancingDeep Dive on Elastic Load Balancing
Deep Dive on Elastic Load Balancing
Amazon Web Services
 
Building Your Practice on AWS: An APN Breakfast Session
Building Your Practice on AWS: An APN Breakfast SessionBuilding Your Practice on AWS: An APN Breakfast Session
Building Your Practice on AWS: An APN Breakfast Session
Amazon Web Services
 
Mobile App Testing with AWS Device Farm - AWS July 2016 Webinar Series
Mobile App Testing with AWS Device Farm - AWS July 2016 Webinar SeriesMobile App Testing with AWS Device Farm - AWS July 2016 Webinar Series
Mobile App Testing with AWS Device Farm - AWS July 2016 Webinar Series
Amazon Web Services
 
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
Amazon Web Services
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
FlyData Inc.
 
AWS re:Invent 2016: Wild Rydes Takes Off – The Dawn of a New Unicorn (SVR309)
AWS re:Invent 2016: Wild Rydes Takes Off – The Dawn of a New Unicorn (SVR309)AWS re:Invent 2016: Wild Rydes Takes Off – The Dawn of a New Unicorn (SVR309)
AWS re:Invent 2016: Wild Rydes Takes Off – The Dawn of a New Unicorn (SVR309)
Amazon Web Services
 
Born in the Cloud; Build it Like a Startup
Born in the Cloud; Build it Like a StartupBorn in the Cloud; Build it Like a Startup
Born in the Cloud; Build it Like a Startup
Amazon Web Services
 
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Mia Yuan Cao
 
Deep Dive on Amazon Aurora - Covering New Feature Announcements
Deep Dive on Amazon Aurora - Covering New Feature AnnouncementsDeep Dive on Amazon Aurora - Covering New Feature Announcements
Deep Dive on Amazon Aurora - Covering New Feature Announcements
Amazon Web Services
 

Viewers also liked (20)

Amazon Redshift Masterclass
Amazon Redshift MasterclassAmazon Redshift Masterclass
Amazon Redshift Masterclass
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
 
Amazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and OptimizationAmazon Redshift: Performance Tuning and Optimization
Amazon Redshift: Performance Tuning and Optimization
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Data Warehousing Infrastructure on Cloud
Data Warehousing Infrastructure on CloudData Warehousing Infrastructure on Cloud
Data Warehousing Infrastructure on Cloud
 
Secure Content Delivery with AWS
Secure Content Delivery with AWSSecure Content Delivery with AWS
Secure Content Delivery with AWS
 
AWS Keynote II - New Services Showcase: Connecting the Dots
AWS Keynote II - New Services Showcase: Connecting the DotsAWS Keynote II - New Services Showcase: Connecting the Dots
AWS Keynote II - New Services Showcase: Connecting the Dots
 
AWS Security and Compliance
AWS Security and ComplianceAWS Security and Compliance
AWS Security and Compliance
 
Another Day, Another Billion Packets
Another Day, Another Billion PacketsAnother Day, Another Billion Packets
Another Day, Another Billion Packets
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing Performance
 
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer ToolsDevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
DevOps on AWS: Deep Dive on Continuous Delivery and the AWS Developer Tools
 
Deep Dive on Elastic Load Balancing
Deep Dive on Elastic Load BalancingDeep Dive on Elastic Load Balancing
Deep Dive on Elastic Load Balancing
 
Building Your Practice on AWS: An APN Breakfast Session
Building Your Practice on AWS: An APN Breakfast SessionBuilding Your Practice on AWS: An APN Breakfast Session
Building Your Practice on AWS: An APN Breakfast Session
 
Mobile App Testing with AWS Device Farm - AWS July 2016 Webinar Series
Mobile App Testing with AWS Device Farm - AWS July 2016 Webinar SeriesMobile App Testing with AWS Device Farm - AWS July 2016 Webinar Series
Mobile App Testing with AWS Device Farm - AWS July 2016 Webinar Series
 
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
 
AWS re:Invent 2016: Wild Rydes Takes Off – The Dawn of a New Unicorn (SVR309)
AWS re:Invent 2016: Wild Rydes Takes Off – The Dawn of a New Unicorn (SVR309)AWS re:Invent 2016: Wild Rydes Takes Off – The Dawn of a New Unicorn (SVR309)
AWS re:Invent 2016: Wild Rydes Takes Off – The Dawn of a New Unicorn (SVR309)
 
Born in the Cloud; Build it Like a Startup
Born in the Cloud; Build it Like a StartupBorn in the Cloud; Build it Like a Startup
Born in the Cloud; Build it Like a Startup
 
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
Real-Time Analytics and Visualization of Streaming Big Data with JReport & Sc...
 
Deep Dive on Amazon Aurora - Covering New Feature Announcements
Deep Dive on Amazon Aurora - Covering New Feature AnnouncementsDeep Dive on Amazon Aurora - Covering New Feature Announcements
Deep Dive on Amazon Aurora - Covering New Feature Announcements
 

Similar to Deep Dive on Amazon Redshift

Deep Dive: Amazon Redshift (March 2017)
Deep Dive: Amazon Redshift (March 2017)Deep Dive: Amazon Redshift (March 2017)
Deep Dive: Amazon Redshift (March 2017)
Julien SIMON
 
Deep Dive Redshift, with a focus on performance
Deep Dive Redshift, with a focus on performanceDeep Dive Redshift, with a focus on performance
Deep Dive Redshift, with a focus on performance
Amazon Web Services
 
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftBest Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
Amazon Web Services
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features
Amazon Web Services
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
Edgar Alejandro Villegas
 
Redshift overview
Redshift overviewRedshift overview
Redshift overview
Amazon Web Services LATAM
 
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Amazon Web Services
 
AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationVolodymyr Rovetskiy
 
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...Maaz Anjum
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Amazon Web Services
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftUses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
Amazon Web Services
 
Redshift deep dive
Redshift deep diveRedshift deep dive
Redshift deep dive
Amazon Web Services LATAM
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
Amazon Web Services
 
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
Amazon Web Services
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Amazon Web Services
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Amazon Web Services
 
Processing and Analytics
Processing and AnalyticsProcessing and Analytics
Processing and Analytics
Amazon Web Services
 
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
Amazon Web Services
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
Amazon Web Services
 
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Michael Bohlig
 

Similar to Deep Dive on Amazon Redshift (20)

Deep Dive: Amazon Redshift (March 2017)
Deep Dive: Amazon Redshift (March 2017)Deep Dive: Amazon Redshift (March 2017)
Deep Dive: Amazon Redshift (March 2017)
 
Deep Dive Redshift, with a focus on performance
Deep Dive Redshift, with a focus on performanceDeep Dive Redshift, with a focus on performance
Deep Dive Redshift, with a focus on performance
 
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon RedshiftBest Practices for Migrating Your Data Warehouse to Amazon Redshift
Best Practices for Migrating Your Data Warehouse to Amazon Redshift
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
 
Redshift overview
Redshift overviewRedshift overview
Redshift overview
 
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
 
AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentation
 
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
Maaz Anjum - IOUG Collaborate 2013 - An Insight into Space Realization on ODA...
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftUses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
 
Redshift deep dive
Redshift deep diveRedshift deep dive
Redshift deep dive
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
 
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
AWS Summit London 2014 | Uses and Best Practices for Amazon Redshift (200)
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Processing and Analytics
Processing and AnalyticsProcessing and Analytics
Processing and Analytics
 
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
 
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
Amazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
Amazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
Amazon Web Services
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Amazon Web Services
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
Amazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
Amazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Amazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
Amazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Amazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 

Deep Dive on Amazon Redshift

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Zach Christopherson, Redshift Database Engineer July 13, 2016 Deep Dive on Amazon Redshift
  • 2. Agenda • Service overview • Migration considerations • Optimization: • Schema/table design • Data ingestion • Maintenance and database tuning • What’s new? • Q & A (~15 minutes)
  • 4. Relational data warehouse Massively parallel; petabyte scale Fully managed HDD and SSD platforms $1,000/TB/year; starts at $0.25/hour Amazon Redshift a lot faster a lot simpler a lot cheaper
  • 6. Amazon Redshift system architecture Leader node • SQL endpoint • Stores metadata • Coordinates query execution Compute nodes • Local, columnar storage • Executes queries in parallel • Load, backup, restore via Amazon S3; load from Amazon DynamoDB, Amazon EMR, or SSH Two hardware platforms • Optimized for data processing • DS2: HDD; scale from 2 TB to 2 PB • DC1: SSD; scale from 160 GB to 326 TB 10 GigE (HPC) Ingestion Backup Restore JDBC/ODBC
  • 7. A deeper look at compute node architecture Each node contains multiple slices • DS2 – 2 slices on XL, 16 on 8 XL • DC1 – 2 slices on L, 32 on 8 XL Each slice is allocated CPU and table data Each slice processes a piece of the workload in parallel Leader Node
  • 8. Amazon Redshift dramatically reduces I/O Data compression Zone maps ID Age State Amount 123 20 CA 500 345 25 WA 250 678 40 FL 125 957 37 WA 375 • Calculating SUM(Amount) with row storage: – Need to read everything – Unnecessary I/O ID Age State Amount
  • 9. Amazon Redshift dramatically reduces I/O Data compression Zone maps ID Age State Amount 123 20 CA 500 345 25 WA 250 678 40 FL 125 957 37 WA 375 • Calculating SUM(Amount) with column storage: – Only scan the necessary blocks ID Age State Amount
  • 10. Amazon Redshift dramatically reduces I/O Column storage Data compression Zone maps • Columnar compression – Effective due to like data – Reduces storage requirements – Reduces I/O ID Age State Amount analyze compression orders; Table | Column | Encoding --------+-------------+---------- orders | id | mostly32 orders | age | mostly32 orders | state | lzo orders | amount | mostly32
  • 11. Amazon Redshift dramatically reduces I/O Column storage Data compression Zone maps • In-memory block metadata • Contains per-block MIN and MAX value • Effectively prunes blocks that don’t contain data for a given query • Minimizes unnecessary I/O ID Age State Amount
  • 14. On Redshift • Optimal throughput at 8-12 concurrency, 50 possible • Upfront table design is critical, not able to add indexes • DISTKEY and SORTKEY significantly influence performance • PRIMARY KEY, FOREIGN KEY, UNIQUE constraints will help Caution: These are not enforced • Compression is also for speed
  • 15. On Redshift • Optimized for large writes: • Blocks are immutable • Small write (~1-10 rows) has similar cost to a larger write (~100 K rows) • Partition temporal data with time series tables and UNION ALL views
  • 17. Goals of distribution • Distribute data evenly for parallel processing • Minimize data movement • Co-located joins • Localized aggregations Distribution key All Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 Full table data on first slice of every node Same key to same location Node 1 Slice 1 Slice 2 Node 2 Slice 3 Slice 4 Even Round robin distribution
  • 18. Choosing a distribution style Key • Large FACT tables • Rapidly changing tables used in joins • Localize columns used within aggregations All • Have slowly changing data • Reasonable size (i.e., few millions but not 100s of millions of rows) • No common distribution key for frequent joins • Typical use case: joined dimension table without a common distribution key Even • Tables not frequently joined or aggregated • Large tables without acceptable candidate keys
  • 19. Goals of sorting • Physically sort data within blocks and throughout a table • Enable rrscans to prune blocks by leveraging zone maps • Optimal SORTKEY is dependent on: • Query patterns • Data profile • Business requirements
  • 20. COMPOUND • Most common • Well-defined filter criteria • Time-series data Choosing a SORTKEY INTERLEAVED • Edge cases • Large tables (> billion rows) • No common filter criteria • Non time-series data • Primarily as a query predicate (date, identifier, …) • Optionally, choose a column frequently used for aggregates • Optionally, choose same as distribution key column for most efficient joins (merge join)
  • 21. Compressing data • COPY automatically analyzes and compresses data when loading into empty tables • ANALYZE COMPRESSION checks existing tables and proposes optimal compression algorithms for each column • Changing column encoding requires a table rebuild
  • 22. Automatic compression is a good thing (mostly) “The definition of insanity is doing the same thing over and over again, but expecting different results”. - Albert Einstein If you have a regular ETL process and you use temp tables or staging tables, turn off automatic compression: • Use ANALYZE COMPRESSION to determine the right encodings • Bake those encodings into your DDL • Use CREATE TABLE … LIKE
  • 23. Automatic compression is a good thing (mostly) • From the zone maps we know: • Which blocks contain the range • Which row offsets to scan • Highly compressed SORTKEYs: • Many rows per block • Large row offset Skip compression on just the leading column of the compound SORTKEY
  • 25. Amazon Redshift loading data overview AWS CloudCorporate data center Amazon DynamoDB Amazon S3 Data volume Amazon EMR Amazon RDS Amazon Redshift Amazon Glacier logs / files Source DBs VPN Connection AWS Direct Connect Amazon S3 Multipart Upload AWS Import/ Export Amazon EC2 or on-premises (using SSH)
  • 26. Parallelism is a function of load files Each slice’s query processors can load one file at a time: • Streaming decompression • Parse • Distribute • Write A single input file means only one slice is ingesting data Realizing only partial cluster usage as 6.25% of slices are active 2 4 6 8 10 12 141 3 5 7 9 11 13 15
  • 27. Maximize throughput with multiple files Use at least as many input files as there are slices in the cluster With 16 input files, all slices are working so you maximize throughput COPY continues to scale linearly as you add nodes 2 4 6 8 10 12 141 3 5 7 9 11 13 15
  • 29. Optimizing a database for querying • Periodically check your table status • Vacuum and analyze regularly • SVV_TABLE_INFO • Missing statistics • Table skew • Uncompressed columns • Unsorted data • Check your cluster status • WLM queuing • Commit queuing • Database locks
  • 30. Missing statistics • Amazon Redshift query optimizer relies on up-to-date statistics • Statistics are necessary only for data that you are accessing • Updated stats important on: • SORTKEY • DISTKEY • Columns in query predicates
  • 31. Table skew • Unbalanced workload • Query completes as fast as the slowest slice completes • Can cause skew inflight: • Temp data fills a single node, resulting in query failure Table maintenance and status Unsorted table • Sortkey is just a guide, but data actually needs to be sorted • VACUUM or DEEP COPY to sort • Scans against unsorted tables continue to benefit from zone maps: • Load sequential blocks
  • 32. WLM queue Identify short/long-running queries and prioritize them Define multiple queues to route queries appropriately Default concurrency of 5 Leverage wlm_apex_hourly to tune WLM based on peak concurrency requirements Cluster status: commits and WLM Commit queue How long is your commit queue? • Identify needless transactions • Group dependent statements within a single transaction • Offload operational workloads • STL_COMMIT_STATS
  • 34. Performance Ease of use Security Analytics & functionality SOA Recent launches Dynamic WLM parameters Queue hopping for timed-out queries Merge rows from staging to prod. table 2x improvement in query throughput 10x latency improvement for UNION ALL queries Bzip2 format for ingestion Table-level restore 10x improvement in vacuum perf Default access privileges Tag-based IAM access IAM roles for COPY/UNLOAD SAS connector enhancements, Implicit conversion of SAS queries to Redshift DMS support from OLTP sources Enhanced data ingestion from Amazon Kinesis Firehose Improved Data schema conversion to Amazon ML
  • 35. VACUUM and throughput improvements • 2x query throughput improvement • Improved memory management • Optimizations for query plans against UNION ALL views • Performance improvement for queries that stress the network • 10x VACUUM improvement • Improved backup performance
  • 36. New features - ingestion Backup option on CREATE TABLE • For use on Staging tables for enhanced load performance • Table will not be present on restore Alter Table Append • Appends rows to a target table by moving data from an existing source table • Data in the source table is moved to matching columns in the target table • You cannot run ALTER TABLE APPEND within a transaction block (BEGIN ... END)
  • 37. New feature - Table Level Restore aws redshift restore-table-from-cluster-snapshot --cluster-identifier mycluster-example --new-table-name my-new-table --snapshot-identifier my-snapshot-id --source-database-name sample-database --source-table-name my-source-table
  • 38. Open source tools https://github.com/awslabs/amazon-redshift-utils https://github.com/awslabs/amazon-redshift-monitoring https://github.com/awslabs/amazon-redshift-udfs Admin scripts Collection of utilities for running diagnostics on your cluster Admin views Collection of utilities for managing your cluster, generating schema DDL, etc. ColumnEncodingUtility Gives you the ability to apply optimal column encoding to an established schema with data already loaded