SlideShare a Scribd company logo
1 of 41
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
John Loughlin, Solution Architect
Eric Ferreira, Principal Database Engineer
July 22, 2015
Best Practices: Amazon Redshift
Migration and Loading Data
Amazon Redshift – Resources
Getting Started – June Webinar Series:
https://www.youtube.com/watch?v=biqBjWqJi-Q
Best Practices – July Webinar Series:
Optimizing Performance – July 21, 2015
Migration and Data Loading – July 22,2015
Reporting and Advanced Analytics – July 23, 2015
Agenda
Common Migration Patterns
Copy Command
Automation Options
Near real time loading
ETL Options with Partners
Common Migration Patterns
Common Migration Patterns
Data from a variety of relational OLTP systems
structure lends itself to SQL schemas
Data from logs, devices, sensors…
data is less structured
Structured Data Loading
Data is often being loaded into another warehouse
existing ETL process
Temptation is to ‘lift & shift’ workload.
Resist temptation. Instead consider:
What do I really want to do?
What do I need?
Ingesting Less Structured Data
Some data does not lend itself to a relational schema
Common pattern is to use EMR:
impose structure
import into Redshift
Other solutions are often home grown scripting
applications.
Loading Data
Load to an empty Redshift database.
Load changes captured in the source system to Redshift
Truncate and Load
This is by far the easiest option:
Move the data to Amazon Simple Storage Service
multi-part upload
import/export service
direct connect
COPY the data into Redshift, a table at a time.
Load Changes
Identify changes in source systems
Move data to Amazon S3
Load changes
‘Upsert process’
Partner ETL tools
Partner ETL
Amazon Redshift is supported by a variety of ETL vendors
Many simplify the process of data loading
Visit http://aws.amazon.com/redshift/partners
There are a variety of vendors offering a free trial of their
products, allowing you to evaluate and choose the one that
suits your needs.
Upsert
The goal is to insert new rows and update changed rows in
Redshift.
Load data into a temporary staging table
Join the staging with production and delete the common
rows.
Copy the new data into the production table.
See Updating and Inserting New Data in the developer’s
guide
Checkpoint
We’ve talked about common migration patterns
Sources of data and data structure
Methods of getting data to AWS
Options for loading data
COPY
Amazon Redshift Architecture
Leader Node
• SQL endpoint, JDBC/ODBC
• Stores metadata
• Coordinates query execution
Compute Nodes
• Local, columnar storage
• Execute queries in parallel
• Load, backup, restore via Amazon S3
• Load from Amazon DynamoDB or SSH
Two hardware platforms
• Optimized for data processing
• DS2: HDD; scale from 2TB to 2PB
• DC1: SSD; scale from 160GB to 326TB
10 GigE
(HPC)
Ingestion
Backup
Restore
JDBC/ODBC
A Closer Look
Each node is split into slices
• One slice per core
Each slice is allocated
memory, CPU, and disk space
Each slice processes a piece
of the workload in parallel
COPY command
COMPUPDATE ON when running on an empty table
Use the COPY command.
Each slice can load one file at a time.
Partition input files so every slice can load in parallel.
Use a Manifest file.
Use multiple input files to maximize throughput
Use the COPY command
Each slice can load one file at a
time
A single input file means only one
slice is ingesting data
Instead of 100MB/s, you’re only
getting 6.25MB/s
Use multiple input files to maximize throughput
Use the COPY command
You need at least as many input
files as you have slices
With 16 input files, all slices are
working so you maximize
throughput
Get 100MB/s per node; scale
linearly as you add nodes
Primary keys and manifest files
Amazon Redshift doesn’t enforce primary key constraints
• If you load data multiple times, Amazon Redshift won’t complain
• If you declare primary keys in your DML, the optimizer will
expect the data to be unique
Use manifest files to control exactly what is loaded and
how to respond if input files are missing
• Define a JSON manifest on Amazon S3
• Ensures the cluster loads exactly what you want
Analyze sort/dist key columns after every load
Amazon Redshift’s query
optimizer relies on up-to-date
statistics
Maximize performance by
updating stats on sort/dist key
columns after every load
Automatic compression
Better performance, lower costs
COPY samples data automatically when loading into an empty
table
• Samples up to 100,000 rows and picks optimal encoding
If you have a regular ETL process and you use temp tables or
staging tables, turn off automatic compression
• Use analyze compression to determine the right encodings
• Bake those encodings into your DML
Checking STL_LOAD_COMMITS
SELECT query, trim(filename) as filename, curtime, status
FROM stl_load_commits
WHERE filename LIKE ’%table name%'
ORDER BY query;
After the load operation is complete, query the
STL_LOAD_COMMITS system table to verify that the
expected files were loaded.
COPY and 18 inserts
COPY country FROM
's3://…country.txt' CREDENTIALS …
1.57s then
.
insert into country (country_name)
values ('Slovakia'),('Slovenia'),('South
Africa'),('South Korea'),('Spain'); 5.44s
‘
Insert vs Copy
Commit info
COPY best practice
Use it.
Avoid inserts, which will not run in parallel.
If you are moving data from table to another, use the
deep copy features:
1. Use the original CREATE TABLE ddl and then
INSERT INTO … SELECT
2. CREATE TABLE AS
3. CREATE TABLE LIKE
4. Create a temporary table and truncate the
original.
Automation Options
Automating Data Ingestion
Many customers run custom scripts on EC2 instances to
load data into Redshift.
Another option is to use the Amazon Data Pipeline
automation tool.
AWS Lambda-based Amazon Redshift Loader
Create a Data Pipeline
Create a Data Pipeline
Review Results
Execution Details
Using the Lambda based Redshift Loader
Offers the ability to drop files
into S3 and load them into any
number of database tables in
multiple Amazon Redshift
clusters automatically, with no
servers to maintain.
Configure the sample loader
johnlou$ ./configureSample.sh more.ohno.us-east-1.redshift.amazonaws.com 8192 mydb
johnlou us-east-1
Password for user johnlou:
create user test_lambda_load_user password 'Change-me1!';
CREATE USER
create table lambda_redshift_sample(
column_a int,
column_b int,
column_c int
);
CREATE TABLE
Enter the Region for the Redshift Load Configuration > us-east-1
Enter the S3 Bucket to use for the Sample Input > johnlou-ohno/loader-demo-data
Enter the Access Key used by Redshift to get data from S3 > nope
Enter the Secret Key used by Redshift to get data from S3 > nope
Creating Tables in Dynamo DB if Required
Configuration for johnlou-ohno/loader-demo-data/input successfully written in us-east-1
View Logs
Near Real Time Loading
Micro-batch loading
Ideal for time series data
Balance input files
Pre-configure column encoding
Reduce frequency of statistics calculation
Load in sort key order
Use SSD instances
Consider using the ‘Load Stream’ architecture HasOffers
developed.
ETL Options with Partners
Data Loading Options
Parallel upload to Amazon S3
AWS Direct Connect
AWS Import/Export
Amazon Kinesis
Systems integrators
Data Integration Systems Integrators
Resources on the AWS Big Data Blog
Best Practices for Micro-Batch Loading on Amazon
Redshift
Using Attunity Cloudbeam at UMUC to Replicate Data
to Amazon RDS and Amazon Redshift
A Zero-Administration Amazon Redshift Database
Loader
Best Practices References
Best Practices for Designing Tables
Best Practices for Designing Queries
Best Practices for Loading Data
Thank you!

More Related Content

What's hot

Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon RedshiftAmazon Web Services
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftUses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftAmazon Web Services
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesAmazon Web Services
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedFlyData Inc.
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Amazon Web Services
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAmazon Web Services
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features Amazon Web Services
 
AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationVolodymyr Rovetskiy
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon RedshiftAmazon Web Services
 
Building AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and TableauBuilding AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and TableauLynn Langit
 
Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData FlyData Inc.
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon RedshiftKel Graham
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Migration to Redshift from SQL Server
Migration to Redshift from SQL ServerMigration to Redshift from SQL Server
Migration to Redshift from SQL Serverjoeharris76
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAmazon Web Services
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Amazon Web Services
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseAmazon Web Services
 

What's hot (20)

Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
(ISM303) Migrating Your Enterprise Data Warehouse To Amazon Redshift
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon RedshiftUses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
 
Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift Uses and Best Practices for Amazon Redshift
Uses and Best Practices for Amazon Redshift
 
AWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon RedshiftAWS June Webinar Series - Getting Started: Amazon Redshift
AWS June Webinar Series - Getting Started: Amazon Redshift
 
AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features AWS Webcast - Redshift Overview and New Features
AWS Webcast - Redshift Overview and New Features
 
AWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentationAWS (Amazon Redshift) presentation
AWS (Amazon Redshift) presentation
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
Building AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and TableauBuilding AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and Tableau
 
Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData Near Real-Time Data Analysis With FlyData
Near Real-Time Data Analysis With FlyData
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Migration to Redshift from SQL Server
Migration to Redshift from SQL ServerMigration to Redshift from SQL Server
Migration to Redshift from SQL Server
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
AWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing PerformanceAWS July Webinar Series: Amazon Redshift Optimizing Performance
AWS July Webinar Series: Amazon Redshift Optimizing Performance
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Leveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data WarehouseLeveraging Amazon Redshift for your Data Warehouse
Leveraging Amazon Redshift for your Data Warehouse
 

Viewers also liked

Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and HiveBig Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hiveodsc
 
Encryption and Key Management in AWS
Encryption and Key Management in AWSEncryption and Key Management in AWS
Encryption and Key Management in AWSAmazon Web Services
 
Scaling by Design: AWS Web Services Patterns
Scaling by Design:AWS Web Services PatternsScaling by Design:AWS Web Services Patterns
Scaling by Design: AWS Web Services PatternsAmazon Web Services
 
Hybrid Infrastructure Integration
Hybrid Infrastructure IntegrationHybrid Infrastructure Integration
Hybrid Infrastructure IntegrationAmazon Web Services
 
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...Amazon Web Services
 
Data Storage for the Long Haul: Compliance and Archive
Data Storage for the Long Haul: Compliance and ArchiveData Storage for the Long Haul: Compliance and Archive
Data Storage for the Long Haul: Compliance and ArchiveAmazon Web Services
 
AWS March 2016 Webinar Series Getting Started with Serverless Architectures
AWS March 2016 Webinar Series   Getting Started with Serverless ArchitecturesAWS March 2016 Webinar Series   Getting Started with Serverless Architectures
AWS March 2016 Webinar Series Getting Started with Serverless ArchitecturesAmazon Web Services
 
AWS APAC Webinar Week - Understanding AWS Storage Options
AWS APAC Webinar Week - Understanding AWS Storage OptionsAWS APAC Webinar Week - Understanding AWS Storage Options
AWS APAC Webinar Week - Understanding AWS Storage OptionsAmazon Web Services
 
AWS Mobile Services & SDK Introduction & Demo
AWS Mobile Services & SDK Introduction & DemoAWS Mobile Services & SDK Introduction & Demo
AWS Mobile Services & SDK Introduction & DemoAmazon Web Services
 
The Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivThe Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivAmazon Web Services
 
(NET307) Pinterest: The road from EC2-Classic To EC2-VPC
(NET307) Pinterest: The road from EC2-Classic To EC2-VPC(NET307) Pinterest: The road from EC2-Classic To EC2-VPC
(NET307) Pinterest: The road from EC2-Classic To EC2-VPCAmazon Web Services
 
Compute Without Servers – Building Applications with AWS Lambda - Technical 301
Compute Without Servers – Building Applications with AWS Lambda - Technical 301Compute Without Servers – Building Applications with AWS Lambda - Technical 301
Compute Without Servers – Building Applications with AWS Lambda - Technical 301Amazon Web Services
 
(DEV204) Building High-Performance Native Cloud Apps In C++
(DEV204) Building High-Performance Native Cloud Apps In C++(DEV204) Building High-Performance Native Cloud Apps In C++
(DEV204) Building High-Performance Native Cloud Apps In C++Amazon Web Services
 
Workshop: AWS Lamda Signal Corps vs Zombies
Workshop: AWS Lamda Signal Corps vs ZombiesWorkshop: AWS Lamda Signal Corps vs Zombies
Workshop: AWS Lamda Signal Corps vs ZombiesAmazon Web Services
 
Security Day IAM Recommended Practices
Security Day IAM Recommended PracticesSecurity Day IAM Recommended Practices
Security Day IAM Recommended PracticesAmazon Web Services
 
Ansible on aws - Pop-up Loft Tel Aviv
Ansible on aws - Pop-up Loft Tel AvivAnsible on aws - Pop-up Loft Tel Aviv
Ansible on aws - Pop-up Loft Tel AvivAmazon Web Services
 

Viewers also liked (20)

Amazon Redshift
Amazon Redshift Amazon Redshift
Amazon Redshift
 
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and HiveBig Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
 
Encryption and Key Management in AWS
Encryption and Key Management in AWSEncryption and Key Management in AWS
Encryption and Key Management in AWS
 
Scaling by Design: AWS Web Services Patterns
Scaling by Design:AWS Web Services PatternsScaling by Design:AWS Web Services Patterns
Scaling by Design: AWS Web Services Patterns
 
Hybrid Infrastructure Integration
Hybrid Infrastructure IntegrationHybrid Infrastructure Integration
Hybrid Infrastructure Integration
 
Agile BI - Pop-up Loft Tel Aviv
Agile BI - Pop-up Loft Tel AvivAgile BI - Pop-up Loft Tel Aviv
Agile BI - Pop-up Loft Tel Aviv
 
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
Get the Most Out of Amazon EC2: A Deep Dive on Reserved, On-Demand, and Spot ...
 
Data Storage for the Long Haul: Compliance and Archive
Data Storage for the Long Haul: Compliance and ArchiveData Storage for the Long Haul: Compliance and Archive
Data Storage for the Long Haul: Compliance and Archive
 
AWS March 2016 Webinar Series Getting Started with Serverless Architectures
AWS March 2016 Webinar Series   Getting Started with Serverless ArchitecturesAWS March 2016 Webinar Series   Getting Started with Serverless Architectures
AWS March 2016 Webinar Series Getting Started with Serverless Architectures
 
AWS APAC Webinar Week - Understanding AWS Storage Options
AWS APAC Webinar Week - Understanding AWS Storage OptionsAWS APAC Webinar Week - Understanding AWS Storage Options
AWS APAC Webinar Week - Understanding AWS Storage Options
 
Deep Dive: Hybrid Architectures
Deep Dive: Hybrid ArchitecturesDeep Dive: Hybrid Architectures
Deep Dive: Hybrid Architectures
 
AWS Mobile Services & SDK Introduction & Demo
AWS Mobile Services & SDK Introduction & DemoAWS Mobile Services & SDK Introduction & Demo
AWS Mobile Services & SDK Introduction & Demo
 
The Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel AvivThe Pace of Innovation - Pop-up Loft Tel Aviv
The Pace of Innovation - Pop-up Loft Tel Aviv
 
(NET307) Pinterest: The road from EC2-Classic To EC2-VPC
(NET307) Pinterest: The road from EC2-Classic To EC2-VPC(NET307) Pinterest: The road from EC2-Classic To EC2-VPC
(NET307) Pinterest: The road from EC2-Classic To EC2-VPC
 
Compute Without Servers – Building Applications with AWS Lambda - Technical 301
Compute Without Servers – Building Applications with AWS Lambda - Technical 301Compute Without Servers – Building Applications with AWS Lambda - Technical 301
Compute Without Servers – Building Applications with AWS Lambda - Technical 301
 
(DEV204) Building High-Performance Native Cloud Apps In C++
(DEV204) Building High-Performance Native Cloud Apps In C++(DEV204) Building High-Performance Native Cloud Apps In C++
(DEV204) Building High-Performance Native Cloud Apps In C++
 
Workshop: AWS Lamda Signal Corps vs Zombies
Workshop: AWS Lamda Signal Corps vs ZombiesWorkshop: AWS Lamda Signal Corps vs Zombies
Workshop: AWS Lamda Signal Corps vs Zombies
 
Security Day IAM Recommended Practices
Security Day IAM Recommended PracticesSecurity Day IAM Recommended Practices
Security Day IAM Recommended Practices
 
Ansible on aws - Pop-up Loft Tel Aviv
Ansible on aws - Pop-up Loft Tel AvivAnsible on aws - Pop-up Loft Tel Aviv
Ansible on aws - Pop-up Loft Tel Aviv
 
My First Big Data Application
My First Big Data ApplicationMy First Big Data Application
My First Big Data Application
 

Similar to AWS July Webinar Series: Amazon redshift migration and load data 20150722

Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesAmazon Web Services
 
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...Amazon Web Services
 
Amazon Redshift For Data Analysts
Amazon Redshift For Data AnalystsAmazon Redshift For Data Analysts
Amazon Redshift For Data AnalystsCan Abacıgil
 
London Redshift Meetup - July 2017
London Redshift Meetup - July 2017London Redshift Meetup - July 2017
London Redshift Meetup - July 2017Pratim Das
 
Melhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftMelhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftAmazon Web Services LATAM
 
Loading Data into Redshift: Data Analytics Week at the SF Loft
Loading Data into Redshift: Data Analytics Week at the SF LoftLoading Data into Redshift: Data Analytics Week at the SF Loft
Loading Data into Redshift: Data Analytics Week at the SF LoftAmazon Web Services
 
Loading Data into Redshift: Data Analytics Week SF
Loading Data into Redshift: Data Analytics Week SFLoading Data into Redshift: Data Analytics Week SF
Loading Data into Redshift: Data Analytics Week SFAmazon Web Services
 
SQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersSQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersAdam Hutson
 
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & DataductAmazon Web Services
 
Loading Data into Amazon Redshift
Loading Data into Amazon RedshiftLoading Data into Amazon Redshift
Loading Data into Amazon RedshiftAmazon Web Services
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftAmazon Web Services
 
Loading Data into Redshift with Lab
Loading Data into Redshift with LabLoading Data into Redshift with Lab
Loading Data into Redshift with LabAmazon Web Services
 
Accelerate Oracle to Aurora PostgreSQL Migration (GPSTEC313) - AWS re:Invent ...
Accelerate Oracle to Aurora PostgreSQL Migration (GPSTEC313) - AWS re:Invent ...Accelerate Oracle to Aurora PostgreSQL Migration (GPSTEC313) - AWS re:Invent ...
Accelerate Oracle to Aurora PostgreSQL Migration (GPSTEC313) - AWS re:Invent ...Amazon Web Services
 
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Web Services
 

Similar to AWS July Webinar Series: Amazon redshift migration and load data 20150722 (20)

Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
 
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
 
Amazon Redshift For Data Analysts
Amazon Redshift For Data AnalystsAmazon Redshift For Data Analysts
Amazon Redshift For Data Analysts
 
London Redshift Meetup - July 2017
London Redshift Meetup - July 2017London Redshift Meetup - July 2017
London Redshift Meetup - July 2017
 
Melhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftMelhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon Redshift
 
Loading Data into Redshift
Loading Data into RedshiftLoading Data into Redshift
Loading Data into Redshift
 
Loading Data into Redshift: Data Analytics Week at the SF Loft
Loading Data into Redshift: Data Analytics Week at the SF LoftLoading Data into Redshift: Data Analytics Week at the SF Loft
Loading Data into Redshift: Data Analytics Week at the SF Loft
 
Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Redshift Deep Dive
Amazon Redshift Deep Dive
 
Loading Data into Redshift
Loading Data into RedshiftLoading Data into Redshift
Loading Data into Redshift
 
Loading Data into Redshift: Data Analytics Week SF
Loading Data into Redshift: Data Analytics Week SFLoading Data into Redshift: Data Analytics Week SF
Loading Data into Redshift: Data Analytics Week SF
 
SQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersSQL Server 2008 Development for Programmers
SQL Server 2008 Development for Programmers
 
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
(BDT404) Large-Scale ETL Data Flows w/AWS Data Pipeline & Dataduct
 
Loading Data into Amazon Redshift
Loading Data into Amazon RedshiftLoading Data into Amazon Redshift
Loading Data into Amazon Redshift
 
BDA311 Introduction to AWS Glue
BDA311 Introduction to AWS GlueBDA311 Introduction to AWS Glue
BDA311 Introduction to AWS Glue
 
Best Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon RedshiftBest Practices for Migrating your Data Warehouse to Amazon Redshift
Best Practices for Migrating your Data Warehouse to Amazon Redshift
 
Loading Data into Redshift
Loading Data into RedshiftLoading Data into Redshift
Loading Data into Redshift
 
Loading Data into Redshift with Lab
Loading Data into Redshift with LabLoading Data into Redshift with Lab
Loading Data into Redshift with Lab
 
Accelerate Oracle to Aurora PostgreSQL Migration (GPSTEC313) - AWS re:Invent ...
Accelerate Oracle to Aurora PostgreSQL Migration (GPSTEC313) - AWS re:Invent ...Accelerate Oracle to Aurora PostgreSQL Migration (GPSTEC313) - AWS re:Invent ...
Accelerate Oracle to Aurora PostgreSQL Migration (GPSTEC313) - AWS re:Invent ...
 
AWS Analytics
AWS AnalyticsAWS Analytics
AWS Analytics
 
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
Amazon Redshift 與 Amazon Redshift Spectrum 幫您建立現代化資料倉儲 (Level 300)
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Recently uploaded (20)

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

AWS July Webinar Series: Amazon redshift migration and load data 20150722

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. John Loughlin, Solution Architect Eric Ferreira, Principal Database Engineer July 22, 2015 Best Practices: Amazon Redshift Migration and Loading Data
  • 2. Amazon Redshift – Resources Getting Started – June Webinar Series: https://www.youtube.com/watch?v=biqBjWqJi-Q Best Practices – July Webinar Series: Optimizing Performance – July 21, 2015 Migration and Data Loading – July 22,2015 Reporting and Advanced Analytics – July 23, 2015
  • 3. Agenda Common Migration Patterns Copy Command Automation Options Near real time loading ETL Options with Partners
  • 5. Common Migration Patterns Data from a variety of relational OLTP systems structure lends itself to SQL schemas Data from logs, devices, sensors… data is less structured
  • 6. Structured Data Loading Data is often being loaded into another warehouse existing ETL process Temptation is to ‘lift & shift’ workload. Resist temptation. Instead consider: What do I really want to do? What do I need?
  • 7. Ingesting Less Structured Data Some data does not lend itself to a relational schema Common pattern is to use EMR: impose structure import into Redshift Other solutions are often home grown scripting applications.
  • 8. Loading Data Load to an empty Redshift database. Load changes captured in the source system to Redshift
  • 9. Truncate and Load This is by far the easiest option: Move the data to Amazon Simple Storage Service multi-part upload import/export service direct connect COPY the data into Redshift, a table at a time.
  • 10. Load Changes Identify changes in source systems Move data to Amazon S3 Load changes ‘Upsert process’ Partner ETL tools
  • 11. Partner ETL Amazon Redshift is supported by a variety of ETL vendors Many simplify the process of data loading Visit http://aws.amazon.com/redshift/partners There are a variety of vendors offering a free trial of their products, allowing you to evaluate and choose the one that suits your needs.
  • 12. Upsert The goal is to insert new rows and update changed rows in Redshift. Load data into a temporary staging table Join the staging with production and delete the common rows. Copy the new data into the production table. See Updating and Inserting New Data in the developer’s guide
  • 13. Checkpoint We’ve talked about common migration patterns Sources of data and data structure Methods of getting data to AWS Options for loading data
  • 14. COPY
  • 15. Amazon Redshift Architecture Leader Node • SQL endpoint, JDBC/ODBC • Stores metadata • Coordinates query execution Compute Nodes • Local, columnar storage • Execute queries in parallel • Load, backup, restore via Amazon S3 • Load from Amazon DynamoDB or SSH Two hardware platforms • Optimized for data processing • DS2: HDD; scale from 2TB to 2PB • DC1: SSD; scale from 160GB to 326TB 10 GigE (HPC) Ingestion Backup Restore JDBC/ODBC
  • 16. A Closer Look Each node is split into slices • One slice per core Each slice is allocated memory, CPU, and disk space Each slice processes a piece of the workload in parallel
  • 17. COPY command COMPUPDATE ON when running on an empty table Use the COPY command. Each slice can load one file at a time. Partition input files so every slice can load in parallel. Use a Manifest file.
  • 18. Use multiple input files to maximize throughput Use the COPY command Each slice can load one file at a time A single input file means only one slice is ingesting data Instead of 100MB/s, you’re only getting 6.25MB/s
  • 19. Use multiple input files to maximize throughput Use the COPY command You need at least as many input files as you have slices With 16 input files, all slices are working so you maximize throughput Get 100MB/s per node; scale linearly as you add nodes
  • 20. Primary keys and manifest files Amazon Redshift doesn’t enforce primary key constraints • If you load data multiple times, Amazon Redshift won’t complain • If you declare primary keys in your DML, the optimizer will expect the data to be unique Use manifest files to control exactly what is loaded and how to respond if input files are missing • Define a JSON manifest on Amazon S3 • Ensures the cluster loads exactly what you want
  • 21. Analyze sort/dist key columns after every load Amazon Redshift’s query optimizer relies on up-to-date statistics Maximize performance by updating stats on sort/dist key columns after every load
  • 22. Automatic compression Better performance, lower costs COPY samples data automatically when loading into an empty table • Samples up to 100,000 rows and picks optimal encoding If you have a regular ETL process and you use temp tables or staging tables, turn off automatic compression • Use analyze compression to determine the right encodings • Bake those encodings into your DML
  • 23. Checking STL_LOAD_COMMITS SELECT query, trim(filename) as filename, curtime, status FROM stl_load_commits WHERE filename LIKE ’%table name%' ORDER BY query; After the load operation is complete, query the STL_LOAD_COMMITS system table to verify that the expected files were loaded.
  • 24. COPY and 18 inserts COPY country FROM 's3://…country.txt' CREDENTIALS … 1.57s then . insert into country (country_name) values ('Slovakia'),('Slovenia'),('South Africa'),('South Korea'),('Spain'); 5.44s ‘ Insert vs Copy Commit info
  • 25. COPY best practice Use it. Avoid inserts, which will not run in parallel. If you are moving data from table to another, use the deep copy features: 1. Use the original CREATE TABLE ddl and then INSERT INTO … SELECT 2. CREATE TABLE AS 3. CREATE TABLE LIKE 4. Create a temporary table and truncate the original.
  • 27. Automating Data Ingestion Many customers run custom scripts on EC2 instances to load data into Redshift. Another option is to use the Amazon Data Pipeline automation tool. AWS Lambda-based Amazon Redshift Loader
  • 28. Create a Data Pipeline
  • 29. Create a Data Pipeline
  • 32. Using the Lambda based Redshift Loader Offers the ability to drop files into S3 and load them into any number of database tables in multiple Amazon Redshift clusters automatically, with no servers to maintain.
  • 33. Configure the sample loader johnlou$ ./configureSample.sh more.ohno.us-east-1.redshift.amazonaws.com 8192 mydb johnlou us-east-1 Password for user johnlou: create user test_lambda_load_user password 'Change-me1!'; CREATE USER create table lambda_redshift_sample( column_a int, column_b int, column_c int ); CREATE TABLE Enter the Region for the Redshift Load Configuration > us-east-1 Enter the S3 Bucket to use for the Sample Input > johnlou-ohno/loader-demo-data Enter the Access Key used by Redshift to get data from S3 > nope Enter the Secret Key used by Redshift to get data from S3 > nope Creating Tables in Dynamo DB if Required Configuration for johnlou-ohno/loader-demo-data/input successfully written in us-east-1
  • 35. Near Real Time Loading
  • 36. Micro-batch loading Ideal for time series data Balance input files Pre-configure column encoding Reduce frequency of statistics calculation Load in sort key order Use SSD instances Consider using the ‘Load Stream’ architecture HasOffers developed.
  • 37. ETL Options with Partners
  • 38. Data Loading Options Parallel upload to Amazon S3 AWS Direct Connect AWS Import/Export Amazon Kinesis Systems integrators Data Integration Systems Integrators
  • 39. Resources on the AWS Big Data Blog Best Practices for Micro-Batch Loading on Amazon Redshift Using Attunity Cloudbeam at UMUC to Replicate Data to Amazon RDS and Amazon Redshift A Zero-Administration Amazon Redshift Database Loader
  • 40. Best Practices References Best Practices for Designing Tables Best Practices for Designing Queries Best Practices for Loading Data