SQL Server To Redshift
Data Load Using SSIS
Reach for the Clouds, Inc.
Next Generation SSIS Tasks and Connectors Series
AUTHOR:
NAYAN PATEL | SR. ETL SSIS ARCHITECT
NPATEL@RFTCLOUDS.COM
Content
• Introduction – SQL Server to Redshift Load
• Video Tutorial – Redshift Data Load
• Right way but hard way
• Steps for Amazon Redshift Data Load from On-Premise files or RDBMS (e.g. MySQL, SQL Server)
• Doing it the easy way
• Should I use SSIS to load Redshift
• Setup your Amazon Redshift Cluster
• Add inbound rule for Redshift Cluster
• Automate Redshift Cluster Creation
• Create Sample table and data in Source – (in this example SQL Server)
• Create Sample table in Amazon Redshift
• SQL Server to Redshift Data Load using SSIS
• Conclusion - Related Links
Introduction – SQL Server to Redshift Load
• Before we talk about loading data from SQL Server to Redshift using SSIS, let's
talk about what Amazon Redshift (sometimes referred to as AWS
Redshift) is. Amazon Redshift is a cloud-based data warehouse service.
This type of system is also referred to as MPP (Massively Parallel
Processing). Behind the scenes, Amazon Redshift uses a highly modified
version of the PostgreSQL engine. Amazon Redshift offers the
advantage of scaling as you go, at a very low cost compared to an on-site
dedicated hardware/software approach.
Right way but hard way
• If you read some of the guidelines published by Amazon
regarding Redshift data load, you will quickly realize that there is
a lot to do under the covers to get it going the right way. Here are a few
steps you will have to perform while loading data to Redshift from
your on-premises server (data can be sitting in files or in a relational
source).
Steps for Amazon Redshift Data Load from On-Premise files or RDBMS (e.g. MySQL, SQL Server)
• Export local RDBMS data to flat files (make sure you remove invalid characters and apply escape sequences
during export)
• Split files into 10-15 MB each to get optimal performance during upload and final data load
• Compress files to *.gz format so you don’t end up with a $1000 surprise bill :) In my case, text files
shrank by a factor of 10-20
• List all file names in a manifest file so that when you issue the COPY command to Redshift they are treated as one unit
of load
• Upload the manifest file to an Amazon S3 bucket
• Upload the local *.gz files to the Amazon S3 bucket
• Issue the Redshift COPY command with different options
• Schedule file archiving from on-premises and from the S3 staging area on AWS
• Capture errors and set up restartability in case something fails
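The split, compress and manifest steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular tool: the function names, the part-file naming scheme, and the bucket and prefix values are assumptions, and the part size is measured before compression. Uploading to S3 is left out; the manifest layout (an "entries" list of "url" and "mandatory" pairs) follows the format Redshift COPY expects.

```python
import gzip
import json
from pathlib import Path


def split_and_compress(src: Path, out_dir: Path,
                       max_bytes: int = 10 * 1024 * 1024) -> list[Path]:
    """Split a text file on line boundaries into gzip parts.

    Each part holds at most roughly max_bytes of uncompressed data
    (a single line longer than max_bytes still goes into one part).
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    parts: list[Path] = []
    out = None
    written = 0
    with src.open("rb") as f:
        for line in f:
            # Open a new part if none is open yet, or if this line
            # would push a non-empty part over the size limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                # Hypothetical naming scheme: <name>.part0000.csv.gz
                part = out_dir / f"{src.stem}.part{len(parts):04d}.csv.gz"
                out = gzip.open(part, "wb")
                parts.append(part)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts


def build_manifest(parts: list[Path], bucket: str, prefix: str) -> str:
    """Return manifest JSON listing every part as a mandatory S3 object."""
    prefix = prefix.strip("/")
    entries = [
        {"url": f"s3://{bucket}/{prefix}/{p.name}", "mandatory": True}
        for p in parts
    ]
    return json.dumps({"entries": entries}, indent=2)
```

Splitting on line boundaries keeps every row intact inside one part, which matters because each part is loaded independently.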
Doing it the easy way
• If you are not sure you are ready to code the many steps listed above,
you can use the Amazon Redshift Data Transfer Task.
• In the next few sections we will describe how to set up your Redshift
cluster for demo purposes and load data from SQL Server to Redshift
using SSIS.
Should I use SSIS to load Redshift
• If you are unsure which approach to use to load data, consider a few questions
• Do you have existing ETL processes written in SSIS?
• Do you need a more visual approach and better workflow management (which SSIS
provides)?
• Do you need connection string encryption and other goodies offered by SSIS, such
as native logging and passing parameters from the SSIS environment?
• Do you have SSIS expertise available in-house, or are you better off staying with command
line scripts?
• Do you need to create a workflow that can run on servers where SSIS is not
installed?
Setup your Amazon Redshift
Cluster
NOTE: SKIP THIS STEP IF YOU HAVE ALREADY SET UP
YOUR REDSHIFT CLUSTER
1. LOG IN TO YOUR AWS CONSOLE
AND CLICK ON THE REDSHIFT ICON.
2. CLICK ON LAUNCH CLUSTER.
3. ON THE CLUSTER DETAIL PAGE SPECIFY
THE CLUSTER IDENTIFIER, DATABASE
NAME, PORT, MASTER USER AND
PASSWORD. CLICK CONTINUE TO GO
TO THE NEXT PAGE.
4. ON THE NODE CONFIGURATION PAGE
SPECIFY THE NODE TYPE (THIS IS THE VM
TYPE), CLUSTER TYPE AND NUMBER
OF NODES. IF YOU ARE TRYING THIS UNDER
THE FREE TIER THEN SELECT THE SMALLEST
NODE POSSIBLE (IN THIS CASE IT
WAS DW2.LARGE). CLICK CONTINUE
TO GO TO THE NEXT PAGE.
5. ON THE ADDITIONAL CONFIGURATION
PAGE YOU CAN PICK THE VPC (VIRTUAL
PRIVATE CLOUD), THE SECURITY
GROUP FOR THE CLUSTER AND OTHER
OPTIONS SUCH AS ENCRYPTION. FOR
DEMO PURPOSES SELECT THE OPTIONS SHOWN
IN THE SCREENSHOT. CLICK CONTINUE TO
REVIEW YOUR SETTINGS AND CLICK
CREATE CLUSTER.
6. GIVE IT A FEW MINUTES WHILE YOUR
CLUSTER IS BEING CREATED. AFTER
5-10 MINUTES YOU CAN
GO BACK TO THE SAME PAGE AND
REVIEW THE CLUSTER STATUS AND
OTHER PROPERTIES.
COPY THE CLUSTER ENDPOINT
SOMEWHERE BECAUSE WE WILL
NEED IT LATER.
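For readers who prefer a script over the console, the same settings (identifier, database name, port, master user, node type, number of nodes) can be passed to the AWS SDK. The sketch below is an illustration under assumptions, not part of the original walkthrough: it assumes the third-party boto3 package and configured AWS credentials, and the helper names are hypothetical. The parameter keys are those of the boto3 Redshift `create_cluster` call.

```python
def cluster_request(identifier: str, db_name: str, user: str, password: str,
                    node_type: str, nodes: int = 1, port: int = 5439) -> dict:
    """Build the keyword arguments for a Redshift create_cluster call."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    request = {
        "ClusterIdentifier": identifier,
        "DBName": db_name,
        "MasterUsername": user,
        "MasterUserPassword": password,
        "NodeType": node_type,
        "Port": port,  # 5439 is the Redshift default port
        "ClusterType": "single-node" if nodes == 1 else "multi-node",
    }
    # NumberOfNodes only applies to multi-node clusters.
    if nodes > 1:
        request["NumberOfNodes"] = nodes
    return request


def launch_cluster(request: dict, region: str) -> dict:
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported here so building a request needs no SDK

    client = boto3.client("redshift", region_name=region)
    return client.create_cluster(**request)
```

The walkthrough used DW2.LARGE as the node type; node type names available today may differ, so check the current list before passing one in.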
Add inbound rule for Redshift
Cluster
NOTE: SKIP THIS STEP IF YOU HAVE ALREADY
ADDED YOUR IP TO AN INBOUND RULE.
BY DEFAULT YOU CANNOT CONNECT TO AN AMAZON
REDSHIFT CLUSTER FROM OUTSIDE THE AWS
NETWORK (E.G. FROM YOUR ON-PREMISES
MACHINE). IF YOU WISH TO CONNECT THEN YOU
MUST ADD AN INBOUND RULE TO ALLOW
YOUR REQUESTS TO REACH THE REDSHIFT CLUSTER ON
A SPECIFIC PORT.
TO CREATE A NEW INBOUND RULE PERFORM
THE FOLLOWING STEPS
1. ON THE REDSHIFT HOME PAGE
CLICK THE [SECURITY] TAB. YOU MAY SEE
A NOTICE DEPENDING ON
WHICH REGION YOU ARE IN. CLICK ON THE
[GO TO THE EC2 CONSOLE] LINK OR
GO DIRECTLY TO EC2 BY
CLICKING THE SERVICES -> EC2 MENU AT
THE TOP.
2. ON THE EC2 SECURITY GROUPS PAGE
SELECT THE SECURITY GROUP ATTACHED
TO YOUR REDSHIFT CLUSTER AND
THEN IN THE BOTTOM PANE CLICK
ON THE INBOUND TAB.
3. ON THE INBOUND TAB CLICK THE EDIT
OPTION TO MODIFY THE DEFAULT ENTRY,
OR ADD A NEW RULE.
4. CLICK ON ADD RULE IF YOU WISH
TO ADD A NEW ENTRY, OTHERWISE EDIT THE
EXISTING ENTRY, AND CLICK SAVE.
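The inbound rule can also be added from a script. The following is a minimal sketch, assuming boto3, configured credentials, and a known security group ID; the helper names and the rule description are made up for illustration. It opens the Redshift port to a single IPv4 address only (a /32 range), which is narrower than opening the port to everyone.

```python
import ipaddress


def redshift_ingress_rule(my_ip: str, port: int = 5439) -> dict:
    """Build one IpPermissions entry allowing a single IPv4 host on a TCP port."""
    addr = ipaddress.ip_address(my_ip)  # raises ValueError on bad input
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [
            {"CidrIp": f"{addr}/32", "Description": "on-premises SSIS host"}
        ],
    }


def add_inbound_rule(group_id: str, rule: dict, region: str) -> dict:
    """Attach the rule to the security group (needs boto3 and credentials)."""
    import boto3  # imported here so building a rule needs no SDK

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=[rule]
    )
```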
Automate Redshift Cluster Creation
If you need to automate Redshift cluster creation, or any of the following actions,
then check out the Redshift Cluster Management Task
• Automate the Amazon Redshift cluster create action in a few clicks. You can also add
an access security rule.
• Automate the Amazon Redshift cluster delete action
• Fetch an Amazon Redshift cluster property into an SSIS variable (e.g. fetch the cluster status)
• Fetch all clusters and their properties as a DataTable (use a ForEach Loop to iterate
through all clusters)
• Automate Redshift cluster snapshot creation
• Automate the Redshift cluster snapshot delete action
• Support for waiting until a cluster operation is done
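The "wait until the cluster operation is done" idea can be sketched as a polling loop. This is an illustrative stand-in, not how the Redshift Cluster Management Task is implemented: the function names, interval and timeout are assumptions. The status lookup uses the boto3 `describe_clusters` call and its `ClusterStatus` field, and it is passed in as a callable so the loop itself can be exercised without AWS access.

```python
import time
from typing import Callable


def wait_until(get_status: Callable[[], str], target: str = "available",
               interval: float = 30.0, timeout: float = 1800.0,
               sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until it returns target or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still '{status}' after {timeout} s")
        sleep(interval)


def redshift_status(identifier: str, region: str) -> Callable[[], str]:
    """Return a callable that fetches the cluster status via boto3."""
    import boto3  # imported here so wait_until needs no SDK

    client = boto3.client("redshift", region_name=region)

    def fetch() -> str:
        reply = client.describe_clusters(ClusterIdentifier=identifier)
        return reply["Clusters"][0]["ClusterStatus"]

    return fetch
```

A typical call would be `wait_until(redshift_status("demo-cluster", "us-east-1"))`, where the identifier and region are placeholders.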
Create Sample table and data in Source – (in this
example SQL Server)
Note: Skip this step if you wish to use your own table. If you do so, please ignore certain steps and
screenshots mentioned in this article.
For this demo we will use the free Northwind sample database
supplied by Microsoft.
• Download the sample database.
• Extract the zip file, then open the *.sql file and run it to create a new
database with sample tables and data.
Create Sample table in Amazon
Redshift
4. DOUBLE CLICK ON THE TASK TO
SEE THE UI.
5. CLICK ON [NEW] CONNECTION.
6. CONFIGURE THE REDSHIFT
CONNECTION PROPERTIES AND
CLICK TEST.
7. IF THE TEST CONNECTION IS SUCCESSFUL
THEN CLICK OK TO SAVE THE
CONNECTION DETAILS.
8. ENTER THE FOLLOWING SCRIPT IN THE
SQL TEXTBOX AND HIT OK TO SAVE
IT.
9. NOW RIGHT CLICK ON THE TASK
AND EXECUTE IT. THIS SHOULD CREATE
THE NEW TABLE IN REDSHIFT.
SQL Server to Redshift Data Load using SSIS
Once the table is created, let's do the real work of moving data from SQL Server to Amazon Redshift.
Perform the following steps to configure the SSIS Amazon Redshift Data Transfer Task
1. Drag the Amazon Redshift Data Transfer Task onto the SSIS designer surface.
2. Double click on the task to edit its properties.
3. Select Action: In the Action drop down at the top, select the Bulk Import to Redshift from any RDBMS (e.g.
MySQL, Oracle, SQL Server) option
4. Configure Source: On the Source tab click [New] next to the connection dropdown and configure the source
connection, or pick an existing connection. In our case we are extracting data from a SQL Server database
(Northwind) on the local server.
Enter the following SQL Query to extract 100,000 rows from SQL Server
5. CONFIGURE SOURCE STAGING
AREA: ON THE SOURCE TAB YOU
HAVE TO ENTER THE FOLDER LOCATION
WHERE STAGING FILES WILL BE
SAVED BEFORE THEY ARE UPLOADED TO
REDSHIFT.
6. CONFIGURE TARGET: ON THE TARGET
TAB SELECT AN EXISTING REDSHIFT
CONNECTION MANAGER (OR CREATE A
NEW ONE), THEN SELECT THE TARGET TABLE
YOU WANT TO LOAD DATA INTO FROM
THE DROPDOWN. IF YOU HAVE A LONG
LIST OF TABLES THEN SIMPLY ENTER THE
SCHEMA NAME IN THE SCHEMA
FILTER TEXT BOX AND CLICK
REFRESH TO RELOAD THE TABLE
DROPDOWN WITH FEWER ITEMS.
7. CONFIGURE RELOAD OPTION AND
TARGET STAGING AREA: ON THE TARGET
TAB CHECK THE TRUNCATE TARGET
TABLE OPTION IF YOU WANT TO
RELOAD THE TABLE EACH TIME YOU EXECUTE THIS
TASK, OTHERWISE LEAVE IT UNCHECKED TO
APPEND RECORDS. WE ALSO HAVE
TO SPECIFY THE AMAZON S3 STAGING
AREA WHERE REDSHIFT WILL LOOK
FOR FILES TO LOAD.
8. CONFIGURE FILE FORMAT: WE
ARE GOING TO GENERATE CSV FILES
FOR THE REDSHIFT LOAD, SO MAKE SURE
YOU SELECT THE CORRECT COLUMN
DELIMITER. ALSO MAKE SURE YOU
CHECK THE ALWAYS COMPRESS FILE
OPTION TO REDUCE BANDWIDTH.
9. CONFIGURE ARCHIVE OPTIONS:
ON THE ARCHIVE TAB WE CAN SPECIFY
HOW TO ARCHIVE THE SOURCE AND
TARGET FILES WE GENERATED.
SOURCE FILES ARE CSV FILES AND
SOURCE STAGE FILES ARE *.GZ
FILES (IF YOU SELECT
COMPRESSION). TARGET STAGE
FILES ARE EITHER CSV OR *.GZ
FILES. BY DEFAULT SOURCE CSV
FILES ARE KEPT AND ALL OTHER
STAGE FILES ARE DELETED. SEE
THE SCREENSHOT BELOW.
10. CONFIGURE ADVANCED
OPTIONS: ON THE ADVANCED OPTIONS
TAB YOU CAN FINE-TUNE THE LOAD PROCESS,
SUCH AS HOW TO HANDLE NULL
DATA, HOW TO HANDLE DATA
TRUNCATION, ETC. READ THE HELP FILE
FOR MORE INFO.
11. CONFIGURE ERROR HANDLING
OPTIONS: ON THE ERROR HANDLING TAB
YOU CAN SPECIFY HOW MANY
ERRORS YOU WANT TO IGNORE
BEFORE FAILING THE ENTIRE LOAD. YOU
CAN ALSO REPLACE INVALID
CHARACTERS DURING YOUR LOAD IF YOU
CHECK THE [ALLOW INVALID
CHARACTERS] OPTION.
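The file format, advanced and error handling settings above correspond to options of the Redshift COPY command. The sketch below assembles such a statement so the relationship is visible. The mapping from the task's checkboxes to COPY options is an assumption (the task's generated SQL is not shown here), and the function name, table, S3 URL and IAM role are placeholders. The COPY keywords themselves (MANIFEST, GZIP, DELIMITER, MAXERROR, TRUNCATECOLUMNS, ACCEPTINVCHARS, BLANKSASNULL) are standard Redshift options.

```python
def build_copy(table: str, manifest_url: str, iam_role: str,
               delimiter: str = ",", max_errors: int = 0,
               truncate_columns: bool = False,
               replace_invalid_chars: bool = False,
               blanks_as_null: bool = False) -> str:
    """Assemble a COPY statement for gzip-compressed, delimited files
    listed in a manifest. Values are inserted as-is, with no escaping,
    so only pass trusted input."""
    options = ["MANIFEST", "GZIP", f"DELIMITER '{delimiter}'"]
    if max_errors > 0:
        # Tolerate this many bad rows before the load fails.
        options.append(f"MAXERROR {max_errors}")
    if truncate_columns:
        # Cut over-long strings to the column width instead of failing.
        options.append("TRUNCATECOLUMNS")
    if replace_invalid_chars:
        # Replace invalid UTF-8 characters with '?'.
        options.append("ACCEPTINVCHARS AS '?'")
    if blanks_as_null:
        # Load whitespace-only fields as NULL.
        options.append("BLANKSASNULL")
    head = f"COPY {table}\nFROM '{manifest_url}'\nIAM_ROLE '{iam_role}'\n"
    return head + "\n".join(options) + ";"


print(build_copy(
    "public.orders",
    "s3://my-bucket/stage/orders.manifest",            # placeholder URL
    "arn:aws:iam::123456789012:role/ExampleRole",      # placeholder role
    max_errors=10,
    replace_invalid_chars=True,
))
```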
12. NOW WE ARE FINALLY READY TO EXECUTE OUR SSIS PACKAGE. ONCE IT IS
DONE YOU CAN REVIEW THE LOG. HERE IS A SAMPLE EXECUTION LOG.
Conclusion
In this article we outlined the steps needed to load data into Redshift from a relational source (e.g.
MySQL, SQL Server, Oracle). Redshift is a great way to offload your expensive data warehouse to the cloud so
you don’t have to worry about costly maintenance and future growth. With Redshift you can grow your data
size from gigabytes to petabytes. The SSIS Amazon Redshift Data Transfer Task gives you an easy-to-use way to
maintain your Redshift data transfer process, with fast load options (for full or incremental
loads).
Again, this was just a proof of concept, and we encourage you to do your own benchmarking and research to see
which approach best suits your needs.
• Related Links:
• SSIS Amazon Redshift Data Transfer Task

More Related Content

What's hot

Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...Amazon Web Services
 
AWS re:Invent 2016: Building a Solid Business Case for Cloud Migration (ENT308)
AWS re:Invent 2016: Building a Solid Business Case for Cloud Migration (ENT308)AWS re:Invent 2016: Building a Solid Business Case for Cloud Migration (ENT308)
AWS re:Invent 2016: Building a Solid Business Case for Cloud Migration (ENT308)Amazon Web Services
 
JavaScript for Hackers.pdf
JavaScript for Hackers.pdfJavaScript for Hackers.pdf
JavaScript for Hackers.pdfnafees40
 
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...Amazon Web Services
 
DAS Slides: Data Quality Best Practices
DAS Slides: Data Quality Best PracticesDAS Slides: Data Quality Best Practices
DAS Slides: Data Quality Best PracticesDATAVERSITY
 
Building a Better Business Case for Migrating to Cloud
Building a Better Business Case for Migrating to CloudBuilding a Better Business Case for Migrating to Cloud
Building a Better Business Case for Migrating to CloudAmazon Web Services
 
[REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1) - AWS re:Inve...
[REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1) - AWS re:Inve...[REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1) - AWS re:Inve...
[REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1) - AWS re:Inve...Amazon Web Services
 
AWS Batch를 통한 손쉬운 일괄 처리 작업 관리하기 - 윤석찬 (AWS 테크에반젤리스트)
AWS Batch를 통한 손쉬운 일괄 처리 작업 관리하기 - 윤석찬 (AWS 테크에반젤리스트)AWS Batch를 통한 손쉬운 일괄 처리 작업 관리하기 - 윤석찬 (AWS 테크에반젤리스트)
AWS Batch를 통한 손쉬운 일괄 처리 작업 관리하기 - 윤석찬 (AWS 테크에반젤리스트)Amazon Web Services Korea
 
AWS Summit Singapore 2019 | Banking in the Cloud: 10 Lessons Learned
AWS Summit Singapore 2019 | Banking in the Cloud: 10 Lessons LearnedAWS Summit Singapore 2019 | Banking in the Cloud: 10 Lessons Learned
AWS Summit Singapore 2019 | Banking in the Cloud: 10 Lessons LearnedAWS Summits
 
Various Cloud offerings AWS/AZURE/GCP
Various Cloud offerings AWS/AZURE/GCPVarious Cloud offerings AWS/AZURE/GCP
Various Cloud offerings AWS/AZURE/GCPMohammad Imran Ansari
 
[Pcamp19] - Scaling Nubank`s customer service with machine learning - Gustavo...
[Pcamp19] - Scaling Nubank`s customer service with machine learning - Gustavo...[Pcamp19] - Scaling Nubank`s customer service with machine learning - Gustavo...
[Pcamp19] - Scaling Nubank`s customer service with machine learning - Gustavo...Product Camp Brasil
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Emprovise
 
AWS Financial Governance Practice
AWS Financial Governance Practice AWS Financial Governance Practice
AWS Financial Governance Practice Amir Arama
 
Top 7 Capabilities for Next-Gen Master Data Management
Top 7 Capabilities for Next-Gen Master Data ManagementTop 7 Capabilities for Next-Gen Master Data Management
Top 7 Capabilities for Next-Gen Master Data ManagementDATAVERSITY
 
Cloud Migration: Moving Data and Infrastructure to the Cloud
Cloud Migration: Moving Data and Infrastructure to the CloudCloud Migration: Moving Data and Infrastructure to the Cloud
Cloud Migration: Moving Data and Infrastructure to the CloudSafe Software
 
Disrupting Traditional Payment Systems Architecture with AWS (FSV320) - AWS r...
Disrupting Traditional Payment Systems Architecture with AWS (FSV320) - AWS r...Disrupting Traditional Payment Systems Architecture with AWS (FSV320) - AWS r...
Disrupting Traditional Payment Systems Architecture with AWS (FSV320) - AWS r...Amazon Web Services
 
Multi-cloud strategies and services
Multi-cloud strategies and servicesMulti-cloud strategies and services
Multi-cloud strategies and servicesTatiana Lavrentieva
 

What's hot (20)

Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
 
AWS Adoption in FSI
AWS Adoption in FSIAWS Adoption in FSI
AWS Adoption in FSI
 
Ecosystem Design
Ecosystem DesignEcosystem Design
Ecosystem Design
 
AWS re:Invent 2016: Building a Solid Business Case for Cloud Migration (ENT308)
AWS re:Invent 2016: Building a Solid Business Case for Cloud Migration (ENT308)AWS re:Invent 2016: Building a Solid Business Case for Cloud Migration (ENT308)
AWS re:Invent 2016: Building a Solid Business Case for Cloud Migration (ENT308)
 
JavaScript for Hackers.pdf
JavaScript for Hackers.pdfJavaScript for Hackers.pdf
JavaScript for Hackers.pdf
 
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
Introduction to the Well-Architected Framework and Tool - SVC208 - Anaheim AW...
 
DAS Slides: Data Quality Best Practices
DAS Slides: Data Quality Best PracticesDAS Slides: Data Quality Best Practices
DAS Slides: Data Quality Best Practices
 
Building a Better Business Case for Migrating to Cloud
Building a Better Business Case for Migrating to CloudBuilding a Better Business Case for Migrating to Cloud
Building a Better Business Case for Migrating to Cloud
 
[REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1) - AWS re:Inve...
[REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1) - AWS re:Inve...[REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1) - AWS re:Inve...
[REPEAT 1] Executing a Large-Scale Migration to AWS (ENT205-R1) - AWS re:Inve...
 
AWS Batch를 통한 손쉬운 일괄 처리 작업 관리하기 - 윤석찬 (AWS 테크에반젤리스트)
AWS Batch를 통한 손쉬운 일괄 처리 작업 관리하기 - 윤석찬 (AWS 테크에반젤리스트)AWS Batch를 통한 손쉬운 일괄 처리 작업 관리하기 - 윤석찬 (AWS 테크에반젤리스트)
AWS Batch를 통한 손쉬운 일괄 처리 작업 관리하기 - 윤석찬 (AWS 테크에반젤리스트)
 
AWS Summit Singapore 2019 | Banking in the Cloud: 10 Lessons Learned
AWS Summit Singapore 2019 | Banking in the Cloud: 10 Lessons LearnedAWS Summit Singapore 2019 | Banking in the Cloud: 10 Lessons Learned
AWS Summit Singapore 2019 | Banking in the Cloud: 10 Lessons Learned
 
Various Cloud offerings AWS/AZURE/GCP
Various Cloud offerings AWS/AZURE/GCPVarious Cloud offerings AWS/AZURE/GCP
Various Cloud offerings AWS/AZURE/GCP
 
[Pcamp19] - Scaling Nubank`s customer service with machine learning - Gustavo...
[Pcamp19] - Scaling Nubank`s customer service with machine learning - Gustavo...[Pcamp19] - Scaling Nubank`s customer service with machine learning - Gustavo...
[Pcamp19] - Scaling Nubank`s customer service with machine learning - Gustavo...
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
 
AWS Financial Governance Practice
AWS Financial Governance Practice AWS Financial Governance Practice
AWS Financial Governance Practice
 
Top 7 Capabilities for Next-Gen Master Data Management
Top 7 Capabilities for Next-Gen Master Data ManagementTop 7 Capabilities for Next-Gen Master Data Management
Top 7 Capabilities for Next-Gen Master Data Management
 
Application Portfolio Migration
Application Portfolio MigrationApplication Portfolio Migration
Application Portfolio Migration
 
Cloud Migration: Moving Data and Infrastructure to the Cloud
Cloud Migration: Moving Data and Infrastructure to the CloudCloud Migration: Moving Data and Infrastructure to the Cloud
Cloud Migration: Moving Data and Infrastructure to the Cloud
 
Disrupting Traditional Payment Systems Architecture with AWS (FSV320) - AWS r...
Disrupting Traditional Payment Systems Architecture with AWS (FSV320) - AWS r...Disrupting Traditional Payment Systems Architecture with AWS (FSV320) - AWS r...
Disrupting Traditional Payment Systems Architecture with AWS (FSV320) - AWS r...
 
Multi-cloud strategies and services
Multi-cloud strategies and servicesMulti-cloud strategies and services
Multi-cloud strategies and services
 

Viewers also liked

Migration to Redshift from SQL Server
Migration to Redshift from SQL ServerMigration to Redshift from SQL Server
Migration to Redshift from SQL Serverjoeharris76
 
Como o Magazine Luiza inova suas operações utilizando as soluções de IoT e Bi...
Como o Magazine Luiza inova suas operações utilizando as soluções de IoT e Bi...Como o Magazine Luiza inova suas operações utilizando as soluções de IoT e Bi...
Como o Magazine Luiza inova suas operações utilizando as soluções de IoT e Bi...Amazon Web Services LATAM
 
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...Amazon Web Services
 
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesAmazon Web Services
 

Viewers also liked (6)

Migration to Redshift from SQL Server
Migration to Redshift from SQL ServerMigration to Redshift from SQL Server
Migration to Redshift from SQL Server
 
Começando com Amazon Redshift
Começando com Amazon RedshiftComeçando com Amazon Redshift
Começando com Amazon Redshift
 
Como o Magazine Luiza inova suas operações utilizando as soluções de IoT e Bi...
Como o Magazine Luiza inova suas operações utilizando as soluções de IoT e Bi...Como o Magazine Luiza inova suas operações utilizando as soluções de IoT e Bi...
Como o Magazine Luiza inova suas operações utilizando as soluções de IoT e Bi...
 
Amazon Redshift Masterclass
Amazon Redshift MasterclassAmazon Redshift Masterclass
Amazon Redshift Masterclass
 
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
(BDT303) Construct Your ETL Pipeline with AWS Data Pipeline, Amazon EMR, and ...
 
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar SeriesMigrate your Data Warehouse to Amazon Redshift - September Webinar Series
Migrate your Data Warehouse to Amazon Redshift - September Webinar Series
 

Similar to SQL Server to Redshift Data Load Using SSIS

Amazon Redshift For Data Analysts
Amazon Redshift For Data AnalystsAmazon Redshift For Data Analysts
Amazon Redshift For Data AnalystsCan Abacıgil
 
Databases on aws part 1
Databases on aws   part 1Databases on aws   part 1
Databases on aws part 1Parag Patil
 
Aws overview part 3(databases, dns and management services)
Aws overview   part 3(databases, dns and management services)Aws overview   part 3(databases, dns and management services)
Aws overview part 3(databases, dns and management services)Parag Patil
 
Amazon dynamodb & amazon redshift
Amazon dynamodb & amazon redshiftAmazon dynamodb & amazon redshift
Amazon dynamodb & amazon redshiftSumeraHangi
 
AWS July Webinar Series: Amazon redshift migration and load data 20150722
AWS July Webinar Series: Amazon redshift migration and load data 20150722AWS July Webinar Series: Amazon redshift migration and load data 20150722
AWS July Webinar Series: Amazon redshift migration and load data 20150722Amazon Web Services
 
Spring boot-application
Spring boot-applicationSpring boot-application
Spring boot-applicationParag Patil
 
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...Amazon Web Services
 
Building AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and TableauBuilding AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and TableauLynn Langit
 
How To Create RDS Database for WordPress in AWS on RHEL 7 or CentOS 7
How To Create RDS Database for WordPress in AWS on RHEL 7 or CentOS 7How To Create RDS Database for WordPress in AWS on RHEL 7 or CentOS 7
How To Create RDS Database for WordPress in AWS on RHEL 7 or CentOS 7VCP Muthukrishna
 
Melhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftMelhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftAmazon Web Services LATAM
 
Hands-on Lab: Migrating Oracle to PostgreSQL
Hands-on Lab: Migrating Oracle to PostgreSQL Hands-on Lab: Migrating Oracle to PostgreSQL
Hands-on Lab: Migrating Oracle to PostgreSQL Amazon Web Services
 
Aws schema conversion tool
Aws schema conversion toolAws schema conversion tool
Aws schema conversion toolanshuman mishra
 

Similar to SQL Server to Redshift Data Load Using SSIS (20)

Amazon Redshift For Data Analysts
Amazon Redshift For Data AnalystsAmazon Redshift For Data Analysts
Amazon Redshift For Data Analysts
 
Databases on aws part 1
Databases on aws   part 1Databases on aws   part 1
Databases on aws part 1
 
Aws overview part 3(databases, dns and management services)
Aws overview   part 3(databases, dns and management services)Aws overview   part 3(databases, dns and management services)
Aws overview part 3(databases, dns and management services)
 
Amazon dynamodb & amazon redshift
Amazon dynamodb & amazon redshiftAmazon dynamodb & amazon redshift
Amazon dynamodb & amazon redshift
 
AWS July Webinar Series: Amazon redshift migration and load data 20150722
AWS July Webinar Series: Amazon redshift migration and load data 20150722AWS July Webinar Series: Amazon redshift migration and load data 20150722
AWS July Webinar Series: Amazon redshift migration and load data 20150722
 
Spring boot-application
Spring boot-applicationSpring boot-application
Spring boot-application
 
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
 
Building AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and TableauBuilding AWS Redshift Data Warehouse with Matillion and Tableau
Building AWS Redshift Data Warehouse with Matillion and Tableau
 
Lampstack (1)
Lampstack (1)Lampstack (1)
Lampstack (1)
 
Aws setup
Aws setupAws setup
Aws setup
 
How To Create RDS Database for WordPress in AWS on RHEL 7 or CentOS 7
How To Create RDS Database for WordPress in AWS on RHEL 7 or CentOS 7How To Create RDS Database for WordPress in AWS on RHEL 7 or CentOS 7
How To Create RDS Database for WordPress in AWS on RHEL 7 or CentOS 7
 
Loading Data into Redshift
Loading Data into RedshiftLoading Data into Redshift
Loading Data into Redshift
 
Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Redshift Deep Dive
Amazon Redshift Deep Dive
 
Loading Data into Redshift
Loading Data into RedshiftLoading Data into Redshift
Loading Data into Redshift
 
CDS Views.pptx
CDS Views.pptxCDS Views.pptx
CDS Views.pptx
 
Melhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon RedshiftMelhores práticas de data warehouse no Amazon Redshift
Melhores práticas de data warehouse no Amazon Redshift
 
Mysql
MysqlMysql
Mysql
 
Hands-on Lab: Migrating Oracle to PostgreSQL
Hands-on Lab: Migrating Oracle to PostgreSQL Hands-on Lab: Migrating Oracle to PostgreSQL
Hands-on Lab: Migrating Oracle to PostgreSQL
 
Aws schema conversion tool
Aws schema conversion toolAws schema conversion tool
Aws schema conversion tool
 
Simple ETL Solution - Marco Kiesewetter
Simple ETL Solution - Marco KiesewetterSimple ETL Solution - Marco Kiesewetter
Simple ETL Solution - Marco Kiesewetter
 

Recently uploaded

why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
SQL Server to Redshift Data Load Using SSIS

  • 1. SQL Server To Redshift Data Load Using SSIS Reach for the Clouds, Inc. Next Generation SSIS Tasks and Connectors Series AUTHOR: NAYAN PATEL | SR. ETL SSIS ARCHITECT NPATEL@RFTCLOUDS.COM
  • 2. Content • Introduction – SQL Server to Redshift Load • Video Tutorial – Redshift Data Load • Right way but hard way • Steps for Amazon Redshift Data Load from On-Premise files or RDBMS (e.g. MySQL, SQL Server) • Doing it the easy way • Should I use SSIS to load Redshift • Setup your Amazon Redshift Cluster • Add inbound rule for Redshift Cluster • Automate Redshift Cluster Creation • Create Sample table and data in Source – (in this example SQL Server) • Create Sample table in Amazon Redshift • SQL Server to Redshift Data Load using SSIS • Conclusion - Related Links
  • 3. Introduction – SQL Server to Redshift Load • Before we talk about loading data from SQL Server to Redshift using SSIS, let’s cover what Amazon Redshift (sometimes referred to as AWS Redshift) is. Amazon Redshift is a cloud-based data warehouse service. This type of system is also referred to as MPP (Massively Parallel Processing). Amazon Redshift uses a highly modified version of the PostgreSQL engine behind the scenes. Amazon Redshift offers the advantage of scaling as you go, at a very low cost compared to an on-site dedicated hardware/software approach.
  • 4. Right way but hard way • If you read some of the guidelines published by Amazon regarding Redshift data load, you will quickly realize that there is a lot to do under the covers to get it going the right way. Here are a few steps you will have to perform while loading data to Redshift from your on-premises server (data can be sitting in files or in a relational source).
  • 5. Right way but hard way Steps for Amazon Redshift Data Load from On-Premise files or RDBMS (e.g. MySQL, SQL Server) • Export local RDBMS data to flat files (make sure you remove invalid characters and apply escape sequences during export) • Split files into 10-15 MB each to get optimal performance during upload and the final data load • Compress files to *.gz format so you don’t end up with a $1000 surprise bill :) In my case text files were compressed 10-20 times • List all file names in a manifest file so that when you issue the COPY command to Redshift they are treated as one unit of load • Upload the manifest file to an Amazon S3 bucket • Upload the local *.gz files to an Amazon S3 bucket • Issue the Redshift COPY command with different options • Schedule file archiving from on-premises and the S3 staging area on AWS • Capture errors and set up restartability in case something fails
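The split, compress, and manifest steps above can be sketched in a few lines of Python. This is only an illustration of the manual route, not what the SSIS task does internally. The function name `stage_rows`, the file-name pattern, and the `s3://my-bucket/stage/` prefix are hypothetical, and rows per file is used as a simple stand-in for the 10-15 MB size target. The manifest layout (an `entries` list of `url` and `mandatory` pairs) follows the format the Redshift COPY command expects.

```python
import csv
import gzip
import json
from pathlib import Path


def stage_rows(rows, out_dir, prefix, s3_prefix, rows_per_file=50_000):
    """Write rows to gzip-compressed CSV parts and build a COPY manifest.

    Returns (list of local part paths, manifest dict). Uploading the parts
    and the manifest to S3 is left out of this sketch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    handle = None
    writer = None
    try:
        for index, row in enumerate(rows):
            # Start a new part every rows_per_file rows.
            if index % rows_per_file == 0:
                if handle is not None:
                    handle.close()
                path = out_dir / f"{prefix}_{len(parts):04d}.csv.gz"
                handle = gzip.open(path, "wt", newline="", encoding="utf-8")
                writer = csv.writer(handle)
                parts.append(path)
            writer.writerow(row)
    finally:
        if handle is not None:
            handle.close()
    # One manifest entry per part, pointing at its intended S3 location.
    manifest = {
        "entries": [
            {"url": f"{s3_prefix.rstrip('/')}/{part.name}", "mandatory": True}
            for part in parts
        ]
    }
    return parts, manifest


if __name__ == "__main__":
    sample = [(i, f"customer-{i}") for i in range(5)]
    _, manifest = stage_rows(sample, "staging", "customers",
                             "s3://my-bucket/stage/", rows_per_file=2)
    print(json.dumps(manifest, indent=2))
```

The `csv` module handles quoting and escaping of embedded delimiters, which covers the "apply escape sequences during export" step for simple cases.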
  • 6. Doing it the easy way • So if you are not sure you are ready to code the many steps listed above, you can use the Amazon Redshift Data Transfer Task. • In the next few sections we will describe how to set up your Redshift cluster for demo purposes and load data from SQL Server to Redshift using SSIS.
  • 7. Doing it the easy way Should I use SSIS to load Redshift • If you are curious which approach to use to load data, consider a few questions • Do you have existing ETL processes written in SSIS? • Do you need a more visual approach and better workflow management (which SSIS provides)? • Do you need connection string encryption and other goodies offered by SSIS, such as native logging and passing parameters from the SSIS environment? • Do you have SSIS expertise available in-house, or are you better off staying with command line scripts? • Do you need to create a workflow that can run on any server, including ones where SSIS is not installed?
  • 8. Setup your Amazon Redshift Cluster NOTE: SKIP THIS STEP IF YOU HAVE ALREADY SET UP YOUR REDSHIFT CLUSTER 1. LOG IN TO YOUR AWS CONSOLE AND CLICK ON THE REDSHIFT ICON, OR CLICK HERE TO GO DIRECTLY TO REDSHIFT 2. CLICK ON LAUNCH CLUSTER 3. ON THE CLUSTER DETAIL PAGE SPECIFY CLUSTER IDENTIFIER, DATABASE NAME, PORT, MASTER USER AND PASSWORD. CLICK CONTINUE TO GO TO THE NEXT PAGE
  • 9. Setup your Amazon Redshift Cluster 4. ON THE NODE CONFIGURATION PAGE SPECIFY NODE TYPE (THIS IS THE VM TYPE), CLUSTER TYPE AND NUMBER OF NODES. IF YOU ARE TRYING THIS UNDER THE FREE TIER THEN SELECT THE SMALLEST NODE POSSIBLE (IN THIS CASE IT WAS DW2.LARGE). CLICK CONTINUE TO GO TO THE NEXT PAGE
  • 10. Setup your Amazon Redshift Cluster 5. ON THE ADDITIONAL CONFIGURATION PAGE YOU CAN PICK THE VPC (VIRTUAL PRIVATE CLOUD), THE SECURITY GROUP FOR THE CLUSTER AND OTHER OPTIONS FOR ENCRYPTION. FOR DEMO PURPOSES SELECT THE OPTIONS SHOWN IN THE SCREENSHOT BELOW. CLICK CONTINUE TO REVIEW YOUR SETTINGS AND CLICK CREATE CLUSTER
  • 11. Setup your Amazon Redshift Cluster 6. GIVE IT A FEW MINUTES WHILE YOUR CLUSTER IS BEING CREATED. AFTER A FEW MINUTES (5-10 MINS) YOU CAN GO BACK TO THE SAME PAGE AND REVIEW CLUSTER STATUS AND OTHER PROPERTIES AS BELOW. COPY THE CLUSTER ENDPOINT SOMEWHERE BECAUSE WE WILL NEED IT LATER.
  • 12. Add inbound rule for Redshift Cluster NOTE: SKIP THIS STEP IF YOU HAVE ALREADY ADDED YOUR IP TO AN INBOUND EXCEPTION RULE. BY DEFAULT YOU CANNOT CONNECT TO AN AMAZON REDSHIFT CLUSTER FROM OUTSIDE THE AWS NETWORK (E.G. FROM YOUR ON-PREMISES MACHINE). IF YOU WISH TO CONNECT THEN YOU MUST ADD AN INBOUND EXCEPTION RULE TO ALLOW YOUR REQUESTS TO THE REDSHIFT CLUSTER ON A SPECIFIC PORT. TO CREATE A NEW INBOUND RULE PERFORM THE FOLLOWING STEPS 1. ON THE REDSHIFT HOME PAGE CLICK THE [SECURITY] TAB. YOU MAY SEE THE FOLLOWING NOTICE DEPENDING ON WHICH REGION YOU ARE IN. CLICK ON THE [GO TO THE EC2 CONSOLE] LINK OR GO DIRECTLY TO EC2 BY CLICKING THE SERVICES -> EC2 MENU AT THE TOP
  • 13. Add inbound rule for Redshift Cluster 2. ON THE EC2 SECURITY GROUPS PAGE SELECT THE SECURITY GROUP ATTACHED TO YOUR REDSHIFT CLUSTER AND THEN IN THE BOTTOM PANE CLICK ON THE INBOUND TAB 3. ON THE INBOUND TAB CLICK THE EDIT OPTION TO MODIFY THE DEFAULT ENTRY OR ADD A NEW RULE 4. CLICK ON ADD RULE IF YOU WISH TO ADD A NEW ENTRY, ELSE EDIT AS BELOW AND CLICK SAVE
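For readers who prefer scripting the inbound rule instead of clicking through the console, the rule itself is a small data structure. The sketch below only builds that structure using the Python standard library. The helper name `inbound_rule`, the sample IP address, and the description text are hypothetical. Port 5439 is the Redshift default and should be replaced if a different port was chosen when the cluster was created. The resulting dict has the shape expected in the `IpPermissions` list of the boto3 EC2 client call `authorize_security_group_ingress`, which is not invoked here.

```python
import ipaddress

REDSHIFT_DEFAULT_PORT = 5439  # default Redshift port; change if yours differs


def inbound_rule(client_ip, port=REDSHIFT_DEFAULT_PORT):
    """Build a security-group rule allowing one IPv4 address to reach the cluster."""
    address = ipaddress.ip_address(client_ip)  # raises ValueError on bad input
    if address.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # /32 restricts the rule to exactly one address.
        "IpRanges": [{"CidrIp": f"{address}/32",
                      "Description": "on-premises SSIS host"}],
    }


if __name__ == "__main__":
    print(inbound_rule("203.0.113.10"))
```

Restricting the rule to a single /32 address, rather than opening the port to 0.0.0.0/0, keeps the cluster reachable only from the machine that runs the load.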
  • 14. Automate Redshift Cluster Creation If you need to automate Redshift cluster creation or any of the following actions, check out the Redshift Cluster Management Task • Automate the Amazon Redshift Cluster Create action in a few clicks. You can also add an Access Security Rule. • Automate the Amazon Redshift Cluster Delete action • Fetch an Amazon Redshift cluster property into an SSIS variable (e.g. fetch cluster status) • Fetch all clusters and their properties as a DataTable (use a ForEach Loop to iterate through all clusters) • Automate Redshift Cluster Snapshot creation • Automate the Redshift Cluster Snapshot Delete action • Support for waiting until a cluster operation is done
  • 15. Create Sample table and data in Source – (in this example SQL Server) Note: Skip this step if you wish to use your own table. If you do so, please ignore certain steps and screenshots mentioned in this article. For this demo we will use the free Northwind sample database supplied by Microsoft. • Download the sample database from here. • Extract the zip file -> open the *.sql file and run it to create a new database with sample tables and data.
  • 16. Create Sample table in Amazon Redshift 4. DOUBLE CLICK ON THE TASK TO SEE THE UI. 5. CLICK ON [NEW] CONNECTION. 6. CONFIGURE REDSHIFT CONNECTION PROPERTIES AND CLICK TEST.
  • 17. Create Sample table in Amazon Redshift 7. ONCE THE TEST CONNECTION IS SUCCESSFUL, CLICK OK TO SAVE THE CONNECTION DETAILS. 8. ENTER THE FOLLOWING SCRIPT IN THE SQL TEXTBOX AND HIT OK TO SAVE IT.
  • 18. Create Sample table in Amazon Redshift 9. NOW RIGHT CLICK ON THE TASK AND EXECUTE. THIS SHOULD CREATE A NEW TABLE IN REDSHIFT.
  • 19. SQL Server to Redshift Data Load using SSIS Once the table is created, let’s do the real work to get data moving from SQL Server to Amazon Redshift. Perform the following steps to configure the SSIS Amazon Redshift Data Transfer Task 1. Drag the Amazon Redshift Data Transfer Task onto the SSIS designer surface. 2. Double click on the task to edit properties. 3. Select Action: In the top Action drop down select the Bulk Import to Redshift from any RDBMS (e.g. MySQL, Oracle, SQL Server) option 4. Configure Source: On the Source tab click [New] next to the connection dropdown and configure the Source connection or pick an existing connection. In our case we are extracting data from a SQL Server database (Northwind) on the local server. Enter the following SQL Query to extract 100,000 rows from SQL Server
  • 20. Create Sample table in Amazon Redshift 5. CONFIGURE SOURCE STAGING AREA: ON THE SOURCE TAB YOU HAVE TO ENTER THE FOLDER LOCATION WHERE STAGING FILES WILL BE SAVED BEFORE WE UPLOAD TO REDSHIFT (SEE ABOVE SCREEN).
  • 21. Create Sample table in Amazon Redshift 6. CONFIGURE TARGET: ON THE TARGET TAB SELECT AN EXISTING REDSHIFT CONNECTION MANAGER (OR CREATE A NEW ONE), THEN SELECT THE TARGET TABLE FROM THE DROPDOWN WHERE YOU WANT TO LOAD DATA. IF YOU HAVE A LONG LIST OF TABLES THEN SIMPLY ENTER THE SCHEMA NAME IN THE SCHEMA FILTER TEXT BOX AND CLICK REFRESH TO RELOAD THE TABLE DROPDOWN WITH FEWER ITEMS.
  • 22. Create Sample table in Amazon Redshift 7. CONFIGURE RELOAD OPTION AND TARGET STAGING AREA: ON THE TARGET TAB CHECK THE TRUNCATE TARGET TABLE OPTION IF YOU WANT TO RELOAD EACH TIME YOU EXECUTE THIS TASK, ELSE LEAVE IT UNCHECKED TO APPEND RECORDS. WE ALSO HAVE TO SPECIFY THE AMAZON S3 STAGING AREA WHERE REDSHIFT WILL LOOK FOR FILES TO LOAD.
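To make the reload-versus-append choice concrete, here is roughly what a hand-written equivalent might issue against Redshift. This is a sketch under assumptions, not the statements the task actually generates. The function name `load_statements`, the table name, the S3 manifest location, and the IAM role ARN are all hypothetical, and authorizing with `IAM_ROLE` is only one of the options COPY accepts. Values are interpolated without escaping, so the inputs are assumed to be trusted.

```python
def load_statements(table, manifest_url, iam_role, truncate=False, delimiter=","):
    """Return the SQL statements for a full reload (truncate=True) or an append."""
    statements = []
    if truncate:
        # Full reload: empty the target table before loading.
        statements.append(f"TRUNCATE TABLE {table};")
    # Load every gzip-compressed file listed in the manifest as one unit.
    statements.append(
        f"COPY {table} FROM '{manifest_url}' "
        f"IAM_ROLE '{iam_role}' "
        f"DELIMITER '{delimiter}' GZIP MANIFEST;"
    )
    return statements


if __name__ == "__main__":
    for sql in load_statements(
        "public.customers",
        "s3://my-bucket/stage/customers.manifest",
        "arn:aws:iam::123456789012:role/my-redshift-load-role",
        truncate=True,
    ):
        print(sql)
```

With `truncate=False` only the COPY statement is returned, which appends the staged rows to whatever the table already holds.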
  • 23. Create Sample table in Amazon Redshift 8. CONFIGURE FILE FORMAT: WE ARE GOING TO GENERATE CSV FILES FOR THE REDSHIFT LOAD SO MAKE SURE YOU SELECT THE CORRECT COLUMN DELIMITER. ALSO MAKE SURE YOU CHECK THE ALWAYS COMPRESS FILE OPTION TO REDUCE BANDWIDTH.
  • 24. Create Sample table in Amazon Redshift 9. CONFIGURE ARCHIVE OPTIONS: ON THE ARCHIVE TAB WE CAN SPECIFY HOW TO ARCHIVE THE SOURCE AND TARGET FILES WE GENERATED. SOURCE FILES ARE CSV FILES AND SOURCE STAGE FILES ARE *.GZ FILES (IF YOU SELECT COMPRESSION). TARGET STAGE FILES ARE EITHER CSV OR *.GZ FILES. BY DEFAULT SOURCE CSV FILES ARE KEPT AND ALL OTHER STAGE FILES ARE DELETED. SEE THE SCREENSHOT BELOW
  • 25. Create Sample table in Amazon Redshift 10. CONFIGURE ADVANCED OPTIONS: ON THE ADVANCED OPTIONS TAB YOU CAN FINE TUNE THE LOAD PROCESS, SUCH AS HOW TO HANDLE NULL DATA, HOW TO HANDLE DATA TRUNCATION ETC. READ THE HELP FILE FOR MORE INFO
  • 26. Create Sample table in Amazon Redshift 11. CONFIGURE ERROR HANDLING OPTIONS: ON THE ERROR HANDLING TAB YOU CAN SPECIFY HOW MANY ERRORS YOU WANT TO IGNORE BEFORE FAILING THE ENTIRE LOAD. YOU CAN ALSO REPLACE SOME INVALID CHARACTERS DURING YOUR LOAD IF YOU CHECK THE [ALLOW INVALID CHARACTERS] OPTION.
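The Redshift COPY command has options that correspond to the two error-handling settings described above: `MAXERROR` sets how many bad rows are tolerated before the load fails, and `ACCEPTINVCHARS` replaces invalid UTF-8 characters with a chosen character. Whether the task maps its checkboxes to exactly these options is an assumption. The helper `error_handling_options` below is hypothetical and only builds the option text that would be appended to a COPY statement.

```python
def error_handling_options(max_errors=0, replace_invalid_with=None):
    """Build the error-handling portion of a Redshift COPY statement."""
    if max_errors < 0:
        raise ValueError("max_errors must not be negative")
    options = []
    if max_errors:
        # Tolerate up to max_errors rejected rows before the load fails.
        options.append(f"MAXERROR {max_errors}")
    if replace_invalid_with is not None:
        if len(replace_invalid_with) != 1:
            raise ValueError("replacement must be a single character")
        # Substitute invalid UTF-8 characters instead of rejecting the row.
        options.append(f"ACCEPTINVCHARS AS '{replace_invalid_with}'")
    return " ".join(options)


if __name__ == "__main__":
    print(error_handling_options(max_errors=10, replace_invalid_with="?"))
```

Leaving both arguments at their defaults returns an empty string, which keeps COPY's default behavior of failing on the first error.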
  • 27. Create Sample table in Amazon Redshift 12. NOW FINALLY WE ARE READY TO EXECUTE OUR SSIS PACKAGE. ONCE IT’S DONE YOU CAN REVIEW THE LOG. HERE IS THE SAMPLE EXECUTION LOG.
  • 28. Conclusion In this article we outlined the steps needed to load data into Redshift from a relational source (e.g. MySQL, SQL Server, Oracle). Redshift is a great way to offload your expensive data warehouse to the cloud so you don’t have to worry about costly maintenance and future growth. With Redshift you can grow your data size from gigabytes to petabytes. The SSIS Amazon Redshift Data Transfer Task can give you an easy way to maintain your Redshift data transfer process, with ease of use and fast load options (for full or incremental load). Again, this was just a proof of concept, and we encourage you to do your own benchmarking and research to see which approach suits your needs best. • Related Links: • SSIS Amazon Redshift DataTransferTask