Data Replication Options in AWS
Thomas Park – Manager, Solutions Architecture
November 15, 2013

© 2013 Amazon.com, Inc. a...
Thomas Jefferson first
acquired the lettercopying device he called
"the finest invention of
the present age" in
March of 1...
Agenda
• Data replication design options in AWS
• Replication design factors and challenges
• Use cases
– Do-it-yourself o...
Data Replication Capabilities
Application Services

Compute

Storage
Networking
AWS Global Infrastructure

Database

Partn...
Data Replication Solution Architecture
AWS
Capabilities

•
•
•
•
•
•
•
•
•
•
•
•

Solutions
Architecture

Multiple Availab...
Focus of Our Discussion Today
Business Drivers

Databases

Replication
Options

Data Preservation

Design
Factors

Data
Ty...
Focus of Our Discussion Today
Business Drivers

Databases

Replication
Options

Data Preservation

Design
Factors

Data
Ty...
Design Options in AWS
AZ

Single
AZ

Databases

Files and
Objects

AZ

AZ

Multi-AZ

Region

Region

Cross
Region

Hybrid
...
Focus of Our Discussion Today
Business Drivers

Databases

Replication
Options

Data Preservation

Design
Factors

Data
Ty...
DR Metrics – RTO/RPO
2:00am

6:00am
RPO
4 Hours

Physical vs. Logical
Synchronous vs. Asynchronous
11:00am

RTO
5 Hours

T...
Data Replication Options in AWS
AZ

Single
AZ

Files and
Objects

AZ

Multi-AZ

Region

Region

Cross
Region

Hybrid
IT

P...
Focus of Our Discussion Today
Business Drivers

Databases

Replication
Options

Data Preservation

Design
Factors

Data
Ty...
Performance Metric – Total Time
600M Records and
320 GB in Size

Daily and Weekly
Updates

Source
DB

Estimated DB
Size
~3...
Performance Metric – Total Time

Source
DB

Estimated DB
Size
~70 TB

ETL

Estimated DB
Size
~48 TB

Can we STILL do this ...
Data Replication Options in AWS
AZ

Single
AZ

Multi-AZ

Region

Region

Cross
Region

Hybrid
IT

Performance

Files and
O...
Focus of Our Discussion Today
Business Drivers

Databases

Replication
Options

Data preservation

Design
Factors

Data
Ty...
Factors Affecting Replication Designs
Throughput

Source

Target

4

Bandwidth
Latency
6

1

2

Consistency

Replicate

Si...
Challenges in Replication

Consistency

Data
Size
Compute

Availability &
Performance

Storage

Change
Rate

Network

Infr...
Challenges in Replication

Consistency

Data
Size
Compute

Availability &
Performance

Storage

Network

Infrastructure Ca...
Challenges in Replication

Data
Size
Compute

Availability &
Performance
Storage

Network

Change
Rate
Database
Replication Design Options in AWS

Flexibility
The right tool for the right job

Options
Common Data Replication Scenarios
•
•
•
•
•
•

Hybrid IT
Database migration
HA databases
Increase throughput
Cross regions...
Please Meet Bob
• DBA for a large
enterprise company
• 10 years of IT
experience
• What is AWS?
Disaster Recovery
I can’t find
archlog_002
file!!!!!!

Sue, DBA

Bob, DBA
We Need a Better Way….
RTO is 8 hours
RPO is 1 hour
MySQL

SQL
Server

Oracle

5 - 6 hours

Daily

Bob, DBA
Demo – AWS S3 Upload

Generic Database

DB
Full Backup

Amazon S3
Bucket
Corporate Data Center
Think Parallel

2 Seconds

Multipart
Think Parallel
Foreach($file in $files) {Write-S3Object -BucketName mybucket -Key $file.filename}

2 Seconds

2 Seconds

2...
Think Parallel

Nearly 3 Days
Multiple Machines, Multiple Threads, Multiple Parts

Think Parallel

2 Seconds

2 Seconds

2 Seconds

2 Seconds

2 Seconds...
Demo – AWS S3 Multipart Upload

Generic Database

DB
Full Backup

Corporate Data center

Amazon S3
Bucket
Think Parallel
• Use multipart upload (API/SDK or command line
tools) – min part size is 5 MB
• Use multiple threads
– GNU...
Database Replication Options
How are you going
to replicate
databases?

Tom, Sys Admin

Bob, DBA
Database Replication Options in AWS
Bob’s Office

Amazon RDS
Replication
MySQL

SQL
Server

Oracle
Non-RDS to RDS Database Replication
Bob’s Office

1

Amazon RDS

Configure to Be a Master

Initialize

4
MySQL

mysqldump
...
Non-RDS to RDS Database Replication
Bob’s Office

5
Amazon RDS

MySQL

Availability Zone A
Corporate Data Center

Run mysq...
Non-RDS to RDS Database Replication
Bob’s Office

Amazon RDS

6

MySQL

Run mysql.rds_start_replication

Availability Zone...
Database Replication Options
Bob’s Office
Amazon RDS
MySQL

Restore

Log Shipping
SQL
Server

Oracle

Amazon S3
Bucket

SQ...
Database Replication Options
Bob’s Office
Amazon RDS
MySQL

SQL
Server
SQL
Server

OSB Cloud
Module
Oracle

O
S
B

RMAN
re...
Database Replication Options
• Amazon RDS MySQL
– Replication

• SQL Server and Oracle on EC2
– SQL server log shipping, a...
HA Database Replication Options
We need a highly
available solution.

Katie, Director

Bob, DBA
HA DB Replication Options
Data Guard Configuration
Prepare primary database
1. Enable logging
2. Add standby redo logs
3. ...
HA DB Replication Options
Physical
Synchronous Replication
Amazon RDS DB
Instance Standby
(Multi-AZ)

SQL
Server

Oracle

...
Increase Throughput
The order
system is
running
slowly.
Disk
I/O?

Hannah, Finance

Bob, DBA Manager
Increase Throughput Options
• Amazon EC2 instance type
– Amazon RDS MySQL

• PIOPS
– Amazon RDS MySQL

• Read replicas
– A...
Amazon RDS Performance Options
Asynchronous

m1.small

m2.4xlarge

Amazon
RDS DB
Instance
Read
Replica

Provision IOPS

Av...
us-east-1

Web

Web

Web
AS
SQL
Server

SQL
Server

Oracle

Oracle
Standby

Provision IOPS

Availability Zone A

Availabil...
Increase Throughput Options
• Amazon CloudFront
– Large objects

Bob, DBA Manager
us-east-1
CloudFront

Web

Web

Web

Logs

AS
SQL
Server

SQL
Server

Oracle

Availability Zone A

Web

Oracle
Standby

Av...
Increase Throughput Options
• Amazon DynamoDB
– Sessions, orders

Bob, DBA Manager
us-east-1
CloudFront

Web

Web

Web

Web

AS
SQL
Server

Oracle

Logs

SQL
Server

Oracle
Standby

Amazon DynamoDB

Automa...
Cross-region Replication Options
We are
opening a new
development
site in….

Bella, VP

Bob, Architect
Us-east-1

•
•
•
•

Tokyo

Replicate AMIs and Amazon EBS snapshots
Replicate Amazon DynamoDB tables
Replicate Amazon RDS s...
Demo Cross-region Replication Options
Copy

Create
EC2

AMI

Amazon EBS

Amazon EBS Snapshot

Amazon RDS Snapshot

Amazon ...
Replicate Amazon S3 Bucket

Source
Bucket with
Objects

Destination
Bucket
Amazon S3 Copy
List bucket (D)

List bucket (S)
Controller

Copy Queue List(S-D)

Source
Bucket with
Objects

Destination
...
Hive Script to Compare Amazon S3 Buckets
Create external table sourcekeys (key string)
location ‘s3://mybucket/sourcebucke...
Data Warehouse Replication
We need to
understand
impact of price
changes.

Bob, Architect

Gus, Marketing
Data Warehouse Replication

Amazon
DynamoDB

BI
Reports
Bucket with
Objects

Oracle

Amazon Redshift
AWS Data Pipeline
JDBC

Amazon S3

DynamoDB

DB on
Instance

Amazon Redshift

Scheduler

Copy

EMR

Hive

Hive
copy

Pig

...
Demo Data Warehouse Replication

Amazon
DynamoDB

BI
Reports
Bucket with
Objects

Oracle

AWS Data Pipeline
Amazon Redshif...
Attunity CloudBeam for Amazon Redshift –
Incremental Load (CDC)
Amazon
Redshift

AWS region
5

‘Copy’ data to CDC table

C...
Business Intelligence
CloudFront

us-east-1

ap-northeast-1

Logs

Web

Web

Web

Web

Amazon
S3
Bucket

Amazon
Redshift

AS
AMI

SQL
Server

SQ...
Bob, Chief Architect
Key Takeaways
• Consider design factors and make trade offs, if
possible
• Think parallel
• Pick the right tool for the ri...
Please give us your feedback on this
presentation

ARC302
As a thank you, we will select prize
winners daily for completed...
Extra
Amazon DynamoDB Built-in Replication
Region B

Region A

Provisioned Throughput
Automatic 3-way Replication
Amazon S3
Buck...
Amazon Redshift Replication Patterns
Region B

Leader
Node

Copy
Unload
Amazon S3
Bucket

Amazon Redshift

10GigE

Compute...
WAN Acceleration Test

Instance

Instance

AP-southest-1

US-East-1

Instance

EU-West-1
Tsunami-UDP here’s how
Install Tsunami

Origin Server
$ cd /path/to/files
$ tsunamid *

Destination Server
$ cd /path/to/r...
BBCP here’s how
Install BBCP
$ sudo wget http://www.slac.stanford.edu/~abh/bbcp/bin/amd64_linux26/bbcp
$ sudo cp bbcp /usr...
Minutes to Send a 30GB File from US-East-1
60
50
40
30
20

10
0
To Dublin
SCP

To Singapore
BBCP

Tsunami
*** Note using h...
Environment & Configuration
Primary DB: prod
#Primary init.ora:
LOG_ARCHIVE_DEST_1='LOCATION=/data/oracle/Prod/db/archive
...
Listener.ora
Primary DB
prod =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =
primary) (PORT = 1526...
tnsnames.ora
Primary DB
#Primary tnsnames.ora:
STB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =
standby)(PORT =
152...
Data Replication Options in AWS (ARC302) | AWS re:Invent 2013
Upcoming SlideShare
Loading in …5
×

Data Replication Options in AWS (ARC302) | AWS re:Invent 2013

11,524 views

Published on

One of the most critical roles of an IT department is to protect and serve its corporate data. As a result, IT departments spend tremendous amounts of resources developing, designing, testing, and optimizing data recovery and replication options in order to improve data availability and service response time. This session outlines replication challenges, key design patterns, and methods commonly used in today’s IT environment. Furthermore, the session provides different data replication solutions available in the AWS cloud. Finally, the session outlines several key factors to be considered when implementing data replication architectures in the AWS cloud.

Published in: Technology

Data Replication Options in AWS (ARC302) | AWS re:Invent 2013

  1. 1. Data Replication Options in AWS Thomas Park – Manager, Solutions Architecture November 15, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  2. 2. Thomas Jefferson first acquired the lettercopying device he called "the finest invention of the present age" in March of 1804. http://www.monticello.org/site/house-and-gardens/polygraph
  3. 3. Agenda • Data replication design options in AWS • Replication design factors and challenges • Use cases – Do-it-yourself options – Managed and built-in AWS options • Demos
  4. 4. Data Replication Capabilities Application Services Compute Storage Networking AWS Global Infrastructure Database Partner Data Replication Capabilities AWS Data Replication Capabilities Deployment & Administration
  5. 5. Data Replication Solution Architecture AWS Capabilities • • • • • • • • • • • • Solutions Architecture Multiple Availability Zones Evaluate Multiple regions Options AMI Amazon EBS and DB snapshots AMI copy Amazon EBS and DB snapshot copy Multi-AZ DBs Read replica DBs Provisioned IOPS Offline backups and achieves Data lifecycle policies Highly durable storage Business Drivers Measure Metrics • • • • • • • • • • Business continuity Disaster recovery Customer experience Productivity Mismatched SLA Compliance Reducing cost Information security risks Global expansion Performance/availability
  6. 6. Focus of Our Discussion Today Business Drivers Databases Replication Options Data Preservation Design Factors Data Type Design Options Storage/Content (Files and Objects) Performance
  7. 7. Focus of Our Discussion Today Business Drivers Databases Replication Options Data Preservation Design Factors Data Type Design Options Storage/Content (Files and Objects) Performance
  8. 8. Design Options in AWS AZ Single AZ Databases Files and Objects AZ AZ Multi-AZ Region Region Cross Region Hybrid IT
  9. 9. Focus of Our Discussion Today Business Drivers Databases Replication Options Data Preservation Design Factors Data Type Design Options Storage/Content (Files and Objects) Performance
  10. 10. DR Metrics – RTO/RPO 2:00am 6:00am RPO 4 Hours Physical vs. Logical Synchronous vs. Asynchronous 11:00am RTO 5 Hours Time Last Backup Event Data Restored
  11. 11. Data Replication Options in AWS AZ Single AZ Files and Objects AZ Multi-AZ Region Region Cross Region Hybrid IT Preservation Databases AZ
  12. 12. Focus of Our Discussion Today Business Drivers Databases Replication Options Data Preservation Design Factors Data Type Design Options Storage/Content (Files and Objects) Performance
  13. 13. Performance Metric – Total Time 600M Records and 320 GB in Size Daily and Weekly Updates Source DB Estimated DB Size ~35 TB ETL Estimated DB Size ~48 TB Can we do this in 10 hours? Target DB
  14. 14. Performance Metric – Total Time Source DB Estimated DB Size ~70 TB ETL Estimated DB Size ~48 TB Can we STILL do this in 10 hours? Target DB
  15. 15. Data Replication Options in AWS AZ Single AZ Multi-AZ Region Region Cross Region Hybrid IT Performance Files and Objects AZ Preservation Databases AZ
  16. 16. Focus of Our Discussion Today Business Drivers Databases Replication Options Data preservation Design Factors Data Type Design Options Storage/Content (Files and Objects) Performance
  17. 17. Factors Affecting Replication Designs Throughput Source Target 4 Bandwidth Latency 6 1 2 Consistency Replicate Size of Data 5 Read/Write 3 Data Change Rate Read/Write
  18. 18. Challenges in Replication Consistency Data Size Compute Availability & Performance Storage Change Rate Network Infrastructure Capabilities Database
  19. 19. Challenges in Replication Consistency Data Size Compute Availability & Performance Storage Network Infrastructure Capabilities Change Rate Database
  20. 20. Challenges in Replication Data Size Compute Availability & Performance Storage Network Change Rate Database
  21. 21. Replication Design Options in AWS Flexibility The right tool for the right job Options
  22. 22. Common Data Replication Scenarios • • • • • • Hybrid IT Database migration HA databases Increase throughput Cross regions Data warehousing
  23. 23. Please Meet Bob • DBA for a large enterprise company • 10 years of IT experience • What is AWS?
  24. 24. Disaster Recovery I can’t find archlog_002 file!!!!!! Sue, DBA Bob, DBA
  25. 25. We Need a Better Way…. RTO is 8 hours RPO is 1 hour MySQL SQL Server Oracle 5 - 6 hours Daily Bob, DBA
  26. 26. Demo – AWS S3 Upload Generic Database DB Full Backup Amazon S3 Bucket Corporate Data Center
  27. 27. Think Parallel 2 Seconds Multipart
  28. 28. Think Parallel Foreach($file in $files) {Write-S3Object -BucketName mybucket -Key $file.filename} 2 Seconds 2 Seconds 2 Seconds 8 Seconds 2 Seconds
  29. 29. Think Parallel Nearly 3 Days
  30. 30. Multiple Machines, Multiple Threads, Multiple Parts Think Parallel 2 Seconds 2 Seconds 2 Seconds 2 Seconds 2 Seconds 2 Seconds 2 Seconds 2 Seconds 2 Seconds 2 Seconds 2 Seconds 2 Seconds 120,000 files @ 15,000 TPS = 8 seconds
  31. 31. Demo – AWS S3 Multipart Upload Generic Database DB Full Backup Corporate Data center Amazon S3 Bucket
  32. 32. Think Parallel • Use multipart upload (API/SDK or command line tools) – min part size is 5 MB • Use multiple threads – GNU parallels: parallel -j0 -N2 --progress /usr/bin/s3cmd put {1} {2} – Python multiprocessing, .Net parallel extensions, etc. • Use multiple machines – Limited by host CPU / memory / network / IO
  33. 33. Database Replication Options How are you going to replicate databases? Tom, Sys Admin Bob, DBA
  34. 34. Database Replication Options in AWS Bob’s Office Amazon RDS Replication MySQL SQL Server Oracle
  35. 35. Non-RDS to RDS Database Replication Bob’s Office 1 Amazon RDS Configure to Be a Master Initialize 4 MySQL mysqldump 2 mysqldump Dump 3 AWS S3 CP Amazon S3 Bucket Availability Zone A Corporate Data Center
  36. 36. Non-RDS to RDS Database Replication Bob’s Office 5 Amazon RDS MySQL Availability Zone A Corporate Data Center Run mysql.rds_set_external_master
  37. 37. Non-RDS to RDS Database Replication Bob’s Office Amazon RDS 6 MySQL Run mysql.rds_start_replication Availability Zone A Corporate Data Center
  38. 38. Database Replication Options Bob’s Office Amazon RDS MySQL Restore Log Shipping SQL Server Oracle Amazon S3 Bucket SQL Server
  39. 39. Database Replication Options Bob’s Office Amazon RDS MySQL SQL Server SQL Server OSB Cloud Module Oracle O S B RMAN restore Amazon S3 Bucket Oracle
  40. 40. Database Replication Options • Amazon RDS MySQL – Replication • SQL Server and Oracle on EC2 – SQL server log shipping, always on, mirroring, etc. – Oracle RMAN/OSB, Active Data Guard, etc. Bob, DBA
  41. 41. HA Database Replication Options We need a highly available solution. Katie, Director Bob, DBA
  42. 42. HA DB Replication Options Data Guard Configuration Prepare primary database 1. Enable logging 2. Add standby redo logs 3. Add data guard parameters to init.ora/spfile 4. Update tnsnames.ora and listener.ora Amazon RDS DB Instance Standby (Multi-AZ) SQL Server SQL Server Prepare standby database environment 1. Install or clone the Oracle home 2. Copy password file (orapwdSID) from primary database 3. Add data guard parameters to init.ora/spfile 4. Update tnsnames.ora and listener.ora Create standby database using RMAN 1. Duplicate target database for standby Oracle Availability Zone A Data Guard Oracle Standby Availability Zone B Configure Data Guard broker 1. Setup database parameters on primary and standby database init.ora/spfile 2. Create Data Guard configuration for primary and standby using dgmgrl 3. Setup StaticConnectIdentifier for primary and standby 4. Enable Data Guard configuration 5. Show configuration – should return success
  43. 43. HA DB Replication Options Physical Synchronous Replication Amazon RDS DB Instance Standby (Multi-AZ) SQL Server Oracle Availability Zone A Availability Zone B Amazon RDS MySQL Multi-AZ Amazon RDS Oracle Multi-AZ
  44. 44. Increase Throughput The order system is running slowly. Disk I/O? Hannah, Finance Bob, DBA Manager
  45. 45. Increase Throughput Options • Amazon EC2 instance type – Amazon RDS MySQL • PIOPS – Amazon RDS MySQL • Read replicas – Amazon RDS MySQL Bob, DBA Manager
  46. 46. Amazon RDS Performance Options Asynchronous m1.small m2.4xlarge Amazon RDS DB Instance Read Replica Provision IOPS Availability Zone A Availability Zone B
  47. 47. us-east-1 Web Web Web AS SQL Server SQL Server Oracle Oracle Standby Provision IOPS Availability Zone A Availability Zone C Availability Zone B Web
  48. 48. Increase Throughput Options • Amazon CloudFront – Large objects Bob, DBA Manager
  49. 49. us-east-1 CloudFront Web Web Web Logs AS SQL Server SQL Server Oracle Availability Zone A Web Oracle Standby Availability Zone C Availability Zone B
  50. 50. Increase Throughput Options • Amazon DynamoDB – Sessions, orders Bob, DBA Manager
  51. 51. us-east-1 CloudFront Web Web Web Web AS SQL Server Oracle Logs SQL Server Oracle Standby Amazon DynamoDB Automatic Replication Availability Zone A Availability Zone C Availability Zone B
  52. 52. Cross-region Replication Options We are opening a new development site in…. Bella, VP Bob, Architect
  53. 53. Us-east-1 • • • • Tokyo Replicate AMIs and Amazon EBS snapshots Replicate Amazon DynamoDB tables Replicate Amazon RDS snapshots Replicate Amazon S3 buckets
  54. 54. Demo Cross-region Replication Options Copy Create EC2 AMI Amazon EBS Amazon EBS Snapshot Amazon RDS Snapshot Amazon DynamoDB AWS Data Pipeline us-east-1 Restore Amazon EBS Snapshot Copy Snapshot EC2 AMI Copy Snapshot Restore Amazon EBS Restore RDS Snapshot Amazon DynamoDB ap-northeast-1
  55. 55. Replicate Amazon S3 Bucket Source Bucket with Objects Destination Bucket
  56. 56. Amazon S3 Copy List bucket (D) List bucket (S) Controller Copy Queue List(S-D) Source Bucket with Objects Destination Bucket Task Queue S3 Copy API Dequeue Task Agent(s)
  57. 57. Hive Script to Compare Amazon S3 Buckets Create external table sourcekeys (key string) location ‘s3://mybucket/sourcebucketlist’; Create external table destinationkeys (key string) location ‘s3://mybucket/destinationbucketlist’; Create table differencelist location ‘s3://mybucket/differencelist’ as Insert overwrite table differencelist Select sourcekeys.key From sourcekeys Left outer join destinationkeys On (sourcekeys.key = destinationkeys.key) Where destinationkeys.key is null;
  58. 58. Data Warehouse Replication We need to understand impact of price changes. Bob, Architect Gus, Marketing
  59. 59. Data Warehouse Replication Amazon DynamoDB BI Reports Bucket with Objects Oracle Amazon Redshift
  60. 60. AWS Data Pipeline JDBC Amazon S3 DynamoDB DB on Instance Amazon Redshift Scheduler Copy EMR Hive Hive copy Pig Data Pipeline Shell command Redshift copy SQL
  61. 61. Demo Data Warehouse Replication Amazon DynamoDB BI Reports Bucket with Objects Oracle AWS Data Pipeline Amazon Redshift
  62. 62. Attunity CloudBeam for Amazon Redshift – Incremental Load (CDC) Amazon Redshift AWS region 5 ‘Copy’ data to CDC table Change Files in customer’s S3 account Data Tables S3 CDC Table 4 Execute SQL commands ‘merge’ change into data tables Execute ‘copy’ command to load data tables from S3 2 Beam files to S3 6 3 Replication Server Source Database Oracle DB 1 Generate change files Change Files (CDC) Net Changes file Validate file content upon arrival
  63. 63. Business Intelligence
  64. 64. CloudFront us-east-1 ap-northeast-1 Logs Web Web Web Web Amazon S3 Bucket Amazon Redshift AS AMI SQL Server SQL Server Oracle AWS Data Pipeline Oracle STBY Amazon EBS snapshot Amazon RDS snapshot Amazon DynamoDB Amazon DynamoDB Automatic Replication AZ A AZ C AZ B
  65. 65. Bob, Chief Architect
  66. 66. Key Takeaways • Consider design factors and make trade offs, if possible • Think parallel • Pick the right tool for the right job
  67. 67. Please give us your feedback on this presentation ARC302 As a thank you, we will select prize winners daily for completed surveys!
  68. 68. Extra
  69. 69. Amazon DynamoDB Built-in Replication Region B Region A Provisioned Throughput Automatic 3-way Replication Amazon S3 Bucket Amazon DynamoDB Amazon DynamoDB Table Table Table Table Availability Zone A Availability Zone B Availability Zone C AWS Data Pipeline Amazon DynamoDB Amazon DynamoDB
  70. 70. Amazon Redshift Replication Patterns Region B Leader Node Copy Unload Amazon S3 Bucket Amazon Redshift 10GigE Compute Node Compute Node Availability Zone A Compute Node
  71. 71. WAN Acceleration Test Instance Instance AP-southest-1 US-East-1 Instance EU-West-1
  72. 72. Tsunami-UDP here’s how Install Tsunami Origin Server $ cd /path/to/files $ tsunamid * Destination Server $ cd /path/to/receive/files $ tsunami tsunami> connect ec2-XX-XX-XX-83.compute-1.amazonaws.com tsunami> get * *** Note firewall ports need opening between servers
  73. 73. BBCP here’s how Install BBCP $ sudo wget http://www.slac.stanford.edu/~abh/bbcp/bin/amd64_linux26/bbcp $ sudo cp bbcp /usr/bin/ Transmitting with 64 parallel streams $ bbcp -P 2 -V -w 8m -s 64 /local/files/* ec2-your-instance.ap-southeast-1.compute.amazonaws.com:/mnt/ Notes: • SSH key pairs are used to authenticate between systems • TCP port 5031 needs to be open between machines • Does NOT encrypt data in transit • Instance Type matters the better the instance type the better the performance • Many dials and options to tweak to improve performance and friendly features like retries and restarts *** Note firewall ports need opening between servers
  74. 74. Minutes to Send a 30GB File from US-East-1 60 50 40 30 20 10 0 To Dublin SCP To Singapore BBCP Tsunami *** Note using hs1.xl
  75. 75. Environment & Configuration Primary DB: prod #Primary init.ora: LOG_ARCHIVE_DEST_1='LOCATION=/data/oracle/Prod/db/archive VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=Prod' LOG_ARCHIVE_CONFIG='DG_CONFIG=(prod,stb)' DB_FILE_NAME_CONVERT='prod','prod' FAL_CLIENT='prod' FAL_SERVER='stb' log_archive_dest_2='SERVICE=stb VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=‘stb' LOG_ARCHIVE_DEST_STATE_1='ENABLE' log_archive_dest_state_2='ENABLE' log_archive_format='%t_%s_%r.arc' LOG_FILE_NAME_CONVERT='prod','prod' remote_login_passwordfile='EXCLUSIVE' SERVICE_NAMES='prod' STANDBY_FILE_MANAGEMENT='AUTO' db_unique_name=prod global_names=TRUE DG_BROKER_START=TRUE DG_BROKER_CONFIG_FILE1='/data/oracle/prod/db/tech_st/11.1 .0/dbs/prod1.dat' DG_BROKER_CONFIG_FILE2='/data/oracle/prod/db/tech_st/11.1 .0/dbs/prod2.dat' Standby DB: stb #Standby init.ora: LOG_ARCHIVE_DEST_1='LOCATION=/data/oracle/prod/db/archive VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=‘stb' LOG_ARCHIVE_CONFIG='DG_CONFIG=(prod,stb)' DB_FILE_NAME_CONVERT='prod','prod' FAL_CLIENT=‘stb' FAL_SERVER='prod' log_archive_dest_2='SERVICE=prod VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=prod' LOG_ARCHIVE_DEST_STATE_1='ENABLE' log_archive_dest_state_2='DEFER' log_archive_format='%t_%s_%r.arc' LOG_FILE_NAME_CONVERT='prod','prod' remote_login_passwordfile='EXCLUSIVE' SERVICE_NAMES=‘stb' STANDBY_FILE_MANAGEMENT='AUTO' db_unique_name=stb global_names=TRUE DG_BROKER_START=TRUE DG_BROKER_CONFIG_FILE1='/data/oracle/prod/db/tech_st/11.1.0/d bs/prod1.dat' DG_BROKER_CONFIG_FILE2='/data/oracle/prod/db/tech_st/11.1.0/d bs/prod2.dat'
  76. 76. Listener.ora Primary DB prod = (DESCRIPTION_LIST = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = primary) (PORT = 1526)) ) ) (SID_LIST = (SID_DESC = (ORACLE_HOME= /data/oracle/prod/db/tech_st/11.1.0) (SID_NAME = prod) ) (SID_DESC = (ORACLE_HOME= /data/oracle/prod/db/tech_st/11.1.0) (SID_NAME = prod) (GLOBAL_DBNAME=prod_DGMGRL) ) (SID_DESC = (ORACLE_HOME= /data/oracle/prod/db/tech_st/11.1.0) (SID_NAME = prod) (GLOBAL_DBNAME=prod_DGB) Standby DB prod = (DESCRIPTION_LIST = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = ec2)(PORT = 1526)) ) ) SID_LIST_prod = (SID_LIST = (SID_DESC = (ORACLE_HOME= /data/oracle/prod/db/tech_st/11.1.0) (SID_NAME = prod) ) (SID_DESC = (ORACLE_HOME= /data/oracle/prod/db/tech_st/11.1.0) (SID_NAME = prod) (GLOBAL_DBNAME=stb_DGMGRL) ) (SID_DESC = (ORACLE_HOME= /data/oracle/prod/db/tech_st/11.1.0) (SID_NAME = prod) (GLOBAL_DBNAME=stb_DGB)
  77. 77. tnsnames.ora Primary DB #Primary tnsnames.ora: STB = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = standby)(PORT = 1526)) (CONNECT_DATA = (SID = prod) ) ) prod = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = primary)(PORT = 1526)) (CONNECT_DATA = (SID = VIS) ) ) Standby DB #Standby tnsnames.ora: STB = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = standby)(PORT = 1526)) (CONNECT_DATA = (SID = prod) ) ) prod = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = primary)(PORT = 1526)) (CONNECT_DATA = (SID = VIS) ) )

×