©Continuent 2014 
Replicating in Real-Time from MySQL to Amazon Redshift
Featuring Continuent Tungsten 
Robert Hodges, CEO 
Jeff Mace, Director of Services
Presentation Topics 
• Introductions 
• Tungsten Replicator Overview 
• Real-Time Data Warehouse Loading 
• Adding Redshift to the Mix 
• Replication from Tungsten Clusters to Redshift
• Wrap-up
Introducing Continuent 
• The leading provider of clustering and replication for open source DBMS
• Our Product: Continuent Tungsten
• Clustering - Commercial-grade HA, performance scaling and data management for MySQL
• Replication - Flexible, high-performance data movement
Continuent Tungsten Customers 
Quick Continuent Facts 
• Largest Tungsten installation by data volume processes over 800 million transactions per day on 225 terabytes of relational data
• Largest installation by transaction volume handles up to 8 billion transactions daily
• Wide variety of topologies including MySQL, Oracle, Vertica, and Hadoop are in production now
• Amazon Redshift is the newest data warehouse target
Tungsten Replicator Overview
What is Tungsten Replicator? 
A real-time, high-performance, open source database replication engine
GPL V2 license - 100% open source
“Golden Gate® without the Price Tag”
Download from https://code.google.com/p/tungsten-replicator/
Annual support subscription available from Continuent
Tungsten Replicator Overview 
[Diagram: the Master Replicator extracts transactions from the DBMS logs into its THL (Transactions + Metadata); the Slave Replicator keeps its own THL (Transactions + Metadata) and applies the transactions.]
Replication Services and Parallel Apply
[Diagram: a pipeline is a sequence of stages, each running Extract, Filter, Apply. Stages carry transactions from the Master DBMS into the Transaction History Log; on the slave side a Parallel Queue feeds several Extract/Filter/Apply stages that apply to the Slave DBMS.]
[Diagram: replication topologies among MySQL and Oracle servers: master-slave, fan-in slave, all-masters, star, and heterogeneous]
Real-Time Data Warehouse Loading 
Why MySQL Needs A Data Warehouse 
Sales Table
id cust_id prod_id ...
1 335301 532 ...
2 2378 6235 ...
3 ... ... ...
Product Table
id sku type
532 C00135 consumer
533 S09957 specialty
... ... ...
Prod_ID Index
prod_id id
532 1
6235 2
... ...
• Row format makes table scans very slow
• Indexes slow OLTP
• Low/no data compression
• Limited index types
• Limited join types
• Single-threaded query - no parallelization
Current Data Warehouse Options 
[Diagram: OLTP data flows from the MySQL Master to three kinds of data warehouse]
• Scalable row store with star schema and materialized view support
• CSV files stored on HDFS with access via Hive/MapReduce
• Column storage with compression and built-in time series support
Traditional ETL Approach 
[Diagram: MySQL Sales Table -> Extract, Transfer, Load -> Data Warehouse Sales Table]
• Batch-oriented = not timely
• Scan for changes = performance hit
• Date columns = intrusive
Real-Time Data Replication 
[Diagram: MySQL Sales Table -> DBMS Logs -> Data Replication -> Data Warehouse Sales Table]
• No SQL changes = transparent
• Fast propagation = timely
• Automatic change capture = low impact
Loading to an Oracle Data Warehouse 
[Diagram: MySQL Binlog (binlog_format=row) -> Tungsten Master Replicator "oracle" (MySQLExtractor) -> Tungsten Slave Replicator "oracle" -> data warehouse]
Master Special Filters
* Transform enum to string
Slave Special Filters
* Ignore extra tables
* Map names to upper case
* Optimize updates to remove unchanged columns
Data stored in rows like MySQL
How about Loading to Hadoop? 
[Diagram: Binlog -> Replication -> Buffered Transactions -> CSV Files, with Provisioning shown as a separate step]
Data stored as append-only files
Typical Raw Hadoop Data Layout 
(CSV file)
556,MALTESE HOPE,4.99,127\n
557,MANCHURIAN CURTAIN,3.99,177\n
558,MANNEQUIN WORST,2.99,71\n
559,MARRIED GO,2.99,114\n
Things to decide:
• field separator
• record separator
• file partitioning
• compression
• type conversions
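As a rough illustration of the layout choices above, the following Python sketch serializes rows with an explicit field separator and record separator. The function name `rows_to_csv` and the use of `Decimal` for prices are assumptions made for this example; they are not part of Tungsten Replicator.

```python
import csv
import io
from decimal import Decimal


def rows_to_csv(rows, field_sep=",", record_sep="\n"):
    """Serialize rows to CSV text with explicit field and record separators."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=field_sep, lineterminator=record_sep)
    for row in rows:
        # Type conversion: render every value as plain text for the target
        writer.writerow([str(value) for value in row])
    return buf.getvalue()


# Sample rows taken from the slide
films = [
    (556, "MALTESE HOPE", Decimal("4.99"), 127),
    (557, "MANCHURIAN CURTAIN", Decimal("3.99"), 177),
]
print(rows_to_csv(films), end="")
```

Choosing the separators in one place matters because the loader on the warehouse side has to be told the same values.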
Loading to a Hadoop Data Warehouse 
[Diagram: Replicator "hadoop" receives transactions from master and writes CSV Files; these are loaded as Staging “Tables” and merged into Base Tables and Materialized Views.]
• Javascript load script, e.g. hadoop.js
• Write data to CSV
• Load using hadoop command
• (Generate Hive table definitions)
• Merge using external MapReduce job
Hadoop Materialized View Generation 
• Input: Transaction logs UNION ALL Snapshot
• MAP
• SHUFFLE: Sort by key(s), transaction order
• REDUCE: Emit last row per key if not a delete
• Output: Materialized view including all updates
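The map, shuffle, and reduce steps above can be sketched in plain Python without Hadoop. This is only an illustration of the logic: the record shape `(seqno, opcode, key, row)`, the opcodes `"I"` and `"D"`, and the function name `materialize` are assumptions for the example, and snapshot rows are treated as inserts with sequence number 0 so that they sort before every logged change.

```python
from itertools import groupby
from operator import itemgetter


def materialize(snapshot, changes):
    """Build the current view from a snapshot plus a change log.

    snapshot: list of (key, row)
    changes:  list of (seqno, opcode, key, row), seqno > 0,
              opcode "I" for insert and "D" for delete (assumed encoding)
    """
    # UNION ALL: snapshot rows become inserts that precede all changes
    records = [(0, "I", key, row) for key, row in snapshot] + list(changes)
    # SHUFFLE: sort by key, then by transaction order within each key
    records.sort(key=itemgetter(2, 0))
    view = {}
    # REDUCE: keep the last row per key unless the last operation is a delete
    for key, group in groupby(records, key=itemgetter(2)):
        _, opcode, _, row = list(group)[-1]
        if opcode != "D":
            view[key] = row
    return view


snapshot = [(1, "hello!"), (2, "hi")]
changes = [
    (5, "I", 3, "bye!"),
    (6, "D", 2, None),
    (7, "I", 2, "salutations!"),
    (8, "D", 1, None),
]
print(materialize(snapshot, changes))
```

Here an update to key 2 is represented as a delete followed by an insert, which is one way a change log can encode updates.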
How about Loading to Vertica? 
[Diagram: Binlog -> Replication -> Buffered Transactions -> CSV Files, with "Provisioning?" shown as an open question]
Data stored in column format
Data Layout in a Column Store 
Sales Table (one list per column)
id: 1, 2, 3
cust_id: 335301, 2378, ...
prod_id: 532, 6235, ...
quantity: 1, 3, ...
Product Table (one list per column)
id: 532, 533, ...
sku: C00135, S09957, ...
type: consumer, specialty, ...
• Fast scans on columns
• Every column is an index
• Good compression
• Fast joins with parallel query
• Updates to single rows are hideously slow
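A toy Python model may help show why the column layout favors scans and penalizes single-row updates. It is a simplified sketch and says nothing about how Vertica or Redshift store data internally; the helper names `to_columns` and `update_row` are made up for this example.

```python
# Row layout: one dict per row, as an OLTP row store would hold it
sales_rows = [
    {"id": 1, "cust_id": 335301, "prod_id": 532, "quantity": 1},
    {"id": 2, "cust_id": 2378, "prod_id": 6235, "quantity": 3},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def update_row(cols, index, changes):
    """Updating one logical row writes into every affected column list."""
    for name, value in changes.items():
        cols[name][index] = value


sales_cols = to_columns(sales_rows)

# A scan over one column reads only that column's values
total_quantity = sum(sales_cols["quantity"])
print(total_quantity)
```

The scan touches a single contiguous list, while `update_row` has to visit a separate list for each changed column, which is the cost that batch loading is meant to amortize.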
Loading to a Vertica Data Warehouse
[Diagram: MySQL Binlog (binlog_format=row) -> Tungsten Master Replicator "vertica" (MySQLExtractor) -> Tungsten Slave Replicator "vertica" -> CSV Files]
Special Filters
* pkey - Fill in pkey info
* colnames - Fill in names
* replicate - Ignore tables
Large transaction batches to leverage load parallelization
Vertica Batch Loading--The Details 
[Diagram: Replicator "vertica" receives transactions from master and writes CSV Files; these are loaded into Staging Tables and a Merge Script moves them into Base Tables.]
• COPY to stage tables
• SELECT to base tables
• (or) COPY directly to base tables
• No external merge required!
Provisioning Using a Sandbox Server 
[Diagram: OLTP Server -> Temporary Sandbox Server -> Vertica Cluster]
1. Restore logical backup
2. Replicate restored transactions
3. Replicate normally after restore loads
Adding Amazon Redshift to the Mix
Tungsten Support for Amazon Redshift 
• Tungsten Replicator 3.0 supports real-time replication to Amazon Redshift
• Loading builds on data warehouse support for Vertica and Hadoop
• Beta program in progress now
• Production ready in September 2014
Redshift = Cloud Data Warehouse 
Redshift Capabilities 
• On-demand column store in Amazon Web Services
• Looks like a PostgreSQL 8.x DBMS
• $1,000 per terabyte per year or $0.25/hour
• Automatic backup and cluster management
• Integrated into Amazon VPC
• Data loading from Amazon S3 using SQL COPY command
Connecting to Amazon Redshift 
$ psql -h redshift-webinar.cub79gczb9kq.us-east-1.redshift.amazonaws.com -U tungsten -d dev -p 5439
psql (9.2.6, server 8.0.2)
WARNING: psql version 9.2, server version 8.0.
         Some psql features might not work.
SSL connection (cipher: ECDHE-RSA-AES256-SHA, bits: 256)
Type "help" for help.

dev=# select * from test.foo;
 id |     data
----+--------------
  2 | salutations!
  3 | bye!
  1 | hello!
(3 rows)

dev=# \q
Tungsten Loading to Redshift 
[Diagram: Tungsten Slave "redshift" receives transactions from the Tungsten Master and writes CSV Files; the files go to S3 Storage, then into Redshift Staging Tables, then into Redshift Base Tables.]
• Javascript load script, e.g. hadoop.js
• Write data to CSV
1. Load to S3 using s3cmd
2. COPY to Redshift
3. Merge using SQL to base tables
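Steps 1 and 2 above can be pictured with a small Python sketch that only builds the command line and the SQL text, without running either. The bucket name, file paths, and helper names are hypothetical, the credential values are placeholders, and the replicator's own load script may well issue different options; `s3cmd put` and the Redshift `COPY ... FROM 's3://...' CREDENTIALS ... CSV` form are the only external interfaces assumed.

```python
import shlex


def s3_upload_command(csv_path, bucket, key):
    """Return the s3cmd argument list that uploads one CSV file to S3."""
    return ["s3cmd", "put", csv_path, f"s3://{bucket}/{key}"]


def copy_statement(stage_table, bucket, key, access_key, secret_key):
    """Return a Redshift COPY statement that loads the uploaded CSV file."""
    credentials = (
        f"aws_access_key_id={access_key};aws_secret_access_key={secret_key}"
    )
    return (
        f"COPY {stage_table} FROM 's3://{bucket}/{key}' "
        f"CREDENTIALS '{credentials}' CSV;"
    )


# Hypothetical file, bucket, and placeholder credentials
cmd = s3_upload_command("/tmp/staging/foo-10.csv", "example-bucket", "test/foo-10.csv")
sql = copy_statement(
    "test.stage_xxx_foo", "example-bucket", "test/foo-10.csv",
    "<access-key-id>", "<secret-access-key>",
)
print(shlex.join(cmd))
print(sql)
```

Keeping the upload and the COPY as separate steps mirrors the slide: Redshift reads the file from S3 itself, so the Tungsten host only needs to push the CSV there and then issue SQL.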
Prerequisites for Redshift Loading 
[Diagram: MySQL Binlog -> Tungsten Master Replicator "redshift" -> Tungsten Slave Replicator "redshift" -> S3 Storage -> Amazon Redshift]
• MySQL: binlog_format=row
• Tungsten Master host: Java 6+, Ruby 1.8.5+
• Tungsten Slave host: Java 6+, Ruby 1.8.5+, s3cmd 1.0+, S3 credentials
• S3 Storage: Bucket for CSV uploads
• Amazon Redshift: Allow access from Tungsten host
Application Prerequisites 
• Primary keys on all tables 
• UTF-8 character set--or at least be consistent 
• Use GMT timezone--or be very consistent about dates
• Avoid “Cowboy” schema changes to MySQL master(s)
Generate Schema Using ddlscan 
[Diagram: MySQL Tables -> ddlscan -> Amazon Redshift]
• Data types?
• Column lengths?
• Naming conventions?
• Staging tables?
Staging Table Schema 
$ bin/ddlscan -db test -template ddl-mysql-redshift-staging.vm
...

CREATE SCHEMA test;

DROP TABLE test.stage_xxx_foo;
CREATE TABLE test.stage_xxx_foo
(
  tungsten_opcode CHAR(2),
  tungsten_seqno INT,
  tungsten_row_id INT,
  tungsten_commit_timestamp TIMESTAMP,
  id INT,
  data VARCHAR(120) /* VARCHAR(30) */,
  PRIMARY KEY (tungsten_opcode, tungsten_seqno, tungsten_row_id)
);
Base Table Schema 
$ bin/ddlscan -db test -template ddl-mysql-redshift.vm
...

CREATE SCHEMA test;

DROP TABLE test.foo;
CREATE TABLE test.foo
(
  id INT,
  data VARCHAR(120) /* VARCHAR(30) */,
  PRIMARY KEY (id)
);
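To make the relationship between the staging and base tables concrete, here is a minimal Python sketch of a staging-to-base merge, run against an in-memory SQLite database rather than Redshift. It is not the replicator's actual merge script: the opcodes `'I'` (insert) and `'D'` (delete), the representation of an update as a delete followed by an insert, and the function name `merge_staging` are assumptions for illustration, and the schema prefix `test.` is dropped because SQLite has no schemas.

```python
import sqlite3


def merge_staging(conn):
    """Apply staged changes to the base table, then empty the staging table."""
    # Remove every base row whose key appears in the staged batch
    conn.execute("DELETE FROM foo WHERE id IN (SELECT id FROM stage_xxx_foo)")
    # Re-insert the last staged version of each key, unless it is a delete
    conn.execute("""
        INSERT INTO foo (id, data)
        SELECT s.id, s.data
        FROM stage_xxx_foo s
        WHERE s.tungsten_opcode = 'I'
          AND NOT EXISTS (
            SELECT 1 FROM stage_xxx_foo later
            WHERE later.id = s.id
              AND (later.tungsten_seqno > s.tungsten_seqno
                   OR (later.tungsten_seqno = s.tungsten_seqno
                       AND later.tungsten_row_id > s.tungsten_row_id)))
    """)
    conn.execute("DELETE FROM stage_xxx_foo")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INT PRIMARY KEY, data VARCHAR(120));
    CREATE TABLE stage_xxx_foo (
        tungsten_opcode CHAR(2), tungsten_seqno INT, tungsten_row_id INT,
        tungsten_commit_timestamp TIMESTAMP, id INT, data VARCHAR(120),
        PRIMARY KEY (tungsten_opcode, tungsten_seqno, tungsten_row_id));
    INSERT INTO foo VALUES (1, 'hello!'), (2, 'hi');
    INSERT INTO stage_xxx_foo VALUES
        ('D', 10, 1, '2014-07-01 00:00:00', 2, 'hi'),
        ('I', 10, 2, '2014-07-01 00:00:00', 2, 'salutations!'),
        ('I', 11, 1, '2014-07-01 00:00:01', 3, 'bye!'),
        ('D', 12, 1, '2014-07-01 00:00:02', 1, 'hello!');
""")
merge_staging(conn)
rows = conn.execute("SELECT id, data FROM foo ORDER BY id").fetchall()
print(rows)
```

The extra `tungsten_seqno` and `tungsten_row_id` columns in the staging table are what allow the merge to pick the last change per key; the primary key on the base table is what makes the delete-then-insert step well defined, which is why primary keys on all tables are a prerequisite.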
Sandbox Provisioning for Redshift 
[Diagram: OLTP Server -> Temporary Sandbox Server -> Amazon Redshift Cluster]
1. Restore logical backup
2. Replicate restored transactions
3. Replicate normally after restore loads
Demo
Setting up MySQL to Redshift Data Loading
Replicating from Tungsten Clusters to Redshift
60 Second Intro to Tungsten Clusters 
Tungsten clusters combine off-the-shelf open source MySQL servers into data services with:
• 24x7 data access
• Scaling of load on replicas
• Simple management commands
...without app changes or data migration
[Diagram: GonzoPortal.com example in Amazon US West; apache/php application servers reach the cluster through Connectors.]
Replicating from a Cluster to Redshift 
[Diagram: Tungsten Cluster1 (master, slave) -> Tungsten 3.0 Replicator "cluster1" -> Amazon Redshift]
Filters and Special Configuration
* pkey - Fill in pkey info
* colnames - Fill in names
* fixmysqlstrings - Convert to UTF-8
* Replicate from both nodes for higher availability
Fan-in from Multiple Tungsten Clusters 
[Diagram: cluster1 and cluster2 -> Tungsten Replicator -> Amazon Redshift Cluster]
Getting Started with MySQL to Redshift Loading
Where Is Everything? 
• Tungsten Replicator 3.0 builds are available on code.google.com
http://code.google.com/p/tungsten-replicator/
(Visit the nightly builds page for latest features)
• Replicator documentation is available on the Continuent website
https://docs.continuent.com/tungsten-replicator-3.0/deployment-redshift.html
Contact Continuent for support
In Conclusion… 
• Tungsten Replicator 3.0 now supports real-time loading from MySQL to Amazon Redshift
• You can start using it yourself or join our beta program
• Software will be production-ready in September 2014
• Continuent has a wealth of data loading features on tap so stay tuned!
560 S. Winchester Blvd., Suite 500 
San Jose, CA 95128 
Tel +1 (866) 998-3642 
Fax +1 (408) 668-1009 
e-mail: sales@continuent.com 
Our Blogs: 
http://scale-out-blog.blogspot.com 
http://datacharmer.org/blog 
http://www.continuent.com/news/blogs 
http://flyingclusters.blogspot.com/ 
www.continuent.com 
Follow us on Twitter @continuent 
Tungsten Replicator: 
http://code.google.com/p/tungsten-replicator
