Hadoop is an increasingly popular means of analyzing transaction data from one or more MySQL servers. Until now, mechanisms for moving data between MySQL and Hadoop have been rather limited. The new Continuent Tungsten Replicator 3.0 provides enterprise-quality replication from MySQL to Hadoop. Tungsten Replicator 3.0 is 100% open source, released under the GPL v2 license, and available for download at https://code.google.com/p/tungsten-replicator/.
Continuent Tungsten handles MySQL transaction types, including INSERT, UPDATE, and DELETE operations, and can materialize binlogs as well as mirror-image data copies in Hadoop. It is also fast enough to load data from busy MySQL source systems into Hadoop clusters while placing minimal load on those sources, and it lets you retain control over how the data is viewed and accessed within Hadoop.
We will discuss:
- How Hadoop works and why it's useful for processing transaction data from MySQL
- Setting up Continuent Tungsten replication from MySQL to Hadoop
- Understanding how Hadoop makes data analytics efficient and easy to execute
- Tuning replication to maximize performance