MySQL HA reloaded - old tricks and cool new tools to guarantee high availability to your MySQL Servers

Do you think that High Availability is all about MySQL Replication? Have you tried to alter your tables to NDB to drink at the holy grail of the shared nothing architectures? High Availability is #1 request for MySQL Servers, even more popular than scalability and performance. In this presentation we will talk about old and new tools to provide HA, automatic failover and disaster recovery for MySQL - there is a solution for every need.

Presentation Transcript

  • High Availability Reloaded. Ivan Zoratti, Chief Technology Officer. Oracle, MySQL and InnoDB are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. 1201.01.01. Tuesday, 24 January 2012
  • Agenda
      • SkySQL - 3 (+1) slides!
      • A bit of theory
      • High availability solutions
      • ...and the famous last words!
  • SkySQL Ab
      • Funded by:
          – MySQL® AB founders Monty Widenius and David Axmark
          – US Investment group OnCorps.org
      • A team of 40, operating in 14 countries, 90% from MySQL® AB
      • Backed by:
          – Product Engineering Monty Program Ab
          – Top Community contributors, commercial partners and end users
  • SkySQL Offering
      • SkySQL Enterprise Subscriptions
          – Monitoring, Administration and End User tools
          – Specialised modules for High Availability and performance improvements (1)
      • SkySQL Enterprise Cluster and SkySQL Enterprise HA
          – Up to L3 Technical and Consultative Support for the most used MySQL® distributions and branches
      • SkySQL Consulting
          – Top class team for MySQL® technology
          – Extended service offering from Health Check to continuous administration
      • SkySQL Training
          – MySQL® Training and Certification
      (1) Option
  • The SkySQL Reference Architecture (diagram labels: Components, Integration Tools, Migration Tools)
  • High Availability ...a bit of theory
  • High Availability “High availability is a system design protocol and associated implementation that ensures a certain degree of operational continuity during a given measurement period.”
  • Fault-tolerant? “Fault-tolerant design enables a system to continue operation, possibly at a reduced level (also known as graceful degradation), rather than failing completely, when some part of the system fails.”
  • Switchover / Failover
      • Switchover
          – “Switchover is the capability to manually switch over from one system to a redundant or standby computer server, system, or network upon the failure or abnormal termination of the previously active server, system, or network.”
      • Failover
          – “Failover is the capability to switch over automatically to a redundant or standby computer server, system, or network upon the failure or abnormal termination of the previously active application, server, system, or network.”
      • Aided Switchover?
      • Failback?
  • Downtime
      • Planned/Scheduled
      • Unplanned/Unscheduled
      • “Downtime or outage duration refers to a period of time that a system fails to provide or perform its primary function.”
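The downtime definition above becomes concrete once an availability target is turned into a time budget. The following is a minimal sketch of that arithmetic, assuming a 365-day measurement period; the function names are hypothetical and are not part of the presentation.

```python
# Downtime budget implied by an availability target ("number of nines").
# Assumption: the measurement period is a non-leap year of 365 days.
SECONDS_PER_YEAR = 365 * 24 * 3600


def nines(n: int) -> float:
    """Availability fraction with n nines, e.g. nines(3) is 0.999."""
    return 1.0 - 10.0 ** (-n)


def allowed_downtime_seconds(availability: float,
                             period_seconds: float = SECONDS_PER_YEAR) -> float:
    """Maximum downtime (in seconds) permitted over the measurement period."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_seconds


if __name__ == "__main__":
    for n in range(2, 6):
        minutes = allowed_downtime_seconds(nines(n)) / 60
        print(f"{n} nines: {minutes:.1f} minutes of downtime per year")
```

With these assumptions, five nines leaves roughly five minutes per year for planned and unplanned downtime combined.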
  • Single Point Of Failure - SPOF “A Single Point of Failure (SPOF) is a part of a system which, if it fails, will stop the entire system from working.”
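One way to see why a SPOF matters is to compare components in series with redundant components. This is a simplified sketch that assumes component failures are independent, which real deployments rarely satisfy exactly; the function names and the figures in the example are illustrative assumptions.

```python
from math import prod
from typing import Iterable


def serial_availability(components: Iterable[float]) -> float:
    """Availability of a chain in which every component is a SPOF."""
    return prod(components)


def redundant_availability(replicas: Iterable[float]) -> float:
    """Availability of a group that works while at least one replica is up."""
    return 1.0 - prod(1.0 - a for a in replicas)


if __name__ == "__main__":
    # Illustrative figures only: an application tier at 99.9% and a database at 99%.
    single_db = serial_availability([0.999, 0.99])
    paired_db = serial_availability([0.999, redundant_availability([0.99, 0.99])])
    print(f"single database: {single_db:.5f}")
    print(f"redundant pair:  {paired_db:.5f}")
```

Under the independence assumption, the chain is never more available than its weakest component, while a redundant pair multiplies the failure probabilities instead.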
  • Disaster Recovery and Business Continuity
      • “Disaster recovery is the process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.”
      • “Disaster recovery planning is a subset of a larger process known as business continuity planning and should include planning for resumption of applications, data, hardware, communications (such as networking) and other IT infrastructure.”
  • Disaster Recovery and Business Continuity “Disaster recovery is the process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.”
  • Designing a Highly Available System
      • Which level of High Availability do I need?
      • Do I require no loss of data?
      • Do I need failover or is switchover enough?
      • Can I provide a reasonable service when a component is down?
  • Something to clarify...
      • Availability vs Scalability
      • HA Costs
      • HA for your systems, not only for your database
      • Review your SLAs
  • High Availability Solutions
  • High Availability with MySQL (axis label: Higher Availability)
      • Combined solutions
      • Shared nothing distributed cluster with MySQL Cluster
      • Geographical Replication for disaster recovery
      • Virtualised Environments
      • Active/Passive Clusters through shared storage
      • MySQL synchronous replication
      • Generic synchronous replication
      • MySQL Replication with agents and failover
      • MySQL Replication
  • MySQL Replication
      • Something you may have missed...
          – Asynchronous or Semi-synchronous
          – Pros and Cons of RBR vs SBR
          – Mono-thread pull from the slaves
          – sync_binlog = 0/1
          – Antelope vs Barracuda
          – Group Commit
          – Multi-engines
          – Rolling upgrades
      (diagram labels: Read-Write, Read-Only, binlog, relaylog)
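Two of the points above, RBR vs SBR and sync_binlog = 0/1, map onto server options. The sketch below renders a `[mysqld]` fragment for a replication master under stated assumptions: the helper name, the `mysql-bin` basename and the chosen values are illustrative and not taken from the slides, while `server_id`, `log_bin`, `binlog_format` and `sync_binlog` are standard MySQL option names.

```python
import configparser
import io


def replication_cnf(server_id: int,
                    row_based: bool = True,
                    durable_binlog: bool = True) -> str:
    """Render a [mysqld] fragment for a replication master (illustrative values)."""
    cfg = configparser.ConfigParser()
    cfg["mysqld"] = {
        "server_id": str(server_id),
        "log_bin": "mysql-bin",  # assumed binlog basename
        # ROW selects row-based replication (RBR), STATEMENT selects SBR.
        "binlog_format": "ROW" if row_based else "STATEMENT",
        # 1 syncs the binlog to disk at every commit; 0 leaves flushing to the OS.
        "sync_binlog": "1" if durable_binlog else "0",
    }
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(replication_cnf(server_id=1))
```

Setting sync_binlog to 1 trades write throughput for a smaller window of lost binlog events after a crash, which is the trade-off the slide hints at.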
  • MySQL Replication with MMM (Multi-Master replication Manager)
      • Master-Master features:
          – Monitoring
          – Automatic failover
          – Data backup
          – Resynch
      • http://code.openark.org/blog/mysql/problems-with-mmm-for-mysql
      • http://www.xaprb.com/blog/2011/05/04/whats-wrong-with-mmm/
      • http://mysql-mmm.org
      (diagram labels: Read-Write, Read-Only, mmm_agentd, mmm_mond, binlog, relaylog)
  • MySQL Replication with MHA
      • Something to consider...
          – read-only=1 and log-bin on slaves
          – Master IP failover
          – Filtering rules
          – Multi-tier replication
      • http://code.google.com/p/mysql-master-ha/
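A central step in any agent-driven master failover is deciding which slave has received the most binary log events from the failed master. The sketch below illustrates that comparison only; it is not MHA's code, and the `SlaveStatus` type and helper names are hypothetical. The two fields mirror the `Master_Log_File` and `Read_Master_Log_Pos` columns of `SHOW SLAVE STATUS`, and the binlog file names are assumed to end in a numeric sequence.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class SlaveStatus:
    host: str
    master_log_file: str       # e.g. "mysql-bin.000042"
    read_master_log_pos: int   # byte offset read from that file


def binlog_coordinates(status: SlaveStatus) -> Tuple[int, int]:
    """Turn (file, position) into a tuple that sorts in replication order."""
    sequence = int(status.master_log_file.rsplit(".", 1)[1])
    return (sequence, status.read_master_log_pos)


def most_advanced_slave(slaves: Sequence[SlaveStatus]) -> SlaveStatus:
    """Return the slave that has read furthest into the master's binlog."""
    if not slaves:
        raise ValueError("no slaves available for promotion")
    return max(slaves, key=binlog_coordinates)


if __name__ == "__main__":
    candidates = [
        SlaveStatus("db2", "mysql-bin.000041", 98765),
        SlaveStatus("db3", "mysql-bin.000042", 120),
        SlaveStatus("db4", "mysql-bin.000042", 4096),
    ]
    print(most_advanced_slave(candidates).host)
```

Comparing the file sequence before the position matters: a slave at a small offset in a newer file is ahead of one at a large offset in an older file.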
  • Tungsten Replicator
      • Open Source, heterogeneous replication
      • Truly multi-master and fan-in with Global ID
      • Per-schema multi-thread
      • http://code.google.com/p/tungsten-replicator/
      (diagram labels: Read-Write, Replicator agent)
  • Tungsten Enterprise
      • Tungsten Replicator +
          – Client Connector with R/W split and load balancing
          – Replication Monitoring
          – Integrated backup
      • http://www.continuent.com/solutions/overview
      (diagram labels: Connector, Read-Write, Replicator + Monitor)
  • Synchronous Replication with DRBD
      • Typical Active/Standby
      • Cross active/active servers implementations
      • Possible issues:
          – Dependencies
          – Infrastructure SPOFs
          – Write performance impact
          – InnoDB only
      • DRBD in a virtualized environment
      (diagram labels: Active/Hot Server, Passive/Std-by Server, Block Device)
  • Synchronous Replication through DRBD: Configuration
      (diagram labels: Gateway 192.168.1.1; 192.168.1.X; VIP 192.168.1.2; Active/Hot Server; Passive/Std-by Server; HB1: 10.0.3.X; HB2: 10.0.4.X; 15, 16; DRBD: 10.0.5.X; /dev/sdb, /mysqldata; Block Device)
  • Synchronous Replication with Galera
      • Synchronous replication for InnoDB
      • Multi-master, no SPOF
      • Application failover must be managed
      • Conflict resolution
      • http://www.codership.com
      (diagram labels: Read-Write, wsrep)
  • Percona XtraDB Cluster
      • Alpha version of Galera + XtraDB
      • Multi-master, no SPOF
      • Application failover must be managed
      • Conflict resolution with aborted COMMITs
      • Auto Increment
      • No XA transactions
      • Issues with operations on tables with no primary key
      • http://www.percona.com/doc/percona-xtradb-cluster
      (diagram labels: Read-Write, wsrep)
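Because conflicts in a multi-master cluster surface as aborted COMMITs, the application typically has to be prepared to run a transaction again. The following is a minimal sketch of such a retry loop; `CommitAborted` is a hypothetical stand-in for whatever error the database driver raises when a commit is rolled back, and the backoff policy is an arbitrary assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class CommitAborted(Exception):
    """Hypothetical stand-in for the driver error raised on an aborted COMMIT."""


def run_with_retry(transaction: Callable[[], T],
                   attempts: int = 3,
                   backoff_seconds: float = 0.0) -> T:
    """Run the whole transaction again when its commit is aborted."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except CommitAborted:
            if attempt == attempts:
                raise  # give up and let the caller handle the conflict
            time.sleep(backoff_seconds * attempt)
    raise AssertionError("unreachable")
```

The callable must contain the complete transaction, from the first statement to the commit, so that a retry starts from a clean state rather than replaying only the final COMMIT.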
  • SchoonerSQL
      • Synchronous master-slave replication for InnoDB
      • Retrieve/Inject in the transaction log and buffer pool
      • Monitoring/Administration tool
      • Closed source
  • Active/Passive Clusters using Shared Storage
      • Points to consider:
          – Redundancy and replication must be guaranteed by the shared storage (and this is not trivial)
          – InnoDB only
          – File Systems
      (diagram labels: Active/Hot Server, Passive/Std-by Server, Shared Storage)
  • Active/Passive Clusters using Shared Storage: Large Deployments
      (diagram labels: VIP01 to VIP08, in01 to in08, 01 to 08, Shared Storage)
  • Active/Passive Clusters using Shared Storage: Failover in Large Deployments
      (diagram labels: VIP01 to VIP08, in01 to in08, 01 to 08, Shared Storage)
  • Virtualised Environments
      • Data storage, high availability and load balancing are provided and managed by the virtualisation software
      • In case of fault, the virtualisation software restarts the virtual machine on any other physical server
      • MySQL Replication for disaster recovery
      • InnoDB only
      (diagram labels: 01 to 08, Shared Storage)
  • Geographical Replication for Disaster Recovery
      • Master-Master Asynchronous Replication is used to update the backup data centre
      • In case of fault, the network traffic is redirected to the backup data centre. Failback must be executed manually
      • Cross-platform and cross-engine
      (diagram labels: Active Data Centre, Backup Data Centre)
  • Storage Snapshots for Disaster Recovery
      • Snapshots are managed by the NAS and SAN firmware. There is usually a short read-only freeze
      • Snapshots can be used as a run-time backup
      • InnoDB only; NetApp NAS devices and firmware are certified using Snapshot and SnapMirror
      (diagram labels: Active Data Centre, Backup Data Centre)
  • MySQL Cluster
      • Shared-nothing, fully transactional and distributed architecture used for high volume and small transactions.
      • MySQL Cluster is based on the NDB (Network DataBase) Storage Engine
      • Data is distributed for scalability and performance, and it is replicated for redundancy on multiple data nodes.
      • Nodes in a cluster:
          – SQL Nodes: provide the SQL interface to NDB
          – API Nodes: provide the native NDB API
          – Data Nodes: store and retrieve data, manage transactions
          – Management Nodes: manage the Cluster
      • Load balanced
      • Memory or disk-based
      • Geographically replicated for disaster recovery with conflict resolution
      • Full online operation for maintenance and administration
      (diagram labels: Application Nodes, NDB API, ClusterJ/JPA, SQL Nodes, Management Nodes, Data Nodes)
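The statement that data is both distributed and replicated across data nodes can be pictured with node groups: data nodes are arranged in groups, each group holds a share of the data, and every node in a group holds a replica of that share. The sketch below is a deliberately simplified model; the function names are hypothetical, and the CRC32-based placement is an assumption chosen for illustration, not the hashing NDB actually uses.

```python
import zlib
from typing import List, Sequence, Set


def node_groups(data_nodes: Sequence[str], replicas: int) -> List[List[str]]:
    """Split the data nodes into groups of `replicas` nodes each."""
    if replicas < 1 or len(data_nodes) % replicas != 0:
        raise ValueError("node count must be a multiple of the replica count")
    return [list(data_nodes[i:i + replicas])
            for i in range(0, len(data_nodes), replicas)]


def nodes_for_key(key: str, groups: Sequence[Sequence[str]]) -> List[str]:
    """Pick the node group that stores a row (simplified hash placement)."""
    index = zlib.crc32(key.encode("utf-8")) % len(groups)
    return list(groups[index])


def cluster_survives(groups: Sequence[Sequence[str]], failed: Set[str]) -> bool:
    """All data stays reachable while every group keeps at least one live node."""
    return all(any(node not in failed for node in group) for group in groups)


if __name__ == "__main__":
    groups = node_groups(["n1", "n2", "n3", "n4"], replicas=2)
    print(groups)
    print(cluster_survives(groups, {"n1", "n3"}))
    print(cluster_survives(groups, {"n1", "n2"}))
```

In this model, losing one node from each group is survivable, while losing every node of a single group makes part of the data unavailable.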
  • Client-based Failover and Proxies
      • Connector/J
          – jdbc:mysql://[host][,failoverhost...][:port]
      • mysqlnd_ms for PHP
          – connection pooling for mysqli, mysql and PDO_MYSQL
      • ScaleBase
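The idea behind a failover host list such as the Connector/J URL above is that the client walks the list and uses the first host that accepts a connection. The sketch below shows that pattern in a driver-agnostic way; `connect_with_failover` and the `connect` callable are hypothetical names, and a real driver would raise its own error types rather than plain `OSError`.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def connect_with_failover(hosts: Iterable[str],
                          connect: Callable[[str], T]) -> Tuple[str, T]:
    """Try each host in order and return the first successful connection."""
    errors: List[str] = []
    for host in hosts:
        try:
            return host, connect(host)
        except OSError as exc:  # assumed error type for a failed connection
            errors.append(f"{host}: {exc}")
    raise ConnectionError("all hosts failed: " + "; ".join(errors))


if __name__ == "__main__":
    def fake_connect(host: str) -> str:
        # Stand-in for a real driver call; "db1" simulates a failed primary.
        if host == "db1":
            raise ConnectionRefusedError("connection refused")
        return f"connection to {host}"

    print(connect_with_failover(["db1", "db2"], fake_connect))
```

Client-side failover of this kind only redirects new connections; in-flight transactions on the failed host still have to be retried by the application.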
  • The absolutely necessary comparison chart...

      Solution                 | 100% Data Safe | All Storage Engines | Automatic Failover | Performance Overhead (* = best) | Easy admin/config (* = best) | Scalability (*** = best)
      MySQL Replication        | ✘              | ✔                   | ✘!                 | *                               | *                            | **
      MHA                      | ✘              | ✔                   | ✔                  | *                               | *                            | **
      Tungsten                 | ✔              | ✔                   | ✔                  | **                              | *                            | ***
      DRBD                     | ✔              | ✘                   | ✔                  | ***                             | ***                          | *
      Galera / XtraDB Cluster  | ✘              | ✘                   | ✘!                 | **                              | *                            | **
      Schooner SQL             | ✔              | ✘                   | ✔                  | **                              | *                            | **
      Shared Cluster           | ✔              | ✘                   | ✘!                 | -                               | *                            | *
      VM                       | ✔              | ✘                   | ✔                  | **                              | *                            | *
      Geo Replication          | ✘              | ✔                   | ✘                  | *                               | **                           | *
      Storage Snapshots        | ✘              | ✘                   | ✘                  | **                              | **                           | *
      MySQL Cluster            | ✔              | ✘                   | ✔                  | *                               | ***                          | **
  • The famous last words...
      • I need 5 nines
      • Everything must be automatic
      • I want to migrate to MySQL Cluster
      • I can’t afford to lose any data
      • I need a sub-second failover
  • The famous last words...
      • I need 5 nines
          – Implement what you really need
      • Everything must be automatic
          – Aided switchover is sometimes more effective, inexpensive and easy to implement/administer
      • I want to migrate to MySQL Cluster
          – Is your application designed for Cluster?
      • I can’t afford to lose any data
          – People lose data every day. Is the drop in performance worth it?
      • I need a sub-second failover
          – Check the timeout periods and the caching warm-ups
  • SkySQL Enterprise HA
      • Full HA solution, supported on
          – Platforms: Linux, Windows, Solaris x86
          – DB Servers: Oracle MySQL, MariaDB, Percona Server
          – 2 to 3 days implementation guaranteed with acceptance tests
      • Technologies:
          – MySQL Replication
          – DRBD Active/Passive or Cross Active
          – MHA Tool with/without Multi-tier Replication
          – Linux or Windows Shared Storage
          – MySQL Cluster
          – Tungsten Enterprise
  • Thank You!
      • ivan@skysql.com
      • izoratti.blogspot.com
      • mysql4all.wordpress.com