MySQL High Availability and Scalability
How to achieve high availability and scalability with MySQL.
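  • Two layers: the SQL layer and the storage engine layer. Pluggable storage engine architecture (from 5.1). MyISAM: high performance, B-tree indexes, full-text indexes. InnoDB: transactional, supports all four SQL-92 isolation levels (read uncommitted, read committed, repeatable read, serializable), implements row-level locking and foreign keys (referential integrity).
  • Scalability: how easily the database's processing capacity can be increased through the corresponding upgrades.
  • Scale up: vertical scaling, increasing the processing power of a node by upgrading hardware. Pros: simple maintenance; data stays in one place, so the application architecture is simple. Cons: high-end hardware is expensive; a single host has limited capacity and easily becomes a bottleneck; a single point of failure has a large impact.
  • Scale out: horizontal scaling, adding processing nodes. Pros: low cost; bottlenecks are less likely; a single node failure has little impact, so HA is good. Cons: more nodes means more complexity; the cluster is harder and more expensive to maintain.
  • Scale up in the short term, scale out in the long term. Horizontal scaling aims to raise overall processing capacity, in two ways: keep replicating the data to create many identical data sources, or split a centralized data source into many data sources.
  • 1. Minimize coupling between transactions: avoid distributed transactions and split large transactions into small ones. 2. Data consistency principle. BASE model: basically available, soft state, basic consistency, eventual consistency. While still meeting users' needs, allow data to be out of date for a short time and rely on follow-up mechanisms to guarantee consistency. 3. High availability and data safety principle, guaranteed through redundancy.
  • Replicating from multiple masters is not supported (a patch reportedly exists), mainly because data consistency is hard to handle.
  • Replication is asynchronous, with very little delay. Master: reads the binary log and interacts with the slave's I/O thread. Slave: the I/O thread requests and receives binary log events and writes them to the local relay log; the SQL thread reads events from the relay log, parses them, and executes them on the slave (without writing them to the binary log, or else relying on the server ID to prevent loops). Process: 1. The slave I/O thread connects to the master and requests the log from a given position in a given file. 2. The master accepts the request, reads the log, and returns it to the slave together with the binary log file name and position. 3. The slave I/O thread appends the received events to the end of the relay log and records the binary log file name and position it has read up to in the master-info file. 4. The slave SQL thread parses and executes the queries.
  • Replication formats: 1. Row level (5.1.5): records how each row was modified and does not need the context of the query; the drawback is a large volume of log data. 2. Statement level: records the query statements, which reduces log volume and saves I/O; the drawback is that context-dependent statements, functions, UUID() and the like can leave master and slave inconsistent. 3. Mixed level (5.1.8): uses statement-based logging by default and switches to row-based logging when a statement could produce inconsistent data during replication (stored procedures, functions, etc.).
  • Avoiding replication loops: rely on the server ID, or do not enable --log-slave-updates on the slave. It is generally recommended to make only one master writable, to avoid data inconsistency.
  • Setting up replication: 1. Prepare the master. 2. Take a snapshot backup of the data. 3. Restore the master's snapshot on the slave. 4. On the slave, configure the master connection settings with CHANGE MASTER TO and start replication.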
  • Ebay
  • Ebay
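  • Spread the data stored in one database across multiple databases to distribute the load of a single machine and improve overall availability. Vertical sharding: split by functional module, i.e. data of different tables. Pros: simple, easy to integrate with applications, easy to maintain. Cons: joins across split tables must be handled in application code; large single tables remain a problem; transaction handling is harder; scalability is limited; the system becomes more complex. Horizontal sharding: split by some condition, i.e. data of the same table. Pros: table joins can mostly be done on the database side; no large-data-volume bottleneck; small changes to the overall application architecture; simple transaction handling; good scalability. Cons: complex sharding rules, hard to maintain, tight coupling with the application. Mixed sharding:
  • The biggest problem the application system faces is how to integrate all these data sources. 1. Each application system maintains the data sources it needs. 2. Manage them uniformly through a middle proxy layer: a self-developed middle layer; MySQL Proxy (connection routing, load balancing, HA, query filtering and rewriting); Amoeba, written in Java, a framework focused on integrating the data sources of a distributed database; HiveDB.
  • Each application system maintains the data sources it needs. Manage them uniformly through a middle proxy layer: a self-developed middle layer; MySQL Proxy (connection routing, load balancing, HA, query filtering and rewriting); Amoeba, written in Java, a framework focused on integrating the data sources of a distributed database; HiveDB.
  • Problems with sharding: 1. Distributed transactions (supported since 5.0, with InnoDB). 2. Cross-node joins (the FEDERATED engine, which keeps the remote table definition locally). 3. Cross-node merging, sorting, and paging.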
  • Data is not only partitioned horizontally but also stored redundantly across nodes. MySQL Cluster is designed not to have any single point of failure. In a shared-nothing system, each component is expected to have its own memory and disk, and the use of shared storage mechanisms such as network shares, network file systems, and SANs is not recommended or supported. MySQL Cluster is a synchronous solution that enables multiple MySQL instances to share database information. Unlike replication, data in a cluster can be read from or written to any node within the cluster, and information will be distributed to the other nodes. Advantages: Offers multiple read and write nodes for data storage. Provides automatic failover between nodes. Only transaction information for the active node being used is lost in the event of a failure. Data on nodes is instantaneously distributed to the other data nodes. Disadvantages: Available on a limited range of platforms. Nodes within a cluster should be connected via a LAN; geographically separate nodes are not supported. However, you can replicate from one cluster to another using MySQL Replication, although the replication in this case is still asynchronous. Recommended uses: Applications that need very high availability, such as telecoms and banking. Applications that require an equal or higher number of writes compared to reads. Oracle RAC is shared-everything; MySQL Cluster is shared-nothing. Oracle RAC relies on a "shared storage" architecture that requires an additional investment in SAN (Storage Area Network) infrastructure.
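  • The requirement for a SAN results in: an additional expense for customers, since they have to turn to a third party for a networked storage solution (a shared disk can cost $15k-20k in addition to the database license, even for a small implementation); recovery from a failed node that requires access to the shared disk, which increases failover time to minutes versus the sub-second failover time of MySQL Cluster; and a single point of failure in the cluster. Heartbeat, load balancing, automatic routing, VIP. Shared-nothing architecture: the database is partitioned across the nodes of the cluster. Each node owns a unique subset of the data (a portion of all the data), and every access to that data must go to that node. The performance of parallel data operations depends on the data being partitioned sensibly. Each partition is managed by its own processor. The system can use a dual-disk subsystem that keeps a physical backup so that a node failure does not affect availability, although overall performance still drops significantly in that case. "Shared-nothing" means every node works on its own. Oracle RAC, by contrast, shares all database resources (data files, control files, log files, etc.) by placing them on one physical storage medium, storing them reliably, and connecting that storage to every instance node with high-end interconnect technology, which achieves shared-everything. Oracle RAC lets all nodes attach to the disks. Database files are logically shared among all nodes, and every instance can access all the data. Shared disk access can be provided through hardware connections or through an operating system layer that presents a single view of the devices on all nodes. If several nodes access the same data block at the same time, a transactional shared-disk database system uses disk I/O to synchronize data access across nodes, for example with a lock on the block being written that prevents other nodes from accessing the same block. DB2 is not purely shared-nothing: for availability it uses a shared-disk database, and its "shared-nothing" refers to ownership of the data at run time rather than to the physical layout.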
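  • NDB Cluster (from 5.0): a shared-nothing distributed storage engine that also supports transactions, intended for distributed environments. It implements an in-memory database cluster without shared storage devices (from 7.1, loading only the indexes into memory is supported). SQL nodes do not share data with each other. Management node: handles management tasks and must be started first. SQL node: responsible for everything the database does above the storage layer; started with ndbcluster. NDB data node: an in-memory storage engine; data and indexes are loaded into memory and also persisted to storage devices. Newer versions do not require non-indexed columns to be fully loaded into memory. Each NDB node stores a portion of the complete data. The NoOfReplicas parameter specifies how many copies of each piece of data are kept on different nodes.
  • MySQL Cluster normally partitions NDBCLUSTER tables automatically. Horizontal data partitioning: data within NDB tables is automatically partitioned across all of the data nodes in the system. This is done with a hashing algorithm based on the PRIMARY KEY of the table, and is transparent to the end application. In the 5.1 release, users can define their own partitioning schemes.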
  • replicate asynchronously
  • Advantages: Provides high availability and data integrity across two servers in the event of hardware or system failure. Can ensure data integrity by enforcing write consistency on the primary and secondary nodes. Disadvantages: Only provides a method for duplicating data across the nodes. Secondary nodes cannot use the DRBD device while data is being replicated, so MySQL on the secondary node cannot be simultaneously active. Cannot be used to scale performance, since you cannot redirect reads to the secondary node. Recommended uses: High availability situations where concurrent access to the data is not required, but instant access to the active data in the event of a system or hardware failure is required.
  • Replication: Pros: simple to deploy, easy to implement and maintain. Cons: if the master host cannot be recovered, data that has not yet been sent to the slaves may be lost. Cluster: Pros: high availability, good performance, redundant copies of the data synchronized in real time. Cons: complex to maintain; the product is new and still evolving. DRBD: Pros: powerful; data is mirrored across physical hosts at the underlying block device level. Cons: without a distributed file system the mirrored data cannot be visible on both hosts at the same time; high maintenance cost.

Transcript

  • 1. MySQL High Availability and Scalability @gongyin
  • 2. Agenda
    • Brief Introduction
    • High Availability and Scalability
    • MySQL Replication
    • MySQL Cluster
    • DRBD
    • Resources
  • 3. MySQL Brief introduction
    • High performance
    • Reliable
    • Easy To Use
  • 4. MySQL Server Architecture SQL Layer Storage Engine Layer
  • 5. High Availability
    • Online 24 × 7 × 365
    • No single point of failure
    • Automatic recovery
  • 6. Scalability
    • Scalability refers to the ability to spread the load of your application queries across multiple MySQL servers.
  • 7. Scalability - Scale up
    • Scale vertically - add resources to a single node in a system, typically involving the addition of CPUs or memory to a single computer.
    • Pros:
    • Simple maintenance
    • Centralized data, simple application architecture
    • Cons:
    • Expensive hardware
    • Limited processing capacity, prone to bottlenecks
    • Single point of failure
  • 8. Scalability - Scale out
    • Scale horizontally - add more nodes to a system, such as adding a new computer to a distributed software application.
    • Pros:
    • Bottlenecks are less likely to occur
    • Low-cost hardware
    • A single node failure has little impact; good HA
    • Cons:
    • More nodes, more complex
    • Difficult to maintain
  • 9. Scalability - Scale out
    • How to scale out a database?
  • 10. Scalability - Principle
    • Principles:
    • Minimize coupling between transactions
    • Data consistency, BASE model
    • HA and data security through data redundancy
  • 11. MySQL Replication
    • Features:
    • Works across different platforms
    • Asynchronous
    • One master to any number of slaves (separate reads and writes)
    • Data can only be written to the master
    • No guarantee that data on master and slaves will be consistent at a given point in time
    • Full replication of data
  • 12. MySQL Replication - Process
    • Master
    • I/O thread
    • Binary log (mysqld --log-bin)
    • Slave
    • I/O thread
    • SQL thread
    • Relay Log
    • Master-info
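The flow on this slide can be sketched as a toy model. This is a minimal, in-memory illustration and not how mysqld is implemented: the class names, the list-based "logs", and the integer position standing in for the master-info file are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy master: every write is appended to an in-memory 'binary log'."""
    binlog: list = field(default_factory=list)

    def execute(self, statement: str) -> None:
        self.binlog.append(statement)

    def read_binlog(self, position: int) -> list:
        # Serve the events a slave has not yet seen, starting at its position.
        return self.binlog[position:]


@dataclass
class Slave:
    """Toy slave with a relay log and a saved master position."""
    relay_log: list = field(default_factory=list)
    master_pos: int = 0   # stands in for the master-info file
    relay_pos: int = 0    # how far the SQL thread has read the relay log
    applied: list = field(default_factory=list)

    def io_thread_step(self, master: Master) -> None:
        # I/O thread: fetch new events and append them to the relay log.
        events = master.read_binlog(self.master_pos)
        self.relay_log.extend(events)
        self.master_pos += len(events)

    def sql_thread_step(self) -> None:
        # SQL thread: replay relay-log events that have not been applied yet.
        while self.relay_pos < len(self.relay_log):
            self.applied.append(self.relay_log[self.relay_pos])
            self.relay_pos += 1


master = Master()
slave = Slave()
master.execute("INSERT INTO t VALUES (1)")
master.execute("INSERT INTO t VALUES (2)")
slave.io_thread_step(master)                     # events copied, not yet applied
lag = len(master.binlog) - len(slave.applied)    # the asynchronous window
slave.sql_thread_step()                          # slave catches up
```

Between the two steps the slave holds the events but has not applied them, which is the window in which master and slave can disagree.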
  • 13. MySQL Replication - Level
    • Statement Level
    • Row Level (support from 5.1.5)
    • Mixed Level (supported from 5.1.8, default)
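Why the level matters can be shown with a non-deterministic function. The sketch below is a simulation in plain Python, assuming a hypothetical `INSERT ... VALUES (UUID())` statement; the lists stand in for tables and `uuid.uuid4()` stands in for the server-side function.

```python
import uuid


def run_insert_with_uuid(table: list) -> str:
    """Simulate executing INSERT ... VALUES (UUID()) against a table."""
    value = str(uuid.uuid4())   # evaluated afresh on every execution
    table.append(value)
    return value


master_table: list = []
statement_slave: list = []
row_slave: list = []

# The master executes the statement once.
written = run_insert_with_uuid(master_table)

# Statement level: the slave re-executes the statement text, so the
# function is evaluated again and yields a different value.
run_insert_with_uuid(statement_slave)

# Row level: the slave receives the row image that the master wrote.
row_slave.append(written)
```

In this toy setup the row-level copy matches the master while the statement-level copy does not, which is the case the mixed level is meant to detect.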
  • 14. MySQL Replication - Architecture
    • Master-slaves
    [Diagram: clients send writes (W) to the master and reads (R) to the slave; the master replicates (repl) to the slave.]
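Separating reads from writes in a master-slaves setup is usually done in the application or in a proxy. A minimal sketch, assuming a hypothetical `ReadWriteRouter` class and plain strings in place of real connections:

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    READ_VERBS = ("SELECT", "SHOW")

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)   # simple round-robin

    def route(self, sql: str):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.READ_VERBS:
            return next(self._slaves)
        # Everything else (INSERT, UPDATE, DELETE, DDL, ...) goes to the master.
        return self.master


router = ReadWriteRouter("master", ["slave1", "slave2"])
targets = [
    router.route("SELECT * FROM t"),
    router.route("INSERT INTO t VALUES (1)"),
    router.route("select 1"),
]
```

Because replication is asynchronous, a read routed to a slave may not yet see a write that just went to the master; reads that must be current would need to go to the master as well.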
  • 15. MySQL Replication - Architecture
    • Master-Master
    [Diagram: clients read and write (R/W); the two masters replicate (repl) to each other.]
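In a master-master pair each server replays the other's log, so an event must not be applied again when it comes back to the server that produced it. The sketch below models the server-ID check with hypothetical `Server` objects and list-based logs; it is an illustration, not the server's actual code.

```python
class Server:
    """Toy server that logs every event with the ID of its originating server."""

    def __init__(self, server_id: int):
        self.server_id = server_id
        self.data = []
        self.binlog = []      # list of (origin_server_id, value)
        self.peer_pos = 0     # how far we have read the peer's binlog

    def write(self, value) -> None:
        self.data.append(value)
        self.binlog.append((self.server_id, value))

    def replicate_from(self, peer: "Server") -> None:
        for origin, value in peer.binlog[self.peer_pos:]:
            if origin == self.server_id:
                continue      # our own event came back around: skip it
            self.data.append(value)
            # Re-log with the original server ID so the origin can recognise it.
            self.binlog.append((origin, value))
        self.peer_pos = len(peer.binlog)


a, b = Server(1), Server(2)
a.write("x")
b.write("y")
for _ in range(3):            # several rounds: nothing is applied twice
    b.replicate_from(a)
    a.replicate_from(b)
```

Both servers end up with the same two rows, but not necessarily in the same order, which hints at why only one master is usually made writable.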
  • 16. MySQL Replication - Architecture
    • Master-Slaves-Slaves: cascading replication
    [Diagram: clients write (W) to the master and read (R) from the slaves; the master replicates (repl) to a slave, which in turn replicates to further slaves.]
  • 17. MySQL Replication - Architecture
    • Master-Master-Slaves
    [Diagram: two masters replicate (repl) to each other and to a slave; clients write (W) to a master and read (R) from the slave.]
  • 18. MySQL Replication - Architecture
  • 19. MySQL Replication - Architecture: eBay
  • 20. MySQL Replication - Architecture: Facebook
  • 21. Sharding
    • Vertical Sharding
    • Split by function: different tables are located on different databases
    • Horizontal Sharding
    • Rows of the same table are located on different databases
    • Mixed Sharding
    • Pros and Cons
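The two sharding styles can be sketched as routing functions. The table names, database names, and the modulo rule below are hypothetical choices for illustration; real deployments pick their own sharding rule.

```python
# Vertical sharding: each table (functional module) lives in its own database.
VERTICAL_MAP = {
    "users": "db_user",
    "orders": "db_order",
}

# Horizontal sharding: rows of one table are spread over several databases.
ORDER_SHARDS = ["db_order_0", "db_order_1", "db_order_2", "db_order_3"]


def vertical_shard(table: str) -> str:
    """Return the database that holds the whole table."""
    return VERTICAL_MAP[table]


def horizontal_shard(user_id: int) -> str:
    """Return the database that holds this user's rows (modulo rule)."""
    return ORDER_SHARDS[user_id % len(ORDER_SHARDS)]


def mixed_shard(table: str, user_id: int) -> str:
    """Mixed sharding: split by table first, then by key for the big table."""
    if table == "orders":
        return horizontal_shard(user_id)
    return vertical_shard(table)
```

A modulo rule is easy to reason about but forces data to move when the number of shards changes, which is one source of the maintenance cost of horizontal sharding.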
  • 22. Sharding: how does the application system integrate all of the data sources?
  • 23. Sharding
    • Each application system maintains the data sources it requires
    • Unified management by a middle layer
    • Self-developed
    • MySQL Proxy (connection routing, load balancing, HA, query filtering, query modification)
    • Amoeba, based on Java
    • HiveDB
  • 24. Sharding
    • Problems:
    • Distributed transactions
    • Joins across multiple nodes (supported by the FEDERATED storage engine)
    • Merging, sorting, and paging across multiple nodes
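The paging problem can be illustrated with a k-way merge. This sketch assumes each shard has already returned its rows sorted by the paging key (the shard contents are made-up sample data) and uses the standard library's `heapq.merge` to combine them.

```python
import heapq
from itertools import islice


def page_across_shards(shard_results, page: int, page_size: int) -> list:
    """Merge per-shard sorted results and cut out one global page.

    Each shard must supply at least its first page * page_size rows,
    because any of them could belong to the requested page.
    """
    merged = heapq.merge(*shard_results)      # lazy k-way merge
    start = (page - 1) * page_size
    return list(islice(merged, start, start + page_size))


# Hypothetical sorted key values returned by three shards.
shard_a = [1, 4, 9, 12]
shard_b = [2, 3, 10]
shard_c = [5, 6, 7, 8, 11]

second_page = page_across_shards([shard_a, shard_b, shard_c], page=2, page_size=5)
```

The deeper the page, the more rows every shard has to return, which is why deep paging across nodes is expensive.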
  • 25. MySQL Cluster
  • 26. MySQL Cluster
    • Real-time, transactional, relational database
    • "Shared-nothing" distributed architecture
    • No single point of failure; at least two replicas are needed
    • Synchronous replication with two-phase commit
    • R/W on any node
    • Automatic failover between nodes
  • 27. Shared-Nothing
  • 28. MySQL Cluster
  • 29. MySQL Cluster
    • Three parts:
    • Management node
    • SQL node, started with ndbcluster
    • NDB data node
    • Data storage and management of both in-memory and disk-based data
    • Automatic and user defined partitioning of data
    • Synchronous replication of data between data nodes
    • Transactions and data retrieval
    • Automatic fail over
    • Resynchronization after failure
  • 30. MySQL Cluster
  • 31. MySQL Cluster
    • Cluster Nodes
    • Node Groups
    • [number_of_node_groups] = number_of_data_nodes / NumberOfReplicas
    • Replicas
    • The number of replicas is equal to the number of nodes per node group
    • Partitions
    • This is a portion of the data stored by the cluster
    • MySQL Cluster normally partitions NDBCLUSTER tables automatically (horizontal data partitioning), using a hashing algorithm based on the table's primary key.
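The node group formula and key-based partitioning on this slide can be worked through in a few lines. The grouping of consecutive node IDs and the CRC32 hash are assumptions for illustration only; the real NDB hash function and placement logic are internal to MySQL Cluster.

```python
import zlib


def number_of_node_groups(data_nodes: int, replicas: int) -> int:
    """number_of_node_groups = number_of_data_nodes / NumberOfReplicas."""
    if replicas <= 0 or data_nodes % replicas != 0:
        raise ValueError("data nodes must divide evenly by the number of replicas")
    return data_nodes // replicas


def node_groups(node_ids: list, replicas: int) -> list:
    """Group node IDs so that each group holds `replicas` nodes."""
    groups = number_of_node_groups(len(node_ids), replicas)
    return [node_ids[i * replicas:(i + 1) * replicas] for i in range(groups)]


def partition_for(primary_key, partitions: int) -> int:
    """Pick a partition from a hash of the primary key (CRC32 as a stand-in)."""
    return zlib.crc32(str(primary_key).encode("utf-8")) % partitions


groups = node_groups([2, 3, 4, 5], replicas=2)
```

With four data nodes and two replicas there are two node groups, and each partition is stored on every node of one group, so the cluster survives the loss of one node per group.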
  • 32. MySQL Cluster
  • 33. MySQL Cluster replication: replicates asynchronously
  • 34. DRBD: DRBD (Distributed Replicated Block Device) is a solution from Linbit supported only on Linux. DRBD creates a virtual block device (which is associated with an underlying physical block device) that can be replicated from the primary server to a secondary server.
  • 35. MySQL HA
  • 36. Resources
    • HA: Heartbeat
    • Load balancing: F5/NetScaler/LVS/HAProxy
    • Monitoring: Nagios/Cacti
    • http://highscalability.com/