Severalnines Training: MySQL® Cluster - Part IX
 

Part IX of our self-training slides on MySQL Cluster.

    Presentation Transcript

    • 1 Copyright 2011 Severalnines AB. Control your database infrastructure.
      9th Installment: MySQL Cluster Self-Training
      Part 8 – Designing a MySQL Cluster
    • 2 Topics
      • Node Placement
      • Capacity Planning and Dimensioning
      • Hardware recommendations
      • Best practice configuration
      • Storage calculations
    • 3 Node Placement
      • Data nodes should use dedicated instances
        – Heavy users of RAM, CPU, and disk
      • API nodes (e.g. SQL nodes) should preferably be on dedicated instances
        – Heavy users of CPU, but little disk
        – RAM usage depends on the workload
      • Management servers
        – Negligible use of CPU, disk, and RAM
    • 4 Co-location
      • Do not co-locate API nodes with Data nodes
        – They will compete for CPU
        – RAM usage for API nodes may grow, competing with the resources of the Data node (causing swapping and node failures)
      • Don't co-locate Management servers with Data nodes
        – You lose protection from split brain/network partitioning
      • API nodes and Management servers can be co-located
    • 5 Cluster Size
      • Number of Data Nodes
        – Depends on storage and throughput requirements
        – Use Sizer (http://www.severalnines.com/sizer) to calculate storage requirements for your data
        – At least two for redundancy
      • Number of API Nodes
        – Depends on the expected level of throughput
        – At least two for redundancy
        – Usually recommended to have 2x as many API nodes as Data nodes (2 data nodes → 4 API nodes), especially for API nodes using the synchronous NDB API (mysqld, Cluster/J)
      • Number of Management servers
        – Two for redundancy. Always!
        – Having one management server on every API node does not make sense
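
The rules of thumb on the Cluster Size slide can be written down as a small helper. This is only a sketch: the function name `suggest_cluster_size` and its return shape are hypothetical, and the numbers are starting points that still need to be confirmed by benchmarking.

```python
def suggest_cluster_size(data_nodes: int) -> dict:
    """Return starting-point node counts based on the slide's rules of thumb."""
    if data_nodes < 2:
        # The slide asks for at least two data nodes for redundancy.
        raise ValueError("at least two data nodes are needed for redundancy")
    return {
        "data_nodes": data_nodes,
        # 2x API nodes per data node, and never fewer than two.
        "api_nodes": max(2, 2 * data_nodes),
        # Two management servers, regardless of cluster size.
        "management_servers": 2,
    }


if __name__ == "__main__":
    print(suggest_cluster_size(2))
```

The actual number of data nodes still has to come from the storage and throughput requirements; the helper only derives the other two counts from it.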
    • 6 Good Initial Setup (1)
      [Diagram. Access layer: API node + ndb_mgmd, API node + ndb_mgmd. Storage layer (NDBCLUSTER): ndbmtd, ndbmtd. Cluster control: mysqld, cmon.]
    • 7 Good Initial Setup (2)
      • Easy to scale:
        – Data nodes can be added online (it is not easy, but it is possible)
        – API nodes can be added online (as long as there are free [mysqld] slots in config.ini)
      • Can be extended
        – Replicating out to an InnoDB database for reporting: http://johanandersson.blogspot.se/2012/09/mysql-cluster-to-innodb-replication.html
        – Using the Hadoop Applier (Oracle): https://blogs.oracle.com/MySQL/entry/announcing_the_mysql_hadoop_applier
      • The suggested setup is only a starting point
        – The questions on the next slides might help determine if you need more nodes
    • 8 Good Initial Setup (3)
      • Can you load in the data that you need?
        – YES: good
        – NO:
          • Can you add more RAM to the data nodes? If not, create a new cluster with four nodes. Try to load in the data again.
          • Can some of the less active tables use DISK DATA storage? Avoid DISK DATA tables for frequently used data.
          • Use Severalnines Sizer (http://www.severalnines.com/sizer/), a capacity planning tool. Create the schema in NDB Cluster, run sizer, and import the result into a spreadsheet. Manipulate the row count.
          • Use sizer to verify growth scenarios
    • 9 Good Initial Setup (4)
      • Can you handle the throughput you need?
        – Verify with Bencher (www.severalnines.com/bencher/)
        – YES: good
        – NO:
          • Are the data nodes the bottleneck? Run: top -Hd1
            Are any of the data node threads running at >90%?
            YES: create a new cluster with 2x the number of nodes.
            NO: the APIs can be the bottleneck. Add more API nodes.
          • Tune schema and queries/requests (possibly play with the NDB cluster connection pool as well)
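
The throughput check above boils down to a single threshold decision. A minimal sketch, assuming the per-thread CPU percentages have already been read off `top -Hd1` by hand; the function name `next_scaling_step` and its return strings are hypothetical:

```python
def next_scaling_step(data_node_thread_cpu: list[float], threshold: float = 90.0) -> str:
    """Pick the layer to scale, given CPU% of the data node threads.

    The 90% threshold is the one named on the slide; the input list is
    assumed to hold one CPU percentage per data node thread.
    """
    if any(cpu > threshold for cpu in data_node_thread_cpu):
        # A saturated data node thread: the storage layer is the bottleneck.
        return "create a new cluster with 2x the number of data nodes"
    # Otherwise the API layer is the more likely bottleneck.
    return "add more API nodes"


if __name__ == "__main__":
    print(next_scaling_step([35.0, 97.5, 60.2]))
    print(next_scaling_step([35.0, 41.0, 60.2]))
```

Schema and query tuning remain worth doing in either case, so the result is a hint about where to add hardware, not a complete diagnosis.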
    • 10 Hardware for Data Nodes
      • 8 cores or more
        – Fast CPU and memory bus are important
      • As much RAM as you need
        – Memory tables and indexes for DISK DATA tables must fit in RAM
      • Disk Subsystem:
        – SATA2 (7200 RPM) is the absolute minimum, but not really suitable for production
        – Better options are:
          • SAS
          • SSD
          • On AWS, preferably provisioned IOPS
        – RAID 1+0 (requires 4 disks)
      • Disk Storage Capacity
        – 10x DataMemory (for REDO LOG and LCP)
      • If you use Disk Data tables
        – One disk for LCP
        – One disk for Tablespace (SSD could be an option)
        – One disk for UNDO/REDO
    • 11 Hardware for API Nodes
      • 8 cores or more
        – Fast CPU and memory bus are important
      • Disks
        – Replication servers:
          • Disk space must be dimensioned to store binary logs/relay logs
          • 5MB/s written into NDB → binary logs will grow by 5MB/s
        – Disk is not important for the API Nodes
          • API nodes do not save any state information to disk (except small metadata like .frm files)
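
The binary log sizing on the API node slide is simple arithmetic: the write rate into NDB multiplied by how long the logs are kept. A rough sketch, where the retention period is an assumption not given on the slide and `binlog_disk_gb` is a hypothetical helper name:

```python
def binlog_disk_gb(write_rate_mb_per_s: float, retention_hours: float) -> float:
    """Estimate binary log disk usage in GB (1 GB = 1024 MB assumed).

    Assumes binary logs grow at the same rate as data is written into NDB,
    as stated on the slide, and are purged after `retention_hours`.
    """
    seconds = retention_hours * 3600
    return write_rate_mb_per_s * seconds / 1024


if __name__ == "__main__":
    # The slide's example rate of 5 MB/s, kept for an assumed 24 hours.
    print(f"{binlog_disk_gb(5, 24):.1f} GB")
```

Relay logs on a replica need similar headroom, so the same estimate can be applied on both ends of a replication link.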
    • 12 Network
      • Network interconnect is important
        – Ethernet
          • 1Gig-E is most common
          • 10Gig-E is coming
        – InfiniBand
          • IPoIB (IP over InfiniBand)
          • Lower latency than Ethernet
      • Load-balancing
        – Hardware: F5, Extreme Summit, Cisco
        – Software: HAProxy, LVS
    • 13 Storage Calculations
      • Two things to consider
        – Disk space
        – Memory consumption
    • 14 Disk Space
      • One data node needs
        – 3x DM (DataMemory) for LCP (3x gives headroom; 2x is on the limit)
        – 4-6x DM for Redo Log
          • 4x for read-mostly applications
          • 6x for write-intensive applications
        – Tablespace
          • Depends on how much data you plan to store on disk
          • Storage needed per table per node:
            2 x (#records x size_of_non_indexed_cols + 40B) x NoOfReplicas / #nodes
            Note: 40B is the record overhead
        – Space to store one or more backups
          • 1x DM for each backup
      • This sums to more than 8x DataMemory in disk space
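
The disk space rules above add up as follows. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the multipliers are the ones from the slide, and the tablespace formula is transcribed as written on the slide.

```python
def disk_space_per_node_gb(data_memory_gb: float,
                           write_intensive: bool = False,
                           backups: int = 1,
                           tablespace_gb: float = 0.0) -> float:
    """Disk space one data node needs, following the slide's multipliers."""
    lcp = 3 * data_memory_gb                              # 3x DM for LCP
    redo = (6 if write_intensive else 4) * data_memory_gb  # 4-6x DM for redo log
    backup = backups * data_memory_gb                      # 1x DM per backup
    return lcp + redo + backup + tablespace_gb


def tablespace_per_node_bytes(records: int, non_indexed_bytes: int,
                              replicas: int, nodes: int) -> float:
    """Disk data storage per table per node, formula as written on the slide."""
    return 2 * (records * non_indexed_bytes + 40) * replicas / nodes


if __name__ == "__main__":
    # Example: 16 GB DataMemory, read-mostly workload, one backup kept on disk.
    print(disk_space_per_node_gb(16))
    print(tablespace_per_node_bytes(1000, 100, replicas=2, nodes=2))
```

With a read-mostly workload and one backup the total is 8x DataMemory; a write-intensive workload raises it to 10x, which matches the 10x DataMemory figure on the data node hardware slide.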
    • 15 DataMemory and IndexMemory
      • IndexMemory = 20B x sum_for_all_records
      • DataMemory per table = 40B + avg_record_size (per record)
      • Per node:
        – DataMemory = SUM(DataMemory per table) x NO_OF_REPLICAS / NO_OF_NODES
        – IndexMemory = IndexMemory x NO_OF_REPLICAS / NO_OF_NODES
      • Easy way:
        – www.severalnines.com/sizer
          • Provision a data model in the cluster
          • Run: ./sizer -a
          • Import the CSV data into the Excel template.
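
A worked version of the memory estimate, as a sketch only. It assumes 40 bytes of overhead plus the average record size per record for DataMemory, 20 bytes per record for IndexMemory, and that each data node holds NoOfReplicas / number-of-data-nodes of the total (the same ratio used in the disk space formula). The function name and the input layout are hypothetical.

```python
def memory_per_node(tables: list[tuple[int, int]],
                    nodes: int, replicas: int) -> tuple[float, float]:
    """Return (DataMemory, IndexMemory) in bytes needed on each data node.

    `tables` holds one (record_count, avg_record_size_bytes) pair per table.
    """
    if nodes < replicas or nodes % replicas != 0:
        raise ValueError("number of data nodes must be a multiple of replicas")
    # 40 B overhead plus the average record size, for every record.
    total_data = sum(count * (40 + size) for count, size in tables)
    # 20 B of index memory for every record.
    total_index = sum(20 * count for count, _ in tables)
    # Each node stores replicas / nodes of the whole data set.
    share = replicas / nodes
    return total_data * share, total_index * share


if __name__ == "__main__":
    # One table with 1,000,000 records of 100 bytes, 4 data nodes, 2 replicas.
    data, index = memory_per_node([(1_000_000, 100)], nodes=4, replicas=2)
    print(f"DataMemory  per node: {data / 1024**2:.1f} MB")
    print(f"IndexMemory per node: {index / 1024**2:.1f} MB")
```

These figures are lower bounds for planning; the Sizer tool mentioned on the slide works from the real schema and is the more reliable source.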
    • 16 Disk Data tables
      • Not everything has to stay in RAM.
        – Log data, archives, etc. that are not frequently accessed can be stored in DISK DATA tables: http://johanandersson.blogspot.se/2012/04/mysql-cluster-disk-data-config.html
        – Indexed columns will always stay in RAM for DISK DATA tables.
        – Disk data access is not fast, but SSDs help a lot.
        – Disk Data tablespace can be increased over time, online.
    • 17 Performance Planning
      • Transaction capacity planning requires benchmarking
        – Throughput and response time requirements affect the number of nodes, both data nodes and MySQL servers.
      • Benchmark the common use cases
        – Severalnines Bencher lets you drive a high load and test individual queries.
        – JMeter etc. can be used to drive web load
        – Try to simulate expected peak traffic.
      • Can the cluster handle the load? If not, add resources online where needed.
    • 18 Coming next in Installment 10: Troubleshooting MySQL Cluster
    • 19 We hope these training slides are useful to you!
      Please visit our website to view the next section of this training.
      For any questions, comments or feedback, please contact us at: services@severalnines.com
      Thank you!
    • 20 Disclaimer
      © Copyright 2011 Severalnines AB. All rights reserved.
      Severalnines & the Severalnines logo(s) are trademarks of Severalnines AB.
      MySQL is a registered trademark of Oracle and/or its affiliates.
      Other names may be trademarks of their respective owners.