10 Do's and Don'ts for MySQL Cluster
Jonathon Coombes
jon@cybersite.com.au
http://www.cybersite.com.au
Slide: 1 Cybersite Consulting Pty Ltd
2. 1. Don't Simply Transfer DBs
Don't take an existing database on a single
server or replicated system and directly
transfer it onto a cluster.
Issues arise when MyISAM or InnoDB tables
are moved to the cluster without considering
how NDB differs.
The NDB engine has its own unique tweaks,
just as the other engines do.
Even if it works (and it most likely won't!), it
will under-perform compared to the original
setup.
3. 2. Do Plan Your Cluster
Consider hardware, network/transport,
schema, relations and data types carefully.
Hardware will affect performance,
particularly memory capacity for an
in-memory cluster.
Networking and transport affect
performance, latency in particular.
Similarly, schema design and data types
affect performance.
Planning avoids problems later on and helps
in debugging problems at a logical level.
4. 3. Do Understand How NDB Works
Understand nodes, node groups, fragments, etc.
Know how the management nodes work
and communicate with data nodes.
Know how the API nodes work and how they
relate to data nodes.
Know your NDB engine properties and
tuning options. Memory-based storage
behaves very differently from the standard
engine types.
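The relationship between data nodes, replicas and node groups can be sketched in a few lines. This is a simplified model, not the NDB implementation: it assumes data nodes are placed into node groups in ascending node ID order, it ignores arbitration and network partitioning, and the function names are illustrative only.

```python
def node_groups(data_node_ids, no_of_replicas):
    """Partition data node IDs into node groups of size no_of_replicas.

    Assumption: nodes are grouped in ascending node ID order.
    """
    if no_of_replicas < 1:
        raise ValueError("no_of_replicas must be at least 1")
    ids = sorted(data_node_ids)
    if len(ids) % no_of_replicas != 0:
        raise ValueError("data node count must be a multiple of NoOfReplicas")
    return [ids[i:i + no_of_replicas] for i in range(0, len(ids), no_of_replicas)]


def survives(groups, failed_ids):
    """True while every node group still has at least one live node.

    Simplification: arbitration and split-brain handling are not modelled.
    """
    failed = set(failed_ids)
    return all(any(node not in failed for node in group) for group in groups)
```

With four data nodes (IDs 2 to 5) and two replicas, this model gives two node groups, [2, 3] and [4, 5]. Losing one node from each group is survivable; losing both nodes of the same group is not, because the fragments held by that group are gone.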
5. 4. Do Calculate Memory Use
Don't just throw a number at it and expect
it to work (except in the simplest of
scenarios).
Use the ndb_size.pl script to help estimate
the size of your existing data.
Don't forget to account for index sizes as
well.
Version 5.1 will allow disk storage for data
only at this stage (indexes stay in memory),
so memory use will still need to be
calculated to optimise performance.
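As a rough starting point before running ndb_size.pl, the per-node memory requirement can be estimated from the total data and index size. The sketch below uses the commonly quoted rule of thumb (database size x replicas x 1.1) / data nodes; the 1.1 overhead factor and the function name are assumptions for illustration, and the result is a combined figure that still has to be split between DataMemory and IndexMemory.

```python
import math


def per_node_memory_mb(data_mb, index_mb, no_of_replicas, data_nodes, overhead=1.1):
    """Estimate the memory each data node needs, in MB.

    overhead=1.1 is a rule-of-thumb assumption, not a measured value.
    """
    if no_of_replicas < 1 or data_nodes < 1:
        raise ValueError("no_of_replicas and data_nodes must be at least 1")
    total_mb = data_mb + index_mb  # indexes count towards memory use too
    return math.ceil(total_mb * no_of_replicas * overhead / data_nodes)
```

For example, 3072 MB of data plus 1024 MB of indexes, with 2 replicas spread over 4 data nodes, comes to about 2253 MB per data node. Treat this as a lower bound for planning and leave headroom for growth.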
6. 5. Don't Start Minimal and Expand
Don't plan to start with a minimal setup and
expand it as you grow.
All architectures will require some capacity
planning for future growth.
Adding nodes once the cluster is set up is not
as easy as planning properly in the first place.
Adding new nodes requires configuration
changes and rolling restarts of the cluster.
7. 6. Do Optimise Data Transfer
Since data nodes pass data between
themselves, plan for the best transport
method available within your budget.
Transport types include SCI and GigE.
GigE allows quick use of existing hardware:
Enable jumbo frames in Linux AND ON THE
SWITCH!
SCI requires MySQL to be compiled with SCI
support:
Use a 2D or 3D torus architecture depending
on HA requirements.
Utilise the options available:
engine-condition-pushdown=1
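The host side of the jumbo frame setting is easy to verify. A minimal sketch, assuming a Linux host that exposes the MTU at /sys/class/net/<interface>/mtu and assuming 9000 bytes as the jumbo frame size; the function names and the interface name in the usage note are illustrative only. It cannot see the switch, which must still be checked separately.

```python
from pathlib import Path

JUMBO_MTU = 9000  # assumed jumbo frame size; confirm what the switch supports


def read_mtu(interface, sysfs_root="/sys/class/net"):
    """Read the configured MTU of a network interface from sysfs."""
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


def has_jumbo_frames(interface, sysfs_root="/sys/class/net", minimum=JUMBO_MTU):
    """True if the interface MTU is at least the assumed jumbo frame size."""
    return read_mtu(interface, sysfs_root) >= minimum
```

Calling has_jumbo_frames("eth1") on each data node (eth1 being a hypothetical name for the cluster interconnect) catches the common case where the switch was reconfigured but one host was missed.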
8. 7. Don't Create Many Indexes
A primary key is essential; if you do not
supply one, NDB will create one automatically.
Primary key lookups are fast!
Adding more indexes may not help. Even a
full table scan in memory may be faster
than managing extra indexes.
If you are uncertain, do a test to check the
performance.
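Such a test can be as simple as timing the same query with and without the extra index. A minimal sketch: run_query is a hypothetical zero-argument callable that you supply, wrapping whichever MySQL client library you use; nothing here is specific to NDB.

```python
import statistics
import time


def median_seconds(run_query, repeats=5):
    """Run a callable several times and return the median wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # hypothetical: executes the query under test
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

Measure once with the index in place and once after dropping it, using production-sized data, and compare the two medians. The median is used so that a single slow run does not skew the result.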
9. 8. Do Avoid Joins
Joins are expensive in any engine.
Avoiding joins reduces the extra workload on
the NDB servers and the amount of data that
must be passed between the different nodes.
De-normalise data if suitable. This moves
towards a data warehouse type model
rather than a truly normalised form.
10. 9. Don't Assume It Is the Only HA Solution
Many people assume that NDB cluster is the
only option for obtaining high availability.
HA can be obtained using other
architectures or models such as:
Scale-out replication
Ultra Monkey, Red Hat Cluster
Continuent's cluster software (multi-master)
11. 10. Do Remember to Backup
Backups are important for any model used!
Backups taken from within the management
client (ndb_mgm) can be restored quickly on
rolling restarts with newly initialised data
store areas.
Redo and undo logs should be backed up to
allow for safe recovery if required.
Remember, just because the cluster is
memory-based does not mean data must be
lost on a power failure!
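The backup and restore steps can be scripted. The sketch below only builds the command lines and does not execute them; it assumes the ndb_mgm and ndb_restore tools are on the PATH. The function names, the connect string and the backup path used in the usage note are placeholders, not values from a real cluster.

```python
def start_backup_command(connect_string=None):
    """Build the ndb_mgm command that starts an online backup."""
    cmd = ["ndb_mgm"]
    if connect_string:
        cmd.append(f"--ndb-connectstring={connect_string}")
    cmd += ["-e", "START BACKUP"]
    return cmd


def restore_command(node_id, backup_id, backup_path, restore_meta=False):
    """Build the ndb_restore command for one data node's backup files.

    restore_meta should be True for only one node, so that the table
    definitions are recreated once.
    """
    cmd = ["ndb_restore", "-n", str(node_id), "-b", str(backup_id)]
    if restore_meta:
        cmd.append("-m")  # restore table definitions (metadata)
    cmd += ["-r", f"--backup_path={backup_path}"]  # -r restores the data
    return cmd
```

Each list can be passed to subprocess.run(cmd, check=True). For example, restore_command(2, 1, "/path/to/BACKUP/BACKUP-1", restore_meta=True) builds the command for the first data node, and the same call without restore_meta covers each remaining node.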
12. 11. Do Remember to Communicate
Give feedback on any problems that you
come across.
Make sure you search the bug database
before reporting problems.
File bug reports where appropriate.
Give repeatable examples as well as trace
logs and configurations to help the
developers.
13. Summary
Remember that NDB Cluster is still evolving
and is relatively new software.
MySQL Cluster is not always the best
solution for your particular data needs.
Plan your cluster rather than jumping in
headlong and trying to work through it.
Optimise data and schema for the engine
type.
Choose appropriate hardware, memory and
transports to suit your needs.