Copyright 2013 Severalnines AB. Control your database infrastructure
10th Installment
MySQL Cluster Self-Training
Part 9 – Troubleshooting MySQL Cluster
Topics
• Common problems
• Error logs and Trace files
• Recovery and Escalation procedures
Common Problems
• The most common problems are
– Configuration changes
– Out of disk space
– Out of RAM
– Network issues (switch failures, network reorganization, upgrade of RST)
– Swapping
• echo 0 > /proc/sys/vm/swappiness
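The disk-space and swapping items above can be checked from a script before they bring a node down. The following is a minimal sketch and not part of the original slides: the function names and the 90% threshold are assumptions chosen for illustration, and `/proc/sys/vm/swappiness` exists only on Linux.

```python
import shutil
from pathlib import Path


def disk_used_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def read_swappiness(proc_file="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value, or None if the file is absent."""
    p = Path(proc_file)
    if not p.exists():
        return None
    return int(p.read_text().strip())


def health_warnings(datadir, max_disk_fraction=0.9,
                    proc_file="/proc/sys/vm/swappiness"):
    """Collect human-readable warnings for a data node host (illustrative only)."""
    warnings = []
    used = disk_used_fraction(datadir)
    if used > max_disk_fraction:
        warnings.append(f"{datadir}: filesystem is {used:.0%} full")
    swappiness = read_swappiness(proc_file)
    if swappiness not in (None, 0):
        warnings.append(f"vm.swappiness is {swappiness}, expected 0")
    return warnings
```

Calling `health_warnings("/data/mysqlcluster")` on a data node host would return an empty list when neither check trips; the path here is the one that appears in the error log excerpts in this deck.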
Localizing the problem
• Look in the cluster log on the management node
• Find out which node or nodes crashed, and in what order
• Go to those nodes
– View the error log file for each node.
– Look at the recommended restart action
• Initial node recovery
• Node Recovery
– It could also be a Permanent error
• Filesystem is full
• Directory does not exist
Error logs
• Each data node stores its error log in its DATADIR
– ndb_X_error.log
– X is the node id of the node
• ndb_X_out.log contains debug messages but is rarely worth inspecting.
• ndb_X_trace.log.n contains the last execution steps before the data node stopped or crashed.
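Given the naming scheme above, the files belonging to one data node can be collected with a few lines of code. This is a sketch under assumptions: `node_log_files` is a hypothetical helper name, and only trace files with a plain numeric suffix are returned, so any other file that happens to share the prefix is ignored.

```python
from pathlib import Path


def node_log_files(datadir, node_id):
    """Group the log files of data node `node_id` found in `datadir` by kind."""
    base = Path(datadir)
    prefix = f"ndb_{node_id}_"
    # Keep only trace files whose suffix is a number (ndb_X_trace.log.1, .2, ...).
    traces = [
        p for p in base.glob(f"{prefix}trace.log.*")
        if p.suffix[1:].isdigit()
    ]
    # Sort numerically so that .10 comes after .2.
    traces.sort(key=lambda p: int(p.suffix[1:]))
    return {
        "error": base / f"{prefix}error.log",
        "out": base / f"{prefix}out.log",
        "traces": traces,
    }
```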
ndb_X_cluster.log
2011-05-24 08:12:44 [MgmtSrvr] INFO -- Node 3: Start with all nodes 3 and 4
2011-05-24 08:12:44 [MgmtSrvr] INFO -- Node 3: CM_REGCONF president = 3, own Node = 3, our dynamic id = 0/1
2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 4: CM_REGCONF president = 3, own Node = 4, our dynamic id = 0/2
2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 3: Node 4: API mysql-5.1.51 ndb-7.1.10
2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 4: Node 3: API mysql-5.1.51 ndb-7.1.10
2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 4: Start phase 1 completed
2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 3: Start phase 1 completed
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Start phase 2 completed (system restart)
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 4: Start phase 2 completed (system restart)
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Start phase 3 completed (system restart)
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 4: Start phase 3 completed (system restart)
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Restarting cluster to GCI: 231577
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Starting to restore schema
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Restore of schema complete
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 4: Starting to restore schema
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 4: Restore of schema complete
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: DICT: activate index 8 done (sys/def/7/PRIMARY)
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Node: 3 StartLog: [GCI Keep: 227381 LastCompleted: 231577 NewestRestorable: 231577]
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Node: 4 StartLog: [GCI Keep: 227381 LastCompleted: 231577 NewestRestorable: 231577]
2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 3: Forced node shutdown completed. Occured during startphase 4. Caused by error 2306: 'Pointer too large(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 1: Node 3 Disconnected
2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 4: Forced node shutdown completed. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 1: Node 4 Disconnected
2011-05-24 08:12:49 [MgmtSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 1, 4, 5 and 9.
2011-05-24 08:12:49 [MgmtSrvr] INFO -- Mgmt server state: nodeid 4 freed, m_reserved_nodes 1, 5 and 9.
2011-05-24 08:12:57 [MgmtSrvr] INFO -- Mgmt server state: nodeid 4 reserved for ip 192.168.100.112, m_reserved_nodes 1, 4, 5 and 9.
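Finding which nodes crashed, and in what order, amounts to picking the "Forced node shutdown completed" alerts out of the cluster log. The sketch below does that with a regular expression modelled on the excerpt above. It assumes each log entry sits on a single line in the real file (the wrapping on the slide is a layout artifact), and `forced_shutdowns` is a hypothetical helper name.

```python
import re

# Pattern modelled on the ALERT lines in the excerpt; other versions may differ.
SHUTDOWN_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[MgmtSrvr\] ALERT\s+-- "
    r"Node (?P<node>\d+): Forced node shutdown completed\."
    r".*?Caused by error (?P<error>\d+)"
)


def forced_shutdowns(log_lines):
    """Return (timestamp, node id, error code) tuples in log order."""
    events = []
    for line in log_lines:
        match = SHUTDOWN_RE.search(line)
        if match:
            events.append(
                (match["time"], int(match["node"]), int(match["error"]))
            )
    return events
```

Applied to the excerpt, this yields node 3 (error 2306) followed by node 4 (error 2308), which is the order in which the error logs should be read.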
Closer Inspection
• There was a system restart ongoing
• Node 3 crashed
– Forced node shutdown completed. Occured during startphase 4. Caused by error 2306: 'Pointer too large(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'
• Node 4
– Forced node shutdown completed. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'
Closer Inspection
• Next, the error logs of the data nodes need to be inspected.
– ndb_3_error.log
– ndb_4_error.log
ndb_X_error.log (node 4)
Time: Tuesday 24 May 2011 - 02:36:36
Status: Temporary error, restart node
Message: Another node failed during system restart, please investigate error(s) on other node(s) (Restart error)
Error: 2308
Error data: Node 3 disconnected
Error object: QMGR (Line: 3050) 0x00000002
Program: /usr/local//mysql/bin//ndbd
Pid: 3501
Version: mysql-5.1.51 ndb-7.1.10
Trace: /data/mysqlcluster//ndb_4_trace.log.5
***EOM***
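An error log entry like the one above is a short list of `Name: value` fields terminated by `***EOM***`, so it is easy to turn into a dictionary. The sketch below handles a single entry and tolerates values that wrap onto following lines. The field names are taken from the excerpt; `parse_error_entry` is a hypothetical helper, and a real file may hold several entries that would need to be split first.

```python
# Field names as they appear in the excerpt above.
FIELDS = (
    "Time", "Status", "Message", "Error data", "Error object", "Error",
    "Program", "Pid", "Version", "Trace",
)


def parse_error_entry(text):
    """Parse one error log entry into a dict of field name to value."""
    entry = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "***EOM***":
            continue
        for name in FIELDS:
            prefix = name + ":"
            if line.startswith(prefix):
                current = name
                entry[current] = line[len(prefix):].strip()
                break
        else:
            # No known field name: treat as a continuation of the last field.
            if current is not None:
                entry[current] += " " + line
    return entry
```

The `Status` field carries the recommended restart action and `Trace` points at the trace file to hand over when reporting a bug.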
Error logs
• The error logs usually give a good hint about what needs to be done
– In the example above, one node (node 3) failed during the system restart.
– This caused the other node (node 4) to shut down as well, with error 2308.
ndb_X_error.log (node 3)
Time: Tuesday 24 May 2011 - 08:53:40
Status: Temporary error, restart node
Message: Pointer too large (Internal error, programming error or missing error message, please report a bug)
Error: 2306
Error data: dblqh/DblqhMain.cpp
Error object: DBLQH (Line: 15725) 0x00000002
Program: /usr/local//mysql/bin//ndbd
Pid: 4790
Version: mysql-5.1.51 ndb-7.1.10
Trace: /data/mysqlcluster//ndb_3_trace.log.25
***EOM***
ndb_X_out.log
RESTORE table: 2 1039 rows applied
RESTORE table: 2 1012 rows applied
RESTORE table: 3 2 rows applied
RESTORE table: 3 2 rows applied
ndb_X_trace.log.N
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 3, r.sigId: 75928 gsn: 164 "CONTINUEB" prio: 1
s.bn: 247 "DBLQH", s.proc: 3, s.sigId: 75923 length: 2 trace: 1 #sec: 0 fragInf: 0
H'00000006 H'00000000
--------------- Signal ----------------
r.bn: 247 "DBLQH", r.proc: 3, r.sigId: 75927 gsn: 262 "FSREADCONF" prio: 0
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 75926 length: 1 trace: 1 #sec: 0 fragInf: 0
UserPointer: 1
--------------- Signal ----------------
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 75926 gsn: 164 "CONTINUEB" prio: 1
s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 75922 length: 1 trace: 1 #sec: 0 fragInf: 0
Scanning the memory channel again with no delay
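For a quick overview of a trace file it can help to reduce each signal to its receiving block and signal name. The sketch below does this with one regular expression modelled on the excerpt above; `last_signals` is a hypothetical helper name, and the pattern is an assumption that may not match trace files from other versions. In the excerpt the `r.sigId` values decrease from one signal to the next, which suggests the file lists the most recent signal first.

```python
import re

# Receiver block name, then the signal name that follows "gsn:".
# DOTALL lets the match span a line break between "gsn:" and the number.
SIGNAL_RE = re.compile(
    r'r\.bn:\s*\d+\s+"(\w+)".*?gsn:\s*\d+\s+"(\w+)"',
    re.DOTALL,
)


def last_signals(trace_text):
    """Return (receiver block, signal name) pairs in file order."""
    return SIGNAL_RE.findall(trace_text)
```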
Recovery and Escalation Procedures
• There are two escalation steps to recover failed data nodes (in this case the cluster is still STARTED)
– Optimized Node Recovery (NR)
– Initial Node Recovery (INR)
• A failed Cluster can be recovered in two ways:
– System Restart
• The individual nodes may have to be restarted in a combination of NR and INR.
– Initial System Restart + Restore Backup
Optimized Node Recovery
• A failed node can be recovered using Optimized Node Recovery.
– This is the fastest way to recover a failed node
– The node recovers from its Local Checkpoint and applies the Redo log.
– It then copies changes from the other node in the same node group.
Optimized Node Recovery
[Figure: storage layer with Data Nodes 1 to 4 arranged in node groups. A table (columns subid, data; rows 1 A to 8 H) is split into Partitions 0 to 3. Px == PRIMARY Partition x, Sx == SECONDARY Partition x.]
• Multiple failed nodes can recover in parallel
• The first step is to try to restart the failed nodes in Optimized Node Recovery mode:
– ndbmtd
Initial Node Recovery
• If a node fails to complete Optimized Node Recovery, the next step in the escalation chain is to perform an Initial Node Recovery.
– This can be necessary because of a corrupted file system
• During Initial Node Recovery the data node will
– Clear out its local filesystem (rm -rf /datadir/ndbd/*)
– Recreate the REDO LOG
– Copy all data from the other node in the node group.
• This recovery usually takes much longer than Optimized Node Recovery.
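The escalation chain for a single failed node (Optimized Node Recovery first, Initial Node Recovery only if that fails) can be written down as a small loop. This is a sketch under assumptions: `start_and_wait` is a hypothetical callable supplied by the operator that runs a command and returns True once the node has joined the cluster, and the connect string is a placeholder. `--initial` and `--ndb-connectstring` are existing `ndbmtd` options.

```python
def recovery_commands(connectstring):
    """Return the escalation steps for one data node, cheapest first."""
    base = ["ndbmtd", f"--ndb-connectstring={connectstring}"]
    return [
        ("optimized node recovery", base),
        ("initial node recovery", base + ["--initial"]),
    ]


def recover_node(connectstring, start_and_wait):
    """Try each step in order; return the name of the step that worked.

    `start_and_wait(cmd)` is a hypothetical callable: it should start the
    data node with `cmd` and return True only if the node reaches STARTED.
    Returns None when both steps fail and further escalation is needed.
    """
    for name, cmd in recovery_commands(connectstring):
        if start_and_wait(cmd):
            return name
    return None
```

Keeping the commands in one ordered list makes it harder to jump straight to `--initial`, which throws away the node's local data and takes much longer.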
Initial Node Recovery Is Needed
Initial Node Recovery
• Suppose that Node 2 failed to recover
• In this case it can be recovered with
– ndbmtd --initial
System Restart
• All nodes have failed.
• Every data node can be restarted with
– ndbmtd
System Restart
• If one node fails during the System Restart, the system restart is aborted
– All nodes crash again
• Error logs must be inspected
System Restart
• Some data nodes will write to their error log
– “Another node failed during system restart”
– Start these nodes with
• ndbmtd
– Then let the cluster perform a partial start (start with the nodes that are OK, at least one from each node group). You may have to try multiple combinations.
• The goal is to find the “other node”, whose error log shows something like
– “Filesystem inconsistency”, “DBDIH pointer too large”
– Start this node with
• ndbmtd --initial
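"Trying multiple combinations" for a partial start means enumerating node sets that contain at least one node from every node group. The sketch below lists those sets, largest first, so that the attempts with the most nodes come before the minimal ones. The layout (nodes 1 and 2 in one group, 3 and 4 in the other) follows the diagram in this deck; `partial_start_candidates` is a hypothetical helper name.

```python
from itertools import combinations, product


def partial_start_candidates(node_groups):
    """List node sets covering every node group, largest sets first.

    `node_groups` maps a node group id to the list of data node ids in it.
    """
    per_group = []
    for nodes in node_groups.values():
        # Every non-empty subset of the group, biggest subsets first.
        subsets = [
            subset
            for size in range(len(nodes), 0, -1)
            for subset in combinations(nodes, size)
        ]
        per_group.append(subsets)
    # Pick one subset per group and flatten the choice into one node tuple.
    candidates = [
        tuple(node for subset in choice for node in subset)
        for choice in product(*per_group)
    ]
    candidates.sort(key=len, reverse=True)
    return candidates
```

For two node groups of two nodes each this gives nine candidates: the full set, four sets of three nodes, and four sets of two nodes.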
System Restart
• If all nodes in one node group have written something like:
– “Filesystem inconsistency”, “DBDIH pointer too large”
– System restart is not possible
• OR
– It is not possible to perform a partial start (i.e., one node from each node group), and all possible combinations are exhausted
• then an Initial System Restart is needed!
– This basically means you have to restore from backup.
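The decision described above can be condensed into one function: a system restart is possible as long as every node group still has at least one node that is able to restart normally. The sketch below is illustrative only. It assumes the operator has already classified nodes as unrecoverable from their error logs, and `plan_system_restart` is a hypothetical helper name.

```python
def plan_system_restart(node_groups, unrecoverable):
    """Return a start command per node, or None if no system restart is possible.

    `node_groups` maps a node group id to its data node ids.
    `unrecoverable` is the set of node ids whose error logs show a problem
    such as a filesystem inconsistency (classification done by the operator).
    """
    plan = {}
    for nodes in node_groups.values():
        healthy = [n for n in nodes if n not in unrecoverable]
        if not healthy:
            # A whole node group is lost: Initial System Restart plus
            # restore from backup is the only remaining option.
            return None
        for node in nodes:
            plan[node] = "ndbmtd" if node in healthy else "ndbmtd --initial"
    return plan
```

The plan never assigns `--initial` to every node of a node group; when that would be required, the function returns None instead.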
System Restart
• This situation requires an Initial System Restart because all nodes in one node group have failed in such a way that they are impossible to restart.
– Luckily, this is not common.
Initial System Restart
• Restart all nodes with
– ndbmtd --initial
• Restore a backup
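The slide does not show the restore step itself. As a rough sketch, assuming a native NDB backup taken earlier, the restore is usually run once per data node with `ndb_restore`, restoring the schema only on the first invocation. The backup id, backup path, node ids and connect string below are placeholders, and `restore_commands` is a hypothetical helper. The short options used (`-n` node id, `-b` backup id, `-m` restore metadata, `-r` restore data) are existing `ndb_restore` options.

```python
def restore_commands(backup_id, backup_path, data_node_ids, connectstring):
    """Build one ndb_restore command line per data node in the backup."""
    commands = []
    for position, node_id in enumerate(data_node_ids):
        cmd = [
            "ndb_restore",
            f"--ndb-connectstring={connectstring}",
            "-n", str(node_id),
            "-b", str(backup_id),
        ]
        if position == 0:
            # Restore the schema (metadata) only once, with the first node.
            cmd.append("-m")
        cmd += ["-r", f"--backup_path={backup_path}"]
        commands.append(cmd)
    return commands
```

The commands are returned as argument lists so they can be reviewed before anything is executed.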
Summary
• The whole exercise is to try different combinations, but never use --initial on all nodes in one node group.
Coming next in Installment 11:
Connectivity Overview
Disclaimer
© Copyright 2013 Severalnines AB. All rights reserved.
Severalnines & the Severalnines logo(s) are trademarks of Severalnines AB.
MySQL is a registered trademark of Oracle and/or its affiliates.
Other names may be trademarks of their respective owners.

Webinar slides: Our Guide to MySQL & MariaDB Performance TuningSeveralnines
 
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBWebinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBSeveralnines
 
Webinar slides: How to Measure Database Availability?
Webinar slides: How to Measure Database Availability?Webinar slides: How to Measure Database Availability?
Webinar slides: How to Measure Database Availability?Severalnines
 
Webinar slides: Designing Open Source Databases for High Availability
Webinar slides: Designing Open Source Databases for High AvailabilityWebinar slides: Designing Open Source Databases for High Availability
Webinar slides: Designing Open Source Databases for High AvailabilitySeveralnines
 

More from Severalnines (20)

Cloud's future runs through Sovereign DBaaS
Cloud's future runs through Sovereign DBaaSCloud's future runs through Sovereign DBaaS
Cloud's future runs through Sovereign DBaaS
 
Tips to drive maria db cluster performance for nextcloud
Tips to drive maria db cluster performance for nextcloudTips to drive maria db cluster performance for nextcloud
Tips to drive maria db cluster performance for nextcloud
 
Working with the Moodle Database: The Basics
Working with the Moodle Database: The BasicsWorking with the Moodle Database: The Basics
Working with the Moodle Database: The Basics
 
SysAdmin Working from Home? Tips to Automate MySQL, MariaDB, Postgres & MongoDB
SysAdmin Working from Home? Tips to Automate MySQL, MariaDB, Postgres & MongoDBSysAdmin Working from Home? Tips to Automate MySQL, MariaDB, Postgres & MongoDB
SysAdmin Working from Home? Tips to Automate MySQL, MariaDB, Postgres & MongoDB
 
(slides) Polyglot persistence: utilizing open source databases as a Swiss poc...
(slides) Polyglot persistence: utilizing open source databases as a Swiss poc...(slides) Polyglot persistence: utilizing open source databases as a Swiss poc...
(slides) Polyglot persistence: utilizing open source databases as a Swiss poc...
 
Webinar slides: How to Migrate from Oracle DB to MariaDB
Webinar slides: How to Migrate from Oracle DB to MariaDBWebinar slides: How to Migrate from Oracle DB to MariaDB
Webinar slides: How to Migrate from Oracle DB to MariaDB
 
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControlWebinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControl
 
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...
 
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...
Webinar slides: Backup Management for MySQL, MariaDB, PostgreSQL & MongoDB wi...
 
Disaster Recovery Planning for MySQL & MariaDB
Disaster Recovery Planning for MySQL & MariaDBDisaster Recovery Planning for MySQL & MariaDB
Disaster Recovery Planning for MySQL & MariaDB
 
MariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash CourseMariaDB Performance Tuning Crash Course
MariaDB Performance Tuning Crash Course
 
Performance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDBPerformance Tuning Cheat Sheet for MongoDB
Performance Tuning Cheat Sheet for MongoDB
 
Advanced MySql Data-at-Rest Encryption in Percona Server
Advanced MySql Data-at-Rest Encryption in Percona ServerAdvanced MySql Data-at-Rest Encryption in Percona Server
Advanced MySql Data-at-Rest Encryption in Percona Server
 
Polyglot Persistence Utilizing Open Source Databases as a Swiss Pocket Knife
Polyglot Persistence Utilizing Open Source Databases as a Swiss Pocket KnifePolyglot Persistence Utilizing Open Source Databases as a Swiss Pocket Knife
Polyglot Persistence Utilizing Open Source Databases as a Swiss Pocket Knife
 
Webinar slides: Free Monitoring (on Steroids) for MySQL, MariaDB, PostgreSQL ...
Webinar slides: Free Monitoring (on Steroids) for MySQL, MariaDB, PostgreSQL ...Webinar slides: Free Monitoring (on Steroids) for MySQL, MariaDB, PostgreSQL ...
Webinar slides: Free Monitoring (on Steroids) for MySQL, MariaDB, PostgreSQL ...
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
 
Webinar slides: Our Guide to MySQL & MariaDB Performance Tuning
Webinar slides: Our Guide to MySQL & MariaDB Performance TuningWebinar slides: Our Guide to MySQL & MariaDB Performance Tuning
Webinar slides: Our Guide to MySQL & MariaDB Performance Tuning
 
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDBWebinar slides: Migrating to Galera Cluster for MySQL and MariaDB
Webinar slides: Migrating to Galera Cluster for MySQL and MariaDB
 
Webinar slides: How to Measure Database Availability?
Webinar slides: How to Measure Database Availability?Webinar slides: How to Measure Database Availability?
Webinar slides: How to Measure Database Availability?
 
Webinar slides: Designing Open Source Databases for High Availability
Webinar slides: Designing Open Source Databases for High AvailabilityWebinar slides: Designing Open Source Databases for High Availability
Webinar slides: Designing Open Source Databases for High Availability
 

Recently uploaded

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 

Recently uploaded (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 

MySQL Cluster Troubleshooting Guide

  • 1. MySQL Cluster Self-Training, 10th Installment: Part 9 – Troubleshooting MySQL Cluster
      Copyright 2013 Severalnines AB. Control your database infrastructure.
  • 2. Topics
      • Common problems
      • Error logs and Trace files
      • Recovery and Escalation procedures
  • 3. Common Problems
      • The most common problems are
        – Configuration changes
        – Out of disk space
        – Out of RAM
        – Network issues (switch failures, network reorganization, upgrade of RST)
        – Swapping
          • echo 0 > /proc/sys/vm/swappiness
  • 4. Localizing the problem
      • Look in the cluster log on the management node
      • Which node or nodes crashed, and in what order?
      • Go to those nodes
        – View the error log file for each node.
        – Look at the recommended restart action
          • Initial node recovery
          • Node Recovery
        – It could also be a Permanent error
          • Filesystem is full
          • Directory does not exist
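The restart action recommended in an error log can be read off its Status line. The sketch below maps a status string to one of the three outcomes listed on this slide. Only the wording "Temporary error, restart node" appears in this deck; the checks for "permanent error" and "initial" are assumptions about how the other statuses are phrased, and `recommended_action` is a hypothetical helper name.

```python
def recommended_action(status: str) -> str:
    """Map the Status line of an error log entry to a restart action.

    Only "Temporary error, restart node" is taken from the slides; the
    other substrings are assumed wordings and may need adjusting.
    """
    text = status.lower()
    if "permanent error" in text:
        # e.g. filesystem full or directory missing: fix that first
        return "permanent error: fix the external problem first"
    if "initial" in text:
        return "initial node recovery"
    if "temporary error" in text:
        return "node recovery"
    return "unknown"


print(recommended_action("Temporary error, restart node"))
```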
  • 5. Error logs
      • Each data node stores its error log in its DATADIR
        – ndb_X_error.log
        – X is the node id of the node
      • The ndb_X_out.log contains debug messages but is usually not interesting to look at.
      • The ndb_X_trace.log.n contains the last execution steps before the data node stopped/crashed.
  • 6. Ndb_X_cluster.log
      2011-05-24 08:12:44 [MgmtSrvr] INFO -- Node 3: Start with all nodes 3 and 4
      2011-05-24 08:12:44 [MgmtSrvr] INFO -- Node 3: CM_REGCONF president = 3, own Node = 3, our dynamic id = 0/1
      2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 4: CM_REGCONF president = 3, own Node = 4, our dynamic id = 0/2
      2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 3: Node 4: API mysql-5.1.51 ndb-7.1.10
      2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 4: Node 3: API mysql-5.1.51 ndb-7.1.10
      2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 4: Start phase 1 completed
      2011-05-24 08:12:45 [MgmtSrvr] INFO -- Node 3: Start phase 1 completed
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Start phase 2 completed (system restart)
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 4: Start phase 2 completed (system restart)
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Start phase 3 completed (system restart)
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 4: Start phase 3 completed (system restart)
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Restarting cluster to GCI: 231577
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Starting to restore schema
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Restore of schema complete
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 4: Starting to restore schema
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 4: Restore of schema complete
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: DICT: activate index 8 done (sys/def/7/PRIMARY)
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Node: 3 StartLog: [GCI Keep: 227381 LastCompleted: 231577 NewestRestorable: 231577]
      2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Node: 4 StartLog: [GCI Keep: 227381 LastCompleted: 231577 NewestRestorable: 231577]
      2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 3: Forced node shutdown completed. Occured during startphase 4. Caused by error 2306: 'Pointer too large(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
      2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 1: Node 3 Disconnected
      2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 4: Forced node shutdown completed. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
      2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 1: Node 4 Disconnected
      2011-05-24 08:12:49 [MgmtSrvr] INFO -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 1, 4, 5 and 9.
      2011-05-24 08:12:49 [MgmtSrvr] INFO -- Mgmt server state: nodeid 4 freed, m_reserved_nodes 1, 5 and 9.
      2011-05-24 08:12:57 [MgmtSrvr] INFO -- Mgmt server state: nodeid 4 reserved for ip 192.168.100.112, m_reserved_nodes 1, 4, 5 and 9.
  • 7. Closer Inspection
      • There was a system restart ongoing
      • Node 3 crashed
        – Forced node shutdown completed. Occured during startphase 4. Caused by error 2306: 'Pointer too large(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'
      • Node 4
        – Forced node shutdown completed. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'
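Picking the forced shutdowns out of a long cluster log by eye is error-prone, so here is a minimal Python sketch that extracts them in the order they were logged. It assumes one log entry per line in the format shown on the cluster log slide; `SHUTDOWN`, `SAMPLE` and `forced_shutdowns` are names introduced here for illustration only.

```python
import re

# Matches the "Forced node shutdown completed" ALERT entries as they appear
# in the sample cluster log (including the log's own spelling "Occured").
SHUTDOWN = re.compile(
    r"^(?P<time>\S+ \S+) \[MgmtSrvr\] ALERT\s+-- Node (?P<node>\d+): "
    r"Forced node shutdown completed\."
    r"(?: Occured during startphase (?P<phase>\d+)\.)?"
    r"(?: Caused by error (?P<error>\d+):)?"
)

# Four entries copied from the sample cluster log.
SAMPLE = """\
2011-05-24 08:12:46 [MgmtSrvr] INFO -- Node 3: Restarting cluster to GCI: 231577
2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 3: Forced node shutdown completed. Occured during startphase 4. Caused by error 2306: 'Pointer too large(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 1: Node 3 Disconnected
2011-05-24 08:12:48 [MgmtSrvr] ALERT -- Node 4: Forced node shutdown completed. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
"""


def forced_shutdowns(lines):
    """Return (time, node id, start phase, error code) per forced shutdown, in log order."""
    events = []
    for line in lines:
        m = SHUTDOWN.match(line.strip())
        if m:
            events.append((
                m["time"],
                int(m["node"]),
                int(m["phase"]) if m["phase"] else None,
                int(m["error"]) if m["error"] else None,
            ))
    return events


for event in forced_shutdowns(SAMPLE.splitlines()):
    print(event)
```

On the sample, node 3 (error 2306) is listed before node 4 (error 2308), which matches the reading on this slide: node 3 is the one to investigate first.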
  • 8. Closer Inspection
      • Next, the error logs of the data nodes need to be inspected.
        – ndb_3_error.log
        – ndb_4_error.log
  • 9. Ndb_X_error.log
      Time: Tuesday 24 May 2011 - 02:36:36
      Status: Temporary error, restart node
      Message: Another node failed during system restart, please investigate error(s) on other node(s) (Restart error)
      Error: 2308
      Error data: Node 3 disconnected
      Error object: QMGR (Line: 3050) 0x00000002
      Program: /usr/local//mysql/bin//ndbd
      Pid: 3501
      Version: mysql-5.1.51 ndb-7.1.10
      Trace: /data/mysqlcluster//ndb_4_trace.log.5
      ***EOM***
  • 10. Error logs
      • Looking at the error logs usually gives a good hint about what needs to be done
        – In the above example one node failed during system restart.
        – This caused
  • 11. Ndb_X_error.log
      Time: Tuesday 24 May 2011 - 08:53:40
      Status: Temporary error, restart node
      Message: Pointer too large (Internal error, programming error or missing error message, please report a bug)
      Error: 2306
      Error data: dblqh/DblqhMain.cpp
      Error object: DBLQH (Line: 15725) 0x00000002
      Program: /usr/local//mysql/bin//ndbd
      Pid: 4790
      Version: mysql-5.1.51 ndb-7.1.10
      Trace: /data/mysqlcluster//ndb_3_trace.log.25
      ***EOM***
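When several data nodes have to be checked, it can help to load each error log entry into a dictionary. The sketch below assumes that every field sits on its own line as `Key: value` and that each entry ends with `***EOM***`, as in the two examples above; `parse_error_log` and `ENTRY` are illustrative names, not part of MySQL Cluster.

```python
def parse_error_log(text):
    """Split the text of an ndb_X_error.log into one dict per entry.

    Assumes one "Key: value" field per line and "***EOM***" as the
    end-of-entry marker.
    """
    entries, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "***EOM***":
            if current:
                entries.append(current)
            current = {}
        elif ": " in line:
            # Split on the first ": " only, so values may contain colons.
            key, value = line.split(": ", 1)
            current[key] = value
    return entries


# The node 3 entry from the slide above.
ENTRY = """\
Time: Tuesday 24 May 2011 - 08:53:40
Status: Temporary error, restart node
Message: Pointer too large (Internal error, programming error or missing error message, please report a bug)
Error: 2306
Error data: dblqh/DblqhMain.cpp
Error object: DBLQH (Line: 15725) 0x00000002
Program: /usr/local//mysql/bin//ndbd
Pid: 4790
Version: mysql-5.1.51 ndb-7.1.10
Trace: /data/mysqlcluster//ndb_3_trace.log.25
***EOM***
"""

entry = parse_error_log(ENTRY)[0]
print(entry["Error"], entry["Status"], entry["Trace"])
```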
  • 12. Ndb_X_out.log
      RESTORE table: 2 1039 rows applied
      RESTORE table: 2 1012 rows applied
      RESTORE table: 3 2 rows applied
      RESTORE table: 3 2 rows applied
  • 13. Ndb_X_trace.log.N
      --------------- Signal ----------------
      r.bn: 247 "DBLQH", r.proc: 3, r.sigId: 75928 gsn: 164 "CONTINUEB" prio: 1
      s.bn: 247 "DBLQH", s.proc: 3, s.sigId: 75923 length: 2 trace: 1 #sec: 0 fragInf: 0
      H'00000006 H'00000000
      --------------- Signal ----------------
      r.bn: 247 "DBLQH", r.proc: 3, r.sigId: 75927 gsn: 262 "FSREADCONF" prio: 0
      s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 75926 length: 1 trace: 1 #sec: 0 fragInf: 0
      UserPointer: 1
      --------------- Signal ----------------
      r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 75926 gsn: 164 "CONTINUEB" prio: 1
      s.bn: 253 "NDBFS", s.proc: 3, s.sigId: 75922 length: 1 trace: 1 #sec: 0 fragInf: 0
      Scanning the memory channel again with no delay
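To get a quick overview of which blocks were receiving which signals before the crash, the receiver part of each signal header can be pulled out with a regular expression. This is a rough sketch that only relies on the header layout visible in the excerpt above; `SIGNAL_HEADER`, `TRACE` and `signals` are hypothetical names, and real trace files may contain other record types that this pattern ignores.

```python
import re

# Receiver half of a signal header as shown in the excerpt:
# r.bn: <block no> "<block name>", r.proc: <node>, r.sigId: <id> gsn: <no> "<signal name>" prio: <n>
SIGNAL_HEADER = re.compile(
    r'r\.bn: (?P<block_no>\d+) "(?P<block>\w+)", r\.proc: (?P<node>\d+), '
    r'r\.sigId: (?P<sig_id>\d+) gsn: (?P<gsn>\d+) "(?P<signal>\w+)" prio: (?P<prio>\d+)'
)

# Receiver header lines copied from the trace excerpt.
TRACE = '''
r.bn: 247 "DBLQH", r.proc: 3, r.sigId: 75928 gsn: 164 "CONTINUEB" prio: 1
r.bn: 247 "DBLQH", r.proc: 3, r.sigId: 75927 gsn: 262 "FSREADCONF" prio: 0
r.bn: 253 "NDBFS", r.proc: 3, r.sigId: 75926 gsn: 164 "CONTINUEB" prio: 1
'''


def signals(text):
    """Return (signal id, receiving block, signal name) for each header found."""
    return [
        (int(m["sig_id"]), m["block"], m["signal"])
        for m in SIGNAL_HEADER.finditer(text)
    ]


for sig in signals(TRACE):
    print(sig)
```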
  • 14. Recovery and Escalation Procedures
      • There are two escalation steps to recover failed data nodes
        – in this case Cluster is still STARTED
        – Optimized Node Recovery (NR)
        – Initial Node Recovery (INR)
      • A failed Cluster can be recovered in two ways:
        – System Restart
          • The individual nodes may have to be restarted in a combination of NR and INR.
        – Initial System Restart + Restore Backup
  • 15. Optimized Node Recovery
      • A failed node can be recovered using Optimized Node Recovery.
        – This is the fastest way to recover a failed node
        – The node will recover from the Local Checkpoint and apply the Redo log.
        – It then copies changes from the other node in the same node group.
  • 16. Optimized Node Recovery
      [Diagram: storage layer with data nodes 1 to 4 holding partitions 0 to 3 of a table (subid, data) with rows 1 to 8 (A to H); Px = primary partition x, Sx = secondary partition x]
      • Multiple failed nodes can recover in parallel
      • The first step is to try to restart the failed nodes in Optimized Node Recovery mode:
        – ndbmtd
  • 17. Initial Node Recovery
      • If a node fails to complete Optimized Node Recovery, the next step in the escalation chain is to perform an Initial Node Recovery.
        – This can be because of a corrupted file system
      • During Initial Node Recovery the data node will
        – Clear out its local filesystem (rm -rf /datadir/ndbd/*)
        – Recreate the REDO LOG
        – Copy all data from the other node in the node group.
      • Usually this recovery takes a lot longer to perform than Optimized Node Recovery.
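The two-step escalation for a single failed data node (first `ndbmtd`, then `ndbmtd --initial`) can be expressed as a small loop. In this sketch, `start` is a hypothetical callable supplied by the caller that launches the given command and returns True once the node has rejoined the cluster; how that is detected is outside the scope of the slides. Real invocations normally need site-specific options, which are deliberately left out here.

```python
# Escalation chain from the slides: Optimized Node Recovery first,
# Initial Node Recovery only if that fails.
ESCALATION = (["ndbmtd"], ["ndbmtd", "--initial"])


def recover_node(start, steps=ESCALATION):
    """Try each recovery step in order; return the command that worked, else None.

    `start` is a caller-provided function (an assumption of this sketch) that
    runs the command and reports whether the node came back.
    """
    for command in steps:
        if start(list(command)):
            return list(command)
    return None


# Usage with a stand-in for `start` that pretends only --initial succeeds.
attempts = []


def fake_start(command):
    attempts.append(command)
    return "--initial" in command


print(recover_node(fake_start))
```

Keeping the process launch behind a callable makes the escalation order easy to test without touching a real cluster.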
  • 18. Initial Node Recovery Is Needed
  • 19. Initial Node Recovery Is Needed
  • 20. Initial Node Recovery
      [Diagram: storage layer with data nodes 1 to 4 holding partitions 0 to 3 of a table (subid, data) with rows 1 to 8 (A to H); Px = primary partition x, Sx = secondary partition x]
      • Pretend that Node 2 failed to recover
      • In this case it can be recovered with
        – ndbmtd --initial
  • 21. System Restart
      [Diagram: storage layer with data nodes 1 to 4 holding partitions 0 to 3 of a table (subid, data) with rows 1 to 8 (A to H); Px = primary partition x, Sx = secondary partition x]
      • All nodes have failed.
      • Every data node can be restarted with
        – ndbmtd
  • 22. System Restart
      [Diagram: storage layer with data nodes 1 to 4 holding partitions 0 to 3 of a table (subid, data) with rows 1 to 8 (A to H); Px = primary partition x, Sx = secondary partition x]
      • If one node fails during the System Restart, the system restart is aborted
        – All nodes crash again
      • Error logs must be inspected
  • 23. System Restart
      • Some data nodes will write out in the error log
        – “Another data node failed during system restart”
        – Start these nodes with
          • ndbmtd
        – Then let the cluster perform a partial start (start with the nodes that are ok, at least one from each node group) -> you may have to try multiple combinations.
      • The goal is to find the “another node”
        – “Filesystem inconsistency”, “DBDIH pointer too large”
        – Start this node with
          • ndbmtd --initial
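The "try multiple combinations" step can be made systematic by enumerating every way of picking one node from each node group. The following sketch does that with `itertools.product`; the node-group layout `[[1, 2], [3, 4]]` and the function name `partial_start_candidates` are assumptions for illustration, and nodes already known to be bad can be passed in as `suspect`.

```python
from itertools import product


def partial_start_candidates(node_groups, suspect=()):
    """List the minimal partial-start sets: one usable node from each node group.

    Returns an empty list if some node group has no usable node left,
    because a partial start needs at least one node per group.
    """
    bad = set(suspect)
    usable = [[n for n in group if n not in bad] for group in node_groups]
    if any(not group for group in usable):
        return []
    return [list(combo) for combo in product(*usable)]


# Assumed layout: two node groups with two data nodes each.
print(partial_start_candidates([[1, 2], [3, 4]]))
print(partial_start_candidates([[1, 2], [3, 4]], suspect=[2]))
```

Each returned list is one combination to attempt; a combination that starts cleanly narrows down which of the remaining nodes is the "another node".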
  • 24. System Restart
      • If all nodes in one node group have written out something like:
        – “Filesystem inconsistency”, “DBDIH pointer too large”
        – System restart is not possible
      • OR
        – It is not possible to perform a partial start (i.e., one node from each node group), and all possible combinations are exhausted, then ...
      • Initial System Restart is needed!
        – Which basically means you have to restore from backup.
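The decision on this slide boils down to two conditions, which the sketch below encodes. `unrecoverable` is the set of nodes whose error logs show an error of the "Filesystem inconsistency" kind, and `combinations_exhausted` records that every partial start has already been tried; both parameter names, the function name and the example layout are assumptions made for this illustration.

```python
def restart_plan(node_groups, unrecoverable, combinations_exhausted=False):
    """Decide between a System Restart and an Initial System Restart.

    An Initial System Restart (followed by a restore from backup) is needed
    when every node in some node group is unrecoverable, or when all
    partial-start combinations have been tried without success.
    """
    bad = set(unrecoverable)
    group_lost = any(group and set(group) <= bad for group in node_groups)
    if group_lost or combinations_exhausted:
        return "initial system restart + restore backup"
    return "system restart"


print(restart_plan([[1, 2], [3, 4]], unrecoverable=[2]))
print(restart_plan([[1, 2], [3, 4]], unrecoverable=[1, 2]))
```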
  • 25. System Restart
      [Diagram: storage layer with data nodes 1 to 4 holding partitions 0 to 3 of a table (subid, data) with rows 1 to 8 (A to H); Px = primary partition x, Sx = secondary partition x]
      • This situation requires an Initial System Restart because all nodes in one node group have failed in such a way that they are impossible to restart.
        – Luckily this is not very common at all.
  • 26. Initial System Restart
      [Diagram: storage layer with data nodes 1 to 4 holding partitions 0 to 3 of a table (subid, data) with rows 1 to 8 (A to H); Px = primary partition x, Sx = secondary partition x]
      • Restart all nodes with
        – ndbmtd --initial
      • Restore a backup
  • 27. Summary
      • The whole exercise is to try different combinations, but never start all nodes in one node group with --initial.
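The rule in the summary is easy to break under pressure, so a restart script could check it before launching anything. This is a minimal guard under assumed names (`safe_to_initial`, a `[[1, 2], [3, 4]]` node-group layout); it only checks the rule and does not start any process.

```python
def safe_to_initial(node_groups, initial_nodes):
    """Return False if the plan would start every node of some node group with --initial."""
    wanted = set(initial_nodes)
    return not any(group and set(group) <= wanted for group in node_groups)


# Assumed layout: two node groups with two data nodes each.
print(safe_to_initial([[1, 2], [3, 4]], [2]))
print(safe_to_initial([[1, 2], [3, 4]], [1, 2]))
```

The second call reports an unsafe plan because both nodes of the first node group would be started with --initial, which would discard that group's only copies of its data.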
  • 28. Coming next in Installment 11: Connectivity Overview
  • 29. Disclaimer
      © Copyright 2013 Severalnines AB. All rights reserved. Severalnines & the Severalnines logo(s) are trademarks of Severalnines AB. MySQL is a registered trademark of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.