MySQL Cluster Carrier Grade Edition
Alexander Yu
Principal Sales Consultant | MySQL Asia Pacific & Japan
2011-07-20
Agenda / Topics


              • Oracle MySQL Strategy
              • MySQL Server
                Pluggable Storage Engine Architecture
              • High Availability Solutions
              • MySQL Cluster Carrier Grade
                – Internals
                – Geographical Replication
                – Scale Out
                – Backup & Restore
              • Q&A




© 2011 Oracle Corporation                               2
About MySQL

              •   Founded, first release in 1995
              •   MySQL Acquired by Sun Microsystems Feb 2008
              •   Oracle Acquires Sun Microsystems Jan 2010
              •   +12M Product Installations
              •   65K+ Downloads Per Day
              •   Part of the rapidly growing open source LAMP stack


  Customers across every major operating system, hardware
      vendor, geography, industry, and application type



    High Performance ▪ Reliable ▪ Easy to Use

Oracle’s Strategy:
             Complete. Open. Integrated.
                            • Built together
                            • Tested together
                            • Managed together
                            • Serviced together
                            • Based on open standards
                            • Lower cost
                            • Lower risk
                            • More reliable




Complete. Open. Integrated.
               MySQL Completes The Stack

                                • Oracle never settles for being second
                                  best at any level of the stack
                                • “Complete” means we meet most
                                  customer requirements at every level

                                That’s why MySQL matters to
                                 Oracle and Oracle customers




The “M” in the LAMP Stack


                    Operating System        L
                    Application Server      A
                    Database                M
                    Scripting               P

Investment in MySQL
              Rapid Innovation

           • Make MySQL a Better MySQL
             • #1 Open Source Database for Web Applications
             • Most Complete LAMP Stack
             • Telecom & Embedded
           • Develop, Promote and Support MySQL
             • Improve engineering, consulting and support
             • Leverage 24x7, World-Class Oracle Support

           • MySQL Community Edition
             • Source and binary releases
             • GPL license


Oracle + MySQL Customers

              • Product Integration
                 • Oracle GoldenGate (Complete!)
                 • Oracle Enterprise Linux + Oracle VM (Complete!)
                    HA Template Available
                 • Oracle Secure Backup (CY 2011)
                 • Oracle Audit Vault (CY 2011)
                 • Oracle Enterprise Manager (CY 2011)

              • Support
                 • Leverage 24x7, World-Class Oracle Support
                 • MyOracle Support



Serving Key Markets and Industry Leaders

            Powering Data Management on the Web & in the Network




              • Web
              • OEM / ISV’s
              • SaaS, Hosting
              • Telecommunications
              • Enterprise 2.0




MySQL in Communications




                            http://www.mysql.com/industry/communications/resources.html#customer_case_studies




MySQL Server
            Pluggable Storage Engine Architecture




Pluggable Storage Engine Architecture
           MySQL Server
           • Clients and Apps
           • Connectors: Native C API, JDBC, ODBC, .Net, PHP, Ruby, Python, VB, Perl
           • Connection Pool: Authentication – Thread Reuse – Connection Limits – Check Memory – Caches
           • Enterprise Management Services and Utilities: Backup & Recovery, Security, Replication, Cluster, Partitioning, Instance Manager, Information_Schema, MySQL Workbench
           • SQL Interface: DDL, DML, Stored Procedures, Views, Triggers, Etc.
           • Parser: Query Translation, Object Privileges
           • Optimizer: Access Paths, Statistics
           • Caches: Global and Engine Specific Caches and Buffers
           • Pluggable Storage Engines (Memory, Index and Storage Management): InnoDB, MyISAM, Cluster, Etc…, Partners, Community, More..
           • Filesystems, Files and Logs: Redo, Undo, Data, Index, Binary, Error, Query and Slow




MySQL Cluster Architecture
             Shared-nothing distributed database with no SPOF:
             High Read & Write Performance & 99.999% uptime

             • Clients: JDBC (Java), NDB API (C++), ClusterJ (Java), OpenJPA (Java), PHP/P*/ODBC, OpenLDAP
             • MySQL Cluster Application Nodes: SQL Nodes, ClusterJ, NDB API (C++)
             • Management: MGM Client, MGM API (C), MGM Nodes
             • MySQL Cluster Data Nodes, accessed via the NDB API


Workload Qualification InnoDB vs MySQL Cluster
       Workload                                                      InnoDB   MySQL Cluster

       Packaged Applications (i.e. standard business applications)   Yes      No. Unless mainly PK access

       Custom Applications                                           Yes      Yes

       OLTP Applications                                             Yes      Yes

       DSS Applications (i.e. Data Marts, Analytics, etc.)           Yes      No

       Content Management                                            Yes      Limited Support

       In-Network Telecoms Applications (HLR, HSS, SDP, etc)         No       Yes

       Web Session Management                                        Yes      Yes

       User Profile Management & AAA                                 Yes      Yes

       eCommerce Databases                                           Yes      Yes



Feature Comparison InnoDB vs MySQL Cluster
Feature Qualification                                                    InnoDB           MySQL Cluster
Latest MySQL 5.5 & InnoDB 1.1 Performance Enhancements                       Yes                     No

Storage Limits                                                              64TB                  2TB (a)
Foreign Keys                                                                 Yes                     No
MVCC Non-Blocking Reads                                                      Yes                     No
Optimized for Complex Multi-Table JOINs with Thousands of Accesses           Yes                   No (b)

Hash Indexes                                                                  No                    Yes
Compressed Data                                                              Yes                     No
Support for 8KB+ Row Sizes                                                   Yes            Only via BLOBs (c)
Built-in Clustering Support for 99.999% HA                                    No                    Yes
Minimum Number of Physical Hosts for Redundancy                       2 (Active / Passive) 2 + 1 (A/A & Mgmt) (d)
Time to Recovery After Node Failure                                      30s - hours            Sub-Second
Real-Time Performance                                                        No                    Yes
Option for In-Memory Storage of Tables with Disk Persistence                  No                    Yes

Non-SQL Access Methods to Data (i.e. NDB API)                                 No                    Yes

Write Scalability without Application Partitioning                            No                   Yes (e)
Max Number of Nodes for Parallel Write Performance                            1                     48 (f)

Conflict Resolution & Detection across Multiple Replication Masters           No                    Yes

Virtualization Support                                                       Yes                     No



Storage Engines

 Feature                                         MyISAM              NDB      Archive    InnoDB     Memory
 Storage limits                                     No                 Yes       No        64TB       Yes

 Transactions                                       No                 Yes       No        Yes        No
 Locking granularity                               Table              Row       Row        Row       Table
 MVCC snapshot read                                 No                 No        No        Yes        No

 Geospatial support                                 Yes                No       Yes        Yes        No
 Data caches                                        No                 Yes       No        Yes        NA
 Index caches                                       Yes                Yes       No        Yes        NA
 Compressed data                                    Yes                No       Yes         No        No

 Storage cost (relative to other engines)          Small              Med      Small       Med        NA
 Memory cost (relative to other engines)           Low                High      Low        High       High
 Bulk insert speed                                 High               High    Highest      Med        High

 Replication support                                Yes                Yes      Yes        Yes        Yes
 Foreign Key support                                No                 No        No        Yes        No

 Built-in Cluster/High-availability support         No                 Yes       No         No        No

    Dynamically add and remove storage engines. Change the storage engine on a table with “ALTER TABLE …”
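
A minimal sketch of the "ALTER TABLE …" engine switch mentioned above, assuming the five engines from the comparison table. The helper name `alter_engine_sql` and the table name `subscribers` are hypothetical; the sketch only builds the statement text and does not connect to a server.

```python
# Engines listed in the storage engine comparison table on this slide.
ENGINES = {"MyISAM", "NDB", "Archive", "InnoDB", "Memory"}


def alter_engine_sql(table: str, engine: str) -> str:
    """Build an ALTER TABLE statement that moves a table to another engine.

    Hypothetical helper for illustration; it returns SQL text only.
    """
    if engine not in ENGINES:
        raise ValueError(f"unknown storage engine: {engine!r}")
    # Quote the identifier with backticks, doubling any embedded backtick.
    quoted = "`" + table.replace("`", "``") + "`"
    return f"ALTER TABLE {quoted} ENGINE = {engine};"


print(alter_engine_sql("subscribers", "NDB"))
```

Restricting the engine name to a known set keeps the generated statement from carrying arbitrary text, since an engine name cannot be passed as a bound parameter.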



Why Users Adopt MySQL Cluster
  • MySQL Already in Use
  • High Read/Write Throughput
  • 99.999% MySQL
  • Real Time Performance
  • Scale-Out, On-Demand


Why Users Buy MySQL Cluster CGE
  • Standardized on Open Source
  • Blend of Web & Telecoms Capabilities
  • Deploying Mission Critical Applications
  • HA MySQL
  • Management & Monitoring Tools
  • Global 24x7 support
  • Embedding MySQL Cluster
  • Real-Time, High Read/Write Performance
  • Scale-Out, Shared Nothing

High Availability Solutions




Selecting the Right HA Architecture




Mapping HA Architecture to Applications
  HA architectures compared: Data Replication; Clustered / Virtualized; Shared-Nothing, Geo-Replicated Cluster

  Applications mapped against them:
    • E-Commerce / Trading
    • Session Management
    • User Authentication / Accounting
    • Feeds, Blogs, Wikis
    • Data Refinery
    • OLTP
    • Data Warehouse/BI
    • Content Management
    • CRM / SCM
    • Collaboration
    • Packaged Software
    • Telco Apps (HLR/HSS/SDP…)




MySQL High Availability Solutions

          95.000%      • MySQL Replication
          99.000%      • MySQL Replication with Clustering Software
          99.900%      • DRBD with Clustering Software
          99.900%      • Shared Storage with Clustering Software (A/P - A/A)
          99.990%      • DRBD and Replication with Clustering Software
          99.990%      • Shared Storage and Replication with Clustering SW
          99.990%      • Shared Storage Replication
          99.990%      • Virtualised Environment
          99.999%      • MySQL Cluster
          99.999%      • MySQL Cluster & Replication
          99.999%      • MySQL Cluster Carrier Grade Edition
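
The availability percentages above can be read as a yearly downtime budget. The following is a small worked sketch, assuming a 365-day year; the function name `downtime_minutes_per_year` is illustrative and not taken from the slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Convert an availability percentage into allowed downtime per year."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


# The availability tiers listed on the slide.
for pct in (95.0, 99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct:.3f}% -> {minutes:.1f} minutes of downtime per year")
```

Under this assumption, "five nines" (99.999%) leaves a little over five minutes of downtime per year, while 95% allows roughly 18 days.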


MySQL Replication



• Native in MySQL
• Used for Scalability and HA
• Asynchronous as standard
• Semi-Synchronous support added in MySQL 5.5
• Each slave adds minimal load on master




                                  Relay Log




Replication Topologies



  • Single
  • Chain
  • Circular
  • Multiple
  • Multi-Master
  • Multi-Circular




MySQL Replication
                  Read Scalability
        Clients




                                              MySQL Replication


                                     Slaves                       Master



  • Used by leading web properties for scale-out
  • Reads are directed to slaves, writes to master
  • Delivers higher performance & scale with efficient resource utilization
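
The read/write split described in the bullets above could be sketched as follows. This is an illustration under stated assumptions, not a MySQL connector feature: the class `ReplicationRouter`, the host names, and the rule of treating only SELECT and SHOW as reads are all hypothetical.

```python
import itertools


class ReplicationRouter:
    """Pick a server for each statement: reads go to slaves, writes to the master."""

    READ_PREFIXES = ("SELECT", "SHOW")  # simplifying assumption for this sketch

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith(self.READ_PREFIXES)
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master


router = ReplicationRouter("master-db", ["slave-1", "slave-2"])
print(router.route("SELECT * FROM sessions"))    # a slave
print(router.route("INSERT INTO sessions ..."))  # the master
```

Because replication is asynchronous as standard, a slave may briefly lag the master, so an application that must read its own writes would need to send those reads to the master as well.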


MySQL Replication
              Failure Detection & Failover




  • Linux Heartbeat implements heartbeat protocol between nodes
  • Failover initiated by Cluster Resource Manager (Pacemaker) if heartbeat message is not received
  • Virtual IP address failed over to ensure failover is transparent to apps
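
The heartbeat-timeout idea above can be illustrated with a toy model. This is not the Linux Heartbeat or Pacemaker API; the class `HeartbeatMonitor`, its method names, and the 3-second timeout are assumptions chosen for the sketch, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Toy failure detector: move the virtual IP when heartbeats stop arriving."""

    def __init__(self, active, standby, timeout_s, started_at=0.0):
        self.vip_owner = active      # node currently holding the virtual IP
        self.standby = standby
        self.timeout_s = timeout_s   # hypothetical heartbeat deadline in seconds
        self.last_beat = started_at

    def record_heartbeat(self, now):
        """Called whenever a heartbeat message from the active node arrives."""
        self.last_beat = now

    def check(self, now):
        """Fail the virtual IP over to the standby if the deadline has passed."""
        overdue = now - self.last_beat > self.timeout_s
        if overdue and self.vip_owner != self.standby:
            self.vip_owner = self.standby
        return self.vip_owner


monitor = HeartbeatMonitor("db-active", "db-standby", timeout_s=3.0)
monitor.record_heartbeat(now=1.0)
print(monitor.check(now=3.5))  # heartbeat still fresh
print(monitor.check(now=4.5))  # deadline missed, standby takes the VIP
```

Applications keep connecting to the same virtual IP, which is why the failover can be transparent to them.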


Shared Disk Clusters
              A/P - A/A
                                                      READS/WRITES
                            Applications




                                                VIP
                                                                     Shared
                                                                     Storage




       • Reliability
            - Commonly used solution
       • Fault Tolerance
            - No single point of failure with appropriate hardware
       • High Availability
            - Data handled by a SAN or NAS and always available
            - Automatic fail-over
       • Simplified Management

Distributed Replicated Block Device




 • DRBD creates transaction-safe hot standby configuration
 • MySQL updates written to block device on the Active Server
 • DRBD synchronously replicates updates to the Passive Server
 • Linux Heartbeat fails over from Active to Passive in event of failure


Sharding aka Application Partitioning

  [Diagram: Clients, Partitioning Logic, Shards 1 to 5, Slaves; legend: Master, Slave, Reads, Writes]
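
The "Partitioning Logic" box in this slide lives in the application. One common way to implement it is to hash a key onto a fixed number of shards; the sketch below assumes the five shards shown in the diagram, and the function name `shard_for` and the choice of CRC32 are illustrative rather than anything the slides prescribe.

```python
import zlib


def shard_for(key: str, shard_count: int = 5) -> int:
    """Map a key (for example a user id) to a shard number in 1..shard_count."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % shard_count + 1


for user in ("alice", "bob", "carol"):
    print(user, "-> shard", shard_for(user))
```

The same key always lands on the same shard, but changing `shard_count` remaps most keys, which is one reason application-level sharding is harder to scale on-line than the auto-sharding offered by MySQL Cluster.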




Oracle VM Template for MySQL
              Integrated & Tested OS, VM and Database Stack

Fastest, simplest & most reliable way to deploy virtualized, cloud-ready MySQL instances, certified for production use

• Rapid DEPLOYMENT
• Increased RELIABILITY
• Higher AVAILABILITY
• Lower COST

[Diagram: Oracle VM guests, Oracle VM Server Pool, Oracle VM Servers]




Template Components
            Certified for Production Deployment

 • Oracle Linux 5 Update 6 with the Unbreakable Enterprise Kernel
 • Oracle VM 2.2.1
 • Oracle VM Manager 2.1.5
 • Oracle Cluster File System 2 (OCFS2)
 • MySQL Database 5.5.10 (Enterprise Edition)

 Pre-Installed & Pre-Configured
 Full Integration & QA Testing
 Single Point of Support

 [Diagram: Oracle VM guests, Automatic Fault Detection & Recovery, Secure Live Migration (SSL), Oracle VM Manager, Oracle VM Server Pool, Oracle VM Servers, ocfs2, SAN / iSCSI]



Positioning Current Solutions
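    Solutions compared: MySQL Replication, Heartbeat + DRBD, Oracle VM Template, MySQL Cluster

    Availability

    Platform Support
        • MySQL Replication: All Supported by MySQL Server
        • Heartbeat + DRBD: Linux
        • Oracle VM Template: Oracle Linux
        • MySQL Cluster: All Supported by MySQL Cluster
    Automated IP Failover
        • MySQL Replication: No
        • Heartbeat + DRBD: Yes
        • Oracle VM Template: Yes
        • MySQL Cluster: Depends on Connector and Configuration
    Automated Database Failover
        • MySQL Replication: No
        • Heartbeat + DRBD: Yes
        • Oracle VM Template: Yes
        • MySQL Cluster: Yes
    Automatic Data Resynchronization
        • MySQL Replication: No
        • Heartbeat + DRBD: Yes
        • Oracle VM Template: N/A - Shared Storage
        • MySQL Cluster: Yes
    Typical Failover Time
        • MySQL Replication: User / Script Dependent
        • Heartbeat + DRBD: Configuration Dependent, 60 seconds and Above
        • Oracle VM Template: Configuration Dependent, 60 seconds and Above
        • MySQL Cluster: 1 Second and Less
    Synchronous Replication
        • MySQL Replication: No, Asynchronous and Semi-Synchronous
        • Heartbeat + DRBD: Yes
        • Oracle VM Template: N/A - Shared Storage
        • MySQL Cluster: Yes
    Geographic Redundancy Support
        • MySQL Replication: Yes
        • Heartbeat + DRBD: Yes, via MySQL Replication
        • Oracle VM Template: Yes, via MySQL Replication
        • MySQL Cluster: Yes, via MySQL Replication

    Scalability

    Number of Nodes
        • MySQL Replication: One Master, Multiple Slaves
        • Heartbeat + DRBD: One Active (primary), one Passive (secondary) Node
        • Oracle VM Template: One Active (primary), one Passive (secondary) Node
        • MySQL Cluster: 255
    Built-in Load Balancing
        • MySQL Replication: Reads, via MySQL Replication
        • Heartbeat + DRBD: Reads, via MySQL Replication
        • Oracle VM Template: Reads, via MySQL Replication & During Failover
        • MySQL Cluster: Yes, Reads and Writes
    Read-Intensive Workloads
        • MySQL Replication: Yes
        • Heartbeat + DRBD: Yes
        • Oracle VM Template: Yes
        • MySQL Cluster: Yes
    Write-Intensive Workloads
        • MySQL Replication: Yes, via Application-Level Sharding
        • Heartbeat + DRBD: Yes, via Application-Level Sharding to Multiple Active/Passive Pairs
        • Oracle VM Template: Yes, via Application-Level Sharding to Multiple Active/Passive Pairs
        • MySQL Cluster: Yes, via Auto-Sharding
    Scale On-Line (add nodes, repartition, etc.)
        • MySQL Replication: No
        • Heartbeat + DRBD: No
        • Oracle VM Template: No
        • MySQL Cluster: Yes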




MySQL Cluster
              Real-time Carrier Grade Database




Customers & Applications

              • Web
                  –   User profile management
                  –   Session stores
                  –   eCommerce
                  –   On-Line Gaming
                  –   Application Servers
              • Telecoms
                  –   Subscriber Databases (HLR/HSS)
                  –   Service Delivery Platforms
                  –   VoIP, IPTV & VoD
                  –   Mobile Content Delivery
                  –   On-Line app stores and portals
                  –   IP Management
                  –   Payment Gateways



                                                       http://www.mysql.com/industry/telecom/



MySQL Cluster - NDB Storage Engine




MySQL Cluster Nodes
  [Diagram: SQL Based Applications connect over JDBC/ODBC to a MySQL/SQL Node; the SQL Node and three API Nodes use the NDB API to reach the MySQL Cluster of four Data Nodes; a Management Client uses the MGM API to reach the Management Node]



MySQL Cluster Nodes
  SQL Node (MySQL)
      • Standard SQL Interface
      • Scale-out for Performance
      • Enables Replication

  NDB API (Application)
      • High Performance
      • C, C++ & Java, LDAP, HTTP API
      • Developer’s Guide

  Data Node (NDB Storage Engine)
      • Data Storage (Memory/Disk)
      • Automatic & User-Defined Partitioning
      • Local & Global Checkpoints
      • Scale-out or scale-up for Capacity & Redundancy
      • Scale dynamically with on-line add node

  Management Node
      • Administration and Configuration
      • Arbitration
      • Use Two for Redundancy


Replication Flexibility


  • Synchronous replication within a Cluster node group for HA
  • Bi-directional asynchronous replication to remote Cluster for geographic redundancy
  • Asynchronous replication to non-Cluster databases for specialised activities such as report generation
  • Mix and match replication types

  [Diagram: Cluster 1 and Cluster 2, with MyISAM, MyISAM and InnoDB servers; legend: Synchronous replication, Asynchronous replication]




MySQL Cluster Loads

        MySQL Community Server        MySQL Cluster (GPL)        MySQL Cluster CGE
        MySQL Server             ≠    MySQL Server          =    MySQL Server
        InnoDB                   ≠    InnoDB                =    InnoDB
                                 ≠    Data Node             =    Data Node
                                 ≠    Mgmt Node             =    Mgmt Node

  • MySQL Cluster software (Management & Data Nodes) included with MySQL Community Server should not be used
  • MySQL Server included with MySQL Cluster loads is different to regular MySQL Server
      • Always use this special version of MySQL Server when accessing MySQL Cluster data
  • MySQL Cluster CGE downloaded from oem.mysql.com
  • GA GPL Community versions downloaded from www.mysql.com/downloads
  • In-development GPL Community versions downloaded from dev.mysql.com/downloads/




MySQL Cluster System Requirements
               System Component      Requirement

               Hosts                 Maximum of 255 total nodes (48 Data Nodes)
               Hardware              COTS – Advanced TCA; 32 & 64-bit x86 & SPARC
               Memory                Varies with size of database, # of hosts, # of replicas
               Storage               Shared-Nothing - Memory & Disk Data; SCSI or RAID for I/O performance
               Network               >1 Gigabit recommended, SCI supported
               Operating System      Linux (Red Hat, SuSE), Solaris, HP-UX, Mac OSX, Windows, others…



MySQL Cluster 6.2




MySQL Cluster 6.3




     http://dev.mysql.com/doc/mysql-cluster-excerpt/5.1/en/mysql-cluster-changes-5-1-ndb-6-3.html

MySQL Cluster 7.0 – GA April 2009




     http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster7_architecture.php

Scale out – multi core environments




MySQL Cluster vs MySQL MEMORY:
              30x Higher Throughput / 1/3rd the Latency on a single node




          • Table level locking inhibits MEMORY scalability beyond a single client
            connection
          • With check-pointing & logging enabled, MySQL Cluster still delivers durability
          • 4 socket server, 64GB RAM, running Linux


Scale-Out Reads & Writes on Commodity Hardware



                                      • NDB API Performance 4.33 M
                                        Queries per second!
                                      • 8 Intel servers, dual-6-core CPUs
                                        @2.93 GHz, 24GB RAM
                                      • 2 Data Nodes per server
                                      • flexAsync benchmark
                                        – 16 parallel threads, each issuing 256
                                          simultaneous transactions
                                        – Read / Write 100KB attribute
                                      • Interim results from 2 days testing –
                                        watch this space:
                                        mikaelronstrom.blogspot.com




MySQL Cluster CGE 7.1 – Key Enhancements




    http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster7_architecture.php

MySQL Cluster 7.1 Momentum

          • 1,000 Downloads per Day
          • Windows GA
          • Pro-active Cluster Monitoring
          • Fully Automated Management
          • 10x Higher Java Performance

          “MySQL Cluster 7.1 gave us the perfect combination of extreme levels
          of transaction throughput, low latency & carrier-grade availability,
          while reducing TCO”
          Phani Naik, Pyro Group




MySQL Cluster 7.1: ndbinfo


          • New database (ndbinfo) which presents real-time metric data
            in the form of tables
          • Exposes new information together with providing a simpler,
            more consistent way to access existing data
          • Examples include:
            • Resource usage (memory, buffers)
            • Event counters (such as number of READ operations since last restart)
            • Data node status and connection status

          mysql> use ndbinfo
          mysql> show tables;
          +-------------------+
          | Tables_in_ndbinfo |
          +-------------------+
          | blocks            |
          | config_params     |
          | counters          |
          | logbuffers        |
          | logspaces         |
          | memoryusage       |
          | nodes             |
          | resources         |
          | transporters      |
          +-------------------+




MySQL Cluster 7.1: ndbinfo


       • Example 1: Check memory usage/availability


                      mysql> select * from ndbinfo.memoryusage;
                      +---------+--------------+--------+------------+-----------+-------------+
                      | node_id | memory_type  | used   | used_pages | total     | total_pages |
                      +---------+--------------+--------+------------+-----------+-------------+
                      |       3 | Data memory  | 917504 |         28 | 104857600 |        3200 |
                      |       3 | Index memory | 221184 |         27 |  11010048 |        1344 |
                      |       4 | Data memory  | 917504 |         28 | 104857600 |        3200 |
                      |       4 | Index memory | 221184 |         27 |  11010048 |        1344 |
                      +---------+--------------+--------+------------+-----------+-------------+



        • Note that there is a Data memory row and an Index memory row for each data node in the
          cluster
        • If the Cluster is nearing the configured limit then increase the DataMemory and/or
          IndexMemory parameters in config.ini and then perform a rolling restart
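
As a rough sketch of the check described above, the snippet below works out the percentage used for each row of the ndbinfo.memoryusage output shown on this slide. The 80% warning threshold and the function name are assumptions made for this example only; they are not MySQL Cluster defaults.

```python
# Rows copied from the ndbinfo.memoryusage output on this slide:
# (node_id, memory_type, used_bytes, total_bytes)
ROWS = [
    (3, "Data memory", 917504, 104857600),
    (3, "Index memory", 221184, 11010048),
    (4, "Data memory", 917504, 104857600),
    (4, "Index memory", 221184, 11010048),
]

WARN_PERCENT = 80.0  # assumed threshold, chosen only for this example


def usage_report(rows, warn_percent=WARN_PERCENT):
    """Return (node_id, memory_type, percent_used, needs_attention) per row."""
    report = []
    for node_id, memory_type, used, total in rows:
        percent = 100.0 * used / total
        report.append((node_id, memory_type, percent, percent >= warn_percent))
    return report


if __name__ == "__main__":
    for node_id, memory_type, percent, warn in usage_report(ROWS):
        advice = "consider raising DataMemory/IndexMemory" if warn else "ok"
        print(f"node {node_id} {memory_type}: {percent:.2f}% used ({advice})")
```

With the figures on the slide, every row is far below the assumed threshold, so no configuration change would be suggested.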




MySQL Cluster 7.1: ndbinfo


       • Example 2: Check how many table scans performed on each data node since the last restart


                            mysql> select node_id as 'data node', val as 'Table Scans' from ndbinfo.counters
                                where counter_name='TABLE_SCANS';
                            +-----------+-------------+
                            | data node | Table Scans |
                            +-----------+-------------+
                            |         3 |           3 |
                            |         4 |           4 |
                            +-----------+-------------+



        • You might check this if your database performance is lower than anticipated
        • If this figure is rising faster than you expected then examine your application to understand
          why there are so many table scans




Latest news on MySQL Cluster 7.1

              • As of MySQL Cluster 7.1.9a:
                  • InnoDB plugin included
                  • New view in ndbinfo:
                    mysql> SELECT node_id, page_requests_direct_return AS hit, page_requests_wait_io AS miss,
                       100*page_requests_direct_return/(page_requests_direct_return+page_requests_wait_io) AS
                       hit_rate FROM ndbinfo.diskpagebuffer;
                    +---------+------+------+----------+
                    | node_id | hit  | miss | hit_rate |
                    +---------+------+------+----------+
                    |       3 |    6 |    3 |  66.6667 |
                    |       4 |   10 |    3 |  76.9231 |
                    +---------+------+------+----------+

              • MEM2.3 includes new Cluster Advisor/graphs
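
The hit_rate column in the ndbinfo.diskpagebuffer query above is plain arithmetic, and the short sketch below reproduces it for the two rows shown. The function name is made up for illustration; only the formula comes from the query.

```python
def hit_rate(direct_return, wait_io):
    """Percentage of page requests answered straight from the buffer.

    Mirrors 100*hit/(hit+miss) from the query on this slide.
    """
    total = direct_return + wait_io
    if total == 0:
        return None  # no requests recorded yet, so the rate is undefined
    return 100.0 * direct_return / total


if __name__ == "__main__":
    # (node_id, hit, miss) taken from the result set on this slide
    for node_id, hit, miss in [(3, 6, 3), (4, 10, 3)]:
        print(f"node {node_id}: hit rate {hit_rate(hit, miss):.4f}%")
```

A low rate means many requests had to wait for disk I/O, which is the kind of signal this view is meant to expose.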




MySQL Enterprise Monitor 2.3




Online Operations

              • Scale the cluster for throughput or capacity
                  – Data and SQL Nodes
              •   Repartition tables
              •   Recover failed nodes
              •   Upgrade / patch servers & OS
              •   Upgrade / patch MySQL Cluster
              •   Back-Up
              •   Evolve the schema on-line, in real-time




Real-Time, On-Line Schema Changes

    • Fully online – transaction response times unchanged
    • Add and remove indexes, add new columns and tables
    • No temporary table creation
    • No recreation of data or deletion required
    • Faster and better performing table maintenance operations
    • Lower memory and disk requirements

    CREATE OFFLINE INDEX b ON t1(b);
    Query OK, 1356 rows affected (2.20 sec)

    DROP OFFLINE INDEX b ON t1;
    Query OK, 1356 rows affected (2.03 sec)

    CREATE ONLINE INDEX b ON t1(b);
    Query OK, 0 rows affected (0.58 sec)

    DROP ONLINE INDEX b ON t1;
    Query OK, 0 rows affected (0.46 sec)

    ALTER ONLINE TABLE t1 ADD COLUMN d INT;
    Query OK, 0 rows affected (0.36 sec)




Performance | Flexibility | Simplification
    • SQL and NoSQL Access Methods to tables
         – SQL: complex queries, rich ecosystem of apps & expertise
         – Simple Key/Value interfaces bypassing SQL layer for blazing fast reads & writes
         – Real-time interfaces for micro-second latency
         – Developers free to work in their preferred environment




Scaling Distributed Joins                                                                                                  7.2DM
              Adaptive Query Localization

              • ‘Complex’ joins traditionally slower in MySQL Cluster
                  – Complex = lots of levels and interim results in JOIN
              • JOIN was implemented in the MySQL Server:
                  – Nested Loop join
                  – When data is needed, it must be fetched over the network from the
                    Data Nodes, row by row
                  – This causes latency and consumes resources
              • Can now push the execution down into the data nodes, greatly reducing
                the network trips
              • 25x-40x performance gain in customer PoC!

              [Diagram labels: mysqld, Data Nodes, AQL]

              The existence, content and timing of future releases described here is included for
              information only and may be changed at Oracle's discretion.


http://www.mysql.com/news-and-events/on-demand-webinars/display-od-583.html
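
To make the "row by row" cost on this slide concrete, here is a deliberately simplified model that counts network round trips for a nested loop join run in the MySQL Server, compared with a join pushed down to the data nodes. It is a toy sketch, not how NDB accounts for messages: the function names, the uniform fan-out and the "one request for a pushed join" figure are all assumptions, and real batching is ignored.

```python
def nested_loop_round_trips(outer_rows, matches_per_row, levels):
    """Round trips when mysqld drives the join and fetches row by row.

    Toy model: one request to scan the first table, then one lookup per
    intermediate row at every further join level.
    """
    trips = 1  # initial scan of the first table
    rows = outer_rows
    for _ in range(levels - 1):
        trips += rows            # one lookup for each row produced so far
        rows *= matches_per_row  # rows that feed the next join level
    return trips


def pushed_down_round_trips():
    """Round trips when the whole join is sent to the data nodes (assumed: one)."""
    return 1


if __name__ == "__main__":
    manual = nested_loop_round_trips(outer_rows=100, matches_per_row=2, levels=3)
    pushed = pushed_down_round_trips()
    print(f"nested loop in mysqld: {manual} round trips; pushed down: {pushed}")
```

Even in this crude model the round-trip count grows with every join level, which is why deep joins with large interim results benefit most from being pushed down.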

Adaptive Query Localization: Current Limitations

              • Columns to be joined
                  – must use exactly the same data type
                  – cannot be any of the BLOB or TEXT types
                  – must be part of a table index or primary key
              • AQL can be disabled using the ndb_join_pushdown server system variable
                  – enabled by default








                                                     Early Adopter Speaks!

“Testing of Adaptive Query Localization has yielded over 20x
higher performance on complex queries within our application,
enabling Docudesk to expand our use of MySQL Cluster into a
broader range of highly dynamic web services.”

Casey Brown
Manager, Development & DBA Services, Docudesk



MySQL Cluster: SQL & NoSQL Combined




                                                                      Mix & Match!
                                                                    Same data accessed
                                                                   simultaneously through
                                                                   SQL & NoSQL interfaces

• NoSQL – Multiple ways to bypass SQL, and maximize performance:
   • NDB API. C++ for highest performance, lowest latency
   • Cluster/J for optimized access in Java
   • NEW! Memcached. Use all your existing clients/applications




Which to Choose ?




Performance




NoSQL With NDB API
              Best possible performance

              • Application embeds the NDB API C++ interface library
              • NDB API makes intelligent decisions (where possible) about which
                data node to send queries to
                  – With a little planning in the schema design, achieve linear scalability
              • Used by all of the other application nodes (MySQL, LDAP, ClusterJ,…)
              • Best possible performance but requires greater development skill
              • Favourite API for real-time network applications
              • Foundation for all interfaces

              [Diagram labels: Clients; Applications with embedded NDB API Library;
               MySQL Cluster Data Nodes]




NoSQL with memcached                                                          7.2DM



              • Memcached is a distributed memory based hash-key/value store
                with no persistence to disk
              • NoSQL, simple API, popular with developers
              • MySQL Cluster already provides scalable, in-memory performance
                with NoSQL (hashed) access as well as persistence
                • Provide the Memcached API but map to NDB API calls
              • Writes-in-place, so no need to invalidate cache
              • Simplifies architecture as caching & database integrated into 1 tier
              • Access data from existing relational tables

              [Diagram label: Memcached protocol]




NoSQL with Memcached                                                         7.2DM
              Pre-GA version available from labs.mysql.com




   Flexible:
        •   Deployment options
        •   Multiple Clusters
        •   Simultaneous SQL Access
        •   Can still cache in Memcached server
        •   Flat key-value store or map to multiple tables/columns

   Simple:
        set maidenhead 0 0 3
        SL6
        STORED

        get maidenhead
        VALUE maidenhead 0 3
        SL6
        END
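
The set/get exchange on this slide follows the standard memcached text protocol, so any existing client produces the same bytes. The sketch below builds and parses those messages without opening a connection; the helper names are invented for this example and it handles only the single-key case shown here.

```python
def build_set(key, value, flags=0, exptime=0):
    """Build a text-protocol 'set' request for a bytes value."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key):
    """Build a text-protocol 'get' request for one key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(response):
    """Parse a single-key 'get' reply; return the value bytes, or None on a miss."""
    if response == b"END\r\n":
        return None
    header, rest = response.split(b"\r\n", 1)
    _verb, _key, _flags, length = header.split(b" ")
    return rest[: int(length)]


if __name__ == "__main__":
    print(build_set("maidenhead", b"SL6"))
    print(build_get("maidenhead"))
    print(parse_get_response(b"VALUE maidenhead 0 3\r\nSL6\r\nEND\r\n"))
```

The two zeros in `set maidenhead 0 0 3` are the flags and expiry time, and the 3 is the length in bytes of the value `SL6`.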




MySQL Cluster Manager 1.1 Features




   Delivered as part of MySQL Cluster CGE 7.1

How Does MySQL Cluster Manager Help ?
                  Example: Initiating upgrade from MySQL Cluster 6.3 to
                  7.1

             Before MySQL Cluster Manager
           • 1 x preliminary check of cluster state
           • 8 x ssh commands per server
           • 8 x per-process stop commands
           • 4 x scp of configuration files (2 x mgmd & 2 x mysqld)
           • 8 x per-process start commands
           • 8 x checks for started and re-joined processes
           • 8 x process completion verifications
           • 1 x verify completion of the whole cluster
           • Excludes manual editing of each configuration file
                   Total: 46 commands - 2.5 hours of attended operation

             With MySQL Cluster Manager
           upgrade cluster --package=7.1 mycluster;
                   Total: 1 Command - Unattended Operation

           • Results
              • Reduces the overhead and complexity of managing database clusters
              • Reduces the risk of downtime resulting from administrator error
              • Automates best practices in database cluster management
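
The "46 commands" total can be checked by adding up the manual steps listed on this slide. The snippet below does only that sum; the dictionary labels paraphrase the bullets and are not commands.

```python
# Manual step counts as listed under "Before MySQL Cluster Manager"
MANUAL_STEPS = {
    "preliminary check of cluster state": 1,
    "ssh commands": 8,
    "per-process stop commands": 8,
    "scp of configuration files": 4,
    "per-process start commands": 8,
    "checks for started and re-joined processes": 8,
    "process completion verifications": 8,
    "verify completion of the whole cluster": 1,
}

WITH_CLUSTER_MANAGER = 1  # the single "upgrade cluster" command

total_manual = sum(MANUAL_STEPS.values())

if __name__ == "__main__":
    print(f"manual: {total_manual} commands, with Cluster Manager: {WITH_CLUSTER_MANAGER}")
```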




Terms used by MySQL Cluster Manager


          • Site: the set of physical hosts which are to run Cluster processes to be
            managed by MySQL Cluster Manager. A site can include 1 or more clusters.
          • Cluster: represents a MySQL Cluster deployment. A Cluster contains 1 or more
            processes running on 1 or more hosts
          • Host: Physical machine, running the MySQL Cluster Manager agent
          • Agent: The MySQL Cluster Manager process running on each host
          • Process: an individual MySQL Cluster node; one of: ndb_mgmd, ndbd, ndbmtd,
            mysqld & ndbapi*
          • Package: A copy of a MySQL Cluster installation directory as downloaded from
            mysql.com, stored on each host
          *ndbapi is a special case, representing a slot for an external application
           process to connect to the cluster using the NDB API

          [Diagram: one Site containing four Hosts, each running an agent, and two
           Clusters whose Processes are spread across the Hosts]




Example configuration
          • MySQL Cluster Manager agent runs on each physical host
          • No central process for Cluster Manager – agents co-operate, each one
            responsible for its local nodes
          • Agents are responsible for managing all nodes in the cluster
          • Management responsibilities
            •   Starting, stopping & restarting nodes
            •   Configuration changes
            •   Upgrades
            •   Host & Node status reporting
            •   Recovering failed nodes

     Diagram (mysql client plus four hosts):

          Host             Processes
          192.168.0.10     1. ndb_mgmd, 7. mysqld, agent
          192.168.0.11     2. ndb_mgmd, 8. mysqld, agent
          192.168.0.12     3. ndbd, 5. ndbd, agent
          192.168.0.13     4. ndbd, 6. ndbd, agent

     Legend:

          n. mysqld        MySQL Server (ID=n)
          n. ndb_mgmd      Management Node (ID=n)
          n. ndbd          Data Node (ID=n)
          agent            MySQL Cluster Manager agent




Creating & Starting a Cluster


    1. Define the site:
        mysql> create site --hosts=192.168.0.10,192.168.0.11,192.168.0.12,192.168.0.13 mysite;

    2. Expand the MySQL Cluster tar-ball(s) from mysql.com to known directory

    3. Define the package(s):
        mysql> add package --basedir=/usr/local/mysql_6_3_26 6.3;
        mysql> add package --basedir=/usr/local/mysql_7_0_7 7.0;
       Note that the basedir should match the directory used in Step 2.

    4. Create the Cluster:
        mysql> create cluster --package=6.3 --processhosts=ndb_mgmd@192.168.0.10,ndb_mgmd@192.168.0.11,ndbd@192.168.0.12,ndbd@192.168.0.13,ndbd@192.168.0.12,ndbd@192.168.0.13,mysqld@192.168.0.10,mysqld@192.168.0.11 mycluster;
       This is where you define what nodes/processes make up the Cluster and where
       they should run

    5. Start the Cluster:
        mysql> start cluster mycluster;

    [Diagram: mysql client; hosts 192.168.0.10 (1. ndb_mgmd, 7. mysqld, agent),
     192.168.0.11 (2. ndb_mgmd, 8. mysqld, agent), 192.168.0.12 (3. ndbd, 5. ndbd, agent),
     192.168.0.13 (4. ndbd, 6. ndbd, agent)]
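
The long --processhosts value is easy to mistype by hand, so one option is to generate the command text from a list of (process, host) pairs. The sketch below only builds the string for the create cluster command on this slide; the helper name is made up, and the mysqld processes are placed on 192.168.0.10 and 192.168.0.11, the hosts where the diagram shows them.

```python
def build_create_cluster(package, cluster, layout):
    """Return the 'create cluster' command text.

    layout is a list of (process_type, host) pairs in the order the
    processes should be listed.
    """
    processhosts = ",".join(f"{ptype}@{host}" for ptype, host in layout)
    return f"create cluster --package={package} --processhosts={processhosts} {cluster};"


# Layout matching the four-host diagram: 2 management, 4 data, 2 SQL nodes
LAYOUT = [
    ("ndb_mgmd", "192.168.0.10"),
    ("ndb_mgmd", "192.168.0.11"),
    ("ndbd", "192.168.0.12"),
    ("ndbd", "192.168.0.13"),
    ("ndbd", "192.168.0.12"),
    ("ndbd", "192.168.0.13"),
    ("mysqld", "192.168.0.10"),
    ("mysqld", "192.168.0.11"),
]

if __name__ == "__main__":
    print(build_create_cluster("6.3", "mycluster", LAYOUT))
```

The generated text can then be pasted into the client session connected to the MySQL Cluster Manager agent.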




Upgrade Cluster

          • Upgrade from MySQL Cluster 6.3.26 to 7.0.7:
              mysql> upgrade cluster --package=7.0 mycluster;
          • Automatically upgrades each node and restarts the process – in the
            correct order to avoid any loss of service
          • Without MySQL Cluster Manager, the administrator must stop each process
            in turn, start the process with the new version and wait for the node
            to restart before moving onto the next one

          [Diagram: mysql client; hosts 192.168.0.10 (1. ndb_mgmd, 7. mysqld, agent),
           192.168.0.11 (2. ndb_mgmd, 8. mysqld, agent), 192.168.0.12 (3. ndbd, 5. ndbd, agent),
           192.168.0.13 (4. ndbd, 6. ndbd, agent)]
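
The manual procedure described on this slide (stop one process, start it on the new version, wait for it to rejoin, move on) can be written out as a plan. The sketch below is illustrative only: the ordering of management nodes, then data nodes, then SQL nodes is an assumption based on the usual rolling-upgrade sequence, the step wording is invented, and it ignores refinements such as node groups.

```python
# Assumed ordering for a rolling upgrade: management, then data, then SQL nodes
UPGRADE_ORDER = ["ndb_mgmd", "ndbd", "mysqld"]


def rolling_upgrade_plan(processes):
    """Return the manual steps, one process at a time.

    processes is a list of (node_id, process_type) pairs.
    """
    plan = []
    for wanted in UPGRADE_ORDER:
        for node_id, ptype in sorted(processes):
            if ptype != wanted:
                continue
            plan.append(f"stop {ptype} {node_id}")
            plan.append(f"start {ptype} {node_id} with new version")
            plan.append(f"wait for {ptype} {node_id} to rejoin")
    return plan


# Node IDs as numbered in the diagram on this slide
PROCESSES = [
    (1, "ndb_mgmd"), (2, "ndb_mgmd"),
    (3, "ndbd"), (4, "ndbd"), (5, "ndbd"), (6, "ndbd"),
    (7, "mysqld"), (8, "mysqld"),
]

if __name__ == "__main__":
    for step in rolling_upgrade_plan(PROCESSES):
        print(step)
```

For the eight processes in the diagram this yields 24 attended steps, all of which the single upgrade cluster command replaces.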




MySQL Cluster Manager
              GA 1st November 2010
  [Diagram: before and after on-line add-node. Before: a mysqld and a Mgmt Node on
   each of 33 and 34, Data Nodes on 31 and 32. After: a second mysqld on each of
   33 and 34, plus new Data Nodes on 35 and 36.]




  • On-line add-node
          mysql> add hosts --hosts=192.168.0.35,192.168.0.36 mysite;
          mysql> add package --basedir=/usr/local/mysql_7_0_7 --hosts=192.168.0.35,192.168.0.36 7.0;
          mysql> add process --processhosts=mysqld@192.168.0.33,mysqld@192.168.0.34,ndbd@192.168.0.35,ndbd@192.168.0.36 mycluster;
          mysql> start process --added mycluster;
  • Restart optimizations
       • Fewer nodes restarted on some parameter changes




General Design Considerations


          • MySQL Cluster is designed for
               – Short transactions
               – Many parallel transactions
          • Use simple access patterns to fetch data
               – Use efficient scans and batching interfaces
          • Analyze what your most typical use cases are
               – optimize for those




                                   Overall design goal
                            Minimize network roundtrips for your
                                 most important requests!


© 2011 Oracle Corporation                                          74
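
The design goal above can be made concrete with a little arithmetic. The sketch below is a simplified cost model, not a measurement: it assumes every request costs one network round trip and that a batching interface can carry up to `batch_size` primary-key lookups per request. The function names and the example figures are hypothetical.

```python
import math


def round_trips(num_rows: int, batch_size: int = 1) -> int:
    """Round trips needed to fetch num_rows when one request carries
    up to batch_size primary-key lookups (simplified model)."""
    if num_rows < 0:
        raise ValueError("num_rows must be non-negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(num_rows / batch_size)


def network_time_ms(num_rows: int, batch_size: int, rtt_ms: float) -> float:
    """Time spent purely on round trips; server-side work is ignored."""
    return round_trips(num_rows, batch_size) * rtt_ms


if __name__ == "__main__":
    # Hypothetical workload: 1000 lookups over a link with 0.2 ms round-trip time.
    for batch in (1, 10, 100):
        print(batch, round_trips(1000, batch), network_time_ms(1000, batch, 0.2))
```

In this model the network time falls in proportion to the batch size, which is why batching interfaces and efficient scans matter most for the requests that run most often.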
Best Practice : Primary Keys


          • To avoid problems with
               • Cluster-to-Cluster replication
               • Recovery
               • Application behavior (KEY NOT FOUND, etc.)
          • ALWAYS DEFINE A PRIMARY KEY ON THE TABLE!
               • A hidden PRIMARY KEY is added if no PK is specified, BUT...
                  • ... this is NOT recommended
                  • The hidden primary key is, for example, not replicated (between Clusters)!
                  • There are problems in this area, so avoid them!
          • So always have at least
            id BIGINT AUTO_INCREMENT PRIMARY KEY
               • Even if you don't “need” it for your applications




© 2011 Oracle Corporation                                                                       75
Best Practice: Distribution Aware Apps
  • Partition selected using hash on Partition Key
     • Primary Key by default
     • User can override in table definition
  • MySQL Server (or NDB API) will attempt to send transaction to the correct data node
     • If all data for the transaction are in the same partition, less messaging -> faster
  • Aim to have all rows for high-running queries in same partition

  SELECT SUM(population) FROM towns
  WHERE country='UK';

     Primary Key: (town, country)     Partition Key: country

     town          country    population
     Maidenhead    UK         78000
     Paris         France     2193031
     Boston        UK         58124
     Boston        USA        617594

  SELECT SUM(population) FROM towns
  WHERE town='Boston';

     Primary Key: (town, country)     Partition Key: town

     town          country    population
     Maidenhead    UK         78000
     Paris         France     2193031
     Boston        UK         58124
     Boston        USA        617594

© 2011 Oracle Corporation                                                                             76
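
A minimal sketch of why the choice of partition key matters for the two queries on this slide. It assumes four partitions and uses CRC32 as a stand-in hash; the hash function MySQL Cluster actually uses is not described in the slides, and `partition_for` and `partitions_touched` are hypothetical helper names.

```python
import zlib

# Rows from the slide's towns table.
TOWNS = [
    {"town": "Maidenhead", "country": "UK", "population": 78000},
    {"town": "Paris", "country": "France", "population": 2193031},
    {"town": "Boston", "country": "UK", "population": 58124},
    {"town": "Boston", "country": "USA", "population": 617594},
]


def partition_for(key_values, num_partitions=4):
    """Map partition-key column values to a partition number.
    CRC32 is only a stand-in for the real (unspecified) hash."""
    data = "\x1f".join(str(v) for v in key_values).encode("utf-8")
    return zlib.crc32(data) % num_partitions


def partitions_touched(rows, partition_key, predicate, num_partitions=4):
    """Return the set of partitions holding rows that match predicate."""
    return {
        partition_for([row[col] for col in partition_key], num_partitions)
        for row in rows
        if predicate(row)
    }


if __name__ == "__main__":
    # Partition key = country: all UK rows hash to the same partition.
    print(partitions_touched(TOWNS, ("country",), lambda r: r["country"] == "UK"))
    # Partition key = full primary key: UK rows may be spread over partitions.
    print(partitions_touched(TOWNS, ("town", "country"), lambda r: r["country"] == "UK"))
```

When the partition key matches the column in the WHERE clause, the matching rows always land in a single partition, so the query can be served with less messaging between data nodes.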
Best Practice: Distribution Aware – Multiple Tables


  • Extend partition awareness over multiple tables
  • Same rule – aim to have all data for an instance of a high-running transaction in the same partition

     Primary Key: sub_id     Partition Key: sub_id

     sub_id    age    gender
     19724     25     male
     84539     43     female
     19724     16     female
     74574     21     female

     Primary Key: (service, sub_id)     Partition Key: sub_id

     service     sub_id    svc_id
     twitter     19724     76325732
     twitter     84539     67324782
     facebook    19724     83753984
     facebook    73642     87324793

     ALTER TABLE service_ids
       PARTITION BY KEY(sub_id);




© 2011 Oracle Corporation                                                                      77
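
The effect of the ALTER TABLE above can be sketched as follows. This is an illustration under assumptions: four partitions, CRC32 as a stand-in for the real hash, and a hypothetical helper `partitions_for_subscriber` that collects every partition a transaction for one subscriber would touch across both tables.

```python
import zlib

# Rows from the slide's service_ids table: (service, sub_id, svc_id).
SERVICES = [
    ("twitter", 19724, 76325732),
    ("twitter", 84539, 67324782),
    ("facebook", 19724, 83753984),
    ("facebook", 73642, 87324793),
]


def partition_for(key_values, num_partitions=4):
    """Stand-in hash partitioning (CRC32 is an assumption, not the NDB hash)."""
    data = "\x1f".join(str(v) for v in key_values).encode("utf-8")
    return zlib.crc32(data) % num_partitions


def partitions_for_subscriber(sub_id, services_by_sub_id):
    """Partitions touched when reading one subscriber and all of its services.

    services_by_sub_id=True models PARTITION BY KEY(sub_id) on service_ids;
    False models the default of partitioning on the full primary key.
    """
    touched = {partition_for([sub_id])}  # the subscriber row itself
    for service, row_sub_id, _svc_id in SERVICES:
        if row_sub_id == sub_id:
            key = [row_sub_id] if services_by_sub_id else [service, row_sub_id]
            touched.add(partition_for(key))
    return touched


if __name__ == "__main__":
    print(partitions_for_subscriber(19724, services_by_sub_id=True))
    print(partitions_for_subscriber(19724, services_by_sub_id=False))
```

With both tables partitioned on sub_id, everything belonging to one subscriber hashes to the same partition; with the default key the service rows may end up elsewhere.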
MySQL Cluster
              Internals




© 2011 Oracle Corporation     78
Automatic Data Partitioning
  Table T1: 4 Partitions (P1 to P4) * 2 Replicas = 8 Fragments

     Node Group      Data Node      Primary Fragment    Secondary Fragment (fragment replica)
     Node Group 1    Data Node 1    F1                  F3
     Node Group 1    Data Node 2    F3                  F1
     Node Group 2    Data Node 3    F2                  F4
     Node Group 2    Data Node 4    F4                  F2

  - Node groups are created automatically
  - # of groups = # of data nodes / # of replicas




 © 2011 Oracle Corporation                                                                                               79
Automatic Data Partitioning
  Table T1: 4 Partitions (P1 to P4) * 2 Replicas = 8 Fragments, spread over Data Nodes 1 to 4

  • A fragment is a copy of a partition (aka fragment replica)
  • Number of fragments = # of partitions * # of replicas


© 2011 Oracle Corporation                                                                      80
Automatic Data Partitioning
  Table T1: 4 Partitions * 2 Replicas = 8 Fragments
     Node Group 1: Data Node 1 (F1, F3), Data Node 2 (F3, F1)
     Node Group 2: Data Node 3 (F2, F4), Data Node 4 (F4, F2)

  As long as one data node in each node group is running, we have a complete copy of the data




© 2011 Oracle Corporation                                                                                      91
Automatic Data Partitioning
  Table T1: 4 Partitions * 2 Replicas = 8 Fragments
     Node Group 1: Data Node 1 (F1, F3), Data Node 2 (F3, F1)
     Node Group 2: Data Node 3 (F2, F4), Data Node 4 (F4, F2)

  If all data nodes in a node group are down:
  - No complete copy of the data
  - Cluster shuts down automatically




 © 2011 Oracle Corporation                                                                                     94
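
The availability rule on these slides can be written as a one-line check. This is a sketch of the rule as stated, not of the cluster's actual failure handling; `cluster_has_complete_copy` is a hypothetical name, and the node groups are the ones from the diagram.

```python
def cluster_has_complete_copy(node_groups, running_nodes):
    """True if at least one data node in every node group is running."""
    running = set(running_nodes)
    return all(any(node in running for node in group) for group in node_groups)


if __name__ == "__main__":
    groups = [[1, 2], [3, 4]]  # Node Group 1 and Node Group 2 from the slides
    print(cluster_has_complete_copy(groups, {1, 3}))  # one survivor per group
    print(cluster_has_complete_copy(groups, {1, 2}))  # Node Group 2 is gone
```

Losing half of the data nodes is survivable only if the failures are spread so that each node group keeps one member.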
Data Partitioning

         • Automatic distribution/partitioning
             – Primary Key hash value (partitioning by Key)
         • Transparent load balancing
             – Distribution awareness
                • Data Node chosen based on PK hash value
             – Or proximity (SQL Node - shared memory, localhost, remote host)
         • Support for user defined partitioning
         • Key Concepts
             – Partition
                • Horizontal
                • # of partitions = # of data nodes
             – Fragment
                • Copy of a partition
             – Replica
                • Complete copy of the data
             – Node Group
                • Groups data nodes (automatically)
                • Determined by the order in configuration file
                • # of groups = # of data nodes / # of replicas

         [Diagram: Table T1, 4 Partitions * 2 Replicas = 8 Fragments; Node Group 1 = Data Nodes 1 and 2 (F1, F3), Node Group 2 = Data Nodes 3 and 4 (F2, F4)]


© 2011 Oracle Corporation                                                                                                                                                           95
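
The key concepts above can be tied together in a short sketch that reproduces the layout in the diagram. It assumes that node groups are formed from consecutive data nodes in configuration order, and that the first fragment listed under each data node in the diagram is its primary. The assignment rule in `fragment_layout` is chosen only because it matches the picture; it is not claimed to be the algorithm MySQL Cluster uses internally.

```python
def node_groups(data_nodes, num_replicas):
    """Group data nodes, in configuration order, into groups of num_replicas."""
    if num_replicas < 1 or len(data_nodes) % num_replicas != 0:
        raise ValueError("# of data nodes must be a multiple of # of replicas")
    return [
        data_nodes[i:i + num_replicas]
        for i in range(0, len(data_nodes), num_replicas)
    ]


def fragment_layout(data_nodes, num_replicas, num_partitions):
    """Assign primary and secondary fragments to data nodes.

    Partitions are dealt round-robin over node groups; inside a group the
    primary role rotates between the nodes (an assumption that happens to
    match the slide's diagram).
    """
    groups = node_groups(data_nodes, num_replicas)
    layout = {node: {"primary": [], "secondary": []} for node in data_nodes}
    for p in range(num_partitions):
        group = groups[p % len(groups)]
        primary_index = (p // len(groups)) % num_replicas
        for index, node in enumerate(group):
            role = "primary" if index == primary_index else "secondary"
            layout[node][role].append(f"F{p + 1}")
    return layout


if __name__ == "__main__":
    print(node_groups([1, 2, 3, 4], 2))
    for node, fragments in fragment_layout([1, 2, 3, 4], 2, 4).items():
        print(node, fragments)
```

With 4 data nodes and 2 replicas this yields 2 node groups and 8 fragments in total, the same numbers as on the slide.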
Internal Replication

              • Replication between Data Nodes
              • Synchronous Replication
                  – To ensure minimal failover time
                  – Data Nodes have the same information at the same point in time
                  – Achieved by Two-phase commit protocol
              • Two-phase commit
                  – 1. Prepare/update phase
                     • All fragments (primary/secondary) get updated
                  – 2. Commit phase
                     • The changes are committed
                  – Every Data Node has a Transaction Coordinator (TC)
                  – One is elected to be the transaction coordinator
                  – The information goes from the TC to the primary fragments
                    and further to the secondary fragments




© 2011 Oracle Corporation                                                                             96
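
The two phases described above can be illustrated with a toy, in-memory model. This is a generic two-phase commit sketch, not the NDB implementation: the `FragmentReplica` class, the `fail_prepare` switch and the order in which replicas are committed are all assumptions made for illustration.

```python
class FragmentReplica:
    """Toy stand-in for one fragment replica (primary or secondary)."""

    def __init__(self, name, fail_prepare=False):
        self.name = name
        self.fail_prepare = fail_prepare  # hypothetical switch to simulate a failure
        self.pending = {}    # txn_id -> (key, value), prepared but not committed
        self.committed = {}  # key -> value, visible data

    def prepare(self, txn_id, key, value):
        if self.fail_prepare:
            return False
        self.pending[txn_id] = (key, value)
        return True

    def commit(self, txn_id):
        key, value = self.pending.pop(txn_id)
        self.committed[key] = value

    def abort(self, txn_id):
        self.pending.pop(txn_id, None)


def two_phase_write(txn_id, key, value, primary, secondaries):
    """Coordinator logic: prepare primary then secondaries, then commit all."""
    replicas = [primary, *secondaries]
    prepared = []
    # Phase 1: prepare/update every fragment replica.
    for replica in replicas:
        if not replica.prepare(txn_id, key, value):
            for done in prepared:
                done.abort(txn_id)
            return False
        prepared.append(replica)
    # Phase 2: all replicas are prepared, so the change is committed everywhere.
    for replica in replicas:
        replica.commit(txn_id)
    return True


if __name__ == "__main__":
    f1_primary = FragmentReplica("F1 primary")
    f1_secondary = FragmentReplica("F1 secondary")
    print(two_phase_write(1, "pk-1", "row data", f1_primary, [f1_secondary]))
    print(f1_primary.committed == f1_secondary.committed)
```

Because nothing becomes visible until every replica has prepared, both copies of a fragment hold the same data after a commit, which is what keeps failover time minimal.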
Internal Replication: Prepare Phase

  [Diagram: two Data Nodes handling "insert into T1 values (...)". Each Data Node has a Transaction Coordinator, a Local Query Handler (LQH), ACC (Index Memory) and TUP (Data Memory); each holds fragments F1 and F2. The arrows are numbered 1 to 4.]

     1. Calculate hash on PK
     2. Forward request to LQH where primary fragment is
     3. Prepare secondary fragment
     4. Prepare phase done




© 2011 Oracle Corporation                                                                                          97
Internal Replication: Commit Phase

  [Diagram: the same two Data Nodes handling "insert into T1 values (...)", each with a Transaction Coordinator, a Local Query Handler, ACC (Index Memory) and TUP (Data Memory), holding fragments F1 and F2. The commit-phase arrows are numbered 1 to 4.]




© 2011 Oracle Corporation                                                                                        98
Transactions

              • Transaction Coordinator
                  – The elected TC starts the transaction
                  – TC calculates a hash on the primary key
                  – Each transaction contains one or more Read/Insert/Update or Delete
                    Operations
                  – Operations are forwarded to the LQH of the Data Node having the data for
                    the operation
              • Isolation Level
                  – Committed Read
                     • Read both from primary and secondary fragment
                     • No lock required
              • Update/Insert/Delete
                  – Locks on index entry in ACC
                  – Both primary and secondary fragments
              • Read exclusive/Read shared
                  – Locks the index entry in ACC on primary and secondary fragments


© 2011 Oracle Corporation                                                                      99
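
The locking behavior listed above can be sketched with a standard shared/exclusive compatibility model on a single index entry. This is a simplified illustration, not the ACC implementation: the `IndexEntryLock` class and the mode names are hypothetical, and it assumes that Update/Insert/Delete take an exclusive lock.

```python
class IndexEntryLock:
    """Toy lock on one index entry, tracking which transaction holds which mode."""

    def __init__(self):
        self.holders = {}  # txn_id -> "shared" or "exclusive"

    def try_acquire(self, txn_id, mode):
        """Return True if the access is allowed right now."""
        if mode == "committed_read":
            return True  # Committed Read takes no lock at all
        if mode not in ("shared", "exclusive"):
            raise ValueError(f"unknown mode: {mode}")
        others = [m for t, m in self.holders.items() if t != txn_id]
        if mode == "shared" and "exclusive" not in others:
            self.holders.setdefault(txn_id, "shared")
            return True
        if mode == "exclusive" and not others:
            self.holders[txn_id] = "exclusive"
            return True
        return False

    def release(self, txn_id):
        self.holders.pop(txn_id, None)


if __name__ == "__main__":
    lock = IndexEntryLock()
    print(lock.try_acquire(1, "shared"))          # reader 1
    print(lock.try_acquire(2, "shared"))          # reader 2 is compatible
    print(lock.try_acquire(3, "exclusive"))       # writer has to wait
    print(lock.try_acquire(3, "committed_read"))  # lock-free read still works
```

In the model, Committed Read never blocks, while a writer must wait until all shared holders have released the entry.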
Oracle my sql cluster cge

  • 1. MySQL Cluster Carrier Grade Edition Alexander Yu Principal Sales Consultant | MySQL Asia Pacific & Japan 2011-07-20
  • 2. Agenda / Topics • Oracle MySQL Strategy • MySQL Server Pluggable Storage Engine Architecture • High Availability Solutions • MySQL Cluster Carrier Grade – Internals – Geographical Replication – Scale Out – Backup & Restore •Q&A © 2011 Oracle Corporation 2
  • 3. About MySQL • Founded, first release in 1995 • MySQL Acquired by Sun Microsystems Feb 2008 • Oracle Acquires Sun Microsystems Jan 2010 • +12M Product Installations • 65K+ Downloads Per Day • Part of the rapidly growing open source LAMP stack Customers across every major operating system, hardware vendor, geography, industry, and application type High Performance ▪ Reliable ▪ Easy to Use © 2011 Oracle Corporation 3
  • 4. Oracle’s Strategy: Complete. Open. Integrated. • Built together • Tested together • Managed together • Serviced together • Based on open standards • Lower cost • Lower risk • More reliable © 2011 Oracle Corporation 4
  • 5. Complete. Open. Integrated. MySQL Completes The Stack • Oracle never settles for being second best at any level of the stack • “Complete” means we meet most customer requirements at every level That’s why MySQL matters to Oracle and Oracle customers © 2011 Oracle Corporation 5
  • 6. The “M” in the LAMP Stack: Operating System (L), Application Server (A), Database (M), Scripting (P)
  • 7. Investment in MySQL Rapid Innovation • Make MySQL a Better MySQL • #1 Open Source Database for Web Applications • Most Complete LAMP Stack • Telecom & Embedded • Develop, Promote and Support MySQL • Improve engineering, consulting and support • Leverage 24x7, World-Class Oracle Support • MySQL Community Edition • Source and binary releases • GPL license © 2011 Oracle Corporation 7
  • 8. Oracle + MySQL Customers • Product Integration • Oracle GoldenGate (Complete!) • Oracle Enterprise Linux + Oracle VM (Complete!) HA Template Available • Oracle Secure Backup (CY 2011) • Oracle Audit Vault (CY 2011) • Oracle Enterprise Manager (CY 2011) • Support • Leverage 24x7, World-Class Oracle Support • MyOracle Support © 2011 Oracle Corporation 8
  • 9. Serving Key Markets and Industry Leaders Powering Data Management on the Web & in the Network Web OEM / ISV’s SaaS, Hosting Telecommunications Enterprise 2.0 © 2011 Oracle Corporation 9
  • 10. MySQL in Communications http://www.mysql.com/industry/communications/resources.html#customer_case_studies © 2011 Oracle Corporation 10
  • 11. MySQL Server Pluggable Storage Engine Architecture © 2011 Oracle Corporation 11
  • 12. Pluggable Storage Engine Architecture MySQL Server Connectors Clients and Apps Native C API, JDBC, ODBC, .Net, PHP, Ruby, Python, VB, Perl Enterprise Management Services and Utilities Connection Pool Backup & Recovery Authentication – Thread Reuse – Connection Limits – Check Memory – Caches Security Replication Cluster Partitioning SQL Interface Parser Caches Optimizer Instance Manager DDL, DML, Stored Query Translation, Global and Engine Access Paths, Statistics Information_Schema Procedures, Views, Object Privileges Specific Caches and MySQL Workbench Triggers, Etc.. Buffers Pluggable Storage Engines Memory, Index and Storage Management InnoDB MyISAM Cluster Etc… Partners Community More.. Filesystems, Files and Logs Redo, Undo, Data, Index, Binary, Error, Query and Slow © 2011 Oracle Corporation 12
  • 13. MySQL Cluster Architecture Shared-nothing distributed database with no SPOF: JDBC (Java) High Read & Write Performance & 99.999% uptime NDB API (C++) ClusterJ (Java) Clients OpenJPA (Java) PHP/P*/ODBC OpenLDAP MySQL Cluster Application Nodes SQL Nodes ClusterJ MGM Client NDB API (C++) MGM API (C) NDB API NDB API MGM Node MGM Node MySQL Cluster Data Nodes © 2011 Oracle Corporation 13
  • 14. Workload Qualification: InnoDB vs MySQL Cluster
      Workload | InnoDB | MySQL Cluster
      Packaged Applications (i.e. standard business applications) | Yes | No, unless mainly PK access
      Custom Applications | Yes | Yes
      OLTP Applications | Yes | Yes
      DSS Applications (i.e. Data Marts, Analytics, etc.) | Yes | No
      Content Management | Yes | Limited Support
      In-Network Telecoms Applications (HLR, HSS, SDP, etc.) | No | Yes
      Web Session Management | Yes | Yes
      User Profile Management & AAA | Yes | Yes
      eCommerce Databases | Yes | Yes
  • 15. Feature Comparison: InnoDB vs MySQL Cluster
      Feature Qualification | InnoDB | MySQL Cluster
      Latest MySQL 5.5 & InnoDB 1.1 Performance Enhancements | Yes | No
      Storage Limits | 64TB | 2TB (a)
      Foreign Keys | Yes | No
      MVCC Non-Blocking Reads | Yes | No
      Optimized for Complex Multi-Table JOINs with Thousands of Accesses | Yes | No (b)
      Hash Indexes | No | Yes
      Compressed Data | Yes | No
      Support for 8KB+ Row Sizes | Yes | Only via BLOBs (c)
      Built-in Clustering Support for 99.999% HA | No | Yes
      Minimum Number of Physical Hosts for Redundancy | 2 (Active / Passive) | 2 + 1 (A/A & Mgmt) (d)
      Time to Recovery After Node Failure | 30s - hours | Sub-Second
      Real-Time Performance | No | Yes
      Option for In-Memory Storage of Tables with Disk Persistence | No | Yes
      Non-SQL Access Methods to Data (i.e. NDB API) | No | Yes
      Write Scalability without Application Partitioning | No | Yes (e)
      Max Number of Nodes for Parallel Write Performance | 1 | 48 (f)
      Conflict Resolution & Detection across Multiple Replication Masters | No | Yes
      Virtualization Support | Yes | No
  • 16. Storage Engines
      Feature | MyISAM | NDB | Archive | InnoDB | Memory
      Storage limits | No | Yes | No | 64TB | Yes
      Transactions | No | Yes | No | Yes | No
      Locking granularity | Table | Row | Row | Row | Table
      MVCC snapshot read | No | No | No | Yes | No
      Geospatial support | Yes | No | Yes | Yes | No
      Data caches | No | Yes | No | Yes | NA
      Index caches | Yes | Yes | No | Yes | NA
      Compressed data | Yes | No | Yes | No | No
      Storage cost (relative to other engines) | Small | Med | Small | Med | NA
      Memory cost (relative to other engines) | Low | High | Low | High | High
      Bulk insert speed | High | High | Highest | Med | High
      Replication support | Yes | Yes | Yes | Yes | Yes
      Foreign Key support | No | No | No | Yes | No
      Built-in Cluster/High-availability support | No | Yes | No | No | No
      Dynamically add and remove storage engines. Change the storage engine on a table with “ALTER TABLE …”
  • 17. Why Users Adopt MySQL Cluster MySQL Already in Use High Read/Write 99.999% MySQL Throughput Real Time Performance Scale-Out, On-Demand © 2011 Oracle Corporation 17
  • 18. Why Users Buy MySQL Cluster CGE Standardized on Open Source Blend of Web & Deploying Mission Critical Applications Telecoms Capabilities HA MySQL Management & Monitoring Global 24x7 support Tools Embedding MySQL Cluster Real-Time, High Read/ Write Performance Scale-Out, Shared Nothing © 2011 Oracle Corporation 18
  • 19. High Availability Solutions © 2011 Oracle Corporation 19
  • 20. Selecting the Right HA Architecture © 2011 Oracle Corporation 20
  • 21. Mapping HA Architecture to Applications Shared-Nothing, Data Clustered / Applications Geo-Replicated Replication Virtualized Cluster E-Commerce / Trading Session Management User Authentication / Accounting Feeds, Blogs, Wikis Data Refinery OLTP Data Warehouse/BI Content Management CRM / SCM Collaboration Packaged Software Telco Apps (HLR/HSS/SDP…) © 2011 Oracle Corporation 21
  • 22. MySQL High Availability Solutions
      95.000%: MySQL Replication
      99.000%: MySQL Replication with Clustering Software
      99.900%: DRBD with Clustering Software
      99.900%: Shared Storage with Clustering Software (A/P - A/A)
      99.990%: DRBD and Replication with Clustering Software
      99.990%: Shared Storage and Replication with Clustering SW
      99.990%: Shared Storage Replication
      99.990%: Virtualised Environment
      99.999%: MySQL Cluster
      99.999%: MySQL Cluster & Replication
      99.999%: MySQL Cluster Carrier Grade Edition
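To put the availability levels on this slide into perspective, the following minimal sketch converts an availability percentage into the downtime it allows per year. The function name and the 365-day year are illustrative assumptions, not part of the deck.

```python
# Assumption: a 365-day year (leap years ignored) for a round figure.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def downtime_minutes_per_year(availability_pct: float) -> float:
    """Return the yearly downtime (in minutes) allowed by an availability level."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR

# Print the budget for each availability level named on the slide.
for pct in (95.0, 99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% -> {downtime_minutes_per_year(pct):,.1f} minutes/year")
```

Under these assumptions, "five nines" (99.999%) leaves a little over five minutes of downtime per year, while 95% leaves more than eighteen days.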
  • 23. MySQL Replication • Native in MySQL • Used for Scalability and HA • Asynchronous as standard • Semi-Synchronous support added in MySQL 5.5 • Each slave adds minimal load on master Relay Log © 2011 Oracle Corporation 22
  • 24. Replication Topologies Single Chain Circular Multiple Multi - Master Multi - Circular © 2011 Oracle Corporation 24
  • 25. MySQL Replication Read Scalability Clients MySQL Replication Slaves Master • Used by leading web properties for scale-out • Reads are directed to slaves, writes to master • Delivers higher performance & scale with efficient resource utilization © 2011 Oracle Corporation 22
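The read/write split described on this slide can be sketched as a small router. This is only an illustration under stated assumptions: the class name, the host names and the naive "starts with SELECT" test are hypothetical, and a real deployment would normally rely on a connector or proxy feature rather than hand-written routing.

```python
import itertools

class ReadWriteRouter:
    """Send reads to slaves (round-robin) and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        # Endless round-robin iterator over the slaves (unused if there are none).
        self._next_slave = itertools.cycle(self.slaves)

    def route(self, sql: str) -> str:
        # Assumption: any statement that starts with SELECT is a pure read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.slaves:
            return next(self._next_slave)
        return self.master

# Hypothetical host names, for illustration only.
router = ReadWriteRouter("master-db", ["slave-1", "slave-2"])
print(router.route("SELECT * FROM towns"))            # goes to a slave
print(router.route("INSERT INTO towns VALUES (...)"))  # goes to the master
```

Because replication is asynchronous by default, a read routed to a slave may briefly lag behind a write just sent to the master; the sketch does not attempt to handle that.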
  • 26. MySQL Replication Failure Detection & Failover • Linux Heartbeat implements heartbeat protocol between nodes • Failover initiated by Cluster Resource Manager (Pacemaker) if heartbeat message is not received • Virtual IP address failed over to ensure failover is transparent to apps © 2011 Oracle Corporation 22
  • 27. Shared Disk Clusters A/P - A/A READS/WRITES Applications VIP Shared Storage • Reliability • High Availability - Commonly used solution - Data handled by a SAN or NAS and always available • Fault Tolerance - Automatic fail-over - No single point of failure with appropriate hardware • Simplified Management © 2011 Oracle Corporation 27
  • 28. Distributed Replicated Block Device • DRBD creates transaction-safe hot standby configuration • MySQL updates written to block device on the Active Server • DRBD synchronously replicates updates to the Passive Server • Linux Heartbeat fails over from Active to Passive in event of failure © 2011 Oracle Corporation 28
  • 29. Sharding aka Application Partitioning Master Clients Slave Reads Writes Partitioning Logic 1 2 3 4 5 Shards Slaves © 2011 Oracle Corporation 29
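The "Partitioning Logic" box on this slide lives in the application. A minimal sketch of one common approach, hashing a key to pick one of the five shards, is shown below; the function name, the key format and the choice of MD5 are assumptions for illustration, not something the deck prescribes.

```python
import hashlib

NUM_SHARDS = 5  # the slide shows shards numbered 1 to 5

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an application key to a shard number in 1..num_shards."""
    # A stable hash is used so every application server picks the same shard
    # (Python's built-in hash() is randomised per process for strings).
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards + 1

# Hypothetical keys: the same key always maps to the same shard.
for user in ("user:1001", "user:1002", "user:1003"):
    print(user, "-> shard", shard_for(user))
```

A simple modulo scheme like this forces most keys to move when the number of shards changes, which is one reason application-level sharding is harder to scale on-line than the auto-sharding described later for MySQL Cluster.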
  • 30. Oracle VM Template for MySQL Integrated & Tested OS, VM and Database Stack Oracle VM Oracle VM Oracle VM Fastest, simplest & most reliable way to deploy virtualized, cloud- ready MySQL instances, certified Oracle VM Oracle VM for production use • Rapid DEPLOYMENT Oracle VM Server Pool • Increased RELIABILITY • Higher AVAILABILITY Oracle VM Servers • Lower COST © 2011 Oracle Corporation 30
  • 31. Template Components Certified for Production Deployment Oracle VM Oracle VM Automatic Fault Detection & Recovery • Oracle Linux 5 Update 6 with the Unbreakable Enterprise Kernel • Oracle VM 2.2.1 Secure Live Migration (SSL) • Oracle VM Manager 2.1.5 Oracle VM Server Pool • Oracle Cluster File System 2 (OCFS2) Oracle VM Manager • MySQL Database 5.5.10 (Enterprise Edition) Oracle VM Servers Pre-Installed & Pre-Configured ocfs2 Full Integration & QA Testing SAN / iSCSI Single Point of Support © 2011 Oracle Corporation 31
  • 32. Positioning Current Solutions
      Requirement | MySQL Replication | Heartbeat + DRBD | Oracle VM Template | MySQL Cluster
      Availability
      Platform Support | All Supported by MySQL Server | Linux | Oracle Linux | All Supported by MySQL Cluster
      Automated IP Failover | No | Yes | Yes | Depends on Connector and Configuration
      Automated Database Failover | No | Yes | Yes | Yes
      Automatic Data Resynchronization | No | Yes | N/A - Shared Storage | Yes
      Typical Failover Time | User / Script Dependent | Configuration Dependent, 60 seconds and Above | Configuration Dependent, 60 seconds and Above | 1 Second and Less
      Synchronous Replication | No, Asynchronous and Semi-Synchronous | Yes | N/A - Shared Storage | Yes
      Geographic Redundancy Support | Yes | Yes, via MySQL Replication | Yes, via MySQL Replication | Yes, via MySQL Replication
      Scalability
      Number of Nodes | One Master, Multiple Slaves | One Active (primary), one Passive (secondary) Node | One Active (primary), one Passive (secondary) Node | 255
      Built-in Load Balancing | Reads, via MySQL Replication | Reads, via MySQL Replication | Reads, via MySQL Replication & During Failover | Yes, Reads and Writes
      Read-Intensive Workloads | Yes | Yes | Yes | Yes
      Write-Intensive Workloads | Yes, via Application-Level Sharding | Yes, via Application-Level Sharding to Multiple Active/Passive Pairs | Yes, via Application-Level Sharding to Multiple Active/Passive Pairs | Yes, via Auto-Sharding
      Scale On-Line (add nodes, repartition, etc.) | No | No | No | Yes
  • 33. MySQL Cluster Real-time Carrier Grade Database © 2011 Oracle Corporation 33
  • 34. Customers & Applications • Web – User profile management – Session stores – eCommerce – On-Line Gaming – Application Servers • Telecoms – Subscriber Databases (HLR/HSS) – Service Delivery Platforms – VoIP, IPTV & VoD – Mobile Content Delivery – On-Line app stores and portals – IP Management – Payment Gateways http://www.mysql.com/industry/telecom/ © 2011 Oracle Corporation 34
  • 35. MySQL Cluster - NDB Storage Engine © 2011 Oracle Corporation 35
  • 36. MySQL Cluster Architecture Shared-nothing distributed database with no SPOF: JDBC (Java) High Read & Write Performance & 99.999% uptime NDB API (C++) ClusterJ (Java) Clients OpenJPA (Java) PHP/P*/ODBC OpenLDAP MySQL Cluster Application Nodes SQL Nodes ClusterJ MGM Client NDB API (C++) MGM API (C) NDB API NDB API MGM Node MGM Node MySQL Cluster Data Nodes © 2011 Oracle Corporation 36
  • 37. MySQL Cluster Nodes SQL Based Applications JDBC/ODBC MySQL/ API API API Node Management SQL Node Node Node Client NDB API Data MySQL Cluster Data MGM API Node Node Management Node NDB API Data Data Node Node © 2011 Oracle Corporation 37
  • 38. MySQL Cluster Nodes • Standard SQL Interface SQL Node • Scale-out for Performance (MySQL) • Enables Replication • High Performance NDB API • C, C++ & Java, LDAP, HTTP API (Application) • Developer’s Guide • Data Storage (Memory/Disk) Data Node • Automatic & User-Defined Partitioning • Local & Global Checkpoints (NDB Storage Engine) • Scale-out or scale-up for Capacity & Redundancy • Scale dynamically with on-line add node • Administration and Configuration Management • Arbitration • Use Two for Redundancy Node © 2011 Oracle Corporation 38
  • 39. Replication Flexibility • Synchronous replication within a Cluster node group for HA • Bi-Direction asynchronous Cluster 1 Cluster 2 replication to remote Cluster for geographic redundancy • Asynchronous replication to non- Cluster databases for specialised activities such as report generation • Mix and match replication types MyISAM MyISAM InnoDB Synchronous replication Asynchronous replication © 2011 Oracle Corporation 39
  • 40. MySQL Cluster Loads
      • MySQL Cluster software (Management & Data Nodes) included with MySQL Community Server should not be used
      • MySQL Server included with MySQL Cluster loads is different to regular MySQL Server
      • Always use this special version of MySQL Server when accessing MySQL Cluster data
      • MySQL Cluster CGE downloaded from oem.mysql.com
      • GA GPL Community versions downloaded from www.mysql.com/downloads
      • In-development GPL Community versions downloaded from dev.mysql.com/downloads/
      [Diagram: components compared across MySQL Community Server, MySQL Cluster (GPL) and MySQL Cluster CGE]
      MySQL Server: Community Server ≠ Cluster (GPL) = Cluster CGE
      InnoDB: Community Server ≠ Cluster (GPL) = Cluster CGE
      Data Node: Community Server ≠ Cluster (GPL) = Cluster CGE
      Mgmt Node: Community Server ≠ Cluster (GPL) = Cluster CGE
  • 41. MySQL Cluster System Requirements
      System Component | Requirement
      Hosts | Maximum of 255 total nodes (48 Data Nodes)
      Hardware | COTS – Advanced TCA; 32 & 64-bit x86 & SPARC
      Memory | Varies on size of database, # of hosts, # of replicas
      Data Storage | Shared-Nothing - Memory & Disk; SCSI or RAID for I/O performance
      Network | >1 Gigabit recommended, SCI supported
      Operating System | Linux (Red Hat, SuSE), Solaris, HP-UX, Mac OSX, Windows, others…
  • 42. MySQL Cluster 6.2 © 2011 Oracle Corporation 42
  • 43. MySQL Cluster 6.3 http://dev.mysql.com/doc/mysql-cluster-excerpt/5.1/en/mysql-cluster-changes-5-1-ndb-6-3.html © 2011 Oracle Corporation 43
  • 44. MySQL Cluster 7.0 –GA April 2009 http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster7_architecture.php © 2011 Oracle Corporation 44
  • 45. Scale out – multi core environments © 2011 Oracle Corporation 45
  • 46. MySQL Cluster vs MySQL MEMORY: 30x Higher Throughput / 1/3rd the Latency on a single node • Table level locking inhibits MEMORY scalability beyond a single client connection • Check-pointing & logging enabled, MySQL Cluster still delivers durability • 4 socket server, 64GB RAM, running Linux © 2011 Oracle Corporation 46
  • 47. Scale-Out Reads & Writes on Commodity Hardware • NDB API Performance 4.33 M Queries per second! • 8 Intel servers, dual-6-core CPUs @2.93 GHz, 24GB RAM • 2 Data Nodes per server • flexAsync benchmark – 16 parallel threads, each issuing 256 simultaneous transactions – Read / Write 100KB attribute • Interim results from 2 days testing – watch this space: mikaelronstrom.blogspot.com © 2011 Oracle Corporation 47
  • 48. MySQL Cluster CGE 7.1 – Key Enhancements http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster7_architecture.php © 2011 Oracle Corporation 48
  • 49. MySQL Cluster 7.1 Momentum 1,000 Downloads per Day Windows GA Pro-active Cluster Monitoring Fully Automated “MySQL Cluster 7.1 gave us the Management perfect combination of extreme levels of transaction throughput, low 10x Higher Java latency & carrier-grade availability, Performance while reducing TCO” Phani Naik, Pyro Group © 2011 Oracle Corporation 49
  • 50. MySQL Cluster 7.1: ndbinfo
      • New database (ndbinfo) which presents real-time metric data in the form of tables
      • Exposes new information together with providing a simpler, more consistent way to access existing data
      • Examples include:
        – Resource usage (memory, buffers)
        – Event counters (such as number of READ operations since last restart)
        – Data node status and connection status
      mysql> use ndbinfo
      mysql> show tables;
      +-------------------+
      | Tables_in_ndbinfo |
      +-------------------+
      | blocks            |
      | config_params     |
      | counters          |
      | logbuffers        |
      | logspaces         |
      | memoryusage       |
      | nodes             |
      | resources         |
      | transporters      |
      +-------------------+
  • 51. MySQL Cluster 7.1: ndbinfo
      • Example 1: Check memory usage/availability
      mysql> select * from ndbinfo.memoryusage;
      +---------+--------------+--------+------------+-----------+-------------+
      | node_id | memory_type  | used   | used_pages | total     | total_pages |
      +---------+--------------+--------+------------+-----------+-------------+
      |       3 | Data memory  | 917504 |         28 | 104857600 |        3200 |
      |       3 | Index memory | 221184 |         27 |  11010048 |        1344 |
      |       4 | Data memory  | 917504 |         28 | 104857600 |        3200 |
      |       4 | Index memory | 221184 |         27 |  11010048 |        1344 |
      +---------+--------------+--------+------------+-----------+-------------+
      • Note that there is a DATA_MEMORY and INDEX_MEMORY row for each data node in the cluster
      • If the Cluster is nearing the configured limit then increase the DataMemory and/or IndexMemory parameters in config.ini and then perform a rolling restart
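As a rough illustration of how the memoryusage figures above might be checked against the configured limit, the sketch below computes the percentage used per row and flags rows above a threshold. The rows are copied from the slide; the 80% threshold and the function names are assumptions chosen for the example, not values recommended by the deck.

```python
# (node_id, memory_type, used_bytes, total_bytes), copied from the slide output.
ROWS = [
    (3, "Data memory", 917504, 104857600),
    (3, "Index memory", 221184, 11010048),
    (4, "Data memory", 917504, 104857600),
    (4, "Index memory", 221184, 11010048),
]

def percent_used(used: int, total: int) -> float:
    """Return used/total as a percentage."""
    return 100.0 * used / total

def rows_near_limit(rows, threshold_pct: float = 80.0):
    """Return (node_id, memory_type, pct) for rows at or above the threshold.

    The 80% default is a hypothetical alerting threshold.
    """
    flagged = []
    for node_id, memory_type, used, total in rows:
        pct = percent_used(used, total)
        if pct >= threshold_pct:
            flagged.append((node_id, memory_type, round(pct, 2)))
    return flagged

for node_id, memory_type, used, total in ROWS:
    print(f"node {node_id} {memory_type}: {percent_used(used, total):.3f}% used")
print("near limit:", rows_near_limit(ROWS))
```

With the sample figures every row is far below the threshold, so nothing is flagged; in practice the rows would come from querying ndbinfo.memoryusage.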
  • 52. MySQL Cluster 7.1: ndbinfo
      • Example 2: Check how many table scans performed on each data node since the last restart
      mysql> select node_id as 'data node', val as 'Table Scans'
          -> from ndbinfo.counters where counter_name='TABLE_SCANS';
      +-----------+-------------+
      | data node | Table Scans |
      +-----------+-------------+
      |         3 |           3 |
      |         4 |           4 |
      +-----------+-------------+
      • You might check this if your database performance is lower than anticipated
      • If this figure is rising faster than you expected then examine your application to understand why there are so many table scans
  • 53. Latest news on MySQL Cluster 7.1
      • As of MySQL Cluster 7.1.9a:
        – InnoDB plugin included
        – New view in ndbinfo:
      mysql> SELECT node_id, page_requests_direct_return AS hit,
          -> page_requests_wait_io AS miss,
          -> 100*page_requests_direct_return/(page_requests_direct_return+page_requests_wait_io) AS hit_rate
          -> FROM ndbinfo.diskpagebuffer;
      +---------+------+------+----------+
      | node_id | hit  | miss | hit_rate |
      +---------+------+------+----------+
      |       3 |    6 |    3 |  66.6667 |
      |       4 |   10 |    3 |  76.9231 |
      +---------+------+------+----------+
      • MEM2.3 includes new Cluster Advisor/graphs
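The hit_rate column in the query above is simply hits divided by total page requests. The short sketch below reproduces that arithmetic for the two sample rows on the slide; the function name is an illustrative assumption.

```python
def hit_rate(direct_return: int, wait_io: int) -> float:
    """Disk page buffer hit rate in percent: hits / (hits + misses) * 100."""
    return 100 * direct_return / (direct_return + wait_io)

# (node_id, page_requests_direct_return, page_requests_wait_io) from the slide.
SAMPLE = [(3, 6, 3), (4, 10, 3)]

for node_id, hit, miss in SAMPLE:
    print(f"node {node_id}: hit_rate = {hit_rate(hit, miss):.4f}")
```

The results round to 66.6667 and 76.9231, matching the hit_rate values in the slide's output.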
  • 54. MySQL Enterprise Monitor 2.3 © 2011 Oracle Corporation 54
  • 55. Online Operations • Scale the cluster for throughput or capacity – Data and SQL Nodes • Repartition tables • Recover failed nodes • Upgrade / patch servers & OS • Upgrade / patch MySQL Cluster • Back-Up • Evolve the schema on-line, in real-time © 2011 Oracle Corporation 55
  • 56. Real-Time, On-Line Schema Changes
      • Fully online – transaction response times unchanged
      • Add and remove indexes, add new columns and tables
      • No temporary table creation
      • No recreation of data or deletion required
      • Faster and better performing table maintenance operations
      • Less memory and disk requirements
      CREATE OFFLINE INDEX b ON t1(b);
      Query OK, 1356 rows affected (2.20 sec)
      DROP OFFLINE INDEX b ON t1;
      Query OK, 1356 rows affected (2.03 sec)
      CREATE ONLINE INDEX b ON t1(b);
      Query OK, 0 rows affected (0.58 sec)
      DROP ONLINE INDEX b ON t1;
      Query OK, 0 rows affected (0.46 sec)
      ALTER ONLINE TABLE t1 ADD COLUMN d INT;
      Query OK, 0 rows affected (0.36 sec)
  • 57. Performance I Flexibility I Simplification • SQL and NoSQL Access Methods to tables – SQL: complex queries, rich ecosystem of apps & expertise – Simple Key/Value interfaces bypassing SQL layer for blazing fast reads & writes – Real-time interfaces for micro-second latency – Developers free to work in their preferred environment © 2011 Oracle Corporation 57
  • 58. Scaling Distributed Joins (7.2DM): Adaptive Query Localization
      • ‘Complex’ joins traditionally slower in MySQL Cluster
        – Complex = lots of levels and interim results in JOIN
      • JOIN was implemented in the MySQL Server:
        – Nested Loop join
        – When data is needed, it must be fetched over the network from the Data Nodes, row by row
        – This causes latency and consumes resources
      • Can now push the execution down into the data nodes, greatly reducing the network trips
      • 25x-40x performance gain in customer PoC!
      [Diagram: mysqld and Data Nodes, without and with AQL]
      The existence, content and timing of future releases described here is included for information only and may be changed at Oracle's discretion.
      http://www.mysql.com/news-and-events/on-demand-webinars/display-od-583.html
  • 59. Adaptive Query Localization: Current Limitations • Columns to be joined – must use exactly the same data type – cannot be any of the BLOB or TEXT types – columns to be joined must be part of a table index or primary key • AQL can be disabled using the ndb_join_pushdown server system variable – enabled by default © 2011 Oracle Corporation 59
  • 60. Early Adopter Speaks! “Testing of Adaptive Query Localization has yielded over 20x higher performance on complex queries within our application, enabling Docudesk to expand our use of MySQL Cluster into a broader range of highly dynamic web services.” Casey Brown, Manager, Development & DBA Services, Docudesk
  • 61. MySQL Cluster: SQL & NoSQL Combined Mix & Match! Same data accessed simultaneously through SQL & NoSQL interfaces • NoSQL – Multiple ways to bypass SQL, and maximize performance: • NDB API. C++ for highest performance, lowest latency • Cluster/J for optimized access in Java • NEW! Memcached. Use all your existing clients/applications © 2011 Oracle Corporation 61
  • 62. Which to Choose ? © 2011 Oracle Corporation 62
  • 63. Performance © 2011 Oracle Corporation 63
  • 64. NoSQL With NDB API: Best possible performance
      • Application embeds the NDB API C++ interface library
      • NDB API makes an intelligent decision (where possible) about which data node to send queries to
        – With a little planning in the schema design, achieve linear scalability
      • Used by all of the other application nodes (MySQL, LDAP, ClusterJ,…)
      • Best possible performance but requires greater development skill
      • Favourite API for real-time network applications
      • Foundation for all interfaces
      [Diagram: Clients; Applications with embedded NDB API Library; MySQL Cluster Data Nodes]
  • 65. NoSQL with memcached 7.2DM • Memcached is a distributed memory based hash-key/value store with no persistence to disk Memcached protocol • NoSQL, simple API, popular with developers • MySQL Cluster already provides scalable, in- memory performance with NoSQL (hashed) access as well as persistence • Provide the Memcached API but map to NDB API calls • Writes-in-place, so no need to invalidate cache • Simplifies architecture as caching & database integrated into 1 tier • Access data from existing relational tables © 2011 Oracle Corporation 65
  • 66. NoSQL with Memcached (7.2DM): Pre-GA version available from labs.mysql.com
      Flexible:
      • Deployment options
      • Multiple Clusters
      • Simultaneous SQL Access
      • Can still cache in Memcached server
      • Flat key-value store or map to multiple tables/columns
      Simple:
      set maidenhead 0 0 3
      SL6
      STORED
      get maidenhead
      VALUE maidenhead 0 3
      SL6
      END
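The set/get exchange on this slide follows the memcached text protocol, in which each line ends with CRLF and the byte count of the value is sent ahead of the data. The sketch below builds the "set" request and parses the "get" response shown above. It does no networking, and the helper names are assumptions for illustration rather than part of any MySQL Cluster API.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol 'set' request: header line, then data."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"

def parse_get_response(raw: bytes):
    """Parse a single-key 'get' response; return (key, flags, value) or None on a miss."""
    header, rest = raw.split(b"\r\n", 1)
    if header == b"END":  # key not found
        return None
    tag, key, flags, nbytes = header.split()
    if tag != b"VALUE":
        raise ValueError(f"unexpected response line: {header!r}")
    value = rest[: int(nbytes)]  # the value is exactly nbytes long
    return key.decode("ascii"), int(flags), value

# The exchange from the slide: store "SL6" under the key "maidenhead", then read it back.
request = build_set("maidenhead", b"SL6")
response = b"VALUE maidenhead 0 3\r\nSL6\r\nEND\r\n"
print(request)
print(parse_get_response(response))
```

Because the wire format is unchanged, existing memcached clients should be able to talk to the Cluster-backed server without modification, which is the point the slide makes.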
  • 67. MySQL Cluster Manager 1.1 Features Delivered as part of MySQL Cluster CGE 7.1 © 2011 Oracle Corporation 67
  • 68. How Does MySQL Cluster Manager Help? Example: Initiating upgrade from MySQL Cluster 6.3 to 7.1
      Before MySQL Cluster Manager:
      • 1 x preliminary check of cluster state
      • 8 x ssh commands per server
      • 8 x per-process stop commands
      • 4 x scp of configuration files (2 x mgmd & 2 x mysqld)
      • 8 x per-process start commands
      • 8 x checks for started and re-joined processes
      • 8 x process completion verifications
      • 1 x verify completion of the whole cluster
      • Excludes manual editing of each configuration file
      Total: 46 commands - 2.5 hours of attended operation
      With MySQL Cluster Manager:
      upgrade cluster --package=7.1 mycluster;
      Total: 1 Command - Unattended Operation
      Results:
      • Reduces the overhead and complexity of managing database clusters
      • Reduces the risk of downtime resulting from administrator error
      • Automates best practices in database cluster management
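The "46 commands" total on this slide can be checked by adding up the individual manual steps. The sketch below does only that arithmetic; the dictionary keys are shortened labels invented for the example.

```python
# Manual step counts as listed on the slide (labels abbreviated for the example).
MANUAL_STEPS = {
    "preliminary check of cluster state": 1,
    "ssh commands": 8,
    "per-process stop commands": 8,
    "scp of configuration files": 4,
    "per-process start commands": 8,
    "checks for started and re-joined processes": 8,
    "process completion verifications": 8,
    "verify completion of the whole cluster": 1,
}

total_manual = sum(MANUAL_STEPS.values())
print("manual commands:", total_manual)
print("with MySQL Cluster Manager: 1 command")
```

The sum agrees with the slide's total, against a single "upgrade cluster" command when MySQL Cluster Manager is used.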
  • 69. Terms used by MySQL Cluster Manager
      • Site: the set of physical hosts which are to run Cluster processes to be managed by MySQL Cluster Manager. A site can include 1 or more clusters.
      • Cluster: represents a MySQL Cluster deployment. A Cluster contains 1 or more Cluster processes running on 1 or more hosts
      • Host: Physical machine, running the MySQL Cluster Manager agent
      • Agent: The MySQL Cluster Manager process running on each host
      • Process: an individual MySQL Cluster node; one of: ndb_mgmd, ndbd, ndbmtd, mysqld & ndbapi*
      • Package: A copy of a MySQL Cluster installation directory as downloaded from mysql.com, stored on each host
      *ndbapi is a special case, representing a slot for an external application process to connect to the cluster using the NDB API
      [Diagram: a Site of four Hosts, each running an agent; a Cluster made up of Processes spread across the hosts]
  • 70. Example configuration
      • MySQL Cluster Manager agent runs on each physical host
      • No central process for Cluster Manager – agents co-operate, each one responsible for its local nodes
      • Agents are responsible for managing all nodes in the cluster
      • Management responsibilities
        – Starting, stopping & restarting nodes
        – Configuration changes
        – Upgrades
        – Host & Node status reporting
        – Recovering failed nodes
      [Diagram: a mysql client connected to four hosts, each running an agent]
      192.168.0.10: 1. ndb_mgmd, 7. mysqld
      192.168.0.11: 2. ndb_mgmd, 8. mysqld
      192.168.0.12: 3. ndbd, 5. ndbd
      192.168.0.13: 4. ndbd, 6. ndbd
      Legend: n. mysqld = MySQL Server (ID=n); n. ndb_mgmd = Management Node (ID=n); n. ndbd = Data Node (ID=n); agent = MySQL Cluster Manager agent
  • 71. Creating & Starting a Cluster
      1. Define the site:
         mysql> create site --hosts=192.168.0.10,192.168.0.11,
             -> 192.168.0.12,192.168.0.13 mysite;
      2. Expand the MySQL Cluster tar-ball(s) from mysql.com to known directory
      3. Define the package(s):
         mysql> add package --basedir=/usr/local/mysql_6_3_26 6.3;
         mysql> add package --basedir=/usr/local/mysql_7_0_7 7.0;
         Note that the basedir should match the directory used in Step 2.
      4. Create the Cluster:
         mysql> create cluster --package=6.3
             -> --processhosts=ndb_mgmd@192.168.0.10,ndb_mgmd@192.168.0.11,
             -> ndbd@192.168.0.12,ndbd@192.168.0.13,ndbd@192.168.0.12,
             -> ndbd@192.168.0.13,mysqld@192.168.0.10,mysqld@192.168.0.11
             -> mycluster;
         This is where you define what nodes/processes make up the Cluster and where they should run
      5. Start the Cluster:
         mysql> start cluster mycluster;
      [Diagram: the same four-host layout as the example configuration: ndb_mgmd 1-2 and mysqld 7-8 on 192.168.0.10 and 192.168.0.11; ndbd 3-6 on 192.168.0.12 and 192.168.0.13]
  • 72. Upgrade Cluster
      • Upgrade from MySQL Cluster 6.3.26 to 7.0.7:
        mysql> upgrade cluster --package=7.0 mycluster;
      • Automatically upgrades each node and restarts the process – in the correct order to avoid any loss of service
      • Without MySQL Cluster Manager, the administrator must stop each process in turn, start the process with the new version and wait for the node to restart before moving onto the next one
      [Diagram: the same four-host layout as the example configuration]
  • 73. MySQL Cluster Manager GA 1st November 2010
      • On-line add-node
        mysql> add hosts --hosts=192.168.0.35,192.168.0.36 mysite;
        mysql> add package --basedir=/usr/local/mysql_7_0_7 --hosts=192.168.0.35,192.168.0.36 7.0;
        mysql> add process --processhosts=mysqld@192.168.0.33,mysqld@192.168.0.34,ndbd@192.168.0.35,ndbd@192.168.0.36 mycluster;
        mysql> start process --added mycluster;
      • Restart optimizations
        – Fewer nodes restarted on some parameter changes
      [Diagram: before, hosts 31-34 run Data Nodes (31, 32) and Mgmt Node + mysqld (33, 34); after, hosts 33 and 34 each gain a second mysqld and new hosts 35 and 36 run additional Data Nodes]
  • 74. General Design Considerations • MySQL Cluster is designed for – Short transactions – Many parallel transactions • Utilize Simple access patterns to fetch data – Use efficient scans and batching interfaces • Analyze what your most typical use cases are – optimize for those Overall design goal Minimize network roundtrips for your most important requests! © 2011 Oracle Corporation 74
  • 75. Best Practice: Primary Keys
      • ALWAYS DEFINE A PRIMARY KEY ON THE TABLE, to avoid problems with:
        – Cluster-to-Cluster replication
        – Recovery
        – Application behavior (KEY NOT FOUND, etc.)
      • A hidden PRIMARY KEY is added if no PK is specified, BUT this is NOT recommended
        – The hidden primary key is, for example, not replicated (between Clusters)
        – There are problems in this area, so avoid them
      • So always have at least: id BIGINT AUTO_INCREMENT PRIMARY KEY
        – Even if you don't “need” it for your applications
  • 76. Best Practice: Distribution Aware Apps
      • Partition selected using hash on Partition Key
        – Primary Key by default
        – User can override in table definition
      • MySQL Server (or NDB API) will attempt to send transaction to the correct data node
        – If all data for the transaction are in the same partition, less messaging -> faster
      • Aim to have all rows for high-running queries in same partition
      Example queries:
      SELECT SUM(population) FROM towns WHERE country='UK';
      SELECT SUM(population) FROM towns WHERE town='Boston';
      Table towns (shown once per query, each copy marked with its Partition Key and Primary Key):
      town | country | population
      Maidenhead | UK | 78000
      Paris | France | 2193031
      Boston | UK | 58124
      Boston | USA | 617594
  • 77. Best Practice: Distribution Aware – Multiple Tables
      • Extend partition awareness over multiple tables
      • Same rule – aim to have all data for instance of high running transactions in the same partition
      ALTER TABLE service_ids PARTITION BY KEY(sub_id);
      First table (marked with Partition Key and Primary Key):
      sub_id | age | gender
      19724 | 25 | male
      84539 | 43 | female
      19724 | 16 | female
      74574 | 21 | female
      Table service_ids (marked with Partition Key and Primary Key):
      service | sub_id | svc_id
      twitter | 19724 | 76325732
      twitter | 84539 | 67324782
      facebook | 19724 | 83753984
      facebook | 73642 | 87324793
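The effect of choosing a partition key that matches the query can be sketched with the towns data from this slide. The example assumes, purely for illustration, four partitions and a CRC32 hash; MySQL Cluster uses its own internal hash function, so the actual partition numbers will differ. What carries over is the principle that equal partition-key values always land in the same partition.

```python
import zlib

# Rows from the slide: (town, country, population).
TOWNS = [
    ("Maidenhead", "UK", 78000),
    ("Paris", "France", 2193031),
    ("Boston", "UK", 58124),
    ("Boston", "USA", 617594),
]
NUM_PARTITIONS = 4  # assumption for the example

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Illustrative stand-in for the cluster's hash: CRC32 modulo partition count."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def partitions_touched(rows, key_of) -> set:
    """Return the set of partitions holding the given rows under a partition key."""
    return {partition_for(key_of(row)) for row in rows}

uk_rows = [row for row in TOWNS if row[1] == "UK"]

# Partition key = country: every UK row hashes identically, so one partition.
by_country = partitions_touched(uk_rows, lambda r: r[1])
# Partition key = whole primary key (town, country): UK rows may be spread out.
by_primary_key = partitions_touched(uk_rows, lambda r: f"{r[0]}|{r[1]}")

print("partition key = country     ->", len(by_country), "partition(s)")
print("partition key = primary key ->", len(by_primary_key), "partition(s)")
```

With the country as partition key, the SUM over UK towns can be answered from a single partition, which is the "less messaging -> faster" case the slide describes.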
  • 78. MySQL Cluster Internals
  • 79. Automatic Data Partitioning • 4 Partitions * 2 Replicas = 8 Fragments • Node groups are created automatically • # of groups = # of data nodes / # of replicas • [Figure: Table T1 split into partitions P1–P4 across four data nodes. Node Group 1: Data Node 1 holds F1 and F3, Data Node 2 holds F3 and F1. Node Group 2: Data Node 3 holds F2 and F4, Data Node 4 holds F4 and F2. Legend: Px = partition, Fx = primary fragment or secondary fragment (fragment replica)]
  • 80. Automatic Data Partitioning • 4 Partitions * 2 Replicas = 8 Fragments • A fragment is a copy of a partition (aka fragment replica) • Number of fragments = # of partitions * # of replicas • [Figure: Table T1 with partitions P1–P4 and Data Nodes 1–4]
  • 91. Automatic Data Partitioning • 4 Partitions * 2 Replicas = 8 Fragments • As long as one data node in each node group is running, we have a complete copy of the data • [Figure: Table T1 with partitions P1–P4. Node Group 1: Data Node 1 holds F1 and F3, Data Node 2 holds F3 and F1. Node Group 2: Data Node 3 holds F2 and F4, Data Node 4 holds F4 and F2]
  • 94. Automatic Data Partitioning • 4 Partitions * 2 Replicas = 8 Fragments • If every data node in a node group is down: – No complete copy of the data – Cluster shuts down automatically • [Figure: Table T1 with partitions P1–P4. Node Group 1: Data Node 1 holds F1 and F3, Data Node 2 holds F3 and F1. Node Group 2: Data Node 3 holds F2 and F4, Data Node 4 holds F4 and F2]
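The survival rule from these slides (at least one running data node per node group) can be expressed as a small check. The function name and the list-of-lists representation of node groups are hypothetical, chosen only for this sketch.

```python
def cluster_has_complete_copy(node_groups, running_nodes):
    """Return True if every node group still has at least one running data node."""
    running = set(running_nodes)
    return all(any(node in running for node in group) for group in node_groups)


# Layout from the slides: 4 data nodes, 2 replicas -> 2 node groups.
NODE_GROUPS = [[1, 2], [3, 4]]

# One survivor in each node group: a complete copy of the data remains.
print(cluster_has_complete_copy(NODE_GROUPS, [1, 3]))  # True
# Node Group 2 is entirely down: no complete copy, the cluster shuts down.
print(cluster_has_complete_copy(NODE_GROUPS, [1, 2]))  # False
```

Note that losing two nodes is survivable only when they belong to different node groups.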
  • 95. Data Partitioning • Automatic distribution/partitioning – Primary Key hash value (partitioning by key) • Transparent load balancing – Distribution awareness • Data Node chosen based on PK hash value – Or proximity (SQL Node: shared memory, localhost, remote host) • Support for user-defined partitioning • Key Concepts – Partition • Horizontal • # of partitions = # of data nodes – Fragment • Copy of a partition – Replica • Complete copy of the data – Node Group • Groups data nodes (automatically) • Determined by the order in the configuration file • # of groups = # of data nodes / # of replicas • [Figure: 4 Partitions * 2 Replicas = 8 Fragments. Table T1 with partitions P1–P4. Node Group 1: Data Node 1 holds F1 and F3, Data Node 2 holds F3 and F1. Node Group 2: Data Node 3 holds F2 and F4, Data Node 4 holds F4 and F2. Legend: Px = partition, Fx = primary fragment or secondary fragment (fragment replica)]
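The key concepts above can be tied together in a short Python sketch that derives node groups, partitions and fragment placement from a list of data nodes. The formulas (# of partitions = # of data nodes, # of groups = # of data nodes / # of replicas) come from the slide; the round-robin placement rule and the function name `build_layout` are assumptions, chosen because they reproduce the layout drawn in the figure.

```python
def build_layout(data_nodes, num_replicas):
    """Derive node groups and fragment placement for a list of data nodes."""
    if len(data_nodes) % num_replicas != 0:
        raise ValueError("number of data nodes must be a multiple of the replica count")

    # Node groups follow the order of the node list (config-file order on the slide).
    groups = [
        data_nodes[i:i + num_replicas]
        for i in range(0, len(data_nodes), num_replicas)
    ]
    num_partitions = len(data_nodes)  # slide: # of partitions = # of data nodes

    layout = {}
    for p in range(num_partitions):
        group = groups[p % len(groups)]  # assumed: spread partitions over groups
        first = (p // len(groups)) % num_replicas  # assumed: rotate the primary
        ordered = group[first:] + group[:first]
        layout[f"P{p + 1}"] = {"primary": ordered[0], "secondary": ordered[1:]}
    return groups, layout


groups, layout = build_layout([1, 2, 3, 4], num_replicas=2)
fragments = sum(1 + len(entry["secondary"]) for entry in layout.values())

print(groups)     # [[1, 2], [3, 4]]
print(fragments)  # 8
for name, entry in layout.items():
    print(name, entry)
```

For four data nodes and two replicas this places the primary of P1 on Data Node 1 and of P3 on Data Node 2, with each node holding the secondary of the other's partition, and does the same for P2 and P4 on Data Nodes 3 and 4.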
  • 96. Internal Replication • Replication between Data Nodes • Synchronous Replication – To ensure minimal failover time – Data Nodes have the same information at the same point in time – Achieved by a two-phase commit protocol • Two-phase commit – 1. Prepare/update phase • All fragments (primary/secondary) get updated – 2. Commit phase • The changes are committed – Every Data Node has a Transaction Coordinator – One is elected to be the transaction coordinator – The information goes from the Transaction Coordinator (TC) to the primary fragments and then on to the secondary fragments
  • 97. Internal Replication: Prepare Phase • insert into T1 values (...) • 1. Calc hash on PK • 2. Forward request to LQH where primary fragment is • 3. Prepare secondary fragment • 4. Prepare phase done • [Figure: two data nodes, each with a Transaction Coordinator, a Local Query Handler (ACC, TUP), Index Memory and Data Memory; one node holds fragments F1 F2, the other F2 F1; numbered arrows show the message flow for the steps listed]
  • 98. Internal Replication: Commit Phase • insert into T1 values (...) • [Figure: two data nodes, each with a Transaction Coordinator, a Local Query Handler (ACC, TUP), Index Memory and Data Memory; one node holds fragments F1 F2, the other F2 F1; numbered arrows show the commit message flow]
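The prepare and commit phases can be sketched as a toy in-memory simulation. This is a minimal illustration of two-phase commit in general, not the actual data node implementation: the class `Fragment`, the function `two_phase_insert`, and the order in which replicas are committed are all assumptions made for the example.

```python
class Fragment:
    """Toy fragment replica holding prepared (pending) and committed rows."""

    def __init__(self, name):
        self.name = name
        self.prepared = {}   # txn_id -> (key, value) awaiting commit
        self.committed = {}  # key -> value visible to readers

    def prepare(self, txn_id, key, value):
        self.prepared[txn_id] = (key, value)
        return True

    def commit(self, txn_id):
        key, value = self.prepared.pop(txn_id)
        self.committed[key] = value

    def abort(self, txn_id):
        self.prepared.pop(txn_id, None)


def two_phase_insert(txn_id, key, value, primary, secondaries):
    """Coordinator logic: prepare primary then secondaries, then commit all."""
    replicas = [primary] + list(secondaries)

    # Phase 1: prepare/update every fragment replica, primary first.
    prepared = []
    for fragment in replicas:
        if not fragment.prepare(txn_id, key, value):
            for done in prepared:
                done.abort(txn_id)  # roll back what was already prepared
            return False
        prepared.append(fragment)

    # Phase 2: commit everywhere (commit order is a simplification here).
    for fragment in replicas:
        fragment.commit(txn_id)
    return True


primary = Fragment("F1 primary")
secondary = Fragment("F1 secondary")
ok = two_phase_insert(1, "pk-42", "row data", primary, [secondary])
print(ok, primary.committed == secondary.committed)
```

After a successful run both replicas hold the same committed row, which is the property the synchronous replication slides describe.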
  • 99. Transactions • Transaction Coordinator – The elected TC starts the transaction – TC calculates a hash on the primary key – Each transaction contains one or more Read, Insert, Update or Delete operations – Operations are forwarded to the LQH of the Data Node holding the data for the operation • Isolation Level – Committed Read • Can read from both primary and secondary fragments • No lock required • Update/Insert/Delete – Locks the index entry in ACC – On both primary and secondary fragments • Read exclusive/Read shared – Locks the index entry in ACC on primary and secondary fragments
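The locking rules on this slide can be illustrated with a toy lock on a single index entry. This is a simplified sketch with standard shared/exclusive semantics: the class `IndexEntryLock` and the mode names are hypothetical, writes are modeled as exclusive locks (an assumption), and a conflicting request simply returns False rather than waiting as a real lock manager would.

```python
class IndexEntryLock:
    """Toy shared/exclusive lock on one index entry; committed reads bypass it."""

    def __init__(self):
        self.shared_holders = set()
        self.exclusive_holder = None

    def try_lock(self, txn, mode):
        if mode == "committed_read":
            return True  # slide: no lock required for committed read
        if mode == "shared":
            if self.exclusive_holder not in (None, txn):
                return False  # another transaction holds the entry exclusively
            self.shared_holders.add(txn)
            return True
        if mode == "exclusive":  # read exclusive, or update/insert/delete (assumed)
            other_readers = self.shared_holders - {txn}
            if other_readers or self.exclusive_holder not in (None, txn):
                return False
            self.exclusive_holder = txn
            return True
        raise ValueError(f"unknown lock mode: {mode}")


entry = IndexEntryLock()
print(entry.try_lock("t1", "shared"))          # True
print(entry.try_lock("t2", "shared"))          # True: shared locks coexist
print(entry.try_lock("t3", "exclusive"))       # False: readers hold the entry
print(entry.try_lock("t3", "committed_read"))  # True: takes no lock
```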