Conference tutorial: MySQL Cluster as NoSQL
 


Slides from the 'MySQL Cluster as NoSQL' tutorial at Percona Live MySQL Conference 2012 in London.
Tutorial covers:
* MySQL Cluster administration
* NoSQL options for MySQL Cluster and when to use what
* Memcached (installation and configuration)
* Cluster/J
* NDBAPI
* Benchmarking of different access methods on a live cluster

    Presentation Transcript

    • MySQL Cluster and NoSQL
      December 2012
      Johan Andersson, Severalnines AB
      johan@severalnines.com
      Cell +46 73 073 60 99
    • Topics
      - RDBMS/NoSQL
      - API Overview
      - Memcached Installation
      - Configuration
      - Performance Tuning
      - Troubleshooting
      - Use Cases
      Copyright 2011 Severalnines AB
    • RDBMS vs NoSQL
      RDBMS:
      - Structure and relations are important
        - Relational schema
      - Complex queries
        - JOINs
      - ACID
      - Scalability usually not built-in
      - Durability of data on one node
      NoSQL:
      - Structure and relations not as important
        - Focus on storing/retrieving
      - Simple access
        - E.g. key-value: get(), set()
      - Eventual consistency
      - Scalability built-in
      - Durability of data guaranteed by having data on multiple nodes
      Copyright 2012 Severalnines AB
    • Introducing MySQL Cluster
      - Shared-nothing database
        - Up to 255 nodes in a cluster
        - Automatic sharding
      - In-memory or hybrid disk data storage
      - Multiple APIs
      - Availability
        - Strong consistency with synchronous replication
        - Automatic fail-over within a cluster
        - Eventual consistency between clusters
    • Architecture
    • #1 – Horizontal scalability
      - Data Nodes
        - Store the data
        - Memory or disk tables
        - Can be added online
      [Diagram: Shard 1, Shard 2, Shard 3]
    • #2 – API Not Only SQL
    • #3 – Schema
      - SQL/Relational
        - Add column
        - Add/remove index
      - Memcached prefix key value
        - Key-value <city: ldn 1>
      Configuration/mapping:
        Prefix   Table      Key-col   Val-col   policy
        city:    AreaCode   city      code      cluster
      Table 'AreaCode':
        city   …   code   …
        ldn    …   1      …
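
The prefix mapping on this slide can be illustrated with a small, self-contained sketch. It is not the NDB memcached engine: the `PrefixMapping` class, the in-memory `TABLES` stand-in, and the longest-prefix matching rule are assumptions made for illustration; only the `city:` prefix, the `AreaCode` table, and its `city`/`code` columns come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrefixMapping:
    """One row of a hypothetical prefix-to-table mapping."""
    table: str
    key_col: str
    val_col: str


# From the slide: keys starting with "city:" are served from table AreaCode.
MAPPINGS = {"city:": PrefixMapping(table="AreaCode", key_col="city", val_col="code")}

# In-memory stand-in for the NDB table 'AreaCode' (not a real cluster).
TABLES = {"AreaCode": [{"city": "ldn", "code": "1"}]}


def resolve(key):
    """Split a memcached key into (mapping, row key).

    Longest-prefix matching is an assumption of this sketch.
    """
    for prefix in sorted(MAPPINGS, key=len, reverse=True):
        if key.startswith(prefix):
            return MAPPINGS[prefix], key[len(prefix):]
    raise KeyError(f"no prefix mapping for {key!r}")


def get(key):
    """Return the value column of the matching row, or None if there is none."""
    mapping, row_key = resolve(key)
    for row in TABLES[mapping.table]:
        if row[mapping.key_col] == row_key:
            return row[mapping.val_col]
    return None


print(get("city:ldn"))  # prints 1
```

The point of the design is that the key prefix alone decides which table and columns a get() touches, so existing relational tables can be exposed without changing their schema.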
    • #4 – Data Consistency
      - Strong consistency within a cluster
      - Eventual consistency across clusters
      [Diagram: EU Cluster, US Cluster]
    • #5 – Data Storage
      - Memory tables
        - No disk checkpoints
      - Memory tables
        - With disk checkpoints
      - Disk Data tables
        - Index in memory
      - Writes not IO bound
        - Transaction durability = data written in at least 2 nodes
    • SQL Access
      - Parsing, optimizing, and executing a query takes time
      - The query plan is converted to NDBAPI calls
      - Often introduces an extra network hop: APP → MYSQLD → NDB
    • NoSQL: Memcached (new in 7.2)
      - Native key-value access (converts the memcached protocol to NDBAPI calls)
      - Bypasses SQL
      - Schema and schemaless data storage
      - By default:
        - Every KV written to the same table
        - Each KV in a single row
      - Or configure to use existing tables
      [Diagram: APP → MEMCACHED server → NDB]
    • NoSQL: REST
      - Bypasses SQL
      - Native HTTP/REST access
      - Loads in an Apache module (mod_ndb)
      [Diagram: Apache → NDB]
    • NoSQL: Cluster/J
      - Bypasses SQL
      - Native Java and JPA access
      [Diagram: AppServer (EJB) → NDB]
    • NoSQL: NDBAPI (sync)
      - C++ API supporting GET/SET/RANGE_SCAN/SCAN
      - Bypasses SQL
      - Ultra low latency
      - Hand-optimize execution path
      - Lots of freedom (also to make mistakes)
      [Diagram: NDBAPI client → NDB]
    • NoSQL: NDBAPI (sync)
      - DEFINE AND STORAGE BUFFERS (NDB RECORD)
      - START TRANSACTION
      - CREATE OPERATION (on table)
        - DEFINE OPERATION (insert/update/read/delete) – PK operation
        - GET/SET PK AND VALUES
        - <repeat these for batching or read from many tables>
      - EXECUTE (COMMIT / NO COMMIT)
      - CHECK STORAGE BUFFERS
    • NoSQL: NDBAPI (async)
      - Bypasses SQL
      - Similar to node.js, with callbacks registered and executed on completion
      - Ultra fast performance for GET/SET on PK
      - Hand-optimize execution path
      - Lots of freedom (also to make mistakes)
      - Scales with number of threads and number of apps
      [Diagram: NDBAPI client → NDB]
    • NoSQL: NDBAPI (async)
      - DEFINE AND STORAGE BUFFERS (NDB RECORD)
      - PREPARE TRANSACTION
        - ASSIGN A CALLBACK
        - CREATE OPERATION (on table)
        - DEFINE OPERATION (insert/update/read/delete) – PK operation
        - GET/SET PK AND VALUES
        - <repeat these for batching or read from many tables>
      - <repeat and PREPARE up to 1024 TXs>
      - SEND to NDB
      - POLL for CALLBACKs
        - Executes callbacks, and PREPARE a new TX if you want
        - CHECK STORAGE BUFFERS
    • NoSQL: NDBAPI (async)
      - Using the async NDBAPI, Oracle managed to get 1.05 billion queries per minute
        - flexAsync -a 25 -p 128 -t <cores> -l <iterations>
        - 8 data nodes (48GB of RAM)
        - 10 API nodes
        - Intel X5670 (2 CPU x 6 cores)
        - Infiniband (IPoIB)
    • Introduction
      - Memcached access to NDB is included in MySQL Cluster 7.2
      - Provides a Memcached interface to NDB data
        - Uses get/set to read and write data
        - Avoids SQL altogether (except for creating tables)
      - There are several “run-time” models that can be configured
        - They mainly affect placement of data
    • Introduction
      - Memcached uses the NDBAPI (C++ direct API) to access data in NDB
      - There are two flavors of the NDBAPI
        - Synchronous NDBAPI
        - Asynchronous NDBAPI
      - Memcached uses the asynchronous NDBAPI
    • Introduction
      [Diagram: “S” scheduler, worker threads, ndb objects, Ndb_cluster_connection]
    • Introduction
      Synchronous:
      - Start transaction
      - Create op
      - Set op type
      - Bind keys/values
      - Execute
        - Send request to NDB
      - Check result
      Asynchronous:
      - Start transaction
      - Associate callback
      - Create op
      - Set op type
      - Bind keys/values
      - Prepare transaction
      - Send
        - Send request to NDB
      - Poll
        - Check callbacks
    • Introduction
      - Asynchronous invocation gives
        - Higher degree of parallelism, up to 1024 transactions in flight from each NDB object
        - Fewer threads needed to drive load
        - Both thread and transaction parallelism in one shot!
        - Harder programming model
      - Synchronous invocation gives
        - Easy programming model
        - One thread does one transaction at a time, so less parallelism
        - Many threads needed to drive high load
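
The difference between the two invocation styles can be shown with a toy model. This is a sketch under stated assumptions, not the NDBAPI: `FakeNdb`, `execute_read`, `prepare_read`, and `send_and_poll` are hypothetical names, and a "round trip" is just a counter. Only the prepare/send/poll flow and the limit of 1024 transactions in flight come from the slides.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_IN_FLIGHT = 1024  # per-object limit quoted on the slide

Callback = Callable[[int, Optional[str]], None]


class FakeNdb:
    """In-memory stand-in for an NDB object; counts simulated round trips."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self.rows = rows
        self.round_trips = 0
        self._prepared: List[Tuple[int, Callback]] = []

    def execute_read(self, pk: int) -> Optional[str]:
        """Synchronous style: every transaction costs one round trip."""
        self.round_trips += 1
        return self.rows.get(pk)

    def prepare_read(self, pk: int, callback: Callback) -> None:
        """Asynchronous style, step 1: queue a transaction with its callback."""
        if len(self._prepared) >= MAX_IN_FLIGHT:
            raise RuntimeError("too many prepared transactions")
        self._prepared.append((pk, callback))

    def send_and_poll(self) -> int:
        """Asynchronous style, step 2: send the whole batch, then run callbacks."""
        self.round_trips += 1
        batch, self._prepared = self._prepared, []
        for pk, callback in batch:
            callback(pk, self.rows.get(pk))
        return len(batch)


rows = {uid: f"user{uid}" for uid in range(1, 101)}

# Synchronous: one thread, one transaction at a time.
sync_ndb = FakeNdb(rows)
sync_results = {uid: sync_ndb.execute_read(uid) for uid in rows}

# Asynchronous: prepare 100 transactions, send once, collect via callbacks.
async_ndb = FakeNdb(rows)
async_results: Dict[int, Optional[str]] = {}
for uid in rows:
    async_ndb.prepare_read(uid, lambda pk, value: async_results.__setitem__(pk, value))
completed = async_ndb.send_and_poll()

print(sync_ndb.round_trips, async_ndb.round_trips)  # prints 100 1
```

In this model a single thread keeps many transactions in flight, which is why the asynchronous style needs fewer threads to drive the same load, at the price of callback-based control flow.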
    • Introduction
      - Memcached supported operations
        - GET / MULTI GET
        - SET
        - ADD
        - REPLACE
        - CAS
        - INCR
        - DECR
    • Installation
      - The memcached server is included in the MySQL Cluster distributions
        - <basedir>/bin/memcached
      - Memcached also requires a plugin, which is included in the distribution as well
        - <basedir>/lib/ndb_engine.so or /usr/lib64/ndb_engine.so
      - It requires a connect string to be able to join the NDB Cluster
      - Plus the “normal” memcached options
        - port, bind-address, etc.
    • Installation
      - Starting Memcached can be done as follows:
          memcached -p11211 -E <basedir>/lib/ndb_engine.so -u nobody -d -l 127.0.0.1 -e connectstring=127.0.0.1:1186
      - Options:
          -l   bind-address
          -u   user
          -d   daemon
          -e   connectstring and more NDB options
          -E   specifies a memcached plugin
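
When memcached is started from a deployment script, the command line above can be assembled programmatically. The helper below is a minimal sketch: `memcached_argv` and its parameter names are hypothetical, and it only uses the flags listed on the slide. The base directory `/usr/local/mysql` is the one used in the exercises.

```python
import shlex
from typing import List


def memcached_argv(basedir: str, connectstring: str, port: int = 11211,
                   bind_address: str = "127.0.0.1", user: str = "nobody",
                   daemon: bool = True) -> List[str]:
    """Build the argument vector for memcached with the NDB engine plugin."""
    argv = [
        f"{basedir}/bin/memcached",
        f"-p{port}",                            # listening port
        "-E", f"{basedir}/lib/ndb_engine.so",   # memcached plugin
        "-u", user,                             # user to run as
    ]
    if daemon:
        argv.append("-d")                       # run as a daemon
    argv += ["-l", bind_address,                # bind address
             "-e", f"connectstring={connectstring}"]  # NDB options
    return argv


argv = memcached_argv("/usr/local/mysql", "127.0.0.1:1186")
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv)` unchanged; building a list rather than a single string avoids shell quoting problems in the `-e` option, which may contain semicolons.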
    • Installation
      - Before we can start memcached we must sanity check NDB Cluster
        - memcached will by default make two connections to the Data Nodes
        - This is the same as --ndb-cluster-connection-pool=2
      [Diagram: memcached server, P0, P1, S1, S0]
    • Installation
      - Use ndb_mgm to check for free slots
          ndb_mgm> show
          Connected to Management Server at: localhost:1186
          Cluster Configuration
          ---------------------
          [ndbd(NDB)] 2 node(s)
          id=3 @10.176.129.89 (mysql-5.5.27 ndb-7.2.8, Nodegroup: 0, Master)
          id=4 @10.178.0.69 (mysql-5.5.27 ndb-7.2.8, Nodegroup: 0)
          [ndb_mgmd(MGM)] 2 node(s)
          id=1 @10.176.131.164 (mysql-5.5.27 ndb-7.2.8)
          id=2 @10.177.67.255 (mysql-5.5.27 ndb-7.2.8)
          [mysqld(API)] 27 node(s)
          id=5 @10.176.131.164 (mysql-5.5.27 ndb-7.2.8)
          id=6 @10.176.131.164 (mysql-5.5.27 ndb-7.2.8)
          id=7 @10.176.131.165 (mysql-5.5.27 ndb-7.2.8)
          id=8 @10.176.131.165 (mysql-5.5.27 ndb-7.2.8)
    • Installation
      - In the previous example you must add at least two “slots”
        - Change config.ini and add:
            [mysqld]
            [mysqld]
        - Perform a rolling restart
          - Stop and start the management servers one at a time
          - Stop and start one data node at a time
          - Stop and start the mysql servers one at a time
    • Installation
      - Now we can connect!
          ndb_mgm> show
          Connected to Management Server at: localhost:1186
          Cluster Configuration
          ---------------------
          [ndbd(NDB)] 2 node(s)
          id=3 @10.176.129.89 (mysql-5.5.27 ndb-7.2.8, Nodegroup: 0, Master)
          id=4 @10.178.0.69 (mysql-5.5.27 ndb-7.2.8, Nodegroup: 0)
          [ndb_mgmd(MGM)] 2 node(s)
          id=1 @10.176.131.164 (mysql-5.5.27 ndb-7.2.8)
          id=2 @10.177.67.255 (mysql-5.5.27 ndb-7.2.8)
          [mysqld(API)] 27 node(s)
          id=5 @10.176.131.164 (mysql-5.5.27 ndb-7.2.8)
          id=6 @10.176.131.164 (mysql-5.5.27 ndb-7.2.8)
          id=7 @10.176.131.165 (mysql-5.5.27 ndb-7.2.8)
          id=8 @10.176.131.165 (mysql-5.5.27 ndb-7.2.8)
          id=9 (not connected, accepting connect from any host)
          id=10 (not connected, accepting connect from any host)
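
Checking for free slots can be scripted by scanning the `show` output for API entries marked "not connected". The function below is a sketch: `free_api_slots` is a hypothetical helper, and the sample text is abbreviated from the slide, so the exact whitespace of real `ndb_mgm` output may differ.

```python
import re
from typing import List


def free_api_slots(show_output: str) -> List[int]:
    """Return the node ids in the [mysqld(API)] section that are not connected."""
    free = []
    in_api_section = False
    for raw_line in show_output.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            # A new section header; remember whether it is the API section.
            in_api_section = line.startswith("[mysqld(API)]")
            continue
        if in_api_section:
            match = re.match(r"id=(\d+)\s+\(not connected", line)
            if match:
                free.append(int(match.group(1)))
    return free


SAMPLE = """\
[ndbd(NDB)] 2 node(s)
id=3 @10.176.129.89 (mysql-5.5.27 ndb-7.2.8, Nodegroup: 0, Master)
id=4 @10.178.0.69 (mysql-5.5.27 ndb-7.2.8, Nodegroup: 0)
[mysqld(API)] 27 node(s)
id=8 @10.176.131.165 (mysql-5.5.27 ndb-7.2.8)
id=9 (not connected, accepting connect from any host)
id=10 (not connected, accepting connect from any host)
"""

print(free_api_slots(SAMPLE))  # prints [9, 10]
```

Since memcached makes two connections by default, a script could refuse to start it unless this list has at least two entries.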
    • Installation
      - But wait! We need to install the ndb_memcached schema!
        - Only needed the first time
        - Tables are stored in NDB
      - Defined in the file:
        - <basedir>/share/memcache-api/ndb_memcache_metadata.sql
      - Install it with:
          mysql -uroot -p < <basedir>/share/memcache-api/ndb_memcache_metadata.sql
    • Exercise 1
      - Install the schema
        - /usr/local/mysql/share/memcache-api/ndb_memcache_metadata.sql
      - Start memcached
        - The management server is listening on 127.0.0.1
        - Use port 11211
        - Use bind address 127.0.0.1
        - Don’t use the daemon option
        - Basedir = /usr/local/mysql/
      - Verify using the management client:
          ndb_mgm -e "show"
    • Troubleshooting
      - Common errors:
        - bind(): Cannot assign requested address
          - Wrong bind address
        - Hanging on “Contacting primary management server (..) ...”
          - Wrong ndb-connectstring
      - Success:
        - done [0.759 sec
    • Configuration
      - One of the key benefits of Memcached is that it can be used in multiple ways:
        - Store data in NDB only
        - Store data in NDB and cache in Memcached
        - Cache only in Memcached
      - An existing data model can also be presented to Memcached
        - This requires a bit of setup to create mappings for the tables being exposed to Memcached
        - Let’s do it now!
    • Configuration
      - Consider the following table. Goals:
        - Expose it to memcached
        - Read/write to it
        - Make two configurations: NDB only and NDB + caching

          create table users(
            uid integer auto_increment primary key,
            name varchar(255),
            email varchar(255),
            view_cnt bigint unsigned default 0,
            created bigint unsigned default 0,
            json_data varbinary(12000)
          ) engine = ndb;
    • Concepts
      - Memcached uses two important concepts
        - CONTAINERS (table ndbmemcache.containers)
        - KEY_PREFIXES (table ndbmemcache.key_prefixes)
      - CONTAINERS
        - Specifies which tables, columns in the tables, keys, etc. are exposed
      - KEY_PREFIXES
        - Specifies key bindings and roles (e.g. whether data should be in NDB only)
    • Containers Table
      - DESC containers;
        - name: container name (PK)
        - db_schema: database where db_table is stored
        - db_table: name of the database table
        - key_columns: the columns mapping to the memcached key
        - value_columns: the columns that map to the memcached value
        - flags: not implemented
        - increment_column: for INCR / DECR, must be BIGINT UNSIGNED
        - cas_column: for CAS, must be BIGINT UNSIGNED
        - expire_time_column: not implemented
        - large_values_table
    • Container for Users
      - name = users_container
      - db_schema = test
      - db_table = users
      - key_columns = uid
      - value_columns = name,email,created,view_cnt,json_data
      - flags = 0
      - increment_column = NULL
      - cas_column = NULL
      - expire_time_column = NULL
      - large_values_table = NULL

        Table being mapped:

          create table users(
            uid integer auto_increment primary key,
            name varchar(255),
            email varchar(255),
            view_cnt bigint unsigned default 0,
            created bigint unsigned default 0,
            json_data varbinary(12000)
          ) engine = ndb;

        Container definition:

          INSERT INTO containers(name, db_schema, db_table, key_columns, value_columns,
              flags, increment_column, cas_column, expire_time_column, large_values_table)
          VALUES ('users_container', 'test', 'users', 'uid',
              'name,email,created,view_cnt,json_data', 0, NULL, NULL, NULL, NULL);
    • Key_prefixes Table
      - DESC key_prefixes;
        - server_role_id: id referencing the memcache_server_roles table
        - key_prefix: memcache search key prefix (e.g. 'myid:')
        - cluster_id: id referencing the ndb_clusters table
        - policy: name referencing the cache_policies table
        - container: name referencing containers.name
      - We will now explore the referenced tables and see what they contain.
    • Key_prefixes for Users
      - server_role_id = 1 /* db-only; must match how memcached is started */
      - key_prefix = 'user:'
      - cluster_id = 0
      - policy = 'ndb-only'
      - container = 'users_container'

          INSERT INTO key_prefixes(server_role_id, key_prefix, cluster_id, policy, container)
          VALUES (1, 'user:', 0, 'ndb-only', 'users_container');
    • Exercise 2
      - Create the 'users' table in database 'test'
      - Create the Container (use ndbmemcache)
      - Create the Key_prefix
      - Stop memcached, some options:
        - killall -15 memcached
        - ctrl-c
        - killall -9 memcached
      - Start memcached:
          /usr/local/mysql/bin/memcached -p11211 -E /usr/local/mysql/lib/ndb_engine.so -unobody -e "connectstring=127.0.0.1;role=db-only"
      - What happens?
    • Troubleshooting
      - Common errors:
        - Specified a column that does not exist:
          - 'Invalid column "test.users.view_cnt"', followed by a seg fault
        - The same column has been specified twice in the Container:
          - createRecord() failure: Duplicate column specification in NdbDictionary::RecordSpecification
        - Mismatch between containers.name and key_prefixes.container:
          - "users_containerxx" NOT FOUND in database.
      - Fixing the problem:
        - DELETE FROM key_prefixes …;
        - DELETE FROM containers …;
    • Exercise 3
      - Insert a record into the users table:
          mysql -uroot -ppassword

          insert into users(name, email, view_cnt, created, json_data)
          values ('johan', 'johan@severalnines.com', 0, unix_timestamp(now()),
              "{messages: [msg1, msg2]}");
      - telnet localhost 11211
          get user:1
          incr user:1 1
          get user:1
      - Do you get what you expect?
    • view_cnt
      Container:
      - name = users_view_cnt_container
      - db_schema = test
      - db_table = users
      - key_columns = uid
      - value_columns = NULL
      - flags = 0
      - increment_column = view_cnt
      - cas_column = NULL
      - expire_time_column = NULL
      - large_values_table = NULL
      Key_prefixes:
      - server_role_id = 1
      - key_prefix = 'user_view_cnt:'
      - cluster_id = 0
      - policy = 'ndb-only'
      - container = 'users_view_cnt_container'

          INSERT INTO containers(name, db_schema, db_table, key_columns, value_columns,
              flags, increment_column, cas_column, expire_time_column, large_values_table)
          VALUES ('users_view_cnt_container', 'test', 'users', 'uid', NULL,
              0, 'view_cnt', NULL, NULL, NULL);

          INSERT INTO key_prefixes(server_role_id, key_prefix, cluster_id, policy, container)
          VALUES (1, 'user_view_cnt:', 0, 'ndb-only', 'users_view_cnt_container');
    • Exercise 4
      - Create the Container and Key_prefix for view_cnt.
      - telnet localhost 11211
          get user:1
          incr user_view_cnt:1 1
          incr user_view_cnt:1 1000
          decr user_view_cnt:1 100
          get user:1
      - Do you get what you expect?
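
The telnet session can also be driven from a script by speaking the memcached text protocol directly. The helpers below are a minimal sketch with hypothetical function names. They only build request bytes and parse reply bytes, so they can be tried without a running server. The sample reply is made up for illustration and does not show how NDB formats the value columns.

```python
from typing import Dict, Optional


def build_get(key: str) -> bytes:
    """Request line for a get; text protocol commands are lowercase."""
    return f"get {key}\r\n".encode()


def build_incr(key: str, delta: int) -> bytes:
    """Request line for an incr by a non-negative delta."""
    return f"incr {key} {delta}\r\n".encode()


def parse_get_response(data: bytes) -> Dict[str, bytes]:
    """Parse 'VALUE <key> <flags> <bytes>' blocks terminated by 'END'."""
    values: Dict[str, bytes] = {}
    pos = 0
    while True:
        end = data.index(b"\r\n", pos)
        header = data[pos:end]
        if header == b"END":
            return values
        parts = header.split()
        if len(parts) < 4 or parts[0] != b"VALUE":
            raise ValueError(f"unexpected reply line: {header!r}")
        key, length = parts[1].decode(), int(parts[3])
        start = end + 2
        values[key] = data[start:start + length]
        pos = start + length + 2  # skip the data block and its trailing CRLF


def parse_incr_response(data: bytes) -> Optional[int]:
    """Return the new counter value, or None if the key was not found."""
    line = data.split(b"\r\n", 1)[0]
    if line == b"NOT_FOUND":
        return None
    if not line.isdigit():
        raise ValueError(f"unexpected reply line: {line!r}")
    return int(line)


# Made-up reply bytes, for illustration only.
reply = b"VALUE user:1 0 5\r\nhello\r\nEND\r\n"
print(parse_get_response(reply))
print(parse_incr_response(b"1001\r\n"))
```

To talk to a live server, the request bytes would be written to a TCP socket connected to the memcached port (11211 in the exercises) and the reply read back before parsing.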
    • Recap
      - One Container must be set up for each operation you want to do:
        - Write/read whole record
        - INCR/DECR
        - CAS
        - Etc.
      - One Key_prefix must be set up for each Container.
    • Accessing the Data
      - There are many client interfaces to memcached:
        - libmemcached (C/C++)
        - PECL/memcached (PHP)
        - PHP/libmemcached (PHP)
        - Spymemcached (Java)
        - Python-memcached (Python)
        - Cache::Memcached::Fast (Perl)
        - Telnet
    • TELNET
      - Telnet can be used to access data stored in memcached:
          telnet localhost 11211
          get user:1
    • Caching Policies
      - Read-only/read-mostly data can be cached in the Memcached server
      [Diagram: CLIENT, P0, P1, S1, S0]
    • Caching Policies – Setup
      - A new Key_prefix must be created:
        - server_role_id = 3 /* caching; must match how memcached is started */
        - key_prefix = 'user_cache:'
        - cluster_id = 0
        - policy = 'caching'
        - container = 'users_container'

          INSERT INTO key_prefixes(server_role_id, key_prefix, cluster_id, policy, container)
          VALUES (3, 'user_cache:', 0, 'caching', 'users_container');
    • Caching Policies
      - GET
        - Read data from the cache if it exists there
        - Otherwise read data from NDB and populate the cache
      - STORE
        - Write data to the cache AND to NDB
        - Overwrites existing data in the cache
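
The GET and STORE rules of the caching policy amount to a read-through, write-through cache. The class below is a sketch of that behaviour only: `CachingStore` is a hypothetical name and a plain dict stands in for NDB, so it says nothing about how the memcached engine implements or invalidates its cache.

```python
from typing import Dict, Optional


class CachingStore:
    """Read-through / write-through cache in front of a backing store."""

    def __init__(self, ndb: Dict[str, str]) -> None:
        self.ndb = ndb                    # stand-in for NDB, not a real cluster
        self.cache: Dict[str, str] = {}
        self.ndb_reads = 0                # counts reads that had to go to NDB

    def get(self, key: str) -> Optional[str]:
        # GET: serve from the cache if present ...
        if key in self.cache:
            return self.cache[key]
        # ... otherwise read from NDB and populate the cache.
        self.ndb_reads += 1
        value = self.ndb.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def store(self, key: str, value: str) -> None:
        # STORE: write to NDB and to the cache, overwriting any cached value.
        self.ndb[key] = value
        self.cache[key] = value


store = CachingStore({"user_cache:1": "johan"})
store.get("user_cache:1")   # miss: read from NDB, cache populated
store.get("user_cache:1")   # hit: served from the cache
store.store("user_cache:1", "johan a")
print(store.get("user_cache:1"), store.ndb_reads)  # prints johan a 1
```

In this model, rows changed behind the cache's back (for example through SQL) would not be seen by a cache hit, which is consistent with the slides recommending this policy for read-only or read-mostly data.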
    • Performance Tuning
      - Tunables are few
        - NDB Cluster connections can be set to 0-4 (0 means it will “figure it out”)
        - Send timeout, 1-10 (ms), default 1 (ms)
        - Force send, on or off (1 or 0), default off
      - Set with scheduler options:
          memcached -e "…;S:c1,t1,f1"
        - This would set:
          - Ndb_cluster_connections = 1
          - Send timeout = 1 (ms)
          - Force send = on (1)
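
A scheduler option string such as `S:c1,t1,f1` can be decoded with a few lines of code, which is handy when comparing benchmark runs. The parser below is a sketch: `parse_scheduler_options` and the dictionary keys are hypothetical names, the value ranges are the ones quoted on the slide, and letters the slide does not explain are passed through unchanged.

```python
from typing import Dict, Union

# Letter -> descriptive name (names are this sketch's own choice).
NAMES = {"c": "connections", "t": "send_timeout_ms", "f": "force_send"}
# Allowed ranges as quoted on the slide.
RANGES = {"c": (0, 4), "t": (1, 10), "f": (0, 1)}


def parse_scheduler_options(spec: str) -> Dict[str, Union[int, bool]]:
    """Decode 'S:c1,t1,f1' (the 'S:' prefix is optional) into a dict."""
    if spec.startswith("S:"):
        spec = spec[2:]
    options: Dict[str, Union[int, bool]] = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        letter, value = item[0], int(item[1:])
        if letter in RANGES:
            low, high = RANGES[letter]
            if not low <= value <= high:
                raise ValueError(f"{letter}={value} outside {low}-{high}")
        # Unknown letters keep their letter as the key.
        options[NAMES.get(letter, letter)] = value
    if "force_send" in options:
        options["force_send"] = bool(options["force_send"])
    return options


print(parse_scheduler_options("S:c1,t1,f1"))
```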
    • Shoot out
      - Host A and B: data node (ndbmtd)
      - Host C: MySQL Server or Memcached
      - Host C: Application
      - Users table with 10000 records
      - Get User based on UID. Queries:
        - SELECT name,email,created,view_cnt,json_data FROM users WHERE uid=<random int 1-10000>
        - GET user:<random int 1-10000>
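
The access pattern of the shoot out (random primary-key reads over 10000 users) can be reproduced with a small driver. This is a single-threaded sketch and not the `bencher` tool used in the talk: `random_user_key` and `measure_reads_per_sec` are hypothetical helpers, and the `read` callable is whatever client call is being measured.

```python
import random
import time
from typing import Callable


def random_user_key(rng: random.Random, n_users: int = 10000) -> str:
    """Key for a random user, matching 'GET user:<random int 1-10000>'."""
    return f"user:{rng.randint(1, n_users)}"


def measure_reads_per_sec(read: Callable[[str], object], n_requests: int,
                          seed: int = 0) -> float:
    """Issue n_requests random reads and return the achieved reads/sec."""
    rng = random.Random(seed)
    start = time.perf_counter()
    for _ in range(n_requests):
        read(random_user_key(rng))
    elapsed = time.perf_counter() - start
    return n_requests / elapsed if elapsed > 0 else float("inf")


# Stand-in for a real client call; replace with a memcached or SQL read.
fake_table = {f"user:{uid}": f"name{uid}" for uid in range(1, 10001)}
rate = measure_reads_per_sec(fake_table.get, 50000)
print(f"{rate:.0f} reads/sec against the in-memory stand-in")
```

Numbers from a driver like this depend heavily on the client language and threading model, so they are only comparable between runs that use the same driver.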
    • Shoot out (reads/sec)
      Access method         4 threads / 4 NDB   8 threads / 8 NDB   16 threads / 8 NDB
      SQL (Python)**        1616                1376                ??
      SQL (C)               3808                5712                9312
      MEMCACHE (Python)     3076                5516*               6944*
      MEMCACHE (C++)        3300                7096*               14632*
      NDBAPI (C++, sync)    5500                10425               15500
      *) At most 4 ndb_cluster_connections are possible from MEMCACHED
      **) Connector/Python was used
      - Averages measured over three runs
      - For all C/C++ tests, bencher was used to drive load
      - Threading in Python doesn’t seem to be great
    • Recommendation
      - Tuning the Memcached scheduler options makes a difference:
        - Scheduler: starting for 1 cluster; c4,f0,g1,t1
          - 10500 reads/sec
        - Scheduler: starting for 1 cluster; c4,f1,g1,t1
          - 14632 reads/sec
        - Scheduler: starting for 1 cluster; c2,f1,g1,t1
          - 11000 reads/sec
      - Set:
        - Scheduler option: f1 (force send = on)
        - Scheduler option: c4 (4 ndb cluster connections)
        - Memcached option: -t <no workers>, set depending on the number of clients you need
        - If you have many workers (>= 128), try force send = off
    • THANK YOU!
      - Twitter: @severalnines
      - Facebook: www.facebook.com/severalnines
      - Slideshare: www.slideshare.net/severalnines
      - Linked-in: www.linkedin.com/company/severalnines
      - Web site: www.severalnines.com
      - Mail: johan@severalnines.com