Using Q4M - a message queue for MySQL #osdc.tw

  1. Using Q4M: a message queue storage engine for MySQL (Kazuho Oku)
  2. Who am I?
     - Name: Kazuho Oku (奥一穂)
     - Original developer of Palmscape / Xiino, the oldest web browser for Palm OS
     - Worked at Cybozu Labs (research subsidiary of Cybozu, Inc. in Japan) in 2005-2010
     - Developed Japanize / Mylingual, Q4M, etc.
     - Now working at DeNA Co., Ltd. with the developers of the HandlerSocket plugin
     (Mar 26 2011, Using Q4M)
  3. What is Q4M?
     - A message queue that runs as a storage plugin of MySQL 5.1
     - Why is it a MySQL plugin?
       - accessible by using existing MySQL clients: no need for a new client library
       - administrable by using SQL: friendly to DB admins
  4. Design Goals of Q4M
     - Robust: does not lose data on OS crash or power failure (necessary for Tokyo without nuclear power plants… orz)
     - Fast: transfers thousands of messages per second
     - Easy to use: use SQL for access / maintenance
     - Integrated into MySQL: no more separate daemons to take care of
  5. Users of Q4M: many leading web services in Japan
     - DeNA Co., Ltd.
     - livedoor Co., Ltd.
     - mixi, Inc.
     - Zynga Japan (formerly Unoh, Inc.)
  6. Agenda
     - What an MQ is in general
     - Applications of Q4M
     - Brief tutorial
  7. What is a Message Queue?
  8. What is a Message Queue?
     - Middleware for persistent asynchronous communication
       - communicates between fixed pairs (parties)
       - a.k.a. Message Oriented Middleware
     - An MQ is intermediate storage; an RDBMS is persistent storage
     - Senders / receivers may go down
  9. Minimal Configuration of an MQ
     - Senders and receivers access a single queue
     - (Diagram: Sender → Queue → Receiver)
  10. MQ and Relays
      - Separate queues for sender and receiver
      - Messages are relayed between the queues
      - (Diagram: Sender → Queue → Relay → Queue → Receiver)
  11. Merits of Message Relays
      - Destination can be changed easily: relays may transfer messages to different locations depending on their headers
      - Robustness against network failure: no loss or duplicates when the relay fails
      - Logging, multicasting, etc.
  12. Message Brokers
      - Publish / subscribe model
      - Separation between components and their integration
        - Components read / write to predefined queues
        - Integration is the definition of routing rules between the message queues
      - Messages are often transformed (filtered) within the relay agent
  13. What about Q4M?
      - Q4M itself is a message queue
      - Q4M instances can be connected to create a message relay
      - Provides an API for creating message relays and brokers
  14. Performance of Q4M
      - Over 7,000 messages/sec. (message size: avg. 512 bytes, syncing to disk)
      - Outperforms most needs; if you need more, just scale out
      - Can coexist with other storage engines without sacrificing their performance
      - see http://labs.cybozu.co.jp/blog/kazuhoatwork/2008/06/q4m_06_release_and_benchmarks.php
  15. Applications of Q4M
  16. Asynchronous Updates
      - DeNA uses Q4M for sending notifications to users asynchronously
      - http://engineer.dena.jp/2010/03/dena-technical-seminar-1-2.html
  17. Delay Peak Demands
      - mixi (one of the largest SNSs in Japan) uses Q4M to buffer writes to the DB, to delay peak demands
      - from http://alpha.mixi.co.jp/blog/?p=272
  18. Connecting Distant Servers
      - Pathtraq uses Q4M to create a relay between its database and content analysis processes
      - (Diagram: Pathtraq DB ⇄ Content Analysis Processes over a MySQL connection with SSL and gzip; → contents to be analyzed, ← results of the analysis)
  19. Prefetch Data
      - livedoor Reader (web-based feed aggregator) uses Q4M to prefetch data from the database to memcached
      - uses Q4M for scheduling web crawlers as well
      - from http://d.hatena.ne.jp/mala/20081212/1229074359
  20. Scheduling Web Crawlers
      - Web crawlers with retry-on-error
      - Sample code included in the Q4M distribution
      - If a fetch fails, the URL is stored in the retry queue
      - (Diagram labels: Request Queue, Read URL, Spiders, Store Result, URL DB, Retry Queue, Re-scheduler)
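The retry-on-error flow on this slide can be sketched with two in-memory queues standing in for the Q4M tables. This is not the sample code shipped with Q4M; the names `crawl_once`, `reschedule`, and `fetch` are hypothetical, and a real re-scheduler would probably wait before putting a URL back.

```python
from collections import deque


def crawl_once(request_queue, retry_queue, url_db, fetch):
    """Take one URL from the request queue; store the result or queue a retry."""
    url = request_queue.popleft()          # "Read URL"
    try:
        url_db[url] = fetch(url)           # "Store Result"
    except OSError:
        retry_queue.append(url)            # failed to fetch: store URL in retry queue


def reschedule(request_queue, retry_queue):
    """Move every URL waiting in the retry queue back to the request queue."""
    while retry_queue:
        request_queue.append(retry_queue.popleft())
```

Here `fetch` is any callable that returns the page body or raises `OSError`; with Q4M, each deque would presumably be a separate queue table.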
  21. Delayed Content Generation
      - Hatetter (RSS feed-to-Twitter-API gateway) uses Q4M to delay content generation
      - Source code: github.com/yappo/website-hatetter
  22. Installing Q4M
  23. Installing Q4M
      - Compatible with MySQL 5.1
      - Download from q4m.github.com
        - Binary releases available for some platforms
      - Installing from source:
        - requires the source code of MySQL
        - ./configure && make && make install
        - run support-files/install.sql
  24. Configuration Options of Q4M
      - --with-sync=no|fsync|fdatasync|fcntl
        - Controls synchronization to disk
        - default: fdatasync on Linux
      - --enable-mmap
        - mmap'ed reads lead to higher throughput
        - default: yes
      - --with-delete=pwrite|msync
        - msync recommended on Linux >= 2.6.20 if you need really high performance
  25. Q4M Basics
  26. The Model
      - Various publishers write to the queue
      - A set of subscribers consumes the entries in the queue
      - (Diagram: Publishers → Q4M table → Subscribers)
  27. Creating a Q4M Table
      - ENGINE=QUEUE creates a Q4M table
      - No primary keys or indexes
      - Sorted by insertion order (it's a queue)

          mysql> CREATE TABLE qt (
              ->   id int(10) unsigned NOT NULL,
              ->   message varchar(255) NOT NULL
              -> ) ENGINE=QUEUE;
          Query OK, 0 rows affected (0.42 sec)
  28. Modifying Data on a Q4M Table
      - No restrictions for INSERT and DELETE
      - No support for UPDATE

          mysql> INSERT INTO qt (id,message)
              -> VALUES
              -> (1,'Hello'),
              -> (2,'Bonjour'),
              -> (3,'Hola');
          Query OK, 3 rows affected (0.02 sec)

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          3 rows in set (0.00 sec)
  29. SELECT from a Q4M Table
      - Works the same as other storage engines
      - SELECT COUNT(*) is cached
      - How to subscribe to a queue?

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          3 rows in set (0.00 sec)

          mysql> SELECT COUNT(*) FROM qt;
          +----------+
          | COUNT(*) |
          +----------+
          |        3 |
          +----------+
          1 row in set (0.00 sec)
  30. Calling queue_wait()
      - After calling, only one row becomes visible from the connection

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          3 rows in set (0.00 sec)

          mysql> SELECT queue_wait('qt');
          +------------------+
          | queue_wait('qt') |
          +------------------+
          |                1 |
          +------------------+
          1 row in set (0.00 sec)

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          +----+---------+
          1 row in set (0.00 sec)
  31. OWNER Mode and NON-OWNER Mode
      - In OWNER mode, only the OWNED row is visible
        - the OWNED row becomes invisible from other connections
        - rows of other storage engines are visible
      - (Diagram: NON-OWNER mode sees 1,Hello / 2,Bonjour / 3,Hola; queue_wait() switches to OWNER mode, which sees only 1,Hello; queue_end() and queue_abort() switch back to NON-OWNER mode)
  32. Returning to NON-OWNER mode
      - By calling queue_abort, the connection returns to NON-OWNER mode

          mysql> SELECT QUEUE_ABORT();
          +---------------+
          | QUEUE_ABORT() |
          +---------------+
          |             1 |
          +---------------+
          1 row in set (0.00 sec)

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          3 rows in set (0.01 sec)
  33. Consuming a Row
      - By calling queue_end, the OWNED row is deleted, and the connection returns to NON-OWNER mode

          mysql> SELECT queue_wait('qt');
          (snip)
          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          +----+---------+
          1 row in set (0.01 sec)

          mysql> SELECT queue_end();
          +-------------+
          | queue_end() |
          +-------------+
          |           1 |
          +-------------+
          1 row in set (0.01 sec)

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          2 rows in set (0.00 sec)
  34. Writing a Subscriber
      - Call two functions: queue_wait, queue_end
      - Multiple subscribers can be run concurrently
        - each row in the queue is consumed only once

          while (true) {
            SELECT queue_wait('qt');     # switch to owner mode
            rows := SELECT * FROM qt;    # obtain data
            if (count(rows) != 0)        # if we have any data, then
              handle_row(rows[0]);       # consume the row
            SELECT queue_end();          # erase the row from queue
          }
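One iteration of the subscriber loop above might look like this in Python, written against a generic DB-API 2.0 connection. It is a sketch under assumptions: the function name `consume_one` is hypothetical, the `%s` placeholder style assumes a driver such as PyMySQL or mysqlclient, and the table name is assumed to be passed to queue_wait as a string. The queue_abort call on failure is an addition to the slide's loop, so that a row whose handler raised is released instead of deleted.

```python
def consume_one(conn, table, handle_row):
    """Run one subscriber iteration; return True if a row was consumed."""
    cur = conn.cursor()
    try:
        # Switch to OWNER mode; the result is 1 if a row is owned, 0 on timeout.
        cur.execute("SELECT queue_wait(%s)", (table,))
        if not cur.fetchone()[0]:
            return False
        # In OWNER mode only the owned row is visible. The table name cannot be
        # a bound parameter here, so it must come from trusted code.
        cur.execute("SELECT * FROM " + table)
        row = cur.fetchone()
        try:
            handle_row(row)
        except Exception:
            # Release (do not delete) the row so it can be consumed again.
            cur.execute("SELECT queue_abort()")
            cur.fetchone()
            raise
        # Erase the owned row and return to NON-OWNER mode.
        cur.execute("SELECT queue_end()")
        cur.fetchone()
        return True
    finally:
        cur.close()
```

A worker would call `consume_one(conn, "qt", handler)` in a `while True:` loop; several such workers can run concurrently because each row is owned by only one connection at a time.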
  35. Writing a Subscriber (contd)
      - Or call queue_wait as a condition
        - Warning: conflicts with trigger-based insertions

          while (true) {
            rows := SELECT * FROM qt WHERE queue_wait('qt');
            if (count(rows) != 0)
              handle_row(rows[0]);
            SELECT queue_end();
          }
  36. The Model (with code)
      - Each publisher runs: INSERT INTO queue ...
      - Each subscriber runs:

          while (true) {
            rows := SELECT * FROM qt WHERE queue_wait('qt');
            if (count(rows) != 0)
              handle_row(rows[0]);
            SELECT queue_end();
          }

      - (Diagram: Publishers → Q4M table → Subscribers)
  37. Three Functions in Detail
  38. queue_wait(table)
      - Enters OWNER mode
      - 0 or 1 row becomes OWNED
        - Enters OWNER mode even if no rows were available
      - Default timeout: 60 seconds
      - Returns 1 if a row is OWNED (0 on timeout)
      - If called within OWNER mode, the owned row is deleted
  39. Revisiting Subscriber Code
      - Calls to queue_end just before queue_wait can be omitted

          while (true) {
            rows := SELECT * FROM qt WHERE queue_wait('qt');
            if (count(rows) != 0)
              handle_row(rows[0]);
            SELECT queue_end();    # can be omitted: the next queue_wait deletes the owned row
          }
  40. Conditional queue_wait()
      - Consumes rows that match a certain condition
        - Rows that do not match are left untouched
        - Only numeric columns can be checked
        - Fast: the condition is tested once per row
      - examples:

          SELECT queue_wait('table:(col_a*3)+col_b<col_c');
          SELECT queue_wait('table:retry_count<5');
  41. queue_wait(tbl_cond,[tbl_cond…,timeout])
      - Accepts multiple tables and a timeout
      - Data is searched from the leftmost table to the right
      - Returns the table index (the leftmost table is 1) of the newly owned row
      - Returns zero if no rows are being owned
      - example:

          SELECT queue_wait('table_A','table_B',60);
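Because the multi-table form returns a 1-based index rather than a name, a small wrapper can map the result back to the table that now owns a row. The sketch below assumes a DB-API 2.0 connection with `%s` placeholders and assumes each `tbl_cond` is passed as a string; `wait_any` is a hypothetical helper, not part of Q4M.

```python
def wait_any(conn, table_conds, timeout):
    """Wait on several tables; return the entry that owns a row, or None."""
    # One placeholder per table (or "table:condition") plus one for the timeout.
    marks = ", ".join(["%s"] * (len(table_conds) + 1))
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT queue_wait({marks})", (*table_conds, timeout))
        index = cur.fetchone()[0]
    finally:
        cur.close()
    # The index is 1-based (leftmost table is 1); 0 means no row is owned.
    return table_conds[index - 1] if index else None
```

For example, `wait_any(conn, ["table_A", "table_B:retry_count<5"], 60)` would return whichever entry produced the owned row, letting the caller decide which table to SELECT from.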
  42. Functions for Exiting OWNER Mode
      - queue_end: deletes the owned row and exits OWNER mode
      - queue_abort: releases (instead of deleting) the owned row and exits OWNER mode
        - Closing a MySQL connection does the same thing
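The interplay of queue_wait, queue_end, and queue_abort can be illustrated with a toy in-memory model. This is only a sketch of the semantics described on the slides, not how Q4M is implemented; the class `ToyQueue` and its method names are hypothetical, it tracks row ownership only, and it returns 0 immediately where the real queue_wait would block until the timeout.

```python
class ToyQueue:
    """In-memory model of row ownership in one queue table."""

    def __init__(self):
        self.rows = []  # [row, owner] pairs in insertion order

    def insert(self, row):
        self.rows.append([row, None])

    def wait(self, conn_id):
        # Calling wait while owning a row deletes that row first.
        self.end(conn_id)
        for entry in self.rows:
            if entry[1] is None:
                entry[1] = conn_id   # the row becomes OWNED by this connection
                return 1
        return 0                     # nothing available to own

    def end(self, conn_id):
        # Delete the owned row (if any).
        self.rows = [e for e in self.rows if e[1] != conn_id]

    def abort(self, conn_id):
        # Release the owned row (if any) without deleting it.
        for entry in self.rows:
            if entry[1] == conn_id:
                entry[1] = None

    def visible(self, conn_id):
        # An owner sees only its row; others see rows nobody owns.
        owned = [e[0] for e in self.rows if e[1] == conn_id]
        if owned:
            return owned
        return [e[0] for e in self.rows if e[1] is None]
```

With rows Hello, Bonjour, Hola inserted, `wait("a")` makes connection `a` see only Hello while connection `b` sees Bonjour and Hola; `abort("a")` makes all three visible again, and `end("a")` after another `wait("a")` removes Hello for everyone.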
  43. Relaying and Routing Messages
  44. The Model
      - Relay (or router) consists of more than 3 processes, 2 connections
      - No losses, no duplicates on crash or disconnection
      - (Diagram: Q4M table (source) → Relay program → Q4M table (dest.))
  45. Internal Row ID
      - Every row has an internal row ID
        - invisible from the Q4M table definition
        - monotonically increasing 64-bit integer
      - Used for detecting duplicates
        - Use two functions to skip duplicates
        - Data loss is prevented by using queue_wait / queue_end
  46. queue_rowid()
      - Returns the row ID of the OWNED row (if any)
        - Returns NULL if no row is OWNED
      - Call when retrieving data from the source
  47. queue_set_srcid(src_tbl_id, mode, src_row_id)
      - Call before inserting a row into the destination table
        - Checks whether the row has already been inserted into the table, and ignores the next INSERT if so
      - Parameters:
        - src_tbl_id: id to determine the source table (0 to 63)
        - mode: "a" to drop duplicates, "w" to reset
        - src_row_id: row ID obtained from the source table
  48. Pseudo Code
      - Relays data from src_tbl to dest_tbl ("=> db" marks the connection each statement is sent to)

          while (true) {
            # wait for data
            SELECT queue_wait(src_tbl) => src_db;
            # read row and rowid
            row := (SELECT * FROM src_tbl => src_db);
            rowid := (SELECT queue_rowid() => src_db);
            # insert the row after setting srcid
            SELECT queue_set_srcid(src_tbl_id, "a", rowid) => dest_db;
            INSERT INTO dest_tbl (row) => dest_db;
          }
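The relay pseudo code can be sketched in Python with two DB-API 2.0 connections, one to the source server and one to the destination. The function name `relay_one`, the explicit `columns` list, and the `%s` placeholder style are assumptions for illustration. The sketch calls queue_end explicitly after the INSERT, whereas the pseudo code relies on the next queue_wait to delete the owned row.

```python
def relay_one(src, dest, src_tbl, dest_tbl, src_tbl_id, columns):
    """Relay at most one row from src_tbl to dest_tbl; return True if relayed."""
    col_list = ", ".join(columns)
    marks = ", ".join(["%s"] * len(columns))
    s, d = src.cursor(), dest.cursor()
    try:
        # Wait for data on the source; 0 means the wait timed out.
        s.execute("SELECT queue_wait(%s)", (src_tbl,))
        if not s.fetchone()[0]:
            return False
        # Read the owned row and its internal row ID.
        s.execute(f"SELECT {col_list} FROM {src_tbl}")
        row = s.fetchone()
        s.execute("SELECT queue_rowid()")
        rowid = s.fetchone()[0]
        # Mode "a": the destination ignores the next INSERT if this row ID
        # was already inserted from this source slot.
        d.execute("SELECT queue_set_srcid(%s, %s, %s)", (src_tbl_id, "a", rowid))
        d.fetchone()
        d.execute(f"INSERT INTO {dest_tbl} ({col_list}) VALUES ({marks})", row)
        # Only now delete the row from the source queue.
        s.execute("SELECT queue_end()")
        s.fetchone()
        return True
    finally:
        s.close()
        d.close()
```

The ordering is the point: if the relay dies after the INSERT but before queue_end, closing the source connection releases the row, it is relayed again, and queue_set_srcid lets the destination drop the duplicate.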
  49. q4m-forward
      - Simple forwarder script, installed into mysql-dir/bin
      - usage: q4m-forward [options] src_addr dest_addr
      - example:

          % support-files/q4m-forward \
              "dbi:mysql:database=db1;table=tbl1;user=foo;password=XXX" \
              "dbi:mysql:database=db2;table=tbl2;host=bar;user=foo"

      - options:

          --reset       reset duplicate check info.
          --sender=idx  slot no. used for checking duplicates (0..63, default: 0)
          --help
  50. Limitations and the Future of Q4M
  51. Things that Need to be Fixed
      - Table compaction is a blocking operation
        - runs when live data becomes <25% of the log file
        - very bad, though not as bad as it seems: it's fast since it's a sequential write operation
      - Relays are slow, since transfer is done row by row
      - Binlog does not work, since MQ replication should be synchronous
  52. Future of Q4M (maybe)
      - Support for MySQL 5.5
        - not requested yet by current users :-p
      - 2-phase commit with other storage engines
        - queue consumption and InnoDB updates can become an atomic operation
  53. Thank you
      - http://q4m.github.com/