Using Q4M
a message queue storage engine for MySQL


           Cybozu Labs, Inc.
             Kazuho Oku
Apr 22 2009     Using Q4M
Explains how to use Q4M. Slides used at MySQL Conference & Expo 2009.
Using Q4M - a message queue storage engine for MySQL

  1. Using Q4M: a message queue storage engine for MySQL (Cybozu Labs, Inc., Kazuho Oku)
  2. Background
  3. Who am I?
      - Name: Kazuho Oku (奥一穂)
      - Original developer of Palmscape / Xiino
        - The oldest web browser for Palm OS
      - Working at Cybozu Labs since 2005
        - Research subsidiary of Cybozu, Inc. in Japan
  4. About Cybozu, Inc.
      - Japan’s largest groupware vendor
        - Mostly sells software products rather than services
        - Some of our apps bundle MySQL as storage
  5. About Pathtraq
      - Started in Aug. 2007
      - Web ranking service
        - One of Japan’s largest
        - Like Alexa, but semi-realtime and per-page
        - Running on MySQL
      - Needs a fast and reliable message relay for communication between the main server and the content analysis server(s)
  6. Design Goals of Q4M
      - Robust
        - Does not lose data on OS crash or power failure
      - Fast
        - Transfers thousands of messages per second
      - Easy to use
        - Use SQL for access / maintenance
        - Integrated into MySQL: no separate daemons to take care of
  7. What is a Message Queue?
  8. What is a Message Queue?
      - Middleware for persistent asynchronous communication
        - Communicates between fixed pairs (parties)
        - a.k.a. Message Oriented Middleware
      - MQ is intermediate storage
        - RDBMS is persistent storage
      - Senders / receivers may go down
  9. Minimal Configuration of a MQ
      - Senders and receivers access a single queue
      - [Diagram: Sender, Queue, Receiver]
  10. MQ and Relays
      - Separate queue for sender and receiver
      - Messages relayed between queues
      - [Diagram: Sender, Queue, Relay, Queue, Receiver]
  11. Merits of Message Relays
      - Destination can be changed easily
        - Relays may transfer messages to different locations depending on their headers
      - Robustness against network failure
        - No loss or duplicates when the relay fails
      - Logging, multicasting, etc.
  12. Message Brokers
      - Publish / subscribe model
      - Separation between components and their integration
        - Components read / write to predefined queues
        - Integration is the definition of routing rules between the message queues
      - Messages are often transformed (filtered) within the relay agent
  13. What about Q4M?
      - Q4M itself is a message queue
      - Q4M instances can be connected to create a message relay
      - Provides an API for creating message relays and brokers
  14. Performance of Q4M
      - Over 7,000 messages/sec.
        - Message size: avg. 512 bytes
        - Syncing to disk
      - Enough for most needs
        - If you need more, just scale out
      - Can coexist with other storage engines without sacrificing their performance
      - See http://labs.cybozu.co.jp/blog/kazuhoatwork/2008/06/q4m_06_release_and_benchmarks.php
  15. Applications of Q4M
  16. Asynchronous Updates
      - Mixi (one of Japan's largest SNSs) uses Q4M to buffer writes to DB, to offload peak demands
      - From http://alpha.mixi.co.jp/blog/?p=272
  17. Connecting Distant Servers
      - Pathtraq uses Q4M to create a relay between its database and content analysis processes
      - [Diagram: Pathtraq DB and Content Analysis Processes, linked by a MySQL conn. over SSL, gzip; contents to be analyzed flow →, results of the analysis flow ←]
  18. To Prefetch Data
      - livedoor Reader (web-based feed aggregator) uses Q4M to prefetch data from database to memcached
      - Uses Q4M for scheduling web crawlers as well
      - From http://d.hatena.ne.jp/mala/20081212/1229074359
  19. Scheduling Web Crawlers
      - Web crawlers with retry-on-error
      - Sample code included in Q4M dist.
      - If a fetch fails, store the URL in the retry queue
      - [Diagram: URL DB, Request Queue, Spiders (read URL, store result), Retry Queue, Re-scheduler]
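
The retry flow on this slide could be sketched roughly as follows. This is not the sample code shipped with Q4M. The table names (`request_queue`, `retry_queue`), the `url` column, and the `fetch` / `store` callables are hypothetical, and both cursors are assumed to be DB-API 2.0 cursors from a MySQL driver that uses `%s` placeholders. It relies on the `queue_wait` / `queue_end` calls covered later in the deck, and uses a second cursor for the retry insert so that the sketch does not depend on what is permitted while a connection is in OWNER mode.

```python
def crawl_one(queue_cur, retry_cur, fetch, store,
              request_tbl="request_queue", retry_tbl="retry_queue"):
    """Process at most one URL; return 'stored', 'retry', or None if idle."""
    # Take ownership of one row from the request queue (0 means timeout).
    queue_cur.execute("SELECT queue_wait(%s)", (request_tbl,))
    (owned,) = queue_cur.fetchone()
    if not owned:
        return None

    # In OWNER mode only the owned row is visible to this connection.
    # Table names are interpolated, so they must come from trusted config.
    queue_cur.execute(f"SELECT url FROM {request_tbl}")
    (url,) = queue_cur.fetchone()

    try:
        store(url, fetch(url))
        outcome = "stored"
    except Exception:
        # Fetch failed: hand the URL to the retry queue for re-scheduling.
        retry_cur.execute(f"INSERT INTO {retry_tbl} (url) VALUES (%s)", (url,))
        outcome = "retry"

    # Either way this request has been dealt with, so delete the owned row.
    queue_cur.execute("SELECT queue_end()")
    queue_cur.fetchone()
    return outcome
```

A re-scheduler process would then move rows from the retry queue back to the request queue on its own schedule.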
  20. Delayed Content Generation
      - Hatetter (RSS feed-to-Twitter-API gateway) uses Q4M to delay content generation
      - Source code: github.com/yappo/website-hatetter
  21. User Notifications
      - For sending notifications from web services
      - [Diagram: DB, App. Logic, Queue(s), SMTP Agent, IM Agent]
  22. Installing Q4M
  23. Installing Q4M
      - Compatible with MySQL 5.1
      - Download from q4m.31tools.com
        - Binary releases available for some platforms
      - Installing from source:
        - Requires source code of MySQL
        - ./configure && make && make install
        - Run support-files/install.sql
  24. Configuration Options of Q4M
      - --with-sync=no|fsync|fdatasync|fcntl
        - Controls synchronization to disk
        - Default: fdatasync on Linux
      - --enable-mmap
        - Mmap’ed reads lead to higher throughput
        - Default: yes
      - --with-delete=pwrite|msync
        - msync recommended on Linux >= 2.6.20 if you need really high performance
  25. Q4M Basics
  26. The Model
      - Various publishers write to queue
      - Set of subscribers consume the entries in queue
      - [Diagram: Publishers, Q4M table, Subscribers]
  27. Creating a Q4M Table
      - ENGINE=QUEUE creates a Q4M table
      - No primary keys or indexes
      - Sorted by insertion order (it’s a queue)

          mysql> CREATE TABLE qt (
              ->   id int(10) unsigned NOT NULL,
              ->   message varchar(255) NOT NULL
              -> ) ENGINE=QUEUE;
          Query OK, 0 rows affected (0.42 sec)

  28. Modifying Data on a Q4M Table
      - No restrictions for INSERT and DELETE
      - No support for UPDATE

          mysql> INSERT INTO qt (id,message)
              -> VALUES
              -> (1,'Hello'),
              -> (2,'Bonjour'),
              -> (3,'Hola');
          Query OK, 3 rows affected (0.02 sec)

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          3 rows in set (0.00 sec)

  29. SELECT from a Q4M Table
      - Works the same as other storage engines
      - SELECT COUNT(*) is cached
      - How to subscribe to a queue?

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          3 rows in set (0.00 sec)

          mysql> SELECT COUNT(*) FROM qt;
          +----------+
          | COUNT(*) |
          +----------+
          |        3 |
          +----------+
          1 row in set (0.00 sec)

  30. Calling queue_wait()
      - After calling, only one row becomes visible from the connection

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          3 rows in set (0.00 sec)

          mysql> SELECT queue_wait('qt');
          +------------------+
          | queue_wait('qt') |
          +------------------+
          |                1 |
          +------------------+
          1 row in set (0.00 sec)

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          +----+---------+
          1 row in set (0.00 sec)

  31. OWNER Mode and NON-OWNER Mode
      - In OWNER mode, only the OWNED row is visible
        - The OWNED row becomes invisible to other connections
        - Rows of other storage engines remain visible
      - [Diagram: queue_wait() moves the connection from NON-OWNER mode (1,'Hello' / 2,'Bonjour' / 3,'Hola' visible) to OWNER mode (only 1,'Hello' visible); queue_end() and queue_abort() move it back]
  32. Returning to NON-OWNER mode
      - By calling queue_abort, the connection returns to NON-OWNER mode

          mysql> SELECT QUEUE_ABORT();
          +---------------+
          | QUEUE_ABORT() |
          +---------------+
          |             1 |
          +---------------+
          1 row in set (0.00 sec)

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          3 rows in set (0.01 sec)

  33. Consuming a Row
      - By calling queue_end, the OWNED row is deleted, and the connection returns to NON-OWNER mode

          mysql> SELECT queue_wait('qt');
          (snip)
          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  1 | Hello   |
          +----+---------+
          1 row in set (0.01 sec)

          mysql> SELECT queue_end();
          +-------------+
          | queue_end() |
          +-------------+
          |           1 |
          +-------------+
          1 row in set (0.01 sec)

          mysql> SELECT * FROM qt;
          +----+---------+
          | id | message |
          +----+---------+
          |  2 | Bonjour |
          |  3 | Hola    |
          +----+---------+
          2 rows in set (0.00 sec)

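
The visibility rules walked through in slides 30 to 33 can be condensed into a small in-memory model. This is only a toy illustration of the described behaviour, not how Q4M is implemented: the `ToyQueue` class and its method names are made up for this sketch, connections are identified by arbitrary strings, and the blocking wait with timeout is not modelled.

```python
import itertools


class ToyQueue:
    """Toy model of Q4M row ownership (illustrative only)."""

    def __init__(self):
        self._rows = {}                   # row ID -> row, in insertion order
        self._next_id = itertools.count(1)
        self._owned = {}                  # connection -> owned row ID
                                          # (None: OWNER mode without a row)

    def insert(self, row):
        self._rows[next(self._next_id)] = row

    def queue_wait(self, conn):
        # If the connection already owns a row, that row is deleted first.
        self.queue_end(conn)
        taken = set(self._owned.values())
        free = [rid for rid in self._rows if rid not in taken]
        # OWNER mode is entered even when no row is available.
        self._owned[conn] = free[0] if free else None
        return 1 if free else 0

    def queue_end(self, conn):
        # Delete the owned row (if any) and leave OWNER mode.
        rid = self._owned.pop(conn, None)
        if rid is not None:
            del self._rows[rid]

    def queue_abort(self, conn):
        # Release the owned row without deleting it and leave OWNER mode.
        self._owned.pop(conn, None)

    def select(self, conn):
        if conn in self._owned:
            # OWNER mode: only the owned row is visible.
            rid = self._owned[conn]
            return [] if rid is None else [self._rows[rid]]
        # NON-OWNER mode: rows owned by other connections are hidden.
        taken = set(self._owned.values())
        return [row for rid, row in self._rows.items() if rid not in taken]
```

With the three rows from the slides inserted, `queue_wait("a")` leaves connection `"a"` seeing only `(1, 'Hello')`, while any other connection sees the remaining two rows until `"a"` calls `queue_abort` or `queue_end`.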
  34. Writing a Subscriber
      - Call two functions: queue_wait, queue_end
      - Multiple subscribers can be run concurrently
        - Each row in the queue is consumed only once

          while (true) {
            SELECT queue_wait('qt');       # switch to owner mode
            rows := SELECT * FROM qt;      # obtain data
            if (count(rows) != 0)          # if we have any data, then
              handle_row(rows[0]);         # consume the row
            SELECT queue_end();            # erase the row from queue
          }

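
As a rough client-side rendering of the pseudocode above, the loop might look like this in Python. It assumes a DB-API 2.0 cursor from a MySQL driver with `%s` placeholders; the function name `run_subscriber`, the `handle_row` callback, and the `max_polls` parameter (added only so the loop can be stopped) are hypothetical. Releasing the row with `queue_abort` when the handler fails is a choice made for this sketch, not something the slide prescribes.

```python
def run_subscriber(cursor, handle_row, table="qt", max_polls=None):
    """Consume rows from a Q4M table; return the number of rows handled."""
    polls = 0
    handled = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        # Switch to OWNER mode; 1 means a row is now owned, 0 means timeout.
        cursor.execute("SELECT queue_wait(%s)", (table,))
        (owned,) = cursor.fetchone()
        if not owned:
            continue  # nothing arrived within the timeout, wait again

        # Only the owned row is visible, so this returns a single row.
        # The table name is interpolated and must come from trusted config.
        cursor.execute(f"SELECT * FROM {table}")
        row = cursor.fetchone()
        try:
            handle_row(row)
        except Exception:
            # Give the row back to the queue instead of deleting it.
            cursor.execute("SELECT queue_abort()")
            cursor.fetchone()
            raise

        # Erase the row from the queue and return to NON-OWNER mode.
        cursor.execute("SELECT queue_end()")
        cursor.fetchone()
        handled += 1
    return handled
```

Several such subscribers can run on separate connections at once, since each owned row is hidden from the others.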
  35. Writing a Subscriber (cont'd)
      - Or call queue_wait as a condition
        - Warning: conflicts with trigger-based insertions

          while (true) {
            rows := SELECT * FROM qt WHERE queue_wait('qt');
            if (count(rows) != 0)
              handle_row(rows[0]);
            SELECT queue_end();
          }

  36. The Model – with code
      - Each publisher runs: INSERT INTO queue ...
      - Each subscriber runs:

          while (true) {
            rows := SELECT * FROM qt WHERE queue_wait('qt');
            if (count(rows) != 0)
              handle_row(rows[0]);
            SELECT queue_end();
          }

      - [Diagram: Publishers, Q4M table, Subscribers]
  37. Three Functions in Detail
  38. queue_wait(table)
      - Enters OWNER mode
      - 0 or 1 row becomes OWNED
        - Enters OWNER mode even if no rows were available
      - Default timeout: 60 seconds
      - Returns 1 if a row is OWNED (0 on timeout)
      - If called within OWNER mode, the owned row is deleted
  39. Revisiting Subscriber Code
      - Calls to queue_end just before queue_wait can be omitted

          while (true) {
            rows := SELECT * FROM qt WHERE queue_wait('qt');
            if (count(rows) != 0)
              handle_row(rows[0]);
            SELECT queue_end();    # can be omitted: the next queue_wait deletes the owned row
          }

  40. Conditional queue_wait()
      - Consume rows that match a certain condition
        - Rows that do not match will be left untouched
        - Only numeric columns can be checked
        - Fast: the condition is tested once per row
      - Examples:

          SELECT queue_wait('table:(col_a*3)+col_b<col_c');
          SELECT queue_wait('table:retry_count<5');

  41. queue_wait(tbl_cond,[tbl_cond…,timeout])
      - Accepts multiple tables and timeout
      - Data searched from leftmost table to right
      - Returns table index (the leftmost table is 1) of the newly owned row
        - Returns zero if no rows are being owned
      - Example:

          SELECT queue_wait('table_A','table_B',60);

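
A small helper can hide the 1-based index that the multi-table form returns. The sketch below is an assumption-laden illustration: `queue_wait_any` is a hypothetical name, the cursor is assumed to be a DB-API 2.0 cursor from a MySQL driver with `%s` placeholders, and the timeout is always passed explicitly, mirroring the example on the slide.

```python
def queue_wait_any(cursor, table_conds, timeout=60):
    """Wait on several tables; return the name of the table that yielded a row.

    Each entry is 'table' or 'table:condition'. Returns None when no row
    became owned (queue_wait returned 0).
    """
    if not table_conds:
        raise ValueError("at least one table is required")
    # One placeholder per table argument plus one for the trailing timeout.
    placeholders = ", ".join(["%s"] * (len(table_conds) + 1))
    cursor.execute(f"SELECT queue_wait({placeholders})",
                   (*table_conds, timeout))
    (index,) = cursor.fetchone()
    if index == 0:
        return None
    # The result is 1-based: 1 refers to the leftmost argument.
    return table_conds[index - 1].split(":", 1)[0]
```

Because tables are searched from left to right, the order of `table_conds` acts as a priority order: put the queue that should be drained first at the front.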
  42. Functions for Exiting OWNER Mode
      - queue_end
        - Deletes the owned row and exits OWNER mode
      - queue_abort
        - Releases (instead of deleting) the owned row and exits OWNER mode
        - Closing the MySQL connection does the same thing
  43. Relaying and Routing Messages
  44. The Problem
      - Relay (or router) consists of more than 3 processes, 2 connections
      - No losses, no duplicates on crash or disconnection
      - [Diagram: Q4M Table (source), Relay Program, Q4M Table (dest.)]
  45. Internal Row ID
      - Every row has an internal row ID
        - Invisible from the Q4M table definition
        - Monotonically increasing 64-bit integer
      - Used for detecting duplicates
        - Use two functions to skip duplicates
        - Data loss is prevented by using queue_wait / queue_end
  46. queue_rowid()
      - Returns row ID of the OWNED row (if any)
        - Returns NULL if no row is OWNED
      - Call when retrieving data from source
  47. queue_set_srcid(src_tbl_id, mode, src_row_id)
      - Call before inserting a row to destination table
        - Checks if the row is already inserted into the table, and ignores next INSERT if true
      - Parameters:
        - src_tbl_id: ID to determine source table (0 to 63)
        - mode: "a" to drop duplicates, "w" to reset
        - src_row_id: row ID obtained from source table
  48. Pseudo Code
      - Relays data from src_tbl to dest_tbl

          while (true) {
            # wait for data
            SELECT queue_wait(src_tbl) => src_db;
            # read row and rowid
            row := (SELECT * FROM src_tbl => src_db);
            rowid := (SELECT queue_rowid() => src_db);
            # insert the row after setting srcid
            SELECT queue_set_srcid(src_tbl_id, 'a', rowid) => dest_db;
            INSERT INTO dest_tbl (row) => dest_db;
          }

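
One iteration of this relay loop might be written as follows in Python. This is a sketch, not the `q4m-forward` script: `relay_once` is a hypothetical name, both arguments are assumed to be DB-API 2.0 cursors (with `%s` placeholders) on separate connections to the source and destination servers, and the destination table is assumed to have the same columns as the source. Unlike the pseudocode, it calls `queue_end` explicitly after the insert rather than relying on the next `queue_wait` to delete the owned row.

```python
def relay_once(src, dest, src_tbl, dest_tbl, src_tbl_id=0):
    """Relay at most one row from src_tbl to dest_tbl; return True if relayed."""
    # Wait for data on the source queue (0 means timeout, nothing owned).
    src.execute("SELECT queue_wait(%s)", (src_tbl,))
    (owned,) = src.fetchone()
    if not owned:
        return False

    # Read the owned row together with its internal row ID.
    src.execute(f"SELECT * FROM {src_tbl}")
    row = src.fetchone()
    src.execute("SELECT queue_rowid()")
    (rowid,) = src.fetchone()

    # Register the source row ID first, so that the INSERT is ignored if
    # this row already reached the destination (mode 'a' drops duplicates).
    dest.execute("SELECT queue_set_srcid(%s, %s, %s)", (src_tbl_id, "a", rowid))
    dest.fetchone()
    placeholders = ", ".join(["%s"] * len(row))
    dest.execute(f"INSERT INTO {dest_tbl} VALUES ({placeholders})", row)

    # Delete the row from the source only after the destination has it.
    src.execute("SELECT queue_end()")
    src.fetchone()
    return True
```

The ordering is the point of the design: if the relay dies between the insert and `queue_end`, the source row is released and sent again, and the row ID check on the destination discards the duplicate.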
  49. q4m-forward
      - Simple forwarder script
        - Installed into mysql-dir/bin
      - Usage: q4m-forward [options] src_addr dest_addr
      - Example:

          % support-files/q4m-forward \
              "dbi:mysql:database=db1;table=tbl1;user=foo;password=XXX" \
              "dbi:mysql:database=db2;table=tbl2;host=bar;user=foo"

      - Options:

          --reset       reset duplicate check info.
          --sender=idx  slot no. used for checking duplicates (0..63, default: 0)
          --help

  50. Limitations and the Future of Q4M
  51. Things that Need to be Fixed
      - Table compaction is a blocking operation
        - Runs when live data becomes <25% of log file
        - Very bad, though not as bad as it seems
        - It's fast since it's a sequential write operation
      - Relays are slow since transfer is done row-by-row
      - Binlog does not work
        - Since MQ replication should be synchronous
  52. Future of Q4M
      - 2-phase commit with other storage engines (maybe)
        - Queue consumption and InnoDB updates can become an atomic operation
  53. Thank you
      - http://q4m.31tools.com/