Using Q4M - a message queue for MySQL #osdc.tw
Transcript

  • 1. Using Q4M: a message queue storage engine for MySQL (Kazuho Oku)
  • 2. Who am I?
      • Name: Kazuho Oku (奥一穂)
      • Original developer of Palmscape / Xiino, the oldest web browser for Palm OS
      • Worked at Cybozu Labs (research subsidiary of Cybozu, Inc. in Japan), 2005-2010
      • Developed Japanize / Mylingual, Q4M, etc.
      • Now working at DeNA Co., Ltd. with the developers of the HandlerSocket plugin
      Mar 26 2011, Using Q4M
  • 3. What is Q4M?
      • A message queue that runs as a storage engine plugin of MySQL 5.1
      • Why is it a MySQL plugin?
          • accessible with existing MySQL clients: no need for a new client library
          • administrable with SQL: friendly to DB admins
  • 4. Design Goals of Q4M
      • Robust: does not lose data on OS crash or power failure (necessary for Tokyo w/o nuclear power plants… orz)
      • Fast: transfers thousands of messages per second
      • Easy to use: SQL for access / maintenance
      • Integrated into MySQL: no more separate daemons to take care of
  • 5. Users of Q4M
      • Many leading web services in Japan
          • DeNA Co., Ltd.
          • livedoor Co., Ltd.
          • mixi, Inc.
          • Zynga Japan (formerly Unoh, Inc.)
  • 6. Agenda
      • What an MQ is in general
      • Applications of Q4M
      • Brief tutorial
  • 7. What is a Message Queue?
  • 8. What is a Message Queue?
      • Middleware for persistent asynchronous communication
          • communicates between fixed pairs (parties)
          • a.k.a. Message Oriented Middleware
      • An MQ is intermediate storage; an RDBMS is persistent storage
      • Senders / receivers may go down
  • 9. Minimal Configuration of an MQ
      • Senders and receivers access a single queue
      • [Diagram: Sender → Queue → Receiver]
  • 10. MQ and Relays
      • Separate queues for sender and receiver
      • Messages are relayed between the queues
      • [Diagram: Sender → Queue → Relay → Queue → Receiver]
  • 11. Merits of Message Relays
      • Destination can be changed easily
          • relays may transfer messages to different locations depending on their headers
      • Robustness against network failure
          • no loss or duplicates when the relay fails
      • Logging, multicasting, etc.
  • 12. Message Brokers
      • Publish / subscribe model
      • Separation between components and their integration
          • components read / write to predefined queues
          • integration is the definition of routing rules between the message queues
      • Messages are often transformed (filtered) within the relay agent
  • 13. What about Q4M?
      • Q4M itself is a message queue
      • Q4M instances can be connected to create a message relay
      • Provides an API for creating message relays and brokers
  • 14. Performance of Q4M
      • Over 7,000 messages/sec.
          • message size: avg. 512 bytes
          • syncing to disk
      • Exceeds most needs; if you need more, just scale out
      • Can coexist with other storage engines without sacrificing their performance
      • See http://labs.cybozu.co.jp/blog/kazuhoatwork/2008/06/q4m_06_release_and_benchmarks.php
  • 15. Applications of Q4M
  • 16. Asynchronous Updates
      • DeNA uses Q4M for sending notifications to users asynchronously
      • http://engineer.dena.jp/2010/03/dena-technical-seminar-1-2.html
  • 17. Delay Peak Demands
      • mixi (one of Japan's largest SNSs) uses Q4M to buffer writes to its DB and so delay peak demands
      • from http://alpha.mixi.co.jp/blog/?p=272
  • 18. Connecting Distant Servers
      • Pathtraq uses Q4M to create a relay between its database and content analysis processes
      • [Diagram: Pathtraq DB ⇄ (MySQL conn. over SSL, gzip) ⇄ Content Analysis Processes; contents to be analyzed flow →, results of the analysis flow ←]
  • 19. Prefetch Data
      • livedoor Reader (web-based feed aggregator) uses Q4M to prefetch data from database to memcached
      • It uses Q4M for scheduling web crawlers as well
      • from http://d.hatena.ne.jp/mala/20081212/1229074359
  • 20. Scheduling Web Crawlers
      • Web crawlers with retry-on-error
      • Sample code included in Q4M dist.
      • If a fetch fails, the URL is stored in the retry queue
      • [Diagram labels: URL DB, Request Queue, Spiders, Retry Queue, Re-scheduler; arrows: Read URL, Store Result]
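The retry-on-error flow on this slide can be sketched in a few lines. This is only an in-memory illustration, not the sample code shipped with Q4M: two `deque` objects stand in for the request and retry queues, and the names `crawl_once`, `reschedule`, and `fetch` are hypothetical. That the re-scheduler moves URLs from the retry queue back to the request queue is an assumption read off the diagram labels.

```python
from collections import deque


def crawl_once(request_queue, retry_queue, results, fetch):
    """Take one URL from the request queue and try to fetch it.

    On success the result is stored; on failure the URL goes to the
    retry queue. Returns False when the request queue is empty.
    """
    if not request_queue:
        return False
    url = request_queue.popleft()
    try:
        results[url] = fetch(url)
    except Exception:
        # Fetch failed: remember the URL so it can be retried later.
        retry_queue.append(url)
    return True


def reschedule(request_queue, retry_queue):
    """Move every waiting URL from the retry queue back to the request queue.

    Assumption: this is what the "Re-scheduler" box in the diagram does.
    """
    moved = 0
    while retry_queue:
        request_queue.append(retry_queue.popleft())
        moved += 1
    return moved
```

In a real deployment each `deque` would presumably be a Q4M table, and the re-scheduler would decide when a URL is due for another attempt.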
  • 21. Delayed Content Generation
      • Hatetter (RSS feed-to-Twitter-API gateway) uses Q4M to delay content generation
      • Source code: github.com/yappo/website-hatetter
  • 22. Installing Q4M
  • 23. Installing Q4M
      • Compatible with MySQL 5.1
      • Download from q4m.github.com
      • Binary releases available for some platforms
      • Installing from source:
          • requires the source code of MySQL
          • ./configure && make && make install
          • run support-files/install.sql
  • 24. Configuration Options of Q4M
      • --with-sync=no|fsync|fdatasync|fcntl
          • controls synchronization to disk
          • default: fdatasync on Linux
      • --enable-mmap
          • mmap'ed reads lead to higher throughput
          • default: yes
      • --with-delete=pwrite|msync
          • msync recommended on Linux >= 2.6.20 if you need really high performance
  • 25. Q4M Basics
  • 26. The Model
      • Various publishers write to the queue
      • A set of subscribers consumes the entries in the queue
      • [Diagram: three Publishers → Q4M table → Subscribers]
  • 27. Creating a Q4M Table
      • ENGINE=QUEUE creates a Q4M table
      • No primary keys or indexes
      • Sorted by insertion order (it's a queue)
        mysql> CREATE TABLE qt (
            ->   id int(10) unsigned NOT NULL,
            ->   message varchar(255) NOT NULL
            -> ) ENGINE=QUEUE;
        Query OK, 0 rows affected (0.42 sec)
  • 28. Modifying Data on a Q4M Table
      • No restrictions for INSERT and DELETE
      • No support for UPDATE
        mysql> INSERT INTO qt (id,message)
            -> VALUES
            -> (1,'Hello'),
            -> (2,'Bonjour'),
            -> (3,'Hola');
        Query OK, 3 rows affected (0.02 sec)
        mysql> SELECT * FROM qt;
        +----+---------+
        | id | message |
        +----+---------+
        |  1 | Hello   |
        |  2 | Bonjour |
        |  3 | Hola    |
        +----+---------+
        3 rows in set (0.00 sec)
  • 29. SELECT from a Q4M Table
      • Works the same as other storage engines
      • SELECT COUNT(*) is cached
        mysql> SELECT * FROM qt;
        +----+---------+
        | id | message |
        +----+---------+
        |  1 | Hello   |
        |  2 | Bonjour |
        |  3 | Hola    |
        +----+---------+
        3 rows in set (0.00 sec)
        mysql> SELECT COUNT(*) FROM qt;
        +----------+
        | COUNT(*) |
        +----------+
        |        3 |
        +----------+
        1 row in set (0.00 sec)
      • How to subscribe to a queue?
  • 30. Calling queue_wait()
      • After the call, only one row is visible from the connection
        mysql> SELECT * FROM qt;
        +----+---------+
        | id | message |
        +----+---------+
        |  1 | Hello   |
        |  2 | Bonjour |
        |  3 | Hola    |
        +----+---------+
        3 rows in set (0.00 sec)
        mysql> SELECT queue_wait('qt');
        +------------------+
        | queue_wait('qt') |
        +------------------+
        |                1 |
        +------------------+
        1 row in set (0.00 sec)
        mysql> SELECT * FROM qt;
        +----+---------+
        | id | message |
        +----+---------+
        |  1 | Hello   |
        +----+---------+
        1 row in set (0.00 sec)
  • 31. OWNER Mode and NON-OWNER Mode
      • In OWNER mode, only the OWNED row is visible
          • the OWNED row becomes invisible to other connections
          • rows of other storage engines remain visible
      • [Diagram: NON-OWNER mode (sees 1,Hello / 2,Bonjour / 3,Hola) → queue_wait() → OWNER mode (sees 1,Hello) → queue_end() or queue_abort() → NON-OWNER mode]
  • 32. Returning to NON-OWNER mode
      • By calling queue_abort, the connection returns to NON-OWNER mode
        mysql> SELECT QUEUE_ABORT();
        +---------------+
        | QUEUE_ABORT() |
        +---------------+
        |             1 |
        +---------------+
        1 row in set (0.00 sec)
        mysql> SELECT * FROM qt;
        +----+---------+
        | id | message |
        +----+---------+
        |  1 | Hello   |
        |  2 | Bonjour |
        |  3 | Hola    |
        +----+---------+
        3 rows in set (0.01 sec)
  • 33. Consuming a Row
      • By calling queue_end, the OWNED row is deleted and the connection returns to NON-OWNER mode
        mysql> SELECT queue_wait('qt');
        (snip)
        mysql> SELECT * FROM qt;
        +----+---------+
        | id | message |
        +----+---------+
        |  1 | Hello   |
        +----+---------+
        1 row in set (0.01 sec)
        mysql> SELECT queue_end();
        +-------------+
        | queue_end() |
        +-------------+
        |           1 |
        +-------------+
        1 row in set (0.01 sec)
        mysql> SELECT * FROM qt;
        +----+---------+
        | id | message |
        +----+---------+
        |  2 | Bonjour |
        |  3 | Hola    |
        +----+---------+
        2 rows in set (0.00 sec)
  • 34. Writing a Subscriber
      • Call two functions: queue_wait, queue_end
      • Multiple subscribers can be run concurrently
          • each row in the queue is consumed only once
        while (true) {
          SELECT queue_wait('qt');        # switch to owner mode
          rows := SELECT * FROM qt;       # obtain data
          if (count(rows) != 0)           # if we have any data, then
            handle_row(rows[0]);          # consume the row
          SELECT queue_end();             # erase the row from queue
        }
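The loop above translates fairly directly to a Python DB-API client. The following is a minimal sketch under stated assumptions: `conn` is any DB-API 2.0 connection to a MySQL server with Q4M installed, the driver uses the `%s` parameter style (as PyMySQL and mysqlclient do), and `consume`, `handle_row`, and `max_messages` are hypothetical names, not part of Q4M. Calling `queue_abort()` when the handler fails is a design choice of this sketch, so that the row is released instead of deleted.

```python
def consume(conn, handle_row, table="qt", max_messages=None):
    """Consume rows from a Q4M table one at a time.

    `table` is interpolated into SQL, so it must come from trusted code.
    Returns the number of rows handled.
    """
    cur = conn.cursor()
    handled = 0
    try:
        while max_messages is None or handled < max_messages:
            # Enter OWNER mode; returns 1 if a row is owned, 0 on timeout.
            cur.execute("SELECT queue_wait(%s)", (table,))
            (owned,) = cur.fetchone()
            if not owned:
                continue  # timed out with no row, wait again
            # In OWNER mode only the owned row is visible.
            cur.execute("SELECT * FROM {}".format(table))
            row = cur.fetchone()
            try:
                handle_row(row)
            except Exception:
                # Release the row (do not delete it) and leave OWNER mode.
                cur.execute("SELECT queue_abort()")
                cur.fetchone()
                raise
            # Delete the owned row and return to NON-OWNER mode.
            cur.execute("SELECT queue_end()")
            cur.fetchone()
            handled += 1
    finally:
        cur.close()
    return handled
```

Several processes can run `consume` against the same table at once; per the slide, each row is handed to only one of them.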
  • 35. Writing a Subscriber (cont'd)
      • Or call queue_wait as a condition
      • Warning: conflicts with trigger-based insertions
        while (true) {
          rows := SELECT * FROM qt WHERE queue_wait('qt');
          if (count(rows) != 0)
            handle_row(rows[0]);
          SELECT queue_end();
        }
  • 36. The Model: with code
      • Each publisher runs:
        INSERT INTO queue ...
      • Each subscriber runs:
        while (true) {
          rows := SELECT * FROM qt WHERE queue_wait('qt');
          if (count(rows) != 0)
            handle_row(rows[0]);
          SELECT queue_end();
        }
      • [Diagram: three Publishers → Q4M table → Subscribers]
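For the publisher side, a plain INSERT is all that is needed. Here is a small sketch of what that could look like from Python, assuming a DB-API 2.0 connection with the `%s` parameter style and the `qt` table layout (`id`, `message`) used in the tutorial slides. The function name `publish` is hypothetical.

```python
def publish(conn, msg_id, message, table="qt"):
    """Append one row to the queue table with an ordinary INSERT.

    `table` is interpolated into SQL, so it must come from trusted code;
    the values themselves are passed as bound parameters.
    """
    cur = conn.cursor()
    try:
        cur.execute(
            "INSERT INTO {} (id, message) VALUES (%s, %s)".format(table),
            (msg_id, message),
        )
    finally:
        cur.close()
    # Harmless if the connection is already in autocommit mode.
    conn.commit()
```

For example, `publish(conn, 1, "Hello")` would add the first row shown in the tutorial, given a live connection.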
  • 37. Three Functions in Detail
  • 38. queue_wait(table)
      • Enters OWNER mode
          • 0 or 1 row becomes OWNED
      • Enters OWNER mode even if no rows were available
      • Default timeout: 60 seconds
      • Returns 1 if a row is OWNED (0 on timeout)
      • If called within OWNER mode, the owned row is deleted
  • 39. Revisiting Subscriber Code
      • Calls to queue_end just before queue_wait can be omitted
        while (true) {
          rows := SELECT * FROM qt WHERE queue_wait('qt');
          if (count(rows) != 0)
            handle_row(rows[0]);
          SELECT queue_end();   # can be omitted: the next queue_wait deletes the owned row
        }
  • 40. Conditional queue_wait()
      • Consumes only rows that match a condition
      • Rows that do not match are left untouched
      • Only numeric columns can be checked
      • Fast: the condition is tested once per row
      • Examples:
        SELECT queue_wait('table:(col_a*3)+col_b<col_c');
        SELECT queue_wait('table:retry_count<5');
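Since the condition travels inside the same string as the table name, a client may want one place that builds that argument. The helper below is a hypothetical sketch, not part of Q4M: `wait_arg` and its two regular expressions are inventions for illustration, and the character whitelist is a conservative assumption, not Q4M's actual expression grammar.

```python
import re

# Assumed, deliberately narrow whitelists (not Q4M's real grammar).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")
_COND = re.compile(r"^[A-Za-z0-9_\s+\-*/%()<>=!]+$")


def wait_arg(table, condition=None):
    """Build the 'table' or 'table:condition' argument for queue_wait()."""
    if not _IDENT.match(table):
        raise ValueError("unexpected table name: %r" % (table,))
    if condition is None:
        return table
    if not _COND.match(condition):
        raise ValueError("unexpected condition: %r" % (condition,))
    return "{}:{}".format(table, condition)
```

The result would then be passed as a bound parameter, for example `cur.execute("SELECT queue_wait(%s)", (wait_arg("table", "retry_count<5"),))`.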
  • 41. queue_wait(tbl_cond,[tbl_cond…,timeout])
      • Accepts multiple tables and a timeout
      • Tables are searched from the leftmost to the right
      • Returns the table index (the leftmost table is 1) of the newly owned row
      • Returns zero if no row is owned
      • Example:
        SELECT queue_wait('table_A','table_B',60);
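Because the multi-table form returns a 1-based index, not a name, client code has to map the result back to a table. A minimal sketch, assuming a DB-API cursor with the `%s` parameter style; the function name `wait_any` is hypothetical.

```python
def wait_any(cur, tables, timeout=60):
    """Wait on several Q4M tables at once.

    Returns the name of the table whose row is now owned, or None if
    the call returned zero (no row owned).
    """
    # One placeholder per table, plus one for the trailing timeout.
    placeholders = ", ".join(["%s"] * (len(tables) + 1))
    cur.execute(
        "SELECT queue_wait({})".format(placeholders),
        list(tables) + [timeout],
    )
    (index,) = cur.fetchone()
    # The leftmost table is index 1; 0 means nothing is owned.
    return tables[index - 1] if index else None
```

The caller can then run `SELECT * FROM <returned table>` to read the owned row.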
  • 42. Functions for Exiting OWNER Mode
      • queue_end
          • deletes the owned row and exits OWNER mode
      • queue_abort
          • releases (instead of deleting) the owned row and exits OWNER mode
          • closing a MySQL connection does the same thing
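The visibility rules for the three functions can be checked against a toy in-memory model. This is purely an illustration of the semantics described on the slides, not Q4M code: the class `ToyQueue` and its connection labels are hypothetical, and where the real `queue_wait` would block until a row arrives or the timeout expires, the model returns 0 immediately.

```python
class ToyQueue:
    """In-memory stand-in for one Q4M table shared by several connections."""

    def __init__(self, rows=()):
        self.rows = dict(enumerate(rows, start=1))  # row id -> row, in order
        self.owned = {}          # connection label -> owned row id
        self.owner_mode = set()  # connections currently in OWNER mode

    def queue_wait(self, conn):
        # Called within OWNER mode, the previously owned row is deleted.
        self.queue_end(conn)
        self.owner_mode.add(conn)  # entered even if no row is available
        for row_id in self.rows:
            if row_id not in self.owned.values():
                self.owned[conn] = row_id
                return 1
        return 0  # the real call would block until the timeout here

    def queue_end(self, conn):
        # Delete the owned row (if any) and exit OWNER mode.
        row_id = self.owned.pop(conn, None)
        if row_id is not None:
            del self.rows[row_id]
        self.owner_mode.discard(conn)

    def queue_abort(self, conn):
        # Release the owned row without deleting it and exit OWNER mode.
        self.owned.pop(conn, None)
        self.owner_mode.discard(conn)

    def select(self, conn):
        if conn in self.owner_mode:
            row_id = self.owned.get(conn)
            return [] if row_id is None else [self.rows[row_id]]
        # Rows owned by any connection are hidden in NON-OWNER mode.
        taken = set(self.owned.values())
        return [row for row_id, row in self.rows.items() if row_id not in taken]
```

Replaying the tutorial against this model (wait, select, abort, wait, end) gives the same row sets as the mysql sessions on slides 30 to 33.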
  • 43. Relaying and Routing Messages
  • 44. The Model
      • Relay (or router) consists of more than 3 processes, 2 conns
      • No losses, no duplicates on crash or disconnection
      • [Diagram: Q4M Table (source) → Relay Program → Q4M Table (dest.)]
  • 45. Internal Row ID
      • Every row has an internal row ID
          • invisible from the Q4M table definition
          • monotonically increasing 64-bit integer
      • Used for detecting duplicates
          • use two functions to skip duplicates
          • data loss is prevented by using queue_wait / queue_end
  • 46. queue_rowid()
      • Returns the row ID of the OWNED row (if any)
          • returns NULL if no row is OWNED
      • Call it when retrieving data from the source
  • 47. queue_set_srcid(src_tbl_id, mode, src_row_id)
      • Call before inserting a row into the destination table
          • checks whether the row has already been inserted into the table, and ignores the next INSERT if so
      • Parameters:
          • src_tbl_id: id to determine the source table (0 to 63)
          • mode: "a" to drop duplicates, "w" to reset
          • src_row_id: row ID obtained from the source table
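To see why a monotonically increasing row ID is enough to drop duplicates, consider a toy model of the destination side. This is a sketch of one plausible mechanism, not Q4M's implementation: it assumes the destination remembers the last row ID accepted per source slot and skips any INSERT whose ID is not newer. The class `ToyDestination` and its attributes are hypothetical.

```python
class ToyDestination:
    """Toy model of duplicate detection at a relay destination."""

    def __init__(self):
        self.rows = []
        self.last_seen = [None] * 64  # one slot per src_tbl_id (0 to 63)
        self._pending = None          # (slot, row id) set by queue_set_srcid

    def queue_set_srcid(self, src_tbl_id, mode, src_row_id):
        if not 0 <= src_tbl_id < 64:
            raise ValueError("src_tbl_id must be in 0..63")
        if mode == "w":
            # Reset the duplicate-check state for this source.
            self.last_seen[src_tbl_id] = None
        elif mode != "a":
            raise ValueError("mode must be 'a' or 'w'")
        self._pending = (src_tbl_id, src_row_id)

    def insert(self, row):
        """Insert a row; return False if it was ignored as a duplicate."""
        if self._pending is not None:
            slot, row_id = self._pending
            self._pending = None
            last = self.last_seen[slot]
            # Assumed rule: IDs only grow, so an ID that is not newer
            # than the last accepted one has been relayed already.
            if last is not None and row_id <= last:
                return False
            self.last_seen[slot] = row_id
        self.rows.append(row)
        return True
```

If a relay crashes after the INSERT but before consuming the source row, it will resend the same row ID on restart, and under this rule the second INSERT is ignored.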
  • 48. Pseudo Code
      • Relays data from src_tbl to dest_tbl
        while (true) {
          # wait for data
          SELECT queue_wait(src_tbl) => src_db;
          # read row and rowid
          row := (SELECT * FROM src_tbl => src_db);
          rowid := (SELECT queue_rowid() => src_db);
          # insert the row after setting srcid
          SELECT queue_set_srcid(src_tbl_id, 'a', rowid) => dest_db;
          INSERT INTO dest_tbl (row) => dest_db;
        }
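One iteration of the relay pseudo code could look like this in Python. It is a sketch under stated assumptions, not the q4m-forward script: `src` and `dest` are DB-API 2.0 connections using the `%s` parameter style, the function name `relay_one` and its parameters are hypothetical, and the explicit `queue_end()` at the end is added here so that a single call is complete on its own (the pseudo code relies on the next `queue_wait` instead).

```python
def relay_one(src, dest, src_table, dest_table, columns, src_tbl_id=0):
    """Forward at most one row from src_table to dest_table.

    Table and column names are interpolated into SQL and must come
    from trusted code. Returns True if a row was relayed.
    """
    s, d = src.cursor(), dest.cursor()
    # Wait for data on the source (enters OWNER mode).
    s.execute("SELECT queue_wait(%s)", (src_table,))
    (owned,) = s.fetchone()
    if not owned:
        return False
    # Read the owned row and its internal row ID.
    col_list = ", ".join(columns)
    s.execute("SELECT {} FROM {}".format(col_list, src_table))
    row = s.fetchone()
    s.execute("SELECT queue_rowid()")
    (rowid,) = s.fetchone()
    # Tell the destination which source row comes next, then insert it;
    # the destination ignores the INSERT if it has seen this row already.
    d.execute("SELECT queue_set_srcid(%s, 'a', %s)", (src_tbl_id, rowid))
    d.fetchone()
    marks = ", ".join(["%s"] * len(columns))
    d.execute(
        "INSERT INTO {} ({}) VALUES ({})".format(dest_table, col_list, marks),
        row,
    )
    dest.commit()
    # Only now delete the row from the source queue.
    s.execute("SELECT queue_end()")
    s.fetchone()
    return True
```

Deleting from the source only after the destination INSERT is what keeps a crash from losing the row, while `queue_set_srcid` is what keeps the retry from duplicating it.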
  • 49. q4m-forward
      • Simple forwarder script
      • Installed into mysql-dir/bin
        usage: q4m-forward [options] src_addr dest_addr
        example:
          % support-files/q4m-forward \
              "dbi:mysql:database=db1;table=tbl1;user=foo;password=XXX" \
              "dbi:mysql:database=db2;table=tbl2;host=bar;user=foo"
        options:
          --reset        reset duplicate check info.
          --sender=idx   slot no. used for checking duplicates (0..63, default: 0)
          --help
  • 50. Limitations and the Future of Q4M
  • 51. Things that Need to be Fixed
      • Table compaction is a blocking operation
          • runs when live data becomes <25% of the log file
          • very bad, though not as bad as it seems: it's fast since it's a sequential write operation
      • Relays are slow, since transfer is done row-by-row
      • Binlog does not work, since MQ replication should be synchronous
  • 52. Future of Q4M (maybe)
      • Support for MySQL 5.5
          • not requested yet by current users :-p
      • 2-phase commit with other storage engines
          • queue consumption and InnoDB updates can become an atomic operation
  • 53. Thank you
      • http://q4m.github.com/