Lean & Mean Tokyo Cabinet Recipes (with Lua) - FutureRuby '09

FutureRuby presentation on extending Tokyo Cabinet with Lua extensions.

GitHub repo with sample code & extensions:
http://bit.ly/wJpeG

Usage Rights

© All Rights Reserved

  • Tokyo Cabinet + Lua examples & recipes: http://github.com/igrigorik/tokyo-recipes/tree/master
  • So what is Tokyo Cabinet? As the name implies, Tokyo Cabinet is a software project that was started in Japan, and as any Rubyist knows, Japanese developers have made some amazing contributions. Case in point: Ruby and Matz. Ruby started as a research project back in ’93, made its way to North America around 2001 and 2002, and the rest is history. For that reason, when I stumbled across Tokyo Cabinet, I decided to do some digging. It turns out the project is the brainchild of one developer, and Facebook comes to the rescue to help us put a face to the project. I don’t know about you, but when I saw that photo, I had one association pop up...
  • Yeah? What do you think? In fact, I think the analogy may be well suited because Tokyo Cabinet has all the potential to make it big in the database world.
  • When we talk about TC, we are actually referring to three distinct products under the umbrella: Cabinet, a set of database management routines; Tyrant, a standalone server implementation around Cabinet; and Dystopia, a full-text search engine built on top of Tokyo Cabinet. All of the projects are written in very clean, easy to read, and well documented C, and released under LGPL. For the purposes of this talk, we’ll first focus on Tokyo Cabinet and work through some of the basics, and then move on to Tyrant, which is where we’ll spend most of the time. I’m not going to talk about Dystopia, but I’d encourage you to check it out and play with it.
  • QDBM is/was a library of routines for managing a database, developed from 2000 to 2007. Tokyo Cabinet is the successor to QDBM.
  • - Thread-safe - Row-level locking - Supports many different data layouts, as we will see shortly - Full ACID support - Has bindings to all the most popular languages, which is pretty important considering this is an embedded database
  • TC has many different engines, which, if you’ve ever used MySQL, is exactly the same as deciding between MyISAM, InnoDB, or any other engine. Each one has its advantages, and the right choice really depends on your data and performance requirements. The bread and butter is the hashtable, which is just a key-value store like BDB and many other databases. It’s fast, really fast. The B-Tree table is similar to the hash table, but because of its data layout it allows storage of duplicate keys and ordered iteration over that data. Fixed-length is a no-frills array. It’s probably the fastest engine there is, just because of its simplicity, but you certainly give up a lot of functionality as well. Last but not least, the Table engine is one of the most interesting features of TC. It is essentially a schemaless document store with support for arbitrary indexing and query capabilities. Let’s take a look at a few examples…
  • We’re using Tokyo Cabinet via the rufus/tokyo gem, which means we’re building an embedded database. There are two good alternatives for interfacing with TC from Ruby: the native gem provided by the author, and the rufus/tokyo gem I’m using here, which is built via FFI, which means you can use this library from JRuby, MRI, Rubinius, and so forth. It is definitely a little bit slower, but I like the syntax it provides a lot more than the native gem.
  • It walks & talks just like a Ruby Hash. Very easy to use.
  • Create a query, order by ‘age’. In this case we didn’t declare any explicit indexes, but the dataset is small so the query is still very fast. If we were working with a larger dataset we would explicitly declare an index either via ruby or when we created the database. Check the docs for details on how to do this.
  • Also, as you would expect, there is full transaction support. For this reason alone, it’s not completely unreasonable to think about using TC in place of Ruby hashes in some of your code. It’s dead simple, and fast.
  • http://www.slideshare.net/rawwell/tokyotalk
  • - High concurrency (multi-thread uses epoll/kqueue) - 3 Protocol Options: Binary, memcached, and HTTP - Hot backup - Update logging - Replication (master/slave, master/master) - Lua extensions
  • Interacting via rest_client.. No rufus/tokyo here!
  • Unfortunately, even though the project is very mature at this point, and is being used in production at mixi, finding discussions or support around the project is a bit of a challenge. Mikio regularly writes on his developer blog at mixi, but even that is in Japanese. I’m a big fan of statistical machine translation techniques, but there is definitely a lot of room for improvement… What you see on the slide is the start of one of his blog posts from August of last year, in which he goes on to announce… By the way, this is where you should gasp and all jump in joy, because this is huge!
  • "Lua" (pronounced LOO-ah) means "Moon" in Portuguese. As such, it is neither an acronym nor an abbreviation, but a noun. Lua runs on all flavors of Unix and Windows, and also on mobile devices (such as handheld computers and cell phones that use BREW, Symbian, Pocket PC, etc.) and embedded microprocessors (such as ARM and Rabbit) for applications like Lego MindStorms. Lua is a fast language engine with a small footprint that you can embed easily into your application. Lua has a simple and well documented API that allows strong integration with code written in other languages. Lua allows many applications to easily add scripting within a sandbox…
  • So why is Lua + Tokyo Cabinet so interesting after all? Who’s used MySQL UDFs? Anyone written one? They’re a pain on both counts. User-defined functions are compiled as object files and then added to and removed from the server dynamically using the CREATE FUNCTION and DROP FUNCTION statements. You can also add functions as native (built-in) MySQL functions. Native functions are compiled into the mysqld server and become available on a permanent basis. A couple of problems: C/C++ plus a very messy internal API. If you get it wrong, you crash the database, so you had better get it right. Why use it? It is faster than triggers and allows linking against other libraries. For example, there is a very popular memcached library which allows you to interface with memcached right from MySQL. There are a number of reasons why you would want to use this, but one great use case is replicating memcached. That is, use the MySQL protocol to replay queries and then update your memcached instances in different data centers, etc. This is exactly how Facebook keeps their clusters in sync.
  • (Translated from Mikio’s post.) Extending the TT protocol and implementation is a hassle. With Lua on the server, you can register any function. All database operations share one common interface, "pass in arguments and get back a result": the client sends a method name string, a key string, and a value string; the server runs the named method with that key and value and returns a string. With that in place, there is no need to define a new protocol for each operation.
  • - Lua Extension * defines arbitrary database operations * atomic operation by record locking - define DB operations as Lua functions * clients call each giving function name and record data * the server returns the return value of the function - options about atomicity * no locking / record locking / global locking. Source: http://www.scribd.com/doc/12016121/Tokyo-Cabinet-and-Tokyo-Tyrant-Presentation
  • Enough handwaving, let’s look at the code… A no-op extension which will return the key and value pair without storing anything in the database. The “..” syntax is Lua’s concat operator for strings.
  • Next, we start the Tokyo Tyrant server via the command line utility ‘ttserver’ and pass in the -ext parameter with the name of the extension. That’s all you need to start the server with our new extension. After that, we can use the manager utility and invoke some commands. First we specify ext as the command name, which indicates that we’re going to be calling an extension, then the address of the server, then the function to call, which is “echo”, and finally we pass in our key and value. In return, we get the string from TC!
  • Alternatively, we could do the same thing from ruby. Except this time, instead of creating a local database, we’re going to specify an IP address and port. From there, we call the ext method again, pass in the name of the command as a symbol and the parameters. Voila.
  • Ok, so echo is a cute example, but let’s look at something slightly more interesting. We’re going to build an increment command in Lua. Now, TC already has this in its API, but we’re going to do this for the sake of an example.
  • Redis is a key-value database. It is similar to memcached but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists and sets with atomic operations to push/pop elements. In order to be very fast but at the same time persistent, the whole dataset is kept in memory, and from time to time and/or when a number of changes to the dataset are performed it is written asynchronously to disk. You may lose the last few queries, which is acceptable in many applications, but it is as fast as an in-memory DB (Redis supports non-blocking master-slave replication in order to solve this problem by redundancy).
  • Do some iteration over keys, and finally append the new value if it’s actually new.
  • In similar fashion, implement delete, get, and length functions and we have a minimal set of SET operations in TC. Of course, this is not going to be as fast as a native implementation in Redis, but that’s not the point either. The point is the potential extensibility of the database.
  • The -extpc argument allows us to specify a function and an interval, in this case 5 seconds, at which the Tokyo Tyrant server will execute it. This means that our cleanup script will run every 5 seconds, effectively removing the expired records from the database! Additionally, we’re going to add an index on x, in decreasing order, to help speed up this operation. Once again, TC+Lua = memcached? Not quite. Memcached will perform better because it doesn’t bother to sweep its memory for expired records; it just lets them hang around and evicts them when it runs out of free memory. But it’s a great example of scripting your database.
  • Finally, here’s a real use case deployed at mixi and documented by Mikio. They use Tyrant as a proxy cache server in front of a tier of their database. They use Lua to interface with their cache and fall back to the actual MySQL and HBase tables on a cache miss or write. This is possible because Lua has bindings for MySQL, which means that the entire cache server layer is built in Lua and runs as Tokyo Tyrant. No need for any extra frills, and the different TC engines allow them to customize the cache layout to fit their use case.
  • Another example documented by Mikio is a session trail tracker. Instead of analyzing the logs in offline fashion, they built a system which records each user’s visits based on their session cookie. In this case, there are two sessions, one with ID 1 and a second with ID 2. The first user visits resources 123, 256, and 987, which are then easily retrieved via the list operation. The source for this is available in my GitHub repo.
  • Back in November of ’08, Mikio added another native interface for invoking MapReduce jobs on the database. There is no particular reason for this, as this implementation does not really have the distributed features that make MapReduce what it is, but it is a great example of a programming paradigm making the rounds… To get started with this functionality, you will once again need Lua, and we’ll have to build just two functions to make it all work: a mapper and a reducer. Let’s take a look at an example…
  • We’re going to build a wordcount example. That is, we’re going to assume that the values in our database are strings, and we’ll try to get a word frequency count across all the keys. To do that, we’re going to iterate over all the keys, which will be passed to the mapper function repeatedly, and emit a temporary tuple, where each word will have a count of 1.
  • Next, the reducer function takes over. In the background, TC aggregates all the results with the same key and passes them into reduce. At this point, we simply count the number of values for each word, and we’re done!
  • Start the server, add a few strings, and then execute our MR job. Easy as that.
  • On that note: hopefully these examples gave you a taste for how easy it is to extend Tokyo Cabinet, and I would really encourage you to give it a try. All of the examples we went through in the slides are available in my GitHub repo, which I created for this talk, as well as a number of other examples. Take a look at the high-low-game and the inverted-index extensions. Both are great examples of what you can do with less than a hundred lines of Lua and TC.
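
For readers without a Tyrant server at hand, the set-append logic described in the notes can be modeled in plain Ruby. This is only a sketch of the idea, not the Lua extension itself: a Hash stands in for the database, and the separator `SEP` and the method names `set_append` and `set_get` are assumptions chosen for illustration.

```ruby
# Plain-Ruby model of the set-append logic; a Hash stands in for the database.
# SEP and the method names are assumptions made for this sketch.
SEP = ','

# Appends value to the delimited string stored under key.
# Returns the value when it was added, or nil when it was already a member.
def set_append(store, key, value)
  stream = store[key]
  if stream.nil?
    store[key] = value # empty set: store the first member
  else
    return nil if stream.split(SEP).include?(value)

    store[key] = stream + SEP + value
  end
  value
end

# Returns the members of the set as an array.
def set_get(store, key)
  store.fetch(key, '').split(SEP)
end

store = {}
set_append(store, 'key', '1')
set_append(store, 'key', '2')
set_append(store, 'key', '1') # duplicate, ignored
puts set_get(store, 'key')
```

Storing the set as one delimited string keeps each append to a single concatenation, at the cost of a linear scan to check membership, which is the trade-off the notes accept in exchange for extensibility.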

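The periodic expiry sweep from the TTL notes can be sketched the same way. The following is a plain-Ruby model under stated assumptions: a Hash of column Hashes stands in for the table database, the expiry column is named `x` as in the notes, and the `expire` method signature is hypothetical.

```ruby
# Plain-Ruby model of the periodic expiry sweep; a Hash stands in for the table.
# The 'x' column name follows the notes; everything else is hypothetical.

# Deletes every record whose expiry time 'x' is at or before now.
# Returns the number of records removed.
def expire(table, now = Time.now.to_i)
  before = table.size
  table.delete_if { |_pk, columns| columns['x'].to_i <= now }
  before - table.size
end

table = {
  'key1' => { 'value' => 'value1', 'x' => '10' },
  'key2' => { 'value' => 'value2', 'x' => '20' }
}
removed = expire(table, 15)
puts "removed=#{removed} rnum=#{table.size}"
```

In the real setup the server calls the sweep on a timer, so the caller never passes a timestamp; the explicit `now` argument here only makes the sketch easy to exercise.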
Lean & Mean Tokyo Cabinet Recipes (with Lua) - FutureRuby '09 Presentation Transcript

  • 1. Lean & Mean Tokyo Cabinet Recipes
    Ilya Grigorik
    @igrigorik
  • 2. postrank.com/topic/ruby
    The slides…
    Twitter
    My blog
  • 3. Mikio Hirabayashi
    Yukihiro Matsumoto
  • 4. Mikio Hirabayashi
    Yukihiro Matsumoto
    ???
  • 5.
  • 6.
  • 7.
  • 8. Choose your engine
    1. Hashtable
    Berkeley DB, DBM, QDB, TDB…
    2. B-Tree Table
    Key-Value with duplicates & ordering
    3. Fixed-length
    An in-memory array. No hashing.
    4. Table Engine
    Schemaless, indexes & queries
  • 9. gem install rufus-tokyo
    require 'rubygems'
    require 'rufus/tokyo'
    db = Rufus::Tokyo::Cabinet.new('data.tch')
    db['nada'] = 'surf'
    p db['nada'] # => 'surf'
    p db['lost'] # => nil
    db.close
    TC: Hashtable
  • 10. ~ Ruby Hash
    TC: Hashtable
  • 11. require 'rubygems'
    require 'rufus/tokyo'
    t = Rufus::Tokyo::Table.new('table.tct')
    t['pk0'] = { 'name' => 'alfred', 'age' => '22' }
    t['pk1'] = { 'name' => 'bob', 'age' => '18', 'sex' => 'male' }
    t['pk2'] = { 'name' => 'charly', 'age' => '45' }
    t['pk4'] = { 'name' => 'ephrem', 'age' => '32' }
    p t.query { |q|
      q.add_condition 'age', :numge, '32'
      q.order_by 'age'
    }
    # => [ {"name"=>"ephrem", :pk=>"pk4", "age"=>"32"},
    #      {"name"=>"charly", :pk=>"pk2", "age"=>"45"} ]
    t.close
    Table Engine
    TC: Table Engine
  • 12. age >= 32, order by age
    TC: Table Engine
  • 13. p t.size # => 0
    t.transaction do
      t['pk0'] = { 'name' => 'alfred', 'age' => '22' }
      t['pk1'] = { 'name' => 'bob', 'age' => '18' }
      t.abort
    end
    p t.size # => 0
    Uh oh…
    TC: Table Engine Transactions
  • 14. Network
    Embedded
  • 15.
  • 16. require "rubygems"
    require "rest_client"
    # Interacting with TokyoTyrant via RESTful HTTP!
    db = RestClient::Resource.new("http://localhost:1978")
    db["key"].put "value 1" # insert via HTTP
    db["key"].put "value 2" # update via HTTP
    puts db["key"].get # get via HTTP
    # => "value 2"
    db["key"].delete # delete via HTTP
    puts db["key"].get rescue RestClient::ResourceNotFound
    RESTful Tokyo Tyrant
  • 17. RESTful Tokyo Tyrant
    Awesome.
  • 18. “Recently, I sophisticated Hanami and the Sumida River in a houseboat, I was sad that day and not even a minute yet mikio bloom …”
    … so I added Lua scripting to Tyrant.
    http://alpha.mixi.co.jp/blog/?p=236
  • 19. “Powerful, fast, lightweight, embeddable scripting language”
    • Procedural syntax
    • 20. Everything is an associative array
    • 21. Dynamically typed
    • 22. Interpreted bytecode
    • 23. Garbage collection
    GZIP(Source + Docs + Examples) = 212 Kb
    What is Lua?
    It’s like Ruby.. except it’s not.
    Fast + Lightweight = Great for embedded apps
  • 24. +
    CREATE FUNCTION json_members RETURNS STRING SONAME 'lib_mysqludf_json.so';
    SELECT json_object(customer_id, first_name) FROM customer;
    +---------------------------------------------------+
    | customer |
    +---------------------------------------------------+
    | {customer_id:1,first_name:"MARY"} |
    +---------------------------------------------------+
    Extending the Database?
    MySQL User Defined Functions
    JSON Response
    http://www.mysqludf.org/lib_mysqludf_json/index.php
  • 25. = C/C++
    +
    +
    = Lua
    TC+Lua? Why?
    To make our lives easier, and more fun!
    Easy to learn & easy to extend!
  • 26. Lua extension within Tokyo Cabinet
    _put(key, value)
    _putkeep(key, value)
    _putcat(key, value)
    _rnum()
    _vanish()
    _mapreduce(mapper, reducer, keys)
    _out(key)
    _get(key)
    _vsiz(key)
    _addint(key, value)
    TC + Lua Extensions
    Request / Response data-flow
    http://tokyocabinet.sourceforge.net/tyrantdoc/#luaext
  • 27. --
    -- echo.lua
    --
    function echo(key, value)
       return key .. ":" .. value
    end
    [ilya@igvita] > ttserver -ext echo.lua test.tch
    [ilya@igvita] > tcrmgr ext localhost echo foo bar
    foo:bar
    require 'rubygems'
    require 'rufus/tokyo/tyrant' # sudo gem install rufus-tokyo
    t = Rufus::Tokyo::Tyrant.new('127.0.0.1', 1978)
    puts t.ext(:echo, 'hello', 'world')
    t.close
    Lua + TC Echo Server
  • 30. --
    -- incr.lua
    --
    function incr(key, i)
       i = tonumber(i)
       if not i then
          return nil
       end
       local old = tonumber(_get(key))
       if old then
          i = old + i
       end
       if not _put(key, i) then
          return nil
       end
       return i
    end
    Verify input
    Implementing INCR in Lua+TC
  • 31. Get old value & increment it
    Implementing INCR in Lua+TC
  • 32. Save new value
    Implementing INCR in Lua+TC
  • 33. [ilya@igvita] > ttserver -ext incr.lua test.tch
    [ilya@igvita] > tcrmgr ext localhost incr keyname 1
    1
    [ilya@igvita] > tcrmgr ext localhost incr keyname 5
    6
    require 'rubygems'
    require 'rufus/tokyo/tyrant' # sudo gem install rufus-tokyo
    t = Rufus::Tokyo::Tyrant.new('127.0.0.1', 1978)
    5.times do
      puts t.ext(:incr, 'my-counter', 2).to_i
    end
    t.close
    Implementing INCR in Lua+TC
  • 35. Lua + TC = Database Kung-fu
    TTL, Sets & Caching
  • 36. “Redis as a data structures server, it is not just another key-value DB”
  • 37. function set_append(key, value)
       local stream = _get(key)
       if not stream then
          _put(key, value)
       else
          local set_len = _set_len(stream)
          if set_len == 1 then
             if stream == value then
                return nil
             end
          elseif set_len > 1 then
             for _, element in ipairs(_split(stream, SEP)) do
                if element == value then
                   return nil
                end
             end
          end
          if not _putcat(key, SEP .. value) then
             return nil
          end
       end
       return value
    end
    Empty Set
    Implementing Set operations in TC
  • 38. Append key if unique
    Implementing Set operations in TC
  • 39. = ?
    +
    [ilya@igvita] > ttserver -ext set.lua test.tch
    [ilya@igvita] > tcrmgr ext localhost set_append key 1
    [ilya@igvita] > tcrmgr ext localhost set_append key 2
    [ilya@igvita] > tcrmgr ext localhost set_append key 1
    [ilya@igvita] > tcrmgr ext localhost set_get key
    1
    2
    set_length
    set_get
    set_delete
    set_append
    Implementing Set operations in TC
  • 40. “memcached is a general-purpose distributed memory caching system that is used by many top sites on the internet”
    Key    Value    Time
    key1   value1   10
    key2   value2   20
    Time = 15
    key2   value2   30
    Implementing TTL’s in TC
  • 41. DELETE where x <= Time.now
    function expire()
       local args = {}
       local cdate = string.format("%d", _time())
       table.insert(args, "addcond\0x\0NUMLE\0" .. cdate)
       table.insert(args, "out")
       local res = _misc("search", args)
       if not res then
          _log("expiration was failed")
       end
       print("rnum=" .. _rnum() .. " size=" .. _size())
    end
    Expiring Records with Lua
  • 42. = ?
    +
    [ilya@igvita] > ttserver -ext expire.lua -extpc expire 5 "casket.tct#idx=x:dec"
    Invoke “expire” command every 5 seconds
    Table database, with index on expiry column (x)
    Implementing Set operations in TC
  • 43.
  • 44. [ilya@igvita] >ttserver -ext session-trail.lua test.tch
    [ilya@igvita] > tcrmgr ext localhost add 1 123
    [ilya@igvita] > tcrmgr ext localhost add 1 256
    [ilya@igvita] > tcrmgr ext localhost add 1 987
    [ilya@igvita] > tcrmgr ext localhost add 2 987
    [ilya@igvita] > tcrmgr ext localhost list 1
    987 1247008220
    256 1247008216
    123 1247008123
    Session-trail with Lua
    Timestamped session trail
  • 45. Lua + TC = Map Reduce!
    Just for kicks.
  • 46. _out(key)
    _get(key)
    _vsiz(key)
    _addint(key, value)
    _mapreduce(mapper, reducer, keys)
    Executing MR jobs within Tokyo Cabinet
  • 47. function wordcount()
       function mapper(key, value, mapemit)
          for word in string.gmatch(string.lower(value), "%w+") do
             mapemit(word, 1)
          end
          return true
       end
       local res = ""
       function reducer(key, values)
          res = res .. key .. " " .. #values .. " "
          return true
       end
       if not _mapreduce(mapper, reducer) then
          res = nil
       end
       return res
    end
    Emit: {word: 1}
    Map-Reduce within Tokyo Cabinet
  • 48. Emit: {word: 1}
    sizeof(values)
    Map-Reduce within Tokyo Cabinet
    [ilya@igvita] > ttserver -ext wordcount.lua test.tch
    [ilya@igvita] > tcrmgr put localhost 1 "This is a pen."
    [ilya@igvita] > tcrmgr put localhost 1 "Hello World"
    [ilya@igvita] > tcrmgr put localhost 1 "Life is good"
    [ilya@igvita] > tcrmgr ext localhost wordcount
    a 1
    good 1
    is 2
    life 1
    pen 1
    this 1
    Execute Map-Reduce Job
    Map-Reduce within Tokyo Cabinet
  • 50. github.com/igrigorik/tokyo-recipes
    The slides…
    Twitter
    My blog
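
The mapper/reducer contract from the word-count slides can also be tried without a server. The following plain-Ruby sketch mimics the flow only: it is not how Tokyo Cabinet implements it, a Hash stands in for the database, and `run_mapreduce` and `word_count` are hypothetical names introduced for illustration.

```ruby
# Plain-Ruby stand-in for the map/reduce flow; no Tokyo Cabinet required.
# All names here are hypothetical and chosen only for illustration.

# Calls the mapper for every record, groups emitted values by key,
# then calls the reducer once per key in sorted key order.
def run_mapreduce(records, mapper, reducer)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  emit = ->(key, value) { grouped[key] << value }
  records.each { |key, value| mapper.call(key, value, emit) }
  grouped.sort.map { |key, values| reducer.call(key, values) }
end

# Builds a word-frequency list across all record values.
def word_count(records)
  # Emit {word: 1} for every word in the record's value.
  mapper = lambda do |_key, value, emit|
    value.downcase.scan(/[a-z0-9]+/) { |word| emit.call(word, 1) }
  end
  # Count how many values were emitted for each word.
  reducer = ->(word, values) { [word, values.size] }
  run_mapreduce(records, mapper, reducer)
end

records = { '1' => 'This is a pen.', '2' => 'Life is good' }
word_count(records).each { |word, count| puts "#{word} #{count}" }
```

Grouping by key between the two phases is the step the server performs in the background; the reducer only ever sees one word together with all of its emitted values.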