SlideShare a Scribd company logo
1 of 34
Download to read offline
Introduction to
Tokyo Products

          Mikio Hirabayashi
              <hirarin@gmail.com>
Tokyo Products
• Tokyo Cabinet
  – database library
• Tokyo Tyrant                                    applications
                                                                               Prome
  – database server              custom storage    Tyrant           Dystopia
                                                                                nade


• Tokyo Dystopia                                        Cabinet
  – full-text search engine
                                                      file system
• Tokyo Promenade
  – content management system


• open source
  – released under LGPL
• powerful, portable, practical
  – written in the standard C, optimized to POSIX
Tokyo Cabinet
 - database library -
Features
• modern implementation of DBM
 – key/value database
   • e.g.) DBM, NDBM, GDBM, TDB, CDB, Berkeley DB
 – simple library = process embedded
 – Successor of QDBM
   • C99 and POSIX compatible, using Pthread, mmap, etc...
   • Win32 porting is work-in-progress

• high performance
 – insert: 0.4 sec/1M records (2,500,000 qps)
 – search: 0.33 sec/1M records (3,000,000 qps)
• high concurrency
  – multi-thread safe
  – read/write locking by records
• high scalability
  – hash and B+tree structure = O(1) and O(log N)
  – no actual limit size of a database file (to 8 exabytes)
• transaction
  – write ahead logging and shadow paging
  – ACID properties
• various APIs
  – on-memory list/hash/tree
  – file hash/B+tree/array/table
• script language bindings
  – Perl, Ruby, Java, Lua, Python, PHP, Haskell, Erlang, etc...
TCHDB: Hash Database
• static hashing                  bucket array

  – O(1) time complexity

• separate chaining                key           value

  – binary search tree             key           value

  – balances by the second hash    key           value

• free block pool                  key           value

                                   key           value
  – best fit allocation
  – dynamic defragmentation        key           value


• combines mmap and                key

                                   key
                                                 value

                                                 value
  pwrite/pread                     key           value
  – saves calling system calls
                                   key           value
• compression                      key           value
  – deflate(gzip)/bzip2/custom
TCBDB: B+ Tree Database
• B+ tree                                     key   value
  – O(log N) time complexity   B tree index   key   value

• page caching                                key
                                              key
                                                    value
                                                    value
  – LRU removing
                                              key   value
  – speculative search
                                              key   value
• stands on hash DB                           key   value
  – records pages in hash DB                  key   value
  – succeeds time and space                   key   value
    efficiency                                key   value
• custom comparison                           key   value

  function                                    key   value

  – prefix/range matching                     key   value
                                              key   value
• cursor                                      key   value
  – jump/next/prev                            key   value
TCFDB: Fixed-length Database
• array of fixed-
  length elements          array
  – O(1) time complexity     value   value   value   value
  – natural number keys      value   value   value   value
  – addresses records by     value   value   value   value
    multiple of key          value   value   value   value

• most effective             value

                             value
                                     value

                                     value
                                             value

                                             value
                                                     value

                                                     value
  – bulk load by mmap        value   value   value   value
  – no key storage per       value   value   value   value
    record                   value   value   value   value
  – extremely fast and       value   value   value   value
    concurrent
TCTDB: Table Database
• column based
  – the primary key and named
    columns
  – stands on hash DB                                                    bucket array

• flexible structure                   primary key   name value name value name value
  – no data scheme, no data type
  – various structure for each
    record                             primary key   name value name value name value


• query mechanism                      primary key   name value name value name value
  – various operators matching
    column values                      primary key   name value name value name value
  – lexical/decimal orders by column
    values                             primary key   name value name value name value

• column indexes
  –   implemented with B+ tree         primary key   name value name value name value

  –   typed as string/number
  –   inverted index of token/q-gram                            column index
  –   query optimizer
On-memory Structures
• TCXSTR: extensible string
  – concatenation, formatted allocation
• TCLIST: array list (dequeue)
  – random access by index
  – push/pop, unshift/shift, insert/remove
• TCMAP: map of hash table
  – insert/remove/search
  – iterator by order of insertion
• TCTREE: map of ordered tree
  – insert/remove/search
  – iterator by order of comparison function
Other Mechanisms
• abstract database
  – common interface of 6 schema
     • on-memory hash, on-memory tree
     • file hash, file B+tree, file array, file table
  – decides the concrete scheme in runtime

• remote database
  – network interface of the abstract database
  – yes, it's Tokyo Tyrant!

• miscellaneous utilities
  – string processing, filesystem operation
  – memory pool, encoding/decoding
Example Code
#include   <tcutil.h>
#include   <tchdb.h>
#include   <stdlib.h>
#include   <stdbool.h>
#include   <stdint.h>

int main(int argc, char **argv){

  TCHDB *hdb;
  int ecode;
  char *key, *value;

  /* create the object */
  hdb = tchdbnew();

  /* open the database */
  if(!tchdbopen(hdb, "casket.hdb", HDBOWRITER | HDBOCREAT)){
    ecode = tchdbecode(hdb);                                       /* traverse records */
    fprintf(stderr, "open error: %s¥n", tchdberrmsg(ecode));       tchdbiterinit(hdb);
  }                                                                while((key = tchdbiternext2(hdb)) != NULL){
                                                                     value = tchdbget2(hdb, key);
  /* store records */                                                if(value){
  if(!tchdbput2(hdb, "foo", "hop") ||                                  printf("%s:%s¥n", key, value);
     !tchdbput2(hdb, "bar", "step") ||                                 free(value);
     !tchdbput2(hdb, "baz", "jump")){                                }
    ecode = tchdbecode(hdb);                                         free(key);
    fprintf(stderr, "put error: %s¥n", tchdberrmsg(ecode));        }
  }
                                                                   /* close the database */
  /* retrieve records */                                           if(!tchdbclose(hdb)){
  value = tchdbget2(hdb, "foo");                                     ecode = tchdbecode(hdb);
  if(value){                                                         fprintf(stderr, "close error: %s¥n", tchdberrmsg(ecode));
    printf("%s¥n", value);                                         }
    free(value);
  } else {                                                         /* delete the object */
    ecode = tchdbecode(hdb);                                       tchdbdel(hdb);
    fprintf(stderr, "get error: %s¥n", tchdberrmsg(ecode));
  }                                                                return 0;
                                                               }
Tokyo Tyrant
- database server -
Features
• network server of Tokyo Cabinet
 – client/server model
 – multi applications can access one database
 – effective binary protocol
• compatible protocols
 – supports memcached protocol and HTTP
 – available from most popular languages
• high concurrency/performance
 – resolves "c10k" with epoll/kqueue/eventports
 – 17.2 sec/1M queries (58,000 qps)
• high availability
  – hot backup and update log
  – asynchronous replication between servers
• various database schema
  – using the abstract database API of Tokyo Cabinet
• effective operations
  – no-reply updating, multi-record retrieval
  – atomic increment
• Lua extension
  – defines arbitrary database operations
  – atomic operation by record locking
• pure script language interfaces
  – Perl, Ruby, Java, Python, PHP, Erlang, etc...
Asynchronous Replication
 master and slaves                                 dual master
 (load balancing)                                  (fail over)

                          write query

      master server                      client
                                                          client
            database                                                 query if the master is dead
                                                  query
                            read query
           update log                             active master                 standby master
                            with load balancing

                                                      database                        database
replicate
the difference
slave server            slave server                  update log                     update log
                                                                   replicate
      database              database
                                                                   the difference



     update log             update log
Thread Pool Model
                                 epoll/kqueue            listen
accept the client connection
if the event is about the listener           queue   first of all, the listening socket is enqueued into
                                                     the epoll queue
           accept           epoll_ctl(add)
                                                                           queue back if keep-alive
                            epoll_wait

                                                         task manager
                            epoll_ctl(del)
                                                                  queue
                                                     enqueue
          move the readable client socket
          from the epoll queue to the task queue

                                                       deque
                                                                                 worker thread

                                                                                 worker thread

                                                     do each task                worker thread
Lua Extention
• defines DB operations as Lua functions
  – clients send the function name and record data
  – the server returns the return value of the function

• options about atomicity
  – no locking / record locking / global locking


  front end           request           back end
                      - function name
                      - key data            Tokyo Tyrant
                      - value data
    Clients
                                            Tokyo Cabinet   Lua processor
                     response
                     - result data

                                                                  script
                                              database          user-defined
                                                                operations
case: Timestamp DB at mixi.jp
• 20 million records               mod_perl
  – each record size is 20 bytes
                                      home.pl          update
• more than 10,000                  show_friend.pl
                                                                 TT (active)


  updates per sec.                   view_diary.pl
                                                                  database

  – keeps 10,000 connections
                                         search.pl               replication
• dual master                            other pages

  replication                                                     TT (standby)

  – each server is only one                                         database
                                                         fetch
• memcached                         list_friend.pl



  compatible protocol
                                    list_bookmark.pl



  – reuses existing Perl clients
case: Cache for Big Storages
• works as proxy                   clients                        1. inserts to the storage
  – mediates insert/search                                        2. inserts to the cache

  – write through, read through

• Lua extension
                                                 Tokyo Tyrant          MySQL/hBase


  – atomic operation by record                    atomic insert
                                                                          database
    locking                                             Lua
  – uses LuaSocket to access
    the storage                                                           database
                                                 atomic search
• proper DB scheme                                      Lua
  – TCMDB: for generic cache                                              database

  – TCNDB: for biased access
  – TCHDB: for large records                           cache
                                                                          database
    such as image                 1. retrieves from the cache
  – TCFDB: for small records           if found, return
                                  2. retrieves from the storage
    such as timestamp             3. inserts to the cache
Example Code
#include   <tcrdb.h>
#include   <stdlib.h>
#include   <stdbool.h>
#include   <stdint.h>

int main(int argc, char **argv){

  TCRDB *rdb;
  int ecode;
  char *value;

  /* create the object */
  rdb = tcrdbnew();

  /* connect to the server */
  if(!tcrdbopen(rdb, "localhost", 1978)){
    ecode = tcrdbecode(rdb);
    fprintf(stderr, "open error: %s¥n", tcrdberrmsg(ecode));
  }

  /* store records */
  if(!tcrdbput2(rdb, "foo", "hop") ||
     !tcrdbput2(rdb, "bar", "step") ||
     !tcrdbput2(rdb, "baz", "jump")){
    ecode = tcrdbecode(rdb);
    fprintf(stderr, "put error: %s¥n", tcrdberrmsg(ecode));
  }                                                                /* close the connection */
                                                                   if(!tcrdbclose(rdb)){
  /* retrieve records */                                             ecode = tcrdbecode(rdb);
  value = tcrdbget2(rdb, "foo");                                     fprintf(stderr, "close error: %s¥n", tcrdberrmsg(ecode));
  if(value){                                                       }
    printf("%s¥n", value);
    free(value);                                                   /* delete the object */
  } else {                                                         tcrdbdel(rdb);
    ecode = tcrdbecode(rdb);
    fprintf(stderr, "get error: %s¥n", tcrdberrmsg(ecode));        return 0;
  }                                                            }
Tokyo Dystopia
- full-text search engine -
Features
• full-text search engine
  – manages databases of Tokyo Cabinet as an inverted
    index

• combines two tokenizers
  – character N-gram (bi-gram) method
     • perfect recall ratio
  – simple word by outer language processor
     • high accuracy and high performance

• high performance/scalability
  – handles more than 10 million documents
  – searches in milliseconds
• optimized to professional use
  – layered architecture of APIs
  – no embedded scoring system
    • to combine outer scoring system
  – no text filter, no crawler, no language
    processor
• convenient utilities
  – multilingualism with Unicode
  – set operations
  – phrase matching, prefix matching, suffix
    matching, and token matching
  – command line utilities
Inverted Index
• stands on key/value database
  – key = token
     • N-gram or simple word
  – value = occurrence data (posting list)
     • list of pairs of document number and offset in the document

• uses B+ tree database
  – reduces write operations into the disk device
  – enables common prefix search for tokens
  – delta encoding and deflate compression

       ID:21            text: "abracadabra"
          a    -   21:10          ca - 21:1, 21:8
          ab   -   21:0,21:7      da - 21:4
          ac   -   21:3           ra - 21:2, 21:9
          br   -   21:5
Layered Architecture
• character N-gram index
  – "q-gram index" (only index), and "indexed database"
  – uses embedded tokenizer

• word index
  – "word index" (only index), and "tagged index"
  – uses outer tokenizer
                                                Applications
                             Character N-gram Index            Tagging Index

                               indexed database         tagged database

                                 q-gram index              word index

                                              Tokyo Cabinet
case: friend search at mixi.jp
• 20 million records
  – each record size is 1K bytes                   query               user interface
  – name and self introduction
                                      merger
• more than 100 qps                       TT's        social               query

• attribute narrowing                    cache        graph

  – gender, address, birthday                                         searcher
  – multiple sort orders              copy the social graph
                                                                        inverted   attribute

• distributed processing
                                                                          index       DB


  – more than 10 servers              indexer
  – indexer, searchers, merger                                      copy the index and the DB
                                        inverted     attribute
• ranking by social                       index         DB


  graph                                                          dump profile data

  – the merger scores the result by
    following the friend links                                    profile DB
Example Code
#include   <dystopia.h>
#include   <stdlib.h>
#include   <stdbool.h>
#include   <stdint.h>

int main(int argc, char **argv){
  TCIDB *idb;
  int ecode, rnum, i;
  uint64_t *result;
  char *text;
                                                                   /* search records */
  /* create the object */                                          result = tcidbsearch2(idb, "john || thomas", &rnum);
  idb = tcidbnew();                                                if(result){
                                                                     for(i = 0; i < rnum; i++){
  /* open the database */                                              text = tcidbget(idb, result[i]);
  if(!tcidbopen(idb, "casket", IDBOWRITER | IDBOCREAT)){               if(text){
    ecode = tcidbecode(idb);                                             printf("%d¥t%s¥n", (int)result[i], text);
    fprintf(stderr, "open error: %s¥n", tcidberrmsg(ecode));             free(text);
  }                                                                    }
                                                                     }
  /* store records */                                                free(result);
  if(!tcidbput(idb, 1, "George Washington") ||                     } else {
     !tcidbput(idb, 2, "John Adams") ||                              ecode = tcidbecode(idb);
     !tcidbput(idb, 3, "Thomas Jefferson")){                         fprintf(stderr, "search error: %s¥n", tcidberrmsg(ecode));
    ecode = tcidbecode(idb);                                       }
    fprintf(stderr, "put error: %s¥n", tcidberrmsg(ecode));
  }                                                                /* close the database */
                                                                   if(!tcidbclose(idb)){
                                                                     ecode = tcidbecode(idb);
                                                                     fprintf(stderr, "close error: %s¥n", tcidberrmsg(ecode));
                                                                   }

                                                                   /* delete the object */
                                                                   tcidbdel(idb);

                                                                   return 0;
                                                               }
Tokyo Promenade
- content management system -
Features
• content management system
  – manages Web contents easily with a browser
  – available as BBS, Blog, and Wiki

• simple and logical interface
  – aims at conciseness like LaTeX
  – optimized for text browsers such as w3m and Lynx
  – complying with XHTML 1.0 and considering WCAG 1.0

• high performance/throughput
  – implemented in pure C
  – uses Tokyo Cabinet and supports FastCGI
  – 0.836ms/view (more than 1,000 qps)
• sufficient functionality
  – simple Wiki formatting
  – file uploader and manager
  – user authentication by the login form
  – guest comment authorization by a riddle
  – supports the sidebar navigation
  – full-text/attribute search, calendar view
  – Atom feed
• flexible customizability
  – thorough separation of logic and presentation
  – template file to generate the output
  – server side scripting by the Lua extension
  – post processing by outer commands
Example Code
#!   Introduction to Tokyo Cabinet
#c   2009-11-05T18:58:39+09:00
#m   2009-11-05T18:58:39+09:00
#o   mikio
#t   database,programming,tokyocabinet

This article describes what is [[Tokyo
Cabinet|http://1978th.net/tokyocabinet/]] and
how to use it.

@ upfile:1257415094-logo-ja.png

* Features

- modern implementation of DBM
-- key/value database
-- e.g.) DBM, NDBM, GDBM, TDB, CDB, Berkeley
DB
- simple library = process embedded
- Successor of QDBM
-- C99 and POSIX compatible, using Pthread,
mmap, etc...
-- Win32 porting is work-in-progress
- high performance
- insert: 0.4 sec/1M records (2,500,000 qps)
- search: 0.33 sec/1M records (3,000,000 qps)
innovating more and yet more...
               http://1978th.net/
Introduction to Tokyo Products

More Related Content

What's hot

Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisArnab Mitra
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentationHyphen Call
 
Polyglot Persistence - Two Great Tastes That Taste Great Together
Polyglot Persistence - Two Great Tastes That Taste Great TogetherPolyglot Persistence - Two Great Tastes That Taste Great Together
Polyglot Persistence - Two Great Tastes That Taste Great TogetherJohn Wood
 
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...Edureka!
 
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기흥배 최
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMongoDB
 
Using AI to Build a Self-Driving Query Optimizer with Shivnath Babu and Adria...
Using AI to Build a Self-Driving Query Optimizer with Shivnath Babu and Adria...Using AI to Build a Self-Driving Query Optimizer with Shivnath Babu and Adria...
Using AI to Build a Self-Driving Query Optimizer with Shivnath Babu and Adria...Databricks
 
Apache spark 소개 및 실습
Apache spark 소개 및 실습Apache spark 소개 및 실습
Apache spark 소개 및 실습동현 강
 
Management Zabbix with Terraform
Management Zabbix with TerraformManagement Zabbix with Terraform
Management Zabbix with TerraformAécio Pires
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataDataWorks Summit
 
Sqoop on Spark for Data Ingestion
Sqoop on Spark for Data IngestionSqoop on Spark for Data Ingestion
Sqoop on Spark for Data IngestionDataWorks Summit
 
Node.js Express
Node.js  ExpressNode.js  Express
Node.js ExpressEyal Vardi
 
Introduction to Apache Sqoop
Introduction to Apache SqoopIntroduction to Apache Sqoop
Introduction to Apache SqoopAvkash Chauhan
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMike Dirolf
 
Basic Concept of Node.js & NPM
Basic Concept of Node.js & NPMBasic Concept of Node.js & NPM
Basic Concept of Node.js & NPMBhargav Anadkat
 
Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDBMongoDB
 

What's hot (20)

Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentation
 
CouchDB
CouchDBCouchDB
CouchDB
 
Polyglot Persistence - Two Great Tastes That Taste Great Together
Polyglot Persistence - Two Great Tastes That Taste Great TogetherPolyglot Persistence - Two Great Tastes That Taste Great Together
Polyglot Persistence - Two Great Tastes That Taste Great Together
 
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
 
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
KGC 2016 오픈소스 네트워크 엔진 Super socket 사용하기
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Using AI to Build a Self-Driving Query Optimizer with Shivnath Babu and Adria...
Using AI to Build a Self-Driving Query Optimizer with Shivnath Babu and Adria...Using AI to Build a Self-Driving Query Optimizer with Shivnath Babu and Adria...
Using AI to Build a Self-Driving Query Optimizer with Shivnath Babu and Adria...
 
Apache spark 소개 및 실습
Apache spark 소개 및 실습Apache spark 소개 및 실습
Apache spark 소개 및 실습
 
Management Zabbix with Terraform
Management Zabbix with TerraformManagement Zabbix with Terraform
Management Zabbix with Terraform
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
 
Sqoop on Spark for Data Ingestion
Sqoop on Spark for Data IngestionSqoop on Spark for Data Ingestion
Sqoop on Spark for Data Ingestion
 
Introduction to mongodb
Introduction to mongodbIntroduction to mongodb
Introduction to mongodb
 
Node.js Express
Node.js  ExpressNode.js  Express
Node.js Express
 
Introduction to Apache Sqoop
Introduction to Apache SqoopIntroduction to Apache Sqoop
Introduction to Apache Sqoop
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Basic Concept of Node.js & NPM
Basic Concept of Node.js & NPMBasic Concept of Node.js & NPM
Basic Concept of Node.js & NPM
 
Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDB
 
Introduction to NodeJS
Introduction to NodeJSIntroduction to NodeJS
Introduction to NodeJS
 
Mongo DB 102
Mongo DB 102Mongo DB 102
Mongo DB 102
 

Similar to Introduction to Tokyo Products

Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataRoger Xia
 
Doug Cutting on the State of the Hadoop Ecosystem
Doug Cutting on the State of the Hadoop EcosystemDoug Cutting on the State of the Hadoop Ecosystem
Doug Cutting on the State of the Hadoop EcosystemCloudera, Inc.
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...SignalFx
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsKorea Sdec
 
Ruby on Rails & PostgreSQL - v2
Ruby on Rails & PostgreSQL - v2Ruby on Rails & PostgreSQL - v2
Ruby on Rails & PostgreSQL - v2John Ashmead
 
MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014Avinash Ramineni
 
第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandraShun Nakamura
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentationMurat Çakal
 
Spark after Dark by Chris Fregly of Databricks
Spark after Dark by Chris Fregly of DatabricksSpark after Dark by Chris Fregly of Databricks
Spark after Dark by Chris Fregly of DatabricksData Con LA
 
Spark After Dark - LA Apache Spark Users Group - Feb 2015
Spark After Dark - LA Apache Spark Users Group - Feb 2015Spark After Dark - LA Apache Spark Users Group - Feb 2015
Spark After Dark - LA Apache Spark Users Group - Feb 2015Chris Fregly
 
MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014clairvoyantllc
 
Hash Functions FTW
Hash Functions FTWHash Functions FTW
Hash Functions FTWsunnygleason
 
Scaling HDFS to Manage Billions of Files
Scaling HDFS to Manage Billions of FilesScaling HDFS to Manage Billions of Files
Scaling HDFS to Manage Billions of FilesHaohui Mai
 
Scaling HDFS to Manage Billions of Files with Key-Value Stores
Scaling HDFS to Manage Billions of Files with Key-Value StoresScaling HDFS to Manage Billions of Files with Key-Value Stores
Scaling HDFS to Manage Billions of Files with Key-Value StoresDataWorks Summit
 
HadoopThe Hadoop Java Software Framework
HadoopThe Hadoop Java Software FrameworkHadoopThe Hadoop Java Software Framework
HadoopThe Hadoop Java Software FrameworkThoughtWorks
 
Modern classification techniques
Modern classification techniquesModern classification techniques
Modern classification techniquesmark_landry
 

Similar to Introduction to Tokyo Products (20)

Hadoop Overview kdd2011
Hadoop Overview kdd2011Hadoop Overview kdd2011
Hadoop Overview kdd2011
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Spring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_dataSpring one2gx2010 spring-nonrelational_data
Spring one2gx2010 spring-nonrelational_data
 
Doug Cutting on the State of the Hadoop Ecosystem
Doug Cutting on the State of the Hadoop EcosystemDoug Cutting on the State of the Hadoop Ecosystem
Doug Cutting on the State of the Hadoop Ecosystem
 
Hadoop london
Hadoop londonHadoop london
Hadoop london
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and models
 
Ruby on Rails & PostgreSQL - v2
Ruby on Rails & PostgreSQL - v2Ruby on Rails & PostgreSQL - v2
Ruby on Rails & PostgreSQL - v2
 
MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014
 
第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra第17回Cassandra勉強会: MyCassandra
第17回Cassandra勉強会: MyCassandra
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentation
 
Spark after Dark by Chris Fregly of Databricks
Spark after Dark by Chris Fregly of DatabricksSpark after Dark by Chris Fregly of Databricks
Spark after Dark by Chris Fregly of Databricks
 
Spark After Dark - LA Apache Spark Users Group - Feb 2015
Spark After Dark - LA Apache Spark Users Group - Feb 2015Spark After Dark - LA Apache Spark Users Group - Feb 2015
Spark After Dark - LA Apache Spark Users Group - Feb 2015
 
MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014MongoDB Replication fundamentals - Desert Code Camp - October 2014
MongoDB Replication fundamentals - Desert Code Camp - October 2014
 
Hash Functions FTW
Hash Functions FTWHash Functions FTW
Hash Functions FTW
 
Scaling HDFS to Manage Billions of Files
Scaling HDFS to Manage Billions of FilesScaling HDFS to Manage Billions of Files
Scaling HDFS to Manage Billions of Files
 
Scaling HDFS to Manage Billions of Files with Key-Value Stores
Scaling HDFS to Manage Billions of Files with Key-Value StoresScaling HDFS to Manage Billions of Files with Key-Value Stores
Scaling HDFS to Manage Billions of Files with Key-Value Stores
 
HadoopThe Hadoop Java Software Framework
HadoopThe Hadoop Java Software FrameworkHadoopThe Hadoop Java Software Framework
HadoopThe Hadoop Java Software Framework
 
Modern classification techniques
Modern classification techniquesModern classification techniques
Modern classification techniques
 
Hadoop
HadoopHadoop
Hadoop
 

Recently uploaded

Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 

Recently uploaded (20)

Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 

Introduction to Tokyo Products

  • 1. Introduction to Tokyo Products Mikio Hirabayashi <hirarin@gmail.com>
  • 2. Tokyo Products • Tokyo Cabinet – database library • Tokyo Tyrant applications Prome – database server custom storage Tyrant Dystopia nade • Tokyo Dystopia Cabinet – full-text search engine file system • Tokyo Promenade – content management system • open source – released under LGPL • powerful, portable, practical – written in the standard C, optimized to POSIX
  • 3. Tokyo Cabinet - database library -
  • 4. Features • modern implementation of DBM – key/value database • e.g.) DBM, NDBM, GDBM, TDB, CDB, Berkeley DB – simple library = process embedded – Successor of QDBM • C99 and POSIX compatible, using Pthread, mmap, etc... • Win32 porting is work-in-progress • high performance – insert: 0.4 sec/1M records (2,500,000 qps) – search: 0.33 sec/1M records (3,000,000 qps)
  • 5. • high concurrency – multi-thread safe – read/write locking by records • high scalability – hash and B+tree structure = O(1) and O(log N) – no actual limit size of a database file (to 8 exabytes) • transaction – write ahead logging and shadow paging – ACID properties • various APIs – on-memory list/hash/tree – file hash/B+tree/array/table • script language bindings – Perl, Ruby, Java, Lua, Python, PHP, Haskell, Erlang, etc...
  • 6. TCHDB: Hash Database • static hashing bucket array – O(1) time complexity • separate chaining key value – binary search tree key value – balances by the second hash key value • free block pool key value key value – best fit allocation – dynamic defragmentation key value • combines mmap and key key value value pwrite/pread key value – saves calling system calls key value • compression key value – deflate(gzip)/bzip2/custom
  • 7. TCBDB: B+ Tree Database • B+ tree key value – O(log N) time complexity B tree index key value • page caching key key value value – LRU removing key value – speculative search key value • stands on hash DB key value – records pages in hash DB key value – succeeds time and space key value efficiency key value • custom comparison key value function key value – prefix/range matching key value key value • cursor key value – jump/next/prev key value
  • 8. TCFDB: Fixed-length Database • array of fixed- length elements array – O(1) time complexity value value value value – natural number keys value value value value – addresses records by value value value value multiple of key value value value value • most effective value value value value value value value value – bulk load by mmap value value value value – no key storage per value value value value record value value value value – extremely fast and value value value value concurrent
  • 9. TCTDB: Table Database • column based – the primary key and named columns – stands on hash DB bucket array • flexible structure primary key name value name value name value – no data scheme, no data type – various structure for each record primary key name value name value name value • query mechanism primary key name value name value name value – various operators matching column values primary key name value name value name value – lexical/decimal orders by column values primary key name value name value name value • column indexes – implemented with B+ tree primary key name value name value name value – typed as string/number – inverted index of token/q-gram column index – query optimizer
  • 10. On-memory Structures • TCXSTR: extensible string – concatenation, formatted allocation • TCLIST: array list (dequeue) – random access by index – push/pop, unshift/shift, insert/remove • TCMAP: map of hash table – insert/remove/search – iterator by order of insertion • TCTREE: map of ordered tree – insert/remove/search – iterator by order of comparison function
  • 11. Other Mechanisms • abstract database – common interface of 6 schema • on-memory hash, on-memory tree • file hash, file B+tree, file array, file table – decides the concrete scheme in runtime • remote database – network interface of the abstract database – yes, it's Tokyo Tyrant! • miscellaneous utilities – string processing, filesystem operation – memory pool, encoding/decoding
  • 12. Example Code #include <tcutil.h> #include <tchdb.h> #include <stdlib.h> #include <stdbool.h> #include <stdint.h> int main(int argc, char **argv){ TCHDB *hdb; int ecode; char *key, *value; /* create the object */ hdb = tchdbnew(); /* open the database */ if(!tchdbopen(hdb, "casket.hdb", HDBOWRITER | HDBOCREAT)){ ecode = tchdbecode(hdb); /* traverse records */ fprintf(stderr, "open error: %s¥n", tchdberrmsg(ecode)); tchdbiterinit(hdb); } while((key = tchdbiternext2(hdb)) != NULL){ value = tchdbget2(hdb, key); /* store records */ if(value){ if(!tchdbput2(hdb, "foo", "hop") || printf("%s:%s¥n", key, value); !tchdbput2(hdb, "bar", "step") || free(value); !tchdbput2(hdb, "baz", "jump")){ } ecode = tchdbecode(hdb); free(key); fprintf(stderr, "put error: %s¥n", tchdberrmsg(ecode)); } } /* close the database */ /* retrieve records */ if(!tchdbclose(hdb)){ value = tchdbget2(hdb, "foo"); ecode = tchdbecode(hdb); if(value){ fprintf(stderr, "close error: %s¥n", tchdberrmsg(ecode)); printf("%s¥n", value); } free(value); } else { /* delete the object */ ecode = tchdbecode(hdb); tchdbdel(hdb); fprintf(stderr, "get error: %s¥n", tchdberrmsg(ecode)); } return 0; }
  • 14. Features • network server of Tokyo Cabinet – client/server model – multi applications can access one database – effective binary protocol • compatible protocols – supports memcached protocol and HTTP – available from most popular languages • high concurrency/performance – resolves "c10k" with epoll/kqueue/eventports – 17.2 sec/1M queries (58,000 qps)
  • 15. • high availability – hot backup and update log – asynchronous replication between servers • various database schema – using the abstract database API of Tokyo Cabinet • effective operations – no-reply updating, multi-record retrieval – atomic increment • Lua extension – defines arbitrary database operations – atomic operation by record locking • pure script language interfaces – Perl, Ruby, Java, Python, PHP, Erlang, etc...
  • 16. Asynchronous Replication master and slaves dual master (load balancing) (fail over) write query master server client client database query if the master is dead query read query update log active master standby master with load balancing database database replicate the difference slave server slave server update log update log replicate database database the difference update log update log
  • 17. Thread Pool Model epoll/kqueue listen accept the client connection if the event is about the listener queue first of all, the listening socket is enqueued into the epoll queue accept epoll_ctl(add) queue back if keep-alive epoll_wait task manager epoll_ctl(del) queue enqueue move the readable client socket from the epoll queue to the task queue deque worker thread worker thread do each task worker thread
  • 18. Lua Extention • defines DB operations as Lua functions – clients send the function name and record data – the server returns the return value of the function • options about atomicity – no locking / record locking / global locking front end request back end - function name - key data Tokyo Tyrant - value data Clients Tokyo Cabinet Lua processor response - result data script database user-defined operations
  • 19. case: Timestamp DB at mixi.jp • 20 million records mod_perl – each record size is 20 bytes home.pl update • more than 10,000 show_friend.pl TT (active) updates per sec. view_diary.pl database – keeps 10,000 connections search.pl replication • dual master other pages replication TT (standby) – each server is only one database fetch • memcached list_friend.pl compatible protocol list_bookmark.pl – reuses existing Perl clients
  • 20. case: Cache for Big Storages • works as proxy clients 1. inserts to the storage – mediates insert/search 2. inserts to the cache – write through, read through • Lua extension Tokyo Tyrant MySQL/hBase – atomic operation by record atomic insert database locking Lua – uses LuaSocket to access the storage database atomic search • proper DB scheme Lua – TCMDB: for generic cache database – TCNDB: for biased access – TCHDB: for large records cache database such as image 1. retrieves from the cache – TCFDB: for small records if found, return 2. retrieves from the storage such as timestamp 3. inserts to the cache
  • 21. Example Code #include <tcrdb.h> #include <stdlib.h> #include <stdbool.h> #include <stdint.h> int main(int argc, char **argv){ TCRDB *rdb; int ecode; char *value; /* create the object */ rdb = tcrdbnew(); /* connect to the server */ if(!tcrdbopen(rdb, "localhost", 1978)){ ecode = tcrdbecode(rdb); fprintf(stderr, "open error: %s¥n", tcrdberrmsg(ecode)); } /* store records */ if(!tcrdbput2(rdb, "foo", "hop") || !tcrdbput2(rdb, "bar", "step") || !tcrdbput2(rdb, "baz", "jump")){ ecode = tcrdbecode(rdb); fprintf(stderr, "put error: %s¥n", tcrdberrmsg(ecode)); } /* close the connection */ if(!tcrdbclose(rdb)){ /* retrieve records */ ecode = tcrdbecode(rdb); value = tcrdbget2(rdb, "foo"); fprintf(stderr, "close error: %s¥n", tcrdberrmsg(ecode)); if(value){ } printf("%s¥n", value); free(value); /* delete the object */ } else { tcrdbdel(rdb); ecode = tcrdbecode(rdb); fprintf(stderr, "get error: %s¥n", tcrdberrmsg(ecode)); return 0; } }
  • 22. Tokyo Dystopia - full-text search engine -
  • 23. Features • full-text search engine – manages databases of Tokyo Cabinet as an inverted index • combines two tokenizers – character N-gram (bi-gram) method • perfect recall ratio – simple word by outer language processor • high accuracy and high performance • high performance/scalability – handles more than 10 million documents – searches in milliseconds
  • 24. • optimized to professional use – layered architecture of APIs – no embedded scoring system • to combine outer scoring system – no text filter, no crawler, no language processor • convenient utilities – multilingualism with Unicode – set operations – phrase matching, prefix matching, suffix matching, and token matching – command line utilities
  • 25. Inverted Index • stands on key/value database – key = token • N-gram or simple word – value = occurrence data (posting list) • list of pairs of document number and offset in the document • uses B+ tree database – reduces write operations into the disk device – enables common prefix search for tokens – delta encoding and deflate compression ID:21 text: "abracadabra" a - 21:10 ca - 21:1, 21:8 ab - 21:0,21:7 da - 21:4 ac - 21:3 ra - 21:2, 21:9 br - 21:5
  • 26. Layered Architecture • character N-gram index – "q-gram index" (only index), and "indexed database" – uses embedded tokenizer • word index – "word index" (only index), and "tagged index" – uses outer tokenizer Applications Character N-gram Index Tagging Index indexed database tagged database q-gram index word index Tokyo Cabinet
  • 27. case: friend search at mixi.jp • 20 million records – each record size is 1K bytes query user interface – name and self introduction merger • more than 100 qps TT's social query • attribute narrowing cache graph – gender, address, birthday searcher – multiple sort orders copy the social graph inverted attribute • distributed processing index DB – more than 10 servers indexer – indexer, searchers, merger copy the index and the DB inverted attribute • ranking by social index DB graph dump profile data – the merger scores the result by following the friend links profile DB
  • 28. Example Code #include <dystopia.h> #include <stdlib.h> #include <stdbool.h> #include <stdint.h> int main(int argc, char **argv){ TCIDB *idb; int ecode, rnum, i; uint64_t *result; char *text; /* search records */ /* create the object */ result = tcidbsearch2(idb, "john || thomas", &rnum); idb = tcidbnew(); if(result){ for(i = 0; i < rnum; i++){ /* open the database */ text = tcidbget(idb, result[i]); if(!tcidbopen(idb, "casket", IDBOWRITER | IDBOCREAT)){ if(text){ ecode = tcidbecode(idb); printf("%d¥t%s¥n", (int)result[i], text); fprintf(stderr, "open error: %s¥n", tcidberrmsg(ecode)); free(text); } } } /* store records */ free(result); if(!tcidbput(idb, 1, "George Washington") || } else { !tcidbput(idb, 2, "John Adams") || ecode = tcidbecode(idb); !tcidbput(idb, 3, "Thomas Jefferson")){ fprintf(stderr, "search error: %s¥n", tcidberrmsg(ecode)); ecode = tcidbecode(idb); } fprintf(stderr, "put error: %s¥n", tcidberrmsg(ecode)); } /* close the database */ if(!tcidbclose(idb)){ ecode = tcidbecode(idb); fprintf(stderr, "close error: %s¥n", tcidberrmsg(ecode)); } /* delete the object */ tcidbdel(idb); return 0; }
  • 29. Tokyo Promenade - content management system -
  • 30. Features • content management system – manages Web contents easily with a browser – available as BBS, Blog, and Wiki • simple and logical interface – aims at conciseness like LaTeX – optimized for text browsers such as w3m and Lynx – complying with XHTML 1.0 and considering WCAG 1.0 • high performance/throughput – implemented in pure C – uses Tokyo Cabinet and supports FastCGI – 0.836ms/view (more than 1,000 qps)
  • 31. • sufficient functionality – simple Wiki formatting – file uploader and manager – user authentication by the login form – guest comment authorization by a riddle – supports the sidebar navigation – full-text/attribute search, calendar view – Atom feed • flexible customizability – thorough separation of logic and presentation – template file to generate the output – server side scripting by the Lua extension – post processing by outer commands
  • 32. Example Code #! Introduction to Tokyo Cabinet #c 2009-11-05T18:58:39+09:00 #m 2009-11-05T18:58:39+09:00 #o mikio #t database,programming,tokyocabinet This article describes what is [[Tokyo Cabinet|http://1978th.net/tokyocabinet/]] and how to use it. @ upfile:1257415094-logo-ja.png * Features - modern implementation of DBM -- key/value database -- e.g.) DBM, NDBM, GDBM, TDB, CDB, Berkeley DB - simple library = process embedded - Successor of QDBM -- C99 and POSIX compatible, using Pthread, mmap, etc... -- Win32 porting is work-in-progress - high performance - insert: 0.4 sec/1M records (2,500,000 qps) - search: 0.33 sec/1M records (3,000,000 qps)
  • 33. innovating more and yet more... http://1978th.net/