CCIndex for Cassandra: A Novel Scheme for Multi-dimensional Range Queries in Cassandra



J Gabriel Lima -

Published in: Technology

2011 Seventh International Conference on Semantics, Knowledge and Grids
978-0-7695-4515-8/11 $26.00 © 2011 IEEE, DOI 10.1109/SKG.2011.28

CCIndex for Cassandra: A Novel Scheme for Multi-dimensional Range Queries in Cassandra

Chen Feng#1, Yongqiang Zou*2, Zhiwei Xu#3
# Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
* Tencent Corporation, Beijing, 100190, China

Abstract: Multi-dimensional range queries are fundamental requirements in large-scale Internet applications using Distributed Ordered Tables (DOTs). Apache Cassandra is a Distributed Ordered Table when it employs order-preserving hashing as its data partitioner. Cassandra supports multi-dimensional range queries, but with poor performance and with the limitation that one dimension must use an equal operator. Based on the success of the CCIndex scheme in Apache HBase, this paper tries to answer the question: can CCIndex benefit multi-dimensional range queries in DOTs like Cassandra? This paper studies the feasibility of employing CCIndex in Cassandra, proposes a new approach to estimate result size, implements CCIndex in Cassandra including recovery mechanisms, and studies the pros and cons of CCIndex for different DOTs. Experimental results show that CCIndex gains 2.4 to 3.7 times efficiency over Cassandra's index scheme with 1% to 50% selectivity for 2 million records. This paper shows that CCIndex is a general approach for DOTs, and could gain better performance for DOTs which perform scan tasks much faster than random reads. This paper also reveals that Cassandra is optimized for hash tables rather than ordered tables in performing read and range queries.

I. INTRODUCTION

Multi-dimensional range queries are fundamental requirements in large-scale Internet applications and have gained more and more attention in Distributed Ordered Tables (DOTs) [1,2] like BigTable [3], PNUTS [4], and HBase [5] in recent years.

Multi-dimensional range queries are queries with less-than or greater-than operators on multiple table columns. For example, a query for yesterday's hot photos written in SQL looks like "select * from photos where hit_counts > 100000 and create_time > now() - 86400". When resources in physical or cyber space are modeled as a multi-dimensional classification space, as in the Probabilistic Resource Space Model (P-RSM) [6], multi-dimensional range queries are basic operations.

As the data scale grows, Distributed Ordered Tables are adopted in more and more applications to store and query structured data because of their outstanding performance, reliability, and scalability. Naturally, Distributed Ordered Tables can support point queries and range queries on the primary key. However, their limited support for queries on non-primary keys leads to poor performance in multi-dimensional range queries involving non-primary keys.

CCIndex [7], short for Complemental Clustering Index, is proposed to support multi-dimensional range queries over DOTs with high performance, low space overhead, and high reliability. CCIndex has been implemented on HBase and gains 11.4 times scan efficiency over non-primary columns.

Apache Cassandra [8] is a highly scalable distributed database with a fully distributed design like Dynamo [9] and the column family data model of BigTable. Cassandra is a Distributed Ordered Table rather than a Distributed Hash Table (DHT) when it employs order-preserving hashing as its data partitioner instead of consistent hashing. Cassandra partitions primary keys onto nodes in a circle overlay as in Chord [10], replicates keys for performance and reliability, and supports range queries on keys. Cassandra has supported multi-dimensional range queries since version 0.7, with the limitation that one dimension in the query expression must use an equal operator, which hinders the broad usage of these queries. Multi-dimensional range queries over Cassandra's index scheme also suffer from the efficiency problem seen in the HBase IndexedTable: when responding to a query, the system must first scan the secondary index to get primary keys, and then issue multiple random reads to get the real data.

Can CCIndex benefit DOTs like Cassandra by supporting multi-dimensional range queries without such limitations and with better performance?

This paper studies the feasibility of employing CCIndex to support multi-dimensional range queries in Cassandra. We identify three differences between HBase and Cassandra when utilizing CCIndex: (1) the smallest sorted unit is the region in HBase while it is the node in Cassandra; (2) the speed of range queries in Cassandra is not fast enough to accelerate CCIndex performance; (3) the APIs of HBase and Cassandra are different. This paper proposes a new approach to estimate result size from data distribution information, implements CCIndex in Cassandra, studies the pros and cons of CCIndex for different styles of DOTs, and reveals more performance issues of Cassandra.

The contributions of this paper are summarized as follows.
1. This paper employs CCIndex to support multi-dimensional range queries, overcoming the limitations of Cassandra. The results show that CCIndex gains 2.4 times the performance of Cassandra's index scheme at 1% selectivity, and about 3.7 times when the selectivity is 50%, for 2 million records.

2. This paper shows that CCIndex is a general approach for DOTs, which gains better performance for DOTs with slow random read and fast sequential read. CCIndex improves query performance by about 2 times on DOTs with fast random read, and achieves an order of magnitude improvement for DOTs whose random read is significantly slower than sequential read or scan, such as HBase. This paper implements the CCIndex recovery mechanism and shows that the efficiency of CCIndex recovery is 33% of that of sequential write for Cassandra.

3. This paper reveals that Cassandra is optimized for hash tables rather than ordered tables. Cassandra provides both consistent hashing and order-preserving hashing, but the read and scan operations are not optimized for order-preserving hashing, for example by pre-fetching for reads or by optimizing scans for range queries over ordered tables. Cassandra's strategy is good for hash tables, but inefficient for ordered tables.

This paper is organized as follows. Section 2 gives the background. Section 3 illustrates the design and implementation of CCIndex in Cassandra. Section 4 shows the experimental results and discusses them. Section 5 concludes the whole work.

II. BACKGROUND

A. CCIndex Analysis

CCIndex is proposed to support multi-dimensional range queries over DOTs by reorganizing data. CCIndex introduces a ComplementalTable for each index column. A ComplementalTable stores all columns except the rowkey and the corresponding index column. The ComplementalTable rowkey is a concatenation of the index column value, the original rowkey, and the length of the index column value. This way of generating the rowkey ensures that all ComplementalTable rowkeys are unique and sorted by index column and original rowkey. The OriginalTable and the ComplementalTables are called Complemental Clustering Index Tables (CCITs). A CCIT sets the replica factor to 1 to decrease the storage overhead. CCIndex maintains the reliability of a CCIT through the other CCITs, and introduces a replicated CCT (Complemental Check Table) for each CCIT to help data recovery.

In Fig. 1, there is an OriginalTable (CCIT0) with a primary id and two index columns, weight and height. CCIT-W and CCIT-H (the ComplementalTables) are ordered by key1 and key2 respectively. With these CCITs, range queries over id, weight, or height can be converted to range queries on CCIT0, CCIT-W, or CCIT-H.

A CCT stores the rowkey and all index columns of a CCIT. CCTs are replicated while CCITs are not.

CCIndex creates all ComplementalTables and CCTs when the OriginalTable is created. CCIndex maintains the index through the insert and delete procedures.

Fig. 1 Data layout of CCIndex.

The procedure of writing is shown in Fig. 2. When writing a record into the OriginalTable, CCIndex reads the OriginalTable by rowkey to get the old values, checks whether the index values are going to be modified, and, if so, deletes the records from the corresponding CCITs and CCTs. After that, CCIndex writes the record to all CCITs and CCTs. When deleting a record, CCIndex reads all index values from the OriginalTable and deletes the records from all CCITs and CCTs.

Fig. 2 The procedure of writing.

The procedure of multi-dimensional range queries is shown in Fig. 3. CCIndex estimates the result size for each query condition and selects the condition with the smallest result size to execute a range query on the corresponding CCIT. CCIndex then uses the other conditions to filter the rows returned by the range query and returns the final results of the multi-dimensional range query.
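The query procedure just described (estimate, pick the most selective condition, scan one CCIT, filter with the rest) can be sketched in a few lines. The following Python sketch is illustrative only and is not the paper's implementation: the in-memory lists standing in for CCITs, the helper names `build_ccit`, `slice_bounds`, and `range_query`, the sample records, and the use of an exact row count as the "estimated result size" are all assumptions made for this example. The real systems estimate size from region or SSTable information instead.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sample data in the spirit of Fig. 1 (id, weight, height).
records = [
    {"id": 1, "weight": 60, "height": 170},
    {"id": 2, "weight": 75, "height": 180},
    {"id": 3, "weight": 82, "height": 165},
    {"id": 4, "weight": 55, "height": 158},
]


def build_ccit(rows, column):
    """Return the rows sorted by (column value, id), mimicking one CCIT."""
    return sorted(rows, key=lambda r: (r[column], r["id"]))


# One clustered copy of the data per queryable column (assumption: in memory).
ccits = {col: build_ccit(records, col) for col in ("id", "weight", "height")}


def slice_bounds(table, column, low, high):
    """Slice positions of the rows with low < row[column] < high."""
    keys = [r[column] for r in table]
    return bisect_right(keys, low), bisect_left(keys, high)


def range_query(conditions):
    """conditions maps column -> (low, high); both bounds are exclusive."""
    # Step 1: estimate the result size of each condition.
    # Assumption: an exact count stands in for the paper's estimate.
    sizes = {}
    for col, (low, high) in conditions.items():
        start, end = slice_bounds(ccits[col], col, low, high)
        sizes[col] = max(0, end - start)
    # Step 2: range-scan only the CCIT of the most selective condition.
    best = min(sizes, key=sizes.get)
    start, end = slice_bounds(ccits[best], best, *conditions[best])
    candidates = ccits[best][start:end]
    # Step 3: filter the scanned rows with all conditions.
    return [
        r for r in candidates
        if all(lo < r[c] < hi for c, (lo, hi) in conditions.items())
    ]


if __name__ == "__main__":
    hits = range_query({"weight": (58, 100), "height": (160, 175)})
    print([r["id"] for r in hits])
```

In this sketch the height condition matches fewer rows than the weight condition, so only the height-ordered table is scanned and the weight condition is applied as a filter. No random reads by rowkey are needed, because each CCIT carries the full record.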
Fig. 3 The procedure of multi-dimensional range queries.

CCIndex for HBase uses a simple way to estimate the result size. In HBase, the HMaster stores region-to-server mapping information as in Fig. 4. The mapping information can be described as a set of <startKey-regionServer> pairs, ordered by startKey. CCIndex finds the regions covered by each range query and estimates the result size by the number of regions. When HBase has more than one region and a maximum region size $S_{max}$, each region size must be greater than $S_{max}/2$ and less than $S_{max}$. Thus CCIndex considers that the result size depends on the number of regions covered.

Fig. 4 The region-to-server mapping of HBase.

In HBase, the speed of scan is 8.2 times that of random read. The speed of multi-dimensional range queries on CCIndex is 11.4 times that of IndexedTable.

The performance of CCIndex is affected by two issues:

  • The accuracy of result size estimation. The more accurate the estimation is, the fewer unnecessary records will be scanned.
  • The speed ratio of range query to random read. To execute a multi-dimensional range query, CCIndex executes a range query on a CCIT and then filters the result. IndexedTable executes a range query on an index table to get the original rowkeys, and then gets the records by random reads on those rowkeys. Thus the speed ratio of CCIndex to IndexedTable is determined by the speed ratio of range query to random read.

B. Cassandra Analysis

Cassandra organizes nodes as a ring overlay like Chord to partition data. Each node manages a part of the data in the ring, namely the keys whose tokens fall between the previous node's token and its own token. Each record's key is mapped onto the token ring by the same partitioner. The corresponding node writes the record to its commitlog and then to its memtable. A memtable is an in-memory structure that contains sorted rows. A memtable is flushed to an SSTable on disk when it is full. SSTables are sorted structures that are flushed one by one and cannot be modified once flushed, so records across multiple SSTables are not sorted, as in Fig. 5. Cassandra combines several old SSTables into a new SSTable by compaction to reduce the number of SSTables. Each node contains more than one SSTable in most cases.

Fig. 5 An example of memtable and SSTables in a node.

Like Dynamo, Cassandra keeps strong consistency if W + R > N, where W and R are respectively the minimum numbers of nodes that must have executed the write and read operations successfully, and N is the replication factor. Cassandra uses different ConsistencyLevels to balance consistency and availability. For writes, ConsistencyLevel.ONE and QUORUM ensure that the write operation has been executed successfully on at least 1 and N / 2 + 1 node(s) respectively. For reads, ONE returns the record from the fastest responding node, and QUORUM returns the most recent record among those returned by at least N / 2 + 1 nodes. Compared with ONE, QUORUM has higher latency but maintains consistency.

Cassandra version 0.7+ provides APIs to execute multi-dimensional range queries, with the limitation that the APIs require at least one equal operator on a configured index column in the query expression. Cassandra also provides APIs to execute range queries over the rowkey, but the speed of range query is only 1.3 times that of random read.

In summary, there are three mismatches between HBase and Cassandra, which pose challenges when applying CCIndex to Cassandra.
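The consistency rule quoted in the Cassandra analysis above can be checked with simple arithmetic. The Python sketch below is only an illustration of the W + R > N condition and the N / 2 + 1 quorum size stated in the text. The function names are hypothetical, and reading "N / 2" as integer division is an assumption of this example.

```python
def quorum(n: int) -> int:
    """Replicas required by ConsistencyLevel QUORUM: N / 2 + 1.

    Assumption: N / 2 is taken as integer division.
    """
    return n // 2 + 1


def is_strongly_consistent(w: int, r: int, n: int) -> bool:
    """Dynamo-style condition from the text: W + R > N."""
    return w + r > n


if __name__ == "__main__":
    n = 3  # replication factor
    for name, level in (("ONE", 1), ("QUORUM", quorum(n))):
        # Use the same level for both writes and reads.
        print(name, is_strongly_consistent(level, level, n))
```

With a replication factor of 3, writing and reading at ONE gives W + R = 2, which does not exceed N, while QUORUM on both sides gives W + R = 4, which does. This is the trade-off between latency and consistency that the two levels represent.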
1) The smallest sorted unit is the region in HBase while it is the node in Cassandra: In HBase, regions are sorted by the rowkey of records. In Cassandra, records are stored in SSTables and sorted between nodes, while multiple SSTables in the same node are not sorted relative to each other. This difference decreases the accuracy of result size estimation.

2) The speed of range query: Cassandra executes a range query by logical scan, traversing all SSTables to find the "next" record, while HBase executes physical scans on regions.

3) The differences between HBase and Cassandra on APIs: To implement CCIndex for Cassandra, the API issue must be considered, namely how to use the different APIs provided by HBase and Cassandra while unifying the APIs that CCIndex provides to the application level.

III. DESIGN AND IMPLEMENTATION

CCIndex for Cassandra uses different methods to deal with each of these differences.

A. The smallest sorted unit issue

As the record size between nodes might be unbalanced, the way CCIndex for HBase estimates result size, by the number of covered regions, cannot work on Cassandra. This paper uses a different way to estimate result size, which relies on the data distribution information of Cassandra.

1) Data distribution information gathering: CCIndex for Cassandra first adds an API in CassandraClient to gather the SSTable information of a certain node, and then adds a daemon thread, Listener, in CassandraDaemon. Listener gets token ring information from StorageService every other minute. With the token-IP mapping, Listener uses the API above to get SSTable information from every node. Thus each node saves the data distribution information of all nodes. The Cassandra kernel code is modified without performance degradation.

2) The estimation of result size: The CCIndex client uses a thread, Refiner, to get data distribution information and token ring information from Listener. CCIndex then estimates the result size for every query condition:

  • Calculate the nodes covered by the range, and count the number of nodes as N3.
  • For every node covered, read the total size S of the SSTable data files and the number of files C.
  • Sum S and C over all covered nodes to get N1 and N2.

Each search condition thus has a tuple [N1, N2, N3]. N1 has higher priority than N2, and N2 has higher priority than N3. CCIndex for Cassandra executes the range query on the CCIT whose condition has the smallest tuple.

B. The speed of range query

The speed of range query is determined by the Cassandra system. The aim of CCIndex for Cassandra is to implement CCIndex while making as few changes as possible. The low speed of range query affects the speed of multi-dimensional range queries but does not restrict the implementation.

C. The API issue

CCIndex encapsulates the APIs of HBase and Cassandra, and exposes the same CCIndex APIs to applications.

D. Data recovery

CCIndex introduces replicated CCTs to help recover damaged data. This paper implements the data recovery module with CCT in Cassandra.

To recover a record of the OriginalTable, CCIndex first reads the CCTs by rowkey to get all index columns. Then CCIndex concatenates the original rowkey and the index column value to form the rowkey of a certain ComplementalTable. CCIndex tries to read the record by the concatenated rowkey and writes the corresponding record into the OriginalTable. If the recovery fails, CCIndex tries to recover the data from another ComplementalTable.

To recover a record of a ComplementalTable, CCIndex gets the rowkey of the OriginalTable by splitting the given rowkey. Then CCIndex tries to read the record from the OriginalTable. If the read fails, CCIndex uses the other index column values obtained from the CCT to recover the data from the other ComplementalTables.

To recover a certain range of a table, CCIndex scans the corresponding CCT and uses the methods above to recover records one by one. A range can be split into several parts for multi-threaded recovery to increase efficiency.

E. Implementation

The CCIndex for Cassandra prototype uses Cassandra v0.7.2 as its code base and is written in Java.

As the replica factor in Cassandra is associated with the keyspace, it is easy for CCIndex for Cassandra to replicate CCTs by putting them into a separate keyspace with replica factor 3. CCIndex sets the keyspace replica factor to 1 for CCITs, and creates one ComplementalTable for each index column.

Fig. 6 The architecture of CCIndex for Cassandra.

The CCIndex for Cassandra client connects to a server node to perform operations like insert, read, and range query. As Fig. 6 shows, CCIndex for Cassandra uses a connection pool extended from Pelops [11]. The connection pool assigns a random connection to each client to avoid hot spots.

The client gets the token ring and data distribution information by sending a query to a certain node, in order to estimate the query result size.

IV. EVALUATION

CCIndex for Cassandra is implemented and evaluated through analysis and experiments.

A. Space Overhead Analysis

For the given metrics, performance is easy to evaluate through experiments. For the space overhead, theoretical analysis is more suitable.

Here we denote the number of index columns by $N$, the replica factor of Original Cassandra and of the CCT by $R$, the average length of the key and all index columns by $L_s$, and the total length of a record by $L$.

In Original Cassandra, the space for every record is:

$$S_{ORG} = L \cdot R \qquad (1)$$

In CCIndex, the space for each record is that of the CCITs plus that of the CCTs. The space for the CCITs is:

$$S_{CCIT} = L \cdot (N + 1) \qquad (2)$$

The space for the CCTs is:

$$S_{CCT} = L_s \cdot (N + 1) \cdot R \qquad (3)$$

The total space for CCIndex is:

$$S_{CC} = S_{CCIT} + S_{CCT} = (N + 1)(L + L_s \cdot R) \qquad (4)$$

The space overhead ratio of CCIndex to Original Cassandra is:

$$S_{CC} / S_{ORG} - 1 = (N + 1)/R + (N + 1) \cdot L_s / L - 1 \qquad (5)$$

In Cassandra, the replica number $R$ is often set to 3. The ratio is then:

$$(N + 1)/3 + (N + 1) \cdot L_s / L - 1 \qquad (6)$$

Equation (6) can be plotted as Fig. 7.

Fig. 7 The space overhead ratio of CCIndex to Original Cassandra, using L/Ls values as the horizontal axis.

From Fig. 7, the overhead ratio drops significantly as Ls/L decreases and as N decreases, which indicates that to avoid a huge space overhead, CCIndex should have fewer index columns and the index columns should be shorter. When N is smaller than 2, CCIndex would not have enough replicas for a CCIT. When N changes from 2 to 4 and Ls/L changes from 1/30 to 1/10, the overhead ratio changes from 10% to 116.7%.

B. Experiment Setup

This paper introduces a benchmark to evaluate the throughput of basic operations, including sequential read/write, random read, and range query. The workload uses a table with the columns rowkey, index1, index2, index3, and data. The rowkey, index1, index2, and index3 are each 10 bytes long, while the data column is 1 KB. The throughput is defined as rows per second over all clients.

CCIndex builds indexes for index1, index2, and index3. The ConsistencyLevel is ONE for CCITs and QUORUM for CCTs.

Original Cassandra and Cassandra Indexed set the replica factor to 3 and the ConsistencyLevel to QUORUM. Original Cassandra does not build an index. Cassandra Indexed builds indexes for index1, index2, and index3.

The experimental cluster has 5 nodes. Each node has two 1.8 GHz dual-core AMD Opteron(tm) Processor 270 CPUs and 4 GB of memory. Each node in the cluster has 321 GB of RAID5 SCSI disks. All nodes are connected by Gigabit Ethernet. Each node uses Red Hat CentOS release 5.3 (kernel 2.6.18), the ext3 file system, and Sun JDK 1.6.0_14. The test runs on a separate client machine, which has two 2.0 GHz Intel(R) Core(TM) Duo T5750 processors, 3 GB of memory, and Broadcom NetLink(TM) 100 Mbps Fast Ethernet. The client uses Ubuntu 10.04 LTS, the ext3 file system, and Sun JDK 1.6.0_14.

The workload in the experiments has 2 million rows; the token of each node is initialized manually to keep the load balanced. Each test runs three times and the average value is reported. The client uses 25 concurrent threads for sequential write, sequential read, random read, and range query, and uses 1 thread for multi-dimensional range queries.

C. Experiment Result

The result in Fig. 8 shows that the ConsistencyLevel has a great effect on every test, which is confirmed by the large differences between the throughput of Cassandra(1) and Cassandra(3), and between Cassandra Indexed(1) and Cassandra Indexed(3).

The throughput of sequential write for CCIndex is significantly lower than that of Cassandra Indexed and much lower than that of Original Cassandra, because maintaining the index needs an extra random read to get the row data from the OriginalTable, and if there are old index column values, further delete operations are needed to update the index.

The performance of Original Cassandra(3) and Cassandra Indexed(3) on range query, random read, and sequential read is nearly identical because they share the same implementation. Their throughput is lower than that of CCIndex because of the ConsistencyLevel, which is confirmed by the fact that Original Cassandra(1) and Cassandra Indexed(1) have nearly the same throughput as CCIndex.
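As a numerical check of the space overhead analysis in Section IV.A, Equation (5) can be evaluated directly. The short Python sketch below does this for the two end points quoted in that section (N = 2 with Ls/L = 1/30, and N = 4 with Ls/L = 1/10, both with R = 3). The function name `overhead_ratio` and its parameter names are chosen for this example only.

```python
def overhead_ratio(n: int, r: int, ls_over_l: float) -> float:
    """Equation (5): S_CC / S_ORG - 1 = (N + 1)/R + (N + 1) * Ls/L - 1.

    n          number of index columns (N)
    r          replica factor of Original Cassandra and of the CCT (R)
    ls_over_l  ratio of key-plus-index length to total record length (Ls/L)
    """
    return (n + 1) / r + (n + 1) * ls_over_l - 1


if __name__ == "__main__":
    # The two end points quoted in Section IV.A, with R = 3 as in Equation (6).
    for n, ratio in ((2, 1 / 30), (4, 1 / 10)):
        print(f"N={n}, Ls/L={ratio:.3f}: {overhead_ratio(n, 3, ratio):.1%}")
```

Evaluating the two cases gives roughly 10% and 116.7%, matching the range stated in the analysis. The first term, (N + 1)/R, is the cost of keeping N + 1 unreplicated full copies instead of R replicated ones; the second term is the cost of the replicated CCTs.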
Fig. 8 Basic operations for Original Cassandra, Cassandra Indexed, and CCIndex. Cassandra(1) is Cassandra with 1 replica and ConsistencyLevel ONE. Cassandra(3) is Cassandra with 3 replicas and ConsistencyLevel QUORUM. Cassandra Indexed builds indexes for the index columns.

In this experiment, N is 3 (index1, index2, and index3) and Ls/L is about 1/30, so by Equation (6) CCIndex uses about 46% more space than Original Cassandra(3) in theory. The result shows that Original Cassandra(3) uses 1.39 GB per node while CCIndex uses 2.12 GB per node, a space overhead of 52.6%. Because some memtables are still in memory and not yet flushed, we consider that the measured storage overhead confirms the theoretical analysis.

The multi-dimensional range query tests write records in which index1 and index2 are randomly generated from 0 to 2 million and index3 is randomly generated from 0 to MAXVALUE. In this way, the test can use the expression 0 < index1 < 2000000 and 0 < index2 < 2000000 and index3 = 0 to meet the requirement of the Cassandra API. The MAXVALUE of index3 is varied from 100 to 1 to change the selectivity from 1% to 100%.

The results of the multi-dimensional range query tests under different conditions are shown in Fig. 9. When the selectivity is under 10%, Cassandra Indexed performs well, but when the selectivity rises from 20% to 100%, the latency increases significantly.

Fig. 9 Throughput of multi-dimensional range queries by CCIndex, Cassandra Indexed(1), and Cassandra Indexed(3).

The throughput ratio of CCIndex to Cassandra Indexed(3) is at least 2.4. As the selectivity grows, the throughput of CCIndex increases to 3.7 times that of Cassandra Indexed(3). In the experiment, CCIndex is about 1.8 to 2.7 times as fast as Cassandra Indexed(1).

In another test on Cassandra Indexed, when MAXVALUE is 100 and the query expression is 0 < index1 < 10000, 0 < index2 < 10000 and index3 = 0, an exception occurs in all 10 attempts, while CCIndex performs well. We consider that this happens when many records are discarded by the ranges on the non-equal columns.

The throughput of recovery is 1819 records/s on average, as shown in Fig. 10. To recover one record, CCIndex executes a range query on the CCT, a write on a CCIT, and a random read on a CCIT. The CCT range query speed is 6013 records/s, the write speed on CCIT is 4778 records/s, and the random read speed on CCIT is 4797 records/s. The recovery speed is 1964.7 records/s in theory. Compared with 1819 records/s in practice, the recovery speed matches the theoretical analysis.

Fig. 10 CCIndex recovery speed.

D. Discussion

The results provide many insights into CCIndex and Cassandra.

1) Overall, the results show that CCIndex is a general approach for DOTs, and that it succeeds in improving both performance and query expressiveness.

2) The results show that in Cassandra, sequential read and random read have the same throughput, and the range query throughput is only 1.3 times that of random read. But if a client sets Cassandra's partitioner to OrderedPartitioner, this suggests that the client probably wants to use operations specific to ordered tables, such as sequential read and range query. Cassandra could apply optimizations like prefetching and caching of adjacent records.

3) CCIndex is suitable for tables with 2 to 4 index columns. CCIndex cannot guarantee reliability with fewer than 2 index columns because the CCITs are not replicated. If there are more than 4 index columns, the space overhead is more than 2 times that of Original Cassandra. When a table has more than 4 columns with query requirements, one solution is to build indexes for the 2 to 4 most frequently used columns, and to filter the result by the non-indexed conditions in the application.

4) The throughput of CCIndex is determined by the speed ratio of range query to random read. This explains why the throughput of CCIndex for Cassandra is 2.4 to 3.7 times that of Cassandra Indexed(3), while the throughput of CCIndex for HBase is 11.4 times that of IndexedTable. CCIndex converts random reads on the OriginalTable into a range query on a CCIT, so its performance depends on the speed improvement from random read to range query.

During a multi-dimensional range query, IndexedTable executes a range query and a random read for every record before filtering, while CCIndex only needs to execute one range query.

We denote the speed of range query by $S_s$ and the speed of random read by $S_r$. The speed at which CCIndex gets records is:

$$S_{cc} = S_s \qquad (7)$$

The speed for IndexedTable is:

$$S_i = 1 / (1/S_s + 1/S_r) = S_s \cdot S_r / (S_s + S_r) \qquad (8)$$

The ratio of CCIndex to IndexedTable is:

$$S_{cc} / S_i = (S_s + S_r) / S_r = 1 + S_s / S_r \qquad (9)$$

So the ratio of CCIndex to IndexedTable is decided by the value of $S_s / S_r$. For HBase, $S_s / S_r$ is equal to 8.2 and $S_{cc} / S_i$ is equal to 9.2. As there is no query optimization, IndexedTable filters more records as candidate results. So the final ratio of CCIndex to IndexedTable on multi-dimensional range queries, 11.4, agrees with the analysis.

From Fig. 9, the throughput of CCIndex is 1.9 and 2.4 times that of Cassandra Indexed(1) and Cassandra Indexed(3) respectively. CCIndex performs the same as Cassandra Indexed(1) in random read and scan.

From Fig. 8, $S_s / S_r$ is equal to 1.2 on Cassandra Indexed(1). As CCIndex takes more time to filter the result, the final ratio of 1.9 is close to the predicted value of 2.2.

V. CONCLUSIONS

Cassandra is a Distributed Ordered Table supporting multi-dimensional range queries. However, the current design and implementation of Cassandra have two problems: (1) Cassandra's query expression is limited in that one dimension must use an equal operator; (2) the performance is poor. Building on the success of the CCIndex scheme in Apache HBase, this paper studies the feasibility of employing CCIndex to improve multi-dimensional range queries in DOTs like Cassandra.

There are three mismatches between HBase and Cassandra that pose challenges when applying CCIndex to Cassandra: (1) the smallest sorted unit is the region in HBase while it is the node in Cassandra, so the estimation method used in HBase is not suitable for Cassandra; (2) the speed of range query in Cassandra is not fast enough to accelerate the performance; (3) the APIs of HBase and Cassandra are different.

This paper proposes a new approach to estimate result size and exposes the same CCIndex APIs to applications in order to tackle the first and the third mismatches. The speed of range query is determined by the Cassandra system; Cassandra could apply optimizations like prefetching and caching of adjacent records.

The experimental results show that CCIndex gains 2.4 to 3.7 times the performance of Cassandra's index scheme with 1% to 50% selectivity for 2 million records. This paper shows that CCIndex is a general approach for DOTs, and could gain better performance on multi-dimensional range queries for DOTs with slow random read and fast sequential read. This paper implements the CCIndex recovery mechanism and shows that CCIndex recovery performance is 33% of that of sequential write in Cassandra. This paper reveals that Cassandra is optimized for hash tables rather than ordered tables in read and range queries. Cassandra could apply optimizations like prefetching and caching of adjacent records.

ACKNOWLEDGMENT

This work is supported in part by the Hi-Tech Research and Development (863) Program of China (Grant No. 2006AA01A106), and the major national science and technology special projects (2010ZX03004-003-03).

REFERENCES

[1] Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, Erik Vee, Ramana Yerneni, and Raghu Ramakrishnan, "Efficient bulk insertion into a distributed ordered table," in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, 2008.
[2] Ymir Vigfusson, Adam Silberstein, Brian F. Cooper, and Rodrigo Fonseca, "Adaptively parallelizing distributed range queries," in Proc. VLDB Endow., vol. 2, pp. 682-693, 2009.
[3] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, "Bigtable: a distributed storage system for structured data," in 7th USENIX Symposium on Operating Systems Design and Implementation, 2006.
[4] Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, and Ramana Yerneni, "PNUTS: Yahoo!'s hosted data serving platform," in Proc. VLDB Endow., vol. 1, pp. 1277-1288, 2008.
[5] Apache HBase project. [Online]. Available:
[6] Hai Zhuge, "Probabilistic Resource Space Model for Managing Resources in Cyber-Physical Society," IEEE Transactions on Services Computing, vol. 99, no. PrePrints, 2011.
[7] Yongqiang Zou, Jia Liu, Shicai Wang, Li Zha, and Zhiwei Xu, "CCIndex: a Complemental Clustering Index on Distributed Ordered Tables for Multi-dimensional Range Queries," in 7th IFIP International Conference on Network and Parallel Computing, 2010.
[8] Avinash Lakshman and Prashant Malik, "Cassandra: a decentralized structured storage system," SIGOPS Operating Systems Review, vol. 44, issue 2, pp. 35-40, Apr. 2010.
[9] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels, "Dynamo: Amazon's highly available key-value store," in Proceedings of the 21st ACM SIGOPS Symposium on Operating Systems Principles, 2007.
[10] Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan, "Chord: A scalable peer-to-peer lookup service for internet applications," in Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, 2001.
[11] Pelops project. [Online]. Available.