Cassandra Basics: Indexing

Benjamin Black, b@b3k.us

Relational stores are
SCHEMA ORIENTED

Start from your SCHEMA &
WORK FORWARDS

Column stores are
QUERY ORIENTED

Start from your QUERIES &
WORK BACKWARDS

AT SCALE
Denormalization is
THE NORM

AT SCALE
Everything depends on
THE INDICES

Cassandra is an
INDEX CONSTRUCTION KIT

Two-level Map

key: {
  column: value,
  column: value,
  ...
}

Three-level Map
key: {
  supercolumn: {
    column: value,
    column: value
  },
  supercolumn: {
    ...
  }
}
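The two shapes above can be sketched as nested Ruby hashes. This is only an in-memory illustration of the data model, not Cassandra client code; the variable names are made up for the example, and the sample values are borrowed from the user data shown later in the deck.

```ruby
# Two-level map: row key -> { column name -> value }.
users = {
  "b" => { "name" => "Ben", "city" => "Seattle", "state" => "WA" }
}

# Three-level map: row key -> { super column -> { column name -> value } }.
location_index = {
  "WA" => {
    "Seattle" => { "Ben" => "b" }
  }
}

# A lookup walks one map level per name supplied.
city     = users["b"]["city"]
user_key = location_index["WA"]["Seattle"]["Ben"]

puts city
puts user_key
```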

column sorting defined by
CompareWith/
CompareSubcolumnsWith

TimeUUIDType
UTF8Type
AsciiType
LongType
LexicalUUIDType
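Why the comparator matters: the same column names sort differently depending on how they are compared. A rough sketch in plain Ruby, modelling UTF8Type as string comparison and LongType as integer comparison; the column names are hypothetical, and real comparators operate on bytes inside Cassandra rather than on Ruby objects.

```ruby
# Hypothetical column names as a client might write them.
names = ["9", "10", "100"]

# UTF8Type-like ordering: compare the names as strings.
utf8_order = names.sort

# LongType-like ordering: compare the names as integers.
long_order = names.sort_by { |name| Integer(name) }

puts utf8_order.inspect
puts long_order.inspect
```

Pick the comparator that matches the order your queries need to read columns in.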

row placement determined by
Partitioner

RandomPartitioner
Place based on MD5 of key

OrderPreservingPartitioner
Place based on actual key

Rows are sorted by key on each node
Regardless of partitioner
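A simplified sketch of the two placement rules, using the row keys from the user data in this deck. The token function here just reads the MD5 digest as an integer, which is an assumption made for illustration and not Cassandra's exact token arithmetic.

```ruby
require "digest/md5"

keys = ["b", "jason", "zack", "jen1982", "albert"]

# RandomPartitioner-like token: MD5 of the key, read as a 128-bit integer.
md5_token = ->(key) { Digest::MD5.hexdigest(key).to_i(16) }

# Placement order around the ring under each rule.
random_order = keys.sort_by { |key| md5_token.call(key) }
key_order    = keys.sort

puts random_order.inspect
puts key_order.inspect
```

Hashing spreads keys around the ring with no relation to their natural order. Placing by the key itself keeps neighbouring keys together, which is what makes key range scans meaningful.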

"b": {"name": "Ben", "street": "1234 Oak St.",
      "city": "Seattle", "state": "WA"}
"jason": {"name": "Jason", "street": "456 First Ave.",
          "city": "Bellingham", "state": "WA"}
"zack": {"name": "Zack", "street": "4321 Pine St.",
         "city": "Seattle", "state": "WA"}
"jen1982": {"name": "Jennifer", "street": "1120 Foo Lane",
            "city": "San Francisco", "state": "CA"}
"albert": {"name": "Albert", "street": "2364 South St.",
           "city": "Boston", "state": "MA"}

SELECT name FROM Users
WHERE state='WA'


How is the WHERE clause
formed?

[state]: {
[city1]: {[name1]:[user1], [name2]:[user2], ... },
[city2]: {[name3]:[user3], [name4]:[user4], ... },
...
[cityX]: {[name5]:[user5], [name6]:[user6], ... }
}

"CA": {
  "San Francisco": {"Jennifer": "jen1982"}
}
"MA": {
  "Boston": {"Albert": "albert"}
}
"WA": {
  "Bellingham": {"Jason": "jason"},
  "Seattle": {"Ben": "b", "Zack": "zack"}
}
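One way the index rows could be derived from the Users rows, sketched with plain Ruby hashes. This is not how the deck's author necessarily maintained the index; in a real application each user write would be paired with a write to the index column family, and the names `users` and `index` are chosen only for this example.

```ruby
users = {
  "b"       => { "name" => "Ben",      "city" => "Seattle",       "state" => "WA" },
  "jason"   => { "name" => "Jason",    "city" => "Bellingham",    "state" => "WA" },
  "zack"    => { "name" => "Zack",     "city" => "Seattle",       "state" => "WA" },
  "jen1982" => { "name" => "Jennifer", "city" => "San Francisco", "state" => "CA" },
  "albert"  => { "name" => "Albert",   "city" => "Boston",        "state" => "MA" }
}

# state (row key) -> city (super column) -> name (column) -> user key (value)
index = Hash.new { |rows, state| rows[state] = Hash.new { |scs, city| scs[city] = {} } }

users.each do |user_key, columns|
  index[columns["state"]][columns["city"]][columns["name"]] = user_key
end

puts index["WA"].inspect
```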

Reading the index above:
Row Key: the state ("CA", "MA", "WA")
Super Column: the city within that state
Column: the user's name
Value: the user's row key in Users

Show me
EVERYONE IN WASHINGTON

get(:LocationUserIndexSCF, 'WA')

{
  "Bellingham": {"Jason": "jason"},
  "Seattle": {"Ben": "b", "Zack": "zack"}
}
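Fetching the 'WA' row answers the SELECT in a single read, since every name in Washington is a column somewhere in that row. A small sketch of flattening such a row, assuming the client hands it back as nested hashes; `wa_row` is typed in by hand from the index data in this deck rather than fetched from a cluster.

```ruby
# The super column row stored under 'WA' in the index.
wa_row = {
  "Bellingham" => { "Jason" => "jason" },
  "Seattle"    => { "Ben" => "b", "Zack" => "zack" }
}

# Column names are the user names; column values are the Users row keys.
names     = wa_row.values.flat_map(&:keys)
user_keys = wa_row.values.flat_map(&:values)

puts names.inspect
puts user_keys.inspect
```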

Act Two
Composite Key Indexing

Order Preserving Partitioner
+
Range Queries

[state1]/[city1]: {[name1]:[user1], [name2]:[user2], ... }
...
[stateX]/[cityY]: {[name7]:[user7], [name8]:[user8], ... }

"CA/San Francisco": {"Jennifer": "jen1982"}
"MA/Boston": {"Albert": "albert"}
"WA/Bellingham": {"Jason": "jason"}
"WA/Seattle": {"Ben": "b", "Zack": "zack"}

get_range(:LocationUserIndexCF, {:start => 'WA',
:finish => 'WB'})

{
  "WA/Bellingham": {"Jason": "jason"},
  "WA/Seattle": {"Ben": "b", "Zack": "zack"}
}
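Why 'WA' to 'WB' works: with rows kept in key order, every "WA/..." key sorts after "WA" and before "WB". A sketch of that scan over an in-memory hash; `range_scan` is a hypothetical helper written for this example, not the client's get_range, and treating both bounds as inclusive is an assumption.

```ruby
rows = {
  "WA/Seattle"       => { "Ben" => "b", "Zack" => "zack" },
  "CA/San Francisco" => { "Jennifer" => "jen1982" },
  "WA/Bellingham"    => { "Jason" => "jason" },
  "MA/Boston"        => { "Albert" => "albert" }
}

# Hypothetical helper: return the rows whose keys fall between start and
# finish, visiting keys in sorted order as an order-preserving layout would.
def range_scan(rows, start, finish)
  rows.keys.sort
      .select { |key| key >= start && key <= finish }
      .map { |key| [key, rows[key]] }
      .to_h
end

wa = range_scan(rows, "WA", "WB")
puts wa.keys.inspect
```

The same trick narrows to one city by scanning from "WA/Seattle" to "WA/Seattle".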

Finale
BUILD SOMETHING AWESOME

<Keyspace Name="UserDb">
  <ColumnFamily Name="Users"
                CompareWith="UTF8Type" />
  <ColumnFamily Name="LocationUserIndexSCF"
                CompareWith="UTF8Type"
                CompareSubcolumnsWith="UTF8Type"
                ColumnType="Super" />
  <ColumnFamily Name="LocationUserIndexCF"
                CompareWith="UTF8Type" />
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>
