Cassandra - PHP

Cassandra
Integrating Cassandra into your project

Tuesday, 12 November 2013

Maurits Lawende
• Work at Dutch Open Projects (DOP) since 2007
• Development and technical design for challenging Drupal sites
• Development of SaaS solutions in PHP & NodeJS

ToDoToDay
• Data versus information
• History and usage of Cassandra
• How to use Cassandra
• Developments

Celko, J. (1999). Data and databases


SQL is designed for information
DBMS knows how to use your data


SQL is designed for flexibility
Not even a single line on scalability


SQL
nearly 40 years of experience


SQL
Never designed for scalability


Alexa top 10
• Google
• Facebook
• YouTube
• Yahoo
• Baidu
• Wikipedia
• QQ.com
• LinkedIn
• Live.com
• Twitter

Alexa top 10
• Google (BigTable)
• Facebook (MySQL)
• YouTube (MySQL)
• Yahoo
• Baidu (HyperTable)
• Wikipedia (MySQL)
• QQ.com
• LinkedIn (Voldemort)
• Live.com
• Twitter (MySQL)

Cassandra users
• Facebook (+ Redis & HBase & MySQL)
• Twitter (+ MySQL)
• Reddit (+ Postgres)
• Digg (+ Redis)
• Bit.ly (+ MongoDB)
• Netflix

Cassandra users
• Twitter (+ MySQL)
• Reddit (+ Postgres)
• Digg (+ Redis)
• Bit.ly (+ MongoDB)
• Netflix

Jeff Hammerbacher
left Facebook in 2008

Back to basics
Don’t think SQL


Key/value store
Evolved towards tables


Just data
• No joins
• Limited sorting capabilities
• No aggregation, grouping, subqueries whatsoever

Schemaless
• Fixed column families (not tables), but:
• Dynamic column names

Operations in Cassandra 1.0
• CREATE KEYSPACE name
• USE name
• CREATE COLUMN FAMILY name
• DROP KEYSPACE name
• DROP COLUMN FAMILY name
• SET columnfamily['row']['column'] = 'value';
• GET columnfamily['row']
• LIST columnfamily
• DEL columnfamily['row']
• DEL columnfamily['row']['column']

• post['uuid']['title'] = 'First post!';
• user['mau']['firstname'] = 'Maurits';
• user['mau']['lastname'] = 'Lawende';
post
row key | title
uuid    | First post!

user
row key | firstname | lastname
mau     | Maurits   | Lawende

sorted by rowkey, columnname (all ascending)

• post['uuid']['user'] = 'mau';

How to get a list
of blogs by "mau"?

How to get a list

WHERE user = 'mau'

How to get a list

Bad Request: No indexed columns present in by-columns clause with Equal operator

Sequential scans are rejected

How to get a list

Bad Request: Order by is currently only supported on the clustered columns of the PRIMARY KEY
Bad Request: ORDER BY is only supported when the partition key is restricted by an EQ or an IN.

How to get a list

ORDER BY date DESC
LIMIT 10

Only possible when user and date are in the primary key

Predictable performance
No performance degradation after data growth


• user['mau']['post001'] = 'uuid';

any order and limit

• post['uuid']['user'] = 'uuid';

join
no uuid IN (...) or ORs

• user['mau']['post001:uuid'] = 'First post!';
• user['mau']['post002:uuid'] = 'Second post!';

only one query required to get user profile with latest posts
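
The denormalized row above can be pictured as one sorted array. The following is a minimal sketch in plain PHP, not Cassandra itself: the array stands in for a single `user` row, and the uuid parts of the column names and the helper `latestPosts()` are made up for illustration.

```php
<?php
// Hypothetical in-memory stand-in for one wide row of the `user` column family.
// Column names follow the 'postNNN:uuid' pattern; the uuid parts are made up.
$userRow = array(
    'firstname' => 'Maurits',
    'lastname' => 'Lawende',
    'post001:aaaa' => 'First post!',
    'post002:bbbb' => 'Second post!',
    'post003:cccc' => 'Third post!',
);

/**
 * Return the titles of the latest $limit posts stored in a single row.
 */
function latestPosts(array $row, $limit)
{
    $posts = array();
    foreach ($row as $column => $value) {
        // Only columns with the 'post' prefix hold denormalized posts.
        if (strpos($column, 'post') === 0) {
            $posts[$column] = $value;
        }
    }
    // Columns are kept sorted by name; reverse order puts the newest first.
    krsort($posts);
    return array_values(array_slice($posts, 0, $limit));
}

print_r(latestPosts($userRow, 2));
```

Profile fields and posts live in the same row, so a single read returns both; the sort order comes from the column names rather than from a query-time ORDER BY.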


• Row key: 64 KB
• Cells per row: 2 billion
• Column name: 64 KB
• Column value: 2 GB

Beauty?
• Dirty in the SQL world, but:
• It’s a best practice in Big Data
• Don’t think of it as a relational database
• No strict rules on how to use it, just push it to the limits

Each row is a snapshot of data
meant to satisfy a given query, sort
of like a materialized view.


Storage in a cluster


Cluster structures


Master-slave


Master-master


Sharding


HDFS / GlusterFS


HyperTable


Dynamo


No master or single point of failure
Every node is (nearly) identical


Distribution and replication
[Figure: token ring running from 0 to 2^127]
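
A rough sketch of how a ring like this could assign rows to nodes. It is a simplification, not Cassandra's partitioner: tokens run from 0 to 99 instead of 0 to 2^127, the node names and tokens are invented, and `tokenFor()` and `replicasFor()` are hypothetical helpers.

```php
<?php
// Simplified ring: tokens run from 0 to 99 instead of 0 to 2^127.
// Node names and token values are made up for illustration.
$ring = array(
    'node-a' => 0,
    'node-b' => 25,
    'node-c' => 50,
    'node-d' => 75,
);

/**
 * Map a row key to a token. Uses part of an MD5 hash as a stand-in;
 * this is not the exact calculation Cassandra performs.
 */
function tokenFor($rowKey, $ringSize = 100)
{
    return hexdec(substr(md5($rowKey), 0, 7)) % $ringSize;
}

/**
 * Return the nodes storing a token: the first node whose token is >= the
 * given token (wrapping around the ring), then the next nodes clockwise.
 */
function replicasFor($token, array $ring, $replicationFactor)
{
    asort($ring);
    $nodes = array_keys($ring);
    $start = 0; // wrap around to the first node when no token is large enough
    foreach (array_values($ring) as $i => $nodeToken) {
        if ($nodeToken >= $token) {
            $start = $i;
            break;
        }
    }
    $replicas = array();
    for ($i = 0; $i < min($replicationFactor, count($nodes)); $i++) {
        $replicas[] = $nodes[($start + $i) % count($nodes)];
    }
    return $replicas;
}

print_r(replicasFor(tokenFor('mau'), $ring, 3));
```

With a replication factor of 3, every row ends up on three neighbouring nodes, so losing one node does not make the row unavailable.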


Client can connect to any node


Seed nodes
• Required for bootstrapping nodes
• Define 2 or 3 seed nodes per cluster

Extending the ring
• Assign a token for new node
• Configure seed node host
• Start Cassandra on new node

Consistency


Writing data
• Hinted handoff
• Write to commit log
• Write in memory
• Write to disk (together with timestamp)

Write consistency
• Choose from ANY, ONE, TWO, THREE, QUORUM, ALL
• QUORUM = floor((replication factor / 2) + 1)
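
The QUORUM formula above can be checked with a few lines of PHP. The function name `quorum()` is only an illustrative helper.

```php
<?php
/**
 * Number of replicas that must acknowledge a QUORUM read or write.
 */
function quorum($replicationFactor)
{
    return (int) floor(($replicationFactor / 2) + 1);
}

foreach (array(1, 2, 3, 5) as $rf) {
    printf("RF %d -> QUORUM %d\n", $rf, quorum($rf));
}
// RF 1 -> QUORUM 1
// RF 2 -> QUORUM 2
// RF 3 -> QUORUM 2
// RF 5 -> QUORUM 3
```

Note that a replication factor of 2 gives a quorum of 2, so one unavailable replica already blocks QUORUM operations; 3 is the smallest factor that tolerates a lost node.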

Read consistency
• Choose from ONE, TWO, THREE, QUORUM, ALL
• Most recent copy is returned
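
"Most recent copy" means the replica answer with the highest write timestamp wins. A minimal sketch of that rule, assuming each replica response is a hypothetical array with `value` and `timestamp` keys:

```php
<?php
/**
 * Pick the value with the highest write timestamp from replica responses.
 * Each response is a hypothetical array('value' => ..., 'timestamp' => ...).
 */
function mostRecent(array $responses)
{
    $latest = null;
    foreach ($responses as $response) {
        if ($latest === null || $response['timestamp'] > $latest['timestamp']) {
            $latest = $response;
        }
    }
    return $latest === null ? null : $latest['value'];
}

// Two replicas answer a read; one of them missed the last write.
$responses = array(
    array('value' => 'Maurits', 'timestamp' => 1384250000),
    array('value' => 'Mau', 'timestamp' => 1384253600),
);
print mostRecent($responses);
```

The values and timestamps in the usage example are invented; the point is only that the stale answer is discarded on the coordinating node.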

Read repair
• Compares data with 2 other replicas in the background
• Fixes inconsistent and missing data
• Runs on 10% of all reads

Node repair
• Gradually compares all data in nodes with replicas
• Required in conjunction with read repair to fix ‘forgotten deletes’

ACID properties
• Atomic: completed successfully or entirely rolled back
• Consistent: transactions never invalidate the database state
• Isolated: transactions behave as if processed sequentially
• Durable: completed actions are persistent

CAP theorem
Impossible to achieve all three:
• Consistency
• Availability
• Partition tolerance

Eventual consistency
Not guaranteed to be consistent, but becomes consistent later


Eventual consistency
• Best effort
• Configurable consistency level, but no transaction support
• Consistency is not always more important than speed and scalability (doesn’t require locking)

Surrogate keys
Say bye to sequences


Surrogate keys
• Not consistent across cluster
• Counters are for counting
• Native support for UUIDs
  f47ac10b-58cc-4372-a567-0e02b2c3d479
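
Since sequences are not an option, the application can generate its own keys. A minimal sketch of a random (version 4) UUID generator in PHP 7 or later; `uuid4()` is an illustrative helper, not part of any Cassandra client.

```php
<?php
/**
 * Generate a random (version 4) UUID as a lowercase string.
 */
function uuid4()
{
    $bytes = random_bytes(16);
    // Set the version field to 4 and the variant field to RFC 4122.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);
    // 32 hex digits, split in 8 groups of 4 and joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

print uuid4();
```

Every node or web server can create keys this way without coordinating with the rest of the cluster.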

Cassandra 1.2


Cassandra 1.2
• No longer schemaless
• Introduced CQL3
• No wide tables anymore

Collections
• Lists
• Maps
• Sets

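Lists
• user['mau']['posts'] = 'uuid';
• CREATE TABLE user (
    username text PRIMARY KEY,
    posts list<uuid>
  );
• Append:
  UPDATE user SET posts = posts + [f47ac10b-58cc-4372-a567-0e02b2c3d479] WHERE username = 'mau';
• Prepend:
  UPDATE user SET posts = [f47ac10b-58cc-4372-a567-0e02b2c3d479] + posts WHERE username = 'mau';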

Sets
• CREATE TABLE user (
    username text PRIMARY KEY,
    emails set<text>
  );
• UPDATE user SET emails = emails + {'mail@example.com'} WHERE username = 'mau';


Maps
• CREATE TABLE user (
    username text PRIMARY KEY,
    attending map<timestamp, text>
  );
• UPDATE user SET attending['2013-11-12'] = 'PHPMeetup' WHERE username = 'mau';
• DELETE attending['2013-12-05'] FROM user WHERE username = 'mau';

Limits on collections
• 64K
  No size check in CQL
  SET list = list + ['...']
• Whole collection loaded in memory when reading / writing
• Not an alternative to wide tables!

Wide tables in CQL3
• CREATE TABLE tweets (
    tweet_id uuid PRIMARY KEY,
    author varchar,
    body varchar
  );
• CREATE TABLE timeline (
    user_id varchar,
    tweet_id uuid,
    author varchar,
    body varchar,
    PRIMARY KEY (user_id, tweet_id)
  );


Wide tables in CQL3

user_id | uuid:author | uuid:body
mau     | anne        | Tweet from Anne
mike    | david       | Tweet from David

For schemaless lovers:

CREATE TABLE name (
  rowkey varchar,
  columnname varchar,
  value blob,
  PRIMARY KEY (rowkey, columnname)
);

Secondary index
• CREATE INDEX name ON table (column);
• High memory usage when used with high cardinality

Iteration
• SELECT * FROM users

Iteration
unpredictable performance
• SELECT * FROM users LIMIT 10 OFFSET 100

Iteration
• SELECT * FROM users
• SELECT token(username), username, country, age FROM users
  WHERE token(username) > 23947239 LIMIT 10
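
Paging by token means remembering the last token seen and asking for the rows after it. The sketch below simulates that loop in plain PHP without a database: `$users` is a made-up table of token => username pairs and `fetchPage()` is a hypothetical stand-in for the token query.

```php
<?php
// Hypothetical stand-in for the users table: token => username.
// Tokens and the extra name 'zoe' are made up for illustration.
$users = array(
    42 => 'mau',
    -50 => 'anne',
    130 => 'zoe',
    10 => 'david',
    97 => 'mike',
);

/**
 * Mimic "SELECT token(username), username FROM users
 *        WHERE token(username) > :lastToken LIMIT :limit".
 */
function fetchPage(array $table, $lastToken, $limit)
{
    ksort($table); // rows come back in token order
    $page = array();
    foreach ($table as $token => $username) {
        if ($lastToken === null || $token > $lastToken) {
            $page[$token] = $username;
            if (count($page) === $limit) {
                break;
            }
        }
    }
    return $page;
}

$seen = array();
$lastToken = null; // null means: start at the beginning of the ring
do {
    $page = fetchPage($users, $lastToken, 2);
    foreach ($page as $token => $username) {
        $seen[] = $username;
        $lastToken = $token; // remember where to resume
    }
} while (count($page) > 0);

print implode(', ', $seen);
```

Each page costs roughly the same regardless of how far into the table the iteration is, which is what an OFFSET-style query cannot offer.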

Queries are always controlled by one node
Even if data from 100 nodes is involved


MapReduce
Or just ‘MapRed’


MapReduce
• array_map
• array_reduce

map()
• Processes a subset of the data
• array_map(function($v) { return strtoupper($v); }, array('a', 'b'))

reduce()
• Merges results from the mapping function
• array_reduce(array(1, 2, 3), function($a, $b) { return $a + $b; });

MapReduce
[Figure: many parallel map() calls whose outputs are combined into a single result]


Wordcount
$data = array('red green blue', 'orange blue', 'purple green');

// Map: count the words within each line separately.
$data = array_map(function($v) {
    $words = array();
    foreach (explode(' ', $v) as $word) {
        $words[$word] = isset($words[$word]) ? $words[$word] + 1 : 1;
    }
    return $words;
}, $data);

// Reduce: merge the counts ($a is the running total, $b the next line's counts).
$data = array_reduce($data, function($a, $b) {
    foreach ($a as $word => $count) {
        $b[$word] = isset($b[$word]) ? $b[$word] + $count : $count;
    }
    return $b;
}, array());

// Result (key order may differ):
// array('red' => 1, 'green' => 2, 'blue' => 2, 'orange' => 1, 'purple' => 1)

ORDER BY value LIMIT 5
$data = array(array(4,5,2), array(62,35,1), array(74,56,2,34));

// Map: each subset returns its own 5 smallest values.
$data = array_map(function($v) {
    sort($v);
    return array_slice($v, 0, 5);
}, $data);

// Reduce: merge two partial results and keep the 5 smallest again.
$data = array_reduce($data, function($a, $b) {
    $v = array_merge($a, $b);
    sort($v);
    return array_slice($v, 0, 5);
}, array());

// Result: array(1, 2, 2, 4, 5)

Remember
• Getting information is a bumpy road in big data
• Use MapRed to transform data into information

MapReduce
• No native support in Cassandra
• MapReduce possible with Hadoop (requires Java programming)

Pig
input_lines = LOAD '/tmp/my-copy-of-all-pages-on-internet' AS (line:chararray);
words = FOREACH input_lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
filtered_words = FILTER words BY word MATCHES '\\w+';
word_groups = GROUP filtered_words BY word;
word_count = FOREACH word_groups GENERATE COUNT(filtered_words) AS count, group AS word;
ordered_word_count = ORDER word_count BY count DESC;
STORE ordered_word_count INTO '/tmp/number-of-words-on-internet';


Hive
SELECT v['ip'], COUNT(1) AS cnt FROM www_access
GROUP BY v['ip']
ORDER BY cnt DESC LIMIT 30


Pig and Hive
• Using MapReduce
• No(t very) predictable performance
• Good for analysis

Hack your own
• Not too difficult
• Data can be split into subsets by filtering on tokens
• Application must run on all MapRed nodes
• Probably better performance than Pig / Hive

Interfaces / protocols
• Thrift
• Binary protocol (1.2+)
• Gossip (internode communication)

Thrift
• Something like SOAP in a binary format
• Tool which generates libraries based on definition files
• Supports many languages (incl. PHP, JS, NodeJS, C, Java, Python, Ruby, ...)
• Also used by HyperTable, HBase, Accumulo and ElasticSearch
• Sole interface before 1.2

Thrift
• No support for collections

Binary protocol
• Recommended protocol for Cassandra 1.2
• Few client libraries available
• No binary connectors were available for PHP
  https://github.com/mauritsl/php-cassandra

php-cassandra
require('lib/cassandra/Cassandra.php');
use Cassandra\Connection as Cassandra;

$connection = new Cassandra('localhost', 'keyspace');
$rows = $connection->query('SELECT * FROM user');
foreach ($rows as $row) {
    print $row->firstname;
    print $row->listfield[0];
}
$rows->count();
$rows->getColumns();

Scaling applications


Rule 1:
Don’t ask for NoSQL drivers for a CMS


Cassandra does not fit all
(same story for every NoSQL solution)


Every page (or API call) should only
require a few queries (if not one)


Static versus Dynamic data
• Static: information that doesn’t change very often
  • E.g. translations
  • May go in an RDBMS or local storage (files?)
• Dynamic: many changes
  • Changes must be visible on all nodes
  • Use Cassandra

Local versus Global data
• Logging
  • Separate logs per node
• Cache
  • Sometimes no need to share cache between nodes
• Statistics
  • Can be kept local for a limited time
• Sessions
  • Dependent on session stickiness

Caching
• Memcache is recommended for local cache
• Cassandra can be used for global cache
  • Has a TTL feature
    INSERT INTO ... (...) VALUES (...) USING TTL 86400

What about files?
• Use Hadoop Distributed File System (HDFS) or GlusterFS
• Or use Cassandra

What about files?
• Split files in chunks to avoid hotspots and save the heap
• Not uncommon to have files in Cassandra
  • github.com/Netflix/astyanax
• GBs are ok, but do not store TBs
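
Splitting a file in chunks could look roughly like this. It is a sketch under assumptions: the "name:index" row key scheme and the helpers `chunkFile()` and `joinChunks()` are invented for illustration, and no actual storage calls are shown.

```php
<?php
/**
 * Split file contents into fixed-size chunks, one row per chunk.
 * Row keys use a hypothetical "name:index" scheme with zero-padded indexes.
 */
function chunkFile($name, $contents, $chunkSize)
{
    $rows = array();
    foreach (str_split($contents, $chunkSize) as $index => $chunk) {
        $rows[sprintf('%s:%06d', $name, $index)] = $chunk;
    }
    return $rows;
}

/**
 * Reassemble the original contents from the chunk rows.
 */
function joinChunks(array $rows)
{
    ksort($rows, SORT_STRING); // zero-padded indexes sort in the original order
    return implode('', $rows);
}

$rows = chunkFile('photo.jpg', 'abcdefghij', 4);
print_r(array_keys($rows));
print joinChunks($rows);
```

Because every chunk gets its own row key, the chunks spread over the ring instead of piling up on one node, and no single read has to hold the whole file in memory.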

Maximum size of cluster?
• No satisfactory answer
• Probably more dependent on network equipment
  • Rack awareness helps here
• Facebook: 150 node cluster, 50TB data (2010)
• Easou: 400 node cluster, 300TB data (300 million images)

Minimum size of a cluster?
• Can run on a single node
• 4GB RAM recommended
• Runs fine on 1GB RAM

“hot data” should fit in RAM

Installing Cassandra
• Install JDK
  Oracle Java recommended but OpenJDK works ok
• Add Cassandra repository
• apt-get install cassandra
• Set listen and seed address (IP address of node and seed)
• (Re)start Cassandra

Last words...


Data structure is naturally responsive for information

predictable performance


History and usage
Jeff Hammerbacher


How to use it
Schema design, CQL3 and limits


Developments
CQL3 and binary protocol


Thank you!


Questions?

