An introduction to NoSQL (structured storage). This presentation also covers why you might choose Cassandra over a relational database system.
5. PhpXperts seminar 2010,
Work for Fun!!
Database!
SELECT books.*
FROM books
LEFT JOIN users
ON books.user_id = users.id
WHERE
users.age < 15
Relational
Database
System
Query language!
Known issues with existing database systems!
• Maintaining relations among tables
• Table-, page- and row-level locking
• Huge data produces huge, fat indexes
• Transactional operations
• Parsing SQL query syntax
• Multi-table join queries
About NoSQL
• NoSQL == structured storage!
• An initiative to use alternatives to relational database systems
• Targets the following goals
– Performance
– Autonomy
– Minimizing cost
Why structured storage over a relational database?
• Getting rid of fear
– Expanding to a larger scale is CHEAP!
– No SQL parsing overhead
– No table joins
– No relations to maintain
– No single big chunk of data
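To make the "no table joins" point concrete, here is a minimal sketch in plain Python dictionaries. It is not a Cassandra API, and the sample records and the names `books_by_age_group` and `books_of_young_users_lookup` are hypothetical. The relational query shown earlier joins books to users on every read; a structured store would more typically keep the answer under a precomputed key, so the read is a single lookup.

```python
# Relational style: two "tables" that are joined at read time.
users = {1: {"name": "alice", "age": 12}, 2: {"name": "bob", "age": 30}}
books = [
    {"title": "Comics", "user_id": 1},
    {"title": "Databases", "user_id": 2},
]


def books_of_young_users_join(max_age):
    """Emulate the join: scan books and look up each owner on every read."""
    return [b["title"] for b in books if users[b["user_id"]]["age"] < max_age]


# Structured-storage style: the data is stored denormalized under the key
# the application will ask for, so no join is needed when reading.
books_by_age_group = {"under_15": ["Comics"], "15_and_over": ["Databases"]}


def books_of_young_users_lookup():
    """Read path: one key lookup."""
    return books_by_age_group["under_15"]
```

Both functions return the same titles for this sample data; the difference is where the work happens.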
At Facebook!
• Facebook runs around 140 nodes!
• Facebook open sourced Cassandra!
Digg.com announced it is using Cassandra!
The fundamental problem is endemic to the relational
database mindset, which places the burden of computation
on reads rather than writes. This is completely wrong
for large-scale web applications, where response time is
critical. It’s made much worse by the serial nature of
most applications. Each component of the page blocks on
reads from the data store, as well as the completion of
the operations that come before it. Non-relational data
stores reverse this model completely, because they don’t
have the complex read operations of SQL.
Read at home!
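The quote's idea of placing the burden of computation on writes rather than reads can be sketched as follows. This is a toy in-memory model, not Digg's or Cassandra's implementation; the `followers` data and the `post` / `read_timeline` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: author -> readers who follow that author.
followers = {"alice": ["bob", "carol"], "bob": ["carol"]}

# reader -> list of (author, text), filled in at write time.
timelines = defaultdict(list)


def post(author, text):
    """Write path: do the expensive fan-out once, when the item is written."""
    for reader in followers.get(author, []):
        timelines[reader].append((author, text))


def read_timeline(reader):
    """Read path: a single key lookup, with no joins or aggregation."""
    return timelines[reader]


post("alice", "hello")
post("bob", "hi")
```

Each page component then reads its own precomputed key instead of blocking on a complex query.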
Twitter moved their statuses to Cassandra!
We have a lot of data, the growth factor in that data is
huge and the rate of growth is accelerating. We have a
system in place based on shared mysql + memcached but
it's quickly becoming prohibitively costly (in terms of
manpower) to operate. We need a system that can grow in
a more automated fashion and be highly available.
Read at home!
A few related references!
• Check out Yahoo! Research's “Cloud Serving Benchmark”
• MySQL and Memcached: End of an Era?
• An article on how Yoshinori scaled MySQL as NoSQL to serve 750,000 qps
Why Cassandra!
• Tested (Facebook, Twitter, Reddit, Digg, Rackspace, etc.)
• Decentralized, with no single point of failure
• Flexible schema
• Elastic
• Durable; data center and disaster management aware
• Highly scalable writes
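One way to picture "decentralized" and "elastic" is a ring of nodes where each row key is hashed to a position and owned by the next node on the ring. The sketch below is a simplified model of that idea, not Cassandra's actual partitioner; the `Ring` class, the `token` helper and the node names are hypothetical.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (MD5 is used here only as a stable hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # Each node sits at the position given by the hash of its name.
        self._tokens = sorted((token(n), n) for n in nodes)

    def owner(self, key):
        """Return the first node at or after the key's position, wrapping around."""
        positions = [t for t, _ in self._tokens]
        i = bisect.bisect_left(positions, token(key)) % len(positions)
        return self._tokens[i][1]


keys = ["user:%d" % i for i in range(1000)]
before = Ring(["node-a", "node-b", "node-c"])
after = Ring(["node-a", "node-b", "node-c", "node-d"])

# Keys whose owner changed after the fourth node joined.
moved = [k for k in keys if before.owner(k) != after.owner(k)]
```

In this model every key in `moved` ends up on the new node, and all other keys stay where they were, which is why adding a node does not require reshuffling the whole data set.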
Install Cassandra
• Download Cassandra from http://cassandra.apache.org/
• Ensure you have a Java runtime on your PC
Create keyspace and column family
• Extract your downloaded Cassandra archive
• Edit the “conf/storage-conf.xml” file in your text editor
• Go to the “Keyspaces” block
• Add the “AddressBook” keyspace
Configuration!
<Keyspace Name="AddressBook">
  <ColumnFamily Name="Addresses" CompareWith="TimeUUIDType"/>
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>
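Before restarting the server, it can help to check that the edited block is well-formed XML and says what you intended. The following is a small sketch using Python's standard library only; `summarize_keyspace` is a hypothetical helper, and the XML string is a copy of the AddressBook keyspace block from this slide, not the whole configuration file.

```python
import xml.etree.ElementTree as ET

# Copy of the keyspace block from the slide (not the full config file).
KEYSPACE_XML = """
<Keyspace Name="AddressBook">
  <ColumnFamily Name="Addresses" CompareWith="TimeUUIDType"/>
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>
"""


def summarize_keyspace(xml_text):
    """Parse one <Keyspace> block; raises ParseError if the XML is malformed."""
    root = ET.fromstring(xml_text.strip())
    return {
        "keyspace": root.get("Name"),
        "column_families": {
            cf.get("Name"): cf.get("CompareWith")
            for cf in root.findall("ColumnFamily")
        },
        "replication_factor": int(root.findtext("ReplicationFactor")),
    }


summary = summarize_keyspace(KEYSPACE_XML)
```

A tag broken across lines, as can happen when copying from a slide, would fail at the `ET.fromstring` call rather than at server startup.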
Kick start Cassandra!
• $ cd apache-cassandra-0.6.x
• $ ./bin/cassandra
• Get the PHP Thrift client library, phpcassa
• https://github.com/hoan/phpcassa/
Show code!!
• Inserting data into Cassandra!
• Listing all added data
• Removing existing data
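The live demo code is not part of this transcript. As a stand-in, here is a toy in-memory model of the three operations in Python. It is not phpcassa and does not talk to Cassandra; the `ColumnFamily` class, the row key and the values are hypothetical. Column names are version 1 (time-based) UUIDs, mirroring the `TimeUUIDType` comparator configured for the Addresses column family.

```python
import uuid


class ColumnFamily:
    """Toy in-memory stand-in: row key -> {column name -> value}."""

    def __init__(self):
        self._rows = {}

    def insert(self, row_key, columns):
        """Add or overwrite columns in a row, creating the row if needed."""
        self._rows.setdefault(row_key, {}).update(columns)

    def get(self, row_key):
        """Return (name, value) pairs ordered by the UUID's embedded timestamp."""
        row = self._rows.get(row_key, {})
        return sorted(row.items(), key=lambda item: item[0].time)

    def remove(self, row_key, column_name=None):
        """Remove one column, or the whole row when no column is given."""
        if column_name is None:
            self._rows.pop(row_key, None)
        else:
            self._rows.get(row_key, {}).pop(column_name, None)


addresses = ColumnFamily()
home = uuid.uuid1()    # time-based UUIDs used as column names
office = uuid.uuid1()

addresses.insert("alice", {home: "home address"})      # inserting data
addresses.insert("alice", {office: "office address"})
listed = addresses.get("alice")                        # listing all added data
addresses.remove("alice", home)                        # removing existing data
remaining = addresses.get("alice")
```

The shape to notice is row key, then column name, then value, with columns kept in comparator order inside each row.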
More about Cassandra!
• All nodes need to be connected over a low-latency fiber network
• Still in an alpha version! It might have issues!
• Use high-RPM hard disks (e.g. 15,000 or 10,000 RPM)
Common sense!
• Avoid Big Design Up Front!
• Benchmark your existing system's performance!
• Experiment and calculate the cost!
• Structured storage and Dr. Eric Brewer's CAP theorem
– Consistency - Is the data I’m looking at now the same if I look at it somewhere else?
– Availability - What happens if my database goes down?
– Partition tolerance - What if my data is on different networks?
Find the best restaurant in town!
A community of passionate food reviewers!
http://restaurant.welltreat.us
Who am I?
- nhm tanveer hossain khan (hasan)
IT Director, Tasawr Interactive
hasan@tasawr.com
Blog: http://hasan.we4tech.com
Twitter: http://twitter.com/we4tech
Love programming; used to write code in Ruby, Java and PHP!