This document provides an overview and comparison of relational and NoSQL databases. Relational databases use SQL and enforce strict schemas, while NoSQL databases are schema-flexible and span document, key-value, wide-column, and graph models. NoSQL databases are designed to scale horizontally across commodity servers, to keep performance comparatively steady as data volumes grow, and to support flexible query processing, often through MapReduce rather than SQL. Popular NoSQL databases include MongoDB, Cassandra, HBase, and Redis.
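To make the contrast concrete, here is a minimal sketch of the same record modeled both ways, using Python's built-in sqlite3 as a stand-in relational store; the table and field names are illustrative, not taken from the document:

    import sqlite3, json

    # Relational model: the schema is declared up front and every row must conform.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    db.execute("INSERT INTO users (name, email) VALUES (?, ?)", ("Ada", "ada@example.com"))

    # Document model: each record is self-describing and fields may vary per record,
    # which is how document stores such as MongoDB organize data.
    users = [
        {"name": "Ada", "email": "ada@example.com"},
        {"name": "Grace", "email": "grace@example.com", "tags": ["admin"]},  # extra field is fine
    ]

    print(db.execute("SELECT name, email FROM users").fetchall())
    print(json.dumps(users, indent=2))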
A Practical Look at the NOSQL and Big Data Hullabaloo (Andrew Brust)
(Slide excerpt: a graph-model example listing Martha Washington's "friend of" relationships to figures such as Thomas Jefferson, Benjamin Franklin, John Adams, James Madison, Alexander Hamilton, the Marquis de Lafayette, Baron von Steuben, Henry Knox, Nathanael Greene, the Comte de Rochambeau, John Jay, and John Marshall.)
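As a sketch of how a graph data model holds relationships like those in the slide, an adjacency list is enough for illustration (pure Python; the node names come from the slide itself):

    # One node per person, one edge per "friend of" relationship.
    # Graph databases index these edges so traversals avoid join tables.
    friends = {
        "Martha Washington": [
            "Thomas Jefferson", "Benjamin Franklin", "John Adams",
            "James Madison", "Alexander Hamilton", "Marquis de Lafayette",
        ],
    }

    # A one-hop traversal: everyone directly connected to the starting node.
    for person in friends["Martha Washington"]:
        print("Martha Washington -> friend of ->", person)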
This document summarizes a presentation by Andrew Brust at the SQL Server Live! Orlando 2012 conference about Microsoft's strategy for big data. The presentation introduces key concepts like Hadoop, MapReduce, HDFS, and Hive. It describes Microsoft's HDInsight platform for running Hadoop on Windows Azure and integrating Hadoop with SQL Server and business intelligence tools using approaches like the Hive ODBC driver and PowerPivot. The presentation argues that Microsoft's technologies are making big data stored in Hadoop accessible to more business users.
The document provides an overview and agenda for a presentation on SQL Server Denali business intelligence (BI) capabilities. Key points include:
- PowerPivot and Excel Services allow self-service BI through a familiar Excel interface while leveraging Analysis Services for storage and collaboration features.
- Analysis Services Tabular Mode is the server implementation of PowerPivot, supporting partitions, roles and other enterprise features.
- Project "Crescent" provides ad hoc reporting directly against PowerPivot and Analysis Services Tabular models through a browser-based, Excel-like interface in Silverlight.
- Master Data Services and Data Quality Services provide master data management and data cleansing capabilities to support better data quality for BI initiatives.
Hadoop and its Ecosystem Components in Action (Andrew Brust)
This document provides an overview of Andrew Brust's presentation on Hadoop and its ecosystem components. The presentation introduces key concepts like MapReduce, HDFS, Hive, Pig, HBase, Zookeeper and Mahout. It also provides instructions on setting up and using Hadoop on Amazon Elastic MapReduce and Microsoft Azure HDInsight. The document includes examples of commands for working with HDFS, MapReduce, Hive, Pig, HBase and Mahout.
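The deck's own commands are not reproduced here, but as a hedged sketch of the MapReduce model those components share, the canonical word count fits in a few lines of Python written in the Hadoop Streaming style (both phases read and write plain key/value lines):

    import sys
    from itertools import groupby

    def mapper(lines):
        # Map phase: emit (word, 1) for every word; the framework then
        # shuffles and sorts these pairs so equal keys arrive together.
        for line in lines:
            for word in line.split():
                yield word.lower(), 1

    def reducer(pairs):
        # Reduce phase: sum the counts for each distinct word.
        for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
            yield word, sum(count for _, count in group)

    if __name__ == "__main__":
        # Local simulation of the map -> shuffle -> reduce pipeline over stdin.
        for word, total in reducer(mapper(sys.stdin)):
            print(word, total, sep="\t")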
Tony Gibbs gave a presentation on Amazon Redshift covering its history, architecture, concepts, and parallelism. The presentation included details on Redshift's cluster architecture, node components, storage design, data distribution styles, and terminology. It also provided a deep dive on parallelism in Redshift, explaining how queries are compiled and executed through streams, segments, and steps to enable massively parallel processing across nodes.
In-Memory Database Systems for Big Data Management: SAP HANA Database (George Joseph)
SAP HANA is an in-memory database system that stores data in main memory rather than on disk for faster access. It uses a column-oriented approach to optimize analytical queries. SAP HANA can scale from small single-server installations to very large clusters and cloud deployments. Its massively parallel processing architecture and in-memory analytics capabilities enable real-time processing of large datasets.
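Why a column-oriented layout speeds up analytical queries can be sketched in a few lines: an aggregate over one field scans a single contiguous array instead of touching every field of every row (plain Python, purely illustrative):

    # Row store: each record keeps all of its fields together.
    rows = [
        {"id": 1, "region": "EU", "amount": 120},
        {"id": 2, "region": "US", "amount": 80},
        {"id": 3, "region": "EU", "amount": 200},
    ]

    # Column store: one array per field, so a query reads only what it needs.
    columns = {"id": [1, 2, 3], "region": ["EU", "US", "EU"], "amount": [120, 80, 200]}

    # SUM(amount): the row store visits every record; the column store scans one list
    # that also compresses far better, since values of one type sit side by side.
    print(sum(r["amount"] for r in rows))
    print(sum(columns["amount"]))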
SQL Server to Redshift Data Load Using SSIS (Marc Leinbach)
This article explains how to load data from SQL Server into an Amazon Redshift data warehouse using SSIS. The techniques outlined here also apply when extracting data from other relational sources (e.g., loading data from MySQL or Oracle to Redshift). It first covers the steps and challenges of loading data into Redshift, then simplifies the whole process using an SSIS task for Amazon Redshift data transfer.
Analyzing big data quickly and efficiently requires a data warehouse optimized to handle and scale for large datasets. Amazon Redshift is a fast, petabyte-scale data warehouse that makes it simple and cost-effective to analyze big data for a fraction of the cost of traditional data warehouses. In this session, we take an in-depth look at data warehousing with Amazon Redshift for big data analytics. We cover best practices to take advantage of Amazon Redshift's columnar technology and parallel processing capabilities to deliver high throughput and query performance. We also discuss how to design optimal schemas, load data efficiently, and use workload management.
Relational databases vs Non-relational databases (James Serra)
There is a lot of confusion about the place and purpose of the many recent non-relational database solutions ("NoSQL databases") compared to the relational database solutions that have been around for so many years. In this presentation I will first clarify what exactly these database solutions are, compare them, and discuss the best use cases for each. I'll discuss topics involving OLTP, scaling, data warehousing, polyglot persistence, and the CAP theorem. We will even touch on a new type of database solution called NewSQL. If you are building a new solution it is important to understand all your options so you take the right path to success.
Deep dive into Clustered Columnstore structures with information on compression algorithms, compression types, locking and dictionaries, as well as the Batch Processing Mode.
HBase is a distributed, column-oriented database that stores data in tables divided into rows and columns. It is optimized for random, real-time read/write access to big data. The document discusses HBase's key concepts like tables, regions, and column families. It also covers performance tuning aspects like cluster configuration, compaction strategies, and intelligent key design to spread load evenly. Different use cases are suitable for HBase depending on access patterns, such as time series data, messages, or serving random lookups and short scans from large datasets. Proper data modeling and tuning are necessary to maximize HBase's performance.
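One common form of the intelligent key design mentioned above is key salting: prefixing monotonically increasing row keys, such as timestamps, with a small hash-derived bucket so writes spread across regions instead of hot-spotting on the last one. A minimal sketch, with the bucket count chosen arbitrarily:

    import hashlib

    NUM_BUCKETS = 16  # illustrative; in practice sized to the region server count

    def salted_key(row_key: str) -> str:
        # Derive a stable bucket from the key itself so readers can
        # recompute the prefix instead of storing it separately.
        bucket = int(hashlib.md5(row_key.encode()).hexdigest(), 16) % NUM_BUCKETS
        return f"{bucket:02d}-{row_key}"

    # Sequential timestamps now land in different regions rather than one hot one.
    for ts in ["20240601120000", "20240601120001", "20240601120002"]:
        print(salted_key(ts))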
This document provides an overview of a NoSQL Night event presented by Clarence J M Tauro from Couchbase. The presentation introduces NoSQL databases and discusses some of their advantages over relational databases, including scalability, availability, and partition tolerance. It covers key concepts like the CAP theorem and BASE properties. The document also provides details about Couchbase, a popular document-oriented NoSQL database, including its architecture, data model using JSON documents, and basic operations. Finally, it advertises Couchbase training courses for getting started and administration.
In this talk, Ian will talk about Amazon Redshift, a managed, petabyte-scale data warehouse, give an overview of integration with Amazon Elastic MapReduce, a managed Hadoop environment, and cover some exciting new developments in the analytics space.
This document provides an overview and update on Amazon Aurora, Amazon's relational database service. It discusses new performance enhancements including improved read performance through caching, NUMA-aware scheduling, and lock compression to reduce contention. New availability features are also summarized, such as automatic repair and replacement of failed database nodes and storage volumes that can grow to 64TB. The document outlines Aurora's architecture advantages over traditional databases for scaling in the cloud through its distributed, self-healing design.
Has your app taken off? Are you thinking about scaling? MongoDB makes it easy to horizontally scale out with built-in automatic sharding, but did you know that sharding isn't the only way to achieve scale with MongoDB?
In this webinar, we'll review three different ways to achieve scale with MongoDB. We'll cover how you can optimize your application design and configure your storage to achieve scale, as well as the basics of horizontal scaling. You'll walk away with a thorough understanding of options to scale your MongoDB application.
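For flavor, a hedged pymongo sketch of the sharding basics: sharding is enabled per database and per collection through admin commands, and the shard key controls how documents spread across shards. The connection string, database, and key are placeholders, and the commands assume a sharded cluster behind a mongos router:

    from pymongo import MongoClient

    # Placeholder address; in a sharded deployment this points at a mongos router.
    client = MongoClient("mongodb://localhost:27017")

    # Enable sharding for the database, then shard one collection on a hashed key,
    # which spreads writes evenly at the cost of ordered range queries on that key.
    client.admin.command("enableSharding", "appdb")
    client.admin.command("shardCollection", "appdb.events", key={"user_id": "hashed"})

    # Reads and writes then go through the router unchanged.
    client.appdb.events.insert_one({"user_id": 42, "action": "login"})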
Jeremy Beard, a senior solutions architect at Cloudera, introduces Kudu, a new column-oriented storage system for Apache Hadoop designed for fast analytics on fast changing data. Kudu is meant to fill gaps in HDFS and HBase by providing efficient scanning, finding and writing capabilities simultaneously. It uses a relational data model with ACID transactions and integrates with common Hadoop tools like Impala, Spark and MapReduce. Kudu aims to simplify real-time analytics use cases by allowing data to be directly updated without complex ETL processes.
This document provides an introduction to NoSQL and Cassandra. It begins with an introduction of the presenter and an overview of what will be covered. It then discusses the history of databases and why alternatives to relational databases were needed to address challenges of scaling to internet-level data volumes, varieties, and velocities. It introduces key NoSQL concepts like CAP theorem, BASE, and the different types of NoSQL databases before focusing on Cassandra. The document summarizes Cassandra's origins, capabilities, data model involving column families and super column families, and architecture.
(1) Amazon Redshift is a fully managed data warehousing service in the cloud that makes it simple and cost-effective to analyze large amounts of data across petabytes of structured and semi-structured data. (2) It provides fast query performance by using massively parallel processing and columnar storage techniques. (3) Customers like NTT Docomo, Nasdaq, and Amazon have been able to analyze petabytes of data faster and at a lower cost using Amazon Redshift compared to their previous on-premises solutions.
This document discusses the limitations of relational databases for modern applications and real-time architectures. It describes how NoSQL databases like Aerospike can provide better performance and scalability. Specific examples are given of how Aerospike has been used to power applications in domains like advertising technology, social media, travel portals, and financial services that require high throughput, low latency access to large datasets.
(SOV202) Choosing Among AWS Managed Database Services | AWS re:Invent 2014 (Amazon Web Services)
In addition to running databases in Amazon EC2, AWS customers can choose among a variety of managed database services. These services save effort, save time, and unlock new capabilities and economies. In this session, we make it easy to understand how they differ, what they have in common, and how to choose one or more. We explain the fundamentals of Amazon DynamoDB, a fully managed NoSQL database service; Amazon RDS, a relational database service in the cloud; Amazon ElastiCache, a fast, in-memory caching service in the cloud; and Amazon Redshift, a fully managed, petabyte-scale data-warehouse solution that can be surprisingly economical. We'll cover how each service might help support your application, how much each service costs, and how to get started.
This is an introduction to relational and non-relational databases and how their performance affects scaling a web application.
This is a recording of a guest Lecture I gave at the University of Texas school of Information.
In this talk I address the technologies and tools Gowalla (gowalla.com) uses, including memcache, redis and cassandra.
Find more on my blog:
http://schneems.com
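As a sketch of the role memcache and redis play in a stack like this, here is the classic cache-aside pattern with the redis-py client; the host, key naming, and the stubbed database call are all placeholders:

    import json
    import redis

    r = redis.Redis(host="localhost", port=6379)  # placeholder connection

    def fetch_user_from_db(user_id):
        # Stand-in for a slow relational query.
        return {"id": user_id, "name": "example"}

    def get_user(user_id, ttl=300):
        key = f"user:{user_id}"
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)         # cache hit: skip the database
        user = fetch_user_from_db(user_id)    # cache miss: load from the source
        r.set(key, json.dumps(user), ex=ttl)  # expire so stale entries age out
        return user

    print(get_user(1))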
This presentation can help you to apply partitioning when appropriate, and to avoid problems when using it. The one-liner is: Simple Works Best. The illustrating demos are on Postgres 12 (maybe 13 by the time of presenting) and show some of the problems and solutions that partitioning can provide. Some of this “experience” is quite old and the demo runs near-identical on Oracle…
These problems are the same on any database.
1) Hive is a data warehouse infrastructure built on top of Hadoop for querying large datasets stored in HDFS. It provides SQL-like capabilities to analyze data and supports complex queries using a MapReduce execution engine.
2) Hive compiles SQL queries into a directed acyclic graph (DAG) of MapReduce jobs that are executed by Hadoop. The metadata is stored in a metastore (typically an RDBMS).
3) Hive supports advanced features like partitioning, bucketing, ACID transactions, and complex types. It can handle petabyte-scale datasets and integrates with the Hadoop ecosystem but has limitations for low-latency queries.
Redshift is Amazon's cloud data warehousing service that allows users to interact with S3 storage and EC2 compute. It uses a columnar data structure and zone maps to optimize analytic queries. Data is distributed across nodes using either an even or keyed approach. Sort keys and queries are optimized using statistics from ANALYZE operations while VACUUM reclaims space. Security, monitoring, and backups are managed natively with Redshift.
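A hedged sketch of what those choices look like in DDL: a distribution key and sort key declared at table creation, issued here through psycopg2 since Redshift speaks the PostgreSQL wire protocol. The cluster address, credentials, and schema are illustrative:

    import psycopg2

    conn = psycopg2.connect(host="my-cluster.example.redshift.amazonaws.com",
                            port=5439, dbname="dev", user="admin", password="...")
    cur = conn.cursor()

    cur.execute("""
        CREATE TABLE sales (
            sale_id   BIGINT,
            customer  BIGINT,
            sold_at   TIMESTAMP,
            amount    DECIMAL(12,2)
        )
        DISTKEY (customer)   -- co-locate each customer's rows for joins
        SORTKEY (sold_at)    -- range scans skip blocks via zone maps
    """)
    cur.execute("ANALYZE sales;")  # refresh planner statistics after loading
    conn.commit()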
Introduction to Sqoop - Aaron Kimball, Cloudera - Hadoop User Group UK (Skills Matter)
In this talk of Hadoop User Group UK meeting, Aaron Kimball from Cloudera introduces Sqoop, the open source SQL-to-Hadoop tool. Sqoop helps users perform efficient imports of data from RDBMS sources to Hadoop's distributed file system, where it can be processed in concert with other data sources. Sqoop also allows users to export Hadoop-generated results back to an RDBMS for use with other data pipelines.
After this session, users will understand how databases and Hadoop fit together, and how to use Sqoop to move data between these systems. The talk will provide suggestions for best practices when integrating Sqoop and Hadoop in your data processing pipelines. We'll also cover some deeper technical details of Sqoop's architecture, and take a look at some upcoming aspects of Sqoop's development roadmap.
AWS July Webinar Series: Amazon Redshift Optimizing Performance (Amazon Web Services)
This document provides an overview and best practices for optimizing performance on Amazon Redshift. It discusses topics like data distribution, sort keys, compression, loading data efficiently, vacuum operations, and query processing. The webinar agenda covers architecture, distribution styles, sort keys, compression, workload management and more. Examples are provided to demonstrate how different techniques can significantly improve query performance. Administrative scripts and views are also recommended as helpful tools.
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther... (DataStax Academy)
The presentation demonstrates how Solr may be used to create real-time analytics applications. In addition, DataStax Enterprise 3.0 will be showcased, which offers Solr version 4.0 with a number of improvements over the previous DSE release. A real-time financial application will run for the audience, followed by a detailed look at how the application was built. An overview of DataStax Enterprise Solr features will be given, showing how the many enhancements in DSE make it unique in the marketplace.
This document discusses connecting Oracle Analytics Cloud (OAC) Essbase data to Microsoft Power BI. It provides an overview of Power BI and OAC, describes various methods for connecting the two including using a REST API and exporting data to Excel or CSV files, and demonstrates some visualization capabilities in Power BI including trends over time. Key lessons learned are that data can be accessed across tools through various connections, analytics concepts are often similar between tools, and while partnerships exist between Microsoft and Oracle, integration between specific products like Power BI and OAC is still limited.
This document provides an overview of Backbone.js, a lightweight JavaScript library that adds structure to client-side code. It discusses that Backbone.js is commonly used to create single-page applications and explains some of its key features and components. Models contain data and logic, views handle presentation, and collections manage sets of models. It also touches on events, listening to events, and Backbone's dependencies on other libraries like Underscore.js.
1. The document proposes a fully distributed, peer-to-peer architecture for web crawling. The goal is to provide an efficient, decentralized system for crawling, indexing, caching and querying web pages.
2. A traditional web crawler recursively visits web pages, extracts URLs, parses pages for keywords, and visits extracted URLs. The proposed system follows this process but with a distributed, peer-to-peer architecture without a central server.
3. Each peer node includes components for crawling, indexing, and storing a local database. Peers communicate through an overlay network to distribute URLs, indexes, and search queries/results across the system in a decentralized manner.
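The crawl loop it describes is the classic worklist algorithm; here is a minimal single-node sketch (seed URL, politeness, and robots.txt handling are simplified away, and the regex link extraction is deliberately crude):

    import re
    import urllib.request
    from collections import deque

    def crawl(seed, limit=10):
        seen, queue = {seed}, deque([seed])
        while queue and len(seen) <= limit:
            url = queue.popleft()
            try:
                html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "replace")
            except OSError:
                continue  # unreachable page: skip it
            print("crawled:", url)
            # Extract absolute links; the peer-to-peer design would hand these
            # URLs to other peers over the overlay network instead of one queue.
            for link in re.findall(r'href="(https?://[^"]+)"', html):
                if link not in seen:
                    seen.add(link)
                    queue.append(link)

    crawl("https://example.com")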
iChresemo Technologies is a professional software training institute. We deliver classroom, online, and corporate training for students. We specialize in Hadoop, PHP, Java, and Oracle Hyperion courses.
The document discusses Google's search engine algorithm updates including Hummingbird, Pigeon, Panda, and Penguin. It provides details on what each update aims to accomplish and implications for businesses. Key points covered include Hummingbird focusing on search query meaning over individual words, Pigeon prioritizing larger websites for local search, Panda favoring high-quality pages with original content, and Penguin penalizing sites with bad link profiles and keyword-stuffed pages. The document advises businesses to focus on useful content, natural links, and avoiding over-optimization in light of the algorithm changes.
Presented at the Boston Code Camp 19.
Demo app, CallButler can be found at http://ushag-backbonejs.site44.com/CallButler and the source code at https://github.com/ushag/call-butler
The document provides an introduction to NOSQL databases. It begins with basic concepts of databases and DBMS. It then discusses SQL and relational databases. The main part of the document defines NOSQL and explains why NOSQL databases were developed as an alternative to relational databases for handling large datasets. It provides examples of popular NOSQL databases like MongoDB, Cassandra, HBase, and CouchDB and describes their key features and use cases.
The document provides an introduction and agenda for an HBase presentation. It begins with an overview of HBase and discusses why relational databases are not scalable for big data through examples of a growing website. It then introduces concepts of HBase including its column-oriented design and architecture. The document concludes with hands-on examples of installing HBase and performing basic operations through the HBase shell.
This document presents an introduction to NoSQL databases. It begins with an overview comparing SQL and NoSQL databases, describing the architecture of NoSQL databases. Examples of different types of NoSQL databases are provided, including key-value stores, column family stores, document databases and graph databases. MapReduce programming is also introduced. Popular NoSQL databases like Cassandra, MongoDB, HBase, and CouchDB are described. The document concludes that NoSQL is well-suited for large, highly distributed data problems.
AWS Certified Cloud Practitioner Course S11-S17 (Neal Davis)
This deck contains the slides from our AWS Certified Cloud Practitioner video course. It covers:
Section 11 Databases and Analytics
Section 12 Management and Governance
Section 13 AWS Cloud Security and Identity
Section 14 Architecting for the Cloud
Section 15 Accounts, Billing and Support
Section 16 Migration, Machine Learning and More
Section 17 Exam Preparation and Tips
Full course can be found here: https://digitalcloud.training/courses/aws-certified-cloud-practitioner-video-course/
The document summarizes a meetup about NoSQL databases hosted by AWS in Sydney in 2012. It includes an agenda with presentations on Introduction to NoSQL and using EMR and DynamoDB. NoSQL is introduced as a class of databases that don't use SQL as the primary query language and are focused on scalability, availability and handling large volumes of data in real-time. Common NoSQL databases mentioned include DynamoDB, BigTable and document databases.
The document discusses the history and concepts of NoSQL databases. It notes that traditional single-processor relational database management systems (RDBMS) struggled to handle the increasing volume, velocity, variability, and agility of data due to various limitations. This led engineers to explore scaled-out solutions using multiple processors and NoSQL databases, which embrace concepts like horizontal scaling, schema flexibility, and high performance on commodity hardware. Popular NoSQL database models include key-value stores, column-oriented databases, document stores, and graph databases.
Databases in the Cloud discusses AWS database services for moving workloads to the cloud. It describes Amazon Relational Database Service (RDS) which provides several fully managed relational database options including MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora. It also discusses non-relational database services like DynamoDB, ElastiCache, and Redshift for analytics workloads. The document provides guidance on choosing between SQL and NoSQL databases and discusses benefits of managed database services over hosting databases on-premises or in EC2 instances.
This document discusses NoSQL databases and when they should be used. It describes what NoSQL databases are, when to consider using one over a relational database, and introduces DynamoDB as an AWS NoSQL solution. Specific topics covered include the differences between relational and NoSQL data models, common use cases for NoSQL databases, and how to access and query DynamoDB tables.
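A hedged boto3 sketch of accessing and querying a DynamoDB table; the region, table name, and key schema are assumptions for illustration, and AWS credentials are assumed to be configured:

    import boto3
    from boto3.dynamodb.conditions import Key

    # Assumes a table "orders" with partition key "customer_id"
    # and sort key "order_date".
    table = boto3.resource("dynamodb", region_name="us-east-1").Table("orders")

    table.put_item(Item={"customer_id": "c-42", "order_date": "2024-06-01", "total": 99})

    # Query by partition key: DynamoDB fetches matching items directly,
    # without the full-table scan a filter expression would imply.
    resp = table.query(KeyConditionExpression=Key("customer_id").eq("c-42"))
    for item in resp["Items"]:
        print(item)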
The document discusses database choices and provides an overview of different types of databases including relational, NoSQL, and Hadoop databases. It compares features of relational databases versus Hadoop/MapReduce and provides demos of various database options like AWS DynamoDB, MongoDB, Neo4j, SQL Server, and AWS Redshift. The document aims to help readers understand the different database choices available and which types may be best suited to different types of data and use cases.
If NoSQL is your answer, you are probably asking the wrong question. (Lukas Smith)
This session is not about bad-mouthing MongoDB, CouchDB, big data, MapReduce or any of the other more recent additions to the database buzzword bingo. Instead it is about looking at how NoSQL is a confusing term and a more realistic assessment of how old and new approaches in databases impact today's architectures...
This document provides an overview of NoSQL databases, including why they are used, common types, and how they work. The key points are:
1) SQL databases do not scale well for large amounts of distributed data, while NoSQL databases are designed for horizontal scaling across servers and partitions.
2) Common types of NoSQL databases include document, key-value, graph, and wide-column stores, each with different data models and query approaches.
3) NoSQL databases sacrifice consistency guarantees and complex queries for horizontal scalability and high availability. Eventual consistency is common, with different consistency models for different use cases.
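The consistency trade-off in point 3 is often made tunable through quorums: with N replicas, writing to W and reading from R guarantees the read set overlaps the write set whenever R + W > N. A toy sketch of that arithmetic:

    N, W, R = 3, 2, 2  # replicas, write quorum, read quorum (R + W > N)

    replicas = [{"value": None, "version": 0} for _ in range(N)]

    def write(value, version):
        # A quorum write waits for only W replicas; the rest catch up later.
        for replica in replicas[:W]:
            replica.update(value=value, version=version)

    def read():
        # Read a different subset of R replicas; because R + W > N it must
        # intersect the write set, so the highest version is the latest value.
        return max(replicas[N - R:], key=lambda rep: rep["version"])["value"]

    write("v1", version=1)
    print(read())  # "v1", even though one replica is still stale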
Nashville Analytics Summit Aug 9: NoSQL (Mike King, Dell, v1.5)
Here are potential solutions for the NoSQL use cases with explanations:
UC1 - Email: MongoDB. Email data is document-oriented with a well-defined schema. MongoDB would allow flexible schema and easy querying of email metadata and contents.
UC2 - Twitter: Cassandra. Twitter data arrives continuously in JSON format and relationships are many-to-many. Cassandra is a good fit as a wide column store for high write throughput and querying across hashtags and users.
UC3 - Householding: HBase. Large volumes of batch data from different sources and formats need to be joined on customer keys with no strict schema. HBase is optimized for random read/write access to structured and semi-structured data at scale.
NoSQL is not a buzzword anymore. The array of non-relational technologies has found wide-scale adoption even in non-Internet scale focus areas. With the advent of the Cloud... the churn has increased even more, yet there is no crystal clear guidance on adoption techniques and architectural choices surrounding the plethora of options available. This session initiates you into the whys & wherefores, architectural patterns, caveats and techniques that will augment your decision making process & boost your perception of architecting scalable, fault-tolerant & distributed solutions.
This document discusses relational database management systems (RDBMS) and NoSQL databases. It notes that while SQL is useful for flat data, it does not scale well for large, unstructured, distributed data. The CAP theorem is discussed, noting that databases must sacrifice availability, consistency, or partition tolerance. Several categories of NoSQL databases are described, including document, graph, columnar, and key-value stores. Factors like scalability, transactions, data modeling, querying and access are compared between SQL and NoSQL options. The performance of different databases is evaluated for read-write workloads. The future of polyglot persistence using multiple database technologies is envisioned.
The document provides an introduction and overview of NoSQL databases. It discusses why NoSQL databases were created, the different categories of NoSQL databases including column stores, document stores, and key-value stores. It also provides an overview of Hadoop, describing it as a framework that allows distributed processing of large datasets across computer clusters.
SQL vs NoSQL database differences explained (Satya Pal)
This document compares SQL and NoSQL databases. It outlines key differences between the two types of databases such as their data structures (tables vs documents/key-value pairs), schemas (strict vs dynamic), scalability (vertical vs horizontal), and query languages (SQL vs unstructured). Examples of popular SQL databases discussed are MySQL, MS-SQL Server, and Oracle. Examples of NoSQL databases discussed are MongoDB, CouchDB, and Redis. The document provides an overview of each example database's features and benefits.
(DAT204) NoSQL? No Worries: Build Scalable Apps on AWS NoSQL Services (Amazon Web Services)
In this session, we discuss the benefits of NoSQL databases and take a tour of the main NoSQL services offered by AWS—Amazon DynamoDB and Amazon ElastiCache. Then, we hear from two leading customers, Expedia and Mapbox, about their use cases and architectural challenges, and how they addressed them using AWS NoSQL services, including design patterns and best practices. You will walk out of this session having a better understanding of NoSQL and its powerful capabilities, ready to tackle your database challenges with confidence.
The document provides an agenda for a two-day training on NoSQL and MongoDB. Day 1 covers an introduction to NoSQL concepts like distributed and decentralized databases, CAP theorem, and different types of NoSQL databases including key-value, column-oriented, and document-oriented databases. It also covers functions and indexing in MongoDB. Day 2 focuses on specific MongoDB topics like aggregation framework, sharding, queries, schema-less design, and indexing.
PASS Chapter Meeting Dec 2013 - Compression: a hidden gem for IO-heavy databases (Charley Hanania)
Compression: a hidden Gem for IO heavy Databases
The limiting factor in most database systems is the ability to read and write data to the IO subsystem.
We're still using storage layouts and methodologies in SQL Server that are a reflection of old spinning media in times gone by.
Until major changes are made to the internal storage layouts, we have "some" hope with options such as data compression, sparse columns and filtered indexes, which not only save space on disk, but also reflect a saving in memory.
In this session we will go over the IO-saving technologies available in SQL Server, and discuss how implementing some of these will assist in your operational performance goals.
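As a hedged illustration of the data compression option discussed, the T-SQL is a one-line table rebuild; here it is issued from Python via pyodbc, with the connection string and table name as placeholders:

    import pyodbc

    conn = pyodbc.connect("DSN=sqlserver;UID=user;PWD=...")  # placeholder DSN
    cur = conn.cursor()

    # Estimate the savings first with the built-in procedure, then rebuild.
    cur.execute("EXEC sp_estimate_data_compression_savings "
                "'dbo', 'SalesHistory', NULL, NULL, 'PAGE';")

    # PAGE compression subsumes ROW compression and usually saves the most IO;
    # the trade-off is extra CPU whenever pages are read or written.
    cur.execute("ALTER TABLE dbo.SalesHistory REBUILD "
                "WITH (DATA_COMPRESSION = PAGE);")
    conn.commit()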
Presenter: Charley Hanania, MVP
Charley is Principal Consultant at QS2 AG in Switzerland and has consulted to organisations of all sizes during his extensive career in Database and Platform Consulting.
He has focused on SQL Server since v4.2 on OS/2, and with over 15 years of experience in IT he has supported companies in the areas of DB training, development, architecture & administration throughout Europe, America & Australasia.
Communities are Charley's passion: he became active in database communities in the mid-90s, participating in heterogeneous database user groups in Australia. He continues to play an active role through community events such as Database Days, the European PASS Conference, PASS & the Swiss PASS Chapter.
Ocean Lotus Threat Actors project by John Sitima 2024 (1).pptx (SitimaJohn)
Ocean Lotus cyber threat actors represent a sophisticated, persistent, and politically motivated group that poses a significant risk to organizations and individuals in the Southeast Asian region. Their continuous evolution and adaptability underscore the need for robust cybersecurity measures and international cooperation to identify and mitigate the threats posed by such advanced persistent threat groups.
Monitoring and Managing Anomaly Detection on OpenShift.pdf (Tosin Akinosho)
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
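The deck's notebooks are not reproduced here; as a minimal flavor of the kind of detector such a pipeline feeds, here is a rolling z-score check over a sensor series (window size and threshold are arbitrary):

    from statistics import mean, stdev

    def zscore_anomalies(series, window=20, threshold=3.0):
        # Flag points more than `threshold` standard deviations away from
        # the mean of the preceding window: a common edge-friendly baseline.
        anomalies = []
        for i in range(window, len(series)):
            history = series[i - window:i]
            mu, sigma = mean(history), stdev(history)
            if sigma and abs(series[i] - mu) / sigma > threshold:
                anomalies.append((i, series[i]))
        return anomalies

    readings = [1.0, 1.1, 0.9] * 10 + [9.0] + [1.0] * 5  # one obvious spike
    print(zscore_anomalies(readings))  # -> [(30, 9.0)]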
Dive into the realm of operating systems (OS) with Pravash Chandra Das, a seasoned Digital Forensic Analyst, as your guide. 🚀 This comprehensive presentation illuminates the core concepts, types, and evolution of OS, essential for understanding modern computing landscapes.
Beginning with the foundational definition, Das clarifies the pivotal role of OS as system software orchestrating hardware resources, software applications, and user interactions. Through succinct descriptions, he delineates the diverse types of OS, from single-user, single-task environments like early MS-DOS iterations, to multi-user, multi-tasking systems exemplified by modern Linux distributions.
Crucial components like the kernel and shell are dissected, highlighting their indispensable functions in resource management and user interface interaction. Das elucidates how the kernel acts as the central nervous system, orchestrating process scheduling, memory allocation, and device management. Meanwhile, the shell serves as the gateway for user commands, bridging the gap between human input and machine execution. 💻
The narrative then shifts to a captivating exploration of prominent desktop OSs, Windows, macOS, and Linux. Windows, with its globally ubiquitous presence and user-friendly interface, emerges as a cornerstone in personal computing history. macOS, lauded for its sleek design and seamless integration with Apple's ecosystem, stands as a beacon of stability and creativity. Linux, an open-source marvel, offers unparalleled flexibility and security, revolutionizing the computing landscape. 🖥️
Moving to the realm of mobile devices, Das unravels the dominance of Android and iOS. Android's open-source ethos fosters a vibrant ecosystem of customization and innovation, while iOS boasts a seamless user experience and robust security infrastructure. Meanwhile, discontinued platforms like Symbian and Palm OS evoke nostalgia for their pioneering roles in the smartphone revolution.
The journey concludes with a reflection on the ever-evolving landscape of OS, underscored by the emergence of real-time operating systems (RTOS) and the persistent quest for innovation and efficiency. As technology continues to shape our world, understanding the foundations and evolution of operating systems remains paramount. Join Pravash Chandra Das on this illuminating journey through the heart of computing. 🌟
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack (shyamraj55)
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Trusted Execution Environment for Decentralized Process Mining (LucaBarbaro3)
Presentation of the paper "Trusted Execution Environment for Decentralized Process Mining" given during the CAiSE 2024 Conference in Cyprus on June 7, 2024.
HCL Notes and Domino license cost reduction in the world of DLAU (panagenda)
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and the licenses under the CCB and CCX model have been a hot topic for many in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new kind of licensing works and what benefit it brings you. Above all, you certainly want to stay within your budget and save costs wherever possible. We understand that, and we want to help you do it!
We explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove superfluous or unused accounts to save money. There are also some approaches that can lead to unnecessary spending, for example when a person document is used instead of a mail-in for shared mailboxes. We show you such cases and their solutions. And of course we explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder bring this new world closer to you. It gives you the tools and the know-how to keep track of everything. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
These topics will be covered:
- Reducing license costs by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Practical examples and best practices to put into action immediately
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf (Chart Kalyan)
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
Letter and Document Automation for Bonterra Impact Management (fka Social Solutions Apricot) (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on automated letter generation for Bonterra Impact Management using Google Workspace or Microsoft 365.
Interested in deploying letter generation automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Unlock the Future of Search with MongoDB Atlas: Vector Search Unleashed.pdf (Malak Abu Hammad)
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Digital Marketing Trends in 2024 | Guide for Staying Ahead (Wask)
https://www.wask.co/ebooks/digital-marketing-trends-in-2024
Feeling lost in the digital marketing whirlwind of 2024? Technology is changing, consumer habits are evolving, and staying ahead of the curve feels like a never-ending pursuit. This e-book is your compass. Dive into actionable insights to handle the complexities of modern marketing. From hyper-personalization to the power of user-generated content, learn how to build long-term relationships with your audience and unlock the secrets to success in the ever-shifting digital landscape.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...alexjohnson7307
Predictive maintenance is a proactive approach that anticipates equipment failures before they happen. At the forefront of this innovative strategy is Artificial Intelligence (AI), which brings unprecedented precision and efficiency. AI in predictive maintenance is transforming industries by reducing downtime, minimizing costs, and enhancing productivity.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
This presentation provides valuable insights into effective cost-saving techniques on AWS. Learn how to optimize your AWS resources by rightsizing, increasing elasticity, picking the right storage class, and choosing the best pricing model. Additionally, discover essential governance mechanisms to ensure continuous cost efficiency. Whether you are new to AWS or an experienced user, this presentation provides clear and practical tips to help you reduce your cloud costs and get the most out of your budget.
2. RELATIONAL
SQL
ACID
Relational algebra
Tables, Columns, Rows
Metadata separate from data
Normalized data
Optimized storage
Optimal for ad-hoc queries
Sharding can be difficult
3. POPULAR RDBMS
MySQL
SQL Server
Oracle
Postgres
DB2
Interbase, Firebird
Informix
Progress
Pervasive
Sybase
Access
…
4. SQL
Unified language to create and query both data and metadata
Similar to English
Verbose(!)
Can get complex for non-trivial queries
Does not expose the execution plan – you say what you want returned, not how to get it
5. SQL EXAMPLES
If you can say what you mean, you can query the existing data
Results are near-instant when querying based on primary key
select * from valute where id=1 and sid=42
Results are fast when querying based on non-unique index
select valuta from valute where (id=1 and sid=42) and (valute.firma_id=123 and valute.firma__sid=1)
Very readable for trivial queries
select r.customer, sum(rs.iznos) sveukupno from racuni r
join racuni_stavke rs on r.id=rs.racun_id
where r.id=5
group by r.customer
6. SQL EXAMPLES
Not so readable for non-trivial queries
select "MP" tip_prometa, mprac.broj broj_racuna, mprac_stavke.kolicina kolicina, (mprac.tecaj*mprac_stavke.kolicina*mprac_stavke.rabat_iznos)
rabat_iznos, (round(mprac_stavke.cijena - mprac_stavke.rabat_iznos - mprac_stavke.rabat2_iznos - mprac_stavke.rabat3_iznos - mprac_stavke.porez1 -
mprac_stavke.porez2 - mprac_stavke.porez_potrosnja,6)*mprac_stavke.kolicina) iznos, (mprac_stavke.kolicina* ifnull((select
sum(pn_cijena*kolicina)/sum(kolicina) from mprac_skl left join skl_stavke on mprac_skl.skl_id=skl_stavke.skl_id and
mprac_skl.skl__sid=skl_stavke.skl__sid where mprac_skl.mprac_id=mprac.id and mprac_skl.mprac__sid=mprac.sid and
skl_stavke.artikl_id=mprac_stavke.artikl_id and skl_stavke.artikl__sid=mprac_stavke.artikl__sid ),0) ) iznos_nabavno, ifnull( (select
sum(mprac_stavke.kolicina*ambalaze.naknada_kom) from artikli_ambalaze left join ambalaze on ambalaze.id=artikli_ambalaze.ambalaza_id and
ambalaze.sid=artikli_ambalaze.ambalaza__sid where artikli_ambalaze.artikl_id=artikli.id and artikli_ambalaze.artikl__sid=artikli.sid and
ambalaze.kalkulacija="N" ),0) naknada, radnici_komercijalisti.ime racun_komercijalist_ime, (select naziv from skladista where skladista.tip_skladista="M"
and pj_id=mprac.pj_id limit 1) skladiste_naziv , pj.naziv pj_naziv, mprac.datum,
cast(concat("(",if(DayOfWeek(mprac.datum)=1,7,DayOfWeek(mprac.datum)-1),") ", if(DayOfWeek(mprac.datum)=1,"1 Nedjelja",
if(DayOfWeek(mprac.datum)=2,"2 Ponedjeljak", if(DayOfWeek(mprac.datum)=3,"3 Utorak", if(DayOfWeek(mprac.datum)=4,"4 Srijeda",
if(DayOfWeek(mprac.datum)=5,"5 Èetvratk", if(DayOfWeek(mprac.datum)=6,"6 Petak", if(DayOfWeek(mprac.datum)=7,"7 Subota","")))))))) as char(15))
dan_u_tjednu, cast(month(mprac.datum) as unsigned) mjesec, cast(week(mprac.datum) as unsigned) tjedan, cast(quarter(mprac.datum) as unsigned) kvartal,
cast(year(mprac.datum) as unsigned) godina, cast(if(tipovi_komitenata.tip="F",trim(concat(partneri.ime," ",partneri.prezime)),partneri.naziv) as char(200))
kupac_naziv, partneri_mjesta.postanski_broj kupac_mjesto, partneri_mjesta.mjesto kupac_mjesto_naziv, partneri_grupe_mjesta.naziv …
7. RDBMS SCALING
Vertical scaling
• Better CPU, more CPUs
• More RAM
• More disks
• SAN
Partitioning
Sharding
8. PARTITIONING
With many rows and heavy usage, partitioning is a must
What to partition
• Tables
• Indexes
• Views
Typical cases
• Monthly data
• Alphabetical keys
9. RDBMS SHARDING
Sharding means using several databases, each holding part of the data (500 clients on one server, another 500 on another)
Requires changing application code (see the sketch below)
connect(calculate_server_from(sharding_key))
It is impossible to join data from different databases, so choose your sharding key wisely
Very difficult to repartition your databases based on a new key
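A minimal Python sketch of what the connect(calculate_server_from(...)) call above might look like; the shard list, key format, and hash-modulo scheme are illustrative assumptions, not any particular library's API.

import hashlib

# Hypothetical shard map: one connection string per database server.
SHARDS = ["db1.example.com", "db2.example.com", "db3.example.com"]

def calculate_server_from(sharding_key):
    # Stable hash of the key, taken modulo the number of shards.
    digest = hashlib.md5(sharding_key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

def connect(server):
    # Stand-in for a real driver call, e.g. MySQLdb.connect(host=server).
    print("connecting to", server)

connect(calculate_server_from("client:1042"))

The modulo also makes the last point concrete: change the number of shards (or the key), and almost every key suddenly maps to a different server.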
10. RDBMS METADATA
Metadata: data describing other data
RDBMS structures are explicitly defined, and each data type is optimized for storage
Lots of constraints
Can get slow with a lot of data
11. NOSQL
“Not SQL”, “Not only SQL”
Core NoSQL databases were invented mostly because RDBMSs made life very hard for huge, heavy-traffic web databases
NoSQL databases are those that differ significantly from relational databases
12. NOSQL TYPES
Wide Column Store / Column Families
Document Store
Key Value / Tuple Store
Graph Databases
Object Databases
XML Databases
Multivalue Databases
14. KEY/VALUE STORES
Lineage: Amazon's Dynamo paper and Distributed HashTables.
Data model: A global collection of key-value pairs.
Example: Voldemort, Dynomite, Tokyo Cabinet
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
15. BIGTABLE CLONES
Lineage: Google's BigTable paper.
Data model: Column family, i.e. a tabular model where each row, at least in theory, can have an individual configuration of columns.
Example: HBase, Hypertable, Cassandra
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
16. DOCUMENT DATABASES
Lineage: Inspired by Lotus Notes.
Data model: Collections of documents, where each document is a key-value collection.
Example: CouchDB, MongoDB, Riak
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
17. GRAPH DATABASES
Lineage: Draws from Euler and graph theory.
Data model: Nodes & relationships, both which can hold key-value
pairs
Example: AllegroGraph, InfoGrid, Neo4j
Source: http://blogs.neotechnology.com/emil/2009/11/nosql-scaling-to-size-and-scaling-to-complexity.html
19. NOSQL CHARACTERISTICS
Almost infinite horizontal scaling
Very fast
Performance doesn’t deteriorate with growth (much)
No fixed table schemas
No join operations
Ad-hoc queries difficult or impossible
Structured storage
Almost everything happens in RAM
20. REAL-WORLD USE
Cassandra
• Facebook (original developer, used it till late 2010)
• Twitter
• Digg
• Reddit
• Rackspace
• Cisco
BigTable
• Google (open-source version is HBase)
MongoDB
• Foursquare
• Craigslist
• Bit.ly
• SourceForge
• GitHub
21. WHY NOSQL?
Handles huge databases (I know, I said it before)
Redundancy, data is pretty safe on commodity hardware
Super flexible queries using map/reduce
Rapid development (no fixed schema, yeah!)
Very fast for common use cases
22. PERFORMANCE
RDBMSs use buffering and write-ahead logging to ensure ACID properties
NoSQL does not guarantee ACID and is therefore much faster
We don’t need ACID everywhere!
I used MySQL and switched to MongoDB for my analytics app
• Data processing (every minute) is 4x faster with MongoDB, despite being a lot more detailed (thanks to much simpler development)
23. SCALING
Simple web application with not much traffic
• Application server, database server all on one machine
25. SCALING
Even more traffic comes in
• Load balancer
• Application server x2
• Database server
26. SCALING
Even more traffic comes in
• Load balancer x N
• easy
• Application server x N
• easy
• Database server x N
• hard for SQL databases
27. SQL SLOWDOWN
Not linear!
http://www.slideshare.net/rightscale/scaling-sql-and-nosql-databases-in-the-cloud
28. NOSQL SCALING
Need more storage?
• Add more servers!
Need higher performance?
• Add more servers!
Need better reliability?
• Add more servers!
29. SCALING SUMMARY
You can scale SQL databases (Oracle, MySQL, SQL Server…)
• This will cost you dearly
• If you don’t have a lot of money, you will reach limits quickly
You can scale NoSQL databases
• Very easy horizontal scaling
• Lots of open-source solutions
• Scaling was one of the basic design goals, so it is handled well
• Scaling forces trade-offs – that is why you end up using map/reduce
30. RAM
Why map/reduce? I just need some simple queries today – but tomorrow I will need some other queries…
SQL databases are optimized for very efficient disk access, but significant scaling requires RAM caching (MySQL + memcached)
NoSQL databases are designed to keep the whole working set in RAM
31. WORKING SET
In real-world use, the working set is much smaller than the complete database
• For analytics, 99% of queries will concern the last 30 days
As you need RAM only for the working set, you can use commodity servers or VPSes, and just add more as your app becomes more popular
32. WORKING SET WOES
Foursquare has millions of users and a working set the same size as the database
They used a single 66GB Amazon EC2 High-Memory Quadruple Extra Large instance (with cheese) for millions of users
When their RAM usage reached 65GB, they decided to shard
Too late – they were already swapping to disk
Disk is much slower than RAM - 100x slowdown
Server could not keep up due to swapping
11 hours outage (ouch!)
33. MAP/REDUCE
Google’s framework for processing highly distributable problems across huge datasets using a large number of computers
Let’s define “a large number of computers”
• Cluster, if they all have the same hardware
• Grid otherwise (if !Cluster, for old-style programmers)
34. MAP/REDUCE
Process split into two phases
• Map
• Take the input, partition it, and delegate the parts to other machines
• Other machines can repeat the process, leading to a tree structure
• Each machine returns its results to the machine that gave it the task
• Reduce
• Collect the results from the machines you gave tasks to
• Combine the results and return them to the requester
• Slower than sequential data processing, but massively parallel
• Sort a petabyte of data in a few hours
• Input, Map, Shuffle, Reduce, Output
35. MAP/REDUCE EXAMPLE
Count the different words in a set of documents
You need to write just two functions – map and reduce (a sketch follows below)
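A minimal, self-contained Python sketch of those two functions; a real framework would run map on many machines and shuffle the intermediate pairs between the phases, which is simulated here with a plain dictionary.

from collections import defaultdict

def map_fn(document):
    # Emit a (word, 1) pair for every word in the document.
    for word in document.split():
        yield (word.lower(), 1)

def reduce_fn(word, counts):
    # Sum all partial counts emitted for one word.
    return (word, sum(counts))

documents = ["the quick brown fox", "the lazy dog", "the fox"]

# Shuffle phase: group the emitted pairs by key (the word).
grouped = defaultdict(list)
for doc in documents:
    for word, count in map_fn(doc):
        grouped[word].append(count)

# Reduce phase: one call per distinct word.
result = dict(reduce_fn(w, c) for w, c in grouped.items())
print(result)  # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}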
39. MONGODB
Data is stored as BSON (binary JSON)
• Makes it very well suited for languages with native JSON support
Map/Reduce written in JavaScript
• Slow! There is a single thread of execution in JavaScript
Master/slave replication (auto failover with replica sets)
Sharding built-in
Uses memory mapped files for data storage
Performance over features
On 32-bit systems, limited to ~2.5 GB
An empty database takes up 192 MB
GridFS to store big data + metadata (not actually an FS)
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
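As a concrete illustration of the points above, a minimal sketch using the official pymongo driver; the connection string, database, and collection names are assumptions for the example.

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumes a local server
db = client.analytics            # databases and collections appear lazily
# Documents are stored as BSON; no schema has to be declared up front,
# so the two documents below can have completely different fields.
db.events.insert_one({"user": "ana", "action": "login", "tags": ["web"]})
db.events.insert_one({"user": "ivo", "action": "buy", "amount": 42.0})
for event in db.events.find({"user": "ana"}):
    print(event)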
40. CASSANDRA
Written in: Java
Protocol: Custom, binary (Thrift)
Tunable trade-offs for distribution and replication (N, R, W)
Querying by column, range of keys
BigTable-like features: columns, column families
Writes are much faster than reads (!)
• Constant write time regardless of database size
Map/reduce possible with Apache Hadoop
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
41. HBASE
Written in: Java
Main point: Billions of rows X millions of columns
Modeled after BigTable
Map/reduce with Hadoop
Query predicate push down via server side scan and get filters
Optimizations for real time queries
A high performance Thrift gateway
HTTP supports XML, Protobuf, and binary
Cascading, Hive, and Pig source and sink modules
No single point of failure
While Hadoop streams data efficiently, it has overhead for starting map/reduce jobs; HBase is a column-oriented key/value store and allows for low-latency reads and writes
Random access performance is like MySQL
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
42. REDIS
Written in: C/C++
Main point: Blazing fast
Disk-backed in-memory database
Master-slave replication
Simple values or hash tables by keys
Has sets (also union/diff/inter)
Has lists (also a queue; blocking pop)
Has hashes (objects of multiple fields)
Sorted sets (high score table, good for range queries)
Has transactions (!)
Values can be set to expire (as in a cache)
Pub/Sub lets one implement messaging (!)
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
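A short sketch of those data types using the redis-py client, against an assumed local server; key names are illustrative.

import redis

r = redis.Redis(host="localhost", port=6379)    # assumes a local server
r.set("greeting", "hello", ex=60)               # simple value, expires in 60 s
r.sadd("tags", "fast", "in-memory")             # set
r.lpush("jobs", "job-1")                        # list used as a queue...
print(r.rpop("jobs"))                           # ...popped from the other end
r.zadd("highscores", {"ana": 120, "ivo": 95})   # sorted set
print(r.zrangebyscore("highscores", 100, 200))  # range query by score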
43. COUCHDB
Written in: Erlang
Main point: DB consistency, ease of use
Bi-directional (!) replication, continuous or ad-hoc, with conflict detection, thus, master-master replication. (!)
MVCC - write operations do not block reads
Previous versions of documents are available
Crash-only (reliable) design
Needs compacting from time to time
Views: embedded map/reduce
Formatting views: lists & shows
Server-side document validation possible
Authentication possible
Real-time updates via _changes (!)
Attachment handling
CouchApps (standalone JS apps)
Source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
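Everything in CouchDB goes over HTTP with JSON bodies, so a generic HTTP client is enough; a minimal sketch using Python's requests against an assumed local, unauthenticated server on the default port.

import requests

base = "http://localhost:5984"          # assumes a local CouchDB
requests.put(base + "/demo")            # create a database
# Create a document; CouchDB assigns the _id and the _rev used by MVCC.
resp = requests.post(base + "/demo", json={"type": "note", "text": "hello"})
doc_id = resp.json()["id"]
print(requests.get(base + "/demo/" + doc_id).json())
print(requests.get(base + "/demo/_changes").json())  # real-time updates feed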
44. HADOOP
Apache project
A framework that allows for the distributed processing of large data sets across clusters of computers
Designed to scale up from single servers to thousands of machines
Designed to detect and handle failures at the application layer, instead of relying on hardware for it
45. HADOOP
Created by Doug Cutting, who named it after his son's toy elephant
Hadoop subprojects
• Cassandra
• HBase
• Pig
Hive was a Hadoop subproject, but is now a top-level Apache project
Used by many large & famous organizations
• http://wiki.apache.org/hadoop/PoweredBy
Scales to hundreds or thousands of computers, each with several processor cores
Designed to efficiently distribute large amounts of work across a set of machines
Hundreds of gigabytes of data constitute the low end of Hadoop-scale
Built to process "web-scale" data on the order of hundreds of gigabytes to terabytes or petabytes
47. HADOOP
Uses distributed file system (HDFS)
• Designed to hold very large amounts of data (terabytes or even petabytes)
• Files are stored redundantly across multiple machines to ensure durability against failure and high availability to highly parallel applications
• Data organized into directories and files
• Files are divided into blocks (64 MB by default) and distributed across nodes
Design of HDFS is based on the design of the Google File System
48. HIVE
A petabyte-scale data warehouse system for Hadoop
Easy data summarization, ad-hoc queries
Query the data using a SQL-like language called HiveQL
Hive compiler generates map-reduce jobs for most queries
49. PIG
Platform for analyzing large data sets
High-level language for expressing data analysis programs
Compiler produces sequences of Map-Reduce programs
Textual language called Pig Latin
• Ease of programming
• System optimizes task execution automatically
• Users can create their own functions
50. PIG LATIN
Pig Latin – high level Map/Reduce programming
Plays the role for Map/Reduce that SQL plays for RDBMS systems
Pig Latin can be extended using Java User Defined Functions
“Word Count” script in Pig Latin
53. SUMMARY
NoSQL is a great problem solver if you need it
Choose your NoSQL platform carefully, as each is designed for a specific purpose
Get used to Map/Reduce
It’s not a sin to use NoSQL alongside a (yes)SQL database
I am really happy to work with MongoDB instead of MySQL