NoSQL: Presentation Transcript

  • NOSQL, NO? Introductory presentation
  • RELATIONAL SQL  ACID Relational algebra  Optimal for ad-hoc queries Tables, Columns, Rows  Sharding can be difficult Metadata separate from data Normalized data Optimized storage
  • POPULAR RDBMS MySQL  Informix SQL Server  Progress Oracle  Pervasive Postgres  Sybase DB2  Access Interbase, Firebird …
  • SQL Unified language to create and query both data and metadata Similar to English Verbose(!) Can get complex for non-trivial queries Does not expose execution plan – you say what you want it toreturn, not how
  • SQL EXAMPLES If you can say what you mean, you can query the existing data Results are near-instant when querying based on primary keyselect * from valute where id=1 and sid=42 Results are fast when querying based on non-unique indexselect valuta from valute where ((id=1 and sid=42)) and (valute.firma_id=123 andvalute.firma__sid=1) Very readable for trivial queriesselect r.customer,sum(rs.iznos) sveukupno from racuni rjoin racuni_stavke rs on by rs.ordinal
  • SQL EXAMPLES Not so readable for non-trivial queriesselect "MP" tip_prometa, mprac.broj broj_racuna, mprac_stavke.kolicina kolicina, (mprac.tecaj*mprac_stavke.kolicina*mprac_stavke.rabat_iznos)rabat_iznos, (round(mprac_stavke.cijena - mprac_stavke.rabat_iznos - mprac_stavke.rabat2_iznos - mprac_stavke.rabat3_iznos - mprac_stavke.porez1 -mprac_stavke.porez2 - mprac_stavke.porez_potrosnja,6)*mprac_stavke.kolicina) iznos, (mprac_stavke.kolicina* ifnull((selectsum(pn_cijena*kolicina)/sum(kolicina) from mprac_skl left join skl_stavke on mprac_skl.skl_id=skl_stavke.skl_id andmprac_skl.skl__sid=skl_stavke.skl__sid where and mprac_skl.mprac__sid=mprac.sid andskl_stavke.artikl_id=mprac_stavke.artikl_id and skl_stavke.artikl__sid=mprac_stavke.artikl__sid ),0) ) iznos_nabavno, ifnull( (selectsum(mprac_stavke.kolicina*ambalaze.naknada_kom) from artikli_ambalaze left join ambalaze on andambalaze.sid=artikli_ambalaze.ambalaza__sid where and artikli_ambalaze.artikl__sid=artikli.sid andambalaze.kalkulacija="N" ),0) naknada, radnici_komercijalisti.ime racun_komercijalist_ime, (select naziv from skladista where skladista.tip_skladista="M"and pj_id=mprac.pj_id limit 1) skladiste_naziv , pj.naziv pj_naziv, mprac.datum,cast(concat("(",if(DayOfWeek(mprac.datum)=1,7,DayOfWeek(mprac.datum)-1),") ", if(DayOfWeek(mprac.datum)=1,"1 Nedjelja",if(DayOfWeek(mprac.datum)=2,"2 Ponedjeljak", if(DayOfWeek(mprac.datum)=3,"3 Utorak", if(DayOfWeek(mprac.datum)=4,"4 Srijeda",if(DayOfWeek(mprac.datum)=5,"5 Èetvratk", if(DayOfWeek(mprac.datum)=6,"6 Petak", if(DayOfWeek(mprac.datum)=7,"7 Subota","")))))))) as char(15))dan_u_tjednu, cast(month(mprac.datum) as unsigned) mjesec, cast(week(mprac.datum) as unsigned) tjedan, cast(quarter(mprac.datum) as unsigned) kvartal,cast(year(mprac.datum) as unsigned) godina, cast(if(tipovi_komitenata.tip="F",trim(concat(partneri.ime," ",partneri.prezime)),partneri.naziv) as char(200))kupac_naziv, partneri_mjesta.postanski_broj kupac_mjesto, partneri_mjesta.mjesto 
kupac_mjesto_naziv, partneri_grupe_mjesta.naziv …
  • RDBMS SCALING Vertical scaling • Better CPU, more CPUs • More RAM • More disks • SAN Partitioning Sharding
  • PARTITIONING With many rows and heavy usage, partitioning is a must What to partition • Tables • Indexes • Views Typical cases • Monthly data • Alphabetical keys
  • RDBMS SHARDING Sharding means using several databases where each represents partof data (500 clients on one server, another 500 on another) Requires changing application code connect(calculate_server_from(sharding_key)) Impossible to join data from different databases, so choose yoursharding key wisely Very difficult to repartition your databases based on a new key
  • RDBMS METADATA Metadata: data describing other data RDBMS structures are explicitly defined, and each data type isoptimized for storage Lots of constraints Can get slow with lot of data
  • NOSQL “Not SQL”, “Not only SQL” Core NoSQL databases invented mostly because RDBMS madelife very hard for huge and heavy traffic web databases NoSQL databases are the ones significantly different fromrelational databases
  • NOSQL TYPES Wide Column Store / Column Families Document Store Key Value / Tuple Store Graph Databases Object Databases XML Databases Multivalue Databases
  • 4 MAIN DATA MODELS: Key-Value Stores; BigTable Clones (aka "ColumnFamily"); Document Databases; Graph Databases
  • KEY/VALUE STORES Lineage: Amazons Dynamo paper and Distributed HashTables. Data model: A global collection of key-value pairs. Example: Voldemort, Dynomite, Tokyo CabinetSource:
  • BIGTABLE CLONES Lineage: Googles BigTable paper. Data model: Column family, i.e. a tabular model where each row atleast in theory can have an individual configuration of columns. Example: HBase, Hypertable, CassandraSource:
  • DOCUMENT DATABASES Lineage: Inspired by Lotus Notes. Data model: Collections of documents, which contain key-valuecollections (called "documents"). Example: CouchDB, MongoDB, RiakSource:
  • GRAPH DATABASES Lineage: Draws from Euler and graph theory. Data model: Nodes & relationships, both which can hold key-valuepairs Example: AllegroGraph, InfoGrid, Neo4jSource:
  • POPULAR NOSQL Hadoop / Hbase  MemcacheDB Cassandra  Voldemort Amazon SimpleDB  Hypertable MongoDB  Cloudata CouchDB  IBM Lotus/Domino Redis
  • NOSQL CHARACTERISTICS: Almost infinite horizontal scaling; Very fast; Performance doesn’t deteriorate (much) with growth; No fixed table schemas; No join operations; Ad-hoc queries difficult or impossible; Structured storage; Almost everything happens in RAM
  • REAL-WORLD USE Cassandra • Facebook (original developer, used it till late 2010) • Twitter • Digg • Reddit • Rackspace • Cisco BigTable • Google (open-source version is HBase) MongoDB • Foursquare • Craigslist • • SourceForge • GitHub
  • WHY NOSQL? Handles huge databases (I know, I said it before) Redundancy, data is pretty safe on commodity hardware Super flexible queries using map/reduce Rapid development (no fixed schema, yeah!) Very fast for common use cases
  • PERFORMANCE RDBMS uses buffer to ensure ACID properties NoSQL does not guarantee ACID and is therefore much faster We don’t need ACID everywhere! I used MySQL and switched to MongDB for my analytics app • Data processing (every minute) is 4x faster with MongoDB, despite being a lot more detailed (due to much simple development)
  • SCALING Simple web application with not much traffic • Application server, database server all on one machine
  • SCALING More traffic comes in • Application server • Database server
  • SCALING Even more traffic comes in • Load balancer • Application server x2 • Database server
  • SCALING Even more traffic comes in • Load balancer x N • easy • Application server x N • easy • Database server xN • hard for SQL databases
  • SQL SLOWDOWN: Not linear!
  • NOSQL SCALING Need more storage? • Add more servers! Need higher performance? • Add more servers! Need better reliability? • Add more servers!
  • SCALING SUMMARY You can scale SQL databases (Oracle, MySQL, SQL Server…) • This will cost you dearly • If you don’t have a lot of money, you will reach limits quickly You can scale NoSQL databases • Very easy horizontal scaling • Lots of open-source solutions • Scaling is one of the basic incentives for design, so it is well handled • Scaling is the cause of trade-offs causing you to have to use map/reduce
  • RAM Why map/reduce? I just need some simple queries. Tomorrow Iwill need some other queries…. SQL databases are optimized for very efficient disk access, but forsignificant scaling need RAM caching (MySQL+memcached) NoSQL databases are designed to keep whole working set in RAM
  • WORKING SET In real-world use working set is much less than complete database • For analytics 99% of queries will be regarding last 30 days As you need RAM only for working set, you can use commodityservers, VPS, and just add more as your app becomes more popular
  • WORKING SET WOES: Foursquare has millions of users and a working set the same size as the database; They used a single 66 GB Amazon EC2 High-Memory Quadruple Extra Large Instance (with cheese) for millions of users; When their RAM usage reached 65 GB, they decided to shard; Too late: they started to have disk swaps; Disk is much slower than RAM (100x slowdown); The server could not keep up due to swapping; 11-hour outage (ouch!)
  • MAP/REDUCE Google’s framework for processing highly distributableproblems across huge datasets using a large number ofcomputers Let’s define large number of computers • Cluster if all of them have same hardware • Grid unless Cluster (if !Cluster for old-style programmers)
  • MAP/REDUCE Process split into two phases • Map • Take the input, partition it delegate to other machines • Other machines can repeat the process, leading to tree structure • Each machine returns results to the machine who gave it the task • Reduce • collect results from machines you gave the tasks • combine results and return it to requester • Slower than sequential data processing, but massively parallel • Sort petabyte of data in a few hours • Input, Map, Shuffle, Reduce, Output
  • MAP/REDUCE EXAMPLE You need to write two functions Count different words in a set of documents
  • MONGODB Document store Basic support for dynamic (ad hoc) queries Query by example (nice!)
  • MONGODB Conditional Operators • <, <=, >, >= • $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $and, $size, $type  Regular expressions
  • MONGODB Data is stored as BSON (binary JSON) • Makes it very well suited for languages with native JSON support Map/Reduce written in Javascript • Slow! There is one single thread of execution in Javascript Master/slave replication (auto failover with replica sets) Sharding built-in Uses memory mapped files for data storage Performance over features On 32bit systems, limited to ~2.5Gb An empty database takes up 192Mb GridFS to store big data + metadata (not actually an FS)Source:
  • CASSANDRA Written in: Java Protocol: Custom, binary (Thrift) Tunable trade-offs for distribution and replication (N, R, W) Querying by column, range of keys BigTable-like features: columns, column families Writes are much faster than reads (!) • Constant write time regardless of database size Map/reduce possible with Apache HadoopSource:
  • HBASE Written in: Java Main point: Billions of rows X millions of columns Modeled after BigTable Map/reduce with Hadoop Query predicate push down via server side scan and get filters Optimizations for real time queries A high performance Thrift gateway HTTP supports XML, Protobuf, and binary Cascading, hive, and pig source and sink modules No single point of failure While Hadoop streams data efficiently, it has overhead for starting map/reduce jobs. HBase is column oriented key/value store andallows for low latency read and writes. Random access performance is like MySQLSource:
  • REDIS Written in: C/C++ Main point: Blazing fast Disk-backed in-memory database, Master-slave replication Simple values or hash tables by keys, Has sets (also union/diff/inter) Has lists (also a queue; blocking pop) Has hashes (objects of multiple fields) Sorted sets (high score table, good for range queries) Has transactions (!) Values can be set to expire (as in a cache) Pub/Sub lets one implement messaging (!)Source:
  • COUCHDB Written in: Erlang Main point: DB consistency, ease of use Bi-directional (!) replication, continuous or ad-hoc, with conflict detection, thus, master-master replication. (!) MVCC - write operations do not block reads Previous versions of documents are available Crash-only (reliable) design Needs compacting from time to time Views: embedded map/reduce Formatting views: lists & shows Server-side document validation possible Authentication possible Real-time updates via _changes (!) Attachment handling CouchApps (standalone JS apps)Source:
  • HADOOP Apache project A framework that allows for the distributed processing of largedata sets across clusters of computers Designed to scale up from single servers to thousands of machines Designed to detect and handle failures at the application layer,instead of relying on hardware for it
  • HADOOP Created by Doug Cutting, who named it after his sons toy elephant Hadoop subprojects • Cassandra • HBase • Pig Hive was a Hadoop subproject, but is now a top-level Apache project Used by many large & famous organizations • Scales to hundreds or thousands of computers, each with several processor cores Designed to efficiently distribute large amounts of work across a set of machines Hundreds of gigabytes of data constitute the low end of Hadoop-scale Built to process "web-scale" data on the order of hundreds of gigabytes to terabytes or petabytes
  • HADOOP See Uses Java, but allows streaming so other languages can easily sendand accept data items to/from Hadoop
  • HADOOP Uses distributed file system (HDFS) • Designed to hold very large amounts of data (terabytes or even petabytes) • Files are stored in a redundant fashion across multiple machines to ensure their durability to failure and high availability to very parallel applications • Data organized into directories and files • Files are divided into block (64MB by default) and distributed across nodes Design of HDFS is based on the design of the Google File System
  • HIVE A petabyte-scale data warehouse system for Hadoop Easy data summarization, ad-hoc queries Query the data using a SQL-like language called HiveQL Hive compiler generates map-reduce jobs for most queries
  • PIG Platform for analyzing large data sets High-level language for expressing data analysis programs Compiler produces sequences of Map-Reduce programs Textual language called Pig Latin • Ease of programming • System optimizes task execution automatically • Users can create their own functions
  • PIG LATIN Pig Latin – high level Map/Reduce programming Equivalent to SQL for RDBMS systems. Pig Latin can be extended using Java User Defined Functions “Word Count” script in Pig Latin
  • SUMMARY NoSQL is a great problem solver if you need it Choose your NoSQL platform carefully as each is designed forspecific purpose Get used to Map/Reduce It’s not a sin to use NoSQL alongside (yes)SQL database I am really happy to work with MongoDB  instead of MySQL