Webinar - Making Sense of NoSQL: Applying Non-Relational Databases to Business Needs
The term "NoSQL" covers a wide range of database technologies that offer alternatives to the traditional RDBMS. NoSQL databases take innovative approaches to the unique problems of handling data in modern distributed and web-based systems. But how do you choose the right database for your specific business need?

Published in: Technology
  • Most of you are probably familiar with the table layout. A table is defined with a set of columns, and each record in the table conforms to the schema. If you wish to capture different data in the future, the table schema must be changed using the ALTER TABLE statement. Typically, data is normalized to third normal form to reduce duplication, and large tables are split into smaller tables linked by foreign keys.
  • The data is modeled for the application code and not for the database.
  • JSON support – documents are stored natively as JSON, so when you build an app there is no conversion required. New document viewing and editing capability. Indexing and querying – look inside your JSON, build views, and query for a key, for ranges, or to aggregate data. Incremental MapReduce – powers indexing. Build complex views over your data. Great for real-time analytics. XDCR – replicate information from one cluster to another cluster.
  • Transcript of "Webinar - Making Sense of NoSQL: Applying Non-Relational Databases to Business Needs"

    1. Making sense of NoSQL. Ann Kelly, Dan McCreary, Dipti Borkar. April 2013
    2. Presenters
       • Ann Kelly, Kelly-McCreary & Associates
       • Dan McCreary, Kelly-McCreary & Associates
       • Dipti Borkar, Couchbase
       Copyright Kelly-McCreary & Associates, LLC
    3. Agenda
       • What is NoSQL?
       • What Triggered the NoSQL Movement?
       • Database Architecture Patterns
       • Common Characteristics of NoSQL Systems
       • Business Benefits of NoSQL
       • Core NoSQL Concepts
       • Selected NoSQL Implementations
       • Recent NoSQL Developments
       • Selecting the Right NoSQL System
       • Next Step: Selecting the Right NoSQL Pilot Project
       • Quick introduction to Document databases & Couchbase
    4. Pressures on Single CPU SQL
       [Diagram: a single-CPU RDBMS under pressure from volume, velocity, variability, and agility.]
    5. Three Eras of Databases
       • 1985-1995: RDBMS
       • 1995-2010: RDBMS, Data Warehouse
       • 2010-Now: RDBMS, Data Warehouse, NoSQL
       • RDBMS for transactions, Data Warehouse for analytics and NoSQL for scalability
    6. Advancements in Distributed Databases
    7. NoSQL on Google Trends
       • Interest over time. The number 100 represents the peak search volume.
       • http://www.google.com/trends/explore#q=NoSQL
    8. 2009: the NoSQL "Revolt"
       “NoSQLers came to share how they had overthrown the tyranny of slow, expensive relational databases in favor of more efficient and cheaper ways of managing data.” Computerworld magazine, July 1st, 2009
    9. Common Themes
       • Horizontal scalability
       • Clever use of hashing and caching
       • Parallel execution of queries
         – move queries to the data, not the other way around
       • Share resources when possible
         – Example: memcached protocol
       • Use simple interfaces when possible
         – put, get, delete
    10. Selecting a Database…
       "Selecting the right data storage solution is no longer a trivial task."
       [Flowchart: Start. Does it look like document? Yes: Use Microsoft Office. No: Use the RDBMS. Stop.]
    11. Six Types of Databases
       • Relational
       • Analytical (OLAP)
       • Key-Value
       • Column-Family
       • Graph
       • Document
    12. Relational
       • Data is usually stored in a row-by-row manner (row store)
       • Standardized query language (SQL)
       • Data model defined before you add data
       • Joins merge data from multiple tables
       • Results are tables
       • Pros: mature ACID transactions with fine-grained security controls
       • Cons: requires up-front data modeling, does not scale well
    13. Analytical (OLAP)
       • Based on "Star" schema with central fact table for each event
       • Optimized for read-only analysis of historical data
       • Use of MDX language to count query "measures" for "categories" of data
       • Pros: fast queries for large data
       • Cons: not optimized for transactions and updates
    14. Key-Value Stores
       [Diagram: a two-column table of key/value pairs.]
       • Keys used to access opaque blobs of data
       • Values can contain any type of data (images, video)
       • Pros: scalable, simple API (put, get, delete)
       • Cons: no way to query based on the content of the value
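The put/get/delete interface on this slide can be sketched as follows. This is a minimal in-memory illustration, not any particular product's API; the class name `KeyValueStore` and the key naming scheme are assumptions made for the example.

```python
class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The value is an opaque blob: the store never looks inside it.
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key):
        # Deleting a missing key is a no-op.
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:1:avatar", b"\x89PNG...")   # any bytes will do
avatar = store.get("user:1:avatar")
store.delete("user:1:avatar")
missing = store.get("user:1:avatar")
```

Note that there is no call for "find all values containing X": that is the "Cons" bullet on the slide.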
    15. Column-Family
       • Key includes a row, column family and column name
       • Store versioned blobs in one large table
       • Queries can be done on rows, column families and column names
       • Pros: Good scale out
       • Cons: Cannot query blob content, row and column designs are critical
       • Examples: HBase, Cassandra
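A rough sketch of the column-family idea, assuming a composite key of (row, column family, column name) and versioned values. The class and method names are hypothetical and do not mirror the HBase or Cassandra APIs.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy column-family table keyed by (row, family, column); illustration only."""

    def __init__(self):
        # (row, family, column) -> list of (timestamp, value), oldest first
        self._cells = defaultdict(list)

    def put(self, row, family, column, value, timestamp):
        versions = self._cells[(row, family, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0])  # keep versions ordered by timestamp

    def get(self, row, family, column):
        # Return the newest version, or None if the cell does not exist.
        versions = self._cells.get((row, family, column))
        return versions[-1][1] if versions else None

    def scan_row(self, row):
        # Queries work on key parts (row, family, column), never on blob content.
        return {(f, c): v[-1][1] for (r, f, c), v in self._cells.items() if r == row}


table = ColumnFamilyTable()
table.put("user1", "profile", "name", "Dipti", timestamp=1)
table.put("user1", "profile", "name", "Dipti B.", timestamp=2)
table.put("user1", "status", "text", "At conf", timestamp=1)
latest_name = table.get("user1", "profile", "name")
row = table.scan_row("user1")
```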
    16. Graph Store
       • Data is stored in a series of nodes and properties
       • Queries are really graph traversals
       • Ideal when relationships between data are key:
         – e.g. social networks
       • Pros: fast network search, works with public linked data sets
       • Cons: Poor scalability when graphs don't fit into RAM, specialized query language
       • Examples: Neo4j, AllegroGraph
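"Queries are really graph traversals" can be illustrated with a breadth-first walk over an adjacency list. The tiny `follows` graph and the function name `reachable_within` are made up for this sketch; real graph stores expose their own query languages.

```python
from collections import deque

# Hypothetical social graph: who follows whom.
follows = {"ann": ["dan"], "dan": ["dipti"], "dipti": [], "joe": ["ann"]}


def reachable_within(graph, start, max_hops):
    """Return nodes reachable from start in at most max_hops edges (BFS order)."""
    seen = {start}
    frontier = deque([(start, 0)])
    found = []
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                frontier.append((neighbor, hops + 1))
    return found


two_hops = reachable_within(follows, "joe", 2)
three_hops = reachable_within(follows, "joe", 3)
```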
    17. Document Store
       • Data stored in nested hierarchies
       • Logical data remains stored together as a unit
       • Any item in the document can be queried
       • Pros: No object-relational mapping layer, ideal for search
       • Cons: Complex to implement, incompatible with SQL
       • Examples: MongoDB, Couchbase
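To show what "any item in the document can be queried" means, here is a small sketch that filters nested documents by a dotted path. The helpers `get_path` and `find` and the sample documents are assumptions for illustration, not the query syntax of MongoDB or Couchbase.

```python
def get_path(document, path):
    """Follow a dotted path such as 'status.text'; return None if any step is missing."""
    current = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def find(documents, path, expected):
    """Return the documents whose value at path equals expected."""
    return [doc for doc in documents if get_path(doc, path) == expected]


users = [
    {"id": 1, "first": "Dipti", "status": {"text": "At Conf"}},
    {"id": 2, "first": "Joe"},  # no status at all: documents need not share a shape
]
matches = find(users, "status.text", "At Conf")
```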
    18. Business Solutions
       • Big Data – horizontal scalability
       • Search – full-text search
       • High availability – fault tolerance
       • Agility – quickly adapt to change
       • Enterprise Class
         – Security
         – Monitoring
    19. Shared Nothing Architecture
       [Diagram: three panels. Shared RAM: CPUs share RAM over a bus. Shared Disk: CPUs with their own RAM share disk over a SAN. Shared Nothing: CPUs with their own RAM and disk connected by a LAN.]
       Shared nothing systems have proven to be most cost-effective and flexible
    20. Distribution Models
       [Diagram: Master-Slave, where requests go to a master node with a standby master used only if the primary master fails; Peer-to-Peer, where requests go to any node.]
       Peer to peer models do not have standby nodes that are idle
    21. Move Queries to the Nodes
       [Diagram: a query fanned out to a grid of nodes, each pairing a MapReduce step with its local database.]
       Queries work best if they run on the local node that has the data
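The idea on this slide can be sketched as a map step that runs against each node's local data, followed by a reduce step that merges the small partial results. The three-node layout and the function names are hypothetical; a real system would run `map_on_node` on the node itself rather than in one process.

```python
from collections import Counter
from functools import reduce

# Hypothetical cluster: each inner list is the data held locally by one node.
nodes = [
    [{"country": "USA"}, {"country": "UK"}],
    [{"country": "USA"}],
    [{"country": "Spain"}, {"country": "USA"}],
]


def map_on_node(local_docs):
    """Runs next to the data: count documents per country on one node."""
    return Counter(doc["country"] for doc in local_docs)


def reduce_counts(left, right):
    """Merge two partial counts; only these small summaries cross the network."""
    return left + right


partials = [map_on_node(docs) for docs in nodes]
totals = reduce(reduce_counts, partials, Counter())
```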
    22. Structured Search
       [Diagram: "Flat Ocean" compared with "Retained Structure"; label: synonym.]
       • Retain document structure to allow keyword matches in "title" to rank higher than a keyword match in text body
    23. Incremental MapReduce
       [Diagram: pre-calculated aggregate values (count(), sum(), avg(), min(), max()) over prior items are held in HBase; only the new item is read, and the aggregate values are updated in HBase.]
       • Unlike standard MapReduce, Incremental MapReduce only updates aggregates that need to be updated.
       • This is an example of how pre-built values are updated with only deltas
       • Very useful to save time when calculating aggregates of large data collections
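A minimal sketch of updating pre-built aggregates with only the delta: each new item adjusts the stored count, sum, min and max without re-reading prior items. The class name `RunningAggregate` is an assumption for illustration and is not how HBase or Couchbase implement this internally.

```python
class RunningAggregate:
    """Keeps count/sum/min/max so a new item costs O(1) instead of a full rescan."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.minimum = None
        self.maximum = None

    def add(self, value):
        # Only the new item is read; prior items are never revisited.
        self.count += 1
        self.total += value
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)

    @property
    def average(self):
        # avg() is derived from the stored sum and count.
        return self.total / self.count if self.count else None


agg = RunningAggregate()
for prior_item in (10, 20, 30):
    agg.add(prior_item)
agg.add(40)  # the "new item" from the slide
```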
    24. Is Shredding Really Necessary?
       • Every time you take hierarchical data and put it into a traditional database you have to put repeating groups in separate tables and use SQL “joins” to reassemble the data
       Copyright 2008 Dan McCreary & Associates
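The shredding round trip can be sketched with Python's built-in `sqlite3` module. The `order` document and the table and column names are invented for the example: the repeating group (`items`) goes into its own table, and a join is needed to rebuild the original hierarchy.

```python
import sqlite3

# Hypothetical hierarchical record with a repeating group.
order = {"id": 1, "customer": "Ann",
         "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
db.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")

# Shred: the parent goes to one table, the repeating group to another.
db.execute("INSERT INTO orders VALUES (?, ?)", (order["id"], order["customer"]))
db.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(order["id"], item["sku"], item["qty"]) for item in order["items"]],
)

# Reassemble: a join brings the pieces back together.
rows = db.execute(
    "SELECT o.id, o.customer, i.sku, i.qty FROM orders o "
    "JOIN order_items i ON i.order_id = o.id WHERE o.id = ? ORDER BY i.rowid",
    (1,),
).fetchall()

rebuilt = {
    "id": rows[0][0],
    "customer": rows[0][1],
    "items": [{"sku": sku, "qty": qty} for _, _, sku, qty in rows],
}
```

A document store would keep `order` as a single unit and skip both steps.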
    25. Object Relational Mapping
       [Diagram: Web Browser, Object Middle Tier and Relational Database, connected by transformations T1 to T4.]
       • T1 – HTML into Objects
       • T2 – Objects into SQL Tables
       • T3 – Tables into Objects
       • T4 – Objects into HTML
    26. "The Vietnam of Applications"
       • Object-relational mapping has become one of the most complex components of building applications today
       • A "Quagmire" where many projects get lost
       • Many "heroic efforts" have been made to solve the problem:
         – Hibernate
         – Ruby on Rails
       • But sometimes the way to avoid complexity is to keep your architecture very simple
    27. Perspectives
       [Diagram labels: Object Stores, OLAP/MDX, Document Stores, Graph Stores, NoSQL for Web 2.0 and BigData.]
       Perspective depends on your context
    28. Selection Checklist
       • Horizontal Scalability
       • High Availability
       • Search
       • No object-relational mapping
       • Security
       • Monitoring
    29. [No slide text.]
    30. Architectural Tradeoffs
       "I want a fast car with good mileage."
       "I want a scalable database with low cost that runs well on the 1,000 CPUs in our data center."
    31. Introduction to Document Databases and Couchbase. Dipti Borkar, Director, Product Management
    32. NoSQL Document Database
    33. Couchbase Server - Core Capabilities
       • Easy Scalability: Grow cluster without application changes, without downtime with a single click
       • Consistent High Performance: Consistent sub-millisecond read and write response times with consistent high throughput
       • Always On 24x365: No downtime for software upgrades, hardware maintenance, etc.
       • Flexible Data Model: JSON document model with no fixed schema.
    34. Relational vs Document data model
       • Relational data model: Highly-structured table organization with rigidly-defined data formats and record structure.
       • Document data model: Collection of complex documents with arbitrary, nested data formats and varying “record” format.
    35. Making a Change Using RDBMS
       User Table (User ID, First, Last, Zip, Country ID)
         1, Dipti, Borkar, 94040, 001
         2, Joe, Smith, 94040, 001
         3, Ali, Dodson, 94040, 001
         4, Sarah, Gorin, NW1, 002
         5, Bob, Young, 30303, 001
         6, Nancy, Baker, 10010, 001
         7, Ray, Jones, 31311, 001
         8, Lee, Chen, V5V3M, 008
         ...
         50000, Doug, Moore, 04252, 001
         50001, Mary, White, SW195, 002
         50002, Lisa, Clark, 12425, 001
       Photo Table (User ID, Photo ID, Comment, Country ID)
         2, d043, NYC, 001
         2, b054, Bday, 007
         5, c036, Miami, 001
         7, d072, Sunset, 133
         5002, e086, Spain, 133
       Status Table (User ID, Status ID, Text, Country ID)
         1, a42, At conf, 134
         4, b26, excited, 007
         5, c32, hockey, 008
         12, d83, Go A’s, 001
         5000, e34, sailing, 005
       Affiliations Table (User ID, Affl ID, Affl Name, Country ID)
         2, a42, Cal, 001
         4, b96, USC, 001
         7, c14, UW, 001
         8, e22, Oxford, 002
       Country Table (Country ID, Country name)
         001 USA, 002 UK, 003 Argentina, 004 Australia, 005 Aruba, 006 Austria, 007 Brazil, 008 Canada, 009 Chile, ..., 130 Portugal, 131 Romania, 132 Russia, 133 Spain, 134 Sweden
    36. Making the Same Change with a Document Database
       { “ID”: 1, “FIRST”: “Dipti”, “LAST”: “Borkar”, “ZIP”: “94040”, “CITY”: “MV”, “STATE”: “CA”, “STATUS”: { “TEXT”: “At Conf”, “GEO_LOC”: “134” }, “COUNTRY”: “USA” }
       Just add information to a document
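"Just add information to a document" can be sketched with the standard `json` module. The starting document is a trimmed version of the user record on the slide, and the assumption that `GEO_LOC` belongs under `STATUS` is a guess at the slide's intent; no schema change or extra table is involved.

```python
import json

# Trimmed version of the slide's user document, as it might be stored.
stored = '{"ID": 1, "FIRST": "Dipti", "LAST": "Borkar", "ZIP": "94040", "STATUS": {"TEXT": "At Conf"}}'

user = json.loads(stored)

# Add new information directly; other documents are unaffected.
user["STATUS"]["GEO_LOC"] = "134"   # assumed placement of the new field
user["COUNTRY"] = "USA"

serialized = json.dumps(user)       # ready to write back under the same key
```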
    37. Couchbase Server 2.0 Architecture
       [Diagram, recoverable labels:]
       • Ports: 8092 Query API; 11211 Memcapable 1.0; 11210 Memcapable 2.0; 8091 HTTP; 4369 Erlang port mapper; 21100 - 21199 Distributed Erlang
       • Data Manager: Moxi, Query Engine, Memcached, Couchbase EP Engine, storage interface, New Persistence Layer
       • Cluster Manager (Erlang/OTP): REST management API/Web UI, vBucket state and replication manager, Global singleton supervisor, Rebalance orchestrator, Configuration manager, Node health monitor, Process monitor, Heartbeat
       • Annotations: http; on each node; one per cluster
    38. Couchbase: “The basics”
    39. Basic Operation
       [Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, send READ/WRITE/UPDATE requests to a Couchbase Server cluster of Servers 1 to 3; each server holds active docs and replica docs. User Configured Replica Count = 1.]
       • Docs distributed evenly across servers
       • Each server stores both active and replica docs
         – Only one server active at a time
       • Client library provides app with simple interface to database
       • Cluster map provides map to which server doc is on
         – App never needs to know
       • App reads, writes, updates docs
       • Multiple app servers can access same document at same time
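The role of the cluster map can be sketched as a two-step lookup: hash the document key to a partition, then look up which server owns that partition. This is an illustrative model only; the partition count, the use of CRC32, the server names and the round-robin assignment are assumptions, not Couchbase's actual implementation.

```python
import zlib

NUM_PARTITIONS = 8                              # hypothetical; kept small for readability
SERVERS = ["server1", "server2", "server3"]     # hypothetical server names

# Cluster map: partition number -> server holding the active copy.
cluster_map = {p: SERVERS[p % len(SERVERS)] for p in range(NUM_PARTITIONS)}


def partition_for(key):
    """Hash the document key to a partition; every client computes the same answer."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def server_for(key):
    """The client library resolves the server, so the app never needs to know."""
    return cluster_map[partition_for(key)]


target = server_for("doc-5")
```

Because every client library holds the same map and uses the same hash, multiple app servers reach the same server for the same document without any central lookup.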
    40. Add Nodes to Cluster
       [Diagram: the same two app servers now send READ/WRITE/UPDATE requests to Servers 1 to 5, each holding active and replica docs. User Configured Replica Count = 1.]
       • Two servers added with one-click operation
       • Docs automatically rebalance across cluster
         – Even distribution of docs
         – Minimum doc movement
       • Cluster map updated
       • App database calls now distributed over larger number of servers
    41. Fail Over Node
       [Diagram: Servers 1 to 5 with active and replica docs; Server 3 has failed. User Configured Replica Count = 1.]
       • App servers accessing docs
       • Requests to Server 3 fail
       • Cluster detects server failed
         – Promotes replicas of docs to active
         – Updates cluster map
       • Requests for docs now go to appropriate server
       • Typically rebalance would follow
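The failover steps on this slide (promote replicas, update the cluster map) can be sketched as below, assuming one replica per partition as in the slide's "Replica Count = 1". The maps, server names and the `fail_over` function are hypothetical and only model the idea.

```python
def fail_over(active_map, replica_map, failed_server):
    """Return a new active map with the failed server's partitions served by replicas."""
    new_active = dict(active_map)
    for partition, server in active_map.items():
        if server == failed_server:
            # Promote the replica copy to active.
            new_active[partition] = replica_map[partition]
    return new_active


# Hypothetical layout: partition -> server with the active / replica copy.
active = {0: "server1", 1: "server2", 2: "server3", 3: "server1"}
replica = {0: "server2", 1: "server3", 2: "server1", 3: "server3"}

after = fail_over(active, replica, "server3")
```

In this layout, partitions 1 and 3 are left without a replica after Server 3 fails, which is why a rebalance would typically follow.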
    42. New in 2.0
       • JSON support
       • Indexing and Querying
       • Incremental Map Reduce
       • Cross data center replication
    43. Cluster wide - XDCR
       [Diagram: two Couchbase Server clusters, one in the NY data center and one in the SF data center, each with Servers 1 to 3 holding active docs in RAM and on disk.]
    44. Couchbase Server Admin Console
    45. Use cases
    46. Data driven use cases
       • Support for unlimited data growth
       • Data with non-homogenous structure
       • Need to quickly and often change data structure
       • 3rd party or user defined structure
       • Variable length documents
       • Sparse data records
       • Hierarchical data
    47. Performance driven use cases
       • Low latency matters
       • High throughput matters
       • Large number of users
       • Unknown demand with sudden growth of users/data
       • Predominantly direct document access
       • Workloads with very high mutation rate per document
    48. Common Use Cases
       • Social Gaming: Couchbase stores player and game data. Example customers include: Zynga, Tapjoy, Ubisoft, Tencent
       • Ad Targeting: Couchbase stores user information for fast access. Example customers include: AOL, Mediamind, Convertro
       • Session store: Couchbase Server as a key-value store. Example customers include: Concur, Sabre
       • Mobile Apps: Couchbase stores user info and app content. Example customers include: Kobo, Playtika
       • User Profile Store: Couchbase Server as a key-value store. Example customers include: Tunewiki
       • Content & Metadata Store: Couchbase document store with Elastic Search. Example customers include: Tunewiki, McGraw Hill
       • High availability cache: Couchbase Server used as a cache tier replacement. Example customers include: Orbitz
       • 3rd party data aggregation: Couchbase stores social media and data feeds. Example customers include: Sambacloud
    49. Recommended Reading
       • Making Sense of NoSQL: A guide for managers and the rest of us
       • Manning Publications
       • Focus on objective architectural analysis
       • Available now in Manning Early Access Program (MEAP) e-book (PDF)
       • In print June 2013
       • http://manning.com/mccreary
    50. Contacts
       • Dan McCreary & Ann Kelly, Kelly-McCreary & Associates, www.danmccreary.com
       • Dipti Borkar, @dborkar, dipti@couchbase.com, www.couchbase.com