NoSQL Night!
Singapore Spring@Pivotal User Group
Clarence J M Tauro
Sr. Instructor
Couchbase
About the Speaker
• Clarence J M Tauro – clarence@couchbase.com
– Senior Instructor, Couchbase
– ~11 Years Professional Teaching and Consulting Experience
– Worked at Pivotal – Instructor/Consultant for Spring/Spring
Security/Spring Web/Enterprise Integration with Spring/Spring
JMS/Spring Batch, Pivotal Hadoop/Cloud Foundry
– PhD in Computer Science from Christ University [thesis
accepted]
– Hard-core Dog lover
Disclaimer
• Disclaimer: The views expressed in this presentation
are our own and do not necessarily reflect the views of
Couchbase
Objectives
• Introduction to NoSQL
• Are ACID Properties always desirable?
• Basically available, Soft state, Eventually consistent
(BASE)
• The CAP Theorem
• Introducing Couchbase
• Couchbase Operations
Introduction
RDBMS - predominant technology
for storing structured data in
web and business applications
“one size fits all” - thinking
concerning data-stores has been
questioned
Apply NoSQL databases for the
persistence layer/Polyglot
Programming
ACID Properties
• ATOMICITY
• CONSISTENCY
• ISOLATION
• DURABILITY
Are ACID Properties always desirable?
• … But what about:
– Latency
– Partition Tolerance
– High Availability
– Scalability
BASE
• Basically available: the system is available, but not necessarily all items in it at any given point in time
• Soft state: information (state) the user put into the system that will go away if the user doesn't maintain it
• Eventually consistent: after a certain time all nodes are consistent, but at any given time this might not be the case
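Eventual consistency can be illustrated with a minimal sketch. It assumes a hypothetical two-replica store; the Replica class and the sync function are made up for illustration and do not belong to any real database. A write lands on one replica first, so a read from the other may be stale until replication catches up.

```python
class Replica:
    """A hypothetical in-memory replica (illustrative only)."""

    def __init__(self):
        self.data = {}
        self.pending = []  # writes not yet shipped to the peer

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))

    def read(self, key):
        return self.data.get(key)


def sync(source, target):
    """Ship pending writes from source to target (the 'eventually' part)."""
    for key, value in source.pending:
        target.data[key] = value
    source.pending.clear()


a, b = Replica(), Replica()
a.write("user::1234", "Clarence")
stale = b.read("user::1234")  # b has not seen the write yet
sync(a, b)
fresh = b.read("user::1234")  # both replicas now agree
```

Between the write and the sync the two replicas disagree; after the sync they converge, which is the behaviour the "eventually consistent" bullet describes.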
NoSQL Common Traits
• Non-relational
• Schema-free/Schema-on-read
• Eventual consistency
• Open source
• Distributed
• “web-scale”
The CAP Theorem
• Consistency – can all
nodes see identical data,
at all times?
• Availability – can all
nodes be read from and
written to, at all times?
• Partition Tolerance – will
nodes function normally,
even when the cluster
breaks?
[Diagram: triangle with corners Consistency, Availability and Partition Tolerance; CHOOSE ANY TWO]
The CAP Theorem
• CP: Consistency and Partition Tolerance
- Immediately consistent data across a horizontally scaled
cluster, even with network problems
- Couchbase
• AP: Availability and Partition Tolerance
- Always services requests, across multiple data centers,
even with network problems, data eventually consistent
- Apache HBase or Cassandra, Couchbase (XDCR)
• CA: Consistency and Availability
- Always services requests with immediately consistent
data, in a vertically scaled system
- MySQL, Oracle, Microsoft SQL Server
What do you do with the Data?
Operational Use
• Real-time intelligence
• Focus on data flows and processes
• Extremely fast (in-memory) reads
• Extremely fast (log append) writes
• Improve the current outcome
Analytical Use
• Batched workloads
• Vast data aggregations
• Retrospective analyses
• Focus on data pools
• Improve future outcomes
Hadoop vs. NoSQL
NoSQL: Operational Velocity
Real-time operational database systems improve current outcomes
Hadoop: Analytical Volume
Batch-oriented analytical database systems improve future outcomes
Types of NoSQL
• Key-value stores
• Wide Column stores
• Document stores
• Graph databases
Key-Value Stores
• The most common; not necessarily the most popular
• Key and a simple value
- Speed
- Scale
- Simplicity
• Find simple values by key extremely fast
user::1234  Clarence
user::1235  Melisa
user::1236  Michael
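A minimal sketch of the key-value idea, using a plain Python dict as a stand-in for the store (an assumption made for illustration; a real key-value store adds networking, persistence and distribution on top of this lookup).

```python
# A plain dict stands in for the key-value store (illustrative assumption).
store = {
    "user::1234": "Clarence",
    "user::1235": "Melisa",
    "user::1236": "Michael",
}


def get(key):
    """Look up a simple value by key; returns None when the key is absent."""
    return store.get(key)


name = get("user::1235")
```

The store knows nothing about the value's structure: the only access path is the key, which is what keeps lookups fast and simple.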
Document Stores
• Key and a structured value (document)
- Speed
- Scale
- Flexibility
• Read/write ever-changing data about people, places,
and things, at cloud-scale
user::1234 { name: 'Frank', age: 37, kids: ['Sue', 'Ann', 'Bob'] }
user::1235 { name: 'Carolyn', age: 56, kids: ['Tina'] }
user::1236 { name: 'Tessa', age: 24}
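The same three users can be sketched as documents, again with a Python dict standing in for the store. The kids_of helper and the added "city" field are hypothetical and exist only to show that each document may carry different fields.

```python
# Each key maps to a structured value (a document); fields may differ per document.
documents = {
    "user::1234": {"name": "Frank", "age": 37, "kids": ["Sue", "Ann", "Bob"]},
    "user::1235": {"name": "Carolyn", "age": 56, "kids": ["Tina"]},
    "user::1236": {"name": "Tessa", "age": 24},
}


def kids_of(key):
    """Return the kids listed in a document; a missing field means no kids."""
    return documents[key].get("kids", [])


# No schema change is needed to add a field to one document only.
documents["user::1236"]["city"] = "Singapore"  # hypothetical extra field
```

Unlike the key-value case, the store (or the application) can look inside the value, which is where the "Flexibility" bullet comes from.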
Wide Column Stores
• Key and nested set of tuples
- Write vast volumes of data, with eventually consistent
read access
user::1234
name: text Frank
age: number 37
kid: text
Sue
Ann
Bob
user::1235
name: text Carolyn
age: number 56
kid: text Tina
Graph Databases
• Linked list of keyed objects
- Relationships
• Monitor complex, dynamically networked connections
[Diagram: graph of linked nodes]
user::1234 (Frank, 37) linked to Sue, Ann, Bob
user::1235 (Carolyn, 56) linked to Tina
user::1236 (Tessa, 24)
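A minimal sketch of following relationships in a graph, assuming an adjacency-list layout (the edges dict and the connected function are illustrative names, not the API of any graph database).

```python
# Adjacency list: each node maps to the nodes it links to (illustrative layout).
edges = {
    "user::1234": ["Sue", "Ann", "Bob"],
    "user::1235": ["Tina"],
    "user::1236": [],
}


def connected(start):
    """Return every node reachable from start, in breadth-first order."""
    seen = []
    queue = list(edges.get(start, []))
    while queue:
        node = queue.pop(0)
        if node not in seen:
            seen.append(node)
            queue.extend(edges.get(node, []))
    return seen
```

Graph databases are built to make this kind of traversal the primary operation, rather than something assembled from repeated key lookups.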
Polyglot Programming
• An enterprise will have a variety of different data storage
technologies for different kinds of data
• We need to ask how we want to manipulate the data.
This will help us figure out which persistence
technologies are appropriate
- User Sessions: Couchbase (Memcached)/Redis
- Financial Data: RDBMS
- Shopping Cart: Riak/Couchbase (Memcached)
- Recommendation Systems: Neo4J
- Product Catalog: Couchbase/MongoDB
- Reporting: RDBMS/Couchbase Views
- Analytics: Couchbase/Cassandra
History of Couchbase
• Membase: NorthScale developed a key-value storage engine
• CouchOne: Apache CouchDB database project
• Membase and CouchOne joined forces in February 2011 to create
Couchbase, the first and only provider of a comprehensive,
end-to-end family of NoSQL database products
What is Couchbase Server?
• Couchbase Server
• Is a “document” database solution
• Has key/value based orientation
• Is geared for JSON
• Has no tables and no fixed schema
• Runs on a networked cluster of nodes
• Is highly scalable
• Has lightning-fast reads and writes
• Has caching and persistence layers
• Fails over automatically
• Couchbase Server is best suited for fast-changing data
items of relatively small size
JavaScript Object Notation
{
     "firstName": "Clarence",
     "lastName": "Tauro",
     "age": 25,
     "address":
     {
         "streetAddress": "21 2nd Street",
         "city": "Bangalore",
         "state": "KA",
         "postalCode": "560059"
     },
     "phoneNumber":
     [
         {
           "type": "home",
           "number": "988 621-7674"
         }
     ]
}
JSON is a lightweight data-interchange
format that is easy for humans to read and
write
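As a small sketch of how an application might consume this JSON, the example below parses the slide's document with Python's standard json module and reads nested fields. The variable names are illustrative.

```python
import json

# The JSON text from the slide, held as a Python string.
text = """
{
    "firstName": "Clarence",
    "lastName": "Tauro",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "Bangalore",
        "state": "KA",
        "postalCode": "560059"
    },
    "phoneNumber": [
        {"type": "home", "number": "988 621-7674"}
    ]
}
"""

person = json.loads(text)                         # JSON object -> dict
city = person["address"]["city"]                  # nested object access
phone_type = person["phoneNumber"][0]["type"]     # JSON array -> list
round_trip = json.loads(json.dumps(person))       # serialize and parse again
```

Objects become dicts and arrays become lists, so nested data is reachable with ordinary indexing, and serializing then parsing again yields an equal structure.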
What is a Couchbase Document?
Document Content (most recent in RAM and persisted to disk)
{
  "visibility": "PRIVATE",
  "name": "Eclectic Summer Mix",
  "userName": "suzyqrocks",
  "type": "org.couchmusic.domain.Playlist",
  "created": 1422138028037,
  "updated": 1422138028072,
  "tracks": []
}
Document Metadata (all keys unique and kept in RAM)
{
  "id": "playlist:12345",
  "rev": "1-0004ebc0000000000",
  "flags": 0,
  "expiration": 0,
  "type": "json"
}
Couchbase Server Architecture
• Technology Stack for Data Manager:
- Couchbase Client SDK (“Smart Client”)
- Client Query API and Query Engine (Views)
- Cache Layer: RAM Cache
- Persistence Layer: Couchbase
Couchbase Server Architecture
• Technology Stack for Cluster Manager:
- Node Level – multiple vBuckets
• Default: 1024 vBuckets per bucket, divided among the nodes
- Cluster Level – multiple nodes (with 1 .. * buckets) [1]
- Datacenter Level – multiple clusters (optional XDCR) [2]
- Erlang (cluster management and process supervision) [3]
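How a key ends up on a node can be sketched as two steps: hash the key to one of the 1024 vBuckets, then look up which node owns that vBucket. The sketch below is a simplification under stated assumptions: it uses a plain CRC32 modulo, whereas real Couchbase clients apply their own CRC32-based hashing and receive the vBucket map from the cluster. The names vbucket_for and build_map and the round-robin layout are hypothetical.

```python
import zlib

NUM_VBUCKETS = 1024  # default number of vBuckets per bucket


def vbucket_for(key):
    """Hash a key to a vBucket id (simplified CRC32-based sketch)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_map(nodes):
    """Assign vBuckets to nodes round-robin (hypothetical layout)."""
    return [nodes[i % len(nodes)] for i in range(NUM_VBUCKETS)]


nodes = ["node-a", "node-b", "node-c"]
vb_map = build_map(nodes)
owner = vb_map[vbucket_for("user::1234")]  # node responsible for this key
```

Because the key-to-vBucket step never changes, adding or removing a node only requires moving vBuckets and updating the map; keys do not need to be rehashed.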
Anatomy of a Couchbase Application
[Diagram: Couchbase Client Software, holding a {Server List} and a Cluster Map, connected to three nodes that each run NS Server and EP Engine]
1. REST request (port 8091)
2. HTTP response
   Becomes a Smart Client
4. Connect to CRUD data port 11210
5. Create, Read, Update and Delete Documents
Single Node – Couchbase Write Operation
[Diagram: Couchbase Server Node with App Server, Managed Cache, Disk Queue, Disk and Replication Queue (to other node); Doc 1 is written to the Managed Cache and placed in both queues]
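The write path can be sketched as follows, based on the components named on the slide. This is a toy model under stated assumptions, not Couchbase's implementation: the Node class and its method names are hypothetical, and queues are drained by explicit calls instead of background threads.

```python
from collections import deque


class Node:
    """Toy model of a single server node (illustrative only)."""

    def __init__(self):
        self.managed_cache = {}
        self.disk_queue = deque()
        self.replication_queue = deque()
        self.disk = {}

    def set(self, key, doc):
        """Store in RAM, queue for disk and replication, then acknowledge."""
        self.managed_cache[key] = doc
        self.replication_queue.append((key, doc))
        self.disk_queue.append((key, doc))
        return "written"  # acknowledged before the document reaches disk

    def flush_disk_queue(self):
        """Persist everything waiting in the disk queue."""
        while self.disk_queue:
            key, doc = self.disk_queue.popleft()
            self.disk[key] = doc

    def drain_replication(self, other):
        """Copy queued documents into another node's memory."""
        while self.replication_queue:
            key, doc = self.replication_queue.popleft()
            other.managed_cache[key] = doc


node, replica = Node(), Node()
ack = node.set("doc1", {"title": "Doc 1"})
persisted_at_ack = "doc1" in node.disk  # still only in RAM and the queues
node.flush_disk_queue()
node.drain_replication(replica)
```

The point of the design is that the application gets its acknowledgement at memory speed, while persistence and replication happen asynchronously afterwards.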
Single Node – Couchbase Update Operation
[Diagram: Couchbase Server Node with App Server, Managed Cache, Disk Queue, Disk and Replication Queue (to other node); Doc 1’ replaces Doc 1]
Single Node – Couchbase Read Operation
[Diagram: Couchbase Server Node with App Server, Managed Cache, Disk Queue, Disk and Replication Queue (to other node); the App Server issues GET Doc1 and receives Doc 1]
Single Node – Couchbase Cache Eviction
[Diagram: Couchbase Server Node with App Server, Managed Cache, Disk Queue, Disk and Replication Queue (to other node); Docs 1 to 6 shown in the Managed Cache and on Disk]
Single Node – Couchbase Cache Miss
[Diagram: Couchbase Server Node with App Server, Managed Cache, Disk Queue, Disk and Replication Queue (to other node); the App Server issues GET Doc1, which is not in the Managed Cache and is read from Disk]
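Cache eviction and cache miss can be sketched together with a bounded in-memory cache in front of a disk store. Two assumptions are made loudly here: least-recently-used eviction is only a stand-in and is not claimed to be Couchbase's actual ejection policy, and writes go straight to disk to keep the sketch short. The CachedNode class is hypothetical.

```python
from collections import OrderedDict


class CachedNode:
    """Toy node: bounded RAM cache in front of a disk store (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # least recently used entry first
        self.disk = {}

    def set(self, key, doc):
        self.disk[key] = doc  # persisted immediately, for simplicity
        self._cache_put(key, doc)

    def get(self, key):
        """Return (doc, 'hit') from RAM or (doc, 'miss') after a disk read."""
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key], "hit"
        doc = self.disk[key]       # cache miss: fetch from disk
        self._cache_put(key, doc)  # and bring it back into RAM
        return doc, "miss"

    def _cache_put(self, key, doc):
        self.cache[key] = doc
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the oldest entry


node = CachedNode(capacity=5)
for i in range(1, 7):
    node.set(f"doc{i}", f"Doc {i}")
evicted = "doc1" not in node.cache  # six documents, room for five
first = node.get("doc1")            # served from disk
second = node.get("doc1")           # now served from RAM
```

Eviction only removes the document from RAM; it remains on disk, so a later read still succeeds, just more slowly the first time.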
Other Features of Couchbase 4.0
• Multi-dimensional Scaling
• N1QL
• XDCR
Training
Get Started with Couchbase Server 4.0:
www.couchbase.com/beta
Get Trained on Couchbase: http://training.couchbase.com
CD220: Developing Couchbase NoSQL Applications
Oct 20 – Oct 23 2015
CS300: Couchbase NoSQL Server Administration
Nov 17 – Nov 20 2015
Enroll Today!
Questions?


Editor's Notes

  • #27 1. Most modern operating systems want a few gigabytes (Windows usually a bit more than Linux), and there may be other processes running on these nodes, such as monitoring agents. I/O caching is also needed, both for views and for the general functioning of the system. We typically recommend allocating about 60-80% of a system’s RAM to Couchbase’s quota, leaving the rest for headroom and memory needs outside of Couchbase itself. 2. Cross Datacenter Replication (XDCR) is covered later in this course. 3. See https://blog.couchbase.com/tag/erlang
  • #28 The Memcached client also uses a server list but, in contrast to the Couchbase client, it makes no REST calls; it works only over port 11210 and is very fast. It uses a proprietary Memcached protocol.
  • #29 1. A set request comes in from the application. 2. Couchbase Server responds that the key is written. 3. Couchbase Server then replicates the data out to memory on the other nodes. 4. At the same time, it puts the data into a write queue to be persisted to disk.