Navigating the Transition from Relational to NoSQL Technology

While the hype surrounding NoSQL (non-relational) database technology has become deafening, there is real substance beneath the often exaggerated claims. NoSQL database technologies have emerged as a better match for the needs of modern interactive applications, and they offer cost-effective data management. Developers accustomed to relational database technology need to approach things differently.

To view Couchbase webinars on-demand visit http://www.couchbase.com/webinars

Transcript

  • 1. Navigating the Transition from Relational to NoSQL Technology
    Dipti Borkar, Senior Product Manager
  • 2. WHY TRANSITION TO NOSQL?
  • 3. Survey: Two big drivers for NoSQL adoption
    What is the biggest data management problem driving your use of NoSQL in the coming year?
      • Lack of flexibility/rigid schemas: 49%
      • Inability to scale out data: 35%
      • High latency/low performance: 29%
      • Costs: 16%
      • All of these: 12%
      • Other: 11%
    Source: Couchbase NoSQL Survey, December 2011, n=1351
  • 4. Are you being impacted by these?
    Schema rigidity problems
      • Do you store serialized objects in the database?
      • Do you have lots of sparse tables with very few columns being used by most rows?
      • Do you find that your application developers require schema changes frequently due to constantly changing data?
      • Are you using your database as a key-value store?
    Scalability problems
      • Do you periodically need to upgrade systems to more powerful servers and scale up?
      • Are you reaching the read/write throughput limit of a single database server?
      • Is your server's read/write latency not meeting your SLA?
      • Is your user base growing at a frightening pace?
  • 5. DISTRIBUTED DOCUMENT DATABASES
  • 6. Document Databases
      • Each record in the database is a self-describing document
      • Each document has an independent structure
      • Documents can be complex
      • All databases require a unique key
      • Documents are stored using JSON or XML or their derivatives
      • Content can be indexed and queried
      • Offer auto-sharding for scaling and replication for high availability
    Example document (the "UUID" and "Time" values are cut off on the slide):
      {
        "UUID": "21f7f8de-8051-5b89-86…",
        "Time": "2011-04-01T13:01:02.42…",
        "Server": "A2223E",
        "Calling Server": "A2213W",
        "Type": "E100",
        "Initiating User": "dsallings@spy.net",
        "Details": {
          "IP": "10.1.1.22",
          "API": "InsertDVDQueueItem",
          "Trace": "cleansed",
          "Tags": ["SERVER", "US-West", "API"]
        }
      }
  • 7. Advantages of Document Databases
      • Schema flexibility: Gives users application flexibility for evolving the system without restructuring existing data
      • Dynamic elasticity: Moving data while maintaining consistency is easier
      • Performance: Related data is stored in a single document. This allows consistently low-latency access to the data
      • Query flexibility: Indexing allows users to query the contents of documents
  • 8. Advantages of Document Databases
      • Schema flexibility: Gives users application flexibility for evolving the system without restructuring existing data
      • Dynamic elasticity: Moving data while maintaining consistency is easier
      • Performance: Related data is stored in a single document. This allows consistently low-latency access to the data
      • Query flexibility: Indexing allows users to query the contents of documents
    Next:
      1. Compare relational and document DB data models
      2. Compare relational and document DB scaling models
  • 9. COMPARING DATA MODELS
  • 10. Relational vs Document data model
    [Figure: a 4 x 4 table grid (cells R1C1 to R4C4) next to a stack of overlapping JSON log documents like the example on slide 6]
      • Relational data model: Highly-structured table organization with rigidly-defined data formats and record structure.
      • Document data model: Collection of complex documents with arbitrary, nested data formats and varying "record" format.
  • 11. Example: Error Logging Use case
    Table 1: Error Log
      KEY   ERR   TIME   DC
      1     ERR   TIME   FK(DC2)
      2     ERR   TIME   FK(DC2)
      3     ERR   TIME   FK(DC2)
      4     ERR   TIME   FK(DC3)
    Table 2: Data Centers
      KEY   LOC   NUM
      1     DEN   303-223-2332
      2     NYC   212-223-2332
      3     SFO   415-223-2332
  • 12. Document design with flexible schema
    [Figure: four overlapping documents with "ID" values 1 to 4, each with the same fields]
      {
        "ID": 1,
        "ERR": "Out of Memory",
        "TIME": "2004-09-16T23:59:58.75",
        "DC": "NYC",
        "NUM": "212-223-2332"
      }
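
The move from the two relational tables (slide 11) to self-contained documents (slide 12) can be sketched as a simple denormalization step. This is a minimal illustration, not anything shown in the deck: the in-memory `data_centers` and `error_log` structures and the `to_document` helper are hypothetical, and the error text and timestamp are borrowed from slide 12 because the table on slide 11 only shows placeholders.

```python
# Hypothetical in-memory stand-ins for the two relational tables.
data_centers = {
    1: {"LOC": "DEN", "NUM": "303-223-2332"},
    2: {"LOC": "NYC", "NUM": "212-223-2332"},
    3: {"LOC": "SFO", "NUM": "415-223-2332"},
}

error_log = [
    {"KEY": 1, "ERR": "Out of Memory", "TIME": "2004-09-16T23:59:58.75", "DC": 2},
    {"KEY": 2, "ERR": "Out of Memory", "TIME": "2004-09-16T23:59:58.75", "DC": 2},
]


def to_document(row, centers):
    """Fold the referenced data-center row into the error record."""
    center = centers[row["DC"]]  # resolve the foreign key once, at write time
    return {
        "ID": row["KEY"],
        "ERR": row["ERR"],
        "TIME": row["TIME"],
        "DC": center["LOC"],
        "NUM": center["NUM"],
    }


documents = [to_document(row, data_centers) for row in error_log]
```

Each resulting document can be read without a join, at the cost of repeating the data-center fields in every error record.
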
  • 13. Document design with flexible schema
    [Figure: the overlapping documents from the previous slide, followed by a new document with extra fields]
    Existing document:
      {
        "ID": 4,
        "ERR": "Out of Memory",
        "TIME": "2004-09-16T23:59:58.75",
        "DC": "NYC",
        "NUM": "212-223-2332"
      }
    SCHEMA CHANGE
    New document:
      {
        "ID": 5,
        "ERR": "Out of Memory",
        "TIME": "2004-09-16T23:59:58.75",
        "COMPONENT": "DMS",
        "SEV": "LEVEL1",
        "DC": "NYC",
        "NUM": "212-223-2332"
      }
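
One consequence of the schema change on slide 13 is that application code has to tolerate documents with and without the new fields. The following is a minimal sketch under that assumption; the `describe` helper and its default values ("UNKNOWN", "n/a") are hypothetical and not part of the deck.

```python
# Two documents from the same logical collection: one written before the
# schema change, one after it (field values taken from slide 13).
old_doc = {
    "ID": 4,
    "ERR": "Out of Memory",
    "TIME": "2004-09-16T23:59:58.75",
    "DC": "NYC",
    "NUM": "212-223-2332",
}
new_doc = {
    "ID": 5,
    "ERR": "Out of Memory",
    "TIME": "2004-09-16T23:59:58.75",
    "COMPONENT": "DMS",
    "SEV": "LEVEL1",
    "DC": "NYC",
    "NUM": "212-223-2332",
}


def describe(doc):
    """Summarize an error document, defaulting fields older documents lack."""
    severity = doc.get("SEV", "UNKNOWN")    # absent before the schema change
    component = doc.get("COMPONENT", "n/a")  # absent before the schema change
    return f'{doc["ID"]}: {doc["ERR"]} [{severity}] ({component})'
```

No existing document has to be rewritten; the reader supplies defaults for fields that older documents do not carry.
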
  • 14. Document modeling
    Questions to ask:
      • Are these separate objects in the model layer?
      • Are these objects accessed together?
      • Do you need updates to these objects to be atomic?
      • Are multiple people editing these objects concurrently?
    When considering how to model data for a given application:
      • Think of a logical container for the data
      • Think of how data groups together
  • 15. Document Design Options
      • One document that contains all related data
          – Data is de-normalized
          – Better performance and scale
          – Eliminate client-side joins
      • Separate documents for different object types with cross references
          – Data duplication is reduced
          – Objects may not be co-located
          – Transactions supported only on a document boundary
          – Most document databases do not support joins
  • 16. Document ID / Key selection
      • Documents are sharded based on the document ID
      • ID-based document lookup is extremely fast
      • Similar to primary keys in relational databases
      • Usually an ID can only appear once in a bucket
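
The first bullet on slide 16, sharding by document ID, can be illustrated with a generic hash-and-modulo sketch. This is an assumption-laden illustration only: the `shard_for` function, the use of CRC32, and the shard count are hypothetical choices, and the deck does not say which hash function or partition scheme any particular database uses.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard number with a stable hash.

    CRC32 is used here only because it is deterministic across runs;
    the actual hash function is database-specific.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


# The same ID always lands on the same shard, so a lookup by ID can go
# straight to one server without consulting the others.
placements = {doc_id: shard_for(doc_id, 4)
              for doc_id in ("auser_profile", "auser_friends", "jchris_Hello_World")}
```

Because placement depends only on the ID, a client that knows the ID and the shard map needs a single network hop to reach the document.
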
  • 17. Document ID / Key selection
      • Similar to primary keys in relational databases
      • Documents are sharded based on the document ID
      • ID-based document lookup is extremely fast
      • Usually an ID can only appear once in a bucket
    Questions to ask:
      • Do you have a unique way of referencing objects?
      • Are related objects stored in separate documents?
    Options:
      • UUIDs, date-based IDs, numeric IDs
      • Hand-crafted (human readable)
      • Matching prefixes (for multiple related objects)
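
The key options listed on slide 17 can be sketched with a few small helpers. The helper names (`random_key`, `related_key`) and the underscore separator are hypothetical; only the resulting key shapes, such as "auser_profile", come from the deck.

```python
import uuid


def random_key() -> str:
    """UUID option: unique without coordination, but not human readable."""
    return str(uuid.uuid4())


def related_key(prefix: str, suffix: str) -> str:
    """Matching-prefix option: related documents share a readable prefix."""
    return f"{prefix}_{suffix}"


# Two documents that belong to the same user share the "auser" prefix, so
# the application can derive both keys from the user name alone.
profile_key = related_key("auser", "profile")
friends_key = related_key("auser", "friends")
```

With matching prefixes, the application never has to store one document's key inside another; knowing the prefix is enough to compute every related key.
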
  • 18. Example: Data Profile for Users
    Profile document:
      {
        "_id": "auser_profile",
        "user_id": 7778,
        "password": "a1004cdcaa3191b7",
        "common_name": "Robert User",
        "nicknames": ["Bob", "Buddy"],
        "sign_up_timestamp": 1224612317,
        "last_login_timestamp": 1245613101
      }
    Friends document:
      {
        "_id": "auser_friends",
        "friends": ["joe", "alan", "toru"]
      }
  • 19. Example: Entities for a Blog
      • User profile: the main pointer into the user data
          – Blog entries
          – Badge settings, like a Twitter badge
      • Blog posts: contain the blogs themselves
      • Blog comments
          – Comments from other users
  • 20. Blog Document, Option 1: Single document
      {
        "_id": "jchris_Hello_World",
        "author": "jchris",
        "type": "post",
        "title": "Hello World",
        "format": "markdown",
        "body": "Hello from [Couchbase](http://couchbase.com).",
        "html": "<p>Hello from <a href=\"http: …",
        "comments": [
          {"format": "markdown", "body": "Awesome post!"},
          {"format": "markdown", "body": "Like it."}
        ]
      }
  • 21. Blog Document, Option 2: Split into multiple docs
    BLOG DOC
      {
        "_id": "jchris_Hello_World",
        "author": "jchris",
        "type": "post",
        "title": "Hello World",
        "format": "markdown",
        "body": "Hello from [Couchbase](http://couchbase.com).",
        "html": "<p>Hello from <a href=\"http: …",
        "comments": [
          "comment1_jchris_Hello_World"
        ]
      }
    COMMENT
      {
        "_id": "comment1_jchris_Hello_World",
        "format": "markdown",
        "body": "Awesome post!"
      }
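
With the split design on slide 21, the application resolves the comment references itself, since most document databases do not support joins (slide 15). The sketch below uses a plain Python dict as a stand-in for the database; `store` and `load_post_with_comments` are hypothetical names, and a real client would issue get-by-ID calls instead of dict lookups.

```python
# A dict keyed by document ID stands in for the database.
store = {
    "jchris_Hello_World": {
        "author": "jchris",
        "type": "post",
        "title": "Hello World",
        "comments": ["comment1_jchris_Hello_World"],
    },
    "comment1_jchris_Hello_World": {
        "format": "markdown",
        "body": "Awesome post!",
    },
}


def load_post_with_comments(store, post_id):
    """Fetch a post, then fetch each referenced comment by its ID."""
    post = dict(store[post_id])  # copy so the stored document is untouched
    post["comments"] = [store[comment_id] for comment_id in post.get("comments", [])]
    return post


post = load_post_with_comments(store, "jchris_Hello_World")
```

This costs one lookup for the post plus one per comment, which is the trade-off against the single-document design: less duplication and smaller documents, but more round trips.
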
  • 22. Threaded Comments
      • You can imagine how to take this to a threaded list
    [Diagram: Blog → List → First comment → List → Reply to comment, with a further branch to More comments]
    Advantages
      • Only fetch the data when you need it
          – For example, rendering part of a web page
      • Spread the data and load across the entire cluster
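
The threaded list on slide 22 can be sketched by letting each comment document carry a list of reply IDs and walking it recursively. Everything in this example is an assumption made for illustration: the "replies" field name, the short comment IDs, the "Thanks!" reply, and the dict standing in for the database do not appear in the deck.

```python
# Each comment document lists the IDs of its direct replies.
comments = {
    "comment1": {"body": "Awesome post!", "replies": ["comment2"]},
    "comment2": {"body": "Thanks!", "replies": []},
    "comment3": {"body": "Like it.", "replies": []},
}


def render_thread(store, comment_ids, depth=0):
    """Return the comments as indented lines, replies nested under parents."""
    lines = []
    for comment_id in comment_ids:
        comment = store[comment_id]  # fetched only when this branch is rendered
        lines.append("  " * depth + comment["body"])
        lines.extend(render_thread(store, comment.get("replies", []), depth + 1))
    return lines


thread = render_thread(comments, ["comment1", "comment3"])
```

Because each level is fetched on demand, a page that shows only top-level comments never has to load the replies.
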
  • 23. COMPARING SCALING MODEL
  • 24. Modern interactive software architecture
      • Application scales out: just add more commodity web servers
      • Database scales up: get a bigger, more complex server
    Note: Relational database technology is great for what it is great for, but it is not great for this.
  • 25. NoSQL database matches application logic tier architecture
    Data layer now scales with linear cost and constant performance.
      • Application scales out: just add more commodity web servers
      • Database (NoSQL database servers) scales out: just add more commodity data servers
    Scaling out flattens the cost and performance curves.
  • 26. Other things to consider before transitioning
      • Accessing data
          – Learn about the development API the database supports
          – Check if the programming language of your choice is supported
      • Consistency
          – Understand the consistency model and check if it meets your needs
          – Analyze your application needs: do you need atomicity across multiple objects?
      • Availability
          – Ensure that there is no single point of failure
          – Understand the replication behavior and availability on node failures
  • 27. Other things to consider before transitioning
      • Operations
          – Monitoring the system
          – Backup and restore the system
          – Upgrades and maintenance
          – Support
      • Scaling
          – Ease of adding and reducing capacity
          – Application availability on topology changes
      • Maturity
          – Does your application need rich database functionality? (multi-doc transactions, complex security needs, complex joins)
  • 28. BRIEF OVERVIEW COUCHBASE SERVER
  • 29. Couchbase Server
    Simple. Fast. Elastic. NoSQL.
    Couchbase automatically distributes data across commodity servers. Built-in caching enables apps to read and write data with sub-millisecond latency. And with no schema to manage, Couchbase effortlessly accommodates changing data management requirements.
  • 30. Typical Couchbase production environment
    [Diagram: Application users → Load Balancer → Application Servers → Servers]
  • 31. Reading and Writing
    Reading data
      • Application server: "Give me document A"
      • Server: "Here is document A"
    Writing data
      • Application server: "Please store document A"
      • Server: "OK, I stored document A"
  • 32. Reading and Writing
    Reading data
      • Application server: "Give me document A"
      • Server: "Here is document A"
    Writing data
      • Application server: "Please store document A"
      • Server: "OK, I stored document A"
    [Diagram: on each server, document A appears in RAM and on disk]
  • 33. Flow of data when writing
    [Diagram: three application servers writing data to one server]
      • Applications writing to Couchbase Server
      • Replication queue: Couchbase transmitting replicas over the network
      • Disk write queue: Couchbase writing to disk
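
The write flow on slide 33 can be approximated with a toy model: a write lands in memory and is placed on a replication queue and a disk write queue, which are drained separately. This is a loose sketch, not Couchbase's implementation. The `ToyNode` class, its method names, and in particular the assumption that the acknowledgement is returned before the queues are drained are all hypothetical.

```python
from collections import deque


class ToyNode:
    """Toy model of one server with RAM, disk, and two outbound queues."""

    def __init__(self):
        self.ram = {}
        self.disk = {}
        self.replica = {}  # stands in for a copy held on another node
        self.replication_queue = deque()
        self.disk_write_queue = deque()

    def set(self, key, value):
        """Store in RAM, enqueue the follow-up work, and acknowledge."""
        self.ram[key] = value
        self.replication_queue.append((key, value))
        self.disk_write_queue.append((key, value))
        return "OK"  # assumption: acknowledged before persistence completes

    def drain(self):
        """Process both queues, as background workers would."""
        while self.replication_queue:
            key, value = self.replication_queue.popleft()
            self.replica[key] = value
        while self.disk_write_queue:
            key, value = self.disk_write_queue.popleft()
            self.disk[key] = value
```

Separating the in-memory write from replication and persistence is what lets the application see low write latency, with the queues absorbing bursts.
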
  • 34. THANK YOU
    DIPTI@COUCHBASE.COM