Navigating NoSQL in cloudy skies

NoSQL is not a buzzword anymore. The array of non-relational technologies has found wide-scale adoption even in non-Internet-scale focus areas. With the advent of the Cloud, the churn has increased even more, yet there is no crystal-clear guidance on adoption techniques and architectural choices surrounding the plethora of options available. This session initiates you into the whys & wherefores, architectural patterns, caveats and techniques that will augment your decision-making process & boost your perception of architecting scalable, fault-tolerant & distributed solutions.

Published in: Technology

  1. 1. Presented at: Chicago IT Architects Group, Jan 15, 2013
  2. 2. Shankar Ramachandran. Works with: Microsoft Web Stack of Love, Microsoft SQL Server. Also works with:
  3. 3. Skipping essential steps just creates an illusion of speed & growth.
  4. 4. simple. 5
  5. 5. Agenda
      • What NoSQL is & what it is not
      • Why NoSQL: 2 specific reasons
      • Conceptual fundamentals & grounding
      • 3 techniques to classify & choose
      • Way ahead
  6. 6. What
      • Variety of non-relational database systems
      • Usually schema-less
      • Mostly open-source
      • Not anti-RDBMS
      • Not a replacement
  7. 7. No relational tables were harmed in the making of this presentation.
  8. 8. Why NoSQL?
  9. 9. Reason #1
  10. 10. Big Data
  11. 11. “Big Data”
  12. 12. 4 Vs of Big Data
      • Volume: terabytes and petabytes
      • Velocity: time-sensitive, real-time data processing & decision making
      • Variety: structured and unstructured data
      • Value: inherent value always
  13. 13. RDBMS can handle all that. Right??
      • Scaling up has a limit.
      • Sharding: spreads data across servers.
      • Denormalization: potentially duplicates data in the database, requiring updates to multiple tables when a duplicated data item is changed.
      • Distributed caching: caches recently accessed data in memory and stores that data across any number of servers or virtual machines. Think Memcached.
  14. 14. RDBMS tactics - Downside & Pitfalls
      • Re-sharding is disruptive.
      • The schema must be maintained on every server.
      • Distributed caching accelerates just the reads.
      • You lose relational benefits anyway.
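Why re-sharding is disruptive can be seen with a small sketch. It assumes the simplest possible routing scheme, hash of the key modulo the number of shards; the deck does not say which scheme it has in mind, and the names `shard_for`, `moved_fraction` and the `customer-N` keys are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Route a key to a shard: hash the key, then take the remainder."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical keys, for illustration only.
keys = [f"customer-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_shards=4, new_shards=5)
print(f"{fraction:.0%} of keys change shard when growing from 4 to 5 shards")
```

Under this naive scheme a key stays put only when its hash gives the same remainder for both shard counts, so adding a single server relocates roughly four out of five rows. Schemes such as consistent hashing exist to reduce that movement.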
  15. 15. aggregate-oriented vs. aggregate-ignorant
  16. 16. Aggregate-orientation
      • A unit of data can have a more complex structure than a set of simple tuples.
      • Excellent fit to run on a cluster.
      • Atomic manipulation of a single aggregate.
      • Application code takes precedence.
  17. 17. Reason #2
  18. 18. Impedance Mismatch
  19. 19. • Difference between the relational model & in-memory data structures
      • Simple tuples
      • ORMs provide a bridge, but complicate query performance.
  20. 20. { product : "Tintin Statue", created : new Date("2010-11-16"), title : "Brass replica of Tintin", tags : [ "tintin", "herge", "snowy" ], comments : [ { author : "Shankar", comment : "I love it" }, { author : "Skeet", comment : "me too!!" } ] }
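The impedance mismatch behind the document on this slide can be made concrete. The sketch below keeps the product as one nested Python structure (the aggregate) and then splits it into the flat tuples a normalized relational schema would need. The table names, the `product_id` surrogate key and the ISO date string are assumptions made for illustration; they are not in the deck.

```python
import json

# The aggregate: one nested structure that mirrors the in-memory object.
product = {
    "product": "Tintin Statue",
    "created": "2010-11-16",  # assumed ISO string form of the slide's date
    "title": "Brass replica of Tintin",
    "tags": ["tintin", "herge", "snowy"],
    "comments": [
        {"author": "Shankar", "comment": "I love it"},
        {"author": "Skeet", "comment": "me too!!"},
    ],
}


def to_relational_rows(doc, product_id=1):
    """Split the aggregate into simple tuples, one list per (hypothetical) table."""
    return {
        "products": [(product_id, doc["product"], doc["created"], doc["title"])],
        "tags": [(product_id, tag) for tag in doc["tags"]],
        "comments": [
            (product_id, c["author"], c["comment"]) for c in doc["comments"]
        ],
    }


# Document store view: the whole aggregate is written and read as one unit.
stored = json.dumps(product)
restored = json.loads(stored)

# Relational view: the same data spread over three tables and six rows.
rows = to_relational_rows(product)
row_count = sum(len(table) for table in rows.values())
print(f"1 document vs {row_count} rows in {len(rows)} tables")
```

Reading the product back relationally means joining those tables again, which is the bridging work an ORM takes on; with the aggregate, the stored shape already matches the application's shape.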
  21. 21. Concepts
  22. 22. 3 properties of distributed databases
      • Consistency: each client always has the same view of the data.
      • Availability: a node is always available for reads and writes.
      • Partition tolerance: the system keeps working across physical network partitions.
  23. 23. CAP Theorem: consistency, availability, partition-tolerance; only 2 out of 3
  24. 24. consistency, availability, partition-tolerance. This is incorrect
  25. 25. consistency, availability, partition-tolerance
  26. 26. sharding: horizontal-partitioning, multiple-instances, shared-nothing
  27. 27. horizontal-scalability: commodity-hardware, distributed, infinite-expansion
  28. 28. MapReduce: google-patented-framework; map: chop data; reduce: fold data
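The "chop" and "fold" steps can be sketched with a single-process word count. This is a toy illustration of the programming model only, not Google's framework or Hadoop's API; the function names and sample sentences are made up for the example.

```python
from collections import defaultdict


def map_phase(document: str):
    """Map: chop one document into (key, value) pairs."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: fold the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["NoSQL is not SQL", "SQL is here to stay"])
print(counts)
```

Because each `map_phase` call touches only one document and each `reduce_phase` call only one key, both phases can be spread across many machines, which is what makes the model a fit for clusters.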
  29. 29. eventual-consistency: low-latency, order-of-reads, delayed-gratification
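A minimal sketch of the "delayed gratification" idea, assuming a toy store in which a write is acknowledged by one replica immediately and copied to the others later. The class `EventuallyConsistentStore` and its methods are hypothetical and do not model any particular product.

```python
class EventuallyConsistentStore:
    """Toy store: writes land on replica 0 first and reach the others later."""

    def __init__(self, num_replicas: int = 3):
        self.replicas = [dict() for _ in range(num_replicas)]
        self.pending = []  # queued (replica_index, key, value) updates

    def write(self, key, value):
        # Acknowledge after updating a single replica (low latency).
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica: int):
        # A read sees whatever the chosen replica currently holds.
        return self.replicas[replica].get(key)

    def propagate(self):
        # Background replication: apply queued updates in write order.
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("title", "Brass replica of Tintin")
stale = store.read("title", replica=2)   # replica 2 has not seen the write yet
store.propagate()
fresh = store.read("title", replica=2)   # all replicas have now converged
print(stale, "->", fresh)
```

Between `write` and `propagate`, two clients reading from different replicas can disagree; once propagation finishes, every replica returns the same value.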
  30. 30. For the academically inclined: Google BigTable (proprietary DB, high-performance, Google App Engine); Amazon Dynamo (proprietary system, high-availability, AWS, key-value)
  31. 31. quick shout-out
  32. 32. Object oriented: Faster and declarative. Lack of interoperability and recovery standards. End-to-end development, database & deployment platform. Embeddable and fast. Lack of querying capabilities.
  33. 33. XML: Native XML database systems. XQuery is typically used as the querying mechanism. Advantage or disadvantage depending on XML affinity. Sedna, Tamino
  34. 34. Choice By Data Model
  35. 35. aggregate-ignorant
  36. 36. Graph: graph-data structure, associative-datasets, nodes/edges. Small records with complex interconnections. GraphDB
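"Small records with complex interconnections" can be pictured with an in-memory adjacency structure and a breadth-first traversal. This is a sketch of the data model only and is not the API of GraphDB or any other graph database; the node names and edges are invented for the example.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Store each node's neighbours; the records are tiny, the links carry the value."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def within_hops(graph, start, max_hops):
    """Breadth-first search: every node reachable in at most max_hops edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, distance = frontier.popleft()
        if distance == max_hops:
            continue
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, distance + 1))
    seen.discard(start)
    return seen


# Hypothetical edges, for illustration only.
edges = [
    ("tintin", "snowy"),
    ("tintin", "haddock"),
    ("haddock", "calculus"),
    ("calculus", "thomson"),
]
graph = build_graph(edges)
reachable = within_hops(graph, "tintin", max_hops=2)
print(sorted(reachable))
```

In a relational schema each extra hop is another self-join on an edge table; a graph store keeps the neighbour links directly, so this kind of traversal is its natural query.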
  37. 37. aggregate-oriented
  38. 38. Key/Value: in-memory processing/caching, hyper-efficient associative storage. Voldemort
  39. 39. Wide-Column: horizontally-partitioned, fully distributed. Dynamo + BigTable
  40. 40. Document-oriented: schema-less, collection-based, JSON-like, dynamic-indexing
  41. 41. Choice By CAP
      • CA: RDBMS
      • AP: Riak, Dynamo, Cassandra, CouchDB, Voldemort
      • CP: MongoDB, HBase, Redis, Hypertable
  42. 42. • C: Redis
      • C++: MongoDB, Hypertable, Kyoto Tycoon
      • C#: RavenDB, GraphDB
      • Erlang: CouchDB, Couchbase, Riak, Scalaris
      • Java: Cassandra, Hadoop, HBase, neo4J, Voldemort
  43. 43. Way Ahead
  44. 44. What are Microsoft & Oracle up to?
  45. 45. Microsoft Polybase
  46. 46. Oracle NoSQL
  47. 47. polyglot persistence … a highly possible future
  48. 48. We learnt that ...
      • RDBMSs are here to stay. NoSQL is not creating a paradigm shift.
      • NoSQL provides a set of non-relational data stores & technologies that have an affinity for being processed in a clustered environment.
      • Some NoSQL databases also offer a solution to Impedance Mismatch, thus increasing application developer productivity.
      • What Aggregate-Orientation in data modeling means.
      • What the different database types are.
      • And most importantly ... we now know that RDBMS systems need DBAs - Database Architects & Admins. NoSQL systems need DBAs too - Developers Beyond Awesome!
  49. 49. Twitter: @areshankar
  50. 50. "Computers are useless. They can only give you answers." Pablo Picasso, Cubist painter (1881 - 1973) ?
