NoSQL – Beyond the Key-Value Store


Most NoSQL technologies achieve scale-out by building their architecture around a simple distributed hash: a key-value store. This works well for partitioning simple data, but real information models are rarely simple. As a result, you may have to build enormous layers of code to manage an explicit structure baked into the persistence tier. This session looks at a NoSQL solution that lets you store naturally clustered, richly linked object networks beneath your key-partitioned roots. The result is that you do not have to write extensive code to deal with the physical structure in the persistence tier, even when dealing with complex information models such as predictive models, time series, recursive relations, and compositions. We will explore how such an implementation works in practice through a case study of an advanced model analytics and visualization solution built on the clustered NoSQL database Versant Database Engine.

Published in: Technology

NoSQL – Beyond the Key-Value Store

  1. NoSQL Beyond the Key:Value Store
     By Robert Greene
     Versant Corporation U.S. Headquarters, 255 Shoreline Dr., Suite 450, Redwood City, CA 94065 | 650-232-2400
     #NoSQLVersant
  2. Overview
     The Genesis of NoSQL
       The Sky is Falling
     NoSQL at its Core
       Shift in Architecture
     Shift Innovation
       Domain Models, Distribution, SOA
     Enterprise Needs and NoSQL
     Application Development with NoSQL
     NoSQL 2.0 - Leveraging the Knowledge Base
  3. Genesis of NoSQL
     ► The Sky is Falling
       Early Web 2.0 social computing drives innovation
     ► End of the Hammer Era
       One relational tool for every data problem fails
       Agility and cost usher in reason and innovation
  4. NoSQL at its Core
     An Increasingly Crowded Space
     To “shift” is to be NoSQL
     No “shift” Inside
  5. Traditional DBMS Scale Architecture
     INEFFICIENT: CPU-destroying mapping
     EXPENSIVE: Repetitive data movement and JOIN calculation
  6. NoSQL at its Core
     A Shift in Application Architecture
     UNIFIED: Application-driven schema
     COMMODITY HW: Horizontal scale-out, distribution and partitioning
     • Google – Soft-Schema
     • IBM – Schema-Less
  7. A Shift is Needed
     ► How Often do Relations Change?
       Blog : BlogEntry, Order : OrderItem, You : Friend
     ► Relations Rarely Change, Stop Recalculating Them
       ► Do you need ALL of your data in one place?
       ► You don’t. You can distribute it.
  8. NoSQL
     Innovation and the Shift
  9. Domain Model Thinking
     ► Business Model is Schema
       Not Data Model under Entities
     ► Movement of Responsibility
       Soft-Schema (vs) Schema-less
     ► Enables Changing Nature of Analytics
       SQL/MapReduce – “give me top 20 performers”
       NoSQL – “find 3-dimensional protein pattern match”
  10. Distributed Thinking
      ► Scale-out, with fallout
      ► Partition Impact – Implementation, Algorithms
        Different design considerations
        ► Key-driven access impacts
        ► Embedded Models
        ► Enterprise Reference Data
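
The "key-driven access" point on the Distributed Thinking slide can be illustrated with a minimal sketch. It assumes a hash-based placement rule and in-memory maps standing in for nodes; the class and method names are hypothetical and do not come from the talk or from any product.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: routes each root key to one of N in-memory "nodes". */
class KeyPartitionedStore {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    KeyPartitionedStore(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    /** The node index depends only on the key, so any client can compute it. */
    int nodeFor(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void put(String key, String value) {
        nodes.get(nodeFor(key)).put(key, value);
    }

    /** Returns null when the key was never stored. */
    String get(String key) {
        return nodes.get(nodeFor(key)).get(key);
    }
}
```

Lookups by key touch exactly one node, which is what makes scale-out cheap. Any query that is not driven by the key would have to visit every node, which is one reason partitioning changes implementation and algorithm design.
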
  11. SOA Thinking
      ► Business Processes and Service Orchestration
        The Drivers of Business Agility
        ► NoSQL enables increased speed of agility
        ► Faster Time to Market, Competitive Edge
      ► Raw Data Manipulation and Mining
        Typically done outside of day-to-day business
        ETL strategy essential
        ► Feedback loop for BPM/O layers
  12. NoSQL and the Enterprise
      Responsibly taking advantage of the “Shift”
  13. Embedded Models – NoSQL 1.0
      ► Document Store Characteristics
        Blogs have Articles
      ► Patterns of Access
        Only access sub-elements from root
        Good candidate for simple web system
        ► Query on Articles content to get similar Blogs
        ► Display Blogs and their Articles
  14. Enterprise Models – NoSQL 2.0
      ► Many to Many
        Blogs get Tags - search based on tag
        Tags weighted, Similarity Meta Data
      ► Faster algorithmic searching
        Narrow Blogs via back reference
        ► Sub-queries on collection contents
        Can leverage Articles in addition to Blogs
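
As a rough illustration of "narrow Blogs via back reference", the sketch below keeps both sides of the many-to-many relation in memory. It assumes plain Java objects; `TaggedBlog`, `BlogTag` and `TagSearch` are hypothetical names, not an API from the talk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tag that keeps a back reference to every blog carrying it. */
class BlogTag {
    final String name;
    final List<TaggedBlog> blogs = new ArrayList<>(); // back reference

    BlogTag(String name) {
        this.name = name;
    }
}

class TaggedBlog {
    final String title;
    final List<BlogTag> tags = new ArrayList<>();

    TaggedBlog(String title) {
        this.title = title;
    }

    /** Maintains both directions of the many-to-many relation. */
    void tag(BlogTag tag) {
        tags.add(tag);
        tag.blogs.add(this);
    }
}

class TagSearch {
    /** Starts from the tag and follows the back reference; no scan over all blogs. */
    static List<String> titlesTagged(BlogTag tag) {
        List<String> titles = new ArrayList<>();
        for (TaggedBlog blog : tag.blogs) {
            titles.add(blog.title);
        }
        return titles;
    }
}
```

In an embedded document model the tags would live inside each blog, so the same search would have to inspect every blog; the back reference turns it into a direct traversal.
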
  15. Operational Features – NoSQL 1.0
      ► Transactions – The 20:80 Rule (ACID:CAP)
        Most prevalent NoSQL 1.0 approach
        ► Give up transactions for better scalability
        ► Compensating application code needed
          Code complexity, manual processes
          High operational cost
        ► Weak Transactions
          It’s a start, gets us to 20%, demonstrates the need
          From Key to Criteria-Based Query
  16. Enterprise Operational Features – NoSQL 2.0
      ► Transactions – The 80:20 Rule (ACID:CAP)
        Algorithm, Tagged Blogs via Tag
        ► No Transactions = lost Blog, no results from Algorithm
      ► Cascading Operations
        Network essential
      ► External Access
        JDBC/ODBC tooling support
  17. Operating NoSQL 1.0
      ► DevOps – Dev builds it, Dev owns it.
        Schema-less implementation
        ► Evolution directly impacts application space (Development)
      ► Data Backup
        Largely file dumps, mostly systems off-line
      ► Custom tooling for out-of-band needs
        Operational need, write a custom access
        Non-centralized, scripted monitoring
  18. Enterprise Operations – NoSQL 2.0
      ► DevOps – Dev builds it, IT owns it eventually.
        IT System Management
        ► Centralized monitoring
        ► Integrated with SNMP / system management
      ► Availability, Governance, Data Backup
        Enterprise point-in-time recovery, SOX, HIPAA, etc.
        Fault tolerant, globally replicated
        Online and distributed backup
      ► Cloud Enabled - utility efficiency
        Automated SLA-based provisioning
        Mobility of processes
  19. Web Development – NoSQL 1.0
      ► Requires completely new skill set
      ► Lack of ecosystem integration
        IDE tooling
        Immature integration
        Non-standard connectivity
      ► Custom, custom and more custom
        Each 1st generation product unique / proprietary
  20. Enterprise Development – NoSQL 2.0
      ► Leverages existing enterprise skill set
      ► Mature development platforms
        Tomcat, Spring, Hudson, Eclipse enabled
      ► Industry standard APIs
        Java – JPA (10 years of ORM experts)
        Ruby – OnRails, it’s the shift that matters
  21. Application Development
      The Things You Will Build
      NoSQL 1.0
      NoSQL 2.0
  22. Need Proxy Pattern – NoSQL 1.0
      ► Avoid overhead of extraneous loading
        You want all Blog Articles to get 1 Article?
      ► Model must change to use References
        Blog:owner(User) becomes Blog:owner_id(long)
      ► Proxy pattern for long to User swizzle
        Object to Value, Value to Object
        ► Maybe Document store BasicDBObject
        ► Maybe Key:Value store BSON
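
The "long to User swizzle" described on the Proxy Pattern slide might look roughly like the sketch below. It is only an illustration under assumptions: the loader function stands in for whatever store lookup an application would use, and `ProxyUser`, `LazyUserRef` and `ProxiedBlog` are hypothetical names.

```java
import java.util.function.LongFunction;

/** Hypothetical domain class; the slide only names Blog and User. */
class ProxyUser {
    final long id;
    final String name;

    ProxyUser(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Holds the stored long and swizzles it into a ProxyUser on first access. */
class LazyUserRef {
    private final long ownerId;
    private final LongFunction<ProxyUser> loader; // stand-in for a store lookup
    private ProxyUser resolved;

    LazyUserRef(long ownerId, LongFunction<ProxyUser> loader) {
        this.ownerId = ownerId;
        this.loader = loader;
    }

    /** Value side: what would actually be persisted with the Blog. */
    long id() {
        return ownerId;
    }

    /** Object side: loads the user once, then reuses the same instance. */
    ProxyUser get() {
        if (resolved == null) {
            resolved = loader.apply(ownerId);
        }
        return resolved;
    }
}

/** The Blog now carries a reference wrapper instead of a User. */
class ProxiedBlog {
    final String title;
    final LazyUserRef owner;

    ProxiedBlog(String title, LazyUserRef owner) {
        this.title = title;
        this.owner = owner;
    }
}
```

Loading a `ProxiedBlog` costs nothing for the owner until `owner.get()` is called, which avoids the extraneous loading the slide warns about, at the price of changing the domain model.
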
  23. Serializable – NoSQL 1.0
      ► You don’t write code in JSON or XML
        Programming models need transformation
      ► Non-Vendor transformation limits
        Create binary format value, cannot query it
      ► Not all programming structures are supported
        Map -- Need to break down programming model
        Lists -- Array need Serializable
  24. Reference System – NoSQL 1.0
      ► Avoid object duplicates
        Load a User’s Personal Blog, Search Tagged Blog
        ► Inconsistencies during runtime
      ► Materialization of bi-directional relations
        Need to avoid circular references
        ► Load Blog*, blog has an Owner:User
        ► Load User, user has a Personal Blog*
        ► …..repeat
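
One common way to avoid the duplicate objects mentioned on the Reference System slide is an identity map: a per-session table from id to the single in-memory instance. The sketch below is a minimal, hypothetical version; the talk does not name this technique or prescribe an implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical per-session identity map: at most one instance per id. */
class IdentityMap<T> {
    private final Map<Long, T> loaded = new HashMap<>();

    /** Returns the already-loaded instance, or loads and registers a new one. */
    T resolve(long id, LongFunction<T> loader) {
        T existing = loaded.get(id);
        if (existing == null) {
            existing = loader.apply(id);
            loaded.put(id, existing);
        }
        return existing;
    }

    int size() {
        return loaded.size();
    }
}
```

If the same Blog is reached once through its owner and once through a tag search, both paths resolve to one instance, so an edit made through either path is visible through the other. Handling circular Blog and User references additionally requires registering an object before its relations are materialized, which this sketch does not attempt.
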
  25. Need Lifecycle Tracking – NoSQL 1.0
      ► New, Changed, Deleted
        On store, update: Slow overhead to replace all objects
        ► If not dirty, do not traverse and update
        ► If new, add to the reference system
        ► If null, delete underlying element
      ► Need to manage the reference system
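
The new / changed / deleted rules on the Lifecycle Tracking slide can be sketched as a small state machine plus a flush step. This is an assumption-laden illustration: the state names, `TrackedArticle` and `UnitOfWorkSketch` are hypothetical, and the flush only returns a log of the operations a real store would perform.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

enum LifecycleState { NEW, CLEAN, DIRTY, DELETED }

/** Hypothetical tracked object that records its own lifecycle state. */
class TrackedArticle {
    private String title;
    private LifecycleState state = LifecycleState.NEW;

    TrackedArticle(String title) {
        this.title = title;
    }

    String title() { return title; }
    LifecycleState state() { return state; }

    void setTitle(String newTitle) {
        title = newTitle;
        if (state == LifecycleState.CLEAN) {
            state = LifecycleState.DIRTY; // only stored objects become dirty
        }
    }

    void delete() { state = LifecycleState.DELETED; }
    void markClean() { state = LifecycleState.CLEAN; }
}

class UnitOfWorkSketch {
    private final List<TrackedArticle> tracked = new ArrayList<>();

    void track(TrackedArticle article) {
        tracked.add(article);
    }

    /** Writes only what changed; clean objects are skipped entirely. */
    List<String> flush() {
        List<String> ops = new ArrayList<>();
        Iterator<TrackedArticle> it = tracked.iterator();
        while (it.hasNext()) {
            TrackedArticle a = it.next();
            switch (a.state()) {
                case NEW:     ops.add("insert " + a.title()); a.markClean(); break;
                case DIRTY:   ops.add("update " + a.title()); a.markClean(); break;
                case DELETED: ops.add("delete " + a.title()); it.remove();   break;
                default:      break; // CLEAN: nothing to write
            }
        }
        return ops;
    }
}
```

Without this bookkeeping, every store call would have to rewrite the whole object graph, which is the "slow overhead" the slide refers to.
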
  26. NoSQL 1.0 (observations)
      ► Mapping layer is forming
        Why re-invent the wheel
        ► ‘O’RM – Object Relational Mapping
        ► ‘O’DM – Object Document Mapping
        ► ‘O’CM – Object Column Mapping
        Software Industry knows where this leads
        ► Mapping complexity, brittle code base, non-agility
        ► The ‘O’ is what matters, ‘O’bject Lifecycle Management
  27. NoSQL 2.0
      ► Leverage NoSQL 1.0 architectural shift
        Scale out with performance
        ► Key-partitioned data distribution
        ► The good stuff from NoSQL 1.0
      ► Eliminate mapping complexity
        Handle modern information models
        ► Eliminate domain model mapping
        ► Enable development agility
        ► Leverage existing enterprise skills
        ‘O’ in a standard (e.g. JPA), without RM, DM, CM
  28. Verite Group Case Study
  29. Verite Group
      ► Value Proposition
        Line Level I.P. Analytics
        ► Answers the question: What is happening?
          Not: What has happened?
        Activity Correlation
        ► Capturing time-related sequences of activity
          Not capturing discrete “product” on the wire
  30. Verite Group
      ► Core netScope Use Case Pipeline
        Monitor and capture
        ► In-flight I.P. traffic content
        Apply target rules and populate meta models
        ► High network traffic, content, equipment variation
        Present analyst visualization and alerts
        ► Customize new target rules
        Insert into Pipeline and iterate
  31. Verite Group
      ► Technology Adoption Process
        IBM DB2 – Pure XML store
        ► Driver: fast ingestion, excellent reg_exp query support
        ► Failure: huge CPU issues pulling query results
          Analytic model too complex, need objects from results
        Hibernate – Postgres, MySQL
        ► Driver: binary protocol to analytic model up front
          Soft-Schema driven, still supports reg_exp query
        ► Failure: data ingestion too slow, CPU max, high disk spin
        Versant – NoSQL 2.0
        ► Driver: speed data ingestion
        ► Success: high speed data ingestion, low CPU, low disk spin
          Direct soft-schema storage, still supports reg_exp query
          Scale-out capability for large data analytics
  32. Verite Group
      ► Discovered Value, Lessons Learned
        Changing nature of analytics
        ► Model-driven algorithmic, not iterative query
          E.g. eliminated many reg_exp queries and moved to model
        ► Significant increase in performance of analytic
        Operational efficiencies
        ► Soft-Schema is database schema
          Faster analytic model evolution (less DBA)
          Lower CPU cost to marshal type systems (mapping)
          Less disk space and fast I/O (less duplication, disk seeking)
  33. Q&A
  34. Contact
      Robert Greene, Vice President, Technology
      @ NoSQL Now! – Booth #14