
NoSQL – Beyond the Key-Value Store

Most NoSQL technologies achieve scale-out by building their architecture around a simple, distributed-hash key-value store. This works well for partitioning simple data, but in reality your information models are not simple. As a result, you may have to build enormous layers of code to manage an explicit structure baked into the persistence tier. In this session, take a look at a NoSQL solution that lets you store naturally clustered, richly linked object networks beneath your key-partitioned roots. The result is that you do not have to write extensive code to deal with the physical structure in the persistence tier, even when dealing with complex information models such as predictive models, time series, recursive relations, and compositions. We will explore how such an implementation works in practice by looking at a case study of an advanced model analytics and visualization solution built on the clustered NoSQL database Versant Database Engine.

Transcript

  • 1. NoSQL: Beyond the Key:Value Store. By Robert Greene, Versant Corporation. U.S. Headquarters: 255 Shoreline Dr., Suite 450, Redwood City, CA 94065 | www.versant.com | 650-232-2400 #NoSQLVersant
  • 2. Overview ► The Genesis of NoSQL: The Sky is Falling ► NoSQL at its Core: Shift in Architecture ► Shift Innovation: Domain Models, Distribution, SOA ► Enterprise Needs and NoSQL ► Application Development with NoSQL ► NoSQL 2.0: Leveraging the Knowledge Base
  • 3. Genesis of NoSQL ► The Sky is Falling: Early Web 2.0 Social Computing drives innovation ► End of the Hammer Era: One relational tool for every data problem fails. Agility and cost usher in reason and innovation
  • 4. NoSQL at its Core: An Increasingly Crowded Space ► To “shift” is to be NoSQL ► No “shift” inside
  • 5. Traditional DBMS Scale Architecture ► INEFFICIENT: CPU-destroying mapping ► EXPENSIVE: Repetitive data movement and JOIN calculation
  • 6. NoSQL at its Core: A Shift in Application Architecture ► UNIFIED: Application-driven schema ► COMMODITY HW: Horizontal scale-out, distribution and partitioning • Google – Soft-Schema • IBM – Schema-Less
  • 7. A Shift is Needed ► How often do relations change? Blog : BlogEntry, Order : OrderItem, You : Friend ► Relations rarely change, so stop recalculating them ► Do you need ALL of your data in one place? ► You don’t. You can distribute it.
  • 8. NoSQL: Innovation and the Shift
  • 9. Domain Model Thinking ► Business Model is Schema: Not Data Model under Entities ► Movement of Responsibility: Soft-Schema (vs) Schema-less ► Enables changing Nature of Analytics: SQL/MapReduce – “give me top 20 performers”; NoSQL – “find 3 dimensional protein pattern match”
  • 10. Distributed Thinking ► Scale-out, with fall out ► Partition Impact – Implementation, Algorithms: Different design considerations ► Key Driven access impacts ► Embedded Models ► Enterprise Reference Data
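The "key driven access" point on slide 10 can be illustrated with a minimal sketch of hash partitioning. The names `partition_for` and `PartitionedStore` are hypothetical and are not taken from the deck or from any product; in-memory dictionaries stand in for the nodes. The sketch only shows why a lookup by key touches one partition while a criteria-based query has to visit all of them.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a root key to a partition index using a stable hash."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class PartitionedStore:
    """Toy key-partitioned store; each dict stands in for one node."""

    def __init__(self, num_partitions: int):
        self.partitions = [dict() for _ in range(num_partitions)]

    def put(self, key, value):
        index = partition_for(key, len(self.partitions))
        self.partitions[index][key] = value

    def get(self, key):
        # Key access goes straight to the single owning partition.
        index = partition_for(key, len(self.partitions))
        return self.partitions[index].get(key)

    def find(self, predicate):
        # A criteria-based query cannot use the key, so it scans every partition.
        return [v for part in self.partitions for v in part.values() if predicate(v)]
```

A design consequence, under these assumptions, is that anything stored beneath a root key travels with that key, which is why embedded models partition so easily.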
  • 11. SOA Thinking ► Business Processes and Service Orchestration: The Drivers of Business Agility ► NoSQL enables increased speed of agility ► Faster Time to Market, Competitive Edge ► Raw Data Manipulation and Mining: Typically done outside of day-to-day business; ETL strategy essential ► Feedback loop for BPM/O layers
  • 12. NoSQL and the Enterprise: Responsibly taking advantage of the “Shift”
  • 13. Embedded Models (NoSQL 1.0) ► Document Store Characteristics: Blogs have Articles ► Patterns of Access: Only access sub-elements from root; Good candidate for simple web system ► Query on Articles content to get similar Blogs ► Display Blogs and their Articles
  • 14. Enterprise Models (NoSQL 2.0) ► Many to Many: Blogs get Tags, search based on tag; Tags weighted, Similarity Meta Data ► Faster algorithmic searching: Narrow Blogs via back reference ► Sub-queries on collection contents: Can leverage Articles in addition to Blogs
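As a rough illustration of slide 14, the sketch below models the Blog/Tag many-to-many relation with a back reference from each tag to its blogs, so a search starts from the tag instead of scanning every blog. The class layout, the `tag_weights` field, and the `blogs_for_tag` helper are assumptions made for this example, not an API from the deck.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality/hash, so objects can be dict keys
class Tag:
    name: str
    blogs: list = field(default_factory=list)  # back reference: Tag -> Blogs


@dataclass(eq=False)
class Blog:
    title: str
    articles: list = field(default_factory=list)
    tag_weights: dict = field(default_factory=dict)  # Tag -> weight

    def add_tag(self, tag, weight=1.0):
        # Maintain both directions of the relation together.
        self.tag_weights[tag] = weight
        if self not in tag.blogs:
            tag.blogs.append(self)


def blogs_for_tag(tag, min_weight=0.0, article_contains=None):
    """Narrow blogs via the tag's back reference, then sub-query their articles."""
    result = []
    for blog in tag.blogs:
        if blog.tag_weights[tag] < min_weight:
            continue
        if article_contains is not None and not any(
            article_contains in article for article in blog.articles
        ):
            continue
        result.append(blog)
    return result
```

In an embedded (document) model the same search would have to open every blog document to inspect its tags; here the candidate set is already attached to the tag.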
  • 15. Operational Features (NoSQL 1.0) ► Transactions – The 20:80 Rule (ACID:CAP): Most prevalent NoSQL 1.0 approach ► Give up transactions for better scalability ► Compensating application code needed: Code Complexity, Manual Processes, High Operational Cost ► Weak Transactions: It’s a start, gets us to 20%, demonstrates the need ► From Key to Criteria Based Query
  • 16. Enterprise Operational Features (NoSQL 2.0) ► Transactions – The 80:20 Rule (ACID:CAP): Algorithm, Tagged Blogs via Tag ► No Transactions = lost Blog, no results from Algorithm ► Cascading Operations: Network essential ► External Access: JDBC/ODBC tooling support
  • 17. Operating NoSQL 1.0 ► DevOps – Dev builds it, Dev owns it: Schema-less implementation ► Evolution directly impacts application space (Development) ► Data Backup: Largely file dumps, mostly systems off-line ► Custom tooling for out-of-band needs: Operational need, write a custom access; Non-centralized, scripted monitoring
  • 18. Enterprise Operations (NoSQL 2.0) ► DevOps – Dev builds it, IT owns it eventually: IT System Management ► Centralized monitoring ► Integrated with SNMP / system management ► Availability, Governance, Data Backup: Enterprise point-in-time recovery, SOX, HIPAA, etc.; Fault tolerant, globally replicated; Online and distributed backup ► Cloud Enabled, utility efficiency: Automated SLA-based Provisioning; Mobility of Processes
  • 19. Web Development (NoSQL 1.0) ► Requires completely new skill set ► Lack of ecosystem integration: IDE tooling; Immature integration; Non-standard connectivity ► Custom, custom and more custom: Each 1st generation product unique / proprietary
  • 20. Enterprise Development (NoSQL 2.0) ► Leverages existing enterprise skill set ► Mature development platforms: Tomcat, Spring, Hudson, Eclipse enabled ► Industry standard APIs: Java – JPA (10 years of ORM experts); Ruby – OnRails; it’s the shift that matters
  • 21. Application Development: The Things You Will Build ► NoSQL 1.0 ► NoSQL 2.0
  • 22. Need Proxy Pattern (NoSQL 1.0) ► Avoid overhead of extraneous loading: You want all Blog Articles to get 1 Article? ► Model must change to use References: Blog:owner(User) becomes Blog:owner_id(long) ► Proxy pattern for long to User swizzle: Object to Value, Value to Object ► Maybe Document store BasicDBObject ► Maybe Key:Value store BSON
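The "long to User swizzle" on slide 22 might look roughly like the following sketch. `UserProxy`, the `load_user` callback, and `to_value` are hypothetical names chosen for illustration; the deck does not show an implementation. The blog persists only `owner_id`, and the proxy resolves it to a real `User` the first time an attribute is read.

```python
class User:
    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name


class UserProxy:
    """Stands in for a User; holds only the id until an attribute is needed."""

    def __init__(self, user_id, loader):
        self._user_id = user_id
        self._loader = loader
        self._target = None

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself.
        if self._target is None:
            self._target = self._loader(self._user_id)  # value -> object
        return getattr(self._target, name)


class Blog:
    def __init__(self, title, owner_id, load_user):
        self.title = title
        self.owner_id = owner_id                      # what is actually stored
        self.owner = UserProxy(owner_id, load_user)   # what the code navigates

    def to_value(self):
        # Object -> value: only the reference is written, never the User itself.
        return {"title": self.title, "owner_id": self.owner_id}
```

This is the kind of plumbing the slide argues an application ends up writing by hand when the store only understands keys and values.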
  • 23. Serializable (NoSQL 1.0) ► You don’t write code in JSON or XML: Programming models need transformation ► Non-Vendor transformation limits: Create binary format value, cannot query it ► Not all programming structures are supported: Map -- Need to break down programming model; Lists -- Array need Serializable
  • 24. Reference System (NoSQL 1.0) ► Avoid object duplicates: Load a User’s Personal Blog, Search Tagged Blog ► Inconsistencies during runtime ► Materialization of bi-directional relations: Need to avoid circular references ► Load Blog*, blog has an Owner:User ► Load User, user has a Personal Blog* ► …..repeat
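One common way to address both concerns on slide 24 is an identity map, sketched below under assumed names (`ReferenceSystem`, and a record layout with `fields` and `refs`). Each stored id maps to exactly one in-memory object, and the object is registered before its references are followed, so the Blog → User → Blog cycle terminates instead of repeating.

```python
from types import SimpleNamespace


class ReferenceSystem:
    """Identity map: one in-memory object per stored id."""

    def __init__(self, records):
        # records: oid -> {"fields": {...}, "refs": {attribute_name: target_oid}}
        self._records = records
        self._loaded = {}

    def load(self, oid):
        if oid in self._loaded:
            return self._loaded[oid]        # no duplicate instances
        record = self._records[oid]
        obj = SimpleNamespace(oid=oid, **record["fields"])
        self._loaded[oid] = obj             # register BEFORE following references
        for name, target_oid in record["refs"].items():
            setattr(obj, name, self.load(target_oid))
        return obj
```

Because registration happens before the recursive loads, loading a blog and then its owner yields the same two objects no matter which one is requested first.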
  • 25. Need Lifecycle Tracking (NoSQL 1.0) ► New, Changed, Deleted: On store, update: Slow overhead to replace all objects ► If not dirty, do not traverse and update ► If new, add to the reference system ► If null, delete underlying element ► Need to manage the reference system
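The new/changed/deleted rules on slide 25 can be sketched as a small dirty-tracking layer. `Tracked` and `flush` are hypothetical names, and a plain dict stands in for the database; the point is only that a flush writes new and dirty objects, skips clean ones, and treats a `None` entry as a delete.

```python
class Tracked:
    """Object that records whether it is new, clean, or dirty."""

    def __init__(self, **fields):
        object.__setattr__(self, "_state", "new")
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if self._state == "clean":
            object.__setattr__(self, "_state", "dirty")


def flush(objects, store):
    """Write only new or dirty objects; return the number of store operations."""
    operations = 0
    for key, obj in list(objects.items()):
        if obj is None:
            # A null entry means the underlying element should be deleted.
            store.pop(key, None)
            del objects[key]
            operations += 1
        elif obj._state in ("new", "dirty"):
            store[key] = {k: v for k, v in vars(obj).items() if k != "_state"}
            object.__setattr__(obj, "_state", "clean")
            operations += 1
        # Clean objects are skipped: not dirty, so no traverse and no update.
    return operations
```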
  • 26. NoSQL 1.0 (observations) ► Mapping layer is forming: Why re-invent the wheel? ► ‘O’RM – Object Relational Mapping ► ‘O’DM – Object Document Mapping ► ‘O’CM – Object Column Mapping ► Software industry knows where this leads: Mapping complexity, brittle code base, non-agility ► The ‘O’ is what matters: ‘O’bject Lifecycle Management
  • 27. NoSQL 2.0 ► Leverage NoSQL 1.0 architectural shift: Scale out with performance ► Key partitioned data distribution ► The good stuff from NoSQL 1.0 ► Eliminate mapping complexity: Handle modern information models ► Eliminate domain model mapping ► Enable development agility ► Leverage existing enterprise skills: ‘O’ in a standard (e.g. JPA), without RM, DM, CM
  • 28. Verite Group Case Study
  • 29. Verite Group ► Value Proposition: Line Level I.P. Analytics ► Answers the question: What is happening? Not: What has happened? ► Activity Correlation: Capturing time-related sequences of activity, not capturing discrete “product” on the wire
  • 30. Verite Group ► Core netScope Use Case: Pipeline ► Monitor and capture: In-flight I.P. traffic content ► Apply target rules and populate meta models: High network traffic, content, equipment variation ► Present analyst visualization and alerts: Customize new target rules ► Insert into Pipeline and iterate
  • 31. Verite Group ► Technology Adoption Process ► IBM DB2 – Pure XML store: Driver: fast ingestion, excellent reg_exp query support; Failure: huge CPU issues pulling query results, analytic model too complex, need objects from results ► Hibernate – Postgres, MySQL: Driver: binary protocol to analytic model up front, Soft-Schema driven, still supports reg_exp query; Failure: data ingestion too slow, CPU max, high disk spin ► Versant – NoSQL 2.0: Driver: speed of data ingestion; Success: high speed data ingestion, low CPU, low disk spin, direct soft-schema storage, still supports reg_exp query, scale-out capability for large data analytics
  • 32. Verite Group ► Discovered Value, Lessons Learned ► Changing nature of analytics: Model-driven algorithmic, not iterative query; e.g. eliminated many reg_exp queries and moved to model; Significant increase in performance of analytic ► Operational efficiencies: Soft-Schema is database schema; Faster analytic model evolution (less DBA); Lower CPU cost to marshal type systems (mapping); Less disk space and fast I/O (less duplication, disk seeking)
  • 33. Q&A
  • 34. Contact: Robert Greene, Vice President, Technology | rgreene@versant.com | NoSQL Now! – Booth #14