NoSQL Simplified: Schema vs. Schema-less


A look at the many facets of schema-less approaches vs a rich schema approach, ranging from performance and query support to heterogeneity and code/data migration issues. Presented by Leon Guzenda, Founder, Objectivity



  1. The Database for Big Data Solutions. NoSQL Simplified: Schema vs Schema-less. Leon Guzenda & Nick Quinn. Meetup - February 20, 2014. © Objectivity, Inc. 2014
  2. Overview
 • Objectivity, Inc.
 • Pros & Cons:
     • Schema
     • Schema-less
 • What We Provide
 • A Compromise
  3. Objectivity, Inc.
 • Headquartered in San Jose, CA
 • Over two decades of NoSQL and Big Data experience
 • Enables complex data virtualization and Big Data solutions for the enterprise
 • Software products:
     • Objectivity/DB
     • InfiniteGraph
     • InfiniteGraph Social App
 • Embedded in hundreds of enterprises, government organizations and products, with millions of deployments.
  4. Objectivity/DB
 • Fully distributed object database.
 • Handles complex, highly inter-related data.
 • Extremely fast navigational access.
 • Scalable collections and B-Tree indices.
 • ACID transactions plus Multi-Reader, One Writer mode.
 • Highly scalable: Single Logical View plus simple servers.
 • Parallel Query Engine and Relationship Analytics.
 • Fully interoperable C++, C#, Java, Python and SQL++ on Windows, Unix, Linux and Mac OS X.
  5. ODBMS Deployments
 • Data Fusion
 • Big Science
 • Monitoring & Response
 • Telecom Infrastructure
 • Complex Financial Systems
  6. InfiniteGraph
 • Fully distributed graph database
 • High throughput and scalability
 • Extremely fast navigational access
 • ACID transactions for online operation
 • Relaxed consistency during batch-mode parallel ingest
 • Parallel queries
 • Flexible indexing, including Lucene for text
 • Java API and Gremlin support
  7. Graph DBMS - Finding The Links [Figure panels: OTHER DATABASE(S), GRAPH DATABASE]
  8. Objectivity’s Disruptive Big Data Architecture
 • Uses Data Virtualization to hide the nodes and focus on the connections
  9. Schema: Pros & Cons
  10. Who's Who?
 • Schema:
     • Network [CODASYL] databases - DDL [1972]
     • Relational Databases - Data Dictionary
     • Object Databases - ODMG'93
     • Most Graph Databases
 • Schema-less:
     • KSAM/ISAM/DSAM/ESAM
     • IMS (hierarchical)
     • Pick OS Database (hash-tables)
     • MUMPS (hierarchical array-storage)
     • MongoDB - a specialized JSON (and JSON-like) document store.
     • CouchDB - a JSON document store.
  11. Schema: Pros...
 • Global data definitions
 • Optimal access
 • Enables Query By Example
 • Interoperability
 • Schema change control
 • Schema contents can be manipulated via standard APIs and tools
  12. ...Schema: Pros
 • Global data definitions:
     • Data types and the relationships between them
     • Makes queries more efficient
     • Actions can be restricted by data type, field values, relationship types
 • Optimal access:
     • Used to determine how to best store, manage and access particular data types
 • Enables Query By Example by showing:
     • Types of information available
     • Relationships between them
 • Interoperability:
     • DBMS can change the shape of data items to suit the language/environment
 • Schema change control:
     • Can be used to enforce workflows that will keep applications and data in sync.
 • Schema contents can be manipulated via standard APIs and tools:
     • Easier learning curve
     • Uniform security controls:
         • The schema can use the same security controls as the data
     • Query and visualization tools can be used for both data and schema
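As a rough illustration of the "global data definitions" point (types defined once, with actions restricted by data type and field values), here is a minimal Python sketch. The `Field` and `Schema` classes and the `person` schema are hypothetical names invented for this example; they are not part of Objectivity/DB or any other product mentioned in the deck.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class Field:
    """One typed field in a (hypothetical) schema."""
    type_: type
    check: Optional[Callable[[Any], bool]] = None  # optional value constraint


class Schema:
    """A global data definition: field names mapped to types and constraints."""

    def __init__(self, name: str, fields: Dict[str, Field]) -> None:
        self.name = name
        self.fields = fields

    def validate(self, record: Dict[str, Any]) -> None:
        """Raise an exception if the record does not match the schema."""
        unknown = set(record) - set(self.fields)
        if unknown:
            raise KeyError(f"{self.name}: unknown fields {sorted(unknown)}")
        for key, field in self.fields.items():
            if key not in record:
                raise KeyError(f"{self.name}: missing field {key!r}")
            value = record[key]
            if not isinstance(value, field.type_):
                raise TypeError(
                    f"{self.name}.{key}: expected {field.type_.__name__}, "
                    f"got {type(value).__name__}"
                )
            if field.check is not None and not field.check(value):
                raise ValueError(f"{self.name}.{key}: value {value!r} rejected")


# Hypothetical schema, defined once and shared by every application.
person = Schema("Person", {
    "name": Field(str),
    "age": Field(int, check=lambda n: 0 <= n < 150),
})

person.validate({"name": "Ada", "age": 36})        # accepted

try:
    person.validate({"name": "Ada", "age": "36"})  # wrong type for age
except TypeError as err:
    print(err)
```

Because the definition lives in one place, every writer is checked the same way. A schema-less store would accept both records and leave the mismatch for readers to discover later.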
  13. Schema: Cons
 • The database designer and application developers have to create and maintain the schema.
 • Applications have to be kept in sync with schema changes.
 • Applications and programmers have to be aware of data types
     • Though this is one of the major claimed advantages of object-oriented programming.
 • There is a perceived loss of flexibility
     • Though this is more a function of the user interface to the database than the underlying mechanisms.
  14. Schema-less: Pros…
 • Flexibility
 • Can be more tolerant of variable Acidity and Consistency models
 • Ease of use and maintenance
  15. …Schema-less: Pros
 • Flexibility - Users can, in theory:
     • Put any kind of data into the system
     • Create new kinds of relationships between things (in a few products)
     • Find data without worrying about the types of data involved.
 • Can be more tolerant of variable Acidity and Consistency models
 • Ease of use and maintenance:
     • No need to worry about data types
     • No need for a DBA
     • Applications will [probably] work when new data arrives
  16. Schema-less: Cons…
 • Confusion
 • Performance suffers
 • Poor integrity
 • Ambiguity
  19. Schema-less: Cons
 • Apparent tolerance of variable CAP models is actually orthogonal to the schema vs schema-less debate [as is support for sharding].
 • Performance suffers
 • Integrity is practically non-existent
     • Maintaining referential integrity is hard
     • Queries may misinterpret values within an object
         • 54686973206973206120737472696e6720706c7573206120666c6f6174696e6720706f696e74206e756d62657258585858706c757320616e6f7468657220737472696e67 [slide callout: Floating Point]
     • A ZIP code may be stored as an integer (01234) or a string (“01234”) in JSON, causing query and display problems.
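Both integrity examples on this slide can be reproduced in a few lines of Python. The sketch below decodes the hex string from the slide and then compares the two ZIP code encodings. The choice of a little-endian 32-bit float for the four `58` bytes is an assumption made for illustration; the slide only labels that region "Floating Point".

```python
import json
import struct

# Hex string copied from the slide, rejoined into one run of digits.
RAW = bytes.fromhex(
    "54686973206973206120737472696e6720706c7573206120666c6f"
    "6174696e6720706f696e74206e756d62657258585858706c757320"
    "616e6f7468657220737472696e67"
)

# Without a schema, every reader has to guess the layout. Read as ASCII text,
# the four 0x58 bytes simply show up as "XXXX".
as_text = RAW.decode("ascii")
print(as_text)

# Assumption for this sketch: those four bytes hold a little-endian float32.
FLOAT_OFFSET = as_text.index("XXXX")
(as_float,) = struct.unpack_from("<f", RAW, FLOAT_OFFSET)
print(FLOAT_OFFSET, as_float)

# ZIP codes: a JSON number cannot carry a leading zero at all.
try:
    json.loads('{"zip": 01234}')
    leading_zero_ok = True
except json.JSONDecodeError:
    leading_zero_ok = False

as_number = json.loads('{"zip": 1234}')["zip"]      # best an integer field can do
as_string = json.loads('{"zip": "01234"}')["zip"]

print(leading_zero_ok)           # False
print(as_number == as_string)    # False: 1234 never matches "01234"
print(f"{as_number:05d}")        # display needs out-of-band knowledge of the width
```

The same bytes are a valid string and a valid number at once, and nothing in the data says which reading is intended. A schema records that decision once, so the query engine and display code do not have to guess.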
  20. The NoSQL Players [Figure: products grouped by category; * marks fully or partially schema-less]
 • Category labels: Operational, Object/Graph, Key-Value, Document, Column Family
 • Products named: Intersystems, MarkLogic, McObject, Objectivity/DB, Progress, Versant, Berkeley DB, Cassandra, Redis, Riak, Voldemort, AppEngine, Cloudant, CouchDB, MongoDB, RavenDB, Couchbase, AllegroGraph, InfiniteGraph, Neo4j, Titan, HBase, HyperTable, SimpleDB
  21. A Compromise
 • Provide Flexibility With The Advantages Of Having A Schema
  22. Objectivity/DB Schema Usage
 • Has an internal schema in its system database (the Federated DB).
 • User schemas are created and updated by:
     • Creating .ddl files and pre-processing them with the DDL processor.
     • Creating and compiling Java, C# or Python header files.
     • Declaring or dynamically creating/modifying Smalltalk classes (defunct).
     • Declaring and changing table definitions with Objectivity/SQL++.
 • SQL++ table/column definitions are updated automatically when classes are declared or modified using other languages.
     • This allows SQL++ to access C#, C++, Java and Python objects and vice-versa.
 • A Federated Database can contain multiple named Schemas:
     • Reduces re-compilation and re-building after a localized schema change.
     • May facilitate security mechanisms in the future.
  23. Objectivity Active Schema
 • API and tools for creating, modifying, reading and deleting class definitions, which include association (relationship) definitions.
     • If used with a dynamic language, such as Smalltalk, creating or modifying a class doesn't need to affect existing programs.
     • In general, only generic access (via the ooObj base class) can be used without creating the files needed to recompile programs and methods for accessing the new object types.
 • Helps application developers build tools that need to access the schema, e.g.:
     • Graphical query tools
     • Highly flexible object modeling capabilities for end users.
 • An end-user, such as a field technician or an analyst:
     • Can add local object classes, populate, maintain and query them, but...
     • Cannot interfere with the correct operation of the pre-built applications.
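The idea of a schema that can be read and changed at run time, while pre-built classes stay protected, can be sketched generically in Python. This is not the Active Schema API: `SchemaRegistry`, its methods, and the `Site` and `FieldNote` classes are hypothetical names made up for this illustration.

```python
from typing import Any, Dict, Set


class SchemaRegistry:
    """Hypothetical run-time schema: class names mapped to field types."""

    def __init__(self) -> None:
        self._classes: Dict[str, Dict[str, type]] = {}
        self._locked: Set[str] = set()

    def define(self, name: str, fields: Dict[str, type],
               locked: bool = False) -> None:
        """Create or modify a class definition, unless it is locked."""
        if name in self._locked:
            raise PermissionError(f"{name} belongs to a pre-built application")
        self._classes[name] = dict(fields)
        if locked:
            self._locked.add(name)

    def describe(self, name: str) -> Dict[str, type]:
        """Read a definition, as a query or modeling tool would."""
        return dict(self._classes[name])

    def new(self, name: str, **values: Any) -> Dict[str, Any]:
        """Build a generic object, checked against the current definition."""
        fields = self._classes[name]
        for key, value in values.items():
            if key not in fields:
                raise KeyError(f"{name} has no field {key!r}")
            if not isinstance(value, fields[key]):
                raise TypeError(f"{name}.{key} expects {fields[key].__name__}")
        return {"__class__": name, **values}


registry = SchemaRegistry()

# Class shipped with the pre-built application: end users cannot change it.
registry.define("Site", {"name": str, "lat": float, "lon": float}, locked=True)

# Class added locally by an end user, then extended without recompiling anything.
registry.define("FieldNote", {"site": str, "text": str})
registry.define("FieldNote", {"site": str, "text": str, "priority": int})
note = registry.new("FieldNote", site="Depot 7", text="Gate open", priority=2)

try:
    registry.define("Site", {"name": str})  # would break the shipped application
except PermissionError as err:
    print(err)
```

Objects built through `new` are plain dictionaries, which loosely mirrors the slide's point that new types are reachable only through generic access until typed bindings are generated.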
  24. Use Cases
  25. Use Case 1 - Intelligence Gathering Framework (1 of 2)
 • An integrated application development framework that focuses on adaptability.
 • Dynamic modeling of entities, services and workflows.
 • Versioning and temporality features support system evolution.
 • The screenshots show a location that is under surveillance and everything known about it in the database.
  26. Use Case 1 - Intelligence Gathering Framework (2 of 2)
 • Eliminates the mapping layer between the user defined objects and the database.
 • Performance and scalability.
 • Active Schema facilitates object migration.
 • [Figure labels: Design and Information Feeds of Users Database]
  27. Use Case 2 - GDMO Framework
 • Operations, Administration, and Maintenance interface for the CDMA system RF infrastructure
 • Controls the Base Station Controller and Base Station Transceiver Subsystem
 • GDMO* Schema and CMIP agent-manager messaging
 • A SPARC-based BSC rack supports a peak load of 150,000 simultaneous callers
 • Deployed in CDMA networks worldwide, including SprintPCS
 * GDMO is the Guidelines for the Definition of Managed Objects
  28. Use Case 3 - Ontology Framework
 • Uses standard objects to define a metaschema
 • It is used to define concept templates
 • They can be inherited from, combined or extended to support a “class specification”
 • The data is combined with Horn Logic to build complex ontologies.
 • [Figure labels: SCHEMA, CONCEPT, LOGIC, CLASS, COMPONENTS, RELATIONSHIP, STRUCT, ARRAY, FIELD]
  29. Summary
 • Don’t confuse CAP issues with Schema considerations
 • Schemas make the DBMS more powerful
 • Schema-less architectures are more flexible
 • It’s possible to build flexible systems with Schema-based infrastructure
  30. THANK YOU
 • Please visit objectivity.com for:
     • Features
     • Use Cases
     • White Papers
     • Free downloads (60 day evaluation)
     • Sample Applications
     • Application Developer’s Wiki
 • For further information:
     • Email: info@objectivity.com
