
Why nosql also_why_somany


There are many NoSQL databases. This article, written by Prashanth B Panduranga, provides an overview of NoSQL databases, introducing Big Data, ACID, BASE, types of NoSQL databases, NewSQL, and more.

Published in: Technology


Why NOSQL? Ok. But, Why So many?
December 23rd, 2013
www.aditi.com

TABLE OF CONTENTS

1. INTRODUCTION
2. WHY NOSQL?
3. WHY SO MANY?
4. TYPES OF NOSQL DATABASES
5. KEY ASPECTS
6. NEWSQL
7. CONCLUSION
1. INTRODUCTION

NOSQL ("Not Only SQL") is a term generally used for data stores that are not SQL-centric relational databases.

2. WHY NOSQL?

Necessity is the mother of invention. Here is a look at what prompted the creation of NOSQL databases:

1. Exorbitant growth of data:
   a. Large datasets become onerous when stored in relational databases
   b. Query execution time increases, creating performance bottlenecks
2. Data model/structure mismatch: storing hierarchical, graph, or relationship data as rows and columns is highly inefficient, and so is storing serialized objects
3. The introduction of distributed caching infrastructure on top of relational data storage for performance, and the consistency problems that come with it
4. Heavy usage of blob storage, which defeats the purpose of a relational store
5. Massive scale-out
6. High availability: the ability to always write, with massive write performance and small, continuous, volatile reads and writes
7. Need for faster key-value access
8. Difficulty in handling volatility in schema and data types, some of it due to changes in the business and some due to data acquisition
9. Complexity in partitioning/sharding, which is done mostly for manageability, performance, or availability
10. Performance in large databases
11. Relational databases are too generic; there is a need for specialist databases
12. Cost-based optimization, though it simplified things for naïve developers, is unpredictable, more so when resource-intensive queries are executed concurrently
13. Resource contention, resource concurrency, blocking queries, index updates, and concurrent disk activity such as log backups and checkpointing

Is NOSQL the answer to everything stated above? No, but it certainly helps in resolving a few of these issues. What NOSQL promises, in short, is high performance and flexibility with high availability and scalability.

3. WHY SO MANY?

What NOSQL databases do not promise is ACID. NOSQL database implementations vary in the consistency semantics they conform to; most tend to conform to BASE.
Let’s look at what they are.

ACID

“Atomic: All operations in a transaction succeed or every operation is rolled back.
Consistent: On transaction completion, the database is structurally sound.
Isolated: Transactions do not contend with one another. Contentious access to state is moderated by the database so that transactions appear to run sequentially.
Durable: The results of applying a transaction are permanent, even in the presence of failures” - Wikipedia

BASE

“Basic availability: The store appears to work most of the time.
Soft-state: Stores don’t have to be write-consistent, nor do different replicas have to be mutually consistent all the time.
Eventual consistency: Stores exhibit consistency at some later point (e.g., lazily at read time)” - O’Reilly

It is important to note that not all NOSQL databases conform to eventual consistency.

Apart from the need for specialist databases supporting specialised data structures, let’s look at the CAP theorem:

“The CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
Consistency (all nodes see the same data at the same time)
Availability (a guarantee that every request receives a response about whether it was successful or failed)
Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
According to the theorem, a distributed system cannot satisfy all three of these guarantees at the same time” - Wikipedia

With drastically different business dynamics and priorities among enterprises, NOSQL databases tend to pick two of the above-mentioned characteristics. Given the need for flexibility in data structure, a multitude of NOSQL databases are being introduced.

[Figure: the multitude of NOSQL databases. Data reference: http://nosql-database.org/]
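The soft-state and eventual-consistency ideas in BASE can be illustrated with a toy model. The following is a minimal Python sketch, not any particular database's implementation: the `Replica` class, the `anti_entropy` function, the integer version counter, and the last-write-wins rule are all assumptions made for illustration.

```python
class Replica:
    """One node holding its own copy of the data (hypothetical toy class)."""

    def __init__(self, name):
        self.name = name
        # key -> (version, value); the version is a simple logical counter
        self.data = {}

    def write(self, key, value, version):
        # Last-write-wins (an assumed rule): keep the incoming value
        # only if it carries a newer version than the stored one.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replicas):
    """Push every replica's entries to all replicas so they converge."""
    for source in replicas:
        # Snapshot the items so the dict is not mutated while iterating
        for key, (version, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, version)


a, b = Replica("a"), Replica("b")
a.write("cart:42", ["book"], version=1)
b.write("cart:42", ["book"], version=1)

# A newer write reaches replica a only; b is temporarily out of date
a.write("cart:42", ["book", "pen"], version=2)
stale = b.read("cart:42")

# After a background sync, both replicas agree again
anti_entropy([a, b])
converged = b.read("cart:42")
```

Between the second write and the sync, a read from `b` returns the older cart (soft state); after the sync, both replicas return the same value (eventual consistency). Real stores use more careful mechanisms for versioning and conflict resolution than the integer counter assumed here.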
4. TYPES OF NOSQL DATABASES

1. Wide Column Store (Column Families): The data model stores columns of data together instead of rows, and is optimized for queries over large datasets
2. Document Store: Pairs each key with a complex data structure known as a document. Documents can contain many different key-value pairs, key-array pairs, or even nested documents
3. Key Value/Tuple Store: Every single item in the database is stored as an attribute name (or "key") together with its value
4. Graph Databases: A graph is a set of nodes and the relationships that connect them. Some graph databases use native graph storage, while others serialize the graph data and store it in a relational, object, or other data store
5. Multi-Model Databases: Serve multiple data models
6. Object Databases: Data is persisted in the form of objects
7. Grid and Cloud Database Solutions: Data is persisted across multiple servers that work together to manage information and related operations
8. XML Databases: Data is persisted in XML format
9. Multidimensional Databases: A type of database that is optimized for data warehouse and online analytical processing (OLAP) applications
10. Multi-Value Databases: Data is persisted as keys and multiple values; they have features that support and encourage the use of attributes which can take a list of values, rather than all attributes being single-valued
11. Event Sourcing: Persists an application's state by storing the history that determines the current state of the application

5. KEY ASPECTS

1. NOSQL is not an all-in-one solution; the scenarios mentioned above naturally fit NOSQL semantics. NOSQL is certainly not a replacement for relational stores
2. Consider NOSQL for real-time analytics on operational data
3. Consider NOSQL when many systems are involved, including ones that stream data
4. NOSQL databases provide a linear approach to database scaling, making scaling easier and more intuitive
5. All NOSQL databases are developed to be distributed, scalable databases
6. Data duplication and denormalization are the norm
7. Consider NOSQL for hierarchical data, content caching, distributed file systems, social networking, recommendation engines, and graph-like data
8. NOSQL databases can support unstructured and unpredictable data
9. NOSQL databases use a cluster of servers to store data. Data and operations are usually spread across clusters
10. Consider NOSQL databases which provide integrated caching
11. NOSQL is developed for continuous availability
12. Certain NOSQL implementations provide configurable consistency models (strong vs eventual), but this will have performance implications
13. Only a few NOSQL databases support ACID
14. Only a few NOSQL databases support transactions
15. Consider NOSQL databases when you have large amounts of data, large enough to not fit on one physical server
16. Consider a NOSQL database when you have an object-relational impedance mismatch
17. NOSQL databases trade off consistency for efficiency
18. Consider NOSQL databases when you need schema flexibility
19. Consider a NOSQL database if you are looking for massive write performance
20. Consider a NOSQL database if you are looking for fast key-value access
21. NOSQL provides horizontal scaling

6. NEWSQL

“NewSQL is a class of modern relational database management systems that seek to provide the same scalable performance of NoSQL systems for online transaction processing (read-write) workloads while still maintaining the ACID guarantees of a traditional database system” - Wikipedia

As we have seen above, NOSQL databases have been developed to serve different purposes, with one of the main advantages being scale-out. NewSQL is an attempt to provide all the benefits of NOSQL while continuing to support ACID. Google Spanner is one of the main contenders, with a semi-relational data model, while NuoDB achieves it by splitting the transactional (in-memory) tier from the storage tier, accompanied by peer-to-peer coordination.

http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf
http://www.nuodb.com/explore/newsql-cloud-database-how-it-works/

7. CONCLUSION

Be it mergers and acquisitions, changes in business dynamics, or agility in development, large enterprises are bound to have hybrid solutions. Having multiple RDBMSs, data warehouses, and data marts in one environment is not unseen or unheard of. It is more than likely that enterprises will add NOSQL/NewSQL databases to the mix. Be on the lookout for true shared-nothing distributed architectures!
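To make the horizontal scaling and shared-nothing ideas above more concrete, here is a minimal Python sketch of hash-based key partitioning, in which each shard owns a disjoint slice of the keys and shares no state with the others. The `shard_for` function, the `ShardedStore` class, and the choice of four in-memory shards are hypothetical names and values chosen for illustration; they do not describe how any specific NOSQL or NewSQL product routes data.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store; each dict stands in for an independent node."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # Route the write to the single shard that owns this key
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        # Route the read to the same shard; no other shard is consulted
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(num_shards=4)
for user_id in range(1000):
    store.put(f"user:{user_id}", {"id": user_id})

# Number of keys that landed on each shard
sizes = [len(shard) for shard in store.shards]
```

Because every key is routed by its hash alone, a read or write touches exactly one shard. Note that with this simple modulo scheme, changing the number of shards remaps most keys, which is why distributed stores commonly use techniques such as consistent hashing instead.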
Prashanth B Panduranga (Shan) Director-Technology | 725-976-7006 | pandurangap@aditi.com
