NoSQL databases

This presentation explains why NoSQL databases emerged even though SQL databases have been a successful technology for more than twenty years. It also discusses the characteristics and classification of NoSQL databases. Finally, the slides briefly cover four types of NoSQL databases.

Published in: Data & Analytics

  • Enterprise Resource Planning; Customer Relationship Management

    1. NoSQL Database: New Era of Databases for Big data Analytics - Classification, Characteristics and Comparison. A B M Moniruzzaman and Syed Akhter Hossain. 03/04/14, CSC 8710
    2. Contents
       • NoSQL databases definition
       • Why NoSQL databases?
       • Characteristics of NoSQL Databases
       • Primary Uses of NoSQL Database
       • Key-Value databases
       • Document databases
       • Column-Family databases
       • Graph databases
       • Adoption of NoSQL Database
       • Conclusion
    3. NoSQL Database
       • NoSQL, short for "Not Only SQL", refers to an eclectic and increasingly familiar group of non-relational data management systems.
       • These databases are not built primarily on tables and generally do not use SQL for data manipulation.
       • NoSQL systems are distributed, non-relational databases designed for large-scale data storage and for massively parallel data processing across a large number of commodity servers.
    4. NoSQL Database
       • They also use non-SQL languages and mechanisms to interact with data.
       • NoSQL database systems arose alongside major Internet companies such as Google, Amazon, and Facebook, which faced challenges in dealing with huge quantities of data.
       • In contrast to traditional DBMSs and data warehouses, these systems are designed to scale to thousands or millions of users performing updates as well as reads.
    5. Why NoSQL?
       • Relational DBMSs have been a successful technology for many years, providing persistence, concurrency control and integration mechanisms.
       • The need to process large amounts of data shifted the direction from scaling vertically (a bigger single server) to scaling horizontally on clusters.
    6. Why NoSQL?
       • NoSQL databases focus on analytical processing of large-scale datasets, offering increased scalability over commodity hardware.
       • Organizations that collect large amounts of unstructured data are increasingly turning to non-relational (NoSQL) databases.
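The idea of scaling horizontally on a cluster can be illustrated with a minimal sketch of hash-based partitioning. The node names, the `node_for_key` and `distribute` helpers, and the modulo placement scheme are assumptions made for illustration; they are not taken from the slides, and real systems typically use more elaborate schemes.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick the node responsible for a key by hashing the key (modulo sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def distribute(keys, nodes):
    """Group keys by the node that would store them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


# Hypothetical three-node cluster of commodity servers.
nodes = ["node-a", "node-b", "node-c"]
keys = ["user:%d" % i for i in range(1000)]
placement = distribute(keys, nodes)

for node, stored in placement.items():
    print(node, len(stored))
```

Adding capacity in this model means adding entries to `nodes` rather than buying a larger server, which is the vertical-to-horizontal shift the slides describe. Plain modulo placement moves many keys when the node list changes, so it is only a starting point.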
    7. Big Data
    8. Characteristics of NoSQL Databases
       • Strong consistency: all clients see the same version of the data.
       • High availability: data is always available; at least one copy of the requested data can be served even if one of the nodes is down.
       • Partition tolerance: the system as a whole keeps its characteristics even when deployed across different servers.
    9. Characteristics of NoSQL Databases
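The availability property listed above (a copy of the data can still be served when one node is down) can be sketched with a toy replicated store. The `ReplicatedStore` class, its method names, and the write-to-all-live-replicas policy are hypothetical and chosen only for illustration.

```python
class ReplicatedStore:
    """Toy store that keeps a copy of every value on several replicas."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.up = [True] * replica_count

    def fail(self, index):
        """Simulate a node going down."""
        self.up[index] = False

    def put(self, key, value):
        """Write the value to every replica that is currently up."""
        for index, replica in enumerate(self.replicas):
            if self.up[index]:
                replica[key] = value

    def get(self, key):
        """Read from the first live replica that holds the key."""
        for index, replica in enumerate(self.replicas):
            if self.up[index] and key in replica:
                return replica[key]
        raise KeyError(key)


store = ReplicatedStore(replica_count=3)
store.put("cart:42", ["book", "pen"])
store.fail(0)                 # one node goes down
print(store.get("cart:42"))   # still served by a surviving replica
```

In this sketch a replica that is down during a write simply misses it, which hints at why availability and strong consistency pull against each other in a distributed system.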
    10. Primary Uses of NoSQL Database
        1. Large-scale data processing
        2. Exploratory analytics on semi-structured data (expert level)
        3. Large-volume data storage
    11. Classification of NoSQL Databases
       • Key-Value databases
       • Document databases
       • Column-Family databases
       • Graph databases
    12. Key-Value Databases
       • These systems store items under alphanumeric identifiers (keys); each key has associated values.
       • The values can be simple text strings or more complex lists and sets.
       • Searches are performed only against keys and are limited to exact matches.
       • Searches cannot be performed against values.
    13. Key-Value Databases
    14. Key-Value Characteristics
       • The simplicity of key-value stores makes them very quick and lightweight.
       • They offer highly scalable retrieval of the values needed for application tasks, such as retrieving product names.
       • This is why Amazon uses a key-value system, Dynamo, for its shopping cart. Dynamo is a highly available key-value storage system.
       • Examples: Dynamo (Amazon), Voldemort (LinkedIn), Redis, BerkeleyDB, Riak
    15. Pros and Cons
       • Pros: anything can be stored in an aggregate.
       • Cons: the only access path is a key lookup that returns the entire aggregate (no query mechanism and no way to retrieve part of an aggregate).
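The key-value model described in the slides above can be sketched as a thin wrapper around a dictionary. The `KeyValueStore` class and the sample keys are hypothetical; the sketch only mirrors the stated properties: values are opaque aggregates, and lookup is by exact key only.

```python
class KeyValueStore:
    """Toy key-value store: opaque values, exact-match key lookup only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Any value can be stored; the store does not inspect it.
        self._data[key] = value

    def get(self, key):
        # Exact match on the key; raises KeyError if the key is absent.
        return self._data[key]

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("product:1001", {"name": "Kettle", "price": 24.99})
store.put("cart:alice", ["product:1001", "product:2002"])

# The whole aggregate comes back; picking out one field happens in the application.
product = store.get("product:1001")
print(product["name"])
```

There is deliberately no method for searching by value or by key prefix, which is the limitation listed under the cons.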
    16. Document Database
       • Designed to manage and store documents.
       • These documents are encoded in a standard data exchange format such as XML, JSON (JavaScript Object Notation) or BSON (Binary JSON).
    17. Document Database
    18. Primary Uses
       • Document databases are good for storing and managing Big Data-size collections of literal documents, such as text documents and email messages.
    19. Pros and Cons
       • Pros: allow structured queries and partial aggregate retrieval based on the fields in the aggregate.
       • Cons: imposes a limit on what can be placed in a database.
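The document model and its trade-offs can be sketched with a small in-memory store that keeps each document as JSON text. The `DocumentStore` class, its `insert` and `find` methods, and the sample email documents are hypothetical and exist only to illustrate field-based queries and partial retrieval.

```python
import json


class DocumentStore:
    """Toy document store: documents are kept as JSON text and queried by field."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, document):
        # json.dumps raises TypeError for values that JSON cannot represent,
        # which is one way the format limits what can be stored.
        encoded = json.dumps(document)
        doc_id = self._next_id
        self._docs[doc_id] = encoded
        self._next_id += 1
        return doc_id

    def find(self, criteria, fields=None):
        """Return documents whose fields equal the criteria, optionally projected."""
        results = []
        for doc_id in sorted(self._docs):
            doc = json.loads(self._docs[doc_id])
            if all(doc.get(name) == value for name, value in criteria.items()):
                if fields is not None:
                    doc = {name: doc[name] for name in fields if name in doc}
                results.append(doc)
        return results


emails = DocumentStore()
emails.insert({"from": "ann@example.com", "subject": "Hello", "body": "Hi there"})
emails.insert({"from": "bob@example.com", "subject": "Report", "body": "Attached"})
emails.insert({"from": "ann@example.com", "subject": "Minutes", "body": "See notes"})

# Structured query on a field, retrieving only part of each document.
print(emails.find({"from": "ann@example.com"}, fields=["subject"]))
```

Unlike the key-value case, the store understands the structure of what it holds, so it can filter and project by field; the price is that every document must fit the encoding.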
    20. Column-Family Databases
       • Each entry is a key-value pair in which the value consists of a set of columns.
       • Column-family databases are represented as tables, with each key-value pair being a row.
       • All related data can be grouped together as one family.
    21. Primary Uses
        1. Large-scale, batch-oriented data processing: sorting, parsing and conversion (for example, conversions between hexadecimal, binary and decimal code values).
        2. Exploratory and predictive analytics performed by expert statisticians and programmers.
    22. Column-Family
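The column-family layout described above (a row key whose value is a set of columns, grouped into families) can be sketched with nested dictionaries. The `ColumnFamilyStore` class, the family names `profile` and `activity`, and the sample values are hypothetical.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy column-family store: row key -> column family -> column -> value."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Return a copy of all columns in one family of a row (empty if absent)."""
        return dict(self._rows.get(row_key, {}).get(family, {}))


users = ColumnFamilyStore()
users.put("user:1", "profile", "name", "Ann")
users.put("user:1", "profile", "city", "Detroit")
users.put("user:1", "activity", "last_login", "2014-03-04")

# Related columns are read together as one family; other families are not touched.
print(users.get_family("user:1", "profile"))
```

Rows need not share the same columns, so a second row could add a column that the first one lacks without any schema change.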
    23. Graph Databases
       • Graph databases replace relational tables with structured relational graphs of interconnected key-value pairings.
       • Graph databases are useful when you are more interested in the relationships between data than in the data itself, which makes them a good fit for social networks.
       • They are optimized for traversing relationships, not for querying.
       • Examples: Neo4j, InfoGrid, Sones GraphDB, AllegroGraph, InfiniteGraph
    24. Graph Databases
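Relationship traversal, the operation graph databases are said to be optimized for, can be sketched as a breadth-first walk over an adjacency structure. The `SocialGraph` class, its methods, and the sample names are hypothetical; this is an in-memory illustration, not how any of the listed products store data.

```python
from collections import deque


class SocialGraph:
    """Toy undirected graph of people connected by friendships."""

    def __init__(self):
        self._edges = {}

    def add_friendship(self, a, b):
        self._edges.setdefault(a, set()).add(b)
        self._edges.setdefault(b, set()).add(a)

    def within_hops(self, start, max_hops):
        """Return everyone reachable from start in at most max_hops friendships."""
        seen = {start}
        reached = set()
        queue = deque([(start, 0)])
        while queue:
            person, hops = queue.popleft()
            if hops == max_hops:
                continue  # do not expand beyond the hop limit
            for friend in self._edges.get(person, ()):
                if friend not in seen:
                    seen.add(friend)
                    reached.add(friend)
                    queue.append((friend, hops + 1))
        return reached


graph = SocialGraph()
graph.add_friendship("ann", "bob")
graph.add_friendship("bob", "cy")
graph.add_friendship("cy", "dee")

# Friends and friends-of-friends of ann.
print(sorted(graph.within_hops("ann", 2)))
```

Each step follows stored connections directly, whereas a relational design would usually need one join per hop.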
    25. Adoption of NoSQL Database
       • Organizations that have massive data storage needs are looking seriously at NoSQL.
       • NoSQL database experts are in high demand at most developing organizations.
       • The next graph shows job trends for five NoSQL databases from Indeed.com.
    26. Job Trends of Five NoSQL Databases
    27. Adoption of NoSQL Database
       • MongoDB's growth means that it has cemented its place as the most popular NoSQL database.
       • According to LinkedIn profile mentions, the mentions of NoSQL technologies form 45% in LinkedIn profiles.
    28. LinkedIn statistics
    29. Conclusion
       • The computational and storage requirements of applications such as Big Data analytics, Business Intelligence and social networking over petabyte datasets led to the change from SQL to NoSQL databases.
       • This led to the development of horizontally scalable, distributed, non-relational NoSQL databases.
       • MongoDB is the most in-demand one.
    30. Resources
       • http://arxiv.org/ftp/arxiv/papers/1307/1307.0191.pdf
       • http://en.wikipedia.org/wiki/Column_family
       • http://en.wikipedia.org/wiki/NoSQL