NoSQL Database Lab (UGIT6P1420)
B.Tech III Year II Semester (A&B Sections) Academic Year: 2023-23
Faculty Instructor: Dr. Chandra Sekhar Kolli
•Prerequisites:
1. Proficiency in understanding database query language.
2. Basic understanding of command line interfacing
•Course Outcomes:
•Upon the completion of the course, the students will be able to:
1. Illustrate the installation and setup of necessary NoSQL components (L4)
2. Apply the database querying operations on NoSQL databases (L3)
3. Analyze various operations like Indexing, importing and Aggregation in
NoSQL databases (L4)
List of Experiments (Prescribed Syllabus)
• A Database Management System (DBMS) provides the mechanism to store and retrieve data.
• There are different kinds of Database Management Systems:
1. RDBMS (Relational Database Management Systems)
2. OLAP (Online Analytical Processing)
3. NoSQL (Not only SQL)
Introduction
SQL Databases - class of RDBMS
RDBMS
• RDBMS : Relational Database Management Systems
• Relation : a relation is a 2D Table which has the following features
• Name
• Attributes
• Tuples
SNO        SNAME   BRANCH  SECTION
190030001  ANIL    CSE     S1
190030002  RAJESH  CSE     S2
190030003  MADHU   CSE     S1
(The column headings are the ATTRIBUTES; each row is a TUPLE.)
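As a rough illustration, the relation above can be built with Python's built-in sqlite3 module. The slide does not name the relation, so the table name STUDENT and the column types are assumptions made for this sketch.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Name of the relation: STUDENT (assumed). Attributes: the four columns.
conn.execute(
    "CREATE TABLE STUDENT ("
    "SNO INTEGER PRIMARY KEY, SNAME TEXT, BRANCH TEXT, SECTION TEXT)"
)

# Each row inserted here is one tuple of the relation.
tuples = [
    (190030001, "ANIL", "CSE", "S1"),
    (190030002, "RAJESH", "CSE", "S2"),
    (190030003, "MADHU", "CSE", "S1"),
]
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?, ?)", tuples)

# List the attributes (column names) of the relation.
attributes = [row[1] for row in conn.execute("PRAGMA table_info(STUDENT)")]

# Query the tuples whose SECTION is S1.
s1_names = [
    row[0]
    for row in conn.execute(
        "SELECT SNAME FROM STUDENT WHERE SECTION = ? ORDER BY SNO", ("S1",)
    )
]

print(attributes)
print(s1_names)
```

The schema has to exist before any tuple can be inserted, which is the first limitation listed on the next slide.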
Limitations of RDBMS
• The structure and schema of the data must be defined first.
• Provides consistency and integrity of data by enforcing ACID properties (Atomicity,
Consistency, Isolation and Durability).
• Enforcing these properties adds significant performance overhead and can slow down
database responses.
• Many applications store their data in JSON format, and an RDBMS does not provide a
convenient way to create, insert, update or delete such data.
• Scaling becomes an issue when the dataset is very large, e.g., Big Data.
• Supports only structured (tabular) data.
• Not designed to be distributed: the software and infrastructure are set up to work best in
a single, centralized location rather than spread across multiple locations or servers.
• Supports vertical scaling only, with no support for horizontal scaling.
• Vertical scaling adds more power (CPU, RAM) to an existing machine, while horizontal
scaling adds more machines or nodes to a system.
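Horizontal scaling needs a rule for deciding which node holds which record. A minimal sketch, assuming simple hash-modulo sharding; the function shard_for and the node names are hypothetical and not part of any particular database.

```python
import hashlib


def shard_for(key, nodes):
    """Pick the node responsible for a key by hashing the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


# Three machines sharing the data (hypothetical node names).
nodes = ["node-0", "node-1", "node-2"]

# Every record is placed on exactly one node, chosen by its key.
placement = {
    sno: shard_for(sno, nodes)
    for sno in ["190030001", "190030002", "190030003"]
}

for sno, node in placement.items():
    print(sno, "->", node)
```

With plain modulo hashing, adding a node changes the placement of most keys; production systems commonly use consistent hashing or a similar scheme to limit that reshuffling.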
Centralized Vs. Distributed
Scalability
NoSQL - Introduction
• Stands for "Not Only SQL": a non-relational database (no tables).
• An alternative to traditional relational databases.
• A NoSQL database doesn't use tables for storing data.
• A flexible database used for Big Data and real-time web apps.
• Multiple types of NoSQL databases are available.
• Especially useful for working with large sets of distributed data.
• Handles large volumes of unstructured, semi-structured, structured and
unpredictable data.
NoSQL Databases
• Key-Value Store
(e.g. Riak, Redis, MemcacheDB)
• Column Family Store / Tabular
(e.g. HBase, Apache Cassandra)
• Document-Oriented Database
(e.g. MongoDB, CouchDB)
• Graph Database
(e.g. Neo4J, HyperGraphDB, InfoGrid)
Types of NoSQL
Types of NoSQL Databases
Key-Value Pair Based
• Data is stored in key/value pairs.
• Designed to handle large amounts of data and heavy load.
• Key-value databases store data as a hash table where each key is unique, and the
value can be JSON, a BLOB (Binary Large Object), a string, etc.
• Used as collections, dictionaries, associative arrays, etc.
• They work well for data such as shopping cart contents.
• Examples: Redis, Dynamo, Riak.
• Riak is modelled on Amazon’s Dynamo; Redis is an independent, in-memory design.
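The key/value model can be sketched with an ordinary Python dictionary. The class KeyValueStore, the key format "cart:<id>" and the cart contents are hypothetical; this is not the API of Redis, Dynamo or Riak.

```python
import json


class KeyValueStore:
    """A toy in-memory key-value store backed by a hash table (dict)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Each key is unique: writing to an existing key overwrites its value.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()

# The value is opaque to the store; here it is a JSON string.
store.put("cart:190030001", json.dumps({"items": [{"sku": "pen", "qty": 2}]}))

# The application, not the store, interprets the value.
cart = json.loads(store.get("cart:190030001"))
print(cart["items"])
```

The store can only look a value up by its key; it cannot search inside the value, which is the main difference from the document-oriented model.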
Column-based
• Works on columns and is based on the BigTable paper by Google.
• Every column is treated separately.
• Values of a single column are stored contiguously.
• Column-based NoSQL databases are widely used to manage data
warehouses, business intelligence, CRM and library card catalogs.
• HBase, Cassandra and Hypertable are examples of column-based databases.
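The idea of storing each column's values together can be sketched in plain Python, reusing the student data from the RDBMS slide. This shows only the storage idea in the bullets above; it is an assumption-laden toy and not how HBase or Cassandra actually lay out data on disk.

```python
# Row-oriented layout: one record per student.
rows = [
    {"SNO": 190030001, "SNAME": "ANIL", "BRANCH": "CSE", "SECTION": "S1"},
    {"SNO": 190030002, "SNAME": "RAJESH", "BRANCH": "CSE", "SECTION": "S2"},
    {"SNO": 190030003, "SNAME": "MADHU", "BRANCH": "CSE", "SECTION": "S1"},
]

# Column-oriented layout: the values of each column are kept together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one column touches only that column's values.
s1_count = columns["SECTION"].count("S1")

print(columns["SNAME"])
print(s1_count)
```

Because an aggregate reads a single contiguous list instead of every full record, this layout suits analytical workloads such as data warehousing.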
Document-Oriented
• A document-oriented NoSQL DB stores and retrieves data as a key-value
pair, but the value part is stored as a document.
• The document is stored in JSON or XML format.
• Unlike a plain key-value store, the DB understands the value and can query it.
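A minimal sketch of the document model, using Python dictionaries as stand-ins for JSON documents. The helper find and the field names are hypothetical; real document databases such as MongoDB provide their own query interfaces.

```python
# A collection of documents, keyed by an identifier.
# Documents need not share the same fields (flexible schema).
documents = {
    "190030001": {"sname": "ANIL", "branch": "CSE", "section": "S1",
                  "skills": ["python"]},
    "190030002": {"sname": "RAJESH", "branch": "CSE", "section": "S2"},
    "190030003": {"sname": "MADHU", "branch": "CSE", "section": "S1"},
}


def find(collection, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [
        doc
        for doc in collection.values()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# The store looks inside the value to answer the query.
s1_students = [doc["sname"] for doc in find(documents, section="S1")]
print(s1_students)
```

Only the first document has a "skills" field, yet all three live in the same collection and can be queried together.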
Graph-Based
• Stores entities as well as the relations amongst those entities.
• An entity is stored as a node, with relationships as edges.
• An edge gives a relationship between nodes.
• Every node and edge has a unique identifier.
• It is multi-relational in nature.
• Traversing relationships is fast, as they are already captured in the DB
and there is no need to calculate them.
• Used for social networks, logistics and spatial data.
• Neo4J, Infinite Graph, OrientDB and FlockDB are some popular graph-based databases.
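The node-and-edge model can be sketched with an adjacency list and a breadth-first traversal. The node names, the FRIEND_OF label, the edge identifiers and the function reachable are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque

# Each edge has its own identifier, a source node, a label and a target node.
edges = [
    ("e1", "anil", "FRIEND_OF", "rajesh"),
    ("e2", "rajesh", "FRIEND_OF", "madhu"),
    ("e3", "madhu", "FRIEND_OF", "kiran"),
]

# Relationships are stored directly, so following one is a lookup, not a join.
adjacency = defaultdict(list)
for edge_id, source, label, target in edges:
    adjacency[source].append(target)


def reachable(start, max_hops):
    """Breadth-first traversal: nodes reachable within max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, hops + 1))
    return found


print(reachable("anil", 2))
```

A "friends of friends" query like this is the kind of traversal that would need repeated self-joins in a relational schema.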
CAP Theorem (Brewer’s Theorem)
• A concept in distributed computing.
• According to the CAP theorem, a distributed system cannot guarantee all three of these
attributes (Consistency, Availability and Partition Tolerance) at the same time.
• CAP Theorem Components:
• Consistency (C): All nodes in a distributed system have the same data at the same
time. In a consistent system, a read from any node returns the most recent write.
• Availability (A): Every request to the system gets a response, without a guarantee
that it contains the most recent version of the data. An available system is always responsive, even if
it might return slightly outdated information.
• Partition Tolerance (P): The system continues functioning even when network partitions occur. Network
partitions happen when communication between nodes is lost or delayed.
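The trade-off can be sketched with a toy two-replica cluster. During a partition, a "CP" configuration refuses writes to stay consistent, while an "AP" configuration accepts them and lets the other replica go stale. The classes Replica and Cluster and the mode names are hypothetical simplifications, not a model of any real database.

```python
class Replica:
    def __init__(self):
        self.value = None


class Cluster:
    """Two replicas, a and b; clients write through a and may read from b."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (assumed labels)
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True when a cannot reach b

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by giving up availability.
                raise RuntimeError("unavailable: replica b is unreachable")
            # Stay available by giving up consistency: b is now stale.
            self.a.value = value
            return
        self.a.value = value
        self.b.value = value

    def read_from_b(self):
        return self.b.value


cp = Cluster("CP")
cp.write("v1")
cp.partitioned = True
try:
    cp.write("v2")
    cp_write_accepted = True
except RuntimeError:
    cp_write_accepted = False

ap = Cluster("AP")
ap.write("v1")
ap.partitioned = True
ap.write("v2")

print(cp_write_accepted, cp.read_from_b())  # write refused, b still consistent
print(ap.a.value, ap.read_from_b())         # write accepted, b returns stale data
```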
Eventual Consistency - BASE
• Eventual consistency arises from keeping copies of data on multiple machines to get
high availability and scalability.
• Thus, changes made to any data item on one machine have to be propagated to
the other replicas.
• Data replication may not be instantaneous: some copies are updated
immediately, while others are updated in due course of time.
• These copies may be mutually inconsistent for a while, but in due course of time they
become consistent. Hence the name eventual consistency.
BASE: Basically Available, Soft state, Eventual consistency
BASE
• Basically Available,
• Soft state,
• Eventual consistency
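• Basically Available means the DB responds to
requests all the time (availability in the CAP
sense), though a response may not reflect the
latest write
• Soft State means the system state may change
even without input, as replicas are updated in
the background
• Eventual Consistency means that
the system will become consistent
over time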
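Eventual consistency can be sketched with three replicas and a background synchronisation step. The version counter and the "highest version wins" rule in sync are simplifying assumptions for this sketch; real systems use more careful conflict resolution.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.version = 0


def write(replica, value):
    """Apply a client write to one replica only."""
    replica.version += 1
    replica.value = value


def sync(replicas):
    """Background propagation: copy the newest version to every replica."""
    newest = max(replicas, key=lambda r: r.version)
    for replica in replicas:
        replica.value = newest.value
        replica.version = newest.version


replicas = [Replica() for _ in range(3)]

# The write lands on one machine first.
write(replicas[0], "S2")
before_sync = [r.value for r in replicas]

# Later, replication brings the other copies up to date.
sync(replicas)
after_sync = [r.value for r in replicas]

print(before_sync)
print(after_sync)
```

Between the write and the sync, a read from the second or third replica returns old data; their state then changes without any client writing to them, which is the "soft state" in BASE.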
Need of NoSQL
• BIG DATA SUPPORT - Explosion of social media sites (FB, Twitter, Google, etc.)
with large data.
• RISE OF CLOUD-BASED SOLUTIONS - such as Amazon EC2, etc.
• Move to dynamically typed languages (Ruby/Groovy), and with it a shift to dynamically
typed data.
• Expansion of the OPEN-SOURCE COMMUNITY.
• NoSQL solutions are more acceptable to clients now than they were a year ago.
When to use NoSQL Database?
• When you want to store and retrieve huge amounts of data.
• When the relationships between the data you store are not that important.
• When the data is unstructured and changes frequently over time.
• When support for constraints and joins is not required at the database level.
• When the data is growing continuously and you need to scale the database
regularly to handle it.
Advantages of NoSQL
• Big Data capability: manages data velocity, variety, volume and complexity.
• Handles structured, semi-structured and unstructured data equally well.
• Provides fast performance and horizontal scalability.
• NoSQL databases don’t need a dedicated high-performance server.
• Simpler to implement than an RDBMS.
• Can serve as the primary data source for online applications.
• Excels as a distributed database.
• Offers a flexible schema design that can easily be altered without downtime or service
disruption.
RDBMS Vs NoSQL
NOSQL PRESENTATION ON INTRRODUCTION Intro.pptx
