1. CCS334 BIG DATA ANALYTICS
(R-21 III (I Sem))
(Department of Artificial Intelligence and Data Science)
Session 1
by
Asst. Prof. M. Gokilavani
NIET
9/12/2023 Department of AI & DS 1
2. TEXT BOOKS
• Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data,
Big Analytics: Emerging Business Intelligence and Analytic Trends for
Today's Businesses", Wiley, 2013.
• Eric Sammer, "Hadoop Operations", O'Reilly, 2012.
• Pramod J. Sadalage, "NoSQL Distilled", 2013.
REFERENCES
• E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive",
O'Reilly, 2012.
• Lars George, "HBase: The Definitive Guide", O'Reilly, 2011.
• Eben Hewitt, "Cassandra: The Definitive Guide", O'Reilly, 2010.
3. Topics covered in Unit 2 session
9/12/2023 Department of CSE (AI/ML) 3
UNIT II NOSQL DATA MANAGEMENT
Introduction to NoSQL – aggregate data models – key-value and
document data models – relationships – graph databases – schemaless
databases – materialized views – distribution models – master-slave
replication – consistency – Cassandra – Cassandra data model –
Cassandra examples – Cassandra clients.
4. Introduction
• Database: an organized collection of data, here stored in table format.
• DBMS (Database Management System): a software package whose
programs control the creation, maintenance and use of a database.
• Databases are created to handle large quantities of information by
inputting, storing, retrieving, and managing that information.
5. RDBMS Characteristics
• Data stored in columns and tables
• Relationships represented by data (ACID properties)
• Standard Query language (SQL)
• Data Manipulation Language
• Data Definition Language
• Transactions
• Abstraction from physical layer (APIs) (strong consistency, concurrency, recovery)
• Applications specify what, not how
• Physical layer can change without modifying applications
• Create indexes to support queries
• In Memory databases
• Mathematical background
• Lots of tools to use with them, e.g. reporting services, entity frameworks, ...
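The points on DDL, DML, indexes and "what, not how" can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns, the sample rows and the helper names_in_dept are made up for the example and do not come from the course material. The same declarative query returns the same rows before and after an index is added, which is the sense in which the physical layer can change without modifying the application.

```python
import sqlite3

# In-memory SQLite database; the schema and data below are hypothetical.
conn = sqlite3.connect(":memory:")

# Data Definition Language (DDL): describe the structure of the data.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

# Data Manipulation Language (DML): insert rows into the table.
conn.executemany(
    "INSERT INTO student (id, name, dept) VALUES (?, ?, ?)",
    [(1, "Asha", "AI&DS"), (2, "Ravi", "CSE"), (3, "Meena", "AI&DS")],
)


def names_in_dept(dept):
    """Declarative query: states WHAT is wanted, not HOW to find it."""
    rows = conn.execute(
        "SELECT name FROM student WHERE dept = ? ORDER BY name", (dept,)
    )
    return [row[0] for row in rows]


before = names_in_dept("AI&DS")

# Change the physical layer by adding an index; the query text is untouched.
conn.execute("CREATE INDEX idx_student_dept ON student (dept)")

after = names_in_dept("AI&DS")
print(before, after)
```

The application code in names_in_dept is identical in both calls; only the database decides whether to use the new index.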
6. ACID Properties
• Atomic – All of the work in a transaction completes (commit) or none
of it completes.
• Consistent – A transaction transforms the database from one
consistent state to another consistent state. Consistency is defined in
terms of constraints.
• Isolated – The results of any changes made during a transaction are
not visible until the transaction has committed.
• Durable – The results of a committed transaction survive failures.
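Atomicity and consistency can be tried out with a small sketch using Python's built-in sqlite3 module. The account table, the names alice and bob, the amounts and the helper functions transfer and balances are assumptions made for this example. A CHECK constraint plays the role of the consistency rule, and a transfer that would break it is rolled back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency rule (hypothetical): a balance may never go below zero.
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint was violated; the credit to dst is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok_small = transfer(conn, "alice", "bob", 30)   # fits within alice's balance
after_small = balances(conn)
ok_large = transfer(conn, "alice", "bob", 500)  # would make alice negative
after_large = balances(conn)
print(ok_small, after_small, ok_large, after_large)
```

In the failed transfer the credit to bob is executed first, yet it does not survive: the rollback removes it, so the database moves from one consistent state to another or does not move at all.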
10. NoSQL why, what and when?
• But...
• Relational databases were not built for distributed applications.
• Because...
• Joins are expensive
• Hard to scale horizontally
• Impedance mismatch occurs
• Expensive (product cost, hardware, maintenance)
• And... they are weak in:
• Speed (performance)
• High availability
• Partition tolerance
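The impedance mismatch and the cost of joins can be illustrated with a small sketch. Everything here is hypothetical: the order object, the orders and order_items tables, and the in-memory dictionary kv_store that stands in for a key-value store. The application works with one nested object, but the relational layout splits it across two tables and needs a join to rebuild it, while the aggregate style stores and loads it as one unit.

```python
import json
import sqlite3

# The object as the application sees it (hypothetical data).
order = {
    "id": 1,
    "customer": "Asha",
    "items": [{"sku": "pen", "qty": 2}, {"sku": "book", "qty": 1}],
}

# Relational style: the nested object is split across two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")
conn.execute("INSERT INTO orders VALUES (?, ?)", (order["id"], order["customer"]))
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(order["id"], item["sku"], item["qty"]) for item in order["items"]],
)


def load_order_relational(order_id):
    """Rebuild the nested object from flat rows; this requires a join."""
    rows = conn.execute(
        "SELECT o.id, o.customer, i.sku, i.qty "
        "FROM orders o JOIN order_items i ON i.order_id = o.id "
        "WHERE o.id = ? ORDER BY i.rowid",
        (order_id,),
    ).fetchall()
    return {
        "id": rows[0][0],
        "customer": rows[0][1],
        "items": [{"sku": sku, "qty": qty} for _, _, sku, qty in rows],
    }


# Aggregate style: the whole order is stored under one key.
kv_store = {}  # stand-in for a key-value store, not a real database
kv_store["order:1"] = json.dumps(order)


def load_order_aggregate(order_id):
    """One lookup returns the object in the shape the application uses."""
    return json.loads(kv_store[f"order:{order_id}"])


print(load_order_relational(1) == load_order_aggregate(1))
```

Both loaders return the same object; the difference is the translation work between tables and objects that the relational version has to do on every read and write.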
12. What’s NoSQL?
• NoSQL stands for
• Non-relational
• No RDBMS
• Not Only SQL
• NoSQL is an umbrella term for all databases and data stores that
don’t follow the RDBMS principles
• A class of products
• A collection of several (related) concepts about data storage and
manipulation
• Often associated with large data sets and with distributed and
parallel computing
13. Characteristics of NoSQL databases
NoSQL avoids:
• Overhead of ACID transactions
• Complexity of SQL queries
• Burden of up-front schema design
• Need for a DBA
• Transactions (these should be handled at the application layer)
Provides:
• Easy and frequent changes to the DB
• Fast development
• Large data volumes (e.g. Google)
• Schemaless storage
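What "schemaless" means in practice can be sketched with a plain Python dictionary acting as a toy document store. This is not the API of any real NoSQL product; the names documents, put and find and the sample records are invented for the illustration. Documents with different fields sit side by side, new fields need no up-front schema change, and the application is the one that copes with fields that may be missing.

```python
# A toy document store: document id -> document (a plain dict).
documents = {}


def put(doc_id, doc):
    """Store a document; no schema is declared or checked."""
    documents[doc_id] = dict(doc)


def find(field, value):
    """Return ids of documents whose field equals value.

    Documents that lack the field are simply skipped, so handling
    missing fields is the application's job, not the database's.
    """
    return sorted(
        doc_id for doc_id, doc in documents.items() if doc.get(field) == value
    )


# Three documents, each with a different set of fields (hypothetical data).
put("u1", {"name": "Asha", "dept": "AI&DS"})
put("u2", {"name": "Ravi", "dept": "CSE", "email": "ravi@example.com"})
put("u3", {"name": "Meena", "dept": "AI&DS", "skills": ["Hive", "Cassandra"]})

same_dept = find("dept", "AI&DS")
with_email = find("email", "ravi@example.com")
print(same_dept, with_email)
```

Adding the email and skills fields required no change to existing documents, which is the "easy and frequent changes" benefit; the price is that nothing stops two documents from spelling the same field differently.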
14. NoSQL why, what and when?
15. NoSQL is getting more & more popular
17. Summary of today's session 1
• Database: an organized collection of data, here stored in table format
• DBMS: Database Management System
• RDBMS Characteristics
• ACID properties
• Abstraction on physical layer
• Standard Query language (SQL)
• NoSQL why, what and when?
• What’s NoSQL?
• Characteristics of NoSQL databases
• Difference between SQL and NoSQL
18. Topics to be covered in next session 2
• Dynamo and Bigtable
• NoSQL Database Types
Thank you!!!