BTM 382 Database Management
Chapter 2: Data models
Chapter 12-12: CAP
Chapter 14-2a: Hadoop
Chitu Okoli
Associate Professor in Business Technology Management
John Molson School of Business, Concordia University, Montréal
Structure of BTM 382 Database Management
 Week 1: Introduction and overview
 ch1: Introduction
 Weeks 2-6: Database design
 ch3: Relational model
 ch4: ER modeling
 ch6: Normalization
 ERD modeling exercise
 ch5: Advanced data modeling
 Week 7: Midterm exam
 Weeks 8-10: Database programming
 ch7: Intro to SQL
 ch8: Advanced SQL
 SQL exercises
 Weeks 11-13: Database management
 ch2,12,14: Data models
 ch13: Business intelligence and data warehousing
 ch9,14,15: Selected managerial topics
Review of Chapters 2, 12, 14:
Data models
 What is a data model?
 How have data models developed over the
years?
 What is the Object-Oriented Data Model
(OODM), and when is it useful?
 What is Big Data, and how does NoSQL
resolve the major Big Data challenges?
 Which data models should we use for which
situations?
Models and data models
What is a model?
 A model is a simplified way to describe or explain a
complex reality
 A model helps people communicate and work simply
yet effectively when talking about and manipulating
complex real-world phenomena
Scientific models
Image sources:
http://www.redorbit.com/education/reference_library/space_1/universe/2574692/geocentric_model/
http://hendrianusthe.wordpress.com/2012/06/21/heliocentric-vs-geocentric/
Conceptual models
Image sources:
http://info563.malagaclasses.info/strategy-it-2/
http://fivewhys.wordpress.com/2012/05/22/business-model-innovation/
Importance of Data Models
Communication tool
Give an overall view of the database
Organize data for various users
Are an abstraction for creating a well-designed database
The Evolution of Data Models
Obsolete models:
Hierarchical and network models
The Relational Model
 Uses key concepts from mathematical relations (tables)
 “Relational” in “relational model” means “tables” (mathematical
relations), not “relationships”
 Tables (relations)
 Intersections of
 rows (various data types) and
 columns (same data type)
 Relations have well defined methods (queries) for combining
their data members
 Selecting (reading) and joining (combining) data is defined based
on mathematical principles
 Relational database management system (RDBMS)
 Relations were originally too advanced for 1970s computing power
 As computing power increased, simplicity of the model prevailed
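Selecting and joining can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for any RDBMS; the customer and invoice tables and their columns are made up for illustration and do not come from the textbook.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice  (inv_id  INTEGER PRIMARY KEY,
                           cust_id INTEGER REFERENCES customer(cust_id),
                           total   REAL);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO invoice  VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Selecting: read the rows of one relation that satisfy a condition.
big_invoices = conn.execute(
    "SELECT inv_id, total FROM invoice WHERE total > 20 ORDER BY inv_id"
).fetchall()

# Joining: combine rows of two relations on a shared column.
totals_by_customer = conn.execute("""
    SELECT c.name, SUM(i.total)
    FROM customer AS c JOIN invoice AS i ON i.cust_id = c.cust_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(big_invoices)        # [(10, 25.0), (11, 40.0)]
print(totals_by_customer)  # [('Ada', 65.0), ('Grace', 15.0)]
```

Both queries say what data is wanted, not how to fetch it; the DBMS works out the steps.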
The Entity Relationship Model
 Enhancement of the relational model
 Relations (tables) become entities
 Very detailed specification of relationships and their properties
 Entity relationship diagram (ERD)
 Uses graphic representations to model database components
 Many variations for notation exist
 In this class, we use the Crow’s Foot notation
The Object-Oriented Data Model
(OODM)
The Object-Oriented Data Model (OODM)
 Tries to reconcile the ER model with
object-oriented programming (OOP)
 The ER model’s view of data (tables) and programmers’ view of data (objects
in OOP) are completely different
 This mismatch can sometimes make database programming painful,
especially for very complex data structures
 An OODM uses OOP concepts to store data
 Objects represent nouns (entities or records)
 Objects have attributes (properties or fields) with values (data)
 Objects have methods (operations or functions)
 Classes group similar objects using a hierarchy and inheritance
 In an OODBMS, data retrieval and storage closely mirror the data
structures that programmers use, and so programming complex objects
is much easier than with the ER model
 More advanced forms support the Extended Relational Data Model,
Object/Relational DBMS, and XML data structures
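The OOP concepts listed above (objects, attributes, methods, classes with inheritance) can be sketched in a few lines of Python. The Person and Student classes are hypothetical examples, and this is plain in-memory Python rather than the API of any particular OODBMS.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Person:
    """A class groups similar objects; `name` is an attribute."""
    name: str

    def label(self) -> str:  # a method (operation) on the object
        return self.name


@dataclass
class Student(Person):
    """Student inherits the attributes and methods of Person."""
    courses: List[str] = field(default_factory=list)

    def enroll(self, course: str) -> None:
        self.courses.append(course)

    def label(self) -> str:  # overrides the inherited method
        return f"{self.name} ({len(self.courses)} courses)"


student = Student("Lin")
student.enroll("BTM 382")
student.enroll("BTM 481")

print(student.label())  # Lin (2 courses)
# The object carries its course list inside itself. A relational design
# would typically split this into separate student and enrollment tables.
print(asdict(student))  # {'name': 'Lin', 'courses': ['BTM 382', 'BTM 481']}
```

An object-oriented database would store the nested structure as it is, which is where the mismatch with tables goes away.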
OODBMS vs. RDBMS
https://youtu.be/kORTgvfHl4g
Big Data and NoSQL
Explaining Big Data
https://youtu.be/7D1CQ_LOizA
Big Data
 Volume
 Huge amounts of data (terabytes and petabytes),
especially from the Internet
 Velocity
 Organizations need to process the huge amounts of
data rapidly, just as fast as with smaller databases
 Variety
 Many different types of data, much of it unstructured
and even changing in structure
How do you handle Big Data?
Where RDBMSs run into trouble
1. Solution: Scale up
 Use more powerful, expensive servers
 But RDBMSs are very compute-intensive
 Big data would require much faster, more capable,
more expensive computers, and even those would not
be enough
2. Solution: Scale out
 Use many cheap distributed servers
 But RDBMS is slow with distributed processing
 Consistency is the biggest problem: guaranteeing
consistency (which RDBMS is great at) is slow
 Slow infrastructure isn’t good enough for big data
What is NoSQL?
https://youtu.be/qUV2j3XBRHc
NoSQL databases to the Big Data rescue
 “NoSQL” means:
 Non-relational or non-RDBMS
 Also “Not only SQL”—a few in fact do support SQL
 It is not one model; it is many different models that are not
relational data models
 Scale out (many cheap distributed servers) instead of scale up
 High scalability
 Support distributed database architectures
 High availability
 Rapid performance for big data, including unstructured and sparse data
 Fault tolerance
 Continue to work even if some servers in the cluster fail
 Emphasis is on high performance, rather than
transaction consistency
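A minimal sketch of scaling out: records are spread across several cheap "servers" (plain Python dictionaries here) by hashing each key. Hash partitioning is one common approach and is an assumption of this sketch, not something the slides prescribe; the keys and documents are invented.

```python
import hashlib


def shard_for(key: str, n_shards: int) -> int:
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_shards


# Three dictionaries stand in for three inexpensive servers.
shards = [dict() for _ in range(3)]


def put(key: str, document: dict) -> None:
    shards[shard_for(key, len(shards))][key] = document


def get(key: str):
    return shards[shard_for(key, len(shards))].get(key)


# Documents need not share the same fields (unstructured, sparse data).
put("user:1", {"name": "Ada", "email": "ada@example.com"})
put("user:2", {"name": "Grace", "tags": ["navy", "compiler"]})

print(get("user:1"))
print(get("user:3"))  # None: no such key on its shard
```

Each read or write touches only one shard, so adding servers adds capacity. Nothing in this sketch keeps copies of the data consistent across servers.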
Types of NoSQL databases
Image sources:
https://www.linkedin.com/pulse/20140823125259-38485481-nosql-databases-where-i-can-use?trk=sushi_topic_posts
http://www.monitis.com/blog/2011/05/22/picking-the-right-nosql-database-tool/
Also see:
Picking the Right
NoSQL Database Tool
Disadvantages of NoSQL
 Complex programming is often required
 “NoSQL” means you lose the ease-of-use and structural
independence of SQL
 There is often no built-in implementation of relationships in
the database—you might have to program relationships
yourself in code
 Data might sometimes be inconsistent
 No guarantee of transaction integrity
 Entity integrity and referential integrity not guaranteed
 The data you retrieve at any given moment might be
inaccurate… but it will eventually become OK
 This is the price to pay for rapid performance in a distributed
database
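To illustrate "program relationships yourself in code": the sketch below joins two collections by hand and detects a broken reference, which an RDBMS would do through a foreign key. The collections and field names are hypothetical, and plain Python dictionaries stand in for a document store.

```python
# Two "collections" as a document store might hold them (illustrative data).
customers = {"c1": {"name": "Ada"}, "c2": {"name": "Grace"}}
orders = [
    {"order_id": "o1", "customer_id": "c1", "total": 25.0},
    {"order_id": "o2", "customer_id": "c9", "total": 40.0},  # no such customer
]


def orders_with_customer(orders, customers):
    """Hand-written join: the application resolves each reference itself."""
    joined, orphans = [], []
    for order in orders:
        customer = customers.get(order["customer_id"])
        if customer is None:
            # Nothing stopped this order from being stored with a bad
            # reference; referential integrity is the programmer's job here.
            orphans.append(order["order_id"])
        else:
            joined.append((order["order_id"], customer["name"]))
    return joined, orphans


joined, orphans = orders_with_customer(orders, customers)
print(joined)   # [('o1', 'Ada')]
print(orphans)  # ['o2']
```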
The CAP theorem for distributed databases
 CAP stands for:
 Consistency: All nodes see the same data
 Availability: A request always gets a response (success or failure)
 Partition tolerance: Even if nodes fail or lose their network
connection to each other, the system can still function
 A distributed database can guarantee only two of the three
CAP characteristics, not all three at the same time
 Over time, it will eventually provide all three,
but it cannot guarantee all three at the same time
 NoSQL databases are distributed, and so the CAP theorem
restricts them to providing BASE, not ACID
Image source: PRWEB
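The trade-off can be made concrete with a toy simulation. The TinyCluster class below is a hypothetical two-replica model written for this illustration, not a real database API. When the replicas cannot reach each other (a partition), a "CP" cluster refuses writes to stay consistent, while an "AP" cluster accepts them and lets the replicas diverge until the partition heals. Conflicting writes to the same key are ignored for simplicity.

```python
class Replica:
    def __init__(self):
        self.data = {}


class TinyCluster:
    """Two replicas; `partitioned` means they cannot exchange messages."""

    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False
        self.pending = []  # writes not yet copied to the other replica

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                return False  # refuse: keep consistency, give up availability
            replica.data[key] = value  # accept: stay available, replicas diverge
            self.pending.append((other, key, value))
            return True
        replica.data[key] = value
        other.data[key] = value
        return True

    def heal(self):
        """Partition ends: copy the missed writes to the other replica."""
        self.partitioned = False
        for other, key, value in self.pending:
            other.data[key] = value
        self.pending.clear()


cp = TinyCluster("CP")
cp.partitioned = True
cp_accepted = cp.write(cp.a, "x", 1)  # False: the write is rejected

ap = TinyCluster("AP")
ap.partitioned = True
ap_accepted = ap.write(ap.a, "x", 1)       # True: the write is accepted
diverged = ap.a.data != ap.b.data          # True: replica b has not seen it
ap.heal()
converged = ap.a.data == ap.b.data         # True: consistent again, eventually
```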
ACID versus BASE
 A relational database guarantees the ACID properties:
 Atomicity, Consistency, Isolation, Durability
 In short, a set of SQL statements (called a transaction) will
either completely work or completely fail (no halfway success),
and the result will not corrupt the database
 A price to pay: results might be somewhat slow
 A NoSQL database does not guarantee ACID; it only
guarantees BASE properties:
 Basically Available, Soft-state, Eventual consistency
 In short, at any given moment, not everything might be
consistent, but the database will eventually get consistent
 In return, these imperfect results are delivered fast
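Atomicity can be seen in a short sketch using Python's built-in sqlite3 module. The account table, the names, and the amounts are invented for illustration. A transfer either applies both updates or neither of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the earlier credit is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "alice", "bob", 30)        # True
after_ok = balances(conn)                      # {'alice': 70, 'bob': 80}
failed = transfer(conn, "alice", "bob", 500)   # False: alice would go negative
after_failed = balances(conn)                  # unchanged: {'alice': 70, 'bob': 80}
```

In the failed transfer, bob's credit had already run when alice's debit broke the constraint, yet the rollback removed it, so no half-finished result is left behind.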
Summary of data models
Distributed Database Spectrum
Table 12.8
Sacrifices availability to ensure
consistency and isolation
Historical outline of data models
Which data model should you use?
 Hierarchical or network models
 Obsolete—no one uses these any longer
 Entity-relationship model
 Almost always
 90% or more of professional database situations
 Object-oriented database
 When you have very complex data structures, you need rapid
performance, and it helps achieve organizational objectives
 Source: Barry & Associates, Inc
 When data structures are so complex that organizing data as tables
causes headaches in programming retrieval and storage
 NoSQL
 When you have vast amounts of unstructured data and you need
rapid performance
 When speed is more important than data consistency
Popularity ranking of DBMSs: http://db-engines.com/en/ranking
Summary of Chapters 2, 12, 14:
Data models
 A data model is an abstract way of thinking about how
data is organized
 Although the relational model has become the dominant
data model, it cannot solve all database challenges
 The Object-Oriented Data Model is useful for complex
data coupled with object-oriented programming
 Big Data is data with high volume, velocity and variety
 NoSQL generally handles big data better than relational
databases, but it sacrifices consistency for speed
 No single data model is the best for all situations, so we
should understand the pros and cons of each model
Sources
 Most of the slides are adapted from Database
Systems: Design, Implementation and
Management by Carlos Coronel and Steven Morris.
11th edition (2015) published by Cengage Learning.
ISBN 13: 978-1-285-19614-5
 Other sources are noted on the slides themselves
