What is covered in this presentation?
A brief history of databases
NoSQL WHY, WHAT & WHEN?
Characteristics of NoSQL databases
Aggregate data models
CAP theorem
Introduction
• Database - an organized collection of data
• DBMS - a software package of computer programs
that controls the creation, maintenance and use of a
database
• Databases are created to handle large quantities of
information by inputting, storing, retrieving, and
managing that information
A brief history
• Benefits of relational databases:
Designed for all purposes
ACID
Strong consistency, concurrency, recovery
Mathematical background
Standard query language (SQL)
Lots of supporting tools, e.g. reporting services, entity
frameworks, ...
Relational databases
SQL databases
But...
• Relational databases were not
built for distributed applications.
Because...
• Joins are expensive
• Hard to scale horizontally
• Impedance mismatch occurs
• Expensive (product cost,
hardware, maintenance)
NoSQL why, what and when?
And...
They are weak in:
• Speed (performance)
• High availability
• Partition tolerance
NoSQL why, what and when?
Why NoSQL now? Answer: driving trends
RDBMS performance
Data
Data is a new class of economic asset, like currency and
gold
Source: World Economic Forum 2012
Data is the new raw material
Data size growth
• 150 exabytes in 2005
(exabyte is a billion
gigabytes)
• 1200 exabytes in 2010
• 35000 exabytes in 2020
(expected by IBM)
Volume of data/information created, captured,
copied, and consumed worldwide from 2010 to 2025
Data size growth
Examples:
• ISRO launches the advanced earth observation
and mapping satellite CARTOSAT-3 along with
13 other commercial nano-satellites
– Information and images coming from the satellite
• Maharashtra election: 20,000 tweets/second
• Around 30 billion RFID tags produced/year
– Automatic toll collection using RFID
• Oil drilling platforms have 20k to 40k sensors
95% of data produced is unstructured
Challenge
Big Data’s characteristics are challenging conventional information
management architectures
• Massive and growing amounts of information residing internal
and external to the organization
• Unconventional semi-structured or unstructured (diverse) data,
including web pages, log files, social media, click-streams,
instant messages, text messages, emails, sensor data from
active and passive systems, etc.
• Changing information
Figure: example analytics workloads (multi-channel, sentiment, transaction, call detail records, warranty claim, surveillance, claim fraud)
What is big data?
“A massive volume of both structured and unstructured data
that is so large that it's difficult to store, analyse, process,
share, visualise and manage with traditional database and
software techniques.” - Roger Magoulas of O’Reilly in 2005
• Big data technologies describe a new generation of
technologies and architectures, designed to economically
extract value from very large volumes of a wide variety of
data, by enabling high velocity capture, discovery, and/or
analysis
• IBM / MS
– Volume (Terabytes -> Zettabytes)
– Variety (Structured -> Semi-structured -> Unstructured)
– Velocity (Batch -> Streaming Data)
What Makes it Big Data? (V3)
Figure: Volume, Velocity, Variety, Value (example sources: social, blog, smart meter)
• Volume: gigabyte (10^9 bytes), terabyte (10^12), petabyte (10^15),
exabyte (10^18), zettabyte (10^21)
• Variety: structured, semi-structured, unstructured; text, image,
audio, video, record
• Velocity: dynamic, sometimes time-varying
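The volume units above are plain powers of ten, so conversions between them are simple arithmetic. A minimal sketch in Python, assuming decimal (SI) units; the `UNITS` table and the `convert` helper are illustrative names, not part of the slides:

```python
# Decimal (SI) byte units from the slide, as powers of ten.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}

def convert(amount, from_unit, to_unit):
    """Convert between two units (hypothetical helper; integer division rounds down)."""
    return amount * UNITS[from_unit] // UNITS[to_unit]

# "An exabyte is a billion gigabytes."
gigabytes_per_exabyte = convert(1, "exabyte", "gigabyte")

# The 35000 exabytes projected for 2020, expressed in zettabytes.
projected_2020_zettabytes = convert(35000, "exabyte", "zettabyte")

print(gigabytes_per_exabyte, projected_2020_zettabytes)
```

Read this way, the 2020 projection quoted earlier comes to 35 zettabytes.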
Variability:
Variability is not the same as variety.
A coffee shop may offer 6 different
blends (variety), but if the same blend
tastes different every day, that is
variability.
The same is true of data: if the
meaning is constantly
changing, it can have a huge
impact on your data
homogenization.
Visualization:
Using charts and graphs to
visualize large amounts of
complex data
A NoSQL database provides a
mechanism for storage and retrieval
of data that employs less constrained
consistency models than traditional
relational databases.
NoSQL systems are also referred to
as "Not only SQL" to emphasize that
they may in fact allow SQL-like query
languages to be used.
But What is NoSQL?
NoSQL avoids:
Overhead of ACID transactions
Complexity of SQL queries
Burden of up-front schema design
Need for a DBA
Transactions (these should be handled
at the application layer)
Provides:
Easy and frequent changes to the DB
Fast development
Support for large data volumes (e.g. Google)
Schema-less design
Characteristics of NoSQL databases
NoSQL is getting more and more popular
In relational databases:
You can’t add a record that does
not fit the schema
You need to add NULLs to
unused items in a row
You must respect the datatypes,
e.g. you can’t add a string to an
integer field
You can’t add multiple items in a
field (you should create another
table: primary key, foreign key,
joins, normalization, ...)
What is a schema-less data model?
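The relational constraints listed above can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the `customer` and `phone` tables and their columns are made up for illustration. SQLite is also lenient about column types by default, so the datatype point is not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, fax TEXT)")
# Multiple phone numbers per customer need a second table with a foreign key.
conn.execute(
    "CREATE TABLE phone ("
    " customer_id INTEGER NOT NULL REFERENCES customer(id),"
    " number TEXT NOT NULL)"
)

# 1. A record that does not fit the schema is rejected.
try:
    conn.execute("INSERT INTO customer (id, name, nickname) VALUES (1, 'Asha', 'ash')")
    rejected = False
except sqlite3.OperationalError:  # no such column: nickname
    rejected = True

# 2. An unused item in a row is stored as NULL.
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Asha')")
fax = conn.execute("SELECT fax FROM customer WHERE id = 1").fetchone()[0]

# 3. Multiple items go into the second table and come back through a join.
conn.executemany("INSERT INTO phone VALUES (?, ?)", [(1, "555-0100"), (1, "555-0101")])
rows = conn.execute(
    "SELECT c.name, p.number FROM customer c"
    " JOIN phone p ON p.customer_id = c.id ORDER BY p.number"
).fetchall()

print(rejected, fax, rows)
```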
In NoSQL databases:
There is no schema to consider
There are no unused cells
Datatypes are implicit, not declared
Most of these considerations are handled
in the application layer
We gather all items in an aggregate
(document)
What is a schema-less data model?
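By contrast, a schema-less store accepts aggregates of different shapes side by side. A minimal sketch using plain Python dictionaries as stand-ins for stored documents; the field names and the `phones_of` helper are illustrative assumptions:

```python
# Two "customer" aggregates; neither follows a declared schema.
customers = [
    {"id": 1, "name": "Asha", "phones": ["555-0100", "555-0101"]},  # several items in one field
    {"id": 2, "name": "Ravi", "email": "ravi@example.com"},         # different fields, no NULL padding
]

def phones_of(doc):
    """The application layer decides how to treat a missing field."""
    return doc.get("phones", [])

all_phones = [number for doc in customers for number in phones_of(doc)]
print(all_phones)
```

The missing `phones` field in the second document is handled by the application code, which is the trade-off the slide describes.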
NoSQL databases are classified into four
major data models:
• Key-value
• Document
• Column family
• Graph
Each DB has its own query language
Categories of NoSQL databases
• Simplest NoSQL databases
• The main idea is the use
of a hash table
• Access data (values) by
strings called keys
• Data has no required format;
values may have any format
• Data model: (key, value) pairs
• Basic operations:
Insert(key, value),
Fetch(key), Update(key, value),
Delete(key)
Key-value data model
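The four basic operations can be sketched on top of a Python dictionary, which is itself a hash table. The `KeyValueStore` class and its method names are illustrative and do not correspond to any particular product's API:

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a hash table (a Python dict)."""

    def __init__(self):
        self._table = {}

    def insert(self, key, value):
        self._table[key] = value

    def fetch(self, key):
        # Returns None when the key is absent.
        return self._table.get(key)

    def update(self, key, value):
        if key not in self._table:
            raise KeyError(key)
        self._table[key] = value

    def delete(self, key):
        self._table.pop(key, None)


store = KeyValueStore()
# Values have no required format: a nested structure and raw bytes coexist.
store.insert("session:42", {"user": "asha", "cart": ["book"]})
store.insert("logo.png", b"\x89PNG")
store.update("session:42", {"user": "asha", "cart": ["book", "pen"]})
cart = store.fetch("session:42")["cart"]
store.delete("logo.png")
missing = store.fetch("logo.png")
print(cart, missing)
```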
Row-oriented DB: stores data row by row, suitable for
OLTP
Column-oriented DB: stores data column by column, suitable for
OLAP
Companies such as Facebook, Twitter, Yahoo, and
Adobe use HBase internally (large data and
random read/write)
The column is the smallest unit of data.
It is a tuple that contains a name, a value and a
timestamp
Column family data model
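The column tuple described above can be sketched as follows. The nested layout (row key, then column family, then column name) and the rule that the newest timestamp wins are assumptions made for illustration, not a description of how HBase or Cassandra is implemented:

```python
from typing import Any, NamedTuple


class Column(NamedTuple):
    """Smallest unit of data: a name, a value and a timestamp."""
    name: str
    value: Any
    timestamp: int


# row key -> column family -> column name -> Column
table = {}


def put(row_key, family, name, value, timestamp):
    columns = table.setdefault(row_key, {}).setdefault(family, {})
    current = columns.get(name)
    # Assumption: a write only replaces a column if it is at least as new.
    if current is None or timestamp >= current.timestamp:
        columns[name] = Column(name, value, timestamp)


def get(row_key, family, name):
    return table[row_key][family][name].value


put("user:1", "profile", "city", "Pune", timestamp=1)
put("user:1", "profile", "city", "Mumbai", timestamp=2)
put("user:1", "profile", "city", "Delhi", timestamp=1)  # older write, ignored
city = get("user:1", "profile", "city")
print(city)
```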
Example
Some statistics about Facebook Search (using Cassandra)
• MySQL, > 50 GB of data
• Average write: ~300 ms
• Average read: ~350 ms
• Rewritten with Cassandra, > 50 GB of data
• Average write: 0.12 ms
• Average read: 15 ms
Column family data model
• Based on graph theory
• Scales vertically, no clustering
• Graph algorithms are easy to apply
• Transactions
• ACID
Graph data model
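As one example of the graph algorithms this model makes easy, here is a breadth-first search for the shortest path between two nodes. The adjacency-list dictionary and the names in it are invented for illustration; a real graph database would expose this through its own query language:

```python
from collections import deque

# Directed "follows" graph stored as an adjacency list (illustrative data).
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the first shortest path found, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_path(graph, "alice", "dave")
print(path)
```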
• Pairs each key with a complex data
structure known as a
document.
• Indexes are typically implemented as B-trees.
• Documents can contain many
different key-value pairs, or key-array
pairs, or even nested
documents.
Document-based data model
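A document of the kind described above, with key-value pairs, a key-array pair and a nested document, can be sketched as a Python dictionary. The order document and the dotted-path helper `get_path` are illustrative assumptions, not the API of any document database:

```python
import json

# One key ("order:1001") paired with one document.
order = {
    "_id": "order:1001",
    "customer": {"name": "Asha", "city": "Pune"},  # nested document
    "items": [                                     # key-array pair
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
    "paid": True,                                  # plain key-value pair
}


def get_path(doc, path):
    """Follow a dotted path such as 'items.1.sku' (hypothetical helper)."""
    current = doc
    for part in path.split("."):
        current = current[int(part)] if isinstance(current, list) else current[part]
    return current


city = get_path(order, "customer.city")
second_sku = get_path(order, "items.1.sku")

# Documents are commonly exchanged as JSON; this one round-trips unchanged.
restored = json.loads(json.dumps(order))
print(city, second_sku, restored == order)
```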
SQL vs NOSQL
• NoSQL may complement RDBMS
– RDBMS may hold smaller amounts of high-value structured data
– NoSQL may hold vast amounts of less valued and less structured data
• Relational implementations provide ACID guarantees
– Atomicity: a transaction is treated as an all-or-nothing operation
– Consistency: database values are correct before and after
– Isolation: each transaction runs as if it were the only transaction
– Durability: upon completion of a transaction, the operation is not reversed
• NoSQL often provides BASE
– Basically available: allowance for parts of a system to fail (sharding/
partitioning)
– Soft state: copies of an object may temporarily hold different
values
– Eventually consistent: consistency is achieved over time (not on every
commit)
• CAP theorem
– It is impossible to guarantee consistency, availability, and partition
tolerance all at the same time in a distributed system
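The "soft state" and "eventually consistent" ideas above can be illustrated with two replicas that accept writes independently and reconcile later. This is a toy sketch assuming last-write-wins conflict resolution by timestamp; the `Replica` class and `sync` function are hypothetical, and real systems use more careful schemes:

```python
class Replica:
    """One copy of the data; each key maps to (timestamp, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        # Assumption: last write wins, decided by timestamp.
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange state in both directions so both replicas converge."""
    for key, (timestamp, value) in list(a.data.items()):
        b.write(key, value, timestamp)
    for key, (timestamp, value) in list(b.data.items()):
        a.write(key, value, timestamp)


r1, r2 = Replica(), Replica()
r1.write("cart", ["book"], timestamp=1)
r2.write("cart", ["book", "pen"], timestamp=2)

# Soft state: the replicas disagree until they have synchronized.
before = (r1.read("cart"), r2.read("cart"))
sync(r1, r2)
# Eventually consistent: after the exchange both return the newest value.
after = (r1.read("cart"), r2.read("cart"))
print(before, after)
```

Both replicas stayed available for writes while they were out of contact, at the cost of temporarily inconsistent reads.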
What do we need?
• We need a distributed database system with the following features:
– Fault tolerance
– High availability
– Consistency
– Scalability
Which is impossible!
According to the CAP theorem
We cannot guarantee all three properties at once
in distributed database systems
The CAP theorem
CAP theorem
Conclusion….

Introduction to asdfghjkln b vfgh n v

