SlideShare a Scribd company logo
1 of 18
CCS334 BIG DATAANALYTICS
(R-21 III (I Sem))
Department of Artificial Intelligence and Data Science )
Session 3
by
Asst.Prof.M.Gokilavani
NIET
9/19/2023 Department of AI & DS 1
TEXT BOOKS
• Michael Minelli, Michelle Chambers, and AmbigaDhiraj, "Big Data,
Big Analytics: Emerging Business Intelligence and Analytic Trends for
Today's Businesses", Wiley, 2013.
• Eric Sammer, "Hadoop Operations", O'Reilley, 2012.
• Sadalage, Pramod J. “NoSQL distilled”, 2013.
REFERENCES
• E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive",
O'Reilley, 2012.
• Lars George, "HBase: The Definitive Guide", O'Reilley, 2011.
• Eben Hewitt, "Cassandra: The Definitive Guide", O'Reilley, 2010.
9/19/2023 Department of AI & DS 2
Topics covered in Unit 2 session
9/19/2023 Department of AI & DS 3
UNIT II NOSQL DATA MANAGEMENT
Introduction to NoSQL – aggregate data models – key-value and
document data models – relationships – graph databases – schema
less databases – materialized views – distribution models – master-
slave replication – consistency - Cassandra – Cassandra data model
– Cassandra examples – Cassandra clients.
Distribution Models
• Depending on the distribution model the data store can give us the ability:
• To handle large Quantity of data
• To process a grater read or write traffic
• To have more availability in the case of network slowdowns of
breakages.
• There are two path for distribution:
• Sharding: Sharding distributes different data across multiple
servers, so each server acts as the single source for a subset of
data.
• Replication: Replication copies data across multiple servers,
so each bit of data can be found in multiple places.
9/19/2023 Department of AI & DS 4
Distributed Models
• Single server
• Sharding
• Master Slave Replication
• Peer-to-Peer Replication
• Combining Sharding & Replication
9/19/2023 Department of AI & DS 5
Single Server
• It is the first and simplest distribution option.
• Also if NoSQL database are designed to run on a cluster they can be
used in a single server application.
• This make sense if a NoSQL database is more suited for the
application data model.
• Graph database are the more obvious.
• If data usage is most about processing aggregates, than a key or a
document store may be useful.
9/19/2023 Department of AI & DS 6
Sharding
• Often, a data store is busy because different people are accessing
different part of the dataset.
• In this cases we can support horizontal scalability by putting different
part of the data onto different servers (Sharding)
• The concept of sharding is not new as a part of application logic.
• It consists in put all the customer with surname A-D on one shard and
E-G to another
• This complicates the programming model as the application code
needs to distributed the load across the shards.
• In the ideal setting we have each user to talk one server and the load is
balanced.
• Of course the ideal case is rare.
9/19/2023 Department of AI & DS 7
9/19/2023 Department of AI & DS 8
Sharding and NoSQL
• In general, many NoSQL databases offers auto- sharding.
• This can make much easier to use sharding in an application.
• Sharding is especially valuable for performance because it improves
read and write performances.
• It scales read and writes on the different nodes of the same cluster.
9/19/2023 Department of AI & DS 9
Master – Slave Replication
• In this setting one node is designated as the master, or primary ad the
other as slaves.
• The master is the authoritative source for the date and designed to
process updates and send them to slaves.
• The slaves are used for read operations.
• This allows us to scale in data intensive dataset.
9/19/2023 Department of AI & DS 10
9/19/2023 Department of AI & DS 11
Master – Slave Replication
• We can scale horizontally by adding more slaves.
• But, we are limited by the ability of the master in processing incoming
data.
An advantage is read resilience.
Disadvantage:
• Also if the master fails the slaves can still handle read requests.
• Anyway writes are not allowed until the master is not restored.
9/19/2023 Department of AI & DS 12
Master – Slave Replication
• Another characteristic is that a slave can be appointed as master.
• Masters can be appointed manually or automatically.
• In order to achieve resilience we need that read and write paths are
different.
• This is normally done using separate database connections.
• Disadvantage: Replication in master-slave have the analyzed
advantages but it come with the problem of inconsistency.
• The readers reading from the slaves can read data not updated.
9/19/2023 Department of AI & DS 13
Master – Slave Replication in MongoDB
9/19/2023 Department of AI & DS 14
Peer to Peer Replication
• Master-Slave replication helps with read scalability but has problems
on scalability of writes.
• Moreover, it provides resilience on read but not on writes.
• The master is still a single point of failure.
• Peer-to-Peer attacks these problems by not having a master.
• All the replica are equal (accept writes and reads).
• With a Peer-to-Peer we can have node failures without lose write
capability and losing data.
9/19/2023 Department of AI & DS 15
9/19/2023 Department of AI & DS 16
Combine Sharing
• We have multiple masters, but each data has a single master.
• Depending on the configuration we can decide the master for each
group of data.
• Peer-to-Peer and sharding is a common strategy for column-family
databases.
• This is commonly composed using replication of the shards.
9/19/2023 Department of AI & DS 17
Topics to be covered in next session 4
• Cassandra data model
9/19/2023 Department of AI & DS 18
Thank you!!!

More Related Content

What's hot

Distributed dbms architectures
Distributed dbms architecturesDistributed dbms architectures
Distributed dbms architecturesPooja Dixit
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Databricks
 
Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computinghuda2018
 
Virtualization in cloud computing ppt
Virtualization in cloud computing pptVirtualization in cloud computing ppt
Virtualization in cloud computing pptMehul Patel
 
Distributed computing
Distributed computingDistributed computing
Distributed computingshivli0769
 
Introduction to Apache Hive
Introduction to Apache HiveIntroduction to Apache Hive
Introduction to Apache HiveAvkash Chauhan
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notesMohit Saini
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?Venu Anuganti
 
An Intro to NoSQL Databases
An Intro to NoSQL DatabasesAn Intro to NoSQL Databases
An Intro to NoSQL DatabasesRajith Pemabandu
 
IPC (Inter-Process Communication) with Shared Memory
IPC (Inter-Process Communication) with Shared MemoryIPC (Inter-Process Communication) with Shared Memory
IPC (Inter-Process Communication) with Shared MemoryHEM DUTT
 
Analysing of big data using map reduce
Analysing of big data using map reduceAnalysing of big data using map reduce
Analysing of big data using map reducePaladion Networks
 
Cloud Resource Management
Cloud Resource ManagementCloud Resource Management
Cloud Resource ManagementNASIRSAYYED4
 

What's hot (20)

Consistency in NoSQL
Consistency in NoSQLConsistency in NoSQL
Consistency in NoSQL
 
Cloud Computing & Distributed Computing
Cloud Computing & Distributed ComputingCloud Computing & Distributed Computing
Cloud Computing & Distributed Computing
 
Distributed dbms architectures
Distributed dbms architecturesDistributed dbms architectures
Distributed dbms architectures
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)
 
Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computing
 
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
 
Virtualization in cloud computing ppt
Virtualization in cloud computing pptVirtualization in cloud computing ppt
Virtualization in cloud computing ppt
 
Memory virtualization
Memory virtualizationMemory virtualization
Memory virtualization
 
Distributed computing
Distributed computingDistributed computing
Distributed computing
 
4. system models
4. system models4. system models
4. system models
 
Distributed DBMS - Unit 3 - Distributed DBMS Architecture
Distributed DBMS - Unit 3 - Distributed DBMS ArchitectureDistributed DBMS - Unit 3 - Distributed DBMS Architecture
Distributed DBMS - Unit 3 - Distributed DBMS Architecture
 
Introduction to Apache Hive
Introduction to Apache HiveIntroduction to Apache Hive
Introduction to Apache Hive
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notes
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?
 
An Intro to NoSQL Databases
An Intro to NoSQL DatabasesAn Intro to NoSQL Databases
An Intro to NoSQL Databases
 
Learning With Complete Data
Learning With Complete DataLearning With Complete Data
Learning With Complete Data
 
IPC (Inter-Process Communication) with Shared Memory
IPC (Inter-Process Communication) with Shared MemoryIPC (Inter-Process Communication) with Shared Memory
IPC (Inter-Process Communication) with Shared Memory
 
Analysing of big data using map reduce
Analysing of big data using map reduceAnalysing of big data using map reduce
Analysing of big data using map reduce
 
Cloud Resource Management
Cloud Resource ManagementCloud Resource Management
Cloud Resource Management
 

Similar to CCS334 BIG DATA ANALYTICS Session 3 Distributed models.pptx

Introduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBIntroduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBAhmed Farag
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabasesAdi Challa
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2Fabio Fumarola
 
my no sql introductiobkjhikjhkjhkhjhgchjvbbnn.ppt
my no sql introductiobkjhikjhkjhkhjhgchjvbbnn.pptmy no sql introductiobkjhikjhkjhkhjhgchjvbbnn.ppt
my no sql introductiobkjhikjhkjhkhjhgchjvbbnn.pptwondimagegndesta
 
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...PascalDesmarets1
 
A survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbA survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbAlexander Decker
 
A survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbA survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbAlexander Decker
 
Processing Drone data @Scale
Processing Drone data @ScaleProcessing Drone data @Scale
Processing Drone data @ScaleDr Hajji Hicham
 
Cloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go AwayCloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go AwayZendCon
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7abdulrahmanhelan
 
Objectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL DatabaseObjectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL DatabaseInfiniteGraph
 

Similar to CCS334 BIG DATA ANALYTICS Session 3 Distributed models.pptx (20)

Session 1 Introduction to NoSQL.pptx
 Session 1 Introduction to NoSQL.pptx Session 1 Introduction to NoSQL.pptx
Session 1 Introduction to NoSQL.pptx
 
Report 2.0.docx
Report 2.0.docxReport 2.0.docx
Report 2.0.docx
 
Introduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBIntroduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDB
 
NoSQLDatabases
NoSQLDatabasesNoSQLDatabases
NoSQLDatabases
 
No sql database
No sql databaseNo sql database
No sql database
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
the rising no sql technology
the rising no sql technologythe rising no sql technology
the rising no sql technology
 
Report 1.0.docx
Report 1.0.docxReport 1.0.docx
Report 1.0.docx
 
my no sql introductiobkjhikjhkjhkhjhgchjvbbnn.ppt
my no sql introductiobkjhikjhkjhkhjhgchjvbbnn.pptmy no sql introductiobkjhikjhkjhkhjhgchjvbbnn.ppt
my no sql introductiobkjhikjhkjhkhjhgchjvbbnn.ppt
 
Scaling Up vs. Scaling-out
Scaling Up vs. Scaling-outScaling Up vs. Scaling-out
Scaling Up vs. Scaling-out
 
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
 
A survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbA survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo db
 
A survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo dbA survey on data mining and analysis in hadoop and mongo db
A survey on data mining and analysis in hadoop and mongo db
 
Processing Drone data @Scale
Processing Drone data @ScaleProcessing Drone data @Scale
Processing Drone data @Scale
 
Cloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go AwayCloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go Away
 
Erciyes university
Erciyes universityErciyes university
Erciyes university
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
 
Objectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL DatabaseObjectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL Database
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
NoSQL.pptx
NoSQL.pptxNoSQL.pptx
NoSQL.pptx
 

More from Asst.prof M.Gokilavani

CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
IT8073_Information Security_UNIT I _.pdf
IT8073_Information Security_UNIT I _.pdfIT8073_Information Security_UNIT I _.pdf
IT8073_Information Security_UNIT I _.pdfAsst.prof M.Gokilavani
 
IT8073 _Information Security _UNIT I Full notes
IT8073 _Information Security _UNIT I Full notesIT8073 _Information Security _UNIT I Full notes
IT8073 _Information Security _UNIT I Full notesAsst.prof M.Gokilavani
 
GE3151 PSPP UNIT IV QUESTION BANK.docx.pdf
GE3151 PSPP UNIT IV QUESTION BANK.docx.pdfGE3151 PSPP UNIT IV QUESTION BANK.docx.pdf
GE3151 PSPP UNIT IV QUESTION BANK.docx.pdfAsst.prof M.Gokilavani
 
GE3151 PSPP UNIT III QUESTION BANK.docx.pdf
GE3151 PSPP UNIT III QUESTION BANK.docx.pdfGE3151 PSPP UNIT III QUESTION BANK.docx.pdf
GE3151 PSPP UNIT III QUESTION BANK.docx.pdfAsst.prof M.Gokilavani
 
GE3151 PSPP All unit question bank.pdf
GE3151 PSPP All unit question bank.pdfGE3151 PSPP All unit question bank.pdf
GE3151 PSPP All unit question bank.pdfAsst.prof M.Gokilavani
 
AI3391 Artificial intelligence Unit IV Notes _ merged.pdf
AI3391 Artificial intelligence Unit IV Notes _ merged.pdfAI3391 Artificial intelligence Unit IV Notes _ merged.pdf
AI3391 Artificial intelligence Unit IV Notes _ merged.pdfAsst.prof M.Gokilavani
 
AI3391 Artificial intelligence Session 29 Forward and backward chaining.pdf
AI3391 Artificial intelligence Session 29 Forward and backward chaining.pdfAI3391 Artificial intelligence Session 29 Forward and backward chaining.pdf
AI3391 Artificial intelligence Session 29 Forward and backward chaining.pdfAsst.prof M.Gokilavani
 
AI3391 Artificial intelligence Session 28 Resolution.pptx
AI3391 Artificial intelligence Session 28 Resolution.pptxAI3391 Artificial intelligence Session 28 Resolution.pptx
AI3391 Artificial intelligence Session 28 Resolution.pptxAsst.prof M.Gokilavani
 
AI3391 Artificial intelligence session 27 inference and unification.pptx
AI3391 Artificial intelligence session 27 inference and unification.pptxAI3391 Artificial intelligence session 27 inference and unification.pptx
AI3391 Artificial intelligence session 27 inference and unification.pptxAsst.prof M.Gokilavani
 
AI3391 Artificial Intelligence Session 26 First order logic.pptx
AI3391 Artificial Intelligence Session 26 First order logic.pptxAI3391 Artificial Intelligence Session 26 First order logic.pptx
AI3391 Artificial Intelligence Session 26 First order logic.pptxAsst.prof M.Gokilavani
 
AI3391 Artificial Intelligence Session 25 Horn clause.pptx
AI3391 Artificial Intelligence Session 25 Horn clause.pptxAI3391 Artificial Intelligence Session 25 Horn clause.pptx
AI3391 Artificial Intelligence Session 25 Horn clause.pptxAsst.prof M.Gokilavani
 

More from Asst.prof M.Gokilavani (20)

CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
IT8073_Information Security_UNIT I _.pdf
IT8073_Information Security_UNIT I _.pdfIT8073_Information Security_UNIT I _.pdf
IT8073_Information Security_UNIT I _.pdf
 
IT8073 _Information Security _UNIT I Full notes
IT8073 _Information Security _UNIT I Full notesIT8073 _Information Security _UNIT I Full notes
IT8073 _Information Security _UNIT I Full notes
 
GE3151 PSPP UNIT IV QUESTION BANK.docx.pdf
GE3151 PSPP UNIT IV QUESTION BANK.docx.pdfGE3151 PSPP UNIT IV QUESTION BANK.docx.pdf
GE3151 PSPP UNIT IV QUESTION BANK.docx.pdf
 
GE3151 PSPP UNIT III QUESTION BANK.docx.pdf
GE3151 PSPP UNIT III QUESTION BANK.docx.pdfGE3151 PSPP UNIT III QUESTION BANK.docx.pdf
GE3151 PSPP UNIT III QUESTION BANK.docx.pdf
 
GE3151 UNIT II Study material .pdf
GE3151 UNIT II Study material .pdfGE3151 UNIT II Study material .pdf
GE3151 UNIT II Study material .pdf
 
GE3151 PSPP All unit question bank.pdf
GE3151 PSPP All unit question bank.pdfGE3151 PSPP All unit question bank.pdf
GE3151 PSPP All unit question bank.pdf
 
GE3151_PSPP_All unit _Notes
GE3151_PSPP_All unit _NotesGE3151_PSPP_All unit _Notes
GE3151_PSPP_All unit _Notes
 
GE3151_PSPP_UNIT_5_Notes
GE3151_PSPP_UNIT_5_NotesGE3151_PSPP_UNIT_5_Notes
GE3151_PSPP_UNIT_5_Notes
 
GE3151_PSPP_UNIT_4_Notes
GE3151_PSPP_UNIT_4_NotesGE3151_PSPP_UNIT_4_Notes
GE3151_PSPP_UNIT_4_Notes
 
GE3151_PSPP_UNIT_3_Notes
GE3151_PSPP_UNIT_3_NotesGE3151_PSPP_UNIT_3_Notes
GE3151_PSPP_UNIT_3_Notes
 
GE3151_PSPP_UNIT_2_Notes
GE3151_PSPP_UNIT_2_NotesGE3151_PSPP_UNIT_2_Notes
GE3151_PSPP_UNIT_2_Notes
 
AI3391 Artificial intelligence Unit IV Notes _ merged.pdf
AI3391 Artificial intelligence Unit IV Notes _ merged.pdfAI3391 Artificial intelligence Unit IV Notes _ merged.pdf
AI3391 Artificial intelligence Unit IV Notes _ merged.pdf
 
AI3391 Artificial intelligence Session 29 Forward and backward chaining.pdf
AI3391 Artificial intelligence Session 29 Forward and backward chaining.pdfAI3391 Artificial intelligence Session 29 Forward and backward chaining.pdf
AI3391 Artificial intelligence Session 29 Forward and backward chaining.pdf
 
AI3391 Artificial intelligence Session 28 Resolution.pptx
AI3391 Artificial intelligence Session 28 Resolution.pptxAI3391 Artificial intelligence Session 28 Resolution.pptx
AI3391 Artificial intelligence Session 28 Resolution.pptx
 
AI3391 Artificial intelligence session 27 inference and unification.pptx
AI3391 Artificial intelligence session 27 inference and unification.pptxAI3391 Artificial intelligence session 27 inference and unification.pptx
AI3391 Artificial intelligence session 27 inference and unification.pptx
 
AI3391 Artificial Intelligence Session 26 First order logic.pptx
AI3391 Artificial Intelligence Session 26 First order logic.pptxAI3391 Artificial Intelligence Session 26 First order logic.pptx
AI3391 Artificial Intelligence Session 26 First order logic.pptx
 
AI3391 Artificial Intelligence Session 25 Horn clause.pptx
AI3391 Artificial Intelligence Session 25 Horn clause.pptxAI3391 Artificial Intelligence Session 25 Horn clause.pptx
AI3391 Artificial Intelligence Session 25 Horn clause.pptx
 

Recently uploaded

Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfme23b1001
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 

Recently uploaded (20)

Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 

CCS334 BIG DATA ANALYTICS Session 3 Distributed models.pptx

  • 1. CCS334 BIG DATAANALYTICS (R-21 III (I Sem)) Department of Artificial Intelligence and Data Science ) Session 3 by Asst.Prof.M.Gokilavani NIET 9/19/2023 Department of AI & DS 1
  • 2. TEXT BOOKS • Michael Minelli, Michelle Chambers, and AmbigaDhiraj, "Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley, 2013. • Eric Sammer, "Hadoop Operations", O'Reilley, 2012. • Sadalage, Pramod J. “NoSQL distilled”, 2013. REFERENCES • E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive", O'Reilley, 2012. • Lars George, "HBase: The Definitive Guide", O'Reilley, 2011. • Eben Hewitt, "Cassandra: The Definitive Guide", O'Reilley, 2010. 9/19/2023 Department of AI & DS 2
  • 3. Topics covered in Unit 2 session 9/19/2023 Department of AI & DS 3 UNIT II NOSQL DATA MANAGEMENT Introduction to NoSQL – aggregate data models – key-value and document data models – relationships – graph databases – schema less databases – materialized views – distribution models – master- slave replication – consistency - Cassandra – Cassandra data model – Cassandra examples – Cassandra clients.
  • 4. Distribution Models • Depending on the distribution model the data store can give us the ability: • To handle large Quantity of data • To process a grater read or write traffic • To have more availability in the case of network slowdowns of breakages. • There are two path for distribution: • Sharding: Sharding distributes different data across multiple servers, so each server acts as the single source for a subset of data. • Replication: Replication copies data across multiple servers, so each bit of data can be found in multiple places. 9/19/2023 Department of AI & DS 4
  • 5. Distributed Models • Single server • Sharding • Master Slave Replication • Peer-to-Peer Replication • Combining Sharding & Replication 9/19/2023 Department of AI & DS 5
  • 6. Single Server • It is the first and simplest distribution option. • Also if NoSQL database are designed to run on a cluster they can be used in a single server application. • This make sense if a NoSQL database is more suited for the application data model. • Graph database are the more obvious. • If data usage is most about processing aggregates, than a key or a document store may be useful. 9/19/2023 Department of AI & DS 6
  • 7. Sharding • Often, a data store is busy because different people are accessing different part of the dataset. • In this cases we can support horizontal scalability by putting different part of the data onto different servers (Sharding) • The concept of sharding is not new as a part of application logic. • It consists in put all the customer with surname A-D on one shard and E-G to another • This complicates the programming model as the application code needs to distributed the load across the shards. • In the ideal setting we have each user to talk one server and the load is balanced. • Of course the ideal case is rare. 9/19/2023 Department of AI & DS 7
  • 9. Sharding and NoSQL • In general, many NoSQL databases offers auto- sharding. • This can make much easier to use sharding in an application. • Sharding is especially valuable for performance because it improves read and write performances. • It scales read and writes on the different nodes of the same cluster. 9/19/2023 Department of AI & DS 9
  • 10. Master – Slave Replication • In this setting one node is designated as the master, or primary ad the other as slaves. • The master is the authoritative source for the date and designed to process updates and send them to slaves. • The slaves are used for read operations. • This allows us to scale in data intensive dataset. 9/19/2023 Department of AI & DS 10
  • 12. Master – Slave Replication • We can scale horizontally by adding more slaves. • But, we are limited by the ability of the master in processing incoming data. An advantage is read resilience. Disadvantage: • Also if the master fails the slaves can still handle read requests. • Anyway writes are not allowed until the master is not restored. 9/19/2023 Department of AI & DS 12
  • 13. Master – Slave Replication • Another characteristic is that a slave can be appointed as master. • Masters can be appointed manually or automatically. • In order to achieve resilience we need that read and write paths are different. • This is normally done using separate database connections. • Disadvantage: Replication in master-slave have the analyzed advantages but it come with the problem of inconsistency. • The readers reading from the slaves can read data not updated. 9/19/2023 Department of AI & DS 13
  • 14. Master – Slave Replication in MongoDB 9/19/2023 Department of AI & DS 14
  • 15. Peer to Peer Replication • Master-Slave replication helps with read scalability but has problems on scalability of writes. • Moreover, it provides resilience on read but not on writes. • The master is still a single point of failure. • Peer-to-Peer attacks these problems by not having a master. • All the replica are equal (accept writes and reads). • With a Peer-to-Peer we can have node failures without lose write capability and losing data. 9/19/2023 Department of AI & DS 15
  • 17. Combine Sharing • We have multiple masters, but each data has a single master. • Depending on the configuration we can decide the master for each group of data. • Peer-to-Peer and sharding is a common strategy for column-family databases. • This is commonly composed using replication of the shards. 9/19/2023 Department of AI & DS 17
  • 18. Topics to be covered in next session 4 • Cassandra data model 9/19/2023 Department of AI & DS 18 Thank you!!!