NOSQL KEY-VALUE DATABASE
Lecture 2
Dr. Shaimaa Galal
Review Question
• What is the main challenge for traditional databases?
  - Managing semi-structured and unstructured data.
  - Managing large amounts of structured data.
Question
Key-value database
• Example (DynamoDB):
  - Items have one or more attributes (name, value).
  - An attribute can be single-valued or multi-valued (e.g., a set).
  - Items are combined into a table.
• A key-value database is a system that stores values indexed by keys. It can store structured and unstructured data.
• Focus: scaling to huge amounts of data; it is designed to handle massive data loads.
• Data model: a (global) collection of key-value pairs.
Key-value
Pros:
• very fast
• very scalable (horizontally distributed to nodes based on key)
• simple data model
• eventual consistency
• fault-tolerance
Cons:
• Cannot model more complex data structures, such as objects
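The "horizontally distributed to nodes based on key" point can be sketched as follows. This is a minimal illustration, not how any particular product does it: the node names and the node_for_key helper are hypothetical, and MD5 is used only as a convenient stable hash.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick the node responsible for a key by hashing the key (illustrative helper)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the hash as an integer and map it onto the node list.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
placement = {key: node_for_key(key, nodes) for key in ["user:1", "user:2", "cart:9"]}
print(placement)
```

The same key always maps to the same node, so a client can find a value without a central directory. Plain modulo placement remaps most keys when the number of nodes changes, which is why distributed stores typically use consistent hashing on a ring instead.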
Big Data: Google
1. Google Stack Software
• Google developed three major software layers as the foundation for the Google platform:
1. Google File System (GFS): a distributed cluster file system that allows all of the disks within the Google data center to be accessed as one massive, distributed, redundant file system.
2. MapReduce: a distributed processing framework for parallelizing algorithms across large numbers of potentially unreliable servers, capable of dealing with massive datasets.
3. BigTable: a non-relational database system that uses the Google File System for storage.
Google Software Architecture
Simple MapReduce Example: WordCount
Map Function
Reduce Function
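The WordCount slides were figures, so the following is only a sketch of the usual formulation, run in a single process. The function names (map_words, shuffle, reduce_counts, word_count) are illustrative assumptions and are not taken from the slides or from any MapReduce framework.

```python
from collections import defaultdict


def map_words(document: str):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word: str, counts: list[int]):
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def word_count(documents: list[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce sequentially over a list of documents."""
    pairs = (pair for doc in documents for pair in map_words(doc))
    return dict(reduce_counts(word, counts) for word, counts in shuffle(pairs).items())


counts = word_count(["the quick brown fox", "the lazy dog", "the fox"])
print(counts)
```

In a real cluster the map calls run in parallel on different servers and the shuffle moves data over the network; the logic of each step stays the same.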
MultiStage MapReduce Example
2. Hadoop and Hive
Key-value Database API Functions:
• Basic API access:
• Get(key): retrieve the value for a given key
• Put(key, value): create or update the value for a given key
• Delete(key): remove the key and its associated value
• Update(key, value): create or update the value for a given key
• Execute(key, operation, parameters): invoke an operation on the value stored under the given key, where the value is a special data structure (e.g., List, Set, Map, etc.)
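The API functions above can be sketched with an in-memory dictionary. This is a toy illustration under stated assumptions: the KeyValueStore class is hypothetical, update simply delegates to put because the slide describes both the same way, and execute looks up the operation as a method of the stored Python value.

```python
class KeyValueStore:
    """Toy in-memory key-value store mirroring Get/Put/Delete/Update/Execute."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None when the key is absent.
        return self._data.get(key)

    def put(self, key, value):
        # Creates the key or overwrites its current value.
        self._data[key] = value

    def delete(self, key):
        # Removes the key and its value; a missing key is ignored.
        self._data.pop(key, None)

    def update(self, key, value):
        # Described on the slide with the same semantics as Put.
        self.put(key, value)

    def execute(self, key, operation, *parameters):
        # Invoke a method of the stored value, e.g. list.append or set.add.
        value = self._data[key]
        return getattr(value, operation)(*parameters)


store = KeyValueStore()
store.put("user:1", "Alice")
store.put("tags", set())
store.execute("tags", "add", "nosql")
store.delete("user:1")
print(store.get("user:1"), store.get("tags"))
```

Execute is what distinguishes stores such as Redis from a plain get/put store: the operation runs on the value where it is stored, so the client does not need to read, modify, and write back the whole value.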
Key-value Platforms
Name | Producer | Data model | Querying
SimpleDB | Amazon | set of pairs (key, {attribute}), where an attribute is a pair (name, value) | restricted SQL; select, delete, GetAttributes, and PutAttributes operations
Redis | Salvatore Sanfilippo | set of pairs (key, value), where the value is a simple typed value, a list, an ordered (according to ranking) or unordered set, or a hash | primitive operations for each value type
Dynamo | Amazon | like SimpleDB | simple get operation, and put in a context
Voldemort | LinkedIn | like SimpleDB | similar to Dynamo
Apache Cassandra
• A free and open-source distributed NoSQL database management system.
• Handles large amounts of data across many commodity servers, providing high availability with no single point of failure.
• It was started by Facebook and is now an open-source Apache project written in Java.
DataStax Astra
Apache Cassandra - Advantages
1. Cassandra is designed to run as a distributed cluster, but it can also run as a single node.
2. Horizontal scalability (distributed storage).
3. Fast responses even as demand grows.
4. High write speeds for managing incrementally growing data volumes.
5. Ability to change the data structure.
6. A simple API for your favorite programming language.
7. Automatic fault detection and fault tolerance.
8. No single point of failure; each node knows about the others.
9. Decentralized.
10. Integrates with Hadoop, so MapReduce jobs can be run on the data.
Apache Cassandra - Disadvantages
1. Limited ad-hoc queries: you must model your data around the queries rather than around the structure of the data.
2. No aggregations: because Cassandra is a key-value store, functions such as Sum, Min, Max, and Average are very resource-intensive, if they are possible at all.
3. Unpredictable performance: Cassandra runs many different asynchronous jobs in the background.
Comparing Alternatives
Cassandra Gossip Protocol
• What is the gossip protocol?
Gossip is the peer-to-peer messaging system that Cassandra nodes (including virtual nodes) use to exchange state information and stay consistent with each other.
Each node holds a replica of the data, so if something goes wrong on one node, another replica can respond. The replication_factor parameter, set when a keyspace (database) is created, indicates how many machines in the cluster receive copies of the same data.
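A gossip exchange can be sketched as a small simulation. This is a simplified model and not Cassandra's actual implementation: the node names, the heartbeat-version dictionaries, and the merge and gossip_round helpers are all assumptions made for illustration.

```python
import random


def merge(local: dict, remote: dict) -> dict:
    """Keep, for every known node, the entry with the highest heartbeat version."""
    merged = dict(local)
    for node, version in remote.items():
        if version > merged.get(node, -1):
            merged[node] = version
    return merged


def gossip_round(views: dict, rng: random.Random) -> None:
    """Each node bumps its own heartbeat and exchanges state with one random peer."""
    names = list(views)
    for name in names:
        views[name][name] += 1
        peer = rng.choice([other for other in names if other != name])
        combined = merge(views[name], views[peer])
        # Both sides end the exchange with the same merged view.
        views[name] = dict(combined)
        views[peer] = dict(combined)


# Initially every node only knows about itself.
views = {name: {name: 0} for name in ["n1", "n2", "n3", "n4", "n5"]}
rng = random.Random(42)
rounds = 0
while any(len(view) < len(views) for view in views.values()) and rounds < 1000:
    gossip_round(views, rng)
    rounds += 1
print(f"every node knows every other node after {rounds} round(s)")
```

No node ever contacts all the others directly; knowledge spreads because each exchange passes along what both sides have already heard.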
Key-Value Concepts
• Cassandra manages columns and column families.
• A column family is a container of rows, each of which contains columns.
• A keyspace is analogous to a database in the relational model, but without relationships between its contents; it is the container that stores the data.
• A keyspace requires some attributes to be defined, such as a user-defined name, a replication strategy, and others.
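The nesting of these concepts can be pictured with plain dictionaries. This is only a mental model, assuming hypothetical names ("users", "u1", read_column); it says nothing about how Cassandra stores data on disk.

```python
# keyspace -> column family -> row key -> {column name: value}
keyspace = {
    "users": {  # column family (hypothetical name)
        "u1": {"name": "Alice", "email": "alice@example.com"},
        "u2": {"name": "Bob"},  # rows in the same family may hold different columns
    }
}


def read_column(keyspace: dict, column_family: str, row_key: str, column: str):
    """Look up one column of one row; returns None if the row lacks that column."""
    return keyspace[column_family][row_key].get(column)


print(read_column(keyspace, "users", "u1", "name"))
print(read_column(keyspace, "users", "u2", "email"))
```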
Key-Value Concepts
• Keyspaces require the following replication settings, which affect consistency:
1. The replication factor, which indicates how much performance you are willing to give up in favor of consistency.
2. The replica placement strategy, which indicates how the replicas are placed in the ring, such as SimpleStrategy, OldNetworkTopologyStrategy, and NetworkTopologyStrategy.
• Read more: https://docs.datastax.com/en/cassandra-oss/2.1/cassandra/architecture/architectureDataDistributeReplication_c.html#architectureDataDistributeReplication_c__networkToplogyStrategy-ph
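Replica placement on the ring can be sketched in the style of SimpleStrategy: the first replica goes to the node that owns the key's token, and the remaining replicas go to the next nodes clockwise. The sketch assumes one token per node, hypothetical node names, and MD5 as a stand-in for the partitioner's hash; the token and replicas_for helpers are illustrative only.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Hash a string to a position on the ring (MD5 is a stand-in hash)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


def replicas_for(key: str, ring: list[tuple[int, str]], replication_factor: int) -> list[str]:
    """Return the nodes holding a key; ring is a list of (token, node) sorted by token."""
    tokens = [node_token for node_token, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect.bisect_left(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    # Walk clockwise from the owner to collect the remaining replicas.
    return [ring[(start + step) % len(ring)][1] for step in range(count)]


ring = sorted((token(name), name) for name in ["node-a", "node-b", "node-c", "node-d"])
owners = replicas_for("user:1", ring, replication_factor=3)
print(owners)
```

A higher replication factor means more nodes must receive each write, which is the performance cost mentioned above. NetworkTopologyStrategy additionally takes data centers and racks into account, which this sketch ignores.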
CQL (Cassandra Query Language)
• CQL offers a syntax very close to SQL for creating schemas and manipulating data.
Some of the features CQL has are:
• Data types
• Data definition
• Data manipulation
• Secondary indexes
• Materialized views
• Security
• Functions
• Arithmetic operations
• JSON support
• Triggers
CQL Example
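The original example on this slide was an image, so the statements below are a stand-in rather than the lecture's own example. The keyspace name lecture, the users table, and its columns are hypothetical. The CQL is held in Python strings so the block runs without a cluster; executing the statements would require cqlsh or a driver session.

```python
# A few CQL statements showing how close the syntax is to SQL.
CQL_STATEMENTS = [
    # Data definition: a keyspace with its replication settings.
    "CREATE KEYSPACE IF NOT EXISTS lecture "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};",
    # Data definition: a table with a primary key and a set-typed column.
    "CREATE TABLE IF NOT EXISTS lecture.users ("
    "user_id uuid PRIMARY KEY, name text, emails set<text>);",
    # Data manipulation: insert one row, generating the key with the uuid() function.
    "INSERT INTO lecture.users (user_id, name, emails) "
    "VALUES (uuid(), 'Alice', {'alice@example.com'});",
    # Query: read selected columns back.
    "SELECT name, emails FROM lecture.users;",
]

for statement in CQL_STATEMENTS:
    print(statement)
```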
Use Case
