Information technology has led us into an era where the production, sharing, and use of information are part of everyday life, and in which we are often unaware actors: it is now almost impossible not to leave a digital trail of many of the actions we perform every day, for example through digital content such as photos, videos, and blog posts, and everything that revolves around social networks (Facebook and Twitter in particular). Added to this, with the "Internet of Things" we see a growing number of devices such as watches, bracelets, thermostats, and many other items that can connect to the network and therefore generate large data streams. This explosion of data justifies the birth of the term Big Data: it denotes data produced in large quantities, at remarkable speed, and in different formats, whose processing requires technologies and resources that go far beyond conventional systems for managing and storing data. It is immediately clear that 1) data storage models based on the relational model, and 2) processing systems based on stored procedures and computations on grids, are not applicable in these contexts. As regards point 1, RDBMSs, widely used for a great variety of applications, run into problems when the amount of data grows beyond certain limits. Scalability and the cost of implementation are only part of the disadvantages: very often, when faced with the management of big data, variability - that is, the lack of a fixed structure - also represents a significant problem. This has given a boost to the development of NoSQL databases. The website NoSQL Databases defines NoSQL databases as "Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open source and horizontally scalable."
These databases are distributed, open source, horizontally scalable, free of a predetermined schema (key-value, column-oriented, document-based, and graph-based), easily replicable, free of ACID guarantees, and able to handle large amounts of data. These databases are integrated with processing tools based on the MapReduce paradigm proposed by Google in 2004. MapReduce, together with the open source Hadoop framework, represents the new model for distributed processing of large amounts of data, supplanting techniques based on stored procedures and computational grids (point 2). The relational model, taught in basic database design courses, has many limitations compared to the demands posed by new applications, which use Big Data and NoSQL databases to store data and MapReduce to process large amounts of data.
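The MapReduce paradigm mentioned above can be illustrated with a minimal in-memory sketch (plain Python standing in for Hadoop; the function names are illustrative, not a real framework's API): a map phase emits (word, 1) pairs, a shuffle step groups values by key, and a reduce phase aggregates each group.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework would between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the values for each key (here: a word count).
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data needs new tools", "big clusters process big data"]
counts = reduce_phase(shuffle(map_phase(docs)))   # e.g. counts["big"] == 3
```

In a real cluster the map and reduce tasks run in parallel on many nodes and the shuffle moves data between them; the logic, however, is exactly this.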
Course Website http://pbdmng.datatoknowledge.it/
Contact me for further information and to download the slides.
Apache HBase™ is the Hadoop database: a distributed, scalable big data store. It is a column-oriented database management system that runs on top of HDFS.
Apache HBase is an open source NoSQL database that provides real-time read/write access to large data sets. HBase is natively integrated with Hadoop and works seamlessly alongside other data access engines through YARN.
NoSQL stands for "not only SQL."
NoSQL databases are databases that store data in a format other than relational tables.
NoSQL databases, or non-relational databases, do not handle relationship data well.
Here is my seminar presentation on NoSQL databases. It covers the types of NoSQL databases, their merits and demerits, examples of NoSQL databases, and more.
For the seminar report on NoSQL databases, please contact me: ndc@live.in
Talon systems - Distributed multi-master replication strategy (Saptarshi Chatterjee)
Data replication is the process of storing data in more than one site or node. It is useful in improving the availability of data. The result is a distributed database in which users can access data relevant to their tasks without interfering with the work of others. Data replication is generally performed to provide a consistent copy of data across all the database nodes.
Traditionally this is done by copying data from one database server to another, so that all the servers hold the same data. Our implementation proposes a completely different approach. Instead of copying data from one node to another, in our design the master replicas do not communicate directly with each other and work virtually independently for write queries. For read queries, an independent process consults all the replicas to constitute a quorum, and returns the result if a majority of the machines in the system respond with the same result.
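The quorum read described above can be sketched as a toy in-memory model (replicas as plain dictionaries; this is an illustration of the idea, not Talon's actual code): a reader consults every replica and returns a value only if a strict majority agree.

```python
from collections import Counter

def quorum_read(replicas, key):
    """Return the value for `key` only if a strict majority of replicas agree."""
    answers = [replica.get(key) for replica in replicas]
    value, votes = Counter(answers).most_common(1)[0]
    if votes > len(replicas) // 2:
        return value
    raise RuntimeError(f"no majority among replicas for key {key!r}")

# Three independent masters; one holds a divergent (e.g. stale) value.
replicas = [{"x": 1}, {"x": 1}, {"x": 2}]
value = quorum_read(replicas, "x")   # majority says 1
```

With three masters the read tolerates one divergent or unreachable replica; if no majority exists, the reader must report a failure rather than guess.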
MySQL Group Replication is a new 'synchronous', multi-master, auto-everything replication plugin for MySQL, introduced with MySQL 5.7. It is the perfect tool for small 3-20 machine MySQL clusters to gain high availability and high performance. It stands for high availability because the failure of a replica does not stop the cluster. Failed nodes can rejoin the cluster and new nodes can be added in a fully automatic way - no DBA intervention required. It stands for high performance because multiple masters process writes, not just one as with MySQL Replication. Running applications on it is simple: no read-write splitting, no fiddling with eventual consistency and stale data. The cluster offers strong consistency (generalized snapshot isolation).
It is based on Group Communication principles, hence the name.
Introduction to NoSQL | Big Data Hadoop Spark Tutorial (CloudxLab)
Big Data with Hadoop & Spark Training: http://bit.ly/2kyP2Ct
This CloudxLab Introduction to NoSQL tutorial helps you to understand NoSQL in detail. Below are the topics covered in this slide:
1) Introduction to NoSQL
2) Scaling Out vs Scaling Up
3) ACID - Properties of DB Transactions
4) RDBMS - Story
5) What is NoSQL?
6) Types Of NoSQL Stores
7) CAP Theorem
8) Serialization
9) Column Oriented Database
10) Column Family Oriented DataStore
How big data moved the needle from monolithic SQL RDBMS to distributed NoSQL (Sayyaparaju Sunil)
We will see what factors contributed to the evolution of the next thing, and what kinds of design choices were made by engineers along the way. We will also see what we got rid of (the tradeoffs) during the evolution process. We will talk about what kinds of applications are best suited to each particular type of database.
Presentation on NoSQL databases related to RDBMS (abdurrobsoyon)
This presentation is about NoSQL, which means Not Only SQL. It covers the aspects of using NoSQL for Big Data and the differences from RDBMS.
1. Prepared by Dr. Dipali Meher
Source: NoSQL Distilled
Dr. Dipali P. Meher
MCS, M.Phil, NET, Ph.D
Modern College of Arts, Science and Commerce,
Ganeshkhind, Pune 16
mailtomeher@gmail.com / dipalimeher@moderncollegegk.org
DATA MODELS
2. NoSQL databases have the ability to run on a large cluster.
As the size of the data increases, it becomes difficult to scale up - you always need to buy a bigger server as the data grows.
One solution to this is to run the database on a cluster of servers.
Running databases on a cluster increases the complexity of the databases.
3. Two paths for DATA DISTRIBUTION: Replication and Sharding
Replication
4. Data Distribution
Sharding
5. Replication
Replication takes the same data and copies it over multiple nodes.
Replication copies data across multiple servers, so each bit of data can be found in multiple places.
6. Replication: Two Forms
Master-slave and peer-to-peer
7. Replication: Master-Slave
Replicate data across multiple nodes.
One node is designated as the master, or primary.
MASTER
It is the authoritative source for the data.
It is usually responsible for processing any updates to that data.
The other nodes are slaves, or secondaries.
The master can be appointed manually or automatically.
SLAVE
A replication process synchronizes the slaves with the master.
After a failure of the master, a slave can be appointed as the new master very quickly.
8. MASTER
The master can be appointed automatically or manually.
Manually: manual appointment typically means that when you configure your cluster, you configure one node as the master.
Automatically: you create a cluster of nodes and they elect one of themselves to be the master.
Automatic appointment means that the cluster can automatically appoint a new master when a master fails, reducing downtime.
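Automatic appointment can be sketched as a toy election rule (an illustration for this slide, not any specific product's algorithm): the live nodes deterministically agree on one of themselves, here the node with the lowest id.

```python
def elect_master(nodes, failed=()):
    """Elect a master deterministically: the lowest-id live node wins."""
    alive = [node for node in nodes if node not in failed]
    if not alive:
        raise RuntimeError("no live nodes left to elect a master")
    return min(alive)

cluster = ["node-1", "node-2", "node-3"]
first = elect_master(cluster)                            # "node-1" is master
after_failure = elect_master(cluster, failed={"node-1"})  # "node-2" takes over
```

Because every node applies the same rule to the same membership list, they all agree on the new master without manual intervention; real systems add failure detection and voting on top of this idea.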
9. Replication: Master-Slave Replication
10. Pros and Cons of Master-Slave Replication
PROS
More read requests: add more slave nodes and ensure that all read requests are routed to the slaves.
Should the master fail, the slaves can still handle read requests.
Good for read-intensive datasets (read resilience).
CONS
The master is a bottleneck: the cluster is limited by the master's ability to process updates and to pass those updates on.
The master's failure does eliminate the ability to handle writes until either the master is restored or a new master is appointed.
Inconsistency due to slow propagation of changes to the slaves.
Bad for datasets with heavy write traffic.
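These trade-offs can be made concrete with a minimal in-memory sketch (the class and method names are hypothetical): all writes go through the master, a replication step synchronizes the slaves, and reads are served by slaves, which may briefly return stale data.

```python
class Node:
    def __init__(self):
        self.data = {}

class MasterSlaveCluster:
    def __init__(self, n_slaves):
        self.master = Node()
        self.slaves = [Node() for _ in range(n_slaves)]
        self.pending = []          # changes not yet propagated to the slaves

    def write(self, key, value):
        # Only the master accepts writes; slaves lag until replication runs.
        self.master.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # The "replication process": synchronize the slaves with the master.
        for key, value in self.pending:
            for slave in self.slaves:
                slave.data[key] = value
        self.pending.clear()

    def read(self, key, slave_index=0):
        # Reads are routed to slaves; a stale read is possible before replicate().
        return self.slaves[slave_index].data.get(key)

cluster = MasterSlaveCluster(n_slaves=2)
cluster.write("a", 1)
stale = cluster.read("a")   # None: the change has not propagated yet
cluster.replicate()
fresh = cluster.read("a")   # 1: slaves are now in sync
```

The stale read between `write` and `replicate` is exactly the inconsistency listed in the cons; adding more slaves raises read capacity but does nothing for the single-master write bottleneck.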
11. Read Resilience
More and more read requests.
In order to get read resilience, you have to ensure that the read and write paths in your application are different.
In case of a failure in the write path, it can be handled separately and reads can continue.
Read Path
Write Path
12. Master-slave replication is good for read resilience.
It does not scale well for write resilience.
It also faces a bottleneck problem for updates (write requests).
To solve these issues, there is peer-to-peer replication.
13. Replication: Peer-to-Peer
All the replicas have equal weight.
All replicas process write requests.
Loss of any one replica does not prevent access to the data store.
14. If any node fails, then work continues with the other nodes, i.e. users can ride over node failures without losing access to data.
Nodes can easily be added to improve performance (though complications may increase).
15. Complications in Peer-to-Peer
Biggest complication: consistency.
When you can write to two different places, you run the risk that two people will attempt to update the same record at the same time - a write-write conflict.
Inconsistencies on read lead to problems, but at least they are relatively transient.
Inconsistent writes are forever.
16. Peer-to-peer replication
17. Peer-to-Peer Replication
We can ensure that whenever we write data, the replicas coordinate to ensure we avoid a conflict.
We don't need all the replicas to agree on the write, just a majority, so we can still survive losing a minority of the replica nodes.
Alternatively, we can decide to cope with an inconsistent write.
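The majority-write idea can be sketched in the same toy style (illustrative only, replicas as plain dictionaries): a write succeeds only if more than half of the replicas acknowledge it, so losing a minority of nodes does not block writes.

```python
def quorum_write(replicas, key, value, down=()):
    """Apply the write only if a strict majority of replicas acknowledge it."""
    acks = 0
    for i, replica in enumerate(replicas):
        if i in down:
            continue               # unreachable replica: no acknowledgement
        replica[key] = value
        acks += 1
    if acks <= len(replicas) // 2:
        # In a real system the partial write would be rolled back or repaired.
        raise RuntimeError(f"write rejected: only {acks} of {len(replicas)} acks")
    return acks

replicas = [{}, {}, {}]
acks = quorum_write(replicas, "x", 42, down={2})   # survives losing a minority
```

With three replicas, one node may be down and writes still succeed with two acknowledgements; if two nodes are down, no majority exists and the write must fail rather than risk an inconsistent write that is "forever".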
18. Sharding
A busy data store is busy because different people are accessing different parts of the dataset.
In these circumstances we can support horizontal scalability by putting different parts of the data onto different servers - a technique that's called sharding.
19. Sharding
Sharding puts different data on different nodes.
20. Sharding: Ideal Case
We have different users all talking to different server nodes.
Each user only has to talk to one server, and so gets rapid responses from that server.
The load is balanced out nicely between servers - for example, if we have ten servers, each one only has to handle 10% of the load.
21. We have to ensure that data that's accessed together is clumped together on the same node, and that these clumps are arranged on the nodes to provide the best data access.
How do we clump the data? Using aggregates: an aggregate combines data that's commonly accessed together, so aggregates leap out as an obvious unit of distribution.
To increase the performance of aggregates:
1) The physical location where aggregates are stored is important.
2) Even loading: try to arrange the aggregates so they are evenly distributed across the nodes, which then all get equal amounts of the load.
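Hash-based placement is one common way to spread aggregates evenly across nodes; a small sketch (illustrative, not any particular database's scheme) where each user's aggregate is kept whole on one shard:

```python
import hashlib

def shard_for(key, n_shards):
    """Map an aggregate's key to a shard deterministically (hash-based placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_shards

# Place each user's aggregate (their whole clump of commonly accessed data)
# on a single shard, so data accessed together stays together on one node.
shards = [{} for _ in range(4)]
for user in ("alice", "bob", "carol", "dave", "erin"):
    shards[shard_for(user, 4)][user] = {"orders": []}
```

Hashing gives even loading on average without a central directory, but it ignores physical locality; databases that care about placement (or use range-based sharding) trade some of this uniformity for control over where aggregates live.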
22. Replication and sharding are orthogonal techniques: you can use either or both of them.
23. Many NoSQL databases offer auto-sharding, where the database takes on the responsibility of allocating data to shards and ensuring that data access goes to the right shard. This can make it much easier to use sharding in an application.
24. Sharding is particularly valuable for performance: it can improve both read and write performance.
It is a way to horizontally scale writes.
25. Combining Sharding and Replication
If we use both master-slave replication and sharding, this means that we have multiple masters, but each data item only has a single master. Depending on your configuration, you may choose a node to be a master for some data and a slave for other data, or you may dedicate nodes for master or slave duties.
26. Combining Sharding and Replication
27. Combining Sharding and Replication
It works well for column-family databases.
Example: tens or hundreds of nodes in a cluster, with data sharded over them.
A good starting point for peer-to-peer replication is to have a replication factor of 3, so each shard is present on three nodes.
Should a node fail, then the shards on that node will be rebuilt on the other nodes.
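The replication-factor-3 layout can be sketched as follows (a toy ring-style placement, not a specific product's algorithm): each shard lives on three consecutive nodes, and when a node fails its copies are rebuilt on live nodes that do not already hold that shard.

```python
def nodes_for_shard(shard_id, nodes, replication_factor=3):
    """Place each shard on `replication_factor` consecutive nodes, ring-style."""
    return [nodes[(shard_id + i) % len(nodes)] for i in range(replication_factor)]

nodes = ["n0", "n1", "n2", "n3", "n4"]
placement = {shard: nodes_for_shard(shard, nodes) for shard in range(10)}

def rebuild_after_failure(placement, failed, nodes):
    # Should a node fail, rebuild each of its shard copies on a live node
    # that does not already hold that shard, restoring three distinct holders.
    live = [n for n in nodes if n != failed]
    rebuilt = {}
    for shard, holders in placement.items():
        candidates = [n for n in live if n not in holders]
        rebuilt[shard] = [candidates[0] if n == failed else n for n in holders]
    return rebuilt

recovered = rebuild_after_failure(placement, "n0", nodes)
```

With 10 shards over 5 nodes each node holds copies of several shards, so when one node dies its shards are re-replicated from the two surviving copies, spread across the remaining nodes.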
28. Peer-to-peer replication with sharding
29. Thank You
Editor's Notes
Read-intensive: more and more read requests.
Write-intensive: more and more write requests.
Read resilience: should the master fail, the slaves can still handle read requests.
Again, this is useful if most of your data access is reads.
The failure of the master does eliminate the ability to handle writes until either the master is restored or a new master is appointed.
However, having slaves as replicas of the master does speed up recovery after a failure of the master, since a slave can be appointed as the new master very quickly.
Meaning of resilience: the capacity to recover quickly from difficulties (toughness).
A transient database object exists only as long as an application has an open connection to the database.
All transient objects disappear when the application shuts down the database.
This means that a persistent database does not need to be re-indexed after re-opening.
A "data clump" is a name given to any group of variables which are passed around together (in a clump) throughout various parts of the program.
A clump is a grouping.