Distributed file systems and NoSQL databases all work very well during normal operation. The big differences show up when we investigate how they behave in an emergency. Data replication strategies and recovery algorithms are the key ingredients that a distributed data store relies on to keep data safe.
Recent studies have shown that random replication is not the safest choice for storing data, as it almost guarantees data loss in the common scenario of simultaneous node failures. The Copyset Replication method significantly reduces the frequency of data loss events by selecting the replica groups in a smart way.
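To make the idea concrete, here is a minimal sketch of permutation-based copyset selection. It is not Ozone's implementation: the function names, the node names and the `scatter_width` parameter are assumptions made for illustration only. Replicas may be placed only on a small, fixed collection of node groups (copysets), so far fewer combinations of simultaneously failed nodes can wipe out every replica of a block.

```python
import math
import random
from itertools import combinations


def build_copysets(nodes, replication=3, scatter_width=4, seed=0):
    """Build copysets from random permutations of the node list (simplified)."""
    rng = random.Random(seed)
    # Each permutation gives every node (replication - 1) partners.
    rounds = math.ceil(scatter_width / (replication - 1))
    copysets = set()
    for _ in range(rounds):
        order = list(nodes)
        rng.shuffle(order)
        # Cut the permutation into consecutive groups of `replication` nodes.
        # Leftover nodes at the tail are skipped in this sketch.
        for i in range(0, len(order) - replication + 1, replication):
            copysets.add(frozenset(order[i:i + replication]))
    return copysets


def place_replicas(primary, copysets, rng):
    """Pick one of the copysets that contain the primary node."""
    candidates = sorted((c for c in copysets if primary in c), key=sorted)
    if not candidates:
        raise ValueError(f"no copyset contains {primary!r}")
    return rng.choice(candidates)


def loss_fraction(copysets, nodes, failures):
    """Fraction of failure sets of the given size that cover a whole copyset."""
    total = 0
    hits = 0
    for failed in combinations(nodes, failures):
        total += 1
        failed_set = set(failed)
        if any(c <= failed_set for c in copysets):
            hits += 1
    return hits / total


nodes = [f"dn{i}" for i in range(9)]
copysets = build_copysets(nodes, replication=3, scatter_width=4)
# With enough blocks, random replication ends up using every possible group.
every_group = {frozenset(c) for c in combinations(nodes, 3)}
print("copysets in use:", len(copysets), "of", len(every_group))
print("loss fraction, copysets:", loss_fraction(copysets, nodes, 3))
print("loss fraction, random:  ", loss_fraction(every_group, nodes, 3))
```

In this toy cluster of nine nodes, any three simultaneous failures destroy some block once random replication has used every possible group, while the copyset scheme exposes only the handful of groups it generated. The trade-off is that when a loss event does happen, more data is lost at once.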
In this talk we will introduce the key elements of successful data replication and show how advanced replication strategies can help a cluster survive outages. We will show how Apache Hadoop Ozone solves the problem and present the challenges of combining the Copyset algorithm with advanced cluster topology support.