Authors:
Syed Mohtashim Abbas Bokhari and Oliver Theel
Affiliation:
Department of Computer Science, University of Oldenburg, Germany
Keyword(s):
Distributed Systems, Fault Tolerance, Data Replication, Quorum Protocols, Operation Availability, Operation Cost, Voting Structures, Optimization, Machine Learning, Evolutionary Strategies, Genetic Programming.
Abstract:
Distributed systems are indispensable today because they must serve countless requests every second, which makes high availability essential. In dynamic environments, the scale and complexity of resources and components make such systems fault-prone: millions of computing devices are connected via communication links, and many users access shared computing resources, so faults are inevitable. Replication masks failures and thereby helps achieve a fault-tolerant distributed environment. Data replication is an appropriate means of providing highly available data access operations at relatively low operation costs. Although several data replication strategies are in use, it remains an open question which strategy is best for a given scenario or application class, assuming a certain workload, its distribution across the network, the availability of the individual replicas, and the cost of the access operations. In this regard, the research focuses on analysis, simulation, and machine learning approaches to automatically identify and design replication strategies that are optimized for a given application scenario, based on predefined constraints and properties, by exploiting a so-called voting structure.