
Redundancy schemes for high availability computer clusters

Christian Kobhio Bassek, Samuel Pierre and Alejandro Quintero

Article (2006)

Published Version
Terms of Use: Creative Commons Attribution.
Cite this document: Bassek, C. K., Pierre, S. & Quintero, A. (2006). Redundancy schemes for high availability computer clusters. Journal of Computer Science, 2(1), p. 33-47. doi:10.3844/jcssp.2006.33.47

Abstract

The primary goal of computer clusters is to improve computing performance by exploiting the parallelism they intrinsically provide. Their redundant hardware components also enable them to offer high-availability services. In this paper, we present an analytical model for analyzing redundancy schemes and their impact on the cluster’s overall performance. We analyze several cluster redundancy techniques, with an emphasis on hardware and data redundancy, and derive from them an applicable redundancy scheme design. Our solution also provides a disaster recovery mechanism that improves the cluster’s availability. For data redundancy, we present improvements to the replication and parity-based data redundancy techniques, and we investigate the availability of the cluster under several scenarios that take into account, among other things, the number of replicated nodes, the number of CPUs that hold parity data, and the relation between primary and replicated data. For this purpose, we developed a simulator that analyzes the impact of a redundancy scheme on the processing rate of the cluster. We also studied the performance of two well-known schemes according to the usage rate of the CPUs. We found that two important factors influencing the performance of a transaction-oriented cluster are its failover scheme and its data redundancy scheme. Across the data redundancy schemes we simulated, data replication offered higher cluster availability than the parity model.
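As a rough illustration of why replication can yield higher availability than parity, the sketch below compares the two under a textbook independent-failure model. This is not the analytical model or the simulator described in the paper; the function names, the per-node availability `a`, and the single-parity group layout are assumptions made only for this example.

```python
# Illustrative only: a textbook independent-failure model, NOT the paper's model.
# Assumption: every node is up with the same probability `a`, independently.

def replication_availability(a: float, copies: int) -> float:
    """Data is reachable if at least one of `copies` replicas is up."""
    return 1.0 - (1.0 - a) ** copies


def parity_group_availability(a: float, nodes: int) -> float:
    """Single-parity group of `nodes` members (data + one parity block).

    Data is recoverable if at most one member of the group is down.
    """
    all_up = a ** nodes
    exactly_one_down = nodes * a ** (nodes - 1) * (1.0 - a)
    return all_up + exactly_one_down


if __name__ == "__main__":
    a = 0.99  # hypothetical per-node availability
    print("2-way replication:", replication_availability(a, 2))
    print("5-node parity group:", parity_group_availability(a, 5))
```

Under these assumptions, a two-copy replica set is unavailable only when both copies fail, whereas a parity group becomes unavailable as soon as any two of its members fail, so larger parity groups trade availability for storage efficiency.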

Uncontrolled Keywords

Computer cluster, high availability, redundancy scheme, performance evaluation, fault tolerance

Open Access document in PolyPublie
Subjects: 2700 Information technology > 2700 Information technology
Department: Département de génie informatique et génie logiciel
Research Center: Other
Date Deposited: 07 Apr 2021 10:32
Last Modified: 08 Apr 2021 10:43
PolyPublie URL: https://publications.polymtl.ca/4769/
Document issued by the official publisher
Journal Title: Journal of Computer Science (vol. 2, no. 1)
Publisher: Science Publications
Official URL: https://doi.org/10.3844/jcssp.2006.33.47
