Cloud computing is widely popular due to its elasticity, economics, and reliability, among other benefits. It offers a scalable service without any initial investment in servers, storage, or networks. Fault Tolerance (FT) is the ability of a system to continue performing its function despite unexpected hardware or software failures. Fault Tolerance in Cloud Computing (FTCC) is an important area of research due to its complexity. However, there is a lack of studies in this field. Moreover, recent failures and availability issues at popular cloud providers demonstrate the need for more effective solutions. In this paper, we present a study of FTCC mechanisms and analyze their strengths and weaknesses. Based on the study, a comparison of the main fault tolerance techniques is presented, considering cost, overhead, failure types, performance, and the tools used. Moreover, we study and compare the models that enhance the performance of checkpoint-based and replication-based techniques.
Volume 10 | 09-Special Issue
Pages: 1065-1073