Performance evaluation for self-healing distributed services and fault detection mechanisms

Abstract

Distributed applications built on internetworked services offer users more flexible and varied functionality, and give developers the ability to incorporate a vast array of services into their applications. Such applications are difficult to develop and manage because of their inherent dynamics and heterogeneity. One desirable characteristic of distributed applications is self-healing: the ability to reconfigure themselves “on the fly” to circumvent failure. In this paper, we discuss our middleware for developing self-healing distributed applications. We present the model we adopted for self-healing behaviour and, as case studies, show the reconfiguration of an application that uses networked sorting services and of an application for networked home appliances. We discuss the performance benefits of the self-healing property by analysing the elapsed time for automatic reconfiguration without user intervention. Our results show that a distributed application developed with our self-healing middleware can continue to perform smoothly by quickly reconfiguring its services upon detection of a failure. We also consider the performance impact of several fault-detection mechanisms, including pre-emptive detection and on-use detection.
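
To make the two detection styles named in the abstract concrete, the following is a minimal sketch, not the paper's middleware. It assumes that on-use detection means a failure is noticed when a request is actually issued, and that pre-emptive detection means the bound service is probed before it is needed. The names `SelfHealingClient`, `ServiceFailure`, `broken_sorter`, and `working_sorter` are hypothetical and exist only for illustration.

```python
from typing import Callable, List, Sequence

# A "service" here is just a callable that sorts integers (hypothetical stand-in
# for a networked sorting service).
Service = Callable[[Sequence[int]], List[int]]


class ServiceFailure(Exception):
    """Raised by a (hypothetical) service that is unavailable."""


class SelfHealingClient:
    """Binds to one service at a time and rebinds to another on failure."""

    def __init__(self, services: List[Service]) -> None:
        if not services:
            raise ValueError("at least one service is required")
        self.services = list(services)
        self.current = 0  # index of the currently bound service

    def _reconfigure(self) -> None:
        # Drop the failed service and rebind to the next candidate, if any.
        del self.services[self.current]
        if not self.services:
            raise ServiceFailure("no replacement service available")
        self.current %= len(self.services)

    def call(self, data: Sequence[int]) -> List[int]:
        # On-use detection (assumed meaning): the failure is discovered while
        # serving a real request, and the request is retried after rebinding.
        while True:
            try:
                return self.services[self.current](data)
            except ServiceFailure:
                self._reconfigure()

    def probe(self, sample: Sequence[int] = (2, 1)) -> bool:
        # Pre-emptive detection (assumed meaning): test the bound service ahead
        # of time. Returns True if it was healthy, False if a rebind happened.
        try:
            self.services[self.current](sample)
            return True
        except ServiceFailure:
            self._reconfigure()
            return False


def broken_sorter(data: Sequence[int]) -> List[int]:
    raise ServiceFailure("sorter is down")


def working_sorter(data: Sequence[int]) -> List[int]:
    return sorted(data)
```

In this sketch, on-use detection adds the reconfiguration delay to the failing request itself, whereas a periodic call to `probe` moves that delay outside the request path at the cost of extra probe traffic. This is the kind of trade-off that a performance comparison of detection mechanisms would measure.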

Keywords: Distributed systems, Middleware, Autonomic computing, Self-healing, Fault-recovery, Fault-detection, Performance evaluation

Review history: Received 27 February 2005, Revised 31 July 2005, Available online 9 March 2006.

Official paper URL: https://doi.org/10.1016/j.jcss.2005.12.008