HFTC_2 is a practical, dependable distributed computer system. It operates correctly in the presence of arbitrary fault combinations, using checkpointing and system-level online diagnosis to exclude faulty nodes from the system. The most important problems in this system are synchronization, diagnosis, and reconfiguration. This paper describes how to reduce the cost of system synchronization and diagnosis in order to eliminate the major performance bottleneck in this fault-tolerant system. A logical clock replaces the real clock in system synchronization, which overcomes a fundamental limitation of the system-level diagnosis algorithm. The algorithm uses only O(Nt^2) messages to detect the faulty nodes in a system where arbitrary faulty nodes exist.
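To illustrate what replacing a real clock with a logical clock can mean, the following is a minimal sketch of a Lamport-style logical clock. This is an assumption made for illustration only: the abstract does not specify the clock rules used in HFTC_2, and the class name `LogicalClock` and its methods are hypothetical.

```python
class LogicalClock:
    """Lamport-style counter (assumed for illustration; not taken from the paper)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Move past the sender's timestamp so the receive is ordered after the send.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two hypothetical nodes exchanging one message.
a, b = LogicalClock(), LogicalClock()
a.tick()              # local event on node a
stamp = a.send()      # node a sends a message carrying its timestamp
b.receive(stamp)      # node b's clock moves past the sender's timestamp
```

Under this scheme, nodes agree on the order of causally related events without synchronized physical clocks, which is the general property that makes a logical clock a candidate replacement for a real clock in synchronization.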