Generating a Fault Tolerant Global Clock Using High-Speed Control Signals for the MetaNet Architecture

This work describes a new technique, based on exchanging control signals, for constructing a fault tolerant global clock in a point-to-point distributed system with an arbitrary topology. The approach taken in this work is to generate a global clock from the ensemble of the high-speed local transmission clocks rather than to synchronize these clocks directly. The steady-state algorithm that generates the global clock is executed in hardware by the physical network interface of each node. At the network interface it is possible to measure the propagation delay between neighboring nodes accurately, with a very small error or uncertainty, and thereby to achieve global synchronization whose error is proportional to this measurement error. It is shown that the local clock drift (or rate uncertainty) has only a secondary effect on the maximum global clock rate. The synchronization algorithm can tolerate any physical failure. It will continue to operate correctly on any connected segment of the network, i.e., it can tolerate any number of link and node failures, as long as the network remains connected. Furthermore, the algorithm can tolerate failures of the following types: (i) fast and slow clocks can be detected and isolated from the algorithm, (ii) changes in the value of link delays can be masked, and (iii) malicious changes of the global clock values can be detected and masked.

By: Yoram Ofek

Published in: The 9th International Conference on Distributed Systems, IEEE Computer Society Press, Pages: 218-226, June 1989. Also published in: IEEE Transactions on Communications, Vol. 42, No. 5, Pages: 2179-2188, May 1994; IEEE, 1991.

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.