Modeling Fault-Tolerant Mobile Agent Execution as a Sequence of Agreement Problems

Fault-tolerance is fundamental to the further development of mobile agent applications. In the context of mobile agents, fault-tolerance prevents a partial or complete loss of the agent, i.e., ensures that the agent arrives at its destination. Simple approaches such as checkpointing are prone to blocking. Replication can in principle improve solutions based on checkpointing. However, existing solutions in this context either assume a perfect failure detection mechanism (which is not realistic in an environment such as the Internet), or rely on complex solutions based on leader election and distributed transactions, where only a subset of solutions prevents blocking.

This paper proposes a novel approach to fault-tolerant mobile agent execution, which is based on modeling agent execution as a sequence of agreement problems. Each agreement problem is one instance of the well understood consensus problem. Our solution does not require a perfect failure detection mechanism, while preventing blocking and ensuring that the agent is executed exactly once.
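The idea of treating execution as a sequence of agreement problems can be illustrated with a toy simulation. The sketch below is not the authors' protocol: the names `Proposal`, `decide`, and `run_agent` are hypothetical, the replicated places per stage are an assumption, and `decide` is a deterministic stand-in for a real consensus instance (which would run among distributed processes using unreliable failure detection). It only shows the structure: at every stage several replicas may propose a result, exactly one proposal is decided, and the decided state is the sole input to the next stage.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Proposal:
    place: str   # hypothetical name of the replica that ran the agent
    state: int   # agent state after running this stage's operation


def decide(proposals: List[Optional[Proposal]]) -> Proposal:
    """Stand-in for one consensus instance (assumption, not a real protocol).

    Picks the proposal from the alphabetically smallest place name, so every
    caller given the same proposals reaches the same single decision.
    """
    candidates = [p for p in proposals if p is not None]
    if not candidates:
        raise RuntimeError("all replicas of this stage failed")
    return min(candidates, key=lambda p: p.place)


Stage = Tuple[List[str], Callable[[int], int]]


def run_agent(initial_state: int,
              stages: List[Stage],
              crashed: Set[Tuple[int, str]]) -> Tuple[int, List[str]]:
    """Run the agent stage by stage, with one agreement per stage."""
    state = initial_state
    committed: List[str] = []
    for index, (places, operation) in enumerate(stages):
        # Each live replica executes the stage on the agreed input state.
        proposals = [
            None if (index, place) in crashed
            else Proposal(place, operation(state))
            for place in places
        ]
        # Exactly one proposal is kept; the others are discarded.
        decision = decide(proposals)
        committed.append(decision.place)
        state = decision.state
    return state, committed


if __name__ == "__main__":
    stages = [
        (["a1", "a2", "a3"], lambda s: s + 1),
        (["b1", "b2", "b3"], lambda s: s * 2),
    ]
    # Replica a1 crashes during stage 0; the agent still makes progress.
    print(run_agent(0, stages, crashed={(0, "a1")}))
```

Because only the decided proposal feeds the next stage, each stage contributes its effect once even when several replicas executed it, and a crashed replica does not block progress as long as one replica of the stage survives. Side effects of the discarded executions, which a real system would have to undo or prevent, are outside this sketch.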

Keywords: mobile agent, fault tolerance, agreement problem, consensus

By: Stefan Pleisch and André Schiper

Published in: Proceedings of the 19th Symposium on Reliable Distributed Systems, Piscataway, IEEE, pp. 11-20, 2000

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.
