AgentTeamwork: Coordinating grid-computing jobs with mobile agents

Authors: Munehiro Fukuda, Koichi Kashiwagi, Shinya Kobayashi

Abstract

AgentTeamwork is a grid-computing middleware system that dispatches a collection of mobile agents to coordinate a user job over remote computing nodes in a decentralized manner. Its primary focus is to maintain high availability and dynamic balancing of distributed computing resources for a parallel-computing job. To this end, a mobile agent is assigned to each process engaged in the same job. The agent monitors the process's execution on a different machine, periodically takes an execution snapshot, moves the process to a more lightly loaded machine, and resumes it from the latest snapshot after an accidental crash. The system also restores broken inter-process communication within the same job, using its error-recoverable socket and mpiJava libraries in collaboration among mobile agents.
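
The monitor, snapshot, migrate, and resume cycle described in the abstract can be illustrated with a toy, single-machine simulation. This is only a sketch under stated assumptions and is not AgentTeamwork's actual API: the names `Node`, `Process`, `Agent`, `least_loaded`, `snapshot_every`, and `migrate_gap` are all hypothetical, a process's state is reduced to a step counter, and "migration" merely reassigns the agent's node.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical computing node with a load figure and a liveness flag."""
    name: str
    load: float
    alive: bool = True


@dataclass
class Process:
    """Toy user process whose entire state is a step counter."""
    step: int = 0

    def run_one_step(self) -> None:
        self.step += 1


def least_loaded(nodes: List[Node]) -> Node:
    """Return the live node with the smallest load."""
    return min((n for n in nodes if n.alive), key=lambda n: n.load)


@dataclass
class Agent:
    """Hypothetical agent that supervises exactly one process."""
    process: Process
    node: Node
    snapshot_every: int = 3   # assumed snapshot period, in steps
    migrate_gap: float = 0.5  # assumed load difference that triggers a move
    snapshot: int = 0         # step counter saved by the latest snapshot

    def tick(self, nodes: List[Node]) -> str:
        if not self.node.alive:
            # Crash: restart on a live node from the latest snapshot.
            self.node = least_loaded(nodes)
            self.process = Process(step=self.snapshot)
            return "resumed"
        self.process.run_one_step()
        if self.process.step % self.snapshot_every == 0:
            self.snapshot = self.process.step  # periodic snapshot
        target = least_loaded(nodes)
        if target.load + self.migrate_gap < self.node.load:
            self.snapshot = self.process.step  # snapshot before moving
            self.node = target
            return "migrated"
        return "ran"
```

In this sketch, work done after the last snapshot is lost on a crash, which is the usual trade-off of checkpoint-based recovery: a shorter snapshot period loses less work but costs more overhead.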

Keywords: Grid computing, Middleware design, Mobile agents, Process migration, Fault tolerance

Paper URL: https://doi.org/10.1007/s10489-006-9653-6