Adaptive Techniques for Scheduling Distributed Data Intensive Applications: Experiments on a Production Grid

Efficient job and data management in Data Grids is complicated by factors such as unreliable resources, fluctuating load, and the challenges of spanning multiple administrative domains. We have proposed an architecture for this scenario in which agents distributed across the Grid cooperate to schedule both jobs and data, with the goal of minimizing execution times, maximizing throughput, and/or minimizing data movement. We have deployed this resource management architecture, along with a suite of related job and data scheduling algorithms, on a 27-site wide-area Grid laboratory, Grid3. Here, we report the results of detailed experiments in this environment, using a range of scientific application workloads. We find that intelligent data scheduling is essential for certain scenarios. However, when faced with heterogeneous workloads, adaptive scheduling strategies (that can alternate between data-centric and compute-centric approaches) are crucial for achieving good performance. We also discuss insights gained (1) while implementing our architecture on Grid3 and (2) by comparing our experimental results on this Grid laboratory to results obtained via simulations.

By: Kavitha Ranganathan; Ian Foster

Published in: RC23522 in 2005

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

rc23522.pdf

Questions about this service can be mailed to reports@us.ibm.com.