Estimating End-to-End Performance by Collaborative Prediction with Active Sampling

Accurately estimating end-to-end performance in distributed systems is essential both for monitoring compliance with service-level agreements (SLAs) and for performance optimization (e.g., choosing the highest-bandwidth server for a download request in a content-distribution system). However, exhaustive pairwise measurement of end-to-end performance is infeasible in large systems, and such measurements cannot be kept up to date in highly dynamic environments. A natural alternative is therefore to predict unobserved end-to-end performance from available historical data, using a minimal number of additional measurements. In this paper we present an approach to this problem based on Collaborative Prediction (CP), an estimation method designed to work with sparse data that has enjoyed much success in other domains (e.g., product recommendation systems). Specifically, we use Max-Margin Matrix Factorization (MMMF), a linear factor model for CP that has outperformed state-of-the-art CP techniques and does not rely on additional instrumentation such as the existence of landmark nodes (a typical assumption in most approaches to network distance prediction). Moreover, our approach readily admits active sampling based on prediction confidence, and we further propose a novel active-sampling collaborative prediction approach that yields even higher predictive accuracy while allowing a flexible trade-off between “exploration” (choosing suboptimal samples to improve estimation accuracy) and “exploitation” (choosing the node with the best estimated performance). We demonstrate successful empirical results on a variety of practical problems, including network latency prediction (NLANR-AMP, P2PSim and PlanetLab datasets) and bandwidth prediction in content-distribution systems (IBM’s downloadGrid data).
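
The idea of combining a max-margin factor model with confidence-based active sampling can be illustrated with a minimal sketch. This is not the authors' MMMF solver: it assumes a binary performance matrix with entries in {-1, +1} (e.g., "slow" vs. "fast" node pairs), fits a low-rank factorization by plain subgradient descent on a hinge loss with Frobenius-norm penalties on the factors, and then queries the unobserved pair whose predicted margin is smallest in magnitude. The function names, hyperparameters, and the binary encoding are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_factors(Y, mask, rank=2, reg=0.1, lr=0.05, steps=500, seed=0):
    """Fit X = U @ V.T to the observed +/-1 entries of Y (illustrative only).

    Y    : float array of shape (n, m) with entries in {-1, +1}
    mask : boolean array of shape (n, m), True where Y is observed
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    for _ in range(steps):
        X = U @ V.T
        # Hinge-loss subgradient w.r.t. X: -y on observed entries
        # whose margin y * x is below 1, zero elsewhere.
        G = np.where(mask & (Y * X < 1.0), -Y, 0.0)
        U_next = U - lr * (G @ V + reg * U)
        V = V - lr * (G.T @ U + reg * V)
        U = U_next
    return U, V

def least_confident_query(U, V, mask):
    """Return the unobserved (row, col) pair with the smallest |prediction|.

    Assumes at least one entry of mask is False.
    """
    confidence = np.abs(U @ V.T)
    confidence[mask] = np.inf  # never re-query an observed pair
    i, j = np.unravel_index(np.argmin(confidence), confidence.shape)
    return int(i), int(j)
```

In an active-sampling loop, the returned pair would be measured, added to the observed set, and the factors refit. Always picking the least-confident pair corresponds to pure exploration; a trade-off with exploitation, as described in the abstract, would require a different selection rule that this sketch does not attempt.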

By: Irina Rish; Gerald Tesauro

Published in: 2007 10th IFIP/IEEE International Symposium on Integrated Network Management, Piscataway, NJ, pp. 294-303, 2007

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.