Quantitative performance diagnosis (QPD) provides explanations that quantify the impact of problem causes. An example of such an explanation is: "Increased web server traffic accounts for 90% of the increase in LAN utilization, which in turn accounts for 20% of the increase in web response times." This paper describes GAP, a general approach to quantitative performance diagnosis. GAP has two parts: (1) an algorithm for computing quantitative performance diagnoses and (2) a framework for constructing diagnostic techniques that provides the basis for the quantifications produced by the algorithm. The GAP algorithm makes use of a measurement navigation graph (MNG), a directed acyclic graph whose nodes are measurement variables and whose arcs have weights that quantify the effect of child variables (e.g., LAN utilization) on parent variables (e.g., response time). Various properties of the algorithm are established, especially that its quantification of explanations can be interpreted as fractional contributions to the performance problem. Arc weights are computed by diagnostic techniques. A framework for developing diagnostic techniques is described that consists of (a) the choice of statistic (e.g., mean, variance) used to aggregate problem values and (b) the estimator of the statistic. The framework is applied to existing diagnostic techniques to assess their effectiveness and is used to construct a new diagnostic technique for performance problems in a production computing system. It is also used to show that for uniform-magnitude performance problems (e.g., a step), the standard deviation is preferred to the mean if the problem data have a coefficient of variation no larger than (1 − f)/f, where f is the fraction of the data containing the performance problem.

By: Joseph L Hellerstein

Published in: RC22682 in 2002
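
As an illustration of the example explanation in the abstract, the following minimal sketch treats an MNG as a list of weighted parent-to-child arcs and multiplies weights along each path from the problem variable. The abstract does not spell out the GAP algorithm, so the multiplicative composition, the function name `path_contributions`, and the variable names are all assumptions made for illustration only. Under that assumption, web server traffic would account for 90% of 20%, that is, 18% of the increase in web response times.

```python
from collections import defaultdict


def path_contributions(arcs, root):
    """Sum, for each descendant of `root`, the product of arc weights
    over every path from `root` to that descendant.

    `arcs` is a list of (parent, child, weight) tuples describing a
    directed acyclic graph. Hypothetical helper, not the paper's algorithm.
    """
    children = defaultdict(list)
    for parent, child, weight in arcs:
        children[parent].append((child, weight))

    totals = defaultdict(float)

    def walk(node, product):
        # Carry the running product of weights down each path.
        for child, weight in children[node]:
            contribution = product * weight
            totals[child] += contribution
            walk(child, contribution)

    walk(root, 1.0)
    return dict(totals)


# Arc weights taken from the example explanation in the abstract.
arcs = [
    ("web_response_time", "lan_utilization", 0.20),
    ("lan_utilization", "web_server_traffic", 0.90),
]
result = path_contributions(arcs, "web_response_time")
for variable, fraction in result.items():
    print(f"{variable}: {fraction:.2f}")
```

The recursion revisits shared descendants once per path, which is acceptable for a small graph; a larger MNG would call for a traversal in topological order.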

