Optimal Control of Web Hosting Systems under Service Level Agreements

The operation of a web-hosting facility involves control elements and decisions spanning many time scales. The physical computing facility contains many control points with parameters that can be tuned in real time to respond to performance statistics. At one extreme, high-performance routers operate at very fine time scales and adjust their parameters accordingly. Operating system, middleware and application software may also monitor performance and set control parameters. At somewhat coarser time scales, computer and disk farm partitions may be reassigned in response to changing workloads. Daily, weekly, monthly and longer-term forecasts may be used to schedule and plan allocation policies. At weekly and monthly time scales, the capacity of the resource elements may be reviewed, prompting the utility’s supply chain model to process additional orders or to obtain short-term capacity. Finally, monthly and yearly performance reporting may require changes in the basic terms and conditions of the Service Level Agreement (SLA) and may affect strategic models addressing the computing utility’s profitability. The key operational point is that the overall solution must provide a unified framework in which the various solution methods work together cohesively, across a wide range of time scales, toward a common overall goal.

The physical computing facility includes control points at the router and server. Policies at these control points can be used to achieve QoS performance objectives through the allocation of resource elements, once information about the arrival and service processes is known or forecast. The availability of such workload information often depends strongly on the time scale of the control point. As a specific example, routers working at very fine time scales often have no information about the arrival process beyond its current mean rate, because collecting more detailed information incurs too much overhead at these time scales. More detailed workload information is typically available as one moves to coarser time scales.
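As a rough illustration of allocating resource elements when only a mean arrival rate is known, the sketch below sizes a pool of identical servers against a mean response-time target. It is not the report's method: it assumes traffic is split evenly and that each server behaves as an M/M/1 queue, whose mean response time is 1/(mu - lambda). The function name and parameters are hypothetical.

```python
import math


def min_servers_for_target(arrival_rate, service_rate, target_response_time):
    """Smallest number of identical servers meeting a mean response-time target.

    Assumption (not from the report): total traffic is split evenly and each
    server is an M/M/1 queue, so its mean response time is
    1 / (service_rate - arrival_rate / n).
    """
    if arrival_rate < 0 or service_rate <= 0 or target_response_time <= 0:
        raise ValueError("rates and target must be positive (arrival_rate may be 0)")
    if target_response_time <= 1.0 / service_rate:
        # Even an idle server needs 1/service_rate on average per request.
        raise ValueError("target is not achievable at this service rate")

    # 1 / (mu - lambda/n) <= T  is equivalent to  lambda/n <= mu - 1/T.
    per_server_capacity = service_rate - 1.0 / target_response_time
    return max(1, math.ceil(arrival_rate / per_server_capacity))
```

For example, under these assumptions `min_servers_for_target(90, 10, 0.5)` returns 12: each server can absorb at most 8 requests per unit time while keeping the mean response time at or below 0.5. A control point with richer workload information, as is typical at coarser time scales, could replace the M/M/1 assumption with a more detailed model.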

The varying assignment of resource elements over time is perhaps what most comes to mind when operational models of the computing utility are discussed and marketed: additional resource elements can be brought in to accommodate situations where customer user populations create bursts of demand. Different architectures can respond to such requirements with varying abilities, from hot repartitioning of a single large mainframe-class machine to a cold restart of a rack-mounted pizza-box server. Reassigned resource elements are necessarily diverted from actual or potential use by other customers, so the operational models for these decisions typically must encompass a large number of potential actions and rewards.

By: Alan J. King, Mark S. Squillante

Published in: RC23094 in 2004

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

