Characterizing the Impact of Different Memory-Intensity Levels

Applications on today’s high-end processors typically make varying load demands over time. A single application may have many different phases during its lifetime, and workload mixes show interleaved phases. This work examines and uses the differences between memory- and CPU-intensive phases to reduce power. Today’s processors provide resources that are underutilized during memory-intensive phases, consuming power while producing little incremental gain in performance. This work examines a deployed system consisting of identical cores with a goal of running them at different effective frequencies. The initial goal is to find the appropriate frequency at which to run each phase.

This paper demonstrates that memory intensity directly affects the throughput of applications. The results indicate that simple metrics such as IPC (instructions per cycle) cannot be used to determine what frequency to run a phase. Instead, one identifies phases through the use of performance counters which directly monitor memory behavior. Memory-intensive phases can then be run on a slower core without incurring significant performance penalties. The key result of the paper is the introduction of a very simple, online model that uses the performance counter data to predict the performance of a program phase at any particular frequency setting. The information from this model allows a scheduler to decide which core to use to execute the program phase. Using a sophisticated power model for the processor family shows that this approach significantly reduces power consumption. The model was evaluated using a subset of SPECCPU and the SPECjbb and TPC-W benchmarks. It predicts performance with an average error of less than 10%. The power modeling shows that memory-intensive benchmarks achieve about a 50% power reduction at a performance loss of less than 15% when run at 80% of nominal frequency.

By: Ramakrishna Kotla, Anirudh Devgan, Soraya Ghiasi, Freeman Rawson, Tom Keller

Published in: RC23351 in 2004


This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.


Questions about this service can be mailed to .