Mesoscale Performance Simulations of Multicore Processor Systems with Asynchronous Memory Access

Increasing on-chip transistor densities allow for a myriad of design choices for modern multicore processors. However, conducting a meaningful design space exploration of large systems with detailed cycle-accurate simulations using large, complex workloads can be very time-consuming and can adversely impact product schedules. There are three main reasons for this: 1) the high (human) cost of developing cycle-accurate simulators, 2) long simulation times for any sufficiently detailed simulator of a large, complex system, and 3) long running times for modern workloads. While statistical sampling techniques address workload run lengths, there exists no proven technique to replace the use of detailed cycle-accurate simulators for design space exploration.

In this paper, we introduce mesoscale simulation, a methodology for design space exploration that mitigates the cost of cycle-accurate simulators. Mesoscale simulation is a hybrid approach that combines elements of high-level modeling and low-level cycle-accurate simulation to enable the construction of fast, high-fidelity performance models. Such models can be used to quickly explore vast areas of the design space and prune it to a size that is manageable for studies based on cycle-accurate simulators. We describe a proof-of-concept mesoscale implementation of the memory subsystem of the Cell/B.E. processor and discuss results from running various workloads.

By: Peter Altevogt; Tibor Kiss; Mike Kistler; Ram Rangan

Published in: RC24931 in 2010


