Exploiting eDRAM Bandwidth with Data Prefetching: Simulation and Measurements

Compared to conventional SRAM, embedded DRAM (eDRAM) offers power, bandwidth and density advantages for the design of large on-chip cache memories. However, eDRAM has longer access times than conventional SRAM arrays.

Data prefetching offers an attractive solution to the latency problem of a large-capacity eDRAM cache because it reduces the average access latency. Moreover, data prefetching exploits the large eDRAM bandwidth more fully by making efficient use of its wide data accesses.
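The effect on average access latency can be illustrated with a minimal sketch. It is not the prefetch cache design studied in the report: the function name, the 128-byte line width, the cycle counts and the idealized assumption that a prefetched line always arrives before it is needed are all hypothetical values chosen for illustration.

```python
def average_latency(addresses, line_bytes=128, hit_cycles=3,
                    miss_cycles=30, prefetch=True):
    """Average access latency (in cycles) for a tiny prefetch buffer.

    All parameters are illustrative assumptions, not Blue Gene/L figures.
    The model is idealized: a prefetched line is assumed to be present
    by the time the next access is issued.
    """
    addresses = list(addresses)
    if not addresses:
        raise ValueError("need at least one address")

    buffered = set()  # line numbers currently held in the buffer
    total_cycles = 0
    for addr in addresses:
        line = addr // line_bytes
        # A buffered line is served quickly; anything else pays the
        # full (slow) array access time.
        total_cycles += hit_cycles if line in buffered else miss_cycles
        # Keep the current wide line and, when prefetching is enabled,
        # also fetch the next sequential line ahead of use.
        buffered = {line, line + 1} if prefetch else {line}
    return total_cycles / len(addresses)


if __name__ == "__main__":
    stream = range(0, 1024, 8)  # sequential 8-byte accesses
    print("with prefetch:   ", average_latency(stream))
    print("without prefetch:", average_latency(stream, prefetch=False))
```

For a sequential access stream, the prefetching variant pays the slow access only once, on the first line, so its average latency is lower than that of the variant that fetches each wide line on demand.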

In this work, we explore design trade-offs for the prefetch data cache in the Blue Gene/L supercomputer. We also compare our simulation results to measurements on actual Blue Gene systems. These experiments validate our modeling environment. Actual execution time measurements also include any system effects not modeled in our performance analysis environment, and confirm the selection of simulation parameters included in the model.

By: Valentina Salapura, José R. Brunheroto, Fernando Redígolo, Alan Gara

Published in: RC24142 in 2006

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

