Compiling for the Active Memory Cube

In previous work we introduced a novel processing-in-memory embedded device that achieves high power efficiency by moving computation to data and by using a carefully designed microarchitecture that eliminates much of the hardware support and complexity of conventional processors. It relies on sophisticated compiler, runtime, and support software to deliver high performance.

In this work we describe the design and implementation of a compiler for this accelerator that uses the new OpenMP 4.0 accelerator model to offload and parallelize programs. We exploit architectural features such as VLIW and vector capabilities, hide latency to memory, and reuse data using the large vector register files. We achieve high computational efficiency, linear performance scaling, and superlinear performance per watt scaling on memory- and compute-bound kernels. Most importantly, we are able to achieve these results using standard, portable pragmas and no accelerator-specific program code. We believe our work is an important step toward building next-generation, power-efficient computing systems.
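As a rough illustration of what "standard, portable pragmas and no accelerator-specific program code" can look like, the sketch below offloads a simple vector update using OpenMP 4.0 device constructs. It is not taken from the report: the function name `daxpy_offload` and the choice of kernel are assumptions made for illustration, and how the report's compiler maps such a loop onto the accelerator's VLIW and vector hardware is not shown here.

```c
/* Hypothetical example kernel (not from the report):
 * computes y[i] = a * x[i] + y[i] for i in [0, n). */
void daxpy_offload(int n, double a, const double *x, double *y)
{
    /* OpenMP 4.0 combined construct: "target" offloads the region to a
     * device, "teams distribute" spreads loop chunks across teams, and
     * "parallel for" shares each chunk among the threads of a team.
     * The map clauses copy x to the device and y in both directions. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The source contains only standard directives. A compiler without OpenMP support ignores the pragma and runs the loop serially on the host, and an OpenMP implementation with no device available typically falls back to host execution, so the same code stays portable.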

By: Arpith C. Jacob, Zehra Sura, Tong Chen, Carlo Bertolli, Samuel Antao, Olivier Sallenave, Kevin O’Brien, Hans Jacobson, Ravi Nair, Jose R. Brunheroto, Philip Jacob, Bryan S. Rosenburg, Yoonho Park, Alexandre E. Eichenberger, Changhoan Kim

Published in: RC25644 in 2016

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

Questions about this service can be mailed to reports@us.ibm.com.