Hybrid Collective Operations on Power7 IH

The Power7 IH (P7IH) is one of IBM’s latest-generation supercomputers. Like most modern leadership-class parallel machines, it has a hierarchical organization: simultaneous multithreading (SMT) within a core, multiple cores per processor, multiple processors per node (SMP), and multiple SMP nodes per cluster. A low-latency, high-bandwidth network with specialized accelerators interconnects the SMP nodes. System software is tuned to exploit the hierarchical organization of the machine.

In this paper we present a novel set of collective operations that take advantage of the P7IH hardware. We discuss nonblocking collective operations implemented using point-to-point messages, shared memory, and accelerator hardware. We show how collectives can be composed to exploit the hierarchical organization of the P7IH to provide low-latency, high-bandwidth operations. We demonstrate the scalability of the collectives we designed by including experimental results on a P7IH system with up to 4096
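As an illustration of what composing collectives over a two-level hierarchy can look like, the following is a minimal single-process sketch of an allreduce built from three phases: an intra-node reduce, an inter-node allreduce among node leaders, and an intra-node broadcast. It is not the implementation described in the report. The function name hierarchical_allreduce, the ranks_per_node parameter, and the contiguous block placement of ranks on nodes are assumptions made for this example, and the communication steps are simulated with plain Python lists rather than shared memory or network hardware.

```python
from functools import reduce
import operator


def hierarchical_allreduce(values, ranks_per_node, op=operator.add):
    """Simulate an allreduce composed of three phases over a two-level hierarchy.

    values[i] is the contribution of rank i. Ranks are assigned to nodes in
    contiguous blocks of ranks_per_node (an assumption of this sketch).
    Returns a list holding the result seen by each rank.
    """
    if ranks_per_node <= 0:
        raise ValueError("ranks_per_node must be positive")
    if not values:
        return []

    # Group ranks into nodes (hypothetical block placement).
    nodes = [values[i:i + ranks_per_node]
             for i in range(0, len(values), ranks_per_node)]

    # Phase 1: intra-node reduce to one leader per node
    # (stands in for a shared-memory reduction).
    leader_partials = [reduce(op, node) for node in nodes]

    # Phase 2: inter-node allreduce among the leaders
    # (stands in for the network or accelerator step).
    total = reduce(op, leader_partials)

    # Phase 3: intra-node broadcast from each leader to its local ranks.
    return [total for node in nodes for _ in node]


if __name__ == "__main__":
    # Eight ranks, four per node, summing the values 1..8.
    print(hierarchical_allreduce([1, 2, 3, 4, 5, 6, 7, 8], ranks_per_node=4))
```

In this arrangement only one rank per node takes part in the inter-node phase, which is the usual motivation for a hierarchical composition: traffic between ranks on the same node stays off the interconnect.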

By: Gabriel Tanase; Gheorghe Almasi; Charles Archer; Hanhong Xue

Published in: RC25259 in 2012


