Next-Generation Performance Counters: Towards Monitoring over Thousand Concurrent Events

Copyright © (2008) by IEEE. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.

We present a novel performance monitor architecture, implemented in the Blue Gene/P™ supercomputer. This performance monitor supports the tracking of a large number of concurrent events by using a hybrid counter architecture. The counters have their low-order data implemented in registers, which are updated concurrently, while the high-order counter data is maintained in a dense SRAM array that is updated from the registers on a regular basis. The performance monitoring architecture includes support for per-event thresholding and fast event notification, using a two-phase interrupt-arming and triggering protocol. A first implementation provides 256 concurrent 64-bit counters, which offer up to a 64x increase in the number of counters compared to performance monitors typically found in microprocessors today, and thereby dramatically expand the capabilities of counter-based performance tuning.
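The hybrid counter idea described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the Blue Gene/P hardware design: the class name `HybridCounters`, the 12-bit register width, the sticky carry flag, and the `sweep` method are all hypothetical choices made for illustration. The model covers counting only and leaves out thresholding and interrupts.

```python
class HybridCounters:
    """Toy model: narrow per-event registers backed by a wide array.

    All names and widths here are illustrative assumptions,
    not details taken from the hardware.
    """

    def __init__(self, num_counters=256, low_bits=12, total_bits=64):
        self.low_bits = low_bits                      # assumed register width
        self.low_mask = (1 << low_bits) - 1
        self.high_mask = (1 << (total_bits - low_bits)) - 1
        self.low = [0] * num_counters                 # fast registers, one per event
        self.carry = [False] * num_counters           # sticky overflow flag per register
        self.high = [0] * num_counters                # stands in for the dense SRAM array

    def count(self, idx):
        """Record one occurrence of event idx in its low-order register."""
        self.low[idx] = (self.low[idx] + 1) & self.low_mask
        if self.low[idx] == 0:                        # register wrapped around
            if self.carry[idx]:
                # A second wrap before a sweep would lose counts.
                raise RuntimeError("sweep interval too long for register width")
            self.carry[idx] = True

    def sweep(self):
        """Periodic update: fold pending carries into the high-order array."""
        for idx, pending in enumerate(self.carry):
            if pending:
                self.high[idx] = (self.high[idx] + 1) & self.high_mask
                self.carry[idx] = False

    def read(self, idx):
        """Return the full counter value, including a carry not yet swept."""
        high_now = (self.high[idx] + int(self.carry[idx])) & self.high_mask
        return (high_now << self.low_bits) | self.low[idx]


if __name__ == "__main__":
    counters = HybridCounters(num_counters=4, low_bits=4)
    for i in range(100):
        counters.count(2)
        if i % 10 == 9:          # sweep more often than a register can wrap
            counters.sweep()
    print(counters.read(2))      # 100
```

The point of the split is that only the narrow registers need to be updated on every event, while the wide array is touched once per sweep. The sweep interval must be shorter than the time a register takes to wrap, which the model enforces by raising an error on a second unswept overflow.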

By: Valentina Salapura; Karthik Ganesan; Alan Gara; Michael Gschwind; James C. Sexton; Robert E. Walkup

Published in: Proceedings of ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and Software. Piscataway, NJ, IEEE, pp. 139-146, 2008


