A Semi-Custom VLSI Design Flow and Its Application to the Branch Address Calculator in IBM Power4 Microprocessor

In this paper we present the design and implementation of the branch address calculator in the Instruction Fetch Unit (IFU) of the IBM Power4 Microprocessor, which operates at 1.7 GHz in a 0.18 µm SOI technology. A semi-custom methodology that combines flexible custom circuit design with automated tuning and physical design tools is shown to provide new opportunities for optimizing designs throughout the development cycle. The resulting branch calculator supports a 3-cycle branch redirect loop to the L1 cache, which is key to IFU performance. To achieve high fetch bandwidth, eight branch calculators compute branch addresses in parallel for the eight instructions fetched from the L1 cache. This replication of hardware makes the power-performance tradeoff an important issue in the circuit implementation. It is shown that, with careful optimization, high performance can be achieved with a robust, tuned static design, thereby maintaining a power-efficient design point.
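As an illustration of the function the abstract describes, the following is a minimal behavioral sketch of branch target calculation for a group of eight fetched instructions. It assumes the standard PowerPC I-form (opcode 18) and B-form (opcode 16) encodings and 4-byte instructions; the function names `branch_target` and `targets_for_fetch_group` are hypothetical, and the sketch says nothing about the circuit implementation, timing, or any register-indirect branches handled in the actual design.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit effective addresses


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(instr, cia):
    """Target of a 32-bit I-form or B-form branch at address cia, else None.

    Behavioral model only: branches through the link or count register
    are not covered here.
    """
    opcode = (instr >> 26) & 0x3F
    aa = (instr >> 1) & 1  # absolute-address bit
    if opcode == 18:    # I-form: 24-bit LI field followed by two zero bits
        disp = sign_extend(instr & 0x03FFFFFC, 26)
    elif opcode == 16:  # B-form: 14-bit BD field followed by two zero bits
        disp = sign_extend(instr & 0x0000FFFC, 16)
    else:
        return None
    base = 0 if aa else cia
    return (base + disp) & MASK64


def targets_for_fetch_group(instrs, base_addr):
    """Compute a target for each slot of a fetch group, one slot per word.

    In hardware the slots are evaluated in parallel; here they are simply
    independent of one another.
    """
    return [branch_target(w, base_addr + 4 * i) for i, w in enumerate(instrs)]


if __name__ == "__main__":
    group = [0x60000000] * 7 + [0x48000008]  # seven nops, then a relative branch
    for slot, target in enumerate(targets_for_fetch_group(group, 0x1000)):
        print(slot, None if target is None else hex(target))
```

Each slot depends only on its own instruction word and address, which is why the calculation replicates cleanly across the eight instructions.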

By: Pong-Fei Lu, Gregory A. Northrop, Kevin Chairot

Published in: RC23014 in 2003

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

