# **IBM Research Report**

### A 6-bit, 1.6-GSample/s CMOS Flash Converter for Multi-Gigabit Wireless Communication

Mohit Kapur, Vincent W. Leung\*, Scott K. Reynolds, Daniel M. Kuchta, Christian W. Baks IBM Research Division Thomas J. Watson Research Center P.O. Box 218 Yorktown Heights, NY 10598

\*Currently with Silicon Laboratories, Somerset, NJ



Research Division Almaden - Austin - Beijing - Haifa - India - T. J. Watson - Tokyo - Zurich

LIMITED DISTRIBUTION NOTICE: This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publication, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g. payment of royalties). Copies may be requested from IBM T. J. Watson Research Center, P. O. Box 218, Yorktown Heights, NY 10598 USA (email: reports@us.ibm.com). Some reports are available on the internet at <a href="http://domino.watson.ibm.com/library/CyberDig.nsf/home">http://domino.watson.ibm.com/library/CyberDig.nsf/home</a>

## A 6-bit, 1.6-GSample/s CMOS Flash Converter For Multi-Gigabit Wireless Communication

Mohit Kapur, Vincent W. Leung\*, Scott K. Reynolds, Daniel M. Kuchta and Christian W. Baks IBM T. J. Watson Research Center, Yorktown Heights, NY \*Now with Silicon Laboratories, Somerset, NJ

*Abstract*: A 150-mW, 6-bit, 1.6-gigasample/second (GS/s) flash converter in 0.13-µm CMOS technology is described. The analog section uses an input track-and-hold and comparator preamps with resistive output averaging to achieve > 5.5 ENOB uncalibrated performance up to 1.6 GS/s with DC input. The digital section, designed using standard cells, contains a 1:4 deserializer, a bubble error correction circuit, and on-chip storage of thermometer data and of 128 6-bit binary samples for off-chip FFT computation. A three-wire serial interface is used to configure the part and read data stored on-chip. A 6-bit, 16:1 decimated output is also provided for spectral study.

#### I. INTRODUCTION

The recent success of an integrated transmitter and receiver for several-hundred-megabit to multi-gigabit wireless communication in the 60-GHz ISM band [1] has fueled the demand for a digital baseband chip and accompanying ADC running at these data rates. For omni-directional links using OFDM modulation, there is a need for low-power, high-sampling-rate converters to translate the analog I and Q channel outputs from a receiver into binary data. A CMOS ADC allows integration of the full digital baseband and ADC into a single chip, reducing the cost of highly-integrated millimeter-wave radios. There is also the possibility of integrating the RF front-end, ADC, and digital baseband into a single chip.

The 6-b resolution of the ADC was selected to provide ample dynamic range for broadband, multi-carrier OFDM at 2Gb/s. Resolutions of 5-b are common in UWB systems [2], and the increased 6-b resolution targeted here was a conservative choice to account for higher data rates and more OFDM sub-channels; 5-b may ultimately prove adequate for this application as well.

A 6-b, > 1 GS/s ADC is a challenging design in 0.13-$\mu$m CMOS because it pushes the speed limits of the technology while simultaneously demanding low-offset comparators. The offsets required cannot be obtained by using long-channel devices without slowing down the comparators too much. This design addresses the problem by using comparator preamps with resistor-ladder output averaging, obtaining > 5.5 ENOB performance without calibration at DC input and a sampling frequency of 1.6 GHz.

Beyond the circuit design challenges in the analog front end, there are a variety of issues in the design of the digital blocks, the most important being latency. A fast automatic gain control implemented in the digital domain requires a very low-latency ADC architecture. Low latency in turn drives the need for simple bubble error correction schemes [3], which require only a small amount of combinational logic at the cost of an increased risk of missing bubbles. Another selection criterion for the correction scheme is the subset of error patterns it must correct. An error pattern in thermometer code is characterized by the number of bubbles and by how they are dispersed in the code at a given instant. The addition of diagnostic circuits to store thermometer code can help refine the selection of the bubble error correction circuit in future designs.

Apart from design challenges, there are many test challenges involved in the accurate characterization of gigasample A/D converters. For DC analysis these range from the generation of low-noise DC ramps for input data to the capture of multi-bit output binary code at extremely high data rates. Even in DC analysis the output code has to be captured at the full output clock rate to allow the study of adjacent code jumps. No commercially available logic analyzer can handle such high data rates, so additional test circuits must be designed to capture the high-speed digital data. In dynamic analysis the conventional problems of coherent versus incoherent sampling are aggravated at these frequencies. Low-frequency drift between the clock and data sources causes amplitude modulation in the output digital code, resulting in dispersion of the fundamental component in the output spectrum.

#### II. CIRCUIT DESCRIPTION

#### A. Analog Section

A block diagram of the analog section is shown in Fig. 1. A track-and-hold circuit (T/H) is used to reduce the sampling error caused by asynchronous clock arrival at the 63 comparators. The T/H output drives 63 preamplifiers, each with a gain of approximately 3. Resistive averaging is applied at the preamplifier outputs to reduce DNL and INL due to offsets in the preamplifiers' input FETs. To allow the highest speed, the regenerative comparators use minimum-channel-length NFETs in the input stage, which is made possible by the gain of the preamps; without the preamps, the offsets of minimum-length devices would be too high.



Fig. 1. Block diagram of Analog Section





A simplified schematic of the differential T/H is shown in Fig. 2. NFET sampling switches drive the gates of PFET source followers, with additional storage capacitance to ground. NFET capacitors at the source and drain of the NFET switches are driven with an out-of-phase clock to partially compensate for the charge injection that occurs when the sampling switches are turned off.

A simplified schematic of the comparator preamplifier is shown in Fig. 3. The differential input signal has a differential reference voltage subtracted from it and is then amplified by about three. More specifically, the positive input signal and the positive reference signal are applied to a PFET differential pair with PFET current source. The negative input signal and negative reference are applied to a second identical differential pair, and the current outputs of the two differential amplifiers are summed into a common pair of NFET load devices.



Fig. 3. Preamplifier circuit

Monte Carlo simulations run on the preamplifier show input-referred offsets of 7-10 mV (1$\sigma$), compared to a nominal 22 mV LSB. Resistor averaging between adjacent preamplifier outputs reduces the DNL due to these offsets by a factor of 9.4 and the INL by a factor of 2.4. The averaged preamp outputs are applied to conventional regenerative comparators with NFET inputs, which in turn drive set-reset latches to produce a 63-b thermometer-code digital output.
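To put these offset figures in perspective, the following minimal sketch estimates the rms DNL that the quoted offsets would cause without averaging. It assumes independent Gaussian offsets on adjacent comparator thresholds, so that each code width carries the difference of two offsets, and it assumes that the factor of 9.4 applies directly to the rms DNL. Both assumptions, and the function name, are illustrative and are not taken from the simulations described above.

```python
import math


def unaveraged_dnl_sigma_lsb(offset_sigma_mv, lsb_mv):
    """RMS DNL in LSB for independent threshold offsets (assumed model).

    A code width is the difference of two adjacent thresholds, so its
    standard deviation is sqrt(2) times the per-threshold offset sigma.
    """
    return math.sqrt(2.0) * offset_sigma_mv / lsb_mv


LSB_MV = 22.0                # nominal LSB quoted in the text
DNL_AVERAGING_FACTOR = 9.4   # DNL reduction factor quoted in the text

for sigma_mv in (7.0, 10.0):  # offset range quoted in the text
    raw = unaveraged_dnl_sigma_lsb(sigma_mv, LSB_MV)
    averaged = raw / DNL_AVERAGING_FACTOR  # assumes the factor scales rms DNL
    print(f"sigma = {sigma_mv:4.1f} mV: raw DNL ~ {raw:.2f} LSB rms, "
          f"averaged ~ {averaged:.3f} LSB rms")
```

Under this simplified model the unaveraged rms DNL is already on the order of half an LSB, which illustrates why some form of offset reduction is needed to reach 6-b linearity with small devices.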

#### B. Digital Encoder & On-chip Diagnostics

The digital section primarily performs bubble error correction, thermometer-to-binary code conversion, and on-chip data storage for diagnosis and characterization. Fig. 4 shows the block diagram of the digital section.

To reduce the probability of metastability propagating from the comparator outputs into the digital stage, two latch stages were added in the data path.

The A/D output, before being used in a digital demodulator, is typically deserialized to a wider and slower data bus. Moving the deserializer stage to the output of the comparators allows the digital logic to run at a lower clock rate, thus increasing the reliability of the design. A binary-tree deserializer brings the data rate down to a quarter of the sampling rate. This comes at a high latency cost of six cycles of the sampling clock.
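The data reordering performed by a binary-tree 1:4 deserializer can be sketched behaviorally as two cascaded 1:2 demultiplexing stages. This is only a software model of the ordering, under the assumption that each stage splits its input into even and odd samples; it does not model the latches, clock dividers, or the six-cycle latency of the actual circuit, and the function names are hypothetical.

```python
def demux_1_to_2(stream):
    """Split a sample stream into its even-indexed and odd-indexed halves."""
    return stream[0::2], stream[1::2]


def deserialize_1_to_4(stream):
    """Binary-tree 1:4 deserializer model: two levels of 1:2 demultiplexing.

    Returns a list of 4-sample words in original time order; each word
    would be presented on a parallel bus at one quarter of the sample rate.
    Trailing samples that do not fill a complete word are dropped.
    """
    even, odd = demux_1_to_2(list(stream))
    lane0, lane2 = demux_1_to_2(even)   # samples 0, 4, 8, ... and 2, 6, 10, ...
    lane1, lane3 = demux_1_to_2(odd)    # samples 1, 5, 9, ... and 3, 7, 11, ...
    return list(zip(lane0, lane1, lane2, lane3))


if __name__ == "__main__":
    for word in deserialize_1_to_4(range(12)):
        print(word)
```

Each tree level halves the clock rate seen by the logic behind it, which is the reason the downstream standard-cell logic only has to run at a quarter of the sampling rate.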

The use of a T/H circuit in the input stage and careful clock routing in the comparators reduce the probability of bubbles, so a three-to-one majority detector was used for bubble correction [3]. For additional safety, a three-input NAND gate was used for level detection, followed by a multi-input OR for level-to-binary conversion. This approach brings the total latency to sixteen sampling clock cycles: two for the metastability latches, six in the deserializer, four for pipelining the combinational logic, and four for cleaning the output data before applying it to the output drivers.
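The encoding chain described above can be illustrated with a small behavioral model. The sketch below is an assumption-laden illustration rather than the implemented logic: it pads the code with a 1 below the lowest comparator and a 0 above the highest, applies a three-input majority vote to each bit, detects the top of the column of ones with a three-input condition (two ones followed by a zero), and forms each binary output bit as an OR over the detected levels. Gate polarities (NAND versus AND) and the exact tap positions are not modeled, and all function names are hypothetical.

```python
def majority_filter(thermo):
    """Replace each bit by the majority of itself and its two neighbours."""
    padded = [1] + list(thermo) + [0]          # assumed boundary values
    return [1 if padded[i] + padded[i + 1] + padded[i + 2] >= 2 else 0
            for i in range(len(thermo))]


def level_detect(thermo):
    """One-hot level detection with a three-input condition per position.

    Position i fires when bits i-1 and i are 1 and bit i+1 is 0.
    """
    padded = [1] + list(thermo) + [0]
    return [padded[i] & padded[i + 1] & (1 - padded[i + 2])
            for i in range(len(thermo))]


def or_encode(one_hot, bits=6):
    """Each output bit is the OR of all level lines whose code has that bit set."""
    code = 0
    for b in range(bits):
        bit = 0
        for i, hit in enumerate(one_hot):
            if hit and ((i + 1) >> b) & 1:
                bit = 1
        code |= bit << b
    return code


def decode(thermo):
    """Bubble correction followed by thermometer-to-binary conversion."""
    return or_encode(level_detect(majority_filter(thermo)))


if __name__ == "__main__":
    clean = [1] * 37 + [0] * 26       # 63 comparator outputs, 37 of them high
    bubbled = list(clean)
    bubbled[38] = 1                   # isolated 1 above the transition
    print(decode(clean), decode(bubbled))
```

In this model a single isolated bubble never produces a large code error, but it can shift the result by one code, which reflects the trade-off between small combinational logic and correction strength mentioned in the introduction.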



Fig. 4. Block diagram of digital section

A facility to store four blocks of thermometer code at the deserializer output was also added. For FFT computation, 128 consecutive 6-bit binary samples are also stored on chip. Reducing the FFT bin size requires more samples, and capturing more samples requires an off-chip data capture and storage facility. The output data rate must therefore be reduced so that a multi-channel logic analyzer can capture and store it. A 16:1 down-sampler was added to the data path to provide this feature. For an alias-free spectrum, this limits the input frequency to the Nyquist rate divided by sixteen.
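The two constraints in this paragraph reduce to simple arithmetic, sketched below. The sketch assumes that "Nyquist rate" here means half the sampling frequency, and the function names are hypothetical; the example frequencies are the ones used in the measurement section.

```python
def fft_bin_width_hz(fs_hz, n_samples):
    """Frequency resolution of an N-point FFT taken at sampling rate fs."""
    return fs_hz / n_samples


def alias_free_input_limit_hz(fs_hz, decimation):
    """Highest input frequency that does not alias after decimation.

    Assumes 'Nyquist rate' in the text means fs / 2, so the limit for the
    decimated stream is (fs / 2) / decimation.
    """
    return fs_hz / 2 / decimation


if __name__ == "__main__":
    # 128 on-chip samples at 1.28 GS/s
    print(fft_bin_width_hz(1.28e9, 128) / 1e6, "MHz per bin")
    # 16:1 decimated output at 1.536 GS/s
    print(alias_free_input_limit_hz(1.536e9, 16) / 1e6, "MHz input limit")
```

The coarse bin width of the 128-sample on-chip record is what motivates the decimated output path, at the price of a much narrower alias-free input band.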

#### III. CHIP IMPLEMENTATION

The ADC is implemented in a 0.13-µm SiGe BiCMOS process. However, only the thin-oxide FETs, MOS capacitors, and polysilicon resistors are used in this design, so it is compatible with a variety of CMOS foundry processes. The total chip area is 2.5 mm x 3 mm, limited by pads. The analog section is 800 µm x 350 µm and the digital core area including the test circuit is 730 µm x 710 µm. The analog section uses a 2.5 V power supply for the clock generator, T/H and preamps, and a 1.2 V supply for the comparators and clock drivers. The digital core uses 1.0 V and the IO buffers use a 2.5 V power supply.

#### IV. MEASUREMENTS

Initial screening tests were performed at wafer level using a Cascade probe station. After DC behavior was verified, chips were wire-bonded on a printed circuit board as shown in Fig. 5.

For DC analysis the differential data input is generated by two voltage sources, which are controlled by a PC through a GPIB interface. A frequency synthesizer generates a single-ended clock, which a phase splitter converts into a differential clock. The three-wire interface is controlled by the pattern generator/logic analyzer system to configure the ADC and read on-chip data. The decimated digital output is further divided down to 10 kHz by a Xilinx ML401 development board and fed into a National Digital Input Output (DIO) card. The DIO card is connected to the PC through a PCMCIA interface. LabVIEW software running on the PC steps each voltage source by 0.5 mV, thus generating a 1-mVpp differential step. It then waits one second for the source outputs to settle and reads the binary code at the A/D output through the PCMCIA interface. The PC thus stores a table of input differential voltage versus output 6-bit binary code over the entire input dynamic range for a particular sampling frequency. The measurement is repeated for several sampling frequencies and wafers.
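The sweep procedure can be summarized as a short control loop. The sketch below is not the LabVIEW program used for the measurements; it is a hardware-agnostic illustration in which the instrument access is passed in as callables. The names `set_sources`, `read_code`, and `settle` are hypothetical placeholders for the GPIB voltage-source control, the DIO card read, and the one-second wait, and it is assumed that the two sources are stepped in opposite directions.

```python
import time


def dc_sweep(set_sources, read_code, v_start_mv, v_stop_mv,
             settle=lambda: time.sleep(1.0), source_step_mv=0.5):
    """Step a differential DC input and record (input mV, output code) pairs.

    Each source moves by source_step_mv in opposite directions (assumed),
    so the differential input changes by twice that amount per step.
    """
    diff_step_mv = 2.0 * source_step_mv
    n_steps = int(round((v_stop_mv - v_start_mv) / diff_step_mv))
    table = []
    for i in range(n_steps + 1):
        diff_mv = v_start_mv + i * diff_step_mv
        set_sources(+diff_mv / 2.0, -diff_mv / 2.0)   # positive and negative inputs
        settle()                                      # let the sources settle
        table.append((diff_mv, read_code()))
    return table


if __name__ == "__main__":
    # Stand-in for the hardware: an ideal 6-bit quantizer with a 22 mV LSB.
    state = {"diff_mv": 0.0}

    def fake_set_sources(vp_mv, vn_mv):
        state["diff_mv"] = vp_mv - vn_mv

    def fake_read_code():
        return int(min(63, max(0, (state["diff_mv"] + 704.0) // 22.0)))

    table = dc_sweep(fake_set_sources, fake_read_code, -704.0, 703.0,
                     settle=lambda: None)
    print(len(table), table[0], table[-1])
```

With a one-second settling time per point, a full-range sweep at 1 mV resolution takes on the order of tens of minutes per sampling frequency, which is worth keeping in mind when planning repeated measurements.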



Fig. 5. ADC wire-bonded to PCB

Fig. 6 shows the differential and integral non-linearity (DNL/INL) plots for sampling frequencies ranging from 100 MHz to 1.6 GHz. Both INL and DNL remain below 0.5 LSB. INL was computed using the best-curve-fit method. Beyond 1.6 GHz, code jumps were observed, pushing DNL/INL beyond 0.5 LSB.
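One common way to extract DNL and INL from a table of input voltage versus output code is sketched below. It is an assumption that the reported plots were produced this way: the sketch locates each code transition as the lowest swept voltage at which the code is reached, fits a least-squares straight line through the transition levels, takes the fitted slope as the LSB, and reports INL as the deviation from that line. The function name and data layout are hypothetical.

```python
def dnl_inl_best_fit(voltages, codes, n_codes=64):
    """Return (dnl, inl) in LSB from a monotonic DC sweep.

    voltages, codes: parallel sequences from the sweep table.
    Raises ValueError if some code level is never reached in the sweep.
    """
    ks = list(range(1, n_codes))
    # transitions[i]: lowest swept voltage at which the output reaches code ks[i]
    transitions = [min(v for v, c in zip(voltages, codes) if c >= k) for k in ks]

    # Least-squares line through (code, transition voltage)
    n = len(ks)
    mean_k = sum(ks) / n
    mean_t = sum(transitions) / n
    slope = (sum((k - mean_k) * (t - mean_t) for k, t in zip(ks, transitions))
             / sum((k - mean_k) ** 2 for k in ks))
    intercept = mean_t - slope * mean_k

    lsb = slope  # best-fit LSB size, in the same units as voltages
    inl = [(t - (slope * k + intercept)) / lsb for k, t in zip(ks, transitions)]
    dnl = [(transitions[i + 1] - transitions[i]) / lsb - 1.0 for i in range(n - 1)]
    return dnl, inl


if __name__ == "__main__":
    # Ideal 6-bit converter swept in 1 mV steps with a 22 mV LSB
    sweep_mv = list(range(0, 1408))
    sweep_codes = [min(63, v // 22) for v in sweep_mv]
    dnl, inl = dnl_inl_best_fit(sweep_mv, sweep_codes)
    print(max(abs(x) for x in dnl), max(abs(x) for x in inl))
```

Note that the 1 mV step of the sweep bounds the resolution of each transition estimate to roughly 0.05 LSB for a 22 mV LSB.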

In dynamic analysis both clock and data are generated using frequency synthesizers. The synthesizers are reference-locked for coherent sampling and left independent for incoherent sampling. The output binary code and the on-chip stored data are read by the logic analyzer through the serial interface.



Fig. 6. DNL/INL plots for sampling freq. ranging from 100MHz (Black) to 1.6GHz (Cyan).

Fig. 7 shows the quantized sine wave obtained by plotting 128 on-chip samples for a sampling frequency of 1.28 GHz and an input frequency of 100 MHz. This reconfirms that DNL is below 0.5 LSB, as no code jumps are observed.

Fig. 8 shows a 128-point FFT analysis of data obtained by coherent and incoherent sampling. In both cases an input frequency of 100 MHz and a sampling frequency of 1.28 GHz are applied. The signal-to-noise-plus-distortion ratio (SNDR) is 30.6 dB for coherent sampling and 30.4 dB for incoherent sampling. This marginal difference in spectral response indicates that the clock and data sources have very low short-term drift. However, when decimated data is studied, allowing 180,000 samples to be taken, incoherent sampling results in a drop in SNDR to 17 dB. This indicates a low-frequency drift between the two sources. As this dispersion can mask the study of the non-linear behavior of the converter, the rest of the analysis was performed with the two sources reference-locked to each other.
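For reference, the way an SNDR figure is obtained from a coherently sampled record can be sketched as follows. This is a generic illustration, not the analysis script used for Fig. 8: it computes a direct DFT of the record, treats the bin containing the input tone as signal, and treats every other non-DC bin as noise plus distortion. The test data is an ideal 6-bit quantized sine, which is an assumption made purely for demonstration; with a 100 MHz input sampled at 1.28 GHz, a 128-sample record contains exactly 10 input cycles, so the tone falls in bin 10.

```python
import cmath
import math


def sndr_db(samples, signal_bin):
    """SNDR in dB from a coherently sampled record with an even sample count."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]          # remove DC
    signal = 0.0
    rest = 0.0
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centered))
        # The Nyquist bin has no mirror image, so it carries half weight
        power = abs(acc) ** 2 * (0.5 if k == n // 2 else 1.0)
        if k == signal_bin:
            signal += power
        else:
            rest += power
    return 10.0 * math.log10(signal / rest)


def ideal_adc_codes(n_samples, cycles, bits=6):
    """Full-scale sine quantized by an ideal converter (demonstration data)."""
    levels = 2 ** bits
    codes = []
    for i in range(n_samples):
        x = math.sin(2.0 * math.pi * cycles * i / n_samples)
        code = int((x + 1.0) / 2.0 * levels)
        codes.append(min(levels - 1, max(0, code)))
    return codes


if __name__ == "__main__":
    codes = ideal_adc_codes(128, 10)   # 100 MHz at 1.28 GS/s, 128 samples
    print(f"ideal 6-bit SNDR estimate: {sndr_db(codes, 10):.1f} dB")
```

When the sources are not locked, the tone no longer lands on a single bin and its energy leaks into neighbouring bins, so this single-bin definition of signal power underestimates SNDR; that is one reason to lock the sources or to apply a window.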

For a sampling frequency of 1.536 GHz and an input frequency of 12 MHz, an SNDR of 31.2 dB was obtained, implying an effective number of bits (ENOB) of 4.88. The ENOB gradually degrades to 3 bits as the input frequency is raised to 700 MHz. For sampling frequencies below 1.28 GHz and input frequencies below 400 MHz, the ENOB stays greater than 4.8 bits, with total harmonic distortion below 35 dB and spurious-free dynamic range above 36 dB.
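The conversion between SNDR and ENOB used here is presumably the standard relation for a full-scale sine input, ENOB = (SNDR - 1.76 dB) / 6.02 dB. A minimal sketch, with hypothetical function names:

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR (dB), full-scale sine assumed."""
    return (sndr_db - 1.76) / 6.02


def sndr_from_enob(enob_bits):
    """Inverse relation: SNDR (dB) corresponding to a given ENOB."""
    return 6.02 * enob_bits + 1.76


if __name__ == "__main__":
    for sndr in (31.2, 30.6, 30.4, 17.0):   # SNDR values quoted in the text
        print(f"SNDR {sndr:5.1f} dB -> ENOB {enob_from_sndr(sndr):.2f} bits")
    print(f"ENOB 4.8 bits -> SNDR {sndr_from_enob(4.8):.1f} dB")
```

With these constants, 31.2 dB corresponds to roughly 4.89 bits, which agrees with the quoted 4.88 to within the rounding of the SNDR value.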

Fig. 9 shows the variation of SNDR with input amplitude for an input frequency of 12 MHz and a sampling frequency of 1.536 GHz. At lower input voltages the fundamental component in the spectrum is comparable to the noise floor, resulting in poor SNDR. An input differential voltage greater than 700 mV saturates the dynamic range of the converter, causing the SNDR to fall. On-chip grounding problems cause the noise floor to increase at frequencies beyond 1.28 GHz. Slew problems in the clock drivers and nonlinearities in the track-and-hold cause further performance degradation.

ADC power consumption at 1.6GHz sampling frequency is 150mW. The digital core excluding the test circuitry consumes 25mW. The T/H and preamps consume 75mW. The clock generators and drivers consume 50mW. The power consumption can be further improved by designing custom latches for clock dividers and deserializers in the digital section.

#### V. CONCLUSION

Design and measurement challenges for low power CMOS ADCs have been presented. Detailed analysis shows INL/DNL less than 0.5LSB for static inputs, while dynamic analysis shows ENOB > 4.8 for a 1.28GHz sampling frequency and input frequencies less than 400MHz. This performance level will support a 60GHz WPAN running at > 500 Mb/s, and represents a step on the way toward an integrated millimeter-wave digital baseband chip.

#### ACKNOWLEDGMENT

The work was funded by DARPA under contract numbers N66001-05-C-8013 and N66001-02-C-8014. The authors thank Sergey Rylov for auto place & route support, Richard John for wirebonding, and Brian Gaucher, Sudhir Gowda and Mehmet Soyuer for management support.



Fig. 7. Measured binary output codes for Input frequency = 100MHz, Sampling frequency = 1.28GHz



Fig. 8. 128-point FFT analysis done on data obtained by coherent (--Blue) and incoherent (Red) sampling. Input Frequency = 100MHz, Sampling Frequency = 1.28GHz



Fig. 9. Variation of SNDR with input differential amplitude for Input frequency = 12MHz, Sampling Frequency = 1.536 GHz.

#### REFERENCES

- [1] B. Floyd et al., "A Silicon 60-GHz Receiver and Transmitter Chipset for Broadband Communications," *ISSCC Dig. Tech. Papers*, pp. 184-185, Feb. 2006.
- [2] T. Aytur et al., "A Fully Integrated UWB PHY in 0.13um CMOS," *ISSCC Dig. Tech. Papers*, pp. 124-125, Feb. 2006.
- [3] C. W. Mangelsdorf, "A 400-MHz Input Flash Converter with Error Correction," *IEEE J. Solid-State Circuits*, vol. 25, no. 1, pp. 184-191, Feb. 1990.