# UNIVERSITY OF CALIFORNIA, SAN DIEGO

# Phase Realignment and Phase Noise Suppression in PLLs and DLLs

A dissertation submitted in partial satisfaction of the requirements for the degree

Doctor of Philosophy

in Electrical and Computer Engineering (Electronic Circuits & Systems)

by

Sheng Ye

Committee in charge: Professor Ian Galton (Chair), Professor William Coles, Professor Lawrence Larson, Professor Lev Tsimring, Professor Daniel Wulbert

2003

UMI Number: 3091345




Copyright ©

Sheng Ye, 2003

All rights reserved

To my family...

# TABLE OF CONTENTS

| Chapter | Title |
|---------|-------|
|    | Signature Page |
|    | Dedication |
|    | Table of Contents |
|    | List of Figures |
|    | List of Tables |
|    | Acknowledgements |
|    | Vita |
|    | Publications |
|    | Abstract of the Dissertation |
| 1. | A Simple Model for the Design of Coupled Oscillator Clock Distribution Networks |
| 2. | A Multiple-Crystal Interface PLL with VCO Realignment to Reduce Phase Noise |
| 3. | Techniques for In-band Phase Noise Suppression in Re-circulating DLLs |

# LIST OF FIGURES

Chapter 1

| Figure | Caption |
|--------|---------|
| 1.1 | Clock distribution using coupled oscillator networks |
| 1.2 | Each oscillator in the coupled oscillator network is modeled as a one-port device |
| 1.3 | Block diagram of the proposed oscillator model |
| 1.4 | Schematics of the resistively coupled oscillator pair |
| 1.5 | Comparison between transistor level simulation and theoretical calculation of the coupled oscillator pair. (a) Waveform of one locked period. (b) Absolute error between simulation and calculation |
| 1.6 | (a) Piece-wise linear approximation of the waveform. (b) Piece-wise linear approximation of the PSF and ASF |
| 1.7 | (a) Locking transient of coupled oscillator pair using the simplified model. (b) The first 400 ps is expanded to illustrate the details of how to solve the simplified ODEs |
| 1.8 | Comparison between transistor level simulation and theoretical calculation using the simplified model of the coupled oscillator pair. (a) Waveform of one locked period. (b) Absolute error between simulation and calculation |

Chapter 2

| Figure | Caption |
|--------|---------|
| 2.1 | Proposed top-level system block diagram |
| 2.2 | (a) Phase noise is accumulated in typical ring oscillators. (b) Periodically realigning the oscillator to a "clean" edge suppresses the phase noise accumulation |
| 2.3 | Simulated initial phase difference versus shifted phase curve |
| 2.4 | Time domain waveform of the realignment: VCO phase shifts almost instantly and linearly to $\theta_e[n]$ |
| 2.5 | Extra phase shift due to phase realignment |
| 2.6 | (a) Commonly used phase noise model of a conventional integer-N PLL. (b) Modified version of the phase noise model describing the RPLL |
| 2.7 | Transfer functions of the PRPLL with different $\beta$ values. (a) Realignment increases phase margin in the loop transfer function. (b) Realignment extends the VCO noise stop band. (c) Realignment attenuates more input noise. (d) Realignment has less filtering of the reference noise |
| 2.8 | Effect of $\beta$ on the RPLL phase noise |
| 2.9 |  |
| 2.10 |  |
| 2.11 |  |

Chapter 3

| Figure | Caption |
|--------|---------|
| 3.1 | Frequency synthesizers with phase realignment capability: (a) VCDL based DLL. (b) Re-circulating DLL. (c) Phase realigned PLL |
| 3.2 | Linearized model for RPLL and DLL |
| 3.3 | Calculated noise transfer functions of the re-circulating DLL with a 1st-order loop filter using the linearized model. The bandwidth is controlled by the loop filter capacitor |
| 3.4 | Periodically switching a MOSFET between "on" and "off" resets the memory of the noise such that the long-term correlation in the noise is suppressed |
| 3.5 | Proposed re-circulating DLL block diagram. Phase realignment is implemented with two NAND gates |
| 3.6 | Simplified timing diagram of the phase realignment control in the prototype |
| 3.7 | (a) Conventional NAND gate. (b) Proposed NAND gate for better waveform matching |
| 3.8 | (a) Typical complementary bias generation. (b) A straightforward approach to implement switched biasing for $1/f$ noise reduction. (c) Illustration of the detailed operation of the bias: an equivalent parasitic resistor is introduced between the loop filter and the power supply |
| 3.9 | Proposed bias generation with improved $1/f$ noise reduction. (a) Simplified schematic of the bias. (b) Detailed illustration of the bias operation |
| 3.10 | Simulated node voltages of the NMOS in the bias circuitry when switched biasing is enabled |
| 3.11 | Pseudo-differential voltage controlled delay cell design: (a) Conventional latch using two back-to-back inverters. (b) Improved latch for better $1/f$ noise |
| 3.12 | Simulated waveforms in the delay cell: (a) Delay cell outputs. (b) Node voltages of $M_n$ when the switched biasing is enabled |

# LIST OF TABLES

| Table | Caption |
|-------|---------|
|     | Chapter 2 |
| 2.1 | Performance summary |
|     | Chapter 3 |
| 3.1 | Performance summary |

# ACKNOWLEDGEMENTS

First, I would like to thank my advisor, Ian Galton, for his friendship, guidance, support and encouragement during the course of my graduate study. His command of technical topics, engineering ingenuity, passion for new ideas and drive for perfection have always been inspiring. I have been truly fortunate to work with him. His guidance has helped me to grow as an engineer and has made my training at the ISPG group an invaluable experience.

I am grateful to Professor Daniel Wulbert of the Mathematics Department, Professor Lev Tsimring of the Institute for Nonlinear Science, and Professor Bill Coles and Professor Larry Larson of the Department of Electrical and Computer Engineering, UCSD, for serving on my doctoral committee. In particular, I would like to thank Professor Bill Coles, who was my advisor during my first year at UCSD, for his guidance and encouragement. I would also like to thank Karol Previte, Carolyn Kuttner, Jim Thomas and the staff of the Department of Electrical and Computer Engineering for their help during my time at UCSD.

This work has been supported in part by the Center for Wireless Communication of UCSD.

I would like to thank the members of the ISPG group, Eric Fogleman, Sudhakar Pamarti, Jared Welz, Eric Siragusa, Ashok Swaminathan, Bill Huff, Henrik Jensen, Safy Fishov and Alan Lewis. I have been very fortunate to work with such a talented research group. Their friendship, support and wisdom have made my time at UCSD an invaluable experience.

I wish to thank my colleagues at Silicon Wave, Fari Assaderaghi, Glenn Chang, Alberto Cicalini, Matt Deig, Jean-Sebastien Gagne, Jorge Grilo, Kimihiko Imura, Lars Jansson, Curtis Ling, David Lyon, Giancarlo Milanesi, Ray Montemayor, Eric Noguchi, Kishore Seendripu and Kevin Wang. Their encouragement and support have been truly valuable to me. I would also like to thank Silicon Wave for IC fabrication and testing support.

Finally, I would like to thank my family, my father Ye Zili, my mother Tian Baoning and my wife Zhang Mai for their love and support.

This dissertation includes three chapters, of which the last two are intended for separate publication. Chapter 2 appeared in the December 2002 issue of the *IEEE Journal of Solid State Circuits*. The research covered in Chapter 2 was presented at the *IEEE International Solid State Circuits Conference*, February 2002, San Francisco, California. Chapter 3 is in preparation for submission to the *IEEE Journal of Solid State Circuits*. The research covered in Chapter 3 was submitted to the *IEEE Custom Integrated Circuits Conference* and has been accepted for publication.

# VITA

| Year           | Degree or Position                                                                                                        |
|----------------|---------------------------------------------------------------------------------------------------------------------------|
| 1996           | Bachelor of Science, Beijing University, P. R. China                                                                      |
| 1996–1999      | Graduate Student Researcher, Department of Electrical and Computer Engineering, University of California, San Diego       |
| 1999           | Master of Science, University of California, San Diego                                                                    |
| 1999–2003      | Graduate Student Researcher, Department of Electrical and Computer Engineering, University of California, San Diego       |
| 2003           | Doctor of Philosophy, University of California, San Diego                                                                 |
| 2000–Present   | IC Designer, Silicon Wave Inc., San Diego                                                                                 |

# PUBLICATIONS

S. Ye, L. Jansson and I. Galton, "Techniques for In-band Phase Noise Suppression in Recirculating DLLs," *IEEE Journal of Solid State Circuits*, in preparation.

S. Ye, L. Jansson and I. Galton, "Techniques for In-band Phase Noise Suppression in Recirculating DLLs," *IEEE Custom Integrated Circuits Conference*, 2003, accepted for publication.

S. Ye, L. Jansson and I. Galton, "A Multiple-Crystal Interface PLL with VCO Realignment to Reduce Phase Noise", *IEEE Journal of Solid State Circuits*, Vol. 37, No. 12, pp. 1795-1803, Dec. 2002.

S. Ye, L. Jansson and I. Galton, "A Multiple-Crystal Interface PLL with VCO Realignment to Reduce Phase Noise", *IEEE International Solid State Circuits Conference, Digest of Technical Papers*, pp. 78-79, Feb. 2002.

# ABSTRACT OF THE DISSERTATION

# Phase Realignment and Phase Noise Suppression in PLLs and DLLs

by

Sheng Ye

Doctor of Philosophy in Electrical and Computer Engineering (Electronic Circuits & Systems)

University of California, San Diego, 2003

Professor Ian Galton, Chair

Ring oscillators are widely used in clock generation for digital systems and low-performance communication applications due to their simplicity, wide tuning range and ease of integration. However, the excessive noise in ring oscillators makes them less desirable for high-performance communication systems. This dissertation presents two applications where the advantages of ring oscillators are exploited while their disadvantages are addressed using novel system topologies. In Chapter 1, a simple, yet accurate model is presented to describe the injection-locking behavior in a novel clock distribution scheme using a network of strongly coupled oscillators. The model parameters are conceptually simple and can be easily obtained through transistor-level simulation. The proposed model is capable of accurately describing the injection-locking behavior among strongly coupled ring oscillators.

In conventional integer-N Phase-Locked Loops (PLLs), the attenuation of the Voltage Controlled Oscillator (VCO) phase noise is limited by the system stability requirement, which prevents the use of ring oscillator based VCOs due to their excessive close-in phase noise. To overcome this conventional barrier, a clean reference pulse can be injected periodically into the VCO so as to reset the phase error and thereby suppress the noise memory. This technique, referred to as "phase realignment", can result in significant attenuation of the in-band phase noise. Chapter 2 presents the prototype of such a scheme; when realignment is enabled, a peak spot phase noise reduction of 10 dB is observed compared with the conventional approach. In addition, a theoretical model is developed and used to improve the performance of the next-generation version of the prototype, as presented in Chapter 3. Specifically, a novel ring VCO topology is developed which is not only optimized for the best phase realignment, but also designed to attenuate the 1/f noise using the switched biasing technique. A peak spot phase noise reduction of 21.5 dB is observed when both noise attenuation schemes are enabled. Design guidelines for optimization of the loop parameters are derived from the theory and are closely supported by the measurement.

# A Simple Model for the Design of Coupled Oscillator Clock Distribution Networks

Sheng Ye, Ashok Swaminathan and Ian Galton

Abstract: This paper presents a simple, yet accurate, model to describe the injection-locking behavior among strongly coupled oscillators applied to clock distribution in digital circuits. The model parameters are conceptually simple and, for a given transistor-level oscillator circuit, can be quickly deduced from circuit-level, e.g., SPICE, simulations. Once the model parameters are obtained, the model can be used to accurately predict the injection-locking behavior of strongly coupled copies of the oscillator with component mismatches. In addition to deriving the general model, the paper derives a simplified version of the model applicable to ring oscillators. Good agreement with transistor-level simulation is observed.

## I. INTRODUCTION

In many synchronous digital circuits, copies of a periodic clock signal must be distributed to different physical locations with the same nominal phase. In practice, component mismatches in the clock distribution network cause these clock signals to exhibit relative *phase skew*, and, if sufficiently large, the phase skew can result in circuit failure. In general, as the clock rate of a digital circuit is increased, the amount of phase skew that can be tolerated without causing circuit failure decreases.

Conventional clock distribution networks distribute clock signals as propagating waves. Each network consists of buffers and interconnect lines that distribute copies of the signal generated by a single clock source such as a PLL or crystal oscillator. The clock distribution network is designed such that the propagation delays between the clock source and the various locations to which copies of the clock signals are delivered are nominally equal. However, variations among the fabricated interconnect lines and buffer components give rise to variations among these propagation delays and, hence, relative phase skew among the distributed clock signals. The magnitude of the phase skew tends to increase with clock frequency because the period of the clock signal decreases with clock frequency while the variance of the propagation delays introduced by the clock distribution network generally does not decrease with clock frequency. Unfortunately, the primary techniques with which to reduce the variance of the propagation delays are to use wider interconnect lines and larger buffers, both of which increase circuit area and power consumption.
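The scaling argument above can be made concrete with a few lines of arithmetic. The sketch below converts a fixed propagation-delay mismatch into phase skew at several clock frequencies; the 50 ps mismatch and the frequency list are hypothetical values chosen only for illustration, and the function name `phase_skew_deg` is not from the original text.

```python
def phase_skew_deg(delay_mismatch_s: float, clock_hz: float) -> float:
    """Phase skew, in degrees of one clock period, caused by a delay mismatch."""
    # One clock period corresponds to 360 degrees, so skew = 360 * f * delta_t.
    return 360.0 * clock_hz * delay_mismatch_s


if __name__ == "__main__":
    mismatch_s = 50e-12  # hypothetical delay mismatch, held constant
    for clock_hz in (100e6, 500e6, 1e9, 2e9):
        skew = phase_skew_deg(mismatch_s, clock_hz)
        print(f"{clock_hz / 1e6:7.0f} MHz: {skew:6.1f} degrees of skew")
```

Because the mismatch is held constant while the period shrinks, the skew expressed as a fraction of the period grows in direct proportion to the clock frequency.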

An alternative method to overcome this problem is to distribute clock signals in the form of standing waves instead of propagating waves. This can be done with a network of spatially distributed, strongly coupled oscillators. When the outputs of certain types of electronic oscillators with similar free-running frequencies and amplitudes are connected, they tend to lock to a common frequency through a phenomenon known as *injection locking*. Furthermore, if the oscillators are approximately matched in free-running frequency and amplitude, they tend to lock nearly in phase, thereby setting up a nearly standing wave along the interconnect lines that join the oscillators. The feasibility of this approach has been demonstrated for both discrete and integrated circuits, and it has been shown that the approach can achieve significant savings in power and area consumption relative to conventional clock distribution networks [1], [2].

#### Figure 1.1: Clock distribution using coupled oscillator networks.
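As a rough illustration of frequency locking, the sketch below integrates two sinusoidally coupled phase oscillators. This is a generic textbook phase model (Kuramoto type), swapped in for illustration only; it is not the oscillator model developed in this chapter, and the frequencies, coupling strength, and function name `simulate_pair` are all hypothetical. In this toy model the pair locks when the free-running frequency difference is smaller than twice the coupling strength.

```python
import math


def simulate_pair(f1_hz, f2_hz, coupling_rad_s, t_end_s, dt_s):
    """Forward-Euler integration of two mutually coupled phase oscillators.

    Returns (phase difference in rad, frequency of osc. 1 in Hz,
    frequency of osc. 2 in Hz), all evaluated at the end of the run.
    """
    w1 = 2.0 * math.pi * f1_hz
    w2 = 2.0 * math.pi * f2_hz
    th1 = 0.0
    th2 = 0.0
    for _ in range(int(round(t_end_s / dt_s))):
        # Each oscillator is pulled toward the phase of the other one.
        d1 = w1 + coupling_rad_s * math.sin(th2 - th1)
        d2 = w2 + coupling_rad_s * math.sin(th1 - th2)
        th1 += d1 * dt_s
        th2 += d2 * dt_s
    # Instantaneous frequencies at the final state.
    d1 = w1 + coupling_rad_s * math.sin(th2 - th1)
    d2 = w2 + coupling_rad_s * math.sin(th1 - th2)
    return th1 - th2, d1 / (2.0 * math.pi), d2 / (2.0 * math.pi)


if __name__ == "__main__":
    # Hypothetical pair: 1.00 GHz and 1.02 GHz free-running frequencies.
    strong = 2.0 * math.pi * 50e6  # coupling large enough to lock
    weak = 2.0 * math.pi * 5e6     # coupling too small to lock
    for name, k in (("strong", strong), ("weak", weak)):
        psi, fa, fb = simulate_pair(1.00e9, 1.02e9, k, 50e-9, 1e-12)
        print(f"{name} coupling: f1={fa:.4e} Hz, f2={fb:.4e} Hz, "
              f"phase difference={psi:.4f} rad")
```

With the strong coupling the two oscillators settle to a common frequency with a small constant phase offset; with the weak coupling they keep drifting relative to each other.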

Each oscillator can be implemented as a voltage controlled oscillator (VCO). When their inputs and outputs, respectively, are connected via metal interconnect lines, multiple spatially distributed, nominally matched VCOs tend to behave as a single VCO; the outputs of the individual VCOs achieve the same frequency and nearly the same phase, and the frequency can be locked to a reference frequency using a conventional PLL as shown in Figure 1.1. An added benefit of this approach is that it naturally aligns the distributed clock signals not only with each other, but also with the reference frequency signal, which makes it possible for different integrated circuits to generate internal high-speed clocks that are synchronized with each other through an externally supplied reference frequency signal. This is difficult to accomplish with conventional clock distribution networks because the average propagation delay through a given clock distribution network generally depends upon the details of the integrated circuit in which it resides [3].

The injection-locking behavior among coupled oscillators is widely observed and, for electrical oscillators in particular, is easily reproduced both in physical measurement and in transistor-level SPICE simulation. Research that exploits this phenomenon remains active, e.g., [1], [2] and [4]. In order to precisely characterize the injection-locking behavior, it is critical to have an accurate model that is capable of predicting both the phase and amplitude response of the oscillator when it is coupled to other oscillators. The linear time-varying (LTV) oscillator model presented by Hajimiri and Lee [5], [6] provides a very good framework for the construction of such a model. However, although the Hajimiri-Lee model is sufficiently accurate in phase noise prediction, it needs to be extended to describe the injection-locking behavior, which is inherently nonlinear. In recent work by Tanaka et al. [7], the Hajimiri-Lee model is extended to describe the excess phase response of weakly coupled oscillators, with the amplitude response of the oscillator ignored. The averaging method is used in the analysis of the weak coupling, and design guidelines are derived. However, when the coupling strength increases, as in the case of a practical IC, the averaging method no longer applies. In addition, as demonstrated below, both the phase and amplitude response of the oscillator affect its injection-locking behavior in strongly coupled oscillators.

This paper presents a simple, yet accurate, model that is capable of analyzing strongly coupled oscillators. The proposed model is nonlinear and is an extension to the Hajimiri-Lee model with characterization of both the phase and amplitude response of the oscillator. In particular, the phase response model is equivalent to the model presented in [7], although it was developed by the authors independently. On the other hand, the amplitude response model is an extension to that presented in [6]. Specifically, the underlying relationship between the phase and amplitude response is analyzed quantitatively in the special case of ring oscillators. As presented in Section II, the model parameters are conceptually simple and, for a given transistor-level oscillator circuit, can be quickly deduced from circuit simulations. In Section III, a simplified version of the model applicable to ring oscillators, which are widely used to generate clock signals for digital circuits, is presented. Analysis of the coupled ring oscillators using the proposed model is presented in Section IV, where the model accurately predicts the injection-locking behavior of strongly coupled copies of the oscillator with component mismatches. In Section V, the oscillator model is simplified even further. Despite its simplicity, the salient property of the coupled oscillator is still retained.

## II. THE GENERAL OSCILLATOR MODEL

This section presents the most general form of the oscillator model. The model is an extension of that originally proposed by Hajimiri and Lee as well as the model presented in [7]. In particular, although the phase response model is equivalent to the model presented in [7], it is rigorously derived here to be consistent with the amplitude response model presented later.

### A. Description of the General Oscillator Model

The $n^{\text{th}}$ oscillator in a network of coupled oscillators can be viewed as a nonlinear one-port device with a voltage, $v_n(t)$, and a current, $i_n(t)$, as shown in Figure 1.2. If the oscillator is removed from the network, it is said to be *free-running*. In this case $i_n(t)$ becomes zero, and $v_n(t)$ settles into a *free-running oscillation*. In practice, even nominally identical oscillators have significantly different free-running oscillations because of component variations introduced during fabrication. However, under a surprisingly wide range of conditions, coupled oscillators exchange current in such a fashion that they *lock in frequency*. That is, they settle to a state wherein the oscillator voltages, $v_n(t)$, for all $n$ become periodic with a common minimum period.

Figure 1.2: Each oscillator in the coupled oscillator network is modeled as a one-port device.

A general representation of the voltage waveform of the  $n^{\text{th}}$  oscillator is

$$v_n(t) = f_n(\omega_n t + \phi_n(t)) + \Delta A_n(t), \tag{1}$$

where $f_n(\cdot)$ is a $2\pi$-periodic function that represents the free-running oscillation amplitude, $\omega_n$ is the free-running oscillation frequency, and $\phi_n(t)$ and $\Delta A_n(t)$ are time-varying functions referred to as the *excess phase* and *excess amplitude*, respectively.<sup>†</sup>

<sup>†</sup> Note that (1) is equivalent to the commonly used expression $v_n(t) = A_n(t) f_n(\omega_n t + \phi_n(t))$ where $A_n(t)$ is a time-varying amplitude function, but the form of (1) is more convenient for the purposes of this paper.



#### Figure 1.3: Block diagram of the proposed oscillator model.

Coupled oscillators lock in frequency by exchanging current. For each oscillator it follows that the oscillation waveform must depend upon the current sourced or sunk by the oscillator. In (1) this occurs most generally when the excess phase,  $\phi_n(t)$ , and the excess amplitude,  $\Delta A_n(t)$ , depend upon  $i_n(t)$ . Therefore, in order to characterize the frequency locking behavior of an oscillator, it is sufficient to quantify how the excess phase and excess amplitude depend upon the current. In principle, this could be done by analyzing each oscillator at the circuit level. However, the analysis would be prohibitively complicated to perform by hand, and, therefore, would most likely involve extensive circuit simulation.

Despite the accuracy of circuit simulations, it is still desirable to have a simple signal processing level model that describes the salient properties of both the excess phase and excess amplitude in terms of the oscillator current for a wide range of oscillator configurations. As shown below, the oscillator model proposed herein is such a model. Figure 1.3 shows the signal processing equivalent block diagram of the proposed single oscillator model. The model contains two sub-models: the *excess phase model* and the *excess amplitude model*. Both contain a variable gain element followed by a linear time-invariant (LTI) signal processing block. In the excess phase model, the input current is multiplied by a function, $\Gamma_n(\cdot)$, called the *phase sensitivity function* (PSF), and the result is integrated to produce the excess phase. The PSF is evaluated at the *absolute oscillator phase*: $\Phi_n(t) = \omega_n t + \phi_n(t)$. In the excess amplitude model, the input current is multiplied by a function, $\Omega_n(\cdot)$, called the *amplitude sensitivity function* (ASF), and the result is passed through an LTI filter with impulse response $h_n(t)$, i.e., transfer function $H_n(s)$, to produce the excess amplitude. As in the case of the PSF, the ASF is evaluated at the absolute oscillator phase. Both the PSF and the ASF are $2\pi$-periodic.
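Read literally, the block diagram description above corresponds to the pair of relations below. This is a sketch inferred from the prose rather than a quotation of the model's formal statement: the initial time $t_0$, the initial excess phase $\phi_n(t_0)$, and the assumption that $h_n(t)$ is causal are introduced here for illustration, and any normalization of the PSF is taken to be absorbed into $\Gamma_n(\cdot)$.

$$\begin{aligned}
\phi_n(t) &= \phi_n(t_0) + \int_{t_0}^{t} \Gamma_n\big(\Phi_n(\tau)\big)\, i_n(\tau)\, d\tau, \\
\Delta A_n(t) &= \int_{-\infty}^{t} h_n(t-\tau)\, \Omega_n\big(\Phi_n(\tau)\big)\, i_n(\tau)\, d\tau, \\
\Phi_n(\tau) &= \omega_n \tau + \phi_n(\tau).
\end{aligned}$$

Because $\Phi_n$ itself contains $\phi_n$, the first relation is an implicit (nonlinear) equation for the excess phase rather than a plain integral of the input current.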

For a given oscillator circuit, the oscillator model completely specifies the oscillator voltage as a function of the oscillator current once the *model parameters*,  $f_n(\cdot)$ ,  $\omega_n$ ,  $\Gamma_n(\cdot)$ ,  $\Omega_n(\cdot)$ , and  $H_n(s)$  are known. As demonstrated in Section III for the case of ring oscillators, the model parameters can be extracted from relatively simple transistor-level SPICE simulations on individual oscillators.

### B. Motivation Behind the General Oscillator Model

As mentioned above, the model presented in this paper is an extension of the Hajimiri-Lee model presented in [5]. The Hajimiri-Lee model is based on the observation that for a large variety of oscillators the excess phase and excess amplitude depend not only on the oscillator's input current, $i_n(t)$, but also on the absolute oscillator phase. Therefore, the model allows for these dependencies to be time varying. However, it makes the approximation that the excess phase depends linearly on the oscillator's input current. In [5] the model is used to predict oscillator phase noise by representing all the circuit noise sources in a given oscillator circuit as an equivalent input current source driving a noiseless version of the oscillator. In such cases, the input current is usually modeled as a small-amplitude, zero-mean, random process for which the linearity approximation turns out to be reasonable. However, in networks of coupled oscillators the oscillator input currents tend to be deterministic periodic functions with frequencies that differ from the free-running frequencies of the individual oscillators, and in such cases it turns out that the linearity approximation is not valid. The extension of the Hajimiri-Lee model presented herein overcomes this problem. In the following, the assumptions that underlie the new model and the relationship between the new model and the Hajimiri-Lee model are explained in detail.

If a narrow pulse of current is applied to the terminals of an otherwise free-running oscillator, an excess phase change can be observed several cycles later when the oscillator settles back to a free-running oscillation. Provided the current pulse has a sufficiently small magnitude and duration, the excess phase change tends to be proportional to the delivered charge, where the proportionality factor is a periodic function of the absolute oscillator phase. This behavior is represented in both the Hajimiri-Lee model and the new model by assuming the excess phase abruptly steps to and remains at the value determined by the proportionality factor corresponding to the absolute oscillator phase at the time the current pulse is applied. Therefore, if a rectangular current pulse of magnitude $I_p$ with a duration of $\Delta t$ occurs at time t, in the limit as $\Delta t \rightarrow 0$ the excess phase change can be written as

$$\phi_n(t+\Delta t) - \phi_n(t) = \Gamma_n(\omega_n t + \phi_n(t)) I_p \Delta t, \qquad (2)$$

where  $\Gamma_n(\cdot)$ , denoted as the PSF in the model, is the proportionality factor.

Given that any physically realizable input current waveform can be approximated with arbitrary precision as a sum of weighted and time-shifted sufficiently narrow rectangular current pulses, the behavior described by (2) can be generalized to arbitrary physically realizable input current waveforms as

$$\phi_n(t) - \phi_n(t_0) = \int_{t_0}^t \Gamma_n\left(\omega_n \tau + \phi_n(\tau)\right) i_n(\tau) d\tau, \qquad (3)$$

where  $\phi_n(t_0)$  is the initial excess phase at time  $t_0$ . The excess phase model shown in Figure 1.3 is a block diagram representation of (3). Throughout the remainder of this paper, it is assumed that  $i_n(t)$  and  $\Gamma_n(t)$  are both piecewise continuous with at most a finite number of discontinuities, so it follows from (3) that  $\phi_n(t)$  is continuous. The presence of  $\phi_n(\tau)$  in the integrand of (3) gives rise to the nonlinear relationship between the excess phase and the input current mentioned above. This term is approximated as zero in the Hajimiri-Lee model so as to eliminate the nonlinearity. It should be noted that (3) is equivalent to the excess phase model as reported in [7].
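As an illustration of the nonlinearity introduced by $\phi_n(\tau)$ in the integrand of (3), the following minimal sketch integrates the differential form of (3) with a forward-Euler step, once with the excess phase fed back into the PSF argument and once with that term set to zero. The sinusoidal PSF, the injected current, and all numerical values are hypothetical and chosen only for illustration; they are not taken from the oscillators analyzed in this paper.

```python
import math


def integrate_excess_phase(psf, omega, current, t_end, dt, phi0=0.0, feedback=True):
    """Forward-Euler integration of phi'(t) = psf(omega*t + phi(t)) * i(t).

    With feedback=False the excess phase is dropped from the PSF argument,
    i.e. the phi term in the integrand of (3) is approximated as zero.
    """
    phi = phi0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        # Absolute phase at which the PSF is evaluated.
        theta = omega * t + (phi if feedback else 0.0)
        phi += psf(theta) * current(t) * dt
    return phi


# Hypothetical values, chosen only for illustration.
OMEGA = 2.0 * math.pi * 2.0e9  # free-running frequency in rad/s


def toy_psf(theta):
    """Assumed PSF shape, in radians per coulomb."""
    return 1.0e12 * math.sin(theta)


def injected_current(t):
    """Assumed periodic input current: a 0.1 mA tone at 2.02 GHz."""
    return 1.0e-4 * math.cos(2.0 * math.pi * 2.02e9 * t)


phi_full = integrate_excess_phase(toy_psf, OMEGA, injected_current, 20e-9, 1e-12)
phi_linear = integrate_excess_phase(toy_psf, OMEGA, injected_current, 20e-9, 1e-12,
                                    feedback=False)
print(f"excess phase with phi in the PSF argument:    {phi_full:.4f} rad")
print(f"excess phase with phi approximated as zero:   {phi_linear:.4f} rad")
```

Comparing the two printed values for a periodic input whose frequency differs from the free-running frequency gives a quick indication of how much the linearized form departs from (3) for a given set of parameters.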

In general, the excess amplitude of the oscillator also responds to an applied narrow current pulse in a fashion that depends upon the absolute phase of the oscillator. However, unlike the excess phase response where the phase change persists indefinitely, the excess amplitude tends to exhibit a transient response that settles back to zero in time due to the amplitude control mechanisms present in practical oscillators. This behavior is idealized as follows: the input current pulse is scaled by the ASF evaluated at the oscillator phase, i.e., $\Omega_n(\Phi_n(t))$, and the result is passed through an LTI filter. Generalizing this to arbitrary input current waveforms gives

$$\Delta A_n(t) = \int_{-\infty}^{\infty} \Omega_n(\omega_n \tau + \phi_n(\tau)) i_n(\tau) h_n(t-\tau) d\tau$$
(4)

where $h_n(t)$ is the impulse response as shown in Figure 1.3. The excess amplitude model shown in Figure 1.3 is a block diagram representation of (4). As in the case of the PSF, the ASF is assumed to be piecewise continuous with at most a finite number of discontinuities. In the Hajimiri-Lee model in [6], the amplitude responses of both high-Q oscillators, such as an LC oscillator, and low-Q oscillators, such as a ring oscillator, are presented, and an *amplitude impulse sensitivity function*, which is similar to the *phase impulse sensitivity function*, is introduced. In contrast to the extensive discussion of how to obtain the *phase impulse sensitivity function*, the procedure to obtain the *amplitude impulse sensitivity function* is not discussed because AM noise is not the dominant source of phase noise in most cases. More importantly, the amplitude response model in [6] is still LTV, making it unsuitable for injection-locking analysis.

In weakly coupled oscillators, ignoring the amplitude response is a reasonable way to simplify the calculation, as demonstrated in [7]. However, when the oscillators are strongly coupled, an accurate amplitude response model is critical in describing the oscillators' behavior, as illustrated by the following example. Suppose the outputs of two oscillators with different voltage swings are coupled together by a zero-Ohm resistor and the oscillators lock in frequency. The output voltages of the two oscillators are then identical because they are shorted. However, in the absence of an amplitude response model, the calculated waveforms can only be warped in time by the phase response model and cannot change in amplitude, making it impossible to match the physical result.

## III. RING OSCILLATOR MODEL

In this section, two additional assumptions are made regarding the behavior of oscillators so as to simplify the problems of extracting the model parameters from circuit simulations and applying the oscillator model to predict the behavior of networks of coupled oscillators. Circuit simulations indicate that the assumptions are valid at least for the case of ring oscillators, which are widely used to generate clock signals in digital circuits. Although the assumptions restrict the oscillator model somewhat, the resulting simplifications greatly ease the application of the model to the analysis of coupled oscillator circuits. Furthermore, as demonstrated in Section IV the model is remarkably accurate in predicting the behavior of coupled ring oscillators.

### A. Ring Oscillator Assumptions

The first assumption is that if a narrow pulse of current is applied to the terminals of an otherwise free-running oscillator, then the oscillator voltage,  $v_n(t)$ , abruptly changes from its free-running value by  $\Delta Q/C_n$  where  $\Delta Q$  is the total charge in the current pulse, and  $C_n$  is a constant capacitance value. The only significant restriction imposed by this assumption is that  $C_n$  is assumed to be constant regardless of the absolute phase of the oscillator. In practice, the capacitance will have some time-varying components arising from the non-linear capacitances presented by the transistors within the oscillator circuit, but as demonstrated in Section IV the model produces extremely accurate results despite the approximation that the capacitance is constant. The second assumption is that the LTI system in the excess amplitude model is well approximated by a single-pole system. Thus, its impulse response is

$$h_n(t) = u(t) \frac{1}{C_n} e^{-t/\tau_n},$$
 (5)

where u(t) is the unit step function, and $\tau_n$ is a constant referred to as the *excess amplitude time-constant*. The assumption is analogous to the dominant pole assumption often made in amplifier circuit analysis. In general, the assumption is valid to the extent that one pole of $H_n(s)$, i.e., the pole at $s = -1/\tau_n$, has a magnitude much smaller than that of every zero and every other pole of $H_n(s)$. This assumption is consistent with the assumption in the Hajimiri-Lee model [6]. However, the dependence on the absolute phase allows the proposed model to characterize the injection-locking behavior.

There are two significant benefits that arise from making these assumptions in terms of simplifying the extraction of the model parameters from circuit simulations. One is that the problem of determining a function, namely  $H_n(s)$  in the excess amplitude model, reduces to that of determining a single number, namely  $\tau_n$ . The other is that the assumptions cause the ASF to be a function of the PSF, the free-running oscillation amplitude, and  $C_n$ , so once these model parameters are known there is no need to separately extract the ASF.

### B. Derivation of the ASF Expression

As presented above, the oscillator model describes the relationship between the oscillator voltage and current. At any given time, this relationship depends upon the absolute oscillator phase, the excess phase, and the excess amplitude. The ring oscillator assumptions made above provide further relationships between the oscillator voltage and current that are used below in conjunction with the general model structure to derive the ASF as a function of the PSF, the free-running oscillation amplitude, and $C_n$.

Suppose a rectangular current pulse of magnitude  $I_p$  with a duration of  $\Delta t$  is applied to an otherwise free-running oscillator at time  $t = t_1$ . Prior to the time of the current pulse,  $\phi_n(t)$  and  $\Delta A_n(t)$  are both zero, so (1) reduces to  $v_n(t) = f_n(\omega_n t)$ . Therefore, the first ring oscillator assumption implies that

$$v_n(t_1 + \Delta t) - v_n(t_1) \approx f_n\left(\omega_n(t_1 + \Delta t)\right) - f_n(\omega_n t_1) + \frac{I_p \Delta t}{C_n},\tag{6}$$

where the approximation becomes an equality as  $\Delta t \rightarrow 0$ . Dividing (6) by  $\Delta t$  and taking the limit as  $\Delta t \rightarrow 0$  results in

$$v'_{n}(t_{1}^{+}) = \omega_{n} f'_{n}(\omega_{n} t_{1}^{+}) + \frac{I}{C_{n}},$$
(7)

where $t_1^+$ is the instant immediately following the current pulse and $I$ denotes the pulse magnitude $I_p$.

This result is a direct consequence of the first ring oscillator assumption, but is not a consequence of the general oscillator model presented in Section II. However, the general oscillator model can also be used to obtain an expression for the derivative of the oscillator voltage at the instant immediately following the current pulse, and this expression must equate to (7).

Differentiating (3) with respect to time gives

$$\phi'_n(t) = \Gamma_n \left( \omega_n t + \phi_n(t) \right) i_n(t).$$
(8)

Substituting (5) into the derivative of (4) gives

$$\Delta A'_n(t) = \Omega_n \left( \omega_n t + \phi_n(t) \right) \frac{i_n(t)}{C_n} - \frac{\Delta A_n(t)}{\tau_n}.$$
(9)


Differentiating (1), substituting (8) and (9) into the result, and evaluating at time  $t = t_1^+$  yields

$$v'_{n}(t_{1}^{+}) = \left(\omega_{n} + \Gamma_{n}\left(\omega_{n}t_{1}^{+}\right)I\right)f'_{n}(\omega_{n}t_{1}^{+}) + \Omega_{n}\left(\omega_{n}t_{1}^{+}\right)\frac{I}{C_{n}}.$$
(10)

Equating (7) and (10), and solving for  $\Omega_n(\omega_n t_1^+)$  gives

$$\Omega_n(\omega_n t_1^+) = 1 - C_n \Gamma_n(\omega_n t_1^+) f_n'(\omega_n t_1^+).$$
(11)

The choice of $t_1$, the time of the current pulse in the derivation leading to (11), was arbitrary, so (11) must be valid for any value of $\theta = \omega_n t_1$. This implies the following general expression for the ASF

$$\Omega_n(\theta) = 1 - C_n \Gamma_n(\theta) f'_n(\theta).$$
(12)

It is interesting to compare $\Gamma_n(\cdot)$ and $\Omega_n(\cdot)$ qualitatively. When the magnitude of $\Gamma_n(\cdot)$ peaks, usually the magnitude of $f'_n(\cdot)$ peaks as well, so the magnitude of $\Omega_n(\cdot)$ is minimized. When the magnitude of $\Gamma_n(\cdot)$ approaches zero, $\Omega_n(\cdot)$ approaches one. This result is consistent with the observations in [5]: a current impulse applied to an oscillator node when the node voltage is at its peak has the most impact on the amplitude, whereas an impulse applied at the zero crossing of the node voltage has the most impact on the phase.
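Because the PSF and the waveform derivative are available only as sampled vectors, (12) can be applied sample by sample. The sketch below is one minimal way to do so; the function name and the numerical values are hypothetical, and $f'_n$ is taken as the derivative with respect to phase (volts per radian) so that the product $C_n \Gamma_n f'_n$ is dimensionless.

```python
def asf_from_psf(psf_samples, fprime_samples, c_n):
    """Apply (12) sample by sample: Omega(theta) = 1 - C_n * Gamma(theta) * f'(theta).

    psf_samples    : PSF samples in rad/C over one period
    fprime_samples : samples of df/dtheta in V/rad at the same phases
    c_n            : constant capacitance C_n in farads
    """
    if len(psf_samples) != len(fprime_samples):
        raise ValueError("PSF and f' must be sampled at the same phases")
    return [1.0 - c_n * g * fp for g, fp in zip(psf_samples, fprime_samples)]


# Hypothetical three-point example: two edge samples and one flat-top sample.
C_N = 20e-15                      # assumed capacitance, farads
fprime = [0.5, 0.0, -0.5]         # assumed df/dtheta, V/rad
psf = [1.0 / (C_N * 0.5), 0.0, 1.0 / (C_N * -0.5)]   # chosen so that Omega = 0 on the edges
print(asf_from_psf(psf, fprime, C_N))
```

In this constructed example the ASF vanishes where the PSF magnitude is largest and equals one where the PSF is zero, which is the qualitative behavior described in the text.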

### C. Model Parameter Extraction from SPICE Simulations

As described in Section II, the model parameters $f_n(\cdot)$, $\omega_n$, $\Gamma_n(\cdot)$, $\Omega_n(\cdot)$, and $H_n(s)$ must be determined in order to apply the model to a specific oscillator circuit. Although all these parameters can be extracted directly from transistor-level simulations of the oscillator circuit in question, the two ring oscillator assumptions made above make it necessary only to extract $f_n(\cdot)$, $\omega_n$, $\Gamma_n(\cdot)$, $C_n$, and $\tau_n$ from transistor-level simulations. Once these parameters are known, the transfer function, $H_n(s)$, can be calculated using the Laplace Transform of (5), and $\Omega_n(\cdot)$ can be calculated using (12).
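As a worked step, and assuming only the single-pole impulse response in (5), the Laplace transform evaluates to

$$H_n(s) = \int_0^{\infty} \frac{1}{C_n} e^{-t/\tau_n} e^{-st}\, dt = \frac{1}{C_n} \cdot \frac{1}{s + 1/\tau_n} = \frac{\tau_n}{C_n\left(1 + s\tau_n\right)}, \qquad \operatorname{Re}(s) > -\frac{1}{\tau_n}.$$

Under this assumption $H_n(s)$ has a single pole at $s = -1/\tau_n$, no finite zeros, and a low-frequency gain of $\tau_n / C_n$, so the two numbers $C_n$ and $\tau_n$ determine it completely.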

Closed form expressions for $f_n(\cdot)$ and $\Gamma_n(\cdot)$ cannot be easily determined from circuit simulations. Fortunately, for the purposes of this paper it is sufficient to represent them each by a vector of their samples taken at uniformly spaced sampling instants over one oscillation period, and such vectors are relatively easy to obtain via circuit simulations. Issues associated with the choice of the sampling interval are presented in the next section.

The model parameters $f_n(\cdot)$ and $\omega_n$ can be determined by direct simulation of a single free-running instance of the oscillator circuit in question. Once the initial simulation startup transients have settled out, the free-running oscillation period can be observed from the oscillator voltage waveform, $\omega_n$ follows as $2\pi$ divided by that period, and $f_n(\cdot)$ is given by the oscillator waveform over one period. Similarly, the derivative of $f_n(\cdot)$, which is subsequently used to calculate $\Omega_n(\cdot)$, can be determined by direct simulation of a single free-running instance of the oscillator circuit.
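One possible way to read $\omega_n$ off a simulated waveform is to average the spacing of its rising threshold crossings. The following sketch assumes the waveform has been exported as two equal-length lists of time and voltage samples and that startup transients have already been discarded; the function name, the threshold argument, and the synthetic test waveform are hypothetical.

```python
import math


def estimate_omega(times, volts, level=0.0):
    """Estimate the angular frequency from rising crossings of `level`.

    Each crossing time is refined by linear interpolation between the two
    samples that straddle the threshold, and the period is the average
    spacing between the first and last crossing.
    """
    crossings = []
    for k in range(1, len(times)):
        v0 = volts[k - 1] - level
        v1 = volts[k] - level
        if v0 < 0.0 <= v1:
            frac = -v0 / (v1 - v0)
            crossings.append(times[k - 1] + frac * (times[k] - times[k - 1]))
    if len(crossings) < 2:
        raise ValueError("need at least two rising crossings")
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return 2.0 * math.pi / period


# Synthetic stand-in for a simulated waveform: a 2 GHz sine sampled every 1 ps.
F_TRUE = 2.0e9
times = [k * 1e-12 for k in range(5000)]
volts = [math.sin(2.0 * math.pi * F_TRUE * t + 0.3) for t in times]
omega_est = estimate_omega(times, volts)
print(f"estimated frequency: {omega_est / (2.0 * math.pi) / 1e9:.4f} GHz")
```

For a waveform that is not centered on zero, the threshold would be set near mid-swing so that each period contributes exactly one rising crossing.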

A simple method with which to obtain $\Gamma_n(\cdot)$ is to simulate two identical copies of the oscillator with the same initial conditions as follows. One of the oscillators is left free-running, and a narrow rectangular current pulse is applied to the other oscillator, so a phase difference, $\Delta \phi$, between the two oscillators is induced. After several oscillator periods following the applied pulse, the excess amplitude of the pulsed oscillator settles back to zero, at which time the phase difference can be observed from the simulated oscillator output voltages. Just prior to the current pulse, the excess phase, $\phi_n(t)$, of the about-to-be-pulsed oscillator is zero, because at that point the oscillator is still free running, so it follows from (2) that

$$\Gamma_n(\omega_n t_1) \cong \frac{\Delta \phi}{I \Delta t},\tag{13}$$

where $t_1$ is the time at which the current pulse is applied, *I* is the amplitude of the current pulse, and $\Delta t$ is the duration of the current pulse. The accuracy of (13) increases as the pulse width, $\Delta t$, decreases (provided the pulse width is not so small compared to the step size of the circuit simulator that significant simulation errors occur). Thus, a sample of the PSF at any point $\theta = \omega_n t_1$ can be accurately estimated from a simulation of two identical copies of the oscillator in question by applying the pulse at time $t_1$, measuring the resulting $\Delta \phi$, and applying (13).
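The pulse-based estimate in (13) can be exercised on a toy example before it is applied to circuit simulations. In the sketch below a phase-only model with a known, assumed PSF stands in for the pulsed SPICE simulation, and the free-running reference is implicit because its excess phase is zero. All names and numerical values are hypothetical.

```python
import math


def pulsed_excess_phase(psf, omega, t1, width, amp, t_end, dt):
    """Excess phase at t_end after a rectangular current pulse starting at t1.

    Stand-in for the pulsed oscillator: integrates phi' = psf(omega*t + phi) * i(t)
    with forward Euler, where i(t) equals amp during the pulse and zero otherwise.
    """
    k_on = int(round(t1 / dt))
    k_off = k_on + int(round(width / dt))
    phi = 0.0
    for k in range(int(round(t_end / dt))):
        if k_on <= k < k_off:
            phi += psf(omega * k * dt + phi) * amp * dt
    return phi


def estimate_psf_sample(delta_phi, amp, width):
    """Apply (13): Gamma(omega * t1) is approximately delta_phi / (I * delta_t)."""
    return delta_phi / (amp * width)


# Hypothetical oscillator and pulse parameters.
OMEGA = 2.0 * math.pi * 2.0e9


def true_psf(theta):
    """Assumed PSF, in rad/C, that the procedure should recover."""
    return 1.0e12 * math.sin(theta)


T1, WIDTH, AMP = 100e-12, 2e-12, 1e-5
delta_phi = pulsed_excess_phase(true_psf, OMEGA, T1, WIDTH, AMP, 200e-12, 0.02e-12)
gamma_est = estimate_psf_sample(delta_phi, AMP, WIDTH)
print(f"estimated PSF sample: {gamma_est:.4e} rad/C")
print(f"assumed PSF value:    {true_psf(OMEGA * T1):.4e} rad/C")
```

Repeating the procedure for a set of uniformly spaced values of $t_1$ over one period would yield the sampled PSF vector used elsewhere in the model.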

A similar simulation can be used to obtain the excess amplitude time constant,  $\tau_n$ . Again starting with two identical copies of the oscillator circuit with the same initial conditions, a narrow rectangular current pulse can be applied to one of the oscillators at a point in time where the PSF is zero. The difference between the simulated waveforms of the two oscillators immediately following the applied pulse can be approximated as an exponentially decaying function from which the time constant can be measured.
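One way to measure the time constant from the waveform difference is a least-squares line fit to the logarithm of its magnitude, whose slope is $-1/\tau_n$ if the decay is a single exponential. The sketch below assumes the difference samples have already been extracted from the two simulations; the function name and the synthetic data are hypothetical.

```python
import math


def fit_time_constant(times, diffs):
    """Fit ln|diff| = a - t/tau by ordinary least squares and return tau.

    Samples that are exactly zero are skipped because their logarithm is undefined.
    """
    pts = [(t, math.log(abs(d))) for t, d in zip(times, diffs) if d != 0.0]
    if len(pts) < 2:
        raise ValueError("need at least two nonzero samples")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    return -1.0 / slope


# Synthetic stand-in for the simulated difference: 50 mV decaying with tau = 80 ps.
TAU_TRUE = 80e-12
times = [k * 1e-12 for k in range(400)]
diffs = [0.05 * math.exp(-t / TAU_TRUE) for t in times]
tau_est = fit_time_constant(times, diffs)
print(f"fitted time constant: {tau_est * 1e12:.2f} ps")
```

With real simulation data the fit window would typically be limited to the portion of the decay that lies well above the simulator's numerical noise floor.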

To obtain  $C_n$ , a small rectangular current pulse of magnitude  $I_p$  with a duration of  $\Delta t$  can be applied to the simulated oscillator circuit, and the difference between the oscillator voltage just after the pulse,  $v_n(t_1 + \Delta t)$ , and the oscillator voltage just prior to the pulse,  $v_n(t_1)$ , can be observed. With these observed values, it follows from (6) that  $C_n$  can be calculated using

$$C_n \approx \frac{I_p \Delta t}{v_n(t_1 + \Delta t) - v_n(t_1) - f_n(\omega_n(t_1 + \Delta t)) + f_n(\omega_n t_1)}.$$
(14)
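The calculation in (14) amounts to dividing the delivered charge by the part of the voltage change that is not explained by the free-running waveform. A minimal sketch follows, with hypothetical argument names and numbers; the four voltages are assumed to have been read from the pulsed and free-running simulations at $t_1$ and $t_1 + \Delta t$.

```python
def estimate_capacitance(i_p, delta_t, v_before, v_after, f_before, f_after):
    """Apply (14): C_n = I_p * delta_t / ((v_after - v_before) - (f_after - f_before)).

    v_before, v_after : pulsed oscillator voltage at t1 and t1 + delta_t
    f_before, f_after : free-running waveform at the same two instants
    """
    excess_step = (v_after - v_before) - (f_after - f_before)
    if excess_step == 0.0:
        raise ValueError("pulse produced no voltage step beyond the free-running change")
    return i_p * delta_t / excess_step


# Hypothetical readings: a 0.1 mA, 1 ps pulse that adds 5 mV beyond the free-running change.
c_est = estimate_capacitance(i_p=1e-4, delta_t=1e-12,
                             v_before=0.900, v_after=0.915,
                             f_before=0.900, f_after=0.910)
print(f"estimated C_n: {c_est * 1e15:.2f} fF")
```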

## IV. COUPLED RING OSCILLATORS

To determine how coupled oscillators interact, it is necessary to derive expressions for the currents exchanged by the coupled oscillators as a function of the oscillator model parameters. Once these expressions are known, they can be combined with the differential forms of the model equations, i.e., (8) and (9), to generate a system of ordinary differential equations (ODEs) that describe the excess phase and excess amplitude of each oscillator. As demonstrated in this section, the ODEs can be solved numerically to obtain results that closely match direct circuit simulations, or they can be simplified via approximations to facilitate hand analysis.

For illustration purposes, a pair of coupled oscillators is considered throughout this section. In practice, coupled oscillators are connected through metal interconnect lines on integrated circuits. The interconnect lines present resistance, and at high frequencies can be modeled as lossy transmission lines. Although the voltage and current relationship between the two ends of the transmission line can be easily obtained using known techniques, the interconnect line is modeled as a resistor to simplify the derivation in the following analysis.

Figure 1.4: Schematics of the resistively coupled oscillator pair.

### A. Derivation of the Differential Equations

A circuit diagram of the pair of coupled oscillators analyzed below is shown in Figure 1.4. Each oscillator is a five-stage ring oscillator with transistor sizes as noted in the figure. The tail currents are intentionally mismatched to represent rather extreme process variations. The oscillators are designed for the TSMC 0.18-µm CMOS process. Circuit simulations using BSIM3V3 models for this process indicate that the free running frequencies of the two oscillators are 1.908 GHz and 2.053 GHz, respectively.

In practice, the resistance of a minimum-width interconnect line several hundred microns long is usually in the tens of Ohms. In this particular simulation, the oscillators are connected to each other through a 20-$\Omega$ resistor to model a 200-$\mu$m long, minimum-width interconnect. They are also connected (implicitly) through a low-impedance ground line. In practice, ground connections are implemented with wide metal lines, so they tend to have low resistance. Therefore, in this case the resistance of the ground connection is approximated as zero. Transistor-level simulation indicates that the two oscillators lock to a common frequency of 1.986 GHz.

Since the coupling resistor is small, it is convenient to approximate the resistance as zero so the outputs of the two oscillators are shorted. As shown in Figure 1.4, the top and bottom oscillators are denoted as Oscillator 1 and Oscillator 2, respectively. Because of the above approximation, Oscillator 1 and Oscillator 2 are directly connected, which implies that their voltages are equal and their currents have equal magnitude but opposite signs. That is,  $v_1(t) = v_2(t)$  and  $i_1(t) = -i_2(t)$ . Once the values of  $v_1(t)$ ,  $v_2(t)$ ,  $i_1(t)$  and  $i_2(t)$  are calculated as shown below, it follows that the voltages at the actual outputs of the top and bottom ring oscillators in Figure 1.4 are  $v_2(t)+i_2(t)\cdot(10\Omega)$  and  $v_2(t)-i_2(t)\cdot(10\Omega)$ , respectively. Since the coupling resistor is very small, the error due to the above approximation is neglected.

Differentiating (1) and substituting (8), (9), and (12) into the result yields

$$v'_n(t) = \omega_n f'_n(\omega_n t + \phi_n(t)) + \frac{i_n(t)}{C_n} - \frac{\Delta A_n(t)}{\tau_n}.$$
(15)



Figure 1.5: Comparison between transistor level simulation and theoretical calculation of the coupled oscillator pair. (a) Waveform of one locked period. (b) Absolute error between simulation and calculation.

Given that  $v_1(t)$  and  $v_2(t)$  are equal, it follows that their derivatives are also equal. Thus, the right sides of the two equations given by (15) with n = 1,2 are equal. Solving for  $i_1(t) = -i_2(t)$  gives

$$i_{1}(t) = -i_{2}(t) = \frac{C_{1}C_{2}}{C_{1} + C_{2}} \left[ \omega_{2}f_{2}'(\omega_{2}t + \phi_{2}(t)) - \omega_{1}f_{1}'(\omega_{1}t + \phi_{1}(t)) - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}} \right]$$
(16)

Substituting (16) into (8) and (9) yields

$$\Phi_{1}'(t) = \frac{C_{1}C_{2}}{C_{1}+C_{2}}\Gamma_{1}(\Phi_{1}(t))\left[\omega_{2}f_{2}'(\Phi_{2}(t)) - \omega_{1}f_{1}'(\Phi_{1}(t)) - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}}\right] + \omega_{1}, \quad (17)$$

$$\Phi_{2}'(t) = -\frac{C_{1}C_{2}}{C_{1}+C_{2}}\Gamma_{2}\left(\Phi_{2}(t)\right)\left[\omega_{2}f_{2}'\left(\Phi_{2}(t)\right) - \omega_{1}f_{1}'\left(\Phi_{1}(t)\right) - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}}\right] + \omega_{2}, \quad (18)$$

$$\Delta A_{1}'(t) = -\frac{\Delta A_{1}(t)}{\tau_{1}} + \frac{C_{2}}{C_{1} + C_{2}} \Omega_{1}(\Phi_{1}(t)) \bigg[ \omega_{2} f_{2}'(\Phi_{2}(t)) - \omega_{1} f_{1}'(\Phi_{1}(t)) - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}} \bigg],$$
(19)

and

$$\Delta A_{2}'(t) = -\frac{\Delta A_{2}(t)}{\tau_{2}} - \frac{C_{1}}{C_{1} + C_{2}} \Omega_{2} \left( \Phi_{2}(t) \right) \left[ \omega_{2} f_{2}' \left( \Phi_{2}(t) \right) - \omega_{1} f_{1}' \left( \Phi_{1}(t) \right) - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}} \right], (20)$$

where  $\Phi_n(t) = \omega_n t + \phi_n(t)$ , n = 1, 2, are the absolute phases of the two oscillators.

For demonstration purposes, equations (17) to (20) are numerically solved using Euler's method [8] and compared with transistor-level SPICE simulation. A uniform step size of 0.2 ps is used in the numerical solution. Figure 1.5(a) shows the calculated waveforms of oscillators 1 and 2 over one locked period in comparison with the simulation. Figure 1.5(b) shows the error between the calculation and the simulation. The calculated locked frequency is 1.987 GHz, which is within 0.1% of the transistor-level simulation result. The particular oscillator used in this example was chosen for demonstration purposes only. Although not shown in this paper, coupled oscillator pairs implemented in other processes and running at 1 GHz and 300 MHz have also been simulated. When the theoretical model is compared with the transistor-level simulations in those cases, similar accuracy is observed.
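The structure of such an Euler solution of (17) to (20) can be sketched in a few lines. The code below is only an illustration: the free-running frequencies and the 0.2 ps step are the values quoted in the text, whereas the sinusoidal waveform, the PSF shape, the capacitance, and the time constant are hypothetical stand-ins for the sampled parameters extracted from SPICE, so the output is not expected to reproduce Figure 1.5.

```python
import math


def make_toy_oscillator(freq_hz, swing=0.5, cap=20e-15, tau=100e-12):
    """Hypothetical model parameters; only freq_hz is meant to mirror the text."""
    return {
        "omega": 2.0 * math.pi * freq_hz,
        "C": cap,
        "tau": tau,
        "f": lambda th: swing * math.sin(th),            # waveform f_n(theta)
        "fprime": lambda th: swing * math.cos(th),       # df_n/dtheta
        "psf": lambda th: math.cos(th) / (cap * swing),  # assumed PSF shape
    }


def coupled_rhs(state, p1, p2):
    """Right-hand sides of (17)-(20) for state = (Phi1, Phi2, dA1, dA2)."""
    phi1, phi2, da1, da2 = state
    c1, c2 = p1["C"], p2["C"]
    bracket = (p2["omega"] * p2["fprime"](phi2) - p1["omega"] * p1["fprime"](phi1)
               - da2 / p2["tau"] + da1 / p1["tau"])
    i1 = c1 * c2 / (c1 + c2) * bracket                       # (16); i2 = -i1
    asf1 = 1.0 - c1 * p1["psf"](phi1) * p1["fprime"](phi1)   # (12)
    asf2 = 1.0 - c2 * p2["psf"](phi2) * p2["fprime"](phi2)
    return (
        p1["omega"] + p1["psf"](phi1) * i1,    # (17)
        p2["omega"] - p2["psf"](phi2) * i1,    # (18)
        -da1 / p1["tau"] + asf1 * i1 / c1,     # (19)
        -da2 / p2["tau"] - asf2 * i1 / c2,     # (20)
    )


def euler_solve(state0, p1, p2, t_end, dt):
    """Advance the four-state system with a fixed-step forward Euler method."""
    state = tuple(state0)
    for _ in range(int(round(t_end / dt))):
        deriv = coupled_rhs(state, p1, p2)
        state = tuple(s + dt * d for s, d in zip(state, deriv))
    return state


osc1 = make_toy_oscillator(1.908e9)
osc2 = make_toy_oscillator(2.053e9)
phi1_0, phi2_0 = 0.0, 0.4 * math.pi
# Shorted outputs require v1(0) = v2(0); the mismatch is absorbed into dA2(0).
da2_0 = osc1["f"](phi1_0) - osc2["f"](phi2_0)
final = euler_solve((phi1_0, phi2_0, 0.0, da2_0), osc1, osc2, t_end=2e-9, dt=0.2e-12)
print("final state (Phi1, Phi2, dA1, dA2):", final)
```

A fixed-step Euler method is used here only because it is the method named in the text; a higher-order or adaptive integrator could be substituted without changing `coupled_rhs`.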

## V. SIMPLIFIED OSCILLATOR MODEL

The accuracy of the numerical solution in the previous section confirms the validity of the proposed model. Unfortunately, the interaction between the pair coupled oscillators is concealed by the complicated calculations. Therefore it is desirable to have a further simplified model to gain more understanding into the behavior of coupled oscillators. Such a simplified model is presented in this section.

Since the modeling parameters  $f_n(\cdot)$ ,  $\Gamma_n(\cdot)$  and  $\Omega_n(\cdot)$  are derived from simulations, they have no analytical forms. In the following analysis, these parameters are approximated as piece-wise linear curves for simplicity. As shown in Figure 1.6(a), one period of the function  $f_n(\cdot)$  is divided into four regions. Within each region, the waveform is approximated as a straight line. For ease of notation, these four regions are expressed as region 1 (the rising edge), region 2, region 3 (the falling edge) and region 4, respectively. The slope in region 1 of the approximation is chosen to be equal to the slope in the middle of the simulated waveform's rising edge, while the slope in region 3 of the approximation is chosen to be equal to the slope in the middle of the simulated waveform's falling edge. The boundaries between regions are chosen to minimize the difference between the simulated waveform and the approximation. In the following analysis, a new term  $\varphi_n(j,k)$  is introduced to express the phase of the boundary between region j and k of oscillator n, where the value of  $\varphi_n(j,k)$  is kept between 0 and  $2\pi$ .



Figure 1.6: (a) Piece-wise linear approximation of the waveform. (b) Piece-wise linear approximation of the PSF and ASF.

In general the region boundary is expressed as  $\varphi_n(j,k) + 2m\pi$ , where m = 0, 1, 2...

Similarly, the PSF $\Gamma_n(\cdot)$ and the ASF $\Omega_n(\cdot)$ are divided into the same four regions as $f_n(\cdot)$. The values of both $\Gamma_n(\cdot)$ and $\Omega_n(\cdot)$ are approximated as constant within each region as shown in Figure 1.6(b). As mentioned previously, when the oscillator is in either region 2 or 4, the magnitude of the ASF peaks and the magnitude of the PSF is at its minimum. Conversely, when the oscillator is in either region 1 or 3, the PSF's magnitude peaks and the magnitude of the ASF is at its minimum. Therefore, in regions 2 and 4, the PSF is approximated as zero and the ASF is set to its maximum value, 1. On the other hand, in regions 1 and 3, the ASF is approximated as zero and the PSF is approximated as a constant. It follows from (12) that the value of the PSF in regions 1 and 3 is given by

$$\Gamma_n(\Phi_n(t)) = \frac{1}{C_n f'_n(\Phi_n(t))}, \quad (n = 1, 2).$$
(21)

The above approximation greatly simplifies the non-linear terms in the ODEs (17) to (20) as follows:

$$\Phi_{1}'(t) = \omega_{1} + \gamma_{1,j} \frac{C_{1}C_{2}}{C_{1} + C_{2}} \left[ s_{2,k} - s_{1,j} - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}} \right],$$
(22)

$$\Phi_{2}'(t) = \omega_{2} - \gamma_{2,k} \frac{C_{1}C_{2}}{C_{1} + C_{2}} \left[ s_{2,k} - s_{1,j} - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}} \right],$$
(23)

$$\Delta A_{1}'(t) = \alpha_{1,j} \frac{C_{2}}{C_{1} + C_{2}} \left[ s_{2,k} - s_{1,j} - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}} \right] - \frac{\Delta A_{1}(t)}{\tau_{1}}, \quad (24)$$

$$\Delta A_{2}'(t) = -\alpha_{2,k} \frac{C_{1}}{C_{1} + C_{2}} \left[ s_{2,k} - s_{1,j} - \frac{\Delta A_{2}(t)}{\tau_{2}} + \frac{\Delta A_{1}(t)}{\tau_{1}} \right] - \frac{\Delta A_{2}(t)}{\tau_{2}}, \quad (25)$$

where $s_{1,j}$ and $s_{2,k}$ represent the waveform slopes of oscillators 1 and 2, respectively. The subscripts *j* and *k* range from 1 to 4 and indicate the regions that oscillators 1 and 2 are in. Similarly, $\alpha_{1,j}$ and $\alpha_{2,k}$ represent the ASF values and the terms $\gamma_{1,j}$ and $\gamma_{2,k}$ represent the PSF values. At any given time instant, each oscillator is in one of the four regions, which leads to 16 possible realizations of the ODEs (22) to (25). However, the previous simplification of the ASF and PSF values results in only four unique forms of the ODEs. Specifically, when both ASFs are 1, the ODEs reduce to second-order nonhomogeneous differential equations. When either ASF is 0, the ODEs reduce to first-order homogeneous differential equations. For each of the four cases, the ODEs have closed form solutions.

For demonstration purposes, Figure 1.7(a) shows the calculated locking transient of both oscillators using the simplified model. The free-running waveforms of both oscillators are also plotted for comparison. In order to explain how to solve the simplified ODEs, the first 400 ps of the locking transient waveform is shown in the top plot of Figure 1.7(b). The detailed procedure by which the solution is obtained is as follows. The first step is to set the initial condition at time t = 0 for the ODEs, i.e., $\Phi_1(0)$, $\Phi_2(0)$, $\Delta A_1(0)$ and $\Delta A_2(0)$. Because of the approximation that the two oscillators are shorted, the initial condition must be chosen such that $v_1(0) = v_2(0)$. In this particular example, the initial phases are chosen as $\Phi_1(0) = 0$ and $\Phi_2(0) = 0.4\pi$ for illustration purposes only. For simplicity, the initial excess amplitudes are chosen as $\Delta A_1(0) = 0$ and $\Delta A_2(0) = f_1(\Phi_1(0)) - f_2(\Phi_2(0))$. As a result, both oscillators start from region 1. Solving the ODEs yields the following closed form solutions:



Figure 1.7: (a) Locking transient of coupled oscillator pair using the simplified model. (b) The first 400ps is expanded to illustrate the details of how to solve the simplified ODEs.

$$\Phi_{1}(t) = \left[\omega_{1} + \gamma_{1,1} \frac{C_{1}C_{2}}{C_{1} + C_{2}} (s_{2,1} - s_{1,1})\right] t + \gamma_{1,1} \frac{C_{1}C_{2}}{C_{1} + C_{2}} \Delta A_{2}(0) \left(e^{-t/\tau_{2}} - 1\right) + \Phi_{1}(0), \qquad (26)$$


$$\Phi_{2}(t) = \left[\omega_{2} - \gamma_{2,1} \frac{C_{1}C_{2}}{C_{1} + C_{2}} (s_{2,1} - s_{1,1})\right] t - \gamma_{2,1} \frac{C_{1}C_{2}}{C_{1} + C_{2}} \Delta A_{2}(0) \left(e^{-t/\tau_{2}} - 1\right) + \Phi_{2}(0), \qquad (27)$$

$$\Delta A_1(t) = 0, \qquad (28)$$

$$\Delta A_2(t) = \Delta A_2(0)e^{-t/\tau_2}.$$
(29)
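The closed form solutions (26) to (29) can be checked numerically against a direct integration of (22) to (25) with both ASFs set to zero. The sketch below does this for one set of parameters; every numerical value, and the dictionary layout used to pass them, is hypothetical and chosen only so that the comparison can be run.

```python
import math


def region11_closed_form(t, phi1_0, phi2_0, da2_0, p):
    """Evaluate (26)-(29) at time t; valid while both oscillators stay in region 1."""
    ceq = p["C1"] * p["C2"] / (p["C1"] + p["C2"])
    slope_term = ceq * (p["s2"] - p["s1"])
    decay = math.exp(-t / p["tau2"])
    transient = ceq * da2_0 * (decay - 1.0)
    phi1 = (p["omega1"] + p["gamma1"] * slope_term) * t + p["gamma1"] * transient + phi1_0
    phi2 = (p["omega2"] - p["gamma2"] * slope_term) * t - p["gamma2"] * transient + phi2_0
    return phi1, phi2, 0.0, da2_0 * decay


def region11_euler(t_end, dt, phi1_0, phi2_0, da2_0, p):
    """Forward-Euler integration of (22)-(25) with both ASF values equal to zero."""
    ceq = p["C1"] * p["C2"] / (p["C1"] + p["C2"])
    phi1, phi2, da1, da2 = phi1_0, phi2_0, 0.0, da2_0
    for _ in range(int(round(t_end / dt))):
        bracket = p["s2"] - p["s1"] - da2 / p["tau2"] + da1 / p["tau1"]
        phi1 += dt * (p["omega1"] + p["gamma1"] * ceq * bracket)
        phi2 += dt * (p["omega2"] - p["gamma2"] * ceq * bracket)
        da1 += dt * (-da1 / p["tau1"])
        da2 += dt * (-da2 / p["tau2"])
    return phi1, phi2, da1, da2


# Hypothetical region-1 parameters (slopes in V/s, PSF values in rad/C).
params = {
    "omega1": 2.0 * math.pi * 1.9e9, "omega2": 2.0 * math.pi * 2.05e9,
    "C1": 20e-15, "C2": 20e-15, "tau1": 100e-12, "tau2": 100e-12,
    "s1": 2.0e10, "s2": 2.2e10, "gamma1": 3.0e13, "gamma2": 2.8e13,
}
T_CHECK = 100e-12
exact = region11_closed_form(T_CHECK, 0.0, 0.4 * math.pi, 0.05, params)
approx = region11_euler(T_CHECK, 1e-14, 0.0, 0.4 * math.pi, 0.05, params)
print("closed form:", exact)
print("Euler check:", approx)
```

Agreement between the two results to within the Euler truncation error gives some confidence that the signs and coefficients in the closed form expressions have been transcribed consistently.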

As time progresses, each oscillator will reach the boundary between region 1 and 2. The region transition is illustrated in the bottom plot in Figure 1.7(b), where the region values of both oscillators are plotted as functions of time. In addition, a time sequence t[n], (n = 1, 2, 3...) is used to express each time a region transition occurs, which is shown as circles in the figure. The following analysis shows how to calculate the time sequence t[n] using the closed form solutions. The first region transition time, expressed as t[1], is obtained in two steps. The first step is to numerically solve the following two equations,  $\Phi_1(t) = \varphi_1(1,2) + 2m\pi$  and  $\Phi_2(t) = \varphi_2(1,2) + 2m\pi$ , where m = 0. The solutions to the equations are called  $t_1[1]$  and  $t_2[1]$ , respectively. Then the region transition time is given by the smaller of the two solutions, i.e.,  $t[1] = \min(t_1[1], t_2[1])$ . The greater of the two solutions is discarded. In this particular example, oscillator 2 has the first transition. At t = t[1], oscillator 1 remains in region 1 while oscillator 2 advances to region 2. The form of the solutions is changed accordingly but remains closed form for  $t \ge t[1]$  (the detailed expression of the solutions is not shown here for brevity). In addition, the solutions are uniquely determined by the initial condition at t = t[1]. Because of the continuity of the solutions, the initial condition is given by evaluating (26) to (29) at t = t[1], i.e.,  $\Phi_1(t[1])$ ,



Figure 1.8: Comparison between transistor level simulation and theoretical calculation using the simplified model of the coupled oscillator pair. (a) Waveform of one locked period. (b) Absolute error between simulation and calculation.

$\Phi_2(t[1])$, $\Delta A_1(t[1])$ and $\Delta A_2(t[1])$. The next step is to calculate the second region transition time t[2]. Similar to the procedure by which t[1] is obtained, t[2] is given by the smaller solution to the following equations, $\Phi_1(t) = \varphi_1(1,2) + 2m\pi$ and $\Phi_2(t) = \varphi_2(2,3) + 2m\pi$, where m = 0. The rest of the solution is obtained by repeating the procedure described above.
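The two-step procedure for t[1] can be sketched numerically as follows. This is only an illustration, not the code used to produce the results in this chapter: the phase functions have the same form as (26) and (27), but every numerical value (slopes, exponential coefficients, time constant, initial phases, and boundary phases) is a hypothetical placeholder, and bisection is just one possible choice of numerical root finder.

```python
import math

def make_phase(slope, coeff, tau, phi0):
    """Return Phi(t) = slope*t + coeff*(exp(-t/tau) - 1) + phi0, the form of (26)/(27)."""
    def phase(t):
        return slope * t + coeff * (math.exp(-t / tau) - 1.0) + phi0
    return phase

def crossing_time(phase, target, t_max, tol=1e-18):
    """Bisection for phase(t) = target on [0, t_max], assuming phase is increasing."""
    lo, hi = 0.0, t_max
    if not phase(lo) <= target <= phase(hi):
        raise ValueError("target phase is not bracketed on [0, t_max]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phase(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical placeholder values, not taken from the dissertation.
TAU_2 = 50e-12
phi_1 = make_phase(2 * math.pi * 1.9e9, 0.05, TAU_2, 0.0)
phi_2 = make_phase(2 * math.pi * 2.0e9, -0.05, TAU_2, 0.1)
BOUNDARY_1 = math.pi / 2      # stands in for varphi_1(1,2), with m = 0
BOUNDARY_2 = math.pi / 2      # stands in for varphi_2(1,2), with m = 0

t_1 = crossing_time(phi_1, BOUNDARY_1, 1e-9)   # t_1[1]
t_2 = crossing_time(phi_2, BOUNDARY_2, 1e-9)   # t_2[1]
t_first = min(t_1, t_2)       # t[1]; the larger solution is discarded
first_oscillator = 1 if t_1 <= t_2 else 2
print(f"t[1] = {t_first * 1e12:.2f} ps, oscillator {first_oscillator} transitions first")
```

For t[2], the same two calls would be repeated with the post-transition phase expressions and with the boundary of the oscillator that advanced replaced by its next region boundary.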

As mentioned before, the above analysis is based on the approximation that the oscillators are shorted together. As a result, the solutions satisfy $v_1(t) = v_2(t)$. In order to obtain the voltage at the oscillator output node, the effect of the 20-$\Omega$ coupling resistor should be taken into account. First, the oscillator currents $i_1(t)$ and $i_2(t)$ are given by (16). Then the node voltages of oscillator 1 and 2 are given by $v_2(t)+i_2(t)\cdot(10\Omega)$ and $v_2(t)-i_2(t)\cdot(10\Omega)$, respectively. Similar to the previous section, the error due to this approximation is neglected. The calculated locked frequency is 1.955 GHz, which is within 1.5% of the value obtained from transistor-level simulation. The locked waveforms of oscillator 1 and 2 are shown in Figure 1.8(a) together with the results obtained from transistor-level simulation, and the absolute error between calculation and simulation is shown in Figure 1.8(b).

The choice of the initial conditions deserves further discussion. In principle, there are infinitely many initial conditions at t = 0 that satisfy $v_1(0) = v_2(0)$. Therefore, it is impractical to exhaustively calculate all possible initial conditions. For all the different initial conditions tested, the two oscillators obtain frequency lock within several oscillator cycles.

Compared with the more accurate model presented in the previous section, the simplified model greatly reduces the amount of calculation. Instead of calculating at every time step, the simplified model allows numerical calculation to be performed only four times in one period for each oscillator. More importantly, despite the gross error introduced by the approximation, the simplified model can still predict the locking behavior with moderate accuracy, as shown in Figure 1.8. This is a good indication that the salient properties of the oscillator are preserved by the simplified model even for a complicated phenomenon such as injection locking.

## VI. CONCLUSION

A conceptually simple, yet accurate, oscillator model is presented in this paper. The model parameters can be quickly deduced from circuit-level simulations. Once the parameters are obtained, the model is capable of accurately predicting the injection-locking behavior of strongly coupled oscillators with component mismatches. In addition to deriving the general model, the paper derives a simplified version of the model applicable to ring oscillators. Good agreement between the simulation and theoretical calculation has been observed. A piece-wise linear model is developed to further simplify the calculation. Despite its simplicity, the further simplified model still retains the salient properties of the injection-locking behavior between strongly coupled oscillators.

### REFERENCES

- 1. I. Galton, D. A. Towne, J. J. Rosenberg, H. T. Jensen, "Clock Distribution Using Coupled Oscillators," 1996 *IEEE International Symposium on Circuits and Systems*, vol. 3.4 pp. 217–220, May 1996.
- 2. H. Mizuno, K. Ishibashi, "A Noise-immune GHz-Clock Distribution Scheme using Synchronous Distributed Oscillator," 1998 *IEEE International Solid-State Circuits Conference*, pp. 404–405, Feb. 1998.
- 3. T. Takahashi, et al., "110-GB/s Simultaneous Bidirectional Transceiver Logic Synchronized with a System Clock," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 11, pp. 1526–1533, Nov. 1999.
- 4. P. Liao and R. A. York, "A New Phase-shifterless Beam Scanning Technique Using Arrays of Coupled Oscillators," *IEEE Transactions on Microwave Theory and Techniques*, vol. 41, no. 10, pp. 1810–1815, Oct. 1993.
- 5. A. Hajimiri, T. H. Lee, "A General Theory of Phase Noise in Electrical Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 2, pp. 179–194, Feb. 1998.
- 6. A. Hajimiri, T. H. Lee, "The Design of Low Noise Oscillators," Kluwer Academic Publishers, 1999.
- 7. H. Tanaka, et al., "Synchronizability of Distributed Clock Oscillators," *IEEE Transactions on Circuits and Systems – I: Fundamental Theory and Applications*, vol. 49, no. 9, pp. 1271–1278, Sep. 2002.
- 8. J. H. Mathews. *Numerical Methods for Mathematics, Science, and Engineering*. Prentice-Hall, 1992.

# A Multiple-Crystal Interface PLL with VCO Realignment to Reduce Phase Noise

Sheng Ye, Lars Jansson and Ian Galton

*Abstract* — An enhancement to a conventional integer-*N* PLL is introduced, analyzed, and demonstrated experimentally to significantly reduce VCO phase noise. The enhancement, which involves periodically injection locking the VCO to a buffered version of the reference, has the effect of widening the PLL bandwidth and reducing the overall phase noise. It is demonstrated in a 3 V 6.8 mW CMOS reference PLL with a ring VCO capable of converting most of the popular crystal reference frequencies to a 96 MHz RF PLL reference and baseband clock for a direct conversion Bluetooth wireless LAN. The peak in-band phase noise at an offset of 20 kHz is –102 dBc/Hz with the technique enabled and –92 dBc/Hz with the technique disabled. A theoretical analysis is presented and shown to be in close agreement with the measured results.

## I. INTRODUCTION

This paper presents a new enhancement to a conventional PLL in which a CMOS ring voltage controlled oscillator (VCO) is periodically realigned by the PLL reference signal to reduce phase noise. The enhancement is applied to a VHF reference PLL capable of operation with a wide range of crystal frequencies for a Bluetooth transceiver. A peak phase noise reduction of 10 dB and an integrated phase noise reduction of 8.3 dB relative to the conventional PLL alone are demonstrated via measured results that closely match theoretical predictions.

The eventual market penetration of low-end local area networks such as Bluetooth will depend largely on the extent to which transceiver prices can be reduced. With Bluetooth unit prices approaching the five dollar level, the cost of the external crystal has become significant. Since the transceivers generally are placed in host devices with their own crystal references, such as cellular telephones, computers, and PDAs, a cost reduction can be achieved by sharing the same reference as the host device. This is particularly advantageous in wireless host devices such as cellular telephones which tend to be sensitive to interference at the circuit board level from oscillator signals outside of their frequency plans. Therefore, for maximum flexibility it is desirable to have a Bluetooth transceiver capable of operating from all of the popular crystal reference frequencies.

The most critical local oscillator in a wireless transceiver typically is that used to drive the RF mixers. In a direct conversion Bluetooth transceiver such as [1], in-phase and quadrature local oscillator signals are required with frequencies selectable from 2.402 GHz to 2.480 GHz in steps of 1 MHz. In principle, these signals can be generated from any of the popular crystal frequencies by a fractional-*N* *RF PLL*, but the flexibility required to accommodate all the crystal frequencies would translate into significant added circuit area and power consumption [2], [3]. Alternatively, an integer-*N* RF PLL with a 1 MHz reference signal can be used. The primary difficulty with this approach is that a *reference PLL* capable of generating the 1 MHz reference signal with very little phase noise from any of the crystal frequencies is required. This paper presents such a PLL designed for a next generation version of the Bluetooth transceiver in [1]. The PLL generates a 96 MHz signal from which the 1 MHz RF PLL reference is derived along with a 32 MHz clock used to drive the baseband circuitry and data converters in the transceiver.

The reference PLL design was challenging because of the requirements that it be implemented with only CMOS transistors, that it use an on-chip ring VCO to avoid external components, and that its phase noise between 1 kHz and 50 kHz from the carrier be below -100 dBc/Hz (the loop bandwidth of the RF PLL is 50 kHz). Unfortunately, CMOS ring oscillators are noisy. For example, the measured phase noise from the ring VCO implemented within the reference PLL at an offset from the carrier of 100 kHz is -107 dBc/Hz, so the loop bandwidth of a conventional PLL would have to be approximately 100 kHz to sufficiently suppress the VCO noise with enough margin to allow for the phase noise contributed by the other PLL components. The large loop bandwidth ruled out the use of a conventional integer-N PLL. In such a PLL, the reference frequency is obtained by dividing the crystal frequency to its greatest common divisor with 96 MHz, but to maintain stability over process and temperature extremes the PLL reference frequency must be approximately 20 times the loop bandwidth [4]. Among the commonly used crystal references, this implies that the target phase noise could not be met for 19.68 and 19.8 MHz crystals using a conventional integer-N PLL. For example, the greatest common divisor of 19.68 MHz and 96 MHz is 480 kHz, which implies a reference frequency of 480 kHz and a maximum practical loop bandwidth of only 24 kHz.
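The arithmetic behind this argument is easy to reproduce. The sketch below computes, for the two crystal frequencies named above, the reference frequency a conventional integer-*N* PLL would use (the greatest common divisor with 96 MHz), the resulting divider values, and the loop bandwidth implied by the factor-of-20 guideline from [4]. The function and constant names are illustrative only.

```python
from math import gcd

F_OUT_HZ = 96_000_000      # reference PLL output frequency
REF_TO_BW_RATIO = 20       # reference frequency is roughly 20x the loop bandwidth [4]

def integer_n_plan(f_xtal_hz):
    """Frequency plan of a conventional integer-N PLL for one crystal frequency."""
    f_ref = gcd(f_xtal_hz, F_OUT_HZ)
    return {
        "f_ref_hz": f_ref,                        # PLL reference frequency
        "xtal_divider": f_xtal_hz // f_ref,       # crystal-to-reference divider
        "N": F_OUT_HZ // f_ref,                   # feedback divider
        "max_bw_hz": f_ref / REF_TO_BW_RATIO,     # practical loop bandwidth limit
    }

for f_xtal in (19_680_000, 19_800_000):
    print(f_xtal, integer_n_plan(f_xtal))
```

Both crystals give a practical bandwidth limit well below the roughly 100 kHz that would be needed to suppress the ring VCO noise in a conventional loop.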

To solve this problem, a VCO realignment technique has been developed and applied to a conventional narrow band integer-*N* PLL. The new topology is referred to as a realigned PLL (RPLL). The idea is to perform VCO realignment by injection locking the VCO to a buffered version of the PLL reference once every reference period. As explained in Section II, the realignment has the effect of significantly reducing the phase noise introduced by the VCO at frequencies below the reference frequency. A theoretical model of the RPLL is derived in Section III, and measurement results are presented in Section IV that are very close to those predicted by the theoretical model. It is shown that the RPLL has similarities to a delay-locked loop (DLL) in the way it suppresses phase noise below the reference frequency, and that the theoretical model derived in this paper for the RPLL is also applicable to the DLL. Nevertheless, along with the usual benefits offered by PLLs, the RPLL offers the potential advantage that it is not restricted to oscillators based on delay lines; the injection locking principle on which the RPLL is based is known to be applicable to various types of VCOs [5], [6].

## II. REALIGNED PLL SYSTEM OVERVIEW

#### A. The Idea

A simplified block diagram of the RPLL is shown in Figure 2.1. The idea is to use a buffered version of the PLL reference to correct the VCO once every reference period. As shown in the figure, the new PLL is based on a conventional integer-*N* PLL with the addition of a buffer and a control logic block. The motivation for this new topology is described below.



Figure 2.1: Proposed top-level system block diagram.

If the oscillator were noiseless, its zero-crossings would be uniformly spaced in time. However, noise inside the oscillator causes phase fluctuations which give rise to errors in the zero-crossing times. As shown in [7], noise induced phase fluctuations persist indefinitely in oscillators. This phenomenon is illustrated in Figure 2.2(a), where the phase fluctuations "build up" in a 3-stage ring VCO over time because the zero-crossing error introduced by each inverter adds to all the previous zero-crossing errors [8]. Therefore the VCO can be modeled as a phase fluctuation integrator. In the frequency domain, this integration has the effect of multiplying the power spectral density (PSD) of the zero-crossing time errors by a transfer function proportional to  $1/f^2$  which results in high in-band phase noise.

As shown in Figure 2.2(b), assuming the VCO frequency is an integer multiple of the reference frequency, the realignment technique shorts a buffered version of the clean reference signal to the VCO output during windows surrounding the time instants where VCO edges ideally coincide with reference signal edges. This causes each VCO edge to be "pulled" toward the correct position thereby suppressing the memory of past errors. In the frequency domain, the suppression of noise memory attenuates the $1/f^2$ transfer function mentioned above at frequencies below the reference frequency which greatly reduces the in-band phase noise power introduced by the VCO.

Figure 2.2: (a) Phase noise is accumulated in typical ring oscillators. (b) Periodically realigning the oscillator to a "clean" edge suppresses the phase noise accumulation.

#### B. The Realigning Factor

During each phase realignment, the buffered reference edge is connected to the VCO clock edge. Ideally, if both the reference and the VCO were noiseless, a VCO edge would line up with an edge of the buffered reference once every reference period.



Figure 2.3: Simulated initial phase difference versus shifted phase curve.

However, phase noise in the VCO and the reference causes these two edges to occur at slightly different times. The coupling between the VCO delay cell and the reference buffer causes the VCO edge to be dragged toward the reference edge. Inevitably, the realignment perturbs the VCO. As shown in [7], perturbations to the VCO generally cause both amplitude and phase fluctuations. The amplitude fluctuations usually disappear within several VCO cycles due to the amplitude limiting mechanism in oscillators. The phase fluctuations, however, persist indefinitely.

Transistor level simulation can be used to quantify the VCO's response to the phase realignment by examining the VCO phase shift several cycles after the phase realignment. Simulations indicate that the VCO phase is shifted almost instantaneously by the realignment. A typical plot of the phase shift as a function of the phase error, i.e., the difference between VCO and reference phases just prior to the realignment, is shown in Figure 2.3. In practice, the cycle-to-cycle variation of the instantaneous phase error is mainly caused by the VCO phase noise, whose magnitude is usually small enough that the phase shift can be approximated as linear with respect to the phase error, as indicated by the "zoomed-in" plot in Figure 2.3. The magnitude of the slope of the curve is defined as the realigning factor, and is denoted as $\beta$. The value of $\beta$ ranges from 0 to 1 and describes the strength of the realignment. As depicted in Figure 2.4, after the realignment the VCO phase is shifted by $-\beta\theta_e$ where $\theta_e$ is defined as the instantaneous phase error between the VCO and the reference just prior to the realignment.
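The effect of $\beta$ on noise memory can be illustrated with a small time-domain experiment. The sketch below is a toy model under stated assumptions, not the simulation used in this work: the VCO phase is taken to be a random walk driven by white per-cycle phase increments, the reference is noiseless, and each realignment applies an instantaneous shift of $-\beta\theta_e$. The per-cycle noise level, the number of periods, and all names are hypothetical.

```python
import random

def phase_error_before_realignment(beta, n_periods=1000, cycles_per_period=200,
                                   sigma_cycle=1e-3, seed=1):
    """Return the phase error theta_e sampled just before each realignment."""
    rng = random.Random(seed)
    phase = 0.0
    samples = []
    for _ in range(n_periods):
        for _ in range(cycles_per_period):
            phase += rng.gauss(0.0, sigma_cycle)   # errors accumulate cycle by cycle
        samples.append(phase)                      # theta_e (noiseless reference)
        phase -= beta * phase                      # realignment shifts phase by -beta*theta_e
    return samples

def mean_square(values):
    return sum(v * v for v in values) / len(values)

ms_free = mean_square(phase_error_before_realignment(0.0))   # no realignment
ms_half = mean_square(phase_error_before_realignment(0.5))   # partial realignment
ms_full = mean_square(phase_error_before_realignment(1.0))   # complete realignment
print(ms_free, ms_half, ms_full)
```

With $\beta = 0$ the error wanders without bound as more periods are simulated, whereas any $\beta > 0$ keeps the mean-square error bounded because each realignment discards a fraction $\beta$ of the accumulated error.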

## III. THEORETICAL ANALYSIS

#### A. Phase Noise of the Realigned VCO

For the following analysis it is assumed that the realignment shifts the VCO phase instantaneously and the phase shift is linear with respect to the instantaneous phase error by a factor of $\beta$. As mentioned above, these assumptions are supported by circuit simulations, and, as shown in Section IV, they are also closely supported by measurement results. It is further assumed that the VCO frequency is *N* times the reference frequency, which is the case in a locked integer-*N* PLL. Thus, except for any systematic offset caused by path mismatches in the phase-frequency detector and charge pump, the instantaneous phase difference between nominally coincident VCO and reference edges arises completely from circuit noise.

Figure 2.4: Time domain waveform of the realignment: VCO phase shifts almost instantly and linearly to $\theta_e[n]$.

In the following, the reference phase error is denoted as $\theta_{ref}(t)$, the VCO phase error that would have occurred in the absence of phase realignment is denoted as $\theta_{vco}(t)$, and the "extra" phase shift caused by the phase realignment is denoted as $\varphi(t)$. Consequently, the instantaneous VCO phase error is given by

$$\theta_{inst\_vco}(t) = \theta_{vco}(t) + \varphi(t). \tag{1}$$

The phase realignment is performed at the reference edges, so the VCO phase is shifted at discrete times. As indicated in Figure 2.4, the instantaneous phase difference between the VCO and the reference just before the $n^{\text{th}}$ realignment can be represented as a sequence given by

$$\theta_{e}[n] = \theta_{inst\_vco}(nT_{r}^{-}) - N\theta_{ref}(nT_{r}^{-}), \tag{2}$$

where $T_r$ is the reference period and $t = nT_r^-$ denotes the time instant just before the $n^{\text{th}}$ reference edge. The factor N arises because the VCO frequency is N times the reference frequency and all the phase terms have the units of radians. After the $n^{\text{th}}$ phase realignment, the VCO is phase shifted by $-\beta \theta_e[n]$ based on the assumption that the phase shift is linear with respect to the phase error.

Figure 2.5: Extra phase shift due to phase realignment. (a) Shown as the sum of a series of phase steps and (b) shown as the result of passing an impulse train through a hold operation.

As mentioned above, the VCO can be modeled as a phase error integrator. Therefore each phase realignment can be modeled as the addition of a step increment to the VCO phase, so, as depicted in Figure 2.5(a), $\varphi(t)$ can be written as

$$\varphi(t) = -\beta \sum_{n=-\infty}^{\infty} \theta_e[n] \cdot u(t - nT_r). \tag{3}$$

Alternatively, as indicated in Figure 2.5(b),  $\varphi(t)$  can be viewed as the result of an impulse train  $\sum_{n=-\infty}^{\infty} \varphi_{\Delta}[n] \cdot \delta(t-nT_r)$ , passed through a "hold" operation, where

$$\varphi_{\Delta}[n] - \varphi_{\Delta}[n-1] \equiv -\beta \theta_{e}[n]. \tag{4}$$

Combining (3) and (4) gives

$$\varphi(t) = \sum_{n = -\infty}^{\infty} \varphi_{\Delta}[n] \cdot h_{hold}(t - nT_r), \qquad (5)$$

where $h_{hold}(t) = u(t) - u(t - T_r)$ is the impulse response of the hold operation. Thus, (5) represents a discrete-to-continuous-time (D/C) conversion obtained by holding the value of $\varphi(t)$ equal to $\varphi_{\Delta}[n]$ throughout the time interval $nT_r \le t < (n+1)T_r$. Taking the Fourier transform of (5) yields

$$\varphi(j\omega) = T_r e^{-j\omega T_r/2} \cdot \frac{\sin(\omega T_r/2)}{\omega T_r/2} \cdot \varphi_{\Delta}(z) \Big|_{z=e^{j\omega T_r}}, \qquad (6)$$

where  $\varphi_{\Delta}(z)$  is the *z*-transform of  $\varphi_{\Delta}[n]$ .

To make use of this result, an expression for  $\varphi_{\Delta}(z)$  is required. Combining (2) and (4) gives

$$\varphi_{\Delta}[n] - \varphi_{\Delta}[n-1] = -\beta \left( \theta_{vco}[n] + \varphi_{\Delta}[n-1] - N\theta_{ref}[n] \right). \tag{7}$$

Taking the z-transform of (7) and solving for the z-transform of  $\varphi_{\Delta}[n]$  results in

$$\varphi_{\Delta}(z) = \frac{-\beta}{1 + (\beta - 1)z^{-1}} \theta_{vco}(z) + \frac{N\beta}{1 + (\beta - 1)z^{-1}} \theta_{ref}(z). \tag{8}$$
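For reference, the intermediate algebra can be written out as follows, assuming zero initial conditions so that the delayed term $\varphi_{\Delta}[n-1]$ transforms to $z^{-1}\varphi_{\Delta}(z)$. Taking the z-transform of both sides of (7) gives

$$\left(1 - z^{-1}\right)\varphi_{\Delta}(z) = -\beta\left(\theta_{vco}(z) + z^{-1}\varphi_{\Delta}(z) - N\theta_{ref}(z)\right),$$

and collecting the $\varphi_{\Delta}(z)$ terms on the left-hand side yields

$$\left[1 + (\beta - 1)z^{-1}\right]\varphi_{\Delta}(z) = -\beta\,\theta_{vco}(z) + N\beta\,\theta_{ref}(z),$$

which is (8) after division by the bracketed factor. The single pole lies at $z = 1 - \beta$, which is inside the unit circle for $0 < \beta \le 1$, so the contribution of any one realignment error decays geometrically over subsequent reference periods.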

Combining (1), (6), and (8) results in<sup> $\dagger$ </sup>

$$\theta_{inst\_vco}(j\omega) = \theta_{vco}(j\omega)H_{rl}(j\omega) + \theta_{ref}(j\omega)H_{up}(j\omega), \tag{9}$$

where

$$H_{rl}(j\omega) = 1 - \frac{\beta}{1 + (\beta - 1)e^{-j\omega T_r}} e^{-j\omega T_r/2} \frac{\sin(\omega T_r/2)}{\omega T_r/2}, \tag{10}$$

<sup>†</sup> The sampling of $\theta_{vco}(t)$ and $\theta_{ref}(t)$ inevitably causes aliasing as neither signal is band-limited. However, the aliasing effect is neglected when converting the discrete time results to continuous time because the loop bandwidth of the PLL is small compared with the reference frequency. This approximation is consistent with the assumptions underlying the conventional PLL analysis [4].



Figure 2.6: (a) Commonly used phase noise model of a conventional integer-N PLL. (b) Modified version of the phase noise model describing the RPLL.

and

$$H_{up}(j\omega) = \frac{N\beta}{1 + (\beta - 1)e^{-j\omega T_r}} e^{-j\omega T_r/2} \frac{\sin(\omega T_r/2)}{\omega T_r/2}. \tag{11}$$

The transfer function,  $H_{rl}(j\omega)$ , represents the effect of the phase realignment and the transfer function,  $H_{up}(j\omega)$ , represents the up-conversion of the reference noise to the VCO output.
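The behavior of (10) and (11) can be explored numerically. The following sketch evaluates both transfer functions directly from their definitions. The reference period and divider value are those of the 19.68 MHz crystal case (480 kHz reference, N = 200); the function names and the chosen evaluation frequencies are illustrative assumptions.

```python
import cmath
import math

def hold_response(w, t_r):
    """exp(-j*w*Tr/2) * sin(w*Tr/2)/(w*Tr/2), the hold factor in (10) and (11)."""
    x = 0.5 * w * t_r
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return cmath.exp(-1j * x) * sinc

def h_rl(w, beta, t_r):
    """Transfer function applied to the free-running VCO phase noise, per (10)."""
    if beta == 0.0:
        return complex(1.0)          # no realignment; also avoids 0/0 at w = 0
    denom = 1.0 + (beta - 1.0) * cmath.exp(-1j * w * t_r)
    return 1.0 - beta / denom * hold_response(w, t_r)

def h_up(w, beta, t_r, n):
    """Transfer function from reference phase noise to the VCO output, per (11)."""
    if beta == 0.0:
        return complex(0.0)          # no realignment path from the reference
    denom = 1.0 + (beta - 1.0) * cmath.exp(-1j * w * t_r)
    return n * beta / denom * hold_response(w, t_r)

T_R = 1.0 / 480e3    # reference period, s
N_DIV = 200          # VCO frequency / reference frequency

for f_hz in (1e3, 10e3, 100e3):
    w = 2 * math.pi * f_hz
    row = [20 * math.log10(abs(h_rl(w, b, T_R))) for b in (0.0, 0.5, 1.0)]
    print(f"{f_hz:>8.0f} Hz  |H_rl| in dB for beta = 0, 0.5, 1:", row)
```

For $\omega T_r \ll 1$ and $\beta > 0$, expanding (10) to first order gives $H_{rl}(j\omega) \approx j\omega T_r (2-\beta)/(2\beta)$, so the realigned VCO response rises in proportion to frequency and offsets the $1/f^2$ accumulation described in Section II, while $H_{up}$ approaches $N$.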

#### B. Linearized Phase Noise Model for the RPLL

Without the realignment technique, the PLL phase noise, $\theta_{out}(s)$, is described by the well known linearized model in the s-domain as shown in Figure 2.6(a), where $K_{chp}$ and $K_{vco}$ are the charge pump and VCO gains, respectively, and $H_{lp}(s)$ is the transfer function of the PLL loop filter [4]. The reference phase noise, $\theta_{ref}(s)$, divider phase noise, $\theta_{div}(s)$, and the charge pump phase noise, $\theta_{chp}(s)$, are all considered to be *input noise* and the PLL applies the same transfer function to each of them.

The results derived above can be applied to this model to obtain a model of the RPLL without violating any of the assumptions underlying the original PLL model. Specifically, (9), (10), and (11) can be applied to replace the phase noise from the VCO without realignment, $\theta_{vco}(s)$, with that of the realigned VCO, $\theta_{inst\_vco}(s)$. The resulting model, shown in Figure 2.6(b), describes the output phase noise from the RPLL. Unlike in the conventional PLL, the transfer function for the reference noise differs from that of the other input noise sources because of the extra signal path from the reference to the VCO.

Figure 2.7 shows transfer functions derived from the linearized RPLL model for $\beta$ values of 0, 0.5 and 1, which correspond to no phase realignment (i.e., a conventional PLL), partial phase realignment, and complete phase realignment, respectively. In the signal flow diagram of the linearized RPLL model, there is only one feedback loop. The phase and magnitude squared responses of the loop gain are shown in Figure 2.7(a). As indicated in the figure, stronger realignment increases the phase margin of the loop gain. Figure 2.7(b) shows the magnitude squared of the transfer function applied to the VCO phase noise. As indicated in the figure, the stopband of the transfer function is effectively widened and therefore the VCO phase noise is attenuated as the strength of the realignment is increased. Figure 2.7(c) shows the magnitude squared of the transfer function applied to both the divider phase noise and the charge pump phase noise. As shown in the figure, the RPLL provides significantly more attenuation of these noise sources than the conventional PLL. The transfer function applied to the reference phase noise is shown in Figure 2.7(d). In contrast to the other phase noise sources, there is less attenuation of the reference phase noise as $\beta$ is increased because of the second path through which reference noise power is coupled into the VCO at each realignment.

Figure 2.7: Transfer functions of the RPLL with different $\beta$ values. (a) Realignment increases phase margin in the loop transfer function. (b) Realignment extends the VCO noise stop band. (c) Realignment attenuates more input noise. (d) Realignment has less filtering of the reference noise.



Figure 2.8: Effect of $\beta$ on the RPLL phase noise.

The reduced attenuation of the reference phase noise in the RPLL relative to a conventional PLL would have been a drawback if the reference had been too noisy. However, in this particular test chip, the reference signal is very clean because it is derived from a buffered version of an off-chip crystal. Within the frequency band of interest, the reference phase noise is dominated by the buffer noise, which is estimated to be below –140 dBc/Hz when running at 19.68 MHz. Therefore the reference phase noise contribution to the overall phase noise of the RPLL is negligible compared to those of the other noise sources in the RPLL.

To further illustrate the effect of $\beta$ on the VCO phase noise, Figure 2.8 shows the predicted RPLL output phase noise with all noise sources except the VCO set to zero and with realigning factors of $\beta = 0$, $\beta = 0.5$ and $\beta = 1$. The curves were obtained by multiplying the theoretical transfer function squared magnitude curves shown in Figure 2.7(b) by the VCO phase noise PSD curve estimated from circuit simulation. As shown in the figure, attenuation of the in-band phase noise increases with $\beta$. However, the VCO phase noise power at higher frequencies increases slightly as $\beta$ increases.

Figure 2.9: (a) A DLL based clock multiplier. (b) A variant of DLL using a gated ring VCO as the VCDL.

#### C. Extension of the RPLL Model to DLLs

As mentioned in the introduction, the RPLL is similar to a DLL with respect to the way it suppresses VCO noise memory. A typical DLL differs from a typical PLL in that it uses a voltage controlled delay line (VCDL) in place of a VCO and a first-order loop filter in place of a second-order loop filter. Figure 2.9(a) shows a simplified block diagram of a typical DLL [8]. Each positive-going reference edge triggers an *N*-tap VCDL whose individual outputs are combined to form the DLL output. The phase detector, charge pump and the first-order loop filter adjust the VCDL such that each delay within the VCDL is equal to the reference period divided by *N*. Thus, when the DLL is locked, its output frequency is *N* times its reference frequency. A variant of this architecture is shown in Figure 2.9(b) where the VCDL has been replaced by a ring VCO which can be gated on and off as shown in [9], [10] and [11]. As in the previous DLL, each reference clock edge triggers the next *N* output clock edges, and the last output clock edge from the previous *N* cycles is discarded. Therefore, in both DLLs noise-induced phase fluctuations are only accumulated *N* times. Consequently, both systems can be modeled as RPLLs for the special case $\beta = 1$.

However, in a DLL, the VCDL is characterized by the delay gain  $K_{vcdl}$  (second/Volt), while in a PLL the VCO is characterized by the VCO gain  $K_{vco}$  (Hz/Volt). The connection between  $K_{vcdl}$  and  $K_{vco}$  is illustrated as follows using the DLL example shown in Figure 2.9(b). When the DLL is locked, the total delay within a reference period is  $N \cdot T_{vco}$ . The delay gain  $K_{vcdl}$  is defined as the derivative of  $N \cdot T_{vco}$  with respect to the control voltage. As a result,  $K_{vcdl}$  is related to  $K_{vco}$  as

$$K_{vcdl} = \frac{d}{dv_{ctrl}} \left\{ \frac{N}{f_{vco}(v_{ctrl})} \right\} = -\frac{N}{f_{vco}^{2}(v_{ctrl})} K_{vco} \Big|_{v_{ctrl} = v_{ctrl0}}, \tag{12}$$

where $v_{ctrl0}$ is the control voltage such that $N \cdot T_{vco}$ equals one reference period.
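As a numerical illustration of (12), the sketch below converts a VCO gain into the equivalent delay gain and cross-checks the result against a finite-difference derivative of $N/f_{vco}(v_{ctrl})$. The values N = 200, 96 MHz, and 136 MHz/V are borrowed from the prototype RPLL purely for illustration (the prototype is not a DLL), and the linear tuning curve and the 1.5 V operating point are assumptions made only for the cross-check.

```python
def k_vcdl(n, f_vco_hz, k_vco_hz_per_v):
    """Delay gain in s/V obtained from the VCO gain in Hz/V, per (12)."""
    return -n * k_vco_hz_per_v / f_vco_hz ** 2

N_DIV = 200
F_VCO_HZ = 96e6
K_VCO = 136e6            # Hz/V
V_CTRL0 = 1.5            # hypothetical lock-point control voltage, V

def f_vco(v_ctrl):
    """Assumed linear tuning curve around the lock point (illustration only)."""
    return F_VCO_HZ + K_VCO * (v_ctrl - V_CTRL0)

def total_delay(v_ctrl):
    """Total delay of N oscillator periods, N * T_vco."""
    return N_DIV / f_vco(v_ctrl)

analytic = k_vcdl(N_DIV, F_VCO_HZ, K_VCO)
dv = 1e-6
numeric = (total_delay(V_CTRL0 + dv) - total_delay(V_CTRL0 - dv)) / (2 * dv)
print(f"K_vcdl from (12): {analytic:.4e} s/V, finite difference: {numeric:.4e} s/V")
```

The negative sign reflects the fact that raising the control voltage raises the oscillation frequency and therefore shortens the total delay.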

Therefore, the previous analysis for the RPLL can serve as a universal model for DLLs and RPLLs. As demonstrated by the results shown in Figure 2.7(a), the realignment increases the loop gain's phase margin, which allows the DLL to maintain stability with a first order loop filter.



Figure 2.10: Simplified circuit diagram of the realigned ring oscillator in prototype.

## IV. IMPLEMENTATION AND MEASUREMENT DETAILS

#### A. Circuit Implementation and Fabrication Issues

The prototype IC was fabricated in a 0.35  $\mu$ m BiCMOS SOI process, although only CMOS components were used to facilitate later migration to a CMOS process. The process incorporates a low-resistivity buried epi layer, that, in the case of the prototype, forms a single island over which the entire circuit lies. As a result of parasitic capacitances associated with the buried epi layer, the electrical characteristics of the individual circuit components are comparable to those of a 0.6  $\mu$ m standard CMOS process.

The details of the VCO and realignment circuitry represent the primary differences between the RPLL implementation and that of a conventional PLL. A simplified circuit diagram of the implemented VCO is shown in Figure 2.10. It is a seven-stage ring oscillator in which four of the inverters are used to control the frequency and three introduce a fixed delay. This separation was made to minimize charge injection from the realignment switches into the sensitive VCO control node. Although not shown in the figure, the supply voltage on the fixed inverters is level shifted to 1.5 volts so as to approximately match the drain voltage of the VCO frequency control transistor. The pseudo-differential topology was used to achieve high common mode noise rejection.

Figure 2.11: (a) Simplified block diagram of realignment control logic. (b) Simplified timing diagram of the realignment control logic block.

The control logic block generates a voltage pulse signal, *en\_rlgn*, which creates a time window surrounding each instant at which a buffered reference edge and a VCO edge ideally coincide. The control logic block is enabled only when the PLL is locked in which case the VCO internal edge is close to the buffered reference edge. Figure 2.11(a) shows a simplified diagram of the control logic block where the logic core is just a D Flip-flop with reset. Figure 2.11(b) shows a detailed timing diagram associated with the generation of *en\_rlgn*. Three internal VCO delay cell outputs, denoted $P_1$, $P_2$, and $P_3$, facilitate the generation of *en\_rlgn*. The $P_1$ signal is used to drive the divider, the $P_3$ signal is realigned to the reference, and the $P_2$ signal slightly precedes $P_3$. The $N^{\text{th}}$ rising edge of $P_1$ triggers the rising edge of the divider output. Then the next rising edge of $P_2$ samples the divider output. Since $P_2$ precedes $P_3$, it ensures that *en\_rlgn* rises before $P_3$. Once the reference edge finishes rising, the phase realignment is complete, so the rising edge of the reference is used to trigger the falling edge of the *en\_rlgn* signal, thereby disconnecting the buffered reference from the VCO.

Conventional techniques were used for most of the other circuit blocks in the system. The charge pump is similar in topology to that used in [12] with an op-amp feedback circuit to facilitate matching of the up and down currents. The phase-frequency detector is a dual D Flip-flop structure with a delay inserted in the reset path to eliminate the dead-zone. The divider is asynchronous with a D Flip-flop to resynchronize the output to the VCO signal so as to reduce the noise contribution from the divider circuitry.

#### B. Calculated and Measured Phase Noise PSD Plots

Figure 2.12 shows representative measured and calculated PSD plots of the PLL phase noise with the realignment technique enabled and disabled for the case of a 19.68 MHz crystal. Similar results were observed for the other crystal frequencies. The case shown corresponds to a reference frequency of 19.68 MHz $\div$ 41 = 480 kHz, N = 200, a charge pump current of 1 mA, a VCO gain of 136 MHz/V, and a loop filter consisting of a 6.8 nF capacitor in parallel with a series combination of a 330 $\Omega$ resistor and a 68 nF capacitor. The resulting PLL bandwidth is 32 kHz. As can be seen from the figure, the realignment technique causes a significant reduction in the power of the phase noise as expected, and the calculated PSD plots are in close agreement with the measured PSD plots.

Figure 2.12: Measured and calculated phase noise PSD plots for the PLL with realignment enabled and disabled.
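As a rough sanity check on these loop parameters, the sketch below evaluates the open-loop gain of a textbook continuous-time charge-pump PLL model with realignment disabled ($\beta = 0$) and locates its unity-gain crossover. The model, including the charge pump gain convention $I_{cp}/2\pi$, is an assumption about the conventions behind Figure 2.6(a) rather than something stated in the text, so the crossover it reports is only an estimate and need not coincide exactly with the quoted bandwidth.

```python
import cmath
import math

I_CP = 1e-3        # charge pump current, A
K_VCO = 136e6      # VCO gain, Hz/V
N_DIV = 200        # feedback divider
R = 330.0          # loop filter resistor, ohm
C_SER = 68e-9      # capacitor in series with R, F
C_PAR = 6.8e-9     # capacitor in parallel with the R-C branch, F

def filter_impedance(s):
    """Impedance of C_PAR in parallel with the series combination of R and C_SER."""
    c_total = C_SER + C_PAR
    c_eq = C_SER * C_PAR / c_total
    return (1 + s * R * C_SER) / (s * c_total * (1 + s * R * c_eq))

def loop_gain(f_hz):
    """Open-loop gain: (I_cp / 2pi) * Z(s) * (2pi * K_vco / s) / N."""
    s = 2j * math.pi * f_hz
    return (I_CP / (2 * math.pi)) * filter_impedance(s) * (2 * math.pi * K_VCO / s) / N_DIV

def crossover_frequency(f_lo=1e3, f_hi=1e6):
    """Bisection on a log-frequency axis; the gain magnitude decreases with frequency."""
    for _ in range(60):
        f_mid = math.sqrt(f_lo * f_hi)
        if abs(loop_gain(f_mid)) > 1.0:
            f_lo = f_mid
        else:
            f_hi = f_mid
    return math.sqrt(f_lo * f_hi)

f_c = crossover_frequency()
phase_margin_deg = 180.0 + math.degrees(cmath.phase(loop_gain(f_c)))
print(f"unity-gain crossover = {f_c / 1e3:.1f} kHz, phase margin = {phase_margin_deg:.1f} deg")
```

Under these assumptions the crossover falls within a few kilohertz of the 32 kHz bandwidth quoted above.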

The calculated PSD plots were obtained by applying the transfer functions deduced from the theoretical model to measured PSD plots of the reference phase noise and free running VCO phase noise. The free running VCO phase noise PSD was determined by narrowing the PLL bandwidth and measuring the resulting PLL phase noise PSD, because beyond the PLL bandwidth the phase noise is dominated by the VCO phase noise. Conversely, the reference phase noise PSD was determined by measuring the PLL phase noise PSD with a very wide PLL bandwidth. In applying the theoretical model, a value of $\beta = 0.52$ was used. This is the value obtained from transistor level simulations using the nominal process corner simulation models at room temperature. Although the value of $\beta$ varies across process corners and with temperature, simulations indicate the variation in $\beta$ only translates into about a ±1.5 dB variation in the in-band spot phase noise.

#### C. The Realigning Factor Revisited

The value of $\beta$ deserves further explanation as it is crucial in the VCO phase noise suppression. Observed in "isolation", the realigned clock edge is driven by both the VCO delay cell and the reference buffer. Since it is desirable to completely correct the VCO phase error, the reference buffer was designed to be five times larger than the VCO delay cell in the prototype chip. Consequently, the buffer overrides the VCO and the realigned edge is dragged to be almost in phase with the buffered reference. Therefore, intuitively, one would expect $\beta$ to be close to 1 in the prototype, and, indeed, this would have been the case if the delay cells in the VCO behaved as independent edge-triggered inverters. However, in practice, the perturbation from the realignment stage inevitably affects the remaining delay cells through the shared tail current source, and the value of $\beta$ is reduced by the cross coupling between the delay cells. Circuit simulations indicate that $\beta$ can be increased to 0.9 if better tail current source isolation is used to reduce the coupling between the delay cells. Unfortunately, this phenomenon and the attendant reduction in $\beta$ were discovered during the testing phase of the prototype chip, so no experimental results are available for verification of the assertion that $\beta$ can be increased through improved delay cell isolation. Interestingly, the DLL, which ideally has a $\beta$ of 1, is not immune to the realignment perturbation problem. Even in a DLL, care must be taken to isolate the delay cells. Circuit simulations indicate that a carefully designed DLL with a similar topology to that shown in Figure 2.9(b) and VCO inverters as shown in Figure 2.10 gives only $\beta = 0.5$. Evidently, in applications where a high $\beta$ value is desired, great care must be taken to properly isolate the delay cells.
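To give a feel for why the value of $\beta$ matters, the sketch below uses a deliberately simplified discrete-time toy model, which is an assumption made here for illustration and not the linearized model of this chapter. The VCO is assumed to add independent Gaussian timing errors of standard deviation `sigma` on each of its cycles, and every `n_cycles` cycles the accumulated error is multiplied by $(1 - \beta)$. The PLL feedback loop and 1/f noise are ignored. Under these assumptions the error variance just before each realignment settles to $N\sigma^2 / (\beta(2 - \beta))$.

```python
import math
import random


def steady_state_variance(beta, n_cycles, sigma):
    """Toy-model variance of the phase error just before each realignment."""
    if not 0 < beta <= 1:
        raise ValueError("beta must be in (0, 1]; with beta = 0 the error never settles")
    return n_cycles * sigma ** 2 / (beta * (2 - beta))


def simulate(beta, n_cycles, sigma, n_periods=20000, seed=1):
    """Monte Carlo estimate of the same variance under the same assumptions."""
    rng = random.Random(seed)
    # n_cycles independent Gaussian steps are lumped into one Gaussian draw.
    step_sigma = sigma * math.sqrt(n_cycles)
    err = 0.0
    acc = 0.0
    for _ in range(n_periods):
        err += rng.gauss(0.0, step_sigma)   # jitter accumulated over one reference period
        acc += err * err                    # sample the error just before realignment
        err *= (1 - beta)                   # realignment removes a fraction beta of the error
    return acc / n_periods


for beta in (0.52, 0.9, 1.0):
    ratio = steady_state_variance(beta, 200, 1e-3) / steady_state_variance(1.0, 200, 1e-3)
    print(f"beta = {beta:4.2f}: variance relative to beta = 1 is {ratio:.3f}")
```

In this toy model, incomplete realignment leaves a residue of earlier errors that raises the steady-state variance above the $\beta = 1$ value. The quantitative noise predictions in this chapter come from the full linearized model, not from this sketch.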

#### D. The Reference Spur Issue

In a conventional PLL, there is only one phase comparison between the VCO phase and the reference phase, and this occurs at the phase-frequency detector. In the RPLL, however, an extra phase comparison is performed at the phase realigning point, so there are two signal paths from the reference to the VCO output. In practice, mismatches between the two reference signal paths increase the power of the reference spur in the PLL output. In the prototype chip, the measured reference spur power is -34 dBc with the realignment technique enabled, and -78 dBc with the realignment technique disabled. While not a major concern in the application for which the prototype was designed, the increased reference spur power may be a drawback in other applications. In anticipation of the possible future inclusion of an automatic path calibration loop, a variable delay element has been included in the prototype prior to the divider to allow for such calibration as shown in Figure 2.13. With this feature



Figure 2.13: Reference spur reduction with delay mismatch cancellation.

enabled and manually calibrated, the measured spur power is -71 dBc and static, which suggests that an automatic calibration loop, if necessary, would be effective. In such a control loop, the delay mismatch of the two signal paths would be detected by a phase detector with the resulting phase error lowpass filtered and fed to the voltage controlled delay cell to compensate for the mismatch. The variable delay cell introduces a nominal delay of 3 ns to the loop, which is comparable to the frequency divider delay and negligible compared with the reference period. Therefore the extra delay has negligible effect on the loop stability.

The spurs visible in Figure 2.12 that are not harmonics of the reference were caused by interference from external equipment. The evidence for this conclusion is that the power levels and frequencies of the spurs changed over time.

#### E. Performance summary

The measured performance of the prototype IC is summarized in Table 2.1. The realignment technique reduces the peak in-band phase noise by 10 dB, and



Figure 2.14: Die photo.

reduces the phase noise integrated from 1 to 50 kHz by 8.3 dB. With the loop bandwidth increased to a less conservative 64 kHz, which is near the maximum value prior to instability under nominal process conditions at room temperature, the integrated phase noise improvement changes from 8.3 dB to 5.3 dB. The total measured power dissipation of the IC is 6.8 mW from a 3 V supply with no observable difference when the realignment technique is disabled. A die photograph of the fabricated circuit indicating the major functional blocks is shown in Figure 2.14.

# V. CONCLUSION

A VCO realignment technique applied to a conventional integer-*N* PLL to reduce the phase noise contribution from the VCO has been proposed. A theoretical


| Parameter                                                   | Value                                                                          |
|-------------------------------------------------------------|--------------------------------------------------------------------------------|
| Technology                                                  | 0.35 µm BiCMOS SOI<br>(with only CMOS components used)                         |
| Supply voltage                                              | 2.7-3.3 V (measurements at 3 V)                                                |
| Total power consumption                                     | 6.8 mW<br>(no observable difference when realignment enabled)                  |
| Die size                                                    | 1.8 mm<sup>2</sup>                                                             |
| Realigned PLL core area                                     | 0.22 mm<sup>2</sup><br>Conventional PLL: 73%<br>Realignment circuitry: 27%     |
| Spot noise @ 20 kHz offset<br>(32 kHz PLL bandwidth)        | -92.5 dBc/Hz (realignment disabled)<br>-102.5 dBc/Hz (realignment enabled)     |
| Reduction of integrated noise<br>power from 1 kHz to 50 kHz | 8.3 dB w/ 32 kHz PLL bandwidth<br>5.3 dB w/ 64 kHz PLL bandwidth               |

Table 2.1: Performance summary.

model for the new PLL that builds upon the well-known conventional PLL linearized model has been developed. Implementation details of and measured results from a CMOS reference PLL with a ring VCO capable of converting most of the popular crystal reference frequencies to a 96 MHz RF PLL reference and baseband clock for a direct conversion Bluetooth wireless LAN have been presented and used to demonstrate the technique. The measured results are in close agreement with those predicted by the theoretical model, and demonstrate that the realignment technique results in a significant phase noise reduction.

#### REFERENCES

- 1. Silicon Wave SiW1502 Bluetooth Radio Modem IC Data Sheet, November 3, 2000.
- 2. M. H. Perrott, T. L. Tewksbury III, C. G. Sodini, "A 27-mW CMOS fractional-N synthesizer using digital compensation for 2.5-Mb/s GFSK modulation," IEEE J. Solid-State Circuits, vol. 32, pp. 2048-2060, Dec. 1997.
- 3. S. Willingham, M. Perrott, B. Setterberg, A. Grzegorek, B. McFarland, "An integrated 2.5GHz ΣΔ frequency synthesizer with 5 µs settling and 2Mb/s closed loop modulation," ISSCC Digest of Technical Papers, pp. 200-201, Feb. 2000.
- 4. F. Gardner, "Charge-pump phase-lock loops," IEEE Trans. Comm., vol. COM-28, no. 11, pp. 1849-1858, Nov. 1980.
- 5. N. Siripon, et al., "Injection-locked balanced oscillator-doubler," Electronics Letters, vol. 37, pp. 958-959, Jul. 2001.
- 6. F. Badets, et al., "A Fully Integrated 3V 2.3GHz Synchronous Oscillator For WLAN Applications," Proceedings of the 1999 Bipolar/BiCMOS Circuits and Technology Meeting, pp. 145-148, Sep. 1999.
- 7. A. Hajimiri and T. H. Lee, "A General Theory of Phase Noise in Electrical Oscillators," IEEE J. Solid-State Circuits, vol. 33, pp. 179-194, Feb. 1998.
- 8. G. Chien and P. R. Gray, "A 900-MHz Local Oscillator Using a DLL-Based Frequency Multiplier Technique for PCS Applications," ISSCC Digest of Technical Papers, pp. 202-203, Feb. 2000.
- 9. A. Waizman, "A Delay Line Loop for Frequency Synthesis of De-skewed Clock," ISSCC Digest of Technical Papers, pp. 298-299, Feb. 1994.
- 10. J. G. Maneatis, "Low-Jitter process-independent DLL and PLL based on self-biased techniques," IEEE J. Solid-State Circuits, vol. 31, pp. 1723-1732, Nov. 1996.
- 11. R. Farjad-Rad, et al., "A 0.2-2GHz 12mW Multiplying DLL for Low-Jitter Clock Synthesis in Highly-Integrated Data Communication Chips," ISSCC Digest of Technical Papers, pp. 76-77, Feb. 2002.
- 12. J. Lee, et al., "Charge Pump with Perfect Current Matching Characteristics in Phase-Locked Loops," Electronics Letters, vol. 36, pp. 1907-1908, Nov. 2000.

# Techniques for In-band Phase Noise Suppression in Re-circulating DLLs

Sheng Ye, Lars Jansson and Ian Galton

Abstract — This paper presents a re-circulating delay-locked loop (DLL) with various innovations to improve in-band phase noise suppression. The voltage-controlled oscillator (VCO) and bias circuitry incorporate circuit-level techniques that reduce 1/f noise through switched biasing. The phase realignment theory presented in [1] is applied to optimize the VCO so as to maximize the phase noise suppression, which is achieved by periodically switching in a clean reference pulse to reset the VCO phase noise memory, and it is further applied to optimize the loop filter. Theoretical predictions are verified through a 100 MHz prototype IC fabricated in a 0.18  $\mu$ m CMOS process.

# I. INTRODUCTION

Frequency synthesizers are critical building blocks in communication systems. Ring oscillator based VCOs are widely used in low-performance frequency synthesizer applications such as clock multiplication for digital systems because of their simplicity, wide tuning range, and ease of integration. However, they are less commonly used in communication applications wherein low phase noise is required, because they tend to introduce excessive close-in phase noise. In a conventional phase-locked loop (PLL) frequency synthesizer, the loop bandwidth is usually constrained to less than 5% of the reference frequency to maintain stability across process and temperature extremes, so the PLL only provides limited attenuation of the VCO phase noise.

To overcome this conventional barrier, a clean reference pulse can be injected periodically into the VCO so as to reset the phase error and thereby suppress the noise memory caused by the jitter accumulation effect in the VCO. The result is significant attenuation of the in-band phase noise. This noise reduction technique, referred to as phase realignment in the remainder of the paper, can be implemented in various ways. In a DLL based clock multiplier the delay of a voltage-controlled delay line (VCDL) is locked to the reference period [2]. Multiple delayed versions of the reference clock from the VCDL are then combined to produce the final clock signal as shown in Figure 3.1(a). Since the delayed reference is discarded at the end of the VCDL, the phase realignment is performed naturally, by design. Another type of DLL, referred to as a *re-circulating DLL* in the remainder of the paper, is implemented using a ring VCO with the phase realignment performed by periodically opening the ring to discard the noisy VCO edge and replace it with a clean reference edge [3] as shown in Figure 3.1(b). Alternatively, in a *realigned PLL* (RPLL) [1] shown in Figure 3.1(c), the phase realignment is performed by directly coupling a strongly buffered version of a clean reference source into one of the inverters in the VCO. The strong buffer overpowers the VCO inverter so that the VCO phase is "dragged" toward the correct position.

When the phase realignment occurs, it is intuitively appealing to expect that the



Figure 3.1: Frequency synthesizers with phase realignment capability: (a) VCDL based DLL. (b) Re-circulating DLL. (c) Phase realigned PLL.

memory of the noise is completely removed using the DLL approach. However, as demonstrated in [1], in both the re-circulating DLL and the RPLL, parasitic coupling among the delay cells in the VCO tends to degrade the phase realignment and leads to incomplete noise memory suppression. Therefore the choice of VCO topology is critical in effectively implementing the phase realignment technique.

At the device level, 1/f noise tends to severely degrade the in-band phase noise performance of the VCO. In analogy to the phase realignment technique, 1/f noise can be suppressed by periodically switching the associated transistors on and off so as to suppress the long-term noise correlation, as reported in [4], [5] and [6].

This paper presents a re-circulating DLL with a novel ring VCO topology. In Section II, the phase realignment theory from [1] is applied to provide design guidelines and optimize the loop parameters. In Section III, the underlying physics of the switched biasing technique is reviewed and compared with the phase realignment. Section IV shows the implementation details of the DLL, in which the VCO design is optimized for the phase realignment technique and also exploits the switched biasing technique to suppress 1/*f* noise. Section V presents the measurement results, where the fabricated IC can be configured as either a conventional PLL or a re-circulating DLL. Good agreement between theory and measurements is observed.

# II. APPLICATION OF THE RPLL THEORY TO RE-CIRCULATING DLLS

#### A. Brief Review of the RPLL Theory

The RPLL theory presented in [1] characterizes the phase noise performance of both DLLs and RPLLs. It is applied to the analysis of re-circulating DLLs in this paper. For completeness, the highlights of this model are briefly reviewed below.



Figure 3.2: Linearized model for RPLL and DLL.

The model is based on the physical observations that the phase of the VCO is shifted almost instantaneously when the phase realignment occurs and that the phase shift is nearly linear with respect to the instantaneous phase difference between the VCO and the reference. The ratio between the phase shift and the phase difference just before realignment is defined as $\beta$ and referred to as the *realigning factor*. The value of $\beta$ ranges from 0 to 1 and is a measure of how completely the phase error is suppressed. Figure 3.2 shows the linearized model for RPLL noise analysis in the *s* domain, where $K_{chp}$ and $K_{vco}$ are the charge pump and VCO gains, respectively, *N* is the divider ratio and $H_{lp}(s)$ is the transfer function of the loop filter. The effect of the phase realignment is shown in the two transfer functions $H_{up}(s)$ and $H_{rl}(s)$ marked by the gray block in the figure, where

$$H_{rl}(j\omega) = 1 - \frac{\beta}{1 + (\beta - 1)e^{-j\omega T_r}} e^{-j\omega T_r/2} \frac{\sin(\omega T_r/2)}{\omega T_r/2},$$

and

$$H_{up}(j\omega) = \frac{N\beta}{1 + (\beta - 1)e^{-j\omega T_r}} e^{-j\omega T_r/2} \frac{\sin(\omega T_r/2)}{\omega T_r/2}.$$

When there is no phase realignment and therefore $\beta = 0$, the model in Figure 3.2 is reduced to the well-known linearized model for a conventional PLL. As presented in [1], the attenuation of the VCO phase noise as well as the charge pump and divider noise increases as $\beta$ increases. Meanwhile, the reference noise attenuation is reduced as $\beta$ increases. Therefore a large $\beta$ is desirable when a noisy VCO is used along with a clean reference.
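The two realignment transfer functions can be evaluated numerically to see the limiting behavior described above. The following is a minimal sketch that implements only $H_{rl}(j\omega)$ and $H_{up}(j\omega)$ as written, not the full closed-loop model of Figure 3.2. The function names are illustrative, and the example values $f_{ref} = 4$ MHz and $N = 25$ are taken from the prototype described later in this chapter.

```python
import cmath
import math


def _realign_term(f, t_ref, beta):
    """beta / (1 + (beta - 1) e^{-j w T_r}) * e^{-j w T_r / 2} * sin(w T_r / 2) / (w T_r / 2)."""
    if beta == 0:
        return 0j                               # no realignment: the term vanishes
    x = math.pi * f * t_ref                     # x = w * T_r / 2
    sinc = 1.0 if x == 0 else math.sin(x) / x
    denom = 1 + (beta - 1) * cmath.exp(-2j * x)
    return beta / denom * cmath.exp(-1j * x) * sinc


def h_rl(f, t_ref, beta):
    """H_rl(j w) at frequency f (Hz) for reference period t_ref (s)."""
    return 1 - _realign_term(f, t_ref, beta)


def h_up(f, t_ref, beta, n):
    """H_up(j w) at frequency f (Hz) for divider ratio n."""
    return n * _realign_term(f, t_ref, beta)


T_REF = 1 / 4e6     # reference period for f_ref = 4 MHz
N_DIV = 25          # divider ratio

for beta in (0.0, 0.52, 0.9, 1.0):
    mag = abs(h_rl(10e3, T_REF, beta))
    print(f"beta = {beta:4.2f}: |H_rl| at 10 kHz = {20 * math.log10(mag):7.2f} dB")
```

With $\beta = 0$ the sketch returns $H_{rl} = 1$ and $H_{up} = 0$. For $\beta > 0$ and $\omega T_r \ll 1$, expanding the expression to first order gives $|H_{rl}| \approx (\omega T_r/2)(2 - \beta)/\beta$, so the magnitude of $H_{rl}$ at low frequencies shrinks as $\beta$ increases.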

#### B. Application of the RPLL Theory to Re-circulating DLLs

Conventional frequency-domain analysis of DLLs mainly focuses on the stability and settling of the loop [7], [8]. On the other hand, quantitative analysis of conventional DLLs is usually done in the time domain as shown in [9] and [10], because DLLs are used mainly in digital applications where jitter performance is of interest. However, correlated noise is usually neglected in the time domain analysis [9], [10] for simplicity, so the effect of the 1/*f* noise is not covered. Recent findings on jitter transfer characteristics [11] mainly focus on the jitter peaking from the reference to the DLL output, so the interaction between the VCO noise and the DLL loop dynamics is not covered. Although jitter is a popular figure of merit, it lacks specific information about the frequency dependence of the circuit noise, which makes it difficult to separate the contributions to the jitter from individual noise sources in the circuit using the published analyses. In contrast, phase noise analysis tends to provide more design insight for optimizing the loop parameters because of the spectral information contained in the phase noise.

To optimize the loop filter parameters in a re-circulating DLL, the RPLL theory in [1] is applied here without modification except that both first- and second-order loop filters are considered. As demonstrated below, the results indicate that when the VCO noise is dominant and a clean reference source is used, a second-order filter offers no benefit over a first-order filter, and increasing the loop filter bandwidth actually improves the attenuation of in-band VCO phase noise. Moreover, while the choice of loop filter bandwidth significantly affects the attenuation of the VCO phase noise and input noise, i.e., the charge pump and divider noise, it has little effect on the attenuation of the reference noise. Thus, although a small bandwidth can result in less jitter based on the conventional analysis [10], it does not necessarily lead to low in-band phase noise, especially when the noise is dominated by the VCO. For illustration purposes, Figure 3.3 shows the calculated noise transfer functions in a re-circulating DLL with a 1<sup>st</sup> order loop filter where the bandwidth is controlled by the loop filter capacitor. The effects of a wide and a narrow bandwidth are compared. These general observations are quantified below for the re-circulating DLL prototype.

Figure 3.3: Calculated noise transfer functions of the re-circulating DLL with a 1<sup>st</sup> order loop filter using the linearized model. The bandwidth is controlled by the loop filter capacitor.

# III. SWITCHED BIASING FOR 1/f NOISE REDUCTION

In modern sub-micron CMOS processes, 1/*f* noise is a significant noise source and very important in VCO design. It has been reported that periodically switching a transistor between "on" and "off" states greatly reduces the 1/*f* noise. This phenomenon, although on a microscopic scale, is analogous to the VCO phase noise attenuation by phase realignment.

The concentration of noise power at low frequencies indicates there is long-term correlation in the 1/f noise. The VCO close-in noise is similar in that the long-term correlation is caused by the integrating nature of the VCO. A widely accepted theory for the source of 1/f noise in MOSFETs is the trapping and releasing of carriers in the gate oxide. The distribution of the time constants in the trapping-releasing process is such that the noise PSD is proportional to 1/f. As shown in [4], [5], [6] and later explained theoretically in [12], interference with the noise correlation by



Figure 3.4: Periodically switching a MOSFET between "on" and "off" resets the memory of the noise such that the long-term correlation in the noise is suppressed.

periodically switching the MOSFET between "on" and "off" states results in great reduction in 1/*f* noise. As illustrated in Figure 3.4, when the transistor is turned off, the trapped carriers tend to be released, resulting in the reset of the noise source. This phenomenon is analogous to the phase realignment where the phase error is reset so that the memory of the past noise is lost.
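The statement above that a suitable distribution of trapping time constants produces a 1/f spectrum can be illustrated with a standard textbook construction, often associated with the McWhorter number-fluctuation model. This sketch is an illustrative assumption and is not taken from [4], [5], [6] or [12]. Each trap is assumed to contribute a Lorentzian spectrum $\tau / (1 + (2\pi f \tau)^2)$, and the time constants $\tau$ are assumed to be spread uniformly in $\log \tau$ over many decades.

```python
import math


def trap_psd(f, tau_min=1e-8, tau_max=1e2, per_decade=20):
    """Sum of Lorentzians whose time constants are uniformly spaced in log(tau)."""
    decades = math.log10(tau_max / tau_min)
    n = int(round(decades * per_decade)) + 1
    w = 2 * math.pi * f
    total = 0.0
    for k in range(n):
        tau = tau_min * 10 ** (k / per_decade)
        total += tau / (1 + (w * tau) ** 2)     # single-trap Lorentzian, arbitrary units
    return total


def loglog_slope(f1, f2):
    """Slope of the summed PSD between f1 and f2 on log-log axes."""
    return math.log10(trap_psd(f2) / trap_psd(f1)) / math.log10(f2 / f1)


print(f"log-log slope between 100 Hz and 10 kHz: {loglog_slope(1e2, 1e4):.3f}")
```

For frequencies well inside the range spanned by $1/(2\pi\tau_{max})$ and $1/(2\pi\tau_{min})$, the slope is close to $-1$. The slow traps with large $\tau$ dominate at low frequencies, which is the long-term correlation that switching the transistor off is reported to interrupt.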

As reported in [4], [5] and [6], the 1/f noise attenuation strongly depends on the manner in which the transistor is turned off. Using an NMOS transistor as an example, decreasing the gate-source voltage in the off state leads to more noise attenuation. As explained in



Figure 3.5: Proposed re-circulating DLL block diagram. Phase realignment is implemented with two NAND gates.

[6], when the decrease of the gate-source voltage results in a significant change in the Fermi level at the surface, thereby changing the occupation of the trapped carriers, a great reduction of 1/f noise is observed. In analogy to the effect of $\beta$, the change of the Fermi level affects how completely the trapped carriers are released. A greater change in the Fermi level is analogous to a greater $\beta$, which leads to better resetting of the noise memory and therefore more attenuation of the 1/f noise.

# IV. IMPLEMENTATION DETAILS

#### A. DLL and VCO Topology

Figure 3.5 shows the simplified block diagram of the proposed re-circulating DLL. The VCO contains two NAND gates and several voltage controlled delay cells to form the ring. This topology is reported in a time-to-digital converter design [13].




Figure 3.6: Simplified timing diagram of the phase realignment control in the prototype.

When the signals "rlgn" and "pulse\_sw" are set high, the NAND gates act as inverters and the VCO behaves as a conventional ring VCO. Phase realignment is performed in two steps. First, by setting "rlgn" and "pulse\_sw" low, the noisy VCO edge is blocked and the "ring" is temporarily opened. Then, when the clean reference edge arrives, it causes both "rlgn" and "pulse\_sw" to go high so as to start a new cycle of oscillation free of phase noise memory. Specifically, "pulse\_sw" is a slightly delayed version of "rlgn" to avoid any potential glitches. The details of the phase realignment are shown in the simplified timing diagram in Figure 3.6. The implementation of the phase realignment technique is functionally equivalent to that presented in [3], so the VCO starts from its maximum frequency during start-up for the same reason as reported in [3].

When the clean reference clock edge is switched in to replace the noisy VCO clock edge, it is desirable to match the shape of the waveform of the reference and VCO clock to minimize the disturbance to the VCO core. However, in a typical NAND gate as shown in Figure 3.7(a), the falling edge of the output waveform is





While the signal is single-ended in the two NAND gates for better waveform matching during the phase realignment, the signals in the VCO delay cells are pseudo-differential for better rejection of common mode noise. Although not shown in the figure, a single-ended to differential converter and a differential to single-ended converter are implemented at the beginning and end of the voltage controlled delay chain, respectively.

#### B. Switched Biasing in the VCO Delay Cell and Bias Circuitry

To minimize the 1/f noise, switched biasing is used in both the voltage controlled delay cell and its biasing circuitry. Because the loop filter is single-ended,




Figure 3.8: (a) Typical complementary bias generation. (b) A straightforward approach to implement switched biasing for 1/f noise reduction. (c) Illustration of the detailed operation of the bias: an equivalent parasitic resistor is introduced between the loop filter and the power supply.

the bias circuitry must generate the control voltages for both the PMOS and NMOS transistors in the delay cell to ensure symmetry of the waveform, which is critical for 1/f noise reduction [14]. Figure 3.8(a) shows a simple implementation of the bias. Despite its simplicity, the 1/f noise generated from the bias can significantly degrade the VCO phase noise as shown in [15]. Conventional approaches for 1/f noise reduction are to increase the size of the device or to filter the control voltage. Neither approach is area-efficient and filtering potentially affects the loop dynamics.

A straightforward implementation of switched biasing is shown in Figure 3.8(b). It consists of two identical bias branches that are used alternately with only one on at any time. As shown in Figure 3.8(c), the branch not in use is switched off by setting



Figure 3.9: Proposed bias generation with improved 1/f noise reduction. (a) Simplified schematic of the bias. (b) Detailed illustration of the bias operation.

the gate-source voltage to 0. As a result, this approach has almost the same power consumption as the one shown in Figure 3.8(a) except for the slight overhead in the switching circuitry. While the approach has the desired effect of suppressing 1/f noise, the switching introduces the equivalent of a parasitic resistor between the loop filter node and the power supply, which can adversely affect the loop dynamics.

To alleviate this problem, the bias circuit shown in Figure 3.9(a) is used in the re-circulating DLL prototype. This circuit switches the source and drain to turn off




Figure 3.10: Simulated node voltages of the NMOS in the bias circuitry when switched biasing is enabled.

each transistor as illustrated in Figure 3.9(b). For example, when an NMOS transistor is turned off, its source and drain are both switched to $V_{DD}$. Thus, the gate-source voltage is negative in the "off" state, and the threshold voltage is increased by the increased source-bulk voltage, which also helps to release the trapped carriers. The clock for the switched biasing is provided by the VCO itself. For illustration purposes, Figure 3.10 shows the simulated gate, drain and source voltages of the NMOS when the switched biasing is enabled. In practice, there are inevitable mismatches between the two biasing branches. However, since the biasing branches are switched at the VCO frequency, the mismatches only introduce a slight error in the duty cycle of the



Figure 3.11: Pseudo-differential voltage controlled delay cell design: (a) Conventional latch using two back-to-back inverters. (b) Improved latch for better 1/f noise reduction.

VCO, which is not a concern in this paper. To demonstrate the effect of the switched biasing during testing, the devices are sized such that the VCO in-band phase noise is dominated by the bias noise and the switched biasing can be disabled for comparison.

The performance of the voltage-controlled delay cells in the VCO is critical. Their phase noise and value of $\beta$ strongly depend on their circuit topology. As described in [1], to maximize $\beta$ it is necessary to minimize the coupling among delay cells and ensure that each cell is truly edge-triggered. The proposed delay cell is an extension of that shown in Figure 3.11(a), wherein the delay is varied by controlling the transconductance of $M_p$, $M_n$, $M_p'$ and $M_n'$. This topology minimizes the inter-cell coupling because, except for the delay control lines, the delay cells are only connected through low impedance power supply lines. Transistor level simulations confirm that the resulting VCO achieves a $\beta$ of very nearly 1, the theoretical maximum, across process corners. The topology also allows rail-to-rail signal swing and a wide tuning range (1 MHz to 160 MHz) to cover the large component spread in typical CMOS processes. As supply voltages decrease with the scaling of modern CMOS processes, the rail-to-rail swing is particularly desirable because it enables more efficient utilization of the power supply to increase the signal to noise ratio inside the VCO.

The pseudo-differential topology is used to reject common mode noise from the supply and substrate. Normally, the differential signal is created using a latch with two back-to-back inverters at the outputs as shown in Figure 3.11(a). In a normal oscillating cycle, every transistor is turned fully on and off. However, if the conventional latch is used, when $M_p$, $M_n$, $M_p'$ and $M_n'$ are off, unlike the other transistors their gate-source voltages are not set to zero. In order to exploit the switched biasing technique to further suppress 1/f noise in these transistors, a special latch is implemented as shown in Figure 3.11(b). Using $M_n$ as an example, when it is turned off using the proposed latch, both its source and drain are pulled to $V_{DD}$. Figure 3.12(a) shows the simulated outputs of the delay cell where the transistors are sized to make the rising and falling edges as symmetric as possible for better 1/f noise rejection





[14]. Figure 3.12(b) shows the node voltages of  $M_n$  where the "off" state is identical to the "off" state of the transistor in the bias described above.



Figure 3.13: Comparison between conventional PLL and the re-circulating DLL with a 2nd order loop filter, with and without switched biasing. Theoretical calculations are superimposed on the measured data. System parameters: fref = 4 MHz, N = 25, Kvco = 190 MHz/V, Ichp = 0.5 mA, Loop filter:  $R = 330 \Omega$ , C1 = 680 pF, C2 = 8.2 nF, PLL bandwidth = 200 kHz.

# V. MEASUREMENT DETAILS

#### A. VCO Phase Noise Measurement

The prototype IC was fabricated in a single-poly, six-metal 0.18-µm CMOS process through MOSIS and was packaged in a 32-pin QFN package. The 4-MHz reference clock for the DLL was derived from a 32-MHz crystal. Figures 3.13-3.16 present measured results with superimposed theoretical results in close agreement.

Figure 3.13 shows superimposed measured and theoretically calculated phase



Figure 3.14: Re-circulating DLL phase noise measurement: wide loop bandwidth results in more in-band noise attenuation.

noise power spectral density (PSD) curves for the synthesizer configured as a conventional PLL and as a re-circulating DLL with a 2<sup>nd</sup> order loop filter, both with and without switched biasing. The system parameters are shown in the figure caption. The results indicate that phase realignment significantly attenuates the in-band VCO phase noise. With the switched biasing enabled, a peak spot phase noise reduction of 5 dB is observed in both the PLL and the re-circulating DLL. Since the transistors in the delay cell are switched off in the same fashion as in the bias, it is reasonable to expect that similar noise attenuation is achieved inside the delay cell. This result confirms that the switched biasing technique is feasible up to at least 100 MHz. With both noise reduction schemes enabled, a peak spot phase noise reduction of 21.5 dB is observed compared to the conventional PLL. Figure 3.14 shows the measured phase noise when the synthesizer is configured as a re-circulating DLL with a 1<sup>st</sup>-order loop filter. The results support the assertions made above based on the theory presented in [1] that a wide-band 1<sup>st</sup>-order loop filter yields the best results. In the theoretical calculations, the VCO stand-alone phase noise is required. This was obtained by measuring the phase noise of the system with and without switched biasing and with a very narrow bandwidth so that the phase noise is dominated by the VCO.

In a VCDL based DLL as shown in Figure 3.1(a), it is predicted in [16] that the 1/f noise in the DLL output is reduced because the 1/f noise in the circuit is modulated to be around $f_{ref}$. Since the prototype re-circulating DLL achieves a $\beta$ value of 1, the suppression of phase noise memory in the re-circulating DLL is as good as in a VCDL based DLL. Therefore in a re-circulating DLL, in addition to the phase noise suppression from the phase realignment, one would expect much less 1/f noise from the VCO even without the switched biasing. However, this phenomenon was not observed in the re-circulating DLL prototype measurement. Taking the top curve of Figure 3.14 as an example, the theoretical calculation used the stand-alone VCO phase noise measured separately with the switched biasing disabled. The good agreement between theory and measurement indicates that the phase realignment does not affect the mechanism by which the 1/f noise inside the VCO is generated. Therefore, in order to reduce the contribution of the 1/f noise to the in-band phase noise, both system-level optimization such as a wide loop bandwidth, and device level innovation like switched



Figure 3.15: Measurement on the input noise transfer function for conventional PLL, recirculating DLL with 2<sup>nd</sup> order loop filter and with 1<sup>st</sup> order loop filter.

biasing, are required.

#### B. Reference and Input Noise Transfer Function Measurement

Figure 3.16: Measured reference noise transfer functions for the conventional PLL, the DLL with a 22  $\mu$ F loop filter, and the DLL with a 2.2 nF loop filter. The loop bandwidth has little effect on the reference noise.

Because the in-band phase noise was dominated by the VCO, it was difficult to extract the input and reference noise transfer functions from the measured phase noise. To characterize the two transfer functions, a sinusoid generated by an Agilent 8648C signal generator was ac-coupled into the loop filter and into the reference clock pin, respectively. The injected signal, shaped by the corresponding noise transfer function, is upconverted to the synthesizer output in the form of spurious tones around the VCO center frequency. The power of the spurs was then measured using an Agilent E4405B spectrum analyzer. Sweeping the frequency of the injected signal provided a sampled version of the noise transfer function.
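The relation between an injected sinusoid and the resulting spur power can be illustrated with the standard narrowband phase-modulation approximation: a carrier whose phase is modulated sinusoidally with a small peak deviation θ (in radians) shows a sideband on each side of the carrier at approximately 20·log10(θ/2) dBc. The sketch below assumes θ ≪ 1 rad; the function names are hypothetical and do not describe the measurement setup used here.

```python
import math


def phase_dev_to_spur_dbc(peak_phase_dev_rad):
    """Single-sideband spur level (dBc) for a small sinusoidal phase deviation.

    Uses the narrowband approximation: sideband/carrier amplitude = theta / 2.
    Only valid when the peak deviation is much smaller than 1 rad.
    """
    if peak_phase_dev_rad <= 0:
        raise ValueError("peak phase deviation must be positive")
    return 20.0 * math.log10(peak_phase_dev_rad / 2.0)


def spur_dbc_to_phase_dev(spur_dbc):
    """Inverse relation: peak phase deviation (rad) from a measured spur level."""
    return 2.0 * 10.0 ** (spur_dbc / 20.0)


# A 0.02 rad peak deviation corresponds to sidebands about 40 dB below the carrier.
print(phase_dev_to_spur_dbc(0.02))
print(spur_dbc_to_phase_dev(-40.0))
```

Under this approximation the spur level in dBc tracks the output phase deviation directly, which is why sweeping the injected frequency and recording the spur power samples the shape of the transfer function.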

| 1.8 V, 0.18 µm CMOS (1 poly, 6 metal) |
|---------------------------------------|
| 2.2 mm × 2.2 mm                       |
| 32-pin QFN                            |
| 4 MHz (derived from a 32 MHz crystal) |
| 100 MHz                               |
| 1 MHz ~ 160 MHz                       |
| 8.6 mW (core)                         |

Table 3.1: Performance summary.

Due to the parasitic capacitance and inductance of the PCB, it was difficult to measure the absolute value of the transfer functions. Instead, since the input noise transfer function of a conventional PLL is well known, it was used as a reference, and the measured results for the re-circulating DLL with 2<sup>nd</sup>-order and 1<sup>st</sup>-order loop filters were normalized with respect to the conventional PLL. Figure 3.15 shows the input phase noise transfer functions of the conventional PLL and of the re-circulating DLL with 2<sup>nd</sup>-order and 1<sup>st</sup>-order loop filters. When the loop filter was changed from 2<sup>nd</sup> order to 1<sup>st</sup> order, the resistor *R* in the loop filter was shorted, so the loop filter capacitance was the sum of  $C_1$  and  $C_2$ . As shown in the figure, for the same total loop filter capacitance, the DLL with a 1<sup>st</sup>-order loop filter attenuates input noise more than the DLL with a 2<sup>nd</sup>-order loop filter. Figure 3.16 shows the reference noise transfer function of the conventional PLL, the re-circulating DLL with a 22  $\mu$ F loop filter, and the DLL with a 2.2 nF loop filter. As expected, the loop filter bandwidth does not significantly affect the attenuation of reference noise.
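The normalization with respect to the conventional PLL amounts to subtracting, at each injected frequency, the PLL spur level from the DLL spur level in dB. If the injection path is the same in both configurations, any unknown fixture gain is common to both measurements and cancels in the difference. The sketch below assumes exactly that; the function name and the sample values are hypothetical and are not the measured data.

```python
def normalize_to_pll(freqs_hz, pll_spur_dbc, dll_spur_dbc):
    """Return [(frequency, DLL level relative to PLL in dB), ...].

    Assumes both sweeps used the same injected frequencies and the same
    injection path, so unknown fixture gain cancels in the difference.
    """
    if not (len(freqs_hz) == len(pll_spur_dbc) == len(dll_spur_dbc)):
        raise ValueError("all three sequences must have the same length")
    return [(f, d - p) for f, p, d in zip(freqs_hz, pll_spur_dbc, dll_spur_dbc)]


# Hypothetical spur readings (NOT measured data), for illustration only.
freqs = [1e4, 1e5, 1e6]
pll_spurs = [-40.0, -42.0, -55.0]
dll_spurs = [-52.0, -60.0, -75.0]

relative = normalize_to_pll(freqs, pll_spurs, dll_spurs)
print(relative)
```

A negative entry means the DLL configuration attenuates the injected tone more than the conventional PLL at that frequency.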

The measured performance of the prototype IC is summarized in Table 3.1. Similar to the RPLL in [1], there is no observable difference in power consumption between the PLL mode and DLL mode. A die photograph of the fabricated circuit indicating the major functional blocks is shown in Figure 3.17.



Figure 3.17: Die photo.

# VI. CONCLUSION

This paper presents a re-circulating delay-locked loop (DLL) with various innovations to improve in-band phase noise suppression. The voltage-controlled oscillator (VCO) and bias circuitry incorporate circuit-level techniques that reduce 1/*f* noise through switched biasing. Phase realignment theory is applied to optimize both the VCO and the loop filter so as to maximize the phase noise suppression, which is achieved by periodically switching in a clean reference pulse to reset the VCO phase noise memory. Measurements of the fabricated prototype agree well with theoretical predictions.

# VII. ACKNOWLEDGEMENT

The authors are grateful to Fari Assaderaghi, Matt Deig, Eric Fogleman, Jean-Sebastien Gagne, Kimihiko Imura, Eric Noguchi, Sudhakar Pamarti, Andrea Spandonis, Ashok Swaminathan and Kevin Wang for their assistance with and advice regarding this project.

# REFERENCES

1. S. Ye, L. Jansson, I. Galton, "A Multiple-Crystal Interface PLL with VCO Realignment to Reduce Phase Noise", *IEEE Journal of Solid-State Circuits*, Vol. 37, No. 12, pp. 1795-1803, Dec. 2002.
2. G. Chien, P. Gray, "A 900-MHz Local Oscillator Using a DLL-based Frequency Multiplier Technique for PCS Applications", *IEEE Journal of Solid-State Circuits*, Vol. 35, No. 12, pp. 1996-1999, Dec. 2000.
3. R. Farjad-Rad, *et al.*, "A Low-Power Multiplier DLL for Low-Jitter Multigigahertz Clock Generation in Highly Integrated Digital Chips", *IEEE Journal of Solid-State Circuits*, Vol. 37, No. 12, pp. 1804-1812, Dec. 2002.
4. I. Bloom, Y. Nemirovsky, "1/F Noise Reduction by Interfering with the Self Correlation of the Physical Noisy Process", *Proceedings of the 17th Convention of Electrical and Electronics Engineers in Israel*, 5-7 Mar. 1991.
5. E. Klumperink, S. Gierkink, A. van der Wel, B. Nauta, "Reducing MOSFET 1/f Noise and Power Consumption by "Switched Biasing"", *IEEE Journal of Solid-State Circuits*, Vol. 35, No. 7, pp. 994-1001, Jul. 2000.
6. I. Bloom and Y. Nemirovsky, "1/F Noise Reduction of Metal-oxide Semiconductor Transistor by Cycling from Inversion to Accumulation", *Applied Physics Letters*, Vol. 58, No. 15, pp. 1664-1666, Apr. 1991.
7. T. Lee, J. Bulzacchelli, "A 155-MHz Clock Recovery Delay- and Phase-Locked Loop", *IEEE Journal of Solid-State Circuits*, Vol. 27, No. 12, pp. 1736-1746, Dec. 1992.
8. J. Maneatis, "Low-Jitter Process-Independent DLL and PLL Based on Self-Biased Techniques", *IEEE Journal of Solid-State Circuits*, Vol. 31, pp. 1723-1732, Nov. 1996.
9. B. Kim, T. Weigandt, P. Gray, "PLL/DLL System Noise Analysis for Low-jitter Clock Synthesizer Design", *IEEE International Symposium on Circuits and Systems*, Vol. 4, pp. 31-34, Jun. 1994.
10. R. van de Beek, E. Klumperink, C. Vaucher, B. Nauta, "Low-Jitter Clock Multiplication: A Comparison Between PLLs and DLLs", *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, Vol. 49, No. 8, pp. 555-566, Aug. 2002.
11. M.-J. Edward Lee, *et al.*, "Jitter Transfer Characteristics of Delay-Locked Loops-Theories and Design Techniques", *IEEE Journal of Solid-State Circuits*, Vol. 38, No. 4, pp. 614-621, Apr. 2003.
12. H. Tian and A. El Gamal, "Analysis of 1/f Noise in Switched MOSFET Circuits", *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, Vol. 48, No. 2, pp. 151-157, Feb. 2001.
13. P. Chen, S. Liu, "A Cyclic CMOS Time-to-Digital Converter With Deep Subnanosecond Resolution", *IEEE Custom Integrated Circuits Conference*, pp. 605-608, May 1999.
14. A. Hajimiri, S. Limotyrakis, T. Lee, "Jitter and Phase Noise in Ring Oscillators", *IEEE Journal of Solid-State Circuits*, Vol. 34, No. 6, pp. 790-804, Jun. 1999.
15. L. Dai and R. Harjani, "A Low-Phase-Noise CMOS Ring Oscillator With Differential Control and Quadrature Outputs", *Proceedings of the 14th Annual IEEE International ASIC/SOC Conference*, pp. 134-138, 2001.
16. G. Chien, "Low-noise Local Oscillator Design Techniques Using a DLL-Based Frequency Multiplier for Wireless Applications", Ph.D. dissertation, University of California, Berkeley, 2000.