
ICECS-08


A Phase Locked Loop with Loop Bandwidth Enhancement for Fast and Low-Noise Clock Recovery from Full Speed USB Data on Smart-Card Applications

Julien Roche∗, Wenceslas Rahadjandrabe†, Lahkdar Zad†, Gaetan Bracmard∗ and Daniele Fronte∗
∗ Atmel, Zone Industrielle, 13106 Rousset, France
† IM2NP-CNRS, IMT Technopôle de Château-Gombert, 13451 Marseille, France

Abstract: The last decade has been marked by great progress in applications related to smart cards. This development comes with growing demands on resources, both in computing power and in transfer speed with the external world. The bit rate ensured by the standard specification currently used in the smart-card domain is no longer sufficient for demanding applications. Increasing this rate is therefore a main challenge for this market. The USB interface constitutes a judicious alternative for these applications, motivated by the following points: connection of the PC to the telephone (the USB provides a ubiquitous link that can be used across a wide range of PC-to-telephone interconnects), ease of use (plug-and-play) and port expansion. In addition, its characteristics, namely a serial bidirectional interface, a high bit rate (12 Mbit/s for the full-speed interface) and a simple electrical connection (4 wires: VDD, GND, D+, D-), constitute a significant advantage for smart-card applications. In spite of its simplicity, this solution has a cost and is challenging. Isochronism is not ensured by a time base transferred through the connection. Therefore, both the host and the device must generate their own reference, and each reference must have an accuracy compatible with the bit rate as well as with all the elements composing the chain. The idea presented in this paper brings a new solution covering the absence of a reference during data transfer via a USB interface between a smart card and a host. It allows fast clock recovery from USB data without any crystal reference and is compatible with the USB 2.0 specification. The principle is shown in Fig. 1.

Fig. 1. Principle of the clock recovery from USB data

According to Fig. 1, a 48 MHz ± 2500 ppm clock signal is recovered from a full-speed USB interface providing a 12 Mbit/s data rate. This paper presents a salient method to recover the clock from a USB signal reference without a crystal oscillator. To do this, we use an analog phase-locked loop that adaptively controls the loop bandwidth according to the locking status. An extended loop bandwidth enhancement is achieved by adaptive control of the charge pump current. First, we present the USB specification and the proposed method. Then, with the state of the art as the starting point of the research, the system realization is presented. The relationships between performance aspects and design variables are presented and the basic tradeoffs of the new concept are discussed. A circuit implementation of the solution is described in detail and simulation results in a 0.15 µm CMOS technology are presented.

Index terms: PLL, loop bandwidth, frequency synthesis, clock recovery, adaptive systems, low jitter, fast locking time, timing jitter.

I. INTRODUCTION

PLLs have been widely used in communication systems because they efficiently perform clock recovery and clock generation at relatively low implementation cost. In nearly all PLL applications, it is required to generate a low-noise, low-spur signal while achieving fast settling. Conventional analog PLL designs focus on individual component optimization to reduce noise and jitter. It is not well emphasized that the overall noise performance of the PLL depends not only on the design of the individual components, but also heavily on the choice of the loop bandwidth. In order to improve the locking-time characteristics and the PLL noise, digital or hybrid analog/digital PLLs with loop bandwidth stepping capability have been studied [1], [2], at the expense of increased power consumption and die area. Other design solutions proposed in the literature [3], [4] use an adaptive bandwidth method. For these adaptive PLLs, the loop bandwidth enhancement is achieved by adaptive control of the reference frequency, the divide ratio, the charge pump current and the time constant of the loop filter. However, none of these methods uses the above-cited patterns from the USB interface to recover the clock. Moreover, none of them is able to recover the clock as fast as the proposed method (< 10s).

This paper presents a new enhancement to a conventional PLL. The system illustrated in Fig. 2 is a simple, low-power and low-cost PLL with the proposed adaptive bandwidth control. Since a crystal reference is "forbidden", the
main idea is to recover the reference clock by means of a charge-pump PLL during the first communication phases between the device and the host, before any data is transmitted. The circuit extracts the clock from the non-return-to-zero-inverted (NRZI) data sequence (USB full-speed signal) using both phase and frequency detection. In addition to low power, an important concern has been to achieve a relatively wide capture range so that the circuit can lock to the input in the presence of temperature and process variations. While cost, reliability, and performance issues make it desirable to integrate the system of Fig. 2 on a smart card, the total power dissipated in the high-speed blocks often becomes prohibitively large. Thus, it is important that each circuit be designed for minimum power dissipation.

Fig. 2. Block diagram of the adaptive phase locked loop.

In Section II we describe issues related to extracting the clock from NRZI data (USB specification). Section III introduces the adaptive PLL architecture and discusses the advantages and tradeoffs of the concept. Section IV describes the circuit implementation, and Section V presents a summary of estimated and simulated results.

II. USB SPECIFICATION

The USB full-speed signaling bit rate is 12 Mb/s [5]. It is a fast, bi-directional, isochronous, low-cost, dynamically attachable serial interface that is consistent with the requirements of the platforms of today and tomorrow. The USB is a cable bus that supports data exchange between a host and a wide range of simultaneously accessible peripherals. In any communication system, the transmitter and receiver must be synchronized enough to deliver data robustly. In an asynchronous communication system, data can be delivered robustly by allowing the transmitter to detect that the receiver has not received a data item correctly and simply retrying transmission of the data. In an isochronous communication system, the transmitter and receiver must remain time- and data-synchronized to deliver data robustly. The USB does not support transmission retry of isochronous data, so that minimal bandwidth can be allocated to isochronous transfers and time synchronization is not lost due to a retry delay. However, it is critical that a USB isochronous transmitter/receiver pair remain synchronized both in normal data transmission and in cases where errors occur on the bus. In order for isochronous data to be manipulated reliably, the clocks identified above must be synchronized in some fashion. If the clocks are not synchronized, several undesirable clock-to-clock attributes can be present:

• Clock Drift: Two clocks that are nominally running at the same rate can, in fact, have implementation differences that result in one clock running faster or slower than the other over long periods of time. If uncorrected, this variation of one clock compared to the other can lead to having too much or too little data when data is expected to always be present at the time required.
• Clock Jitter: A clock may vary its frequency over time due to changes in temperature, etc. This may also alter when data is actually delivered compared to when it is expected to be delivered.
• Clock-to-clock Phase Differences: If two clocks are not phase locked, different amounts of data may be available at different points in time as the beat frequency of the clocks cycles out over time. This can lead to quantization/sampling related artifacts.

First of all, when a device is connected for the first time to the host, electrical detection and power management steps are carried out as shown in Fig. 3 [5].

Fig. 3. Power-on and connection events timing

Then, according to the electrical specification, the following protocol is executed:

• A start-of-frame packet (SOF) followed by a setup packet (SETUP) are exchanged.
• If the device deliberately does not respond to the SETUP packet twice, the host sends the SETUP pattern one more time, after which the device has to send an acknowledgment. This is called the 'three strikes and you're out' protocol.
• Finally, between two patterns, there is an inter-packet delay of 7 bit times.

After the identification protocol, bus transactions between host and peripherals begin. Thus, the principal idea of this paper is to recover the clock from these patterns with sufficiently low jitter to meet the USB specification.

A. Description of the pattern used for the synchronisation

• SOF (start of frame)

Start-of-Frame (SOF) packets, illustrated in Fig. 4, are issued by the host at a nominal rate of once every 1.00 ms ± 0.0005 ms for a full-speed bus and 125 µs ± 0.0625 µs for a high-speed bus. The SOF token comprises the token-only transaction that distributes a SOF marker and accompanying frame number at precisely timed intervals corresponding to the start of each frame. All high-speed and full-speed functions, including hubs,
receive the SOF packet. The SOF token does not cause any receiving function to generate a return packet.

Fig. 4. USB pattern 1 used to recover the clock.

• SETUP

SETUP (Fig. 5) defines a special type of host-to-function data transaction that permits the host to initialize an endpoint's synchronization bits to those of the host.

Fig. 5. USB pattern 2 used to recover the clock.

The patterns have the following characteristics:

• NRZI (Non Return to Zero Inverted) => non-periodic.
• The duration of the bit time is 83.333 ns.
• The signal jitter near the host equals ±3.5 ns between consecutive transitions and ±4 ns between paired transitions, but it can be further degraded through 5 hubs and 5 m of cable according to the USB specification (±18.5 ns between consecutive transitions and ±9 ns between paired transitions).
• According to the communication protocol, if the device does not answer the SETUP sequence, the host sends the SETUP pattern two more times. After the third SETUP pattern, the device has to send an acknowledgment.
• Between two patterns there is an inter-packet delay of 7 bit times.

As a consequence, the system has to recover the clock within a maximum (accumulated) time of 12s while ensuring the required clock accuracy at its output.

III. DESIGN ISSUES

Random fluctuations in the output frequency of the PLL, expressed in terms of jitter and phase noise, have a direct impact on the timing accuracy where phase alignment is required and on the signal-to-noise ratio where frequency translation is performed. Reference spurs are unwanted noise sidebands that can occur at multiples of the comparison frequency and can be translated by a mixer to the desired signal frequency. They can mask or degrade the desired signal. Lock time is the time that it takes for the PLL to change frequencies, which depends on the size of the frequency step and on what frequency error is considered acceptable. When the PLL is switching frequencies, no data can be transmitted, so the PLL must lock fast enough not to slow the data rate. In this section of the paper, we describe issues related to extracting the clock from NRZI data.

A. NRZI signals

Random NRZI data has two properties that directly influence the design of clock recovery circuits. We examine both the frequency-domain and time-domain behavior of the NRZI format to understand these properties and their implications. For a random binary sequence with bit rate $r_b$ and equal probability of ONEs and ZEROs, the power spectral density is [6]:

$$P_x(\omega) = T_b \left[ \frac{\sin(\omega T_b / 2)}{\omega T_b / 2} \right]^2, \qquad T_b = \frac{1}{r_b} \tag{1}$$

This function exhibits nulls at integer multiples of $r_b$. Intuitively, we note that the fastest NRZI waveform with bit rate $r_b$ occurs when consecutive bits alternate between ONE and ZERO. The result is a square wave with a frequency equal to $r_b/2$, containing no even-order harmonics [8]. The spectrum therefore contains no component at $r_b$ itself, and a nonlinear function, e.g., edge detection, must be used. One approach to edge detection is to generate a positive impulse for every positive or negative data transition. To do this, we use the properties of the Hogge phase/frequency detector (PFD), Fig. 6 and [7].

Fig. 6. Hogge Phase Frequency detector

Another property of NRZI data is that it can exhibit long sequences of consecutive ONEs or ZEROs, an important issue if the clock recovery circuit (CRC) employs phase-locking (Fig. 8). In the absence of data transitions, the dc component produced by the Hogge PFD is zero, Fig. 7, and the control voltage of the oscillator, which is stored in the low-pass filter (LPF), gradually diminishes, thereby causing the output frequency to drift. Consequently, the recovered clock suffers from input-dependent jitter. To minimize this effect, the LPF time constant
must be sufficiently longer than the maximum length of consecutive ONEs or ZEROs, a remedy that inevitably leads to a small loop bandwidth and hence a narrow capture range. Since the center frequency of monolithic oscillators varies substantially with temperature and process, the CRC must employ some means of frequency detection and acquisition so as to guarantee locking.

Fig. 7. In lock Hogge PFD behaviour

Fig. 8. Phase locked clock recovery circuit with Hogge PFD

B. Lock time and Loop Bandwidth

The closed-loop transfer function of the PLL of Fig. 2 is:

$$CL(s) = \frac{\dfrac{K_\phi K_{VCO}}{N C_{tot}} \left(1 + s N \tau_2\right)}{s^2 + s \dfrac{K_\phi K_{VCO} \tau_2}{N C_{tot}} + \dfrac{K_\phi K_{VCO}}{N C_{tot}}} \tag{2}$$

in which $C_{tot} = C_1 + C_2 + C_3$ and $\tau_2$ is the second time constant of the filter, $\tau_2 = R_2 C_2$. The natural pulsation $\omega_n$ and the damping factor $\varsigma$ are defined as

$$\omega_n = \sqrt{\frac{K_\phi K_{VCO}}{N (C_1 + C_2 + C_3)}}, \qquad \varsigma = \frac{R_2 C_2}{2} \sqrt{\frac{K_\phi K_{VCO}}{N (C_1 + C_2 + C_3)}} \tag{3}$$

Now consider a PLL which is initially locked at frequency $f_1$, and then the N counter is changed so as to cause the PLL to switch to frequency $f_2$. This event is equivalent to changing the reference frequency from $f_1/N$ to $f_2/N$. The first term in the numerator of equation 2 shows the primary effects, and the second expression shows the secondary effects due to the zero. The zero in the transfer function has a strong effect on the overshoot and the rise time, but little effect on the lock time. Using inverse Laplace transforms, the time-domain frequency response is obtained, from which the lock time of the PLL is derived as:

$$\text{LockTime} = \frac{-\ln\left( \dfrac{tol}{f_2 - f_1} \sqrt{1 - \varsigma^2} \right)}{\varsigma \, \omega_n} \tag{4}$$

where $f_2 - f_1$ is the frequency step and $tol$ is the maximum frequency tolerance within which the PLL is considered to be locked. The settling time is largely determined by the loop bandwidth, $\omega_C$, whose maximum is approximately 1/10 of the reference frequency for stability considerations [10]. Equation 4 shows that the lock time is inversely proportional to the natural pulsation $\omega_n$ and the damping factor $\varsigma$. From equations 5 and 6, the lock time is therefore inversely proportional to the loop bandwidth $\omega_C$, which is proportional to $K_\phi$ and hence to the charge pump current.

$$\omega_C = 2 \, \omega_n \, \varsigma \tag{5}$$

$$\omega_C = R_2 C_2 \cdot \frac{K_\phi K_{VCO}}{N (C_1 + C_2 + C_3)} \tag{6}$$

Hence, to modulate the lock time we can modulate the charge pump current and so the loop bandwidth.

C. Noise and Loop Bandwidth

Using standard control theory [9], an expression can be written which relates the noise generated at each noise source to the corresponding noise that it produces at the output of the PLL. Fig. 9 shows that if we can control the PLL bandwidth, $\omega_C$, we can control the noise transfer from each source of the system (phase detector noise, N divider noise, reference noise, VCO noise) to the output of the system.

Fig. 9. (a) Transfer function multiplying all noise sources except the VCO, (b) Transfer function multiplying the VCO noise

Within the loop bandwidth, the PLL phase detector and reference signal are typically the dominant noise sources, and outside the loop bandwidth, the VCO noise is often the dominant noise source. Moreover, by controlling the charge pump gain, $K_\phi$, we can control the spur gain, equation ??. There are several types of these spurious outputs with many different
causes. However, by far the most common type of spur is the reference spur. These spurs appear at multiples of the comparison frequency. Larger charge pump gains yield lower leakage-dominated spurs at the critical frequency, $F_{spur}$, which is generally the comparison frequency. When the charge pump is in the tri-state state, it is ideally high impedance. However, there will be some parasitic leakage through the charge pump, VCO, and loop filter capacitors. Of these leakage sources, the charge pump tends to be the dominant one. The loop bandwidth, $\omega_C$, is thus the most critical parameter of the loop filter. Choosing the loop bandwidth too small will yield a design with improved reference spurs and RMS phase error, but at the expense of increased lock time. Choosing the loop bandwidth too wide will result in improved lock time at the expense of increased reference spurs and RMS phase error. To summarize, we use a wide bandwidth mode during the acquisition time and a narrow bandwidth mode for phase noise improvement.

IV. CIRCUIT DESCRIPTION AND ADAPTIVE BANDWIDTH IMPLEMENTATION

Each parameter of the PLL has to be chosen optimally in order to achieve short lock time, low phase noise, low power consumption, and full integration of the loop filter capacitors. As seen above, a Hogge PFD is used to deal with the NRZI reference signal, Fig. 7, and a third-order passive loop filter has been adopted in order to minimize the spur gain. The VCO gain is 55 MHz/V and the charge pump current is $I_{CP}$ = 160 µA in wide-band mode. The theoretical lock time computed from these values is a few µs, depending on the frequency step and the specified tolerance. An important consideration in the adaptive scheme is to maintain a phase margin greater than 45°, as shown in Fig. 10, and an optimum damping factor $\varsigma$ of around 0.7 [10] when the loop is switched from the wide-BW to the narrow-BW mode, so that the loop stability and settling behavior are optimized in both modes.

Fig. 10. Phase margin for different bandwidth modes: $\omega_{c1}/2\pi$ loop bandwidth for narrow-BW mode and $\omega_c/2\pi$ loop bandwidth for wide-BW mode

As seen in Section III-B, by controlling the charge pump current we can control the loop bandwidth. To obtain these two bandwidth modes, we use a modulated current source. This adaptation technique is especially effective in the case of large frequency steps or sparse data transitions. First, the PLL settles to within a small residual frequency error in a very short time in wide-BW mode. Then, the additional time required for the loop to settle to the final frequency value in narrow-BW mode is greatly reduced.

Fig. 11. Modulated current source for the charge pump

Fig. 11 shows the simplified schematic of the adaptive current-steering charge pump. As $V_{OUT}$ decreases, M8 enters the non-saturation region and $I_{OUT}$ begins to decrease. However, this causes a decrease in the gate-source voltage of M7, which causes an increase in the gate voltage of M8. The minimum value of $V_{OUT}$ is determined by the gate-source voltage of M7 and $V_{dsat}$ of M8.

The charge pump currents $I_{UP}$ and $I_{DOWN}$ can be biased to either 80 µA for the narrow-BW mode or 160 µA for the wide-BW mode, controlled by the control voltage of the VCO through the control transistor M1 as shown in Fig. 11. M1 delivers a current controlled by the VCO control voltage, and so by the state of the system. A regulated cascode source then supplies the charge pump element. With this solution we obtain a high output impedance and a constant current at the output of the current source, independent of the output load (Fig. 12).

Fig. 12. Regulated cascode behavior
Bias current is then injected into the charge pump. The UP and DOWN signals from the phase frequency detector control the switches of the charge pump. The current is then integrated through the 3rd-order loop filter (Fig. 13) to control a two-stage ring oscillator. The loop filter components are determined by choosing optimal values of the loop bandwidth, the phase margin, the charge pump gain and the VCO gain.

The impedance of the filter is given by:

$$Z(s) = \frac{1 + s T_2}{s \, (1 + s T_1)(1 + s T_3)} \cdot \frac{1}{C_{tot}} \tag{7}$$

and the time constants are determined by:

$$T_2 = R_2 C_2 \tag{8}$$

with

$$T_1 + T_3 = \frac{C_2 C_3 R_2 + C_1 C_2 R_2 + C_1 C_3 R_3 + C_2 C_3 R_3}{C_1 + C_2 + C_3} \tag{9}$$

$$\frac{T_1 T_3}{T_2} = \frac{C_1 C_3 R_3}{C_1 + C_2 + C_3} \tag{10}$$

Choosing the loop bandwidth to maximize the phase margin

$$M_\Phi = \tan^{-1}(\omega_C T_2) - \tan^{-1}(\omega_C T_1) - \tan^{-1}(\omega_C T_3) \tag{11}$$

yields

$$\frac{\omega_C T_2}{1 + (\omega_C T_2)^2} = \frac{\omega_C T_1}{1 + (\omega_C T_1)^2} + \frac{\omega_C T_1 T_{31}}{1 + (\omega_C T_1 T_{31})^2} \tag{12}$$

where $T_{31} = T_3 / T_1$. Writing $f(\omega_C T_1)$ for the right-hand side of (12) and solving for $\omega_C T_2$ gives

$$\omega_C T_2 = \frac{1 \pm \sqrt{1 - 4 f(\omega_C T_1)^2}}{2 f(\omega_C T_1)} = g(\omega_C T_1) \tag{13}$$

Using (13) and (11) to eliminate $\omega_C T_2$, with $x = \omega_C T_1$, yields:

$$M_\Phi = \pi + \tan^{-1}(g(x)) - \tan^{-1}(x) - \tan^{-1}(x \, T_{31}) \tag{14}$$

Since $x$ is the only unknown, this equation can be solved numerically for $x$, and then $T_1$ can be found via the equation:

$$T_1 = \frac{x}{\omega_C} \tag{15}$$

Once $T_1$ is known, $T_2$ can be found by

$$T_2 = \frac{g(\omega_C T_1)}{\omega_C} \tag{16}$$

Now, by definition, the gain of the open-loop transfer function is equal to one at the loop bandwidth. Therefore:

$$C_1 + C_2 + C_3 = C_{tot} = \frac{K_\Phi K_{VCO}}{\omega_c^2} \sqrt{\frac{1 + (\omega_c T_2)^2}{\left(1 + (\omega_c T_1)^2\right)\left(1 + (\omega_c T_3)^2\right)}} \tag{17}$$

with $K_\Phi$ the charge pump gain and $K_{VCO}$ the VCO gain. This leads to a system of four equations and four unknowns:

• Constants

$$K_1 = C_{tot} \tag{18}$$

$$K_2 = (T_1 + T_3) \cdot K_1 \tag{19}$$

$$K_3 = \frac{T_1 \cdot T_3 \cdot K_1}{T_2} \tag{20}$$

$$K_4 = \frac{C_3}{C_1} \tag{21}$$

• Equations

$$C_1 + C_2 + C_3 = K_1 \tag{22}$$

$$T_2 (C_1 + C_3) + R_3 C_3 (C_1 + C_2) = K_2 \tag{23}$$

$$R_3 C_1 C_3 = K_3 \tag{24}$$

$$\frac{C_3}{C_1} = K_4 \tag{25}$$

$$C_1 (K_4 + 1) + C_2 = K_1 \tag{26}$$

$$T_2 C_1 (K_4 + 1) + K_3 + \frac{K_3 C_2}{C_1} = K_2 \tag{27}$$

Combining these leads to a quadratic equation that can be solved for $C_1$. Once $C_1$ is known, $C_2$, $C_3$, $R_2$, and $R_3$ can be found: $C_1$ = 5 pF, $C_2$ = 166 pF, $C_3$ = 0.7 pF, $R_2$ = 1.4 kΩ, $R_3$ = 12.8 kΩ.

Fig. 13. Third Order Loop Filter

The output voltage of the loop filter controls a two-stage ring oscillator. Each stage is a differential delay cell that consists of an input NMOS pair M1 and M2, a cross-coupled PMOS pair M3 and M4 to sharpen the edges of the output signal, and a frequency-control PMOS pair M5 and M6 (Fig. 14). Fig. 15 represents the final loop system that recovers the 48 MHz clock from USB data without a crystal reference and with an on-chip loop filter.
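As a rough numerical cross-check of equations 3 to 6 and 8 to 11, the sketch below evaluates the time constants, loop dynamics, phase margin and lock time from the component values, VCO gain and charge pump currents quoted in the text. Several things here are assumptions rather than values from the paper: the divide ratio `N_DIV = 1`, the usual convention $K_\phi = I_{CP}/2\pi$ (so that the $2\pi$ factors cancel against a VCO gain given in Hz/V), the use of the second-order estimate of equation 6 as the frequency at which the phase margin is evaluated, and the frequency step and tolerance passed to `lock_time`.

```python
import math

# Component values and gains quoted in the text.
C1, C2, C3 = 5e-12, 166e-12, 0.7e-12      # loop filter capacitors, F
R2, R3 = 1.4e3, 12.8e3                    # loop filter resistors, ohm
K_VCO_HZ_PER_V = 55e6                     # VCO gain, Hz/V
I_CP_WIDE, I_CP_NARROW = 160e-6, 80e-6    # charge pump currents, A

N_DIV = 1  # ASSUMPTION: the divide ratio is not stated in the text


def time_constants():
    """Return (T1, T2, T3) from equations 8 to 10."""
    c_tot = C1 + C2 + C3
    t2 = R2 * C2                                            # eq. 8
    t_sum = (t2 * (C1 + C3) + R3 * C3 * (C1 + C2)) / c_tot  # eq. 9: T1 + T3
    t_prod = t2 * C1 * C3 * R3 / c_tot                      # eq. 10: T1 * T3
    root = math.sqrt(t_sum ** 2 - 4.0 * t_prod)
    return (t_sum + root) / 2.0, t2, (t_sum - root) / 2.0


def loop_dynamics(i_cp, n_div=N_DIV):
    """Return (omega_n, zeta, omega_c) from equations 3 and 5."""
    # ASSUMPTION: K_phi = I_CP / (2*pi) in A/rad and K_VCO in rad/s/V,
    # so the product K_phi * K_VCO reduces to I_CP * (gain in Hz/V).
    k = i_cp * K_VCO_HZ_PER_V
    omega_n = math.sqrt(k / (n_div * (C1 + C2 + C3)))
    zeta = 0.5 * R2 * C2 * omega_n
    return omega_n, zeta, 2.0 * omega_n * zeta


def phase_margin_deg(omega_c):
    """Phase margin of equation 11, in degrees, evaluated at omega_c."""
    t1, t2, t3 = time_constants()
    pm = (math.atan(omega_c * t2) - math.atan(omega_c * t1)
          - math.atan(omega_c * t3))
    return math.degrees(pm)


def lock_time(omega_n, zeta, f_step_hz, tol_hz):
    """Lock time of equation 4 (valid for zeta < 1)."""
    arg = (tol_hz / abs(f_step_hz)) * math.sqrt(1.0 - zeta ** 2)
    return -math.log(arg) / (zeta * omega_n)


if __name__ == "__main__":
    for name, i_cp in (("wide", I_CP_WIDE), ("narrow", I_CP_NARROW)):
        wn, zeta, wc = loop_dynamics(i_cp)
        # Hypothetical step (1 MHz) and tolerance (2500 ppm of 48 MHz).
        t_lock = lock_time(wn, zeta, f_step_hz=1e6, tol_hz=120e3)
        print(f"{name}: zeta={zeta:.2f}, fc={wc / (2 * math.pi):.3e} Hz, "
              f"PM={phase_margin_deg(wc):.1f} deg, lock={t_lock:.2e} s")
```

Because $\omega_C$ is proportional to the charge pump current, doubling the current from 80 µA to 160 µA doubles the loop bandwidth and scales $\omega_n$ by $\sqrt{2}$, which is the mechanism the adaptive scheme relies on. The absolute numbers depend on the assumed divide ratio and should be read as indicative only.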
Fig. 14. Delay cell schematic and the differential VCO

Fig. 15. Phase Locked Loop with Loop Bandwidth Enhancement for Fast and Low-Noise Clock Recovery from Full Speed USB Data

V. SIMULATION RESULTS

Figure 16 shows the phase noise contribution of the different loop components, based on transistor-level simulation. The overall phase noise is estimated and also provided in Figure 16. It can be seen that the phase noise of the VCO is suppressed inside the loop bandwidth, whereas the phase noise from the other loop components is attenuated outside the loop bandwidth.

Fig. 16. Estimated phase noise of the whole loop and contribution of each loop component at the synthesizer output.

The overall phase noise at 1 MHz frequency offset is −130 dBc/Hz (Fig. 17), and the dominant contributor is the VCO. The simulated current consumption is 8 mA with a supply voltage of 1.8 V.

Fig. 17. Overall phase noise

With this system, we can reach an accuracy of 0.3% (Fig. 18) on the output frequency of the VCO in about 1 µs, with a 25 MHz non-periodic noisy reference signal (±0.2 ns jitter per bit time of 40 ns).

Fig. 18. Settling time simulation for a PLL with adaptation and non-periodic reference signal.

As shown in Figure 19, the die area is 600 µm × 300 µm in 0.15 µm CMOS technology with an on-chip loop filter.

Fig. 19. Layout of the proposed adaptive PLL
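The figures quoted in this section and in Section II can be converted into a few derived quantities with simple arithmetic. The sketch below is only a convenience calculation on the numbers stated in the text (8 mA at 1.8 V, 600 µm × 300 µm, 0.3% accuracy, 12 Mb/s and 25 MHz signals); the function and key names are hypothetical.

```python
def derived_figures(i_supply_a=8e-3, v_supply_v=1.8,
                    width_um=600.0, height_um=300.0,
                    accuracy_fraction=0.003,
                    usb_bit_rate_hz=12e6, test_ref_hz=25e6):
    """Convert the quoted simulation figures into derived quantities."""
    return {
        # Power drawn from the supply, in mW.
        "power_mw": i_supply_a * v_supply_v * 1e3,
        # Die area in mm^2 (1 mm^2 = 1e6 um^2).
        "area_mm2": width_um * height_um * 1e-6,
        # Frequency accuracy expressed in parts per million.
        "accuracy_ppm": accuracy_fraction * 1e6,
        # Bit times corresponding to the two signal rates, in ns.
        "usb_bit_time_ns": 1e9 / usb_bit_rate_hz,
        "test_bit_time_ns": 1e9 / test_ref_hz,
    }


if __name__ == "__main__":
    for key, value in derived_figures().items():
        print(f"{key}: {value:.4g}")
```

The two bit times agree with the 83.333 ns and 40 ns values quoted in the text for the 12 Mb/s USB signal and the 25 MHz test reference.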
VI. CONCLUSION

This article demonstrates the feasibility of an adaptive bandwidth PLL that achieves frequency synthesis, frequency modulation, and clock and data recovery with fast locking and high precision. In contrast to standard techniques, the proposed method offers bandwidth modulation with a simple single-loop PLL. We are awaiting the test chip measurements to compare with the simulation results.

REFERENCES

[1] John G. Maneatis, "Low-jitter Process-independent DLL and PLL Based on Self-Biased Techniques," IEEE Journal of Solid State Circuits, Nov. 1996.
[2] C. S. Vaucher, "An adaptive PLL Tuning System architecture Combining High Spectral purity and Fast settling time," IEEE Journal of Solid State Circuits, April 2000.
[3] I. Novof et al., "Fully-integrated CMOS Phase-Locked Loop with 15 to 240 MHz Locking Range and 50 ps Jitter," in ISSCC Dig. Tech. Papers, Feb. 1995, pp. 112-113.
[4] A. Young, J. K. Greason, J. E. Smith, and K. L. Wong, "A PLL Clock Generator with 5 to 110 MHz Lock Range for Microprocessors," in ISSCC Dig. Tech. Papers, Feb. 1992, pp. 50-51.
[5] "Universal Serial Bus Specification," Revision 2.0, April 27, 2000.
[6] S. K. Shanmugam, "Digital and Analog Communication Systems," Wiley, New York, 1979.
[7] C. R. Hogge, "A self-correcting clock recovery circuit," J. Lightwave Technology, vol. 3, Dec. 1985, pp. 1312-1314.
[8] B. Razavi, "Design of monolithic phase-locked loops and clock recovery circuits: a tutorial," in Monolithic Phase-Locked Loops and Clock
[9] B. Razavi, "A Study of Phase Noise in CMOS Oscillators," IEEE J. Solid-State Circuits, vol. 31, pp. 331-343, March 1996.
[10] F. M. Gardner, "Charge Pump Phase Lock Loop," IEEE Transactions on Communications, vol. COM-28, pp. 1849-1858, Nov. 1980.
