A Phase Locked Loop with Loop Bandwidth
Enhancement for Fast and Low-Noise Clock
Recovery from Full Speed USB Data on
Smart-Card Applications
Julien Roche∗, Wenceslas Rahadjandrabe†, Lahkdar Zad†, Gaetan Bracmard∗ and Daniele Fronte∗
∗ Atmel, Zone Industrielle, 13106 Rousset, France
† IM2NP-CNRS, IMT Technopôle de Château Gombert, 13451 Marseille, France
Abstract— The last decade has seen great progress in smart-card applications. This development brings growing demands on resources, both in computing power and in transfer speed with the external world. The bit rate provided by the standard specification currently used in the smart-card domain is no longer sufficient for demanding applications, so increasing this rate is a main challenge for this market. The USB interface is a judicious alternative for these applications, motivated by the following points: connection of the PC to the telephone (USB provides a ubiquitous link that can be used across a wide range of PC-to-telephone interconnects), ease of use (plug-and-play) and port expansion. In addition, its serial bidirectional interface, high bit rate (12 Mbit/s for the full-speed interface) and simple electrical connection (4 wires: VDD, GND, D+, D-) are a significant advantage for smart-card applications. Despite its simplicity, this solution has a cost and is challenging. Isochronism is not ensured, because no time base is transferred through the connection. Therefore, both the host and the device must generate their own reference, and each reference must have an accuracy compatible with the bit rate as well as with all the elements composing the chain. The idea presented in this paper is a new solution that covers the absence of a reference during data transfer over a USB interface between a smart card and a host. It allows fast clock recovery from USB data without any crystal reference and is compatible with the USB 2.0 specification; the principle is shown in Fig. 1.
Fig. 1. Principle of the clock recovery from USB data
According to Fig. 1, a 48 MHz ± 2500 ppm clock signal is recovered from a full-speed USB interface providing a 12 Mbit/s data rate. This paper presents a method to recover the clock from a USB signal reference without a crystal oscillator. To do this, we use an analog phase-locked loop that adaptively controls the loop bandwidth according to the locking status. An extended loop bandwidth enhancement is achieved by adaptive control of the charge pump current. First, we present the USB specification and the proposed method. Then, with the state of the art as the starting point of the research, the system realization is presented. The relationships between performance aspects and design variables are presented and the basic tradeoffs of the new concept are discussed. A circuit implementation of the solution is described in detail and simulation results in a 0.15 µm CMOS technology are presented.
Index terms—PLL, loop bandwidth, frequency synthesis, clock recovery, adaptive systems, low jitter, fast locking time, timing jitter.
I. INTRODUCTION
PLLs have been widely used in communication systems because they efficiently perform clock recovery and clock generation at relatively low implementation cost. In nearly all PLL applications, it is required to generate a low-noise, low-spur signal while achieving fast settling. Conventional analog PLL designs focus on optimizing individual components to reduce noise and jitter. It is not well emphasized that the overall noise performance of the PLL depends not only on the design of the individual components, but also heavily on the choice of the loop bandwidth. To improve the locking-time characteristics and the PLL noise, digital or hybrid analog/digital PLLs with loop bandwidth stepping capability have been studied [1], [2], at the expense of increased power consumption and die area. Other design solutions proposed in the literature [3], [4] use an adaptive bandwidth method. For these adaptive PLLs, the loop bandwidth enhancement is achieved by adaptive control of the reference frequency, the divide ratio, the charge pump current and the time constant of the loop filter. However, none of these methods uses patterns from the USB interface to recover the clock. Moreover, none of them is able to recover the clock as fast as the proposed method (< 10 µs). This paper presents a new enhancement to a conventional PLL. The system illustrated in Fig. 2 is a simple, low-power and low-cost PLL with the proposed adaptive bandwidth control. Since a crystal reference is "forbidden", the
main idea is to recover the reference clock by means of a charge-pump PLL during the first communication phases between the device and the host, before any data is transmitted. The circuit extracts the clock from the non-return-to-zero-inverted (NRZI) data sequence (USB full-speed signal) using both phase and frequency detection. In addition to low power, an important concern has been to achieve a relatively wide capture range so that the circuit can lock to the input in the presence of temperature and process variations. While cost, reliability, and performance issues make it desirable to integrate the system of Fig. 2 on a smart card, the total power dissipated in the high-speed blocks often becomes prohibitively large. Thus, it is important that each circuit be designed for minimum power dissipation. In Section II we describe issues related to extracting the clock from NRZI data (USB specification). Section III introduces the adaptive PLL architecture and discusses the advantages and tradeoffs of the concept. Section IV describes the circuit implementation, and Section V presents a summary of estimated and simulated results.
Fig. 2. Block diagram of the adaptive phase locked loop.
II. USB SPECIFICATION
The USB full-speed signaling bit rate is 12 Mb/s [5]. It is a fast, bi-directional, isochronous, low-cost, dynamically attachable serial interface that is consistent with the requirements of the platform of today and tomorrow. The USB is a cable bus that supports data exchange between a host and a wide range of simultaneously accessible peripherals. In any communication system, the transmitter and receiver must be synchronized enough to deliver data robustly. In an asynchronous communication system, data can be delivered robustly by allowing the transmitter to detect that the receiver has not received a data item correctly and simply retrying transmission of the data. In an isochronous communication system, the transmitter and receiver must remain time- and data-synchronized to deliver data robustly. The USB does not support transmission retry of isochronous data, so that minimal bandwidth can be allocated to isochronous transfers and time synchronization is not lost due to a retry delay. However, it is critical that a USB isochronous transmitter/receiver pair still remain synchronized both in normal data transmission cases and in cases where errors occur on the bus. In order for isochronous data to be manipulated reliably, the clocks identified above must be synchronized in some fashion. If the clocks are not synchronized, several undesirable clock-to-clock attributes can be present:
• Clock Drift: Two clocks that are nominally running at the same rate can, in fact, have implementation differences that result in one clock running faster or slower than the other over long periods of time. If uncorrected, this variation of one clock compared to the other can lead to having too much or too little data when data is expected to always be present at the time required.
• Clock Jitter: A clock may vary its frequency over time due to changes in temperature, etc. This may also alter when data is actually delivered compared to when it is expected to be delivered.
• Clock-to-clock Phase Differences: If two clocks are not phase locked, different amounts of data may be available at different points in time as the beat frequency of the clocks cycles out over time. This can lead to quantization/sampling related artifacts.
First of all, when a device is connected for the first time to the host, electrical detection and power management steps are carried out as shown in Fig. 3 [5].
Fig. 3. Power-on and connection events timing
Then, according to the electrical specification, the following protocol is executed:
• A start-of-frame packet (SOF) followed by a setup packet (SETUP) are exchanged.
• If the device deliberately does not respond to the SETUP packet twice, the host sends the SETUP pattern one more time, after which the device has to send an acknowledgment. This is called the "three strikes and you're out" protocol.
• Finally, between two patterns, there is an inter-packet delay of 7 bit times.
After the identification protocol, bus transactions between host and peripherals begin. Thus, the principal idea of this paper is to recover the clock from these patterns with sufficiently low jitter to meet the USB specifications.
A. Description of the pattern used for the synchronization
• SOF (start of frame)
Start-of-Frame (SOF) packets, illustrated in Fig. 4, are issued by the host at a nominal rate of once every 1.00 ms ± 0.0005 ms for a full-speed bus and 125 µs ± 0.0625 µs for a high-speed bus. The SOF token comprises the token-only transaction that distributes a SOF marker and accompanying frame number at precisely timed intervals corresponding to the start of each frame. All high-speed and full-speed functions, including hubs,
receive the SOF packet. The SOF token does not cause any receiving function to generate a return packet.
Fig. 4. USB pattern 1 used to recover the clock.
• SETUP
SETUP (Fig. 5) defines a special type of host-to-function data transaction that permits the host to initialize an endpoint's synchronization bits to those of the host.
The patterns have the following characteristics:
• NRZI (Non Return to Zero Inverted) => non-periodic.
• The duration of one bit time is 83.333 ns.
• The signal jitter near the host equals ±3.5 ns between consecutive transitions and ±4 ns between paired transitions, but it can be degraded further through 5 hubs and 5 m of cable according to the USB specification (±18.5 ns between consecutive transitions and ±9 ns between paired transitions).
• According to the communication protocol, if the device does not answer the SETUP sequence, the host sends the SETUP pattern two more times. After the third SETUP pattern, the device has to send an acknowledgment.
• Between two patterns there is an inter-packet delay of 7 bit times.
As a consequence, the system has to recover the clock within a maximum (accumulated) time of 12 µs while ensuring the required clock accuracy at its output.
III. DESIGN ISSUES
Random fluctuations in the output frequency of the PLL, expressed in terms of jitter and phase noise, have a direct impact on the timing accuracy where phase alignment is required and on the signal-to-noise ratio where frequency translation is performed. Reference spurs are unwanted noise sidebands that can occur at multiples of the comparison frequency and can be translated by a mixer to the desired signal frequency. They can mask or degrade the desired signal. Lock time is the time that it takes for the PLL to change frequencies; it depends on the size of the frequency step and on what frequency error is considered acceptable. When the PLL is switching frequencies, no data can be transmitted, so the PLL must lock fast enough not to slow the data rate. In this section of the paper, we describe issues related to extracting the clock from NRZI data.
A. NRZI signals
Random NRZI data has two properties that directly influence the design of clock recovery circuits. We examine both the frequency-domain and time-domain behavior of the NRZI format to understand these properties and their implications. For a random binary sequence with bit rate $r_b$ and equal probability of ONEs and ZEROs, the power spectral density is [6]:
$$P_x(\omega) = T_b \left[ \frac{\sin(\omega T_b / 2)}{\omega T_b / 2} \right]^2 \quad \text{and} \quad T_b = \frac{1}{r_b} \tag{1}$$
This function exhibits nulls at integer multiples of $r_b$. Intuitively, we note that the fastest NRZI waveform with bit rate $r_b$ occurs when consecutive bits alternate between ONE and ZERO. The result is a square wave with a frequency equal to $r_b/2$, containing no even-order harmonics [8]. Thus, the data spectrum contains no component at $r_b$, and a nonlinear function, e.g., edge detection, must be used. One approach to edge detection is to generate a positive impulse for every positive or negative data transition. To do this, we use the properties of the Hogge phase/frequency detector (Fig. 6) [7].
Fig. 6. Hogge Phase Frequency detector
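As a rough illustration of the edge-detection idea described above, the following Python sketch marks every level transition of an NRZI line, sampled once per bit time, with a positive pulse. It is not a model of the Hogge detector itself; the function name `edge_pulses` and the sample sequences are hypothetical, and only the 12 Mb/s bit rate is taken from the text.

```python
BIT_RATE_HZ = 12e6                 # USB full-speed bit rate (from the text)
BIT_TIME_S = 1.0 / BIT_RATE_HZ     # about 83.333 ns


def edge_pulses(levels):
    """Return one pulse (1) per bit boundary where the line level changes,
    for positive and negative transitions alike, and 0 where it is constant.

    `levels` is a list of 0/1 line levels, one sample per bit time.
    """
    return [int(a != b) for a, b in zip(levels, levels[1:])]


if __name__ == "__main__":
    alternating = [0, 1, 0, 1, 0, 1, 0, 1]   # fastest waveform: square wave at r_b / 2
    long_run = [0, 1, 1, 1, 1, 1, 1, 0]      # long run without transitions

    print(edge_pulses(alternating))  # a pulse at every bit boundary
    print(edge_pulses(long_run))     # pulses only at the two ends of the run
    print(f"bit time = {BIT_TIME_S * 1e9:.3f} ns")
```

The alternating sequence yields a pulse at every bit boundary, so the pulse train carries timing information at the bit rate, whereas the long run yields no pulses in between, which is the situation discussed next.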
Fig. 5. USB pattern 2 used to recover the clock.
Another property of NRZI data is that it can exhibit long sequences of consecutive ONEs or ZEROs, an important issue if the clock recovery circuit (CRC) employs phase locking (Fig. 8). In the absence of data transitions, the dc component produced by the Hogge PFD is zero (Fig. 7), and the control voltage of the oscillator, which is stored in the low-pass filter (LPF), gradually diminishes, thereby causing the output frequency to drift. Consequently, the recovered clock suffers from input-dependent jitter. To minimize this effect, the LPF time constant
must be sufficiently longer than the maximum length of consecutive ONEs or ZEROs, a remedy that inevitably leads to a small loop bandwidth and hence a narrow capture range. Since the center frequency of monolithic oscillators varies substantially with temperature and process, the CRC must employ some means of frequency detection and acquisition so as to guarantee locking.
Fig. 7. In lock Hogge PFD behaviour
Fig. 8. Phase locked clock recovery circuit with Hogge PFD
B. Locktime and Loop Bandwidth
The closed-loop transfer function of the PLL of figure 2 is:
$$CL(s) = \frac{\frac{K_\phi K_{VCO}}{N C_{tot}} \, (1 + s N \tau_2)}{s^2 + s \, \frac{K_\phi K_{VCO} \, \tau_2}{N C_{tot}} + \frac{K_\phi K_{VCO}}{N C_{tot}}} \tag{2}$$
in which $C_{tot} = C_1 + C_2 + C_3$ and $\tau_2$ is the second time constant of the filter, $\tau_2 = R_2 C_2$. The natural angular frequency $\omega_n$ and the damping factor $\zeta$ are defined as
$$\omega_n = \sqrt{\frac{K_\phi K_{VCO}}{N (C_1 + C_2 + C_3)}}, \qquad \zeta = \frac{R_2 C_2}{2} \sqrt{\frac{K_\phi K_{VCO}}{N (C_1 + C_2 + C_3)}} \tag{3}$$
Now consider a PLL that is initially locked at frequency $f_1$, and whose N counter is then changed so as to cause the PLL to switch to frequency $f_2$. This event is equivalent to changing the reference frequency from $f_1/N$ to $f_2/N$. The first term in the numerator of equation 2 shows the primary effects, and the second expression shows the secondary effects due to the zero. The zero in the transfer function strongly affects the overshoot and the rise time, but has little effect on the lock time. Using inverse Laplace transforms, the frequency response in the time domain is obtained, from which the lock time of the PLL is derived as:
$$\mathrm{Locktime} = \frac{-\ln\left( \frac{tol}{f_2 - f_1} \sqrt{1 - \zeta^2} \right)}{\zeta \cdot \omega_n} \tag{4}$$
where $f_2 - f_1$ is the frequency step and $tol$ is the maximum frequency tolerance within which the PLL is considered to be locked. The settling time is largely determined by the loop bandwidth $\omega_C$, whose maximum is approximately 1/10 of the reference frequency for stability considerations [10]. Equation 4 shows that the lock time is inversely proportional to the product of the natural angular frequency $\omega_n$ and the damping factor $\zeta$, apart from the weaker logarithmic dependence on $\zeta$. From equations 5 and 6, the lock time is therefore inversely proportional to the loop bandwidth $\omega_C$, which is proportional to $K_\phi$ and hence to the charge pump current.
$$\omega_C = 2 \cdot \omega_n \cdot \zeta \tag{5}$$
$$\omega_C = R_2 \cdot C_2 \cdot \frac{K_\phi \cdot K_{VCO}}{N (C_1 + C_2 + C_3)} \tag{6}$$
Consequently, the lock time can be modulated by modulating the charge pump current, and with it the loop bandwidth.
C. Noise and Loop Bandwidth
Using standard control theory [9], an expression can be written which relates the noise generated at each noise source to the corresponding noise that it produces at the output of the PLL. Fig. 9 shows that by controlling the PLL bandwidth $\omega_C$, we can control the noise transfer from each source of the system (phase detector noise, N divider noise, reference noise, VCO noise) to the output of the system.
Fig. 9. (a) Transfer function multiplying all noise sources except the VCO, (b) Transfer function multiplying the VCO noise
Within the loop bandwidth, the PLL phase detector and reference signal are typically the dominant noise sources, and outside the loop bandwidth, the VCO noise is often the dominant noise source. Moreover, by controlling the charge pump gain $K_\phi$, we can control the spur gain. There are several types of these spurious outputs with many different
causes. However, by far the most common type of spur is the reference spur. These spurs appear at multiples of the comparison frequency. Larger charge pump gains yield lower leakage-dominated spurs at the critical frequency $F_{spur}$, which is generally the comparison frequency. When the charge pump is in the tri-state condition, it is ideally high impedance. However, there will be some parasitic leakage through the charge pump, VCO, and loop filter capacitors. Of these leakage sources, the charge pump tends to be the dominant one. The loop bandwidth $\omega_C$ is therefore the most critical parameter of the loop filter. Choosing the loop bandwidth too small will yield a design with improved reference spurs and RMS phase error, but at the expense of increased lock time. Choosing the loop bandwidth too wide will result in improved lock time at the expense of increased reference spurs and RMS phase error. To summarize, we use a wide-bandwidth mode during acquisition and a narrow-bandwidth mode for phase noise improvement.
IV. CIRCUIT DESCRIPTION AND ADAPTIVE BANDWIDTH IMPLEMENTATION
Each parameter of the PLL has to be chosen optimally in order to achieve short lock time, low phase noise, low power consumption, and full integration of the loop filter capacitors. As seen above, a Hogge PFD is used to deal with the NRZI reference signal (Fig. 7), and a third-order passive loop filter has been adopted in order to minimize the spur gain. The VCO gain is 55 MHz/V and the charge pump current is $I_{CP}$ = 160 µA in wide-BW mode. The theoretical lock time computed from these values is a few µs, depending on the frequency step and the specified tolerance. An important consideration in the adaptive scheme is to maintain a phase margin above 45°, as shown in Fig. 10, and an optimum damping factor $\zeta$ of around 0.7 [10] when the loop is switched from the wide-BW to the narrow-BW mode, so that the loop stability and settling behavior are optimized in both modes.
Fig. 10. Phase margin for the different bandwidth modes: $\omega_{c1}/2\pi$ loop bandwidth for narrow-BW mode and $\omega_c/2\pi$ loop bandwidth for wide-BW mode
As we have seen in III-B, by controlling the charge pump current, we can control the loop bandwidth. To obtain these two bandwidth modes we use a modulated current source. This adaptation technique is especially effective in the case of large frequency steps or few data transitions. First, the PLL settles to within a small residual frequency error in a very short time in wide-BW mode. Then, the additional time required for the loop to settle to the final frequency value in the narrow-BW mode is greatly reduced.
Fig. 11. Modulated current source for the charge pump
Fig. 11 shows the simplified schematic of the adaptive current-steering charge pump. As $V_{OUT}$ decreases, M8 will enter the non-saturation region and $I_{OUT}$ will begin to decrease. However, this causes a decrease in the gate-source voltage of M7, which causes an increase in the gate voltage of M8. The minimum value of $V_{OUT}$ is determined by the gate-source voltage of M7 and $V_{dsat}$ of M8.
The charge pump currents $I_{UP}$ and $I_{DOWN}$ can be biased to either 80 µA for the narrow-BW mode or 160 µA for the wide-BW mode, controlled by the control voltage of the VCO through the control transistor M1 as shown in Fig. 11. M1 sends a current controlled by the VCO control voltage and thus by the state of the system. A regulated cascode source is then used to feed the charge pump element. This solution provides a high output impedance and a constant current at the output of the current source, independent of the output load (Fig. 12).
Fig. 12. Regulated cascode behavior
The bias current is then injected into the charge pump. The UP and DOWN signals from the phase frequency detector control the switches of the charge pump. The current is then integrated through the third-order loop filter (Fig. 13) to control a two-stage ring oscillator. The loop filter components are determined by choosing optimal values of the loop bandwidth, the phase margin, the charge pump gain and the VCO gain.
The impedance of the filter is given by:
$$Z(s) = \frac{1 + s \cdot T_2}{s \cdot (1 + s \cdot T_1) \cdot (1 + s \cdot T_3)} \cdot \frac{1}{C_{tot}} \tag{7}$$
and the time constants are determined by:
$$T_2 = R_2 \cdot C_2 \tag{8}$$
$$T_1 + T_3 = \frac{C_2 C_3 R_2 + C_1 C_2 R_2 + C_1 C_3 R_3 + C_2 C_3 R_3}{C_1 + C_2 + C_3} \tag{9}$$
$$\frac{T_1 \cdot T_3}{T_2} = \frac{C_1 \cdot C_3 \cdot R_3}{C_1 + C_2 + C_3} \tag{10}$$
The phase margin is:
$$M_\Phi = \tan^{-1}(\omega_C T_2) - \tan^{-1}(\omega_C T_1) - \tan^{-1}(\omega_C T_3) \tag{11}$$
Choosing the loop bandwidth to maximize the phase margin, i.e. setting the derivative of (11) with respect to $\omega_C$ to zero, yields, with $T_{31} = T_3 / T_1$:
$$\frac{\omega_C T_2}{1 + (\omega_C T_2)^2} = \frac{\omega_C T_1}{1 + (\omega_C T_1)^2} + \frac{\omega_C T_1 T_{31}}{1 + (\omega_C T_1 T_{31})^2} \tag{12}$$
Denoting the right-hand side of (12) by $f(\omega_C T_1)$ and solving for $\omega_C T_2$:
$$\omega_C T_2 = \frac{1 \pm \sqrt{1 - 4 f(\omega_C T_1)^2}}{2 \cdot f(\omega_C T_1)} = g(\omega_C T_1) \tag{13}$$
Using (13) in (11) to eliminate $\omega_C T_2$, with $x = \omega_C T_1$, yields:
$$M_\Phi = \tan^{-1}(g(x)) - \tan^{-1}(x) - \tan^{-1}(x \cdot T_{31}) \tag{14}$$
Since $x$ is the only unknown, this equation can be solved numerically for $x$, and $T_1$ can then be found via:
$$T_1 = \frac{x}{\omega_C} \tag{15}$$
Once $T_1$ is known, $T_2$ can be found by:
$$T_2 = \frac{g(\omega_C T_1)}{\omega_C} \tag{16}$$
Now, by definition, the gain of the open-loop transfer function is equal to one at the loop bandwidth. Therefore:
$$C_1 + C_2 + C_3 = C_{tot} = \frac{K_\Phi \cdot K_{VCO}}{\omega_c^2} \cdot \sqrt{\frac{1 + (\omega_c T_2)^2}{(1 + (\omega_c T_1)^2) \cdot (1 + (\omega_c T_3)^2)}} \tag{17}$$
with $K_\Phi$ the charge pump gain and $K_{VCO}$ the VCO gain. This leads to a system of four equations and four unknowns:
• Constants
$$K_1 = C_{tot} \tag{18}$$
$$K_2 = (T_1 + T_3) \cdot K_1 \tag{19}$$
$$K_3 = \frac{T_1 \cdot T_3 \cdot K_1}{T_2} \tag{20}$$
$$K_4 = \frac{C_3}{C_1} \tag{21}$$
• Equations
$$C_1 + C_2 + C_3 = K_1 \tag{22}$$
$$T_2 \cdot (C_1 + C_3) + R_3 \cdot C_3 \cdot (C_1 + C_2) = K_2 \tag{23}$$
$$R_3 \cdot C_1 \cdot C_3 = K_3 \tag{24}$$
$$\frac{C_3}{C_1} = K_4 \tag{25}$$
Hence,
$$C_1 \cdot (K_4 + 1) + C_2 = K_1 \tag{26}$$
$$T_2 \cdot C_1 \cdot (K_4 + 1) + K_3 + \frac{K_3 \cdot C_2}{C_1} = K_2 \tag{27}$$
Combining these leads to a quadratic equation that can be solved for $C_1$. Once $C_1$ is known, $C_2$, $C_3$, $R_2$ and $R_3$ can be found: $C_1$ = 5 pF, $C_2$ = 166 pF, $C_3$ = 0.7 pF, $R_2$ = 1.4 kΩ, $R_3$ = 12.8 kΩ.
Fig. 13. Third Order Loop Filter
The output voltage of the loop filter controls a two-stage ring oscillator. Each stage is a differential delay cell that consists of an input NMOS pair M1 and M2, a cross-coupled PMOS pair M3 and M4 to sharpen the edges of the output signal, and a frequency-control PMOS pair M5 and M6 (Fig. 14). Fig. 15 represents the final loop system used to recover the 48 MHz clock from USB data without a crystal reference and with an on-chip loop filter.
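The component values quoted above can be cross-checked against equations 8 to 11 with a short Python sketch. It is only an illustration under stated assumptions: the helper names are hypothetical, the loop bandwidth is taken from equation 6 with a divide ratio N = 1, and the product of charge pump gain and VCO gain is taken as the charge pump current times the VCO gain in Hz/V (the 2π factors cancelling). Neither N nor this unit convention is stated in the text.

```python
import math

# Component values reported in the text
C1, C2, C3 = 5e-12, 166e-12, 0.7e-12   # farads
R2, R3 = 1.4e3, 12.8e3                 # ohms
KVCO_HZ_PER_V = 55e6                   # VCO gain from the text
N = 1                                  # ASSUMPTION: divide ratio is not given in the text


def time_constants():
    """Return (T1, T2, T3) in seconds from equations 8 to 10."""
    ctot = C1 + C2 + C3
    t2 = R2 * C2
    t_sum = (C2 * C3 * R2 + C1 * C2 * R2 + C1 * C3 * R3 + C2 * C3 * R3) / ctot
    t_prod = t2 * C1 * C3 * R3 / ctot
    # T1 and T3 are the two roots of t^2 - t_sum * t + t_prod = 0
    root = math.sqrt(t_sum ** 2 - 4.0 * t_prod)
    return (t_sum + root) / 2.0, t2, (t_sum - root) / 2.0


def loop_bandwidth(icp):
    """Loop bandwidth in rad/s from equation 6.

    ASSUMPTION: K_phi = icp / (2*pi) and K_VCO = 2*pi * KVCO_HZ_PER_V,
    so their product is icp * KVCO_HZ_PER_V.
    """
    return R2 * C2 * icp * KVCO_HZ_PER_V / (N * (C1 + C2 + C3))


def phase_margin_deg(omega_c):
    """Phase margin in degrees from equation 11."""
    t1, t2, t3 = time_constants()
    return math.degrees(
        math.atan(omega_c * t2) - math.atan(omega_c * t1) - math.atan(omega_c * t3)
    )


if __name__ == "__main__":
    t1, t2, t3 = time_constants()
    print(f"T1 = {t1 * 1e9:.2f} ns, T2 = {t2 * 1e9:.1f} ns, T3 = {t3 * 1e9:.2f} ns")
    for mode, icp in (("narrow-BW", 80e-6), ("wide-BW", 160e-6)):
        wc = loop_bandwidth(icp)
        print(f"{mode}: fc = {wc / (2 * math.pi) / 1e6:.2f} MHz, "
              f"phase margin = {phase_margin_deg(wc):.1f} deg")
```

Under these assumptions both bandwidth modes stay above the 45° phase margin target; a different divide ratio would scale both bandwidths and change the margins.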
Fig. 14. Delay cell schematic and the differential VCO
Fig. 15. Phase Locked Loop with Loop Bandwidth Enhancement for Fast and Low-Noise Clock Recovery from Full Speed USB Data
V. SIMULATION RESULTS
Figure 16 shows the phase noise contribution from the different loop components based on transistor-level simulation. The overall phase noise is estimated and also provided in figure 16. It can be seen that the phase noise of the VCO is suppressed inside the loop bandwidth, whereas the phase noise from the other loop components is attenuated outside the loop bandwidth.
Fig. 16. Estimated phase noise of the whole loop and contribution of each loop component at the synthesizer output.
The overall phase noise at 1 MHz frequency offset is −130 dBc/Hz (Fig. 17), and the dominant contributor is the VCO. The simulated current consumption is 8 mA with a supply voltage of 1.8 V.
Fig. 17. Overall phase noise
With this system, we can reach an accuracy of 0.3% (Fig. 18) on the output frequency of the VCO in about 1 µs, with a 25 MHz non-periodic noisy reference signal (±0.2 ns jitter per bit time of 40 ns).
Fig. 18. Settling time simulation for a PLL with adaptation and non-periodical reference signal.
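For comparison with the simulated settling time, equation 4 can be evaluated directly. The Python sketch below is illustrative only: the 5 MHz frequency step, the divide ratio N = 1 and the tolerance (2500 ppm of 48 MHz) are assumptions made for the example, as is the convention that the product of charge pump gain and VCO gain equals the charge pump current times the VCO gain in Hz/V. The component values, VCO gain and charge pump currents are those given in Section IV.

```python
import math

C_TOT = 5e-12 + 166e-12 + 0.7e-12   # C1 + C2 + C3 from Section IV, farads
TAU2 = 1.4e3 * 166e-12              # R2 * C2, seconds
KVCO_HZ_PER_V = 55e6                # VCO gain from Section IV
N = 1                               # ASSUMPTION: divide ratio is not given in the text


def lock_time(icp, freq_step_hz, tol_hz):
    """Lock time in seconds from equations 3 and 4 (valid for zeta < 1)."""
    # ASSUMPTION: K_phi * K_VCO = icp * KVCO_HZ_PER_V (the 2*pi factors cancel)
    omega_n = math.sqrt(icp * KVCO_HZ_PER_V / (N * C_TOT))
    zeta = 0.5 * TAU2 * omega_n
    if zeta >= 1.0:
        raise ValueError("equation 4 assumes an underdamped loop (zeta < 1)")
    arg = tol_hz / freq_step_hz * math.sqrt(1.0 - zeta ** 2)
    return -math.log(arg) / (zeta * omega_n)


if __name__ == "__main__":
    step = 5e6                 # ASSUMPTION: hypothetical frequency step
    tol = 48e6 * 2500e-6       # 2500 ppm of 48 MHz = 120 kHz
    for mode, icp in (("narrow-BW", 80e-6), ("wide-BW", 160e-6)):
        print(f"{mode}: lock time = {lock_time(icp, step, tol) * 1e6:.2f} us")
```

With these assumed inputs the wide-BW mode locks faster than the narrow-BW mode, which is the motivation for switching bandwidth during acquisition; the absolute values depend on the assumed step and tolerance.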
Fig. 19. Layout of the proposed adaptive PLL
As shown in figure 19, the die area is 600 µm × 300 µm in 0.15 µm CMOS technology with an on-chip loop filter.
VI. CONCLUSION
This article demonstrates the feasibility of an adaptive-bandwidth PLL that achieves frequency synthesis, frequency modulation, and clock and data recovery with fast locking and high precision. In contrast to standard techniques, the proposed method offers a bandwidth modulation solution with a simple single-loop PLL. Test chip measurements are awaited for comparison with the simulation results.
REFERENCES
[1] John G. Maneatis, "Low-Jitter Process-Independent DLL and PLL Based on Self-Biased Techniques," IEEE Journal of Solid-State Circuits, Nov. 1996.
[2] C. S. Vaucher, "An Adaptive PLL Tuning System Architecture Combining High Spectral Purity and Fast Settling Time," IEEE Journal of Solid-State Circuits, April 2000.
[3] I. Novof et al., "Fully-Integrated CMOS Phase-Locked Loop with 15 to 240 MHz Locking Range and 50 ps Jitter," in ISSCC Dig. Tech. Papers, Feb. 1995, pp. 112-113.
[4] A. Young, J. K. Greason, J. E. Smith, and K. L. Wong, "A PLL Clock Generator with 5 to 110 MHz Lock Range for Microprocessors," in ISSCC Dig. Tech. Papers, Feb. 1992, pp. 50-51.
[5] "Universal Serial Bus Specification," Revision 2.0, April 27, 2000.
[6] S. K. Shanmugam, "Digital and Analog Communication Systems," Wiley, New York, 1979.
[7] C. R. Hogge, "A Self-Correcting Clock Recovery Circuit," J. Lightwave Technology, vol. 3, Dec. 1985, pp. 1312-1314.
[8] B. Razavi, "Design of Monolithic Phase-Locked Loops and Clock Recovery Circuits - A Tutorial," in Monolithic Phase-Locked Loops and Clock
[9] B. Razavi, "A Study of Phase Noise in CMOS Oscillators," IEEE J. Solid-State Circuits, vol. 31, pp. 331-343, March 1996.
[10] F. M. Gardner, "Charge-Pump Phase-Lock Loops," IEEE Transactions on Communications, vol. COM-28, pp. 1849-1858, Nov. 1980.