Data Communications (undergraduate course), Lecture 3 of 5

 Physical sources naturally generate signals that
contain a significant amount of redundant
information, and transmitting this redundancy
wastes communication resources.
 For efficient transmission the redundant
information should be removed from the signal
prior to transmission with no loss of
information. This is referred to as ‘data
compaction’ or ‘lossless data compression’.
Data Communications Dr. Randa Elanwar 2012-2013

 The entropy of the source sets the fundamental limit of
redundancy removal.
 Basically, data compaction is achieved by assigning
short descriptions to the most frequent outcomes of the
source output and longer descriptions to the less
frequent ones.
 Prefix coding
A source coding scheme used for data compaction. It is
not only uniquely decodable but also offers the possibility of
an average code word length close to the source
entropy, i.e. the code efficiency η approaches 100%.

 A prefix code is defined as a code in which no code
word is the prefix of any other code word.
 Example

Source Symbol   Probability of occurrence   Code 1   Code 2   Code 3
S0              0.5                         0        0        0
S1              0.25                        1        10       01
S2              0.125                       00       110      011
S3              0.125                       11       111      0111

 Code 1 is not a prefix code since the bit 0, the code
word of S0, is a prefix of 00, the code word of S2.
Likewise, the bit 1, the code word of S1, is a prefix of 11,
the code word of S3. Similarly, we may show that code
3 is not a prefix code but code 2 is.

 A prefix code is distinguished from other uniquely
decodable codes by the fact that the end of a code
word is always recognizable.
 Hence the decoding of a prefix code word can be
accomplished as soon as the binary sequence
representing a source symbol is fully received. For
this reason, prefix codes are also referred to as
instantaneous codes.
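
The prefix condition and the instantaneous decoding it allows can be sketched in Python as follows. This is a minimal illustration using the three codes from the table above; the function names are hypothetical and are not part of the lecture.

```python
def is_prefix_code(codewords):
    """Return True if no code word is a prefix of another code word."""
    words = list(codewords)
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i != j and b.startswith(a):
                return False
    return True


def decode_instantaneous(bits, code):
    """Decode a bit string with a prefix code given as {symbol: code word}."""
    lookup = {word: symbol for symbol, word in code.items()}
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in lookup:          # end of a code word is recognised at once
            symbols.append(lookup[buffer])
            buffer = ""
    if buffer:
        raise ValueError("bit stream ends in the middle of a code word")
    return symbols


code1 = {"S0": "0", "S1": "1", "S2": "00", "S3": "11"}
code2 = {"S0": "0", "S1": "10", "S2": "110", "S3": "111"}
code3 = {"S0": "0", "S1": "01", "S2": "011", "S3": "0111"}

print(is_prefix_code(code1.values()))             # False
print(is_prefix_code(code2.values()))             # True
print(is_prefix_code(code3.values()))             # False
print(decode_instantaneous("0101100111", code2))  # ['S0', 'S1', 'S2', 'S0', 'S3']
```

The decoder never looks ahead: it emits a symbol the moment the buffered bits match a code word, which is only safe because code 2 satisfies the prefix condition.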

Huffman coding
 It is an important class of prefix codes. The
basic idea behind it is to assign to each symbol
a sequence of bits roughly equal in length to
the amount of information conveyed by this
symbol.
 The end result is a source code whose average
code word length approaches the source
entropy H(S).

Huffman coding
 The idea of the Huffman code is to replace the
symbol set of the memoryless source by a simpler
one.
 This reduction is continued step-by-step until we
are left with a final set with only 2 symbols for
which (0, 1) is an optimal code.
 Starting from this trivial code we work backward.

Specifically, the Huffman encoding algorithm proceeds
as follows:
1. The source symbols are listed in order of descending
probability of occurrence. The two source symbols
of lowest probability are assigned a 0 and a 1 and are
regarded as being combined into a new source symbol
with probability equal to the sum of the two original
probabilities. (The list of symbols is now reduced by
1.)
2. The symbol list is updated by placing the new symbol
in the list according to the value of its probability.

3. The procedure is repeated until we are left
with a final list of source symbols of only two
for which a 0 and a 1 are assigned.
4. The code for each original source symbol is
found by working backward and tracing the
sequence of 0’s and 1’s assigned to that symbol
as well as its successors.
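
The steps above can be sketched in Python as follows. This is one possible implementation under stated assumptions, not necessarily the form intended in the lecture: it keeps the symbol list in a heap instead of re-sorting it, and it builds each code word by prepending the bit assigned at every merge, which is equivalent to working backward. The function names are hypothetical; the probabilities are those of the prefix-code table.

```python
import heapq
import itertools
import math


def huffman_code(probabilities):
    """Build a binary Huffman code for a dict {symbol: probability}."""
    if len(probabilities) < 2:
        raise ValueError("need at least two symbols")
    counter = itertools.count()  # tie-breaker so the heap never compares dicts
    # Each heap entry: (probability, tie-breaker, {symbol: partial code word})
    heap = [(p, next(counter), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)   # lowest probability
        p1, _, group1 = heapq.heappop(heap)   # second lowest probability
        # Assign 0 and 1 to the two groups, then treat them as one new symbol.
        merged = {s: "0" + w for s, w in group0.items()}
        merged.update({s: "1" + w for s, w in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(counter), merged))
    return heap[0][2]


def average_length(code, probabilities):
    """Average code word length in bits per symbol."""
    return sum(probabilities[s] * len(w) for s, w in code.items())


def entropy(probabilities):
    """Source entropy H(S) in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


probs = {"S0": 0.5, "S1": 0.25, "S2": 0.125, "S3": 0.125}
code = huffman_code(probs)
print(code)
print(average_length(code, probs), entropy(probs))  # 1.75 1.75
```

For this source every probability is a negative power of two, so the average code word length equals the entropy exactly and the efficiency is 100%. For other sources the average length is in general slightly larger than the entropy.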

 Example

 Redundancy plays an important role in
communications. It is essential for reliable
communication.
 Because of redundancy we are able to decode a
message accurately despite errors in the received
message. Redundancy thus combats noise.
 If all the redundancy in a message is removed it
would take much less time in transmission but if
an error occurs at the receiver it would be difficult
to make sense of the received message.

 For example, in order to transmit 16 symbols we may
use a group of 4 binary pulses (‘0000’, ‘0001’, ‘0010’, …
‘1111’).
 In this coding scheme no redundancy exists. If an error
occurs in the reception of even one of the pulses, the
receiver will produce a wrong value. Here we may use
redundancy to eliminate the effect of possible errors
caused by channel noise.
 Thus, if we add one more pulse to each code word so as to
make the number of positive pulses even, we have a
code that can detect a single error in any position.

 Thus, to the code word 0001 we add a fifth pulse to make a
new code word 00011. Now the number of positive pulses is
2 (even).
 If a single error occurs in any position, this parity will be
violated. The receiver then will request retransmission of the
message.
 This code can detect a single error (more generally, any odd
number of errors) but cannot locate, and hence cannot correct,
it. It cannot detect an even number of errors. By
introducing more redundancy it is possible not only to
detect but also to correct errors.
 We will learn more about redundancy later when we study
error-correcting codes.
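
The even-parity scheme described above can be sketched in Python as follows. It is a minimal illustration with hypothetical function names, using the code word 0001 from the text.

```python
def add_even_parity(word):
    """Append one bit so that the number of 1s in the result is even."""
    parity = word.count("1") % 2
    return word + str(parity)


def parity_ok(received):
    """Return True if the received word contains an even number of 1s."""
    return received.count("1") % 2 == 0


def flip(word, position):
    """Simulate a channel error by inverting one bit of the word."""
    bit = "1" if word[position] == "0" else "0"
    return word[:position] + bit + word[position + 1:]


sent = add_even_parity("0001")
print(sent)                              # 00011
print(parity_ok(sent))                   # True
print(parity_ok(flip(sent, 2)))          # False: a single error is detected
print(parity_ok(flip(flip(sent, 0), 3))) # True: a double error goes unnoticed
```

The check only reports that something is wrong. It gives no indication of which pulse was in error, which is why the receiver has to request retransmission.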

 Having reviewed different coding techniques,
it is now time to understand how we can reach
our target of ‘efficient transmission’.
 In other words, once the information has been
compressed and coded into a sequence of ‘1’s
and ‘0’s, we need to know the best way of
transmitting this sequence such that we save our
resources (power, bandwidth) and achieve correct
signal recovery at the receiver.

 We can measure the “GOODNESS” of a communication
system in many ways:
 How close is the estimated signal to the original signal
 Better estimate = higher quality transmission
 Bit Error Rate (BER) for digital signal
 How much power is required to transmit a signal?
 Lower power = longer battery life, less interference
 How much bandwidth B is required to transmit a signal?
 Less B means more users can share the channel
 How much information is transmitted?
 In digital systems information is expressed in bits/sec.

Time domain vs. frequency domain
 Every signal can be represented in two different ways:
time domain representation and frequency domain
representation.
 A time-domain graph shows how a signal changes
over time, whereas a frequency-domain graph shows
how much of the signal lies within each given
frequency band over a range of frequencies.
 Why do we need a frequency domain
representation of a signal? Because frequency analysis
simplifies the understanding and interpretation of the
effects of various time-domain operations.

 A given function or signal can be converted
between the time and frequency domains with a
pair of mathematical operators called a transform.
 An example is the Fourier transform, which
decomposes a function into the sum of a number of
sine waves at different frequencies.
 The frequency domain representation of the signal
(called ‘spectrum’) is the graph showing the Fourier
coefficients values versus the corresponding
frequency components.

 The frequency spectrum of a time-domain signal is
a representation of that signal in the frequency
domain.
 The frequency spectrum can be generated via a
Fourier transform of the signal, and the resulting
values are usually presented as amplitude and
phase, both plotted versus frequency.
 Thus, any signal that can be represented as an
amplitude that varies with time has a
corresponding frequency spectrum.
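
As an illustration of amplitude and phase plotted versus frequency, the following Python sketch computes the spectrum of a sampled signal with a direct discrete Fourier transform. The test signal, the sampling rate of 64 Hz and the function name are assumptions chosen for the example; a practical program would normally use a fast Fourier transform routine instead of this direct sum.

```python
import cmath
import math


def dft(samples):
    """Direct discrete Fourier transform of a list of real or complex samples."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


fs = 64   # assumed sampling rate in Hz
n = 64    # number of samples, so bin k corresponds to k * fs / n = k Hz

# Time-domain signal: a 5 Hz sine of amplitude 1 plus a 12 Hz sine of amplitude 0.5
signal = [
    1.0 * math.sin(2 * math.pi * 5 * t / fs) + 0.5 * math.sin(2 * math.pi * 12 * t / fs)
    for t in range(n)
]

spectrum = dft(signal)
# One-sided amplitude and phase for the positive-frequency bins
amplitude = [2 * abs(c) / n for c in spectrum[: n // 2]]
phase = [cmath.phase(c) for c in spectrum[: n // 2]]

for k, a in enumerate(amplitude):
    if a > 0.01:
        print(f"{k} Hz: amplitude {a:.3f}, phase {phase[k]:.3f} rad")
```

Only the bins at 5 Hz and 12 Hz carry significant amplitude, which is the frequency-domain view of the same signal.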

 The exponential form of the Fourier series rewrites
sines and cosines as complex exponentials.
 These complex exponentials include terms at
negative frequencies. Negative frequencies have no
separate physical existence, but for a real signal the
amplitude spectrum is even-symmetric about 0 Hz, and
this symmetry simplifies computations a lot.

Power spectral density
 The power spectral density (PSD) describes how the
power of a signal is distributed with frequency. It
represents the power per unit bandwidth of the
spectral components at different frequencies.
 For technical reasons, it is desirable to have zero PSD at
frequency 0 hertz (for example, links that are AC-coupled
through transformers or capacitors do not pass DC).
 Note that a signal that is narrow in the time domain (has a
short duration or period) is wide in the frequency domain,
and vice versa, since frequency is the reciprocal of the
time period:
f = 1 / T0

DC component
 When describing a periodic function in the frequency
domain, the DC bias, DC component, DC offset, or DC
coefficient is the mean value of the waveform. If the
mean amplitude is zero, there is no DC offset.
 A waveform without a DC component is known as a
DC-balanced waveform. DC-balanced waveforms are
useful in communications systems to avoid voltage
imbalance problems between connected systems or
components.
 DC offset is usually undesirable when it causes
saturation or change in the operating point of an
amplifier. In signal processing terms, DC offset can be
reduced in real-time by a high-pass filter.
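
The following Python sketch measures the DC component as the mean of the waveform and removes it sample by sample with a first-order high-pass filter (a "DC blocker"). The filter form, the coefficient r = 0.95 and the test waveform are assumptions chosen for illustration, not values from the lecture.

```python
def dc_component(samples):
    """DC component of a waveform: the mean value of its samples."""
    return sum(samples) / len(samples)


def dc_blocker(samples, r=0.95):
    """First-order high-pass filter: y[n] = x[n] - x[n-1] + r * y[n-1]."""
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# A square wave swinging between 0 V and 1 V has a DC component of 0.5 V.
wave = [1.0, 1.0, 0.0, 0.0] * 200
filtered = dc_blocker(wave)

print(dc_component(wave))             # 0.5
print(dc_component(filtered[-400:]))  # close to 0 once the filter has settled
```

Because the filter works on one sample at a time it can run in real time. A value of r closer to 1 passes lower frequencies but takes longer to settle.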

 Binary data can be transmitted using a number
of different types of pulses. The choice of a
particular pair of pulses to represent the
symbols 1 and 0 is called Line Coding and the
choice is generally made on the grounds of one
or more of the following considerations:
 Presence or absence of a DC level.
 Power spectral density, particularly its value at 0
Hz.
 Bandwidth.

 BER performance (i.e. Probability of error).
 Transparency (i.e. the property that any arbitrary symbol,
or bit, pattern can be transmitted and received).
 Ease of clock signal recovery for symbol synchronization.
 Presence or absence of inherent error detection
properties.
 After line coding, pulses may be filtered or
otherwise shaped to further improve their
properties, for example their spectral efficiency
and/or immunity to intersymbol interference.

Signal Bandwidth
 The transmission of Rb bits per second requires a
theoretical minimum pulse bandwidth of Rb/2 Hz.
 Nyquist showed that, to avoid overlap between
successive pulses (intersymbol interference), a
channel of bandwidth B can carry at most 2B pulses
per second. With one bit per pulse and bit duration T0:
Rb = 1/T0 and Rb ≤ 2B, hence B ≥ Rb/2

Transparency
 Regenerative repeaters are used at regularly spaced
intervals along a digital transmission line to detect the
incoming digital signal and regenerate new clean
pulses for further transmission along the line.
 This process periodically eliminates the accumulation
of noise and signal distortion along the transmission
path.
 Timing information (clock signal) between successive
repeaters has to be extracted from the received signal.

 The timing signal is sensitive to the incoming bit
pattern. If there are too many 0’s (no pulses) in a
sequence, errors arise in the timing information.
 Thus, for reliable clock recovery at the receiver, one
usually imposes a maximum run length constraint on
the generated channel sequence, i.e., the maximum
number of consecutive ones or zeros is bounded to a
reasonable number.
 The line code in which the bit pattern doesn’t affect the
accuracy of the timing information is said to be
transparent.

 In other words, it should be possible to
transmit a digital signal correctly regardless of
the pattern of 1’s and 0’s. If the data is
coded such that for every possible sequence of
data the coded signal is received faithfully, the
code is transparent.
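
The maximum run length constraint mentioned above can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the limit of 3 in the usage lines is an arbitrary example value, not a figure from the lecture.

```python
from itertools import groupby


def max_run_length(bits):
    """Length of the longest run of identical consecutive symbols."""
    return max((len(list(run)) for _, run in groupby(bits)), default=0)


def satisfies_run_limit(bits, limit):
    """Return True if no run of identical symbols is longer than limit."""
    return max_run_length(bits) <= limit


print(max_run_length("1100000101"))          # 5
print(satisfies_run_limit("1100000101", 3))  # False
print(satisfies_run_limit("10101010", 3))    # True
```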

Error detection
 In information theory and coding theory with
applications in computer science and
telecommunication, error detection and correction or
error control are techniques that enable reliable
delivery of digital data over unreliable communication
channels.
 Many communication channels are subject to channel
noise, and thus errors may be introduced during
transmission from the source to a receiver.
 Error detection techniques allow detecting such errors,
while error correction enables reconstruction of the
original data.

1. Unipolar Signaling:
 Unipolar signaling (also called on-off keying,
OOK) is the type of line coding in which one
binary symbol (representing a 0 for example) is
represented by the absence of a pulse (i.e. a
SPACE) and the other binary symbol (denoting a
1) is represented by the presence of a pulse (i.e. a
MARK).
 There are two common variations of unipolar
Signaling: Non-Return to Zero (NRZ) and Return
to Zero (RZ).

1.1. Unipolar Non-Return to Zero (NRZ):
 In unipolar NRZ the duration of the MARK pulse
(τ) is equal to the duration (T0) of the symbol slot.
 Advantages:
 Simplicity in implementation.
 Doesn’t require a lot of bandwidth for transmission.

 Disadvantages:
 Presence of DC level (indicated by spectral line at 0 Hz).
 Does not have any error correction capability.
 Does not possess any clocking component for ease of
synchronization.
 Is not transparent. A long string of zeros causes loss of
synchronization.
PSD of Unipolar NRZ

1.2. Return to Zero (RZ):
 In unipolar RZ the duration of the MARK pulse (τ) is
less than the duration (T0) of the symbol slot. Typically
RZ pulses fill only the first half of the time slot,
returning to zero for the second half.
 Advantages:
 Presence of a spectral line at symbol rate which can be used as
symbol timing clock signal.

 Disadvantages:
 Presence of DC level (indicated by spectral line at 0
Hz).
 Does not have any error correction capability.
 Occupies twice as much bandwidth as Unipolar NRZ.
 Is not Transparent
PSD of Unipolar RZ

2. Polar Signaling
 In polar Signaling a binary 1 is represented by a pulse g1(t) and
a binary 0 by the opposite (or antipodal) pulse g0(t) = -g1(t).
Polar Signaling also has NRZ and RZ forms.
 Polar NRZ and RZ have almost identical spectra to the
Unipolar NRZ and RZ. However, due to the opposite polarity
of the 1 and 0 symbols, neither contains any spectral lines.
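
The unipolar and polar waveforms described so far can be sketched in Python with two samples per bit slot (one for each half of the slot). The function names, the amplitude v and the example bit pattern are assumptions made for illustration.

```python
def unipolar_nrz(bits, v=1.0):
    """MARK fills the whole slot; SPACE is the absence of a pulse."""
    return [v if b else 0.0 for b in bits for _ in range(2)]


def unipolar_rz(bits, v=1.0):
    """MARK fills only the first half of the slot, then returns to zero."""
    return [level for b in bits for level in ((v if b else 0.0), 0.0)]


def polar_nrz(bits, v=1.0):
    """1 is sent as +v and 0 as the antipodal pulse -v for the whole slot."""
    return [v if b else -v for b in bits for _ in range(2)]


def polar_rz(bits, v=1.0):
    """Antipodal pulses in the first half of the slot, zero in the second."""
    return [level for b in bits for level in ((v if b else -v), 0.0)]


def mean(samples):
    """Mean value (DC level) of a sampled waveform."""
    return sum(samples) / len(samples)


bits = [1, 0, 1, 1, 0, 0, 1, 0]   # equal numbers of 1s and 0s
print(mean(unipolar_nrz(bits)))   # 0.5
print(mean(unipolar_rz(bits)))    # 0.25
print(mean(polar_nrz(bits)))      # 0.0
print(mean(polar_rz(bits)))       # 0.0
```

The polar means are zero here only because the example pattern contains as many 1s as 0s. An unbalanced pattern would give a non-zero mean for polar signaling as well.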

PSD of Polar NRZ
PSD of Polar RZ

2.1. Polar Non-Return to Zero (NRZ):
 Advantages:
 No DC component.
 Disadvantages:
 Does not possess any clocking component for ease of
synchronization.
 Is not transparent.

2.2. Polar Return to Zero (RZ):
 Advantages:
 Disadvantages:
 Does not possess any clocking component for easy
synchronization. However, the clock can be extracted by
rectifying the received signal.
 Occupies twice as much bandwidth as Polar NRZ.

3. BiPolar Signaling
 Bipolar signaling, also called “alternate mark
inversion” (AMI), uses three voltage levels (+V, 0, -V)
to represent two binary symbols.
 Zeros, as in unipolar, are represented by the
absence of a pulse and ones (or marks) are
represented by alternating voltage levels of +V and
–V.
 Alternating the mark level voltage ensures that the
bipolar spectrum has a null at DC.

 The alternating mark voltage also gives bipolar
Signaling a single error detection capability.
 Like the Unipolar and Polar cases, Bipolar also has
NRZ and RZ variations.
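
A minimal Python sketch of AMI encoding and of the error check that the alternating marks make possible is shown below, with one level per bit (NRZ form). The function names and the choice of +V for the first mark are assumptions made for illustration.

```python
def ami_encode(bits, v=1):
    """Zeros become 0; ones become marks that alternate between +v and -v."""
    levels, last = [], -v            # start at -v so that the first mark is +v
    for b in bits:
        if b:
            last = -last
            levels.append(last)
        else:
            levels.append(0)
    return levels


def ami_decode(levels):
    """Any non-zero level is a 1; the polarity carries no data."""
    return [1 if level != 0 else 0 for level in levels]


def ami_violation(levels):
    """Return True if two successive marks have the same polarity."""
    last = 0
    for level in levels:
        if level != 0:
            if level == last:
                return True
            last = level
    return False


bits = [1, 0, 1, 1, 0, 1]
line = ami_encode(bits)
print(line)                  # [1, 0, -1, 1, 0, -1]
print(sum(line))             # 0, i.e. no DC build-up for this pattern

corrupted = list(line)
corrupted[2] = 0             # channel error: a mark is received as a zero
print(ami_violation(line))       # False
print(ami_violation(corrupted))  # True
```

In this sketch an error that removes the very first or very last mark of an isolated block is not flagged, because there is no neighbouring mark on that side to compare with.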
PSD of Bipolar NRZ

3.1 BiPolar / AMI NRZ:
 Advantages:
 Occupies less bandwidth than unipolar and polar
NRZ schemes.
 Possesses single error detection capability.
 Disadvantages:
 Does not possess any clocking component for ease of
synchronization.
 Is not Transparent.

PSD of BiPolar RZ

3.2. BiPolar / AMI RZ:
 Advantages:
 Occupies less bandwidth than unipolar and polar RZ
schemes.
 Possesses single error detection capability.
 Clock can be extracted by rectifying (a copy of) the
received signal.
 Disadvantages:
 Is not Transparent.

4. Manchester Signaling
 In Manchester encoding, the duration of the bit is
divided into two halves. The voltage remains at one
level during the first half and moves to the other level
during the second half.
A ‘One’ is +ve in 1st half and -ve in 2nd half.
A ‘Zero’ is -ve in 1st half and +ve in 2nd half.

 The transition at the centre of every bit interval is
used for synchronization at the receiver.
 Manchester encoding is called self-synchronizing.
Synchronization at the receiving end can be
achieved by locking on to the transitions, which
indicate the middle of the bits.
PSD of Manchester
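
The Manchester rule stated above (a ‘One’ is +ve then -ve, a ‘Zero’ is -ve then +ve) can be sketched in Python with two levels per bit. The function names and the amplitude v are assumptions made for illustration.

```python
def manchester_encode(bits, v=1):
    """Each bit becomes two half-slot levels with a transition in the middle."""
    out = []
    for b in bits:
        out += [v, -v] if b else [-v, v]
    return out


def manchester_decode(levels):
    """Recover the bits from the direction of each mid-bit transition."""
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if first > second else 0)
    return bits


bits = [1, 0, 0, 0, 0, 1, 1]
line = manchester_encode(bits)
print(line)
print(manchester_decode(line) == bits)   # True
print(sum(line))                         # 0: every bit is DC-balanced
```

Even the run of four zeros in the example produces a transition in every bit slot, which is what the receiver locks on to. The price is two level changes per bit in the worst case.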

 Advantages:
 Easy to synchronise with.
 Is Transparent.
 Disadvantages:
 Because of the greater number of transitions it occupies a
significantly larger bandwidth.
 Does not have error detection capability.
 These characteristics make this scheme unsuitable for
use in Wide Area Networks. However, it is widely
used in Local Area Networks such as Ethernet and
Token Ring (Token Ring uses the differential variant).

 The received digital signal needs to be sampled at
precise instants. This requires a clock signal at the
receiver in synchronism with the clock signal at the
transmitter.
 In any communication system it is necessary that the
timing operations at the receiver follow closely the
corresponding operations at the transmitter.
 For example, when the transmission path is
interrupted, it is highly unlikely that transmitter and
receiver clocks will continue to indicate the same time
for long. Accordingly, we must set up a procedure for
adding and detecting a synchronization pulse.

 Three general methods of synchronization exist:
1. The transmitter and receiver both follow the same master
timing source
2. Transmitting a separate synchronizing signal (pilot
clock)
3. Self-synchronization, where timing information is
extracted from the received signal itself.
 Because of its high cost, the first method is justified
only for large volumes of data and high speed
communication systems.

 In the second method a code element or pulse is
set aside at the end of each frame, and this pulse is
transmitted in every other frame only.
 In this case the receiver searches the code words
(one-by-one) for the pattern of 1s and 0s
alternating at half the frame rate and establishes
synchronization between the transmitter and
receiver.
 This is suitable when the available channel
capacity is large compared to the data rate.

 The third method is used when the available channel capacity is
small compared to the data rate. It is more efficient, but it
requires short run lengths of 1s and 0s to avoid loss of
synchronization. Hence scramblers are sometimes needed.
 In general, a scrambler tends to make the data more random
by removing long strings of 1s and 0s. It can be helpful in
timing extraction by removing long strings of 0s in binary
data.
 Scramblers are primarily used for preventing unauthorized access.
A matched descrambler is used at the
intended receiver to undo the operations done by the
scrambler at the transmitter and recover the original
data sequence.
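
A scrambler and its matched descrambler can be sketched in Python as below. This is one common form (a self-synchronising scrambler built from a shift register and XOR gates) and is not necessarily the form intended in the lecture; the tap positions (3, 5) and the all-ones initial register state are assumptions chosen for illustration.

```python
def scramble(bits, taps=(3, 5), state=None):
    """XOR each data bit with earlier transmitted bits held in a shift register."""
    reg = list(state) if state else [1] * max(taps)
    out = []
    for b in bits:
        s = b
        for t in taps:
            s ^= reg[t - 1]
        out.append(s)
        reg = [s] + reg[:-1]   # shift the transmitted bit into the register
    return out


def descramble(bits, taps=(3, 5), state=None):
    """Undo scramble(): XOR each received bit with earlier received bits."""
    reg = list(state) if state else [1] * max(taps)
    out = []
    for s in bits:
        d = s
        for t in taps:
            d ^= reg[t - 1]
        out.append(d)
        reg = [s] + reg[:-1]   # shift the received bit into the register
    return out


data = [0] * 40                      # a long string of zeros
sent = scramble(data)
print("".join(map(str, sent)))       # no long run of identical bits
print(descramble(sent) == data)      # True
```

Both ends shift the same transmitted bits into their registers, so the descrambler tracks the scrambler as long as the taps and the starting state match.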
