Introduction to Channel Coding
An introduction to modern Channel Coding.

Presentation Transcript

  • DIGITAL COMMUNICATIONS Block 3 Channel Coding Francisco J. Escribano, 2013-14
  • Fundamentals of error control ● Error control: – error detection (ARQ schemes), – error correction (FEC schemes). ● [Block diagram: information bits bn → channel encoder → coded bits cn → channel → rn (hard) / λn (soft) → channel error detector (RTx request) / channel error corrector → estimated bn.] ● Channel model: – discrete inputs, – discrete (hard, rn) or continuous (soft, λn) outputs, – memoryless.
  • Fundamentals of error control ● Enabling detection/correction: adding redundancy to the information: for every k information bits, transmit n coded bits, n>k. ● Shannon's theorem (1948): 1) If R < C = sup_{p_X} I(X;Y), then for any ε>0 there is an n, with R=k/n constant, such that Pe<ε. 2) If a bit error probability Pb is acceptable, rates R < R(Pb) = C/(1-H2(Pb)) are achievable. 3) For any Pb, rates greater than R(Pb) are not achievable. ● Problem: Shannon's theorem is not constructive.
  • Fundamentals of error control ● Added redundancy is structured redundancy. ● This relies on a sound algebraic & geometrical basis. ● Our approach: – Algebra over the Galois Field of order 2, GF(2)={0,1}. – GF(2) is a proper field, GF(2)^m is a vector space of dim. m. – Dot product (·): logical AND. Sum (+, same as −): logical XOR. – Scalar product: b, d ∈ GF(2)^m, b·d^T = b1·d1 + ... + bm·dm. – Product by scalars: a ∈ GF(2), b ∈ GF(2)^m, a·b = (a·b1 ... a·bm). – It is also possible to define a matrix algebra over GF(2).
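As a small illustration of these GF(2) operations (an addition to the slides, with vectors represented as NumPy 0/1 arrays):

```python
import numpy as np

def gf2_add(b, d):
    """Vector sum over GF(2): element-wise XOR."""
    return (b + d) % 2

def gf2_dot(b, d):
    """Scalar product over GF(2): AND each pair of bits, XOR the results."""
    return int(np.sum(b * d) % 2)

b = np.array([1, 0, 1, 1])
d = np.array([0, 1, 1, 0])
print(gf2_add(b, d))   # [1 1 0 1]
print(gf2_dot(b, d))   # 1 (1·0 + 0·1 + 1·1 + 1·0 = 1 mod 2)
```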
  • Fundamentals of error control ● Given a vector b ∈ GF(2)^m, its binary weight is w(b) = number of 1's in b. ● It is possible to define a distance over the vector space GF(2)^m, called Hamming distance: dH(b,d) = w(b+d); b, d ∈ GF(2)^m. ● Hamming distance is a proper distance and accounts for the number of differing positions between vectors. ● Geometrical view: [example with the vectors (0110), (1011), (1110), (1010)].
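A minimal sketch of the weight and Hamming distance definitions (illustrative, with NumPy 0/1 vectors):

```python
import numpy as np

def weight(b):
    """Binary (Hamming) weight: number of 1's in the vector."""
    return int(np.sum(b))

def hamming_distance(b, d):
    """dH(b, d) = w(b + d) over GF(2) = number of differing positions."""
    return weight((b + d) % 2)

b = np.array([0, 1, 1, 0])
d = np.array([1, 0, 1, 0])
print(weight(b))               # 2
print(hamming_distance(b, d))  # 2 (the vectors differ in two positions)
```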
  • Fundamentals of error control ● A given encoder produces n output bits for each k input bits: – R = k/n < 1 is the rate of the code. ● The information rate decreases by a factor R by the use of a code: R'b = R·Rb < Rb (bit/s). ● Moreover, if used jointly with a modulation with spectral efficiency η = Rb/B (bit/s/Hz), the efficiency decreases by a factor R: η' = R·η < η (bit/s/Hz). ● In terms of Pb, the achievable Eb/N0 region in AWGN is lower bounded by: Eb/N0 (dB) ≥ 10·log10( (2^η' − 1) / η' ).
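A quick sketch evaluating this lower bound for a few spectral efficiencies (just the formula above, in Python):

```python
import math

def ebn0_min_db(eta):
    """Shannon lower bound on Eb/N0 (dB) in AWGN for spectral efficiency eta (bit/s/Hz)."""
    return 10 * math.log10((2 ** eta - 1) / eta)

for eta in (0.5, 1.0, 2.0):
    print(f"eta' = {eta}: Eb/N0 >= {ebn0_min_db(eta):.2f} dB")
# eta' = 1.0 gives the familiar 0 dB; eta' → 0 approaches the -1.59 dB ultimate limit
```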
  • Fundamentals of error control ● Achievable rates, capacity and limits. Source: http://www.comtechefdata.com/technologies/fec/ldpc 7
  • Fundamentals of error control ● How a channel code can improve Pb (BER in statistical terms). ● Cost: loss in resources (spectral efficiency, power). 8
  • Linear block codes ● An (n,k) linear block code (LBC) is a subspace C(n,k) < GF(2)^n with dim(C(n,k)) = k. ● C(n,k) contains 2^k vectors c = (c1 … cn). ● R = k/n is the rate of the LBC. ● n-k is the redundancy of the LBC – we would only need vectors with k components to specify the same amount of information.
  • Linear block codes ● Recall vector theory: – A basis for C(n,k) has k vectors over GF(2)^n. – C(n,k) is orthogonal to an (n-k)-dimensional subspace over GF(2)^n (its null subspace). ● c ∈ C(n,k) can be specified both as: – c = b1·g1 + ... + bk·gk, where {gj}, j=1,...,k, is the basis, and (b1...bk) are its coordinates over it; – c such that the scalar products c·hi^T are null, where {hi}, i=1,...,n-k, is a basis of the null subspace.
  • Linear block codes ● Arranging in matrix form, an LBC C(n,k) can be specified by – G = {gij}, i=1,...,k, j=1,...,n, with c = b·G, b ∈ GF(2)^k; – H = {hij}, i=1,...,n-k, j=1,...,n, with c·H^T = 0. ● G is a k×n generator matrix of the LBC C(n,k). ● H is an (n-k)×n parity-check matrix of the LBC C(n,k). – In another view, it can be shown that the rows in H stand for linearly independent parity-check equations. – The row rank of H for an LBC should be n-k. ● Note that gj ∈ C(n,k), and so G·H^T = 0.
  • Linear block codes ● The encoder is given by G. – Note that a number of different G generate the same LBC. ● [Diagram: b=(b1...bk) → LBC encoder G → c=(c1...cn)=b·G.] ● For any input information block of length k, it yields a codeword of length n. ● An encoder is systematic if b is contained in c = (b1...bk | ck+1...cn), so that ck+1...cn are the n-k parity bits. – Systematicity is a property of the encoder, not of the LBC C(n,k) itself. – GS = [Ik | P] is a systematic generator matrix.
  • Linear block codes ● How to obtain G from H, or H from G. – G rows are k vectors linearly independent over GF(2)^n. – H rows are n-k vectors linearly independent over GF(2)^n. – They are related through G·H^T = 0 (a). – (a) alone does not yield a sufficient set of equations, given H or G: a number of vector sets comply with it (basis sets are not unique). ● Given G, put it in systematic form by combining rows (the code will be the same, but the encoding does change). – If GS = [Ik | P], then HS = [P^T | In-k] complies with (a). ● Conversely, given H, put it in systematic form by combining rows. – If HS = [In-k | P], then GS = [P^T | Ik] complies with (a). ● The parity-check submatrix P can be on the left or on the right side (but on opposite sides of H and G simultaneously for a given LBC).
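As an illustration of these relations (not part of the original slides), a minimal Python sketch assuming a hypothetical (7,4) code defined by the arbitrary parity submatrix P below: it builds GS = [Ik | P] and HS = [P^T | In-k], encodes systematically, and checks relation (a):

```python
import numpy as np

k, n = 4, 7
P = np.array([[1, 1, 0],        # hypothetical k x (n-k) parity submatrix
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])
G_S = np.hstack([np.eye(k, dtype=int), P])          # systematic generator, G_S = [I_k | P]
H_S = np.hstack([P.T, np.eye(n - k, dtype=int)])    # parity-check matrix, H_S = [P^T | I_{n-k}]

b = np.array([1, 0, 1, 1])
c = b @ G_S % 2                                     # systematic codeword: c = (b | parity)
print(c)
print((G_S @ H_S.T) % 2)                            # relation (a): G·H^T = 0 over GF(2)
```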
  • Linear block codes ● Note that, by taking 2^k vectors out of 2^n, we are setting the binary words apart. ● The minimum Hamming distance between input words is dmin(GF(2)^k) = min{ dH(bi,bj) | bi,bj ∈ GF(2)^k, bi≠bj } = 1. ● Recall that we have added n-k redundancy bits, so that dmin(C(n,k)) = min{ dH(ci,cj) | ci,cj ∈ C(n,k), ci≠cj } > 1, with dmin(C(n,k)) ≤ n-k+1 (Singleton bound).
  • Linear block codes ● The channel model corresponds to a BSC (binary symmetric channel). [Diagram: c=(c1...cn) → modulator → AWGN channel → hard demodulator → r=(r1...rn); equivalent BSC(p) with crossover probabilities P(1|0)=P(0|1)=p and P(0|0)=P(1|1)=1-p.] ● p = P(ci≠ri) is the bit error probability of the modulation in AWGN.
  • Linear block codes ● The received word is r = c + e, where P(ei=1) = p. – e is the error vector introduced by the noisy channel. – w(e) is the number of errors in r wrt the original word c. – The probability of a given error pattern e with w(e)=t is p^t·(1-p)^(n-t), because the channel is memoryless. ● At the receiver side, we can compute the syndrome s = (s1...sn-k) as s = r·H^T = (c+e)·H^T = c·H^T + e·H^T = e·H^T. ● [Diagram: r=(r1...rn) → channel decoder H → s=(s1...sn-k)=r·H^T.] ● r ∈ C(n,k) ⇔ s = 0.
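Continuing the hypothetical (7,4) example from the earlier sketch, a quick illustration of the syndrome computation: a valid codeword gives s = 0, while a single channel error gives a nonzero syndrome equal to e·H^T:

```python
import numpy as np

P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])
G = np.hstack([np.eye(4, dtype=int), P])
H = np.hstack([P.T, np.eye(3, dtype=int)])

c = np.array([1, 0, 1, 1]) @ G % 2     # a valid codeword
print(c @ H.T % 2)                     # [0 0 0]: s = c·H^T = 0

e = np.zeros(7, dtype=int); e[2] = 1   # single channel error
r = (c + e) % 2
print(r @ H.T % 2)                     # nonzero syndrome s = e·H^T (column 2 of H)
```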
  • Linear block codes ● Two possibilities at the receiver side: a) Error detection (ARQ schemes): – If s≠0, there are errors, so ask for retransmission. b) Error correction (FEC schemes): – Decode an estimated ĉ ∈ C(n,k), so that dH(ĉ,r) is the minimum over all codewords in C(n,k) (closest neighbor decoding). – ĉ is the most probable word under the assumption that p is small (otherwise, the decoding fails). ● [Geometrical example: transmitted c, a small error e1 yields r1 correctly decoded as ĉ1 (OK); a larger error e2 yields r2 decoded as a different codeword ĉ2.]
  • Linear block codes ● Detection and correction capabilities (worst case) of an LBC with dmin(C(n,k)). – a) It can detect error events e with binary weight up to w(e)|max,det=d=dmin(C(n,k))-1 – b) It can correct error events e with binary weight up to w(e)|max,corr=t=⎣(dmin(C(n,k))-1)/2⎦ ● It is possible to implement a joint strategy: – A dmin(C(n,k))=4 code can simultaneously correct all error patterns with w(e)=1, and detect all error patterns with w(e)=2. 18
  • Linear block codes ● The minimum distance dmin(C(n,k)) is a property of the set of codewords in C(n,k), independent from the encoding (G). ● As the code is linear, dH(ci,cj) = dH(ci+cj,cj+cj) = dH(ci+cj,0), with ci, cj, ci+cj, 0 ∈ C(n,k). – dmin(C(n,k)) = min{ w(c) | c ∈ C(n,k), c≠0 }, i.e., it corresponds to the minimum word weight over all codewords different from null. ● dmin(C(n,k)) can be calculated from H: – It is the minimum number of different columns of H adding to 0. – It is upper bounded by the column rank of H plus 1 (cf. dmin ≤ n-k+1).
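A brute-force sketch of dmin as the minimum nonzero codeword weight, enumerating the 2^k codewords of the same hypothetical (7,4) code (only feasible for small k):

```python
import numpy as np
from itertools import product

P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])
G = np.hstack([np.eye(4, dtype=int), P])

def d_min(G):
    """dmin = minimum Hamming weight over all nonzero codewords (brute force over 2^k)."""
    k = G.shape[0]
    weights = [int(np.sum(np.array(b) @ G % 2))
               for b in product((0, 1), repeat=k) if any(b)]
    return min(weights)

print(d_min(G))   # 3 for this particular P
```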
  • Linear block codes ● Detection limits: probability of undetected errors? – Note that an LBC contains 2^k codewords, and the received word corresponds to any of the 2^n possibilities in GF(2)^n. – An LBC detects up to 2^n - 2^k error patterns. ● An undetected error occurs if r = c + e with e ≠ 0, e ∈ C(n,k). – In this case, r·H^T = 0. ● P_u(E) = Σ_{i=dmin}^{n} Ai·p^i·(1-p)^(n-i). – Ai is the number of codewords in C(n,k) with weight i: {Ai} is called the weight spectrum of the LBC.
  • Linear block codes ● On correction, an LBC considers the syndrome s = r·H^T. – Assume correction capabilities up to w(e)=t, and EC to be the set of correctable error patterns. – A syndrome table associates a unique si, over the 2^(n-k) possibilities, to a unique error pattern ei ∈ EC with w(ei)≤t. – If si = r·H^T, decode ĉ = r + ei. – Given an encoder G, estimate the information vector b̂ such that b̂·G = ĉ. ● If the number of correctable errors #(EC) < 2^(n-k), there are 2^(n-k) - #(EC) syndromes usable in detection, but not in correction. – At most, an LBC can correct 2^(n-k) error patterns.
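A sketch of syndrome-table decoding for the same hypothetical (7,4) code with t=1: each single-bit error pattern is stored under its syndrome and added back on reception:

```python
import numpy as np

P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])
G = np.hstack([np.eye(4, dtype=int), P])
H = np.hstack([P.T, np.eye(3, dtype=int)])
n = 7

# Syndrome table for all correctable patterns with w(e) <= t = 1
table = {tuple(np.zeros(3, dtype=int)): np.zeros(n, dtype=int)}
for i in range(n):
    e = np.zeros(n, dtype=int); e[i] = 1
    table[tuple(e @ H.T % 2)] = e

def decode(r):
    """Syndrome decoding: look up the error pattern for s = r·H^T and add it back."""
    s = tuple(r @ H.T % 2)
    return (r + table[s]) % 2

c = np.array([1, 0, 1, 1]) @ G % 2
r = c.copy(); r[5] ^= 1               # one channel error
print(np.array_equal(decode(r), c))   # True: the single error is corrected
```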
  • Linear block codes ● A w(e)≤t error correcting LBC has a probability of erroneous correction bounded by P(E) ≤ Σ_{i=t+1}^{n} C(n,i)·p^i·(1-p)^(n-i). – This is an upper bound, since not all the codewords are separated by the minimum distance of the code. ● Calculating the resulting P'b of an LBC is not an easy task, and it depends heavily on how the encoding is made through G. ● LBC codes are mainly used in detection tasks (ARQ).
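A small sketch evaluating that upper bound for illustrative values of n, t and the BSC crossover probability p:

```python
from math import comb

def p_error_bound(n, t, p):
    """Upper bound on erroneous correction: binomial tail probability beyond t errors."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(t + 1, n + 1))

print(p_error_bound(n=7, t=1, p=1e-2))   # bound for a t=1 code of length 7 on a BSC(0.01)
```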
  • Linear block codes ● Observe that both coding & decoding can be performed with low complexity hardware (combinational logic: gates). ● Examples of LBC: – Repetition codes – Single parity check codes – Hamming codes – Cyclic redundancy codes – Reed-Muller codes – Golay codes – Product codes – Interleaved codes ● Some of them will be examined in the lab.
  • Linear block codes ● An example of performance: RHam=4/7, RGol=1/2. 24
  • Convolutional codes ● A binary convolutional code (CC) is another class of linear channel code. ● The encoding can be described in terms of a finite state machine (FSM). – A CC can eventually produce sequences of infinite length. – A CC encoder has memory. ● General structure: [diagram: k input streams → backward logic (feedback, not mandatory) → MEMORY (ml bits for the l-th input) → forward logic (coded bits) → n output streams; systematic output not mandatory].
  • Convolutional codes ● The memory is organized as a shift register. – Number of positions for input l: memory ml. – ml=νl is the constraint length of the l-th input/register. – The register effects step by step delays on the input: recall discrete LTI systems theory. ● A CC encoder produces sequences, not just blocks of data. – Sequence-based properties vs. block-based properties. ● [Diagram: the l-th input stream, d_i^(l) at instant i, feeds a shift register of ml positions holding d_{i-1}^(l), ..., d_{i-ml}^(l), connected to the forward and backward logic.]
  • Convolutional codes ● Both forward and backward logic are boolean logic. – Very easy: each operation adds up (XOR) a number of memory positions, from each of the k inputs. ● The j-th output at instant i collects inputs from all the k registers: c_i^(j) = Σ_{l=1}^{k} Σ_{p=0}^{ml} g_{l,p}^(j)·d_{i-p}^(l). – g_{l,p}^(j), p=0,...,ml, is 1 when the p-th register position for the l-th input is added to get the j-th output. – The backward logic has the same structure.
  • Convolutional codes ● Parameters of a CC so far: – k input streams – n output streams – k shift registers with length ml each, l=1,...,k – νl=ml is the constraint length of the l-th register – m=maxl{νl} is the memory order of the code – ν=ν1+...+νk is the overall constraint length of the code ● A CC is denoted as (n,k,ν). – Its rate is R=k/n, where k and n usually take small values.
  • Convolutional codes ● The backward / forward logic may be specified in the form of generator sequences. – These sequences are the impulse responses of each output j wrt each input l: g_l^(j) = ( g_{l,0}^(j), ..., g_{l,ml}^(j) ). ● Observe that: – g_l^(j) = (1,0,...,0) connects the l-th input directly to the j-th output. – g_l^(j) = (0,...,1 (q-th),...,0) just delays the l-th input to the j-th output by q time steps.
  • Convolutional codes ● Given the presence of the shift register, the generator sequences are better denoted as generator polynomials: g_l^(j) = ( g_{l,0}^(j), ..., g_{l,ml}^(j) ) ≡ g_l^(j)(D) = Σ_{q=0}^{ml} g_{l,q}^(j)·D^q. ● We can then write: – g_l^(j) = (1,0,...,0) ≡ g_l^(j)(D) = 1 – g_l^(j) = (0,...,1 (q-th),...,0) ≡ g_l^(j)(D) = D^q – g_l^(j) = (1,1,0,...,0) ≡ g_l^(j)(D) = 1 + D
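As a small illustration (an addition to the slides): for a single-input feedforward encoder, each output stream is just the GF(2) convolution of the input with the corresponding generator sequence. The two generator polynomials below, 1+D+D² and 1+D², are a hypothetical choice for the example:

```python
import numpy as np

def gf2_conv(d, g):
    """Output stream = input d convolved with generator sequence g, over GF(2)."""
    return np.convolve(d, g) % 2

d = np.array([1, 0, 1, 1, 0])          # input bits d_i
g1 = np.array([1, 1, 1])               # g^(1)(D) = 1 + D + D^2 (hypothetical)
g2 = np.array([1, 0, 1])               # g^(2)(D) = 1 + D^2 (hypothetical)
c1, c2 = gf2_conv(d, g1), gf2_conv(d, g2)
print(c1, c2)   # the two coded output streams of a rate-1/2 feedforward CC
```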
  • Convolutional codes ● As all operations involved are linear, a binary CC is linear and the sequences produced constitute CC codewords. ● A feedforward CC (without backward logic - feedback) can be denoted in matrix form as G(D) = [ g_l^(j)(D) ], the k×n polynomial matrix with rows ( g_l^(1)(D), g_l^(2)(D), ..., g_l^(n)(D) ), l=1,...,k.
  • Convolutional codes ● If each input has a feedback logic given as g_l^(0)(D) = Σ_{q=0}^{ml} g_{l,q}^(0)·D^q, the code is denoted as G(D) = [ g_l^(j)(D) / g_l^(0)(D) ], the k×n rational matrix with rows ( g_l^(1)(D)/g_l^(0)(D), ..., g_l^(n)(D)/g_l^(0)(D) ), l=1,...,k.
  • Convolutional codes ● We can generalize the concept of parity-check matrix H(D). – An (n,k,ν) CC is fully specified by G(D) or H(D). ● Based on the matrix description, there is a good deal of linear tools for design, analysis and evaluation of a given CC. ● A regular CC can be described both as a (canonical) all-feedforward CC and through an equivalent feedback (recursive) CC. – Note that a recursive CC is related to an IIR filter. ● Even though k and n could be very small, a CC has a very rich algebraic structure. – This is closely related to the constraint length of the CC. – Each output bit is related to the present and past inputs via powerful algebraic methods.
  • Convolutional codes ● Given G(D), a CC can be classified as: – Systematic and feedforward. – Non-systematic and feedforward (NSC). – Systematic and recursive (RSC). – Non-systematic and recursive. ● RSC is a popular class of CC, because it provides an infinite output for a finite-weight input (IIR behavior). ● Each NSC can be converted straightforwardly to an RSC with similar error correcting properties. ● CC encoders are easy to implement with standard hardware: shift registers + combinational logic.
  • Convolutional codes ● We do not need to look into the algebraic details of G(D) and H(D) to study: – Coding – Decoding – Error correcting capabilities ● A CC encoder is an FSM! – The ν memory positions store a content (among 2^ν possible ones) at instant i-1: the coder is said to be at state s(i-1). – k input bits determine the shifting of the registers, and we get n related output bits. – The ν memory positions store a new content at instant i: the coder is said to be at state s(i).
  • Convolutional codes ● The finite-state behavior of the CC can be captured by the concept of trellis. – For any starting state, we have 2^k possible edges leading to a corresponding set of ending states. ● [Trellis section: starting state ss=s(i-1), s=1,...,2^ν; input bi=(bi,1...bi,k); output ci=(ci,1...ci,n); ending state se=s(i), e=1,...,2^ν.]
  • Convolutional codes ● The trellis illustrates the encoding process in 2 axes: – X-axis: time / Y-axis: states. ● Example for a (2,1,3) CC: [trellis diagram over states s1,...,s8 and instants i-1, i, i+1; each edge is labelled by its input bit (0 or 1) and the corresponding 2-bit output; memory: the same input yields different outputs depending on the state]. – For a finite-size input data sequence, a CC can be forced to finish at a known state (often 0) by adding terminating (dummy) bits. – Note that one section (e.g. i-1 → i) fully specifies the CC.
  • Convolutional codes ● The trellis description allows us – To build the encoder – To build the decoder – To get the properties of the code ● The encoder: [diagram: k input bits bi=(bi,1...bi,k) clocked into the registers, combinational logic (H(D)↔G(D)) producing the n output bits ci=(ci,1...ci,n); the state goes from ss=s(i-1) to se=s(i)].
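A minimal sketch of such an FSM encoder in Python, assuming a hypothetical rate-1/2 feedforward CC with memory 2 (4 states) and generator polynomials g^(1)(D)=1+D+D² and g^(2)(D)=1+D², the same illustrative choice as in the earlier sketch (not the 8-state example of the slides):

```python
def cc_encode(bits, g1=(1, 1, 1), g2=(1, 0, 1)):
    """Rate-1/2 feedforward convolutional encoder (memory 2); state = register contents."""
    state = [0, 0]                       # shift register: [d_{i-1}, d_{i-2}]
    out = []
    for b in bits:
        window = [b] + state             # [d_i, d_{i-1}, d_{i-2}]
        c1 = sum(g * d for g, d in zip(g1, window)) % 2
        c2 = sum(g * d for g, d in zip(g2, window)) % 2
        out += [c1, c2]                  # n = 2 output bits per input bit
        state = [b] + state[:-1]         # shift the register: new state s(i)
    return out

print(cc_encode([1, 0, 1, 1]))  # [1, 1, 1, 0, 0, 0, 0, 1]
```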
  • Convolutional codes ● The decoder is far more complicated: – Long sequences. – Memory: dependence on past states. ● In fact, CC were already well known before a practical, good method to decode them existed: the Viterbi algorithm. – It is a Maximum Likelihood Sequence Estimation (MLSE) algorithm with many applications. ● Problem: for a length N>>n sequence at the receiver side – There are 2^ν·2^(N·k/n) paths through the trellis to match with the received data. – Even if the coder starting state is known (often 0), there are still 2^(N·k/n) paths to walk through in a brute force approach.
  • Convolutional codes ● Viterbi algorithm setup: input bi → output ci(s(i-1),bi); start s(i-1) → end s(i)(s(i-1),bi). [Trellis section from instant i-1 to i, states s1,...,s8, with received data ri.] ● Key facts: – The encoding corresponds to a Markov chain model: P(s(i)) = P(s(i)|s(i-1))·P(s(i-1)). – The total likelihood P(r|b) can be factorized as a product of probabilities. – Given s(i-1) → s(i) under input bi, P(ri|s(i),s(i-1)) depends only on the channel kind (AWGN, BSC...). – The transition from s(i-1) to s(i) (linked in the trellis) depends on the probability of bi: P(s(i)|s(i-1)) = 2^-k if the source is iid. – P(s(i)|s(i-1)) = 0 if they are not linked in the trellis (finite state machine: deterministic).
  • Convolutional codes ● The total likelihood can be recursively calculated as: P(r|b) = Π_{i=1}^{N/n} P(ri|s(i),s(i-1))·P(s(i)|s(i-1))·P(s(i-1)). ● In the BSC(p), the observation metric would be: P(ri|s(i),s(i-1)) = P(ri|ci) = p^{w(ri+ci)}·(1-p)^{n-w(ri+ci)}. ● Maximum likelihood (ML) criterion: b̂ = arg max_b { P(r|b) }.
  • Convolutional codes ● We know that the brute force approach to the ML criterion is at least O(2^(N·k/n)). ● The Viterbi algorithm works recursively from 1 to N/n on the basis that – Many paths can be pruned out (transition probability = 0). – During the forward recursion, we only keep the paths with highest probability: the path probability goes easily to 0 from the moment a term metric × transition probability is very small. ● When the recursion reaches i=N/n, the surviving path guarantees the ML criterion (optimal for ML sequence estimation!). ● The Viterbi algorithm complexity goes down to O(N·2^(2ν)).
  • Convolutional codes ● The algorithm recursive rule is: V_j^(0) = P(s(0)=sj); V_j^(i) = P(ri|s(i)=sj, s(i-1)=smax)·max_{sl}{ P(s(i)=sj|s(i-1)=sl)·V_l^(i-1) }, where smax is the state achieving the maximum. – V_l^(i-1) is the probability of the most probable state sequence corresponding to the i-1 previous observations. ● {V_j^(i)} stores the most probable state sequence wrt the observation r. [Trellis diagram over states s1,...,s8 and instants i-1, i, i+1: at each state, only the incoming path with MAX metric survives.] ● Note that we may better work with logs (products ↔ additions); the criterion remains the same.
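A hard-decision Viterbi sketch for the same hypothetical rate-1/2, memory-2 encoder used above (4 trellis states rather than the 8-state example of the slides); over the BSC, maximizing the likelihood is equivalent to minimizing the accumulated Hamming distance, which is the metric used here:

```python
import itertools

G1, G2 = (1, 1, 1), (1, 0, 1)   # same hypothetical rate-1/2 generators as the encoder sketch

def branch(state, b):
    """One trellis edge: from 'state' with input bit b, return (next state, output bits)."""
    window = (b,) + state
    c1 = sum(g * d for g, d in zip(G1, window)) % 2
    c2 = sum(g * d for g, d in zip(G2, window)) % 2
    return (b,) + state[:-1], (c1, c2)

def viterbi_decode(r):
    """Hard-decision Viterbi decoding over the BSC: minimize the Hamming path metric."""
    states = list(itertools.product((0, 1), repeat=2))
    metric = {s: 0 if s == (0, 0) else float("inf") for s in states}   # start at state 0
    paths = {s: [] for s in states}
    for i in range(0, len(r), 2):
        ri = r[i:i + 2]
        new_metric = {s: float("inf") for s in states}
        new_paths = {s: [] for s in states}
        for s in states:
            if metric[s] == float("inf"):
                continue
            for b in (0, 1):
                nxt, c = branch(s, b)
                m = metric[s] + sum(x != y for x, y in zip(ri, c))
                if m < new_metric[nxt]:             # keep only the surviving (best) path
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = min(states, key=lambda s: metric[s])     # best metric at the last trellis section
    return paths[best]

coded = [1, 1, 1, 0, 0, 0, 0, 1]   # output of the earlier encoder sketch for input 1 0 1 1
coded[2] ^= 1                      # flip one coded bit to emulate a channel error
print(viterbi_decode(coded))       # [1, 0, 1, 1]: the single error is corrected
```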
  • Convolutional codes ● Note that we have considered the algorithm when the demodulator yields hard outputs: – ri is a vector of n estimated bits (BSC(p) equivalent channel). ● In AWGN, we can do better to decode a CC: – We can provide soft (probabilistic) estimations for the observation metric. – For an iid source, we can easily get an observation transition metric based on the probability of each bi,l=0,1, l=1,...,k, associated to a possible transition. – There is a gain of around 2 dB in Eb/N0. – LBC decoders can also accept soft inputs (non syndrome-based decoders). – We will examine an example of soft decoding of CC in the lab.
  • Convolutional codes ● We are now familiar with the encoder and the decoder: – Encoder: FSM (registers, combinational logic). – Decoder: Viterbi algorithm (for practical reasons, suboptimal adaptations are usually employed). ● But what about performance? ● First... – CC are mainly intended for FEC, not for ARQ schemes. – In a long sequence (=CC codeword), the probability of having at least one error is very high... – And... are we going to retransmit the whole sequence?
  • Convolutional codes ● Given that we truncate the sequence to N bits and the CC is linear: – We may analyze the system as an equivalent (N, N·k/n) LBC. – But... the equivalent matrices G and H would not be practical. ● Remember the FSM: we can locate error loops in the trellis. [Trellis from instant i to i+3: a path for b and a diverging/remerging path for b+e.]
  • Convolutional codes ● The same error loop may occur irrespective of s(i-1) and b. [Trellis from instant i to i+3: identical b / b+e error loops starting from different states.]
  • Convolutional codes ● Examining the minimal length loops and taking into account this uniform error property, we can get the dmin of a CC. – For a CC forced to end at the 0 state for a finite input data sequence, dmin is called dfree. ● We can extract a lot of information by building an encoder state diagram: error loops, codeword weight spectrum... [Diagram of a (2,1,3) CC, from Lin & Costello (2004).]
  • Convolutional codes ● With a fair amount of algebra, related to the FSM, modified encoder state diagrams and so on, it is possible to get an upper bound for optimal MLSE decoding (BPSK in AWGN): Pb ≤ Σ_d Bd·erfc( √(d·R·Eb/N0) ). – Bd is the total number of nonzero information bits associated with CC codewords of weight d, divided by the number of information bits k per unit time... A lot of algebra behind... ● There are easier, suboptimal ways to decode a CC, and performance will vary accordingly. ● A CC may be punctured to match other rates higher than R=k/n: performance-rate trade-off.
  • Convolutional codes ● Performance examples with BPSK using ML bounds. 53
  • Turbo codes ● Canonically, turbo codes (TC) are parallel concatenated convolutional codes (PCCC). [Diagram: k input streams b feed CC1 directly and CC2 through a block marked "?" (we will see this is a key element...); n=n1+n2 output streams, c = c1 ∪ c2; rate R = k/(n1+n2).] ● Coding concatenation has been known and employed for decades, but TC added a joint efficient decoding. – An example of concatenated coding with independent decoding is the use of ARQ + FEC hybrid strategies (CRC + CC).
  • Turbo codes ● We have seen that standard CC decoding with Viterbi algorithm relied on MLSE criterion. – ● This is optimal when binary data at CC input is iid. For CC, we also have decoders that provide probabilistic (soft) outputs. – They convert a priori soft values + channel output soft estimations into updated a posteriori soft values. – They are optimal from the maximum a posteriori (MAP) criterion point of view. – They are called soft input-soft output (SISO) decoders. 56
  • Turbo codes ● What's in a SISO? [Diagram: the soft demodulated values from the channel (r) and the a priori probabilities (APR), here P(bi=b)=1/2, enter the SISO (for a CC); it outputs the a posteriori probabilities (APP) P(bi=b|r), updated with the channel information; sketch of the probability density function of bi over {0,1}.] ● Note that the SISO works on a bit by bit basis, but produces a sequence of APP's.
  • Turbo codes ● The algorithm inside the SISO is some suboptimal version of the MAP BCJR algorithm. – BCJR computes the APP values through a forward-backward dynamics → it works over finite length data blocks, not over (potentially) infinite length sequences (like pure CCs). – BCJR works on a trellis: recall transition metrics, transition probabilities and so on. – Assume the block length is N: the trellis starts at s(0) and ends at s(N). ● Quantities involved (remember, ri has n components for an (n,k,ν) CC): – Forward term: αi(j) = P( s(i)=sj, r1,...,ri ). – Backward term: βi(j) = P( ri+1,...,rN | s(i)=sj ). – Transition: γi(j,k) = P( ri, s(i)=sj | s(i-1)=sk ).
  • Turbo codes ● BCJR algorithm in action: – Forward step, i=1,...,N: α0(j) = P(s(0)=sj); αi(j) = Σ_{k=1}^{2^ν} γi(j,k)·αi-1(k). – Backward step, i=N-1,...,0: βN(j) = P(s(N)=sj); βi(j) = Σ_{k=1}^{2^ν} βi+1(k)·γi+1(k,j). – Compute the joint probability sequence, i=1,...,N: P( s(i-1)=sj, s(i)=sk, r ) = βi(k)·γi(k,j)·αi-1(j).
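A compact numerical sketch of the forward-backward recursions only (the gamma values here are random placeholders rather than a real trellis metric, and the index convention gamma[i, j, k] = transition term from state j at step i to state k at step i+1 is a choice made for this sketch, not the slides' notation):

```python
import numpy as np

rng = np.random.default_rng(0)
N, S = 4, 2                                   # block length and number of states (toy sizes)
gamma = rng.random((N, S, S))                 # placeholder transition terms gamma[i, j, k]

alpha = np.zeros((N + 1, S))
alpha[0, 0] = 1.0                             # trellis assumed to start at state 0
for i in range(N):                            # forward step
    alpha[i + 1] = alpha[i] @ gamma[i]

beta = np.zeros((N + 1, S))
beta[N, :] = 1.0                              # no constraint assumed on the final state
for i in range(N - 1, -1, -1):                # backward step
    beta[i] = gamma[i] @ beta[i + 1]

# Joint terms P(state j at i, state k at i+1, r) ∝ alpha[i, j] * gamma[i, j, k] * beta[i+1, k]
joint = alpha[:N, :, None] * gamma * beta[1:, None, :]
print(joint.shape)                            # (N, S, S): one term per trellis edge
```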
  • Turbo codes ● Finally, the APP's can be calculated as: P(bi=b|r) = (1/p(r))·Σ_{s(i-1)→s(i): bi=b} P( s(i-1)=sj, s(i)=sk, r ). ● Decision criterion based on these APP's: log[ P(bi=1|r) / P(bi=0|r) ] = log[ Σ_{s(i-1)→s(i): bi=1} P(s(i-1)=sj, s(i)=sk, r) / Σ_{s(i-1)→s(i): bi=0} P(s(i-1)=sj, s(i)=sk, r) ]; if > 0 decide b̂i=1, if < 0 decide b̂i=0. – The modulus of this log-ratio is the reliability of the decision.
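A tiny sketch of the decision rule above, assuming app1 and app0 are the two a posteriori probabilities of one bit (illustrative numbers):

```python
import math

def llr_decision(app1, app0):
    """Log-ratio of APP's: the sign gives the bit decision, the magnitude its reliability."""
    llr = math.log(app1 / app0)
    return (1 if llr > 0 else 0), abs(llr)

print(llr_decision(0.9, 0.1))    # (1, 2.197...): decide 1 with high reliability
print(llr_decision(0.45, 0.55))  # (0, 0.200...): decide 0 with low reliability
```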
  • Turbo codes ● How do we get γi(j,l)? ● This probability takes into account: – The restrictions of the trellis (CC). – The estimations from the channel. ● γi(j,l) = P( ri, s(i)=sj | s(i-1)=sl ) = p( ri | s(i)=sj, s(i-1)=sl )·P( s(i)=sj | s(i-1)=sl ). – The second factor is 0 if the transition is not possible, and 1/2^k if it is possible (binary trellis, k inputs). – In AWGN, for unmodulated ci,m, the first factor is (1/(2πσ²)^(n/2))·exp( −Σ_{m=1}^{n} (ri,m − ci,m)² / (2σ²) ).
  • Turbo codes ● Idea: what about feeding the APP values as APR values for another decoder whose coder had the same inputs? [Diagram: P(bi=b|r1) from the SISO of CC1 enters the SISO for CC2 as APR, together with r2, producing P(bi=b|r2).] ● This will happen under some conditions.
  • Turbo codes ● APP's from the first SISO used as APR's for the second SISO increase the updated APP's reliability iff – the APR's are uncorrelated wrt the channel estimations for the second decoder. – This is achieved by permuting the input data for each encoder. ● [Diagram: k input streams b feed CC1 directly and CC2 through an interleaver (permutor) Π, producing d; n=n1+n2 output streams, c=c1∪c2; rate R=k/(n1+n2).]
  • Turbo codes ● The interleaver preserves the data (b), but changes its position within the second stream (d): d_{π(i)} = b_i. – Note that this compels the TC to work with blocks of N=size(Π) bits. – The decoder has to know the specific interleaver used at the encoder. ● [Diagram: b1 b2 b3 b4 ... bN mapped to the permuted positions d_{π(1)}, d_{π(2)}, ..., d_{π(N)}.]
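A minimal sketch of an interleaver/deinterleaver pair using a pseudo-random permutation (the permutation and its fixed seed are illustrative only; practical TC interleavers are designed much more carefully):

```python
import numpy as np

N = 8
rng = np.random.default_rng(1)
perm = rng.permutation(N)          # pi: position i of b goes to position perm[i] of d

def interleave(b):
    d = np.empty_like(b)
    d[perm] = b                    # d_{pi(i)} = b_i
    return d

def deinterleave(d):
    return d[perm]                 # recovers b_i = d_{pi(i)}

b = np.arange(N)
d = interleave(b)
print(d, deinterleave(d))          # deinterleave(interleave(b)) == b
```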
  • Turbo codes ● The mentioned process is applied iteratively (l=1,...). – Iterative decoder → this may be a drawback, since it adds latency (delay). ● [Diagram: r1 from the channel feeds SISO 1 and r2 feeds SISO 2; APP1(l) goes through Π to become APR2(l); APP2(l) goes through Π⁻¹ to become APR1(l+1); the initial APR1(l=0) is taken with P(bi=b)=1/2.] – Note the feedback connection: it is the same principle as in turbo engines (that's why they are called "turbo"!).
  • Turbo codes ● When the interleaver is adequately chosen and the CC's employed are RSC, the typical BER behavior is – Note the two distinct zones: waterfall region / error floor. 83
  • Turbo codes ● The location of the waterfall region can be analyzed by the so-called density evolution method – Based on the exchange of mutual information between SISO's. ● The error floor can be lower bounded by the minimum Hamming distance of the TC – Contrary to CC's, TC relies on reducing multiplicities rather than just trying to increase minimum distance. ● Pb^floor > (wmin·Mmin/N)·erfc( √(dmin·R·Eb/N0) ), where – wmin is the Hamming weight of the information error associated with the minimum distance, – Mmin is the error multiplicity (a low value!), – the factor 1/N is the interleaver gain (only if recursive CC's are used!).
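A small sketch that evaluates this floor bound; all the parameter values below are placeholders chosen for illustration, not taken from any particular TC design:

```python
import math

def tc_error_floor(w_min, M_min, N, d_min, R, ebn0_db):
    """Lower bound on the TC error floor: (w_min*M_min/N) * erfc(sqrt(d_min*R*Eb/N0))."""
    ebn0 = 10 ** (ebn0_db / 10)
    return (w_min * M_min / N) * math.erfc(math.sqrt(d_min * R * ebn0))

# Hypothetical values: dmin=10, wmin=2, Mmin=1, interleaver size N=1024, rate 1/3, 1.5 dB
print(tc_error_floor(w_min=2, M_min=1, N=1024, d_min=10, R=1/3, ebn0_db=1.5))
```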
  • Turbo codes ● Examples of 3G TC. Note that TC's are intended for FEC... 88
  • Low Density Parity Check Codes ● LDPC codes are just another kind of channel code derived from less complex ones. – While TC's were initially an extension of CC systems, LDPC codes are an extension of the concept of binary LBC, but they are not exactly our known LBC. ● Formally, an LDPC code is an LBC whose parity check matrix is large and sparse. – Almost all matrix elements are 0! – Very often, the LDPC parity check matrices are randomly generated, subject to some constraints on sparsity... – Recall that LBC relied on extremely powerful algebra related to carefully and well chosen matrix structures.
  • Low Density Parity Check Codes ● Formally, a (ρ,γ)-regular LDPC code is defined as the null space of a J×n parity check matrix H that meets these constraints: a) Each row contains ρ 1's. b) Each column contains γ 1's. c) λ, the number of 1's in common between any two columns, is 0 or 1. d) ρ and γ are small compared with n and J. ● These properties give name to this class of codes: their matrices have a low density of 1's. ● The density r of H is defined as r = ρ/n = γ/J.
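A short sketch that checks the regularity constraints on a candidate H; the toy 4×4 circulant matrix below is only meant to exercise the checks (its ρ and γ are not small compared with n and J, as a real LDPC matrix would require):

```python
import numpy as np
from itertools import combinations

def check_regular(H):
    """Report row weights, column weights and the max 1-overlap between column pairs."""
    row_w = {int(w) for w in H.sum(axis=1)}
    col_w = {int(w) for w in H.sum(axis=0)}
    lam = max(int(H[:, a] @ H[:, b]) for a, b in combinations(range(H.shape[1]), 2))
    return row_w, col_w, lam

# Toy circulant example: every row and column has weight 2, column overlaps are 0 or 1
H = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 1]])
print(check_regular(H))   # ({2}, {2}, 1): (2,2)-regular with lambda <= 1
```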
  • Low Density Parity Check Codes ● Example of a (4,3)-regular LDPC parity check matrix. [15×20 matrix H of 0's and 1's, with 4 ones per row and 3 ones per column.] – This H defines a (20,7) LBC! – r = 4/20 = 3/15 = 0.2. – Sparse! – λ = 0, 1.
  • Low Density Parity Check Codes ● Note that the J rows of H are not necessarily linearly independent over GF(2). – To determine the dimension k of the code, it is mandatory to find the row rank of H = n-k < J. – That's the reason why, in the previous example, H defined a (20,7) LBC instead of a (20,5) LBC as could be expected! ● The construction of large H for LDPC with high rates and good properties is a complex subject. – Some methods rely on smaller Hi used as building blocks, plus random permutations or combinatorial manipulations; resulting matrices with bad properties are discarded. – Other methods rely on finite geometries and a lot of algebra.
  • Low Density Parity Check Codes ● LDPC codes yield performances equal to or even better than TC's, but without the problem of their relatively high error floor. – Both LDPC codes and TC's are capacity approaching codes. ● As in the case of TC, their interest is in part related to the fact that – The encoding can be easily done (even when H or G are large, the low density of 1's reduces the complexity of the encoder). – At the decoder side, there are powerful algorithms that can take full advantage of the properties of the LDPC code.
  • Low Density Parity Check Codes ● There are several algorithms to decode LDPC codes: – Hard decoding. – Soft decoding. – Mixed approaches. ● We are going to examine two important instances thereof: – Majority-logic (MLG) decoding; hard decoding, the simplest one (lowest complexity). – Sum-product algorithm (SPA); soft decoding, best error performance (but high complexity!). ● Key concepts: Tanner graphs & belief propagation.
  • Low Density Parity Check Codes ● MLG decoding: hard decoding; r = c + e → received word. – The simplest instance of MLG decoding is the decoding of a repetition code by the rule "choose 0 if 0's are dominant, 1 otherwise". ● Given a (ρ,γ)-regular LDPC code, for every bit position i=1,...,n, there is a set of γ rows Ai = { h_1^(i), ..., h_γ^(i) } that have a 1 in position i, and do not have any other common 1 position among them...
  • Low Density Parity Check Codes ● We can form the set of syndrome equations Si = { s_j^(i) = r·h_j^(i)T = e·h_j^(i)T, h_j^(i) ∈ Ai, j=1,...,γ }. ● Si gives a set of γ checksums orthogonal on ei. ● ei is decoded as 1 if the majority of the checksums give 1; 0 in the opposite case. ● Repeating this for all i, we estimate ê, and ĉ = r + ê. – Correct decoding of ei is guaranteed if there are fewer than γ/2 errors in e.
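A minimal sketch of the majority-logic rule for a single bit position, using three hypothetical parity checks orthogonal on position 0 (so γ=3 here, and a single error falls within the γ/2 guarantee):

```python
import numpy as np

# Three parity checks orthogonal on bit position 0 (hypothetical, for illustration):
# each has a 1 in position 0 and no other 1-position in common with the others.
A0 = np.array([[1, 1, 1, 0, 0, 0, 0],
               [1, 0, 0, 1, 1, 0, 0],
               [1, 0, 0, 0, 0, 1, 1]])

def mlg_bit(r, A):
    """Decide the error value of one bit position by majority vote of its checksums."""
    checksums = A @ r % 2
    return int(np.sum(checksums) > len(checksums) / 2)   # 1 means "flip this bit"

c = np.zeros(7, dtype=int)          # assume the all-zero codeword was sent
e = np.array([1, 0, 0, 0, 0, 0, 0]) # single error, hitting position 0
print(mlg_bit((c + e) % 2, A0))     # 1: bit 0 is corrected (all checksums fire)
e = np.array([0, 0, 1, 0, 0, 0, 0]) # single error elsewhere
print(mlg_bit((c + e) % 2, A0))     # 0: bit 0 left untouched (only one checksum fires)
```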
  • Low Density Parity Check Codes ● Tanner graphs. Example for a (7,3) LBC. [Bipartite graph: variable nodes or code-bit vertices c1,...,c7 on one side, check nodes or check-sum vertices s1,...,s7 on the other, with edges between them; the absence of short loops is necessary for iterative decoding.] ● It is a bipartite graph with interesting properties for decoding. – A variable node is connected to a check node iff the corresponding code bit is checked by the corresponding parity sum equation.
  • Low Density Parity Check Codes ● Based on the Tanner graph of an LDPC code, it is possible to make iterative soft decoding (SPA). ● SPA is performed by belief propagation (which is an instance of a message passing algorithm). ● [Tanner graph with variable nodes c1,...,c7 and check nodes s1,...,s7: "messages" (soft values) are passed to and from related variable and check nodes; N(c5) denotes the check node neighbors of variable node c5, and N(s7) the variable node neighbors of check node s7.] ● This process, applied iteratively and under some rules, yields P(ci|λ), where λ are the soft values from the channel.
  • Low Density Parity Check Codes ● If we get P(ci|λ), we have an estimation of the codeword sent, ĉ. ● The decoding aims at calculating this through the marginalization P(ci|λ) = Σ_{c': c'i=ci} P(c'|λ). ● A brute-force approach for LDPC is impractical, hence the iterative solution through SPA. ● Messages interchanged at step l: – From variable node to check node: μ^(l)_{ci→sj}(ci=c) = α^(l)_{i,j}·P(ci=c|λi)·Π_{sk∈N(ci), sk≠sj} μ^(l-1)_{sk→ci}(ci=c). – From check node to variable node: μ^(l)_{sj→ci}(ci=c) = Σ_{c: ck∈N(sj), ck≠ci} P(sj=0|ci=c, c)·Π_{c'k∈N(sj), c'k≠ci} μ^(l)_{c'k→sj}(c'k=ck).
  • Low Density Parity Check Codes ● Note that: – α^(l)_{i,j} is a normalization constant. – P(ci=c|λi) plugs into the SPA the values from the channel → it is the APR info. ● The APP value is P(ci=c|λ) = β^(l)_i·P(ci=c|λi)·Π_{sj∈N(ci)} μ^(l)_{sj→ci}(ci=c), with β^(l)_i a normalization constant. ● Based on the final values of P(ci|λ), a candidate ĉ is chosen and ĉ·H^T is tested. If 0, the information word is decoded.
  • Low Density Parity Check Codes ● LDPC BER performance examples (DVB-S2 standard): short frames, n=16200, and long frames, n=64800.
  • Coded modulations ● We have considered up to this point channel coding and decoding isolated from the modulation process. – Codewords feed any kind of modulator. – Symbols go through a channel (medium). – The info recovered from the received modulated symbols is fed to the suitable channel decoder, as hard decisions or as soft values (probabilistic estimations). ● The abstractions of BSC(p) (hard demodulation) or soft values from AWGN ( ∝ exp[-|ri-sj|²/(2σ²)] ) -and the like for other cases- are enough for such an approach. ● Note that there are other important channel kinds not considered so far.
  • Coded modulations ● Coded modulations are systems where channel coding and modulation are treated as a whole. – – ● Joint coding/modulation. Joint decoding/demodulation. This offers potential advantages (recall the improvements made when the demodulator outputs more elaborated information -soft values vs. hard decisions). – ● We combine gains in BER with spectral efficiency! As a drawback, the systems become more complex. – More difficult to design and analyze. 123
  • Coded modulations ● TCM (trellis coded modulation). – Ideally, it combines a CC encoder and the modulation symbol mapper. ● [Trellis over states s1,...,s8 and instants i-1, i, i+1, with edges labelled directly by modulation symbols (mk, mj, ...) instead of coded bits.]
  • Coded modulations ● If the modulation symbol mapper is well matched to the CC trellis, and the decoder is accordingly designed to take advantage of it, – TCM provides high spectral efficiency. – TCM can be robust in AWGN channels, and against fading and multipath effects. ● In the 80's, TCM became the standard for telephone line data modems. – No other system could provide better performance over the twisted pair cable before the introduction of DMT and ADSL. ● However, the flexibility of providing separate channel coding and modulation subsystems is still preferred nowadays. – Under the concept of Adaptive Modulation & Coding (ACM).
  • Coded modulations ● Another possibility of coded modulation, evolved from TCM and from the concatenated coding & iterative decoding framework, is Bit-Interleaved Coded Modulation (BICM). – What if we provide an interleaver Π between the channel coder (normally a CC) and the modulation symbol mapper? – A soft demapper can also accept APR values and update its soft outputs as APP's in an iterative process! ● [Diagram: CC → Π → symbol mapper; at the receiver, the soft demapper takes the channel corrupted outputs and the APR values (interleaved from the CC SISO) and produces APP values (to the interleaver and the CC SISO).]
  • Coded modulations ● Like TCM, BICM has especially good behavior (even better!) – In channels where spectral efficiency is required. – In dispersive channels (multipath, fading). – Iterative decoding yields a steep waterfall region. – Being a serially concatenated system, the error floor is very low (contrary to parallel concatenated systems). ● BICM has already found applications in standards such as DVB-T2. ● The drawback is the higher latency and complexity of the decoding.
  • Coded modulations ● Examples of BICM. 128
  • References ● ● S. Lin, D. Costello, ERROR CONTROL CODING, Prentice Hall, 2004. S. B. Wicker, ERROR CONTROL SYSTEMS FOR DIGITAL COMMUNICATION AND STORAGE, Prentice Hall, 1995. 129