Frank Kienle
Architectures for baseband signal processing
of wireless communications systems
2
Communications vs. VLSI constraints
Communications constraints:
§  Minimize transmit power
§  Minimize redundancy
§  Quality of service guaranteed
Implementation (VLSI) constraints:
§  Minimize processing power
§  Minimize chip area/costs
§  Quality of service guaranteed
[Figure: the desired decoder at the center of six competing axes: quality of service, redundancy (bandwidth), transmit power, chip area/costs, processing power, throughput/latency]
3
Exercise
Current high-end handsets have a power consumption of 1 W, e.g.
during a WCDMA voice call. The battery has an electric charge of 1400 mAh.
The baseband processor operates at 1.5 V, f_cyc = 300 MHz, and has to
process 20 GOP/s in active voice call mode.
§  What is the lifetime of the battery assuming an active voice call?
§  What is the average energy per operation?
§  On average, how many operations are processed per cycle?
§  Is it possible to use a vector DSP core that needs 20 pJ per operation
for this task?
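One possible way to work through the numbers is sketched below. It rests on two assumptions that the slide does not state: the battery is taken to deliver its charge at the same 1.5 V as the baseband supply, and the full 1 W is attributed to the 20 GOP/s baseband workload. With other assumptions the results change.

```python
# Worked sketch of the exercise. Assumptions (NOT given on the slide):
# - the battery voltage equals the 1.5 V baseband supply
# - the full 1 W is spent on the 20 GOP/s baseband workload

POWER_W = 1.0            # handset power during a voice call
CHARGE_AH = 1.4          # 1400 mAh
SUPPLY_V = 1.5
F_CYC_HZ = 300e6
OPS_PER_S = 20e9         # 20 GOP/s
DSP_J_PER_OP = 20e-12    # vector DSP: 20 pJ per operation

current_a = POWER_W / SUPPLY_V             # average current drawn
lifetime_h = CHARGE_AH / current_a         # battery lifetime in hours
energy_per_op_j = POWER_W / OPS_PER_S      # average energy budget per operation
ops_per_cycle = OPS_PER_S / F_CYC_HZ       # required parallelism
dsp_power_w = OPS_PER_S * DSP_J_PER_OP     # power of the vector DSP for this load

print(f"lifetime        : {lifetime_h:.2f} h")
print(f"energy per op   : {energy_per_op_j * 1e12:.1f} pJ")
print(f"ops per cycle   : {ops_per_cycle:.1f}")
print(f"vector DSP power: {dsp_power_w:.2f} W")
```

Under these assumptions the vector DSP stays below the per-operation budget, but it would still take a large share of the total power, which also has to cover everything else in the handset.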
4
Conclusion Input LLRs
The highest priority in hardware design is to fulfill the specification under realistic
conditions.
Proper input quantization is essential to avoid degrading the communication
performance.
An implementation of an ‘optimal’ algorithm can lead to entirely different results
in a possible hardware realization.
Robustness of an algorithm is key to a successful hardware integration.
For baseband processing: look for algorithms that are SNR independent!
5
Maximum Likelihood decoding
Maximum likelihood (ML) estimation yields an entire sequence (the most
likely codeword) as its result.
Whenever possible, we use a decoding algorithm that solves the maximum likelihood
criterion exactly.
§  Convolutional codes → solved by the Viterbi algorithm
§  Small block codes → solved by brute force, testing all codewords
However, many codes have too large a code space to solve the ML
criterion directly.
Solution: divide-and-conquer methods
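The brute-force case can be illustrated with a minimal sketch. The (7,4) Hamming code, the LLR sign convention (positive LLR favors bit 0) and the example values are assumptions chosen for illustration; none of them come from the slides. For BPSK over an AWGN channel, picking the codeword with the highest correlation to the channel LLRs is equivalent to the ML decision.

```python
from itertools import product

# Hypothetical (7,4) Hamming parity check matrix, used only as a small example code.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def is_codeword(x):
    """True if every parity check of H is satisfied (mod 2)."""
    return all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H)

# Enumerate the whole code space: feasible only because there are just 16 codewords.
CODEWORDS = [list(x) for x in product((0, 1), repeat=7) if is_codeword(x)]

def ml_decode(llrs):
    """Return the codeword with the highest correlation to the channel LLRs.

    Assumed convention: a positive LLR favors bit 0.
    """
    def metric(codeword):
        return sum((1 - 2 * bit) * llr for bit, llr in zip(codeword, llrs))
    return max(CODEWORDS, key=metric)

# Example: position 2 has a weak LLR with the wrong sign.
received = [-2.0, 1.5, 0.4, -3.0, 2.0, -1.0, 0.3]
decoded = ml_decode(received)
print(decoded)
```

The loop over `CODEWORDS` grows as 2^K with the number of information bits K, which is why this approach stops working for realistic block lengths.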
6
Symbol-by-symbol MAP
Divide and conquer splits the problem into multiple sub-problems that
can be solved independently. The overall solution is approached by an
iterative exchange of the sub-solutions.
The best approach for iterative problem solving is to determine a
confidence estimate for each variable (bit).
→ symbol-by-symbol maximum a posteriori (MAP) criterion.
§  Turbo decoders split the overall problem into two parts
§  LDPC decoders split the overall problem into M parts
7
Building Blocks
Arithmetic units
Adder, multiplier, MAC, shifter, comparator, ALU etc.
§  Clear order of complexity: avoid, e.g., divisions if possible
Memory blocks for the storage of data
Register files, shift registers, FIFOs, RAMs, ROMs, DRAMs.
§  SRAM: we can access only one data word per clock cycle
§  Access conflicts!
Interconnection units
Switches, bus, arbiter, network-on-chip
§  Structure of the barrel shifter (used in LDPC and turbo decoders)
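A barrel shifter is commonly built from log2(N) stages, where stage s either passes the word through or rotates it by 2^s positions, controlled by bit s of the shift amount. The following is a behavioral sketch of that structure, not a description of the circuit shown in the lecture; the function name and the power-of-two restriction are assumptions of this example.

```python
def barrel_shift(word, shift):
    """Cyclically rotate `word` (a list) to the left by `shift` positions.

    Models a logarithmic barrel shifter: stage s rotates by 2**s positions
    if bit s of the shift amount is set, otherwise it passes the data through.
    """
    n = len(word)
    assert n > 0 and n & (n - 1) == 0, "this sketch assumes a power-of-two width"
    shift %= n
    stage = 0
    while (1 << stage) < n:
        if (shift >> stage) & 1:
            step = 1 << stage
            word = word[step:] + word[:step]
        stage += 1
    return word

print(barrel_shift(list("ABCDEFGH"), 3))
```

For a width of 8 only three multiplexer stages are needed, instead of one 8-to-1 multiplexer per output.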
8
Memory Hierarchy
SRAM memory can be generated in nearly any shape (VLSI):
§  A memory block can be composed of multiple smaller memories
§  Changes the area and average power
[Figure: three memory organizations: one block of 4096 words x 8 bit, four blocks of 1024 words x 8 bit, and one block of 1024 words x 32 bit]
Average write power is given with all data input pins and all addresses switching.

Shape (WordDepth x WordWidth) | Avg. write power (uW/MHz) | Area (mm2)                   | Comment
4096 x 8                      | 7.0857                    | 0.043824                     |
1024 x 8 (four blocks)        | 5.229                     | single: 0.01481, all: 0.0592 | Larger but less average power
1024 x 32                     | 12.8716                   | 0.039683                     | Smaller and less average power if access pattern is possible
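To compare the three organizations it can help to normalize the write power to the amount of data written. Since 1 uW/MHz equals 1 pJ per access, the sketch below computes the energy per written byte. The assignment of the power and area values to the three shapes is read from the flattened slide table and should be treated as an assumption, as should the premise that only one of the four 1024 x 8 blocks is active per access.

```python
# Values as read from the slide table (row assignment is an assumption).
# 1 uW/MHz corresponds to 1 pJ per write access.
MEMORIES = {
    "4096 x 8":        {"bits": 8,  "pj_per_write": 7.0857,  "area_mm2": 0.043824},
    "4 x (1024 x 8)":  {"bits": 8,  "pj_per_write": 5.229,   "area_mm2": 0.0592},
    "1024 x 32":       {"bits": 32, "pj_per_write": 12.8716, "area_mm2": 0.039683},
}

def pj_per_byte(entry):
    """Energy per written byte, assuming every bit of the word carries useful data."""
    return entry["pj_per_write"] / (entry["bits"] / 8)

for name, entry in MEMORIES.items():
    print(f"{name:16s} {pj_per_byte(entry):7.4f} pJ/byte  {entry['area_mm2']:.6f} mm2")
```

The wide 1024 x 32 memory only reaches its low energy per byte if the application can actually consume four neighboring bytes in one access, which is the access-pattern condition mentioned in the table.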
9
Memories (SRAM) first summary
1.  Often we can trade off area vs. power just by changing the memory
hierarchy.
2.  However, the application determines the access pattern and thus
constrains the memory hierarchy.
Access pattern:
the sequence in time and space (address) of reading/writing multiple data words
Example of a ‘difficult’ access pattern:
§  Read 100 words in one clock cycle, each from a different (random) address
10
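Problem of parallelizing random interleavers
[Figure: Parallel Processing / Parallel Interleaver. Sixteen values A..P are stored in four memories: the first holds A, E, I, M (addresses 0, 4, 8, 12), the second B, F, J, N (1, 5, 9, 13), the third C, G, K, O (2, 6, 10, 14) and the fourth D, H, L, P (3, 7, 11, 15). After interleaving, the four memories hold L, C, A, H; B, O, F, G; I, J, D, P; and E, N, M, K.]
Addr   Interl. Addr.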
0 8
1 1
2 4
3 10
4 3
5 9
6 13
7 12
8 2
9 6
10 15
11 0
12 11
13 7
14 5
15 14
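The difficulty can be made concrete with a short check of the table above. The sketch assumes that four values with consecutive addresses are processed per clock cycle and that a value with address a is stored in memory a mod 4, which is how the figure appears to be organized; both points are assumptions of this example.

```python
# Interleaver table from the slide: PI[addr] = interleaved address.
PI = [8, 1, 4, 10, 3, 9, 13, 12, 2, 6, 15, 0, 11, 7, 5, 14]
P = 4  # number of parallel memories / processing units (assumed mapping: addr % P)

def find_conflicts(pi, p):
    """List the clock cycles in which two parallel writes hit the same memory."""
    conflicts = []
    for cycle in range(len(pi) // p):
        targets = [pi[cycle * p + lane] for lane in range(p)]
        banks = [target % p for target in targets]
        if len(set(banks)) < p:
            conflicts.append((cycle, targets, banks))
    return conflicts

for cycle, targets, banks in find_conflicts(PI, P):
    print(f"cycle {cycle}: targets {targets} -> memories {banks}")
```

With this table every one of the four cycles contains at least two writes to the same memory, so a single-port SRAM per memory cannot serve the accesses in one clock cycle.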
11
Viterbi Algorithm (functional units)
The Viterbi algorithm solves the ML criterion.
At each time step, for each state, we:
§  Add the previous state metrics and the corresponding branch metrics
§  Compare the two accumulated metrics
§  Select the survivor and store it
[Figure: one trellis section with the four states 00, 01, 10, 11 on each side]
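The three steps form the add-compare-select (ACS) operation. Below is a minimal sketch of one ACS time step for a four-state trellis. The predecessor table, the branch metric values and the choice of "smaller metric wins" are assumptions made for this example only.

```python
def acs_step(prev_metrics, branch_metrics, predecessors):
    """One add-compare-select step over all states.

    prev_metrics:   state metrics of the previous time step
    branch_metrics: dict mapping (previous_state, state) to a branch metric
    predecessors:   for each state, its two possible previous states
    Returns the new state metrics and one survivor bit per state.
    """
    new_metrics = []
    survivors = []
    for state, preds in enumerate(predecessors):
        # Add: accumulate state metric and branch metric for both incoming paths.
        candidates = [prev_metrics[p] + branch_metrics[(p, state)] for p in preds]
        # Compare: the smaller accumulated metric wins (assumed convention).
        winner = 0 if candidates[0] <= candidates[1] else 1
        # Select: keep the survivor metric and remember which path was chosen.
        new_metrics.append(candidates[winner])
        survivors.append(winner)
    return new_metrics, survivors

# Hypothetical four-state trellis and metrics.
PREDECESSORS = [(0, 1), (2, 3), (0, 1), (2, 3)]
BRANCH = {(0, 0): 0, (1, 0): 2, (2, 1): 1, (3, 1): 1,
          (0, 2): 2, (1, 2): 0, (2, 3): 1, (3, 3): 1}

metrics, bits = acs_step([4, 1, 2, 5], BRANCH, PREDECESSORS)
print(metrics, bits)
```

Only the survivor bits need to be kept for the later traceback; the state metrics of the previous step can be overwritten.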
12
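Inner structure Viterbi decoder
[Figure: add-compare-select data path. Memories store the channel LLR values (info LLR, parity LLR) and feed the branch metric unit. Adders combine the old state metrics (S_C, S_D, read from the storage for the previous states 00, 01, 10, 11) with the branch metrics, subtractors compare the sums, and the selected new state metrics (S_A, S_B) are written to the storage for the result states. The decisions are stored in the survivor bit memory.]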
13
Low-Density Parity-Check Code
An LDPC code is a linear block code
§  Defined by a very sparse parity check matrix H
§  x is a codeword if: H x^T = 0 (mod 2)
LDPC codes can be described by a Tanner graph
§  A variable node is associated with a column in H and represents a single bit within x
§  A check node is associated with a row in H and thus represents a single parity check code
§  Regular LDPC codes have variable and check nodes of constant degree
§  Irregular LDPC codes have nodes of varying degree
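The codeword condition and the Tanner graph view can be sketched in a few lines. The small matrix below is a (7,4) Hamming parity check matrix, chosen only because it fits on the page; it is not sparse and not taken from the lecture.

```python
# Hypothetical small parity check matrix (not a real LDPC code).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def syndrome(h, x):
    """Compute H x^T over GF(2); all zeros means x is a codeword."""
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in h]

def tanner_graph(h):
    """Return the neighbor lists of the check nodes (rows) and variable nodes (columns)."""
    check_to_var = [[j for j, bit in enumerate(row) if bit] for row in h]
    var_to_check = [[i for i, row in enumerate(h) if row[j]] for j in range(len(h[0]))]
    return check_to_var, var_to_check

codeword = [1, 0, 1, 1, 0, 1, 0]
corrupted = [0, 0, 1, 1, 0, 1, 0]   # first bit flipped
check_to_var, var_to_check = tanner_graph(H)

print(syndrome(H, codeword), syndrome(H, corrupted))
print("check node degrees   :", [len(n) for n in check_to_var])
print("variable node degrees:", [len(n) for n in var_to_check])
```

In this example all check nodes have degree 4 while the variable node degrees vary, so the graph would count as irregular in the sense used above.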
[Figure: Tanner graph with N variable nodes (VN) and M check nodes (CN) linked by a connectivity network, annotated with node degree labels f_2, f_3, ..., f_dv^max and g_..., g_dc^max]
14
Summary LDPC codes
LDPC codes are decoded in an iterative manner
§  Probabilistic messages are exchanged between variable nodes and check nodes
§  The decoding algorithm is an instance of a message passing algorithm
§  In practical receivers a maximum of 40 iterations is performed
An LDPC decoder can be realized in a fully parallel or in a partially parallel manner
§  Fully parallel architecture:
    §  Each VN and CN is instantiated in hardware; the connections are hard-wired
    §  Pro: highest possible throughput (optical fiber)
    §  Con: supports only one code, problems due to routing congestion
§  Partially parallel architecture:
    §  Only P functional VNs and CNs are instantiated
    §  Connectivity is realized by a switching network; the connectivity pattern has to be stored
    §  Con: limited throughput
    §  Pro: large flexibility (code rate, block length) → required by wireless LDPC decoders
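The slides do not say which message passing variant is used. As an illustration, the sketch below shows the two node updates of the min-sum algorithm, a widely used hardware-friendly approximation; the function names and the example values are assumptions of this sketch.

```python
def check_node_update(incoming):
    """Min-sum check node: each outgoing message combines all OTHER incoming messages.

    Sign: product of the other signs. Magnitude: minimum of the other magnitudes.
    """
    outgoing = []
    for i in range(len(incoming)):
        others = incoming[:i] + incoming[i + 1:]
        sign = 1.0
        for value in others:
            if value < 0:
                sign = -sign
        outgoing.append(sign * min(abs(value) for value in others))
    return outgoing

def variable_node_update(channel_llr, incoming):
    """Variable node: sum everything, then subtract each edge's own contribution."""
    total = channel_llr + sum(incoming)
    return [total - message for message in incoming], total

print(check_node_update([2.0, -1.0, 3.0]))
print(variable_node_update(0.5, [1.0, -2.0]))
```

Both updates exclude the message that arrived on the edge being served, so a node never feeds its own information back to the sender.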
15
Summary LDPC codes
Fully flexible LDPC decoder
§  Can process any random LDPC code
§  Storing the connectivity pattern can be more costly (area) than the entire rest of the
decoder (message storage, functional units)
Joint Architecture – Code/Algorithm design
§  Define a hardware architecture
§  Design code/algorithm to fit this architecture
16
Communications point of view
[Figure: parity check matrix built from permuted identity blocks, with index labels 0..3, 4..7 and 0..11]
The parity check matrix is composed of permuted identity matrices:
§  Already proposed by Gallager in 1963 as a construction method
§  Allows a compact description, e.g. P=13 → identity matrix size:
§  Results in quasi-cyclic codes
§  All LDPC codes utilized in standards are composed of permuted identity
matrices.
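The compact description can be sketched as follows: a small base matrix stores one shift value per block, and each entry is expanded into a P x P cyclically shifted identity matrix. The base matrix, the value P = 4 and the use of -1 for an all-zero block are assumptions of this example, not parameters of any standard.

```python
def shifted_identity(p, shift):
    """P x P identity matrix with its ones cyclically shifted by `shift` columns."""
    return [[1 if col == (row + shift) % p else 0 for col in range(p)]
            for row in range(p)]

def expand(base, p):
    """Expand a base matrix of shift values into the full parity check matrix.

    An entry of -1 stands for an all-zero P x P block (assumed convention).
    """
    h = []
    for base_row in base:
        blocks = [shifted_identity(p, s) if s >= 0 else [[0] * p for _ in range(p)]
                  for s in base_row]
        for r in range(p):
            h.append([bit for block in blocks for bit in block[r]])
    return h

P = 4
BASE = [[0, 1, -1],
        [2, -1, 0]]   # hypothetical shift values

H = expand(BASE, P)
for row in H:
    print("".join(str(bit) for bit in row))
```

Six small integers describe the whole 8 x 12 matrix here, which is the storage saving the slide refers to.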
17
Hardware design point of view
[Figure: partially parallel LDPC decoder with four variable node units (VN), the memories VN RAM1..VN RAM4 and CN RAM1..CN RAM4 and a shifting network between them. The tables show how the indices 0..11 are distributed over the memories and how the word 4 5 6 7 is cyclically shifted to 7 4 5 6 and 6 7 4 5.]
LDPC decoder features:
§  Permuted identity matrices result in simple shifting networks
§  The size of the identity matrix directly gives a possible hardware
parallelization P
§  The entire connectivity pattern is defined by just two vectors
    §  Shift vector
    §  Address vector
    §  One entry exists for each clock cycle
§  Very regular control flow: P data are always handled identically
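The control flow described above can be sketched as a loop over the two vectors: in each clock cycle one word of P messages is read from the address given by the address vector and rotated by the amount given by the shift vector. The memory contents and vector values below are hypothetical and only loosely mirror the numbers visible in the slide figure.

```python
def cyclic_shift(word, shift):
    """Rotate a word of P entries to the left by `shift` positions."""
    shift %= len(word)
    return word[shift:] + word[:shift]

def run_schedule(memory, address_vector, shift_vector):
    """Return, per clock cycle, the P values delivered to the P functional units."""
    delivered = []
    for address, shift in zip(address_vector, shift_vector):
        word = memory[address]                  # one memory word holds P messages
        delivered.append(cyclic_shift(word, shift))
    return delivered

# Hypothetical example with P = 4.
MEMORY = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
ADDRESS_VECTOR = [0, 1, 1]   # one entry per clock cycle
SHIFT_VECTOR = [0, 3, 2]     # one entry per clock cycle

for cycle, word in enumerate(run_schedule(MEMORY, ADDRESS_VECTOR, SHIFT_VECTOR)):
    print(f"cycle {cycle}: {word}")
```

No per-edge routing information is needed; the two short vectors replace the full connectivity table of a random code.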
18
Turbo Codes
Turbo Codes (1993):
§  Clever parallel concatenation of two convolutional codes, approaching capacity to within 0.5 dB
§  Defined from the encoder point of view
Parallel turbo codes are composed of:
§  Component encoders (recursive systematic convolutional (RSC) codes)
§  Interleaver
§  Puncturing unit (not shown here)
High-level complexity comparison TC vs. CC
§  CC: Lc=9 ⇒ 256 states
§  Turbo code: 2 CCs with Lc=4 ⇒ 2 x 8 states
§  Trellis state reduction by a factor of 16
§  Repeated turbo decoding with 8 iterations:
⇒ overall state reduction by a factor of 2 and 3 dB coding gain
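The numbers in the comparison follow from simple arithmetic, reproduced here as a short sketch. It assumes that Lc is the constraint length, so that a code has 2^(Lc-1) trellis states, which matches the state counts given above.

```python
def num_states(constraint_length):
    """Trellis states of a convolutional code with the given constraint length."""
    return 2 ** (constraint_length - 1)

cc_states = num_states(9)                  # single convolutional code
tc_states = 2 * num_states(4)              # two component codes of the turbo code
reduction_per_pass = cc_states // tc_states
iterations = 8
overall_reduction = cc_states / (tc_states * iterations)

print(cc_states, tc_states, reduction_per_pass, overall_reduction)
```

This counts trellis states only; it does not account for the difference in work per state between a Viterbi and a MAP decoder.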
19
Summary Interleaver
For the interleaver hardware realization we need either:
§  An interleaver table (e.g. SRAM based)
§  Or an interleaver generator, which delivers the corresponding indices
LTE interleaver realization:
§  Dedicated interleaver generator to calculate:
§  The interleaver pattern is conflict free for a parallel realization
UMTS interleaver realization:
§  Difficult control flow to realize a dedicated interleaver generator
§  Typically SRAMs are instantiated to store the interleaver indices
§  However, the SRAM has to be filled with the corresponding indices
depending on the current block length.
§  The indices are calculated by e.g. an ARM processor
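For LTE, the interleaver is a quadratic permutation polynomial (QPP), pi(i) = (f1*i + f2*i^2) mod K, which is simple enough to generate on the fly. The sketch below generates such a pattern and checks that P parallel units never address the same memory in the same cycle. The parameters K = 40, f1 = 3, f2 = 10 are intended to be the smallest LTE block size and should be checked against 3GPP TS 36.212 before being relied on; the windowed memory mapping (unit t handles indices t*K/P onwards, memory = index // (K/P)) is an assumption of this example.

```python
def qpp_interleaver(k, f1, f2):
    """Quadratic permutation polynomial interleaver: pi(i) = (f1*i + f2*i^2) mod k."""
    return [(f1 * i + f2 * i * i) % k for i in range(k)]

def is_conflict_free(pi, p):
    """Check that p units, each working on its own window, never share a memory.

    In step i, unit t accesses pi[i + t * window]; the memory is the target's window.
    """
    window = len(pi) // p
    for i in range(window):
        banks = {pi[i + t * window] // window for t in range(p)}
        if len(banks) < p:
            return False
    return True

K, F1, F2 = 40, 3, 10   # parameters to be verified against the standard
PI = qpp_interleaver(K, F1, F2)

print(PI[:8])
print("valid permutation:", sorted(PI) == list(range(K)))
print("conflict free for P=4:", is_conflict_free(PI, 4))
```

A generator of this kind needs only a few adders and a modulo operation, so no index table has to be stored or refilled when the block length changes.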
20
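Iterative Decoding Procedure
[Figure: turbo encoder and iterative decoder. Encoder 1 and Encoder 2 produce Parity 1 and Parity 2 from the input, alongside the systematic stream. Symbol-by-symbol maximum a posteriori decoder 1 works on systematic and parity 1, decoder 2 on interleaved systematic and parity 2. The two decoders are linked through blocks labeled T and subtractors (-).]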
21
Iterative Decoding Procedure
Concatenated codes have been known since 1966 (Forney)
Innovation of 1993:
subtraction (ignoring) of a decoder's own old information
→ EXTRINSIC INFORMATION PRINCIPLE
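In LLR terms the principle is usually written as a subtraction: the extrinsic value is the decoder's a posteriori LLR minus what the decoder was given as input (systematic channel LLR and a priori LLR). The sketch below shows this step followed by the interleaving towards the other decoder; the numbers and the permutation are made up for illustration.

```python
def extrinsic(a_posteriori, systematic, a_priori):
    """Remove the decoder's own inputs from its output, leaving only the new information."""
    return [l - s - a for l, s, a in zip(a_posteriori, systematic, a_priori)]

def interleave(values, pi):
    """Write values[i] to position pi[i]."""
    out = [0.0] * len(values)
    for i, target in enumerate(pi):
        out[target] = values[i]
    return out

# Hypothetical values for three bits.
map_output = [3.0, -2.5, 1.0]     # symbol-by-symbol MAP result of decoder 1
systematic_llr = [1.0, -1.0, 0.5]
a_priori_llr = [0.5, -0.5, 0.0]   # extrinsic information received earlier
PI = [2, 0, 1]                    # hypothetical interleaver

ext = extrinsic(map_output, systematic_llr, a_priori_llr)
a_priori_for_decoder_2 = interleave(ext, PI)
print(ext, a_priori_for_decoder_2)
```

Without the subtraction each decoder would receive its own earlier output again, and the iterations would reinforce old decisions instead of adding information.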
22
Iterative Decoding Procedure
[Figure: symbol-by-symbol maximum a posteriori decoder 1 (systematic, parity 1) and decoder 2 (interleaved systematic, parity 2), linked through blocks labeled T and subtractors (-), with the annotations below]
§  symbol-by-symbol MAP result
§  input value: systematic LLR
§  extrinsic information (additional gain)
§  the gain from decoder 1 is interleaved and used as a priori information for decoder 2
23
Max-Log MAP algorithm
1. Branch metric calculation:
2. Forward state metrics α:
computed recursively over k ∈ {1..blocksize-1} for all states m
3. Backward state metrics β:
computed recursively over k ∈ {blocksize-1..1} for all states m
4. Soft-output calculation:
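The equations of the four steps did not survive in this transcript. For orientation, a common textbook form of the max-log-MAP recursions for a rate-1/2 RSC component code is given below. The notation is an assumption of this sketch and may differ from the lecture: m' and m are the start and end state of a transition, \tilde{u}_k and \tilde{p}_k in {+1, -1} are the bipolar systematic and parity labels of that transition, \lambda^s_k and \lambda^p_k are the channel LLRs, and \lambda^a_k is the a priori LLR.

```latex
\begin{align}
\gamma_k(m', m) &= \tfrac{1}{2}\Big( \tilde{u}_k \,\big(\lambda^{s}_k + \lambda^{a}_k\big)
                   + \tilde{p}_k \,\lambda^{p}_k \Big) \\
\alpha_k(m)     &= \max_{m'} \big( \alpha_{k-1}(m') + \gamma_k(m', m) \big) \\
\beta_{k-1}(m') &= \max_{m} \big( \beta_k(m) + \gamma_k(m', m) \big) \\
\Lambda(u_k)    &= \max_{(m', m):\, \tilde{u}_k = +1}
                   \big( \alpha_{k-1}(m') + \gamma_k(m', m) + \beta_k(m) \big)
                 - \max_{(m', m):\, \tilde{u}_k = -1}
                   \big( \alpha_{k-1}(m') + \gamma_k(m', m) + \beta_k(m) \big)
\end{align}
```

The forward and backward recursions have the same add-compare-select structure as the Viterbi algorithm, which is why the data paths of the two decoders look so similar.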
24
MAP decoding: one state per clock cycle step
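[Figure: state metric data path. Memories store the channel LLR values (info LLR, parity LLR) and feed the branch metric unit. Adders combine the old state metrics (S_C, S_D, read from the storage for the previous states 00, 01, 10, 11) with the branch metrics, and the selected new state metric (S_A) is written to the storage for the result states.]
MAP algorithm: state metric memory, e.g. 12 bit per state and time step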
25
Data path, Serial MAP
[Figure: serial MAP data path. Memory 1 (input values) and Memory 2 (stores intermediate results) are connected to a unit that applies different functions on vectors.]
26
Question slides
27
Where is the problem with this parallel processing?
[Figure: the same four-memory Parallel Processing / Parallel Interleaver diagram and 16-entry interleaver address table (Addr, Interl. Addr.) as on the slide about parallelizing random interleavers]

Lecture summary: architectures for baseband signal processing of wireless communications systems

  • 1. Frank Kienle: Architekturen für Basisbandsignalverarbeitung in drahtlosen Kommunikationssystemen (Architectures for baseband signal processing of wireless communications systems)
  • 2. Communications vs. VLSI constraints
      ◦ Communications constraints: minimize transmit power, minimize redundancy (bandwidth), quality of service guaranteed.
      ◦ Implementation (VLSI) constraints: minimize processing power, minimize chip area/costs, quality of service guaranteed.
      ◦ (Figure labels: quality of service, redundancy (bandwidth), transmit power, chip area/costs, processing power, desired decoder, throughput/latency.)
  • 3. Exercise. Current high-end handsets have a power consumption of 1 W, e.g. for a WCDMA voice call. The battery has an electric charge of 1400 mAh. The baseband processor operates at 1.5 V with f_cyc = 300 MHz and has to process 20 GOP/s in active voice call mode.
      ◦ What is the lifetime of the battery, assuming an active voice call?
      ◦ What is the average energy per operation?
      ◦ On average, how many operations are processed per cycle?
      ◦ Is it possible to process this task with a vector DSP core that needs 20 pJ per operation?
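The exercise can be worked through numerically. The following is a minimal sketch under two assumptions that the slide does not state: the 1400 mAh charge is delivered at the 1.5 V quoted for the baseband processor, and the full 1 W is attributed to the 20 GOP/s workload. All function and constant names are illustrative.

```python
# Values taken from the exercise statement.
POWER_W = 1.0           # handset power during a WCDMA voice call
CHARGE_AH = 1.4         # battery charge: 1400 mAh
SUPPLY_V = 1.5          # ASSUMPTION: battery energy is evaluated at 1.5 V
F_CYC_HZ = 300e6        # baseband clock frequency
OPS_PER_S = 20e9        # required throughput: 20 GOP/s
DSP_J_PER_OP = 20e-12   # vector DSP: 20 pJ per operation


def battery_lifetime_h(charge_ah, supply_v, power_w):
    """Stored energy (Wh) divided by the power drawn (W) gives hours."""
    return charge_ah * supply_v / power_w


def energy_per_op_j(power_w, ops_per_s):
    """Average energy per operation if all power goes into the operations."""
    return power_w / ops_per_s


def ops_per_cycle(ops_per_s, f_cyc_hz):
    """Average number of operations that must complete in each clock cycle."""
    return ops_per_s / f_cyc_hz


def dsp_power_w(j_per_op, ops_per_s):
    """Power a core with the given energy per operation would draw."""
    return j_per_op * ops_per_s


if __name__ == "__main__":
    print("lifetime [h]:", battery_lifetime_h(CHARGE_AH, SUPPLY_V, POWER_W))
    print("energy/op [pJ]:", energy_per_op_j(POWER_W, OPS_PER_S) * 1e12)
    print("ops/cycle:", ops_per_cycle(OPS_PER_S, F_CYC_HZ))
    print("DSP power [W]:", dsp_power_w(DSP_J_PER_OP, OPS_PER_S))
```

Under these assumptions the battery lasts roughly 2.1 hours, the budget is 50 pJ per operation, about 67 operations must complete per cycle, and the vector DSP would draw 0.4 W. Whether 0.4 W is acceptable depends on the share of the 1 W that is actually available to the baseband, which the exercise leaves open.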
  • 4. Conclusion: input LLRs
      ◦ The highest priority for hardware design is to fulfill the specification under realistic conditions.
      ◦ Proper input quantization is essential to avoid degradation of the communications performance.
      ◦ An implementation of an 'optimal' algorithm can lead to entirely different results in a possible hardware realization.
      ◦ Robustness of an algorithm is key for a successful hardware integration.
      ◦ For baseband processing: look for algorithms which are SNR independent!
  • 5. Maximum Likelihood decoding. The maximum likelihood (ML) estimate is an entire sequence: x̂ = arg max over all codewords x of P(y | x), with y the received sequence. We always use a decoding algorithm that solves the maximum likelihood criterion if it is possible.
      ◦ Convolutional codes → solved by the Viterbi algorithm
      ◦ Small block codes → solved by brute force, testing all codewords
      ◦ However, many codes have too large a code space to solve the ML criterion. Solution: divide and conquer methods.
  • 6. Symbol-by-symbol MAP. Divide and conquer splits the problem into multiple sub-problems which can be solved independently. The overall solution is approached by an iterative exchange of the sub-solutions. For iterative problem solving, the best approach is to determine a confidence estimate for each variable (bit) → symbol-by-symbol maximum a posteriori (MAP) criterion.
      ◦ Turbo decoders split the overall problem into two parts.
      ◦ LDPC decoders split the overall problem into M parts.
  • 7. Building Blocks
      ◦ Arithmetic units: adder, multiplier, MAC, shifter, comparator, ALU etc. There is a clear order of complexity: avoid, e.g., divisions if possible.
      ◦ Memory blocks for the storage of data: register files, shift registers, FIFOs, RAMs, ROMs, DRAMs. SRAM: only one data word can be accessed per clock cycle, which leads to access conflicts!
      ◦ Interconnection units: switches, bus, arbiter, network-on-chip. Example: structure of the barrel shifter (used in LDPC and turbo decoders).
  • 8. Memory Hierarchy. SRAM memory can be generated in nearly any shape (VLSI): a memory block can be composed of multiple smaller memories, which changes the area and the average power.
      Shape (word depth x word width) | Average power, write operation, all data input pins and all addresses switching [µW/MHz] | Area [mm²] | Comment
      4096 x 8 | 7.0857 | 0.043824 |
      4 blocks of 1024 x 8 | 5.229 | single: 0.01481, all: 0.0592 | Larger but less average power
      1024 x 32 | 12.8716 | 0.039683 | Smaller and less average power if access pattern is possible
  • 9. Memories (SRAM): first summary
      1. Often we can trade off area vs. power just by changing the memory hierarchy.
      2. However, the application determines the access pattern and thus constrains the memory hierarchy.
      ◦ Access pattern: the sequence in time and space (address) of reading/writing multiple data.
      ◦ Example of a 'difficult' access pattern: read 100 words in one clock cycle, each from a different (random) address.
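  • 10. Problem to parallelize random interleavers. (Figure: sixteen values A to P held in four memories for parallel processing, before and after the parallel interleaver.)
      ◦ Before interleaving (addresses 0, 4, 8, 12 / 1, 5, 9, 13 / 2, 6, 10, 14 / 3, 7, 11, 15): A E I M / B F J N / C G K O / D H L P
      ◦ After interleaving (same address layout): L C A H / B O F G / I J D P / E N M K
      ◦ Address → interleaved address: 0→8, 1→1, 2→4, 3→10, 4→3, 5→9, 6→13, 7→12, 8→2, 9→6, 10→15, 11→0, 12→11, 13→7, 14→5, 15→14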
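The difficulty with this interleaver can be checked directly from the address table. The sketch below assumes, as the figure layout suggests, four memory banks with bank index = address mod 4, and four processing units that handle addresses 4t to 4t+3 in cycle t and write their results to the interleaved addresses. The function names are illustrative.

```python
# Interleaver table from the slide: PI[source address] = interleaved address.
PI = [8, 1, 4, 10, 3, 9, 13, 12, 2, 6, 15, 0, 11, 7, 5, 14]
P = 4  # ASSUMPTION: four memory banks, bank index = address mod P


def bank(addr, p=P):
    """Memory bank that holds the given address."""
    return addr % p


def conflicts_per_cycle(pi, p=P):
    """For each cycle, report the targets, their banks, and the banks hit twice or more."""
    report = []
    for start in range(0, len(pi), p):
        targets = [pi[a] for a in range(start, start + p)]
        banks = [bank(t, p) for t in targets]
        clashes = sorted({b for b in banks if banks.count(b) > 1})
        report.append((targets, banks, clashes))
    return report


if __name__ == "__main__":
    for cycle, (targets, banks, clashes) in enumerate(conflicts_per_cycle(PI)):
        print(f"cycle {cycle}: targets {targets} -> banks {banks}, conflicts in {clashes}")
```

With this mapping every one of the four cycles writes at least two values into the same single-port memory, so the four accesses cannot complete in one clock cycle without extra buffering or a different bank assignment.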
  • 11. Viterbi Algorithm (functional units). The Viterbi algorithm solves the ML criterion. At each time step, for each state, we:
      ◦ Add the previous state metric and the corresponding branch metric
      ◦ Compare the two accumulated metrics
      ◦ Select the survivor and store it
      ◦ (Figure: one step of a four-state trellis with states 00, 01, 10, 11.)
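The add, compare, select steps can be sketched for one trellis step as follows. This is a minimal illustration, not the lecture's implementation: it assumes path metrics are costs to be minimized, and the trellis connectivity (`predecessors`) and the branch metric values in the example are hypothetical.

```python
def acs_step(state_metrics, branch_metrics, predecessors):
    """One add-compare-select step over all states.

    state_metrics[p]        accumulated metric of previous state p
    branch_metrics[(p, s)]  metric of the branch from p to s
    predecessors[s]         previous states that lead into state s
    Returns the new state metrics and, per state, the surviving predecessor.
    """
    new_metrics = []
    survivors = []
    for state, preds in enumerate(predecessors):
        # Add: extend every incoming path by its branch metric.
        candidates = [(state_metrics[p] + branch_metrics[(p, state)], p) for p in preds]
        # Compare and select: keep the cheapest path (ties go to the lower index).
        best_metric, best_pred = min(candidates)
        new_metrics.append(best_metric)
        survivors.append(best_pred)
    return new_metrics, survivors


if __name__ == "__main__":
    # Hypothetical four-state butterfly: states 0 and 2 are reached from 0 and 1,
    # states 1 and 3 are reached from 2 and 3.
    predecessors = [(0, 1), (2, 3), (0, 1), (2, 3)]
    old = [0, 3, 2, 5]
    bm = {(0, 0): 1, (1, 0): 0, (2, 1): 2, (3, 1): 0,
          (0, 2): 0, (1, 2): 1, (2, 3): 1, (3, 3): 1}
    print(acs_step(old, bm, predecessors))
```

In hardware, the loop over states is typically unrolled into one ACS unit per state, and only the survivor decision needs to be stored per state and time step.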
  • 12. Inner structure of a Viterbi decoder. (Figure: memories storing the channel LLR values (info LLR, parity LLR) feed the branch metric unit; adders and comparators combine the old state metrics from the storage for the previous states into the new state metrics, which go to the storage for the result states; the decisions are written to the survivor bit memory. Trellis states: 00, 01, 10, 11.)
  • 13. Low-Density Parity-Check Code. An LDPC code is a linear block code:
      ◦ Defined by a very sparse parity check matrix H
      ◦ x is a codeword if H x^T = 0 (mod 2)
      LDPC codes can be described by a Tanner graph:
      ◦ A variable node is associated with a column in H and represents a single bit within x
      ◦ A check node is associated with a row in H and thus represents a single parity check code
      ◦ Regular LDPC codes have variable and check nodes of constant degree
      ◦ Irregular LDPC codes have nodes of varying degree
      ◦ (Figure: Tanner graph with N variable nodes (VN) and M check nodes (CN) joined by the connectivity network, annotated with the node degree distributions up to the maximum degrees dv max and dc max.)
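The codeword condition and the link between H and the Tanner graph can be illustrated with a few lines of code. The matrix below is the parity check matrix of a (7,4) Hamming code, chosen only because it is small; it is not sparse and not taken from the lecture. The function names are illustrative.

```python
# Small example parity check matrix (a (7,4) Hamming code, NOT a real LDPC code).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(h, x):
    """Compute H x^T over GF(2): one parity bit per check node (row of H)."""
    return [sum(hij * xj for hij, xj in zip(row, x)) % 2 for row in h]


def is_codeword(h, x):
    """x is a codeword exactly when every parity check is satisfied."""
    return not any(syndrome(h, x))


def node_degrees(h):
    """Tanner graph degrees: column weights (variable nodes), row weights (check nodes)."""
    variable_degrees = [sum(col) for col in zip(*h)]
    check_degrees = [sum(row) for row in h]
    return variable_degrees, check_degrees


if __name__ == "__main__":
    print(is_codeword(H, [1, 1, 1, 0, 0, 0, 0]))
    print(syndrome(H, [1, 1, 1, 0, 0, 0, 1]))
    print(node_degrees(H))
```

In this example all check nodes have degree 4 while the variable node degrees vary, so the graph is irregular on the variable node side in the sense of the slide.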
  • 14. Summary LDPC codes. LDPC codes are decoded in an iterative manner:
      ◦ Probabilistic messages are exchanged between variable nodes and check nodes
      ◦ The decoding algorithm is an instance of a message passing algorithm
      ◦ For practical receivers, a maximum of 40 iterations is performed
      An LDPC decoder can be realized in a fully parallel or in a partially parallel manner.
      ◦ Fully parallel architecture: each VN and CN is instantiated in hardware and the connections are hard wired. Pro: highest possible throughput (optical fiber). Con: supports only one code, problems due to routing congestion.
      ◦ Partially parallel architecture: only P functional VNs and CNs are instantiated; connectivity is realized by a switching network, and the connectivity pattern has to be stored. Con: limited throughput. Pro: large flexibility (code rate, block length) → required by wireless LDPC decoders.
  • 15. Summary LDPC codes. Fully flexible LDPC decoder:
      ◦ Can process any random LDPC code
      ◦ Storing the connectivity pattern can be more costly (in area) than the entire rest of the decoder (message storage, functional units)
      Joint architecture and code/algorithm design:
      ◦ Define a hardware architecture
      ◦ Design the code/algorithm to fit this architecture
  • 16. Communications point of view. (Figure: Tanner graph with variable nodes 0 to 11 and check nodes 0 to 7, grouped in blocks of four.) The parity check matrix is composed of permuted identity matrices:
      ◦ Already proposed by Gallager in 1963 as a construction method
      ◦ Allows a compact description, e.g. P=13 → identity matrix size:
      ◦ Results in quasi-cyclic codes
      ◦ All LDPC codes utilized in standards are composed of permuted identity matrices.
  • 17. Hardware design point of view. (Figure: partially parallel decoder with VN RAM1 to VN RAM4, CN RAM1 to CN RAM4, the VN functional units and a shifting network; variable nodes 0 to 11 and check nodes 0 to 7 are distributed over the RAMs and accessed in cyclically shifted order.) LDPC decoder features:
      ◦ Permuted identity matrices result in simple shifting networks
      ◦ The size of the identity matrix directly gives a possible hardware parallelization P
      ◦ The entire connectivity pattern is defined by just two vectors, a shift vector and an address vector, with one entry per clock cycle
      ◦ Very regular control flow: always P data are handled identically
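Why permuted identity matrices map to a simple shifting network can be seen in a small sketch: multiplying a block of P values by a cyclically shifted identity matrix is the same as rotating the block, which is what a barrel shifter does. The base matrix format below (one shift value per block, -1 for an all-zero block) and all names are assumptions for illustration, not a format defined in the lecture.

```python
def shifted_identity(p, shift):
    """P x P identity matrix, cyclically shifted: row r has its 1 in column (r + shift) % p."""
    return [[1 if c == (r + shift) % p else 0 for c in range(p)] for r in range(p)]


def expand(base, p):
    """Expand a base matrix of shift values into the full binary parity check matrix.

    ASSUMPTION: an entry s >= 0 means a shifted identity block, -1 an all-zero block.
    """
    rows = []
    for base_row in base:
        blocks = [shifted_identity(p, s) if s >= 0 else [[0] * p for _ in range(p)]
                  for s in base_row]
        for r in range(p):
            rows.append([bit for block in blocks for bit in block[r]])
    return rows


def matvec(matrix, vector):
    """Plain matrix-vector product (no modulo, used to show the permutation)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rotate_left(values, shift):
    """What a barrel shifter does with a block of P values."""
    shift %= len(values)
    return values[shift:] + values[:shift]


if __name__ == "__main__":
    block = [10, 20, 30, 40]
    print(matvec(shifted_identity(4, 1), block), rotate_left(block, 1))
    for row in expand([[0, 1], [2, -1]], 4):
        print(row)
```

Only the shift value per block has to be stored, instead of the position of every single edge, which is the compact description the slides refer to.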
  • 18. Turbo Codes (1993):
      ◦ Clever parallel concatenation of two convolutional codes, approaching capacity to within 0.5 dB
      ◦ Defined from the encoder point of view
      Parallel turbo codes are composed of:
      ◦ Component encoders (recursive systematic convolutional (RSC) codes)
      ◦ Interleaver
      ◦ Puncturing unit (not shown here)
      High-level complexity comparison, turbo code (TC) vs. convolutional code (CC):
      ◦ CC: Lc=9 ⇒ 256 states
      ◦ Turbo code: 2 CCs with Lc=4 ⇒ 2 x 8 states
      ◦ Trellis state reduction by a factor of 16
      ◦ Repeated turbo decoding with 8 iterations ⇒ overall state reduction by a factor of 2 and 3 dB coding gain
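The state counts in this comparison follow from the usual relation between constraint length and number of trellis states, which is assumed here (the slide gives only the results): a code with constraint length L_c has 2^(L_c - 1) states.

```latex
\begin{align*}
\text{CC, } L_c = 9:\quad & 2^{L_c-1} = 2^{8} = 256 \text{ states} \\
\text{TC, two CCs with } L_c = 4:\quad & 2 \cdot 2^{3} = 16 \text{ states} \\
\text{state reduction per pass:}\quad & 256 / 16 = 16 \\
\text{with 8 iterations:}\quad & 256 / (8 \cdot 16) = 2
\end{align*}
```

This counts trellis states visited per decoded bit only; it does not account for the different amount of work per state in a MAP decoder compared with a Viterbi decoder.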
  • 19. Summary Interleaver. For the interleaver hardware realization we need:
      ◦ an interleaver table (e.g. SRAM based),
      ◦ or an interleaver generator, which delivers the corresponding indices.
      LTE interleaver realization:
      ◦ A dedicated interleaver generator calculates the indices
      ◦ The interleaver pattern is conflict free for a parallel realization
      UMTS interleaver realization:
      ◦ The control flow is too difficult to realize a dedicated interleaver generator
      ◦ Typically SRAMs are instantiated to store the interleaver indices
      ◦ However, the SRAM has to be filled with the corresponding indices depending on the current block length
      ◦ The indices are calculated by, e.g., an ARM processor
  • 20. Iterative Decoding Procedure. (Figure: the input is encoded by encoder 1 and, after interleaving, by encoder 2, giving systematic, parity 1 and parity 2 values. Symbol-by-symbol MAP decoder 1 works on systematic and parity 1, symbol-by-symbol MAP decoder 2 on interleaved systematic and parity 2; the decoders are coupled through the interleaver and subtraction nodes.)
  • 21. Iterative Decoding Procedure. Concatenated codes have been known since 1966 (Forney). The innovation of 1993: subtraction (ignoring) of a decoder's own old information → extrinsic information principle.
  • 22. Iterative Decoding Procedure. (Figure: same decoder structure as before, annotated.) The symbol-by-symbol MAP result minus the input value (systematic LLR) gives the additional gain, the extrinsic information. The extrinsic information gain from decoder 1 is interleaved and used as a priori information for decoder 2.
  • 23. Max-Log MAP algorithm
      1. Branch metric calculation
      2. Forward state metrics α: computed recursively over k ∈ {1..blocksize-1} for all states m
      3. Backward state metrics β: computed recursively over k ∈ {blocksize-1..1} for all states m
      4. Soft-output calculation
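The four steps can be sketched in code. The slide's formulas are not preserved in the transcript, so the version below uses one common textbook formulation and makes several assumptions: branch metrics are supplied precomputed, the trellis starts in state 0, the end state is unknown, and the soft output is the best metric for bit 1 minus the best metric for bit 0. The indexing (k from 0) and all names are illustrative.

```python
NEG_INF = float("-inf")


def max_log_map(gammas, transitions, n_states):
    """Max-Log-MAP over a block.

    gammas[k][(m_prev, m)]  branch metric of transition m_prev -> m at step k (step 1)
    transitions             list of (m_prev, m, bit) tuples describing the trellis
    Returns one soft output (LLR-like value) per step.
    """
    steps = len(gammas)

    # Step 2: forward recursion, alpha[k+1][m] = max(alpha[k][m_prev] + gamma).
    alpha = [[NEG_INF] * n_states for _ in range(steps + 1)]
    alpha[0][0] = 0.0  # ASSUMPTION: encoder starts in state 0
    for k in range(steps):
        for m_prev, m, _bit in transitions:
            cand = alpha[k][m_prev] + gammas[k][(m_prev, m)]
            alpha[k + 1][m] = max(alpha[k + 1][m], cand)

    # Step 3: backward recursion, beta[k][m_prev] = max(beta[k+1][m] + gamma).
    beta = [[NEG_INF] * n_states for _ in range(steps + 1)]
    beta[steps] = [0.0] * n_states  # ASSUMPTION: end state unknown
    for k in range(steps - 1, -1, -1):
        for m_prev, m, _bit in transitions:
            cand = beta[k + 1][m] + gammas[k][(m_prev, m)]
            beta[k][m_prev] = max(beta[k][m_prev], cand)

    # Step 4: soft output, best path with bit 1 minus best path with bit 0.
    soft = []
    for k in range(steps):
        best = {0: NEG_INF, 1: NEG_INF}
        for m_prev, m, bit in transitions:
            metric = alpha[k][m_prev] + gammas[k][(m_prev, m)] + beta[k + 1][m]
            best[bit] = max(best[bit], metric)
        soft.append(best[1] - best[0])
    return soft


if __name__ == "__main__":
    # Toy two-state trellis (next state = state XOR bit), metrics from the bit only.
    trellis = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
    llr_in = [2.0, -1.0]
    gammas = [{(mp, m): (l / 2 if bit else -l / 2) for mp, m, bit in trellis}
              for l in llr_in]
    print(max_log_map(gammas, trellis, 2))
```

In this toy case the branch metrics carry no code constraint, so the soft outputs simply reproduce the input values; with parity information in the branch metrics, the difference between output and input is the extrinsic information exchanged between the two decoders.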
  • 24. MAP decoding: one state per clock cycle step. (Figure: memories storing the channel LLR values (info LLR, parity LLR) feed the branch metric unit; adders and a comparator combine the old state metrics from the storage for the previous states into a new state metric, written to the storage for the result states. Trellis states: 00, 01, 10, 11.) MAP algorithm: state metric memory, e.g. 12 bit per state and time step.
  • 25. Data path, serial MAP. Memory 1 (input values), Memory 2 (stores intermediate results); different functions operate on vectors.
  • 27. Where is the problem of this parallel processing? (Figure: sixteen values A to P held in four memories for parallel processing, before and after the parallel interleaver.)
      ◦ Before interleaving (addresses 0, 4, 8, 12 / 1, 5, 9, 13 / 2, 6, 10, 14 / 3, 7, 11, 15): A E I M / B F J N / C G K O / D H L P
      ◦ After interleaving (same address layout): L C A H / B O F G / I J D P / E N M K
      ◦ Address → interleaved address: 0→8, 1→1, 2→4, 3→10, 4→3, 5→9, 6→13, 7→12, 8→2, 9→6, 10→15, 11→0, 12→11, 13→7, 14→5, 15→14