SoC Design
Thomas Hollstein | Kalle Tammemäe | Dept. of CS
Commercial and Open SoC busses
AMBA bus, PI bus,
IBM Core Connect
ST bus, Wishbone
OCP-IP
Outline
 Introduction to bus architectures
 AMBA bus
 PI bus
 IBM Core Connect
 ST bus
 Wishbone
 OCP-IP
 Virtual Socket Interface Alliance (VSIA)
 Nios II Avalon Bus
Extra material
Introduction
 Bus = wires shared by multiple communicating units
 Connection logic to avoid electrical conflicts
 Arbitration to determine bus ownership in time
 Protocols – set of rules for transmitting information
between multiple units
 Buses designed for PCs and PCBs are not suitable for SoCs:
 Designed for backplane
 Limited speed
 Limited number of signals
Bus protocol
 the type and order of data being sent;
 how the sending device indicates that it has
finished sending the information;
 the data compression method used, if any;
 how the receiving device acknowledges
successful reception of the information; and
 how arbitration is performed to resolve
contention on the bus and in what priority,
and the type of error checking to be used.
[Flynn 2011]
Bus-based approach
[Flynn 2011]
[Figure: possible hierarchy of buses to optimize system-level performance and cost. Labels: cores or IPs; format conversion, segmentation, buffering; address, data, control, power. A bus might deliver power to peripherals.]
Bus architectures
 Unified or split (address and data)
 Simple with request acknowledgement
signals
 Bus with arbitration support
 Tenured split bus (bus is occupied only
during associated address or data cycles)
[Flynn 2011]
[Figure label: memory access time for the first word]
Bus Architectures
Technology | AMBA | AXI (AMBA 3) | CoreConnect | Smart Interconnect IP | Nexus
Company | ARM | ARM | IBM | Sonics | Fulcrum**
Core type | Soft/hard | Soft/hard | Soft | Soft | Hard
Architecture | Bus | Unidirectional channels | Bus | Bus | NOC using direct switch
Bus width (bits) | 8 - 1024 | 8 - 1024 | 32/64/128 | 16 | 8 - 128
Frequency | 200 MHz | 400 MHz* | 100 - 400 MHz | 300 MHz | 1 GHz
Maximum BW (GB/s) | 3 | 6.4* | 2.5 - 24 | 4.8 | 72
Minimum latency (ns) | 5 | 2.5* | 15 | n/a | 2
[Flynn 2011]
* As implemented in the ARM PL330 high-speed controller.
** Fulcrum was acquired by Intel in 2011.
Advanced Microcontroller Bus Architecture (AMBA)
 Developed by ARM in 1996
 Distinct buses in the AMBA specification:
1. Advanced High-performance Bus (AHB)
2. Advanced Peripheral Bus (APB)
3. [Advanced System Bus (ASB) – designed for lower performance systems, outdated]
 AXI – Advanced Extensible Interface (since AMBA 3)
[Flynn 2011]
AMBA AHB
 High-performance on-chip backbone bus
 Connecting: processors, on-chip and off-chip
memory interfaces
 DMA capability
 Bridge to APB bus
 Features: burst transfers, split transactions, single-cycle bus master handover
 Single clock edge (rising) operation and non-tristate (central multiplexer) implementation
 Central arbiter
 Tenured (the address phase can occur during the previous data phase)
AMBA AHB bus transaction steps
 Bus Master obtains access to the Bus
 Arbiter resolves simultaneous requests
 Bus Master initiates transfer, driving signals:
 Address, Width, Direction
 Burst options
 Bus Slave provides a Response
 Success | need for delay (wait states) | error
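The transaction steps above can be sketched as a small behavioural model. This is only an illustration, not the AMBA specification: the fixed-priority policy, the `Response` names and the helpers `arbitrate` and `ahb_transfer` are assumptions made for this example.

```python
from enum import Enum


class Response(Enum):
    OKAY = "success"
    WAIT = "wait state"
    ERROR = "error"


def arbitrate(requests):
    """Pick one master among simultaneous requests.

    `requests` maps master name -> priority (lower number wins).
    Fixed priority is an assumption; a real arbiter may use another policy.
    """
    if not requests:
        return None
    return min(requests, key=requests.get)


def ahb_transfer(slave_responses):
    """Follow one transfer until the slave stops inserting wait states.

    `slave_responses` is the slave's response in each cycle.
    Returns (final_response, number_of_wait_states).
    """
    wait_states = 0
    for response in slave_responses:
        if response is Response.WAIT:
            wait_states += 1
            continue
        return response, wait_states
    raise ValueError("slave never completed the transfer")


# Two masters request the bus in the same cycle; the slave needs two wait states.
granted = arbitrate({"cpu": 0, "dma": 1})
result = ahb_transfer([Response.WAIT, Response.WAIT, Response.OKAY])
```

Here `granted` is the master that would drive address, width, direction and burst options, and `result` carries the slave's final response together with the delay it requested.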
AMBA APB bus cycle
 APB is optimised for minimal power and low complexity (at the cost of performance)
 Used to interface to low-bandwidth peripherals
 Three-state operating diagram: Idle – Setup – Enable (the actual transfer cycles)
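The Idle – Setup – Enable diagram can be written down as a tiny state machine. The sketch below assumes the simplest case without slave wait states; the names `ApbState`, `next_state` and `run` are hypothetical and chosen only for this example.

```python
from enum import Enum


class ApbState(Enum):
    IDLE = 0
    SETUP = 1
    ENABLE = 2


def next_state(state, transfer_pending):
    """Return the state after the next rising clock edge."""
    if state is ApbState.IDLE:
        # Leave IDLE only when a transfer is requested.
        return ApbState.SETUP if transfer_pending else ApbState.IDLE
    if state is ApbState.SETUP:
        # SETUP always lasts exactly one cycle.
        return ApbState.ENABLE
    # ENABLE: the transfer completes; start the next one or go idle.
    return ApbState.SETUP if transfer_pending else ApbState.IDLE


def run(pending_per_cycle, state=ApbState.IDLE):
    """Apply one `transfer_pending` flag per cycle and record the states."""
    trace = []
    for pending in pending_per_cycle:
        state = next_state(state, pending)
        trace.append(state.name)
    return trace


single = run([True, False, False, False])
back_to_back = run([True, True, True, False, False])
```

`single` shows one transfer followed by idle cycles, while `back_to_back` shows two transfers where the bus goes from Enable straight back to Setup.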
AMBA Advanced Extensible Interface (AXI)
 AXI4: the AMBA 4 generation of AXI, for high-performance, high-frequency designs
 Features:
 Unaligned data transfers using byte strobes
 Burst-based transactions
 Backward compatible with AHB and APB interfaces
 Separate address/control and data phases
 Separate read and write data channels (providing low-cost DMA)
 Two channels for address (read, write) and control signals
 Additional write response channel for signaling completion of write transactions
 AXI protocol supports several types of bursts: normal memory access, wrapping cache line, streaming data to peripheral FIFOs
 Power management features
 ACE (AXI Coherency Extensions)
 Out-of-order transaction completion
 5 channels in total: read address, read data, write address, write data, write response
AXI protocol
Data channels can be 8, 16, 32, …, 1024 bits wide
Three system topologies:
• Shared address and data buses
• Shared address buses and multiple data buses
• Multilayer, with multiple address and data buses
[Figures: channel architecture of reads; channel architecture of writes]
AXI handshake
 VALID/READY signals
 Each channel (address, data, response) has its own handshake signal pair
 A transfer happens at the rising clock edge where both VALID and READY are high: at T3 in Figs. 1 and 2, or, when both are asserted at the same time, already at T2 (fastest case, Fig. 3)
 There are more control signals in all channels!
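The handshake rule can be checked with a few lines of code. The sketch below samples VALID and READY at the rising edges T1, T2, T3 and reports the edges at which a transfer takes place; the function name `transfer_edges` and the three sample traces are assumptions made for illustration.

```python
def transfer_edges(valid, ready):
    """Return the clock edges (1-based: T1, T2, ...) at which data moves.

    `valid` and `ready` hold the signal levels sampled at each rising edge.
    A transfer occurs only on an edge where both are high.
    """
    return [t for t, (v, r) in enumerate(zip(valid, ready), start=1) if v and r]


# VALID asserted first, READY follows one cycle later.
valid_first = transfer_edges(valid=[0, 1, 1], ready=[0, 0, 1])
# READY asserted first, VALID follows one cycle later.
ready_first = transfer_edges(valid=[0, 0, 1], ready=[0, 1, 1])
# Both asserted in the same cycle: the fastest case.
together = transfer_edges(valid=[0, 1, 1], ready=[0, 1, 1])
```

In the first two traces the only transfer is at T3; in the third trace data already moves at T2 (and again at T3, because both signals stay high).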
AXI burst transfer
 FIXED – address is same for
every transfer in the burst
(e.g. loading or unloading a
FIFO)
 INCR – address for each
transfer is an increment of
the address for previous
transfer (increment depends
on the size of transfer)
 WRAP – similar to INCR but
address wraps around to
lower address when upper
address limit is reached (used
for cache line accesses)
 Unaligned transfers
 Write strobes - indicate valid
bytes in transfer
 Narrow transfers – transfers
narrower than data bus:
 Fixed lanes (FIXED)
 different byte lanes (INCR or WRAP)
Example: start address 0x00 (aligned), burst length 4 transfers, transfer size 32 bits
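The three burst types differ only in how the address of each transfer is derived. A minimal sketch follows, assuming byte addresses and a transfer size given in bytes; the function name `burst_addresses` is hypothetical, and the model ignores byte lanes and write strobes.

```python
def burst_addresses(start, size_bytes, length, burst_type):
    """Return the address of every transfer in an AXI-style burst."""
    if burst_type == "FIXED":
        # Same address for every transfer, e.g. a FIFO port.
        return [start] * length
    if burst_type == "INCR":
        # After the first transfer, addresses are aligned to the transfer size.
        aligned = (start // size_bytes) * size_bytes
        return [start] + [aligned + i * size_bytes for i in range(1, length)]
    if burst_type == "WRAP":
        if start % size_bytes != 0:
            raise ValueError("WRAP bursts must start at an aligned address")
        if length not in (2, 4, 8, 16):
            raise ValueError("WRAP burst length must be 2, 4, 8 or 16")
        total = size_bytes * length            # bytes covered by the burst
        lower = (start // total) * total       # wrap boundary
        return [lower + (start - lower + i * size_bytes) % total
                for i in range(length)]
    raise ValueError(f"unknown burst type: {burst_type}")


# The slide's example: address 0x00, 4 transfers of 32 bits (4 bytes).
incr = burst_addresses(0x00, 4, 4, "INCR")
# Same burst starting at 0x04 as a wrapping cache-line access.
wrap = burst_addresses(0x04, 4, 4, "WRAP")
fixed = burst_addresses(0x100, 4, 4, "FIXED")
```

For the slide's example, `incr` steps through 0x00, 0x04, 0x08, 0x0C, while `wrap` starts at 0x04 and returns to 0x00 after reaching the upper limit of the 16-byte block.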
AXI burst and handshake examples
[Figure: overlapping read bursts]
[Figure: write transaction handshake dependencies. Legend: one or another or both; both have to be present]
AXI on Cortex-A microprocessor
 64/128-bit configurable AXI bus
 Dedicated read, write, and address channels
 Up to 23 outstanding transactions with out-of-order completion
 AMBA® Designer interconnect design tool
[Figure: Cortex-A8]
Zynq-7000 AP SoC interfaces and signals
[Crockett, The Zynq Book, 2014, p. 192]
GP – General Purpose
HP – High Performance
ACP – 64-bit Accelerator Coherency Port for asynchronous cache-coherent access
Zynq-7000
 Zynq-7000 devices are equipped with dual-core ARM Cortex-A9 processors integrated with 28 nm Artix-7 or Kintex®-7 based programmable logic [Xilinx Web]
Zynq-7000 AP SoC
[Zynq-7000 All Programmable SoC Technical Reference Manual UG585 (v1.11), September 27, 2016]
ARM CoreLink™ System (IP)
 Components and methodology for systems based on AMBA (source: ARM.com):
 DMC: DRAM controller
 MMU: Memory Management Unit for hardware-assisted core virtualisation (handling privileged and shared access from hypervisors)
 NIC: hierarchical low-power and low-latency interconnect
CoreLink CCI-500 Cache Coherent Interconnect: coherency with up to four clusters, including big.LITTLE and coherent accelerators; higher performance and efficiency with an integrated snoop filter. Optimized for mobile. (Source: ARM.com)
ARM CoreLink™ System
 Security features [figure; source: ARM.com]
ARM CoreSight™ System
 Debug interface [figure; source: ARM.com]
ARM CoreSight™ System – Debug & Trace
 CoreSight™ debug interface [figure; source: ARM.com]
AMBA 5 AXI5, ACE5
 Design&Reuse blog by Phil Dworsky, Synopsys, Feb. 08, 2018: "Synopsys supports launch of Arm AMBA 5 AXI5, ACE5 protocols with 1st source code test suite and VIP"
Peripheral Interconnect (PI) bus
Copyright: Siemens AG 1994, Source: http://ensiwiki.ensimag.fr/images/b/b8/Pibus.pdf
 Open on-chip bus standard
 Defined by Open Microprocessor Systems Initiative
(ARM, SGS-Thomson, TEMIC-Matra MHS, Philips,
Siemens)
 Synchronous and processor-independent shared bus
system
 Memory-mapped data transfers
 Multiple masters, multiple slaves
 Bus arbiter periodically analyses requests from
masters
 Free VHDL code
 Processor independent
 Demultiplexed operation
 Clock - synchronous
 Peak transfer rate of 200 Mbytes/s (@ 50 MHz bus clock)
 Address and data bus scalable (up to 32 bits)
 8-/16-/32-bit data accesses
 Broad range of transfer types from single to multiple data
transfers
 Multimaster capability
PI-Bus does not provide:
 Cache coherency support
 Broadcasts
 Dynamic bus sizing
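The quoted peak rate of 200 Mbytes/s follows directly from the bus width and the clock frequency. A small worked check, assuming one data transfer per clock cycle (the helper name `peak_bandwidth_bytes_per_s` is hypothetical):

```python
def peak_bandwidth_bytes_per_s(bus_width_bits, clock_hz, transfers_per_cycle=1):
    """Peak bandwidth = bytes per transfer x transfers per second."""
    return (bus_width_bits // 8) * clock_hz * transfers_per_cycle


# PI bus: 32-bit data bus at a 50 MHz bus clock.
pi_bus_peak = peak_bandwidth_bytes_per_s(32, 50_000_000)
# The same bus used only for 8-bit accesses.
pi_bus_byte_accesses = peak_bandwidth_bytes_per_s(8, 50_000_000)
```

With full 32-bit accesses the result is 200,000,000 bytes/s; restricting the bus to 8-bit accesses cuts the peak rate to a quarter of that.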
IBM CoreConnect
 Designed around IBM PowerPC core (adaptable to other cores)
 Slave-centric (in contrast to most other master-centric buses)
 Multiple shared slave segments
 Three buses:
1. Processor Local Bus (PLB), 128 bits
• high bandwidth, low latency
• connects processor cores, external memory interfaces, DMA controllers
2. On-Chip Peripheral Bus (OPB), 32 bits, multiple masters and slaves
3. Device Control Register Bus (DCR)
 Decoupled address, read and write data buses (concurrent read and write transfers)
 Address pipelining
 No-fee, no-royalty licensing for companies
Source: www.design-reuse.com
CoreConnect vs AMBA
 | IBM CoreConnect PLB | AMBA 2.0 AHB
Bus architecture | 32, 64, 128, extendable to 256 bits | 32, 64, and 128 bits
Data buses | Separate read and write | Separate or three-state
Key capabilities | Multiple bus masters; four-deep read pipeline, two-deep write pipeline; split transactions; burst transfers; line transfers | Multiple bus masters; pipelining; split transactions; burst transfers; line transfers
 | OPB (on-chip peripheral bus) | AMBA APB
Masters supported | Multiple masters | Single master: the APB bridge
STMicroelectronics STBus
 Dedicated to consumer applications (set-top boxes, digital HDTV, digital cameras)
 Set of architectures, interfaces and protocols
 Each operation consists of one or several request/response pairs
 3 types of STBus protocols:
 Type 1: simple protocol for peripheral register access (no pipeline, acts as Request/Grant protocol)
 Type 2: Type 1 + additional pipeline features and operation codes for ordered transactions
 Type 3: Type 2 + advanced protocol implementing split transactions (splitting into "request transaction" and "reply transaction")
 http://www.design-reuse.com/articles/16092/stbus-complex-interconnect-design-and-verification-for-a-hdtv-soc.html
Source: www.design-reuse.com
STBus Arbitration
 Static priority (non-preemptive)
 Programmable priority
 Latency based
 Each master has a counter register, loaded with its maximum latency when it issues a request
 Counters are decreased at each access cycle
 Arbiter grants access to the master with the lowest counter register value
 In case of a tie, static priorities are used
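The latency-based scheme can be sketched in a few lines. This is a simplified model, not STBus itself: it assumes every grant occupies exactly one access cycle and that counters stop at zero, and the names `pick_master` and `grant_order` are hypothetical.

```python
def pick_master(counters, static_priority):
    """Grant the master with the lowest counter; break ties by static priority.

    `counters` maps master -> remaining latency budget.
    `static_priority` maps master -> rank (lower number wins a tie).
    """
    return min(counters, key=lambda m: (counters[m], static_priority[m]))


def grant_order(max_latency, static_priority):
    """Serve all pending requests and return the order of grants."""
    counters = dict(max_latency)  # counters are loaded at request time
    order = []
    while counters:
        winner = pick_master(counters, static_priority)
        order.append(winner)
        del counters[winner]
        # Every master still waiting loses one cycle of its latency budget.
        counters = {m: max(c - 1, 0) for m, c in counters.items()}
    return order


order = grant_order(
    max_latency={"cpu": 5, "video": 3, "audio": 3},
    static_priority={"cpu": 0, "video": 1, "audio": 2},
)
```

In this example `video` and `audio` tie on the lowest counter, so the static priority decides in favour of `video`; `cpu` has the largest latency budget and is served last even though it has the best static priority.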
Wishbone
 Public version available since 2002 via OpenCores, updated in 2010 (Wishbone Rev. B4)
 "Logic bus": does not specify electrical information or topology
 Aim: support use of open cores
 Scalable bus architecture based on simple master/slave handshake communications
 Configurable bit widths: 8, 16, 32
 Supported interconnection topologies:
 Direct P2P (master to slave)
 Dataflow interconnection based on systolic array architectures
 Shared bus and crossbar switch interconnection (most commonly used in SoCs)
 http://opencores.org/opencores,wishbone
Wishbone
 Examples of Wishbone interconnection topologies:
[Figures: shared bus; pipeline; crossbar switch. Source: wikipedia.org]
Open Core Protocol International Partnership (OCP-IP)
 Socket (bus wrapper) interface for SoC design
 Configurable and highly scalable interface for on-chip subsystem communications
 Non-profit, open-industry standards body
 www.ocpip.org
 OCP phases:
 Request
 Response
 Data Handshake
Source: Prashant D. Karandikar, Texas Instruments Inc
Open Core Protocol International Partnership (OCP-IP)
 Multiple request, multiple data possible
 Reduced chip area by configuring into the OCP interfaces only those features needed by the communicating cores
 Simplified system verification and testing by providing built-in test mechanisms
 Protocols for cache coherence
Source: Prashant D. Karandikar, Texas Instruments Inc [OCP Specification 3.0]
Virtual Socket Interface Alliance (VSIA)
 VSI enables system-level interaction on a chip using
predesigned blocks called virtual components (VC).
 Hard VC (placed and routed)
 Soft VC (HDL description)
 Firm VC (in the form of generators or partially placed
library blocks)
 Other buses can interface over VCs following VSI
standard protocols
VSIA was founded in 1996 and dissolved in 2008. All documents have been released into the public domain:
http://www.vsia.org - legacy documents
Nios II – Avalon Bus
 Switch Fabric to connect CPU, DMA, memory
and memory mapped peripherals on FPGA
 Master-slave modules
• Master address is a byte address
• Slave address is a word address
 Synchronous, rising clock edge
 Separate data in and out
 Data path widths of 8, 16, …, 1024 bits (word size)
 Slave-side arbitration,
multiple simultaneous masters (each master-slave
pair has a dedicated connection between them)
 Pipelined read transfers, Burst transfers
Avalon does address translation and multiple accesses when needed
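The byte-address to word-address translation, and the multiple accesses needed when the slave is narrower than the master, can be sketched as follows. This is a simplified model that assumes an aligned master access; `slave_base` and the function name `slave_word_accesses` are hypothetical.

```python
def slave_word_accesses(master_byte_addr, slave_base, master_width_bits, slave_width_bits):
    """Translate one master access into the slave word addresses it touches.

    The master issues a byte address; the slave is addressed in words of
    its own width, counted from the slave's base address.
    """
    master_bytes = master_width_bits // 8
    slave_bytes = slave_width_bits // 8
    offset = master_byte_addr - slave_base
    if offset < 0:
        raise ValueError("address is below the slave's base address")
    first_word = offset // slave_bytes
    # A narrower slave needs several accesses to fill one master word.
    count = max(1, master_bytes // slave_bytes)
    return [first_word + i for i in range(count)]


# A 32-bit master reads byte address 0x1008 from a slave mapped at 0x1000.
same_width = slave_word_accesses(0x1008, 0x1000, 32, 32)
half_width = slave_word_accesses(0x1008, 0x1000, 32, 16)
byte_wide = slave_word_accesses(0x1008, 0x1000, 32, 8)
```

The same master access maps to a single word of a 32-bit slave, to two consecutive words of a 16-bit slave, and to four consecutive words of an 8-bit slave.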
Avalon Multi-Mastering
[Figure: example multi-master system that permits bus transfers between two masters and two slaves; source: Simultaneous Multi-Mastering with the Avalon Bus, Application Note 184, Altera 2002]
Avalon Bus Control Signals:
• Master Request Slave (MRS)
• Master Select Granted (MSG)
• Wait
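Slave-side arbitration means that masters compete only when they target the same slave in the same cycle. The sketch below models one arbitration round; the fixed-priority tie-break and the name `slave_side_arbitration` are assumptions for illustration, not the policy defined by Avalon.

```python
def slave_side_arbitration(requests, priority):
    """Resolve one cycle of requests with one arbiter per slave.

    `requests` maps master -> slave it wants to access this cycle.
    `priority` maps master -> rank (lower number wins; assumed policy).
    Returns (grants, waiting): slave -> granted master, and the masters
    that have to wait.
    """
    grants = {}
    for master in sorted(requests, key=priority.get):
        slave = requests[master]
        if slave not in grants:  # first (highest-priority) requester wins
            grants[slave] = master
    waiting = sorted(m for m in requests if grants[requests[m]] != m)
    return grants, waiting


priority = {"cpu": 0, "dma": 1}
# Different slaves: both transfers proceed simultaneously.
parallel = slave_side_arbitration({"cpu": "sram", "dma": "uart"}, priority)
# Same slave: only one master is granted, the other is stalled.
conflict = slave_side_arbitration({"cpu": "sram", "dma": "sram"}, priority)
```

In the first case both masters are granted at once; in the second case `dma` is the master that would see the Wait signal asserted until the slave becomes free.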
Summary
[Figure: PCI Express slots (x4, x16, x1, x16) compared with a conventional 32-bit PCI slot; source: https://en.wikipedia.org/wiki/Bus_(computing)]

Commercial and Open SoC buses_AMBA_OCP.ppt

  • 1.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Commercial and Open SoC busses AMBA bus, PI bus, IBM Core Connect ST bus, Wishbone OCP-IP 1
  • 2.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Outline  Introduction to bus architectures  AMBA bus  PI bus  IBM Core Connect  ST bus  Wishbone  OCP-IP  Virtual Socket Interface Alliance (VSIA)  Nios II Avalon Bus Extra material
  • 3.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Introduction  Bus = shared wires by multiple communicating units  Connection logic to avoid electrical conflicts  Arbitration to determine bus ownership in time  Protocols – set of rules for transmitting information between multiple units  Buses designed for PCs and PCBs are not suitable for SoC-s  Designed for backplane  Limited speed  Limited number of signals 3
  • 4.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Bus protocol  the type and order of data being sent;  how the sending device indicates that it has finished sending the information;  the data compression method used, if any;  how the receiving device acknowledges successful reception of the information; and  how arbitration is performed to resolve contention on the bus and in what priority, and the type of error checking to be used. 4 [Flynn 2011]
  • 5.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Bus-based approach 5 [Flynn 2011] Possible hierarchy of buses to optimize system-level performance and cost Cores or IPs • Format conversion • Segmentation • Buffering Address Data Control Power A bus might deliver power to peripherals
  • 6.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Bus architectures  Unified or split (address and data)  Simple with request acknowledgement signals  Bus with arbitration support  Tenured split bus (bus is occupied only during associated address or data cycles) 6 [Flynn 2011] Memory access time for the first word
  • 7.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Bus Architectures Technology AMBA AXI (AMBA 3) CoreConnect Smart Interconnect IP Nexus Company ARM ARM IBM Sonics Fulcrum** Core type Soft/hard Soft/hard Soft Soft Hard Architecture Bus Unidirectional channels Bus Bus NOC using direct switch Bus width 8 - 1024 8 - 1024 32/64/128 16 8 - 128 Frequency 200 MHz 400 MHz* 100 - 400 MHz 300 MHz 1 GHz Maximum BW (GB/s) 3 6.4* 2.5 - 24 4.8 72 Minimum latency (ns) 5 2.5* 15 n/a 2 7 [Flynn 2011] * As implemented in the ARM PL330 high-speed controller. **Fulcrum was acquired by Intel in 2011
  • 8.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Outline  PI bus  AMBA bus  IBM Core Connect  ST bus  Wishbone  OCP-IP  Virtual Socket Interface Alliance (VSIA)  Nios II Avalon Bus
  • 9.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Advance Microcontroller Bus Architecture (AMBA)  Develped by ARM in 1996  Distinct busses in AMBA specification: 1. Advanced High performance Bus (AHB) 2. Advanced Peripheral Bus (APB) 3. [Advanced System Bus (ASB) – designed for lower performance systems, outdated]  AXI - Advanced Extensible Interface (since AMBA 3) [Flynn 2011] ARM
  • 10.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AMBA AHB  High-performance on-chip backbone bus  Connecting: processors, on-chip and off-chip memory interfaces  DMA capability  Bridge to APB bus  Features: burst transfers, split transactions, single cycle bus master transfer  Single clock edge (rising) operation and non-tristate (central multiplexer) implementation  Central arbiter  Tenured (address phase can occur during previous data phase
  • 11.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AMBA AHB bus transaction steps  Bus Master obtains access to the Bus  Arbiter resolves simultaneous requests  Bus Master initiates transfer, driving signals:  Address, Width, Direction  Burst options  Bus Slave provides a Response  Success | need for delay (wait states) | error 11
  • 12.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AMBA APB bus cycle  APB bus is optimised for minimal power and low complexity (less performance)  Used to interface to peripherals, which are low bandwidth  Three state working diagram: Idle – Setup – Enable (actual transfer cycles) 12
  • 13.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AMBA Advanced Extensible Interface (AXI)  AXI4 - AMBA 4.generation for high-performance, high frequency  Features:  Unaligned data transfers using byte strobes  Burst-based transactions  Backward compatibele with AHB and APB interfaces  Separate address/control and data phases  Separated read and write data channels (providing low-cost DMA)  Two channels for address (read, write) and control signals  Additional write response channel for signaling completion of write transactions  AXI protocol supports several types of bursts: normal memory access, wrapping cache line, streaming data to peripheral FIFOs  Power management features  ACE (Advance Cache Coherency Extensions)  Out-of-Order transaction completion  5 channels
  • 14.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AXI protocol 14 Data channels can be: 8,16,32, … , 1024 bits wide Three system topologies: • Shared address and data buses • Shared address buses and multiple data buses • Multilayer, with multiple address and data buses Channel architecture of reads Channel architecture of writes
  • 15.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AXI handshake  VALID-READY signals  Each channel (address, data, response) has own handshake signal pair  Transfer happens at clock edge T3 (Fig. 1 and 2) or when both are ready, at T2 (fastest), Fig. 3.  There are more control signals in all channels! 15 1 2 3
  • 16.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AXI burst transfer  FIXED – address is same for every transfer in the burst (e.g. loading or unloading a FIFO)  INCR – address for each transfer is an increment of the address for previous transfer (increment depends on the size of transfer)  WRAP – similar to INCR but address wraps around to lower address when upper address limit is reached (used for cache line accesses)  Unaligned transfers  Write strobes - indicate valid bytes in transfer  Narrow transfers – transfers narrower than data bus:  Fixed lanes (FIXED)  different byte lanes (INCR or WRAP) 16 Example: Address 0x00, aligned, burst length – 4 transfers, transfer length – 32 bits
  • 17.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AXI burst and handshake examples Overlapping read bursts 17 One or another or both Both have to be present Write transaction handshake dependencies
  • 18.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS AXI on Cortex-A microprocessor  64/128-bit Configurable AXI Bus  dedicated read, write, and address channels  up to 23 outstanding transactions with out- of-order completion  AMBA® Designer interconnect design tool 18 Cortex-A8
  • 19.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Zync-7000 AP SoC interfaces and signals 19 [Crockett, Zync book 2014, p192] GP – General Purpose HP – High Performance ACP – 64-bit Accelerator Coherency Port for asynchronous cache-coherent access
  • 20.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Zync-7000  Zynq-7000 devices are equipped with dual-core ARM Cortex-A9 processors integrated with 28nm Artix-7 or Kintex®-7 based programmable logic [Xilinx Web] 20
  • 21.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Zynq-7000 AP SoC 21 [Zynq-7000 All Programmable SoC Technical Reference Manual UG585 (v1.11) September 27, 2016]
  • 22.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS ARM CoreLinkTM System (IP)  Components and Methodology for Systems based on AMBA: Source: ARM.com  DMC: DRAM Controller  MMU: Memory Managing Unit for hardware-assisted core virtualisation (handling privileged and shared access from hypervisors)  NIC: hierarchical low- power and low latency interconnect Source: ARM.com CoreLink CCI-500 Cache Coherent Interconnect for coherency with up to four clusters including big.LITTLE and coherent accelerators, and higher performance and efficiency with integrated snoop filter. Optimized for mobile.
  • 23.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS ARM CoreLinkTM System  Components and Methodology for Systems based on AMBA: Source: ARM.com
  • 24.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS ARM CoreLinkTM System  Security Features: Source: ARM.com
  • 25.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS ARM CoreSightTM System  Debug Interface: Source: ARM.com
  • 26.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS ARM CoreSightTM System – Debug&Trace  CoreSightTM Debug Interface: Source: ARM.com
  • 27.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Amba 5 AXI5, ACE5  Design&Reuse blog by Phil Dworsky, Synopsys Feb. 08, 2018 Synopsys supports launch of Arm AMBA 5 AXI5, ACE5 p rotocols with 1st source code test suite and V IP 27
  • 28.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Outline  AMBA bus  PI bus  IBM Core Connect  ST bus  Wishbone  OCP-IP  Virtual Socket Interface Alliance (VSIA)  Nios II Avalon Bus
  • 29.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Peripheral Interconnect (PI) bus Copyright: Siemens AG 1994, Source: http://ensiwiki.ensimag.fr/images/b/b8/Pibus.pdf
  • 30.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS  Open on-chip bus standard  Defined by Open Microprocessor Systems Initiative (ARM, SGS-Thomson, TEMIC-Matra MHS, Philips, Siemens)  Synchronous and processor-independent shared bus system  Memory-mapped data transfers  Multiple masters, multiple slaves  Bus arbiter periodically analyses requests from masters  Free VHDL code Peripheral Interconnect (PI) bus
  • 31.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS  Processor independent  Demultiplexed operation  Clock - synchronous  Peak transfer rate of 200 Mbytes/s (@ 50 MHz bus clock)  Address and data bus scalable (up to 32 bits)  8-/16-/32-bit data accesses  Broad range of transfer types from single to multiple data transfers  Multimaster capability PI-Bus does not provide:  Cache coherency support  Broadcasts  Dynamic bus sizing Peripheral Interconnect (PI) bus
  • 32.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Outline  AMBA bus  PI bus  IBM Core Connect  ST bus  Wishbone  OCP-IP  Virtual Socket Interface Alliance (VSIA)  Nios II Avalon Bus
  • 33.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS  Designed around IBM PowerPC core (adaptable to other cores)  Slave-centric (in contrast to most other master-centric busses)  Multiple shared slave segments  Three buses: 1. Processor Local Bus (PLB), 128 bits • high bandwidth, low latency • Processor cores, external memory interfaces, DMA controllers 2. On-Chip Peripheral Bus (OPB), 32 bits, multiple masters and slaves 3. Device Control Register Bus (DCR)  Decoupled address, read and write data buses (concurrent read and write transfers)  Address pipelining  No-fee, no-royalty to companies IBM Core Connect Source: www.design-reuse.com
  • 34.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS CoreConnect vs AMBA IBM CoreConnect PLB AMBA 2.0 HPB Bus architecture 32, 64, 128, extendable to 256 bits 32, 64, and 128 bits Data buses Separate read and write Separate or three-state Key capabilities Multiple bus masters Four-deep read pipeline, Two-deep write pipeline Split transactions Burst transfers Line transfers OPB (on-chip peripheral bus) Multiple bus masters Pipelining Split transactions Burst transfers Line transfers AMBA APB Masters supported Multiple masters Single master: The APB bridge 34
  • 35.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Outline  AMBA bus  PI bus  IBM Core Connect  ST bus  Wishbone  OCP-IP  Virtual Socket Interface Alliance (VSIA)  Nios II Avalon Bus
  • 36.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS  Dedicated to consumer applications (set-top boxes, digital HDTV, digital cameras)  Set of architectures, interfaces and protocols  Each operation consists of one or several request/response pairs  3 types of STBus protocols:  Type1: simple protocol for peripheral register access (no pipeline, acts as Request/Grant protocol)  Type 2: Type1 + additional pipeline features and operation codes for ordered transactions  Type 3: Type 2 + advanced protocol implementing split transactions (splitting into „request transaction“ and „reply transaction“)  http://www.design-reuse.com/articles/16092/stbus-complex-interconnect- design-and-verification-for-a-hdtv-soc.html ST Microelectronics STBus Source: www.design-reuse.com
  • 37.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS STBus Arbitration  Static priority (non-preemtive)  Programmable priority  Latency based  Masters have counter register, loaded with max latency during request  Counters are decreased at each access cycle  Arbiter grants access to master with lowest counter register value  When draw, static priorities will be used 37
  • 38.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Outline  AMBA bus  PI bus  IBM Core Connect  ST bus  Wishbone  OCP-IP  Virtual Socket Interface Alliance (VSIA)  Nios II Avalon Bus
  • 39.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS  Public version available since 2002 via Open-Core, updated in 2010 (Wishbone Rev.B4)  „logic bus“– does not specify electrical information or topology  Aim: support use of open cores  Scalable bus architecture based on simple master/slave handshake communications  Configurable bitwidths: 8, 16, 32  Supported interconnection topologies:  Direct P2P (master to slave)  Dataflow interconnection based on systolic array architectures  Shared bus and crossbar switch interconnection (most commonly used in SoCs)  http://opencores.org/opencores,wishbone Wishbone
  • 40.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS  Examples for Wishbone interconnection topologies: Wishbone Shared Bus. Source: wikipedia.org Pipeline. Source: wikipedia.org Crossbar Switch. Source: wikipedia.org
  • 41.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS Outline  AMBA bus  PI bus  IBM Core Connect  ST bus  Wishbone  OCP-IP  Virtual Socket Interface Alliance (VSIA)  Nios II Avalon Bus
  • 42.
    SoC Design Thomas Hollstein| Kalle Tammemäe | Dept. of CS  socket (bus wrapper) interface for SoC design  configurable and highly scalable interface for on-chip subsystem communications  non-profit, open-industry standards body  www.ocpip.org  OCP phases  Request  Response  Data Handshake Open Core Protocol International Partnership (OCP-IP) Source: Prashant D. Karandikar, Texas Instruments Inc
  • 43.
    Open Core Protocol International Partnership (OCP-IP)
    - Multiple-request, multiple-data transfers possible
    - Reduced chip area: only the features needed by the communicating cores are configured into the OCP interfaces
    - Simplified system verification and testing through built-in test mechanisms
    - Protocols for cache coherence
    Source: Prashant D. Karandikar, Texas Instruments Inc. [OCP Specification 3.0]
  • 45.
    Virtual Socket Interface Alliance (VSIA)
    - VSI enables system-level interaction on a chip using predesigned blocks called virtual components (VCs):
      - Hard VC (placed and routed)
      - Soft VC (HDL description)
      - Firm VC (in the form of generators or partially placed library blocks)
    - Other buses can interface over VCs following VSI standard protocols
    VSIA was founded in 1996 and dissolved in 2008. All documents were released into the public domain: http://www.vsia.org (legacy documents)
  • 46.
    Nios II – Avalon Bus
    - Switch fabric connecting CPU, DMA, memory and memory-mapped peripherals on an FPGA
    - Master-slave modules
      - A master address is a byte address
      - A slave address is a word address
    - Synchronous, rising clock edge
    - Separate data-in and data-out paths
    - Data path widths of 8, 16, …, 1024 bits (word size)
    - Slave-side arbitration, multiple simultaneous masters (each master-slave pair has a dedicated connection)
    - Pipelined read transfers, burst transfers
    Note: Avalon does address translation and multiple accesses when needed
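The byte-address to word-address translation, and the splitting of one wide master access into several narrow slave accesses, can be sketched as follows. This is an illustrative model only: the function names, the `slave_base` parameter, and the assumption that every access is aligned to the slave word size are hypothetical and not taken from the Avalon specification.

```python
def to_slave_word_address(master_byte_addr, slave_base, slave_width_bits):
    """Translate a master byte address into a slave-local word address.

    Assumes the access is aligned to the slave's word size.
    """
    bytes_per_word = slave_width_bits // 8
    offset = master_byte_addr - slave_base
    if offset < 0:
        raise ValueError("address is below the slave's base address")
    return offset // bytes_per_word


def split_access(master_byte_addr, master_width_bits, slave_base, slave_width_bits):
    """List the slave word addresses needed to serve one master access.

    A master wider than the slave needs several consecutive slave accesses;
    a master that is as wide or narrower needs just one.
    """
    master_bytes = master_width_bits // 8
    slave_bytes = slave_width_bits // 8
    count = max(1, master_bytes // slave_bytes)
    first = to_slave_word_address(master_byte_addr, slave_base, slave_width_bits)
    return [first + i for i in range(count)]


# A 32-bit master reads byte address 0x1008 from a 16-bit slave mapped at 0x1000.
print(split_access(0x1008, 32, slave_base=0x1000, slave_width_bits=16))
```

For the same master address, a 32-bit slave at the same base would be accessed once, at word address 2.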
  • 47.
    Avalon Multi-Mastering
    [Figure: example multi-master system that permits bus transfers between two masters and two slaves. Source: Simultaneous Multi-Mastering with the Avalon Bus, Application Note 184, Altera, 2002]
    Avalon bus control signals:
    - Master Request Slave (MRS)
    - Master Select Granted (MSG)
    - Wait
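Slave-side arbitration means that contention is resolved per slave, so masters that target different slaves proceed in the same cycle. The sketch below shows the idea in Python. The function name, the dictionary-based interface, and the fixed priority order used to break conflicts are assumptions made for this example; they are not the arbitration policy of the Avalon fabric.

```python
def slave_side_arbitrate(requests, priority):
    """Resolve one cycle of requests with one arbiter per slave.

    requests: dict mapping master name -> slave it wants to access
    priority: list of master names, highest priority first (assumption)
    Returns a dict mapping slave name -> granted master. A master that
    loses arbitration is simply absent from the result and has to wait.
    """
    grants = {}
    for master in priority:
        slave = requests.get(master)
        if slave is not None and slave not in grants:
            grants[slave] = master
    return grants


# Different slaves: both masters transfer simultaneously.
print(slave_side_arbitrate({"cpu": "sram", "dma": "uart"}, ["cpu", "dma"]))

# Same slave: only one master is granted, the other waits.
print(slave_side_arbitrate({"cpu": "sram", "dma": "sram"}, ["cpu", "dma"]))
```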
  • 48.
    Summary
    [Figure: PCI Express slots (x4, x16, x1, x16) compared with a conventional 32-bit PCI slot. Source: https://en.wikipedia.org/wiki/Bus_(computing)]

Editor's Notes

  • #1 Updated for 2014
  • #2 + Sonics Smart Interconnect (Sonics) + Avalon (Altera) + MARBLE (University of Manchester) + CoreFrame (palmChip) ... https://www.design-reuse.com/
  • #3 A bus can be thought of as a corridor connecting multiple rooms. If someone is moving from room A to room B, all other doors have to be closed. [http://www.esemagazine.co.uk/common/viewer/archive/2003/Jul/22/feature7.phtm]
  • #4 Michael J. Flynn, Wayne Luk: Computer System Design: System-on-Chip. Wiley, 2011.
  • #5 A bus bridge is a module that connects two buses, which are not necessarily of the same type. Its functions: format conversion; segmenting traffic (concurrent usage of the buses); buffering transactions between buses (so that the next transaction can proceed sooner). A bus can deliver power too!
  • #6 Local buffers to record address (addresses)
  • #7 AXI (Advanced eXtensible Interface) was introduced with AMBA 3 and is burst-based. The protocol is designed to: • be suitable for high-bandwidth and low-latency designs • enable high-frequency operation without using complex bridges • meet the interface requirements of a wide range of components • be suitable for memory controllers with high initial access latency • provide flexibility in the implementation of interconnect architectures • be backward-compatible with existing AHB and APB interfaces.
  • #9 AXI - Advanced eXtensible Interface [Wikipedia]. AMBA was introduced by ARM in 1996. The first AMBA buses were Advanced System Bus (ASB) and Advanced Peripheral Bus (APB). In its second version, AMBA 2, ARM added the Advanced High-performance Bus (AHB), a single clock-edge protocol. In 2003, ARM introduced the third generation, AMBA 3, including AXI for even higher-performance interconnect and the Advanced Trace Bus (ATB) as part of the CoreSight on-chip debug and trace solution. In 2010 the AMBA 4 specifications were introduced, starting with AMBA 4 AXI4; in 2011 system-wide coherency was added with AMBA 4 ACE. In 2013 the AMBA 5 CHI (Coherent Hub Interface) specification was introduced, with a redesigned high-speed transport layer and features designed to reduce congestion. AXI is used in ARM Cortex-A processors. These protocols are today the de facto standard for 32-bit embedded processors because they are well documented and can be used without royalties. https://www.design-reuse.com/industryexpertblogs/43528/synopsys-arm-amba-5-axi5-ace5-source-code-test-suite-vip.html
  • #10 Why is a multiplexer better than tri-state?
  • #11 The bus either stays in the idle state or, for a transfer, cycles between setup (address decoding) and enable (actual transfer)
  • #13 + advanced cache support + exclusive access (semaphores) + register slicing (maximizes operation frequency by matching channel latency to channel delay). AMBA AXI and ACE Protocol Specification, 306 pages.
  • #15 Suggested doc: AMBA AXI Protocol Specification, http://mazsola.iit.uni-miskolc.hu/~drdani/docs_arm/AMBAaxi.pdf It is even possible that the write data is ready before the address (address might have mode registers before the bus) – alignment is necessary.
  • #16 Bursts are limited to at most 16 transfers; the transfer size ranges from 1 to 128 bytes and is defined by the ARSIZE value
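The relationship between the ARSIZE field and the burst payload can be worked through in a short Python sketch. It assumes the usual AXI encoding in which ARSIZE is a 3-bit field selecting 2^ARSIZE bytes per transfer, and a 4-bit ARLEN field giving the number of transfers minus one, which matches the 16-transfer limit in this note. The function name `axi_read_burst` is hypothetical.

```python
def axi_read_burst(arsize, arlen):
    """Return (bytes per transfer, transfers per burst, total bytes).

    Assumptions: ARSIZE is a 3-bit field encoding 2**ARSIZE bytes per
    transfer (1 to 128 bytes); ARLEN is a 4-bit field encoding the number
    of transfers minus one (1 to 16 transfers).
    """
    if not 0 <= arsize <= 7:
        raise ValueError("ARSIZE must be in 0..7")
    if not 0 <= arlen <= 15:
        raise ValueError("ARLEN must be in 0..15")
    bytes_per_transfer = 1 << arsize
    transfers = arlen + 1
    return bytes_per_transfer, transfers, bytes_per_transfer * transfers


print(axi_read_burst(arsize=2, arlen=3))    # four 32-bit transfers
print(axi_read_burst(arsize=7, arlen=15))   # largest burst under these limits
```

Under these assumptions the largest burst moves 16 transfers of 128 bytes each, that is 2048 bytes.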
  • #17 BVALID – Write response valid (channel is signalling) BREADY – Response ready (master can accept write response)
  • #18 http://www.ti.com/general/docs/lit/getliterature.tsp?baseLiteratureNumber=spry112&fileType=pdf AXI4 protocol specification – 306 pages ETM - Embedded Trace Macrocell
  • #19 The Zynq-7000 AP SoC contains a large number of fixed and flexible I/Os. It has a constant 128 pins dedicated to memory interfaces (DDR I/O). MIO – Multiplexed I/O. Zynq-7020 - 85k programmable logic cells
  • #22 ARM® CoreLink™ Interconnect provides the components and the methodology for designers to build SoCs based on the AMBA® specifications. There are three families of interconnect products. CoreLink CCN Cache Coherent Network - Designed for infrastructure applications. … for up to twelve CPU clusters (48 cores) supporting up to 32MB of L3 cache for highest compute density. CoreLink CCI Cache Coherent Interconnect - Optimized for mobile. … for coherency with up to six clusters including big.LITTLE and future fully coherent GPU. Includes performance and efficiency benefits from integrated snoop filter. CoreLink NIC Network Interconnect - Highly configurable for SoC wide connectivity, multiple applications. … is optimised to build the lowest latency, highest area applications for AMBA 4, AMBA 3 and AMBA 2. ARM big.LITTLE is a heterogeneous computing architecture developed by ARM Holdings, coupling relatively battery-saving and slower processor cores (LITTLE) with relatively more powerful and power-hungry ones (big). Typically, only one "side" or the other will be active at once, but since all the cores have access to the same memory regions, workloads can be swapped between big and LITTLE cores on the fly.
  • #23 The ARM® CoreLink™ interconnect family from the home of AMBA® is the lowest risk solution for on-chip communication. Designed and tested with ARM Cortex and Mali processors, CoreLink interconnect from ARM provides balanced service for both low latency and high bandwidth data streams.
  • #25 Embedded Trace Macrocells (ETM), Instrumentation Trace Macrocell (ITM), System Trace Macrocell (STM), Trace Memory Controller (TMC)
  • #29 1994 Rev. 03d
  • #30 September 11th, 1995. Five of Europe's major semiconductor companies - Advanced RISC Machines (ARM), Philips Semiconductors, SGS-THOMSON Microelectronics, Siemens and Temic/Matra MHS - have announced an agreement to licence a jointly developed on-chip bus protocol to other companies. The bus, known as the Peripheral Interconnect Bus (PI Bus), was developed by the five partners within the framework of a 3-year European Union ESPRIT Open Microprocessor Initiative (OMI) project. It is particularly suitable for use in very large scale integrated circuits using deep submicron technologies and modular architectures.
  • #33 IBM makes the CoreConnect bus available as a no-fee (since 1999), no-royalty architecture to tool vendors, core IP companies, and chip-development companies. As such it is licensed by over 1500 electronics companies such as Cadence, Ericsson, Lucent, Nokia, Siemens and Synopsys. http://www.eetimes.com/document.asp?doc_id=1188018 E.g. the Xilinx MicroBlaze™ processor uses the same bus for peripherals as the IBM PowerPC® processor. CoreConnect has bridging capabilities to the competing AMBA bus architecture (similar to AMBA 2.0).
  • #36 Type 1 is similar to the IBM CoreConnect DCR bus. All types have a MUX-based implementation, and shared, partial or full crossbar implementation. Dedicated to the dual HDTV market, the 65 nm chip recently developed by STMicroelectronics integrates one host CPU, two video decoders enabling the decoding of MPEG-2, H264 and VC1 video frames, dedicated microprocessors for audio decoding and many peripherals for internal or external exchanges. The STBus interconnect supports multiple clock domains. All clocks are considered fully asynchronous, even if there is an integer ratio between some of them. Type 3 allows out-of-order transaction completion.
  • #39 WISHBONE itself is not an IP core...it is a specification for creating IP cores.
  • #42 Recent specification: OCP 3.0, http://accellera.org/downloads/standards/ocp/, 494 pages. OCP-IP Specification Working Group members: • MIPS Technologies Inc. • Nokia • Sonics Inc. • Texas Instruments Incorporated • Toshiba Corporation Semiconductor Company • Cadence
  • #46 Avalon-MM pipelined read transfers increase the throughput for synchronous slave devices that require several cycles to return data for the first access. Such devices can typically return one data value per cycle for some time thereafter. New pipelined read transfers can start before readdata for the previous transfers is returned. Write transfers cannot be pipelined. A burst executes multiple transfers as a unit, rather than treating every word independently.
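The throughput gain from pipelined reads can be estimated with a deliberately simple cycle-count model. The sketch assumes a slave that accepts one address per cycle and returns data a fixed `latency` cycles later. Without pipelining, every read waits for its data before the next address is issued. The function name and the model itself are illustrative assumptions, not figures from the Avalon specification.

```python
def read_cycles(n_reads, latency, pipelined):
    """Estimate the cycles needed for n_reads back-to-back reads.

    Non-pipelined: each read costs one address cycle plus `latency`
    cycles of waiting before the next read can start.
    Pipelined: a new address is issued every cycle, so the latency is
    paid only once, after the last address has been issued.
    """
    if n_reads == 0:
        return 0
    if pipelined:
        return n_reads + latency
    return n_reads * (1 + latency)


for pipelined in (False, True):
    print(pipelined, read_cycles(n_reads=8, latency=3, pipelined=pipelined))
```

For eight reads with a three-cycle latency this model gives 32 cycles without pipelining and 11 cycles with it; the advantage grows with the number of consecutive reads.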
  • #47 http://extras.springer.com/2001/978-0-7923-7439-8/an/an184.pdf https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/manual/mnl_avalon_spec.pdf
  • #48 4 PCI Express bus card slots (from top to 2nd bottom: x4, x16, x1 and x16), compared to a 32-bit conventional PCI bus card slot (very bottom)