New solutions for wireless infrastructure applications

Moshe Anschel, Freescale

Published in: Technology, Business
  • DPAA - Any packet to any CPU to any accelerator or network interface without locks or semaphores
  • FFT, matrix/vector multiplication, complex FIR, correlation. In contrast, the TI C66 increases only the execution units, causing a memory bandwidth bottleneck and register pressure, which leads to low utilization of the execution units and lower performance.

    1. New Solutions for Wireless Infrastructure Applications
       Moshe Anschel, DSP System & Architecture Manager, Freescale
       May 2, 2012
    2. Agenda
       • The wireless baseband market trends and requirements
       • Freescale approach: QorIQ Qonverge B4860 overview
       • StarCore SC3900 Flexible Vector Processor architecture
    3. Macro Base Station Challenges
       Connectivity
       • Coverage: urban, highways and rural
       • Spectral efficiency: radio and network performance
       • Multi-standard: supports a variety of users
       • Reliability: zero downtime
       Capacity
       • Users: hundreds of active users
       • Throughputs: over 1 Gbps data rate
       • Scalable/modular: sectors, antennas, users…
       • Active antenna, MIMO: improved QoS
       Lowering Costs
       • Space: miniaturization and consolidation of equipment
       • Low impact: power and cost
       • Future proof: easy upgrades, SDR
       • Complete solutions: ease of development, faster time to market
       [Diagram labels: High Throughputs & Capacity, Coverage, Multi Standard & SDR, Many Active Users, Cost, Energy Efficiency]
    4. Industry Flagship for Performance, Power and Cost
       B4860 delivers the highest performance in the industry through intelligent, balanced integration with a focus on cost and power efficiency.
       • Optimal system cost: industry-leading levels of integration, drastically reducing chip count and component cost
       • Performance optimized: a leap in performance with the efficient, high-performance next generation of our field-proven DSP and MPU cores, as well as enhanced application-specific accelerators
       • Delivers on scalability: a common architecture from femto to macro, providing vertical and horizontal scalability; allows customers to leverage both software and hardware architectures
       • Power efficiency: the SoC solution allows for intelligent load balancing and power management
    5. Benefit of Intelligent Integration
       [Figure: a 3-sector, 20 MHz LTE design with 5 major components (DSP, multicore MPU and sRIO switch blocks, each with its own flash and DDR memory) compared with a 3-sector, 20 MHz LTE design on a single B4860 SoC. Labeled functions and interfaces: Layer-1, Layer-2/3, Transport, Maintenance & Control, antenna (CPRI), backhaul (10 Gbps GE and 1 Gbps PHYs), sRIO, I2C, UART, SPI, flash, DDR, power.]
       • 4x cost reduction
       • 3x power reduction
    6. QorIQ Qonverge B4860: Block Diagram & Benefits
       • Next-generation e6500 dual-thread Power Architecture® cores offer the highest CoreMark/Watt, with AltiVec technology for dramatic L2 scheduling acceleration
       • Next-generation SC3900 StarCore™ provides 2x DSP performance compared to competitive offerings
       • Above 21 GHz of programmable performance
       • Smart hardware acceleration for Layer 1, Layer 2, Control and Transport allows for best-in-class performance, power and cost
       • Large-scale SoC integration allows for simpler programming models and easier load balancing
       • Integrated, rich I/O, including backhaul and antenna interfaces, provides flexibility and interoperability and reduces overall system cost
    7. StarCore SC3900: Flexible Vector Processors
       • The StarCore SC3850 DSP is used in many base stations powered by the MSC815x family
       • The StarCore SC3900 is targeted to handle future base station requirements and challenges
       • The SC3900 architecture is presented next
    8. SC3900 Core & Clusters
       StarCore SC3900 FVP clusters:
       • Six SC3900 cores
       • Two SC3900 cores clustered under a 2 MB, multi-banked L2 cache
       • High-bandwidth accelerator ports (up to 1 Tbps per cluster)
       • Hardware support for memory coherency between the L1 and L2 caches and the main memory
       [Block diagram: two SC3900 FVP cores with L1 blocks labeled 32K, a 2 MB 16-way shared L2 cache with 4 banks, a high-speed baseband accelerators interface, and the CoreNet coherent fabric]
       Highest BDTI speed score: BDTI recently benchmarked the SC3900 core included in the Freescale B4860. Running at 1.2 GHz, the SC3900 core received a BDTIsimMark2000™ score of 37,460, the highest speed score recorded. See www.BDTI.com for details.
       [Bar chart, BDTIsimMark2000™ / BDTImark2000™: Freescale SC3900 at 1.2 GHz, 37,460; Texas Instruments C66x at 1.5 GHz, 20,030]
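The two chart values on this slide can be compared with a few lines of arithmetic. The following is a minimal sketch, not a BDTI method: the scores and clock rates are the ones quoted on the slide, while the per-GHz normalization and the names `speed_ratio` and `per_ghz_ratio` are illustrative assumptions.

```python
# Scores and clocks as quoted on the slide. The per-GHz normalization below is
# an illustrative assumption, not a metric defined by BDTI.
SC3900 = {"score": 37_460, "clock_ghz": 1.2}
C66X = {"score": 20_030, "clock_ghz": 1.5}


def speed_ratio(a, b):
    """Ratio of the two benchmark scores as published."""
    return a["score"] / b["score"]


def per_ghz_ratio(a, b):
    """Ratio of the scores after dividing each by its own clock rate."""
    return (a["score"] / a["clock_ghz"]) / (b["score"] / b["clock_ghz"])


if __name__ == "__main__":
    print(f"raw score ratio: {speed_ratio(SC3900, C66X):.2f}x")
    print(f"per-GHz ratio:   {per_ghz_ratio(SC3900, C66X):.2f}x")
```

With the slide's numbers the raw ratio comes out near 1.87x and the clock-normalized ratio near 2.34x.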
    9. SC3900 Optimized for Baseband L1 Processing
       • SC3900 is optimized to efficiently handle baseband PHY layer processing
       • PHY layer processing can be divided into three categories:
         – Computation-intensive DSP code (mainly MAC intensive)
         – Data manipulation and less intensive DSP code
         – Control code
       • Each of the categories is non-negligible in processing requirements
       • There is no clear boundary between the categories
       • SC3900 accelerates all types of baseband L1 processing
    10. Computation Intensive DSP Code Acceleration
       • SC3900 provides vector processor capability by increasing the number of execution units and optimizing the whole datapath accordingly
         – Up to 32 MACs per cycle (4x versus SC3850)
         – Optimized register file and memory throughput
       • The SC3900 optimized datapath leads to high MAC utilization
       • Performance:
         – SC3900 is 3.5x-4x better than SC3850 in intensive DSP code
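To put the 32 MACs per cycle figure in perspective, the sketch below computes a peak MAC rate and an ideal lower bound on cycles for a real-valued FIR filter. It assumes full utilization of every MAC unit, counts one MAC per tap per output sample, and borrows the 1.2 GHz clock from the BDTI benchmark slide. The function names are hypothetical and nothing here is measured SC3900 behavior.

```python
import math

MACS_PER_CYCLE_SC3900 = 32        # from the slide
MACS_PER_CYCLE_SC3850 = 32 // 4   # derived from the slide's "4x versus SC3850"
CLOCK_HZ = 1.2e9                  # assumption: clock quoted for the BDTI benchmark


def peak_macs_per_second(macs_per_cycle, clock_hz):
    """Theoretical peak MAC rate if every MAC unit is busy every cycle."""
    return macs_per_cycle * clock_hz


def min_cycles_real_fir(num_samples, num_taps, macs_per_cycle):
    """Lower bound on cycles for a real FIR: one MAC per tap per output sample."""
    total_macs = num_samples * num_taps
    return math.ceil(total_macs / macs_per_cycle)


if __name__ == "__main__":
    print(peak_macs_per_second(MACS_PER_CYCLE_SC3900, CLOCK_HZ) / 1e9, "GMAC/s peak")
    print(min_cycles_real_fir(1024, 64, MACS_PER_CYCLE_SC3900), "cycles at 32 MACs/cycle")
    print(min_cycles_real_fir(1024, 64, MACS_PER_CYCLE_SC3850), "cycles at 8 MACs/cycle")
```

The 4x gap between the two cycle bounds is the ceiling set by MAC count alone; the slide's 3.5x-4x range for real code sits at or just below that ceiling.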
    11. L1 Processing: Data Manipulation Acceleration
       • "Data manipulation" stands for many different functions in baseband Layer 1, for example:
         – Data preparation before/after intensive kernels (e.g. data re-ordering, matrix transpose, pack/unpack)
         – Less regular kernels, or serial/cyclic kernels with low parallelism (e.g. QR decomposition, interleaver, encoder)
       • The SC3900 architecture addresses data manipulation by different means:
         – Datapath flexibility: this is the essence of the "Flexible Vector Processor"
           · Register file flexibility: each unit can read/write any register
           · Execution unit flexibility: each unit can run different and independent instructions
         – Rich and flexible instruction set
           · Efficient instruction set with broad support for different data types and sizes
           · New, powerful instructions specific to data manipulation
       • Performance:
         – SC3900 is 2x-3x better than SC3850 in data manipulation
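The data manipulation kernels named above (re-ordering, matrix transpose, interleaver) do almost no arithmetic and mostly move data. A minimal sketch of a generic block interleaver built on a transpose illustrates the access pattern. It is not the interleaver of any particular standard and not SC3900 code; the function names are hypothetical.

```python
def transpose(matrix):
    """Return the transpose of a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def block_interleave(data, rows, cols):
    """Write data row by row into a rows x cols grid, then read it column by column."""
    if len(data) != rows * cols:
        raise ValueError("data length must equal rows * cols")
    grid = [data[r * cols:(r + 1) * cols] for r in range(rows)]
    return [x for col in transpose(grid) for x in col]


def block_deinterleave(data, rows, cols):
    """Invert block_interleave(data, rows, cols) by interleaving with swapped dimensions."""
    return block_interleave(data, cols, rows)


if __name__ == "__main__":
    original = list(range(6))
    mixed = block_interleave(original, rows=2, cols=3)
    print(mixed)                                       # [0, 3, 1, 4, 2, 5]
    print(block_deinterleave(mixed, rows=2, cols=3))   # [0, 1, 2, 3, 4, 5]
```

Every output element is a pure copy from a strided input position, which is why this class of kernel gains from flexible register access and data-movement instructions rather than from additional MAC units.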
    12. Data Manipulation Acceleration: Flexible Datapath
       • Unlike a traditional vector processor, the SC3900 datapath is flexible:
         – Flexible execution units:
           · 4 independent units, each capable of 8-way SIMD
           · Each unit can run different and independent instructions
         – Flexible register files:
           · Registers are not defined as long vectors of hundreds of bits, but as scalars that can be accessed (read and write) by any execution unit
       [Figure: traditional vector processor model, in which execution unit #n can only read/write registers #n, versus the SC3900 flexible model, in which every execution unit (MAC, ADD, SHIFT, CMP) can read/write every register (A0-A3, B0-B3, C0-C3)]
    13. L1 Processing: Control Code Efficiency
       • One of the SC3900 goals is to improve control code efficiency
         – L1 control functions are tightly integrated with the arithmetic-intensive software
         – Useful for running scheduling functions that are control intensive
       • Control code performance is affected by two main aspects:
         – Core and compiler efficiency in typical control code constructs
         – Memory system efficiency
       • Both have been addressed on the SC3900, e.g.:
         – Ability to flatten decision trees using multiple predicates
         – Full support for non-aligned memory access without penalty
         – Larger, clustered 2 MB L2 cache to keep the program close to the core
       • Performance:
         – SC3900 is up to 1.5x better than SC3850 in control processing
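"Flattening decision trees using multiple predicates" can be illustrated at the source level. In the sketch below, a nested if/else tree is rewritten so that all predicates are computed up front and each leaf value is guarded by a combination of them. Python has no predicated execution, so this shows only the shape of the transformation; the threshold, return codes and function names are hypothetical.

```python
SNR_THRESHOLD_DB = 10  # hypothetical threshold, for illustration only


def classify_branchy(snr_db, is_retransmission):
    """Two-level decision tree written with nested branches."""
    if snr_db >= SNR_THRESHOLD_DB:
        if is_retransmission:
            return 2
        return 3
    if is_retransmission:
        return 0
    return 1


def classify_predicated(snr_db, is_retransmission):
    """Same tree, flattened: compute both predicates, then guard each leaf."""
    p_high = snr_db >= SNR_THRESHOLD_DB
    p_retx = bool(is_retransmission)
    # Exactly one guard is True, so the sum selects a single leaf value.
    return (
        (p_high and p_retx) * 2
        + (p_high and not p_retx) * 3
        + (not p_high and p_retx) * 0
        + (not p_high and not p_retx) * 1
    )


if __name__ == "__main__":
    for snr in (5, 15):
        for retx in (False, True):
            print(snr, retx, classify_branchy(snr, retx), classify_predicated(snr, retx))
```

On a core with multiple predicate registers, the guarded leaves can be issued as independent predicated instructions instead of a chain of dependent branches, which is the benefit the slide attributes to the SC3900.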
    14. Summary & Conclusion
       • Three 20 MHz sectors of an LTE base station in a single SoC, supporting multiple standards and multimode operation for macro base stations
       • Complete baseband solution that integrates L1, L2, Control and Transport baseband processing, from the backhaul network to the antenna interface
       • StarCore SC3900 is a key technology providing the processing efficiency and flexibility for PHY layer processing (computation-intensive DSP code, data manipulation and less intensive DSP code, and control code) on the B4860 SoC
