Sora- A High Performance Baseband DSP Processor

This project covers Sora, a software-defined radio platform published by the Microsoft Research Asia team. It is based on a reconfigurable baseband processor architecture and tries to increase processing performance by spreading the work across a number of processor cores.

Published in: Education, Technology

  • Software-defined radio (SDR) is a radio communication system where components that have typically been implemented in hardware (e.g. mixers, filters, amplifiers, modulators/demodulators, detectors, etc.) are instead implemented by means of software on a personal computer or embedded system.[1] While the concept of SDR is not new, the rapidly evolving capabilities of digital electronics render practical many processes which used to be only theoretically possible. A basic SDR system may consist of a personal computer equipped with a sound card, or other analog-to-digital converter, preceded by some form of RF front end. Significant amounts of signal processing are handed over to the general-purpose processor, rather than being done in special-purpose hardware. Such a design produces a radio which can receive and transmit widely different radio protocols (sometimes referred to as waveforms) based solely on the software used.
  • In contrast, SDR platforms based on general-purpose processor (GPP) architectures, such as commodity PCs, have the opposite set of tradeoffs.
  • For example, the popular GNU Radio platform [1] achieves only a few Kbps throughput on an 8MHz channel [21], whereas modern high-speed wireless protocols like 802.11 support multiple Mbps data rates on a much wider 20MHz channel [7]. These constraints prevent developers from using such platforms to achieve the full fidelity of state-of-the-art wireless protocols while using standard operating systems and applications in a real environment.
  • An implementation of high-speed wireless protocols on general-purpose PC architectures must overcome a number of challenges that stem from existing hardware interfaces and software architectures. First, transferring high-fidelity digital waveform samples into PC memory for processing requires very high bus throughput. Existing GPP platforms like GNU Radio use USB 2.0 or Gigabit Ethernet [1], which cannot satisfy this requirement for high-speed wireless protocols. Second, physical layer (PHY) signal processing has very high computational requirements for generating information bits from waveforms, and vice versa, particularly at high modulation rates; indeed, back-of-the-envelope calculations for processing requirements on GPPs … Finally, wireless MACs have low-latency requirements.
  • The role of the PHY layer is to convert information bits into a radio waveform, or vice versa. SIMD: single instruction, multiple data.
  • Core dedication: Sora provides a new kernel service, core dedication, which allocates processor cores exclusively for real-time SDR tasks.
  • For example, the soft demapper algorithm used in demodulation needs to calculate the confidence level of each bit contained in an incoming symbol. This task involves rather complex computation proportional to the modulation density. More precisely, it conducts an extensive search for all modulation points in a constellation graph and calculates a ratio between the minimum of Euclidean distances to all points representing one and the minimum of distances to all points representing zero. In this case, we can pre-calculate the confidence levels for all possible incoming symbols based on their I and Q values, and build LUTs to directly map the input symbol to confidence level. Such LUTs are not large. For example, in 802.11a/g with a 54Mbps modulation rate (64-QAM), the size of the LUT for the soft demapper is only 1.5KB.
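The lookup-table idea in this note can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sora's implementation: it assumes the per-axis Gray mapping commonly given for 802.11a 64-QAM, an 8-bit quantized input with a hypothetical scale of 16 units per constellation step, and it uses the difference of minimum squared distances (the max-log form) rather than the ratio described above. The table size here is not meant to reproduce the 1.5KB figure.

```python
# Assumed per-axis Gray mapping for 64-QAM: amplitude level -> (b0, b1, b2).
LEVELS = {
    -7: (0, 0, 0), -5: (0, 0, 1), -3: (0, 1, 1), -1: (0, 1, 0),
     1: (1, 1, 0),  3: (1, 1, 1),  5: (1, 0, 1),  7: (1, 0, 0),
}
SCALE = 16  # hypothetical: quantized units per constellation step


def soft_bits_direct(x):
    """Search every level on one axis; positive output means 'bit is 1'."""
    out = []
    for k in range(3):
        d0 = min((x - SCALE * lvl) ** 2 for lvl, b in LEVELS.items() if b[k] == 0)
        d1 = min((x - SCALE * lvl) ** 2 for lvl, b in LEVELS.items() if b[k] == 1)
        out.append(d0 - d1)
    return tuple(out)


# Precompute the answer for every possible 8-bit input value once.
LUT = [soft_bits_direct(x) for x in range(-128, 128)]


def soft_bits_lut(x):
    """One table lookup replaces the search over all levels."""
    return LUT[x + 128]


def demap_symbol(i, q):
    """Six soft bits for one 64-QAM symbol: three from I, three from Q."""
    return soft_bits_lut(i) + soft_bits_lut(q)
```

Because I and Q are demapped independently in this sketch, one 256-entry table serves both axes; the per-symbol cost drops from a search over constellation points to two lookups.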
  • Sora uses exclusive threads (or ethreads) to dedicate cores for real-time SDR tasks. Sora implements ethreads without any modification to the kernel code. An ethread is implemented as a kernel-mode thread, and it exploits the processor affinity that is commonly supported in commodity OSes to control on which core it runs.
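As a rough user-space analogue of the affinity step only, the sketch below pins the calling thread to a single core using Python's `os.sched_setaffinity`, which is available on Linux. It is not the kernel-mode ethread mechanism described in this note: the function name `pin_current_thread` is hypothetical, and affinity alone does not stop the OS from scheduling other threads on the same core.

```python
import os


def pin_current_thread(core=None):
    """Pin the caller to one core; return the core, or None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without this call (e.g. Windows, macOS)
    allowed = sorted(os.sched_getaffinity(0))
    if core is None:
        core = allowed[-1]  # arbitrary choice: the highest-numbered allowed core
    os.sched_setaffinity(0, {core})
    return core
```

A real-time SDR task would call this once at start-up and then run its processing loop without blocking, so that the chosen core stays busy with radio work.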
  • However, the maximal frame size of 802.11 is fixed at 2304 bytes. With simple modifications (changes in a few lines), SoftWiFi can transmit and receive jumbo frames with up to 32KB.
  • Transcript

    • 1. Sora: "High Performance Software Radio using General Purpose Multi-Core Processors", Microsoft Research Asia. By Harshit Srivastava, CDS12M001
    • 2. Software Radio • General RF front end plus software: Bluetooth, WiFi, WiMAX, GSM, CDMA, 3G, LTE … • Benefits: promise of universal connectivity and cost saving; programmability => faster development cycle, faster to market; open platform for wireless research [Figure: separate Bluetooth, GPS, 3G, WiFi, CDMA, GSM, and WiMAX radios versus one general RF front end with software]
    • 3. Introduction • This paper presents Sora, a fully programmable software radio platform on commodity PC architectures. • Sora combines the performance and fidelity of hardware SDR platforms with the programmability and flexibility of general-purpose processor (GPP) SDR platforms. • Sora uses both hardware and software techniques to address the challenges of using PC architectures for high speed SDR. • Sora is the first SDR platform that enables users to develop high-speed wireless implementations, such as the IEEE 802.11a/b/g PHY and MAC, entirely in software on a standard PC architecture.
    • 4. Introduction (continued) • Software defined radio (SDR) holds the promise of fully programmable wireless communication systems, effectively supplanting current technologies which have the lowest communication layers implemented primarily in fixed, custom hardware circuits. • Many current SDR platforms are based on programmable hardware, such as field programmable gate arrays (FPGAs) or embedded digital signal processors (DSPs). • Such hardware platforms can meet the processing and timing requirements of modern high-speed wireless protocols, but programming FPGAs and specialized DSPs is a difficult task.
    • 5. Fundamental Challenges • Large volume of high-fidelity digital signals – Require a high-speed system I/O: 1.2Gbps for 802.11 (20MHz channel, 16b A/D, 4x); up to ~5Gbps for 11n (4x4 MIMO); over 10Gbps for future high-speed wireless [Figure: antenna and RF front end with A/D and D/A on the hardware side, exchanging digital samples with the processor on the software side]
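The I/O figure on this slide can be checked with one multiplication. The sketch below assumes the slide's parenthetical means a 20MHz sample rate, 16 bits per sample, and 4x oversampling; that reading and the function name are assumptions, not something the slide spells out.

```python
def sample_stream_bps(channel_hz, bits_per_sample, oversampling):
    """Raw bit rate of the digitized waveform crossing the system bus."""
    return channel_hz * bits_per_sample * oversampling


rate = sample_stream_bps(20_000_000, 16, 4)
print(rate / 1e9, "Gbps")  # 1.28 Gbps
```

Under that reading the result is 1.28Gbps, which the slide rounds to 1.2Gbps.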
    • 6. Fundamental Challenges • Large volume of high-fidelity digital signals – Require a high-speed system I/O • Computation-intensive signal processing [Figure: 802.11a PHY pipeline. Transmitter, from MAC to RF: Scramble, Convolutional encoder, Interleaving, QAM Mod, IFFT, GI Addition, Symbol Wave Shaping; rates along the path: bits @24Mbps, @48Mbps, @48Mbps, then samples @384Mbps, @512Mbps, @640Mbps, @1.28Gbps. Receiver, from RF to MAC: Decimation, Remove GI, FFT, Demod + Interleaving, Viterbi decoding, Descramble; rates along the path: samples @1.28Gbps, @640Mbps, @512Mbps, @384Mbps, then bits @48Mbps, @24Mbps, @24Mbps]
    • 7. Fundamental Challenges • Large volume of high-fidelity digital signals – Require a high-speed system I/O • Computation-intensive signal processing – Raw computation power required: 802.11b => 10Gops, 802.11a => 40Gops! (a server-class CPU now runs at a 3GHz clock) [Figure: same transmitter and receiver pipeline as the previous slide]
    • 8. Fundamental Challenges • Large volume of high-fidelity digital signals – Require a high-speed system I/O • Computation-intensive signal processing • Hard deadline and accurate timing control – 802.11 MAC requires response within a few µs – Event trigger timing accuracy at µs level
    • 9. Resolving the SDR platform dilemma • Programmable hardware (FPGA) and embedded DSP approaches. Examples: Rice WARP, TI SFF-SDR • Low-performance GPP-based SDR. Example: GNU Radio/USRP (v1 & 2) – Interface USB/GbE: <1Gbps, >1ms – Achievable wireless throughput: ~100Kbps • Sora – Commodity PC with C programs – High performance – System input: 10Gbps; ~µs latency – Target wireless throughput: 10M~1Gbps [Figure: the approaches plotted on axes of performance (low to high) against programmability (low to high)]
    • 10. Sora Approach • New PCIe-based interface card => high system throughput • New optimizations to implement PHY algorithms and streamline processing on multi-core CPU => efficient PHY processing • Core dedication => real-time support
    • 11. Sora Architecture • General radio front-end: 700M/1.8G/2.4G/5GHz [Figure: Sora hardware (RF front ends with A/D and D/A, RCB with memory) connected over the PCIe bus to the Sora soft-radio stack and applications on a multi-core CPU; digital samples at multiple Gbps]
    • 12. Radio Control Board • PCIe-based high-speed interface card – PCIe is commodity in most modern PCs – High throughput: 16Gbps at PCIe-8x – Low latency: ~1 µs – Separated from other I/O devices [Figure: same architecture diagram as the previous slide]
    • 13. RCB Details • PCIe-8x interface: up to 16Gbps throughput • Versatile RF interface: up to 8 channels (8x8 MIMO)
    • 14. RCB Details • Buffered data path: bridging the synchronous ops at RF and asynchronous processing at CPU (12.3Gbps measured) • Low latency control path for software (0.36 µs measured) • Versatile RF interface: up to 8 channels (8x8 MIMO) [Figure: RCB block diagram. FPGA containing DMA controller, PCIe controller, FIFOs, RF controller, SDRAM controller, and registers; DDR SDRAM on the RCB; RF front end with A/D, D/A, RF circuit, and antenna; PCIe bus to the host]
    • 15. Sora Software • High-performance SDR processing with key software techniques – Efficient PHY implementation using SIMD and LUTs – Speed up PHY using multi-core streamline processing – Core dedication for real-time support [Figure: same architecture diagram as slide 11]
    • 16. Efficient PHY Implementation • Exploit large high-speed cache memory – Extensive use of lookup tables (LUT): trade memory for calculation; still fits well into L2 cache – Applicable to more than half of the common algorithms; speedup ranges from 1.5x to 22x • Example: convolutional encoder – Direct implementation: 8 ops per bit – LUT implementation: 2 lookup ops for 8 bits (size 32KB) [Figure: convolutional encoder shift register of Tb delay elements with adders producing Output Data A and Output Data B]
    • 17. Efficient PHY Implementation • Exploit data parallelism in PHY – Utilize wide-vector SIMD extension in CPU – Applicable to many PHY algorithms with significant speedups (1.6x ~ 50x) • Example: (I)FFT
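The convolutional encoder example on this slide can be sketched as follows. The code is a minimal illustration, not Sora's implementation: it assumes the rate-1/2, constraint-length-7 code with generators 133 and 171 (octal) used by 802.11a/g, LSB-first bit order within each byte, and a table indexed by 6 state bits plus 8 input bits. With 16 output bits per entry, 2^14 entries would occupy 32KB in a packed C array, which matches the size quoted on the slide; the next-state value stored alongside is a convenience of this sketch.

```python
G0, G1 = 0o133, 0o171  # assumed generator polynomials (802.11a/g)


def parity(x):
    return bin(x).count("1") & 1


def encode_direct(bits, state=0):
    """Bit-by-bit encoder: several shifts and XORs for every input bit."""
    out = []
    for b in bits:
        reg = (b << 6) | state          # current bit plus 6 previous bits
        out.append(parity(reg & G0))    # Output Data A
        out.append(parity(reg & G1))    # Output Data B
        state = reg >> 1
    return out, state


def build_lut():
    """Table indexed by (6-bit state, input byte) -> (16 output bits, next state)."""
    lut = [None] * (64 * 256)
    for state in range(64):
        for byte in range(256):
            bits = [(byte >> i) & 1 for i in range(8)]  # LSB first
            out, nxt = encode_direct(bits, state)
            word = sum(o << i for i, o in enumerate(out))
            lut[(state << 8) | byte] = (word, nxt)
    return lut


def encode_lut(data, lut):
    """Byte-at-a-time encoder: one table access per 8 input bits."""
    state, out = 0, []
    for byte in data:
        word, state = lut[(state << 8) | byte]
        out.extend((word >> i) & 1 for i in range(16))
    return out


LUT = build_lut()
```

The table is built once from the direct encoder, so both paths produce identical output by construction; the LUT version simply moves the per-bit work to start-up time.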
    • 18. Speed up PHY using multi-core streamline processing • Efficiently partition and schedule the PHY processing across cores – Interconnecting sub-pipelines with light-weight, synchronized FIFOs – Static scheduling of processing modules in PHY pipeline [Figure: receiver pipeline (Decimation, Remove GI, FFT, Demod + Interleaving, Viterbi decoding, Descramble) split between Core 1 and Core 2, joined by a synchronized FIFO]
    • 19. Core Dedication for Real-time Support • Exclusively allocate enough cores for SDR processing in multi-core systems – Guarantee the CPU, cache and memory bandwidth resources for predictable performance – Achieve µs-level timing control – Simple abstraction, and easier to implement in standard OSes than an RT scheduler • Implemented in WinXP without modifications to the kernel
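The partitioning described on this slide can be sketched structurally with two threads joined by a bounded FIFO. This only illustrates the shape of the pipeline: Python threads do not run on dedicated cores, `queue.Queue` is a lock-based queue rather than the light-weight FIFO Sora uses, and the names `run_pipeline`, `front_stage`, and `back_stage` are hypothetical.

```python
import queue
import threading


def run_pipeline(blocks, front_stage, back_stage, depth=4):
    """Run two statically assigned sub-pipelines connected by a bounded FIFO."""
    fifo = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the stream
    results = []

    def front():  # e.g. decimation, GI removal, FFT
        for block in blocks:
            fifo.put(front_stage(block))  # blocks when the FIFO is full
        fifo.put(done)

    def back():  # e.g. demodulation, Viterbi decoding, descrambling
        while True:
            item = fifo.get()
            if item is done:
                break
            results.append(back_stage(item))

    threads = [threading.Thread(target=front), threading.Thread(target=back)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline(range(5), lambda x: x * 2, lambda x: x + 1)` pushes five blocks through both stages in order. The bounded depth provides back-pressure, so a slow back stage throttles the front stage instead of letting samples pile up.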
    • 20. Implementation • Sora software platform on Win XP – 14K lines of C code, including PCIe driver framework, memory management, FIFO management, etc. • SoftWiFi: full implementation of IEEE 802.11a/b/g PHY and DCF MAC – 9K lines of C code; 4 man-months for development and test – DSSS 1, 2, 5.5, 11Mbps for 11b; OFDM 6, 9, 12, 18, 24, 36, 48, 54Mbps for 11a/g
    • 21. Results: PHY Processing [Figure: two bar charts of required computation (giga cycles per second) for 802.11b at 1M, 2M, 5.5M, 11M and 802.11a/g at 6M, 24M, 54M, one chart labeled "After Sora Optimization"; annotations: ~10x speedup, >30x speedup; values printed on the chart: 11.6, 11.7, 11.7, 11.8, 18.3, 60.4, 132.4]
    • 22. Results: PHY Processing • Sora enables software implementation of today's high-speed wireless systems on a standard PC with a few cores [Figure: required computation (giga cycles per second) per 802.11b and 802.11a/g rate, including the "After Sora Optimization" chart with ~10x and >30x speedup annotations]
    • 23. Results: End-to-end Throughput • Communicating with commercial 802.11a/b/g card [Figure: throughput (Mbps) by modulation mode (1M, 2M, 5.5M, 11M, 6M, 24M, 54M) for Sora-Commercial, Commercial-Commercial, and Commercial-Sora links]
    • 24. Results: End-to-end Throughput • Communicating with commercial 802.11a/b/g card • Seamlessly interoperates with commercial WiFi – Correctness of all PHY algorithms – Satisfies timing requirements of standards – Commercial-equivalent performance [Figure: throughput (Mbps) by modulation mode for Sora-Commercial, Commercial-Commercial, and Commercial-Sora links]
    • 25. Extensions • Jumbo frames in 802.11 • TDMA MAC
    • 26. Applications • A fully programmable software radio platform that provides the benefits of both SDR approaches, thereby resolving the SDR platform dilemma for developers. • With Sora, developers can implement and experiment with high-speed wireless protocol stacks, e.g., IEEE 802.11a/b/g, using commodity general-purpose PCs. • Developers program in familiar programming environments with powerful tools on standard operating systems. • Software radios implemented on Sora appear like any other network device, and users can run unmodified applications on their software radios with the same performance as commodity hardware wireless devices.
    • 27. Conclusion • Sora is a fully programmable software radio platform on commodity PC architecture – Easy C programming on multi-core CPU – High performance: high processing speed, low latency, and performance guarantee • Confirmed by SoftWiFi, the first fully interoperable IEEE 802.11 (PHY and MAC) on general purpose processors • Plan to release Sora SDK to research community – H/W: RCB + 2.4G RF front-end set (~$2K USD)
    • 28. References • Kun Tan†, Jiansong Zhang†, Ji Fang‡, He Liu, Yusheng Ye, Shen Wang, Yongguang Zhang†, Haitao Wu†, Wei Wang†, Geoffrey M. Voelker • † Microsoft Research Asia; ‡ Tsinghua University, Beijing, China; Beijing Jiaotong University, Beijing, China; UCSD, La Jolla, USA
