


XMOS vs. FPGA Document Transcript

A Programmable Revolution: A Compelling Alternative to Low Cost FPGAs

Introduction

FPGAs and CPLDs are used in many industries covering a broad range of performance requirements, price points and power envelopes.

In the early days, FPGAs were used for prototyping ASICs and for high-end, low volume applications that could bear a high unit cost, such as the communications and defence sectors. Since then, FPGA vendors have driven down costs and power, through rapid process migration, to produce new lower cost and lower power device families to address new requirements.

Evolution

In many cases low end FPGA families are now considered for power and cost sensitive consumer and industrial applications. These sectors benefit sufficiently from the flexibility and time to market advantages offered by FPGAs to warrant the price premium of programmability.

From 2009 onwards, new entrants to the programmable silicon market are starting to win the hearts and minds of designers looking for the best possible mix of solution flexibility, price and performance.

Some new players, such as SiliconBlue, Achronix and Tabula, have come up with new FPGA architectures. In parallel, other vendors such as Actel and Cypress have integrated FPGA fabrics with programmable analogue blocks and microcontrollers. All of these efforts represent an evolution of the same FPGA concept.

Revolution: Now for the first time there is an all-digital flexible solution that will prove to be a better, cheaper, easier and lower power solution than an FPGA for many applications: XMOS.

XMOS XS1-L

The XMOS XS1-L family of devices is based on the XMOS XCore® processor, a 500 MIPS event-driven RISC processor with 100% deterministic operation, a 32x32 multiplier, programmable I/O and a host of other resources, all programmable entirely in C++, C and XC. XC includes extensions to C for concurrency, communications, and timed I/O operations.

XMOS devices can be used as a direct substitute for low cost SRAM based FPGAs. In other cases, they provide suitable replacements for some of the higher performance flash-based FPGAs.

Figure 1 shows the XS1-L family with respect to the price, capability (capacity and performance) and power consumption of various FPGA device families. For a large number of digital processing applications, the XS1-L outperforms Altera Cyclone III, Xilinx Spartan 3A FPGAs and equivalents on both price and power consumption.

Figure 1: XS1-L compared to popular FPGA families

A single core XS1-L device offers a capacity for general digital logic implementation roughly comparable to an FPGA having 7-20K logic elements (roughly 70K-200K ASIC gates).

2010-05-25 © 2010 XMOS Ltd www.xmos.com
There will always be a place for micropower programmable devices and for high-end, DSP and bandwidth intensive FPGAs such as Virtex or Stratix parts. For applications residing in the space in between, however, XMOS can improve development speed and lower costs and power consumption without compromising solution flexibility and programmability.

In addition, the XS1-L provides robust IP protection otherwise found only in flash-based FPGAs, whilst retaining performance much closer to that of an SRAM FPGA.

The rest of this paper describes how XMOS technology delivers a revolution in both the programmable silicon itself and the associated hardware design processes.

The XCore Processor

Instead of writing code in HDL to describe registers, gates and wires, designers who use XMOS technology write code in C, C++ or XC to implement deterministic processing functions, as shown in Figure 2.

Figure 2: Designing with the XCore

Parallelism

An XCore processor runs multiple real-time hardware threads simultaneously. Each thread has access to a dedicated set of general purpose registers, gets a guaranteed share of the processing power, and executes a program using common RISC-style instructions. Each thread can execute simple computational code, DSP code, control software (taking logic decisions, or executing a state machine) or handle I/O operations using intelligent I/O resources.

The eight hardware threads, generous MIPS, 100% deterministic architecture and intelligent I/O provide designers with the flexibility of HDL, while dramatically easing the design entry and verification tasks.

Threads, Memory and Channels

Threads can use channels to provide buffered, event-based communication between threads, allowing data exchange and synchronization using single cycle instructions. Alternatively, threads can share 64KB of on-chip SRAM memory to exchange data, using single cycle lock instructions to co-ordinate access.

This makes the implementation of lightweight protocol stacks that fit within the 64KB of memory (such as the uIP "micro IP" TCP/IP stack) essentially free when compared to an equivalent implementation in an FPGA. The FPGA version requires a soft core such as Xilinx's MicroBlaze and an external memory interface that would consume a large portion of the FPGA capacity, not to mention adding an external memory chip to the BOM cost.

Task | XMOS approach | FPGA approach
Design Capture | High level, parallel C/XC code | HDL entry: always @(posedge clock)
Resources | Instructions, threads, channels, timers | Gates, LUTs, routing
DSP | Threads, 32x32 MAC | HDL entry, Embedded Block Multipliers

Table 1: FPGA Design Concepts and XMOS Equivalents

Time

Each XCore has ten configurable timers, which can be directly instantiated in XC and used to control program execution or I/O operations with a nominal resolution of 10ns.

I/O and Interfacing

Each XCore provides up to 64 GPIO that can be set and sampled in a single instruction via intelligent, autonomous I/O resources called Ports. Simple input and output instructions transfer data to or from I/O ports, as shown in Figure 3. More complex use of ports allows data to be serialized and de-serialized, enabling the processor to keep up with high-speed data streams. The ports can timestamp data, synchronize transfers with an external or internal clock, and schedule data to be input or output at specific times.

XMOS, the XMOS logo and XCore are trademarks of XMOS Ltd. All other trademarks are the property of their respective owners.
    #include <xs1.h>

    out buffered port:1 outP = XS1_PORT_1B;   // 1-bit output port
    in buffered port:4 inP = XS1_PORT_4A;     // 4-bit input port
    clock ref = XS1_CLKBLK_REF;               // reference clock block

    int main(void) {
        int value;
        // Clock both ports from the reference clock, with no ready signals
        configure_out_port_no_ready(outP, ref, 0);
        configure_in_port_no_ready(inP, ref);
        while (1) {
            inP :> value;        // sample the input port
            if (value > 9)       // drive the output high above 9, low otherwise
                outP <: 1;
            else
                outP <: 0;
        }
    }

Figure 3: XMOS Ports Use Example

Clock Blocks are used to select the internal XCore system clock, the timer reference clock, or an external clock connected via a 1-bit port to clock a given port. Clock blocks sample incoming external clocks and then provide a variety of conditioning options (for example, delaying the clock relative to the data associated with it).

Task | XMOS approach | FPGA approach
I/O Interfacing | Ports, timers | HDL entry
Clocking | Clock blocks | Clock Management Units

Table 2: XMOS and FPGA I/O Concepts

Event-Driven Processing

The XCore processor is event-driven. Threads waiting for events do not consume any processing resources. An event can be the completion of a communication or I/O operation, the release of a lock, or a timer reaching a programmed time. Threads can wait for any one of a set of events; the first event causes the thread to start in a single instruction.

The XS1-L XCore provides an Active Energy Conservation mode in which it automatically and instantly slows the XCore clock down to a user-specified speed whenever all threads are paused. The clock returns to its normal speed as soon as any thread has new work to do.

Selecting your programmable solution

Table 3 lists a range of application function examples and compares the utilization of XCore resources and FPGA logic elements required to implement the function.

Function | XS1-L Threads | XS1-L MIPS | XS1-L Memory | XS1-L GPIO | FPGA Logic Cells | ASIC NAND2 Gates
USB2 + 2EP | 5 | 400 | 30794 | 12 | 4400 | 44000
Ethernet MAC+MII | 5 | 250 | 9982 | 14 | 3600 | 36000
TCP/IP (uip) | 1 | 50 | 40000 | 0 | 6100 [1] | 61000
S/PDIF | 2 | 100 | 5036 | 2 | 800 | 8000
I2C Master | 0.5 | 50 | 3044 | 2 | 700 | 7000
SDRAM Interface (D8, A14) | 1 | 100 | 2974 | 30 | 1100 | 11000

Table 3: Application Function Examples

[1] Assumes a NIOS II and external memory interface is required for TCP/IP running in a Cyclone III device.

IP Protection

Each XCore has 8KB of secure one time programmable (OTP) memory, a secure execution mode, the ability to load AES encrypted firmware, and the option to disable JTAG and external channel access to a secured XCore. This all adds up to a level of IP protection that cannot be matched by an SRAM FPGA.

Applications requiring robust IP protection are often forced to use a slower but more secure flash-based FPGA, which can lead to timing closure issues. XMOS XS1-L devices offer a way to meet security and performance requirements with minimal effort.
DSP

XS1-L devices offer easily accessible DSP functionality via their 500 MHz 32x32 multiplier, offering a sustained rate (including load/store operations) of 59 MMACS per XCore (119 MMACS peak), which is sufficient for many audio, signal control and lower end DSP tasks that need low cost and power per MMAC.

The low cost FPGA families such as Altera Cyclone III, on the other hand, offer tens or hundreds of embedded block multipliers, which can be ganged together to create multipliers of arbitrary width. When many of these are employed in parallel, an aggregate DSP processing capability can be built up far in excess of what the XS1-L can achieve.

Consequently the FPGA provides a significant advantage for high throughput image and video processing or telecommunications infrastructure processing. For many emerging applications (such as consumer and prosumer digital audio), however, moderate DSP needs are just one item on the list of requirements alongside flexible control, low cost and integration. For these types of applications XMOS is likely to offer the ideal solution, all programmable in a high-level language.

Solution Scaling

An application that does not fit in a single XCore may be easily spread across multiple cores by selecting the two-core XS1-L2 device. Alternatively, multiple XMOS devices can be connected together by asynchronous off-chip links that join multiple XS1 processors into a single network in which threads communicate via channels.

High I/O Capability

For applications that require many hundreds of I/Os, a low cost FPGA is likely to be a preferable choice. Likewise, for very high speed native I/O capabilities such as LVDS, gigabit SERDES transceivers, SSTL2 or other exotic I/O technology, choose an FPGA. However, a large majority of applications are well served by single ended 3.3V I/O, making large amounts of high speed I/O an expensive and unneeded feature.

Soft Processors

For FPGA designs that need to employ a soft processor to implement a protocol stack, the issue becomes the amount of code memory required. For many simple protocol stacks, such as TCP/IP for simple web-servers and various I/O related standard and proprietary protocols, the 64KB of internal SRAM on the XCore is sufficient.

In these cases the XS1-L is the cost-effective choice. To achieve the above in an FPGA would require either:

  • A gate hungry soft processor core and external memory interface plus external memory chip, all of which adds a sizeable penalty in device capacity, power consumption, I/O, BOM cost and board space.
  • A soft processor core with additional logic cells used to implement a small code memory on the FPGA.

Many soft processor implementations may also find it impossible to achieve the clock speed required to meet processing requirements, leaving the designer to look for a product that integrates hardened 32-bit RISC cores with a suitable programmable fabric.

For applications that have code footprints well in excess of 64KB, an FPGA with external memory may be the only option.

Figure 4: Costs associated with Soft Core Usage in FPGAs
Design Flow

Figure 5 compares the standard FPGA design flow to the XMOS design flow. Overall, the XMOS design flow offers dramatically shorter iteration times and more straightforward design entry than the traditional FPGA flow.

Figure 5: XMOS and FPGA Design Flows Compared

Design Entry

Design entry is in C++, C or XC, using either the XDE graphical development environment or your favorite text editor. The XDE offers syntax highlighting and indenting, and can compile code, launch simulations and start debugging sessions.

Design in a High Level Language

EDA vendors have expended significant efforts to bring the advantages of high level languages to FPGA design, and still have a long way to go to deliver practical hardware design flows using C and high level synthesis. Designers using XMOS technology, on the other hand, immediately reap the productivity benefits of coding in a high level language, yet avoid the pitfalls of high level synthesis.

Ultra Fast Compilation

Even large XMOS programs compile and link in seconds, compared to the minutes or even hours required to complete a typical iteration of FPGA synthesis and place and route.

Application Timing Closure

The XS1-L implements parallelism using its instruction set and native resources, all of which reliably run at 500 MHz. Designers using XMOS have no need to check register-to-register timing paths across multiple design corners.

One of the most powerful attractions of the XMOS approach for FPGA designers is the ability to statically time paths through application code using the XMOS Timing Analyzer, which times critical application paths rather than register-to-register paths. The Timing Analyzer achieves 100% coverage of enumerated constraints, unlike test-bench based simulation. For example, the Timing Analyzer can calculate the time in XCore cycles from a thread sampling a specific pattern on an input port to outputting a response on an output port. The result can be graphically displayed, highlighting the critical path through the code and automatically signing off against user specified timing constraints expressed as pragmas in the code or entered using the XTA GUI.

For FPGA designers to access similar functionality they must deploy property checkers and formal proof methods, which rapidly reach their limits on even moderately sized designs, and require specialist design knowledge to apply.

The Timing Analyzer offers a whole-application level timing capability that does not rely on time consuming dynamic simulation, which will be appreciated by software and hardware engineers alike.

Simulation

Designers have the option to run XCore simulations of their code, visualizing the results with the XMOS VCD waveform viewer and debugging and single stepping with the debugger, all built into the XDE graphical environment.

The signals displayed in the VCD viewer are a range of actual signals that exist within the XS1-L silicon, including program counters, port resource signals, timers, channels and thread status.

These simulations run an order of magnitude faster than a corresponding dynamic simulation in an event-driven HDL simulator. XSIM also provides a range of simple testbench plug-ins and an API for users to create more of their own.

Bitstream Generation

After the design is ready, firmware for downloading to configuration flash memories is easily generated with XFLASH, which includes provision for multiple boot images and Dynamic Field Upgrade (DFU). XBURN can be used to burn parts of the code image and selected user encryption keys to the 8KB of OTP on chip, or just to set security options such as disabling JTAG debug access.

In System Debug

XMOS offers a typical processor debugging environment using XGDB (built on top of gdb, the GNU Debugger) and the XS1-L JTAG interface.

Debug iterations with XMOS tools only require a recompile and regeneration of firmware. FPGA designers must pre-select the nodes they wish to view and iterate through synthesis, place and route and timing analysis for each debug iteration.

PCB Design Considerations

XMOS offers its processors in QFP, QFN and BGA packages, suitable for 2 layer and 4 layer PCB implementations. In addition, the XS1-L parts require only two voltage supplies: a 3.3V or 2.5V supply for the I/O, and a 1V core supply.

The various port/pin configurations that can be realized with the XS1-L also offer some late pin assignment flexibility, although not to the same fine degree offered by FPGAs.

Toolchain Simplicity and Platform Support

Full FPGA design tool chains from the FPGA vendors and/or third party EDA suppliers run to multiple gigabytes of data.

The XMOS tools typically only require about 200 megabytes and work out of the box on Windows, Linux and Mac platforms, allowing you to develop your applications on desktop PCs or notebooks.

Summary

XMOS offers a lower cost and more secure platform, with dramatically shorter time-to-market, than traditional SRAM and flash based FPGAs for programmable digital logic designs in the 70K to 400K gate range.