NETWORK PROCESSORS OF THE PAST, PRESENT AND FUTURE

Connect. Collaborate. Innovate.
GlobalLogic – Leaders in Software R&D Services
A White Paper by GlobalLogic
TABLE OF CONTENTS

ABSTRACT
INTRODUCTION
DEFINITION OF A NETWORK PROCESSOR
  Network Processor
  Management Processor Unit
  Packet Co-Processor Unit
KEY FEATURES
  Data Forwarding Functions
  Wire Speed Performance
  Scalability
  Simple Programmability
  Flexibility
  Modular Software
  Management Functions
  Third Party Support
  High Functional Integration
APPLICATIONS
  Routing
  Security
  QoS Related Applications
  ATM Switching
FUNCTIONAL COMPONENTS
  Data Parsing
  Classification
  Lookup
  Computation
  Data Manipulation
  Traffic Management
  Control Processing
  Media Access Control
ARCHITECTURE
  Next Generation Network-Specific Processors
  Fast Path – Slow Path
  Control Plane
  Data Plane
SUMMARY

Copyright © 2010 GlobalLogic, Inc.
ABSTRACT

This white paper examines the role of the network processor in the past, present and future. It discusses the functional components and applications of the network processor, as well as its evolution and architecture.
INTRODUCTION

In the ever-evolving network equipment segment, vendors have to walk the tightrope between the demands of gigabit performance and intelligent processing challenges such as QoS and Layer 7 applications. In the face of fierce competition and shorter time-to-market, network solution providers are shifting gears in the fast lane. Switching from general-purpose CPUs to ASICs to SoCs to FPGAs to ASIPs in less than two decades is the stuff of which Formula 1 racing dreams are made!

Balancing performance, flexibility, cost efficiency, time-to-market and intelligent processing is a tough challenge, but network processors have lived up to these expectations. The network processor segment is slowly but steadily making inroads as the future of network equipment solutions. Let us take a quick look at the journey of network processors of the past, present and future.

Figure 1. Comparison of Technologies
DEFINITION OF A NETWORK PROCESSOR

Network Processor

A network processor is a programmable system-on-chip that broadly consists of two distinct units: one for performing intelligent packet processing and the other for handling fast path tasks such as packet switching. These two hardware functional units, combined with optimized yet customizable software, together define the network processor.

Management Processor Unit

The intelligent management processing unit (MPU) of a network processor is a central processor that provides essential management and system-level functionality, along with programmable intelligent packet processing (such as Layer 7 applications) that can be customized rapidly. The MPU can perform offloaded, non-critical packet processing. It is also responsible for initializing the co-processor units and for communication with the network management systems and host processors.

Packet Co-Processor Unit

An array of co-processors powered by optimized networking software accelerates the well-understood packet processing functions in the fast path. These include compute-intensive packet processing such as packet switching at Layer 2/3, QoS management, etc. These co-processors handle the bulk of ingress traffic, which needs to follow common packet processing: classification, forwarding, computation, modification, etc.

Figure 2. Network Processor Block Diagram
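The division of labor between the packet co-processors and the MPU can be illustrated with a toy dispatch loop: packets that hit an installed flow entry are handled entirely in the fast path, while the first packet of an unknown flow is punted to the management processor for full classification, which then installs a flow entry. This is only a conceptual sketch in Python; all names (`flow_table`, `classify`, the dict-based packet) are hypothetical, and a real network processor runs the fast path in hardware micro-engines, not software.

```python
def fast_path(packet, flow_table):
    """Co-processor role: forward packets whose flow is already known."""
    flow = flow_table.get(packet["flow_id"])
    if flow is None:
        return None                 # unknown flow: punt to the MPU
    return flow["egress_port"]      # known flow: forward at wire speed

def classify(packet):
    """Placeholder for deep classification / routing-table lookup."""
    return hash(packet["flow_id"]) % 4

def slow_path(packet, flow_table):
    """MPU role: classify the first packet of a flow, install an entry."""
    egress = classify(packet)       # expensive, flexible processing
    flow_table[packet["flow_id"]] = {"egress_port": egress}
    return egress

def process(packet, flow_table):
    egress = fast_path(packet, flow_table)
    if egress is None:
        egress = slow_path(packet, flow_table)
    return egress
```

After the first packet of a flow is processed, every subsequent packet of that flow stays in the fast path, which is the property that lets the co-processors carry the bulk of the traffic.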
In addition to the two units mentioned above, it is common to use off-chip specialized co-processors to provide additional service functions ranging from encryption to policy management to segmentation/re-assembly. Network processor units (NPUs) also need several high-bandwidth interfaces, as their main function is to perform data switching. Some of these interfaces (such as MAC devices) may be integrated in the network processor to further increase the entire value proposition.

The challenge for new networking equipment is to create the best of both worlds by performing sophisticated packet processing at wire speed. Network processor technology uses communication software components combined with advanced packet processing technology, "standard" programming interfaces, and a robust development environment. This enables network equipment vendors to quickly bring to market a wide array of different products based on the same hardware and software architecture. The result is significantly faster time-to-market for new products and dramatically longer time-in-market, by using software upgrades to deliver new, advanced services that extend the product life cycle.

Network processors interface with the host CPU through standard bus interfaces such as PCI. They also interface with external memory units such as SDRAM/SRAM to implement lookup tables and PDU buffer pools. Another important interface of a network processor is a standards-based switch fabric interface such as CSIX. Network processors interface with physical devices such as MACs/framers for PDU ingress/egress operations.
KEY FEATURES

Some of the key features that define and distinguish a network processor are:

Data Forwarding Functions

Network processors must provide fundamental data forwarding functions. Data processing from ingress to egress can be roughly classified into flow classification and flow processing.

Flow classification refers to the process through which a data PDU is examined in order to decide further processing. In the first stage of classification, the ingress PDU is reassembled if required. Then traffic policing is performed using user-defined algorithms, with appropriate actions to thwart policy violations. Third and most importantly, deep packet classification is required to make forwarding decisions. The fast path implementation of the network processor must support any depth of frame classification at wire speed, irrespective of the packet length. Furthermore, deep classification must be fully programmable in order to process any possible protocol PDU. As an add-on function, intelligent statistics collection is required by the network processor to track the classification results of path flows.

Flow processing is the stage during which forwarding decisions are applied based on the result of flow classification. Forwarding decisions should be made based on standard Layer 2/3 switching algorithms or proprietary/customizable protocols. This function should be optimized to eliminate bottlenecks, irrespective of the traffic bandwidth or the number of nodes in the network deployment. The network processor must provide a programmable buffer management interface to perform packet forward/drop actions on the data streams; several industry-standard buffer management algorithms are available for this purpose. A user-programmable stream modification facility on data flows should be provided, especially for header manipulation during forwarding.
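The traffic policing step in the classification stage above is commonly built on a token-bucket algorithm: tokens accumulate at the contracted rate up to a burst limit, and a packet conforms only if the bucket holds enough tokens for its length. The sketch below is a minimal, hypothetical illustration of that rule, not the interface of any particular network processor; real policers run in hardware with clock-tick granularity.

```python
class TokenBucket:
    """Minimal token-bucket policer: admit a packet if enough tokens
    have accumulated, otherwise flag it as out of profile."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # bucket depth (max accumulated credit)
        self.tokens = burst_bytes    # start with a full bucket
        self.last = 0.0              # timestamp of the last refill

    def conforms(self, length_bytes, now):
        # Refill tokens for the elapsed interval, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if length_bytes <= self.tokens:
            self.tokens -= length_bytes
            return True              # in profile: forward
        return False                 # out of profile: drop or remark
```

For example, a bucket provisioned at 8,000 bit/s with a 1,500-byte burst admits one full-size frame immediately, rejects a second frame sent at the same instant, and admits it again once 1.5 seconds of credit have accumulated.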
The network processor must also provide traffic shaping by using various scheduling algorithms during the output scheduling stage. Finally, a statistics gathering function needs to be implemented for tracking flow processing.

Wire Speed Performance

Network processors must live up to wire speed performance expectations. Typically all network processing follows the parallel processing model in the fast path, which is a distinguishing feature of these devices. Network processors must evolve to track and support complex networking applications at high performance. Thousands of simultaneous connections need to be managed by core devices that use technologies such as MPLS. Also, quickly emerging VoIP services lay down strict quality of service requirements in the fast path. Network processors must be able to support a variety of protocols such as ATM/AAL5, IP, Ethernet, etc. In many cases, network processors must also provide support for legacy protocols such as IPX and SNA.
The bottom line is that network processors must be able to support large-bandwidth connections, multiple protocols and advanced features without becoming a performance bottleneck. That is, network processors must provide wire speed, non-blocking performance regardless of the bandwidth requirements, the type of protocol, or the features that are enabled. It is a tall order to fill, but the solution lies in following a highly optimized fast path processing model for common networking tasks such as Layer 2 and Layer 3 switching, packet classification, etc.

A fully optimized processing architecture with a high MIPS (millions of instructions per second) to Gbps (gigabits per second) ratio is required to support wire speed operation at high bandwidths while still having processing headroom for advanced applications.

Scalability

Compounding this wire speed performance crisis is the clear requirement for such performance to scale in two dimensions. Currently it must scale sufficiently to build a complete platform with multiple access speeds. In the future, network processor architecture must give developers confidence that their network processor of choice will keep pace with the constant increase in interface speeds. One of the primary requirements for a network processor is to be able to rapidly scale its performance capabilities to support ever-increasing bandwidth requirements: Gigabit Ethernet / OC-12 (today), OC-48 (near future), and OC-192 / 10 Gigabit Ethernet (distant future).

With the increasing deployment of high-speed bandwidth technologies such as Gigabit Ethernet and DWDM, the demand on networking equipment performance is increasing at an exponential rate.
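The pressure behind the MIPS-to-Gbps ratio mentioned above can be made concrete with a back-of-the-envelope calculation: the instruction budget per packet is the core's instruction rate divided by the packet arrival rate, and it shrinks as line rates rise or packets get smaller. The figures below are purely illustrative, assuming minimum-size 64-byte packets and ignoring framing overhead.

```python
def instruction_budget(line_rate_gbps, packet_bytes, core_mips):
    """Instructions available per packet at a given wire speed."""
    packets_per_sec = line_rate_gbps * 1e9 / (packet_bytes * 8)
    return core_mips * 1e6 / packets_per_sec

# A 1000-MIPS engine on a 2.5 Gbps (OC-48) link with 64-byte packets
# has roughly 205 instructions per packet; at 10 Gbps (OC-192) the
# same engine has only about 51.
print(round(instruction_budget(2.5, 64, 1000)))
print(round(instruction_budget(10, 64, 1000)))
```

This is why each jump in interface speed (OC-12 to OC-48 to OC-192) demands either a proportional jump in aggregate MIPS or a more parallel fast path.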
However, the speed of ICs is still bound by Moore's Law! It is clear that network processor vendors must provide a breakthrough in scalability. More specifically, the throughput of a family of network processors must scale upwards significantly over time. This scaling should largely preserve the original programming interface and the original fast path software.

Simple Programmability

With network application requirements rapidly changing, simple programmability is the key to facilitating the customization and integration of emerging technologies. This need for simple adaptability is perhaps the strongest argument in favor of a network processor, especially in contrast to its high-speed but rigid cousins such as ASICs. Time-to-market is critical for the success of any product, meaning that a user-friendly development environment, as well as extensive development and debugging tool support, is essential. In this fiercely competitive marketplace, simple programmability and development environment support make network processors accessible to even the smallest equipment manufacturers.
In order to meet the demand for simple programmability, network processor manufacturers are striving to supply programming and testing tools that are easy to use. These programming tools are based on a simple programming language that allows for the reuse of code wherever possible. In addition, programming tools provide extensive testing capabilities and intelligent debugging features such as descriptive codes and definitions. They also strive to provide code-level statistics for optimization. Testing tools must be able to simulate real-world conditions and provide accurate measurements of throughput and other performance metrics.

Another important consideration is the selection of a suitable high-level programming language for network processor programming. By far, the most common software languages in real-time communication systems are C and C++. Programming in C and C++ enhances the future portability of the code base, which enables its use in future generations of network processors and with industry-standard programming interfaces. This is not possible with specialized languages or state-machine codes.

Flexibility

For real platform leverage, a network processor must be universally applicable across a wide range of interfaces, protocols and product types. This requires programmability at all levels of the protocol stack, from Layer 2 through Layer 7. Protocol support must include packets, cells and data streams (separately or in combination) across various interfaces to meet the requirements of carrier edge devices, which are the cornerstone of the emerging multi-service carrier network.

The number of instructions required to implement a fast path application must be kept to a minimum. To accomplish this goal, each instruction must be powerful and targeted at performing data path functions.
A user must be able to implement a new application, such as a new protocol or a scheduling algorithm, in a matter of days instead of months. The programming interface must be concise. Furthermore, a user must be able to implement virtually any network application, existing or futuristic, without needing to replace the network processor hardware. System operators should be able to add new routes, connections and forwarding treatments at runtime without affecting the fast path flow.

True network processors integrate all the functions implemented between the physical interfaces and the switching fabric, thus enabling an open approach at the PHY and fabric levels. This permits best-of-breed, multi-vendor solutions that allow vendors to offer true product differentiation and scalability. In addition, software implementation of these functions allows simpler upgrade paths in this constantly changing networking world.
Modular Software

Network processor-based solutions must also address the fundamental programming and processing models. As multiple processing cores are grouped together to achieve very high throughput, the simple application of traditional programming tools leads to extremely complex multiprocessor software development. Regardless of the language or tools employed, asking a programmer to understand and distribute time-sensitive, interdependent task execution across multiple processing cores adds tremendous complexity to the process and works directly against time-to-market goals. Ideally, even with multiple processor cores in operation, a programmer should be able to address the NPU as a single, simple, logical processor from a software perspective, thus minimizing programming complexity while maximizing real-time performance.

In order to facilitate scalability across multiple product lines and migration across multiple generations, the software architecture must be highly modular and must simultaneously facilitate both portability and the rapid rollout of additional features. A communication processor cannot deliver software flexibility and portability if the programming interfaces are dependent on the processor. The processor's architecture must support generic communications programming interfaces to simplify the programming task and allow future software reuse across processor generations. By delivering software stability across product generations, network processors radically improve software development cycles and reliability, which is the largest factor in total system availability. Furthermore, open APIs allow for the maximum degree of integration with industry-standard application software.

Management Functions

A management interface allows an external source, such as a management system or a human user, to monitor and modify the operation of the communication device on the fly.
It is common for network processor vendors to provide standards-based management software that runs in the main processor core to manage the network device, thus eliminating the need for a separate host processor. Typical management activities include:

• Configuration – setting device parameters that affect the device's runtime behavior
• Performance Monitoring – collecting information concerning the performance of the device, which may prompt reconfiguration
• Usage Monitoring – collecting information concerning the usage of the device
• Fault Monitoring – collecting information concerning the faults, errors and warnings related to device operation
• Diagnostics and Testing – prompting the device to perform self-diagnostics and tests, and reporting the information to the management source

Network processors must be able to gather performance and traffic flow statistics that can (1) be collected by a billing or accounting system using common protocols such as RMON and SNMP, and (2) support services such as SLA management and enforcement. Network processors also play a key role in enforcing the various classes of service that providers may offer. To enforce these policies, traffic must be identified and classified at the ingress. Traffic filtering also typically occurs at this boundary, using access control lists or some other policy enforcement mechanism. By performing management tasks such as identification, classification and accounting within the network processor, hardware vendors can take advantage of the specialized nature of these processors to provide a large performance boost to their products.

Third Party Support

To realize the full potential of a software-driven environment, the network processor needs to be the foundation of a complete communications platform that takes advantage of industry-wide hardware extensions, software applications and tool suites. This is only possible with an architecture that has the flexibility to support third-party protocol stacks, to support any PHY or fabric interface, and to link with industry-standard tools.

Developers typically must integrate network processor functions with a high-speed interconnect (i.e., a switch fabric or switching engine) and add the queuing and scheduling services of a traffic management engine to facilitate rich Quality of Service (QoS). This integration can be one of the most difficult and time-consuming tasks for developers.
Clearly, minimizing the time spent developing additional "glue" logic, whether in hardware or software, is another essential part of delivering on time-to-market. Network equipment providers typically look to a vendor to supply a total platform of compatible, integrated network processors, traffic management engines and switch fabrics.

High Functional Integration

Network processors need to provide a high level of system integration that dramatically reduces part count and system complexity while simultaneously improving performance, as compared to a design that incorporates multiple components. In addition, a highly integrated network processor avoids the interconnection bottlenecks that are common with component-oriented designs.
Integrated co-processor engines (such as for classification or queuing) can be fully utilized by internal processing units without interconnection penalties. Integration of lower-layer functions (such as SONET framers) within the chip also enables higher port densities and lower costs than have typically been possible in the past. Therefore the high functional integration of bus interconnects, co-processing engines, special-purpose hardware, memory units, and standard bus and switch interfaces is an important characteristic of the network processor.
APPLICATIONS

As the number of applications for network processors has grown, the market has begun to segment into three main network equipment areas: core, edge and access. Each of these areas has different target applications and performance requirements. Core devices sit in the middle of the network. As a result, they are the most performance-critical and the least demanding of flexibility. Examples of these devices are gigabit and terabit routers.

Edge devices sit between the core network and access devices. Examples of edge devices include URL load balancers and firewalls. They are focused on medium-to-high data rates and higher-layer processing, so a certain amount of flexibility is required.

Access equipment provides various devices with access to the network. Most of their computation relates to aggregating numerous traffic streams and forwarding them through the network. Examples of access devices include cable modem termination systems and base stations for wireless networks. Each level of the network requires a different mix of processing performance, features and costs. To meet these needs effectively, network processors must be optimized not only for the specific requirements of the equipment, but also for the services delivered in each segment of the network infrastructure.

Figure 3. Network Equipment Areas

Routing

Routers are the workhorses of the Internet. A router accepts packets from one of several network interfaces and either drops them or sends them out through one or more of its other interfaces. Packets may traverse a dozen or more routers as they make their way across the Internet.

In order to forward an IPv4 packet, a router must examine the destination address of an incoming packet, determine the address of the next-hop router, and forward the packet to the next-hop address. The next-hop route is stored in a routing table, which is created and maintained by a routing protocol such as Border Gateway Protocol (BGP). There are many different algorithms for performing routing table lookups. However, they all use longest prefix matching, which allows entries to contain wildcards and finds the entry that most specifically matches the input address. For example, all packets going to subnet 128.32.xxx.xxx may have the same next-hop address. While this significantly reduces the size of the routing table, multiple lookups may be required depending upon the data structures and algorithms used.

The ideal packet forwarding solution must be able to support the link data rate and must be large enough to accommodate the routing table sizes of next-generation routing equipment (up to 512K routes at the edge). It must also be able to handle prolonged bursts of route updates with low update latency. In addition, network processors can handle a large number of addresses in the routing table and high data rates, which is not possible for entirely software-based solutions.

Although the traditional methods of IP forwarding (hashing and trees) are well understood, they are limited by table updates that are dependent on the prefix length and the size of the table. They cannot provide a level of performance comparable with hardware solutions and will not scale to higher data rates. Software-controlled, ternary CAM-based solutions are a step closer to the ideal, but they are saddled with the requirement that the table must be sorted. In addition, the update rate is dependent upon how many entries are in the table.
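Longest prefix matching as described above can be illustrated with a simple binary trie: each routing entry is stored at the trie depth equal to its prefix length, and a lookup walks the address bit by bit, remembering the deepest entry seen. Real routers use compressed tries or ternary CAMs instead; this Python sketch (all names hypothetical) only demonstrates the matching rule itself.

```python
class PrefixTrie:
    """Binary trie for longest-prefix matching on IPv4 addresses."""

    def __init__(self):
        self.root = {}

    def insert(self, prefix, length, next_hop):
        node = self.root
        for i in range(length):
            bit = (prefix >> (31 - i)) & 1
            node = node.setdefault(bit, {})
        node["next_hop"] = next_hop

    def lookup(self, addr):
        node, best = self.root, None
        for i in range(32):
            if "next_hop" in node:
                best = node["next_hop"]     # deepest match seen so far
            bit = (addr >> (31 - i)) & 1
            if bit not in node:
                break                       # no more specific entry
            node = node[bit]
        else:
            if "next_hop" in node:          # exact /32 match
                best = node["next_hop"]
        return best

def ip(dotted):
    """Convert dotted-quad notation to a 32-bit integer."""
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

table = PrefixTrie()
table.insert(ip("128.32.0.0"), 16, "hop-A")   # 128.32.0.0/16
table.insert(ip("128.32.5.0"), 24, "hop-B")   # more specific /24
print(table.lookup(ip("128.32.5.9")))   # hop-B: the /24 wins
print(table.lookup(ip("128.32.9.1")))   # hop-A: falls back to the /16
```

The sketch also makes the update-cost problem visible: inserting a route touches one node per prefix bit, and a software lookup touches up to 32 nodes per packet, which is exactly the work that hardware co-processors collapse into a single cycle.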
Similarly, hardware-based solutions such as ASICs remain highly inflexible in the face of fast-changing requirements such as intelligent feature additions and new protocol scaling in Internet routers.

Co-processor-based network processors for IPv4 packet forwarding offer a solution to the table update and searching problem by providing single-cycle performance for all prefix lengths and table sizes. These co-processors allow router vendors to guarantee the performance of their routers even during the heaviest bursts of traffic.

Security

IP Security provides an extensible security platform at Layer 3 for higher-layer protocols. This relieves higher-layer protocols from defining their own ad-hoc security measures. Most security mechanisms entail encryption/decryption of data to ensure confidentiality. This is a highly compute-intensive task, making security applications a common target for network processor-based platforms. Many network processors in the market today integrate a special-purpose crypto engine for hardware acceleration of encryption/decryption algorithms. IPSec is by far the most commonly used protocol to provision IP VPN solutions. It consists of two protocols:

• Authentication Header (AH) – proof of data origin, data integrity and anti-replay protection
• Encapsulating Security Payload (ESP) – AH plus data confidentiality, but limited traffic flow confidentiality

Either of these protocols can be implemented in transport mode (protects higher-layer protocols only) or tunnel mode (protects the IP layer and higher-layer protocols by encapsulating the original IP packet in another packet). To ensure that all participating network equipment is consistent, some connection-related state must be stored at each of the endpoints of a secure connection. Called a Security Association (SA), this state determines how to protect traffic, what traffic to protect, and with whom protection is performed. The SA is updated using various control protocols and is consulted for data plane operations.

AH does not provide data confidentiality, but it does verify the sender and data integrity. The Security Parameters Index (SPI) and the destination address identify the SA used to authenticate the packet. The sequence number field is a monotonically increasing counter used for anti-replay protection, which protects against replay attacks. The anti-replay service is implemented with a sliding window of acceptable sequence numbers.

For ingress packets, a device that supports AH must execute the following operations:

1. If the packet is fragmented, wait for all fragments and reassemble them.
2. Find the SA used to protect the packet (based on destination address and SPI).
3. Check the validity of the sequence number.
4. Check the Integrity Check Value (ICV):
   a. Save the authenticated data and clear the authentication field.
   b. Clear all mutable fields.
   c. Pad the packet if necessary.
   d. Execute the authenticator algorithm to compute the digest.
   e. Compare this digest to the authenticated data field.
5. Possibly advance the window of acceptable sequence numbers.

The following list enumerates the steps involved in supporting AH for egress packets:

1. Increment the sequence number in the SA.
2.
Populate the fields in the AH header.
3. Clear the mutable fields in the IP header.
4. Compute the Integrity Check Value (ICV) using the authentication algorithm and the key defined in the SA.
5. Copy the ICV to the authentication data field.

Encapsulating Security Payload (ESP) provides data confidentiality and authentication. ESP defines a header and trailer that surround the protected payload. The presence of the trailer means that the payload may have to be padded (with zeros) to ensure 32-bit
alignment. Some data encryption algorithms require a random initialization vector; if necessary, this is stored just before the protected data. The list below illustrates the major steps that are required to support ESP for ingress packets:
1. Wait for additional fragments, if applicable.
2. Check for the SA and drop the packet if one does not exist.
3. Check the sequence number and drop the packet if it is outside of the window or is a duplicate.
4. Authenticate the packet (same as Step 4 in AH ingress support).
5. Decrypt the payload using the key and cipher from the SA.
6. Check the validity of the packet against its mode (transport vs. tunnel).
7. Check the address, the port and/or the protocol, depending on the SA.

On the egress side, the following functions must be executed for each packet:
1. Insert the ESP header and fill in its fields. For transport mode, an ESP header just needs to be inserted. For tunnel mode, the original IP packet needs to be wrapped in another IP packet first, and then the ESP header needs to be added.
2. Encrypt the packet using the cipher from the SA.
3. Authenticate the packet using the appropriate algorithm from the SA and insert the digest into the authentication field in the trailer.
4. Recompute and populate the checksum field.

QoS Related Applications

QoS applications have recently come to the forefront due to emerging applications such as VoIP, Service Level Agreement implementation, usage-based accounting, etc. On one hand, well-defined QoS protocols such as DiffServ and IntServ enable a wide variety of services and provisioning policies, either end-to-end or within a particular set of networks. On the other hand, collecting network usage information pertaining to flows and sessions is essential for billing and network analysis applications.

QoS protocols such as DiffServ require data flow identification based on some well-defined classification criteria.
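In code, a classifier of this kind reduces to matching header fields against an ordered rule table. A minimal Python sketch follows; all names and the rule format are illustrative, not taken from any network processor SDK:

```python
# Toy 5-tuple flow classifier: first matching rule wins (illustrative only).
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Rule:
    src_ip: Optional[str]    # None acts as a wildcard for that field
    dst_ip: Optional[str]
    proto: Optional[int]
    src_port: Optional[int]
    dst_port: Optional[int]
    action: str              # e.g. queue or per-hop behaviour to apply

def classify(pkt: dict, rules: list[Rule]) -> str:
    """Return the action of the first rule matching the packet's 5-tuple."""
    for r in rules:
        if all(getattr(r, f) in (None, pkt[f])
               for f in ("src_ip", "dst_ip", "proto", "src_port", "dst_port")):
            return r.action
    return "best-effort"     # default treatment when no rule matches

# Example: steer UDP traffic to port 5060 on one server into a voice queue.
rules = [Rule(None, "10.0.0.5", 17, None, 5060, "voip-queue")]
pkt = {"src_ip": "192.0.2.1", "dst_ip": "10.0.0.5",
       "proto": 17, "src_port": 4000, "dst_port": 5060}
```

A real device would compile such rules into TCAM entries or hash lookups rather than scan them linearly, but the matching semantics are the same.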
Classification can be performed based on QoS-specific fields (e.g., the DSCP field or Flow ID field) or on multiple header fields (e.g., source/destination IP address and TCP port). Once a classifier rule is used to map an incoming packet to a QoS flow, all packets in the same flow receive the same treatment. A typical action taken in the data path based on flow-associated policies is the prioritized output scheduling of frames. The packet scheduler takes control of forwarding different packet streams using a set of queues. The main function of a packet scheduler is to reorder the output queue using an algorithm such as weighted fair queuing (WFQ) or round robin. Also, low-priority traffic may be dropped in scenarios where there is insufficient buffer space. Ingress rate limiting can be performed based on pre-defined rate limits per flow. Dropping packets in case of rate limit
violations is also known as traffic policing. Another QoS operation is admission control, which decides whether a new flow can be granted the requested QoS. This is implemented with a control-plane reservation setup protocol such as RSVP.

Usage-based billing applications require highly granular policy rules to associate bandwidth usage with selected applications, specific users and distinct content. For example, it is necessary to track the download and bandwidth usage of a client when accessing a server and using RTSP (Real Time Streaming Protocol) to play the latest rock video clip. The main network processor functions and corresponding packet processing tasks for catering to this application include:
• Recognizing session initiation for a specific server: Layer 3 IP addresses and Layer 4 port numbers
• Monitoring the login session to identify the user name: Layer 5-7 extraction of login information
• Recognizing the RTSP session and associating it with the user: Layer 4 port numbers and Layer 5 keyword detection
• Identifying the desired file name (e.g., video clip) to download: Layer 5-7 extraction of the file name and matching against user and program policy tables
• Recognizing the download session and associating it with the user: Layer 4 port numbers and Layer 5-7 keyword detection

ATM Switching

Asynchronous Transfer Mode (ATM) is a connection-oriented standard in which the end stations determine a virtual circuit (VC), or path, through an ATM network. The VCs are made up of different virtual paths (VPs), or paths between switches. Once control plane functions set up a VC, an ATM switch simply switches ATM cells from input ports to output ports. This switching is based on consulting a lookup table indexed by two fields in ATM cells: the virtual path identifier (an 8-bit VPI) and the virtual circuit identifier (a 16-bit VCI). A switch may then alter the VPI and VCI fields of the cell to reflect the new link on which the cell is traveling.
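The cell-switching step described above amounts to one exact-match table lookup followed by a header rewrite. A minimal sketch (the table contents and field names are invented for illustration; in practice the table is installed by control-plane signalling):

```python
# Sketch of ATM cell switching: a lookup keyed by (ingress port, VPI, VCI)
# yields the egress port plus the outgoing VPI/VCI to rewrite into the header.
switch_table = {
    # (in_port, vpi, vci) -> (out_port, new_vpi, new_vci)
    (1, 0, 100): (3, 2, 200),
}

def switch_cell(in_port: int, cell: dict) -> tuple[int, dict]:
    key = (in_port, cell["vpi"], cell["vci"])
    out_port, new_vpi, new_vci = switch_table[key]    # one exact-match lookup
    relabeled = dict(cell, vpi=new_vpi, vci=new_vci)  # rewrite the label fields
    return out_port, relabeled
```

Because the VPI/VCI key space is small and the mapping is one-to-one, a single memory access suffices, which is why ATM (and MPLS) lookups are so much cheaper than IP longest-prefix matching.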
The ATM Adaptation Layers (AALs) provide different ways for ATM to communicate with higher layer protocols. The most popular method is AAL5, which is often used for IP over ATM. Since IP packets are larger than ATM cells (48-byte payload), AAL5 provides a guideline by which to segment IP packets so they can travel over an ATM network, and by which to reassemble ATM cells back into IP packets. To accomplish this, AAL5 defines its own packet data unit (PDU).
• Payload – higher layer PDU, maximum 65,535 bytes
• Padding – so that AAL5 PDUs evenly fit into a whole number of ATM cells
• 8-byte trailer
  o User-to-user (UU) field: 1 byte
  o Common Part Indicator (CPI) field: 1 byte
  o Length field: 2 bytes
  o CRC field: 4 bytes

After calculating the amount of padding needed, as well as the length field and the CRC field, the AAL5 PDU is simply sliced into 48-byte chunks that are used as the payload for ATM cells.

The Next Generation Network (NGN) is converging towards using omnipresent IP DSLAMs as edge devices to aggregate multiple DSL connections on the CPE side and connect to an IP/Ethernet router/switch on the other side. This involves an ATM-to-Ethernet conversion function, which is implemented in the data path of the network processor. An intelligent IP DSLAM will act as a Layer 2/Layer 3 switch in addition to implementing the ATM-Ethernet interworking function. A host of differentiating features such as QoS support, security mechanisms and management functions are made feasible by using a network processor at the core of the IP DSLAM solution.
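The AAL5 framing described above (pad, append the 8-byte trailer, slice into 48-byte cell payloads) can be sketched as follows. This is a toy model: zlib.crc32 merely stands in for the AAL5 CRC-32, whose exact computation differs, and the UU/CPI fields are simply zeroed:

```python
import struct
import zlib

CELL_PAYLOAD = 48  # bytes of payload carried by each ATM cell

def aal5_frame(payload: bytes) -> list[bytes]:
    """Pad a PDU, append the 8-byte trailer, and slice into 48-byte chunks.
    Sketch only: zlib.crc32 is a placeholder for the real AAL5 CRC-32."""
    # Trailer layout: UU (1 byte), CPI (1 byte), length (2 bytes), CRC (4 bytes).
    pad_len = (-(len(payload) + 8)) % CELL_PAYLOAD     # pad so PDU fills whole cells
    body = payload + b"\x00" * pad_len
    partial = body + struct.pack("!BBH", 0, 0, len(payload))
    crc = zlib.crc32(partial) & 0xFFFFFFFF             # placeholder CRC value
    pdu = partial + struct.pack("!I", crc)
    return [pdu[i:i + CELL_PAYLOAD] for i in range(0, len(pdu), CELL_PAYLOAD)]
```

For a 100-byte payload, 100 bytes plus the 8-byte trailer round up to 144 bytes, i.e. exactly three cells.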
FUNCTIONAL COMPONENTS

To examine how the applications outlined in the previous section map to network processors, we can make some generalizations about the type of processing done on a PDU between the time it is received and when it is retransmitted. The applications decompose into their computational kernels, which broadly fall into seven different categories: data parsing, classification, lookup, computation, data manipulation, traffic management and control processing. Based on the application requirements, they can be mapped onto the various functional components.

Data Parsing

Data parsing includes parsing cell or packet headers that contain addresses, protocol information, etc. In the past, parsing functions were fixed based on the type of device being constructed. For example, LAN bridges by definition only needed to look at the Layer 2 Ethernet header. Today, switching devices need the flexibility to examine and gain access to a wide variety of information at all layers of the ISO model, both in real time and on a conditional packet-by-packet basis.

Classification

Classification refers to matching a packet or cell against a set of criteria defined at Layers 2, 3, 4 or higher of the ISO model. Once data is parsed, it must be classified in order to determine the required action. This examination consists of looking at the PDU content to see which patterns of interest it contains. This process is referred to as "classification," and it is used in routing, firewalling, QoS implementation and policy enforcement. Following a data classification action such as a filtering/forwarding decision, advanced QoS and accounting functions based on a specific end-to-end traffic flow may be taken. This is an area of rapidly changing requirements. Because of high packet volume and the need to classify and process packets at wire speed, hardware acceleration has become the industry standard method.
In one implementation, hardware acceleration is provided for Layer 2/3 classification, while flexible software provides for processing at higher layers. For example, a network processor may enforce a policy that prioritizes an enterprise's internal communications over external web traffic. The first step in this process is to distinguish between the two traffic types. The first trade-off between hardware and software involves packet classification. Although software can be used to compare and analyze a number of different fields, these functions are better suited to hardware acceleration since Layer 2 Ethernet and Layer 3 IP classification are well defined.

Lookup

The lookup kernel is the actual action of looking up data based on a key. It is mostly used in conjunction with pattern matching (classification) to find a specific entry in a table. The data structures and algorithms used depend on the type of lookup
required (one-to-one or many-to-one) and the size of the key. For ATM and MPLS, this field is quite small and the mapping is one-to-one, so often only one lookup is required. However, for IPv4 and IPv6 routing, the large address field and the longest prefix matching (LPM) requirement make it impossible to find the destination address in one memory access. Therefore, trees are used to efficiently store the address table, and multiple lookups are required. Table lookups are a critical operation in packet processing. Acceleration is becoming more and more critical as systems need to provide sophisticated QoS functions that require additional lookups farther up the OSI stack. The memory accesses required for table lookups (addresses, routes or flows) should be optimized in hardware with co-processor support that accelerates this function.

Computation

The types of computation required for packet processing vary widely. To support IPsec encryption, decryption and authentication, algorithms need to be applied over an entire packet. Most protocols require that a checksum or CRC value be computed. Often, this value simply needs to be updated, not recalculated, based on changes to header fields. Network equipment that implements protocols supporting the fragmentation (and reassembly) of PDUs requires computation to determine whether all fragments of a particular PDU have arrived.

Data Manipulation

We consider any function that modifies or translates a packet header or packet data within or between protocols to be data manipulation. For example, in IPv4 routing, the time-to-live (TTL) field must be decremented by one at each hop. Additional instances of data manipulation include adding tags, adding header fields and replacing fields. Other examples in this space include segmentation, reassembly and fragmentation. The variety of low-layer transport protocols is matched only by the diversity of protocol combinations and services.
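The TTL decrement and checksum update just mentioned make a concrete example of these kernels working together. Because the IPv4 header checksum is a 16-bit one's-complement sum, it can be patched incrementally when a single header word changes, following the method of RFC 1624, rather than being recomputed over the whole header:

```python
def csum16_update(old_csum: int, old_word: int, new_word: int) -> int:
    """Incrementally update a 16-bit one's-complement checksum when one
    16-bit header word changes (RFC 1624: HC' = ~(~HC + ~m + m'))."""
    s = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    s = (s & 0xFFFF) + (s >> 16)     # fold carries back in
    s = (s & 0xFFFF) + (s >> 16)     # (one's-complement addition)
    return ~s & 0xFFFF

def decrement_ttl(ttl_proto_word: int, csum: int) -> tuple[int, int]:
    """Decrement the TTL (high byte of the TTL/protocol word) and return the
    new word plus the incrementally updated header checksum."""
    new_word = ttl_proto_word - 0x0100   # TTL occupies the high-order byte
    return new_word, csum16_update(csum, ttl_proto_word, new_word)
```

For a header whose TTL/protocol word is 0x4001 (TTL 64, protocol UDP) and whose checksum is 0xF955, decrementing the TTL yields word 0x3F01 and checksum 0xFA55, exactly what a full recomputation over the modified header would give.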
Transformation requirements can range from address translation within a given protocol (such as IP) to full protocol encapsulation or conversion (such as between IP and ATM). A PDU may be modified: for example, an IP packet will have its TTL counter reduced, and in label-switched traffic an incoming label will be replaced with an outgoing label. Headers may be added or removed. Modification usually entails recalculation of a CRC or checksum.

Traffic Management

Traffic management includes the queuing, policing and scheduling of data traffic through the device according to defined QoS parameters that are based on the results of classification and established policies. This function is central to supporting the convergence of voice, video and data in next-generation networks. Queue management is the scheduling and storage of ingress and egress PDUs. This includes coordination with fabric interfaces and with elements of the network processor that need to access packets. The queue management kernel is responsible for enforcing dropping and traffic shaping policies, as well as storing packets for packet assembly, segmentation and
various QoS applications. The actual QoS process of selecting packets, discarding packets and rate limiting flows must be done as the packet is leaving the system after processing. A combination of specialized hardware functions and software configuration must be implemented at the output point to manage the QoS egress functions. Re-transmission of a PDU is generally not straightforward: some PDUs may be prioritized over others while others may be discarded, and multiple queues may exist with different priorities.

Control Processing

Control processing encompasses a number of different tasks that do not need to be performed at wire speed, including exceptions, table updates, details of TCP protocols and statistics gathering. While statistics are gathered on a per-packet basis, this function is often executed in the background using polling or interrupt-driven approaches. Gathering this data requires examining incoming data and incrementing counters.

Media Access Control

Implementation of low-layer protocols (such as Ethernet, SONET framing, ATM cell processing and so on) is covered under media access control. These protocols define how the data is represented on the communications channel, as well as the rules that govern how that channel is accessed. Paradoxically, this is the area of greatest standardization among network devices due to standards-based protocol definitions. It is also the area of greatest diversity due to the wide and ever-growing variety of protocols. These include:
• Ethernet, with three different versions at 10 Mbps, 100 Mbps and 1000 Mbps
• SONET, supporting both data packets and ATM cells at a wide range of standard rates (OC-3, OC-12, OC-48 and so on)
• Legacy T/E-carrier interfaces from the existing public voice infrastructure
• A variety of emerging optical interfaces that must all coexist and interact
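Returning to Traffic Management, the policing described there is commonly realized with a token bucket. A minimal single-rate sketch (parameter names are illustrative; real policers often mark rather than drop, following single-rate or two-rate three-color schemes):

```python
class TokenBucket:
    """Single-rate policer sketch: packets conforming to the configured rate
    and burst are forwarded; excess packets are policed (dropped or marked)."""
    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate = rate_bps / 8.0        # refill rate in bytes per second
        self.burst = burst_bytes          # bucket depth (maximum burst)
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = 0.0                   # timestamp of the last update

    def conforms(self, now: float, pkt_len: int) -> bool:
        # Refill tokens for the elapsed interval, capped at the bucket depth.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if pkt_len <= self.tokens:
            self.tokens -= pkt_len        # conforming: consume tokens, forward
            return True
        return False                      # violating: police this packet
```

With an 8 kbps rate and a 1000-byte burst, a quiet flow can send a 1000-byte burst at once, but must then wait a full second for the bucket to refill.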
ARCHITECTURE

The field of network processors is notable for its great architectural heterogeneity. In general, however, it can safely be said that network processors universally provide programmable support for processing packets, and that this usually takes the form of one or more packet processors. These can be supported either on a single chip or across multiple chips. In addition, network processors universally support a number of MAC-level ports, some memory and some form(s) of processor interconnect.

Network processor designs can be broadly divided into three main architecture types:
1. General RISC-based architecture
2. Augmented RISC architecture (with hardware accelerators)
3. Next generation network-specific processors

The first two architectures are sufficient for building today's Fast Ethernet products. However, they will be unable to provide full 7-layer processing on more than a handful of ports at gigabit speed.

Figure 4. Network Processor Architectures

Next Generation Network-Specific Processors

A new wave of network processors, namely network-specific processors, is now being developed to provide the processing performance required for next-generation networking products. Network-specific processors integrate many small, fast processor cores that are each tailored to perform a specific networking task. By optimizing the individual processor cores for packet-processing tasks, network-specific processors overcome the limitations of RISC-based architectures. Network-specific processors can
deliver the packet processing performance to handle an appreciable number of ports at gigabit and terabit speeds.

With network-specific processors, achieving exceptionally fast packet processing at high bandwidths is accomplished by optimizing both the instruction set and the data path. Since each task-oriented core is designed with a specific networking function in mind, it uses a concise instruction set to accomplish the task. It may require as few as one tenth the number of commands used by a RISC-based processor to accomplish the same task. These cores have instruction sets and data paths optimized for the task at hand.

Some of the important paradigms directing the architecture of the next generation of network-specific processors are listed below:
• Exploit parallelism – multiple processing elements
• Hide latency – multithreading and pipelining increase throughput
• Mix of SRAM and DRAM helps improve utilization and latency
• Optimize for header inspection and lookups using bit field extraction, comparisons and alignment
• Bind common special functions using integrated functions and/or co-processors
• Optimize housekeeping functions and inter-unit communications
• High integration
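The header bit-field extraction these cores optimize can be illustrated with plain shifts and masks over an IPv4 header. What a network-specific core does in one or two instructions looks like this in software:

```python
# Sketch of header bit-field extraction: pulling IPv4 fields out of raw
# header bytes with shifts and masks (the field selection is illustrative).
def ipv4_fields(hdr: bytes) -> dict:
    word0 = int.from_bytes(hdr[:4], "big")   # version/IHL/ToS/total length
    return {
        "version": word0 >> 28,              # top 4 bits
        "ihl": (word0 >> 24) & 0xF,          # header length in 32-bit words
        "tos": (word0 >> 16) & 0xFF,
        "total_length": word0 & 0xFFFF,
        "ttl": hdr[8],
        "protocol": hdr[9],
    }
```

A core with dedicated bit-field extraction hardware performs each of these shift-and-mask operations in a single cycle, with no general-purpose instruction overhead.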
FAST PATH – SLOW PATH

The basic concept inherent to all network processor architectures is the distribution of packets between a fast path and a slow path. The fast path is usually the native way of transferring data in a given architecture; it is very efficient and optimized for performance. The fast path is used when no special operations on the data packets are needed. If a packet entering the system needs some extra servicing (e.g., routing protocol packets or filter configuration packets), the packet has to be sent to the slow path. As its name suggests, the slow path is slower than the fast path. In the slow path, however, packets can be manipulated and processed in more complex ways than in the fast path.

Fast path operations – such as classification, lookups or priority checking – must be done on every packet at wire speed. An additional level of processing involves system-level functions for which performance is not as critical, such as maintaining tables, communicating status information and keeping statistics. Networking systems need to create an efficient environment that diverts these slow path packets for processing and lets the slow path processor easily inject the processed packets back into the system. An open architecture should be employed so that an industry standard processor is used to execute existing code for slow path processing. This approach is much more efficient than porting slow path code to a new networking-specific processor, and it provides an effective mechanism for communicating between data plane and control plane processors.

For incoming data, the split between the data plane and the control plane has to be chosen carefully. If performance is critical, as it usually is in gigabit-class devices, the amount of slow path processing has to be kept low.
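The fast-path/slow-path decision itself is a cheap per-packet test. A schematic sketch (the packet representation and protocol names are illustrative only):

```python
# Sketch of the fast-path / slow-path dispatch decision.
ROUTING_PROTOS = {"ospf", "bgp", "rsvp"}   # control traffic punted to the GPP

def dispatch(pkt: dict, fib: dict) -> str:
    """Return 'fast' for packets the data plane can forward directly,
    'slow' for packets needing extra servicing in the control plane."""
    if pkt.get("proto") in ROUTING_PROTOS:
        return "slow"          # routing-protocol packet: table updates etc.
    if pkt.get("options"):
        return "slow"          # IP options require unusual, complex handling
    if pkt["dst"] not in fib:
        return "slow"          # no forwarding entry: exception processing
    return "fast"              # normal case: forward at wire speed
```

The point is that the test is simple enough to run at wire speed, while everything behind the "slow" branch can be arbitrarily complex without affecting the fast path's throughput.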
Keeping in mind both the performance requirement of the fast path and the intelligent processing requirement of the slow path, the network processor architecture is divided into a data plane and a control plane.

The control plane is typically implemented in software that executes on a general purpose processor. It includes protocols and network management software, which handle control packets, perform data plane table updates, and carry out interface management and statistics retrieval.

The data plane, on the other hand, is typically constructed with programmable and configurable hardware entities. It performs switching functions and transfers packets from one interface to another. It also performs classification, scheduling, filtering and other functions.

Dividing network processor software into two planes has many advantages. A significant advantage is that the operation and/or load of one plane does not affect the other plane, which eliminates many causes of failure. It also adds the advantage of being able
to leverage advances in data plane technologies without impacting the control plane.

Figure 5. Control Plane and Data Plane

Typically, data plane tasks require a small amount of code but a large amount of processing power, since they handle almost 99% of the ingress traffic. In contrast, control plane tasks require little processing power but a large amount of code, as they handle intelligent processing functions for 1% of the ingress traffic. The different requirements of data plane and control plane tasks are often addressed by what is called a fast path – slow path design. In this type of design, as packets enter the networking device, their destination address and port are examined; based on that examination, they are sent internally on either the "slow path" or the "fast path". Packets that need minimal or normal processing take the fast path, and packets that need unusual or complex processing take the slow path. Fast path packets correspond to data plane tasks, while slow path packets correspond to control plane tasks. Once they have been processed, packets from both the slow and fast paths may leave via the same network interface. The fast path is found in the data processor half (i.e., the data plane) of the network processor.

Dividing up the processing in this way provides substantial implementation flexibility. While the slow path processing will almost certainly be implemented with a CPU, fast path processing can be implemented with an FPGA, an ASIC, a co-processor or perhaps just another CPU. This architecture is particularly strong because it allows you to implement simple time-critical algorithms in hardware and complex algorithms in software. For example, in Intel's IXP1200, the fast path is serviced by the microengines, and the slow path is implemented by the operating system running on the StrongARM core of the IXP1200.
Using a router as an example, this phenomenon can be considered from two vantages: code size or processing requirements. It seems apparent that one could handle the data plane tasks of the router without a lot of code. On the other hand, even in a traditional network device like a router, control task implementations vary. All routers will have code to handle routing protocols like OSPF and BGP, and they will almost certainly have a serial port for configuration. However, they may be managed via a web browser, a Java application, SNMP, or all three. This can add up to a lot of code.

Now, let's consider the packets entering the router. Nearly all of them are addressed to somewhere else, and they need to be examined and forwarded to these destinations very quickly. For example, in order for a router to run at wire speed on a 155 Mbps OC-3 link, it needs to forward a 64-byte packet in about three microseconds. These packets may not need to have much done with them, but the tasks need to be done in a timely manner. This requires tight code and a lot of processing power. By contrast, the occasional OSPF packet that causes the routing tables to be updated, or an HTTP request to make a configuration change, might require a fair bit of code to be handled properly; however, this will have little impact on overall processing requirements.

Control Plane

The control plane of the network processor is responsible for critical tasks such as network management, policy applications, signaling and topology management. Much of the behavior of a network processor is subject to control and configuration. A classifier function must be told what patterns to detect, and a queue management function must have its queues specified. Routing tables need to be updated. Control and configuration parameters originate either in policy decisions or in network protocols, but they are usually conveyed to the network processor by the GPP.
The network processor is said to operate in the data plane, and the GPP is said to operate in the control plane. Information also flows from the data plane to the control plane: the network processor may deliver signaling PDUs to the control plane, gather statistics that are returned to the control plane, or notify the control plane of error conditions. Generally, control plane tasks are less time-critical.

Although signaling PDUs will travel into and out of the traffic manager through the network processor along with other PDUs, they are different in that they are usually handled by the control plane. PDUs that are handled by the control plane are said to travel the slow path, while the majority that enter and exit without being seen by the GPP are said to travel the fast path. Some non-signaling PDUs also travel the slow path. A network processor may delegate PDUs with unusually complex processing to the GPP to reduce the complexity and size of the network processor code. This tactic also prevents difficult PDUs from reducing the ability of the network processor to handle its normal workload.

It is important that the control plane is implemented on an integrated high-performance, low-power processing core. It should be designed to handle a broad range
of complex processing tasks, including application processing; communication with the backplane; managing and updating data structures shared with data plane processing engines, such as routing tables; and setting up and controlling media and switch fabric devices. In addition, the control plane core handles exception packets that require additional complex processing. A multi-stage, high-efficiency processing pipeline architecture that minimizes latency and enables high clock speeds with low power consumption is ideally suited to implementing the control plane.

It is advantageous if the control plane core is integrated in the network processor chipset rather than placed externally. This integrated approach gives OEMs significant flexibility in matching processing tasks to resources and minimizes integration costs. Apart from the generality/specificity of their packet processors, different network processors make different choices regarding the centralization/decentralization of control and management. For example, some network processors rely exclusively on external control in the form of a host workstation. Others (e.g., the IXP1200) incorporate a commodity CPU on the network processor that runs an operating system. Still others support a packet processor sufficiently powerful and general that any of them can potentially serve as a locus of control and management. The IXP1200's on-board StrongARM CPU runs a commodity OS such as Linux. In addition to handling slow path packet processing, the StrongARM is also responsible for loading code onto the microengines and stopping and starting them as required. The Motorola C-Port, on the other hand, has no built-in centralized controller. Instead, it relies on a host workstation to load and supervise the operation of its channel controller packet processors.
Nevertheless, it is theoretically possible to dedicate one of the channel controllers to take the supervisory role, especially if fine-grained dynamic reconfiguration of the network processor is a goal. Similarly, the EZChip relies on a host workstation for control and management. In this case, there is no alternative, because dedicating one of the packet processors, even if possible (cf. their lack of generality), would introduce an unacceptable bottleneck in the pipeline.

Data Plane

The data plane performs operations occurring in real time on the "packet path". The data plane implements core device operations such as receiving, processing and transmitting packets. The common data plane operations include:
• Media Access Control – implementing low-level protocols such as Ethernet, SONET framing, ATM cell processing, etc.
• Data Parsing – parsing cell or packet headers for address or protocol information
• Classification – matching a packet against criteria (filtering/forwarding decision, QoS, accounting, etc.)
• Data Transformation – transforming packet data between protocols
• Traffic Management – queuing, scheduling and policing packet data

As the network continues to evolve, the value of network processor technology will increasingly depend on intelligent packet processing at wire speed, rather than on raw performance alone. The ability of carriers to provision and bill for new services will require a combination of performance and flexible control over the processing resources of the data plane. For an OC-192/10 Gbps link, deep packet inspection must occur in intervals as short as 35 nanoseconds. A high performing data plane is expected to perform the necessary Layer 3-7 applications on these cells/packets and then transmit them in the correct sequence and at the required rate without loss.

Ideally, a multiprocessing architecture in the data plane subsystem ensures that aggregate processing capacity is available to enable rich packet/cell processing, even at 10 Gbps wire speeds in applications that traditionally required high-speed ASICs. The inherently parallel data plane allows a single-stream packet/cell analysis problem such as routing to be decomposed into multiple, sequential tasks, including packet receive, route table lookup and packet classification, which can be linked together. The performance and flexibility provided by a software-defined processing pipeline allow multiple tasks to be completed simultaneously while preserving data and time dependencies. As network requirements evolve, a powerful and flexible data plane design will enable OEMs to easily scale performance and add features to meet new requirements.

Multithreading is a popular technique employed in data plane design to enhance overall performance. Multiple memory register technology enables data and event signals to be shared among threads and processing engines at virtually zero latency while maintaining coherency.
Other innovations, known as ring buffers, establish FIFO "producer-consumer" relationships among processing engines. These provide a highly efficient mechanism for flexibly linking tasks among multiple software pipelines. Through the combination of flexible software pipelining and fast inter-process communication, data planes can be adapted for access, edge and core applications to perform complex processing at wire speed.

Most network processors feature multiple packet processors to implement the data plane, but the nature of these can vary from units with very general instruction sets to single-purpose dedicated units that are not programmable, for tasks such as checksum calculation or hashing. Furthermore, some network processors feature only one type of packet processor, while others support a number of different types. For example, the Intel IXP1200 network processor supports a uniform set of six so-called microengines that serve as packet processors. These are 233-600 MHz CPUs whose instruction set includes I/O to/from MAC ports, packet queuing support and checksum calculation.
They support hardware threads with zero context switch overhead and can be programmed either in assembler or in C.

The Motorola C-Port [8], on the other hand, employs so-called channel processors: generic packet processors grouped in sets of four that share an area of fast memory. In addition, it supports a range of dedicated, non-programmable processors that perform functions such as queue management, table lookup and buffer management.

As a third example, the EZChip network processor-1 has no fully generic processors. Rather, it exclusively employs dedicated packet processors that perform specific tasks such as parsing packets, table lookup or packet modification. Although these are dedicated to their given "domain", they are quite flexible and programmable within that domain.
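The producer-consumer ring buffers mentioned earlier, which link processing engines into software pipelines, can be sketched as a fixed-size single-producer/single-consumer FIFO:

```python
# Sketch of a single-producer / single-consumer ring buffer linking two
# packet-processing stages (structure illustrative, not any vendor's API).
class Ring:
    def __init__(self, size: int):
        assert size > 0 and (size & (size - 1)) == 0, "size must be a power of two"
        self.buf = [None] * size
        self.mask = size - 1
        self.head = 0   # next slot the producer writes (monotonic counter)
        self.tail = 0   # next slot the consumer reads (monotonic counter)

    def put(self, item) -> bool:
        if self.head - self.tail == len(self.buf):
            return False                      # ring full: producer backs off
        self.buf[self.head & self.mask] = item
        self.head += 1
        return True

    def get(self):
        if self.head == self.tail:
            return None                       # ring empty: nothing to consume
        item = self.buf[self.tail & self.mask]
        self.tail += 1
        return item
```

The power-of-two size lets the indices wrap with a simple mask, and because the producer touches only `head` while the consumer touches only `tail`, the two stages can run concurrently without locking.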
SUMMARY

Considerations

The issues one faces when choosing a network processor architecture are complex. The merits of a network processor architecture depend on algorithm behavior, which in turn depends on traffic characteristics. Furthermore, network processors will spend their working lives handling traffic that can only be guessed at now. Today's predictions about Internet traffic are likely to be blown away by a new application; the current volume of MP3 downloads could not have been predicted a few years ago.

Many network processor architectures are available today, and the only prediction we can make about them is that some will succeed while others will fail. Different designs represent different predictions about traffic and processing, not all of which will be correct. It may turn out that different architectures will dominate different application areas. Furthermore, survival will depend on non‐technical as well as technical issues. One of the most important technical issues is scalability: a good question to ask a network processor vendor is how a device architecture that now runs at, say, OC‐48 (2.5 Gbps) scales to OC‐192 (10 Gbps) or OC‐768 (40 Gbps).

When exploring the design possibilities of network processors in Next Generation Network service platforms, it is important to look for the following architectural features:

• High Performance Processing Capability
• Scalable Processing Architecture
• Flexibility and Programmability
• Ability to Leverage Co‐Processors and Memory
• Headroom for Emerging Services
• Variety of Open Interconnect Mechanisms
• Control Plane Processor Independence
• Robust Software Development Environment
• Right Mix of Processing Element and Functional Unit Parallelism
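Control plane processor independence implies a clean split between a slow path on the general‐purpose processor and a fast path on the packet engines. A minimal dispatch sketch in C follows; the table layout, hash function and handler names are hypothetical, not a specific vendor's API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Tiny exact-match forwarding table: the slow path (control
 * plane on the GPP) populates it; the fast path only reads it. */
#define FT_SLOTS 1024u

struct fte { uint32_t dst_ip; uint16_t out_port; bool valid; };
static struct fte ftable[FT_SLOTS];

/* Knuth multiplicative hash, reduced to a 10-bit table index. */
static uint32_t ft_hash(uint32_t ip) { return (ip * 2654435761u) >> 22; }

/* Slow path: invoked by the routing protocol on the GPP.
 * (Collisions simply overwrite here; a real table resolves them.) */
static void slowpath_install(uint32_t dst_ip, uint16_t port)
{
    struct fte *e = &ftable[ft_hash(dst_ip) % FT_SLOTS];
    e->dst_ip = dst_ip;
    e->out_port = port;
    e->valid = true;
}

/* Fast path: runs per packet on the packet engines.
 * Returns the output port, or -1 to punt the packet to the GPP. */
static int fastpath_lookup(uint32_t dst_ip)
{
    const struct fte *e = &ftable[ft_hash(dst_ip) % FT_SLOTS];
    if (e->valid && e->dst_ip == dst_ip)
        return e->out_port;
    return -1;   /* miss: punt to the slow path */
}
```

Real designs use much larger hash or trie structures with proper collision handling; the point here is only that lookup misses punt to the control plane while hits stay entirely on the data plane.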
Challenges

While much of the industry has focused on the hardware side of the system, what about the software side? The complexities of these architectures make them very difficult for programmers to reason about, let alone to support effectively with high‐level languages. While C/C++ compilers exist for many network processors, performance‐critical code will continue to be written in assembly. Is there a common programming model that can be used to target multiple network processors (much as C is for general‐purpose processors)? Will this software difficulty force future architectures to be much more programmable? To use network processors, GPP software must be designed to isolate the fast path from the slow path. Take into consideration that:

• The GPP must communicate with the network processor via a carefully defined API.
• Functionality running on the network processor must be minimized.
• Network processor code may be written in assembly and functional languages, as well as C++.
• Network processor code must exploit the hardware architecture of the network processor, particularly the multiprocessor architecture.
• Future generations of network processors are likely to use larger numbers of PEs and to have more powerful co‐processors.

Network equipment vendor developers recognize that choosing to go with network processors is only the first step. Only when they have delivered their complete software feature sets; integrated these sets into a total network processing platform capable of providing the necessary high‐capacity switching throughput and required system‐level Quality of Service; achieved wire‐speed performance for all of their functionality; and future‐proofed it all with headroom and scalability … then and only then have they achieved total time‐to‐market.

Conclusion

The time for network processors has truly arrived.
The computer industry is ripe for combining hardware and software components into a high‐performance, programmable solution. The demand for speed and features appears to have no end in sight. The strength of a network processor is that its programmable off‐the‐shelf parts are specifically created for networking applications. Over time, network processors will almost certainly displace general‐purpose CPUs and ASICs: network processors have a performance advantage over CPU architectures that were designed years ago for other tasks, and they have numerous advantages over ASICs.

The opportunity, however, extends beyond the standard benefits of off‐the‐shelf merchant silicon. Processors that form the foundation of complete communications platforms, based on a simple programming model, promise to radically improve the way networking technology is brought to market. This adds up to better product features,
faster time‐to‐market and better reliability for network equipment vendors and their customers. With their combination of high performance and software‐programmable feature flexibility, network processors are indeed revolutionizing the way all kinds of feature‐rich, high‐speed networking and communications products are brought to market today. Opportunities even exist for network processors outside of networking: they may be suitable for other tasks that handle heavy streams of data, such as RAID arrays in large servers and the video‐switching equipment at cable‐TV head‐end plants. Only time will tell the extent to which network processors are applied in other areas.

To summarize:

• Network processors are developing very fast and are a hot research area.
• Multithreaded network processor architectures provide tremendous packet processing capability.
• Network processors can be applied in various network layers and applications.
• Hardwired performance, programmability of deep packet processing and differentiated solutions are some of the advantages that network processors offer.
• Software reuse and portability are further benefits of using network processors.

Open software interfaces encourage third‐party development; therefore interoperability and functionality will grow for network processors in the coming years.

The final winner in the network processor industry will be the company that is able to quickly deliver price‐competitive products to meet market requirements; support all of the applications and interfaces of the value chain; and focus on superior customer service, supply chain management and marketing programs.
