
Hardware Approaches for Fast Lookup & Classification

Content Addressable Memory


  1. Hardware Approaches for Fast Lookup & Classification November 09, 2005 by Jignesh Patel CS590AC: Advanced Computing System Design
  2. Hardware Approaches for Fast Lookup & Classification <ul><li>Motivation </li></ul><ul><li>RAM Based Lookup </li></ul><ul><li>CAM Based Lookup </li></ul><ul><li>References </li></ul>
  3. Motivation <ul><li>Need for speed </li></ul><ul><ul><li>High-speed packet processing </li></ul></ul><ul><ul><ul><li>Interfaces can support OC-192c (10 Gbps) and OC-768c (40 Gbps) </li></ul></ul></ul><ul><ul><ul><li>Interfaces for 10 Tbps under development </li></ul></ul></ul><ul><li>Software/RAM based implementation </li></ul><ul><ul><li>Disadvantage </li></ul></ul><ul><ul><ul><li>Lookup is not fast enough to keep up with wire speed </li></ul></ul></ul><ul><ul><li>Advantages </li></ul></ul><ul><ul><ul><li>Flexible for later modifications </li></ul></ul></ul><ul><ul><ul><li>But such modifications are unlikely in the near future </li></ul></ul></ul><ul><ul><ul><ul><li>E.g. the IP addressing scheme based on best matching prefix </li></ul></ul></ul></ul>
  4. RAM Based Lookup <ul><li>All software approaches use some form of Random Access Memory </li></ul><ul><ul><li>To store & retrieve data structures </li></ul></ul><ul><li>RAM operations </li></ul><ul><ul><li>Writing data to a specific address </li></ul></ul><ul><ul><li>Reading data from a given address </li></ul></ul><ul><li>For IP address lookup or packet classification/filtering </li></ul><ul><ul><li>Multiple RAM operations are needed </li></ul></ul>
  5. RAM Based Lookup <ul><li>How to perform lookup in a single memory access? </li></ul><ul><ul><li>Use the destination address as a direct index ( address ) into memory. </li></ul></ul><ul><ul><li>The data stored at this address is the next hop information </li></ul></ul><ul><ul><li>Issues? </li></ul></ul><ul><ul><ul><li>The size of RAM required for a direct index grows exponentially with the number of bits in the destination address </li></ul></ul></ul><ul><ul><ul><li>E.g. a 32-bit IPv4 address needs 2^32 entries, i.e. 4 GB of RAM at one byte per entry </li></ul></ul></ul><ul><ul><ul><li>A 128-bit IPv6 address needs 2^128 entries, i.e. 316912650057057350374175801344 GB </li></ul></ul></ul><ul><ul><li>For this reason, direct-indexed RAM lookup is not used by any router vendor. </li></ul></ul>
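The sizing argument on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the function names, the 8-bit toy address space, and the one-byte-per-entry default are assumptions made for the example, not part of the original slides.

```python
def direct_index_entries(address_bits: int) -> int:
    """Number of table entries when every possible address is its own index."""
    return 2 ** address_bits


def direct_index_gb(address_bits: int, bytes_per_entry: int = 1) -> int:
    """Table size in GB (taken as 2**30 bytes), assuming bytes_per_entry per next hop."""
    return direct_index_entries(address_bits) * bytes_per_entry // 2 ** 30


# Toy direct-indexed table for a hypothetical 8-bit address space.
ADDRESS_BITS = 8
next_hop_table = [0] * direct_index_entries(ADDRESS_BITS)
next_hop_table[0b10100101] = 3  # destination 165 forwards to port 3


def lookup(destination: int) -> int:
    """Single memory access: the destination address is used as the index."""
    return next_hop_table[destination]


print(lookup(0b10100101))    # 3
print(direct_index_gb(32))   # 4
print(direct_index_gb(128))  # 316912650057057350374175801344
```

The lookup itself is a single indexed read; the cost is entirely in the table size, which doubles with every extra address bit.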
  6. CAM Based Lookup <ul><li>Content-addressable memories (CAMs) </li></ul><ul><ul><li>Hardware search engines </li></ul></ul><ul><ul><li>Much faster than algorithmic search techniques </li></ul></ul><ul><ul><li>Use conventional memory (usually SRAM) with additional circuitry for comparisons </li></ul></ul><ul><ul><ul><li>This allows a search of the entire memory to complete in a single clock cycle. </li></ul></ul></ul>
  7. CAM Based Lookup <ul><li>How is it different from RAM? </li></ul><ul><ul><li>RAM </li></ul></ul><ul><ul><ul><li>Data is stored at a particular location called the address </li></ul></ul></ul><ul><ul><ul><li>The user supplies the address to retrieve the data </li></ul></ul></ul><ul><ul><li>CAM </li></ul></ul><ul><ul><ul><li>Data can be stored without knowing the address. It is stored in the next free location. </li></ul></ul></ul><ul><ul><ul><li>The user supplies the data and gets the address back. </li></ul></ul></ul><ul><ul><ul><li>A CAM word consists of </li></ul></ul></ul><ul><ul><ul><ul><li>search-field : matched against the search key </li></ul></ul></ul></ul><ul><ul><ul><ul><li>return-field : the information returned after a successful search </li></ul></ul></ul></ul><ul><ul><ul><li>E.g. the search-field usually contains addresses of known destinations and the return-field contains the next hop or related information </li></ul></ul></ul>
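The RAM/CAM contrast above can be imitated in software. The following is a toy model under stated assumptions: the class name `BinaryCAM` and its methods are hypothetical, and the sequential loop only reproduces the result of a search that real hardware performs on all words in parallel.

```python
from typing import Optional


class BinaryCAM:
    """Toy software model of a CAM: search by content, get the address back."""

    def __init__(self, num_words: int) -> None:
        # Each word is None (free) or a (search_field, return_field) pair.
        self.words: list = [None] * num_words

    def write(self, search_field: int, return_field: str) -> int:
        """Store a word in the next free location and return its address."""
        for address, word in enumerate(self.words):
            if word is None:
                self.words[address] = (search_field, return_field)
                return address
        raise MemoryError("CAM is full")

    def search(self, key: int) -> Optional[int]:
        """Return the address of the first word whose search-field equals key."""
        for address, word in enumerate(self.words):
            if word is not None and word[0] == key:
                return address
        return None  # no match


cam = BinaryCAM(num_words=4)
cam.write(0xC0A80001, "port 2")  # caller never chooses the address
cam.write(0x0A000001, "port 7")

hit = cam.search(0x0A000001)
print(hit)                # 1
print(cam.words[hit][1])  # port 7
print(cam.search(0x1))    # None
```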
  8. CAM Based Lookup <ul><li>The size of the CAM depends on </li></ul><ul><ul><li>The number of prefixes that need to be stored </li></ul></ul><ul><ul><li>The size of the key only affects the number of bits stored in each location. </li></ul></ul><ul><ul><li>E.g. for searching 256 entries of 32-bit IPv4 addresses, the CAM must have 256 words, each at least 32 bits long. </li></ul></ul><ul><li>The access speed depends on </li></ul><ul><ul><li>The size of the associated information </li></ul></ul><ul><ul><li>If the associated information is small (e.g. output port/interface #) </li></ul></ul><ul><ul><ul><li>The CAM word can store it along with the address to match. </li></ul></ul></ul><ul><ul><ul><li>This gives fast, direct access since it requires a single CAM read. </li></ul></ul></ul><ul><ul><li>If the associated information is large (e.g. Layer 2 MAC address) </li></ul></ul><ul><ul><ul><li>The CAM word stores an index to the associated information </li></ul></ul></ul><ul><ul><ul><li>This needs both a CAM read and a RAM read. </li></ul></ul></ul>
  9. CAM Based Lookup <ul><li>For IP address lookup </li></ul><ul><ul><li>A longest prefix matching operation can be performed using exact match search in 32 separate CAMs </li></ul></ul><ul><ul><li>CAM-i stores prefixes of length i </li></ul></ul><ul><ul><li>The incoming IP address is given as input to all CAMs. </li></ul></ul><ul><ul><li>The outputs of the CAMs are filtered through a priority encoder, which picks the CAM with the longest match. </li></ul></ul><ul><ul><li>Expensive: each CAM needs to be big enough to store a large number of prefixes </li></ul></ul>[Figure labels: IP Address, CAM-1, CAM-2, ..., CAM-32, Priority Encoder, Next-Hop Table (RAM)]
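A rough software analogue of this arrangement uses one exact-match table per prefix length and then picks the longest length that hit. The names `cams`, `add_prefix`, and `lookup` are hypothetical, the example routes are made up, and the dictionaries stand in for CAMs that hardware would search in parallel.

```python
import ipaddress
from typing import Optional

# cams[i] plays the role of CAM-i: exact match on the top i bits of the address.
cams: list = [dict() for _ in range(33)]  # index 0 is unused


def add_prefix(cidr: str, next_hop: str) -> None:
    """Store an IPv4 prefix in the table that matches its length."""
    network = ipaddress.IPv4Network(cidr)
    length = network.prefixlen
    if length < 1:
        raise ValueError("this sketch handles prefix lengths 1..32 only")
    top_bits = int(network.network_address) >> (32 - length)
    cams[length][top_bits] = next_hop


def lookup(address: str) -> Optional[str]:
    """Longest prefix match via 32 exact-match searches and a priority pick."""
    addr = int(ipaddress.IPv4Address(address))
    # Every table is searched with the same address.
    hits = [(addr >> (32 - i)) in cams[i] for i in range(1, 33)]
    # Priority encoder: choose the longest prefix length that reported a match.
    for i in range(32, 0, -1):
        if hits[i - 1]:
            return cams[i][addr >> (32 - i)]
    return None


add_prefix("10.0.0.0/8", "A")
add_prefix("10.1.0.0/16", "B")
print(lookup("10.1.2.3"))   # B
print(lookup("10.2.3.4"))   # A
print(lookup("192.0.2.1"))  # None
```

The sketch also makes the cost visible: every one of the 32 tables has to be provisioned for its own worst case, which is why the slide calls this approach expensive.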
  10. CAM Based Lookup <ul><li>A binary CAM stores one of two states per bit, 0 or 1 </li></ul><ul><li>Ternary CAM </li></ul><ul><ul><li>A TCAM stores one of three states per bit: 0, 1 and X (don’t care) </li></ul></ul><ul><ul><li>Allows single clock cycle lookups for arbitrary bit mask matches </li></ul></ul><ul><ul><li>Stores each W-bit field as a ( value , bitmask ) pair, where value and bitmask are each W bits wide </li></ul></ul><ul><ul><li>E.g., if W=4, a prefix 01* is stored as the pair (0100, 1100). </li></ul></ul><ul><ul><li>A given input key K matches a stored ( value , bitmask ) pair if (K & bitmask) = (value & bitmask) </li></ul></ul>
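The ( value , bitmask ) encoding and the match rule translate directly into bitwise operations. This is a minimal sketch: the helper names are hypothetical, and the search assumes entries are ordered so that the first hit is the preferred one (for example, longer prefixes first), which the slides do not spell out.

```python
from typing import List, Optional, Tuple


def prefix_to_pair(prefix: str, width: int) -> Tuple[int, int]:
    """Convert a prefix string such as '01*' into a (value, bitmask) pair."""
    bits = prefix.rstrip("*")
    if len(bits) > width or any(b not in "01" for b in bits):
        raise ValueError("invalid prefix")
    pad = width - len(bits)  # number of don't-care bits
    value = (int(bits, 2) << pad) if bits else 0
    bitmask = ((1 << len(bits)) - 1) << pad
    return value, bitmask


def matches(key: int, value: int, bitmask: int) -> bool:
    """The match rule from the slide: (K & bitmask) = (value & bitmask)."""
    return (key & bitmask) == (value & bitmask)


def tcam_search(entries: List[Tuple[int, int]], key: int) -> Optional[int]:
    """Return the index of the first matching entry, or None."""
    for index, (value, bitmask) in enumerate(entries):
        if matches(key, value, bitmask):
            return index
    return None


value, bitmask = prefix_to_pair("01*", 4)
print(format(value, "04b"), format(bitmask, "04b"))  # 0100 1100

entries = [prefix_to_pair("011*", 4), prefix_to_pair("01*", 4)]
print(tcam_search(entries, 0b0111))  # 0
print(tcam_search(entries, 0b0100))  # 1
print(tcam_search(entries, 0b1000))  # None
```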
  11. CAM Based Lookup <ul><li>Such prefix matching works well for IP address lookup </li></ul><ul><li>But it is not well suited to matching ranges (e.g. a port number range) </li></ul><ul><li>Solution </li></ul><ul><ul><li>Replace each rule with several rules, each covering a portion of the desired range. </li></ul></ul><ul><ul><li>This requires splitting the range into smaller ranges that can each be expressed as a ( value , bitmask ) pair. </li></ul></ul><ul><ul><li>E.g. with W=4, the range 2-10 can be split into the set 001*, 01*, 100* and 1010 </li></ul></ul>
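One common way to perform this split is a greedy walk that repeatedly takes the largest aligned power-of-two block starting at the low end of the range. The slides do not name a specific algorithm, so the function below is an assumed illustration; it reproduces the 2-10 example.

```python
def range_to_prefixes(low: int, high: int, width: int) -> list:
    """Split the inclusive range [low, high] into prefixes of a width-bit field."""
    if not 0 <= low <= high < (1 << width):
        raise ValueError("range does not fit in the field")
    prefixes = []
    while low <= high:
        # Largest power-of-two block that is aligned at `low` ...
        size = (low & -low) if low else (1 << width)
        # ... shrunk until it no longer runs past `high`.
        while size > high - low + 1:
            size >>= 1
        free_bits = size.bit_length() - 1  # trailing don't-care bits
        fixed_bits = width - free_bits
        bits = format(low >> free_bits, "0{}b".format(fixed_bits)) if fixed_bits else ""
        prefixes.append(bits + ("*" if free_bits else ""))
        low += size
    return prefixes


print(range_to_prefixes(2, 10, 4))  # ['001*', '01*', '100*', '1010']
```

Each returned prefix maps to one ( value , bitmask ) entry, so a single range rule can cost several TCAM entries.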
  12. CAM Based Lookup <ul><li>For multiple field classifiers </li></ul><ul><ul><li>A TCAM is needed for each field </li></ul></ul><ul><li>E.g. a two-field classifier </li></ul>[Figure labels: F1, F2, TCAM-A, TCAM-B, AND, Priority Encoder, Action Memory (RAM)]
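One plausible reading of the two-field figure is sketched below: row i of TCAM-A and row i of TCAM-B hold the two fields of rule i, the per-row match bits are ANDed, and a priority encoder picks the first rule that matched on both fields. The row alignment, the rule ordering, and the example rules and actions are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, int]  # (value, bitmask)


def match_vector(tcam: List[Entry], key: int) -> List[bool]:
    """One match bit per TCAM row for the given key."""
    return [(key & mask) == (value & mask) for value, mask in tcam]


def classify(f1: int, f2: int, tcam_a: List[Entry], tcam_b: List[Entry],
             actions: List[str]) -> Optional[str]:
    hits_a = match_vector(tcam_a, f1)
    hits_b = match_vector(tcam_b, f2)
    # AND stage: a rule matches only if both of its fields match.
    combined = [a and b for a, b in zip(hits_a, hits_b)]
    # Priority encoder: lowest-numbered matching rule wins.
    for rule, hit in enumerate(combined):
        if hit:
            return actions[rule]  # read from the action memory
    return None


# Rule 0: F1 = 10**, F2 = 0101 -> drop.  Rule 1: F1 = ****, F2 = 01** -> forward.
tcam_a = [(0b1000, 0b1100), (0b0000, 0b0000)]
tcam_b = [(0b0101, 0b1111), (0b0100, 0b1100)]
actions = ["drop", "forward"]

print(classify(0b1011, 0b0101, tcam_a, tcam_b, actions))  # drop
print(classify(0b0011, 0b0110, tcam_a, tcam_b, actions))  # forward
print(classify(0b1011, 0b1111, tcam_a, tcam_b, actions))  # None
```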
  13. CAM Based Lookup <ul><li>TCAMs are increasingly being used because of their simplicity and speed. </li></ul><ul><li>TCAMs are generally used in routers </li></ul><ul><li>Binary CAMs are used in switches </li></ul><ul><li>Some disadvantages </li></ul><ul><ul><li>High cost per bit </li></ul></ul><ul><ul><li>High power consumption </li></ul></ul><ul><ul><li>Storage inefficiency </li></ul></ul>
  14. References <ul><li>Chapter 4 draft from book by Dr. Medhi and Ramasamy </li></ul><ul><li>Chapter 17 draft from book by Dr. Medhi and Ramasamy </li></ul>