Hardware Approaches for Fast Lookup & Classification


Content Addressable Memory

Published in: Business, Technology

  1. 1. Hardware Approaches for Fast Lookup & Classification November 09, 2005 by Jignesh Patel CS590AC: Advanced Computing System Design
  2. 2. Hardware Approaches for Fast Lookup & Classification <ul><li>Motivation </li></ul><ul><li>RAM Based Lookup </li></ul><ul><li>CAM Based Lookup </li></ul><ul><li>References </li></ul>
  3. 3. Motivation <ul><li>Need for speed </li></ul><ul><ul><li>High speed packet processing </li></ul></ul><ul><ul><ul><li>Interfaces can support OC192c (10 Gbps) and OC768c (40 Gbps) </li></ul></ul></ul><ul><ul><ul><li>Interfaces for 10 Tbps are under development </li></ul></ul></ul><ul><li>Software/RAM based implementation </li></ul><ul><ul><li>Disadvantage </li></ul></ul><ul><ul><ul><li>Lookup is not fast enough to keep up with wire speed </li></ul></ul></ul><ul><ul><li>Advantages </li></ul></ul><ul><ul><ul><li>Flexible for later modifications </li></ul></ul></ul><ul><ul><ul><li>But such modifications are unlikely in the near future </li></ul></ul></ul><ul><ul><ul><ul><li>E.g. the IP addressing scheme based on best matching prefix </li></ul></ul></ul></ul>
  4. 4. RAM Based Lookup <ul><li>All software approaches use some form of random access memory (RAM) </li></ul><ul><ul><li>To store & retrieve data structures </li></ul></ul><ul><li>RAM operations </li></ul><ul><ul><li>Writing data to a specific address </li></ul></ul><ul><ul><li>Reading data from a given address </li></ul></ul><ul><li>For IP address lookup or packet classification/filtering </li></ul><ul><ul><li>Multiple RAM operations are needed per lookup </li></ul></ul>
  5. 5. RAM Based Lookup <ul><li>How to perform a lookup in a single memory access? </li></ul><ul><ul><li>Use the destination address as a direct index (address) into memory. </li></ul></ul><ul><ul><li>The data stored at this address is the next hop information. </li></ul></ul><ul><ul><li>Issues? </li></ul></ul><ul><ul><ul><li>The size of RAM required for a direct index grows exponentially with the number of bits in the destination address </li></ul></ul></ul><ul><ul><ul><li>E.g. a 32 bit IPv4 address needs 4 GB of RAM (2^32 entries, assuming one byte per entry) </li></ul></ul></ul><ul><ul><ul><li>A 128 bit IPv6 address needs 316912650057057350374175801344 GB (2^98 GB) </li></ul></ul></ul><ul><ul><li>Direct-index RAM lookup is therefore not used by router vendors. </li></ul></ul>[Figure: memory indexed directly by the destination address]
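The sizes quoted on this slide can be checked with a few lines of arithmetic. The sketch below assumes one byte of next-hop information per table entry and 1 GB = 2^30 bytes; both are assumptions chosen to match the slide's numbers, and the function name is hypothetical.

```python
# Assumption: one byte of next-hop data per entry, and 1 GB = 2**30 bytes.
BYTES_PER_ENTRY = 1
GB = 2 ** 30


def direct_index_table_gb(address_bits, bytes_per_entry=BYTES_PER_ENTRY):
    """Size in GB of a table holding one entry per possible address."""
    return (2 ** address_bits) * bytes_per_entry // GB


if __name__ == "__main__":
    print("IPv4 (32 bit): ", direct_index_table_gb(32), "GB")
    print("IPv6 (128 bit):", direct_index_table_gb(128), "GB")
```

Each extra address bit doubles the table, which is why a direct index is impractical well before 128 bits.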
  6. 6. CAM Based Lookup <ul><li>Content-addressable memories (CAMs) </li></ul><ul><ul><li>Hardware search engines </li></ul></ul><ul><ul><li>Much faster than algorithmic search techniques </li></ul></ul><ul><ul><li>Use conventional memory (usually SRAM) with additional circuitry for comparisons </li></ul></ul><ul><ul><ul><li>This allows a search of the entire memory to complete in a single clock cycle. </li></ul></ul></ul>
  7. 7. CAM Based Lookup <ul><li>How is it different from RAM? </li></ul><ul><ul><li>RAM </li></ul></ul><ul><ul><ul><li>Data is stored at a particular location called the address </li></ul></ul></ul><ul><ul><ul><li>The user supplies the address to retrieve the data </li></ul></ul></ul><ul><ul><li>CAM </li></ul></ul><ul><ul><ul><li>Data can be stored without knowing the address; it is stored in the next free location. </li></ul></ul></ul><ul><ul><ul><li>The user supplies the data and gets the address back. </li></ul></ul></ul><ul><ul><ul><li>A CAM word consists of </li></ul></ul></ul><ul><ul><ul><ul><li>search-field : matched against the search key </li></ul></ul></ul></ul><ul><ul><ul><ul><li>return-field : the information returned after a successful search </li></ul></ul></ul></ul><ul><ul><ul><li>E.g. the search-field usually contains addresses of known destinations and the return-field contains the next hop or related information </li></ul></ul></ul>
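The RAM/CAM contrast above can be sketched as a small behavioral model. This is only an illustration: the class and method names are hypothetical, and the Python loop stands in for the comparison circuitry that checks every word in parallel in real hardware.

```python
class BinaryCAM:
    """Toy software model of a binary CAM (names are illustrative only)."""

    def __init__(self, size):
        # Each slot holds a (search_field, return_field) pair, or None if free.
        self.words = [None] * size

    def write(self, search_field, return_field):
        """Store the word in the next free location and return its address."""
        for address, word in enumerate(self.words):
            if word is None:
                self.words[address] = (search_field, return_field)
                return address
        raise MemoryError("CAM is full")

    def search(self, key):
        """Return (address, return_field) of the first matching word, or None.

        Hardware compares all words at once; this loop only models the result.
        """
        for address, word in enumerate(self.words):
            if word is not None and word[0] == key:
                return address, word[1]
        return None


if __name__ == "__main__":
    cam = BinaryCAM(size=4)
    cam.write("10.0.0.1", "port 3")
    print(cam.search("10.0.0.1"))
    print(cam.search("10.0.0.9"))
```

Note the inversion relative to RAM: `write` chooses the address, and `search` takes data and hands an address back.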
  8. 8. CAM Based Lookup <ul><li>The size of the CAM depends on </li></ul><ul><ul><li>The number of prefixes that need to be stored </li></ul></ul><ul><ul><li>The size of the key only affects the number of bits stored in each location. </li></ul></ul><ul><ul><li>E.g. for searching 256 entries of 32 bit IPv4 addresses, the CAM must have 256 words, each at least 32 bits long. </li></ul></ul><ul><li>The access speed depends on </li></ul><ul><ul><li>The size of the associated information </li></ul></ul><ul><ul><li>If the associated information is small (e.g. output port/interface #) </li></ul></ul><ul><ul><ul><li>The CAM word can store it along with the address to match. </li></ul></ul></ul><ul><ul><ul><li>This gives fast, direct access since it requires a single CAM read. </li></ul></ul></ul><ul><ul><li>If the associated information is large (e.g. layer 2 MAC address) </li></ul></ul><ul><ul><ul><li>The CAM word stores an index to the associated information </li></ul></ul></ul><ul><ul><ul><li>This needs both a CAM read and a RAM read. </li></ul></ul></ul>
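The two cases for associated information can be sketched as follows. This is a minimal illustration in which a Python dict stands in for the CAM and a list for the RAM; the table contents and function name are made up for the example.

```python
# CAM stage (modelled as a dict): destination address -> small index.
cam = {"10.0.0.1": 0, "10.0.0.2": 1}

# RAM stage: the larger associated data (here, made-up layer 2 MAC addresses).
ram = ["00:11:22:33:44:55", "66:77:88:99:aa:bb"]


def lookup_large_info(destination):
    """Two accesses: a CAM read for the index, then a RAM read for the data."""
    index = cam.get(destination)   # CAM read
    if index is None:
        return None                # no matching CAM word
    return ram[index]              # RAM read


if __name__ == "__main__":
    print(lookup_large_info("10.0.0.2"))
```

When the associated data is as small as a port number, it can live in the CAM word itself and the second access disappears.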
  9. 9. CAM Based Lookup <ul><li>For IP address lookup </li></ul><ul><ul><li>A longest prefix matching operation can be performed using exact match search in 32 separate CAMs </li></ul></ul><ul><ul><li>CAM-i stores prefixes of length i </li></ul></ul><ul><ul><li>The incoming IP address is given as input to all CAMs. </li></ul></ul><ul><ul><li>The outputs of the CAMs are filtered through a priority encoder, which picks the longest matching CAM. </li></ul></ul><ul><ul><li>Expensive: each CAM needs to be big enough to store a large number of prefixes </li></ul></ul>[Figure: IP address, CAM-1, CAM-2, ..., CAM-32, priority encoder, next-hop table (RAM)]
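The 32-CAM arrangement can be sketched in software as one exact-match table per prefix length. This is a behavioral model under stated assumptions: the class name is hypothetical, dicts stand in for the CAMs, the sequential scan stands in for a parallel search, and prefix length 0 (a default route) is not modelled because the slide describes CAM-1 through CAM-32 only.

```python
import ipaddress


class PrefixCAMBank:
    """Toy model of 32 exact-match CAMs, one per IPv4 prefix length."""

    def __init__(self):
        # cams[i] models CAM-i: it maps the top i address bits to a next hop.
        self.cams = {length: {} for length in range(1, 33)}

    def add(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        top_bits = int(net.network_address) >> (32 - net.prefixlen)
        self.cams[net.prefixlen][top_bits] = next_hop

    def lookup(self, address):
        addr = int(ipaddress.ip_address(address))
        # Every CAM is searched; hardware would do this in parallel.
        hits = {}
        for length, cam in self.cams.items():
            top_bits = addr >> (32 - length)
            if top_bits in cam:
                hits[length] = cam[top_bits]
        if not hits:
            return None
        # Priority encoder: the longest matching prefix length wins.
        return hits[max(hits)]


if __name__ == "__main__":
    bank = PrefixCAMBank()
    bank.add("10.0.0.0/8", "hop A")
    bank.add("10.1.0.0/16", "hop B")
    print(bank.lookup("10.1.2.3"))
    print(bank.lookup("10.2.0.1"))
```

The cost noted on the slide shows up here too: each of the 32 tables must be provisioned for the worst case, since the distribution of prefix lengths is not known in advance.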
  10. 10. CAM Based Lookup <ul><li>A binary CAM stores only two states, 0 / 1 </li></ul><ul><li>Ternary CAM </li></ul><ul><ul><li>A TCAM stores one of three states: 0, 1 and X (don’t care) </li></ul></ul><ul><ul><li>Allows single clock cycle lookups for arbitrary bit mask matches </li></ul></ul><ul><ul><li>Stores each W-bit field as a (value, bitmask) pair, where value and bitmask are each W bits wide </li></ul></ul><ul><ul><li>E.g., if W=4, a prefix 01* is stored as the pair (0100, 1100). </li></ul></ul><ul><ul><li>An input key K matches a stored (value, bitmask) pair if (K & bitmask) = (value & bitmask) </li></ul></ul>
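The match rule above translates directly into code. The sketch below is illustrative; the helper names `tcam_match` and `prefix_to_pair` are hypothetical and simply restate the slide's definitions.

```python
def tcam_match(key, value, bitmask):
    """True if key matches the stored (value, bitmask) pair."""
    return (key & bitmask) == (value & bitmask)


def prefix_to_pair(prefix, width):
    """Convert a prefix string such as '01*' into a (value, bitmask) pair."""
    bits = prefix.rstrip("*")
    pad = width - len(bits)                   # number of don't-care bits
    value = (int(bits, 2) << pad) if bits else 0
    bitmask = ((1 << len(bits)) - 1) << pad   # 1s over the specified bits
    return value, bitmask


if __name__ == "__main__":
    value, bitmask = prefix_to_pair("01*", width=4)
    print(format(value, "04b"), format(bitmask, "04b"))
    print(tcam_match(0b0110, value, bitmask))
    print(tcam_match(0b1000, value, bitmask))
```

Bits where the bitmask is 0 are the X (don't care) positions, so the same pair matches every key that shares the prefix.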
  11. 11. CAM Based Lookup <ul><li>Such prefix matching works well for IP address lookup </li></ul><ul><li>But TCAMs are not well suited to matching ranges (e.g. a port number range) </li></ul><ul><li>Solution </li></ul><ul><ul><li>Replace each rule with several rules, each covering a portion of the desired range. </li></ul></ul><ul><ul><li>This requires splitting the range into smaller ranges that can each be expressed as a (value, bitmask) pair. </li></ul></ul><ul><ul><li>E.g. with 4-bit values, the range 2-10 can be split into the set 001*, 01*, 100* and 1010 </li></ul></ul>
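One common way to perform this split is to repeatedly take the largest power-of-two block that starts at the current lower bound and fits inside the range. The sketch below uses that greedy approach as an assumption; the slides do not say which splitting procedure is intended, and the function name is hypothetical.

```python
def range_to_prefixes(lo, hi, width):
    """Split the inclusive range [lo, hi] of width-bit values into prefixes."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block size that is aligned at lo ...
        size = (lo & -lo) if lo else (1 << width)
        # ... shrunk until the block no longer runs past hi.
        while size > hi - lo + 1:
            size >>= 1
        fixed = width - (size.bit_length() - 1)   # number of specified bits
        bits = format(lo >> (width - fixed), "0{}b".format(fixed)) if fixed else ""
        prefixes.append(bits + ("*" if fixed < width else ""))
        lo += size
    return prefixes


if __name__ == "__main__":
    print(range_to_prefixes(2, 10, width=4))
```

For the range 2-10 with 4-bit values this yields the same four entries as the slide, which illustrates the storage inefficiency: one range rule becomes four TCAM entries.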
  12. 12. CAM Based Lookup <ul><li>For multiple field classifiers </li></ul><ul><ul><li>A TCAM is needed for each field </li></ul></ul><ul><li>E.g. a two field classifier </li></ul>[Figure: two-field classifier with inputs F1 and F2, TCAM-A, TCAM-B, an AND stage, a priority encoder, and an action memory (RAM)]
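The two-field arrangement can be sketched as two per-field match vectors that are ANDed rule by rule before a priority encoder selects one rule. This is a behavioral model under assumptions: the function names are hypothetical, row i of each TCAM is taken to belong to rule i, and the lowest rule index is treated as the highest priority, which the slides do not specify.

```python
def match_vector(tcam, key):
    """One match bit per TCAM row, each row being a (value, bitmask) pair."""
    return [(key & mask) == (value & mask) for value, mask in tcam]


def classify(tcam_a, tcam_b, actions, f1, f2):
    """Return the action of the first rule whose two fields both match."""
    hits_a = match_vector(tcam_a, f1)          # TCAM-A searches field F1
    hits_b = match_vector(tcam_b, f2)          # TCAM-B searches field F2
    combined = [a and b for a, b in zip(hits_a, hits_b)]   # AND stage
    # Priority encoder: assumed here to pick the lowest matching rule index.
    for rule, hit in enumerate(combined):
        if hit:
            return actions[rule]               # action memory (RAM) read
    return None


if __name__ == "__main__":
    # Rule 0: F1 matches 01** and F2 equals 1010. Rule 1: match anything.
    tcam_a = [(0b0100, 0b1100), (0b0000, 0b0000)]
    tcam_b = [(0b1010, 0b1111), (0b0000, 0b0000)]
    actions = ["drop", "forward"]
    print(classify(tcam_a, tcam_b, actions, 0b0110, 0b1010))
    print(classify(tcam_a, tcam_b, actions, 0b0110, 0b0001))
```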
  13. 13. CAM Based Lookup <ul><li>TCAMs are increasingly being used because of their simplicity and speed. </li></ul><ul><li>TCAMs are generally used in routers </li></ul><ul><li>Binary CAMs are used in switches </li></ul><ul><li>Some disadvantages </li></ul><ul><ul><li>High cost per bit </li></ul></ul><ul><ul><li>High power consumption </li></ul></ul><ul><ul><li>Storage inefficiency </li></ul></ul>
  14. 14. References <ul><li>Chapter 4 draft from book by Dr. Medhi and Ramasamy </li></ul><ul><li>Chapter 17 draft from book by Dr. Medhi and Ramasamy </li></ul>