Hardware Approaches for Fast Lookup & Classification
 

    Presentation Transcript

    • Hardware Approaches for Fast Lookup & Classification, by Jignesh Patel, November 09, 2005. CS590AC: Advanced Computing System Design
    • Hardware Approaches for Fast Lookup & Classification
      • Motivation
      • RAM Based Lookup
      • CAM Based Lookup
      • References
    • Motivation
      • Need for speed
        • High speed packet processing
          • Interfaces can support OC-192c (10 Gbps) and OC-768c (40 Gbps)
          • Interfaces for 10 Tbps under development
      • Software/RAM based implementation
        • Disadvantage
          • Lookup is not fast enough to match wire speed
        • Advantages
          • Flexible for later modifications
          • But such modifications are less likely in the near future
            • E.g. IP addressing scheme based on best matching prefix
    • RAM Based Lookup
      • All software approaches use some form of Random Access Memory
        • To store & retrieve data structures
      • RAM operations
        • Writing Data into a specific address
        • Reading Data from a given address
      • For IP address lookup or Packet classification/filtering
        • Need multiple RAM operations
    • RAM Based Lookup
      • How to perform lookup in a single memory access?
        • Use destination address as a direct index ( address ) into memory.
        • The data stored at this address will be the next hop information
        • Issues?
          • The size of RAM required for a direct index grows exponentially with the number of bits in the destination address: a W-bit address needs 2^W entries
          • E.g. a 32-bit IPv4 address needs 4 GB of RAM, even at one byte per entry
          • A 128-bit IPv6 address needs 316912650057057350374175801344 GB on the same basis
        • Such direct-index RAM lookup is not used by any router vendor.
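      [Figure: memory indexed directly by destination address, from 000.000.000.000 to 255.255.255.255; sample addresses 128.128.128.128 and 172.12.180.20; stored next-hop values 4, 2, 6]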
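The sizing argument above can be checked with a few lines of arithmetic. This is only a sketch: the function name and the one-byte-per-entry default are assumptions made for illustration, chosen because they reproduce the slide's figures.

```python
def direct_index_table_bytes(address_bits: int, bytes_per_entry: int = 1) -> int:
    """Bytes needed for a table holding one entry per possible address."""
    return (2 ** address_bits) * bytes_per_entry


GB = 2 ** 30  # the slide's "GB", taken here as 2^30 bytes

# Assumption: one byte of next-hop information per entry.
ipv4_gb = direct_index_table_bytes(32) // GB
ipv6_gb = direct_index_table_bytes(128) // GB

print(ipv4_gb)  # 4
print(ipv6_gb)  # 316912650057057350374175801344
```

Each extra address bit doubles the table, which is why the approach stops being practical long before 128 bits.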
    • CAM Based Lookup
      • Content-addressable memories (CAMs)
        • Hardware search engines
        • Much faster than algorithmic search techniques
        • Uses conventional memory (usually SRAM) with additional circuitry for comparisons
          • This enables searching the entire memory to be completed in a single clock cycle.
    • CAM Based Lookup
      • How is it different from RAM?
        • RAM
          • Data is stored at a particular location called address
          • User supplies the address to retrieve the data
        • CAM
          • Data can be stored without knowing the address; it goes into the next free location.
          • User supplies the data and gets the address back.
          • CAM word consists of
            • search-field: matched against the search key
            • return-field: the information returned after a successful search
          • E.g. the search-field usually contains addresses of known destinations and the return-field contains the next hop or related information
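The RAM/CAM contrast above can be illustrated with a toy software model. The class name, method names, and the "port 2" return value are hypothetical, and the loop in `search` is sequential only because this is software; a real CAM compares every word at once.

```python
import ipaddress
from typing import List, Optional, Tuple


class ExactMatchCAM:
    """Toy model of a binary CAM: each word is (search_field, return_field)."""

    def __init__(self, num_words: int) -> None:
        self.words: List[Optional[Tuple[int, str]]] = [None] * num_words

    def store(self, search_field: int, return_field: str) -> int:
        """Write into the next free location; the caller never picks an address."""
        for location, word in enumerate(self.words):
            if word is None:
                self.words[location] = (search_field, return_field)
                return location
        raise MemoryError("CAM is full")

    def search(self, key: int) -> Optional[Tuple[int, str]]:
        """Supply data, get back (location, return_field), or None on a miss."""
        for location, word in enumerate(self.words):
            if word is not None and word[0] == key:
                return location, word[1]
        return None


cam = ExactMatchCAM(num_words=4)
destination = int(ipaddress.IPv4Address("172.12.180.20"))
cam.store(destination, "port 2")   # illustrative return-field value
print(cam.search(destination))     # (0, 'port 2')
```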
    • CAM Based Lookup
      • The size of CAM depends on
        • The number of prefixes that need to be stored
        • The size of the key only affects the number of bits stored in each location.
        • E.g. for searching 256 entries of 32-bit IPv4 addresses, the CAM must have 256 words, each at least 32 bits long.
      • The access speed depends on
        • The size of associated information
        • If the associated information is small (e.g. output port/interface #)
          • The CAM word can store this along with the address to match.
          • Provides a fast and direct access since it requires a single CAM read.
        • If the associated information is large (e.g. layer-2 MAC address)
          • The CAM word stores an index to the associated information
          • Needs both a CAM read and a RAM read.
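The two cases above differ only in what the CAM word returns. In the sketch below a Python dict stands in for the CAM, and all keys, port numbers, and the MAC address are made-up illustrative values.

```python
from typing import Dict, List, Optional

# Case 1: small associated information (an output port) sits in the CAM word itself.
cam_with_port: Dict[int, int] = {0x0A000001: 3}

# Case 2: large associated information lives in RAM; the CAM word holds an index.
cam_with_index: Dict[int, int] = {0x0A000001: 0}
ram: List[str] = ["00:11:22:33:44:55"]


def lookup_small(key: int) -> Optional[int]:
    """One CAM read returns the answer directly."""
    return cam_with_port.get(key)


def lookup_large(key: int) -> Optional[str]:
    """A CAM read yields an index, then a RAM read fetches the data."""
    index = cam_with_index.get(key)
    return None if index is None else ram[index]


print(lookup_small(0x0A000001))  # 3
print(lookup_large(0x0A000001))  # 00:11:22:33:44:55
```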
    • CAM Based Lookup
      • For IP address lookup
        • A longest prefix matching operation can be performed using exact-match search in 32 separate CAMs
        • CAM-i stores prefixes of length i
        • The incoming IP address is given as input to all CAMs.
        • The output of the CAMs is filtered through a priority encoder, which picks the longest matching CAM.
        • Expensive: each CAM needs to be big enough to store a large number of prefixes
      [Figure: the IP address feeds CAM-1, CAM-2, ..., CAM-32 in parallel; their outputs go to a priority encoder, which indexes the next-hop table in RAM]
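The 32-CAM arrangement can be sketched in software as one exact-match table per prefix length. The class and method names and the example routes are assumptions for illustration; each dict stands in for one CAM, and the default route (length 0) is left out because the slide has only CAM-1 through CAM-32.

```python
import ipaddress
from typing import Dict, List, Optional

ADDRESS_BITS = 32


class PerLengthCamLookup:
    """Longest prefix match built from 32 exact-match tables (CAM-1..CAM-32)."""

    def __init__(self) -> None:
        # cams[i] models CAM-i: top i bits of a prefix -> index into the next-hop table
        self.cams: Dict[int, Dict[int, int]] = {
            i: {} for i in range(1, ADDRESS_BITS + 1)
        }
        self.next_hop_table: List[str] = []  # models the next-hop RAM

    def add_prefix(self, cidr: str, next_hop: str) -> None:
        network = ipaddress.IPv4Network(cidr)
        length = network.prefixlen
        if length == 0:
            raise ValueError("length-0 prefixes are not modelled in this sketch")
        top_bits = int(network.network_address) >> (ADDRESS_BITS - length)
        self.next_hop_table.append(next_hop)
        self.cams[length][top_bits] = len(self.next_hop_table) - 1

    def lookup(self, address: str) -> Optional[str]:
        key = int(ipaddress.IPv4Address(address))
        # Every CAM sees the address; hardware does these searches in parallel.
        hits = {
            i: self.cams[i].get(key >> (ADDRESS_BITS - i))
            for i in range(1, ADDRESS_BITS + 1)
        }
        # Priority encoder: the longest prefix length with a hit wins.
        for length in range(ADDRESS_BITS, 0, -1):
            if hits[length] is not None:
                return self.next_hop_table[hits[length]]
        return None


table = PerLengthCamLookup()
table.add_prefix("10.0.0.0/8", "hop A")
table.add_prefix("10.1.0.0/16", "hop B")
print(table.lookup("10.1.2.3"))     # hop B
print(table.lookup("10.2.0.1"))     # hop A
print(table.lookup("192.168.0.1"))  # None
```

The cost noted on the slide shows up here too: every one of the 32 tables has to be provisioned for its worst case, even if most prefixes cluster at a few lengths.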
    • CAM Based Lookup
      • A binary CAM stores only two states, 0 / 1
      • Ternary CAM
        • TCAM stores one of the three states 0, 1 and X (don’t care)
        • Allows single clock cycle lookups for arbitrary bit mask matches
        • Stores each W-bit field as a ( value , bitmask ) pair, where value and bitmask are each W bits wide
        • E.g., if W=4, a prefix 01* is stored as the pair (0100, 1100).
        • A given input key K matches a stored ( value , bitmask ) pair if (K & bitmask) = (value & bitmask)
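The match rule above translates directly into bitwise operations. The helper names are hypothetical, and returning the lowest-index match in `tcam_search` is an assumption about entry priority, not something the slide states.

```python
from typing import List, Optional, Tuple


def prefix_to_pair(prefix: str, width: int) -> Tuple[int, int]:
    """Convert a prefix such as '01*' into a (value, bitmask) pair of `width` bits."""
    bits = prefix.rstrip("*")
    pad = width - len(bits)
    value = (int(bits, 2) << pad) if bits else 0
    bitmask = ((1 << len(bits)) - 1) << pad
    return value, bitmask


def tcam_matches(key: int, value: int, bitmask: int) -> bool:
    """The slide's rule: only bit positions set in bitmask are compared."""
    return (key & bitmask) == (value & bitmask)


def tcam_search(entries: List[Tuple[int, int]], key: int) -> Optional[int]:
    """Index of the first matching entry (assumed priority order), or None."""
    for index, (value, bitmask) in enumerate(entries):
        if tcam_matches(key, value, bitmask):
            return index
    return None


value, bitmask = prefix_to_pair("01*", width=4)
print(format(value, "04b"), format(bitmask, "04b"))  # 0100 1100
print(tcam_matches(0b0110, value, bitmask))          # True
print(tcam_matches(0b1000, value, bitmask))          # False
```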
    • CAM Based Lookup
      • Such prefix matching works well for IP address lookup
      • But TCAMs are not well suited to matching ranges (e.g. a port number range)
      • Solution
        • Replace each rule with several rules, each covering a portion of desired range.
        • Requires splitting the range into smaller ranges that can be expressed as ( value , bitmask ) pair.
        • E.g. with 4-bit values, the range 2-10 can be split into the set 001*, 01*, 100* and 1010
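The range splitting can be sketched as a greedy loop: at each step, take the largest power-of-two block that starts at the current low end, is aligned to its own size, and still fits inside the range. The function name is hypothetical, and this is one common way to do the expansion rather than necessarily the method the slides have in mind.

```python
from typing import List


def range_to_prefixes(low: int, high: int, width: int) -> List[str]:
    """Split the inclusive range [low, high] of width-bit values into prefixes."""
    prefixes: List[str] = []
    while low <= high:
        size = 1
        # Grow the block while it stays aligned, inside the range, and within width bits.
        while (
            low % (size * 2) == 0
            and low + size * 2 - 1 <= high
            and size * 2 <= (1 << width)
        ):
            size *= 2
        free_bits = size.bit_length() - 1   # number of don't-care bits
        fixed_bits = width - free_bits
        bits = format(low >> free_bits, "0{}b".format(fixed_bits)) if fixed_bits else ""
        prefixes.append(bits + ("*" if free_bits else ""))
        low += size
    return prefixes


print(range_to_prefixes(2, 10, width=4))  # ['001*', '01*', '100*', '1010']
```

A single range rule can therefore occupy several TCAM entries. For example, the 4-bit range 1-14 expands to six prefixes, which is part of the storage inefficiency the slides mention later.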
    • CAM Based Lookup
      • For multiple field classifiers
        • Needs a TCAM for each field
      • E.g. a Two field classifier
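      [Figure: field F1 feeds TCAM-A and field F2 feeds TCAM-B; their match outputs are ANDed and passed to a priority encoder, which indexes the action memory in RAM]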
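The two-field arrangement can be sketched by having each TCAM return one match bit per rule, ANDing the two vectors, and letting a priority encoder pick a rule. The sketch assumes rule i occupies row i in both TCAMs and that the lowest-numbered rule wins; the rules and action names are made up for illustration.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, int]  # (value, bitmask)


def match_vector(tcam: List[Entry], key: int) -> List[bool]:
    """One match bit per row, as a TCAM produces across all rows at once."""
    return [(key & bitmask) == (value & bitmask) for value, bitmask in tcam]


def classify(
    tcam_a: List[Entry],
    tcam_b: List[Entry],
    action_memory: List[str],
    f1: int,
    f2: int,
) -> Optional[str]:
    hits_a = match_vector(tcam_a, f1)
    hits_b = match_vector(tcam_b, f2)
    # A rule matches only if both of its fields match (the AND stage).
    combined = [a and b for a, b in zip(hits_a, hits_b)]
    # Priority encoder: the first (lowest-numbered) matching rule wins.
    for rule, hit in enumerate(combined):
        if hit:
            return action_memory[rule]
    return None


# Two illustrative 4-bit rules:
#   rule 0: F1 = 01*, F2 = 1010 -> "deny"
#   rule 1: F1 = *,   F2 = 10*  -> "permit"
tcam_a = [(0b0100, 0b1100), (0b0000, 0b0000)]
tcam_b = [(0b1010, 0b1111), (0b1000, 0b1100)]
actions = ["deny", "permit"]

print(classify(tcam_a, tcam_b, actions, f1=0b0110, f2=0b1010))  # deny
print(classify(tcam_a, tcam_b, actions, f1=0b1111, f2=0b1011))  # permit
print(classify(tcam_a, tcam_b, actions, f1=0b0110, f2=0b0001))  # None
```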
    • CAM Based Lookup
      • TCAMs are increasingly being used because of their simplicity and speed.
      • Generally used in Routers
      • Binary CAMs are used in switches
      • Some disadvantages
        • High cost per bit
        • High power consumption
        • Storage inefficiency
    • References
      • Chapter 4 draft from book by Dr. Medhi and Ramasamy
      • Chapter 17 draft from book by Dr. Medhi and Ramasamy