2. What are Content Delivery Networks
• A centrally managed network of devices
that collectively facilitate the delivery of
content to end users
• Solve network bandwidth bottleneck
• Solve server throughput bottleneck
3. CDN Categories
• Network Infrastructure:
– Single ISP
– Overlay networks
– Enterprise premise
• Content types:
– Static images and texts
– Multimedia content: audio and video streams
– Dynamic HTML and XML pages
• Customers:
– Content providers
– Enterprise
4. Technology Components
• Content distribution
– Placing the content onto the delivery devices
• Request routing
– Steer users to a nearby delivery node
• Content delivery
– Protocol processing, access control, QoS mechanisms
• Resource accounting
– Logging and billing
5. Content Distribution
• Goal: position content objects into delivery
devices
• Different content types use different
techniques
– Static images and texts: pulled & cached, or
pushed
– Multimedia content: usually pre-positioned
– Dynamic pages: require prior setup
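The pull-and-cache and push styles for static objects can be sketched as follows. This is a minimal illustration, assuming a hypothetical in-memory `DeliveryNode` class and an origin server represented as a plain dict; a real node would fetch over HTTP.

```python
class DeliveryNode:
    """Hypothetical delivery node holding content objects in memory."""

    def __init__(self, origin):
        self.origin = origin      # stand-in for the origin server: url -> bytes
        self.store = {}           # objects currently positioned on this node
        self.origin_fetches = 0   # number of times the node went back to origin

    def push(self, url, body):
        """Push model: content is placed on the node before any request."""
        self.store[url] = body

    def get(self, url):
        """Pull model: fetch from origin on a miss, then serve from the cache."""
        if url not in self.store:
            self.store[url] = self.origin[url]
            self.origin_fetches += 1
        return self.store[url]


origin = {"/logo.gif": b"GIF89a", "/intro.rm": b"media-bytes"}
node = DeliveryNode(origin)
node.push("/intro.rm", origin["/intro.rm"])  # pre-positioned multimedia object
node.get("/logo.gif")                        # first request pulls from origin
node.get("/logo.gif")                        # second request is a cache hit
print(node.origin_fetches)
```

Only the first request for the static image reaches the origin; the pre-positioned object never does.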
6. Distribution Mechanisms
• HTTP request for pulling
– Example: standard HTTP reverse proxy
• FTP of tar files
– Some equipment vendors use this technique
• Rate limited tree-form replication
– Example: Cisco’s “Soda” algorithm
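A rough sketch of rate-limited tree-form replication in general. This is not Cisco's "Soda" algorithm, whose details are not given here; the tree layout, the per-link rate cap, and the store-and-forward timing are all assumptions made for illustration.

```python
def build_tree(nodes, fanout):
    """Arrange nodes into a tree: nodes[0] is the root, each parent feeds `fanout` children."""
    children = {n: [] for n in nodes}
    for i, n in enumerate(nodes[1:], start=1):
        parent = nodes[(i - 1) // fanout]
        children[parent].append(n)
    return children


def arrival_times(children, root, size_mb, rate_mbps):
    """Assume store-and-forward: each hop takes size/rate seconds at the capped link rate."""
    hop_seconds = size_mb * 8 / rate_mbps
    times = {root: 0.0}
    queue = [root]
    while queue:
        parent = queue.pop(0)
        for child in children[parent]:
            times[child] = times[parent] + hop_seconds
            queue.append(child)
    return times


nodes = ["origin", "a", "b", "c", "d", "e", "f"]
tree = build_tree(nodes, fanout=2)
times = arrival_times(tree, "origin", size_mb=100, rate_mbps=10)
print(tree["origin"], times["a"], times["f"])
```

The origin only ever sends to its direct children, so its outbound load stays bounded as the number of delivery devices grows.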
7. Distribution Mechanisms using Multicast
• Application-level reliable multicast
– Example: Inktomi’s Fast-Forward
• Unreliable IP multicast with file-level error
correction
– Example: Digital Fountain, multicast-ftp
• Unreliable IP multicast
– Example: RealNetworks
8. Content Consistency Mechanisms
• Expiration times or TTL
• Renaming in the HTML file
• Web Cache Invalidation Protocol (WCIP)
– Nodes receive invalidations when objects
change
– Objects are organized into channels
– Nodes subscribe to a channel to receive
invalidations
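The TTL and channel-based invalidation ideas above can be sketched together. The `CacheNode` and `Channel` classes are hypothetical and do not reproduce the WCIP message format; they only show expiry on lookup and invalidation fan-out to subscribers.

```python
import time


class CacheNode:
    """Hypothetical delivery node that honors TTLs and invalidations."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # url -> (body, expires_at)

    def store(self, url, body, ttl, now=None):
        now = time.time() if now is None else now
        self.objects[url] = (body, now + ttl)

    def lookup(self, url, now=None):
        """Return the cached body, or None if it is missing or expired."""
        now = time.time() if now is None else now
        entry = self.objects.get(url)
        if entry is None or entry[1] <= now:
            self.objects.pop(url, None)
            return None
        return entry[0]

    def invalidate(self, url):
        self.objects.pop(url, None)


class Channel:
    """Hypothetical invalidation channel grouping related objects."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, node):
        self.subscribers.append(node)

    def publish_change(self, url):
        # Every subscribed node drops its copy when the object changes.
        for node in self.subscribers:
            node.invalidate(url)


node = CacheNode("edge-1")
channel = Channel()
channel.subscribe(node)
node.store("/a.html", b"v1", ttl=60, now=0)
node.store("/b.html", b"v1", ttl=3600, now=0)
fresh = node.lookup("/a.html", now=30)     # still within its TTL
expired = node.lookup("/a.html", now=61)   # TTL has passed
channel.publish_change("/b.html")          # origin reports a change
invalidated = node.lookup("/b.html", now=1)
```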
9. Request Routing
• Goal: steer the client such that it fetches the
content from a close node
• Methods
– DNS selection
– HTTP redirection
– Transparent interception
10. Overview of Request Arrival Process
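How a request for www.xyz.com/index.html arrives at 1.2.3.4:
[Figure: client, DNS server, root NS, xyz.com NS (IP: 1.2.3.1), and server (IP: 1.2.3.4) behind a router and switch]
1. Client to DNS server: what is the IP addr of www.xyz.com?
2. DNS server to root NS: where is the name server of xyz.com?
3. Root NS to DNS server: NS record: 1.2.3.1
4. DNS server to xyz.com NS: what is the IP of www.xyz.com?
5. xyz.com NS to DNS server: A record: 1.2.3.4
6. DNS server to client: 1.2.3.4
7. Client to server: GET /index.html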
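The lookup part of this request arrival process can be sketched as a toy resolver. It follows the slide's simplified picture, in which the root NS directly knows xyz.com's name server; the two tables and the `resolve` function are illustrative assumptions, not a real DNS client.

```python
ROOT_NS = {"xyz.com": "1.2.3.1"}                          # NS records held by the root
AUTHORITATIVE = {"1.2.3.1": {"www.xyz.com": "1.2.3.4"}}   # A records per name server


def resolve(hostname):
    """Return (ip, trace) for the lookup steps between the DNS server and the name servers."""
    trace = []
    domain = ".".join(hostname.split(".")[-2:])
    trace.append(f"ask root NS for name server of {domain}")
    ns_ip = ROOT_NS[domain]
    trace.append(f"NS record: {ns_ip}")
    trace.append(f"ask {ns_ip} for IP of {hostname}")
    ip = AUTHORITATIVE[ns_ip][hostname]
    trace.append(f"A record: {ip}")
    return ip, trace


ip, trace = resolve("www.xyz.com")
for step in trace:
    print(step)
print(f"client sends GET /index.html to {ip}")
```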
11. DNS selection
• Basic idea: xyz.com’s NS returns a node close to
the client
• How to become xyz.com’s NS?
– Rewrite URLs (e.g., Akamai’s Akamaizer)
– Take a subdomain cdn.xyz.com and put all content
there
• Accuracy limited to client’s name server
– Only suitable for ISP or overlay networks
– Not suitable for some enterprise or cable networks
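The URL-rewriting step can be sketched as below: embedded object URLs are pointed at a CDN-controlled subdomain so that the CDN's name server answers lookups for them. The `rewrite_urls` helper and the choice to rewrite only `src` attributes are assumptions for illustration, not the behavior of any specific tool.

```python
import re


def rewrite_urls(html, origin_host, cdn_host):
    """Point embedded src attributes at the CDN host; leave page links on the origin."""
    pattern = re.compile(r'(src=")http://' + re.escape(origin_host) + r"/")
    return pattern.sub(lambda m: m.group(1) + "http://" + cdn_host + "/", html)


page = (
    '<img src="http://www.xyz.com/logo.gif">'
    '<a href="http://www.xyz.com/index.html">home</a>'
)
rewritten = rewrite_urls(page, "www.xyz.com", "cdn.xyz.com")
print(rewritten)
```

Because only embedded objects are rewritten, the HTML page itself is still served by the origin while images resolve through cdn.xyz.com.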
12. HTTP Redirection
• Basic idea: web server tells client to go
somewhere else
– Returns “302 redirect … 1.2.4.5/index.html…”
• Mostly used for multimedia objects
– These objects are usually listed in an
index file (.smil or .asx), and clients fetch the
index file via HTTP before streaming
• Accuracy is at individual client level
– More suitable for enterprise and cable networks
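A minimal sketch of redirection at individual-client granularity. The subnet table, the node addresses other than those on the slides, and the `redirect_response` helper are hypothetical; the status line uses the standard HTTP/1.1 "302 Found" form.

```python
import ipaddress

# Hypothetical mapping from client subnets to delivery nodes.
NODES = {
    ipaddress.ip_network("10.1.0.0/16"): "1.2.4.5",
    ipaddress.ip_network("10.2.0.0/16"): "1.2.5.5",
}
DEFAULT_NODE = "1.2.3.4"  # fall back to the origin server


def redirect_response(client_ip, path):
    """Build a 302 response that sends this client to its assigned node."""
    addr = ipaddress.ip_address(client_ip)
    target = DEFAULT_NODE
    for network, node in NODES.items():
        if addr in network:
            target = node
            break
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: http://{target}{path}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )


response = redirect_response("10.1.7.9", "/index.html")
print(response)
```

Unlike DNS selection, the web server sees the client's own address here, which is why the decision can be made per client.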
13. Transparent Interception
• Routers and switches along the request path
can send the request elsewhere
• Mostly used for distributed data centers
front-ended with L7 switches
– Example: Cisco’s CSS11k WebNS
14. Algorithms for Request Routing
• Map-based
– Create a map of the Internet based on AS
domains, pick the node with the shortest hop
count to client
– Or, set up coverage zones mapping a node to a
collection of subnets
• Racing-based
– Let the delivery nodes all race to the client with
A-records
– Winner is selected by client automatically
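The map-based approach can be sketched as a lookup over a precomputed hop-count table. The AS numbers, node names, and hop counts below are made up for illustration; building the actual map of the Internet is the hard part and is not shown.

```python
# Hypothetical AS-level hop counts from each delivery node to client ASes.
HOPS = {
    "node-east": {"AS100": 1, "AS200": 3, "AS300": 4},
    "node-west": {"AS100": 4, "AS200": 2, "AS300": 1},
}


def pick_node(client_as, hops=HOPS):
    """Return the node with the fewest AS hops to the client's AS, or None if unmapped."""
    candidates = {node: table[client_as] for node, table in hops.items() if client_as in table}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


print(pick_node("AS100"), pick_node("AS300"), pick_node("AS999"))
```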
15. The Boomerang Algorithm
• Cisco’s research published in WCW’01
– xyz.com’s NS server forwards lookup of
www.xyz.com to all delivery nodes
– Each delivery node sends an “A record” response
with its own IP address to the client
– The one that reaches the client first wins
– NS server times the forwarding so that lookup
message arrives at all nodes around the same
time
– Use “simulated annealing” for scalability
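The timing step can be sketched as follows: the NS delays each forward so that the lookup reaches every node at the same instant, after which the race is decided purely by node-to-client latency. The latency figures and function names are assumptions, and the simulated-annealing step is not modeled.

```python
def forwarding_delays(ns_to_node_ms):
    """Delay each forward so the lookup arrives at every node at the same time."""
    slowest = max(ns_to_node_ms.values())
    return {node: slowest - latency for node, latency in ns_to_node_ms.items()}


def race_winner(node_to_client_ms):
    """With simultaneous starts, the node with the lowest latency to the client wins."""
    return min(node_to_client_ms, key=node_to_client_ms.get)


ns_to_node = {"a": 10, "b": 40, "c": 25}            # hypothetical one-way latencies (ms)
delays = forwarding_delays(ns_to_node)
winner = race_winner({"a": 80, "b": 20, "c": 50})   # hypothetical node-to-client latencies
print(delays, winner)
```

Node "b" is farthest from the NS, so its forward goes out first with no delay, yet it still wins because it is closest to the client.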
16. Interaction between Content Distribution and Request Routing
• Don’t route request to a node that doesn’t
have the content!
• Particularly important for large streaming
content
– Such content is usually pre-positioned to
ensure high-bandwidth playback
• Nodes need to report their content acquisition
status to the “request router”
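A minimal sketch of this interaction, assuming a hypothetical `RequestRouter` that records which objects each node has finished acquiring and only routes to nodes that hold the requested object.

```python
class RequestRouter:
    """Hypothetical request router that tracks content acquisition status."""

    def __init__(self):
        self.holdings = {}  # node -> set of URLs fully acquired

    def report(self, node, url):
        """Called by a delivery node once it has acquired an object."""
        self.holdings.setdefault(node, set()).add(url)

    def route(self, url, nodes_by_proximity):
        """Return the closest node that already holds the content, else None."""
        for node in nodes_by_proximity:
            if url in self.holdings.get(node, set()):
                return node
        return None


router = RequestRouter()
router.report("far-node", "/keynote.rm")
closest_first = ["near-node", "far-node"]
chosen = router.route("/keynote.rm", closest_first)   # near-node lacks the file
router.report("near-node", "/keynote.rm")
chosen_later = router.route("/keynote.rm", closest_first)
print(chosen, chosen_later)
```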
17. Content Delivery
• Goal: serve content to each client at desired
quality of service
• Supported protocols
– HTTP
– Microsoft MMS
– Open standard RTP/RTSP
– RealNetworks RTP/RTSP
• Usually part of the larger CDN system
18. Content Access Control
• Content object attributes
– “Publication date” and “Expiration date”
– ACL based on user/group/IP
• User authentication
– HTTP basic
– Microsoft NTLM for enterprise environment
– other schemes
• Media Rights Management
19. QoS of Content Delivery
• Server QoS
– Server needs to make sure it has enough CPU and
disk to service the stream at specified bit rate
• Network QoS
– Interoperate with routers via DiffServ bits
• Coordination with request router
– Delivery devices should communicate load
information to the “Request Router”
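Server QoS and load reporting can be sketched as a simple admission check. The single `capacity_kbps` figure stands in for CPU and disk limits, and the `StreamServer` class is a hypothetical illustration rather than any product's mechanism.

```python
class StreamServer:
    """Hypothetical streaming server with bit-rate-based admission control."""

    def __init__(self, capacity_kbps):
        self.capacity_kbps = capacity_kbps
        self.committed_kbps = 0

    def admit(self, bitrate_kbps):
        """Accept a stream only if its bit rate still fits within capacity."""
        if self.committed_kbps + bitrate_kbps > self.capacity_kbps:
            return False
        self.committed_kbps += bitrate_kbps
        return True

    def load(self):
        """Fraction of capacity in use; this is what the node would report to the request router."""
        return self.committed_kbps / self.capacity_kbps


server = StreamServer(capacity_kbps=1000)
results = [server.admit(300), server.admit(600), server.admit(300)]
print(results, server.load())
```

The third stream is rejected rather than degrading the two already admitted, and the reported load lets the request router steer new clients elsewhere.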
20. Resource Accounting
• Mining the log files
– Log file aggregation: all devices send log
files to a central location
– Local mining: analyzing the log file at each
delivery device
• Real-time statistics
– Real-time statistics on throughput/latency based
on domain, content type or any HTTP header
– Example: Cisco CSS switch billing MIB
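Log mining for billing can be sketched as a per-domain byte count. The log line layout (domain, path, status, bytes, latency in seconds) is an assumed format for illustration, not that of any particular device.

```python
from collections import defaultdict

# Assumed log format: domain path status bytes latency_seconds
LOG = """\
www.xyz.com /index.html 200 5120 0.040
www.xyz.com /logo.gif 200 2048 0.010
cdn.abc.com /intro.asx 200 1024 0.020
"""


def bytes_by_domain(log_text):
    """Sum the bytes served per domain across all log lines."""
    totals = defaultdict(int)
    for line in log_text.splitlines():
        if not line.strip():
            continue
        domain, _path, _status, size, _latency = line.split()
        totals[domain] += int(size)
    return dict(totals)


usage = bytes_by_domain(LOG)
print(usage)
```

The same function works for local mining on one device or on aggregated logs at a central location.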
22. Summary
• Main components of building a CDN:
– Content distribution
– Request routing
– Content Delivery
– Resource accounting
• A CDN system requires the four components
to work in concert with each other!
• Cisco is the only vendor that provides the full
solution!
Editor's Notes
Before we talk about how each technology component works and how the components should work together, we need to understand that there are many kinds of CDNs, and each kind requires a different mix of technologies. We list the categories here and explain their implications for the technology components.
One can attempt to map CDN service providers onto the above categories. Akamai and Digital Island so far have content providers as customers, use overlay networks, and mostly focus on static images, though they are trying to branch out to other media. RealNetworks is trying to build a CDN for content providers that also uses an overlay network to some extent and focuses on multimedia content. I know of a number of ISPs that focus on enterprise customers, use enterprise premise networks, and focus on multimedia content.
Of course, most companies follow the money and cross over in terms of customers and content types.
The reason for categorizing CDNs along these axes, however, is that different kinds of CDNs require the technologies to coordinate in different ways.
Content distribution ranges from simple mechanisms, like FTP, to complicated ones, like rate-limited multicast.
Request routing is a research problem. Most deployments use approximations only. There are two papers on this topic in WCW.
Content delivery is done by traditional web caches, used in reverse proxy mode. That is why caching and CDNs are quite tightly related.
Resource accounting is easier to do if it doesn’t have to be done in real time at high throughput; mining the log files would do. However, if information is needed in real time in a high-throughput environment, then it is harder.
Picture here showing the client request process
Before we explain the details, let’s take a look at the process of how a request arrives at a server.
Server QoS involves guaranteeing the bit rate of the delivery. This typically involves appropriate load control,