
Implementing a Caching Scheme for Media Streaming in a Proxy Server




Abdelrahman H. Ibrahim, Huda M. Aldosari, Raed A. Alotaibi
CSE 5300 – Advanced Computer Networks – Spring 2015 Final Report
University of Connecticut, Storrs, CT, USA
[Emails: abdelrahman@engr.uconn.edu, huda.aldosari@uconn.edu, raed.alotaibi@uconn.edu]

Abstract

In the past few years, websites have moved from static web pages to rich media applications that rely heavily on audio, images, and video in their interaction with users. This shift has dramatically changed the makeup of network traffic. Organizations spend considerable effort, time, and money to improve response time and to design intermediary systems that enhance the overall user experience. Media represents about 69.9-88.8% of all traffic, so adapting networks to carry this load is a major trend. Content Distribution Networks (CDNs) are now widely deployed for faster delivery of media, and redundancy and caching are also used to decrease response time.

In this project, we implement a caching scheme for media streaming in a proxy server. Unlike CDNs, which require substantial infrastructure, our caching proxy server is a simple, portable piece of software that can be deployed at small or large scale: in a university network, a company's private network, or on ISP servers. This caching scheme, tailored specifically for media streaming, reduces traffic and improves overall network efficiency.

Index Terms – Proxy servers, Caching, Media streaming

I. Introduction

As World Wide Web traffic increases dramatically, more research is devoted to optimizing network efficiency. Internet usage has grown in the past few years as it has become an integral part of everyday life: people use social networks, news websites, messaging systems, online collaboration tools, and media applications, and the number of tools keeps expanding. Media (audio and video) represents more than 69% of all traffic worldwide [1]. Since we are concerned with improving network efficiency for media streaming, this report concentrates on the protocols used to deliver streaming media and on our cache design for reducing response time.

First, a quick overview of how streaming happens. Streaming means sending data in a way that allows it to start being processed before it is completely received [2]. We can divide streaming into two main schemes: progressive streaming and true streaming. In progressive streaming, the client receives an ordinary file and starts to process it before it is completely downloaded; this requires no special protocol, but it does require a file format that can be processed from partial content. True streaming, on the other hand, uses a streaming protocol to control the transfer. In this project we are concerned mainly with these streaming protocols.

Several streaming protocols were created to address the growing demand for media. RTSP, MMS, and RTMP were used extensively for media delivery, but over time HTTP-based streaming came to dominate, for several reasons: firewalls may block RTMP and other special-purpose streaming protocols, and those protocols cannot leverage standard HTTP caching. Together with other challenges facing dedicated streaming protocols, this has made HTTP-based streaming the most widely used approach today [3].
As mentioned above, our project builds caching software that acts as a proxy server specially optimized for streaming media. This placed our work in the context of existing caching solutions, from which we developed our own technique customized for streamed media data.

II. Related Work

Streaming can be broadly divided into on-demand and real-time categories. On-demand streaming delivers stored media (audio and video) to clients as they request it; stored videos on YouTube are one example. A large body of research aims to reduce response time and increase network utilization for this type of streaming. Content Distribution Networks (CDNs) store copies of the same video at different encodings and bit rates, and when a user requests a video, the server redirects the request to the content server best placed to serve it [4]. Clients and servers implement this through Dynamic Adaptive Streaming over HTTP (DASH). DASH is not a new protocol; rather, it uses HTTP for the communication between client and server, embedding attributes that let the client choose which version of the media to stream according to the state of the network [5]. Further research addresses network problems such as data loss and delay; one implementation provides scalable on-demand streaming with packet loss recovery [6].

Real-time streaming, on the other hand, requires the server to determine what to send, while the client plays content back as it is received, with a slight, consistent delay. YouTube Live Streaming and radio podcast channels fall into this category. A great deal of existing research here aims to push delivery latency as close as possible to its physical limit. The difference between this category and the previous one is that the server produces live data that must be delivered in real time; if a piece of data cannot be delivered in time, the server skips it and transmits subsequent chunks. One line of research uses multiple paths to deliver the chunks of data [7].

Alongside these efforts, another approach to increasing network efficiency is web caching, commonly realized as proxy servers, which keep local copies of web content on an intermediate server. It is very widely used inside organizations and work environments, and some research adapts proxy caching to soft real-time applications [8]. In our project, we follow a caching strategy for real-time delivery of media, customized mainly for live streaming data.
III. HTTP Live Streaming Architecture

HTTP Live Streaming (HLS) allows providers to send audio and video, whether live or prerecorded. HLS consists of three components [9]:

Figure 1 - HLS Architecture

1) Server: takes input streams of media and encodes them. A stream segmenter inside the server then divides the encoded media into chunks of equal duration and indexes them with an index file.
2) Distribution channel (Content Distribution Network): primarily a set of servers holding redundant copies of the same media, segmented and indexed, distributed across several locations. These servers respond to HTTP requests for the index and segment files.
3) Client: requests the index file from a distribution server, processes it, and then requests and plays the segment files.

From this architecture we can see that playing streaming media is not only the server's responsibility to send files in a specific scheme; it is equally the client's responsibility to send requests, process index files, and play the received segments. It is an integrated architecture in which each component plays an essential role. In the next section, we analyze the packets sent and received by a client requesting a specific media file.

IV. Analysis of Apple HTTP Live Streaming Protocol

To develop an efficient caching scheme, we analyzed the packets sent and received by a client using the Wireshark software. We investigated Apple's implementation of HLS; other implementations follow the same architecture with minor differences in tags and/or file formats. Through this investigation, we also aimed to measure the signaling overhead between a client and a server, which, as we show shortly, becomes the signaling overhead between our new cache server and the original content server.

Figure 2 shows the communication outline between a client and a server (without a cache). The client first sends an HTTP request for the index file (.m3u8). The server replies with a list of available encodings for the requested media; by encoding, we mean a particular bit rate of the media, and hence a particular quality. The higher the quality, the larger the segment files (.ts). Based on the network speed, the client should choose an encoding it can handle in real time, i.e., one it can play without pausing to wait for further segments to arrive. Once the client chooses an encoding, the server replies with a list of segment files for that encoding and their locations (URLs) on the server. Finally, the client iterates through this list, requesting each segment file in a separate HTTP request.

Figure 2 - Communication between a client and an HTTP streaming server

A packet trace for this communication is included in Appendix A, and the complete packet trace is available in the references [10]. It was captured from a live Apple HLS test service [11]. The trace shows a large amount of overhead in sending and receiving requests (as opposed to actual media content); this extensive message interaction will be shown to affect the overall latency of media delivery.
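To make this message flow concrete, the following minimal Python sketch (standard library only) walks the same three steps a client performs: fetch the master index, pick an encoding, then request each segment. The URL is the Apple test stream from our packet trace (it may no longer be online), and the playlist parsing is deliberately naive rather than a complete M3U8 parser.

# Minimal sketch of the client-side HLS message flow analyzed above:
# (1) fetch the master index (.m3u8), (2) pick an encoding (variant),
# (3) fetch that variant's segment list and request each .ts segment.
from urllib.parse import urljoin
from urllib.request import urlopen

MASTER_URL = "http://devimages.apple.com/iphone/samples/bipbop/bipbopall.m3u8"

def fetch(url):
    with urlopen(url) as resp:          # one HTTP request/response pair
        return resp.read().decode("utf-8", errors="replace")

# Step 1: the master playlist lists one URI per available encoding.
master = fetch(MASTER_URL)
variants = [ln for ln in master.splitlines()
            if ln and not ln.startswith("#")]

# Step 2: a real client picks a variant based on measured bandwidth;
# here we simply take the first one (the lowest bit rate, gear1).
variant_url = urljoin(MASTER_URL, variants[0])

# Step 3: the variant playlist lists the segment files in play order.
playlist = fetch(variant_url)
segments = [ln for ln in playlist.splitlines()
            if ln and not ln.startswith("#")]

# Each segment is a separate HTTP request; ~180 request/response pairs
# for the traced stream. We fetch only a few here to keep the demo short.
for name in segments[:3]:
    data = urlopen(urljoin(variant_url, name)).read()
    print(name, len(data), "bytes")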
V. Caching Scheme

From this investigation of the communication between the client and the streaming server, we arrived at a modified idea of caching in a proxy server. Traditional caching proxy servers do not make a local copy of a file until a client requests it; the benefit of traditional caching is felt only by the next client to request the same file. In our modified caching scheme, we pre-fetch the segment files from the content server as soon as a client requests the index file of the media (.m3u8 in our case). These pre-fetched files are also cached in our caching server for future users.

Figure 3 - Our new modified caching scheme

Figure 3 shows a graphical representation of the new communication between the client, the cache, and the content server. As soon as the cache server gets the list of available encodings from the content server, it begins pre-fetching the segment files and making local copies of them, while simultaneously forwarding the list of encodings to the client so that the client can choose one. By the time the client requests a specific encoding, the cache has already fetched some of the files and is in the process of fetching the rest of the list. The cache plays the role of a content server to the client and the role of a client to the original content server. A sketch of this pre-fetch logic follows below.
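The sketch below shows the core pre-fetch rule in minimal Python, not our full proxy implementation: the moment a request for an index file passes through, every listed URI is fetched in the background so that later requests are served locally. The helper names (forward_to_origin, segment_names, handle_client_request) and the in-memory dict cache are illustrative assumptions; a production cache would persist to disk and synchronize access across threads.

# Sketch of the pre-fetch rule at the heart of our caching scheme.
import threading
from urllib.parse import urljoin
from urllib.request import urlopen

cache = {}  # URL -> response body; illustrative, not thread-safe

def forward_to_origin(url):
    """Fetch from the origin content server and keep a local copy."""
    if url not in cache:
        cache[url] = urlopen(url).read()
    return cache[url]

def segment_names(playlist_text):
    """Non-comment lines of an .m3u8 playlist are the media URIs."""
    return [ln for ln in playlist_text.splitlines()
            if ln and not ln.startswith("#")]

def prefetch(url):
    body = forward_to_origin(url)
    if url.endswith(".m3u8"):
        # An index may list variant playlists or segments; recursing means
        # a master playlist pulls in every encoding's segments as well,
        # which is why a second client asking for a different encoding
        # is still served from the cache.
        for name in segment_names(body.decode("utf-8", "replace")):
            threading.Thread(target=prefetch,
                             args=(urljoin(url, name),),
                             daemon=True).start()

def handle_client_request(url):
    prefetch(url)       # starts background fetches as a side effect
    return cache[url]   # forwarded to the client immediately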
VI. Test Results on GENI

We implemented our caching server in Python. Our test bed was based on the GENI infrastructure; Figure 4 shows the architecture we tested on.

Figure 4 - GENI test bed for our new caching scheme

The blue aggregate contains server A, our original content server. The orange aggregate contains the caching server B. The green aggregate holds three clients for testing. To make the results interpretable, we introduced an artificial delay of 500 milliseconds on the interface of the content server (server A). With this architecture, we expect a large delay the first time a client requests a specific media item (as it is fetched from server A), and a much smaller latency for subsequent requests of the same media by the same client or other clients.

Below is the log we got for the first client request. We skipped measuring the transfer time of the individual segments (.ts files) so that the comparison covers only the signaling latency between client and server.

Figure 5 - First request by a client, satisfied in about 3 seconds

The delay introduced at the content server clearly dominated the overall latency: the full set of requests was satisfied in approximately 3 seconds. The next time we ran a client asking for the same media, we saw a great improvement in signaling latency; the figure below shows the log that appeared on the client side.

Figure 6 - Second request for the same media, satisfied in about 13 milliseconds

An improvement from 3 seconds to 0.01365 seconds is a factor of nearly 220. We attribute this large improvement to the fact that the client's requests no longer travel to server A, which incurs the delay. Note that the list of sequence files contained around 180 segments, i.e., 180 request/response pairs that each faced the delay in the first run; the same 180 request/response pairs face no such delay when satisfied from the cache. A further advantage: if a second client requests a different encoding of the same media (for example, because it has a faster link), that request is also satisfied from the cache even though no client requested that encoding before.

A note on the implementation: as GENI nodes run a Linux operating system distribution, we were not able to deploy a ready-made streaming server implementing Apple HLS. To work around this, we simulated a streaming server on server A with our own Python script, which opens a TCP socket on port 80 and acts as an HTTP server. The script replies to requests in the same format we observed in the Wireshark packet trace. It is built from scratch, although it is not a fully functioning HTTP server. Indeed, a streaming server in this setting is nothing more than an HTTP server that understands the tags of the adopted streaming protocol, and that is exactly what we implemented (see the sketch below).
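For illustration, the hand-rolled responder we ran on server A had roughly the shape of the following minimal sketch. The RESPONSES table (here holding one abbreviated playlist) is a hypothetical stand-in for the canned replies we took from the Wireshark trace, and binding port 80 requires root privileges on most systems.

# Minimal sketch of a socket-level HTTP responder with canned replies.
import socket

RESPONSES = {
    "/bipbopall.m3u8": ("audio/x-mpegurl",
                        b"#EXTM3U\n#EXT-X-STREAM-INF:PROGRAM-ID=1,"
                        b"BANDWIDTH=200000\ngear1/prog_index.m3u8\n"),
}

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("", 80))   # port 80 so clients see an ordinary HTTP server
srv.listen(5)

while True:
    conn, _addr = srv.accept()
    request = conn.recv(4096).decode("latin-1")
    # The request line is "GET <path> HTTP/1.1"; take the path.
    path = request.split(" ")[1] if " " in request else "/"
    ctype, body = RESPONSES.get(path, ("text/plain", b"not found"))
    status = "200 OK" if path in RESPONSES else "404 Not Found"
    conn.sendall((
        f"HTTP/1.1 {status}\r\n"
        f"Content-Type: {ctype}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("latin-1") + body)
    conn.close()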
VII. Conclusion

We have implemented a modified caching scheme for proxy servers. Instead of waiting for a file to be requested, our scheme pre-fetches the files a client is about to request. Media streaming suits this scheme well because, from the client's very first request for a specific media file, we can easily predict which files it will request next. We have shown that the improvement in latency is very large (neglecting, to simplify the calculation, the processing time at the caching server). Our implementation was as simple as deploying a single Python file to a node that acts as a proxy, and it works. In addition, the scheme increases network efficiency and makes better use of resources.

However, we have noted some limitations of the proposed caching scheme. First, we assumed that the caching server has very large disk space, enough to hold all the segment files from different content servers. In practice, a background job would need to replace the oldest cached items to free up space for newer requests, as sketched below. Moreover, the scheme assumes the bandwidth between the cache and the content server is sufficient to carry all the pre-fetch requests; this becomes a limitation if the link between the cache and the content server is slow or noisy enough to delay responses significantly.
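One plausible shape for such a background replacement job, which we did not implement, is sketched below: when the cache directory grows past a limit, delete the least recently accessed segment files first. CACHE_DIR and the size limit are illustrative assumptions, not values from our experiments.

# Sketch of a periodic LRU-style eviction job for the cached segments.
import os
import time

CACHE_DIR = "/var/cache/hls-proxy"   # hypothetical cache location
MAX_BYTES = 50 * 2**30               # e.g. keep at most 50 GiB

def evict_oldest():
    files = [os.path.join(CACHE_DIR, f) for f in os.listdir(CACHE_DIR)]
    # Oldest access time first (least recently used).
    files.sort(key=os.path.getatime)
    total = sum(os.path.getsize(p) for p in files)
    for path in files:
        if total <= MAX_BYTES:
            break
        total -= os.path.getsize(path)
        os.remove(path)

while True:                          # run as a background job
    evict_oldest()
    time.sleep(600)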
VIII. Our GENI Experience

In this section, we share our experience using GENI, as it is a new test bed for network innovation. Here is what we found.

Advantages:
- The fact that GENI is a real test bed reassured us that our results would resemble those obtained in practice; it is not a simulation of devices communicating with each other.
- The GENI website has many resources and experiments to learn from. The learning curve may be steep at the beginning, but with the help of the available tutorials one can get used to the environment.
- The GENI community is helpful. We contacted GENI help and they replied within two days, posting our question on the GENI community website; afterwards, someone sent us a further answer by email. You get help from everyone working on the platform.

We found some difficulties, though:
- The GENI Desktop tool did not work for us, even after following the instructions and walking through the tutorials; it is probably not yet well tested.
- GENI nodes can be accessed only via SSH, which may frustrate less experienced users.
- It loads very slowly for large topologies, especially topologies that combine different aggregates.
- Clients cannot be accessed through a graphical interface, nor can clients run operating systems other than Linux (for example, you cannot run Apple or Windows clients).

IX. Appendix A

A Client Request:

GET /iphone/samples/bipbop/bipbopall.m3u8 HTTP/1.1
Host: devimages.apple.com
Cookie: dssf=1; dssid2=42311671-3f0f-4e70-a37d-30cef99958f1; POD=us~en; pxro=1; rtsid=R613; s_fid=54DD13491B331546-256A4913BA2BF143; s_invisit_n2_us=19; s_pathLength=developer%3D1%2C; s_ppv=http%2520live%2520streaming%2520examples%2520-%2520basic%2520stream%2C100%2C100%2C601%2C; s_vi=[CS]v1|29BD76ED0501285C-60000112200694E4[CE]; s_vnum_n2_us=4%7C2%2C0%7C6%2C3%7C4%2C20%7C1%2C19%7C7%2C17%7C1; xp_ci=3z44YEA7z9Xlz5JdzBYDzXRQLWQvI
X-Playback-Session-Id: 1D47DB99-D831-49E6-84A6-D3EB07125AFD
Accept: */*
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.5.17 (KHTML, like Gecko) Version/7.1.5 Safari/537.85.14
Referer: http://devimages.apple.com/iphone/samples/bipbop/bipbopall.m3u8
Accept-Encoding: gzip
Connection: keep-alive

Server Response:

HTTP/1.1 200 OK
Server: Apache
ETag: "44f07dfa5af5d23a5b2b7566cdde9a44:1239907290"
Last-Modified: Thu, 16 Apr 2009 18:41:30 GMT
Accept-Ranges: bytes
Content-Length: 292
Content-Type: audio/x-mpegurl
Date: Tue, 21 Apr 2015 17:57:17 GMT
Connection: keep-alive

#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=200000
gear1/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=311111
gear2/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=484444
gear3/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=737777
gear4/prog_index.m3u8

Client Request:

GET /iphone/samples/bipbop/gear1/prog_index.m3u8 HTTP/1.1
Host: devimages.apple.com
Cookie: dssf=1; dssid2=42311671-3f0f-4e70-a37d-30cef99958f1; POD=us~en; pxro=1; rtsid=R613; s_fid=54DD13491B331546-256A4913BA2BF143; s_invisit_n2_us=19; s_pathLength=developer%3D1%2C; s_ppv=http%2520live%2520streaming%2520examples%2520-%2520basic%2520stream%2C100%2C100%2C601%2C; s_vi=[CS]v1|29BD76ED0501285C-60000112200694E4[CE]; s_vnum_n2_us=4%7C2%2C0%7C6%2C3%7C4%2C20%7C1%2C19%7C7%2C17%7C1; xp_ci=3z44YEA7z9Xlz5JdzBYDzXRQLWQvI
X-Playback-Session-Id: 1D47DB99-D831-49E6-84A6-D3EB07125AFD
Accept: */*
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.5.17 (KHTML, like Gecko) Version/7.1.5 Safari/537.85.14
Referer: http://devimages.apple.com/iphone/samples/bipbop/bipbopall.m3u8
Accept-Encoding: gzip
Connection: keep-alive

Server Response:

HTTP/1.1 200 OK
Server: Apache
ETag: "50117c8233644c19b5ab49551b72507f:1239907352"
Last-Modified: Thu, 16 Apr 2009 18:42:32 GMT
Accept-Ranges: bytes
Content-Length: 7019
Content-Type: audio/x-mpegurl
Date: Tue, 21 Apr 2015 17:57:17 GMT
Connection: keep-alive

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10, no desc
fileSequence0.ts
#EXTINF:10, no desc
fileSequence1.ts
#EXTINF:10, no desc
fileSequence2.ts
#EXTINF:10, no desc
fileSequence3.ts
:
: (removed from here for readability)
:
#EXTINF:10, no desc
fileSequence177.ts
#EXTINF:10, no desc
fileSequence178.ts
#EXTINF:10, no desc
fileSequence179.ts
#EXTINF:1, no desc
fileSequence180.ts
#EXT-X-ENDLIST

Client Request:

GET /iphone/samples/bipbop/gear1/fileSequence0.ts HTTP/1.1
Host: devimages.apple.com
Cookie: dssf=1; dssid2=42311671-3f0f-4e70-a37d-30cef99958f1; POD=us~en; pxro=1; rtsid=R613; s_fid=54DD13491B331546-256A4913BA2BF143; s_invisit_n2_us=19; s_pathLength=developer%3D1%2C; s_ppv=http%2520live%2520streaming%2520examples%2520-%2520basic%2520stream%2C100%2C100%2C601%2C; s_vi=[CS]v1|29BD76ED0501285C-60000112200694E4[CE]; s_vnum_n2_us=4%7C2%2C0%7C6%2C3%7C4%2C20%7C1%2C19%7C7%2C17%7C1; xp_ci=3z44YEA7z9Xlz5JdzBYDzXRQLWQvI
X-Playback-Session-Id: 1D47DB99-D831-49E6-84A6-D3EB07125AFD
Accept: */*
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/600.5.17 (KHTML, like Gecko) Version/7.1.5 Safari/537.85.14
Referer: http://devimages.apple.com/iphone/samples/bipbop/bipbopall.m3u8
Accept-Encoding: identity
Connection: keep-alive

Server Response:

HTTP/1.1 200 OK
Server: Apache
ETag: "90e466ad7f45ea3bd52732124f1fa675:1239907291"
Last-Modified: Thu, 16 Apr 2009 18:41:31 GMT
Accept-Ranges: bytes
Content-Length: 250228
Content-Type: video/mp2t
Date: Tue, 21 Apr 2015 17:57:17 GMT
Connection: keep-alive

(binary MPEG-TS payload; rest of media removed for readability)

The requests and responses for subsequent files follow the same pattern. For the complete packet trace, see reference [10].

X. References

[1] Ihm, Sunghwan, and Vivek S. Pai. "Towards Understanding Modern Web Traffic." ACM SIGMETRICS Performance Evaluation Review 39.1 (2011): 335. Web.
[2] http://www.garymcgath.com/streamingprotocols.html
[3] http://www.streamingmedia.com/Articles/Editorial/What-Is-.../What-Is-a-Streaming-Media-Protocol-84496.aspx
[4] Peng, Gang. "CDN: Content Distribution Network." Stony Brook, NY, 1 Feb. 2008. Print.
[5] Stockhammer, Thomas. "Dynamic Adaptive Streaming over HTTP – Design Principles and Standards." Second W3C Web and TV Workshop (2011). Web. <http://www.w3.org/2010/11/web-and-tv/papers/webtv2_submission_64.pdf>.
[6] Mahanti, A., D. L. Eager, M. K. Vernon, and D. J. Sundaram-Stukel. "Scalable On-demand Media Streaming with Packet Loss Recovery." IEEE/ACM Transactions on Networking 11.2 (2003): 195-209. Web.
[7] Chow, Alix L. H., Hao Yang, Cathy H. Xia, Minkyong Kim, Zhen Liu, and Hui Lei. "EMS: Encoded Multipath Streaming for Real-time Live Streaming Applications." Web.
[8] Cheng, Albert M. K., and Zhubin Zhang. "Adaptive Proxy Caching for Web Servers in Soft Real-Time Applications." Web.
[9] http://www.slideshare.net/aurot/http-live-streaming-10069443
[10] Complete packet trace: https://drive.google.com/a/uconn.edu/file/d/0B4YsLsQNRydQS0FOcXVmMlVyT0k/view?usp=sharing
[11] https://developer.apple.com/streaming/examples/basic-stream.html
