MetaCDN: Enabling High Performance, Low Cost Content Storage and Delivery via the Cloud

My talk on MetaCDN for the Cloudslam 2009 virtual conference.

Many 'Cloud Storage' providers have launched in the last two years, providing internet-accessible data storage and delivery in several continents, backed by rigorous Service Level Agreements (SLAs) that guarantee specific performance and uptime targets. The facilities offered by these providers are leveraged by developers via provider-specific Web Service APIs. For content creators, these providers have emerged as a genuine alternative to dedicated Content Delivery Networks (CDNs) for global file storage and delivery, as they are significantly cheaper, have comparable performance and impose no ongoing contract obligations. As a result, the idea of utilising Storage Clouds as a 'poor man's' CDN is very enticing. However, many of these 'Cloud Storage' providers are merely basic storage services, and do not offer the capabilities of a fully-featured CDN such as intelligent replication, failover, load redirection and load balancing. Furthermore, they can be difficult to use for non-developers, as each service is best utilised via unique web services or programmer APIs. In this presentation, we describe the design, architecture, implementation and user experience of MetaCDN, a system that integrates these 'Cloud Storage' providers into a unified CDN service that provides high performance, low cost, geographically distributed content storage and delivery for content creators. MetaCDN harnesses the power of 'Cloud Storage' for novices and seasoned users alike, offering an easy-to-use web portal and a sophisticated Web Service API.

  • MetaCDN is a system that leverages several existing ‘storage clouds’, creating an integrated overlay network that provides a low cost, high performance content delivery network for content creators.

  • Content Delivery Networks (CDNs) such as Akamai and Mirror Image place web server clusters in numerous geographical locations to improve the responsiveness and locality of the content they host for end-users.
    However, their services are priced out of reach for all but the largest enterprise customers.
  • Major CDN providers are notoriously cagey about revealing their prices. Most will only reveal their prices if you are a serious customer and are willing to commit to a contract and minimum data usage (as detailed in the previous slide). As a result, Dan Rayburn of StreamingMedia.com (a blog run for streaming media and CDN professionals) undertakes an informal sampling of pricing (taken from CDN customers) every few months.
  • Numerous ‘storage cloud’ providers (or ‘Storage as a Service’) have emerged that can provide data storage and delivery in several continents, offering SLA-backed performance and uptime promises for their services.
  • Each storage provider outlines their own cost structure for data transferred in and out of their service, as well as charging for persistent storage. In each of these cases, the costs are in the order of cents per gigabyte. Pricing scales downward based on higher usage for all providers. There is no minimum data usage requirement and no contracts - you only pay for what you store and transfer.

  • The providers themselves have very similar core functionality, but there are some key differences, for example, the largest allowable file size, the coverage footprint or specific features.
  • It is easy to see why storage clouds provide a compelling alternative to traditional CDNs for content producers that transfer significant amounts of data to their customers.
  • Amazon CloudFront offers a CDN-like service that is significantly cheaper than traditional CDNs. Amazon charges different rates depending on where the data is delivered from, to reflect the cost of data transfer and operations in different locations.
  • 1. MetaCDN is more likely to meet the needs of content creators than a single provider could.
    2. There is no ‘unified’ or familiar interface for all storage clouds. Consider Amazon S3, Nirvanix SDN, Mosso Cloud Files and Microsoft Azure Storage Service. These four cloud storage providers have four separate access APIs that a developer would need to learn to access these services.
    3. If a content creator attempted to utilise these providers themselves, they would essentially need to perform the load balancing and redirection themselves at their origin sites (complex!)
  • The service is presented to end-users as a web portal for small or ad-hoc deployments or as Web Services (currently under development) for integration of customers with more complex and frequently changing content delivery needs. The web portal was developed using Java Enterprise and Java Server Faces (JSF) technologies, with a MySQL back-end to store user accounts, deployments, and the capabilities and pricing of service providers.
    Introduce connectors, and major components.
  • The MetaCDN system integrates with each storage provider via connectors that provide an abstraction to hide the complexity arising from the differing ways each provider allows access to their systems. The connectors encapsulate basic operations like creation, deletion and renaming of files and folders. If an operation is not supported on a particular service, then the connector for that service should throw a FeatureNotSupportedException.
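The connector idea can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the MetaCDN source: only the names `FeatureNotSupportedException` and the create/delete/rename operations come from the slides, while the simplified signatures, the `readFile` method, `NoRenameConnector` and `renameWithFallback` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Exception named on the slides; the single-message constructor follows the class diagram.
class FeatureNotSupportedException extends Exception {
    FeatureNotSupportedException(String msg) { super(msg); }
}

// Simplified subset of connector operations (signatures and readFile are assumptions).
interface Connector {
    void createFile(String filename, byte[] content);
    byte[] readFile(String filename);
    void deleteFile(String filename);
    void renameFile(String filename, String newname) throws FeatureNotSupportedException;
}

// Hypothetical in-memory provider that can store files but cannot rename them.
class NoRenameConnector implements Connector {
    private final Map<String, byte[]> files = new HashMap<>();

    public void createFile(String filename, byte[] content) { files.put(filename, content); }
    public byte[] readFile(String filename) { return files.get(filename); }
    public void deleteFile(String filename) { files.remove(filename); }
    public void renameFile(String filename, String newname) throws FeatureNotSupportedException {
        throw new FeatureNotSupportedException("rename is not supported by this provider");
    }
}

class ConnectorSketch {
    // Callers see one interface; an unsupported rename falls back to copy-then-delete.
    static void renameWithFallback(Connector c, String from, String to) {
        try {
            c.renameFile(from, to);
        } catch (FeatureNotSupportedException e) {
            c.createFile(to, c.readFile(from));
            c.deleteFile(from);
        }
    }
}
```

Using a checked exception makes each unsupported feature visible at the call site, so the caller must decide whether to emulate the operation or skip that provider.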
  • 1. MetaCDN deploys as many replicas as possible to all available locations.
    2. A user nominates regions and MetaCDN matches the requested regions with providers that service those areas.
    3. MetaCDN deploys as many replicas in the locations requested by the user as their storage and transfer budget will allow, keeping them active until that budget is exhausted.
    4. MetaCDN deploys to providers that match specific QoS targets that a user specifies, such as average throughput or response time from a particular location, which is tracked by persistent probing from the MetaCDN QoS monitor.
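As an illustration of the second option (matching nominated regions to providers), here is a small Java sketch. The `Region`, `Provider` and `RegionAllocator` types are hypothetical, and the provider names and coverage in the usage note are made up rather than taken from any real provider.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Regions as offered in the portal's deployment form.
enum Region { NORTH_AMERICA, EUROPE, ASIA, AUSTRALASIA }

// Hypothetical provider record: a name plus the regions it can serve.
class Provider {
    final String name;
    final Set<Region> coverage;

    Provider(String name, Set<Region> coverage) {
        this.name = name;
        this.coverage = coverage;
    }
}

class RegionAllocator {
    // Returns one "provider@REGION" deployment target for every provider
    // whose coverage includes a requested region.
    static List<String> match(List<Provider> providers, Set<Region> requested) {
        List<String> targets = new ArrayList<>();
        for (Region region : requested) {
            for (Provider provider : providers) {
                if (provider.coverage.contains(region)) {
                    targets.add(provider.name + "@" + region);
                }
            }
        }
        return targets;
    }
}
```

For example, with a hypothetical `ProviderA` covering North America and Europe and a `ProviderB` that also covers Asia, requesting `EnumSet.of(Region.EUROPE, Region.ASIA)` yields targets for both providers in Europe and for `ProviderB` alone in Asia.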


  • The MetaCDN database tracks all pertinent information, such as users of the system, credentials for various providers, details about the providers' capabilities, pricing and footprint, and details of replicas deployed.
  • Using the web portal, users can sign up for an account on the MetaCDN system, and
    enter credentials for any cloud storage or other provider they have an account with. Once this simple step has been performed, they can utilise the MetaCDN system to intelligently deploy content onto storage providers according to their performance requirements and budget limitations.

  • A MetaCDN user is required to register the credentials of providers they have accounts with. Once this step is done they do not need to worry about how to interact with each of the providers. Eventually we would like MetaCDN users to not require accounts with specific providers - rather MetaCDN would provide consolidated billing of users for storage and transfer of content.
  • The MetaCDN "Control Panel" gives easy access to the core features of the service. You can deploy content, view existing deployments (via a high level content view or a detailed replica view), and view a deployment map overlaid onto Google Maps or Google Earth.
  • Here we can see an example of geographical-based deployment. A user nominates regions and MetaCDN matches the requested regions with providers that service those areas. The user also specifies the desired lifetime of the deployment, after which the replicas will be removed.
  • We can view details of our past deployments. We store and track information such as the origin id (i.e. the original source of the content), a unique GUID, the MetaCDN URL that represents the deployment, the number of times this content has been downloaded, the last time this content was downloaded and how many replicas were generated from this deployment.
  • Here we can see the specific replicas that have been generated from our various deployments. For each replica, we can see which provider and location was utilised, the public URL of the replica, the number of times the replica has been downloaded, the last access time of a specific replica and options to modify, delete, or view the replica if we wish to fine tune our deployment.
  • We can get a bird's-eye view of where our replicas are stored, and how many are stored in each location. MetaCDN generates a KML file for each user that is used to overlay on Google Maps (shown here), or we can view our deployments in Google Earth. We expect to overlay more useful information in these views in the near future, such as the cost expenditure at each location and the location of client (i.e. file consumer) hotspots.
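A per-user KML file of replica locations could be produced along these lines. This is a sketch only: the `ReplicaSite` and `KmlSketch` names are hypothetical, and the slides do not show how MetaCDN actually builds its KML. The element names used (`kml`, `Document`, `Placemark`, `name`, `Point`, `coordinates`) are standard KML 2.2.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical record of one replica location and how many replicas it holds.
class ReplicaSite {
    final String label;
    final double latitude;
    final double longitude;
    final int replicas;

    ReplicaSite(String label, double latitude, double longitude, int replicas) {
        this.label = label;
        this.latitude = latitude;
        this.longitude = longitude;
        this.replicas = replicas;
    }
}

class KmlSketch {
    // Escapes the characters that are unsafe inside XML element text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds one Placemark per site; KML coordinates are written longitude first.
    static String toKml(List<ReplicaSite> sites) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<kml xmlns=\"http://www.opengis.net/kml/2.2\">\n<Document>\n");
        for (ReplicaSite s : sites) {
            sb.append("  <Placemark>\n");
            sb.append("    <name>").append(escape(s.label))
              .append(" (").append(s.replicas).append(" replicas)</name>\n");
            sb.append(String.format(Locale.ROOT,
                "    <Point><coordinates>%.4f,%.4f</coordinates></Point>\n",
                s.longitude, s.latitude));
            sb.append("  </Placemark>\n");
        }
        sb.append("</Document>\n</kml>\n");
        return sb.toString();
    }
}
```

Calling `KmlSketch.toKml(List.of(new ReplicaSite("Melbourne", -37.8136, 144.9631, 2)))` produces a document that Google Earth can open directly; the coordinates here are illustrative.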
  • A web service interface is under development that will make all the functionality of the web portal available in a programmatic fashion. Obviously it's not feasible to deploy thousands of files manually via the web portal, so we need to provide the facility for advanced customers to scale out easily and rapidly.
  • With multiple sources (and multiple URLs), the complexity of load balancing is imposed on the origin / content provider.
    With a single namespace, we can have coarse and fine-grained control via DNS redirection and layer 4/7 load balancing:
    http://www.metacdn.org:8080/MetaCDN/FileMapper?itemid=1
    http://www.metacdn.org:8080/MetaCDN/FileMapper?itemid=1&policy=RAN
    http://www.metacdn.org:8080/MetaCDN/FileMapper?itemid=1&policy=GEO
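How the `policy` parameter might steer replica selection can be sketched in Java. Only the parameter values `RAN` and `GEO` come from the URLs above. Everything else is an assumption for illustration: the `Replica` and `FileMapperSketch` names, the use of great-circle distance for `GEO`, the idea that the client's coordinates are already known, and treating a missing policy as `GEO`.

```java
import java.util.List;
import java.util.Random;

// Hypothetical replica record: public URL plus approximate coordinates.
class Replica {
    final String url;
    final double lat;
    final double lon;

    Replica(String url, double lat, double lon) {
        this.url = url;
        this.lat = lat;
        this.lon = lon;
    }
}

class FileMapperSketch {
    // Great-circle distance in kilometres (haversine formula, mean Earth radius 6371 km).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * 6371.0 * Math.asin(Math.sqrt(a));
    }

    // "RAN" picks any replica at random; anything else is treated as "GEO" (assumed default).
    static Replica select(String policy, List<Replica> replicas,
                          double clientLat, double clientLon, Random rng) {
        if ("RAN".equals(policy)) {
            return replicas.get(rng.nextInt(replicas.size()));
        }
        Replica best = null;
        double bestKm = Double.MAX_VALUE;
        for (Replica r : replicas) {
            double km = haversineKm(clientLat, clientLon, r.lat, r.lon);
            if (km < bestKm) {
                bestKm = km;
                best = r;
            }
        }
        return best;
    }
}
```

With one replica on the US east coast and one in central Europe, a client located in Paris would be sent to the European replica under `GEO` and to either one under `RAN`.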

  • During the development phase, only one copy of the portal/redirector is running, in Melbourne, Australia. The plan is to deploy portals in several locations across the US, Asia and Europe. The next slide shows why this is necessary.
  • Let's assume that a consumer in the USA was accessing a replica directly (i.e. it magically knew the best replica to select), or via MetaCDN. Here we can see that there is around 0.4 seconds of overhead, which is predominantly the round-trip time to access the gateway in Australia.
  • When a consumer has a gateway that is close to it (in this case, a consumer in Australia is utilising a local gateway) the overhead is significantly smaller, in the order of 0.05 seconds per request. It is obvious that local gateways are needed in key areas to maximise performance.
  • In the second half of 2008 we evaluated the two major cloud storage providers at the time, Amazon S3 and Nirvanix SDN. We ran the test over 24 hours from a variety of client and replica locations to see whether the providers demonstrated sufficient performance (i.e. throughput and response times) to act as a "poor man's" CDN.
  • In 5 out of 6 client locations, at least 2 replicas delivered throughput consistent with what we would expect from a traditional CDN service. Throughput is reported in kilobytes per second.
  • In 4 out of 6 client locations, between 1 and 3 replicas delivered response times consistent with what we would expect from a traditional CDN service. It is worth noting that these times represent the end-to-end latency and connection time (i.e. an HTTP connection is made); they are not simply ping measurements.

  • FTP/WebDAV support will be useful in locations that cloud providers do not service (and are unlikely to).
    There is a lot of demand from customers to move away from Flash video hosts such as YouTube and Vimeo and to host their own streaming content directly on their origin site. This way they control the look and feel and ad monetization of their content.





Transcript

  • 1. MetaCDN: Enabling High Performance, Low Cost Content Storage and Delivery via the Cloud Dr. James Broberg (brobergj@csse.unimelb.edu.au) http://www.csse.unimelb.edu.au/~brobergj http://www.metacdn.org
  • 2. Content Delivery Networks (CDNs) • What is a CDN? • Content Delivery Networks (CDNs) such as Akamai, Mirror Image and Limelight place web server clusters in numerous geographical locations to improve the responsiveness and locality of the content they host for end-users.
  • 3. Existing CDN providers • Akamai is the clear leader in coverage and market share (approx. 80%), however... • Price is prohibitive for SME, NGO, Gov... • Anecdotally 2−15 times more expensive than Cloud Storage, and require 1−2 year commitments and min. data use (10TB+) • Academic CDNs include Coral, Codeen, Globule, however... • No SLA / QoS provided, only ‘best effort’
  • 4. Major CDN pricing • Most major CDN providers do not publish prices • “Average” prices from the 4-5 major CDNs in the market: • 50TB: $0.40 - $0.50 per GB delivered • 100TB: $0.25 - $0.45 per GB delivered • 250TB: $0.10 - $0.30 per GB delivered • Information taken from Dan Rayburn @ www.cdnpricing.com, StreamingMedia.com
  • 5. Storage Clouds: ‘Storage as a Service’ [slide graphic not recoverable]
  • 6. Cost Structures

    | | Nirvanix Global SDN | Amazon S3 USA | Amazon S3 Europe | Amazon CloudFront | Mosso CloudFiles |
    |---|---|---|---|---|---|
    | Usage tier | < 2TB | < 10TB | < 10TB | < 10TB | < 5TB |
    | Incoming data ($/GB) | 0.18 | 0.1 | 0.1 | N/A | 0.08 |
    | Outgoing data ($/GB) | 0.18 | 0.17 | 0.17 | 0.17-0.21 | 0.22 |
    | Storage ($/GB/month) | 0.25 | 0.15 | 0.18 | N/A | 0.15 |
    | Requests ($/1,000 PUT/POST/LIST) | 0.00 | 0.01 | 0.012 | N/A | 0.02 |
    | Requests ($/10,000 GET) | 0.00 | 0.01 | 0.012 | 0.010-0.013 | 0.00 |

  • 7. Feature Comparison

    | | Nirvanix Global SDN | Amazon S3 | Amazon CloudFront | Mosso CloudFiles |
    |---|---|---|---|---|
    | SLA | 99.9 | 99-99.9 | 99-99.9 | 99.9 |
    | Max File Size | 256GB | 5GB | 5GB | 5GB |
    | US PoP | Yes | Yes | Yes | Yes |
    | EU PoP | Yes | Yes | Yes | Yes |
    | Asia PoP | Yes | No | Yes | Yes |
    | AUS PoP | No | No | No | Yes |
    | Per file ACL | Yes | Yes | Yes | Yes |
    | Automatic replication of files | Yes | No | Yes | Yes |
    | Developer APIs / Web Services | Yes | Yes | Yes | Yes |
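To make the slide 6 cost structure concrete, here is a short Java sketch that prices a month of usage. The rates plugged in below are the Amazon S3 USA figures from slide 6 (storage 0.15 $/GB/month, outgoing data 0.17 $/GB, 0.01 $ per 10,000 GET requests); the workload (10 GB stored, 1,000 GB delivered, 100,000 GETs) and the `CostSketch` name are hypothetical, and incoming data and PUT/POST/LIST requests are ignored for brevity.

```java
class CostSketch {
    // Monthly cost = storage + outgoing transfer + GET request charges.
    static double monthlyCost(double storedGb, double outgoingGb, long getRequests,
                              double storagePerGbMonth, double outgoingPerGb,
                              double per10kGets) {
        double storage = storedGb * storagePerGbMonth;
        double transfer = outgoingGb * outgoingPerGb;
        double requests = (getRequests / 10_000.0) * per10kGets;
        return storage + transfer + requests;
    }

    public static void main(String[] args) {
        // Slide 6 rates for Amazon S3 USA; the workload itself is made up.
        double cost = monthlyCost(10, 1_000, 100_000, 0.15, 0.17, 0.01);
        System.out.printf("Estimated monthly cost: $%.2f%n", cost);
    }
}
```

For this workload the three terms are 10 × 0.15 = 1.50, 1,000 × 0.17 = 170.00 and 10 × 0.01 = 0.10, so the estimate is about $171.60, with outgoing transfer dominating the bill.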
  • 8. Pricing comparison [chart: cost in $USD/month (0 to 80,000) against outgoing data in TB/month (0 to 250) for Amazon S3 USA/EU, Nirvanix SDN, Mosso Cloud Files and major CDNs (average)]
  • 9. Pricing comparison [chart: cost in $USD/month (0 to 40,000) against outgoing data in TB/month (0 to 250) for Amazon CF USA/EU, Amazon CF HK and Amazon CF JP]
  • 10. Introducing MetaCDN • What if we could create a low-cost, high performance overlay CDN using these storage clouds? • MetaCDN harnesses the diversity of multiple storage cloud providers, offering a choice of deployment location, features and pricing. • MetaCDN provides a uniform method / API for harnessing multiple storage clouds. • MetaCDN provides a single namespace to cover all supported providers, making it simple to integrate into origin sites, and handles load balancing for end-users. • The MetaCDN system offers content creators (who are not necessarily programmers!) a trivial way to harness the power of multiple cloud storage providers via a web portal
  • 11. How MetaCDN works (architecture diagram)
    Connectors (each a Java stub from MetaCDN.org) and the services they wrap:
      • AmazonS3Connector: Amazon S3 & CloudFront, via the JetS3t toolkit (Java SDK, open source)
      • NirvanixConnector: Nirvanix SDN, via the Nirvanix SDK (Java SDK, Nirvanix, Inc)
      • CloudFilesConnector: Mosso Cloud Files CDN, via the Cloud Files SDK (Java SDK, Mosso, Inc)
      • CoralConnector: Coral CDN
      • AzureConnector: Microsoft Azure Storage Service
      • WebDAVConnector, SCPConnector, FTPConnector: Shared/Private Host
    Core components: MetaCDN Manager, MetaCDN QoS Monitor, MetaCDN Allocator, MetaCDN Database
    Front ends:
      • Web Portal: Java (JSF/EJB) based portal; support for HTTP POST; new/view/modify deployment
      • Load Redirector: random redirection, geographical redirection, least cost redirection
      • Web Service: SOAP Web Service, RESTful Web Service, programmatic access
  • 12. How MetaCDN works (connector class diagram)
    DefaultConnector
      • Constants: DEPLOY_USA, DEPLOY_EU, DEPLOY_ASIA, DEPLOY_AUS
      • createFolder(foldername, location)
      • deleteFolder(foldername)
      • createFile(file, foldername, location, date)
      • createFile(fileURL, foldername, location, date)
      • renameFile(filename, newname, location)
      • createTorrent(file)
      • createTorrent(fileURL)
      • deleteFile(file, location)
      • listFilesAndFolders()
      • deleteFilesAndFolders()
    AmazonS3Connector: same operations, except that renameFile(filename, newname, location) throws FeatureNotSupportedException
    NirvanixConnector: same operations, except that createTorrent(file) and createTorrent(fileURL) throw FeatureNotSupportedException
    <<exception>> FeatureNotSupportedException: FeatureNotSupportedException(msg)
  • 13. MetaCDN Allocator • The MetaCDN Allocator allows users to deploy files either directly or from a public origin URL, with the following options: • Maximise coverage and performance • Deploy content in specific locations • Cost optimised deployment • Quality of Service (QoS) optimised
  • 14. MetaCDN Manager • The MetaCDN Manager ensures that: • All current deployments are meeting QoS targets (where applicable) • Replicas are removed when no longer required (minimising cost) • A user’s budget has not been exceeded, by tracking usage (i.e. storage/download)
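Two of the Manager's checks (removing replicas that are no longer required and watching a user's budget) might look like this in Java. This is a sketch under assumptions, not the MetaCDN implementation: the `ManagedReplica` and `ManagerSketch` names are hypothetical, a replica is assumed to be due for removal once its "host until" date has passed, and spend is assumed to be storage plus download charges at flat per-GB rates.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical replica record with the "host until" date chosen at deployment time.
class ManagedReplica {
    final String url;
    final LocalDate hostUntil;

    ManagedReplica(String url, LocalDate hostUntil) {
        this.url = url;
        this.hostUntil = hostUntil;
    }
}

class ManagerSketch {
    // A replica is due for removal once its host-until date lies strictly before today.
    static List<ManagedReplica> dueForRemoval(List<ManagedReplica> replicas, LocalDate today) {
        List<ManagedReplica> due = new ArrayList<>();
        for (ManagedReplica r : replicas) {
            if (r.hostUntil.isBefore(today)) {
                due.add(r);
            }
        }
        return due;
    }

    // Compares tracked usage, priced at flat per-GB rates, against the user's budget.
    static boolean budgetExceeded(double storedGb, double downloadedGb,
                                  double storagePerGb, double transferPerGb, double budget) {
        return storedGb * storagePerGb + downloadedGb * transferPerGb > budget;
    }
}
```

A periodic job could call `dueForRemoval` with `LocalDate.now()` and pass each result to the relevant provider's delete operation.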
  • 15. MetaCDN QoS Monitor • The MetaCDN QoS Monitor: • Tracks the performance of participating providers at all times • Monitors and records throughput, response time and uptime from a variety of locations • Ensures that upstream providers are meeting their Service Level Agreements
  • 16. MetaCDN Database (entity-relationship diagram) • Entities: MetaCDN User, CDN Credentials, CDN Provider, Coverage locations, MetaCDN Content, MetaCDN Replica, QoS Monitor • Relationships shown (with 1, M and 0:M cardinalities): has, for, hosted by, deployed as, hosted at, measures
  • 17. MetaCDN Web Portal • Developed using Java Enterprise and Java Server Faces (JSF) technologies • JPA/MySQL back-end to store persistent data • Web portal acts as the entry point to the system and application-level load balancer • Most suited for small or ad-hoc deployments, and especially useful for less technically inclined content creators.
  • 18. Don't have a MetaCDN account? Sign in Register Username: master Password: •••••• Login All trademarks mentioned herein are the exclusive property of their respective owners. This project is supported by the:
  • 19. Register new MetaCDN account: Username: joebloggs Full Name: Joe Bloggs Password: •••••• Email: joe@blogs.com Preferred Providers: Nirvanix SDN, Amazon S3, Mosso Cloud Files, Microsoft Azure Storage Service, Shared/Private Host Enter your Amazon S3 Credentials: AWS Access Key: ****************************** AWS Secret Key: ****************************** Enable CloudFront: Register Cancel
  • 20. Welcome to the MetaCDN Portal! Deploy content View existing content Deployment Map View existing replicas MetaCDN Analytics Logout Edit Account
  • 21. Sideload Content: Choose URL: http://pitchfork.com/media/frontend/images/header_logo.gif Deployment region(s): North America, Europe, Asia, Australasia Host Until: 30/05/2009 Deploy Cancel
  • 22. Deployment Map: Terms of Use Back
  • 23. MetaCDN Web Service • Makes all functionality available via Web Services (SOAP & REST/HTTP) • Web interface is useful for novices and for ad-hoc deployments, but doesn’t scale • Larger customers have 1,000s, 10,000s or even 100,000s of files that need deployment • Let them automate their deployment and management via Web Services! • Perfect for Mashup developers!!!
  • 24. MetaCDN Load Redirector • We have created a unified namespace to simplify deployment, routing and management • Currently, file deployment results in multiple URLs, each mapping to a replica • http://metacdn-eu-user.s3.amazonaws.com/myfile.mp4 • http://metacdn-us-user.s3.amazonaws.com/myfile.mp4 • http://node3.nirvanix.com/MetaCDN/user/myfile.mp4 • Single namespace is created for fine control • http://www.metacdn.org/FileMapper?itemid=2
  • 25. MetaCDN Load Redirector (cont.) • Actual load redirection logic depends on deployment option used by MetaCDN user: • Maximise coverage and performance? - Find closest physical replica • Deploy content in specific locations? - Find closest physical replica • Cost optimised deployment? - Find cheapest replica, minimises cost to maximise lifetime • Quality of Service (QoS) optimised? - Find best (historically) performing replica
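The mapping from deployment option to redirection rule on slide 25 can be sketched with Java comparators. The four options come from the slide; the `DeploymentOption`, `Candidate` and `RedirectPolicySketch` names, the candidate fields, and the use of average throughput as the measure of "best performing" are assumptions for illustration.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// The four deployment options listed on the slides.
enum DeploymentOption { MAX_COVERAGE, SPECIFIC_LOCATIONS, COST_OPTIMISED, QOS_OPTIMISED }

// Hypothetical per-replica data the redirector would need for each rule.
class Candidate {
    final String url;
    final double distanceKm;        // distance from the requesting client
    final double costPerGb;         // outgoing transfer price
    final double avgThroughputKBps; // historical measurement from the QoS monitor

    Candidate(String url, double distanceKm, double costPerGb, double avgThroughputKBps) {
        this.url = url;
        this.distanceKm = distanceKm;
        this.costPerGb = costPerGb;
        this.avgThroughputKBps = avgThroughputKBps;
    }
}

class RedirectPolicySketch {
    static Candidate choose(DeploymentOption option, List<Candidate> candidates) {
        Comparator<Candidate> order;
        switch (option) {
            case COST_OPTIMISED:
                // Cheapest replica first.
                order = Comparator.comparingDouble(c -> c.costPerGb);
                break;
            case QOS_OPTIMISED:
                // Highest historical throughput first.
                order = Comparator.comparingDouble((Candidate c) -> c.avgThroughputKBps).reversed();
                break;
            default:
                // Coverage and location-based deployments both pick the closest replica.
                order = Comparator.comparingDouble(c -> c.distanceKm);
        }
        return Collections.min(candidates, order);
    }
}
```

Keeping each rule as a comparator means a new deployment option only needs a new ordering, not a new selection loop.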
  • 26. MetaCDN Load Redirector (cont.) Sequence between the MetaCDN end-user, the DNS server, the MetaCDN gateway and Amazon S3 USA:
    1. The end-user asks the DNS server to resolve www.metacdn.org.
    2. The DNS server returns the IP of the closest MetaCDN gateway, www-na.metacdn.org.
    3. The end-user sends GET http://metacdn.org/MetaCDN/FileMapper?itemid=1 to the gateway.
    4. The gateway runs processRequest() and geoRedirect(), then replies with an HTTP 302 Redirect to http://metacdn-us-username.s3.amazonaws.com/filename.pdf.
    5. The end-user resolves metacdn-us-username.s3.amazonaws.com and receives its IP.
    6. The end-user sends GET http://metacdn-us-username.s3.amazonaws.com/filename.pdf and Amazon S3 USA returns the replica.
  • 27. Redirector Overhead from USA [chart: response time in seconds (0 to 1.8) by hour (0 to 25), direct access to S3 USA compared with access via MetaCDN]
  • 28. Redirector Overhead from Australia [chart: response time in seconds (0.55 to 0.9) by hour (0 to 25), direct access to Nirvanix SDN #3 compared with access via MetaCDN]
  • 29. Evaluating MetaCDN performance • Ran tests over a 24 hour period (mid-week) • Downloaded test replicas (1KB, 10MB) 30 times per hour, taking the average and confidence interval • 10MB - Throughput, 1KB - Response Time • Ran test from 6 locations: Melbourne (AUS), Paris (FRA), Vienna (AUT), San Diego & New Jersey (USA), Seoul (KOR) • Replicas located across US, EU, ASIA, AUS
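The hourly average and confidence interval mentioned on slide 29 can be computed as in the Java sketch below. The slides do not state the confidence level or the method, so a 95% Student's t interval is assumed here; with 30 samples per hour there are 29 degrees of freedom, for which the two-sided 95% critical value is approximately 2.045. The `ConfidenceIntervalSketch` name is hypothetical.

```java
class ConfidenceIntervalSketch {
    // Two-sided 95% Student's t critical value for 29 degrees of freedom (30 samples).
    static final double T_95_DF29 = 2.045;

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Half-width of the interval: t * s / sqrt(n), with s the sample standard deviation.
    static double halfWidth(double[] xs, double tCritical) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        double sampleSd = Math.sqrt(squares / (xs.length - 1));
        return tCritical * sampleSd / Math.sqrt(xs.length);
    }
}
```

For an hour's 30 measurements `xs`, the reported interval would be `mean(xs)` plus or minus `halfWidth(xs, T_95_DF29)`.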
  • 30. Summary of Results - Throughput (KB/s)

    | Client location | S3 US | S3 EU | SDN #1 | SDN #2 | SDN #3 | SDN #4 | Coral |
    |---|---|---|---|---|---|---|---|
    | Melbourne, Australia | 264.3 | 389.1 | 30 | 366.8 | 408.4 | 405.5 | 173.7 |
    | Paris, France | 703.1 | 2116 | 483.8 | 2948 | 416.8 | 1042 | 530.2 |
    | Vienna, Austria | 490.7 | 1347 | 288.4 | 2271 | 211 | 538.7 | 453.4 |
    | Seoul, South Korea | 312.8 | 376.1 | 466.5 | 411.8 | 2456 | 588.2 | 152 |
    | San Diego, USA | 1234 | 323.5 | 5946 | 380.1 | 506.1 | 820.4 | 338.5 |
    | New Jersey, USA | 2381 | 1949 | 860.8 | 967.1 | 572.8 | 4230 | 636.4 |

  • 31. Summary of results - Response Time (Sec)

    | Client location | S3 US | S3 EU | SDN #1 | SDN #2 | SDN #3 | SDN #4 | Coral |
    |---|---|---|---|---|---|---|---|
    | Melbourne, Australia | 1.378 | 1.458 | 0.663 | 0.703 | 1.195 | 0.816 | 5.452 |
    | Paris, France | 0.533 | 0.2 | 0.538 | 0.099 | 1.078 | 0.316 | 3.11 |
    | Vienna, Austria | 0.723 | 0.442 | 0.585 | 0.099 | 1.088 | 0.406 | 3.171 |
    | Seoul, South Korea | 1.135 | 1.21 | 0.856 | 0.896 | 1 | 0.848 | 3.318 |
    | San Diego, USA | 0.232 | 0.455 | 0.23 | 0.361 | 0.775 | 0.319 | 4.655 |
    | New Jersey, USA | 0.532 | 0.491 | 0.621 | 0.475 | 1.263 | 0.516 | 1.916 |
  • 32. Summary of results (cont) • Clients benefited greatly from local replicas • Results are consistent in terms of response time and throughput with previous studies of dedicated CDNs • Back-end providers have sufficient performance and reliability to be used to host replicas in the MetaCDN system • Adding “CDN” Cloud Providers (Amazon CloudFront, Mosso Cloud Files) will likely significantly improve results.
  • 33. MetaCDN features in planning / development • Support as many providers as possible • Windows Azure Storage Service support will be finished very shortly. • Non-“Cloud” storage (i.e. FTP, WebDAV) will be available and seamlessly integrated. • Integrated Flash video and MP3 audio streaming with embeddable players for customers to place on their origin sites.
  • 34. MetaCDN features in planning / development • Autonomic deployment management (expansion/contraction) based on demand • Autonomic deployment management (expansion/contraction) based on QoS • Security / ACL framework that spans all cloud storage providers • “One time” MetaCDN URLs
  • 35. Collaborations • Always looking for people to collaborate on the project • Work to be done on: • Load balancing / redirection algorithms • Intelligent Caching / replication algorithms • Security / Access Control of content • Improving MetaCDN Web Services • Please contact me if you are interested...
  • 36. Acknowledgements • Australian Research Council (ARC) for funding the project. • Cory & Barry (Nirvanix) and Eric (Rackspace/ Mosso) for development support.
  • 37. Publications • J. Broberg and Z. Tari. MetaCDN: Harnessing Storage Clouds for High Performance Content Delivery. In Proceedings of The Sixth International Conference on Service-Oriented Computing [Demonstration Paper] (ICSOC 2008), LNCS 5364, pp. 730–731, 2008. • J. Broberg, R. Buyya and Z. Tari. Creating a ‘Cloud Storage’ Mashup for High Performance, Low Cost Content Delivery, Second International Workshop on Web APIs and Services Mashups (Mashups’08), In Proceedings of The Sixth International Conference on Service-Oriented Computing Workshops, LNCS 5472, pp. 178–183, 2009. • J. Broberg, R. Buyya, and Z. Tari, MetaCDN: Harnessing ‘Storage Clouds’ for high performance content delivery, Journal of Network and Computer Applications (JNCA), To appear, 2009 • R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility, Future Generation Computer Systems. To appear, 2009.
  • 38. Thanks Latest information will be available at: www.metacdn.org International Workshop on Cloud Computing (Cloud 2009): http://www.gridbus.org/cloud2009