
Key Challenges in Cloud Computing and How Yahoo! is Approaching Them


From Raghu Ramakrishnan's presentation "Key Challenges in Cloud Computing and How Yahoo! is Approaching Them" at the 2009 Cloud Computing Expo in Santa Clara, CA, USA. Here's the talk description on the Expo's site: http://cloudcomputingexpo.com/event/session/510

  • Will begin by referring to Shelton’s talk, and saying that this talk is about the technical challenges in building and using the clouds he described
  • Point out that the key challenges cut across the different cloud components
  • Say that we begin with a closer look at data management issues
  • Note that this is the non-Mobstor case
  • Talk about what we’ve done, and explain how design choices address the key challenges
  • The “Free” update goes to the East coast first because of a network disruption, and it happens to arrive before the original “Busy” notification. The West coast then receives the “Free” notification while the East coast receives the original “Busy” notification; neither outcome has priority over the other.
  • Yahoo! thought leadership in this space
  • Updates write the whole record
  • Yahoo! thought leadership in this space
  • Abstracts concerns of the underlying infrastructure and the network communication:
    • Virtualized hardware
    • Declarative application structure
    • End-to-end security
    • Standardized software stacks and packaging
    • Integrated service management
    • Continuous integration and deployment
    • Containers that service requests, as opposed to machines that run executables
    • Elastic serving of changing workloads
    • Controlled/intelligent traffic direction
    • Controlled execution environment
    • Managed communication
    • Service associations, bindings, and access controls
  • Common mechanism to orchestrate resources in the Cloud, including nodes, storage devices, load balancers, security identities, and the services themselves (IETF proposal). Also covers moving services within and across clouds. Clouds may be partitioned into independently administered “regions” (environments in which services and resources can interact) to facilitate load balancing and ownership.
  • Over 3,500 transactions/second per YTS box. Ysquid is used only where there is a clear need based on features; it does not scale like YTS in most cases (it runs single-threaded on a single core). YCPI accelerates dynamic content internationally.
  • US only. Much of Y! runs through YCS in the US, along with YCPI.
  • Quick plug for Jonathan’s talk—and make the point that while Yahoo!’s cloud is currently private, it enables externally available apps, including some that selectively expose cloud capabilities to outside developers

    1. 1. Vendor Speech by Raghu Ramakrishnan Cloud Computing Conference & Expo November 3, 2009   Key Challenges in Cloud Computing … and the Yahoo! Approach
    2. 2. Key Challenges <ul><li>Elastic scaling </li></ul><ul><ul><ul><li>Typically, with commodity infrastructure </li></ul></ul></ul><ul><li>Availability </li></ul><ul><ul><ul><li>Trading consistency/performance/availability </li></ul></ul></ul><ul><li>Handling failures </li></ul><ul><ul><ul><li>What can be counted on after a failure? </li></ul></ul></ul><ul><li>Operational efficiency </li></ul><ul><ul><ul><li>Managing and tuning multi-tenanted clouds </li></ul></ul></ul><ul><li>The right abstractions </li></ul><ul><ul><ul><li>Data, security, and services in the cloud </li></ul></ul></ul><ul><ul><ul><li>Don’t forget failures! </li></ul></ul></ul>
    3. 3. Inside Yahoo!’s Cloud KEY CHALLENGES
    4. 4. DATA MANAGEMENT IN THE CLOUD
    5. 5. Help! <ul><li>I have a huge amount of data. What should I do with it? </li></ul>DB2 UDB Sherpa
    6. 6. What Are You Trying to Do? Data Workloads OLTP (Random access to a few records) OLAP (Scan access to a large number of records) Read-heavy Write-heavy By rows By columns Unstructured Combined (Some OLTP and OLAP tasks)
    7. 7. Yahoo! Solution Space OLTP (Random access to a few records) OLAP (Scan access to a large number of records) Read-heavy Write-heavy By rows By columns Unstructured Combined (Some OLTP and OLAP tasks) UDS UDB ??? STCache Sherpa Read, read/write Write-heavy Main-memory SQL on Grid Zebra Pig HDFS MapReduce
    8. 8. Storage <ul><li>Common features: </li></ul><ul><li>Managed service, Simple REST API </li></ul><ul><li>Replication, BCP, High Availability </li></ul><ul><li>Global footprint </li></ul><ul><li>Sherpa (more later in this talk) </li></ul><ul><li>Key-Value, structured data store </li></ul><ul><li>Size of objects – target is up to 100k </li></ul><ul><li>R/W target latency 10 – 20ms </li></ul><ul><li>Secondary indexes, ordered tables, real-time update notifications </li></ul><ul><li>MObStor (Chuck Neerdaels, Wed, Session 3, @ 9am) </li></ul><ul><li>Large, unstructured online storage </li></ul><ul><li>Object size typically a few MB – up to 2GB </li></ul><ul><li>R/W target latency 50 – 100ms (depending on size) </li></ul><ul><li>Commodity Storage (DORA) </li></ul>
    9. 9. Yahoo! Storage Problem <ul><ul><li>Small records – 100KB or less </li></ul></ul><ul><ul><li>Structured records – Lots of fields, evolving </li></ul></ul><ul><ul><li>Extreme data scale - Tens of TB </li></ul></ul><ul><ul><li>Extreme request scale - Tens of thousands of requests/sec </li></ul></ul><ul><ul><li>Low latency globally - 20+ datacenters worldwide </li></ul></ul><ul><ul><li>High Availability - Outages cost $millions </li></ul></ul><ul><ul><li>Variable usage patterns - Applications and users change </li></ul></ul>
    10. 10. Typical Applications <ul><li>User logins and profiles </li></ul><ul><ul><ul><li>Including changes that must not be lost! </li></ul></ul></ul><ul><ul><ul><ul><li>But single-record “transactions” suffice </li></ul></ul></ul></ul><ul><li>Events </li></ul><ul><ul><ul><li>Alerts (e.g., news, price changes) </li></ul></ul></ul><ul><ul><ul><li>Social network activity (e.g., user goes offline) </li></ul></ul></ul><ul><ul><ul><li>Ad clicks, article clicks </li></ul></ul></ul><ul><li>Application-specific data </li></ul><ul><ul><ul><li>Postings in message board </li></ul></ul></ul><ul><ul><ul><li>Uploaded photos, tags </li></ul></ul></ul><ul><ul><ul><li>Shopping carts </li></ul></ul></ul>
    11. 11. VLSD Data Serving Stores <ul><li>Must partition data across machines </li></ul><ul><ul><ul><li>Can partitions be changed easily? (Affects elasticity ) </li></ul></ul></ul><ul><ul><ul><li>How are read/update requests routed? </li></ul></ul></ul><ul><ul><li>Range selections ? Can requests span machines? </li></ul></ul><ul><li>Availability : What failures are handled? </li></ul><ul><ul><ul><li>With what semantic guarantees on data access? </li></ul></ul></ul><ul><li>(How) Is data replicated ? </li></ul><ul><ul><ul><li>Sync or async? Consistency model? Local or geo? </li></ul></ul></ul><ul><li>How are updates made durable ? </li></ul><ul><li>How is data stored on a single machine? </li></ul>
    12. 12. The CAP Theorem <ul><li>You have to give up one of the following in a distributed system (Brewer, PODC 2000; Gilbert/Lynch, SIGACT News 2002): </li></ul><ul><ul><ul><li>Consistency of data </li></ul></ul></ul><ul><ul><ul><ul><li>Think serializability </li></ul></ul></ul></ul><ul><ul><ul><li>Availability </li></ul></ul></ul><ul><ul><ul><ul><li>Pinging a live node should produce results </li></ul></ul></ul></ul><ul><ul><ul><li>Partition tolerance </li></ul></ul></ul><ul><ul><ul><ul><li>Live nodes should not be blocked by partitions </li></ul></ul></ul></ul>
    13. 13. Approaches to CAP <ul><li>Use a single version of DB, reconcile later </li></ul><ul><li>Defer transaction commit </li></ul><ul><li>Eventual consistency (e.g., Amazon Dynamo) </li></ul><ul><li>Restrict transactions (e.g., Sharded MySQL) </li></ul><ul><ul><ul><li>1-M/c Xacts: Objects in xact are on the same machine </li></ul></ul></ul><ul><ul><ul><li>1-Object Xacts: Xact can only read/write 1 object </li></ul></ul></ul><ul><li>Object timelines (SHERPA) </li></ul>http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
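The "object timelines" approach on this slide can be sketched in a few lines. This is a hypothetical illustration, not Sherpa's actual code: each record has a single master that stamps writes with a monotonically increasing version, so every replica applies the same total order of updates to that one object, with no cross-object coordination needed.

```python
# Hypothetical sketch of the "object timeline" idea: one master per
# record stamps each write with a monotonically increasing version, so
# all replicas converge on the same per-record update order.

class RecordTimeline:
    """One record's timeline; the master serializes and stamps writes."""
    def __init__(self):
        self.version = 0          # latest version stamped by the master
        self.value = None

    def master_write(self, value):
        self.version += 1         # serialize writes to this record
        self.value = value
        return (self.version, value)

class Replica:
    """A replica applies updates in version order, never moving backward."""
    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, stamped):
        version, value = stamped
        if version > self.version:    # ignore stale/duplicate deliveries
            self.version, self.value = version, value

master = RecordTimeline()
replica = Replica()
u1 = master.master_write("busy")
u2 = master.master_write("free")
# Updates may be delivered out of order; the replica still converges
# to the master's timeline.
replica.apply(u2)
replica.apply(u1)                     # stale update is discarded
```

Because ordering is per record, this sidesteps full distributed transactions while still giving each object a well-defined history.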
    14. 14. One Slide Hadoop Primer HDFS Data file Map tasks HDFS Reduce tasks Good for analyzing (scanning) huge files Not great for serving (reading or writing individual objects)
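The map/shuffle/reduce flow on this slide can be simulated in a single process. The sketch below is a toy word count, not Hadoop itself: Hadoop runs the same three phases in parallel over HDFS blocks, while here everything is in memory.

```python
# Minimal single-process sketch of map -> shuffle -> reduce (word count).
from collections import defaultdict

def map_phase(lines):
    for line in lines:                  # each map task scans its input split
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    groups = defaultdict(list)          # group intermediate values by key
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

data = ["the cloud", "the grid and the cloud"]
counts = reduce_phase(shuffle(map_phase(data)))
# counts["the"] == 3, counts["cloud"] == 2
```

The shuffle step is why Hadoop is good at scanning huge files and poor at serving individual objects: every job regroups the entire intermediate dataset by key.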
    15. 15. Out There in the World <ul><li>OLTP </li></ul><ul><ul><li>General </li></ul></ul><ul><ul><li>Write optimized </li></ul></ul><ul><ul><li>Main-memory </li></ul></ul><ul><ul><li>m e m c a c h e d* </li></ul></ul><ul><li>OLAP </li></ul><ul><ul><li>Row store </li></ul></ul><ul><ul><li>Column store </li></ul></ul><ul><li>MapReduce </li></ul><ul><li>Combined OLAP/OLTP </li></ul><ul><ul><li>B i g T a b l e </li></ul></ul><ul><li>Other </li></ul><ul><ul><li>Blobs </li></ul></ul><ul><ul><li>Information retrieval </li></ul></ul>HadoopDB ??? * Memcached provides no native query processing DB2
    16. 16. Ways of Using Hadoop Data workloads OLAP (Scan access to a large number of records) By rows By columns Unstructured HadoopDB SQL on Grid Zebra
    17. 17. Hadoop-Based Applications @ Yahoo! 2008 2009 Webmap ~70 hours runtime ~300 TB shuffling ~200 TB output ~73 hours runtime ~490 TB shuffling ~280 TB output +55% hardware Terasort 209 seconds 1 Terabyte sorted 900 nodes 62 seconds 1 Terabyte, 1500 nodes 16.25 hours 1 Petabyte, 3700 nodes Largest cluster <ul><li>2000 nodes </li></ul><ul><li>6PB raw disk </li></ul><ul><li>16TB of RAM </li></ul><ul><li>16K CPUs </li></ul><ul><li>4000 nodes </li></ul><ul><li>16PB raw disk </li></ul><ul><li>64TB of RAM </li></ul><ul><li>32K CPUs </li></ul><ul><li>(40% faster CPUs too) </li></ul>
    18. 18. Batch Storage and Processing <ul><li>Hadoop (Eric Baldeschwieler, Tue 4:50 pm, Session 9) </li></ul><ul><li>Runs data/processing-intensive distributed applications on 1,000’s of nodes </li></ul><ul><li>Data Processing </li></ul><ul><li>Map Reduce – Java API </li></ul><ul><li>Pig – Procedural programming language </li></ul><ul><li>PigSQL – SQL </li></ul><ul><li>Process Management </li></ul><ul><li>Workflow; services for registering & discovering published data sets, moving/archiving data, grid management, etc. </li></ul>
    19. 19. SHERPA * Yahoo!’s Serving Store * The system also known as PNUTS
    20. 20. What is Sherpa? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database Geographic replication Structured, flexible schemas Hashed and ordered tables Hosted, managed infrastructure E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E
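The hashed versus ordered tables this slide mentions differ in how a router maps a key to a tablet. The sketch below is illustrative only (not Sherpa's routing code): hashed tables partition the hash space, while ordered tables partition the key space so range scans touch only adjacent tablets.

```python
# Illustrative key-to-tablet routing for hashed vs. ordered tables.
import bisect
import hashlib

def tablet_for_hashed(key, num_tablets):
    # Stable hash of the key, mapped onto a fixed number of tablets.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_tablets

def tablet_for_ordered(key, boundaries):
    # boundaries holds the sorted lower bound of each tablet's key range.
    return bisect.bisect_right(boundaries, key) - 1

boundaries = ["", "h", "p"]   # tablet 0: ["", "h"), 1: ["h", "p"), 2: ["p", ...)
t_apple = tablet_for_ordered("apple", boundaries)   # tablet 0
t_mango = tablet_for_ordered("mango", boundaries)   # tablet 1
```

Hashing spreads load uniformly but destroys key order; ordered tables keep ranges scannable but need splitting/balancing to avoid hotspots.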
    21. 21. Key Components <ul><li>Maintains map from database.table.key-to-tablet-to-SU </li></ul><ul><li>Provides load balancing </li></ul><ul><ul><li>Caches the maps from the TC </li></ul></ul><ul><ul><li>Routes client requests to correct SU </li></ul></ul><ul><ul><li>Stores records </li></ul></ul><ul><ul><li>Services get/set/delete requests </li></ul></ul>
    22. 22. Architecture Storage units Routers Tablet Controller REST API Clients Local region Remote regions Tribble
    23. 23. Updates Write key k Sequence # for key k Sequence # for key k Write key k SUCCESS Write key k Routers Message brokers 1 2 Write key k 7 8 SU SU SU 3 4 5 6
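The numbered write path above can be sketched as follows. This is a hypothetical simplification of the slide's flow: the router forwards the write to the record's master storage unit, a sequence number is obtained via the message broker, the write commits on the master, and the broker later delivers it asynchronously to the other replicas.

```python
# Hypothetical sketch of the Sherpa write path: master SU + message broker.
class MessageBroker:
    def __init__(self):
        self.seq = 0
        self.log = []                     # committed updates, in order
    def next_sequence(self):
        self.seq += 1
        return self.seq
    def publish(self, update):
        self.log.append(update)           # delivered async to remote regions

def write(broker, master_su, key, value):
    seq = broker.next_sequence()          # serialize writes to this record
    master_su[key] = (seq, value)         # commit on the master SU first
    broker.publish((key, seq, value))     # fan out to replicas later
    return "SUCCESS"                      # client sees success at this point

broker = MessageBroker()
master = {}
status = write(broker, master, "k", "v1")
write(broker, master, "k", "v2")
```

The key point the slide makes is that the client gets SUCCESS once the master and broker have the write; remote-region replication happens off the critical path.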
    24. 24. Accessing Data SU SU SU Get key k 1 2 Get key k 3 Record for key k 4 Record for key k
    25. 25. ASYNCHRONOUS REPLICATION AND CONSISTENCY
    26. 26. Asynchronous Replication
    27. 27. Consistency Model <ul><li>If copies are asynchronously updated, what can we say about stale copies? </li></ul><ul><ul><li>ACID transactions: Require synchronous updates </li></ul></ul><ul><ul><li>Eventual consistency: Copies can drift apart, but will eventually converge if the system is allowed to quiesce </li></ul></ul><ul><ul><ul><li>To what value will copies converge? </li></ul></ul></ul><ul><ul><ul><li>Do systems ever “quiesce”? </li></ul></ul></ul><ul><ul><li>Is there any middle ground? </li></ul></ul>
    28. 28. Example: Social Alice West East ___ Busy Free Free Record Timeline (Network fault, updt goes to East) (Alice logs on) User Status Alice Busy User Status Alice Free User Status Alice ??? User Status Alice ??? User Status Alice Busy User Status Alice ___
    29. 29. Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Write Current version Stale version Stale version Achieved via per-record primary copy protocol (To maximize availability, record masterships automatically transferred if site fails) Can be selectively weakened to eventual consistency (local writes that are reconciled using version vectors)
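The version-vector reconciliation mentioned for the weakened, eventually consistent mode works roughly as sketched below (an illustration, not Sherpa's implementation): each region keeps a per-region counter, one write dominates another only if its vector is at least as large in every component, and otherwise the writes are concurrent and need application-level merging.

```python
# Sketch of version-vector reconciliation for eventually consistent writes.
def dominates(vv_a, vv_b):
    """True if vector vv_a is >= vv_b in every region's component."""
    regions = set(vv_a) | set(vv_b)
    return all(vv_a.get(r, 0) >= vv_b.get(r, 0) for r in regions)

def reconcile(a, b):
    """Each value is (version_vector, data); return the winner or a conflict."""
    if dominates(a[0], b[0]):
        return a
    if dominates(b[0], a[0]):
        return b
    return "conflict"                 # concurrent writes: the app must merge

newer = ({"west": 2, "east": 1}, "free")
older = ({"west": 1, "east": 1}, "busy")
concurrent = ({"west": 1, "east": 2}, "away")
winner = reconcile(newer, older)      # "free" strictly dominates "busy"
clash = reconcile(newer, concurrent)  # neither dominates: conflict
```

This is exactly the trade the Alice example illustrates: without a single record master, concurrent "Busy"/"Free" writes have no inherent priority.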
    30. 30. Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Current version Stale version Stale version Read In general, reads are served using a local copy
    31. 31. Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Read up-to-date Current version Stale version Stale version But application can request and get current version
    32. 32. OPERABILITY
    33. 33. Distribution Distribution for parallelism Data shuffling for load balancing Server 1 Server 2 Server 3 Server 4 Bike $86 6/2/07 636353 Chair $10 6/5/07 662113 Couch $570 6/1/07 424252 Car $1123 6/1/07 256623 Lamp $19 6/7/07 121113 Bike $56 6/9/07 887734 Scooter $18 6/11/07 252111 Hammer $8000 6/11/07 116458
    34. 34. Tablet Splitting and Balancing Each storage unit has many tablets (horizontal partitions of the table) Tablets may grow over time Overfull tablets split Storage unit may become a hotspot Shed load by moving tablets to other servers Storage unit Tablet
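The split-and-shed policy above can be sketched with two small functions. The thresholds and data shapes here are made-up illustrations of the slide's idea: an overfull tablet splits at its median key, and a hot storage unit sheds whole tablets to the least loaded server.

```python
# Toy sketch of tablet splitting and load shedding.
MAX_TABLET_SIZE = 4                       # records per tablet (toy value)

def maybe_split(tablet):
    """tablet is a sorted list of keys; split at the median if overfull."""
    if len(tablet) <= MAX_TABLET_SIZE:
        return [tablet]
    mid = len(tablet) // 2
    return [tablet[:mid], tablet[mid:]]

def shed_load(servers):
    """servers maps name -> list of tablets; move one tablet hot -> cold."""
    load = lambda s: sum(len(t) for t in servers[s])
    hot = max(servers, key=load)
    cold = min(servers, key=load)
    if hot != cold and servers[hot]:
        servers[cold].append(servers[hot].pop())
    return servers

parts = maybe_split(["a", "b", "c", "d", "e", "f"])       # two tablets of 3
servers = shed_load({"su1": [["a", "b"], ["c", "d"]], "su2": []})
```

Moving whole tablets (rather than individual records) is what makes rebalancing cheap enough to do online, which is the elasticity point the slide is making.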
    35. 35. Mastering A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 E C 66354 W D 12352 E E 75656 C F 15677 E C 66354 W B 42521 E A 42342 E D 12352 E E 75656 C F 15677 E
    36. 36. Record vs. Tablet Master A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E Record master serializes updates Tablet master serializes inserts A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 E C 66354 W D 12352 E E 75656 C F 15677 E C 66354 W B 42521 E A 42342 E D 12352 E E 75656 C F 15677 E
    37. 37. Coping With Failures A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E X X OVERRIDE W -> E
    38. 38. COMPARING SYSTEMS
    39. 39. Example <ul><li>User database </li></ul><ul><li>Workload: </li></ul>Data workloads Read-heavy *but: occasional writes UserID Fname Lname Country Sex Birthday MailLoc Avatar Passwd … bradtm Brad McMillen USA M 7/1/74 /m2/u2 guy3.gif jtr33KKa … cooperb Brian Cooper USA M 6/29/75 /m1/u4 guy2.gif Ffa3rrw1 … sangeeta Sangeetha Seshadri India F 8/13/82 /m3/u8 lady9.gif 3uiU$422 … … … … … … … … … … … OLTP (Random access to a few records)
    40. 40. Which System is Better? <ul><li>Likely will work well: </li></ul><ul><li>Likely won’t work well: </li></ul><ul><li>Row oriented </li></ul><ul><li>Supports random writes </li></ul><ul><li>Scale-out via partitioning </li></ul>UDB <ul><li>Scans of large files, not point access </li></ul><ul><li>Scans of columns, not point access </li></ul><ul><li>Scans of large tables, not point access </li></ul><ul><li>Write, not read, optimized </li></ul>Sherpa
    41. 41. Why System X? <ul><li>Example: MySQL/Oracle/Sherpa </li></ul><ul><ul><ul><ul><li>Page-based buffer manager </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Page management is optimized for random access </li></ul></ul></ul></ul>MySQL node Disk RAM Read Buffer Page B-Tree index Disk Page Disk Page Disk Page Disk Page Disk Page Disk Page Disk Page Disk Page Disk Page
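The page-based buffer manager on this slide behaves roughly like the sketch below (a generic LRU cache, not MySQL's or Oracle's actual buffer pool): a fixed number of page frames live in RAM, the least-recently-used page is evicted on a miss, and random point reads of hot pages are served from memory.

```python
# Generic sketch of a page-based read buffer with LRU eviction.
from collections import OrderedDict

class PageBuffer:
    def __init__(self, frames):
        self.frames = frames
        self.pages = OrderedDict()            # page_id -> page, LRU order
        self.hits = self.misses = 0

    def read(self, page_id, disk):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # mark most-recently-used
            self.hits += 1
        else:
            self.misses += 1
            if len(self.pages) >= self.frames:
                self.pages.popitem(last=False)  # evict the LRU page
            self.pages[page_id] = disk[page_id]
        return self.pages[page_id]

disk = {p: f"page-{p}" for p in range(10)}
buf = PageBuffer(frames=2)
buf.read(1, disk); buf.read(2, disk); buf.read(1, disk)   # hit on page 1
buf.read(3, disk)                                          # evicts page 2
```

This is why such systems favor read-heavy point-access workloads: a skewed read distribution mostly hits buffered pages.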
    42. 42. Why Not System Y? <ul><li>Example: Cassandra </li></ul><ul><ul><ul><li>Write optimized, not read optimized </li></ul></ul></ul><ul><ul><ul><li>Updates are written sequentially </li></ul></ul></ul><ul><ul><ul><li>Reads must reconstruct from multiple places on disk </li></ul></ul></ul>Cassandra node Disk RAM Log SSTable file Memtable Update (later) SSTable file SSTable file
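The log-structured write path on this slide can be modeled in miniature. This is a toy, not Cassandra (which adds commit logs, bloom filters, compaction, and more): updates go to an in-memory memtable, the memtable is flushed into immutable SSTable files, and a read may have to consult the memtable plus several SSTables, newest first.

```python
# Toy model of a memtable + SSTable store (write-optimized layout).
class LogStructuredStore:
    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []                    # flushed tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.memtable[key] = value            # sequential, cheap
        if len(self.memtable) >= self.memtable_limit:
            self.sstables.append(dict(self.memtable))   # flush: immutable file
            self.memtable = {}

    def read(self, key):
        if key in self.memtable:              # check memory first
            return self.memtable[key]
        for table in reversed(self.sstables): # then SSTables, newest first
            if key in table:
                return table[key]
        return None

store = LogStructuredStore()
store.write("a", 1)
store.write("b", 2)                           # triggers a flush
store.write("a", 3)                           # newer value lives in memtable
```

The read path is the cost of the cheap writes: a single get may touch several on-disk files, which is the "write optimized, not read optimized" trade-off the slide names.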
    43. 43. Comparison Overview <ul><li>Setup </li></ul><ul><ul><li>Six server-class machines </li></ul></ul><ul><ul><ul><li>8 cores (2 x quadcore) 2.5 GHz CPUs, RHEL 4, Gigabit ethernet </li></ul></ul></ul><ul><ul><ul><li>8 GB RAM </li></ul></ul></ul><ul><ul><ul><li>6 x 146GB 15K RPM SAS drives in RAID 1+0 </li></ul></ul></ul><ul><ul><li>Plus extra machines for clients, routers, controllers, etc. </li></ul></ul><ul><li>Workloads </li></ul><ul><ul><li>120 million 1 KB records = 20 GB per server </li></ul></ul><ul><ul><li>Write heavy workload: 50/50 read/update </li></ul></ul><ul><ul><li>Read heavy workload: 95/5 read/update </li></ul></ul><ul><li>Metrics </li></ul><ul><ul><li>Latency versus throughput curves </li></ul></ul>
    44. 44. Preliminary Results Sherpa
    45. 45. YCS Benchmark <ul><li>Will distribute Java application, plus extensible benchmark suite </li></ul><ul><ul><li>Many systems have Java APIs </li></ul></ul><ul><ul><li>Non-Java systems can support REST </li></ul></ul><ul><li>Scenario file </li></ul><ul><li>R/W mix </li></ul><ul><li>Record size </li></ul><ul><li>Data set </li></ul><ul><li>… </li></ul><ul><li>Command-line parameters </li></ul><ul><li>DB to use </li></ul><ul><li>Target throughput </li></ul><ul><li>Number of threads </li></ul><ul><li>… </li></ul>YCSB client DB client Client threads Stats Scenario executor Cloud DB Extensible: plug in new clients Extensible: define new scenarios
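The benchmark client structure described above can be sketched as follows. Class and parameter names here are illustrative, not the real YCSB API: a scenario fixes the read/write mix and record set, operations run against a pluggable DB client, and the stats layer records per-operation latencies.

```python
# Illustrative sketch of an extensible cloud-DB benchmark client.
import random
import time

class InMemoryDB:
    """A trivial pluggable 'cloud DB' client for the sketch."""
    def __init__(self):
        self.data = {}
    def read(self, key):
        return self.data.get(key)
    def update(self, key, value):
        self.data[key] = value

def run_scenario(db, operations=1000, read_fraction=0.95, seed=7):
    rng = random.Random(seed)             # deterministic toy workload
    latencies = {"read": [], "update": []}
    for i in range(operations):
        key = f"user{rng.randrange(100)}"
        op = "read" if rng.random() < read_fraction else "update"
        start = time.perf_counter()
        if op == "read":
            db.read(key)
        else:
            db.update(key, f"value{i}")
        latencies[op].append(time.perf_counter() - start)
    return latencies

stats = run_scenario(InMemoryDB())        # the 95/5 read-heavy mix
total_ops = len(stats["read"]) + len(stats["update"])
```

Swapping in a different `db` object is the extensibility point: the scenario executor and stats collection stay fixed while the client plugin changes per system under test.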
    46. 46. Benchmark Tiers <ul><li>Tier 1: Cluster performance </li></ul><ul><ul><li>Tests fundamental design independent of replication/scale-out/availability </li></ul></ul><ul><li>Tier 2: Replication </li></ul><ul><ul><li>Local replication </li></ul></ul><ul><ul><li>Geographic replication </li></ul></ul><ul><li>Tier 3: Scale out </li></ul><ul><ul><li>Scale out the number of machines by a factor of 10 </li></ul></ul><ul><ul><li>Scale out the number of replicas </li></ul></ul><ul><li>Tier 4: Availability </li></ul><ul><ul><li>Kill system components </li></ul></ul>If you’re interested in this, please contact me or cooperb@yahoo-inc.com
    47. 47. EDGE, CLOUD SERVING, AND MORE
    48. 48. Cloud Serving: The Integrated Cloud Big idea: Declarative language for specifying the full, end-to-end structure of a service Key insight: this “full, end-to-end structure” includes multiple environments Central mechanism: the Integrated Cloud , which (re)deploys these specifications ( Surendra Reddy, 2:20 pm, Session 7) Development, multiple testing environments, alpha and bucket-testing environments, production
    49. 49. Foundational Components <ul><li>An abstract type system for describing the application protocol in an implementation neutral manner </li></ul><ul><li>Format descriptions for resources, entrypoints, bindings, etc. </li></ul><ul><li>Application-layer protocols for establishing the user's account in a region or across regions; managing services and moving them across regions </li></ul><ul><li>A security model describing trust relationships between participating entities, and guidelines for the use of existing authentication and confidentiality mechanisms </li></ul><ul><li>An application-layer protocol for identifying entities, and requesting information about them </li></ul>
    50. 50. <ul><li>Allocation, Provisioning, and Metering of Cloud Resources </li></ul><ul><li>Definition, Deployment, and Life Cycle Management of Cloud Services </li></ul><ul><li>Virtual Machine Management </li></ul><ul><li>Metadata/Registry for Cloud Services </li></ul>OpenCAP: Open Cloud Access Protocol
    51. 51. Edge Services <ul><li>Yahoo! Traffic Server (Chuck Neerdaels, Wed, Session 3, @ 9am) </li></ul><ul><li>Highly scalable – over 3,500 req/sec per YTS box </li></ul><ul><li>Plug-in architecture </li></ul><ul><li>Core for multiple edge products </li></ul><ul><li>Caching Services </li></ul><ul><li>E.g., YCS </li></ul><ul><li>Web Acceleration, DNS and Load Balancing </li></ul><ul><li>YCPI: Connection proxy that reduces serving latencies </li></ul><ul><li>Brooklyn: Highly available, location-aware DNS </li></ul><ul><li>Bronx: Application-level load balancer </li></ul>
    52. 52. YCS Scale YCS handles so much Yahoo traffic, this is noise!
    53. 53. YQL <ul><li>Thousands of Web Services and sources provide valuable data </li></ul><ul><li>They require developers to read documentation and form URLs/queries </li></ul><ul><li>Data is isolated </li></ul><ul><li>Data needs filtering, combining, tweaking, and shaping even after it gets to the developer </li></ul>
    54. 54. YQL <ul><li>Hosted web service and SQL-Like Language </li></ul><ul><ul><ul><li>Familiar to developers </li></ul></ul></ul><ul><ul><ul><ul><li>Synonymous with Data access </li></ul></ul></ul></ul><ul><ul><ul><li>Expressive enough to get the right data. </li></ul></ul></ul><ul><li>Self describing - show, desc table </li></ul><ul><li>Allows you to query, filter, join and update data across any structured data on the web / web services </li></ul><ul><ul><ul><li>And Yahoo’s Sherpa cloud storage </li></ul></ul></ul><ul><li>Real time engine </li></ul>(Jonathan Trevor, Session 3, 9:10 am)
    55. 55. New in 2010! <ul><li>SIGMOD and SIGOPS are starting a new annual conference (co-located with SIGMOD in 2010): </li></ul><ul><li>ACM Symposium on Cloud Computing (SoCC) </li></ul><ul><li>PC Chairs: Surajit Chaudhuri & Mendel Rosenblum </li></ul><ul><li>GC: Joe Hellerstein Treasurer: Brian Cooper </li></ul><ul><li>Steering committee: Phil Bernstein, Ken Birman, Joe Hellerstein, John Ousterhout, Raghu Ramakrishnan, Doug Terry, John Wilkes </li></ul>
    56. 56. TUESDAY, 11/3 9:10 am – 9:55 am Yahoo! Query Language - YQL: Select * from Internet [Including Live Demo!] Dr. Jonathan Trevor Senior Architect TUESDAY, 11/3 2:20 pm – 3:05 pm Walking through Cloud Serving at Yahoo! Surendra Reddy VP, Integrated Cloud and Visualization TUESDAY, 11/3 4:50pm – 5:35 pm Hadoop @ Yahoo! – Internet Scale Data Processing Eric Baldeschwieler VP, Hadoop Software Development WEDNESDAY, 11/4 9:10 am - 9:55 am Yahoo! Scalable Storage and Delivery Services Chuck Neerdaels VP, Storage and Edge Services VISIT BOOTH #103 TO TALK WITH YAHOO! ENGINEERS AND LEARN MORE ABOUT YAHOO!’S VISION FOR CLOUD COMPUTING.
