Your SlideShare is downloading. ×
0
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Content Optimization on Yahoo! Front Page
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Content Optimization on Yahoo! Front Page

1,167

Published on

Published in: Technology, Business
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
1,167
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
18
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide
  • Need some slides up front to layout what our definition of a cloud entails (since it’s a rapidly emerging buzzword that has no intrinsic definition. )
  • Need some slides up front to layout what our definition of a cloud entails (since it’s a rapidly emerging buzzword that has no intrinsic definition. )
  • SearchMonkey Messaging: • SearchMonkey is another proof point in Yahoo!’s big bets strategy o Openness – SearchMonkey will create access to the structured data that the web is based on and build a better experience for Yahoo! Search users o Partner of choice - Yahoo! Search is opening up its platform to any size site owner or developer to play in the search ecosystem with self-service tools o Insights – site owners will have better insight to search traffic drivers; better able to create targeted results for better engagement • Yahoo! Search is a leading starting point on the Web for users and the first major search engine to open its search page to site owners and developers and allow them to create visibly differentiated search results. • By opening up the search platform, Yahoo! Search improves the way search engines display results to the benefit of users, site owner and developers.
  • About Medium Founded in 2006 in Boulder, Colorado, Me.dium is a social browsing software company offering a browser extension that allows people to surf with friends for the first time. By revealing this new Social Exploration Environment TM (SEE), Me.dium graphically connects users with their friends and others enabling users to interact online, similarly to how one interacts with people in the real world. Me.dium’s Social Search Alpha is the first crowd-powered search engine.   Social Search enables people to find relevant information based on the activity of crowds right now.  See an entirely new layer of information over and above what traditional search provides.  You get the hottest, most active content related to your query - as determined by the current activity of the crowds. Me.dium combined the BOSS API with it’s insight into the real time surfing activity of the crowds to build a unique "Crowd-Powered" social search engine prototype.
  • OK
  • A project in Y!R focused on a long-range problem, origins in earlier work at Wisconsin. Basis for the Goldrush hack, which won the recent Local hack competition, and could contribute to creation/refinement of Y! Local content and Next Gen Search.
  • Transcript

    • 1. An Overview of Cloud Computing @ Yahoo! Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing Research Fellow, Yahoo! Research Reflects many discussions with: Eric Baldeschwieler, Jay Kistler, Chuck Neerdaels, Shelton Shugar, and Raymie Stata and joint work with the Sherpa team, in particular: Brian Cooper, Utkarsh Srivastava, Adam Silberstein, Rodrigo Fonseca and Nick Puz in Y! Research Chuck Neerdaels, P.P. Suryanarayanan and many others in CCDI
    • 2. Questions <ul><li>What is cloud computing? </li></ul><ul><ul><li>Horizontal and functional services </li></ul></ul><ul><li>What’s it going to change? </li></ul><ul><ul><li>Software business models, science, life </li></ul></ul><ul><li>How many clouds will there be? </li></ul><ul><ul><li>1, 2, 3, infinity </li></ul></ul><ul><li>What’s new in cloud computing? </li></ul><ul><ul><li>HPC grids, ASPs, hosted services, Multics (!) </li></ul></ul><ul><ul><li>Emerging “cloud stack” to support a broad class of programs, including data intensive applications </li></ul></ul>
    • 3. SCENARIOS <ul><li>Pie-in-the-sky </li></ul>
    • 4. Living in the Clouds <ul><li>We want to start a new website, FredsList.com </li></ul><ul><li>Our site will provide listings of items for sale, jobs, etc. </li></ul><ul><li>As time goes on, we’ll add more features </li></ul><ul><ul><li>And illustrate how more cloud capabilities (and corresponding infrastructure components) are used as needed </li></ul></ul><ul><ul><ul><li>List of capabilities/components is illustrative, not exhaustive </li></ul></ul></ul><ul><li>Our cloud provides a “dataset” abstraction </li></ul><ul><ul><li>FredsList doesn’t worry about the underlying components </li></ul></ul>
    • 5. Step 1: Listings Scenario Simple Web Service API’s Database PNUTS FredsList.com application 1234323, transportation, For sale: one bicycle, barely used FredsList wants to store listings as (key, category, description) 5523442, childcare, Nanny available in San Jose 215534, wanted, Looking for issue 1 of Superman comic book DECLARE DATASET Listings AS ( ID String PRIMARY KEY, Category String, Description Text )
    • 6. Step 2: System Evolution Simple Web Service API’s Database PNUTS FredsList.com application 1234323, transportation, For sale: one bicycle, barely used Fred belatedly realizes prices are useful information! 5523442, childcare, Nanny available in San Jose 215534, wanted, Looking for issue 1 of Superman comic book ALTER DATASET Listings ADD (Price Float) Schemas are flexible, and evolve 32138, camera, Nikon D40, USD 300 Not every record in a dataset has values defined for all fields declared for the dataset vs.
    • 7. Step 3: Search Simple Web Service API’s Database PNUTS “ bicycle” FredsList’s customers quickly ask for keyword search Search Vespa “ dvd’s” “ nanny” FredsList.com application ALTER Listings SET Description SEARCHABLE Messaging Tribble Federation of systems offering different capabilities
    • 8. Step 4: Photos Simple Web Service API’s Database PNUTS FredsList decides to add photos/videos to listings Search Vespa Storage MObStor Foreign key photo -> listing FredsList.com application ALTER Listings ADD Photo BLOB Messaging Tribble Federation of systems offering different performance points
    • 9. Step 5: Data Analysis Simple Web Service API’s Database PNUTS FredsList wants to analyze its listings to get statistics about category, do geocoding, etc. Search Vespa Storage MObStor Foreign key photo -> listing FredsList.com application ALTER Listings MAKE ANALYZABLE Compute Grid Batch export Pig query to analyze categories Hadoop program to geocode data Hadoop program to generate fancy pages for listings Messaging Tribble
    • 10. Step 6: Performance Simple Web Service API’s Database PNUTS FredsList wants to reduce its data access latency Search Vespa Messaging Tribble Storage MObStor Foreign key photo -> listing FredsList.com application ALTER Listings MAKE CACHEABLE Compute Grid Batch export Caching memcached And by now, Fred is global, and wants geo-replication!
    • 11. Data Serving vs. Analysis <ul><li>Very different workloads, requirements </li></ul><ul><li>Data from serving system is one of many kinds of data (click streams are another common kind, as are syndicated feeds) to be analyzed and integrated </li></ul><ul><li>The result of analysis often goes right back into serving system </li></ul>
    • 12. EYES TO THE SKIES <ul><li>Motherhood-and-Apple-Pie </li></ul>
    • 13. Why Clouds? <ul><li>On-demand infrastructure to create a fundamental shift in the OE curve: </li></ul><ul><ul><li>Do things we can’t do </li></ul></ul><ul><ul><li>Build more robustly, more efficiently, more globally, more completely, more quickly, for a given budget </li></ul></ul><ul><li>Cloud services should do heavy lifting of heavy-lifting of scaling & high-availability </li></ul><ul><ul><li>Today, this is done at the app-level, which is not productive </li></ul></ul>
    • 14. Requirements for Cloud Services <ul><li>Multitenant. A cloud service must support multiple, organizationally distant customers. </li></ul><ul><li>Elasticity. Tenants should be able to negotiate and receive resources/QoS on-demand . </li></ul><ul><li>Resource Sharing. Ideally, spare cloud resources should be transparently applied when a tenant’s negotiated QoS is insufficient, e.g., due to spikes. </li></ul><ul><li>Horizontal scaling. It should be possible to add cloud capacity in small increments; this should be transparent to the tenants of the service. </li></ul><ul><li>Metering. A cloud service must support accounting that reasonably ascribes operational and capital expenditures to each of the tenants of the service. </li></ul><ul><li>Security. A cloud service should be secure in that tenants are not made vulnerable because of loopholes in the cloud. </li></ul><ul><li>Availability. A cloud service should be highly available. </li></ul><ul><li>Operability. A cloud service should be easy to operate, with few operators. Operating costs should scale linearly or better with the capacity of the service. </li></ul>
    • 15. Types of Cloud Services <ul><li>Two kinds of cloud services: </li></ul><ul><ul><li>Horizontal (“Platform”) Cloud Services </li></ul></ul><ul><ul><ul><li>Functionality enabling tenants to build applications or new services on top of the cloud </li></ul></ul></ul><ul><ul><li>Functional Cloud Services </li></ul></ul><ul><ul><ul><li>Functionality that is useful in and of itself to tenants. E.g., various SaaS instances, such as Saleforce.com; Google Analytics and Yahoo!’s IndexTools; Yahoo! properties aimed at end-users and small businesses, e.g., flickr, Groups, Mail, News, Shopping </li></ul></ul></ul><ul><ul><ul><li>Could be built on top of horizontal cloud services or from scratch </li></ul></ul></ul><ul><ul><ul><li>Yahoo! has been offering these for a long while (e.g., Mail for SMB, Groups, Flickr, BOSS, Ad exchanges) </li></ul></ul></ul>
    • 16. Opening Up Yahoo! Search <ul><li>Phase 1 </li></ul>Phase 2 Giving site owners and developers control over the appearance of Yahoo! Search results. BOSS takes Yahoo!’s open strategy to the next level by providing Yahoo! Search infrastructure and technology to developers and companies to help them build their own search experiences.
    • 17. Search Results of the Future babycenter epicurious yelp.com answers.com LinkedIn webmd Gawker New York Times
    • 18. BOSS Offerings API A self-service, web services model for developers and start-ups to quickly build and deploy new search experiences. BOSS offers two options for companies and developers and has partnered with top technology universities to drive search experimentation, innovation and research into next generation search. <ul><ul><li>University of Illinois Urbana Champaign </li></ul></ul><ul><ul><li>Carnegie Mellon University </li></ul></ul><ul><ul><li>Stanford University </li></ul></ul><ul><ul><li>Purdue University </li></ul></ul><ul><ul><li>• MIT </li></ul></ul><ul><ul><li>Indian Institute of </li></ul></ul><ul><ul><li>Technology Bombay </li></ul></ul><ul><ul><li>University of </li></ul></ul><ul><ul><li>Massachusetts </li></ul></ul>CUSTOM Working with 3rd parties to build a more relevant, brand/site specific web search experience. This option is jointly built by Yahoo! and select partners. <ul><ul><li>ACADEMIC </li></ul></ul><ul><ul><li>Working with the following universities to allow for wide-scale research in the search field: </li></ul></ul>(Slide courtesy Prabhakar Raghavan)
    • 19. Partner Examples
    • 20. Horizontal Cloud Services <ul><li>Horizontal cloud services are foundations on which tenants build applications or new services. They should be: </li></ul><ul><ul><li>Semantics-free. Must be &quot;generic infrastructure,” and not tied to specific app-logic. </li></ul></ul><ul><ul><ul><li>May provide the ability to inject application logic through well-defined APIs </li></ul></ul></ul><ul><ul><li>Broadly applicable. Must be broadly applicable (i.e., it can't be intended for just one or two properties). </li></ul></ul><ul><ul><li>Fault-tolerant over commodity hardware. Must be built using inexpensive commodity hardware, and should mask component failures. </li></ul></ul><ul><li>While each cloud service provides value, the power of the cloud paradigm will depend on a collection of well-chosen, loosely coupled services that collectively make it easy to quickly develop and operate innovative web applications. </li></ul>
    • 21. What’s in the Horizontal Cloud? Shared Infrastructure Horizontal Cloud Services Edge Content Services e.g., YCS, YCPI Provisioning & Virtualization e.g., EC2 Batch Storage & Processing e.g., Hadoop & Pig Operational Storage e.g., S3, MObStor, Sherpa Other Services Messaging, Workflow, virtual DBs & Webserving Simple Web Service API’s Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization ID & Account Management Monitoring & QoS Metering, Billing, Accounting Security
    • 22. Yahoo! Cloud Stack Horizontal Cloud Services … YCS YCPI Brooklyn EDGE Horizontal Cloud Services … Hadoop BATCH Horizontal Cloud Services … Sherpa MOBStor STORAGE Horizontal Cloud Services VM/OS … APP Horizontal Cloud Services VM/OS yApache WEB Data Highway Serving Grid PHP App Engine Provisioning (Self-serve) Monitoring/Metering/Security
    • 23. Yahoo! CCDI Thrust Areas <ul><li>Fast Provisioning and Machine Virtualization: On demand, deliver a set of hosts imaged with desired software and configured against standard services </li></ul><ul><ul><li>Multiple hosts may be multiplexed onto the same physical machine. </li></ul></ul><ul><li>Batch Storage and Processing: Scalable data storage optimized for batch processing, together with computational capabilities </li></ul><ul><li>Operational Storage: Persistent storage that supports low-latency updates and flexible retrieval </li></ul><ul><li>Edge Content Services: Support for dealing with network topology, communication protocols, caching, and BCP </li></ul>Rest of today’s talk
    • 24. Web Data Management Large data analysis (Hadoop) Structured record storage (PNUTS/Sherpa) Blob storage (SAN/NAS) <ul><li>Scan oriented workloads </li></ul><ul><li>Focus on sequential disk I/O </li></ul><ul><li>$ per cpu cycle </li></ul><ul><li>CRUD </li></ul><ul><li>Point lookups and short scans </li></ul><ul><li>Index organized table and random I/Os </li></ul><ul><li>$ per latency </li></ul><ul><li>Object retrieval and streaming </li></ul><ul><li>Scalable file storage </li></ul><ul><li>$ per GB </li></ul>
    • 25. Hadoop: Batch Storage/Analysis <ul><li>Why is batch processing important? </li></ul><ul><li>Whether it’s </li></ul><ul><ul><li>response-prediction for advertising </li></ul></ul><ul><ul><li>machine-learned relevance for Search, or </li></ul></ul><ul><ul><li>content optimization for audience, </li></ul></ul><ul><ul><li>data-intensive computing is increasingly central to everything Yahoo! does </li></ul></ul><ul><ul><li>Hadoop is central to addressing this need </li></ul></ul><ul><li>Hadoop is a case-study in our cloud vision </li></ul><ul><ul><li>Processes enormous amounts of data </li></ul></ul><ul><ul><li>Provides horizontal scaling and fault-tolerance for our users </li></ul></ul><ul><ul><li>Allows those users to focus on their app logic </li></ul></ul>[Workflow] HDFS Map-Reduce High-level query layer (Pig)
    • 26. The World Has Changed <ul><li>Web serving applications need: </li></ul><ul><ul><li>Scalability! </li></ul></ul><ul><ul><ul><li>Preferably elastic </li></ul></ul></ul><ul><ul><li>Flexible schemas </li></ul></ul><ul><ul><li>Geographic distribution </li></ul></ul><ul><ul><li>High availability </li></ul></ul><ul><ul><li>Reliable storage </li></ul></ul><ul><li>Web serving applications can do without: </li></ul><ul><ul><li>Complicated queries </li></ul></ul><ul><ul><li>Strong transactions </li></ul></ul>
    • 27. MObStor <ul><li>Yahoo!’s next-generation globally replicated, virtualized media object storage service </li></ul><ul><li>Better provisioning, easy migration, replication, better BCP, and performance </li></ul><ul><li>New features (Evergreen URLs, CDN integration, REST API, …) </li></ul><ul><li>The object metadata problem addressed using Sherpa, though MObStor is focused on blob storage. </li></ul>
    • 28. Storage & Delivery Stack
    • 29. PNUTS / SHERPA To Help You Scale Your Mountains of Data
    • 30. CCDI—Research Collaboration <ul><li>Yahoo! Research </li></ul><ul><li>Raghu Ramakrishnan </li></ul><ul><li>Brian Cooper </li></ul><ul><li>Utkarsh Srivastava </li></ul><ul><li>Adam Silberstein </li></ul><ul><li>Rodrigo Fonseca </li></ul><ul><li>CCDI </li></ul><ul><li>Chuck Neerdaels </li></ul><ul><li>P.P.S. Narayan </li></ul><ul><li>Kevin Athey </li></ul><ul><li>Toby Negrin </li></ul><ul><li>Plus Dev/QA teams </li></ul>
    • 31. Yahoo! Serving Storage Problem <ul><ul><li>Small records – 100KB or less </li></ul></ul><ul><ul><li>Structured records – lots of fields, evolving </li></ul></ul><ul><ul><li>Extreme data scale - Tens of TB </li></ul></ul><ul><ul><li>Extreme request scale - Tens of thousands of requests/sec </li></ul></ul><ul><ul><li>Low latency globally - 20+ datacenters worldwide </li></ul></ul><ul><ul><li>High Availability - outages cost $millions </li></ul></ul><ul><ul><li>Variable usage patterns - as applications and users change </li></ul></ul>
    • 32. The PNUTS/Sherpa Solution <ul><li>The next generation global-scale record store </li></ul><ul><ul><li>Record-orientation: Routing, data storage optimized for low-latency record access </li></ul></ul><ul><ul><li>Scale out: Add machines to scale throughput (while keeping latency low) </li></ul></ul><ul><ul><li>Asynchrony: Pub-sub replication to far-flung datacenters to mask propagation delay </li></ul></ul><ul><ul><li>Consistency model: Reduce complexity of asynchrony for the application programmer </li></ul></ul><ul><ul><li>Cloud deployment model: Hosted, managed service to reduce app time-to-market and enable on demand scale and elasticity </li></ul></ul>
    • 33. What is PNUTS/Sherpa? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database Geographic replication Structured, flexible schema Hosted, managed infrastructure E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E
    • 34. What Will It Become? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database Geographic replication Indexes and views Structured, flexible schema Hosted, managed infrastructure E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E
    • 35. What Will It Become? Indexes and views E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E
    • 36. <ul><li>Scalability </li></ul><ul><li>Thousands of machines </li></ul><ul><li>Easy to add capacity </li></ul><ul><li>Restrict query language to avoid costly queries </li></ul><ul><li>Geographic replication </li></ul><ul><li>Asynchronous replication around the globe </li></ul><ul><li>Low-latency local access </li></ul><ul><li>High availability and fault tolerance </li></ul><ul><li>Automatically recover from failures </li></ul><ul><li>Serve reads and writes despite failures </li></ul>Design Goals <ul><li>Consistency </li></ul><ul><li>Per-record guarantees </li></ul><ul><li>Timeline model </li></ul><ul><li>Option to relax if needed </li></ul><ul><li>Multiple access paths </li></ul><ul><li>Hash table, ordered table </li></ul><ul><li>Primary, secondary access </li></ul><ul><li>Hosted service </li></ul><ul><li>Applications plug and play </li></ul><ul><li>Share operational cost </li></ul>
    • 37. Technology Elements <ul><li>PNUTS </li></ul><ul><li>Query planning and execution </li></ul><ul><li>Index maintenance </li></ul><ul><li>Distributed infrastructure for tabular data </li></ul><ul><li>Data partitioning </li></ul><ul><li>Update consistency </li></ul><ul><li>Replication </li></ul><ul><li>YDOT FS </li></ul><ul><li>Ordered tables </li></ul>Applications <ul><li>Tribble </li></ul><ul><li>Pub/sub messaging </li></ul><ul><li>YDHT FS </li></ul><ul><li>Hash tables </li></ul><ul><li>Zookeeper </li></ul><ul><li>Consistency service </li></ul>YCA: Authorization PNUTS API Tabular API
    • 38. Data Manipulation <ul><li>Per-record operations </li></ul><ul><ul><li>Get </li></ul></ul><ul><ul><li>Set </li></ul></ul><ul><ul><li>Delete </li></ul></ul><ul><li>Multi-record operations </li></ul><ul><ul><li>Multiget </li></ul></ul><ul><ul><li>Scan </li></ul></ul><ul><ul><li>Getrange </li></ul></ul><ul><li>Web service (RESTful) API </li></ul>
    • 39. Tablets—Hash Table Apple Lemon Grape Orange Lime Strawberry Kiwi Avocado Tomato Banana Grapes are good to eat Limes are green Apple is wisdom Strawberry shortcake Arrgh! Don’t get scurvy! But at what price? How much did you pay for this lemon? Is this a vegetable? New Zealand The perfect fruit Name Description Price $12 $9 $1 $900 $2 $3 $1 $14 $2 $8 0x0000 0xFFFF 0x911F 0x2AF3
    • 40. Tablets—Ordered Table Apple Banana Grape Orange Lime Strawberry Kiwi Avocado Tomato Lemon Grapes are good to eat Limes are green Apple is wisdom Strawberry shortcake Arrgh! Don’t get scurvy! But at what price? The perfect fruit Is this a vegetable? How much did you pay for this lemon? New Zealand $1 $3 $2 $12 $8 $1 $9 $2 $900 $14 Name Description Price A Z Q H
    • 41. Flexible Schema $15 Lamp 421133 6/5/07 $1123 Car 211242 6/3/07 $86 Bike 763245 6/1/07 $570 Couch 424252 6/1/07 Price Item Listing id Posted date Red Color Fair Good Condition
    • 42. Detailed Architecture Storage units Routers Tablet Controller REST API Clients Local region Remote regions Tribble
    • 43. Tablet Splitting and Balancing Each storage unit has many tablets (horizontal partitions of the table) Tablets may grow over time Overfull tablets split Storage unit may become a hotspot Shed load by moving tablets to other servers Storage unit Tablet
    • 44. QUERY PROCESSING
    • 45. Accessing Data SU SU SU Get key k 1 2 Get key k 3 Record for key k 4 Record for key k
    • 46. Bulk Read SU SU SU Scatter/ gather server 1 {k1, k2, … kn} 2 Get k 1 Get k 2 Get k 3
    • 47. Range Queries in YDOT <ul><li>Clustered, ordered retrieval of records </li></ul>Apple Avocado Banana Blueberry Canteloupe Grape Kiwi Lemon Lime Mango Orange Strawberry Tomato Watermelon Apple Avocado Banana Blueberry Canteloupe Grape Kiwi Lemon Lime Mango Orange Strawberry Tomato Watermelon Storage unit 1 Storage unit 2 Storage unit 3 Storage unit 1 Canteloupe Storage unit 3 Lime Storage unit 2 Strawberry Storage unit 1 Router Grapefruit…Pear? Grapefruit…Lime? Lime…Pear? Storage unit 1 Canteloupe Storage unit 3 Lime Storage unit 2 Strawberry Storage unit 1
    • 48. Updates Write key k Sequence # for key k Sequence # for key k Write key k SUCCESS Write key k Routers Message brokers 1 2 Write key k 7 8 SU SU SU 3 4 5 6
    • 49. ASYNCHRONOUS REPLICATION AND CONSISTENCY
    • 50. Asynchronous Replication
    • 51. <ul><li>Goal: Make it easier for applications to reason about updates and cope with asynchrony </li></ul><ul><li>What happens to a record with primary key “Alice”? </li></ul>Consistency Model Time Record inserted Update Update Update Update Update Delete Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Update Update As the record is updated, copies may get out of sync.
    • 52. Example: Social Alice West East ___ Busy Free Free Record Timeline Busy Alice Status User Free Alice Status User ??? Alice Status User ??? Alice Status User Busy Alice Status User ___ Alice Status User
    • 53. Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Current version Stale version Stale version Read In general, reads are served using a local copy
    • 54. Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Read up-to-date Current version Stale version Stale version But application can request and get current version
    • 55. Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Read ≥ v.6 Current version Stale version Stale version Or variations such as “read forward”—while copies may lag the master record, every copy goes through the same sequence of changes
    • 56. Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Write Current version Stale version Stale version Achieved via per-record primary copy protocol (To maximize availability, record masterships automaticlly transferred if site fails) Can be selectively weakened to eventual consistency (local writes that are reconciled using version vectors)
    • 57. Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Write if = v.7 ERROR Current version Stale version Stale version Test-and-set writes facilitate per-record transactions
    • 58. Consistency Techniques <ul><li>Per-record mastering </li></ul><ul><ul><li>Each record is assigned a “master region” </li></ul></ul><ul><ul><ul><li>May differ between records </li></ul></ul></ul><ul><ul><li>Updates to the record forwarded to the master region </li></ul></ul><ul><ul><li>Ensures consistent ordering of updates </li></ul></ul><ul><li>Tablet-level mastering </li></ul><ul><ul><li>Each tablet is assigned a “master region” </li></ul></ul><ul><ul><li>Inserts and deletes of records forwarded to the master region </li></ul></ul><ul><ul><li>Master region decides tablet splits </li></ul></ul><ul><li>These details are hidden from the application </li></ul><ul><ul><li>Except for the latency impact! </li></ul></ul>
    • 59. Mastering A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E Tablet master
    • 60. Bulk Insert/Update/Replace Client Source Data Bulk manager <ul><li>Client feeds records to bulk manager </li></ul><ul><li>Bulk loader transfers records to SU’s in batches </li></ul><ul><ul><li>Bypass routers and message brokers </li></ul></ul><ul><ul><li>Efficient import into storage unit </li></ul></ul>
    • 61. Bulk Load in YDOT <ul><li>YDOT bulk inserts can cause performance hotspots </li></ul><ul><li>Solution: preallocate tablets </li></ul>
    • 62. Index Maintenance <ul><li>How to have lots of interesting indexes and views, without killing performance? </li></ul><ul><li>Solution: Asynchrony ! </li></ul><ul><ul><li>Indexes/views updated asynchronously when base table updated </li></ul></ul>
    • 63. SHERPA IN CONTEXT
    • 64. Types of Record Stores <ul><li>Query expressiveness </li></ul>Simple Feature rich Object retrieval Retrieval from single table of objects/records SQL S3 PNUTS Oracle
    • 65. Types of Record Stores <ul><li>Consistency model </li></ul>Best effort Strong guarantees Eventual consistency Timeline consistency ACID S3 PNUTS Oracle Program centric consistency Object-centric consistency
    • 66. Types of Record Stores <ul><li>Data model </li></ul>Flexibility, Schema evolution Optimized for Fixed schemas CouchDB PNUTS Oracle Consistency spans objects Object-centric consistency
    • 67. Types of Record Stores <ul><li>Elasticity (ability to add resources on demand) </li></ul>Inelastic Elastic Limited (via data distribution) VLSD (Very Large Scale Distribution /Replication) Oracle PNUTS S3
    • 68. Data Stores Comparison <ul><li>User-partitioned SQL stores </li></ul><ul><ul><li>Microsoft Azure SDS </li></ul></ul><ul><ul><li>Amazon SimpleDB </li></ul></ul><ul><li>Multi-tenant application databases </li></ul><ul><ul><li>Salesforce.com </li></ul></ul><ul><ul><li>Oracle on Demand </li></ul></ul><ul><li>Mutable object stores </li></ul><ul><ul><li>Amazon S3 </li></ul></ul><ul><li>Versus PNUTS </li></ul><ul><li>More expressive queries </li></ul><ul><li>Users must control partitioning </li></ul><ul><li>Limited elasticity </li></ul><ul><li>Highly optimized for complex workloads </li></ul><ul><li>Limited flexibility to evolving applications </li></ul><ul><li>Inherit limitations of underlying data management system </li></ul><ul><li>Object storage versus record management </li></ul>
    • 69. Application Design Space Records Files Get a few things Scan everything Sherpa MObStor Everest Hadoop YMDB MySQL Filer Oracle BigTable
    • 70. Alternatives Matrix Elastic Operability Global low latency Availability Structured access Sherpa Y! UDB MySQL Oracle HDFS BigTable Dynamo Updates Cassandra Consistency model SQL/ACID
    • 71. Further Reading Efficient Bulk Insertion into a Distributed Ordered Table (SIGMOD 2008) Adam Silberstein, Brian Cooper, Utkarsh Srivastava, Erik Vee, Ramana Yerneni, Raghu Ramakrishnan PNUTS: Yahoo!'s Hosted Data Serving Platform (VLDB 2008) Brian Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Phil Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni Asynchronous View Maintenance for VLSD Databases, Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava and Raghu Ramakrishnan SIGMOD 2009 (to appear) Cloud Storage Design in a PNUTShell Brian F. Cooper, Raghu Ramakrishnan, and Utkarsh Srivastava Beautiful Data, O’Reilly Media, 2009 (to appear)
    • 72. QUESTIONS?

    ×