Cloud Computing and ODISSEA Overviews


Published on

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Cloud Computing and ODISSEA Overviews

  1. 1. Searching the Clouds Presented by Kajal Miyan Slides courtesy: UC Berkeley RAD Lab
  2. 2. Above the Clouds A Berkeley View of Cloud Computing Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia
  3. 3. Outline <ul><li>What is it? </li></ul><ul><li>Why now? </li></ul><ul><li>Cloud killer apps </li></ul><ul><li>Economics </li></ul><ul><li>Challenges and opportunities </li></ul><ul><li>Implications </li></ul><ul><li>ODESSIA </li></ul>
  4. 4. What is Cloud Computing? <ul><li>Old idea: Software as a Service (SaaS)‏ </li></ul><ul><ul><li>Def: delivering applications over the Internet </li></ul></ul><ul><li>Recently: “[Hardware, Infrastructure, Platform] as a service” </li></ul><ul><ul><li>Poorly defined so we avoid all “X as a service” </li></ul></ul><ul><li>Utility Computing: pay-as-you-go computing </li></ul><ul><ul><li>Illusion of infinite resources </li></ul></ul><ul><ul><li>No up-front cost </li></ul></ul><ul><ul><li>Fine-grained billing (e.g. hourly) </li></ul></ul>
  5. 5. Why Now? <ul><li>Experience with very large datacenters </li></ul><ul><ul><li>Advent of technologies like Web 2.0 and Google Adsense </li></ul></ul><ul><li>Other factors </li></ul><ul><ul><li>Pervasive broadband Internet </li></ul></ul><ul><ul><li>Pay-as-you-go billing model </li></ul></ul><ul><ul><li>Standard software stack (Amazon VM)‏ </li></ul></ul><ul><ul><li>Gray's analysis: keep the data near the application </li></ul></ul>
  6. 6. Spectrum of Clouds <ul><li>Instruction Set VM (Amazon EC2, 3Tera)‏ </li></ul><ul><li>Bytecode VM (Microsoft Azure)‏ </li></ul><ul><li>Framework VM </li></ul><ul><ul><li>Google AppEngine, </li></ul></ul>Lower-level, Less management Higher-level, More management EC2 Azure AppEngine
  7. 7. Cloud Killer Apps <ul><li>Interactive Real Time applications </li></ul><ul><li>Rise of analytics </li></ul><ul><li>Extensions of desktop software </li></ul><ul><ul><li>Matlab, Mathematica </li></ul></ul><ul><li>Parallel Batch processing </li></ul><ul><ul><li>Oracle at Harvard, Hadoop at NY Times </li></ul></ul>
  8. 8. Economics of Cloud Users <ul><li>Pay by use instead of provisioning for peak </li></ul><ul><li>Static Datacenter DataCenter in Cloud </li></ul>Demand Capacity Time Demand Capacity Time
  9. 9. Adoption Challenges Challenge Opportunity Availability (DDos)‏ Multiple providers & DCs Data lock-in Standardization Data Confidentiality and Auditability Encryption, VLANs, Firewalls; Geographical Data Storage
  10. 10. Growth Challenges Challenge Opportunity Data transfer bottlenecks FedEx-ing disks, Data Backup/Archival Performance unpredictability Improved VM support, flash memory, scheduling VMs Scalable storage Invent scalable store Bugs in large distributed systems Invent Debugger that relies on Distributed VMs Scaling quickly Invent Auto-Scaler that relies on ML; Snapshots
  11. 11. Policy and Business Challenges Challenge Opportunity Reputation Fate Sharing Offer reputation-guarding services like those for email Software Licensing Pay-for-use licenses; Bulk use sales
  12. 12. Implications <ul><li>Startups and prototyping </li></ul><ul><li>Cost associativity for scientific applications </li></ul><ul><li>Research at scale </li></ul>
  13. 13. ODISSEA o pen dis tributed s earch e ngine a rchitecture <ul><ul><li>A Peer-to-Peer Architecture for Scalable Web Search and Information Retrieval </li></ul></ul><ul><ul><li>Torsten Suel, Chandan Mathur, Jo-Wen Wu, Jiangong Zhang, </li></ul></ul><ul><ul><li>Alex Delis, Mehdi Kharrazi, Xiaohui Long, Kulesh Shanmugasundaram </li></ul></ul>
  14. 14. <ul><li>“ If Distributed Systems can be used to search for aliens then why not for Words???” :D </li></ul>
  15. 15. Search Engine Revision Your Browser The Web URL1 URL2 URL3 URL4 Crawler Indexer Search Engine Database Eggs? Eggs. Eggs - 90% Eggo - 81% Ego- 40% Huh? - 10% All About Eggs by S. I. Am
  16. 16. Motivation <ul><li>Today, main part of the web search infrastructure is supplied by only a few large crawl-based search engines </li></ul><ul><li>Strong research in the field of P2P systems over the last few years </li></ul><ul><li>This raises two issues </li></ul><ul><ul><li>Vast data in P2P networks require the </li></ul></ul><ul><ul><li>ability to search in these networks </li></ul></ul><ul><ul><li>Significant computing resources provided </li></ul></ul><ul><ul><li>by a P2P system could be used to search </li></ul></ul><ul><ul><li>content residing inside or outside the system </li></ul></ul><ul><li>ODISSEA - distributed global indexing and query execution service </li></ul>image/svg+xml
  17. 17. Design Overview <ul><li>ODISSEA is different from many other approaches to P2P search </li></ul><ul><li>It assumes a two-layered search engine architecture </li></ul><ul><li>It has a global index structure distributed over the nodes of the system (In a global index, as contradiction to a local index, a single node holds the entire inverted index for a particular term)‏ </li></ul>
  18. 18. Two Layer Approach <ul><li>Lower layer provides </li></ul><ul><ul><li>maintanance of the global index structure under document insertions and updates </li></ul></ul><ul><ul><li>Maintanance of node joins and failures </li></ul></ul><ul><ul><li>Efficient execution of simple search queries` </li></ul></ul>ODISSEA queries crawler queries Search server <ul><li>Upper layer interacts with P2P-based lower layer via two classes of clients </li></ul><ul><ul><li>Update clients (e.g crawler, web server)‏ </li></ul></ul><ul><ul><li>Query clients (user implemented optimized query execution plan)‏ </li></ul></ul>WWW
  19. 19. Target Applications <ul><li>Full-text search in large document collections located within in P2P communities </li></ul><ul><li>Search in large intranet environments </li></ul><ul><li>Web Search: a powerful API supports the anticipated shift towards client-based search tools which better exploit the resources of todays desktop machines </li></ul>
  20. 20. Two Layer Approach <ul><li>Enables a large variety of (client-based) search tools that more fully exploit client computing resources. </li></ul><ul><li>Those tools could share the same lower-layer web search infrastructure. </li></ul><ul><li>Tools are developed using an open API, which accesses the search infrastructure </li></ul><ul><li>When processing a query, this could in the most general case (i.e where no pre-evaluation is done on server-side) result in large amounts of data to be transferred to the query client </li></ul>
  21. 21. Global vs. Local index <ul><ul><li>posting = [DocID, Position, additional information] </li></ul></ul><ul><ul><li>Local Index: each node creates its own index for all docs that are locally stored </li></ul></ul><ul><ul><li>Gobal Index: each node will hold a complete global postings list for a certain group of words </li></ul></ul><ul><li>Suppose a query „chair AND table“. Then the query will be processed as follows: </li></ul>
  22. 22. Global vs. Local index <ul><li>Local index organization is very inefficient in very large networks (e.g. web) if result quality is the major concern, because the query has to be transmitted to all nodes and all of them have to respond (as data is unclustered)‏ </li></ul><ul><li>But in a global index organization large amounts of data need to be transmitted between nodes when </li></ul><ul><ul><li>Initially building the index (adding new nodes)‏ </li></ul></ul><ul><ul><li>Evaluating a query  bad response time </li></ul></ul><ul><li>Can be overcome with smart algorithmic techniques </li></ul><ul><li>Choice depends on the types of queries and the frequency of document updates, as well as on the question of how dynamic the system is </li></ul>
  23. 23. Crawling and Fault Tolerance <ul><li>Crawling approach </li></ul><ul><li>Non P2P crawlers have the advantage that they can be easily altered in the case that some web site operators have complains about the bot </li></ul><ul><li>Smart crawling strategies beyond BFS are hard to implement in a P2P environment unless there is a centralized scheduler </li></ul><ul><li>P2P systems and fault tolerance </li></ul><ul><li>System design relies on the assumption of a more stable P2P environment, since otherwise administration (insert, update, replication) would be too expencive </li></ul>
  24. 24. Implementation Details <ul><li>Currently, implemented in Java, using Pastry as a P2P substrate (lower layer) and a DHT mapping for hashing IDs to the appropriate IP-address (Pastry is an overlay and routing network for the implementation of a DHT. The key-value pairs are stored in a redundant p2p network of connected Internet hosts.)‏ </li></ul><ul><li>Each node runs an indexer that stores inverted list in compressed form in a Berkeley DB </li></ul><ul><li>Using MD5, all documents and term lists are hashed to a 80-bit ID that is used for lookups in the system </li></ul>
  25. 25. Implementation Details <ul><li>Parsing and Routing Postings </li></ul><ul><li>New or updated documents are parsed at the node where they reside, as determined by the DHT mapping </li></ul><ul><li>Parser generates for each term a posting that is routed via several intermediate nodes, as determined by the topology of the Pastry network, until it reaches its destination node </li></ul><ul><li>An index structure of a node is split up in a small structure (residing in main memory) that is eventually merged with a bigger structure on disk to avoid disk accesses during inserts/updates  lower amortized complexity </li></ul>
  26. 26. Implementation Details <ul><li>Groups and Splits </li></ul><ul><li>Initially, all objects (documents, indexes) whose first w bits (here w =16) coincide are placed into a common group identified by this w -bit string </li></ul><ul><li>Locally, each group maintains a Berkeley DB with all objects it contains </li></ul><ul><li>When a group (of documents) becomes too large (here >1GB), it is split into two groups identified by a ( w+1 )-bit string leaving a stub structure pointing to the new groups that are assigned to new nodes </li></ul><ul><li>If index structures for terms are too large (here >100MB), they are split into two lists according to the document IDs they contain </li></ul>
  27. 28. Implementation Details <ul><li>Replication </li></ul><ul><li>Performed at group level by attaching „/0“, „/1“, etc. to the group label (e.g. 0100101/2)‏ </li></ul><ul><li>This new label is then what is really presented to Pastry/DHT during lookups </li></ul><ul><li>All replicas of a group form a „clique“ that communicate periodically to update their status </li></ul><ul><li>If a group replica fails, the others are in charge of detecting this and if necessary perform repair </li></ul><ul><li>Each node can contain several distinct group replicas and therefore participate in several cliques </li></ul><ul><li>Postings are first routed to only one replica that is then in charge of forwarding them to the others over a period of a few minutes </li></ul>
  28. 29. Implementation Details <ul><li>Faults, Unavailability and Synchronization </li></ul><ul><li>When a node leaves the system, its group replicas eventually have to be replaced to maintain the desired degree of replication </li></ul><ul><li>A node has failed if it has been unavailable for an extended period of time </li></ul><ul><li>Create new replicas for a failed node or if a certain number of nodes are unavailable </li></ul><ul><li>Former unavailable nodes have to synchronize its index structures using logs of missing updates </li></ul>
  29. 30. Efficient Query Processing <ul><li>Information Thoeretic Background </li></ul><ul><li>Let d be a document, q = q 0 …q m-1 a query consisting of m terms and F be a function that assigns d (depending on q ) a value F(d,q) . Such a function is called a ranking function. </li></ul><ul><li>The top-k ranking problem for a query q is finding the k documents with the highest values F(d,q) . </li></ul><ul><li>A common form of such a function looks like this </li></ul><ul><li>Since queries typically have at most only 2 search terms, the following algorithm focuses on the top-k ranking problem and queries with exactly 2 search terms (for one-term queries, there is in fact nothing to do)‏ </li></ul>
  30. 31. Efficient Query Processing <ul><li>Fagin‘s Algorithm (FA)‏ </li></ul><ul><li>Intuitively, an item that is ranked in the top is likely to be ranked very high in at least one of the contributing subcategories </li></ul><ul><li>Assume a query q = q 0 AND q 1 and postings of the form (d,f(d,q i )) that are sorted by the second component with highest values on top </li></ul><ul><li>Also assume that the inverted lists for q 0 and q 1 are located on the same machine, so that no network communication is required </li></ul><ul><li>Goal: compute the top k documents as fast as possible </li></ul>
  31. 32. Efficient Query Processing 0.9 1 2 4 3 8 6 5 1 <ul><li>Scan both lists from the beginning, by reading one element from each list in every step, until there are k documents that have been encountered in both lists (here assume k=2)‏ </li></ul><ul><li>Compute the scores of these k documents. Also, for each document that was encoutered in only one of the lists, perform a lookup into the other list to determine the score of the document. </li></ul><ul><li>Return the k documents with the highest score (here d1, d5)‏ </li></ul>5 3 7 A B 0.8 0.7 0.69 0.67 0.6 0.5 0.4 0.3 0.2 0.1
  32. 33. Efficient Query Processing <ul><li>Threshold Algorithm (TA)‏ </li></ul><ul><li>Scan both lists simultaneously and read (d,f(d,q 0 )) from the first and (d‘,f(d‘,q 1 )) from the second list </li></ul><ul><li>Compute t = f(d,q 0 ) + f(d‘,q 1 )‏ </li></ul><ul><li>For each d in one of the lists perform immediately a lookup in the other list in order to compute its complete score </li></ul><ul><li>Algorithm terminates, when k documents have been found that have higher scores than the current value of t </li></ul>Because it does not make sense to scan two lists simultaneously while they are distributed in a P2P network, the above techniques have to be adapted. This leads us to the following protocol that aims at minimize the data to be transferred.
  33. 34. Efficient Query Processing <ul><li>A simple distributed pruning protocol (DPP)‏ </li></ul>A B Node A (holding the shorter list) sends the first x postings to node B. Let r min be the smallest value f(d,q 0 ) transmitted Node B receives the postings from A and performs a lookup into its own list in order to compute the total scores. Retain the k documents with the highest scores. Let r k be the smallest value among these. Node B now transmitts to A all postings among its first x postings with f(d,q 1 ) > r k - r min , together with the total scores of the k documents from the previous step Node A now performs lookups into its own list for the postings received from B and determines the overall top k documents
  34. 35. Efficient Query Processing <ul><li>DPP-Example for k=2 and x=3: </li></ul><ul><li>A containing term q 0 : (d 1 , 0.9), (d 2 , 0.8), (d 3 , 0.7), (d 4 , 0.69), (d 5 , 0.67)‏ </li></ul><ul><li>B containing term q 1 : (d 6 , 0.6), (d 5 , 0.5), (d 3 , 0.4), (d 1 , 0.3), (d 7 , 0.2), (d 8 , 0.1)‏ </li></ul>A B A to B: (d 1 , 0.9), (d 2 , 0.8), (d 3 , 0.7)‏ B computes: (d 1 , 0.9 + 0.3) (d 2 , 0.8 + ----) (d 3 , 0.7 + 0.4)‏ B to A: (d 6 , 0.6), (d 5 , 0.5), because f(d 6,5 ,q 1 ) > 0.4 together with (d 1 , 1.2), (d 3 , 1.1)‏ A computes: (d 6 , 0.6+ ---- ), (d 5 , 0.5+0.67),  r min = 0.7  r k = 1.1 r k – r min = 1.1 - 0.7 = 0.4 Top 2 documents: 1. (d 1 , 1.2) 2. (d 5 , 1.17)‏
  35. 36. Efficient Query Processing <ul><li>Problems with the DPP </li></ul><ul><li>works only with queries containing 2 search terms </li></ul><ul><li>random lookups can cause disk accesses, since large index structures reside on hard disk  bad response time </li></ul><ul><li>How must the value of x be chosen? (x should be the number of postings transmitted from A and B, s.t. DPP works correct without extra roundtrip; depends on the k and length of the inverted lists) </li></ul><ul><ul><li>By deriving appropriate formulae based on extensive testing </li></ul></ul><ul><ul><li>By sampling-based methods that estimate the number of documents appearing in both lists </li></ul></ul>
  36. 37. Experimental Results
  37. 38. Efficient Query Processing <ul><li>Evaluation of DPP </li></ul><ul><li>900 two-term queries selected form a set of over 1 million </li></ul><ul><li>Testing corpora: 120 million web pages (1.8TB) that were crawled by their own crawler </li></ul><ul><li>Value of x determined by experiments on TA </li></ul><ul><li>Computation within nodes are not taken into account </li></ul><ul><li>Commmunication costs and estimated times of DPP for the top-10 documents and standard cosine measure: </li></ul>
  38. 39. Future Work <ul><li>Bloom Filters </li></ul><ul><li>New algorithmic techniques for the index synchronization problem </li></ul><ul><li>New strategies for load balancing and rebuilding of lost replicas </li></ul><ul><li>More experimental evaluation concerning different types of queries </li></ul>
  39. 40. Questions? <ul><li>Can we use this architecture to solve our Hardware and processing problems? </li></ul><ul><li>How much data and programming parallalization will be needed to make this possible? </li></ul>