Page Rank Implementation
CLOUD COMPUTING PROJECT - Team 3
By: Devendra Singh Parmar
Project Abstract
Instructor: Prof. Reddy Raja
Mentor: Ms M. Padmini
To implement the PageRank algorithm using Map-Reduce for Wikipedia and verify it on smaller data sets
Agenda
 Introduction to Algorithm
 PageRank Equation Analysis
 Brief Description of Project
 Module1
 Module2
 Module3
 Applications
Algorithm
Google figures that when one page links to another page, it is effectively casting a vote for the other page. The more votes that are cast for a page, the more important the page must be. The importance of the page casting the vote also determines how important the vote itself is. Google calculates a page's importance from the votes cast for it, and the importance of each vote is taken into account when a page's PageRank is calculated.
The PageRank Equation (Issues and Enhancement)
Problems:
 Cycles
Solution:
PageRank Equation (Enhancement)
Solution for cycles, and for the case where a random surfer gets bored. Here 'd' is known as the damping factor. It represents the probability, at any step, that the person will continue surfing. The value of 'd' is typically set to 0.85.
PageRank Equation (finally)
In other words
In a simpler way:
 a page's PageRank = 0.15/N + 0.85 * (the sum of the "shares" of PageRank from every page that links to it)
 "share" = the linking page's PageRank divided by the number of outbound links on that page
 N = the number of documents in the collection
The equation shows clearly how a page's PageRank is arrived at. What isn't immediately obvious is that it can't work if the calculation is done just once.
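A minimal Python sketch of one application of this simplified formula to a single page. The function name pagerank_of, the three-page graph, and the uniform starting values are illustrative assumptions, not part of the project.

```python
def pagerank_of(page, ranks, out_links, d=0.85):
    """Apply the simplified formula once to a single page.

    ranks:     current PageRank of every page (assumed starting values)
    out_links: page -> list of pages it links to (hypothetical graph)
    """
    n = len(ranks)  # N = number of documents in the collection
    share_sum = 0.0
    for source, targets in out_links.items():
        if page in targets:
            # "share" = linking page's PageRank / its number of outbound links
            share_sum += ranks[source] / len(targets)
    return (1 - d) / n + d * share_sum


# Hypothetical three-page graph: A -> B, A -> C, B -> C, C -> A
out_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = {page: 1 / 3 for page in out_links}

new_c = pagerank_of("C", ranks, out_links)
print(new_c)
```

The result depends on the current values of A and B, which is why one pass is not enough: those values change as soon as they are recomputed.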
PageRank Equation, as per the published paper "The Anatomy of a Large-Scale Hypertextual Web Search Engine" by Sergey Brin and Lawrence Page
 We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:
 PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
 Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.
Issues in the Original Formula
The formula given in Brin and Page's paper does not support the statement that "the sum of all PageRanks is one". To support the statement, the formula is modified as:
 PR(A) = (1-d)/N + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
where N = the number of documents in the collection
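A short derivation of why the division by N matters, under the added assumption that every page has at least one outgoing link (no dangling pages) and that the values have converged. Here S denotes the sum of all PageRanks. Each page T gives PR(T)/C(T) to each of its C(T) targets, so the link terms summed over all pages add up to S again.

```latex
% Original formula, summed over all N pages:
S = \sum_{A} PR(A) = N(1-d) + d \sum_{T} C(T)\,\frac{PR(T)}{C(T)} = N(1-d) + dS
\;\Longrightarrow\; S = N

% Modified formula, summed over all N pages:
S = \sum_{A} PR(A) = N \cdot \frac{1-d}{N} + dS = (1-d) + dS
\;\Longrightarrow\; S = 1 \qquad (d < 1)
```

Under these assumptions the original formula yields PageRanks that sum to N, while the modified one yields a sum of one.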
Cloud Computing Project

  • 43. Brief Description of Project (Contd.) Output: The output file consists of records containing the URL of the page (fromUrl), the PageRank value of the page (PRValue), and the list of URLs to which the page points (ToUrlList). [Figure: FinalOutput.txt record with the fields fromUrl, PRValue, ToUrlList]
  • 44. Brief Description of Project: Modules. Module1: Converter. Module2: PageRank Calculator. Module3: Output Analyzer. [Figure: pipeline diagram with the labels Web Graph, Converter, PageRank Calculator (iterate until convergence), Output Analyzer, Create Index, Search Engine]
  • 53. Module1: Converter Issues
      Self loops: handled by comparing the FromUrl with the ToUrl before sending the link to the reduce function.
      Dangling pages: handled by initializing their PRValue to 1/N and leaving their list of ToUrls blank.
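The two cases above can be sketched in plain Python (not Hadoop). The function name convert, the (from_url, to_url) edge-list input, and the choice to drop self loops and to start every page at 1/N are assumptions made for illustration; the slides only state the 1/N initialization for dangling pages.

```python
def convert(edges):
    """Turn (from_url, to_url) pairs into records: url -> (PRValue, ToUrlList)."""
    pages = set()
    out_lists = {}
    for from_url, to_url in edges:
        pages.update((from_url, to_url))
        if from_url == to_url:
            continue  # self loop: assumed to be dropped
        out_lists.setdefault(from_url, []).append(to_url)

    n = len(pages)
    # Dangling pages never appear as a from_url, so they get an empty list.
    # Assumption: every page starts at 1/N, not only the dangling ones.
    return {url: (1.0 / n, out_lists.get(url, [])) for url in sorted(pages)}


# Hypothetical input: a.html has a self loop, c.html is dangling.
records = convert([("a.html", "b.html"), ("a.html", "a.html"), ("b.html", "c.html")])
for url, (pr_value, to_urls) in records.items():
    print(url, pr_value, to_urls)
```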
  • 62-66. Module2: PageRank Calculator
      Map:
        Input: key: index.html, value: PRValue and OutList: <1.html 2.html ...>
        Output:
          1. For each outlink: key: "1.html", value: PRValue / ListLength (vote share)
          2. The URL itself: key: index.html, value: <OutList>
      Reduce:
        Input: key: "1.html", Value: 0.5 23, Value: 0.24 2, ……, Value: UrlList <OutLink>
        Output: key: "1.html", value: "<new pagerank> <OutList> 1.html 2.html ..."
      Steps:
        1. Start with the initial PageRank and outlinks of a document.
        2. For each outlink, output its share of the page's PageRank, and output the page's list of outlinks.
        3. The reducer now has the URL of a document, the PageRank shares from all inlinks to that document, and its list of outlinks.
        4. Compute the new PageRank and output it in the same format as the input.
        5. Iterate until convergence (determined by the precision value).
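A sketch of one map/reduce pass in plain Python, simulating the shuffle with a dictionary. It is not the project's Hadoop code: the names map_page, reduce_page, and run_iteration, the in-memory record format, and the use of the (1-d)/N formula in the reducer are assumptions. Rank held by dangling pages is not redistributed in this sketch.

```python
from collections import defaultdict


def map_page(url, rank, out_list):
    """Emit a vote share for each outlink, then the page's own outlink list."""
    for target in out_list:
        yield target, rank / len(out_list)  # PRValue / ListLength
    yield url, list(out_list)  # pass the graph structure through


def reduce_page(url, values, n, d=0.85):
    """Sum the incoming shares and rebuild the record in the input format."""
    out_list = []
    share_sum = 0.0
    for value in values:
        if isinstance(value, list):
            out_list = value  # the page's own outlink list
        else:
            share_sum += value  # a vote share from an inlink
    return url, ((1 - d) / n + d * share_sum, out_list)


def run_iteration(records, d=0.85):
    """One pass: map every record, group by key, reduce every key."""
    grouped = defaultdict(list)
    for url, (rank, out_list) in records.items():
        for key, value in map_page(url, rank, out_list):
            grouped[key].append(value)
    n = len(records)
    return dict(reduce_page(url, values, n, d) for url, values in grouped.items())


# Hypothetical two-page input: the pages link to each other.
records = {"index.html": (0.5, ["1.html"]), "1.html": (0.5, ["index.html"])}
new_records = run_iteration(records)
print(new_records)
```

Because the output has the same shape as the input, the result of one pass can be fed directly into the next.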
  • 67. Module2: PageRank Calculator Issues: Catch-22 Situation. Suppose we have 2 pages, A and B, which link to each other, and neither has any other links of any kind. This is what happens: Step 1: calculate page A's PageRank from the value of its inbound links. Step 2: calculate page B's PageRank from the value of its inbound links. We can't work out A's PageRank until we know B's PageRank, and we can't work out B's PageRank until we know A's PageRank. Thus a single pass gives inaccurate PageRanks for A and B.
  • 68. Module2: PageRank Calculator Issues: Catch-22 Situation (Solution). This problem is overcome by repeating the calculation many times. Each pass produces slightly more accurate values. Total accuracy can never be achieved, because each pass is based on the inexact values of the previous one. The number of iterations should be sufficient to reach a point where further iterations wouldn't change the values enough to matter. => Use a "delta function" that keeps track of the changes in the PageRank of all pages; once the change for every page is less than a value specified by the user, the iterations can be stopped.
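The stopping rule can be sketched as follows, taking the "delta" to be the largest per-page change between two passes. The names step and iterate, the default precision, the iteration cap, and the example graph are assumptions; every page must appear as a key in out_links, with an empty list if it has no outlinks.

```python
def step(ranks, out_links, d=0.85):
    """One full pass of the (1-d)/N formula over every page."""
    n = len(ranks)
    new_ranks = {page: (1 - d) / n for page in ranks}
    for source, targets in out_links.items():
        for target in targets:
            new_ranks[target] += d * ranks[source] / len(targets)
    return new_ranks


def iterate(out_links, precision=1e-6, max_iterations=100):
    """Repeat passes until no page changes by more than `precision`."""
    ranks = {page: 1.0 / len(out_links) for page in out_links}
    for iteration in range(1, max_iterations + 1):
        new_ranks = step(ranks, out_links)
        # delta: the largest change in PageRank over all pages
        delta = max(abs(new_ranks[p] - ranks[p]) for p in ranks)
        ranks = new_ranks
        if delta < precision:
            break
    return ranks, iteration


# Hypothetical graph: A -> B, A -> C, B -> C, C -> A
final_ranks, iterations = iterate({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
print(iterations, final_ranks)
```

A smaller precision gives more stable values at the cost of more passes; the cap on iterations is a safeguard in case the threshold is set too tightly.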
  • 87. GeneRank (based on PageRank) ranks the genes analyzed in the microarray to see the relationship between the cell’s function and gene expression.