PageRank Multithreading
Shujian Zhang
PageRank is fun!
Look at how much fun they’re having.
However, there are a lot of web pages and a lot of links, so PageRank becomes a LOT of work to calculate.
Overview
How does PageRank work?
-The web is modeled as a directed graph (nodes point to other nodes, but each link is a one-way street)
-An adjacency matrix is constructed from the graph
-Each page is given an equal weight to distribute among the pages it points to
-A page that no other page points to is given a weight of 1/N, where N is the total number of pages, to represent the random chance that someone goes directly to that page
-The adjacency matrix is multiplied by the PageRank vector iteratively until the PageRanks approach an equilibrium and stop changing
-A "damping factor" is applied to simulate a user randomly stopping their page exploration
Formula represented as:
Where R is the PageRank vector, M is the adjacency matrix, t is the number of iterations done, d is the damping factor, and N is the total number of pages. After a finite number of iterations, the PageRank values converge.
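The formula itself is not present in the text. As an assumption, the standard damped PageRank iteration is consistent with the symbols defined above (R, M, t, d, N), with the bold 1 denoting a column vector of N ones; it may differ in detail from the formula the authors showed.

```latex
\mathbf{R}(t+1) = d \, M \, \mathbf{R}(t) + \frac{1-d}{N} \, \mathbf{1}
```

Each entry of R(t+1) is therefore d times the weight flowing in over links, plus a share (1-d)/N that every page receives from random jumps.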
Overview
The adjacency matrix is initially populated with 1’s, and each 1 is then replaced with 1/#NodesPointedAt, since a given site pointing to 3 other sites gives a ⅓ chance of navigating to any one of the 3.
In a very basic representation, the adjacency matrix is multiplied by the PageRank vector, in this case on its initial run. All pages have equal weight at the beginning.
Some pages never link to any other page. To deal with this situation, we need to set up a probability representing that the person may jump to another page by typing its address into the browser.
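The matrix construction described above could be sketched as follows. This is a minimal illustration, not the authors' code: the name `buildTransitionMatrix`, the edge-list input, and the choice to give a page with no out-links a uniform column of 1/n are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Build a column-stochastic matrix from directed edges (from, to).
// Entry m[to][from] ends up as 1 / (number of pages `from` points at).
// Assumption: a page with no out-links gets a uniform column of 1/n,
// modeling a user who types some other address into the browser.
Matrix buildTransitionMatrix(
    std::size_t n,
    const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    assert(n > 0);
    Matrix m(n, std::vector<double>(n, 0.0));
    std::vector<std::size_t> outDegree(n, 0);

    // First pass: mark each link with a 1 and count out-links per page.
    for (const auto& e : edges) {
        assert(e.first < n && e.second < n);
        if (m[e.second][e.first] == 0.0) {  // skip duplicate edges
            m[e.second][e.first] = 1.0;
            ++outDegree[e.first];
        }
    }

    // Second pass: replace each 1 with 1/outDegree; fill dangling columns.
    for (std::size_t from = 0; from < n; ++from) {
        for (std::size_t to = 0; to < n; ++to) {
            if (outDegree[from] == 0) {
                m[to][from] = 1.0 / static_cast<double>(n);
            } else if (m[to][from] != 0.0) {
                m[to][from] = 1.0 / static_cast<double>(outDegree[from]);
            }
        }
    }
    return m;
}
```

With this layout every column sums to 1, so multiplying the matrix by a rank vector redistributes weight without creating or losing any.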
Overview
Damping Factor-
When a node points to no one else, over time it may become a sink and hold all of the weight.
The damping factor simulates a user getting bored of their current train of pages and going to a random website. This makes sure that sinks don’t happen, as someone stuck on C might become bored and navigate to any other node.
Sequential Implementation
The sequential implementation is basically one big loop through the matrix: perform the PageRank calculation on every row, update the PageRank vector, and repeat n times.
Pseudocode:
for n times
    for rowIndex, row in adjmat
        tempPR[rowIndex] = 0
        for i in row
            tempPR[rowIndex] += PRcalc(row[i], PR[i])
    PR = tempPR
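The loop above might look like this in C++. It is a sketch under stated assumptions rather than the authors' implementation: the function name `pageRankSequential` is hypothetical, the matrix is assumed to be column-stochastic, the per-row update is assumed to be the standard damped formula, and the default damping factor 0.85 is a common choice, not a value given in the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Sequential power iteration. Each pass computes, for every row,
//   tempPR[row] = d * sum_i(m[row][i] * pr[i]) + (1 - d) / n
// and then replaces the old rank vector with the new one.
std::vector<double> pageRankSequential(const Matrix& m, int iterations,
                                       double d = 0.85) {
    const std::size_t n = m.size();
    assert(n > 0);
    const double base = (1.0 - d) / static_cast<double>(n);

    // All pages start with equal weight.
    std::vector<double> pr(n, 1.0 / static_cast<double>(n));
    std::vector<double> tempPR(n, 0.0);

    for (int t = 0; t < iterations; ++t) {
        for (std::size_t row = 0; row < n; ++row) {
            assert(m[row].size() == n);
            double sum = 0.0;
            for (std::size_t i = 0; i < n; ++i) {
                sum += m[row][i] * pr[i];
            }
            tempPR[row] = d * sum + base;
        }
        pr.swap(tempPR);  // PR = tempPR
    }
    return pr;
}
```

Writing into `tempPR` and swapping at the end of each pass keeps every row's calculation based on the previous iteration's ranks, which is also what makes the rows independent of one another.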
Parallel implementation
This problem parallelizes very cleanly. Each thread handles a row of the adjacency matrix and calculates the PageRank for that row's node. Each thread writes the new PageRank to a temporary vector, and after all threads have finished the current iteration, the new PageRank vector replaces the old one. No two threads ever write to the same location, so there is no need for mutexes or any other kind of read/write control. The only thing to control is that the threads stay on the same iteration: none may get ahead of or fall behind the others.
*pseudocode for thread:
row = adjmat[next]
thisRank = 0
for i in row
    thisRank += (do PR formula on row[i], pagerank[i])
tempPR[next] = thisRank
*main thread (after all threads finish the iteration):
pagerank = tempPR
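A minimal sketch of this scheme follows. It swaps in `std::thread` for the Pthreads the slides use, and it gives each thread a contiguous block of rows instead of one row at a time; both are assumptions made to keep the example short. The names `rankRows` and `pageRankParallel` are hypothetical. Joining all workers at the end of each iteration acts as the barrier that keeps threads on the same iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Compute new ranks for rows [begin, end). Ranges given to different
// threads are disjoint, so no two threads write the same tempPR element.
void rankRows(const Matrix& m, const std::vector<double>& pr,
              std::vector<double>& tempPR, std::size_t begin,
              std::size_t end, double d) {
    const std::size_t n = m.size();
    for (std::size_t row = begin; row < end; ++row) {
        double sum = 0.0;
        for (std::size_t i = 0; i < n; ++i) sum += m[row][i] * pr[i];
        tempPR[row] = d * sum + (1.0 - d) / static_cast<double>(n);
    }
}

std::vector<double> pageRankParallel(const Matrix& m, int iterations,
                                     std::size_t numThreads,
                                     double d = 0.85) {
    const std::size_t n = m.size();
    assert(n > 0 && numThreads > 0);
    std::vector<double> pr(n, 1.0 / static_cast<double>(n));
    std::vector<double> tempPR(n, 0.0);
    // Rows per thread, rounded up so every row is covered.
    const std::size_t chunk = (n + numThreads - 1) / numThreads;

    for (int t = 0; t < iterations; ++t) {
        std::vector<std::thread> workers;
        for (std::size_t begin = 0; begin < n; begin += chunk) {
            const std::size_t end = std::min(begin + chunk, n);
            workers.emplace_back(rankRows, std::cref(m), std::cref(pr),
                                 std::ref(tempPR), begin, end, d);
        }
        // Wait for every worker before swapping: this is the barrier.
        for (auto& w : workers) w.join();
        pr.swap(tempPR);  // main thread: pagerank = tempPR
    }
    return pr;
}
```

Creating threads anew in every iteration is simple but adds overhead; a long-running implementation would more likely keep the threads alive and synchronize them with a barrier.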
Parallel implementation (CPP)
-Global variables
-Create the matrix (in the main function)
-Create the Pthreads; we can decide how many iterations the program runs
-PageRank algorithm for the parallel implementation
Output (2 iterations)
To test whether the algorithm is correct, we ran our code on a data set with 4 nodes, both sequentially (one thread) and in parallel (two threads).
Output (30 iterations, sequential)
Output (30 iterations, parallel)
In both runs the PageRank values converge after 30 iterations, so the algorithm in the code is CORRECT!
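One way to make "converging" measurable is to compare successive rank vectors and stop once they barely differ. The sketch below is an illustration only: the helper names `l1Distance` and `hasConverged` and the default tolerance of 1e-6 are assumptions, not something the slides specify.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sum of absolute differences between two rank vectors of equal length.
double l1Distance(const std::vector<double>& a,
                  const std::vector<double>& b) {
    assert(a.size() == b.size());
    double total = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        total += std::fabs(a[i] - b[i]);
    }
    return total;
}

// True when the ranks changed by less than `tolerance` in one iteration.
// The default tolerance is a hypothetical threshold for illustration.
bool hasConverged(const std::vector<double>& previous,
                  const std::vector<double>& current,
                  double tolerance = 1e-6) {
    return l1Distance(previous, current) < tolerance;
}
```

The same distance can also be used to compare the sequential and parallel outputs, which should agree for the same number of iterations.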
Running time
For this part, we ran our code on a big data set named Wiki-Vote (about 8000 nodes), for 20 iterations.
Wiki-Vote was downloaded from http://snap.stanford.edu/data/. It is a directed graph, described as the Wikipedia who-votes-on-whom network.
Sequential (1 thread): 15.18 sec.
2 threads: 7.89 sec.
4 threads: 4.21 sec.
16 threads: 3.77 sec.
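The timings above can be turned into speedup (sequential time divided by parallel time) and efficiency (speedup divided by thread count). The helper names below are hypothetical and the snippet only restates the reported numbers. From those numbers, the speedup is roughly 1.92x with 2 threads, 3.61x with 4 threads, and 4.03x with 16 threads.

```cpp
#include <cassert>

// Speedup: how many times faster the parallel run is than the sequential one.
double speedup(double sequentialSeconds, double parallelSeconds) {
    assert(parallelSeconds > 0.0);
    return sequentialSeconds / parallelSeconds;
}

// Efficiency: speedup per thread; 1.0 would be perfect linear scaling.
double efficiency(double sequentialSeconds, double parallelSeconds,
                  int threads) {
    assert(threads > 0);
    return speedup(sequentialSeconds, parallelSeconds) /
           static_cast<double>(threads);
}
```

For example, `efficiency(15.18, 4.21, 4)` is about 0.90, while `efficiency(15.18, 3.77, 16)` is about 0.25, so most of the benefit is already reached at 4 threads.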
Conclusion
After more than 20 iterations, the PageRank values show convergence.
On the small data set, there is no noticeable difference in running time between the sequential and multithreaded versions.
On the big data set, run for many iterations, there is an obvious difference in running time between the sequential and multithreaded versions. As the number of threads increases, the running time decreases, although the gain from 4 to 16 threads (4.21 sec to 3.77 sec) is much smaller than the gain from 1 to 4 threads.
Thank you!