SlideShare a Scribd company logo
1 of 32
Download to read offline
Updating PageRank
for Streaming Graphs
E. Jason Riedy
Graph Algorithms Building Blocks
23 May 2016
School of Computational Science and Engineering
Georgia Institute of Technology
Streaming Graph Analysis
(insert prefix here)-scale data analysis
Cyber-security Identify anomalies, malicious actors
Health care Finding outbreaks, population epidemiology
Social networks Advertising, searching, grouping
Intelligence Decisions at scale, regulating algorithms
Systems biology Understanding interactions, drug design
Power grid Disruptions, conservation
Simulation Discrete events, cracking meshes
• Graphs are a motif / theme in data analysis.
• Changing and dynamic graphs are important!Riedy, GABB 2016 3/32
Outline
1. Motivation and background
2. Incremental PageRank
• Linear algebra & streaming graph data
• Maintaining accuracy!
3. Performance and implementation aspects
• Requests for GraphBLAS implementations
Riedy, GABB 2016 4/32
Potential Applications
• Social Networks
• Identify communities, influences, bridges, trends,
anomalies (trends before they happen)...
• Potential to help social sciences, city planning, and
others with large-scale data.
• Cybersecurity
• Determine if new connections can access a device or
represent new threat in < 5ms...
• Is the transfer by a virus / persistent threat?
• Bioinformatics, health
• Construct gene sequences, analyze protein
interactions, map brain interactions
• Credit fraud forensics ⇒ detection ⇒ monitoring
Riedy, GABB 2016 5/32
Streaming graph data
Networks data rates:
• Gigabit ethernet: 81k – 1.5M packets per second
• Over 130 000 flows per second on 10 GigE (< 7.7 µs)
Person-level data rates:
• 500M posts per day on Twitter (6k / sec)1
• 3M posts per minute on Facebook (50k / sec)2
We need to analyze only changes and not entire graph.
Throughput & latency trade offs expose different levels
of concurrency.
1
www.internetlivestats.com/twitter-statistics/
2
www.jeffbullas.com/2015/04/17/21-awesome-facebook-facts-and-statistics-you-need-to-check-out/
Riedy, GABB 2016 6/32
Streaming graph analysis
Terminology:
• Streaming changes into a massive, evolving graph
• Not CS streaming algorithm (tiny memory)
• Need to handle deletions as well as insertions
Previous throughput results (not comprehensive review):
Data ingest >2M up/sec [Ediger, McColl, Poovey, Campbell, &
Bader 2014]
Clustering coefficients >100K up/sec [R, Meyerhenke,
Bader, Ediger, & Mattson 2012]
Connected comp. >1M up/sec [McColl, Green, & Bader 2013]
Community clustering >100K up/sec∗
[R & Bader 2013]
Riedy, GABB 2016 7/32
Incremental PageRank
PageRank
Everyone’s “favorite” metric: PageRank.
• Stationary distribution of the random surfer model.
• Eigenvalue problem can be re-phrased as a linear
system3
(
I − αAT
D−1
)
x = kv,
with
α teleportation constant < 1
A adjacency matrix
D diagonal matrix of out degrees, with
x/0 = x (self-loop)
v personalization vector, ∥v∥1 = 1
k scaling constant
3
[Gleich, Zhukov, & Berkhin 2004; Del Corso, Gull, & Romani 2005]Riedy, GABB 2016 9/32
Incremental PageRank: Goals
• Efficiently update for streaming data; update
PageRank without touching the entire graph.
• Keep the results accurate.
• Updates can wander, and ranks deceive...
• Existing methods:
• Compute “summaries” of non-changed portions:
Walk whole graph per change. [Langville & Meyer,
2006]
• Maintain databases of walks for dynamic resampling
[Bahmani, Chowdhury, & Goel 2010]
• Statistically based idea, very similar but... [Ohsaka, et
al. 2015]
Riedy, GABB 2016 10/32
Incremental PageRank: First pass
• Let A∆ = A + ∆A, D∆ = D + ∆D for the new graph,
want to solve for x + ∆x.
• A: sparse and row-major, Ai,j = 1 if i → j ∈ edges
• Simple algebra:
(
I − αAT
∆D−1
∆
)
∆x = α
(
A∆D−1
∆ − AD−1
)
x
• The operator on the right-hand side, A∆D−1
∆ − AD−1
,
is sparse; non-zero only adjacent to changes in ∆.
• Re-arrange for Jacobi (stationary iterative method),
∆x(k+1)
= αAT
∆D−1
∆ ∆x(k)
+ α
(
A∆D−1
∆ − AD−1
)
x,
iterate, ...
Riedy, GABB 2016 11/32
Incremental PageRank: Accumulating error
• And fail. The updated solution wanders away from
the true solution. Top rankings stay the same...
Riedy, GABB 2016 12/32
Incremental PageRank: Think instead
• Backward error view: The new problem is close to
the old one, solution may be close.
• How close? Residual:
r′
= kv − x + αA∆D−1
∆ x
= r + α
(
A∆D−1
∆ − AD−1
)
x.
• Solve (I − αA∆D−1
∆ )∆x = r′
(iterative refinement).
• Cheat by not refining all of r′
, only region growing
around the changes:
(I − αA∆D−1
∆ )∆x = r′
|∆
• (Also cheat by updating r rather than recomputing
at the changes.)
Riedy, GABB 2016 13/32
Incremental PageRank: Works
Riedy, GABB 2016 14/32
Performance / Latency
Performance discussion
• Inherent trade-off between high throughput and low
latency.
Throughput Large batches imply great parallelism
Latency Small batches are highly independent
• Restarting PR iteration exposes massive parallelism
• Sparse matrix - dense vector mult. (SpMV)
• Incremental is highly sparse, pays in overhead
• Sparse matrix - sparse vector mult. (SpMSpV)
• Results are worst-case: changes are not related to
conductance communities.
• (Also, very un-tuned SpMSpV...)
Riedy, GABB 2016 16/32
Test cases
Using one CPU in an 8-core Intel Westmere-EX (E7-4820),
2.00GHz and 18MiB L3...
Graph |V| |E| Avg. Degree Size (MiB)
power 4941 6594 1.33 0.07
power grid
PGPgiantcompo 10680 24316 2.28 0.23
social network
caidaRouterLevel 192244 609066 3.17 5.38
networking
belgium.osm 1441295 1549970 1.08 22.82
road map
coPapersCiteseer 434102 16036720 36.94 124.01
citation
Riedy, GABB 2016 17/32
Incremental PageRank: Worst latency
q
q
q
q
q
q
q
q
q
q q
q
q
q
q
0.001
0.010
0.001
0.010
0.01
0.10
1.00
1
100
1
10
100
powerPGPgiantcompocaidaRouterLevelbelgium.osmcoPapersCiteseer
10 100 1000
Batch size
Updatetime(s)
Algorithm q dpr dprheld pr_restart
Incremental: Best at 4 threads. RPR: Best at 8.
Riedy, GABB 2016 18/32
Incremental PageRank: Worst throughput
100
q q q q q q q
q
q
q
q q q q q q q q q q
q q q q
q
q
q q q
q
q q q q q q q q q q
q q
q q
q
q q q q q
4000
6000
8000
2500
5000
7500
10000
0
5000
10000
15000
0
1000
2000
3000
4000
0
250
500
750
1000
powerPGPgiantcompocaidaRouterLevelbelgium.osmcoPapersCiteseer
1 2 3 4 5 6 7 8 9 10
Batch Number
Updatethroughput(edges/sec)
Alg. q dpr dprheld pr pr_restart
Incremental: Best at 4 threads. RPR: Best at 8.
Riedy, GABB 2016 19/32
Median update time (s)
Graph Batch dpr dpr_held pr_restart
power 10 .00304 1.8× .00008 68× .00535
100 .0109 .79× .00707 1.2× .00860
1000 .0126 .67× .0124 .68× .00843
PGPgiantcompo 10 .00211 4.3× .00023 39× .00916
100 .0257 .81× .00874 2.4× .0207
1000 .0372 .67× .0341 .73× .0249
caidaRouterLevel 10 .00710 16× .00199 57× .112
100 .0314 7.1× .00477 47× .224
1000 1.30 .68× .2290 3.9× .889
belgium.osm 10 .0118 42× .0128 39× .498
100 .0127 39× .0131 38× .499
1000 .0461 52× .0171 140× 2.41
coPapersCiteseer 10 .0729 27× .0128 155× 1.98
100 .8650 2.3× .130 15× 1.98
1000 2.97 1.3× 1.13 3.5× 3.95
Riedy, GABB 2016 20/32
Fraction of traversed edges
Graph Batch dpr dpr_held pr_restart
power 10 7.54 2.4× 0.0229 790× 18
100 29.2 1.2× 18.2 1.9× 34
1000 38.3 1.0× 37.1 1.0× 39
PGPgiantcompo 10 1.58 5.1× 0.0453 180× 8
100 21.3 1.1× 6.82 3.6× 24
1000 31.2 1.0× 28.4 1.1× 32
caidaRouterLevel 10 0.0625 32× 0.00252 790× 2
100 0.409 9.8× 0.0301 130× 4
1000 15.2 1.1× 3.07 5.2× 16
belgium.osm 10 0.00020 9800× 0.00007 30000× 2
100 0.00203 990× 0.00066 3000× 2
1000 0.120 84× 0.00660 1500× 10
coPapersCiteseer 10 0.0563 36× 0.00646 310× 2
100 0.689 2.9× 0.0952 21× 2
1000 2.32 1.7× 0.864 4.6× 4
Riedy, GABB 2016 21/32
GraphBLAS Requests: Beat Down
Overhead!
GraphBLAS background
• Provide a means to express linear-algebra-like
graph algorithms
• Graphs can be modeled as (sparse) matrices
• Adjacency (used here)
• Vertex-edge
• Bipartite...
• Some graph algorithms iterate as if applying a linear
operator
• BFS
• Betweenness centrality
• PageRank
• Ok, PageRank actually is linear algebra...
Riedy, GABB 2016 23/32
Lessons from sparse linear algebra
• Overhead is important.
• Small average degree
• O(|E|) is O(|V|)...
• If the average degree is 8, 8|V| storage is the matrix.
• Graphs: Some very large degrees
• Graphs: Low average diameter
• Details matter
• Entries that disappear are painful
• Take care with definitions on |V|...
Riedy, GABB 2016 24/32
GraphBLAS in updating PR
dpr_core (A, z, r, ∆)
Let D = diagonal matrix of vertex out-degrees
z = α(AT
D−1
x|∆ − z) Note: z = AT
D−1
x before update
∆x(1)
= z + r|z
For k = 1 ... itmax
∆x(k+1)
= αAT
D−1
∆x
(k)
|≥γ + α∆x
(k)
|<γ + z
∆x(k+1)
= ∆x(k+1)
+ r|∆x(k+1)
Stop if ∥x(k+1)
− x(k)
∥1 < τ
∆r = z + ∆x(k+1)
− αAT
D−1
∆x(k+1)
Return ∆x(k+1)
and ∆r
Fuse common sparse operation sequences to reduce
overhead.
Riedy, GABB 2016 25/32
GraphBLAS in updating PR
dpr_core (A, z, r, ∆)
Let D = diagonal matrix of vertex out-degrees
z = α(AT
D−1
x|∆ − z) Note: z = AT
D−1
x before update
∆x(1)
= z + r|z
For k = 1 ... itmax
∆x(k+1)
= αAT
D−1
∆x
(k)
|≥γ + α∆x
(k)
|<γ + z
∆x(k+1)
= ∆x(k+1)
+ r|∆x(k+1)
Stop if ∥x(k+1)
− x(k)
∥1 < τ
∆r = z + ∆x(k+1)
− αAT
D−1
∆x(k+1)
Return ∆x(k+1)
and ∆r
Region-growing: Don’t duplicate / realloc only-extended
patterns.
Riedy, GABB 2016 26/32
GraphBLAS in updating PR
dpr_core (A, z, r, ∆)
Let D = diagonal matrix of vertex out-degrees
z = α(AT
D−1
x|∆ − z) Note: z = AT
D−1
x before update
∆x(1)
= z + r|z
For k = 1 ... itmax
∆x(k+1)
= αAT
D−1
∆x
(k)
|≥γ + α∆x
(k)
|<γ + z
∆x(k+1)
= ∆x(k+1)
+ r|∆x(k+1)
Stop if ∥x(k+1)
− x(k)
∥1 < τ
∆r = z + ∆x(k+1)
− αAT
D−1
∆x(k+1)
Return ∆x(k+1)
and ∆r
Fuse tests into loops when feasible.
Riedy, GABB 2016 27/32
GraphBLAS in updating PR
dpr_core (A, z, r, ∆)
Let D = diagonal matrix of vertex out-degrees
z = α(AT
D−1
x|∆ − z) Note: z = AT
D−1
x before update
∆x(1)
= z + r|z
For k = 1 ... itmax
∆x(k+1)
= αAT
D−1
∆x
(k)
|≥γ + α∆x
(k)
|<γ + z
∆x(k+1)
= ∆x(k+1)
+ r|∆x(k+1)
Stop if ∥x(k+1)
− x(k)
∥1 < τ
∆r = z + ∆x(k+1)
− αAT
D−1
∆x(k+1)
Return ∆x(k+1)
and ∆r
Looks like the optimization point? Few vertices in ∆
means all steps matter.
Riedy, GABB 2016 28/32
GraphBLAS in updating PR
dpr_core (A, z, r, ∆)
Let D = diagonal matrix of vertex out-degrees
z = α(AT
D−1
x|∆ − z) Note: z = AT
D−1
x before update
∆x(1)
= z + r|z
For k = 1 ... itmax
∆x(k+1)
= αAT
D−1
∆x
(k)
|≥γ + α∆x
(k)
|<γ + z
∆x(k+1)
= ∆x(k+1)
+ r|∆x(k+1)
Stop if ∥x(k+1)
− x(k)
∥1 < τ
∆r = z + ∆x(k+1)
− αAT
D−1
∆x(k+1)
Return ∆x(k+1)
and ∆r
Support fast restrictions to a known pattern.
Riedy, GABB 2016 29/32
GraphBLAS in updating PR
dpr_core (A, z, r, ∆)
Let D = diagonal matrix of vertex out-degrees
z = α(AT
D−1
x|∆ − z) Note: z = AT
D−1
x before update
∆x(1)
= z + r|z
For k = 1 ... itmax
∆x(k+1)
= αAT
D−1
∆x
(k)
|≥γ + α∆x
(k)
|<γ + z
∆x(k+1)
= ∆x(k+1)
+ r|∆x(k+1)
Stop if ∥x(k+1)
− x(k)
∥1 < τ
∆r = z + ∆x(k+1)
− αAT
D−1
∆x(k+1)
Return ∆x(k+1)
and ∆r
Keep growth in order, no need to subtract on new
entries.
Riedy, GABB 2016 30/32
New experience from streaming graphs
For GraphBLAS-ish low-latency streaming algorithms:
• Fuse common sparse operation sequences to
reduce overhead.
• Region-growing: Don’t duplicate / realloc
only-extended patterns.
• Fuse tests into loops when feasible.
• Remember the sparse vector set case!
• Support fast restrictions to a known pattern.
• Keep growth in order, no need to subtract on new
entries.
Riedy, GABB 2016 31/32
STINGER: Where do you get it?
Home: www.cc.gatech.edu/stinger/
Code: git.cc.gatech.edu/git/u/eriedy3/stinger.git/
This code: src/clients/algorithms/pagerank_updating/.
Gateway to
• code,
• development,
• documentation,
• presentations...
Remember: Academic code, but maturing with
contributions.
Users / contributors / questioners: Georgia
Tech, PNNL, CMU, Berkeley, Intel, Cray, NVIDIA,
IBM, Federal Government, Ionic Security, Citi, ...Riedy, GABB 2016 32/32

More Related Content

What's hot

MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 Albert Bifet
 
Josh Patterson MLconf slides
Josh Patterson MLconf slidesJosh Patterson MLconf slides
Josh Patterson MLconf slidesMLconf
 
Large-Scale Graph Computation on Just a PC: Aapo Kyrola Ph.D. thesis defense
Large-Scale Graph Computation on Just a PC: Aapo Kyrola Ph.D. thesis defenseLarge-Scale Graph Computation on Just a PC: Aapo Kyrola Ph.D. thesis defense
Large-Scale Graph Computation on Just a PC: Aapo Kyrola Ph.D. thesis defenseAapo Kyrölä
 
Scalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data ShardingScalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data Shardinginside-BigData.com
 
GraphChi big graph processing
GraphChi big graph processingGraphChi big graph processing
GraphChi big graph processinghuguk
 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabImpetus Technologies
 
Entity embeddings for categorical data
Entity embeddings for categorical dataEntity embeddings for categorical data
Entity embeddings for categorical dataPaul Skeie
 
The Other HPC: High Productivity Computing in Polystore Environments
The Other HPC: High Productivity Computing in Polystore EnvironmentsThe Other HPC: High Productivity Computing in Polystore Environments
The Other HPC: High Productivity Computing in Polystore EnvironmentsUniversity of Washington
 
Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream miningAlbert Bifet
 
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...Alexandros Karatzoglou
 
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETSFAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETScsandit
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisJason Riedy
 
Data streaming fundamentals- EUDAT Summer School (Giuseppe Fiameni, CINECA)
Data streaming fundamentals- EUDAT Summer School (Giuseppe Fiameni, CINECA)Data streaming fundamentals- EUDAT Summer School (Giuseppe Fiameni, CINECA)
Data streaming fundamentals- EUDAT Summer School (Giuseppe Fiameni, CINECA)EUDAT
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsAntonio Severien
 
Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Geoffrey Fox
 
Relational machine-learning
Relational machine-learningRelational machine-learning
Relational machine-learningBhushan Kotnis
 
K-means Clustering with Scikit-Learn
K-means Clustering with Scikit-LearnK-means Clustering with Scikit-Learn
K-means Clustering with Scikit-LearnSarah Guido
 
Linear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsLinear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsHesen Peng
 
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler..."Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...Dataconomy Media
 

What's hot (20)

MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016
 
Josh Patterson MLconf slides
Josh Patterson MLconf slidesJosh Patterson MLconf slides
Josh Patterson MLconf slides
 
Large-Scale Graph Computation on Just a PC: Aapo Kyrola Ph.D. thesis defense
Large-Scale Graph Computation on Just a PC: Aapo Kyrola Ph.D. thesis defenseLarge-Scale Graph Computation on Just a PC: Aapo Kyrola Ph.D. thesis defense
Large-Scale Graph Computation on Just a PC: Aapo Kyrola Ph.D. thesis defense
 
Scalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data ShardingScalable Machine Learning: The Role of Stratified Data Sharding
Scalable Machine Learning: The Role of Stratified Data Sharding
 
GraphChi big graph processing
GraphChi big graph processingGraphChi big graph processing
GraphChi big graph processing
 
Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLab
 
Entity embeddings for categorical data
Entity embeddings for categorical dataEntity embeddings for categorical data
Entity embeddings for categorical data
 
The Other HPC: High Productivity Computing in Polystore Environments
The Other HPC: High Productivity Computing in Polystore EnvironmentsThe Other HPC: High Productivity Computing in Polystore Environments
The Other HPC: High Productivity Computing in Polystore Environments
 
Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
 
CS267_Graph_Lab
CS267_Graph_LabCS267_Graph_Lab
CS267_Graph_Lab
 
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...
Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, B...
 
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETSFAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
 
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph AnalysisICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis
 
Data streaming fundamentals- EUDAT Summer School (Giuseppe Fiameni, CINECA)
Data streaming fundamentals- EUDAT Summer School (Giuseppe Fiameni, CINECA)Data streaming fundamentals- EUDAT Summer School (Giuseppe Fiameni, CINECA)
Data streaming fundamentals- EUDAT Summer School (Giuseppe Fiameni, CINECA)
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel Visualizing and Clustering Life Science Applications in Parallel 
Visualizing and Clustering Life Science Applications in Parallel 
 
Relational machine-learning
Relational machine-learningRelational machine-learning
Relational machine-learning
 
K-means Clustering with Scikit-Learn
K-means Clustering with Scikit-LearnK-means Clustering with Scikit-Learn
K-means Clustering with Scikit-Learn
 
Linear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actionsLinear regression on 1 terabytes of data? Some crazy observations and actions
Linear regression on 1 terabytes of data? Some crazy observations and actions
 
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler..."Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
"Quantum clustering - physics inspired clustering algorithm", Sigalit Bechler...
 

Similar to Updating PageRank for Streaming Graphs

Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...MLconf
 
Data science courses in bangalore
Data science courses in bangaloreData science courses in bangalore
Data science courses in bangaloreprathyusha1234
 
Data analytics certification courses
Data analytics certification coursesData analytics certification courses
Data analytics certification coursesprathyusha1234
 
Data analytics training in chennai
Data analytics training in chennaiData analytics training in chennai
Data analytics training in chennaiprathyusha1234
 
Analytics certification course in pune
Analytics certification course in puneAnalytics certification course in pune
Analytics certification course in puneprathyusha1234
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online courseprathyusha1234
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online courseprathyusha1234
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online courseprathyusha1234
 
Data science online certification
Data science online certificationData science online certification
Data science online certificationprathyusha1234
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online courseprathyusha1234
 
Data science certification course
Data science certification courseData science certification course
Data science certification courseTejaspathiLV
 
data science certification
data science certificationdata science certification
data science certificationdevipatnala1
 
Data analytices training in bangalore
Data analytices training in bangaloreData analytices training in bangalore
Data analytices training in bangaloredevipatnala1
 
Data analytics courses in gurgaon
Data analytics courses in gurgaonData analytics courses in gurgaon
Data analytics courses in gurgaonprathyusha1234
 
Business analytics course in hyderabad
Business analytics course in hyderabadBusiness analytics course in hyderabad
Business analytics course in hyderabadprathyusha1234
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online courseprathyusha1234
 
Business analytics training pune
Business analytics training puneBusiness analytics training pune
Business analytics training puneprathyusha1234
 
Data analytics training in bangalore
Data analytics training in bangaloreData analytics training in bangalore
Data analytics training in bangaloreprathyusha1234
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online courseprathyusha1234
 

Similar to Updating PageRank for Streaming Graphs (20)

Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
 
Data science courses in bangalore
Data science courses in bangaloreData science courses in bangalore
Data science courses in bangalore
 
Data analytics certification courses
Data analytics certification coursesData analytics certification courses
Data analytics certification courses
 
Data analytics training in chennai
Data analytics training in chennaiData analytics training in chennai
Data analytics training in chennai
 
Analytics certification course in pune
Analytics certification course in puneAnalytics certification course in pune
Analytics certification course in pune
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online course
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online course
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online course
 
Data science online certification
Data science online certificationData science online certification
Data science online certification
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online course
 
Data science certification course
Data science certification courseData science certification course
Data science certification course
 
data science certification
data science certificationdata science certification
data science certification
 
Data analytices training in bangalore
Data analytices training in bangaloreData analytices training in bangalore
Data analytices training in bangalore
 
Data analytics courses in gurgaon
Data analytics courses in gurgaonData analytics courses in gurgaon
Data analytics courses in gurgaon
 
Business analytics course in hyderabad
Business analytics course in hyderabadBusiness analytics course in hyderabad
Business analytics course in hyderabad
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online course
 
Business analytics training pune
Business analytics training puneBusiness analytics training pune
Business analytics training pune
 
Data analytics training in bangalore
Data analytics training in bangaloreData analytics training in bangalore
Data analytics training in bangalore
 
Data analytices
Data analytices Data analytices
Data analytices
 
Business analytics online course
Business analytics online courseBusiness analytics online course
Business analytics online course
 

More from Jason Riedy

Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFJason Riedy
 
LAGraph 2021-10-13
LAGraph 2021-10-13LAGraph 2021-10-13
LAGraph 2021-10-13Jason Riedy
 
Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFJason Riedy
 
Graph analysis and novel architectures
Graph analysis and novel architecturesGraph analysis and novel architectures
Graph analysis and novel architecturesJason Riedy
 
GraphBLAS and Emus
GraphBLAS and EmusGraphBLAS and Emus
GraphBLAS and EmusJason Riedy
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureJason Riedy
 
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...Jason Riedy
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureJason Riedy
 
Novel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondNovel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondJason Riedy
 
Characterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksCharacterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksJason Riedy
 
CRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateCRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateJason Riedy
 
Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Jason Riedy
 
Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesJason Riedy
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsJason Riedy
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsJason Riedy
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs Jason Riedy
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsHigh-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsJason Riedy
 
Network Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity AnalysisNetwork Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity AnalysisJason Riedy
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsJason Riedy
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsJason Riedy
 

More from Jason Riedy (20)

Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
LAGraph 2021-10-13
LAGraph 2021-10-13LAGraph 2021-10-13
LAGraph 2021-10-13
 
Lucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoFLucata at the HPEC GraphBLAS BoF
Lucata at the HPEC GraphBLAS BoF
 
Graph analysis and novel architectures
Graph analysis and novel architecturesGraph analysis and novel architectures
Graph analysis and novel architectures
 
GraphBLAS and Emus
GraphBLAS and EmusGraphBLAS and Emus
GraphBLAS and Emus
 
Reproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to ArchitectureReproducible Linear Algebra from Application to Architecture
Reproducible Linear Algebra from Application to Architecture
 
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
PEARC19: Wrangling Rogues: A Case Study on Managing Experimental Post-Moore A...
 
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to ArchitectureICIAM 2019: Reproducible Linear Algebra from Application to Architecture
ICIAM 2019: Reproducible Linear Algebra from Application to Architecture
 
Novel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and BeyondNovel Architectures for Applications in Data Science and Beyond
Novel Architectures for Applications in Data Science and Beyond
 
Characterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with MicrobenchmarksCharacterization of Emu Chick with Microbenchmarks
Characterization of Emu Chick with Microbenchmarks
 
CRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery UpdateCRNCH 2018 Summit: Rogues Gallery Update
CRNCH 2018 Summit: Rogues Gallery Update
 
Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018Augmented Arithmetic Operations Proposed for IEEE-754 2018
Augmented Arithmetic Operations Proposed for IEEE-754 2018
 
Graph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New ArchitecturesGraph Analysis: New Algorithm Models, New Architectures
Graph Analysis: New Algorithm Models, New Architectures
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
 
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing PlatformsCRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
CRNCH Rogues Gallery: A Community Core for Novel Computing Platforms
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
High-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming GraphsHigh-Performance Analysis of Streaming Graphs
High-Performance Analysis of Streaming Graphs
 
Network Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity AnalysisNetwork Challenge: Error and Sensitivity Analysis
Network Challenge: Error and Sensitivity Analysis
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
 
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel PlatformsSTING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
STING: Spatio-Temporal Interaction Networks and Graphs for Intel Platforms
 

Recently uploaded

Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 

Recently uploaded (20)

Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 

Updating PageRank for Streaming Graphs

  • 1. Updating PageRank for Streaming Graphs E. Jason Riedy Graph Algorithms Building Blocks 23 May 2016 School of Computational Science and Engineering Georgia Institute of Technology
  • 3. (insert prefix here)-scale data analysis Cyber-security Identify anomalies, malicious actors Health care Finding outbreaks, population epidemiology Social networks Advertising, searching, grouping Intelligence Decisions at scale, regulating algorithms Systems biology Understanding interactions, drug design Power grid Disruptions, conservation Simulation Discrete events, cracking meshes • Graphs are a motif / theme in data analysis. • Changing and dynamic graphs are important!Riedy, GABB 2016 3/32
  • 4. Outline 1. Motivation and background 2. Incremental PageRank • Linear algebra & streaming graph data • Maintaining accuracy! 3. Performance and implementation aspects • Requests for GraphBLAS implementations Riedy, GABB 2016 4/32
  • 5. Potential Applications • Social Networks • Identify communities, influences, bridges, trends, anomalies (trends before they happen)... • Potential to help social sciences, city planning, and others with large-scale data. • Cybersecurity • Determine if new connections can access a device or represent new threat in < 5ms... • Is the transfer by a virus / persistent threat? • Bioinformatics, health • Construct gene sequences, analyze protein interactions, map brain interactions • Credit fraud forensics ⇒ detection ⇒ monitoring Riedy, GABB 2016 5/32
  • 6. Streaming graph data Networks data rates: • Gigabit ethernet: 81k – 1.5M packets per second • Over 130 000 flows per second on 10 GigE (< 7.7 µs) Person-level data rates: • 500M posts per day on Twitter (6k / sec)1 • 3M posts per minute on Facebook (50k / sec)2 We need to analyze only changes and not entire graph. Throughput & latency trade offs expose different levels of concurrency. 1 www.internetlivestats.com/twitter-statistics/ 2 www.jeffbullas.com/2015/04/17/21-awesome-facebook-facts-and-statistics-you-need-to-check-out/ Riedy, GABB 2016 6/32
  • 7. Streaming graph analysis Terminology: • Streaming changes into a massive, evolving graph • Not CS streaming algorithm (tiny memory) • Need to handle deletions as well as insertions Previous throughput results (not comprehensive review): Data ingest >2M up/sec [Ediger, McColl, Poovey, Campbell, & Bader 2014] Clustering coefficients >100K up/sec [R, Meyerhenke, Bader, Ediger, & Mattson 2012] Connected comp. >1M up/sec [McColl, Green, & Bader 2013] Community clustering >100K up/sec∗ [R & Bader 2013] Riedy, GABB 2016 7/32
  • 9. PageRank Everyone’s “favorite” metric: PageRank. • Stationary distribution of the random surfer model. • Eigenvalue problem can be re-phrased as a linear system3 ( I − αAT D−1 ) x = kv, with α teleportation constant < 1 A adjacency matrix D diagonal matrix of out degrees, with x/0 = x (self-loop) v personalization vector, ∥v∥1 = 1 k scaling constant 3 [Gleich, Zhukov, & Berkhin 2004; Del Corso, Gull, & Romani 2005]Riedy, GABB 2016 9/32
  • 10. Incremental PageRank: Goals • Efficiently update for streaming data; update PageRank without touching the entire graph. • Keep the results accurate. • Updates can wander, and ranks deceive... • Existing methods: • Compute “summaries” of non-changed portions: Walk whole graph per change. [Langville & Meyer, 2006] • Maintain databases of walks for dynamic resampling [Bahmani, Chowdhury, & Goel 2010] • Statistically based idea, very similar but... [Ohsaka, et al. 2015] Riedy, GABB 2016 10/32
  • 11. Incremental PageRank: First pass • Let A∆ = A + ∆A, D∆ = D + ∆D for the new graph, want to solve for x + ∆x. • A: sparse and row-major, Ai,j = 1 if i → j ∈ edges • Simple algebra: ( I − αAT ∆D−1 ∆ ) ∆x = α ( A∆D−1 ∆ − AD−1 ) x • The operator on the right-hand side, A∆D−1 ∆ − AD−1 , is sparse; non-zero only adjacent to changes in ∆. • Re-arrange for Jacobi (stationary iterative method), ∆x(k+1) = αAT ∆D−1 ∆ ∆x(k) + α ( A∆D−1 ∆ − AD−1 ) x, iterate, ... Riedy, GABB 2016 11/32
  • 12. Incremental PageRank: Accumulating error • And fail. The updated solution wanders away from the true solution. Top rankings stay the same... Riedy, GABB 2016 12/32
  • 13. Incremental PageRank: Think instead • Backward error view: The new problem is close to the old one, solution may be close. • How close? Residual: r′ = kv − x + αA∆D−1 ∆ x = r + α ( A∆D−1 ∆ − AD−1 ) x. • Solve (I − αA∆D−1 ∆ )∆x = r′ (iterative refinement). • Cheat by not refining all of r′ , only region growing around the changes: (I − αA∆D−1 ∆ )∆x = r′ |∆ • (Also cheat by updating r rather than recomputing at the changes.) Riedy, GABB 2016 13/32
  • 16. Performance discussion • Inherent trade-off between high throughput and low latency. Throughput Large batches imply great parallelism Latency Small batches are highly independent • Restarting PR iteration exposes massive parallelism • Sparse matrix - dense vector mult. (SpMV) • Incremental is highly sparse, pays in overhead • Sparse matrix - sparse vector mult. (SpMSpV) • Results are worst-case: changes are not related to conductance communities. • (Also, very un-tuned SpMSpV...) Riedy, GABB 2016 16/32
  • 17. Test cases Using one CPU in an 8-core Intel Westmere-EX (E7-4820), 2.00GHz and 18MiB L3... Graph |V| |E| Avg. Degree Size (MiB) power 4941 6594 1.33 0.07 power grid PGPgiantcompo 10680 24316 2.28 0.23 social network caidaRouterLevel 192244 609066 3.17 5.38 networking belgium.osm 1441295 1549970 1.08 22.82 road map coPapersCiteseer 434102 16036720 36.94 124.01 citation Riedy, GABB 2016 17/32
  • 18. Incremental PageRank: Worst latency q q q q q q q q q q q q q q q 0.001 0.010 0.001 0.010 0.01 0.10 1.00 1 100 1 10 100 powerPGPgiantcompocaidaRouterLevelbelgium.osmcoPapersCiteseer 10 100 1000 Batch size Updatetime(s) Algorithm q dpr dprheld pr_restart Incremental: Best at 4 threads. RPR: Best at 8. Riedy, GABB 2016 18/32
  • 19. Incremental PageRank: Worst throughput 100 q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q 4000 6000 8000 2500 5000 7500 10000 0 5000 10000 15000 0 1000 2000 3000 4000 0 250 500 750 1000 powerPGPgiantcompocaidaRouterLevelbelgium.osmcoPapersCiteseer 1 2 3 4 5 6 7 8 9 10 Batch Number Updatethroughput(edges/sec) Alg. q dpr dprheld pr pr_restart Incremental: Best at 4 threads. RPR: Best at 8. Riedy, GABB 2016 19/32
  • 20. Median update time (s) Graph Batch dpr dpr_held pr_restart power 10 .00304 1.8× .00008 68× .00535 100 .0109 .79× .00707 1.2× .00860 1000 .0126 .67× .0124 .68× .00843 PGPgiantcompo 10 .00211 4.3× .00023 39× .00916 100 .0257 .81× .00874 2.4× .0207 1000 .0372 .67× .0341 .73× .0249 caidaRouterLevel 10 .00710 16× .00199 57× .112 100 .0314 7.1× .00477 47× .224 1000 1.30 .68× .2290 3.9× .889 belgium.osm 10 .0118 42× .0128 39× .498 100 .0127 39× .0131 38× .499 1000 .0461 52× .0171 140× 2.41 coPapersCiteseer 10 .0729 27× .0128 155× 1.98 100 .8650 2.3× .130 15× 1.98 1000 2.97 1.3× 1.13 3.5× 3.95 Riedy, GABB 2016 20/32
  • 21. Fraction of traversed edges Graph Batch dpr dpr_held pr_restart power 10 7.54 2.4× 0.0229 790× 18 100 29.2 1.2× 18.2 1.9× 34 1000 38.3 1.0× 37.1 1.0× 39 PGPgiantcompo 10 1.58 5.1× 0.0453 180× 8 100 21.3 1.1× 6.82 3.6× 24 1000 31.2 1.0× 28.4 1.1× 32 caidaRouterLevel 10 0.0625 32× 0.00252 790× 2 100 0.409 9.8× 0.0301 130× 4 1000 15.2 1.1× 3.07 5.2× 16 belgium.osm 10 0.00020 9800× 0.00007 30000× 2 100 0.00203 990× 0.00066 3000× 2 1000 0.120 84× 0.00660 1500× 10 coPapersCiteseer 10 0.0563 36× 0.00646 310× 2 100 0.689 2.9× 0.0952 21× 2 1000 2.32 1.7× 0.864 4.6× 4 Riedy, GABB 2016 21/32
  • 22. GraphBLAS Requests: Beat Down Overhead!
  • 23. GraphBLAS background • Provide a means to express linear-algebra-like graph algorithms • Graphs can be modeled as (sparse) matrices • Adjacency (used here) • Vertex-edge • Bipartite... • Some graph algorithms iterate as if applying a linear operator • BFS • Betweenness centrality • PageRank • Ok, PageRank actually is linear algebra... Riedy, GABB 2016 23/32
  • 24. Lessons from sparse linear algebra • Overhead is important. • Small average degree • O(|E|) is O(|V|)... • If the average degree is 8, 8|V| storage is the matrix. • Graphs: Some very large degrees • Graphs: Low average diameter • Details matter • Entries that disappear are painful • Take care with definitions on |V|... Riedy, GABB 2016 24/32
  • 25. GraphBLAS in updating PR dpr_core (A, z, r, ∆) Let D = diagonal matrix of vertex out-degrees z = α(AT D−1 x|∆ − z) Note: z = AT D−1 x before update ∆x(1) = z + r|z For k = 1 ... itmax ∆x(k+1) = αAT D−1 ∆x (k) |≥γ + α∆x (k) |<γ + z ∆x(k+1) = ∆x(k+1) + r|∆x(k+1) Stop if ∥x(k+1) − x(k) ∥1 < τ ∆r = z + ∆x(k+1) − αAT D−1 ∆x(k+1) Return ∆x(k+1) and ∆r Fuse common sparse operation sequences to reduce overhead. Riedy, GABB 2016 25/32
  • 26. GraphBLAS in updating PR dpr_core (A, z, r, ∆) Let D = diagonal matrix of vertex out-degrees z = α(AT D−1 x|∆ − z) Note: z = AT D−1 x before update ∆x(1) = z + r|z For k = 1 ... itmax ∆x(k+1) = αAT D−1 ∆x (k) |≥γ + α∆x (k) |<γ + z ∆x(k+1) = ∆x(k+1) + r|∆x(k+1) Stop if ∥x(k+1) − x(k) ∥1 < τ ∆r = z + ∆x(k+1) − αAT D−1 ∆x(k+1) Return ∆x(k+1) and ∆r Region-growing: Don’t duplicate / realloc only-extended patterns. Riedy, GABB 2016 26/32
  • 27. GraphBLAS in updating PR dpr_core (A, z, r, ∆) Let D = diagonal matrix of vertex out-degrees z = α(AT D−1 x|∆ − z) Note: z = AT D−1 x before update ∆x(1) = z + r|z For k = 1 ... itmax ∆x(k+1) = αAT D−1 ∆x (k) |≥γ + α∆x (k) |<γ + z ∆x(k+1) = ∆x(k+1) + r|∆x(k+1) Stop if ∥x(k+1) − x(k) ∥1 < τ ∆r = z + ∆x(k+1) − αAT D−1 ∆x(k+1) Return ∆x(k+1) and ∆r Fuse tests into loops when feasible. Riedy, GABB 2016 27/32
  • 28. GraphBLAS in updating PR dpr_core (A, z, r, ∆) Let D = diagonal matrix of vertex out-degrees z = α(AT D−1 x|∆ − z) Note: z = AT D−1 x before update ∆x(1) = z + r|z For k = 1 ... itmax ∆x(k+1) = αAT D−1 ∆x (k) |≥γ + α∆x (k) |<γ + z ∆x(k+1) = ∆x(k+1) + r|∆x(k+1) Stop if ∥x(k+1) − x(k) ∥1 < τ ∆r = z + ∆x(k+1) − αAT D−1 ∆x(k+1) Return ∆x(k+1) and ∆r Looks like the optimization point? Few vertices in ∆ means all steps matter. Riedy, GABB 2016 28/32
  • 29. GraphBLAS in updating PR dpr_core (A, z, r, ∆) Let D = diagonal matrix of vertex out-degrees z = α(AT D−1 x|∆ − z) Note: z = AT D−1 x before update ∆x(1) = z + r|z For k = 1 ... itmax ∆x(k+1) = αAT D−1 ∆x (k) |≥γ + α∆x (k) |<γ + z ∆x(k+1) = ∆x(k+1) + r|∆x(k+1) Stop if ∥x(k+1) − x(k) ∥1 < τ ∆r = z + ∆x(k+1) − αAT D−1 ∆x(k+1) Return ∆x(k+1) and ∆r Support fast restrictions to a known pattern. Riedy, GABB 2016 29/32
  • 30. GraphBLAS in updating PR dpr_core (A, z, r, ∆) Let D = diagonal matrix of vertex out-degrees z = α(AT D−1 x|∆ − z) Note: z = AT D−1 x before update ∆x(1) = z + r|z For k = 1 ... itmax ∆x(k+1) = αAT D−1 ∆x (k) |≥γ + α∆x (k) |<γ + z ∆x(k+1) = ∆x(k+1) + r|∆x(k+1) Stop if ∥x(k+1) − x(k) ∥1 < τ ∆r = z + ∆x(k+1) − αAT D−1 ∆x(k+1) Return ∆x(k+1) and ∆r Keep growth in order, no need to subtract on new entries. Riedy, GABB 2016 30/32
  • 31. New experience from streaming graphs For GraphBLAS-ish low-latency streaming algorithms: • Fuse common sparse operation sequences to reduce overhead. • Region-growing: Don’t duplicate / realloc only-extended patterns. • Fuse tests into loops when feasible. • Remember the sparse vector set case! • Support fast restrictions to a known pattern. • Keep growth in order, no need to subtract on new entries. Riedy, GABB 2016 31/32
  • 32. STINGER: Where do you get it? Home: www.cc.gatech.edu/stinger/ Code: git.cc.gatech.edu/git/u/eriedy3/stinger.git/ This code: src/clients/algorithms/pagerank_updating/. Gateway to • code, • development, • documentation, • presentations... Remember: Academic code, but maturing with contributions. Users / contributors / questioners: Georgia Tech, PNNL, CMU, Berkeley, Intel, Cray, NVIDIA, IBM, Federal Government, Ionic Security, Citi, ...Riedy, GABB 2016 32/32