SWeG: Lossless and Lossy Summarization of Web-Scale Graphs
1. SWeG: Lossless and Lossy Summarization of Web-Scale Graphs
Kijung Shin Amol Ghoting Myunghwan Kim Hema Raghavan
SWeG: Lossless and Lossy Summarization of Web-Scale Graphs (by Kijung Shin)
3. Graphs are Everywhere (cont.)
4. Graphs are Everywhere (cont.)
5. Real-world Graphs are Huge
2B+ active users
500M+ products
300M+ customers
4B+ Web pages
600M+ users
20B+ connections
× 30+
How can we analyze and utilize such large graph data?
6. Limitations of Existing Tools
•Graph algorithms in textbooks
◦ Assume graphs fit in main memory (i.e., random-access memory)
•Tools for out-of-core graph processing
◦ Not for every graph algorithm
◦ Requiring engineering for each algorithm
◦ Inappropriate for real-time applications
1000+ new graph algorithms per year!
7. Solution: Graph Compression
Figure labels: textbook algorithms, any new algorithms, real-time algorithms
•Compressing large graphs so that
◦ Compressed data fit in main memory
◦ Algorithms can be performed on compressed data without changes
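One way to read "without changes" is that an algorithm only needs a neighbor query, so it does not care how the graph is stored. The sketch below is illustrative only: `bfs_distances` and `neighbors_of` are hypothetical names, and the plain dictionary stands in for any representation (compressed or not) that can answer neighbor queries.

```python
from collections import deque

def bfs_distances(source, neighbors_of):
    """Breadth-first search that touches the graph only through neighbors_of(v)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in neighbors_of(v):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

# Toy uncompressed graph (hypothetical data); a compressed representation
# would supply its own neighbors_of function with the same contract.
plain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(bfs_distances("a", lambda v: plain[v]))
```

Swapping in a different `neighbors_of` leaves the algorithm itself untouched.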
8. Roadmap
• Problem Definition <<
• Proposed Method: SWeG
• Experimental Results
• Conclusions
12. Why Graph Summarization?
• Neighbor queries can be rapid and efficient
◦ Given a seed node, return its neighbors
◦ Key building block of most graph algorithms
• Easily extended to lossy compression
• Easily combined with other graph-compression methods
◦ the outputs are also graphs
Figure: Summary Graph with supernodes {𝑎, 𝑏}, {𝑐, 𝑑, 𝑒}, {𝑓, 𝑔}; Residual Graph (Positive): +(𝑑, 𝑔); Residual Graph (Negative): −(𝑎, 𝑑), −(𝑐, 𝑒)
(discussed in the paper)
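A neighbor query on this representation can be sketched as follows. The supernodes and the residual edges are taken from the figure; the superedges are not recoverable from the extracted text, so the two used here ({a, b} to {c, d, e}, and a self-loop on {c, d, e}) are assumptions chosen to be consistent with the negative edges. The function name `neighbors` and the supernode ids "A", "B", "C" are hypothetical.

```python
def neighbors(v, supernode_of, members, superedges, c_plus, c_minus):
    """Return the neighbors of v in the original graph, decoded on the fly."""
    sv = supernode_of[v]
    result = set()
    for edge in superedges:
        if sv in edge:
            # A self-loop superedge is stored as a one-element frozenset.
            for su in (edge - {sv}) or {sv}:
                result |= members[su]
    result.discard(v)  # a node is not its own neighbor
    for x, y in c_minus:  # edges implied by superedges but absent in the graph
        if v == x:
            result.discard(y)
        elif v == y:
            result.discard(x)
    for x, y in c_plus:  # edges present in the graph but not implied by superedges
        if v == x:
            result.add(y)
        elif v == y:
            result.add(x)
    return result

members = {"A": {"a", "b"}, "B": {"c", "d", "e"}, "C": {"f", "g"}}
supernode_of = {v: s for s, vs in members.items() for v in vs}
superedges = {frozenset({"A", "B"}), frozenset({"B"})}  # assumed, see above
c_plus = {("d", "g")}
c_minus = {("a", "d"), ("c", "e")}

print(sorted(neighbors("d", supernode_of, members, superedges, c_plus, c_minus)))
# ['b', 'c', 'e', 'g']
```

The query touches only the superedges incident to one supernode plus the residual edges incident to the seed node, which is why it can stay cheap on a well-compressed graph.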
13. Challenge: Scalability!
Figure: compression performance (good to bad) versus maximum size of input graphs (millions, 10 millions, billions) for VoG [KKVF14], Greedy [NSR08], Randomized [NSR08], SAGS [KNL15], and SWeG (Proposed); annotation: 10,000×
14. Our Contribution: SWeG
• We develop SWeG (Summarizing Web Graphs):
◦ Fast with Concise Outputs
◦ Memory Efficient
◦ Scalable
15. Roadmap
• Problem Definition
• Proposed Method: SWeG <<
• Experimental Results
• Conclusions
16. Main Idea behind SWeG
• Graph summarization ≈ node clustering
◦ Finding sets of similar nodes to be merged into super nodes
• Previous heuristics are greedy algorithms
• Why are even greedy algorithms slow?
◦ Too many node pairs are considered: 𝑂(𝑛³)
• How can we reduce the number of node pairs to be considered without missing similar node pairs?
Figure: nodes 𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓, 𝑔 merged into supernodes {𝑎, 𝑏}, {𝑐, 𝑑, 𝑒}, {𝑓, 𝑔}
17. Main Idea behind SWeG (cont.)
• Step 1: Coarse clustering (Grouping)
◦ Fast and careless
• Step 2: Fine clustering (Merging)
◦ Greedy algorithm
◦ Only node pairs within each group are considered
• Steps 1 and 2 are repeated
Figure: nodes 𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓, 𝑔 merged into supernodes {𝑎, 𝑏}, {𝑐, 𝑑, 𝑒}, {𝑓, 𝑔}
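The two-step loop can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the authors' implementation: the slides do not say how groups are formed or which merge criterion is used, so the sketch assumes a min-hash of each supernode's closed neighborhood as the grouping key and a Jaccard-similarity threshold as the merge test. All function names, the `threshold` parameter, and the toy graph are hypothetical; the default of 20 iterations follows the "about 20 iterations" observation in the experiments.

```python
import random
from collections import defaultdict

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_nodes(adj, supernodes, seed):
    """Step 1 (coarse): bucket supernodes by a min-hash of their closed neighborhoods."""
    order = list(adj)
    random.Random(seed).shuffle(order)  # a random permutation acts as the hash
    rank = {v: i for i, v in enumerate(order)}
    groups = defaultdict(list)
    for sid, nodes in supernodes.items():
        closed = set(nodes)
        for v in nodes:
            closed |= adj[v]
        groups[min(rank[v] for v in closed)].append(sid)
    return list(groups.values())

def merge_within(group, adj, supernodes, threshold):
    """Step 2 (fine): greedy merging, comparing only pairs inside one group."""
    def nbrs(sid):
        return set().union(*(adj[v] for v in supernodes[sid]))
    group = list(group)
    while len(group) > 1:
        a = group.pop()
        best = max(group, key=lambda b: jaccard(nbrs(a), nbrs(b)))
        if jaccard(nbrs(a), nbrs(best)) >= threshold:
            supernodes[best] |= supernodes.pop(a)

def summarize(adj, iterations=20, threshold=0.5):
    supernodes = {v: {v} for v in adj}  # start with one supernode per node
    for t in range(iterations):  # a fresh grouping in every iteration
        for group in group_nodes(adj, supernodes, seed=t):
            merge_within(group, adj, supernodes, threshold)
    return sorted(sorted(s) for s in supernodes.values())

# Toy undirected graph (hypothetical): edges a-c, a-d, b-c, b-d, c-e, d-e, f-g.
adj = {
    "a": {"c", "d"}, "b": {"c", "d"}, "c": {"a", "b", "e"},
    "d": {"a", "b", "e"}, "e": {"c", "d"}, "f": {"g"}, "g": {"f"},
}
print(summarize(adj))
```

Because each iteration regroups the supernodes with a different permutation, a similar pair that lands in different groups in one round still has a chance to meet in a later round, while each round compares only pairs inside small groups.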
32. EXP1. Speed and Compression
SWeG outperforms its competitors
◦ SWeG is 370× to 4,490× faster
33. Advantages of SWeG
Fast with Concise Outputs
Memory Efficient
Scalable
35. Advantages of SWeG
Fast with Concise Outputs
Memory Efficient
Scalable
36. EXP3. Effect of Iterations
About 20 iterations are enough
37. EXP4. Data Scalability
SWeG scales linearly with the number of edges
Figure: SWeG (Hadoop) and SWeG (Single machine); annotation: ≥ 20 billion edges
39. Advantages of SWeG
Fast with Concise Outputs
Memory Efficient
Scalable
40. EXP6. Further Compression
SWeG+ achieves ~0.7 bits per link for Web graphs
Figure: compared methods BV, BFS, BP, and SWeG+
◦ 1.2× to 3.4× additional compression
41. Roadmap
• Problem Definition
• Proposed Method: SWeG
• Experimental Results
• Conclusions <<
42. Conclusions
• We propose SWeG (Summarizing Web Graphs)
◦ for summarizing large-scale graphs
Fast with Concise Outputs
Memory Efficient
Scalable