Mining Billion-node Graphs: Patterns, Generators and Tools Christos Faloutsos CMU
Thanks! Hadoop Summit '10 C. Faloutsos (CMU)
Our goal:
Outline
Graphs - why should we care? Internet Map [lumeta.com]; Food Web [Martinez ’91]; Protein Interactions [genomebiology.com]; Friendship Network [Moody ’01]
Graphs - why should we care? [Figure: nodes D1 ... DN and T1 ... TM]
Problem #1 - network and graph mining
Graph mining
Laws and patterns
Solution #S.1 [Figure: log(degree) vs. log(rank) for internet domains, e.g. att.com, ibm.com; slope -0.82]
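A power law shows up as a straight line on a log-log plot, so its exponent can be estimated as the slope of a least-squares fit in log space. The sketch below is only illustrative: the function name `loglog_slope` and the synthetic rank/degree data are assumptions, not the data behind the slide.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx

# Hypothetical, noise-free data: degree = 1000 * rank^(-0.82)
ranks = list(range(1, 101))
degrees = [1000.0 * r ** -0.82 for r in ranks]
slope = loglog_slope(ranks, degrees)
print(round(slope, 2))  # -0.82
```

On real data the fit is noisy, especially in the tail, so a plain least-squares slope should be treated as a rough estimate.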
Solution #S.2: Eigen Exponent E [Figure: eigenvalue vs. rank of decreasing eigenvalue, May 2001; exponent = slope, E = -0.48]
But:
More power laws: [Figure: web site traffic; count (log scale) vs. in-degree (log scale); Zipf; users, sites; ``ebay’’]
epinions.com [Figure: count vs. (out) degree; a user who trusts 2000 people]
And numerous more
Solution #S.3: Triangle ‘Laws’
Triangle Law: #S.3 [Tsourakakis ICDM 2008] [Figure: panels ASN, HEP-TH, Epinions; X-axis: # of triangles a node participates in; Y-axis: count of such nodes]
Triangle Law: #S.4 [Tsourakakis ICDM 2008] [Figure: panels SN, Reuters, Epinions; X-axis: degree; Y-axis: mean # triangles] n friends -> ~n^1.6 triangles
Triangle Law: Computations [Tsourakakis ICDM 2008] But: triangles are expensive to compute (3-way join; several approx. algos). Q: Can we do that quickly? A: Yes! $\#\text{triangles} = \frac{1}{6}\sum_i \lambda_i^3$, where the $\lambda_i$ are the eigenvalues of the adjacency matrix (and, because of skewness, we only need the top few eigenvalues!)
Triangle Law: Computations [Tsourakakis ICDM 2008] 1000x+ speed-up, >90% accuracy
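The eigenvalue identity can be checked on a small graph. The sketch below uses NumPy and compares the exact count, trace(A^3)/6, with the sum of cubed eigenvalues. Keeping the `k` eigenvalues of largest magnitude is an assumption of this sketch about what "top few" means, and the function names are illustrative.

```python
import numpy as np

def triangles_exact(adj):
    """Exact triangle count of a simple undirected graph: trace(A^3) / 6."""
    a = np.asarray(adj, dtype=float)
    return int(round(np.trace(a @ a @ a) / 6))

def triangles_spectral(adj, k=None):
    """(1/6) * sum of cubed eigenvalues; if k is given, use only the
    k eigenvalues of largest magnitude (an approximation)."""
    eig = np.linalg.eigvalsh(np.asarray(adj, dtype=float))
    if k is not None:
        eig = eig[np.argsort(-np.abs(eig))[:k]]
    return float(np.sum(eig ** 3) / 6)

# Toy graph: a 4-clique (4 triangles) plus one pendant node attached to node 3.
adj = np.zeros((5, 5), dtype=int)
for i in range(4):
    for j in range(4):
        if i != j:
            adj[i, j] = 1
adj[3, 4] = adj[4, 3] = 1

exact = triangles_exact(adj)        # 4
full = triangles_spectral(adj)      # matches the exact count up to rounding
approx = triangles_spectral(adj, 1) # uses only the dominant eigenvalue
print(exact, round(full, 6), approx)
```

On a billion-node graph the full spectrum is out of reach, which is why an approximation from a handful of dominant eigenvalues is attractive.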
EigenSpokes
EigenSpokes [Figure: axes u1 (1st principal component) and u2 (2nd principal component)]
EigenSpokes [Figure: u1 vs. u2; 90°]
EigenSpokes - pervasiveness
EigenSpokes - explanation [Figure: spy plot of top 20 nodes]
Bipartite Communities! [Figure: magnified bipartite community] patents from same inventor(s); cut-and-paste bibliography!
Observations on weighted graphs? M. McGlohon, L. Akoglu, and C. Faloutsos. Weighted Graphs and Disconnected Components: Patterns and a Generator. SIG-KDD 2008
Observation W.1: Fortification
Observation W.1: Fortification. More donors, more $? [Figure: edge weights $10, $5, $7; nodes ‘Reagan’, ‘Clinton’]
Observation W.1: Fortification: Snapshot Power Law [Figure: Orgs-Candidates; axes: edges (# donors) and in-weights ($)] e.g. John Kerry, $10M received, from 1K donors. More donors, even more $
Problem: Time evolution
T.1 Evolution of the Diameter
T.1 Diameter – “Patents” [Figure: diameter vs. time [years]]
T.2 Temporal Evolution of the Graphs
T.2 Densification – Patent Citations [Figure: E(t) vs. N(t); slope 1.66]
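As a worked illustration, read the 1.66 on the plot as the exponent a of a densification power law between the number of edges E(t) and the number of nodes N(t). That reading is an assumption about what the label denotes. Under it, doubling the node count roughly triples the edge count:

```latex
E(t) \propto N(t)^{a}, \quad a = 1.66
\quad\Longrightarrow\quad
\frac{E(t_2)}{E(t_1)} = \left(\frac{N(t_2)}{N(t_1)}\right)^{1.66},
\qquad
\frac{N(t_2)}{N(t_1)} = 2 \;\Rightarrow\; \frac{E(t_2)}{E(t_1)} = 2^{1.66} \approx 3.16 .
```

Equivalently, the average degree 2E/N grows like N^{0.66} instead of staying constant, which is what a>1 means.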
More on Time-evolving graphs. M. McGlohon, L. Akoglu, and C. Faloutsos. Weighted Graphs and Disconnected Components: Patterns and a Generator. SIG-KDD 2008
Observation T.3: NLCC behavior [Figure: IMDB; CC size vs. time-stamp]
Timing for Blogs
T.4: popularity over time. Post popularity drops off – exponentially? [Figure: # in links vs. lag (days after post), from @t to @t + lag]
T.4: popularity over time. Post popularity drops off – exponentially? POWER LAW! Exponent? [Figure: # in links (log) vs. days after post (log)]
T.4: popularity over time [Figure: # in links (log) vs. days after post (log); slope -1.6]
CenterPiece Subgraphs
Center-Piece Subgraph (CePS) Discovery [Tong+ KDD 06] Input: original graph with query nodes A, B, C. Q: Who is the most central node wrt the black nodes? (e.g., master-mind criminal, common advisor/collaborator, etc.)
Center-Piece Subgraph (CePS) Discovery [Tong+ KDD 06] Input: original graph. Output: CePS, with the CePS node. Q: How to find hub for the query nodes? A: Combine proximity scores (RWR)
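One way to picture "combine proximity scores (RWR)" is to run a random walk with restart from each query node and multiply the resulting scores per candidate node. The sketch below makes several assumptions that are not on the slide: the restart probability of 0.15, the product as the AND-style combination, the toy graph, and all function names.

```python
def rwr(adj, seed, restart=0.15, iters=100):
    """Random walk with restart from `seed`.

    adj: dict node -> list of neighbours (every node needs at least one).
    Returns dict node -> approximate steady-state visiting probability.
    """
    nodes = list(adj)
    score = {v: 1.0 if v == seed else 0.0 for v in nodes}
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            # Walk on with probability (1 - restart), spread evenly over neighbours.
            share = (1.0 - restart) * score[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        nxt[seed] += restart  # jump back to the seed with probability `restart`
        score = nxt
    return score

def center_piece(adj, queries):
    """Non-query node with the largest product of RWR scores from all query nodes."""
    per_query = [rwr(adj, q) for q in queries]
    best, best_score = None, -1.0
    for v in adj:
        if v in queries:
            continue
        combined = 1.0
        for scores in per_query:
            combined *= scores[v]
        if combined > best_score:
            best, best_score = v, combined
    return best

# Toy graph: query nodes A, B, C share the hub H; x, y, z are private neighbours.
adj = {
    "A": ["H", "x"], "B": ["H", "y"], "C": ["H", "z"],
    "H": ["A", "B", "C"], "x": ["A"], "y": ["B"], "z": ["C"],
}
best = center_piece(adj, ["A", "B", "C"])
print(best)  # H
```

The product rewards nodes that are close to every query node at once, so a node near only one of them scores low.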
CePS: Example (AND Query)
Graph X-Ray: Fast Best-Effort Pattern Matching in Large Attributed Graphs. Hanghang Tong, Brian Gallagher, Christos Faloutsos, Tina Eliassi-Rad. KDD’07
[Figure: Input: attributed data graph and query graph; Output: matching subgraph]
Effectiveness: star-query [Figure: query and result]
OddBall: Spotting Anomalies in Weighted Graphs. Leman Akoglu, Mary McGlohon, Christos Faloutsos. Carnegie Mellon University, School of Computer Science. To appear in PAKDD 2010, Hyderabad, India
Main idea
What is an egonet? [Figure: an ego and its egonet]
Selected Features
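As a minimal sketch, assume the usual definition of an egonet: a node (the ego), its neighbours, and all edges among them. The two counts computed below are illustrative choices and not necessarily the features selected on the slide; the function name and toy graphs are likewise assumptions.

```python
def egonet(adj, ego):
    """Return (nodes, edges) of the egonet of `ego`.

    adj: dict node -> set of neighbours, undirected, no self-loops.
    Edges are returned as frozensets so that (u, v) and (v, u) coincide.
    """
    nodes = {ego} | set(adj[ego])
    edges = {frozenset((u, v)) for u in nodes for v in adj[u] if v in nodes}
    return nodes, edges

# A star: the centre's egonet has 5 nodes and only 4 edges.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
star_nodes, star_edges = egonet(star, 0)

# A 4-clique: any ego's egonet has 4 nodes and all 6 possible edges.
clique = {i: {j for j in range(4) if j != i} for i in range(4)}
clique_nodes, clique_edges = egonet(clique, 0)

print(len(star_nodes), len(star_edges))      # 5 4
print(len(clique_nodes), len(clique_edges))  # 4 6
```

Comparing the edge count with the node count separates star-like egonets (few edges for their size) from clique-like ones (nearly all possible edges).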
Near-Clique/Star
Outline – Algorithms & results
Task: Centralized / Hadoop/PEGASUS
Degree Distr.: old / old
Pagerank: old / old
Diameter/ANF: old / DONE
Conn. Comp: old / DONE
Triangles: DONE
Visualization: STARTED
HADI for diameter estimation
[Figure: count vs. radius] 19+? [Barabasi+] (‘99, O(10^6) nodes)
[Figure: count vs. radius] 14 (dir.), ~7 (undir.); 19+? [Barabasi+] (‘99, O(10^6) nodes)
Shape?
Radius Plot of GCC of YahooWeb.
Running time - Kronecker and Erdos-Renyi graphs with billions of edges.
Generalized Iterated Matrix Vector Multiplication (GIMV). PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations. U Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. (ICDM) 2009, Miami, Florida, USA. Best Application Paper (runner-up).
Generalized Iterated Matrix Vector Multiplication (GIMV): Matrix–vector multiplication (iterated)
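The idea of a generalized, iterated matrix-vector product can be sketched in a few lines: keep the access pattern of an ordinary product but make the three arithmetic steps pluggable. The operation names (`combine2`, `combine_all`, `assign`), the single-machine dictionary representation, and the min-label rule for connected components are assumptions of this sketch, not the Hadoop implementation.

```python
def gimv(edges, vec, combine2, combine_all, assign):
    """One generalized matrix-vector step.

    edges: list of (i, j, m_ij) for the non-zero matrix entries.
    vec:   dict index -> current value.
    """
    partial = {i: [] for i in vec}
    for i, j, m in edges:
        partial[i].append(combine2(m, vec[j]))      # ordinary product: m * v_j
    return {i: assign(vec[i], combine_all(partial[i]))  # ordinary product: sum
            for i in vec}

# Ordinary multiplication as a special case: [[1, 2], [0, 3]] times [1, 1].
plain = gimv([(0, 0, 1), (0, 1, 2), (1, 1, 3)], {0: 1, 1: 1},
             combine2=lambda m, v: m * v,
             combine_all=sum,
             assign=lambda old, new: new)

def connected_components(nodes, undirected_edges):
    """Iterate GIM-V with 'take the minimum label' until nothing changes."""
    edges = [(u, v, 1) for u, v in undirected_edges]
    edges += [(v, u, 1) for u, v in undirected_edges]
    labels = {v: v for v in nodes}
    while True:
        new = gimv(edges, labels,
                   combine2=lambda m, v: v,
                   combine_all=lambda xs: min(xs, default=float("inf")),
                   assign=min)
        if new == labels:
            return labels
        labels = new

labels = connected_components(range(6), [(0, 1), (1, 2), (3, 4)])
print(plain)   # {0: 3, 1: 3}
print(labels)  # {0: 0, 1: 0, 2: 0, 3: 3, 4: 3, 5: 5}
```

Because every algorithm reduces to the same edge-scan pattern, only the three small functions change between, say, a PageRank-style product and component labelling.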
Example: GIM-V At Work [Figure: count vs. size] ~0.7B singleton nodes
Example: GIM-V At Work [Figure: count vs. size] 300-size cmpt X 500. Why? 1100-size cmpt X 65. Why?
Example: GIM-V At Work [Figure: count vs. size] suspicious financial-advice sites (not existing now)
GIM-V At Work: Stable tail slope after the gelling point
Visualization: ShiftR
Other topics - part#1 - tools
Tensors [Figure: axes Author, keyword; 1990]
Tensors [Figure: axes Author, keyword; 1990, 1991, 1992]
Tensors [Figure: axes Author, keyword; 1990, 1991, 1992; tensor ~ sum of components] PARAFAC tensor decomposition (generalization of SVD)
Other topics – part#2 - generators
Kronecker Product – a Graph
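To make the Kronecker product of a graph concrete, the sketch below takes the Kronecker product of a small adjacency matrix with itself, which yields the adjacency matrix of a larger graph. The 3-node seed matrix (a path with self-loops) and the helper name `kron` are assumptions chosen for illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rb, cb = len(b), len(b[0])
    return [[a[i // rb][j // cb] * b[i % rb][j % cb]
             for j in range(len(a[0]) * cb)]
            for i in range(len(a) * rb)]

# Hypothetical seed graph: 3-node path with self-loops (7 non-zero entries).
g1 = [[1, 1, 0],
      [1, 1, 1],
      [0, 1, 1]]

g2 = kron(g1, g1)             # 9 x 9 adjacency matrix
ones = sum(map(sum, g2))      # 7 * 7 = 49 non-zero entries
print(len(g2), ones)          # 9 49
```

Repeating the product k times gives a graph with 3^k nodes from a 3 x 3 seed, so a handful of parameters describes a very large graph.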
Other topics - part#3 – virus propagation
More info
OVERALL CONCLUSIONS – low level:
OVERALL CONCLUSIONS – high level
References
Joint papers with LLNL
Project info. Akoglu, Leman; Chau, Polo; Kang, U; McGlohon, Mary; Tsourakakis, Babis; Tong, Hanghang; Prakash, Aditya. Thanks to: Yahoo (M45 + gifts + data), NSF, LLNL, CTA-INARC, IBM, SPRINT, INTEL, HP

More Related Content

Similar to Mining Billion-node Graphs: Patterns, Generators and Tools__HadoopSummit2010

Visual Data Analytics in the Cloud for Exploratory Science
Visual Data Analytics in the Cloud for Exploratory ScienceVisual Data Analytics in the Cloud for Exploratory Science
Visual Data Analytics in the Cloud for Exploratory Science
University of Washington
 
Hadoop for Data Science: Moving from BI dashboards to R models, using Hive st...
Hadoop for Data Science: Moving from BI dashboards to R models, using Hive st...Hadoop for Data Science: Moving from BI dashboards to R models, using Hive st...
Hadoop for Data Science: Moving from BI dashboards to R models, using Hive st...
huguk
 
Streamly: Concurrent Data Flow Programming
Streamly: Concurrent Data Flow ProgrammingStreamly: Concurrent Data Flow Programming
Streamly: Concurrent Data Flow Programming
Harendra Kumar
 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data Flows
Enrico Daga
 
PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning" PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning"
Joshua Bloom
 
XLDB South America Keynote: eScience Institute and Myria
XLDB South America Keynote: eScience Institute and MyriaXLDB South America Keynote: eScience Institute and Myria
XLDB South America Keynote: eScience Institute and Myria
University of Washington
 
Stream Reasoning : Where We Got So Far
Stream Reasoning: Where We Got So FarStream Reasoning: Where We Got So Far
Stream Reasoning : Where We Got So Far
Emanuele Della Valle
 
Continuous Deep Analytics
Continuous Deep AnalyticsContinuous Deep Analytics
Continuous Deep Analytics
Paris Carbone
 
Agents In An Exponential World Foster
Agents In An Exponential World FosterAgents In An Exponential World Foster
Agents In An Exponential World Foster
Ian Foster
 
Complex Models for Big Data
Complex Models for Big DataComplex Models for Big Data
Complex Models for Big Data
Data Science Research Center
 
Clouds, Grids and Data
Clouds, Grids and DataClouds, Grids and Data
Clouds, Grids and Data
Guy Coates
 
14 turing wics
14 turing wics14 turing wics
14 turing wics
ashish61_scs
 
2015 illinois-talk
2015 illinois-talk2015 illinois-talk
2015 illinois-talk
c.titus.brown
 
Web Services Catalog
Web Services CatalogWeb Services Catalog
Web Services Catalog
Rudolf Husar
 
Self-Similarity in Complex Networks
Self-Similarity in Complex NetworksSelf-Similarity in Complex Networks
Self-Similarity in Complex Networks
norman_fahrer
 
Systems-of-Systems in our Future?
Systems-of-Systems in our Future?Systems-of-Systems in our Future?
Systems-of-Systems in our Future?
rhlumsde
 
Technology Disruption
Technology DisruptionTechnology Disruption
Technology Disruption
Inside Analysis
 
Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata...
Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata...Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata...
Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata...
Robert Richards
 
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
Larry Smarr
 
GalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataGalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About Data
Paco Nathan
 

Similar to Mining Billion-node Graphs: Patterns, Generators and Tools__HadoopSummit2010 (20)

Visual Data Analytics in the Cloud for Exploratory Science
Visual Data Analytics in the Cloud for Exploratory ScienceVisual Data Analytics in the Cloud for Exploratory Science
Visual Data Analytics in the Cloud for Exploratory Science
 
Hadoop for Data Science: Moving from BI dashboards to R models, using Hive st...
Hadoop for Data Science: Moving from BI dashboards to R models, using Hive st...Hadoop for Data Science: Moving from BI dashboards to R models, using Hive st...
Hadoop for Data Science: Moving from BI dashboards to R models, using Hive st...
 
Streamly: Concurrent Data Flow Programming
Streamly: Concurrent Data Flow ProgrammingStreamly: Concurrent Data Flow Programming
Streamly: Concurrent Data Flow Programming
 
Propagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data FlowsPropagation of Policies in Rich Data Flows
Propagation of Policies in Rich Data Flows
 
PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning" PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning"
 
XLDB South America Keynote: eScience Institute and Myria
XLDB South America Keynote: eScience Institute and MyriaXLDB South America Keynote: eScience Institute and Myria
XLDB South America Keynote: eScience Institute and Myria
 
Stream Reasoning : Where We Got So Far
Stream Reasoning: Where We Got So FarStream Reasoning: Where We Got So Far
Stream Reasoning : Where We Got So Far
 
Continuous Deep Analytics
Continuous Deep AnalyticsContinuous Deep Analytics
Continuous Deep Analytics
 
Agents In An Exponential World Foster
Agents In An Exponential World FosterAgents In An Exponential World Foster
Agents In An Exponential World Foster
 
Complex Models for Big Data
Complex Models for Big DataComplex Models for Big Data
Complex Models for Big Data
 
Clouds, Grids and Data
Clouds, Grids and DataClouds, Grids and Data
Clouds, Grids and Data
 
14 turing wics
14 turing wics14 turing wics
14 turing wics
 
2015 illinois-talk
2015 illinois-talk2015 illinois-talk
2015 illinois-talk
 
Web Services Catalog
Web Services CatalogWeb Services Catalog
Web Services Catalog
 
Self-Similarity in Complex Networks
Self-Similarity in Complex NetworksSelf-Similarity in Complex Networks
Self-Similarity in Complex Networks
 
Systems-of-Systems in our Future?
Systems-of-Systems in our Future?Systems-of-Systems in our Future?
Systems-of-Systems in our Future?
 
Technology Disruption
Technology DisruptionTechnology Disruption
Technology Disruption
 
Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata...
Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata...Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata...
Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata...
 
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
 
GalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataGalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About Data
 

More from Yahoo Developer Network

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Yahoo Developer Network
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Yahoo Developer Network
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Yahoo Developer Network
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Yahoo Developer Network
 
CICD at Oath using Screwdriver
CICD at Oath using ScrewdriverCICD at Oath using Screwdriver
CICD at Oath using Screwdriver
Yahoo Developer Network
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Yahoo Developer Network
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
Yahoo Developer Network
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
Yahoo Developer Network
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Yahoo Developer Network
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Yahoo Developer Network
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
Yahoo Developer Network
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Yahoo Developer Network
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, Oath
Yahoo Developer Network
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI Applications
Yahoo Developer Network
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Yahoo Developer Network
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Yahoo Developer Network
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Yahoo Developer Network
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
Yahoo Developer Network
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
Yahoo Developer Network
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
Yahoo Developer Network
 

Mining Billion-node Graphs: Patterns, Generators and Tools (Hadoop Summit 2010)

  • 1. Mining Billion-node Graphs: Patterns, Generators and Tools. Christos Faloutsos, CMU
  • 2.
  • 3.
  • 4.
  • 5. Graphs - why should we care? Figure panels: Internet Map [lumeta.com]; Food Web [Martinez ’91]; Protein Interactions [genomebiology.com]; Friendship Network [Moody ’01]. C. Faloutsos (CMU), Hadoop Summit '10
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24. Triangle Law: #S.3 [Tsourakakis ICDM 2008]. Panels: ASN, HEP-TH, Epinions. X-axis: # of triangles a node participates in; Y-axis: count of such nodes.
  • 25. Triangle Law: #S.3 [Tsourakakis ICDM 2008]. Panels: ASN, HEP-TH, Epinions. X-axis: # of triangles a node participates in; Y-axis: count of such nodes.
  • 26. Triangle Law: #S.4 [Tsourakakis ICDM 2008]. Panels: SN, Reuters, Epinions. X-axis: degree; Y-axis: mean # of triangles. n friends -> ~ $n^{1.6}$ triangles.
  • 27. Triangle Law: Computations [Tsourakakis ICDM 2008]. But: triangles are expensive to compute (3-way join; several approx. algos). Q: Can we do that quickly?
  • 28. Triangle Law: Computations [Tsourakakis ICDM 2008]. But: triangles are expensive to compute (3-way join; several approx. algos). Q: Can we do that quickly? A: Yes! $\#\text{triangles} = \frac{1}{6} \sum_i \lambda_i^3$ (and, because of skewness, we only need the top few eigenvalues!)
  • 29. Triangle Law: Computations [Tsourakakis ICDM 2008]. 1000x+ speed-up, >90% accuracy.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40. Bipartite Communities! Figure labels: magnified bipartite community; patents from same inventor(s); cut-and-paste bibliography!
  • 41.
  • 42.
  • 43.
  • 44. Observation W.1: Fortification. More donors, more $? Figure labels: $10, $5, $7; ‘Reagan’, ‘Clinton’.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54.
  • 55. More on Time-evolving graphs. M. McGlohon, L. Akoglu, and C. Faloutsos. Weighted Graphs and Disconnected Components: Patterns and a Generator. SIG-KDD 2008.
  • 56.
  • 57.
  • 58.
  • 59. T.4: popularity over time. Post popularity drops off – exponentially? X-axis: lag (days after post); Y-axis: # in links. Figure labels: 1, 2, 3; @t, @t + lag.
  • 60. T.4: popularity over time. Post popularity drops off – exponentially? POWER LAW! Exponent? X-axis: days after post (log); Y-axis: # in links (log). Figure labels: 1, 2, 3.
  • 61.
  • 62.
  • 63.
  • 64. Center-Piece Subgraph Discovery [Tong+ KDD 06]. Input: original graph. Q: Who is the most central node wrt the black nodes? (e.g., master-mind criminal, common advisor/collaborator, etc.) Figure labels: A, B, C.
  • 65. Center-Piece Subgraph Discovery [Tong+ KDD 06]. Q: How to find hub for the query nodes? A: Combine proximity scores (RWR). Input: original graph. Output: CePS. Figure labels: CePS Node; A, B, C.
  • 66.
  • 67.
  • 68.
  • 69. Graph X-Ray: Fast Best-Effort Pattern Matching in Large Attributed Graphs. Hanghang Tong, Brian Gallagher, Christos Faloutsos, Tina Eliassi-Rad. KDD’07.
  • 70. Input: Attributed Data Graph, Query Graph. Output: Matching Subgraph.
  • 71. Effectiveness: star-query. Figure labels: Query, Result.
  • 72.
  • 73. OddBall: Spotting Anomalies in Weighted Graphs. Leman Akoglu, Mary McGlohon, Christos Faloutsos. Carnegie Mellon University, School of Computer Science. To appear in PAKDD 2010, Hyderabad, India.
  • 74.
  • 75. What is an egonet? Figure labels: ego, egonet.
  • 76.
  • 77. Near-Clique/Star
  • 78. Near-Clique/Star
  • 79.
  • 80. Outline – Algorithms & results. Columns: Centralized | Hadoop/PEGASUS. Degree Distr.: old | old. Pagerank: old | old. Diameter/ANF: old | DONE. Conn. Comp: old | DONE. Triangles: DONE. Visualization: STARTED.
  • 81.
  • 82.
  • 83.
  • 84.
  • 85.
  • 86.
  • 87. Radius Plot of GCC of YahooWeb.
  • 88. Running time: Kronecker and Erdős-Rényi graphs with billions of edges.
  • 89. Outline – Algorithms & results. Columns: Centralized | Hadoop/PEGASUS. Degree Distr.: old | old. Pagerank: old | old. Diameter/ANF: old | DONE. Conn. Comp: old | DONE. Triangles: DONE. Visualization: STARTED.
  • 90. Generalized Iterated Matrix Vector Multiplication (GIMV). PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations. U Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. ICDM 2009, Miami, Florida, USA. Best Application Paper (runner-up).
  • 91.
  • 92.
  • 93.
  • 94.
  • 95.
  • 96.
  • 97.
  • 98. Outline – Algorithms & results. Columns: Centralized | Hadoop/PEGASUS. Degree Distr.: old | old. Pagerank: old | old. Diameter/ANF: old | DONE. Conn. Comp: old | DONE. Triangles: DONE. Visualization: STARTED.
  • 99. Triangles: Computations [Tsourakakis ICDM 2008]. But: triangles are expensive to compute (3-way join; several approx. algos). Q: Can we do that quickly? A: Yes! $\#\text{triangles} = \frac{1}{6} \sum_i \lambda_i^3$ (and, because of skewness, we only need the top few eigenvalues!) Mentioned already.
  • 100. Triangle Law: #1 [Tsourakakis ICDM 2008]. Panels: ASN, HEP-TH, Epinions. X-axis: # of triangles a node participates in; Y-axis: count of such nodes. Mentioned already.
  • 101. Outline – Algorithms & results. Columns: Centralized | Hadoop/PEGASUS. Degree Distr.: old | old. Pagerank: old | old. Diameter/ANF: old | DONE. Conn. Comp: old | DONE. Triangles: DONE. Visualization: STARTED.
  • 102.
  • 103.
  • 104.
  • 105.
  • 106.
  • 107.
  • 108.
  • 109.
  • 110.
  • 111.
  • 112.
  • 113.
  • 114.
  • 115.
  • 116.
  • 117.
  • 118.
  • 119.
  • 120.
  • 121.
  • 122.
  • 123.
  • 124.
  • 125.
  • 126.
  • 127.
  • 128.
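
The triangle-counting slides (27 to 29) state that the total number of triangles equals one sixth of the sum of cubed eigenvalues, and that the top few eigenvalues are enough. A minimal sketch of that idea follows, assuming a small dense, symmetric 0/1 adjacency matrix held in NumPy; the function names are hypothetical, and the full eigendecomposition is used only for illustration (a large sparse graph would presumably need an iterative solver that returns just the leading eigenvalues).

```python
import numpy as np


def triangles_exact(adj):
    """Exact triangle count of a simple undirected graph: trace(A^3) / 6."""
    a3 = adj @ adj @ adj
    return int(round(np.trace(a3) / 6))


def triangles_spectral(adj, k):
    """Approximate triangle count from the k eigenvalues of largest magnitude."""
    eigvals = np.linalg.eigvalsh(adj)  # symmetric matrix, so eigenvalues are real
    order = np.argsort(-np.abs(eigvals))  # sort by magnitude, largest first
    top = eigvals[order[:k]]
    return float(np.sum(top ** 3) / 6.0)


def complete_graph(n):
    """Adjacency matrix of the complete graph on n nodes (illustrative helper)."""
    return np.ones((n, n)) - np.eye(n)


if __name__ == "__main__":
    k5 = complete_graph(5)
    print(triangles_exact(k5))        # exact count
    print(triangles_spectral(k5, 5))  # all eigenvalues: matches the exact count
    print(triangles_spectral(k5, 1))  # leading eigenvalue only: a rough estimate
```

Using every eigenvalue reproduces the exact count; truncating to the leading ones trades accuracy for speed, which is the trade-off slide 29 summarizes.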

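Slide 90 names Generalized Iterated Matrix Vector Multiplication (GIMV) as the primitive behind the Hadoop/PEGASUS column of the outline table. The sketch below is one plausible in-memory reading of that primitive, not the PEGASUS implementation: the operation names `combine2`, `combine_all`, and `assign` follow the cited PEGASUS paper as best I recall it, and the edge-list representation and helper names are assumptions made for illustration. It instantiates the three operations for connected components, one of the tasks the outline marks as DONE.

```python
def gimv(edges, v, combine2, combine_all, assign):
    """One generalized matrix-vector step over a sparse edge list.

    edges: iterable of (i, j, m_ij) for the nonzero entries of the matrix M.
    v: list holding the current vector values.
    """
    partial = {i: [] for i in range(len(v))}
    for i, j, m_ij in edges:
        # combine2 plays the role of the scalar product m_ij * v_j
        partial[i].append(combine2(m_ij, v[j]))
    # combine_all plays the role of the row sum; assign merges old and new values
    return [assign(v[i], combine_all(i, partial[i])) for i in range(len(v))]


def connected_components(n, undirected_edges):
    """Label each node with the smallest node id in its component."""
    edges = []
    for a, b in undirected_edges:
        edges.append((a, b, 1))
        edges.append((b, a, 1))
    labels = list(range(n))
    while True:
        new_labels = gimv(
            edges,
            labels,
            combine2=lambda m_ij, v_j: v_j,               # pass on the neighbor's label
            combine_all=lambda i, xs: min(xs, default=i),  # smallest neighbor label
            assign=min,                                    # keep the smaller label
        )
        if new_labels == labels:  # fixed point reached: labels are stable
            return labels
        labels = new_labels


if __name__ == "__main__":
    print(connected_components(6, [(0, 1), (1, 2), (3, 4)]))
```

Swapping in a different triple of operations (for example a damped sum for PageRank) reuses the same `gimv` step, which is the appeal of expressing several graph algorithms through one primitive.
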
Editor's Notes

  1. Faloutsos
  2. Faloutsos et al 06/30/10
  3. Faloutsos
  4. Faloutsos
  5. Faloutsos
  6. Faloutsos
  7. Faloutsos
  8. Faloutsos
  9. Faloutsos
  10. Faloutsos
  11. Faloutsos
  12. Faloutsos
  13. Faloutsos
  14. Faloutsos
  15. Faloutsos
  16. Faloutsos
  17. $A = U \Sigma U^T$
  18. $A = U \Sigma U^T$; $\vec{u}_1$, $\vec{u}_i$
  19. Faloutsos
  20. Faloutsos
  21. Faloutsos
  22. Faloutsos
  23. Faloutsos
  24. Faloutsos
  25. Faloutsos
  26. Faloutsos
  27. Faloutsos
  28. Faloutsos. Diameter first, DPL second. Check diameter formulas. As the network grows, the distances between nodes slowly grow.
  29. Faloutsos. Diameter first, DPL second. Check diameter formulas. As the network grows, the distances between nodes slowly grow.
  30. Faloutsos
  31. Faloutsos
  32. Faloutsos
  33. Faloutsos
  34. Faloutsos
  35. Faloutsos
  36. Faloutsos
  37. SOME OLD RULES
  38. SOME OLD RULES
  39. Faloutsos et al 06/30/10
  40. Faloutsos et al 06/30/10
  41. Faloutsos
  42. Faloutsos et al 06/30/10
  43. Faloutsos et al 06/30/10
  44. Faloutsos
  45. Faloutsos
  46. Faloutsos
  47. $\lambda_1$ Faloutsos
  48. Faloutsos
  49. Faloutsos
  50. Faloutsos
  51. Faloutsos
  52. Faloutsos
  53. Faloutsos
  54. Faloutsos
  55. Faloutsos et al 06/30/10