Processing large-scale graphs with Google™ Pregel
November 22, 2014 
Frank Celler 
@fceller 
www.arangodb.com
About 
about us 
Frank Celler (@fceller) working on the ArangoDB core 
Michael Hackstein (@mchacki) started an experimental 
implementation of Pregel 
about the talk 
different kinds of graph algorithms 
Pregel example 
the Pregel mindset, aka the framework
more examples
Pregel at ArangoDB 
Started as a side project in free hack time 
Experimental on operational database 
Implemented as an alternative to traversals 
Make use of the flexibility of JavaScript:
No strict type system
No pre-compilation, on-the-fly queries
Native JSON documents
Really fast development
Graph Algorithms
Pattern matching
Search through the entire graph
Identify similar components
⇒ Touch all vertices and their neighbourhoods
Traversals
Define a specific start point
Iteratively explore the graph
⇒ History of steps is known
Global measurements
Compute one value for the graph, based on all its vertices or edges
Compute one value for each vertex or edge
⇒ Often require a global view on the graph
Pregel 
A framework to query distributed, directed graphs. 
Known as “Map-Reduce” for graphs 
Uses same phases 
Has several iterations 
Aims at: 
Operate all servers at full capacity 
Reduce network traffic
Good at calculations touching all vertices
Bad at calculations touching a very small number of vertices
Example – Connected Components
[Figure: step-by-step animation on a graph whose vertices are numbered 1 to 7. Legend: active and inactive vertices, forward messages and backward messages.]
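The idea behind this example can be sketched in plain JavaScript, independent of any ArangoDB API. In this sketch, every vertex starts with its own id as label, sends its label along its edges in both directions (the forward and backward messages of the legend), and adopts the smallest label it receives. The function name `connectedComponents`, the data layout, and the edge list are assumptions made for illustration; the edge list is not the graph from the slide.

```javascript
// Connected components by minimum-label propagation (illustrative sketch).
// edges: array of [from, to] pairs of a directed graph.
function connectedComponents(vertexIds, edges) {
  // Messages travel forward and backward, so build an undirected neighbour list.
  const neighbours = new Map(vertexIds.map((id) => [id, []]));
  for (const [from, to] of edges) {
    neighbours.get(from).push(to);
    neighbours.get(to).push(from);
  }

  // Every vertex starts active, labelled with its own id.
  const label = new Map(vertexIds.map((id) => [id, id]));
  let active = new Set(vertexIds);
  let supersteps = 0;

  while (active.size > 0) {
    // Active vertices send their current label to all neighbours;
    // only the smallest label per receiver is kept (a MIN combiner).
    const inbox = new Map();
    for (const id of active) {
      for (const n of neighbours.get(id)) {
        const mine = label.get(id);
        inbox.set(n, inbox.has(n) ? Math.min(inbox.get(n), mine) : mine);
      }
    }
    // A vertex stays active only if a message lowered its label.
    active = new Set();
    for (const [id, smallest] of inbox) {
      if (smallest < label.get(id)) {
        label.set(id, smallest);
        active.add(id);
      }
    }
    supersteps += 1;
  }
  return { label, supersteps };
}

const result = connectedComponents(
  [1, 2, 3, 4, 5, 6, 7],
  [[1, 2], [2, 3], [4, 3], [5, 6], [6, 5]]
);
```

When the loop ends, all vertices of one component carry the same label, namely the smallest vertex id in that component. An isolated vertex keeps its own id.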
Pregel – Sequence 
Worker ≙ Map
“Map” a user-defined algorithm over all vertices
Output: set of messages to other vertices
Available parameters:
The current vertex and its outbound edges
All incoming messages
Global values
Allow modifications on the vertex:
Attach a result to this vertex and its outgoing edges
Delete the vertex and its outgoing edges
Deactivate the vertex
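As a rough illustration of such a worker, here is a sketch for single-source shortest paths in plain JavaScript. It receives the current vertex with its outbound edges, the incoming messages, and global values, and it may attach a result, send messages, and deactivate the vertex. The names `makeVertex`, `shortestPathWorker`, `mapPhase`, and the `send` callback are hypothetical and do not mirror the ArangoDB API.

```javascript
// Hypothetical vertex wrapper: outbound edges, a result slot and an active flag.
function makeVertex(id, outEdges) {
  return { id, outEdges, result: Infinity, active: true };
}

// Worker for single-source shortest paths: called once per vertex and superstep.
// `send(targetId, data)` is an assumed callback that queues a message.
function shortestPathWorker(vertex, messages, global, send) {
  // Candidate distance: 0 for the source in the first step, else the best message.
  let candidate = (global.step === 0 && vertex.id === global.source) ? 0 : Infinity;
  for (const m of messages) {
    candidate = Math.min(candidate, m);
  }
  if (candidate < vertex.result) {
    vertex.result = candidate;             // attach a result to the vertex
    for (const edge of vertex.outEdges) {  // tell neighbours about the shorter path
      send(edge.to, candidate + edge.weight);
    }
  }
  vertex.active = false;                   // deactivate the vertex
}

// "Map" phase of one superstep: run the worker over all vertices, collect messages.
function mapPhase(vertices, inbox, global, worker) {
  const outbox = [];
  for (const v of vertices) {
    worker(v, inbox.get(v.id) || [], global, (to, data) => outbox.push({ to, data }));
  }
  return outbox;
}

const vertices = [
  makeVertex('a', [{ to: 'b', weight: 2 }, { to: 'c', weight: 5 }]),
  makeVertex('b', [{ to: 'c', weight: 1 }]),
  makeVertex('c', []),
];
const firstOutbox = mapPhase(
  vertices, new Map(), { step: 0, source: 'a' }, shortestPathWorker
);
```

After this first map phase only the source has a finite result, and the outbox holds one message per outbound edge of the source. Later supersteps would feed these messages back in as the inbox.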
Combine ≙ Reduce
“Reduce” all generated messages
Output: an aggregated message for each vertex
Executed on the sender as well as on the receiver
Available parameters:
One new message for a vertex
The stored aggregate for this vertex
Typical combiners are SUM, MIN or MAX
Reduces network traffic
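A combiner can be pictured as a two-argument function that folds one new message into the stored aggregate for a vertex. The sketch below is plain JavaScript with hypothetical names (`combiners`, `combine`, and the `{ to, data }` message shape are assumptions). It applies the same combiner once on the sending side and once on the receiving side.

```javascript
// Typical combiners: each takes the stored aggregate and one new message.
const combiners = {
  SUM: (aggregate, message) => aggregate + message,
  MIN: (aggregate, message) => Math.min(aggregate, message),
  MAX: (aggregate, message) => Math.max(aggregate, message),
};

// Fold a list of { to, data } messages into one aggregated message per target.
function combine(messages, combiner) {
  const aggregated = new Map();
  for (const { to, data } of messages) {
    aggregated.set(to, aggregated.has(to) ? combiner(aggregated.get(to), data) : data);
  }
  return aggregated;
}

// Sender side: three messages for vertex 'c' shrink to one before they are sent.
const outgoing = [
  { to: 'c', data: 5 }, { to: 'b', data: 2 }, { to: 'c', data: 3 }, { to: 'c', data: 9 },
];
const sent = combine(outgoing, combiners.MIN);

// Receiver side: merge with messages from another server, using the same combiner.
const fromOtherServer = [{ to: 'c', data: 4 }, { to: 'b', data: 1 }];
const received = combine(
  [...sent].map(([to, data]) => ({ to, data })).concat(fromOtherServer),
  combiners.MIN
);
```

Combining on the sender is what saves network traffic: here four outgoing messages become two. This only works for operations where the order and grouping of messages does not matter, which holds for SUM, MIN and MAX.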
Activity ≙ Termination
Execute several rounds of Map/Reduce
Count active vertices and messages
Start next round if one of the following is true:
At least one vertex is active
At least one message is sent
Terminate if no vertex is active and no message was sent
Store all non-deleted vertices and edges as resulting graph
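The termination rule can be sketched as a small driver loop in plain JavaScript. The function `run`, the vertex fields (`active`, `deleted`, `fresh`), and the `maxRounds` safety limit are assumptions for illustration. The sketch also assumes that an incoming message reactivates a vertex, as in the original Pregel model.

```javascript
// Run supersteps until no vertex is active and no message was sent.
// worker(vertex, messages, send) may change vertex.active and vertex.deleted.
function run(vertices, worker, maxRounds = 100) {
  let inbox = new Map();
  let rounds = 0;
  while (rounds < maxRounds) {
    const outbox = new Map();
    const send = (to, data) => {
      if (!outbox.has(to)) outbox.set(to, []);
      outbox.get(to).push(data);
    };
    for (const v of vertices) {
      const messages = inbox.get(v.id) || [];
      // Assumption: a message reactivates a vertex that deactivated itself earlier.
      if (messages.length > 0) v.active = true;
      if (v.active && !v.deleted) worker(v, messages, send);
    }
    rounds += 1;
    const activeCount = vertices.filter((v) => v.active && !v.deleted).length;
    const messageCount = [...outbox.values()].reduce((n, list) => n + list.length, 0);
    // Terminate only if no vertex is active and no message was sent.
    if (activeCount === 0 && messageCount === 0) break;
    inbox = outbox;
  }
  // The resulting graph consists of all vertices that were not deleted.
  return { rounds, result: vertices.filter((v) => !v.deleted) };
}

// Example worker: propagate the maximum value along the edges, then deactivate.
function maxWorker(vertex, messages, send) {
  const best = Math.max(vertex.value, ...messages);
  if (best > vertex.value || vertex.fresh) {
    vertex.value = best;
    vertex.fresh = false;
    for (const to of vertex.out) send(to, best);
  }
  vertex.active = false;
}

const chain = [
  { id: 1, value: 7, out: [2], active: true, fresh: true },
  { id: 2, value: 3, out: [3], active: true, fresh: true },
  { id: 3, value: 5, out: [], active: true, fresh: true },
];
const outcome = run(chain, maxWorker);
```

Every vertex deactivates itself in each round, so the run continues only while messages are still in flight. On this chain the maximum needs one round per hop to reach the last vertex.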
Pregel Questions 
connected components 
page rank 
bipartite matching 
semi-clustering 
minimum spanning forest
graph coloring
shortest paths
Pagerank 
Pagerank for Giraph
public class SimplePageRankComputation extends BasicComputation<
    LongWritable, DoubleWritable, FloatWritable, DoubleWritable> {
  public static final int MAX_SUPERSTEPS = 30;

  @Override
  public void compute(
      Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
      Iterable<DoubleWritable> messages) throws IOException {
    if (getSuperstep() >= 1) {
      double sum = 0;
      for (DoubleWritable message : messages) {
        sum += message.get();
      }
      DoubleWritable vertexValue = new DoubleWritable(
          (0.15f / getTotalNumVertices()) + 0.85f * sum);
      vertex.setValue(vertexValue);
    }
    if (getSuperstep() < MAX_SUPERSTEPS) {
      long edges = vertex.getNumEdges();
      sendMessageToAllEdges(vertex,
          new DoubleWritable(vertex.getValue().get() / edges));
    } else {
      vertex.voteToHalt();
    }
  }

  public static class SimplePageRankWorkerContext extends WorkerContext {
    @Override
    public void preApplication()
        throws InstantiationException, IllegalAccessException { }
    @Override
    public void postApplication() { }
    @Override
    public void preSuperstep() { }
    @Override
    public void postSuperstep() { }
  }

  public static class SimplePageRankMasterCompute extends DefaultMasterCompute {
    @Override
    public void initialize()
        throws InstantiationException, IllegalAccessException {
    }
  }

  public static class SimplePageRankVertexReader extends
      GeneratedVertexReader<LongWritable, DoubleWritable, FloatWritable> {
    @Override
    public boolean nextVertex() {
      return totalRecords > recordsRead;
    }

    @Override
    public Vertex<LongWritable, DoubleWritable, FloatWritable>
        getCurrentVertex() throws IOException {
      Vertex<LongWritable, DoubleWritable, FloatWritable> vertex =
          getConf().createVertex();
      LongWritable vertexId = new LongWritable(
          (inputSplit.getSplitIndex() * totalRecords) + recordsRead);
      DoubleWritable vertexValue = new DoubleWritable(vertexId.get() * 10d);
      long targetVertexId = (vertexId.get() + 1) %
          (inputSplit.getNumSplits() * totalRecords);
      float edgeValue = vertexId.get() * 100f;
      List<Edge<LongWritable, FloatWritable>> edges = Lists.newLinkedList();
      edges.add(EdgeFactory.create(new LongWritable(targetVertexId),
          new FloatWritable(edgeValue)));
      vertex.initialize(vertexId, vertexValue, edges);
      ++recordsRead;
      return vertex;
    }
  }

  public static class SimplePageRankVertexInputFormat extends
      GeneratedVertexInputFormat<LongWritable, DoubleWritable, FloatWritable> {
    @Override
    public VertexReader<LongWritable, DoubleWritable, FloatWritable>
        createVertexReader(InputSplit split, TaskAttemptContext context)
        throws IOException {
      return new SimplePageRankVertexReader();
    }
  }

  public static class SimplePageRankVertexOutputFormat extends
      TextVertexOutputFormat<LongWritable, DoubleWritable, FloatWritable> {
    @Override
    public TextVertexWriter createVertexWriter(TaskAttemptContext context)
        throws IOException, InterruptedException {
      return new SimplePageRankVertexWriter();
    }

    public class SimplePageRankVertexWriter extends TextVertexWriter {
      @Override
      public void writeVertex(
          Vertex<LongWritable, DoubleWritable, FloatWritable> vertex)
          throws IOException, InterruptedException {
        getRecordWriter().write(new Text(vertex.getId().toString()),
            new Text(vertex.getValue().toString()));
      }
    }
  }
}
Pagerank for TinkerPop3
public class PageRankVertexProgram implements VertexProgram<Double> {
  private MessageType.Local messageType = MessageType.Local.of(
      () -> GraphTraversal.<Vertex>of().outE());
  public static final String PAGE_RANK = Graph.Key.hide("gremlin.pageRank");
  public static final String EDGE_COUNT = Graph.Key.hide("gremlin.edgeCount");
  private static final String VERTEX_COUNT =
      "gremlin.pageRankVertexProgram.vertexCount";
  private static final String ALPHA = "gremlin.pageRankVertexProgram.alpha";
  private static final String TOTAL_ITERATIONS =
      "gremlin.pageRankVertexProgram.totalIterations";
  private static final String INCIDENT_TRAVERSAL =
      "gremlin.pageRankVertexProgram.incidentTraversal";
  private double vertexCountAsDouble = 1;
  private double alpha = 0.85d;
  private int totalIterations = 30;
  private static final Set<String> COMPUTE_KEYS =
      new HashSet<>(Arrays.asList(PAGE_RANK, EDGE_COUNT));

  private PageRankVertexProgram() {}

  @Override
  public void loadState(final Configuration configuration) {
    this.vertexCountAsDouble = configuration.getDouble(VERTEX_COUNT, 1.0d);
    this.alpha = configuration.getDouble(ALPHA, 0.85d);
    this.totalIterations = configuration.getInt(TOTAL_ITERATIONS, 30);
    try {
      if (configuration.containsKey(INCIDENT_TRAVERSAL)) {
        final SSupplier<Traversal> traversalSupplier =
            VertexProgramHelper.deserialize(configuration, INCIDENT_TRAVERSAL);
        VertexProgramHelper.verifyReversibility(traversalSupplier.get());
        this.messageType = MessageType.Local.of((SSupplier) traversalSupplier);
      }
    } catch (final Exception e) {
      throw new IllegalStateException(e.getMessage(), e);
    }
  }

  @Override
  public void storeState(final Configuration configuration) {
    configuration.setProperty(GraphComputer.VERTEX_PROGRAM,
        PageRankVertexProgram.class.getName());
    configuration.setProperty(VERTEX_COUNT, this.vertexCountAsDouble);
    configuration.setProperty(ALPHA, this.alpha);
    configuration.setProperty(TOTAL_ITERATIONS, this.totalIterations);
    try {
      VertexProgramHelper.serialize(this.messageType.getIncidentTraversal(),
          configuration, INCIDENT_TRAVERSAL);
    } catch (final Exception e) {
      throw new IllegalStateException(e.getMessage(), e);
    }
  }

  @Override
  public Set<String> getElementComputeKeys() {
    return COMPUTE_KEYS;
  }

  @Override
  public void setup(final Memory memory) {

  }

  @Override
  public void execute(final Vertex vertex, Messenger<Double> messenger,
      final Memory memory) {
    if (memory.isInitialIteration()) {
      double initialPageRank = 1.0d / this.vertexCountAsDouble;
      double edgeCount = Double.valueOf(
          (Long) this.messageType.edges(vertex).count().next());
      vertex.singleProperty(PAGE_RANK, initialPageRank);
      vertex.singleProperty(EDGE_COUNT, edgeCount);
      messenger.sendMessage(this.messageType, initialPageRank / edgeCount);
    } else {
      double newPageRank = StreamFactory.stream(
          messenger.receiveMessages(this.messageType))
          .reduce(0.0d, (a, b) -> a + b);
      newPageRank = (this.alpha * newPageRank) +
          ((1.0d - this.alpha) / this.vertexCountAsDouble);
      vertex.singleProperty(PAGE_RANK, newPageRank);
      messenger.sendMessage(this.messageType,
          newPageRank / vertex.<Double>property(EDGE_COUNT).orElse(0.0d));
    }
  }

  @Override
  public boolean terminate(final Memory memory) {
    return memory.getIteration() >= this.totalIterations;
  }
}
Pagerank for ArangoDB
var pageRank = function (vertex, message, global) {
  var total = global.vertexCount;
  var edgeCount = vertex._outEdges.length;
  var alpha = global.alpha;
  var sum = 0, rank = 0;
  if (global.step > 0) {
    // Sum up the rank shares received from incoming neighbours.
    while (message.hasNext()) {
      sum += message.next().data;
    }
    rank = alpha * sum + (1 - alpha) / total;
  } else {
    // First step: start with a uniform distribution.
    rank = 1 / total;
  }
  vertex._setResult(rank);
  if (global.step < global.MAX_STEPS) {
    // Split the rank evenly across all outgoing edges.
    var send = rank / edgeCount;
    while (vertex._outEdges.hasNext()) {
      message.sendTo(vertex._outEdges.next().edge._getTarget(), send);
    }
  } else {
    vertex._deactivate();
  }
};
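To try the update rule from the worker above without a database, the following plain JavaScript sketch runs the same computation in synchronous supersteps on a three-vertex graph. The function `pageRank`, the adjacency-object layout, and the sample graph are assumptions for illustration. The experimental ArangoDB API (`_outEdges`, `_setResult`, `sendTo`) is replaced here by ordinary objects.

```javascript
// PageRank in synchronous supersteps, using the update rule
// rank = alpha * sum + (1 - alpha) / total.
// graph: object mapping a vertex id to the list of its outgoing neighbours.
function pageRank(graph, alpha = 0.85, maxSteps = 30) {
  const ids = Object.keys(graph);
  const total = ids.length;
  const rank = {};
  let inbox = {};
  for (let step = 0; step <= maxSteps; step += 1) {
    const outbox = {};
    for (const id of ids) {
      // Step 0 starts uniform; later steps mix incoming rank with the constant term.
      rank[id] = step === 0
        ? 1 / total
        : alpha * (inbox[id] || 0) + (1 - alpha) / total;
      if (step < maxSteps) {
        // Split the rank evenly across all outgoing edges.
        const share = rank[id] / graph[id].length;
        for (const target of graph[id]) {
          outbox[target] = (outbox[target] || 0) + share; // acts as a SUM combiner
        }
      }
    }
    // Messages sent in this step are delivered in the next one.
    inbox = outbox;
  }
  return rank;
}

const ranks = pageRank({ a: ['b', 'c'], b: ['c'], c: ['a'] });
const totalRank = Object.values(ranks).reduce((s, r) => s + r, 0);
```

In this sample graph every vertex has at least one outgoing edge, so no rank is lost and the ranks keep summing to 1 up to floating-point error. A vertex without outgoing edges would need extra handling that neither this sketch nor the worker above provides.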
Bipartite Matching 
Thank You 
Twitter: @arangodb 
Github: triagens/ArangoDB 
Google Group: arangodb 
IRC: arangodb 
 
Jdk 7 4-forkjoin
Jdk 7 4-forkjoinJdk 7 4-forkjoin
Jdk 7 4-forkjoin
 
Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0Continuous Application with Structured Streaming 2.0
Continuous Application with Structured Streaming 2.0
 
Apache Hadoop Java API
Apache Hadoop Java APIApache Hadoop Java API
Apache Hadoop Java API
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
 
Cpp tutorial
Cpp tutorialCpp tutorial
Cpp tutorial
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
 

More from ArangoDB Database

ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ArangoDB Database
 
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022ArangoDB Database
 
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022ArangoDB Database
 
ArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB Database
 
GraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBGraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBArangoDB Database
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale ArangoDB Database
 
Graph Analytics with ArangoDB
Graph Analytics with ArangoDBGraph Analytics with ArangoDB
Graph Analytics with ArangoDBArangoDB Database
 
Getting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisGetting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisArangoDB Database
 
Custom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBCustom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBArangoDB Database
 
Hacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsHacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsArangoDB Database
 
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarA Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarArangoDB Database
 
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?ArangoDB Database
 
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoDB Database
 
ArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB Database
 
Webinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisWebinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisArangoDB Database
 
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB Database
 
Webinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBWebinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBArangoDB Database
 
An introduction to multi-model databases
An introduction to multi-model databasesAn introduction to multi-model databases
An introduction to multi-model databasesArangoDB Database
 
Running complex data queries in a distributed system
Running complex data queries in a distributed systemRunning complex data queries in a distributed system
Running complex data queries in a distributed systemArangoDB Database
 

More from ArangoDB Database (20)

ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
ATO 2022 - Machine Learning + Graph Databases for Better Recommendations (3)....
 
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
Machine Learning + Graph Databases for Better Recommendations V2 08/20/2022
 
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
Machine Learning + Graph Databases for Better Recommendations V1 08/06/2022
 
ArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at ScaleArangoDB 3.9 - Further Powering Graphs at Scale
ArangoDB 3.9 - Further Powering Graphs at Scale
 
GraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDBGraphSage vs Pinsage #InsideArangoDB
GraphSage vs Pinsage #InsideArangoDB
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
 
Graph Analytics with ArangoDB
Graph Analytics with ArangoDBGraph Analytics with ArangoDB
Graph Analytics with ArangoDB
 
Getting Started with ArangoDB Oasis
Getting Started with ArangoDB OasisGetting Started with ArangoDB Oasis
Getting Started with ArangoDB Oasis
 
Custom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDBCustom Pregel Algorithms in ArangoDB
Custom Pregel Algorithms in ArangoDB
 
Hacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge GraphsHacktoberfest 2020 - Intro to Knowledge Graphs
Hacktoberfest 2020 - Intro to Knowledge Graphs
 
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release WebinarA Graph Database That Scales - ArangoDB 3.7 Release Webinar
A Graph Database That Scales - ArangoDB 3.7 Release Webinar
 
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
gVisor, Kata Containers, Firecracker, Docker: Who is Who in the Container Space?
 
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning MetadataArangoML Pipeline Cloud - Managed Machine Learning Metadata
ArangoML Pipeline Cloud - Managed Machine Learning Metadata
 
ArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at ScaleArangoDB 3.7 Roadmap: Performance at Scale
ArangoDB 3.7 Roadmap: Performance at Scale
 
Webinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB OasisWebinar: What to expect from ArangoDB Oasis
Webinar: What to expect from ArangoDB Oasis
 
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
ArangoDB 3.5 Feature Overview Webinar - Sept 12, 2019
 
3.5 webinar
3.5 webinar 3.5 webinar
3.5 webinar
 
Webinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDBWebinar: How native multi model works in ArangoDB
Webinar: How native multi model works in ArangoDB
 
An introduction to multi-model databases
An introduction to multi-model databasesAn introduction to multi-model databases
An introduction to multi-model databases
 
Running complex data queries in a distributed system
Running complex data queries in a distributed systemRunning complex data queries in a distributed system
Running complex data queries in a distributed system
 

Recently uploaded

Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 

Recently uploaded (20)

Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 

Processing Large Graphs with Pregel

  • 1. Processing large-scale graphs with Google™ Pregel. November 22, 2014. Frank Celler (@fceller), www.arangodb.com
  • 3. About. About us: Frank Celler (@fceller), working on the ArangoDB core; Michael Hackstein (@mchacki), started an experimental implementation of Pregel. About the talk: different kinds of graph algorithms; Pregel example; Pregel mindset, aka the framework; more examples.
  • 4. Pregel at ArangoDB: Started as a side project in free hack time. Experimental on an operational database. Implemented as an alternative to traversals. Makes use of the flexibility of JavaScript: no strict type system; no pre-compilation, on-the-fly queries; native JSON documents; really fast development.
  • 7. Graph Algorithms. Pattern matching: search through the entire graph; identify similar components ⇒ touch all vertices and their neighbourhoods. Traversals: define a specific start point; iteratively explore the graph ⇒ history of steps is known. Global measurements: compute one value for the graph, based on all its vertices or edges; compute one value for each vertex or edge ⇒ often require a global view of the graph.
  • 8. Pregel: A framework to query distributed, directed graphs. Known as “Map-Reduce” for graphs: uses the same phases; has several iterations. Aims at: operating all servers at full capacity; reducing network traffic. Good at calculations touching all vertices. Bad at calculations touching a very small number of vertices.
  • 9–18. Example – Connected Components. [Figure sequence: a step-by-step run on a small graph with numbered vertices. Legend: active vertex, inactive vertex, forward message, backward message.]
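
The figure sequence for the connected-components example does not survive as text, so the following is only a minimal sketch of the idea it illustrates, in plain JavaScript. It assumes the usual approach: every vertex starts with its own ID as label, sends the label along its edges in both directions (the forward and backward messages of the legend), and adopts the smallest label it receives. The function name `connectedComponents` and the sample graph are made up for illustration and are not taken from the slides or from any ArangoDB API.

```javascript
// Hypothetical helper, not part of any ArangoDB API.
function connectedComponents(vertexIds, edges) {
  // Build undirected neighbour lists: messages travel forward and backward.
  const neighbours = new Map(vertexIds.map((id) => [id, []]));
  for (const [from, to] of edges) {
    neighbours.get(from).push(to);
    neighbours.get(to).push(from);
  }

  // Each vertex starts with its own ID as component label and is active.
  const label = new Map(vertexIds.map((id) => [id, id]));
  let active = new Set(vertexIds);
  let supersteps = 0;

  while (active.size > 0) {
    // Every active vertex sends its label to all neighbours.
    const inbox = new Map();
    for (const id of active) {
      for (const n of neighbours.get(id)) {
        // Combine on the fly: keep only the smallest label per receiver.
        const current = inbox.get(n);
        if (current === undefined || label.get(id) < current) {
          inbox.set(n, label.get(id));
        }
      }
    }

    // A vertex stays active only if a message lowered its label.
    active = new Set();
    for (const [id, smallest] of inbox) {
      if (smallest < label.get(id)) {
        label.set(id, smallest);
        active.add(id);
      }
    }
    supersteps += 1;
  }
  return { label, supersteps };
}

// Assumed sample graph with two components: {1, 2, 3, 4} and {5, 6, 7}.
const result = connectedComponents(
  [1, 2, 3, 4, 5, 6, 7],
  [[1, 2], [2, 3], [3, 4], [5, 6], [6, 7]]
);
console.log(Object.fromEntries(result.label));
```

The loop ends when no vertex changed its label in a round, which is the same "no active vertex, no message" condition the framework uses for termination.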
  • 24. Worker ≙ Map: “Map” a user-defined algorithm over all vertices. Output: a set of messages to other vertices. Available parameters: the current vertex and its outbound edges; all incoming messages; global values. Allowed modifications on the vertex: attach a result to this vertex and its outgoing edges; delete the vertex and its outgoing edges; deactivate the vertex.
  • 25. Combine ≙ Reduce: “Reduce” all generated messages. Output: an aggregated message for each vertex. Executed on the sender as well as the receiver. Available parameters: one new message for a vertex; the stored aggregate for this vertex. Typical combiners are SUM, MIN or MAX. Reduces network traffic.
  • 26. Activity ≙ Termination: Execute several rounds of Map/Reduce. Count active vertices and messages. Start the next round if one of the following is true: at least one vertex is active; at least one message was sent. Terminate if no vertex is active and no messages were sent. Store all non-deleted vertices and edges as the resulting graph.
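
The three phases above (worker, combiner, termination check) can be sketched as one small loop. This is a single-process toy under stated assumptions, not the ArangoDB implementation: `runPregel`, the `ctx.send` callback and the vertex fields `value`, `out`, `active` and `isSource` are hypothetical names chosen for the example. It runs single-source shortest paths with a MIN combiner on a made-up weighted graph.

```javascript
// Hypothetical mini-engine; all names here are illustrative assumptions.
function runPregel(graph, compute, combine, maxSteps = 100) {
  // graph: Map of vertexId -> { value, out: [{ to, weight }], active }
  let inbox = new Map();
  for (let step = 0; step < maxSteps; step++) {
    const outbox = new Map();
    const send = (to, msg) => {
      // Combiner: fold each new message into the stored aggregate.
      outbox.set(to, outbox.has(to) ? combine(outbox.get(to), msg) : msg);
    };
    let activeCount = 0;
    for (const [id, vertex] of graph) {
      const message = inbox.get(id);
      // An incoming message reactivates a vertex.
      if (message !== undefined) vertex.active = true;
      if (!vertex.active) continue;
      compute(vertex, message, { step, send }); // worker ("map") phase
      if (vertex.active) activeCount += 1;
    }
    inbox = outbox;
    // Terminate when no vertex is active and no message was sent.
    if (activeCount === 0 && outbox.size === 0) break;
  }
  return graph;
}

// Worker function for single-source shortest paths.
const shortestPath = (vertex, message, ctx) => {
  const candidate =
    ctx.step === 0 ? (vertex.isSource ? 0 : Infinity) : message;
  if (candidate !== undefined && candidate < vertex.value) {
    vertex.value = candidate;
    for (const edge of vertex.out) ctx.send(edge.to, candidate + edge.weight);
  }
  vertex.active = false; // deactivate; a later message wakes the vertex up
};

// Assumed sample graph: a->b (1), a->c (4), b->c (2), b->d (5), c->d (1).
const makeVertex = (out, isSource = false) => ({
  value: Infinity, out, isSource, active: true,
});
const graph = new Map([
  ["a", makeVertex([{ to: "b", weight: 1 }, { to: "c", weight: 4 }], true)],
  ["b", makeVertex([{ to: "c", weight: 2 }, { to: "d", weight: 5 }])],
  ["c", makeVertex([{ to: "d", weight: 1 }])],
  ["d", makeVertex([])],
]);

runPregel(graph, shortestPath, Math.min);
for (const [id, vertex] of graph) console.log(id, vertex.value);
```

Because the combiner keeps only the minimum per receiver, a vertex sees at most one message per round regardless of how many neighbours wrote to it, which is the traffic reduction the Combine slide refers to.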
  • 27. Pregel Questions: connected components; page rank; bipartite matching; semi-clustering; minimum spanning forest; graph coloring; shortest paths.
  • 32. Pagerank for Giraph

      public class SimplePageRankComputation extends BasicComputation<LongWritable, DoubleWritable, FloatWritable, DoubleWritable> {
        public static final int MAX_SUPERSTEPS = 30;

        @Override
        public void compute(Vertex<LongWritable, DoubleWritable, FloatWritable> vertex, Iterable<DoubleWritable> messages) throws IOException {
          if (getSuperstep() >= 1) {
            double sum = 0;
            for (DoubleWritable message : messages) {
              sum += message.get();
            }
            DoubleWritable vertexValue = new DoubleWritable((0.15f / getTotalNumVertices()) + 0.85f * sum);
            vertex.setValue(vertexValue);
          }
          if (getSuperstep() < MAX_SUPERSTEPS) {
            long edges = vertex.getNumEdges();
            sendMessageToAllEdges(vertex, new DoubleWritable(vertex.getValue().get() / edges));
          } else {
            vertex.voteToHalt();
          }
        }

        public static class SimplePageRankWorkerContext extends WorkerContext {
          @Override
          public void preApplication() throws InstantiationException, IllegalAccessException { }
          @Override
          public void postApplication() { }
          @Override
          public void preSuperstep() { }
          @Override
          public void postSuperstep() { }
        }

        public static class SimplePageRankMasterCompute extends DefaultMasterCompute {
          @Override
          public void initialize() throws InstantiationException, IllegalAccessException {
          }
        }

        public static class SimplePageRankVertexReader extends GeneratedVertexReader<LongWritable, DoubleWritable, FloatWritable> {
          @Override
          public boolean nextVertex() {
            return totalRecords > recordsRead;
          }

          @Override
          public Vertex<LongWritable, DoubleWritable, FloatWritable> getCurrentVertex() throws IOException {
            Vertex<LongWritable, DoubleWritable, FloatWritable> vertex = getConf().createVertex();
            LongWritable vertexId = new LongWritable(
                (inputSplit.getSplitIndex() * totalRecords) + recordsRead);
            DoubleWritable vertexValue = new DoubleWritable(vertexId.get() * 10d);
            long targetVertexId = (vertexId.get() + 1) % (inputSplit.getNumSplits() * totalRecords);
            float edgeValue = vertexId.get() * 100f;
            List<Edge<LongWritable, FloatWritable>> edges = Lists.newLinkedList();
            edges.add(EdgeFactory.create(new LongWritable(targetVertexId), new FloatWritable(edgeValue)));
            vertex.initialize(vertexId, vertexValue, edges);
            ++recordsRead;
            return vertex;
          }
        }

        public static class SimplePageRankVertexInputFormat extends GeneratedVertexInputFormat<LongWritable, DoubleWritable, FloatWritable> {
          @Override
          public VertexReader<LongWritable, DoubleWritable, FloatWritable> createVertexReader(InputSplit split, TaskAttemptContext context)
              throws IOException {
            return new SimplePageRankVertexReader();
          }
        }

        public static class SimplePageRankVertexOutputFormat extends TextVertexOutputFormat<LongWritable, DoubleWritable, FloatWritable> {
          @Override
          public TextVertexWriter createVertexWriter(TaskAttemptContext context) throws IOException, InterruptedException {
            return new SimplePageRankVertexWriter();
          }

          public class SimplePageRankVertexWriter extends TextVertexWriter {
            @Override
            public void writeVertex(Vertex<LongWritable, DoubleWritable, FloatWritable> vertex) throws IOException, InterruptedException {
              getRecordWriter().write(new Text(vertex.getId().toString()), new Text(vertex.getValue().toString()));
            }
          }
        }
      }
  • 33. Pagerank for TinkerPop3

      public class PageRankVertexProgram implements VertexProgram<Double> {
        private MessageType.Local messageType = MessageType.Local.of(() -> GraphTraversal.<Vertex>of().outE());
        public static final String PAGE_RANK = Graph.Key.hide("gremlin.pageRank");
        public static final String EDGE_COUNT = Graph.Key.hide("gremlin.edgeCount");
        private static final String VERTEX_COUNT = "gremlin.pageRankVertexProgram.vertexCount";
        private static final String ALPHA = "gremlin.pageRankVertexProgram.alpha";
        private static final String TOTAL_ITERATIONS = "gremlin.pageRankVertexProgram.totalIterations";
        private static final String INCIDENT_TRAVERSAL = "gremlin.pageRankVertexProgram.incidentTraversal";
        private double vertexCountAsDouble = 1;
        private double alpha = 0.85d;
        private int totalIterations = 30;
        private static final Set<String> COMPUTE_KEYS = new HashSet<>(Arrays.asList(PAGE_RANK, EDGE_COUNT));

        private PageRankVertexProgram() {}

        @Override
        public void loadState(final Configuration configuration) {
          this.vertexCountAsDouble = configuration.getDouble(VERTEX_COUNT, 1.0d);
          this.alpha = configuration.getDouble(ALPHA, 0.85d);
          this.totalIterations = configuration.getInt(TOTAL_ITERATIONS, 30);
          try {
            if (configuration.containsKey(INCIDENT_TRAVERSAL)) {
              final SSupplier<Traversal> traversalSupplier = VertexProgramHelper.deserialize(configuration, INCIDENT_TRAVERSAL);
              VertexProgramHelper.verifyReversibility(traversalSupplier.get());
              this.messageType = MessageType.Local.of((SSupplier) traversalSupplier);
            }
          } catch (final Exception e) {
            throw new IllegalStateException(e.getMessage(), e);
          }
        }

        @Override
        public void storeState(final Configuration configuration) {
          configuration.setProperty(GraphComputer.VERTEX_PROGRAM, PageRankVertexProgram.class.getName());
          configuration.setProperty(VERTEX_COUNT, this.vertexCountAsDouble);
          configuration.setProperty(ALPHA, this.alpha);
          configuration.setProperty(TOTAL_ITERATIONS, this.totalIterations);
          try {
            VertexProgramHelper.serialize(this.messageType.getIncidentTraversal(), configuration, INCIDENT_TRAVERSAL);
          } catch (final Exception e) {
            throw new IllegalStateException(e.getMessage(), e);
          }
        }

        @Override
        public Set<String> getElementComputeKeys() {
          return COMPUTE_KEYS;
        }

        @Override
        public void setup(final Memory memory) {

        }

        @Override
        public void execute(final Vertex vertex, Messenger<Double> messenger, final Memory memory) {
          if (memory.isInitialIteration()) {
            double initialPageRank = 1.0d / this.vertexCountAsDouble;
            double edgeCount = Double.valueOf((Long) this.messageType.edges(vertex).count().next());
            vertex.singleProperty(PAGE_RANK, initialPageRank);
            vertex.singleProperty(EDGE_COUNT, edgeCount);
            messenger.sendMessage(this.messageType, initialPageRank / edgeCount);
          } else {
            double newPageRank = StreamFactory.stream(messenger.receiveMessages(this.messageType)).reduce(0.0d, (a, b) -> a + b);
            newPageRank = (this.alpha * newPageRank) + ((1.0d - this.alpha) / this.vertexCountAsDouble);
            vertex.singleProperty(PAGE_RANK, newPageRank);
            messenger.sendMessage(this.messageType, newPageRank / vertex.<Double>property(EDGE_COUNT).orElse(0.0d));
          }
        }

        @Override
        public boolean terminate(final Memory memory) {
          return memory.getIteration() >= this.totalIterations;
        }
      }
  • 34. Pagerank for ArangoDB

      var pageRank = function (vertex, message, global) {
        var total = global.vertexCount;
        var edgeCount = vertex._outEdges.length;
        var alpha = global.alpha;
        var sum = 0, rank = 0;
        if (global.step > 0) {
          while (message.hasNext()) {
            sum += message.next().data;
          }
          rank = alpha * sum + (1 - alpha) / total;
        } else {
          rank = 1 / total;
        }
        vertex._setResult(rank);
        if (global.step < global.MAX_STEPS) {
          var send = rank / edgeCount;
          while (vertex._outEdges.hasNext()) {
            message.sendTo(vertex._outEdges.next().edge._getTarget(), send);
          }
        } else {
          vertex._deactivate();
        }
      };
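
The ArangoDB vertex function depends on the experimental Pregel API (`vertex`, `message`, `global`), so it cannot be run on its own. As a rough, self-contained illustration of the same update rule, rank = alpha * sum + (1 - alpha) / total, here is a plain JavaScript sketch. The function `pageRank` below, its adjacency-object input and the three-vertex sample graph are assumptions made for the example. It also assumes every vertex has at least one outgoing edge; vertices without outgoing edges would simply lose their rank mass in this simplified version.

```javascript
// Standalone sketch; input is an object mapping vertex id -> list of targets.
function pageRank(outEdges, alpha = 0.85, maxSteps = 30) {
  const ids = Object.keys(outEdges);
  const total = ids.length;
  const rank = {};
  let inbox = {};
  for (let step = 0; step <= maxSteps; step++) {
    // Messages for the next round, pre-aggregated per receiver (SUM combiner).
    const outbox = Object.fromEntries(ids.map((id) => [id, 0]));
    for (const id of ids) {
      // Step 0 initialises uniformly; later steps apply the damped update.
      rank[id] =
        step === 0 ? 1 / total : alpha * inbox[id] + (1 - alpha) / total;
      if (step < maxSteps) {
        // Split the current rank evenly over all outgoing edges.
        const share = rank[id] / outEdges[id].length;
        for (const target of outEdges[id]) outbox[target] += share;
      }
    }
    inbox = outbox;
  }
  return rank;
}

// Assumed sample graph: a -> b, a -> c, b -> c, c -> a.
const ranks = pageRank({ a: ["b", "c"], b: ["c"], c: ["a"] });
console.log(ranks);
```

On a graph without dead ends the ranks keep summing to 1 in every round, which makes a convenient sanity check when experimenting with the damping factor or the step limit.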
  • 39. Thank You. Twitter: @arangodb; GitHub: triagens/ArangoDB; Google Group: arangodb; IRC: arangodb