C* Summit 2013: Distributed Graph Computing with Titan and Faunus by Matthias Broecheler

This presentation introduces Titan, Faunus, and scalable graph computing in general. We present a case study of how Pearson builds an education social network on top of Titan, Faunus, and Cassandra to support learning in the 21st century. Titan is an open source distributed graph database built on top of Cassandra that can power real-time applications with thousands of concurrent users over graphs with billions of edges. Faunus is an open source global graph processing engine built on top of Hadoop and compatible with Cassandra that can analyze graphs, compute graph statistics, and execute global traversals. Titan and Faunus are components of the Aurelius Graph Cluster, which enables scalable graph computation and powers applications in social networking, recommendation engines, advertisement optimization, knowledge representation, health care, education, and security.

Published in: Technology

Transcript of "C* Summit 2013: Distributed Graph Computing with Titan and Faunus by Matthias Broecheler"

  1. AURELIUS (THINKAURELIUS.COM). TITAN: Distributed Graph Computing. Matthias Broecheler, CTO, @mbroecheler. June XI, MMXIII. #CASSANDRA13
  2. This presentation introduces Titan, Faunus, and scalable graph computing in general. We present a case study of how Pearson builds an education social network on top of Titan, Faunus, and Cassandra to support learning in the 21st century. Titan is an open source distributed graph database built on top of Cassandra that can power real-time applications with thousands of concurrent users over graphs with billions of edges. Faunus is an open source global graph processing engine built on top of Hadoop and compatible with Cassandra that can analyze graphs, compute graph statistics, and execute global traversals. Titan and Faunus are components of the Aurelius Graph Cluster, which enables scalable graph computation and powers applications in social networking, recommendation engines, advertisement optimization, knowledge representation, health care, education, and security.
  3. Thank You! Pull requests, feature suggestions, bug reports, community support.
  4. Release timeline:
     June 14th 2012: Alpha Release. Experimental release of a distributed, open-source graph database.
     September 2012: Titan 0.1.0. First stable release.
     December 2012: Titan 0.2.0. Rewrite of core.
     March 2013: Titan 0.3.0. Indexing & ElasticSearch.
     May 2013: Titan 0.3.1. Performance, bugfixing.
  5. Release timeline (as on slide 4), with one addition: Faunus Release.
  6. Titan Graph Database: distributed, real time, open source.
  7. Vertex (name: Hercules, type: demigod), Vertex (name: Cerberus, type: monster). Edge Label: battled. Edge Property: time: 12.
  8. When should you use a Graph Database? Value in Relationships, from low to high: Key-Value (K V), BigTable (K V V V V), Document, Relational, Graph.
  9. Educating the Planet
  10. Educating the Planet
  11. Vertex types: Person, Person, Student, Teacher, Course, Institution, Concept, Discussion, Comment, Share. Edge labels: enrolledIn, teaches, relatesTo, hasCourse, belongsTo, follows, author, references, hasComment, relatesTo, author, partOf, relatesTo.
  12. Vertex types: Person, Person, Student, Teacher, Course, Institution, Concept, Discussion, Comment, Share. Edge labels: enrolledIn, teaches, relatesTo, hasCourse, belongsTo, follows, author, references, hasComment, relatesTo, author, partOf, relatesTo.
  13. Titan: Integrative Data Model in a polyglot storage world.
  14. Vertex types: Student, Person, Teacher, Course, Institution, Concept, Discussion, Comment, Share. Edge labels: enrolledIn, teaches, relatesTo, hasCourse, belongsTo, follows, author, references, hasComment, relatesTo, author, partOf, relatesTo. Added: DiscussionRank.
  15. Titan: Analyze Relationships in real time.
  16. Scaling Titan: number of transactions, size of the graph.
  17. 121 Billion Edges, 6.2 Billion Vertices, 1 Million Universities.
  18. Placement Group, hi1.4xl.
  19. Data Ingestion: 1.1 million edges / sec using batch mode.
  20. [count illegible] m1.medium
  21. Follow Recommendation
        x = [] as Set;
        m = user.out(follows).aggregate(x)[0..(num*2-1)]
             .out(follows).except(x)[0..limit]
             .groupCount.cap.next();
        m.sort{-it.value}[0..(num-1)]
             ._().transform{ [userid: it.key.id,
                              points: it.value]};
  22. Titan’s Ecosystem: Generic Graph API, Dataflow Processing, Traversal Language, Object-Graph Mapper, Graph Algorithms, Graph Server. Annotations: query language; cool stuff coming; REST & JSON.
  23. Throughput: 10,200 transactions / sec. [count illegible] randomly chosen complex traversal templates.
  24. Transaction Description | Avg (ms) | Stdev (ms)
      Student retrieves all content for a single course in their course list | 279.32 | 81.83
      Student follows another student | 193.72 | 22.77
      Student is recommended people to follow | 241.33 | 256.48
      Student reads their stream and shares an item with followers | 284.07 | 68.20
      Student retrieves their profile | 53.740 | 22.61
      Student reads the most recent comments for their courses | 211.07 | 45.56
  25. Scaling Titan: technical perspective.
  26. Vertex Representation. name: Hercules, type: demigod; 5. Property, Property, Edge, Edge, Edge, Edge, Edge. Edge labels: mother, battled, battled, battled, fought. time: 1584927, time: 4, time: 7. Induced order. Row indices for fast vertex-centric queries.
  27. Edge Representation (Column, Value): label id + direction, primary key, edge id, Δ vertex id, signature properties, other properties. Compressed serialized objects; variable long encoding.
  28. Token Ring. Graph Partitioning: assigns ids to map vertices into “optimal” token range. Lots of interesting questions for future work. Uses BOP.
  29. Aurelius Graph Cluster.
      TITAN: Stores a massive-scale property graph allowing real-time traversals and updates.
      FAUNUS: Batch processing of large graphs with Hadoop.
      FULGORA: Runs global graph algorithms on large, compressed, in-memory graphs.
      Data flows: Bulk Load; Map/Reduce Load & Compress; Analysis results back into Titan.
      Apache 2. aureliusgraphs@googlegroups.com, titan.thinkaurelius.com, faunus.thinkaurelius.com
  30. Special Thanks: Steve Hill (@kindageeky), Director Architecture & Innovation at Pearson Education.
  31. AURELIUS (THINKAURELIUS.COM). We are Hiring.
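
The Gremlin traversal on slide 21 can be approximated outside a graph database to see what it computes. The following is a minimal Python sketch, not the presenter's implementation: the function name `recommend_follows`, the dictionary-based `follows` adjacency map, and the sample users are all hypothetical. It assumes the traversal means "collect everyone the user already follows, walk to the people they follow, drop anyone already followed, and rank the remainder by how often they were reached".

```python
from collections import Counter
from itertools import islice


def recommend_follows(follows, user, num, limit):
    """Rank accounts followed by the people `user` follows.

    `follows` is a hypothetical adjacency map: user id -> list of followed ids.
    """
    followees = list(follows.get(user, []))
    # Counterpart of aggregate(x): remember everyone already followed.
    already_followed = set(followees)
    # Counterpart of [0..(num*2-1)]: keep at most num*2 first-hop neighbours.
    first_hop = followees[: num * 2]
    # Second hop, skipping accounts already followed (except(x)).
    candidates = (
        candidate
        for followee in first_hop
        for candidate in follows.get(followee, [])
        if candidate not in already_followed
    )
    # [0..limit] is an inclusive range, so at most limit + 1 paths are counted.
    counts = Counter(islice(candidates, limit + 1))
    # Sort by descending count and keep the top `num` entries.
    ranked = sorted(counts.items(), key=lambda item: -item[1])[:num]
    return [{"userid": uid, "points": points} for uid, points in ranked]


# Hypothetical sample data, for illustration only.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["carol", "dave", "erin"],
    "carol": ["dave"],
    "dave": [],
    "erin": [],
}

print(recommend_follows(follows, "alice", num=2, limit=100))
# prints [{'userid': 'dave', 'points': 2}, {'userid': 'erin', 'points': 1}]
```

Like the slide's traversal, this sketch does not filter out the requesting user, so a user could be recommended to themselves if someone they follow follows them back. The `limit` cap bounds the work per request at the cost of possibly missing candidates reached late in the walk.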