INTRO TO GRAPH 
DATABASES 
Using Tinkerpop, TitanDB, and Gremlin 
{ 
“email” : “calebjones@gmail.com”, 
“website” : “http://calebjones.info”, 
“twitter” : “@JonesWCaleb” 
}
Overview 
• Why Graphs? 
• Order to complexity 
• Use cases – major players 
• Graphs & Adjacency Matrices 
• Tinkerpop Framework 
• Blueprints, Frames, Pipes, Furnace, Gremlin, Rexster 
• Titan using Cassandra 
• Blog Application (lab) 
• Traversals using Gremlin
WHY GRAPHS?
Warren Weaver 
• 17th - 19th century 
• Problems of simplicity 
• How one element interacts with 
another 
• First half of 20th century 
• Problem of disorganized complexity 
• Many elements operating in a system 
w/o regard to how they interact with 
each other 
• Predicted 
• Problem of organized complexity 
• Many elements operating in a system 
taking into account how they interact 
with each other 
• Would require computational power 
far beyond what was currently 
available 
Science and Complexity 
1948 
ENIAC (1946)
Organisms
Knowledge Classification
Organizational Hierarchy
Neurology
Order to Complexity 
• Trees describe order 
• Linear (simple lineage) 
• Categorized 
• Single dimensional 
• Symmetrical 
• Hierarchical 
• Convergent modeling 
• Networks describe complexity 
• Non-linear (multi-lineage) 
• Multi-categorical 
• Multi-dimensional 
• Asymmetrical 
• Decentralized 
• Divergent modeling
Types of Networks
[Figure: network graph whose node labels are comic-book creator names, e.g. Stan Lee, Steve Ditko, John Buscema]
Types of Networks 
[Figure: second network graph of comic-book creator names, e.g. Stan Lee, Steve Ditko, John Buscema]
Types of Networks 
Neuron Network of Mouse Millennium Simulation (2005) 
Largest astronomical simulation ever on the structure and 
evolution of galaxies in the universe. 
25 TB of data and 20 million galaxies
Use Cases 
• Recommendation engines (avoid 
relational N-JOIN or self-JOIN) 
• Ranking/credibility (Google’s 
PageRank) 
• Path finding (shortest, longest, 
mutual friends) 
• Social (friendship, following, key 
connectors)
Graphs 
• Node/Vertex: An entity that can have zero or more edges 
connected to it. 
• Edge: An entity that connects two nodes. May be 
directed or undirected
Adjacency Matrix 
• If graph is undirected, the adjacency matrix is symmetric 
• Thus, transposing the matrix yields the same graph
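The symmetry claim can be checked with a minimal Python sketch. The four-node edge list and the helper names `adjacency_matrix` and `transpose` are made up for illustration; they are not part of Tinkerpop or Titan.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 adjacency matrix from (source, target) pairs (0-based)."""
    matrix = [[0] * n for _ in range(n)]
    for src, dst in edges:
        matrix[src][dst] = 1
        if not directed:
            # An undirected edge is recorded in both directions.
            matrix[dst][src] = 1
    return matrix


def transpose(matrix):
    """Swap rows and columns."""
    return [list(row) for row in zip(*matrix)]


# Hypothetical edge list, used only for illustration.
edges = [(0, 1), (1, 2), (0, 3)]
undirected = adjacency_matrix(4, edges)
directed = adjacency_matrix(4, edges, directed=True)

print(undirected == transpose(undirected))  # symmetric
print(directed == transpose(directed))      # not symmetric for this edge list
```

For the directed version the transpose describes the same nodes with every edge reversed, which is why the equality only holds in the undirected case.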
Adjacency Matrix 
• Some graphs have different ‘types’ or dimensions of edges
Property Graphs 
Vertices 
id   name 
1    Ivan 
2    Bob 
3    Eve 
4    Alice 
Edges 
id   type      other properties 
E1   cousin    separation = 1 
E2   knows     since = 2013-09-01 
E3   knows     since = 2013-09-01 
E4   sibling   twins = true
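A property graph like this one can be sketched with plain Python dictionaries. The vertex and edge properties come from the slide; the edge endpoints (`out`/`in`) are assumptions, since the extracted slide text does not say which vertices each edge connects, and `out_names` is a hypothetical helper.

```python
# Vertices keyed by id, each carrying its own property map.
vertices = {
    1: {"name": "Ivan"},
    2: {"name": "Bob"},
    3: {"name": "Eve"},
    4: {"name": "Alice"},
}

# Edges carry a type plus arbitrary properties.
# NOTE: the "out"/"in" endpoints below are assumed for illustration only.
edges = [
    {"id": "E1", "out": 1, "in": 2, "type": "cousin", "separation": 1},
    {"id": "E2", "out": 2, "in": 3, "type": "knows", "since": "2013-09-01"},
    {"id": "E3", "out": 2, "in": 4, "type": "knows", "since": "2013-09-01"},
    {"id": "E4", "out": 3, "in": 4, "type": "sibling", "twins": True},
]


def out_names(vertex_id, edge_type):
    """Names of vertices reached from vertex_id over outgoing edges of one type."""
    return [
        vertices[edge["in"]]["name"]
        for edge in edges
        if edge["out"] == vertex_id and edge["type"] == edge_type
    ]


bob_knows = out_names(2, "knows")
print(bob_knows)
```

Filtering on the edge `type` is what lets one node participate in several kinds of relationship at once.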
Traversals 
• Breadth-first 
• 3, 2, 4, 1 
• Depth-first 
• 3, 2, 1, 4 
• Breadth-first and depth-first search can be combined. 
• Filtering 
• Ability to filter/sort paths in traversal 
• Aggregating 
• Ability to aggregate/count properties as traversal occurs and affect 
traversal with result of aggregation (e.g. power-grid load distr.) 
• Backtracking 
• Leave a marker in the traversal and come back to it when certain criteria are 
met in a lower step
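The two visiting orders can be reproduced with a short Python sketch. The edges 3-2, 3-4 and 2-1 are an assumption chosen to be consistent with the orders listed on the slide (the diagram itself did not survive extraction), and the function names are illustrative.

```python
from collections import deque

# Assumed undirected graph: edges 3-2, 3-4, 2-1.
graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}


def breadth_first(graph, start):
    """Visit nodes level by level using a FIFO queue."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


def depth_first(graph, start, seen=None):
    """Follow each branch to its end before backing up (recursive pre-order)."""
    if seen is None:
        seen = set()
    seen.add(start)
    order = [start]
    for neighbor in graph[start]:
        if neighbor not in seen:
            order.extend(depth_first(graph, neighbor, seen))
    return order


print(breadth_first(graph, 3))  # [3, 2, 4, 1]
print(depth_first(graph, 3))    # [3, 2, 1, 4]
```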
TINKERPOP 
Graph Framework
Tinkerpop 
• A comprehensive, open-source graph framework 
(http://www.tinkerpop.com/) 
• Blueprints: Property graph model that is DB agnostic. A kind of 
JDBC for graphs. 
• Pipes: Data flow API for processing graphs. Underlying 
component for graph traversals. 
• Gremlin: DSL for traversing property graphs. Implemented as a 
JSR-223 script engine. 
• Frames: Maps between domain objects and the graph’s nodes 
and edges. Like ORM for graphs. 
• Furnace: Collection of common graph analysis algorithms 
for property graphs. 
• Rexster: Exposes any Blueprints graph via a uniform 
RESTful API.
Tinkerpop Stack 
• Different components all build 
on each other 
• Provides abstraction from 
HTTP layer, to object mapping 
layer, to traversal scripting, to 
pluggable graph API 
• Blueprints underpins the stack 
making it all DB agnostic 
• Blueprints implementations: 
• Neo4j, Sail, OrientDB, Dex 
• Implemented by 3rd parties: Accumulo, ArangoDB, Bitsy, 
FluxGraph, FoundationDB, InfiniteGraph, MongoDB, 
Oracle NoSQL, TitanDB
Tinkerpop - Rexster 
• Provides REST and binary (RexPro - grizzly) protocols 
• Flexible extension model (e.g. ad-hoc Gremlin queries) 
• Server-side stored procedures (Gremlin) 
• Browser-based interface (Dog House) 
• Command-line tool for interacting with API 
• Pluggable security 
• SPARQL plugin to work against Sail graphs (OpenRDF) 
• More information: 
https://github.com/tinkerpop/rexster/wiki
Tinkerpop - Furnace 
• Collection of industry-standard algorithms for 
traversing or analyzing graphs. 
• Network generators (by clique or degree distribution) 
• Search: A*, Breadth-first, Depth-first 
• Shortest path 
• Bellman-Ford (like Dijkstra’s but can handle negative edge weights) 
• PageRank 
• Degree Distribution 
• More information: 
https://github.com/tinkerpop/furnace/wiki
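To make the Bellman-Ford bullet concrete, here is a minimal Python sketch of the textbook algorithm. It is not Furnace's API; the function name, the edge-list format and the sample graph are assumptions made for illustration.

```python
def bellman_ford(num_vertices, edges, source):
    """Shortest distances from source; edges are (from, to, weight) triples.

    Negative weights are allowed. A negative cycle reachable from the
    source raises ValueError.
    """
    dist = [float("inf")] * num_vertices
    dist[source] = 0
    # Relax every edge up to |V| - 1 times.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, weight in edges:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                changed = True
        if not changed:
            break
    # One more pass: any further improvement means a negative cycle.
    for u, v, weight in edges:
        if dist[u] + weight < dist[v]:
            raise ValueError("graph contains a negative cycle")
    return dist


# Hypothetical graph with one negative edge (2 -> 1).
edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
distances = bellman_ford(4, edges, 0)
print(distances)  # [0, 2, 5, 4]
```

The path 0 -> 2 -> 1 costs 2, beating the direct edge of weight 4. A greedy Dijkstra pass would settle node 1 at distance 4 before it ever looks at the negative edge.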
Tinkerpop - Frames 
More Information: https://github.com/tinkerpop/frames/wiki
Tinkerpop - Pipes 
• Dataflow framework built on process graphs. 
• Computational step becomes a node and an edge is a 
communication channel between steps. 
• Pipes are then chained and nested. 
• Custom pipes can be created. 
• Pipe types: 
• Transform – emit transformation of object 
• Dozens of different types of transforms 
• Filter – decide whether to include/exclude object in traversal 
• ~20 different types of filters 
• sideEffect – include object but produce side-effect from it 
• ~15 different types of sideEffects (e.g. group, count, table, tree) 
• Branch – decide which step to take next in traversal 
• Several different branching options
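The chaining idea can be imitated with Python generators. This is only an analogy to the pipe types listed above, not the Pipes API: `transform`, `keep_if` and `side_effect` are hypothetical helpers, and the sample records reuse names and ages from the TinkerGraph demo dataset.

```python
def transform(fn, source):
    """Transform pipe: emit fn(item) for every incoming item."""
    for item in source:
        yield fn(item)


def keep_if(predicate, source):
    """Filter pipe: emit only the items that satisfy the predicate."""
    for item in source:
        if predicate(item):
            yield item


def side_effect(fn, source):
    """sideEffect pipe: pass every item through unchanged, calling fn on the way."""
    for item in source:
        fn(item)
        yield item


people = [
    {"name": "marko", "age": 29},
    {"name": "vadas", "age": 27},
    {"name": "josh", "age": 32},
]

seen_ages = []
# Pipes are nested so each item flows filter -> sideEffect -> transform.
pipeline = transform(
    lambda person: person["name"],
    side_effect(
        lambda person: seen_ages.append(person["age"]),
        keep_if(lambda person: person["age"] > 28, people),
    ),
)
names = list(pipeline)
print(names, seen_ages)
```

Nothing runs until the pipeline is consumed by `list`, which mirrors the lazy, pull-based style of a dataflow chain.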
Tinkerpop - Blueprints 
• Like JDBC but for graphs. 
• Common API for Property Graphs which are very flexible 
• Foundational component for Pipes, Gremlin, Frames, 
Furnace, and Rexster 
• Supports transactions (if underlying DB engine does) 
• Multi-threaded transactions supported 
• Format readers/writers (GML, GraphML, GraphSON) 
• More Information: 
https://github.com/tinkerpop/blueprints/wiki
Tinkerpop - Gremlin 
• Graph traversal scripting language. 
• Works against Blueprints API and is “compiled” into 
Pipes data-flows. 
• Both native Java and Groovy (JSR-223) supported. 
• Step library (https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps) 
• Transform – emit transformation of object 
• Dozens of different types of transforms 
• Filter – decide whether to include/exclude object in traversal 
• ~20 different types of filters 
• sideEffect – include object but produce side-effect from it 
• ~15 different types of sideEffects (e.g. group, count, table, tree) 
• Branch – decide which step to take next in traversal 
• Several different branching options
SQL → Gremlin (secret decoder ring) 
Query: Get all users 
  SQL:     select * from users 
  Gremlin: g.V('type', 'user').map() 
Query: Get user names 
  SQL:     select name from users 
  Gremlin: g.V('type', 'user').name 
Query: Get user names/ages 
  SQL:     select name, age from users 
  Gremlin: g.V('type', 'user') 
             .transform({ [ 'name' : it.getProperty('name'), 
                            'age' : it.getProperty('age') ] }) 
Query: Get distinct user ages 
  SQL:     select distinct(age) from users 
  Gremlin: g.V('type', 'user').age.dedup() 
Query: Get oldest user 
  SQL:     select max(age) from users 
  Gremlin: g.V('type', 'user').age.max()
SQL → Gremlin (secret decoder ring) 
Query: Select by equality 
  SQL:     select * from users where age = 35 
  Gremlin: g.V('type', 'user').has('age', 35).map() 
Query: Select by comparison 
  SQL:     select * from users where age > 21 
  Gremlin: g.V('type', 'user').has('age', T.gt, 21).map() 
Query: Select by multiple criteria 
  SQL:     select * from users where sex = "M" and age > 25 
  Gremlin: g.V('type', 'user').has('age', T.gt, 25).has('sex', 'M').map() 
Query: Order by age (switch ‘a’ and ‘b’ to do asc) 
  SQL:     select * from users order by age desc 
  Gremlin: g.V('type', 'user') 
             .order({ it.b.getProperty('age') <=> it.a.getProperty('age') }).map() 
Query: Paging 
  SQL:     select * from users order by age desc limit 5 offset 5 
  Gremlin: g.V('type', 'user') 
             .order({ it.b.getProperty('age') <=> it.a.getProperty('age') })[5..9].map()
SQL → Gremlin (secret decoder ring) 
Query: Join 
  SQL:     select users.* from users 
           inner join groups on users.gId = groups.id 
           where groups.name = "devs" 
  Gremlin: g.V('type', 'groups').has('name', 'devs').in('inGroup').map() 
Query: Join-on-join-on-join … 
  SQL: 
    SELECT TOP (5) [t14].[ProductName] 
    FROM (SELECT COUNT(*) AS [value], [t13].[ProductName] 
          FROM [customers] AS [t0] 
          CROSS APPLY (SELECT [t9].[ProductName] 
                       FROM [orders] AS [t1] 
                       CROSS JOIN [order details] AS [t2] 
                       INNER JOIN [products] AS [t3] ON [t3].[ProductID] = [t2].[ProductID] 
                       CROSS JOIN [order details] AS [t4] 
                       INNER JOIN [orders] AS [t5] ON [t5].[OrderID] = [t4].[OrderID] 
                       LEFT JOIN [customers] AS [t6] ON [t6].[CustomerID] = [t5].[CustomerID] 
                       CROSS JOIN ([orders] AS [t7] 
                                   CROSS JOIN [order details] AS [t8] 
                                   INNER JOIN [products] AS [t9] ON [t9].[ProductID] = [t8].[ProductID]) 
                       WHERE NOT EXISTS(SELECT NULL AS [EMPTY] 
                                        FROM [orders] AS [t10] 
                                        CROSS JOIN [order details] AS [t11] 
                                        INNER JOIN [products] AS [t12] ON [t12].[ProductID] = [t11].[ProductID] 
                                        WHERE [t9].[ProductID] = [t12].[ProductID] 
                                          AND [t10].[CustomerID] = [t0].[CustomerID] 
                                          AND [t11].[OrderID] = [t10].[OrderID]) 
                         AND [t6].[CustomerID] <> [t0].[CustomerID] 
                         AND [t1].[CustomerID] = [t0].[CustomerID] 
                         AND [t2].[OrderID] = [t1].[OrderID] 
                         AND [t4].[ProductID] = [t3].[ProductID] 
                         AND [t7].[CustomerID] = [t6].[CustomerID] 
                         AND [t8].[OrderID] = [t7].[OrderID]) AS [t13] 
          WHERE [t0].[CustomerID] = N'ALFKI' 
          GROUP BY [t13].[ProductName]) AS [t14] 
    ORDER BY [t14].[value] DESC 
  Gremlin: 
    g.V('customerId','ALFKI').as('customer') 
      .out('ordered').out('contains').out('is').as('products') 
      .in('is').in('contains').in('ordered').except('customer') 
      .out('ordered').out('contains').out('is').except('products') 
      .groupCount().cap() 
      .orderMap(T.decr)[0..4] 
      .productName
Gremlin Resources 
• Tinkerpop resources 
• https://github.com/tinkerpop/gremlin/wiki/Basic-Graph-Traversals 
• https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps 
• https://github.com/tinkerpop/gremlin/wiki/Using-Gremlin-through-Java 
• https://groups.google.com/forum/#!forum/gremlin-users 
• https://github.com/tinkerpop/gremlin/wiki/SPARQL-vs.-Gremlin 
• http://markorodriguez.com/2011/08/03/on-the-nature-of-pipes/ 
• http://sql2gremlin.com/ 
• http://gremlindocs.com/ 
• Groovy 
• http://groovy.codehaus.org/Beginners+Tutorial 
• http://groovy.codehaus.org/Collections 
• Misc 
• http://www.fromdev.com/2013/09/Gremlin-Example-Query-Snippets-Graph-DB.html 
• http://markorodriguez.com/2011/06/15/graph-pattern-matching-with-gremlin-1-1/
GREMLIN 
Demo Dataset Lab
Tinkerpop - Gremlin 
gremlin> g = TinkerGraphFactory.createTinkerGraph() 
==>tinkergraph[vertices:6 edges:6] 
gremlin> g.V.count() 
==>6 
gremlin> g.E.count() 
==>6 
gremlin> g.v(1) 
==>v[1] 
gremlin> g.v(1).map 
==>{age=29, name=marko} 
gremlin> g.v(1).outE 
==>e[7][1-knows->2] 
==>e[8][1-knows->4] 
==>e[9][1-created->3] 
gremlin> g.v(1).outE('knows') 
==>e[7][1-knows->2] 
==>e[8][1-knows->4] 
gremlin> g.v(1).outE('knows').map 
==>{weight=0.5} 
==>{weight=1.0}
Tinkerpop - Gremlin 
// get vertices known by marko 
gremlin> g.v(1).outE('knows').inV 
==>v[2] 
==>v[4] 
// get properties of vertices known by marko 
gremlin> g.v(1).outE('knows').inV.map 
==>{age=27, name=vadas} 
==>{age=32, name=josh} 
// filter by those older than 30 
gremlin> g.v(1).outE('knows').inV.filter{it.age > 30}.map 
==>{age=32, name=josh} 
// just get name 
gremlin> g.v(1).outE('knows').inV.filter{it.age > 30}.name 
==>josh 
// find nodes who 'know' someone older than 30 
gremlin> g.V.as('x').outE('knows').inV.has('age', T.gt, 30).back('x').map 
==>{age=29, name=marko}
Tinkerpop - Gremlin 
// find edges with weight > .5 
gremlin> g.E.filter{it.weight > 0.5} 
==>e[10][4-created->5] 
==>e[8][1-knows->4] 
// find edges w/ weight > .5 from marko 
gremlin> g.E.filter{it.weight > 0.5}.as('x').outV.has('name', T.eq, 'marko').back('x') 
==>e[8][1-knows->4] 
// find nodes 'created' by other nodes 
gremlin> g.V.as('x').inE('created').back('x').map 
==>{name=lop, lang=java} 
==>{name=ripple, lang=java} 
gremlin> g.E.filter{it.label == 'created'}.inV.dedup().map 
==>{name=lop, lang=java} 
==>{name=ripple, lang=java} 
// count creators per 'created' node (to find nodes created by more than 1 node) 
gremlin> g.E.filter{it.label == 'created'}.inV.groupCount().cap() 
==>{v[3]=3, v[5]=1} 
// find nodes 'created' by marko's friends 
gremlin> g.v(1).outE('knows').inV.outE('created').inV.map 
==>{name=ripple, lang=java} 
==>{name=lop, lang=java}
Tinkerpop - Gremlin 
// add some new nodes 
gremlin> g.addVertex([name:'bob',age:'60']) 
==>v[0] 
gremlin> g.addVertex([name:'eve',age:'40']) 
==>v[7] 
gremlin> g.addVertex([name:'timmy',age:'5']) 
==>v[8] 
// add some edges 
gremlin> g.addEdge(g.v(0), g.v(7), 'friend') 
==>e[13][0-friend->7] 
gremlin> g.addEdge(g.v(0), g.v(8), 'child') 
==>e[14][0-child->8] 
gremlin> g.V.filter{it.name == 'bob'}.outE('child').as('x').inV.filter{it.name == 'timmy'}.back('x') 
==>e[14][0-child->8] 
gremlin> g.removeEdge(g.e(14)) 
==>null 
gremlin> g.V.filter{it.name == 'bob'}.outE('child').as('x').inV.filter{it.name == 'timmy'}.back('x') 
// no results
Tinkerpop - Gremlin 
// previously 
gremlin> g.addVertex([name:'bob',age:'60']) 
==>v[0] 
gremlin> g.addVertex([name:'eve',age:'40']) 
==>v[7] 
gremlin> g.addEdge(g.v(0), g.v(7), 'friend') 
==>e[13][0-friend->7] 
// query for edge 
gremlin> g.v(0).outE 
==>e[13][0-friend->7] 
// remove vertex (auto removes orphaned edge) 
gremlin> g.removeVertex(g.v(7)) 
==>null 
gremlin> g.v(0).outE 
// no results 
gremlin> g.e(13) 
==>null
TITAN 
A Distributed Graph Database
Titan Graph Database 
• Optimized to work against billions of nodes 
and edges 
• Theoretical limitation of 2^60 edges and 2^59 nodes (half as many) 
• Works with several different distributed DBs 
including Cassandra and HBase 
• Supports many concurrent users doing 
complex graph traversals simultaneously 
• Native integration with Tinkerpop stack 
• Supports integration with search 
technologies such as Lucene and 
Elasticsearch 
• Created by Thinkaurelius 
(http://thinkaurelius.com/)
Titan Distributed Architecture 
• TitanDB can integrate with distributed architectures in a 
few different ways 
Native 
• Connects remotely to cluster (or local) 
• Can scale size as far as cluster can 
• Native Titan API 
• Possible processing bottleneck 
Remote 
• Put Rexster in front to allow RESTful access 
• Connects remotely to cluster 
• Can scale size as far as cluster can 
• Possible processing bottleneck 
Embedded 
• TitanDB and Rexster run on each node in the cluster 
• Can run on same JVM 
• Considerable performance/scalability improvement
Titan Indexing 
• Standard index 
• Internal to Titan 
• Very fast but only supports exact matches 
• External index 
• Use indexing engine external to Titan (Lucene or Elasticsearch) 
• Supports range queries 
• Lucene 
• Limited to only one machine (small-sized datasets) 
• Also has a richer set of search features (than Elasticsearch) 
• Elasticsearch 
• Distributed 
• Not as feature-filled as Lucene
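The difference between an exact-match index and a range-capable one can be sketched in Python. This is a conceptual illustration only and says nothing about how Titan, Lucene or Elasticsearch are implemented; the helper names are hypothetical and the sample ages reuse the TinkerGraph demo values.

```python
import bisect
from collections import defaultdict

# Vertex id -> age (sample values from the TinkerGraph demo dataset).
ages = {1: 29, 2: 27, 4: 32}

# Exact-match index: property value -> vertex ids (hash lookup).
exact_index = defaultdict(list)
for vertex_id, age in ages.items():
    exact_index[age].append(vertex_id)

# Range-capable index: (value, vertex id) pairs kept in sorted order.
sorted_index = sorted((age, vertex_id) for vertex_id, age in ages.items())


def lookup_exact(age):
    """Return the ids whose age equals the given value."""
    return exact_index.get(age, [])


def lookup_range(low, high):
    """Return the ids with low <= age < high, found by binary search."""
    start = bisect.bisect_left(sorted_index, (low,))
    end = bisect.bisect_left(sorted_index, (high,))
    return [vertex_id for _, vertex_id in sorted_index[start:end]]


print(lookup_exact(29))      # [1]
print(lookup_range(28, 33))  # [1, 4]
```

A hash lookup answers equality quickly but has no notion of order, so a query such as "age greater than 28" needs a structure that keeps the values sorted.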
Distributed Titan Limitations/Gotchas 
• Limitations which are present but which are scheduled to 
be remedied 
• Property indexes must be created before property is ever used 
• Unable to drop indices 
• Types cannot be changed once created 
• Gotchas 
• Multiple graphs on same backend requires specific configurations 
per graph 
• Ghost vertices – certain concurrency circumstances can leave 
traces of vertices. Recommendation is to allow this and periodically 
clean them up
Titan Graph Database - Gremlin 
graph vertices edges properties 
G = (V , E , λ)
DATA MODELING 
EXAMPLE 
A Blogging Application
“Bloggie Blog” Requirements 
• Create users, posts, and comments 
• Retrieve all posts for a user 
• Retrieve posts by time range 
• Retrieve all comments for a user 
• Retrieve all comments for a post, sorted by vote 
• Retrieve the top N posts, sorted by vote 
• User can only vote *once* on a post or comment
Get Cassandra & Titan 
• https://github.com/thinkaurelius/titan/wiki/Downloads (0.3.2 stable) 
$ $TITAN_LOCATION/bin/gremlin.sh 
         \,,,/ 
         (o o) 
-----oOOo-(_)-oOOo----- 
gremlin> g = new TinkerGraph(); 
==>tinkergraph[vertices:0 edges:0] 
gremlin>
Modeling Entities (User, Post, Comment) 
• There’s no one way to model this. 
• General rules to follow: 
• 1-N relationships can be modeled as one node with N edges pointing to 
other nodes 
• 1-1 relationships can be modeled as a simple edge between two nodes 
• M-N relationships are just more edges 
• It is important to categorize the different types of edges since many 
different types of edges will connect to a single node 
• Don’t shy away from attaching properties to edges. Remember that edges 
are just as queryable as nodes. 
• A common practice is to model “actions” as edges and 
“actors”/“artifacts” as nodes 
• Denormalize to minimize traversals
Users, Posts, Comments
Retrieve User’s Posts 
• Let’s create a user and post 
• Link them together 
• Retrieve the user and their posts 
gremlin> g.addVertex([type: 'user', email: 'bob@test.com', name: 'Robert', password: 'asdf']) 
==>v[0] 
gremlin> g.addVertex([type: 'post', guid: '21EC2020-3AEA-1069-A2DD-08002B30309D', 
           title: 'Hello World', text: 'My first post!', userDisplayName: 'Bob']) 
==>v[1] 
gremlin> g.addEdge(g.v(0), g.v(1), 'postAuthor') 
==>e[3][0-postAuthor->1] 
gremlin> g.V.has('type', 'post').as('posts'). 
           inE('postAuthor'). 
           outV.has('email', 'bob@test.com'). 
           back('posts').map() 
==>{guid=21EC2020-3AEA-1069-A2DD-08002B30309D, text=My first post!, 
    title=Hello World, userDisplayName=Bob, type=post}
Retrieve Posts by Time Range 
• Add timestamp property to post 
• Query by range 
gremlin> g.V. 
           has('guid', '21EC2020-3AEA-1069-A2DD-08002B30309D'). 
           has('type', 'post').sideEffect( 
             {it.createTimestamp = 1383726500}); 
==>v[1] 
gremlin> g.V. 
           has('createTimestamp', T.gt, 1383726400). 
           has('createTimestamp', T.lt, 1383726600). 
           map() 
==>{guid=21EC2020-3AEA-1069-A2DD-08002B30309D, createTimestamp=1383726500, 
    text=My first post!, title=Hello World, userDisplayName=Bob, type=post}
Retrieve All User’s Comments 
• Add comment 
• Link to author and to post 
gremlin> g.addVertex([type: 'comment', guid: '3F2504E0-4F89-11D3-9A0C-0305E82C3301', 
           text: 'I like it!', userDisplayName: 'Sally', createTimestamp: 1383736500]) 
==>v[4] 
gremlin> g.addEdge(g.v(1), g.v(4), 'postComment') 
==>e[5][1-postComment->4] 
gremlin> g.addVertex([type: 'user', email: 'sally@test.com', name: 'Sally', password: 'qwerty']) 
==>v[6] 
gremlin> g.addEdge(g.v(6), g.v(4), 'commentAuthor') 
==>e[7][6-commentAuthor->4] 
gremlin> g.V.has('type', 'comment').as('comments'). 
           inE('commentAuthor').outV.has('email', 'sally@test.com'). 
           back('comments').map() 
==>{guid=3F2504E0-4F89-11D3-9A0C-0305E82C3301, createTimestamp=1383736500, 
    text=I like it!, userDisplayName=Sally, type=comment}
Retrieve top N posts by vote 
• Create “postVote” edge and 
aggregated votes count in post 
• Query and sort by votes 
gremlin> g.addEdge(g.v(6), g.v(1), 'postVote', [date: 1383726600]) 
==>e[8][6-postVote->1] 
gremlin> g.V.has('type','post').has('guid','21EC2020-3AEA-1069-A2DD-08002B30309D'). 
           sideEffect({it.votes = 1}) 
==>v[1] 
gremlin> g.addVertex([type: 'post', guid: '21EC2020-3AEA-1069-A2DD-08002B30309E', 
           createTimestamp: 1383726600, title: 'Learning Gremlin', 
           text: 'Gremlin is neat.', userDisplayName: 'Bob', votes: 2]) 
==>v[9] 
gremlin> g.V('type', 'post'). 
           order({it.b.getProperty('votes') <=> it.a.getProperty('votes')}). 
           transform({['title' : it.getProperty('title'), 
                       'votes' : it.getProperty('votes')]})[0..5] 
==>{title=Learning Gremlin, votes=2} 
==>{title=Hello World, votes=1}
Retrieve Post Comments Sorted by Vote 
• Similar to post votes 
gremlin 
g.addEdge(g.v(0), 
g.v(4), 
'commentVote', 
[date: 
1383726700]) 
==e[10][0-­‐commentVote-­‐4] 
gremlin 
g.V.has('type','comment').has('guid','3F 
2504E0-­‐4F89-­‐11D3-­‐9A0C-­‐0305E82C3301').sid 
eEffect({it.votes 
= 
1}) 
==v[4] 
gremlin> g.addVertex([
             type: 'comment',
             guid: '3F2504E0-4F89-11D3-9A0C-0305E82C3302',
             text: 'Thanks.',
             userDisplayName: 'Bob',
             createTimestamp: 1383736500])
==>v[11]
gremlin> g.addEdge(g.v(1), g.v(11), 'postComment')
gremlin> g.addEdge(g.v(0), g.v(11), 'commentAuthor')
gremlin> g.v(1).outE('postComment').inV.order({it.b.getProperty('votes') <=> it.a.getProperty('votes')}).map()
==>{guid=3F2504E0-4F89-11D3-9A0C-0305E82C3301, createTimestamp=1383736500, text=I like it!, votes=1, userDisplayName=Sally, type=comment}
==>{guid=3F2504E0-4F89-11D3-9A0C-0305E82C3302, createTimestamp=1383736500, text=Thanks., userDisplayName=Bob, type=comment}
User Can Only Vote Once 
• Could enforce using external
unique indexes
• Or do a 2-step check-then-increment in
Gremlin (small chance of duplicates)
gremlin> user = g.v(0); post = g.v(1);
         if (post.inE('postVote').outV.has('email', user.email).count() == 0) {
           g.addEdge(user, post, 'postVote', [date: new Date().getTime()]);
           if (post.getProperty('votes') != null) {
             post.votes++;
           } else {
             post.votes = 1;
           }
         }
==>1
gremlin> // same command above
==>null
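The small chance of duplicates comes from the gap between checking for an existing vote and adding the new edge: two concurrent requests can both pass the check before either one writes. The sketch below shows the same check-then-increment logic in plain Python, with a lock standing in for whatever makes the two steps atomic in a real deployment (for example a unique index). The function name vote_once, the voters dictionary, and the email address are hypothetical.

```python
import threading
import time

# Assumption: a single-process lock stands in for a unique index or transaction.
votes_lock = threading.Lock()


def vote_once(post, user_email):
    """Record a vote unless this user already voted.

    Returns the new vote count, or None if the vote was ignored.
    """
    with votes_lock:  # the check and the increment happen as one step
        if user_email in post["voters"]:
            return None  # repeat vote: nothing changes
        # Millisecond timestamp, similar to new Date().getTime() in the script above.
        post["voters"][user_email] = int(time.time() * 1000)
        post["votes"] = post.get("votes", 0) + 1
        return post["votes"]


# One earlier vote already exists on the post, as in the console session.
post = {"title": "Hello World", "votes": 1, "voters": {"sally@test.com": 1383726600}}
print(vote_once(post, "user@example.com"))  # first vote from this user is counted
print(vote_once(post, "user@example.com"))  # second attempt is ignored
```

Without the lock, the membership check and the write are separate steps, which is the window the slide's "2-step" wording refers to.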
Graph Visualization
Areas Not Covered 
• Map/Reduce 
• Gremlin has its own built-in M/R API 
• Indexing 
• Titan currently has a limitation requiring that all indexes be created up front 
• Integration with other backends 
• HBase, Oracle Berkeley DB, Hazelcast, Persistit 
• Detailed full-text search through external indexes 
• Graph analytics engine (Faunus) 
• Deep dive into the Gremlin query language and 
Groovy 
• Seriously, there’s a TON there.
References 
http://sql2gremlin.com/ 
http://www.tinkerpopbook.com/ - http://www.tinkerpop.com/ 
https://github.com/thinkaurelius/titan/wiki/Getting-Started 
https://groups.google.com/forum/#!forum/gremlin-users 
https://groups.google.com/forum/#!forum/aureliusgraphs 
http://thinkaurelius.com/
THANK YOU 
{ 
“email” : “calebjones@gmail.com”, 
“website” : “http://calebjones.info”, 
“twitter” : “@JonesWCaleb” 
}

Intro to Graph Databases Using Tinkerpop, TitanDB, and Gremlin

  • 1. INTRO TO GRAPH DATABASES Using Tinkerpop, TitanDB, and Gremlin { “email” : “calebjones@gmail.com”, “website” : “http://calebjones.info”, “twitter” : “@JonesWCaleb” }
  • 2. Overview • Why Graphs? • Order to complexity • Use cases – major players • Graphs & Adjacency Matrices • Tinkerpop Framework • Blueprints, Frames, Pipes, Furnace, Gremlin, Rexster • Titan using Cassandra • Blog Application (lab) • Traversals using Gremlin
  • 4. Warren Weaver • 17th - 19th century • Problems of simplicity • How one element interacts with another • First half of 20th century • Problem of disorganized complexity • Many elements operating in a system w/o regard to how they interact with each other • Predicted • Problem of organized complexity • Many elements operating in a system taking into account how they interact with each other • Would require computational power far beyond what was currently available Science and Complexity 1948 ENIAC (1946)
  • 9. Order to Complexity • Trees describe order • Linear (simple lineage) • Categorized • Single dimensional • Symmetrical • Hierarchical • Convergent modeling • Networks describe complexity • Non-linear (multi-lineage) • Multi-categorical • Multi-dimensional • Asymmetrical • Decentralized • Divergent modeling
  • 18. Types of Networks [figure: network graph of comic-book creators, with node labels such as Stan Lee, Steve Ditko, and Frank Miller]
  • 19. Types of Networks • Neuron Network of Mouse • Millennium Simulation (2005): Largest astronomical simulation ever on the structure and evolution of galaxies in the universe. 25 TB of data and 20 million galaxies
  • 20. Use Cases • Recommendation engines (avoid relational N-JOIN or self-JOIN) • Ranking/credibility (Google’s PageRank) • Path finding (shortest, longest, mutual friends) • Social (friendship, following, key connectors)
  • 21. Graphs • Node/Vertex: An entity that can have zero or more edges connected to it. • Edge: An entity which connects two nodes. May be directed or undirected.
  • 22. Adjacency Matrix • If the graph is undirected, the adjacency matrix is symmetric • Thus, the transpose of the matrix describes the same graph
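The symmetry claim is easy to check on a small example. The sketch below builds a 0/1 adjacency matrix from an edge list and compares it with its transpose. The edge list and function names are hypothetical and chosen only for illustration.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 adjacency matrix for vertices numbered 0..n-1."""
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        matrix[a][b] = 1
        if not directed:
            matrix[b][a] = 1  # an undirected edge is recorded in both directions
    return matrix


def transpose(matrix):
    """Swap rows and columns."""
    return [list(row) for row in zip(*matrix)]


edges = [(0, 1), (1, 2), (2, 0), (2, 3)]  # hypothetical example edges

undirected = adjacency_matrix(4, edges)
directed = adjacency_matrix(4, edges, directed=True)

print(undirected == transpose(undirected))  # True: the matrix is symmetric
print(directed == transpose(directed))      # False: transposing reverses every edge
```

For a directed graph the transpose is still meaningful: it is the adjacency matrix of the graph with every edge reversed.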
  • 23. Adjacency Matrix • Some graphs have different ‘types’ or dimensions of edges
  • 24. Property Graphs • Vertices: {id: 1, name: Ivan}, {id: 2, name: Bob}, {id: 3, name: Eve}, {id: 4, name: Alice} • Edges: {id: E1, type: cousin, separation: 1}, {id: E2, type: knows, since: 2013-09-01}, {id: E3, type: knows, since: 2013-09-01}, {id: E4, type: sibling, twins: true}
  • 25. Traversals • Breadth-first • 3, 2, 4, 1 • Depth-first • 3, 2, 1, 4 • Breadth-first and depth-first search can be combined. • Filtering • Ability to filter/sort paths in traversal • Aggregating • Ability to aggregate/count properties as traversal occurs and affect traversal with result of aggregation (e.g. power-grid load distribution) • Backtracking • Leave a marker in the traversal and come back to it when a certain criterion is met in a lower step
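The two visit orders listed above (3, 2, 4, 1 and 3, 2, 1, 4) can be reproduced with a short sketch. The slide's figure is not available as text, so the edge set below is an assumption inferred from those orders: vertex 3 is connected to 2 and 4, and vertex 2 is connected to 1.

```python
from collections import deque

# Assumed undirected graph (inferred from the visit orders, not from the figure).
graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}


def bfs(graph, start):
    """Breadth-first: visit all neighbors of a vertex before going deeper."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


def dfs(graph, start, seen=None):
    """Depth-first: follow one branch to its end before trying the next."""
    seen = set() if seen is None else seen
    seen.add(start)
    order = [start]
    for neighbor in graph[start]:
        if neighbor not in seen:
            order.extend(dfs(graph, neighbor, seen))
    return order


print(bfs(graph, 3))  # [3, 2, 4, 1]
print(dfs(graph, 3))  # [3, 2, 1, 4]
```

The only difference between the two is the order in which discovered vertices are expanded: a queue gives breadth-first, while recursion (an implicit stack) gives depth-first.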
  • 27. Tinkerpop
      • A comprehensive, open-source graph framework (http://www.tinkerpop.com/)
      • Blueprints: Property graph model that is DB agnostic. A kind of JDBC for graphs.
      • Pipes: Data flow API for processing graphs. Underlying component for graph traversals.
      • Gremlin: DSL for traversing property graphs. Available as a JSR-223 script engine.
      • Frames: Maps between domain objects and the graph’s nodes and edges. Like ORM for graphs.
      • Furnace: Collection of common graph analysis algorithms for property graphs.
      • Rexster: Exposes any Blueprints graph via a uniform RESTful API.
  • 28. Tinkerpop Stack
      • Different components all build on each other
      • Provides abstraction from HTTP layer, to object mapping layer, to traversal scripting, to pluggable graph API
      • Blueprints underpins the stack, making it all DB agnostic
      • Blueprints implementations:
        • Neo4j, Sail, OrientDB, Dex
        • Implemented by 3rd parties: Accumulo, ArangoDB, Bitsy, FluxGraph, FoundationDB, InfiniteGraph, MongoDB, Oracle NoSQL, TitanDB
  • 29. Tinkerpop - Rexster • Provides REST and binary (RexPro - grizzly) protocols • Flexible extension model (e.g. ad-hoc Gremlin queries) • Server-side stored procedures (Gremlin) • Browser-based interface (Dog House) • Command-line tool for interacting with API • Pluggable security • SPARQL plugin to work against Sail graphs (OpenRDF) • More information: https://github.com/tinkerpop/rexster/wiki
  • 30. Tinkerpop - Furnace • Collection of industry-standard algorithms for traversing or analyzing graphs. • Network generators (by clique or degree distribution) • Search: A*, Breadth-first, Depth-first • Shortest path • Bellman-Ford (like Dijkstra’s but can handle negative edge weights) • PageRank • Degree Distribution • More information: https://github.com/tinkerpop/furnace/wiki
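To make the Bellman-Ford remark concrete, here is a minimal plain-Python sketch of the algorithm. It is not Furnace's API; the function name, the edge-triple format, and the sample graph are assumptions made for illustration.

```python
def bellman_ford(vertex_count, edges, source):
    """Shortest distances from source; edges are (u, v, weight) triples.

    Raises ValueError if a negative-weight cycle is reachable from source.
    """
    distance = [float("inf")] * vertex_count
    distance[source] = 0
    # Relax every edge up to vertex_count - 1 times.
    for _ in range(vertex_count - 1):
        for u, v, weight in edges:
            if distance[u] + weight < distance[v]:
                distance[v] = distance[u] + weight
    # Any further improvement means a negative cycle exists.
    for u, v, weight in edges:
        if distance[u] + weight < distance[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return distance


# Hypothetical graph with one negative edge (2 -> 1, weight -3)
sample_edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
print(bellman_ford(4, sample_edges, 0))  # [0, 2, 5, 4]
```

Dijkstra's algorithm would settle node 1 at distance 4 before discovering the cheaper route through the negative edge, which is why negative weights call for Bellman-Ford.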
  • 31. Tinkerpop - Frames More Information: https://github.com/tinkerpop/frames/wiki
  • 32. Tinkerpop - Pipes
      • Dataflow framework for processing graphs.
      • Each computational step becomes a node, and an edge is a communication channel between steps.
      • Pipes are then chained and nested.
      • Custom pipes can be created.
      • Pipe types:
        • Transform – emit transformation of object
          • Dozens of different types of transforms
        • Filter – decide whether to include/exclude object in traversal
          • ~20 different types of filters
        • sideEffect – include object but produce side-effect from it
          • ~15 different types of sideEffects (e.g. group, count, table, tree)
        • Branch – decide which step to take next in traversal
          • Several different branching options
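The pipe types can be imitated with Python generators, which also chain lazily. This is only an analogy for the dataflow idea, not the Pipes API; the function names `transform`, `keep_if`, and `side_effect` are made up for the sketch, while the sample names and ages match the TinkerGraph toy data used later in the deck.

```python
def transform(function, source):
    """Transform pipe: emit function(item) for each incoming item."""
    for item in source:
        yield function(item)


def keep_if(predicate, source):
    """Filter pipe: pass an item along only if the predicate holds."""
    for item in source:
        if predicate(item):
            yield item


def side_effect(action, source):
    """Side-effect pipe: run action on each item, then emit it unchanged."""
    for item in source:
        action(item)
        yield item


people = [{"name": "marko", "age": 29},
          {"name": "vadas", "age": 27},
          {"name": "josh", "age": 32}]
seen = []  # filled in by the side-effect step

# Chain the pipes: filter by age, record what passed, then project the name.
pipeline = transform(lambda person: person["name"],
                     side_effect(seen.append,
                                 keep_if(lambda person: person["age"] > 28,
                                         iter(people))))
names = list(pipeline)
print(names)      # ['marko', 'josh']
print(len(seen))  # 2
```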
  • 33. Tinkerpop - Blueprints • Like JDBC but for graphs. • Common API for Property Graphs which are very flexible • Foundational component for Pipes, Gremlin, Frames, Furnace, and Rexster • Supports transactions (if underlying DB engine does) • Multi-threaded transactions supported • Format readers/writers (GML, GraphML, GraphSON) • More Information: https://github.com/tinkerpop/blueprints/wiki
  • 34. Tinkerpop - Gremlin
      • Graph traversal scripting language.
      • Works against the Blueprints API and is “compiled” into Pipes data-flows.
      • Both native Java and Groovy (JSR-223) supported.
      • Step library (https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps)
        • Transform – emit transformation of object
          • Dozens of different types of transforms
        • Filter – decide whether to include/exclude object in traversal
          • ~20 different types of filters
        • sideEffect – include object but produce side-effect from it
          • ~15 different types of sideEffects (e.g. group, count, table, tree)
        • Branch – decide which step to take next in traversal
          • Several different branching options
  • 35. SQL → Gremlin (secret decoder ring)
      • Get all users
          SQL:     select * from users
          Gremlin: g.V('type', 'user').map()
      • Get user names
          SQL:     select name from users
          Gremlin: g.V('type', 'user').name
      • Get user names/ages
          SQL:     select name, age from users
          Gremlin: g.V('type', 'user').transform({ ['name' : it.getProperty('name'), 'age' : it.getProperty('age')] })
      • Get distinct user ages
          SQL:     select distinct(age) from users
          Gremlin: g.V('type', 'user').age.dedup()
      • Get oldest user
          SQL:     select max(age) from users
          Gremlin: g.V('type', 'user').age.max()
  • 36. SQL → Gremlin (secret decoder ring)
      • Select by equality
          SQL:     select * from users where age = 35
          Gremlin: g.V('type', 'user').has('age', 35).map()
      • Select by comparison
          SQL:     select * from users where age > 21
          Gremlin: g.V('type', 'user').has('age', T.gt, 21).map()
      • Select by multiple criteria
          SQL:     select * from users where sex = "M" and age > 25
          Gremlin: g.V('type', 'user').has('age', T.gt, 25).has('sex', 'M').map()
      • Order by age (switch 'a' and 'b' to do asc)
          SQL:     select * from users order by age desc
          Gremlin: g.V('type', 'user').order({ it.b.getProperty('age') <=> it.a.getProperty('age') }).map()
      • Paging
          SQL:     select * from users order by age desc limit 5 offset 5
          Gremlin: g.V('type', 'user').order({ it.b.getProperty('age') <=> it.a.getProperty('age') })[5..<10].map()
  • 37. SQL → Gremlin (secret decoder ring)
      • Join
          SQL:     select users.* from users inner join groups on users.gId = groups.id where groups.name = "devs"
          Gremlin: g.V('type', 'groups').has('name', 'devs').in('inGroup').map()
      • Join-on-join-on-join …
          SQL:
            SELECT TOP (5) [t14].[ProductName]
            FROM (SELECT COUNT(*) AS [value], [t13].[ProductName]
                  FROM [customers] AS [t0]
                  CROSS APPLY (SELECT [t9].[ProductName]
                               FROM [orders] AS [t1]
                               CROSS JOIN [order details] AS [t2]
                               INNER JOIN [products] AS [t3] ON [t3].[ProductID] = [t2].[ProductID]
                               CROSS JOIN [order details] AS [t4]
                               INNER JOIN [orders] AS [t5] ON [t5].[OrderID] = [t4].[OrderID]
                               LEFT JOIN [customers] AS [t6] ON [t6].[CustomerID] = [t5].[CustomerID]
                               CROSS JOIN ([orders] AS [t7]
                                           CROSS JOIN [order details] AS [t8]
                                           INNER JOIN [products] AS [t9] ON [t9].[ProductID] = [t8].[ProductID])
                               WHERE NOT EXISTS(SELECT NULL AS [EMPTY]
                                                FROM [orders] AS [t10]
                                                CROSS JOIN [order details] AS [t11]
                                                INNER JOIN [products] AS [t12] ON [t12].[ProductID] = [t11].[ProductID]
                                                WHERE [t9].[ProductID] = [t12].[ProductID]
                                                  AND [t10].[CustomerID] = [t0].[CustomerID]
                                                  AND [t11].[OrderID] = [t10].[OrderID])
                                 AND [t6].[CustomerID] <> [t0].[CustomerID]
                                 AND [t1].[CustomerID] = [t0].[CustomerID]
                                 AND [t2].[OrderID] = [t1].[OrderID]
                                 AND [t4].[ProductID] = [t3].[ProductID]
                                 AND [t7].[CustomerID] = [t6].[CustomerID]
                                 AND [t8].[OrderID] = [t7].[OrderID]) AS [t13]
                  WHERE [t0].[CustomerID] = N'ALFKI'
                  GROUP BY [t13].[ProductName]) AS [t14]
            ORDER BY [t14].[value] DESC
          Gremlin:
            g.V('customerId','ALFKI').as('customer')
              .out('ordered').out('contains').out('is').as('products')
              .in('is').in('contains').in('ordered').except('customer')
              .out('ordered').out('contains').out('is').except('products')
              .groupCount().cap()
              .orderMap(T.decr)[0..<5]
              .productName
  • 38. Gremlin Resources • Tinkerpop resources • https://github.com/tinkerpop/gremlin/wiki/Basic-Graph-Traversals • https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps • https://github.com/tinkerpop/gremlin/wiki/Using-Gremlin-through-Java • https://groups.google.com/forum/#!forum/gremlin-users • https://github.com/tinkerpop/gremlin/wiki/SPARQL-vs.-Gremlin • http://markorodriguez.com/2011/08/03/on-the-nature-of-pipes/ • http://sql2gremlin.com/ • http://gremlindocs.com/ • Groovy • http://groovy.codehaus.org/Beginners+Tutorial • http://groovy.codehaus.org/Collections • Misc • http://www.fromdev.com/2013/09/Gremlin-Example-Query-Snippets-Graph-DB.html • http://markorodriguez.com/2011/06/15/graph-pattern-matching-with-gremlin-1-1/
  • 40. Tinkerpop - Gremlin
      gremlin> g = TinkerGraphFactory.createTinkerGraph()
      ==>tinkergraph[vertices:6 edges:6]
      gremlin> g.V.count()
      ==>6
      gremlin> g.E.count()
      ==>6
      gremlin> g.v(1)
      ==>v[1]
      gremlin> g.v(1).map
      ==>{age=29, name=marko}
      gremlin> g.v(1).outE
      ==>e[7][1-knows->2]
      ==>e[8][1-knows->4]
      ==>e[9][1-created->3]
      gremlin> g.v(1).outE('knows')
      ==>e[7][1-knows->2]
      ==>e[8][1-knows->4]
      gremlin> g.v(1).outE('knows').map
      ==>{weight=0.5}
      ==>{weight=1.0}
  • 41. Tinkerpop - Gremlin
      // get vertices known by marko
      gremlin> g.v(1).outE('knows').inV
      ==>v[2]
      ==>v[4]
      // get properties of vertices known by marko
      gremlin> g.v(1).outE('knows').inV.map
      ==>{age=27, name=vadas}
      ==>{age=32, name=josh}
      // filter by those older than 30
      gremlin> g.v(1).outE('knows').inV.filter{it.age > 30}.map
      ==>{age=32, name=josh}
      // just get name
      gremlin> g.v(1).outE('knows').inV.filter{it.age > 30}.name
      ==>josh
      // find nodes who 'know' someone older than 30
      gremlin> g.V.as('x').outE('knows').inV.has('age', T.gt, 30).back('x').map
      ==>{age=29, name=marko}
  • 42. Tinkerpop - Gremlin
      // find edges with weight > .5
      gremlin> g.E.filter{it.weight > 0.5}
      ==>e[10][4-created->5]
      ==>e[8][1-knows->4]
      // find edges with weight > .5 from marko
      gremlin> g.E.filter{it.weight > 0.5}.as('x').outV.has('name', T.eq, 'marko').back('x')
      ==>e[8][1-knows->4]
      // find nodes 'created' by other nodes
      gremlin> g.V.as('x').inE('created').back('x').map
      ==>{name=lop, lang=java}
      ==>{name=ripple, lang=java}
      gremlin> g.E.filter{it.label == 'created'}.inV.dedup().map
      ==>{name=lop, lang=java}
      ==>{name=ripple, lang=java}
      // count how many nodes 'created' each node (v[3] has more than one creator)
      gremlin> g.E.filter{it.label == 'created'}.inV.groupCount().cap()
      ==>{v[3]=3, v[5]=1}
      // find nodes 'created' by marko's friends
      gremlin> g.v(1).outE('knows').inV.outE('created').inV.map
      ==>{name=ripple, lang=java}
      ==>{name=lop, lang=java}
  • 43. Tinkerpop - Gremlin
      // add some new nodes
      gremlin> g.addVertex([name:'bob',age:'60'])
      ==>v[0]
      gremlin> g.addVertex([name:'eve',age:'40'])
      ==>v[7]
      gremlin> g.addVertex([name:'timmy',age:'5'])
      ==>v[8]
      // add some edges
      gremlin> g.addEdge(g.v(0), g.v(7), 'friend')
      ==>e[13][0-friend->7]
      gremlin> g.addEdge(g.v(0), g.v(8), 'child')
      ==>e[14][0-child->8]
      // find the 'child' edge from bob to timmy
      gremlin> g.V.filter{it.name == 'bob'}.outE('child').as('x').inV.filter{it.name == 'timmy'}.back('x')
      ==>e[14][0-child->8]
      // remove the edge, then repeat the query
      gremlin> g.removeEdge(g.e(14))
      ==>null
      gremlin> g.V.filter{it.name == 'bob'}.outE('child').as('x').inV.filter{it.name == 'timmy'}.back('x')
      // no results
  • 44. Tinkerpop - Gremlin
      // previously
      gremlin> g.addVertex([name:'bob',age:'60'])
      ==>v[0]
      gremlin> g.addVertex([name:'eve',age:'40'])
      ==>v[7]
      gremlin> g.addEdge(g.v(0), g.v(7), 'friend')
      ==>e[13][0-friend->7]
      // query for edge
      gremlin> g.v(0).outE
      ==>e[13][0-friend->7]
      // remove vertex (auto removes orphaned edge)
      gremlin> g.removeVertex(g.v(7))
      ==>null
      gremlin> g.v(0).outE
      // no results
      gremlin> g.e(13)
      ==>null
  • 45. TITAN A Distributed Graph Database
  • 46. Titan Graph Database • Optimized to work against billions of nodes and edges • Theoretical limit of 2^60 edges and half as many (2^59) nodes • Works with several different distributed DBs including Cassandra and HBase • Supports many concurrent users doing complex graph traversals simultaneously • Native integration with Tinkerpop stack • Supports integration with search technologies such as Lucene and Elasticsearch • Created by Thinkaurelius (http://thinkaurelius.com/)
  • 47. Titan Distributed Architecture
      • TitanDB can integrate with distributed architectures in a few different ways
      • Native
        • Connects remotely to cluster (or local)
        • Can scale size as far as cluster can
        • Native Titan API
        • Possible processing bottleneck
      • Remote
        • Put Rexster in front to allow RESTful access
        • Connects remotely to cluster
        • Can scale size as far as cluster can
        • Possible processing bottleneck
      • Embedded
        • TitanDB and Rexster run on each node in the cluster
        • Can run on same JVM
        • Considerable performance/scalability improvement
  • 48. Titan Indexing
      • Standard index
        • Internal to Titan
        • Very fast but only supports exact matches
      • External index
        • Uses an indexing engine external to Titan (Lucene or Elasticsearch)
        • Supports range queries
        • Lucene
          • Limited to only one machine (small-sized datasets)
          • Also has a richer set of search features (than Elasticsearch)
        • Elasticsearch
          • Distributed
          • Not as feature-filled as Lucene
  • 49. Distributed Titan Limitations/Gotchas
      • Limitations that are present but scheduled to be remedied
        • Property indexes must be created before the property is ever used
        • Unable to drop indices
        • Types cannot be changed once created
      • Gotchas
        • Multiple graphs on the same backend require specific configurations per graph
        • Ghost vertices – certain concurrency circumstances can leave traces of vertices. Recommendation is to allow this and periodically clean them up
  • 50. Titan Graph Database - Gremlin graph vertices edges properties G = (V , E , λ)
  • 52. Titan Graph Database - Gremlin graph vertices edges properties G = (V , E , λ) Application
  • 55. DATA MODELING EXAMPLE A Blogging Application
  • 56. “Bloggie Blog” Requirements • Create users, posts, and comments • Retrieve all posts for a user • Retrieve posts by time range • Retrieve all comments for a user • Retrieve all comments for a post, sorted by vote • Retrieve the top N posts, sorted by vote • User can only vote *once* on a post or comment
  • 57. Get Cassandra Titan
      • https://github.com/thinkaurelius/titan/wiki/Downloads (0.3.2 stable)
      $ $TITAN_LOCATION/bin/gremlin.sh
               \,,,/
               (o o)
      -----oOOo-(_)-oOOo-----
      gremlin> g = new TinkerGraph();
      ==>tinkergraph[vertices:0 edges:0]
      gremlin>
  • 58. Modeling Entities (User, Post, Comment)
      • There’s no one way to model this.
      • General rules to follow:
        • 1-N relationships can be modeled as one node with N edges pointing to other nodes
        • 1-1 relationships can be modeled as a simple edge between two nodes
        • M-N relationships are just more edges
        • It is important to categorize the different types of edges, since many different types of edges will connect to a single node
        • Don’t shy away from attaching properties to edges. Remember that edges are just as queryable as nodes.
        • A common practice is to model “actions” as edges and “actors”/“artifacts” as nodes
        • Denormalize to minimize traversals
  • 60. Retrieve User’s Posts
      • Let’s create a user and post
      • Link them together
      • Retrieve the user and their posts
      gremlin> g.addVertex([type: 'user', email: 'bob@test.com', name: 'Robert', password: 'asdf'])
      ==>v[0]
      gremlin> g.addVertex([type: 'post', guid: '21EC2020-3AEA-1069-A2DD-08002B30309D', title: 'Hello World', text: 'My first post!', userDisplayName: 'Bob'])
      ==>v[1]
      gremlin> g.addEdge(g.v(0), g.v(1), 'postAuthor')
      ==>e[3][0-postAuthor->1]
      gremlin> g.V.has('type', 'post').as('posts').inE('postAuthor').outV.has('email', 'bob@test.com').back('posts').map()
      ==>{guid=21EC2020-3AEA-1069-A2DD-08002B30309D, text=My first post!, title=Hello World, userDisplayName=Bob, type=post}
  • 61. Retrieve Posts by Time Range
      • Add timestamp property to post
      • Query by range
      gremlin> g.V.has('guid', '21EC2020-3AEA-1069-A2DD-08002B30309D').has('type', 'post').sideEffect({it.createTimestamp = 1383726500});
      ==>v[1]
      gremlin> g.V.has('createTimestamp', T.gt, 1383726400).has('createTimestamp', T.lt, 1383726600).map()
      ==>{guid=21EC2020-3AEA-1069-A2DD-08002B30309D, createTimestamp=1383726500, text=My first post!, title=Hello World, userDisplayName=Bob, type=post}
  • 62. Retrieve All User’s Comments
      • Add comment
      • Link to author and to post
      gremlin> g.addVertex([type: 'comment', guid: '3F2504E0-4F89-11D3-9A0C-0305E82C3301', text: 'I like it!', userDisplayName: 'Sally', createTimestamp: 1383736500])
      ==>v[4]
      gremlin> g.addEdge(g.v(1), g.v(4), 'postComment')
      ==>e[5][1-postComment->4]
      gremlin> g.addVertex([type: 'user', email: 'sally@test.com', name: 'Sally', password: 'qwerty'])
      ==>v[6]
      gremlin> g.addEdge(g.v(6), g.v(4), 'commentAuthor')
      ==>e[7][6-commentAuthor->4]
      gremlin> g.V.has('type', 'comment').as('comments').inE('commentAuthor').outV.has('email', 'sally@test.com').back('comments').map()
      ==>{guid=3F2504E0-4F89-11D3-9A0C-0305E82C3301, createTimestamp=1383736500, text=I like it!, userDisplayName=Sally, type=comment}
  • 63. Retrieve top N posts by vote
      • Create “postVote” edge and aggregated votes count in post
      • Query and sort by votes
      gremlin> g.addEdge(g.v(6), g.v(1), 'postVote', [date: 1383726600])
      ==>e[8][6-postVote->1]
      gremlin> g.V.has('type', 'post').has('guid', '21EC2020-3AEA-1069-A2DD-08002B30309D').sideEffect({it.votes = 1})
      ==>v[1]
      gremlin> g.addVertex([type: 'post', guid: '21EC2020-3AEA-1069-A2DD-08002B30309E', createTimestamp: 1383726600, title: 'Learning Gremlin', text: 'Gremlin is neat.', userDisplayName: 'Bob', votes: 2])
      ==>v[9]
      gremlin> g.V('type', 'post').order({it.b.getProperty('votes') <=> it.a.getProperty('votes')}).transform({['title' : it.getProperty('title'), 'votes' : it.getProperty('votes')]})[0..5]
      ==>{title=Learning Gremlin, votes=2}
      ==>{title=Hello World, votes=1}
  • 64. Retrieve Post Comments Sorted by Vote
      • Similar to post votes
      gremlin> g.addEdge(g.v(0), g.v(4), 'commentVote', [date: 1383726700])
      ==>e[10][0-commentVote->4]
      gremlin> g.V.has('type', 'comment').has('guid', '3F2504E0-4F89-11D3-9A0C-0305E82C3301').sideEffect({it.votes = 1})
      ==>v[4]
      gremlin> g.addVertex([type: 'comment', guid: '3F2504E0-4F89-11D3-9A0C-0305E82C3302', text: 'Thanks.', userDisplayName: 'Bob', createTimestamp: 1383736500])
      ==>v[11]
      gremlin> g.addEdge(g.v(1), g.v(11), 'postComment')
      gremlin> g.addEdge(g.v(0), g.v(11), 'commentAuthor')
      gremlin> g.v(1).outE('postComment').inV.order({it.b.getProperty('votes') <=> it.a.getProperty('votes')}).map()
      ==>{guid=3F2504E0-4F89-11D3-9A0C-0305E82C3301, createTimestamp=1383736500, text=I like it!, votes=1, userDisplayName=Sally, type=comment}
      ==>{guid=3F2504E0-4F89-11D3-9A0C-0305E82C3302, createTimestamp=1383736500, text=Thanks., userDisplayName=Bob, type=comment}
  • 65. User Can Only Vote Once
      • Could enforce using external unique indexes
      • Or do 2-step incrementing in Gremlin (small chance of duplicates)
      gremlin> user = g.v(0); post = g.v(1);
               if (post.inE('postVote').outV.has('email', user.email).count() == 0) {
                 g.addEdge(user, post, 'postVote', [date: new Date().getTime()]);
                 if (post.getProperty('votes') != null) {
                   post.votes++;
                 } else {
                   post.votes = 1;
                 }
               }
      ==>1
      gremlin> // same command above
      ==>null
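The check-then-insert logic of the 2-step approach can be sketched in plain Python, independent of any graph database. The `votes` set stands in for the “postVote” edges and `vote_counts` for the denormalized `votes` property; both names and the `vote` helper are hypothetical.

```python
votes = set()     # (user_email, post_id) pairs, one per 'postVote' edge
vote_counts = {}  # post_id -> denormalized vote total


def vote(user_email, post_id):
    """Record a vote once; return the new total, or None for a repeat vote."""
    if (user_email, post_id) in votes:  # step 1: check for an existing vote
        return None
    votes.add((user_email, post_id))    # step 2: add the vote and increment
    vote_counts[post_id] = vote_counts.get(post_id, 0) + 1
    return vote_counts[post_id]


first = vote("bob@test.com", 1)
second = vote("bob@test.com", 1)
print(first)   # 1
print(second)  # None
```

In a single process the two steps cannot interleave, but against a shared database two requests can both pass the check before either writes. That gap is the small chance of duplicates mentioned on the slide, and it is the case a unique index would close.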
  • 67. Areas Not Covered • Map/Reduce • Gremlin has its own built-in M/R API • Indexing • Titan currently has limitation requiring all indexes are created up-front • Integration with other backends • HBase, Oracle Berkeley DB, Hazelcast, Persistit • Detailed full-text search through external indexes • Graph analytics engine (Faunus) • Deep dive into gremlin query language and Groovy • Seriously, there’s a TON there.
  • 68. References
      • http://sql2gremlin.com/
      • http://www.tinkerpopbook.com/
      • http://www.tinkerpop.com/
      • https://github.com/thinkaurelius/titan/wiki/Getting-Started
      • https://groups.google.com/forum/#!forum/gremlin-users
      • https://groups.google.com/forum/#!forum/aureliusgraphs
      • http://thinkaurelius.com/
  • 69. THANK YOU { “email” : “calebjones@gmail.com”, “website” : “http://calebjones.info”, “twitter” : “@JonesWCaleb” }