MySQL to Neo4j: A DBA Perspective - David Stern @ GraphConnect NY 2013

This session, part of a series of Neo4j learning opportunities for administrators, walks through best practices from installation and initial setup, through maintenance and performance tuning, all the way to production use.

Published in: Technology

  1. 1. (MySQL)-[:to]->(neo4j) A DBA Perspective Dave Stern @davestern1
  2. 2. Dev Ops @ FiftyThree
     MySQL user & admin since 1998: Multiple tiers of masters & slaves; Bare metal & AWS - EC2/RDS; MySQL & Percona
     neo4j user & admin since 2012: neo4j 1.8, 1.9; AWS: Multiple 3-instance enterprise clusters
  3. 3. How do you use MySQL? Single Instance Master/Slave, Multi-master MySQL Cluster Have you tried neo4j yet?
  4. 4. Where does FiftyThree use neo4j?
  5. 5. Where does FiftyThree use neo4j? Much more in development...
  6. 6. What is this talk about? Comparison Configuration Use
  7. 7. Comparison
  8. 8. Logical Partitioning
     MySQL: Strictly enforced schema
     neo4j: No logical databases; No tables; ...no schema; ...no joins; 2.0: schema-optional
     http://www.mysql.com/products/workbench/
  9. 9. Physical Partitioning & Sharding
     Improves write performance, usually disk I/O
     MySQL:
       innodb_file_per_table
       Databases on separate partitions or devices
       Shard horizontally (e.g. by time range)
       Shard vertically (e.g. by table or function)
       Logs can be on separate partitions for I/O gain
     neo4j:
       No logical partitioning by DB or table
       Highly connected data: no clear separation
       Logs can be on separate partitions for I/O gain
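The "shard horizontally (e.g. by time range)" option on this slide can be illustrated with a small routing function. This is a minimal sketch only: the shard names and date boundaries are hypothetical and do not come from the talk.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical shard boundaries: each shard holds rows dated on or after
# its start date and before the next shard's start date.
SHARD_STARTS = [date(2011, 1, 1), date(2012, 1, 1), date(2013, 1, 1)]
SHARD_NAMES = ["events_2011", "events_2012", "events_2013"]


def shard_for(day: date) -> str:
    """Return the name of the shard that should store a row dated `day`."""
    index = bisect_right(SHARD_STARTS, day) - 1
    if index < 0:
        raise ValueError(f"no shard covers {day}")
    return SHARD_NAMES[index]
```

A row's date alone decides its shard here, which works for tabular data but has no obvious analogue for highly connected graph data, as the slide notes.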
  10. 10. SCALE UP!
  11. 11. Authentication & Authorization MySQL
     mysql> select Host, db, user, select_priv, insert_priv, update_priv, delete_priv from db;
     +-----------+---------+--------+-------------+-------------+-------------+-------------+
     | Host      | db      | user   | select_priv | insert_priv | update_priv | delete_priv |
     +-----------+---------+--------+-------------+-------------+-------------+-------------+
     | %         | test    |        | Y           | Y           | Y           | Y           |
     | %         | test\_% |        | Y           | Y           | Y           | Y           |
     | localhost | Orders  | admin  | Y           | Y           | Y           | Y           |
     | localhost | Events  | admin  | Y           | Y           | Y           | Y           |
     | localhost | Events  | events | Y           | Y           | Y           | N           |
     | 10.%      | Events  | events | Y           | N           | N           | N           |
     +-----------+---------+--------+-------------+-------------+-------------+-------------+
  12. 12. Authentication & Authorization neo4j
     No permissions
     No users
     How do you secure the DB?
       1. Protect the database in a Private Network or VPC
       2. Firewall: router, AWS Security Groups, iptables
       3. Proxy requests via web server or Load Balancer
     If you must allow access, use HTTPS & authenticate at the proxy.
  13. 13. Replication http://www.mysqlperformanceblog.com/wp-content/uploads/2013/07/23.png
  14. 14. Replication STOP SLAVE; SET GLOBAL sql_slave_skip_counter = 1; START SLAVE;
  15. 15. Replication vs. HA
     MySQL:
       Free
       Slaves pull updates
       Eventual consistency
       One-way, asynchronous
     neo4j:
       Enterprise edition: can cost $ depending on use
       Slaves can pull asynchronous updates
       Eventual consistency, optimistic pushes to slaves are the default
       Writes to any cluster member
  16. 16. JVM Buffers & Memory management =~ JVM settings The database itself is extendable via Java ... if you're into that sort of thing
  17. 17. Built-in Tools Data Browser
  18. 18. Built-in Tools Data Browser Backup Script neo4j
     $ /opt/neo4j/bin/neo4j-backup -from single://10.66.182.177:6362 \
     > -to /media/neo4j-backup/production/2013-11-02T05:40:10Z
     Performing full backup from 'single://10.66.182.177:6362'
     ..............................
     ..............
     [44 Files copied]
     Full consistency check
     .................... 10%
     .................... 20%
     .................... 30%
     .................... 40%
     .................... 50%
     .................... 60%
     .................... 70%
     .................... 80%
     .................... 90%
     .................... 100%
     Done
  19. 19. Built-in Tools Data Browser Backup Script MySQL
     $ innobackupex --user=DBUSER --password=DBUSERPASS /path/to/BACKUP-DIR/
     innobackupex: Backup created in directory '/path/to/BACKUP-DIR/2013-03-25_00-00-09'
     innobackupex: MySQL binlog position: filename 'mysql-bin.000003', position 1946
     111225 00:00:53 innobackupex: completed OK!
  20. 20. Built-in Tools Data Browser Backup Script Visual Server Info
  21. 21. Configuration MySQL So many options...
     mysql> SHOW VARIABLES;
     +-----------------------------------------+---------------------------+
     | Variable_name                           | Value                     |
     +-----------------------------------------+---------------------------+
     | auto_increment_increment                | 1                         |
     | auto_increment_offset                   | 1                         |
     | autocommit                              | ON                        |
     | automatic_sp_privileges                 | ON                        |
     | back_log                                | 50                        |
     | basedir                                 | /home/mysql/bin/mysql-5.5 |
     | big_tables                              | OFF                       |
     | binlog_cache_size                       | 32768                     |
     | binlog_direct_non_transactional_updates | OFF                       |
     | binlog_format                           | STATEMENT                 |
     | binlog_stmt_cache_size                  | 32768                     |
     | bulk_insert_buffer_size                 | 8388608                   |
     ...
     | max_allowed_packet                      | 1048576                   |
     | max_binlog_cache_size                   | 18446744073709547520      |
     | max_binlog_size                         | 1073741824                |
     | max_binlog_stmt_cache_size              | 18446744073709547520      |
     | max_connect_errors                      | 10                        |
     | max_connections                         | 151                       |
     | max_delayed_threads                     | 20                        |
     | max_error_count                         | 64                        |
     | max_heap_table_size                     | 16777216                  |
     | max_insert_delayed_threads              | 20                        |
     | max_join_size                           | 18446744073709551615      |
     ...
  22. 22. MySQL Configuration Buffers, Caching & I/O You can optimize dozens of settings like these...
     innodb_buffer_pool_size = 12G
     innodb_buffer_pool_instances = 8
     innodb_additional_mem_pool_size = 256M
     innodb_flush_log_at_trx_commit = 2
     innodb_flush_method = O_DIRECT
     innodb_log_file_size = 128M
     innodb_log_buffer_size = 64M
     innodb_file_per_table
     innodb_io_capacity = 500
     innodb_read_io_threads = 64
     innodb_write_io_threads = 64
  23. 23. MySQL Configuration Network & Concurrency and these...
     table_cache = 2048
     max_connections = 1000
     max_allowed_packet = 16M
  24. 24. MySQL Configuration Replication and these...
     server-id = 2
     master-host = db-master.mycompany.com
     master-port = 3306
     master-user = username
     master-password = password
     master-connect-retry = 60
  25. 25. MySQL Configuration Other And these, depending on version & hardware...
     sort_buffer_size = 2M
     tmp_table_size = 32M
     join_buffer_size = 128k
     query_cache_type = 1
     query_cache_size = 64M
     open_files_limit = 8192
     ...
  26. 26. neo4j Configuration Tuning
     Simple Questions:
       How many nodes do you expect?
       How many relationships do you expect?
       Average number of properties per node and relationship?
       Optional: How do you expect to traverse the graph? Long paths and/or large result sets? Short paths and/or small result sets?
     3 things to calculate:
       File Cache Mapped Memory & Object Caches
       Heap Size
       RAM for OS
  27. 27. neo4j Configuration
     Store file                          Record size   Contents
     neostore.nodestore.db               9 B           Nodes
     neostore.relationshipstore.db       33 B          Relationships
     neostore.propertystore.db           41 B          Properties for nodes and relationships
     neostore.propertystore.db.strings   128 B         Values of string properties
     neostore.propertystore.db.arrays    128 B         Values of array properties
     Capacity Planning Estimates:
       Node size (9 B) x expected nodes (14 B in 2.0)
       Relationship size (33 B) x expected relationships
       Property size (41 B) x expected properties
       Strings & Arrays
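The capacity-planning estimates on this slide amount to multiplying each record size by the expected record count. A minimal sketch of that arithmetic follows, using the 1.9 record sizes from the slide. The workload figures are hypothetical, and treating each stored string as exactly one 128-byte block is a simplifying assumption.

```python
# Record sizes in bytes, as listed on the slide for neo4j 1.9.
NODE_RECORD = 9
RELATIONSHIP_RECORD = 33
PROPERTY_RECORD = 41
STRING_BLOCK = 128  # assumption: one block per stored string value


def estimate_store_bytes(nodes, relationships, properties, string_values=0):
    """Estimate the size of each store file from expected record counts."""
    return {
        "neostore.nodestore.db": nodes * NODE_RECORD,
        "neostore.relationshipstore.db": relationships * RELATIONSHIP_RECORD,
        "neostore.propertystore.db": properties * PROPERTY_RECORD,
        "neostore.propertystore.db.strings": string_values * STRING_BLOCK,
    }


# Hypothetical workload: 1M nodes, 5M relationships, 10M properties.
sizes = estimate_store_bytes(1_000_000, 5_000_000, 10_000_000)
for store, size in sizes.items():
    print(f"{store}: {size / 1024 ** 2:.1f} MiB")
```

The per-file totals give a starting point for sizing the mapped-memory settings in neo4j.properties.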
  28. 28. Configuration Main config files neo4j-wrapper.conf neo4j.properties neo4j-server.properties
  29. 29. Configuration neo4j-wrapper.conf Heap Size GC method
  30. 30. Configuration neo4j.properties File Caches: Mapped memory Object Caches Indexes HA Backup
  31. 31. Configuration neo4j-server.properties HTTP/S Admin client REST Database mode Logging
  32. 32. Configuration 21.2. Server Configuration 25. Configuration & Performance
  33. 33. neo4j: Buffers, Caching & I/O neo4j-wrapper.conf
     # Initial Java Heap Size (in MB)
     wrapper.java.initmemory=1024
     # Maximum Java Heap Size (in MB)
     wrapper.java.maxmemory=1024
  34. 34. neo4j: Buffers, Caching & I/O neo4j.properties
     Two types of caches: file buffer and object cache
     File Buffer Cache:
       # Default values for the low-level graph engine
       neostore.nodestore.db.mapped_memory=25M
       neostore.relationshipstore.db.mapped_memory=50M
       neostore.propertystore.db.mapped_memory=90M
       neostore.propertystore.db.strings.mapped_memory=130M
       neostore.propertystore.db.arrays.mapped_memory=130M
     Object Cache:
       node_cache_size=256M
       relationship_cache_size=256M
       # optional
       node_cache_array_fraction=5
       relationship_cache_array_fraction=5
       # The GC resistant cache described below is only available in the
       # Neo4j Enterprise Edition.
       # cache_type values: soft (default), weak, strong
       cache_type=gcr
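The three quantities the talk says to calculate (file cache, heap, RAM for the OS) have to fit in physical memory together. The sketch below checks that budget with the mapped-memory defaults from this slide and the 1024 MB heap from neo4j-wrapper.conf. The 4 GB host and the 1 GB OS reserve are hypothetical, and the calculation assumes mapped memory is allocated outside the JVM heap while the object caches live inside it.

```python
# Mapped-memory settings from the slide, in megabytes.
MAPPED_MEMORY_MB = {
    "neostore.nodestore.db.mapped_memory": 25,
    "neostore.relationshipstore.db.mapped_memory": 50,
    "neostore.propertystore.db.mapped_memory": 90,
    "neostore.propertystore.db.strings.mapped_memory": 130,
    "neostore.propertystore.db.arrays.mapped_memory": 130,
}


def remaining_ram_mb(total_ram_mb, heap_mb, os_reserve_mb, mapped=MAPPED_MEMORY_MB):
    """RAM left after the heap, the OS reserve, and the mapped-memory buffers."""
    return total_ram_mb - heap_mb - os_reserve_mb - sum(mapped.values())


# Hypothetical 4 GB host, 1024 MB heap, and an assumed 1 GB kept for the OS.
left = remaining_ram_mb(total_ram_mb=4096, heap_mb=1024, os_reserve_mb=1024)
print(f"Unallocated RAM: {left} MB")
```

A negative result would mean the configured buffers do not fit and one of the three numbers has to shrink.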
  35. 35. neo4j: Concurrency neo4j.properties
     # concurrent HTTP requests that the server will service.
     org.neo4j.server.webserver.maxthreads=64
  36. 36. neo4j: HA
     neo4j-server.properties
       org.neo4j.server.database.mode=HA
     neo4j.properties
       ha.server_id=1
       ha.initial_hosts=server1:5001,server2:5001
       #ha.discovery.url=http://example.com/list
       # Host & port to bind the cluster management communication.
       ha.cluster_server=server1:5001
       # Hostname and port to bind the HA server.
       ha.server=my-domain.com:6001
       ### Optional cluster strategies ###
       # Interval of pulling updates from master.
       ha.pull_interval=10s
       # The amount of slaves the master will ask to replicate a committed
       # transaction.
       ha.tx_push_factor=1
       # Push strategy of a transaction to a slave during commit.
       ha.tx_push_strategy=fixed # or round_robin
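Each cluster member needs its own ha.server_id but the same ha.initial_hosts list, so these lines are easy to generate. The sketch below renders them for one member using only the property names shown on the slide. The helper name, host names, and the reuse of ports 5001 and 6001 for every member are assumptions for illustration.

```python
def ha_properties(server_id, hosts, cluster_port=5001, ha_port=6001):
    """Render the HA lines of neo4j.properties for one cluster member.

    `hosts` is the ordered list of member hostnames; `server_id` is 1-based.
    """
    this_host = hosts[server_id - 1]
    initial_hosts = ",".join(f"{host}:{cluster_port}" for host in hosts)
    return "\n".join([
        f"ha.server_id={server_id}",
        f"ha.initial_hosts={initial_hosts}",
        f"ha.cluster_server={this_host}:{cluster_port}",
        f"ha.server={this_host}:{ha_port}",
    ])


# Hypothetical three-instance cluster; print the settings for member 2.
print(ha_properties(2, ["server1", "server2", "server3"]))
```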
  37. 37. Use File System
     $PATH_TO_NEO4J = /opt/neo4j
     /opt/neo4j/bin
       neo4j
       neo4j-backup
     /opt/neo4j/conf
       neo4j.properties
       neo4j-server.properties
       neo4j-wrapper.conf
     /opt/neo4j/data
     /opt/neo4j/data/graph.db
       The actual graph data
     /opt/neo4j/data/log
       All logs
  38. 38. Use File System
     $PATH_TO_NEO4J = /opt/neo4j
     /opt/neo4j/bin (/usr/bin/mysql)
       neo4j
       neo4j-backup
     /opt/neo4j/conf (/etc/mysql)
       neo4j.properties
       neo4j-server.properties
       neo4j-wrapper.conf
     /opt/neo4j/data (/var/lib/mysql)
     /opt/neo4j/data/graph.db (/var/lib/mysql/data)
       The actual graph data
     /opt/neo4j/data/log (/var/log/mysql)
       All logs
  39. 39. Use Indexes
     The database itself is a natural index
     Lucene for searches
     neo4j 2.0: Nodes have labels: Person, Location, etc. that group them into sets
       CREATE INDEX ON :Person(name)
     Look familiar?
       CREATE INDEX id_index ON Person (id);
  40. 40. Use Indexes
     neo4j 2.0: Properties can have unique constraints
       CREATE CONSTRAINT ON (book:Book) ASSERT book.isbn IS UNIQUE
     Look familiar?
       CREATE UNIQUE INDEX email_index ON Person (email);
  41. 41. Use Indexes Current 1.9.x: Auto indexing (deprecated): one for nodes, one for relationships off by default
  42. 42. Use Querying
     mysql> select * from graph_local limit 10;
     +----+-------------------+---------+---------------+------------+
     | id | graph_template_id | host_id | snmp_query_id | snmp_index |
     +----+-------------------+---------+---------------+------------+
     |  1 |                12 |       1 |             0 |            |
     |  2 |                 9 |       1 |             0 |            |
     |  3 |                10 |       1 |             0 |            |
     |  4 |                 8 |       1 |             0 |            |
     |  5 |                58 |       2 |             0 |            |
     |  6 |                62 |       2 |             0 |            |
     |  7 |                53 |       2 |             0 |            |
     |  8 |                37 |       2 |             0 |            |
     |  9 |                67 |       2 |             0 |            |
     | 10 |                65 |       2 |             0 |            |
     +----+-------------------+---------+---------------+------------+
     10 rows in set (0.00 sec)
  43. 43. http://www.mysql.com/products/workbench/
  44. 44. Use Querying via REST
     POST http://localhost:7474/db/data/cypher
     Accept: application/json; charset=UTF-8
     Content-Type: application/json
     {
       "query" : "start x = node:node_auto_index(name={startName}) match path = (x-[r]-friend) where friend.name = {name} return TYPE(r)",
       "params" : {
         "startName" : "I",
         "name" : "you"
       }
     }
     Example response:
     200: OK
     Content-Type: application/json; charset=UTF-8
     {
       "columns" : [ "TYPE(r)" ],
       "data" : [ [ "know" ] ]
     }
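The REST call on this slide can be assembled with Python's standard library. The sketch below builds the request, using the endpoint, headers, query, and parameters from the slide, but does not send it. The helper name build_cypher_request is hypothetical.

```python
import json
from urllib.request import Request

# Endpoint and query text are taken from the slide.
CYPHER_URL = "http://localhost:7474/db/data/cypher"
QUERY = (
    "start x = node:node_auto_index(name={startName}) "
    "match path = (x-[r]-friend) "
    "where friend.name = {name} "
    "return TYPE(r)"
)


def build_cypher_request(start_name, name):
    """Build (but do not send) the POST request for the Cypher endpoint."""
    body = json.dumps({
        "query": QUERY,
        "params": {"startName": start_name, "name": name},
    }).encode("utf-8")
    headers = {
        "Accept": "application/json; charset=UTF-8",
        "Content-Type": "application/json",
    }
    return Request(CYPHER_URL, data=body, headers=headers, method="POST")


request = build_cypher_request("I", "you")
# To send it against a running server: urllib.request.urlopen(request)
```

Passing values through "params" rather than concatenating them into the query string keeps the query text constant across calls.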
  45. 45. DBA Perspective
     Use the best database for the job, or both
     neo4j ships with great tools
     neo4j is easier to configure: fewer options, less complex, still flexible for optimization
     HA more robust and more opaque than basic replication
     For better or worse, JVM handles a lot for you
     Authorization - it's up to you
     Scaling up is easier than changing your data model
  46. 46. We're hiring jobs@fiftythree.com
  47. 47. Thank You! Thanks to: Aseem Kishore @aseemk Chris Leishman @cleishm Max De Marzi @maxdemarzi
