New Neo4j Auto HA Cluster

In this talk, Michael Hunger sheds some light on the new High Availability architecture of the popular Neo4j graph database. We look at the different variants of the Paxos protocol, master failover strategies, and the handling of cluster management state. This piece of infrastructure poses non-trivial distributed-consensus challenges, which makes the session interesting for anyone working on scalable systems.

Transcript

  • 1. Neo4j High Availability: New Auto-Cluster. Michael Hunger - @mesirii
  • 2. High Availability Cluster
      ๏ Neo4j Enterprise
      ๏ Master-Slave Replication
      ๏ read-scaling and fault-tolerance
      ๏ eventual consistency
          • write to master (push_factor)
          • write to slaves
  • 3. 3 Separate Concerns (I)
      ๏ Cluster Management
          • Members join/leave/heartbeat
      ๏ Failover
          • Master Election
          • Distribution of Master-Status
  • 4. 3 Separate Concerns (II)
      ๏ Replication
          • synchronized id-generation
          • distributed locks
          • pull, push of transactions
          • initial store synchronization
  • 5. Pre 1.9 - ZooKeeper
  • 6. Pre 1.9
      ๏ Apache ZooKeeper took care of two concerns
          • Cluster Management
              ‣ new members register with ZK
          • Failover
              ‣ ZK stores Master and last TX-Id
              ‣ ZK uses ZAB to determine new Master and distribute information
  • 7. HA Cluster [diagram labels: Coordinator (x3), Master, Slave (x2), RO-Slave]
  • 8. Pre 1.9 - Problems
      ๏ additional setup and operation of a separate component
      ๏ unreliable operation / hiccups
      ๏ long-term stability
      ๏ no dynamic reconfiguration of the ZK cluster (important for cloud setups)
  • 9. Post 1.9 - Neo4j Auto Cluster
  • 10. Replace ZooKeeper!?
      ๏ Implement Multi-Paxos ourselves
      ๏ simple, testable code
      ๏ only covers
          • cluster management
          • master election
  • 11. HA Cluster
  • 12. What is Paxos?
      ๏ reliable consensus making
      ๏ broadcasting
      ๏ works even with unreliable communication
          • lost messages
          • delays, out-of-order delivery
      ๏ does not guarantee progress
  • 13. What is Paxos?
  • 14. Implementation
      ๏ everything is a state machine
          • SM = stateless enums + context
          • Message = type enum + payload
          • State = enum instance
          • Transition = handle() messages, switch on msg-type, implement logic
  • 15. Implementation (II)
      ๏ everything is a state machine
          • use timeouts for reliability
          • handle failing messages
          • decouple network and time
              ‣ for testability
          • listeners interact on messages with outside world, sync or async
  • 16. Implementation (II)
      ๏ Paxos (3 roles)
          • Proposer-SM
          • Acceptor-SM
          • Learner-SM
      ๏ Cluster
          • Heartbeat
      [diagram labels: Paxos (Proposer, Acceptor, Learner), ClusterState, Heartbeat]
  • 17. Multi-Paxos (happy path) [diagram: message flow between Proposer, Acceptor and Learner for 2 * f + 1 members; step labels: PREPARE, PROMISE or REJECT, ACCEPT, ACCEPTED or REJECTED, STORE VALUE, LEARN, ATOMIC BC, OUT OF ORDER MSG HANDLING, with a TIMEOUT on each phase]
  • 18. Multi-Paxos (happy path) [diagram, continued: same flow with additional labels LEARN REQ, LEARN TIMEOUT, LEARN, LEARN FAIL]
  • 19. Acceptor State Machine
  • 20. Heartbeat State Machine
  • 21. Implementation (III)
      ๏ HA implementation uses state machines as infrastructure
      ๏ notifications via listeners
      ๏ piggyback heartbeat on messages
      ๏ master election
          • (all - failed) have to agree
          • Paxos BC needs quorum of total
  • 22. Multi-Paxos
      ๏ everything is a state machine
          • use timeouts for reliability
          • handle failing messages
          • decouple network and time
              ‣ for testability
          • listeners interact on messages with outside world, sync or async
  • 23. Unit-Testing
      • Mock Time
          ‣ fast running tests despite timeouts
      • Mock Network
          ‣ simulate delays, failing messages
  • 24. Unit-Test-Example
  • 25. Setup
      • Config
      • Video
      • Auto-Setup Script (Demo)
  • 26. Thank You - Questions?
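
Slide 14 describes each state machine as a stateless enum plus a context object, with messages made of a type enum and a payload. The following is a minimal Python sketch of that shape, not the Neo4j code: the names `MessageType`, `Message`, `Context` and `MemberState`, and the JOIN/HEARTBEAT/LEAVE message types, are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Any, List, Set, Tuple


class MessageType(Enum):
    # Hypothetical message types, chosen only for this sketch.
    JOIN = auto()
    HEARTBEAT = auto()
    LEAVE = auto()


@dataclass(frozen=True)
class Message:
    """Message = type enum + payload."""
    type: MessageType
    payload: Any = None


@dataclass
class Context:
    """All mutable data lives here, so the state enum itself stays stateless."""
    alive: Set[str] = field(default_factory=set)


class MemberState(Enum):
    """State = enum instance; a transition is one call to handle()."""
    START = auto()
    JOINED = auto()

    def handle(self, ctx: Context, msg: Message) -> Tuple["MemberState", List[Message]]:
        # Switch on (current state, message type); return next state and outgoing messages.
        if self is MemberState.START and msg.type is MessageType.JOIN:
            ctx.alive.add(msg.payload)
            return MemberState.JOINED, [Message(MessageType.HEARTBEAT, msg.payload)]
        if self is MemberState.JOINED:
            if msg.type is MessageType.HEARTBEAT:
                ctx.alive.add(msg.payload)
            elif msg.type is MessageType.LEAVE:
                ctx.alive.discard(msg.payload)
        # Unknown or irrelevant messages leave the state unchanged.
        return self, []


def run(state: MemberState, ctx: Context, messages: List[Message]) -> MemberState:
    """Feed messages through the machine one at a time (outgoing messages are dropped here)."""
    for msg in messages:
        state, _outgoing = state.handle(ctx, msg)
    return state
```

Because `handle()` only reads the message and mutates the context it is given, the same enum can be driven by a real network layer or by a test harness without changes.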
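
Slides 12 and 17 name the Paxos message flow (PREPARE, PROMISE or REJECT, ACCEPT, quorum check) without spelling it out. As a rough illustration, here is a single-decree Paxos round in Python, run synchronously in memory. It is a textbook-style sketch, not the Multi-Paxos implementation from the talk; the class `Acceptor` and the function `propose` are hypothetical names, and timeouts, learners and networking are left out.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1          # highest ballot this acceptor has promised
    accepted_ballot: int = -1   # ballot of the value accepted so far, -1 if none
    accepted_value: Any = None

    def prepare(self, ballot: int) -> Optional[Tuple[int, Any]]:
        """PREPARE: promise to ignore lower ballots, or reject (None)."""
        if ballot > self.promised:
            self.promised = ballot
            return (self.accepted_ballot, self.accepted_value)
        return None

    def accept(self, ballot: int, value: Any) -> bool:
        """ACCEPT: store the value unless a higher ballot was promised meanwhile."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], ballot: int, value: Any) -> Optional[Any]:
    """Run one proposer round; return the chosen value, or None if no quorum was reached."""
    quorum = len(acceptors) // 2 + 1   # majority, i.e. f + 1 out of 2 * f + 1
    promises = [p for p in (a.prepare(ballot) for a in acceptors) if p is not None]
    if len(promises) < quorum:
        return None                    # caller may retry later with a higher ballot
    prev_ballot, prev_value = max(promises, key=lambda p: p[0])
    if prev_ballot >= 0:
        value = prev_value             # must re-propose an already accepted value
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None
```

A `None` result corresponds to the "does not guarantee progress" point on slide 12: safety is kept, but the proposer has to try again with a higher ballot.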
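
Slides 15 and 23 mention decoupling time from the state machines so that timeout-driven tests run fast. One way to picture this is a fake clock that fires scheduled callbacks when the test advances it. The sketch below assumes a hypothetical `FakeClock` and `HeartbeatMonitor`; neither is taken from the Neo4j code base.

```python
import heapq
from typing import Callable, Dict, List, Set, Tuple


class FakeClock:
    """Virtual time: nothing sleeps, timers fire when advance() passes their due time."""

    def __init__(self) -> None:
        self.now = 0.0
        self._timers: List[Tuple[float, int, Callable[[], None]]] = []
        self._seq = 0  # tie-breaker so callbacks are never compared by the heap

    def call_later(self, delay: float, callback: Callable[[], None]) -> None:
        self._seq += 1
        heapq.heappush(self._timers, (self.now + delay, self._seq, callback))

    def advance(self, seconds: float) -> None:
        deadline = self.now + seconds
        while self._timers and self._timers[0][0] <= deadline:
            due, _, callback = heapq.heappop(self._timers)
            self.now = due
            callback()
        self.now = deadline


class HeartbeatMonitor:
    """Marks a member as failed if no heartbeat arrives within `timeout` seconds."""

    def __init__(self, clock: FakeClock, timeout: float) -> None:
        self.clock = clock
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}
        self.failed: Set[str] = set()

    def heartbeat(self, member: str) -> None:
        self.last_seen[member] = self.clock.now
        self.failed.discard(member)
        self.clock.call_later(self.timeout, lambda: self._check(member))

    def _check(self, member: str) -> None:
        # A newer heartbeat makes this older timer a no-op.
        if self.clock.now - self.last_seen[member] >= self.timeout:
            self.failed.add(member)
```

A test can then cover many seconds of cluster time instantly, for example by sending heartbeats for two members, advancing the clock past the timeout for only one of them, and checking `failed`.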