Tech talk network - friend or foe

Published in: Technology

  1. Network: friend or foe
  2. Session outline
     • Good old TCP: design goals, tuning, caveats
     • Network congestion
     • LAN vs. WAN
     • Alternatives to TCP
       - TCMP: cluster transport in Coherence
       - SCTP
     • Death detection
     • Multicast
  3. TCP/IP origins
     Main design goal: a communication infrastructure resilient to the effects of thermonuclear war.
     • Loose network of interconnected nodes
     • Ad hoc routing decisions
     • Very weak service guarantees
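  4. TCP's dirty little secrets
     High-latency networks
     • SO_RCVBUF caps connection bandwidth: throughput cannot exceed the receive window divided by the round-trip time
     Nagle's algorithm: delays of up to about 200 ms when small writes interact with delayed ACKs
     • Use TCP_NODELAY
     • Does not affect localhost connections
     Firewalls
     • Silent connection drops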
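
The two socket options named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a tuning recommendation: the helper names (`max_throughput_bytes_per_sec`, `tune_tcp_socket`), the 64 KiB window, the 100 ms round trip, and the 4 MiB buffer request are all assumptions chosen for the example.

```python
import socket


def max_throughput_bytes_per_sec(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput: at most one receive window per round trip."""
    return window_bytes / rtt_seconds


def tune_tcp_socket(sock: socket.socket, rcvbuf_bytes: int) -> None:
    """Request a larger receive buffer and disable Nagle's algorithm."""
    # Set SO_RCVBUF before connect()/listen(): window scaling is negotiated
    # during the TCP handshake, so a later change may not take full effect.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    # TCP_NODELAY=1 sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


# Illustrative numbers (assumed): a 64 KiB window on a 100 ms WAN link
# limits a single connection to roughly 640 KiB/s, whatever the link speed.
wan_limit = max_throughput_bytes_per_sec(64 * 1024, 0.100)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tune_tcp_socket(sock, 4 * 1024 * 1024)
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
# The kernel may grant less (or report more) than was requested.
granted_rcvbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.close()
```

Reading the buffer size back matters because the operating system is free to cap the request. On Linux, setting SO_RCVBUF explicitly also turns off receive-buffer autotuning for that socket, so it is worth measuring before and after.
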
  5. Congestion control
  6. TCP summary
     • Thermonuclear resistance approach
       - Cowardly in bandwidth utilization
       - Vulnerable to random packet drops
     • No message boundaries (byte stream only)
     • Head-of-line blocking
     • No multihoming
  7. Reality of HPC clusters
     • In-order frame delivery
     • InfiniBand / 10GbE: link-level flow control
     • Low latency
     • No slow start needed
     • Congestion control could be drastically simplified
  8. UDP-based transport
     • No flow control
       - A much larger receive buffer is required to avoid losses on the receiver side (e.g. sysctl -w net.core.rmem_max)
       - Congestion prevention has to be implemented by the application
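
As a rough sketch of the receive-buffer point above, a UDP receiver can ask for a large buffer and then check what the kernel reports. The helper name `request_udp_receive_buffer` and the 8 MiB figure are assumptions for illustration, and the behavior described in the comments is Linux-specific.

```python
import socket


def request_udp_receive_buffer(sock: socket.socket, wanted_bytes: int) -> int:
    """Request a receive buffer and return the size the kernel reports back."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, wanted_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
wanted = 8 * 1024 * 1024  # assumed target size, not a recommendation
granted = request_udp_receive_buffer(udp, wanted)
udp.close()

# On Linux the request is silently capped at net.core.rmem_max, and the
# reported value is double the effective request (bookkeeping overhead),
# so a result below `wanted` certainly means the cap was hit.
if granted < wanted:
    print(f"kernel reports only {granted} bytes; consider raising net.core.rmem_max")
```

Because the cap is applied silently, checking the granted size at startup is a cheap way to catch a misconfigured host before it starts dropping datagrams under load.
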
  9. TCMP (Oracle Coherence)
     • UDP-based protocol
     • Exploits ordered delivery for fast NACK
     • Fast NACK -> very fast congestion detection
     • Extra logic to account for JVM-specific behavior
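
The idea behind a fast NACK can be sketched as follows: if the network delivers frames in order, a gap in sequence numbers can only mean loss, so the receiver can complain immediately instead of waiting for a timer. This is a generic illustration under that assumption, not Coherence's actual TCMP implementation; the class `GapDetectingReceiver` and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GapDetectingReceiver:
    """Hypothetical receiver that NACKs as soon as a sequence gap appears."""
    next_expected: int = 0
    nacks_sent: List[int] = field(default_factory=list)

    def on_packet(self, seq: int) -> List[int]:
        """Return the sequence numbers to NACK because of this packet."""
        if seq < self.next_expected:
            # Retransmission or duplicate: it reveals no new gap.
            return []
        # With in-order delivery, everything skipped was lost in transit.
        missing = list(range(self.next_expected, seq))
        self.nacks_sent.extend(missing)
        self.next_expected = seq + 1
        return missing


receiver = GapDetectingReceiver()
# Packets 2 and 3 are lost; the arrival of packet 4 exposes the gap at once.
nacks_per_packet = [receiver.on_packet(seq) for seq in (0, 1, 4, 5)]
```

A sender that sees NACKs arrive this quickly can treat them as an early congestion signal and slow down within roughly one round trip, which is the property the slide highlights.
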
  10. Peer death detection: an essential part of any distributed algorithm
  11. Peer death detection
      Response timeouts are a bad detector
      • Temporary network outages: change of route, congestion, etc.
      • Temporary application outages: GC, swapping, server CPU starvation
      • Positive feedback effect
      • "Corrupted witness" syndrome
  12. Peer death detection
      Ingredients of good death detection:
      • Process death detection using TCP
      • Monitor remote OS liveness, not just the peer process
      • Multiple-witness suspect protocol
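
One way to picture the multiple-witness ingredient is a tracker that declares a peer dead only after enough distinct nodes have independently suspected it. The sketch below is a hypothetical illustration of that idea: the class `SuspectTracker`, its methods, and the quorum of 2 are assumptions, not the protocol used by any particular product.

```python
from typing import Dict, Set


class SuspectTracker:
    """Hypothetical tracker: a peer is dead only if a quorum of witnesses agree."""

    def __init__(self, quorum: int) -> None:
        self.quorum = quorum
        # Maps each suspected peer to the set of witnesses suspecting it.
        self._suspicions: Dict[str, Set[str]] = {}

    def suspect(self, witness: str, peer: str) -> bool:
        """Record a suspicion; return True if the peer is now considered dead."""
        self._suspicions.setdefault(peer, set()).add(witness)
        return self.is_dead(peer)

    def clear(self, witness: str, peer: str) -> None:
        """The witness heard from the peer again, so it withdraws its suspicion."""
        self._suspicions.get(peer, set()).discard(witness)

    def is_dead(self, peer: str) -> bool:
        return len(self._suspicions.get(peer, set())) >= self.quorum


tracker = SuspectTracker(quorum=2)
first = tracker.suspect("node-a", "node-x")   # one witness is not enough
repeat = tracker.suspect("node-a", "node-x")  # the same witness counts once
second = tracker.suspect("node-b", "node-x")  # a second witness reaches quorum
```

Counting distinct witnesses rather than reports is the point of the design: a single node with a bad link or a long GC pause can no longer evict a healthy peer by itself.
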
  13. SCTP
      SCTP: an L4 protocol, a TCP replacement
      • Runs over IP; originally designed to carry SS7 signaling
      • Message oriented
      • Multi-stream delivery
      • Multihoming
      • Designed with fast networks and jumbo frames in mind
  14. Multicast
      Group multicast
      • Suitable for group communication
      • Not supported in network hardware, AFAIK
      Hub-and-spoke multicast
      • Replicating large amounts of data
      • Video broadcasting
      • PIM/IGMP (IGMP snooping hardware support)
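
For the IGMP side of this slide, a receiver joins a group through the standard `IP_ADD_MEMBERSHIP` socket option; the join is what prompts the host to send an IGMP membership report that snooping switches can act on. The following is a minimal sketch: the function names are hypothetical, and any group address and port passed in are the caller's choice.

```python
import socket
import struct


def build_membership_request(group_ip: str, interface_ip: str = "0.0.0.0") -> bytes:
    """Pack a struct ip_mreq: the group address, then the local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group_ip), socket.inet_aton(interface_ip))


def join_group(sock: socket.socket, group_ip: str) -> None:
    """Join a multicast group on the default interface (0.0.0.0 lets the OS pick)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group_ip))


def open_multicast_receiver(group_ip: str, port: int) -> socket.socket:
    """Create a UDP socket bound to `port` and subscribed to `group_ip`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several receivers on the same host to share the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    join_group(sock, group_ip)
    return sock
```

A caller would use something like `open_multicast_receiver("239.1.2.3", 5000)` and then `recvfrom()` on the result; both values here are placeholders. The join fails on hosts without a multicast-capable interface, so production code would want to handle `OSError`.
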
  15. IGMP multicast
  16. THANK YOU
      alexey.ragozin@gmail.com
