0.5mln packets per second with Erlang

Published in: Technology, News & Politics

  1. 0.5mln packets per second with Erlang
     Revelations from a real-world project based on Erlang on Xen
     ErLounge/SF, June 6, 2014
     Maxim Kharchenko, CTO, Cloudozer LLP, mk@cloudozer.com
  2. The road map
     • Erlang on Xen intro
     • LINCX project overview
     • Speed-related notes
       - Arguments are registers
       - ETS tables are (mostly) ok
       - Do not overuse records
       - GC is key to speed
       - gen_server vs. barebone process
       - NIFs: more pain than gain
       - Fast counters
     • Q&A
  3. Erlang on Xen a.k.a. LING
     • A new Erlang platform that runs without an OS
     • Conceived in 2009
     • Highly compatible with Erlang/OTP
     • Built from scratch, not a "port"
     • Optimized for low startup latency
     • Open sourced in 2014 (github.com/cloudozer/ling)
     • A public build service is available
     Go to erlangonxen.org
  4. Zerg demo: zerg.erlangonxen.org
  5. The road map
     • Erlang on Xen intro
     • LINCX project overview
     • Speed-related notes
       - Arguments are registers
       - ETS tables are (mostly) ok
       - Do not overuse records
       - GC is key to speed
       - gen_server vs. barebone process
       - NIFs: more pain than gain
       - Fast counters
     • Q&A
  6. LINCX: project overview
     • Started in December 2013
     • Initial scope = porting LINC-Switch to LING
     • High degree of compatibility demonstrated for LING
     • Extended scope = fix the LINC-Switch fast path
     • Beta version of LINCX open sourced on March 3, 2014
     • LINCX runs 100x faster than the old code
     LINC-Switch is an OpenFlow software switch implemented in Erlang.
     For more details go to http://FlowForwarding.org
  7. Raw network interfaces in Erlang
     • LING adds raw network interfaces
     • A raw interface receives whole Ethernet frames
     • LINCX uses the standard gen_tcp:* for the control connection and net_vif:* for data ports
     • Raw interfaces support the mailbox_limit option: packets get dropped if the mailbox of the receiving process overflows
       Port = net_vif:open("eth1", []),
       port_command(Port, <<1,2,3>>),
       receive
           {Port,{data,Frame}} -> ...
       end
       %% the same interface with a mailbox limit
       Port = net_vif:open("eth1", [{mailbox_limit,16384}]),
       ...
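The message shape on the slide, {Port,{data,Frame}}, can be consumed with an ordinary receive loop. The following is a minimal sketch, not LINCX code: the module name frame_loop and the idle timeout are assumptions made for illustration, and on stock Erlang/OTP (where net_vif does not exist) any term can stand in for the port when trying it out.

```erlang
-module(frame_loop).
-export([count_frames/2]).

%% Port is whatever net_vif:open/2 returned on LING. IdleMs is a
%% hypothetical idle timeout after which the loop gives up and
%% reports what it has seen so far.
count_frames(Port, IdleMs) ->
    count_frames(Port, IdleMs, 0, 0).

count_frames(Port, IdleMs, Frames, Bytes) ->
    receive
        %% Each message carries one whole Ethernet frame as a binary.
        {Port, {data, Frame}} when is_binary(Frame) ->
            count_frames(Port, IdleMs, Frames + 1, Bytes + byte_size(Frame))
    after IdleMs ->
        {Frames, Bytes}
    end.
```

The loop is tail-recursive and keeps its state in function arguments, so each frame costs one selective receive and no extra data structures.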
  8. Testbed configuration
     • Test traffic goes between vm1 and vm2
     • LINCX runs as a separate Xen domain
     • Virtual interfaces are bridged in Dom0
  9. Processing delay and low-level NIC stats
     • LING can measure a processing delay for a packet:
       ling:experimental(processing_delay, []).
       Processing delay statistics:
       Packets: 2000 Delay: 1.342us +- 0.143 (95%)
     • LING can collect low-level stats for a network interface:
       ling:experimental(llstat, 1).  %% stop/display
       Duration: 4868.6ms
       RX: interrupts: 69170 (0 kicks 0.0%) (freq 14207.4/s period 70.4us)
       RX: reqs per int: 0/0.0/0
       RX: tx buf freed per int: 0/8.5/234
       TX: outputs: 1479707 (112263 kicks 7.6) (freq 303928.8/s period 3.3us)
       TX: tx buf freed per int: 0/0.6/113
       TX: rates: 303.9kpps 3622.66Mbps avg pkt size 1489.9B
       TX: drops: 12392 (freq 2545.3/s period 392.9us)
       TX: drop rates: 2.5kpps 30.26Mbps avg pkt size 1486.0B
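The TX rate lines in the llstat output follow from the packet count, the duration, and the average packet size. The sketch below redoes that arithmetic; the module llstat_check and the function rates/3 are hypothetical helpers, not part of LING. Small differences from the printed figures are expected because the printed duration is rounded.

```erlang
-module(llstat_check).
-export([rates/3]).

%% Outputs:    number of packets sent during the measurement
%% DurationMs: length of the measurement window in milliseconds
%% AvgSize:    average packet size in bytes
rates(Outputs, DurationMs, AvgSize) ->
    Pps = Outputs / (DurationMs / 1000),     %% packets per second
    Mbps = Pps * AvgSize * 8 / 1.0e6,        %% megabits per second
    {Pps, Mbps}.
```

Calling llstat_check:rates(1479707, 4868.6, 1489.9) gives values close to the 303.9 kpps and roughly 3.6 Gbps shown on the slide.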
  10. IXIA confirms 460kpps peak rate
      • 1GbE hardware NICs, 128-byte packets
      • IXIA packet generator/analyzer
  11. The road map
      • Erlang on Xen intro
      • LINCX project overview
      • Speed-related notes
        - Arguments are registers
        - ETS tables are (mostly) ok
        - Do not overuse records
        - GC is key to speed
        - gen_server vs. barebone process
        - NIFs: more pain than gain
        - Fast counters
      • Q&A
  12. Arguments are registers
      • Many arguments do not make a function any slower
      • Do not reshuffle arguments:
        animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
            feed(Cat, Dog, Horse, Pig, Cow, State);
        animal(Cat, deli = Dog, Horse, Pig, Cow, State) ->
            pet(Cat, Dog, Horse, Pig, Cow, State);
        ...
        %% SLOW: the extra leading argument moves every other argument to a different register
        animal(Cat, Dog, Horse, Pig, Cow, State) ->
            feed(Goat, Cat, Dog, Horse, Pig, Cow, State);
        ...
  13. ETS tables are (mostly) ok
      • A small ETS table lookup costs about as much as 10 function activations
      • Do not use ets:tab2list/1 inside tight loops
      • Treat ETS as a database, not as a pool of global variables
      • 1-2 ETS lookups on the fast path are ok
      • Beware that ets:lookup() and similar calls create a copy of the data on the heap of the caller, similarly to message passing
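As an illustration of "1-2 lookups on the fast path", here is a minimal sketch of a keyed lookup per packet. The module name ets_fastpath, the table name flow_table, and the port-to-port mapping are assumptions made for the example; they are not taken from LINCX.

```erlang
-module(ets_fastpath).
-export([demo/0]).

%% Hypothetical flow table: maps an input port number to an output port.
demo() ->
    Tab = ets:new(flow_table, [set, {read_concurrency, true}]),
    true = ets:insert(Tab, [{1, 2}, {2, 1}]),
    %% One keyed lookup per packet: cheap, but the matching tuple is
    %% copied onto the heap of the calling process.
    OutPort = forward(Tab, 1),
    %% ets:tab2list/1 copies the whole table, so it belongs in
    %% diagnostics code, not in the per-packet path.
    All = lists:sort(ets:tab2list(Tab)),
    true = ets:delete(Tab),
    {OutPort, All}.

forward(Tab, InPort) ->
    case ets:lookup(Tab, InPort) of
        [{InPort, OutPort}] -> OutPort;
        [] -> drop
    end.
```

Keeping the stored tuples small matters here, because every successful lookup copies the whole object to the caller.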
  14. Do not overuse records
      • setelement() creates a copy of the tuple
      • State#state{foo=Foo1,bar=Bar1,baz=Baz1} creates 3(?) copies of the tuple
      • Use tuples explicitly in the performance-critical sections to see the heap footprint of the code
        %% from 9p.erl
        mixer({rauth,_,_}, {tauth,_,AFid,_,_}, _) -> {write_auth,AFid};
        mixer({rauth,_,_}, {tauth,_,AFid,_,_,_}, _) -> {write_auth,AFid};
        mixer({rwrite,_,_}, _, initial) -> start_attaching;
        mixer({rerror,_,_}, _, initial) -> auth_failed;
        mixer({rlerror,_,_}, _, initial) -> auth_failed;
        mixer({rattach,_,Qid}, {tattach,_,Fid,_,_,AName,_}, initial) ->
            {attach_more,Fid,AName,qid_type(Qid)};
        mixer({rclunk,_}, {tclunk,_,Fid}, initial) -> {forget,Fid};
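To make the point about heap footprint concrete, the sketch below puts a record update next to the equivalent explicit tuple. The module tuple_state and the #state{} record with fields foo, bar, and baz are illustrative only. How many intermediate copies the record form actually produces depends on the compiler and the VM, so the code makes no claim about that; it only shows that both forms build the same term and that the original tuple is never changed in place.

```erlang
-module(tuple_state).
-export([demo/0]).

-record(state, {foo, bar, baz}).

%% Record syntax: compact, but the tuple construction is hidden.
update_record(State, Foo, Bar, Baz) ->
    State#state{foo = Foo, bar = Bar, baz = Baz}.

%% Explicit tuple: the one new 4-element tuple is visible in the code.
update_tuple({state, _, _, _}, Foo, Bar, Baz) ->
    {state, Foo, Bar, Baz}.

demo() ->
    S0 = #state{foo = 0, bar = 0, baz = 0},
    S1 = update_record(S0, 1, 2, 3),
    S2 = update_tuple(S0, 1, 2, 3),
    %% setelement/3 returns a new tuple; S0 is unchanged afterwards.
    S3 = setelement(2, S0, 1),
    {S1 =:= S2, S0, S3}.
```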
  15. Garbage collection is key to speed
      • The heap is a list of chunks
      • The 'new heap' is close to its head, the 'old heap' to its tail
      • A GC run takes 10μs on average
      • GC may run thousands of times per second
      [Diagram labels: proc_t, HTOP]
  16. How to tackle GC-related issues
      • (Priority 1) Call erlang:garbage_collect() at strategic points
      • (Priority 2) For the fastest code avoid GC completely: restart the fast process regularly
        spawn(F, [{suppress_gc,true}]),  %% LING-only
      • (Priority 3) Use the fullsweep_after option
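Priorities 1 and 3 can be tried on stock Erlang/OTP; suppress_gc is LING-only and is left out. The following is a minimal sketch under stated assumptions: the module gc_points, the {packet, Frame} message shape, and the choice to collect every 256 packets are all made up for illustration, and the right collection point in a real switch would have to be found by measurement.

```erlang
-module(gc_points).
-export([demo/0]).

demo() ->
    Parent = self(),
    %% fullsweep_after is a standard spawn_opt/2 option; 0 asks for a
    %% full sweep on every collection of this process.
    Pid = spawn_opt(fun() -> loop(Parent, 0) end, [{fullsweep_after, 0}]),
    [Pid ! {packet, <<N:32>>} || N <- lists:seq(1, 1000)],
    Pid ! stop,
    receive
        {done, Count} -> Count
    after 5000 ->
        timeout
    end.

loop(Parent, Count) ->
    receive
        {packet, _Frame} ->
            Count1 = Count + 1,
            %% Collect at a chosen point (here: every 256 packets)
            %% instead of waiting for the heap to fill up mid-burst.
            case Count1 band 255 of
                0 -> erlang:garbage_collect();
                _ -> ok
            end,
            loop(Parent, Count1);
        stop ->
            Parent ! {done, Count}
    end.
```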
  17. gen_server vs barebone process
      • Message passing using gen_server:call() is 2x slower than Pid ! Msg
      • For speedy code prefer barebone processes to gen_servers
      • Design Principles are about high availability, not high performance
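A barebone process with a request/reply protocol built on Pid ! Msg might look like the sketch below. The module bare_server and its {add, N} and get requests are hypothetical. Note what is given up compared with gen_server:call(): this version does not monitor the server, so a crashed server is only noticed through the timeout.

```erlang
-module(bare_server).
-export([start/0, call/2, stop/1]).

start() ->
    spawn(fun() -> loop(0) end).

%% Synchronous request using a plain send and a selective receive.
%% The unique reference ties the reply to this particular request.
call(Pid, Request) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, Request},
    receive
        {Ref, Reply} -> Reply
    after 5000 ->
        exit(timeout)
    end.

stop(Pid) ->
    Pid ! stop,
    ok.

%% The server state (a running total) lives in the loop argument.
loop(Total) ->
    receive
        {From, Ref, {add, N}} ->
            Total1 = Total + N,
            From ! {Ref, Total1},
            loop(Total1);
        {From, Ref, get} ->
            From ! {Ref, Total},
            loop(Total);
        stop ->
            ok
    end.
```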
  18. NIFs: more pain than gain
      • A new principle of Erlang development: do not use NIFs
      • For a small performance boost, NIFs undermine key properties of Erlang: reliability and soft-realtime guarantees
      • Most of the time Erlang code can be made as fast as C
      • Most performance problems of Erlang are traceable to NIFs, or to external C libraries, which are similar
      • Erlang on Xen does not have NIFs and we do not plan to add them
  19. Fast counters
      • 32-bit or 64-bit unsigned integer counters with overflow: trivial in C, not easy in Erlang
      • FIXNUMs are signed 29-bit integers; BIGNUMs consume heap and are 10-100x slower
      • Use two variables for a counter?
        foo(C1, 16#ffffff, ...) ->
            foo(C1+1, 0, ...);
        foo(C1, C2, ...) ->
            foo(C1, C2+1, ...);
        ...
      • Erlang on Xen has a new experimental feature, fast counters:
        erlang:new_counter(Bits) -> Ref
        erlang:increment_counter(Ref, Incr)
        erlang:read_counter(Ref)
        erlang:release_counter(Ref)
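The two-variable idea from the slide can be written out in portable Erlang as follows. This is a sketch, not the LING fast-counter API: the module wide_counter is hypothetical, and the choice to wrap the high part at 24 bits as well (giving a 48-bit counter with overflow) is an assumption added for the example.

```erlang
-module(wide_counter).
-export([new/0, incr/1, value/1]).

%% Each half holds 24 bits, so both always stay small integers.
-define(PART_MAX, 16#ffffff).

new() -> {0, 0}.

%% Both halves full: the 48-bit counter overflows back to zero.
incr({?PART_MAX, ?PART_MAX}) -> {0, 0};
%% Low half full: carry into the high half.
incr({High, ?PART_MAX}) -> {High + 1, 0};
incr({High, Low}) -> {High, Low + 1}.

%% Combining the halves may produce a bignum on some platforms,
%% so call this only when the counter is reported, not per packet.
value({High, Low}) -> (High bsl 24) bor Low.
```

On the fast path the two halves would be passed as separate function arguments, as on the slide, rather than packed into a tuple that has to be rebuilt on every increment.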
  20. Questions?
