NetScaler TCP Performance Tuning

Kevin Mason
Synergy SYN304: NetScaler TCP Performance Tuning in the AOL Network



Presenters:
Kevin Mason - kem@aol.net
Tim Wicinski - tjw@aol.net
May 11, 2012
Wednesday, May 23, 12
Agenda
            Goals
            Information on the AOL Production Network
            TCP Tuning Concepts
            TCP Tuning on the NetScaler
            AOL tcpProfiles
            Case Study: bps Improvements
            Case Study: Retransmission Reduction
            Case Study: Engadget Coverage of New iPad Announcement
            Troubleshooting and Monitoring NetScalers
            Appendixes



Note on Today’s Statistics
     All the AOL statistics presented today are operational numbers.
     They should not be construed to represent page views or other
     types of site popularity data.




Performance Tuning Goal

     Create the best possible experience for AOL Customers by
     continuously analyzing and engineering the fastest, most
     efficient network possible.




Definitions
     Just to make sure we are on the same page

            VIP - Load Balancing (LB) and Content Switching (CS) VServers
            Service - Backend connectivity to Real Hosts
            Outbound - Traffic outbound from the VIP or Service
            Inbound - Traffic inbound towards the VIP or Service




AOL Production Network Stats
     Load Balancing
            70 HA pairs of NetScalers
                10G, multi interface attached
            19,642 VServers
            27,214 Services (Host + Port)
            13,462 Servers, majority cloud based

     GSLB/DNS
            Approx 18K DNS domains
            Approx 4K GSLB configs
            180M queries/day




AOL Production Network
     Internal Tooling
     Web based self service site managing all changes, using a
     mixture of SOAP and NITRO calls:
            Avg 250 Self Service changes / week.
            Avg 25K config command changes / week.




TCP Tuning Concepts
     Have a solid, technical understanding of the client's
     connection method.
     Get the bits to the customer quickly and cleanly.
     Where possible, send a large initial burst of packets.
            Ramp traffic up fast.
            Recover from errors quickly, avoid slow start.

     Melt fiber, smoke servers, choke routers

TCP Tuning Concepts
     TCP Tuning is done per connection, not for the aggregate of the
     link.

     Tuning Factors include:
            Latency or Round Trip Time (RTT)
            Minimum Bandwidth
            Packet Loss
            Supported TCP Options
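
     One way to see how RTT and per-connection bandwidth interact is the
     bandwidth-delay product (BDP): the number of bytes that must be in
     flight to keep one connection's path full. The sketch below is only
     an illustration; the function name and the 100 Mb/s / 40 ms figures
     are hypothetical and do not come from the AOL network.

```python
def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in bytes for a single connection."""
    # bits/second * milliseconds / 1000 = bits in flight; / 8 = bytes
    return bandwidth_bps * rtt_ms / 8000


# Hypothetical flow: 100 Mb/s at 40 ms round trip time.
needed = bdp_bytes(100_000_000, 40)
print(f"bytes in flight to fill the path: {needed:,.0f}")

# Without Window Scaling a TCP window cannot exceed 65,535 bytes.
print("fits in an unscaled window:", needed <= 65_535)
```

     When the BDP is larger than the window the receiver can advertise,
     the sender stalls waiting for ACKs no matter how fast the link is.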




TCP Connection Options
            TCP negotiates options to define how the connection operates.
            TCP Options are generally symmetrical: both sides have to
            support an option, or it is dropped.
            TCP Option values are NOT always the same on both sides. For
            example, Receive Windows, Window Scaling and MSS (Max
            Segment Size) are often different.




!!! Caveat Emptor !!!
     While we are happy to share AOL experience and configurations,
     be careful applying these settings in your network, they might
     cause problems.

     When working with any parameters, have a rollback plan!

     O’Toole’s Comment on Murphy's Law:
            Murphy is an Optimist.


TCP Tuning on the NetScaler

     Two major tuning knobs on the NetScaler:
            tcpParam - Global TCP settings
            tcpProfile - Per VIP or Service settings

     We set the tcpParam for general purpose values, but since it is
     an OS supplied setting, it could be overwritten by an upgrade.
     Using custom tcpProfiles is safer since they will not be
     overwritten.




tcpProfile Values
     -WS                 -maxPktPerMss    -nagle
     -WSVal              -pktPerRetx      -ackOnPush
     -maxBurst           -minRTO          -mss (v9.3)
     -initialCwnd        -slowStartIncr   -bufferSize (v9.3)
     -delayedAck         -SACK
     -oooQSize

         If you do nothing else:
               Enable -WS (Window Scale)
               Enable -SACK (Selective Ack)

TCP Window Scale (RFC 1323)
     Window Scale is a TCP option to increase the receiver’s
     window size, compensating for long and/or fat networks (LFN).
     The values used by each device in a connection are often
     asymmetrical.
            Enabled on the NetScaler with the -WS option.
            The value advertised by the NetScaler is set with -WSVal.
            At the least, enable -WS to take advantage of the client and
            server advertised windows.




TCP SACK (RFC 2018)
     Selective Acknowledgement or SACK is a TCP option enabling a
     receiver to tell the sender the range of non-contiguous packets
     received.
            Without SACK, the receiver can only tell the sender about
            packets sequentially received.
                This slows down recovery.
                Can force Congestion control to kick in.
            Enabled with the -SACK option.
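
     The difference can be sketched with a toy receiver. Given the byte
     ranges that have arrived, the code below works out the cumulative
     ACK (all a non-SACK receiver can report) and the extra ranges a
     SACK receiver could report. It is a simplified model with a
     hypothetical function name: real TCP limits the number of SACK
     blocks per segment and orders them by recency, which is ignored
     here.

```python
def ack_state(received, start=0):
    """Return (cumulative_ack, sack_blocks) for received byte ranges.

    received is an iterable of (first_byte, end_byte_exclusive) pairs.
    """
    # Merge overlapping or adjacent ranges.
    merged = []
    for lo, hi in sorted(received):
        if merged and lo <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])

    cumulative = start
    sack_blocks = []
    for lo, hi in merged:
        if lo <= cumulative:
            cumulative = max(cumulative, hi)   # contiguous with acked data
        else:
            sack_blocks.append((lo, hi))       # data beyond a hole
    return cumulative, sack_blocks


# Five 1000-byte segments were sent; the third one was lost.
arrived = [(0, 1000), (1000, 2000), (3000, 4000), (4000, 5000)]
cumulative, blocks = ack_state(arrived)
print("cumulative ACK:", cumulative)
print("SACK blocks:", blocks)
```

     Without SACK the sender only learns that byte 2000 is expected
     next; with SACK it also learns that bytes 3000 to 4999 arrived and
     only the hole needs to be resent.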




Citrix default tcpProfile
     Settings appropriate for circa 1999.
     Have stayed this way due to Citrix general policy of not
     changing default settings. Some significant changes were made
     to default on v9.2, see CTX130962 (10/7/2011).
            No Window Scaling or SACK.
                Causes choppy data flow.
                Slow to recover from packet loss.
                Can trigger slow start congestion control.
            Very slow Congestion Window ramp up.
            Small -initialCwnd & -maxBurst
                Search for “Initial Congestion Window” for more details
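
     To get a feel for why a small initial congestion window hurts, the
     sketch below counts round trips needed to deliver a response under
     an idealized slow start: the window doubles every round trip, there
     is no loss, and the receiver window is never the limit. The
     function name, the 1460-byte MSS and the 100,000-byte response are
     assumptions for illustration, and the NetScaler's -slowStartIncr
     behaviour is not modelled.

```python
import math


def round_trips_to_send(total_bytes, initial_cwnd, mss=1460):
    """Round trips to deliver total_bytes under idealized slow start."""
    segments_left = math.ceil(total_bytes / mss)
    cwnd = initial_cwnd
    rounds = 0
    while segments_left > 0:
        segments_left -= cwnd   # send one full window this round trip
        cwnd *= 2               # idealized doubling per round trip
        rounds += 1
    return rounds


for initial_cwnd in (1, 4, 10):
    rounds = round_trips_to_send(100_000, initial_cwnd)
    print(f"initial cwnd {initial_cwnd:>2}: {rounds} round trips")
```

     In this model a 100,000-byte response takes 7 round trips starting
     from one segment, 5 from four segments and 3 from ten segments.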




Additional Citrix supplied tcpProfiles
     Citrix supplies several predefined profiles.
     Depending on code level, Citrix supplies at least 7 additional
     tcpProfiles for you to try; they may work well for you. Since
     they are supplied with the O/S, they are subject to changes in
     the future.
     We studied these tcpProfiles to develop a starting point for
     our testing.




AOL Custom tcpProfiles




AOL Custom tcpProfiles
     VIP Profiles
            aol_vip_std_tcpprofile
            aol_vip_mobile-client_tcpprofile
            aol_vip_server-2-vserver_tcpprofile
            aol_vip_dialup-client_tcpprofile

     Service Profiles
            aol_svc_std_tcpprofile

     Complete configs included in Appendix


Standard Client to VIP - aol_vip_std_tcpprofile
     Push data as fast as possible after the connection is
     established by increasing the packet burst and ramp rate, and
     improve error recovery.
     Assumptions
            Content is generally outbound to the client.
            The max bps per flow is 10 Mb/s.
            RTT is 75 ms.
     Changes from default:
            Enable -WS (Window Scaling)
            Enable -SACK
            Increase -maxBurst
            Increase -initialCwnd
            Increase -slowStartIncr
            Increased -pktPerRetx
            Reduced -delayedAck

Mobile client to VIP - aol_vip_mobile-client_tcpprofile
     Based on aol_vip_std_tcpprofile & aol_vip_dialup-client_tcpprofile
     with changes to improve the mobile (3G, 4G) client experience by
     increasing tolerance for RTT shifts as the device moves. This reduces
     retransmissions that can flood a device.
     Assumptions:
            RTT can shift from 300ms to 900ms
            bps is limited to < 3 Mb/s
     Changes from aol_vip_std_tcpprofile:
            Increase the -minRTO
            Reduced -delayedAck
            Reduced -maxBurst
            Reduced -slowStartIncr
     Results:
            Avg 30% reduction in retransmissions to mobile clients

Server to VIP - aol_vip_server-2-vserver_tcpprofile
     Based on aol_vip_std_tcpprofile for VIPs that are handling
     internal server flows where the data is inbound to the vserver.
     Assumptions
            The source host is in a data center or other attached space.
            The maximum bps per flow is 600 Mb/s inbound to a VIP or
            vserver.
            The max propagation delay (RTT) is 20 ms (0.02 sec).

     Changes from aol_vip_std_tcpprofile:
            Increase -WSVal
            Reduced -minRTO


Dial Client to VIP - aol_vip_dialup-client_tcpprofile
     Based on aol_vip_std_tcpprofile with changes to improve the
     dial client experience by reducing retransmissions and preventing
     terminal server flooding.
     Assumptions
            Max bps per flow is 50 kb/s (6.25 KB/s).
            Avg RTT is 500 ms (0.5 sec).
     Changes from aol_vip_std_tcpprofile:
            Increase -minRTO
            Reduced -delayedAck
            Reduced -maxBurst
            Reduced -slowStartIncr
     Results:
            Total page render time reduced from ~35s to ~16s.
            SSL handshake times reduced from 4+s to ~1s.
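
     A back-of-the-envelope calculation suggests why a smaller burst
     helps on dial-up. The sketch below estimates how long a burst of
     full-size packets takes to drain through a 50 kb/s link. The
     function name and the 1460-byte MSS are assumptions, packet headers
     are ignored, and the burst sizes 10 and 4 are the -maxBurst values
     of the standard and dial-up profiles in the appendix.

```python
def burst_drain_seconds(packets, link_bps, mss=1460):
    """Seconds needed to push a burst of full-size packets through a link."""
    return packets * mss * 8 / link_bps


DIAL_LINK_BPS = 50_000  # 50 kb/s, the stated dial-up assumption

for burst in (10, 4):
    seconds = burst_drain_seconds(burst, DIAL_LINK_BPS)
    print(f"burst of {burst:>2} packets: {seconds:.2f} s to drain")
```

     Under these assumptions a 10-packet burst occupies the modem link
     for more than two seconds, several times the 500 ms average RTT,
     while a 4-packet burst drains in under one second.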
Server to Service - aol_svc_std_tcpprofile
     Used on Services for traffic between the real host & NetScaler.
     Even though the propagation delay is low, BDP (bandwidth-delay
     product) is still a factor due to per connection bps.
     Assumptions
            Max bps per flow is 650 Mb/s (81.25 MB/s).
            Avg RTT is 10 ms (0.010 sec).
            Majority of flows are HTTP 1.1 or long lived TCP connections.
     Changes from default:
            Enabled -SACK
            Enabled -WS
            Increased -WSVal
            Reduced -delayedAck
            Reduced -minRTO
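
     The per-profile assumptions can be turned into bandwidth-delay
     products to compare the profiles side by side. This is a sketch
     only: the dictionary and function names are hypothetical, the
     figures are the per-flow assumptions stated on the profile slides,
     and for the mobile profile the upper bounds (3 Mb/s, 900 ms) are
     used.

```python
# (max bits per second per flow, RTT in milliseconds) as stated per profile
PROFILE_ASSUMPTIONS = {
    "aol_vip_std_tcpprofile": (10_000_000, 75),
    "aol_vip_mobile-client_tcpprofile": (3_000_000, 900),   # upper bounds
    "aol_vip_server-2-vserver_tcpprofile": (600_000_000, 20),
    "aol_vip_dialup-client_tcpprofile": (50_000, 500),
    "aol_svc_std_tcpprofile": (650_000_000, 10),
}


def bdp_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product in whole bytes."""
    return bandwidth_bps * rtt_ms // 8000


for name, (bps, rtt_ms) in PROFILE_ASSUMPTIONS.items():
    bdp = bdp_bytes(bps, rtt_ms)
    needs_scaling = bdp > 65_535   # beyond an unscaled 16-bit window
    print(f"{name:<38} BDP {bdp:>9,} bytes  window scaling needed: {needs_scaling}")
```

     With a 10 ms RTT the service profile still has a BDP of 812,500
     bytes, far beyond an unscaled 65,535-byte window, which is the
     sense in which BDP remains a factor on short paths.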


Case Study: Bits per Second (bps) Improvements




Case Study: bps Improvements

    Problem:
    Need to improve the customer experience and increase
    efficiency.
            Analysis had shown more than 99% of incoming SYNs support
            Window Scaling and/or SACK, yet the NetScaler was not
            utilizing these options.
            Analysis also showed a large number of sessions advertising
            Zero Windows.
            Sessions were being dropped into slow start due to packet loss.




Case Study: bps Improvements
     What did we change?

            -WS Enabled
            -SACK Enabled
            -WSVal 0
            -maxBurst 10
            -initialCwnd 10
            -delayedAck 100
            -pktPerRetx 2
            -slowStartIncr 4

     This became the basis for the aol_vip_std_tcpprofile
  


Case Study: bps Improvements
     Results:
            Immediate 30% jump in Outbound bps towards client.
            Retransmission rate did not increase.
            The increase in bps was sustained over the
            following weeks.
            Allowed higher server utilization and higher
            NetScaler efficiency (more bang for $$).
  




Case Study: Retransmission Reduction




Case Study: Retransmission Reduction
     Problem:
            While developing the Mobile and Dial tcpProfiles, we realized
            there were a large number of fast retransmissions on the
            VServers.
            At times, retransmissions approached 15% of outbound traffic.
            This retransmission rate was sustained around the clock, in 3
            data centers, eliminating network congestion as a cause.




Case Study: Retransmission Reduction
     Testing:
     To reduce variables, we first tested on one pair of shared
     service NetScalers.

     We tried to reduce the -delayedAck to 100ms
            No change was seen in the traffic over 24 hours

     Next we tried increasing the -minRTO
            Instant drops in the number of Fast retransmissions,
            1st - 7th retransmissions and
            RetransmissionGiveup.



Case Study: Retransmission Reduction

      Change Details:
            Increasing the -minRTO from 100ms to 400ms
                retransmitted packets dropped from 14.2k/sec to 2.5k/sec.
                bps dropped from 102 Mb/sec to 30 kb/sec on one link.

            Increasing -minRTO from 100ms to 300ms had the greatest effect
            on reducing retransmissions.
            When applied across AOL, total outbound bps dropped by 10%.
            However, increasing -minRTO to 500ms caused outbound bps to
            drop.
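
     One plausible mechanism behind these results is the interaction
     between the minimum RTO and sudden RTT increases. The sketch below
     implements the standard retransmission timer estimator from
     RFC 6298 with a configurable floor. It is not a description of the
     NetScaler's internal algorithm, the class name is hypothetical, and
     the 75 ms steady RTT and 250 ms spike are made-up illustration
     values.

```python
class RtoEstimator:
    """RFC 6298 style RTO estimate with a configurable minimum (ms)."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance

    def __init__(self, min_rto_ms):
        self.min_rto_ms = min_rto_ms
        self.srtt = None
        self.rttvar = None

    def sample(self, rtt_ms):
        """Feed one RTT measurement and return the resulting RTO."""
        if self.srtt is None:
            self.srtt = rtt_ms
            self.rttvar = rtt_ms / 2
        else:
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt_ms))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_ms
        return max(self.min_rto_ms, self.srtt + 4 * self.rttvar)


SPIKE_MS = 250  # hypothetical one-off delay, e.g. a radio handoff

for min_rto in (100, 400):
    estimator = RtoEstimator(min_rto)
    for _ in range(20):            # a steady 75 ms path
        rto = estimator.sample(75)
    spurious = rto < SPIKE_MS      # timer fires before the late ACK arrives
    print(f"minRTO {min_rto} ms: RTO {rto:.0f} ms, "
          f"spurious retransmit on {SPIKE_MS} ms spike: {spurious}")
```

     On a steady path the computed RTO settles just above the RTT, so
     the floor decides how much delay variation is tolerated before the
     sender retransmits data that was never lost.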
  




Case Study: Engadget Coverage of New iPad Announcement




Case Study: Engadget Coverage of New iPad Announcement

     On March 7, 2012 Apple held a Keynote event to announce the
     new iPad.
     Traditionally, these events crush sites due to user traffic.
     The iPhone 4S event in September 2011 overloaded Engadget
     and we decided that it would not happen again.
     We used all the performance information presented here, plus
     other lessons learned to build a plant that could handle the
     projected traffic.




Case Study: Engadget Coverage of New iPad Announcement

    Goals:
            Engadget would not fail during future Apple announcements.
            Customers would be able to access Engadget Live Blogs for the
            entire event.
            No event driven changes would be needed to the plant.
  




Case Study: Engadget Coverage of New iPad Announcement

    Major Concerns:
            Keep the NetScaler CPU < 50%.
            Distribute the large initial connections per second.
            Prepare for competing sites to fail and for those users to
            shift to Engadget.
            Initially overbuild to guarantee service and measure the true
            capacity requirements.




Case Study: Engadget Coverage of New iPad Announcement

    We succeeded!

    Gigaom.com 3/8/2012
    “Engadget’s Tim Stevens was one of the few tech media editors
    who only had to worry about covering the event itself, as
    opposed to managing the failure of his tools.”

    http://gigaom.com/apple/live-from-sf-sorta-why-apple-events-break-publishers/



Case Study: Engadget Coverage of New iPad Announcement

    Event Details:
            Event ran 120 minutes, including pre and post blogs.
            3 second updates were served non-stop to the customers.
            A total of 1.14 Billion http requests served during the event.
            Sustained > 200K http requests/sec for 30 minutes
            Peak > 220K http requests/sec over 10 min.
            No changes were needed to any device during the event.
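
     The event figures can be cross-checked with a little arithmetic.
     The sketch below derives the average request rate over the whole
     event and a lower bound on the share of requests served during the
     30-minute sustained window. Variable names are hypothetical; the
     inputs are the numbers stated above.

```python
TOTAL_REQUESTS = 1_140_000_000   # 1.14 billion http requests
EVENT_MINUTES = 120
SUSTAINED_RATE = 200_000         # requests/sec, stated as a lower bound
SUSTAINED_MINUTES = 30

average_rate = TOTAL_REQUESTS / (EVENT_MINUTES * 60)
sustained_requests = SUSTAINED_RATE * SUSTAINED_MINUTES * 60
sustained_share = sustained_requests / TOTAL_REQUESTS

print(f"average rate over the event: {average_rate:,.0f} requests/sec")
print(f"requests in the sustained window: at least {sustained_requests:,}")
print(f"share of all requests: at least {sustained_share:.0%}")
```

     The average works out to roughly 158K requests/sec, so the
     sustained and peak rates sit about 25% to 40% above the event-wide
     mean.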




Case Study: Engadget Coverage of New iPad Announcement

      How we did it:
            Compression Bypass set for 20%
            Integrated Caching
            Flash Cache (1 sec)
            14 non-dedicated NetScaler HA pairs to minimize connection
            rate, located in 5 data centers in the US and EU.
            AOL custom tcpProfiles.
            600 cloud based, well tuned, virtual CentOS servers.

     This was done while maintaining normal AOL production
     services on the shared infrastructure!

Troubleshooting & Monitoring




Troubleshooting: General
     Signs your NetScaler is getting overloaded:
            Watch for xoff frames on the switch port; they indicate the NS
            is having trouble handling packets.
            Watch for buffer overruns; they may indicate the need for
            additional interfaces to expand the buffer pool.

     nstrace.sh issues:
            Not always fully capturing packets.
            Relative frame timestamps may have a negative value due to
            the capture mechanism.
            Additional threads were added in 10.x, which should help.


Monitoring: SNMP OIDs

    In addition to the usual OIDs, we have found these very useful
    to warn of potential problems.
            ifTotXoffSent - .1.3.6.1.4.1.5951.4.1.1.54.1.43
            ifnicTxStalls - .1.3.6.1.4.1.5951.4.1.1.54.1.45
            ifErrRxNoBuffs - .1.3.6.1.4.1.5951.4.1.1.54.1.30
            ifErrTxNoNSB - .1.3.6.1.4.1.5951.4.1.1.54.1.31




Future work
     What our team is working on over the next 12 months
            IPFIX and AppFlow monitoring.
            Add NetScaler interfaces to increase buffer pool.
            Custom tcpProfiles for specific applications like ad serving.
            Move to route based, active/active architecture.
            Start to roll out Jumbo Frames & WebSockets
            Investigating OpenFlow/SDN
            Replace current tools with a completely new tool set based on
            Trigger (to be opensourced).




AOL is Hiring!

     As you have seen, we are NOT just your parent’s dialup anymore.
     http://corp.aol.com/careers




Appendix: AOL custom tcpProfile configs
     add ns tcpProfile aol_vip_std_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 0 -nagle DISABLED
     -ackOnPush ENABLED -maxBurst 10 -initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0
     -pktPerRetx 2 -minRTO 400 -slowStartIncr 4 -bufferSize 8190

     add ns tcpProfile aol_vip_dialup-client_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 0 -nagle Enabled
     -ackOnPush ENABLED -maxBurst 4 -initialCwnd 4 -delayedAck 50 -oooQSize 100 -maxPktPerMss 0
     -pktPerRetx 2 -minRTO 500 -slowStartIncr 2 -bufferSize 8190

     add ns tcpProfile aol_vip_mobile-client_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 0 -nagle Enabled
     -ackOnPush ENABLED -maxBurst 10 -initialCwnd 6 -delayedAck 50 -oooQSize 100 -maxPktPerMss 0
     -pktPerRetx 2 -minRTO 1000 -slowStartIncr 4 -bufferSize 8190

     add ns tcpProfile aol_vip_server-2-vserver_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 5 -nagle Enabled
     -ackOnPush ENABLED -maxBurst 10 -initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0
     -pktPerRetx 2 -minRTO 100 -slowStartIncr 4 -bufferSize 8190

     add ns tcpProfile aol_svc_std_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 3 -nagle DISABLED
     -ackOnPush ENABLED -maxBurst 10 -initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0
     -pktPerRetx 2 -minRTO 100 -slowStartIncr 4 -bufferSize 8190
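
     When comparing profiles like these, it can help to diff them
     mechanically. The sketch below parses two of the commands above
     into dictionaries and lists the settings that differ. The helper
     names are hypothetical and the parser only understands the simple
     "-flag value" form used in this appendix, not NSCLI syntax in
     general.

```python
import shlex


def parse_tcp_profile(command):
    """Split an 'add ns tcpProfile' command into (name, settings dict)."""
    tokens = shlex.split(command)
    if [t.lower() for t in tokens[:3]] != ["add", "ns", "tcpprofile"]:
        raise ValueError("not an 'add ns tcpProfile' command")
    name, rest = tokens[3], tokens[4:]
    if len(rest) % 2:
        raise ValueError("expected '-flag value' pairs")
    # Flags and values are compared case-insensitively.
    settings = {flag.lstrip("-").lower(): value.upper()
                for flag, value in zip(rest[::2], rest[1::2])}
    return name, settings


def diff_profiles(a, b):
    """Settings whose values differ, as {flag: (value_in_a, value_in_b)}."""
    return {key: (a.get(key), b.get(key))
            for key in sorted(set(a) | set(b)) if a.get(key) != b.get(key)}


STD = ("add ns tcpProfile aol_vip_std_tcpprofile -WS ENABLED -SACK ENABLED "
       "-WSVal 0 -nagle DISABLED -ackOnPush ENABLED -maxBurst 10 "
       "-initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0 "
       "-pktPerRetx 2 -minRTO 400 -slowStartIncr 4 -bufferSize 8190")
DIALUP = ("add ns tcpProfile aol_vip_dialup-client_tcpprofile -WS ENABLED "
          "-SACK ENABLED -WSVal 0 -nagle Enabled -ackOnPush ENABLED "
          "-maxBurst 4 -initialCwnd 4 -delayedAck 50 -oooQSize 100 "
          "-maxPktPerMss 0 -pktPerRetx 2 -minRTO 500 -slowStartIncr 2 "
          "-bufferSize 8190")

_, std_settings = parse_tcp_profile(STD)
_, dialup_settings = parse_tcp_profile(DIALUP)
differences = diff_profiles(std_settings, dialup_settings)
for flag, (std_value, dialup_value) in differences.items():
    print(f"{flag:<14} std={std_value:<9} dialup={dialup_value}")
```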




Appendix: AOL custom tcpParam Config
     set ns tcpParam -WS Enabled -WSVal 2 -SACK Enabled -maxBurst 6
     -initialCwnd 4 -delayedAck 100 -downStateRST DISABLED -nagle DISABLED
     -limitedPersist ENABLED -oooQSize 64 -ackOnPush ENABLED -maxPktPerMss 0
     -pktPerRetx 2 -minRTO 100 -slowStartIncr 2




Appendix: NSCLI Show Commands
     tcpParam NSCLI Commands
     Show all current values
         • sh ns tcpparam -format old -level verbose
         • sh ns tcpparam -level verbose



     tcpProfile NSCLI Commands
     Show all current values
         • sh ns tcpprofile -format old -level verbose
         • sh ns tcpprofile <profile_name> -level verbose




Appendix: tcpParam NSCLI Commands
     Set AOL custom tcpParam
              set ns tcpParam -WS Enabled -WSVal 2 -SACK Enabled -maxBurst 6 -initialCwnd 6
              -delayedAck 100 -downStateRST DISABLED -nagle DISABLED -limitedPersist ENABLED
              -oooQSize 64 -ackOnPush ENABLED -maxPktPerMss 0 -pktPerRetx 2 -minRTO 100
              -slowStartIncr 2

     Set a specific value
              set ns tcpParam -WS enabled




          Changing tcpparam modifies the same values in the default tcpprofile.




Appendix: tcpProfile NSCLI Commands
     Creating new profiles
                 add ns tcpProfile aol_vip_std_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 0 -nagle DISABLED
                 -ackOnPush ENABLED -maxBurst 10 -initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0
                 -pktPerRetx 2 -minRTO 400 -slowStartIncr 4 -bufferSize 8190 -mss 0


     Changing a value
                 set ns tcpProfile aol_vip_std_tcpprofile -WSVal 3


     Applying Profiles
                 set cs vserver <vservername:port> -tcpprofile <profilename>
                 set cs vserver www_aol_com:80 -tcpprofile aol_vip_std_tcpprofile
                 set lb vserver <vservername:port> -tcpprofile <profilename>
                 set service <servicename:port> -tcpprofile <profilename>


     Resetting to Default
                 unset cs vserver <vservername:port> -tcpprofile
                 unset lb vserver <vservername:port> -tcpprofile
                 unset service <servicename:port> -tcpprofile
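
When a profile is rolled out to many entities, generating the apply commands together with their matching reset commands keeps a rollback plan at hand. The following is a minimal Python sketch that assumes only the "set ... -tcpprofile" and "unset ... -tcpprofile" forms listed above. The function name tcpprofile_commands and the entity names example_lb:80 and example_svc:8080 are hypothetical.

```python
# Entity kinds that appear in the appendix commands above.
ENTITY_KINDS = ("cs vserver", "lb vserver", "service")


def tcpprofile_commands(entities, profile):
    """Return (apply, rollback) NSCLI command lists for (kind, name) pairs.

    The rollback list uses `unset`, which resets each entity to the default
    profile; it does not restore a previously bound custom profile.
    """
    apply_cmds, rollback_cmds = [], []
    for kind, name in entities:
        if kind not in ENTITY_KINDS:
            raise ValueError(f"unsupported entity kind: {kind!r}")
        apply_cmds.append(f"set {kind} {name} -tcpprofile {profile}")
        rollback_cmds.append(f"unset {kind} {name} -tcpprofile")
    return apply_cmds, rollback_cmds


if __name__ == "__main__":
    targets = [
        ("cs vserver", "www_aol_com:80"),    # name taken from the slide
        ("lb vserver", "example_lb:80"),     # hypothetical
        ("service", "example_svc:8080"),     # hypothetical
    ]
    apply_cmds, rollback_cmds = tcpprofile_commands(targets, "aol_vip_std_tcpprofile")
    print("\n".join(apply_cmds))
    print("\n".join(rollback_cmds))
```

Review the generated lines before pasting them into the NSCLI, and keep the rollback list with the change record.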




Explanation of TCPProfile variables (V9.2 & 9.3)
  Variable            Default   Min   Max      Description
  -name               ----      ----  ----     The name for a TCP profile. A TCP profile name can be from 1 to 127 characters
                                               and must begin with a letter, a number, or the underscore symbol (_). Other
                                               characters allowed after the first character in a name are the hyphen, period,
                                               pound sign, space, at sign, and equals sign.
  -WS                 Disabled  ----  ----     Enable or disable window scaling. If Disabled, window scaling is disabled for
                                               both sides of the conversation.
  -WSVal              4         0     8        The factor used to calculate the new window size.
  -maxBurst           6         1     255      The maximum number of TCP segments allowed in a burst. The higher this value,
                                               the more frames can be sent at one time.
  -initialCwnd        4         2     44       The initial maximum upper limit on the number of TCP packets that can be
                                               outstanding on the TCP link to the server. As of 9.2.50.1, the maximum was
                                               raised from 6 to 44.
  -delayedAck         100       10    300      The time-out for TCP delayed ACK, in milliseconds.
  -oooQSize           64        0     65535    The maximum size of the out-of-order packet queue. A value of 0 means infinite.
                                               The name is a misnomer: this buffer contains sent frames that are awaiting ACKs,
                                               or received frames that are not sequential, meaning some packets are missing
                                               due to SACK.
  -maxPktPerMss       0         0     512      The maximum number of TCP packets allowed per maximum segment size (MSS). A
                                               value of 0 means that no maximum is set.
  -pktPerRetx         1         1     512      The maximum number of packets that should be retransmitted on receiving a
                                               "partial ACK". Partial ACKs are ACKs indicating that not all outstanding frames
                                               were acked.
  -minRTO             100       10    64000    The minimum retransmission timeout (RTO) value, in milliseconds. The NetScaler
                                               supports New Reno and conforms to RFC 2001 and RFC 5827 (?). Since the
                                               NetScaler does not use TCP timestamping, these values do not correspond to
                                               actual propagation delays.
  -slowStartIncr      2         1     100      Multiplier that determines the rate at which slow start increases the size of
                                               the TCP transmission window after each ACK of a successful transmission.
  -SACK               Disabled  ----  ----     Enable or disable selective acknowledgement (SACK). There is NO reason this
                                               should be off.
  -nagle              Disabled  ----  ----     Enable or disable the Nagle algorithm on TCP connections. When enabled, reduces
                                               the number of small segments by combining them into one packet. Primary use is
                                               on slow or congested links such as mobile or dial-up.
  -ackOnPush          Enabled   ----  ----     Send immediate acknowledgement (ACK) on receipt of TCP packets with the PUSH
                                               bit set.
  -mss (v9.3)         0         0     1460     Maximum segment size. The default value 0 uses the global setting in
                                               "set tcpparam"; the maximum value is 1460.
  -bufferSize (v9.3)  8190      ----  4194304  The value that you set is the minimum value that is advertised by the NetScaler
                                               appliance, and this buffer size is reserved when a client initiates a
                                               connection that is associated with an endpoint-application function, such as
                                               compression or SSL. The managed application can request a larger buffer, but if
                                               it requests a smaller buffer, the request is not honored, and the specified
                                               buffer size is used. If the TCP buffer size is set both at the global level and
                                               at the entity level (virtual server or service level), the buffer specified at
                                               the entity level takes precedence. If the buffer size that you specify for a
                                               service is not the same as the buffer size that you specify for the virtual
                                               server to which the service is bound, the NetScaler appliance uses the buffer
                                               size specified for the virtual server for the client-side connection and the
                                               buffer size specified for the service for the server-side connection. However,
                                               for optimum results, make sure that the values specified for a virtual server
                                               and the services bound to it have the same value. The buffer size that you
                                               specify is used only when the connection is associated with endpoint-application
                                               functions, such as SSL and compression. Note: A high TCP buffer value could
                                               limit the number of connections that can be made to the NetScaler appliance.
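
The -WSVal setting is easier to reason about with numbers. Under RFC 1323 the window scale value is a shift count applied to the 16-bit TCP window field, so the largest window that can be advertised is 65,535 * 2^WSVal bytes, and a flow needs roughly bandwidth * RTT bytes in flight to fill its path. The Python sketch below works through that arithmetic. The function names are illustrative, the sample figures (10 Mb/s per flow, 75 ms RTT) are the assumptions the deck lists for aol_vip_std_tcpprofile, and how the NetScaler applies the value internally is not covered here.

```python
MAX_UNSCALED_WINDOW = 65_535  # largest value of the 16-bit TCP window field
RFC1323_MAX_SHIFT = 14        # RFC 1323 limit; the table above caps -WSVal at 8


def scaled_window(ws_val: int) -> int:
    """Largest advertisable window, in bytes, for a given window scale shift."""
    if not 0 <= ws_val <= RFC1323_MAX_SHIFT:
        raise ValueError("window scale shift must be between 0 and 14")
    return MAX_UNSCALED_WINDOW << ws_val


def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, rounded up (integer arithmetic)."""
    bit_milliseconds = bandwidth_bps * rtt_ms
    return (bit_milliseconds + 7_999) // 8_000  # /1000 for ms, /8 for bits


def min_window_shift(required_bytes: int) -> int:
    """Smallest shift whose scaled window covers required_bytes."""
    for shift in range(RFC1323_MAX_SHIFT + 1):
        if scaled_window(shift) >= required_bytes:
            return shift
    raise ValueError("required window exceeds what window scaling can express")


if __name__ == "__main__":
    needed = bdp_bytes(10_000_000, 75)  # 10 Mb/s per flow, 75 ms RTT
    print("bytes in flight to fill the path:", needed)
    print("smallest sufficient shift:", min_window_shift(needed))
    for ws_val in (0, 2, 3, 5):  # WSVal values used in the AOL configs
        print(f"WSVal {ws_val}: up to {scaled_window(ws_val)} bytes")
```

The shift only sets an upper bound on the advertised window; the window actually offered also depends on the buffer available to the connection.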


  • 1. Synergy  SYN304:   NetScaler  TCP  Performance   Tuning  in  the  AOL  Network Presenters: Kevin  Mason  -­‐  kem@aol.net Tim  Wicinski  -­‐  tjw@aol.net May  11,  2012 1 Wednesday, May 23, 12 1
  • 2. Agenda Goals InformaNon  on  the  AOL  ProducNon  Network TCP  Tuning  Concepts TCP  Tuning  on  the  NetScaler AOL  tcpProfiles Case  Study:  bps  Improvements Case  Study:  Retransmission  ReducNon Case  Study:  Engadget  Coverage  of  New  iPad  Announcement     TroubleshooNng  and  Monitoring  NetScalers Appendixes 2 Wednesday, May 23, 12 2
  • 3. Note  on  Today’s  Sta0s0cs All  the  AOL  staNsNcs  presented  today  are  operaNonal   numbers.  They  should  not  be  construed  to  represent  page   views  or  other  types  of  site  popularity  data. 3 Wednesday, May 23, 12 3
  • 4. Performance  Tuning  Goal Create  the  best  possible  experience  for   AOL  Customers  by  conNnuously   analyzing  and  engineering  the  fastest,   most  efficient  network  possible. 4 Wednesday, May 23, 12 4
  • 5. Defini0ons Just  to  make  sure  we  are  on  the  same  page VIP  -­‐  Load  Balancing  (LB)  and  Content  Switching  (CS)  VServers Service  -­‐  Backend  connecNvity  to  Real  Hosts Outbound  -­‐  Traffic  outbound  from  the  VIP  or  Service   Inbound  -­‐  Traffic  inbound  towards  the  VIP  or  Service 5 Wednesday, May 23, 12 5
  • 6. AOL  Produc0on  Network  Stats Load  Balancing 70  HA  pairs  of  NetScalers 10G,  mulN  interface  a_ached 19,642  VServers 27,214  Services  (Host  +  Port) 13,462  Servers,  majority  cloud  based GSLB/DNS Approx  18K  DNS  domains Approx  4K  GSLB  configs 180M  queries/day 6 Wednesday, May 23, 12 6
  • 7. AOL  Produc0on  Network Internal  Tooling Web  based  self  service  site  managing  all  changes,  using  a   mixture  of  SOAP  and  NITRO  calls: Avg  250  Self  Service  changes  /  week. Avg  25K  config  command  changes  /  week. 7 Wednesday, May 23, 12 7
  • 8. TCP  Tuning  Concepts Have  a  solid,  technical  understanding  of  the  client's   connecNon  method.   Get  the  bits  to  the  customer  quickly  and  cleanly. Where  possible,  send  a  large  iniNal  burst  of  packets. Ramp  traffic  up  fast. Recover  errors  quickly,  avoid  slow  start. Melt  fiber,  smoke  servers,  choke  routers 8 Wednesday, May 23, 12 8
  • 9. TCP  Tuning  Concepts TCP  Tuning  is  done  per  connecNon  not  the  aggregate  of  the   link.     Tuning  Factors  include: Latency  or  Round  Trip  Time  (RTT) Minimum  Bandwidth Packet  Loss Supported  TCP  OpNons 9 Wednesday, May 23, 12 9
  • 10. TCP  Connec0on  Op0ons TCP  negoNates  opNons  to  define  how  the  connecNon  operates. TCP  OpNons  are  generally  symmetrical,  both  sides  have  to  support   it,  or  the  opNon  is  dropped. TCP  OpNon  values  are  NOT  always  the  same  on  both  sides.    For   example,  Receive  Windows,  Window  Scaling  and  MSS  (Max   Segment  Size)  are  ojen  different. 10 Wednesday, May 23, 12 10
  • 11. !!!  Caveat  Emptor  !!! While  we  are  happy  to  share  AOL  experience  and   configuraAons,  be  careful  applying  these  seEngs  in  your   network,  they  might  cause  problems. When  working  with  any  parameters,  have  a  rollback  plan! O’Toole’s  Comment  on  Murphy's  Law:                                                                                                  Murphy  is  an  Op?mist.   11 Wednesday, May 23, 12 11
  • 12. TCP  Tuning  on  the  NetScaler Two  major  tuning  knobs  on  the  NetScaler: tcpParam  -­‐  Global  TCP  semngs tcpProfile  -­‐  Per  VIP  or  Service  semngs We  set  the  tcpParam  for  general  purpose  values,  but  since  it  is   a  OS  supplied  semng,  it  could  be  over  wri_en  by  an  upgrade.     Using  custom  tcpProfiles  is  safer  since  they  will  not  be   overwri_en. 12 Wednesday, May 23, 12 12
  • 13. tcpProfile  Values -­‐WS -­‐maxPktPerMss -­‐nagle -­‐WSVAL -­‐pktPerRetx -­‐ackOnPush -­‐maxBurst -­‐minRTO -­‐mss  (v9.3) -­‐iniAalCwnd -­‐slowStartIncr -­‐bufferSize  (v9.3) -­‐delayedAck -­‐SACK -­‐oooQSize If you do nothing else,: Enable -WS (Window Scale) Enable -SACK (Selective Ack) 13 Wednesday, May 23, 12 13
  • 14. TCP  Window  Scale  (RFC  1323) Window  Scale  is  a  TCP  opNon  to  increase  the  receiver’s   window  size,  compensaNng  for  long  and/or  fat  networks  (LFN).   The  values  used  by  each  device  in  a  connecNon  are  ojen   asymmetrical. Enabled  on  the  NetScaler  with  the  -­‐WS  opNon. The  value  adverNsed  by  the  NetScaler  is  set  with  -­‐WScale. At  the  least,  enable  -­‐WS  to  take  advantage  of  the  client  and  server   adverNsed  windows. 14 Wednesday, May 23, 12 14
  • 15. TCP  SACK  (RFC  2018) SelecNve  Acknowledgement  or  SACK  is  a  TCP  opNon  enabling  a   receiver  to  tell  the  sender  the  range  of  non-­‐conNguous  packets   received. Without  SACK,  the  receiver  can  only  tell  the  sender  about  packets   sequenNally  received.     This  slows  down  recovery. Can  force  Conges5on  control  to  kick  in. Enabled  with  the  -­‐SACK  opNon. 15 Wednesday, May 23, 12 15
  • 16. Citrix  default  tcpProfile   SeEngs  appropriate  for  circa  1999.     Have  stayed  this  way  due  to  Citrix  general  policy  of  not   changing  default  semngs.  Some  significant  changes  were  made   to  default  on  v9.2,  see  CTX130962  (10/7/2011). No  Window  Scaling  or  SACK. Causes  choppy  data  flow. Slow  to  recover  from  packet  loss. Can  trigger  slow  start  conges5on  control. Very  slow  CongesNon  Window  ramp  up. Small  -­‐iniNalCWind  &  -­‐maxBurst Search  for  “Ini5al  Conges5on  Window”  for  more  details 16 Wednesday, May 23, 12 16
  • 17. Additional Citrix supplied tcpProfiles
    Citrix supplies several predefined profiles.
    Depending on code level, Citrix supplies at least 7 additional tcpProfiles that you can try and that may work well for you. Since they are supplied with the O/S, they are subject to changes in the future.
    We studied these tcpProfiles to develop a starting point for our testing.
  • 18. AOL Custom tcpProfiles
  • 19. AOL Custom tcpProfiles
    VIP Profiles
      aol_vip_std_tcpprofile
      aol_vip_mobile-client_tcpprofile
      aol_vip_server-2-vserver_tcpprofile
      aol_vip_dialup-client_tcpprofile
    Service Profiles
      aol_svc_std_tcpprofile
    Complete configs included in Appendix
  • 20. Standard Client to VIP - aol_vip_std_tcpprofile
    Push data as fast as possible after the connection is established by increasing the packet burst and ramp rate, and improve error recovery.
    Assumptions:
      Content is generally outbound to the client.
      The max bps per flow is 10 Mb/s.
      RTT is 75 ms.
    Changes from default:
      Enabled -WS (Window Scaling)
      Enabled -SACK
      Increased -maxBurst
      Increased -initialCwnd
      Increased -slowStartIncr
      Increased -pktPerRetx
      Reduced -delayedAck
  • 21. Mobile client to VIP - aol_vip_mobile-client_tcpprofile
    Based on aol_vip_std_tcpprofile & aol_vip_dialup-client_tcpprofile with changes to improve the mobile (3G, 4G) client experience by increasing tolerance for RTT shifts as the device moves. This reduces retransmissions that can flood a device.
    Assumptions:
      RTT can shift from 300 ms to 900 ms.
      bps is limited to < 3 Mb/s.
    Changes from aol_vip_std_tcpprofile:
      Increased -minRTO
      Reduced -maxBurst
      Reduced -delayedAck
      Reduced -slowStartIncr
    Results:
      Avg 30% reduction in retransmissions to mobile clients
  • 22. Server to VIP - aol_vip_server-2-vserver_tcpprofile
    Based on aol_vip_std_tcpprofile, for VIPs that handle internal server flows where the data is inbound to the vserver.
    Assumptions:
      The source host is in a data center or other attached space.
      The maximum bps per flow is 600 Mb/s inbound to a VIP or vserver.
      The max propagation delay (RTT) is 20 ms (0.02 sec).
    Changes from aol_vip_std_tcpprofile:
      Increased -WSVal
      Reduced -minRTO
  • 23. Dial Client to VIP - aol_vip_dialup-client_tcpprofile
    Based on aol_vip_std_tcpprofile with changes to improve the dial client experience by reducing retransmissions and preventing terminal server flooding.
    Assumptions:
      Max bps per flow is 50 kb/s (6.25 KB/s).
      Avg RTT is 500 ms (0.5 sec).
    Changes from aol_vip_std_tcpprofile:
      Increased -minRTO
      Reduced -maxBurst
      Reduced -delayedAck
      Reduced -slowStartIncr
    Results:
      Total page render time reduced from ~35s to ~16s.
      SSL handshake times reduced from 4+s to ~1s.
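A rough calculation suggests why a smaller burst suits dial links. The sketch below assumes a 1460-byte MSS and counts only payload bits, so it slightly understates the real time on the wire; the burst sizes 10 and 4 are the -maxBurst values from the standard and dial-up profiles in the appendix. The interpretation that follows is one plausible reading, not something measured in the slides.

```python
# Sketch: time needed to push a burst of segments through a slow link.
# Assumptions: 1460-byte MSS, payload bits only, no header or modem overhead.


def serialization_seconds(segments: int, mss_bytes: int, link_bps: float) -> float:
    """Seconds needed to clock `segments` full-size segments onto the link."""
    return segments * mss_bytes * 8 / link_bps


if __name__ == "__main__":
    DIAL_BPS = 50_000  # 50 kb/s, the per-flow assumption for dial clients
    for burst in (10, 4):
        seconds = serialization_seconds(burst, 1460, DIAL_BPS)
        print(f"maxBurst={burst}: {seconds:.2f} s to drain the burst")
```

At 50 kb/s a single full-size segment takes about 0.23 s to transmit, so a 10-segment burst needs over 2 s to drain while a 4-segment burst needs under 1 s. Segments queued at the back of a large burst can therefore outlive a sub-second retransmission timer even when nothing was lost.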
  • 24. Server to Service - aol_svc_std_tcpprofile
    Used on Services for traffic between the real host & NetScaler.
    Even though the propagation delay is low, BDP is still a factor due to per-connection bps.
    Assumptions:
      Max bps per flow is 650 Mb/s (81.25 MB/s).
      Avg RTT is 10 ms (0.010 sec).
      Majority of flows are HTTP 1.1 or long-lived TCP connections.
    Changes from default:
      Enabled -SACK
      Enabled -WS
      Increased -WSVal
      Reduced -delayedAck
      Reduced -minRTO
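The bandwidth-delay product (BDP) behind these assumptions can be worked out directly. The sketch below computes the BDP for the per-flow figures stated on the profile slides and the smallest window scale shift whose maximum window covers it. It is a starting estimate only: it assumes -WSVal acts as the RFC 1323 shift count, and it says nothing about how AOL actually chose its values.

```python
# Sketch: bandwidth-delay product and the smallest window scale shift that
# lets a single flow keep the path full.
# Assumption: -WSVal behaves like the RFC 1323 shift count.

MAX_UNSCALED_WINDOW = 65_535  # bytes


def bdp_bytes(bits_per_second: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the path full."""
    return bits_per_second * rtt_seconds / 8


def min_window_shift(bdp: float) -> int:
    """Smallest shift count whose maximum window is at least the BDP."""
    shift = 0
    while shift < 14 and (MAX_UNSCALED_WINDOW << shift) < bdp:
        shift += 1
    return shift


if __name__ == "__main__":
    # (profile, max bps per flow, RTT in seconds) as stated on the slides
    assumptions = [
        ("aol_vip_std_tcpprofile", 10e6, 0.075),
        ("aol_vip_server-2-vserver_tcpprofile", 600e6, 0.020),
        ("aol_vip_dialup-client_tcpprofile", 50e3, 0.5),
        ("aol_svc_std_tcpprofile", 650e6, 0.010),
    ]
    for name, bps, rtt in assumptions:
        bdp = bdp_bytes(bps, rtt)
        print(f"{name}: BDP={bdp:,.0f} bytes, min shift={min_window_shift(bdp)}")
```

For the server-to-VIP case the estimate is 1,500,000 bytes and a shift of 5, which matches the -WSVal 5 in the appendix config. The service profile works out to 812,500 bytes, which shows why BDP still matters at a 10 ms RTT.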
  • 25. Case Study: Bits per Second (bps) Improvements
  • 26. Case Study: bps Improvements
    Problem:
      Need to improve the customer experience and increase efficiency.
      Analysis had shown more than 99% of incoming SYNs support Window Scaling and/or SACK, yet the NetScaler was not utilizing these options.
      Analysis also showed a large number of sessions advertising Zero Windows.
      Sessions were being dropped into slow start due to packet loss.
  • 27. Case Study: bps Improvements
    What did we change?
      -WS Enabled
      -SACK Enabled
      -WSVal 0
      -maxBurst 10
      -initialCwnd 10
      -delayedAck 100
      -pktPerRetx 2
      -slowStartIncr 4
    This became the basis for the aol_vip_std_tcpprofile
  • 28. Case Study: bps Improvements
  • 29. Case Study: bps Improvements
    Results:
      Immediate 30% jump in outbound bps towards the client.
      Retransmission rate did not increase.
      The increase in bps was sustained over the following weeks.
      Allowed higher server utilization and higher NetScaler efficiency (more bang for $$).
  • 30. Case Study: Retransmission Reduction
  • 31. Case Study: Retransmission Reduction
    Problem:
      While developing the Mobile and Dial tcpProfiles, we realized there were a large number of fast retransmissions on the VServers.
      At times, retransmissions approached 15% of outbound traffic.
      This retransmission rate was sustained around the clock, in 3 data centers, eliminating network congestion as a cause.
  • 32. Case Study: Retransmission Reduction
    Testing:
      To reduce variables, we first tested on one pair of shared service NetScalers.
      We tried to reduce the -delayedAck to 100 ms.
        No change was seen in the traffic over 24 hours.
      Next we tried increasing the -minRTO.
        Instant drops in the number of fast retransmissions, 1st - 7th retransmissions and RetransmissionGiveup.
  • 33. Case Study: Retransmission Reduction
  • 34. Case Study: Retransmission Reduction
    Change Details:
      After increasing -minRTO from 100 ms to 400 ms, retransmitted packets dropped from 14.2k/sec to 2.5k/sec, and bps dropped from 102 Mb/sec to 30 kb/sec on one link.
      Increasing -minRTO from 100 ms to 300 ms had the greatest effect on reducing retransmissions.
      When applied across AOL, total outbound bps dropped by 10%. However, increasing -minRTO to 500 ms caused outbound bps to drop.
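One way to see why the RTO floor matters is to simulate a retransmission timer against a path whose RTT occasionally spikes. The sketch below uses the standard RFC 6298 estimator (SRTT plus four times RTTVAR, with a configurable floor) as a stand-in. The NetScaler's internal timer is not documented in these slides, so treat this as an illustration of the mechanism rather than a model of the appliance. The RTT trace is made up for the example, and the sketch ignores Karn's algorithm for simplicity.

```python
# Sketch: count ACKs that arrive after the retransmission timer has already
# fired (spurious timeouts) for a given minimum RTO.
# Assumption: RFC 6298 estimator; not the NetScaler's actual implementation.


def count_spurious_timeouts(rtt_samples_ms, min_rto_ms):
    srtt = rttvar = None
    rto = max(min_rto_ms, 1000.0)  # RFC 6298 starts with a 1 s RTO
    spurious = 0
    for rtt in rtt_samples_ms:
        if rtt > rto:
            spurious += 1  # timer expired before the ACK came back
        if srtt is None:
            srtt, rttvar = rtt, rtt / 2
        else:
            rttvar = 0.75 * rttvar + 0.25 * abs(srtt - rtt)
            srtt = 0.875 * srtt + 0.125 * rtt
        rto = max(min_rto_ms, srtt + 4 * rttvar)
    return spurious


if __name__ == "__main__":
    # Hypothetical trace: steady 80 ms RTT with an occasional 350 ms spike.
    trace = ([80] * 20 + [350]) * 3
    for floor in (100, 400):
        print(f"minRTO={floor} ms: "
              f"{count_spurious_timeouts(trace, floor)} spurious timeouts")
```

On this trace the timer converges to the 100 ms floor during quiet periods, so every spike triggers an unnecessary retransmission; with a 400 ms floor none do.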
  • 35. Case Study: Engadget Coverage of New iPad Announcement
  • 36. Case Study: Engadget Coverage of New iPad Announcement
    On March 7, 2012 Apple held a Keynote event to announce the new iPad.
    Traditionally, these events crush sites due to user traffic.
    The iPhone 4S event in September 2011 overloaded Engadget and we decided that it would not happen again.
    We used all the performance information presented here, plus other lessons learned, to build a plant that could handle the projected traffic.
  • 37. Case Study: Engadget Coverage of New iPad Announcement
    Goals:
      Engadget would not fail during future Apple announcements.
      Customers would be able to access Engadget Live Blogs for the entire event.
      No event-driven changes would be needed to the plant.
  • 38. Case Study: Engadget Coverage of New iPad Announcement
    Major Concerns:
      Keep the NetScaler CPU < 50%.
      Distribute the large initial connections per second.
      Prepare for competing sites to fail and their users to shift to Engadget.
      Initially overbuild to guarantee service and measure the true capacity requirements.
  • 39. Case Study: Engadget Coverage of New iPad Announcement
    We succeeded!
    Gigaom.com 3/8/2012
      "Engadget's Tim Stevens was one of the few tech media editors who only had to worry about covering the event itself, as opposed to managing the failure of his tools."
      http://gigaom.com/apple/live-from-sf-sorta-why-apple-events-break-publishers/
  • 40. Case Study: Engadget Coverage of New iPad Announcement
  • 41. Case Study: Engadget Coverage of New iPad Announcement
    Event Details:
      Event ran 120 minutes, including pre and post blogs.
      3 second updates were served non-stop to the customers.
      A total of 1.14 billion http requests served during the event.
      Sustained > 200K http requests/sec for 30 minutes.
      Peak > 220K http requests/sec over 10 min.
      No changes were needed to any device during the event.
  • 42. Case Study: Engadget Coverage of New iPad Announcement
    How we did it:
      Compression Bypass set for 20%
      Integrated Caching
      Flash Cache (1 sec)
      14 non-dedicated NetScaler HA pairs to minimize connection rate, located in 5 data centers in the US and EU.
      AOL custom tcpProfiles.
      600 cloud-based, well-tuned, virtual CentOS servers.
    This was done while maintaining normal AOL production services on the shared infrastructure!
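The event figures from the two preceding slides can be turned into per-second and per-device numbers with some back-of-the-envelope arithmetic. The even split across HA pairs is an assumption made here for illustration; the slides do not say how the load was actually distributed.

```python
# Sketch: sanity-check the event numbers.
# Assumption: load is spread evenly across the 14 NetScaler HA pairs.


def average_rate(total_requests: float, minutes: float) -> float:
    """Average requests per second over the whole event."""
    return total_requests / (minutes * 60)


def per_pair_rate(requests_per_second: float, ha_pairs: int) -> float:
    """Requests per second each HA pair would see under an even split."""
    return requests_per_second / ha_pairs


if __name__ == "__main__":
    avg = average_rate(1.14e9, 120)
    peak_per_pair = per_pair_rate(220_000, 14)
    print(f"average over event: {avg:,.0f} requests/sec")
    print(f"peak per HA pair (even split): {peak_per_pair:,.0f} requests/sec")
```

This gives an average of about 158,000 requests/sec across the full 120 minutes, consistent with the sustained 200K+ plateau, and a peak of roughly 15,700 requests/sec per HA pair if the load were evenly shared.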
  • 43. Troubleshooting & Monitoring
  • 44. Troubleshooting: General
    Signs your NetScaler is getting overloaded:
      Watch for xoff frames on the switch port; they indicate the NS is having trouble handling packets.
      Watch for buffer overruns; they may indicate the need for additional interfaces to expand the buffer pool.
    nstrace.sh issues:
      Not always fully capturing packets.
      Relative frame timestamps may have a negative value due to the capture mechanism.
      Additional threads were added in 10.x, which should help.
  • 45. Monitoring: SNMP OIDs
    In addition to the usual OIDs, we have found these very useful to warn of potential problems.
      ifTotXoffSent - .1.3.6.1.4.1.5951.4.1.1.54.1.43
      ifnicTxStalls - .1.3.6.1.4.1.5951.4.1.1.54.1.45
      ifErrRxNoBuffs - .1.3.6.1.4.1.5951.4.1.1.54.1.30
      ifErrTxNoNSB - .1.3.6.1.4.1.5951.4.1.1.54.1.31
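These counters are most useful as deltas between polls rather than as absolute values. The sketch below compares two polls and reports any watched counter that increased. How the values are collected (an SNMP library, snmpget, or an existing poller) is left out on purpose, and the dictionary layout and function name are hypothetical.

```python
# Sketch: flag watched NetScaler interface counters that rose between polls.
# The OIDs come from the slide; fetching the values is out of scope here.

WATCHED_OIDS = {
    "ifTotXoffSent": ".1.3.6.1.4.1.5951.4.1.1.54.1.43",
    "ifnicTxStalls": ".1.3.6.1.4.1.5951.4.1.1.54.1.45",
    "ifErrRxNoBuffs": ".1.3.6.1.4.1.5951.4.1.1.54.1.30",
    "ifErrTxNoNSB": ".1.3.6.1.4.1.5951.4.1.1.54.1.31",
}


def rising_counters(previous: dict, current: dict) -> dict:
    """Return {counter_name: delta} for watched counters that increased."""
    alerts = {}
    for name in WATCHED_OIDS:
        if name in previous and name in current:
            delta = current[name] - previous[name]
            if delta > 0:  # a negative delta is treated as a counter reset
                alerts[name] = delta
    return alerts


if __name__ == "__main__":
    earlier = {"ifTotXoffSent": 10, "ifnicTxStalls": 0, "ifErrRxNoBuffs": 5}
    later = {"ifTotXoffSent": 42, "ifnicTxStalls": 0, "ifErrRxNoBuffs": 5}
    print(rising_counters(earlier, later))
```

Any non-empty result is worth a look, since a rising xoff or no-buffer counter is exactly the overload signal described on the troubleshooting slide.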
  • 46. Future work
    What our team is working on over the next 12 months:
      IPFIX and AppFlow monitoring.
      Add NetScaler interfaces to increase buffer pool.
      Custom tcpProfiles for specific applications like ad serving.
      Move to route-based, active/active architecture.
      Start to roll out Jumbo Frames & WebSockets.
      Investigating OpenFlow/SDN.
      Replace current tools with a completely new tool set based on Trigger (to be open sourced).
  • 47. AOL is Hiring!
    As you have seen, we are NOT just your parents' dialup anymore.
    http://corp.aol.com/careers
  • 49. Appendix: AOL custom tcpProfile configs
    add ns tcpProfile aol_vip_std_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 0 -nagle DISABLED -ackOnPush ENABLED -maxBurst 10 -initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0 -pktPerRetx 2 -minRTO 400 -slowStartIncr 4 -bufferSize 8190
    add ns tcpProfile aol_vip_dialup-client_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 0 -nagle Enabled -ackOnPush ENABLED -maxBurst 4 -initialCwnd 4 -delayedAck 50 -oooQSize 100 -maxPktPerMss 0 -pktPerRetx 2 -minRTO 500 -slowStartIncr 2 -bufferSize 8190
    add ns tcpProfile aol_vip_mobile-client_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 0 -nagle Enabled -ackOnPush ENABLED -maxBurst 10 -initialCwnd 6 -delayedAck 50 -oooQSize 100 -maxPktPerMss 0 -pktPerRetx 2 -minRTO 1000 -slowStartIncr 4 -bufferSize 8190
    add ns tcpProfile aol_vip_server-2-vserver_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 5 -nagle Enabled -ackOnPush ENABLED -maxBurst 10 -initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0 -pktPerRetx 2 -minRTO 100 -slowStartIncr 4 -bufferSize 8190
    add ns tcpProfile aol_svc_std_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 3 -nagle DISABLED -ackOnPush ENABLED -maxBurst 10 -initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0 -pktPerRetx 2 -minRTO 100 -slowStartIncr 4 -bufferSize 8190
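Keeping profiles as data makes it easier to see what differs between them and to regenerate the commands consistently. The sketch below holds two of the appendix profiles as dictionaries, renders the add command using only the flags shown above, and lists the settings that differ. The helper names are hypothetical and this is not an AOL or Citrix tool.

```python
# Sketch: render `add ns tcpProfile` commands from data and diff two profiles.
# Values are copied from the appendix configs above.

STD = {
    "WS": "ENABLED", "SACK": "ENABLED", "WSVal": 0, "nagle": "DISABLED",
    "ackOnPush": "ENABLED", "maxBurst": 10, "initialCwnd": 10,
    "delayedAck": 100, "oooQSize": 64, "maxPktPerMss": 0, "pktPerRetx": 2,
    "minRTO": 400, "slowStartIncr": 4, "bufferSize": 8190,
}

# The dial-up profile, expressed as overrides on the standard profile.
DIALUP = dict(STD, nagle="ENABLED", maxBurst=4, initialCwnd=4, delayedAck=50,
              oooQSize=100, minRTO=500, slowStartIncr=2)


def render_add_command(name: str, settings: dict) -> str:
    """Build the NSCLI command line for one profile."""
    flags = " ".join(f"-{flag} {value}" for flag, value in settings.items())
    return f"add ns tcpProfile {name} {flags}"


def changed_settings(base: dict, other: dict) -> dict:
    """Return {flag: (base_value, other_value)} for settings that differ."""
    return {flag: (base[flag], other[flag])
            for flag in base
            if flag in other and base[flag] != other[flag]}


if __name__ == "__main__":
    print(render_add_command("aol_vip_dialup-client_tcpprofile", DIALUP))
    for flag, (old, new) in changed_settings(STD, DIALUP).items():
        print(f"-{flag}: {old} -> {new}")
```

Run against the appendix values, the diff shows that the dial-up profile also changes -nagle, -initialCwnd and -oooQSize, in addition to the four settings listed on the dial client slide.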
  • 50. Appendix: AOL custom tcpParam Config
    set ns tcpParam -WS Enabled -WSVal 2 -SACK Enabled -maxBurst 6 -initialCwnd 4 -delayedAck 100 -downStateRST DISABLED -nagle DISABLED -limitedPersist ENABLED -oooQSize 64 -ackOnPush ENABLED -maxPktPerMss 0 -pktPerRetx 2 -minRTO 100 -slowStartIncr 2
  • 51. Appendix: NSCLI Show Commands
    tcpParam NSCLI Commands
      Show all current values
        sh ns tcpparam -format old -level verbose
        sh ns tcpparam -level verbose
    tcpProfile NSCLI Commands
      Show all current values
        sh ns tcpprofile -format old -level verbose
        sh ns tcpprofile <profile_name> -level verbose
  • 52. Appendix: tcpParam NSCLI Commands
    Set AOL custom tcpParam
      set ns tcpParam -WS Enabled -WSVal 2 -SACK Enabled -maxBurst 6 -initialCwnd 6 -delayedAck 100 -downStateRST DISABLED -nagle DISABLED -limitedPersist ENABLED -oooQSize 64 -ackOnPush ENABLED -maxPktPerMss 0 -pktPerRetx 2 -minRTO 100 -slowStartIncr 2
    Set a specific value
      set ns tcpParam -WS enabled
    Changing tcpparam modifies the same values in the default tcpprofile.
  • 53. Appendix: tcpProfile NSCLI Commands
    Creating new profiles
      add ns tcpProfile aol_vip_std_tcpprofile -WS ENABLED -SACK ENABLED -WSVal 0 -nagle DISABLED -ackOnPush ENABLED -maxBurst 10 -initialCwnd 10 -delayedAck 100 -oooQSize 64 -maxPktPerMss 0 -pktPerRetx 2 -minRTO 400 -slowStartIncr 4 -bufferSize 8190 -mss 0
    Changing a value
      set ns tcpProfile aol_vip_std_tcpprofile -WSVal 3
    Applying Profiles
      set cs vserver <vservername:port> -tcpprofile <profilename>
      set cs vserver www_aol_com:80 -tcpprofile aol_vip_std_tcpprofile
      set lb vserver <vservername:port> -tcpprofile <profilename>
      set service <servicename:port> -tcpprofile <profilename>
    Resetting to Default
      unset cs vserver <vservername:port> -tcpprofile
      unset lb vserver <vservername:port> -tcpprofile
      unset service <servicename:port> -tcpprofile
  • 54. Explanation of TCPProfile variables (v9.2 & 9.3)
    Each entry lists: Variable (Default / Min / Max), then the Description.

    -name (Default: ---- / Min: ---- / Max: ----)
      The name for a TCP profile. A TCP profile name can be from 1 to 127 characters and must begin with a letter, a number, or the underscore symbol (_). Other characters allowed after the first character in a name are the hyphen, period, pound sign, space, at sign, and equals.
    -WS (Default: Disabled / Min: ---- / Max: ----)
      Enable or disable window scaling. If Disabled, Window Scaling is disabled for both sides of the conversation.
    -WSVal (Default: 4 / Min: 0 / Max: 8)
      The factor used to calculate the new window size.
    -maxBurst (Default: 6 / Min: 1 / Max: 255)
      The maximum number of TCP segments allowed in a burst. The higher this value, the more frames can be sent at one time.
    -initialCwnd (Default: 4 / Min: 2 / Max: 44)
      The initial maximum upper limit on the number of TCP packets that can be outstanding on the TCP link to the server. As of 9.2.50.1, the maximum was raised from 6 to 44.
    -delayedAck (Default: 100 / Min: 10 / Max: 300)
      The time-out for TCP delayed ACK, in milliseconds.
    -oooQSize (Default: 64 / Min: 0 / Max: 65535)
      The maximum size of the out-of-order packets queue. A value of 0 means infinite. The name is a misnomer: this buffer contains sent frames that are awaiting ACKs, or received frames that are not sequential (meaning some packets are missing) due to SACK.
    -maxPktPerMss (Default: 0 / Min: 0 / Max: 512)
      The maximum number of TCP packets allowed per maximum segment size (MSS). A value of 0 means that no maximum is set.
    -pktPerRetx (Default: 1 / Min: 1 / Max: 512)
      The maximum limit on the number of packets that should be retransmitted on receiving a "partial ACK". Partial ACKs are ACKs indicating that not all outstanding frames were acked.
    -minRTO (Default: 100 / Min: 10 / Max: 64000)
      The minimum retransmission timeout (RTO) value, in milliseconds. The NetScaler supports New Reno and conforms to RFC 2001 and RFC 5827 (?). Since the NetScaler does not use TCP timestamping, these values do not correspond to actual propagation delays.
    -slowStartIncr (Default: 2 / Min: 1 / Max: 100)
      Multiplier that determines the rate at which slow start increases the size of the TCP transmission window after each ACK of a successful transmission.
    -SACK (Default: Disabled / Min: ---- / Max: ----)
      Enable or disable selective acknowledgement (SACK). There is NO reason this should be off.
    -nagle (Default: Disabled / Min: ---- / Max: ----)
      Enable or disable the Nagle algorithm on TCP connections. When enabled, it reduces the number of small segments by combining them into one packet. Primary use is on slow or congested links such as mobile or dial.
    -ackOnPush (Default: Enabled / Min: ---- / Max: ----)
      Send immediate acknowledgement (ACK) on receipt of TCP packets with the PUSH bit set.
    -mss (v9.3) (Default: 0 / Min: 0 / Max: 1460)
      Maximum segment size. The default value 0 uses the global setting in "set tcpparam"; the maximum value is 1460.
    -bufferSize (v9.3) (Default: 8190 / Min: ---- / Max: 4194304)
      The value that you set is the minimum value that is advertised by the NetScaler appliance, and this buffer size is reserved when a client initiates a connection that is associated with an endpoint-application function, such as compression or SSL. The managed application can request a larger buffer, but if it requests a smaller buffer, the request is not honored, and the specified buffer size is used.
      If the TCP buffer size is set both at the global level and at the entity level (virtual server or service level), the buffer specified at the entity level takes precedence. If the buffer size that you specify for a service is not the same as the buffer size that you specify for the virtual server to which the service is bound, the NetScaler appliance uses the buffer size specified for the virtual server for the client-side connection and the buffer size specified for the service for the server-side connection. However, for optimum results, make sure that the values specified for a virtual server and the services bound to it have the same value.
      The buffer size that you specify is used only when the connection is associated with endpoint-application functions, such as SSL and compression.
      Note: A high TCP buffer value could limit the number of connections that can be made to the NetScaler appliance.