• Save
Performance Aware SDN, LSPE talk
Upcoming SlideShare
Loading in...5
×
 

Performance Aware SDN, LSPE talk

on

  • 1,010 views

 

Statistics

Views

Total Views
1,010
Views on SlideShare
1,010
Embed Views
0

Actions

Likes
1
Downloads
0
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Performance Aware SDN, LSPE talk Performance Aware SDN, LSPE talk Presentation Transcript

  • Performance Aware SDNhttp://www.meetup.com/SF-Bay-Area-Large-Scale-Production-Engineering/Peter PhaalInMon Corp.June 2013Thursday, June 13, 13
  • Why monitor performance?“If you can’t measure it, you can’t improve it”Lord KelvinThursday, June 13, 13
  • BalancerLoad ServerServerWebDatabaseApplicationNetworkMemcacheMemcacheServerServerWebServerApplicationServerDatabaseBalancerLoad BalancerNetworking in large, scale-out, multi-tiered sites• Large number of servers in each pool• Servers constantly added/removed• Network performance is critical• scale-out applications dependent on network performance• potential for propagating failures between tiersThursday, June 13, 13
  • http://www.osa.org/osaorg/media/osa.media/CorporateGateway/ExecutiveForum2013/2013_Executive_Forum_Keynote_Najam_Ahmad.pdfNajam Ahmad, Vice President, Infrastructure, FacebookOSA 2013 keynoteSDN brings network under software controlExtends DevOps tool stack to include network visibility and controlThursday, June 13, 13
  • Feedback controlMeasureControlSystemdesiredoutputmeasuredoutputThursday, June 13, 13
  • Controllability and ObservabilityBasic concept is simple, a stable feedback control system requires:1. ability to influence all important system states (controllable)2. ability to monitor all important system states (observable)Thursday, June 13, 13
  • It’s hard to stay on the road if you can’t see theroad, or keep to the speed limit without aspeedometerIt’s hard to stay on the road or maintainspeed if your brakes, engine or steering failControllability and Observability driving exampleObservabilityControllabilityStates location, speed, direction, ...Dense Tule fog in Bakersfield, CAThursday, June 13, 13
  • Effect of delay on stabilityMeasurement delay Planning delayTimeConfiguration delayDisturbance Response delayEffectLoop delayDDoS launched Identify target, attacker Black hole, mark, re-route? Switch CLI commands Route propagation Traffic droppedComponents of loop delaye.g. Slow reaction time causestired / drunk / distracteddriver to weave, very slowreaction time and they leavethe roadThursday, June 13, 13
  • Observability using sFlow standard“In God we trust. All others bring data.”Dr. Edwards DemingThursday, June 13, 13
  • Industry standard measurement technology integrated in switcheshttp://www.sflow.orgThursday, June 13, 13
  • Open source agents for hosts, hypervisors and applicationsHost sFlow project (http://host-sflow.sourceforge.net) is centerof an ecosystem of related open source projects embeddingsFlow in popular operating systems and applicationsThursday, June 13, 13
  • Network (maintained by hardware in network devices)- MIB-2 ifTable: ifInOctets, ifInUcastPkts, ifInMulticastPkts, ifInBroadcastPkts, ifInDiscards, ifInErrors, ifUnkownProtos,ifOutOctets, ifOutUcastPkts, ifOutMulticastPkts, ifOutBroadcastPkts, ifOutDiscards, ifOutErrorsHost (maintained by operating system kernel)- CPU: load_one, load_five, load_fifteen, proc_run, proc_total, cpu_num, cpu_speed, uptime, cpu_user, cpu_nice,cpu_system, cpu_idle, cpu_wio, cpu_intr, cpu_sintr, interupts, contexts- Memory: mem_total, mem_free, mem_shared, mem_buffers, mem_cached, swap_total, swap_free, page_in, page_out,swap_in, swap_out- Disk IO: disk_total, disk_free, part_max_used, reads, bytes_read, read_time, writes, bytes_written, write_time- Network IO: bytes_in, packets_in, errs_in, drops_in, bytes_out, packet_out, errs_out, drops_outApplication (maintained by application)- HTTP: method_option_count, method_get_count, method_head_count, method_post_count, method_put_count,method_delete_count, method_trace_count, method_connect_count, method_other_count, status_1xx_count,status_2xx_count, status_3xx_count, status_4xx_count, status_5xx_count, status_other_count- Memcache: cmd_set, cmd_touch, cmd_flush, get_hits, get_misses, delete_hits, delete_misses, incr_hits, incr_misses,decr_hists, decr_misses, cas_hits, cas_misses, cas_badval, auth_cmds, auth_errors, threads, con_yields,listen_disabled_num, curr_connections, rejected_connections, total_connections, connection_structures, evictions,reclaimed, curr_items, total_items, bytes_read, bytes_written, bytes, limit_maxbytesStandard countersThursday, June 13, 13
  • Simple- standard structures - densely packed blocks of counters- extensible (tag, length, value)- RFC 1832: XDR encoded (big endian, quad-aligned, binary) - simple to encode/decode- unicast UDP transportMinimal configuration- collector address- polling intervalCloud friendly- flat, two tier architecture: many embedded agents → central “smart” collector- sFlow agents automatically start sending metrics on startup, automatically discovered- eliminates complexity of maintaining polling daemons (and associated configurations)Scaleable push protocolThursday, June 13, 13
  • • Counters tell you there is aproblem, but not why.• Counters summarizeperformance by dropping highcardinality attributes:- IP addresses- URLs- Memcache keys• Need to be able to efficientlydisaggregate counter byattributes in order tounderstand root cause ofperformance problems.• How do you get this datawhen there are millions oftransactions per second?Counters aren’t enoughWhy the spike in traffic?(100Gbit link carrying 14,000,000 packets/second)Thursday, June 13, 13
  • • Random sampling is lightweight• Critical path roughly cost ofmaintaining one counter:if(--skip == 0) sample();• Sampling is easy to distributeamong modules, threads,processes without anysynchronization• Minimal resources required tocapture attributes of sampledtransactions• Easily identify top keys,connections, clients, servers,URLs etc.• Unbiased results with knownaccuracyBreak out traffic by client, server and port(graph based on samples from100Gbit link carrying 14,000,000 packets/second)sFlow also exports random samplesThursday, June 13, 13
  • Integrated data modelPacket HeaderPacket HeaderSource DestinationTCP/UDP Socket TCP/UDP SocketMAC Address MAC AddressSampled Packet HeadersI/F CountersPower, Temp.NETWORKHOSTCPUMemoryI/OPower, Temp.Adapter MACsAPPLICATIONSampled TransactionsTransaction CountersTCP/UDP SocketIndependent agents sFlow analyzer joins data for integrated viewThursday, June 13, 13
  • Virtual ServersApplicationsApache/PHPTomcat/JavaMemcachedVirtual NetworkServersNetworkEmbedded monitoring of allswitches, all servers, allapplications, all the timeConsistent measurementsshared between multiplemanagement toolsComprehensive visibilityThursday, June 13, 13
  • Software Defined Networking“You can’t control what you can’t measure”Tom DeMarcoThursday, June 13, 13
  • MonitorFeedback control loop with sFlow and OpenFlowlow configuration delaylow measurement delayTogether, sFlow and OpenFlow provide the observability andcontrollability to enable SDN applications targeting low latencycontrol problems like load balancing and DDoS mitigationlow planning delaySDN applicationThursday, June 13, 13
  • Network OSApplicationOpen APIsApplicationApplicationData PlaneControl PlaneConfiguration Forwarding VisibilityNETCONF/OF-ConfigOpen APIsHostssFlow adds actionable visibility to SDN stackActionable = complete + timelyThursday, June 13, 13
  • REST APIMetricsFlow DefinitionsThresholdsInMonsFlow-RTREST APIOpenFlowControllerLoad Balancer DDoS ProtectionREST ApplicationsOpen “Southbound” APIsData PlaneControl PlaneHostsOpen “Northbound” APIsSDN ApplicationsSDN feedback control applicationsThursday, June 13, 13
  • ovs-vsctl set-controller br0 tcp:10.0.0.1:6633ovs-vsctl -- --id=@sflow create sflow agent=eth0 target=”10.0.0.1:6343” sampling=1000 polling=20 -- -- set bridge br0 sflow=@sflowConnect switches to central control planee.g connect Open vSwitch to OpenFlow controllere.g. connect Open vSwitch to sFlow analyzerMinimal configuration to connect switches tocontrollers, intelligence resides in external softwareThursday, June 13, 13
  • Components of a DDoS flood attack1. Command to attack target sent overcontrol network2. Large number of compromised hostsstart sending traffic to target3.Traffic converges on access link,overwhelming capacity and denyingaccessThursday, June 13, 13
  • Define flow keysDDoS Protectiondefine address groupsdefine flowsdefine thresholdswhile(running) {receive threshold eventmonitor flowdeploy controlmonitor flowrelease control}OpenFlowControllerREST APIsFlow-RTREST API12346587REST operation flow chartThursday, June 13, 13
  • curl -H "Content-Type:application/json" -X PUT --data "{external:[0.0.0.0/0], internal:[10.0.0.0/8]}" http://localhost:8008/group/json1. Define address groupscurl -H "Content-Type:application/json" -X PUT --data "{keys:ipsource,ipdestination, value:frames, filter:sourcegroup=external&destinationgroup=internal}" http://localhost:8008/flow/incoming/json2. Define flowscurl -H "Content-Type:application/json" -X PUT --data "{metric:incoming, value:1000}" http://localhost:8008/threshold/incoming/json3. Define thresholdscurl "http://localhost:8008/events/json?eventID=4&timeout=60"4. Receive threshold eventsThursday, June 13, 13
  • 5. Monitor flowcurl http://localhost:8008/metric/10.0.0.16/4.incoming/json[{"agent": "10.0.0.16","dataSource": "4","metricName": "incoming","metricValue": 1582.93965044338071,"topKeys": [{"key": "192.168.1.1,10.0.0.151","updateTime": 1357169662500,"value": 1582.93965044338071},{"key": "192.168.1.4,10.0.0.151","updateTime": 1357169665500,"value": 46.552918457198984}],"updateTime": 1357169665500}]6. Deploy controlcurl -d {"switch": "00:00:00:00:00:00:00:01","name":"ddos-1", "cookie":"0", "priority":"32768","ingress-port":"4","active":"true"}http://localhost:8080/wm/staticflowentrypusher/jsonThursday, June 13, 13
  • thresholdattack startsdetectedcontrol implemented attack eliminatedhttp://blog.sflow.com/2013/03/ddos.htmlBeforeAfterDDoS mitigation resultspackets/secondpackets/secondsustained 6M packets/second attack(30 Gigabits/second)http://packetpushers.net/openflow-1-0-actual-use-case-rtbh-of-ddos-traffic-while-keeping-the-target-online/Also: http://blog.sflow.com/2013/05/controlling-large-flows-with-openflow.htmlThursday, June 13, 13
  • ECMP/LAG multi-path traffic distributionhttp://static.usenix.org/event/nsdi10/tech/full_papers/al-fares.pdfindex = hash(packet fields) % linkgroup.sizeselected_link = linkgroup[index]Hash collisions reduce effective cross sectional bandwidth1:1 subscription ratio doesn’t eliminate blocking, collisionprobabilities are high, even with large numbers of pathsThursday, June 13, 13
  • Birthday ParadoxWhat is the chance that at least two people in a room will share a birthday?50/50 chance with 23 people, virtual certainty with the 60 people.This is a“paradox” because the probability seems remarkably high considering that thereare 365 possible birthdays (366 if you include Feb 29) and 23 people representsjust over 6% of the theoretical maximum and 60 people is only 16%.http://en.wikipedia.org/wiki/Birthday_problemECMP/LAG/MLAG collision probabilities are surprisingly highThursday, June 13, 13
  • http://research.microsoft.com/en-us/UM/people/srikanth/data/imc09_dcTraffic.pdfSmall number of long lived large flows responsible for bulk of loadUse SDN load balancing applications to detect andeliminate collisions by adjusting forwarding pathsLoad balancing large flowsThursday, June 13, 13
  • colliding flowshttp://blog.sflow.com/2013/01/load-balancing-lagecmp-groups.htmlhttp://blog.sflow.com/2013/03/ecmp-load-balancing.htmlhttp://blog.sflow.com/2013/02/sdn-and-large-flows.htmlhttps://datatracker.ietf.org/doc/draft-ietf-opsawg-large-flow-load-balancing/Not just ECMP, also LAG/MLAG and WAN etc.non-colliding flowshttp://blog.sflow.com/2013/06/large-flow-detection.htmlhttp://blog.sflow.com/2013/06/flow-collisions.htmlThursday, June 13, 13
  • Memcached hot keyshttp://codeascraft.com/2012/12/13/mctop-a-tool-for-analyzing-memcache-get-traffic/Server BServer CServer AClient 1Client 2Client 3Client 4Memcache clusterMemcache clientsoverloaded server linkhttp://blog.sflow.com/2013/01/memcache-hot-keys-and-cluster-load.html• Interesting parallel with ECMP/LAG hash collisions• Demonstrates linkage between network and application performance• Can monitor cluster wide Memcached hot/missed keys with sFlow• Possible SDN use cases:• Server placement informed by visibility into network topology, loads etc.• Use SDN to shorten paths, reducing latency and packet loss• Avoid packet loss by steering packets around congested link• Extension of OpenFlow to optical circuit switches allows network to berewired for actual demandThursday, June 13, 13
  • Next stepsOrganizational: break down networking silo- learn more about networking- integrate networking into DevOps team- think about observability and controllability whenpurchasing equipment and architecting servicesStrategic: Engage developer communities- share operational expertise with SDN community- help specify northbound APIs so that they deliverfunctionality needed to integrate networking intoDevOps stackThursday, June 13, 13
  • Questions?Thursday, June 13, 13
  • Backup slidesThursday, June 13, 13
  • • OpenFlow• Hybrid combined OpenFlow (using NORMAL action)• Puppet• BGP policy• RESTful API to switches• NETCONF• Optical circuit switching- OpenFlow extensions- SOCM (Service-Based Optical Connection Management)Programatic control of switcheshttp://www.nanog.org/sites/default/files/wed.general.brainslug.lapukhov.20.pdfhttp://blog.sflow.com/2013/03/pragmatic-software-defined-networking.htmlhttp://www.nanog.org/sites/default/files/wed.general.socm_.samberg.6.pdfThursday, June 13, 13
  • packetsdecode hash sendflow cache flushsampleNetFlow/IPFIXsendpolli/f counterssample• sFlow exports packet samples immediately• sFlow also exports interface counters• NetFlow exports flow data on end of flow, active-timeout or inactive-timeout• NetFlow data generation requires significant resources on switch that canbe better applied to increase size of forwarding table(s)• OpenFlow metering has similar architecture to NetFlow and similarlimitationssFlow and NetFlow/IPFIX in a switchThursday, June 13, 13
  • InMon sFlow-RTactive timeout active timeoutNetFlowOpenvSwitchSolarWinds Real-Time NetFlow Analyzer• sFlow does not use flow cache, so realtime charts more accurately reflect traffic trend• NetFlow spikes caused by flow cache active-timeout for long running connectionsRapid detection of large flowsFlow cache active timeout delays large flow detection,limits value of signal for real-time control applicationshttp://blog.sflow.com/2013/01/rapidly-detecting-large-flows-sflow-vs.htmlThursday, June 13, 13