
Slash n: Technical Session 9 - DiFIT - Distributed Fault Injection Testing - Rahul Karmshil, Shwet Shashank



Published in: Technology


  1. Distributed Fault Injection Testing (DiFIT) - The Flip way. Rahul Karmshil, Shwet Shashank
  2. Agenda
     • Fault Injection testing – why we need it?
     • Fault Scenarios and the fallacies
     • DiFIT – High level architecture
     • DiFIT – Tech Stack
     • Q&A
  3. Fault Injection testing – why we need it?
     • Distributed systems are unreliable!
     The 8 fallacies of distributed computing - James Gosling
       1. The network is reliable.
       2. Latency is zero.
       3. Bandwidth is infinite.
       4. The network is secure.
       5. Topology doesn't change.
       6. There is one administrator.
       7. Transport cost is zero.
       8. The network is homogeneous.
     Application fault examples
       1. Target service instance(s) or cluster down
       2. High Availability scenarios for infrastructure pieces (Best Effort, NSPOF, Session Failover)
       3. Impact of one-off resource-intensive operations like large report generation, garbage collection, crons
     System fault examples
       1. Network timeout
       2. Disk full
       3. FD reaching limits
       4. Network interface is down
  4. How to test for faults?
     Service 1 ✗ Service 2
     Challenges in manual testing:
     • Know the commands
     • How to test operations like bringing down network?
     • Repeatability
     Wouldn’t it be easier if we had something like –
     /v0.1/services/service2/stop [PUT]
     Payload: {"service_port":80,"host":"","forceful":0}
     Response:
       204 OK
       404, SERVICE_NOT_FOUND
       400, COULD_NOT_STOP_SERVICE
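The endpoint proposed on slide 4 could be exercised along the following lines. This is a minimal sketch using only Ruby's standard library, not DiFIT's own client: the base URL `http://difit.example:8080`, the helper names `build_stop_request` and `interpret_stop_response`, and the JSON `Content-Type` header are assumptions made for illustration. Only the path, the payload fields, and the three response codes come from the slide.

```ruby
require "json"
require "net/http"
require "uri"

# Builds (but does not send) the PUT request for the proposed stop endpoint.
# base_url, the keyword defaults, and the Content-Type header are assumptions.
def build_stop_request(base_url, service, host: "", service_port: 80, forceful: 0)
  uri = URI.join(base_url, "/v0.1/services/#{service}/stop")
  request = Net::HTTP::Put.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(service_port: service_port, host: host, forceful: forceful)
  request
end

# Maps the status codes listed on the slide to symbols a test can assert on.
def interpret_stop_response(code)
  case code.to_i
  when 204 then :stopped
  when 404 then :service_not_found
  when 400 then :could_not_stop_service
  else :unexpected
  end
end

request = build_stop_request("http://difit.example:8080", "service2")
puts request.method
puts request.path
puts request.body
```

To actually send the request, it could be passed to `Net::HTTP.start(host, port) { |http| http.request(request) }` and the returned response's `code` handed to `interpret_stop_response`.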
  5. DiFIT – The Complete Picture (diagram; overlaid slide titles: A Typical Backend, How to break?, What can DiFIT test?, An example flow). Labels shown: HTTP Request, UI Controller, RESTful Controller, X-Unit, Website, OMS, Supply Chain, Fulfillment, Logistics, Message Queue, Retry Queue, DiFIT Agent (twice), and several ✗ marks.
  6. DiFIT - Tech Stack (diagram). Labels shown: DiFIT Server; Module Operations, Network Operations, Infra Operations; DiFIT REST Interface; DiFIT Agent; DiFIT APIs; DiFIT Libraries; TCP/IP; STAF (twice).
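The three operation groups named on slide 6 (module, network, infra) suggest some kind of dispatch from an operation name to a concrete action on the target host. The sketch below is purely illustrative and is not taken from DiFIT: the `OPERATIONS` registry, the `build_command` helper, the operation names, and the choice of shell commands are all assumptions. It only builds an argument list and never executes anything.

```ruby
# Hypothetical registry: the operation categories from the slide, each mapping
# an operation name to a lambda that builds the argv an agent might run.
OPERATIONS = {
  module: {
    stop_service: ->(args) { ["service", args.fetch(:name), "stop"] }
  },
  network: {
    interface_down: ->(args) { ["ip", "link", "set", args.fetch(:interface), "down"] }
  },
  infra: {
    fill_disk: ->(args) { ["fallocate", "-l", args.fetch(:size), args.fetch(:path)] }
  }
}.freeze

# Looks up an operation and returns the command as an argv array.
# Nothing is executed here; unknown categories or operations raise KeyError.
def build_command(category, operation, args = {})
  OPERATIONS.fetch(category).fetch(operation).call(args)
end

p build_command(:module, :stop_service, name: "fulfillment")
```

Keeping the lookup separate from execution means the mapping can be unit-tested without touching a real host.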
  7. Code snippet

         require 'json'
         require 'rest-client'

         def test_relayed_message_is_sidelined_when_target_is_down
           # Setup: bring down the target service
           stop_payload = { :host => "", :forceful => 1, :options => [] }.to_json
           stop_response = RestClient.put(@difit_base_url + "/v0.1/services/fulfillment/stop", stop_payload)
           assert_equal(204, stop_response.code)

           # Business flow
           order_object = OrderFactory.default_order
           order_id = @oms_client.create_order(order_object)
           message_id = @oms_client.find_message_id_by_order_id(order_id)
           wait_till_message_is_relayed(message_id)

           # Verification step
           assert_equal(SIDELINE_STATUS, @oms_client.message_status(message_id), "")

           # Restore: bring up the service
           …
         end
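The snippet on slide 7 leaves its restore step elided. One common way to make sure a stopped service is brought back even when an assertion fails is to wrap the test body in an `ensure` block. The sketch below shows that pattern under stated assumptions: `with_service_down`, the `stop`/`start` client methods, and the `FakeDifit` stand-in are hypothetical names, and the slides do not show how DiFIT restarts a service.

```ruby
# Hypothetical helper: runs the block while a service is stopped, then always
# asks the client to start it again, even if the block raises.
def with_service_down(difit, service)
  difit.stop(service)
  begin
    yield
  ensure
    difit.start(service)
  end
end

# In-memory stand-in for a DiFIT client, used only to make the sketch runnable.
class FakeDifit
  attr_reader :calls

  def initialize
    @calls = []
  end

  def stop(service)
    @calls << [:stop, service]
  end

  def start(service)
    @calls << [:start, service]
  end
end

difit = FakeDifit.new
with_service_down(difit, "fulfillment") do
  # The business flow and verification steps would go here.
end
p difit.calls
```

Because the restart sits in `ensure`, a failing verification step does not leave the environment broken for the next test.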
  9. The best way to avoid failure is to fail constantly.