Testing Network Conditions with ToxiProxy
Celso Crivelaro
@celsocrivelaro
Tech Lead @
Professor @
Past: Monolithic Applications
Today: Distributed Systems
[Diagram: components of a distributed system: External API, Service, Server, App, DB, API, Mobile, Mobile App, Web App]
Share Resources
Communication over TCP/IP
Network-dependent Apps
https://engineering.riotgames.com/news/fixing-internet-real-time-applications-part-i
Network fails
“A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.”
Leslie Lamport
Network Model
"Ozzy" Model?
[Diagram: hosts 200.124.124.1, 200.0.124.5, 200.5.34.12, 200.3.5.51 and 200.3.5.52]
200.124.124.1:80 -> 200.3.5.52:80
Networking Problems
No connection: no connection between parts
Latency: time for packets to go and return
Timeout: the other side stops responding
Bandwidth: limited throughput, in Mb/s
Slow close: the connection is slow to close
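The timeout case from the list above (a peer that stops responding) can be reproduced with nothing but local sockets. The sketch below is a minimal illustration, not part of the talk: `start_silent_server` and `call_with_timeout` are hypothetical helper names, and the server simply accepts a connection and never answers.

```python
import socket
import threading


def start_silent_server():
    """Start a local TCP server that accepts one connection and never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    stop = threading.Event()

    def serve():
        conn, _ = server.accept()
        stop.wait()  # keep the connection open without sending anything
        conn.close()
        server.close()

    threading.Thread(target=serve, daemon=True).start()
    return server.getsockname()[1], stop


def call_with_timeout(port, timeout_s):
    """Send a request and return the reply, or None if the peer stays silent."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout_s) as client:
        client.sendall(b"ping\n")
        try:
            return client.recv(1024)
        except socket.timeout:
            return None
```

Without the `timeout` argument the client would block indefinitely, which is the behaviour a user perceives as a frozen app.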
What are we testing?
How your app responds to network errors
User perception of the network problem
Business rule for the problem
Technical response to the problem
Resiliency
How to simulate?
Hard to modify a staging env
Hard to automate
Low-level tools -> for sysadmins
No APIs for devs or QA
http://toxiproxy.io
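[Diagram: APP <-TCP-> ToxiProxy <-TCP-> SERVICE]
[Diagram: APP <-TCP-> ToxiProxy <-TCP-> SERVICE, with toxics (latency, bandwidth, ...) applied]
[Diagram: APP <-TCP-> ToxiProxy <-TCP-> SERVICE 1 and SERVICE 2, configured through the HTTP API]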
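A rough sketch of driving the HTTP API from Python, using only the standard library. It assumes a Toxiproxy server listening on its default API address (localhost:8474); the proxy name `service_1`, the listen and upstream addresses, and the attribute values are made up for illustration. Note that Toxiproxy's bandwidth `rate` is expressed in KB/s.

```python
import json
import urllib.request

TOXIPROXY_URL = "http://localhost:8474"  # assumption: default Toxiproxy API address


def proxy_payload(name, listen, upstream):
    """Body for POST /proxies: a TCP proxy from `listen` to `upstream`."""
    return {"name": name, "listen": listen, "upstream": upstream, "enabled": True}


def toxic_payload(name, toxic_type, attributes, stream="downstream", toxicity=1.0):
    """Body for POST /proxies/<proxy>/toxics."""
    return {
        "name": name,
        "type": toxic_type,
        "stream": stream,
        "toxicity": toxicity,
        "attributes": attributes,
    }


# Illustrative values only; times are in milliseconds, rate in KB/s.
TOXICS = {
    "latency": toxic_payload("slow", "latency", {"latency": 1000, "jitter": 100}),
    "timeout": toxic_payload("hang", "timeout", {"timeout": 5000}),
    "bandwidth": toxic_payload("narrow", "bandwidth", {"rate": 128}),
    "slow_close": toxic_payload("linger", "slow_close", {"delay": 2000}),
}


def post_json(path, payload, base_url=TOXIPROXY_URL):
    """POST a JSON body to the Toxiproxy API and return the decoded response."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def main():
    """Requires a running Toxiproxy server; names and ports are hypothetical."""
    post_json("/proxies", proxy_payload("service_1", "127.0.0.1:21212", "127.0.0.1:8080"))
    post_json("/proxies/service_1/toxics", TOXICS["latency"])
```

After `main()` runs, the app would connect to 127.0.0.1:21212 instead of the real service and see roughly one extra second of latency on responses. The "no connection" case needs no toxic: it is simulated by disabling the proxy itself.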
How to automate
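One possible way to automate this in a test suite is a context manager that adds a toxic before the test body and removes it afterwards. This is a sketch under assumptions: a Toxiproxy server on localhost:8474, an already created proxy, and a hypothetical helper name `toxic`. The `send` parameter exists only so the HTTP call can be swapped out.

```python
import json
import urllib.request
from contextlib import contextmanager

TOXIPROXY_URL = "http://localhost:8474"  # assumption: default Toxiproxy API address


def http_send(method, path, payload=None):
    """Send one request to the Toxiproxy API; returns decoded JSON or None."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        TOXIPROXY_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        body = response.read()
    return json.loads(body) if body else None


@contextmanager
def toxic(proxy, name, toxic_type, attributes, send=http_send):
    """Apply a toxic to `proxy` for the duration of the with-block."""
    send("POST", f"/proxies/{proxy}/toxics", {
        "name": name,
        "type": toxic_type,
        "stream": "downstream",
        "toxicity": 1.0,
        "attributes": attributes,
    })
    try:
        yield
    finally:
        # Always clean up, even when the test body raises.
        send("DELETE", f"/proxies/{proxy}/toxics/{name}")
```

A test could then wrap the call under test in `with toxic("service_1", "slow", "latency", {"latency": 3000}):` and assert on what the user would see, where `service_1` is whatever proxy name the test setup created.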
Some Solutions
Low timeouts
Retry patterns - Circuit Breaker
Fallbacks
Caching strategies
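The retry, circuit breaker and fallback ideas above can be combined in a few lines. The following is a simplified sketch, not a production implementation; `CircuitBreaker`, `CircuitOpenError` and `call_with_fallback` are illustrative names, and the thresholds are arbitrary.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open, skipping call")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # a success closes the circuit again
        return result


def call_with_fallback(breaker, func, fallback, retries=2):
    """Try func through the breaker, retrying on errors, then use the fallback."""
    for _ in range(retries + 1):
        try:
            return breaker.call(func)
        except CircuitOpenError:
            break  # no point retrying while the circuit is open
        except Exception:
            continue
    return fallback()
```

Here `func` would be the network call (made with a low timeout) and `fallback` something like a cached value or a default response.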
Resiliency Matrix
Apps: App 1, App 2, App 3
Integrations: Database, API, ZIP Checker, External API
Unavailable
Unavailable
Unavailable
Unavailable
Degraded
Available
Available
Not applicable
Not applicable
Unavailable
Degraded
Degraded
If integration fails...
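A resiliency matrix can also be kept as data and checked automatically. The sketch below is only an illustration: the app names, integration names and cell values are hypothetical and are not the ones from the slide, and `probe` stands in for whatever harness breaks an integration and observes the app.

```python
# Hypothetical matrix: (app, failed integration) -> expected app status.
EXPECTED = {
    ("app_a", "database"): "Unavailable",
    ("app_a", "external_api"): "Degraded",
    ("app_b", "database"): "Unavailable",
    ("app_b", "external_api"): "Not applicable",
}


def check_matrix(expected, probe):
    """Compare expected statuses with what probe(app, integration) observes.

    Returns a list of (app, integration, expected, observed) mismatches.
    """
    mismatches = []
    for (app, integration), want in sorted(expected.items()):
        if want == "Not applicable":
            continue  # the app does not use this integration
        got = probe(app, integration)
        if got != want:
            mismatches.append((app, integration, want, got))
    return mismatches
```

An empty result means every app behaved as the matrix says it should when the given integration was made to fail.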
References
http://toxiproxy.io
https://engineering.shopify.com/17489072-building-and-testing-resilient-ruby-on-rails-applications
Thank you!
@celsocrivelaro
Slides at: http://crivelaro.me
