CYNICAL SOFTWARE
Tips for writing software that does not wake you up at 2am
dagi@goodata.com
https://twitter.com/_dagi

Sund...
HOW TO BECOME CYNICAL

• Eat your own dog food
• Strong feedback
• Pager duty (Engineer on duty)
• DevOps

Sunday 17 November 13
http://pragprog.com/book/mnee/release-it

Read an excellent book about cynical software
- stability p...
INTEGRATION POINTS & FAILURE MODES
[Diagram: integration points among frontend services, backend services, load balancers (LB), and databases (DB)]
User user; Projects projects;
1
try {
projects = dao.getProjects(user, OFFSET, LIMIT).await(250, MS);
} catch (FutureTimeo...
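
The fragment above bounds the DAO call with await(250, MS) and turns a timeout into a domain exception. A minimal sketch of the same idea using only the JDK might look like the following. The names BoundedCall, callWithTimeout and ServiceUnavailableException are hypothetical and are not taken from the talk; the talk's own Future API is not shown here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper: runs a remote call with an upper bound on waiting time. */
class BoundedCall {

    /** Hypothetical exception reported to callers when the dependency is unusable. */
    static class ServiceUnavailableException extends RuntimeException {
        ServiceUnavailableException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static <T> T callWithTimeout(Supplier<T> remoteCall, long timeout, TimeUnit unit) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(remoteCall);
        try {
            // Wait at most `timeout`; never block the caller indefinitely.
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Stop waiting. The underlying work may still be running in the pool.
            future.cancel(true);
            throw new ServiceUnavailableException("timed out", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new ServiceUnavailableException("interrupted", e);
        } catch (ExecutionException e) {
            throw new ServiceUnavailableException("call failed", e.getCause());
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // A fast call returns its value.
        System.out.println(callWithTimeout(() -> "projects", 250, TimeUnit.MILLISECONDS));
        // A slow call is abandoned after 250 ms instead of hanging the caller.
        try {
            callWithTimeout(() -> { sleep(2_000); return "late"; }, 250, TimeUnit.MILLISECONDS);
        } catch (ServiceUnavailableException e) {
            System.out.println("unavailable: " + e.getMessage());
        }
    }
}
```

Note that a timeout on the caller's side only frees the caller. Any call made without such a bound, like a blocking DAO call in a loop, can still hang on its own connection timeout.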
“Cynicism is merely the art of seeing things as
they are instead of as they ought to be” [1.]

A cynical software
No intim...
Steady state
Circuit breaker

Bulkheads
Test harness

Handshaking

Fail fast

Timeouts
Decoupling middleware

STABILITY PA...
CIRCUIT BREAKER

[Diagram: circuit breaker at the frontend service; states Closed, Open, Half-Open; timeout; failure counter: 2, 1, 0]


- med...
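
The Closed, Open and Half-Open states named on this slide can be sketched as a small state machine wrapped around an integration point. This is only an illustration under stated assumptions: the class name, the threshold, the open interval and the injectable clock are hypothetical, and the counter here counts failures upward rather than down as on the slide.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Minimal circuit breaker sketch; names and thresholds are illustrative only. */
class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    static class OpenCircuitException extends RuntimeException {
        OpenCircuitException() {
            super("circuit is open, failing fast");
        }
    }

    private final int failureThreshold; // consecutive failures tolerated before opening
    private final long openMillis;      // how long to stay open before allowing a trial call
    private final LongSupplier clock;   // current time in milliseconds, injectable for tests

    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Returns the current state, moving OPEN to HALF_OPEN once the open interval has passed. */
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    /** Synchronized for simplicity; this serializes calls, which a real breaker would avoid. */
    synchronized <T> T call(Supplier<T> protectedCall) {
        if (state() == State.OPEN) {
            throw new OpenCircuitException(); // fail fast without touching the backend
        }
        try {
            T result = protectedCall.get();
            failures = 0;
            state = State.CLOSED; // a successful (trial) call closes the circuit
            return result;
        } catch (RuntimeException e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        CircuitBreaker breaker = new CircuitBreaker(2, 5_000, System::currentTimeMillis);
        for (int i = 0; i < 3; i++) {
            try {
                breaker.call(() -> { throw new IllegalStateException("backend down"); });
            } catch (RuntimeException e) {
                System.out.println(e.getMessage() + " -> " + breaker.state());
            }
        }
    }
}
```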
BULKHEADS
[Diagram: frontend service HTTP client with connection pools separated by bulkheads, calling backend services A, B, ...; label: Hiccup]
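
One way to picture a bulkhead is a separate, bounded budget of concurrent calls per backend, so a hiccup in one backend cannot use up the capacity needed for the others. The sketch below uses a JDK Semaphore rather than real connection pools; the class name Bulkhead and its limits are hypothetical, not taken from the talk.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Minimal bulkhead sketch: caps concurrent calls to one backend. Names are illustrative. */
class Bulkhead {

    static class BulkheadFullException extends RuntimeException {
        BulkheadFullException() {
            super("bulkhead full, rejecting call");
        }
    }

    private final Semaphore permits;

    Bulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    <T> T call(Supplier<T> backendCall) {
        // Reject immediately instead of queueing behind a slow backend.
        if (!permits.tryAcquire()) {
            throw new BulkheadFullException();
        }
        try {
            return backendCall.get();
        } finally {
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        // One bulkhead per backend keeps their capacities isolated from each other.
        Bulkhead serviceA = new Bulkhead(2);
        Bulkhead serviceB = new Bulkhead(2);
        System.out.println(serviceA.call(() -> "response from A"));
        System.out.println(serviceB.call(() -> "response from B"));
    }
}
```

Rejecting when the budget is exhausted trades some failed requests for a frontend that keeps serving the functionality backed by healthy services.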
Latency and Fault Tolerance for Distributed Systems
https://github.com/Netflix/Hystrix


- do not rei...
public class CommandHelloWorld extends HystrixCommand<String> {
private final String name;
public CommandHelloWorld(String...
TESTING
Focus on failures
Simulate bad things
Chaos monkey [2]

- we do test but on "slightly" dif...
ARCHITECTURE
Adaptable design [3]
Conway’s law
ROC [4] [5]

- early decisions are hard to rever...
Q&A
https://github.com/dagi/cynical-software

Tips for writing software that does not wake you up at 2am. A topic highly influenced by Michael T. Nygard's book Release It! and lessons we've learnt at GoodData.

Published in: Technology

Cynical software

  1. CYNICAL SOFTWARE: Tips for writing software that does not wake you up at 2am. dagi@goodata.com, https://twitter.com/_dagi. Sunday 17 November 13
  2. HOW TO BECOME CYNICAL • Eat your own dog food • Strong feedback • Pager duty (Engineer on duty) • DevOps. Notes: no "throw it over the wall" syndrome.
  3. (slide without text)
  4. http://pragprog.com/book/mnee/release-it Notes: read an excellent book about cynical software - stability patterns & anti-patterns - architecture, testing and design
  5. INTEGRATION POINTS & FAILURE MODES. [Diagram: integration points among frontend services, backend services, load balancers (LB), databases (DB), and a 3rd party service; failure modes.] Notes: - shift toward SOA, interconnected services, remote communication, 3rd party services - failure mode: how a crack may appear - integration point: cracks are tightly coupled to integration points, which shape the way cracks appear and propagate across multiple layers and services - integration points may accelerate cracks (chain reaction) or stop them - a failure in one component increases the probability of failure in another component or service - slow responses, endpoint unreachability - high levels of complexity provide more directions for the cracks to propagate in
  6. (slide without text)
  7. Code (callouts 1 and 2):

        User user; Projects projects;
        // 1
        try {
            projects = dao.getProjects(user, OFFSET, LIMIT).await(250, MS);
        } catch (FutureTimeoutException e) {
            throw new AccountInfoUnavailableException();
        }
        // 2
        for (Project project : projects) {
            Set<String> userPermissions = dao.getPermissions(project, user);
            if (userPermissions.contains(CAN_INIT_DATA)) {
                return new AccountInfo(Boolean.TRUE);
            }
        }

     [Diagram labels: Client; Frontend server; Backend server; Datacenter AWS; Datacenter Rackspace; C3; MySQL (x6); hanging HTTP worker thread (120s); hanging HTTP connection (60s C3 timeout); hanging SQL connection (90 s timeout)]
  8. (slide without text)
  9. “Cynicism is merely the art of seeing things as they are instead of as they ought to be” [1]. Cynical software: no intimacy, internal barriers, lack of trust, bad things happen, resilience to impulse and stress. Notes: explain the main attributes of cynical software.
  10. STABILITY PATTERNS & ANTIPATTERNS. Patterns: steady state, circuit breaker, bulkheads, test harness, handshaking, fail fast, timeouts, decoupling middleware. Antipatterns: slow responses, SLA inversion, attacks of self-denial, unbalanced capacities, unbounded result set, blocked threads, scaling effects. Notes: - the antipatterns will create, accelerate or multiply cracks in the system - the patterns provide architecture and design guidance to reduce, eliminate, or mitigate the effects of cracks in the system
  11. CIRCUIT BREAKER. [Diagram: circuit breaker between the frontend service and the backend service; states Closed, Open, Half-Open; failure and timeout transitions; failure counter: 2, 1, 0.] Notes: - mediator (decoupling, isolation), integration point wrapper - fail fast, recovery
  12. BULKHEADS. [Diagram: frontend service HTTP client with connection pools separated by bulkheads, calling backend services A, B and C; label: Hiccup.] Notes: fat tails; sizing/capacity isolation; degrading functionality
  13. Hystrix: Latency and Fault Tolerance for Distributed Systems, https://github.com/Netflix/Hystrix Notes: - do not reinvent the wheel - OSS, Java - most of the patterns are implemented there (circuit breaker, bulkheads, fail fast)
  14. Hystrix example:

        import com.netflix.hystrix.HystrixCommand;
        import com.netflix.hystrix.HystrixCommandGroupKey;
        import java.util.concurrent.Future;

        public class CommandHelloWorld extends HystrixCommand<String> {
            private final String name;

            public CommandHelloWorld(String name) {
                super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
                this.name = name;
            }

            @Override
            protected String run() {
                // a real example would do work like a network call here
                return "Hello " + name + "!";
            }
        }

        String s = new CommandHelloWorld("World").execute();         // synchronous
        Future<String> fs = new CommandHelloWorld("World").queue();  // asynchronous

      Notes: HystrixCommand = circuit breaker + bulkheads. Synchronous/asynchronous usage. Async usage: Future -> timeout
  15. TESTING. Focus on failures. Simulate bad things. Chaos monkey [2]. Notes: - we do test, but on a "slightly" different topology, which makes some kinds of bugs hard to reveal - we mostly test optimistic cases - HTTP mock server (bad responses, slow responses, protocol violation...) - longevity tests
  16. ARCHITECTURE. Adaptable design [3], Conway’s law, ROC [4] [5]. Notes: - early decisions are hard to revert later (costs) - Big Design Up Front doesn’t work, prefer adaptable design - Framework, Platform - restartability (no "restart the world"), diagnostics (health checks), recovery mechanisms: circuit breaker, isolation/redundancy
  17. Q&A https://github.com/dagi/cynical-software
