DevOps as Relationship
Management
How to keep systems of services happy
@jamesurquhart
Director of Product
Dell Cloud Mana...
I want to answer three
questions:
Why are the relationships between service components
important in modern software system...
May 6, 2010
"[A] large fundamental trader chose to execute [a
$4.1B] sell program via an automated execution
algorithm ('Sell Algorith...
"...the Sell Algorithm…executed the sell program
extremely rapidly in just 20 minutes."
The market responded, and trading ...
[Diagram, built up over five slides: Market A with Automatic Trading Algorithm 1, then joined by Automatic Trading Algorithms 2, 3, ...]
Quick!
What was the root
cause?
[Diagram: Market A with Automatic Trading Algorithms 1, 2, 3, ...]
[Diagram: Automatic Trading Algorithm N (repeated), Market A, Automatic Trading Algorithm 2, ...]
"May 6 was…an important reminder of the
interconnectedness of our derivatives and
securities markets, particularly with re...
Interconnectedness
How do we avoid another
“flash crash” (or at least
make one less likely)?
Here’s what the SEC did…
First, they visualized
the system…
…(albeit, post
mortem).
Then, they took
action…
They installed “circuit breakers”
They defined rollback protocols
They inserted agents into the flow
Why?
Because…
While high speed trading software contained the logic by which
actions took place, the dynamic ways in which that...
Because (simplified)…
Agents contain and execute logic,
but interactions drive systemic
behavior.
Nobody owns the system, ...
Back to those questions:
Why are the relationships between service components
important in modern software systems?
What o...
Why are the relationships
between service
components important in
modern software
systems?
Because such
relationships–numerous
and dynamic as they are–
define system behavior.
What one amazing
strategy can you use to
manage your software
systems?
I believe I have a
simple rule…
…that takes
tremendous discipline
and work to execute
well:
Visualize the
system…
…but take
action at the
agent level
No, really, what are some
tactics that work and are
being used today?
Visualize the
system…
V.
Expansionist Reductionist
Remember,
you are part of the
system.
…but take
action at the
agent level
Install “circuit breakers”
Define rollback protocols
Insert agents into flow
Remember,
you are part of the
system.
Thank you!
Questions?
Dev opsdayssv2014 devopsasrelationshipmanagement
Slides from my talk at DevOpsDaysSV in 2014. The talk discusses how important it is to understand the relationships between components in a system, and some techniques for taking action to avoid or correct negative emergent behavior.

  • Instead, let me take you back to May 6, 2010…
  • A typical morning on the stock market. No major breaking news, and all boards trading normally. A slight drop in the indexes, but nothing special. But something went very wrong that afternoon.
  • According to a joint report written later about that day by the Securities and Exchange Commission and the Commodity Futures Trading Commission, a single trading algorithm was used to mete out $4.1 billion in trades. The algorithm was meant to meter out individual trades over time, attempting to represent no more than 10% of trading volume at any given time.

    To achieve this, the algorithm adjusted each trade’s volume based on overall market volume in the previous minute.
  • Unfortunately, for one reason or another…a simple bug, perhaps, or human error…trades that were meant to be metered out over days or weeks were actually executed within 20 minutes.

    As might be expected, this resulted in some pretty big trade executions in a very short period of time, especially for the relatively small electronic exchange on which they were executed.

    The market responded, and other automatic trading algorithms sensed a “sell” signal, and started executing sell trades in response. This increased market volume.

    The original sell algorithm then responded to the increased trade volume, and increased its own trading volume.
  • The result was a drop of about 4% in the Dow in about 20 minutes, for a total of 8.1% down from the opening bell.

    To put that into context, CNN noted that it was the biggest intraday point drop in Dow Jones history at that time.
  • Well, it was that errant trading algorithm, right? I mean it initiated the trades, and feedback from that market was then used to initiate further trades.
  • Then, for whatever reason, other trading algorithms quickly saw this larger-than-normal activity as a sell signal…
  • …so they in turn initiated trades on both the original exchange, and (in a few cases) on other exchange mechanisms.
  • Those trades were either frequent enough or large enough to get the attention of yet more trading algorithms.
  • A few of those algorithms probably triggered increased trading on the original exchange, further exacerbating the problem.

    Now, this is an extremely simplified view of the changing state of the exchange system that day, and one that makes events appear much more serial than they really were.

  • So, to fix this, we need the root cause, right? I mean, it had to be that original trading company’s bug or bad logic or whatever. Case closed.
  • But at this point, how can we say that first algorithm was the “problem” to be “fixed”?

    In fact, the truth is that it’s the way the other algorithms reacted to the initial trades that made those trades a problem. In theory, any one of a number of large trade sources could have triggered the same series of events. Or a similar series of events. Or, perhaps, an even more devastating series of events.

    Another important truth is that many, many decisions were made in parallel, often affecting large numbers of trades made by entirely unrelated parties. And there were a huge number of parties. The actual trading graph probably looked more like…
  • …this. A huge web of players interacting over a variety of paths via a broad range of rules and protocols.
  • The Flash Crash post mortem itself noted that “May 6 was an important reminder of the interconnectedness of our derivatives and securities markets”.
  • “Interconnectedness”. I love that word. Isn’t that what we are all here to talk about this week?

    Isn’t that what the world of computing is working towards with such fervor and focus?
  • In systems like this, small actions can trigger huge consequences. And most of the time, it’s not the initial trigger that is the problem. In this case, the root cause is interesting, but it’s not the problem.
  • I mean, the Internet is all about interconnectedness…from the early days of the World Wide Web through human social networks to, increasingly, system-to-system, device-to-device, thing-to-thing connectedness.
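The feedback loop described in these notes (a seller sizing each order from the previous minute's volume, while other algorithms react to that selling) can be sketched as a toy simulation. This is only an illustration under stated assumptions: the function `simulate`, the `reaction` factor, and every number below are hypothetical and are not taken from the SEC/CFTC report. The sketch is meant to show that the same seller logic either settles or runs away depending on how the other agents respond.

```python
def simulate(minutes, base_volume, participation, reaction):
    """Return the seller's order size for each simulated minute.

    participation: share of the previous minute's volume the seller targets.
    reaction: extra volume other algorithms add per unit the seller trades
              (a made-up stand-in for "sensing a sell signal").
    """
    prev_volume = base_volume
    sell_sizes = []
    for _ in range(minutes):
        sell = participation * prev_volume  # seller follows last minute's volume
        reactive = reaction * sell          # other algorithms pile on in response
        prev_volume = base_volume + sell + reactive
        sell_sizes.append(sell)
    return sell_sizes


# Identical seller logic in both runs; only the other agents' reaction differs.
calm = simulate(minutes=20, base_volume=1000.0, participation=0.10, reaction=1.0)
runaway = simulate(minutes=20, base_volume=1000.0, participation=0.10, reaction=12.0)

print(f"calm:    first={calm[0]:.0f}  last={calm[-1]:.0f}")
print(f"runaway: first={runaway[0]:.0f}  last={runaway[-1]:.0f}")
```

In this model the loop gain is `participation * (1 + reaction)`. Below 1 the seller's order size levels off; at 1 or above it keeps growing every minute, even though the seller's own rule never changed.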
  • Transcript of "Dev opsdayssv2014 devopsasrelationshipmanagement"

    1. DevOps as Relationship Management: How to keep systems of services happy. @jamesurquhart, Director of Product, Dell Cloud Manager. http://gigaom.com/author/jurquhart
    2. I want to answer three questions: Why are the relationships between service components important in modern software systems? What one amazing strategy can you use to manage your software systems? No, really, what are some tactics that work and are being used today?
    3. May 6, 2010
    4. "[A] large fundamental trader chose to execute [a $4.1B] sell program via an automated execution algorithm ('Sell Algorithm')." - Findings Regarding The Market Events of May 6, 2010 http://www.sec.gov/news/studies/2010/marketevents-report.pdf
    5. "...the Sell Algorithm…executed the sell program extremely rapidly in just 20 minutes." The market responded, and trading volume increased… "... [The Sell Algorithm] responded to the increased volume by increasing the rate at which it was feeding the orders into the market." - Findings Regarding The Market Events of May 6, 2010 http://www.sec.gov/news/studies/2010/marketevents-report.pdf
    6. Market A; Automatic Trading Algorithm 1
    7. Market A; Automatic Trading Algorithm 1; Automatic Trading Algorithm 2; Automatic Trading Algorithm 3; Automatic Trading Algorithm 4
    8. Market A; Automatic Trading Algorithm 1; Automatic Trading Algorithm 2; Automatic Trading Algorithm 3; Automatic Trading Algorithm 4; Market B
    9. Market A; Automatic Trading Algorithm 1; Automatic Trading Algorithm 2; Automatic Trading Algorithm 3; Automatic Trading Algorithm 4; Market B; Automatic Trading Algorithm 5; Automatic Trading Algorithm 6
    10. Market A; Automatic Trading Algorithm 1; Automatic Trading Algorithm 2; Automatic Trading Algorithm 3; Automatic Trading Algorithm 4; Market B; Automatic Trading Algorithm 5; Automatic Trading Algorithm 6
    11. Quick! What was the root cause?
    12. Market A; Automatic Trading Algorithm 1; Automatic Trading Algorithm 2; Automatic Trading Algorithm 3; Automatic Trading Algorithm 4; Market B; Automatic Trading Algorithm 5; Automatic Trading Algorithm 6; ?
    13. Automatic Trading Algorithm N; Automatic Trading Algorithm N; Market A; Automatic Trading Algorithm 2; Automatic Trading Algorithm 3; Automatic Trading Algorithm 4; Market B; Automatic Trading Algorithm 5; Automatic Trading Algorithm 6; Automatic Trading Algorithm N
    14. "May 6 was…an important reminder of the interconnectedness of our derivatives and securities markets, particularly with respect to index products." - Findings Regarding The Market Events of May 6, 2010 http://www.sec.gov/news/studies/2010/marketevents-report.pdf
    15. Interconnectedness
    16. How do we avoid another “flash crash” (or at least make one less likely)?
    17. Here’s what the SEC did…
    18. First, they visualized the system…
    19. …(albeit, post mortem).
    20. Then, they took action…
    21. They installed “circuit breakers”
    22. They defined rollback protocols
    23. They inserted agents into the flow
    24. Why?
    25. Because… While high speed trading software contained the logic by which actions took place, the dynamic ways in which that software and market systems interacted defined the emergent behaviors of the system itself. The SEC didn’t “own” and could not dictate all aspects of behavior of those agents. They did, however, have jurisdiction over key integration points, and could insist certain rules be enforced at those interfaces. It was in the best interest of the agent owners/operators to follow these new rules and protocols, as it was how their revenues and profits were derived.
    26. Because (simplified)… Agents contain and execute logic, but interactions drive systemic behavior. Nobody owns the system, but one can own or influence key interactions that are relevant to desired outcomes. Well defined interfaces enable cooperative behavior, but feedback mechanisms ensure it.
    27. Back to those questions: Why are the relationships between service components important in modern software systems? What one amazing strategy can you use to manage your software systems? No, really, what are some tactics that work and are being used today?
    28. Why are the relationships between service components important in modern software systems?
    29. Because such relationships–numerous and dynamic as they are–define system behavior.
    30. What one amazing strategy can you use to manage your software systems?
    31. I believe I have a simple rule…
    32. …that takes tremendous discipline and work to execute well:
    33. Visualize the system…
    34. …but take action at the agent level
    35. No, really, what are some tactics that work and are being used today?
    36. Visualize the system…
    37. V. Expansionist Reductionist
    38. Remember, you are part of the system.
    39. …but take action at the agent level
    40. Install “circuit breakers”
    41. Define rollback protocols
    42. Insert agents into flow
    43. Remember, you are part of the system.
    44. Thank you! Questions?
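The slides name "circuit breakers" as a tactic applied at the agent level but do not show one. Below is a minimal sketch of how such a breaker might look around a call from one service to another. The class name `CircuitBreaker`, its parameters (`max_failures`, `reset_after`, `clock`), and the thresholds are all assumptions made for illustration; they are not taken from the talk or from any particular library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    """Stop calling a failing dependency for a while (hypothetical sketch)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before tripping
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call refused")
            # Cool-down elapsed: allow one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # trip, or re-trip after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(fetch_quote, symbol)` with a hypothetical `fetch_quote`, and treat `CircuitOpenError` as a signal to back off rather than add more load. The point of the design is the one the slides make: the breaker acts at a single interaction point, without needing to own the rest of the system.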