Structured Software Design
Giorgio Zoppi
<giorgio@apache.org>
Who am I?
Giorgio Zoppi
- Software Engineer.
- Apache Milagro, committer.
- Languages: C++, Python and Go.
What do we do today?
- Talk about Structured Design Principles.
- Split into groups:
- Design a distributed message queue.
A Software Developer's Daily Routine:
In our daily routine we face either a problem or a challenge:
1. A problem exists in the current state and requires
countermeasures to achieve a desired state.
2. A challenge is a response to opportunities
presented by the external environment.
Daily Scenario.
● Knowns. Things that we should know from our developer
experience.
● Known unknowns. Things that we know we don’t know:
○ Business domain.
○ Technologies that we’ll use.
● Unknown unknowns. Things that no one knows and can
lead to failure.
● Risks. Unexpected events. R(event) = P(event) * I(event),
where P is the probability of the event and I is its impact.
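As a rough illustration of the risk formula, the sketch below ranks a few risks by R = P * I. The `Risk` class, the `rank` helper and the idea of measuring impact in arbitrary cost units are assumptions made for this example, not part of the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """Hypothetical record for one risk on the project."""
    name: str
    probability: float  # P(event), between 0 and 1
    impact: float       # I(event), in arbitrary cost units (assumption)

    @property
    def exposure(self) -> float:
        # R(event) = P(event) * I(event)
        return self.probability * self.impact


def rank(risks):
    """Return the risks ordered from highest to lowest exposure."""
    return sorted(risks, key=lambda risk: risk.exposure, reverse=True)
```

A likely but cheap event can end up below an unlikely but expensive one, which is the point of multiplying the two factors.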
It is all about Complexity!
Complexity = anything related to the structure of a software
system that makes it hard to understand and modify the system.
Idea:
Create a minimal
cost system.
Symptoms of complexity
Change amplification: A
simple change can
trigger a ‘domino’ effect
of changes. As good
designers we shall reduce
change amplification.
High coupling.
Cognitive Load: A
measure of how much a
developer needs to know
in order to complete a
task. Sometimes an approach
that requires more lines of
code is preferable because
it reduces the cognitive
load and results in
simplicity.
Unknown Unknowns: It is
not obvious which places
need to be modified for a
change. A guideline for
good system design is to
be obvious. In an obvious
system you can make
any fix or change easily.
Causes of complexity
Dependency: A dependency exists when a given piece of code cannot be
understood and modified in isolation. Dependencies are part of modern SW
development. The goal of the system designer is to isolate the dependencies
as much as possible and make them obvious. Dependencies might lead to
change amplification and to high cognitive load.
Obscurity: Important information is not obvious. It creates
unknown unknowns and increases the cognitive load. Example: a badly
documented method or a poorly named variable. Best tools against obscurity:
code reviews (to detect it) and design simplification.
Structured Design: Definition.
Structured design is the process of deciding which components,
interconnected in which way, will solve some well-specified problem:
- Plan/mark out some form of solution iteratively to cope with
unknown unknowns.
- Understanding the primary constraints that limit the admissible solutions is
also essential.
Structured Design Software System Objectives.
● Reliability.
● Maintainability.
● Scalability.
SD Objectives: Reliability.
● Reliability.
○ To measure reliability we can define a metric called
mean time between failures (MTBF). Be careful:
■ Reliability is not only a code problem, it should be seen
as a design problem.
■ The mean time between failures should be high.
The idea is that the system has to keep working in the face of adverse
events, keeping the MTBF high.
SD Objectives: Maintainability.
● Maintainability. In simpler words, maintaining a working
system and adapting it to new use cases should be
done at a reasonable cost. To measure
maintainability we can define two metrics:
○ mean time between failures (MTBF).
○ mean time to repair (MTTR).
These metrics lead to SA (System Availability) = MTBF / (MTBF + MTTR).
A maintainable system <=> easily testable.
- Single Responsibility Principle
- Dependency Inversion Principle.
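A small worked example of the availability metric, reading the formula as SA = MTBF / (MTBF + MTTR). The function name and the choice of hours as the unit are assumptions for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """System availability as the fraction of time the system is up.

    mtbf_hours: mean time between failures.
    mttr_hours: mean time to repair.
    """
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must not be negative")
    total = mtbf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTBF and MTTR cannot both be zero")
    return mtbf_hours / total
```

For instance, a system that fails on average every 999 hours and takes 1 hour to repair is available about 99.9% of the time. Raising the MTBF or lowering the MTTR both push the value towards 1.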
SD Objectives: Scalability.
● Scalability. The system shall be able to
cope with future load as it appears.
In a data-driven system, for example, we
should be able to ingest more data
while keeping the system operational.
How to achieve a minimal cost system?
Through Problem Decomposition.
● divide the problem into small problems
● solve them separately
How to achieve a minimal cost system?
The parts of the problem should be:
● manageably small (1 week stories) and with a single responsibility.
● solvable separately.
● testable separately.
How to achieve a minimal cost system?
- Make sure that each module knows as little as possible about other
modules to do its work (principle of least knowledge, also known as the Law of Demeter).
- Loosely coupled system.
- High cohesion.
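A minimal sketch of the Law of Demeter, assuming a hypothetical order model (`Product`, `LineItem`, `Order` and `invoice_line` are invented for this example). The caller asks the `Order` for its total instead of walking through its items and their products.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    price_cents: int


@dataclass(frozen=True)
class LineItem:
    product: Product
    quantity: int

    def subtotal_cents(self) -> int:
        return self.product.price_cents * self.quantity


class Order:
    def __init__(self):
        self._items = []  # internal structure, not exposed to callers

    def add(self, product: Product, quantity: int) -> None:
        self._items.append(LineItem(product, quantity))

    def total_cents(self) -> int:
        # Each object only talks to its direct collaborators.
        return sum(item.subtotal_cents() for item in self._items)


def invoice_line(order: Order) -> str:
    # The caller knows about Order only, not about LineItem or Product.
    return f"Total: {order.total_cents() / 100:.2f}"
```

If the way items are stored changes, only `Order` has to change, which keeps coupling low.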
Strategic vs Tactical Programming
Tactical Programming: Here your focus
is to make it work ASAP, taking some
shortcuts to make it happen.
Short-sighted view.
Extreme: Tactical Tornado.
Strategic Programming: We know that
working code is good, but not enough.
Your primary goal is to produce a great
design, which also happens to work.
Long-term investment mindset.
Take a little extra time to get the design
right; in the long run you’ll win.
Classes should be deep.
Deep Modules and Simple Interfaces
Shallow method:
No Information Hiding.
High Change Amplification.
Rule of Thumb:
When the interface is similar
to the implementation:
Shallow Module/Class
Classes should be deep.
Less (interface) is More (functionality)
1. Common Belief: “Classes and methods should
be small”. Not exactly! They should be readable!
2. Disease - Classitis: You have too many classes
that are too small. This increases cognitive
load and change amplification (e.g. Java).
3. The best modules are deep: a lot of
functionality behind a simple interface. Example:
Unix I/O.
4. Interfaces should be designed to make the
common case as simple as possible.
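To make the idea of a deep module concrete, here is a sketch of a small cache with a single public method. The class name `LruCache`, the `load` callback and the least-recently-used eviction policy are assumptions chosen for the example. The caller does not need to know about lookup, loading on a miss, or eviction.

```python
from collections import OrderedDict


class LruCache:
    """Deep module sketch: one method, several behaviours hidden behind it."""

    def __init__(self, capacity: int = 128):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def get(self, key, load):
        """Return the cached value for key, calling load(key) on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        value = load(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value
```

The common case is a single call, `cache.get(key, load)`, and the only configuration parameter has a default.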
It’s more important for a module to have a
simple interface than a simple
implementation:
1. Better to reduce the number of
configuration parameters.
2. Don’t take this idea too far by putting all
functionality in a single class. In 2010, I
had the luck to see a shortest-path
function of around 5k lines ☺.
Don’t do that.
Pulling Complexity Downwards
Simple (interface) is More (functionality)
Information Hiding vs Information Leakage
Information Hiding: each module
should encapsulate pieces of
knowledge.
This simplifies the interface and
makes the system easier to evolve.
Information Leakage: A design
decision is reflected in multiple
modules.
This is bad because it leads to change
amplification: any change to the
design decision requires changing
all the modules involved.
Information Leakage: Temporal
Decomposition
Temporal decomposition: happens
when execution order is reflected in
the code structure.
Recommendation: Focus on the
knowledge that is needed to
perform each task, not the order in
which the tasks occur.
TimeSeriesFileReader, TimeSeriesFileUpdater and
TimeSeriesFileWriter all share common knowledge (the ts
file format) -> information leakage. In the Java Standard
Library we have plenty of examples of information
leakage. Better to use just one class that hides this information.
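A sketch of the single-class alternative. The talk does not specify the ts file format, so the format used here (one `timestamp,value` pair per line) and the method names are purely assumptions; the point is only that the format is known in one place.

```python
from pathlib import Path


class TimeSeriesFile:
    """Single owner of the (assumed) on-disk format: 'timestamp,value' lines."""

    def __init__(self, path):
        self._path = Path(path)

    def read(self):
        """Return the stored points as a list of (timestamp, value) tuples."""
        if not self._path.exists():
            return []
        points = []
        for line in self._path.read_text().splitlines():
            timestamp, value = line.split(",")
            points.append((int(timestamp), float(value)))
        return points

    def write(self, points):
        """Replace the file content with the given points."""
        lines = [f"{timestamp},{value}" for timestamp, value in points]
        self._path.write_text("\n".join(lines) + ("\n" if lines else ""))

    def update(self, timestamp, value):
        """Insert or overwrite one point, keeping points sorted by timestamp."""
        points = dict(self.read())
        points[timestamp] = value
        self.write(sorted(points.items()))
```

Changing the format (for example to a binary layout) now touches `read` and `write` in one class instead of three classes.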
Information Hiding Hints
Avoid overexposure. Overexposure is when using the commonly used features of a module or API forces the user to learn
rarely used features. It increases cognitive load.
Inside a class: reduce the number of places where a variable is used.
Inside a class: try to design methods such that each method encapsulates some information
or capability and hides it from the rest of the class.
Different Layers Different Abstraction
In a well designed system
each layer provides a
different abstraction from
the layers above and
below. (ISP principle)
Red Flag: Pass-Through
Method.
Be careful with the Decorator
pattern. Decorator supports
the Open-Closed principle,
however decorator classes
tend to create a set of
shallow classes.
Different Layers Different Abstraction
Pass-through methods.
They increase the interface
complexity of a class but
they don’t increase its
functionality. Shallow class.
They create confusion about
class responsibilities.
They increase change
amplification.
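The sketch below contrasts a pass-through method with a method that offers a different abstraction. All class names are hypothetical, and wrapping paragraphs with the standard `textwrap` module is just one possible way to give the upper layer real functionality.

```python
import textwrap


class TextArea:
    """Lower layer: stores raw lines."""

    def __init__(self):
        self._lines = []

    def insert_line(self, index: int, text: str) -> None:
        self._lines.insert(index, text)

    def line_count(self) -> int:
        return len(self._lines)


class ShallowDocument:
    """Red flag: same signature as the layer below, no added functionality."""

    def __init__(self):
        self._area = TextArea()

    def insert_line(self, index: int, text: str) -> None:
        self._area.insert_line(index, text)  # pass-through method


class Document:
    """Upper layer with its own abstraction: paragraphs, not lines."""

    def __init__(self):
        self._area = TextArea()

    def append_paragraph(self, text: str, width: int = 40) -> int:
        """Wrap the paragraph to width and return how many lines were added."""
        lines = textwrap.wrap(text, width)
        for line in lines:
            self._area.insert_line(self._area.line_count(), line)
        return len(lines)

    def line_count(self) -> int:
        return self._area.line_count()
```

`ShallowDocument` only widens the interface, while `Document` hides line handling behind a paragraph-level operation.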
Different Layers Different Abstraction
Pass-through variables.
They increase complexity because
every intermediate method
must know of their existence.
Difficult to eliminate. The
picture shows some approaches.
They increase change amplification.
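One common way to reduce pass-through variables is a context object that is handed over once, at construction time. The sketch below assumes hypothetical settings (`timeout_seconds`, `certificate_path`) and hypothetical `Client` and `Connection` classes; it is not necessarily one of the approaches shown in the picture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical bundle of application-wide settings."""
    timeout_seconds: float
    certificate_path: str


class Connection:
    def __init__(self, context: Context):
        self._context = context

    def describe(self) -> str:
        # Only the code that needs a setting reads it from the context.
        return (f"cert={self._context.certificate_path} "
                f"timeout={self._context.timeout_seconds}s")


class Client:
    def __init__(self, context: Context):
        self._context = context

    def open(self) -> Connection:
        # No certificate or timeout parameters in this signature.
        return Connection(self._context)


def main(context: Context) -> str:
    return Client(context).open().describe()
```

Adding a new setting means adding a field to `Context`; the signatures of the intermediate methods stay unchanged.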
Better together or Better apart?
Given two pieces of functionality, should they be
implemented together in the same place or should
their implementations be separated?
Bring Together
• IF information is shared.
• IF it will simplify the interface.
• To eliminate duplication.
Separate
• General Purpose vs Specific
Purpose Code
• Red Flag: Special-General
Mixture (Information Leakage)
Define Errors out of Existence
An exception is any uncommon
condition that alters the normal flow of
control in a program. Exceptions add a
lot of complexity.
Goal: Reduce the number of places
where exceptions have to be handled.
Define errors out of existence: redefine
the semantics so that there is no exception to handle.
Other techniques: mask exceptions, exception aggregation,
crash (OOM).
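Two small sketches of redefining semantics so that no exception is needed. Both functions and their names are assumptions for illustration: `substring` returns whatever part of the requested range exists instead of rejecting out-of-range indexes, and `ensure_absent` treats "key is gone afterwards" as success whether or not the key was there.

```python
def substring(text: str, start: int, end: int) -> str:
    """Return the characters of text whose index lies in [start, end).

    Indexes outside the text are clamped, so there is no error case.
    """
    start = max(0, min(start, len(text)))
    end = max(start, min(end, len(text)))
    return text[start:end]


def ensure_absent(mapping: dict, key) -> None:
    """Make sure key is not in mapping; a missing key is not an error."""
    mapping.pop(key, None)
```

Callers no longer need a try/except around these operations, which reduces the number of places where exceptions have to be handled.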
Define EoE: Exception
Designing a Distributed Message Queue (DMQ)
A distributed queue is a queue that has:
1. Multiple remote producers.
2. Multiple remote consumers.
3. No single point of failure.
4. Sequential consistency.
DMQ: Functional requirements.
● Queue creation: The client should be able to create a queue and set some parameters—for
example, queue name, queue size, and maximum message size.
● Send message: Producer entities should be able to send messages to a queue that’s intended
for them.
● Receive message: Consumer entities should be able to receive messages from their
respective queues.
● Delete message: The consumer processes should be able to delete a message from the queue
after a successful processing of the message.
● Queue deletion: Clients should be able to delete a specific queue.
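As a starting point for the group exercise, the functional requirements above can be sketched as a small interface. This is a single-process, in-memory stand-in: it is neither distributed nor durable, and every name, default value and error choice below is an assumption rather than a prescribed design.

```python
import itertools
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class _Queue:
    max_messages: int
    max_message_bytes: int
    messages: OrderedDict = field(default_factory=OrderedDict)  # id -> body


class QueueService:
    """In-memory sketch of the five operations (not distributed, not durable)."""

    def __init__(self):
        self._queues = {}
        self._ids = itertools.count(1)

    def create_queue(self, name, max_messages=1000, max_message_bytes=1024):
        if name in self._queues:
            raise ValueError(f"queue {name!r} already exists")
        self._queues[name] = _Queue(max_messages, max_message_bytes)

    def send(self, name, body: bytes) -> int:
        queue = self._queues[name]
        if len(body) > queue.max_message_bytes:
            raise ValueError("message too large")
        if len(queue.messages) >= queue.max_messages:
            raise OverflowError("queue is full")
        message_id = next(self._ids)
        queue.messages[message_id] = body
        return message_id

    def receive(self, name):
        """Return the oldest (id, body) without removing it, or None if empty."""
        for message_id, body in self._queues[name].messages.items():
            return message_id, body
        return None

    def delete_message(self, name, message_id) -> None:
        # Called by the consumer after successful processing.
        self._queues[name].messages.pop(message_id, None)

    def delete_queue(self, name) -> None:
        self._queues.pop(name, None)
```

The interface stays small on purpose. Replication, persistence and ordering guarantees would live behind it, which is where the design discussion in the groups can start.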
DMQ: Non-functional requirements
● Durability: The data received by the system should be durable and shouldn’t be lost.
Producers and consumers can fail independently, and a queue with data durability is
critical to make the whole system work, because other entities are relying on the
queue.
● Scalability: The system needs to be scalable and capable of handling the increased
load, queues, producers, consumers, and the number of messages. Similarly, when the
load reduces, the system should be able to shrink the resources accordingly.
● Availability: The system should be highly available for receiving and sending
messages. It should continue operating uninterrupted, even after the failure of one or
more of its components.
● Performance: The system should provide high throughput and low latency.
