Digital Systems Design description and implementation.pdf

Andreas Mitschele-Thiel
Dieter Wuttke
TECHNISCHE UNIVERSITÄT ILMENAU
Integrated Hard- and Software Systems
http://www.tu-ilmenau.de/ihs
Digital Systems Design

Part I
Introduction

Motivation for the Course – Why is this important?
Any computer system consists of hardware and software!
But: HW is often hidden and not considered important by SW developers
Indicators that HW is important:
Systems where HW/SW relation is obvious:
 embedded systems
 real-time systems
 reliable systems
 safety-critical systems
 capacity
 responsiveness and delay
 predictability
 reliability
 safety
 power consumption
 cost
 ...
=> Knowledge of HW/SW interaction is required!
What are „Integrated HW/SW-Systems“?

Embedded programming without knowledge of HW/SW integration
Image “borrowed” from an Iomega advertisement for Y2K software and disk drives, Scientific American, September 1999.

Information Technology Scenario
According to the International Data Corporation:
 1997: 96% of all Internet-access devices shipped in the United States were PCs
 End of 2002: fewer than 50% of them were PCs; instead, digital set-top boxes, cell phones, and personal digital assistants were sold
 Today: the best-selling Internet-access devices are mobile phones

Objectives
Let’s assume you are employed as a system architect with some company and
faced with the following task:
Given is some problem to be solved by some kind of computer system, e.g. an
ABS system for a car, a fly-by-wire system for a new Airbus, the control of a
microwave oven, a mobile phone, a corporate IP router, or the control unit of
some medical x-ray equipment.
The different systems have very different requirements, including real-time
constraints, reliability, cost, etc.
Your task is to select the most appropriate system design including HW
and SW, as well as the selection of the most appropriate design method and
tools.
The goal of the course is to provide the knowledge needed to make these kinds of
decisions.

Content IHS 2
 Motivation and overview
 Development process and tasks
 System requirements
 Behavioral models overview
 FSM, NDFSM, FSM composition
 PN, DFG, CFG, CDFG
 Specification languages details
 Statecharts
 SDL
 VHDL
 SystemC
 Functional validation
 Performance/temporal validation
 Optimization

Organisational Stuff
Course prerequisites:
 Basics of digital systems
 Basics of computer architecture and computer design
Slides and additional information will be provided at
http://www.tu-ilmenau.de/ics
Instructor contact:
Andreas Mitschele-Thiel
Office: Zusebau, Room 1032
Email: mitsch@tu-ilmenau.de
Phone: 03677-69-2819
Dr. Dieter Wuttke
Office: Zusebau, Room 1067
Email: dieter.wuttke@tu-ilmenau.de
Phone: 03677-69-2820
Dr. Karsten Henke
Office: Zusebau, Room 2078
Email: karsten.henke@tu-ilmenau.de
Phone: 03677-69-1443

Introduction
 Integrated HW/SW systems by example
 Issues of HW/SW systems development

Some Examples of Systems with Tight HW/SW Interaction
Communication systems
 GSM/UMTS network elements
 IP router (QoS support)
 ATM switch
 GSM/UMTS mobile
Safety-critical systems
 fly-by-wire system
 ABS, ASR, ESP, etc.
 train control
 control of physical and chemical processes
Embedded systems (not user-programmable)
 every-day-appliances (microwave oven, vending machine, mobile phone, ...)
 ABS
 ticket machine
 ...

Example: UMTS Network
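[Figure: UMTS network architecture. Labels: User Equipment (UE); UTRAN with radio network subsystems (RNS), each containing an RNC and several Node Bs; core network (CN) with CS domain (MSC/VLR, GMSC), PS domain (SGSN, GGSN) and registers (HLR/AuC/EIR); interfaces Uu, Iub, Iur, Iu, Gn.]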

Example: Digital Wireless Platform
[Figure: digital wireless platform, split into an analog part (analog RF, A/D conversion) and a digital part. Labels: DSP core; uC core (ARM); logic; dedicated logic and memory; accelerators (bit level); timing recovery; filters; equalizers; MUD; adaptive antenna algorithm; ARQ; Java VM; phone book; keypad, display, control.]
Source: Berkeley Wireless Research Center

Example: Car Electronics
• More than 30% of the cost of a car is now in electronics
• 90% of all innovations will be based on electronic systems

Example: Modern Vehicles, an Electronic System
[Figure: the vehicle as an electronic system. Labels: IVHS infrastructure (electronic toll collection, collision avoidance, vehicle ID tracking); wireless communications/data, global positioning; safety-critical system on the vehicle CAN bus (body control, ECU, ABS, suspension, transmission); Info/Comms/AV bus (cellular phone, GPS, display, navigation, stereo/CD).]
Design issues named in the figure: SW architecture, network design/analysis, function/protocol validation, performance modeling, supplier chain integration
IVHS: Intelligent Vehicle Highway Systems
ECU: Electronic Control Unit (on-board computer)

Example: Vehicles, a Consumer Electronic System
[Figure: vehicle web site technology, attached to the Info/Comms/AV bus (cellular phone, GPS, display, navigation, stereo/CD). Building blocks:]
Comms: GSM/GPRS, UMTS, paging, compression
SW shell: Windows CE, NT, MAC, BIOS
SW apps: browser, comms, user apps
Processor: RISC, PowerPC, X86, Hitachi RISC
Display: heads-up, flat panel, graphics
User I/F: voice synthesis, voice control, stylus, etc.
Output & I/F: serial, Ethernet, diagnostics
Challenges:
• Minimum technology to satisfy user requirements
• Usability
• Integrate with other vehicle systems
• Add functions without adding cost

Example: Smart Buildings
• Task: ambient conditioning systems allow thermal conditioning in small, localized
zones, to be individually controlled by building occupants, creating “micro-
climates within a building”
• Other functions: security, identification and personalization, object tagging,
seismic monitoring
Dense wireless network of sensor,
monitor, and actuator nodes
• Disaster mitigation, traffic management and control
• Integrated patient monitoring, diagnostics, and drug administration
• Automated manufacturing and intelligent assembly
• Toys, interactive museums

Example: Home Networking Application (Subnet) Clusters
PC/data based: PC-1, laptop, Internet access, PC-2, printer
Telecom based: video phone, voice phone, PDA, intercom
Appliance based: sprinklers, toasters, ovens, clocks, climate control, utility customization
Security based: door sensors, motion detectors, window sensors, light control, audio alarms, video surveillance, smoke detectors
Entertainment based: stereo, TV, camcorder, still camera, video game, VCR, DVD player, Web-TV, STB

Example: Smart Dust Components
Component: technology
Laser diode: III-V process
Passive CCR comm.: MEMS/polysilicon
Active beam steering laser comm.: MEMS/optical quality polysilicon
Sensor: MEMS/bulk, surface, ...
Analog I/O, DSP, Control: COTS CMOS
Solar cell: CMOS or III-V
Thick film battery: Sol/gel V2O5
Power capacitor: Multi-layer ceramic
Size: 1-2 mm

Example: Airborne Dust
Mapleseed solar cell: MEMS/Hexsil/SOI
Controlled auto-rotator: MEMS/Hexsil/SOI
Rocket dust: MEMS/Hexsil/SOI
Size: 1-5 cm

Example: Synthetic Insects
Source: R. Yeh, K. Pister, UCB/BSAC

Definition of Embedded Systems
An embedded system
 employs a combination of hardware & software (a “computational engine”)
to perform a specific function
 is part of a larger system that may not be a “computer”
 works in a reactive and time-constrained environment
 Software is used for providing features and flexibility
 Hardware = {Processors, ASICs, Memory,...} is used for performance (&
sometimes security)
=> Integrated HW/SW system
Typical characteristics:
 perform a small set of highly specific functions (not "general purpose”)
 increasingly high-performance & real-time constrained
 power, cost and reliability are often important issues

What is a System Anyway?
 Environment to environment
 Sensors + Information Processing + Actuators
 Computer is a system
 Microprocessor (ASIC, memory) is not
[Figure: environment -> sensors -> processing -> actuator -> environment]

Design Process: Behavior vs. Structure
[Figure: design flow. Labels: requirements; system behavior (models of computation, behavior simulation); system architecture; mapping (HW/SW partitioning, scheduling); performance simulation (performance models: embedded SW, communication and computation resources; SW estimation); validation; communication refinement; synthesis; flow to implementation.]

Will the system solution match the original system spec?
Concept
• Limited synergies between HW & SW
teams
• Long complex flows in which teams do
not reconcile efforts until the end
• High degree of risk that devices will not be
fully functional
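[Figure: the concept is split into separate software and hardware flows (HW or IP selection, design, verification, system test). The example device is shown as a block diagram with Tx optics, Synth/MUX, CDR/DeMUX, Rx optics, VCXO, µP, clock select, line I/F, OHP, STS PP, STS XC, SPE map, data framer, cell/packet I/F, STM I/F.]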

Important Lessons
 Embedded systems market has surpassed the PC market
 Communication is everywhere
 Systems differ in many aspects (functionality, time constraints,
reliability, safety, cost, power consumption, …)
 Design methodologies are important to handle complexity
(behavioural and structural descriptions and verification)
 Methods for HW design align with modern SW design
but: HW knowledge is essential to optimize solutions
(cost, capacity, response time, reliability, safety, power, ...)

Part II
Development Process

System Development – Poor Process
Poor common infrastructure. Weak specialization of functions.
Poor resource management. Poor planning.

System Development – Ordered Process
Good planning.
Good common infrastructure.
Specialization of functions.
Good resource management.

General Development Tasks
Analysis
 of the requirements of the environment to the system
Modelling
 the system to be designed and experimenting with algorithms involved
Refining (or partitioning)
 the function to be implemented into smaller, interacting pieces
HW/SW partitioning
 allocating elements in the refined model to either HW units or SW running on
custom hardware or general microprocessors
Scheduling
 the times at which the functions are executed (this is important when several
modules in the partition share a single hardware unit)

System Development Process – The Theory
Analysis
Design
Implementation
Integration
Maintenance
Development is not a pure top-down process
 use of off-the-shelf subcomponents
=> bottom-up
 lack of accurate estimation in early phases
=> feedback
 lack of confidence in feasibility
=> feasibility studies, prototyping
=> in practice the development process is a mixture of
bottom-up and top-down design
Waterfall model

Analysis Phase and Subphases
Analysis: problem analysis -> feasibility study -> requirements analysis
The goals of the analysis phase are
 to identify the purpose, merit and risks of developing the product, and
 to identify the purpose of the product and to understand its exact requirements

Problem Analysis
 Preliminary study to analyse important needs of the environment to be
supported by the system
 discuss principal solution strategies
=> Problem definition (German: Lastenheft)
• project goals (business objectives)
• product goals, scope and major directions of the development
• specifies variables and constants of the product to be developed
• identifies resources necessary to conduct the development (capital
investments, human resources)

Feasibility Study
Check the feasibility of the product development and the product
 technical feasibility (availability of efficient algorithms, ...)
 economic feasibility (time-to-market,
market window, investment, pay-off)
The feasibility study focuses on the critical issues of the system, in order to
improve confidence in the successful completion of the project
=> Output (depends on exact focus of feasibility study)
• info on expected cost and benefits of the project
• info on technological and financial risks of project
• needed resources for development and/or marketing
• evaluation of possible technical alternatives

Requirements Analysis
Detailed study of the requirements of the system as seen from its environment
 Identify, analyze and classify the specific requirements of the product to be
developed
The solution, i.e. the question of how the
requirements are met is typically left open
=> Requirements specification (German: Pflichtenheft)
• Complete and correct
• Defines output of the development process (deliverables)
• Definition of the interfaces to the environment
• Definition of overall functionality of the product
• Performance requirements
• Constraints on SW, operating system and HW
• Possibly guidelines for internal structure of the product

Requirements Definition: Contents
 Identification of the system (interfaces to the environment)
 Functional requirements (functionality provided at the interfaces)
 Temporal and performance requirements (throughput, response time, delay,
jitter)
 Fault-tolerance and reliability
 Quality (absence of errors)
 Safety
 Operating platform (OS, general HW)
 Power consumption
 Heat dissipation
 Operating environment (operating temperature, shock-, dust-resistance, etc.)
 Size
 Mechanical construction
 EMC (Tx/Rx)
 Maintainability
 Extendability
 Support
 Documentation
 Cost (development, deployment and operation)
 Date of completion
 ...
We will see methods to ensure that
the requirements are met in the
design section

Design and Subphases
Design: architectural design -> detailed design -> implementation design
Purpose:
 decide how the system meets the requirements -> inside view
 focus on the solution

Design and Subphases
Architectural Design (Top-level Design)
 define the modules of the system and
their interfaces
 goal: maximize internal coherence and
minimize intermodule coordination
 modules are typically functional entities
but may be structural entities as well
(structural vs. behavioral modularization)
Detailed Design (Module Design)
 define the functional/behavioral details of
each module independent of the
implementation technique, e.g. its
algorithms
Implementation Design
 take into account the details of the used
implementation technique, e.g. interfaces
to operating systems and hardware
When is the behavior of
the system decided and
when the structure?

The Design Space: A Complex Optimization Problem
 System architecture – overall architecture (structural model, or mapping of
functions on HW, etc.)
 Design methods (design tools and specification languages)
 HW selection (System-on-Chip, ASIC, FPGA, DSP, NP, uC, uP)
 HW design methods (languages, HL-Synthesis, RTL-Synthesis, …)
 HW description (algorithms and implementation)
 HW mapping and scheduling
 SW description (programming languages, algorithms and implementation)
 SW mapping and scheduling
 HW/SW interfacing
 Interfacing with environment (embedding)
 Operating system (OS) support
 Make or buy (HW, SW, OS)
 Available human resources and know-how
 ...

Design Models and Views – An Overview
Different modeling approaches focus on different aspects of the system
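[Figure: four views of a system: data-oriented view, functional view, structural view, and behavioral view. The behavioral example is a message sequence chart (msc data_transfer) with the instances application, transport, network, medium, network, transport, application.]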

Behavioral Models
Behavioral models describe the behavior of the system or parts thereof
Implementation of behavioral models may be in SW or HW
– however, some models are better suited for HW design, others for SW
Examples: C program, Petri net, state diagram, data flow graph
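[Figure: message exchange between Process 1 and Process 2 (Send msg, Receive Ack, Send Ack)]
[Figure: state diagram of a seat-belt alarm with the states OFF, WAIT and ALARM. Transition labels:]
KEY_ON => START_TIMER
END_TIMER_5 => ALARM_ON
KEY_OFF or BELT_ON =>
END_TIMER_10 or BELT_ON or KEY_OFF => ALARM_OFF
Data flow example: o(n) = c1 * i(n) + c2 * i(n-1)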
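The seat-belt alarm state diagram can be written down as a transition table. The following is a minimal sketch in Python, not the authors' implementation. The source and target state of each transition are assumptions read from the diagram labels, and the names TRANSITIONS, step and run are hypothetical.

```python
# Table-driven FSM for the seat-belt alarm.
# Assumption: (state, event) -> (next_state, output); the state each
# transition starts from and ends in is inferred, not given in the slides.
TRANSITIONS = {
    ("OFF", "KEY_ON"): ("WAIT", "START_TIMER"),
    ("WAIT", "END_TIMER_5"): ("ALARM", "ALARM_ON"),
    ("WAIT", "KEY_OFF"): ("OFF", None),
    ("WAIT", "BELT_ON"): ("OFF", None),
    ("ALARM", "END_TIMER_10"): ("OFF", "ALARM_OFF"),
    ("ALARM", "BELT_ON"): ("OFF", "ALARM_OFF"),
    ("ALARM", "KEY_OFF"): ("OFF", "ALARM_OFF"),
}


def step(state, event):
    """Return (next_state, output); unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), (state, None))


def run(events, state="OFF"):
    """Feed a sequence of events to the FSM and collect the emitted outputs."""
    outputs = []
    for event in events:
        state, output = step(state, event)
        if output is not None:
            outputs.append(output)
    return state, outputs
```

For example, run(["KEY_ON", "END_TIMER_5", "BELT_ON"]) walks OFF, WAIT, ALARM and back to OFF. Keeping the behavior in a table rather than in nested conditionals makes it easy to compare against the diagram.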

Structural Models
Structural models focus on the structure of the system, i.e. its components,
modules, etc., rather than its behavior
Structural blocks may be
 abstract (ALUs, processors, memory, busses, chipsets, boards) or
 detailed (flip-flops, gates)
Examples: netlist, architectural
block diagram

Behavior and Structure
[Figure: design flow. Labels: requirements; system behavior (models of computation, behavior simulation); system architecture (structural model); mapping (HW/SW partitioning, scheduling); performance simulation; validation; communication refinement; synthesis; flow to implementation.]

Behavior meets Structure: The Optimization Problem
 There are numerous solutions to define
the behavior consistent with the given
requirements (algorithms, data structures)
 There are numerous ways to model the
defined behavior of the system
 There are numerous solutions to define
the structure of the system
(Microcontroller, DSP, customized HW,
configurable HW, ...)
 There are multiple ways to model the
defined structure of the system
 Design is about mapping the behavior
(including data and functions) on the
structure such that all requirements are
fulfilled (cost, time constraints, capacity,
reliability, maintainability, power
consumption, ...)
 Mapping is a very complex optimization
problem
Structural Space
System Platform
Behavioral Space

Design: Behavior vs. Structure
Behavioral specifications describe the functionality of the system using some
modeling or programming language
 behavior specifications may be
 abstract models (state charts, UML, SDL) or
 concrete programs (C, VHDL, SystemC)
 behavioral specifications may be
 executed/implemented on real HW (C program, assembler) or
 simulated on virtual HW (VHDL, SystemC, SDL)
Behavioral specifications ensure that
 the functional requirements are met
 however there is no confidence in non-functional aspects of the system, e.g.
performance, real-time, fault tolerance, cost, power consumption, ...
Structural specifications are needed to implement the system in HW
So, when is the best point in time to decide the structure?

Implementation
Prerequisites:
 Functional details such as algorithms, etc. are specified
 HW components are selected
 HW/SW partitioning may be decided
 ...
Tasks:
 coding of functions, algorithms, etc. in the selected implementation language
 test of the modules and components in isolation emulating the environment of
the modules/components
Notes:
 provided the design is complete and correct, this is straightforward
 the implementation phase represents a small part of the development process
(approx. 20% for pure SW projects)

Validation Methods
 By construction
 Property is inherent.
 By verification
 Property is provable.
 By testing
 Check behavior for (all) inputs.
 By simulation
 Check behavior in the model world.
 By intuition
 Property is true. I just know it is.
 By assertion
 Property is true. Wanna make something of it?
 By intimidation
 Don’t even try to doubt whether it is true.
It is generally better to be higher in this list!
Validation is a continuous process applied
 in different phases of the development process and
 to different models of the system
to ensure conformance with various properties/requirements of the system or its
components (behavior, temporal requirements, shock resistance, ...)

Integration
Purpose:
 ensure compliance with system requirements
 complete the system for delivery
Tasks:
 System integration: subsequent addition of HW components and SW
modules to the system until the final system is established
 Integration testing: stepwise testing of system (requires knowledge of the
system as a whole)
 System testing: test after all parts have been integrated
Notes:
 Testing may be applied to almost all requirements or properties of systems,
system components or modules (functionality, performance, reliability, thermal
resistance, shock resistance, ergonomics, man-machine interface,
documentation, ...)
 Testing is the most popular validation method in practice

Maintenance
 involved during the whole lifetime of a system, from delivery till removal from
service
 deal with changes due to
 changing environments,
 changing functional or
 performance requirements
 removal of errors
Note: the maintenance costs are often much greater than the development costs

Process Models – Overview
 Waterfall model (top-down)
 engineering approach to building a house, bridge, etc.
 no feedback assumed
 Iterative waterfall model
 validation and feedback to earlier stages
 Evolutionary model
 system development process is considered an evolution of prototypes
 requirements are subsequently added to the system
 Spiral model
 generalisation of various process models (meta model)
 multiple development cycles including validation
 V model
 continuous validation with real world/environment
 Component-based (bottom-up)
 compose the system of a set of predefined components (object-based)

Classic Waterfall Model & Iterative Waterfall Model
Analysis
Design
Implementation
Integration
Maintenance
Classic waterfall model (top-down)
 engineering approach to building a
house, bridge, etc.
 no feedback assumed
Iterative waterfall model
 validation and feedback to earlier stages

Evolutionary Model
Limits of the waterfall model
 often the requirements are incomplete in the beginning
=> the waterfall model is not appropriate where requirements are not well understood or not well defined
 with the waterfall model, there are no intermediate product releases
Idea of the evolutionary model:
 provide intermediate product releases
 refine and extend requirements during the development process
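[Figure: evolutionary loop: analysis -> design -> implementation -> test -> "new prototype needed?"; if yes (y), the product definition is modified and the loop repeats; if no (n), the loop ends.]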

Spiral Model
Meta model supporting the flexible combination of the above approaches
[Figure: spiral with four quadrants, passed in every cycle:]
1. define objectives, alternatives and constraints
2. evaluate alternatives, identify and resolve risks
3. develop and verify
4. review results; plan next iteration

V Model
Extension of the waterfall model to integrate quality assurance (verification and
validation)
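[Figure: V diagram. Left branch (top to bottom) paired with right branch (bottom to top):]
requirements definition <-> acceptance test (via application scenarios)
top-level design <-> system test (via test cases)
detailed design <-> integration test (via test cases)
module implementation <-> module test (via test cases)
The figure marks the upper levels of the V as validation and the lower levels as verification.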
Validation: ensure the system conforms with the needs of the environment (are we
building the right system? – product quality)
Verification: ensures that the outcome of a development phase exactly conforms to
the specification provided as input (is the system built right? – process quality)

Traditional (Early Partitioning) vs. Codesign Approach
Early Partitioning (Structure First)
Flow: system architecture -> HW description / SW description -> HW implementation / SW implementation -> prototype/product
+ optimized descriptions/models for HW and SW parts, respectively
- lack of flexibility wrt. HW/SW partitioning
- problems with HW/SW integration
HW/SW Codesign (Behavior First)
Flow: system description -> system architecture -> HW implementation / SW implementation -> prototype/product
+ joint system description/model eases validation and integration
+ flexibility wrt. HW/SW partitioning
- joint description is not optimized for both HW and SW

Traditional vs. Codesign Approach (Polis, Cadence VCC)
Traditional System Design: system behavior, system architecture, system implementation, system performance; data sheets on paper
VCC Separation and Mapping: (1) system behavior, (2) system architecture, (3) mapping of behavior on architecture, (4) refinement of the implementation of the system; executable data sheets

References
System Focus
 D. Gajski, F. Vahid, S. Narayan, J. Gong: Specification and Design of
Embedded Systems. Prentice Hall, 1994.
 A. Mitschele-Thiel: Systems Engineering with SDL – Developing Performance-
Critical Communication Systems. Wiley, 2001. (section 2.1.2)
 J. Teich: Digitale Hardware/Software Systeme. Springer, 1997.
Software Focus
 H. Balzert: Lehrbuch der Software-Technik – Band 1: Softwareentwicklung.
Spektrum-Verlag, 2001.
 R. S. Pressman: Software Engineering – A Practitioner's Approach. Fourth
Edition, McGraw Hill, 1997.

Part III
Requirements

Requirements
 Analysis process
 Functional requirements
 Performance requirements
 Real-time requirements
 Safety and reliability
 Principles and elements of requirements analysis

The Importance of Requirements
Proper definition of the requirements is vital to ensure quality!

Review of the Development Process
[Figure: development process. The analysis phase (problem analysis, feasibility study, requirements analysis) is followed by design.]
The requirements analysis is a detailed study of the
requirements of the system as seen from its environment.
Major tasks are to
 identify,
 analyze and
 classify
the requirements of the product to be built

Requirements Definition: Contents
 Identification of the system (interfaces to the environment)
 Functional requirements (functionality provided at the interfaces)
 Temporal and performance requirements (throughput, response time, delay,
jitter)
 Fault-tolerance and reliability
 Quality (absence of errors)
 Safety
 Operating platform (OS, general HW)
 Power consumption
 Heat dissipation
 Operating environment (operating temperature, shock-, dust-resistance, etc.)
 Size
 Mechanical construction
 EMC (Tx/Rx)
 Maintainability
 Extendability
 Support
 Documentation
 Cost (development, deployment and operation)
 Date of completion
 ...
=> let‘s take a look at
some details

Functional Requirements
Definition of the exact behavior of the system as seen at its interfaces
Description technique highly depends on the kind of system:
 (state) control system -> state machine
 transformational system -> data flow model
=> see section on behavioral models for details
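Example of control system: seat belt control
[Figure: state diagram with the states OFF, WAIT and ALARM; visible transition labels: END_TIMER_5 => ALARM_ON; KEY_OFF or BELT_ON => ; END_TIMER_10 or BELT_ON or ...]
Example of transformational system: FIR filter o(n) = c1 * i(n) + c2 * i(n-1)
[Figure: data flow graph. The input i(n) is multiplied by c1; a delay element provides i(n-1), which is multiplied by c2; the products c1 * i(n) and c2 * i(n-1) are added to give o(n).]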
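The FIR filter equation maps directly onto a loop with one delay element. The following is a minimal sketch in Python; the function name fir2 and the initial delay content i(-1) = 0 are assumptions, since the slides do not state an initial condition.

```python
def fir2(samples, c1, c2):
    """Compute o(n) = c1 * i(n) + c2 * i(n-1) for every input sample.

    Assumption: the delay element starts at 0, i.e. i(-1) = 0.
    """
    outputs = []
    previous = 0  # content of the delay element, i(n-1)
    for current in samples:
        outputs.append(c1 * current + c2 * previous)
        previous = current  # shift the sample into the delay element
    return outputs
```

For example, fir2([1, 2, 3], 2, 1) returns [2, 5, 8]. Unlike the seat-belt controller, the function has no control states; it only transforms a stream of data, which is why a data flow model describes it more naturally than a state machine.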

Performance Requirements
Important performance requirements
 Capacity
 Response time
 Jitter
Examples of performance requirements:
 capacity: number (and kind) of events
processed per second
 response time: time to process an event
(95th percentile)
The performance of the system depends on
the load imposed on it, i.e. the traffic model
The performance is highly influenced by the
design, especially
 the module/component design
 the available processing and
communication resources
 the scheduling strategy
[Figure: response time as a function of load]
Performance cannot be „added on“
to the implementation
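A response-time requirement stated as a percentile can be checked against measured samples. The following is a minimal sketch in Python using the nearest-rank definition of a percentile, which is one of several common definitions and is an assumption here; the function names and the sample values are hypothetical.

```python
def percentile_nearest_rank(values, p):
    """Return the p-th percentile (integer p in 1..100) by the nearest-rank method."""
    ordered = sorted(values)
    n = len(ordered)
    # ceil(p * n / 100) in integer arithmetic, at least rank 1
    rank = max(1, (p * n + 99) // 100)
    return ordered[rank - 1]


def meets_response_time(samples_ms, limit_ms, p=95):
    """True if the p-th percentile of the measured response times is within the limit."""
    return percentile_nearest_rank(samples_ms, p) <= limit_ms
```

For example, with the response times 1, 2, ..., 20 ms the 95th percentile is 19 ms, so a 19 ms limit is met although the slowest event took 20 ms. The measurement is only meaningful for the load under which it was taken.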

Real-time (Temporal) Requirements
Definitions:
 If the result is useful even after the deadline, we call the deadline soft.
 If the result is of no use after the deadline has passed, the deadline is called firm.
 If a catastrophe could result if a strict deadline is missed, the deadline is called
hard.
 A real-time computer system that has to meet at least one hard deadline is called
a hard real-time system.
System design for hard and soft real-time systems is fundamentally different.
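[Figure: usefulness of the result over time. For a hard or firm real-time requirement the usefulness drops abruptly at the deadline; for a soft real-time requirement it declines after the deadline.]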
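The three kinds of deadline can be contrasted as value functions of the completion time. The following is a minimal sketch in Python. The linear decay over a grace period for soft deadlines and the use of minus infinity to represent a catastrophe are illustrative assumptions, not part of the definitions above.

```python
import math


def usefulness(kind, finish_time, deadline, grace=1.0):
    """Value of a result delivered at finish_time (1.0 = fully useful).

    Assumptions: soft results lose value linearly over `grace` time units
    after the deadline; a missed hard deadline is modelled as -infinity.
    """
    if finish_time <= deadline:
        return 1.0
    if kind == "soft":
        lateness = finish_time - deadline
        return max(0.0, 1.0 - lateness / grace)
    if kind == "firm":
        return 0.0  # the result is of no use, but nothing worse happens
    if kind == "hard":
        return -math.inf  # missing the deadline may cause a catastrophe
    raise ValueError(f"unknown deadline kind: {kind}")
```

A result that is half a time unit late still has a value of 0.5 under a soft deadline, 0.0 under a firm deadline, and minus infinity under a hard deadline.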

Real-time (Temporal) Requirements
Examples:
 soft deadlines
 public transportation system
 airport luggage transport system
 firm deadlines
 audio processing
 video processing
 hard deadlines
 control of nuclear or chemical processes (chain reaction)
 railway traffic control
 air traffic control

Real-time Systems – Classification
 On the basis of the external requirements
 hard/firm real-time versus soft real-time
 fail safe vs. fail operational (e.g. train control system vs. fly-by-wire
system)
 On the basis of the design and implementation
 guaranteed timeliness vs. best effort
 resource adequacy vs. no resource adequacy (sufficient computational
resources to handle all specified peak loads and fault scenarios)
 event triggered vs. time triggered

Time Triggered (TT) vs. Event Triggered (ET) Systems
A system is Time Triggered (TT) if the control signals, such as
 sending and receiving of messages
 recognition of an external state change
are derived solely from the progression of a (global) notion of time.
A system is Event Triggered (ET) if the control signals are derived solely from
the occurrence of events, e.g.,
 termination of a task
 reception of a message
 an external interrupt.
Note that the triggering method is often an attribute of the implementation and
not necessarily a requirement.
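The difference shows up in the instant at which an external state change is recognised. The following is a minimal sketch in Python, assuming integer time units, a time-triggered system that samples its inputs at multiples of a fixed period, and an event-triggered system that reacts without delay; the function names are hypothetical.

```python
def time_triggered_recognition(change_times, period):
    """Instants at which a TT system recognises each change.

    A change at time t is seen at the next sampling instant, i.e. the
    smallest multiple of `period` that is >= t (integer arithmetic).
    """
    return [-(-t // period) * period for t in change_times]


def event_triggered_recognition(change_times):
    """Instants at which an ET system recognises each change (idealised: at once)."""
    return list(change_times)
```

With changes at times 1, 12 and 20 and a period of 10, the time-triggered system reacts at 10, 20 and 20, while the event-triggered system reacts at 1, 12 and 20. The time-triggered reaction is later on average, but its instants are known in advance, which makes the load predictable.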

Safety Requirements: Fail-Safe vs. Fail-Operational
Safety requirements define the action taken in the case of a failure.
A system is fail-safe if there is a safe state in the environment that can be
reached in case of a system failure, e.g. ABS, train signaling system.
In a fail-safe application the computer has to have a high error detection
coverage.
Fail safeness is a characteristic of the application, not the computer system.
A system is fail-operational, if no safe state can be reached in case of a system
failure, e.g. a flight control system aboard an airplane.
In fail-operational applications the computer system has to provide a minimum
level of service, even after the occurrence of a fault.

Reliability Requirements
Reliability denotes the probability of failure, or of absence of failure, of a
system
Examples of reliability figures are
 MTTF (mean time to failure)
 MTBF (mean time between failures)
 probability for up-time (e.g. 99.995%)
The reliability of the system can be estimated/calculated (in theory) from the
reliability of its components
A system that ensures that it still functions correctly even in the case of failure of
some components is called a fault tolerant system (i.e. it is able to tolerate faults
of single components of the system)
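How the reliability of a system follows from its components can be illustrated with the two basic structures. The following is a minimal sketch in Python, assuming that component failures are statistically independent, which real systems often violate; the function names are hypothetical.

```python
import math


def series_reliability(component_reliabilities):
    """System works only if every component works (no redundancy)."""
    return math.prod(component_reliabilities)


def parallel_reliability(component_reliabilities):
    """System works if at least one component works (full redundancy)."""
    return 1.0 - math.prod(1.0 - r for r in component_reliabilities)
```

Two components with a reliability of 0.9 each give about 0.81 in series and about 0.99 in parallel. The parallel case is the simplest form of fault tolerance: the system tolerates the failure of a single component.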

Predictability in Rare Event Situations
A rare event is an important event that occurs very infrequently during the
lifetime of a system, e.g. the rupture of a pipe in a nuclear reactor.
A rare event can give rise to many correlated service requests (e.g. an alarm
shower).
In a number of applications, the merit of a system depends on the predictable
performance in rare event scenarios, e.g. a flight control system.
In most cases, typical workload testing will not cover the rare event scenario.

Principles and Elements of the Analysis Model
Guidelines for the analysis:
 understand the problem first! (before you begin to create the analysis model)
 record origin and reason for every requirement
 use multiple views of requirements (data model, functional model, behavioral
models)
 prioritize requirements
 eliminate ambiguities
Elements of the analysis model:
 data dictionary
 process specification (data-flow diagram)
 control specification (state-transition diagram)
 data object description (entity-relationship diagram)
 functional specification (sequence diagram)
Specific methods and tools for various application areas have been proposed,
e.g. real-time systems, transformational systems, control systems,
communication systems, etc.

References
 H. Balzert: Lehrbuch der Software-Technik – Band 1: Softwareentwicklung.
Spektrum-Verlag, 2001.
 R. S. Pressman: Software Engineering – A Practitioner's Approach. Fourth
Edition, 1997. (Chapter 12: Analysis Modeling)
 A. Mitschele-Thiel: Systems Engineering with SDL – Developing Performance-Critical Communication Systems. Wiley, 2001.
 B. Thomé (Editor): Systems Engineering – Principles and Practice of
Computer-based Systems Engineering, Wiley, 1993.

Part IV
Behavioral Models and
Specification Languages

Behavioral Models and Specification Languages
Behavioral Models
 Finite State Machine
(FSM)
 NDFSM
 composed FSM
 Petri Net (PN)
 Data Flow Graph (DFG)
 Control Flow Graph (CFG)
 Control/Data Flow Graph
(CDFG)
 StateCharts
 SDL
 VHDL
 SystemC
 ...
Basic Concepts
 concurrency
 hierarchy
 communication
 synchronisation
 exception handling
 non-determinism
 timing

Finite State Machines (FSM)
Functional decomposition into states of operation
 finite states
 transitions between states
 event triggered transitions
 neither concurrency nor time (sequential FSMs)
Typical applications:
 reactive (control) systems
 protocols (telecom, computers, ...)

5
Finite State Machines – Control Algorithms

6
Finite State Machines – Discussion

7
Moore vs. Mealy Automata
Theoretically, same computational power
In practice, different characteristics
 Moore machines:
 non-reactive
(response delayed by 1 cycle –
clocked change of output only)
 easy to design
(always well-defined)
 good for SW implementation
 software is always “slow”
 Mealy machines:
 reactive (immediate response
to changes of input)
 hard to compose
 problematic SW implementation
 due to immediate response to changes of input (interrupts/polling)
 software must be “fast enough”
 may be needed in hardware, for speed
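The difference in reaction time can be made concrete with a small simulation. The following Python sketch is an illustration only: the rising-edge detector, the state names and the functions `run_mealy` and `run_moore` are assumptions made for this example and are not taken from the slides. The Mealy variant reports an edge in the step in which the input changes, the Moore variant one step later.

```python
def run_mealy(inputs):
    """Mealy machine: the output depends on the current state AND the current input."""
    state, outputs = "IDLE", []
    for x in inputs:
        # output function: reacts in the same step as the input change
        outputs.append(1 if (state == "IDLE" and x == 1) else 0)
        # next-state function
        state = "SEEN" if x == 1 else "IDLE"
    return outputs


def run_moore(inputs):
    """Moore machine: the output depends on the current state only."""
    state, outputs = "IDLE", []
    for x in inputs:
        # output function: the edge becomes visible once state EDGE is reached
        outputs.append(1 if state == "EDGE" else 0)
        # next-state function
        if x == 1:
            state = "EDGE" if state == "IDLE" else "HIGH"
        else:
            state = "IDLE"
    return outputs


samples = [0, 1, 1, 0, 1]
mealy_out = run_mealy(samples)   # edge visible in the same step
moore_out = run_moore(samples)   # same edge visible one step later
```

In this sketch the Moore output sequence is the Mealy output sequence shifted by one step, which matches the "response delayed by 1 cycle" remark above.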
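[Figure: block diagrams of the Moore and Mealy automata, each built from the blocks δ and τ plus an output block (µ in one diagram, λ in the other), with the signals X, Y and Z]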

8

9
Finite State Machines – Example: state diagram (informal)

10
Advantages:
 Easy to use (graphical languages)
 Powerful algorithms for
 synthesis (SW and HW)
 verification
Disadvantages:
 Sometimes over-specify implementation
(sequencing is fully specified)
 Number of states can be unmanageable
 Numerical computations cannot be specified compactly
(need Extended FSMs)

11
Finite State Machines - Extensions
Divide and conquer
⇒ Nondeterminism
⇒ Parallel automata
⇒ Processes
⇒ Communication
⇒ Hierarchy
⇒ Graphical support
⇒ Extended formal semantics

12
NDFSM: Time Range
Special case of unspecified/unknown behavior, but so common to deserve
special treatment for efficiency
Example: nondeterministic delay (between 6 and 10 s)
[Figure: NDFSM with states 0 to 9; transitions are labeled START =>, SEC =>, and SEC => END]

13
NDFSMs and FSMs
 Formally FSMs and NDFSMs are equivalent
(Rabin-Scott construction, Rabin ‘59)
 In practice, NDFSMs are often more compact
(exponential blowup for determinization)
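A minimal Python sketch of the subset construction (determinization), assuming an NDFSM without outputs that is given as a dictionary from (state, input) to a set of successor states. The example automaton and all names (`determinize`, `nfa`) are hypothetical and chosen for illustration; they are not a transcription of the figure on this slide.

```python
from collections import deque


def determinize(nfa, start):
    """Subset construction: every deterministic state is a set of NDFSM states."""
    symbols = sorted({symbol for (_, symbol) in nfa})
    start_set = frozenset([start])
    dfa = {}                      # (state set, symbol) -> successor state set
    seen = {start_set}
    todo = deque([start_set])
    while todo:
        current = todo.popleft()
        for symbol in symbols:
            # union of all successors reachable from any member of the set
            target = frozenset(
                t for s in current for t in nfa.get((s, symbol), ())
            )
            if not target:
                continue          # no transition defined for this symbol
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa, seen


# Hypothetical NDFSM: input a in state s1 may lead to s2 or to s3.
nfa = {
    ("s1", "a"): {"s2", "s3"},
    ("s2", "b"): {"s3"},
    ("s3", "c"): {"s1"},
}
dfa, dfa_states = determinize(nfa, "s1")
```

In the worst case the number of state sets grows exponentially with the number of NDFSM states, which is the blowup mentioned above.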
Example: non-deterministic selection
of transition a in state s1
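[Figure: NDFSM with states s1, s2, s3 and transitions labeled a, b, c; state s1 has more than one outgoing transition labeled a]
[Figure: equivalent deterministic FSM with states s1, {s2,s3}, s2, s3]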

14
Modeling Concurrency – parallel automata
Systems are typically composed of chunks of rather independent functionalities,
e.g. seat belt control || timer || driver
Systems may be physically distributed,
e.g. peer protocol automata
Need to compose parts described by sequential FSMs
 construct a complete model of the system
 building the cartesian product results in state explosion
Approach
 Describe the system using a number of separate FSMs and interconnect them
Issue
 How do the interconnected FSMs talk to each other?
Fundamental hypothesis:
all the FSMs change state together (synchronicity)
System state = Cartesian product of component states
(state explosion may be a problem...)
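The synchronous composition can be sketched in a few lines of Python. This is an illustration under assumptions: each FSM is a dictionary from state to {input: next state}, the two component machines (`toggle`, `counter`) are invented for the example, and `compose` is a hypothetical helper name.

```python
from itertools import product


def compose(fsm_a, fsm_b):
    """Synchronous product: both components take one transition per step."""
    composed = {}
    for state_a, state_b in product(fsm_a, fsm_b):
        composed[(state_a, state_b)] = {
            (in_a, in_b): (fsm_a[state_a][in_a], fsm_b[state_b][in_b])
            for in_a, in_b in product(fsm_a[state_a], fsm_b[state_b])
        }
    return composed


# Two small, hypothetical component FSMs.
toggle = {
    "OFF": {"press": "ON", "idle": "OFF"},
    "ON": {"press": "OFF", "idle": "ON"},
}
counter = {
    0: {"tick": 1, "hold": 0},
    1: {"tick": 2, "hold": 1},
    2: {"tick": 0, "hold": 2},
}

system = compose(toggle, counter)
# 2 component states times 3 component states give 6 product states.
```

Describing the components separately needs 2 + 3 states, while the flat product needs 2 * 3 states; with more components this gap is the state explosion noted above.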

15
FSM Composition – Example
Example: seat belt control || timer
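Belt control:
• 5 sec after the car key is switched on, an alarm signal should be on as long as
the belt is not locked.
• After 10 sec the alarm should be switched off.
[Figure: two interconnected FSMs, Belt Control and Timer. Belt Control has the states OFF, WAIT and ALARM; visible transition labels are "END_TIMER_5 => ALARM_ON", "KEY_OFF or BELT_ON =>" and "END_TIMER_10 or BELT_ON or ..." (truncated in the source)]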

16
Example: seat belt control || timer
[Figure: Timer FSM with states 0 to 9; transitions are labeled START_TIMER =>, SEC =>, SEC => END_TIMER_5 and SEC => END_TIMER_10. The Belt Control FSM is repeated from the previous slide.]

17
Cartesian product
[Figure: excerpt of the product automaton with the states (OFF, 0), (WAIT, 1), (WAIT, 2), (OFF, 1) and (OFF, 2). Transition labels:
KEY_ON and START_TIMER => (START_TIMER must be coherent)
SEC and not (KEY_OFF or BELT_ON) =>
not SEC and (KEY_OFF or BELT_ON) =>
SEC and (KEY_OFF or BELT_ON) =>]

18
Finite State Machines - Extensions: parallel automata

19
FSM Extensions – Example: user interaction > Processes

20
FSM Extensions - Communication (MSC)

21
Hierarchical FSM models – StateCharts
Problem: how to reduce the size of the representation?
Harel’s classical papers on StateCharts (language) and bounded concurrency
(model): 3 orthogonal exponential reductions
 Hierarchy:
 state a “encloses” an FSM
 being in a means FSM in a is active
 states of a are called OR states
 used to model preemption and exceptions
 Concurrency:
 two or more FSMs are simultaneously active
 states are called AND states
 Non-determinism:
 used to abstract behavior
[Figure: StateChart examples with a state a and states a1, a2, error, recovery, odd, even, done]

22
StateCharts – Basic Principles
Basic principles:
 An extension of conventional FSMs
 Conventional FSMs are inappropriate for the behavioral description of complex
control
 flat and unstructured
 inherently sequential in nature
 StateCharts support
 repeated decomposition of states into sub-states in an AND/OR fashion,
combined with a
 synchronous communication mechanism (instantaneous broadcast)
State decomposition:
 OR-States have sub-states that are related to each other by exclusive-or
 AND-States have orthogonal state components (synchronous FSM composition)
 AND-decomposition can be carried out on any level of states (more
convenient than allowing only one level of communicating FSMs)
 Basic States have no sub-states (bottom of hierarchy)
 The Root State has no parent state (top of hierarchy)

23
StateCharts – OR Decomposition
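[Figure: FSM with states S, T, V and transitions e, f, g, h, shown once as a flat FSM and once with S and T enclosed in a super-state U]
To be in state U the system must be either in state S or in state T.
State U is an abstraction of states S and T.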

1
Part V
High-level Synthesis

3
 Motivation for high-level synthesis
 Domains of the HW design
 Levels of abstractions
 Overview on synthesis methods
 High-level synthesis tasks and models
 ASAP and ALAP
 List scheduling
 Advanced Issues

4
Motivation for High-level Synthesis
Complexity problem: millions of transistors on a single chip
=> handcrafting of each single transistor is not possible
=> handcrafting of single gates is not possible
=> cost and time of the process require getting it right the first time
=> need design automation on more abstract levels
=> high-level synthesis
 algorithm synthesis
 HW/SW (system) synthesis
Design automation helps to:
 speed up the design process
 do it right the first time
=> shorter time-to-market

5
Domains of HW Design
Y chart: design domains and abstraction levels
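[Figure: Y chart with three axes graded by abstraction level: structural domain (example: EXOR gate "=1" with inputs A, B and output Y, t = 5 ns), behavioral domain (example: the VHDL process shown here), physical domain (example: layout of a PowerPC 750)]
EXOR: process (A, B)
begin
  Y <= transport A xor B after 5 ns;
end process;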

6
Abstraction Levels
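Abstraction levels per domain (from lowest to highest):
 physical domain: transistor layout, cells, chips, boards
 behavioral domain: transistor functions, Boolean expressions, register transfers, CFG/algorithms
 structural domain: transistors; gates, flip-flops; registers, ALUs, MUXs; processors, memories, buses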

7
Structural Synthesis
Structural synthesis is the translation from a behavioral description into a
structural description
[Figure: Y chart with the abstraction levels of the physical, behavioral and structural domains]

8
Circuit Synthesis
 generates a transistor schematic from a set of input-output current, voltage
and frequency characteristics or equations
 transistor schematic contains transistor types, parameters and sizes

9
Logic Synthesis
 translation of Boolean expressions into a netlist of components from a given
library of logic gates such as NAND, NOR, EXOR, etc.
-> see logic synthesis section for details

10
Register-transfer Synthesis
 start with a set of states and a set of register-transfers in each state
 one state typically corresponds to a clock cycle (clock-accurate description)
 register-transfer synthesis generates the corresponding structures
in two parts:
(a) a data path, which is a structure of storage elements and functional units
that perform the given register transfers, and
(b) a control unit that controls the sequencing of the states in the
register-transfer description

11
High-level synthesis (also called system synthesis or algorithmic synthesis) may
cover HW as well as SW parts of the system
 starts with a set of processes communicating through either shared variables or
message passing (an un-clocked description)
 generates a structure of processors, memories, controllers
and interface adapters from a set of system components
 each component can be described by a register-transfer description

12
Levels of Synthesis

13
High-level Synthesis – Central Tasks
High-level synthesis deals with
 the algorithmic level (behavioral viewpoint)
 the system level (structural viewpoint)
Tasks of high-level synthesis
 (system) partitioning
 partitioning of a behavioral description or design structure into
subdescriptions or substructures
 reduce the problem size
 satisfy external constraints such as chip size, pins per package, power
dissipation or wire length
 allocation
 selection of the number and types of structural entities
 mapping (Gajski: allocation; Teich: "Bindung", i.e. binding)
 assignment of data to storage units (registers)
 assignment of operations to functional units (ALUs, etc.)
 assignment of communications to buses or links
 scheduling (Teich: "Ablaufplanung", i.e. scheduling)
 temporal assignment of data and operations
 derivation of controller (microprogram)

14
High-level Synthesis: Theory
Behavioral model:
$G_S = (V_S, E_S)$ is a directed acyclic graph where
 each node $v_S \in V_S$ represents a task and
 each arc $e = (v_i, v_j) \in E_S$ defines a data dependency (execute $v_i$ before $v_j$)
Resource model:
$G_R = (V_R, E_R)$ is a bipartite graph with
 $V_R = V_S \cup V_T$ where
 $V_S$ specifies the nodes of the behavioral model
 $V_T$ specifies the nodes representing resource types
 $(v_S, v_T) \in E_R$ with $v_S \in V_S$ and $v_T \in V_T$ specifies that $v_S$ may be
implemented by a resource of type $v_T$
 the cost function $c$ denoting the cost of each instance of resource type $v_T$ and
 the node execution time $t$ denoting the latency of the execution of task $v_S$ on
a resource of type $v_T$

15
Summary of Basic Concepts of Models and Languages
State transitions
 events triggering a state transition
(simple input, complex conditions)
 computation associated with
transition
Concurrency
 decomposition of behavior in
concurrent entities
 different levels of concurrency (job,
task-, statement-, operation-level)
 data-driven (data dependencies) vs
control-driven concurrency (control
dependencies)
 reduction of states
Hierarchy
 structural hierarchy (system, block,
process, procedure)
 behavioral hierarchy (hierarchical
transitions, fork-join)
Programming constructs
 specify sequential algorithm
Communication
 shared variables (broadcast)
 message passing
 synchronous vs. asynchronous
Synchronization
 control-dependent (fork-join)
 data-dependent (data, event,
message)
Exception handling
 immediate termination of current
behavior
Non-determinism
 choice between multiple transitions
 non-deterministic ordering
Timing
 timeouts
 time constraints (e.g. exec. time)

16
Control vs. Data Flow Applications
Rough classification:
 control:
 don’t know when data arrive
(quick reaction)
 time of arrival often matters
more than value
 data:
 data arrive in regular streams
(samples)
 values matter most
Distinction is important for:
 specification (language, model, ...)
 synthesis (scheduling,
optimization, ...)
 validation (simulation, formal
verification, ...)
Specification, synthesis and validation
methods emphasize:
 for control:
 event/reaction relation
 response time
(real-time scheduling for
deadline satisfaction)
 priority among events and
processes
 for data:
 functional dependency between
input and output
 memory/time efficiency
(data-flow scheduling for
efficient pipelining)
 all events and processes are
equal

17
Control/Data Flow Graph (CDFG)
 also called sequence graph
 mixture of control and data flow graph
 hierarchy of sequential elements
 units model data flow
 hierarchy models control flow
 special nodes (for control operations)
 start/end node: NOP (no operation) – all inputs needed (AND), all outputs
needed (AND)
 branch node (BR) – one out of many outputs selected (OR)
 iteration (LOOP) – one out of two outputs selected (OR)
 procedure call (CALL) – lower hierarchy is executed exactly once
 attributes
 nodes: execution time, cost, ...
 arcs: conditions for branches and loops

21
DFG – Example

22
CDFG – Loop

24
Data Flow Graph (DFG)
Powerful formalism for data-dominated applications
DFGs support the specification of transformational systems:
 output is a function of the input
 set of actors (nodes) connected by a set of arcs representing the data flow
 no states, no external events to trigger state changes
 unbounded FIFO queues (main data store)
 no control nodes, e.g. branch, loop
DFGs represent a partially ordered model of the computation
=> specification of problem-inherent dependencies only
=> suitable for scheduling and code generation
=> there is a relation between buffer dimensioning and scheduling
(static scheduling minimizes the number of buffers required)
Languages:
 graphical: Ptolemy (UCB), GRAPE (U. Leuven), SPW (Cadence), COSSAP
(Synopsys)
 textual: Silage (UCB, Mentor), Haskell, Lucid

25
High-level Synthesis: Example
Behavioral model:
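[Figure: sequence graph with 11 operation nodes; nodes 1, 2, 3, 6, 7, 8 are multiplications (*), nodes 9 and 10 additions (+), nodes 4 and 5 subtractions (-), node 11 a comparison (<); arcs denote data dependencies]
Resource model:
[Figure: bipartite graph; the multiplications 1, 2, 3, 6, 7, 8 may be implemented on a multiplier, the operations 4, 5, 9, 10, 11 on an ALU]
Note: the behavioral model does not define clocking (different from RT synthesis)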

26
Scheduling with unlimited resources:
=> latency 4 T
[Figure: sequence graph scheduled into the time steps t0 to t4]

27
Mapping and scheduling with limited resources:
 4 multipliers
 2 ALUs
=> latency 4 T
[Figure: sequence graph scheduled into the time steps t0 to t4]

28
Mapping and scheduling with
limited resources:
 1 multiplier
 1 ALU
=> latency 7 T
[Figure: schedule of the 11 operations on one multiplier and one ALU over the time steps t0 to t7]

29
ASAP Scheduling without Resource Constraints
ASAP (as soon as possible) scheduling without resource constraints:
 algorithm: for each time slot select node which has all predecessors assigned
 problem is solvable in polynomial time
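A minimal Python sketch of ASAP scheduling with unit execution times. The node numbering and operation types follow the example slides; the edge list `EDGES` is an assumption, reconstructed from the slide figures rather than given in the text, and the function name `asap` is chosen for this example.

```python
# Operation type of every node (taken from the example slides).
OPS = {1: "*", 2: "*", 3: "*", 4: "-", 5: "-", 6: "*",
       7: "*", 8: "*", 9: "+", 10: "+", 11: "<"}
# Data dependencies (u, v): execute u before v. ASSUMPTION: reconstructed.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]


def asap(nodes, edges):
    """Assign every node to the earliest time step (1, 2, ...) its predecessors allow."""
    preds = {v: [u for (u, w) in edges if w == v] for v in nodes}
    start = {}
    step = 1
    while len(start) < len(preds):
        # nodes whose predecessors were all scheduled in earlier steps
        ready = [v for v in preds
                 if v not in start and all(p in start for p in preds[v])]
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        for v in ready:
            start[v] = step
        step += 1
    return start


asap_schedule = asap(OPS, EDGES)
latency = max(asap_schedule.values())   # number of time steps needed
```

Each pass over the remaining nodes schedules at least one node, so the loop terminates after at most as many passes as there are nodes, which is consistent with the polynomial-time remark above.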
Resulting schedule:
 t0 to t1: nodes 1, 2, 6, 8, 10 (assign all nodes without predecessors)
 t1 to t2: nodes 3, 7, 9, 11 (assign nodes with scheduled predecessors)
 t2 to t3: node 4 (ditto)
 t3 to t4: node 5 (ditto)

30
ALAP Scheduling without Resource Constraints
ALAP (as late as possible) scheduling without resource constraints:
 algorithm: complementary to ASAP; start with nodes without successors
 problem is solvable in polynomial time
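A matching Python sketch for ALAP scheduling with unit execution times and a given latency bound. As before, the edge list is an assumption reconstructed from the slide figures, and `alap` is a name chosen for this example.

```python
OPS = {1: "*", 2: "*", 3: "*", 4: "-", 5: "-", 6: "*",
       7: "*", 8: "*", 9: "+", 10: "+", 11: "<"}
# ASSUMPTION: edge list reconstructed from the slide figures.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]


def alap(nodes, edges, latency):
    """Assign every node to the latest time step that still meets the latency bound."""
    succs = {v: [w for (u, w) in edges if u == v] for v in nodes}
    start = {}
    while len(start) < len(succs):
        # nodes whose successors have all been scheduled already
        ready = [v for v in succs
                 if v not in start and all(s in start for s in succs[v])]
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        for v in ready:
            # sinks go to the last step, all others directly before their
            # earliest successor
            earliest_successor = min((start[s] for s in succs[v]),
                                     default=latency + 1)
            start[v] = earliest_successor - 1
    return start


alap_schedule = alap(OPS, EDGES, latency=4)
```

The difference between the ALAP and the ASAP step of a node is its laxity (mobility); nodes with laxity 0 lie on a critical path.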
Resulting schedule:
 t3 to t4: nodes 5, 9, 11 (assign all nodes without successors)
 t2 to t3: nodes 4, 7, 8, 10 (assign nodes with scheduled successors)
 t1 to t2: nodes 3, 6 (ditto)
 t0 to t1: nodes 1, 2 (ditto)

31
Scheduling with Resource Constraints: ASAP Extension
Extensions to ASAP and ALAP, respectively
 compute schedule using ASAP (or ALAP)
 if a resource constraint is violated, move respective nodes
Example: extended ASAP (2 multipliers, 2 ALUs)
[Figure: ASAP schedule over the time steps t0 to t4 in which nodes 6, 7, 8 and 9 are shown moved to later time steps because of the resource limit]

32
Scheduling with Resource Constraints: List Scheduling
Apply global criteria to optimize the schedule
 derive priority for each node based on
 length of path to sink/source or
 laxity of node (i.e. the difference
between start according to ASAP and
ALAP) or
 number of successor nodes (fanout)
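A minimal Python sketch of list scheduling with unit execution times, using the length of the path to the sink as priority. The edge list is an assumption reconstructed from the slide figures; the names `list_schedule`, `path_length_to_sink`, `RESOURCE` and the tie-breaking rule (lower node number first) are choices made for this example.

```python
OPS = {1: "*", 2: "*", 3: "*", 4: "-", 5: "-", 6: "*",
       7: "*", 8: "*", 9: "+", 10: "+", 11: "<"}
# ASSUMPTION: edge list reconstructed from the slide figures.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]
# Resource type that executes each operation type.
RESOURCE = {"*": "multiplier", "+": "ALU", "-": "ALU", "<": "ALU"}


def path_length_to_sink(nodes, edges):
    """Priority: number of nodes on the longest path from a node to a sink."""
    succs = {v: [w for (u, w) in edges if u == v] for v in nodes}
    memo = {}

    def length(v):
        if v not in memo:
            memo[v] = 1 + max((length(s) for s in succs[v]), default=0)
        return memo[v]

    return {v: length(v) for v in nodes}


def list_schedule(ops, edges, capacity):
    """Schedule step by step; per step, fill the free units by descending priority."""
    if any(capacity.get(RESOURCE[op], 0) < 1 for op in ops.values()):
        raise ValueError("every required resource type needs at least one unit")
    prio = path_length_to_sink(ops, edges)
    preds = {v: [u for (u, w) in edges if w == v] for v in ops}
    start, step = {}, 1
    while len(start) < len(ops):
        used = {r: 0 for r in capacity}
        # ready: unscheduled nodes whose predecessors finished in an earlier step
        ready = [v for v in ops if v not in start
                 and all(p in start and start[p] < step for p in preds[v])]
        for v in sorted(ready, key=lambda n: (-prio[n], n)):
            unit = RESOURCE[ops[v]]
            if used[unit] < capacity[unit]:
                start[v] = step
                used[unit] += 1
        step += 1
    return start


tight = list_schedule(OPS, EDGES, {"multiplier": 1, "ALU": 1})
relaxed = list_schedule(OPS, EDGES, {"multiplier": 4, "ALU": 2})
```

With one multiplier and one ALU the sketch needs 7 time steps, with four multipliers and two ALUs it needs 4, in line with the latencies quoted on the example slides. List scheduling is a heuristic, so in general the result is not guaranteed to be optimal.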
Example: 1 multiplier, 1 ALU
Priority assignment (according to length to sink):
[Figure: sequence graph annotated with priorities 1 to 4 and the resulting schedule over the time steps t0 to t7]

33
Scheduling with Resource Constraints: List Scheduling
Example:
 2 combined multiplier/ALU units
 2 time units for multiplication
 1 time unit for ALU operation
Priority assignment (length to sink):
[Figure: sequence graph annotated with priorities 1 to 6 and the resulting schedule on the two units over the time steps t0 to t9]

34
Advanced Topics of High-level Synthesis
Considered so far:
 mapping and scheduling without resource constraints
 mapping and scheduling with given number (and type) of resources
Advanced topics:
 mapping and scheduling with time constraints and open number of resources
 mapping and scheduling of periodic tasks
 mapping and scheduling in the presence of multiple resources with identical
functionality but different area-latency relations
 ...
The general mapping and scheduling problem is NP-hard (no algorithm is known
that computes an optimal solution in polynomial time)
Numerous heuristic optimization algorithms have been applied to the problem

35
References
 D. Gajski, N. Dutt, A. Wu, S. Lin: High-level Synthesis – Introduction to Chip
and System Design. Kluwer Academic Publishers, 1992.
 Bleck, Goedecke, Huss, Waldschmidt: Praktikum des modernen VLSI-
Entwurfs. B.G. Teubner, 1996

5
Synchronous vs. Asynchronous FSMs
Synchronous FSMs (e.g. StateCharts):
 communication by shared variables that are read and written in zero time
 communication and computation happens instantaneously at discrete time
instants
 all FSMs execute a transition simultaneously (lock-step)
 may be difficult to implement
 multi-rate specifications
 distributed/heterogeneous architectures
Asynchronous FSMs (e.g. SDL, CSP) :
 free to proceed independently
 do not execute a transition at the same time (except for CSP rendezvous)
 may need to share notion of time: synchronization
 easy to implement
Multitude of commercial and non-commercial graphical languages and tools:
 StateCharts, UML, SDL, StateFlow
 tool support for design, simulation, validation, code generation, HW
synthesis, …

6
StateCharts – Basic Principles
Basic principles:
 An extension of conventional FSMs
 Conventional FSMs are inappropriate for the behavioral description of complex
control
 flat and unstructured
 inherently sequential in nature
 StateCharts support
 repeated decomposition of states into sub-states in an AND/OR fashion,
combined with a
 synchronous communication mechanism (instantaneous broadcast)
 StateCharts describe behavioral aspects; in addition (but less important),
 ModuleCharts can be used for structural aspects and
 ActivityCharts for data flow and control flow description
 Source: David Harel: Statecharts: A Visual Formalism for Complex Systems.
Science of Computer Programming 8 (1987) 231-274, North-Holland.
(Department of Applied Mathematics, The Weizmann Institute of Science,
Rehovot, Israel)

7
StateCharts – Syntax
 The general syntax of an expression labeling a transition in a StateChart is E(C)/A,
where S and T are states and
 E is the event that triggers the transition
 C is the condition that guards the transition
(the transition cannot be taken unless C is true when E occurs)
 A is the action that is carried out if and when the transition is taken
 For each transition label:
 condition and action are optional
 an event can be the changing of a value
 standard comparisons (e.g. x > y) are allowed as conditions
 assignment statements (e.g. x := 10) are allowed as actions
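How a label E(C)/A is evaluated can be sketched in Python. This is a simplified illustration for a flat set of transitions, without hierarchy or broadcast; the class `Transition`, the function `step` and the example label e(x > y)/x := 10 are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Transition:
    source: str
    target: str
    event: str                                          # E: triggering event
    condition: Optional[Callable[[dict], bool]] = None  # C: optional guard
    action: Optional[Callable[[dict], None]] = None     # A: optional action


def step(state, variables, event, transitions):
    """Take the first enabled transition for the event, otherwise stay in the state."""
    for t in transitions:
        guard_ok = t.condition is None or t.condition(variables)
        if t.source == state and t.event == event and guard_ok:
            if t.action is not None:
                t.action(variables)
            return t.target
    return state


# Hypothetical label on a transition from S to T: e(x > y)/x := 10
transitions = [
    Transition("S", "T", "e",
               condition=lambda v: v["x"] > v["y"],
               action=lambda v: v.update(x=10)),
]

blocked = {"x": 1, "y": 5}
state_when_blocked = step("S", blocked, "e", transitions)   # guard is false

enabled = {"x": 7, "y": 5}
state_when_enabled = step("S", enabled, "e", transitions)   # guard is true
```

In the first call the event occurs while the guard is false, so the state is unchanged; in the second call the transition is taken and the action assigns x := 10.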

8
StateCharts – Actions and Events
 An action A on the edge leaving a state may also appear as an event triggering
a transition going into an orthogonal state:
 a state transition broadcasts an event visible immediately to all other FSMs,
that can make transitions immediately and so on
 executing the first transition will immediately cause the second transition to
be taken simultaneously (problem in reality!!!)
 Actions and events may be associated to the execution of orthogonal
components: start(A), stopped(B)
 Entry / Exit actions in states

9
StateCharts – Hierarchy
State decomposition:
 OR-States have sub-states that are related to each other by exclusive-or
 AND-States have orthogonal state components (synchronous FSM composition)
 AND-decomposition can be carried out on any level of states (more
convenient than allowing only one level of communicating FSMs)
 Basic States have no sub-states (bottom of hierarchy)
 The Root State has no parent state (top of hierarchy)
 Initialization:
 Default (or initial states) can be marked in each hierarchy level
 History connector to remember states in sub-states
 Combination of default state on first start and history for further steps

11
StateCharts – Top Down Design
State V is an abstraction of states S and U

12
StateCharts – Default State
Flat structure Hierarchical structure

14
StateCharts – Exit on Sub-States
Incorrect (b=c ???) correct

15
StateCharts – Default State and History
Default: “off” on first activation
Then: history
Same meaning

16
StateCharts – AND State
Parallel structure: n+m states
Flat structure: ???

17
StateCharts – AND State
Flat structure: equivalent FSM ! n*m states

18
StateCharts – external transition variants to AND States
Entry of top state (e.g. caused by event “n”) activates all parallel automata
Leaving of sub-state (e.g. caused by “h (inS)”) deactivates the top state A
A

20
StateCharts – Action on Entry and/or Exit

21
StateCharts – Synchrony Hypothesis

22
StateCharts – Synchrony Problem

23
StateCharts – Microsteps

24
StateCharts – Example

25
StateCharts – AND Decomposition <> Composition
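[Figure: AND state U with the orthogonal components S and T, transitions labeled e and k, and further labels Q and R; the equivalent flat FSM has the product states (V,W), (V,Y), (V,Z), (X,W), (X,Y), (X,Z)]
To be in state U the system must be both in states S and T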

26
StateCharts – Summary

27
Asynchronous Communication
Blocking vs. non-Blocking
 blocking read (receiver waits for sender)
 reading process cannot test for emptiness of input
 must wait for input to arrive before proceeding
 blocking write (sender waits for receiver)
 writing process must wait for a successful write before continuing
Languages
 blocking write/blocking read (CSP, CCS)
 non-blocking write/blocking read (FIFO, CFSMs, SDL)
 non-blocking write/non-blocking read (shared variables)
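The combinations can be tried out with Python's standard `queue.Queue`. This single-threaded sketch is an illustration only; mapping an unbounded queue to the "non-blocking write/blocking read" style and the helper name `demo` are assumptions of this example, not statements about SDL or CSP implementations.

```python
import queue


def demo():
    # Unbounded FIFO: a write never blocks, a read blocks while the queue is empty.
    fifo = queue.Queue()              # maxsize=0 means unbounded
    fifo.put_nowait("msg1")           # non-blocking write always succeeds here
    first = fifo.get()                # blocking read; returns at once, a token is present

    # Non-blocking read on an empty queue: the reader can detect emptiness.
    try:
        fifo.get_nowait()
        saw_empty = False
    except queue.Empty:
        saw_empty = True

    # Bounded FIFO: a non-blocking write fails once the buffer is full.
    bounded = queue.Queue(maxsize=1)
    bounded.put_nowait("a")
    try:
        bounded.put_nowait("b")       # a blocking put() would wait for the reader instead
        saw_full = False
    except queue.Full:
        saw_full = True

    return first, saw_empty, saw_full


result = demo()
```

With a bounded buffer the designer has to choose between blocking the sender (`put`) and handling the overflow (`put_nowait` with `queue.Full`), which corresponds to the lossless vs. lossy distinction.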

28
Asynchronous Communication – Buffering
 Buffers are used to adapt when sender and receiver have different rates
 size of buffer?
 Lossless vs. lossy
 events/tokens may be lost
 bounded memory: overflow or overwriting
 need to block the sender
 Single vs. multiple read
 result of each write can be read at most once or several times
 Pure FIFO
 prioritized events
 out of order access to FIFO

29
Communication Mechanisms
 Rendez-Vous (CSP)
 No space is allocated for shared data, processes need to synchronize in
some specific points to exchange data
 Read and write occur simultaneously
 Shared memory
 Multiple non-destructive reads are possible
 Writes delete previously stored data
 Buffered (FIFO)
 Bounded (ECFSMs, CFSMs)
 Unbounded (SDL, ACFSMs, Kahn Process Networks, Petri Nets)

30
Communication Models
Model              Senders   Receivers  Buffer size  Blocking reads  Blocking writes  Single reads
Unsynchronized     many      many       one          no              no               no
Read-modify-write  many      many       one          yes             yes              no
Unbounded FIFO     one/many  one        unbounded    yes             no               yes
Bounded FIFO       one/many  one        bounded      yes             may be           yes
Rendezvous         one       one        one          yes             yes              yes
Blocking reads: reader is blocked (e.g. if buffer is empty)
Blocking writes: writer is blocked (e.g. if buffer is full)
Single reads: data may be read once only

31
Petri Nets (PNs)
 Model introduced by C.A. Petri in 1962
 Ph.D. Thesis: “Communication with Automata”
 Applications: distributed computing, manufacturing, control, communication
networks, transportation, …
 PNs describe explicitly and graphically:
 sequencing/causality
 conflict/non-deterministic choice
 concurrency
 Asynchronous model (partial ordering)
 Main drawback: no hierarchy

32
Petri Net
 A PN (N,M0) is a Petri Net Graph N
 places: represent distributed state by holding tokens
 marking (state) M is an n-vector (m1,m2,m3…), where mi is the non-negative
number of tokens in place pi.
 initial marking (M0) is initial state
 transitions: represent actions/events
 enabled transition: enough tokens in predecessors
 firing transition: modifies marking
 … and an initial marking M0
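The enabling and firing rule can be written down in a few lines of Python. This is a minimal sketch: the net structure (`pre`, `post`), the initial marking and the function names are hypothetical and are not a transcription of the net drawn on this slide.

```python
def enabled(marking, pre, transition):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking[place] >= weight
               for place, weight in pre[transition].items())


def fire(marking, pre, post, transition):
    """Firing removes tokens from the input places and adds tokens to the output places."""
    if not enabled(marking, pre, transition):
        raise ValueError(f"transition {transition} is not enabled")
    new_marking = dict(marking)
    for place, weight in pre[transition].items():
        new_marking[place] -= weight
    for place, weight in post[transition].items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking


# Hypothetical net: t1 moves a token from p1 to p2; t2 needs tokens in p2 and p3.
pre = {"t1": {"p1": 1}, "t2": {"p2": 1, "p3": 1}}
post = {"t1": {"p2": 1}, "t2": {"p4": 1}}
m0 = {"p1": 1, "p2": 0, "p3": 1, "p4": 0}    # initial marking M0

m1 = fire(m0, pre, post, "t1")
m2 = fire(m1, pre, post, "t2")
```

In the initial marking only t1 is enabled; t2 becomes enabled after t1 has fired, which illustrates causality (sequencing) between the two transitions.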
[Figure: Petri net graph with places p1 to p4 and transitions t1 to t3]

33
Concurrency, causality, choice
[Figure: Petri net fragments with transitions t1 to t6 illustrating concurrency, causality/sequencing, and choice/conflict]

34
Communication Protocol
[Figure: Petri net of a communication protocol between Process 1 and Process 2 with transitions labeled Send msg, Receive Ack and Send Ack]

35
Producer-Consumer Problem
[Figure: Petri net with the transitions Produce and Consume connected through a Buffer place]

37
Summary: Control Flow Description
Properties           Specification Language
Nondeterminism     ⇒ NDFSM
Parallel automata  ⇒ State Charts, Petri Nets
Processes          ⇒ SDL
Communication      ⇒ MSC
Hierarchy          ⇒ State Charts
Graphical support  ⇒ All
Semantics          ⇒ Different ;-(

40
DFG
Semantics (informal)
 actors perform computation (often stateless)
 firing of actors when all needed inputs are available
 unbounded FIFOs for unidirectional exchange of data between actors
(integer, floats, arrays, etc.)
 extensions to model decisions
Example: FIR (finite impulse response) filter
 single input sequence i(n)
 single output sequence o(n)
 o(n) = c1 * i(n) + c2 * i(n-1)
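The filter equation o(n) = c1 * i(n) + c2 * i(n-1) can be executed as a data flow graph with FIFO arcs. The following Python sketch is an illustration under assumptions: the function name `fir_dataflow`, the use of `deque` objects as arcs and the initial token value 0 for i(-1) are choices made for this example.

```python
from collections import deque


def fir_dataflow(samples, c1, c2, initial=0):
    """Two multiplier actors and one adder actor connected by FIFO arcs."""
    direct = deque()               # arc carrying i(n)
    delayed = deque([initial])     # arc carrying i(n-1); starts with the token i(-1)
    outputs = []
    for sample in samples:
        # fork: every input token is copied onto both arcs
        direct.append(sample)
        delayed.append(sample)
        # each actor fires once a token is available on all of its input arcs
        product_1 = c1 * direct.popleft()
        product_2 = c2 * delayed.popleft()
        outputs.append(product_1 + product_2)
    return outputs


filtered = fir_dataflow([1, 2, 3], c1=2, c2=3)
```

The initial token on the delayed arc is what turns the second input of the adder into i(n-1); without it the two multiplier inputs would always carry the same sample.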
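[Figure: DFG of the FIR filter with input i, two multiplier actors (c1, c2), an adder actor producing o, and the initial token i(-1)]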

41
DFG – Example

42
Control Flow Graph (CFG)
 also called flow chart (abstract description of program designs)
 focus on control aspect of a system
 set of nodes and arcs
 trigger of an activity (node) when a particular preceding activity is completed
 different triggers for transitions
 suitable for well defined tasks that do not depend on external events
 imposes a complete order on the execution of activities
=> close to implementation (on conventional computer architecture)
 various variants with various levels of details
 simple operator level (addition, multiplication, etc)
 abstract function/procedure level

43
CFG – Example (detailed level)

45
CDFG – Entity
Legend:
data dependencies
control dependencies
Notes: AND dependencies at NOPs (NO Operation), OR dependencies at
BRanches and LOOPs

46
CDFG – Branch
Notes:
• data dependencies are not fully specified
• x = a – b may execute in parallel to IF statement
• computation of p and q within IF statement may execute in parallel

47
CDFG – Loop

48
CDFG – Call

51
References
Embedded Systems. Prentice Hall, 1994. (chapters 2 and 3)
 http://www.sei.cmu.edu/publications/documents/02.reports/02tn001.html

2
Parallel Finite State Machines - Result of Decomposition

3
Parallel Finite State Machines Example

4
Single FSM

5
FSM- Decomposition / Composition

6
Parallel Finite State Machines - Properties
 Concurrency
 Delay
 Synchronization
 Rendezvous
 Mutual exclusion
 Blocking
 Prioritization

7
Finite State Machines - Stability
 Stable {X2,X1,X0}
 {X0}
{X3}
 {X1}
 In = X1= Out

8
 Unstable {X2}
 {X0} {X1,X0}
 {X1} {X3}
 In = X1= Out

9
 Conditionally stable
stable for X0
 In = X0 Out ={X0,X2,X3}
 In = X1 Out = X1
(unstable for X1)

10
 Example

11
 Example

12
 Example

13
 Example

14
 Unstable States

15
 Stable States (abstraction)

1
Part 9
Time and Performance Evaluation

3
Evaluation of Temporal and Performance Aspects
 Problem Statement
 Performance Modeling
 Performance Evaluation

4
Phases: Analysis, Design, Implementation, Integration
Example: IP Office Firewall
 identify performance requirements
 identify traffic model
 identify costly or contradicting requirements
 replace costly functions by cheaper functions
 evaluate design alternatives
 identify performance critical components
 adapt design to meet performance requirements
 measurement-based evaluation
Typical performance-related questions in design phase:
 System architecture?
 HW architecture, need for special HW?
 Which chip sets/processor?
 Which peripherals?
 Programming approach, language?
 Which operating system?
 …?

5
Accuracy and Effort
 look at worst case (maximum load) or average (be aware of nonlinear behavior)
 watch level of detail!
 different kinds of performance requirements
 throughput/utilization of resources -> cheap and accurate performance evaluation
 response time -> less accurate and highly expensive evaluation
 optimistic/pessimistic evaluation (use of bounds)
note: estimation of bounds is much easier than exact values
3 cases (optimistic and pessimistic response-time estimates compared with the performance requirement):
 right on track
 details/special care needed
 there is a problem!

6
Tasks of Performance Evaluation - Summary
(1) Identify the goals of the performance evaluation
(2) Study the details of the object under investigation
(3) Decide on the modeling approach
(4) Build the performance model
(5) Derive quantitative data (as input) for the performance model
(6) Transform the performance model to an executable or assessable model
(7) Evaluate the performance model
(8) Verify the performance results against the performance requirements
Some advice:
 do simple things first!
 abstract, abstract, abstract!
(back-of-the-envelope analysis is preferable over complex behavioral model)
 don't skip or defer performance evaluation!

7
Tasks (1): Identify the Goals of the Performance Evaluation
 What is the purpose of the performance evaluation?
 evaluate possible solutions to a decision problem
 identify the parts of the system critical to performance
=> avoid spending time on evaluating things you already know
 What are the performance metrics to be estimated (response time or system
capacity)?
=> impact on modeling and evaluation methods
 What is the required accuracy of the evaluation?
=> impact on modeling and evaluation methods
 What kind of performance evaluation is performed?
=> best/worst case evaluation, average case evaluation

8
Tasks (2-5): Performance Modelling
(2) Study the details of the object under investigation
Workload:
 identification of the service requests issued to the system
Available resources:
 analysis of the execution environment
System:
 analysis of the static structure as well as dynamic aspects of the system
Mapping:
 identification of the resources used by specific service requests
(3) Decide on the modeling approach
 select appropriate performance evaluation technique (CFG, DFG, FSM,
sequence diagrams, queuing model, ...)
(4) Build the performance model
 carefully select the level of abstraction
(5) Derive quantitative data for the performance model
 derive execution times, available resources, traffic model, ...
 measurement, emulation, code analysis, empirical estimation

9
Tasks (6-8): Performance Evaluation
(6) Transform the performance model to an executable or assessable model
 take into account the limits of the selected performance evaluation
technique
(7) Evaluate the performance model
 tool support for simulation, queuing analysis, graph analysis, ...
 ensure the model is a valid model of the system
validation of results (confidence intervals, seeds for simulation, rare
event problems)
(8) Verify the performance results against the performance requirements
 check if real-time, response time, capacity requirements are met

10
Performance Model – The Ingredients
Application
 information about the application
 typically some behavioral model attributed with temporal information on
execution times
 e.g. PN, DFG, CFG, FSM
Resources
 typically some structural model attributed with capacity information of
resources (MIPS, FLOPS, ...)
Mapping (spatial assignment)
 information describing how the entities of the application are assigned to the
resources (which function is assigned to which processor or other HW entity)
Runtime system (temporal assignment)
 information on dynamic aspects, e.g. scheduling algorithms
System stimuli (traffic model)
 the characteristics of the input to the system that triggers an execution
 types and temporal characteristics of the input events

11
Example
Application (DFG or CFG): task graph with nodes 1 to 6 [figure]
Resources: processors P1 and P2
Mapping (spatial assignment): shown in the figure
Stimuli:
 number of arrivals (stimuli) for task 1 (source node) per second (deterministic or probabilistic)
Schedule for processor P1:
 1, 2, 3, 4 (no impact with given mapping)
 schedule may be static or dynamic (priorities)

12
Performance Evaluation – Summary of Methods
Methods
 process graph analysis (structural model)
 task graph analysis (behavioral model)
 schedulability analysis (real-time analysis)
 Markov chain analysis
 queuing network analysis
 operational analysis
 discrete-event simulation
In order to select the right method it is important to understand
the strengths and limits of each method!

13
Process Graph Analysis
Process graph
 structural model of the application
 nodes represent functional entities (module, function, procedure, operation, etc)
 edges represent communication relations
 precedence relations are neglected
Process graph analysis
 limited to the analysis of the load imposed on the resources of the system
 assumption that contention on resources does not have a negative impact on the
load of the system
 resources may be physical (e.g. processor, HW entity, communication link) or
logical (e.g. critical sections)
Example of simple analysis:
load(r) = \sum_{p \in A_r} load(p, r)
where
 p denotes some process,
 A_r specifies the set of processes assigned to resource r, and
 load(p, r) denotes the exact resource demand resulting from the assignment of process p to resource r
Note: the formula may be equally applied to compute the load of a communication link
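The load formula can be sketched in a few lines of Python. This is a minimal illustration only: the process names, resource names and demand values are hypothetical and not taken from the slides.

```python
from collections import defaultdict

def resource_loads(demand, assignment):
    """Return load(r) = sum of load(p, r) over all processes p assigned to r."""
    loads = defaultdict(float)
    for process, resource in assignment.items():
        loads[resource] += demand[(process, resource)]
    return dict(loads)

# Hypothetical demands, expressed as a fraction of the resource capacity.
demand = {("p1", "cpu0"): 0.30, ("p2", "cpu0"): 0.25, ("p3", "cpu1"): 0.50}
assignment = {"p1": "cpu0", "p2": "cpu0", "p3": "cpu1"}

loads = resource_loads(demand, assignment)
# A resource with a load above 1.0 cannot handle the work assigned to it.
overloaded = [r for r, load in loads.items() if load > 1.0]
```

The same function applies to a communication link if the "processes" are the communications mapped to that link.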

14
Example: HW/SW partitioning
 assign the tasks to the SW and the HW entity such that
 the maximum of the processing times of SW and HW is minimal and
 the communication cost is minimal
 node labels such as 20/2 denote the cost (load) of processing the process in SW or HW, respectively
 edge labels such as 9 denote the communication cost (load) if the communicating partners are assigned to different entities; otherwise the cost is zero
[Figure: process graph with processes 1 to 6 partitioned between SW and HW; node labels (SW/HW cost) 5/3, 15/9, 10/1, 8/2, 20/2, 20/3; edge labels (communication cost) 1, 10, 8, 6, 9, 3]
Cost function (example):
cost(partitioning) = w_p \max_{r \in R} \{ \sum_{p \in A_r} load(p, r) \} + w_c \sum_{c \in A_c} load(c)
where
 w_p and w_c denote the weight (importance) of the processing cost and the communication cost, respectively
 R denotes the set of processing resources
 A_r specifies the set of processes p assigned to processing resource r
 A_c specifies the set of non-internal communications between two entities
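As a rough sketch, the cost function can be evaluated for a candidate partitioning as follows. The three-process graph and its cost values are hypothetical (the assignment of labels to nodes in the figure is not recoverable from the text), and the function name is an assumption made for illustration.

```python
def partition_cost(assignment, proc_cost, comm_cost, w_p=1.0, w_c=1.0):
    """w_p * max over resources of the summed processing load
    + w_c * sum of the communication loads that cross the partition."""
    load = {}
    for process, resource in assignment.items():
        load[resource] = load.get(resource, 0) + proc_cost[process][resource]
    # Only communications between different entities (the set A_c) are counted.
    cut = sum(cost for (a, b), cost in comm_cost.items()
              if assignment[a] != assignment[b])
    return w_p * max(load.values()) + w_c * cut

# Hypothetical example: processing cost per process in SW and HW, edge costs.
proc_cost = {1: {"SW": 20, "HW": 2}, 2: {"SW": 5, "HW": 3}, 3: {"SW": 10, "HW": 1}}
comm_cost = {(1, 2): 9, (2, 3): 6}

all_sw = partition_cost({1: "SW", 2: "SW", 3: "SW"}, proc_cost, comm_cost)
mixed = partition_cost({1: "HW", 2: "SW", 3: "SW"}, proc_cost, comm_cost)
```

With these illustrative numbers, moving process 1 to HW lowers the processing term but adds the communication cost of the edge that now crosses the HW/SW boundary.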

15
Typical questions to be answered:
 can the load be handled by the available resources (processors, links)?
 is the load balanced?
Method is often used in industry to estimate the load imposed on a system and
to estimate the system capacity
Application to communication systems design: estimate the communication
bandwidth of the system (e.g. of internal memory bus or system bus)
Tooling: a spreadsheet (e.g. Excel) is sufficient
Discussion:
 simple, fast and efficient
 application to best-case analysis
=> load/capacity is a central constraint that has to be met by the system,
otherwise detailed studies are useless!

16
Task Graph Analysis
Task graph
 simple behavior model of the application
 nodes represent functional entities (module, function, procedure, operation, etc)
 arcs represent precedence constraints
Task graph analysis
 analysis of the critical path from source to sink (graph theory)
 focus on response time of the system
 assumption that contention on resources does not have a negative impact on the
load of the system (e.g. no context-switch times - similar to process analysis)
 resources may be physical (e.g. processor, HW entity, communication link) or
logical (e.g. critical sections)
Discussion:
 simple and efficient for deterministic (constant) execution times
(complex for other distributions)
 application to best-case analysis
(neglect contention on resources – scheduling)
 wide application to optimization techniques

17
Task Graph Analysis – Examples
Analysis:
 find the longest (critical) path and compute its length (apply recursive scheme to compute subpaths)
Example 1: DFG (or CFG with parallel entities)
[Figure: task graph with nodes 1 to 6; execution times 2, 4, 3, 1, 4, 1 ms]
Result: critical path T = 8 ms (best case, i.e. no contention)
Example 2: communication system
[Figure: protocol stack (2 MI) on a 100 MIPS processor, medium (1 Mb) on a 30 Mbps physical network, protocol stack (2.8 MI) on a 120 MIPS processor]
Delays: 20 ms, 33 ms, 23 ms
Result: critical path T = 76 ms (best case, i.e. deterministic times, no contention, no queuing)
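The recursive scheme for the critical path can be sketched as below. The execution times are those listed for Example 1; the edge set is an assumption, chosen so that it reproduces the stated critical path of 8 ms, because the figure itself is not available in the text.

```python
from functools import lru_cache

# Execution times in ms as listed for Example 1.
exec_time = {1: 2, 2: 4, 3: 3, 4: 1, 5: 4, 6: 1}
# ASSUMED precedence edges (not recoverable from the slide text).
succ = {1: [2, 3, 4], 2: [6], 3: [6], 4: [5], 5: [6], 6: []}

@lru_cache(maxsize=None)
def longest_path_from(node):
    """Length of the longest path from node to a sink, computed recursively."""
    return exec_time[node] + max((longest_path_from(s) for s in succ[node]), default=0)

critical_path_ms = longest_path_from(1)

# Example 2 is a chain, so its critical path is the sum of the three delays:
# 2 MI / 100 MIPS + 1 Mb / 30 Mbps + 2.8 MI / 120 MIPS.
chain_ms = 1000 * (2 / 100 + 1 / 30 + 2.8 / 120)
```

Under these assumptions `critical_path_ms` is 8; `chain_ms` is a little under 77 ms, and the slide's 76 ms results from adding the individually rounded delays 20 + 33 + 23.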

18
Schedulability Analysis
Analysis techniques to check whether a system can meet its deadlines
Model:
 fixed set of processes
 single processor/resource
 all processes are periodic, with known periods
 processes are completely independent of each other
 process deadlines are equal to the process periods
 all system overheads are ignored
 all processes have a fixed worst-case execution time
 priority-based preemptive scheduling of processes (a runnable higher-priority process immediately interrupts a lower-priority process)
Schedulability analysis:
 check if the deadlines can be met under all circumstances
 different schemes available depending on the specific model

19
Rate Monotonic Priority Assignment
Idea: assign priorities to processes according to their period T (and deadline D)
(process with shortest period is assigned the highest priority (5))
Rate monotonic (RM) scheduling is optimal in the sense that
 if a process set can be scheduled (using preemptive priority-based
scheduling) with a fixed priority assignment scheme,
 then the same process set can also be scheduled with an RM assignment
scheme
Example for RM scheduling:
process   period T   priority P
A         25         5
B         60         3
C         42         4
D         105        1
E         75         2
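The priority assignment in the table can be reproduced with a short sketch. The function name is hypothetical; the periods are those of the example, and a larger number means a higher priority, as on the slide.

```python
def rate_monotonic_priorities(periods):
    """Assign priorities 1..N so that the shortest period gets the highest number."""
    ordered = sorted(periods, key=periods.get, reverse=True)  # longest period first
    return {name: rank for rank, name in enumerate(ordered, start=1)}

periods = {"A": 25, "B": 60, "C": 42, "D": 105, "E": 75}
priorities = rate_monotonic_priorities(periods)
```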

20
Schedulability Analysis – Utilization-based
Idea: derive sufficient condition for schedulability based on the analysis of the
resource utilization
Assumption: RM scheduling
Sufficient condition for schedulability (though not a necessary condition):
Utilization U = \sum_{i=1}^{N} U_i = \sum_{i=1}^{N} C_i / T_i < N (2^{1/N} - 1)
where
 i identifies the process,
 U_i specifies the resource utilization due to process i,
 C_i specifies the computation time of process i,
 T_i specifies its period, and
 N defines the number of processes
Utilization bounds (U): N=1 => U=1; N=2 => U=0.828; N=10 => U=0.718;
for N -> infinity: U -> 0.69
Discussion:
 simple test for simple models (deadline D_i = T_i, etc.)
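The bound and the test are easy to evaluate numerically. The following sketch uses hypothetical function names; the process set in the last line is the three-process example (periods 3, 4, 6 ms and computation times 1, 1, 2 ms) used in these slides.

```python
import math

def rm_bound(n):
    """Utilization bound N * (2^(1/N) - 1) for N processes."""
    return n * (2 ** (1 / n) - 1)

def passes_utilization_test(tasks):
    """tasks: list of (C, T) pairs. True means schedulable under RM;
    False only means that this sufficient test is inconclusive."""
    utilization = sum(c / t for c, t in tasks)
    return utilization < rm_bound(len(tasks))

bounds = {n: round(rm_bound(n), 3) for n in (1, 2, 10)}
limit = math.log(2)  # value approached by the bound for large N, about 0.69

inconclusive_example = passes_utilization_test([(1, 3), (1, 4), (2, 6)])
```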

21
Schedulability Analysis – Example
RM scheduling => Utilization-based analysis:
process   period T   comp. time C   priority P
0         3          1              3
1         4          1              2
2         6          2              1
all times in ms
U = \sum_{i=1}^{N} U_i = \sum_{i=1}^{N} C_i / T_i < N (2^{1/N} - 1)
U = 1/3 + 1/4 + 2/6 = 11/12
11/12 ≈ 0.92 > 0.78 = 3 (2^{1/3} - 1)
=> the sufficient condition is not satisfied, so the test is inconclusive

22
Schedulability Analysis – Response-time Analysis
Idea:
 predict the worst-case response time of each process and compare with the
deadline to determine the feasibility of the schedule
Assumption: any priority assignment (not only RM)
Outline of approach:
 response time R_i of process i is R_i = C_i + I_i, where I_i is the maximum interference of process i from higher-priority processes
 the interference depends on the number of releases of the interfering processes and their computation time, i.e. the interference I_{i,j} of the higher-priority process j on process i is
I_{i,j} = \lceil R_i / T_j \rceil C_j
 application of fixed-point iteration method to solve the equations (start with R_{i,0} = C_i; terminate when R_{i,n+1} = R_{i,n})
R_{i,n+1} = C_i + \sum_{j<i} \lceil R_{i,n} / T_j \rceil C_j
where j < i ranges over the processes with higher priority than process i
For details and variants see Burns&Wellings or Krishna&Shin
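The fixed-point iteration can be sketched as follows. The function name is an assumption; processes are listed from highest to lowest priority, the number of releases is rounded up to a whole number, and the iteration also stops once the deadline (equal to the period) is exceeded.

```python
import math

def response_times(tasks):
    """tasks: list of (C, T), highest priority first. Returns the worst-case
    response time of each process (or the first value exceeding its period)."""
    results = []
    for i, (c_i, t_i) in enumerate(tasks):
        r = c_i  # start value R_{i,0} = C_i
        while True:
            # Interference from every higher-priority process j.
            r_next = c_i + sum(math.ceil(r / t_j) * c_j for c_j, t_j in tasks[:i])
            if r_next == r or r_next > t_i:
                break
            r = r_next
        results.append(r_next)
    return results

# Process set of the example slides: (C, T) in ms for P0, P1, P2.
example = response_times([(1, 3), (1, 4), (2, 6)])
```

For this process set every response time stays within its period, so the set is schedulable even though the utilization-based test cannot confirm it.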

23
Schedulability Analysis – Example
Response-time analysis:
process   period T   comp. time C   priority P
0         3          1              3
1         4          1              2
2         6          2              1
all times in ms
Response time R_0 of process P0:
R_0 = C_0 + I_0 = C_0 + 0 = 1 ms
Response time R_1 of process P1 (recursive):
R_1 = C_1 = 1 ms (without interrupt)
R_1 = C_1 + I_{1,0} = C_1 + \lceil R'_1 / T_0 \rceil C_0 = 1 + \lceil 1/3 \rceil * 1 = 2 ms
R_1 = C_1 + I_{1,0} = C_1 + \lceil R'_1 / T_0 \rceil C_0 = 1 + \lceil 2/3 \rceil * 1 = 2 ms
Response time R_2 of process P2 (recursive):
R_2 = C_2 = 2 ms
R_2 = C_2 + I_{2,0} + I_{2,1} = C_2 + \lceil R'_2 / T_0 \rceil C_0 + \lceil R'_2 / T_1 \rceil C_1 = 2 + \lceil 2/3 \rceil * 1 + \lceil 2/4 \rceil * 1 = 4 ms
R_2 = C_2 + I_{2,0} + I_{2,1} = C_2 + \lceil R'_2 / T_0 \rceil C_0 + \lceil R'_2 / T_1 \rceil C_1 = 2 + \lceil 4/3 \rceil * 1 + \lceil 4/4 \rceil * 1 = 5 ms
R_2 = C_2 + I_{2,0} + I_{2,1} = C_2 + \lceil R'_2 / T_0 \rceil C_0 + \lceil R'_2 / T_1 \rceil C_1 = 2 + \lceil 5/3 \rceil * 1 + \lceil 5/4 \rceil * 1 = 6 ms
R_2 = C_2 + I_{2,0} + I_{2,1} = C_2 + \lceil R'_2 / T_0 \rceil C_0 + \lceil R'_2 / T_1 \rceil C_1 = 2 + \lceil 6/3 \rceil * 1 + \lceil 6/4 \rceil * 1 = 6 ms
where R'_i denotes the value of R_i from the previous iteration

24
Markov Chain Analysis
Idea: model the states and the transitions of the system and assign rates to the
transitions (comparable to a FSM with timed transitions)
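Example: single server with a queue holding up to 2 requests
[Figure: Markov chain with states 0 to 3; arrivals (rate λ) lead from state i to state i+1, service completions (rate µ) from state i to state i-1]
λ denotes the arrival rate
µ denotes the service rate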
Sketch of solution technique for steady-state analysis:
 mapping of Markov chain on a set of linear equations defining the state
probabilities
 normalization equation \sum_{i=0}^{n} p_i = 1
 derivation of mean values and distribution functions from state probabilities
Transient analysis is based on set of differential equations
Discussion:
 rather low-level description of the states of the system
 restricted to exponentially distributed transition rates and independence of events
 state explosion problem for realistic systems (beyond millions of states)

25
Markov Chain Analysis – Example
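Example: single server with a queue holding up to 2 requests
[Figure: Markov chain with states 0 to 3; arrivals (rate λ) lead from state i to state i+1, service completions (rate µ) from state i to state i-1]
λ denotes the arrival rate
µ denotes the service rate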
Steady-state analysis:
 mapping of Markov chain on a set of linear equations
state 3: λp2 = µp3
state 2: λp1 + µp3 = λp2 + µp2
state 1: λp0 + µp2 = λp1 + µp1
state 0: µp1 = λp0
normalization equation: p0 + p1 + p2 + p3 = 1
 resolution of system of equations to derive state probabilities
p0 = 1 / (1 + λ/µ + (λ/µ)^2 + (λ/µ)^3)
p1 = (λ/µ) / (1 + λ/µ + (λ/µ)^2 + (λ/µ)^3); p2 = ...
 derivation of mean values and distribution functions from state probabilities
utilization: U = 1 - p0; mean number of jobs in system: N = p1 + 2p2 + 3p3 ;
blocking probability B; distribution functions for waiting time, response time, etc.
steady state implies that
arrivals = completions
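The closed-form state probabilities can be checked numerically with a small sketch. The rates are hypothetical, the function name is an assumption, and the blocking probability is taken here to be the probability of the full state 3.

```python
def state_probabilities(lam, mu, k=3):
    """Steady-state probabilities p0..pk of the chain with states 0..k:
    p_i is proportional to (lam/mu)^i, normalized so that the sum is 1."""
    rho = lam / mu
    weights = [rho ** i for i in range(k + 1)]
    total = sum(weights)
    return [w / total for w in weights]

lam, mu = 25.0, 50.0  # hypothetical arrival and service rates
p = state_probabilities(lam, mu)

utilization = 1 - p[0]
mean_jobs = sum(i * p_i for i, p_i in enumerate(p))  # N = p1 + 2 p2 + 3 p3
blocking = p[3]  # assumption: an arrival is blocked when the system is full
```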

26
Queuing Network Analysis
Idea: solve the system at the level of queuing stations directly as an alternative
to a mapping and solution of the Markov chains
Assumption:
 stations are separable (product-form queuing networks)
 each station can be analysed in separation (exponential input results in
exponential output)
Restrictions:
 limited distributions (exponential and derivatives)
 no synchronizations
 no blocking (infinite queues)
Results:
 mean values for delays, utilization, queue length and population
Discussion:
 efficient solution techniques available for a considerable set of queuing
networks
See R. Jain for details

27
Queuing Network Analysis - Example
Example: communication system (unbounded queues)
[Figure: communication system with three stations: protocol stack (2 MI) on a 100 MIPS processor, medium (1 Mb) on a 30 Mbps physical network, protocol stack (2.8 MI) on a 120 MIPS processor]
arrival rate: λ = 25 packets/s
service rates: µ = 100 MIPS / 2 MI = 50 1/s; µ = 30 Mbps / 1 Mb = 30 1/s; µ = 120 MIPS / 2.8 MI = 43 1/s
Results (assuming exponential service times and arrivals):
 utilization = λ/µ (e.g. physical network util. = 83 %)
 population per station n_i = (λ/µ) / (1 − λ/µ)
 mean total population N = \sum_i n_i = 7.39
 mean response time T = N/λ = 296 ms (Little's law)
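The results of the example can be recomputed directly from the arrival rate and the three service rates. The station names in this sketch are labels chosen for illustration.

```python
arrival_rate = 25.0  # packets per second
service_rates = {
    "processor (100 MIPS)": 50.0,
    "physical network": 30.0,
    "processor (120 MIPS)": 43.0,
}

# Per-station utilization and mean population n_i = rho / (1 - rho).
utilization = {s: arrival_rate / mu for s, mu in service_rates.items()}
population = {s: u / (1 - u) for s, u in utilization.items()}

total_population = sum(population.values())
mean_response_time = total_population / arrival_rate  # Little's law, in seconds
```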

28
Operational Analysis
Idea:
 model the system as a set of stations with queues
 use a small set of simple laws (job flow balance etc.)
 base analysis on operational data (i.e. measurable and testable)
Assumptions:
 job flow balance
 no assumptions on service and arrival time distributions
Analysis of arrival rates, throughput, utilization and mean service times of the
different stations in the system
Discussion:
 simple (unsuitable to parallel execution)
 fast
 answers "what if" questions
 response time figures can be derived only if the population of jobs in the system is known (application of Little's law, which makes no assumptions about distributions)
See Lazowska et al. or Denning & Buzen for details
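Two of the operational laws, the utilization law and Little's law, can be illustrated with measured quantities only. All numbers below are hypothetical measurements, not data from the slides.

```python
# Hypothetical measurements over one observation period.
observation_time = 60.0   # seconds observed
completions = 1500        # jobs completed in that time
busy_time = 36.0          # seconds the station was busy
mean_population = 3.0     # average number of jobs in the system

throughput = completions / observation_time     # X = C / T
service_time = busy_time / completions          # S = B / C
utilization = throughput * service_time         # utilization law: U = X * S
response_time = mean_population / throughput    # Little's law: R = N / X
```

No distribution of arrival or service times enters the calculation; only the job flow balance is assumed.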

29
Discrete-event Simulation
Idea:
 model the relevant events of the system
 process the events according to their temporal order (similar to the real
execution with the exception that simple processing blocks are modeled only)
Example: Execution on a 2-processor system (non-preemptive)
[Figure: task graph with nodes 1 to 6, execution times (green) and process priorities (blue, 1 = low); Gantt chart of the execution on P1 and P2, time axis 0 to 10]
Discussion
 no assumptions
 danger of including too many details
 evaluation is time consuming
 problems with rare events
 large set of tools and models available
30
Measurement-based Evaluation
Steps of measurement-based evaluation
 derive (behavioral) model of the system
 decide on instrumentation points
 instrumentation of the executables to generate traces (add time stamp)
 interfacing to track data (SW or HW monitoring) – minimize the intrusion to
system execution (and thus falsification of results)
 post-execution analysis of the traces
 special care needed in distributed systems -> common notion of time needed
Various measurement tools are available
Gain important insight in the system execution for performance debugging and
development of future systems

31
Comparison of Methods
Analysis Analysis of
multiprocessor
systems
Verification
of Real
Time
requests
Modelling of
parallel
processing (prec.
constraints)
best
case
worst
case
average
Process graph   no 
Task graph  not really ? 
Schedulability
Utiliz.-based  no  no
Resp.-based  no  no
Markov chains   no 
Queuing
networks
  no no
Operational
analysis
  no no
Simulation   no 
Measurements  no 

32
Problems and Limits of Performance Evaluations
 missing data
 uncertainty of available execution data
 data dependencies (if, case, while, ...)
 context dependencies: caching, scheduling, synchronization, blocking
=> worst case execution time (based on code rather than measurements)
 uncertainty of traffic model, i.e. the distributions of the stimuli of the system

33
References
Overview and simple techniques:
 A. Mitschele-Thiel: Systems Engineering with SDL – Developing Performance-Critical
Communication Systems. Wiley, 2001. (section 2.3 & 2.4)
 H.U. Heiss: Prozessorzuteilung in Parallelrechnern. BI-Wissenschaftsverlag, Reihe
Informatik, Band 98, 1994.
 R. Jain: The Art of Computer Systems Performance Analysis – Techniques for
Experimental Design, Measurements, Simulation, and Modeling. Wiley, 1991.
Evaluation of real-time systems (deterministic assumptions):
 A. Burns, A. Wellings: Real-Time Systems and Programming Languages, 2nd edition,
Addison Wesley, 1996.
 G. Buttazzo: Hard Real-Time Computing Systems. Kluwer Academic Publishers. 1997.
 C.M. Krishna, K.G. Shin: Real-time Systems. McGraw-Hill, 1997.
 H. Kopetz: Real-time Systems. Kluwer Academic Publishers, 1997.
Evaluation of queuing systems (exponential and other distributions):
 P.J. Denning, J.P. Buzen: The Operational Analysis of Queueing Network Models. Computing
Surveys, 10(3), Sept. 1978.
 E.D. Lazowska, J. Zahorjan, G.S. Graham, K. Sevcik: Quantitative System Performance:
Computer System Analysis Using Queueing Network Models. Prentice-Hall, 1984.

1
Part 9,
Optimization

10
Heuristic Search
Most heuristics are based on an iterative search comprising the following elements:
 selection of an initial (intermediate) solution (e.g. a sequence)
 evaluation of the quality of the intermediate solution
 check of termination criteria
Flow of the search:
(1) select initial solution
(2) select next solution (based on previous solution), according to the search strategy
(3) evaluate quality
(4) if the acceptance criteria are satisfied, accept the solution as "best solution so far"
(5) if the termination criteria are satisfied, stop; otherwise continue with (2)

11
Hill-Climbing – Discussion
 simple
 local optimization only: the algorithm is not able to pass a valley to finally reach a higher peak
 the idea is applicable only to small parts of optimization algorithms and needs to be complemented with other strategies to overcome local optima
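The limitation can be seen in a minimal hill-climbing sketch. The one-dimensional landscape, with a local peak at position 2 and a higher peak at position 8, is a hypothetical example.

```python
def hill_climb(start, neighbors, quality):
    """Move to the best neighbor as long as it improves the quality."""
    current = start
    while True:
        best = max(neighbors(current), key=quality, default=current)
        if quality(best) <= quality(current):
            return current  # no neighbor is better: local optimum reached
        current = best

# Hypothetical landscape: index = solution, value = quality.
landscape = [0, 1, 3, 2, 1, 2, 4, 6, 9, 5]
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n < len(landscape)]
quality = lambda x: landscape[x]

from_left = hill_climb(0, neighbors, quality)    # gets stuck on the local peak
from_middle = hill_climb(5, neighbors, quality)  # reaches the higher peak
```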

12
Random Search
also called Monte Carlo algorithm
Idea:
 random selection of the candidates for a change of intermediate solutions or
 random selection of the solutions (no use of neighborhood)
Discussion:
 simple (no neighborhood relation is needed)
 not time efficient, especially where the time to evaluate solutions is high
 sometimes used as a reference algorithm to evaluate and compare the
quality of heuristic optimization algorithms
 idea of randomization is applied to other techniques, e.g. genetic algorithms
and simulated annealing

13
Simulated Annealing
Idea:
 simulate the annealing process of material: the slow cooling of material leads to
a state with minimal energy, i.e. the global optimum
Classification:
 Search strategy
 random local search
 Acceptance criteria
 unconditional acceptance of the selected solution if it represents an
improvement over previous solutions
 otherwise probabilistic acceptance
 Termination criteria
 static bound on the number of iterations (cooling process)
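The classification above maps onto a compact sketch. The geometric cooling schedule, the parameter values and the toy cost function are assumptions made for illustration; the slides do not prescribe them.

```python
import math
import random

def simulated_annealing(start, neighbor, cost, t_start=10.0, alpha=0.95,
                        iterations=2000, seed=0):
    rng = random.Random(seed)
    current = best = start
    temperature = t_start
    for _ in range(iterations):                 # static bound on iterations
        candidate = neighbor(current, rng)      # random local search
        delta = cost(candidate) - cost(current)
        # Improvements are always accepted, deteriorations with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        if cost(current) < cost(best):
            best = current
        temperature = max(temperature * alpha, 1e-9)  # assumed geometric cooling
    return best

# Hypothetical toy problem: minimize (x - 7)^2 over the integers.
cost = lambda x: (x - 7) ** 2
neighbor = lambda x, rng: x + rng.choice((-1, 1))
result = simulated_annealing(0, neighbor, cost)
```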

14
Simulated Annealing – Discussion and Variants
Discussion:
 the parameter setting for the cooling process is essential (but complicated)
 slow decrease results in long run times
 fast decrease results in poor solutions
 discussion whether temperature decrease should be linear or logarithmic
 straightforward to implement
Variants:
 deterministic acceptance
 nonlinear cooling (slow cooling in the middle of the process)
 adaptive cooling based on accepted solutions at a temperature
 reheating

15
Genetic Algorithms – Basic Operations
crossover:
parents: 1 1 0 0 1 0 1 0 1 1 and 0 1 0 1 0 0 1 0 0 1
child:   1 1 0 0 0 0 1 0 0 1
mutation:
1 1 0 0 0 0 1 0 0 1 -> 1 1 0 0 0 1 1 0 0 1
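The two operations on the bit vectors above can be written as a small sketch. The function names, the crossover point (after the fourth bit) and the mutated position are assumptions that are consistent with the vectors shown.

```python
def crossover(parent_a, parent_b, point):
    """One-point crossover: head of parent_a, tail of parent_b."""
    return parent_a[:point] + parent_b[point:]

def mutate(individual, index):
    """Flip the bit at the given position."""
    flipped = 1 - individual[index]
    return individual[:index] + [flipped] + individual[index + 1:]

parent_a = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
parent_b = [0, 1, 0, 1, 0, 0, 1, 0, 0, 1]

child = crossover(parent_a, parent_b, point=4)
mutant = mutate(child, index=5)
```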

16
Genetic Algorithms – Basic Algorithm
General parameters:
 size of population
 mutation probability
 candidate selection strategy (mapping quality on probability)
 replacement strategy (replace own parents, replace weakest, influence of
probability)
Application-specific parameters:
 mapping of problem on appropriate coding
 handling of invalid solutions in codings
[Figure: cycle of population, selection, crossover, mutation and replacement]
Replacement and selection rely on some cost function defining the quality of each solution
Different replacement strategies, e.g. "survival of the fittest"
Crossover selection is typically random

17
Genetic Algorithms – Minimum Spanning Tree
 small population results in inbreeding
 larger population works well with small mutation rate
 tradeoff between size of population and number of iterations

18
Genetic Algorithms – Basic Operations
 Mutation
 => creating a new member of the population by changing one member

19
Genetic Algorithms – Basic Operations
 Crossover
 => creating a new member of the population from two members

20
Genetic Algorithms – Traveling Salesman Problem
 minimal impact of mutation rate with small population
 negative impact of high mutation rate with larger population (increased randomness) – impact not quite clear

21
Genetic Algorithms – Discussion
 finding an appropriate coding for the binary vectors for the specific
application at hand is not intuitive
problems are
 redundant codings,
 codings that do not represent a valid solution, e.g. coding for a
sequencing problem
 tuning of genetic algorithms may be time consuming
 parameter settings highly depend on problem specifics
 suited for parallelization of optimization

22
Tabu Search
Idea:
 extension of hill-climbing to avoid being trapped in local optima
 allow intermediate solutions with lower quality
 maintain history to avoid running in cycles
Classification:
 Search strategy
 deterministic local search
 Acceptance criteria
 acceptance of best solution in neighborhood which is not tabu
 Termination criteria
 static bound on number of iterations or
 dynamic, e.g. based on quality improvements of solutions

23
Tabu Search – Algorithm
The brain of the algorithm is the tabu list that stores and maintains information about the history of the search.
In the simplest case a number of previous solutions are stored in the tabu list.
More advanced techniques maintain attributes of the solutions rather than the solutions themselves.
Flow of the algorithm:
(1) select initial solution
(2) select neighborhood set (based on current solution)
(3) remove tabu solutions from set
(4) if the set is empty, increase the neighborhood and continue with (2)
(5) evaluate quality and select best solution from set
(6) update tabu list
(7) if the termination criteria are satisfied, stop; otherwise continue with (2)
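A minimal sketch of the simplest variant, in which the tabu list stores the most recently visited solutions. The landscape is a hypothetical example with a local peak at position 2 and a higher peak at position 8; where the flow above enlarges the neighborhood, this sketch simply stops.

```python
from collections import deque

def tabu_search(start, neighbors, quality, tabu_size=3, iterations=20):
    current = best = start
    tabu = deque([start], maxlen=tabu_size)  # recency-based history
    for _ in range(iterations):              # static termination criterion
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break  # simplification: no enlargement of the neighborhood
        # Best non-tabu neighbor, even if it is worse than the current solution.
        current = max(candidates, key=quality)
        tabu.append(current)
        if quality(current) > quality(best):
            best = current
    return best

# Hypothetical landscape: index = solution, value = quality.
landscape = [0, 1, 3, 2, 1, 2, 4, 6, 9, 5]
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n < len(landscape)]
quality = lambda x: landscape[x]

best_found = tabu_search(0, neighbors, quality)
```

Because worse neighbors are accepted and recently visited solutions are excluded, the search passes the valley behind the local peak instead of stopping there.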

24
Tabu Search – Organisation of the History
The history is maintained by the tabu list
Attributes of solutions are a very flexible means to control the search
Example of attributes of a HW/SW partitioning problem with 8 tasks, each assigned to 1 of 4
different HW entities:
(A1) change of the value of a task assignment variable
(A2) move to HW
(A3) move to SW
(A4) combined change of some attributes
(A5) improvement of the quality of two subsequent solutions over or below
a threshold value
Aspiration criteria: Under certain conditions tabus may be ignored, e.g. if
 a tabu solution is the best solution found so far
 all solutions in a neighborhood are tabu
 a tabu solution is better than the solution that triggered the respective tabu conditions
Intensification checks whether good solutions share some common properties
Diversification searches for solutions that do not share common properties
Update of history information may be recency-based or frequency-based (i.e. depending on
the frequency with which the attribute has been activated)

25
Tabu Search – Discussion
 easy to implement (at least the neighborhood search as such)
 non-trivial tuning of parameters
 tuning is crucial to avoid cyclic search
 advantage of use of knowledge, i.e. feedback from the search (evaluation of
solutions) to control the search (e.g. for the controlled removal of
bottlenecks)

26
Heuristic Search Methods – Classification
Search strategy
 search area
 global search (potentially all solutions considered)
 local search (direct neighbors only – stepwise optimization)
 selection strategy
 deterministic selection, i.e. according to some deterministic rules
 random selection from the set of possible solutions
 probabilistic selection, i.e. based on some probabilistic function
 history dependence, i.e. the degree to which the selection of the new
candidate solution depends on the history of the search
 no dependence
 one-step dependence
 multi-step dependence
Acceptance criteria
 deterministic acceptance, i.e. based on some deterministic function
 probabilistic acceptance, i.e. influenced by some random factor
Termination criteria
 static, i.e. independent of the actual solutions visited during the search
 dynamic, i.e. dependent on the search history

27
Heuristic Search Methods – Classification
Columns: search area (local, global); selection strategy (det., prob., random); history dependence (none, one-step, multi-step); acceptance criterion (det., prob.); termination criterion (stat., dyn.)
hill-climbing: x x x x x
tabu search: x x x x x x
simulated annealing: x x x x x
genetic algorithms: x x x x x x x
random search: x x x x x

28
Single Pass Approaches
The techniques covered so far search through a high number of solutions.
Idea underlying single pass approaches:
 intelligent construction of a single solution (instead of updating and
modification of a number of solutions)
 the solution is constructed by subsequently solving a number of subproblems
Discussion:
 single-pass algorithms are very quick
 quality of solutions is often low
 not applicable where lots of constraints are present (which require some kind
of backtracking)
Important applications of the idea:
 list scheduling: successive selection of a task to be scheduled until the complete schedule has been computed
 clustering: successive merger of nodes/modules until a small number of clusters remains such that each cluster can be assigned to a single HW unit

29
Single Pass Approaches – Framework
The guidelines are crucial and represent the intelligence of the algorithm
Flow of the framework:
(1) derive guidelines for solution construction
(2) select subproblem
(3) decide subproblem based on guidelines
(4) possibly recompute or adapt guidelines
(5) if the final solution is constructed, stop; otherwise continue with (2)

31
List Scheduling – Example (1)
Problem:
 2 processors
 6 tasks with precedence constraints
 find schedule with minimal execution time
HLFET (highest level first with estimated times)
 level of a node: length of the longest (critical) path to the sink node (node 6)
Assignment strategy
 first fit
[Figure: task graph with nodes 1 to 6; estimated times (green): 2, 4, 3, 1, 4, 1; levels (red, priorities): /8, /5, /4, /1, /5, /6]
Resulting schedule:
[Figure: Gantt chart of the schedule on P1 and P2, time axis 0 to 10]
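A sketch of HLFET list scheduling for this example follows. The estimated times are those of the figure; the edge set is an assumption that reproduces the listed level values, and the processor choice (earliest possible start, lowest index on ties) is one simple reading of "first fit".

```python
exec_time = {1: 2, 2: 4, 3: 3, 4: 1, 5: 4, 6: 1}
# ASSUMED precedence edges (not recoverable from the slide text).
succ = {1: [2, 3, 4], 2: [6], 3: [6], 4: [5], 5: [6], 6: []}
pred = {n: [p for p in succ if n in succ[p]] for n in exec_time}

def level(node):
    """Length of the longest path from node to the sink, including node."""
    return exec_time[node] + max((level(s) for s in succ[node]), default=0)

levels = {n: level(n) for n in exec_time}

def hlfet(num_processors=2):
    finish, free_at, schedule = {}, [0] * num_processors, []
    remaining = set(exec_time)
    while remaining:
        ready = [n for n in remaining if all(p in finish for p in pred[n])]
        node = max(ready, key=lambda n: (levels[n], -n))  # highest level first
        earliest = max((finish[p] for p in pred[node]), default=0)
        proc = min(range(num_processors),
                   key=lambda i: (max(free_at[i], earliest), i))
        start = max(free_at[proc], earliest)
        finish[node] = free_at[proc] = start + exec_time[node]
        schedule.append((node, proc, start))
        remaining.remove(node)
    return schedule, max(finish.values())

schedule, makespan = hlfet()
```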

32
List Scheduling – Example (2)
Problem (unchanged):
 2 processors
 6 tasks with precedence constraints
 find schedule with minimal execution time
SCFET (smallest co-level first with estimated times)
 co-level of a node: length of the longest (critical) path to the source node (node 1)
Assignment strategy
 first fit
[Figure: task graph with nodes 1 to 6; estimated times (green): 2, 4, 3, 1, 4, 1; co-levels (blue, priorities): /2, /6, /5, /8, /7, /3]
Resulting schedule:
[Figure: Gantt chart of the schedule on P1 and P2, time axis 0 to 10]
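The co-levels can be computed in the same recursive way as the levels, following predecessors instead of successors. The predecessor lists below are an assumption that reproduces the co-level values shown in the figure.

```python
exec_time = {1: 2, 2: 4, 3: 3, 4: 1, 5: 4, 6: 1}
# ASSUMED predecessor lists (not recoverable from the slide text).
pred = {1: [], 2: [1], 3: [1], 4: [1], 5: [4], 6: [2, 3, 5]}

def co_level(node):
    """Length of the longest path from the source to node, including node."""
    return exec_time[node] + max((co_level(p) for p in pred[node]), default=0)

co_levels = {n: co_level(n) for n in exec_time}
# SCFET considers the tasks in order of increasing co-level.
selection_order = sorted(exec_time, key=co_levels.get)
```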

34
Clustering - Basics
Partitioning of a set of nodes in a given number of subsets
Flow of the algorithm:
(1) assign each node to a different cluster
(2) compute the "distance" between any pair of clusters
(3) select the pair of clusters with the highest affinity
(4) merge the clusters
(5) if the termination criteria hold, stop; otherwise continue with (2)
Application:
 processor assignment (load balancing – minimize interprocess communication)
 scheduling (minimize critical path)
 HW/SW partitioning
Clustering may be employed as part of the optimization process, i.e. combined with other techniques

35
Clustering
Clustering is either probabilistic or deterministic; deterministic clustering is either hierarchical or partitioning
Probabilistic: each node belongs with certain probabilities to different clusters
Deterministic: a node belongs to exactly one cluster or not
Hierarchical:
 starts with a distance matrix of each pair of nodes
 termination after all nodes belong to one cluster
 exact method: always the same result
Partitioning:
 starts with a given number of K clusters (independent from nodes)
 termination after a given number of iterations
 results depend on the chosen initial set of clusters

12
Hierarchical Clustering
Flow of the algorithm:
(1) determine the distance between each pair of nodes (example data in a matrix)
(2) select the smallest distance
(3) replace the selected pair in the distance matrix by a cluster representative
(4) recompute the distance matrix
(5) if all nodes are in one cluster, stop; otherwise continue with (2)
Stepwise reduction of the number of clusters
The algorithm is a kind of successive merger of nearest neighbors (nodes/clusters)

13
Hierarchical Clustering
Dendrogram
Algorithm is kind of subsequent merger of
nearest neighbors (nodes/clusters)

14
Partitioning Clustering (k-means)
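Flow of the algorithm:
(1) choose positions of k initial cluster representatives
(2) assign each node to the nearest cluster representative
(3) recompute the positions of the cluster representatives based on the positions of the nodes in each cluster
(4) if the number of iterations is reached, stop; otherwise continue with (2)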
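A minimal k-means sketch for points in the plane. The points and the initial representatives are hypothetical, and squared Euclidean distance is assumed as the distance measure.

```python
def k_means(points, representatives, iterations=10):
    for _ in range(iterations):  # fixed number of iterations
        # Assign each node to the nearest cluster representative.
        clusters = [[] for _ in representatives]
        for p in points:
            nearest = min(
                range(len(representatives)),
                key=lambda i: (p[0] - representatives[i][0]) ** 2
                            + (p[1] - representatives[i][1]) ** 2)
            clusters[nearest].append(p)
        # Recompute each representative as the mean position of its nodes;
        # an empty cluster keeps its previous representative.
        representatives = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else r
            for c, r in zip(clusters, representatives)]
    return representatives, clusters

# Hypothetical data: two well-separated groups of three points each.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = k_means(points, [(0, 0), (10, 10)])
```

With a different initial set of representatives the result may differ, which is the dependence on the initial clusters noted for partitioning methods.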

15
Clustering – Application to Load Balancing
Optimization goal:
 minimize inter-process (inter-cluster) communication
 limit maximum load per processor (cluster) to 20
Flow of the algorithm:
(1) assign each node to a different cluster
(2) compute the sum of the communication cost between any pair of clusters
(3) select the pair of clusters with the highest comm. cost that does not violate the capacity constraints
(4) merge the clusters
(5) if a further reduction of comm. cost without violation of constraints is possible, continue with (2); otherwise stop

16
Clustering – Application to Load Balancing (2 processors)
[Figure: stepwise clustering of the example process graph onto 2 processors]

17
Clustering – Hierarchical Algorithms
Algorithms implement different methods to compute the distance between two clusters:
 Single linkage
 Complete linkage
 Centroid-based

18
Clustering – Single Linkage
[Figure: scatter plot of the points P1 to P7, both axes from 0 to 10]
Distance between groups is estimated as the smallest distance between entities
Example:
$d_{(2,4),5} = \min[d_{25}, d_{45}] = d_{45} = 4.1$
Cluster # P1 P2 P3 P4 P5 P6 P7
P1 0 7.2 5 7.1 6.1 9.2 7
P2 - 0 3 1.4 5.4 3 6.7
P3 - - 0 2.2 2.8 4.3 4.3
P4 - - - 0 4.1 2.2 5.4
P5 - - - - 0 5.1 1.4
P6 - - - - - 0 6
P7 - - - - - - 0
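The single-linkage rule can be checked against the matrix above with a short Python sketch. The names `ROWS`, `D` and `single_linkage` are hypothetical; the numbers are the upper triangle of the distance matrix.

```python
# Upper triangle of the distance matrix: ROWS[i] holds d(Pi, Pi+1), d(Pi, Pi+2), ...
ROWS = {
    1: [7.2, 5, 7.1, 6.1, 9.2, 7],
    2: [3, 1.4, 5.4, 3, 6.7],
    3: [2.2, 2.8, 4.3, 4.3],
    4: [4.1, 2.2, 5.4],
    5: [5.1, 1.4],
    6: [6],
}
D = {frozenset((i, i + 1 + k)): v for i, row in ROWS.items() for k, v in enumerate(row)}

def single_linkage(a, b):
    """Smallest distance between any point of cluster a and any point of cluster b."""
    return min(D[frozenset((i, j))] for i in a for j in b)

# Example from the slide: cluster (P2, P4) against P5.
print(single_linkage({2, 4}, {5}))          # min(d25, d45) = min(5.4, 4.1)

# Distances of the merged cluster (P2, P4) to P1, P3, (P5, P7) and P6.
row_c24 = [single_linkage({2, 4}, other) for other in ({1}, {3}, {5, 7}, {6})]
print(row_c24)
```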

19
Clustering – Single Linkage
P1 0 7.2 5 7.1 6.1 9.2 7
P2 - 0 3 1.4 5.4 3 6.7
P3 - - 0 2.2 2.8 4.3 4.3
P4 - - - 0 4.1 2.2 5.4
P5 - - - - 0 5.1 1.4
P6 - - - - - 0 6
P7 - - - - - - 0
Cluster # P1 C24 P3 C57 P6
P1 0 7.1 5 6.1 9.2
C24 - 0 2.2 4.1 2.2
P3 - - 0 2.8 4.3
C57 - - - 0 5.1
P6 - - - - 0
Cluster # P1 C243 C57 P6
P1 0 5 6.1 9.2
C243 - 0 2.8 2.2
C57 - - 0 5.1
P6 - - - 0
[Figure: scatter plot of the points P1 to P7 with the merged clusters, both axes from 0 to 10]

20
Clustering – Group Average
P1 0 7.2 5 7.1 6.1 9.2 7
P2 - 0 3 1.4 5.4 3 6.7
P3 - - 0 2.2 2.8 4.3 4.3
P4 - - - 0 4.1 2.2 5.4
P5 - - - - 0 5.1 1.4
P6 - - - - - 0 6
P7 - - - - - - 0
[Figure: scatter plot of the points P1 to P7, both axes from 0 to 10]
Distance between groups is defined as the average distance between all pairs of entities
Example:
$d_{(2,4),5} = \frac{1}{2}(d_{25} + d_{45}) = \frac{1}{2}(5.4 + 4.1) \approx 4.8$
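The group-average rule can be sketched in the same way. The names `ROWS`, `D` and `group_average` are hypothetical; the numbers are the upper triangle of the distance matrix on this slide.

```python
# Upper triangle of the distance matrix: ROWS[i] holds d(Pi, Pi+1), d(Pi, Pi+2), ...
ROWS = {
    1: [7.2, 5, 7.1, 6.1, 9.2, 7],
    2: [3, 1.4, 5.4, 3, 6.7],
    3: [2.2, 2.8, 4.3, 4.3],
    4: [4.1, 2.2, 5.4],
    5: [5.1, 1.4],
    6: [6],
}
D = {frozenset((i, i + 1 + k)): v for i, row in ROWS.items() for k, v in enumerate(row)}

def group_average(a, b):
    """Average distance over all pairs (one point from a, one point from b)."""
    return sum(D[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

d_24_5 = group_average({2, 4}, {5})     # (d25 + d45) / 2
d_24_3 = group_average({2, 4}, {3})     # (d23 + d43) / 2
d_1_57 = group_average({1}, {5, 7})     # (d15 + d17) / 2
print(round(d_24_5, 2), round(d_24_3, 2), round(d_1_57, 2))
```

The unrounded value of the slide example is 4.75, which the slide reports as 4.8.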

21
Clustering – Group Average
P1 0 7.2 5 7.1 6.1 9.2 7
P2 - 0 3 1.4 5.4 3 6.7
P3 - - 0 2.2 2.8 4.3 4.3
P4 - - - 0 4.1 2.2 5.4
P5 - - - - 0 5.1 1.4
P6 - - - - - 0 6
P7 - - - - - - 0
Cluster # P1 C24 P3 C57 P6
P1 0 7.2 5 6.6 9.2
C24 - 0 2.6 5.4 2.6
P3 - - 0 3.6 4.3
C57 - - - 0 5.6
P6 - - - - 0
Cluster # P1 C243 C57 P6
P1 0 6.4 6.6 9.2
C243 - 0 4.8 3.2
C57 - - 0 5.6
P6 - - - 0
[Figure: scatter plot of the points P1 to P7 with the merged clusters, both axes from 0 to 10]

22
Clustering – Centroid-based
P1 0 7.2 5 7.1 6.1 9.2 7
P2 - 0 3 1.4 5.4 3 6.7
P3 - - 0 2.2 2.8 4.3 4.3
P4 - - - 0 4.1 2.2 5.4
P5 - - - - 0 5.1 1.4
P6 - - - - - 0 6
P7 - - - - - - 0
[Figure: scatter plot of the points P1 to P7 with cluster centroids marked by x, both axes from 0 to 10]
Determine distances between centroids (k,l)
Merge centroids with the least distance
$d(k,l) = \sqrt{(x_{C_k} - x_{C_l})^2 + (y_{C_k} - y_{C_l})^2}$
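As a small illustration of the centroid distance, the sketch below computes the centroids of two clusters and their Euclidean distance. The coordinates are hypothetical, since the positions of P1 to P7 are only given in the figure; the function names are assumptions as well.

```python
import math

def centroid(points):
    """Mean x and mean y of a cluster of 2-D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def centroid_distance(cluster_k, cluster_l):
    """Euclidean distance between the centroids C_k and C_l."""
    (xk, yk), (xl, yl) = centroid(cluster_k), centroid(cluster_l)
    return math.sqrt((xk - xl) ** 2 + (yk - yl) ** 2)

# Hypothetical clusters: centroid (2, 1) and centroid (5, 5).
cluster_k = [(1, 1), (3, 1)]
cluster_l = [(5, 5)]
print(centroid(cluster_k), centroid_distance(cluster_k, cluster_l))
```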

23
Clustering – Centroid-based
P1 0 7.2 5 7.1 6.1 9.2 7
P2 - 0 3 1.4 5.4 3 6.7
P3 - - 0 2.2 2.8 4.3 4.3
P4 - - - 0 4.1 2.2 5.4
P5 - - - - 0 5.1 1.4
P6 - - - - - 0 6
P7 - - - - - - 0
Cluster # C1 C24 C3 C57 C6
C1 0 7.1 5 6.5 9.2
C24 - 0 2.5 5.4 2.5
C3 - - 0 3.5 4.3
C57 - - - 0 5.5
C6 - - - - 0
[Figure: scatter plot of the points P1 to P7 with cluster centroids marked by x, both axes from 0 to 10]

24
Differences between Clustering Algorithms
[Figure: clustering results on the same data set in five panels: Single Linkage, Complete Linkage, Centroid Linkage, K-means, Ward; axes X (m) from -3 to 3 x 10^4 and Y (m) from -2.5 to 2.5 x 10^4]

25
Clustering – Variants
Clustering methods
 Partitioning methods: k-means, Fuzzy c-means, SOM, Clique, One Pass, Gustafson-Kessel algorithm
 Hierarchical methods
   Agglomeration (bottom up): single linkage, complete linkage, group average, centroid, MST, ROCK, Ward's
   Division (top down): tree-structured vector quantization, Macnaughton-Smith algorithm
Distance Metrics
 Euclidean
 Manhattan
 Minkowski
 Mahalanobis
 Jaccard
 Canberra
 Chebyshev
 Correlation
 Chi-square
 Kendall's rank correlation
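Three of the listed metrics are closely related: Manhattan and Euclidean distance are the Minkowski distance of order 1 and 2, and the Chebyshev distance is its limit for growing order. A minimal sketch, with hypothetical function names and example points:

```python
def minkowski(p, q, r):
    """Minkowski distance of order r between two points of equal dimension."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)

def manhattan(p, q):
    return minkowski(p, q, 1)

def euclidean(p, q):
    return minkowski(p, q, 2)

def chebyshev(p, q):
    """Largest coordinate difference (limit of the Minkowski distance)."""
    return max(abs(a - b) for a, b in zip(p, q))

p, q = (0, 0), (3, 4)
print(manhattan(p, q), euclidean(p, q), chebyshev(p, q))
```

For the same pair of points the three metrics give 7, 5 and 4, which illustrates why the choice of metric can change which pair of nodes counts as nearest.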

26
Clustering – Discussion
 Results
   Exact results (single linkage)
   Inexact results => often several iterations are necessary (K-means)
 Metrics
   Strong impact on clustering results
   Not every metric is suitable for every clustering algorithm
   Decision for one- or multi-criteria metrics (separate or joint clustering)
 Selection of algorithm
   Depends strongly on the structure of the data set and the expected results
   Some algorithms tend to separate outliers into their own clusters => a few large clusters and many very small clusters (complete linkage)
   Only a few algorithms are able to detect branched, curved or cyclic clusters (single linkage)
   Some algorithms tend to return clusters of nearly equal size (K-means, Ward)
 Quality of clustering results
   The mean variance of the elements in each cluster (affinity parameter) is often used
   In general, the homogeneity within clusters and the heterogeneity between clusters can be measured
   However, the quality prediction can only be as good as the quality of the metric used!
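One way to read the mean-variance criterion is sketched below: the variance of a cluster is taken as the mean squared Euclidean distance of its elements to the cluster centroid, and the quality value is the average over all clusters. This interpretation, the function names and the example points are assumptions for illustration.

```python
def cluster_variance(points):
    """Mean squared distance of the points to their centroid."""
    n = len(points)
    centroid = [sum(c) / n for c in zip(*points)]
    return sum(sum((x - m) ** 2 for x, m in zip(p, centroid)) for p in points) / n

def mean_variance(clusters):
    """Average variance over all clusters (smaller = more homogeneous)."""
    return sum(cluster_variance(c) for c in clusters) / len(clusters)

# Two hypothetical groupings of the same four points.
tight = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
loose = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]
print(mean_variance(tight), mean_variance(loose))
```

The grouping that keeps nearby points together yields the smaller value, but only with respect to the Euclidean metric assumed here.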

27
Branch and Bound with Underestimates
Application of the A* algorithm to the scheduling problem
Example: scheduling on a 2-processor system (processor A and B)
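Process graph (assumption: communication runs in parallel with processing)
Legend:
green: processing times
blue: communication times
[Figure: process graph with processes 1 to 4 and edges 1->2, 1->3, 2->4, 3->4; as used in the calculations, the processing times are 5, 8, 9, 3 for processes 1 to 4 and the communication times are 2 (1->2), 5 (1->3), 6 (2->4), 1 (3->4)]
f(x) = g(x) + h(x)
g(x): exact value of the partial schedule
h(x): underestimate for the remainder (rem)
= min(alternative remaining processing, remaining communication + remaining processing)
x = start, then x = best of X, where X = growing set of known solutions (min of comm. + proc.)
Scheduled to: A B
Search is terminated when min {f(x)} is a terminal node (in the search tree)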
f(1) = 5 + min((9 + 3), (2+8+3)) = 5+12 = 17
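The search strategy can be sketched in Python as a best-first search over partial schedules. Several points are assumptions and not taken from the slides: processes are scheduled in the fixed order 1, 2, 3, 4; communication between processes on the same processor costs nothing; and the underestimate is a simpler one than the h(x) of the slides (it ignores communication and processor contention), so intermediate f values may be smaller than those shown in the search tree. All names are hypothetical.

```python
import heapq

PROC = {1: 5, 2: 8, 3: 9, 4: 3}                        # processing times
COMM = {(1, 2): 2, (1, 3): 5, (2, 4): 6, (3, 4): 1}    # communication times
ORDER = [1, 2, 3, 4]                                   # assumed scheduling order
PREDS = {t: [p for (p, s) in COMM if s == t] for t in PROC}

def schedule_next(state, cpu):
    """Append the next process to `cpu`; state holds (cpu, finish) per process."""
    task = ORDER[len(state)]
    cpu_free = max((fin for c, fin in state if c == cpu), default=0)
    ready = 0
    for p in PREDS[task]:
        p_cpu, p_finish = state[ORDER.index(p)]
        # Communication time only counts between different processors.
        ready = max(ready, p_finish + (0 if p_cpu == cpu else COMM[(p, task)]))
    return state + ((cpu, max(cpu_free, ready) + PROC[task]),)

def f_value(state):
    """Exact finish times so far plus an optimistic estimate for the rest."""
    finish = {ORDER[i]: fin for i, (_, fin) in enumerate(state)}
    for t in ORDER[len(state):]:
        finish[t] = max((finish[p] for p in PREDS[t]), default=0) + PROC[t]
    return max(finish.values())

def a_star():
    heap = [(f_value(()), ())]
    while heap:
        bound, state = heapq.heappop(heap)             # node with min f(x)
        if len(state) == len(ORDER):                   # terminal node: done
            return bound, state
        for cpu in "AB":
            child = schedule_next(state, cpu)
            heapq.heappush(heap, (f_value(child), child))

makespan, schedule = a_star()
print(makespan, schedule)
```

Because the estimate never exceeds the true remaining cost, the first terminal node taken from the queue is optimal under the stated assumptions; the sketch ends with a schedule length of 18.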

28
Example: computation of f(3)
[Figure: process graph as on the previous slide]
Search tree so far:
1 -> A: f(1)=17
1 -> B: f(2)=17
2 -> A: f(3)=22
case 1: A = path 1-2-4
g(3) = 5 + 8 = 13
h(3) = min(3, (5+9+3))
f(3) = 16
case 2: A = path 1-3-4
g(3) = 5
h(3) = 5 + 9 + 3
f(3) = 22
[Figure: Gantt charts of both cases on processors A and B, time axis from 0 to 24]
f(x) = g(x) + h(x)
g(x): exact value of the partial schedule
h(x): underestimate for the remainder

29
Calculation
 Assumption: communication on the same processor runs in parallel with the CPU
 f(1) = 5 + 9 + 3 = 17; f(2) gives the same result, since it does not matter whether process 1 runs on A or B
 f(3) = 5+8 + min (5+9+3, 6+3) = 22
Schedule: processes 1 and 2 on processor A
 f(4) = 5+2+8+ min (3; 9+1+3) = 18
Process 1 on A, 2 on B
(hence +2 communication and proc 8+5)
 f(5) = 5+9+ min (1+3; 2+8+3) = 18
Schedule: processes 1 and 3 on processor A
 f(6) = f(3)
 f(7) = further case distinctions with the remainder
[Figure: process graph as on the previous slides]

30
Application of the A* algorithm to the scheduling problem
Example: scheduling on a 2-processor system (processor A and B)
Process graph and search tree
Legend:
green: processing times
blue: comm. times
[Figure: process graph as on the previous slides]
Search tree (node number: decision, f value):
1: 1 -> A, f(1)=17
2: 1 -> B, f(2)=17
3: 2 -> A, f(3)=22
4: 2 -> B, f(4)=18
5: 3 -> A, f(5)=18
6: 3 -> B, f(6)=22
7: 2 -> A, f(7)=25
8: 2 -> B, f(8)=18
9: 4 -> A, f(9)=24
10: 4 -> B, f(10)=18
f(x) = g(x) + h(x)
g(x): exact
h(x): underestimate for the rest
Search is terminated when min {f(x)} is a terminal node (in the search tree)
[Figure: Gantt chart of the resulting schedule on processors A and B, time axis from 0 to 24]

31
References
Critical Communication Systems. Wiley, 2001. (section 2.5)
 C.R. Reeves (ed.): Modern Heuristic Techniques for Combinatorial Problems.
Blackwell Scientific Publications, 1993.
 H.U. Heiss: Prozessorzuteilung in Parallelrechnern. BI-Wissenschaftsverlag,
Reihe Informatik, Band 98, 1994.
 M. Garey, D. Johnson: Computers and Intractability. W.H. Freeman, New
York, 1979.
