1. FAULT TOLERANCE TECHNIQUES FOR
REAL TIME OPERATING SYSTEM
Seminar Coordinator: Ms SAUMYA SADANANDAN
Guided by: Mr MELBIN VARGHESE JOHN
Prepared by: ANU MARIA K JOSE
S7,IT
2. OUTLINE
INTRODUCTION
PROBLEM STATEMENT
FEATURES OF REAL TIME OPERATING SYSTEMS
DEADLINE
RTOS FEATURES AND FAULT TOLERANCE
MEMORY MANAGEMENT
KERNEL CONSIDERATIONS
PROCESS AND THREAD MANAGEMENT
SCHEDULING
COMMUNICATION
I/O MANAGEMENT
PROGRAMMING LANGUAGES
CONCLUSION
INVITING QUESTIONS
THANK YOU
3. INTRODUCTION
Operating system: Acts as an intermediary between the user
of a computer and the computer hardware.
Fault tolerance: A property that enables a system to continue
operating even in the presence of a failure.
Real-time systems: Systems with well-defined, fixed time
constraints.
4. INTRODUCTION continued…..
Nowadays operating systems are an inseparable part of computer
systems.
RTOSs are widely used in safety-critical domains.
Hence, fault tolerance is an essential requirement of RTOSs
employed in safety-critical domains.
5. PROBLEM STATEMENT
In safety-critical domains all of the system's requirements
must be met, and a system failure leads to catastrophe.
Thus, operating systems employed in safety-critical
domains should produce correct and valid results whether or
not faults are present.
6. FEATURES OF REAL TIME OPERATING
SYSTEM
Real-time operating systems emphasize predictability and
efficiency, and include features to support timing constraints.
All tasks should be released on time and should be
completed before particular times called deadlines.
Violating time constraints leads to system failure.
7. DEADLINES
A deadline is the instant by which a result must be produced.
Deadlines can be:
Soft: If a result has utility even after the deadline has
passed
Firm: If a result has no utility after the deadline has
passed
Hard: If severe consequences could result when a firm
deadline is missed
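The three deadline classes can be sketched in code. This is only an illustration of the definitions above; the names `DeadlineKind` and `outcome` and the returned labels are assumptions made for the example, not part of any RTOS API.

```python
from enum import Enum


class DeadlineKind(Enum):
    SOFT = "soft"
    FIRM = "firm"
    HARD = "hard"


def outcome(kind, finish_time, deadline):
    """Describe what a result is worth, given when it was produced."""
    if finish_time <= deadline:
        return "useful"            # produced in time, whatever the class
    if kind is DeadlineKind.SOFT:
        return "degraded"          # late, but the result still has utility
    if kind is DeadlineKind.FIRM:
        return "discarded"         # late result has no utility
    return "system failure"        # hard: the miss has severe consequences
```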
9. 1. MEMORY MANAGEMENT
To protect operating system components, fault
tolerance begins with memory protection.
The use of DSA (Dynamic Storage Allocation) leads to
timing uncertainty in an RTOS.
FAULT TOLERANCE TECHNIQUES:
TLSF (Two-Level Segregated Fit) algorithm
Bitmaps
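A minimal sketch of why bitmaps help with predictability: a free block can be found with a couple of bit operations instead of a list search. This is not TLSF itself (TLSF uses bitmaps to locate a non-empty free list); it is a simpler fixed-size block pool, and `BitmapPool` is a hypothetical name used only for this example.

```python
class BitmapPool:
    """Fixed-size block pool whose free blocks are tracked in one bitmap."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        # Bit i set means block i is free.
        self.free_mask = (1 << num_blocks) - 1

    def alloc(self):
        """Return the index of a free block, or None if the pool is full."""
        if self.free_mask == 0:
            return None
        lowest = self.free_mask & -self.free_mask  # isolate lowest set bit
        self.free_mask &= ~lowest                  # mark the block as used
        return lowest.bit_length() - 1

    def free(self, index):
        """Release a block; reject out-of-range indexes and double frees."""
        if not 0 <= index < self.num_blocks:
            raise ValueError("block index out of range")
        bit = 1 << index
        if self.free_mask & bit:
            raise ValueError("double free")
        self.free_mask |= bit
```

On real hardware the "lowest set bit" step is usually a single find-first-set instruction, which is what keeps the allocation time bounded.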
10. 1.2 MEMORY MANAGEMENT UNIT
Some RTOSs disable the MMU, causing all processes to run in
the same address space.
A bug in one process can then corrupt the memory of another
process or of the kernel and lead to a system crash.
FAULT TOLERANCE TECHNIQUE:
Enable the MMU
11. 1.3 REDUNDANCY
Redundancy is one of the most important techniques in fault
tolerance.
When a process is loaded, the operating system duplicates its
data and state in more than one memory location.
Whenever the task reads data from memory, a vote is
taken among the replicas.
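The voted read described above can be sketched as follows. The replicas are modelled as plain dictionaries, and `vote` and `voted_read` are hypothetical helper names; writing the winning value back to every replica is an added assumption, not something the slide specifies.

```python
from collections import Counter


def vote(replicas):
    """Return the value held by a strict majority of the replicas."""
    value, count = Counter(replicas).most_common(1)[0]
    if count * 2 <= len(replicas):
        raise RuntimeError("no majority among replicas")
    return value


def voted_read(memories, address):
    """Read one address from every replica and return the voted value."""
    value = vote([memory[address] for memory in memories])
    for memory in memories:
        memory[address] = value  # assumption: repair any corrupted copy
    return value
```

With three replicas this tolerates one corrupted copy per address; two corrupted copies that disagree with each other leave no majority and raise an error.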
12. 1.4 ERROR CORRECTING CODE MEMORY
ECC memory is a means of improving operating system
reliability.
It is a type of computer data storage that can
detect and correct many kinds of internal data corruption.
Some non-ECC memories with parity support allow
errors to be detected, but not corrected.
The reliability of a fault-tolerant RTOS would be
improved by employing this kind of memory.
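The difference between detecting and correcting can be illustrated with a small sketch. It uses a single even-parity bit for detection and a Hamming(7,4) code for single-bit correction; real ECC modules use wider codes than this, and the function names are assumptions made for the example.

```python
def parity_ok(bits_with_parity):
    """Even parity: detects a single flipped bit but cannot locate it."""
    total = 0
    for bit in bits_with_parity:
        total ^= bit
    return total == 0


def hamming74_encode(data):
    """Encode 4 data bits into 7 bits (parity bits at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Return (data bits, error position); position 0 means no error seen."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # syndrome names the faulty position
    if position:
        c[position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position
```

The parity check only answers "was there an error?", while the Hamming syndrome also says where, which is what makes the correction possible.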
13. 2. KERNEL CONSIDERATIONS
The kernel of a fault-tolerant RTOS should meet the
following requirements:
Should provide a mechanism so that whenever an
error occurs, a notification is sent to an agent.
The agent then has the duty to perform some
type of error recovery action.
This agent is called the supervisor and must run in
an isolated address space.
14. KERNEL CONSIDERATIONS continued……
FAULT TOLERANCE TECHNIQUES:
Event logging mechanism
Software watchdog capability
Should protect itself against improperly
invoked system calls and invalid
parameters.
Availability for dependable computing
Should prevent the spread of faults to the
kernel
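A software watchdog combined with event logging and supervisor notification might look like the sketch below. The class name, the `kick`/`check` methods, and the callback are all assumptions made for illustration; time is passed in explicitly so the example does not depend on a real clock.

```python
class SoftwareWatchdog:
    """Flags tasks that have not reported in within `timeout` time units."""

    def __init__(self, timeout, notify_supervisor):
        self.timeout = timeout
        self.notify_supervisor = notify_supervisor  # hypothetical callback
        self.last_kick = {}
        self.event_log = []

    def kick(self, task, now):
        """Called by a healthy task to show it is still making progress."""
        self.last_kick[task] = now

    def check(self, now):
        """Log and report every task whose last kick is too old."""
        for task, last in self.last_kick.items():
            if now - last > self.timeout:
                event = (now, task, "watchdog timeout")
                self.event_log.append(event)
                self.notify_supervisor(event)
```

In this sketch the supervisor is just a function; in the design described above it would run in its own isolated address space and decide on the recovery action.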
15. 3. PROCESS AND THREAD MANAGEMENT
Process definition and activation is one of the most
important roles of an RTOS.
An RTOS should activate a process once and release it once
or periodically.
It must also guarantee that each release starts on time and
finishes before its deadline.
16. PROCESS AND THREAD MANAGEMENT
continued….
If the behavior of tasks is not monitored and controlled by
the RTOS:
a task may be unable to use the processor or other
system resources because of the malicious or careless
execution of another task.
other tasks may fail because they cannot acquire
the resources they require, and so miss their
deadlines.
17. PROCESS AND THREAD MANAGEMENT
continued….
One possible solution is to reserve the required resources for
each process.
In fixed-priority systems, a task's priority may be changed
incorrectly because of a fault in the process table.
A possible technique to solve this problem is to make the
process manager aware of the importance of each task by
using partitions in memory.
18. 4. SCHEDULING
If several processes are ready to run at the same time, the
system has to choose among them.
This decision is called CPU scheduling.
Some of the important scheduling algorithms used in
real-time systems are:
Rate Monotonic (RM)
Earliest Deadline First (EDF)
Least Laxity First (LLF)
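The three algorithms differ only in the key used to pick the next task, which a short sketch can show. The `Task` fields and the `pick_*` function names are assumptions for illustration: RM prefers the shortest period, EDF the earliest absolute deadline, and LLF the smallest laxity (time to deadline minus remaining execution time).

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    period: int      # release period
    deadline: int    # absolute deadline of the current release
    remaining: int   # execution time still needed


def pick_rm(ready):
    """Rate Monotonic: shortest period has the highest priority."""
    return min(ready, key=lambda t: t.period)


def pick_edf(ready):
    """Earliest Deadline First: nearest absolute deadline runs first."""
    return min(ready, key=lambda t: t.deadline)


def pick_llf(ready, now):
    """Least Laxity First: smallest slack before the deadline runs first."""
    return min(ready, key=lambda t: t.deadline - now - t.remaining)
```

The same ready queue can yield three different choices, for example when the task with the shortest period is neither the most urgent nor the one with the least slack.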
19. SCHEDULING continued……..
If the scheduler fails, other system tasks are not scheduled and
released correctly, and as a result the system crashes.
FAULT TOLERANCE TECHNIQUES:
Pre-constructed static scheduling table
N-copy programming (NCP)
Include the time required to handle faulty tasks in the
scheduler's timing analysis
Fault-tolerant RTOSs should also be able to recover processors
from transient and permanent faults.
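One way to read the pre-constructed static scheduling table is as a table computed offline that the dispatcher simply replays, so no scheduling decision is made at run time. The sketch below assumes that reading; the task names, slot lengths, and `task_at` function are hypothetical.

```python
# Hypothetical offline-computed table of (task, duration in ticks) slots.
SCHEDULE_TABLE = [("sensor", 2), ("control", 3), ("idle", 1), ("actuator", 2)]

# Length of one full pass over the table.
MAJOR_CYCLE = sum(duration for _, duration in SCHEDULE_TABLE)


def task_at(tick):
    """Return the task that owns the given tick; the table repeats forever."""
    offset = tick % MAJOR_CYCLE
    for name, duration in SCHEDULE_TABLE:
        if offset < duration:
            return name
        offset -= duration
```

Because the table is fixed before the system starts, it can be checked offline and stored redundantly, which is harder to do for a scheduler that decides dynamically.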
20. 5. I/O MANAGEMENT
Deals with the management of I/O accesses such that
interference is prevented and tasks are completed in time.
Fault-tolerant RTOSs must provide fault tolerance
techniques to tolerate faulty I/O devices.
FAULT TOLERANCE TECHNIQUES:
Replication
Robustness
21. PROGRAMMING LANGUAGES
Special programming languages should be employed to
meet RTOS requirements.
They must guarantee correct responses within strict timing
constraints.
They should also support some error detection and error
correction techniques.
Some characteristics that RTOS programming languages
should have are: well-defined language
semantics, strong type checking, and structuring
mechanisms.
22. CONCLUSION
Real-time operating systems are widely used in
safety-critical domains.
Safety-critical system: a system in which a failure to
meet the system requirements has catastrophic
effects.
The cost of such a system failure exceeds the initial
investment in the computer and in the
controlled object.