SlideShare a Scribd company logo
Dependable Systems
!
Hardware Dependability - Diagnosis
Dr. Peter Tröger

!
Sources:

!
Siewiorek, Daniel P.; Swarz, Robert S.: 

Reliable Computer Systems. third. Wellesley, MA : A. K. Peters, Ltd., 1998. ,
156881092X

!
Some images (C) Elena Dubrova, ESDLab, Kungl Tekniska Högskolan
Dependable Systems Course PT 2014
Fault Diagnosis
• Contains of fault detection (what ?) and fault location (where ?)

• Fault detection

• Replication checks - Test operation / execution against alternate implementation

• Timing checks - Test operation / execution against timing constraints

• Reversal checks - Make use of operation reversibility

• Coding checks - Utilize redundant (but different) representation of data

• Reasonableness checks - Check against known system / data properties

• Structural checks - Ensure consistent structure of data, diagnostic tests, ...

• Diagnostic checks - Use set of inputs for which the outputs are known

• Algorithmic checks - Check invariants of an algorithm
2
Dependable Systems Course PT 2014
Fault Detection
• Replication check
• Perform test against the same / alternate implementation

• Identical / partial copies of unit - assumes correct design and independent
failure

• Separate and different copies - assumes design faults 

• Repeated execution - assumes transient fault

• Works well, but expensive

• Implementable in spatial

redundancy architecture
3
Dependable Systems Course PT 2014
Fault Detection
• Timing checks
• Monitor execution based on timing constraints

• Watchdog - Progress resets timer, expired timer assumes broken component

• Detection of crash, overload, infinite loop; frequency depends on application 

• Broadcast timeout - Component broadcasts message on progress, 

receivers have deadline for next message 

• Acknowledge timeout - Peer should react on request with answer message
4
Tandem: Heartbeat broadcast every second, check on every second message
Dependable Systems Course PT 2014
Fault Detection
• Reversal checks
• One-to-one relationship between inputs and outputs

• Calculates inputs from output by reverse function, comparison

• Re-read data after write, mathematical functions

• Diagnostic checks
• Check known output for some test inputs

• Typical approach in built-in hardware testing 

• Load tests - run at saturation levels, trigger abnormal environmental conditions

• Plausibility checks
• Based on knowledge about system design and data types - range checks,
consistency checks, type checks
5
p
x
2
= x
Dependable Systems Course PT 2014
Fault Detection - Coding Checks
• Coding techniques help with fault detection, diagnosis and tolerance

• General idea: Informational redundancy

• Code defines a subset of all possible information words as valid information

• Each valid information is a code word

• Error detection mechanism: Is the given word a (valid) code word?

• Number space - All numbers that can be represented in a numerical system

• Code space - Subset of a number space

• Applications: Data storage, ALU, communication busses, self-checking hardware, ...

• Ranking of codes: Detection coverage, overhead, complexity added, ...
6
Dependable Systems Course PT 2014
Coding Checks in Memory Hardware
• Primary memory
• Parity code

• m-out-of-n resp. m-of-n resp. m/n code

• Checksumming

• Berger Code

• Hamming code

• Secondary storage
• RAID codes

• Reed-Solomon code
7
Dependable Systems Course PT 2014
Hamming Distance (Richard Hamming, 1950)
• Hamming distance: Number of positions on which two words differ 

• Alternative definition: Number of necessary substitutions to make A to B

• Alternative definition: Number of errors that transform A to B

!
!
!
!
• Minimum Hamming distance: Minimum distance between any two code words

• To detect d single-bit errors, a code must have a min. distance of at least d + 1

• If at least d+1 bits change in the transmitted code, a new (valid) code appears

• To correct d single-bit errors, the minimum distance must be 2d+1
8
(C)Wikipedia
Dependable Systems Course PT 2014
Parity Codes
• Add parity bit to the information word

• Even parity: If odd number of ones, add one to make the number of ones even

• Odd parity: If even number of ones, add one to make the number of ones odd

• Variants

• Bit-per-word parity

• Bit-per-byte parity 

(Example: 

Pentium data cache)

• Bit-per-chip parity

• ...
9
Dependable Systems Course PT 2014
Two-Dimensional Parity
10
Example: Odd Parity
Allows
fault
location
Dependable Systems Course PT 2014
m-out-of-n Code
• Data word of length n 

(including code bits), 

make sure that m ones are in the word

• Example: 2-out-of-5 code 

• Often used in telecommunication
and in the US postal system

• 5 bits allow 10 combinations with 

2-out-of-5, so all decimal digits are
representable

• Can deal with more than only 

single-bit errors
11
Decimal Digit 2-out-of-5
0 0 0 0 1 1
1 0 0 1 0 1
2 0 0 1 1 0
3 0 1 0 0 1
4 0 1 0 1 0
5 0 1 1 0 0
6 1 0 0 0 1
7 1 0 0 1 0
8 1 0 1 0 0
9 1 1 0 0 0
Dependable Systems Course PT 2014
Code Properties
12
Code Bits / Word
Number of 

possible words
Coverage
Even Parity 4 8
Any single bit error, 

no double-bit error,

not all-0s or all 1s errors
Odd Parity 4 8
Any single bit error, 

no double-bit error,
all-0s or all 1s errors
2/4 4 6
Any single bit error, 

33% of double bit errors
Dependable Systems Course PT 2014
Checksumming
• General idea: Multiply components of a data word and add them up to a checksum

• Good for large blocks of data, low hardware overhead

• Might require substantial time, memory overhead

• 100% coverage for single faults

• Example: ISBN 10 check digit

• Final number of 10 digit ISBN number is a check digit, computed by:

• Multiply all data digits by their position number, starting from right

• Take sum of the products, and set the check digit so that result MOD 11 = 0

• Check digit „01“ is represented by X

• More complex approaches get better coverage (e.g. CRC) for more CPU time
13
Dependable Systems Course PT 2014
Hamming Code
• Every data word with n bits is extended 

by k control bits

• Control bits through standard parity approach 

(even parity), implementable with XOR

• Inserted at power-of-two positions 

• Parity bit pX is responsible for all position
numbers with the X-least significant bit set
(e.g. p1 responsible for ,red dot‘ positions)

• Parity bits check overlapping parts of the data
bits, computation and check are cheap, 

but significant overhead in data

• Minimum Hamming distance of three 

(single bit correction)

• Application in DRAM memory
14
takenfromWikimediaCommons
Dependable Systems Course PT 2014
Hamming Code for 26bit word size
15
takenfromWikimediaCommons
Dependable Systems Course PT 2014
Hamming Code
• Hamming codes are known to produce 10% to 40% of data
overhead

• Syndrome determines which bit (if any) is corrupted

• Example ECC

• Majority of „one-off“ soft errors in DRAM happens from
background radiation

• Denser packages, lower voltage for higher frequencies

• Most common approach is the SECDEC Hamming code

• "Single error correction, double error
detection" (SECDEC)

• Hamming code with additional parity

• Modern BIOS implementations perform 

threshold-based warnings on frequent correctable errors
16
(C) Z. Kalbarczyk

More Related Content

What's hot

Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Peter Tröger
 
Dependable Systems -Fault Tolerance Patterns (4/16)
Dependable Systems -Fault Tolerance Patterns (4/16)Dependable Systems -Fault Tolerance Patterns (4/16)
Dependable Systems -Fault Tolerance Patterns (4/16)
Peter Tröger
 
What activates a bug? A refinement of the Laprie terminology model.
What activates a bug? A refinement of the Laprie terminology model.What activates a bug? A refinement of the Laprie terminology model.
What activates a bug? A refinement of the Laprie terminology model.
Peter Tröger
 
OpenSubmit - How to grade 1200 code submissions
OpenSubmit - How to grade 1200 code submissionsOpenSubmit - How to grade 1200 code submissions
OpenSubmit - How to grade 1200 code submissions
Peter Tröger
 
Availability tactics
Availability tacticsAvailability tactics
Availability tactics
ahsan riaz
 
SE2018_Lec 19_ Software Testing
SE2018_Lec 19_ Software TestingSE2018_Lec 19_ Software Testing
SE2018_Lec 19_ Software Testing
Amr E. Mohamed
 
Voting protocol
Voting protocolVoting protocol
Voting protocol
Dipti Shrivastava
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial Intelligence
Lionel Briand
 
Bristol 2009 q1_blackmore_tim
Bristol 2009 q1_blackmore_timBristol 2009 q1_blackmore_tim
Bristol 2009 q1_blackmore_tim
Obsidian Software
 
Bangalore march07
Bangalore march07Bangalore march07
Bangalore march07
Obsidian Software
 
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
Lionel Briand
 
Introduction to System verilog
Introduction to System verilog Introduction to System verilog
Introduction to System verilog
Pushpa Yakkala
 
Chapter 12
Chapter 12Chapter 12
Chapter 12
sarath1992
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
Lionel Briand
 
Software archiecture lecture06
Software archiecture   lecture06Software archiecture   lecture06
Software archiecture lecture06
Luktalja
 
Reverse Engineering of Software Architecture
Reverse Engineering of Software ArchitectureReverse Engineering of Software Architecture
Reverse Engineering of Software Architecture
Dharmalingam Ganesan
 
Software Fault Tolerance
Software Fault ToleranceSoftware Fault Tolerance
Software Fault Tolerance
Ankit Singh
 
Software archiecture lecture05
Software archiecture   lecture05Software archiecture   lecture05
Software archiecture lecture05
Luktalja
 
Fault tolerance
Fault toleranceFault tolerance
Fault tolerance
Lekashri Subramanian
 
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Lionel Briand
 

What's hot (20)

Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
Dependable Systems - Structure-Based Dependabiilty Modeling (6/16)
 
Dependable Systems -Fault Tolerance Patterns (4/16)
Dependable Systems -Fault Tolerance Patterns (4/16)Dependable Systems -Fault Tolerance Patterns (4/16)
Dependable Systems -Fault Tolerance Patterns (4/16)
 
What activates a bug? A refinement of the Laprie terminology model.
What activates a bug? A refinement of the Laprie terminology model.What activates a bug? A refinement of the Laprie terminology model.
What activates a bug? A refinement of the Laprie terminology model.
 
OpenSubmit - How to grade 1200 code submissions
OpenSubmit - How to grade 1200 code submissionsOpenSubmit - How to grade 1200 code submissions
OpenSubmit - How to grade 1200 code submissions
 
Availability tactics
Availability tacticsAvailability tactics
Availability tactics
 
SE2018_Lec 19_ Software Testing
SE2018_Lec 19_ Software TestingSE2018_Lec 19_ Software Testing
SE2018_Lec 19_ Software Testing
 
Voting protocol
Voting protocolVoting protocol
Voting protocol
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial Intelligence
 
Bristol 2009 q1_blackmore_tim
Bristol 2009 q1_blackmore_timBristol 2009 q1_blackmore_tim
Bristol 2009 q1_blackmore_tim
 
Bangalore march07
Bangalore march07Bangalore march07
Bangalore march07
 
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
 
Introduction to System verilog
Introduction to System verilog Introduction to System verilog
Introduction to System verilog
 
Chapter 12
Chapter 12Chapter 12
Chapter 12
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
 
Software archiecture lecture06
Software archiecture   lecture06Software archiecture   lecture06
Software archiecture lecture06
 
Reverse Engineering of Software Architecture
Reverse Engineering of Software ArchitectureReverse Engineering of Software Architecture
Reverse Engineering of Software Architecture
 
Software Fault Tolerance
Software Fault ToleranceSoftware Fault Tolerance
Software Fault Tolerance
 
Software archiecture lecture05
Software archiecture   lecture05Software archiecture   lecture05
Software archiecture lecture05
 
Fault tolerance
Fault toleranceFault tolerance
Fault tolerance
 
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
 

Similar to Dependable Systems - Hardware Dependability with Diagnosis (13/16)

Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error D...
Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error D...Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error D...
Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error D...
Philip Koopman
 
Unit Testing a Primer
Unit Testing a PrimerUnit Testing a Primer
Unit Testing a Primer
Mindfire Solutions
 
Monomi: Practical Analytical Query Processing over Encrypted Data
Monomi: Practical Analytical Query Processing over Encrypted DataMonomi: Practical Analytical Query Processing over Encrypted Data
Monomi: Practical Analytical Query Processing over Encrypted Data
Mostafa Arjmand
 
Osi model
Osi model Osi model
Osi model
maha tce
 
Improving the accuracy and reliability of data analysis code
Improving the accuracy and reliability of data analysis codeImproving the accuracy and reliability of data analysis code
Improving the accuracy and reliability of data analysis code
Johan Carlin
 
Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...
Mahdi Hosseini Moghaddam
 
Information systems application control Framework.ppt
Information systems application control Framework.pptInformation systems application control Framework.ppt
Information systems application control Framework.ppt
r209777z
 
data science with python_UNIT 2_full notes.pdf
data science with python_UNIT 2_full notes.pdfdata science with python_UNIT 2_full notes.pdf
data science with python_UNIT 2_full notes.pdf
mukeshgarg02
 
B21DA0201_02.ppt
B21DA0201_02.pptB21DA0201_02.ppt
B21DA0201_02.ppt
DrPreethiD1
 
Formal Verification
Formal VerificationFormal Verification
Formal Verification
Ilia Levin
 
#ITsubbotnik Spring 2017: Dmitrii Nikitko "Deep learning for understanding of...
#ITsubbotnik Spring 2017: Dmitrii Nikitko "Deep learning for understanding of...#ITsubbotnik Spring 2017: Dmitrii Nikitko "Deep learning for understanding of...
#ITsubbotnik Spring 2017: Dmitrii Nikitko "Deep learning for understanding of...
epamspb
 
Unstructured data processing webinar 06272016
Unstructured data processing webinar 06272016Unstructured data processing webinar 06272016
Unstructured data processing webinar 06272016
George Roth
 
Measuring the Combinatorial Coverage of Software in Real Time
Measuring the Combinatorial Coverage of Software in Real  TimeMeasuring the Combinatorial Coverage of Software in Real  Time
Measuring the Combinatorial Coverage of Software in Real Time
Zachary Ratliff
 
Internetworking fundamentals(networking)
Internetworking fundamentals(networking)Internetworking fundamentals(networking)
Internetworking fundamentals(networking)
welcometofacebook
 
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
TEST Huddle
 
CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019
Olivera Milenkovic
 
Cs 331 Data Structures
Cs 331 Data StructuresCs 331 Data Structures
NBTC#2 - Why instrumentation is cooler then ice
NBTC#2 - Why instrumentation is cooler then iceNBTC#2 - Why instrumentation is cooler then ice
NBTC#2 - Why instrumentation is cooler then ice
Alexandre Moneger
 
Fault Detection Scheme for AES Using Composite Field
Fault Detection Scheme for AES Using Composite FieldFault Detection Scheme for AES Using Composite Field
Fault Detection Scheme for AES Using Composite Field
AJAL A J
 
FALLSEM2022-23_BCSE202L_TH_VL2022230103292_Reference_Material_I_25-07-2022_Fu...
FALLSEM2022-23_BCSE202L_TH_VL2022230103292_Reference_Material_I_25-07-2022_Fu...FALLSEM2022-23_BCSE202L_TH_VL2022230103292_Reference_Material_I_25-07-2022_Fu...
FALLSEM2022-23_BCSE202L_TH_VL2022230103292_Reference_Material_I_25-07-2022_Fu...
AntareepMajumder
 

Similar to Dependable Systems - Hardware Dependability with Diagnosis (13/16) (20)

Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error D...
Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error D...Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error D...
Data Integrity Techniques: Aviation Best Practices for CRC & Checksum Error D...
 
Unit Testing a Primer
Unit Testing a PrimerUnit Testing a Primer
Unit Testing a Primer
 
Monomi: Practical Analytical Query Processing over Encrypted Data
Monomi: Practical Analytical Query Processing over Encrypted DataMonomi: Practical Analytical Query Processing over Encrypted Data
Monomi: Practical Analytical Query Processing over Encrypted Data
 
Osi model
Osi model Osi model
Osi model
 
Improving the accuracy and reliability of data analysis code
Improving the accuracy and reliability of data analysis codeImproving the accuracy and reliability of data analysis code
Improving the accuracy and reliability of data analysis code
 
Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...
 
Information systems application control Framework.ppt
Information systems application control Framework.pptInformation systems application control Framework.ppt
Information systems application control Framework.ppt
 
data science with python_UNIT 2_full notes.pdf
data science with python_UNIT 2_full notes.pdfdata science with python_UNIT 2_full notes.pdf
data science with python_UNIT 2_full notes.pdf
 
B21DA0201_02.ppt
B21DA0201_02.pptB21DA0201_02.ppt
B21DA0201_02.ppt
 
Formal Verification
Formal VerificationFormal Verification
Formal Verification
 
#ITsubbotnik Spring 2017: Dmitrii Nikitko "Deep learning for understanding of...
#ITsubbotnik Spring 2017: Dmitrii Nikitko "Deep learning for understanding of...#ITsubbotnik Spring 2017: Dmitrii Nikitko "Deep learning for understanding of...
#ITsubbotnik Spring 2017: Dmitrii Nikitko "Deep learning for understanding of...
 
Unstructured data processing webinar 06272016
Unstructured data processing webinar 06272016Unstructured data processing webinar 06272016
Unstructured data processing webinar 06272016
 
Measuring the Combinatorial Coverage of Software in Real Time
Measuring the Combinatorial Coverage of Software in Real  TimeMeasuring the Combinatorial Coverage of Software in Real  Time
Measuring the Combinatorial Coverage of Software in Real Time
 
Internetworking fundamentals(networking)
Internetworking fundamentals(networking)Internetworking fundamentals(networking)
Internetworking fundamentals(networking)
 
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
Mike Bartley - Innovations for Testing Parallel Software - EuroSTAR 2012
 
CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019
 
Cs 331 Data Structures
Cs 331 Data StructuresCs 331 Data Structures
Cs 331 Data Structures
 
NBTC#2 - Why instrumentation is cooler then ice
NBTC#2 - Why instrumentation is cooler then iceNBTC#2 - Why instrumentation is cooler then ice
NBTC#2 - Why instrumentation is cooler then ice
 
Fault Detection Scheme for AES Using Composite Field
Fault Detection Scheme for AES Using Composite FieldFault Detection Scheme for AES Using Composite Field
Fault Detection Scheme for AES Using Composite Field
 
FALLSEM2022-23_BCSE202L_TH_VL2022230103292_Reference_Material_I_25-07-2022_Fu...
FALLSEM2022-23_BCSE202L_TH_VL2022230103292_Reference_Material_I_25-07-2022_Fu...FALLSEM2022-23_BCSE202L_TH_VL2022230103292_Reference_Material_I_25-07-2022_Fu...
FALLSEM2022-23_BCSE202L_TH_VL2022230103292_Reference_Material_I_25-07-2022_Fu...
 

More from Peter Tröger

WannaCry - An OS course perspective
WannaCry - An OS course perspectiveWannaCry - An OS course perspective
WannaCry - An OS course perspective
Peter Tröger
 
Cloud Standards and Virtualization
Cloud Standards and VirtualizationCloud Standards and Virtualization
Cloud Standards and Virtualization
Peter Tröger
 
Distributed Resource Management Application API (DRMAA) Version 2
Distributed Resource Management Application API (DRMAA) Version 2Distributed Resource Management Application API (DRMAA) Version 2
Distributed Resource Management Application API (DRMAA) Version 2
Peter Tröger
 
Design of Software for Embedded Systems
Design of Software for Embedded SystemsDesign of Software for Embedded Systems
Design of Software for Embedded Systems
Peter Tröger
 
Humans should not write XML.
Humans should not write XML.Humans should not write XML.
Humans should not write XML.
Peter Tröger
 
Dependable Systems - System Dependability Evaluation (8/16)
Dependable Systems - System Dependability Evaluation (8/16)Dependable Systems - System Dependability Evaluation (8/16)
Dependable Systems - System Dependability Evaluation (8/16)
Peter Tröger
 
Verteilte Software-Systeme im Kontext von Industrie 4.0
Verteilte Software-Systeme im Kontext von Industrie 4.0Verteilte Software-Systeme im Kontext von Industrie 4.0
Verteilte Software-Systeme im Kontext von Industrie 4.0
Peter Tröger
 
Operating Systems 1 (12/12) - Summary
Operating Systems 1 (12/12) - SummaryOperating Systems 1 (12/12) - Summary
Operating Systems 1 (12/12) - Summary
Peter Tröger
 
Operating Systems 1 (8/12) - Concurrency
Operating Systems 1 (8/12) - ConcurrencyOperating Systems 1 (8/12) - Concurrency
Operating Systems 1 (8/12) - Concurrency
Peter Tröger
 
Operating Systems 1 (9/12) - Memory Management Concepts
Operating Systems 1 (9/12) - Memory Management ConceptsOperating Systems 1 (9/12) - Memory Management Concepts
Operating Systems 1 (9/12) - Memory Management Concepts
Peter Tröger
 
Operating Systems 1 (7/12) - Threads
Operating Systems 1 (7/12) - ThreadsOperating Systems 1 (7/12) - Threads
Operating Systems 1 (7/12) - Threads
Peter Tröger
 
Operating Systems 1 (1/12) - History
Operating Systems 1 (1/12) - HistoryOperating Systems 1 (1/12) - History
Operating Systems 1 (1/12) - History
Peter Tröger
 
Operating Systems 1 (2/12) - Hardware Basics
Operating Systems 1 (2/12) - Hardware BasicsOperating Systems 1 (2/12) - Hardware Basics
Operating Systems 1 (2/12) - Hardware Basics
Peter Tröger
 
Operating Systems 1 (11/12) - Input / Output
Operating Systems 1 (11/12) - Input / OutputOperating Systems 1 (11/12) - Input / Output
Operating Systems 1 (11/12) - Input / Output
Peter Tröger
 
Operating Systems 1 (3/12) - Architectures
Operating Systems 1 (3/12) - ArchitecturesOperating Systems 1 (3/12) - Architectures
Operating Systems 1 (3/12) - Architectures
Peter Tröger
 
Operating Systems 1 (6/12) - Processes
Operating Systems 1 (6/12) - ProcessesOperating Systems 1 (6/12) - Processes
Operating Systems 1 (6/12) - Processes
Peter Tröger
 
Operating Systems 1 (5/12) - Architectures (Unix)
Operating Systems 1 (5/12) - Architectures (Unix)Operating Systems 1 (5/12) - Architectures (Unix)
Operating Systems 1 (5/12) - Architectures (Unix)
Peter Tröger
 
Operating Systems 1 (4/12) - Architectures (Windows)
Operating Systems 1 (4/12) - Architectures (Windows)Operating Systems 1 (4/12) - Architectures (Windows)
Operating Systems 1 (4/12) - Architectures (Windows)
Peter Tröger
 

More from Peter Tröger (18)

WannaCry - An OS course perspective
WannaCry - An OS course perspectiveWannaCry - An OS course perspective
WannaCry - An OS course perspective
 
Cloud Standards and Virtualization
Cloud Standards and VirtualizationCloud Standards and Virtualization
Cloud Standards and Virtualization
 
Distributed Resource Management Application API (DRMAA) Version 2
Distributed Resource Management Application API (DRMAA) Version 2Distributed Resource Management Application API (DRMAA) Version 2
Distributed Resource Management Application API (DRMAA) Version 2
 
Design of Software for Embedded Systems
Design of Software for Embedded SystemsDesign of Software for Embedded Systems
Design of Software for Embedded Systems
 
Humans should not write XML.
Humans should not write XML.Humans should not write XML.
Humans should not write XML.
 
Dependable Systems - System Dependability Evaluation (8/16)
Dependable Systems - System Dependability Evaluation (8/16)Dependable Systems - System Dependability Evaluation (8/16)
Dependable Systems - System Dependability Evaluation (8/16)
 
Verteilte Software-Systeme im Kontext von Industrie 4.0
Verteilte Software-Systeme im Kontext von Industrie 4.0Verteilte Software-Systeme im Kontext von Industrie 4.0
Verteilte Software-Systeme im Kontext von Industrie 4.0
 
Operating Systems 1 (12/12) - Summary
Operating Systems 1 (12/12) - SummaryOperating Systems 1 (12/12) - Summary
Operating Systems 1 (12/12) - Summary
 
Operating Systems 1 (8/12) - Concurrency
Operating Systems 1 (8/12) - ConcurrencyOperating Systems 1 (8/12) - Concurrency
Operating Systems 1 (8/12) - Concurrency
 
Operating Systems 1 (9/12) - Memory Management Concepts
Operating Systems 1 (9/12) - Memory Management ConceptsOperating Systems 1 (9/12) - Memory Management Concepts
Operating Systems 1 (9/12) - Memory Management Concepts
 
Operating Systems 1 (7/12) - Threads
Operating Systems 1 (7/12) - ThreadsOperating Systems 1 (7/12) - Threads
Operating Systems 1 (7/12) - Threads
 
Operating Systems 1 (1/12) - History
Operating Systems 1 (1/12) - HistoryOperating Systems 1 (1/12) - History
Operating Systems 1 (1/12) - History
 
Operating Systems 1 (2/12) - Hardware Basics
Operating Systems 1 (2/12) - Hardware BasicsOperating Systems 1 (2/12) - Hardware Basics
Operating Systems 1 (2/12) - Hardware Basics
 
Operating Systems 1 (11/12) - Input / Output
Operating Systems 1 (11/12) - Input / OutputOperating Systems 1 (11/12) - Input / Output
Operating Systems 1 (11/12) - Input / Output
 
Operating Systems 1 (3/12) - Architectures
Operating Systems 1 (3/12) - ArchitecturesOperating Systems 1 (3/12) - Architectures
Operating Systems 1 (3/12) - Architectures
 
Operating Systems 1 (6/12) - Processes
Operating Systems 1 (6/12) - ProcessesOperating Systems 1 (6/12) - Processes
Operating Systems 1 (6/12) - Processes
 
Operating Systems 1 (5/12) - Architectures (Unix)
Operating Systems 1 (5/12) - Architectures (Unix)Operating Systems 1 (5/12) - Architectures (Unix)
Operating Systems 1 (5/12) - Architectures (Unix)
 
Operating Systems 1 (4/12) - Architectures (Windows)
Operating Systems 1 (4/12) - Architectures (Windows)Operating Systems 1 (4/12) - Architectures (Windows)
Operating Systems 1 (4/12) - Architectures (Windows)
 

Recently uploaded

World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024
ak6969907
 
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Akanksha trivedi rama nursing college kanpur.
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Dr. Vinod Kumar Kanvaria
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
History of Stoke Newington
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
Peter Windle
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
Priyankaranawat4
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
National Information Standards Organization (NISO)
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
Celine George
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
Nguyen Thanh Tu Collection
 
Top five deadliest dog breeds in America
Top five deadliest dog breeds in AmericaTop five deadliest dog breeds in America
Top five deadliest dog breeds in America
Bisnar Chase Personal Injury Attorneys
 
Advanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docxAdvanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docx
adhitya5119
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
adhitya5119
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
heathfieldcps1
 

Recently uploaded (20)

World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024
 
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
 
Top five deadliest dog breeds in America
Top five deadliest dog breeds in AmericaTop five deadliest dog breeds in America
Top five deadliest dog breeds in America
 
Advanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docxAdvanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docx
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
 

Dependable Systems - Hardware Dependability with Diagnosis (13/16)

  • 1. Dependable Systems ! Hardware Dependability - Diagnosis Dr. Peter Tröger ! Sources: ! Siewiorek, Daniel P.; Swarz, Robert S.: Reliable Computer Systems. third. Wellesley, MA : A. K. Peters, Ltd., 1998. , 156881092X ! Some images (C) Elena Dubrova, ESDLab, Kungl Tekniska Högskolan
  • 2. Dependable Systems Course PT 2014 Fault Diagnosis • Contains of fault detection (what ?) and fault location (where ?) • Fault detection • Replication checks - Test operation / execution against alternate implementation • Timing checks - Test operation / execution against timing constraints • Reversal checks - Make use of operation reversibility • Coding checks - Utilize redundant (but different) representation of data • Reasonableness checks - Check against known system / data properties • Structural checks - Ensure consistent structure of data, diagnostic tests, ... • Diagnostic checks - Use set of inputs for which the outputs are known • Algorithmic checks - Check invariants of an algorithm 2
  • 3. Dependable Systems Course PT 2014 Fault Detection • Replication check • Perform test against the same / alternate implementation • Identical / partial copies of unit - assumes correct design and independent failure • Separate and different copies - assumes design faults • Repeated execution - assumes transient fault • Works well, but expensive • Implementable in spatial
 redundancy architecture 3
  • 4. Dependable Systems Course PT 2014 Fault Detection • Timing checks • Monitor execution based on timing constraints • Watchdog - Progress resets timer, expired timer assumes broken component • Detection of crash, overload, infinite loop; frequency depends on application • Broadcast timeout - Component broadcasts message on progress, 
 receivers have deadline for next message • Acknowledge timeout - Peer should react on request with answer message 4 Tandem: Heartbeat broadcast every second, check on every second message
  • 5. Dependable Systems Course PT 2014 Fault Detection • Reversal checks • One-to-one relationship between inputs and outputs • Calculates inputs from output by reverse function, comparison • Re-read data after write, mathematical functions • Diagnostic checks • Check known output for some test inputs • Typical approach in built-in hardware testing • Load tests - run at saturation levels, trigger abnormal environmental conditions • Plausibility checks • Based on knowledge about system design and data types - range checks, consistency checks, type checks 5 p x 2 = x
  • 6. Dependable Systems Course PT 2014 Fault Detection - Coding Checks • Coding techniques help with fault detection, diagnosis and tolerance • General idea: Informational redundancy • Code defines a subset of all possible information words as valid information • Each valid information is a code word • Error detection mechanism: Is the given word a (valid) code word? • Number space - All numbers that can be represented in a numerical system • Code space - Subset of a number space • Applications: Data storage, ALU, communication busses, self-checking hardware, ... • Ranking of codes: Detection coverage, overhead, complexity added, ... 6
  • 7. Dependable Systems Course PT 2014 Coding Checks in Memory Hardware • Primary memory • Parity code • m-out-of-n resp. m-of-n resp. m/n code • Checksumming • Berger Code • Hamming code • Secondary storage • RAID codes • Reed-Solomon code 7
  • 8. Dependable Systems Course PT 2014 Hamming Distance (Richard Hamming, 1950) • Hamming distance: Number of positions on which two words differ • Alternative definition: Number of necessary substitutions to make A to B • Alternative definition: Number of errors that transform A to B ! ! ! ! • Minimum Hamming distance: Minimum distance between any two code words • To detect d single-bit errors, a code must have a min. distance of at least d + 1 • If at least d+1 bits change in the transmitted code, a new (valid) code appears • To correct d single-bit errors, the minimum distance must be 2d+1 8 (C)Wikipedia
  • 9. Dependable Systems Course PT 2014 Parity Codes • Add parity bit to the information word • Even parity: If odd number of ones, add one to make the number of ones even • Odd parity: If even number of ones, add one to make the number of ones odd • Variants • Bit-per-word parity • Bit-per-byte parity 
 (Example: 
 Pentium data cache) • Bit-per-chip parity • ... 9
  • 10. Dependable Systems Course PT 2014 Two-Dimensional Parity 10 Example: Odd Parity Allows fault location
  • 11. Dependable Systems Course PT 2014 m-out-of-n Code • Data word of length n 
 (including code bits), 
 make sure that m ones are in the word • Example: 2-out-of-5 code • Often used in telecommunication and in the US postal system • 5 bits allow 10 combinations with 
 2-out-of-5, so all decimal digits are representable • Can deal with more than only 
 single-bit errors 11 Decimal Digit 2-out-of-5 0 0 0 0 1 1 1 0 0 1 0 1 2 0 0 1 1 0 3 0 1 0 0 1 4 0 1 0 1 0 5 0 1 1 0 0 6 1 0 0 0 1 7 1 0 0 1 0 8 1 0 1 0 0 9 1 1 0 0 0
  • 12. Dependable Systems Course PT 2014 Code Properties 12 Code Bits / Word Number of 
 possible words Coverage Even Parity 4 8 Any single bit error, 
 no double-bit error,
 not all-0s or all 1s errors Odd Parity 4 8 Any single bit error, 
 no double-bit error, all-0s or all 1s errors 2/4 4 6 Any single bit error, 
 33% of double bit errors
  • 13. Dependable Systems Course PT 2014 Checksumming • General idea: Multiply components of a data word and add them up to a checksum • Good for large blocks of data, low hardware overhead • Might require substantial time, memory overhead • 100% coverage for single faults • Example: ISBN 10 check digit • Final number of 10 digit ISBN number is a check digit, computed by: • Multiply all data digits by their position number, starting from right • Take sum of the products, and set the check digit so that result MOD 11 = 0 • Check digit „01“ is represented by X • More complex approaches get better coverage (e.g. CRC) for more CPU time 13
  • 14. Dependable Systems Course PT 2014 Hamming Code • Every data word with n bits is extended 
 by k control bits • Control bits through standard parity approach 
 (even parity), implementable with XOR • Inserted at power-of-two positions • Parity bit pX is responsible for all position numbers with the X-least significant bit set (e.g. p1 responsible for ,red dot‘ positions) • Parity bits check overlapping parts of the data bits, computation and check are cheap, 
 but significant overhead in data • Minimum Hamming distance of three 
 (single bit correction) • Application in DRAM memory 14 takenfromWikimediaCommons
  • 15. Dependable Systems Course PT 2014 Hamming Code for 26bit word size 15 takenfromWikimediaCommons
  • 16. Dependable Systems Course PT 2014 Hamming Code • Hamming codes are known to produce 10% to 40% of data overhead • Syndrome determines which bit (if any) is corrupted • Example ECC • Majority of „one-off“ soft errors in DRAM happens from background radiation • Denser packages, lower voltage for higher frequencies • Most common approach is the SECDEC Hamming code • "Single error correction, double error detection" (SECDEC) • Hamming code with additional parity • Modern BIOS implementations perform 
 threshold-based warnings on frequent correctable errors 16 (C) Z. Kalbarczyk