Engineering Cross-Layer Fault Tolerance in Many-Core Systems
PhD Student: Rem Gensh
Supervisor: Professor Alexander Romanovsky
Thesis Committee: Professor Alex Yakovlev, Dr Alexei Iliasov
Definitions
• Fault tolerance is a means of achieving dependability: it allows
the system to avoid failure in the presence of faults.
• Layer: an abstraction layer or system component.
• A cross-layer approach (or design) is used when it is more efficient
to distribute a task between several layers than to execute
it at only one layer.
• Many-core systems contain tens, hundreds or
thousands of cores (multi-core systems have 2-8 cores).
Introduction
• Systems’ complexity and abstraction
• TCP/IP as a motivating example
• Many-core systems
• Layered fault tolerance
• Cross-layer fault tolerance
Systems’ Complexity and Abstraction
• Abstraction simplifies the understanding of the system structure
• Layers of the computer system
• OSI model
• TCP/IP (Internet protocol suite)
• Object-oriented programming
• Components of the system are considered as black boxes
• Each component should provide a predefined service according to
its interface
TCP/IP as a motivating example
TCP/IP cross-layer fault tolerance
• All layers participate in error
detection and error recovery
• Error detection and recovery are
performed through the cooperation
of several layers
• If an error is not detected at a
lower layer, it can be detected and
recovered from at a higher layer
• Efficiency and flexibility of TCP/IP
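The cooperation between layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behaviour of a real network stack: the link layer is modelled by a CRC-32 check (zlib.crc32), the transport layer by a 16-bit ones' complement checksum of the kind TCP uses, and the helper names (internet_checksum, link_check, transport_check, flip_bit) are hypothetical. The second case shows a fault that appears after the link check has already passed and is still caught one layer up.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def link_check(payload: bytes, crc: int) -> bool:
    """Link-layer style check: CRC-32 over the frame payload."""
    return zlib.crc32(payload) == crc


def transport_check(payload: bytes, checksum: int) -> bool:
    """Transport-layer style check: 16-bit checksum over the segment."""
    return internet_checksum(payload) == checksum


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with one bit inverted (simulated fault)."""
    corrupted = bytearray(data)
    corrupted[bit // 8] ^= 1 << (bit % 8)
    return bytes(corrupted)


message = b"cross-layer fault tolerance"
crc = zlib.crc32(message)
checksum = internet_checksum(message)

# Case 1: the fault hits the wire, so the link layer sees it first.
on_wire = flip_bit(message, 5)
caught_at_link = not link_check(on_wire, crc)

# Case 2: the frame passes the link check intact, then a fault corrupts
# the buffer before the transport layer reads it.
passed_link = link_check(message, crc)
in_buffer = flip_bit(message, 5)
caught_at_transport = not transport_check(in_buffer, checksum)

print(caught_at_link, passed_link, caught_at_transport)
```

Neither check is sufficient on its own in this model: the link check cannot see faults that occur after it has run, which is why the higher layer repeats detection with its own code.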
Layer        | Error detection        | Error recovery
Application  | Status codes           | Retransmission or custom recovery
Transport    | 16-bit checksum        | TCP: acknowledgements, negative acknowledgements, ARQ, sequence numbers
Internet     | 16-bit header checksum | Discard corrupted packet
Link         | CRC-32                 | Discard corrupted packet
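The split between "discard" at the lower layers and "retransmit" at the transport layer can be sketched as a stop-and-wait ARQ loop. This is a simplified model, not TCP itself: LossyLink, make_frame and send_reliably are hypothetical names, corruption is injected at fixed send indices so the run is deterministic, a missing acknowledgement stands in for a timeout, and acknowledgements are assumed never to be lost.

```python
import zlib
from typing import List, Optional, Tuple

Frame = Tuple[int, bytes, int]  # (sequence number, payload, CRC-32 of payload)


def make_frame(seq: int, payload: bytes) -> Frame:
    return seq, payload, zlib.crc32(payload)


class LossyLink:
    """Hypothetical link that corrupts the frames sent at the given indices."""

    def __init__(self, corrupt_sends: set):
        self.corrupt_sends = corrupt_sends
        self.sends = 0

    def transmit(self, frame: Frame) -> Optional[Frame]:
        seq, payload, crc = frame
        index = self.sends
        self.sends += 1
        if index in self.corrupt_sends:
            payload = bytes([payload[0] ^ 0x01]) + payload[1:]  # simulated bit flip
        # Link-layer behaviour: verify the CRC and silently discard on mismatch.
        if zlib.crc32(payload) != crc:
            return None
        return seq, payload, crc


def send_reliably(messages: List[bytes], link: LossyLink, max_retries: int = 3) -> List[bytes]:
    """Stop-and-wait ARQ: resend each frame until it is acknowledged."""
    delivered = []
    expected_seq = 0
    for seq, payload in enumerate(messages):
        for _attempt in range(1 + max_retries):
            received = link.transmit(make_frame(seq, payload))
            if received is None:
                continue  # no acknowledgement arrived (modelled as a timeout)
            # Acknowledgements are never lost in this model, so duplicates
            # cannot occur; the check only shows where sequence numbers fit.
            if received[0] == expected_seq:
                delivered.append(received[1])
                expected_seq += 1
            break  # frame acknowledged, move on to the next one
        else:
            raise RuntimeError(f"frame {seq} not delivered after retries")
    return delivered


messages = [b"AB12 CDE", b"XY34 ZZZ"]
link = LossyLink(corrupt_sends={0, 2})  # first attempt of each frame is corrupted
delivered = send_reliably(messages, link)
print(delivered == messages, link.sends)
```

The link layer never repairs anything here; it only drops bad frames, and recovery happens one layer up through retransmission.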
Many-core systems
• 10, 100 or even 1000 cores
• Heterogeneous architectures
• Redundant cores for ensuring fault tolerance
• Performance, energy efficiency and reliability are very important
factors for many-core systems
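One common way to use redundant cores is to run the same task several times and vote on the result. The sketch below assumes triple redundancy with majority voting; the slides do not name a specific scheme, so this is an illustrative choice. The replicas are plain Python callables standing in for cores and run one after another, and run_redundantly, healthy_core and faulty_core are hypothetical names.

```python
from collections import Counter


def run_redundantly(task, replicas, argument):
    """Run the same task on every replica and return the majority result."""
    results = [replica(task, argument) for replica in replicas]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= len(results) // 2:
        raise RuntimeError("no majority: too many replicas disagree")
    return value


def healthy_core(task, argument):
    return task(argument)


def faulty_core(task, argument):
    return task(argument) + 1  # simulated wrong result from a faulty core


def square(x):
    return x * x


result = run_redundantly(square, [healthy_core, faulty_core, healthy_core], 7)
print(result)  # 49
```

The cost is visible in the sketch: every task runs three times, which is the kind of reliability versus energy and performance trade-off the last bullet points to.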
Layered fault tolerance
• Faults can occur at different layers of the system stack
• Most errors are handled at the layer where they are
detected
• Convenient for the developer
• Convenience takes priority over system efficiency
Layered fault tolerance
• System layers are considered separately
• Unnecessary error corrections are possible
• An upper layer cannot specify the quality
of service it requires from the layer below
• Not optimal in terms of performance and
energy consumption
Cross-Layer Fault Tolerance
• Fault tolerance will be distributed across
the system stack
• Useful information about the system state
will be shared among the layers
• Various application domains
• Upper layers will be able to
specify their current needs and required
service level
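One possible shape for such an interface is sketched below. It is a hypothetical design, not the mechanism proposed in this work: ServiceLevel, ReadResult and StorageLayer are invented names, and the lower layer is a toy storage component that checks its data with CRC-32. The upper layer passes the service level it needs with each request, so the lower layer spends effort on detection and recovery only when asked to.

```python
import zlib
from dataclasses import dataclass
from enum import Enum


class ServiceLevel(Enum):
    BEST_EFFORT = "best_effort"  # no checking: cheapest, errors may pass through
    DETECT = "detect"            # check and report errors to the layer above
    CORRECT = "correct"          # check and recover inside the lower layer


@dataclass
class ReadResult:
    data: bytes
    error_detected: bool
    retries: int


class StorageLayer:
    """Hypothetical lower layer whose protection effort is set from above."""

    def __init__(self, data: bytes, faulty_reads: int = 0):
        self._data = data
        self._crc = zlib.crc32(data)
        self._faulty_reads = faulty_reads  # how many initial reads return corrupted data

    def _raw_read(self) -> bytes:
        if self._faulty_reads > 0:
            self._faulty_reads -= 1
            return bytes([self._data[0] ^ 0xFF]) + self._data[1:]  # simulated fault
        return self._data

    def read(self, level: ServiceLevel, max_retries: int = 3) -> ReadResult:
        data = self._raw_read()
        if level is ServiceLevel.BEST_EFFORT:
            return ReadResult(data, error_detected=False, retries=0)
        bad = zlib.crc32(data) != self._crc
        if level is ServiceLevel.DETECT or not bad:
            return ReadResult(data, error_detected=bad, retries=0)
        # CORRECT: recover locally by re-reading until the check passes.
        for retry in range(1, max_retries + 1):
            data = self._raw_read()
            if zlib.crc32(data) == self._crc:
                return ReadResult(data, error_detected=True, retries=retry)
        raise IOError("unrecoverable read error")


# An error-tolerant consumer skips recovery; a strict one pays for it.
tolerant = StorageLayer(b"video frame", faulty_reads=1).read(ServiceLevel.BEST_EFFORT)
strict = StorageLayer(b"bank record", faulty_reads=1).read(ServiceLevel.CORRECT)
print(tolerant.retries, strict.retries, strict.data)
```

With a fixed, layer-local policy the storage layer would have retried in both cases, including the one where the consumer could have tolerated the error.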
Cross-layer design for wireless sensor networks
• A single-layer approach cannot share important information among different
layers
• No layer has complete information, so optimal operation of the
entire network cannot be guaranteed
• A single-layer approach cannot adapt to
environmental change
L. Carnevali, L. Ridi, E. Vicario, "Stochastic Fault Trees for cross-layer power management of WSN monitoring systems," IEEE Conference on Emerging Technologies & Factory Automation, pp. 1-8, 2009.
P. Rachelin Sujae, M. Vigneshpandi, "A Cross Layer Fault Tolerant Communication Architecture for Wireless Sensor Networks," Middle-East Journal of Scientific Research, pp. 1292-1296, 2014.
Y. Wang, H. Wu, F. Lin, N.F. Tzeng, "Cross-Layer Protocol Design and Optimization for Delay/Fault-Tolerant Mobile Sensor Networks (DFT-MSN’s)," IEEE Journal on Selected Areas in Communications, vol. 26, no. 5, pp. 809-819, 2008.
Challenges
• Investigate the trade-off between reliability, performance and
energy consumption in many-core systems
• Ensure cross-layer fault tolerance for many-core systems
• Demonstrate that applying cross-layer fault tolerance can
improve performance and energy efficiency
Plan
• Implement a case study to gain experience in developing
cross-layer fault tolerance
• Apply Order Graphs to model cross-layer fault tolerance, power
consumption and performance of many-core systems
• Design novel mechanisms, libraries and patterns that will help in
engineering cross-layer fault tolerance of many-core systems
Case study: Car number plate recognition application
• Several character recognition algorithms
• Possibility to specify the operational mode: reliability,
performance, energy efficiency or certain trade-offs between
these parameters
• Recover from two types of errors:
• CPU core error
• Insufficient quality of service
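The control flow of such an application might look like the sketch below. Everything in it is an assumption made for illustration and none of it comes from the actual case study: the recognisers are stubs that upper-case a string standing in for an image, the cost, time and accuracy figures are invented, a core fault is simulated by raising CoreError on core 0, and insufficient quality of service is modelled as a confidence value below a threshold.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class CoreError(Exception):
    """Raised when the (simulated) core running a task fails."""


@dataclass
class Recognizer:
    name: str
    energy_cost: float  # hypothetical relative figures, not measurements
    run_time: float
    accuracy: float
    run: Callable[[str, int], Tuple[str, float]]  # (image, core) -> (text, confidence)


def fast_ocr(image: str, core: int) -> Tuple[str, float]:
    if core == 0:
        raise CoreError("core 0 failed")  # simulated CPU core error
    return image.upper(), 0.60


def accurate_ocr(image: str, core: int) -> Tuple[str, float]:
    return image.upper(), 0.95


# The operational mode decides which algorithm is tried first.
MODE_KEYS = {
    "energy": lambda r: r.energy_cost,
    "performance": lambda r: r.run_time,
    "reliability": lambda r: -r.accuracy,
}


def recognize_plate(image, recognizers, mode, cores, min_confidence):
    """Try recognisers in mode order; recover from core errors and low confidence."""
    log = []
    for recognizer in sorted(recognizers, key=MODE_KEYS[mode]):
        for core in cores:
            try:
                text, confidence = recognizer.run(image, core)
            except CoreError:
                log.append(f"{recognizer.name}: core {core} failed, retrying elsewhere")
                continue  # core error: re-run the same algorithm on another core
            if confidence >= min_confidence:
                return text, log
            log.append(f"{recognizer.name}: confidence {confidence:.2f} too low")
            break  # insufficient quality of service: escalate to the next algorithm
    raise RuntimeError("no recogniser met the required quality of service")


recognizers = [
    Recognizer("accurate", 5.0, 4.0, 0.95, accurate_ocr),
    Recognizer("fast", 1.0, 1.0, 0.60, fast_ocr),
]
text, log = recognize_plate("ab12 cde", recognizers, "energy", cores=[0, 1], min_confidence=0.9)
print(text)
for entry in log:
    print(entry)
```

In "energy" mode the cheap algorithm is tried first, survives the core fault by moving to another core, and is then abandoned for the more accurate one because its confidence is too low. In "reliability" mode the accurate algorithm runs first and neither recovery path is needed.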
Conclusion
• Systems’ complexity and abstraction
• Layered fault tolerance
• Cross-layer fault tolerance
Recently uploaded (20)

Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 

Engineering Cross-Layer Fault Tolerance in Many-Core Systems

  • 1. Engineering Cross-Layer Fault-Tolerance in Many-Core Systems
      PhD Student: Rem Gensh
      Supervisor: Professor Alexander Romanovsky
      Thesis Committee: Professor Alex Yakovlev, Dr Alexei Iliasov
  • 2. Definitions
      • Fault tolerance is a means of achieving dependability: it prevents system failure in the presence of faults.
      • Layer: an abstraction layer or system component.
      • A cross-layer approach (or design) is used when it is more efficient to distribute a task between several layers than to execute it at only one layer.
      • Many-core systems contain tens, hundreds or thousands of cores (multi-core systems have 2-8 cores).
  • 3. Introduction
      • Systems’ complexity and abstraction
      • TCP/IP as a motivating example
      • Many-core systems
      • Layered fault tolerance
      • Cross-layer fault tolerance
  • 4. Systems’ Complexity and Abstraction
      • Abstraction simplifies the understanding of the system structure
      • Layers of the computer system
      • OSI model
      • TCP/IP (Internet protocol suite)
      • Object-oriented programming
      • Components of the system are treated as black boxes
      • Each component should provide a predefined service according to its interface
  • 5. TCP/IP as a motivating example
  • 6. TCP/IP cross-layer fault tolerance
      • All layers participate in error detection and error recovery
      • Error detection and recovery are performed by the cooperative activities of several layers
      • If an error is not detected at a lower layer, it will be detected and recovered at a higher layer
      • Efficiency and flexibility of TCP/IP

      Layer        | Error detection  | Error recovery
      Application  | Status codes     | Retransmission or custom recovery
      Transport    | 16-bit checksum  | TCP: ack., neg. ack., ARQ, seq. number
      Internet     | 16-bit checksum  | Discard corrupted packet
      Link         | CRC-32           | Discard corrupted packet
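
The cooperation between layers described on this slide can be illustrated with a small sketch. It assumes a simplified two-layer stack: a link layer that appends a CRC-32 trailer (via Python's `zlib.crc32`) and a transport layer that prepends the 16-bit ones' complement Internet checksum carried by TCP and IPv4 (RFC 1071). The function names (`link_frame`, `transport_segment`, `receive`) and the `corrupt_after_link` switch are hypothetical and exist only for this illustration; real protocol headers are far richer.

```python
import zlib
from typing import Optional, Tuple


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def transport_segment(data: bytes) -> bytes:
    """Hypothetical transport layer: 2-byte checksum followed by the data."""
    return internet_checksum(data).to_bytes(2, "big") + data


def transport_check(segment: bytes) -> Optional[bytes]:
    stored, data = int.from_bytes(segment[:2], "big"), segment[2:]
    return data if internet_checksum(data) == stored else None


def link_frame(payload: bytes) -> bytes:
    """Hypothetical link layer: payload followed by a CRC-32 trailer."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def link_check(frame: bytes) -> Optional[bytes]:
    payload, trailer = frame[:-4], frame[-4:]
    return payload if zlib.crc32(payload).to_bytes(4, "big") == trailer else None


def receive(frame: bytes, corrupt_after_link: bool = False) -> Tuple[str, Optional[bytes]]:
    """Pass a frame up the stack; each layer discards what it finds corrupted."""
    segment = link_check(frame)
    if segment is None:
        return ("dropped at link", None)
    if corrupt_after_link:
        # Simulate a fault that the link layer cannot see, for example a bit
        # flip in memory after the frame check has already passed.
        damaged = bytearray(segment)
        damaged[-1] ^= 0x01
        segment = bytes(damaged)
    data = transport_check(segment)
    if data is None:
        return ("dropped at transport", None)  # the sender would retransmit
    return ("delivered", data)
```

A frame damaged on the wire is discarded by the link check, while damage introduced after that check is still caught by the transport checksum, which is the "detected at a higher layer" case from the slide.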
  • 7. Many-core systems
      • 10, 100 or even 1000 cores
      • Heterogeneous architectures
      • Redundant cores for ensuring fault tolerance
      • Performance, energy efficiency and reliability are very important factors for many-core systems
  • 8. Layered fault tolerance
      • Faults can occur at different layers of the system stack
      • Most errors are handled at the layer where they are detected
      • Convenient for the developer
      • Convenience takes precedence over system efficiency
  • 9. Layered fault tolerance
      • System layers are considered separately
      • Unnecessary error corrections are possible
      • An upper layer cannot specify the quality of service it requires from the layer below
      • Not optimal in terms of performance and energy consumption
  • 10. Cross-Layer Fault Tolerance
      • Fault tolerance will be distributed across the system stack
      • Useful information about the system state will be shared among the layers
      • Various application domains
      • Upper layers will be able to specify their current needs and required service level
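
One way to picture this exchange, with upper layers stating their needs and lower layers sharing state, is the sketch below. The slides do not specify a mechanism, so everything here is an assumption made for illustration: the class names (`ServiceLevel`, `ExecutionLayer`, `ApplicationLayer`), the use of replicated execution with majority voting, and the simple policy of raising redundancy once the lower layer has reported a mismatch.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class ServiceLevel:
    replicas: int  # number of redundant executions the upper layer asks for


class ExecutionLayer:
    """Hypothetical lower layer: runs a task redundantly and votes on the result."""

    def __init__(self) -> None:
        self.mismatches = 0  # state shared upward with the application layer

    def run(self, task: Callable[[int], int], arg: int, level: ServiceLevel) -> int:
        results = [task(arg) for _ in range(level.replicas)]
        value, votes = Counter(results).most_common(1)[0]
        if votes < len(results):
            self.mismatches += 1  # at least one replica disagreed
        return value


class ApplicationLayer:
    """Hypothetical upper layer: picks a service level from its own needs
    and from the state reported by the layer below."""

    def __init__(self, lower: ExecutionLayer) -> None:
        self.lower = lower

    def choose_level(self, critical: bool) -> ServiceLevel:
        if critical or self.lower.mismatches > 0:
            return ServiceLevel(replicas=3)  # pay for redundancy
        return ServiceLevel(replicas=1)      # save time and energy

    def compute(self, task: Callable[[int], int], arg: int, critical: bool) -> int:
        return self.lower.run(task, arg, self.choose_level(critical))


def make_flaky(fail_on_call: int) -> Callable[[int], int]:
    """Test helper: squares its input but returns -1 on one chosen call."""
    calls = {"n": 0}

    def task(x: int) -> int:
        calls["n"] += 1
        return -1 if calls["n"] == fail_on_call else x * x

    return task
```

Non-critical work runs once until the lower layer reports trouble; after that, the application asks for triple execution even for non-critical tasks. A layered design without this feedback would have to fix the redundancy level in advance.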
  • 11. Cross-layer design for wireless sensor networks
      • A single-layer approach cannot share important information among different layers
      • No layer has complete information, so optimal operation of the entire network cannot be guaranteed
      • A single-layer approach cannot adapt to environmental change
      L. Carnevali, L. Ridi, E. Vicario, "Stochastic Fault Trees for cross-layer power management of WSN monitoring systems," IEEE Conference on Emerging Technologies & Factory Automation, pp. 1-8, 2009.
      P. Rachelin Sujae, M. Vigneshpandi, "A Cross Layer Fault Tolerant Communication Architecture for Wireless Sensor Networks," Middle-East Journal of Scientific Research, pp. 1292-1296, 2014.
      Y. Wang, H. Wu, F. Lin, N.F. Tzeng, "Cross-Layer Protocol Design and Optimization for Delay/Fault-Tolerant Mobile Sensor Networks (DFT-MSN’s)," IEEE Journal on Selected Areas in Communications, vol. 26, no. 5, pp. 809-819, 2008.
  • 12. Challenges
      • Investigate the trade-off between reliability, performance and energy consumption in many-core systems
      • Ensure cross-layer fault tolerance for many-core systems
      • Demonstrate that applying cross-layer fault tolerance can improve performance and energy efficiency
  • 13. Plan
      • Implement a case study to gain experience in developing cross-layer fault tolerance
      • Apply Order Graphs to model cross-layer fault tolerance, power consumption and performance of many-core systems
      • Design novel mechanisms, libraries and patterns that will help in engineering cross-layer fault tolerance of many-core systems
  • 14. Case study: Car number plate recognition application
      • Several character recognition algorithms
      • Possibility to specify the operational mode: reliability, performance, energy efficiency or certain trade-offs between these parameters
      • Recover from two types of errors:
          • CPU core error
          • Insufficient Quality of Service
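
The recovery behaviour outlined for the case study could be sketched roughly as follows. This is not the author's implementation: the `Recognizer` record, the `CoreError` exception, the mode names, the relative `energy_cost` figures and the confidence threshold are all hypothetical, and a "retry on another core" is simulated by simply calling the recognizer again. The sketch also assumes that a costlier recognizer is the more accurate one.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


class CoreError(Exception):
    """Simulated failure of the CPU core that was running a recognizer."""


@dataclass
class Recognizer:
    name: str
    energy_cost: float  # hypothetical relative cost
    run: Callable[[str], Tuple[str, float]]  # image id -> (text, confidence)


def recognize(
    image: str,
    recognizers: List[Recognizer],
    mode: str = "energy",           # assumed modes: "energy" or "reliability"
    min_confidence: float = 0.8,    # assumed Quality of Service threshold
    max_retries: int = 1,
) -> Tuple[Optional[str], List[str]]:
    # "energy" tries the cheapest algorithm first; "reliability" starts with
    # the costliest one, assumed here to be the most accurate.
    ordered = sorted(
        recognizers, key=lambda r: r.energy_cost, reverse=(mode == "reliability")
    )
    log: List[str] = []
    for rec in ordered:
        for _attempt in range(max_retries + 1):
            try:
                text, confidence = rec.run(image)
            except CoreError:
                # Error type 1: CPU core error. Retry the same algorithm
                # (in a real system, on a different core).
                log.append(f"{rec.name}: core error, retrying")
                continue
            if confidence >= min_confidence:
                log.append(f"{rec.name}: accepted")
                return text, log
            # Error type 2: insufficient QoS. Move on to the next algorithm.
            log.append(f"{rec.name}: insufficient QoS")
            break
    return None, log
```

Because the operational mode only changes the order in which algorithms are tried, the same recovery loop serves both the energy-saving and the reliability-first configuration.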
  • 15. Conclusion
      • Systems’ complexity and abstraction
      • Layered fault tolerance
      • Cross-layer fault tolerance