Presentation reliable NoC



Reliable Network On Chip

Published in: Technology, Business

  2. Why Fault Tolerance?
     • Offers many advantages:
       ◦ Avoids costly packet retransmissions
       ◦ Avoids catastrophic data loss
       ◦ Can increase chip yield
       ◦ Allows higher-speed operation
     • In NoCs specifically:
       ◦ Ensures success of the interconnect
       ◦ Grows in importance as technology scales
  3. Fault Classes
     • Transient faults (soft errors)
       ◦ Appear and disappear at random
       ◦ Caused by alpha particles, cosmic-ray-induced neutrons, etc.
     • Intermittent faults: appear only under certain conditions
       ◦ Occur repeatedly at the same location
       ◦ Tend to occur in bursts
       ◦ Replacing the faulty component removes the fault
     • Permanent faults (hard errors): always present, but may be masked
       ◦ Static (present from manufacture time): process variability (PV), manufacturing imperfections
       ◦ Dynamic (arising at run time): electromigration (EM), negative bias temperature instability (NBTI), oxide breakdown, stress-induced voiding (SIV), hot carrier injection (HCI), etc.
  4. Making NoCs Reliable: Current Methods
     • Timing-error (T-error) tolerant NoC design
     • Error control
       ◦ Error detection and correction codes
       ◦ Hop-by-hop (HBH) retransmission mechanism
     • Reliable task mapping
     • Fault-tolerant rerouting
  5. Timing-Error Tolerant NoC Design
  6. Error Correction and Detection
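The slide itself carries only a title, so the following is a minimal sketch of two common detection codes, not the presenter's scheme. It contrasts a single even-parity bit with a small CRC. The function names and the generator polynomial x^3 + x + 1 are assumptions chosen for illustration, and the check bits are assumed to arrive uncorrupted.

```python
def parity_bit(bits):
    """Even-parity bit over a list of 0/1 values."""
    p = 0
    for b in bits:
        p ^= b
    return p


def crc_remainder(bits, poly):
    """Remainder of bits(x) * x^n divided by poly(x) over GF(2).

    poly is a coefficient list, highest degree first, with poly[0] == 1;
    n = len(poly) - 1 is the number of check bits produced.
    """
    n = len(poly) - 1
    work = list(bits) + [0] * n
    for i in range(len(bits)):
        if work[i]:
            # Subtract (XOR) the generator aligned at position i.
            for j, p in enumerate(poly):
                work[i + j] ^= p
    return work[-n:]


def parity_detects(sent, received):
    """True if the parity check notices that received differs from sent."""
    return parity_bit(sent) != parity_bit(received)


def crc_detects(sent, received, poly):
    """True if the CRC check notices that received differs from sent."""
    return crc_remainder(sent, poly) != crc_remainder(received, poly)


def flip(bits, positions):
    """Return a copy of bits with the given positions inverted."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(bits)]
```

A single parity bit catches any odd number of flipped bits but misses every even number; a CRC catches more error patterns at the cost of more check bits and logic. For example, with the payload `[1, 0, 1, 1, 0, 0, 1, 0]` and bits 2 and 5 flipped, `parity_detects` returns `False` while `crc_detects` with `poly=[1, 0, 1, 1]` returns `True`. This short generator does not catch every double-bit error.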
  7. Power Consumption Analysis
     Figure: power consumption of the schemes
  8. Power Consumption Observations
     • The ee-par scheme consumes more power than the ee-crc and hybrid schemes.
     • The flit-based scheme consumes more power because the useful bits decrease as the number of flits per packet increases.
     • Packet buffer requirements affect power consumption; hence, as the number of hops increases, the power overhead of the ss-flit scheme increases.
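One way to read the second observation is that a per-flit code spends check bits on every flit, while a per-packet code spends them once. The arithmetic below is a rough sketch under that assumption. The 32-bit flit width and 8 check bits are illustrative values, not figures from the slides, and the function names are hypothetical.

```python
def check_bits_per_packet(flits_per_packet, check_bits, per_flit):
    """Check bits carried by one packet.

    per_flit=True models a code attached to every flit;
    per_flit=False models a single code attached to the whole packet.
    """
    return flits_per_packet * check_bits if per_flit else check_bits


def useful_fraction(flits_per_packet, flit_bits, check_bits, per_flit):
    """Fraction of transmitted bits that carry payload rather than check bits."""
    total = flits_per_packet * flit_bits
    overhead = check_bits_per_packet(flits_per_packet, check_bits, per_flit)
    return (total - overhead) / total
```

With these assumed numbers, a 4-flit packet gives `useful_fraction(4, 32, 8, True) == 0.75` for the per-flit code and `useful_fraction(4, 32, 8, False) == 0.9375` for the per-packet code. The per-flit overhead grows linearly with packet length, whereas the per-packet overhead is amortized.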
  9. HBH Retransmission Scheme
     Advantages:
     • Avoids deadlock
     • Eliminates the need to provide an escape channel to the destination node
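The slide does not describe the mechanism, so here is a minimal sketch of the general hop-by-hop idea: the sending router keeps a copy of each flit until the next router acknowledges it, and resends on a negative acknowledgement. The link model, the parity check, and all names below are assumptions for illustration only.

```python
def parity(bits):
    """Even-parity bit over a list of 0/1 values."""
    return sum(bits) % 2


def send_over_link(flit, corrupt):
    """Hypothetical link model: flips the first payload bit when corrupt is True."""
    payload, check = flit
    if corrupt:
        payload = [payload[0] ^ 1] + payload[1:]
    return payload, check


def hop_transfer(payload, fault_pattern, max_tries=4):
    """Move one flit across one hop, retransmitting on detected errors.

    fault_pattern[i] says whether attempt i+1 is corrupted (scripted so the
    example is deterministic). Returns (delivered_payload, attempts_used).
    """
    # The sender holds this copy in its retransmission buffer until an ACK.
    buffered = (list(payload), parity(payload))
    for attempt in range(1, max_tries + 1):
        corrupt = fault_pattern[attempt - 1] if attempt <= len(fault_pattern) else False
        rx_payload, rx_check = send_over_link(buffered, corrupt)
        if parity(rx_payload) == rx_check:
            return rx_payload, attempt  # receiver ACKs; sender may free the buffer
        # Receiver NACKs; the loop resends the buffered copy.
    raise RuntimeError("link appears permanently faulty")
```

For instance, `hop_transfer([1, 0, 1, 1], [True, False])` returns `([1, 0, 1, 1], 2)`: the first attempt is corrupted and rejected, and the second succeeds. Because recovery stays local to one link, a transient fault costs one extra hop traversal and not a full end-to-end resend.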
  10. Reliability-Aware Task Mapping
  11. Fault-Tolerant Route Generation
      Switch design to support multipath routing with in-order packet delivery
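The slides do not say how the routes are generated. As a hedged illustration of rerouting around faults, the sketch below runs a breadth-first search on a 2D mesh and skips links marked faulty. The mesh model, the `route` function, and its parameters are assumptions, not the switch design the slide refers to.

```python
from collections import deque


def route(width, height, src, dst, faulty_links):
    """Shortest fault-free path on a width x height mesh, or None.

    Nodes are (x, y) tuples; faulty_links is an iterable of node pairs
    whose connecting link must not be used in either direction.
    """
    faulty = {frozenset(link) for link in faulty_links}
    prev = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:   # walk predecessors back to src
                path.append(node)
                node = prev[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in prev and frozenset((node, nxt)) not in faulty:
                prev[nxt] = node
                queue.append(nxt)
    return None                 # destination unreachable
```

On a fault-free 3x3 mesh, `route(3, 3, (0, 0), (2, 0), [])` returns `[(0, 0), (1, 0), (2, 0)]`. Marking the link between `(0, 0)` and `(1, 0)` as faulty forces a detour through the second row. A real router would also need a deadlock-avoidance rule and, for multipath routing, a way to restore packet order at the destination; neither is modeled here.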
  12. Resilience Against NBTI
      Figure: adaptive router architecture
  13. ROBUST: Self-Healing Router
      • Universal Logic Block (ULB)
      • Crossbar protection using multiple ULBs
      • Advantages: a higher silicon protection factor and a higher reliability improvement factor
  14. Future Challenges
      ◦ All the schemes presented for improving NoC reliability carry a power overhead. The added power dissipation can reduce the mean time to failure (MTTF).
      ◦ All the techniques should therefore be thermal-aware to prevent this effect.
      ◦ Instead of evenly wearing out all cores in MPSoCs, a method should be designed to self-heal failed cores.
      ◦ Most error-resilient schemes today focus primarily on making routers and links fault tolerant. There should also be some focus on making memories more reliable.
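The link between power, temperature, and MTTF in the first point can be made concrete with a standard wear-out model. The slides do not name one; Black's equation for electromigration (one of the dynamic faults listed on slide 3) is used here purely as an illustration:

```latex
\mathrm{MTTF} = \frac{A}{J^{\,n}} \exp\!\left(\frac{E_a}{k\,T}\right)
```

Here $A$ is a technology-dependent constant, $J$ is the current density, $n$ is an empirical exponent, $E_a$ is the activation energy, $k$ is Boltzmann's constant, and $T$ is the absolute temperature. Extra power dissipation raises $T$, which shrinks the exponential term and hence the MTTF.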
  15. Conclusion
      • The ideas presented in this paper make the NoC architecture resilient to permanent and intermittent errors.
      • Several techniques are employed to improve reliability: a T-error tolerant mechanism, a self-healing router architecture, reliability-driven task mapping, a deadlock recovery mechanism, and error detection and correction schemes.
      • Several techniques rely on redundant hardware components. The area cost is acceptable because, with "dark silicon", it is impossible to turn on every component on the die anyway.
      • However, most techniques increase the power consumption of the NoC architecture, which is their main drawback.
      • Designing systems to be resilient to errors is crucial to exploiting the advantages of networks-on-chip.