Group photograph at Linaro Connect in Copenhagen
Monday 29 Oct 2012
LCE13, 10th July 2013
Robert Richter <robert.richter@l...
www.linaro.org
Agenda
● RAS Daemon
○ problem statement and alternatives
○ perf-based solution, recall the arch
○ reference...
www.linaro.org
RAS - Reliability, Availability and Serviceability
● Reliability: ensure correctness of data; detect, corre...
www.linaro.org
Hardware Error Handling
Hardware error handling is important for higher levels of RAS:
● Error logging
● Er...
www.linaro.org
RAS implementation - goals
● Provide a RAS framework in the kernel to collect hardware
errors from various ...
www.linaro.org
RAS Framework - Technical Overview
Thanks Al, see https://wiki.linaro.org/LEG/Engineering/Kernel/ACPI/RASan...
www.linaro.org
RAS Components - Kernel
● Persistent perf events
○ running in the background and detached from any process
...
www.linaro.org
RAS Components - Userland
● perf tools and libraries
○ perf syscall to access event buffers
○ Extend event ...
www.linaro.org
Persistent event patch set V2
● Posted in June, http://lkml.org/lkml/2013/6/11/394)
● Review by perf mainta...
www.linaro.org
Perf tools patches
● Patch set sent that modifies perf tools to use persistent
events (not yet upstream)
● ...
www.linaro.org
Raw data defined as tracepoint
TRACE_EVENT(mce_record,
TP_PROTO(struct mce *m),
TP_ARGS(m),
TP_STRUCT__entr...
www.linaro.org
RAS Prototype - Implementation
● Tracepoints collected with persistent events, recorded with
perf-record an...
www.linaro.org
RAS Prototype - Running
root@piledriver:~# unbuffer rasd/rasd.sh &
[1] 2553
root@piledriver:~# echo 1 > /sy...
www.linaro.org
RAS Prototype - Running (2)
...
--- mce_record: ----------------
raw_data: 09 00 00 00 01 00 70 00 64 00 00...
www.linaro.org
RAS Prototype - Running (3)
...
walltime = 0x0000000051d6f68d
cpu = 0x00000000
cpuid = 0x00600f20
apicid = ...
www.linaro.org
RAS - Next Steps
● Work on persistent event patch set V3
● Enable RAS framework on ARM
● Add memory control...
www.linaro.org
APEI and ACPI error injection
See: https://wiki.linaro.org/LEG/Engineering/Kernel/ACPI/EINJ for more detail...
www.linaro.org
APEI and ACPI error injection
See: https://wiki.linaro.org/LEG/Engineering/Kernel/ACPI/EINJ for more detail...
www.linaro.org
● APEI status
○ HEST
■ GHES is supported and tested
■ need to implement:
● NMI error notify source type equ...
www.linaro.org
APEI and ACPI error injection
NMI equivalent, candidates?
● FIQ - Fast Interrupt for ARMv7/ARMv8
○ route to...
Upcoming SlideShare
Loading in...5
×

LCE13: LEG - RAS Daemon and ACPI

841

Published on

Resource: LCE13
Name: LEG - RAS Daemon and ACPI
Date: 11-07-2013
Speaker: Robert Richter and Tomasz Nowicki
Video: https://www.youtube.com/watch?v=Ot2VBF-oerg&feature=c4-overview&list=UUIVqQKxCyQLJS6xvSmfndLA

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
841
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
14
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

LCE13: LEG - RAS Daemon and ACPI

  1. 1. Group photograph at Linaro Connect in Copenhagen Monday 29 Oct 2012 LCE13, 10th July 2013 Robert Richter <robert.richter@linaro.org> Tomasz Nowicki <tomasz.nowicki@linaro.org> RAS Daemon and ACPI
  2. 2. www.linaro.org Agenda ● RAS Daemon ○ problem statement and alternatives ○ perf-based solution, recall the arch ○ reference driver (non ACPI) ○ status, patchset ○ next steps ● APEI and ACPI error injection ○ how does it work? ■ error reporting mechanism ■ error injecting mechanism ○ status ○ NMI equivalent
  3. 3. www.linaro.org RAS - Reliability, Availability and Serviceability ● Reliability: ensure correctness of data; detect, correct (if possible) and isolate errors ● Availability: disable the malfunctioning component and continue to operate with reduced resources ● Serviceability: early detect faulty hardware and reduce time to replace it
  4. 4. www.linaro.org Hardware Error Handling Hardware error handling is important for higher levels of RAS: ● Error logging ● Error prediction ● Error recovery We focus on hardware error logging and reporting: Give data center operators a tool to examine hw errors in the system for further analysis and actions (identify, disable and replace hardware components)
  5. 5. www.linaro.org RAS implementation - goals ● Provide a RAS framework in the kernel to collect hardware errors from various sources and report them to userland for further processing. ● Reference implementation of a RAS daemon to enable data center operators to integrate it into their tools ● Use of perf_event_open syscall to access kernel event buffers ● Architecture independent, works on ARM and x86 (Collaborative work with Borislav Petkov from Suse) ● Upstream acceptance
  6. 6. www.linaro.org RAS Framework - Technical Overview Thanks Al, see https://wiki.linaro.org/LEG/Engineering/Kernel/ACPI/RASandACPI
  7. 7. www.linaro.org RAS Components - Kernel ● Persistent perf events ○ running in the background and detached from any process ○ buffers are accessed readonly ○ shared buffers, multiple users for one event ○ can be enabled by the kernel during early boot, no userland necessary to setup events ○ persistent pmu registering event dynamically in sysfs allows out- of-the-box event setup with perf tools ● Tracepoints ○ Raw data sample, allows to transfer data structures from kernel to userland ● Hardware drivers for error detection ○ Vendor specific ○ ACPI/APEI
  8. 8. www.linaro.org RAS Components - Userland ● perf tools and libraries ○ perf syscall to access event buffers ○ Extend event parser for sysfs support of persistent events ○ Move necessary perf functions into liblk ● RAS daemon ○ Reading event buffers for reporting, analysis and actions ○ Report to syslogd
  9. 9. www.linaro.org Persistent event patch set V2 ● Posted in June, http://lkml.org/lkml/2013/6/11/394) ● Review by perf maintainers Ingo and PeterZ on lkml complete ● They generally agree with the approach ● A couple of items need to be addressed for upstream acceptance, esp. to create persistent events on-the-fly: ○ ioctl functions to bind/unbind persistent events and processes ○ make all kind of events persistent, not only system-wide events ○ support all perf_event types ○ providing event information in the mmap header page? ○ see LEG-501 ● If we can address all the issues, the patches are *likely* to go upstream in the next merge window
  10. 10. www.linaro.org Perf tools patches ● Patch set sent that modifies perf tools to use persistent events (not yet upstream) ● Most important part: update event parser to be able to describe every event in sysfs, esp. flags (currently limited to config values of the event): /sys/bus/event_source/devices/persistent/events/mce_record:persistent,config=106 /sys/bus/event_source/devices/persistent/format/persistent:attr5:23 ● Persistent events run then out-of-the-box: # perf top -e persistent/mce_record/ # perf record -e perstent/mce_record/ ...
  11. 11. www.linaro.org Raw data defined as tracepoint TRACE_EVENT(mce_record, TP_PROTO(struct mce *m), TP_ARGS(m), TP_STRUCT__entry( __field( u64, mcgcap ) __field( u64, mcgstatus ) __field( u64, status ) __field( u64, addr ) __field( u64, misc ) __field( u64, ip ) __field( u64, tsc ) __field( u64, walltime ) __field( u32, cpu ) __field( u32, cpuid ) __field( u32, apicid ) __field( u32, socketid ) __field( u8, cs ) __field( u8, bank ) __field( u8, cpuvendor ) ), ...
  12. 12. www.linaro.org RAS Prototype - Implementation ● Tracepoints collected with persistent events, recorded with perf-record and processed with perf-script ● Daemon basically doing: # perf record -e persistent/mce_record/ cat /dev/null | perf script -s rasd.pl ● Perl script for event processing: # cat rasd.pl sub process_event { my ($event, $attr, $sample, $raw_data) = @_; pintf("--- mce_record: ----------------n"); print("raw_data: ", ( join '', map { sprintf("%02x ", $_); } unpack("C*", $event) ), "n"); # more decoding: ... }
  13. 13. www.linaro.org RAS Prototype - Running root@piledriver:~# unbuffer rasd/rasd.sh & [1] 2553 root@piledriver:~# echo 1 > /sys/kernel/debug/tracing/events/mce/mce_record/enable root@piledriver:~# ./rasd/inject.sh Message from syslogd@piledriver at Jul 5 18:38:37 ... kernel:[ 81.182322] [Hardware Error]: MC4 Error (node 0): DRAM ECC error detected on the NB. Message from syslogd@piledriver at Jul 5 18:38:37 ... kernel:[ 81.201063] [Hardware Error]: Error Status: Corrected error, no action required. Message from syslogd@piledriver at Jul 5 18:38:37 ... kernel:[ 81.209844] [Hardware Error]: CPU:0 (15:2:0) MC4_STATUS[Over|CE|MiscV|-|AddrV|-|- |CECC]: 0xdc68c0002b080813 Message from syslogd@piledriver at Jul 5 18:38:37 ... kernel:[ 81.223874] [Hardware Error]: MC4_ADDR: 0x000000042cd330a0 Message from syslogd@piledriver at Jul 5 18:38:37 ... kernel:[ 81.229363] [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout) --- mce_record: ---------------- raw_data: 09 00 00 00 01 00 70 00 64 00 00 00 49 00 14 00 00 00 00 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 13 08 08 2b 00 c0 68 dc a0 30 d3 2c 04 00 00 00 00 00 00 01 58 02 18 c0 00 00 ...
  14. 14. www.linaro.org RAS Prototype - Running (2) ... --- mce_record: ---------------- raw_data: 09 00 00 00 01 00 70 00 64 00 00 00 49 00 14 00 00 00 00 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 13 08 08 2b 00 c0 68 dc a0 30 d3 2c 04 00 00 00 00 00 00 01 58 02 18 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 8d f6 d6 51 00 00 00 00 00 00 00 00 20 0f 60 00 00 00 00 00 00 00 00 00 00 04 02 00 00 00 00 00 00 00 00 00 perf_event_type = 0x00000009 header_misc = 0x0001 size = 0x0070 raw_size = 0x00000064 trace_entry_type = 0x0049 flags = 0x14 preempt_count = 0x00 pid = 0x00000000 mcgcap = 0x0000000000000107 mcgstatus = 0x0000000000000000 status = 0xdc68c0002b080813 addr = 0x000000042cd330a0 misc = 0xc018025801000000 ip = 0x0000000000000000 tsc = 0x0000000000000000 walltime = 0x0000000051d6f68d cpu = 0x00000000 cpuid = 0x00600f20 ...
  15. 15. www.linaro.org RAS Prototype - Running (3) ... walltime = 0x0000000051d6f68d cpu = 0x00000000 cpuid = 0x00600f20 apicid = 0x00000000 socketid = 0x00000000 cs = 0x00 bank = 0x04 cpuvendor = 0x02 ^C root@piledriver:~# cat /sys/kernel/debug/tracing/trace # tracer: nop # # entries-in-buffer/entries-written: 1/1 #P:8 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] .Ns. 81.182316: mce_record: CPU: 0, MCGc/s: 107/0, MC4: dc68c0002b080813, ADDR/MISC: 000000042cd330a0/c018025801000000, RIP: 00:<0000000000000000>, TSC: 0, PROCESSOR: 2:600f20, TIME: 1373042317, SOCKET: 0, APIC: 0
  16. 16. www.linaro.org RAS - Next Steps ● Work on persistent event patch set V3 ● Enable RAS framework on ARM ● Add memory controller hardware driver ● Integrate ACPI/APEI ● Split perf tool code into liblk ● Implement RAS daemon
  17. 17. www.linaro.org APEI and ACPI error injection See: https://wiki.linaro.org/LEG/Engineering/Kernel/ACPI/EINJ for more details GHES reporting mechanism
  18. 18. www.linaro.org APEI and ACPI error injection See: https://wiki.linaro.org/LEG/Engineering/Kernel/ACPI/EINJ for more details Error injection mechanism
  19. 19. www.linaro.org ● APEI status ○ HEST ■ GHES is supported and tested ■ need to implement: ● NMI error notify source type equivalent since NMI is specific for x86 architecture, may require SoC-specific code ● record errors to region specified in BERT table within NMI context ● forward errors from GHES to RAS daemon as end point of ACPI path ○ ERST - not tested ○ BERT ■ not implemented yet ■ initial implementation exists but has not accepted yet ○ EINJ - ACPI error injection ■ Lack of mapping GPIO line (as error trigger source) to the appropriate AML method (GPIO-signaled ACPI events) ■ Hack driver to enable error injection from user space (AML method is called directly without hardware interaction) ● Difficulties with testing ACPI ○ ACPI part of UEFI is not available for ARMv7/v8 now APEI and ACPI error injection
  20. 20. www.linaro.org APEI and ACPI error injection NMI equivalent, candidates? ● FIQ - Fast Interrupt for ARMv7/ARMv8 ○ route to Hypervisor mode but still masked on Hypervisor mode ○ route to Secure Monitor but still masked on Secure Monitor mode ● Non-maskable FIQ (NMFI) ○ available for some ARMv7 implementation ○ still can be masked by interrupt entry ● SError/Abort ○ can not be triggered by external pin like FIQ ● Others?
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×