forked from torvalds/linux
PCI/AER: Add sysfs attributes to provide AER stats and breakdown
Add sysfs attributes to provide the total and a breakdown of the AER errors seen, split into the different types of correctable, fatal and nonfatal errors:

/sys/bus/pci/devices/<dev>/aer_dev_correctable
/sys/bus/pci/devices/<dev>/aer_dev_fatal
/sys/bus/pci/devices/<dev>/aer_dev_nonfatal

Signed-off-by: Rajat Jain <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
1 parent db89ccb, commit 81aa520
Showing 5 changed files with 197 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
========================== | ||
PCIe Device AER statistics | ||
========================== | ||
These attributes show up under all the devices that are AER capable. These | ||
statistical counters indicate the errors "as seen/reported by the device". | ||
Note that this may mean that if an endpoint is causing problems, the AER | ||
counters may increment at its link partner (e.g. root port) because the | ||
errors may be "seen" / reported by the link partner and not the | ||
problematic endpoint itself (which may report all counters as 0 as it never | ||
saw any problems). | ||
|
||
Where: /sys/bus/pci/devices/<dev>/aer_dev_correctable | ||
Date: July 2018 | ||
Kernel Version: 4.19.0 | ||
Contact: [email protected], [email protected] | ||
Description: List of correctable errors seen and reported by this | ||
PCI device using ERR_COR. Note that since multiple errors may | ||
be reported using a single ERR_COR message, | ||
TOTAL_ERR_COR at the end of the file may not match the actual | ||
total of all the errors in the file. Sample output: | ||
------------------------------------------------------------------------- | ||
localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable | ||
Receiver Error 2 | ||
Bad TLP 0 | ||
Bad DLLP 0 | ||
REPLAY_NUM Rollover 0 | ||
Replay Timer Timeout 0 | ||
Advisory Non-Fatal 0 | ||
Corrected Internal Error 0 | ||
Header Log Overflow 0 | ||
TOTAL_ERR_COR 2 | ||
------------------------------------------------------------------------- | ||
|
||
Where: /sys/bus/pci/devices/<dev>/aer_dev_fatal | ||
Date: July 2018 | ||
Kernel Version: 4.19.0 | ||
Contact: [email protected], [email protected] | ||
Description: List of uncorrectable fatal errors seen and reported by this | ||
PCI device using ERR_FATAL. Note that since multiple errors may | ||
be reported using a single ERR_FATAL message, | ||
TOTAL_ERR_FATAL at the end of the file may not match the actual | ||
total of all the errors in the file. Sample output: | ||
------------------------------------------------------------------------- | ||
localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal | ||
Undefined 0 | ||
Data Link Protocol 0 | ||
Surprise Down Error 0 | ||
Poisoned TLP 0 | ||
Flow Control Protocol 0 | ||
Completion Timeout 0 | ||
Completer Abort 0 | ||
Unexpected Completion 0 | ||
Receiver Overflow 0 | ||
Malformed TLP 0 | ||
ECRC 0 | ||
Unsupported Request 0 | ||
ACS Violation 0 | ||
Uncorrectable Internal Error 0 | ||
MC Blocked TLP 0 | ||
AtomicOp Egress Blocked 0 | ||
TLP Prefix Blocked Error 0 | ||
TOTAL_ERR_FATAL 0 | ||
------------------------------------------------------------------------- | ||
|
||
Where: /sys/bus/pci/devices/<dev>/aer_dev_nonfatal | ||
Date: July 2018 | ||
Kernel Version: 4.19.0 | ||
Contact: [email protected], [email protected] | ||
Description: List of uncorrectable nonfatal errors seen and reported by this | ||
PCI device using ERR_NONFATAL. Note that since multiple errors | ||
may be reported using a single ERR_NONFATAL message, | ||
TOTAL_ERR_NONFATAL at the end of the file may not match the | ||
actual total of all the errors in the file. Sample output: | ||
------------------------------------------------------------------------- | ||
localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal | ||
Undefined 0 | ||
Data Link Protocol 0 | ||
Surprise Down Error 0 | ||
Poisoned TLP 0 | ||
Flow Control Protocol 0 | ||
Completion Timeout 0 | ||
Completer Abort 0 | ||
Unexpected Completion 0 | ||
Receiver Overflow 0 | ||
Malformed TLP 0 | ||
ECRC 0 | ||
Unsupported Request 0 | ||
ACS Violation 0 | ||
Uncorrectable Internal Error 0 | ||
MC Blocked TLP 0 | ||
AtomicOp Egress Blocked 0 | ||
TLP Prefix Blocked Error 0 | ||
TOTAL_ERR_NONFATAL 0 | ||
------------------------------------------------------------------------- |
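
The three attributes above share one layout in the sample outputs: one error name and one count per line, ending with a `TOTAL_ERR_*` line. A minimal sketch of reading them from user space follows. It is not part of this commit. It assumes that line layout holds, and the helper names `parse_aer_stats` and `read_device_aer` are hypothetical. Because one message may report several errors, the sketch returns the total separately and does not expect it to equal the sum of the per-type counts.

```python
from pathlib import Path

# Attribute names taken from the ABI entries above.
AER_FILES = ("aer_dev_correctable", "aer_dev_fatal", "aer_dev_nonfatal")


def parse_aer_stats(text):
    """Parse the text of one aer_dev_* file into (counters, total).

    Assumes each line is "<error name> <count>"; the name may contain
    spaces, so the count is split off from the right-hand side.
    """
    counters = {}
    total = None
    for line in text.splitlines():
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip anything that is not "<name> <count>"
        name, value = parts[0], int(parts[1])
        if name.startswith("TOTAL_ERR_"):
            total = value
        else:
            counters[name] = value
    return counters, total


def read_device_aer(dev_dir):
    """Read whichever AER attributes exist in one device directory.

    dev_dir is a path such as /sys/bus/pci/devices/<dev>; devices that
    are not AER capable have none of the files and yield an empty dict.
    """
    stats = {}
    for fname in AER_FILES:
        path = Path(dev_dir) / fname
        if path.is_file():
            stats[fname] = parse_aer_stats(path.read_text())
    return stats
```

Calling `read_device_aer()` on both an endpoint and its link partner (for example its root port) is useful, since the counters may increment on the partner and not on the problematic endpoint.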