From norsk5@yahoo.com Mon May 8 17:12:56 2006 Message-ID: <20060509000609.62160.qmail@web50101.mail.yahoo.com> Date: Mon, 8 May 2006 17:06:09 -0700 (PDT) From: Doug Thompson Subject: PCI Bus Parity Status-broken hardware attribute, EDAC foundation To: Greg K-H , Currently, the EDAC (error detection and correction) modules that are in the kernel contain some features that need to be moved. After some good feedback on the PCI Parity detection code and interface (http://www.ussg.iu.edu/hypermail/linux/kernel/0603.1/0897.html) this patch ADDs an new attribute to the pci_dev structure: Namely the 'broken_parity_status' bit. When set this indicates that the respective hardware generates false positives of Parity errors. The EDAC "blacklist" solution was inferior and will be removed in a future patch. Also in this patch is a PCI quirk.c entry for an Infiniband PCI-X card which generates false positive parity errors. I am requesting comments on this AND on the possibility of a exposing this 'broken_parity_status' bit to userland via the PCI device sysfs directory for devices. This access would allow for enabling of this feature on new devices and for old devices that have their drivers updated. (SLES 9 SP3 did this on an ATI motherboard video device). There is a need to update such a PCI attribute between kernel releases. This patch just adds a storage place for the attribute and a quirk entry for a known bad PCI device. PCI Parity reaper/harvestor operations are in EDAC itself and will be refactored to use this PCI attribute instead of its own mechanisms (which are currently disabled) in the future. Signed-off-by: Doug Thompson Signed-off-by: Greg Kroah-Hartman --- drivers/pci/quirks.c | 11 +++++++++++ include/linux/pci.h | 1 + include/linux/pci_ids.h | 1 + 3 files changed, 13 insertions(+) --- gregkh-2.6.orig/drivers/pci/quirks.c +++ gregkh-2.6/drivers/pci/quirks.c @@ -24,6 +24,17 @@ #include #include "pci.h" +/* The Mellanox Tavor device gives false positive parity errors + * Mark this device with a broken_parity_status, to allow + * PCI scanning code to "skip" this now blacklisted device. + */ +static void __devinit quirk_mellanox_tavor(struct pci_dev *dev) +{ + dev->broken_parity_status = 1; /* This device gives false positives */ +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX,PCI_DEVICE_ID_MELLANOX_TAVOR,quirk_mellanox_tavor); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_MELLANOX,PCI_DEVICE_ID_MELLANOX_TAVOR_BRIDGE,quirk_mellanox_tavor); + /* Deal with broken BIOS'es that neglect to enable passive release, which can cause problems in combination with the 82441FX/PPro MTRRs */ static void __devinit quirk_passive_release(struct pci_dev *dev) --- gregkh-2.6.orig/include/linux/pci.h +++ gregkh-2.6/include/linux/pci.h @@ -162,6 +162,7 @@ struct pci_dev { unsigned int is_busmaster:1; /* device is busmaster */ unsigned int no_msi:1; /* device may not use msi */ unsigned int block_ucfg_access:1; /* userspace config space access is blocked */ + unsigned int broken_parity_status:1; /* Device generates false positive parity */ u32 saved_config_space[16]; /* config space saved at suspend time */ struct hlist_head saved_cap_space; --- gregkh-2.6.orig/include/linux/pci_ids.h +++ gregkh-2.6/include/linux/pci_ids.h @@ -1951,6 +1951,7 @@ #define PCI_VENDOR_ID_MELLANOX 0x15b3 #define PCI_DEVICE_ID_MELLANOX_TAVOR 0x5a44 +#define PCI_DEVICE_ID_MELLANOX_TAVOR_BRIDGE 0x5a46 #define PCI_DEVICE_ID_MELLANOX_ARBEL_COMPAT 0x6278 #define PCI_DEVICE_ID_MELLANOX_ARBEL 0x6282 #define PCI_DEVICE_ID_MELLANOX_SINAI_OLD 0x5e8c