summaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)Author
2017-10-05PCI: Fix race condition with driver_overrideNicolai Stange
commit 9561475db680f7144d2223a409dd3d7e322aca03 upstream. The driver_override implementation is susceptible to a race condition when different threads are reading vs. storing a different driver override. Add locking to avoid the race condition. This is in close analogy to commit 6265539776a0 ("driver core: platform: fix race condition with driver_override") from Adrian Salido. Fixes: 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override") Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27PCI: pciehp: Report power fault only once until we clear itKeith Busch
commit 7612b3b28c0b900dcbcdf5e9b9747cc20a1e2455 upstream. When a power fault occurs, the power controller sets Power Fault Detected in the Slot Status register, and pciehp_isr() queues an INT_POWER_FAULT event to handle it. It also clears Power Fault Detected, but since nothing has yet changed to correct the power fault, the power controller will likely set it again immediately, which may cause an infinite loop when pcie_isr() rechecks Slot Status. Fix that by masking off Power Fault Detected from new events if the driver hasn't seen the power fault clear from the previous handling attempt. Fixes: fad214b0aa72 ("PCI: pciehp: Process all hotplug events before looking for new ones") Signed-off-by: Keith Busch <keith.busch@intel.com> [bhelgaas: changelog, pull test out and add comment] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Mayurkumar Patel <mayurkumar.patel@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27PCI: shpchp: Enable bridge bus mastering if MSI is enabledAleksandr Bezzubikov
commit 48b79a14505349a29b3e20f03619ada9b33c4b17 upstream. An SHPC may generate MSIs to notify software about slot or controller events (SHPC spec r1.0, sec 4.7). A PCI device can only generate an MSI if it has bus mastering enabled. Enable bus mastering if the bridge contains an SHPC that uses MSI for event notifications. Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27PCI/PM: Restore the status of PCI devices across hibernationChen Yu
commit e60514bd4485c0c7c5a7cf779b200ce0b95c70d6 upstream. Currently we saw a lot of "No irq handler" errors during hibernation, which caused the system hang finally: ata4.00: qc timeout (cmd 0xec) ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4) ata4.00: revalidation failed (errno=-5) ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300) do_IRQ: 31.151 No irq handler for vector According to above logs, there is an interrupt triggered and it is dispatched to CPU31 with a vector number 151, but there is no handler for it, thus this IRQ will not get acked and will cause an IRQ flood which kills the system. To be more specific, the 31.151 is an interrupt from the AHCI host controller. After some investigation, the reason why this issue is triggered is because the thaw_noirq() function does not restore the MSI/MSI-X settings across hibernation. The scenario is illustrated below: 1. Before hibernation, IRQ 34 is the handler for the AHCI device, which is bound to CPU31. 2. Hibernation starts, the AHCI device is put into low power state. 3. All the nonboot CPUs are put offline, so IRQ 34 has to be migrated to the last alive one - CPU0. 4. After the snapshot has been created, all the nonboot CPUs are brought up again; IRQ 34 remains bound to CPU0. 5. AHCI devices are put into D0. 6. The snapshot is written to the disk. The issue is triggered in step 6. The AHCI interrupt should be delivered to CPU0, however it is delivered to the original CPU31 instead, which causes the "No irq handler" issue. Ying Huang has provided a clue that, in step 3 it is possible that writing to the register might not take effect as the PCI devices have been suspended. In step 3, the IRQ 34 affinity should be modified from CPU31 to CPU0, but in fact it is not. In __pci_write_msi_msg(), if the device is already in low power state, the low level MSI message entry will not be updated but cached. During the device restore process after a normal suspend/resume, pci_restore_msi_state() writes the cached MSI back to the hardware. But this is not the case for hibernation. pci_restore_msi_state() is not currently called in pci_pm_thaw_noirq(), although pci_save_state() has saved the necessary PCI cached information in pci_pm_freeze_noirq(). Restore the PCI status for the device during hibernation. Otherwise the status might be lost across hibernation (for example, settings for MSI, MSI-X, ATS, ACS, IOV, etc.), which might cause problems during hibernation. Suggested-by: Ying Huang <ying.huang@intel.com> Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: Ying Huang <ying.huang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27PCI: rockchip: Use normal register bank for config accessorsShawn Lin
commit dc8cca5ef25ac4cb0dfc37467521a759767ff361 upstream. Rockchip's RC has two banks of registers for the root port: a normal bank that is strictly compatible with the PCIe spec, and a privileged bank that can be used to change RO bits of root port registers. When probing the RC driver, we use the privileged bank to do some basic setup work as some RO bits are hw-inited to wrong value. But we didn't change to the normal bank after probing the driver. This leads to a serious problem when the PME code tries to clear the PME status by writing PCI_EXP_RTSTA_PME to the register of PCI_EXP_RTSTA. Per PCIe 3.0 spec, section 7.8.14, the PME status bit is RW1C. So the PME code is doing the right thing to clear the PME status but we find the RC doesn't clear it but actually setting it to one. So finally the system trap in pcie_pme_work_fn() as PCI_EXP_RTSTA_PME is true now forever. This issue can be reproduced by booting kernel with pci=nomsi. Use the normal register bank for the PCI config accessors. The privileged bank is used only internally by this driver. Fixes: e77f847d ("PCI: rockchip: Add Rockchip PCIe controller support") Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Jeffy Chen <jeffy.chen@rock-chips.com> Cc: Brian Norris <briannorris@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17PCI/PM: Add needs_resume flag to avoid suspend complete optimizationImre Deak
commit 4d071c3238987325b9e50e33051a40d1cce311cc upstream. Some drivers - like i915 - may not support the system suspend direct complete optimization due to differences in their runtime and system suspend sequence. Add a flag that when set resumes the device before calling the driver's system suspend handlers which effectively disables the optimization. Needed by a future patch fixing suspend/resume on i915. Suggested by Rafael. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable@vger.kernel.org (rebased on v4.8, added kernel version to commit message stable tag) Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25PCI: Freeze PME scan before suspending devicesLukas Wunner
commit ea00353f36b64375518662a8ad15e39218a1f324 upstream. Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to reproduce the issue on an M2-W Koelsch board (r8a7791): It occurs when the PME scan runs, once per second. During PME scan, the PCI host bridge (rcar-pci) registers are accessed while its module clock has already been disabled, leading to the crash. One reproducer is to configure s2ram to use "s2idle" instead of "deep" suspend: # echo 0 > /sys/module/printk/parameters/console_suspend # echo s2idle > /sys/power/mem_sleep # echo mem > /sys/power/state Another reproducer is to write either "platform" or "processors" to /sys/power/pm_test. It does not (or is less likely) to happen during full system suspend ("core" or "none") because system suspend also disables timers, and thus the workqueue handling PME scans no longer runs. Geert believes the issue may still happen in the small window between disabling module clocks and disabling timers: # echo 0 > /sys/module/printk/parameters/console_suspend # echo platform > /sys/power/pm_test # Or "processors" # echo mem > /sys/power/state (Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.) Rafael Wysocki agrees that PME scans should be suspended before the host bridge registers become inaccessible. To that end, queue the task on a workqueue that gets frozen before devices suspend. Rafael notes however that as a result, some wakeup events may be missed if they are delivered via PME from a device without working IRQ (which hence must be polled) and occur after the workqueue has been frozen. If that turns out to be an issue in practice, it may be possible to solve it by calling pci_pme_list_scan() once directly from one of the host bridge's pm_ops callbacks. Stacktrace for posterity: PM: Syncing filesystems ... [ 38.566237] done. PM: Preparing system for sleep (mem) Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. PM: Suspending system (mem) PM: suspend of devices complete after 152.456 msecs PM: late suspend of devices complete after 2.809 msecs PM: noirq suspend of devices complete after 29.863 msecs suspend debug: Waiting for 5 second(s). Unhandled fault: asynchronous external abort (0x1211) at 0x00000000 pgd = c0003000 [00000000] *pgd=80000040004003, *pmd=00000000 Internal error: : 1211 [#1] SMP ARM Modules linked in: CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383 Hardware name: Generic R8A7791 (Flattened Device Tree) Workqueue: events pci_pme_list_scan task: eb56e140 task.stack: eb58e000 PC is at pci_generic_config_read+0x64/0x6c LR is at rcar_pci_cfg_base+0x64/0x84 pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093 sp : eb58fe98 ip : c041d750 fp : 00000008 r10: c0e2283c r9 : 00000000 r8 : 600d0013 r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4 r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 30c5387d Table: 6a9f6c80 DAC: 55555555 Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210) Stack: (0xeb58fe98 to 0xeb590000) fe80: 00000002 00000044 fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000 fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830 fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100 ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000 ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380 ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000 ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0 ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000 ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>] (pci_bus_read_config_word+0x58/0x80) [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>] (pci_check_pme_status+0x34/0x78) [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54) [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4) [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>] (process_one_work+0x1bc/0x308) [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0) [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc) [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c) Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000) ---[ end trace 667d43ba3aa9e589 ]--- Fixes: df17e62e5bff ("PCI: Add support for polling PME state on suspended legacy PCI devices") Reported-and-tested-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25PCI: Only allow WC mmap on prefetchable resourcesDavid Woodhouse
commit cef4d02305a06be581bb7f4353446717a1b319ec upstream. The /proc/bus/pci mmap interface allows the user to specify whether they want WC or not. Don't let them do so on non-prefetchable BARs. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25PCI: Fix another sanity check bug in /proc/pci mmapDavid Woodhouse
commit 17caf56731311c9596e7d38a70c88fcb6afa6a1b upstream. Don't match MMIO maps with I/O BARs and vice versa. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platformsDavid Woodhouse
commit 6bccc7f426abd640f08d8c75fb22f99483f201b4 upstream. In the PCI_MMAP_PROCFS case when the address being passed by the user is a 'user visible' resource address based on the bus window, and not the actual contents of the resource, that's what we need to be checking it against. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25PCI: hv: Specify CPU_AFFINITY_ALL for MSI affinity when >= 32 CPUsK. Y. Srinivasan
commit 433fcf6b7b31f1f233dd50aeb9d066a0f6ed4b9d upstream. When we have 32 or more CPUs in the affinity mask, we should use a special constant to specify that to the host. Fix this issue. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25PCI: hv: Allocate interrupt descriptors with GFP_ATOMICK. Y. Srinivasan
commit 59c58ceeea9cdc6144d7b0303753e6bd26d87455 upstream. The memory allocation here needs to be non-blocking. Fix the issue. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12PCI: Add ACS quirk for Qualcomm QDF2400 and QDF2432Sinan Kaya
[ Upstream commit 33be632b8443b6ac74aa293504f430604fb9abeb ] The Qualcomm QDF2xxx root ports don't advertise an ACS capability, but they do provide ACS-like features to disable peer transactions and validate bus numbers in requests. To be specific: * Hardware supports source validation but it will report the issue as Completer Abort instead of ACS Violation. * Hardware doesn't support peer-to-peer and each root port is a root complex with unique segment numbers. * It is not possible for one root port to pass traffic to the other root port. All PCIe transactions are terminated inside the root port. Add an ACS quirk for the QDF2400 and QDF2432 products. [bhelgaas: changelog] Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12PCI: Sort the list of devices with D3 delay quirk by IDAndy Shevchenko
[ Upstream commit cd3e2eb8905d14fe28a2fc75362b8ecec16f0fb6 ] Sort the list of Intel devices that have no PCI D3 delay by ID. Add a comment for group of devices that had not been marked yet. There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12PCI: Disable MSI for HiSilicon Hip06/Hip07 Root PortsDongdong Liu
[ Upstream commit 72f2ff0deb870145a5a2d24cd75b4f9936159a62 ] The PCIe Root Port in Hip06/Hip07 SoCs advertises an MSI capability, but it cannot generate MSIs. It can transfer MSI/MSI-X from downstream devices, but does not support MSI/MSI-X itself. Add a quirk to prevent use of MSI/MSI-X by the Root Port. [bhelgaas: changelog, sort vendor ID #define, drop device ID #define] Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gabriele Paoloni <gabriele.paoloni@huawei.com> Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12PCI: Add Broadcom Northstar2 PAXC quirk for device class and MPSSJon Mason
[ Upstream commit ce709f86501a013e941e9986cb072eae375ddf3e ] The Broadcom Northstar2 SoC has a number of quirks for the PAXC (internal/fake) PCI bus. Specifically, the PCI config space is shared between the root port and the first PF (ie., PF0), and a number of fields are tied to zero (thus preventing them from being set). These cannot be "fixed" in device firmware, so we must fix them with a quirk. Signed-off-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12PCI: Add ACS quirk for Intel Union PointAlex Williamson
[ Upstream commit 7184f5b451cf3dc61de79091d235b5d2bba2782d ] Intel 200-series chipsets have the same errata as 100-series: the ACS capability doesn't follow the PCIe spec, the capability and control registers are dwords rather than words. Add PCIe root port device IDs to existing quirk. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12PCI: Expand "VPD access disabled" quirk messageBjorn Helgaas
[ Upstream commit 044bc425bb72ffdecfb2a66d50cb1d024ecb96d0 ] It's not very enlightening to see pci 0000:07:00.0: [Firmware Bug]: VPD access disabled in the dmesg log because there's no clue about what the firmware bug is. Expand the message to explain why we're disabling VPD. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12PCI: thunder-pem: Factor out resource lookupBjorn Helgaas
[ Upstream commit 0d414268fb8d0844030f87027e904f69d96706be ] Pull the register resource lookup out of thunder_pem_init() so we can easily add a corresponding lookup using ACPI. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08PCI: iproc: Save host bridge window resource in struct iproc_pcieBjorn Helgaas
commit 6e347b5e05ea2ac4ac467a5a1cfaebb2c7f06f80 upstream. The host bridge memory window resource is inserted into the iomem_resource tree and cannot be deallocated until the host bridge itself is removed. Previously, the window was on the stack, which meant the iomem_resource entry pointed into the stack and was corrupted as soon as the probe function returned, which caused memory corruption and errors like this: pcie_iproc_bcma bcma0:8: resource collision: [mem 0x40000000-0x47ffffff] conflicts with PCIe MEM space [mem 0x40000000-0x47ffffff] Move the memory window resource from the stack into struct iproc_pcie so its lifetime matches that of the host bridge. Fixes: c3245a566400 ("PCI: iproc: Request host bridge window resources") Reported-and-tested-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22PCI: Do any VF BAR updates before enabling the BARsGavin Shan
[ Upstream commit f40ec3c748c6912f6266c56a7f7992de61b255ed ] Previously we enabled VFs and enable their memory space before calling pcibios_sriov_enable(). But pcibios_sriov_enable() may update the VF BARs: for example, on PPC PowerNV we may change them to manage the association of VFs to PEs. Because 64-bit BARs cannot be updated atomically, it's unsafe to update them while they're enabled. The half-updated state may conflict with other devices in the system. Call pcibios_sriov_enable() before enabling the VFs so any BAR updates happen while the VF BARs are disabled. [bhelgaas: changelog] Tested-by: Carol Soto <clsoto@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22PCI: Ignore BAR updates on virtual functionsBjorn Helgaas
[ Upstream commit 63880b230a4af502c56dde3d4588634c70c66006 ] VF BARs are read-only zero, so updating VF BARs will not have any effect. See the SR-IOV spec r1.1, sec 3.4.1.11. We already ignore these updates because of 70675e0b6a1a ("PCI: Don't try to restore VF BARs"); this merely restructures it slightly to make it easier to split updates for standard and SR-IOV BARs. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22PCI: Update BARs using property bits appropriate for typeBjorn Helgaas
[ Upstream commit 45d004f4afefdd8d79916ee6d97a9ecd94bb1ffe ] The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed to be read-only, but we do save them in res->flags and include them when updating the BAR. Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits 2-3 of I/O addresses. Use PCI_ROM_ADDRESS_MASK for ROM BARs. This means we'll only check the top 21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if the update was successful. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22PCI: Don't update VF BARs while VF memory space is enabledBjorn Helgaas
[ Upstream commit 546ba9f8f22f71b0202b6ba8967be5cc6dae4e21 ] If we update a VF BAR while it's enabled, there are two potential problems: 1) Any driver that's using the VF has a cached BAR value that is stale after the update, and 2) We can't update 64-bit BARs atomically, so the intermediate state (new lower dword with old upper dword) may conflict with another device, and an access by a driver unrelated to the VF may cause a bus error. Warn about attempts to update VF BARs while they are enabled. This is a programming error, so use dev_WARN() to get a backtrace. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLEBjorn Helgaas
[ Upstream commit 7a6d312b50e63f598f5b5914c4fd21878ac2b595 ] Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE. PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if we're reading or writing a BAR register value, that's what we should use. IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22PCI: Add comments about ROM BAR updatingBjorn Helgaas
[ Upstream commit 0b457dde3cf8b7c76a60f8e960f21bbd4abdc416 ] pci_update_resource() updates a hardware BAR so its address matches the kernel's struct resource UNLESS it's a disabled ROM BAR. We only update those when we enable the ROM. It's not obvious from the code why ROM BARs should be handled specially. Apparently there are Matrox devices with defective ROM BARs that read as zero when disabled. That means that if pci_enable_rom() reads the disabled BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and writes it back, it would enable the ROM at address zero. Add comments and references to explain why we can't make the code look more rational. The code changes are from 755528c860b0 ("Ignore disabled ROM resources at setup") and 8085ce084c0f ("[PATCH] Fix PCI ROM mapping"). Link: https://lkml.org/lkml/2005/8/30/138 Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22PCI: Remove pci_resource_bar() and pci_iov_resource_bar()Bjorn Helgaas
[ Upstream commit 286c2378aaccc7343ebf17ec6cd86567659caf70 ] pci_std_update_resource() only deals with standard BARs, so we don't have to worry about the complications of VF BARs in an SR-IOV capability. Compute the BAR address inline and remove pci_resource_bar(). That makes pci_iov_resource_bar() unused, so remove that as well. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-22PCI: Separate VF BAR updates from standard BAR updatesBjorn Helgaas
[ Upstream commit 6ffa2489c51da77564a0881a73765ea2169f955d ] Previously pci_update_resource() used the same code path for updating standard BARs and VF BARs in SR-IOV capabilities. Split the VF BAR update into a new pci_iov_update_resource() internal interface, which makes it simpler to compute the BAR address (we can get rid of pci_resource_bar() and pci_iov_resource_bar()). This patch: - Renames pci_update_resource() to pci_std_update_resource(), - Adds pci_iov_update_resource(), - Makes pci_update_resource() a wrapper that calls the appropriate one, No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-18PCI: Prevent VPD access for QLogic ISP2722Ethan Zhao
commit 0d5370d1d85251e5893ab7c90a429464de2e140b upstream. QLogic ISP2722-based 16/32Gb Fibre Channel to PCIe Adapter has the VPD access issue too, while read the common pci-sysfs access interface shown as /sys/devices/pci0000:00/0000:00:03.2/0000:0b:00.0/vpd with simple 'cat' could cause system hang and panic: Kernel panic - not syncing: An NMI occurred. Depending on your system the reason for the NMI is logged in any one of the following resources: 1. Integrated Management Log (IML) 2. OA Syslog 3. OA Forward Progress Log 4. iLO Event Log CPU: 0 PID: 15070 Comm: udevadm Not tainted 4.1.12 Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 0000000000000086 000000007f0cdf51 ffff880c4fa05d58 ffffffff817193de ffffffffa00b42d8 0000000000000075 ffff880c4fa05dd8 ffffffff81714072 0000000000000008 ffff880c4fa05de8 ffff880c4fa05d88 000000007f0cdf51 Call Trace: <NMI> [<ffffffff817193de>] dump_stack+0x63/0x81 [<ffffffff81714072>] panic+0xd0/0x20e [<ffffffffa00b390d>] hpwdt_pretimeout+0xdd/0xe0 [hpwdt] [<ffffffff81021fc9>] ? sched_clock+0x9/0x10 [<ffffffff8101c101>] nmi_handle+0x91/0x170 [<ffffffff8101c10c>] ? nmi_handle+0x9c/0x170 [<ffffffff8101c5fe>] io_check_error+0x1e/0xa0 [<ffffffff8101c719>] default_do_nmi+0x99/0x140 [<ffffffff8101c8b4>] do_nmi+0xf4/0x170 [<ffffffff817232c5>] end_repeat_nmi+0x1a/0x1e [<ffffffff815d724b>] ? pci_conf1_read+0xeb/0x120 [<ffffffff815d724b>] ? pci_conf1_read+0xeb/0x120 [<ffffffff815d724b>] ? pci_conf1_read+0xeb/0x120 <<EOE>> [<ffffffff815db4b3>] raw_pci_read+0x23/0x40 [<ffffffff815db4fc>] pci_read+0x2c/0x30 [<ffffffff8136f612>] pci_user_read_config_word+0x72/0x110 [<ffffffff8136f746>] pci_vpd_pci22_wait+0x96/0x130 [<ffffffff8136ff9b>] pci_vpd_pci22_read+0xdb/0x1a0 [<ffffffff8136ea30>] pci_read_vpd+0x20/0x30 [<ffffffff8137d590>] read_vpd_attr+0x30/0x40 [<ffffffff8128e037>] sysfs_kf_bin_read+0x47/0x70 [<ffffffff8128d24e>] kernfs_fop_read+0xae/0x180 [<ffffffff8120dd97>] __vfs_read+0x37/0x100 [<ffffffff812ba7e4>] ? security_file_permission+0x84/0xa0 [<ffffffff8120e366>] ? rw_verify_area+0x56/0xe0 [<ffffffff8120e476>] vfs_read+0x86/0x140 [<ffffffff8120f3f5>] SyS_read+0x55/0xd0 [<ffffffff81720f2e>] system_call_fastpath+0x12/0x71 Shutting down cpus with NMI Kernel Offset: disabled drm_kms_helper: panic occurred, switching back to text console So blacklist the access to its VPD. Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15drivers/pci/hotplug: Fix initial state for empty slotGavin Shan
commit d0c424971f70501ec0a0364117b9934db039c9cc upstream. In PowerNV PCI hotplug driver, the initial PCI slot's state is set to PNV_PHP_STATE_POPULATED if no PCI devices are connected to the slot. The PCI devices that are hot added to the slot won't be probed and populated because of the check in pnv_php_enable(): /* Check if the slot has been configured */ if (php_slot->state != PNV_PHP_STATE_REGISTERED) return 0; This fixes the issue by leaving the slot in PNV_PHP_STATE_REGISTERED state initially if nothing is connected to the slot. Fixes: 360aebd85a4 ("drivers/pci/hotplug: Support surprise hotplug in powernv driver") Reported-by: Hank Chang <hankmax0000@gmail.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Willie Liauw <williel@supermicro.com.tw> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15drivers/pci/hotplug: Handle presence detection change properlyGavin Shan
commit d7d55536c6cd1f80295b6d7483ad0587b148bde4 upstream. The surprise hotplug is driven by interrupt in PowerNV PCI hotplug driver. In the interrupt handler, pnv_php_interrupt(), we bail when pnv_pci_get_presence_state() returns zero wrongly. It causes the presence change event is always ignored incorrectly. This fixes the issue by bailing on error (non-zero value) returned from pnv_pci_get_presence_state(). Fixes: 360aebd85a4 ("drivers/pci/hotplug: Support surprise hotplug in powernv driver") Reported-by: Hank Chang <hankmax0000@gmail.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Willie Liauw <williel@supermicro.com.tw> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15pci/hotplug/pnv-php: Disable surprise hotplug capability on conflictsGavin Shan
commit 303529d6ef1293513c2c73c9ab86489eebb37d08 upstream. The root port or PCIe switch downstream port might have been associated with driver other than pnv-php. The MSI or MSIx might also have been enabled by that driver (e.g. pcieport_drv). Attempt to enable MSI incurs below backtrace: PowerPC PowerNV PCI Hotplug Driver version: 0.1 ------------[ cut here ]------------ WARNING: CPU: 19 PID: 1004 at drivers/pci/msi.c:1071 \ __pci_enable_msi_range+0x84/0x4e0 NIP [c000000000665c34] __pci_enable_msi_range+0x84/0x4e0 LR [c000000000665c24] __pci_enable_msi_range+0x74/0x4e0 Call Trace: [c000000384d67600] [c000000000665c24] __pci_enable_msi_range+0x74/0x4e0 [c000000384d676e0] [d00000000aa31b04] pnv_php_register+0x564/0x5a0 [pnv_php] [c000000384d677c0] [d00000000aa31658] pnv_php_register+0xb8/0x5a0 [pnv_php] [c000000384d678a0] [d00000000aa31658] pnv_php_register+0xb8/0x5a0 [pnv_php] [c000000384d67980] [d00000000aa31dfc] pnv_php_init+0x60/0x98 [pnv_php] [c000000384d679f0] [c00000000000cfdc] do_one_initcall+0x6c/0x1d0 [c000000384d67ab0] [c000000000b92354] do_init_module+0x94/0x254 [c000000384d67b40] [c00000000019719c] load_module+0x258c/0x2c60 [c000000384d67d30] [c000000000197bb0] SyS_finit_module+0xf0/0x170 [c000000384d67e30] [c00000000000b184] system_call+0x38/0xe0 This fixes the issue by skipping enabling the surprise hotplug capability if the MSI or MSIx on the PCI slot's upstream port has been enabled by other driver. Fixes: 360aebd85a4c ("drivers/pci/hotplug: Support surprise hotplug in powernv driver") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-15pci/hotplug/pnv-php: Remove WARN_ON() in pnv_php_put_slot()Gavin Shan
commit 36c7c9da40c408a71e5e6bfe12e57dcf549a296d upstream. The WARN_ON() causes unnecessary backtrace when putting the parent slot, which is likely to be NULL. WARNING: CPU: 2 PID: 1071 at drivers/pci/hotplug/pnv_php.c:85 \ pnv_php_release+0xcc/0x150 [pnv_php] : Call Trace: [c0000003bc007c10] [d00000000ad613c4] pnv_php_release+0x144/0x150 [pnv_php] [c0000003bc007c40] [c0000000006641d8] pci_hp_deregister+0x238/0x330 [c0000003bc007cd0] [d00000000ad61440] pnv_php_unregister_one+0x70/0xa0 [pnv_php] [c0000003bc007d10] [d00000000ad614c0] pnv_php_unregister+0x50/0x80 [pnv_php] [c0000003bc007d40] [d00000000ad61e84] pnv_php_exit+0x50/0xcb4 [pnv_php] [c0000003bc007d70] [c00000000019499c] SyS_delete_module+0x1fc/0x2a0 [c0000003bc007e30] [c00000000000b184] system_call+0x38/0xe0 Fixes: 66725152fb9f ("PCI/hotplug: PowerPC PowerNV PCI hotplug driver") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12PCI: altera: Fix TLP_CFG_DW0 for TLP writeLey Foon Tan
commit 2a7275a3d867b228216886aae35e1f64291180b1 upstream. eb5767122feb ("PCI: altera: Simplify TLB_CFG_DW0 usage") used TLP_FMTTYPE_CFGRD* (instead of TLP_FMTTYPE_CFGWR*) for TLP writes, which causes writing to configuration space to fail. Fix it by using correct FMTTYPE for write operation. Fixes: eb5767122feb ("PCI: altera: Simplify TLB_CFG_DW0 usage") Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12pci/hotplug/pnv-php: Disable MSI and PCI device properlyGavin Shan
commit 49f4b08e61547a5ccd2db551d994c4503efe5666 upstream. pnv_php_disable_irq() can be called in two paths: Bailing path in pnv_php_enable_irq() or releasing slot. The MSI (or MSIx) interrupts is disabled unconditionally in pnv_php_disable_irq(). It's wrong because that might be enabled by drivers other than pnv-php. This disables MSI (or MSIx) interrupts and the PCI device only if it was enabled by pnv-php. In the error path of pnv_php_enable_irq(), we rely on the newly added parameter @disable_device. In the path of releasing slot, @pnv_php->irq is checked. Fixes: 360aebd85a4c ("drivers/pci/hotplug: Support surprise hotplug in powernv driver") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-12PCI: hv: Fix wslot_to_devfn() to fix warnings on device removalDexuan Cui
commit 60e2e2fbafdd1285ae1b4ad39ded41603e0c74d0 upstream. The devfn of 00:02.0 is 0x10. devfn_to_wslot(0x10) == 0x2, and wslot_to_devfn(0x2) should be 0x10, while it's 0x2 in the current code. Due to this, hv_eject_device_work() -> pci_get_domain_bus_and_slot() returns NULL and pci_stop_and_remove_bus_device() is not called. Later when the real device driver's .remove() is invoked by hv_pci_remove() -> pci_stop_root_bus(), some warnings can be noticed because the VM has lost the access to the underlying device at that time. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> CC: K. Y. Srinivasan <kys@microsoft.com> CC: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-23PCI/PME: Restore pcie_pme_driver.removeYinghai Lu
commit afe3e4d11bdf50a4c3965eb6465ba6bebbcf5dcf upstream. In addition to making PME non-modular, d7def2040077 ("PCI/PME: Make explicitly non-modular") removed the pcie_pme_driver .remove() method, pcie_pme_remove(). pcie_pme_remove() freed the PME IRQ that was requested in pci_pme_probe(). The fact that we don't free the IRQ after d7def2040077 causes the following crash when removing a PCIe port device via /sys: ------------[ cut here ]------------ kernel BUG at drivers/pci/msi.c:370! invalid opcode: 0000 [#1] SMP Modules linked in: CPU: 1 PID: 14509 Comm: sh Tainted: G W 4.8.0-rc1-yh-00012-gd29438d RIP: 0010:[<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190 ... Call Trace: [<ffffffff9758cda4>] pci_disable_msi+0x34/0x40 [<ffffffff97583817>] cleanup_service_irqs+0x27/0x30 [<ffffffff97583e9a>] pcie_port_device_remove+0x2a/0x40 [<ffffffff97584250>] pcie_portdrv_remove+0x40/0x50 [<ffffffff97576d7b>] pci_device_remove+0x4b/0xc0 [<ffffffff9785ebe6>] __device_release_driver+0xb6/0x150 [<ffffffff9785eca5>] device_release_driver+0x25/0x40 [<ffffffff975702e4>] pci_stop_bus_device+0x74/0xa0 [<ffffffff975704ea>] pci_stop_and_remove_bus_device_locked+0x1a/0x30 [<ffffffff97578810>] remove_store+0x50/0x70 [<ffffffff9785a378>] dev_attr_store+0x18/0x30 [<ffffffff97260b64>] sysfs_kf_write+0x44/0x60 [<ffffffff9725feae>] kernfs_fop_write+0x10e/0x190 [<ffffffff971e13f8>] __vfs_write+0x28/0x110 [<ffffffff970b0fa4>] ? percpu_down_read+0x44/0x80 [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0 [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0 [<ffffffff971e1f04>] vfs_write+0xc4/0x180 [<ffffffff971e3089>] SyS_write+0x49/0xa0 [<ffffffff97001a46>] do_syscall_64+0xa6/0x1b0 [<ffffffff9819201e>] entry_SYSCALL64_slow_path+0x25/0x25 ... RIP [<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190 RSP <ffff89ad3085bc48> ---[ end trace f4505e1dac5b95d3 ]--- Segmentation fault Restore pcie_pme_remove(). [bhelgaas: changelog] Fixes: d7def2040077 ("PCI/PME: Make explicitly non-modular") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-09PCI/ASPM: Handle PCI-to-PCIe bridges as roots of PCIe hierarchiesBjorn Helgaas
commit 030305d69fc6963c16003f50d7e8d74b02d0a143 upstream. In a struct pcie_link_state, link->root points to the pcie_link_state of the root of the PCIe hierarchy. For the topmost link, this points to itself (link->root = link). For others, we copy the pointer from the parent (link->root = link->parent->root). Previously we recognized that Root Ports originated PCIe hierarchies, but we treated PCI/PCI-X to PCIe Bridges as being in the middle of the hierarchy, and when we tried to copy the pointer from link->parent->root, there was no parent, and we dereferenced a NULL pointer: BUG: unable to handle kernel NULL pointer dereference at 0000000000000090 IP: [<ffffffff9e424350>] pcie_aspm_init_link_state+0x170/0x820 Recognize that PCI/PCI-X to PCIe Bridges originate PCIe hierarchies just like Root Ports do, so link->root for these devices should also point to itself. Fixes: 51ebfc92b72b ("PCI: Enumerate switches below PCI-to-PCIe bridges") Link: https://bugzilla.kernel.org/show_bug.cgi?id=193411 Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1022181 Tested-by: lists@ssl-mail.com Tested-by: Jayachandran C. <jnair@caviumnetworks.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-26PCI: Enumerate switches below PCI-to-PCIe bridgesBjorn Helgaas
commit 51ebfc92b72b4f7dac1ab45683bf56741e454b8c upstream. A PCI-to-PCIe bridge (a "reverse bridge") has a PCI or PCI-X primary interface and a PCI Express secondary interface. The PCIe interface is a Downstream Port that originates a Link. See the "PCI Express to PCI/PCI-X Bridge Specification", rev 1.0, sections 1.2 and A.6. The bug report below involves a PCI-to-PCIe bridge and a PCIe switch below the bridge: 00:1e.0 Intel 82801 PCI Bridge to [bus 01-0a] 01:00.0 Pericom PI7C9X111SL PCIe-to-PCI Reversible Bridge to [bus 02-0a] 02:00.0 Pericom Device 8608 [PCIe Upstream Port] to [bus 03-0a] 03:01.0 Pericom Device 8608 [PCIe Downstream Port] to [bus 0a] 01:00.0 is configured as a PCI-to-PCIe bridge (despite the name printed by lspci). As we traverse a PCIe hierarchy, device connections alternate between PCIe Links and internal Switch logic. Previously we did not recognize that 01:00.0 had a secondary link, so we thought the 02:00.0 Upstream Port *did* have a secondary link. In fact, it's the other way around: 01:00.0 has a secondary link, and 02:00.0 has internal Switch logic on its secondary side. When we thought 02:00.0 had a secondary link, the pci_scan_slot() -> only_one_child() path assumed 02:00.0 could have only one child, so 03:00.0 was the only possible downstream device. But 03:00.0 doesn't exist, so we didn't look for any other devices on bus 03. Booting with "pci=pcie_scan_all" is a workaround, but we don't want users to have to do that. Recognize that PCI-to-PCIe bridges originate links on their secondary interfaces. Link: https://bugzilla.kernel.org/show_bug.cgi?id=189361 Fixes: d0751b98dfa3 ("PCI: Add dev->has_secondary_link to track downstream PCIe links") Tested-by: Blake Moore <blake.moore@men.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-26PCI: designware: Check for iATU unroll only on platforms that use ATUMurali Karicheri
commit a782b5f986c3fa1cfa7f2b57941200c6a5809242 upstream. Previously we checked for iATU unroll support by reading PCIE_ATU_VIEWPORT even on platforms, e.g., Keystone, that do not have ATU ports. This can cause bad behavior such as asynchronous external aborts: OF: PCI: MEM 0x60000000..0x6fffffff -> 0x60000000 Unhandled fault: asynchronous external abort (0x1211) at 0x00000000 pgd = c0003000 [00000000] *pgd=80000800004003, *pmd=00000000 Internal error: : 1211 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-00009-g6ff59d2-dirty #7 Hardware name: Keystone task: eb878000 task.stack: eb866000 PC is at dw_pcie_setup_rc+0x24/0x380 LR is at ks_pcie_host_init+0x10/0x170 Move the dw_pcie_iatu_unroll_enabled() check so we only call it on platforms that do not use the ATU. These platforms supply their own ->rd_other_conf() and ->wr_other_conf() methods. [bhelgaas: changelog] Fixes: a0601a470537 ("PCI: designware: Add iATU Unroll feature") Fixes: 416379f9ebde ("PCI: designware: Check for iATU unroll support after initializing host") Tested-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-By: Joao Pinto <jpinto@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12powerpc/pci/rpadlpar: Fix device reference leaksJohan Hovold
commit 99e5cde5eae78bef95bfe7c16ccda87fb070149b upstream. Make sure to drop any device reference taken by vio_find_node() when adding and removing virtual I/O slots. Fixes: 5eeb8c63a38f ("[PATCH] PCI Hotplug: rpaphp: Move VIO registration") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)Alexey Kardashevskiy
commit 1c7de2b4ff886a45fbd2f4c3d4627e0f37a9dd77 upstream. There is at least one Chelsio 10Gb card which uses VPD area to store some non-standard blocks (example below). However pci_vpd_size() returns the length of the first block only assuming that there can be only one VPD "End Tag". Since 4e1a635552d3 ("vfio/pci: Use kernel VPD access functions"), VFIO blocks access beyond that offset, which prevents the guest "cxgb3" driver from probing the device. The host system does not have this problem as its driver accesses the config space directly without pci_read_vpd(). Add a quirk to override the VPD size to a bigger value. The maximum size is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h. We do not read the tag as the cxgb3 driver does as the driver supports writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes boundary. The quirk is registered for all devices supported by the cxgb3 driver. This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3 driver itself accesses VPD directly and the problem only exists with the vfio-pci driver (when cxgb3 is not running on the host and may not be even loaded) which blocks accesses beyond the first block of VPD data. However vfio-pci itself does not have quirks mechanism so we add it to PCI. This is the controller: Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030] This is what I parsed from its VPD: === b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K' 0000 Large item 42 bytes; name 0x2 Identifier String b'10 Gigabit Ethernet-SR PCI Express Adapter' 002d Large item 74 bytes; name 0x10 #00 [EC] len=7: b'D76809 ' #0a [FN] len=7: b'46K7897' #14 [PN] len=7: b'46K7897' #1e [MN] len=4: b'1037' #25 [FC] len=4: b'5769' #2c [SN] len=12: b'YL102035603V' #3b [NA] len=12: b'00145E992ED1' 007a Small item 1 bytes; name 0xf End Tag 0c00 Large item 16 bytes; name 0x2 Identifier String b'S310E-SR-X ' 0c13 Large item 234 bytes; name 0x10 #00 [PN] len=16: b'TBD ' #13 [EC] len=16: b'110107730D2 ' #26 [SN] len=16: b'97YL102035603V ' #39 [NA] len=12: b'00145E992ED1' #48 [V0] len=6: b'175000' #51 [V1] len=6: b'266666' #5a [V2] len=6: b'266666' #63 [V3] len=6: b'2000 ' #6c [V4] len=2: b'1 ' #71 [V5] len=6: b'c2 ' #7a [V6] len=6: b'0 ' #83 [V7] len=2: b'1 ' #88 [V8] len=2: b'0 ' #8d [V9] len=2: b'0 ' #92 [VA] len=2: b'0 ' #97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'... 0d00 Large item 252 bytes; name 0x11 #00 [VC] len=16: b'122310_1222 dp ' #13 [VD] len=16: b'610-0001-00 H1\x00\x00' #26 [VE] len=16: b'122310_1353 fp ' #39 [VF] len=16: b'610-0001-00 H1\x00\x00' #4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'... 0dff Small item 0 bytes; name 0xf End Tag 10f3 Large item 13315 bytes; name 0x62 !!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00' === Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12PCI: Support INTx masking on ConnectX-4 with firmware x.14.1100+Noa Osherovich
commit 1600f62534b7b3da7978b43b52231a54c24df287 upstream. Mellanox devices were marked as having INTx masking ability broken. As a result, the VFIO driver fails to start when more than one device function is passed-through to a VM if both have the same INTx pin. Prior to Connect-IB, Mellanox devices exposed to the operating system one PCI function per all ports. Starting from Connect-IB, the devices are function-per-port. When passing the second function to a VM, VFIO will fail to start. Exclude ConnectX-4, ConnectX4-Lx and Connect-IB from the list of Mellanox devices marked as having broken INTx masking: - ConnectX-4 and ConnectX4-LX firmware version is checked. If INTx masking is supported, we unmark the broken INTx masking. - Connect-IB does not support INTx currently so will not cause any problem. [bhelgaas: call pci_disable_device() always, after iounmap()] Fixes: 11e42532ada3 ("PCI: Assume all Mellanox devices have broken INTx masking") Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12PCI: Convert Mellanox broken INTx quirks to be for listed devices onlyNoa Osherovich
commit d76d2fe05fd93673d184af77255bbbc63780f4ea upstream. Change Mellanox's broken_intx_masking() quirk from an "all Mellanox devices" to a quirk for listed devices only. [bhelgaas: remove #defines, reorder to keep other quirks together] Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12PCI: Convert broken INTx masking quirks from HEADER to FINALNoa Osherovich
commit b88214ce4d7064992452765028bd50702414f15f upstream. Convert all quirk_broken_intx_masking() quirks from HEADER to FINAL. The quirk sets dev->broken_intx_masking, which is only used by pci_intx_mask_supported(), which is not needed until after FINAL quirks have been run. [bhelgaas: changelog] Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12PCI: rockchip: Correct the use of FTS maskBrian Norris
commit a45e2611b9bbd81288d97d02ce7e74a60a698d43 upstream. We're trying to mask out bits[23:8] while retaining [32:24, 7:0], but we're doing the inverse. That doesn't have too much effect, since we're setting all the [23:8] bits to 1, and the other bits are only relevant for modes we're currently not using. But we should get this right. Fixes: ca1989084054 ("PCI: rockchip: Fix wrong transmitted FTS count") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12PCI: rockchip: Fix negotiated lanes calculationShawn Lin
commit 45e9320f3a4ef9588ee50a2eb1891c4bfdbb07df upstream. The calculation of negotiated lanes is wrong: it should be shifted by PCIE_CORE_PL_CONF_LANE_SHIFT, but it is shifted by PCIE_CORE_PL_CONF_LANE_MASK instead. Let's fix it. Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support") Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12PCI/MSI: Check for NULL affinity mask in pci_irq_get_affinity()Jan Beulich
commit d1d111e073840b8dbc1ae90ba3fc274736451bdc upstream. If msi_setup_entry() fails to allocate an affinity mask, it logs a message but continues on and allocates an MSI entry with entry->affinity == NULL. Check for this case in pci_irq_get_affinity() so we don't try to dereference a NULL pointer. [bhelgaas: changelog] Fixes: ee8d41e53efe "pci/msi: Retrieve affinity for a vector" Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> CC: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09PCI: Check for PME in targeted sleep stateAlan Stern
commit 6496ebd7edf446fccf8266a1a70ffcb64252593e upstream. One some systems, the firmware does not allow certain PCI devices to be put in deep D-states. This can cause problems for wakeup signalling, if the device does not support PME# in the deepest allowed suspend state. For example, Pierre reports that on his system, ACPI does not permit his xHCI host controller to go into D3 during runtime suspend -- but D3 is the only state in which the controller can generate PME# signals. As a result, the controller goes into runtime suspend but never wakes up, so it doesn't work properly. USB devices plugged into the controller are never detected. If the device relies on PME# for wakeup signals but is not capable of generating PME# in the target state, the PCI core should accurately report that it cannot do wakeup from runtime suspend. This patch modifies the pci_dev_run_wake() routine to add this check. Reported-by: Pierre de Villemereuil <flyos@mailoo.org> Tested-by: Pierre de Villemereuil <flyos@mailoo.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: Lukas Wunner <lukas@wunner.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-01Merge tag 'pci-v4.9-fixes-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "PCI fixes: - Fix Read Completion Boundary setting, which fixes a boot failure on IBM x3850 with Mellanox MT27500 ConnectX-3 - Update some MAINTAINERS entries and email addresses" * tag 'pci-v4.9-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX) PCI: Export pcie_find_root_port PCI: designware-plat: Update author email PCI: designware: Change maintainer to Joao Pinto MAINTAINERS: Add devicetree binding to PCI i.MX6 entry MAINTAINERS: Update Richard Zhu's email address