summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2022-05-23ARM: dts: imx7: Use audio_mclk_post_div instead audio_mclk_root_clkAbel Vesa
commit 4cb7df64c732b2b9918424095c11660c2a8c4a33 upstream. The audio_mclk_root_clk was added as a gate with the CCGR121 (0x4790), but according to the reference manual, there is no such gate. Moreover, the consumer driver of the mentioned clock might gate it and leave the ECSPI2 (the true owner of that gate) hanging. So lets use the audio_mclk_post_div, which is the parent. Signed-off-by: Abel Vesa <abel.vesa@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> [ps: backport to 5.4] Signed-off-by: Philippe Schenker <philippe.schenker@toradex.com>
2022-05-19Merge tag 'v5.4.193' into toradex_5.4.yPhilippe Schenker
This is the 5.4.193 stable release
2022-05-12KVM: LAPIC: Enable timer posted-interrupt only when mwait/hlt is advertisedWanpeng Li
[ Upstream commit 1714a4eb6fb0cb79f182873cd011a8ed60ac65e8 ] As commit 0c5f81dad46 ("KVM: LAPIC: Inject timer interrupt via posted interrupt") mentioned that the host admin should well tune the guest setup, so that vCPUs are placed on isolated pCPUs, and with several pCPUs surplus for *busy* housekeeping. In this setup, it is preferrable to disable mwait/hlt/pause vmexits to keep the vCPUs in non-root mode. However, if only some guests isolated and others not, they would not have any benefit from posted timer interrupts, and at the same time lose VMX preemption timer fast paths because kvm_can_post_timer_interrupt() returns true and therefore forces kvm_can_use_hv_timer() to false. By guaranteeing that posted-interrupt timer is only used if MWAIT or HLT are done without vmexit, KVM can make a better choice and use the VMX preemption timer and the corresponding fast paths. Reported-by: Aili Yao <yaoaili@kingsoft.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Cc: Aili Yao <yaoaili@kingsoft.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1643112538-36743-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-12x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resumeWanpeng Li
[ Upstream commit 0361bdfddca20c8855ea3bdbbbc9c999912b10ff ] MSR_KVM_POLL_CONTROL is cleared on reset, thus reverting guests to host-side polling after suspend/resume. Non-bootstrap CPUs are restored correctly by the haltpoll driver because they are hot-unplugged during suspend and hot-plugged during resume; however, the BSP is not hotpluggable and remains in host-sde polling mode after the guest resume. The makes the guest pay for the cost of vmexits every time the guest enters idle. Fix it by recording BSP's haltpoll state and resuming it during guest resume. Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1650267752-46796-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-12kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMUSandipan Das
[ Upstream commit 5a1bde46f98b893cda6122b00e94c0c40a6ead3c ] On some x86 processors, CPUID leaf 0xA provides information on Architectural Performance Monitoring features. It advertises a PMU version which Qemu uses to determine the availability of additional MSRs to manage the PMCs. Upon receiving a KVM_GET_SUPPORTED_CPUID ioctl request for the same, the kernel constructs return values based on the x86_pmu_capability irrespective of the vendor. This leaf and the additional MSRs are not supported on AMD and Hygon processors. If AMD PerfMonV2 is detected, the PMU version is set to 2 and guest startup breaks because of an attempt to access a non-existent MSR. Return zeros to avoid this. Fixes: a6c06ed1a60a ("KVM: Expose the architectural performance monitoring CPUID leaf") Reported-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Message-Id: <3fef83d9c2b2f7516e8ff50d60851f29a4bcb716.1651058600.git.sandipan.das@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-12parisc: Merge model and model name into one line in /proc/cpuinfoHelge Deller
commit 5b89966bc96a06f6ad65f64ae4b0461918fcc9d3 upstream. The Linux tool "lscpu" shows the double amount of CPUs if we have "model" and "model name" in two different lines in /proc/cpuinfo. This change combines the model and the model name into one line. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-12MIPS: Fix CP0 counter erratum detection for R4k CPUsMaciej W. Rozycki
commit f0a6c68f69981214cb7858738dd2bc81475111f7 upstream. Fix the discrepancy between the two places we check for the CP0 counter erratum in along with the incorrect comparison of the R4400 revision number against 0x30 which matches none and consistently consider all R4000 and R4400 processors affected, as documented in processor errata publications[1][2][3], following the mapping between CP0 PRId register values and processor models: PRId | Processor Model ---------+-------------------- 00000422 | R4000 Revision 2.2 00000430 | R4000 Revision 3.0 00000440 | R4400 Revision 1.0 00000450 | R4400 Revision 2.0 00000460 | R4400 Revision 3.0 No other revision of either processor has ever been spotted. Contrary to what has been stated in commit ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug") marking the CP0 counter as buggy does not preclude it from being used as either a clock event or a clock source device. It just cannot be used as both at a time, because in that case clock event interrupts will be occasionally lost, and the use as a clock event device takes precedence. Compare against 0x4ff in `can_use_mips_counter' so that a single machine instruction is produced. References: [1] "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0", MIPS Technologies Inc., May 10, 1994, Erratum 53, p.13 [2] "MIPS R4400PC/SC Errata, Processor Revision 1.0", MIPS Technologies Inc., February 9, 1994, Erratum 21, p.4 [3] "MIPS R4400PC/SC Errata, Processor Revision 2.0 & 3.0", MIPS Technologies Inc., January 24, 1995, Erratum 14, p.3 Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: ce202cbb9e0b ("[MIPS] Assume R4000/R4400 newer than 3.0 don't have the mfc0 count bug") Cc: stable@vger.kernel.org # v2.6.24+ Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-09x86/cpu: Load microcode during restore_processor_state()Borislav Petkov
commit f9e14dbbd454581061c736bf70bf5cbb15ac927c upstream. When resuming from system sleep state, restore_processor_state() restores the boot CPU MSRs. These MSRs could be emulated by microcode. If microcode is not loaded yet, writing to emulated MSRs leads to unchecked MSR access error: ... PM: Calling lapic_suspend+0x0/0x210 unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0...0) at rIP: ... (native_write_msr) Call Trace: <TASK> ? restore_processor_state x86_acpi_suspend_lowlevel acpi_suspend_enter suspend_devices_and_enter pm_suspend.cold state_store kobj_attr_store sysfs_kf_write kernfs_fop_write_iter new_sync_write vfs_write ksys_write __x64_sys_write do_syscall_64 entry_SYSCALL_64_after_hwframe RIP: 0033:0x7fda13c260a7 To ensure microcode emulated MSRs are available for restoration, load the microcode on the boot CPU before restoring these MSRs. [ Pawan: write commit message and productize it. ] Fixes: e2a1256b17b1 ("x86/speculation: Restore speculation related MSRs during S3 resume") Reported-by: Kyle D. Pelton <kyle.d.pelton@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Tested-by: Kyle D. Pelton <kyle.d.pelton@intel.com> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=215841 Link: https://lore.kernel.org/r/4350dfbf785cd482d3fafa72b2b49c83102df3ce.1650386317.git.pawan.kumar.gupta@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-09x86: __memcpy_flushcache: fix wrong alignment if size > 2^32Mikulas Patocka
[ Upstream commit a6823e4e360fe975bd3da4ab156df7c74c8b07f3 ] The first "if" condition in __memcpy_flushcache is supposed to align the "dest" variable to 8 bytes and copy data up to this alignment. However, this condition may misbehave if "size" is greater than 4GiB. The statement min_t(unsigned, size, ALIGN(dest, 8) - dest); casts both arguments to unsigned int and selects the smaller one. However, the cast truncates high bits in "size" and it results in misbehavior. For example: suppose that size == 0x100000001, dest == 0x200000002 min_t(unsigned, size, ALIGN(dest, 8) - dest) == min_t(0x1, 0xe) == 0x1; ... dest += 0x1; so we copy just one byte "and" dest remains unaligned. This patch fixes the bug by replacing unsigned with size_t. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09arm64: dts: imx8mn-ddr4-evk: Describe the 32.768 kHz PMIC clockFabio Estevam
[ Upstream commit 0310b5aa0656a94102344f1e9ae2892e342a665d ] The ROHM BD71847 PMIC has a 32.768 kHz clock. Describe the PMIC clock to fix the following boot errors: bd718xx-clk bd71847-clk.1.auto: No parent clk found bd718xx-clk: probe of bd71847-clk.1.auto failed with error -22 Based on the same fix done for imx8mm-evk as per commit a6a355ede574 ("arm64: dts: imx8mm-evk: Add 32.768 kHz clock to PMIC") Fixes: 3e44dd09736d ("arm64: dts: imx8mn-ddr4-evk: Add rohm,bd71847 PMIC support") Signed-off-by: Fabio Estevam <festevam@denx.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09ARM: dts: imx6ull-colibri: fix vqmmc regulatorMax Krummenacher
[ Upstream commit 45974e4276a8d6653394f66666fc57d8ffa6de9a ] The correct spelling for the property is gpios. Otherwise, the regulator will neither reserve nor control any GPIOs. Thus, any SD/MMC card which can use UHS-I modes will fail. Fixes: c2e4987e0e02 ("ARM: dts: imx6ull: add Toradex Colibri iMX6ULL support") Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com> Signed-off-by: Denys Drozdov <denys.drozdov@toradex.com> Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09ARM: dts: logicpd-som-lv: Fix wrong pinmuxing on OMAP35Adam Ford
[ Upstream commit 46ff3df87215ff42c0cd2c4bdb7d74540384a69c ] The pinout of the OMAP35 and DM37 variants of the SOM-LV are the same, but the macros which define the pinmuxing are different between OMAP3530 and DM3730. The pinmuxing was correct for for the DM3730, but wrong for the OMAP3530. Since the boot loader was correctly pin-muxing the pins, this was not obvious. As the bootloader not guaranteed to pinmux all the pins any more, this causes an issue, so the pinmux needs to be moved from a common file to their respective board files. Fixes: f8a2e3ff7103 ("ARM: dts: Add minimal support for LogicPD OMAP35xx SOM-LV devkit") Signed-off-by: Adam Ford <aford173@gmail.com> Message-Id: <20220303171818.11060-1-aford173@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09ARM: dts: am3517-evm: Fix misc pinmuxingAdam Ford
[ Upstream commit 942da3af32b2288e674736eb159d1fc676261691 ] The bootloader for the AM3517 has previously done much of the pin muxing, but as the bootloader is moving more and more to a model based on the device tree, it may no longer automatically mux the pins, so it is necessary to add the pinmuxing to the Linux device trees so the respective peripherals can remain functional. Fixes: 6ed1d7997561 ("ARM: dts: am3517-evm: Add support for UI board and Audio") Signed-off-by: Adam Ford <aford173@gmail.com> Message-Id: <20220226214820.747847-1-aford173@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09ARM: dts: Fix mmc order for omap3-gta04H. Nikolaus Schaller
[ Upstream commit 09269dd050094593fc747f2a5853d189fefcb6b5 ] Commit a1ebdb374199 ("ARM: dts: Fix swapped mmc order for omap3") introduces general mmc aliases. Let's tailor them to the need of the GTA04 board which does not make use of mmc2 and mmc3 interfaces. Fixes: a1ebdb374199 ("ARM: dts: Fix swapped mmc order for omap3") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Message-Id: <dc9173ee3d391d9e92b7ab8ed4f84b29f0a21c83.1646744420.git.hns@goldelico.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09ARM: dts: at91: Map MCLK for wm8731 on at91sam9g20ekMark Brown
[ Upstream commit 0e486fe341fabd8e583f3d601a874cd394979c45 ] The MCLK of the WM8731 on the AT91SAM9G20-EK board is connected to the PCK0 output of the SoC and is expected to be set to 12MHz. Previously this was mapped using pre-common clock API calls in the audio machine driver but the conversion to the common clock framework broke that so describe things in the DT instead. Fixes: ff78a189b0ae55f ("ARM: at91: remove old at91-specific clock driver") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20220404102806.581374-2-broonie@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09ARM: OMAP2+: Fix refcount leak in omap_gic_of_initMiaoqian Lin
[ Upstream commit 0f83e6b4161617014017a694888dd8743f46f071 ] The of_find_compatible_node() function returns a node pointer with refcount incremented, We should use of_node_put() on it when done Add the missing of_node_put() to release the refcount. Fixes: fd1c07861491 ("ARM: OMAP4: Fix the init code to have OMAP4460 errata available in DT build") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Message-Id: <20220309104302.18398-1-linmq006@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09ARM: dts: imx6qdl-apalis: Fix sgtl5000 detection issueFabio Estevam
[ Upstream commit fa51e1dc4b91375bc18349663a52395ad585bd3c ] On a custom carrier board with a i.MX6Q Apalis SoM, the sgtl5000 codec on the SoM is often not detected and the following error message is seen when the sgtl5000 driver tries to read the ID register: sgtl5000 1-000a: Error reading chip id -6 The reason for the error is that the MCLK clock is not provided early enough. Fix the problem by describing the MCLK pinctrl inside the codec node instead of placing it inside the audmux pinctrl group. With this change applied the sgtl5000 is always detected on every boot. Fixes: 693e3ffaae5a ("ARM: dts: imx6: Add support for Toradex Apalis iMX6Q/D SoM") Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Tim Harvey <tharvey@gateworks.com> Acked-by: Max Krummenacher <max.krummenacher@toradex.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09arm64: dts: meson: remove CPU opps below 1GHz for SM1 boardsChristian Hewitt
[ Upstream commit fd86d85401c2049f652293877c0f7e6e5afc3bbc ] Amlogic SM1 devices experience CPU stalls and random board wedges when the system idles and CPU cores clock down to lower opp points. Recent vendor kernels include a change to remove 100-250MHz and other distro sources also remove the 500/667MHz points. Unless all 100-667Mhz opps are removed or the CPU governor forced to performance stalls are still observed, so let's remove them to improve stability and uptime. Fixes: 3d9e76483049 ("arm64: dts: meson-sm1-sei610: enable DVFS") Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20220210100638.19130-3-christianshewitt@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-05-09arm64: dts: meson: remove CPU opps below 1GHz for G12B boardsChristian Hewitt
[ Upstream commit 6c4d636bc00dc17c63ffb2a73a0da850240e26e3 ] Amlogic G12B devices experience CPU stalls and random board wedges when the system idles and CPU cores clock down to lower opp points. Recent vendor kernels include a change to remove 100-250MHz and other distro sources also remove the 500/667MHz points. Unless all 100-667Mhz opps are removed or the CPU governor forced to performance stalls are still observed, so let's remove them to improve stability and uptime. Fixes: b96d4e92709b ("arm64: dts: meson-g12b: support a311d and s922x cpu operating points") Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20220210100638.19130-2-christianshewitt@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-27ARC: entry: fix syscall_trace_exit argumentSergey Matyukevich
commit b1c6ecfdd06907554518ec384ce8e99889d15193 upstream. Function syscall_trace_exit expects pointer to pt_regs. However r0 is also used to keep syscall return value. Restore pointer to pt_regs before calling syscall_trace_exit. Cc: <stable@vger.kernel.org> Signed-off-by: Sergey Matyukevich <sergey.matyukevich@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27xtensa: fix a7 clobbering in coprocessor context load/storeMax Filippov
commit 839769c35477d4acc2369e45000ca7b0b6af39a7 upstream. Fast coprocessor exception handler saves a3..a6, but coprocessor context load/store code uses a4..a7 as temporaries, potentially clobbering a7. 'Potentially' because coprocessor state load/store macros may not use all four temporary registers (and neither FPU nor HiFi macros do). Use a3..a6 as intended. Cc: stable@vger.kernel.org Fixes: c658eac628aa ("[XTENSA] Add support for configurable registers and coprocessors") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27xtensa: patch_text: Fixup last cpu should be masterGuo Ren
commit ee69d4be8fd064cd08270b4808d2dfece3614ee0 upstream. These patch_text implementations are using stop_machine_cpuslocked infrastructure with atomic cpu_count. The original idea: When the master CPU patch_text, the others should wait for it. But current implementation is using the first CPU as master, which couldn't guarantee the remaining CPUs are waiting. This patch changes the last CPU as the master to solve the potential risk. Fixes: 64711f9a47d4 ("xtensa: implement jump_label support") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: <stable@vger.kernel.org> Message-Id: <20220407073323.743224-4-guoren@kernel.org> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27powerpc/perf: Fix power9 event alternativesAthira Rajeev
[ Upstream commit 0dcad700bb2776e3886fe0a645a4bf13b1e747cd ] When scheduling a group of events, there are constraint checks done to make sure all events can go in a group. Example, one of the criteria is that events in a group cannot use the same PMC. But platform specific PMU supports alternative event for some of the event codes. During perf_event_open(), if any event group doesn't match constraint check criteria, further lookup is done to find alternative event. By current design, the array of alternatives events in PMU code is expected to be sorted by column 0. This is because in find_alternative() the return criteria is based on event code comparison. ie. "event < ev_alt[i][0])". This optimisation is there since find_alternative() can be called multiple times. In power9 PMU code, the alternative event array is not sorted properly and hence there is breakage in finding alternative events. To work with existing logic, fix the alternative event array to be sorted by column 0 for power9-pmu.c Results: With alternative events, multiplexing can be avoided. That is, for example, in power9 PM_LD_MISS_L1 (0x3e054) has alternative event, PM_LD_MISS_L1_ALT (0x400f0). This is an identical event which can be programmed in a different PMC. Before: # perf stat -e r3e054,r300fc Performance counter stats for 'system wide': 1057860 r3e054 (50.21%) 379 r300fc (49.79%) 0.944329741 seconds time elapsed Since both the events are using PMC3 in this case, they are multiplexed here. After: # perf stat -e r3e054,r300fc Performance counter stats for 'system wide': 1006948 r3e054 182 r300fc Fixes: 91e0bd1e6251 ("powerpc/perf: Add PM_LD_MISS_L1 and PM_BR_2PATH to power9 event list") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220419114828.89843-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-27KVM: PPC: Fix TCE handling for VFIOAlexey Kardashevskiy
[ Upstream commit 26a62b750a4e6364b0393562f66759b1494c3a01 ] The LoPAPR spec defines a guest visible IOMMU with a variable page size. Currently QEMU advertises 4K, 64K, 2M, 16MB pages, a Linux VM picks the biggest (16MB). In the case of a passed though PCI device, there is a hardware IOMMU which does not support all pages sizes from the above - P8 cannot do 2MB and P9 cannot do 16MB. So for each emulated 16M IOMMU page we may create several smaller mappings ("TCEs") in the hardware IOMMU. The code wrongly uses the emulated TCE index instead of hardware TCE index in error handling. The problem is easier to see on POWER8 with multi-level TCE tables (when only the first level is preallocated) as hash mode uses real mode TCE hypercalls handlers. The kernel starts using indirect tables when VMs get bigger than 128GB (depends on the max page order). The very first real mode hcall is going to fail with H_TOO_HARD as in the real mode we cannot allocate memory for TCEs (we can in the virtual mode) but on the way out the code attempts to clear hardware TCEs using emulated TCE indexes which corrupts random kernel memory because it_offset==1<<59 is subtracted from those indexes and the resulting index is out of the TCE table bounds. This fixes kvmppc_clear_tce() to use the correct TCE indexes. While at it, this fixes TCE cache invalidation which uses emulated TCE indexes instead of the hardware ones. This went unnoticed as 64bit DMA is used these days and VMs map all RAM in one go and only then do DMA and this is when the TCE cache gets populated. Potentially this could slow down mapping, however normally 16MB emulated pages are backed by 64K hardware pages so it is one write to the "TCE Kill" per 256 updates which is not that bad considering the size of the cache (1024 TCEs or so). Fixes: ca1fc489cfa0 ("KVM: PPC: Book3S: Allow backing bigger guest IOMMU pages with smaller physical pages") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220420050840.328223-1-aik@ozlabs.ru Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-27stat: fix inconsistency between struct stat and struct compat_statMikulas Patocka
[ Upstream commit 932aba1e169090357a77af18850a10c256b50819 ] struct stat (defined in arch/x86/include/uapi/asm/stat.h) has 32-bit st_dev and st_rdev; struct compat_stat (defined in arch/x86/include/asm/compat.h) has 16-bit st_dev and st_rdev followed by a 16-bit padding. This patch fixes struct compat_stat to match struct stat. [ Historical note: the old x86 'struct stat' did have that 16-bit field that the compat layer had kept around, but it was changes back in 2003 by "struct stat - support larger dev_t": https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=e95b2065677fe32512a597a79db94b77b90c968d and back in those days, the x86_64 port was still new, and separate from the i386 code, and had already picked up the old version with a 16-bit st_dev field ] Note that we can't change compat_dev_t because it is used by compat_loop_info. Also, if the st_dev and st_rdev values are 32-bit, we don't have to use old_valid_dev to test if the value fits into them. This fixes -EOVERFLOW on filesystems that are on NVMe because NVMe uses the major number 259. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-27ARM: vexpress/spc: Avoid negative array index when !SMPKees Cook
[ Upstream commit b3f1dd52c991d79118f35e6d1bf4d7cb09882e38 ] When building multi_v7_defconfig+CONFIG_SMP=n, -Warray-bounds exposes a couple negative array index accesses: arch/arm/mach-vexpress/spc.c: In function 've_spc_clk_init': arch/arm/mach-vexpress/spc.c:583:21: warning: array subscript -1 is below array bounds of 'bool[2]' {aka '_Bool[2]'} [-Warray-bounds] 583 | if (init_opp_table[cluster]) | ~~~~~~~~~~~~~~^~~~~~~~~ arch/arm/mach-vexpress/spc.c:556:7: note: while referencing 'init_opp_table' 556 | bool init_opp_table[MAX_CLUSTERS] = { false }; | ^~~~~~~~~~~~~~ arch/arm/mach-vexpress/spc.c:592:18: warning: array subscript -1 is below array bounds of 'bool[2]' {aka '_Bool[2]'} [-Warray-bounds] 592 | init_opp_table[cluster] = true; | ~~~~~~~~~~~~~~^~~~~~~~~ arch/arm/mach-vexpress/spc.c:556:7: note: while referencing 'init_opp_table' 556 | bool init_opp_table[MAX_CLUSTERS] = { false }; | ^~~~~~~~~~~~~~ Skip this logic when built !SMP. Link: https://lore.kernel.org/r/20220331190443.851661-1-keescook@chromium.org Cc: Liviu Dudau <liviu.dudau@arm.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Russell King <linux@armlinux.org.uk> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20ARM: davinci: da850-evm: Avoid NULL pointer dereferenceNathan Chancellor
commit 83a1cde5c74bfb44b49cb2a940d044bb2380f4ea upstream. With newer versions of GCC, there is a panic in da850_evm_config_emac() when booting multi_v5_defconfig in QEMU under the palmetto-bmc machine: Unable to handle kernel NULL pointer dereference at virtual address 00000020 pgd = (ptrval) [00000020] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 5.15.0 #1 Hardware name: Generic DT based system PC is at da850_evm_config_emac+0x1c/0x120 LR is at do_one_initcall+0x50/0x1e0 The emac_pdata pointer in soc_info is NULL because davinci_soc_info only gets populated on davinci machines but da850_evm_config_emac() is called on all machines via device_initcall(). Move the rmii_en assignment below the machine check so that it is only dereferenced when running on a supported SoC. Fixes: bae105879f2f ("davinci: DA850/OMAP-L138 EVM: implement autodetect of RMII PHY") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Bartosz Golaszewski <brgl@bgdev.pl> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/YcS4xVWs6bQlQSPC@archlinux-ax161/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bitKefeng Wang
[ Upstream commit ffa0b64e3be58519ae472ea29a1a1ad681e32f48 ] mpe: On 64-bit Book3E vmalloc space starts at 0x8000000000000000. Because of the way __pa() works we have: __pa(0x8000000000000000) == 0, and therefore virt_to_pfn(0x8000000000000000) == 0, and therefore virt_addr_valid(0x8000000000000000) == true Which is wrong, virt_addr_valid() should be false for vmalloc space. In fact all vmalloc addresses that alias with a valid PFN will return true from virt_addr_valid(). That can cause bugs with hardened usercopy as described below by Kefeng Wang: When running ethtool eth0 on 64-bit Book3E, a BUG occurred: usercopy: Kernel memory exposure attempt detected from SLUB object not in SLUB page?! (offset 0, size 1048)! kernel BUG at mm/usercopy.c:99 ... usercopy_abort+0x64/0xa0 (unreliable) __check_heap_object+0x168/0x190 __check_object_size+0x1a0/0x200 dev_ethtool+0x2494/0x2b20 dev_ioctl+0x5d0/0x770 sock_do_ioctl+0xf0/0x1d0 sock_ioctl+0x3ec/0x5a0 __se_sys_ioctl+0xf0/0x160 system_call_exception+0xfc/0x1f0 system_call_common+0xf8/0x200 The code shows below, data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN)); copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN)) The data is alloced by vmalloc(), virt_addr_valid(ptr) will return true on 64-bit Book3E, which leads to the panic. As commit 4dd7554a6456 ("powerpc/64: Add VIRTUAL_BUG_ON checks for __va and __pa addresses") does, make sure the virt addr above PAGE_OFFSET in the virt_addr_valid() for 64-bit, also add upper limit check to make sure the virt is below high_memory. Meanwhile, for 32-bit PAGE_OFFSET is the virtual address of the start of lowmem, high_memory is the upper low virtual address, the check is suitable for 32-bit, this will fix the issue mentioned in commit 602946ec2f90 ("powerpc: Set max_mapnr correctly") too. On 32-bit there is a similar problem with high memory, that was fixed in commit 602946ec2f90 ("powerpc: Set max_mapnr correctly"), but that commit breaks highmem and needs to be reverted. We can't easily fix __pa(), we have code that relies on its current behaviour. So for now add extra checks to virt_addr_valid(). For 64-bit Book3S the extra checks are not necessary, the combination of virt_to_pfn() and pfn_valid() should yield the correct result, but they are harmless. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Add additional change log detail] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220406145802.538416-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20arm64: alternatives: mark patch_alternative() as `noinstr`Joey Gouly
[ Upstream commit a2c0b0fbe01419f8f5d1c0b9c581631f34ffce8b ] The alternatives code must be `noinstr` such that it does not patch itself, as the cache invalidation is only performed after all the alternatives have been applied. Mark patch_alternative() as `noinstr`. Mark branch_insn_requires_update() and get_alt_insn() with `__always_inline` since they are both only called through patch_alternative(). Booting a kernel in QEMU TCG with KCSAN=y and ARM64_USE_LSE_ATOMICS=y caused a boot hang: [ 0.241121] CPU: All CPU(s) started at EL2 The alternatives code was patching the atomics in __tsan_read4() from LL/SC atomics to LSE atomics. The following fragment is using LL/SC atomics in the .text section: | <__tsan_unaligned_read4+304>: ldxr x6, [x2] | <__tsan_unaligned_read4+308>: add x6, x6, x5 | <__tsan_unaligned_read4+312>: stxr w7, x6, [x2] | <__tsan_unaligned_read4+316>: cbnz w7, <__tsan_unaligned_read4+304> This LL/SC atomic sequence was to be replaced with LSE atomics. However since the alternatives code was instrumentable, __tsan_read4() was being called after only the first instruction was replaced, which led to the following code in memory: | <__tsan_unaligned_read4+304>: ldadd x5, x6, [x2] | <__tsan_unaligned_read4+308>: add x6, x6, x5 | <__tsan_unaligned_read4+312>: stxr w7, x6, [x2] | <__tsan_unaligned_read4+316>: cbnz w7, <__tsan_unaligned_read4+304> This caused an infinite loop as the `stxr` instruction never completed successfully, so `w7` was always 0. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220405104733.11476-1-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15cpuidle: PSCI: Move the `has_lpi` check to the beginning of the functionMario Limonciello
commit 01f6c7338ce267959975da65d86ba34f44d54220 upstream. Currently the first thing checked is whether the PCSI cpu_suspend function has been initialized. Another change will be overloading `acpi_processor_ffh_lpi_probe` and calling it sooner. So make the `has_lpi` check the first thing checked to prepare for that change. Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15arm64: module: remove (NOLOAD) from linker scriptFangrui Song
commit 4013e26670c590944abdab56c4fa797527b74325 upstream. On ELF, (NOLOAD) sets the section type to SHT_NOBITS[1]. It is conceptually inappropriate for .plt and .text.* sections which are always SHT_PROGBITS. In GNU ld, if PLT entries are needed, .plt will be SHT_PROGBITS anyway and (NOLOAD) will be essentially ignored. In ld.lld, since https://reviews.llvm.org/D118840 ("[ELF] Support (TYPE=<value>) to customize the output section type"), ld.lld will report a `section type mismatch` error. Just remove (NOLOAD) to fix the error. [1] https://lld.llvm.org/ELF/linker_script.html As of today, "The section should be marked as not loadable" on https://sourceware.org/binutils/docs/ld/Output-Section-Type.html is outdated for ELF. Tested-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Fangrui Song <maskray@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220218081209.354383-1-maskray@google.com Signed-off-by: Will Deacon <will@kernel.org> [nathan: Fix conflicts due to lack of 596b0474d3d9] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15arm64: patch_text: Fixup last cpu should be masterGuo Ren
commit 31a099dbd91e69fcab55eef4be15ed7a8c984918 upstream. These patch_text implementations are using stop_machine_cpuslocked infrastructure with atomic cpu_count. The original idea: When the master CPU patch_text, the others should wait for it. But current implementation is using the first CPU as master, which couldn't guarantee the remaining CPUs are waiting. This patch changes the last CPU as the master to solve the potential risk. Fixes: ae16480785de ("arm64: introduce interfaces to hotpatch kernel and module code") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220407073323.743224-2-guoren@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15x86/speculation: Restore speculation related MSRs during S3 resumePawan Gupta
commit e2a1256b17b16f9b9adf1b6fea56819e7b68e463 upstream. After resuming from suspend-to-RAM, the MSRs that control CPU's speculative execution behavior are not being restored on the boot CPU. These MSRs are used to mitigate speculative execution vulnerabilities. Not restoring them correctly may leave the CPU vulnerable. Secondary CPU's MSRs are correctly being restored at S3 resume by identify_secondary_cpu(). During S3 resume, restore these MSRs for boot CPU when restoring its processor state. Fixes: 772439717dbf ("x86/bugs/intel: Set proper CPU features and setup RDS") Reported-by: Neelima Krishnan <neelima.krishnan@intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Acked-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15x86/pm: Save the MSR validity status at context setupPawan Gupta
commit 73924ec4d560257004d5b5116b22a3647661e364 upstream. The mechanism to save/restore MSRs during S3 suspend/resume checks for the MSR validity during suspend, and only restores the MSR if its a valid MSR. This is not optimal, as an invalid MSR will unnecessarily throw an exception for every suspend cycle. The more invalid MSRs, higher the impact will be. Check and save the MSR validity at setup. This ensures that only valid MSRs that are guaranteed to not throw an exception will be attempted during suspend. Fixes: 7a9c2dd08ead ("x86/pm: Introduce quirk framework to save/restore extra MSR registers around suspend/resume") Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15KVM: arm64: Check arm64_get_bp_hardening_data() didn't return NULLJames Morse
Will reports that with CONFIG_EXPERT=y and CONFIG_HARDEN_BRANCH_PREDICTOR=n, the kernel dereferences a NULL pointer during boot: [ 2.384444] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 2.384461] pstate: 20400085 (nzCv daIf +PAN -UAO) [ 2.384472] pc : cpu_hyp_reinit+0x114/0x30c [ 2.384476] lr : cpu_hyp_reinit+0x80/0x30c [ 2.384529] Call trace: [ 2.384533] cpu_hyp_reinit+0x114/0x30c [ 2.384537] _kvm_arch_hardware_enable+0x30/0x54 [ 2.384541] flush_smp_call_function_queue+0xe4/0x154 [ 2.384544] generic_smp_call_function_single_interrupt+0x10/0x18 [ 2.384549] ipi_handler+0x170/0x2b0 [ 2.384555] handle_percpu_devid_fasteoi_ipi+0x120/0x1cc [ 2.384560] __handle_domain_irq+0x9c/0xf4 [ 2.384563] gic_handle_irq+0x6c/0xe4 [ 2.384566] el1_irq+0xf0/0x1c0 [ 2.384570] arch_cpu_idle+0x28/0x44 [ 2.384574] do_idle+0x100/0x2a8 [ 2.384577] cpu_startup_entry+0x20/0x24 [ 2.384581] secondary_start_kernel+0x1b0/0x1cc [ 2.384589] Code: b9469d08 7100011f 540003ad 52800208 (f9400108) [ 2.384600] ---[ end trace 266d08dbf96ff143 ]--- [ 2.385171] Kernel panic - not syncing: Fatal exception in interrupt In this configuration arm64_get_bp_hardening_data() returns NULL. Add a check in kvm_get_hyp_vector(). Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/linux-arm-kernel/20220408120041.GB27685@willie-the-truck/ Fixes: 26129ea2953b ("KVM: arm64: Add templates for BHB mitigation sequences") Cc: stable@vger.kernel.org # 5.4.x Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15parisc: Fix patch code locking and flushingJohn David Anglin
[ Upstream commit a9fe7fa7d874a536e0540469f314772c054a0323 ] This change fixes the following: 1) The flags variable is not initialized. Always use raw_spin_lock_irqsave and raw_spin_unlock_irqrestore to serialize patching. 2) flush_kernel_vmap_range is primarily intended for DMA flushes. Since __patch_text_multiple is often called with interrupts disabled, it is better to directly call flush_kernel_dcache_range_asm and flush_kernel_icache_range_asm. This avoids an extra call. 3) The final call to flush_icache_range is unnecessary. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32Dongli Zhang
[ Upstream commit eed05744322da07dd7e419432dcedf3c2e017179 ] The sched_clock() can be used very early since commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition, with commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock time from 0"), kdump kernel in Xen HVM guest may panic at very early stage when accessing &__this_cpu_read(xen_vcpu)->time as in below: setup_arch() -> init_hypervisor_platform() -> x86_init.hyper.init_platform = xen_hvm_guest_init() -> xen_hvm_init_time_ops() -> xen_clocksource_read() -> src = &__this_cpu_read(xen_vcpu)->time; This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info' embedded inside 'shared_info' during early stage until xen_vcpu_setup() is used to allocate/relocate 'vcpu_info' for boot cpu at arbitrary address. However, when Xen HVM guest panic on vcpu >= 32, since xen_vcpu_info_reset(0) would set per_cpu(xen_vcpu, cpu) = NULL when vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic. This patch calls xen_hvm_init_time_ops() again later in xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is registered when the boot vcpu is >= 32. This issue can be reproduced on purpose via below command at the guest side when kdump/kexec is enabled: "taskset -c 33 echo c > /proc/sysrq-trigger" The bugfix for PVM is not implemented due to the lack of testing environment. [boris: xen_hvm_init_time_ops() returns on errors instead of jumping to end] Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20220302164032.14569-3-dongli.zhang@oracle.com Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15xtensa: fix DTC warning unit_address_formatMax Filippov
[ Upstream commit e85d29ba4b24f68e7a78cb85c55e754362eeb2de ] DTC issues the following warnings when building xtfpga device trees: /soc/flash@00000000/partition@0x0: unit name should not have leading "0x" /soc/flash@00000000/partition@0x6000000: unit name should not have leading "0x" /soc/flash@00000000/partition@0x6800000: unit name should not have leading "0x" /soc/flash@00000000/partition@0x7fe0000: unit name should not have leading "0x" Drop leading 0x from flash partition unit names. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15MIPS: fix fortify panic when copying asm exception handlersAlexander Lobakin
[ Upstream commit d17b66417308996e7e64b270a3c7f3c1fbd4cfc8 ] With KCFLAGS="-O3", I was able to trigger a fortify-source memcpy() overflow panic on set_vi_srs_handler(). Although O3 level is not supported in the mainline, under some conditions that may've happened with any optimization settings, it's just a matter of inlining luck. The panic itself is correct, more precisely, 50/50 false-positive and not at the same time. From the one side, no real overflow happens. Exception handler defined in asm just gets copied to some reserved places in the memory. But the reason behind is that C code refers to that exception handler declares it as `char`, i.e. something of 1 byte length. It's obvious that the asm function itself is way more than 1 byte, so fortify logics thought we are going to past the symbol declared. The standard way to refer to asm symbols from C code which is not supposed to be called from C is to declare them as `extern const u8[]`. This is fully correct from any point of view, as any code itself is just a bunch of bytes (including 0 as it is for syms like _stext/_etext/etc.), and the exact size is not known at the moment of compilation. Adjust the type of the except_vec_vi_*() and related variables. Make set_handler() take `const` as a second argument to avoid cast-away warnings and give a little more room for optimization. Signed-off-by: Alexander Lobakin <alobakin@pm.me> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15mips: ralink: fix a refcount leak in ill_acc_of_setup()Hangyu Hua
[ Upstream commit 4a0a1436053b17e50b7c88858fb0824326641793 ] of_node_put(np) needs to be called when pdev == NULL. Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15powerpc: Set crashkernel offset to mid of RMA regionSourabh Jain
[ Upstream commit 7c5ed82b800d8615cdda00729e7b62e5899f0b13 ] On large config LPARs (having 192 and more cores), Linux fails to boot due to insufficient memory in the first memblock. It is due to the memory reservation for the crash kernel which starts at 128MB offset of the first memblock. This memory reservation for the crash kernel doesn't leave enough space in the first memblock to accommodate other essential system resources. The crash kernel start address was set to 128MB offset by default to ensure that the crash kernel get some memory below the RMA region which is used to be of size 256MB. But given that the RMA region size can be 512MB or more, setting the crash kernel offset to mid of RMA size will leave enough space for the kernel to allocate memory for other system resources. Since the above crash kernel offset change is only applicable to the LPAR platform, the LPAR feature detection is pushed before the crash kernel reservation. The rest of LPAR specific initialization will still be done during pseries_probe_fw_features as usual. This patch is dependent on changes to paca allocation for boot CPU. It expect boot CPU to discover 1T segment support which is introduced by the patch posted here: https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-January/239175.html Reported-by: Abdul haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220204085601.107257-1-sourabhjain@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15powerpc: dts: t104xrdb: fix phy type for FMAN 4/5Maxim Kiselev
[ Upstream commit 17846485dff91acce1ad47b508b633dffc32e838 ] T1040RDB has two RTL8211E-VB phys which requires setting of internal delays for correct work. Changing the phy-connection-type property to `rgmii-id` will fix this issue. Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com> Reviewed-by: Maxim Kochetkov <fido_max@inbox.ru> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211230151123.1258321-1-bigunclemax@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15KVM: x86/svm: Clear reserved bits written to PerfEvtSeln MSRsJim Mattson
[ Upstream commit 9b026073db2f1ad0e4d8b61c83316c8497981037 ] AMD EPYC CPUs never raise a #GP for a WRMSR to a PerfEvtSeln MSR. Some reserved bits are cleared, and some are not. Specifically, on Zen3/Milan, bits 19 and 42 are not cleared. When emulating such a WRMSR, KVM should not synthesize a #GP, regardless of which bits are set. However, undocumented bits should not be passed through to the hardware MSR. So, rather than checking for reserved bits and synthesizing a #GP, just clear the reserved bits. This may seem pedantic, but since KVM currently does not support the "Host/Guest Only" bits (41:40), it is necessary to clear these bits rather than synthesizing #GP, because some popular guests (e.g Linux) will set the "Host Only" bit even on CPUs that don't support EFER.SVME, and they don't expect a #GP. For example, root@Ubuntu1804:~# perf stat -e r26 -a sleep 1 Performance counter stats for 'system wide': 0 r26 1.001070977 seconds time elapsed Feb 23 03:59:58 Ubuntu1804 kernel: [ 405.379957] unchecked MSR access error: WRMSR to 0xc0010200 (tried to write 0x0000020000130026) at rIP: 0xffffffff9b276a28 (native_write_msr+0x8/0x30) Feb 23 03:59:58 Ubuntu1804 kernel: [ 405.379958] Call Trace: Feb 23 03:59:58 Ubuntu1804 kernel: [ 405.379963] amd_pmu_disable_event+0x27/0x90 Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Reported-by: Lotus Fenn <lotusf@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Like Xu <likexu@tencent.com> Reviewed-by: David Dunn <daviddunn@google.com> Message-Id: <20220226234131.2167175-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15ARM: 9187/1: JIVE: fix return value of __setup handlerRandy Dunlap
[ Upstream commit 8b2360c7157b462c4870d447d1e65d30ef31f9aa ] __setup() handlers should return 1 to obsolete_checksetup() in init/main.c to indicate that the boot option has been handled. A return of 0 causes the boot option/value to be listed as an Unknown kernel parameter and added to init's (limited) argument or environment strings. Also, error return codes don't mean anything to obsolete_checksetup() -- only non-zero (usually 1) or zero. So return 1 from jive_mtdset(). Fixes: 9db829f485c5 ("[ARM] JIVE: Initial machine support for Logitech Jive") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-samsung-soc@vger.kernel.org Cc: patches@armlinux.org.uk Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15riscv module: remove (NOLOAD)Fangrui Song
[ Upstream commit 60210a3d86dc57ce4a76a366e7841dda746a33f7 ] On ELF, (NOLOAD) sets the section type to SHT_NOBITS[1]. It is conceptually inappropriate for .plt, .got, and .got.plt sections which are always SHT_PROGBITS. In GNU ld, if PLT entries are needed, .plt will be SHT_PROGBITS anyway and (NOLOAD) will be essentially ignored. In ld.lld, since https://reviews.llvm.org/D118840 ("[ELF] Support (TYPE=<value>) to customize the output section type"), ld.lld will report a `section type mismatch` error (later changed to a warning). Just remove (NOLOAD) to fix the warning. [1] https://lld.llvm.org/ELF/linker_script.html As of today, "The section should be marked as not loadable" on https://sourceware.org/binutils/docs/ld/Output-Section-Type.html is outdated for ELF. Link: https://github.com/ClangBuiltLinux/linux/issues/1597 Fixes: ab1ef68e5401 ("RISC-V: Add sections of PLT and GOT for kernel module") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15KVM: x86: Forbid VMM to set SYNIC/STIMER MSRs when SynIC wasn't activatedVitaly Kuznetsov
commit b1e34d325397a33d97d845e312d7cf2a8b646b44 upstream. Setting non-zero values to SYNIC/STIMER MSRs activates certain features, this should not happen when KVM_CAP_HYPERV_SYNIC{,2} was not activated. Note, it would've been better to forbid writing anything to SYNIC/STIMER MSRs, including zeroes, however, at least QEMU tries clearing HV_X64_MSR_STIMER0_CONFIG without SynIC. HV_X64_MSR_EOM MSR is somewhat 'special' as writing zero there triggers an action, this also should not happen when SynIC wasn't activated. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220325132140.25650-4-vkuznets@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15KVM: x86/mmu: do compare-and-exchange of gPTE via the user addressPaolo Bonzini
commit 2a8859f373b0a86f0ece8ec8312607eacf12485d upstream. FNAME(cmpxchg_gpte) is an inefficient mess. It is at least decent if it can go through get_user_pages_fast(), but if it cannot then it tries to use memremap(); that is not just terribly slow, it is also wrong because it assumes that the VM_PFNMAP VMA is contiguous. The right way to do it would be to do the same thing as hva_to_pfn_remapped() does since commit add6a0cd1c5b ("KVM: MMU: try to fix up page faults before giving up", 2016-07-05), using follow_pte() and fixup_user_fault() to determine the correct address to use for memremap(). To do this, one could for example extract hva_to_pfn() for use outside virt/kvm/kvm_main.c. But really there is no reason to do that either, because there is already a perfectly valid address to do the cmpxchg() on, only it is a userspace address. That means doing user_access_begin()/user_access_end() and writing the code in assembly to handle exceptions correctly. Worse, the guest PTE can be 8-byte even on i686 so there is the extra complication of using cmpxchg8b to account for. But at least it is an efficient mess. (Thanks to Linus for suggesting improvement on the inline assembly). Reported-by: Qiuhao Li <qiuhao@sysec.org> Reported-by: Gaoning Pan <pgn@zju.edu.cn> Reported-by: Yongkang Jia <kangel@zju.edu.cn> Reported-by: syzbot+6cde2282daa792c49ab8@syzkaller.appspotmail.com Debugged-by: Tadeusz Struk <tadeusz.struk@linaro.org> Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Cc: stable@vger.kernel.org Fixes: bd53cb35a3e9 ("X86/KVM: Handle PFNs outside of kernel reach when touching GPTEs") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15um: Fix uml_mconsole stop/goAnton Ivanov
commit 1a3a6a2a035bb6c3a7ef4c788d8fd69a7b2d6284 upstream. Moving to an EPOLL based IRQ controller broke uml_mconsole stop/go commands. This fixes it and restores stop/go functionality. Fixes: ff6a17989c08 ("Epoll based IRQ controller") Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15ARM: dts: spear13xx: Update SPI dma propertiesKuldeep Singh
commit 31d3687d6017c7ce6061695361598d9cda70807a upstream. Reorder dmas and dma-names property for spi controller node to make it compliant with bindings. Fixes: 6e8887f60f60 ("ARM: SPEAr13xx: Pass generic DW DMAC platform data from DT") Signed-off-by: Kuldeep Singh <singh.kuldeep87k@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20220326042313.97862-2-singh.kuldeep87k@gmail.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15ARM: dts: spear1340: Update serial node propertiesKuldeep Singh
commit 583d6b0062640def86f3265aa1042ecb6672516e upstream. Reorder dma and dma-names property for serial node to make it compliant with bindings. Fixes: 6e8887f60f60 ("ARM: SPEAr13xx: Pass generic DW DMAC platform data from DT") Signed-off-by: Kuldeep Singh <singh.kuldeep87k@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20220326042313.97862-3-singh.kuldeep87k@gmail.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>