summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-12-17quota: Check that quota is not dirty before releaseDmitry Monakhov
commit df4bb5d128e2c44848aeb36b7ceceba3ac85080d upstream. There is a race window where quota was redirted once we drop dq_list_lock inside dqput(), but before we grab dquot->dq_lock inside dquot_release() TASK1 TASK2 (chowner) ->dqput() we_slept: spin_lock(&dq_list_lock) if (dquot_dirty(dquot)) { spin_unlock(&dq_list_lock); dquot->dq_sb->dq_op->write_dquot(dquot); goto we_slept if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) { spin_unlock(&dq_list_lock); dquot->dq_sb->dq_op->release_dquot(dquot); dqget() mark_dquot_dirty() dqput() goto we_slept; } So dquot dirty quota will be released by TASK1, but on next we_sleept loop we detect this and call ->write_dquot() for it. XFSTEST: https://github.com/dmonakhov/xfstests/commit/440a80d4cbb39e9234df4d7240aee1d551c36107 Link: https://lore.kernel.org/r/20191031103920.3919-2-dmonakhov@openvz.org CC: stable@vger.kernel.org Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17video/hdmi: Fix AVI bar unpackVille Syrjälä
commit 6039f37dd6b76641198e290f26b31c475248f567 upstream. The bar values are little endian, not big endian. The pack function did it right but the unpack got it wrong. Fix it. Cc: stable@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: Martin Bugge <marbugge@cisco.com> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Thierry Reding <treding@nvidia.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Fixes: 2c676f378edb ("[media] hdmi: added unpack and logging functions for InfoFrames") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919132853.30954-1-ville.syrjala@linux.intel.com Reviewed-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17powerpc/xive: Skip ioremap() of ESB pages for LSI interruptsCédric Le Goater
commit b67a95f2abff0c34e5667c15ab8900de73d8d087 upstream. The PCI INTx interrupts and other LSI interrupts are handled differently under a sPAPR platform. When the interrupt source characteristics are queried, the hypervisor returns an H_INT_ESB flag to inform the OS that it should be using the H_INT_ESB hcall for interrupt management and not loads and stores on the interrupt ESB pages. A default -1 value is returned for the addresses of the ESB pages. The driver ignores this condition today and performs a bogus IO mapping. Recent changes and the DEBUG_VM configuration option make the bug visible with : kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=1024 NUMA pSeries Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-0.rc6.git0.1.fc32.ppc64le #1 NIP: c000000000f63294 LR: c000000000f62e44 CTR: 0000000000000000 REGS: c0000000fa45f0d0 TRAP: 0700 Not tainted (5.4.0-0.rc6.git0.1.fc32.ppc64le) ... NIP ioremap_page_range+0x4c4/0x6e0 LR ioremap_page_range+0x74/0x6e0 Call Trace: ioremap_page_range+0x74/0x6e0 (unreliable) do_ioremap+0x8c/0x120 __ioremap_caller+0x128/0x140 ioremap+0x30/0x50 xive_spapr_populate_irq_data+0x170/0x260 xive_irq_domain_map+0x8c/0x170 irq_domain_associate+0xb4/0x2d0 irq_create_mapping+0x1e0/0x3b0 irq_create_fwspec_mapping+0x27c/0x3e0 irq_create_of_mapping+0x98/0xb0 of_irq_parse_and_map_pci+0x168/0x230 pcibios_setup_device+0x88/0x250 pcibios_setup_bus_devices+0x54/0x100 __of_scan_bus+0x160/0x310 pcibios_scan_phb+0x330/0x390 pcibios_init+0x8c/0x128 do_one_initcall+0x60/0x2c0 kernel_init_freeable+0x290/0x378 kernel_init+0x2c/0x148 ret_from_kernel_thread+0x5c/0x80 Fixes: bed81ee181dd ("powerpc/xive: introduce H_INT_ESB hcall") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191203163642.2428-1-clg@kaod.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17powerpc: Allow flush_icache_range to work across ranges >4GBAlastair D'Silva
commit 29430fae82073d39b1b881a3cd507416a56a363f upstream. When calling flush_icache_range with a size >4GB, we were masking off the upper 32 bits, so we would incorrectly flush a range smaller than intended. This patch replaces the 32 bit shifts with 64 bit ones, so that the full size is accounted for. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191104023305.9581-2-alastair@au1.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17powerpc/xive: Prevent page fault issues in the machine crash handlerCédric Le Goater
commit 1ca3dec2b2dff9d286ce6cd64108bda0e98f9710 upstream. When the machine crash handler is invoked, all interrupts are masked but interrupts which have not been started yet do not have an ESB page mapped in the Linux address space. This crashes the 'crash kexec' sequence on sPAPR guests. To fix, force the mapping of the ESB page when an interrupt is being mapped in the Linux IRQ number space. This is done by setting the initial state of the interrupt to OFF which is not necessarily the case on PowerNV. Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191031063100.3864-1-clg@kaod.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17powerpc: Allow 64bit VDSO __kernel_sync_dicache to work across ranges >4GBAlastair D'Silva
commit f9ec11165301982585e5e5f606739b5bae5331f3 upstream. When calling __kernel_sync_dicache with a size >4GB, we were masking off the upper 32 bits, so we would incorrectly flush a range smaller than intended. This patch replaces the 32 bit shifts with 64 bit ones, so that the full size is accounted for. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191104023305.9581-3-alastair@au1.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17ppdev: fix PPGETTIME/PPSETTIME ioctlsArnd Bergmann
commit 998174042da229e2cf5841f574aba4a743e69650 upstream. Going through the uses of timeval in the user space API, I noticed two bugs in ppdev that were introduced in the y2038 conversion: * The range check was accidentally moved from ppsettime to ppgettime * On sparc64, the microseconds are in the other half of the 64-bit word. Fix both, and mark the fix for stable backports. Cc: stable@vger.kernel.org Fixes: 3b9ab374a1e6 ("ppdev: convert to y2038 safe") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20191108203435.112759-8-arnd@arndb.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17ARM: dts: omap3-tao3530: Fix incorrect MMC card detection GPIO polarityJarkko Nikula
commit 287897f9aaa2ad1c923d9875914f57c4dc9159c8 upstream. The MMC card detection GPIO polarity is active low on TAO3530, like in many other similar boards. Now the card is not detected and it is unable to mount rootfs from an SD card. Fix this by using the correct polarity. This incorrect polarity was defined already in the commit 30d95c6d7092 ("ARM: dts: omap3: Add Technexion TAO3530 SOM omap3-tao3530.dtsi") in v3.18 kernel and later changed to use defined GPIO constants in v4.4 kernel by the commit 3a637e008e54 ("ARM: dts: Use defined GPIO constants in flags cell for OMAP2+ boards"). While the latter commit did not introduce the issue I'm marking it with Fixes tag due the v4.4 kernels still being maintained. Fixes: 3a637e008e54 ("ARM: dts: Use defined GPIO constants in flags cell for OMAP2+ boards") Cc: linux-stable <stable@vger.kernel.org> # 4.4+ Signed-off-by: Jarkko Nikula <jarkko.nikula@bitmer.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17mmc: host: omap_hsmmc: add code for special init of wl1251 to get rid of ↵H. Nikolaus Schaller
pandora_wl1251_init_card commit f6498b922e57aecbe3b7fa30a308d9d586c0c369 upstream. Pandora_wl1251_init_card was used to do special pdata based setup of the sdio mmc interface. This does no longer work with v4.7 and later. A fix requires a device tree based mmc3 setup. Therefore we move the special setup to omap_hsmmc.c instead of calling some pdata supplied init_card function. The new code checks for a DT child node compatible to wl1251 so it will not affect other MMC3 use cases. Generally, this code was and still is a hack and should be moved to mmc core to e.g. read such properties from optional DT child nodes. Fixes: 81eef6ca9201 ("mmc: omap_hsmmc: Use dma_request_chan() for requesting DMA channel") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Cc: <stable@vger.kernel.org> # v4.7+ [Ulf: Fixed up some checkpatch complaints] Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17pinctrl: samsung: Fix device node refcount leaks in S3C64xx wakeup ↵Krzysztof Kozlowski
controller init commit 7f028caadf6c37580d0f59c6c094ed09afc04062 upstream. In s3c64xx_eint_eint0_init() the for_each_child_of_node() loop is used with a break to find a matching child node. Although each iteration of for_each_child_of_node puts the previous node, but early exit from loop misses it. This leads to leak of device node. Cc: <stable@vger.kernel.org> Fixes: 61dd72613177 ("pinctrl: Add pinctrl-s3c64xx driver") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17pinctrl: samsung: Fix device node refcount leaks in init codeKrzysztof Kozlowski
commit a322b3377f4bac32aa25fb1acb9e7afbbbbd0137 upstream. Several functions use for_each_child_of_node() loop with a break to find a matching child node. Although each iteration of for_each_child_of_node puts the previous node, but early exit from loop misses it. This leads to leak of device node. Cc: <stable@vger.kernel.org> Fixes: 9a2c1c3b91aa ("pinctrl: samsung: Allow grouping multiple pinmux/pinconf nodes") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17pinctrl: samsung: Fix device node refcount leaks in S3C24xx wakeup ↵Krzysztof Kozlowski
controller init commit 6fbbcb050802d6ea109f387e961b1dbcc3a80c96 upstream. In s3c24xx_eint_init() the for_each_child_of_node() loop is used with a break to find a matching child node. Although each iteration of for_each_child_of_node puts the previous node, but early exit from loop misses it. This leads to leak of device node. Cc: <stable@vger.kernel.org> Fixes: af99a7507469 ("pinctrl: Add pinctrl-s3c24xx driver") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17pinctrl: samsung: Add of_node_put() before return in error pathNishka Dasgupta
commit 3d2557ab75d4c568c79eefa2e550e0d80348a6bd upstream. Each iteration of for_each_child_of_node puts the previous node, but in the case of a return from the middle of the loop, there is no put, thus causing a memory leak. Hence add an of_node_put before the return of exynos_eint_wkup_init() error path. Issue found with Coccinelle. Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com> Cc: <stable@vger.kernel.org> Fixes: 14c255d35b25 ("pinctrl: exynos: Add irq_chip instance for Exynos7 wakeup interrupts") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17ACPI: PM: Avoid attaching ACPI PM domain to certain devicesRafael J. Wysocki
commit b9ea0bae260f6aae546db224daa6ac1bd9d94b91 upstream. Certain ACPI-enumerated devices represented as platform devices in Linux, like fans, require special low-level power management handling implemented by their drivers that is not in agreement with the ACPI PM domain behavior. That leads to problems with managing ACPI fans during system-wide suspend and resume. For this reason, make acpi_dev_pm_attach() skip the affected devices by adding a list of device IDs to avoid to it and putting the IDs of the affected devices into that list. Fixes: e5cc8ef31267 (ACPI / PM: Provide ACPI PM callback routines for subsystems) Reported-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com> Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17ACPI: bus: Fix NULL pointer check in acpi_bus_get_private_data()Vamshi K Sthambamkadi
commit 627ead724eff33673597216f5020b72118827de4 upstream. kmemleak reported backtrace: [<bbee0454>] kmem_cache_alloc_trace+0x128/0x260 [<6677f215>] i2c_acpi_install_space_handler+0x4b/0xe0 [<1180f4fc>] i2c_register_adapter+0x186/0x400 [<6083baf7>] i2c_add_adapter+0x4e/0x70 [<a3ddf966>] intel_gmbus_setup+0x1a2/0x2c0 [i915] [<84cb69ae>] i915_driver_probe+0x8d8/0x13a0 [i915] [<81911d4b>] i915_pci_probe+0x48/0x160 [i915] [<4b159af1>] pci_device_probe+0xdc/0x160 [<b3c64704>] really_probe+0x1ee/0x450 [<bc029f5a>] driver_probe_device+0x142/0x1b0 [<d8829d20>] device_driver_attach+0x49/0x50 [<de71f045>] __driver_attach+0xc9/0x150 [<df33ac83>] bus_for_each_dev+0x56/0xa0 [<80089bba>] driver_attach+0x19/0x20 [<cc73f583>] bus_add_driver+0x177/0x220 [<7b29d8c7>] driver_register+0x56/0xf0 In i2c_acpi_remove_space_handler(), a leak occurs whenever the "data" parameter is initialized to 0 before being passed to acpi_bus_get_private_data(). This is because the NULL pointer check in acpi_bus_get_private_data() (condition->if(!*data)) returns EINVAL and, in consequence, memory is never freed in i2c_acpi_remove_space_handler(). Fix the NULL pointer check in acpi_bus_get_private_data() to follow the analogous check in acpi_get_data_full(). Signed-off-by: Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com> [ rjw: Subject & changelog ] Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17ACPI: OSL: only free map once in osl.cFrancesco Ruggeri
commit 833a426cc471b6088011b3d67f1dc4e147614647 upstream. acpi_os_map_cleanup checks map->refcount outside of acpi_ioremap_lock before freeing the map. This creates a race condition the can result in the map being freed more than once. A panic can be caused by running for ((i=0; i<10; i++)) do for ((j=0; j<100000; j++)) do cat /sys/firmware/acpi/tables/data/BERT >/dev/null done & done This patch makes sure that only the process that drops the reference to 0 does the freeing. Fixes: b7c1fadd6c2e ("ACPI: Do not use krefs under a mutex in osl.c") Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17cpufreq: powernv: fix stack bloat and hard limit on number of CPUsJohn Hubbard
commit db0d32d84031188443e25edbd50a71a6e7ac5d1d upstream. The following build warning occurred on powerpc 64-bit builds: drivers/cpufreq/powernv-cpufreq.c: In function 'init_chip_info': drivers/cpufreq/powernv-cpufreq.c:1070:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=] This is with a cross-compiler based on gcc 8.1.0, which I got from: https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/ The warning is due to putting 1024 bytes on the stack: unsigned int chip[256]; ...and it's also undesirable to have a hard limit on the number of CPUs here. Fix both problems by dynamically allocating based on num_possible_cpus, as recommended by Michael Ellerman. Fixes: 053819e0bf840 ("cpufreq: powernv: Handle throttling due to Pmax capping at chip level") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17PM / devfreq: Lock devfreq in trans_stat_showLeonard Crestez
commit 2abb0d5268ae7b5ddf82099b1f8d5aa8414637d4 upstream. There is no locking in this sysfs show function so stats printing can race with a devfreq_update_status called as part of freq switching or with initialization. Also add an assert in devfreq_update_status to make it clear that lock must be held by caller. Fixes: 39688ce6facd ("PM / devfreq: account suspend/resume for stats") Cc: stable@vger.kernel.org Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17intel_th: pci: Add Tiger Lake CPU supportAlexander Shishkin
commit 6e6c18bcb78c0dc0601ebe216bed12c844492d0c upstream. This adds support for the Trace Hub in Tiger Lake CPU. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191120130806.44028-4-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17intel_th: pci: Add Ice Lake CPU supportAlexander Shishkin
commit 6a1743422a7c0fda26764a544136cac13e5ae486 upstream. This adds support for the Trace Hub in Ice Lake CPU. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191120130806.44028-3-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17intel_th: Fix a double put_device() in error pathAlexander Shishkin
commit 512592779a337feb5905d8fcf9498dbf33672d4a upstream. Commit a753bfcfdb1f ("intel_th: Make the switch allocate its subdevices") factored out intel_th_subdevice_alloc() from intel_th_populate(), but got the error path wrong, resulting in two instances of a double put_device() on a freshly initialized, but not 'added' device. Fix this by only doing one put_device() in the error path. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Fixes: a753bfcfdb1f ("intel_th: Make the switch allocate its subdevices") Reported-by: Wen Yang <wenyang@linux.alibaba.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@vger.kernel.org # v4.14+ Link: https://lore.kernel.org/r/20191120130806.44028-2-alexander.shishkin@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17cpuidle: Do not unset the driver if it is there alreadyZhenzhong Duan
commit 918c1fe9fbbe46fcf56837ff21f0ef96424e8b29 upstream. Fix __cpuidle_set_driver() to check if any of the CPUs in the mask has a driver different from drv already and, if so, return -EBUSY before updating any cpuidle_drivers per-CPU pointers. Fixes: 82467a5a885d ("cpuidle: simplify multiple driver support") Cc: 3.11+ <stable@vger.kernel.org> # 3.11+ Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17media: cec.h: CEC_OP_REC_FLAG_ values were swappedHans Verkuil
commit 806e0cdfee0b99efbb450f9f6e69deb7118602fc upstream. CEC_OP_REC_FLAG_NOT_USED is 0 and CEC_OP_REC_FLAG_USED is 1, not the other way around. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reported-by: Jiunn Chang <c0d1n61at3@gmail.com> Cc: <stable@vger.kernel.org> # for v4.10 and up Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17media: radio: wl1273: fix interrupt masking on releaseJohan Hovold
commit 1091eb830627625dcf79958d99353c2391f41708 upstream. If a process is interrupted while accessing the radio device and the core lock is contended, release() could return early and fail to update the interrupt mask. Note that the return value of the v4l2 release file operation is ignored. Fixes: 87d1a50ce451 ("[media] V4L2: WL1273 FM Radio: TI WL1273 FM radio driver") Cc: stable <stable@vger.kernel.org> # 2.6.38 Cc: Matti Aaltonen <matti.j.aaltonen@nokia.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17media: bdisp: fix memleak on releaseJohan Hovold
commit 11609a7e21f8cea42630350aa57662928fa4dc63 upstream. If a process is interrupted while accessing the video device and the device lock is contended, release() could return early and fail to free related resources. Note that the return value of the v4l2 release file operation is ignored. Fixes: 28ffeebbb7bd ("[media] bdisp: 2D blitter driver using v4l2 mem2mem framework") Cc: stable <stable@vger.kernel.org> # 4.2 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Fabien Dessenne <fabien.dessenne@st.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17s390/mm: properly clear _PAGE_NOEXEC bit when it is not supportedGerald Schaefer
commit ab874f22d35a8058d8fdee5f13eb69d8867efeae upstream. On older HW or under a hypervisor, w/o the instruction-execution- protection (IEP) facility, and also w/o EDAT-1, a translation-specification exception may be recognized when bit 55 of a pte is one (_PAGE_NOEXEC). The current code tries to prevent setting _PAGE_NOEXEC in such cases, by removing it within set_pte_at(). However, ptep_set_access_flags() will modify a pte directly, w/o using set_pte_at(). There is at least one scenario where this can result in an active pte with _PAGE_NOEXEC set, which would then lead to a panic due to a translation-specification exception (write to swapped out page): do_swap_page pte = mk_pte (with _PAGE_NOEXEC bit) set_pte_at (will remove _PAGE_NOEXEC bit in page table, but keep it in local variable pte) vmf->orig_pte = pte (pte still contains _PAGE_NOEXEC bit) do_wp_page wp_page_reuse entry = vmf->orig_pte (still with _PAGE_NOEXEC bit) ptep_set_access_flags (writes entry with _PAGE_NOEXEC bit) Fix this by clearing _PAGE_NOEXEC already in mk_pte_phys(), where the pgprot value is applied, so that no pte with _PAGE_NOEXEC will ever be visible, if it is not supported. The check in set_pte_at() can then also be removed. Cc: <stable@vger.kernel.org> # 4.11+ Fixes: 57d7f939e7bd ("s390: add no-execute support") Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17ar5523: check NULL before memcpy() in ar5523_cmd()Denis Efremov
commit 315cee426f87658a6799815845788fde965ddaad upstream. memcpy() call with "idata == NULL && ilen == 0" results in undefined behavior in ar5523_cmd(). For example, NULL is passed in callchain "ar5523_stat_work() -> ar5523_cmd_write() -> ar5523_cmd()". This patch adds ilen check before memcpy() call in ar5523_cmd() to prevent an undefined behavior. Cc: Pontus Fuchs <pontus.fuchs@gmail.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Laight <David.Laight@ACULAB.COM> Cc: stable@vger.kernel.org Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17cgroup: pids: use atomic64_t for pids->limitAleksa Sarai
commit a713af394cf382a30dd28a1015cbe572f1b9ca75 upstream. Because pids->limit can be changed concurrently (but we don't want to take a lock because it would be needlessly expensive), use atomic64_ts instead. Fixes: commit 49b786ea146f ("cgroup: implement the PIDs subsystem") Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17blk-mq: avoid sysfs buffer overflow with too many CPU coresMing Lei
commit 8962842ca5abdcf98e22ab3b2b45a103f0408b95 upstream. It is reported that sysfs buffer overflow can be triggered if the system has too many CPU cores(>841 on 4K PAGE_SIZE) when showing CPUs of hctx via /sys/block/$DEV/mq/$N/cpu_list. Use snprintf to avoid the potential buffer overflow. This version doesn't change the attribute format, and simply stops showing CPU numbers if the buffer is going to overflow. Cc: stable@vger.kernel.org Fixes: 676141e48af7("blk-mq: don't dump CPU -> hw queue map on driver load") Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17ASoC: Jack: Fix NULL pointer dereference in snd_soc_jack_reportPawel Harlozinski
commit 8f157d4ff039e03e2ed4cb602eeed2fd4687a58f upstream. Check for existance of jack before tracing. NULL pointer dereference has been reported by KASAN while unloading machine driver (snd_soc_cnl_rt274). Signed-off-by: Pawel Harlozinski <pawel.harlozinski@linux.intel.com> Link: https://lore.kernel.org/r/20191112130237.10141-1-pawel.harlozinski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17workqueue: Fix pwq ref leak in rescuer_thread()Tejun Heo
commit e66b39af00f426b3356b96433d620cb3367ba1ff upstream. 008847f66c3 ("workqueue: allow rescuer thread to do more work.") made the rescuer worker requeue the pwq immediately if there may be more work items which need rescuing instead of waiting for the next mayday timer expiration. Unfortunately, it doesn't check whether the pwq is already on the mayday list and unconditionally gets the ref and moves it onto the list. This doesn't corrupt the list but creates an additional reference to the pwq. It got queued twice but will only be removed once. This leak later can trigger pwq refcnt warning on workqueue destruction and prevent freeing of the workqueue. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "Williams, Gerald S" <gerald.s.williams@intel.com> Cc: NeilBrown <neilb@suse.de> Cc: stable@vger.kernel.org # v3.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17workqueue: Fix spurious sanity check failures in destroy_workqueue()Tejun Heo
commit def98c84b6cdf2eeea19ec5736e90e316df5206b upstream. Before actually destrying a workqueue, destroy_workqueue() checks whether it's actually idle. If it isn't, it prints out a bunch of warning messages and leaves the workqueue dangling. It unfortunately has a couple issues. * Mayday list queueing increments pwq's refcnts which gets detected as busy and fails the sanity checks. However, because mayday list queueing is asynchronous, this condition can happen without any actual work items left in the workqueue. * Sanity check failure leaves the sysfs interface behind too which can lead to init failure of newer instances of the workqueue. This patch fixes the above two by * If a workqueue has a rescuer, disable and kill the rescuer before sanity checks. Disabling and killing is guaranteed to flush the existing mayday list. * Remove sysfs interface before sanity checks. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Marcin Pawlowski <mpawlowski@fb.com> Reported-by: "Williams, Gerald S" <gerald.s.williams@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17dm zoned: reduce overhead of backing device checksDmitry Fomichev
commit e7fad909b68aa37470d9f2d2731b5bec355ee5d6 upstream. Commit 75d66ffb48efb3 added backing device health checks and as a part of these checks, check_events() block ops template call is invoked in dm-zoned mapping path as well as in reclaim and flush path. Calling check_events() with ATA or SCSI backing devices introduces a blocking scsi_test_unit_ready() call being made in sd_check_events(). Even though the overhead of calling scsi_test_unit_ready() is small for ATA zoned devices, it is much larger for SCSI and it affects performance in a very negative way. Fix this performance regression by executing check_events() only in case of any I/O errors. The function dmz_bdev_is_dying() is modified to call only blk_queue_dying(), while calls to check_events() are made in a new helper function, dmz_check_bdev(). Reported-by: zhangxiaoxu <zhangxiaoxu5@huawei.com> Fixes: 75d66ffb48efb3 ("dm zoned: properly handle backing device failure") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17hwrng: omap - Fix RNG wait loop timeoutSumit Garg
commit be867f987a4e1222114dd07a01838a17c26f3fff upstream. Existing RNG data read timeout is 200us but it doesn't cover EIP76 RNG data rate which takes approx. 700us to produce 16 bytes of output data as per testing results. So configure the timeout as 1000us to also take account of lack of udelay()'s reliability. Fixes: 383212425c92 ("hwrng: omap - Add device variant for SafeXcel IP-76 found in Armada 8K") Cc: <stable@vger.kernel.org> Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17watchdog: aspeed: Fix clock behaviour for ast2600Joel Stanley
[ Upstream commit c04571251b3d842096f1597f5d4badb508be016d ] The ast2600 no longer uses bit 4 in the control register to indicate a 1MHz clock (It now controls whether this watchdog is reset by a SOC reset). This means we do not want to set it. It also does not need to be set for the ast2500, as it is read-only on that SoC. The comment next to the clock rate selection wandered away from where it was set, so put it back next to the register setting it's describing. Fixes: b3528b487448 ("watchdog: aspeed: Add support for AST2600") Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20191108032905.22463-1-joel@jms.id.au Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17md/raid0: Fix an error message in raid0_make_request()Dan Carpenter
[ Upstream commit e3fc3f3d0943b126f76b8533960e4168412d9e5a ] The first argument to WARN() is supposed to be a condition. The original code will just print the mdname() instead of the full warning message. Fixes: c84a1372df92 ("md/raid0: avoid RAID0 data corruption due to layout confusion.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17ALSA: hda - Fix pending unsol events at shutdownTakashi Iwai
[ Upstream commit ca58f55108fee41d87c9123f85ad4863e5de7f45 ] This is an alternative fix attemp for the issue reported in the commit caa8422d01e9 ("ALSA: hda: Flush interrupts on disabling") that was reverted later due to regressions. Instead of tweaking the hardware disablement order and the enforced irq flushing, do calling cancel_work_sync() of the unsol work early enough, and explicitly ignore the unsol events during the shutdown by checking the bus->shutdown flag. Fixes: caa8422d01e9 ("ALSA: hda: Flush interrupts on disabling") Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://lore.kernel.org/r/s5h1ruxt9cz.wl-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17ovl: relax WARN_ON() on rename to selfAmir Goldstein
commit 6889ee5a53b8d969aa542047f5ac8acdc0e79a91 upstream. In ovl_rename(), if new upper is hardlinked to old upper underneath overlayfs before upper dirs are locked, user will get an ESTALE error and a WARN_ON will be printed. Changes to underlying layers while overlayfs is mounted may result in unexpected behavior, but it shouldn't crash the kernel and it shouldn't trigger WARN_ON() either, so relax this WARN_ON(). Reported-by: syzbot+bb1836a212e69f8e201a@syzkaller.appspotmail.com Fixes: 804032fabb3b ("ovl: don't check rename to self") Cc: <stable@vger.kernel.org> # v4.9+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17lib: raid6: fix awk build warningsGreg Kroah-Hartman
commit 702600eef73033ddd4eafcefcbb6560f3e3a90f7 upstream. Newer versions of awk spit out these fun warnings: awk: ../lib/raid6/unroll.awk:16: warning: regexp escape sequence `\#' is not a known regexp operator As commit 700c1018b86d ("x86/insn: Fix awk regexp warnings") showed, it turns out that there are a number of awk strings that do not need to be escaped and newer versions of awk now warn about this. Fix the string up so that no warning is produced. The exact same kernel module gets created before and after this patch, showing that it wasn't needed. Link: https://lore.kernel.org/r/20191206152600.GA75093@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17rtlwifi: rtl8192de: Fix missing enable interrupt flagLarry Finger
commit 330bb7117101099c687e9c7f13d48068670b9c62 upstream. In commit 38506ecefab9 ("rtlwifi: rtl_pci: Start modification for new drivers"), the flag that indicates that interrupts are enabled was never set. In addition, there are several places when enable/disable interrupts were commented out are restored. A sychronize_interrupts() call is removed. Fixes: 38506ecefab9 ("rtlwifi: rtl_pci: Start modification for new drivers") Cc: Stable <stable@vger.kernel.org> # v3.18+ Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17rtlwifi: rtl8192de: Fix missing callback that tests for hw release of bufferLarry Finger
commit 3155db7613edea8fb943624062baf1e4f9cfbfd6 upstream. In commit 38506ecefab9 ("rtlwifi: rtl_pci: Start modification for new drivers"), a callback needed to check if the hardware has released a buffer indicating that a DMA operation is completed was not added. Fixes: 38506ecefab9 ("rtlwifi: rtl_pci: Start modification for new drivers") Cc: Stable <stable@vger.kernel.org> # v3.18+ Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17rtlwifi: rtl8192de: Fix missing code to retrieve RX buffer addressLarry Finger
commit 0e531cc575c4e9e3dd52ad287b49d3c2dc74c810 upstream. In commit 38506ecefab9 ("rtlwifi: rtl_pci: Start modification for new drivers"), a callback to get the RX buffer address was added to the PCI driver. Unfortunately, driver rtl8192de was not modified appropriately and the code runs into a WARN_ONCE() call. The use of an incorrect array is also fixed. Fixes: 38506ecefab9 ("rtlwifi: rtl_pci: Start modification for new drivers") Cc: Stable <stable@vger.kernel.org> # 3.18+ Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17btrfs: record all roots for rename exchange on a subvolJosef Bacik
commit 3e1740993e43116b3bc71b0aad1e6872f6ccf341 upstream. Testing with the new fsstress support for subvolumes uncovered a pretty bad problem with rename exchange on subvolumes. We're modifying two different subvolumes, but we only start the transaction on one of them, so the other one is not added to the dirty root list. This is caught by btrfs_cow_block() with a warning because the root has not been updated, however if we do not modify this root again we'll end up pointing at an invalid root because the root item is never updated. Fix this by making sure we add the destination root to the trans list, the same as we do with normal renames. This fixes the corruption. Fixes: cdd1fedf8261 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17Btrfs: send, skip backreference walking for extents with many referencesFilipe Manana
commit fd0ddbe2509568b00df364156f47561e9f469f15 upstream. Backreference walking, which is used by send to figure if it can issue clone operations instead of write operations, can be very slow and use too much memory when extents have many references. This change simply skips backreference walking when an extent has more than 64 references, in which case we fallback to a write operation instead of a clone operation. This limit is conservative and in practice I observed no signicant slowdown with up to 100 references and still low memory usage up to that limit. This is a temporary workaround until there are speedups in the backref walking code, and as such it does not attempt to add extra interfaces or knobs to tweak the threshold. Reported-by: Atemu <atemu.main@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CAE4GHgkvqVADtS4AzcQJxo0Q1jKQgKaW3JGp3SGdoinVo=C9eQ@mail.gmail.com/T/#me55dc0987f9cc2acaa54372ce0492c65782be3fa CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17btrfs: Remove btrfs_bio::flags memberQu Wenruo
commit 34b127aecd4fe8e6a3903e10f204a7b7ffddca22 upstream. The last user of btrfs_bio::flags was removed in commit 326e1dbb5736 ("block: remove management of bi_remaining when restoring original bi_end_io"), remove it. (Tagged for stable as the structure is heavily used and space savings are desirable.) CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17Btrfs: fix negative subv_writers counter and data space leak after buffered ↵Filipe Manana
write commit a0e248bb502d5165b3314ac3819e888fdcdf7d9f upstream. When doing a buffered write it's possible to leave the subv_writers counter of the root, used for synchronization between buffered nocow writers and snapshotting. This happens in an exceptional case like the following: 1) We fail to allocate data space for the write, since there's not enough available data space nor enough unallocated space for allocating a new data block group; 2) Because of that failure, we try to go to NOCOW mode, which succeeds and therefore we set the local variable 'only_release_metadata' to true and set the root's sub_writers counter to 1 through the call to btrfs_start_write_no_snapshotting() made by check_can_nocow(); 3) The call to btrfs_copy_from_user() returns zero, which is very unlikely to happen but not impossible; 4) No pages are copied because btrfs_copy_from_user() returned zero; 5) We call btrfs_end_write_no_snapshotting() which decrements the root's subv_writers counter to 0; 6) We don't set 'only_release_metadata' back to 'false' because we do it only if 'copied', the value returned by btrfs_copy_from_user(), is greater than zero; 7) On the next iteration of the while loop, which processes the same page range, we are now able to allocate data space for the write (we got enough data space released in the meanwhile); 8) After this if we fail at btrfs_delalloc_reserve_metadata(), because now there isn't enough free metadata space, or in some other place further below (prepare_pages(), lock_and_cleanup_extent_if_need(), btrfs_dirty_pages()), we break out of the while loop with 'only_release_metadata' having a value of 'true'; 9) Because 'only_release_metadata' is 'true' we end up decrementing the root's subv_writers counter to -1 (through a call to btrfs_end_write_no_snapshotting()), and we also end up not releasing the data space previously reserved through btrfs_check_data_free_space(). As a consequence the mechanism for synchronizing NOCOW buffered writes with snapshotting gets broken. Fix this by always setting 'only_release_metadata' to false at the start of each iteration. Fixes: 8257b2dc3c1a ("Btrfs: introduce btrfs_{start, end}_nocow_write() for each subvolume") Fixes: 7ee9e4405f26 ("Btrfs: check if we can nocow if we don't have data space") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17btrfs: use refcount_inc_not_zero in kill_all_nodesJosef Bacik
commit baf320b9d531f1cfbf64c60dd155ff80a58b3796 upstream. We hit the following warning while running down a different problem [ 6197.175850] ------------[ cut here ]------------ [ 6197.185082] refcount_t: underflow; use-after-free. [ 6197.194704] WARNING: CPU: 47 PID: 966 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60 [ 6197.521792] Call Trace: [ 6197.526687] __btrfs_release_delayed_node+0x76/0x1c0 [ 6197.536615] btrfs_kill_all_delayed_nodes+0xec/0x130 [ 6197.546532] ? __btrfs_btree_balance_dirty+0x60/0x60 [ 6197.556482] btrfs_clean_one_deleted_snapshot+0x71/0xd0 [ 6197.566910] cleaner_kthread+0xfa/0x120 [ 6197.574573] kthread+0x111/0x130 [ 6197.581022] ? kthread_create_on_node+0x60/0x60 [ 6197.590086] ret_from_fork+0x1f/0x30 [ 6197.597228] ---[ end trace 424bb7ae00509f56 ]--- This is because the free side drops the ref without the lock, and then takes the lock if our refcount is 0. So you can have nodes on the tree that have a refcount of 0. Fix this by zero'ing out that element in our temporary array so we don't try to kill it again. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add comment ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17btrfs: check page->mapping when loading free space cacheJosef Bacik
commit 3797136b626ad4b6582223660c041efdea8f26b2 upstream. While testing 5.2 we ran into the following panic [52238.017028] BUG: kernel NULL pointer dereference, address: 0000000000000001 [52238.105608] RIP: 0010:drop_buffers+0x3d/0x150 [52238.304051] Call Trace: [52238.308958] try_to_free_buffers+0x15b/0x1b0 [52238.317503] shrink_page_list+0x1164/0x1780 [52238.325877] shrink_inactive_list+0x18f/0x3b0 [52238.334596] shrink_node_memcg+0x23e/0x7d0 [52238.342790] ? do_shrink_slab+0x4f/0x290 [52238.350648] shrink_node+0xce/0x4a0 [52238.357628] balance_pgdat+0x2c7/0x510 [52238.365135] kswapd+0x216/0x3e0 [52238.371425] ? wait_woken+0x80/0x80 [52238.378412] ? balance_pgdat+0x510/0x510 [52238.386265] kthread+0x111/0x130 [52238.392727] ? kthread_create_on_node+0x60/0x60 [52238.401782] ret_from_fork+0x1f/0x30 The page we were trying to drop had a page->private, but had no page->mapping and so called drop_buffers, assuming that we had a buffer_head on the page, and then panic'ed trying to deref 1, which is our page->private for data pages. This is happening because we're truncating the free space cache while we're trying to load the free space cache. This isn't supposed to happen, and I'll fix that in a followup patch. However we still shouldn't allow those sort of mistakes to result in messing with pages that do not belong to us. So add the page->mapping check to verify that we still own this page after dropping and re-acquiring the page lock. This page being unlocked as: btrfs_readpage extent_read_full_page __extent_read_full_page __do_readpage if (!nr) unlock_page <-- nr can be 0 only if submit_extent_page returns an error CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> [ add callchain ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17usb: dwc3: ep0: Clear started flag on completionThinh Nguyen
commit 2d7b78f59e020b07fc6338eefe286f54ee2d6773 upstream. Clear ep0's DWC3_EP_TRANSFER_STARTED flag if the END_TRANSFER command is completed. Otherwise, we can't start control transfer again after END_TRANSFER. Cc: stable@vger.kernel.org Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17virtio-balloon: fix managed page counts when migrating pages between zonesDavid Hildenbrand
commit 63341ab03706e11a31e3dd8ccc0fbc9beaf723f0 upstream. In case we have to migrate a ballon page to a newpage of another zone, the managed page count of both zones is wrong. Paired with memory offlining (which will adjust the managed page count), we can trigger kernel crashes and all kinds of different symptoms. One way to reproduce: 1. Start a QEMU guest with 4GB, no NUMA 2. Hotplug a 1GB DIMM and online the memory to ZONE_NORMAL 3. Inflate the balloon to 1GB 4. Unplug the DIMM (be quick, otherwise unmovable data ends up on it) 5. Observe /proc/zoneinfo Node 0, zone Normal pages free 16810 min 24848885473806 low 18471592959183339 high 36918337032892872 spanned 262144 present 262144 managed 18446744073709533486 6. Do anything that requires some memory (e.g., inflate the balloon some more). The OOM goes crazy and the system crashes [ 238.324946] Out of memory: Killed process 537 (login) total-vm:27584kB, anon-rss:860kB, file-rss:0kB, shmem-rss:00 [ 238.338585] systemd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0 [ 238.339420] CPU: 0 PID: 1 Comm: systemd Tainted: G D W 5.4.0-next-20191204+ #75 [ 238.340139] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu4 [ 238.341121] Call Trace: [ 238.341337] dump_stack+0x8f/0xd0 [ 238.341630] dump_header+0x61/0x5ea [ 238.341942] oom_kill_process.cold+0xb/0x10 [ 238.342299] out_of_memory+0x24d/0x5a0 [ 238.342625] __alloc_pages_slowpath+0xd12/0x1020 [ 238.343024] __alloc_pages_nodemask+0x391/0x410 [ 238.343407] pagecache_get_page+0xc3/0x3a0 [ 238.343757] filemap_fault+0x804/0xc30 [ 238.344083] ? ext4_filemap_fault+0x28/0x42 [ 238.344444] ext4_filemap_fault+0x30/0x42 [ 238.344789] __do_fault+0x37/0x1a0 [ 238.345087] __handle_mm_fault+0x104d/0x1ab0 [ 238.345450] handle_mm_fault+0x169/0x360 [ 238.345790] do_user_addr_fault+0x20d/0x490 [ 238.346154] do_page_fault+0x31/0x210 [ 238.346468] async_page_fault+0x43/0x50 [ 238.346797] RIP: 0033:0x7f47eba4197e [ 238.347110] Code: Bad RIP value. [ 238.347387] RSP: 002b:00007ffd7c0c1890 EFLAGS: 00010293 [ 238.347834] RAX: 0000000000000002 RBX: 000055d196a20a20 RCX: 00007f47eba4197e [ 238.348437] RDX: 0000000000000033 RSI: 00007ffd7c0c18c0 RDI: 0000000000000004 [ 238.349047] RBP: 00007ffd7c0c1c20 R08: 0000000000000000 R09: 0000000000000033 [ 238.349660] R10: 00000000ffffffff R11: 0000000000000293 R12: 0000000000000001 [ 238.350261] R13: ffffffffffffffff R14: 0000000000000000 R15: 00007ffd7c0c18c0 [ 238.350878] Mem-Info: [ 238.351085] active_anon:3121 inactive_anon:51 isolated_anon:0 [ 238.351085] active_file:12 inactive_file:7 isolated_file:0 [ 238.351085] unevictable:0 dirty:0 writeback:0 unstable:0 [ 238.351085] slab_reclaimable:5565 slab_unreclaimable:10170 [ 238.351085] mapped:3 shmem:111 pagetables:155 bounce:0 [ 238.351085] free:720717 free_pcp:2 free_cma:0 [ 238.353757] Node 0 active_anon:12484kB inactive_anon:204kB active_file:48kB inactive_file:28kB unevictable:0kB iss [ 238.355979] Node 0 DMA free:11556kB min:36kB low:48kB high:60kB reserved_highatomic:0KB active_anon:152kB inactivB [ 238.358345] lowmem_reserve[]: 0 2955 2884 2884 2884 [ 238.358761] Node 0 DMA32 free:2677864kB min:7004kB low:10028kB high:13052kB reserved_highatomic:0KB active_anon:0B [ 238.361202] lowmem_reserve[]: 0 0 72057594037927865 72057594037927865 72057594037927865 [ 238.361888] Node 0 Normal free:193448kB min:99395541895224kB low:73886371836733356kB high:147673348131571488kB reB [ 238.364765] lowmem_reserve[]: 0 0 0 0 0 [ 238.365101] Node 0 DMA: 7*4kB (U) 5*8kB (UE) 6*16kB (UME) 2*32kB (UM) 1*64kB (U) 2*128kB (UE) 3*256kB (UME) 2*512B [ 238.366379] Node 0 DMA32: 0*4kB 1*8kB (U) 2*16kB (UM) 2*32kB (UM) 2*64kB (UM) 1*128kB (U) 1*256kB (U) 1*512kB (U)B [ 238.367654] Node 0 Normal: 1985*4kB (UME) 1321*8kB (UME) 844*16kB (UME) 524*32kB (UME) 300*64kB (UME) 138*128kB (B [ 238.369184] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [ 238.369915] 130 total pagecache pages [ 238.370241] 0 pages in swap cache [ 238.370533] Swap cache stats: add 0, delete 0, find 0/0 [ 238.370981] Free swap = 0kB [ 238.371239] Total swap = 0kB [ 238.371488] 1048445 pages RAM [ 238.371756] 0 pages HighMem/MovableOnly [ 238.372090] 306992 pages reserved [ 238.372376] 0 pages cma reserved [ 238.372661] 0 pages hwpoisoned In another instance (older kernel), I was able to observe this (negative page count :/): [ 180.896971] Offlined Pages 32768 [ 182.667462] Offlined Pages 32768 [ 184.408117] Offlined Pages 32768 [ 186.026321] Offlined Pages 32768 [ 187.684861] Offlined Pages 32768 [ 189.227013] Offlined Pages 32768 [ 190.830303] Offlined Pages 32768 [ 190.833071] Built 1 zonelists, mobility grouping on. Total pages: -36920272750453009 In another instance (older kernel), I was no longer able to start any process: [root@vm ~]# [ 214.348068] Offlined Pages 32768 [ 215.973009] Offlined Pages 32768 cat /proc/meminfo -bash: fork: Cannot allocate memory [root@vm ~]# cat /proc/meminfo -bash: fork: Cannot allocate memory Fix it by properly adjusting the managed page count when migrating if the zone changed. The managed page count of the zones now looks after unplug of the DIMM (and after deflating the balloon) just like before inflating the balloon (and plugging+onlining the DIMM). We'll temporarily modify the totalram page count. If this ever becomes a problem, we can fine tune by providing helpers that don't touch the totalram pages (e.g., adjust_zone_managed_page_count()). Please note that fixing up the managed page count is only necessary when we adjusted the managed page count when inflating - only if we don't have VIRTIO_BALLOON_F_DEFLATE_ON_OOM. With that feature, the managed page count is not touched when inflating/deflating. Reported-by: Yumei Huang <yuhuang@redhat.com> Fixes: 3dcc0571cd64 ("mm: correctly update zone->managed_pages") Cc: <stable@vger.kernel.org> # v3.11+ Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: virtualization@lists.linux-foundation.org Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>