summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-08-03ideapad_laptop: Lenovo G50-30 fix rfkill reports wireless blockedDmitry Tunin
commit 4fa9dabcffc8e16601307d3d56b58c68d9716ba4 upstream. Lenovo G30-50 does not have a hardware wireless switch and wireless is always blocked. BugLink: https://bugs.launchpad.net/bugs/1397021 Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com> Signed-off-by: Philippe Coval <philippe.coval@open.eurogiciel.org> [dvhart@linux.intel.com: Reordered dmi id per Phillippe's later version] Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03clocksource: exynos_mct: Avoid blocking calls in the cpu hotplug notifierDamian Eppel
commit 56a94f13919c0db5958611b388e1581b4852f3c9 upstream. Whilst testing cpu hotplug events on kernel configured with DEBUG_PREEMPT and DEBUG_ATOMIC_SLEEP we get following BUG message, caused by calling request_irq() and free_irq() in the context of hotplug notification (which is in this case atomic context). [ 40.785859] CPU1: Software reset [ 40.786660] BUG: sleeping function called from invalid context at mm/slub.c:1241 [ 40.786668] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/1 [ 40.786678] Preemption disabled at:[< (null)>] (null) [ 40.786681] [ 40.786692] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc4-00024-g7dca860 #36 [ 40.786698] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [ 40.786728] [<c0014a00>] (unwind_backtrace) from [<c0011980>] (show_stack+0x10/0x14) [ 40.786747] [<c0011980>] (show_stack) from [<c0449ba0>] (dump_stack+0x70/0xbc) [ 40.786767] [<c0449ba0>] (dump_stack) from [<c00c6124>] (kmem_cache_alloc+0xd8/0x170) [ 40.786785] [<c00c6124>] (kmem_cache_alloc) from [<c005d6f8>] (request_threaded_irq+0x64/0x128) [ 40.786804] [<c005d6f8>] (request_threaded_irq) from [<c0350b8c>] (exynos4_local_timer_setup+0xc0/0x13c) [ 40.786820] [<c0350b8c>] (exynos4_local_timer_setup) from [<c0350ca8>] (exynos4_mct_cpu_notify+0x30/0xa8) [ 40.786838] [<c0350ca8>] (exynos4_mct_cpu_notify) from [<c003b330>] (notifier_call_chain+0x44/0x84) [ 40.786857] [<c003b330>] (notifier_call_chain) from [<c0022fd4>] (__cpu_notify+0x28/0x44) [ 40.786873] [<c0022fd4>] (__cpu_notify) from [<c0013714>] (secondary_start_kernel+0xec/0x150) [ 40.786886] [<c0013714>] (secondary_start_kernel) from [<40008764>] (0x40008764) Interrupts cannot be requested/freed in the CPU_STARTING/CPU_DYING notifications which run on the hotplugged cpu with interrupts and preemption disabled. To avoid the issue, request the interrupts for all possible cpus in the boot code. The interrupts are marked NO_AUTOENABLE to avoid a racy request_irq/disable_irq() sequence. The flag prevents the request_irq() code from enabling the interrupt immediately. The interrupt is then enabled in the CPU_STARTING notifier of the hotplugged cpu and again disabled with disable_irq_nosync() in the CPU_DYING notifier. [ tglx: Massaged changelog to match the patch ] Fixes: 7114cd749a12 ("clocksource: exynos_mct: use (request/free)_irq calls for local timer registration") Reported-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Tested-by: Marcin Jabrzyk <m.jabrzyk@samsung.com> Signed-off-by: Damian Eppel <d.eppel@samsung.com> Cc: m.szyprowski@samsung.com Cc: kyungmin.park@samsung.com Cc: daniel.lezcano@linaro.org Cc: kgene@kernel.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1435324984-7328-1-git-send-email-d.eppel@samsung.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03e1000e: Cleanup handling of VLAN_HLEN as a part of max frame sizeAlexander Duyck
commit 8084b86dcfbc4b4822868c1dbdb429b5c08154e2 upstream. When the VLAN_HLEN was added to the calculation for the maximum frame size there seems to have been a number of issues added to the driver. The first issue is that in some cases the maximum frame size for a device never really reached the actual maximum frame size as the VLAN header length was not included the calculation for that value. As a result some parts only supported a maximum frame size of either 1496 in the case of parts that didn't support jumbo frames, and 8996 in the case of the parts that do. The second issue is the fact that there were several checks that weren't updated so as a result setting an MTU of 1500 was treated as enabling jumbo frames as the calculated value was 1522 instead of 1518. I have addressed those by replacing ETH_FRAME_LEN with VLAN_ETH_FRAME_LEN where appropriate. The final issue was the fact that lowering the MTU below 1500 would cause the driver to allocate 2K buffers for the rings. This is an old issue that was fixed several years ago in igb/ixgbe and I am addressing now by just replacing == with a <= so that we always just round up to 1522 for anything that isn't a jumbo frame. Fixes: c751a3d58cf2d ("e1000e: Correctly include VLAN_HLEN when changing interface MTU") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03mac80211: prevent possible crypto tx tailroom corruptionMichal Kazior
commit ab499db80fcf07c18e4053f91a619500f663e90e upstream. There was a possible race between ieee80211_reconfig() and ieee80211_delayed_tailroom_dec(). This could result in inability to transmit data if driver crashed during roaming or rekeying and subsequent skbs with insufficient tailroom appeared. This race was probably never seen in the wild because a device driver would have to crash AND recover within 0.5s which is very unlikely. I was able to prove this race exists after changing the delay to 10s locally and crashing ath10k via debugfs immediately after GTK rekeying. In case of ath10k the counter went below 0. This was harmless but other drivers which actually require tailroom (e.g. for WEP ICV or MMIC) could end up with the counter at 0 instead of >0 and introduce insufficient skb tailroom failures because mac80211 would not resize skbs appropriately anymore. Fixes: 8d1f7ecd2af5 ("mac80211: defer tailroom counter manipulation when roaming") Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03cfg80211: ignore netif running state when changing iftypeMichal Kazior
commit 6cbfb1bb66e4e85da5db78e8ff429a85bd84ce64 upstream. It was possible for mac80211 to be coerced into an unexpected flow causing sdata union to become corrupted. Station pointer was put into sdata->u.vlan.sta memory location while it was really master AP's sdata->u.ap.next_beacon. This led to station entry being later freed as next_beacon before __sta_info_flush() in ieee80211_stop_ap() and a subsequent invalid pointer dereference crash. The problem was that ieee80211_ptr->use_4addr wasn't cleared on interface type changes. This could be reproduced with the following steps: # host A and host B have just booted; no # wpa_s/hostapd running; all vifs are down host A> iw wlan0 set type station host A> iw wlan0 set 4addr on host A> printf 'interface=wlan0\nssid=4addrcrash\nchannel=1\nwds_sta=1' > /tmp/hconf host A> hostapd -B /tmp/conf host B> iw wlan0 set 4addr on host B> ifconfig wlan0 up host B> iw wlan0 connect -w hostAssid host A> pkill hostapd # host A crashed: [ 127.928192] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c8 [ 127.929014] IP: [<ffffffff816f4f32>] __sta_info_flush+0xac/0x158 ... [ 127.934578] [<ffffffff8170789e>] ieee80211_stop_ap+0x139/0x26c [ 127.934578] [<ffffffff8100498f>] ? dump_trace+0x279/0x28a [ 127.934578] [<ffffffff816dc661>] __cfg80211_stop_ap+0x84/0x191 [ 127.934578] [<ffffffff816dc7ad>] cfg80211_stop_ap+0x3f/0x58 [ 127.934578] [<ffffffff816c5ad6>] nl80211_stop_ap+0x1b/0x1d [ 127.934578] [<ffffffff815e53f8>] genl_family_rcv_msg+0x259/0x2b5 Note: This isn't a revert of f8cdddb8d61d ("cfg80211: check iface combinations only when iface is running") as far as functionality is considered because b6a550156bc ("cfg80211/mac80211: move more combination checks to mac80211") moved the logic somewhere else already. Fixes: f8cdddb8d61d ("cfg80211: check iface combinations only when iface is running") Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03iwlwifi: mvm: fix ROC reference accountingEliad Peller
commit c779273b37bec14c33feeab11c4d457a24bc64e0 upstream. commit b112889c5af8124 ("iwlwifi: mvm: add Aux ROC request/response flow") added aux ROC flow in addition to the existing ROC flow. While doing it, it moved the ROC reference release to a common work item, which is being called for both the ROC and aux ROC flows. This resulted in invalid reference accounting, as no reference was taken in case of aux ROC, while a reference was released on completion. Fix it by adding a reference for the aux ROC as well, and release only the relevant references on completion (according to the set bits). While at it, convert cancel_work_sync() to flush_work(), in order to make sure the references are being cleaned properly. Fixes: b112889c5af8 ("iwlwifi: mvm: add Aux ROC request/response flow") Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03mac80211: fix the beacon csa counter for mesh and ibssChun-Yeow Yeoh
commit 8df734e865b74d9f273216482a45a38269dc767a upstream. The csa counter has moved from sdata to beacon/presp but it is not updated accordingly for mesh and ibss. Fix this. Fixes: af296bdb8da4 ("mac80211: move csa counters from sdata to beacon/presp") Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03security_syslog() should be called once onlyVasily Averin
commit d194e5d666225b04c7754471df0948f645b6ab3a upstream. The final version of commit 637241a900cb ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg") lost few hooks, as result security_syslog() are processed incorrectly: - open of /dev/kmsg checks syslog access permissions by using check_syslog_permissions() where security_syslog() is not called if dmesg_restrict is set. - syslog syscall and /proc/kmsg calls do_syslog() where security_syslog can be executed twice (inside check_syslog_permissions() and then directly in do_syslog()) With this patch security_syslog() is called once only in all syslog-related operations regardless of dmesg_restrict value. Fixes: 637241a900cb ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Cc: Kees Cook <keescook@chromium.org> Cc: Josh Boyer <jwboyer@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03__bitmap_parselist: fix bug in empty string handlingChris Metcalf
commit 2528a8b8f457d7432552d0e2b6f0f4046bb702f4 upstream. bitmap_parselist("", &mask, nmaskbits) will erroneously set bit zero in the mask. The same bug is visible in cpumask_parselist() since it is layered on top of the bitmask code, e.g. if you boot with "isolcpus=", you will actually end up with cpu zero isolated. The bug was introduced in commit 4b060420a596 ("bitmap, irq: add smp_affinity_list interface to /proc/irq") when bitmap_parselist() was generalized to support userspace as well as kernelspace. Fixes: 4b060420a596 ("bitmap, irq: add smp_affinity_list interface to /proc/irq") Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03compiler-intel: fix wrong compiler barrier() macroDaniel Borkmann
commit b86a50c3b5414eafdbee7f34af4a201a4a7817c2 upstream. Cleanup commit 73679e508201 ("compiler-intel.h: Remove duplicate definition") removed the double definition of __memory_barrier() intrinsics. However, in doing so, it also removed the preceding #undef barrier by accident, meaning, the actual barrier() macro from compiler-gcc.h with inline asm is still in place as __GNUC__ is provided. Subsequently, barrier() can never be defined as __memory_barrier() from compiler.h since it already has a definition in place and if we trust the comment in compiler-intel.h, ecc doesn't support gcc specific asm statements. I don't have an ecc at hand (unsure if that's still used in the field?) and only found this by accident during code review, a revert of that cleanup would be simplest option. Fixes: 73679e508201 ("compiler-intel.h: Remove duplicate definition") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com> Cc: Pranith Kumar <bobby.prani@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: mancha security <mancha1@zoho.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03firmware: dmi_scan: Only honor end-of-table for 64-bit tablesJean Delvare
commit 17cd5bd5391e6e7b363d66335e1bc6760ae969b9 upstream. A 32-bit entry point to a DMI table says how many structures the table contains. The SMBIOS specification explicitly says that end-of-table markers should be ignored if they are not actually at the end of the DMI table. So only honor the end-of-table marker for tables accessed through 64-bit entry points, as they do not specify a structure count. Fixes: fc43026278 ("dmi: add support for SMBIOS 3.0 64-bit entry point") Signed-off-by: Jean Delvare <jdelvare@suse.de> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Leif Lindholm <leif.lindholm@linaro.org> Cc: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03PM / sleep: Increase default DPM watchdog timeout to 60Takashi Iwai
commit fff3b16d2754a061a3549c4307a186423a0128fd upstream. Many harddisks (mostly WD ones) have firmware problems and take too long, more than 10 seconds, to resume from suspend. And this often exceeds the default DPM watchdog timeout (12 seconds), resulting in a kernel panic out of sudden. Since most distros just take the default as is, we should give a bit more safer value. This patch increases the default value from 12 seconds to one minute, which has been confirmed to be long enough for such problematic disks. Link: https://bugzilla.kernel.org/show_bug.cgi?id=91921 Fixes: 70fea60d888d (PM / Sleep: Detect device suspend/resume lockup and log event) Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03mm/hugetlb: introduce minimum hugepage orderNaoya Horiguchi
commit 641844f5616d7c6597309f560838f996466d7aac upstream. Currently the initial value of order in dissolve_free_huge_page is 64 or 32, which leads to the following warning in static checker: mm/hugetlb.c:1203 dissolve_free_huge_pages() warn: potential right shift more than type allows '9,18,64' This is a potential risk of infinite loop, because 1 << order (== 0) is used in for-loop like this: for (pfn =3D start_pfn; pfn < end_pfn; pfn +=3D 1 << order) ... So this patch fixes it by using global minimum_order calculated at boot time. text data bss dec hex filename 28313 469 84236 113018 1b97a mm/hugetlb.o 28256 473 84236 112965 1b945 mm/hugetlb.o (patched) Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03tty: remove platform_sysrq_reset_seqArnd Bergmann
commit ffb6e0c9a0572f8e5f8e9337a1b40ac2ec1493a1 upstream. The platform_sysrq_reset_seq code was intended as a way for an embedded platform to provide its own sysrq sequence at compile time. After over two years, nobody has started using it in an upstream kernel, and the platforms that were interested in it have moved on to devicetree, which can be used to configure the sequence without requiring kernel changes. The method is also incompatible with the way that most architectures build support for multiple platforms into a single kernel. Now the code is producing warnings when built with gcc-5.1: drivers/tty/sysrq.c: In function 'sysrq_init': drivers/tty/sysrq.c:959:33: warning: array subscript is above array bounds [-Warray-bounds] key = platform_sysrq_reset_seq[i]; We could fix this, but it seems unlikely that it will ever be used, so let's just remove the code instead. We still have the option to pass the sequence either in DT, using the kernel command line, or using the /sys/module/sysrq/parameters/reset_seq file. Fixes: 154b7a489a ("Input: sysrq - allow specifying alternate reset sequence") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03RDMA/ocrdma: fix double free on pdColin Ian King
commit 4dc544427991e3cef38ce3ae124b7e6557063bd3 upstream. A reorganisation of the PD allocation and deallocation in commit 9ba1377daa ("RDMA/ocrdma: Move PD resource management to driver.") introduced a double free on pd, as detected by static analysis by smatch: drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:682 ocrdma_alloc_pd() error: double free of 'pd'^ The original call to ocrdma_mbx_dealloc_pd() (which does not kfree pd) was replaced with a call to _ocrdma_dealloc_pd() (which does kfree pd). The kfree following this call causes the double free, so just remove it to fix the problem. Fixes: 9ba1377daa ("RDMA/ocrdma: Move PD resource management to driver.") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03PM / clk: Fix clock error check in __pm_clk_add()Geert Uytterhoeven
commit 3fc3a0be0dab352e065d1dad7d3f81953ed0d4bc upstream. In the final iteration of commit 245bd6f6af8a62a2 ("PM / clock_ops: Add pm_clk_add_clk()"), a refcount increment was added by Grygorii Strashko. However, the accompanying IS_ERR() check operates on the wrong clock pointer, which is always zero at this point, i.e. not an error. This may lead to a NULL pointer dereference later, when __clk_get() tries to dereference an error pointer. Check the passed clock pointer instead to fix this. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: 245bd6f6af8a62a2 ("PM / clock_ops: Add pm_clk_add_clk()") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03mmc: sdhci: Restore behavior while creating OCR maskUlf Hansson
commit 5fd26c7ecb32082745b0bd33c8e35badd1cb5a91 upstream. Commit 3a48edc4bd68 ("mmc: sdhci: Use mmc core regulator infrastucture") changed the behavior for how to assign the ocr_avail mask for the mmc host. More precisely it started to mask the bits instead of assigning them. Restore the behavior, but also make it clear that an OCR mask created from an external regulator overrides the other ones. The OCR mask is determined by one of the following with this priority: 1. Supported ranges of external regulator if one supplies VDD 2. Host OCR mask if set by the driver (based on DT properties) 3. The capabilities reported by the controller itself Fixes: 3a48edc4bd68 ("mmc: sdhci: Use mmc core regulator infrastucture") Cc: Tim Kryger <tim.kryger@gmail.com> Reported-by: Yangbo Lu <yangbo.lu@freescale.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Tim Kryger <tim.kryger@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03mmc: card: Fixup request missing in mmc_blk_issue_rw_rqDing Wang
commit 29535f7b797df35cc9b6b3bca635591cdd3dd2a8 upstream. The current handler of MMC_BLK_CMD_ERR in mmc_blk_issue_rw_rq function may cause new coming request permanent missing when the ongoing request (previoulsy started) complete end. The problem scenario is as follows: (1) Request A is ongoing; (2) Request B arrived, and finally mmc_blk_issue_rw_rq() is called; (3) Request A encounters the MMC_BLK_CMD_ERR error; (4) In the error handling of MMC_BLK_CMD_ERR, suppose mmc_blk_cmd_err() end request A completed and return zero. Continue the error handling, suppose mmc_blk_reset() reset device success; (5) Continue the execution, while loop completed because variable ret is zero now; (6) Finally, mmc_blk_issue_rw_rq() return without processing request B. The process related to the missing request may wait that IO request complete forever, possibly crashing the application or hanging the system. Fix this issue by starting new request when reset success. Signed-off-by: Ding Wang <justin.wang@spreadtrum.com> Fixes: 67716327eec7 ("mmc: block: add eMMC hardware reset support") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03serial: samsung: only use earlycon for consoleArnd Bergmann
commit 357d56151976a78d90dc3dfac01777de0ef05212 upstream. A configuration that enables earlycon but not the core console code causes a link error: drivers/built-in.o: In function `setup_earlycon': drivers/tty/serial/earlycon.c:70: undefined reference to `uart_parse_earlycon' That error can be triggered by the newly added samsung earlycon support, which is missing a 'select' statement. As suggested by Peter Hurley, solves the problem by moving the 'select SERIAL_EARLYCON' statement to the samsung console driver option, as it is done by all other console drivers. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: b94ba0328d3b3 ("serial: samsung: Add support for early console") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPI / PCI: Fix regressions caused by resource_size_t overflow with 32-bit ↵Jiang Liu
kernel commit 1fb01ca93a1348a1469b8777326cd7632483de77 upstream. Zoltan Boszormenyi reported this regression: "There's a Realtek RTL8111/8168/8411 (PCI ID 10ec:8168, Subsystem ID 1565:230e) network chip on the mainboard. After the r8169 driver loaded the IRQs in the machine went berserk. Keyboard keypressed arrived with considerable latency and duplicated, so no real work was possible. The machine responded to the power button but didn't actually power down. It just stuck at the powering down message. I had to press the power button for 4 seconds to power it down. The computer is a POS machine with a big battery inside. Because of this, either ACPI or the Realtek chip kept the bad state and after rebooting, the network chip didn't even show up in lspci. Not even the PXE ROM announced itself during boot. I had to disconnect the battery to beat some sense back to the computer. The regression happens with 4.0.5, 4.1.0-rc8 and 4.1.0-final. 3.18.16 was good." The regression is caused by commit 593669c2ac0f (x86/PCI/ACPI: Use common ACPI resource interfaces to simplify implementation). Since commit 593669c2ac0f, x86 PCI ACPI host bridge driver validates ACPI resources by first converting an ACPI resource to a 'struct resource' structure and then applying checks against the converted resource structure. The 'start' and 'end' fields in 'struct resource' are defined to be type of resource_size_t, which may be 32 bits or 64 bits depending on CONFIG_PHYS_ADDR_T_64BIT. This may cause incorrect resource validation results with 32-bit kernels because 64-bit ACPI resource descriptors may get truncated when converting to 32-bit 'start' and 'end' fields in 'struct resource'. It eventually affects PCI resource allocation subsystem and makes some PCI devices and the system behave abnormally due to incorrect resource assignment. So enhance the ACPI resource parsing interfaces to ignore ACPI resource descriptors with address/offset above 4G when running in 32-bit mode. With the fix applied, the behavior of the machine was restored to how 3.18.16 worked, i.e. the memory range that is over 4GB is ignored again, and lspci -vvxxx shows that everything is at the same memory window as they were with 3.18.16. Reported-and-tested-by: Boszormenyi Zoltan <zboszor@pr.hu> Fixes: 593669c2ac0f (x86/PCI/ACPI: Use common ACPI resource interfaces to simplify implementation) Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPICA: Tables: Enable default 64-bit FADT addresses favorLv Zheng
commit 0ea61381788a37d864f9841b0fe97d40f7058f3b upstream. ACPICA commit 4da56eeae0749dfe8491285c1e1fad48f6efafd8 The following commit temporarily disables correct 64-bit FADT addresses favor during the period the root cause of the bug is not fixed: Commit: 85dbd5801f62b66e2aa7826aaefcaebead44c8a6 ACPICA: Tables: Restore old behavor to favor 32-bit FADT addresses. With enough protections, this patch re-enables 64-bit FADT addresses by default. If regressions are reported against such change, this patch should be bisected and reverted. Note that 64-bit FACS favor and 64-bit firmware waking vector favor are excluded by this commit in order not to break OSPMs. Lv Zheng. Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021 Link: https://github.com/acpica/acpica/commit/4da56eea Reported-and-tested-by: Oswald Buddenhagen <ossi@kde.org> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPICA: Tables: Fix an issue that FACS initialization is performed twiceLv Zheng
commit c04be18448355441a0c424362df65b6422e27bda upstream. ACPICA commit 90f5332a15e9d9ba83831ca700b2b9f708274658 This patch adds a new FACS initialization flag for acpi_tb_initialize(). acpi_enable_subsystem() might be invoked several times in OS bootup process, and we don't want FACS initialization to be invoked twice. Lv Zheng. Link: https://github.com/acpica/acpica/commit/90f5332a Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPICA: Tables: Enable both 32-bit and 64-bit FACSLv Zheng
commit c04e1fb4396d27f18296db0f914760fa7fe8223a upstream. ACPICA commit f7b86f35416e3d1f71c3d816ff5075ddd33ed486 The following commit is reported to have broken s2ram on some platforms: Commit: 0249ed2444d65d65fc3f3f64f398f1ad0b7e54cd ACPICA: Add option to favor 32-bit FADT addresses. The platform reports 2 FACS tables (which is not allowed by ACPI specification) and the new 32-bit address favor rule forces OSPMs to use the FACS table reported via FADT's X_FIRMWARE_CTRL field. The root cause of the reported bug might be one of the followings: 1. BIOS may favor the 64-bit firmware waking vector address when the version of the FACS is greater than 0 and Linux currently only supports resuming from the real mode, so the 64-bit firmware waking vector has never been set and might be invalid to BIOS while the commit enables higher version FACS. 2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the FADT while the commit doesn't set the firmware waking vector address of the FACS reported by "FIRMWARE_CTRL", it only sets the firware waking vector address of the FACS reported by "X_FIRMWARE_CTRL". This patch excludes the cases that can trigger the bugs caused by the root cause 2. There is no handshaking mechanism can be used by OSPM to tell BIOS which FACS is currently used. Thus the FACS reported by "FIRMWARE_CTRL" may still be used by BIOS and the 0 value of the 32-bit firmware waking vector might trigger such failure. This patch tries to favor 32bit FACS address in another way where both the FACS reported by "FIRMWARE_CTRL" and the FACS reported by "X_FIRMWARE_CTRL" are loaded so that further commit can set firmware waking vector in the both tables to ensure we can exclude the cases that trigger the bugs caused by the root cause 2. The exclusion is split into 2 commits as this commit is also useful for dumping more ACPI tables, it won't get reverted when such exclusion is no longer necessary. Lv Zheng. Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021 Link: https://github.com/acpica/acpica/commit/f7b86f35 Reported-and-tested-by: Oswald Buddenhagen <ossi@kde.org> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPI / LPSS: Fix up acpi_lpss_create_device()Rafael J. Wysocki
commit d3e13ff3c1aa2403d9a5f371baac088daeb8f56d upstream. Fix a return value (which should be a negative error code) and a memory leak (the list allocated by acpi_dev_get_resources() needs to be freed on ioremap() errors too) in acpi_lpss_create_device() introduced by commit 4483d59e29fe 'ACPI / LPSS: check the result of ioremap()'. Fixes: 4483d59e29fe 'ACPI / LPSS: check the result of ioremap()' Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stageRafael J. Wysocki
commit 0294112ee3135fbd15eaa70015af8283642dd970 upstream. This effectively reverts the following three commits: 7bc10388ccdd ACPI / resources: free memory on error in add_region_before() 0f1b414d1907 ACPI / PNP: Avoid conflicting resource reservations b9a5e5e18fbf ACPI / init: Fix the ordering of acpi_reserve_resources() (commit b9a5e5e18fbf introduced regressions some of which, but not all, were addressed by commit 0f1b414d1907 and commit 7bc10388ccdd was a fixup on top of the latter) and causes ACPI fixed hardware resources to be reserved at the fs_initcall_sync stage of system initialization. The story is as follows. First, a boot regression was reported due to an apparent resource reservation ordering change after a commit that shouldn't lead to such changes. Investigation led to the conclusion that the problem happened because acpi_reserve_resources() was executed at the device_initcall() stage of system initialization which wasn't strictly ordered with respect to driver initialization (and with respect to the initialization of the pcieport driver in particular), so a random change causing the device initcalls to be run in a different order might break things. The response to that was to attempt to run acpi_reserve_resources() as soon as we knew that ACPI would be in use (commit b9a5e5e18fbf). However, that turned out to be too early, because it caused resource reservations made by the PNP system driver to fail on at least one system and that failure was addressed by commit 0f1b414d1907. That fix still turned out to be insufficient, though, because calling acpi_reserve_resources() before the fs_initcall stage of system initialization caused a boot regression to happen on the eCAFE EC-800-H20G/S netbook. That meant that we only could call acpi_reserve_resources() at the fs_initcall initialization stage or later, but then we might just as well call it after the PNP initalization in which case commit 0f1b414d1907 wouldn't be necessary any more. For this reason, the changes made by commit 0f1b414d1907 are reverted (along with a memory leak fixup on top of that commit), the changes made by commit b9a5e5e18fbf that went too far are reverted too and acpi_reserve_resources() is changed into fs_initcall_sync, which will cause it to be executed after the PNP subsystem initialization (which is an fs_initcall) and before device initcalls (including the pcieport driver initialization) which should avoid the initial issue. Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581 Link: http://marc.info/?t=143092384600002&r=1&w=2 Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831 Link: http://marc.info/?t=143389402600001&r=1&w=2 Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()" Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPI / resources: free memory on error in add_region_before()Dan Carpenter
commit 7bc10388ccdd79b3d20463151a1f8e7a590a775b upstream. There is a small memory leak on error. Fixes: 0f1b414d1907 (ACPI / PNP: Avoid conflicting resource reservations) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03crush: fix a bug in tree bucket decodeIlya Dryomov
commit 82cd003a77173c91b9acad8033fb7931dac8d751 upstream. struct crush_bucket_tree::num_nodes is u8, so ceph_decode_8_safe() should be used. -Wconversion catches this, but I guess it went unnoticed in all the noise it spews. The actual problem (at least for common crushmaps) isn't the u32 -> u8 truncation though - it's the advancement by 4 bytes instead of 1 in the crushmap buffer. Fixes: http://tracker.ceph.com/issues/2759 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03fuse: initialize fc->release before calling itMiklos Szeredi
commit 0ad0b3255a08020eaf50e34ef0d6df5bdf5e09ed upstream. fc->release is called from fuse_conn_put() which was used in the error cleanup before fc->release was initialized. [Jeremiah Mahler <jmmahler@gmail.com>: assign fc->release after calling fuse_conn_init(fc) instead of before.] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Fixes: a325f9b92273 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03selinux: fix mprotect PROT_EXEC regression caused by mm changeStephen Smalley
commit 892e8cac99a71f6254f84fc662068d912e1943bf upstream. commit 66fc13039422ba7df2d01a8ee0873e4ef965b50b ("mm: shmem_zero_setup skip security check and lockdep conflict with XFS") caused a regression for SELinux by disabling any SELinux checking of mprotect PROT_EXEC on shared anonymous mappings. However, even before that regression, the checking on such mprotect PROT_EXEC calls was inconsistent with the checking on a mmap PROT_EXEC call for a shared anonymous mapping. On a mmap, the security hook is passed a NULL file and knows it is dealing with an anonymous mapping and therefore applies an execmem check and no file checks. On a mprotect, the security hook is passed a vma with a non-NULL vm_file (as this was set from the internally-created shmem file during mmap) and therefore applies the file-based execute check and no execmem check. Since the aforementioned commit now marks the shmem zero inode with the S_PRIVATE flag, the file checks are disabled and we have no checking at all on mprotect PROT_EXEC. Add a test to the mprotect hook logic for such private inodes, and apply an execmem check in that case. This makes the mmap and mprotect checking consistent for shared anonymous mappings, as well as for /dev/zero and ashmem. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03selinux: don't waste ebitmap space when importing NetLabel categoriesPaul Moore
commit 3324603524925c7727207027d1c15e597412d15e upstream. At present we don't create efficient ebitmaps when importing NetLabel category bitmaps. This can present a problem when comparing ebitmaps since ebitmap_cmp() is very strict about these things and considers these wasteful ebitmaps not equal when compared to their more efficient counterparts, even if their values are the same. This isn't likely to cause problems on 64-bit systems due to a bit of luck on how NetLabel/CIPSO works and the default ebitmap size, but it can be a problem on 32-bit systems. This patch fixes this problem by being a bit more intelligent when importing NetLabel category bitmaps by skipping over empty sections which should result in a nice, efficient ebitmap. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03Btrfs: fix file corruption after cloning inline extentsFilipe Manana
commit ed958762644b404654a6f5d23e869f496fe127c6 upstream. Using the clone ioctl (or extent_same ioctl, which calls the same extent cloning function as well) we end up allowing copy an inline extent from the source file into a non-zero offset of the destination file. This is something not expected and that the btrfs code is not prepared to deal with - all inline extents must be at a file offset equals to 0. For example, the following excerpt of a test case for fstests triggers a crash/BUG_ON() on a write operation after an inline extent is cloned into a non-zero offset: _scratch_mkfs >>$seqres.full 2>&1 _scratch_mount # Create our test files. File foo has the same 2K of data at offset 4K # as file bar has at its offset 0. $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 4K" \ -c "pwrite -S 0xbb 4k 2K" \ -c "pwrite -S 0xcc 8K 4K" \ $SCRATCH_MNT/foo | _filter_xfs_io # File bar consists of a single inline extent (2K size). $XFS_IO_PROG -f -s -c "pwrite -S 0xbb 0 2K" \ $SCRATCH_MNT/bar | _filter_xfs_io # Now call the clone ioctl to clone the extent of file bar into file # foo at its offset 4K. This made file foo have an inline extent at # offset 4K, something which the btrfs code can not deal with in future # IO operations because all inline extents are supposed to start at an # offset of 0, resulting in all sorts of chaos. # So here we validate that clone ioctl returns an EOPNOTSUPP, which is # what it returns for other cases dealing with inlined extents. $CLONER_PROG -s 0 -d $((4 * 1024)) -l $((2 * 1024)) \ $SCRATCH_MNT/bar $SCRATCH_MNT/foo # Because of the inline extent at offset 4K, the following write made # the kernel crash with a BUG_ON(). $XFS_IO_PROG -c "pwrite -S 0xdd 6K 2K" $SCRATCH_MNT/foo | _filter_xfs_io status=0 exit The stack trace of the BUG_ON() triggered by the last write is: [152154.035903] ------------[ cut here ]------------ [152154.036424] kernel BUG at mm/page-writeback.c:2286! [152154.036424] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [152154.036424] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse parport_pc acpi_cpu$ [152154.036424] CPU: 2 PID: 17873 Comm: xfs_io Tainted: G W 4.1.0-rc6-btrfs-next-11+ #2 [152154.036424] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 [152154.036424] task: ffff880429f70990 ti: ffff880429efc000 task.ti: ffff880429efc000 [152154.036424] RIP: 0010:[<ffffffff8111a9d5>] [<ffffffff8111a9d5>] clear_page_dirty_for_io+0x1e/0x90 [152154.036424] RSP: 0018:ffff880429effc68 EFLAGS: 00010246 [152154.036424] RAX: 0200000000000806 RBX: ffffea0006a6d8f0 RCX: 0000000000000001 [152154.036424] RDX: 0000000000000000 RSI: ffffffff81155d1b RDI: ffffea0006a6d8f0 [152154.036424] RBP: ffff880429effc78 R08: ffff8801ce389fe0 R09: 0000000000000001 [152154.036424] R10: 0000000000002000 R11: ffffffffffffffff R12: ffff8800200dce68 [152154.036424] R13: 0000000000000000 R14: ffff8800200dcc88 R15: ffff8803d5736d80 [152154.036424] FS: 00007fbf119f6700(0000) GS:ffff88043d280000(0000) knlGS:0000000000000000 [152154.036424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [152154.036424] CR2: 0000000001bdc000 CR3: 00000003aa555000 CR4: 00000000000006e0 [152154.036424] Stack: [152154.036424] ffff8803d5736d80 0000000000000001 ffff880429effcd8 ffffffffa04e97c1 [152154.036424] ffff880429effd68 ffff880429effd60 0000000000000001 ffff8800200dc9c8 [152154.036424] 0000000000000001 ffff8800200dcc88 0000000000000000 0000000000001000 [152154.036424] Call Trace: [152154.036424] [<ffffffffa04e97c1>] lock_and_cleanup_extent_if_need+0x147/0x18d [btrfs] [152154.036424] [<ffffffffa04ea82c>] __btrfs_buffered_write+0x245/0x4c8 [btrfs] [152154.036424] [<ffffffffa04ed14b>] ? btrfs_file_write_iter+0x150/0x3e0 [btrfs] [152154.036424] [<ffffffffa04ed15a>] ? btrfs_file_write_iter+0x15f/0x3e0 [btrfs] [152154.036424] [<ffffffffa04ed2c7>] btrfs_file_write_iter+0x2cc/0x3e0 [btrfs] [152154.036424] [<ffffffff81165a4a>] __vfs_write+0x7c/0xa5 [152154.036424] [<ffffffff81165f89>] vfs_write+0xa0/0xe4 [152154.036424] [<ffffffff81166855>] SyS_pwrite64+0x64/0x82 [152154.036424] [<ffffffff81465197>] system_call_fastpath+0x12/0x6f [152154.036424] Code: 48 89 c7 e8 0f ff ff ff 5b 41 5c 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 ae ef 00 00 49 89 c4 48 8b 03 a8 01 75 02 <0f> 0b 4d 85 e4 74 59 49 8b 3c 2$ [152154.036424] RIP [<ffffffff8111a9d5>] clear_page_dirty_for_io+0x1e/0x90 [152154.036424] RSP <ffff880429effc68> [152154.242621] ---[ end trace e3d3376b23a57041 ]--- Fix this by returning the error EOPNOTSUPP if an attempt to copy an inline extent into a non-zero offset happens, just like what is done for other scenarios that would require copying/splitting inline extents, which were introduced by the following commits: 00fdf13a2e9f ("Btrfs: fix a crash of clone with inline extents's split") 3f9e3df8da3c ("btrfs: replace error code from btrfs_drop_extents") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03Btrfs: fix list transaction->pending_ordered corruptionFilipe Manana
commit d3efe08400317888f559bbedf0e42cd31575d0ef upstream. When we call btrfs_commit_transaction(), we splice the list "ordered" of our transaction handle into the transaction's "pending_ordered" list, but we don't re-initialize the "ordered" list of our transaction handle, this means it still points to the same elements it used to before the splice. Then we check if the current transaction's state is >= TRANS_STATE_COMMIT_START and if it is we end up calling btrfs_end_transaction() which simply splices again the "ordered" list of our handle into the transaction's "pending_ordered" list, leaving multiple pointers to the same ordered extents which results in list corruption when we are iterating, removing and freeing ordered extents at btrfs_wait_pending_ordered(), resulting in access to dangling pointers / use-after-free issues. Similarly, btrfs_end_transaction() can end up in some cases calling btrfs_commit_transaction(), and both did a list splice of the transaction handle's "ordered" list into the transaction's "pending_ordered" without re-initializing the handle's "ordered" list, resulting in exactly the same problem. This produces the following warning on a kernel with linked list debugging enabled: [109749.265416] ------------[ cut here ]------------ [109749.266410] WARNING: CPU: 7 PID: 324 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98() [109749.267969] list_del corruption. prev->next should be ffff8800ba087e20, but was fffffff8c1f7c35d (...) [109749.287505] Call Trace: [109749.288135] [<ffffffff8145f077>] dump_stack+0x4f/0x7b [109749.298080] [<ffffffff81095de5>] ? console_unlock+0x356/0x3a2 [109749.331605] [<ffffffff8104b3b0>] warn_slowpath_common+0xa1/0xbb [109749.334849] [<ffffffff81260642>] ? __list_del_entry+0x5a/0x98 [109749.337093] [<ffffffff8104b410>] warn_slowpath_fmt+0x46/0x48 [109749.337847] [<ffffffff81260642>] __list_del_entry+0x5a/0x98 [109749.338678] [<ffffffffa053e8bf>] btrfs_wait_pending_ordered+0x46/0xdb [btrfs] [109749.340145] [<ffffffffa058a65f>] ? __btrfs_run_delayed_items+0x149/0x163 [btrfs] [109749.348313] [<ffffffffa054077d>] btrfs_commit_transaction+0x36b/0xa10 [btrfs] [109749.349745] [<ffffffff81087310>] ? trace_hardirqs_on+0xd/0xf [109749.350819] [<ffffffffa055370d>] btrfs_sync_file+0x36f/0x3fc [btrfs] [109749.351976] [<ffffffff8118ec98>] vfs_fsync_range+0x8f/0x9e [109749.360341] [<ffffffff8118ecc3>] vfs_fsync+0x1c/0x1e [109749.368828] [<ffffffff8118ee1d>] do_fsync+0x34/0x4e [109749.369790] [<ffffffff8118f045>] SyS_fsync+0x10/0x14 [109749.370925] [<ffffffff81465197>] system_call_fastpath+0x12/0x6f [109749.382274] ---[ end trace 48e0d07f7c03d95a ]--- On a non-debug kernel this leads to invalid memory accesses, causing a crash. Fix this by using list_splice_init() instead of list_splice() in btrfs_commit_transaction() and btrfs_end_transaction(). Fixes: 50d9aa99bd35 ("Btrfs: make sure logged extents complete in the current transaction V3" Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03Btrfs: fix memory leak in the extent_same ioctlFilipe Manana
commit 497b4050e0eacd4c746dd396d14916b1e669849d upstream. We were allocating memory with memdup_user() but we were never releasing that memory. This affected pretty much every call to the ioctl, whether it deduplicated extents or not. This issue was reported on IRC by Julian Taylor and on the mailing list by Marcel Ritter, credit goes to them for finding the issue. Reported-by: Julian Taylor <jtaylor.debian@googlemail.com> Reported-by: Marcel Ritter <ritter.marcel@gmail.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03Btrfs: fix fsync data loss after append writeFilipe Manana
commit e4545de5b035c7debb73d260c78377dbb69cbfb5 upstream. If we do an append write to a file (which increases its inode's i_size) that does not have the flag BTRFS_INODE_NEEDS_FULL_SYNC set in its inode, and the previous transaction added a new hard link to the file, which sets the flag BTRFS_INODE_COPY_EVERYTHING in the file's inode, and then fsync the file, the inode's new i_size isn't logged. This has the consequence that after the fsync log is replayed, the file size remains what it was before the append write operation, which means users/applications will not be able to read the data that was successsfully fsync'ed before. This happens because neither the inode item nor the delayed inode get their i_size updated when the append write is made - doing so would require starting a transaction in the buffered write path, something that we do not do intentionally for performance reasons. Fix this by making sure that when the flag BTRFS_INODE_COPY_EVERYTHING is set the inode is logged with its current i_size (log the in-memory inode into the log tree). This issue is not a recent regression and is easy to reproduce with the following test case for fstests: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" here=`pwd` tmp=/tmp/$$ status=1 # failure is the default! _cleanup() { _cleanup_flakey rm -f $tmp.* } trap "_cleanup; exit \$status" 0 1 2 3 15 # get standard environment, filters and checks . ./common/rc . ./common/filter . ./common/dmflakey # real QA test starts here _supported_fs generic _supported_os Linux _need_to_be_root _require_scratch _require_dm_flakey _require_metadata_journaling $SCRATCH_DEV _crash_and_mount() { # Simulate a crash/power loss. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey # Allow writes again and mount. This makes the fs replay its fsync log. _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey } rm -f $seqres.full _scratch_mkfs >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create the test file with some initial data and then fsync it. # The fsync here is only needed to trigger the issue in btrfs, as it causes the # the flag BTRFS_INODE_NEEDS_FULL_SYNC to be removed from the btrfs inode. $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 32k" \ -c "fsync" \ $SCRATCH_MNT/foo | _filter_xfs_io sync # Add a hard link to our file. # On btrfs this sets the flag BTRFS_INODE_COPY_EVERYTHING on the btrfs inode, # which is a necessary condition to trigger the issue. ln $SCRATCH_MNT/foo $SCRATCH_MNT/bar # Sync the filesystem to force a commit of the current btrfs transaction, this # is a necessary condition to trigger the bug on btrfs. sync # Now append more data to our file, increasing its size, and fsync the file. # In btrfs because the inode flag BTRFS_INODE_COPY_EVERYTHING was set and the # write path did not update the inode item in the btree nor the delayed inode # item (in memory struture) in the current transaction (created by the fsync # handler), the fsync did not record the inode's new i_size in the fsync # log/journal. This made the data unavailable after the fsync log/journal is # replayed. $XFS_IO_PROG -c "pwrite -S 0xbb 32K 32K" \ -c "fsync" \ $SCRATCH_MNT/foo | _filter_xfs_io echo "File content after fsync and before crash:" od -t x1 $SCRATCH_MNT/foo _crash_and_mount echo "File content after crash and log replay:" od -t x1 $SCRATCH_MNT/foo status=0 exit The expected file output before and after the crash/power failure expects the appended data to be available, which is: 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa * 0100000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb * 0200000 Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03Btrfs: fix race between caching kthread and returning inode to inode cacheFilipe Manana
commit ae9d8f17118551bedd797406a6768b87c2146234 upstream. While the inode cache caching kthread is calling btrfs_unpin_free_ino(), we could have a concurrent call to btrfs_return_ino() that adds a new entry to the root's free space cache of pinned inodes. This concurrent call does not acquire the fs_info->commit_root_sem before adding a new entry if the caching state is BTRFS_CACHE_FINISHED, which is a problem because the caching kthread calls btrfs_unpin_free_ino() after setting the caching state to BTRFS_CACHE_FINISHED and therefore races with the task calling btrfs_return_ino(), which is adding a new entry, while the former (caching kthread) is navigating the cache's rbtree, removing and freeing nodes from the cache's rbtree without acquiring the spinlock that protects the rbtree. This race resulted in memory corruption due to double free of struct btrfs_free_space objects because both tasks can end up doing freeing the same objects. Note that adding a new entry can result in merging it with other entries in the cache, in which case those entries are freed. This is particularly important as btrfs_free_space structures are also used for the block group free space caches. This memory corruption can be detected by a debugging kernel, which reports it with the following trace: [132408.501148] slab error in verify_redzone_free(): cache `btrfs_free_space': double free detected [132408.505075] CPU: 15 PID: 12248 Comm: btrfs-ino-cache Tainted: G W 4.1.0-rc5-btrfs-next-10+ #1 [132408.505075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 [132408.505075] ffff880023e7d320 ffff880163d73cd8 ffffffff8145eec7 ffffffff81095dce [132408.505075] ffff880009735d40 ffff880163d73ce8 ffffffff81154e1e ffff880163d73d68 [132408.505075] ffffffff81155733 ffffffffa054a95a ffff8801b6099f00 ffffffffa0505b5f [132408.505075] Call Trace: [132408.505075] [<ffffffff8145eec7>] dump_stack+0x4f/0x7b [132408.505075] [<ffffffff81095dce>] ? console_unlock+0x356/0x3a2 [132408.505075] [<ffffffff81154e1e>] __slab_error.isra.28+0x25/0x36 [132408.505075] [<ffffffff81155733>] __cache_free+0xe2/0x4b6 [132408.505075] [<ffffffffa054a95a>] ? __btrfs_add_free_space+0x2f0/0x343 [btrfs] [132408.505075] [<ffffffffa0505b5f>] ? btrfs_unpin_free_ino+0x8e/0x99 [btrfs] [132408.505075] [<ffffffff810f3b30>] ? time_hardirqs_off+0x15/0x28 [132408.505075] [<ffffffff81084d42>] ? trace_hardirqs_off+0xd/0xf [132408.505075] [<ffffffff811563a1>] ? kfree+0xb6/0x14e [132408.505075] [<ffffffff811563d0>] kfree+0xe5/0x14e [132408.505075] [<ffffffffa0505b5f>] btrfs_unpin_free_ino+0x8e/0x99 [btrfs] [132408.505075] [<ffffffffa0505e08>] caching_kthread+0x29e/0x2d9 [btrfs] [132408.505075] [<ffffffffa0505b6a>] ? btrfs_unpin_free_ino+0x99/0x99 [btrfs] [132408.505075] [<ffffffff8106698f>] kthread+0xef/0xf7 [132408.505075] [<ffffffff810f3b08>] ? time_hardirqs_on+0x15/0x28 [132408.505075] [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad [132408.505075] [<ffffffff814653d2>] ret_from_fork+0x42/0x70 [132408.505075] [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad [132408.505075] ffff880023e7d320: redzone 1:0x9f911029d74e35b, redzone 2:0x9f911029d74e35b. [132409.501654] slab: double free detected in cache 'btrfs_free_space', objp ffff880023e7d320 [132409.503355] ------------[ cut here ]------------ [132409.504241] kernel BUG at mm/slab.c:2571! Therefore fix this by having btrfs_unpin_free_ino() acquire the lock that protects the rbtree while doing the searches and removing entries. Fixes: 1c70d8fb4dfa ("Btrfs: fix inode caching vs tree log") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03Btrfs: use kmem_cache_free when freeing entry in inode cacheFilipe Manana
commit c3f4a1685bb87e59c886ee68f7967eae07d4dffa upstream. The free space entries are allocated using kmem_cache_zalloc(), through __btrfs_add_free_space(), therefore we should use kmem_cache_free() and not kfree() to avoid any confusion and any potential problem. Looking at the kfree() definition at mm/slab.c it has the following comment: /* * (...) * * Don't free memory not originally allocated by kmalloc() * or you will run into trouble. */ So better be safe and use kmem_cache_free(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03md: fix a build warningFiro Yang
commit 4e023612325a9034a542bfab79f78b1fe5ebb841 upstream. Warning like this: drivers/md/md.c: In function "update_array_info": drivers/md/md.c:6394:26: warning: logical not is only applied to the left hand side of comparison [-Wlogical-not-parentheses] !mddev->persistent != info->not_persistent|| Fix it as Neil Brown said: mddev->persistent != !info->not_persistent || Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03Btrfs: don't invalidate root dentry when subvolume deletion failsOmar Sandoval
commit 64ad6c488975d7516230cf7849190a991fd615ae upstream. Since commit bafc9b754f75 ("vfs: More precise tests in d_invalidate"), mounted subvolumes can be deleted because d_invalidate() won't fail. However, we run into problems when we attempt to delete the default subvolume while it is mounted as the root filesystem: # btrfs subvol list / ID 257 gen 306 top level 5 path rootvol ID 267 gen 334 top level 5 path snap1 # btrfs subvol get-default / ID 267 gen 334 top level 5 path snap1 # btrfs inspect-internal rootid / 267 # mount -o subvol=/ /dev/vda1 /mnt # btrfs subvol del /mnt/snap1 Delete subvolume (no-commit): '/mnt/snap1' ERROR: cannot delete '/mnt/snap1' - Operation not permitted # findmnt / findmnt: can't read /proc/mounts: No such file or directory # ls /proc # Markus reported that this same scenario simply led to a kernel oops. This happens because in btrfs_ioctl_snap_destroy(), we call d_invalidate() before we check may_destroy_subvol(), which means that we detach the submounts and drop the dentry before erroring out. Instead, we should only invalidate the dentry once the deletion has succeeded. Additionally, the shrink_dcache_sb() isn't necessary; d_invalidate() will prune the dcache for the deleted subvolume. Fixes: bafc9b754f75 ("vfs: More precise tests in d_invalidate") Reported-by: Markus Schauler <mschauler@gmail.com> Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ARM: dts: mx23: fix iio-hwmon supportStefan Wahren
commit e8e94ed6285428ab780cd7b0df4622f71eceb39e upstream. In order to get iio-hwmon support, the lradc must be declared as an iio provider. So fix this issue by adding the #io-channel-cells property. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Fixes: bd798f9c7b30 ("ARM: dts: mxs: Add iio-hwmon to mx23 soc") Reviewed-by: Marek Vasut <marex@denx.de> Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03hwmon: (nct7802) fix visibility of temp3Constantine Shulyupin
commit 56172d81a9bc37a69b95dd627b8d48135c9c7b31 upstream. Excerpt from datasheet: 7.2.32 Mode Selection Register RTD3_MD : 00=Closed , 01=Reserved , 10=Thermistor mode , 11=Voltage sense Show temp3 only in Thermistor mode Signed-off-by: Constantine Shulyupin <const@MakeLinux.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03hwmon: (mcp3021) Fix broken output scalingStevens, Nick
commit 347d7e45bd09ce09cbc30d5cea9de377eb22f55c upstream. The mcp3021 scaling code is dividing the VDD (full-scale) value in millivolts by the A2D resolution to obtain the scaling factor. When VDD is 3300mV (the standard value) and the resolution is 12-bit (4096 divisions), the result is a scale factor of 3300/4096, which is always one. Effectively, the raw A2D reading is always being returned because no scaling is applied. This patch fixes the issue and simplifies the register-to-volts calculation, removing the unneeded "output_scale" struct member. Signed-off-by: Nick Stevens <Nick.Stevens@digi.com> [Guenter Roeck: Dropped unnecessary value check] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03md: Skip cluster setup for dm-raidGoldwyn Rodrigues
commit d3b178adb3a3adf54ecf77758138b654c3ee7f09 upstream. There is a bug that the bitmap superblock isn't initialised properly for dm-raid, so a new field can have garbage in new fields. (dm-raid does initialisation in the kernel - md initialised the superblock in mdadm). This means that for dm-raid we cannot currently trust the new ->nodes field. So: - use __GFP_ZERO to initialise the superblock properly for all new arrays - initialise all fields in bitmap_info in bitmap_new_disk_sb - ignore ->nodes for dm arrays (yes, this is a hack) This bug exposes dm-raid to bug in the (still experimental) md-cluster code, so it is suitable for -stable. It does cause crashes. References: https://bugzilla.kernel.org/show_bug.cgi?id=100491 Signed-off-By: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03md: unlock mddev_lock on an error path.NeilBrown
commit 9a8c0fa861e4db60409b4dda254cef5e17e4d43c upstream. This error path retuns while still holding the lock - bad. Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03md: clear mddev->private when it has been freed.NeilBrown
commit bd6919228d7e1867ae9e24ab27e3e4a366c87d21 upstream. If ->private is set when ->run is called, it is assumed to be a 'config' prepared as part of 'reshape'. So it is important when we free that config, that we also clear ->private. This is not often a problem as the mddev will normally be discarded shortly after the config us freed. However if an 'assemble' races with a final close, the assemble can use the old mddev which has a stale ->private. This leads to any of various sorts of crashes. So clear ->private after calling ->free(). Reported-by: Nate Clark <nate@neworld.us> Fixes: afa0f557cb15 ("md: rename ->stop to ->free") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03dmaengine: mv_xor: bug fix for racing condition in descriptors cleanupLior Amsalem
commit 9136291f1dbc1d4d1cacd2840fb35f4f3ce16c46 upstream. This patch fixes a bug in the XOR driver where the cleanup function can be called and free descriptors that never been processed by the engine (which result in data errors). The cleanup function will free descriptors based on the ownership bit in the descriptors. Fixes: ff7b04796d98 ("dmaengine: DMA engine driver for Marvell XOR engine") Signed-off-by: Lior Amsalem <alior@marvell.com> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Reviewed-by: Ofer Heifetz <oferh@marvell.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03tracing: Fix sample output of dynamic arraysSteven Rostedt (Red Hat)
commit d6726c8145290bef950ae2538ea6ae1d96a1944b upstream. He Kuang noticed that the trace event samples for arrays was broken: "The output result of trace_foo_bar event in traceevent samples is wrong. This problem can be reproduced as following: (Build kernel with SAMPLE_TRACE_EVENTS=m) $ insmod trace-events-sample.ko $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable $ cat /sys/kernel/debug/tracing/trace event-sample-980 [000] .... 43.649559: foo_bar: foo hello 21 0x15 BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The array length is not right, should be {0x1}. (ffffffff,ffffffff) event-sample-980 [000] .... 44.653827: foo_bar: foo hello 22 0x16 BIT2|BIT3|0x10 {0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7} ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The array length is not right, should be {0x1,0x2}. Gandalf (ffffffff,ffffffff)" This was caused by an update to have __print_array()'s second parameter be the count of items in the array and not the size of the array. As there is already users of __print_array(), it can not change. But the sample code can and we can also improve on the documentation about __print_array() and __get_dynamic_array_len(). Link: http://lkml.kernel.org/r/1436839171-31527-2-git-send-email-hekuang@huawei.com Fixes: ac01ce1410fc2 ("tracing: Make ftrace_print_array_seq compute buf_len") Reported-by: He Kuang <hekuang@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03tracing: Have branch tracer use recursive field of task structSteven Rostedt (Red Hat)
commit 6224beb12e190ff11f3c7d4bf50cb2922878f600 upstream. Fengguang Wu's tests triggered a bug in the branch tracer's start up test when CONFIG_DEBUG_PREEMPT set. This was because that config adds some debug logic in the per cpu field, which calls back into the branch tracer. The branch tracer has its own recursive checks, but uses a per cpu variable to implement it. If retrieving the per cpu variable calls back into the branch tracer, you can see how things will break. Instead of using a per cpu variable, use the trace_recursion field of the current task struct. Simply set a bit when entering the branch tracing and clear it when leaving. If the bit is set on entry, just don't do the tracing. There's also the case with lockdep, as the local_irq_save() called before the recursion can also trigger code that can call back into the function. Changing that to a raw_local_irq_save() will protect that as well. This prevents the recursion and the inevitable crash that follows. Link: http://lkml.kernel.org/r/20150630141803.GA28071@wfg-t540p.sh.intel.com Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03tracing: Fix typo from "static inlin" to "static inline"Steven Rostedt (Red Hat)
commit cc9e4bde03f2b4cfba52406c021364cbd2a4a0f3 upstream. The trace.h header when called without CONFIG_EVENT_TRACING enabled (seldom done), will not compile because of a typo in the protocol of trace_event_enum_update(). Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03tracing/filter: Do not allow infix to exceed end of stringSteven Rostedt (Red Hat)
commit 6b88f44e161b9ee2a803e5b2b1fbcf4e20e8b980 upstream. While debugging a WARN_ON() for filtering, I found that it is possible for the filter string to be referenced after its end. With the filter: # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter The filter_parse() function can call infix_get_op() which calls infix_advance() that updates the infix filter pointers for the cnt and tail without checking if the filter is already at the end, which will put the cnt to zero and the tail beyond the end. The loop then calls infix_next() that has ps->infix.cnt--; return ps->infix.string[ps->infix.tail++]; The cnt will now be below zero, and the tail that is returned is already passed the end of the filter string. So far the allocation of the filter string usually has some buffer that is zeroed out, but if the filter string is of the exact size of the allocated buffer there's no guarantee that the charater after the nul terminating character will be zero. Luckily, only root can write to the filter. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03tracing/filter: Do not WARN on operand count going below zeroSteven Rostedt (Red Hat)
commit b4875bbe7e68f139bd3383828ae8e994a0df6d28 upstream. When testing the fix for the trace filter, I could not come up with a scenario where the operand count goes below zero, so I added a WARN_ON_ONCE(cnt < 0) to the logic. But there is legitimate case that it can happen (although the filter would be wrong). # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter That is, a single operation without any operands will hit the path where the WARN_ON_ONCE() can trigger. Although this is harmless, and the filter is reported as a error. But instead of spitting out a warning to the kernel dmesg, just fail nicely and report it via the proper channels. Link: http://lkml.kernel.org/r/558C6082.90608@oracle.com Reported-by: Vince Weaver <vincent.weaver@maine.edu> Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>