summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2018-09-19mm: get rid of vmacache_flush_all() entirelyLinus Torvalds
commit 7a9cdebdcc17e426fb5287e4a82db1dfe86339b2 upstream. Jann Horn points out that the vmacache_flush_all() function is not only potentially expensive, it's buggy too. It also happens to be entirely unnecessary, because the sequence number overflow case can be avoided by simply making the sequence number be 64-bit. That doesn't even grow the data structures in question, because the other adjacent fields are already 64-bit. So simplify the whole thing by just making the sequence number overflow case go away entirely, which gets rid of all the complications and makes the code faster too. Win-win. [ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics also just goes away entirely with this ] Reported-by: Jann Horn <jannh@google.com> Suggested-by: Will Deacon <will.deacon@arm.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-15r8169: add support for NCube 8168 network cardAnthony Wong
[ Upstream commit 9fd0e09a4e86499639653243edfcb417a05c5c46 ] This card identifies itself as: Ethernet controller [0200]: NCube Device [10ff:8168] (rev 06) Subsystem: TP-LINK Technologies Co., Ltd. Device [7470:3468] Adding a new entry to rtl8169_pci_tbl makes the card work. Link: http://launchpad.net/bugs/1788730 Signed-off-by: Anthony Wong <anthony.wong@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09iommu/vt-d: Fix dev iotlb pfsid useJacob Pan
commit 1c48db44924298ad0cb5a6386b88017539be8822 upstream. PFSID should be used in the invalidation descriptor for flushing device IOTLBs on SRIOV VFs. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: stable@vger.kernel.org Cc: "Ashok Raj" <ashok.raj@intel.com> Cc: "Lu Baolu" <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09iommu/vt-d: Add definitions for PFSIDJacob Pan
commit 0f725561e168485eff7277d683405c05b192f537 upstream. When SRIOV VF device IOTLB is invalidated, we need to provide the PF source ID such that IOMMU hardware can gauge the depth of invalidation queue which is shared among VFs. This is needed when device invalidation throttle (DIT) capability is supported. This patch adds bit definitions for checking and tracking PFSID. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: stable@vger.kernel.org Cc: "Ashok Raj" <ashok.raj@intel.com> Cc: "Lu Baolu" <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09Replace magic for trusting the secondary keyring with #defineYannik Sembritzki
commit 817aef260037f33ee0f44c17fe341323d3aebd6d upstream. Replace the use of a magic number that indicates that verify_*_signature() should use the secondary keyring with a symbol. Signed-off-by: Yannik Sembritzki <yannik@sembritzki.me> Signed-off-by: David Howells <dhowells@redhat.com> Cc: keyrings@vger.kernel.org Cc: linux-security-module@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09NFSv4 client live hangs after live data migration recoveryBill Baker
commit 0f90be132cbf1537d87a6a8b9e80867adac892f6 upstream. After a live data migration event at the NFS server, the client may send I/O requests to the wrong server, causing a live hang due to repeated recovery events. On the wire, this will appear as an I/O request failing with NFS4ERR_BADSESSION, followed by successful CREATE_SESSION, repeatedly. NFS4ERR_BADSSESSION is returned because the session ID being used was issued by the other server and is not valid at the old server. The failure is caused by async worker threads having cached the transport (xprt) in the rpc_task structure. After the migration recovery completes, the task is redispatched and the task resends the request to the wrong server based on the old value still present in tk_xprt. The solution is to recompute the tk_xprt field of the rpc_task structure so that the request goes to the correct server. Signed-off-by: Bill Baker <bill.baker@oracle.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Helen Chao <helen.chao@oracle.com> Fixes: fb43d17210ba ("SUNRPC: Use the multipath iterator to assign a ...") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09PCI: Add wrappers for dev_printk()Frederick Lawler
commit 7506dc7989933235e6fc23f3d0516bdbf0f7d1a8 upstream. Add PCI-specific dev_printk() wrappers and use them to simplify the code slightly. No functional change intended. Signed-off-by: Frederick Lawler <fred@fredlawl.com> [bhelgaas: squash into one patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> [only take the pci.h portion of this patch, to make backporting stuff easier over time - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05scsi: sysfs: Introduce sysfs_{un,}break_active_protection()Bart Van Assche
commit 2afc9166f79b8f6da5f347f48515215ceee4ae37 upstream. Introduce these two functions and export them such that the next patch can add calls to these functions from the SCSI core. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-24PCI: OF: Fix I/O space page leakSergei Shtylyov
commit a5fb9fb023a1435f2b42bccd7f547560f3a21dc3 upstream. When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG: kernel BUG at lib/ioremap.c:72! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092 Hardware name: Renesas Condor board based on r8a77980 (DT) Workqueue: events deferred_probe_work_func pstate: 80000005 (Nzcv daif -PAN -UAO) pc : ioremap_page_range+0x370/0x3c8 lr : ioremap_page_range+0x40/0x3c8 sp : ffff000008da39e0 x29: ffff000008da39e0 x28: 00e8000000000f07 x27: ffff7dfffee00000 x26: 0140000000000000 x25: ffff7dfffef00000 x24: 00000000000fe100 x23: ffff80007b906000 x22: ffff000008ab8000 x21: ffff000008bb1d58 x20: ffff7dfffef00000 x19: ffff800009c30fb8 x18: 0000000000000001 x17: 00000000000152d0 x16: 00000000014012d0 x15: 0000000000000000 x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 x11: 0720072007300730 x10: 00000000000000ae x9 : 0000000000000000 x8 : ffff7dffff000000 x7 : 0000000000000000 x6 : 0000000000000100 x5 : 0000000000000000 x4 : 000000007b906000 x3 : ffff80007c61a880 x2 : ffff7dfffeefffff x1 : 0000000040000000 x0 : 00e80000fe100f07 Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval)) Call trace: ioremap_page_range+0x370/0x3c8 pci_remap_iospace+0x7c/0xac pci_parse_request_of_pci_ranges+0x13c/0x190 rcar_pcie_probe+0x4c/0xb04 platform_drv_probe+0x50/0xbc driver_probe_device+0x21c/0x308 __device_attach_driver+0x98/0xc8 bus_for_each_drv+0x54/0x94 __device_attach+0xc4/0x12c device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 deferred_probe_work_func+0xb0/0x150 process_one_work+0x12c/0x29c worker_thread+0x200/0x3fc kthread+0x108/0x134 ret_from_fork+0x10/0x18 Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000) It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. Introduce the devm_pci_remap_iospace() managed API and replace the pci_remap_iospace() call with it to fix the bug. Fixes: dbf9826d5797 ("PCI: generic: Convert to DT resource parsing API") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: split commit/updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-24net/ethernet/freescale/fman: fix cross-build errorRandy Dunlap
[ Upstream commit c133459765fae249ba482f62e12f987aec4376f0 ] CC [M] drivers/net/ethernet/freescale/fman/fman.o In file included from ../drivers/net/ethernet/freescale/fman/fman.c:35: ../include/linux/fsl/guts.h: In function 'guts_set_dmacr': ../include/linux/fsl/guts.h:165:2: error: implicit declaration of function 'clrsetbits_be32' [-Werror=implicit-function-declaration] clrsetbits_be32(&guts->dmacr, 3 << shift, device << shift); ^~~~~~~~~~~~~~~ Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Madalin Bucur <madalin.bucur@nxp.com> Cc: netdev@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15cpu/hotplug: Fix SMT supported evaluationThomas Gleixner
commit bc2d8d262cba5736332cbc866acb11b1c5748aa9 upstream Josh reported that the late SMT evaluation in cpu_smt_state_init() sets cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied on the kernel command line as it cannot differentiate between SMT disabled by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and makes the sysfs interface unusable. Rework this so that during bringup of the non boot CPUs the availability of SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a 'primary' thread then set the local cpu_smt_available marker and evaluate this explicitely right after the initial SMP bringup has finished. SMT evaulation on x86 is a trainwreck as the firmware has all the information _before_ booting the kernel, but there is no interface to query it. Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") Reported-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED earlyThomas Gleixner
commit fee0aede6f4739c87179eca76136f83210953b86 upstream The CPU_SMT_NOT_SUPPORTED state is set (if the processor does not support SMT) when the sysfs SMT control file is initialized. That was fine so far as this was only required to make the output of the control file correct and to prevent writes in that case. With the upcoming l1tf command line parameter, this needs to be set up before the L1TF mitigation selection and command line parsing happens. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142323.121795971@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15cpu/hotplug: Expose SMT control init functionJiri Kosina
commit 8e1b706b6e819bed215c0db16345568864660393 upstream The L1TF mitigation will gain a commend line parameter which allows to set a combination of hypervisor mitigation and SMT control. Expose cpu_smt_disable() so the command line parser can tweak SMT settings. [ tglx: Split out of larger patch and made it preserve an already existing force off state ] Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142323.039715135@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15cpu/hotplug: Provide knobs to control SMTThomas Gleixner
commit 05736e4ac13c08a4a9b1ef2de26dd31a32cbee57 upstream Provide a command line and a sysfs knob to control SMT. The command line options are: 'nosmt': Enumerate secondary threads, but do not online them 'nosmt=force': Ignore secondary threads completely during enumeration via MP table and ACPI/MADT. The sysfs control file has the following states (read/write): 'on': SMT is enabled. Secondary threads can be freely onlined 'off': SMT is disabled. Secondary threads, even if enumerated cannot be onlined 'forceoff': SMT is permanentely disabled. Writes to the control file are rejected. 'notsupported': SMT is not supported by the CPU The command line option 'nosmt' sets the sysfs control to 'off'. This can be changed to 'on' to reenable SMT during runtime. The command line option 'nosmt=force' sets the sysfs control to 'forceoff'. This cannot be changed during runtime. When SMT is 'on' and the control file is changed to 'off' then all online secondary threads are offlined and attempts to online a secondary thread later on are rejected. When SMT is 'off' and the control file is changed to 'on' then secondary threads can be onlined again. The 'off' -> 'on' transition does not automatically online the secondary threads. When the control file is set to 'forceoff', the behaviour is the same as setting it to 'off', but the operation is irreversible and later writes to the control file are rejected. When the control status is 'notsupported' then writes to the control file are rejected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15x86/speculation/l1tf: Limit swap file size to MAX_PA/2Andi Kleen
commit 377eeaa8e11fe815b1d07c81c4a0e2843a8c15eb upstream For the L1TF workaround its necessary to limit the swap file size to below MAX_PA/2, so that the higher bits of the swap offset inverted never point to valid memory. Add a mechanism for the architecture to override the swap file size check in swapfile.c and add a x86 specific max swapfile check function that enforces that limit. The check is only enabled if the CPU is vulnerable to L1TF. In VMs with 42bit MAX_PA the typical limit is 2TB now, on a native system with 46bit PA it is 32TB. The limit is only per individual swap file, so it's always possible to exceed these limits with multiple swap files or partitions. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15x86/speculation/l1tf: Add sysfs reporting for l1tfAndi Kleen
commit 17dbca119312b4e8173d4e25ff64262119fcef38 upstream L1TF core kernel workarounds are cheap and normally always enabled, However they still should be reported in sysfs if the system is vulnerable or mitigated. Add the necessary CPU feature/bug bits. - Extend the existing checks for Meltdowns to determine if the system is vulnerable. All CPUs which are not vulnerable to Meltdown are also not vulnerable to L1TF - Check for 32bit non PAE and emit a warning as there is no practical way for mitigation due to the limited physical address bits - If the system has more than MAX_PA/2 physical memory the invert page workarounds don't protect the system against the L1TF attack anymore, because an inverted physical address will also point to valid memory. Print a warning in this case and report that the system is vulnerable. Add a function which returns the PFN limit for the L1TF mitigation, which will be used in follow up patches for sanity and range checks. [ tglx: Renamed the CPU feature bit to L1TF_PTEINV ] [ dwmw2: Backport to 4.9 (cpufeatures.h, E820) ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15proc: Fix proc_sys_prune_dcache to hold a sb referenceEric W. Biederman
commit 2fd1d2c4ceb2248a727696962cf3370dc9f5a0a4 upstream. Andrei Vagin writes: FYI: This bug has been reproduced on 4.11.7 > BUG: Dentry ffff895a3dd01240{i=4e7c09a,n=lo} still in use (1) [unmount of proc proc] > ------------[ cut here ]------------ > WARNING: CPU: 1 PID: 13588 at fs/dcache.c:1445 umount_check+0x6e/0x80 > CPU: 1 PID: 13588 Comm: kworker/1:1 Not tainted 4.11.7-200.fc25.x86_64 #1 > Hardware name: CompuLab sbc-flt1/fitlet, BIOS SBCFLT_0.08.04 06/27/2015 > Workqueue: events proc_cleanup_work > Call Trace: > dump_stack+0x63/0x86 > __warn+0xcb/0xf0 > warn_slowpath_null+0x1d/0x20 > umount_check+0x6e/0x80 > d_walk+0xc6/0x270 > ? dentry_free+0x80/0x80 > do_one_tree+0x26/0x40 > shrink_dcache_for_umount+0x2d/0x90 > generic_shutdown_super+0x1f/0xf0 > kill_anon_super+0x12/0x20 > proc_kill_sb+0x40/0x50 > deactivate_locked_super+0x43/0x70 > deactivate_super+0x5a/0x60 > cleanup_mnt+0x3f/0x90 > mntput_no_expire+0x13b/0x190 > kern_unmount+0x3e/0x50 > pid_ns_release_proc+0x15/0x20 > proc_cleanup_work+0x15/0x20 > process_one_work+0x197/0x450 > worker_thread+0x4e/0x4a0 > kthread+0x109/0x140 > ? process_one_work+0x450/0x450 > ? kthread_park+0x90/0x90 > ret_from_fork+0x2c/0x40 > ---[ end trace e1c109611e5d0b41 ]--- > VFS: Busy inodes after unmount of proc. Self-destruct in 5 seconds. Have a nice day... > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: _raw_spin_lock+0xc/0x30 > PGD 0 Fix this by taking a reference to the super block in proc_sys_prune_dcache. The superblock reference is the core of the fix however the sysctl_inodes list is converted to a hlist so that hlist_del_init_rcu may be used. This allows proc_sys_prune_dache to remove inodes the sysctl_inodes list, while not causing problems for proc_sys_evict_inode when if it later choses to remove the inode from the sysctl_inodes list. Removing inodes from the sysctl_inodes list allows proc_sys_prune_dcache to have a progress guarantee, while still being able to drop all locks. The fact that head->unregistering is set in start_unregistering ensures that no more inodes will be added to the the sysctl_inodes list. Previously the code did a dance where it delayed calling iput until the next entry in the list was being considered to ensure the inode remained on the sysctl_inodes list until the next entry was walked to. The structure of the loop in this patch does not need that so is much easier to understand and maintain. Cc: stable@vger.kernel.org Reported-by: Andrei Vagin <avagin@gmail.com> Tested-by: Andrei Vagin <avagin@openvz.org> Fixes: ace0c791e6c3 ("proc/sysctl: Don't grab i_lock under sysctl_lock.") Fixes: d6cffbbe9a7e ("proc/sysctl: prune stale dentries during unregistering") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15proc/sysctl: prune stale dentries during unregisteringKonstantin Khlebnikov
commit d6cffbbe9a7e51eb705182965a189457c17ba8a3 upstream. Currently unregistering sysctl table does not prune its dentries. Stale dentries could slowdown sysctl operations significantly. For example, command: # for i in {1..100000} ; do unshare -n -- sysctl -a &> /dev/null ; done creates a millions of stale denties around sysctls of loopback interface: # sysctl fs.dentry-state fs.dentry-state = 25812579 24724135 45 0 0 0 All of them have matching names thus lookup have to scan though whole hash chain and call d_compare (proc_sys_compare) which checks them under system-wide spinlock (sysctl_lock). # time sysctl -a > /dev/null real 1m12.806s user 0m0.016s sys 1m12.400s Currently only memory reclaimer could remove this garbage. But without significant memory pressure this never happens. This patch collects sysctl inodes into list on sysctl table header and prunes all their dentries once that table unregisters. Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes: > On 10.02.2017 10:47, Al Viro wrote: >> how about >> the matching stats *after* that patch? > > dcache size doesn't grow endlessly, so stats are fine > > # sysctl fs.dentry-state > fs.dentry-state = 92712 58376 45 0 0 0 > > # time sysctl -a &>/dev/null > > real 0m0.013s > user 0m0.004s > sys 0m0.008s Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15init: rename and re-order boot_cpu_state_init()Linus Torvalds
commit b5b1404d0815894de0690de8a1ab58269e56eae6 upstream. This is purely a preparatory patch for upcoming changes during the 4.19 merge window. We have a function called "boot_cpu_state_init()" that isn't really about the bootup cpu state: that is done much earlier by the similarly named "boot_cpu_init()" (note lack of "state" in name). This function initializes some hotplug CPU state, and needs to run after the percpu data has been properly initialized. It even has a comment to that effect. Except it _doesn't_ actually run after the percpu data has been properly initialized. On x86 it happens to do that, but on at least arm and arm64, the percpu base pointers are initialized by the arch-specific 'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init(). This had some unexpected results, and in particular we have a patch pending for the merge window that did the obvious cleanup of using 'this_cpu_write()' in the cpu hotplug init code: - per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE; + this_cpu_write(cpuhp_state.state, CPUHP_ONLINE); which is obviously the right thing to do. Except because of the ordering issue, it actually failed miserably and unexpectedly on arm64. So this just fixes the ordering, and changes the name of the function to be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu hotplug state, because the core CPU state was supposed to have already been done earlier. Marked for stable, since the (not yet merged) patch that will show this problem is marked for stable. Reported-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15kasan: add no_sanitize attribute for clang buildsAndrey Konovalov
commit 12c8f25a016dff69ee284aa3338bebfd2cfcba33 upstream. KASAN uses the __no_sanitize_address macro to disable instrumentation of particular functions. Right now it's defined only for GCC build, which causes false positives when clang is used. This patch adds a definition for clang. Note, that clang's revision 329612 or higher is required. [andreyknvl@google.com: remove redundant #ifdef CONFIG_KASAN check] Link: http://lkml.kernel.org/r/c79aa31a2a2790f6131ed607c58b0dd45dd62a6c.1523967959.git.andreyknvl@google.com Link: http://lkml.kernel.org/r/4ad725cc903f8534f8c8a60f0daade5e3d674f8d.1523554166.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Paul Lawrence <paullawrence@google.com> Cc: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sodagudi Prasad <psodagud@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-09fork: unconditionally clear stack on forkKees Cook
commit e01e80634ecdde1dd113ac43b3adad21b47f3957 upstream. One of the classes of kernel stack content leaks[1] is exposing the contents of prior heap or stack contents when a new process stack is allocated. Normally, those stacks are not zeroed, and the old contents remain in place. In the face of stack content exposure flaws, those contents can leak to userspace. Fixing this will make the kernel no longer vulnerable to these flaws, as the stack will be wiped each time a stack is assigned to a new process. There's not a meaningful change in runtime performance; it almost looks like it provides a benefit. Performing back-to-back kernel builds before: Run times: 157.86 157.09 158.90 160.94 160.80 Mean: 159.12 Std Dev: 1.54 and after: Run times: 159.31 157.34 156.71 158.15 160.81 Mean: 158.46 Std Dev: 1.46 Instead of making this a build or runtime config, Andy Lutomirski recommended this just be enabled by default. [1] A noisy search for many kinds of stack content leaks can be seen here: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel+stack+leak I did some more with perf and cycle counts on running 100,000 execs of /bin/true. before: Cycles: 218858861551 218853036130 214727610969 227656844122 224980542841 Mean: 221015379122.60 Std Dev: 4662486552.47 after: Cycles: 213868945060 213119275204 211820169456 224426673259 225489986348 Mean: 217745009865.40 Std Dev: 5935559279.99 It continues to look like it's faster, though the deviation is rather wide, but I'm not sure what I could do that would be less noisy. I'm open to ideas! Link: http://lkml.kernel.org/r/20180221021659.GA37073@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ Srivatsa: Backported to 4.9.y ] Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu> Reviewed-by: Srinidhi Rao <srinidhir@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-09kmemleak: clear stale pointers from task stacksKonstantin Khlebnikov
commit ca182551857cc2c1e6a2b7f1e72090a137a15008 upstream. Kmemleak considers any pointers on task stacks as references. This patch clears newly allocated and reused vmap stacks. Link: http://lkml.kernel.org/r/150728990124.744199.8403409836394318684.stgit@buzz Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [ Srivatsa: Backported to 4.9.y ] Signed-off-by: Srivatsa S. Bhat <srivatsa@csail.mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-09ring_buffer: tracing: Inherit the tracing setting to next ring bufferMasami Hiramatsu
commit 73c8d8945505acdcbae137c2e00a1232e0be709f upstream. Maintain the tracing on/off setting of the ring_buffer when switching to the trace buffer snapshot. Taking a snapshot is done by swapping the backup ring buffer (max_tr_buffer). But since the tracing on/off setting is defined by the ring buffer, when swapping it, the tracing on/off setting can also be changed. This causes a strange result like below: /sys/kernel/debug/tracing # cat tracing_on 1 /sys/kernel/debug/tracing # echo 0 > tracing_on /sys/kernel/debug/tracing # cat tracing_on 0 /sys/kernel/debug/tracing # echo 1 > snapshot /sys/kernel/debug/tracing # cat tracing_on 1 /sys/kernel/debug/tracing # echo 1 > snapshot /sys/kernel/debug/tracing # cat tracing_on 0 We don't touch tracing_on, but snapshot changes tracing_on setting each time. This is an anomaly, because user doesn't know that each "ring_buffer" stores its own tracing-enable state and the snapshot is done by swapping ring buffers. Link: http://lkml.kernel.org/r/153149929558.11274.11730609978254724394.stgit@devbox Cc: Ingo Molnar <mingo@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp> Cc: stable@vger.kernel.org Fixes: debdd57f5145 ("tracing: Make a snapshot feature available from userspace") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> [ Updated commit log and comment in the code ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03serial: core: Make sure compiler barfs for 16-byte earlycon namesDouglas Anderson
[ Upstream commit c1c734cb1f54b062f7e67ffc9656d82f5b412b9c ] As part of bringup I ended up wanting to call an earlycon driver by a name that was exactly 16-bytes big, specifically "qcom_geni_serial". Unfortunately, when I tried this I found that things compiled just fine. They just didn't work. Specifically the compiler felt perfectly justified in initting the ".name" field of "struct earlycon_id" with the full 16-bytes and just skipping the '\0'. Needless to say, that behavior didn't seem ideal, but I guess someone must have allowed it for a reason. One way to fix this is to shorten the name field to 15 bytes and then add an extra byte after that nobody touches. This should always be initted to 0 and we're golden. There are, of course, other ways to fix this too. We could audit all the users of the "name" field and make them stop at both null termination or at 16 bytes. We could also just make the name field much bigger so that we're not likely to run into this. ...but both seem like we'll just hit the bug again. Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03brcmfmac: Add support for bcm43364 wireless chipsetSean Lanigan
[ Upstream commit 9c4a121e82634aa000a702c98cd6f05b27d6e186 ] Add support for the BCM43364 chipset via an SDIO interface, as used in e.g. the Murata 1FX module. The BCM43364 uses the same firmware as the BCM43430 (which is already included), the only difference is the omission of Bluetooth. However, the SDIO_ID for the BCM43364 is 02D0:A9A4, giving it a MODALIAS of sdio:c00v02D0dA9A4, which doesn't get recognised and hence doesn't load the brcmfmac module. Adding the 'A9A4' ID in the appropriate place triggers the brcmfmac driver to load, and then correctly use the firmware file 'brcmfmac43430-sdio.bin'. Signed-off-by: Sean Lanigan <sean@lano.id.au> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03dma-iommu: Fix compilation when !CONFIG_IOMMU_DMAMarc Zyngier
[ Upstream commit 8a22a3e1e768c309b718f99bd86f9f25a453e0dc ] Inclusion of include/dma-iommu.h when CONFIG_IOMMU_DMA is not selected results in the following splat: In file included from drivers/irqchip/irq-gic-v3-mbi.c:20:0: ./include/linux/dma-iommu.h:95:69: error: unknown type name ‘dma_addr_t’ static inline int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base) ^~~~~~~~~~ ./include/linux/dma-iommu.h:108:74: warning: ‘struct list_head’ declared inside parameter list will not be visible outside of this definition or declaration static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list) ^~~~~~~~~ scripts/Makefile.build:312: recipe for target 'drivers/irqchip/irq-gic-v3-mbi.o' failed Fix it by including linux/types.h. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lkml.kernel.org/r/20180508121438.11301-5-marc.zyngier@arm.com Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-03netfilter: ipset: List timing out entries with "timeout 1" instead of zeroJozsef Kadlecsik
[ Upstream commit bd975e691486ba52790ba23cc9b4fecab7bc0d31 ] When listing sets with timeout support, there's a probability that just timing out entries with "0" timeout value is listed/saved. However when restoring the saved list, the zero timeout value means permanent elelements. The new behaviour is that timing out entries are listed with "timeout 1" instead of zero. Fixes netfilter bugzilla #1258. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-28exec: avoid gcc-8 warning for get_task_commArnd Bergmann
commit 3756f6401c302617c5e091081ca4d26ab604bec5 upstream. gcc-8 warns about using strncpy() with the source size as the limit: fs/exec.c:1223:32: error: argument to 'sizeof' in 'strncpy' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess] This is indeed slightly suspicious, as it protects us from source arguments without NUL-termination, but does not guarantee that the destination is terminated. This keeps the strncpy() to ensure we have properly padded target buffer, but ensures that we use the correct length, by passing the actual length of the destination buffer as well as adding a build-time check to ensure it is exactly TASK_COMM_LEN. There are only 23 callsites which I all reviewed to ensure this is currently the case. We could get away with doing only the check or passing the right length, but it doesn't hurt to do both. Link: http://lkml.kernel.org/r/20171205151724.1764896-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Serge Hallyn <serge@hallyn.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Aleksa Sarai <asarai@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-28tcp: free batches of packets in tcp_prune_ofo_queue()Eric Dumazet
[ Upstream commit 72cd43ba64fc172a443410ce01645895850844c8 ] Juha-Matti Tilli reported that malicious peers could inject tiny packets in out_of_order_queue, forcing very expensive calls to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for every incoming packet. out_of_order_queue rb-tree can contain thousands of nodes, iterating over all of them is not nice. Before linux-4.9, we would have pruned all packets in ofo_queue in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB. Since we plan to increase tcp_rmem[2] in the future to cope with modern BDP, can not revert to the old behavior, without great pain. Strategy taken in this patch is to purge ~12.5 % of the queue capacity. Fixes: 36a6503fedda ("tcp: refine tcp_prune_ofo_queue() to not drop all packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-25net: Don't copy pfmemalloc flag in __copy_skb_header()Stefano Brivio
[ Upstream commit 8b7008620b8452728cadead460a36f64ed78c460 ] The pfmemalloc flag indicates that the skb was allocated from the PFMEMALLOC reserves, and the flag is currently copied on skb copy and clone. However, an skb copied from an skb flagged with pfmemalloc wasn't necessarily allocated from PFMEMALLOC reserves, and on the other hand an skb allocated that way might be copied from an skb that wasn't. So we should not copy the flag on skb copy, and rather decide whether to allow an skb to be associated with sockets unrelated to page reclaim depending only on how it was allocated. Move the pfmemalloc flag before headers_start[0] using an existing 1-bit hole, so that __copy_skb_header() doesn't copy it. When cloning, we'll now take care of this flag explicitly, contravening to the warning comment of __skb_clone(). While at it, restore the newline usage introduced by commit b19372273164 ("net: reorganize sk_buff for faster __copy_skb_header()") to visually separate bytes used in bitfields after headers_start[0], that was gone after commit a9e419dc7be6 ("netfilter: merge ctinfo into nfct pointer storage area"), and describe the pfmemalloc flag in the kernel-doc structure comment. This doesn't change the size of sk_buff or cacheline boundaries, but consolidates the 15 bits hole before tc_index into a 2 bytes hole before csum, that could now be filled more easily. Reported-by: Patrick Talbert <ptalbert@redhat.com> Fixes: c93bdd0e03e8 ("netvm: allow skb allocation to use PFMEMALLOC reserves") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22string: drop __must_check from strscpy() and restore strscpy() usages in cgroupTejun Heo
commit 08a77676f9c5fc69a681ccd2cd8140e65dcb26c7 upstream. e7fd37ba1217 ("cgroup: avoid copying strings longer than the buffers") converted possibly unsafe strncpy() usages in cgroup to strscpy(). However, although the callsites are completely fine with truncated copied, because strscpy() is marked __must_check, it led to the following warnings. kernel/cgroup/cgroup.c: In function ‘cgroup_file_name’: kernel/cgroup/cgroup.c:1400:10: warning: ignoring return value of ‘strscpy’, declared with attribute warn_unused_result [-Wunused-result] strscpy(buf, cft->name, CGROUP_FILE_NAME_MAX); ^ To avoid the warnings, 50034ed49645 ("cgroup: use strlcpy() instead of strscpy() to avoid spurious warning") switched them to strlcpy(). strlcpy() is worse than strlcpy() because it unconditionally runs strlen() on the source string, and the only reason we switched to strlcpy() here was because it was lacking __must_check, which doesn't reflect any material differences between the two function. It's just that someone added __must_check to strscpy() and not to strlcpy(). These basic string copy operations are used in variety of ways, and one of not-so-uncommon use cases is safely handling truncated copies, where the caller naturally doesn't care about the return value. The __must_check doesn't match the actual use cases and forces users to opt for inferior variants which lack __must_check by happenstance or spread ugly (void) casts. Remove __must_check from strscpy() and restore strscpy() usages in cgroup. Signed-off-by: Tejun Heo <tj@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Chris Metcalf <cmetcalf@ezchip.com> [backport only the string.h portion to remove build warnings starting to show up - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1Marc Zyngier
commit 8e2906245f1e3b0d027169d9f2e55ce0548cb96e upstream. In order for the kernel to protect itself, let's call the SSBD mitigation implemented by the higher exception level (either hypervisor or firmware) on each transition between userspace and kernel. We must take the PSCI conduit into account in order to target the right exception level, hence the introduction of a runtime patching callback. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm/arm64: smccc: Add SMCCC-specific return codesMarc Zyngier
commit eff0e9e1078ea7dc1d794dc50e31baef984c46d7 upstream. We've so far used the PSCI return codes for SMCCC because they were extremely similar. But with the new ARM DEN 0070A specification, "NOT_REQUIRED" (-2) is clashing with PSCI's "PSCI_RET_INVALID_PARAMS". Let's bite the bullet and add SMCCC specific return codes. Users can be repainted as and when required. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline declarationsNick Desaulniers
commit d03db2bc26f0e4a6849ad649a09c9c73fccdc656 upstream. Functions marked extern inline do not emit an externally visible function when the gnu89 C standard is used. Some KBUILD Makefiles overwrite KBUILD_CFLAGS. This is an issue for GCC 5.1+ users as without an explicit C standard specified, the default is gnu11. Since c99, the semantics of extern inline have changed such that an externally visible function is always emitted. This can lead to multiple definition errors of extern inline functions at link time of compilation units whose build files have removed an explicit C standard compiler flag for users of GCC 5.1+ or Clang. Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: H. Peter Anvin <hpa@zytor.com> Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@redhat.com Cc: akataria@vmware.com Cc: akpm@linux-foundation.org Cc: andrea.parri@amarulasolutions.com Cc: ard.biesheuvel@linaro.org Cc: aryabinin@virtuozzo.com Cc: astrachan@google.com Cc: boris.ostrovsky@oracle.com Cc: brijesh.singh@amd.com Cc: caoj.fnst@cn.fujitsu.com Cc: geert@linux-m68k.org Cc: ghackmann@google.com Cc: gregkh@linuxfoundation.org Cc: jan.kiszka@siemens.com Cc: jarkko.sakkinen@linux.intel.com Cc: jpoimboe@redhat.com Cc: keescook@google.com Cc: kirill.shutemov@linux.intel.com Cc: kstewart@linuxfoundation.org Cc: linux-efi@vger.kernel.org Cc: linux-kbuild@vger.kernel.org Cc: manojgupta@google.com Cc: mawilcox@microsoft.com Cc: michal.lkml@markovi.net Cc: mjg59@google.com Cc: mka@chromium.org Cc: pombredanne@nexb.com Cc: rientjes@google.com Cc: rostedt@goodmis.org Cc: sedat.dilek@gmail.com Cc: thomas.lendacky@amd.com Cc: tstellar@redhat.com Cc: tweek@google.com Cc: virtualization@lists.linux-foundation.org Cc: will.deacon@arm.com Cc: yamada.masahiro@socionext.com Link: http://lkml.kernel.org/r/20180621162324.36656-2-ndesaulniers@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22compiler, clang: always inline when CONFIG_OPTIMIZE_INLINING is disabledDavid Rientjes
commit 9a04dbcfb33b4012d0ce8c0282f1e3ca694675b1 upstream. The motivation for commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused static inline functions") was to suppress clang's warnings about unused static inline functions. For configs without CONFIG_OPTIMIZE_INLINING enabled, such as any non-x86 architecture, `inline' in the kernel implies that __attribute__((always_inline)) is used. Some code depends on that behavior, see https://lkml.org/lkml/2017/6/13/918: net/built-in.o: In function `__xchg_mb': arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99' arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99 The full fix would be to identify these breakages and annotate the functions with __always_inline instead of `inline'. But since we are late in the 4.12-rc cycle, simply carry forward the forced inlining behavior and work toward moving arm64, and other architectures, toward CONFIG_OPTIMIZE_INLINING behavior. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706261552200.1075@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Sodagudi Prasad <psodagud@codeaurora.org> Tested-by: Matthias Kaehlcke <mka@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22compiler, clang: properly override 'inline' for clangLinus Torvalds
commit 6d53cefb18e4646fb4bf62ccb6098fb3808486df upstream. Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused static inline functions") just caused more warnings due to re-defining the 'inline' macro. So undef it before re-defining it, and also add the 'notrace' attribute like the gcc version that this is overriding does. Maybe this makes clang happier. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22compiler, clang: suppress warning for unused static inline functionsDavid Rientjes
commit abb2ea7dfd82451d85ce669b811310c05ab5ca46 upstream. GCC explicitly does not warn for unused static inline functions for -Wunused-function. The manual states: Warn whenever a static function is declared but not defined or a non-inline static function is unused. Clang does warn for static inline functions that are unused. It turns out that suppressing the warnings avoids potentially complex #ifdef directives, which also reduces LOC. Suppress the warning for clang. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOSHans de Goede
commit 240630e61870e62e39a97225048f9945848fa5f5 upstream. There have been several reports of LPM related hard freezes about once a day on multiple Lenovo 50 series models. Strange enough these reports where not disk model specific as LPM issues usually are and some users with the exact same disk + laptop where seeing them while other users where not seeing these issues. It turns out that enabling LPM triggers a firmware bug somewhere, which has been fixed in later BIOS versions. This commit adds a new ahci_broken_lpm() function and a new ATA_FLAG_NO_LPM for dealing with this. The ahci_broken_lpm() function contains DMI match info for the 4 models which are known to be affected by this and the DMI BIOS date field for known good BIOS versions. If the BIOS date is older then the one in the table LPM will be disabled and a warning will be printed. Note the BIOS dates are for known good versions, some older versions may work too, but we don't know for sure, the table is using dates from BIOS versions for which users have confirmed that upgrading to that version makes the problem go away. Unfortunately I've been unable to get hold of the reporter who reported that BIOS version 2.35 fixed the problems on the W541 for him. I've been able to verify the DMI_SYS_VENDOR and DMI_PRODUCT_VERSION from an older dmidecode, but I don't know the exact BIOS date as reported in the DMI. Lenovo keeps a changelog with dates in their release notes, but the dates there are the release dates not the build dates which are in DMI. So I've chosen to set the date to which we compare to one day past the release date of the 2.34 BIOS. I plan to fix this with a follow up commit once I've the necessary info. Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03block: Fix transfer when chunk sectors exceeds maxKeith Busch
commit 15bfd21fbc5d35834b9ea383dc458a1f0c9e3434 upstream. A device may have boundary restrictions where the number of sectors between boundaries exceeds its max transfer size. In this case, we need to cap the max size to the smaller of the two limits. Reported-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Tested-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Cc: <stable@vger.kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03iio:buffer: make length types match kfifo typesMartin Kelly
commit c043ec1ca5baae63726aae32abbe003192bc6eec upstream. Currently, we use int for buffer length and bytes_per_datum. However, kfifo uses unsigned int for length and size_t for element size. We need to make sure these matches or we will have bugs related to overflow (in the range between INT_MAX and UINT_MAX for length, for example). In addition, set_bytes_per_datum uses size_t while bytes_per_datum is an int, which would cause bugs for large values of bytes_per_datum. Change buffer length to use unsigned int and bytes_per_datum to use size_t. Signed-off-by: Martin Kelly <mkelly@xevo.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [bwh: Backported to 4.9: - Drop change to iio_dma_buffer_set_length() - Adjust filename, context] Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03branch-check: fix long->int truncation when profiling branchesMikulas Patocka
commit 2026d35741f2c3ece73c11eb7e4a15d7c2df9ebe upstream. The function __builtin_expect returns long type (see the gcc documentation), and so do macros likely and unlikely. Unfortunatelly, when CONFIG_PROFILE_ANNOTATED_BRANCHES is selected, the macros likely and unlikely expand to __branch_check__ and __branch_check__ truncates the long type to int. This unintended truncation may cause bugs in various kernel code (we found a bug in dm-writecache because of it), so it's better to fix __branch_check__ to return long. Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1805300818140.24812@file01.intranet.prod.int.rdu2.redhat.com Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Fixes: 1f0d69a9fc815 ("tracing: profile likely and unlikely annotations") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-13complete e390f9a port for v4.9.106Philip Müller
objtool ports introduced in v4.9.106 were not totally complete. Therefore they resulted in issues like: module: overflow in relocation type 10 val XXXXXXXXXXX ‘usbcore’ likely not compiled with -mcmodel=kernel module: overflow in relocation type 10 val XXXXXXXXXXX ‘scsi_mod’ likely not compiled with -mcmodel=kernel Missing part was the complete backport of commit e390f9a. Original notes by Josh Poimboeuf: The '__unreachable' and '__func_stack_frame_non_standard' sections are only used at compile time. They're discarded for vmlinux but they should also be discarded for modules. Since this is a recurring pattern, prefix the section names with ".discard.". It's a nice convention and vmlinux.lds.h already discards such sections. Also remove the 'a' (allocatable) flag from the __unreachable section since it doesn't make sense for a discarded section. Signed-off-by: Philip Müller <philm@manjaro.org> Fixes: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") Link: https://gitlab.manjaro.org/packages/core/linux49/issues/2 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-06tcp: avoid integer overflows in tcp_rcv_space_adjust()Eric Dumazet
commit 607065bad9931e72207b0cac365d7d4abc06bd99 upstream. When using large tcp_rmem[2] values (I did tests with 500 MB), I noticed overflows while computing rcvwin. Lets fix this before the following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> [Backport: sysctl_tcp_rmem is not Namespace-ify'd in older kernels] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-05objtool: Enclose contents of unreachable() macro in a blockJosh Poimboeuf
commit 4e4636cf981b5b629fbfb78aa9f232e015f7d521 upstream. Guenter Roeck reported a boot failure in mips64. It was bisected to the following commit: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") The unreachable() macro was formerly only composed of a single statement. The above commit added a second statement, but neglected to enclose the statements in a block. Suggested-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: d1091c7fa3d5 ("objtool: Improve detection of BUG() and other dead ends") Link: http://lkml.kernel.org/r/20170228042116.glmwmwiohcix7o4a@treble Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-05objtool: Improve detection of BUG() and other dead endsJosh Poimboeuf
commit d1091c7fa3d52ebce4dd3f15d04155b3469b2f90 upstream. The BUG() macro's use of __builtin_unreachable() via the unreachable() macro tells gcc that the instruction is a dead end, and that it's safe to assume the current code path will not execute past the previous instruction. On x86, the BUG() macro is implemented with the 'ud2' instruction. When objtool's branch analysis sees that instruction, it knows the current code path has come to a dead end. Peter Zijlstra has been working on a patch to change the WARN macros to use 'ud2'. That patch will break objtool's assumption that 'ud2' is always a dead end. Generally it's best for objtool to avoid making those kinds of assumptions anyway. The more ignorant it is of kernel code internals, the better. So create a more generic way for objtool to detect dead ends by adding an annotation to the unreachable() macro. The annotation stores a pointer to the end of the unreachable code path in an '__unreachable' section. Objtool can read that section to find the dead ends. Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/41a6d33971462ebd944a1c60ad4bf5be86c17b77.1487712920.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD buildsSebastian Ott
[ Upstream commit 076467490b8176eb96eddc548a14d4135c7b5852 ] Move the kvm_arch_irq_routing_update() prototype outside of ifdef CONFIG_HAVE_KVM_EVENTFD guards to fix the following sparse warning: arch/s390/kvm/../../../virt/kvm/irqchip.c:171:28: warning: symbol 'kvm_arch_irq_routing_update' was not declared. Should it be static? Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30ptr_ring: prevent integer overflow when calculating sizeJason Wang
[ Upstream commit 54e02162d4454a99227f520948bf4494c3d972d0 ] Switch to use dividing to prevent integer overflow when size is too big to calculate allocation size properly. Reported-by: Eric Biggers <ebiggers3@gmail.com> Fixes: 6e6e41c31122 ("ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30cpumask: Make for_each_cpu_wrap() available on UP as wellMichael Kelley
[ Upstream commit d207af2eab3f8668b95ad02b21930481c42806fd ] for_each_cpu_wrap() was originally added in the #else half of a large "#if NR_CPUS == 1" statement, but was omitted in the #if half. This patch adds the missing #if half to prevent compile errors when NR_CPUS is 1. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Michael Kelley <mhkelley@outlook.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kys@microsoft.com Cc: martin.petersen@oracle.com Cc: mikelley@microsoft.com Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()") Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901MB2045.namprd19.prod.outlook.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user pageJia Zhang
[ Upstream commit 595dd46ebfc10be041a365d0a3fa99df50b6ba73 ] Commit: df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data") ... introduced a bounce buffer to work around CONFIG_HARDENED_USERCOPY=y. However, accessing the vsyscall user page will cause an SMAP fault. Replace memcpy() with copy_from_user() to fix this bug works, but adding a common way to handle this sort of user page may be useful for future. Currently, only vsyscall page requires KCORE_USER. Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jolsa@redhat.com Link: http://lkml.kernel.org/r/1518446694-21124-2-git-send-email-zhang.jia@linux.alibaba.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30x86/power: Fix swsusp_arch_resume prototypeArnd Bergmann
[ Upstream commit 328008a72d38b5bde6491e463405c34a81a65d3e ] The declaration for swsusp_arch_resume marks it as 'asmlinkage', but the definition in x86-32 does not, and it fails to include the header with the declaration. This leads to a warning when building with link-time-optimizations: kernel/power/power.h:108:23: error: type of 'swsusp_arch_resume' does not match original declaration [-Werror=lto-type-mismatch] extern asmlinkage int swsusp_arch_resume(void); ^ arch/x86/power/hibernate_32.c:148:0: note: 'swsusp_arch_resume' was previously declared here int swsusp_arch_resume(void) This moves the declaration into a globally visible header file and fixes up both x86 definitions to match it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Len Brown <len.brown@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: linux-pm@vger.kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Bart Van Assche <bart.vanassche@wdc.com> Link: https://lkml.kernel.org/r/20180202145634.200291-2-arnd@arndb.de Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>