summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2015-09-29net: sched: fix refcount imbalance in actionsDaniel Borkmann
[ Upstream commit 28e6b67f0b292f557468c139085303b15f1a678f ] Since commit 55334a5db5cd ("net_sched: act: refuse to remove bound action outside"), we end up with a wrong reference count for a tc action. Test case 1: FOO="1,6 0 0 4294967295," BAR="1,6 0 0 4294967294," tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 \ action bpf bytecode "$FOO" tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe index 1 ref 1 bind 1 tc actions replace action bpf bytecode "$BAR" index 1 tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe index 1 ref 2 bind 1 tc actions replace action bpf bytecode "$FOO" index 1 tc actions show action bpf action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe index 1 ref 3 bind 1 Test case 2: FOO="1,6 0 0 4294967295," tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok tc actions show action gact action order 0: gact action pass random type none pass val 0 index 1 ref 1 bind 1 tc actions add action drop index 1 RTNETLINK answers: File exists [...] tc actions show action gact action order 0: gact action pass random type none pass val 0 index 1 ref 2 bind 1 tc actions add action drop index 1 RTNETLINK answers: File exists [...] tc actions show action gact action order 0: gact action pass random type none pass val 0 index 1 ref 3 bind 1 What happens is that in tcf_hash_check(), we check tcf_common for a given index and increase tcfc_refcnt and conditionally tcfc_bindcnt when we've found an existing action. Now there are the following cases: 1) We do a late binding of an action. In that case, we leave the tcfc_refcnt/tcfc_bindcnt increased and are done with the ->init() handler. This is correctly handeled. 2) We replace the given action, or we try to add one without replacing and find out that the action at a specific index already exists (thus, we go out with error in that case). In case of 2), we have to undo the reference count increase from tcf_hash_check() in the tcf_hash_check() function. Currently, we fail to do so because of the 'tcfc_bindcnt > 0' check which bails out early with an -EPERM error. Now, while commit 55334a5db5cd prevents 'tc actions del action ...' on an already classifier-bound action to drop the reference count (which could then become negative, wrap around etc), this restriction only accounts for invocations outside a specific action's ->init() handler. One possible solution would be to add a flag thus we possibly trigger the -EPERM ony in situations where it is indeed relevant. After the patch, above test cases have correct reference count again. Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29ipv6: lock socket in ip6_datagram_connect()Eric Dumazet
[ Upstream commit 03645a11a570d52e70631838cb786eb4253eb463 ] ip6_datagram_connect() is doing a lot of socket changes without socket being locked. This looks wrong, at least for udp_lib_rehash() which could corrupt lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29jbd2: avoid infinite loop when destroying aborted journalJan Kara
commit 841df7df196237ea63233f0f9eaa41db53afd70f upstream. Commit 6f6a6fda2945 "jbd2: fix ocfs2 corrupt when updating journal superblock fails" changed jbd2_cleanup_journal_tail() to return EIO when the journal is aborted. That makes logic in jbd2_log_do_checkpoint() bail out which is fine, except that jbd2_journal_destroy() expects jbd2_log_do_checkpoint() to always make a progress in cleaning the journal. Without it jbd2_journal_destroy() just loops in an infinite loop. Fix jbd2_journal_destroy() to cleanup journal checkpoint lists of jbd2_log_do_checkpoint() fails with error. Reported-by: Eryu Guan <guaneryu@gmail.com> Tested-by: Eryu Guan <guaneryu@gmail.com> Fixes: 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a Signed-off-by: Jan Kara <jack@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29iommu/tegra-smmu: Parameterize number of TLB linesThierry Reding
commit 11cec15bf3fb498206ef63b1fa26c27689e02d0e upstream. The number of TLB lines was increased from 16 on Tegra30 to 32 on Tegra114 and later. Parameterize the value so that the initial default can be set accordingly. On Tegra30, initializing the value to 32 would effectively disable the TLB and hence cause massive latencies for memory accesses translated through the SMMU. This is especially noticeable for isochronuous clients such as display, whose FIFOs would continuously underrun. Fixes: 891846516317 ("memory: Add NVIDIA Tegra memory controller support") Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29SUNRPC: Ensure that we wait for connections to complete before retryingTrond Myklebust
commit 0fdea1e8a2853f79d39b8555cc9de16a7e0ab26f upstream. Commit 718ba5b87343, moved the responsibility for unlocking the socket to xs_tcp_setup_socket, meaning that the socket will be unlocked before we know that it has finished trying to connect. The following patch is based on an initial patch by Russell King to ensure that we delay clearing the XPRT_CONNECTING flag until we either know that we failed to initiate a connection attempt, or the connection attempt itself failed. Fixes: 718ba5b87343 ("SUNRPC: Add helpers to prevent socket create from racing") Reported-by: Russell King <linux@arm.linux.org.uk> Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29net: sunrpc: fix tracepoint Warning: unknown op '->'Pratyush Anand
commit 051ac3848a94f21cfdec899cc9c65ce7f9f116fa upstream. `perf stat -e sunrpc:svc_xprt_do_enqueue true` results in Warning: unknown op '->' Warning: [sunrpc:svc_xprt_do_enqueue] unknown op '->' Similar warning for svc_handle_xprt as well. Actually TP_printk() should never dereference an address saved in the ring buffer that points somewhere in the kernel. There's no guarantee that that object still exists (with the exception of static strings). Therefore change all the arguments for TP_printk(), so that it references values existing in the ring buffer only. While doing that, also fix another possible bug when argument xprt could be NULL and TP_fast_assign() tries to access it's elements. Signed-off-by: Pratyush Anand <panand@redhat.com> Reviewed-by: Jeff Layton <jeff.layton@primarydata.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Fixes: 83a712e0afef "sunrpc: add some tracepoints around ..." Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-29mm: make page pfmemalloc check more robustMichal Hocko
commit 2f064f3485cd29633ad1b3cfb00cc519509a3d72 upstream. Commit c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb") added checks for page->pfmemalloc to __skb_fill_page_desc(): if (page->pfmemalloc && !page->mapping) skb->pfmemalloc = true; It assumes page->mapping == NULL implies that page->pfmemalloc can be trusted. However, __delete_from_page_cache() can set set page->mapping to NULL and leave page->index value alone. Due to being in union, a non-zero page->index will be interpreted as true page->pfmemalloc. So the assumption is invalid if the networking code can see such a page. And it seems it can. We have encountered this with a NFS over loopback setup when such a page is attached to a new skbuf. There is no copying going on in this case so the page confuses __skb_fill_page_desc which interprets the index as pfmemalloc flag and the network stack drops packets that have been allocated using the reserves unless they are to be queued on sockets handling the swapping which is the case here and that leads to hangs when the nfs client waits for a response from the server which has been dropped and thus never arrive. The struct page is already heavily packed so rather than finding another hole to put it in, let's do a trick instead. We can reuse the index again but define it to an impossible value (-1UL). This is the page index so it should never see the value that large. Replace all direct users of page->pfmemalloc by page_is_pfmemalloc which will hide this nastiness from unspoiled eyes. The information will get lost if somebody wants to use page->index obviously but that was the case before and the original code expected that the information should be persisted somewhere else if that is really needed (e.g. what SLAB and SLUB do). [akpm@linux-foundation.org: fix blooper in slub] Fixes: c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb") Signed-off-by: Michal Hocko <mhocko@suse.com> Debugged-by: Vlastimil Babka <vbabka@suse.com> Debugged-by: Jiri Bohac <jbohac@suse.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21fs: create and use seq_show_option for escapingKees Cook
commit a068acf2ee77693e0bf39d6e07139ba704f461c3 upstream. Many file systems that implement the show_options hook fail to correctly escape their output which could lead to unescaped characters (e.g. new lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This could lead to confusion, spoofed entries (resulting in things like systemd issuing false d-bus "mount" notifications), and who knows what else. This looks like it would only be the root user stepping on themselves, but it's possible weird things could happen in containers or in other situations with delegated mount privileges. Here's an example using overlay with setuid fusermount trusting the contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use of "sudo" is something more sneaky: $ BASE="ovl" $ MNT="$BASE/mnt" $ LOW="$BASE/lower" $ UP="$BASE/upper" $ WORK="$BASE/work/ 0 0 none /proc fuse.pwn user_id=1000" $ mkdir -p "$LOW" "$UP" "$WORK" $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt $ cat /proc/mounts none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0 none /proc fuse.pwn user_id=1000 0 0 $ fusermount -u /proc $ cat /proc/mounts cat: /proc/mounts: No such file or directory This fixes the problem by adding new seq_show_option and seq_show_option_n helpers, and updating the vulnerable show_option handlers to use them as needed. Some, like SELinux, need to be open coded due to unusual existing escape mechanisms. [akpm@linux-foundation.org: add lost chunk, per Kees] [keescook@chromium.org: seq_show_option should be using const parameters] Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Jan Kara <jack@suse.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21ACPI, PCI: Penalize legacy IRQ used by ACPI SCIJiang Liu
commit 5d0ddfebb93069061880fc57ee4ba7246bd1e1ee upstream. Nick Meier reported a regression with HyperV that " After rebooting the VM, the following messages are logged in syslog when trying to load the tulip driver: tulip: Linux Tulip drivers version 1.1.15 (Feb 27, 2007) tulip: 0000:00:0a.0: PCI INT A: failed to register GSI tulip: Cannot enable tulip board #0, aborting tulip: probe of 0000:00:0a.0 failed with error -16 Errors occur in 3.19.0 kernel Works in 3.17 kernel. " According to the ACPI dump file posted by Nick at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072 The ACPI MADT table includes an interrupt source overridden entry for ACPI SCI: [236h 0566 1] Subtable Type : 02 <Interrupt Source Override> [237h 0567 1] Length : 0A [238h 0568 1] Bus : 00 [239h 0569 1] Source : 09 [23Ah 0570 4] Interrupt : 00000009 [23Eh 0574 2] Flags (decoded below) : 000D Polarity : 1 Trigger Mode : 3 And in DSDT table, we have _PRT method to define PCI interrupts, which eventually goes to: Name (PRSA, ResourceTemplate () { IRQ (Level, ActiveLow, Shared, ) {3,4,5,7,9,10,11,12,14,15} }) Name (PRSB, ResourceTemplate () { IRQ (Level, ActiveLow, Shared, ) {3,4,5,7,9,10,11,12,14,15} }) Name (PRSC, ResourceTemplate () { IRQ (Level, ActiveLow, Shared, ) {3,4,5,7,9,10,11,12,14,15} }) Name (PRSD, ResourceTemplate () { IRQ (Level, ActiveLow, Shared, ) {3,4,5,7,9,10,11,12,14,15} }) According to the MADT and DSDT tables, IRQ 9 may be used for: 1) ACPI SCI in level, high mode 2) PCI legacy IRQ in level, low mode So there's a conflict in polarity setting for IRQ 9. Prior to commit cd68f6bd53cf ("x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI"), ACPI SCI is handled specially and there's no check for conflicts between ACPI SCI and PCI legagy IRQ. And it seems that the HyperV hypervisor doesn't make use of the polarity configuration in IOAPIC entry, so it just works. Commit cd68f6bd53cf gets rid of the specially handling of ACPI SCI, and then the pin attribute checking code discloses the conflicts between ACPI SCI and PCI legacy IRQ on HyperV virtual machine, and rejects the request to assign IRQ9 to PCI devices. So penalize legacy IRQ used by ACPI SCI and mark it unusable if ACPI SCI attributes conflict with PCI IRQ attributes. Please refer to following links for more information: https://bugzilla.kernel.org/show_bug.cgi?id=101301 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072 Fixes: cd68f6bd53cf ("x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI") Reported-and-tested-by: Nick Meier <nmeier@microsoft.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21PCI: Add dev_flags bit to access VPD through function 0Mark Rustad
commit 932c435caba8a2ce473a91753bad0173269ef334 upstream. Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through function 0 to provide VPD access on other functions. This is for hardware devices that provide copies of the same VPD capability registers in multiple functions. Because the kernel expects that each function has its own registers, both the locking and the state tracking are affected by VPD accesses to different functions. On such devices for example, if a VPD write is performed on function 0, *any* later attempt to read VPD from any other function of that device will hang. This has to do with how the kernel tracks the expected value of the F bit per function. Concurrent accesses to different functions of the same device can not only hang but also corrupt both read and write VPD data. When hangs occur, typically the error message: vpd r/w failed. This is likely a firmware bug on this device. will be seen. Never set this bit on function 0 or there will be an infinite recursion. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21iio: Add inverse unit conversion macrosLars-Peter Clausen
commit c689a923c867eac40ed3826c1d9328edea8b6bc7 upstream. Add inverse unit conversion macro to convert from standard IIO units to units that might be used by some devices. Those are useful in combination with scale factors that are specified as IIO_VAL_FRACTIONAL. Typically the denominator for those specifications will contain the maximum raw value the sensor will generate and the numerator the value it maps to in a specific unit. Sometimes datasheets specify those in different units than the standard IIO units (e.g. degree/s instead of rad/s) and so we need to do a unit conversion. From a mathematical point of view it does not make a difference whether we apply the unit conversion to the numerator or the inverse unit conversion to the denominator since (x / y) / z = x / (y * z). But as the denominator is typically a larger value and we are rounding both the numerator and denominator to integer values using the later method gives us a better precision (E.g. the relative error is smaller if we round 8000.3 to 8000 rather than rounding 8.3 to 8). This is where in inverse unit conversion macros will be used. Marked for stable as used by some upcoming fixes. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-13genirq: Introduce irq_chip_set_type_parent() helperGrygorii Strashko
commit b7560de198222994374c1340a389f12d5efb244a upstream. This helper is required for irq chips which do not implement a irq_set_type callback and need to call down the irq domain hierarchy for the actual trigger type change. This helper is required to fix further wreckage caused by the conversion of TI OMAP to hierarchical irq domains and therefor tagged for stable. [ tglx: Massaged changelog ] Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: <linux@arm.linux.org.uk> Cc: <nsekhar@ti.com> Cc: <jason@lakedaemon.net> Cc: <balbi@ti.com> Cc: <linux-arm-kernel@lists.infradead.org> Cc: <tony@atomide.com> Cc: <marc.zyngier@arm.com> Cc: stable@vger.kernel.org # 4.1 Link: http://lkml.kernel.org/r/1439554830-19502-3-git-send-email-grygorii.strashko@ti.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-13Revert "libata: Implement NCQ autosense"Tejun Heo
commit 74a80d67b8316eb3fbeb73dafc060a5a0a708587 upstream. This reverts commit 42b966fbf35da9c87f08d98f9b8978edf9e717cf. As implemented, ACS-4 sense reporting for ATA devices bypasses error diagnosis and handling in libata degrading EH behavior significantly. Revert the related changes for now. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-13Revert "libata: Implement support for sense data reporting"Tejun Heo
commit 84ded2f8e7dda336fc2fb3570726ceb3b3b3590f upstream. This reverts commit fe7173c206de63fc28475ee6ae42ff95c05692de. As implemented, ACS-4 sense reporting for ATA devices bypasses error diagnosis and handling in libata degrading EH behavior significantly. Revert the related changes for now. ATA_ID_COMMAND_SET_3/4 constants are not reverted as they're used by later changes. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-13Revert "libata-eh: Set 'information' field for autosense"Tejun Heo
commit fe16d4f202c59a560533a223bc6375739ee30944 upstream. This reverts commit a1524f226a02aa6edebd90ae0752e97cfd78b159. As implemented, ACS-4 sense reporting for ATA devices bypasses error diagnosis and handling in libata degrading EH behavior significantly. Revert the related changes for now. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-13drm/radeon: add new OLAND pci idAlex Deucher
commit e037239e5e7b61007763984aa35a8329596d8c88 upstream. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16mtd: nand: Fix NAND_USE_BOUNCE_BUFFER flag conflictScott Wood
commit 5f867db63473f32cce1b868e281ebd42a41f8fad upstream. Commit 66507c7bc8895f0da6b ("mtd: nand: Add support to use nand_base poi databuf as bounce buffer") added a flag NAND_USE_BOUNCE_BUFFER using the same bit value as the existing NAND_BUSWIDTH_AUTO. Cc: Kamal Dasu <kdasu.kdev@gmail.com> Fixes: 66507c7bc8895f0da6b ("mtd: nand: Add support to use nand_base poi databuf as bounce buffer") Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16PCI: Restore PCI_MSIX_FLAGS_BIRMASK definitionMichael S. Tsirkin
commit c9ddbac9c89110f77cb0fa07e634aaf1194899aa upstream. 09a2c73ddfc7 ("PCI: Remove unused PCI_MSIX_FLAGS_BIRMASK definition") removed PCI_MSIX_FLAGS_BIRMASK from an exported header because it was unused in the kernel. But that breaks user programs that were using it (QEMU in particular). Restore the PCI_MSIX_FLAGS_BIRMASK definition. [bhelgaas: changelog] Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-10iscsi-target: Fix iscsit_start_kthreads failure OOPsNicholas Bellinger
commit e54198657b65625085834847ab6271087323ffea upstream. This patch fixes a regression introduced with the following commit in v4.0-rc1 code, where a iscsit_start_kthreads() failure triggers a NULL pointer dereference OOPs: commit 88dcd2dab5c23b1c9cfc396246d8f476c872f0ca Author: Nicholas Bellinger <nab@linux-iscsi.org> Date: Thu Feb 26 22:19:15 2015 -0800 iscsi-target: Convert iscsi_thread_set usage to kthread.h To address this bug, move iscsit_start_kthreads() immediately preceeding the transmit of last login response, before signaling a successful transition into full-feature-phase within existing iscsi_target_do_tx_login_io() logic. This ensures that no target-side resource allocation failures can occur after the final login response has been successfully sent. Also, it adds a iscsi_conn->rx_login_comp to allow the RX thread to sleep to prevent other socket related failures until the final iscsi_post_login_handler() call is able to complete. Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-10efi: Handle memory error structures produced based on old versions of standardLuck, Tony
commit 4c62360d7562a20c996836d163259c87d9378120 upstream. The memory error record structure includes as its first field a bitmask of which subsequent fields are valid. The allows new fields to be added to the structure while keeping compatibility with older software that parses these records. This mechanism was used between versions 2.2 and 2.3 to add four new fields, growing the size of the structure from 73 bytes to 80. But Linux just added all the new fields so this test: if (gdata->error_data_length >= sizeof(*mem_err)) cper_print_mem(newpfx, mem_err); else goto err_section_too_small; now make Linux complain about old format records being too short. Add a definition for the old format of the structure and use that for the minimum size check. Pass the actual size to cper_print_mem() so it can sanity check the validation_bits field to ensure that if a BIOS using the old format sets bits as if it were new, we won't access fields beyond the end of the structure. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-10ftrace: Fix breakage of set_ftrace_pidSteven Rostedt (Red Hat)
commit e3eea1404f5ff7a2ceb7b5e7ba412a6fd94f2935 upstream. Commit 4104d326b670 ("ftrace: Remove global function list and call function directly") simplified the ftrace code by removing the global_ops list with a new design. But this cleanup also broke the filtering of PIDs that are added to the set_ftrace_pid file. Add back the proper hooks to have pid filtering working once again. Reported-by: Matt Fleming <matt@console-pimps.org> Reported-by: Richard Weinberger <richard.weinberger@gmail.com> Tested-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-10can: replace timestamp as unique skb attributeOliver Hartkopp
commit d3b58c47d330de8c29898fe9746f7530408f8a59 upstream. Commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for overlapping CAN filters" requires the skb->tstamp to be set to check for identical CAN skbs. Without timestamping to be required by user space applications this timestamp was not generated which lead to commit 36c01245eb8 "can: fix loss of CAN frames in raw_rcv" - which forces the timestamp to be set in all CAN related skbuffs by introducing several __net_timestamp() calls. This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb() to add __net_timestamp() after skbuff creation to prevent the frame loss fixed in mainline Linux. This patch removes the timestamp dependency and uses an atomic counter to create an unique identifier together with the skbuff pointer. Btw: the new skbcnt element introduced in struct can_skb_priv has to be initialized with zero in out-of-tree drivers which are not using alloc_can{,fd}_skb() too. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03nfs: increase size of EXCHANGE_ID name string bufferJeff Layton
commit 764ad8ba8cd4c6f836fca9378f8c5121aece0842 upstream. The current buffer is much too small if you have a relatively long hostname. Bring it up to the size of the one that SETCLIENTID has. Reported-by: Michael Skralivetsky <michael.skralivetsky@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03gpiolib: Add missing dummies for the unified device properties interfaceGeert Uytterhoeven
commit 496e7ce2a46562938edcb74f65b26068ee8895f6 upstream. If GPIOLIB=n: drivers/leds/leds-gpio.c: In function ‘gpio_leds_create’: drivers/leds/leds-gpio.c:187: error: implicit declaration of function ‘devm_get_gpiod_from_child’ drivers/leds/leds-gpio.c:187: warning: assignment makes pointer from integer without a cast Add dummies for fwnode_get_named_gpiod() and devm_get_gpiod_from_child() for the !GPIOLIB case to fix this. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: 40b7318319281b1b ("gpio: Support for unified device properties interface") Acked-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Bryan Wu <cooloney@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03of: return NUMA_NO_NODE from fallback of_node_to_nid()Konstantin Khlebnikov
commit c8fff7bc5bba6bd59cad40441c189c4efe7190f6 upstream. Node 0 might be offline as well as any other numa node, in this case kernel cannot handle memory allocation and crashes. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Fixes: 0c3f061c195c ("of: implement of_node_to_nid as a weak function") Signed-off-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03compiler-intel: fix wrong compiler barrier() macroDaniel Borkmann
commit b86a50c3b5414eafdbee7f34af4a201a4a7817c2 upstream. Cleanup commit 73679e508201 ("compiler-intel.h: Remove duplicate definition") removed the double definition of __memory_barrier() intrinsics. However, in doing so, it also removed the preceding #undef barrier by accident, meaning, the actual barrier() macro from compiler-gcc.h with inline asm is still in place as __GNUC__ is provided. Subsequently, barrier() can never be defined as __memory_barrier() from compiler.h since it already has a definition in place and if we trust the comment in compiler-intel.h, ecc doesn't support gcc specific asm statements. I don't have an ecc at hand (unsure if that's still used in the field?) and only found this by accident during code review, a revert of that cleanup would be simplest option. Fixes: 73679e508201 ("compiler-intel.h: Remove duplicate definition") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com> Cc: Pranith Kumar <bobby.prani@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: mancha security <mancha1@zoho.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPICA: Tables: Enable default 64-bit FADT addresses favorLv Zheng
commit 0ea61381788a37d864f9841b0fe97d40f7058f3b upstream. ACPICA commit 4da56eeae0749dfe8491285c1e1fad48f6efafd8 The following commit temporarily disables correct 64-bit FADT addresses favor during the period the root cause of the bug is not fixed: Commit: 85dbd5801f62b66e2aa7826aaefcaebead44c8a6 ACPICA: Tables: Restore old behavor to favor 32-bit FADT addresses. With enough protections, this patch re-enables 64-bit FADT addresses by default. If regressions are reported against such change, this patch should be bisected and reverted. Note that 64-bit FACS favor and 64-bit firmware waking vector favor are excluded by this commit in order not to break OSPMs. Lv Zheng. Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021 Link: https://github.com/acpica/acpica/commit/4da56eea Reported-and-tested-by: Oswald Buddenhagen <ossi@kde.org> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPICA: Tables: Fix an issue that FACS initialization is performed twiceLv Zheng
commit c04be18448355441a0c424362df65b6422e27bda upstream. ACPICA commit 90f5332a15e9d9ba83831ca700b2b9f708274658 This patch adds a new FACS initialization flag for acpi_tb_initialize(). acpi_enable_subsystem() might be invoked several times in OS bootup process, and we don't want FACS initialization to be invoked twice. Lv Zheng. Link: https://github.com/acpica/acpica/commit/90f5332a Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPICA: Tables: Enable both 32-bit and 64-bit FACSLv Zheng
commit c04e1fb4396d27f18296db0f914760fa7fe8223a upstream. ACPICA commit f7b86f35416e3d1f71c3d816ff5075ddd33ed486 The following commit is reported to have broken s2ram on some platforms: Commit: 0249ed2444d65d65fc3f3f64f398f1ad0b7e54cd ACPICA: Add option to favor 32-bit FADT addresses. The platform reports 2 FACS tables (which is not allowed by ACPI specification) and the new 32-bit address favor rule forces OSPMs to use the FACS table reported via FADT's X_FIRMWARE_CTRL field. The root cause of the reported bug might be one of the followings: 1. BIOS may favor the 64-bit firmware waking vector address when the version of the FACS is greater than 0 and Linux currently only supports resuming from the real mode, so the 64-bit firmware waking vector has never been set and might be invalid to BIOS while the commit enables higher version FACS. 2. BIOS may favor the FACS reported via the "FIRMWARE_CTRL" field in the FADT while the commit doesn't set the firmware waking vector address of the FACS reported by "FIRMWARE_CTRL", it only sets the firware waking vector address of the FACS reported by "X_FIRMWARE_CTRL". This patch excludes the cases that can trigger the bugs caused by the root cause 2. There is no handshaking mechanism can be used by OSPM to tell BIOS which FACS is currently used. Thus the FACS reported by "FIRMWARE_CTRL" may still be used by BIOS and the 0 value of the 32-bit firmware waking vector might trigger such failure. This patch tries to favor 32bit FACS address in another way where both the FACS reported by "FIRMWARE_CTRL" and the FACS reported by "X_FIRMWARE_CTRL" are loaded so that further commit can set firmware waking vector in the both tables to ensure we can exclude the cases that trigger the bugs caused by the root cause 2. The exclusion is split into 2 commits as this commit is also useful for dumping more ACPI tables, it won't get reverted when such exclusion is no longer necessary. Lv Zheng. Link: https://bugzilla.kernel.org/show_bug.cgi?id=74021 Link: https://github.com/acpica/acpica/commit/f7b86f35 Reported-and-tested-by: Oswald Buddenhagen <ossi@kde.org> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stageRafael J. Wysocki
commit 0294112ee3135fbd15eaa70015af8283642dd970 upstream. This effectively reverts the following three commits: 7bc10388ccdd ACPI / resources: free memory on error in add_region_before() 0f1b414d1907 ACPI / PNP: Avoid conflicting resource reservations b9a5e5e18fbf ACPI / init: Fix the ordering of acpi_reserve_resources() (commit b9a5e5e18fbf introduced regressions some of which, but not all, were addressed by commit 0f1b414d1907 and commit 7bc10388ccdd was a fixup on top of the latter) and causes ACPI fixed hardware resources to be reserved at the fs_initcall_sync stage of system initialization. The story is as follows. First, a boot regression was reported due to an apparent resource reservation ordering change after a commit that shouldn't lead to such changes. Investigation led to the conclusion that the problem happened because acpi_reserve_resources() was executed at the device_initcall() stage of system initialization which wasn't strictly ordered with respect to driver initialization (and with respect to the initialization of the pcieport driver in particular), so a random change causing the device initcalls to be run in a different order might break things. The response to that was to attempt to run acpi_reserve_resources() as soon as we knew that ACPI would be in use (commit b9a5e5e18fbf). However, that turned out to be too early, because it caused resource reservations made by the PNP system driver to fail on at least one system and that failure was addressed by commit 0f1b414d1907. That fix still turned out to be insufficient, though, because calling acpi_reserve_resources() before the fs_initcall stage of system initialization caused a boot regression to happen on the eCAFE EC-800-H20G/S netbook. That meant that we only could call acpi_reserve_resources() at the fs_initcall initialization stage or later, but then we might just as well call it after the PNP initalization in which case commit 0f1b414d1907 wouldn't be necessary any more. For this reason, the changes made by commit 0f1b414d1907 are reverted (along with a memory leak fixup on top of that commit), the changes made by commit b9a5e5e18fbf that went too far are reverted too and acpi_reserve_resources() is changed into fs_initcall_sync, which will cause it to be executed after the PNP subsystem initialization (which is an fs_initcall) and before device initcalls (including the pcieport driver initialization) which should avoid the initial issue. Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581 Link: http://marc.info/?t=143092384600002&r=1&w=2 Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831 Link: http://marc.info/?t=143389402600001&r=1&w=2 Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()" Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03drm/i915: Use two 32bit reads for select 64bit REG_READ ioctlsChris Wilson
commit 648a9bc5308d952f2c80772301b339f73026f013 upstream. Since the hardware sometimes mysteriously totally flummoxes the 64bit read of a 64bit register when read using a single instruction, split the read into two instructions. Since the read here is of automatically incrementing timestamp counters, we also have to be very careful in order to make sure that it does not increment between the two instructions. However, since userspace tried to workaround this issue and so enshrined this ABI for a broken hardware read and in the process neglected that the read only fails in some environments, we have to introduce a new uABI flag for userspace to request the 2x32 bit accurate read of the timestamp. v2: Fix alignment check and include details of the workaround for userspace. Reported-by: Karol Herbst <freedesktop@karolherbst.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91317 Testcase: igt/gem_reg_read Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Tested-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03drm/atomic: fix out of bounds read in for_each_*_in_state helpersAndrey Ryabinin
commit 60f207a5b6d8f23c2e8388b415e8d5c7311cc79d upstream. for_each_*_in_state validate array index after access to array elements, thus perform out of bounds read. Fix this by validating index in the first place and read array element iff validation was successful. Fixes: df63b9994eaf ("drm/atomic: Add for_each_{connector,crtc,plane}_in_state helper macros") Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03drm/dp/mst: close deadlock in connector destruction.Dave Airlie
commit 6b8eeca65b18ae77e175cc2b6571731f0ee413bf upstream. I've only seen this once, and I failed to capture the lockdep backtrace, but I did some investigations. If we are calling into the MST layer from EDID probing, we have the mode_config mutex held, if during that EDID probing, the MST hub goes away, then we can get a deadlock where the connector destruction function in the driver tries to retake the mode config mutex. This offloads connector destruction to a workqueue, and avoid the subsequenct lock ordering issue. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03libata: add ATA_HORKAGE_MAX_SEC_1024 to revert back to previous max_sectors ↵David Milburn
limit commit af34d637637eabaf49406eb35c948cd51ba262a6 upstream. Since no longer limiting max_sectors to BLK_DEF_MAX_SECTORS (commit 34b48db66e08), data corruption may occur on ST380013AS drive configured on 82801JI (ICH10 Family) SATA controller. This patch will allow the driver to limit max_sectors as before # cat /sys/block/sdb/queue/max_sectors_kb 512 I was able to double the max_sectors_kb value up to 16384 on linux-4.2.0-rc2 before seeing corruption, but seems safer to use previous limit. Without this patch max_sectors_kb will be 32767. tj: Minor comment update. Reported-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: David Milburn <dmilburn@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 34b48db66e08 ("block: remove artifical max_hw_sectors cap") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03libata: add ATA_HORKAGE_NOTRIMArne Fitzenreiter
commit 71d126fd28de2d4d9b7b2088dbccd7ca62fad6e0 upstream. Some devices lose data on TRIM whether queued or not. This patch adds a horkage to disable TRIM. tj: Collapsed unnecessary if() nesting. Signed-off-by: Arne Fitzenreiter <arne_f@ipfire.org> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03libata: Fall back to unqueued READ LOG EXT if the DMA variant failsMartin K. Petersen
commit 5d3abf8ff67f49271a42c0f7fa4f20f9e046bf0e upstream. Some devices advertise support for the READ/WRITE LOG DMA EXT commands but fail when we try to issue them. This can lead to queued TRIM being unintentionally disabled since the relevant feature flag is located in a general purpose log page. Fall back to unqueued READ LOG EXT if the DMA variant fails while reading a log page. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03jbd2: fix ocfs2 corrupt when updating journal superblock failsJoseph Qi
commit 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a upstream. If updating journal superblock fails after journal data has been flushed, the error is omitted and this will mislead the caller as a normal case. In ocfs2, the checkpoint will be treated successfully and the other node can get the lock to update. Since the sb_start is still pointing to the old log block, it will rewrite the journal data during journal recovery by the other node. Thus the new updates will be overwritten and ocfs2 corrupts. So in above case we have to return the error, and ocfs2_commit_cache will take care of the error and prevent the other node to do update first. And only after recovering journal it can do the new updates. The issue discussion mail can be found at: https://oss.oracle.com/pipermail/ocfs2-devel/2015-June/010856.html http://comments.gmane.org/gmane.comp.file-systems.ext4/48841 [ Fixed bug in patch which allowed a non-negative error return from jbd2_cleanup_journal_tail() to leak out of jbd2_fjournal_flush(); this was causing xfstests ext4/306 to fail. -- Ted ] Reported-by: Yiwen Jiang <jiangyiwen@huawei.com> Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Tested-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03bufferhead: Add _gfp version for sb_getblk()Nikolay Borisov
commit bd7ade3cd9b0850264306f5c2b79024a417b6396 upstream. sb_getblk() is used during ext4 (and possibly other FSes) writeback paths. Sometimes such path require allocating memory and guaranteeing that such allocation won't block. Currently, however, there is no way to provide user flags for sb_getblk which could lead to deadlocks. This patch implements a sb_getblk_gfp with the only difference it can accept user-provided GFP flags. Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03hid-sensor: Fix suspend/resume delaySrinivas Pandruvada
commit 1e25aa9641e8f3fa39cd5e46b4afcafd7f12a44b upstream. By default all the sensors are runtime suspended state (lowest power state). During Linux suspend process, all the run time suspended devices are resumed and then suspended. This caused all sensors to power up and introduced delay in suspend time, when we introduced runtime PM for HID sensors. The opposite process happens during resume process. To fix this, we do powerup process of the sensors only when the request is issued from user (raw or tiggerred). In this way when runtime, resume calls for powerup it will simply return as this will not match user requested state. Note this is a regression fix as the increase in suspend / resume times can be substantial (report of 8 seconds on Len's laptop!) Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Len Brown <len.brown@intel.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21PCI: Add pci_bus_addr_tYinghai Lu
commit 3a9ad0b4fdcd57f775d3615004c8c64c021a9e7d upstream. David Ahern reported that d63e2e1f3df9 ("sparc/PCI: Clip bridge windows to fit in upstream windows") fails to boot on sparc/T5-8: pci 0000:06:00.0: reg 0x184: can't handle BAR above 4GB (bus address 0x110204000) The problem is that sparc64 assumed that dma_addr_t only needed to hold DMA addresses, i.e., bus addresses returned via the DMA API (dma_map_single(), etc.), while the PCI core assumed dma_addr_t could hold *any* bus address, including raw BAR values. On sparc64, all DMA addresses fit in 32 bits, so dma_addr_t is a 32-bit type. However, BAR values can be 64 bits wide, so they don't fit in a dma_addr_t. d63e2e1f3df9 added new checking that tripped over this mismatch. Add pci_bus_addr_t, which is wide enough to hold any PCI bus address, including both raw BAR values and DMA addresses. This will be 64 bits on 64-bit platforms and on platforms with a 64-bit dma_addr_t. Then dma_addr_t only needs to be wide enough to hold addresses from the DMA API. [bhelgaas: changelog, bugzilla, Kconfig to ensure pci_bus_addr_t is at least as wide as dma_addr_t, documentation] Fixes: d63e2e1f3df9 ("sparc/PCI: Clip bridge windows to fit in upstream windows") Fixes: 23b13bc76f35 ("PCI: Fail safely if we can't handle BARs larger than 4GB") Link: http://lkml.kernel.org/r/CAE9FiQU1gJY1LYrxs+ma5LCTEEe4xmtjRG0aXJ9K_Tsu+m9Wuw@mail.gmail.com Link: http://lkml.kernel.org/r/1427857069-6789-1-git-send-email-yinghai@kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=96231 Reported-by: David Ahern <david.ahern@oracle.com> Tested-by: David Ahern <david.ahern@oracle.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21PCI: Propagate the "ignore hotplug" setting to parentRafael J. Wysocki
commit 0824965140fff1bf640a987dc790d1594a8e0699 upstream. Refine the mechanism introduced by commit f244d8b623da ("ACPIPHP / radeon / nouveau: Fix VGA switcheroo problem related to hotplug") to propagate the ignore_hotplug setting of the device to its parent bridge in case hotplug notifications related to the graphics adapter switching are given for the bridge rather than for the device itself (they need to be ignored in both cases). Link: https://bugzilla.kernel.org/show_bug.cgi?id=61891 Link: https://bugs.freedesktop.org/show_bug.cgi?id=88927 Fixes: b440bde74f04 ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device") Reported-and-tested-by: tiagdtd-lava <tiagdtd-lava@yahoo.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc()Larry Finger
commit 8a8c35fadfaf55629a37ef1a8ead1b8fb32581d2 upstream. Beginning at commit d52d3997f843 ("ipv6: Create percpu rt6_info"), the following INFO splat is logged: =============================== [ INFO: suspicious RCU usage. ] 4.1.0-rc7-next-20150612 #1 Not tainted ------------------------------- kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 3 locks held by systemd/1: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815f0c8f>] rtnetlink_rcv+0x1f/0x40 #1: (rcu_read_lock_bh){......}, at: [<ffffffff816a34e2>] ipv6_add_addr+0x62/0x540 #2: (addrconf_hash_lock){+...+.}, at: [<ffffffff816a3604>] ipv6_add_addr+0x184/0x540 stack backtrace: CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1 Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014 Call Trace: dump_stack+0x4c/0x6e lockdep_rcu_suspicious+0xe7/0x120 ___might_sleep+0x1d5/0x1f0 __might_sleep+0x4d/0x90 kmem_cache_alloc+0x47/0x250 create_object+0x39/0x2e0 kmemleak_alloc_percpu+0x61/0xe0 pcpu_alloc+0x370/0x630 Additional backtrace lines are truncated. In addition, the above splat is followed by several "BUG: sleeping function called from invalid context at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau, these are the clue to the fix. Routine kmemleak_alloc_percpu() always uses GFP_KERNEL for its allocations, whereas it should follow the gfp from its callers. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21power_supply: Fix possible NULL pointer dereference on early ueventKrzysztof Kozlowski
commit 7f1a57fdd6cb6e7be2ed31878a34655df38e1861 upstream. Don't call the power_supply_changed() from power_supply_register() when parent is still probing because it may lead to accessing parent too early. In bq27x00_battery this caused NULL pointer exception because uevent of power_supply_changed called back the the get_property() method provided by the driver. The get_property() method accessed pointer which should be returned by power_supply_register(). Starting from bq27x00_battery_probe(): di->bat = power_supply_register() power_supply_changed() kobject_uevent() power_supply_uevent() power_supply_show_property() power_supply_get_property() bq27x00_battery_get_property() dereference of di->bat which is NULL here The dereference of di->bat (value returned by power_supply_register()) is the currently visible problem. However calling back the methods provided by driver before ending the probe may lead to accessing other driver-related data which is not yet initialized. The call to power_supply_changed() is postponed till probing ends - mutex of parent device is released. Reported-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Fixes: 297d716f6260 ("power_supply: Change ownership from driver to core") Tested-By: Dr. H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21ACPI / PNP: Avoid conflicting resource reservationsRafael J. Wysocki
commit 0f1b414d190724617eb1cdd615592fa8cd9d0b50 upstream. Commit b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()" overlooked the fact that the memory and/or I/O regions reserved by acpi_reserve_resources() may conflict with those reserved by the PNP "system" driver. If that conflict actually takes place, it causes the reservations made by the "system" driver to fail while before commit b9a5e5e18fbf all reservations made by it and by acpi_reserve_resources() would be successful. In turn, that allows the resources that haven't been reserved by the "system" driver to be used by others (e.g. PCI) which sometimes leads to functional problems (up to and including boot failures). To fix that issue, introduce a common resource reservation routine, acpi_reserve_region(), to be used by both acpi_reserve_resources() and the "system" driver, that will track all resources reserved by it and avoid making conflicting requests. Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831 Link: http://marc.info/?t=143389402600001&r=1&w=2 Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()" Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21ACPI / init: Switch over platform to the ACPI mode laterRafael J. Wysocki
commit b064a8fa77dfead647564c46ac8fc5b13bd1ab73 upstream. Commit 73f7d1ca3263 "ACPI / init: Run acpi_early_init() before timekeeping_init()" moved the ACPI subsystem initialization, including the ACPI mode enabling, to an earlier point in the initialization sequence, to allow the timekeeping subsystem use ACPI early. Unfortunately, that resulted in boot regressions on some systems and the early ACPI initialization was moved toward its original position in the kernel initialization code by commit c4e1acbb35e4 "ACPI / init: Invoke early ACPI initialization later". However, that turns out to be insufficient, as boot is still broken on the Tyan S8812 mainboard. To fix that issue, split the ACPI early initialization code into two pieces so the majority of it still located in acpi_early_init() and the part switching over the platform into the ACPI mode goes into a new function, acpi_subsystem_init(), executed at the original early ACPI initialization spot. That fixes the Tyan S8812 boot problem, but still allows ACPI tables to be loaded earlier which is useful to the EFI code in efi_enter_virtual_mode(). Link: https://bugzilla.kernel.org/show_bug.cgi?id=97141 Fixes: 73f7d1ca3263 "ACPI / init: Run acpi_early_init() before timekeeping_init()" Reported-and-tested-by: Marius Tolzmann <tolzmann@molgen.mpg.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21mnt: Refactor the logic for mounting sysfs and proc in a user namespaceEric W. Biederman
commit 1b852bceb0d111e510d1a15826ecc4a19358d512 upstream. Fresh mounts of proc and sysfs are a very special case that works very much like a bind mount. Unfortunately the current structure can not preserve the MNT_LOCK... mount flags. Therefore refactor the logic into a form that can be modified to preserve those lock bits. Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount of the filesystem be fully visible in the current mount namespace, before the filesystem may be mounted. Move the logic for calling fs_fully_visible from proc and sysfs into fs/namespace.c where it has greater access to mount namespace state. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21sysfs: Add support for permanently empty directories to serve as mount points.Eric W. Biederman
commit 87d2846fcf88113fae2341da1ca9a71f0d916f2c upstream. Add two functions sysfs_create_mount_point and sysfs_remove_mount_point that hang a permanently empty directory off of a kobject or remove a permanently emptpy directory hanging from a kobject. Export these new functions so modular filesystems can use them. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21kernfs: Add support for always empty directories.Eric W. Biederman
commit ea015218f2f7ace2dad9cedd21ed95bdba2886d7 upstream. Add a new function kernfs_create_empty_dir that can be used to create directory that can not be modified. Update the code to use make_empty_dir_inode when reporting a permanently empty directory to the vfs. Update the code to not allow adding to permanently empty directories. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21sysctl: Allow creating permanently empty directories that serve as mountpoints.Eric W. Biederman
commit f9bd6733d3f11e24f3949becf277507d422ee1eb upstream. Add a magic sysctl table sysctl_mount_point that when used to create a directory forces that directory to be permanently empty. Update the code to use make_empty_dir_inode when accessing permanently empty directories. Update the code to not allow adding to permanently empty directories. Update /proc/sys/fs/binfmt_misc to be a permanently empty directory. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-21fs: Add helper functions for permanently empty directories.Eric W. Biederman
commit fbabfd0f4ee2e8847bf56edf481249ad1bb8c44d upstream. To ensure it is safe to mount proc and sysfs I need to check if filesystems that are mounted on top of them are mounted on truly empty directories. Given that some directories can gain entries over time, knowing that a directory is empty right now is insufficient. Therefore add supporting infrastructure for permantently empty directories that proc and sysfs can use when they create mount points for filesystems and fs_fully_visible can use to test for permanently empty directories to ensure that nothing will be gained by mounting a fresh copy of proc or sysfs. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>