summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2015-10-27crypto: sparc - initialize blkcipher.ivsizeDave Kleikamp
commit a66d7f724a96d6fd279bfbd2ee488def6b081bea upstream. Some of the crypto algorithms write to the initialization vector, but no space has been allocated for it. This clobbers adjacent memory. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-27m68k/uaccess: Fix asm constraints for userspace accessGeert Uytterhoeven
commit 631d8b674f5f8235e9cb7e628b0fe9e5200e3158 upstream. When compiling a MMU kernel with CPU_HAS_ADDRESS_SPACES=n (e.g. "MMU=y allnoconfig": "echo CONFIG_MMU=y > allno.config && make KCONFIG_ALLCONFIG=1 allnoconfig"), we use plain "move" instead of "moves", and I got: CC arch/m68k/lib/uaccess.o {standard input}: Assembler messages: {standard input}:47: Error: operands mismatch -- statement `move.b %a0,(%a1)' ignored This happens because plain "move" doesn't support byte transfers between memory and address registers, while "moves" does. Fix the asm constraints for __generic_copy_from_user(), __generic_copy_to_user(), and __clear_user() to only use data registers when accessing userspace. Also, relax the asm constraints for 16-bit userspace accesses in __put_user() and __get_user(), as both "move" and "moves" do support such transfers between memory and address registers. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomicAndi Kleen
commit ff47ab4ff3cddfa7bc1b25b990e24abe2ae474ff upstream. The 64bit __copy_{from,to}_user_inatomic always called copy_from_user_generic, but skipped the special optimizations for 1/2/4/8 byte accesses. This especially hurts the futex call, which accesses the 4 byte futex user value with a complicated fast string operation in a function call, instead of a single movl. Use __copy_{from,to}_user for _inatomic instead to get the same optimizations. The only problem was the might_fault() in those functions. So move that to new wrapper and call __copy_{f,t}_user_nocheck() from *_inatomic directly. 32bit already did this correctly by duplicating the code. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1376687844-19857-2-git-send-email-andi@firstfloor.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22m68k: Define asmlinkage_protectAndreas Schwab
commit 8474ba74193d302e8340dddd1e16c85cc4b98caf upstream. Make sure the compiler does not modify arguments of syscall functions. This can happen if the compiler generates a tailcall to another function. For example, without asmlinkage_protect sys_openat is compiled into this function: sys_openat: clr.l %d0 move.w 18(%sp),%d0 move.l %d0,16(%sp) jbra do_sys_open Note how the fourth argument is modified in place, modifying the register %d4 that gets restored from this stack slot when the function returns to user-space. The caller may expect the register to be unmodified across system calls. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22arm64: readahead: fault retry breaks mmap file read random detectionMark Salyzyn
commit 569ba74a7ba69f46ce2950bf085b37fea2408385 upstream. This is the arm64 portion of commit 45cac65b0fcd ("readahead: fault retry breaks mmap file read random detection"), which was absent from the initial port and has since gone unnoticed. The original commit says: > .fault now can retry. The retry can break state machine of .fault. In > filemap_fault, if page is miss, ra->mmap_miss is increased. In the second > try, since the page is in page cache now, ra->mmap_miss is decreased. And > these are done in one fault, so we can't detect random mmap file access. > > Add a new flag to indicate .fault is tried once. In the second try, skip > ra->mmap_miss decreasing. The filemap_fault state machine is ok with it. With this change, Mark reports that: > Random read improves by 250%, sequential read improves by 40%, and > random write by 400% to an eMMC device with dm crypto wrapped around it. Cc: Shaohua Li <shli@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Riley Andrews <riandrews@android.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22powerpc/MSI: Fix race condition in tearing down MSI interruptsPaul Mackerras
commit e297c939b745e420ef0b9dc989cb87bda617b399 upstream. This fixes a race which can result in the same virtual IRQ number being assigned to two different MSI interrupts. The most visible consequence of that is usually a warning and stack trace from the sysfs code about an attempt to create a duplicate entry in sysfs. The race happens when one CPU (say CPU 0) is disposing of an MSI while another CPU (say CPU 1) is setting up an MSI. CPU 0 calls (for example) pnv_teardown_msi_irqs(), which calls msi_bitmap_free_hwirqs() to indicate that the MSI (i.e. its hardware IRQ number) is no longer in use. Then, before CPU 0 gets to calling irq_dispose_mapping() to free up the virtal IRQ number, CPU 1 comes in and calls msi_bitmap_alloc_hwirqs() to allocate an MSI, and gets the same hardware IRQ number that CPU 0 just freed. CPU 1 then calls irq_create_mapping() to get a virtual IRQ number, which sees that there is currently a mapping for that hardware IRQ number and returns the corresponding virtual IRQ number (which is the same virtual IRQ number that CPU 0 was using). CPU 0 then calls irq_dispose_mapping() and frees that virtual IRQ number. Now, if another CPU comes along and calls irq_create_mapping(), it is likely to get the virtual IRQ number that was just freed, resulting in the same virtual IRQ number apparently being used for two different hardware interrupts. To fix this race, we just move the call to msi_bitmap_free_hwirqs() to after the call to irq_dispose_mapping(). Since virq_to_hw() doesn't work for the virtual IRQ number after irq_dispose_mapping() has been called, we need to call it before irq_dispose_mapping() and remember the result for the msi_bitmap_free_hwirqs() call. The pattern of calling msi_bitmap_free_hwirqs() before irq_dispose_mapping() appears in 5 places under arch/powerpc, and appears to have originated in commit 05af7bd2d75e ("[POWERPC] MPIC U3/U4 MSI backend") from 2007. Fixes: 05af7bd2d75e ("[POWERPC] MPIC U3/U4 MSI backend") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22MIPS: dma-default: Fix 32-bit fall back to GFP_DMAJames Hogan
commit 53960059d56ecef67d4ddd546731623641a3d2d1 upstream. If there is a DMA zone (usually 24bit = 16MB I believe), but no DMA32 zone, as is the case for some 32-bit kernels, then massage_gfp_flags() will cause DMA memory allocated for devices with a 32..63-bit coherent_dma_mask to fall back to using __GFP_DMA, even though there may only be 32-bits of physical address available anyway. Correct that case to compare against a mask the size of phys_addr_t instead of always using a 64-bit mask. Signed-off-by: James Hogan <james.hogan@imgtec.com> Fixes: a2e715a86c6d ("MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.") Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/9610/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22x86/xen: Support kexec/kdump in HVM guests by doing a soft resetVitaly Kuznetsov
commit 0b34a166f291d255755be46e43ed5497cdd194f2 upstream. Currently there is a number of issues preventing PVHVM Xen guests from doing successful kexec/kdump: - Bound event channels. - Registered vcpu_info. - PIRQ/emuirq mappings. - shared_info frame after XENMAPSPACE_shared_info operation. - Active grant mappings. Basically, newly booted kernel stumbles upon already set up Xen interfaces and there is no way to reestablish them. In Xen-4.7 a new feature called 'soft reset' is coming. A guest performing kexec/kdump operation is supposed to call SCHEDOP_shutdown hypercall with SHUTDOWN_soft_reset reason before jumping to new kernel. Hypervisor (with some help from toolstack) will do full domain cleanup (but keeping its memory and vCPU contexts intact) returning the guest to the state it had when it was first booted and thus allowing it to start over. Doing SHUTDOWN_soft_reset on Xen hypervisors which don't support it is probably OK as by default all unknown shutdown reasons cause domain destroy with a message in toolstack log: 'Unknown shutdown reason code 5. Destroying domain.' which gives a clue to what the problem is and eliminates false expectations. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22x86/mm: Set NX on gap between __ex_table and rodataStephen Smalley
commit ab76f7b4ab2397ffdd2f1eb07c55697d19991d10 upstream. Unused space between the end of __ex_table and the start of rodata can be left W+x in the kernel page tables. Extend the setting of the NX bit to cover this gap by starting from text_end rather than rodata_start. Before: ---[ High Kernel Mapping ]--- 0xffffffff80000000-0xffffffff81000000 16M pmd 0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd 0xffffffff81600000-0xffffffff81754000 1360K ro GLB x pte 0xffffffff81754000-0xffffffff81800000 688K RW GLB x pte 0xffffffff81800000-0xffffffff81a00000 2M ro PSE GLB NX pmd 0xffffffff81a00000-0xffffffff81b3b000 1260K ro GLB NX pte 0xffffffff81b3b000-0xffffffff82000000 4884K RW GLB NX pte 0xffffffff82000000-0xffffffff82200000 2M RW PSE GLB NX pmd 0xffffffff82200000-0xffffffffa0000000 478M pmd After: ---[ High Kernel Mapping ]--- 0xffffffff80000000-0xffffffff81000000 16M pmd 0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd 0xffffffff81600000-0xffffffff81754000 1360K ro GLB x pte 0xffffffff81754000-0xffffffff81800000 688K RW GLB NX pte 0xffffffff81800000-0xffffffff81a00000 2M ro PSE GLB NX pmd 0xffffffff81a00000-0xffffffff81b3b000 1260K ro GLB NX pte 0xffffffff81b3b000-0xffffffff82000000 4884K RW GLB NX pte 0xffffffff82000000-0xffffffff82200000 2M RW PSE GLB NX pmd 0xffffffff82200000-0xffffffffa0000000 478M pmd Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1443704662-3138-1-git-send-email-sds@tycho.nsa.gov Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22Use WARN_ON_ONCE for missing X86_FEATURE_NRIPSDirk Müller
commit d2922422c48df93f3edff7d872ee4f3191fefb08 upstream. The cpu feature flags are not ever going to change, so warning everytime can cause a lot of kernel log spam (in our case more than 10GB/hour). The warning seems to only occur when nested virtualization is enabled, so it's probably triggered by a KVM bug. This is a sensible and safe change anyway, and the KVM bug fix might not be suitable for stable releases anyway. Signed-off-by: Dirk Mueller <dmueller@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22x86/platform: Fix Geode LX timekeeping in the generic x86 buildDavid Woodhouse
commit 03da3ff1cfcd7774c8780d2547ba0d995f7dc03d upstream. In 2007, commit 07190a08eef36 ("Mark TSC on GeodeLX reliable") bypassed verification of the TSC on Geode LX. However, this code (now in the check_system_tsc_reliable() function in arch/x86/kernel/tsc.c) was only present if CONFIG_MGEODE_LX was set. OpenWRT has recently started building its generic Geode target for Geode GX, not LX, to include support for additional platforms. This broke the timekeeping on LX-based devices, because the TSC wasn't marked as reliable: https://dev.openwrt.org/ticket/20531 By adding a runtime check on is_geode_lx(), we can also include the fix if CONFIG_MGEODEGX1 or CONFIG_X86_GENERIC are set, thus fixing the problem. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: Andres Salomon <dilinger@queued.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Marcelo Tosatti <marcelo@kvack.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1442409003.131189.87.camel@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22x86/apic: Serialize LVTT and TSC_DEADLINE writesShaohua Li
commit 5d7c631d926b59aa16f3c56eaeb83f1036c81dc7 upstream. The APIC LVTT register is MMIO mapped but the TSC_DEADLINE register is an MSR. The write to the TSC_DEADLINE MSR is not serializing, so it's not guaranteed that the write to LVTT has reached the APIC before the TSC_DEADLINE MSR is written. In such a case the write to the MSR is ignored and as a consequence the local timer interrupt never fires. The SDM decribes this issue for xAPIC and x2APIC modes. The serialization methods recommended by the SDM differ. xAPIC: "1. Memory-mapped write to LVT Timer Register, setting bits 18:17 to 10b. 2. WRMSR to the IA32_TSC_DEADLINE MSR a value much larger than current time-stamp counter. 3. If RDMSR of the IA32_TSC_DEADLINE MSR returns zero, go to step 2. 4. WRMSR to the IA32_TSC_DEADLINE MSR the desired deadline." x2APIC: "To allow for efficient access to the APIC registers in x2APIC mode, the serializing semantics of WRMSR are relaxed when writing to the APIC registers. Thus, system software should not use 'WRMSR to APIC registers in x2APIC mode' as a serializing instruction. Read and write accesses to the APIC registers will occur in program order. A WRMSR to an APIC register may complete before all preceding stores are globally visible; software can prevent this by inserting a serializing instruction, an SFENCE, or an MFENCE before the WRMSR." The xAPIC method is to just wait for the memory mapped write to hit the LVTT by checking whether the MSR write has reached the hardware. There is no reason why a proper MFENCE after the memory mapped write would not do the same. Andi Kleen confirmed that MFENCE is sufficient for the xAPIC case as well. Issue MFENCE before writing to the TSC_DEADLINE MSR. This can be done unconditionally as all CPUs which have TSC_DEADLINE also have MFENCE support. [ tglx: Massaged the changelog ] Signed-off-by: Shaohua Li <shli@fb.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: <Kernel-team@fb.com> Cc: <lenb@kernel.org> Cc: <fenghua.yu@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/20150909041352.GA2059853@devbig257.prn2.facebook.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-22ARM: 8429/1: disable GCC SRA optimizationArd Biesheuvel
commit a077224fd35b2f7fbc93f14cf67074fc792fbac2 upstream. While working on the 32-bit ARM port of UEFI, I noticed a strange corruption in the kernel log. The following snprintf() statement (in drivers/firmware/efi/efi.c:efi_md_typeattr_format()) snprintf(pos, size, "|%3s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]", was producing the following output in the log: | | | | | |WB|WT|WC|UC] | | | | | |WB|WT|WC|UC] | | | | | |WB|WT|WC|UC] |RUN| | | | |WB|WT|WC|UC]* |RUN| | | | |WB|WT|WC|UC]* | | | | | |WB|WT|WC|UC] |RUN| | | | |WB|WT|WC|UC]* | | | | | |WB|WT|WC|UC] |RUN| | | | | | | |UC] |RUN| | | | | | | |UC] As it turns out, this is caused by incorrect code being emitted for the string() function in lib/vsprintf.c. The following code if (!(spec.flags & LEFT)) { while (len < spec.field_width--) { if (buf < end) *buf = ' '; ++buf; } } for (i = 0; i < len; ++i) { if (buf < end) *buf = *s; ++buf; ++s; } while (len < spec.field_width--) { if (buf < end) *buf = ' '; ++buf; } when called with len == 0, triggers an issue in the GCC SRA optimization pass (Scalar Replacement of Aggregates), which handles promotion of signed struct members incorrectly. This is a known but as yet unresolved issue. (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932). In this particular case, it is causing the second while loop to be executed erroneously a single time, causing the additional space characters to be printed. So disable the optimization by passing -fno-ipa-sra. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01x86: bpf_jit: fix compilation of large bpf programsAlexei Starovoitov
commit 3f7352bf21f8fd7ba3e2fcef9488756f188e12be upstream. x86 has variable length encoding. x86 JIT compiler is trying to pick the shortest encoding for given bpf instruction. While doing so the jump targets are changing, so JIT is doing multiple passes over the program. Typical program needs 3 passes. Some very short programs converge with 2 passes. Large programs may need 4 or 5. But specially crafted bpf programs may hit the pass limit and if the program converges on the last iteration the JIT compiler will be producing an image full of 'int 3' insns. Fix this corner case by doing final iteration over bpf program. Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01parisc: Filter out spurious interrupts in PA-RISC irq handlerHelge Deller
commit b1b4e435e4ef7de77f07bf2a42c8380b960c2d44 upstream. When detecting a serial port on newer PA-RISC machines (with iosapic) we have a long way to go to find the right IRQ line, registering it, then registering the serial port and the irq handler for the serial port. During this phase spurious interrupts for the serial port may happen which then crashes the kernel because the action handler might not have been set up yet. So, basically it's a race condition between the serial port hardware and the CPU which sets up the necessary fields in the irq sructs. The main reason for this race is, that we unmask the serial port irqs too early without having set up everything properly before (which isn't easily possible because we need the IRQ number to register the serial ports). This patch is a work-around for this problem. It adds checks to the CPU irq handler to verify if the IRQ action field has been initialized already. If not, we just skip this interrupt (which isn't critical for a serial port at bootup). The real fix would probably involve rewriting all PA-RISC specific IRQ code (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of the irq chips and proper irq enabling along this line. This bug has been in the PA-RISC port since the beginning, but the crashes happened very rarely with currently used hardware. But on the latest machine which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900, 1GHz) and which has the largest possible L1 cache size (64MB each), the kernel crashed at every boot because of this race. So, without this patch the machine would currently be unuseable. For the record, here is the flow logic: 1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq(). 2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq. 3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq 4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!) 5. serial_init_chip() then registers the 8250 port. Problems: - In step 4 the CPU irq shouldn't have been registered yet, but after step 5 - If serial irq happens between 4 and 5 have finished, the kernel will crash Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01x86/mm: Initialize pmd_idx in page_table_range_init_count()Minfei Huang
commit 9962eea9e55f797f05f20ba6448929cab2a9f018 upstream. The variable pmd_idx is not initialized for the first iteration of the for loop. Assign the proper value which indexes the start address. Fixes: 719272c45b82 'x86, mm: only call early_ioremap_page_table_range_init() once' Signed-off-by: Minfei Huang <mnfhuang@gmail.com> Cc: tony.luck@intel.com Cc: wangnan0@huawei.com Cc: david.vrabel@citrix.com Reviewed-by: yinghai@kernel.org Link: http://lkml.kernel.org/r/1436703522-29552-1-git-send-email-mhuang@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlersThomas Huth
commit 1c2cb594441d02815d304cccec9742ff5c707495 upstream. The EPOW interrupt handler uses rtas_get_sensor(), which in turn uses rtas_busy_delay() to wait for RTAS becoming ready in case it is necessary. But rtas_busy_delay() is annotated with might_sleep() and thus may not be used by interrupts handlers like the EPOW handler! This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6 Call Trace: [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable) [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180 [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0 [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0 [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450 [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300 [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0 [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260 [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80 [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200 [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24 [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110 [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180 Fix this issue by introducing a new rtas_get_sensor_fast() function that does not use rtas_busy_delay() - and thus can only be used for sensors that do not cause a BUSY condition - known as "fast" sensors. The EPOW sensor is defined to be "fast" in sPAPR - mpe. Fixes: 587f83e8dd50 ("powerpc/pseries: Use rtas_get_sensor in RAS code") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hashMichael Ellerman
commit 74b5037baa2011a2799e2c43adde7d171b072f9e upstream. The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K PAGE_SIZE. However when built with a 4K PAGE_SIZE there is an additional config option which can be enabled, PPC_HAS_HASH_64K, which means the kernel also knows how to hash a 64K page even though the base PAGE_SIZE is 4K. This is used in one obscure configuration, to support 64K pages for SPU local store on the Cell processor when the rest of the kernel is using 4K pages. In this configuration, pte_pagesize_index() is defined to just pass through its arguments to get_slice_psize(). However pte_pagesize_index() is called for both user and kernel addresses, whereas get_slice_psize() only knows how to handle user addresses. This has been broken forever, however until recently it happened to work. That was because in get_slice_psize() the large kernel address would cause the right shift of the slice mask to return zero. However in commit 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB"), the get_slice_psize() code was changed so that instead of a right shift we do an array lookup based on the address. When passed a kernel address this means we index way off the end of the slice array and return random junk. That is only fatal if we happen to hit something non-zero, but when we do return a non-zero value we confuse the MMU code and eventually cause a check stop. This fix is ugly, but simple. When we're called for a kernel address we return 4K, which is always correct in this configuration, otherwise we use the slice mask. Fixes: 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB") Reported-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01arm64: head.S: initialise mdcr_el2 in el2_setupWill Deacon
commit d10bcd473301888f957ec4b6b12aa3621be78d59 upstream. When entering the kernel at EL2, we fail to initialise the MDCR_EL2 register which controls debug access and PMU capabilities at EL1. This patch ensures that the register is initialised so that all traps are disabled and all the PMU counters are available to the host. When a guest is scheduled, KVM takes care to configure trapping appropriately. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01arm64: compat: fix vfp save/restore across signal handlers in big-endianWill Deacon
commit bdec97a855ef1e239f130f7a11584721c9a1bf04 upstream. When saving/restoring the VFP registers from a compat (AArch32) signal frame, we rely on the compat registers forming a prefix of the native register file and therefore make use of copy_{to,from}_user to transfer between the native fpsimd_state and the compat_vfp_sigframe. Unfortunately, this doesn't work so well in a big-endian environment. Our fpsimd save/restore code operates directly on 128-bit quantities (Q registers) whereas the compat_vfp_sigframe represents the registers as an array of 64-bit (D) registers. The architecture packs the compat D registers into the Q registers, with the least significant bytes holding the lower register. Consequently, we need to swap the 64-bit halves when converting between these two representations on a big-endian machine. This patch replaces the __copy_{to,from}_user invocations in our compat VFP signal handling code with explicit __put_user loops that operate on 64-bit values and swap them accordingly. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-01arm64: kconfig: Move LIST_POISON to a safe valueJeff Vander Stoep
commit bf0c4e04732479f650ff59d1ee82de761c0071f0 upstream. Move the poison pointer offset to 0xdead000000000000, a recognized value that is not mappable by user-space exploits. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Thierry Strudel <tstrudel@google.com> Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21xtensa: don't use echo -e needlesslyMax Filippov
commit 123f15e669d5a5a2e2f260ba4a5fc2efd93df20e upstream. -e is not needed to output strings without escape sequences. This breaks big endian FSF build when the shell is dash, because its builtin echo doesn't understand '-e' switch and outputs it in the echoed string. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Chris Zankel <chris@zankel.net> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21xtensa: fix kernel register spillingMax Filippov
commit 77d6273e79e3a86552fcf10cdd31a69b46ed2ce6 upstream. call12 can't be safely used as the first call in the inline function, because the compiler does not extend the stack frame of the bounding function accordingly, which may result in corruption of local variables. If a call needs to be done, do call8 first followed by call12. For pure assembly code in _switch_to increase stack frame size of the bounding function. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21xtensa: fix threadptr reload on return to userspaceMax Filippov
commit 4229fb12a03e5da5882b420b0aa4a02e77447b86 upstream. Userspace return code may skip restoring THREADPTR register if there are no registers that need to be zeroed. This leads to spurious failures in libc NPTL tests. Always restore THREADPTR on return to userspace. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-21crypto: ghash-clmulni: specify context size for ghash async algorithmAndrey Ryabinin
commit 71c6da846be478a61556717ef1ee1cea91f5d6a8 upstream. Currently context size (cra_ctxsize) doesn't specified for ghash_async_alg. Which means it's zero. Thus crypto_create_tfm() doesn't allocate needed space for ghash_async_ctx, so any read/write to ctx (e.g. in ghash_async_init_tfm()) is not valid. Signed-off-by: Andrey Ryabinin <aryabinin@odin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-13arm64/mm: Remove hack in mmap randomize layoutYann Droneaud
commit d6c763afab142a85e4770b4bc2a5f40f256d5c5d upstream. Since commit 8a0a9bd4db63 ('random: make get_random_int() more random'), get_random_int() returns a random value for each call, so comment and hack introduced in mmap_rnd() as part of commit 1d18c47c735e ('arm64: MMU fault handling and page table management') are incorrects. Commit 1d18c47c735e seems to use the same hack introduced by commit a5adc91a4b44 ('powerpc: Ensure random space between stack and mmaps'), latter copied in commit 5a0efea09f42 ('sparc64: Sharpen address space randomization calculations.'). But both architectures were cleaned up as part of commit fa8cbaaf5a68 ('powerpc+sparc64/mm: Remove hack in mmap randomize layout') as hack is no more needed since commit 8a0a9bd4db63. So the present patch removes the comment and the hack around get_random_int() on AArch64's mmap_rnd(). Cc: David S. Miller <davem@davemloft.net> Cc: Anton Blanchard <anton@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Matthias Brugger <mbrugger@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16kvm: x86: fix kvm_apic_has_events to check for NULL pointerPaolo Bonzini
commit ce40cd3fc7fa40a6119e5fe6c0f2bc0eb4541009 upstream. Malicious (or egregiously buggy) userspace can trigger it, but it should never happen in normal operation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wang Kai <morgan.wang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16signal: fix information leak in copy_siginfo_from_user32Amanieu d'Antras
commit 3c00cb5e68dc719f2fc73a33b1b230aadfcb1309 upstream. This function can leak kernel stack data when the user siginfo_t has a positive si_code value. The top 16 bits of si_code descibe which fields in the siginfo_t union are active, but they are treated inconsistently between copy_siginfo_from_user32, copy_siginfo_to_user32 and copy_siginfo_to_user. copy_siginfo_from_user32 is called from rt_sigqueueinfo and rt_tgsigqueueinfo in which the user has full control overthe top 16 bits of si_code. This fixes the following information leaks: x86: 8 bytes leaked when sending a signal from a 32-bit process to itself. This leak grows to 16 bytes if the process uses x32. (si_code = __SI_CHLD) x86: 100 bytes leaked when sending a signal from a 32-bit process to a 64-bit process. (si_code = -1) sparc: 4 bytes leaked when sending a signal from a 32-bit process to a 64-bit process. (si_code = any) parsic and s390 have similar bugs, but they are not vulnerable because rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code to a different process. These bugs are also fixed for consistency. Signed-off-by: Amanieu d'Antras <amanieu@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16signal: fix information leak in copy_siginfo_to_userAmanieu d'Antras
commit 26135022f85105ad725cda103fa069e29e83bd16 upstream. This function may copy the si_addr_lsb, si_lower and si_upper fields to user mode when they haven't been initialized, which can leak kernel stack data to user mode. Just checking the value of si_code is insufficient because the same si_code value is shared between multiple signals. This is solved by checking the value of si_signo in addition to si_code. Signed-off-by: Amanieu d'Antras <amanieu@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16ARM: 7819/1: fiq: Cast the first argument of flush_icache_range()Fabio Estevam
commit 7cb3be0a27805c625ff7cce20c53c926d9483243 upstream. Commit 2ba85e7af4 (ARM: Fix FIQ code on VIVT CPUs) causes the following build warning: arch/arm/kernel/fiq.c:92:3: warning: passing argument 1 of 'cpu_cache.coherent_kern_range' makes integer from pointer without a cast [enabled by default] Cast it as '(unsigned long)base' to avoid the warning. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Martin Kaiser <lists@kaiser.cx> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16ARM: Fix FIQ code on VIVT CPUsRussell King
commit 2ba85e7af4c639d933c9a87a6d7363f2983d5ada upstream. Aaro Koskinen reports the following oops: Installing fiq handler from c001b110, length 0x164 Unable to handle kernel paging request at virtual address ffff1224 pgd = c0004000 [ffff1224] *pgd=00000000, *pte=11fff0cb, *ppte=11fff00a ... [<c0013154>] (set_fiq_handler+0x0/0x6c) from [<c0365d38>] (ams_delta_init_fiq+0xa8/0x160) r6:00000164 r5:c001b110 r4:00000000 r3:fefecb4c [<c0365c90>] (ams_delta_init_fiq+0x0/0x160) from [<c0365b14>] (ams_delta_init+0xd4/0x114) r6:00000000 r5:fffece10 r4:c037a9e0 [<c0365a40>] (ams_delta_init+0x0/0x114) from [<c03613b4>] (customize_machine+0x24/0x30) This is because the vectors page is now write-protected, and to change code in there we must write to its original alias. Make that change, and adjust the cache flushing such that the code will become visible to the instruction stream on VIVT CPUs. Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Martin Kaiser <lists@kaiser.cx> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16ARM: Fix !kuser helpers caseRussell King
commit 1b16c4bcf80e319b2226a886b72b8466179c8e3a upstream. Fix yet another build failure caused by a weird set of configuration settings: LD init/built-in.o arch/arm/kernel/built-in.o: In function `__dabt_usr': /home/tom3q/kernel/arch/arm/kernel/entry-armv.S:377: undefined reference to `kuser_cmpxchg64_fixup' arch/arm/kernel/built-in.o: In function `__irq_usr': /home/tom3q/kernel/arch/arm/kernel/entry-armv.S:387: undefined reference to `kuser_cmpxchg64_fixup' caused by: CONFIG_KUSER_HELPERS=n CONFIG_CPU_32v6K=n CONFIG_NEEDS_SYSCALL_FOR_CMPXCHG=n Reported-by: Tomasz Figa <tomasz.figa@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Martin Kaiser <lists@kaiser.cx> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16x86/xen: Probe target addresses in set_aliased_prot() before the hypercallAndy Lutomirski
commit aa1acff356bbedfd03b544051f5b371746735d89 upstream. The update_va_mapping hypercall can fail if the VA isn't present in the guest's page tables. Under certain loads, this can result in an OOPS when the target address is in unpopulated vmap space. While we're at it, add comments to help explain what's going on. This isn't a great long-term fix. This code should probably be changed to use something like set_memory_ro. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Vrabel <dvrabel@cantab.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org <security@kernel.org> Cc: xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16sparc64: Fix userspace FPU register corruptions.David S. Miller
[ Upstream commit 44922150d87cef616fd183220d43d8fde4d41390 ] If we have a series of events from userpsace, with %fprs=FPRS_FEF, like follows: ETRAP ETRAP VIS_ENTRY(fprs=0x4) VIS_EXIT RTRAP (kernel FPU restore with fpu_saved=0x4) RTRAP We will not restore the user registers that were clobbered by the FPU using kernel code in the inner-most trap. Traps allocate FPU save slots in the thread struct, and FPU using sequences save the "dirty" FPU registers only. This works at the initial trap level because all of the registers get recorded into the top-level FPU save area, and we'll return to userspace with the FPU disabled so that any FPU use by the user will take an FPU disabled trap wherein we'll load the registers back up properly. But this is not how trap returns from kernel to kernel operate. The simplest fix for this bug is to always save all FPU register state for anything other than the top-most FPU save area. Getting rid of the optimized inner-slot FPU saving code ends up making VISEntryHalf degenerate into plain VISEntry. Longer term we need to do something smarter to reinstate the partial save optimizations. Perhaps the fundament error is having trap entry and exit allocate FPU save slots and restore register state. Instead, the VISEntry et al. calls should be doing that work. This bug is about two decades old. Reported-by: James Y Knight <jyknight@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16sparc64: Fix FPU register corruption with AES crypto offload.David S. Miller
[ Upstream commit f4da3628dc7c32a59d1fb7116bb042e6f436d611 ] The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the key material is preloaded into the FPU registers, and then we loop over and over doing the crypt operation, reusing those pre-cooked key registers. There are intervening blkcipher*() calls between the crypt operation calls. And those might perform memcpy() and thus also try to use the FPU. The sparc64 kernel FPU usage mechanism is designed to allow such recursive uses, but with a catch. There has to be a trap between the two FPU using threads of control. The mechanism works by, when the FPU is already in use by the kernel, allocating a slot for FPU saving at trap time. Then if, within the trap handler, we try to use the FPU registers, the pre-trap FPU register state is saved into the slot. Then at trap return time we notice this and restore the pre-trap FPU state. Over the long term there are various more involved ways we can make this work, but for a quick fix let's take advantage of the fact that the situation where this happens is very limited. All sparc64 chips that support the crypto instructiosn also are using the Niagara4 memcpy routine, and that routine only uses the FPU for large copies where we can't get the source aligned properly to a multiple of 8 bytes. We look to see if the FPU is already in use in this context, and if so we use the non-large copy path which only uses integer registers. Furthermore, we also limit this special logic to when we are doing kernel copy, rather than a user copy. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16perf/x86/amd: Rework AMD PMU init codePeter Zijlstra
commit 1b45adcd9a503428e6de6b39bc6892d86c9c1d41 upstream. Josh reported that his QEMU is a bad hardware emulator and trips a WARN in the AMD PMU init code. He requested the WARN be turned into a pr_err() or similar. While there, rework the code a little. Reported-by: Josh Boyer <jwboyer@redhat.com> Acked-by: Robert Richter <rric@kernel.org> Acked-by: Jacob Shin <jacob.shin@amd.com> Cc: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130521110537.GG26912@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16MIPS: Make set_pte() SMP safe.David Daney
commit 46011e6ea39235e4aca656673c500eac81a07a17 upstream. On MIPS the GLOBAL bit of the PTE must have the same value in any aligned pair of PTEs. These pairs of PTEs are referred to as "buddies". In a SMP system is is possible for two CPUs to be calling set_pte() on adjacent PTEs at the same time. There is a race between setting the PTE and a different CPU setting the GLOBAL bit in its buddy PTE. This race can be observed when multiple CPUs are executing vmap()/vfree() at the same time. Make setting the buddy PTE's GLOBAL bit an atomic operation to close the race condition. The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not* handled. Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10835/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16MIPS: Fix sched_getaffinity with MT FPAFF enabledFelix Fietkau
commit 1d62d737555e1378eb62a8bba26644f7d97139d2 upstream. p->thread.user_cpus_allowed is zero-initialized and is only filled on the first sched_setaffinity call. To avoid adding overhead in the task initialization codepath, simply OR the returned mask in sched_getaffinity with p->cpus_allowed. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10740/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-16ARM: realview: fix sparsemem buildArnd Bergmann
commit dd94d3558947756b102b1487911acd925224a38c upstream. Commit b713aa0b15 "ARM: fix asm/memory.h build error" broke some configurations on mach-realview with sparsemem enabled, which is missing a definition of PHYS_OFFSET: arch/arm/include/asm/memory.h:268:42: error: 'PHYS_OFFSET' undeclared (first use in this function) #define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT)) arch/arm/include/asm/dma-mapping.h:104:9: note: in expansion of macro 'PHYS_PFN_OFFSET' return PHYS_PFN_OFFSET + dma_to_pfn(dev, *dev->dma_mask); An easy workaround is for realview to define PHYS_OFFSET itself, in the same way we define it for platforms that don't have a private __virt_to_phys function. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-10efi: fix 32bit kernel boot failed problem using efiFupan Li
Commit 35d5134b7d5a ("x86/efi: Correct EFI boot stub use of code32_start") imported a bug, which will cause 32bit kernel boot failed using efi method. It should use the label's address instead of the value stored in the label to caculate the address of code32_start. Signed-off-by: Fupan Li <fupan.li@windriver.com> Reviewed-by: Matt Fleming <matt.fleming@intel.com>
2015-08-10tile: use free_bootmem_late() for initrdChris Metcalf
commit 3f81d2447b37ac697b3c600039f2c6b628c06e21 upstream. We were previously using free_bootmem() and just getting lucky that nothing too bad happened. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-10ARC: make sure instruction_pointer() returns unsigned valueAlexey Brodkin
commit f51e2f1911122879eefefa4c592dea8bf794b39c upstream. Currently instruction_pointer() returns pt_regs->ret and so return value is of type "long", which implicitly stands for "signed long". While that's perfectly fine when dealing with 32-bit values if return value of instruction_pointer() gets assigned to 64-bit variable sign extension may happen. And at least in one real use-case it happens already. In perf_prepare_sample() return value of perf_instruction_pointer() (which is an alias to instruction_pointer() in case of ARC) is assigned to (struct perf_sample_data)->ip (which type is "u64"). And what we see if instuction pointer points to user-space application that in case of ARC lays below 0x8000_0000 "ip" gets set properly with leading 32 zeros. But if instruction pointer points to kernel address space that starts from 0x8000_0000 then "ip" is set with 32 leadig "f"-s. I.e. id instruction_pointer() returns 0x8100_0000, "ip" will be assigned with 0xffff_ffff__8100_0000. Which is obviously wrong. In particular that issuse broke output of perf, because perf was unable to associate addresses like 0xffff_ffff__8100_0000 with anything from /proc/kallsyms. That's what we used to see: ----------->8---------- 6.27% ls [unknown] [k] 0xffffffff8046c5cc 2.96% ls libuClibc-0.9.34-git.so [.] memcpy 2.25% ls libuClibc-0.9.34-git.so [.] memset 1.66% ls [unknown] [k] 0xffffffff80666536 1.54% ls libuClibc-0.9.34-git.so [.] 0x000224d6 1.18% ls libuClibc-0.9.34-git.so [.] 0x00022472 ----------->8---------- With that change perf output looks much better now: ----------->8---------- 8.21% ls [kernel.kallsyms] [k] memset 3.52% ls libuClibc-0.9.34-git.so [.] memcpy 2.11% ls libuClibc-0.9.34-git.so [.] malloc 1.88% ls libuClibc-0.9.34-git.so [.] memset 1.64% ls [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore 1.41% ls [kernel.kallsyms] [k] __d_lookup_rcu ----------->8---------- Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: arc-linux-dev@synopsys.com Cc: linux-kernel@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-10s390/sclp: clear upper register halves in _sclp_print_earlyMartin Schwidefsky
commit f9c87a6f46d508eae0d9ae640be98d50f237f827 upstream. If the kernel is compiled with gcc 5.1 and the XZ compression option the decompress_kernel function calls _sclp_print_early in 64-bit mode while the content of the upper register half of %r6 is non-zero. This causes a specification exception on the servc instruction in _sclp_servc. The _sclp_print_early function saves and restores the upper registers halves but it fails to clear them for the 31-bit code of the mini sclp driver. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03MIPS: KVM: Do not sign extend on unsigned MMIO loadNicholas Mc Guire
commit ed9244e6c534612d2b5ae47feab2f55a0d4b4ced upstream. Fix possible unintended sign extension in unsigned MMIO loads by casting to uint16_t in the case of mmio_needed != 2. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Reviewed-by: James Hogan <james.hogan@imgtec.com> Tested-by: James Hogan <james.hogan@imgtec.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/9985/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03arm64: vdso: work-around broken ELF toolchains in MakefileWill Deacon
commit 6f1a6ae87c0c60d7c462ef8fd071f291aa7a9abb upstream. When building the kernel with a bare-metal (ELF) toolchain, the -shared option may not be passed down to collect2, resulting in silent corruption of the vDSO image (in particular, the DYNAMIC section is omitted). The effect of this corruption is that the dynamic linker fails to find the vDSO symbols and libc is instead used for the syscalls that we intended to optimise (e.g. gettimeofday). Functionally, there is no issue as the sigreturn trampoline is still intact and located by the kernel. This patch fixes the problem by explicitly passing -shared to the linker when building the vDSO. Reported-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Reported-by: James Greenlaigh <james.greenhalgh@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03arm64: mm: Fix freeing of the wrong memmap entries with !SPARSEMEM_VMEMMAPDave P Martin
commit b9bcc919931611498e856eae9bf66337330d04cc upstream. The memmap freeing code in free_unused_memmap() computes the end of each memblock by adding the memblock size onto the base. However, if SPARSEMEM is enabled then the value (start) used for the base may already have been rounded downwards to work out which memmap entries to free after the previous memblock. This may cause memmap entries that are in use to get freed. In general, you're not likely to hit this problem unless there are at least 2 memblocks and one of them is not aligned to a sparsemem section boundary. Note that carve-outs can increase the number of memblocks by splitting the regions listed in the device tree. This problem doesn't occur with SPARSEMEM_VMEMMAP, because the vmemmap code deals with freeing the unused regions of the memmap instead of requiring the arch code to do it. This patch gets the memblock base out of the memblock directly when computing the block end address to ensure the correct value is used. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03arm64: Do not attempt to use init_mm in reset_context()Catalin Marinas
commit 565630d503ef24e44c252bed55571b3a0d68455f upstream. After secondary CPU boot or hotplug, the active_mm of the idle thread is &init_mm. The init_mm.pgd (swapper_pg_dir) is only meant for TTBR1_EL1 and must not be set in TTBR0_EL1. Since when active_mm == &init_mm the TTBR0_EL1 is already set to the reserved value, there is no need to perform any context reset. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-03ARC: add compiler barrier to LLSC based cmpxchgVineet Gupta
commit d57f727264f1425a94689bafc7e99e502cb135b5 upstream. When auditing cmpxchg call sites, Chuck noted that gcc was optimizing away some of the desired LDs. | do { | new = old = *ipi_data_ptr; | new |= 1U << msg; | } while (cmpxchg(ipi_data_ptr, old, new) != old); was generating to below | 8015cef8: ld r2,[r4,0] <-- First LD | 8015cefc: bset r1,r2,r1 | | 8015cf00: llock r3,[r4] <-- atomic op | 8015cf04: brne r3,r2,8015cf10 | 8015cf08: scond r1,[r4] | 8015cf0c: bnz 8015cf00 | | 8015cf10: brne r3,r2,8015cf00 <-- Branch doesn't go to orig LD Although this was fixed by adding a ACCESS_ONCE in this call site, it seems safer (for now at least) to add compiler barrier to LLSC based cmpxchg Reported-by: Chuck Jordan <cjordan@synopsys.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-10KVM: x86: make vapics_in_nmi_mode atomicRadim Krčmář
commit 42720138b06301cc8a7ee8a495a6d021c4b6a9bc upstream. Writes were a bit racy, but hard to turn into a bug at the same time. (Particularly because modern Linux doesn't use this feature anymore.) Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> [Actually the next patch makes it much, much easier to trigger the race so I'm including this one for stable@ as well. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-07-10MIPS: Fix KVM guest fixmap addressJames Hogan
commit 8e748c8d09a9314eedb5c6367d9acfaacddcdc88 upstream. KVM guest kernels for trap & emulate run in user mode, with a modified set of kernel memory segments. However the fixmap address is still in the normal KSeg3 region at 0xfffe0000 regardless, causing problems when cache alias handling makes use of them when handling copy on write. Therefore define FIXADDR_TOP as 0x7ffe0000 in the guest kernel mapped region when CONFIG_KVM_GUEST is defined. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/9887/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>