summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2013-09-26MIPS: ath79: Fix ar933x watchdog clockFelix Fietkau
commit a1191927ace7e6f827132aa9e062779eb3f11fa5 upstream. The watchdog device on the AR933x is connected to the AHB clock, however the current code uses the reference clock. Due to the wrong rate, the watchdog driver can't calculate correct register values for a given timeout value and the watchdog unexpectedly restarts the system. The code uses the wrong value since the initial commit 04225e1d227c8e68d685936ecf42ac175fec0e54 (MIPS: ath79: add AR933X specific clock init) The patch fixes the code to use the correct clock rate to avoid the problem. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/5777/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26ARM: PCI: versatile: Fix SMAP register offsetsPeter Maydell
commit 99f2b130370b904ca5300079243fdbcafa2c708b upstream. The SMAP register offsets in the versatile PCI controller code were all off by four. (This didn't have any observable bad effects because on this board PHYS_OFFSET is zero, and (a) writing zero to the flags register at offset 0x10 has no effect and (b) the reset value of the SMAP register is zero anyway, so failing to write SMAP2 didn't matter.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26powerpc: Handle unaligned ldbrx/stdbrxAnton Blanchard
commit 230aef7a6a23b6166bd4003bfff5af23c9bd381f upstream. Normally when we haven't implemented an alignment handler for a load or store instruction the process will be terminated. The alignment handler uses the DSISR (or a pseudo one) to locate the right handler. Unfortunately ldbrx and stdbrx overlap lfs and stfs so we incorrectly think ldbrx is an lfs and stdbrx is an stfs. This bug is particularly nasty - instead of terminating the process we apply an incorrect fixup and continue on. With more and more overlapping instructions we should stop creating a pseudo DSISR and index using the instruction directly, but for now add a special case to catch ldbrx/stdbrx. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14Revert "KVM: X86 emulator: fix source operand decoding for 8bit mov[zs]x ↵Greg Kroah-Hartman
instructions" This reverts commit 5b5b30580218eae22609989546bac6e44d0eda6e, which was commit 660696d1d16a71e15549ce1bf74953be1592bcd3 upstream. Paul Gortmaker <paul.gortmaker@windriver.com> writes: [this patch] introduces the following: arch/x86/kvm/emulate.c: In function ‘decode_operand’: arch/x86/kvm/emulate.c:3974:4: warning: passing argument 1 of ‘decode_register’ makes integer from pointer +without a cast [enabled by default] arch/x86/kvm/emulate.c:789:14: note: expected ‘u8’ but argument is of type ‘struct x86_emulate_ctxt *’ arch/x86/kvm/emulate.c:3974:4: warning: passing argument 2 of ‘decode_register’ makes pointer from integer +without a cast [enabled by default] arch/x86/kvm/emulate.c:789:14: note: expected ‘long unsigned int *’ but argument is of type ‘u8’ Based on the severity of the warnings above, I'm reasonably sure there will be some kind of runtime regressions due to this, but I stopped to investigate the warnings as soon as I saw them, before any run time testing. It happens because mainline v3.7-rc1~113^2~40 (dd856efafe60) does this: -static void *decode_register(u8 modrm_reg, unsigned long *regs, +static void *decode_register(struct x86_emulate_ctxt *ctxt, u8 modrm_reg, Since 660696d1d16a71e1 was only applied to stable 3.4, 3.8, and 3.9 -- and the prerequisite above is in 3.7+, the issue should be limited to 3.4.44+ Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14m32r: make memset() global for CONFIG_KERNEL_BZIP2=yGeert Uytterhoeven
commit 9a75c6e5240f7edc5955e8da5b94bde6f96070b3 upstream. Fix the m32r compile error: arch/m32r/boot/compressed/misc.c:31:14: error: static declaration of 'memset' follows non-static declaration make[5]: *** [arch/m32r/boot/compressed/misc.o] Error 1 make[4]: *** [arch/m32r/boot/compressed/vmlinux] Error 2 by removing the static keyword. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14m32r: add memcpy() for CONFIG_KERNEL_GZIP=yGeert Uytterhoeven
commit a8abbca6617e1caa2344d2d38d0a35f3e5928b79 upstream. Fix the m32r link error: LD arch/m32r/boot/compressed/vmlinux arch/m32r/boot/compressed/misc.o: In function `zlib_updatewindow': misc.c:(.text+0x190): undefined reference to `memcpy' misc.c:(.text+0x190): relocation truncated to fit: R_M32R_26_PLTREL against undefined symbol `memcpy' make[5]: *** [arch/m32r/boot/compressed/vmlinux] Error 1 by adding our own implementation of memcpy(). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14m32r: consistently use "suffix-$(...)"Geert Uytterhoeven
commit df12aef6a19bb2d69859a94936bda0e6ccaf3327 upstream. Commit a556bec9955c ("m32r: fix arch/m32r/boot/compressed/Makefile") changed "$(suffix_y)" to "$(suffix-y)", but didn't update any location where "suffix_y" is set, causing: make[5]: *** No rule to make target `arch/m32r/boot/compressed/vmlinux.bin.', needed by `arch/m32r/boot/compressed/piggy.o'. Stop. make[4]: *** [arch/m32r/boot/compressed/vmlinux] Error 2 make[3]: *** [zImage] Error 2 Correct the other locations to fix this. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-07powerpc: Work around gcc miscompilation of __pa() on 64-bitPaul Mackerras
commit bdbc29c19b2633b1d9c52638fb732bcde7a2031a upstream. On 64-bit, __pa(&static_var) gets miscompiled by recent versions of gcc as something like: addis 3,2,.LANCHOR1+4611686018427387904@toc@ha addi 3,3,.LANCHOR1+4611686018427387904@toc@l This ends up effectively ignoring the offset, since its bottom 32 bits are zero, and means that the result of __pa() still has 0xC in the top nibble. This happens with gcc 4.8.1, at least. To work around this, for 64-bit we make __pa() use an AND operator, and for symmetry, we make __va() use an OR operator. Using an AND operator rather than a subtraction ends up with slightly shorter code since it can be done with a single clrldi instruction, whereas it takes three instructions to form the constant (-PAGE_OFFSET) and add it on. (Note that MEMORY_START is always 0 on 64-bit.) Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-29x86/xen: do not identity map UNUSABLE regions in the machine E820David Vrabel
commit 3bc38cbceb85881a8eb789ee1aa56678038b1909 upstream. If there are UNUSABLE regions in the machine memory map, dom0 will attempt to map them 1:1 which is not permitted by Xen and the kernel will crash. There isn't anything interesting in the UNUSABLE region that the dom0 kernel needs access to so we can avoid making the 1:1 mapping and treat it as RAM. We only do this for dom0, as that is where tboot case shows up. A PV domU could have an UNUSABLE region in its pseudo-physical map and would need to be handled in another patch. This fixes a boot failure on hosts with tboot. tboot marks a region in the e820 map as unusable and the dom0 kernel would attempt to map this region and Xen does not permit unusable regions to be mapped by guests. (XEN) 0000000000000000 - 0000000000060000 (usable) (XEN) 0000000000060000 - 0000000000068000 (reserved) (XEN) 0000000000068000 - 000000000009e000 (usable) (XEN) 0000000000100000 - 0000000000800000 (usable) (XEN) 0000000000800000 - 0000000000972000 (unusable) tboot marked this region as unusable. (XEN) 0000000000972000 - 00000000cf200000 (usable) (XEN) 00000000cf200000 - 00000000cf38f000 (reserved) (XEN) 00000000cf38f000 - 00000000cf3ce000 (ACPI data) (XEN) 00000000cf3ce000 - 00000000d0000000 (reserved) (XEN) 00000000e0000000 - 00000000f0000000 (reserved) (XEN) 00000000fe000000 - 0000000100000000 (reserved) (XEN) 0000000100000000 - 0000000630000000 (usable) Signed-off-by: David Vrabel <david.vrabel@citrix.com> [v1: Altered the patch and description with domU's with UNUSABLE regions] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20m68k/atari: ARAnyM - Fix NatFeat module supportGeert Uytterhoeven
commit e8184e10f89736a23ea6eea8e24cd524c5c513d2 upstream. As pointed out by Andreas Schwab, pointers passed to ARAnyM NatFeat calls should be physical addresses, not virtual addresses. Fortunately on Atari, physical and virtual kernel addresses are the same, as long as normal kernel memory is concerned, so this usually worked fine without conversion. But for modules, pointers to literal strings are located in vmalloc()ed memory. Depending on the version of ARAnyM, this causes the nf_get_id() call to just fail, or worse, crash ARAnyM itself with e.g. Gotcha! Illegal memory access. Atari PC = $968c This is a big issue for distro kernels, who want to have all drivers as loadable modules in an initrd. Add a wrapper for nf_get_id() that copies the literal to the stack to work around this issue. Reported-by: Thorsten Glaser <tg@debian.org> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20m68k: Truncate base in do_div()Andreas Schwab
commit ea077b1b96e073eac5c3c5590529e964767fc5f7 upstream. Explicitly truncate the second operand of do_div() to 32 bits to guard against bogus code calling it with a 64-bit divisor. [Thorsten] After upgrading from 3.2 to 3.10, mounting a btrfs volume fails with: btrfs: setting nodatacow, compression disabled btrfs: enabling auto recovery btrfs: disk space caching is enabled *** ZERO DIVIDE *** FORMAT=2 Current process id is 722 BAD KERNEL TRAP: 00000000 Modules linked in: evdev mac_hid ext4 crc16 jbd2 mbcache btrfs xor lzo_compress zlib_deflate raid6_pq crc32c libcrc32c PC: [<319535b2>] __btrfs_map_block+0x11c/0x119a [btrfs] SR: 2000 SP: 30c1fab4 a2: 30f0faf0 d0: 00000000 d1: 00001000 d2: 00000000 d3: 00000000 d4: 00010000 d5: 00000000 a0: 3085c72c a1: 3085c72c Process mount (pid: 722, task=30f0faf0) Frame format=2 instr addr=319535ae Stack from 30c1faec: 00000000 00000020 00000000 00001000 00000000 01401000 30253928 300ffc00 00a843ac 3026f640 00000000 00010000 0009e250 00d106c0 00011220 00000000 00001000 301c6830 0009e32a 000000ff 00000009 3085c72c 00000000 00000000 30c1fd14 00000000 00000020 00000000 30c1fd14 0009e26c 00000020 00000003 00000000 0009dd8a 300b0b6c 30253928 00a843ac 00001000 00000000 00000000 0000a008 3194e76a 30253928 00a843ac 00001000 00000000 00000000 00000002 Call Trace: [<00001000>] kernel_pg_dir+0x0/0x1000 [...] Code: 222e ff74 2a2e ff5c 2c2e ff60 4c45 1402 <2d40> ff64 2d41 ff68 2205 4c2e 1800 ff68 4c04 0800 2041 d1c0 2206 4c2e 1400 ff68 [Geert] As diagnosed by Andreas, fs/btrfs/volumes.c:__btrfs_map_block() calls do_div(stripe_nr, stripe_len); with stripe_len u64, while do_div() assumes the divisor is a 32-bit number. Due to the lack of truncation in the m68k-specific implementation of do_div(), the division is performed using the upper 32-bit word of stripe_len, which is zero. This was introduced by commit 53b381b3abeb86f12787a6c40fee9b2f71edc23b ("Btrfs: RAID5 and RAID6"), which changed the divisor from map->stripe_len (struct map_lookup.stripe_len is int) to a 64-bit temporary. Reported-by: Thorsten Glaser <tg@debian.org> Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Tested-by: Thorsten Glaser <tg@debian.org> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20ARM: 7809/1: perf: fix event validation for software group leadersWill Deacon
commit c95eb3184ea1a3a2551df57190c81da695e2144b upstream. It is possible to construct an event group with a software event as a group leader and then subsequently add a hardware event to the group. This results in the event group being validated by adding all members of the group to a fake PMU and attempting to allocate each event on their respective PMU. Unfortunately, for software events wthout a corresponding arm_pmu, this results in a kernel crash attempting to dereference the ->get_event_idx function pointer. This patch fixes the problem by checking explicitly for software events and ignoring those in event validation (since they can always be scheduled). We will probably want to revisit this for 3.12, since the validation checks don't appear to work correctly when dealing with multiple hardware PMUs anyway. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20xtensa: replace xtensa-specific _f{data,text} by _s{data,text}Geert Uytterhoeven
commit 5e7b6ed8e9bf3c8e3bb579fd0aec64f6526f8c81 upstream. commit a2d063ac216c161 ("extable, core_kernel_data(): Make sure all archs define _sdata") missed xtensa. Xtensa does have a start of data marker, but calls it _fdata, causing kernel/built-in.o:(.text+0x964): undefined reference to `_sdata' _stext was already defined, but it was duplicated by _fdata. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20xtensa: fix linker script transformation for .text.unlikelyMax Filippov
commit f6a03a12ecdbe0dd80a55f6df3b7206c5a403a49 upstream. Now that binutils generate *.unlikely sections which don't follow documented (info as) literal section naming rules, section name transformation script doesn't work well resulting in the following errors at vmlinux link time: main.c:(.text.unlikely+0x3): dangerous relocation: l32r: literal placed after use: .literal.unlikely Fix section name transformation script by adding specific rule for .text.unlikely sections. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Chris Zankel <chris@zankel.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20MIPS: Rewrite pfn_valid to work in modules, too.Ralf Baechle
Upstream commit 8b9232141bf40788cce31f893c13f344ec31ee66. This fixes: MODPOST 393 modules ERROR: "min_low_pfn" [arch/mips/kvm/kvm.ko] undefined! make[3]: *** [__modpost] Error 1 It would have been possible to just export min_low_pfn but in the end pfn_valid should return 1 for any pfn argument for which a struct page exists so using min_low_pfn was wrong anyway. [Backport to 3.4 kernel. Applies cleanly on top of current 3.4 patch queue, and fixes "make ARCH=mips allmodconfig; make ARCH=mips" build problem. - Guenter] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20sparc32: Add ucmpdi2.o to obj-y instead of lib-y.David S. Miller
commit 74c7b28953d4eaa6a479c187aeafcfc0280da5e8 upstream. Otherwise if no references exist in the static kernel image, we won't export the symbol properly to modules. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20sparc32: add ucmpdi2Sam Ravnborg
commit de36e66d5fa52bc6e2dacd95c701a1762b5308a7 upstream. Based on copy from microblaze add ucmpdi2 implementation. This fixes build of niu driver which failed with: drivers/built-in.o: In function `niu_get_nfc': niu.c:(.text+0x91494): undefined reference to `__ucmpdi2' This driver will never be used on a sparc32 system, but patch added to fix build breakage with all*config builds. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20alpha: makefile: don't enforce small data model for kernel buildsWill Deacon
commit cd8d2331756751b6aeb855a3c9cb0a92fbd9c725 upstream. Due to all of the goodness being packed into today's kernels, the resulting image isn't as slim as it once was. In light of this, don't pass -msmall-data to gcc, which otherwise results in link failures due to impossible relocations when compiling anything but the most trivial configurations. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Tested-by: Thorsten Kranzkowski <dl8bcu@dl8bcu.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20powerpc/numa: Avoid stupid uninitialized warning from gccBenjamin Herrenschmidt
commit aa709f3bc92c6daaf177cd7e3446da2ef64426c6 upstream. Newer gcc are being a bit blind here (it's pretty obvious we don't reach the code path using the array if we haven't initialized the pointer) but none of that is performance critical so let's just silence it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20frv: Use core allocator for task_structThomas Gleixner
commit c6ae063aaf3786b9db7f19a90bf4ed8aaebb7f90 upstream. There is no point having a copy of the core allocator. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Howells <dhowells@redhat.com> Link: http://lkml.kernel.org/r/20120503085033.967140188@linutronix.de Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20frv: Use correct size for task_struct allocationThomas Gleixner
commit cce4517f33384c3794c759e206cc8e1bb6df146b upstream. alloc_task_struct_node() allocates THREAD_SIZE and maintains some weird refcount in the allocated memory. This never blew up as task_struct size on 32bit machines was always less than THREAD_SIZE Allocate just sizeof(struct task_struct) and get rid of the magic refcounting. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Howells <dhowells@redhat.com> Link: http://lkml.kernel.org/r/20120503085033.898475542@linutronix.de Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20CRIS: Add _sdata to vmlinux.lds.SJesper Nilsson
commit 473e162eea465e60578edb93341752e7f1c1dacc upstream. Fixes link error: LD vmlinux kernel/built-in.o: In function `core_kernel_data': (.text+0x13e44): undefined reference to `_sdata' Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20cris: Remove old legacy "-traditional" flag from arch-v10/lib/MakefilePaul Gortmaker
commit 7b91747d42a1012e3781dd09fa638d113809e3fd upstream. Most of these have been purged years ago. This one silently lived on until commit 69349c2dc01c489eccaa4c472542c08e370c6d7e "kconfig: fix IS_ENABLED to not require all options to be defined" In the above, we use some macro trickery to create a conditional that is valid in CPP and in C usage. However that trickery doesn't sit well if you have the legacy "-traditional" flag enabled. You'll get: AS arch/cris/arch-v10/lib/checksum.o In file included from <command-line>:4:0: include/linux/kconfig.h:23:0: error: syntax error in macro parameter list make[2]: *** [arch/cris/arch-v10/lib/checksum.o] Error 1 Everything builds fine w/o "-traditional" so simply drop it from this location as well. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20cris: posix_types.h, include asm-generic/posix_types.hJiri Slaby
commit 74f077d2a7651409c44bb323471f219a4b0d2aab upstream. Without that I cannot build anything: In file included from include/linux/page-flags.h:8:0, from kernel/bounds.c:9: include/linux/types.h:25:1: error: unknown type name '__kernel_ino_t' include/linux/types.h:29:1: error: unknown type name '__kernel_off_t' ... Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Mikael Starvik <starvik@axis.com> Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: linux-cris-kernel@axis.com Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20microblaze: Update microblaze defconfigsMichal Simek
commit d0e045401f268a8de6f87d65678214748b772680 upstream. The main reason is 0-day testing system which can directly use these defconfigs for testing. Enable support for all xilinx drivers which Microblaze can use and disable dependency on external rootfs.cpio. There is only one exception which is axi ethernet driver which still uses NO_IRQ which is not defined for Microblaze. Signed-off-by: Michal Simek <michal.simek@xilinx.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20MIPS: Expose missing pci_io{map,unmap} declarationsMarkos Chandras
commit 78857614104a26cdada4c53eea104752042bf5a1 upstream. The GENERIC_PCI_IOMAP does not depend on CONFIG_PCI so move it to the CONFIG_MIPS symbol so it's always selected for MIPS. This fixes the missing pci_iomap declaration for MIPS. Moreover, the pci_iounmap function was not defined in the io.h header file if the CONFIG_PCI symbol is not set, but it should since MIPS is not using CONFIG_GENERIC_IOMAP. This fixes the following problem on a allyesconfig: drivers/net/ethernet/3com/3c59x.c:1031:2: error: implicit declaration of function 'pci_iomap' [-Werror=implicit-function-declaration] drivers/net/ethernet/3com/3c59x.c:1044:3: error: implicit declaration of function 'pci_iounmap' [-Werror=implicit-function-declaration] Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Acked-by: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/5478/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20perf/arm: Fix armpmu_map_hw_event()Stephen Boyd
commit b88a2595b6d8aedbd275c07dfa784657b4f757eb upstream. Fix constraint check in armpmu_map_hw_event(). Reported-and-tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-11x86, fpu: correct the asm constraints for fxsave, unbreak mxcsr.dazH.J. Lu
commit eaa5a990191d204ba0f9d35dbe5505ec2cdd1460 upstream. GCC will optimize mxcsr_feature_mask_init in arch/x86/kernel/i387.c: memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct)); asm volatile("fxsave %0" : : "m" (fx_scratch)); mask = fx_scratch.mxcsr_mask; if (mask == 0) mask = 0x0000ffbf; to memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct)); asm volatile("fxsave %0" : : "m" (fx_scratch)); mask = 0x0000ffbf; since asm statement doesn’t say it will update fx_scratch. As the result, the DAZ bit will be cleared. This patch fixes it. This bug dates back to at least kernel 2.6.12. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-04s390: move dummy io_remap_pfn_range() to asm/pgtable.hLinus Torvalds
commit 4f2e29031e6c67802e7370292dd050fd62f337ee upstream. Commit b4cbb197c7e7 ("vm: add vm_iomap_memory() helper function") added a helper function wrapper around io_remap_pfn_range(), and every other architecture defined it in <asm/pgtable.h>. The s390 choice of <asm/io.h> may make sense, but is not very convenient for this case, and gratuitous differences like that cause unexpected errors like this: mm/memory.c: In function 'vm_iomap_memory': mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration] Glory be the kbuild test robot who noticed this, bisected it, and reported it to the guilty parties (ie me). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> [bwh: Backported to 3.2: the macro was not defined, so this is an addition and not a move] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-04powerpc/modules: Module CRC relocation fix causes perf issuesAnton Blanchard
commit 0e0ed6406e61434d3f38fb58aa8464ec4722b77e upstream. Module CRCs are implemented as absolute symbols that get resolved by a linker script. We build an intermediate .o that contains an unresolved symbol for each CRC. genksysms parses this .o, calculates the CRCs and writes a linker script that "resolves" the symbols to the calculated CRC. Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols that need relocating and relocates them at boot. Commit d4703aef (module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y) added a hook to reverse the bogus relocations. Part of this patch created a symbol at 0x0: # head -2 /proc/kallsyms 0000000000000000 T reloc_start c000000000000000 T .__start This reloc_start symbol is causing lots of confusion to perf. It thinks reloc_start is a massive function that stretches from 0x0 to 0xc000000000000000 and we get various cryptic errors out of perf, including: problem incrementing symbol count, skipping event This patch removes the reloc_start linker script label and instead defines it as PHYSICAL_START. We also need to wrap it with CONFIG_PPC64 because the ppc32 kernel can set a non zero PHYSICAL_START at compile time and we wouldn't want to subtract it from the CRCs in that case. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-28sparc: tsb must be flushed before tlbDave Kleikamp
Upstream commit 23a01138efe216f8084cfaa74b0b90dd4b097441 This fixes a race where a cpu may re-load a tlb from a stale tsb right after it has been flushed by a remote function call. I still see some instability when stressing the system with parallel kernel builds while creating memory pressure by writing to /proc/sys/vm/nr_hugepages, but this patch improves the stability significantly. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-28sparc64 address-congruence propertybob picco
Upstream commit 771a37ff4d80b80db3b0df3e7696f14b298c67b7 The Machine Description (MD) property "address-congruence-offset" is optional. According to the MD specification the value is assumed 0UL when not present. This caused early boot failure on T5. Signed-off-by: Bob Picco <bob.picco@oracle.com> CC: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-28sparc32: vm_area_struct access for old Sun SPARCs.Olivier DANET
Upstream commit 961246b4ed8da3bcf4ee1eb9147f341013553e3c Commit e4c6bfd2d79d063017ab19a18915f0bc759f32d9 ("mm: rearrange vm_area_struct for fewer cache misses") changed the layout of the vm_area_struct structure, it broke several SPARC32 assembly routines which used numerical constants for accessing the vm_mm field. This patch defines the VMA_VM_MM constant to replace the immediate values. Signed-off-by: Olivier DANET <odanet@caramail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21ARM: 7765/1: perf: Record the user-mode PC in the call chain.Jed Davis
commit c5f927a6f62196226915f12194c9d0df4e2210d7 upstream. With this change, we no longer lose the innermost entry in the user-mode part of the call chain. See also the x86 port, which includes the ip. It's possible to partially work around this problem by post-processing the data to use the PERF_SAMPLE_IP value, but this works only if the CPU wasn't in the kernel when the sample was taken. Signed-off-by: Jed Davis <jld@mozilla.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-21xen/time: remove blocked time accounting from xen "clockchip"Laszlo Ersek
commit 0b0c002c340e78173789f8afaa508070d838cf3d upstream. ... because the "clock_event_device framework" already accounts for idle time through the "event_handler" function pointer in xen_timer_interrupt(). The patch is intended as the completion of [1]. It should fix the double idle times seen in PV guests' /proc/stat [2]. It should be orthogonal to stolen time accounting (the removed code seems to be isolated). The approach may be completely misguided. [1] https://lkml.org/lkml/2011/10/6/10 [2] http://lists.xensource.com/archives/html/xen-devel/2010-08/msg01068.html John took the time to retest this patch on top of v3.10 and reported: "idle time is correctly incremented for pv and hvm for the normal case, nohz=off and nohz=idle." so lets put this patch in. Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-03ARM: 7772/1: Fix missing flush_kernel_dcache_page() for noMMUSimon Baatz
commit 63384fd0b1509acf522a8a8fcede09087eedb7df upstream. Commit 1bc3974 (ARM: 7755/1: handle user space mapped pages in flush_kernel_dcache_page) moved the implementation of flush_kernel_dcache_page() into mm/flush.c but did not implement it on noMMU ARM. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Acked-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-03ARM: 7755/1: handle user space mapped pages in flush_kernel_dcache_pageSimon Baatz
commit 1bc39742aab09248169ef9d3727c9def3528b3f3 upstream. Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that the pages it needs to handle are kernel mapped only. However, for example when doing direct I/O, pages with user space mappings may occur. Thus, continue to do lazy flushing if there are no user space mappings. Otherwise, flush the kernel cache lines directly. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-27KVM: x86: remove vcpu's CPL check in host-invoked XCR setZhanghaoyu (A)
commit 764bcbc5a6d7a2f3e75c9f0e4caa984e2926e346 upstream. __kvm_set_xcr function does the CPL check when set xcr. __kvm_set_xcr is called in two flows, one is invoked by guest, call stack shown as below, handle_xsetbv(or xsetbv_interception) kvm_set_xcr __kvm_set_xcr the other one is invoked by host, for example during system reset: kvm_arch_vcpu_ioctl kvm_vcpu_ioctl_x86_set_xcrs __kvm_set_xcr The former does need the CPL check, but the latter does not. Signed-off-by: Zhang Haoyu <haoyu.zhang@huawei.com> [Tweaks to commit message. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-27tilepro: work around module link error with gcc 4.7Chris Metcalf
commit 3cb3f839d306443f3d1e79b0bde1a2ad2c12b555 upstream. gcc 4.7.x is emitting calls to __ffsdi2 where previously it used to inline the appropriate ctz instructions. While this needs to be fixed in gcc, it's also easy to avoid having it cause build failures when building with those compilers by exporting __ffsdi2 to modules. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-20powerpc: Fix missing/delayed calls to irq_workBenjamin Herrenschmidt
commit 230b3034793247f61e6a0b08c44cf415f6d92981 upstream. When replaying interrupts (as a result of the interrupt occurring while soft-disabled), in the case of the decrementer, we are exclusively testing for a pending timer target. However we also use decrementer interrupts to trigger the new "irq_work", which in this case would be missed. This change the logic to force a replay in both cases of a timer boundary reached and a decrementer interrupt having actually occurred while disabled. The former test is still useful to catch cases where a CPU having been hard-disabled for a long time completely misses the interrupt due to a decrementer rollover. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-20powerpc: Fix stack overflow crash in resume_kernel when ftracingMichael Ellerman
commit 0e37739b1c96d65e6433998454985de994383019 upstream. It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873 "powerpc: Rework runlatch code" by myself --BenH ] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-20x86: Fix typo in kexec register clearingKees Cook
commit c8a22d19dd238ede87aa0ac4f7dbea8da039b9c1 upstream. Fixes a typo in register clearing code. Thanks to PaX Team for fixing this originally, and James Troup for pointing it out. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20130605184718.GA8396@www.outflux.net Cc: PaX Team <pageexec@freemail.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-13powerpc/eeh: Don't check RTAS token to get PE addrGavin Shan
commit b8b3de224f194005ad87ede6fd022fcc2bef3b1a upstream. RTAS token "ibm,get-config-addr-info" or ibm,get-config-addr-info2" are used to retrieve the PE address according to PCI address, which made up of domain/bus/slot/function. If we don't have those 2 tokens, the domain/bus/slot/function would be used as the address for EEH RTAS operations. Some older f/w might not have those 2 tokens and that blocks the EEH functionality to be initialized. It was introduced by commit e2af155c ("powerpc/eeh: pseries platform EEH initialization"). The patch skips the check on those 2 tokens so we can bring up EEH functionality successfully. And domain/bus/slot/function will be used as address for EEH RTAS operations. Reported-by: Robert Knight <knight@princeton.edu> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Tested-by: Robert Knight <knight@princeton.edu> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-07x86, um: Correct syscall table type attributes breaking gcc 4.8Martin Pelikan
commit 9271b0b4b2044c6db06051fe60bc58cdd4f17c7c upstream. The latest GCC 4.8 does some more checking on type attributes that break the build for ARCH=um -> fill them in. Specifically, the "asmlinkage" attributes is now tested for consistency. Signed-off-by: Martin Pelikan <pelikan@storkhole.cz> Link: http://lkml.kernel.org/r/1339269731-10772-1-git-send-email-pelikan@storkhole.cz Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Bernhard M. Wiedemann <bwiedemann@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-07m68k/mac: Fix unexpected interrupt with CONFIG_EARLY_PRINTKFinn Thain
commit df66834a43c461de2565c45d815288ba1c0def37 upstream. The present code does not wait for the SCC to finish resetting itself before trying to initialise the device. The result is that the SCC interrupt sources become enabled (if they weren't already). This leads to an early boot crash (unexpected interrupt) given CONFIG_EARLY_PRINTK. Fix this by adding a delay. A successful reset disables the interrupt sources. Also, after the reset for channel A setup, the SCC then gets a second reset for channel B setup which leaves channel A uninitialised again. Fix this by performing the reset only once. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-07Kirkwood: Enable PCIe port 1 on QNAP TS-11x/TS-21xMartin Michlmayr
commit 99e11334dcb846f9b76fb808196c7f47aa83abb3 upstream. Enable KW_PCIE1 on QNAP TS-11x/TS-21x devices as newer revisions (rev 1.3) have a USB 3.0 chip from Etron on PCIe port 1. Thanks to Marek Vasut for identifying this issue! Signed-off-by: Martin Michlmayr <tbm@cyrius.com> Tested-by: Marek Vasut <marex@denx.de> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-07ARM: plat-orion: Fix num_resources and id for ge10 and ge11Gregory CLEMENT
commit 2b8b2797142c7951e635c6eec5d1705ee9bc45c5 upstream. When platform data were moved from arch/arm/mach-mv78xx0/common.c to arch/arm/plat-orion/common.c with the commit "7e3819d ARM: orion: Consolidate ethernet platform data", there were few typo made on gigabit Ethernet interface ge10 and ge11. This commit writes back their initial value, which allows to use this interfaces again. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-07avr32: fix relocation check for signed 18-bit offsetHans-Christian Egtvedt
commit e68c636d88db3fda74e664ecb1a213ae0d50a7d8 upstream. Caught by static code analysis by David. Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Hans-Christian Egtvedt <egtvedt@samfundet.no> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-19powerpc: Bring all threads online prior to migration/hibernationRobert Jennings
commit 120496ac2d2d60aee68d3123a68169502a85f4b5 upstream. This patch brings online all threads which are present but not online prior to migration/hibernation. After migration/hibernation those threads are taken back offline. During migration/hibernation all online CPUs must call H_JOIN, this is required by the hypervisor. Without this patch, threads that are offline (H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be deadlocked (all threads either JOIN'd or CEDE'd). Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-19xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.Konrad Rzeszutek Wilk
commit 7f1fc268c47491fd5e63548f6415fc8604e13003 upstream. If a user did: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online we would (this a build with DEBUG enabled) get to: smpboot: ++++++++++++++++++++=_---CPU UP 1 .. snip.. smpboot: Stack at about ffff880074c0ff44 smpboot: CPU1: has booted. and hang. The RCU mechanism would kick in an try to IPI the CPU1 but the IPIs (and all other interrupts) would never arrive at the CPU1. At first glance at least. A bit digging in the hypervisor trace shows that (using xenanalyze): [vla] d4v1 vec 243 injecting 0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3 ] 0.043163639 --|x d4v1 vmentry cycles 1468 ] 0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254 0.043164913 --|x d4v1 inj_virq vec 243 real [vla] d4v1 vec 243 injecting 0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3 ] 0.043165526 --|x d4v1 vmentry cycles 1472 ] 0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254 0.043166800 --|x d4v1 inj_virq vec 243 real [vla] d4v1 vec 243 injecting there is a pending event (subsequent debugging shows it is the IPI from the VCPU0 when smpboot.c on VCPU1 has done "set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is interrupted with the callback IPI (0xf3 aka 243) which ends up calling __xen_evtchn_do_upcall. The __xen_evtchn_do_upcall seems to do *something* but not acknowledge the pending events. And the moment the guest does a 'cli' (that is the ffffffff81673254 in the log above) the hypervisor is invoked again to inject the IPI (0xf3) to tell the guest it has pending interrupts. This repeats itself forever. The culprit was the per_cpu(xen_vcpu, cpu) pointer. At the bootup we set each per_cpu(xen_vcpu, cpu) to point to the shared_info->vcpu_info[vcpu] but later on use the VCPUOP_register_vcpu_info to register per-CPU structures (xen_vcpu_setup). This is used to allow events for more than 32 VCPUs and for performance optimizations reasons. When the user performs the VCPU hotplug we end up calling the the xen_vcpu_setup once more. We make the hypercall which returns -EINVAL as it does not allow multiple registration calls (and already has re-assigned where the events are being set). We pick the fallback case and set per_cpu(xen_vcpu, cpu) to point to the shared_info->vcpu_info[vcpu] (which is a good fallback during bootup). However the hypervisor is still setting events in the register per-cpu structure (per_cpu(xen_vcpu_info, cpu)). As such when the events are set by the hypervisor (such as timer one), and when we iterate in __xen_evtchn_do_upcall we end up reading stale events from the shared_info->vcpu_info[vcpu] instead of the per_cpu(xen_vcpu_info, cpu) structures. Hence we never acknowledge the events that the hypervisor has set and the hypervisor keeps on reminding us to ack the events which we never do. The fix is simple. Don't on the second time when xen_vcpu_setup is called over-write the per_cpu(xen_vcpu, cpu) if it points to per_cpu(xen_vcpu_info). Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>