summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2013-12-04powerpc/signals: Improved mark VSX not saved with small contexts fixMichael Neuling
commit ec67ad82814bee92251fd963bf01c7a173856555 upstream. In a recent patch: commit c13f20ac48328b05cd3b8c19e31ed6c132b44b42 Author: Michael Neuling <mikey@neuling.org> powerpc/signals: Mark VSX not saved with small contexts We fixed an issue but an improved solution was later discussed after the patch was merged. Firstly, this patch doesn't handle the 64bit signals case, which could also hit this issue (but has never been reported). Secondly, the original patch isn't clear what MSR VSX should be set to. The new approach below always clears the MSR VSX bit (to indicate no VSX is in the context) and sets it only in the specific case where VSX is available (ie. when VSX has been used and the signal context passed has space to provide the state). This reverts the original patch and replaces it with the improved solution. It also adds a 64 bit version. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29powerpc/signals: Mark VSX not saved with small contextsMichael Neuling
commit c13f20ac48328b05cd3b8c19e31ed6c132b44b42 upstream. The VSX MSR bit in the user context indicates if the context contains VSX state. Currently we set this when the process has touched VSX at any stage. Unfortunately, if the user has not provided enough space to save the VSX state, we can't save it but we currently still set the MSR VSX bit. This patch changes this to clear the MSR VSX bit when the user doesn't provide enough space. This indicates that there is no valid VSX state in the user context. This is needed to support get/set/make/swapcontext for applications that use VSX but only provide a small context. For example, getcontext in glibc provides a smaller context since the VSX registers don't need to be saved over the glibc function call. But since the program calling getcontext may have used VSX, the kernel currently says the VSX state is valid when it's not. If the returned context is then used in setcontext (ie. a small context without VSX but with MSR VSX set), the kernel will refuse the context. This situation has been reported by the glibc community. Based on patch from Carlos O'Donell. Tested-by: Haren Myneni <haren@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29powerpc/powernv: Add PE to its own PELTVGavin Shan
commit 631ad691b5818291d89af9be607d2fe40be0886e upstream. We need add PE to its own PELTV. Otherwise, the errors originated from the PE might contribute to other PEs. In the result, we can't clear up the error successfully even we're checking and clearing errors during access to PCI config space. Reported-by: kalshett@in.ibm.com Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29powerpc/vio: use strcpy in modalias_showPrarit Bhargava
commit 411cabf79e684171669ad29a0628c400b4431e95 upstream. Commit e82b89a6f19bae73fb064d1b3dd91fcefbb478f4 used strcat instead of strcpy which can result in an overflow of newlines on the buffer. Signed-off-by: Prarit Bhargava Cc: benh@kernel.crashing.org Cc: ben@decadent.org.uk Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-22KVM: PPC: Book3S HV: Fix typo in saving DSCRPaul Mackerras
commit cfc860253abd73e1681696c08ea268d33285a2c4 upstream. This fixes a typo in the code that saves the guest DSCR (Data Stream Control Register) into the kvm_vcpu_arch struct on guest exit. The effect of the typo was that the DSCR value was saved in the wrong place, so changes to the DSCR by the guest didn't persist across guest exit and entry, and some host kernel memory got corrupted. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13powerpc: Fix parameter clobber in csum_partial_copy_generic()Paul E. McKenney
commit d9813c3681a36774b254c0cdc9cce53c9e22c756 upstream. The csum_partial_copy_generic() uses register r7 to adjust the remaining bytes to process. Unfortunately, r7 also holds a parameter, namely the address of the flag to set in case of access exceptions while reading the source buffer. Lacking a quantum implementation of PowerPC, this commit instead uses register r9 to do the adjusting, leaving r7's pointer uncorrupted. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13powerpc/vio: Fix modalias_show return valuesPrarit Bhargava
commit e82b89a6f19bae73fb064d1b3dd91fcefbb478f4 upstream. modalias_show() should return an empty string on error, not -ENODEV. This causes the following false and annoying error: > find /sys/devices -name modalias -print0 | xargs -0 cat >/dev/null cat: /sys/devices/vio/4000/modalias: No such device cat: /sys/devices/vio/4001/modalias: No such device cat: /sys/devices/vio/4002/modalias: No such device cat: /sys/devices/vio/4004/modalias: No such device cat: /sys/devices/vio/modalias: No such device Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-13powerpc/iommu: Use GFP_KERNEL instead of GFP_ATOMIC in iommu_init_table()Nishanth Aravamudan
commit 1cf389df090194a0976dc867b7fffe99d9d490cb upstream. Under heavy (DLPAR?) stress, we tripped this panic() in arch/powerpc/kernel/iommu.c::iommu_init_table(): page = alloc_pages_node(nid, GFP_ATOMIC, get_order(sz)); if (!page) panic("iommu_init_table: Can't allocate %ld bytes\n", sz); Before the panic() we got a page allocation failure for an order-2 allocation. There appears to be memory free, but perhaps not in the ATOMIC context. I looked through all the call-sites of iommu_init_table() and didn't see any obvious reason to need an ATOMIC allocation. Most call-sites in fact have an explicit GFP_KERNEL allocation shortly before the call to iommu_init_table(), indicating we are not in an atomic context. There is some indirection for some paths, but I didn't see any locks indicating that GFP_KERNEL is inappropriate. With this change under the same conditions, we have not been able to reproduce the panic. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-26powerpc: Handle unaligned ldbrx/stdbrxAnton Blanchard
commit 230aef7a6a23b6166bd4003bfff5af23c9bd381f upstream. Normally when we haven't implemented an alignment handler for a load or store instruction the process will be terminated. The alignment handler uses the DSISR (or a pseudo one) to locate the right handler. Unfortunately ldbrx and stdbrx overlap lfs and stfs so we incorrectly think ldbrx is an lfs and stdbrx is an stfs. This bug is particularly nasty - instead of terminating the process we apply an incorrect fixup and continue on. With more and more overlapping instructions we should stop creating a pseudo DSISR and index using the instruction directly, but for now add a special case to catch ldbrx/stdbrx. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-07powerpc: Work around gcc miscompilation of __pa() on 64-bitPaul Mackerras
commit bdbc29c19b2633b1d9c52638fb732bcde7a2031a upstream. On 64-bit, __pa(&static_var) gets miscompiled by recent versions of gcc as something like: addis 3,2,.LANCHOR1+4611686018427387904@toc@ha addi 3,3,.LANCHOR1+4611686018427387904@toc@l This ends up effectively ignoring the offset, since its bottom 32 bits are zero, and means that the result of __pa() still has 0xC in the top nibble. This happens with gcc 4.8.1, at least. To work around this, for 64-bit we make __pa() use an AND operator, and for symmetry, we make __va() use an OR operator. Using an AND operator rather than a subtraction ends up with slightly shorter code since it can be done with a single clrldi instruction, whereas it takes three instructions to form the constant (-PAGE_OFFSET) and add it on. (Note that MEMORY_START is always 0 on 64-bit.) Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-20powerpc/numa: Avoid stupid uninitialized warning from gccBenjamin Herrenschmidt
commit aa709f3bc92c6daaf177cd7e3446da2ef64426c6 upstream. Newer gcc are being a bit blind here (it's pretty obvious we don't reach the code path using the array if we haven't initialized the pointer) but none of that is performance critical so let's just silence it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-04powerpc/modules: Module CRC relocation fix causes perf issuesAnton Blanchard
commit 0e0ed6406e61434d3f38fb58aa8464ec4722b77e upstream. Module CRCs are implemented as absolute symbols that get resolved by a linker script. We build an intermediate .o that contains an unresolved symbol for each CRC. genksysms parses this .o, calculates the CRCs and writes a linker script that "resolves" the symbols to the calculated CRC. Unfortunately the ppc64 relocatable kernel sees these CRCs as symbols that need relocating and relocates them at boot. Commit d4703aef (module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y) added a hook to reverse the bogus relocations. Part of this patch created a symbol at 0x0: # head -2 /proc/kallsyms 0000000000000000 T reloc_start c000000000000000 T .__start This reloc_start symbol is causing lots of confusion to perf. It thinks reloc_start is a massive function that stretches from 0x0 to 0xc000000000000000 and we get various cryptic errors out of perf, including: problem incrementing symbol count, skipping event This patch removes the reloc_start linker script label and instead defines it as PHYSICAL_START. We also need to wrap it with CONFIG_PPC64 because the ppc32 kernel can set a non zero PHYSICAL_START at compile time and we wouldn't want to subtract it from the CRCs in that case. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-20powerpc: Fix missing/delayed calls to irq_workBenjamin Herrenschmidt
commit 230b3034793247f61e6a0b08c44cf415f6d92981 upstream. When replaying interrupts (as a result of the interrupt occurring while soft-disabled), in the case of the decrementer, we are exclusively testing for a pending timer target. However we also use decrementer interrupts to trigger the new "irq_work", which in this case would be missed. This change the logic to force a replay in both cases of a timer boundary reached and a decrementer interrupt having actually occurred while disabled. The former test is still useful to catch cases where a CPU having been hard-disabled for a long time completely misses the interrupt due to a decrementer rollover. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-20powerpc: Fix stack overflow crash in resume_kernel when ftracingMichael Ellerman
commit 0e37739b1c96d65e6433998454985de994383019 upstream. It's possible for us to crash when running with ftrace enabled, eg: Bad kernel stack pointer bffffd12 at c00000000000a454 cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40] pc: c00000000000a454: resume_kernel+0x34/0x60 lr: c00000000000335c: performance_monitor_common+0x15c/0x180 sp: bffffd12 msr: 8000000000001032 dar: bffffd12 dsisr: 42000000 If we look at current's stack (paca->__current->stack) we see it is equal to c0000002ecab0000. Our stack is 16K, and comparing to paca->kstack (c0000002ecab3e30) we can see that we have overflowed our kernel stack. This leads to us writing over our struct thread_info, and in this case we have corrupted thread_info->flags and set _TIF_EMULATE_STACK_STORE. Dumping the stack we see: 3:mon> t c0000002ecab0000 [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70 [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180 --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30 [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable) [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90 [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34 [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300 [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable) [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280 [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130 [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28 [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40 [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34 --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0 ... and so on __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry path. At that point the irq state is not consistent, ie. interrupts are hard disabled (by the exception entry), but the paca soft-enabled flag may be out of sync. This leads to the local_irq_restore() in trace_graph_entry() actually enabling interrupts, which we do not want. Because we have not yet reprogrammed the decrementer we immediately take another decrementer exception, and recurse. The fix is twofold. Firstly make sure we call DISABLE_INTS before calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles the irq state in the paca with the hardware, making it safe again to call local_irq_save/restore(). Although that should be sufficient to fix the bug, we also mark the runlatch routines as notrace. They are called very early in the exception entry and we are asking for trouble tracing them. They are also fairly uninteresting and tracing them just adds unnecessary overhead. [ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873 "powerpc: Rework runlatch code" by myself --BenH ] Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-13powerpc/eeh: Don't check RTAS token to get PE addrGavin Shan
commit b8b3de224f194005ad87ede6fd022fcc2bef3b1a upstream. RTAS token "ibm,get-config-addr-info" or ibm,get-config-addr-info2" are used to retrieve the PE address according to PCI address, which made up of domain/bus/slot/function. If we don't have those 2 tokens, the domain/bus/slot/function would be used as the address for EEH RTAS operations. Some older f/w might not have those 2 tokens and that blocks the EEH functionality to be initialized. It was introduced by commit e2af155c ("powerpc/eeh: pseries platform EEH initialization"). The patch skips the check on those 2 tokens so we can bring up EEH functionality successfully. And domain/bus/slot/function will be used as address for EEH RTAS operations. Reported-by: Robert Knight <knight@princeton.edu> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Tested-by: Robert Knight <knight@princeton.edu> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-19powerpc: Bring all threads online prior to migration/hibernationRobert Jennings
commit 120496ac2d2d60aee68d3123a68169502a85f4b5 upstream. This patch brings online all threads which are present but not online prior to migration/hibernation. After migration/hibernation those threads are taken back offline. During migration/hibernation all online CPUs must call H_JOIN, this is required by the hypervisor. Without this patch, threads that are offline (H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be deadlocked (all threads either JOIN'd or CEDE'd). Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11powerpc: fix numa distance for form0 device treeVaidyanathan Srinivasan
commit 7122beeee7bc1757682049780179d7c216dd1c83 upstream. The following commit breaks numa distance setup for old powerpc systems that use form0 encoding in device tree. commit 41eab6f88f24124df89e38067b3766b7bef06ddb powerpc/numa: Use form 1 affinity to setup node distance Device tree node /rtas/ibm,associativity-reference-points would index into /cpus/PowerPCxxxx/ibm,associativity based on form0 or form1 encoding detected by ibm,architecture-vec-5 property. All modern systems use form1 and current kernel code is correct. However, on older systems with form0 encoding, the numa distance will get hard coded as LOCAL_DISTANCE for all nodes. This causes task scheduling anomaly since scheduler will skip building numa level domain (topmost domain with all cpus) if all numa distances are same. (value of 'level' in sched_init_numa() will remain 0) Prior to the above commit: ((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE) Restoring compatible behavior with this patch for old powerpc systems with device tree where numa distance are encoded as form0. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-11powerpc: Emulate non privileged DSCR read and writeAnton Blanchard
commit 73d2fb758e678c93bc76d40876c2359f0729b0ef upstream. POWER8 allows read and write of the DSCR in userspace. We added kernel emulation so applications could always use the instructions regardless of the CPU type. Unfortunately there are two SPRs for the DSCR and we only added emulation for the privileged one. Add code to match the non privileged one. A simple test was created to verify the fix: http://ozlabs.org/~anton/junkcode/user_dscr_test.c Without the patch we get a SIGILL and it passes with the patch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07powerpc/spufs: Initialise inode->i_ino in spufs_new_inode()Michael Ellerman
commit 6747e83235caecd30b186d1282e4eba7679f81b7 upstream. In commit 85fe402 (fs: do not assign default i_ino in new_inode), the initialisation of i_ino was removed from new_inode() and pushed down into the callers. However spufs_new_inode() was not updated. This exhibits as no files appearing in /spu, because all our dirents have a zero inode, which readdir() seems to dislike. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-05-07powerpc: Add isync to copy_and_flushMichael Neuling
commit 29ce3c5073057991217916abc25628e906911757 upstream. In __after_prom_start we copy the kernel down to zero in two calls to copy_and_flush. After the first call (copy from 0 to copy_to_here:) we jump to the newly copied code soon after. Unfortunately there's no isync between the copy of this code and the jump to it. Hence it's possible that stale instructions could still be in the icache or pipeline before we branch to it. We've seen this on real machines and it's results in no console output after: calling quiesce... returning from prom_init The below adds an isync to ensure that the copy and flushing has completed before any branching to the new instructions occurs. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-12powerpc: pSeries_lpar_hpte_remove fails from Adjunct partition being ↵Michael Wolf
performed before the ANDCOND test commit 9fb2640159f9d4f5a2a9d60e490482d4cbecafdb upstream. Some versions of pHyp will perform the adjunct partition test before the ANDCOND test. The result of this is that H_RESOURCE can be returned and cause the BUG_ON condition to occur. The HPTE is not removed. So add a check for H_RESOURCE, it is ok if this HPTE is not removed as pSeries_lpar_hpte_remove is looking for an HPTE to remove and not a specific HPTE to remove. So it is ok to just move on to the next slot and try again. Signed-off-by: Michael Wolf <mjw@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-05signal: Define __ARCH_HAS_SA_RESTORER so we know whether to clear sa_restorerBen Hutchings
Vaguely based on upstream commit 574c4866e33d 'consolidate kernel-side struct sigaction declarations'. flush_signal_handlers() needs to know whether sigaction::sa_restorer is defined, not whether SA_RESTORER is defined. Define the __ARCH_HAS_SA_RESTORER macro to indicate this. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-20powerpc: Fix cputable entry for 970MP rev 1.0Benjamin Herrenschmidt
commit d63ac5f6cf31c8a83170a9509b350c1489a7262b upstream. Commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c forgot to update the entry for the 970MP rev 1.0 processor when moving some CPU features bits to the MMU feature bit mask. This breaks booting on some rare G5 models using that chip revision. Reported-by: Phileas Fogg <phileas-fogg@mail.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-03-20powerpc: Fix STAB initializationBenjamin Herrenschmidt
commit 13938117a57f88a22f0df9722a5db7271fda85cd upstream. Commit f5339277eb8d3aed37f12a27988366f68ab68930 accidentally removed more than just iSeries bits and took out the call to stab_initialize() thus breaking support for POWER3 processors. Put it back. (Yes, nobody noticed until now ...) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-28powerpc/kexec: Disable hard IRQ before kexecPhileas Fogg
commit 8520e443aa56cc157b015205ea53e7b9fc831291 upstream. Disable hard IRQ before kexec a new kernel image. Not doing it can result in corrupted data in the memory segments reserved for the new kernel. Signed-off-by: Phileas Fogg <phileas-fogg@mail.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17KVM: PPC: 44x: fix DCR read/writeAlexander Graf
commit e43a028752fed049e4bd94ef895542f96d79fa74 upstream. When remembering the direction of a DCR transaction, we should write to the same variable that we interpret on later when doing vcpu_run again. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17powerpc: Add missing NULL terminator to avoid boot panic on PPC40xGabor Juhos
commit e6449c9b2d90c1bd9a5985bf05ddebfd1631cd6b upstream. The missing NULL terminator can cause a panic on PPC405 boards during boot: Linux/PowerPC load: console=ttyS0,115200 root=/dev/mtdblock1 rootfstype=squashfs,jffs2 noinitrd init=/etc/preinit Finalizing device tree... flat tree at 0x6a5160 bootconsole [udbg0] enabled Page fault in user mode with in_atomic() = 1 mm = (null) NIP = c0275f50 MSR = fffffffe Oops: Weird page fault, sig: 11 [#1] PowerPC 40x Platform Modules linked in: NIP: c0275f50 LR: c0275f60 CTR: c0280000 REGS: c0275eb0 TRAP: 636f7265 Not tainted (3.7.1) MSR: fffffffe <VEC,VSX,EE,PR,FP,ME,SE,BE,IR,DR,PMM,RI> CR: c06a6190 XER: 00000001 TASK = c02662a8[0] 'swapper' THREAD: c0274000 GPR00: c0275ec0 c000c658 c027c4bf 00000000 c0275ee0 c000a0ec c020a1a8 c020a1f0 GPR08: c020f631 c020f404 c025f078 c025f080 c0275f10 Call Trace: ---[ end trace 31fd0ba7d8756001 ]--- Kernel panic - not syncing: Attempted to kill the idle task! The panic happens since commit 9597abe00c1bab2aedce6b49866bf6d1e81c9eed (sections: fix section conflicts in arch/powerpc), however the root cause of this is that the NULL terminator were not added in commit a4f740cf33f7f6c164bbde3c0cdbcc77b0c4997c (of/flattree: Add of_flat_dt_match() helper function). Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Cc: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17powerpc/vdso: Remove redundant locking in update_vsyscall_tz()Shan Hai
commit ce73ec6db47af84d1466402781ae0872a9e7873c upstream. The locking in update_vsyscall_tz() is not only unnecessary because the vdso code copies the data unproteced in __kernel_gettimeofday() but also introduces a hard to reproduce race condition between update_vsyscall() and update_vsyscall_tz(), which causes user space process to loop forever in vdso code. The following patch removes the locking from update_vsyscall_tz(). Locking is not only unnecessary because the vdso code copies the data unprotected in __kernel_gettimeofday() but also erroneous because updating the tb_update_count is not atomic and introduces a hard to reproduce race condition between update_vsyscall() and update_vsyscall_tz(), which further causes user space process to loop forever in vdso code. The below scenario describes the race condition, x==0 Boot CPU other CPU proc_P: x==0 timer interrupt update_vsyscall x==1 x++;sync settimeofday update_vsyscall_tz x==2 x++;sync x==3 sync;x++ sync;x++ proc_P: x==3 (loops until x becomes even) Because the ++ operator would be implemented as three instructions and not atomic on powerpc. A similar change was made for x86 in commit 6c260d58634 ("x86: vdso: Remove bogus locking in update_vsyscall_tz") Signed-off-by: Shan Hai <shan.hai@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17powerpc: Fix CONFIG_RELOCATABLE=y CONFIG_CRASH_DUMP=n buildAnton Blanchard
commit 11ee7e99f35ecb15f59b21da6a82d96d2cd3fcc8 upstream. If we build a kernel with CONFIG_RELOCATABLE=y CONFIG_CRASH_DUMP=n, the kernel fails when we run at a non zero offset. It turns out we were incorrectly wrapping some of the relocatable kernel code with CONFIG_CRASH_DUMP. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-12-03powerpc/eeh: Lock module while handling EEH eventGavin Shan
commit feadf7c0a1a7c08c74bebb4a13b755f8c40e3bbc upstream. The EEH core is talking with the PCI device driver to determine the action (purely reset, or PCI device removal). During the period, the driver might be unloaded and in turn causes kernel crash as follows: EEH: Detected PCI bus error on PHB#4-PE#10000 EEH: This PCI device has failed 3 times in the last hour lpfc 0004:01:00.0: 0:2710 PCI channel disable preparing for reset Unable to handle kernel paging request for data at address 0x00000490 Faulting instruction address: 0xd00000000e682c90 cpu 0x1: Vector: 300 (Data Access) at [c000000fc75ffa20] pc: d00000000e682c90: .lpfc_io_error_detected+0x30/0x240 [lpfc] lr: d00000000e682c8c: .lpfc_io_error_detected+0x2c/0x240 [lpfc] sp: c000000fc75ffca0 msr: 8000000000009032 dar: 490 dsisr: 40000000 current = 0xc000000fc79b88b0 paca = 0xc00000000edb0380 softe: 0 irq_happened: 0x00 pid = 3386, comm = eehd enter ? for help [c000000fc75ffca0] c000000fc75ffd30 (unreliable) [c000000fc75ffd30] c00000000004fd3c .eeh_report_error+0x7c/0xf0 [c000000fc75ffdc0] c00000000004ee00 .eeh_pe_dev_traverse+0xa0/0x180 [c000000fc75ffe70] c00000000004ffd8 .eeh_handle_event+0x68/0x300 [c000000fc75fff00] c0000000000503a0 .eeh_event_handler+0x130/0x1a0 [c000000fc75fff90] c000000000020138 .kernel_thread+0x54/0x70 1:mon> The patch increases the reference of the corresponding driver modules while EEH core does the negotiation with PCI device driver so that the corresponding driver modules can't be unloaded during the period and we're safe to refer the callbacks. Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [ herton: backported for 3.5, adjusted driver assignments, return 0 instead of NULL, assume dev is not NULL ] Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-13powerpc/eeh: Fix crash on converting OF node to edevGavin Shan
commit 1e38b7140185e384da216aff66a711df09b5afc9 upstream. The kernel crash was reported by Alexy. He was testing some feature with private kernel, in which Alexy added some code in pci_pm_reset() to read the CSR after writting it. The bug could be reproduced on Fiber Channel card (Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)) by the following commands. # echo 1 > /sys/devices/pci0004:01/0004:01:00.0/reset # rmmod lpfc # modprobe lpfc The history behind the test case is that those additional config space reading operations in pci_pm_reset() would cause EEH error, but we didn't detect EEH error until "modprobe lpfc". For the case, all the PCI devices on PCI bus (0004:01) were removed and added after PE reset. Then the EEH devices would be figured out again based on the OF nodes. Unfortunately, there were some child OF nodes under PCI device (0004:01:00.0), but they didn't have attached PCI_DN since they're invisible from PCI domain. However, we were still trying to convert OF node to EEH device without checking on the attached PCI_DN. Eventually, it caused the kernel crash as follows: Unable to handle kernel paging request for data at address 0x00000030 Faulting instruction address: 0xc00000000004d888 cpu 0x0: Vector: 300 (Data Access) at [c000000fc797b950] pc: c00000000004d888: .eeh_add_device_tree_early+0x78/0x140 lr: c00000000004d880: .eeh_add_device_tree_early+0x70/0x140 sp: c000000fc797bbd0 msr: 8000000000009032 dar: 30 dsisr: 40000000 current = 0xc000000fc78d9f70 paca = 0xc00000000edb0000 softe: 0 irq_happened: 0x00 pid = 2951, comm = eehd enter ? for help [c000000fc797bc50] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140 [c000000fc797bcd0] c00000000004d848 .eeh_add_device_tree_early+0x38/0x140 [c000000fc797bd50] c000000000051b54 .pcibios_add_pci_devices+0x34/0x190 [c000000fc797bde0] c00000000004fb10 .eeh_reset_device+0x100/0x160 [c000000fc797be70] c0000000000502dc .eeh_handle_event+0x19c/0x300 [c000000fc797bf00] c000000000050570 .eeh_event_handler+0x130/0x1a0 [c000000fc797bf90] c000000000020138 .kernel_thread+0x54/0x70 The patch changes of_node_to_eeh_dev() and just returns NULL if the passed OF node doesn't have attached PCI_DN. Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-02powerpc/85xx: p1022ds: fix DIU/LBC switching with NAND enabledTimur Tabi
commit 896c01cb4bb3cfc2c0ea9873fa7a9f8bd0a7c8d8 upstream. In order for indirect mode on the PIXIS to work properly, both chip selects need to be set to GPCM mode, otherwise writes to the chip select base addresses will not actually post to the local bus -- they'll go to the NAND controller instead. Therefore, we need to set BR0 and BR1 to GPCM mode before switching to indirect mode. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-02powerpc/85xx: p1022ds: disable the NAND flash node if video is enabledTimur Tabi
commit 6269f2584a359766f53005c676daff8aee60cbed upstream. The Freescale P1022 has a unique pin muxing "feature" where the DIU video controller's video signals are muxed with 24 of the local bus address signals. When the DIU is enabled, the bulk of the local bus is disabled, preventing access to memory-mapped devices like NAND flash and the pixis FPGA. Therefore, if the DIU is going to be enabled, then memory-mapped devices on the localbus, like NAND flash, need to be disabled. This patch is similar to "powerpc/85xx: p1022ds: disable the NOR flash node if video is enabled", except that it disables the NAND flash node instead. This PIXIS node needs to remain enabled because it is used by platform code to switch into indirect mode. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Make sure IPI handlers see data written by IPI sendersPaul Mackerras
commit 9fb1b36ca1234e64a5d1cc573175303395e3354d upstream. We have been observing hangs, both of KVM guest vcpu tasks and more generally, where a process that is woken doesn't properly wake up and continue to run, but instead sticks in TASK_WAKING state. This happens because the update of rq->wake_list in ttwu_queue_remote() is not ordered with the update of ipi_message in smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in scheduler_ipi() is not ordered with the reading of ipi_message in smp_ipi_demux(). Thus it is possible for the IPI receiver not to see the updated rq->wake_list and therefore conclude that there is nothing for it to do. In order to make sure that anything done before smp_send_reschedule() is ordered before anything done in the resulting call to scheduler_ipi(), this adds barriers in smp_muxed_message_pass() and smp_ipi_demux(). The barrier in smp_muxed_message_pass() is a full barrier to ensure that there is a full ordering between the smp_send_reschedule() caller and scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than xchg_local() because xchg() includes release and acquire barriers. Using xchg() rather than xchg_local() makes sense given that ipi_message is not just accessed locally. This moves the barrier between setting the message and calling the cause_ipi() function into the individual cause_ipi implementations. Most of them -- those that used outb, out_8 or similar -- already had a full barrier because out_8 etc. include a sync before the MMIO store. This adds an explicit barrier in the two remaining cases. These changes made no measurable difference to the speed of IPIs as measured using a simple ping-pong latency test across two CPUs on different cores of a POWER7 machine. The analysis of the reason why processes were not waking up properly is due to Milton Miller. Reported-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Restore correct DSCR in context switchAnton Blanchard
commit 714332858bfd40dcf8f741498336d93875c23aa7 upstream. During a context switch we always restore the per thread DSCR value. If we aren't doing explicit DSCR management (ie thread.dscr_inherit == 0) and the default DSCR changed while the process has been sleeping we end up with the wrong value. Check thread.dscr_inherit and select the default DSCR or per thread DSCR as required. This was found with the following test case, when running with more threads than CPUs (ie forcing context switching): http://ozlabs.org/~anton/junkcode/dscr_default_test.c With the four patches applied I can run a combination of all test cases successfully at the same time: http://ozlabs.org/~anton/junkcode/dscr_default_test.c http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Fix DSCR inheritance in copy_thread()Anton Blanchard
commit 1021cb268b3025573c4811f1dee4a11260c4507b upstream. If the default DSCR is non zero we set thread.dscr_inherit in copy_thread() meaning the new thread and all its children will ignore future updates to the default DSCR. This is not intended and is a change in behaviour that a number of our users have hit. We just need to inherit thread.dscr and thread.dscr_inherit from the parent which ends up being much simpler. This was found with the following test case: http://ozlabs.org/~anton/junkcode/dscr_default_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Keep thread.dscr and thread.dscr_inherit in syncAnton Blanchard
commit 00ca0de02f80924dfff6b4f630e1dff3db005e35 upstream. When we update the DSCR either via emulation of mtspr(DSCR) or via a change to dscr_default in sysfs we don't update thread.dscr. We will eventually update it at context switch time but there is a period where thread.dscr is incorrect. If we fork at this point we will copy the old value of thread.dscr into the child. To avoid this, always keep thread.dscr in sync with reality. This issue was found with the following testcase: http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Update DSCR on all CPUs when writing sysfs dscr_defaultAnton Blanchard
commit 1b6ca2a6fe56e7697d57348646e07df08f43b1bb upstream. Writing to dscr_default in sysfs doesn't actually change the DSCR - we rely on a context switch on each CPU to do the work. There is no guarantee we will get a context switch in a reasonable amount of time so fire off an IPI to force an immediate change. This issue was found with the following test case: http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09powerpc/85xx: use the BRx registers to enable indirect mode on the P1022DSTimur Tabi
commit 6bd825f02966be8ba544047cab313d6032c23819 upstream. In order to enable the DIU video controller on the P1022DS, the FPGA needs to be switched to "indirect mode", where the localbus is disabled and the FPGA is accessed via writes to localbus chip select signals CS0 and CS1. To obtain the address of CS0 and CS1, the platform driver uses an "indirect pixis mode" device tree node. This node assumes that the localbus 'ranges' property is sorted in chip-select order. That is, reg value 0 maps to CS0, reg value 1 maps to CS1, etc. This is how the 'ranges' property is supposed to be arranged. Unfortunately, the 'ranges' property is often mis-arranged, and not just on the P1022DS. Linux normally does not care, since it does not program the localbus. But the indirect-mode code on the P1022DS does care. The "proper" fix is to have U-Boot fix the 'ranges' property, but this would be too cumbersome. The names and 'reg' properties of all the localbus devices would also need to be updated, and determining which localbus device maps to which chip select is board-specific. Instead, we determine the CS0/CS1 base addresses the same way that U-boot does -- by reading the BRx registers directly and mapping them to physical addresses. This code is simpler and more reliable, and it does not require a U-boot or device tree change. Since the indirect pixis device tree node is no longer needed, the node is deleted from the DTS. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09powerpc/eeh: Check handle_eeh_events() return valueKleber Sacilotto de Souza
commit 10db8d212864cb6741df7d7fafda5ab6661f6f88 upstream. Function eeh_event_handler() dereferences the pointer returned by handle_eeh_events() without checking, causing a crash if NULL was returned, which is expected in some situations. This patch fixes this bug by checking for the value returned by handle_eeh_events() before dereferencing it. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09powerpc: Add "memory" attribute for mfmsr()Tiejun Chen
commit b416c9a10baae6a177b4f9ee858b8d309542fbef upstream. Add "memory" attribute in inline assembly language as a compiler barrier to make sure 4.6.x GCC don't reorder mfmsr(). Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09powerpc/ftrace: Fix assembly trampoline register usageroger blofeld
commit fd5a42980e1cf327b7240adf5e7b51ea41c23437 upstream. Just like the module loader, ftrace needs to be updated to use r12 instead of r11 with newer gcc's. Signed-off-by: Roger Blofeld <blofeldus@yahoo.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16powerpc: Fix build of some debug irq codeBenjamin Herrenschmidt
commit 21b2de341270bd7bb7a811027ffe63276d9b3b75 upstream. There was a typo, checking for CONFIG_TRACE_IRQFLAG instead of CONFIG_TRACE_IRQFLAGS causing some useful debug code to not be built This in turns causes a build error on BookE 64-bit due to incorrect semicolons at the end of a couple of macros, so let's fix that too Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16powerpc: More fixes for lazy IRQ vs. idleBenjamin Herrenschmidt
commit be2cf20a5ad31ebb13562c1c866ecc626fbd721e upstream. Looks like we still have issues with pSeries and Cell idle code vs. the lazy irq state. In fact, the reset fixes that went upstream are exposing the problem more by causing BUG_ON() to trigger (which this patch turns into a WARN_ON instead). We need to be careful when using a variant of low power state that has the side effect of turning interrupts back on, to properly set all the SW & lazy state to look as if everything is enabled before we enter the low power state with MSR:EE off as we will return with MSR:EE on. If not, we have a discrepancy of state which can cause things to go very wrong later on. This patch moves the logic into a helper and uses it from the pseries and cell idle code. The power4/970 idle code already got things right (in assembly even !) so I'm not touching it. The power7 "bare metal" idle code is subtly different and correct. Remains PA6T and some hypervisor based Cell platforms which have questionable code in there, but they are mostly dead platforms so I'll fix them when I manage to get final answers from the respective maintainers about how the low power state actually works on them. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16powerpc/xmon: Use cpumask iterator to avoid warningAnton Blanchard
commit bc1d7702910c7c7e88eb60b58429dbfe293683ce upstream. We have a bug report where the kernel hits a warning in the cpumask code: WARNING: at include/linux/cpumask.h:107 Which is: WARN_ON_ONCE(cpu >= nr_cpumask_bits); The backtrace is: cpu_cmd cmds xmon_core xmon die xmon is iterating through 0 to NR_CPUS. I'm not sure why we are still open coding this but iterating above nr_cpu_ids is definitely a bug. This patch iterates through all possible cpus, in case we issue a system reset and CPUs in an offline state call in. Perhaps the old code was trying to handle CPUs that were in the partition but were never started (eg kexec into a kernel with an nr_cpus= boot option). They are going to die way before we get into xmon since we haven't set any kernel state up for them. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16powerpc/kvm: sldi should be sldMichael Neuling
commit 2f584a146a2965b82fce89b8d2f95dc5cfe468d0 upstream. Since we are taking a registers, this should never have been an sldi. Talking to paulus offline, this is the correct fix. Was introduced by: commit 19ccb76a1938ab364a412253daec64613acbf3df Author: Paul Mackerras <paulus@samba.org> Date: Sat Jul 23 17:42:46 2011 +1000 Talking to paulus, this shouldn't be a literal. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16powerpc: check_and_cede_processor() never cedesAnton Blanchard
commit 0b17ba7258db83cd02da560884e053b85de371f2 upstream. Commit f948501b36c6 ("Make hard_irq_disable() actually hard-disable interrupts") caused check_and_cede_processor to stop working. ->irq_happened will never be zero right after a hard_irq_disable so the compiler removes the call to cede_processor completely. The bug was introduced back in the lazy interrupt handling rework of 3.4 but was hidden until recently because hard_irq_disable did nothing. This issue will eventually appear in 3.4 stable since the hard_irq_disable fix is marked stable, so mark this one for stable too. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16powerpc/pseries: Fix software invalidate TCEMichael Neuling
commit bc6dc752f35488160ffac07ae91bed1bddaea32a upstream. The following added support for powernv but broke pseries/BML: 1f1616e powerpc/powernv: Add TCE SW invalidation support TCE_PCI_SW_INVAL was split into FREE and CREATE flags but the tests in the pseries code were not updated to reflect this. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16powerpc: Fix uninitialised error in numa.cMichael Neuling
commit 82b2521d257b5c0efd51821cf5fa306e53bbb6ba upstream. chroma_defconfig currently gives me this with gcc 4.6: arch/powerpc/mm/numa.c:638:13: error: 'dm' may be used uninitialized in this function [-Werror=uninitialized] It's a bogus warning/error since of_get_drconf_memory() only writes it anyway. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-16powerpc/ftrace: Do not trace restore_interrupts()Steven Rostedt
commit 2d773aa4810d4a612d1c879faacc38594cc3f841 upstream. As I was adding code that affects all archs, I started testing function tracer against PPC64 and found that it currently locks up with 3.4 kernel. I figured it was due to tracing a function that shouldn't be, so I went through the following process to bisect to find the culprit: cat /debug/tracing/available_filter_functions > t num=`wc -l t` sed -ne "1,${num}p" t > t1 let num=num+1 sed -ne "${num},$p" t > t2 cat t1 > /debug/tracing/set_ftrace_filter echo function /debug/tracing/current_tracer <failed? bisect t1, if not bisect t2> It finally came down to this function: restore_interrupts() I'm not sure why this locks up the system. It just seems to prevent scheduling from occurring. Interrupts seem to still work, as I can ping the box. But all user processes freeze. When restore_interrupts() is not traced, function tracing works fine. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>