summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2012-01-12powerpc: Fix unpaired probe_hcall_entry and probe_hcall_exitLi Zhong
commit e4f387d8db3ba3c2dae4d8bdfe7bb5f4fe1bcb0d upstream. Unpaired calling of probe_hcall_entry and probe_hcall_exit might happen as following, which could cause incorrect preempt count. __trace_hcall_entry => trace_hcall_entry -> probe_hcall_entry => get_cpu_var => preempt_disable __trace_hcall_exit => trace_hcall_exit -> probe_hcall_exit => put_cpu_var => preempt_enable where: A => B and A -> B means A calls B, but => means A will call B through function name, and B will definitely be called. -> means A will call B through function pointer, so B might not be called if the function pointer is not set. So error happens when only one of probe_hcall_entry and probe_hcall_exit get called during a hcall. This patch tries to move the preempt count operations from probe_hcall_entry and probe_hcall_exit to its callers. Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-12powerpc/time: Handle wrapping of decrementerAnton Blanchard
commit 37fb9a0231ee43d42d069863bdfd567fca2b61af upstream. When re-enabling interrupts we have code to handle edge sensitive decrementers by resetting the decrementer to 1 whenever it is negative. If interrupts were disabled long enough that the decrementer wrapped to positive we do nothing. This means interrupts can be delayed for a long time until it finally goes negative again. While we hope interrupts are never be disabled long enough for the decrementer to go positive, we have a very good test team that can drive any kernel into the ground. The softlockup data we get back from these fails could be seconds in the future, completely missing the cause of the lockup. We already keep track of the timebase of the next event so use that to work out if we should trigger a decrementer exception. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21powerpc: Copy down exception vectors after feature fixupsAnton Blanchard
commit d715e433b7ad19c02fc4becf0d5e9a59f97925de upstream. kdump fails because we try to execute an HV only instruction. Feature fixups are being applied after we copy the exception vectors down to 0 so they miss out on any updates. We have always had this issue but it only became critical in v3.0 when we added CFAR support (breaks POWER5) and v3.1 when we added POWERNV (breaks everyone). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21powerpc: Add hvcall.h include to book3s_hv.cMichael Neuling
commit de1d9248eadd27539eba449b4d09428252e80c04 upstream. If you build with KVM and UP it fails with the following due to a missing include. /arch/powerpc/kvm/book3s_hv.c: In function 'do_h_register_vpa': arch/powerpc/kvm/book3s_hv.c:156:10: error: 'H_PARAMETER' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:156:10: note: each undeclared identifier is reported only once for each function it appears in arch/powerpc/kvm/book3s_hv.c:192:12: error: 'H_RESOURCE' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:222:9: error: 'H_SUCCESS' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c: In function 'kvmppc_pseries_do_hcall': arch/powerpc/kvm/book3s_hv.c:228:30: error: 'H_SUCCESS' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:232:7: error: 'H_CEDE' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:234:7: error: 'H_PROD' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:238:10: error: 'H_PARAMETER' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:250:7: error: 'H_CONFER' undeclared (first use in this function) arch/powerpc/kvm/book3s_hv.c:252:7: error: 'H_REGISTER_VPA' undeclared (first use in this function) make[2]: *** [arch/powerpc/kvm/book3s_hv.o] Error 1 Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-21powerpc/ps3: Fix lost SMP IPIsGeoff Levand
commit 72f3bea075287785ed32b777b6dd2636aa7002e8 upstream. Fixes the PS3 bootup hang introduced in 3.0-rc1 by: commit 317f394160e9beb97d19a84c39b7e5eb3d7815a sched: Move the second half of ttwu() to the remote cpu Move the PS3's LV1 EOI call lv1_end_of_interrupt_ext() from ps3_chip_eoi() to ps3_get_irq() for IPI messages. If lv1_send_event_locally() is called between a previous call to lv1_send_event_locally() and the coresponding call to lv1_end_of_interrupt_ext() the second event will not be delivered to the target cpu. The PS3's SMP IPIs are implemented using lv1_send_event_locally(), so if two IPI messages of the same type are sent to the same target in a relatively short period of time the second IPI event can become lost when lv1_end_of_interrupt_ext() is called from ps3_chip_eoi(). Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc: Fix deadlock in icswx codeAnton Blanchard
commit 8bdafa39a47265bc029838b35cc6585f69224afa upstream. The icswx code introduced an A-B B-A deadlock: CPU0 CPU1 ---- ---- lock(&anon_vma->mutex); lock(&mm->mmap_sem); lock(&anon_vma->mutex); lock(&mm->mmap_sem); Instead of using the mmap_sem to keep mm_users constant, take the page table spinlock. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc/eeh: Fix /proc/ppc64/eeh creationThadeu Lima de Souza Cascardo
commit 8feaa43494cee5e938fd5a57b9e9bf1c827e6ccd upstream. Since commit 188917e183cf9ad0374b571006d0fc6d48a7f447, /proc/ppc64 is a symlink to /proc/powerpc/. That means that creating /proc/ppc64/eeh will end up with a unaccessible file, that is not listed under /proc/powerpc/ and, then, not listed under /proc/ppc64/. Creating /proc/powerpc/eeh fixes that problem and maintain the compatibility intended with the ppc64 symlink. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc/pseries: Avoid spurious error during hotplug CPU addAnton Blanchard
commit 9c740025c51a26ab00192cfc464064d4ccbfe3fc upstream. During hotplug CPU add we get the following error: Unexpected Error (0) returned from configure-connector ibm,configure-connector returns 0 for configuration complete, so catch this and avoid the error. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc: Fix oops when echoing bad values to /sys/devices/system/memory/probeAnton Blanchard
commit a11940978bd598e65996b4f807cf4904793f7025 upstream. If we echo an address the hypervisor doesn't like to /sys/devices/system/memory/probe we oops the box: # echo 0x10000000000 > /sys/devices/system/memory/probe kernel BUG at arch/powerpc/mm/hash_utils_64.c:541! The backtrace is: create_section_mapping arch_add_memory add_memory memory_probe_store sysdev_class_store sysfs_write_file vfs_write SyS_write In create_section_mapping we BUG if htab_bolt_mapping returned an error. A better approach is to return an error which will propagate back to userspace. Rerunning the test with this patch applied: # echo 0x10000000000 > /sys/devices/system/memory/probe -bash: echo: write error: Invalid argument Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc/numa: Remove double of_node_put in hot_add_node_scn_to_nidAnton Blanchard
commit 6083184269fd723affca4f6340e491950267622a upstream. During memory hotplug testing, I got the following warning: ERROR: Bad of_node_put() on /memory@0 of_node_release kref_put of_node_put of_find_node_by_type hot_add_node_scn_to_nid hot_add_scn_to_nid memory_add_physaddr_to_nid ... of_find_node_by_type() loop does the of_node_put for us so we only need the handle the case where we terminate the loop early. As suggested by Stephen Rothwell we can do the of_node_put unconditionally outside of the loop since of_node_put handles a NULL argument fine. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11thp: share get_huge_page_tail()Andrea Arcangeli
commit b35a35b556f5e6b7993ad0baf20173e75c09ce8c upstream. This avoids duplicating the function in every arch gup_fast. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc: gup_huge_pmd() return 0 if pte changesAndrea Arcangeli
commit cf592bf768c4fa40282b8fce58a80820065de2cb upstream. powerpc didn't return 0 in that case, if it's rolling back the *nr pointer it should also return zero to avoid adding pages to the array at the wrong offset. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc: gup_hugepte() support THP based tail recountingAndrea Arcangeli
commit 3526741f0964c88bc2ce511e1078359052bf225b upstream. Up to this point the code assumed old refcounting for hugepages (pre-thp). This updates the code directly to the thp mapcount tail page refcounting. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc: gup_hugepte() avoid freeing the head page too many timesAndrea Arcangeli
commit 8596468487e2062cae2aad56e973784e03959245 upstream. We only taken "refs" pins on the head page not "*nr" pins. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc: get_hugepte() don't put_page() the wrong pageAndrea Arcangeli
commit 405e44f2e312dd5dd63e5a9f459bffcbcd4368ef upstream. "page" may have changed to point to the next hugepage after the loop completed, The references have been taken on the head page, so the put_page must happen there too. This is a longstanding issue pre-thp inclusion. It's totally unclear how these page_cache_add_speculative and pte_val(pte) != pte_val(*ptep) checks are necessary across all the powerpc gup_fast code, when x86 doesn't need any of that: there's no way the page can be freed with irq disabled so we're guaranteed the atomic_inc will happen on a page with page_count > 0 (so not needing the speculative check). The pte check is also meaningless on x86: no need to rollback on x86 if the pte changed, because the pte can still change a CPU tick after the check succeeded and it won't be rolled back in that case. The important thing is we got a reference on a valid page that was mapped there a CPU tick ago. So not knowing the soft tlb refill code of ppc64 in great detail I'm not removing the "speculative" page_count increase and the pte checks across all the code, but unless there's a strong reason for it they should be later cleaned up too. If a pte can change from huge to non-huge (like it could happen with THP) passing a pte_t *ptep to gup_hugepte() would also require to repeat the is_hugepd in gup_hugepte(), but that shouldn't happen with hugetlbfs only so I'm not altering that. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11powerpc: remove superfluous PageTail checks on the pte gup_fastAndrea Arcangeli
commit 2839bdc1bfc0af76a2f0f11eca011590520a04fa upstream. This part of gup_fast doesn't seem capable of handling hugetlbfs ptes, those should be handled by gup_hugepd only, so these checks are superfluous. Plus if this wasn't a noop, it would have oopsed because, the insistence of using the speculative refcounting would trigger a VM_BUG_ON if a tail page was encountered in the page_cache_get_speculative(). Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-11mm: thp: tail page refcounting fixAndrea Arcangeli
commit 70b50f94f1644e2aa7cb374819cfd93f3c28d725 upstream. Michel while working on the working set estimation code, noticed that calling get_page_unless_zero() on a random pfn_to_page(random_pfn) wasn't safe, if the pfn ended up being a tail page of a transparent hugepage under splitting by __split_huge_page_refcount(). He then found the problem could also theoretically materialize with page_cache_get_speculative() during the speculative radix tree lookups that uses get_page_unless_zero() in SMP if the radix tree page is freed and reallocated and get_user_pages is called on it before page_cache_get_speculative has a chance to call get_page_unless_zero(). So the best way to fix the problem is to keep page_tail->_count zero at all times. This will guarantee that get_page_unless_zero() can never succeed on any tail page. page_tail->_mapcount is guaranteed zero and is unused for all tail pages of a compound page, so we can simply account the tail page references there and transfer them to tail_page->_count in __split_huge_page_refcount() (in addition to the head_page->_mapcount). While debugging this s/_count/_mapcount/ change I also noticed get_page is called by direct-io.c on pages returned by get_user_pages. That wasn't entirely safe because the two atomic_inc in get_page weren't atomic. As opposed to other get_user_page users like secondary-MMU page fault to establish the shadow pagetables would never call any superflous get_page after get_user_page returns. It's safer to make get_page universally safe for tail pages and to use get_page_foll() within follow_page (inside get_user_pages()). get_page_foll() is safe to do the refcounting for tail pages without taking any locks because it is run within PT lock protected critical sections (PT lock for pte and page_table_lock for pmd_trans_huge). The standard get_page() as invoked by direct-io instead will now take the compound_lock but still only for tail pages. The direct-io paths are usually I/O bound and the compound_lock is per THP so very finegrined, so there's no risk of scalability issues with it. A simple direct-io benchmarks with all lockdep prove locking and spinlock debugging infrastructure enabled shows identical performance and no overhead. So it's worth it. Ideally direct-io should stop calling get_page() on pages returned by get_user_pages(). The spinlock in get_page() is already optimized away for no-THP builds but doing get_page() on tail pages returned by GUP is generally a rare operation and usually only run in I/O paths. This new refcounting on page_tail->_mapcount in addition to avoiding new RCU critical sections will also allow the working set estimation code to work without any further complexity associated to the tail page refcounting with THP. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Michel Lespinasse <walken@google.com> Reviewed-by: Michel Lespinasse <walken@google.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-29powerpc: Fix device-tree matching for Apple U4 bridgeBenjamin Herrenschmidt
Apple Quad G5 has some oddity in it's device-tree which causes the new generic matching code to fail to relate nodes for PCI-E devices below U4 with their respective struct pci_dev. This breaks graphics on those machines among others. This fixes it using a quirk which copies the node pointer from the host bridge for the root complex, which makes the generic code work for the children afterward. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-31Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/p1023rds: Fix the error of bank-width of nor flash powerpc/85xx: enable caam crypto driver by default powerpc/85xx: enable the audio drivers in the defconfigs
2011-08-30powerpc/p1023rds: Fix the error of bank-width of nor flashChunhe Lan
In the p1023rds, a physical bus of nor flash is 16 bits width. The bank-width is width (in bytes) of the bus width. So, the value of bank-width of nor flash is not one, and it should be two. Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-08-30powerpc/85xx: enable caam crypto driver by defaultKim Phillips
corenet based SoCs have SEC4 h/w, so enable the SEC4 driver, caam, and the algorithms it supports, and disable the SEC2/3 driver, talitos. Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-08-30powerpc/85xx: enable the audio drivers in the defconfigsTimur Tabi
Enable the audio drivers in the non-corenet 85xx defconfigs so that audio is enabled on the Freescale P1022DS reference board. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-08-29remove remaining references to nfsservctlStephen Rothwell
These were missed in commit f5b940997397 "All Arch: remove linkage for sys_nfsservctl system call" due to them having no sys_ prefix (presumably). Cc: NeilBrown <neilb@suse.de> Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-parisc@vger.kernel.org Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: James Bottomley <James.Bottomley@hansenpartnership.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-25arch/powerpc/sysdev/fsl_rio.c: correct IECSR register clear valueLiu Gang-B34182
This bug causes the IECSR register clear failure. In this case, the RETE (retry error threshold exceeded) interrupt will be generated and cannot be cleared. So the related ISR may be called persistently. The RETE bit in IECSR is cleared by writing a 1 to it. Signed-off-by: Liu Gang <Gang.Liu@freescale.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-11powerpc: Really fix build without CONFIG_PCIBenjamin Herrenschmidt
Brown paper bag day, previous commit wouldn't work very well with modules enabled. Move the exports into the ifdef. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: Fix build without CONFIG_PCIBenjamin Herrenschmidt
Commit fea80311a939a746533a6d7e7c3183729d6a3faf "iomap: make IOPORT/PCI mapping functions conditional" Broke powerpc build without CONFIG_PCI as we would still define pci_iomap(), which overlaps with the new empty inline in the headers. Make our implementation conditional on CONFIG_PCI Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc/4xx: Fix build of PCI code on 405Benjamin Herrenschmidt
Commit 112d1fe9f7715db423ffeec5ac1beccff6093dc4 "powerpc/4xx: Add check_link to struct ppc4xx_pciex_hwops" inadvertently broke 405 builds due to some functions being over protected by an ifdef CONFIG_44x. Move them back out. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc/pseries: Simplify vpa deregistration functionsAnton Blanchard
The VPA, SLB shadow and DTL degistration functions do not need an address, so simplify things and remove it. Also cleanup pseries_kexec_cpu_down a bit by storing the cpu IDs in local variables. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc/pseries: Cleanup VPA registration and deregistration errorsAnton Blanchard
Make the VPA, SLB shadow and DTL registration and deregistration functions print consistent messages on error. I needed the firmware error code while chasing a kexec bug but we weren't printing it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc/pseries: Fix kexec on recent firmware versionsAnton Blanchard
Recent versions of firmware will fail to unmap the virtual processor area if we have a dispatch trace log registered. This causes kexec to fail. If a trace log is registered this patch unregisters it before the SLB shadow and virtual processor areas, fixing the problem. The address argument is ignored by firmware on unregister so we may as well remove it. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: Make KVM_GUEST default to nAnton Blanchard
KVM_GUEST adds a 1 MB array to the kernel (kvm_tmp) which grew my kernel enough to cause it to fail to boot. Dynamically allocating or reducing the size of this array is a good idea, but in the meantime I think it makes sense to make KVM_GUEST default to n in order to minimise surprises. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc/kvm: Fix build errors with older toolchainsNishanth Aravamudan
On a box with gcc 4.3.2, I see errors like: arch/powerpc/kvm/book3s_hv_rmhandlers.S:1254: Error: Unrecognized opcode: stxvd2x arch/powerpc/kvm/book3s_hv_rmhandlers.S:1316: Error: Unrecognized opcode: lxvd2x Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: Lack of ibm,io-events not that important!Anton Blanchard
The ibm,io-events code is a bit verbose with its error messages. Reverse the reporting so we only print when we successfully enable I/O event interrupts. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: Move kdump default base address to half RMO size on 64bitAnton Blanchard
We are seeing boot failures on some very large boxes even with commit b5416ca9f824 (powerpc: Move kdump default base address to 64MB on 64bit). This patch halves the RMO so both kernels get about the same amount of RMO memory. On large machines this region will be at least 256MB, so each kernel will get 128MB. We cap it at 256MB (small SLB size) since some early allocations need to be in the bolted SLB region. We could relax this on machines with 1TB SLBs in a future patch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc/perf: Disable pagefaults during callchain stack readDavid Ahern
Panic observed on an older kernel when collecting call chains for the context-switch software event: [<b0180e00>]rb_erase+0x1b4/0x3e8 [<b00430f4>]__dequeue_entity+0x50/0xe8 [<b0043304>]set_next_entity+0x178/0x1bc [<b0043440>]pick_next_task_fair+0xb0/0x118 [<b02ada80>]schedule+0x500/0x614 [<b02afaa8>]rwsem_down_failed_common+0xf0/0x264 [<b02afca0>]rwsem_down_read_failed+0x34/0x54 [<b02aed4c>]down_read+0x3c/0x54 [<b0023b58>]do_page_fault+0x114/0x5e8 [<b001e350>]handle_page_fault+0xc/0x80 [<b0022dec>]perf_callchain+0x224/0x31c [<b009ba70>]perf_prepare_sample+0x240/0x2fc [<b009d760>]__perf_event_overflow+0x280/0x398 [<b009d914>]perf_swevent_overflow+0x9c/0x10c [<b009db54>]perf_swevent_ctx_event+0x1d0/0x230 [<b009dc38>]do_perf_sw_event+0x84/0xe4 [<b009dde8>]perf_sw_event_context_switch+0x150/0x1b4 [<b009de90>]perf_event_task_sched_out+0x44/0x2d4 [<b02ad840>]schedule+0x2c0/0x614 [<b0047dc0>]__cond_resched+0x34/0x90 [<b02adcc8>]_cond_resched+0x4c/0x68 [<b00bccf8>]move_page_tables+0xb0/0x418 [<b00d7ee0>]setup_arg_pages+0x184/0x2a0 [<b0110914>]load_elf_binary+0x394/0x1208 [<b00d6e28>]search_binary_handler+0xe0/0x2c4 [<b00d834c>]do_execve+0x1bc/0x268 [<b0015394>]sys_execve+0x84/0xc8 [<b001df10>]ret_from_syscall+0x0/0x3c A page fault occurred walking the callchain while creating a perf sample for the context-switch event. To handle the page fault the mmap_sem is needed, but it is currently held by setup_arg_pages. (setup_arg_pages calls shift_arg_pages with the mmap_sem held. shift_arg_pages then calls move_page_tables which has a cond_resched at the top of its for loop - hitting that cond_resched is what caused the context switch.) This is an extension of Anton's proposed patch: https://lkml.org/lkml/2011/7/24/151 adding case for 32-bit ppc. Tested on the system that first generated the panic and then again with latest kernel using a PPC VM. I am not able to test the 64-bit path - I do not have H/W for it and 64-bit PPC VMs (qemu on Intel) is horribly slow. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05ppc: Remove duplicate definition of PV_POWER7Peter Zijlstra
One definition of PV_POWER7 seems enough to me. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: pseries: Fix kexec on machines with more than 4TB of RAMAnton Blanchard
On a box with 8TB of RAM the MMU hashtable is 64GB in size. That means we have 4G PTEs. pSeries_lpar_hptab_clear was using a signed int to store the index which will overflow at 2G. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: Jump label misalignment causes oops at bootAnton Blanchard
I hit an oops at boot on the first instruction of timer_cpu_notify: NIP [c000000000722f88] .timer_cpu_notify+0x0/0x388 The code should look like: c000000000722f78: eb e9 00 30 ld r31,48(r9) c000000000722f7c: 2f bf 00 00 cmpdi cr7,r31,0 c000000000722f80: 40 9e ff 44 bne+ cr7,c000000000722ec4 c000000000722f84: 4b ff ff 74 b c000000000722ef8 c000000000722f88 <.timer_cpu_notify>: c000000000722f88: 7c 08 02 a6 mflr r0 c000000000722f8c: 2f a4 00 07 cmpdi cr7,r4,7 c000000000722f90: fb c1 ff f0 std r30,-16(r1) c000000000722f94: fb 61 ff d8 std r27,-40(r1) But the oops output shows: eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 7c0803a6 ebe1fff8 4e800020 00000000 ebe90030 c0000000 00ad0a28 00000000 2fa40007 fbc1fff0 fb61ffd8 So we scribbled over our instructions with c000000000ad0a28, which is an address inside the jump_table ELF section. It turns out the jump_table section is only aligned to 8 bytes but we are aligning our entries within the section to 16 bytes. This means our entries are offset from the table: c000000000acd4a8 <__start___jump_table>: ... c000000000ad0a10: c0 00 00 00 lfs f0,0(0) c000000000ad0a14: 00 70 cd 5c .long 0x70cd5c c000000000ad0a18: c0 00 00 00 lfs f0,0(0) c000000000ad0a1c: 00 70 cd 90 .long 0x70cd90 c000000000ad0a20: c0 00 00 00 lfs f0,0(0) c000000000ad0a24: 00 ac a4 20 .long 0xaca420 And the jump table sort code gets very confused and writes into the wrong spot. Remove the alignment, and also remove the padding since we it saves some space and we shouldn't need it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: Clean up some panic messages in prom_initAnton Blanchard
Add a newline to the panic messages in make_room. Also fix a comment that suggested our chunk size is 4Mb. It's 1MB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: Fix device tree claim codeAnton Blanchard
I have a box that fails in OF during boot with: DEFAULT CATCH!, exception-handler=fff00400 at %SRR0: 49424d2c4c6f6768 %SRR1: 800000004000b002 ie "IBM,Logh". OF got corrupted with a device tree string. Looking at make_room and alloc_up, we claim the first chunk (1 MB) but we never claim any more. mem_end is always set to alloc_top which is the top of our available address space, guaranteeing we will never call alloc_up and claim more memory. Also alloc_up wasn't setting alloc_bottom to the bottom of the available address space. This doesn't help the box to boot, but we at least fail with an obvious error. We could relocate the device tree in a future patch. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: Return the_cpu_ spec from identify_cpuScott Wood
Commit af9eef3c7b1ed004c378c89b87642f4937337d50 caused cpu_setup to see the_cpu_spec, rather than the source struct. However, on 32-bit, the return value of identify_cpu was being used for feature fixups, and identify_cpu was returning the source struct. So if cpu_setup patches the feature bits, the update won't affect the fixups. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05powerpc: mtspr/mtmsr should take an unsigned longScott Wood
Add a cast in case the caller passes in a different type, as it would if mtspr/mtmsr were functions. Previously, if a 64-bit type was passed in on 32-bit, GCC would bind the constraint to a pair of registers, and would substitute the first register in the pair in the asm code. This corresponds to the upper half of the 64-bit register, which is generally not the desired behavior. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-03Merge branch 'apei' into apei-releaseLen Brown
Some trivial conflicts due to other various merges adding to the end of common lists sooner than this one. arch/ia64/Kconfig arch/powerpc/Kconfig arch/x86/Kconfig lib/Kconfig lib/Makefile Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03Add Kconfig option ARCH_HAVE_NMI_SAFE_CMPXCHGHuang Ying
cmpxchg() is widely used by lockless code, including NMI-safe lockless code. But on some architectures, the cmpxchg() implementation is not NMI-safe, on these architectures the lockless code may need a spin_trylock_irqsave() based implementation. This patch adds a Kconfig option: ARCH_HAVE_NMI_SAFE_CMPXCHG, so that NMI-safe lockless code can depend on it or provide different implementation according to it. On many architectures, cmpxchg is only NMI-safe for several specific operand sizes. So, ARCH_HAVE_NMI_SAFE_CMPXCHG define in this patch only guarantees cmpxchg is NMI-safe for sizeof(unsigned long). Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Richard Henderson <rth@twiddle.net> CC: Mikael Starvik <starvik@axis.com> Acked-by: David Howells <dhowells@redhat.com> CC: Yoshinori Sato <ysato@users.sourceforge.jp> CC: Tony Luck <tony.luck@intel.com> CC: Hirokazu Takata <takata@linux-m32r.org> CC: Geert Uytterhoeven <geert@linux-m68k.org> CC: Michal Simek <monstr@monstr.eu> Acked-by: Ralf Baechle <ralf@linux-mips.org> CC: Kyle McMartin <kyle@mcmartin.ca> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Chen Liqin <liqin.chen@sunplusct.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ingo Molnar <mingo@redhat.com> CC: Chris Zankel <chris@zankel.net> Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-26Merge branch 'next/cross-platform' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc * 'next/cross-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc: ARM: Consolidate the clkdev header files ARM: set vga memory base at run-time ARM: convert PCI defines to variables ARM: pci: make pcibios_assign_all_busses use pci_has_flag ARM: remove unnecessary mach/hardware.h includes pci: move microblaze and powerpc pci flag functions into asm-generic powerpc: rename ppc_pci_*_flags to pci_*_flags Fix up conflicts in arch/microblaze/include/asm/pci-bridge.h
2011-07-26atomic: cleanup asm-generic atomic*.h inclusionArun Sharma
After changing all consumers of atomics to include <linux/atomic.h>, we ran into some compile time errors due to this dependency chain: linux/atomic.h -> asm/atomic.h -> asm-generic/atomic-long.h where atomic-long.h could use funcs defined later in linux/atomic.h without a prototype. This patches moves the code that includes asm-generic/atomic*.h to linux/atomic.h. Archs that need <asm-generic/atomic64.h> need to select CONFIG_GENERIC_ATOMIC64 from now on (some of them used to include it unconditionally). Compile tested on i386 and x86_64 with allnoconfig. Signed-off-by: Arun Sharma <asharma@fb.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26atomic: move atomic_add_unless to generic codeArun Sharma
This is in preparation for more generic atomic primitives based on __atomic_add_unless. Signed-off-by: Arun Sharma <asharma@fb.com> Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26atomic: use <linux/atomic.h>Arun Sharma
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26asm-generic: add another generic ext2 atomic bitopsAkinobu Mita
The majority of architectures implement ext2 atomic bitops as test_and_{set,clear}_bit() without spinlock. This adds this type of generic implementation in ext2-atomic-setbit.h and use it wherever possible. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Suggested-by: Andreas Dilger <adilger@dilger.ca> Suggested-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26ptrace: unify show_regs() prototypeMike Frysinger
[ poleg@redhat.com: no need to declare show_regs() in ptrace.h, sched.h does this ] Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>