summaryrefslogtreecommitdiff
path: root/drivers/xen/events.c
AgeCommit message (Collapse)Author
2011-11-26genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlierIan Campbell
commit 9bab0b7fbaceec47d32db51cd9e59c82fb071f5a upstream This adds a mechanism to resume selected IRQs during syscore_resume instead of dpm_resume_noirq. Under Xen we need to resume IRQs associated with IPIs early enough that the resched IPI is unmasked and we can therefore schedule ourselves out of the stop_machine where the suspend/resume takes place. This issue was introduced by 676dc3cf5bc3 "xen: Use IRQF_FORCE_RESUME". Back ported to 2.6.32 (which lacks syscore support) by calling the relavant resume function directly from sysdev_resume). v2: Fixed non-x86 build errors. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-08Revert "genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlier"Greg Kroah-Hartman
This reverts commit 0f12a6ad9fa3a03f2bcee36c9cb704821e244c40. It causes too many build errors and needs to be done properly. Reported-by: Jiri Slaby <jslaby@suse.cz> Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-07genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlierIan Campbell
commit 9bab0b7fbaceec47d32db51cd9e59c82fb071f5a upstream This adds a mechanism to resume selected IRQs during syscore_resume instead of dpm_resume_noirq. Under Xen we need to resume IRQs associated with IPIs early enough that the resched IPI is unmasked and we can therefore schedule ourselves out of the stop_machine where the suspend/resume takes place. This issue was introduced by 676dc3cf5bc3 "xen: Use IRQF_FORCE_RESUME". Back ported to 2.6.32 (which lacks syscore support) by calling the relavant resume function directly from sysdev_resume). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-23xen: Use IRQF_FORCE_RESUMEThomas Gleixner
commit 676dc3cf5bc36a9e129a3ad8fe3bd7b2ebf20f5d upstream. Mark the IRQF_NO_SUSPEND interrupts IRQF_FORCE_RESUME and remove the extra walk through the interrupt descriptors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-23xen: events: do not unmask event channels on resumeIan Campbell
commit 6903591f314b8947d0e362bda7715e90eb9df75e upstream. The IRQ core code will take care of disabling and reenabling interrupts over suspend resume automatically, therefore we do not need to do this in the Xen event channel code. The only exception is those event channels marked IRQF_NO_SUSPEND which the IRQ core ignores. We must unmask these ourselves, taking care to obey the current IRQ_DISABLED status. Failure check for IRQ_DISABLED leads to enabling polled only event channels, such as that associated with the pv spinlocks, which must never be enabled: [ 21.970432] ------------[ cut here ]------------ [ 21.970432] kernel BUG at arch/x86/xen/spinlock.c:343! [ 21.970432] invalid opcode: 0000 [#1] SMP [ 21.970432] last sysfs file: /sys/devices/virtual/net/lo/operstate [ 21.970432] Modules linked in: [ 21.970432] [ 21.970432] Pid: 0, comm: swapper Not tainted (2.6.32.24-x86_32p-xen-01034-g787c727 #34) [ 21.970432] EIP: 0061:[<c102e209>] EFLAGS: 00010046 CPU: 3 [ 21.970432] EIP is at dummy_handler+0x3/0x7 [ 21.970432] EAX: 0000021c EBX: dfc16880 ECX: 0000001a EDX: 00000000 [ 21.970432] ESI: dfc02c00 EDI: 00000001 EBP: dfc47e10 ESP: dfc47e10 [ 21.970432] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069 [ 21.970432] Process swapper (pid: 0, ti=dfc46000 task=dfc39440 task.ti=dfc46000) [ 21.970432] Stack: [ 21.970432] dfc47e30 c10a39f0 0000021c 00000000 00000000 dfc16880 0000021c 00000001 [ 21.970432] <0> dfc47e40 c10a4f08 0000021c 00000000 dfc47e78 c12240a7 c1839284 c1839284 [ 21.970432] <0> 00000200 00000000 00000000 f5720000 c1f3d028 c1f3d02c 00000180 dfc47e90 [ 21.970432] Call Trace: [ 21.970432] [<c10a39f0>] ? handle_IRQ_event+0x5f/0x122 [ 21.970432] [<c10a4f08>] ? handle_percpu_irq+0x2f/0x55 [ 21.970432] [<c12240a7>] ? __xen_evtchn_do_upcall+0xdb/0x15f [ 21.970432] [<c122481e>] ? xen_evtchn_do_upcall+0x20/0x30 [ 21.970432] [<c1030d47>] ? xen_do_upcall+0x7/0xc [ 21.970432] [<c102007b>] ? apic_reg_read+0xd3/0x22d [ 21.970432] [<c1002227>] ? hypercall_page+0x227/0x1005 [ 21.970432] [<c102d30b>] ? xen_force_evtchn_callback+0xf/0x14 [ 21.970432] [<c102da7c>] ? check_events+0x8/0xc [ 21.970432] [<c102da3b>] ? xen_irq_enable_direct_end+0x0/0x1 [ 21.970432] [<c105e485>] ? finish_task_switch+0x62/0xba [ 21.970432] [<c14e3f84>] ? schedule+0x808/0x89d [ 21.970432] [<c1084dc5>] ? hrtimer_start_expires+0x1a/0x22 [ 21.970432] [<c1085154>] ? tick_nohz_restart_sched_tick+0x15a/0x162 [ 21.970432] [<c102f43a>] ? cpu_idle+0x6d/0x6f [ 21.970432] [<c14db29e>] ? cpu_bringup_and_idle+0xd/0xf [ 21.970432] Code: 5d 0f 95 c0 0f b6 c0 c3 55 66 83 78 02 00 89 e5 5d 0f 95 \ c0 0f b6 c0 c3 55 b2 01 86 10 31 c0 84 d2 89 e5 0f 94 c0 5d c3 55 89 e5 <0f> 0b \ eb fe 55 80 3d 4c ce 84 c1 00 89 e5 57 56 89 c6 53 74 15 [ 21.970432] EIP: [<c102e209>] dummy_handler+0x3/0x7 SS:ESP 0069:dfc47e10 [ 21.970432] ---[ end trace c0b71f7e12cf3011 ]--- Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-09xen: ensure that all event channels start off bound to VCPU 0Ian Campbell
commit b0097adeec27e30223c989561ab0f7aa60d1fe93 upstream. All event channels startbound to VCPU 0 so ensure that cpu_evtchn_mask is initialised to reflect this. Otherwise there is a race after registering an event channel but before the affinity is explicitly set where the event channel can be delivered. If this happens then the event channel remains pending in the L1 (evtchn_pending) array but is cleared in L2 (evtchn_pending_sel), this means the event channel cannot be reraised until another event channel happens to trigger the same L2 entry on that VCPU. sizeof(cpu_evtchn_mask(0))==sizeof(unsigned long*) which is not correct, and causes only the first 32 or 64 event channels (depending on architecture) to be initially bound to VCPU0. Use sizeof(struct cpu_evtchn_s) instead. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-01Xen: fix typo in previous patchJames Dingwall
Correctly name the irq_chip structure to fix an immediate failure when booting as a xen pv_ops guest with a NULL pointer exception. The missing 'x' was introduced in commit [fb412a178502dc498430723b082a932f797e4763] applied to 2.6.3[25]-stable trees. The commit to mainline was [aaca49642b92c8a57d3ca5029a5a94019c7af69f] which did not have the problem. Signed-off-by: James Dingwall <james@dingwall.me.uk> Reported-by: Pawel Zuzelski <pawelz@pld-linux.org> Tested-by: Pawel Zuzelski <pawelz@pld-linux.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-20xen: use percpu interrupts for IPIs and VIRQsJeremy Fitzhardinge
commit aaca49642b92c8a57d3ca5029a5a94019c7af69f upstream. IPIs and VIRQs are inherently per-cpu event types, so treat them as such: - use a specific percpu irq_chip implementation, and - handle them with handle_percpu_irq This makes the path for delivering these interrupts more efficient (no masking/unmasking, no locks), and it avoid problems with attempts to migrate them. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-09-20xen: handle events as edge-triggeredJeremy Fitzhardinge
commit dffe2e1e1a1ddb566a76266136c312801c66dcf7 upstream. Xen events are logically edge triggered, as Xen only calls the event upcall when an event is newly set, but not continuously as it remains set. As a result, use handle_edge_irq rather than handle_level_irq. This has the important side-effect of fixing a long-standing bug of events getting lost if: - an event's interrupt handler is running - the event is migrated to a different vcpu - the event is re-triggered The most noticable symptom of these lost events is occasional lockups of blkfront. Many thanks to Tom Kopec and Daniel Stodden in tracking this down. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Tom Kopec <tek@acm.org> Cc: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13xen: Do not suspend IPI IRQs.Ian Campbell
commit 4877c737283813bdb4bebfa3168c1585f6e3a8ca upstream. In general the semantics of IPIs are that they are are expected to continue functioning after dpm_suspend_noirq(). Specifically I have seen a deadlock between the callfunc IPI and the stop machine used by xen's do_suspend() routine. If one CPU has already called dpm_suspend_noirq() then there is a window where it can be sent a callfunc IPI before all the other CPUs have entered stop_cpu(). If this happens then the first CPU ends up spinning in stop_cpu() waiting for the other to rendezvous in state STOPMACHINE_PREPARE while the other is spinning in csd_lock_wait(). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: xen-devel@lists.xensource.com LKML-Reference: <1280398595-29708-4-git-send-email-ian.campbell@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-12-18xen: don't leak IRQs over suspend/resume.Ian Campbell
commit fed5ea87e02aaf902ff38c65b4514233db03dc09 upstream. On resume irq_info[*].evtchn is reset to 0 since event channel mappings are not preserved over suspend/resume. The other contents of irq_info is preserved to allow rebind_evtchn_irq() to function. However when a device resumes it will try to unbind from the previous IRQ (e.g. blkfront goes blkfront_resume() -> blkif_free() -> unbind_from_irqhandler() -> unbind_from_irq()). This will fail due to the check for VALID_EVTCHN in unbind_from_irq() and the IRQ is leaked. The device will then continue to resume and allocate a new IRQ, eventually leading to find_unbound_irq() panic()ing. Fix this by changing unbind_from_irq() to handle teardown of interrupts which have type!=IRQT_UNBOUND but are not currently bound to a specific event channel. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-08-14Merge branch 'percpu-for-linus' into percpu-for-nextTejun Heo
Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>
2009-07-01xen: Use kcalloc() in xen_init_IRQ()Pekka Enberg
The init_IRQ() function is now called with slab allocator initialized. Therefore, we must not use the bootmem allocator in xen_init_IRQ(). Fixes the following boot-time warning: ------------[ cut here ]------------ WARNING: at mm/bootmem.c:535 alloc_arch_preferred_bootmem+0x27/0x45() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.30 #1 Call Trace: [<ffffffff8102d6e3>] ? warn_slowpath_common+0x73/0xb0 [<ffffffff810210d9>] ? pvclock_clocksource_read+0x49/0x90 [<ffffffff812e522f>] ? alloc_arch_preferred_bootmem+0x27/0x45 [<ffffffff812e5761>] ? ___alloc_bootmem_nopanic+0x39/0xc9 [<ffffffff812e57fa>] ? ___alloc_bootmem+0x9/0x2f [<ffffffff812e9e21>] ? xen_init_IRQ+0x25/0x61 [<ffffffff812d69ee>] ? start_kernel+0x1b5/0x29e ---[ end trace 4eaa2a86a8e2da22 ]--- Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Reported-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: lists@nerdbynature.de Cc: jeremy.fitzhardinge@citrix.com LKML-Reference: <1246438278.22417.28.camel@penberg-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-24percpu: clean up percpu variable definitionsTejun Heo
Percpu variable definition is about to be updated such that all percpu symbols including the static ones must be unique. Update percpu variable definitions accordingly. * as,cfq: rename ioc_count uniquely * cpufreq: rename cpu_dbs_info uniquely * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and rename it * ipv4,6: rename cookie_scratch uniquely * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to pmc_irq_entry and nmi_entry to pmc_nmi_entry * perf_counter: rename disable_count to perf_disable_count * ftrace: rename test_event_disable to ftrace_test_event_disable * kmemleak: rename test_pointer to kmemleak_test_pointer * mce: rename next_interval to mce_next_interval [ Impact: percpu usage cleanups, no duplicate static percpu var names ] Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Dave Jones <davej@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: linux-mm <linux-mm@kvack.org> Cc: David S. Miller <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andi Kleen <andi@firstfloor.org>
2009-06-24percpu: cleanup percpu array definitionsTejun Heo
Currently, the following three different ways to define percpu arrays are in use. 1. DEFINE_PER_CPU(elem_type[array_len], array_name); 2. DEFINE_PER_CPU(elem_type, array_name[array_len]); 3. DEFINE_PER_CPU(elem_type, array_name)[array_len]; Unify to #1 which correctly separates the roles of the two parameters and thus allows more flexibility in the way percpu variables are defined. [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Tony Luck <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: linux-mm@kvack.org Cc: Christoph Lameter <cl@linux-foundation.org> Cc: David S. Miller <davem@davemloft.net>
2009-06-10Merge branch 'x86-xen-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-xen-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (42 commits) xen: cache cr0 value to avoid trap'n'emulate for read_cr0 xen/x86-64: clean up warnings about IST-using traps xen/x86-64: fix breakpoints and hardware watchpoints xen: reserve Xen start_info rather than e820 reserving xen: add FIX_TEXT_POKE to fixmap lguest: update lazy mmu changes to match lguest's use of kvm hypercalls xen: honour VCPU availability on boot xen: add "capabilities" file xen: drop kexec bits from /sys/hypervisor since kexec isn't implemented yet xen/sys/hypervisor: change writable_pt to features xen: add /sys/hypervisor support xen/xenbus: export xenbus_dev_changed xen: use device model for suspending xenbus devices xen: remove suspend_cancel hook xen/dev-evtchn: clean up locking in evtchn xen: export ioctl headers to userspace xen: add /dev/xen/evtchn driver xen: add irq_from_evtchn xen: clean up gate trap/interrupt constants xen: set _PAGE_NX in __supported_pte_mask before pagetable construction ...
2009-04-28x86/irq: change irq_desc_alloc() to take node instead of cpuYinghai Lu
This simplifies the node awareness of the code. All our allocators only deal with a NUMA node ID locality not with CPU ids anyway - so there's no need to maintain (and transform) a CPU id all across the IRq layer. v2: keep move_irq_desc related [ Impact: cleanup, prepare IRQ code to be NUMA-aware ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@goop.org> LKML-Reference: <49F65536.2020300@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28irq: change ->set_affinity() to return statusYinghai Lu
according to Ingo, change set_affinity() in irq_chip should return int, because that way we can handle failure cases in a much cleaner way, in the genirq layer. v2: fix two typos [ Impact: extend API ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: linux-arch@vger.kernel.org LKML-Reference: <49F654E9.4070809@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-30xen: add irq_from_evtchnIan Campbell
Given an evtchn, return the corresponding irq. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-02-09xen: explicitly initialise the cpu field of irq_infoIan Campbell
I was seeing a very odd crash on 64 bit in bind_evtchn_to_cpu because cpu_from_irq(irq) was coming out as -1. I found this was coming direct from the mk_ipi_info call. It's not clear to me that this isn't a compiler bug (implicit initialisation to zero of unsigned shorts in a struct not handled correctly?). On the other hand is it true that all event channels start of bound to CPU 0? If not then -1 might be correct and the various other functions should cope with this. Signed-off-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09xen: make sure that softirqs get handled at the end of event processingJeremy Fitzhardinge
Make sure that irq_enter()/irq_exit() wrap the entire event processing loop, rather than each individual event invokation. This makes sure that softirq processing is deferred until the end of event processing, rather than in the middle with interrupts disabled. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09xen: remove irq bindcountJeremy Fitzhardinge
There should be no need for us to maintain our own bind count for irqs, since the surrounding irq system should keep track of shared irqs for us. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09xen: pack all irq-related info togetherJeremy Fitzhardinge
Put all irq info into one struct. Also, use a union to keep event channel type-specific information, rather than overloading the index field. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09xen: use our own eventchannel->irq pathJeremy Fitzhardinge
Rather than overloading vectors for event channels, take full responsibility for mapping an event channel to irq directly. With this patch Xen has its own irq allocator. When the kernel gets an event channel upcall, it maps the event channel number to an irq and injects it into the normal interrupt path. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09xen: set irq_chip disableJeremy Fitzhardinge
By default, the irq_chip.disable operation is a no-op. Explicitly set it to disable the Xen event channel. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12xen: fix too early kmalloc callChristophe Saout
Impact: fix bootup crash on xen guests SLAB is not yet up, with earlyprintk it is giving me an Oops in __kmalloc. Replace call to kmalloc() with alloc_bootmem(). Reported-by: Christophe Saout <christophe@saout.de> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-11Xen: reduce memory required for cpu_evtchn_maskMike Travis
Impact: reduce memory usage. Reduce this significant gain in the amount of memory used when NR_CPUS bumped from 128 to 4096 by allocating the array based on nr_cpu_ids: 65536 +2031616 2097152 +3100% cpu_evtchn_mask(.bss) Signed-off-by: Mike Travis <travis@sgi.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: virtualization@lists.osdl.org Cc: xen-devel@lists.xensource.com
2009-01-11cpumask: update irq_desc to use cpumask_var_tMike Travis
Impact: reduce memory usage, use new cpumask API. Replace the affinity and pending_masks with cpumask_var_t's. This adds to the significant size reduction done with the SPARSE_IRQS changes. The added functions (init_alloc_desc_masks & init_copy_desc_masks) are in the include file so they can be inlined (and optimized out for the !CONFIG_CPUMASKS_OFFSTACK case.) [Naming chosen to be consistent with the other init*irq functions, as well as the backwards arg declaration of "from, to" instead of the more common "to, from" standard.] Includes a slight change to the declaration of struct irq_desc to embed the pending_mask within ifdef(CONFIG_SMP) to be consistent with other references, and some small changes to Xen. Tested: sparse/non-sparse/cpumask_offstack/non-cpumask_offstack/nonuma/nosmp on x86_64 Signed-off-by: Mike Travis <travis@sgi.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: virtualization@lists.osdl.org Cc: xen-devel@lists.xensource.com Cc: Yinghai Lu <yhlu.kernel@gmail.com>
2009-01-02Merge branch 'cpus4096-for-linus-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits) x86: export vector_used_by_percpu_irq x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and() sched: nominate preferred wakeup cpu, fix x86: fix lguest used_vectors breakage, -v2 x86: fix warning in arch/x86/kernel/io_apic.c sched: fix warning in kernel/sched.c sched: move test_sd_parent() to an SMP section of sched.h sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0 sched: activate active load balancing in new idle cpus sched: bias task wakeups to preferred semi-idle packages sched: nominate preferred wakeup cpu sched: favour lower logical cpu number for sched_mc balance sched: framework for sched_mc/smt_power_savings=N sched: convert BALANCE_FOR_xx_POWER to inline functions x86: use possible_cpus=NUM to extend the possible cpus allowed x86: fix cpu_mask_to_apicid_and to include cpu_online_mask x86: update io_apic.c to the new cpumask code x86: Introduce topology_core_cpumask()/topology_thread_cpumask() x86: xen: use smp_call_function_many() x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c ... Fixed up trivial conflict in kernel/time/tick-sched.c manually
2008-12-26irq: simplify for_each_irq_desc() usageKOSAKI Motohiro
Impact: cleanup all for_each_irq_desc() usage point have !desc check. then its check can move into for_each_irq_desc() macro. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-18sparseirq, xen: make sure irq_desc is allocated for interruptsJeremy Fitzhardinge
Impact: fix crash Make sure all Xen irqs have an irq_desc. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-13Merge ../linux-2.6-x86Rusty Russell
Conflicts: arch/x86/kernel/io_apic.c kernel/sched.c kernel/sched_stats.h
2008-12-13cpumask: make irq_set_affinity() take a const struct cpumaskRusty Russell
Impact: change existing irq_chip API Not much point with gentle transition here: the struct irq_chip's setaffinity method signature needs to change. Fortunately, not widely used code, but hits a few architectures. Note: In irq_select_affinity() I save a temporary in by mangling irq_desc[irq].affinity directly. Ingo, does this break anything? (Folded in fix from KOSAKI Motohiro) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Grant Grundler <grundler@parisc-linux.org> Acked-by: Ingo Molnar <mingo@redhat.com> Cc: ralf@linux-mips.org Cc: grundler@parisc-linux.org Cc: jeremy@xensource.com Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
2008-12-08sparse irq_desc[] array: core kernel and x86 changesYinghai Lu
Impact: new feature Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with NR_CPUS set to large values. The goal is to be able to scale up to much larger NR_IRQS value without impacting the (important) common case. To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of irq_desc pointers. When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc, this also makes the IRQ descriptors NUMA-local (to the site that calls request_irq()). This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now uses desc->chip_data for x86 to store irq_cfg. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-23xen: compilation fix of drivers/xen/events.c on IA64Isaku Yamahata
use set_xen_guest_handle() instead of direct assigning. > linux-2.6/drivers/xen/events.c: In function 'xen_poll_irq': > linux-2.6/drivers/xen/events.c:757: error: incompatible types in assignment > make[4]: *** [drivers/xen/events.o] Error 1 Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-16genirq: use iterators for irq_desc loopsThomas Gleixner
Use for_each_irq_desc[_reverse] for all the iteration loops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-10-16xen: fix memory access violation bug when CONFIG_HAVE_SPARSE_IRQ is enabledAlex Nixon
When sparse IRQs are enabled, it is not safe to assume an IRQ descriptor exists for every possible IRQ. This patch causes init_evtchn_cpu_bindings to skip initialisation of IRQ descriptors which don't exist. Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-16generic: sparse irqs: use irq_desc() together with dyn_array, instead of ↵Yinghai Lu
irq_desc[] add CONFIG_HAVE_SPARSE_IRQ to for use condensed array. Get rid of irq_desc[] array assumptions. Preallocate 32 irq_desc, and irq_desc() will try to get more. ( No change in functionality is expected anywhere, except the odd build failure where we missed a code site or where a crossing commit itroduces new irq_desc[] usage. ) v2: according to Eric, change get_irq_desc() to irq_desc() Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-16drivers/xen: use nr_irqsYinghai Lu
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25xen: implement CPU hotpluggingAlex Nixon
Note the changes from 2.6.18-xen CPU hotplugging: A vcpu_down request from the remote admin via Xenbus both hotunplugs the CPU, and disables it by removing it from the cpu_present map, and removing its entry in /sys. A vcpu_up request from the remote admin only re-enables the CPU, and does not immediately bring the CPU up. A udev event is emitted, which can be caught by the user if he wishes to automatically re-up CPUs when available, or implement a more complex policy. Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21xen: save previous spinlock when blockingJeremy Fitzhardinge
A spinlock can be interrupted while spinning, so make sure we preserve the previous lock of interest if we're taking a lock from within an interrupt handler. We also need to deal with the case where the blocking path gets interrupted between testing to see if the lock is free and actually blocking. If we get interrupted there and end up in the state where the lock is free but the irq isn't pending, then we'll block indefinitely in the hypervisor. This fix is to make sure that any nested lock-takers will always leave the irq pending if there's any chance the outer lock became free. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-31xen: compile irq functions without -pg for ftraceJeremy Fitzhardinge
For some reason I managed to miss a bunch of irq-related functions which also need to be compiled without -pg when using ftrace. This patch moves them into their own file, and starts a cleanup process I've been meaning to do anyway. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: "Alex Nixon (Intern)" <Alex.Nixon@eu.citrix.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16xen: implement Xen-specific spinlocksJeremy Fitzhardinge
The standard ticket spinlocks are very expensive in a virtual environment, because their performance depends on Xen's scheduler giving vcpus time in the order that they're supposed to take the spinlock. This implements a Xen-specific spinlock, which should be much more efficient. The fast-path is essentially the old Linux-x86 locks, using a single lock byte. The locker decrements the byte; if the result is 0, then they have the lock. If the lock is negative, then locker must spin until the lock is positive again. When there's contention, the locker spin for 2^16[*] iterations waiting to get the lock. If it fails to get the lock in that time, it adds itself to the contention count in the lock and blocks on a per-cpu event channel. When unlocking the spinlock, the locker looks to see if there's anyone blocked waiting for the lock by checking for a non-zero waiter count. If there's a waiter, it traverses the per-cpu "lock_spinners" variable, which contains which lock each CPU is waiting on. It picks one CPU waiting on the lock and sends it an event to wake it up. This allows efficient fast-path spinlock operation, while allowing spinning vcpus to give up their processor time while waiting for a contended lock. [*] 2^16 iterations is threshold at which 98% locks have been taken according to Thomas Friebel's Xen Summit talk "Preventing Guests from Spinning Around". Therefore, we'd expect the lock and unlock slow paths will only be entered 2% of the time. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <clameter@linux-foundation.org> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: Virtualization <virtualization@lists.linux-foundation.org> Cc: Xen devel <xen-devel@lists.xensource.com> Cc: Thomas Friebel <thomas.friebel@amd.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20xen: Use wmb instead of rmb in xen_evtchn_do_upcall().Isaku Yamahata
This patch is ported one from 534:77db69c38249 of linux-2.6.18-xen.hg. Use wmb instead of rmb to enforce ordering between evtchn_upcall_pending and evtchn_pending_sel stores in xen_evtchn_do_upcall(). Cc: Samuel Thibault <samuel.thibault@eu.citrix.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-27xen: implement save/restoreJeremy Fitzhardinge
This patch implements Xen save/restore and migration. Saving is triggered via xenbus, which is polled in drivers/xen/manage.c. When a suspend request comes in, the kernel prepares itself for saving by: 1 - Freeze all processes. This is primarily to prevent any partially-completed pagetable updates from confusing the suspend process. If CONFIG_PREEMPT isn't defined, then this isn't necessary. 2 - Suspend xenbus and other devices 3 - Stop_machine, to make sure all the other vcpus are quiescent. The Xen tools require the domain to run its save off vcpu0. 4 - Within the stop_machine state, it pins any unpinned pgds (under construction or destruction), performs canonicalizes various other pieces of state (mostly converting mfns to pfns), and finally 5 - Suspend the domain Restore reverses the steps used to save the domain, ending when all the frozen processes are thawed. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27xen: fix unbind_from_irq()Jeremy Fitzhardinge
Rearrange the tests in unbind_from_irq() so that we can still unbind an irq even if the underlying event channel is bad. This allows a device driver to shuffle its irqs on save/restore before the underlying event channels have been fixed up. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-27xen: add rebind_evtchn_irqJeremy Fitzhardinge
Add rebind_evtchn_irq(), which will rebind an device driver's existing irq to a new event channel on restore. Since the new event channel will be masked and bound to vcpu0, we update the state accordingly and unmask the irq once everything is set up. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24xen: add resend_irq_on_evtchn() definition into events.cIsaku Yamahata
Define resend_irq_on_evtchn() which ia64/xen uses. Although it isn't used by current x86/xen code, it's arch generic so that put it into common code. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24Xen: make events.c portable for ia64/xen supportIsaku Yamahata
Remove x86 dependency in drivers/xen/events.c for ia64/xen support introducing include/asm/xen/events.h. Introduce xen_irqs_disabled() to hide regs->flags Introduce xen_do_IRQ() to hide regs->orig_ax. make enum ipi_vector definition arch specific. ia64/xen needs four vectors. Add one rmb() because on ia64 xchg() isn't barrier. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24xen: move events.c to drivers/xen for IA64/Xen supportIsaku Yamahata
move arch/x86/xen/events.c undedr drivers/xen to share codes with x86 and ia64. And minor adjustment to compile. ia64/xen also uses events.c Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>