summaryrefslogtreecommitdiff
path: root/arch/s390/kvm/interrupt.c
AgeCommit message (Collapse)Author
2016-09-08KVM: s390: allow 255 VCPUs when sca entries aren't usedDavid Hildenbrand
If the SCA entries aren't used by the hardware (no SIGPIF), we can simply not set the entries, stick to the basic sca and allow more than 64 VCPUs. To hinder any other facility from using these entries, let's properly provoke intercepts by not setting the MCN and keeping the entries unset. This effectively allows when running KVM under KVM (vSIE) or under z/VM to provide more than 64 VCPUs to a guest. Let's limit it to 255 for now, to not run into problems if the CPU numbers are limited somewhere else. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08KVM: s390: write external damage code on machine checksDavid Hildenbrand
Let's also write the external damage code already provided by struct kvm_s390_mchk_info. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08KVM: s390: fix delivery of vector regs during machine checksDavid Hildenbrand
Vector registers are only to be stored if the facility is available and if the guest has set up the machine check extended save area. If anything goes wrong while writing the vector registers, the vector registers are to be marked as invalid. Please note that we are allowed to write the registers although they are marked as invalid. Machine checks and "store status" SIGP orders are two different concepts, let's correctly separate these. As the SIGP part is completely handled in user space, we can drop it. This patch is based on a patch from Cornelia Huck. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08KVM: s390: split store status and machine check handlingDavid Hildenbrand
Store status writes the prefix which is not to be done by a machine check. Also, the psw is stored and later on overwritten by the failing-storage address, which looks strange at first sight. Store status and machine check handling look similar, but they are actually two different things. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08KVM: s390: factor out actual delivery of machine checksDavid Hildenbrand
Let's factor this out to prepare for bigger changes. Reorder to calls to match the logical order given in the PoP. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-07-14KVM: pass struct kvm to kvm_set_routing_entryRadim Krčmář
Arch-specific code will use it. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-21KVM: s390: vsie: speed up VCPU irq delivery when handling vsieDavid Hildenbrand
Whenever we want to wake up a VCPU (e.g. when injecting an IRQ), we have to kick it out of vsie, so the request will be handled faster. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10KVM: s390: fixup I/O interrupt tracesChristian Borntraeger
We currently have two issues with the I/O interrupt injection logging: 1. All QEMU versions up to 2.6 have a wrong encoding of device numbers etc for the I/O interrupt type, so the inject VM_EVENT will have wrong data. Let's fix this by using the interrupt parameters and not the interrupt type number. 2. We only log in kvm_s390_inject_vm, but not when coming from kvm_s390_reinject_io_int or from flic. Let's move the logging to the common __inject_io function. We also enhance the logging for delivery to match the data. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-13KVM: halt_polling: provide a way to qualify wakeups during pollChristian Borntraeger
Some wakeups should not be considered a sucessful poll. For example on s390 I/O interrupts are usually floating, which means that _ALL_ CPUs would be considered runnable - letting all vCPUs poll all the time for transactional like workload, even if one vCPU would be enough. This can result in huge CPU usage for large guests. This patch lets architectures provide a way to qualify wakeups if they should be considered a good/bad wakeups in regard to polls. For s390 the implementation will fence of halt polling for anything but known good, single vCPU events. The s390 implementation for floating interrupts does a wakeup for one vCPU, but the interrupt will be delivered by whatever CPU checks first for a pending interrupt. We prefer the woken up CPU by marking the poll of this CPU as "good" poll. This code will also mark several other wakeup reasons like IPI or expired timers as "good". This will of course also mark some events as not sucessful. As KVM on z runs always as a 2nd level hypervisor, we prefer to not poll, unless we are really sure, though. This patch successfully limits the CPU usage for cases like uperf 1byte transactional ping pong workload or wakeup heavy workload like OLTP while still providing a proper speedup. This also introduced a new vcpu stat "halt_poll_no_tuning" that marks wakeups that are considered not good for polling. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version) Cc: David Matlack <dmatlack@google.com> Cc: Wanpeng Li <kernellwp@gmail.com> [Rename config symbol. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-20KVM: s390: add clear I/O irq operation for FLICHalil Pasic
Introduce a FLIC operation for clearing I/O interrupts for a subchannel. Rationale: According to the platform specification, pending I/O interruption requests have to be revoked in certain situations. For instance, according to the Principles of Operation (page 17-27), a subchannel put into the installed parameters initialized state is in the same state as after an I/O system reset (just parameters possibly changed). This implies that any I/O interrupts for that subchannel are no longer pending (as I/O system resets clear I/O interrupts). Therefore, we need an interface to clear pending I/O interrupts. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-04-20KVM: s390: implement has_attr for FLICHalil Pasic
HAS_ATTR is useful for determining the supported attributes; let's implement it. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-16Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - Add the CPU id for the new z13s machine - Add a s390 specific XOR template for RAID-5 checksumming based on the XC instruction. Remove all other alternatives, XC is always faster - The merge of our four different stack tracers into a single one - Tidy up the code related to page tables, several large inline functions are now out-of-line. Bloat-o-meter reports ~11K text size reduction - A binary interface for the priviledged CLP instruction to retrieve the hardware view of the installed PCI functions - Improvements for the dasd format code - Bug fixes and cleanups * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (31 commits) s390/pci: enforce fmb page boundary rule s390: fix floating pointer register corruption (again) s390/cpumf: add missing lpp magic initialization s390: Fix misspellings in comments s390/mm: split arch/s390/mm/pgtable.c s390/mm: uninline pmdp_xxx functions from pgtable.h s390/mm: uninline ptep_xxx functions from pgtable.h s390/pci: add ioctl interface for CLP s390: Use pr_warn instead of pr_warning s390/dasd: remove casts to dasd_*_private s390/dasd: Refactor dasd format functions s390/dasd: Simplify code in format logic s390/dasd: Improve dasd format code s390/percpu: remove this_cpu_cmpxchg_double_4 s390/cpumf: Improve guest detection heuristics s390/fault: merge report_user_fault implementations s390/dis: use correct escape sequence for '%' character s390/kvm: simplify set_guest_storage_key s390/oprofile: add z13/z13s model numbers s390: add z13s model number to z13 elf platform ...
2016-03-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "One of the largest releases for KVM... Hardly any generic changes, but lots of architecture-specific updates. ARM: - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - various optimizations to the vgic save/restore code. PPC: - enabled KVM-VFIO integration ("VFIO device") - optimizations to speed up IPIs between vcpus - in-kernel handling of IOMMU hypercalls - support for dynamic DMA windows (DDW). s390: - provide the floating point registers via sync regs; - separated instruction vs. data accesses - dirty log improvements for huge guests - bugfixes and documentation improvements. x86: - Hyper-V VMBus hypercall userspace exit - alternative implementation of lowest-priority interrupts using vector hashing (for better VT-d posted interrupt support) - fixed guest debugging with nested virtualizations - improved interrupt tracking in the in-kernel IOAPIC - generic infrastructure for tracking writes to guest memory - currently its only use is to speedup the legacy shadow paging (pre-EPT) case, but in the future it will be used for virtual GPUs as well - much cleanup (LAPIC, kvmclock, MMU, PIT), including ubsan fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (217 commits) KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch KVM: x86: disable MPX if host did not enable MPX XSAVE features arm64: KVM: vgic-v3: Only wipe LRs on vcpu exit arm64: KVM: vgic-v3: Reset LRs at boot time arm64: KVM: vgic-v3: Do not save an LR known to be empty arm64: KVM: vgic-v3: Save maintenance interrupt state only if required arm64: KVM: vgic-v3: Avoid accessing ICH registers KVM: arm/arm64: vgic-v2: Make GICD_SGIR quicker to hit KVM: arm/arm64: vgic-v2: Only wipe LRs on vcpu exit KVM: arm/arm64: vgic-v2: Reset LRs at boot time KVM: arm/arm64: vgic-v2: Do not save an LR known to be empty KVM: arm/arm64: vgic-v2: Move GICH_ELRSR saving to its own function KVM: arm/arm64: vgic-v2: Save maintenance interrupt state only if required KVM: arm/arm64: vgic-v2: Avoid accessing GICH registers KVM: s390: allocate only one DMA page per VM KVM: s390: enable STFLE interpretation only if enabled for the guest KVM: s390: wake up when the VCPU cpu timer expires KVM: s390: step the VCPU timer while in enabled wait KVM: s390: protect VCPU cpu timer with a seqcount KVM: s390: step VCPU cpu timer during kvm_run ioctl ...
2016-03-08s390/mm: split arch/s390/mm/pgtable.cMartin Schwidefsky
The pgtable.c file is quite big, before it grows any larger split it into pgtable.c, pgalloc.c and gmap.c. In addition move the gmap related header definitions into the new gmap.h header and all of the pgste helpers from pgtable.h to pgtable.c. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-08KVM: s390: wake up when the VCPU cpu timer expiresDavid Hildenbrand
When the VCPU cpu timer expires, we have to wake up just like when the ckc triggers. For now, setting up a cpu timer in the guest and going into enabled wait will never lead to a wakeup. This patch fixes this problem. Just as for the ckc, we have to take care of waking up too early. We have to recalculate the sleep time and go back to sleep. Please note that the timer callback calls kvm_s390_get_cpu_timer() from interrupt context. As the timer is canceled when leaving handle_wait(), and we don't do any VCPU cpu timer writes/updates in that function, we can be sure that we will never try to read the VCPU cpu timer from the same cpu that is currentyl updating the timer (deadlock). Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Tested-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08KVM: s390: abstract access to the VCPU cpu timerDavid Hildenbrand
We want to manually step the cpu timer in certain scenarios in the future. Let's abstract any access to the cpu timer, so we can hide the complexity internally. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-25KVM: Use simple waitqueue for vcpu->wqMarcelo Tosatti
The problem: On -rt, an emulated LAPIC timer instances has the following path: 1) hard interrupt 2) ksoftirqd is scheduled 3) ksoftirqd wakes up vcpu thread 4) vcpu thread is scheduled This extra context switch introduces unnecessary latency in the LAPIC path for a KVM guest. The solution: Allow waking up vcpu thread from hardirq context, thus avoiding the need for ksoftirqd to be scheduled. Normal waitqueues make use of spinlocks, which on -RT are sleepable locks. Therefore, waking up a waitqueue waiter involves locking a sleeping lock, which is not allowed from hard interrupt context. cyclictest command line: This patch reduces the average latency in my tests from 14us to 11us. Daniel writes: Paolo asked for numbers from kvm-unit-tests/tscdeadline_latency benchmark on mainline. The test was run 1000 times on tip/sched/core 4.4.0-rc8-01134-g0905f04: ./x86-run x86/tscdeadline_latency.flat -cpu host with idle=poll. The test seems not to deliver really stable numbers though most of them are smaller. Paolo write: "Anything above ~10000 cycles means that the host went to C1 or lower---the number means more or less nothing in that case. The mean shows an improvement indeed." Before: min max mean std count 1000.000000 1000.000000 1000.000000 1000.000000 mean 5162.596000 2019270.084000 5824.491541 20681.645558 std 75.431231 622607.723969 89.575700 6492.272062 min 4466.000000 23928.000000 5537.926500 585.864966 25% 5163.000000 1613252.750000 5790.132275 16683.745433 50% 5175.000000 2281919.000000 5834.654000 23151.990026 75% 5190.000000 2382865.750000 5861.412950 24148.206168 max 5228.000000 4175158.000000 6254.827300 46481.048691 After min max mean std count 1000.000000 1000.00000 1000.000000 1000.000000 mean 5143.511000 2076886.10300 5813.312474 21207.357565 std 77.668322 610413.09583 86.541500 6331.915127 min 4427.000000 25103.00000 5529.756600 559.187707 25% 5148.000000 1691272.75000 5784.889825 17473.518244 50% 5160.000000 2308328.50000 5832.025000 23464.837068 75% 5172.000000 2393037.75000 5853.177675 24223.969976 max 5222.000000 3922458.00000 6186.720500 42520.379830 [Patch was originaly based on the swait implementation found in the -rt tree. Daniel ported it to mainline's version and gathered the benchmark numbers for tscdeadline_latency test.] Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-rt-users@vger.kernel.org Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1455871601-27484-4-git-send-email-wagi@monom.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-02-10KVM: s390: irq delivery should not rely on icptcodeDavid Hildenbrand
Program irq injection during program irq intercepts is the last candidates that injects nullifying irqs and relies on delivery to do the right thing. As we should not rely on the icptcode during any delivery (because that value will not be migrated), let's add a flag, telling prog IRQ delivery to not rewind the PSW in case of nullifying prog IRQs. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: migration / injection of prog irq ilcDavid Hildenbrand
We have to migrate the program irq ilc and someday we will have to specify the ilc without KVM trying to autodetect the value. Let's reuse one of the spare fields in our program irq that should always be set to 0 by user space. Because we also want to make use of 0 ilcs ("not available"), we need a validity indicator. If no valid ilc is given, we try to autodetect the ilc via the current icptcode and icptstatus + parameter and store the valid ilc in the irq structure. This has a nice effect: QEMU's making use of KVM_S390_IRQ / KVM_S390_SET_IRQ_STATE / KVM_S390_GET_IRQ_STATE for migration will directly migrate the ilc without any changes. Please note that we use bit 0 as validity and bit 1,2 for the ilc, so by applying the ilc mask we directly get the ilen which is usually what we work with. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: PSW forwarding / rewinding / ilc reworkDavid Hildenbrand
We have some confusion about ilc vs. ilen in our current code. So let's correctly use the term ilen when dealing with (ilc << 1). Program irq injection didn't take care of the correct ilc in case of irqs triggered by EXECUTE functions, let's provide one function kvm_s390_get_ilen() to take care of all that. Also, manually specifying in intercept handlers the size of the instruction (and sometimes overwriting that value for EXECUTE internally) doesn't make too much sense. So also provide the functions: - kvm_s390_retry_instr to retry the currently intercepted instruction - kvm_s390_rewind_psw to rewind the PSW without internal overwrites - kvm_s390_forward_psw to forward the PSW Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-01-13Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "Among the traditional bug fixes and cleanups are some improvements: - A tool to generated the facility lists, generating the bit fields by hand has been a source of bugs in the past - The spinlock loop is reordered to avoid bursts of hypervisor calls - Add support for the open-for-business interface to the service element - The get_cpu call is added to the vdso - A set of tracepoints is defined for the common I/O layer - The deprecated sclp_cpi module is removed - Update default configuration" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (56 commits) s390/sclp: fix possible control register corruption s390: fix normalization bug in exception table sorting s390/configs: update default configurations s390/vdso: optimize getcpu system call s390: drop smp_mb in vdso_init s390: rename struct _lowcore to struct lowcore s390/mem_detect: use unsigned longs s390/ptrace: get rid of long longs in psw_bits s390/sysinfo: add missing SYSIB 1.2.2 multithreading fields s390: get rid of CONFIG_SCHED_MC and CONFIG_SCHED_BOOK s390/Kconfig: remove pointless 64 bit dependencies s390/dasd: fix failfast for disconnected devices s390/con3270: testing return kzalloc retval s390/hmcdrv: constify hmcdrv_ftp_ops structs s390/cio: add NULL test s390/cio: Change I/O instructions from inline to normal functions s390/cio: Introduce common I/O layer tracepoints s390/cio: Consolidate inline assemblies and related data definitions s390/cio: Fix incorrect xsch opcode specification s390/cio: Remove unused inline assemblies ...
2016-01-11s390: rename struct _lowcore to struct lowcoreHeiko Carstens
Finally get rid of the leading underscore. I tried this already two or three years ago, however Michael Holzheu objected since this would break the crash utility (again). However Michael integrated support for the new name into the crash utility back then, so it doesn't break if the name will be changed now. So finally get rid of the ever confusing leading underscore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-30KVM: s390: fast path for sca_ext_call_pendingDavid Hildenbrand
If CPUSTAT_ECALL_PEND isn't set, we can't have an external call pending, so we can directly avoid taking the lock. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-11-30KVM: s390: Introduce switching codeEugene (jno) Dvurechenski
This patch adds code that performs transparent switch to Extended SCA on addition of 65th VCPU in a VM. Disposal of ESCA is added too. The entier ESCA functionality, however, is still not enabled. The enablement will be provided in a separate patch. This patch also uses read/write lock protection of SCA and its subfields for possible disposal at the BSCA-to-ESCA transition. While only Basic SCA needs such a protection (for the swap), any SCA access is now guarded. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-11-30KVM: s390: Make provisions for ESCA utilizationEugene (jno) Dvurechenski
This patch updates the routines (sca_*) to provide transparent access to and manipulation on the data for both Basic and Extended SCA in use. The kvm.arch.sca is generalized to (void *) to handle BSCA/ESCA cases. Also the kvm.arch.use_esca flag is provided. The actual functionality is kept the same. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-11-30KVM: s390: Introduce new structuresEugene (jno) Dvurechenski
This patch adds new structures and updates some existing ones to provide the base for Extended SCA functionality. The old sca_* structures were renamed to bsca_* to keep things uniform. The access to fields of SIGP controls were turned into bitfields instead of hardcoded bitmasks. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-11-30KVM: s390: Generalize access to SIGP controlsEugene (jno) Dvurechenski
This patch generalizes access to the SIGP controls, which is a part of SCA. This is to prepare for upcoming introduction of Extended SCA support. Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-11-19KVM: s390: fix wrong lookup of VCPUs by array indexDavid Hildenbrand
For now, VCPUs were always created sequentially with incrementing VCPU ids. Therefore, the index in the VCPUs array matched the id. As sequential creation might change with cpu hotplug, let's use the correct lookup function to find a VCPU by id, not array index. Let's also use kvm_lookup_vcpu() for validation of the sending VCPU on external call injection. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org # db27a7a KVM: Provide function for VCPU lookup by id
2015-11-19KVM: s390: avoid memory overwrites on emergency signal injectionDavid Hildenbrand
Commit 383d0b050106 ("KVM: s390: handle pending local interrupts via bitmap") introduced a possible memory overwrite from user space. User space could pass an invalid emergency signal code (sending VCPU) and therefore exceed the bitmap. Let's take care of this case and check that the id is in the valid range. Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v3.19+ db27a7a KVM: Provide function for VCPU lookup by id Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: factor out reading of the guest TOD clockDavid Hildenbrand
Let's factor this out and always use get_tod_clock_fast() when reading the guest TOD. STORE CLOCK FAST does not do serialization and, therefore, might result in some fuzziness between different processors in a way that subsequent calls on different CPUs might have time stamps that are earlier. This semantics is fine though for all KVM use cases. To make it obvious that the new function has STORE CLOCK FAST semantics we name it kvm_s390_get_tod_clock_fast. With this patch, we only have a handful of places were we have to care about STP sync (using preempt_disable() logic). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: correctly handle injection of pgm irqs and per eventsDavid Hildenbrand
PER events can always co-exist with other program interrupts. For now, we always overwrite all program interrupt parameters when injecting any type of program interrupt. Let's handle that correctly by only overwriting the relevant portion of the program interrupt parameters. Therefore we can now inject PER events and ordinary program interrupts concurrently, resulting in no loss of program interrupts. This will especially by helpful when manually detecting PER events later - as both types might be triggered during one SIE exit. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: simplify in-kernel program irq injectionDavid Hildenbrand
The main reason to keep program injection in kernel separated until now was that we were able to do some checking, if really only the owning thread injects program interrupts (via waitqueue_active(li->wq)). This BUG_ON was never triggered and the chances of really hitting it, if another thread injected a program irq to another vcpu, were very small. Let's drop this check and turn kvm_s390_inject_program_int() and kvm_s390_inject_prog_irq() into simple inline functions that makes use of kvm_s390_inject_vcpu(). __must_check can be dropped as they are implicitely given by kvm_s390_inject_vcpu(), to avoid ugly long function prototypes. Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: drop out early in kvm_s390_has_irq()David Hildenbrand
Let's get rid of the local variable and exit directly if we found any pending interrupt. This is not only faster, but also better readable. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: kvm_arch_vcpu_runnable already cares about timer interruptsDavid Hildenbrand
We can remove that double check. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: set interception requests for all floating irqsDavid Hildenbrand
No need to separate pending and floating irqs when setting interception requests. Let's do it for all equally. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: disabled wait cares about machine checks, not PERDavid Hildenbrand
We don't care about program event recording irqs (synchronous program irqs) but asynchronous irqs when checking for disabled wait. Machine checks were missing. Let's directly switch to the functions we have for that purpose instead of testing once again for magic bits. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-10-13KVM: s390: remove unused variable in __inject_vmChristian Borntraeger
the float int structure is no longer used in __inject_vm. Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-09-03Merge branch 'locking-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking and atomic updates from Ingo Molnar: "Main changes in this cycle are: - Extend atomic primitives with coherent logic op primitives (atomic_{or,and,xor}()) and deprecate the old partial APIs (atomic_{set,clear}_mask()) The old ops were incoherent with incompatible signatures across architectures and with incomplete support. Now every architecture supports the primitives consistently (by Peter Zijlstra) - Generic support for 'relaxed atomics': - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return() - atomic_read_acquire() - atomic_set_release() This came out of porting qwrlock code to arm64 (by Will Deacon) - Clean up the fragile static_key APIs that were causing repeat bugs, by introducing a new one: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. To be able to know the 'type' of the static key we encode it in the jump entry (by Peter Zijlstra) - Static key self-tests (by Jason Baron) - qrwlock optimizations (by Waiman Long) - small futex enhancements (by Davidlohr Bueso) - ... and misc other changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) jump_label/x86: Work around asm build bug on older/backported GCCs locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics locking/qrwlock: Implement queue_write_unlock() using smp_store_release() locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t' locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic locking/static_keys: Make verify_keys() static jump label, locking/static_keys: Update docs locking/static_keys: Provide a selftest jump_label: Provide a self-test s390/uaccess, locking/static_keys: employ static_branch_likely() x86, tsc, locking/static_keys: Employ static_branch_likely() locking/static_keys: Add selftest locking/static_keys: Add a new static_key interface locking/static_keys: Rework update logic locking/static_keys: Add static_key_{en,dis}able() helpers ...
2015-08-04KVM: s390: host STP toleration for VMsFan Zhang
If the host has STP enabled, the TOD of the host will be changed during synchronization phases. These are performed during a stop_machine() call. As the guest TOD is based on the host TOD, we have to make sure that: - no VCPU is in the SIE (implicitly guaranteed via stop_machine()) - manual guest TOD calculations are not affected "Epoch" is the guest TOD clock delta to the host TOD clock. We have to adjust that value during the STP synchronization and make sure that code that accesses the epoch won't get interrupted in between (via disabling preemption). Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-07-29KVM: s390: more irq names for trace eventsDavid Hildenbrand
This patch adds names for missing irq types to the trace events. In order to identify adapter irqs, the define is moved from interrupt.c to the other basic irq defines in uapi/linux/kvm.h. Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-07-29KVM: s390: Fixup interrupt vcpu event messages and levelsChristian Borntraeger
This reworks the debug logging for interrupt related logs. Several changes: - unify program int/irq - improve decoding (e.g. use mcic instead of parm64 for machine check injection) - remove useless interrupt type number (the name is enough) - rename "interrupt:" to "deliver:" as the other side is called "inject" - use log level 3 for state changing and/or seldom events (like machine checks, restart..) - use log level 4 for frequent events - use 0x prefix for hex numbers - add pfault done logging - move some tracing outside spinlock Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
2015-07-29KVM: s390: remove "from (user|kernel)" from irq injection messagesDavid Hildenbrand
The "from user"/"from kernel" part of the log/trace messages is not always correct anymore and therefore not really helpful. Let's remove that part from the log + trace messages. For program interrupts, we can now move the logging/tracing part into the real injection function, as already done for the other injection functions. Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-07-29KVM: s390: VCPU_EVENT cleanup for prefix changesChristian Borntraeger
SPX (SET PREFIX) and SIGP (Set prefix) can change the prefix register of a CPU. As sigp set prefix may be handled in user space (KVM_CAP_S390_USER_SIGP), we would not log the changes triggered via SIGP in that case. Let's have just one VCPU_EVENT at the central location that tracks prefix changes. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
2015-07-27atomic: Replace atomic_{set,clear}_mask() usagePeter Zijlstra
Replace the deprecated atomic_{set,clear}_mask() usage with the now ubiquous atomic_{or,andnot}() functions. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull first batch of KVM updates from Paolo Bonzini: "The bulk of the changes here is for x86. And for once it's not for silicon that no one owns: these are really new features for everyone. Details: - ARM: several features are in progress but missed the 4.2 deadline. So here is just a smattering of bug fixes, plus enabling the VFIO integration. - s390: Some fixes/refactorings/optimizations, plus support for 2GB pages. - x86: * host and guest support for marking kvmclock as a stable scheduler clock. * support for write combining. * support for system management mode, needed for secure boot in guests. * a bunch of cleanups required for the above * support for virtualized performance counters on AMD * legacy PCI device assignment is deprecated and defaults to "n" in Kconfig; VFIO replaces it On top of this there are also bug fixes and eager FPU context loading for FPU-heavy guests. - Common code: Support for multiple address spaces; for now it is used only for x86 SMM but the s390 folks also have plans" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits) KVM: s390: clear floating interrupt bitmap and parameters KVM: x86/vPMU: Enable PMU handling for AMD PERFCTRn and EVNTSELn MSRs KVM: x86/vPMU: Implement AMD vPMU code for KVM KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch KVM: x86/vPMU: introduce kvm_pmu_msr_idx_to_pmc KVM: x86/vPMU: reorder PMU functions KVM: x86/vPMU: whitespace and stylistic adjustments in PMU code KVM: x86/vPMU: use the new macros to go between PMC, PMU and VCPU KVM: x86/vPMU: introduce pmu.h header KVM: x86/vPMU: rename a few PMU functions KVM: MTRR: do not map huge page for non-consistent range KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type KVM: MTRR: introduce mtrr_for_each_mem_type KVM: MTRR: introduce fixed_mtrr_addr_* functions KVM: MTRR: sort variable MTRRs KVM: MTRR: introduce var_mtrr_range KVM: MTRR: introduce fixed_mtrr_segment table KVM: MTRR: improve kvm_mtrr_get_guest_memory_type KVM: MTRR: do not split 64 bits MSR content KVM: MTRR: clean up mtrr default type ...
2015-06-23KVM: s390: clear floating interrupt bitmap and parametersJens Freimann
commit 6d3da24141 ("KVM: s390: deliver floating interrupts in order of priority") introduced a regression for the reset handling. We don't clear the bitmap of pending floating interrupts and interrupt parameters. This could result in stale interrupts even after a reset. Let's fix this by clearing the pending bitmap and the parameters for service and machine check interrupts. Cc: stable@vger.kernel.org # 4.1 Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-05-13s390/sclp: unify basic sclp access by exposing "struct sclp"David Hildenbrand
Let's unify basic access to sclp fields by storing the data in an external struct in asm/sclp.h. The values can now directly be accessed by other components, so there is no need for most accessor functions and external variables anymore. The mtid, mtid_max and facility part will be cleaned up separately. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-05-08KVM: s390: factor out and optimize floating irq VCPU kickDavid Hildenbrand
This patch factors out the search for a floating irq destination VCPU as well as the kicking of the found VCPU. The search is optimized in the following ways: 1. stopped VCPUs can't take any floating interrupts, so try to find an operating one. We have to take care of the special case where all VCPUs are stopped and we don't have any valid destination. 2. use online_vcpus, not KVM_MAX_VCPU. This speeds up the search especially if KVM_MAX_VCPU is increased one day. As these VCPU objects are initialized prior to increasing online_vcpus, we can be sure that they exist. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-05-08KVM: s390: optimize interrupt handling round trip timeJens Freimann
We can avoid checking guest control registers and guest PSW as well as all the masking and calculations on the interrupt masks when no interrupts are pending. Also, the check for IRQ_PEND_COUNT can be removed, because we won't enter the while loop if no interrupts are pending and invalid interrupt types can't be injected. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-05-08KVM: s390: fix external call injection without sigp interpretationDavid Hildenbrand
Commit ea5f49692575 ("KVM: s390: only one external call may be pending at a time") introduced a bug on machines that don't have SIGP interpretation facility installed. The injection of an external call will now always fail with -EBUSY (if none is already pending). This leads to the following symptoms: - An external call will be injected but with the wrong "src cpu id", as this id will not be remembered. - The target vcpu will not be woken up, therefore the guest will hang if it cannot deal with unexpected failures of the SIGP EXTERNAL CALL instruction. - If an external call is already pending, -EBUSY will not be reported. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.0 Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>