summaryrefslogtreecommitdiff
path: root/Documentation/virtual
AgeCommit message (Collapse)Author
2015-02-18Merge tag 'virtio-next-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio updates from Rusty Russell: "OK, this has the big virtio 1.0 implementation, as specified by OASIS. On top of tht is the major rework of lguest, to use PCI and virtio 1.0, to double-check the implementation. Then comes the inevitable fixes and cleanups from that work" * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (80 commits) virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice. virtio_net: unconditionally define struct virtio_net_hdr_v1. tools/lguest: don't use legacy definitions for net device in example launcher. virtio: Don't expose legacy net features when VIRTIO_NET_NO_LEGACY defined. tools/lguest: use common error macros in the example launcher. tools/lguest: give virtqueues names for better error messages tools/lguest: more documentation and checking of virtio 1.0 compliance. lguest: don't look in console features to find emerg_wr. tools/lguest: don't start devices until DRIVER_OK status set. tools/lguest: handle indirect partway through chain. tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI) tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI) tools/lguest: rename virtio_pci_cfg_cap field to match spec. tools/lguest: fix features_accepted logic in example launcher. tools/lguest: handle device reset correctly in example launcher. virtual: Documentation: simplify and generalize paravirt_ops.txt lguest: remove NOTIFY call and eventfd facility. lguest: remove NOTIFY facility from demonstration launcher. lguest: use the PCI console device's emerg_wr for early boot messages. lguest: always put console in PCI slot #1. ...
2015-02-13virtual: Documentation: simplify and generalize paravirt_ops.txtLuis R. Rodriguez
The general documentation we have for pv_ops is currenty present on the IA64 docs, but since this documentation covers IA64 xen enablement and IA64 Xen support got ripped out a while ago through commit d52eefb47 present since v3.14-rc1 lets just simplify, generalize and move the pv_ops documentation to a shared place. Cc: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Alok Kataria <akataria@vmware.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: virtualization@lists.linux-foundation.org Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: xen-devel@lists.xenproject.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-09KVM: s390: add cpu model supportMichael Mueller
This patch enables cpu model support in kvm/s390 via the vm attribute interface. During KVM initialization, the host properties cpuid, IBC value and the facility list are stored in the architecture specific cpu model structure. During vcpu setup, these properties are taken to initialize the related SIE state. This mechanism allows to adjust the properties from user space and thus to implement different selectable cpu models. This patch uses the IBC functionality to block instructions that have not been implemented at the requested CPU type and GA level compared to the full host capability. Userspace has to initialize the cpu model before vcpu creation. A cpu model change of running vcpus is not possible. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-01-23Merge tag 'kvm-s390-next-20150122' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next KVM: s390: fixes and features for kvm/next (3.20) 1. Generic - sparse warning (make function static) - optimize locking - bugfixes for interrupt injection - fix MVPG addressing modes 2. hrtimer/wakeup fun A recent change can cause KVM hangs if adjtime is used in the host. The hrtimer might wake up too early or too late. Too early is fatal as vcpu_block will see that the wakeup condition is not met and sleep again. This CPU might never wake up again. This series addresses this problem. adjclock slowing down the host clock will result in too late wakeups. This will require more work. In addition to that we also change the hrtimer from REALTIME to MONOTONIC to avoid similar problems with timedatectl set-time. 3. sigp rework We will move all "slow" sigps to QEMU (protected with a capability that can be enabled) to avoid several races between concurrent SIGP orders. 4. Optimize the shadow page table Provide an interface to announce the maximum guest size. The kernel will use that to make the pagetable 2,3,4 (or theoretically) 5 levels. 5. Provide an interface to set the guest TOD We now use two vm attributes instead of two oneregs, as oneregs are vcpu ioctl and we don't want to call them from other threads. 6. Protected key functions The real HMC allows to enable/disable protected key CPACF functions. Lets provide an implementation + an interface for QEMU to activate this the protected key instructions.
2015-01-23KVM: s390: forward most SIGP orders to user spaceDavid Hildenbrand
Most SIGP orders are handled partially in kernel and partially in user space. In order to: - Get a correct SIGP SET PREFIX handler that informs user space - Avoid race conditions between concurrently executed SIGP orders - Serialize SIGP orders per VCPU We need to handle all "slow" SIGP orders in user space. The remaining ones to be handled completely in kernel are: - SENSE - SENSE RUNNING - EXTERNAL CALL - EMERGENCY SIGNAL - CONDITIONAL EMERGENCY SIGNAL According to the PoP, they have to be fast. They can be executed without conflicting to the actions of other pending/concurrently executing orders (e.g. STOP vs. START). This patch introduces a new capability that will - when enabled - forward all but the mentioned SIGP orders to user space. The instruction counters in the kernel are still updated. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-01-23KVM: s390: new parameter for SIGP STOP irqsDavid Hildenbrand
In order to get rid of the action_flags and to properly migrate pending SIGP STOP irqs triggered e.g. by SIGP STOP AND STORE STATUS, we need to remember whether to store the status when stopping. For this reason, a new parameter (flags) for the SIGP STOP irq is introduced. These flags further define details of the requested STOP and can be easily migrated. Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-01-23KVM: s390: Allow userspace to limit guest memory sizeDominik Dingel
With commit c6c956b80bdf ("KVM: s390/mm: support gmap page tables with less than 5 levels") we are able to define a limit for the guest memory size. As we round up the guest size in respect to the levels of page tables we get to guest limits of: 2048 MB, 4096 GB, 8192 TB and 16384 PB. We currently limit the guest size to 16 TB, which means we end up creating a page table structure supporting guest sizes up to 8192 TB. This patch introduces an interface that allows userspace to tune this limit. This may bring performance improvements for small guests. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2015-01-20arm/arm64: KVM: force alignment of VGIC dist/CPU/redist addressesAndre Przywara
Although the GIC architecture requires us to map the MMIO regions only at page aligned addresses, we currently do not enforce this from the kernel side. Restrict any vGICv2 regions to be 4K aligned and any GICv3 regions to be 64K aligned. Document this requirement. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-20arm/arm64: KVM: allow userland to request a virtual GICv3Andre Przywara
With all of the GICv3 code in place now we allow userland to ask the kernel for using a virtual GICv3 in the guest. Also we provide the necessary support for guests setting the memory addresses for the virtual distributor and redistributors. This requires some userland code to make use of that feature and explicitly ask for a virtual GICv3. Document that KVM_CREATE_IRQCHIP only works for GICv2, but is considered legacy and using KVM_CREATE_DEVICE is preferred. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-01-11KVM: arm/arm64: vgic: add init entry to VGIC KVM deviceEric Auger
Since the advent of VGIC dynamic initialization, this latter is initialized quite late on the first vcpu run or "on-demand", when injecting an IRQ or when the guest sets its registers. This initialization could be initiated explicitly much earlier by the users-space, as soon as it has provided the requested dimensioning parameters. This patch adds a new entry to the VGIC KVM device that allows the user to manually request the VGIC init: - a new KVM_DEV_ARM_VGIC_GRP_CTRL group is introduced. - Its first attribute is KVM_DEV_ARM_VGIC_CTRL_INIT The rationale behind introducing a group is to be able to add other controls later on, if needed. Signed-off-by: Eric Auger <eric.auger@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-15Merge tag 'kvm-arm-for-3.19-take2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD Second round of changes for KVM for arm/arm64 for v3.19; fixes reboot problems, clarifies VCPU init, and fixes a regression concerning the VGIC init flow. Conflicts: arch/ia64/kvm/kvm-ia64.c [deleted in HEAD and modified in kvmarm]
2014-12-13arm/arm64: KVM: Turn off vcpus on PSCI shutdown/rebootChristoffer Dall
When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus should really be turned off for the VM adhering to the suggestions in the PSCI spec, and it's the sane thing to do. Also, clarify the behavior and expectations for exits to user space with the KVM_EXIT_SYSTEM_EVENT case. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-13arm/arm64: KVM: Clarify KVM_ARM_VCPU_INIT ABIChristoffer Dall
It is not clear that this ioctl can be called multiple times for a given vcpu. Userspace already does this, so clarify the ABI. Also specify that userspace is expected to always make secondary and subsequent calls to the ioctl with the same parameters for the VCPU as the initial call (which userspace also already does). Add code to check that userspace doesn't violate that ABI in the future, and move the kvm_vcpu_set_target() function which is currently duplicated between the 32-bit and 64-bit versions in guest.c to a common static function in arm.c, shared between both architectures. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-12-13arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off optionChristoffer Dall
The implementation of KVM_ARM_VCPU_INIT is currently not doing what userspace expects, namely making sure that a vcpu which may have been turned off using PSCI is returned to its initial state, which would be powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag. Implement the expected functionality and clarify the ABI. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-11-20kvm: Documentation: remove ia64Tiejun Chen
kvm/ia64 is gone, clean up Documentation too. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-07Merge tag 'kvm-s390-next-20141107' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes for kvm/next (3.19) and stable 1. We should flush TLBs for load control instruction emulation (stable) 2. A workaround for a compiler bug that renders ACCESS_ONCE broken (stable) 3. Fix program check handling for load control 4. Documentation Fix
2014-11-07KVM: fix vm device attribute documentationDominik Dingel
Documentation uses incorrect attribute names for some vm device attributes: fix this. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-03kvm: drop unsupported capabilities, fix documentationMichael S. Tsirkin
No kernel ever reported KVM_CAP_DEVICE_MSIX, KVM_CAP_DEVICE_MSI, KVM_CAP_DEVICE_ASSIGNMENT, KVM_CAP_DEVICE_DEASSIGNMENT. This makes the documentation wrong, and no application ever written to use these capabilities has a chance to work correctly. The only way to detect support is to try, and test errno for ENOTTY. That's unfortunate, but we can't fix the past. Document the actual semantics, and drop the definitions from the exported header to make it easier for application developers to note and fix the bug. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-03Documentation: virtual: kvm: correct one bit description in APF caseTiejun Chen
When commit 6adba5274206 (KVM: Let host know whether the guest can handle async PF in non-userspace context.) is introduced, actually bit 2 still is reserved and should be zero. Instead, bit 1 is 1 to indicate if asynchronous page faults can be injected when vcpu is in cpl == 0, and also please see this, in the file kvm_para.h, #define KVM_ASYNC_PF_SEND_ALWAYS (1 << 1). Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-27Merge tag 'kvm-arm-for-3.18' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next Changes for KVM for arm/arm64 for 3.18 This includes a bunch of changes: - Support read-only memory slots on arm/arm64 - Various changes to fix Sparse warnings - Correctly detect write vs. read Stage-2 faults - Various VGIC cleanups and fixes - Dynamic VGIC data strcuture sizing - Fix SGI set_clear_pend offset bug - Fix VTTBR_BADDR Mask - Correctly report the FSC on Stage-2 faults Conflicts: virt/kvm/eventfd.c [duplicate, different patch where the kvm-arm version broke x86. The kvm tree instead has the right one]
2014-09-22KVM: PPC: BOOKE: Add one_reg documentation of SPRG9 and DBSRBharat Bhushan
This was missed in respective one_reg implementation patch. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-18arm/arm64: KVM: vgic: make number of irqs a configurable attributeMarc Zyngier
In order to make the number of interrupts configurable, use the new fancy device management API to add KVM_DEV_ARM_VGIC_GRP_NR_IRQS as a VGIC configurable attribute. Userspace can now specify the exact size of the GIC (by increments of 32 interrupts). Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2014-09-10KVM: fix api documentation of KVM_GET_EMULATED_CPUIDAlex Bennée
It looks like when this was initially merged it got accidentally included in the following section. I've just moved it back in the correct section and re-numbered it as other ioctls have been added since. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-10KVM: document KVM_SET_GUEST_DEBUG apiAlex Bennée
In preparation for working on the ARM implementation I noticed the debug interface was missing from the API document. I've pieced together the expected behaviour from the code and commit messages written it up as best I can. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-09-03kvm: fix potentially corrupt mmio cacheDavid Matlack
vcpu exits and memslot mutations can run concurrently as long as the vcpu does not aquire the slots mutex. Thus it is theoretically possible for memslots to change underneath a vcpu that is handling an exit. If we increment the memslot generation number again after synchronize_srcu_expedited(), vcpus can safely cache memslot generation without maintaining a single rcu_dereference through an entire vm exit. And much of the x86/kvm code does not maintain a single rcu_dereference of the current memslots during each exit. We can prevent the following case: vcpu (CPU 0) | thread (CPU 1) --------------------------------------------+-------------------------- 1 vm exit | 2 srcu_read_unlock(&kvm->srcu) | 3 decide to cache something based on | old memslots | 4 | change memslots | (increments generation) 5 | synchronize_srcu(&kvm->srcu); 6 retrieve generation # from new memslots | 7 tag cache with new memslot generation | 8 srcu_read_unlock(&kvm->srcu) | ... | <action based on cache occurs even | though the caching decision was based | on the old memslots> | ... | <action *continues* to occur until next | memslot generation change, which may | be never> | | By incrementing the generation after synchronizing with kvm->srcu readers, we ensure that the generation retrieved in (6) will become invalid soon after (8). Keeping the existing increment is not strictly necessary, but we do keep it and just move it for consistency from update_memslots to install_new_memslots. It invalidates old cached MMIOs immediately, instead of having to wait for the end of synchronize_srcu_expedited, which makes the code more clearly correct in case CPU 1 is preempted right after synchronize_srcu() returns. To avoid halving the generation space in SPTEs, always presume that the low bit of the generation is zero when reconstructing a generation number out of an SPTE. This effectively disables MMIO caching in SPTEs during the call to synchronize_srcu_expedited. Using the low bit this way is somewhat like a seqcount---where the protected thing is a cache, and instead of retrying we can simply punt if we observe the low bit to be 1. Cc: stable@vger.kernel.org Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25KVM: clarify the idea of kvm_dirty_regsDavid Hildenbrand
This patch clarifies that kvm_dirty_regs are just a hint to the kernel and that the kernel might just ignore some flags and sync the values (like done for acrs and gprs now). Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-08-05Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvmPaolo Bonzini
Patch queue for ppc - 2014-08-01 Highlights in this release include: - BookE: Rework instruction fetch, not racy anymore now - BookE HV: Fix ONE_REG accessors for some in-hardware registers - Book3S: Good number of LE host fixes, enable HV on LE - Book3S: Some misc bug fixes - Book3S HV: Add in-guest debug support - Book3S HV: Preload cache lines on context switch - Remove 440 support Alexander Graf (31): KVM: PPC: Book3s PR: Disable AIL mode with OPAL KVM: PPC: Book3s HV: Fix tlbie compile error KVM: PPC: Book3S PR: Handle hyp doorbell exits KVM: PPC: Book3S PR: Fix ABIv2 on LE KVM: PPC: Book3S PR: Fix sparse endian checks PPC: Add asm helpers for BE 32bit load/store KVM: PPC: Book3S HV: Make HTAB code LE host aware KVM: PPC: Book3S HV: Access guest VPA in BE KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE KVM: PPC: Book3S HV: Access XICS in BE KVM: PPC: Book3S HV: Fix ABIv2 on LE KVM: PPC: Book3S HV: Enable for little endian hosts KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct KVM: PPC: Deflect page write faults properly in kvmppc_st KVM: PPC: Book3S: Stop PTE lookup on write errors KVM: PPC: Book3S: Add hack for split real mode KVM: PPC: Book3S: Make magic page properly 4k mappable KVM: PPC: Remove 440 support KVM: Rename and add argument to check_extension KVM: Allow KVM_CHECK_EXTENSION on the vm fd KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode KVM: PPC: Implement kvmppc_xlate for all targets KVM: PPC: Move kvmppc_ld/st to common code KVM: PPC: Remove kvmppc_bad_hva() KVM: PPC: Use kvm_read_guest in kvmppc_ld KVM: PPC: Handle magic page in kvmppc_ld/st KVM: PPC: Separate loadstore emulation from priv emulation KVM: PPC: Expose helper functions for data/inst faults KVM: PPC: Remove DCR handling KVM: PPC: HV: Remove generic instruction emulation KVM: PPC: PR: Handle FSCR feature deselects Alexey Kardashevskiy (1): KVM: PPC: Book3S: Fix LPCR one_reg interface Aneesh Kumar K.V (4): KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation KVM: PPC: BOOK3S: PR: Emulate virtual timebase register KVM: PPC: BOOK3S: PR: Emulate instruction counter KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page Anton Blanchard (2): KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() Bharat Bhushan (10): kvm: ppc: bookehv: Added wrapper macros for shadow registers kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1 kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR kvm: ppc: booke: Add shared struct helpers of SPRN_ESR kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7 kvm: ppc: Add SPRN_EPR get helper function kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit KVM: PPC: Booke-hv: Add one reg interface for SPRG9 KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr Michael Neuling (1): KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling Mihai Caraman (8): KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule KVM: PPC: e500: Fix default tlb for victim hint KVM: PPC: e500: Emulate power management control SPR KVM: PPC: e500mc: Revert "add load inst fixup" KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 KVM: PPC: Book3s: Remove kvmppc_read_inst() function KVM: PPC: Allow kvmppc_get_last_inst() to fail KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Paul Mackerras (4): KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication Stewart Smith (2): Split out struct kvmppc_vcore creation to separate function Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8 Conflicts: Documentation/virtual/kvm/api.txt
2014-07-28KVM: PPC: Remove DCR handlingAlexander Graf
DCR handling was only needed for 440 KVM. Since we removed it, we can also remove handling of DCR accesses. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: Allow KVM_CHECK_EXTENSION on the vm fdAlexander Graf
The KVM_CHECK_EXTENSION is only available on the kvm fd today. Unfortunately on PPC some of the capabilities change depending on the way a VM was created. So instead we need a way to expose capabilities as VM ioctl, so that we can see which VM type we're using (HV or PR). To enable this, add the KVM_CHECK_EXTENSION ioctl to our vm ioctl portfolio. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-28KVM: PPC: Book3S: Fix LPCR one_reg interfaceAlexey Kardashevskiy
Unfortunately, the LPCR got defined as a 32-bit register in the one_reg interface. This is unfortunate because KVM allows userspace to control the DPFD (default prefetch depth) field, which is in the upper 32 bits. The result is that DPFD always get set to 0, which reduces performance in the guest. We can't just change KVM_REG_PPC_LPCR to be a 64-bit register ID, since that would break existing userspace binaries. Instead we define a new KVM_REG_PPC_LPCR_64 id which is 64-bit. Userspace can still use the old KVM_REG_PPC_LPCR id, but it now only modifies those fields in the bottom 32 bits that userspace can modify (ILE, TC and AIL). If userspace uses the new KVM_REG_PPC_LPCR_64 id, it can modify DPFD as well. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabledPaul Mackerras
This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL capability is used to enable or disable in-kernel handling of an hcall, that the hcall is actually implemented by the kernel. If not an EINVAL error is returned. This also checks the default-enabled list of hcalls and prints a warning if any hcall there is not actually implemented. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-28KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handlingPaul Mackerras
This provides a way for userspace controls which sPAPR hcalls get handled in the kernel. Each hcall can be individually enabled or disabled for in-kernel handling, except for H_RTAS. The exception for H_RTAS is because userspace can already control whether individual RTAS functions are handled in-kernel or not via the KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for H_RTAS is out of the normal sequence of hcall numbers. Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM. The args field of the struct kvm_enable_cap specifies the hcall number in args[0] and the enable/disable flag in args[1]; 0 means disable in-kernel handling (so that the hcall will always cause an exit to userspace) and 1 means enable. Enabling or disabling in-kernel handling of an hcall is effective across the whole VM. The ability for KVM_ENABLE_CAP to be used on a VM file descriptor on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM capability advertises that this ability exists. When a VM is created, an initial set of hcalls are enabled for in-kernel handling. The set that is enabled is the set that have an in-kernel implementation at this point. Any new hcall implementations from this point onwards should not be added to the default set without a good reason. No distinction is made between real-mode and virtual-mode hcall implementations; the one setting controls them both. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-22Merge tag 'kvm-s390-20140721' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next Bugfixes -------- - add IPTE to trace event decoder - document and advertise KVM_CAP_S390_IRQCHIP Cleanups -------- - Reuse kvm_vcpu_block for s390 - Get rid of tasklet for wakup processing
2014-07-21Merge tag 'kvm-s390-20140715' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next This series enables the "KVM_(S|G)ET_MP_STATE" ioctls on s390 to make the cpu state settable by user space. This is necessary to avoid races in s390 SIGP/reset handling which happen because some SIGPs are handled in QEMU, while others are handled in the kernel. Together with the busy conditions as return value of SIGP races happen especially in areas like starting and stopping of CPUs. (For example, there is a program 'cpuplugd', that runs on several s390 distros which does automatic onlining and offlining on cpus.) As soon as the MPSTATE interface is used, user space takes complete control of the cpu states. Otherwise the kernel will use the old way. Therefore, the new kernel continues to work fine with old QEMUs.
2014-07-21KVM: s390: document KVM_CAP_S390_IRQCHIPCornelia Huck
Let's document that this is a capability that may be enabled per-vm. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-07-21KVM: document target of capability enablementCornelia Huck
Capabilities can be enabled on a vcpu or (since recently) on a vm. Document this and note for the existing capabilites whether they are per-vcpu or per-vm. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-07-10KVM: s390: implement KVM_(S|G)ET_MP_STATE for user space state controlDavid Hildenbrand
This patch - adds s390 specific MP states to linux headers and documents them - implements the KVM_{SET,GET}_MP_STATE ioctls - enables KVM_CAP_MP_STATE - allows user space to control the VCPU state on s390. If user space sets the VCPU state using the ioctl KVM_SET_MP_STATE, we can disable manual changing of the VCPU state and trust user space to do the right thing. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-07-10KVM: prepare for KVM_(S|G)ET_MP_STATE on other architecturesDavid Hildenbrand
Highlight the aspects of the ioctls that are actually specific to x86 and ia64. As defined restrictions (irqchip) and mp states may not apply to other architectures, these parts are flagged to belong to x86 and ia64. In preparation for the use of KVM_(S|G)ET_MP_STATE by s390. Fix a spelling error (KVM_SET_MP_STATE vs. KVM_SET_MPSTATE) on the way. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-07-09KVM: MIPS: Document MIPS specifics of KVM API.James Hogan
Document the MIPS specific parts of the KVM API, including: - The layout of the kvm_regs structure. - The interrupt number passed to KVM_INTERRUPT. - The registers supported by the KVM_{GET,SET}_ONE_REG interface, and the encoding of those register ids. - That KVM_INTERRUPT and KVM_GET_REG_LIST are supported on MIPS. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09KVM: Reformat KVM_SET_ONE_REG register documentationJames Hogan
Some of the MIPS registers that can be accessed with the KVM_{GET,SET}_ONE_REG interface have fairly long names, so widen the Register column of the table in the KVM_SET_ONE_REG documentation to allow them to fit. Tabs in the table are replaced with spaces at the same time for consistency. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-09KVM: Document KVM_SET_SIGNAL_MASK as universalJames Hogan
KVM_SET_SIGNAL_MASK is implemented in generic code and isn't x86 specific, so document it as being applicable for all architectures. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-06-04Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial into next Pull trivial tree changes from Jiri Kosina: "Usual pile of patches from trivial tree that make the world go round" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits) staging: go7007: remove reference to CONFIG_KMOD aic7xxx: Remove obsolete preprocessor define of: dma: doc fixes doc: fix incorrect formula to calculate CommitLimit value doc: Note need of bc in the kernel build from 3.10 onwards mm: Fix printk typo in dmapool.c modpost: Fix comment typo "Modules.symvers" Kconfig.debug: Grammar s/addition/additional/ wimax: Spelling s/than/that/, wording s/destinatary/recipient/ aic7xxx: Spelling s/termnation/termination/ arm64: mm: Remove superfluous "the" in comment of: Spelling s/anonymouns/anonymous/ dma: imx-sdma: Spelling s/determnine/determine/ ath10k: Improve grammar in comments ath6kl: Spelling s/determnine/determine/ of: Improve grammar for of_alias_get_id() documentation drm/exynos: Spelling s/contro/control/ radio-bcm2048.c: fix wrong overflow check doc: printk-formats: do not mention casts for u64/s64 doc: spelling error changes ...
2014-06-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm into nextLinus Torvalds
Pull KVM updates from Paolo Bonzini: "At over 200 commits, covering almost all supported architectures, this was a pretty active cycle for KVM. Changes include: - a lot of s390 changes: optimizations, support for migration, GDB support and more - ARM changes are pretty small: support for the PSCI 0.2 hypercall interface on both the guest and the host (the latter acked by Catalin) - initial POWER8 and little-endian host support - support for running u-boot on embedded POWER targets - pretty large changes to MIPS too, completing the userspace interface and improving the handling of virtualized timer hardware - for x86, a larger set of changes is scheduled for 3.17. Still, we have a few emulator bugfixes and support for running nested fully-virtualized Xen guests (para-virtualized Xen guests have always worked). And some optimizations too. The only missing architecture here is ia64. It's not a coincidence that support for KVM on ia64 is scheduled for removal in 3.17" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (203 commits) KVM: add missing cleanup_srcu_struct KVM: PPC: Book3S PR: Rework SLB switching code KVM: PPC: Book3S PR: Use SLB entry 0 KVM: PPC: Book3S HV: Fix machine check delivery to guest KVM: PPC: Book3S HV: Work around POWER8 performance monitor bugs KVM: PPC: Book3S HV: Make sure we don't miss dirty pages KVM: PPC: Book3S HV: Fix dirty map for hugepages KVM: PPC: Book3S HV: Put huge-page HPTEs in rmap chain for base address KVM: PPC: Book3S HV: Fix check for running inside guest in global_invalidates() KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register number KVM: PPC: Book3S: Add ONE_REG register names that were missed KVM: PPC: Add CAP to indicate hcall fixes KVM: PPC: MPIC: Reset IRQ source private members KVM: PPC: Graciously fail broken LE hypercalls PPC: ePAPR: Fix hypercall on LE guest KVM: PPC: BOOK3S: Remove open coded make_dsisr in alignment handler KVM: PPC: BOOK3S: Always use the saved DAR value PPC: KVM: Make NX bit available with magic page KVM: PPC: Disable NX for old magic page using guests KVM: PPC: BOOK3S: HV: Add mixed page-size support for guest ...
2014-05-30Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into ↵Paolo Bonzini
kvm-next Patch queue for ppc - 2014-05-30 In this round we have a few nice gems. PR KVM gains initial POWER8 support as well as LE host awareness, ihe e500 targets can now properly run u-boot, LE guests now work with PR KVM including KVM hypercalls and HV KVM guests can now use huge pages. On top of this there are some bug fixes. Conflicts: include/uapi/linux/kvm.h
2014-05-30KVM: PPC: Book3S: Move KVM_REG_PPC_WORT to an unused register numberPaul Mackerras
Commit b005255e12a3 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs") added a definition of KVM_REG_PPC_WORT with the same register number as the existing KVM_REG_PPC_VRSAVE (though in fact the definitions are not identical because of the different register sizes.) For clarity, this moves KVM_REG_PPC_WORT to the next unused number, and also adds it to api.txt. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Book3S: Add ONE_REG register names that were missedPaul Mackerras
Commit 3b7834743f9 ("KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg") added definitions for several KVM_REG_PPC_* symbols but missed adding some to api.txt. This adds them. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-30KVM: PPC: Disable NX for old magic page using guestsAlexander Graf
Old guests try to use the magic page, but map their trampoline code inside of an NX region. Since we can't fix those old kernels, try to detect whether the guest is sane or not. If not, just disable NX functionality in KVM so that old guests at least work at all. For newer guests, add a bit that we can set to keep NX functionality available. Signed-off-by: Alexander Graf <agraf@suse.de>
2014-05-27Merge tag 'kvm-arm-for-3.16' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next Changed for the 3.16 merge window. This includes KVM support for PSCI v0.2 and also includes generic Linux support for PSCI v0.2 (on hosts that advertise that feature via their DT), since the latter depends on headers introduced by the former. Finally there's a small patch from Marc that enables Cortex-A53 support.
2014-05-15KVM: s390: announce irqfd capabilityCornelia Huck
s390 has acquired irqfd support with commit "KVM: s390: irq routing for adapter interrupts" (84223598778ba08041f4297fda485df83414d57e) but failed to announce it. Let's fix that. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-05-06KVM: s390: Add clock comparator and CPU timer IRQ injectionThomas Huth
Add an interface to inject clock comparator and CPU timer interrupts into the guest. This is needed for handling the external interrupt interception. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>