summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/pvclock.c
AgeCommit message (Collapse)Author
2016-08-10pvclock: Add CPU barriers to get correct version valueMinfei Huang
commit 749d088b8e7f4b9826ede02b9a043e417fa84aa1 upstream. Protocol for the "version" fields is: hypervisor raises it (making it uneven) before it starts updating the fields and raises it again (making it even) when it is done. Thus the guest can make sure the time values it got are consistent by checking the version before and after reading them. Add CPU barries after getting version value just like what function vread_pvclock does, because all of callees in this function is inline. Fixes: 502dfeff239e8313bfbe906ca0a1a6827ac8481b Signed-off-by: Minfei Huang <mnghuan@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-27x86: pvclock: Really remove the sched notifier for cross-cpu migrationsPaolo Bonzini
This reverts commits 0a4e6be9ca17c54817cf814b4b5aa60478c6df27 and 80f7fdb1c7f0f9266421f823964fd1962681f6ce. The task migration notifier was originally introduced in order to support the pvclock vsyscall with non-synchronized TSC, but KVM only supports it with synchronized TSC. Hence, on KVM the race condition is only needed due to a bad implementation on the host side, and even then it's so rare that it's mostly theoretical. As far as KVM is concerned it's possible to fix the host, avoiding the additional complexity in the vDSO and the (re)introduction of the task migration notifier. Xen, on the other hand, hasn't yet implemented vsyscall support at all, so we do not care about its plans for non-synchronized TSC. Reported-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-23x86: kvm: Revert "remove sched notifier for cross-cpu migrations"Marcelo Tosatti
The following point: 2. per-CPU pvclock time info is updated if the underlying CPU changes. Is not true anymore since "KVM: x86: update pvclock area conditionally, on cpu migration". Add task migration notification back. Problem noticed by Andy Lutomirski. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> CC: stable@kernel.org # 3.11+
2013-11-06hung_task: add method to reset detectorMarcelo Tosatti
In certain occasions it is possible for a hung task detector positive to be false: continuation from a paused VM, for example. Add a method to reset detection, similar as is done with other kernel watchdogs. Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-06pvclock: detect watchdog reset at pvclock readMarcelo Tosatti
Implement reset of kernel watchdogs at pvclock read time. This avoids adding special code to every watchdog. This is possible for watchdogs which measure time based on sched_clock() or ktime_get() variants. Suggested by Don Zickus. Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-07-18remove sched notifier for cross-cpu migrationsMarcelo Tosatti
Linux as a guest on KVM hypervisor, the only user of the pvclock vsyscall interface, does not require notification on task migration because: 1. cpu ID number maps 1:1 to per-CPU pvclock time info. 2. per-CPU pvclock time info is updated if the underlying CPU changes. 3. that version is increased whenever underlying CPU changes. Which is sufficient to guarantee nanoseconds counter is calculated properly. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-02-28x86/kvm: Fix pvclock vsyscall fixmapPeter Hurley
The physical memory fixmapped for the pvclock clock_gettime vsyscall was allocated, and thus is not a kernel symbol. __pa() is the proper method to use in this case. Fixes the crash below when booting a next-20130204+ smp guest on a 3.8-rc5+ KVM host. [ 0.666410] udevd[97]: starting version 175 [ 0.674043] udevd[97]: udevd:[97]: segfault at ffffffffff5fd020 ip 00007fff069e277f sp 00007fff068c9ef8 error d Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2012-11-27x86: pvclock: generic pvclock vsyscall initializationMarcelo Tosatti
Originally from Jeremy Fitzhardinge. Introduce generic, non hypervisor specific, pvclock initialization routines. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-11-27x86: pvclock: introduce helper to read flagsMarcelo Tosatti
Acked-by: Glauber Costa <glommer@parallels.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-11-27x86: pvclock: create helper for pvclock data retrievalMarcelo Tosatti
Originally from Jeremy Fitzhardinge. So code can be reused. Acked-by: Glauber Costa <glommer@parallels.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-11-27x86: pvclock: remove pvclock_shadow_timeMarcelo Tosatti
Originally from Jeremy Fitzhardinge. We can copy the information directly from "struct pvclock_vcpu_time_info", remove pvclock_shadow_time. Reviewed-by: Glauber Costa <glommer@parallels.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-11-27x86: pvclock: make sure rdtsc doesnt speculate out of regionMarcelo Tosatti
Originally from Jeremy Fitzhardinge. pvclock_get_time_values, which contains the memory barriers will be removed by next patch. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-11-28x86/pvclock: Zero last_value on resumeJeremy Fitzhardinge
If the guest domain has been suspend/resumed or migrated, then the system clock backing the pvclock clocksource may revert to a smaller value (ie, can be non-monotonic across the migration/save-restore). Make sure we zero last_value in that case so that the domain continues to see clock updates. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-10x86, pvclock: Remove leftover scale_delta() functionKusanagi Kouichi
Commit 92580d64e16402762e2acc3022f065397c780425 ("x86: pvclock: Move scale_delta into common header") forgot to remove scale_delta. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Cc: Zachary Amsden <zamsden@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Glauber Costa <glommer@redhat.com> LKML-Reference: <20101105110444.BAF6D6FC03B@msa105.auone-net.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-24x86: pvclock: Move scale_delta into common headerZachary Amsden
The scale_delta function for shift / multiply with 31-bit precision moves to a common header so it can be used by both kernel and kvm module. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-19x86, paravirt: don't compute pvclock adjustments if we trust the tscGlauber Costa
If the HV told us we can fully trust the TSC, skip any correction Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-19x86, paravirt: Add a global synchronization point for pvclockGlauber Costa
In recent stress tests, it was found that pvclock-based systems could seriously warp in smp systems. Using ingo's time-warp-test.c, I could trigger a scenario as bad as 1.5mi warps a minute in some systems. (to be fair, it wasn't that bad in most of them). Investigating further, I found out that such warps were caused by the very offset-based calculation pvclock is based on. This happens even on some machines that report constant_tsc in its tsc flags, specially on multi-socket ones. Two reads of the same kernel timestamp at approx the same time, will likely have tsc timestamped in different occasions too. This means the delta we calculate is unpredictable at best, and can probably be smaller in a cpu that is legitimately reading clock in a forward ocasion. Some adjustments on the host could make this window less likely to happen, but still, it pretty much poses as an intrinsic problem of the mechanism. A while ago, I though about using a shared variable anyway, to hold clock last state, but gave up due to the high contention locking was likely to introduce, possibly rendering the thing useless on big machines. I argue, however, that locking is not necessary. We do a read-and-return sequence in pvclock, and between read and return, the global value can have changed. However, it can only have changed by means of an addition of a positive value. So if we detected that our clock timestamp is less than the current global, we know that we need to return a higher one, even though it is not exactly the one we compared to. OTOH, if we detect we're greater than the current time source, we atomically replace the value with our new readings. This do causes contention on big boxes (but big here means *BIG*), but it seems like a good trade off, since it provide us with a time source guaranteed to be stable wrt time warps. After this patch is applied, I don't see a single warp in time during 5 days of execution, in any of the machines I saw them before. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> CC: Jeremy Fitzhardinge <jeremy@goop.org> CC: Avi Kivity <avi@redhat.com> CC: Marcelo Tosatti <mtosatti@redhat.com> CC: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-19x86, paravirt: Enable pvclock flags in vcpu_time_info structureGlauber Costa
This patch removes one padding byte and transform it into a flags field. New versions of guests using pvclock will query these flags upon each read. Flags, however, will only be interpreted when the guest decides to. It uses the pvclock_valid_flags function to signal that a specific set of flags should be taken into consideration. Which flags are valid are usually devised via HV negotiation. Signed-off-by: Glauber Costa <glommer@redhat.com> CC: Jeremy Fitzhardinge <jeremy@goop.org> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-07-14x86: Fix warning in pvclock.cDave Jones
when building 32-bit, I see this .. arch/x86/kernel/pvclock.c:63:7: warning: "__x86_64__" is not defined Signed-off-by: Dave Jones <davej@redhat.com> LKML-Reference: <20090713201437.GA12165@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-10-15x86: pvclock: fix shadowed variable warningHarvey Harrison
arch/x86/kernel/pvclock.c:102:6: warning: symbol 'tsc_khz' shadows an earlier one include/asm/tsc.h:18:21: originally declared here Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2008-10-15x86: paravirt: factor out cpu_khz to common codeGlauber Costa
KVM intends to use paravirt code to calibrate khz. Xen current code will do just fine. So as a first step, factor out code to pvclock.c. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-06-24x86: Add structs and functions for paravirt clocksourceGerd Hoffmann
This patch adds structs for the paravirt clocksource ABI used by both xen and kvm (pvclock-abi.h). It also adds some helper functions to read system time and wall clock time from a paravirtual clocksource (pvclock.[ch]). They are based on the xen code. They are enabled using CONFIG_PARAVIRT_CLOCK. Subsequent patches of this series will put the code in use. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Avi Kivity <avi@qumranet.com>