Hi Priya --
You need only run additional domains (thus "overcommitting" the CPU(s) and
guaranteeing that NO domain is getting close to an entire CPU) to demonstrate
that it is certainly not domain time. It IS system time but, as the VMware
paper describes, virtualizing time on an older (buggy) OS is more of an art
than a science. Unless one builds directly into the hypervisor a complete list
of all OS's (including all versions and patch-levels) and all of their bugs and
idiosyncrasies, the result is a s*it-load of options at different levels of the
virtualization stack (including hardware configuration such as in a boot-time
BIOS menu).
As you may also have read, even on a physical machine running a native OS, time
drift is very possible. Time in every system is based on one or more
inexpensive crystals whose frequencies only approximate an (extremely
expensive) cesium clock (which for lack of a better term we will call "true"
time). These crystals may drift relative to true time and relative to each
other on the same system or even on an apparently identical system. This can
often be demonstrated, as you saw, by checking /proc/cpuinfo... but identical
values don't mean that the clocks are true, just that the difference is lower
than the kernel code measuring it can discern or choose to report.
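A rough sketch of that comparison follows. It assumes the x86 Linux
/proc/cpuinfo layout with a "cpu MHz" field; the helper names cpu_mhz_values
and spread_ppm are made up for illustration.

```python
import re

def cpu_mhz_values(cpuinfo_text):
    """Return the 'cpu MHz' value of every processor entry as floats."""
    pattern = re.compile(r"^cpu MHz\s*:\s*([0-9.]+)\s*$", re.MULTILINE)
    return [float(value) for value in pattern.findall(cpuinfo_text)]

def spread_ppm(mhz_values):
    """Gap between the largest and smallest value, in ppm of the smallest."""
    lo, hi = min(mhz_values), max(mhz_values)
    return (hi - lo) / lo * 1e6

def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the raw text (Linux only; the field name can differ off x86)."""
    with open(path) as f:
        return f.read()
```

Calling spread_ppm(cpu_mhz_values(read_cpuinfo())) on each machine gives one
number to compare. A spread of zero only says the kernel reported the same
calibration result, not that the underlying clocks agree with true time.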
When virtualizing time, some of these problems manifest as time drift in the
virtual system: some virtual clocks run faster than true time and some slower.
NTP does its best to find a source of true time and then adjusts the clock on
the system it is controlling so that it asymptotically approaches true time
withOUT allowing time to go backwards (as this can cause all sorts of
challenging system problems... imagine a "make" where time randomly goes
backwards in the middle!). But NTP may not have a good source for true time
(as it is an administrator who configures it), NTP may subtly conflict with
some combinations of the options set by administrators in the virtualization
stack (because not everyone has access to "true" time... for example suppose
your virtual machine has no networking configured), and NTP may silently give
up if the drift is too bad for it to safely compensate. (On a physical
machine, this would be considered a "hardware bug" and you would ship your box
back to the vendor to get one without a "broken" clock.)
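The "approach true time without ever going backwards" idea can be sketched as a
toy simulation. This is NOT ntpd's actual algorithm, only a proportional rate
adjustment with a cap; the function name, the gain, and the 500 ppm cap used
here are illustrative assumptions.

```python
MAX_SLEW = 500e-6  # assumed cap on the rate adjustment (500 ppm)

def simulate_slew(offset, drift_ppm, steps, dt=1.0, gain=0.01):
    """Return (local_time, offset) samples while slewing toward true time.

    offset    -- initial local-minus-true error in seconds
    drift_ppm -- uncorrected frequency error of the local clock
    """
    true_t = 0.0
    local_t = offset
    history = []
    for _ in range(steps):
        error = local_t - true_t  # positive: local clock is ahead
        # Adjust the clock RATE, never the clock value, and clamp it.
        adjust = max(-MAX_SLEW, min(MAX_SLEW, -gain * error))
        rate = 1.0 + drift_ppm * 1e-6 + adjust  # stays > 0: time moves forward
        true_t += dt
        local_t += rate * dt
        history.append((local_t, local_t - true_t))
    return history
```

Because only the rate changes, the local clock is always increasing even while
the error shrinks. If the underlying drift exceeds what the capped adjustment
can cancel, the error grows instead, which is the "too bad to safely
compensate" case.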
So if you have ever heard the old pop song by Chicago "Does anybody really know
what time it is?", the title is truer than most people believe ;-)
Hope that helps!
Dan
From: Priya [mailto:pbhat@xxxxxxxxxxxx]
Sent: Friday, February 26, 2010 9:47 AM
To: Dan Magenheimer
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] Domain-Virtual time
Thanks Dan! That's a lot of useful information.
Since yesterday, I started NTP on my machines to correct and measure the amount
of drift. (I am running 3 HVMs with an Ubuntu 8.04 Linux (tick-less) kernel
which were all installed in an identical manner. At the time of installing the
HVMs I did not change the default timer_mode).
The funny thing is that NTP is measuring a very different drift on my three
machines (-189.206, -108.373 and -71.321 parts per million). The drift reported
on Domain-0 is -11.393. So I don't think my machines are showing the system
time.
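To put those figures in perspective, parts per million can be converted to
seconds gained or lost per day. A minimal sketch (the function name is made up;
the labels simply number the three HVMs in the order listed above):

```python
SECONDS_PER_DAY = 86_400

def ppm_to_seconds_per_day(ppm):
    """Convert a frequency error in parts per million to seconds per day."""
    return ppm * 1e-6 * SECONDS_PER_DAY

reported = {
    "HVM 1": -189.206,
    "HVM 2": -108.373,
    "HVM 3": -71.321,
    "Domain-0": -11.393,
}

for name, ppm in reported.items():
    print(f"{name}: {ppm_to_seconds_per_day(ppm):.2f} s/day")
# HVM 1: -16.35 s/day
# HVM 2: -9.36 s/day
# HVM 3: -6.16 s/day
# Domain-0: -0.98 s/day
```

So the worst guest is off by roughly 16 seconds per day, against about 1 second
per day for Domain-0.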
In addition, the negative sign on the drift means that my machines are running
faster than real time, which is again puzzling. I found out that VMware has
issues with overcompensation on its Linux kernels that cause the VM time to run
faster. Could Xen be having a similar problem?
The fact that all three of my machines are showing different drifts makes me
doubt that they are showing the system time. I checked the CPU frequency that
the three machines are reporting (from /proc/cpuinfo) and they are similar but
not identical.
Any thoughts?
On Thu, Feb 25, 2010 at 3:36 PM, Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
wrote:
Should be "true" system time, i.e. should be very close to what
you see on a "wallclock" (clock on the wall).
HVMs sadly vary widely in the parameters needed
to minimize time drift. In general in the past, timer_mode=0
(or timer_mode unspecified) would be best for 32-bit Linux
domains, timer_mode=1 would be best for Windows domains,
and timer_mode=2 would be best for 64-bit Linux domains.
However, for best results on Linux, this must be combined with
kernel boot parameters that properly select a clock -- and
on some Linux kernel versions, the parameters needed are
different between 32-bit and 64-bit versions of the same
kernel version. It is up to providers of HVM templates
(aka "appliances") to choose parameters wisely.
Also, you haven't specified your Xen version, but I believe
Xen 4.0 switches the timer_mode default from 0 to 1 so, sadly,
clock behavior may change when moving an unchanged HVM
domain from pre-4.0 to 4.0.
So for best results you should run ntpd in any Linux HVM
domain (and I don't know what you do in Windows). But
even ntpd may be inadequate to avoid drift if poor parameters
are chosen.
==========
From: Priya [mailto:pbhat@xxxxxxxxxxxx]
Sent: Thursday, February 25, 2010 9:04 AM
To: xen-devel@xxxxxxxxxxxxxxxxxxx; xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] Domain-Virtual time
Sorry for multiple emails. I sent the last one from the wrong address.
Can anyone please tell me if the value returned by a time-querying call
like gettimeofday() on a Xen (Linux) HVM is the true (System) time or the
Domain-virtual time?
PS: Domain-virtual time is defined as the time that progresses at the same pace
as the cycle counter, but only while a domain is executing. It stops while the
domain is de-scheduled, whereas System time accurately reflects the passage of
real time.
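The distinction can be illustrated with a toy model. The schedule below is
hypothetical (a domain that runs 30 ms out of every 100 ms), and both function
names are made up for illustration.

```python
def domain_virtual_time(run_intervals):
    """Time that advances only while the domain is actually scheduled."""
    return sum(end - start for start, end in run_intervals)

def system_time(run_intervals):
    """Real time elapsed from the first schedule-in to the last schedule-out."""
    return run_intervals[-1][1] - run_intervals[0][0]

# (start_ms, end_ms) of each slice in which the domain was running.
schedule = [(i * 100, i * 100 + 30) for i in range(10)]

print(domain_virtual_time(schedule))  # 300
print(system_time(schedule))          # 930
```

If gettimeofday() returned Domain-virtual time, a domain sharing a CPU this way
would see only about a third of the real elapsed time.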
I am facing issues because my HVMs show a time drift.
Thanks
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel