Dmitry Nedospasov wrote:
>
> I was watching some logs on a domU today and I suddenly noticed that the
> timestamps were off by something on the order of 47 seconds. I was
> surprised because *I don't* run independent wall clocks. I checked
> some other domUs and the "drift" was also very close to that of the
> first domU.
>
> I also checked another dom0. Here the domUs were "only" out of sync by
> ~11 seconds.
>
> The dom0s are all debian squeeze with Xen 4.0.1-2. The domUs are also
> debian squeeze and utilizing PV with the ParaVirtOPs in the normal
> debian linux-image-2.6.32 kernel.
>

I've been fighting this problem (clock running +47 seconds) for several
months. My OS setup is like yours: dom0 is Debian Squeeze x64 running Xen
4.0.1-2. The domUs are Debian Squeeze x64 or Lenny x86:

dom0: Debian Squeeze x64, running ntpd
Xen version 4.0.1 (Debian 4.0.1-2)
Risk domU: Debian Squeeze x64, running ntpd
Coop domU: Debian Squeeze x64, running ntpd
T5 domU: Debian Lenny x86, not running ntpd

Last night I wrote a Perl script to remotely monitor the dom0 and domU
clocks via 'rsh <host> date +%s' from a non-Xen server. The script runs
every minute and records any host whose clock offset changed by more than
2 seconds since the previous minute. Here is the result:
----------------------------------------
Fri Jul 1 23:00:05 PDT 2011
dom0 = localtime + 1s
Risk domU = localtime + 1s
Coop domU = localtime + 1s
T5 domU = localtime + 93s
----------------------------------------
Fri Jul 1 23:13:04 PDT 2011
T5 domU = localtime + 1s ..... (ran ntpdate manually)
----------------------------------------
Sat Jul 2 05:26:04 PDT 2011
dom0 = localtime + 47s
Risk domU = localtime + 47s
Coop domU = localtime + 48s
T5 domU = localtime + 47s
----------------------------------------
Sat Jul 2 05:59:04 PDT 2011
Risk domU = localtime + 0s
----------------------------------------
Sat Jul 2 07:50:04 PDT 2011
Coop domU = localtime + 0s
----------------------------------------
Sat Jul 2 08:11:04 PDT 2011
dom0 = localtime + 0s
----------------------------------------
Sat Jul 2 09:13:05 PDT 2011
T5 domU = localtime - 1s ..... (ran ntpdate manually)
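
For anyone who wants to reproduce this kind of log, the monitoring logic described above can be sketched roughly as follows. This is a Python sketch, not the original Perl script; the function names (remote_epoch, changed_offsets, format_line, sample, monitor) and the host names in the usage note are made up for illustration. It assumes password-less rsh to each host, as in the description above.

```python
import subprocess
import time

THRESHOLD_S = 2  # report only offset changes larger than this (seconds)


def remote_epoch(host):
    """Read a remote clock as epoch seconds via 'rsh <host> date +%s'."""
    result = subprocess.run(
        ["rsh", host, "date", "+%s"],
        capture_output=True, text=True, check=True, timeout=30,
    )
    return int(result.stdout.strip())


def changed_offsets(previous, current, threshold=THRESHOLD_S):
    """Return {host: offset} for hosts that are new or whose offset moved
    by more than `threshold` seconds since the previous sample."""
    return {
        host: offset
        for host, offset in current.items()
        if host not in previous or abs(offset - previous[host]) > threshold
    }


def format_line(host, offset):
    """Format one log line in the style 'dom0 = localtime + 47s'."""
    sign = "+" if offset >= 0 else "-"
    return f"{host} = localtime {sign} {abs(offset)}s"


def sample(hosts, fetch=remote_epoch):
    """Measure each host's offset from the local clock, in whole seconds."""
    offsets = {}
    for host in hosts:
        remote = fetch(host)
        offsets[host] = remote - int(time.time())
    return offsets


def monitor(hosts, interval_s=60, fetch=remote_epoch):
    """Poll forever, printing a block whenever some offset jumps."""
    previous = {}
    while True:
        current = sample(hosts, fetch)
        changes = changed_offsets(previous, current)
        if changes:
            print("-" * 40)
            print(time.strftime("%a %b %d %H:%M:%S %Z %Y"))
            for host, offset in changes.items():
                print(format_line(host, offset))
        previous = current  # compare against the previous minute only
        time.sleep(interval_s)
```

Calling monitor(["dom0", "risk", "coop", "t5"]) from a non-Xen machine would start the loop. Because each sample is compared only with the previous one, slow drift of less than 2 seconds per minute goes unreported, while a sudden jump is logged once.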

At 5:26 am, there was a "time quake" on the Xen server, which caused dom0
and all domU clocks to move ahead by 47 seconds. Risk domU, running NTP,
corrected its clock at 5:59 am by abruptly jerking it back to normal time.
Coop domU and dom0 also did the same thing a while later. T5 domU, not
running NTP, never corrected itself. I manually executed ntpdate on it.

Several things are odd about this problem. First, the "time quake" is exact
and reproducible: +47 seconds, the same as Dmitry's. My server is a dual
Xeon 5345 on a SuperMicro X7DBR-E motherboard. The platform timer is
"3.579MHz ACPI PM Timer" (from xm dmesg).

Secondly, I thought NTP was supposed to adjust the clock gradually (by at
most about 0.5 ms each second) instead of skipping many seconds at once.
(Or it might be running the clock VERY SLOWLY for a few seconds to offset
the +47 secs.) Thirdly, after the initial "time quake", the domUs and dom0
had to correct their clocks individually, at different times.
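
A quick back-of-the-envelope check on the gradual-adjustment idea, as a small Python sketch. It assumes ntpd's stock defaults, which are a maximum slew rate of 500 ppm and a step threshold of 128 ms; a site that changed these in ntp.conf or started ntpd with -x would see different numbers. The function names are made up for illustration.

```python
MAX_SLEW_PPM = 500        # ntpd default maximum slew rate (parts per million)
STEP_THRESHOLD_S = 0.128  # ntpd default step threshold (seconds)


def slew_duration_s(offset_s, slew_ppm=MAX_SLEW_PPM):
    """Seconds needed to remove `offset_s` by slewing at `slew_ppm`."""
    return abs(offset_s) / (slew_ppm * 1e-6)


def would_step(offset_s, threshold_s=STEP_THRESHOLD_S):
    """True if the offset exceeds the step threshold, so ntpd would be
    expected to step the clock rather than slew it."""
    return abs(offset_s) > threshold_s


if __name__ == "__main__":
    offset = 47.0
    hours = slew_duration_s(offset) / 3600
    print(f"slewing {offset:.0f}s at {MAX_SLEW_PPM} ppm takes {hours:.1f} hours")
    print(f"exceeds step threshold: {would_step(offset)}")
```

Under those assumptions, slewing away 47 seconds would take about 94,000 seconds, or roughly 26 hours, and 47 seconds is far above the step threshold. So an abrupt jump back is what one would expect from ntpd here rather than a gradual correction, although this does not explain why the three machines corrected at such different times.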

Although it is a long shot, I will try "clocksource=pit" on the Xen command
line this weekend...

P.S. The "+47 secs" jump often causes my Perl POE scripts to hang, which is
why this is a critical problem for me.