[Xen-devel] Re: [BUG 1282] time jump on live migrate root cause & proposed fixes

  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Rik van Riel <riel@xxxxxxxxxx>
  • Date: Wed, 6 Aug 2008 22:00:42 -0400
  • Delivery-date: Wed, 06 Aug 2008 19:01:09 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Wed, 6 Aug 2008 16:46:57 -0400
Rik van Riel <riel@xxxxxxxxxx> wrote:

> I have done some debugging to find out the root cause of bug 1282, which
> has the following symptoms with paravirtualized guests:
> - after a live migrate, the time on the guest can jump
> - after a live migrate, the guest "forgets" to wake up processes
> - after a domU save, dom0 reboot and domU restore, the time is
>   correct but processes are not woken up from sys_nanosleep
> The problem seems to stem from the fact that domU uses the hypervisor's
> system_time, which is the time since hypervisor system bootup in
> nanoseconds, as its base for timekeeping.

I've been reading the code some more, and it appears to be
even stranger than I imagined :(

Setting the time in dom0, through do_settimeofday() or
sync_xen_wallclock(), ends up calling the hypervisor function
do_settime(sec, nsec, system_timestamp). That function
subtracts system_timestamp (HV uptime in nsecs) from the
given time and stores the result in the variables wc_sec and
wc_nsec in arch/x86/time.c, so they hold (now - HV uptime).
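
As a minimal sketch of that arithmetic (not the actual hypervisor
code): the struct wallclock_base type and the settime_sketch() name
below are hypothetical, and only the parameter names sec, nsec and
system_timestamp and the fields wc_sec and wc_nsec come from the
description above.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the hypervisor's wallclock base. */
struct wallclock_base {
    uint64_t wc_sec;   /* whole seconds of (now - HV uptime) */
    uint64_t wc_nsec;  /* remaining nanoseconds, < NSEC_PER_SEC */
};

/*
 * Sketch of the subtraction described above: take the wall-clock time
 * handed in by dom0 and subtract the hypervisor uptime (in ns).
 */
static struct wallclock_base
settime_sketch(uint64_t sec, uint64_t nsec, uint64_t system_timestamp)
{
    uint64_t now_ns, boot_ns;
    struct wallclock_base wc;

    assert(nsec < NSEC_PER_SEC);

    now_ns = sec * NSEC_PER_SEC + nsec;
    assert(now_ns >= system_timestamp);  /* uptime cannot exceed "now" */

    boot_ns = now_ns - system_timestamp; /* now - HV uptime */
    wc.wc_sec = boot_ns / NSEC_PER_SEC;
    wc.wc_nsec = boot_ns % NSEC_PER_SEC;
    return wc;
}
```

With sec = 1218074442, nsec = 500000000 and an uptime of 3600.25
seconds, the sketch yields wc_sec = 1218070842 and wc_nsec = 250000000.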

This effectively means that a settimeofday in dom0 will
redefine the time at which the hypervisor booted up.

It also means that time_resume() would theoretically do
the right thing, if run on cpu0, which I assume it does.

time_resume() sets the local system's system_timestamp (HV uptime
in nsecs) as well as shadow.tv_sec and shadow.tv_nsec,
which reflect the hypervisor's boot time.

This really makes me wonder why the guest is getting its
clock messed up by the difference of system uptimes when
live migrating from one system to another, between two
hosts that are NTP synced.

The reason I wonder is that wc_sec + wc_nsec + system_timestamp
should, at any given moment, be the same across multiple systems,
since this equals system boot time + uptime, i.e. the current
wall-clock time.
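
The expectation above can be sketched as follows. This is only an
illustration under assumptions: struct host_clock and wallclock_ns()
are hypothetical names, and the two hosts are assumed to agree exactly
on the current time.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical per-host view of the values discussed above. */
struct host_clock {
    uint64_t wc_sec;            /* boot time, whole seconds */
    uint64_t wc_nsec;           /* boot time, remaining nanoseconds */
    uint64_t system_timestamp;  /* HV uptime in nanoseconds */
};

/* Boot time + uptime, in nanoseconds: the current wall-clock time. */
static uint64_t wallclock_ns(const struct host_clock *h)
{
    assert(h->wc_nsec < NSEC_PER_SEC);
    return h->wc_sec * NSEC_PER_SEC + h->wc_nsec + h->system_timestamp;
}
```

For example, a host that has been up for one hour with wc_sec =
1218070842 and a host that has been up for one day with wc_sec =
1217988042 both give the same wallclock_ns() result, even though
their uptimes differ.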

Does anybody know why a save/restore or a live migrate
would mess things up?

All rights reversed.
