
[Xen-devel] [BUG 1282] time jump on live migrate root cause & proposed fixes


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx
  • From: Rik van Riel <riel@xxxxxxxxxx>
  • Date: Wed, 6 Aug 2008 16:46:57 -0400
  • Delivery-date: Wed, 06 Aug 2008 13:47:22 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi,

I have done some debugging to find out the root cause of bug 1282, which
has the following symptoms with paravirtualized guests:
- after a live migrate, the time on the guest can jump
- after a live migrate, the guest "forgets" to wake up processes
- after a domU save, dom0 reboot and domU restore, the time is
  correct but processes are not woken up from sys_nanosleep

The problem seems to stem from the fact that domU uses the hypervisor's
system_time, which is the time since hypervisor system bootup in
nanoseconds, as its base for timekeeping.

This works fine as long as the guest stays on the same hypervisor,
but if the guest is migrated to a hypervisor with a different uptime,
problems ensue.  Specifically, if the guest is migrated to a host
with a lower uptime, processes that call sys_nanosleep() will not
be woken up until the new host's uptime catches up with the uptime
of the old host!  While waiting for the uptime to catch up,
gettimeofday() always returns the same value.

Conversely, if a guest migrates from a host with a lower uptime to
a host with a higher uptime, the system time in the guest advances
by the difference between the two uptimes.
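
To make the two cases concrete, here is a minimal sketch of the
behaviour described above.  It is a toy model, not the actual guest
code: struct guest_clock, guest_read_ns(), stall_ns() and jump_ns()
are hypothetical names, and the model assumes the guest simply clamps
its clock so that it never runs backwards, which would match
gettimeofday() returning the same value until the new host catches up.

```c
#include <stdint.h>

/* Hypothetical model: the guest derives its time directly from the
 * host's system_time (nanoseconds since the host booted). */
struct guest_clock {
    uint64_t last_ns;   /* last time value handed out to the guest */
};

/* Read the guest time for a given host system_time.  Assumption: the
 * guest never lets its clock go backwards, so a smaller system_time
 * on the new host just repeats the last value. */
static uint64_t guest_read_ns(struct guest_clock *gc,
                              uint64_t host_system_time_ns)
{
    if (host_system_time_ns > gc->last_ns)
        gc->last_ns = host_system_time_ns;
    return gc->last_ns;
}

/* How long the guest clock stands still after moving to a host with
 * a lower uptime (0 if the new host's uptime is not lower). */
static uint64_t stall_ns(uint64_t old_uptime_ns, uint64_t new_uptime_ns)
{
    return old_uptime_ns > new_uptime_ns ?
           old_uptime_ns - new_uptime_ns : 0;
}

/* How far the guest clock jumps forward after moving to a host with
 * a higher uptime (0 if the new host's uptime is not higher). */
static uint64_t jump_ns(uint64_t old_uptime_ns, uint64_t new_uptime_ns)
{
    return new_uptime_ns > old_uptime_ns ?
           new_uptime_ns - old_uptime_ns : 0;
}
```

In this model, a sleeper whose expiry was computed on the old host
cannot be woken before stall_ns() has elapsed on the new host.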


I can think of a few possible fixes for this issue:

1) have system_time in the hypervisor start at unix epoch 0
   (January 1st, 1970) instead of at boot time - this may
   require some magic in sync_cmos_clock(), sync_xen_wallclock()
   and/or other functions so dom0 does not get too confused while
   changing the time during bootup

2) have time_init() and time_resume() calculate the hypervisor
   boot time from shared_info->wc_sec, shared_info->wc_nsec and the
   per-cpu vcpu_info->system_time in shared_info -- if the host
   boot time changes (by more than a second?) adjust some local
   offset that we add into get_nsec_offset() and get_usec_offset()
   so that the guest's time always comes out right

3) get_time_values_from_xen() and __update_wallclock() can keep
   track of such an offset by themselves
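
The offset bookkeeping behind options 2 and 3 could look roughly like
the sketch below.  This is only an illustration under stated
assumptions: struct time_offset_state, note_host_boot_time() and
adjusted_system_time_ns() are hypothetical names, the one second
threshold is the guess from option 2, and how the host boot time is
derived from the shared_info fields is deliberately left out.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical per-guest state for tracking a change of host. */
struct time_offset_state {
    int64_t host_boot_ns;  /* wall-clock boot time of the current host */
    int64_t offset_ns;     /* correction added to the host system_time */
    int     valid;         /* nonzero once host_boot_ns has been set */
};

/* Called from the init/resume path with the boot time of whatever
 * host we are now running on.  If the boot time moved by more than a
 * second (assumed threshold), fold the difference into the offset. */
static void note_host_boot_time(struct time_offset_state *s,
                                int64_t boot_ns)
{
    int64_t delta;

    if (!s->valid) {
        s->host_boot_ns = boot_ns;
        s->offset_ns = 0;
        s->valid = 1;
        return;
    }

    delta = boot_ns - s->host_boot_ns;
    if (delta > NSEC_PER_SEC || delta < -NSEC_PER_SEC) {
        /* A later boot time means a lower uptime on the new host, so
         * the offset grows by exactly the uptime that went missing. */
        s->offset_ns += delta;
        s->host_boot_ns = boot_ns;
    }
}

/* The time base the guest would use instead of raw system_time. */
static int64_t adjusted_system_time_ns(const struct time_offset_state *s,
                                       int64_t system_time_ns)
{
    return system_time_ns + s->offset_ns;
}
```

For example, a guest at system_time 5000s on a host booted at wall
clock 1000s that moves to a host booted at 5500s (system_time 500s at
the same instant) picks up an offset of 4500s, so its adjusted time
stays at 5000s across the migration.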


Does anybody have comments on the ideas above, or maybe even
better ideas on how to fix the problem? :)

-- 
All Rights Reversed

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

