
[Xen-devel] Re: A race condition introduced by changeset 15175: Re-init hypercall stubs page after HVM save/restore



Keir Fraser <keir.fraser <at> eu.citrix.com> writes:

> 
> Hi Dexuan,
> 
> Are you really sure that this is the problem? The suspend_lock was
> introduced specifically to solve this problem. Note that the BSP takes this
> lock before messing with the hypercall page.
> 
>  -- Keir

I'm also looking at this now (I'm on 3.1.4, BTW). I see both the hang and the
panic. It appears the hang happens because the "master" vcpu has to catch all
the other vcpus right at the cpu_relax() before it can grab the lock in write
mode. With many VCPUs that's just not happening. I'm not sure I like the design
of this very much; I'm going to try to modify it a bit.
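
To put a rough number on the starvation described above, here is a toy model
in C. It is not Xen or Linux code. It assumes a reader-preferring rwlock (the
writer gets in only when no reader holds the lock) and that each spinning vcpu
is outside the read-locked region for some fraction of its loop, independently
of the others. The names writer_window() and print_writer_windows() and the
vcpu counts are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Toy model only. Assumption: each spinning vcpu is outside the
 * read_lock()/read_unlock() region for a fraction `outside` of its loop,
 * independently of the other vcpus. The writer can take the lock only at
 * an instant when every reader is outside at the same time.
 */
double writer_window(double outside, int nreaders)
{
    double p = 1.0;

    assert(outside >= 0.0 && outside <= 1.0);
    for (int i = 0; i < nreaders; i++)
        p *= outside;   /* all readers must be outside simultaneously */
    return p;
}

/* Print the estimate for a few hypothetical spinning-vcpu counts. */
void print_writer_windows(double outside)
{
    static const int counts[] = { 1, 2, 4, 8, 16, 32 };

    for (unsigned i = 0; i < sizeof(counts) / sizeof(counts[0]); i++)
        printf("%2d spinning vcpus: writer window ~ %g\n",
               counts[i], writer_window(outside, counts[i]));
}
```

Even with a generous, hypothetical outside fraction of 0.5, writer_window(0.5, 8)
is 0.5^8 = 0.00390625. In the real loop the only unlocked step is cpu_relax(),
which is presumably much shorter than the locked HYPERVISOR_yield() call, so
the actual fraction is likely far smaller.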

thanks
mukesh




> On 7/10/08 11:08, "Cui, Dexuan" <dexuan.cui <at> intel.com> wrote:
> 
> > For an SMP Linux HVM guest with PV drivers inserted, when we do save/restore
> > (or LiveMigration) for the guest, it might panic after it's restored.
> > The panic point is inside ap_suspend():
> >  ....
> >     while (info->do_spin) {
> >         cpu_relax();
> >         read_lock(&suspend_lock);
> >         HYPERVISOR_yield();      ----> guest might panic on the invocation of this function.
> >         read_unlock(&suspend_lock);
> >     }
> > ...
> > 




 

