
Re: [Xen-devel] [PATCH 2/2] xenbus: bypass xenbus frontend resume if xenstored is not running



On Thu, 2013-05-02 at 10:21 +0100, Jan Beulich wrote:
> >>> On 02.05.13 at 10:24, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> > On Wed, 2013-05-01 at 13:57 +0100, Aurelien Chartier wrote:
> >> If the xenbus frontend is running in a domain running xenstored or in dom0,
> >> the device resume is hanging because it is happening before the process
> >> resume. This patch adds extra logic to the resume code to check if we are
> >> the domain running xenstored or dom0.
> >> 
> >> The frontend will be reconnected later, when the backend resumes from S3.
> >> This logic is working when xenstored is running in dom0, but has not been
> >> tested with a xenstore stub domain.
> >> ---
> >>  drivers/xen/xenbus/xenbus_probe_frontend.c |   15 ++++++++++++++-
> >>  1 file changed, 14 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/xen/xenbus/xenbus_probe_frontend.c b/drivers/xen/xenbus/xenbus_probe_frontend.c
> >> index 3159a37..8583afe 100644
> >> --- a/drivers/xen/xenbus/xenbus_probe_frontend.c
> >> +++ b/drivers/xen/xenbus/xenbus_probe_frontend.c
> >> @@ -89,9 +89,22 @@ static void backend_changed(struct xenbus_watch *watch,
> >>    xenbus_otherend_changed(watch, vec, len, 1);
> >>  }
> >>  
> >> +static int xenbus_frontend_dev_resume(struct device *dev)
> >> +{
> >> +  /* 
> >> +   * If xenstored is running in that domain, we cannot access the backend
> >> +   * state at the moment. If we are running in dom0, the domain running
> >> +   * xenstored is still suspended at that point
> >> +   */
> >> +  if (xen_initial_domain() || (xen_store_domain == XS_LOCAL))
> >> +          return 0;
> >> +
> >> +  return xenbus_dev_resume(dev);
> > 
> > When or where does this eventually get called for the init domain or
> > XS_LOCAL cases?
> 
> I was about to ask the same question. Plus I don't think the
> description here or in the overview mail really makes clear how
> specifically a deadlock would occur here. That's pretty relevant to
> understand in the light that so far we had no indication of there
> being any special treatment necessary here, and resume from S3
> had been working quite fine without that (at least as long as
> xenstored is running in Dom0 and at least with the traditional/
> forward-port/non-pvops kernels).

I think the unusual feature here is that dom0 has a netfront attached.
Netfront resume is therefore hanging because it is trying to talk to the
still frozen xenstored process in dom0.

Ian.
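
For readers following the thread, the guard in the quoted patch can be modelled in user space roughly as below. This is only a sketch under stated assumptions: the enum, the function names and the stub callback are hypothetical stand-ins for xen_initial_domain(), xen_store_domain/XS_LOCAL and xenbus_dev_resume(), and it does not address the open question above of when the skipped resume is eventually performed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's xen_store_domain values. */
enum store_domain_type { STORE_UNKNOWN, STORE_PV, STORE_HVM, STORE_LOCAL };

/* Hypothetical callback type standing in for xenbus_dev_resume(). */
typedef int (*resume_fn)(void *dev);

/* Counts how often the stub resume path was actually taken. */
static int resume_calls;

/* Stub resume: in the real kernel this would talk to xenstored. */
static int stub_dev_resume(void *dev)
{
    (void)dev;
    resume_calls++;
    return 0;
}

/*
 * True when the frontend must not touch xenstore during device resume:
 * either we are dom0, or xenstored runs in this very domain and its
 * process has not been thawed yet.
 */
static bool skip_frontend_resume(bool initial_domain,
                                 enum store_domain_type store)
{
    return initial_domain || store == STORE_LOCAL;
}

/* Mirrors the shape of xenbus_frontend_dev_resume() from the patch. */
static int frontend_dev_resume(void *dev, bool initial_domain,
                               enum store_domain_type store,
                               resume_fn real_resume)
{
    if (skip_frontend_resume(initial_domain, store))
        return 0;               /* bypass: xenstored is not reachable */

    return real_resume(dev);    /* ordinary guest: resume as before */
}
```

With these stand-ins, a dom0 that has a netfront attached returns 0 without calling the resume callback, while an ordinary guest whose xenstored lives elsewhere still goes through it.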


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel