
Re: [Xen-devel] [RFC XEN PATCH 15/16] tools/libxl: handle return code of libxl__qmp_initializations()



On Thu, Feb 09, 2017 at 10:47:01AM +0800, Haozhong Zhang wrote:
> On 02/08/17 10:31 +0000, Wei Liu wrote:
> > On Wed, Feb 08, 2017 at 02:07:26PM +0800, Haozhong Zhang wrote:
> > > On 01/27/17 17:11 -0500, Konrad Rzeszutek Wilk wrote:
> > > > On Mon, Oct 10, 2016 at 08:32:34AM +0800, Haozhong Zhang wrote:
> > > > > If any error code is returned when creating a domain, stop the domain
> > > > > creation.
> > > >
> > > > This looks like it is a bug-fix that can be spun off from this
> > > > patchset?
> > > >
> > > 
> > > Yes, if everyone agrees it's really a bug and the fix does not
> > > cause a compatibility problem (e.g. xl w/o this patch does not abort
> > > domain creation if it fails to connect to the QEMU VNC port).
> > > 
> > 
> > I'm in two minds here. If the failure to connect is caused by some
> > temporary glitch in QEMU and we're sure it will eventually succeed,
> > there is no need to abort domain creation. If the failure to connect is
> > due to a permanent problem, we should abort.
> > 
> 
> Sorry, I should have said "*query* QEMU VNC port" instead of *connect*.
> 
> libxl__qmp_initializations() currently does the following tasks:
> 1/ Create a QMP socket.
> 
>   I think all failures in 1/ should be considered permanent. Such a
>   failure not only makes the following tasks fail, but also breaks
>   device hotplug, which needs to cooperate with QEMU.
> 
> 2/ If 1/ succeeds, query QMP for the parameters of the serial port and
>   fill them in xenstore.
> 3/ If 1/ and 2/ succeed, set and query the QMP parameters (password,
>   address, port) of VNC and fill them in xenstore.
> 
>   If we assume Xen always sends the correct QMP commands and
>   parameters, the QMP failures in 2/ and 3/ will be caused by QMP
>   socket errors (see qmp_next()), and it is hard to tell whether those
>   are permanent or temporary. However, if a missing serial port
>   or VNC is considered not to affect the execution of the guest
>   domain, we may ignore failures here.
> 
> > OOI how did you discover this issue? That could be the key to
> > understanding the issue here.
> 
> The next patch adds code in libxl__qmp_initializations() to query QMP
> for vNVDIMM parameters (e.g. the base gpfn, which is calculated by
> QEMU) and return an error code if the query fails. While I was developing
> that patch, I found xl didn't stop even when bugs in my QEMU patches made
> the code in my Xen patch fail.
> 

Right, this should definitely be fatal.

> Maybe we could let libxl__qmp_initializations() report whether a
> failure can be tolerated. For non-tolerable failures (e.g. those in 1/),
> xl should stop. For tolerable failures (e.g. those in 2/ and 3/), xl
> can continue, but it needs to warn about those failures.
> 

Yes, we can do that. It's an internal function, so we can change things
as we see fit.

I would suggest you only make vNVDIMM failure fatal as a start.
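
Roughly something like the sketch below. To be clear, this is not existing
libxl code: the step enum, qmp_step_is_fatal() and qmp_init_classify() are
names made up for illustration, and the per-step return codes are assumed
to be collected by the caller. It only shows the fatal/tolerated split,
with the vNVDIMM query as the only fatal step for now.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical step identifiers; these are not real libxl names. */
enum qmp_init_step {
    QMP_STEP_SOCKET,   /* 1/ create the QMP socket */
    QMP_STEP_SERIAL,   /* 2/ query serial port parameters */
    QMP_STEP_VNC,      /* 3/ set and query VNC parameters */
    QMP_STEP_VNVDIMM,  /* query vNVDIMM parameters (added by the next patch) */
    QMP_STEP_COUNT
};

/*
 * Policy: which failures abort domain creation. As a start only the
 * vNVDIMM step is fatal; other steps could be added here later.
 */
static bool qmp_step_is_fatal(enum qmp_init_step step)
{
    return step == QMP_STEP_VNVDIMM;
}

/*
 * rc[i] is the return code of step i (0 means success).
 * Returns 0 if domain creation may continue, or the error code of the
 * first fatal failure. Tolerated failures are warned about on stderr and
 * counted in *warnings.
 */
static int qmp_init_classify(const int rc[QMP_STEP_COUNT], int *warnings)
{
    assert(rc != NULL && warnings != NULL);

    *warnings = 0;
    for (int i = 0; i < QMP_STEP_COUNT; i++) {
        if (rc[i] == 0)
            continue;
        if (qmp_step_is_fatal((enum qmp_init_step)i))
            return rc[i];
        fprintf(stderr,
                "warning: QMP init step %d failed (rc=%d), continuing\n",
                i, rc[i]);
        (*warnings)++;
    }
    return 0;
}
```

Keeping the policy in one small predicate means the set of fatal steps can
grow later without touching the callers.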

Wei.
