
Re: [Xen-devel] [PATCH v2 1/4] x86/MSI-X: be more careful during teardown



On Mon, 13 Apr 2015, Jan Beulich wrote:
> >>> On 13.04.15 at 12:50, <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > On Mon, 13 Apr 2015, Jan Beulich wrote:
> >> >>> On 02.04.15 at 18:49, <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >> > On Wed, 25 Mar 2015, Jan Beulich wrote:
> >> >> When a device gets detached from a guest, pciback will clear its
> >> >> command register, thus disabling both memory and I/O decoding. The
> >> >> disabled memory decoding, however, has an effect on the MSI-X table
> >> >> accesses the hypervisor does: These won't have the intended effect
> >> >> anymore. Even worse, for PCIe devices (but not SR-IOV virtual
> >> >> functions) such accesses may (will?) be treated as Unsupported
> >> >> Requests, causing respective errors to be surfaced, potentially in the
> >> >> form of NMIs that may be fatal to the hypervisor or Dom0 in different
> >> >> ways. Hence rather than carrying out these accesses, we should avoid
> >> >> them where we can, and use alternative (e.g. PCI config space based)
> >> >> mechanisms to achieve at least the same effect.
> >> > 
> >> > I don't think that it is a good idea for both Xen and Linux to access
> >> > the command register simultaneously.  Working around Linux in Xen
> >> > doesn't sound like an optimal solution. Maybe we could just fix
> >> > pciback, and that would be enough.
> >> 
> >> I'm afraid that would just eliminate the specific case, but not the
> >> general issue.
> > 
> > If we trust Dom0 to do the right thing, then I don't think there is a
> > general issue to be solved. Dom0 can break the system at any time, so I
> > don't see any difference here. If we have a plan to actually be able to
> > handle a misbehaving Dom0, then I am all for it.
> 
> No, that gets us in the wrong direction. Dom0 can have legitimate
> reasons to have to clear memory or I/O decoding on a device at
> run time (even if current Linux doesn't do so). The more general
> problem we may need to solve is that of racing config space
> accesses (one by Dom0, the other by the hypervisor). But that's
> beyond this series' scope.

This is why I was asking for a document that describes who is in charge
of what and when. I don't think we can move forward without it.
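
For reference, the check described in the patch description quoted above
could be sketched roughly as below. This is only an illustration, not the
code from the patch: the bit values are the standard PCI ones (command
register at config space offset 0x04, MSI-X Message Control), while the
helper names msix_table_accessible() and msix_mask_all_via_config() are
made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI command register bits (config space offset 0x04). */
#define PCI_COMMAND_IO      0x0001u  /* I/O space decoding enabled */
#define PCI_COMMAND_MEMORY  0x0002u  /* memory space decoding enabled */

/* MSI-X Message Control register bits. */
#define MSIX_CTRL_MASKALL   0x4000u  /* function mask: masks every vector */
#define MSIX_CTRL_ENABLE    0x8000u  /* MSI-X enable */

/*
 * Hypothetical helper: the MSI-X table lives in a memory BAR, so it can
 * only be reached while memory decoding is enabled in the command register.
 */
static bool msix_table_accessible(uint16_t command)
{
    return (command & PCI_COMMAND_MEMORY) != 0;
}

/*
 * Hypothetical helper: compute the Message Control value that masks all
 * vectors through config space, without touching the MSI-X table itself.
 */
static uint16_t msix_mask_all_via_config(uint16_t msix_ctrl)
{
    return msix_ctrl | MSIX_CTRL_MASKALL;
}
```

A caller would read the command register first and fall back to the
config-space mask whenever msix_table_accessible() returns false. Note that
this does nothing about the race between Dom0 and the hypervisor mentioned
above: the command register can change between the check and the access.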


> >> While we trust Dom0 to not do outright bad things,
> >> the hypervisor should still avoid doing things that can go wrong
> >> due to the state a device is put (or left) in by Dom0.
> > 
> > Xen should also avoid doing things that can go wrong because of the
> > state a device is put in by QEMU or other components in the system.
> > There isn't much room for Xen to play with.
> 
> Qemu is either part of Dom0, or doesn't play with devices directly.

I don't understand the point you are trying to make here.
If your intention is to point out that QEMU shouldn't be writing to the
control register, as I wrote earlier, the current codebase disagrees
with you and has been that way for years.


> > And how are we going to deal with older "unfixed" QEMUs?
> > So far we have been using the same policy for QEMU and the Dom0 kernel:
> > Xen doesn't break them -- old Linux kernels and QEMUs are supposed to
> > just work.
> 
> I'm not sure that's really true for qemu, or if it is, then only by pure
> luck:

This is false: it is so by design. QEMU has an internal libxc
compatibility layer.


> The tool stack interface of the hypervisor as well as the libxc
> interfaces are subject to change between any two releases.

QEMU knows how to cope with libxc interface changes.


> I view it as unavoidable to break older qemu here.

I disagree, and I am opposed to any patch series that would
deliberately break QEMU.
