[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86: properly handle MSI-X unmask operation from guests




> -----Original Message-----
> From: Xu, Dongxiao
> Sent: Tuesday, November 12, 2013 12:16 AM
> To: Jan Beulich; Wu, Feng
> Cc: xen-devel; Zhang, Xiantao
> Subject: RE: [Xen-devel] [PATCH] x86: properly handle MSI-X unmask operation
> from guests
> 
> > -----Original Message-----
> > From: xen-devel-bounces@xxxxxxxxxxxxx
> > [mailto:xen-devel-bounces@xxxxxxxxxxxxx] On Behalf Of Jan Beulich
> > Sent: Monday, November 11, 2013 9:19 PM
> > To: Wu, Feng
> > Cc: xen-devel; Zhang, Xiantao
> > Subject: Re: [Xen-devel] [PATCH] x86: properly handle MSI-X unmask
> operation
> > from guests
> >
> > >>> On 11.11.13 at 13:33, "Wu, Feng" <feng.wu@xxxxxxxxx> wrote:
> > > --- a/tools/libxc/xc_domain.c
> > > +++ b/tools/libxc/xc_domain.c
> > > @@ -1342,6 +1342,8 @@ int xc_domain_update_msi_irq(
> > >      uint32_t gvec,
> > >      uint32_t pirq,
> > >      uint32_t gflags,
> > > +    uint32_t vector_ctrl,
> > > +    int nr_entry,
> > >      uint64_t gtable)
> > >  {
> >
> > With there being out of tree consumers (like upstream qemu), you
> > can't just add two parameters here.

So I will add the needed changes in upstream qemu as what I did for qemu-xen 
and qemu-traditional

> >
> > > +int msixtbl_unmask(struct vcpu *v, unsigned long table_base,
> > > +                         unsigned int nr_entry)
> > > +{
> > > +    unsigned long ctrl_address;
> > > +
> > > +    ctrl_address = table_base + nr_entry * PCI_MSIX_ENTRY_SIZE +
> > > +                   PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
> > > +
> > > +    if (msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY )
> >
> > So once again you clear the mask bit with no consideration
> > whatsoever as to the state Xen want the mask bit to be in. Did
> > you not read through the history of all these MSI-X related
> > changes? Otherwise you should have known that it is precisely
> > this which we must not allow.

There already have been some handlings in misxtbl_write() as follows:

    /*
     * Do not allow guest to modify MSI-X control bit if it is masked
     * by Xen. We'll only handle the case where Xen thinks that
     * bit is unmasked, but hardware has silently masked the bit
     * (in case of SR-IOV VF reset, etc). On the other hand, if Xen
     * thinks that the bit is masked, but it's really not,
     * we log a warning.
     */
    if ( msi_desc->msi_attrib.masked )
    {
        if ( !(orig & PCI_MSIX_VECTOR_BITMASK) )
            printk(XENLOG_WARNING "MSI-X control bit is unmasked when"
                   " it is expected to be masked [%04x:%02x:%02x.%u]\n",
                   entry->pdev->seg, entry->pdev->bus,
                   PCI_SLOT(entry->pdev->devfn),
                   PCI_FUNC(entry->pdev->devfn));

        goto unlock;
    }

So here the calling to misxtbl_write() to unmaks msi-x will follow the above 
check. 
I am not sure if this is what you concerned. Please correct me if my 
understanding
is not what you expected.

> >
> > > --- a/xen/drivers/passthrough/io.c
> > > +++ b/xen/drivers/passthrough/io.c
> > > @@ -193,6 +193,13 @@ int pt_irq_create_bind(
> > >          spin_unlock(&d->event_lock);
> > >          if ( dest_vcpu_id >= 0 )
> > >              hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
> > > +
> > > +        if ((pt_irq_bind->u.msi.vector_ctrl &
> PCI_MSIX_VECTOR_BITMASK)
> > == 0) {
> > > +            rc = msixtbl_unmask(d->vcpu[0], pt_irq_bind->u.msi.gtable,
> > > +                           pt_irq_bind->u.msi.nr_entry);
> > > +            if (rc)
> > > +                return -EBUSY;
> > > +        }
> >
> > Without explanation in the commit message I don't see why this
> > adjustment is needed.

It writes the mask bit to hardware directly by Xen instead of going to Qemu for 
performance consideration when guests
try to mask msi-x, so we don't need to do anything here if 
pt_irq_bind->u.msi.vector_ctrl & PCI_MSIX_VECTOR_BITMASK
equals 1. However, if pt_irq_bind->u.msi.vector_ctrl & PCI_MSIX_VECTOR_BITMASK 
equals 0, it means that we just got
a msi-x unmask operation from guests and we need to do it in real hardware.

> >
> > > --- a/xen/include/public/domctl.h
> > > +++ b/xen/include/public/domctl.h
> > > @@ -503,6 +503,8 @@ struct xen_domctl_bind_pt_irq {
> > >          struct {
> > >              uint8_t gvec;
> > >              uint32_t gflags;
> > > +            uint32_t vector_ctrl;
> > > +            int nr_entry;
> > >              uint64_aligned_t gtable;
> > >          } msi;
> > >      } u;
> >
> > Andrew already told you: _If_ you need to change this, it has to
> > be accompanied by an interface version change.
> >
> > But I this should be done entirely inside the hypervisor - after all
> > it is the hypervisor that forwards the request to qemu, so upon
> > completion of the request it could as well do the necessary
> > unmasking (as long as it doesn't need the interrupt masked for
> > internal purposes).
> 
> Currently, Feng's patch handles the unmask operation when QEMU issues the
> MSI-X update hypercall (XEN_DOMCTL_bind_pt_irq).
> However, if add the unmask operation in Xen's generic I/O emulation code path,
> we may need to introduce a new MSI-X specific callback function, which is
> supposed to be executed after QEMU I/O operation is done. Is this what you
> mean?
> 
> Thanks,
> Dongxiao

Jan, If you prefer handling this issue totally inside hypervisor, I will try to 
find a proper solution soon. 
In that case, we should not need to modify code of QEMU.

> 
> >
> > Jan
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > http://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.