[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 4/6] xen/x86: Allow stubdom access to irq created for msi.



On Wed, Jul 17, 2019 at 11:54:35AM +0200, Roger Pau Monné wrote:
> On Wed, Jul 17, 2019 at 03:00:42AM +0200, Marek Marczykowski-Górecki wrote:
> > Stubdomains need to be given sufficient privilege over the guest which it
> > provides emulation for in order for PCI passthrough to work correctly.
> > When a HVM domain try to enable MSI, QEMU in stubdomain calls
> > PHYSDEVOP_map_pirq, but later it needs to call XEN_DOMCTL_bind_pt_irq as
> > part of xc_domain_update_msi_irq. Allow for that as part of
> > PHYSDEVOP_map_pirq.
> > 
> > This is not needed for PCI INTx, because IRQ in that case is known
> > beforehand and the stubdomain is given permissions over this IRQ by
> > libxl__device_pci_add (there's a do_pci_add against the stubdomain).
> > 
> > create_irq() already grant IRQ access to hardware_domain, with
> > assumption the device model (something managing this IRQ) lives there.
> > Modify create_irq() to take additional parameter pointing at device
> > model domain - which may be dom0 or stubdomain.  Save ID of the domain
> > given permission, to revoke it in destroy_irq() - easier and cleaner
> > than replaying logic of create_irq() parameter. Use domid instead of
> > actual reference to the domain, because it might get destroyed before
> > destroying IRQ (stubdomain is destroyed before its target domain). And
> > it is not an issue, because IRQ permissions live within domain
> > structure, so destroying a domain also implicitly revoke the permission.
> > Potential domid reuse is detected by by checking if that domain does
> > have permission over the IRQ being destroyed.
> > 
> > Then, adjust all callers to provide the parameter. In case of calls not
> > related to stubdomain-initiated allocations, give it either
> > hardware_domain (so the behavior is unchanged there), or NULL for
> > interrupts used by Xen internally.
> > 
> > Inspired by 
> > https://github.com/OpenXT/xenclient-oe/blob/5e0e7304a5a3c75ef01240a1e3673665b2aaf05e/recipes-extended/xen/files/stubdomain-msi-irq-access.patch
> >  by Eric Chanudet <chanudete@xxxxxxxxxxxx>.
> > 
> > Signed-off-by: Simon Gaiser <simon@xxxxxxxxxxxxxxxxxxxxxx>
> > Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > ---
> > Changes in v3:
> >  - extend commit message
> > Changes in v4:
> >  - add missing destroy_irq on error path
> > Changes in v5:
> >  - move irq_{grant,revoke}_access() to {create,destroy}_irq(), which
> >    basically make it a different patch
> >  - add get_dm_domain() helper
> >  - do not give hardware_domain permission over IRQs used in Xen
> >    internally
> >  - rename create_irq argument to just 'd', to avoid confusion
> >    when it's called by hardware domain
> >  - verify that device is de-assigned before pci_remove_device call
> >  - save ID of domain given permission in create_irq(), to revoke it in
> >  destroy_irq()
> >  - drop domain parameter from destroy_irq() and msi_free_irq()
> >  - do not give hardware domain permission over IRQ created in
> >  iommu_set_interrupt()
> > ---
> >  xen/arch/x86/hpet.c                      |  3 +-
> >  xen/arch/x86/irq.c                       | 51 ++++++++++++++++++-------
> >  xen/common/irq.c                         |  1 +-
> >  xen/drivers/char/ns16550.c               |  2 +-
> >  xen/drivers/passthrough/amd/iommu_init.c |  2 +-
> >  xen/drivers/passthrough/pci.c            |  3 +-
> >  xen/drivers/passthrough/vtd/iommu.c      |  3 +-
> >  xen/include/asm-x86/irq.h                |  2 +-
> >  xen/include/xen/irq.h                    |  1 +-
> >  9 files changed, 50 insertions(+), 18 deletions(-)
> > 
> > diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
> > index 4b08488..b4854ff 100644
> > --- a/xen/arch/x86/hpet.c
> > +++ b/xen/arch/x86/hpet.c
> > @@ -11,6 +11,7 @@
> >  #include <xen/softirq.h>
> >  #include <xen/irq.h>
> >  #include <xen/numa.h>
> > +#include <xen/sched.h>
> >  #include <asm/fixmap.h>
> >  #include <asm/div64.h>
> >  #include <asm/hpet.h>
> > @@ -368,7 +369,7 @@ static int __init hpet_assign_irq(struct 
> > hpet_event_channel *ch)
> >  {
> >      int irq;
> >  
> > -    if ( (irq = create_irq(NUMA_NO_NODE)) < 0 )
> > +    if ( (irq = create_irq(NUMA_NO_NODE, hardware_domain)) < 0 )
> 
> Shouldn't this be NULL? I don't think the hardware domain should be
> allowed to play with the HPET IRQs?

Good point.

> >          return irq;
> >  
> >      ch->msi.irq = irq;
> > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > index 2cac28a..dc5d302 100644
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -164,10 +164,21 @@ int __init bind_irq_vector(int irq, int vector, const 
> > cpumask_t *cpu_mask)
> >      return ret;
> >  }
> >  
> > +static struct domain *get_dm_domain(struct domain *d)
>                                        ^ const
> > +{
> > +    return current->domain->target == d ? current->domain : 
> > hardware_domain;
> > +}
> > +
> >  /*
> >   * Dynamic irq allocate and deallocation for MSI
> >   */
> > -int create_irq(nodeid_t node)
> > +
> > +/*
> > + * create_irq - allocate irq for MSI
> > + * @d domain that will get permission over the allocated irq; this 
> > permission
> > + * will automatically be revoked on destroy_irq
> > + */
> > +int create_irq(nodeid_t node, struct domain *d)
> >  {
> >      int irq, ret;
> >      struct irq_desc *desc;
> > @@ -200,18 +211,24 @@ int create_irq(nodeid_t node)
> >          desc->arch.used = IRQ_UNUSED;
> >          irq = ret;
> >      }
> 
> I would assert that desc->creator_domid == DOMID_INVALID here, since
> in the failure case creator_domid is not overwritten.

Yes, see below.

> > -    else if ( hardware_domain )
> > +    else if ( d )
> >      {
> > -        ret = irq_permit_access(hardware_domain, irq);
> > +        ASSERT(d == current->domain);
> > +        ret = irq_permit_access(d, irq);
> >          if ( ret )
> >              printk(XENLOG_G_ERR
> > -                   "Could not grant Dom0 access to IRQ%d (error %d)\n",
> > -                   irq, ret);
> > +                   "Could not grant Dom%u access to IRQ%d (error %d)\n",
> > +                   d->domain_id, irq, ret);
> > +        else
> > +            desc->creator_domid = d->domain_id;
> >      }
> >  
> >      return irq;
> >  }
> >  
> > +/*
> > + * destroy_irq - deallocate irq for MSI
> > + */
> >  void destroy_irq(unsigned int irq)
> >  {
> >      struct irq_desc *desc = irq_to_desc(irq);
> > @@ -220,14 +237,22 @@ void destroy_irq(unsigned int irq)
> >  
> >      BUG_ON(!MSI_IRQ(irq));
> >  
> > -    if ( hardware_domain )
> > +    if ( desc->creator_domid != DOMID_INVALID )
> >      {
> > -        int err = irq_deny_access(hardware_domain, irq);
> > +        struct domain *d = get_domain_by_id(desc->creator_domid);
> >  
> > -        if ( err )
> > -            printk(XENLOG_G_ERR
> > -                   "Could not revoke Dom0 access to IRQ%u (error %d)\n",
> > -                   irq, err);
> > +        if ( d && irq_access_permitted(d, irq) ) {
> > +            int err;
> > +
> > +            err = irq_deny_access(d, irq);
> > +            if ( err )
> > +                printk(XENLOG_G_ERR
> > +                       "Could not revoke Dom%u access to IRQ%u (error 
> > %d)\n",
> > +                       d->domain_id, irq, err);
> > +        }
> > +
> > +        if ( d )
> > +            put_domain(d);
> 
> Don't you need to set creator_domid = DOMID_INVALID in destroy_irq at
> some point?
> 
> Or else a failure in create_irq could leak the irq to it's previous
> owner. Note that init_one_irq_desc would only init the fields the
> first time the IRQ is used, but not for subsequent usages AFAICT.

I assumed init_one_irq_desc do the work on subsequent usages too. If not,
indeed I need to modify creator_domid in few more places.

> >      }
> >  
> >      spin_lock_irqsave(&desc->lock, flags);
> > @@ -2058,7 +2083,7 @@ int map_domain_pirq(
> >              spin_unlock_irqrestore(&desc->lock, flags);
> >  
> >              info = NULL;
> > -            irq = create_irq(NUMA_NO_NODE);
> > +            irq = create_irq(NUMA_NO_NODE, get_dm_domain(d));
> 
> Isn't it fine to just use current->domain here directly?
> 
> It's always going to be the current domain the one that calls
> map_domain_pirq in order to get a PIRQ mapped for it's target
> domain I think.

I wasn't sure if that's true if all the cases. Especially if hardware
domain != toolstack domain. How is it then? Is it hardware domain
calling map_domain_pirq in that case?

> >              ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, 
> > &info)
> >                             : irq;
> >              if ( ret < 0 )
> > @@ -2691,7 +2716,7 @@ int allocate_and_map_msi_pirq(struct domain *d, int 
> > index, int *pirq_p,
> >          if ( irq == -1 )
> >          {
> >      case MAP_PIRQ_TYPE_MULTI_MSI:
> > -            irq = create_irq(NUMA_NO_NODE);
> > +            irq = create_irq(NUMA_NO_NODE, get_dm_domain(d));
> >          }
> >  
> >          if ( irq < nr_irqs_gsi || irq >= nr_irqs )
> > diff --git a/xen/common/irq.c b/xen/common/irq.c
> > index f42512d..42b27a9 100644
> > --- a/xen/common/irq.c
> > +++ b/xen/common/irq.c
> > @@ -16,6 +16,7 @@ int init_one_irq_desc(struct irq_desc *desc)
> >      spin_lock_init(&desc->lock);
> >      cpumask_setall(desc->affinity);
> >      INIT_LIST_HEAD(&desc->rl_link);
> > +    desc->creator_domid = DOMID_INVALID;
> >  
> >      err = arch_init_one_irq_desc(desc);
> >      if ( err )
> > diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
> > index 189e121..ccc8b04 100644
> > --- a/xen/drivers/char/ns16550.c
> > +++ b/xen/drivers/char/ns16550.c
> > @@ -719,7 +719,7 @@ static void __init ns16550_init_irq(struct serial_port 
> > *port)
> >      struct ns16550 *uart = port->uart;
> >  
> >      if ( uart->msi )
> > -        uart->irq = create_irq(0);
> > +        uart->irq = create_irq(0, NULL);
> >  #endif
> >  }
> >  
> > diff --git a/xen/drivers/passthrough/amd/iommu_init.c 
> > b/xen/drivers/passthrough/amd/iommu_init.c
> > index 4e76b26..50785e0 100644
> > --- a/xen/drivers/passthrough/amd/iommu_init.c
> > +++ b/xen/drivers/passthrough/amd/iommu_init.c
> > @@ -781,7 +781,7 @@ static bool_t __init set_iommu_interrupt_handler(struct 
> > amd_iommu *iommu)
> >      hw_irq_controller *handler;
> >      u16 control;
> >  
> > -    irq = create_irq(NUMA_NO_NODE);
> > +    irq = create_irq(NUMA_NO_NODE, NULL);
> >      if ( irq <= 0 )
> >      {
> >          dprintk(XENLOG_ERR, "IOMMU: no irqs\n");
> > diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> > index e886894..507b3d1 100644
> > --- a/xen/drivers/passthrough/pci.c
> > +++ b/xen/drivers/passthrough/pci.c
> > @@ -845,6 +845,9 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
> >      list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
> >          if ( pdev->bus == bus && pdev->devfn == devfn )
> >          {
> > +            ret = -EBUSY;
> > +            if ( pdev->domain && pdev->domain != hardware_domain )
> > +                break;
> 
> This seems like an unlrelated fix?
> 
> ie: preventing device removal while in use by a domain different than
> dom0?

Indeed it may warrant separate commit now.

> Note that you don't need the pdev->domain != NULL check, just doing
> pdev->domain != hardware_domain seems enough, since you don't
> dereference the pdev->domain pointer in the expression (unless I'm
> missing other usages below).

I don't want to prevent removal if pdev->domain is NULL (if that's even
possible).

> >              ret = iommu_remove_device(pdev);
> >              if ( pdev->domain )
> >                  list_del(&pdev->domain_list);
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> > b/xen/drivers/passthrough/vtd/iommu.c
> > index 8b27d7e..79f9682 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -1138,7 +1138,8 @@ static int __init iommu_set_interrupt(struct 
> > acpi_drhd_unit *drhd)
> >      struct irq_desc *desc;
> >  
> >      irq = create_irq(rhsa ? pxm_to_node(rhsa->proximity_domain)
> > -                          : NUMA_NO_NODE);
> > +                          : NUMA_NO_NODE,
> > +                     NULL);
> >      if ( irq <= 0 )
> >      {
> >          dprintk(XENLOG_ERR VTDPREFIX, "IOMMU: no irq available!\n");
> > diff --git a/xen/include/asm-x86/irq.h b/xen/include/asm-x86/irq.h
> > index c0c6e7c..5b24428 100644
> > --- a/xen/include/asm-x86/irq.h
> > +++ b/xen/include/asm-x86/irq.h
> > @@ -155,7 +155,7 @@ int  init_irq_data(void);
> >  void clear_irq_vector(int irq);
> >  
> >  int irq_to_vector(int irq);
> > -int create_irq(nodeid_t node);
> > +int create_irq(nodeid_t node, struct domain *d);
> >  void destroy_irq(unsigned int irq);
> >  int assign_irq_vector(int irq, const cpumask_t *);
> >  
> > diff --git a/xen/include/xen/irq.h b/xen/include/xen/irq.h
> > index 586b783..c7a6a83 100644
> > --- a/xen/include/xen/irq.h
> > +++ b/xen/include/xen/irq.h
> > @@ -91,6 +91,7 @@ typedef struct irq_desc {
> >      spinlock_t lock;
> >      struct arch_irq_desc arch;
> >      cpumask_var_t affinity;
> > +    domid_t creator_domid; /* weak reference to domain handling this IRQ */
> 
> I feel like handling is too vague here, but I'm not a native speaker
> so I'm not sure. I would maybe write:
> 
> ... domain having permissions over this IRQ (which can be different
> from the domain actually having the IRQ assigned) */
> 
> Which I think is less ambiguous.

I wanted to fit the comment in one line. But your version indeed may be
better.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Attachment: signature.asc
Description: PGP signature

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.